Author: 陈江(恬泰)
PolarDB Mysql版支持OR子句/IN-List表达式转UNION ALL的查询优化功能,能够在生成计划期间尝试将合适的OR子句/IN-List转换成UNION ALL形式,再进行基于代价的路径选择,得到更优的执行计划。
目前,Mysql优化器对SQL中的OR子句过滤条件的优化能力较为有限。如果OR子句中的过滤条件仅涉及一张表,且所有过滤条件上均具备适当的索引,则优化器会为此类场景生成一个Range Access Path或者Index Merge Access Path。例如:
mysql> desc select c2 from t3 where c2 >98 or c2 <= 1;
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+------------------------------------+
| 1 | SIMPLE | t3 | NULL | range | idx_c2_c3 | idx_c2_c3 | 5 | NULL | 4737 | 100.00 | Using index condition; Using index |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+------------------------------------+
mysql> desc select * from t0 where key1 < 3 or key2 > 1020;
+----+-------------+-------+------------+-------------+---------------+-------+---------+------+------+----------+--------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------------+---------------+-------+---------+------+------+----------+--------------------------------------+
| 1 | SIMPLE | t0 | NULL | index_merge | i1,i2 | i1,i2 | 4,4 | NULL | 3 | 100.00 | Using sort_union(i1,i2); Using where |
+----+-------------+-------+------------+-------------+---------------+-------+---------+------+------+----------+--------------------------------------+
如果OR子句涉及多张表,优化器只能将该OR子句视为连接后的过滤条件,这可能导致SQL执行效率降低。例如:
desc analyze select * from t1,t3 where t3.c1 >98 or t1.b <= 0;
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Filter: ((t3.c1 > 98) or (t1.b <= 0)) (cost=409669.60 rows=4096000) (actual time=115.259..5416.434 rows=81920 loops=1)
-> Inner hash join (no condition) (cost=409669.60 rows=4096000) (actual time=37.843..3556.126 rows=4096000 loops=1)
-> Table scan on t3 (cost=1.01 rows=6400) (actual time=0.187..433.147 rows=6400 loops=1)
-> Hash
-> Table scan on t1 (cost=64.75 rows=640) (actual time=1.216..35.969 rows=640 loops=1)
|
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (5.44 sec)
上述OR子句被当作一个整体,优化器无法使用t1.b或者t3.c1上的索引,导致t1与t2进行全表扫描。实际上,OR子句在逻辑上可以转化为UNION ALL,包含两个或多个分支的查询形式。例如,上述示例可以改写为:
EXPLAIN ANALYZE
SELECT * FROM t1 ,t3 WHERE t1.b <= 0
UNION ALL
SELECT * FROM t1,t3 WHERE t3.c1 >98 AND (t1.b >0 OR (t1.b <= 0) IS NULL);
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Append (actual time=58.272..302.546 rows=81920 loops=1)
-> Stream results (actual time=39.204..39.204 rows=0 loops=1)
-> Inner hash join (no condition) (cost=136589.21 rows=1365197) (actual time=39.202..39.202 rows=0 loops=1)
-> Table scan on t3 (cost=3.02 rows=6400) (never executed)
-> Hash
-> Filter: (t1.b <= 0) (cost=64.75 rows=213) (actual time=39.118..39.118 rows=0 loops=1)
-> Table scan on t1 (cost=64.75 rows=640) (actual time=0.322..38.382 rows=640 loops=1)
-> Stream results (actual time=19.062..237.913 rows=81920 loops=1)
-> Inner hash join (no condition) (cost=8239.26 rows=81920) (actual time=19.053..149.818 rows=81920 loops=1)
-> Filter: ((t1.b > 0) or ((t1.b <= 0) is null)) (cost=0.51 rows=640) (actual time=0.180..40.371 rows=640 loops=1)
-> Table scan on t1 (cost=0.51 rows=640) (actual time=0.176..39.572 rows=640 loops=1)
-> Hash
-> Index range scan on t3 using idx_c1, with index condition: (t3.c1 > 98) (cost=46.50 rows=128) (actual time=7.875..18.329 rows=128 loops=1)
|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.32 sec)
改写后优化器利用t3.c1上的索引,减少中间处理的数据量,从而提高执行性能。 PolarDB Mysql版在生成计划期间,能够尝试将适当的OR子句/IN-List转换为UNION ALL形式,随后基于代价进行路径选择,从而最终获得更优的执行计划。
该功能是基于代价自适应应用改写的,如果您的sql估算代价比cbqt_cost_threshold大,优化器就会尝试应用or展开改写,如果改写后执行的更快,优化器将选用改写的方式进行执行,当然您可以通过将参数or_expansion_mode=on来开启该改写功能。设置参数值的具体操作请参见设置集群参数和节点参数。
| 参数名称 | 描述 |
|---|---|
| cbqt_cost_threshold | 用于控制对SQL尝试进行OR转换的执行计划总代价阈值,如果 SQL 原始的执行计划总代价没有超过阈值,则不会进行OR转换。取值范围:[0, +∞),默认值为100000。设置参数为0时,表示对任何SQL都尝试进行OR转换。不建议设置为0,可能会让所有SQL计划时间变长,影响性能。 |
| or_expansion_mode | 用于控制OR子句/IN-List转换UNION ALL功能的开关,取值如下:● on: ro/rw全开。● off:关闭。● replica_on: 仅ro开启,默认值为replica_on. |
CREATE TABLE `t1` (
`a` int(11) DEFAULT NULL,
`b` int(11) DEFAULT NULL,
KEY `idx_a` (`a`)
) ENGINE=InnoDB;
INSERT INTO `t1` VALUES (1,1),(2,2),(3,3),(4,4),(5,5),(6,6),(7,7),(8,8),(9,9),(10,10);
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
CREATE TABLE `t3` (
`c1` int(11) NOT NULL ,
`c2` int(11) DEFAULT NULL,
`c3` int(11) DEFAULT NULL,
`c4` int(11) DEFAULT NULL,
KEY idx_c1(`c1`),
KEY `idx_c2_c3` (`c2`,`c3`)
) ENGINE=InnoDB;
INSERT INTO `t3` VALUES (1,0,1,0),(2,0,2,0),(3,0,3,0),(4,0,4,0),(5,0,5,0),(6,0,6,0),(7,0,7,0),(8,0,8,0),(9,0,9,0),(10,0,10,0),(11,0,11,0),(12,0,12,0),(13,0,13,0),(14,0,14,0),(15,0,15,0),(16,0,16,0),(17,0,17,0),(18,0,18,0),(19,0,19,0),(20,0,20,0),(21,0,21,0),(22,0,22,0),(23,0,23,0),(24,0,24,0),(25,1,25,0),(26,1,26,0),(27,1,27,0),(28,1,28,0),(29,1,29,0),(30,1,30,0),(31,1,31,0),(32,1,32,0),(33,1,33,0),(34,1,34,0),(35,1,35,0),(36,1,36,0),(37,1,37,0),(38,1,38,0),(39,1,39,0),(40,1,40,0),(41,1,41,0),(42,1,42,0),(43,1,43,0),(44,1,44,0),(45,1,45,0),(46,1,46,0),(47,1,47,0),(48,1,48,0),(49,1,49,0),(50,1,50,1),(51,1,51,1),(52,1,52,1),(53,1,53,1),(54,1,54,1),(55,1,55,1),(56,1,56,1),(57,1,57,1),(58,1,58,1),(59,1,59,1),(60,1,60,1),(61,1,61,1),(62,1,62,1),(63,1,63,1),(64,1,64,1),(65,1,65,1),(66,1,66,1),(67,1,67,1),(68,1,68,1),(69,1,69,1),(70,1,70,1),(71,1,71,1),(72,1,72,1),(73,1,73,1),(74,1,74,1),(75,2,75,1),(76,2,76,1),(77,2,77,1),(78,2,78,1),(79,2,79,1),(80,2,80,1),(81,2,81,1),(82,2,82,1),(83,2,83,1),(84,2,84,1),(85,2,85,1),(86,2,86,1),(87,2,87,1),(88,2,88,1),(89,2,89,1),(90,2,90,1),(91,2,91,1),(92,2,92,1),(93,2,93,1),(94,2,94,1),(95,2,95,1),(96,2,96,1),(97,2,97,1),(98,2,98,1),(99,2,99,1),(100,2,100,1);
insert into t3 select * from t3;
insert into t3 select * from t3;
insert into t3 select * from t3;
insert into t3 select * from t3;
insert into t3 select * from t3;
insert into t3 select * from t3;
analyze table t1,t3;
说明:目前该功能在灰度测试中,在ro上默认开启,rw上需要额外设置
set or_expansion_mode=on
set cbqt_cost_threshold=1;
set polar_optimizer_switch='or_expansion=on';
mysql> desc select * from t1,t3 where t3.c1 >98 or t1.a<5;
+----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------------------------+
| 1 | PRIMARY | t1 | NULL | range | idx_a | idx_a | 5 | NULL | 256 | 100.00 | Using index condition; Using MRR |
| 1 | PRIMARY | t3 | NULL | ALL | NULL | NULL | NULL | NULL | 6400 | 100.00 | Using join buffer (hash join) |
| 2 | UNION | t3 | NULL | range | idx_c1 | idx_c1 | 4 | NULL | 128 | 100.00 | Using index condition; Using MRR |
| 2 | UNION | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 640 | 66.67 | Using where; Using join buffer (hash join) |
+----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------------------------+
综上,开启OR子句转UNION ALL功能后,执行计划中可以利用t1.a和t3.c1上的索引,达到和手动改写成UNION ALL相同的效果。
#######先关闭改写,看下原生耗时
set polar_optimizer_switch='or_expansion=off';
mysql> desc analyze select c2 from t3 where (c2 = 2 or c2= 0 ) ORDER BY `t3`.`c3` DESC LIMIT 5;
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Limit: 5 row(s) (actual time=193.389..193.393 rows=5 loops=1)
-> Sort: t3.c3 DESC, limit input to 5 row(s) per chunk (cost=641.82 rows=3200) (actual time=193.386..193.388 rows=5 loops=1)
-> Index range scan on t3 using idx_c2_c3, with index condition: ((t3.c2 = 2) or (t3.c2 = 0)) (actual time=0.348..187.455 rows=3200 loops=1)
|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.20 sec)
mysql> set polar_optimizer_switch='or_expansion=on';
Query OK, 0 rows affected (0.00 sec)
mysql> desc analyze select c2 from t3 where (c2 = 2 or c2= 0 ) ORDER BY `t3`.`c3` DESC LIMIT 5;
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Limit: 5 row(s) (actual time=1.249..1.254 rows=5 loops=1)
-> Sort: derived_1_2.Name_exp_1 DESC, limit input to 5 row(s) per chunk (actual time=0.104..0.106 rows=5 loops=1)
-> Table scan on derived_1_2 (actual time=0.006..0.013 rows=10 loops=1)
-> Union materialize (actual time=1.246..1.249 rows=5 loops=1)
-> Limit: 5 row(s) (actual time=0.336..0.571 rows=5 loops=1)
-> Index lookup on t3 using idx_c2_c3 (c2=2; iterate backwards) (cost=0.00 rows=5) (actual time=0.333..0.566 rows=5 loops=1)
-> Limit: 5 row(s) (actual time=0.215..0.431 rows=5 loops=1)
-> Index lookup on t3 using idx_c2_c3 (c2=0; iterate backwards) (cost=0.00 rows=5) (actual time=0.214..0.427 rows=5 loops=1)
|
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)
order by因为使用了t3.c3的索引被优化掉了, 执行耗时从200ms提升到1ms,查询得到上百倍提升。
mysql> set polar_optimizer_switch='or_expansion=off';
Query OK, 0 rows affected (0.00 sec)
mysql> desc analyze select c2 from t3 where c2 in (2 , 0) ORDER BY `t3`.`c3` DESC LIMIT 5;
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Limit: 5 row(s) (actual time=197.497..197.501 rows=5 loops=1)
-> Sort: t3.c3 DESC, limit input to 5 row(s) per chunk (cost=641.82 rows=3200) (actual time=197.494..197.496 rows=5 loops=1)
-> Index range scan on t3 using idx_c2_c3, with index condition: (t3.c2 in (2,0)) (actual time=0.319..191.560 rows=3200 loops=1)
|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.20 sec)
mysql> set polar_optimizer_switch='or_expansion=on';
mysql> desc analyze select c2 from t3 where c2 in (2 , 0) ORDER BY `t3`.`c3` DESC LIMIT 5;
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Limit: 5 row(s) (actual time=1.256..1.260 rows=5 loops=1)
-> Sort: derived_1_2.Name_exp_1 DESC, limit input to 5 row(s) per chunk (actual time=0.090..0.093 rows=5 loops=1)
-> Table scan on derived_1_2 (actual time=0.005..0.012 rows=10 loops=1)
-> Union materialize (actual time=1.252..1.255 rows=5 loops=1)
-> Limit: 5 row(s) (actual time=0.259..0.545 rows=5 loops=1)
-> Index lookup on t3 using idx_c2_c3 (c2=2; iterate backwards) (cost=0.00 rows=5) (actual time=0.256..0.540 rows=5 loops=1)
-> Limit: 5 row(s) (actual time=0.237..0.455 rows=5 loops=1)
-> Index lookup on t3 using idx_c2_c3 (c2=0; iterate backwards) (cost=0.00 rows=5) (actual time=0.236..0.451 rows=5 loops=1)
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
使用HINT功能进行query block级别的功能控制。
desc SELECT /*+NO_OR_EXPAND(@subq1) */ * FROM t1 WHERE exists (SELECT /*+ QB_NAME(subq1) */ 1 FROM t3 WHERE (t1.a = 1 or t1.b = 2) and t3.c1 < 5 and t1.b = t3.c1);
+----+-------------+-------+------------+------+---------------+--------+---------+------------+------+----------+-----------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+--------+---------+------------+------+----------+-----------------------------+
| 1 | SIMPLE | t1 | NULL | ALL | idx_a | NULL | NULL | NULL | 640 | 19.00 | Using where |
| 1 | SIMPLE | t3 | NULL | ref | idx_c1 | idx_c1 | 4 | test2.t1.b | 64 | 100.00 | Using index; FirstMatch(t1) |
+----+-------------+-------+------------+------+---------------+--------+---------+------------+------+----------+-----------------------------+
mysql> desc SELECT /*+OR_EXPAND(@subq1) */ * FROM t1 WHERE exists (SELECT /*+ QB_NAME(subq1) */ 1 FROM t3 WHERE (t3.c1 = 1 or t1.b = 2) and t3.c1 < 5 and t1.b = t3.c1);
+----+--------------------+-------+------------+------+---------------+--------+---------+-------+------+----------+------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------+------------+------+---------------+--------+---------+-------+------+----------+------------------------------------+
| 1 | PRIMARY | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 640 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | t3 | NULL | ref | idx_c1 | idx_c1 | 4 | const | 64 | 100.00 | Using where; Using index |
| 3 | DEPENDENT UNION | t3 | NULL | ref | idx_c1 | idx_c1 | 4 | const | 64 | 100.00 | Using index condition; Using index |
+----+--------------------+-------+------------+------+---------------+--------+---------+-------+------+----------+------------------------------------+
query_block idx)强制某个表达式改成成union all其中idx是该表达式在where表达式中第几个孩子,下标从1开始,示例是将 (t3.c2= 1 or t1.b = 2) 展开成union all了。
mysql> desc format=tree SELECT /*+OR_EXPAND(@subq1 3) */ * FROM t1 WHERE exists (SELECT /*+ QB_NAME(subq1) */ 1 FROM t3 WHERE (t3.c2 = 999 or t1.b = 999) and t3.c1 < 5 and t1.b = t3.c1 and (t3.c2= 1 or t1.b = 2));
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Filter: exists(select #2) (cost=64.75 rows=640)
-> Table scan on t1 (cost=64.75 rows=640)
-> Select #2 (subquery in condition; dependent)
-> Limit: 1 row(s)
-> Append
-> Stream results
-> Filter: (t3.c2 = 1) (cost=17.45 rows=32)
-> Index lookup on t3 using idx_c1 (c1=t1.b), with index condition: ((t1.b = 999) and (t3.c1 < 5)) (cost=17.45 rows=64)
-> Stream results
-> Filter: (t3.c1 = 2) (cost=0.51 rows=0)
-> Index lookup on t3 using idx_c2_c3 (c2=999), with index condition: ((t1.b = 2) and lnnvl((t3.c2 = 1))) (cost=0.51 rows=1)

