数据库内核月报

数据库内核月报 - 2024 / 12

What's new in PolarDB8.4(一) prepare once

Author: 陈江(恬泰)

概述

PolarDB8.4是polardb8.0.2的下一个版本,对标的是社区mysql 8.4,目前正在紧锣密鼓的内测当中,不久就能通过polardb-mysql官网进行购买,是时候花点时间介绍一下PolarDB8.4内核层面的新特性了 。 对笔者而言最大的惊喜是另做了一个比较现代的hypergraph optimizer,之前我和其他同事有系统性阐述过hyper optimizer, 可以参考

DpHyper算法介绍

Mysql Hyper Optimizer Interesting Order代码分享

HyperOptimizer join order搜索空间正确性完备性实现

现在让我们来看看8.4的另一个让人心动的特性,prepare once。在此之前先让我们看看polardb8.0.2的ps协议的实现

polardb8.0.2 PS协议

polardb802是基于mysql 8.0.18 codebase上进行研发的,这个版本上ps的prepare phase会进入Query_block::prepare函数, 逻辑是进行fix_fields以及进行一些rule base的查询改写。execution phase会重新进入Query_block::prepare函数,包括重新fix_fields及重新进行改写。然后继续进行optimize+execute iterator。这会带来如下问题

  1. 不够快,execution phase只是跳过了SQL的语法解析,其实完全可以跳过上下文检查(fix_fields),以及查询改写。
  2. 为了可重入执行Query_block::prepare,mysql一般选择做完查询改写然后尽量revert回原始的stmt或可解析的stmt,这导致prepare phase的查询改写就没啥意义了。其次,revert通过change record实现,这大大增加了查询改写复杂程度,导致原生的mysql查询改写规则一直比较少,比较经典的有view merge,子查询转semijoin。

8.4实现了prepare once能力,如字面意思,仅prepare一次,很好地让ps协议 prepare,execution phase各司其职。

下面我将介绍prepare once原理,重点强调区别于basic 协议地方,不会深入探讨QueryBlock::prepare,optimize流程。本文使用mysql 8.4.2版本代码

命名

stmt: 查询树,query_block tree, sql解析成内存数据结构是一个树状的query_block

ps协议例子

##sql1,包含derived table的例子
prepare s from 'select a=? from (select distinct a from t1)dt';
set @b='bbb';
EXECUTE s USING @b;

##sql2,包含set操作,需要使用临时表
prepare s2 from  'select  a from t1 union select  i from t';
EXECUTE s2;

Prepared_statement生命周期管理

mysql线程模型是one thread per connection,一个连接在后端永远对应同一个thread,所有的用户数据,上下文数据都保存在THD对象中,thd->stmt_map将缓存所有parse后的stmt,这玩意是会话粒度的可见性,其他连接看不到当前用户定义的stmt。

stmt_map是一个字典,通过name找到Prepared_statement对象,name就是上文实例里面的’s’, ‘s2’, 上文sql1,sql2执行prepare语句后都会解析生成一个个Prepared_statement对象,然后保存在thd->stmt_map集合中,随着会话退出而销毁对象。execution阶段通过name就可以查找Prepared_statement

class Prepared_statement {
Query_arena m_arena; //持久化的内存块,如可供后面execution反复使用的对象保存在这,thd->stmt_arena指向这
Item_param **m_param_array{nullptr};//prepare语句中的?占位符
LEX *m_lex{nullptr};//做过prepare的stmt保存在这,execution阶段一直复用这个缓存
LEX_CSTRING m_name{NULL_CSTR};//name,唯一标识,用于在stmt_map集合中查找
}

Item_change_list THD::change_list; //change record集合,记录item上的更改,用于回滚

一个会话中对于一个个SQL都是串行执行的,所以Prepared_statement虽然只有一份,没有线程安全问题。

prepare phase

#0  Query_block::prepare (this=0x7fbfb8ecb6b0, thd=0x7fbfb802a560, insert_field_list=0x0) at /home/jacob.cj/mysql-bld/sql/sql_resolver.cc:180
#1  0x00000000062ec6a1 in Sql_cmd_select::prepare_inner (this=0x7fbfb8eaaee8, thd=0x7fbfb802a560) at /home/jacob.cj/mysql-bld/sql/sql_select.cc:651
#2  0x00000000062ec0c5 in Sql_cmd_dml::prepare (this=0x7fbfb8eaaee8, thd=0x7fbfb802a560) at /home/jacob.cj/mysql-bld/sql/sql_select.cc:565
#3  0x00000000062abf7d in Prepared_statement::prepare_query (this=0x7fbfb8eba5e0, thd=0x7fbfb802a560) at /home/jacob.cj/mysql-bld/sql/sql_prepare.cc:1362
#4  0x00000000062af05d in Prepared_statement::prepare (this=0x7fbfb8eba5e0, thd=0x7fbfb802a560, query_str=0x7fbfb8ecd430 "select * from t1 where a <= ?", query_length=29, orig_param_array=0x0) at /home/jacob.cj/mysql-bld/sql/sql_prepare.cc:2468
#5  0x00000000062ad0dc in mysql_sql_stmt_prepare (thd=0x7fbfb802a560) at /home/jacob.cj/mysql-bld/sql/sql_prepare.cc:1768
#6  0x000000000625f8ab in mysql_execute_command (thd=0x7fbfb802a560, first_level=true) at /home/jacob.cj/mysql-bld/sql/sql_parse.cc:3355
#7  0x000000000626614e in dispatch_sql_command (thd=0x7fbfb802a560, parser_state=0x7fc0dc3d98a0) at /home/jacob.cj/mysql-bld/sql/sql_parse.cc:5392

贴出函数栈以帮助读者查阅代码,prepare流程如下

bool Query_block::setup_wild(THD *thd) {
  // PS/SP uses arena so that changes are made permanently.
  Prepared_stmt_arena_holder ps_arena_holder(thd);//切stmt_arena

  1. 对于view,包括inline-view以及table function都会创建SQL层Table对象,该table生命周期同Prepared_statement
  2. 对于query_expression set 操作,如union, exception, intersect,不仅会创建SQL层Table对象,还会创建engine层Table对象(open_tmp_table)。ps的每个phase结束都会close_tmp_table,销毁engine层 Table, sql层Table仍有效
  3. fix_fields时,会调用resolve_type,对占位符’?’解析出来的Item_param进行类型推断
    • 因为optimize/execution会改变table一些属性,以及base_ref_items, groupby, orderby,prepare完要存一个快照,以后每次进入execution phase需要利用快照恢复
    • 保存query block属性
bool Query_block::save_properties(THD *thd) {
  saved_cond_count = cond_count;

  if (!base_ref_items.empty()) {
    m_saved_base_items =
        base_ref_items.prefix(fields.size()).Clone(thd->mem_root);
  }
  if (group_list.first &&
      save_order_properties(thd, &group_list, &group_list_ptrs))
    return true;
  if (order_list.first &&
      save_order_properties(thd, &order_list, &order_list_ptrs))
    return true;
  return false;
}
bool Table_ref::save_properties() {
  MEM_ROOT *const mem_root = *THR_MALLOC;
  if (save_bitmap(mem_root, *table->read_set, &read_set_saved) ||
      save_bitmap(mem_root, *table->write_set, &write_set_saved) ||
      save_bitmap(mem_root, table->read_set_internal,
                  &read_set_internal_saved)) {
    return true;
  }
  covering_keys_saved = table->covering_keys;
  merge_keys_saved = table->merge_keys;
  keys_in_use_for_query_saved = table->keys_in_use_for_query;
  keys_in_use_for_group_by_saved = table->keys_in_use_for_group_by;
  keys_in_use_for_order_by_saved = table->keys_in_use_for_order_by;
  nullable_saved = table->is_nullable();
  force_index_saved = table->force_index;
  force_index_order_saved = table->force_index_order;
  force_index_group_saved = table->force_index_group;
  partition_info *const part = table->part_info;
  if (part != nullptr) {
    if (save_bitmap(mem_root, part->lock_partitions, &lock_partitions_saved)) {
      return true;
    }
  }
  return false;
}
  1. 基表都需要关闭,下次重新再开
  2. query_block的leaf_tables不会revert回去,像polardb并行改了leaf_tables需要重新适配,整个prepare期间不会产生任何一条change record

execute phase

#0  Query_block::optimize (this=0x7fbfb8ecb6b0, thd=0x7fbfb802a560, finalize_access_paths=true) at /home/jacob.cj/mysql-bld/sql/sql_select.cc:2033
#1  0x00000000063989de in Query_expression::optimize (this=0x7fbfb8ecb5c8, thd=0x7fbfb802a560, materialize_destination=0x0, create_iterators=true, finalize_access_paths=true) at /home/jacob.cj/mysql-bld/sql/sql_union.cc:1017
#2  0x00000000062ed958 in Sql_cmd_dml::execute_inner (this=0x7fbfb8eaaee8, thd=0x7fbfb802a560) at /home/jacob.cj/mysql-bld/sql/sql_select.cc:1034
#3  0x00000000062ecde2 in Sql_cmd_dml::execute (this=0x7fbfb8eaaee8, thd=0x7fbfb802a560) at /home/jacob.cj/mysql-bld/sql/sql_select.cc:782
#4  0x0000000006263e34 in mysql_execute_command (thd=0x7fbfb802a560, first_level=true) at /home/jacob.cj/mysql-bld/sql/sql_parse.cc:4737
#5  0x00000000062b1d53 in Prepared_statement::execute (this=0x7fbfb8ebf960, thd=0x7fbfb802a560, expanded_query=0x7fc0dc3d8070, open_cursor=false) at /home/jacob.cj/mysql-bld/sql/sql_prepare.cc:3539
#6  0x00000000062b01aa in Prepared_statement::execute_loop (this=0x7fbfb8ebf960, thd=0x7fbfb802a560, expanded_query=0x7fc0dc3d8070, open_cursor=false) at /home/jacob.cj/mysql-bld/sql/sql_prepare.cc:2934
#7  0x00000000062ad6fa in mysql_sql_stmt_execute (thd=0x7fbfb802a560) at /home/jacob.cj/mysql-bld/sql/sql_prepare.cc:1906

对于一个prepared statement,可以多次执行execute,所以要保证stmt是clean的,可反复重入。 执行流程如下:

总结

下一篇:What’s new in PolarDB8.4(二)完备的集合操作