数据库内核月报

数据库内核月报 - 2019 / 08

PgSQL · 特性分析 · 浅析PostgreSQL 的JIT

Author: 卓刀

背景

估计很多同学看过之前的月报PgSQL · 特性分析· JIT 在数据仓库中的应用价值,对JIT(just in time)和LLVM(Low Level Virtual Machine)有了一定的了解。概括地来说:

PostgreSQL 社区从2016年就开始对JIT 的实现进行了讨论,详见邮件列表

该邮件中解释了PostgreSQL 需要JIT 技术的原因。因为PostgreSQL 代码中实现的都是通用的逻辑,这就导致在执行过程中可能造成大量不必要的跳转和代码分支执行,继而造成大量不必要的指令执行,造成CPU 的压力。而使用JIT 技术可以将代码扁平化(inline)执行,直接调用对应的函数,而且如果已经知道具体输入,可以直接删除掉很多间接代码的执行。

此外,邮件中也说明了在PostgreSQL 中实现JIT 选择LLVM 的理由,概括起来就是LLVM 成熟度更高,更稳定,license 更友好,支持C 语言。

在PostgreSQL 11 的版本中实现了基于LLVM 的JIT,本文主要是浅析JIT 在PostgreSQL 11 中的使用。

PostgreSQL 中JIT 的实现概述

PostgreSQL 11 中实现的JIT,是把对应的JIT 的提供者封装成了一个外部依赖库。这避免了JIT 对主体代码的侵入,用户可以按需开启/关闭JIT 功能,而且还能通过进一步的抽象支持后期扩展不同的JIT 解决方案(目前使用的是LLVM)。不过这样带来的问题就是各个部分使用JIT 技术编译的代码必须和原来的代码位置分开,这样代码易读性可能有所降低。

作为支持JIT 的第一个正式版本,PostgreSQL 11 只实现了一部分功能,下文将简单讲解下各个功能。

PostgreSQL 中支持的JIT 功能

在之前的月报PgSQL · 特性分析· JIT 在数据仓库中的应用价值中提出了在数据库实现中LLVM 可优化的点,其中包括:

PostgreSQL 11 中基本上也实现了这几方面的优化,但是略有不同,目前包含JIT accelerated operations,inlining,optimization。

JIT accelerated operations

利用LLVM 的特性,PostgreSQL 定制化地实现了两方面的加速操作,包括表达式计算优化(expression evaluation)和元组变形优化(tuple deforming)。

表达式计算优化可以针对WHERE 条件,agg 运算等实时将表达式的路径编译为具体的代码执行,在此过程中大量的不必要的调用和跳转会被优化掉。

元组变形优化可以将具体元组转化为其在内存中运行的状态,然后根据元组每列的具体类型和元组中列的个数实时编译为具体的代码执行,在此过程中不必要的代码分支会被优化掉。

表达式和元组操作经常会造成分析型场景下的CPU 性能瓶颈,加速这两方面可以提高PostgreSQL 的分析能力。但是除了这两方面,其他的场景也可以进行JIT 的优化,例如元组排序,COPY 解析/输出以及查询中其他部分等等,这些目前没有实现,社区计划将在后续版本中实现。

inlining

PostgreSQL 源码中含有大量通用的代码,执行时会经过很多不必要的函数调用和操作。为了提高执行效率,将通用代码重写或维护两份很明显是不可取的。而JIT 技术带来的好处之一就是执行的时候将代码扁平化,去掉不必要的函数调用和操作。以LLVM 为例,Clang 编译器可以生成LLVM IR(中间表示代码)并优化,这在一定意义上就代表了两份代码。在PostgreSQL 中LLVM IR 使用的是bitcode(二进制格式),对应安装在$pkglibdir/bitcode/postgres/ 中,而对应插件的bitcode 会安装在$pkglibdir/bitcode/[extension]/ 中,其中extension 为插件名。

optimization

LLVM 中实现了对产生的中间表示代码的优化,这一定程度上也会提升数据库查询的执行速度。但是该过程本身是有相应的代价的,有些优化可能代价比较低,可以很好地提高性能,而有些可能只有在大的查询中才会体现其提高性能的作用。所以,在PostgreSQL 中定制了一些GUC 参数来限制JIT 功能的开启,详见下文。

与JIT 相关的GUC 参数

在使用JIT 的过程中,有以下几个GUC 参数与之相关,分别是:

可以看出,因为目前JIT 功能开启所需要的代价没有很好的办法进行建模,也没有很好的方法来估计,所以导致JIT 功能无法作为代价估计模型中一种可量化的代价。目前实现的策略是按照查询的代价来一刀切是否使用JIT 相应功能,还算是比较简单有效。但是,这并不是特别的优雅。很有可能只有某个部分的查询计划更适合使用JIT 功能。不过要想实现查询的某个部分使用JIT 功能需要一些额外的信息输入和判断,这带来的代价是否足够小也是存疑的。

直到这里,我们基本对PostgreSQL 中的JIT 功能有所了解。接下来,我们会讲如何启用JIT,并且以两个例子看下JIT 的效果。

如何启用JIT

如果你是使用RPM 安装的PostgreSQL 11,则需要另行安装postgresql11-llvmjit 包。如果你使用的是源码编译,则需要在configure 阶段增加–with-llvm 选项,同时指定LLVM_CONFIG 变量,即LLVM 包的llvm-config 位置,还需要指定CLANG 变量,即Clang 的路径,举例如下:

./configure --with-llvm LLVM_CONFIG=/opt/rh/llvm-toolset-7/root/usr/bin/llvm-config CLANG=/opt/rh/llvm-toolset-7/root/usr/bin/clang

简单的测试

我们针对JIT 做了下两组简单的测试,加深对其的理解。

先来一组社区邮件中给出的经典测试:

postgres=# select version();
     version
-----------------
 PostgreSQL 11.4
(1 row)

postgres=# create table t1(id integer primary key,c1 integer,c2 integer,c3 nteger,c4 integer,c5 integer,c6 integer,c7 integer,c8 integer,c9 integer);
CREATE TABLE
postgres=# create table t2(id integer primary key,c1 integer,c2 integer,c3 integer,c4 integer,c5 integer,c6 integer,c7 integer,c8 integer,c9 integer);
CREATE TABLE
postgres=# create table t3(id integer primary key,c1 integer not null,c2 integer not null,c3 integer not null,c4 integer not null,c5 integer not null,c6 integer not null,c7 integer not null,c8 integer not null,c9 integer not null);
CREATE TABLE
postgres=# insert into t1 (id,c1,c2,c3,c4,c5,c6,c7,c8) values (generate_series(1,10000000),0,0,0,0,0,0,0,0);
INSERT 0 10000000
postgres=# insert into t2 (id,c2,c3,c4,c5,c6,c7,c8,c9) values (generate_series(1,10000000),0,0,0,0,0,0,0,0);
INSERT 0 10000000
postgres=# insert into t3 (id,c1,c2,c3,c4,c5,c6,c7,c8,c9) values (generate_series(1,10000000),0,0,0,0,0,0,0,0,0);
INSERT 0 10000000
postgres=# vacuum analyze t1;
VACUUM
postgres=# vacuum analyze t2;
VACUUM
postgres=# vacuum analyze t3;
VACUUM
postgres=# set jit=off;
SET
postgres=# explain analyze select sum(c8) from t1;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=218457.84..218457.85 rows=1 width=8) (actual time=1762.126..1762.126 rows=1 loops=1)
   ->  Seq Scan on t1  (cost=0.00..193457.87 rows=9999987 width=4) (actual time=0.010..820.756 rows=10000000 loops=1)
 Planning Time: 0.242 ms
 Execution Time: 1762.159 ms
(4 rows)

postgres=# explain analyze select sum(c8) from t2;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=218458.08..218458.09 rows=1 width=8) (actual time=1820.825..1820.825 rows=1 loops=1)
   ->  Seq Scan on t2  (cost=0.00..193458.06 rows=10000006 width=4) (actual time=0.011..820.387 rows=10000000 loops=1)
 Planning Time: 0.102 ms
 Execution Time: 1820.855 ms
(4 rows)

postgres=# explain analyze select sum(c8) from t3;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=208332.23..208332.24 rows=1 width=8) (actual time=1640.345..1640.345 rows=1 loops=1)
   ->  Seq Scan on t3  (cost=0.00..183332.58 rows=9999858 width=4) (actual time=0.011..767.184 rows=10000000 loops=1)
 Planning Time: 0.101 ms
 Execution Time: 1640.374 ms
(4 rows)

postgres=# explain analyze select sum(c2), sum(c3), sum(c4), sum(c5), sum(c6), sum(c7), sum(c8)
from t1;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=368457.64..368457.65 rows=1 width=56) (actual time=2416.711..2416.711 rows=1 loops=1)
   ->  Seq Scan on t1  (cost=0.00..193457.87 rows=9999987 width=28) (actual time=0.022..833.951 rows=10000000 loops=1)
 Planning Time: 0.069 ms
 Execution Time: 2416.755 ms
(4 rows)

postgres=# explain analyze select sum(c2), sum(c3), sum(c4), sum(c5), sum(c6), sum(c7), sum(c8)
from t2;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=368458.17..368458.18 rows=1 width=56) (actual time=2451.844..2451.844 rows=1 loops=1)
   ->  Seq Scan on t2  (cost=0.00..193458.06 rows=10000006 width=28) (actual time=0.019..842.359 rows=10000000 loops=1)
 Planning Time: 0.113 ms
 Execution Time: 2451.890 ms
(4 rows)

postgres=# explain analyze select sum(c2), sum(c3), sum(c4), sum(c5), sum(c6), sum(c7), sum(c8)
from t3;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=358330.10..358330.11 rows=1 width=56) (actual time=2273.825..2273.825 rows=1 loops=1)
   ->  Seq Scan on t3  (cost=0.00..183332.58 rows=9999858 width=28) (actual time=0.017..792.839 rows=10000000 loops=1)
 Planning Time: 0.114 ms
 Execution Time: 2273.865 ms
(4 rows)

postgres=# set jit=on;
SET
postgres=# explain analyze select sum(c8) from t1;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=218457.84..218457.85 rows=1 width=8) (actual time=1421.869..1421.869 rows=1 loops=1)
   ->  Seq Scan on t1  (cost=0.00..193457.87 rows=9999987 width=4) (actual time=0.024..820.463 rows=10000000 loops=1)
 Planning Time: 0.058 ms
 JIT:
   Functions: 3
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.551 ms, Inlining 2.166 ms, Optimization 20.364 ms, Emission 13.673 ms, Total 36.755 ms
 Execution Time: 1422.491 ms
(8 rows)

postgres=# explain analyze select sum(c8) from t2;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=218458.08..218458.09 rows=1 width=8) (actual time=1414.109..1414.109 rows=1 loops=1)
   ->  Seq Scan on t2  (cost=0.00..193458.06 rows=10000006 width=4) (actual time=0.022..818.406 rows=10000000 loops=1)
 Planning Time: 0.058 ms
 JIT:
   Functions: 3
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.557 ms, Inlining 2.231 ms, Optimization 20.261 ms, Emission 13.313 ms, Total 36.363 ms
 Execution Time: 1414.733 ms
(8 rows)

postgres=# explain analyze select sum(c8) from t3;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=208332.23..208332.24 rows=1 width=8) (actual time=1388.406..1388.406 rows=1 loops=1)
   ->  Seq Scan on t3  (cost=0.00..183332.58 rows=9999858 width=4) (actual time=0.023..768.711 rows=10000000 loops=1)
 Planning Time: 0.058 ms
 JIT:
   Functions: 3
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.549 ms, Inlining 2.177 ms, Optimization 20.383 ms, Emission 13.440 ms, Total 36.550 ms
 Execution Time: 1389.025 ms
(8 rows)

postgres=# explain analyze select sum(c2), sum(c3), sum(c4), sum(c5), sum(c6), sum(c7), sum(c8) from t1;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=368457.64..368457.65 rows=1 width=56) (actual time=1687.375..1687.375 rows=1 loops=1)
   ->  Seq Scan on t1  (cost=0.00..193457.87 rows=9999987 width=28) (actual time=0.025..830.483 rows=10000000 loops=1)
 Planning Time: 0.072 ms
 JIT:
   Functions: 3
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.756 ms, Inlining 2.180 ms, Optimization 26.391 ms, Emission 19.554 ms, Total 48.881 ms
 Execution Time: 1688.213 ms
(8 rows)

postgres=# explain analyze select sum(c2), sum(c3), sum(c4), sum(c5), sum(c6), sum(c7), sum(c8) from t2;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=368458.17..368458.18 rows=1 width=56) (actual time=1682.681..1682.681 rows=1 loops=1)
   ->  Seq Scan on t2  (cost=0.00..193458.06 rows=10000006 width=28) (actual time=0.023..828.408 rows=10000000 loops=1)
 Planning Time: 0.071 ms
 JIT:
   Functions: 3
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.726 ms, Inlining 2.176 ms, Optimization 26.306 ms, Emission 19.807 ms, Total 49.015 ms
 Execution Time: 1683.482 ms
(8 rows)

postgres=# explain analyze select sum(c2), sum(c3), sum(c4), sum(c5), sum(c6), sum(c7), sum(c8) from t3;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=358330.10..358330.11 rows=1 width=56) (actual time=1613.259..1613.259 rows=1 loops=1)
   ->  Seq Scan on t3  (cost=0.00..183332.58 rows=9999858 width=28) (actual time=0.020..773.426 rows=10000000 loops=1)
 Planning Time: 0.069 ms
 JIT:
   Functions: 3
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.784 ms, Inlining 2.265 ms, Optimization 26.395 ms, Emission 19.780 ms, Total 49.224 ms
 Execution Time: 1614.121 ms
(8 rows)

可以看出:

再来一组简单查询开启JIT 后的测试:

postgres=# create table test (id serial);
CREATE TABLE
postgres=# insert INTO test (id) select * from generate_series(1, 10000000);
INSERT 0 10000000
postgres=# set jit=off;
SET
postgres=# explain  select count(*) from test;
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=97331.43..97331.44 rows=1 width=8)
   ->  Gather  (cost=97331.21..97331.42 rows=2 width=8)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=96331.21..96331.22 rows=1 width=8)
               ->  Parallel Seq Scan on test  (cost=0.00..85914.57 rows=4166657 width=0)
(5 rows)

postgres=# set jit = 'on';
SET
postgres=# show jit_above_cost;
 jit_above_cost
----------------
 100000
(1 row)

postgres=# show jit_inline_above_cost;
 jit_inline_above_cost
-----------------------
 500000
(1 row)

postgres=# show jit_optimize_above_cost;
 jit_optimize_above_cost
-------------------------
 500000
(1 row)

postgres=# explain  select count(*) from test;
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=97331.43..97331.44 rows=1 width=8)
   ->  Gather  (cost=97331.21..97331.42 rows=2 width=8)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=96331.21..96331.22 rows=1 width=8)
               ->  Parallel Seq Scan on test  (cost=0.00..85914.57 rows=4166657 width=0)
(5 rows)

postgres=# explain  analyze select count(*) from test;
                                                                QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=97331.43..97331.44 rows=1 width=8) (actual time=415.747..415.748 rows=1 loops=1)
   ->  Gather  (cost=97331.21..97331.42 rows=2 width=8) (actual time=415.658..418.129 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=96331.21..96331.22 rows=1 width=8) (actual time=409.043..409.044 rows=1 loops=3)
               ->  Parallel Seq Scan on test  (cost=0.00..85914.57 rows=4166657 width=0) (actual time=0.148..250.496 rows=3333333 loops=3)
 Planning Time: 0.054 ms
 Execution Time: 418.175 ms
(8 rows)

postgres=# set jit_above_cost = 10; set jit_inline_above_cost = 10; set jit_optimize_above_cost = 10;
SET
SET
SET
postgres=# show max_parallel_workers_per_gather;
 max_parallel_workers_per_gather
---------------------------------
 2
(1 row)

postgres=# explain  analyze select count(*) from test;
                                                                QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=97331.43..97331.44 rows=1 width=8) (actual time=441.672..441.672 rows=1 loops=1)
   ->  Gather  (cost=97331.21..97331.42 rows=2 width=8) (actual time=441.547..446.028 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=96331.21..96331.22 rows=1 width=8) (actual time=434.128..434.129 rows=1 loops=3)
               ->  Parallel Seq Scan on test  (cost=0.00..85914.57 rows=4166657 width=0) (actual time=0.161..251.158 rows=3333333 loops=3)
 Planning Time: 0.057 ms
 JIT:
   Functions: 9
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 1.096 ms, Inlining 109.551 ms, Optimization 22.201 ms, Emission 19.127 ms, Total 151.974 ms
 Execution Time: 446.673 ms
(12 rows)

postgres=# set max_parallel_workers_per_gather = 0;
SET
postgres=# explain analyze select count(*) from test;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=169247.71..169247.72 rows=1 width=8) (actual time=1172.932..1172.933 rows=1 loops=1)
   ->  Seq Scan on test  (cost=0.00..144247.77 rows=9999977 width=0) (actual time=0.028..745.134 rows=10000000 loops=1)
 Planning Time: 0.046 ms
 JIT:
   Functions: 2
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.298 ms, Inlining 0.881 ms, Optimization 3.986 ms, Emission 3.788 ms, Total 8.952 ms
 Execution Time: 1173.292 ms
(8 rows)

可以看出:

总结

JIT 技术对数据库操作系统来说是提高AP 能力的有效手段,但是在工程化的道路上要考虑很多实现的问题。PostgreSQL 社区目前是采用的外部依赖按需加载的方式,将JIT 作为一种外挂手段在一定场景下提高了性能,但是由于没有有效手段评估JIT 开启的代价,需要经验和具体业务场景的测试来判断JIT 功能开启是否能够提高性能。另外,目前实现的JIT 功能相对来说比较单一,只是初期版本,尚未成熟,还需要很长的开发周期来稳定和迭代。

参考文献

[1] PgSQL · 特性分析· JIT 在数据仓库中的应用价值

[2] Hello, JIT World: The Joy of Simple JITs

[3] LLVM’s Analysis and Transform Passes