Database Kernel Monthly Report - 2019/10

PgSQL · Use Case · PG 12 tpcc - use sysbench-tpcc by Percona-Lab

Background

PostgreSQL 12 was officially released on 2019-10-03; this article tests its TPC-C performance.

Environment

Alibaba Cloud virtual machine

[root@PostgreSQL12 ~]# lscpu      
Architecture:          x86_64      
CPU op-mode(s):        32-bit, 64-bit      
Byte Order:            Little Endian      
CPU(s):                16      
On-line CPU(s) list:   0-15      
Thread(s) per core:    2      
Core(s) per socket:    8      
Socket(s):             1      
NUMA node(s):          1      
Vendor ID:             GenuineIntel      
CPU family:            6      
Model:                 85      
Model name:            Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz      
Stepping:              4      
CPU MHz:               2500.014      
BogoMIPS:              5000.02      
Hypervisor vendor:     KVM      
Virtualization type:   full      
L1d cache:             32K      
L1i cache:             32K      
L2 cache:              1024K      
L3 cache:              33792K      
NUMA node0 CPU(s):     0-15      
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 spec_ctrl intel_stibp      
      
[root@PostgreSQL12 ~]# lsblk      
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT      
vda    253:0    0  200G  0 disk       
└─vda1 253:1    0  200G  0 part /      
vdb    253:16   0  1.8T  0 disk       
└─vdb1 253:17   0  1.8T  0 part /data01      
vdc    253:32   0  1.8T  0 disk       
└─vdc1 253:33   0  1.8T  0 part /data02      
      
[root@PostgreSQL12 ~]# uname -a      
Linux PostgreSQL12 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux      
      
[root@PostgreSQL12 ~]# free -g      
              total        used        free      shared  buff/cache   available      
Mem:            125           5          79          19          39          91      
Swap:             0           0           0      

Test Method

1. Block device setup

parted -s /dev/vdb mklabel gpt      
parted -s /dev/vdc mklabel gpt      
parted -s /dev/vdb mkpart primary 1MiB 100%      
parted -s /dev/vdc mkpart primary 1MiB 100%      

2. Filesystem setup

mkfs.ext4 /dev/vdb1 -m 0 -O extent,uninit_bg -E lazy_itable_init=1 -T largefile -L data01      
mkfs.ext4 /dev/vdc1 -m 0 -O extent,uninit_bg -E lazy_itable_init=1 -T largefile -L data02      
      
vi /etc/fstab         
LABEL=data01 /data01 ext4 defaults,noatime,nodiratime,nodelalloc,barrier=0,data=writeback 0 0      
LABEL=data02 /data02 ext4 defaults,noatime,nodiratime,nodelalloc,barrier=0,data=writeback 0 0        
      
mkdir /data01       
mkdir /data02      
      
mount -a       

3. Kernel parameter settings

vi /etc/sysctl.conf      
      
# add by digoal.zhou            
fs.aio-max-nr = 1048576            
fs.file-max = 76724600            
            
# Optional: kernel.core_pattern = /data01/corefiles/core_%e_%u_%t_%s.%p
# Create /data01/corefiles beforehand with mode 777; if it is a symlink, set the target directory to 777 as well
            
kernel.sem = 4096 2147483647 2147483646 512000
# Semaphores; inspect with ipcs -l or ipcs -u. One semaphore set is used per 16 processes, and each set needs 17 semaphores.

kernel.shmall = 107374182
# Limit on the combined size of all shared memory segments (recommended: 80% of RAM), in pages.
kernel.shmmax = 274877906944
# Maximum size of a single shared memory segment (recommended: half of RAM), in bytes; versions after 9.2 use far less shared memory.
kernel.shmmni = 819200
# Total number of shared memory segments that can be created; each PG cluster needs at least 2.
            
net.core.netdev_max_backlog = 10000            
net.core.rmem_default = 262144                   
# The default setting of the socket receive buffer in bytes.            
net.core.rmem_max = 4194304                      
# The maximum receive socket buffer size in bytes            
net.core.wmem_default = 262144                   
# The default setting (in bytes) of the socket send buffer.            
net.core.wmem_max = 4194304                      
# The maximum send socket buffer size in bytes.            
net.core.somaxconn = 4096            
net.ipv4.tcp_max_syn_backlog = 4096            
net.ipv4.tcp_keepalive_intvl = 20            
net.ipv4.tcp_keepalive_probes = 3            
net.ipv4.tcp_keepalive_time = 60            
net.ipv4.tcp_mem = 8388608 12582912 16777216            
net.ipv4.tcp_fin_timeout = 5            
net.ipv4.tcp_synack_retries = 2            
net.ipv4.tcp_syncookies = 1
# Enable SYN cookies: when the SYN backlog overflows, fall back to cookies; mitigates small-scale SYN floods.
net.ipv4.tcp_timestamps = 1
# Helps reduce TIME_WAIT sockets
net.ipv4.tcp_tw_recycle = 0
# =1 enables fast recycling of TIME-WAIT sockets, but may break connections behind NAT; keep it disabled on servers.
net.ipv4.tcp_tw_reuse = 1
# Enable reuse: allow TIME-WAIT sockets to be reused for new TCP connections.
net.ipv4.tcp_max_tw_buckets = 262144            
net.ipv4.tcp_rmem = 8192 87380 16777216            
net.ipv4.tcp_wmem = 8192 65536 16777216            
            
net.nf_conntrack_max = 1200000            
net.netfilter.nf_conntrack_max = 1200000            
            
vm.dirty_background_bytes = 409600000
# When dirty pages reach this amount, the background flusher (pdflush or similar) writes out pages dirtied more than (dirty_expire_centisecs/100) seconds ago.
# The default is 10% of RAM; on large-memory machines, specify an absolute byte value instead.

vm.dirty_expire_centisecs = 3000
# Dirty pages older than this are flushed to disk; 3000 means 30 seconds.
vm.dirty_ratio = 95
# If background flushing is too slow and dirty pages exceed 95% of memory, user processes that write to disk (fsync, fdatasync, etc.) must flush dirty pages themselves.
# Effectively keeps user processes from having to flush dirty pages; very useful with multiple instances per host under CGROUP IOPS limits.

vm.dirty_writeback_centisecs = 100
# Wakeup interval of the background flusher (pdflush or similar); 100 means 1 second.
            
vm.swappiness = 0
# Do not use swap
            
vm.mmap_min_addr = 65536
vm.overcommit_memory = 0
# Allow modest overcommit at allocation time; if set to 1, the kernel always assumes enough memory is available (acceptable for low-memory test environments).

vm.overcommit_ratio = 90
# When overcommit_memory = 2, participates in computing how much memory may be committed.
vm.zone_reclaim_mode = 0
# Disable NUMA zone reclaim (or disable NUMA at the kernel level).
net.ipv4.ip_local_port_range = 40000 65535
# Range of locally auto-assigned TCP and UDP port numbers
fs.nr_open=20480000
# Upper limit on file handles a single process may open

# Use the following parameters with care
vm.extra_free_kbytes = 4096000
vm.min_free_kbytes = 2097152    # recommended: 1GB of vm.min_free_kbytes per 32GB of RAM
# On low-memory machines, do not set the two values above
# vm.nr_hugepages = 66536
# Recommended: use huge pages when shared_buffers exceeds 64GB; see Hugepagesize in /proc/meminfo
vm.lowmem_reserve_ratio = 1 1 1
# Recommended when RAM exceeds 64GB; otherwise keep the default 256 256 32
      
      
sysctl -p      
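The kernel.sem comment above (one semaphore set per 16 processes, 17 semaphores per set) can be turned into a quick sizing estimate. This is a rough sketch under that stated rule; PostgreSQL's exact formula also counts background workers, so treat the result as a lower bound:

```shell
#!/bin/sh
# Rough semaphore-demand estimate for a given max_connections,
# assuming 1 set per 16 server processes and 17 semaphores per set.
max_connections=1000
sets=$(( (max_connections + 15) / 16 ))      # ceiling division
echo "semaphore sets needed: $sets"          # 63 for 1000 connections
echo "semaphores needed: $(( sets * 17 ))"   # 63 * 17 = 1071
```

Compare the result against `ipcs -l` to confirm the kernel.sem limits leave plenty of headroom.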

4. Resource limit settings

vi /etc/security/limits.conf      
      
* soft    nofile  1024000            
* hard    nofile  1024000            
* soft    nproc   unlimited            
* hard    nproc   unlimited            
* soft    core    unlimited            
* hard    core    unlimited            
* soft    memlock unlimited            
* hard    memlock unlimited      

5. Start at boot

vi /etc/rc.local        
             
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then            
   echo never > /sys/kernel/mm/transparent_hugepage/enabled            
fi            
su - postgres -c "pg_ctl start"          
chmod +x /etc/rc.d/rc.local      

6. EPEL packages

rpm -ivh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm      
      
yum -y install coreutils glib2 lrzsz dstat sysstat e4fsprogs xfsprogs ntp readline-devel zlib-devel openssl-devel pam-devel libxml2-devel libxslt-devel python-devel tcl-devel gcc gcc-c++ make smartmontools flex bison perl-devel perl-ExtUtils* openldap-devel jadetex  openjade bzip2 git iotop       

7. PG 12 packages

yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm      
      
yum install -y postgresql12*      

8. PG 12 environment variables

su - postgres        
        
vi .bash_profile        
        
export PS1="$USER@`/bin/hostname -s`-> "            
export PGPORT=1921            
export PGDATA=/data01/pg12/pg_root$PGPORT            
export LANG=en_US.utf8            
export PGHOME=/usr/pgsql-12          
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH            
export DATE=`date +"%Y%m%d%H%M"`          
export PATH=$PGHOME/bin:$PATH:.            
export MANPATH=$PGHOME/share/man:$MANPATH            
export PGHOST=$PGDATA            
export PGUSER=postgres            
export PGDATABASE=postgres            
alias rm='rm -i'            
alias ll='ls -lh'            
unalias vi         

9. PG 12 directory layout

mkdir /data01/pg12        
mkdir /data02/pg12        
chown postgres:postgres /data01/pg12        
chown postgres:postgres /data02/pg12        

10. Initialize the PG 12 database instance

su - postgres        
      
initdb -D $PGDATA -X /data02/pg12/pg_wal1921 -U postgres -E SQL_ASCII --locale=C       

11. Database parameter settings

vi $PGDATA/postgresql.auto.conf       
      
listen_addresses = '0.0.0.0'      
port = 1921      
max_connections = 1000      
superuser_reserved_connections = 13      
unix_socket_directories = '., /var/run/postgresql, /tmp'      
unix_socket_permissions = 0700      
tcp_keepalives_idle = 60      
tcp_keepalives_interval = 10      
tcp_keepalives_count = 10      
tcp_user_timeout = 60      
shared_buffers = 32GB      
maintenance_work_mem = 2GB      
dynamic_shared_memory_type = posix      
max_files_per_process = 2000      
vacuum_cost_delay = 0      
bgwriter_delay = 10ms      
bgwriter_lru_maxpages = 1000      
bgwriter_lru_multiplier = 10.0      
effective_io_concurrency = 0      
max_worker_processes = 8      
max_parallel_maintenance_workers = 4      
max_parallel_workers_per_gather = 0      
max_parallel_workers = 8      
wal_level = minimal        
synchronous_commit = off      
full_page_writes = off      
wal_buffers = 16MB      
wal_writer_delay = 10ms      
checkpoint_timeout = 15min      
max_wal_size = 128GB      
min_wal_size = 16GB      
checkpoint_completion_target = 0.1      
max_wal_senders = 0      
random_page_cost = 1.2      
effective_cache_size = 128GB      
jit = off      
log_destination = 'csvlog'      
logging_collector = on      
log_directory = 'log'      
log_filename = 'postgresql-%a.log'      
log_truncate_on_rotation = on      
log_rotation_age = 1d      
log_rotation_size = 0      
log_checkpoints = on      
log_error_verbosity = verbose        
log_line_prefix = '%m [%p] '      
log_statement = 'ddl'      
log_timezone = 'Asia/Shanghai'      
autovacuum = on      
log_autovacuum_min_duration = 0      
autovacuum_max_workers = 3      
autovacuum_vacuum_scale_factor = 0.02      
autovacuum_analyze_scale_factor = 0.01      
autovacuum_freeze_max_age = 800000000      
autovacuum_multixact_freeze_max_age = 900000000      
autovacuum_vacuum_cost_delay = 0ms      
vacuum_freeze_min_age = 500000000      
vacuum_freeze_table_age = 750000000      
vacuum_multixact_freeze_min_age = 5000000      
vacuum_multixact_freeze_table_age = 750000000      
datestyle = 'iso, mdy'      
timezone = 'Asia/Shanghai'      
lc_messages = 'C'      
lc_monetary = 'C'      
lc_numeric = 'C'      
lc_time = 'C'      
default_text_search_config = 'pg_catalog.english'      
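For reference, shared_buffers = 32GB matches the common quarter-of-RAM sizing heuristic on this 128 GB machine (the heuristic is an assumption here, not something the original states). A minimal sketch:

```shell
#!/bin/sh
# Assumed heuristic: shared_buffers ~= 1/4 of physical RAM.
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
echo "suggested shared_buffers: $(( mem_kb / 4 / 1024 / 1024 ))GB"
```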

12. Database access control (pg_hba.conf)

vi $PGDATA/pg_hba.conf      
      
host all all 192.168.0.0/24 trust      

13. Start the database

pg_ctl start       

14. Tablespace setup

mkdir /data01/pg12/tbs1      
mkdir /data02/pg12/tbs2      
      
psql      
create tablespace tbs1 location '/data01/pg12/tbs1';      
create tablespace tbs2 location '/data02/pg12/tbs2';      

15. Deploy sysbench

curl -s https://packagecloud.io/install/repositories/akopytov/sysbench/script.rpm.sh | sudo bash        
sudo yum -y install sysbench      
      
      
su - postgres      
git clone https://github.com/digoal/sysbench-tpcc      
      
      
cd sysbench-tpcc      
      
chmod 700 *.lua      

16. How to clean up data

drop schema public cascade;      
create schema public;      
grant all on schema public to public;      

17. Load initial data (load rate roughly 37 MB/s)

export pgsql_table_options="tablespace tbs1"      
export pgsql_index_options="tablespace tbs2"      

Test 1000 warehouses (1 table set, 112 GB)

nohup time ./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=64 --tables=1 --scale=1000 --trx_level=RC --db-ps-mode=auto --db-driver=pgsql prepare >./out.log 2>&1 &      
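At the ~37 MB/s load rate quoted in the heading, the prepare duration can be estimated; a back-of-envelope sketch for the 112 GB single table set:

```shell
#!/bin/sh
# Estimated prepare time at ~37 MB/s for 112 GB of data;
# the 1120 GB ten-set load scales roughly linearly (about 10x).
awk 'BEGIN { gb = 112; mbps = 37; printf "~%.0f minutes\n", gb * 1024 / mbps / 60 }'
```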

Test 10,000 warehouses (10 table sets of 1000 warehouses each, 1120 GB)

nohup time ./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=64 --tables=10 --scale=1000 --trx_level=RC --db-ps-mode=auto --db-driver=pgsql prepare >./out.log 2>&1 &      

18. Run the benchmark

For a remote client, use roughly 3x the CPU count in client threads; for a local client, roughly 2x.
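That rule of thumb can be expressed directly; a small sketch deriving client thread counts from the machine's CPU count:

```shell
#!/bin/sh
# ~2x CPUs for a local client, ~3x CPUs for a remote client.
cpus=$(nproc)
echo "local client threads:  $(( cpus * 2 ))"
echo "remote client threads: $(( cpus * 3 ))"
```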

Run without purge

1000 warehouses
./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=64 --tables=1 --scale=1000 --trx_level=RC --db-ps-mode=auto --db-driver=pgsql --time=3600 --report-interval=5 run      
      
      
10,000 warehouses
./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=64 --tables=10 --scale=1000 --trx_level=RC --db-ps-mode=auto --db-driver=pgsql --time=3600 --report-interval=5 run      

Run with purge

1000 warehouses
./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=64 --tables=1 --scale=1000 --trx_level=RC --db-ps-mode=auto --db-driver=pgsql --time=3600 --report-interval=5 --enable_purge=yes run      
      
      
10,000 warehouses
./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=64 --tables=10 --scale=1000 --trx_level=RC --db-ps-mode=auto --db-driver=pgsql --time=3600 --report-interval=5 --enable_purge=yes run      

cleanup

1000 warehouses
./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=64 --tables=1 --scale=1000 --trx_level=RC --db-driver=pgsql cleanup      
      
      
10,000 warehouses
./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=64 --tables=10 --scale=1000 --trx_level=RC --db-driver=pgsql cleanup      

Interpreting the Results

Run without purge

./tpcc.lua --pgsql-host=/tmp --pgsql-port=1921 --pgsql-user=postgres --pgsql-db=postgres --threads=32 --tables=1 --scale=1000 --trx_level=RC --db-ps-mode=auto --db-driver=pgsql --time=60 --report-interval=5 run    
sysbench 1.0.17 (using system LuaJIT 2.0.4)    
    
Running the test with following options:    
Number of threads: 32    
Report intermediate results every 5 second(s)    
Initializing random number generator from current time    
    
    
Initializing worker threads...    
    
Threads started!    
    
[ 5s ] thds: 32 tps: 3248.44 qps: 93258.54 (r/w/o: 42390.64/44038.35/6829.54) lat (ms,95%): 27.66 err/s 10.00 reconn/s: 0.00    
[ 10s ] thds: 32 tps: 3626.37 qps: 102832.62 (r/w/o: 46883.60/48696.28/7252.74) lat (ms,95%): 23.52 err/s 14.00 reconn/s: 0.00    
[ 15s ] thds: 32 tps: 3838.38 qps: 109478.46 (r/w/o: 49903.95/51897.94/7676.56) lat (ms,95%): 21.50 err/s 18.60 reconn/s: 0.00    
[ 20s ] thds: 32 tps: 4006.41 qps: 114816.04 (r/w/o: 52365.11/54437.71/8013.22) lat (ms,95%): 20.00 err/s 19.20 reconn/s: 0.00    
[ 25s ] thds: 32 tps: 4103.01 qps: 116394.38 (r/w/o: 53051.28/55137.28/8205.81) lat (ms,95%): 20.00 err/s 17.20 reconn/s: 0.00    
[ 30s ] thds: 32 tps: 4115.59 qps: 116128.74 (r/w/o: 52981.68/54915.87/8231.18) lat (ms,95%): 20.00 err/s 15.20 reconn/s: 0.00    
[ 35s ] thds: 32 tps: 4109.69 qps: 117433.18 (r/w/o: 53571.93/55641.86/8219.39) lat (ms,95%): 19.65 err/s 19.19 reconn/s: 0.00    
[ 40s ] thds: 32 tps: 4169.11 qps: 118802.26 (r/w/o: 54157.77/56306.27/8338.22) lat (ms,95%): 19.65 err/s 15.81 reconn/s: 0.00    
[ 45s ] thds: 32 tps: 4170.78 qps: 118412.12 (r/w/o: 53997.63/56072.92/8341.57) lat (ms,95%): 19.65 err/s 18.80 reconn/s: 0.00    
[ 50s ] thds: 32 tps: 4225.57 qps: 120878.63 (r/w/o: 55162.50/57264.98/8451.15) lat (ms,95%): 19.65 err/s 22.20 reconn/s: 0.00    
[ 55s ] thds: 32 tps: 4128.25 qps: 116929.64 (r/w/o: 53310.25/55362.88/8256.51) lat (ms,95%): 20.00 err/s 19.40 reconn/s: 0.00    
[ 60s ] thds: 32 tps: 4096.19 qps: 116335.90 (r/w/o: 53103.86/55039.66/8192.38) lat (ms,95%): 20.37 err/s 18.00 reconn/s: 0.00    

How the statistics are computed:

SQL statistics:    
    queries performed:    
        read:                            3104738    
        write:                           3224417    
        other:                           480086  -- counts begin; commit; rollback;
        total:                           6809241 -- all requests: the sum of the above
    transactions:                        239227 (3973.25 per sec.)     -- committed transactions per second (excluding rollback;); used for total tpmC = 3973.25*60 = 238395
    queries:                             6809241 (113092.77 per sec.)  -- all requests
    ignored errors:                      1038   (17.24 per sec.)    
    reconnects:                          0      (0.00 per sec.)    
    
General statistics:    
    total time:                          60.2077s    
    total number of events:              239227    
    
Latency (ms):    
         min:                                    0.42    
         avg:                                    8.02  -- average transaction latency
         max:                                  329.15
         95th percentile:                       20.37  -- 95% of transactions completed within 20.37 ms
         sum:                              1919757.02  -- total time ≈ --threads=32 × --time=60
    
Threads fairness:    
    events (avg/stddev):           7475.8438/78.44    
    execution time (avg/stddev):   59.9924/0.01    
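The sum line can be sanity-checked against the run parameters: 32 threads running for 60 seconds account for about 1,920,000 ms of wall time, and the reported sum should come close. A quick check:

```shell
#!/bin/sh
# Latency "sum" should approximate threads * time = 32 * 60 s = 1,920,000 ms.
awk 'BEGIN {
  expected = 32 * 60 * 1000
  actual   = 1919757.02          # "sum" from the report above
  printf "coverage: %.2f%%\n", 100 * actual / expected
}'
```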

The summarized results:

total tpmC = 3973.25 * 60 = 238395

new-order tpmC = (total tpmC) * (10/23) = 103650   # when run without purge; with --enable_purge=yes, use (total tpmC) * (10/24) instead. The transaction mix is defined in tpcc.lua:
function event()    
  -- print( NURand (1023,1,3000))    
  local max_trx =  sysbench.opt.enable_purge == "yes" and 24 or 23    
  local trx_type = sysbench.rand.uniform(1,max_trx)    
  if trx_type <= 10 then    
    trx="new_order"    
  elseif trx_type <= 20 then    
    trx="payment"    
  elseif trx_type <= 21 then    
    trx="orderstatus"    
  elseif trx_type <= 22 then    
    trx="delivery"    
  elseif trx_type <= 23 then    
    trx="stocklevel"    
  elseif trx_type <= 24 then    
    trx="purge"    
  end
end
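The tpmC derivation above can be scripted. This sketch converts sysbench's transactions-per-second figure (taken from the report earlier in this section) into total and new-order tpmC; 10 of the 23 weights in event() select new_order, hence the 10/23 factor (10/24 when purge is enabled):

```shell
#!/bin/sh
# total tpmC = tps * 60; new-order tpmC = total * 10/23 (no purge).
tps=3973.25
total=$(awk -v t="$tps" 'BEGIN { printf "%.0f", t * 60 }')
neworder=$(awk -v t="$total" 'BEGIN { printf "%.0f", t * 10 / 23 }')
echo "total tpmC:     $total"       # 238395
echo "new-order tpmC: $neworder"    # 103650
```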

Test results on a 32c64ht, 512 GB machine, 1000 warehouses

  
Architecture:          x86_64  
CPU op-mode(s):        32-bit, 64-bit  
Byte Order:            Little Endian  
CPU(s):                64  
On-line CPU(s) list:   0-63  
Thread(s) per core:    2  
Core(s) per socket:    32  
Socket(s):             1  
NUMA node(s):          1  
Vendor ID:             GenuineIntel  
CPU family:            6  
Model:                 85  
Model name:            Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz  
Stepping:              4  
CPU MHz:               2500.008  
BogoMIPS:              5000.01  
Hypervisor vendor:     KVM  
Virtualization type:   full  
L1d cache:             32K  
L1i cache:             32K  
L2 cache:              1024K  
L3 cache:              33792K  
NUMA node0 CPU(s):     0-63  
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 spec_ctrl intel_stibp  
  
[root@pg11-test ~]# free -g  
              total        used        free      shared  buff/cache   available  
Mem:            503         313           3          17         186         170  
Swap:             0           0           0  
  
  
[root@pg11-test ~]# lsblk  
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT  
vda             253:0    0  200G  0 disk   
└─vda1          253:1    0  200G  0 part /  
vdb             253:16   0  1.8T  0 disk   
├─vgdata01-lv01 252:0    0    4T  0 lvm  /data01  
├─vgdata01-lv02 252:1    0    4T  0 lvm  /data02  
├─vgdata01-lv03 252:2    0    4T  0 lvm  /data03  
└─vgdata01-lv04 252:3    0    2T  0 lvm  /data04  
vdc             253:32   0  1.8T  0 disk   
├─vgdata01-lv01 252:0    0    4T  0 lvm  /data01  
├─vgdata01-lv02 252:1    0    4T  0 lvm  /data02  
├─vgdata01-lv03 252:2    0    4T  0 lvm  /data03  
└─vgdata01-lv04 252:3    0    2T  0 lvm  /data04  
vdd             253:48   0  1.8T  0 disk   
├─vgdata01-lv01 252:0    0    4T  0 lvm  /data01  
├─vgdata01-lv02 252:1    0    4T  0 lvm  /data02  
├─vgdata01-lv03 252:2    0    4T  0 lvm  /data03  
└─vgdata01-lv04 252:3    0    2T  0 lvm  /data04  
vde             253:64   0  1.8T  0 disk   
├─vgdata01-lv01 252:0    0    4T  0 lvm  /data01  
├─vgdata01-lv02 252:1    0    4T  0 lvm  /data02  
├─vgdata01-lv03 252:2    0    4T  0 lvm  /data03  
└─vgdata01-lv04 252:3    0    2T  0 lvm  /data04  
vdf             253:80   0  1.8T  0 disk   
├─vgdata01-lv01 252:0    0    4T  0 lvm  /data01  
├─vgdata01-lv02 252:1    0    4T  0 lvm  /data02  
├─vgdata01-lv03 252:2    0    4T  0 lvm  /data03  
└─vgdata01-lv04 252:3    0    2T  0 lvm  /data04  
vdg             253:96   0  1.8T  0 disk   
├─vgdata01-lv01 252:0    0    4T  0 lvm  /data01  
├─vgdata01-lv02 252:1    0    4T  0 lvm  /data02  
├─vgdata01-lv03 252:2    0    4T  0 lvm  /data03  
└─vgdata01-lv04 252:3    0    2T  0 lvm  /data04  
vdh             253:112  0  1.8T  0 disk   
├─vgdata01-lv01 252:0    0    4T  0 lvm  /data01  
├─vgdata01-lv02 252:1    0    4T  0 lvm  /data02  
├─vgdata01-lv03 252:2    0    4T  0 lvm  /data03  
└─vgdata01-lv04 252:3    0    2T  0 lvm  /data04  
vdi             253:128  0  1.8T  0 disk   
├─vgdata01-lv01 252:0    0    4T  0 lvm  /data01  
├─vgdata01-lv02 252:1    0    4T  0 lvm  /data02  
├─vgdata01-lv03 252:2    0    4T  0 lvm  /data03  
└─vgdata01-lv04 252:3    0    2T  0 lvm  /data04  
  
  
[root@pg11-test ~]# pvs  
  PV         VG       Fmt  Attr PSize  PFree  
  /dev/vdb   vgdata01 lvm2 a--  <1.75t    0   
  /dev/vdc   vgdata01 lvm2 a--  <1.75t    0   
  /dev/vdd   vgdata01 lvm2 a--  <1.75t    0   
  /dev/vde   vgdata01 lvm2 a--  <1.75t    0   
  /dev/vdf   vgdata01 lvm2 a--  <1.75t    0   
  /dev/vdg   vgdata01 lvm2 a--  <1.75t    0   
  /dev/vdh   vgdata01 lvm2 a--  <1.75t    0   
  /dev/vdi   vgdata01 lvm2 a--  <1.75t    0   
[root@pg11-test ~]# vgs  
  VG       #PV #LV #SN Attr   VSize   VFree  
  vgdata01   8   4   0 wz--n- <13.97t    0   
[root@pg11-test ~]# lvs  
  LV   VG       Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert  
  lv01 vgdata01 -wi-ao----  4.00t                                                      
  lv02 vgdata01 -wi-ao----  4.00t                                                      
  lv03 vgdata01 -wi-ao----  4.00t                                                      
  lv04 vgdata01 -wi-ao---- <1.97t   
  
  
[root@pg11-test ~]# lvdisplay -vv  
      devices/global_filter not found in config: defaulting to global_filter = [ "a|.*/|" ]  
      Setting global/locking_type to 1  
      Setting global/use_lvmetad to 1  
      global/lvmetad_update_wait_time not found in config: defaulting to 10  
      Setting response to OK  
      Setting protocol to lvmetad  
      Setting version to 1  
      Setting global/use_lvmpolld to 1  
      Setting devices/sysfs_scan to 1  
      Setting devices/multipath_component_detection to 1  
      Setting devices/md_component_detection to 1  
      Setting devices/fw_raid_component_detection to 0  
      Setting devices/ignore_suspended_devices to 0  
      Setting devices/ignore_lvm_mirrors to 1  
      devices/filter not found in config: defaulting to filter = [ "a|.*/|" ]  
      Setting devices/cache_dir to /etc/lvm/cache  
      Setting devices/cache_file_prefix to   
      devices/cache not found in config: defaulting to /etc/lvm/cache/.cache  
      Setting devices/write_cache_state to 1  
      Setting global/use_lvmetad to 1  
      Setting activation/activation_mode to degraded  
      metadata/record_lvs_history not found in config: defaulting to 0  
      Setting activation/monitoring to 1  
      Setting global/locking_type to 1  
      Setting global/wait_for_locks to 1  
      File-based locking selected.  
      Setting global/prioritise_write_locks to 1  
      Setting global/locking_dir to /run/lock/lvm  
      Setting global/use_lvmlockd to 0  
      Setting response to OK  
      Setting token to filter:3239235440  
      Setting daemon_pid to 11015  
      Setting response to OK  
      Setting global_disable to 0  
      report/output_format not found in config: defaulting to basic  
      log/report_command_log not found in config: defaulting to 0  
      Obtaining the complete list of VGs before processing their LVs  
      Setting response to OK  
      Setting response to OK  
      Setting name to vgdata01  
      Processing VG vgdata01 jwrfAR-tEXe-qf6u-rd95-yhPW-O7Xw-JPUjyr  
      Locking /run/lock/lvm/V_vgdata01 RB  
      Reading VG vgdata01 jwrfAR-tEXe-qf6u-rd95-yhPW-O7Xw-JPUjyr  
      Setting response to OK  
      Setting response to OK  
      Setting name to vgdata01  
      Setting metadata/format to lvm2  
      Setting id to 8Wny3c-lLb1-27xY-9rFC-HCOc-XsaD-HmvN5l  
      Setting format to lvm2  
      Setting device to 64784  
      Setting dev_size to 3749707776  
      Setting label_sector to 1  
      Setting ext_flags to 1  
      Setting ext_version to 2  
      Setting size to 1044480  
      Setting start to 4096  
      Setting ignore to 0  
      Setting id to ClcfJi-9Omy-hZdN-ll46-B6J2-fAAL-MLrleE  
      Setting format to lvm2  
      Setting device to 64800  
      Setting dev_size to 3749707776  
      Setting label_sector to 1  
      Setting ext_flags to 1  
      Setting ext_version to 2  
      Setting size to 1044480  
      Setting start to 4096  
      Setting ignore to 0  
      Setting id to uFhANC-PCAV-JwJL-zSNn-O8np-I2Wi-ue8Vv1  
      Setting format to lvm2  
      Setting device to 64816  
      Setting dev_size to 3749707776  
      Setting label_sector to 1  
      Setting ext_flags to 1  
      Setting ext_version to 2  
      Setting size to 1044480  
      Setting start to 4096  
      Setting ignore to 0  
      Setting id to hKBbU0-a3gm-sHq1-eU7Q-ZJ3m-Iwoo-MuKzzj  
      Setting format to lvm2  
      Setting device to 64832  
      Setting dev_size to 3749707776  
      Setting label_sector to 1  
      Setting ext_flags to 1  
      Setting ext_version to 2  
      Setting size to 1044480  
      Setting start to 4096  
      Setting ignore to 0  
      Setting id to cOZaeJ-Drns-9BcP-5Aoq-oZ88-0hVs-M7K8SU  
      Setting format to lvm2  
      Setting device to 64848  
      Setting dev_size to 3749707776  
      Setting label_sector to 1  
      Setting ext_flags to 1  
      Setting ext_version to 2  
      Setting size to 1044480  
      Setting start to 4096  
      Setting ignore to 0  
      Setting id to EgaC5R-Q0An-X79Q-xGRL-5zDI-MN16-lclIBO  
      Setting format to lvm2  
      Setting device to 64864  
      Setting dev_size to 3749707776  
      Setting label_sector to 1  
      Setting ext_flags to 1  
      Setting ext_version to 2  
      Setting size to 1044480  
      Setting start to 4096  
      Setting ignore to 0  
      Setting id to NnvDT4-eUM4-V2dP-Fqv1-O28z-OVoH-z939Bh  
      Setting format to lvm2  
      Setting device to 64880  
      Setting dev_size to 3749707776  
      Setting label_sector to 1  
      Setting ext_flags to 1  
      Setting ext_version to 2  
      Setting size to 1044480  
      Setting start to 4096  
      Setting ignore to 0  
      Setting id to XZsZfn-y2aH-MiNA-mo95-jpdQ-Jufp-eIgBga  
      Setting format to lvm2  
      Setting device to 64896  
      Setting dev_size to 3749707776  
      Setting label_sector to 1  
      Setting ext_flags to 1  
      Setting ext_version to 2  
      Setting size to 1044480  
      Setting start to 4096  
      Setting ignore to 0  
      Setting response to OK  
      Setting response to OK  
      /dev/vdb: size is 3749707776 sectors  
      /dev/vdc: size is 3749707776 sectors  
      /dev/vdd: size is 3749707776 sectors  
      /dev/vde: size is 3749707776 sectors  
      /dev/vdf: size is 3749707776 sectors  
      /dev/vdg: size is 3749707776 sectors  
      /dev/vdh: size is 3749707776 sectors  
      /dev/vdi: size is 3749707776 sectors  
      Adding vgdata01/lv01 to the list of LVs to be processed.  
      Adding vgdata01/lv02 to the list of LVs to be processed.  
      Adding vgdata01/lv03 to the list of LVs to be processed.  
      Adding vgdata01/lv04 to the list of LVs to be processed.  
      Processing LV lv01 in VG vgdata01.  
  --- Logical volume ---  
      global/lvdisplay_shows_full_device_path not found in config: defaulting to 0  
  LV Path                /dev/vgdata01/lv01  
  LV Name                lv01  
  VG Name                vgdata01  
  LV UUID                GtVTn9-mWcL-sTJA-QyRq-VocV-eu1s-374mkU  
  LV Write Access        read/write  
  LV Creation host, time pg11-test, 2018-08-24 20:44:30 +0800  
  LV Status              available  
  # open                 1  
  LV Size                4.00 TiB  
  Current LE             32768  
  Segments               1  
  Allocation             inherit  
  Read ahead sectors     auto  
  - currently set to     8192  
  Block device           252:0  
     
      Processing LV lv02 in VG vgdata01.  
  --- Logical volume ---  
      global/lvdisplay_shows_full_device_path not found in config: defaulting to 0  
  LV Path                /dev/vgdata01/lv02  
  LV Name                lv02  
  VG Name                vgdata01  
  LV UUID                17VdCH-KVNZ-FF3a-g7ic-IY4y-qav3-jdX3CJ  
  LV Write Access        read/write  
  LV Creation host, time pg11-test, 2018-08-24 20:44:37 +0800  
  LV Status              available  
  # open                 1  
  LV Size                4.00 TiB  
  Current LE             32768  
  Segments               1  
  Allocation             inherit  
  Read ahead sectors     auto  
  - currently set to     8192  
  Block device           252:1  
     
      Processing LV lv03 in VG vgdata01.  
  --- Logical volume ---  
      global/lvdisplay_shows_full_device_path not found in config: defaulting to 0  
  LV Path                /dev/vgdata01/lv03  
  LV Name                lv03  
  VG Name                vgdata01  
  LV UUID                XY3M0w-EJdu-rx4z-Jn9n-QigT-mAVi-zps4te  
  LV Write Access        read/write  
  LV Creation host, time pg11-test, 2018-08-24 20:44:57 +0800  
  LV Status              available  
  # open                 1  
  LV Size                4.00 TiB  
  Current LE             32768  
  Segments               1  
  Allocation             inherit  
  Read ahead sectors     auto  
  - currently set to     8192  
  Block device           252:2  
     
      Processing LV lv04 in VG vgdata01.  
  --- Logical volume ---  
      global/lvdisplay_shows_full_device_path not found in config: defaulting to 0  
  LV Path                /dev/vgdata01/lv04  
  LV Name                lv04  
  VG Name                vgdata01  
  LV UUID                vWtHPq-ycHf-n8AO-3E0V-R5F6-WTXc-LocpJ8  
  LV Write Access        read/write  
  LV Creation host, time pg11-test, 2018-09-28 10:08:27 +0800  
  LV Status              available  
  # open                 1  
  LV Size                <1.97 TiB  
  Current LE             16120  
  Segments               1  
  Allocation             inherit  
  Read ahead sectors     auto  
  - currently set to     8192  
  Block device           252:3  
     
      Unlocking /run/lock/lvm/V_vgdata01  
      Setting global/notify_dbus to 1  
  
  
  
  
  
./tpcc.lua --pgsql-host=/tmp --pgsql-port=4801 --pgsql-user=postgres --pgsql-db=postgres --threads=96 --tables=1 --scale=1000 --trx_level=RC --db-ps-mode=auto --db-driver=pgsql --time=60 --report-interval=5 --enable_purge=yes run  
  
SQL statistics:  
    queries performed:  
        read:                            12443189  
        write:                           12830786  
        other:                           1992972  
        total:                           27266947  
    transactions:                        994038 (16549.36 per sec.)  
    queries:                             27266947 (453957.01 per sec.)  
    ignored errors:                      4229   (70.41 per sec.)  
    reconnects:                          0      (0.00 per sec.)  
  
General statistics:  
    total time:                          60.0634s  
    total number of events:              994038  
  
Latency (ms):  
         min:                                    0.36  
         avg:                                    5.79  
         max:                                  138.96  
         95th percentile:                       16.41  
         sum:                              5757585.45  
  
Threads fairness:  
    events (avg/stddev):           10354.5625/127.26  
    execution time (avg/stddev):   59.9748/0.01  
  
  
Data load rate: 89.5 MB/s

tpmC total: 993,000

tpmC new-order: 414,000

Results for 1000 warehouses

tpmC total: 360,000

tpmC new-order: 150,000

Bottleneck: I/O

Performance Summary

1. PG 12 (self-managed, single instance), ECS 16c128g + 2 × 1.8T local SSD
1000 warehouses, 64 threads, tpmC total: 260,000
10,000 warehouses, 64 threads, tpmC total: 130,000

2. PG 12 (self-managed, single instance), ECS 64c512g + 8 × 1.8T local SSD
1000 warehouses, 64 threads, tpmC total: 990,000
10,000 warehouses, 64 threads, tpmC total: 410,000

3. PG 12 (self-managed, single instance) (104c 768g, 32TB ESSD, huge pages, shared_buffers=600GB)
unlogged table:
1000 warehouses, 208 threads, tpmC total: 1,840,000
logged table:
1000 warehouses, 104 threads, tpmC total: 1,680,000

References

"PostgreSQL 11 tpcc test (1.03 million tpmC on ECS) - use sysbench-tpcc by Percona-Lab"

https://github.com/digoal/sysbench-tpcc/blob/master/README.md