Database Kernel Monthly Report - 2019 / 07

PgSQL · Use Case · Querying Database Logs with SQL

Background

If CSV-format logging is enabled, file_fdw can be used to query the database logs (slow log, error log, audit log) with SQL, as described in:

《Querying Database Logs with SQL - file_fdw , csvlog , program , find》

What format are the logs in? Which directory holds them? Where is the data directory? All of this can be obtained from configuration parameters, so there is no need to record any of it separately: we can use functions directly to create a file_fdw foreign table for querying the logs.

If the logs are not in CSV format, the pg_read_file function can be used to read the raw log content.
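For instance, a plain-text (stderr-format) log file can be dumped directly; this is only a sketch, the file name is taken from the directory listing shown later in this article, and a relative path passed to pg_read_file is resolved under the data directory (here log_directory is assumed to be the default "log"):

```sql
-- read the first 1000 bytes of a stderr-format log file;
-- relative paths are resolved against the data directory
select pg_read_file('log/postgresql-2019-03-26_062949.log', 0, 1000);
```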

log_destination  
  
Valid values are combinations of "stderr", "syslog", "csvlog", and "eventlog", depending on the platform.  
  

Store the log and data directory locations in psql variables; this makes it easier to create the fdw table.

man psql

Usage of variables:

《PostgreSQL Oracle Compatibility - psql / sqlplus Variables & set variable》

《Querying Database Logs with SQL - file_fdw , csvlog , program , find》

In the psql client, ending a query with \gset instead of a semicolon stores the result in psql variables.

As shown below, this creates two psql variables, data_directory and log_directory.

1. Get the locations of the database's data directory and log directory

select setting as data_directory from pg_settings where name='data_directory'  
\gset  
  
select setting as log_directory from pg_settings where name='log_directory'  
\gset  
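Once stored, the variables can be inspected with \echo, and the :'variable' syntax interpolates a variable into a query as a properly quoted SQL literal (a quick sketch):

```sql
\echo :data_directory :log_directory
select :'data_directory' || '/' || :'log_directory' as log_path;
```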

2. pg_ls_logdir() lists the files in the log directory

postgres=# select * from pg_ls_logdir() order by modification;  
               name               |  size  |      modification        
----------------------------------+--------+------------------------  
 postgresql-2019-03-26_062949.log |    168 | 2019-03-26 14:29:49+08  
 postgresql-2019-03-26_062949.csv |   1800 | 2019-03-26 14:41:48+08  
 postgresql-2019-03-26_064150.csv | 143182 | 2019-03-27 10:03:12+08  
 postgresql-2019-03-26_064150.log |   1734 | 2019-03-27 10:03:12+08  
 postgresql-2019-03-27_020316.log |    168 | 2019-03-27 10:03:16+08  
 postgresql-2019-03-27_020316.csv |  25697 | 2019-05-25 00:11:30+08  
 postgresql-2019-05-28_080838.log |   8085 | 2019-07-10 09:39:33+08  
 postgresql-2019-05-28_080838.csv |  46466 | 2019-07-10 10:08:10+08  
(8 rows)  
  
postgres=# select * from pg_ls_logdir() where name ~ 'csv$' order by modification;  
               name               |  size  |      modification        
----------------------------------+--------+------------------------  
 postgresql-2019-03-26_062949.csv |   1800 | 2019-03-26 14:41:48+08  
 postgresql-2019-03-26_064150.csv | 143182 | 2019-03-27 10:03:12+08  
 postgresql-2019-03-27_020316.csv |  25697 | 2019-05-25 00:11:30+08  
 postgresql-2019-05-28_080838.csv |  46466 | 2019-07-10 10:08:10+08  
(4 rows)  

3. Store the newest CSV log file name in a variable

select name as log_name from pg_ls_logdir() where name ~ 'csv$' order by modification desc limit 1  
\gset  

4. Splice the SQL together from psql variables and read the file content directly with pg_read_file, specifying which byte range to read.

Note that the result returned by pg_read_file is neither split into lines nor separated into fields; the examples below read 100 bytes and then 10000 bytes starting at byte 0. On the other hand, pg_read_file can read files of any format.

select log from pg_read_file(:'data_directory'||'/'||:'log_directory'||'/'||:'log_name', 0, 100) as log;  
postgres=# select log from pg_read_file(:'data_directory'||'/'||:'log_directory'||'/'||:'log_name', 0, 10000) as log;  
                                                                                                                                                                                  log               
                                                                                                                                                                        
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
----------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 2019-07-20 00:02:14.771 CST,,,10766,,5d2aea65.2a0e,19,,2019-07-14 16:40:05 CST,,0,LOG,00000,"checkpoint starting: time",,,,,,,,"LogCheckpointStart, xlog.c:8340",""                                
                                                                                                                                                                     +  
 2019-07-20 00:02:27.258 CST,,,10766,,5d2aea65.2a0e,20,,2019-07-14 16:40:05 CST,,0,LOG,00000,"checkpoint complete: wrote 124 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=12.  
484 s, sync=0.000 s, total=12.487 s; sync files=48, longest=0.000 s, average=0.000 s; distance=218 kB, estimate=1753715 kB",,,,,,,,"LogCheckpointEnd, xlog.c:8422",""+  
 2019-07-20 00:30:09.784 CST,,,65215,"[local]",5d31f011.febf,1,"",2019-07-20 00:30:09 CST,,0,LOG,00000,"connection received: host=[local]",,,,,,,,"BackendInitialize, postmaster.c:4294",""         
                                                                                                                                                                     +  
 2019-07-20 00:30:09.785 CST,"postgres","postgres",65215,"[local]",5d31f011.febf,2,"authentication",2019-07-20 00:30:09 CST,3/20405,0,LOG,00000,"connection authorized: user=postgres database=pos  
tgres application_name=psql",,,,,,,,"PerformAuthentication, postinit.c:303",""                                                                                       +  
 2019-07-20 00:37:14.358 CST,,,10766,,5d2aea65.2a0e,21,,2019-07-14 16:40:05 CST,,0,LOG,00000,"checkpoint starting: time",,,,,,,,"LogCheckpointStart, xlog.c:8340",""                                
                                                                                                                                                                     +  
 2019-07-20 00:37:14.624 CST,,,10766,,5d2aea65.2a0e,22,,2019-07-14 16:40:05 CST,,0,LOG,00000,"checkpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.264  
 s, sync=0.000 s, total=0.266 s; sync files=2, longest=0.000 s, average=0.000 s; distance=5 kB, estimate=1578344 kB",,,,,,,,"LogCheckpointEnd, xlog.c:8422",""       +  
   
(1 row)  

5. regexp_split_to_table can split the content into lines, but multi-line entries (for example, SQL statements that span lines) are hard to handle this way

https://www.postgresql.org/docs/12/functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE

postgres=# select regexp_split_to_table(log,'\n') from pg_read_file(:'data_directory'||'/'||:'log_directory'||'/'||:'log_name', 0, 1000000) as log;  
                                                                                                                                                                         regexp_split_to_table      
                                                                                                                                                                        
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
----------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 2019-07-20 00:02:14.771 CST,,,10766,,5d2aea65.2a0e,19,,2019-07-14 16:40:05 CST,,0,LOG,00000,"checkpoint starting: time",,,,,,,,"LogCheckpointStart, xlog.c:8340",""  
 2019-07-20 00:02:27.258 CST,,,10766,,5d2aea65.2a0e,20,,2019-07-14 16:40:05 CST,,0,LOG,00000,"checkpoint complete: wrote 124 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=12.  
484 s, sync=0.000 s, total=12.487 s; sync files=48, longest=0.000 s, average=0.000 s; distance=218 kB, estimate=1753715 kB",,,,,,,,"LogCheckpointEnd, xlog.c:8422",""  
 2019-07-20 00:30:09.784 CST,,,65215,"[local]",5d31f011.febf,1,"",2019-07-20 00:30:09 CST,,0,LOG,00000,"connection received: host=[local]",,,,,,,,"BackendInitialize, postmaster.c:4294",""  
 2019-07-20 00:30:09.785 CST,"postgres","postgres",65215,"[local]",5d31f011.febf,2,"authentication",2019-07-20 00:30:09 CST,3/20405,0,LOG,00000,"connection authorized: user=postgres database=pos  
tgres application_name=psql",,,,,,,,"PerformAuthentication, postinit.c:303",""  
 2019-07-20 00:37:14.358 CST,,,10766,,5d2aea65.2a0e,21,,2019-07-14 16:40:05 CST,,0,LOG,00000,"checkpoint starting: time",,,,,,,,"LogCheckpointStart, xlog.c:8340",""  
 2019-07-20 00:37:14.624 CST,,,10766,,5d2aea65.2a0e,22,,2019-07-14 16:40:05 CST,,0,LOG,00000,"checkpoint complete: wrote 2 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.264  
 s, sync=0.000 s, total=0.266 s; sync files=2, longest=0.000 s, average=0.000 s; distance=5 kB, estimate=1578344 kB",,,,,,,,"LogCheckpointEnd, xlog.c:8422",""  
   
(7 rows)  
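If line numbers are needed as well, WITH ORDINALITY can be attached to the set-returning function in the FROM clause (a sketch, reusing the psql variables set earlier; for function calls in FROM, references to earlier FROM items are implicitly LATERAL):

```sql
-- number each split-out log line
select t.n, t.line
from pg_read_file(:'data_directory'||'/'||:'log_directory'||'/'||:'log_name', 0, 1000000) as f(log),
     regexp_split_to_table(f.log, '\n') with ordinality as t(line, n);
```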

6. Use file_fdw to view the log content; the output is rendered perfectly.

create schema _sys;  
set search_path=_sys,public,"$user";  
create extension file_fdw with schema _sys;  
CREATE SERVER pglog FOREIGN DATA WRAPPER file_fdw;    

To let file_fdw read all of the log files at once, use the program option to produce the file content. Store the foreign table's program string in a variable:

\set pro '''find ':data_directory'/':log_directory' -type f -name "*.csv" -exec cat {} \\;'', format ''csv'''  
  
  
postgres=# \set pro '''find ':data_directory'/':log_directory' -type f -name "*.csv" -exec cat {} \\;'', format ''csv'''  
postgres=# \echo :pro  
'find /data01/pg12/pg_root12000/log -type f -name "*.csv" -exec cat {} \;', format 'csv'  

7. Create the fdw table as follows

CREATE FOREIGN TABLE pglog (    
  log_time timestamp(3) with time zone,    
  user_name text,    
  database_name text,    
  process_id integer,    
  connection_from text,    
  session_id text,    
  session_line_num bigint,    
  command_tag text,    
  session_start_time timestamp with time zone,    
  virtual_transaction_id text,    
  transaction_id bigint,    
  error_severity text,    
  sql_state_code text,    
  message text,    
  detail text,    
  hint text,    
  internal_query text,    
  internal_query_pos integer,    
  context text,    
  query text,    
  query_pos integer,    
  location text,    
  application_name text    
) SERVER pglog    
OPTIONS ( program :pro );   

8. To filter out some records, create a view on top

create view v_log as select * from pglog where ... -- filter out content you don't want to expose  
grant select on v_log to public;    
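As an illustration only (the filter predicate below is hypothetical, not from the original setup), such a view might hide connection-lifecycle noise:

```sql
-- hypothetical filter: drop "connection received/authorized" and "disconnection" messages
create view v_log as
  select * from pglog
  where message !~ '^(connection|disconnection)';
grant select on v_log to public;
```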

9. Query examples

postgres=# select * from pglog limit 2;  
-[ RECORD 1 ]----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
-----------------------------------------------------  
log_time               | 2019-07-20 14:02:14.771+08  
user_name              |   
database_name          |   
process_id             | 10766  
connection_from        |   
session_id             | 5d2aea65.2a0e  
session_line_num       | 19  
command_tag            |   
session_start_time     | 2019-07-15 06:40:05+08  
virtual_transaction_id |   
transaction_id         | 0  
error_severity         | LOG  
sql_state_code         | 00000  
message                | checkpoint starting: time  
detail                 |   
hint                   |   
internal_query         |   
internal_query_pos     |   
context                |   
query                  |   
query_pos              |   
location               | LogCheckpointStart, xlog.c:8340  
application_name       |   
-[ RECORD 2 ]----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
-----------------------------------------------------  
log_time               | 2019-07-20 14:02:27.258+08  
user_name              |   
database_name          |   
process_id             | 10766  
connection_from        |   
session_id             | 5d2aea65.2a0e  
session_line_num       | 20  
command_tag            |   
session_start_time     | 2019-07-15 06:40:05+08  
virtual_transaction_id |   
transaction_id         | 0  
error_severity         | LOG  
sql_state_code         | 00000  
message                | checkpoint complete: wrote 124 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=12.484 s, sync=0.000 s, total=12.487 s; sync files=48, longest=0.000 s,   
average=0.000 s; distance=218 kB, estimate=1753715 kB  
detail                 |   
hint                   |   
internal_query         |   
internal_query_pos     |   
context                |   
query                  |   
query_pos              |   
location               | LogCheckpointEnd, xlog.c:8422  
application_name       |   
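Since the log is now just a table, ordinary SQL filtering applies. Two sketches follow (column names come from the csvlog definition above; the 'duration:' message prefix assumes log_min_duration_statement is enabled):

```sql
-- errors and worse from the last day
select log_time, user_name, database_name, message
from pglog
where error_severity in ('ERROR', 'FATAL', 'PANIC')
  and log_time > now() - interval '1 day'
order by log_time;

-- slow statements recorded by log_min_duration_statement
select log_time, message
from pglog
where message like 'duration:%'
order by log_time desc;
```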
  

Creating the file_fdw table with plpgsql is even more convenient

do language plpgsql $$  
declare  
  v_datadir text;  
  v_logdir text;  
begin  
  select setting into v_datadir from pg_settings where name='data_directory';  
  select setting into v_logdir  from pg_settings where name='log_directory';  
  
  create schema IF NOT EXISTS _sys ;  
  
  set search_path=_sys,public,"$user";  
  
  create extension IF NOT EXISTS file_fdw with schema _sys;  
  
  CREATE SERVER IF NOT EXISTS pglog FOREIGN DATA WRAPPER file_fdw;    
  
  execute format('CREATE FOREIGN TABLE IF NOT EXISTS pglog (    
    log_time timestamp(3) with time zone,    
    user_name text,    
    database_name text,    
    process_id integer,    
    connection_from text,    
    session_id text,    
    session_line_num bigint,    
    command_tag text,    
    session_start_time timestamp with time zone,    
    virtual_transaction_id text,    
    transaction_id bigint,    
    error_severity text,    
    sql_state_code text,    
    message text,    
    detail text,    
    hint text,    
    internal_query text,    
    internal_query_pos integer,    
    context text,    
    query text,    
    query_pos integer,    
    location text,    
    application_name text    
    ) SERVER pglog    
    OPTIONS ( program %L , format ''csv'')' ,  
  format('find %s/%s -type f -name "*.csv" -exec cat {} \;', v_datadir, v_logdir)   
  );   
  
end;  
$$;  

Convenient, isn't it? Once the block above has run, the pglog foreign table is created automatically; querying it shows the log content, and line breaks within entries are handled correctly.

-[ RECORD 15 ]---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
-----------------------------------------------------  
log_time               | 2019-07-20 14:47:35.418+08  
user_name              | postgres  
database_name          | postgres  
process_id             | 65215  
connection_from        | [local]  
session_id             | 5d31f011.febf  
session_line_num       | 11  
command_tag            | idle  
session_start_time     | 2019-07-20 14:30:09+08  
virtual_transaction_id | 3/20443  
transaction_id         | 0  
error_severity         | LOG  
sql_state_code         | 00000  
message                | statement: SELECT n.nspname as "Schema",                                                                                                                                   
                                                     +  
                       |   p.proname as "Name",                                                                                                                                                     
                                                     +  
                       |   pg_catalog.pg_get_function_result(p.oid) as "Result data type",                                                                                                          
                                                     +  
                       |   pg_catalog.pg_get_function_arguments(p.oid) as "Argument data types",                                                                                                    
                                                     +  
                       |  CASE p.prokind                                                                                                                                                            
                                                     +  
                       |   WHEN 'a' THEN 'agg'                                                                                                                                                      
                                                     +  
                       |   WHEN 'w' THEN 'window'                                                                                                                                                   
                                                     +  
                       |   WHEN 'p' THEN 'proc'                                                                                                                                                     
                                                     +  
                       |   ELSE 'func'                                                                                                                                                              
                                                     +  
                       |  END as "Type"                                                                                                                                                             
                                                     +  
                       | FROM pg_catalog.pg_proc p                                                                                                                                                  
                                                     +  
                       |      LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace                                                                                                         
                                                     +  
                       | WHERE p.proname OPERATOR(pg_catalog.~) '^(.*ordi.*)$' COLLATE pg_catalog.default                                                                                           
                                                     +  
                       | ORDER BY 1, 2, 4;  
detail                 |   
hint                   |   
internal_query         |   
internal_query_pos     |   
context                |   
query                  |   
query_pos              |   
location               | exec_simple_query, postgres.c:1045  
application_name       | psql  

References

man find

man psql

《PostgreSQL Oracle Compatibility - psql / sqlplus Variables & set variable》

《Querying Database Logs with SQL - file_fdw , csvlog , program , find》

https://www.postgresql.org/docs/12/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-CSVLOG

https://www.postgresql.org/docs/12/file-fdw.html

https://www.postgresql.org/docs/12/functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE