Using pg_waldump

2021-06-10 19:12  https://www.cnblogs.com/zhangf

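Contents
  • Generating load with pgbench
  • Viewing WAL statistics
  • Analyzing the main contents of the WAL
  • Finding the last checkpoint in the WAL
  • Finding the transaction ID of a delete with pg_waldump
  • What pg_waldump does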

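    pg_waldump decodes and displays the PostgreSQL write-ahead log (WAL) for debugging. Typical uses:

    1. Analyzing the WAL records that pg_waldump prints

    2. Investigating why full-page images (FPIs) take up a large share of the WAL

    3. Understanding what full_page_writes does

    4. Dealing with a large WAL: compression (for example the wal_compression setting) is the more reliable fix. Setting full_page_writes to off to gain performance is not recommended; it is as unreasonable as setting fsync to off for the same purpose.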

    [thunisoft@sdbserver1 pg_wal]$ pg_waldump --help
    pg_waldump decodes and displays PostgreSQL write-ahead logs for debugging.
    
    Usage:
    pg_waldump [OPTION]... [STARTSEG [ENDSEG]]
    
    Options:
    -b, --bkp-details      output detailed information about backup blocks  --backup blocks are the full-page images
    -e, --end=RECPTR       stop reading at WAL location RECPTR
    -f, --follow           keep retrying after reaching end of WAL
    -n, --limit=N          number of records to display
    -p, --path=PATH        directory in which to find log segment files or a  --a WAL file can also be named directly as STARTSEG
                           directory with a ./pg_wal that contains such files
                           (default: current directory, ./pg_wal, $PGDATA/pg_wal)
    -r, --rmgr=RMGR        only show records generated by resource manager RMGR;
                           use --rmgr=list to list valid resource manager names
    -s, --start=RECPTR     start reading at WAL location RECPTR
    -t, --timeline=TLI     timeline from which to read log records
                           (default: 1 or the value used in STARTSEG)
    -V, --version          output version information, then exit
    -x, --xid=XID          only show records with transaction ID XID
    -z, --stats[=record]   show statistics instead of records
                           (optionally, show per-record statistics)
    -?, --help             show this help, then exit

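    PostgreSQL 10 renamed xlog to wal (and location to lsn) in tool and function names:

    PG 9.x and earlier                                               | PG 10 and later
    -----------------------------------------------------------------+---------------------------------------------------
    pg_xlogdump                                                      | pg_waldump
    pg_xlogfile_name(pg_current_xlog_location())                     | pg_walfile_name(pg_current_wal_lsn())
    pg_xlog_location_diff(pg_current_xlog_location(),flush_location) | pg_wal_lsn_diff(pg_current_wal_lsn(),flush_lsn)
    pg_xlog_location_diff(pg_current_xlog_location(),restart_lsn)    | pg_wal_lsn_diff(pg_current_wal_lsn(),restart_lsn)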

    Decoding the WAL with pg_waldump

    postgres=# create table t_wal(n_id int);
    CREATE TABLE
    postgres=# select pg_current_wal_lsn();
    pg_current_wal_lsn
    --------------------
    0/8E459DB0
    (1 row)
    
    postgres=# insert into t_wal values(1);
    INSERT 0 1
    postgres=# insert into t_wal values(2);
    INSERT 0 1
    postgres=# checkpoint;
    CHECKPOINT
    postgres=# select pg_current_wal_lsn();
    pg_current_wal_lsn
    --------------------
    0/8E459F70
    (1 row)
    
    postgres=# \q
    [thunisoft@sdbserver1 pg_wal]$ pg_waldump  -s   0/8E459DB0 -e 0/8E459F70
    rmgr: Heap        len (rec/tot):     59/    59, tx:    7102114, lsn: 0/8E459DB0, prev 0/8E459D78, desc: INSERT+INIT off 1, blkref #0: rel 1663/13287/23713247 blk 0
    rmgr: Transaction len (rec/tot):     46/    46, tx:    7102114, lsn: 0/8E459DF0, prev 0/8E459DB0, desc: COMMIT 2021-06-08 16:49:51.835650 CST
    rmgr: Heap        len (rec/tot):     59/    59, tx:    7102115, lsn: 0/8E459E20, prev 0/8E459DF0, desc: INSERT off 2, blkref #0: rel 1663/13287/23713247 blk 0
    rmgr: Transaction len (rec/tot):     46/    46, tx:    7102115, lsn: 0/8E459E60, prev 0/8E459E20, desc: COMMIT 2021-06-08 16:49:55.316314 CST
    rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/8E459E90, prev 0/8E459E60, desc: RUNNING_XACTS nextXid 7102116 latestCompletedXid 7102115 oldestRunningXid 7102116
    rmgr: XLOG        len (rec/tot):    106/   106, tx:          0, lsn: 0/8E459EC8, prev 0/8E459E90, desc: CHECKPOINT_ONLINE redo 0/8E459E90; tli 1; prev tli 1; fpw true; xid 0:7102116; oid 23721430; multi 1; offset 0; oldest xid 563 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 7102116; online
    rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/8E459F38, prev 0/8E459EC8, desc: RUNNING_XACTS nextXid 7102116 latestCompletedXid 7102115 oldestRunningXid 7102116
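
The two values passed to -s and -e above are positions in one 64-bit byte stream, written as two hexadecimal halves, so the amount of WAL between them is plain subtraction (the same idea as pg_wal_lsn_diff). A minimal Python sketch of that arithmetic; the helper names parse_lsn and lsn_diff are my own, not part of PostgreSQL:

```python
def parse_lsn(text):
    """Convert an LSN such as '0/8E459DB0' into a 64-bit byte position."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lsn_diff(end, start):
    """Number of WAL bytes between two LSNs (end minus start)."""
    return parse_lsn(end) - parse_lsn(start)


if __name__ == "__main__":
    # The range dumped above with -s 0/8E459DB0 -e 0/8E459F70
    print(lsn_diff("0/8E459F70", "0/8E459DB0"))  # prints 448
```

The 448 bytes agree with the dump: the seven records shown have total lengths 59, 46, 59, 46, 50, 106 and 50, and each one rounded up to a multiple of 8 gives 64 + 48 + 64 + 48 + 56 + 112 + 56 = 448.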

    Specifying a WAL file

    postgres=#  select pg_current_wal_lsn(),
    postgres-#         pg_walfile_name(pg_current_wal_lsn()),
    postgres-#         pg_walfile_name_offset(pg_current_wal_lsn());
    pg_current_wal_lsn |     pg_walfile_name      |       pg_walfile_name_offset
    --------------------+--------------------------+-------------------------------------
    0/FAFEBE50         | 0000000100000000000000FA | (0000000100000000000000FA,16694864)
    (1 row)
    
    postgres=# insert into t_wal values(2);
    INSERT 0 1
    postgres=# checkpoint;
    CHECKPOINT
    postgres=#  select pg_current_wal_lsn(),
    pg_walfile_name(pg_current_wal_lsn()),
    pg_walfile_name_offset(pg_current_wal_lsn());
    pg_current_wal_lsn |     pg_walfile_name      |       pg_walfile_name_offset
    --------------------+--------------------------+-------------------------------------
    0/FAFEC088         | 0000000100000000000000FA | (0000000100000000000000FA,16695432)
    (1 row)
    
    --Specify the WAL file: 0000000100000000000000FA
    [thunisoft@sdbserver1 pg_wal]$ pg_waldump  $PGDATA/pg_wal/0000000100000000000000FA -s  0/FAFEBE50
    rmgr: Heap        len (rec/tot):     64/   268, tx:    7134486, lsn: 0/FAFEBE50, prev 0/FAFEBE18, desc: INSERT off 5, blkref #0: rel 1663/13287/23713247 blk 0 FPW
    rmgr: Transaction len (rec/tot):     46/    46, tx:    7134486, lsn: 0/FAFEBF60, prev 0/FAFEBE50, desc: COMMIT 2021-06-08 18:42:16.028620 CST
    rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/FAFEBF90, prev 0/FAFEBF60, desc: RUNNING_XACTS nextXid 7134487 latestCompletedXid 7134486 oldestRunningXid 7134487
    rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/FAFEBFC8, prev 0/FAFEBF90, desc: RUNNING_XACTS nextXid 7134487 latestCompletedXid 7134486 oldestRunningXid 7134487
    rmgr: XLOG        len (rec/tot):    106/   106, tx:          0, lsn: 0/FAFEC018, prev 0/FAFEBFC8, desc: CHECKPOINT_ONLINE redo 0/FAFEBFC8; tli 1; prev tli 1; fpw true; xid 0:7134487; oid 23721430; multi 1; offset 0; oldest xid 563 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 7134487; online
    rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/FAFEC088, prev 0/FAFEC018, desc: RUNNING_XACTS nextXid 7134487 latestCompletedXid 7134486 oldestRunningXid 7134487
    pg_waldump: FATAL:  error in WAL record at 0/FAFEC088: invalid record length at 0/FAFEC0C0: wanted 24, got 0
    
    --Without naming the WAL file (only -p), the result is the same; the FATAL line just marks the end of the WAL written so far
    [thunisoft@sdbserver1 pg_wal]$ pg_waldump -p $PGDATA/pg_wal -s  0/FAFEBE50
    rmgr: Heap        len (rec/tot):     64/   268, tx:    7134486, lsn: 0/FAFEBE50, prev 0/FAFEBE18, desc: INSERT off 5, blkref #0: rel 1663/13287/23713247 blk 0 FPW
    rmgr: Transaction len (rec/tot):     46/    46, tx:    7134486, lsn: 0/FAFEBF60, prev 0/FAFEBE50, desc: COMMIT 2021-06-08 18:42:16.028620 CST
    rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/FAFEBF90, prev 0/FAFEBF60, desc: RUNNING_XACTS nextXid 7134487 latestCompletedXid 7134486 oldestRunningXid 7134487
    rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/FAFEBFC8, prev 0/FAFEBF90, desc: RUNNING_XACTS nextXid 7134487 latestCompletedXid 7134486 oldestRunningXid 7134487
    rmgr: XLOG        len (rec/tot):    106/   106, tx:          0, lsn: 0/FAFEC018, prev 0/FAFEBFC8, desc: CHECKPOINT_ONLINE redo 0/FAFEBFC8; tli 1; prev tli 1; fpw true; xid 0:7134487; oid 23721430; multi 1; offset 0; oldest xid 563 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 7134487; online
    rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/FAFEC088, prev 0/FAFEC018, desc: RUNNING_XACTS nextXid 7134487 latestCompletedXid 7134486 oldestRunningXid 7134487
    pg_waldump: FATAL:  error in WAL record at 0/FAFEC088: invalid record length at 0/FAFEC0C0: wanted 24, got 0
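
The file name and offset that pg_walfile_name_offset returned above can be derived from the LSN by hand, which helps when only a directory of segment files is available. The sketch below assumes the default 16 MB wal_segment_size and timeline 1; the function name is hypothetical. PostgreSQL treats an LSN that falls exactly on a segment boundary specially, which this sketch does not try to reproduce.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default wal_segment_size


def walfile_name_offset(lsn, timeline=1, segment_size=WAL_SEGMENT_SIZE):
    """Return (segment file name, byte offset inside it) for an LSN string."""
    high, low = lsn.split("/")
    position = (int(high, 16) << 32) | int(low, 16)
    segno = position // segment_size
    # Number of segments covered by one "high" half of the LSN (256 for 16 MB).
    segments_per_id = 0x100000000 // segment_size
    name = "%08X%08X%08X" % (
        timeline,
        segno // segments_per_id,
        segno % segments_per_id,
    )
    return name, position % segment_size


if __name__ == "__main__":
    print(walfile_name_offset("0/FAFEBE50"))
    # prints ('0000000100000000000000FA', 16694864)
```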

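    Meaning of each field

    rmgr: Heap
    PostgreSQL groups WAL records under more than 20 resource managers. This record belongs to the Heap resource manager, i.e. heap tables. Others include Btree, Transaction and so on.
    len (rec/tot): 59/    59
    tot is the total length of the WAL record, 59 bytes; rec is that length without any full-page image, also 59 bytes here. In a record flagged FPW, such as the 64/268 one in the previous section, the 204-byte difference is the page image.
    tx: 7102114
    The transaction ID.
    lsn: 0/8E459DB0
    The LSN of this WAL record.
    prev 0/8E459D78
    The LSN of the previous WAL record.
    desc: INSERT+INIT off 1
    An insert record (each resource manager can define at most 16 record types). +INIT means the page is initialized by this record, and off 1 is the tuple's position within the page.
    blkref #0: rel 1663/13287/23713247 blk 0
    The first referenced page belongs to the heap file 1663/13287/23713247 (tablespace OID/database OID/relfilenode), block 0 (the first half of the ctid). oid2name shows which table that is.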
    
    [thunisoft@sdbserver1 pg_wal]$ oid2name -o 23713247 -p 6777
    From database "postgres":
    Filenode  Table Name
    ----------------------
    23713247       t_wal
    
    postgres=# select pg_relation_filepath('t_wal');
    pg_relation_filepath
    ----------------------
    base/13287/23713247
    (1 row)
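
To work with these fields in a script rather than by eye, one record line can be split with a regular expression. This is a sketch written against the line layout shown in this article (PostgreSQL 11 output); other versions may format lines differently, and the function name and dictionary keys are my own.

```python
import re

RECORD_RE = re.compile(
    r"rmgr:\s+(?P<rmgr>\S+)\s+"
    r"len \(rec/tot\):\s*(?P<rec>\d+)/\s*(?P<tot>\d+),\s+"
    r"tx:\s*(?P<xid>\d+),\s+"
    r"lsn:\s*(?P<lsn>[0-9A-F]+/[0-9A-F]+),\s+"
    r"prev\s+(?P<prev>[0-9A-F]+/[0-9A-F]+),\s+"
    r"desc:\s+(?P<desc>.*)"
)
BLKREF_RE = re.compile(r"rel (\d+)/(\d+)/(\d+) blk (\d+)")


def parse_record(line):
    """Split one pg_waldump line into a dict, or return None if it does not match."""
    match = RECORD_RE.search(line)
    if match is None:
        return None
    record = match.groupdict()
    for key in ("rec", "tot", "xid"):
        record[key] = int(record[key])
    # Only the first block reference is extracted here.
    blkref = BLKREF_RE.search(record["desc"])
    if blkref:
        tablespace, database, relfilenode, block = map(int, blkref.groups())
        record.update(tablespace=tablespace, database=database,
                      relfilenode=relfilenode, block=block)
    return record


SAMPLE = ("rmgr: Heap        len (rec/tot):     59/    59, tx:    7102114, "
          "lsn: 0/8E459DB0, prev 0/8E459D78, desc: INSERT+INIT off 1, "
          "blkref #0: rel 1663/13287/23713247 blk 0")

if __name__ == "__main__":
    print(parse_record(SAMPLE))
```

The relfilenode field (23713247 in the sample) is the value to pass to oid2name or to compare against pg_relation_filepath.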

    rmgr=list

    [thunisoft@sdbserver1 pg_wal]$ pg_waldump  --rmgr=list
    XLOG
    Transaction
    Storage
    CLOG
    Database
    Tablespace
    MultiXact
    RelMap
    Standby
    Heap2
    Heap
    Btree
    Hash
    Gin
    Gist
    Sequence
    SPGist
    BRIN
    CommitTs
    ReplicationOrigin
    Generic
    LogicalMessage

    Generating load with pgbench

    [thunisoft@sdbserver1 ~]$ pgbench -i -s 20 pgbenchdb -p 6777
    dropping old tables...
    NOTICE:  table "pgbench_accounts" does not exist
    NOTICE:  table "pgbench_branches" does not exist
    NOTICE:  table "pgbench_history" does not exist
    NOTICE:  table "pgbench_tellers" does not exist
    creating tables...
    generating data...
    100000 of 2000000 tuples (5%) done (elapsed 0.44 s, remaining 8.41 s)
    200000 of 2000000 tuples (10%) done (elapsed 0.86 s, remaining 7.74 s)
    300000 of 2000000 tuples (15%) done (elapsed 2.04 s, remaining 11.54 s)
    400000 of 2000000 tuples (20%) done (elapsed 2.36 s, remaining 9.43 s)
    500000 of 2000000 tuples (25%) done (elapsed 3.88 s, remaining 11.65 s)
    600000 of 2000000 tuples (30%) done (elapsed 5.17 s, remaining 12.06 s)
    700000 of 2000000 tuples (35%) done (elapsed 5.48 s, remaining 10.17 s)
    800000 of 2000000 tuples (40%) done (elapsed 6.51 s, remaining 9.76 s)
    900000 of 2000000 tuples (45%) done (elapsed 7.51 s, remaining 9.18 s)
    1000000 of 2000000 tuples (50%) done (elapsed 7.73 s, remaining 7.73 s)
    1100000 of 2000000 tuples (55%) done (elapsed 8.78 s, remaining 7.19 s)
    1200000 of 2000000 tuples (60%) done (elapsed 9.48 s, remaining 6.32 s)
    1300000 of 2000000 tuples (65%) done (elapsed 9.70 s, remaining 5.23 s)
    1400000 of 2000000 tuples (70%) done (elapsed 10.50 s, remaining 4.50 s)
    1500000 of 2000000 tuples (75%) done (elapsed 13.49 s, remaining 4.50 s)
    1600000 of 2000000 tuples (80%) done (elapsed 13.76 s, remaining 3.44 s)
    1700000 of 2000000 tuples (85%) done (elapsed 14.87 s, remaining 2.62 s)
    1800000 of 2000000 tuples (90%) done (elapsed 15.12 s, remaining 1.68 s)
    1900000 of 2000000 tuples (95%) done (elapsed 15.97 s, remaining 0.84 s)
    2000000 of 2000000 tuples (100%) done (elapsed 16.72 s, remaining 0.00 s)
    vacuuming...
    creating primary keys...
    done.
    [thunisoft@sdbserver1 ~]$ pgbench -r -j2 -c4 -T60 pgbenchdb
    connection to database "pgbenchdb" failed:
    could not connect to server: Connection refused
    Is the server running locally and accepting
    connections on Unix domain socket "/tmp/.s.PGSQL.6543"?
    [thunisoft@sdbserver1 ~]$ pgbench -r -j2 -c4 -T60 pgbenchdb  -p 6777
    starting vacuum...end.
    transaction type: <builtin: TPC-B (sort of)>
    scaling factor: 20
    query mode: simple
    number of clients: 4
    number of threads: 2
    duration: 60 s
    number of transactions actually processed: 15985
    latency average = 15.025 ms
    tps = 266.215712 (including connections establishing)
    tps = 266.257963 (excluding connections establishing)
    statement latencies in milliseconds:
    0.006  \set aid random(1, 100000 * :scale)
    0.001  \set bid random(1, 1 * :scale)
    0.001  \set tid random(1, 10 * :scale)
    0.001  \set delta random(-5000, 5000)
    0.220  BEGIN;
    0.576  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
    0.348  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
    0.464  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
    1.073  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
    0.298  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
    11.963  END;

    Viewing WAL statistics

    [thunisoft@sdbserver1 pg_wal]$ pg_waldump -p $PGDATA/pg_wal -z  -s    0/8E48C2B0 -e 0/9E392748
    Type                                           N      (%)          Record size      (%)             FPI size      (%)        Combined size      (%)
    ----                                           -      ---          -----------      ---             --------      ---        -------------      ---
    XLOG                                        5491 (  2.94)               269059 (  0.12)             40443352 ( 99.62)             40712411 ( 15.31)
    Transaction                                16000 (  8.56)               739416 (  0.33)                    0 (  0.00)               739416 (  0.28)
    Storage                                       12 (  0.01)                  504 (  0.00)                    0 (  0.00)                  504 (  0.00)
    CLOG                                           1 (  0.00)                   30 (  0.00)                    0 (  0.00)                   30 (  0.00)
    Database                                       0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
    Tablespace                                     0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
    MultiXact                                      0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
    RelMap                                         0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
    Standby                                       34 (  0.02)                 1836 (  0.00)                    0 (  0.00)                 1836 (  0.00)
    Heap2                                      73844 ( 39.50)            216321593 ( 95.98)                32768 (  0.08)            216354361 ( 81.34)
    Heap                                       78541 ( 42.01)              7211473 (  3.20)                41548 (  0.10)              7253021 (  2.73)
    Btree                                      13015 (  6.96)               834561 (  0.37)                80352 (  0.20)               914913 (  0.34)
    Hash                                           0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
    Gin                                            0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
    Gist                                           0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
    Sequence                                       0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
    SPGist                                         0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
    BRIN                                           0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
    CommitTs                                       0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
    ReplicationOrigin                              0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
    Generic                                        0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
    LogicalMessage                                 0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
    --------                      --------                      --------                      --------
    Total                                     186938                     225378472 [84.74%]             40598020 [15.26%]            265976492 [100%]

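    Column meanings:

    Type: the resource manager that wrote the records
    N: number of records and their share of the total
    Record size: bytes of record data (without full-page images) and their share
    FPI size: bytes of full-page images and their share
    Combined size: Record size plus FPI size

    In this WAL range FPIs make up 15.26% and record data 84.74%. Posts online say that enabling full_page_writes makes FPIs dominate the WAL, but in my pgbench test the difference was small and the FPI share stayed around 15%. The share depends on the workload and on checkpoint frequency, because a full-page image is written only the first time a page is modified after a checkpoint.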
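
The percentages in the Total row can be checked directly from the byte counts, and the same arithmetic shows which resource manager the FPIs come from. A small sketch using only the numbers printed above; the variable and function names are my own:

```python
# FPI bytes per resource manager, copied from the "FPI size" column above
# (rows with 0 omitted).
FPI_BYTES = {"XLOG": 40443352, "Heap2": 32768, "Heap": 41548, "Btree": 80352}
RECORD_BYTES_TOTAL = 225378472  # "Record size" in the Total row


def percent(part, whole):
    """Share of part in whole, in percent, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


def fpi_report():
    fpi_total = sum(FPI_BYTES.values())
    combined = RECORD_BYTES_TOTAL + fpi_total
    return {
        "fpi_total": fpi_total,
        "combined": combined,
        "fpi_share": percent(fpi_total, combined),
        "record_share": percent(RECORD_BYTES_TOTAL, combined),
        "fpi_share_by_rmgr": {name: percent(size, fpi_total)
                              for name, size in FPI_BYTES.items()},
    }


if __name__ == "__main__":
    for key, value in fpi_report().items():
        print(key, value)
```

In this run almost all FPI bytes (99.62%) sit under the XLOG resource manager rather than under Heap or Btree.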

    Analyzing the main contents of the WAL

    [thunisoft@sdbserver1 pg_wal]$ pg_waldump -p $PGDATA/pg_wal  -s  0/AE2A74E0 -e 0/FAFEAB38  |grep -v " lock "| awk '{print $2" "$16" "$17}'|sort |uniq -c|sort -n -r
    182083 Heap2 tuples, blkref
    163942 Heap2 xid 7134476
    27429 XLOG blkref #0:
    88 Heap2 1663/23713278/1249; tid
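
The awk command above picks fields by position ($2, $16, $17), so what it counts depends on how many words each record happens to contain. As an alternative sketch, the records can be grouped by resource manager and by the first word of desc, which is the operation name. This assumes the line layout shown in this article; the function name is hypothetical, and in practice the lines would come from pg_waldump's standard output.

```python
import re
from collections import Counter

# Capture the resource manager and the first word after "desc:".
LINE_RE = re.compile(r"rmgr:\s+(\S+)\s+len .*?desc:\s+(\S+)")


def summarize(lines):
    """Count records per (resource manager, operation) pair."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            rmgr, operation = match.groups()
            counts[(rmgr, operation.rstrip(","))] += 1
    return counts


SAMPLE = [
    "rmgr: Heap        len (rec/tot):     59/    59, tx:    7102114, lsn: 0/8E459DB0, prev 0/8E459D78, desc: INSERT+INIT off 1, blkref #0: rel 1663/13287/23713247 blk 0",
    "rmgr: Transaction len (rec/tot):     46/    46, tx:    7102114, lsn: 0/8E459DF0, prev 0/8E459DB0, desc: COMMIT 2021-06-08 16:49:51.835650 CST",
    "rmgr: Heap        len (rec/tot):     59/    59, tx:    7102115, lsn: 0/8E459E20, prev 0/8E459DF0, desc: INSERT off 2, blkref #0: rel 1663/13287/23713247 blk 0",
    "rmgr: Transaction len (rec/tot):     46/    46, tx:    7102115, lsn: 0/8E459E60, prev 0/8E459E20, desc: COMMIT 2021-06-08 16:49:55.316314 CST",
]

if __name__ == "__main__":
    for (rmgr, operation), n in summarize(SAMPLE).most_common():
        print(n, rmgr, operation)
```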

    Finding the last checkpoint in the WAL

    postgres=# insert into t_wal values(1);
    INSERT 0 1
    postgres=# checkpoint;
    CHECKPOINT
    postgres=# \q
    --Decode the WAL with pg_waldump
    [thunisoft@sdbserver1 pg_wal]$ pg_waldump -p $PGDATA/pg_wal  -s  0/FAFEBBD0
    rmgr: Heap        len (rec/tot):     64/   196, tx:    7134484, lsn: 0/FAFEBBD0, prev 0/FAFEBB98, desc: INSERT off 3, blkref #0: rel 1663/13287/23713247 blk 0 FPW
    rmgr: Transaction len (rec/tot):     46/    46, tx:    7134484, lsn: 0/FAFEBC98, prev 0/FAFEBBD0, desc: COMMIT 2021-06-08 18:20:55.274801 CST
    rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/FAFEBCC8, prev 0/FAFEBC98, desc: RUNNING_XACTS nextXid 7134485 latestCompletedXid 7134484 oldestRunningXid 7134485
    rmgr: Heap        len (rec/tot):     59/    59, tx:    7134485, lsn: 0/FAFEBD00, prev 0/FAFEBCC8, desc: INSERT off 4, blkref #0: rel 1663/13287/23713247 blk 0
    rmgr: Transaction len (rec/tot):     46/    46, tx:    7134485, lsn: 0/FAFEBD40, prev 0/FAFEBD00, desc: COMMIT 2021-06-08 18:21:25.749244 CST
    rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/FAFEBD70, prev 0/FAFEBD40, desc: RUNNING_XACTS nextXid 7134486 latestCompletedXid 7134485 oldestRunningXid 7134486
    rmgr: XLOG        len (rec/tot):    106/   106, tx:          0, lsn: 0/FAFEBDA8, prev 0/FAFEBD70, desc: CHECKPOINT_ONLINE redo 0/FAFEBD70; tli 1; prev tli 1; fpw true; xid 0:7134486; oid 23721430; multi 1; offset 0; oldest xid 563 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 7134486; online
    rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/FAFEBE18, prev 0/FAFEBDA8, desc: RUNNING_XACTS nextXid 7134486 latestCompletedXid 7134485 oldestRunningXid 7134486
    pg_waldump: FATAL:  error in WAL record at 0/FAFEBE18: invalid record length at 0/FAFEBE50: wanted 24, got 0

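    The CHECKPOINT_ONLINE record is at lsn 0/FAFEBDA8. Its prev (the lsn of the previous record) is 0/FAFEBD70, which here is also the checkpoint's redo location.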

    pg_controldata

    [thunisoft@sdbserver1 pg_wal]$ pg_controldata
    pg_control version number:            1100
    Catalog version number:               201809051
    Database system identifier:           6953123691206530231
    Database cluster state:               in production
    pg_control last modified:             Tue 08 Jun 2021 06:21:30 PM CST
    Latest checkpoint location:           0/FAFEBDA8
    Latest checkpoint's REDO location:    0/FAFEBD70
    Latest checkpoint's REDO WAL file:    0000000100000000000000FA
    Latest checkpoint's TimeLineID:       1
    Latest checkpoint's PrevTimeLineID:   1
    Latest checkpoint's full_page_writes: on
    Latest checkpoint's NextXID:          0:7134486
    Latest checkpoint's NextOID:          23721430
    Latest checkpoint's NextMultiXactId:  1
    Latest checkpoint's NextMultiOffset:  0
    Latest checkpoint's oldestXID:        563
    Latest checkpoint's oldestXID's DB:   1
    Latest checkpoint's oldestActiveXID:  7134486
    Latest checkpoint's oldestMultiXid:   1
    Latest checkpoint's oldestMulti's DB: 1
    Latest checkpoint's oldestCommitTsXid:0
    Latest checkpoint's newestCommitTsXid:0
    Time of latest checkpoint:            Tue 08 Jun 2021 06:21:30 PM CST

    Latest checkpoint location: 0/FAFEBDA8 --matches the lsn decoded from the WAL
    Latest checkpoint's REDO location: 0/FAFEBD70
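
The same comparison can be scripted: scan the pg_waldump output, keep the last checkpoint record seen, and compare its lsn and redo values with the two pg_controldata lines. A sketch under the assumption that the lines look like the ones in this article; the function name is hypothetical.

```python
import re

CHECKPOINT_RE = re.compile(
    r"lsn:\s*(?P<lsn>[0-9A-F]+/[0-9A-F]+),.*"
    r"desc:\s+(?P<kind>CHECKPOINT_(?:ONLINE|SHUTDOWN)) "
    r"redo (?P<redo>[0-9A-F]+/[0-9A-F]+)"
)


def last_checkpoint(lines):
    """Return (kind, lsn, redo) of the last checkpoint record, or None."""
    found = None
    for line in lines:
        match = CHECKPOINT_RE.search(line)
        if match:
            found = (match.group("kind"), match.group("lsn"), match.group("redo"))
    return found


SAMPLE = [
    "rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/FAFEBD70, prev 0/FAFEBD40, desc: RUNNING_XACTS nextXid 7134486 latestCompletedXid 7134485 oldestRunningXid 7134486",
    "rmgr: XLOG        len (rec/tot):    106/   106, tx:          0, lsn: 0/FAFEBDA8, prev 0/FAFEBD70, desc: CHECKPOINT_ONLINE redo 0/FAFEBD70; tli 1; prev tli 1; fpw true; xid 0:7134486; oid 23721430; multi 1; offset 0; oldest xid 563 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 7134486; online",
    "rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/FAFEBE18, prev 0/FAFEBDA8, desc: RUNNING_XACTS nextXid 7134486 latestCompletedXid 7134485 oldestRunningXid 7134486",
]

if __name__ == "__main__":
    print(last_checkpoint(SAMPLE))
    # prints ('CHECKPOINT_ONLINE', '0/FAFEBDA8', '0/FAFEBD70')
```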

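    Decoding the WAL makes a lot more possible. For example, after table rows have been deleted by mistake, you can use pg_waldump to find the xid of the deleting transaction, stop the database, reset the WAL with pg_resetwal, and see the deleted rows again once the database is started.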

    Finding the transaction ID of a delete with pg_waldump

    --Create the table
    postgres=# create table t_delete(n_id int);
    CREATE TABLE
    --Insert data
    postgres=# insert into t_delete select generate_series(1,10);
    INSERT 0 10
    --Check the current lsn
    postgres=# select pg_current_wal_lsn();
    pg_current_wal_lsn
    --------------------
    2/C001CA88
    (1 row)
    --Delete data
    postgres=# delete from t_delete where n_id >5;
    DELETE 5
    --Check the lsn after the delete
    postgres=# select pg_current_wal_lsn();
    pg_current_wal_lsn
    --------------------
    2/C001CC38
    (1 row)
    
    postgres=# \q
    --Decode the WAL; the deleting transaction's ID is 710
    [thunisoft@sdbserver2 ~]$ pg_waldump -s 2/C001CA88 -p /opt/thunisoft/abdata/6.0/abase1/pg_wal/
    rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 2/C001CA88, prev 2/C001CA60, desc: RUNNING_XACTS nextXid 710 latestCompletedXid 709 oldestRunningXid 710
    rmgr: Heap        len (rec/tot):     54/    54, tx:        710, lsn: 2/C001CAC0, prev 2/C001CA88, desc: DELETE off 6 KEYS_UPDATED , blkref #0: rel 1663/13231/25679 blk 0
    rmgr: Heap        len (rec/tot):     54/    54, tx:        710, lsn: 2/C001CAF8, prev 2/C001CAC0, desc: DELETE off 7 KEYS_UPDATED , blkref #0: rel 1663/13231/25679 blk 0
    rmgr: Heap        len (rec/tot):     54/    54, tx:        710, lsn: 2/C001CB30, prev 2/C001CAF8, desc: DELETE off 8 KEYS_UPDATED , blkref #0: rel 1663/13231/25679 blk 0
    rmgr: Heap        len (rec/tot):     54/    54, tx:        710, lsn: 2/C001CB68, prev 2/C001CB30, desc: DELETE off 9 KEYS_UPDATED , blkref #0: rel 1663/13231/25679 blk 0
    rmgr: Heap        len (rec/tot):     54/    54, tx:        710, lsn: 2/C001CBA0, prev 2/C001CB68, desc: DELETE off 10 KEYS_UPDATED , blkref #0: rel 1663/13231/25679 blk 0
    rmgr: Transaction len (rec/tot):     34/    34, tx:        710, lsn: 2/C001CBD8, prev 2/C001CBA0, desc: COMMIT 2021-06-09 15:30:35.334034 CST
    rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 2/C001CC00, prev 2/C001CBD8, desc: RUNNING_XACTS nextXid 711 latestCompletedXid 710 oldestRunningXid 711
    rmgr: XLOG        len (rec/tot):     24/    24, tx:          0, lsn: 2/C001CC38, prev 2/C001CC00, desc: SWITCH
    rmgr: XLOG        len (rec/tot):    106/   106, tx:          0, lsn: 2/C1000028, prev 2/C001CC38, desc: CHECKPOINT_SHUTDOWN redo 2/C1000028; tli 1; prev tli 1; fpw true; xid 0:711; oid 25682; multi 1; offset 0; oldest xid 563 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; shutdown
    rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 2/C1000098, prev 2/C1000028, desc: RUNNING_XACTS nextXid 711 latestCompletedXid 710 oldestRunningXid 711
    pg_waldump: FATAL:  error in WAL record at 2/C1000098: invalid record length at 2/C10000D0: wanted 24, got 0
    --Stop the database
    [thunisoft@sdbserver2 ~]$ sh stop_abase1.sh
    Shutting down the database instance...
    waiting for server to shut down......... done
    server stopped
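    --Reset the next transaction ID to 710, the deleting xid. If the rows are not visible with 710, a slightly smaller value can be tried, as long as it stays above the xid that inserted the rows (709 here)
    [thunisoft@sdbserver2 ~]$  pg_resetwal -x 710 -D /opt/thunisoft/abdata/6.0/abase1
    Write-ahead log reset
    --Query the table: the deleted rows (n_id > 5) are visible again.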
    postgres=# select xmin,xmax,* from t_delete;
    xmin | xmax | n_id
    ------+------+------
    709 |    0 |    1
    709 |    0 |    2
    709 |    0 |    3
    709 |    0 |    4
    709 |    0 |    5
    709 |  710 |    6
    709 |  710 |    7
    709 |  710 |    8
    709 |  710 |    9
    709 |  710 |   10
    (10 rows)
    --Check the transaction ID: it is 710, and every call consumes one more
    postgres=# select txid_current();
    txid_current
    --------------
    710
    (1 row)
    --Query the data again: the rows have disappeared again
    postgres=# select xmin,xmax,* from t_delete;
    xmin | xmax | n_id
    ------+------+------
    709 |    0 |    1
    709 |    0 |    2
    709 |    0 |    3
    709 |    0 |    4
    709 |    0 |    5
    (5 rows)
    
    postgres=# select txid_current();
    txid_current
    --------------
    711
    (1 row)
    
    postgres=# select xmin,xmax,* from t_delete;
    xmin | xmax | n_id
    ------+------+------
    709 |    0 |    1
    709 |    0 |    2
    709 |    0 |    3
    709 |    0 |    4
    709 |    0 |    5
    (5 rows)

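    After the reset, copy the data out promptly, for example into another table with COPY or CREATE TABLE ... AS. Otherwise, as soon as the transaction counter moves past the deleting xid (710 here), the rows become invisible again.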
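
Finding the deleting xid by reading the dump works for a handful of records; on a busy system it helps to filter for Heap DELETE records that touch the table's relfilenode and count them per transaction. A sketch under the assumption that the lines look like the ones shown above; the function name is my own, and 25679 is the relfilenode of t_delete taken from the blkref field.

```python
import re
from collections import Counter

# Heap (not Heap2) records whose desc starts with DELETE; capture the xid
# and the relfilenode of the first block reference.
DELETE_RE = re.compile(
    r"rmgr:\s+Heap\s.*?tx:\s*(\d+),.*?desc:\s+DELETE\b.*?rel \d+/\d+/(\d+) blk"
)


def deleting_xids(lines, relfilenode):
    """Map each xid that deleted rows from relfilenode to its DELETE record count."""
    counts = Counter()
    for line in lines:
        match = DELETE_RE.search(line)
        if match and int(match.group(2)) == relfilenode:
            counts[int(match.group(1))] += 1
    return dict(counts)


SAMPLE = [
    "rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 2/C001CA88, prev 2/C001CA60, desc: RUNNING_XACTS nextXid 710 latestCompletedXid 709 oldestRunningXid 710",
    "rmgr: Heap        len (rec/tot):     54/    54, tx:        710, lsn: 2/C001CAC0, prev 2/C001CA88, desc: DELETE off 6 KEYS_UPDATED , blkref #0: rel 1663/13231/25679 blk 0",
    "rmgr: Heap        len (rec/tot):     54/    54, tx:        710, lsn: 2/C001CAF8, prev 2/C001CAC0, desc: DELETE off 7 KEYS_UPDATED , blkref #0: rel 1663/13231/25679 blk 0",
    "rmgr: Transaction len (rec/tot):     34/    34, tx:        710, lsn: 2/C001CBD8, prev 2/C001CBA0, desc: COMMIT 2021-06-09 15:30:35.334034 CST",
]

if __name__ == "__main__":
    print(deleting_xids(SAMPLE, 25679))  # prints {710: 2}
```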

    References:

    https://www.postgresql.org/docs/13/pgwaldump.html

    https://www.cnblogs.com/kuang17/p/13208204.html

    https://www.modb.pro/db/48220
