
postgres error : failed to read kind from backend

2016-04-07 11:14
I had just installed PostgreSQL in master-slave mode. After connecting the application, the following error appeared:

ERROR  - JDBC commit failed

org.postgresql.util.PSQLException: FATAL: failed to read kind from backend

  Detail: kind mismatch among backends. Possible last query was: "COMMIT" kind details are: 0[3] 1[1]

  Hint: check data consistency among db nodes

  The pgpool FAQ contains the following entry:

  

  Why "SET default_transaction_isolation TO DEFAULT" fails ?

$ psql -h localhost -p 9999 -c 'SET default_transaction_isolation to DEFAULT;'

ERROR: kind mismatch among backends. Possible last query was: "SET default_transaction_isolation to DEFAULT;" kind details are: 0[N: statement: SET default_transaction_isolation to DEFAULT;] 1[C]

HINT: check data consistency among db nodes

ERROR: kind mismatch among backends. Possible last query was: "SET default_transaction_isolation to DEFAULT;" kind details are: 0[N: statement: SET default_transaction_isolation to DEFAULT;] 1[C]

HINT: check data consistency among db nodes

connection to server was lost

Pgpool-II detects that node 0 returns "N" (a NOTICE message comes from PostgreSQL) while node 1 returns "C" (which means the command finished).

Though pgpool-II expects that both node 0 and 1 returns identical messages, actually they are not. So pgpool-II threw an error.

Probably certain log/message settings are different in node 0 and 1. Please check client_min_messages or something like that.
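To make the "kind details" part of these messages easier to read, here is a minimal Python sketch that splits it into per-node entries and maps each kind to a message name. It assumes the bracketed value is the one-byte backend message type from the PostgreSQL frontend/backend protocol, which matches the FAQ's reading of "N" and "C". The names `KIND_NAMES` and `decode_kind_details` are made up for this illustration and are not part of pgpool-II.

```python
import re

# Backend message type bytes from the PostgreSQL frontend/backend protocol
# (only a subset; unknown kinds are returned unchanged).
KIND_NAMES = {
    "1": "ParseComplete",
    "2": "BindComplete",
    "3": "CloseComplete",
    "C": "CommandComplete",
    "E": "ErrorResponse",
    "N": "NoticeResponse",
    "Z": "ReadyForQuery",
}

# Matches entries such as "0[3]" or "0[N: statement: ...]".
_ENTRY = re.compile(r"(\d+)\[([^\]:]+)(?::[^\]]*)?\]")


def decode_kind_details(message):
    """Return {node_id: message name} parsed from a 'kind details are:' message."""
    _, _, tail = message.partition("kind details are:")
    return {
        int(node): KIND_NAMES.get(kind.strip(), kind.strip())
        for node, kind in _ENTRY.findall(tail)
    }


if __name__ == "__main__":
    print(decode_kind_details('Possible last query was: "COMMIT" kind details are: 0[3] 1[1]'))
```

Under that assumption, the error at the top of this post would mean that node 0 answered with CloseComplete while node 1 answered with ParseComplete.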

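I also found the following description online:

The analysis is as follows:

 In the send_to_where function, when running in Master/Slave mode, insert, delete and update statements are sent only to the primary DB.

 Transaction-related statements such as begin/commit are sent to both the master and the slave.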
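The routing rule in that description can be sketched in a few lines of Python. This is only an illustration of the two sentences above, not pgpool-II's actual send_to_where code; the function name `route` and the keyword sets are assumptions made for this example.

```python
# Statement types named in the quoted description (hypothetical, simplified sets).
WRITE_KEYWORDS = {"INSERT", "UPDATE", "DELETE"}
TRANSACTION_KEYWORDS = {"BEGIN", "COMMIT"}


def route(statement):
    """Return the node roles that would receive the statement per the description.

    Returns None for statements the quoted description does not cover.
    """
    words = statement.strip().split(None, 1)
    keyword = words[0].upper().rstrip(";") if words else ""
    if keyword in TRANSACTION_KEYWORDS:
        return {"master", "slave"}
    if keyword in WRITE_KEYWORDS:
        return {"master"}
    return None


if __name__ == "__main__":
    for sql in ("BEGIN", "INSERT INTO t VALUES (1)", "COMMIT;"):
        print(sql, "->", route(sql))
```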

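The description quoted above actually differs somewhat from what the official site says. The log I actually observed is as follows: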

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 0 backend pid: 5632 statement: Parse: BEGIN

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 1 backend pid: 9793 statement: Parse: BEGIN

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 0 backend pid: 5632 statement: B message

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 1 backend pid: 9793 statement: B message

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 0 backend pid: 5632 statement: Execute: BEGIN

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 1 backend pid: 9793 statement: Execute: BEGIN

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 0 backend pid: 5632 statement: Parse: *******************************

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 0 backend pid: 5632 statement: B message

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 0 backend pid: 5632 statement: D message

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 0 backend pid: 5632 statement: Execute: ****************************************

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 0 backend pid: 5632 statement: C message

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 0 backend pid: 5632 statement: C message

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 0 backend pid: 5632 statement: Parse: COMMIT

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 1 backend pid: 9793 statement: Parse: COMMIT

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 0 backend pid: 5632 statement: B message

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 1 backend pid: 9793 statement: B message

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 0 backend pid: 5632 statement: Execute: COMMIT

2016-04-10 15:31:20: pid 5431: LOG:  DB node id: 1 backend pid: 9793 statement: Execute: COMMIT

2016-04-10 15:31:20: pid 5431: WARNING:  packet kind of backend 1 ['1'] does not match with master/majority nodes packet kind ['3']

2016-04-10 15:31:20: pid 5431: FATAL:  failed to read kind from backend

2016-04-10 15:31:20: pid 5431: DETAIL:  kind mismatch among backends. Possible last query was: "COMMIT" kind details are: 0[3] 1[1]

2016-04-10 15:31:20: pid 5431: HINT:  check data consistency among db nodes
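A log like this is easier to compare per node when the statement entries are counted. The following Python sketch assumes the line layout shown above ("DB node id: N backend pid: P statement: ..."); the helper names `count_statements` and `extra_on` are hypothetical and exist only for this example.

```python
import re
from collections import Counter

# Matches the per-node statement lines of the pgpool log shown above.
LOG_LINE = re.compile(r"DB node id: (\d+) backend pid: \d+ statement: (.*)$")


def count_statements(log_text):
    """Return {node_id: Counter of logged statement entries}."""
    counts = {}
    for line in log_text.splitlines():
        match = LOG_LINE.search(line)
        if match:
            node = int(match.group(1))
            counts.setdefault(node, Counter())[match.group(2).strip()] += 1
    return counts


def extra_on(counts, node, other):
    """Entries logged more often for `node` than for `other`, with the surplus count."""
    return dict(counts.get(node, Counter()) - counts.get(other, Counter()))


if __name__ == "__main__":
    import sys
    # Usage: python this_script.py < pgpool.log
    per_node = count_statements(sys.stdin.read())
    print(extra_on(per_node, 0, 1))
```

Applied to the log above, the surplus for node 0 would consist of the masked statement's Parse and Execute lines, one "B message", one "D message" and two "C message" lines, while node 1 has no surplus.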

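 From the analysis above, following the call chain pool_process_query→send_to_where,

 commit is sent to both the master and the slave. However,

 because the network communication between the child process and the slave was interrupted, pool_read fails and the child process exits.

 By that point, the commit already sent to the primary DB has succeeded and cannot be undone.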

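I set hot_standby_feedback = off on the PostgreSQL standby, but the problem was not solved.

I then changed client_min_messages = error in pgpool and waited to verify the result; it still had no effect.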

2016.4.14,

On the official site I found
http://www.pgpool.net/pipermail/pgpool-general/2016-March/004623.html
which also says that this issue will be fixed in pgpool 3.6:
http://pgpool.net/mediawiki/index.php/pgpool-II_3.6_development
All I can do is wait.