PostgreSQL 10 logical replication not working

I installed PostgreSQL 10 using these commands:
$ wget -q https://www.postgresql.org/media/keys/ACCC4CF8.asc -O - | sudo apt-key add -
$ sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" >> /etc/apt/sources.list.d/pgdg.list'
$ sudo apt-get update
$ sudo apt-get install postgresql postgresql-contrib
On the master server (xx.xxx.xxx.xx), I then set this in postgresql.conf:
wal_level = logical
On the slave server, I set the same in postgresql.conf:
wal_level = logical
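For reference, wal_level can also be set with ALTER SYSTEM instead of editing the file, and either way the server must be restarted before the new value takes effect. A minimal sketch:
ALTER SYSTEM SET wal_level = logical;  -- written to postgresql.auto.conf; restart the server afterwards
SHOW wal_level;                        -- verify after the restart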
After that I ran the following queries on the master server:
create table t1 (id integer primary key, val text);
create user replicant with replication;
grant select on t1 to replicant;
insert into t1 (id, val) values (10, 'ten'), (20, 'twenty'), (30, 'thirty');
create publication pub1 for table t1;
And on the slave server:
create table t1 (id integer primary key, val text, val2 text);
create subscription sub1 connection 'dbname=dbsrc user=replicant' publication pub1;
But the problem I am facing is that the tables are not syncing: when I insert a new row on the master server, the slave server never receives it.
I am new to PostgreSQL, so please help me.
Thanks for your time.
Here is my PostgreSQL log on the master server:
2017-10-17 11:06:16.644 UTC [10713] replicant@postgres LOG: starting logical decoding for slot "sub_1"
2017-10-17 11:06:16.644 UTC [10713] replicant@postgres DETAIL: streaming transactions committing after 1/F45EB0C8, reading WAL from 1/F45EB0C8
2017-10-17 11:06:16.645 UTC [10713] replicant@postgres LOG: logical decoding found consistent point at 1/F45EB0C8
2017-10-17 11:06:16.645 UTC [10713] replicant@postgres DETAIL: There are no running transactions.
And here is my PostgreSQL log on the slave server:
2017-10-17 19:14:45.622 CST [7820] WARNING: out of logical replication worker slots
2017-10-17 19:14:45.622 CST [7820] HINT: You might need to increase max_logical_replication_workers.
2017-10-17 19:14:45.670 CST [7821] WARNING: out of logical replication worker slots
2017-10-17 19:14:45.670 CST [7821] HINT: You might need to increase max_logical_replication_workers.
2017-10-17 19:14:45.680 CST [7822] WARNING: out of logical replication worker slots
2017-10-17 19:14:45.680 CST [7822] HINT: You might need to increase max_logical_replication_workers.
2017-10-17 19:14:50.865 CST [7820] WARNING: out of logical replication worker slots
2017-10-17 19:14:50.865 CST [7820] HINT: You might need to increase max_logical_replication_workers.
2017-10-17 19:14:50.917 CST [7821] WARNING: out of logical replication worker slots
2017-10-17 19:14:50.917 CST [7821] HINT: You might need to increase max_logical_replication_workers.
2017-10-17 19:14:50.928 CST [7822] WARNING: out of logical replication worker slots
2017-10-17 19:14:50.928 CST [7822] HINT: You might need to increase max_logical_replication_workers.
2017-10-17 19:14:55.871 CST [7820] WARNING: out of logical replication worker slots
2017-10-17 19:14:55.871 CST [7820] HINT: You might need to increase max_logical_replication_workers.
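The hint points at settings in the subscriber's postgresql.conf. A minimal sketch of raising them (the values are only illustrative; both parameters take effect only after a restart, and logical replication workers come out of the max_worker_processes pool):
ALTER SYSTEM SET max_logical_replication_workers = 8;  -- default is 4
ALTER SYSTEM SET max_worker_processes = 16;            -- default is 8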
After increasing max_logical_replication_workers I am getting this:
2017-10-17 19:44:45.898 CST [7987] LOG: logical replication table synchronization worker for subscription "sub2", table "t1" has started
2017-10-17 19:44:45.982 CST [7988] LOG: logical replication table synchronization worker for subscription "myadav_test", table "test_replication" h$
2017-10-17 19:44:45.994 CST [7989] LOG: logical replication table synchronization worker for subscription "sub3", table "t1" has started
2017-10-17 19:44:48.621 CST [7987] ERROR: could not start initial contents copy for table "staging.t1": ERROR: permission denied for schema staging
2017-10-17 19:44:48.623 CST [7962] LOG: worker process: logical replication worker for subscription 20037 sync 20027 (PID 7987) exited with exit co$
2017-10-17 19:44:48.705 CST [7988] ERROR: could not start initial contents copy for table "staging.test_replication": ERROR: permission denied for$
2017-10-17 19:44:48.707 CST [7962] LOG: worker process: logical replication worker for subscription 20025 sync 20016 (PID 7988) exited with exit co$
2017-10-17 19:44:48.717 CST [7989] ERROR: duplicate key value violates unique constraint "t1_pkey"
2017-10-17 19:44:48.717 CST [7989] DETAIL: Key (id)=(10) already exists.
2017-10-17 19:44:48.717 CST [7989] CONTEXT: COPY t1, line 1
2017-10-17 19:44:48.718 CST [7962] LOG: worker process: logical replication worker for subscription 20038 sync 20027 (PID 7989) exited with exit co$
2017-10-17 19:44:51.629 CST [8008] LOG: logical replication table synchronization worker for subscription "sub2", table "t1" has started
2017-10-17 19:44:51.712 CST [8009] LOG: logical replication table synchronization worker for subscription "myadav_test", table "test_replication" h$
2017-10-17 19:44:51.722 CST [8010] LOG: logical replication table synchronization worker for subscription "sub3", table "t1" has started
Now I finally realize that logical replication is working for the postgres database but not for my other database on the same server. I am getting a permission issue on a schema, as shown in the log.

The row changes are applied using the rights of the user who owns the subscription. By default that's the user who created the subscription.
So make sure the subscription is owned by a user with sufficient rights. Grant the needed rights on the schema and its tables, or, if you can't be bothered, make the subscription owned by a superuser, who has full rights to everything.
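For example (a minimal sketch; sub_owner is a placeholder for whatever role owns the subscription, while sub2 and staging are taken from the logs above):
-- Option 1: on the subscriber, give the owning role what it needs on the target schema and tables
GRANT USAGE ON SCHEMA staging TO sub_owner;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA staging TO sub_owner;
-- Option 2: or simply hand the subscription to a superuser
ALTER SUBSCRIPTION sub2 OWNER TO postgres;
Since the initial copy runs on the publisher as the connection user (replicant above), that role likewise needs USAGE on the schema and SELECT on the published tables over there.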
See:
CREATE SUBSCRIPTION
logical replication - security
logical replication

Related

ERROR: invalid logical replication message type "T"

I am getting the error below from logical replication into PostgreSQL 10.3.
Setup
Master: PostgreSQL 12.3
Logical replica (subscriber): PostgreSQL 10.3
Logs
2021-03-22 13:06:57.332 IST # 25929 LOG: checkpoints are occurring too frequently (22 seconds apart)
2021-03-22 13:06:57.332 IST # 25929 HINT: Consider increasing the configuration parameter "max_wal_size".
2021-03-22 14:34:21.263 IST # 21461 ERROR: invalid logical replication message type "T"
2021-03-22 14:34:21.315 IST # 3184 LOG: logical replication apply worker for subscription "elk_subscription_133" has started
2021-03-22 14:34:21.367 IST # 3184 ERROR: invalid logical replication message type "T"
2021-03-22 14:34:21.369 IST # 25921 LOG: worker process: logical replication worker for subscription 84627 (PID 3184) exited with exit code 1
2021-03-22 14:34:22.259 IST # 25921 LOG: worker process: logical replication worker for subscription 84627 (PID 21461) exited with exit code 1
2021-03-22 14:34:27.281 IST # 3187 LOG: logical replication apply worker for subscription "elk_subscription_133" has started
2021-03-22 14:34:27.311 IST # 3187 ERROR: invalid logical replication message type "T"
2021-03-22 14:34:27.313 IST # 25921 LOG: worker process: logical replication worker for subscription 84627 (PID 3187) exited with exit code 1
2021-03-22 14:34:32.336 IST # 3188 LOG: logical replication apply worker for subscription "elk_subscription_133" has started
2021-03-22 14:34:32.362 IST # 3188 ERROR: invalid logical replication message type "T"
The documentation describes message T:
Truncate
      Byte1('T')
              Identifies the message as a truncate message.
Support for replicating TRUNCATE was added in v11, so the primary server must be v11 or better, while the v10 subscriber does not understand the truncate message.
You will have to remove the table from the publication, refresh the subscription, truncate the table manually, add it to the publication and refresh the subscription again.
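A minimal sketch of that sequence, with placeholder names in the style of the earlier examples (pub1, sub1, t1):
-- on the publisher
ALTER PUBLICATION pub1 DROP TABLE t1;
-- on the subscriber
ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION;
TRUNCATE t1;  -- the manual truncate
-- on the publisher
ALTER PUBLICATION pub1 ADD TABLE t1;
-- on the subscriber again (this re-runs the initial copy for t1)
ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION;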
To keep this from happening again, either avoid TRUNCATE on the primary or exclude it from the publication:
ALTER PUBLICATION name SET (publish = 'insert, update, delete');

How to resolve error reading result of streaming command?

I have a master database doing logical replication with a publication and a slave database subscribing to that publication. It is on the slave that I am occasionally getting the following error:
ERROR: error reading result of streaming command:
LOG: logical replication table synchronization worker for subscription ABC, table XYZ
How do I stop the above error from happening?
Here is the log demonstrating the error:
2020-11-25 06:50:51.736 UTC [91572] LOG: background worker "logical replication worker" (PID 96504) exited with exit code 1
2020-11-25 06:50:51.740 UTC [96505] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "contractservices_devicekioskrating" has started
2020-11-25 06:50:52.197 UTC [96505] ERROR: error reading result of streaming command:
2020-11-25 06:50:52.200 UTC [91572] LOG: background worker "logical replication worker" (PID 96505) exited with exit code 1
2020-11-25 06:50:52.203 UTC [96506] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "workorders_sectorbranchinformation" has started
2020-11-25 06:50:52.286 UTC [96506] ERROR: error reading result of streaming command:
2020-11-25 06:50:52.288 UTC [91572] LOG: background worker "logical replication worker" (PID 96506) exited with exit code 1
2020-11-25 06:50:52.292 UTC [96507] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "contractservices_kioskstatetransitions" has started
2020-11-25 06:52:14.887 UTC [96339] ERROR: error reading result of streaming command:
2020-11-25 06:52:14.896 UTC [91572] LOG: background worker "logical replication worker" (PID 96339) exited with exit code 1
2020-11-25 06:52:14.900 UTC [96543] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "contractservices_sensordatafeed" has started
2020-11-25 06:52:21.385 UTC [96507] ERROR: error reading result of streaming command:
2020-11-25 06:52:21.393 UTC [91572] LOG: background worker "logical replication worker" (PID 96507) exited with exit code 1
2020-11-25 06:52:21.397 UTC [96547] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "contractservices_sitemappoint" has started
2020-11-25 06:52:21.523 UTC [96547] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "contractservices_sitemappoint" has finished
2020-11-25 06:52:21.528 UTC [96548] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "core_event" has started
2020-11-25 06:55:35.401 UTC [96543] ERROR: error reading result of streaming command:
2020-11-25 06:55:35.408 UTC [91572] LOG: background worker "logical replication worker" (PID 96543) exited with exit code 1
2020-11-25 06:55:35.412 UTC [96642] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "contractservices_doorevents" has started
2020-11-25 06:56:43.633 UTC [96642] ERROR: error reading result of streaming command:
2020-11-25 06:56:43.641 UTC [91572] LOG: background worker "logical replication worker" (PID 96642) exited with exit code 1
2020-11-25 06:56:43.644 UTC [96678] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "workorders_sectorbranchinformation" has started
2020-11-25 06:56:43.776 UTC [96678] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "workorders_sectorbranchinformation" has finished
2020-11-25 06:56:43.782 UTC [96679] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "core_batteryhistory" has started
2020-11-25 06:57:04.166 UTC [96679] ERROR: error reading result of streaming command:
2020-11-25 06:57:04.174 UTC [91572] LOG: background worker "logical replication worker" (PID 96679) exited with exit code 1
2020-11-25 06:57:04.178 UTC [96685] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "contractservices_attendantvisittime" has started
2020-11-25 06:57:06.100 UTC [96685] ERROR: error reading result of streaming command:
2020-11-25 06:57:06.160 UTC [91572] LOG: background worker "logical replication worker" (PID 96685) exited with exit code 1
2020-11-25 06:57:06.164 UTC [96693] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "contractservices_kioskstatetransitions" has started
2020-11-25 06:59:50.375 UTC [96548] ERROR: error reading result of streaming command:
2020-11-25 06:59:50.382 UTC [91572] LOG: background worker "logical replication worker" (PID 96548) exited with exit code 1
2020-11-25 06:59:50.389 UTC [96755] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "contractservices_sensordatafeed" has started
2020-11-25 07:00:56.844 UTC [96693] ERROR: error reading result of streaming command:
2020-11-25 07:00:56.852 UTC [91572] LOG: background worker "logical replication worker" (PID 96693) exited with exit code 1
2020-11-25 07:00:56.856 UTC [96779] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "workorders_wastestream" has started
2020-11-25 07:00:57.391 UTC [96779] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "workorders_wastestream" has finished
2020-11-25 07:00:57.397 UTC [96780] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "core_event" has started
2020-11-25 07:02:39.650 UTC [96755] ERROR: error reading result of streaming command:
2020-11-25 07:02:39.658 UTC [91572] LOG: background worker "logical replication worker" (PID 96755) exited with exit code 1
2020-11-25 07:02:39.662 UTC [96824] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "contractservices_devicekioskrating" has started
2020-11-25 07:02:40.276 UTC [96824] ERROR: error reading result of streaming command:
2020-11-25 07:02:40.279 UTC [91572] LOG: background worker "logical replication worker" (PID 96824) exited with exit code 1
2020-11-25 07:02:40.283 UTC [96825] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "contractservices_kioskstatetransitions" has started
2020-11-25 07:04:07.222 UTC [96825] ERROR: error reading result of streaming command:
2020-11-25 07:04:07.230 UTC [91572] LOG: background worker "logical replication worker" (PID 96825) exited with exit code 1
2020-11-25 07:04:07.234 UTC [96862] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "contractservices_attendantvisit" has started
2020-11-25 07:04:49.971 UTC [96862] ERROR: error reading result of streaming command:
2020-11-25 07:04:49.978 UTC [91572] LOG: background worker "logical replication worker" (PID 96862) exited with exit code 1
2020-11-25 07:04:50.432 UTC [97013] LOG: logical replication table synchronization worker for subscription "snf5_cba_isp_db_staging_app1_srv_sub", table "core_batteryhistory" has started
Despite this error on PostgreSQL v13.0, the tables on the slave database seem to be replicating okay. However, I would like to resolve this error.
I also tried PostgreSQL v13.1 and noticed that I still get this error, and there replication does not work correctly.
I found this post:
https://www.postgresql-archive.org/BUG-16643-PG13-Logical-replication-initial-startup-never-finishes-and-gets-stuck-in-startup-loop-td6156051.html
The reporter there (Henry Hinze) said it was a bug and that it was fixed in version 13 RC1.
My experience was the reverse: on PostgreSQL v13.0 it was not getting stuck in the startup loop, but after installing v13.1 it was.
I can confirm that I am using PostgreSQL version 13.1, as /usr/lib/postgresql/13/bin/postgres -V gives the following output:
postgres (PostgreSQL) 13.1 (Ubuntu 13.1-1.pgdg18.04+1)
I am using Ubuntu v18.04.
I have uninstalled postgresql completely and reinstalled it and it has not resolved the issue.
The postgresql.conf settings on the slave are the default settings.
The relevant postgresql.conf settings on the master are as follows:
wal_level = logical
checkpoint_timeout = 5min
max_wal_size = 1GB
min_wal_size = 80MB
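One way to check on the subscriber whether the initial table synchronization has actually finished is to look at the per-table sync state in the catalogs (a sketch; the state codes are 'i' initializing, 'd' copying data, 's' synchronized, 'r' ready):
SELECT s.subname, c.relname, r.srsubstate
FROM pg_subscription_rel r
JOIN pg_subscription s ON s.oid = r.srsubid
JOIN pg_class c ON c.oid = r.srrelid
ORDER BY c.relname;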

PostgreSQL 9.4.1 Switchover & Switchback without recovery_target_timeline=latest

I have tested different scenarios for switchover and switchback in PostgreSQL 9.4.1.
Scenario 1: PostgreSQL switchover and switchback in 9.4.1.
Scenario 2: Is the parameter recovery_target_timeline='latest' mandatory for switchover and switchback in PostgreSQL 9.4.1?
Scenario 3: the procedure described on this page.
To test scenario 3, I followed the steps below.
1) Stop the applications connected to the primary server.
2) Confirm that all applications were stopped and all threads were disconnected from the primary DB.
# 192.x.x.129 (primary)
3) Cleanly shut down the primary using
pg_ctl -D $PGDATA stop -mf
# On the DR side (192.x.x.128), check the sync status:
postgres=# select pg_last_xlog_receive_location(),pg_last_xlog_replay_location();
-[ RECORD 1 ]-----------------+-----------
pg_last_xlog_receive_location | 4/57000090
pg_last_xlog_replay_location | 4/57000090
4) Stop the DR server (192.x.x.128):
pg_ctl -D $PGDATA stop -mf
pg_log:
2019-12-02 13:16:09 IST LOG: received fast shutdown request
2019-12-02 13:16:09 IST LOG: aborting any active transactions
2019-12-02 13:16:09 IST LOG: shutting down
2019-12-02 13:16:09 IST LOG: database system is shut down
# 192.x.x.128 (DR)
5) Make the following change on the DR server:
mv recovery.conf recovery.conf_bkp
6) Make changes on 192.x.x.129 (primary):
[postgres@localhost data]$ cat recovery.conf
standby_mode = 'on'
primary_conninfo = 'user=replication password=postgres host=192.x.x.128 port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres'
restore_command = 'cp %p /home/postgres/restore/%f'
trigger_file='/tmp/promote'
7) Start the DR in read-write mode:
pg_ctl -D $DATA start
pg_log:
2019-12-02 13:20:21 IST LOG: database system was shut down in recovery at 2019-12-02 13:16:09 IST
2019-12-02 13:20:22 IST LOG: database system was not properly shut down; automatic recovery in progress
2019-12-02 13:20:22 IST LOG: consistent recovery state reached at 4/57000090
2019-12-02 13:20:22 IST LOG: invalid record length at 4/57000090
2019-12-02 13:20:22 IST LOG: redo is not required
2019-12-02 13:20:22 IST LOG: database system is ready to accept connections
2019-12-02 13:20:22 IST LOG: autovacuum launcher started
We can see in the above log that the old primary is now the DR of the new primary (which was the old DR), and it shows no error because the timeline ID on the new primary is the same one that already exists on the new DR.
8) Start the old primary in read-only mode:
pg_ctl -D $PGDATA start
logs:
2019-12-02 13:24:50 IST LOG: database system was shut down at 2019-12-02 11:14:50 IST
2019-12-02 13:24:51 IST LOG: entering standby mode
cp: cannot stat ‘pg_xlog/RECOVERYHISTORY’: No such file or directory
cp: cannot stat ‘pg_xlog/RECOVERYXLOG’: No such file or directory
2019-12-02 13:24:51 IST LOG: consistent recovery state reached at 4/57000090
2019-12-02 13:24:51 IST LOG: record with zero length at 4/57000090
2019-12-02 13:24:51 IST LOG: database system is ready to accept read only connections
2019-12-02 13:24:51 IST LOG: started streaming WAL from primary at 4/57000000 on timeline 9
2019-12-02 13:24:51 IST LOG: redo starts at 4/57000090
Question 1: In this scenario I have performed only the switchover, but with this method we can do both switchover and switchback. If switchover and switchback work with this method, why did the PostgreSQL community introduce recovery_target_timeline='latest' and apply patches (see this blog: https://www.enterprisedb.com/blog/switchover-switchback-in-postgresql-9-3) from PostgreSQL 9.3 up to the latest version?
Question 2: What does the message cp: cannot stat ‘pg_xlog/RECOVERYHISTORY’: No such file or directory in the above log mean?
Question 3: Between scenario 1 and scenario 3, which is the correct way to do switchover and switchback? (Scenario 2 fails with an error because, as the community experts know, it requires recovery_target_timeline='latest'.)
Answers:
1. If you shut down the standby cleanly, then remove recovery.conf and restart it, it will come up, but it has to perform crash recovery (database system was not properly shut down).
The proper way to promote a standby to a primary is by using the trigger file or running pg_ctl promote (or, from v12 on, by calling the SQL function pg_promote). Then you have no downtime and don't need to perform crash recovery.
Promoting the standby will make it pick a new timeline, so you need recovery_target_timeline = 'latest' if you want the new standby to follow that timeline switch.
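For illustration, a minimal sketch of that approach using the hosts from this scenario (the commands and paths are assumptions based on the recovery.conf shown above):
# On the DR (192.x.x.128): promote it instead of removing recovery.conf
pg_ctl -D $PGDATA promote   # or: touch /tmp/promote (the configured trigger_file)
# recovery.conf on the old primary (192.x.x.129), now the new standby:
standby_mode = 'on'
primary_conninfo = 'user=replication password=postgres host=192.x.x.128 port=5432'
recovery_target_timeline = 'latest'   # follow the timeline switch created by the promotion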
2. That is caused by your restore_command; note that its arguments appear reversed (the usual form is restore_command = 'cp /home/postgres/restore/%f %p', copying from the archive into the path PostgreSQL asks for).
3. The method shown in 1. above is the correct one.

Streaming replication is failing with "WAL segment has already been removed"

I am trying to implement master/slave streaming replication on Postgres 11.5. I ran the following steps:
On Master
select pg_start_backup('replication-setup',true);
On Slave
Stopped the postgres 11 database and ran
rsync -aHAXxv --numeric-ids --progress -e "ssh -T -o Compression=no -x" --exclude pg_wal --exclude postgresql.pid --exclude pg_log MASTER:/var/lib/postgresql/11/main/* /var/lib/postgresql/11/main
On Master
select pg_stop_backup();
On Slave
rsync -aHAXxv --numeric-ids --progress -e "ssh -T -o Compression=no -x" MASTER:/var/lib/postgresql/11/main/pg_wal/* /var/lib/postgresql/11/main/pg_wal
I created the recovery.conf file in the slave's ~/11/main folder:
standby_mode = 'on'
primary_conninfo = 'user=postgres host=MASTER port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres'
primary_slot_name='my_repl_slot'
When I start Postgres on the slave, I get this error in both the MASTER and SLAVE logs:
2019-11-08 09:03:51.205 CST [27633] LOG: 00000: database system was interrupted; last known up at 2019-11-08 02:53:04 CST
2019-11-08 09:03:51.205 CST [27633] LOCATION: StartupXLOG, xlog.c:6388
2019-11-08 09:03:51.252 CST [27633] LOG: 00000: entering standby mode
2019-11-08 09:03:51.252 CST [27633] LOCATION: StartupXLOG, xlog.c:6443
2019-11-08 09:03:51.384 CST [27634] LOG: 00000: started streaming WAL from primary at 12DB/C000000 on timeline 1
2019-11-08 09:03:51.384 CST [27634] LOCATION: WalReceiverMain, walreceiver.c:383
2019-11-08 09:03:51.384 CST [27634] FATAL: XX000: could not receive data from WAL stream: ERROR: requested WAL segment 00000001000012DB0000000C has already been removed
2019-11-08 09:03:51.384 CST [27634] LOCATION: libpqrcv_receive, libpqwalreceiver.c:772
2019-11-08 09:03:51.408 CST [27635] LOG: 00000: started streaming WAL from primary at 12DB/C000000 on timeline 1
2019-11-08 09:03:51.408 CST [27635] LOCATION: WalReceiverMain, walreceiver.c:383
The problem is that the START WAL segment, 00000001000012DB0000000C, is available right up until I run pg_stop_backup(); once pg_stop_backup() has been executed it gets archived and is no longer available. So this is not an issue of the WAL being recycled because of a low wal_keep_segments.
postgres@SLAVE:~/11/main/pg_wal$ cat 00000001000012DB0000000C.00000718.backup
START WAL LOCATION: 12DB/C000718 (file 00000001000012DB0000000C)
STOP WAL LOCATION: 12DB/F4C30720 (file 00000001000012DB000000F4)
CHECKPOINT LOCATION: 12DB/C000750
BACKUP METHOD: pg_start_backup
BACKUP FROM: master
START TIME: 2019-11-07 15:47:26 CST
LABEL: replication-setup-mdurbha
START TIMELINE: 1
STOP TIME: 2019-11-08 08:48:35 CST
STOP TIMELINE: 1
My MASTER has archive_command set, and I have the missing WALs available. I copied them into a restore directory on the SLAVE and tried the recovery.conf below, but it still fails, with the MASTER reporting the same "WAL segment has already been removed" error.
Any idea how I can address this issue? I have used rsync to set up replication without any issues in the past on Postgres 9.6, but have been experiencing this issue on Postgres 11.
standby_mode = 'on'
primary_conninfo = 'user=postgres host=MASTER port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres'
restore_command='cp /var/lib/postgresql/restore/%f %p'
Put a restore_command into recovery.conf that can restore archived WAL files and you are fine.
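For example, a minimal recovery.conf along those lines (the archive location and the use of scp here are assumptions; point restore_command at wherever the master's archive_command actually stores its WAL files):
standby_mode = 'on'
primary_conninfo = 'user=postgres host=MASTER port=5432 sslmode=prefer'
restore_command = 'scp MASTER:/path/to/wal_archive/%f %p'   # hypothetical archive path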

server closed the connection unexpectedly when running pg_dump

I ran pg_dump on my VPS server and it threw this error:
pg_dump: [archiver (db)] query failed: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
pg_dump: [archiver (db)] query was: SELECT
( SELECT alias FROM pg_catalog.ts_token_type('22171'::pg_catalog.oid) AS t
WHERE t.tokid = m.maptokentype ) AS tokenname,
m.mapdict::pg_catalog.regdictionary AS dictname
FROM pg_catalog.pg_ts_config_map AS m
WHERE m.mapcfg = '22172'
ORDER BY m.mapcfg, m.maptokentype, m.mapseqno
Then I noticed the SQL in the above error:
SELECT
( SELECT alias FROM pg_catalog.ts_token_type('22171'::pg_catalog.oid) AS t
WHERE t.tokid = m.maptokentype ) AS tokenname,
m.mapdict::pg_catalog.regdictionary AS dictname
FROM pg_catalog.pg_ts_config_map AS m
WHERE m.mapcfg = '22172'
ORDER BY m.mapcfg, m.maptokentype, m.mapseqno
So I tried to run SELECT alias FROM pg_catalog.ts_token_type('22171'::pg_catalog.oid) in psql,
and it threw this error:
pzz_development=# SELECT alias FROM pg_catalog.ts_token_type('22171'::pg_catalog.oid);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!> \q
How can I figure out the problem, and dump my data properly?
EDIT:
Then I checked the PostgreSQL log at /var/log/postgresql/postgresql-9.3-main.log:
2015-08-10 16:22:49 CST LOG: server process (PID 4029) was terminated by signal 11: Segmentation fault
2015-08-10 16:22:49 CST DETAIL: Failed process was running: SELECT
( SELECT alias FROM pg_catalog.ts_token_type('22171'::pg_catalog.oid) AS t
WHERE t.tokid = m.maptokentype ) AS tokenname,
m.mapdict::pg_catalog.regdictionary AS dictname
FROM pg_catalog.pg_ts_config_map AS m
WHERE m.mapcfg = '22172'
ORDER BY m.mapcfg, m.maptokentype, m.mapseqno
2015-08-10 16:22:49 CST LOG: terminating any other active server processes
2015-08-10 16:22:49 CST WARNING: terminating connection because of crash of another server process
2015-08-10 16:22:49 CST DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2015-08-10 16:22:49 CST HINT: In a moment you should be able to reconnect to the database and repeat your command.
2015-08-10 16:22:49 CST LOG: all server processes terminated; reinitializing
2015-08-10 16:22:49 CST LOG: database system was interrupted; last known up at 2015-08-10 16:22:45 CST
2015-08-10 16:22:50 CST LOG: database system was not properly shut down; automatic recovery in progress
2015-08-10 16:22:50 CST LOG: unexpected pageaddr 0/2AE6000 in log segment 000000010000000000000004, offset 11427840
2015-08-10 16:22:50 CST LOG: redo is not required
2015-08-10 16:22:50 CST LOG: MultiXact member wraparound protections are now enabled
2015-08-10 16:22:50 CST LOG: autovacuum launcher started
2015-08-10 16:22:50 CST LOG: database system is ready to accept connections
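The failing query is part of how pg_dump dumps text search configurations, and the hard-coded OIDs refer to a text search parser (22171) and a text search configuration (22172). To see which objects those actually are (for instance, ones left behind by a custom text search extension), a lookup like this might help (a sketch):
-- Which text search parser and configuration do the OIDs refer to?
SELECT oid, prsname FROM pg_catalog.pg_ts_parser WHERE oid = 22171;
SELECT oid, cfgname FROM pg_catalog.pg_ts_config WHERE oid = 22172;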