CephFS MDS stuck in `up:rejoin` state - `failed to open ino 0x101 err -116/0`

The Ceph cluster stalled when its disks filled up. After I freed up space, the CephFS MDS daemons have been stuck in the rejoin state for a long time.
Truncated output of ceph -s:
  cluster:
    id:     (deleted)
    health: HEALTH_WARN
            1 filesystem is degraded
  services:
    mon: 6 daemons, deleted
    mgr: deleted(active, since 3h), standbys:
    mds: fs:2/2{fs:0=mds1=up:rejoin,fs:1=mds2=up:rejoin} 1 up:standby
    osd: 9 osds: 9 up (since 3h), 9 in (since 6w)
  data:
    pools:   10 pools, 849 pgs
    objects: deleted
    usage:   deleted
    pgs:     849 active+clean
I checked the log of mds1; it says mds.0.cache failed to open ino 0x101 err -116/0.
Could anyone help me repair the MDS and make the filesystem healthy?
Ceph version:
ceph version 15.2.5 (2c93eff00150f0cc5f106a559557a58d3d7b6f1f) octopus (stable)
Full mds log here:
2020-11-11T11:59:53.940+0800 7f1bfaad0300 0 ceph version 15.2.5 (2c93eff00150f0cc5f106a559557a58d3d7b6f1f) octopus (stable), process ceph-mds, pid 1437936
2020-11-11T11:59:53.940+0800 7f1bfaad0300 1 main not setting numa affinity
2020-11-11T11:59:53.940+0800 7f1bfaad0300 0 pidfile_write: ignore empty --pid-file
2020-11-11T11:59:53.948+0800 7f1be9df7700 1 mds.mds1 Updating MDS map to version 250302 from mon.2
2020-11-11T11:59:54.952+0800 7f1be9df7700 1 mds.mds1 Updating MDS map to version 250303 from mon.2
2020-11-11T11:59:54.952+0800 7f1be9df7700 1 mds.mds1 Monitors have assigned me to become a standby.
2020-11-11T11:59:54.961+0800 7f1be9df7700 1 mds.mds1 Updating MDS map to version 250304 from mon.2
2020-11-11T11:59:54.961+0800 7f1be9df7700 1 mds.0.250304 handle_mds_map i am now mds.0.250304
2020-11-11T11:59:54.961+0800 7f1be9df7700 1 mds.0.250304 handle_mds_map state change up:boot --> up:replay
2020-11-11T11:59:54.961+0800 7f1be9df7700 1 mds.0.250304 replay_start
2020-11-11T11:59:54.961+0800 7f1be9df7700 1 mds.0.250304 recovery set is 1
2020-11-11T11:59:54.962+0800 7f1be9df7700 1 mds.0.250304 waiting for osdmap 8067 (which blacklists prior instance)
2020-11-11T11:59:54.965+0800 7f1be35ea700 -1 mds.0.openfiles _load_finish got (2) No such file or directory
2020-11-11T11:59:54.969+0800 7f1be2de9700 0 mds.0.cache creating system inode with ino:0x100
2020-11-11T11:59:54.969+0800 7f1be2de9700 0 mds.0.cache creating system inode with ino:0x1
2020-11-11T11:59:59.340+0800 7f1be1de7700 1 mds.0.250304 Finished replaying journal
2020-11-11T11:59:59.340+0800 7f1be1de7700 1 mds.0.250304 making mds journal writeable
2020-11-11T12:00:00.018+0800 7f1be9df7700 1 mds.mds1 Updating MDS map to version 250305 from mon.2
2020-11-11T12:00:00.018+0800 7f1be9df7700 1 mds.0.250304 handle_mds_map i am now mds.0.250304
2020-11-11T12:00:00.019+0800 7f1be9df7700 1 mds.0.250304 handle_mds_map state change up:replay --> up:resolve
2020-11-11T12:00:00.019+0800 7f1be9df7700 1 mds.0.250304 resolve_start
2020-11-11T12:00:00.019+0800 7f1be9df7700 1 mds.0.250304 reopen_log
2020-11-11T12:00:40.991+0800 7f1be9df7700 1 mds.mds1 Updating MDS map to version 250307 from mon.2
2020-11-11T12:00:40.991+0800 7f1be9df7700 1 mds.0.cache handle_mds_failure mds.1 : recovery peers are 1
2020-11-11T12:00:46.078+0800 7f1be9df7700 1 mds.mds1 Updating MDS map to version 250308 from mon.2
2020-11-11T12:00:46.078+0800 7f1be9df7700 1 mds.0.250304 recovery set is 1
2020-11-11T12:00:46.279+0800 7f1be9df7700 1 mds.0.250304 resolve_done
2020-11-11T12:00:47.098+0800 7f1be9df7700 1 mds.mds1 Updating MDS map to version 250309 from mon.2
2020-11-11T12:00:47.098+0800 7f1be9df7700 1 mds.0.250304 handle_mds_map i am now mds.0.250304
2020-11-11T12:00:47.098+0800 7f1be9df7700 1 mds.0.250304 handle_mds_map state change up:resolve --> up:reconnect
2020-11-11T12:00:47.098+0800 7f1be9df7700 1 mds.0.250304 reconnect_start
2020-11-11T12:00:47.098+0800 7f1be9df7700 1 mds.0.server reconnect_clients -- 20 sessions
2020-11-11T12:00:47.098+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.421063 v1:192.168.60.121:0/3417198623 after 0
2020-11-11T12:00:47.098+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.754486 v1:192.168.60.112:0/2544559814 after 0
2020-11-11T12:00:47.099+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.754513 v1:192.168.60.105:0/1293692070 after 0.00100002
2020-11-11T12:00:47.099+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.1225207 v1:192.168.60.91:0/3148420742 after 0.00100002
2020-11-11T12:00:47.099+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.1225703 v1:192.168.60.170:0/1268068775 after 0.00100002
2020-11-11T12:00:47.099+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.754480 v1:192.168.60.102:0/2002454818 after 0.00100002
2020-11-11T12:00:47.099+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.1225690 v1:192.168.60.90:0/2591854104 after 0.00100002
2020-11-11T12:00:47.099+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.754449 v1:192.168.60.109:0/1906666522 after 0.00100002
2020-11-11T12:00:47.099+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.421610 v1:192.168.60.122:0/3403538656 after 0.00100002
2020-11-11T12:00:47.100+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.396098 v1:192.168.60.10:0/1483765764 after 0.00200004
2020-11-11T12:00:47.100+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.394564 v1:192.168.60.123:0/3786388104 after 0.00200004
2020-11-11T12:00:47.100+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.424769 v1:192.168.60.120:0/10753295 after 0.00200004
2020-11-11T12:00:47.102+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.396441 v1:192.168.60.101:0/3362363763 after 0.00400008
2020-11-11T12:00:47.104+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.754541 v1:192.168.60.106:0/2279833643 after 0.00600011
2020-11-11T12:00:47.105+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.754519 v1:192.168.60.111:0/2462281130 after 0.00700013
2020-11-11T12:00:47.106+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.754535 v1:192.168.60.110:0/3350031855 after 0.00800015
2020-11-11T12:00:47.106+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.754467 v1:192.168.60.100:0/3784129623 after 0.00800015
2020-11-11T12:00:47.107+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.754461 v1:192.168.60.103:0/1624035805 after 0.00900017
2020-11-11T12:00:47.108+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.754473 v1:192.168.60.108:0/1815689189 after 0.0100002
2020-11-11T12:00:47.108+0800 7f1be9df7700 0 log_channel(cluster) log [DBG] : reconnect by client.754580 v1:192.168.60.104:0/681341054 after 0.0100002
2020-11-11T12:00:47.109+0800 7f1be9df7700 1 mds.0.250304 reconnect_done
2020-11-11T12:00:48.097+0800 7f1be9df7700 1 mds.mds1 Updating MDS map to version 250310 from mon.2
2020-11-11T12:00:48.097+0800 7f1be9df7700 1 mds.0.250304 handle_mds_map i am now mds.0.250304
2020-11-11T12:00:48.097+0800 7f1be9df7700 1 mds.0.250304 handle_mds_map state change up:reconnect --> up:rejoin
2020-11-11T12:00:48.097+0800 7f1be9df7700 1 mds.0.250304 rejoin_start
2020-11-11T12:00:48.103+0800 7f1be9df7700 1 mds.0.250304 rejoin_joint_start
2020-11-11T12:00:48.110+0800 7f1be35ea700 0 mds.0.cache failed to open ino 0x101 err -116/0
Waiting for your help, thanks!
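For readers decoding the failing log line: the first number after "err" appears to be a negated errno value, and on Linux errno 116 is ESTALE (stale file handle). The sketch below is a minimal Python helper for pulling the inode number and error name out of such a line. The regular expression and the name parse_open_ino_failure are illustrative assumptions based on that single line, not part of Ceph, and the second number (the 0 after the slash) is not interpreted here.

```python
import errno
import re

# Pattern for the MDS message quoted above; an assumption based on that one line.
LINE_RE = re.compile(r"failed to open ino (0x[0-9a-fA-F]+) err (-?\d+)/(-?\d+)")


def parse_open_ino_failure(line):
    """Return (inode number, errno name), or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    ino = int(m.group(1), 16)
    code = abs(int(m.group(2)))
    # errno.errorcode maps numbers to names for the platform running this script,
    # so run it on Linux to decode values taken from a Linux log.
    return ino, errno.errorcode.get(code, "unknown")


if __name__ == "__main__":
    sample = "mds.0.cache failed to open ino 0x101 err -116/0"
    print(parse_open_ino_failure(sample))
```

Knowing the error name does not by itself say how to repair the MDS, but it narrows down what to search for in the Ceph documentation and tracker.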

Related

create tablespace problem in db2 HADR environment

We run Db2 10.5.0.7 on CentOS 6.9 with TSAMP 3.2 as our high-availability solution. When we create a tablespace on the primary database, we see the following errors on the standby:
2019-08-31-08.47.32.164952+270 I87056E2779 LEVEL: Error (OS)
PID : 4046 TID : 47669095425792 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMDB
APPHDL : 0-8 APPID: *LOCAL.DB2.190725231126
HOSTNAME: samdb-b EDUID : 155 EDUNAME: db2redom
(SAMDB) 0 FUNCTION: DB2 Common, OSSe, ossGetDiskInfo, probe:130
MESSAGE : ECF=0x90000001=-1879048191=ECF_ACCESS_DENIED
Access denied CALLED : OS, -, fopen OSERR: EACCES (13) DATA #1 : String, 12 bytes /proc/mounts DATA #2 :
String, 25 bytes /dbdata1/samdbTsContainer DATA #3 : unsigned integer,
8 bytes
2019-08-31-08.47.32.185625+270 E89836E494 LEVEL: Error PID
: 4046 TID : 47669095425792 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMDB
APPHDL : 0-8 APPID: *LOCAL.DB2.190725231126
HOSTNAME: samdb-b EDUID : 155 EDUNAME: db2redom
(SAMDB) 0 FUNCTION: DB2 UDB, high avail services,
sqlhaGetLocalDiskInfo, probe:9433 MESSAGE :
ECF=0x90000001=-1879048191=ECF_ACCESS_DENIED
Access denied
2019-08-31-08.47.32.186258+270 E90331E484 LEVEL: Error PID
: 4046 TID : 47669095425792 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMDB
APPHDL : 0-8 APPID: *LOCAL.DB2.190725231126
HOSTNAME: samdb-b EDUID : 155 EDUNAME: db2redom
(SAMDB) 0 FUNCTION: DB2 UDB, high avail services, sqlhaCreateMount,
probe:9746 RETCODE : ZRC=0x827300AA=-2106392406=HA_ZRC_FAILED "SQLHA
API call error"
2019-08-31-08.47.32.186910+270 I90816E658 LEVEL: Error PID
: 4046 TID : 47669095425792 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMDB
APPHDL : 0-8 APPID: *LOCAL.DB2.190725231126
HOSTNAME: samdb-b EDUID : 155 EDUNAME: db2redom
(SAMDB) 0 FUNCTION: DB2 UDB, buffer pool services,
sqlbDMSAddContainerRequest, probe:812 MESSAGE :
ZRC=0x827300AA=-2106392406=HA_ZRC_FAILED "SQLHA API call error" DATA
: String, 36 bytes Cluster add mount operation failed: DATA #2 : String, 37 bytes /dbdata1/samdbTsContainer/TSPKGCACH.1 DATA #3 :
String, 8 bytes SAMDB
2019-08-31-08.47.32.190537+270 E113909E951 LEVEL: Error PID
: 4046 TID : 47669095425792 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMDB
APPHDL : 0-8 APPID: *LOCAL.DB2.190725231126
HOSTNAME: samdb-b EDUID : 155 EDUNAME: db2redom
(SAMDB) 0 FUNCTION: DB2 UDB, buffer pool services,
sqlblog_reCreatePool, probe:3134 MESSAGE : ADM6106E Table space
"TSPKGCACH" (ID = "49") could not be created
during the rollforward operation. The most likely cause is that there
is not enough space to create the containers associated with the
table space. Connect to the database after the rollforward operation
completes and use the SET TABLESPACE CONTAINERS command to assign
containers to the table space. Then, issue another ROLLFORWARD
DATABASE command to complete recovery of this table space.
2019-08-31-08.47.32.200949+270 E114861E592 LEVEL: Error PID
: 4046 TID : 47669095425792 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMDB
APPHDL : 0-8 APPID: *LOCAL.DB2.190725231126
HOSTNAME: samdb-b EDUID : 155 EDUNAME: db2redom
(SAMDB) 0 FUNCTION: DB2 UDB, buffer pool services, sqlbIncPoolState,
probe:4628 MESSAGE : ADM12512W Log replay on the HADR standby has
stopped on table space
"TSPKGCACH" (ID "49") because it has been put into "ROLLFORWARD
PENDING" state.
There is free space available for the database, the specified path (/dbdata1/samdbTsContainer) exists on the server, and we can create files in it manually.
All settings are identical on the primary and standby. db2inst1 owns /dbdata1/samdbTsContainer and its permissions are drwxr-xr-x, su - db2inst1 "ulimit -Hf" reports unlimited, and the file system type is ext3. The CREATE TABLESPACE statement is as follows:
CREATE LARGE TABLESPACE TSPKGCACH IN DATABASE PARTITION GROUP IBMDEFAULTGROUP PAGESIZE 8 K MANAGED BY DATABASE USING (FILE '/dbdata1/samdbTsContainer/TSPKGCACH.1' 5120) ON DBPARTITIONNUM (0) EXTENTSIZE 64 PREFETCHSIZE 64 BUFFERPOOL BP8KPKGCACH OVERHEAD 10.5 TRANSFERRATE 0.14 DATA TAG NONE NO FILE SYSTEM CACHING;
SELinux is disabled and the sector size is 512 bytes. The mount options are as follows:
/dev/sdf1 /dbdata1 ext3 rw,relatime,errors=continue,barrier=1,data=ordered 0 0
We cannot reproduce the problem on demand: it occurs only occasionally and we do not know why, but once it occurs it persists until the server is rebooted.
Restarting the standby server resolves the problem, but we then have to drop the tablespace and recreate it. Does anyone have an idea what causes this?
From the error, it looks to me as though the problem is not with file access itself but with /proc/mounts, which Db2 uses to map containers to file systems (for example, to determine the file system type). I therefore suggest testing whether all of the following:
cat /proc/mounts
cat /proc/self/mounts
mount
work when run as the Db2 instance owner (db2inst1). If they do not, this points to an odd OS issue that Db2 is a victim of, and we would need more OS diagnostics (e.g. an strace of the cat /proc/mounts command) to understand it.
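The same check can be scripted so that it can be repeated whenever the error appears. This is a minimal Python sketch, intended to be run as the instance owner; the function name check_readable is hypothetical, and it only covers the two files, not the mount command.

```python
import errno

# The two mount-information files named above.
MOUNT_FILES = ("/proc/mounts", "/proc/self/mounts")


def check_readable(paths=MOUNT_FILES):
    """Try to open each path; map path -> 'ok' or the errno name of the failure."""
    results = {}
    for path in paths:
        try:
            with open(path, "r") as fh:
                fh.readline()
            results[path] = "ok"
        except OSError as exc:
            results[path] = errno.errorcode.get(exc.errno, str(exc.errno))
    return results


if __name__ == "__main__":
    for path, status in check_readable().items():
        print(f"{path}: {status}")
```

An EACCES result for either file would match the OSERR reported by ossGetDiskInfo in the db2diag entries.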
Edit:
To confirm this theory, I ran a quick test with Db2 11.1. Note that this must be a TSA-controlled environment for Db2 to follow the sqlhaCreateMount code path (because if the path is on a separate mount, Db2 adds it to the TSA resource model).
On both primary and standby:
mkdir /db2data
chown db2v111:db2iadm /db2data
then on standby:
chmod o-rx /proc
(I couldn't find a "smarter" way to hit EACCES on the mount info.)
When I then run on the primary:
db2 "create tablespace test managed by database using (file '/db2data/testts' 100 M)"
it completes fine on the primary, but the standby hits exactly the error you are seeing:
2019-06-21-03.00.37.087693+120 I1774E2661 LEVEL: Error (OS)
PID : 10379 TID : 46912992438016 PROC : db2sysc 0
INSTANCE: db2v111 NODE : 000 DB : SAMPLE
APPHDL : 0-4492 APPID: *LOCAL.DB2.190621005919
HOSTNAME: rhel-hadrs.kkuduk.com
EDUID : 61 EDUNAME: db2redom (SAMPLE) 0
FUNCTION: DB2 Common, OSSe, ossGetDiskInfo, probe:130
MESSAGE : ECF=0x90000001=-1879048191=ECF_ACCESS_DENIED
Access denied
CALLED : OS, -, fopen OSERR: EACCES (13)
DATA #1 : String, 12 bytes
/proc/mounts
DATA #2 : String, 8 bytes
/db2data
DATA #3 : unsigned integer, 8 bytes
1
CALLSTCK: (Static functions may not be resolved correctly, as they are resolved to the nearest symbol)
[0] 0x00002AAAB9CFD84B /home/db2v111/sqllib/lib64/libdb2osse.so.1 + 0x23F84B
[1] 0x00002AAAB9CFED51 ossLogSysRC + 0x101
[2] 0x00002AAAB9D19647 ossGetDiskInfo + 0xF07
[3] 0x00002AAAAC52402C _Z21sqlhaGetLocalDiskInfoPKcjPcjS1_jS1_ + 0x26C
[4] 0x00002AAAAC523C5F _Z16sqlhaGetDiskInfoPKcS0_jPcjS1_jS1_ + 0x29F
[5] 0x00002AAAAC521CA0 _Z16sqlhaCreateMountPKcS0_m + 0x350
[6] 0x00002AAAACDE8D5D _Z26sqlbDMSAddContainerRequestP12SQLB_POOL_CBP16SQLB_POOLCONT_CBP12SQLB_GLOBALSP14SQLB_pfParIoCbbm + 0x90D
[7] 0x00002AAAACE14FF9 _Z29sqlbDoDMSAddContainerRequestsP12SQLB_POOL_CBP16SQLB_POOLCONT_CBjP26SQLB_AS_CONT_AND_PATH_INFOP12SQLB_GLOBALS + 0x2D9
[8] 0x00002AAAACE0C20F _Z17sqlbDMSCreatePoolP12SQLB_POOL_CBiP16SQLB_POOLCONT_CBbP12SQLB_GLOBALS + 0x103F
[9] 0x00002AAAACDB1EAC _Z13sqlbSetupPoolP12SQLB_GLOBALSP12SQLB_POOL_CBPKciiiihiP19SQLB_CONTAINER_SPECllblsib + 0xE4C
-> It is an issue with /proc/mounts access, not with the target path itself, where I can write with no issues:
[db2v111@rhel-hadrs ~]$ echo "test" > /db2data/testfile
If it were a path access issue:
chmod o+rx /proc
chmod a-rw /db2data
then the error during the "CREATE TABLESPACE" redo on the standby would be different:
2019-06-21-03.07.29.175486+120 I35023E592 LEVEL: Error
PID : 10379 TID : 46912992438016 PROC : db2sysc 0
INSTANCE: db2v111 NODE : 000 DB : SAMPLE
APPHDL : 0-4492 APPID: *LOCAL.DB2.190621005919
HOSTNAME: rhel-hadrs.kkuduk.com
EDUID : 61 EDUNAME: db2redom (SAMPLE) 0
FUNCTION: DB2 UDB, buffer pool services, sqlbCreateAndLockParent, probe:918
MESSAGE : ZRC=0x8402001E=-2080243682=SQLB_CONTAINER_NOT_ACCESSIBLE
"Container not accessible"
DATA #1 : <preformatted>
Failed at directory /db2data.
2019-06-21-03.07.29.175799+120 I35616E619 LEVEL: Severe
PID : 10379 TID : 46912992438016 PROC : db2sysc 0
INSTANCE: db2v111 NODE : 000 DB : SAMPLE
APPHDL : 0-4492 APPID: *LOCAL.DB2.190621005919
HOSTNAME: rhel-hadrs.kkuduk.com
EDUID : 61 EDUNAME: db2redom (SAMPLE) 0
FUNCTION: DB2 UDB, buffer pool services, sqlbCreateAndLockParent, probe:722
MESSAGE : ZRC=0x8402001E=-2080243682=SQLB_CONTAINER_NOT_ACCESSIBLE
"Container not accessible"
DATA #1 : <preformatted>
Failed to create a portion of the path /db2data/testts2
(A few more errors follow, pointing directly to the permissions on /db2data.)
This shows that it is a /proc access issue. Perhaps /proc gets completely unmounted?
In any case, the actual issue is the db2sysc process hitting EACCES when it calls fopen on /proc/mounts, and you need to debug it further with your OS team.
Edit:
To debug this and prove that the error is returned by the OS, we would have to trace the open() syscalls made by Db2. strace can do that, but its overhead is too high for a production system. If you can get SystemTap installed on the system, I suggest a script like this (a basic version):
probe nd_syscall.open.return
{
  // Report only open() calls whose path ends in "mounts"
  if (user_string(@entry(pointer_arg(1))) =~ ".*mounts")
  {
    printf("exec: %s pid: %d uid: %d (euid: %d) gid: %d (egid: %d) run open(%s) rc: %d\n", execname(), pid(), uid(), euid(), gid(), egid(), user_string(@entry(pointer_arg(1)), "-"), returnval())
  }
}
It uses the nd_syscall probe, so it works even without the kernel debuginfo package. You can run it like this:
$ stap open.stap
exec: cat pid: 24159 uid: 0 (euid: 0) gid: 0 (egid: 0) run open(/proc/mounts) rc: 3
exec: cat pid: 24210 uid: 0 (euid: 0) gid: 0 (egid: 0) run open(/proc/mounts) rc: 3
exec: cat pid: 24669 uid: 1111 (euid: 1111) gid: 1001 (egid: 1001) run open(/proc/mounts) rc: 3
exec: cat pid: 24734 uid: 1111 (euid: 1111) gid: 1001 (egid: 1001) run open(/proc/mounts) rc: -13
exec: cat pid: 24891 uid: 1111 (euid: 1111) gid: 1001 (egid: 1001) run open(/proc/self/mounts) rc: -13
exec: ls pid: 24971 uid: 1111 (euid: 1111) gid: 1001 (egid: 1001) run open(/proc/mounts) rc: -13
-> At some point I revoked access to /proc, and the open attempts then failed with -13 (EACCES). You just need to run the script on the system when you see the error and check whether something is logged when Db2 fails.
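If the script is left running for a while, the output can be filtered down to the failing calls. The following Python sketch assumes the exact printf format used in the script above; the names STAP_RE and failed_opens are hypothetical.

```python
import errno
import re

# Matches the printf format of the SystemTap script above.
STAP_RE = re.compile(
    r"exec: (?P<exe>\S+) pid: (?P<pid>\d+) uid: (?P<uid>\d+) \(euid: (?P<euid>\d+)\) "
    r"gid: (?P<gid>\d+) \(egid: (?P<egid>\d+)\) run open\((?P<path>[^)]*)\) rc: (?P<rc>-?\d+)"
)


def failed_opens(lines):
    """Yield (exe, pid, uid, path, errno name) for every open() that returned < 0."""
    for line in lines:
        m = STAP_RE.search(line)
        if m is None:
            continue
        rc = int(m.group("rc"))
        if rc < 0:
            yield (
                m.group("exe"),
                int(m.group("pid")),
                int(m.group("uid")),
                m.group("path"),
                errno.errorcode.get(-rc, "unknown"),
            )


if __name__ == "__main__":
    sample = [
        "exec: cat pid: 24669 uid: 1111 (euid: 1111) gid: 1001 (egid: 1001) run open(/proc/mounts) rc: 3",
        "exec: cat pid: 24734 uid: 1111 (euid: 1111) gid: 1001 (egid: 1001) run open(/proc/mounts) rc: -13",
    ]
    for failure in failed_opens(sample):
        print(failure)
```

A line with exe equal to db2sysc and an EACCES result would confirm that the OS, not Db2, is rejecting the open.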

What does the slash denote in postgresql virtual transaction identifiers?

For example, my postgresql-main.log contains entries such as the following:
process 20234 session 5901502e.4f0a vtransaction transaction 0 : LOG: connection received: host=[local]
process 20234 session 5901502e.4f0a vtransaction 2/1 transaction 0 : LOG: connection authorized: user=postgres database=postgres
process 20234 session 5901502e.4f0a vtransaction 2/2 transaction 0 : LOG: statement: SELECT d.datname as "Name",
pg_catalog.pg_get_userbyid(d.datdba) as "Owner",
pg_catalog.pg_encoding_to_char(d.encoding) as "Encoding",
d.datcollate as "Collate",
d.datctype as "Ctype",
pg_catalog.array_to_string(d.datacl, E'\n') AS "Access privileges"
FROM pg_catalog.pg_database d
ORDER BY 1;
process 20234 session 5901502e.4f0a vtransaction transaction 0 : LOG: disconnection: session time: 0:00:00.004 user=postgres database=postgres host=[local]
process 20237 session 5901502f.4f0d vtransaction transaction 0 : LOG: connection received: host=[local]
process 20237 session 5901502f.4f0d vtransaction 2/3 transaction 0 : LOG: connection authorized: user=postgres database=postgres
2017-04-26 19:58:07 MDT process 20237 remote [local] session 5901502f.4f0d vtransaction 2/4 transaction 0 : LOG: statement: SELECT d.datname as "Name",
pg_catalog.pg_get_userbyid(d.datdba) as "Owner",
pg_catalog.pg_encoding_to_char(d.encoding) as "Encoding",
d.datcollate as "Collate",
d.datctype as "Ctype",
pg_catalog.array_to_string(d.datacl, E'\n') AS "Access privileges"
FROM pg_catalog.pg_database d
ORDER BY 1;
process 20237 session 5901502f.4f0d vtransaction transaction 0 : LOG: disconnection: session time: 0:00:00.002 user=postgres database=postgres host=[local]
Is there any relation between the virtual transaction IDs 2/1 and 2/2 above, given that they share the prefix 2/?
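For context, the PostgreSQL documentation for log_line_prefix (in the versions current when this was asked) describes the %v escape as the virtual transaction ID in the form backendID/localXID. Under that reading, the part before the slash identifies a backend slot and the part after it is a counter local to that slot, which is consistent with process 20237 continuing at 2/3 after process 20234 used 2/1 and 2/2. A small Python sketch that splits the IDs from the log above; the helper name split_vxid is hypothetical.

```python
def split_vxid(vxid):
    """Split a 'backendID/localXID' string into two integers."""
    backend_id, local_xid = vxid.split("/")
    return int(backend_id), int(local_xid)


if __name__ == "__main__":
    for vxid in ("2/1", "2/2", "2/3", "2/4"):
        backend_id, local_xid = split_vxid(vxid)
        print(f"backend slot {backend_id}, local transaction counter {local_xid}")
```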

Full text search with Postgres

How do I do a full-text search in Postgres across all columns without preprocessing? I found http://www.postgresql.org/docs/9.3/static/textsearch-intro.html but I'm not exactly sure what I need to do.
My initial impression is that I need to automatically concatenate every column (how do I do that? I can't find it via Googling), put the result in a WHERE clause, and match it with @@ to_tsquery.
This is for https://github.com/timwis/node-soda2-parser/issues/1 and I'm not concerned about poor performance.
I tried starting with
select array_to_string(translate(string_to_array(r::text, ',')::text, '()', '')::text[], ' ')::tsvector FROM seattle_police_govqa_audit_trails as r LIMIT 1
But get:
{"readyState":4,"responseText":"{\"error\":[\"syntax error in tsvector: \\\"1 -1 -1 -1 -1 -1 0 0 1 1 2 3 3500 5 7007 198 1264 NULL NULL \\\"Answer created by staff\\\" NULL NULL \\\"9/24/2015 16:01\\\" A000198-092415\\\"\"]}","responseJSON":{"error":["syntax error in tsvector: \"1 -1 -1 -1 -1 -1 0 0 1 1 2 3 3500 5 7007 198 1264 NULL NULL \"Answer created by staff\" NULL NULL \"9/24/2015 16:01\" A000198-092415\""]},"status":400,"statusText":"Bad Request"}
select * FROM seattle_police_govqa_audit_trails as r WHERE regexp_replace(array_to_string(translate(string_to_array(r::text, ',')::text, '()', '')::text[], ' '), '[^a-zA-Z\s]', '', 'g')::tsvector @@ 'created'::tsquery = true LIMIT 10
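One likely reason the first attempt fails is that a ::tsvector cast parses its input as tsvector literal syntax, whereas to_tsvector() parses and normalizes ordinary document text. The sketch below uses Python only to assemble such a statement. It assumes that the whole row cast to text (r::text, as in the attempts above) is an acceptable document; the function name build_search_query and the %s placeholder style (as used by psycopg-style drivers) are assumptions, and the query has not been run against the table in question.

```python
import re

# Accept only plain identifiers, since the table name is interpolated into the SQL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_search_query(table, limit=10):
    """Return SQL that full-text searches every column of `table` via r::text."""
    if not _IDENT.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    return (
        f"SELECT * FROM {table} AS r "
        "WHERE to_tsvector(r::text) @@ plainto_tsquery(%s) "
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    # The search term would be passed separately as a bound parameter.
    print(build_search_query("seattle_police_govqa_audit_trails"))
```

Because no index can be used on an expression computed per row like this, the query scans the whole table, which matches the stated lack of concern about performance.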

Unable to restore the Db2 backup

I am unable to restore a Db2 backup (Db2 version 9.7, AIX version 5.3). I get the following error when executing this line:
rollforward db VATAMC2 stop
SQL0752N Connecting to a database is not permitted within a logical unit of
work when the CONNECT type 1 setting is in use. SQLSTATE=0A001
Below is the set of commands that I executed to restore the database.
Only the last statement fails; all the other statements execute successfully.
The logs are included below.
Can someone help me with this?
Commands I executed:
restore db PRODVAT user mstadmin using password from /data/db2_backupfs taken at 20160304 into VATAMC2 redirect without prompting;
set tablespace containers for 29 using (path '/data/VAT_AMC2/tbs29/cont1');
set tablespace containers for 46 using (path '/data/VAT_AMC2/tbs46/cont1');
set tablespace containers for 47 using (path '/data/VAT_AMC2/tbs47/cont1');
set tablespace containers for 43 using (path '/data/VAT_AMC2/tbs43/cont1');
set tablespace containers for 48 using (path '/data/VAT_AMC2/tbs48/cont1');
set tablespace containers for 44 using (path '/data/VAT_AMC2/tbs44/cont1');
set tablespace containers for 34 using (file '/data/VAT_AMC2/tbs34/cont1' 589824);
set tablespace containers for 35 using (file '/data/VAT_AMC2/tbs35/cont1' 4680192, file '/data/VAT_AMC2/tbs35/cont1a' 4680192);
set tablespace containers for 36 using (file '/data/VAT_AMC2/tbs36/cont2' 3669984, file '/data/VAT_AMC2/tbs36/cont2a' 3669984);
set tablespace containers for 37 using (file '/data/VAT_AMC2/tbs37/cont1' 3898624, file '/data/VAT_AMC2/tbs37/cont1a' 3898624);
set tablespace containers for 38 using (file '/data/VAT_AMC2/tbs38/cont1' 3295788 , file '/data/VAT_AMC2/tbs38/cont1a' 1647892);
set tablespace containers for 39 using (file '/data/VAT_AMC2/tbs39/cont1' 3669976 , file '/data/VAT_AMC2/tbs39/cont12a' 3669976, file '/data/VAT_AMC2/tbs39/cont12b' 3669976, file '/data/VAT_AMC2/tbs39/cont13a' 3669976);
set tablespace containers for 40 using (file '/data/VAT_AMC2/tbs40/cont1' 5509291, file '/data/VAT_AMC2/tbs40/cont1a' 5509290, file '/data/VAT_AMC2/tbs40/cont1b' 5509290);
set tablespace containers for 41 using (file '/data/VAT_AMC2/tbs41/cont1' 2294656);
set tablespace containers for 42 using (file '/data/VAT_AMC2/tbs42/cont1' 4718592);
set tablespace containers for 45 using (file '/data/VAT_AMC2/tbs45/cont1' 1987872, file '/data/VAT_AMC2/tbs45/cont2' 1987872);
set tablespace containers for 49 using (file '/data/VAT_AMC2/tbs49/cont1' 4476974, file '/data/VAT_AMC2/tbs49/cont2' 4476973, file '/data/VAT_AMC2/tbs49/cont3' 4476973);
set tablespace containers for 50 using (file '/data/VAT_AMC2/tbs50/cont1' 3744512);
set tablespace containers for 51 using (file '/data/VAT_AMC2/tbs51/cont1' 5639424);
set tablespace containers for 52 using (file '/data/VAT_AMC2/tbs52/cont1' 4132096);
set tablespace containers for 53 using (file '/data/VAT_AMC2/tbs53/cont1' 3932160);
set tablespace containers for 54 using (file '/data/VAT_AMC2/tbs54/cont1' 4469376, file '/data/VAT_AMC2/tbs54/cont2' 4469376, file '/data/VAT_AMC2/tbs54/cont3' 4469376);
set tablespace containers for 55 using (file '/data/VAT_AMC2/tbs55/cont1' 4194304, file '/data/VAT_AMC2/tbs55/cont2a' 4194304, file '/data/VAT_AMC2/tbs55/cont2b' 4194304, file '/data/VAT_AMC2/tbs55/cont3a' 4194304);
restore db PRODVAT continue;
rollforward db VATAMC2 stop;
Db2 Log:
2016-03-10-09.06.41.911428-360 E6918736A469 LEVEL: Event
PID : 389280 TID : 2572 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : VATAMC2
APPHDL : 0-7 APPID: *LOCAL.db2inst1.160310150641
AUTHID : MSTADMIN
EDUID : 2572 EDUNAME: db2agent (VATAMC2) 0
FUNCTION: DB2 UDB, base sys utilities, sqeLocalDatabase::FirstConnect, probe:1000
START : DATABASE: VATAMC2 : ACTIVATED: NO
2016-03-10-09.06.41.913584-360 E6919206A481 LEVEL: Info
PID : 389280 TID : 2572 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : VATAMC2
APPHDL : 0-7 APPID: *LOCAL.db2inst1.160310150641
AUTHID : MSTADMIN
EDUID : 2572 EDUNAME: db2agent (VATAMC2) 0
FUNCTION: DB2 UDB, database utilities, sqluxGetDegreeParallelism, probe:774
DATA #1 :
Autonomic BAR - using parallelism = 2.
2016-03-10-09.06.41.915158-360 E6919688A492 LEVEL: Info
PID : 389280 TID : 2572 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : VATAMC2
APPHDL : 0-7 APPID: *LOCAL.db2inst1.160310150641
AUTHID : MSTADMIN
EDUID : 2572 EDUNAME: db2agent (VATAMC2) 0
FUNCTION: DB2 UDB, database utilities, sqludPrintStartingMsg, probe:1338
DATA #1 :
Starting a full database restore.
Agent EDU ID: 2572
2016-03-10-09.06.41.915354-360 E6920181A554 LEVEL: Warning
PID : 389280 TID : 2572 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : VATAMC2
APPHDL : 0-7 APPID: *LOCAL.db2inst1.160310150641
AUTHID : MSTADMIN
EDUID : 2572 EDUNAME: db2agent (VATAMC2) 0
FUNCTION: DB2 UDB, database utilities, sqluCompareDB, probe:1445
DATA #1 : Sqlcode, PD_TYPE_SQLCODE, 4 bytes
2528
DATA #2 : Hexdump, 4 bytes
0x07000000491947D8 : 0000 09E0 ....
2016-03-10-09.06.41.915502-360 E6920736A985 LEVEL: Warning
PID : 389280 TID : 2572 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : VATAMC2
APPHDL : 0-7 APPID: *LOCAL.db2inst1.160310150641
AUTHID : MSTADMIN
EDUID : 2572 EDUNAME: db2agent (VATAMC2) 0
FUNCTION: DB2 UDB, database utilities, sqluCompareDB, probe:1445
MESSAGE : SQL2540W Restore is successful, however a warning "" was encountered
during Database Restore while processing in No Interrupt mode.
DATA #1 : SQLCA, PD_DB2_TYPE_SQLCA, 136 bytes
sqlcaid : SQLCA sqlcabc: 136 sqlcode: 2540 sqlerrml: 4
sqlerrmc: 2528
sqlerrp : sqluComp
sqlerrd : (1) 0x00000000 (2) 0x00000000 (3) 0x00000000
(4) 0x00000000 (5) 0x00000000 (6) 0x00000000
sqlwarn : (1) (2) (3) (4) (5) (6)
(7) (8) (9) (10) (11)
sqlstate:
2016-03-10-09.06.41.919710-360 E6921722A520 LEVEL: Info
PID : 389280 TID : 2572 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : VATAMC2
APPHDL : 0-7 APPID: *LOCAL.db2inst1.160310150641
AUTHID : MSTADMIN
EDUID : 2572 EDUNAME: db2agent (VATAMC2) 0
FUNCTION: DB2 UDB, database utilities, sqluxGetAvailableHeapPages, probe:888
DATA #1 :
Autonomic BAR - heap consumption.
Targetting (90%) - 112118 of 124576 pages.
2016-03-10-09.06.41.919864-360 E6922243A508 LEVEL: Info
PID : 389280 TID : 2572 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : VATAMC2
APPHDL : 0-7 APPID: *LOCAL.db2inst1.160310150641
AUTHID : MSTADMIN
EDUID : 2572 EDUNAME: db2agent (VATAMC2) 0
FUNCTION: DB2 UDB, database utilities, sqludTuneBuffers, probe:1139
DATA #1 :
Autonomic restore - tuning enabled.
Using buffer size = 4097, number = 4.
2016-03-10-09.06.42.976640-360 I6922752A522 LEVEL: Error
PID : 389280 TID : 13109 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000
EDUID : 13109 EDUNAME: db2bm.2572.0 (VATAMC2) 0
FUNCTION: DB2 UDB, buffer pool services, sqlbDMSDoContainerOp, probe:810
MESSAGE : ZRC=0x8402001E=-2080243682=SQLB_CONTAINER_NOT_ACCESSIBLE
"Container not accessible"
DATA #1 :
Error checking container 0 (/dsexp11/cont/tbs60/cont1) for tbsp 56. Rc = 870F0011
2016-03-10-09.06.43.094056-360 E6923275A555 LEVEL: Warning
PID : 389280 TID : 2572 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : VATAMC2
APPHDL : 0-7 APPID: *LOCAL.db2inst1.160310150641
AUTHID : MSTADMIN
EDUID : 2572 EDUNAME: db2agent (VATAMC2) 0
FUNCTION: DB2 UDB, database utilities, sqludBMResponse, probe:310
DATA #1 : Sqlcode, PD_TYPE_SQLCODE, 4 bytes
1277
DATA #2 : Hexdump, 4 bytes
0x07000000491947D8 : 0000 04FD ....
2016-03-10-09.06.43.094275-360 E6923831A1057 LEVEL: Warning
PID : 389280 TID : 2572 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : VATAMC2
APPHDL : 0-7 APPID: *LOCAL.db2inst1.160310150641
AUTHID : MSTADMIN
EDUID : 2572 EDUNAME: db2agent (VATAMC2) 0
FUNCTION: DB2 UDB, database utilities, sqludBMResponse, probe:310
MESSAGE : SQL1277W A redirected restore operation is being performed. Table
space configuration can now be viewed and table spaces that do not
use automatic storage can have their containers reconfigured.
DATA #1 : SQLCA, PD_DB2_TYPE_SQLCA, 136 bytes
sqlcaid : SQLCA sqlcabc: 136 sqlcode: 1277 sqlerrml: 0
sqlerrmc:
sqlerrp : sqludBMR
sqlerrd : (1) 0x00000000 (2) 0x00000000 (3) 0x00000000
(4) 0x00000000 (5) 0x00000000 (6) 0x00000000
sqlwarn : (1) (2) (3) (4) (5) (6)
(7) (8) (9) (10) (11)
sqlstate:
2016-03-10-09.06.43.883679-360 E6924889A567 LEVEL: Warning
PID : 389280 TID : 2572 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : VATAMC2
APPHDL : 0-7 APPID: *LOCAL.db2inst1.160310150641
AUTHID : MSTADMIN
EDUID : 2572 EDUNAME: db2agent (VATAMC2) 0
FUNCTION: DB2 UDB, database utilities, sqludCheckRedirectedStatus, probe:1065
DATA #1 : Sqlcode, PD_TYPE_SQLCODE, 4 bytes
1277
DATA #2 : Hexdump, 4 bytes
0x07000000491947D8 : 0000 04FD ....
2016-03-10-09.06.43.883857-360 E6925457A1069 LEVEL: Warning
PID : 389280 TID : 2572 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : VATAMC2
APPHDL : 0-7 APPID: *LOCAL.db2inst1.160310150641
AUTHID : MSTADMIN
EDUID : 2572 EDUNAME: db2agent (VATAMC2) 0
FUNCTION: DB2 UDB, database utilities, sqludCheckRedirectedStatus, probe:1065
MESSAGE : SQL1277W A redirected restore operation is being performed. Table
space configuration can now be viewed and table spaces that do not
use automatic storage can have their containers reconfigured.
DATA #1 : SQLCA, PD_DB2_TYPE_SQLCA, 136 bytes
sqlcaid : SQLCA sqlcabc: 136 sqlcode: 1277 sqlerrml: 0
sqlerrmc:
sqlerrp : sqludChe
sqlerrd : (1) 0x00000000 (2) 0x00000000 (3) 0x00000000
(4) 0x00000000 (5) 0x00000000 (6) 0x00000000
sqlwarn : (1) (2) (3) (4) (5) (6)
(7) (8) (9) (10) (11)
sqlstate:
2016-03-10-09.06.43.884062-360 I6926527A840 LEVEL: Warning
PID : 389280 TID : 2572 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : VATAMC2
APPHDL : 0-7 APPID: *LOCAL.db2inst1.160310150641
AUTHID : MSTADMIN
EDUID : 2572 EDUNAME: db2agent (VATAMC2) 0
FUNCTION: DB2 UDB, database utilities, sqludCheckRedirectedStatus, probe:1068
DATA #1 :
Insufficient space in tablespace DMS_TB6_TS; you must have at least 6402208
usable pages. (The "usable pages" total does not include pages used
internally by DB2, so the value specified on the SET TABLESPACE
CONTAINERS operation should be increased by one extent per container.
Based on the latest SET TABLESPACE CONTAINERS values specifed, the
tablespace should have 6402272 pages in total.)
2016-03-10-09.06.43.884818-360 I6927368A345 LEVEL: Severe
PID : 360608 TID : 1 PROC : db2bp
INSTANCE: db2inst1 NODE : 000 DB : VATAMC2
APPID : *LOCAL.db2inst1.160310150641
EDUID : 1
FUNCTION: DB2 UDB, base sys utilities, sqleStartUsingDatabase, probe:2
RETCODE : ZRC=0xFFFFFD10=-752
2016-03-10-09.06.43.885281-360 I6927714A376 LEVEL: Error
PID : 360608 TID : 1 PROC : db2bp
INSTANCE: db2inst1 NODE : 000 DB : VATAMC2
APPID : *LOCAL.db2inst1.160310150641
EDUID : 1
FUNCTION: DB2 UDB, data protection services, sqlpsrllf, probe:317
MESSAGE : Rollforward failed to connect to database. sqlcode -752.
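The RETCODE in the db2bp entry above, ZRC=0xFFFFFD10=-752, carries the same number as the SQL0752N error shown on the command line: the decimal form is simply the hexadecimal value read as a signed 32-bit integer. A minimal Python sketch of that conversion follows; the helper name to_signed32 is hypothetical.

```python
def to_signed32(value):
    """Interpret an unsigned 32-bit value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


if __name__ == "__main__":
    # Codes copied from the db2diag entries in this question.
    for code in (0xFFFFFD10, 0x8402001E):
        print(f"0x{code:08X} = {to_signed32(code)}")
```

The same conversion reproduces the other hex/decimal pairs in the log, such as 0x8402001E and -2080243682 for SQLB_CONTAINER_NOT_ACCESSIBLE.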

Transactional Replication: Column name or number of supplied values does not match table definition

I have set up transactional replication for some tables.
The master and slave databases are identical.
I ran this query on both and compared the results to make sure the table definitions are identical:
select * from sys.columns c
join sys.tables t on t.object_id = c.object_id
where t.name = 'customers'
In the Replication Monitor I can find this error:
Column name or number of supplied values does not match table definition.
If I check the details I get this:
Command attempted:
if @@trancount > 0 rollback tran
(Transaction sequence number: 0x0011775200000105007600000000, Command ID: 1)
So I ran this query against the distribution database to find the command that is failing:
sp_browsereplcmds @xact_seqno_start = '0x0011775200000105007600000000',
@xact_seqno_end = '0x0011775200000105007600000000'
This is the command (it spans two rows in that table):
{CALL [sp_MSins_dboCustomers] (0,'575',N'todelete','575',N'todelete',118594,118595,118596,N'10T 3% Sk 30T net.',0,'Deutschland',4,24399158193054E-314,4,24399158193054E-314,4,24399158193054E-314,4,24399158193054E-314,2,54639494915833E-313,'','','','','','TGW',N'Liefern LKW',NULL,NULL,0,0,6,79038653108887E-311,NULL,'',0,NULL,NULL,NULL,0,0,0,-1,-1,1900-01-01 00:00:00,0,1,{AEB3D911-36D1-4A8A-B713-6B2F2CCA1641},0,0,2,'de-AT',25,NULL,NULL,0,1,NULL,NULL,2014-03-07 08:57:45.727,-1,NULL,0,'','','','','','','','','','','','',
'','','','','','','','')}
This is what I have in my DB
TypeID CustomerID Name SiteID SiteName AddressID BillAddressID ShipAddressID Terms TaxExempt TaxSchedID TaxPercent TaxPercent1 TaxPercent2 TaxPercent3 TaxPercent4 TaxTitle TaxTitle1 TaxTitle2 TaxTitle3 TaxTitle4 LocationID ShipVia PackingType PackingNoteID CutoffDay UploadAction LeadTime ExpDays Notes SalesPersonID CreditLimit OpenOrders OrderValueScheduleID OAHidePrices DefaultAckType DefaultInvType DefaultPackType UploadEmployee UploadDateTime OAHideImages MfgCustomer CustomerGUID PricingMethod DefaultCustomer EngineeringUnitSetID CurrencyCulture FamilyGroupID InvoiceMinimum InvoiceSurcharge InvoiceGroup InvoiceCopies DeliveryMinimum DeliverySurcharge CreateDate EnteredBy LanguageCulture DropShip UserDef1 UserDef2 UserDef3 UserDef4 UserDef5 UserDef6 UserDef7 UserDef8 UserDef9 UserDef10 UserDef11 UserDef12 UserDef13 UserDef14 UserDef15 UserDef16 UserDef17 UserDef18 UserDef19 UserDef20
0 575 todelete 575 todelete 118594 118595 118596 10T 3% Sk 30T net. 0 Deutschland 0 0 0 0 0 TGW Liefern LKW NULL NULL 0 0 0 NULL 302 NULL NULL NULL 0 0 0 -1 -1 1900-01-01 00:00:00 0 1 AEB3D911-36D1-4A8A-B713-6B2F2CCA1641 0 0 2 de-AT 25 NULL NULL 0 1 NULL NULL 2014-03-07 08:57:45.727 -1 NULL 0 1 2 3 4 0 1 2 3 4
As you can see, the values for the TaxPercent fields (after "Deutschland") are 0 in my DB, but in the command they look really odd (4,24399158193054E-314).
The data type is "real".
Maybe this is not the issue, but it is the only odd thing I could find.
I found my problem.
In fact, 4,24399158193054E-314 is how the value 0 in a real column was rendered. The problem is that it was written with "," instead of "." as the decimal separator, so the procedure call had too many arguments.
What I did was change the statement delivery for inserts, updates, and deletes from "Call" to INSERT/UPDATE/DELETE statements.
I don't know why this is not selected by default, but now it works.
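The effect of the decimal separator on the argument count can be illustrated outside SQL Server. This Python sketch is only an illustration with made-up values; render_args and count_args are hypothetical helpers, not how the replication agent builds its commands.

```python
def render_args(values, decimal_sep="."):
    """Render values as a comma-separated argument list, as in a CALL command."""
    parts = []
    for v in values:
        if isinstance(v, float):
            # Swap the decimal point for the chosen separator.
            parts.append(repr(v).replace(".", decimal_sep))
        else:
            parts.append(str(v))
    return ",".join(parts)


def count_args(arg_list):
    """Naively count arguments by splitting on commas (no quoted commas here)."""
    return len(arg_list.split(","))


if __name__ == "__main__":
    values = [4, 0.5, 2.25, 7]  # four arguments, two of them floating point
    for sep in (".", ","):
        rendered = render_args(values, sep)
        print(rendered, "->", count_args(rendered), "arguments")
```

Each floating-point value written with a comma is read as two arguments, which is how the CALL ends up with more values than the procedure has parameters.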