Restore data from Postgres data files - postgresql

I have a system with a broken Postgres installation (a failed RAID is the cause) and no backups.
I'm trying to move the data to another computer running Postgres (and finally take a backup).
But every time I set up the data directory and start postgres I get this message:
GET FATAL: database files are incompatible with server
2012-08-15 19:58:38 GET DETAIL: The database cluster was initialized with BLCKSZ 16777216, but the server was compiled with BLCKSZ 8192.
2012-08-15 19:58:38 GET HINT: It looks like you need to recompile or initdb.
16777216 (2 to the power of 24) is a very strange number - far too big for a block size.
However, I can't change the default value of 8192 when compiling: playing with --with-blocksize= has no effect, and I can't find BLCKSZ in the header files.
Is there any way to extract the data?
This is the environment and the circumstances:
harddrive: RAID 1 with 3 SAS disks in array
OS: ubuntu 10.04.04 amd64
Postgres: 9.1 (installed via apt-get; we pointed the repository links at a newer Ubuntu release)
The system broke down - after some time we got
AAC: Host Adapter BLINK LED 0x56
AACO: Adapter kernel panic'd 56
(filesystem or hardware error)
Somehow we recovered the data directory. pg_controldata showed:
pg_control version number: 903
Catalog version number: 201105231
Database system identifier: 5714530593695276911
Database cluster state: shut down
pg_control last modified: Tue 15 Aug 2012 11:50:50
Latest checkpoint location: 1B595668/2000020
Prior checkpoint location: 0/0
Latest checkpoint's REDO location: 1B595668/2000020
Latest checkpoint's TimeLineID: 1
Latest checkpoint's NextXID: 0/4057946
Latest checkpoint's NextOID: 40960
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 670
Latest checkpoint's oldestXID's DB: 1344846103
Latest checkpoint's oldestActiveXID: 0
Time of latest checkpoint: Tue 15 Aug 2012 11:50:50
Minimum recovery ending location: 0/0
Backup start location: 0/0
Current wal_level setting: minimal
Current max_connections setting: 100
Current max_prepared_xacts setting: 0
Current max_locks_per_xact setting: 64
Maximum data alignment: 8
Database block size: 16777216
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 2387576020
Maximum size of a TOAST chunk: 0
Date/time type storage: floating-point numbers
Float4 argument passing: by reference
Float8 argument passing: by reference
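For reference, a minimal sketch of how such output can be obtained from a copied data directory - the binary path below is an assumption for an apt-get install of 9.1 on Ubuntu; run the pg_controldata that matches the cluster's major version:
$ /usr/lib/postgresql/9.1/bin/pg_controldata /path/to/copied/data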
First I tried to bring the database up on an Ubuntu server (plain serial (SATA) disk, Ubuntu 10.04 i386, Postgres 9.1) and got the same BLCKSZ exception as above.
That's why I deployed Ubuntu 10.04 amd64 with an English-locale Postgres 9.1 in a virtual machine (because in the previous step I got '?' instead of Russian characters in the error logs).
I got the same BLCKSZ exception.
After that I removed the apt-get Postgres packages and compiled Postgres from source as described in the docs: http://www.postgresql.org/docs/9.1/static/installation.html.
Playing with configure --with-blocksize=BLOCKSIZE had no effect - I got the same error.
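That is expected: --with-blocksize takes the block size in kilobytes (only small power-of-two values up to 32 are accepted), so a 16 MB block size cannot be produced this way; the chosen BLCKSZ ends up in src/include/pg_config.h, which configure generates. A minimal sketch (the value 32 is just an example):
$ ./configure --with-blocksize=32
$ grep BLCKSZ src/include/pg_config.h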
Sorry for the post.
The pg_control file had been corrupted by some of our manipulations with it.
So the cluster was successfully restored by pg_resetxlog with the initial data.
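For anyone in a similar situation, a minimal sketch of such a pg_resetxlog call - the flag values are taken from the pg_controldata output above purely as an illustration, not as the exact command that was used:
# -f forces the reset even if pg_control looks damaged;
# -x, -o, -m and -O set NextXID, NextOID, NextMultiXactId and NextMultiOffset.
$ /usr/lib/postgresql/9.1/bin/pg_resetxlog -f -x 4057946 -o 40960 -m 1 -O 0 /path/to/data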

A block size of 16 MB would be really weird, and since these two values also look completely bogus:
Maximum columns in an index: 2387576020
Maximum size of a TOAST chunk: 0
...you might want to question the integrity of this data before spending time on compiling postgres with a non-standard block size.
If you look at the sizes of the files corresponding to relations, are they multiples of 16 MB or of 8 kB?
If the database has some multi-gigabyte tables, what appears to be the cut-off size on disk (the size above which postgres splits the data into several files)? This should be equal to database block size * blocks per segment of large relation. On a default install, that's 8192 * 131072 bytes = 1 GB.
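A minimal sketch of such a check against the copied data directory (paths are assumptions; GNU find is assumed):
# Default segment cut-off: 8192-byte blocks * 131072 blocks per segment = 1 GB.
$ echo $((8192 * 131072))
1073741824
# List the largest relation file sizes and see whether they are multiples of 8192.
$ find /path/to/data/base -type f -name '[0-9]*' -printf '%s\n' | sort -n | tail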

See here for details on configuring kernel resources. Perhaps the default/current settings for this new OS won't allow the postmaster to start.
Here are details on the meaning and context of the BLCKSZ parameter. Was the system that failed running a 64-bit build of PostgreSQL while the new system is a 32-bit build? If possible, obtaining version information from the failed system's PostgreSQL could shed light on the problem. Let us know what version, build, and OS were used. Was it a custom build?
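If the binaries from the failed system are still readable, a minimal sketch of checking their bitness and build options (paths are assumptions):
# Is the postgres binary 32- or 64-bit, and how was it configured?
$ file /usr/lib/postgresql/9.1/bin/postgres
$ /usr/lib/postgresql/9.1/bin/pg_config --configure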

Related

Error when init database postgresql 10.10: PANIC: could not generate secret authorization token

I have a problem when running the command: sudo -su user_test ./pgsql/bin/initdb -D /example/folder
I have researched many sources on the internet but haven't found a solution.
I hope someone can help me. Thanks.
Environment:
initdb (PostgreSQL) 10.10
OS: uname -a Linux DL2100 3.10.38 #1 SMP Build-gitb1820a8 x86_64 GNU/Linux
selecting default max_connections … 100
selecting default shared_buffers … 128MB
selecting default timezone … Europe/Helsinki
selecting dynamic shared memory implementation … posix
creating configuration files … ok
running bootstrap script … 2020-11-03 11:52:56.303 EET [3928] DEBUG: invoking IpcMemoryCreate(size=148545536)
2020-11-03 11:52:56.303 EET [3928] DEBUG: mmap(148897792) with MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory
2020-11-03 11:52:56.315 EET [3928] DEBUG: SlruScanDirectory invoking callback on pg_notify/0000
2020-11-03 11:52:56.315 EET [3928] DEBUG: removing file "pg_notify/0000"
2020-11-03 11:52:56.316 EET [3928] DEBUG: dynamic shared memory system will support 288 segments
2020-11-03 11:52:56.316 EET [3928] DEBUG: created dynamic shared memory control segment 1852866650 (6928 bytes)
2020-11-03 11:52:56.319 EET [3928] PANIC: could not generate secret authorization token
Aborted
child process exited with exit code 134
The error is thrown in BootStrapXLOG in src/backend/access/transam/xlog.c:
    /*
     * Generate a random nonce. This is used for authentication requests that
     * will fail because the user does not exist. The nonce is used to create
     * a genuine-looking password challenge for the non-existent user, in lieu
     * of an actual stored password.
     */
    if (!pg_backend_random(mock_auth_nonce, MOCK_AUTH_NONCE_LEN))
        ereport(PANIC,
                (errcode(ERRCODE_INTERNAL_ERROR),
                 errmsg("could not generate secret authorization token")));
src/backend/utils/misc/backend_random.c says:
pg_backend_random() function fills a buffer with random bytes. Normally,
it is just a thin wrapper around pg_strong_random(), but when compiled
with --disable-strong-random, we provide a built-in implementation.
So it seems that PostgreSQL was built on a system that had a source for strong random numbers (OpenSSL or /dev/urandom, if you are not on Windows), but the facility is not working on your current system.
Try with the latest minor release of v10 (currently 10.15) – maybe a bug has been fixed.
Run pg_config --configure to check if PostgreSQL was built --with-openssl.
OpenSSL also uses /dev/urandom, so there is likely a problem with that source of random numbers; investigate there (a quick check is sketched after this list).
If all else fails, build PostgreSQL from source and configure it with
./configure --disable-strong-random ...
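A minimal sketch of checking the random source on a Linux system (standard device names assumed):
# Verify the kernel's random devices exist and actually return data.
$ ls -l /dev/urandom /dev/random
$ head -c 16 /dev/urandom | od -An -tx1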
It worked fine. Thank you very much, @Laurenz Albe.

Upgrading MongoDB from 4.2.9 to 4.4.0: Location13111: field not found, expected type date

I'm running a sharded MongoDB instance and as per the instructions, the config servers are a replica set. I'm unable to upgrade from v4.2.9 to 4.4.0. Per the upgrade instructions, I need to upgrade the config servers first, starting with a secondary. It already failed there. I shut down the secondary's instance, replaced the binaries, and restarted it. But it didn't start up again. The logs say the following (I removed the timestamps for clarity):
"msg":"The size storer reports that the oplog contains","attr":{"numRecords":53890848,"dataSize":13618131721}}
"msg":"Sampling the oplog to determine where to place markers for truncation"}
"msg":"Sampling from the oplog to determine where to place markers for truncation","attr":{"from":{"$timestamp":{"t":1494750837,"i":1}},"to":{"$timestamp":{"t":1598687615,"i":1}}}}
"msg":"Taking samples and assuming each oplog section contains","attr":{"numSamples":253,"containsNumRecords":2124552,"containsNumBytes":536870917}}
"msg":"User assertion","attr":{"error":"Location13111: field not found, expected type date","file":"src/mongo/bson/bsonelement.h","line":810}}
"msg":"WiredTiger record store oplog processing finished","attr":{"durationMillis":21}}
"msg":"~WiredTigerRecordStore for: {ns}","attr":{"ns":"local.oplog.rs"}}
"msg":"Invariant failure","attr":{"expr":"_oplogManagerCount > 0","file":"src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp","line":2467}}
"msg":"\n\n***aborting after invariant() failure\n\n"}
"msg":"Writing fatal message","attr":{"message":"Got signal: 6 (Aborted).\n"}}
"msg":"BACKTRACE: {bt}","attr":{"bt":{"backtrace":[{"a":"55C91A79E621","b":"55C917AE3000","o":"2CBB621","s":"_ZN5mongo18stack_trace_detail12_GLOBAL__N_119printStackTraceImplERKNS1_7OptionsEPNS_14StackTraceSinkE.constprop.606","s+":"1E1"},{"a":"55C91A79FCC9","b":"55C917AE3000","o":"2CBCCC9","s":"_ZN5mongo15printStackTraceEv","s+":"29"},{"a":"55C91A79D4B6","b":"55C917AE3000","o":"2CBA4B6","s":"_ZN5mongo12_GLOBAL__N_116abruptQuitActionEiP9siginfo_tPv","s+":"66"},{"a":"7FAF200070E0","b":"7FAF1FFF6000","o":"110E0","s":"funlockfile","s+":"50"},{"a":"7FAF1FC89FFF","b":"7FAF1FC57000","o":"32FFF","s":"gsignal","s+":"CF"},{"a":"7FAF1FC8B42A","b":"7FAF1FC57000","o":"3442A","s":"abort","s+":"16A"},{"a":"55C9189E6C5F","b":"55C917AE3000","o":"F03C5F","s":"_ZN5mongo15invariantFailedEPKcS1_j","s+":"12C"},{"a":"55C9186CE4B6","b":"55C917AE3000","o":"BEB4B6","s":"_ZN5mongo18WiredTigerKVEngine16haltOplogManagerEv.cold.1904","s+":"18"},{"a":"55C918B0711C","b":"55C917AE3000","o":"102411C","s":"_ZN5mongo21WiredTigerRecordStoreD1Ev","s+":"2FC"},{"a":"55C918B0D68B","b":"55C917AE3000","o":"102A68B","s":"_ZN5mongo29StandardWiredTigerRecordStoreD0Ev","s+":"1B"},{"a":"55C9186CEC5B","b":"55C917AE3000","o":"BEBC5B","s":"_ZN5mongo18WiredTigerKVEngine21getGroupedRecordStoreEPNS_16OperationContextENS_10StringDataES3_RKNS_17CollectionOptionsENS_8KVPrefixE.cold.1921","s+":"57"},{"a":"55C919378A76","b":"55C917AE3000","o":"1895A76","s":"_ZN5mongo17StorageEngineImpl15_initCollectionEPNS_16OperationContextENS_8RecordIdERKNS_15NamespaceStringEb","s+":"316"},{"a":"55C91937A7BD","b":"55C917AE3000","o":"18977BD","s":"_ZN5mongo17StorageEngineImpl11loadCatalogEPNS_16OperationContextE","s+":"90D"},{"a":"55C91937E3D0","b":"55C917AE3000","o":"189B3D0","s":"_ZN5mongo17StorageEngineImplC1EPNS_8KVEngineENS_20StorageEngineOptionsE","s+":"270"},{"a":"55C918AC8005","b":"55C917AE3000","o":"FE5005","s":"_ZNK5mongo12_GLOBAL__N_117WiredTigerFactory6createERKNS_19StorageGlobalParamsEPKNS_21StorageEngineLockFileE","s+":"1A5"},{"a":"55C9193889EE","b":"55C917AE3000","o":"18A59EE","s":"_ZN5mongo23initializeStorageEngineEPNS_14ServiceContextENS_22StorageEngineInitFlagsE","s+":"4CE"},{"a":"55C918A84587","b":"55C917AE3000","o":"FA1587","s":"_ZN5mongo12_GLOBAL__N_114_initAndListenEPNS_14ServiceContextEi.isra.1409","s+":"3F7"},{"a":"55C918A88610","b":"55C917AE3000","o":"FA5610","s":"_ZN5mongo12_GLOBAL__N_111mongoDbMainEiPPcS2_","s+":"650"},{"a":"55C9189F7849","b":"55C917AE3000","o":"F14849","s":"main","s+":"9"},{"a":"7FAF1FC772E1","b":"7FAF1FC57000","o":"202E1","s":"__libc_start_main","s+":"F1"},{"a":"55C918A83A3A","b":"55C917AE3000","o":"FA0A3A","s":"_start","s+":"2A"}],"processInfo":{"mongodbVersion":"4.4.0","gitVersion":"563487e100c4215e2dce98d0af2a6a5a2d67c5cf","compiledModules":[],"uname":{"sysname":"Linux","release":"4.9.0-7-amd64","version":"#1 SMP Debian 4.9.110-3+deb9u2 (2018-08-13)","machine":"x86_64"},"somap":[{"b":"55C917AE3000","elfType":3,"buildId":"D7866CAA7FFAC402345915854064CD98A5B60C27"},{"b":"7FAF1FFF6000","path":"/lib/x86_64-linux-gnu/libpthread.so.0","elfType":3,"buildId":"16D609487BCC4ACBAC29A4EAA2DDA0D2F56211EC"},{"b":"7FAF1FC57000","path":"/lib/x86_64-linux-gnu/libc.so.6","elfType":3,"buildId":"775143E680FF0CD4CD51CCE1CE8CA216E635A1D6"}]}}}}
It appears to boil down to the following error message:
Location13111: field not found, expected type date.
src/mongo/bson/bsonelement.h:810
Googling didn't turn up anything useful. I didn't proceed after that but had to revert to v4.2.9. (I wanted to keep the damage confined to the config secondary and not hit the same issue on the shards.)
I'm on Debian 9.13, and I tried both installing MongoDB 4.4.0 via apt and installing the Debian 9.2 binaries directly. The error was the same both times.
Any ideas what to do about this one?

PostgreSQL 11 Shared Memory Error: could not open shared memory segment "/PostgreSQL.XXXXXXXX": No such file or directory

Shared memory files are getting deleted after some time (~15 hours) in Postgres 11:
2019-07-09 08:46:41 CDT [] [6723]: [1-1] user=,db=,e=58P01 ERROR: could not open shared memory segment "/PostgreSQL.291691635": No such file or directory
2019-07-09 08:46:41 CDT [] [6722]: [1-1] user=,db=,e=58P01 ERROR: could not open shared memory segment "/PostgreSQL.291691635": No such file or directory
2019-07-09 08:46:41 CDT [10.40.0.204(60550)] [13880]: [1-1] user=user_name,db=db_name,e=58P01 ERROR: could not open shared memory segment "/PostgreSQL.291691635": No such file or directory
2019-07-09 08:46:41 CDT [10.40.0.204(60550)] [13880]: [2-1] user=user_name,db=db_name,e=58P01 CONTEXT: parallel worker
2019-07-09 08:46:41 CDT [10.40.0.204(60550)] [13880]: [3-1] user=user_name,db=db_name,e=58P01 STATEMENT: WITH overall_reviewed AS (SQL Query)
GCP VM Config
CPU: 4
RAM: 16 GB
OS: Ubuntu 18.04.1 LTS
Kernel shared memory settings:
kernel.shmmax=8589934592
kernel.shmall=2097152
postgresql.conf
max_connections = 500
shared_buffers = 4GB
effective_cache_size = 12GB
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.7
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 4194kB
min_wal_size = 1GB
max_wal_size = 2GB
max_worker_processes = 4
max_parallel_workers_per_gather = 2
max_parallel_workers = 4
During startup: no errors/warnings
After ~15 hours some of the shared memory files get deleted. I suspect some other process is deleting files in "/dev/shm"?
I'm not sure what the root cause is.
Setting dynamic_shared_memory_type = none in postgresql.conf did solve the issue.
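A minimal sketch of that workaround (config path and service name are assumptions for a typical Ubuntu package install; note that on PostgreSQL 11 this also disables features that need dynamic shared memory, such as parallel query):
# In /etc/postgresql/11/main/postgresql.conf (assumed path):
#   dynamic_shared_memory_type = none
# The parameter only takes effect after a restart.
$ sudo systemctl restart postgresql@11-main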
I got the same problem on Ubuntu 18.04 and PostgreSQL 11, and after some more research I found a solution for us. The error occurred when the backup user, which is the same user as the PG service user, logged into the system. The links below describe how the storage under /dev/shm gets deleted when that same user logs in to the system. So our solution was to change the following:
/etc/systemd/logind.conf
we added the line
RemoveIPC=no
and restarted the service
systemctl restart systemd-logind.service
Sources:
https://www.postgresql-archive.org/systemd-deletes-shared-memory-segment-in-dev-shm-Postgresql-NNNNNN-td5883507.html
https://superuser.com/questions/1117764/why-are-the-contents-of-dev-shm-is-being-removed-automatically
We had the same issue, and it turned out that someone had set the postgres user's UID to a value greater than 1000 (which means the postgres user was no longer a system account; a quick check is sketched below). And, as said here:
After hours of searching and reading, I found the culprit.
It's a setting for systemd. The /etc/systemd/logind.conf contains default configuration options, with each of them commented out.
The RemoveIPC option is set to yes by default. That option tells systemd to clean up interprocess communication (IPC) for "user accounts" who aren't logged in.
================================================
This does not affect "system accounts"
================================================
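A minimal sketch of checking whether the postgres account still counts as a system account (the 1000 boundary is the usual Debian/Ubuntu default):
# System accounts normally have a UID below UID_MIN (typically 1000).
$ id -u postgres
$ grep -E '^(UID_MIN|SYS_UID_MAX)' /etc/login.defs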
I met the same issue when I had two SQL Developer sessions open with the same user account; one of them was a remote session that I had completely forgotten to close.
I was running some aggregation operators like count(*) or max(...), and got the error in both sessions. The error is similar:
ERROR: could not open shared memory segment "/PostgreSQL.798235387":
No such file or directory Where: parallel worker
Solution? I killed the remote session.... XD
And life is peaceful and happy again :D

kernel - postgres segfault error 15 in libc-2.19.so

Yesterday we had a crash of PostgreSQL 9.5.14 running on Debian 8 (Linux xxxxxx 3.16.0-7-amd64 #1 SMP Debian 3.16.59-1 (2018-10-03) x86_64 GNU/Linux) - segmentation fault. The database closed all connections and reinitialized itself, staying ~1 minute in recovery mode.
PostgreSQL log:
2018-10-xx xx:xx:xx UTC [580-2] LOG: server process (PID 16461) was
terminated by signal 11: Segmentation fault
kern.log:
Oct xx xx:xx:xx xxxxxxxx kernel: [117977.301353] postgres[16461]:
segfault at 7efd3237db90 ip 00007efd3237db90 sp 00007ffd26826678 error
15 in libc-2.19.so[7efd322a2000+1a1000]
According to libc documentation (https://support.novell.com/docs/Tids/Solutions/10100304.html) error code 15 means:
NX_EDEADLK 15 resource deadlock would occur - which does not tell me much.
Could you please tell me if we can do something to avoid this problem in the future? This server is, of course, a production one.
All packages are currently up to date. Upgrading PG is unfortunately not an option. The server runs on Google Compute Engine.
error code 15 means: NX_EDEADLK 15
No, it doesn't mean that. This answer explains how to interpret 15 here.
It's bits 0, 1, 2, 3 set => protection fault, write access, user mode, use of reserved bit. Most likely your postgres process attempted to write through some wild pointer.
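A minimal sketch of decoding such a code (the bit meanings are the standard x86 page-fault flags):
# 15 = binary 1111: bit 0 = protection violation (page was present), bit 1 = write access,
# bit 2 = fault in user mode, bit 3 = reserved bit was set in a page-table entry.
$ echo 'obase=2; 15' | bc
1111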
if we can do something to avoid this problem in the future?
The only thing you can do is find the bug and fix it, or upgrade to a release of postgres where that bug is already fixed (and hope that no new ones were introduced).
To understand where the bug might be, you should check whether a core dump was produced (if not, do enable them). If you have the core, use gdb /path/to/postgres /path/to/core, and then run the where GDB command. That will give you the crash stack trace, which may allow you to find similar bug reports.
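A minimal sketch of that workflow (the binary path is an assumption for a Debian 8 / PostgreSQL 9.5 package install):
# Core dumps must be enabled for the process that runs the postmaster
# (e.g. ulimit -c unlimited in its startup environment, or the service's limit settings).
# Once a core file exists:
$ gdb /usr/lib/postgresql/9.5/bin/postgres /path/to/core
(gdb) where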

MongoDB Out of Memory

MongoDB is crashing. When I open the mongodb.log file, I get:
$ tail /var/log/mongodb/mongodb.log
Sat Jan 25 03:06:56.153 [initandlisten] connection accepted from 127.0.0.1:58492 #63331 (263 connections now open)
Sat Jan 25 03:07:02.694 out of memory, printing stack and exiting:
0xde05e1 0x6cf37e 0x12129fd 0xc490c3 0xc4404e 0xc44196 0xda4913 0xda53e4 0xe28e69 0x7f5cbaa19e9a 0x7f5cb9d2c3fd
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xde05e1]
/usr/bin/mongod(_ZN5mongo14my_new_handlerEv+0x3e) [0x6cf37e]
/usr/bin/mongod(_Znam+0x6d) [0x12129fd]
/usr/bin/mongod(_ZNK5mongo3Top8cloneMapERNS_9StringMapINS0_14CollectionDataEEE+0x83) [0xc490c3]
/usr/bin/mongod(_ZN5mongo9Snapshots12takeSnapshotEv+0x4e) [0xc4404e]
/usr/bin/mongod(_ZN5mongo14SnapshotThread3runEv+0x66) [0xc44196]
/usr/bin/mongod(_ZN5mongo13BackgroundJob7jobBodyEN5boost10shared_ptrINS0_9JobStatusEEE+0xc3) [0xda4913]
/usr/bin/mongod(_ZN5boost6detail11thread_dataINS_3_bi6bind_tIvNS_4_mfi3mf1IvN5mongo13BackgroundJobENS_10shared_ptrINS7_9JobStatusEEEEENS2_5list2INS2_5valueIPS7_EENSD_ISA_EEEEEEE3runEv+0x74) [0xda53e4]
/usr/bin/mongod() [0xe28e69]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a) [0x7f5cbaa19e9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f5cb9d2c3fd]
This question sounds similar: MongoDB: out of memory
But his problem was a ulimit issue. My memory settings are already unlimited.
Others had particular issues with .skip() or .limit() given unreasonably large values, but that's not happening here.
Anyone know what might be wrong?
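As a quick sanity check of the ulimit claim above, a minimal sketch of inspecting the limits the running mongod actually sees (Linux /proc is assumed):
# Show the effective resource limits of the running mongod process.
$ cat /proc/$(pidof mongod)/limits | grep -iE 'memory|address'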
The MongoDB docs recommend having enough swap space for MongoDB, despite it not being a requirement: http://docs.mongodb.org/manual/administration/production-notes/#ProductionNotes-Swap
I'm using Windows Azure hosting, and I discovered that their virtual servers don't have swap space by default:
$ sudo swapon -s
Filename Type Size Used Priority
(Azure defaults to no swap space: Part 1 & Part 2)
So I found a guide to creating a swap file: https://www.digitalocean.com/community/articles/how-to-add-swap-on-ubuntu-12-04
And it solved my problem!
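For reference, a minimal sketch of the kind of swap file setup such a guide describes (size and path are examples; see the sizing note below):
# Create and enable a swap file, then make it permanent across reboots.
$ sudo fallocate -l 8G /swapfile
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile
$ echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab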
Notes:
The guide says Ubuntu 12.04, but the same steps worked for me on 13.10.
You should use a swap file around half the size of your RAM, not the 512MB used in the guide.
I hope this helps others solve this problem.