postgresql initdb - directory not empty - postgresql

I am installing Postgres 8.4 on an Ubuntu Lucid server (we are staying on the Lucid LTS release on that server for now, so an upgrade is not possible yet, although we will start testing the system on Precise quite soon).
I have set up a separate partition for the /var/lib/postgresql/8.4/main directory with an ext4 file system. (Those of you who are really into Postgres installs know what happens next...) Since ext4 puts a lost+found directory in the root of every file system, Postgres refuses to use that directory as its data directory, because it is not empty to begin with:
initdb: directory "/var/lib/postgresql/8.4/main" exists but is not empty
If you want to create a new database system, either remove or empty
the directory "/var/lib/postgresql/8.4/main" or run initdb
with an argument other than "/var/lib/postgresql/8.4/main".
The easiest way to proceed would be to remove lost+found and recreate it after initdb has done its job. Could that cause any problems? Does lost+found have any special attributes or anything else that makes it impossible to recreate? And is it needed at any time other than when fsck finds something it needs to put there?
Another way would be to unmount the .../main/ file system, init the database, temporarily mount the .../main/ file system somewhere else, move things over there, and then mount it in place. That seems to be a bit more work than the "easiest way".
Or is there some way to make initdb ignore that the directory is not empty? (I couldn't see any command line switches for that.)
Could a lost+found directory inside the Postgres main directory cause any problems?
At the moment I am running the system on a virtual machine for testing, so it really doesn't matter if I mess up things, but before making this an official way of installing a mission-critical system, it would be nice to have some thoughts on this.

lost+found has preallocated blocks that make it easier for fsck to move data into it when the partition is short of free blocks. To create it, better use the mklost+found command rather than mkdir.
If you don't recreate it, fsck will do it anyway when it's needed.
But if it comes to the point where fsck finds corruption within PGDATA, I'd think about going for a backup rather than counting on lost+found to retrieve anything.
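The remove-and-recreate route from the question could be sketched roughly like this. It is only a sketch under assumptions: the initdb path follows the usual Debian/Ubuntu package layout and should be checked on your system, the run helper is made up for illustration, and by default nothing is executed, the commands are only printed. Run it as root with DRY_RUN=0 once the printed commands look right.

```shell
#!/bin/sh
# Sketch: remove lost+found, run initdb, then recreate lost+found.
# DRY_RUN=1 (the default) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

init_on_fresh_ext4() {
    pgdata="$1"   # mount point of the new file system
    initdb="$2"   # path to initdb (an assumption, check your package)

    # rmdir refuses to delete lost+found if fsck already put files there.
    run rmdir "$pgdata/lost+found"
    # initdb runs as postgres and needs to own the (now empty) directory.
    run chown postgres:postgres "$pgdata"
    run su postgres -c "$initdb -D $pgdata"
    # mklost+found takes no arguments and works in the current directory.
    run sh -c "cd '$pgdata' && mklost+found"
}

init_on_fresh_ext4 /var/lib/postgresql/8.4/main /usr/lib/postgresql/8.4/bin/initdb
```

Using rmdir rather than rm -rf is deliberate: if lost+found is not empty, the command fails instead of silently throwing away whatever fsck recovered.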

Related

Can I recover temporary file from IPython after a crash?

I was editing a long temporary file using IPython's %edit -p magic function when my computer crashed for hardware reasons. I have /tmp mounted in RAM, so the file cannot be retrieved from there. Can I recover the content of the temporary file? I spent a few hours writing that.
There is a way, but it's not very easy. You can search for deleted inodes and try to recover them. But the best approach is to be careful and not work in your /tmp directory.
Here is a good explanation:
Unix undelete / recover deleted files

What are important mongo data files for backup

If I want to back up a database by copying the raw files, which files do I need to copy? Only db-name.ns, db-name.0, db-name.1, ..., or the whole folder (local.ns, ..., journal)? I'm running a replica set. I understand the procedure of locking a hidden secondary node and then copying the files to a new location, but I'm wondering whether I need to copy the whole folder or just some files.
Thx
Simple answer: All of them. As obvious as it might sound. And here is why:
If you don't copy a namespaces file, your database will most likely not work.
When not copying all datafiles, some of your data is missing and your indices will point to void locations. The database in question might work (minus the data stored in the missing data file), but I would not bet on that – and since the data was important enough to create a backup in the first place, you don't want this to happen, do you?
Config, admin and local databases are vitally necessary for their respective features – and since you used the feature, you probably want to use it after a restore, too.
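One way to act on "all of them" is to check after a raw copy that nothing was left out. A minimal sketch, assuming both directories are reachable from the same machine; the function name is made up for illustration, and it only compares file names, not file contents.

```shell
#!/bin/sh
# List files that exist under the source data directory but are missing
# from the backup copy. Prints nothing when every file is present.
missing_from_backup() {
    src="$1"   # e.g. the mongod data directory
    dst="$2"   # e.g. the directory the backup was copied to

    # find runs inside $src so that the printed paths are relative.
    ( cd "$src" && find . -type f ) | while IFS= read -r f; do
        [ -e "$dst/$f" ] || printf '%s\n' "${f#./}"
    done
}
```

Called as missing_from_backup /data/db /home/mongobackup/backups, an empty result means the namespace files, data files and journal all made it into the copy.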
How do I backup all files?
The best solution save for MMS backup I have found so far is to create LVM snapshots of the filesystem the MongoDB data resides on. In order for this to work, the journal needs to be included. Usually, you don't need a dedicated backup node for this approach. It is a bit complicated to set up, though.
Preparing LVM backups
Let's assume you have your data in the default data directory /data/db and you have not changed any paths. Then you would mount a logical volume to /data/db and use this to hold the data. Assuming that you don't have anything like this, here is a step by step guide:
Create a logical volume big enough to hold your data. I will call that one /dev/VolGroup/LogVol1 from now on. Make sure that you only use about 80% of the available disk space in the volume group for creating the logical volume.
Create a filesystem on the logical volume. I prefer XFS, so we create an xfs filesystem on /dev/VolGroup/LogVol1:
mkfs.xfs /dev/VolGroup/LogVol1
Mount the newly created filesystem on /mnt
mount /dev/VolGroup/LogVol1 /mnt
Shut down mongod:
killall mongod
(Note that the upstart scripts sometimes have problems shutting down mongod, and this command gracefully stops mongod anyway).
Copy the datafiles from /data/db to /mnt by issuing
cp -a /data/db/* /mnt
Adjust your /etc/fstab so that the logical volume gets mounted on reboot:
# The noatime parameter increases io speed of mongod significantly
/dev/VolGroup/LogVol1 /data/db xfs defaults,noatime 0 1
Unmount the logical volume from its current mount point and remount it on the correct one:
cd && umount /mnt/ && mount /data/db
Restart mongod
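The first step of the list above comes without a command. A sketch of it, using lvcreate's percentage syntax to claim about 80% of the volume group: the names VolGroup and LogVol1 are taken from the device path used above, the run helper is a hypothetical dry-run wrapper, and nothing is changed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: create the logical volume and its XFS file system.
# DRY_RUN=1 (the default) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

create_data_volume() {
    vg="$1"              # volume group name
    lv="$2"              # logical volume name
    percent="${3:-80}"   # share of the volume group to allocate

    # -l N%VG sizes the volume as a percentage of the whole volume group;
    # the remainder stays free for snapshots later on.
    run lvcreate -l "${percent}%VG" -n "$lv" "$vg"
    run mkfs.xfs "/dev/$vg/$lv"
}

create_data_volume VolGroup LogVol1
```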
Creating a backup
Creating a backup now becomes as easy as
Create a snapshot:
lvcreate -l100%FREE -s -n mongo_backup /dev/VolGroup/LogVol1
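Mount the snapshot (on XFS the nouuid option is needed, because the snapshot carries the same filesystem UUID as the still-mounted origin):
mount -o nouuid /dev/VolGroup/mongo_backup /mnt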
Copy it somewhere. We need to do this because the snapshot can only be kept for as long as the changes to the data files do not exceed the space in the volume group that you left unallocated during preparation. For example, if you have a 100GB disk and you allocated 80GB for /dev/VolGroup/LogVol1, the snapshot size would be 20GB. While the changes on the filesystem since you took the snapshot stay below 20GB, everything runs fine. Once they exceed that, the snapshot overflows and becomes invalid, so anything not yet copied from it is lost. So you aren't in a hurry, but you should definitely move the data to an offsite location, an FTP server or whatever you deem appropriate. Note that compressing the datafiles can take quite long and you might run out of "change space" before finishing that. Personally, I like to have a slower HDD as a temporary place to store the backup and do all further operations on that HDD. So my copy command looks like
cp -a /mnt/* /home/mongobackup/backups
when the HDD is mounted on /home/mongobackup.
Destroy the snapshot:
umount /mnt && lvremove /dev/VolGroup/mongo_backup
The space allocated for the snapshot is released and the restrictions to the amount of changes to the filesystem are removed.
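Put together, the backup cycle above might look like the following sketch. The run helper and the function name are made up for illustration, the nouuid mount option assumes the XFS file system chosen earlier, and the default dry-run mode only prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch: snapshot, mount, copy, unmount, remove.
# DRY_RUN=1 (the default) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

backup_via_snapshot() {
    origin="$1"   # e.g. /dev/VolGroup/LogVol1
    dest="$2"     # existing directory that receives the copy
    snap_name="mongo_backup"
    snap_dev="$(dirname "$origin")/$snap_name"
    mnt="/mnt"

    run lvcreate -l100%FREE -s -n "$snap_name" "$origin" || return 1

    # nouuid: an XFS snapshot has the same UUID as the mounted origin.
    if run mount -o nouuid "$snap_dev" "$mnt"; then
        # "$mnt/." also picks up dotfiles, which "$mnt"/* would skip.
        run cp -a "$mnt/." "$dest"
        status=$?
        run umount "$mnt"
    else
        status=1
    fi

    # Always drop the snapshot so its space is released again.
    run lvremove -f "$snap_dev"
    return "$status"
}

backup_via_snapshot /dev/VolGroup/LogVol1 /home/mongobackup/backups
```

The snapshot is removed even when the mount or the copy fails, so a failed run does not leave a snapshot behind that slowly fills up.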
The whole data folder, plus wherever you keep your logs and journal.
The best solution for backing up data on MongoDB would be to use the MongoDB Monitoring Service (MMS). All other solutions, including copying files manually, mongodump and mongoexport, are way behind MMS.

Postgres 9.2 pg_largeobject tablespace

I am currently moving some data around and I am running into an interesting issue.
I have a CentOS server (6.3) up and running with Postgres 9.2 on a server with limited built in disk space; however, I do have a large amount of extremely reliable external network disk space available.
I have set the tablespace for my database to a directory on this storage device and everything seems to be working well, until...
I realized that I have a large amount of BLOB data that needs to be stored in pg_largeobject.
I have been googling how to set the tablespace of pg_largeobject and I did find some results, but they are horribly outdated.
I did find one article that looks promising, but I'm hesitant because the thread also references that things will/should have changed.
I have two questions...
In an ideal world, I would like to move all of postgres (including pg_largeobject) onto this external storage for ease of maintenance. Is this possible?
If not, how can I get pg_largeobject to use my network storage?
As you alluded to, your best bet is to move the entirety of PostgreSQL onto the remote storage, assuming that storage uses a reliable network block device like iSCSI, ATAoE or NBD. I wouldn't recommend running Pg on NFS, and running it on CIFS/SMBFS just won't work.
Just:
Make a backup
Take a note of the output of SHOW data_directory; in psql
Shut PostgreSQL down
Move the data directory (the folder containing pg_xlog, pg_clog, etc) to the remote storage
Adjust the permissions on the parent directories of the datadir's new location so that the postgres user, the postgres group, or the others permission block has at least execute permission on each parent directory and can traverse the tree.
Adjust your system startup scripts to set the new location as the PostgreSQL datadir or symlink the old datadir location (output by SHOW data_directory) to the new location.
Start PostgreSQL
Unfortunately, different systems and packages find the datadir different ways. Debian/Ubuntu use pg_wrapper, for example.
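The permissions step is the one that is easiest to get wrong, so here is a small sketch for inspecting it. Both function names are made up for illustration; they simply list the new data directory and every parent up to the root, so a missing execute bit along the way is easy to spot.

```shell
#!/bin/sh
# Print a path followed by each of its parent directories, one per line.
path_chain() {
    p="$1"
    while :; do
        printf '%s\n' "$p"
        # Stop at the root (or at "." for a relative path).
        if [ "$p" = "/" ] || [ "$p" = "." ]; then
            break
        fi
        p=$(dirname "$p")
    done
}

# Show owner, group and mode of every directory on the way down.
show_traversal_perms() {
    path_chain "$1" | while IFS= read -r d; do
        ls -ld "$d"
    done
}
```

Calling show_traversal_perms with the new data directory path prints one ls -ld line per level. Where util-linux is installed, namei -l on the same path gives a similar listing.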

What does pg_resetxlog do? And how does it work?

I have looked at the postgres documentation and the synopsis below is given:
pg_resetxlog [-f] [-n] [-o oid] [-x xid] [-e xid_epoch] [-m mxid] [-O mxoff] [-l timelineid,fileid,seg] datadir
But at no point in the documentation do they explain what the datadir is.
Is it %postgres-path%/9.0/data, or could it be %postgres-path%/9.0/data/pg_xlog?
Also, if I want to change my xlog directory, can I simply move the items in my current pg_xlog directory and run the command to point to another directory? (Assume my current pg_xlog directory is /data1/postgres/data/pg_xlog and the directory I want the logs to go to is /data2/pg_xlog.)
Would the following command achieve what I've just described?
mv /data1/postgres/data/pg_xlog /data2/pg_xlog
pg_resetxlog /data2
pg_resetxlog is a tool of last resort for getting your database running again after:
You deleted files you shouldn't have from pg_xlog;
You restored a file-system-level backup that omitted the pg_xlog directory due to a backup system configuration mistake (this happens more than you'd think; people think "it has log in the name so it must be unimportant; I'll leave it out of the backups");
File-system corruption due to a hardware fault or hard drive failure damaged your data directory; or, potentially, even
A PostgreSQL bug or operating system bug damaged the write-ahead logs (exceedingly rare).
As the manual says:
pg_resetxlog clears the write-ahead log (WAL) [...]. This
function is sometimes needed if these files have become corrupted. It
should be used only as a last resort, when the server will not start
due to such corruption.
Do not run pg_resetxlog unless you know exactly what you are doing and why. If you are unsure, ask on the pgsql-general mailing list or on https://dba.stackexchange.com/.
pg_resetxlog may corrupt your database, as the documentation warns. If you have to use it, you should REINDEX, dump your database(s), re-initdb, and reload your databases. Do not just continue using the damaged cluster. As per the documentation:
After running this command, it should be possible to start the server,
but bear in mind that the database might contain inconsistent data due
to partially-committed transactions. You should immediately dump your
data, run initdb, and reload. After reload, check for inconsistencies
and repair as needed.
If you simply want to move your write-ahead log directory to another location, you should:
Stop PostgreSQL
Move pg_xlog
Add a symbolic link from the old location to the new location
Start PostgreSQL
Or, as the documentation says:
It is advantageous if the log is located on a different disk from the
main database files. This can be achieved by moving the pg_xlog
directory to another location (while the server is shut down, of
course) and creating a symbolic link from the original location in the
main data directory to the new location.
If PostgreSQL fails to start, you've done something wrong. Do not use pg_resetxlog to "fix" it. Undo your changes and work out what you did wrong.
Move the contents of your pg_xlog directory to the desired location like '/home/foo/pg_xlog'
mv pg_xlog/* /home/foo/pg_xlog
Delete the pg_xlog directory
rm -rf pg_xlog
Create a soft-link of pg_xlog
ln -s /home/foo/pg_xlog pg_xlog
Verify the link
ls -lrt pg_xlog
Note: pg_resetxlog is not the right tool for moving pg_xlog. Please read
http://www.postgresql.org/docs/9.2/static/app-pgresetxlog.html
The data directory corresponds to the data_directory entry in the postgresql.conf file, or the PGDATA environment variable, and it can also be queried live in SQL with the SHOW data_directory statement. It does not point to the pg_xlog directory, but one level above.
To change the location of the WAL files, the PG server must be shut down, the pg_xlog directory and its contents moved to the new location, a symbolic link should be created from the old location to the new location, and the server restarted. pg_resetxlog should not be used for this, as it may suppress the latest transactions (this tool is typically used in crash recovery situations when all else fails).
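The move-and-symlink procedure can be wrapped in a few sanity checks. The following is a sketch, not an official tool: the function name is made up, it assumes it is run as the user that owns the data directory, and it treats an existing postmaster.pid as a sign that the server is still running.

```shell
#!/bin/sh
# Move pg_xlog out of the data directory and leave a symlink behind.
relocate_xlog() {
    datadir="$1"   # directory that contains PG_VERSION and pg_xlog
    newloc="$2"    # absolute path of the new WAL directory, must not exist

    [ -f "$datadir/PG_VERSION" ] || {
        echo "not a data directory: $datadir" >&2; return 1; }
    # postmaster.pid is present while the server is running.
    [ ! -e "$datadir/postmaster.pid" ] || {
        echo "server appears to be running, stop it first" >&2; return 1; }
    [ -d "$datadir/pg_xlog" ] && [ ! -L "$datadir/pg_xlog" ] || {
        echo "pg_xlog is missing or already a symlink" >&2; return 1; }
    [ ! -e "$newloc" ] || {
        echo "target already exists: $newloc" >&2; return 1; }

    # Move the directory as a whole, then link the old name to it.
    mv "$datadir/pg_xlog" "$newloc" &&
    ln -s "$newloc" "$datadir/pg_xlog"
}
```

For the paths in the question this would be relocate_xlog /data1/postgres/data /data2/pg_xlog, with no pg_resetxlog involved at any point.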
You should never touch the WAL files manually, that much is clear.
If there are leftover files in the pg_xlog directory, that is, files ending in .done in the archive_status subfolder that are waiting to be cleaned up, the cleanup can be triggered with the SQL command
CHECKPOINT;
which forces a transaction log checkpoint, and that includes cleaning up WAL segment files that are no longer needed.
See the documentation for 9.3; the command exists in all current versions of PostgreSQL.

Moving MongoDB's data folder?

I have 2 computers in different places (so it's impossible to use the same wifi network).
One contains about 50GBs of data (MongoDB files) that I want to move to the second one which has much more computation power for analysis. But how can I make MongoDB on the second machine recognize that folder?
When you start the mongod process, you pass it the argument --dbpath /directory, which is how it knows where the data folder is.
All you need to do is:
Stop the mongod process on the old computer and wait until it exits.
Copy the entire /data/db directory to the new computer.
Start the mongod process on the new computer, giving it the --dbpath /newdirectory argument.
The mongod on the new machine will use the folder you indicate with --dbpath. There is no need to "recognize" as there is nothing machine specific in that folder, it's just data.
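Since the two machines do not share a network, one way to carry out the copy step is to pack the data directory into a single archive and move that on a disk or through an upload. A minimal sketch, with made-up function names, assuming mongod has already been stopped and that the archive is written somewhere outside the data directory:

```shell
#!/bin/sh
# Pack a (stopped) mongod data directory into one compressed archive.
pack_dbpath() {
    dbpath="$1"    # e.g. /data/db on the old computer
    archive="$2"   # absolute path of the archive, outside of $dbpath

    # -C switches into the data directory, so the archive holds
    # relative paths and can be unpacked anywhere.
    tar -czf "$archive" -C "$dbpath" .
}

# Unpack the archive into the directory that mongod will be pointed at.
unpack_dbpath() {
    archive="$1"
    newpath="$2"   # e.g. /newdirectory on the new computer

    mkdir -p "$newpath" && tar -xzf "$archive" -C "$newpath"
}
```

On the old computer this would be pack_dbpath /data/db /tmp/mongo-data.tar.gz, and on the new one unpack_dbpath /tmp/mongo-data.tar.gz /newdirectory followed by mongod --dbpath /newdirectory.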
I did this myself recently, and I wanted to provide some extra considerations to be aware of, in case readers (like me) run into issues.
The following information is specific to *nix systems, but it may be applicable with very heavy modification to Windows.
If the source data is in a mongo server that you can still run (preferred)
Look into and make use of mongodump and mongorestore. That is probably safer, and it's the official way to migrate your database.
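In its simplest form that route is two commands, one per machine. The sketch below only prints them by default; the run helper, the function names and the dump path are placeholders for illustration, and both tools talk to a mongod on localhost unless told otherwise.

```shell
#!/bin/sh
# Sketch: dump on the old machine, restore on the new one.
# DRY_RUN=1 (the default) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On the old machine: write a dump of all databases into a directory.
dump_on_old_machine() {
    run mongodump --out "$1"
}

# On the new machine, after copying the dump directory over: load it.
restore_on_new_machine() {
    run mongorestore "$1"
}

dump_on_old_machine /path/to/dump
restore_on_new_machine /path/to/dump
```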
If you never made a dump and can't anymore
Yes, the data directory can be directly copied; however, you also need to make sure that the mongodb user has complete access to the directory after you copy it.
My steps are as follows. On the machine you want to transfer an old database to:
Edit /etc/mongod.conf and change the dbPath field to the desired location.
Use the following script as a reference, or tailor it and run it on your system, at your own risk.
I do not guarantee this works on every system --> please verify it manually.
I also cannot guarantee it works perfectly in every case.
WARNING: will delete everything in the target data directory you specify.
I can say, however, that it worked on my system, and that it passes shellcheck.
The important part is simply copying over the old database directory, and giving mongodb access to it through chown.
#!/bin/bash
TARGET_DATA_DIRECTORY=/path/to/target/data/directory # modify this
SOURCE_DATA_DIRECTORY=/path/to/old/data/directory # modify this too
echo "shutting down mongod..."
sudo systemctl stop mongod
# Only run the destructive rm -rf when the variable is not empty.
if test "$TARGET_DATA_DIRECTORY"; then
    echo "removing existing data directory..."
    sudo rm -rf "$TARGET_DATA_DIRECTORY"
fi
echo "copying backed up data directory..."
sudo cp -r "$SOURCE_DATA_DIRECTORY" "$TARGET_DATA_DIRECTORY"
# Give the mongodb user (the account mongod runs as here) ownership of the copy.
sudo chown -R mongodb "$TARGET_DATA_DIRECTORY"
echo "starting mongod back up..."
sudo systemctl start mongod
sudo systemctl status mongod # for verification
On Windows this is quite easy: just move the data folder to the target location,
open cmd, and run
"C:\your\mongodb\bin-path\mongod.exe" --dbpath="c:\what\ever\path\data\db"
On Windows, if you only need to configure a new path for the data, create a new folder, for example D:\dev\mongoDb-data, open C:\Program Files\MongoDB\Server\6.0\bin\mongod.cfg, and change the data path there (the dbPath field).
Then restart your PC and check the folder: it should contain new files and folders with data.
Maybe what you didn't do was export or dump the database.
Databases aren't portable, so they must be exported or written out as a dump file.
Here is another question where the answer is explained further.