Copy data between two servers that are not connected - postgresql

I have two PostgreSQL servers on two different computers that are not connected.
Each server holds a database with the same schema.
I would like one of the servers to be the master server: this server should store all data that is inserted into both databases.
For that I would like to regularly (on a daily basis, for example) import data from one database into the second database.
That implies I should be able to:
"dump" into file(s) all data that has been stored in the first database since a given date
import the exported data into the second database
I haven't seen any time/date option in the pg_dump/pg_restore commands.
So how could I do that?
NB: data is inserted into the database and never updated.

I haven't seen any time/date option in pg_dump/pg_restore commands.
There isn't any, and you can't do it that way. You'd have to dump and restore the whole database.
Alternatives are:
Use WAL based replication. Have the master write WAL to an archive location, using an archive_command. When you want to sync, copy all the new WAL from the master to the replica, which must be made as a pg_basebackup of the master and must have a suitable recovery.conf. The replica will replay the master's WAL to get all the master's recent changes.
Use a custom trigger-based system to record all changes to log tables. COPY those log tables to external files, then copy them to the replica. Use a custom script to apply the log table change records to the main tables.
Add a timestamp column to all your tables. Keep a record of when you last synced changes. Do a \copy (SELECT * FROM sometable WHERE insert_timestamp > 'last_sync_timestamp') TO 'somefile' for each table, probably scripted. Copy the files to the secondary server. There, automate the process of doing a \copy sometable FROM 'somefile' to load the changes from the export files. A sketch of this follows the list.
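For example, one sync cycle for a single table might look like this in psql (table, column, and file names are placeholders):
-- on the source server: export rows added since the last sync
\copy (SELECT * FROM sometable WHERE insert_timestamp > '2016-01-01 00:00:00') TO 'sometable_delta.csv' WITH (FORMAT csv)
-- transfer sometable_delta.csv to the secondary server, then load it there
\copy sometable FROM 'sometable_delta.csv' WITH (FORMAT csv)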
In your situation I'd probably do the WAL-based replication. It does mean that the secondary database must be absolutely read-only, though.
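For the WAL-based option, the archiving side boils down to a few settings on the master; a minimal sketch, where the /archive directory is a placeholder (older releases use wal_level = hot_standby):
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /archive/%f && cp %p /archive/%f'
The replica itself is created with something like pg_basebackup -h master -U replication_user -D /var/lib/postgresql/data -P, and then pointed at the copied WAL via restore_command in its recovery configuration.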

Related

pg_dump blocked by another transaction using TRANSACTION_REPEATABLE_READ

I have a Java program exporting selected data from several tables of a Postgres database. The logic is:
set the isolation level to TRANSACTION_REPEATABLE_READ
select data from table 1 and export them to an external file
select data from table 2 and export them
commit
While the program is running, if I run pg_dump to back up the database, pg_dump is blocked until my program finishes.
I am using REPEATABLE_READ to make sure my exported data is consistent, i.e. not affected by other concurrent transactions. Any suggestion how to get a consistent set of data without blocking pg_dump?
Thanks
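For reference, the export logic described above corresponds roughly to the following in plain psql terms (table and file names are placeholders):
BEGIN ISOLATION LEVEL REPEATABLE READ;
\copy (SELECT * FROM table1) TO 'table1.csv' WITH (FORMAT csv)
\copy (SELECT * FROM table2) TO 'table2.csv' WITH (FORMAT csv)
COMMIT;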

Postgres and multiple locations of data storage

Postgres stores its data in the default location on my C: drive. I would like to restore a backup into a second database, but access it via the same Postgres server instance. The issue is that the database is too big to be restored on the same C: drive. Would it be possible to tell Postgres that the second database should be restored and placed in another location/drive (while the first one stays where it is)? Like database1 on my C: drive and database2 on my D: drive?
Otherwise, the second-best solution would be to install two separate Postgres instances, but that also seems like overkill.
That should be entirely achievable if you've used the Postgres pg_dump command.
The pg_dump command does not create the database, so you create it yourself first. Use CREATE TABLESPACE to specify the location.
CREATE TABLESPACE secondspace LOCATION 'D:\postgresdata';
CREATE DATABASE seconddb TABLESPACE secondspace;
This creates an empty database on the D: drive.
Then the standard restore from a pg_dump should work:
psql seconddb < dumpfile
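If the dump was made in pg_dump's custom format (pg_dump -Fc) rather than plain SQL, restore it with pg_restore instead:
pg_restore -d seconddb dumpfile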
Replication
Sounds like you need database replication.
There are several ways to do this with Postgres, one built-in, and other approaches using add-on libraries.
Built-in replication feature
The built-in replication feature is likely to suit your needs. See the manual. In this approach, you have an instance of Postgres running on your primary server, doing reads and writes of your data. On a second server, an entirely separate computer, you run another instance of Postgres known as the replica. You first set up the replica by doing a full backup of your database on the first server and restoring it to the second server.
Next you configure the replication feature. The replica needs to know it is playing the role of a replica rather than a regular database server. And the primary server needs to know the replica exists, so that every database change, every insert, modification, and deletion, can be communicated.
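A rough sketch of that initial setup (host name, user, and paths are placeholders, and details vary by Postgres version): on the primary, create a role allowed to replicate, and permit it in pg_hba.conf:
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secret';
Then, on the second server, clone the primary and let pg_basebackup write the replica's recovery settings for you:
pg_basebackup -h primary.example.com -U replicator -D /var/lib/postgresql/data -P -R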
WAL
This communication happens via WAL files.
With the Write-Ahead Log (WAL) feature, Postgres writes all changes to the WAL first, and only after that write is complete does it write to the actual database files. In case of a crash, power outage, or other failure, the database can detect upon restarting that a transaction was left incomplete. An incomplete transaction is rolled back, and the database server can try again by following the "To-Do" list of work recorded in the WAL.
Every so often the current WAL file is closed, and a new WAL file is created to take over the work. With replication enabled, each closed WAL file is copied to the replica. The replica then replays that WAL file, following the same "To-Do" list of changes, so all changes are made to the replica database exactly as they were made to the primary. Your replica is an exact match of the primary, except for a slight lag in time: the replica is always just one WAL file behind the progress of the primary.
In times of trouble, the replica serves as a warm stand-by. You can shut down the primary, then tell the replica that it is now the primary. You can even configure the replica as a hot stand-by, meaning it will automatically take over when the primary seems to have failed. There are pros and cons to hot stand-by.
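Promoting the stand-by is a single command run on the replica, for example (the data directory path is a placeholder):
pg_ctl promote -D /var/lib/postgresql/data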
Offload read-only queries
As a bonus feature, the replica can be used for read-only queries. If your database is heavily used, you can offload some of the work burden from your primary to the replica. Any queries that do not require the absolute latest information can be shifted by connecting to the replica rather than the original. For example, a quarterly sales report likely does not need the latest data stored in the active WAL file that has not yet arrived on the replica.
Physical replication means all databases are copied
Caveat: This built-in replication feature is physical replication. This means all the changes to the entire Postgres installation (formally known as a cluster, not to be confused with a hardware cluster) are copied to the replica. If you use one Postgres server to serve multiple databases, all those databases must be replicated – you cannot pick and choose which get copied over. There may be alternative replication features in the future related to logical replication.
More to learn
I am being brief here. The topics of replication, high-availability, and disaster-recovery are broad and complex, too much for an Answer on Stack Overflow.
Tip: This kind of Question might have been better asked on the sister site, DBA.StackExchange.com.

How to restore database in PostgreSQL with pgadmin3?

I'm using pgAdmin to restore a PostgreSQL database. To restore the database I need to delete, drop, and remake it. How can I restore the database without deleting and remaking it?
This cannot be done in pgAdmin or with any database tools. Regular backup files cannot be restored without deleting the data first, because they consist of normal COPY statements that will fail if you already have rows in the database (primary keys collide, etc.).
For a simple way to get back to an earlier snapshot in a testing environment take a look at PostgreSQL documentation - 24.2. File System Level Backup:
For backup:
Shut down your database
Copy all the files from your data directory
For restore:
Shut down your database
Replace your data directory with the backup directory
Note:
the size of the data might be significantly larger than with a regular backup, especially if you have a lot of indexes
this is a server-wide backup, so you can't do it for individual databases
don't attempt to use it on a different version of PostgreSQL
this really deletes the data too – by replacing it with the backup
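On a Linux server, that procedure might look roughly like this (paths are placeholders):
# backup: stop the server, archive the data directory, start it again
pg_ctl stop -D /var/lib/postgresql/data
tar -czf pgdata-backup.tar.gz /var/lib/postgresql/data
pg_ctl start -D /var/lib/postgresql/data
# restore: stop the server, swap in the archived data directory, start it again
pg_ctl stop -D /var/lib/postgresql/data
rm -rf /var/lib/postgresql/data
tar -xzf pgdata-backup.tar.gz -C /
pg_ctl start -D /var/lib/postgresql/data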
Also, with regular backups you don't have to do a DROP TABLE if you do a data-only restore, with pg_restore --data-only for example. You still have to delete the existing data first, though.

Backup specific tables in AWS RDS Postgres Instance

I have two databases on Amazon RDS, both Postgres: Database 1 and Database 2.
I need to restore an instance from a snapshot of Database 1 for my Staging environment. (Database 2 is my current Staging DB).
However, I want the data from a few of the tables in Database 2 to overwrite the tables in the newly restored snapshot. What is the best way to do this?
When restoring RDS from a Snapshot, a new database instance is created. If you only wish to copy a portion of the snapshot:
Restore the snapshot to a new (temporary) database
Connect to the new database and dump the desired tables using pg_dump
Connect to your staging server and restore the tables using pg_restore (most probably deleting any matching existing tables first)
Delete the temporary database
In its plain-text format, pg_dump actually outputs the SQL commands that are then used to recreate tables and restore data. Look at the content of a dump to understand how the restore process actually works.
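A sketch of steps 2 and 3, assuming the temporary instance is reachable as temp-host and you want table1 and table2 (all names are placeholders):
pg_dump -h temp-host -U myuser -Fc -t table1 -t table2 -f tables.dump mydb
pg_restore -h staging-host -U myuser -d mydb --clean tables.dump
The --clean flag drops each matching table before recreating it, which covers the "deleting any matching existing tables" step.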
I hope this still helps someone else.
My team and I faced a similar issue. We also had two Postgres databases, and we also just needed to back up some tables from db1 to db2.
What we did was use an AWS Lambda function written in Python that connected to both databases and checked whether db1.table1 had the same data as db2.table1; if not, the Lambda function wrote the missing data from db1.table1 into db2.table1. We chose Lambda because we wanted to automate the process, since the main database (let's say db1) is constantly being updated. In addition, it allowed us to back up only our desired tables (let's say 3 tables out of 10) instead of the whole database.
Note: You may want to do these writes using temporary tables, to avoid issues with any constraints you have on your tables.
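If you would rather stay inside Postgres, the same copy-the-missing-rows idea can be sketched with postgres_fdw (which RDS supports); here db1's table is exposed as a foreign table, and the column list and the id primary key are assumptions:
CREATE EXTENSION IF NOT EXISTS postgres_fdw;
CREATE SERVER db1_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'db1-host', dbname 'db1');
CREATE USER MAPPING FOR CURRENT_USER SERVER db1_server OPTIONS (user 'myuser', password 'secret');
CREATE FOREIGN TABLE db1_table1 (id int, payload text) SERVER db1_server OPTIONS (table_name 'table1');
-- insert only the rows db2 is missing
INSERT INTO table1 SELECT r.* FROM db1_table1 r WHERE NOT EXISTS (SELECT 1 FROM table1 l WHERE l.id = r.id);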

deleting database in oracle 10g query

I want to drop a database in Oracle 10g.
I saw the command
DROP DATABASE
How do I run this query and drop the database?
When I tried this query, it reported an "invalid query" error.
Oracle documentation states:
Dropping a database involves removing its datafiles, redo log files, control files, and initialization parameter files. The DROP DATABASE statement deletes all control files and all other database files listed in the control file. To use the DROP DATABASE statement successfully, all of the following conditions must apply:
The database must be mounted and closed.
The database must be mounted exclusively--not in shared mode.
The database must be mounted as RESTRICTED.
An example of this statement is:
DROP DATABASE;
The DROP DATABASE statement has no effect on archived log files, nor does it have any effect on copies or backups of the database. It is best to use RMAN to delete such files. If the database is on raw disks, the actual raw disk special files are not deleted.
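Putting those conditions together, the usual SQL*Plus sequence (connected AS SYSDBA) is roughly:
SHUTDOWN IMMEDIATE;
STARTUP MOUNT EXCLUSIVE RESTRICT;
DROP DATABASE;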
If you used the Database Configuration Assistant to create your database, you can use that tool to delete (drop) your database and remove the files.