I am currently working with a Managed Instance from TimescaleDB and have incoming data. Now, I am setting up another instance, but this time it will be self-hosted and managed by me.
As such, I would like to set up some sort of replication so that the data coming into the managed instance is accessible in the self-hosted one (a delay is fine; it does not have to be live). I've looked into setting up replication with WAL streaming, but I've run into an issue.
Most replication workflows require changes to the postgresql.conf and pg_hba.conf files, which I cannot access on a managed instance. TimescaleDB support also says that modifying them is not possible.
Is there a way to achieve this, without access to those files?
I have an Azure-managed PostgreSQL database.
I want to create a logical replica of it on GCP (Google-managed, if possible).
At Azure, I've set the Azure replication support to Logical. However, this just seems to allow me to create replicas inside Azure. What I want is to create a replica in GCP.
If this were self-managed rather than Azure-managed, I would be able to create a tunnel from Azure to GCP and then replicate by copying the WAL.
One might wonder: why? Because I don't want to be locked in to one vendor.
If cross-cloud replication is not possible, what's the easiest way to pull the entire database off (ideally not just the data with pg_dump, but all its internals too)?
While this question is Azure -> GCP, it seems other alternatives like GCP -> AWS or other vendors are also not supported. Or what am I missing?
Cross-cloud replication from an Azure source PostgreSQL to a GCP Cloud SQL destination is possible using conventional native logical replication; I've tested it and it works. I'm sure it would work for a self-managed database too.
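As a rough sketch of what native logical replication involves: a publication is created on the source and a subscription on the destination. The helper below only builds the two SQL statements as strings. The names `azure_pub` and `gcp_sub` and the connection string are hypothetical placeholders, and the sketch assumes the table definitions already exist on the destination (logical replication does not copy DDL) and that the destination can reach the source over the network.

```python
def publication_sql(pub_name: str) -> str:
    """Statement to run on the source: publish changes for every table.

    Assumption: the connecting role is allowed to use FOR ALL TABLES;
    if it is not, list the tables explicitly instead.
    """
    return f'CREATE PUBLICATION "{pub_name}" FOR ALL TABLES;'


def subscription_sql(sub_name: str, pub_name: str, conninfo: str) -> str:
    """Statement to run on the destination: subscribe to the source."""
    # Single quotes inside an SQL string literal are escaped by doubling them.
    escaped = conninfo.replace("'", "''")
    return (
        f'CREATE SUBSCRIPTION "{sub_name}" '
        f"CONNECTION '{escaped}' "
        f'PUBLICATION "{pub_name}";'
    )


if __name__ == "__main__":
    # Hypothetical names and host, for illustration only.
    print(publication_sql("azure_pub"))
    print(subscription_sql(
        "gcp_sub",
        "azure_pub",
        "host=source.example.com port=5432 dbname=app user=replicator sslmode=require",
    ))
```

The first statement would be run against the Azure database and the second against the Cloud SQL database; the initial table copy and ongoing change streaming are then handled by PostgreSQL itself.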
I have a Live - Backup pair with the replication HA policy. I would like to manage the replica startup myself when the live server fails, leaving only data replication between them. Is it possible to somehow achieve this behavior?
One of the main goals of the live/backup pair is failover (i.e. automatically starting the backup when the live fails). There is no way to disable this functionality and still use a live/backup pair.
However, you could potentially use a broker-connection with the "mirror" configuration to get the results you want.
We are considering using pgbouncer for our project, which involves dynamic database creation (i.e. a new database is created for each tenant that is added).
As far as I understand, pgbouncer takes a config file that maps the databases.
The question is: is there a way to add new databases to pgbouncer without restarting it (i.e. by adding a new database row to the config.ini file)?
I was actually looking into this same issue. It doesn't seem to be possible by default right now (per this issue). The originator of that issue has a branch of his fork for dynamic pooling, but it doesn't seem that will be merged. I wouldn't use it in production unless you're up to the additional work of maintaining a forked dependency for your project.
The current way is to update the .ini file. Besides the overhead of maintaining configuration in another place, this is complicated by the fact that, according to the docs, pgbouncer's "online restart" capability only works for non-TLS connections and only if your pgbouncer is running with Unix sockets. So, depending on your system configuration, online restarts for potentially frequent updates might be out of the question.
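To illustrate the "update the .ini" approach, here is a minimal sketch of a helper that adds one tenant entry to the `[databases]` section of a pgbouncer config. The function name, the tenant name, and the connection string are hypothetical. It works on the file contents as a string, so reading and writing the file, and getting pgbouncer to re-read it (its admin console has a `RELOAD` command; whether that is enough for your setup is worth verifying), are left to the caller.

```python
def add_database(ini_text: str, name: str, conn: str) -> str:
    """Return ini_text with `name = conn` added under [databases].

    Raises ValueError if the section is missing or the name already exists.
    """
    entry = f"{name} = {conn}"
    out = []
    in_databases = False
    inserted = False
    for line in ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A section header: remember whether we are inside [databases].
            in_databases = stripped == "[databases]"
        elif in_databases and "=" in stripped and not stripped.startswith((";", "#")):
            # An existing database entry: refuse to add a duplicate name.
            if stripped.split("=", 1)[0].strip() == name:
                raise ValueError(f"database {name!r} is already configured")
        out.append(line)
        if in_databases and stripped == "[databases]" and not inserted:
            # Put the new entry directly below the section header.
            out.append(entry)
            inserted = True
    if not inserted:
        raise ValueError("no [databases] section found")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        "[databases]\n"
        "app = host=db1 dbname=app\n"
        "\n"
        "[pgbouncer]\n"
        "listen_port = 6432\n"
    )
    print(add_database(sample, "tenant42", "host=db1 dbname=tenant42"))
```

Generating the entry from the same code path that creates the tenant database keeps the two from drifting apart, which addresses part of the "configuration in another place" overhead.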
I'm using PostgreSQL 9.6.
Is it possible to have replication and incremental backup on the same setup?
I would like to have high availability setup. On the main site I will have two servers with replication between them and pgpool will handle the failover in case the primary server goes down.
I would also like to have another remote site for geographical redundancy. This site will be active only if the main site is no longer functioning. The remote site does not need to be updated in real-time. Therefore, if it saves resources I thought about having incremental backup and restore from the main site to the remote site. In other words the main site primary server will replicate its data to the main site secondary server. In addition it will also generate incremental backup and that backup will be restored on the remote site.
From your answer I understood that it is possible to have both replication and incremental backup. However, will this solution be better (in resource consumption, reliability, etc.) than simply replicating to both the main-site secondary server and the remote-site server?
Yes, you can have PITR and streaming replication in use at the same time. Streaming replication can fall back to restoring from the WAL archive if it loses direct connectivity to the master too.
Lots more detail in the manual. It's hard to be more specific with a rather open and vague question - what, exactly, do you mean by "incremental backup"? etc.
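For PostgreSQL 9.6, a standby that streams from the primary and falls back to the WAL archive is configured through recovery.conf. The sketch below renders such a file; the function name, connection string, and archive directory are hypothetical, and it assumes the archive is a directory the standby can read with plain `cp`.

```python
def render_recovery_conf(primary_conninfo: str, archive_dir: str) -> str:
    """Build recovery.conf text for a 9.6 standby.

    The standby streams via primary_conninfo and uses restore_command
    to fetch archived WAL segments when streaming is unavailable.
    """
    def quote(value: str) -> str:
        # recovery.conf values are single-quoted; embedded quotes are doubled.
        return "'" + value.replace("'", "''") + "'"

    # %f is the WAL file name and %p the target path, filled in by PostgreSQL.
    restore_command = f'cp {archive_dir}/%f "%p"'
    lines = [
        "standby_mode = 'on'",
        f"primary_conninfo = {quote(primary_conninfo)}",
        f"restore_command = {quote(restore_command)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical host, user, and archive path, for illustration only.
    print(render_recovery_conf(
        "host=primary.example.com port=5432 user=replicator",
        "/mnt/wal_archive",
    ))
```

The same archive that feeds `restore_command` is what point-in-time recovery replays from, which is why the two features can share one setup.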
What's a quick and efficient way to transfer a large Mongo database?
I want to transfer a 10GB production Mongo 3.4 database to a staging environment for testing. I used the mongodump/mongorestore tools to test this transfer to my localhost, but it took over 8 hours and consumed a massive amount of CPU and memory, which is something I'd like to avoid in the future. The database doesn't have any indexes, so the mongodump option to exclude indexes doesn't increase performance.
My staging environment will mostly be read-only, but it will still need to write occasionally, so it can't be set up as a permanent read replica of production.
I've read about replication sets, but they seem very complicated to set up and designed for permanent mirroring of a primary to two or more secondaries. I've read some posts about people hacking this to be temporary, so they can do a one-time mirroring, but I can't find any reliable documentation since this isn't the intended usage of the feature. All the guides I've read also say you need at least 3 servers, which seems unintuitive since I only have 2 (production and staging) and don't want to create a third.
Several options exist today (2020-05-06).
Copy Data Directory
If you can take the system offline you can copy the data directory from one host to another then set the configuration to point to this directory and start up the new mongod.
Mongomirror
Mongomirror (https://docs.atlas.mongodb.com/import/mongomirror/) is intended to be a tool to migrate from on-premises to Atlas, but this tool can be leveraged to copy data to another on-premises host. Beware, this connection requires SSL configurations on source and target to transfer.
Replicaset
MongoDB has built-in high availability features using a replica set model (https://docs.mongodb.com/manual/tutorial/deploy-replica-set/). It is not overly complicated and works very well. This option allows the original system to stay online while replication does its magic. Once the replication completes, reconfigure the replica set to refer only to the new host and shut down the original host. This configuration is referred to as a single-node replica set. A single-node replica set offers benefits over a stand-alone installation in that the replica set underpinnings (the oplog) are the basis for other features such as change streams (https://docs.mongodb.com/manual/changeStreams/).
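The reconfiguration step can be sketched as a transformation of the replica set configuration document. The helper below is an illustration only: the function name and host names are hypothetical, and it assumes the input has the shape of the document returned by `rs.conf()`. It does not talk to MongoDB; the result would be supplied to `rs.reconfig()` in the shell, or to the `replSetReconfig` admin command through a driver, on the member that is currently primary.

```python
import copy


def single_node_config(current: dict, keep_host: str) -> dict:
    """Return a new config that keeps only the member at keep_host."""
    kept = [m for m in current["members"] if m["host"] == keep_host]
    if len(kept) != 1:
        raise ValueError(f"expected exactly one member with host {keep_host!r}")
    new_config = copy.deepcopy(current)
    new_config["members"] = copy.deepcopy(kept)
    # A reconfiguration must carry a version higher than the current one.
    new_config["version"] = current["version"] + 1
    return new_config


if __name__ == "__main__":
    # Hypothetical two-member set: production plus the new staging host.
    current = {
        "_id": "rs0",
        "version": 3,
        "members": [
            {"_id": 0, "host": "prod.example.com:27017"},
            {"_id": 1, "host": "staging.example.com:27017"},
        ],
    }
    print(single_node_config(current, "staging.example.com:27017"))
```

The member keeps its original `_id`, and the input document is left unmodified so it can still be inspected or reused if the reconfiguration is rejected.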
Backup and Restore
As you mentioned you can use mongodump/mongorestore. There is a point in time where the backup must be restored. During this time it is expected the original system is offline and not accepting any additional writes. This method is robust but has downtime associated with it. You could use mongoexport/mongoimport to use a JSON file as an intermediate step but this is not recommended as BSON data types could be lost in translation.
Per Mongo documentation, you should be able to cp/rsync files for creating a backup (if you are able to halt write ops temporarily on your production setup - or if you do this during a maintenance window)
https://docs.mongodb.com/manual/core/backups/#back-up-by-copying-underlying-data-files
> Back Up with cp or rsync
>
> If your storage system does not support snapshots, you can copy the files directly using cp, rsync, or a similar tool. Since copying multiple files is not an atomic operation, you must stop all writes to the mongod before copying the files. Otherwise, you will copy the files in an invalid state.
>
> Backups produced by copying the underlying data do not support point in time recovery for replica sets and are difficult to manage for larger sharded clusters. Additionally, these backups are larger because they include the indexes and duplicate underlying storage padding and fragmentation. mongodump, by contrast, creates smaller backups.
FYI - for replica sets, the third "server" is an arbiter, which exists to break the tie when electing a new primary. It does not consume as many resources as the primary and secondaries. Since you are looking to create a staging environment, I would not recommend creating a replica set that includes both the production and staging environments. Your primary could switch over to the staging instance, and clients that are meant to access the production instance would end up reading from and writing to the staging instance.