Following this doc, I'm reproducing a master-slave replication setup with PostgreSQL database servers. In front of the cluster there is a pgpool instance acting as a load balancer. So far so good.
The problem appears when I query the database from the app and call database functions that rely on extensions such as pg_trgm or pg_prewarm. Every time the query is balanced to one of the slaves, I get an exception telling me that the extension I'm trying to use is missing:
Could not access file $libdir/pg_trgm
When I check the extension list with \dx in every database on the master I get the full list, but on the slaves it shows only plpgsql.
As the slaves are read-only servers I can't create the extensions there.
Is there a way to replicate the extension creation to the slave servers?
Thanks in advance!
You forgot to install the “contrib” PostgreSQL package on the standby machine. As a consequence, the extension's shared library cannot be loaded there.
The slave servers think the extension is installed (because the references to it were copied along with the rest of the master's catalogs), but when a slave goes to load the meat and potatoes of the extension, it isn't actually there.
You need to install the binary objects which make up the extensions onto the replica servers. How you do this depends on how you installed the software on those servers to start with.
When I check the extension list with \dx in every database on the master I get the full list, but on the slaves it shows only plpgsql.
This isn't possible based on your description. If the replica is a copy of the master created by pg_basebackup, then \dx should return the same results on both master and replica. \dx just checks the system catalog to see what it thinks is installed. If the underlying binaries are missing, it doesn't care; it reports the extension anyway. If you get different results, then you are not connected to the instance you think you are.
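To see what a given standby is actually missing, you can compare the extensions recorded in its catalog with the ones whose files are present on that machine. Here is a minimal Python sketch of that comparison. The function name and the sample lists are made up for illustration; the query uses the standard pg_extension catalog and pg_available_extensions view, and it is read-only, so it should run on a hot standby.

```python
# Read-only query: extensions the catalog lists as installed (pg_extension)
# that have no control file on this server (pg_available_extensions).
MISSING_FILES_SQL = """
SELECT e.extname
FROM pg_extension AS e
LEFT JOIN pg_available_extensions AS a ON a.name = e.extname
WHERE a.name IS NULL
ORDER BY e.extname;
"""


def extensions_missing_on_disk(in_catalog, available_on_disk):
    """Return extensions recorded in the catalog but absent from disk."""
    return sorted(set(in_catalog) - set(available_on_disk))


# Hypothetical standby where the contrib package was never installed.
catalog = ["plpgsql", "pg_trgm", "pg_prewarm"]  # replicated from the master
on_disk = ["plpgsql"]                           # what this machine ships
print(extensions_missing_on_disk(catalog, on_disk))
```

Anything this reports needs its package installed on the standby; nothing has to be created in the database itself.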
I am currently working with a Managed Instance from TimescaleDB and have incoming data. Now, I am setting up another instance, but this time it will be self-hosted and managed by me.
As such, I would like to set up some sort of replication so that the data coming into the managed instance is accessible in the self-hosted one (with some delay; it does not have to be live). I've looked into setting up replication with WAL streaming, but I've run into an issue.
Most replication workflows require changes in postgresql.conf and pg_hba.conf files which I cannot access (Managed instance). TimescaleDB support also says modifying this is not possible.
Is there a way to achieve this, without access to those files?
We are experimenting with Kubernetes and Confluence in the cloud and have deployed Confluence connected to a pgsql database. When applying an update, something happened that caused the pgsql pod to tank and lose the persistent volume connections.
Thankfully the volume was set to retain, so we have the volume and I have since been able to point a new pgsql instance to this volume, but I can't find a way to get Confluence to see this existing database. Confluence just proceeds to the initial fresh install screens. I've tried installing it on a temporary database and then modifying the confluence.cfg.xml file to point to the old data once completed but Confluence will not restart when I try this.
Any help is appreciated.
Using the web installer you should have a step to select "My own database". From there you can configure the database credentials and host. Go ahead and let the installer run, it will overwrite the default settings but will retain your existing data.
Also, you may want to get on the psql shell via console and check to make sure that your data actually exists and you haven't ended up with an empty database.
If you're still stuck, reach out here and we can check out the next steps.
In my case the original solution posted here is accurate:
However, I had to do this in a non-containerized environment. I installed Confluence on a VM using a blank database, then modified the confluence.cfg.xml file to point to the pgsql database in the Kubernetes cluster and restarted Confluence. I was able to see my data, so I then used Confluence's XML export feature to grab the dataset. I then blew away the Kubernetes environment, re-created it from scratch, and imported the backed-up XML into the new instance. Not a super clean way of doing it, but it got me where I needed to be.
The idea here is that I have a MongoDB cluster deployed on the managed cloud service Atlas, with Continuous Backup enabled.
Now what I want to do is:
1) Use the existing backup.
2) Using this existing backup, create a similar cluster
(having the same data from the backup).
3) Automate this process so that every day my new cluster is brought up to date with the original cluster.
Note: The reason for cloning the cluster is that the original cluster holds production data. I want to create a database with the same data that I can plug into any analytics tools and run different operations on, without affecting production data and load.
So far what I have found is to use mongodump and mongorestore. But mongodump puts load on the production database even though backup is already enabled. I want to use that same backup to clone the data to another cluster.
Since it is deployed on Atlas, your cluster is necessarily a replica set.
Here are 2 solutions:
You only need to read data: connect your tools to a secondary server (ideally a dedicated one with priority 0 so that it never becomes primary).
You need to read and write data: on the same server as above, run your mongodump command with the --oplog option. This way you dump your data from a read-only server and avoid slowing down your main servers.
In this latter case, what you need is really a backup strategy; take a look at the doc to know more.
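For the second option, the invocation could look roughly like the sketch below, written in Python so that it can be dropped into a daily job. The URI and output directory are placeholders, and the helper name is made up. The flags are standard mongodump options, but check them against the version you have installed; note that --oplog only works when dumping the whole deployment, not a single database.

```python
import shlex


def build_mongodump_command(uri, out_dir):
    """Build a mongodump invocation that reads from a secondary."""
    return [
        "mongodump",
        f"--uri={uri}",
        "--readPreference=secondary",  # keep the read load off the primary
        "--oplog",                     # also capture writes made during the dump
        f"--out={out_dir}",
    ]


# Placeholder URI and output directory, not real endpoints.
cmd = build_mongodump_command("mongodb://replica.example.net:27017", "/backups/nightly")
print(shlex.join(cmd))
```

Restoring that dump into the other cluster with mongorestore --oplogReplay replays the captured oplog, so the copy matches a single point in time.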
There's an offering for this purpose in Atlas called analytics nodes.
An analytics node is a read replica of your database. It does not interfere with your production traffic, which makes it safer.
Also, you can connect BI connectors to this node and create your analytic platform.
We used redash.
One of our clients has a server running a MongoDB instance, and we have to build an analytical application using the data stored in their MongoDB database, which changes frequently.
The client's requirements are:
That we do not connect to their MongoDB instance directly or run another instance of MongoDB on their server, but instead somehow run our own MongoDB instance on our machine in our office, using their MongoDB database directory remotely with read-only access.
We've suggested deploying a REST application or getting a copy of their database dump, but they did not want that. They just want us to run our own MongoDB instance hooked up to their MongoDB data directory. Is this even possible?
I've been searching for a solution for the past two days and we have to submit a solution by Monday. I really need some help.
I think this is a normal request, because analytical queries could cause too much load on the production server. It is pretty normal to separate production and analytical databases.
The easiest option is to use MongoDB replication. Set up a MongoDB replica set with the production database instance as primary and the analytical database instance as secondary, and configure the analytical instance to never become primary.
If it is not possible to use replication (for example, the client doesn't want it, or the servers cannot connect directly to each other), there is another option. You can read the oplog from the remote database and apply the operations to your own database instance. This is exactly the low-level mechanism by which a replica set works, but you can do it manually too. For example, MMS (MongoDB Monitoring Service) Backup reads the oplog for online backups of MongoDB.
Update: mongooplog could be the right tool for real-time application of replication oplog pulled from remote server on local server.
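To illustrate what "applying the oplog" means, here is a toy Python sketch that replays entries into an in-memory store. It is only an illustration under assumptions: the entry layout (op, ns, o) mirrors what documents in local.oplog.rs look like, but that format is internal to MongoDB and varies between versions. Updates are deliberately left out, and the sample entries are invented.

```python
def apply_oplog_entry(store, entry):
    """Apply one oplog-style entry to an in-memory copy of the data.

    store maps a namespace ("db.collection") to a dict of _id -> document.
    Only inserts ("i"), deletes ("d") and no-ops ("n") are handled here.
    """
    op = entry["op"]
    if op == "n":                      # no-op entries carry no data change
        return store
    if op not in ("i", "d"):
        raise NotImplementedError(f"unhandled oplog op {op!r}")
    collection = store.setdefault(entry["ns"], {})
    if op == "i":                      # insert: "o" holds the full document
        doc = entry["o"]
        collection[doc["_id"]] = doc
    else:                              # delete: "o" holds the _id to remove
        collection.pop(entry["o"]["_id"], None)
    return store


# Invented entries, in the order they would be read from the remote oplog.
entries = [
    {"op": "i", "ns": "shop.orders", "o": {"_id": 1, "total": 10}},
    {"op": "i", "ns": "shop.orders", "o": {"_id": 2, "total": 25}},
    {"op": "n", "ns": "", "o": {"msg": "periodic noop"}},
    {"op": "d", "ns": "shop.orders", "o": {"_id": 1}},
]
local_copy = {}
for entry in entries:
    apply_oplog_entry(local_copy, entry)
print(local_copy)
```

A real tool also has to remember the timestamp of the last entry it applied so it can resume from there, and it has to handle updates.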
I don't think that running two database instances that point to the same data files is possible or even recommended.
You could use mongorestore to restore from their data files directly, but this will only work if their mongod instance is not running (because mongorestore will need to lock the directory).
Another solution would be to take file system snapshots and then restore them to your local database.
The downside of these backup/restore solutions is that your data will not be in sync all the time.
Probably the best solution will be to use replica sets with hidden members.
You can create a replica set with just two members:
Primary - this will be the client server.
Secondary - hidden, with votes and priority set to 0. This will be your local instance.
Their server will always be primary (because hidden members cannot become primaries). Clients cannot see hidden members so for all intents and purposes your server will be read only.
Another upside to this is that the MongoDB replication will do all the "heavy" work of syncing the data between servers and your instance will always have the latest data.
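The two-member layout above corresponds to a replica set configuration document along these lines. This is a sketch in Python: the set name and host names are placeholders and the helper function is made up, while the member fields (hidden, priority, votes) are the standard replica set configuration options.

```python
def two_member_config(set_name, client_host, local_host):
    """Build a replica set config with one hidden, non-voting secondary."""
    return {
        "_id": set_name,
        "members": [
            # The client's server: the only member that can be primary.
            {"_id": 0, "host": client_host, "priority": 1},
            # Your local instance: hidden from clients, never primary,
            # and without a vote in elections.
            {"_id": 1, "host": local_host,
             "hidden": True, "priority": 0, "votes": 0},
        ],
    }


# Placeholder names, for illustration only.
config = two_member_config(
    "rs0", "client-db.example.com:27017", "office-db.example.com:27017"
)
print(config["members"][1])
```

A document of this shape is what rs.initiate() or rs.reconfig() takes in the mongo shell.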
I have an app that can run in offline mode. If offline it uses a local mongo database, if it has a data connection it will use a remote mongo database.
Is there an easy way to sync these two databases and make sure they both have the union of their collections and documents?
EDIT: Effectively there are two databases that could both have insertions and deletions happening on them that aren't happening on the other. At fixed points in time I would like to have both databases show the union of them both.
For example over a period of time.
DB1.insert(A)
DB1.insert(B)
DB2.insert(C)
DB1.remove(A)
RUN SYNC
DB1 = DB2 = {B, C}
EDIT2: Been doing some reading. It's not the intended purpose, but could the local databases be set up as slaves in a replica set of the remote and used that way? The problem is that I think replica set hosts must be reachable through resolvable DNS names. Not sure how the remote could access a local host.
You could use a replica set, but MongoDB doesn't support master-master replication. Let's assume you have a setup like this:
two nodes with priority 1 which will be used as remote servers
single arbiter to ensure majority if one of remotes dies
5 local dbs with priority set to 0
When your application goes offline, its local db will stay secondary, so you won't be able to perform writes. When you go online it will sync changes from the remote dbs, but you still need some way of syncing local changes. One way of dealing with this could be a local fallback db that is used for writes while you are offline. When you go online, you push all new records to the master. Dealing with updates could be a little bit trickier, but it is doable.
Another problem is that it won't scale if you need to add more applications. If I remember correctly, there is a limit of 12 nodes per replica set. For a small cluster, DNS resolution could be solved by using SSH tunnels.
Another way of dealing with the problem could be to use a small RESTful service and document timestamps. Whenever the app is online, it can periodically push local inserts to the remote and pull data from the remote db.
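A rough Python sketch of the timestamp idea, using the example from the question. It rests on two assumptions of mine: every record carries a "ts" field, and deletions are kept as tombstones (a "deleted" flag) rather than physically removed, since otherwise a delete cannot be told apart from a document the other side has never seen. The field and function names are made up.

```python
def merge(local, remote):
    """Last-write-wins merge of two {doc_id: record} maps."""
    merged = dict(remote)
    for doc_id, record in local.items():
        other = merged.get(doc_id)
        if other is None or record["ts"] > other["ts"]:
            merged[doc_id] = record
    return merged


def live_ids(db):
    """Ids of documents that have not been deleted."""
    return {doc_id for doc_id, rec in db.items() if not rec["deleted"]}


# DB1 inserted A and B, then removed A; DB2 inserted C.
db1 = {"A": {"ts": 4, "deleted": True}, "B": {"ts": 2, "deleted": False}}
db2 = {"C": {"ts": 3, "deleted": False}}

synced = merge(db1, db2)
print(sorted(live_ids(synced)))  # ['B', 'C']
```

After the sync both sides store the merged map, which gives DB1 = DB2 = {B, C}. Tombstones can be purged once every client is known to have synced past them.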