Heroku horizontal scaling - PostgreSQL

I have a python app running on Heroku using a PostgreSQL database. If I create a database follower, will that follower be used to balance the read database load automatically? I know this provides me a failover copy of sorts, but will it relieve my database load?

No -- you'll need to configure your Python application to route reads to the follower and writes to the master in order to actually relieve your database load.
If you're using Django, you'll want to read this: https://docs.djangoproject.com/en/1.8/topics/db/multi-db/
If you're using SQLAlchemy, you'll want to read up on a read-slave / read-write-master setup.
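If you go the Django route, here is a minimal sketch of a database router that sends reads to the follower and writes to the master. The alias "follower", the hosts, the credentials, and the module path are all illustrative placeholders, not anything Heroku provides by default:

# settings.py -- two connections; fill these in from your Heroku
# DATABASE_URL and follower URL (values here are placeholders).
DATABASES = {
    "default": {   # the master: receives all writes
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "mydb",
        "USER": "app",
        "PASSWORD": "secret",
        "HOST": "master.example.com",
        "PORT": "5432",
    },
    "follower": {  # the read-only follower
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "mydb",
        "USER": "app",
        "PASSWORD": "secret",
        "HOST": "follower.example.com",
        "PORT": "5432",
    },
}
DATABASE_ROUTERS = ["myapp.routers.ReadFollowerRouter"]

# myapp/routers.py -- route reads to the follower, writes to the master.
class ReadFollowerRouter:
    def db_for_read(self, model, **hints):
        return "follower"

    def db_for_write(self, model, **hints):
        return "default"

    def allow_relation(self, obj1, obj2, **hints):
        # Both aliases point at the same data, so relations are safe.
        return True

Keep in mind the follower lags slightly behind the master, so a read routed there immediately after a write may return stale data.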

Related

Postgres pg_dumpall consistency across databases that are backed up

Is there a way to back up all of the databases on a hyperscaler-managed Postgres server as of a certain time, in order to maintain data consistency between the databases, with either pg_dumpall, pg_dump, or something else?
Background:
With the utilization of micro-services, an application may have many databases associated with it on a single hyperscaler-managed Postgres server. The hyperscalers do perform a functional snapshot backup; however, when a hyperscaler-managed Postgres server is accidentally deleted, the Postgres backups are lost as well. These hyperscalers provide locks to prevent accidental deletion of a Postgres server and mention that their support teams can be contacted to restore a deleted server; even so, we still had a Postgres server get deleted. We were able to recover by contacting the hyperscaler's support team, but we would like to have a second way of backing up a hyperscaler-managed Postgres server.
I realize that the micro-services should be able to auto-recover to a data-consistent point, but the reality is that many of them have not been designed or written to that requirement. I really do not want to get into the aspects of micro-service design and want to keep this a DBA backup question.
pg_dumpall will not take a consistent backup across all databases. Each database backup will be consistent, but the snapshots for the backups of the different databases will be taken at different times.
If you need a consistent backup across several databases in a single cluster, use an online file system backup with pg_basebackup.
You can use pg_basebackup, which most PostgreSQL DBAs end up using on a daily basis, whether via scripts or manually. It creates a base backup of the whole cluster, which can help with recovery in many situations. Because it takes the backup online, it is very useful in production.
You can review the PostgreSQL documentation on pg_basebackup for more details.
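As a sketch of what that second backup could look like, assuming the hyperscaler permits replication-protocol connections at all (many managed services do not), a scheduled job could invoke pg_basebackup like this; the host, role, and paths are placeholders:

# Take an online, cluster-wide base backup of a remote Postgres server.
# All connection details below are hypothetical.
import subprocess

subprocess.run(
    [
        "pg_basebackup",
        "-h", "mydb.example.com",    # managed server's hostname
        "-p", "5432",
        "-U", "replication_user",    # a role with REPLICATION privilege
        "-D", "/backups/base",       # target directory; must be empty
        "-F", "tar",                 # tar-format output, one file per tablespace
        "-X", "stream",              # also stream the WAL needed for consistency
        "-z",                        # gzip the tar files
        "-P",                        # report progress
    ],
    check=True,
)

Because the backup covers the whole cluster, every database on the server is captured at one consistent point in time.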

Postgres master / slave based on table

Currently I have one Postgres instance which is starting to receive too much load, and I want to create a cluster of two Postgres nodes.
From reading the documentation for postgres and pgpool, it seems like I can only write to a master and read from a slave or run parallel queries.
What I'm looking for is a simple replication of a database, but with the master/slave role depending on which table is being updated. Is this possible? Am I missing it somewhere in the documentation?
e.g.
update users will be executed on server1 and replicated to server2
update big_table will be executed on server2 and replicated back to server1
What you are looking for is called master/master (multi-master) replication. PostgreSQL does not support this natively; extensions such as BDR provide it, and since PostgreSQL 10 you can approximate your per-table split with logical replication, where each server publishes the tables it owns and subscribes to the other's. Note that these are "eventually consistent" architectures, so your application should be aware of possible temporary differences between the two servers.
See the PostgreSQL documentation for more details and setup instructions.
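If you try the logical-replication route (PostgreSQL 10 or later), here is a rough sketch of the per-table split driven from psycopg2; every host, database, and publication name is hypothetical, and each publication must exist before the other server subscribes to it:

# Sketch only: server1 owns `users`, server2 owns `big_table`.
# Run the mirror-image statements on server2.
import psycopg2

conn = psycopg2.connect("host=server1 dbname=app user=postgres")
conn.autocommit = True  # CREATE SUBSCRIPTION refuses to run in a transaction
cur = conn.cursor()

# Publish the table this server is the writer for...
cur.execute("CREATE PUBLICATION users_pub FOR TABLE users;")

# ...and subscribe to the table the other server is the writer for.
cur.execute(
    "CREATE SUBSCRIPTION big_table_sub "
    "CONNECTION 'host=server2 dbname=app user=postgres' "
    "PUBLICATION big_table_pub;"
)
conn.close()

On server2 you would publish big_table and subscribe to users_pub. As long as each table is only ever written on the server that owns it, the two nodes converge, apart from replication lag.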

Connect directly to the "local" database of primary MongoDB node in replica set

I need access to the oplog of the primary of my replica set in MongoDB. I am developing an oplog-driven ETL job, as it is far more efficient than simply querying and transferring the data of an entire collection.
I can access the OpLog easily with the following steps in the terminal:
mongo mymongoserver.com:10733/my-db -u oplogUser -pxxxxxxxx
Then I run:
use local
and then I can query the OpLog using:
db.oplog.rs.find()
My question is: Are there any settings I can pass to the mongo connect command to get me straight into the local database for the primary node?
I am using Talend Open Studio for my ETL needs. Am I approaching this the wrong way?
I have come across this from the guys at Stripe, which means that this is definitely possible! https://www.mongodb.com/presentations/building-real-time-systems-mongodb-using-oplog-stripe
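A hedged sketch of what this can look like: the mongo shell's --authenticationDatabase flag should let you land directly in local while still authenticating against my-db (mongo mymongoserver.com:10733/local -u oplogUser -p xxxxxxxx --authenticationDatabase my-db), and the PyMongo equivalent for an ETL job might tail the oplog like this; the connection details are copied from the question, the rest is illustrative:

# Connect straight to the primary's local database and follow the oplog.
# authSource makes the driver authenticate against my-db, where oplogUser
# is defined, while we read from local.
import pymongo

client = pymongo.MongoClient(
    "mongodb://oplogUser:xxxxxxxx@mymongoserver.com:10733/"
    "?authSource=my-db&readPreference=primary"
)
oplog = client.local["oplog.rs"]

# Resume from the most recent entry, then block waiting for new operations.
last = oplog.find_one(sort=[("$natural", pymongo.DESCENDING)])
cursor = oplog.find(
    {"ts": {"$gt": last["ts"]}},
    cursor_type=pymongo.CursorType.TAILABLE_AWAIT,
)
for op in cursor:
    # op["op"] is 'i'/'u'/'d' for insert/update/delete; op["ns"] is the
    # "db.collection" namespace the operation applies to.
    print(op["op"], op.get("ns"))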

Feasible to install mongo-connector on separate server from mongo databases?

Does anyone know if it's more feasible to install mongo-connector on its own server or to just have it running on one of the MongoDB database servers? I currently have it running on a secondary server and want to move it off that server, but I am not sure if it really matters.
Mongo Connector is just another client application connecting to your MongoDB server. You can install it wherever you like, as long as it is able to connect to your MongoDB server and whatever external system you're trying to feed data to.
Note: if Mongo Connector is matching only a comparatively small selection of the data in the replication oplog to insert into your external system, it makes sense to keep it on the same secondary (or whichever node's oplog is being monitored); run anywhere else, the Connector will pull the full oplog over the network, which may consume unnecessary bandwidth.

Mirror one database to another in PostgreSQL

I know the way to set up a master/slave DB in Postgres is to have two DB servers, but unfortunately I have only one server for now.
How can I mirror my production DB into another "backup DB" in "real time"? I want to give another user access to the mirrored DB, so even if he does something there it will not affect production.
Nothing stops you from setting up hot-standby streaming replication, or another replication option like Londiste, between two PostgreSQL instances on the same computer.
The two copies of PostgreSQL must use different ports, but that's the only real restriction.
How to set up the second PostgreSQL instance depends on your operating system and how you installed PostgreSQL, which you have not mentioned.
You'll want to use streaming replication with hot standby if you want a read-only replica. If you want it to be read/write, then you can do a one-off copy of the database with pg_basebackup and not keep them in sync after that. Or you can use a tool like Londiste to replicate changes selectively.
You can run multiple instances of PostgreSQL on the same computer by using different ports.
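A rough sketch of that same-machine setup, scripted from Python and assuming PostgreSQL 12 or later; the data directory, port, and role are placeholders, and the primary must already allow replication connections in pg_hba.conf:

import subprocess

# Clone the primary into a second data directory on the same machine.
# -R writes standby.signal plus primary_conninfo so the copy comes up as
# a read-only hot standby that keeps following production.
subprocess.run(
    ["pg_basebackup",
     "-h", "localhost", "-p", "5432",      # the production instance
     "-U", "replicator",                   # hypothetical replication role
     "-D", "/var/lib/postgresql/standby",  # new, empty data directory
     "-R",
     "-X", "stream"],
    check=True,
)

# The standby must listen on its own port; postgresql.auto.conf is read
# last, so appending the setting there overrides postgresql.conf.
with open("/var/lib/postgresql/standby/postgresql.auto.conf", "a") as f:
    f.write("port = 5433\n")

subprocess.run(
    ["pg_ctl", "-D", "/var/lib/postgresql/standby", "start"],
    check=True,
)

Your second user then connects to port 5433 and gets a read-only mirror. For the one-off read/write copy mentioned above, omit -R: the clone then starts as an independent instance that no longer follows production.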