Is logical replication using pglogical possible with TimescaleDB? - postgresql

I set up fully functional logical replication from multiple masters to one slave.
As soon as I convert any of the replicated tables to a TimescaleDB hypertable, the replication stops working - only the structure is replicated, but no data.
Is it possible to use TimescaleDB together with pglogical? Would it be possible to use PostgreSQL 10 and its built-in logical replication instead?
My guess is that logical replication doesn't make sense in the context of tables consisting of virtual chunks.

No.
To our knowledge, PG10's logical replication will not work with
hypertables, because it actually doesn't replicate DDL commands, and
instead just does a pub/sub on the data.
- Mike Freedman, CTO TimescaleDB (04.04.2019)
More info: https://github.com/timescale/timescaledb/issues/1138#issuecomment-479674594
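To make the failure mode concrete, here is a hypothetical sketch (table, column, and publication names are placeholders, not from the original question) of what happens when a hypertable is put behind native logical replication:

    -- On the publisher:
    CREATE TABLE conditions (time timestamptz NOT NULL, device int, temperature float);
    SELECT create_hypertable('conditions', 'time');
    CREATE PUBLICATION ts_pub FOR TABLE conditions;

    -- Inserted rows are routed into chunk tables that TimescaleDB creates on the fly
    -- (in the _timescaledb_internal schema). Creating those chunks is DDL, which
    -- logical replication does not send, so a plain subscriber ends up with the
    -- table structure but never receives the data living in the chunks.
    INSERT INTO conditions VALUES (now(), 1, 21.5);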

Related

citus: Can I add one more replica for a distributed table

I have a distributed table, but it only has one replica, and a single replica doesn't give me HA. I want to add one more replica for the table - can I, and how do I do it?
I have searched the online help docs but didn't find any solution.
Replicas in Citus are not an HA solution. For HA you will need to set up standard Postgres tooling for every member of your cluster to stream WAL to another node. Citus specializes in distributed queries and separates that problem from HA by relying on proven technology available in the Postgres ecosystem.
If you want to scale out reads, adding a replica can help. However, replicas have a significant impact on write throughput. Before adding replicas, please thoroughly test that your database can handle your expected load. And again, if HA is your goal, don't add Citus replicas; instead, apply Postgres HA solutions to every worker and the coordinator.
For the reasons above, increasing the replica count of an already distributed table is not an operation Citus provides out of the box. The easiest approach is to create a new table and use INSERT INTO ... SELECT to reinsert the data into a table with the shard count and replication factor your application needs, as sketched below.
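A minimal sketch of that rebuild, assuming a source table events distributed on user_id (the table and column names, shard count, and replication factor are placeholders):

    -- Settings read by create_distributed_table for the new table:
    SET citus.shard_count = 32;                -- desired number of shards
    SET citus.shard_replication_factor = 2;    -- keep two copies of every shard

    -- Recreate the table, distribute it, then copy the data across:
    CREATE TABLE events_new (LIKE events INCLUDING ALL);
    SELECT create_distributed_table('events_new', 'user_id');
    INSERT INTO events_new SELECT * FROM events;
    -- Once the copy is verified, the tables can be swapped with ALTER TABLE ... RENAME.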

I have a PostgreSQL setup with 1 master and 2 slaves. Now I want to set up replication of only one database to a reporting server

My problem statement: I can't use Bucardo (no approval for creating triggers on prod tables), and pglogical only works on the master node (the publisher). I want to replicate from a slave (this is my end goal).
If you have v10 or better, you should definitely use logical replication; the burden on the primary server is minimal (a minimal setup is sketched after this answer).
If for whatever incomprehensible reason you absolutely have to replicate from a standby, you'll have to use streaming replication. Then you have to replicate the whole cluster, but there is no better solution.
One way out could be to put walbouncer in between, which can filter out WAL records for all but one database, so that you are effectively replicating just one database.
Disclaimer: walbouncer was developed by my company (it is open source though).
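For the native logical replication route suggested above, a minimal sketch (database names, connection string, and object names are placeholders):

    -- On the primary (publisher), inside the one database you want to replicate:
    CREATE PUBLICATION reporting_pub FOR ALL TABLES;

    -- On the reporting server (subscriber), after creating matching table definitions:
    CREATE SUBSCRIPTION reporting_sub
        CONNECTION 'host=primary.example.com dbname=appdb user=repuser password=secret'
        PUBLICATION reporting_pub;

Note that a publication can only be created on the primary, not on a standby, which is why replicating directly from a slave falls back to the streaming/walbouncer approach described above.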

MongoDB Sharding, Replication and Clustering

Based on my analysis, below is my understanding; correct me if it is wrong.
Sharding - Horizontal scaling: split the records into multiple chunks and store them across multiple machines, with a good shard key for each collection.
Replication - Replicate the data across multiple machines for high availability.
Clustering - As per the Mongo architecture there will be one master and multiple slave machines. Writes and sensitive reads are performed against the master, and reads are performed against the slaves.
I am not able to correlate clustering with replication and sharding - could someone please guide me on how they relate?
Term "clustering" is not normally used with mongodb. Instead, its meaning included in the term "sharding". A shard is a node/replicaset with only a portion of your data, yes. And cluster is simply a collection of shards (and supporting nodes, like config servers and mongos routers)
Whereas replication is done with replica sets, which have one primary node (master) and other nodes are secondaries (slaves).

postgres streaming replication - slave only index

We have successfully deployed Postgres 9.3 with streaming replication (WAL replication). We currently have 2 slaves, the second slave being cascaded from the first. Both slaves are hot standbys with active read-only connections in use.
Due to load, we'd like to create a 3rd slave with slightly different hardware specifications, used by a different application as a read-only database in more of a data-warehouse use case. As it's for a different application, we'd like to optimize it specifically for that application and improve performance by adding some additional indexes. For size and performance purposes, we'd rather not have those indexes on the master or the other 2 slaves.
So my main question is, can we create different indexes on slaves for streaming replication, and if not, is there another data warehouse technique that I'm missing out on?
So my main question is, can we create different indexes on slaves for streaming replication
No, you can't. Streaming physical replication works at a lower level than that, copying disk blocks around. It doesn't really pay attention to "this is an index update," "this is an insert into a table," etc. It does not have the information it'd need to maintain standby-only indexes.
and if not, is there another data warehouse technique that I'm missing out on?
Logical replication solutions like:
Londiste
pglogical
Slony-I
can do what you want. They send row changes, so the secondary server can have additional indexes.
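As a rough sketch of the pglogical route (node names, DSNs, and the table public.sales are placeholders, and pglogical itself needs PostgreSQL 9.4 or newer, so an upgrade from 9.3 would be part of this path):

    -- On the provider (master):
    CREATE EXTENSION pglogical;
    SELECT pglogical.create_node(
        node_name := 'provider',
        dsn := 'host=master.example.com dbname=app user=replicator');
    SELECT pglogical.replication_set_add_table('default', 'public.sales', true);

    -- On the data-warehouse subscriber:
    CREATE EXTENSION pglogical;
    SELECT pglogical.create_node(
        node_name := 'warehouse',
        dsn := 'host=dw.example.com dbname=app user=replicator');
    SELECT pglogical.create_subscription(
        subscription_name := 'warehouse_sub',
        provider_dsn := 'host=master.example.com dbname=app user=replicator');

    -- The subscriber is an ordinary writable database, so extra, subscriber-only
    -- indexes are perfectly legal:
    CREATE INDEX sales_report_idx ON public.sales (region, sold_at);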

Is it possible to have additional writable tables to a postgresql slave (replicated by Skytools/londiste)

I have a master postgres database M with tables M.A1, etc.
I have a slave database S with tables S.A1, etc., populated and maintained by Skytools/Londiste. Everything works great.
I don't know exactly how it works, since I am not the person who set up my Skytools instance. I have just read some pieces of documentation and interacted with it slightly.
I would like to add some auxiliary read/write tables to S: S.B1. (I want to join against S.A1 without adding any extra load to M, which is why I want to put B1 on S.) Is it possible to maintain this setup?
If I create a new table S.B1 on a Skytools/Londiste slave, will that interfere with replication of table A1?
Edit to add followup:
How safe would such a setup be, with respect to slave failures impacting the master?
I am not very concerned about replication lag or downtime on my analytics slave (but I would need a way to eventually recover without taking downtime on the master).
I am very concerned about a slave failure causing the master to grow its replication queue indefinitely and consuming HD/RAM/resources on the master. How would I mitigate that? Is there a way to set a tolerance so that the master just drops the slave connection if the slave falls too far behind?
Part 2
If I do get this set up working, I'll want to have a slave backup of S.B1 somewhere, in case S fails.
Is it possible to set up a secondary slave T, and configure Skytools/Londiste to replicate S.B1 to T.B1, while M.A1 is also replicating to S.A1?
What caveats or gotchas should I be concerned about?
Thank you so much for advice and pointers.
Firstly, I would really suggest that you spend the time to understand how Skytools' PgQ and Londiste work. It is not very complicated, but the documentation is rather scant.
For your first question - yes you can have other tables on the slave which are not replicated from the master.
Your second question is a bit more involved and I am not sure if your requirement is entirely clear.
Assuming the tables you want to replicate from the slave to a secondary slave are an entirely separate group from the tables you are replicating from the master to the initial slave, you could install PgQ on the initial slave and Londiste on the secondary slave, create a new queue, and add to that queue the tables you wish to replicate to the secondary slave (see the PgQ sketch at the end of this section).
You can't use Skytools/Londiste for cascading replication (e.g. master -> slave1 -> slave2), so it is not obvious what benefit you would get from partially replicating data from one slave to another.
It would be simpler to have all the tables on the master, use just one queue for replication to the slave, and then, for resilience, keep a warm standby of the master (see the explanation for 8.4), from which you could do a point-in-time recovery if necessary and then rebuild the slave from a consistent master. Skytools has packages to help you with setting up warm standby/PITR.
If you cannot have all the tables on the master, then you might be better off maintaining a warm standby of the slave for PITR recovery, but bear in mind you would probably have to resubscribe the tables replicated from the master after doing such a restore. This might be complicated if the slave tables you are joining to the master tables have foreign key constraints.
If you are on Postgres 9, there is streaming replication, which may also serve, but I have not used it.
Just to expand on the topic in case anyone reaches it: you can have multiple queues as Gavin suggested above, and you can also have cascading replication as of Skytools version 3 (March 2012). And indeed you can replicate any subset of tables, and you can even have tables renamed on the destination if needed.
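A PgQ-level sketch of the "new queue on the initial slave" idea, and of keeping an eye on consumer lag (queue and consumer names are placeholders; in practice Londiste's own tooling manages the queue and the table subscriptions, so this only shows where the moving parts live):

    -- On the initial slave S, which acts as a provider for the secondary slave T:
    CREATE EXTENSION pgq;                      -- assumes pgq is packaged as an extension;
                                               -- older Skytools installs it via its own scripts
    SELECT pgq.create_queue('s_to_t_queue');   -- new queue, separate from the master->S queue

    -- Monitoring, relevant to the worry about a queue growing indefinitely:
    SELECT * FROM pgq.get_queue_info();
    SELECT * FROM pgq.get_consumer_info();     -- shows per-consumer lag

    -- A dead or hopelessly lagging consumer blocks event-table rotation and makes the
    -- queue grow; dropping it lets PgQ reclaim the space:
    -- SELECT pgq.unregister_consumer('s_to_t_queue', 'dead_consumer');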