How to connect a Postgres read replica for reads without affecting the application source code - postgresql

I have an application which has read and write inline queries in the code, and I am facing a challenge in pointing the read and write queries to their respective databases. Is there a best way of doing this for a Go application?
My thought is to have two ORM instances, one for the read database and one for the write database, and to select the appropriate one based on the operation, e.g.: ReadDbMap.Select("query"); WriteDbMap.Update("query");
But this change affects the entire application, and that is my concern.

I am afraid that there is no simpler way.
Streaming replication is not primarily a load balancing feature. For one, you'll have to be aware that a change you made on the primary server is not immediately visible on the standby, so your application will have to deal with these temporary inconsistencies.
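For what it's worth, a common way to keep this change out of the call sites in Go is to hide the routing decision behind one thin wrapper. Below is a minimal sketch, assuming database/sql, the lib/pq driver and placeholder DSNs: reads go to the replica, writes go to the primary, and callers that must see their own writes still have to use the primary because of the lag described above.

    package db

    import (
        "database/sql"

        _ "github.com/lib/pq" // assumed Postgres driver
    )

    // RoutingDB keeps two pools: the primary for writes, the replica for reads.
    type RoutingDB struct {
        primary *sql.DB
        replica *sql.DB
    }

    // Open connects to both servers; the DSNs are placeholders.
    func Open(primaryDSN, replicaDSN string) (*RoutingDB, error) {
        p, err := sql.Open("postgres", primaryDSN)
        if err != nil {
            return nil, err
        }
        r, err := sql.Open("postgres", replicaDSN)
        if err != nil {
            return nil, err
        }
        return &RoutingDB{primary: p, replica: r}, nil
    }

    // Query sends read-only statements to the replica.
    func (d *RoutingDB) Query(q string, args ...interface{}) (*sql.Rows, error) {
        return d.replica.Query(q, args...)
    }

    // Exec sends writes to the primary.
    func (d *RoutingDB) Exec(q string, args ...interface{}) (sql.Result, error) {
        return d.primary.Exec(q, args...)
    }

The rest of the application keeps calling Query and Exec on a single handle; only the place that constructs the handle changes.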

Related

Multiple Siddhi apps or one big one

We are building an application on top of Siddhi (using the Java library) that allows users to dynamically add rules and have all incoming information going forward be run against those rules. My question is whether it's better to have one large app with many queries, streams, windows, and partitions, or to break each query out into its own application.
We have been including everything in one single Siddhi app (SiddhiAppRuntime), but this is starting to become large and I fear things may start interacting with each other in unintended ways. We are also snapshotting the SiddhiAppRuntime and restoring state whenever our application gets restarted. This could likely lead to massive restores if we have hundreds of pattern queries to re-run.
I am considering making a separate SiddhiAppRuntime from a single SiddhiManager for each query. The benefits (as I see them) would be reducing the risk of unintentional interactions, making each query able to function on its own, and making restores after a shutdown much simpler, since only a single query would need to be restored. The potential downside could be the increased overhead of having potentially hundreds of SiddhiAppRuntimes.
What is considered best practice for our scenario? What will offer better performance, both for running data through the rules and for restoring the rules in case we have to restart?
(If this is too broad or any clarification is needed I will do my best to update this question accordingly)
From the lengthy description you have given, I assume that the rules users add do not interact with each other, meaning rules added by user1 will not interact with rules added by user2.
In that case it is recommended to use a separate Siddhi app (SiddhiAppRuntime) for each user. This won't add much additional performance overhead, as the apps won't be interacting with each other. It will also improve the snapshotting process, since separate snapshots are taken per app.
Also, this gives you a clear separation between each collection of rules and makes them easier to manage.
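Siddhi itself is a Java library, so the following is only a language-neutral illustration (written in Go, with made-up types) of the layout recommended above: one independent runtime per user, so that adding rules, snapshotting and restoring are always scoped to a single user.

    package rules

    import "sync"

    // Runtime stands in for one per-user rule engine (a SiddhiAppRuntime in
    // the real system); the type and its methods are hypothetical.
    type Runtime struct {
        user  string
        rules []string
    }

    func (r *Runtime) AddRule(rule string) { r.rules = append(r.rules, rule) }

    // Snapshot and Restore only ever cover this one user's rules.
    func (r *Runtime) Snapshot() []string { return append([]string(nil), r.rules...) }
    func (r *Runtime) Restore(s []string) { r.rules = append([]string(nil), s...) }

    // Manager keeps one isolated runtime per user.
    type Manager struct {
        mu       sync.Mutex
        runtimes map[string]*Runtime
    }

    func NewManager() *Manager {
        return &Manager{runtimes: make(map[string]*Runtime)}
    }

    // ForUser returns the user's runtime, creating it on first use.
    func (m *Manager) ForUser(user string) *Runtime {
        m.mu.Lock()
        defer m.mu.Unlock()
        rt, ok := m.runtimes[user]
        if !ok {
            rt = &Runtime{user: user}
            m.runtimes[user] = rt
        }
        return rt
    }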

PostgreSQL - Periodically copying data from one database to another

I'm trying to set up an architecture with 2 databases, say preview and live, that have the exact same schemas. The use case is that edits can be made to the preview database and then pushed to the live database after they are vetted and approved. The production application would read from the live database.
What would be the most appropriate way to push all data from the preview database to the live database without bringing the live database down? Ideally the copy from preview to live would be an atomic transaction.
I've worked with this type of setup in MSSQL, but I'm fairly new to Postgres. So I'm open to hearing other ways to architect this (with Schemas perhaps?).
EDIT: The main reason to use separate databases is that I may need more than 1 target database (not just a single "live" database). I also may need to switch target databases on the fly without altering the source database schema.
I think what you're looking for is a "hot standby". This would be a separate instance of PostgreSQL, possibly on the same server but usually not, which is a near-real-time replica of the primary server.
In broad strokes, this is done by shipping the binary transaction logs from the primary server to the backup server, and then "replaying" them there. The exact mechanism for transmitting the logs may vary depending on your requirements.
Fortunately, the docs on this are excellent:
https://www.postgresql.org/docs/9.3/static/warm-standby.html
https://www.postgresql.org/docs/9.0/static/hot-standby.html

How do we develop an application on the mainframe to access DB2/LUW without DB2/z?

We have developed an application which runs on the mainframe (z/OS), and it uses CAF, the Call Attach Facility, to talk to DB2/z for storing its data.
Those customers which already have DB2/z (and hence have to pay for it regardless) are not concerned, but there are others who want to use our application without incurring the expense of the database as well.
They have expressed a desire to have our product not use DB2/z, due to the expense. Under z/OS, the licence fees for DB2 are rather high and our application doesn't really need the insane levels of reliability that it provides.
So what they'd like us to do is to run DB2 under either zLinux (SLES/RHEL), or DB2/LUW on a machine totally separate from the mainframe. Or even, though this will probably be harder, a non-IBM database.
We're looking for a solution that achieves this with hopefully minimal changes to our code. DB2 has all its federated stuff, which will allow a program using DB2/z to seamlessly access data on an instance running elsewhere, but that still requires DB2/z and hence won't result in a cost reduction.
What would be the easiest way to shift all the data off the mainframe and allow us to remove the DB2/z dependency completely from our application?
Building on #NealB's answer, another way to create the layers would be to have no SQL in your application layer, but to call subroutines to accomplish your I/O. You indicate you would be willing to create custom builds, so you could create a set of routines for commonly-asked-for persistence layers.
Call the "database connect" module, which for DB2 on z/OS would do the CAF calls, for DB2 on z/Linux would (say) establish an SSL connection to the DBMS. Maintain a structure in memory with a union of pointers to the necessary data structures to communicate with your DBMS of choice.
FWIW I've seen vendor code that does this, allowing the business logic to be independent of the DBMS implementation. Some shops use VSAM, others DB2, other IMS. The data model is messy, but, sometimes them's the breaks.
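The shape of such an I/O layer is language-agnostic; as a rough sketch (here in Go, with hypothetical names, purely to show the structure), the application layer sees only one small persistence interface and each backend supplies its own implementation:

    package persistence

    // Store is all the application layer ever sees; it issues no SQL itself.
    // Each supported backend (DB2 on z/OS via CAF, DB2/LUW over the network,
    // something non-IBM) ships its own implementation of this interface.
    type Store interface {
        Get(key string) ([]byte, error)
        Put(key string, value []byte) error
        Delete(key string) error
    }

    // db2luwStore is an illustrative stub for a backend that talks to DB2/LUW
    // over the network; the field and the method bodies are placeholders.
    type db2luwStore struct {
        dsn string
    }

    func (s *db2luwStore) Get(key string) ([]byte, error)     { /* remote fetch */ return nil, nil }
    func (s *db2luwStore) Put(key string, value []byte) error { /* remote upsert */ return nil }
    func (s *db2luwStore) Delete(key string) error            { /* remote delete */ return nil }

    // New selects the backend at startup (or per custom build); a z/OS build
    // would return a CAF-backed implementation instead.
    func New(dsn string) Store {
        return &db2luwStore{dsn: dsn}
    }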
This isn't an answer, just a couple of ideas and observations.
One approach I can think of would be to tier your application into an I/O layer and an application layer. The application would run on z/OS and the I/O layer would run on whatever machine hosts the database. All data access would then be via remote procedure calls over TCP/IP or UDP. This would be a lot of work to set up and configure. Worse yet, it may only be appropriate for read-only type operations, because managing transaction ACID (Atomicity, Consistency, Isolation, Durability) properties becomes a real nightmare in the face of update operations.
As cschneid pointed out, you could try "rolling your own" database management system using open source, but that too would probably lead to more problems than it solves.
I think your observation about "pushing a big rock uphill" sums it up.

Ejabberd: How to decouple Mnesia from the application server and use the MongoDB plugin instead... and how to handle big file transfers

We are building a chat application that should be able to handle millions of users and be capable of handling huge file transfers with ease, and we decided to use Ejabberd.
So my question comes in 2 parts:
Part One:
I did some research about replacing the Mnesia db (which is an RDBMS-type db) with MongoDB (NoSQL, which is where speed and scalability come in), and it turns out to be not so efficient, as we would have to wrap the NoSQL interface as an RDBMS database.
I'm thinking of this approach: disable the Mnesia database (the built-in db) of the Ejabberd server and use the mongodb-erlang plugin to store everything instead. Is that possible? And if it is possible, is it efficient?
Part Two:
Is handling file transfers through Ejabberd's mod_proxy65 plugin the right way to go for such big files (users can send videos) with millions of users, or is there a better approach, for example decoupling file transfer onto a separate application/proxy? I need some clarification on that part as well.

How to link MemCached servers together?

I'm looking into using MemCached for a web application I am developing, and after researching MemCached over the past few days, I have come across a question I could not find the answer to.
How do you link MemCached servers together, or how do you replicate data between MemCached servers?
Additionally: Is this functionality controlled by the servers or the clients and how?
When you configure several servers, the client libraries use a first hash to pick the server where each key/data pair will be stored. That means that there's no replication, and also that every client has to use the same set of servers.
pros:
almost zero overhead, storage and bandwidth grow linearly.
server code is kept simple and reliable.
cons:
any change in the set of servers (one goes down, or you add a new one) suddenly invalidates (almost) the whole cache.
you have to be sure to use the same algorithm on every client.
If you have control of the client code, you can simply store each key/data pair twice, on two servers. Just be sure to look in the same places when reading from a different client.
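As a concrete illustration, here is roughly what that looks like from Go with the gomemcache client (host names are made up): the client hashes each key across its server list on its own, and the "store it twice" redundancy is just the application writing through two separately configured clients.

    package main

    import (
        "fmt"
        "log"

        "github.com/bradfitz/gomemcache/memcache"
    )

    func main() {
        // One client, several servers: the library hashes each key to pick a
        // server, so every client must be configured with the same list.
        mc := memcache.New("cache1.example.com:11211", "cache2.example.com:11211")

        if err := mc.Set(&memcache.Item{Key: "greeting", Value: []byte("hello")}); err != nil {
            log.Fatal(err)
        }
        it, err := mc.Get("greeting")
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(string(it.Value))

        // Poor man's redundancy, as described above: write the same item through
        // two independently configured clients and look in both when reading.
        a := memcache.New("cache1.example.com:11211")
        b := memcache.New("cache2.example.com:11211")
        item := &memcache.Item{Key: "user:42", Value: []byte(`{"name":"Ann"}`)}
        _ = a.Set(item)
        _ = b.Set(item)
    }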
I've used BeITMemcached, and with that you create an instance of MemcacheClient and set the servers you want to use, just as strings.
At that point the client itself determines which of its available servers to put each item on. You never know which server an item will end up on.
Check here to see how the servers handle failover.
The easiest thing is to have a repopulate mechanism. In my case, I store several hundred objects in memcache which come out of a database. I can just call repopulate and put them all back in there. Whenever I add, update or delete them in the database, I make those same calls to memcache.
http://repcached.lab.klab.org/
Also, the PHP PECL memcache client can replicate data to multiple servers, see memcache.redundancy.
It sounds like you wish to have caches that can cope with machines rebooting, etc. If so…
In a lot of cases (assuming you are not writing Facebook) an RDBMS is fast enough for caching. Just create a table that has a key column and a blob column. If the RDBMS server has enough RAM, all the data will be held in memory and only saved to disk so as to allow recovery.
Remember this could be a separate server (or servers) from your main database server.
If you wish to get more fancy and are using a high-end RDBMS, you may be able to set up change notifications on the queries that are used to build the "cached data", so that out-of-date rows are deleted from the cache.
Or you can set up triggers to clear invalid rows from the cache; however, this can become very complex very quickly.
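A minimal sketch of that key/blob table idea, here in Go with database/sql (the lib/pq driver and the table name are assumptions; any RDBMS with an upsert will do):

    package dbcache

    import (
        "database/sql"

        _ "github.com/lib/pq" // assumed Postgres driver
    )

    // Setup creates the single key/blob table described above.
    func Setup(db *sql.DB) error {
        _, err := db.Exec(`CREATE TABLE IF NOT EXISTS kv_cache (
            k TEXT PRIMARY KEY,
            v BYTEA NOT NULL
        )`)
        return err
    }

    // Put inserts or overwrites a cached value.
    func Put(db *sql.DB, key string, value []byte) error {
        _, err := db.Exec(
            `INSERT INTO kv_cache (k, v) VALUES ($1, $2)
             ON CONFLICT (k) DO UPDATE SET v = EXCLUDED.v`,
            key, value)
        return err
    }

    // Get returns the cached value, or sql.ErrNoRows on a miss.
    func Get(db *sql.DB, key string) ([]byte, error) {
        var v []byte
        err := db.QueryRow(`SELECT v FROM kv_cache WHERE k = $1`, key).Scan(&v)
        return v, err
    }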
Memcached does not provide replication. To get that effect, you need to add the server to the memcached client's server list and then hit the DB for the data to be stored on that particular server.
You should seriously consider Couchbase. It uses the memcached protocol, provides nearly the same speed, and delivers the automatic replication you're looking for. It also persists to disk, so your cache will never be cold.