Postgres architecture for one machine with several apps - postgresql

I have one machine on which several applications are hosted. Applications work on separated data and don't interact - each application only needs access to its own data. I want to use PostgreSQL as RDBMS. Which one of the following is best and why?
1. One global Postgres server, one global database, one schema per application.
2. One global Postgres server, one database per application.
3. One Postgres server per application.
Feel free to suggest additional architectures if you think they would be better than the ones above.

The question you need to ask yourself: does any application ever need to access data from another application (in the same SQL statement)? If you can answer that with a clear NO, then you should at least go for separate databases. Cross-database queries aren't that straightforward in Postgres, so if the different applications do need a lot of data from other applications, then solution 1 might be the deployment layout to consider. If this only concerns very few tables, then using foreign data wrappers with different databases might still be a better solution.
Solutions 2 and 3 are more or less the same from the perspective of each application. One thing to keep in mind when deciding between 2 and 3 is availability. Some configuration changes to Postgres require a restart of the service. Is an outage of all applications acceptable in that case, even though the change was only necessary for one?
But you can always start with option 2 and then move databases to different servers later.
Another question to ask is whether all applications will always use the same (major) Postgres version. With solution 2 you must make sure that all applications are compatible with a new Postgres version if one of them wants to upgrade, e.g. because of new features that the application wants to use.
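As a rough illustration of option 2, the sketch below generates the SQL that gives each application its own login role and its own database on one server. The function name and the application names are made up for the example, passwords are deliberately left out, and the name check is only a simple guard (it does not know about reserved words).

```python
import re

# Hypothetical helper: builds the statements for "one database per application".
# Names are restricted to plain lowercase identifiers so they can be used unquoted.
_SAFE_NAME = re.compile(r"^[a-z_][a-z0-9_]*$")


def provisioning_sql(app_names):
    """Return SQL creating one login role and one database per application."""
    statements = []
    for app in app_names:
        if not _SAFE_NAME.match(app):
            raise ValueError(f"unsafe identifier: {app!r}")
        statements += [
            # The role the application logs in with (set a password separately).
            f"CREATE ROLE {app} LOGIN",
            # The application's own database, owned by its role.
            f"CREATE DATABASE {app} OWNER {app}",
            # By default every role may connect to every database; take that away.
            f"REVOKE CONNECT ON DATABASE {app} FROM PUBLIC",
        ]
    return statements


if __name__ == "__main__":
    for statement in provisioning_sql(["billing", "wiki"]):
        print(statement + ";")
```

The statements would be run by a superuser, for example through psql. Because each application only ever connects to its own database, moving one of them to a separate server later mostly means changing that application's connection string.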

Solution 1 is stupid: a SQL schema is not a database. Use SQL schemas for one application that has multiple "parts" like "production", "sales", "marketing", "finances"...
As long as the volume of data is not too large and the number of users is not too high, use only one PG cluster to make administration easier.
If the volume of data or the number of users increases, it will be time to separate your databases onto new, distinct PG clusters.


How to create read replicas from multiple postgres databases into a single database?

I'd like to preface this by saying I'm not a DBA, so sorry for any gaps in technical knowledge.
I am working within a microservices architecture, where we have about a dozen or so applications, each supported by its own Postgres database instance (which is in RDS, if that helps). Each of the microservices' databases contains a few tables. It's safe to assume that there are no naming conflicts across any of the schemas/tables, and that there's no sharding of any data across the databases.
One of the issues we keep running into is wanting to analyze/join data across the databases. Right now, we're relying on a 3rd Party tool that caches our data and makes it possible to query across multiple database sources (via the shared cache).
Is it possible to create read-replicas of the schemas/tables from all of our production databases and have them available to query in a single database?
Are there any other ways to configure Postgres or RDS to make joining across our databases possible?
"Is it possible to create read-replicas of the schemas/tables from all of our production databases and have them available to query in a single database?"
Yes, that's possible and it's actually quite easy.
Set up one Postgres server that acts as the master.
For each remote server, create a foreign server, which you then use to create foreign tables that make the data accessible from the master server.
If you have multiple tables on multiple servers that should be viewed as a single table on the master, you can set up inheritance to make all those tables appear as one. If you can define a "sharding" key that identifies a distinct attribute for each of those servers, you can even make Postgres request the data only from the specific server.
All foreign tables can be joined as if they were local tables. Depending on the kind of query, some (or a lot) of the filter and join criteria can even be pushed down to the remote server to distribute the work.
As the Postgres Foreign Data Wrapper is writeable, you can even update the remote tables from the master server.
If the remote access and joins are too slow, you can create materialized views based on the remote tables to create a local copy of the data. This however means that it's not a real-time copy and you have to manage the regular refresh of the materialized views.
Other (more complicated) options are the BDR project or pglogical. It seems that logical replication will be built into the next Postgres version (to be released at the end of this year).
Or you could use a distributed, shared-nothing system like Postgres-XL (which probably is the most complicated system to set up and maintain).
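The foreign server approach described above could look roughly like the following. This is only a sketch that builds the postgres_fdw statements to run on the master; the server name, host, database, user and schema names are invented for the example, and it assumes the remote tables live in the remote `public` schema.

```python
import re

_SAFE_NAME = re.compile(r"^[a-z_][a-z0-9_]*$")


def quote_literal(value):
    """Quote a value as a SQL string literal by doubling single quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def check_name(name):
    if not _SAFE_NAME.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def fdw_setup_sql(server_name, host, dbname, remote_user, password,
                  local_schema, port=5432):
    """Statements that expose one remote database as foreign tables on the master."""
    server = check_name(server_name)
    schema = check_name(local_schema)
    return [
        "CREATE EXTENSION IF NOT EXISTS postgres_fdw",
        f"CREATE SERVER {server} FOREIGN DATA WRAPPER postgres_fdw "
        f"OPTIONS (host {quote_literal(host)}, dbname {quote_literal(dbname)}, "
        f"port {quote_literal(port)})",
        f"CREATE USER MAPPING FOR CURRENT_USER SERVER {server} "
        f"OPTIONS (user {quote_literal(remote_user)}, "
        f"password {quote_literal(password)})",
        # One local schema per remote service keeps the foreign tables apart.
        f"CREATE SCHEMA IF NOT EXISTS {schema}",
        f"IMPORT FOREIGN SCHEMA public FROM SERVER {server} INTO {schema}",
    ]


def local_copy_sql(local_schema, table):
    """Optional local snapshot of one foreign table, plus its refresh statement."""
    schema, tbl = check_name(local_schema), check_name(table)
    view = f"{schema}.{tbl}_copy"
    return [
        f"CREATE MATERIALIZED VIEW {view} AS SELECT * FROM {schema}.{tbl}",
        f"REFRESH MATERIALIZED VIEW {view}",
    ]


if __name__ == "__main__":
    for statement in fdw_setup_sql("orders_srv", "orders.internal", "orders",
                                   "readonly", "secret", "orders_remote"):
        print(statement + ";")
```

Running `fdw_setup_sql` once per microservice database gives the master one schema per service, and tables from different schemas can then be joined in a single query. The refresh statement from `local_copy_sql` would have to be scheduled separately.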

Firebird with 1 database file to use 2 servers

Is it possible for firebirdSQL to run 2 servers sharing 1 database file (FDB)/ repository?
No. The server needs exclusive access to the database files. In the case of the Classic architecture version, multiple fb_inet_server processes access the same files, but locks are managed through the fb_lock_mgr process.
Databases on NFS or SMB/CIFS shares are disallowed unless one explicitly disables this protection. firebird.conf includes strong warnings against doing this unless you really know what you are doing.
If you mean whether two servers on different hosts can share the same database, then no.
Firebird either requires exclusive access to a database (SuperServer), or coordinates access to the database by different processes on the same host through a lock file (SuperClassic and ClassicServer).
In both cases the server requires certain locking and write visibility guarantees, and most networked filesystems don't provide those (or don't provide the locking semantics Firebird needs).
If you really, really want to, you can by changing a setting in firebird.conf, but that is a road to a corrupt database or other consistency problems, and therefore not something you should want to do.
Most SQL servers will not allow such a configuration. If you want to split load, maybe you need to look at a multi-tier architecture. Using this architecture, you can split your SQL query load across many computers.

Messaging system: one Postgres database, several instances of the server

I am developing a messaging system (in Java) that can support around 10k users. The architecture is supposed to be as follows:
- 10k clients
- 2 or more replicas of the server (each on a different machine)
- 1 Postgres DB
The application is meant to run in a clustered environment (Amazon Web Services).
Now, I have read a couple of things on schemas in Postgres databases. I am not sure if I should use them (and in what way) or if a simple relational DB model will do.
Basically, the DB is supposed to be very simple (messages/metadata, queueID for messages, and users).
Thank you for your answers
Don't bother with schemas. They are useful for semantically separating information in a database with lots of tables that can be grouped into clusters relevant to separate topics. They don't help you with performance, clustering or replicating databases. Also, I agree with Frank Heikens - unless each of your users sends messages with high frequency, I wouldn't worry.

Postgres Multi-tenant administration/maintenance

We have a SaaS application where each tenant has its own database in Postgres. How would I apply a patch to all the databases? For example, if I want to add a table or add a column to a table, I have to either write a program that loops through all databases and executes SQL against them, or use pgAdmin and go through them one by one.
Is there a smarter and/or faster way?
Any help is greatly appreciated.
Yes, there's a smarter way.
Don't create a new database for each tenant. If everything is in one database then you only need to alter one database.
Pick one database, alter each table to have the column TENANT and add this to the primary key. Then insert into this database every record for all tenants and drop the other databases (obviously considerably more work than this as your application will need to be changed).
The differences with your approach are extensively discussed elsewhere:
What problems will I get creating a database per customer?
What are the advantages of using a single database for EACH client?
Multiple schemas versus enormous tables
Practicality of multiple databases per client vs one database
Multi-tenancy - single database vs multiple database
If you don't put everything in one database then I'm afraid you have to alter them all individually, and doing it programmatically would be simplest.
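A minimal sketch of the "loop through all databases" program might look like this. It is written against the generic Python DB-API so it can be tried with SQLite files as stand-ins; with Postgres you would pass a driver's connect function (for example psycopg2.connect) and one connection string per tenant instead. The function names, file names and the patch itself are assumptions for the example.

```python
import os
import sqlite3
import tempfile


def apply_patch(connect, targets, patch_sql):
    """Run patch_sql against every target database.

    Returns {target: None} on success or {target: error message} on failure,
    so one broken tenant does not stop the rollout to the others.
    """
    results = {}
    for target in targets:
        conn = connect(target)
        try:
            cur = conn.cursor()
            cur.execute(patch_sql)
            conn.commit()
            results[target] = None
        except Exception as exc:  # record the failure and carry on
            conn.rollback()
            results[target] = str(exc)
        finally:
            conn.close()
    return results


def demo():
    """Create three stand-in tenant databases and add a column to each."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"tenant_{i}.db") for i in range(3)]
        for path in paths:
            conn = sqlite3.connect(path)
            conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
            conn.commit()
            conn.close()

        results = apply_patch(sqlite3.connect, paths,
                              "ALTER TABLE customer ADD COLUMN note TEXT")

        columns = []
        for path in paths:
            conn = sqlite3.connect(path)
            # table_info rows are (cid, name, type, notnull, dflt_value, pk)
            columns.append([row[1] for row in conn.execute("PRAGMA table_info(customer)")])
            conn.close()
        return list(results.values()), columns


if __name__ == "__main__":
    print(demo())
```

Collecting the errors per database instead of stopping at the first one makes it easier to see which tenants still need the patch.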
At a higher level, all multi-tenant applications follow one of three approaches:
1. One tenant's data lives in one database,
2. One tenant's data lives in one schema, or
3. Add a tenant_id / account_id column to your tables (shared schema).
I usually find that developers use the following criteria when they evaluate these different approaches.
Isolation: Since you can put each tenant into its own database on one hand, and have tenants share the same table on the other, this becomes the most apparent dimension. If you provide your users raw SQL access or you're in a regulated industry such as healthcare, you may need strict guarantees from your database. That said, PostgreSQL 9.5 comes with row level security policies that make this less of a concern for most applications.
Extensibility: If your tenants share the same schema (approach #3) and have fields that vary between them, then you need to think about how to merge these fields.
This article on multi-tenant databases has a great summary of different approaches. For example, you can add a dozen columns, call them C1, C2, and so forth, and have your application infer the actual data in each column based on the tenant_id. PostgreSQL 9.4 comes with JSONB support and natively allows you to use semi-structured fields to express variations between different tenants' data.
Scaling: Another criterion is how easily your database would scale out. If you create a tenant per database or schema (#1 or #2 above), your application can make use of existing Ruby gems or Django packages to simplify app integration. That said, you'll need to manually manage your tenants' data and the machines they live on. Similarly, you'll need to build your own sharding logic to propagate foreign key constraints and ALTER TABLE commands.
With approach #3, you can use existing open source scaling solutions, such as Citus. For example, this blog post describes how to easily shard a multi-tenant app with Postgres.
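To make the shared-schema approach (#3) concrete, here is a small sketch using an in-memory SQLite database as a stand-in for Postgres. The table and column names are invented; the point is only that tenant_id is part of the primary key and that every query is scoped by it.

```python
import sqlite3


def make_shared_schema():
    """One table for all tenants; tenant_id is part of the primary key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE invoice (
            tenant_id  INTEGER NOT NULL,
            invoice_id INTEGER NOT NULL,
            amount     REAL    NOT NULL,
            PRIMARY KEY (tenant_id, invoice_id)
        )
        """
    )
    conn.executemany(
        "INSERT INTO invoice (tenant_id, invoice_id, amount) VALUES (?, ?, ?)",
        [(1, 1, 10.0), (1, 2, 5.0), (2, 1, 99.0)],
    )
    conn.commit()
    return conn


def invoices_for(conn, tenant_id):
    """Every query carries the tenant filter; the application must never omit it."""
    return conn.execute(
        "SELECT invoice_id, amount FROM invoice "
        "WHERE tenant_id = ? ORDER BY invoice_id",
        (tenant_id,),
    ).fetchall()


if __name__ == "__main__":
    conn = make_shared_schema()
    print(invoices_for(conn, 1))
    print(invoices_for(conn, 2))
```

Two tenants can both have an invoice number 1 because the key is the pair (tenant_id, invoice_id). A schema change is then a single ALTER TABLE, at the cost of relying on the tenant filter (or row level security in Postgres) for isolation.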
It's time for me to give back to the community :) So after 4 years, our multi-tenant platform is in production and I would like to share the following observations/experiences with all of you.
We used a database per each tenant. This has given us extreme flexibility as the size of the databases in the backups are not huge and hence we can easily import them into our staging environment for customers issues.
We use Liquibase for database development and upgrades. This has been a tremendous help to us, allowing us to package the entire build into a simple war file. All changes are easily versioned and managed very efficiently. There is a bit of a learning curve here and there but nothing substantial. Investing 2-5 days can save you a significant amount of time.
Given that we use Spring/JPA/Hibernate, we use a technique called Dynamic Data Source Routing. So when a user logs in, we find the related datasource with a lookup and connect their session to the right database. That's also when the Liquibase scripts get applied for updates.
That is it for now; I will come back with more later on.
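The routing idea mentioned above (look up the tenant's datasource at login and hand the session a connection to that database) can be sketched in a language-neutral way as follows. This is not the Spring API the poster used; the class and tenant names are hypothetical, and in-memory SQLite databases stand in for the per-tenant Postgres databases.

```python
import sqlite3


class TenantRouter:
    """Maps a tenant name to its own database and caches one connection each."""

    def __init__(self, connect, tenant_targets):
        self._connect = connect                # e.g. a DB-API connect function
        self._targets = dict(tenant_targets)   # tenant name -> connection target
        self._connections = {}

    def connection_for(self, tenant):
        if tenant not in self._targets:
            raise KeyError(f"unknown tenant: {tenant!r}")
        if tenant not in self._connections:
            # First request for this tenant: open its database. This is also
            # the point where pending schema updates could be applied.
            self._connections[tenant] = self._connect(self._targets[tenant])
        return self._connections[tenant]

    def close_all(self):
        for conn in self._connections.values():
            conn.close()
        self._connections.clear()


def demo():
    # Each sqlite3.connect(":memory:") call creates a separate private database.
    router = TenantRouter(sqlite3.connect, {"acme": ":memory:", "globex": ":memory:"})
    acme = router.connection_for("acme")
    acme.execute("CREATE TABLE message (body TEXT)")
    acme.execute("INSERT INTO message VALUES ('hello')")

    globex = router.connection_for("globex")
    acme_rows = acme.execute("SELECT body FROM message").fetchall()
    globex_tables = globex.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    cached = router.connection_for("acme") is acme
    router.close_all()
    return acme_rows, globex_tables, cached


if __name__ == "__main__":
    print(demo())
```

Data written for one tenant is simply not visible through the other tenant's connection, which is the isolation property a database-per-tenant design relies on.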
Well, there are problems with one database for all tenants in our case for sure.
The backup file gets huge and becomes impractical and hard to manage.
For troubleshooting, we need to restore a customer's data in our dev environment. We just use that customer's backup file, which is usually much smaller than it would be if we used one database for all customers.
Again, Liquibase has been key in allowing to manage updates across all the tenants seamlessly and without any issues. Without Liquibase, I can see lots of complications with this approach. So Liquibase, Liquibase and more Liquibase.
I also suspect that we would need more powerful hardware to manage a huge database with large joins across millions of records, versus a much lighter database with much smaller queries.
In case of problems, the service doesn't go down for everyone; the outage will be limited to one or a few tenants.
In general, for our purposes, this has been a great architectural decision and we are benefiting from it every day. One time we had one customer that didn't have their archiving active and their database size grew to over 3 GB. With offshore teams and slower internet as well as storage/bandwidth prices, one can see how things may become complicated very quickly.
Hope this helps someone.
--Rex

How to create multiple instance of sqlite database?

I am making an online app in which, when I sync my data with the web, 25 to 30 local database queries against different tables are executed. This takes around 25 to 30 seconds because every query works the same way: first it checks whether the data is present in the local database; if it is, the row is updated, otherwise it is inserted. Is there any way to execute all these queries concurrently? If so, I could save 10 to 15 seconds on every sync. Please suggest a better way to execute multiple queries.
Consider using a high-performance database management system such as cubeSQL:
SQLabs has announced the release of cubeSQL a fully featured and high
performance relational database management system built on top of the
sqlite database engine. It is the ideal database server for both
developers who want to convert a single user database solution to a
multiuser project and for companies looking for an affordable, easy to
use and easy to maintain database management system. cubeSQL runs on
Windows, Mac, Linux and it can be embedded into any iOS and Cocoa
application.
cubeSQL is incredibly fast, has a small footprint, is highly reliable
and it offers some unique features. It can be easily accessed with any
JSON client, with PHP, with the native C SDK, with a Windows DLL and
with an highly optimized REAL Studio plugin.
It is not possible to run two or more write queries at the same time, because a query that writes locks the whole database file.
If the queries you want to execute all relate to different tables, you can create a separate database file for each table.
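The separate-file suggestion above can be sketched like this: each table lives in its own SQLite file, so a worker thread per table can run its update-or-insert batch without waiting on the lock of another file. The table names, the (id, payload) columns and the function names are assumptions for the example.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def sync_table(db_path, table, rows):
    """Update-or-insert (id, payload) rows into one table stored in its own file.

    The table name is trusted here; it comes from the code, not from user input.
    Returns the number of rows in the table after the sync.
    """
    conn = sqlite3.connect(db_path)  # opened and used in the worker thread only
    try:
        with conn:  # commits the whole batch at once, rolls back on error
            conn.execute(
                f"CREATE TABLE IF NOT EXISTS {table} "
                "(id INTEGER PRIMARY KEY, payload TEXT)"
            )
            for row_id, payload in rows:
                cur = conn.execute(
                    f"UPDATE {table} SET payload = ? WHERE id = ?",
                    (payload, row_id),
                )
                if cur.rowcount == 0:  # nothing to update, so insert
                    conn.execute(
                        f"INSERT INTO {table} (id, payload) VALUES (?, ?)",
                        (row_id, payload),
                    )
        return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    finally:
        conn.close()


def demo():
    batches = {
        "users": [(1, "ann"), (2, "bob"), (1, "ann v2")],
        "orders": [(10, "book")],
    }
    with tempfile.TemporaryDirectory() as tmp:
        with ThreadPoolExecutor(max_workers=len(batches)) as pool:
            futures = {
                table: pool.submit(sync_table, os.path.join(tmp, f"{table}.db"), table, rows)
                for table, rows in batches.items()
            }
            return {table: future.result() for table, future in futures.items()}


if __name__ == "__main__":
    print(demo())
```

The price of this layout is that queries spanning several tables need the files to be combined again, for example with ATTACH DATABASE. Grouping each batch into one transaction, as `sync_table` does, also avoids paying for a separate commit per row.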