Save simple information for a database within postgress - postgresql

I have a multi tennant application which will use the SILO Model to save data (each tennant will get an own database).
Because tennant names could be redundand my database are with GUIDs: MyApp_[GUID].
Now I want to save simple but neccesary information for each database like a tennant name and 3 to 5 more informations.
Is there a simple way to write and get these data?
The only way I can think of is to create a special table for this with only 1 row - but it seems a bot of wasting.

If you're looking for a simpler solution than a table per database (and having to deal with the awkward constraint that it must have exactly one row), you could
use a custom configuration parameter. You can change them with ALTER DATABASE. The downside is that you can only store strings, and that the settings might be overridden per session.
use a COMMENT on the database. The downside is that you can only store a single string per databasebase; the advantage is that it is automatically shown in many lists of databases such as psql's \l+ command
add your own columns to the pg_database system table. You should not mess with that, so it's a spectacularly bad idea even if you knew what you were doing, but in a relational model it's the closest to what you were asking for so I'd mention it for completeness.
I don't really advocate any of these solutions, although they do what you were asking for there's probably a better solution to your actual problem. It might be as simple a table of databases, possibly with a foreign key to pg_database, in an extra database shared by all tenants.

Related

db2look from SQL

Is it possible to get the table structure like db2look from SQL?
Or the only way is from command line? Thus, by wrapping a external stored procedure in C I could call the db2look, but that is not what I am looking for.
Clarification added later:
I want to know which tables have the non logged option from SQL.
It is possible to create the table structure from regular SQL and the public DB2 catalog - however, it is complex and requires some deeper skills.
The metadata is available in the DB2 catalog views in the SYSCAT schema. For a regular table you would first start off by looking into the values in SYSCAT.TABLES and SYSCAT.COLUMNS. From there you would need to branch off to other views depending on what table and column options you are after, whether time-travel tables, special partitioning rules, or many other options are involved.
Serge Rielau published an article on developerWorks called Backup and restore SQL schemas for DB2 Universal Database that provides a set of stored procedures that will do exactly what you're looking for.
The article is quite old (2006) so you may need to put some time in to update the procedures to be able to handle features that were added to DB2 since the date of publication, but the procedures may work for you now and are a nice jumping off point.

Postgres Multi-tenant administration/maintenance

We have a SaaS application where each tenant has its own database in Postgres. How would I apply a patch to all the databses? For example if I want to add a table or add a column to a table, I have to either write a program that loops through all databases and execute a SQL against them or using pgadmin, go through them one by one.
Is there smarter and/or faster way?
Any help is greatly appreciated.
Yes, there's a smarter way.
Don't create a new database for each tenant. If everything is in one database then you only need to alter one database.
Pick one database, alter each table to have the column TENANT and add this to the primary key. Then insert into this database every record for all tenants and drop the other databases (obviously considerably more work than this as your application will need to be changed).
The differences with your approach are extensively discussed elsewhere:
What problems will I get creating a database per customer?
What are the advantages of using a single database for EACH client?
Multiple schemas versus enormous tables
Practicality of multiple databases per client vs one database
Multi-tenancy - single database vs multiple database
If you don't put everything in one database then I'm afraid you have to alter them all individually, and doing it programatically would be simplest.
At a higher level, all multi-tenant applications follow one of three approaches:
One tenant's data lives in one database,
One tenant's data lives in one schema, or
Add a tenant_id / account_id column to your tables (shared schema).
I usually find that developers use the following criteria when they evaluate these different approaches.
Isolation: Since you can put each tenant into its own database in one hand, and have tenants share the same table on the other, this becomes the most apparent dimension. If you provide your users raw SQL access or you're in a regulated industry such as healthcare, you may need strict guarantees from your database. That said, PostgreSQL 9.5 comes with row level security policies that makes this less of a concern for most applications.
Extensibility: If your tenants are sharing the same schema (approach #3), and your tenants have fields that varies between them, then you need to think about how to merge these fields.
This article on multi-tenant databases has a great summary of different approaches. For example, you can add a dozen columns, call them C1, C2, and so forth, and have your application infer the actual data in this column based on the tenant_id. PostgresQL 9.4 comes with JSONB support and natively allows you to use semi-structured fields to express variations between different tenants' data.
Scaling: Another criteria is how easily your database would scale-out. If you create a tenant per database or schema (#1 or #2 above), your application can make use of existing Ruby Gems or [Django packages][1] to simplify app integration. That said, you'll need to manually manage your tenants' data and the machines they live on. Similarly, you'll need to build your own sharding logic to propagate foreign key constraints and ALTER TABLE commands.
With approach #3, you can use existing open source scaling solutions, such as Citus. For example, this blog post describes how to easily shard a multi-tenant app with Postgres.
it's time for me to give back to the community :) So after 4 years, our multi-tenant platform is in production and I would like to share the following observations/experiences with all of you.
We used a database per each tenant. This has given us extreme flexibility as the size of the databases in the backups are not huge and hence we can easily import them into our staging environment for customers issues.
We use Liquibase for database development and upgrades. This has been a tremendous help to us, allowing us to package the entire build into a simple war file. All changes are easily versioned and managed very efficiently. There is a bit of learning curve here an there but nothing substantial. 2-5 days can significantly save you time.
Given that we use Spring/JPA/Hibernate, we use a technique called Dynamic Data Source Routing. So when a user logs-in, we find the related datasource with a lookup and connect them to the session to the right database. That's also when the Liquibase scripts get applied for updates.
This is, for now, I will come back with more later on.
Well, there are problems with one database for all tenants in our case for sure.
The backup file gets huge and becomes almost not practical hard to manage
For troubleshooting, we need to restore customer's data in our dev env, we just use that customer's backup file and usually the file is not as big as if we were to use one database for all customers.
Again, Liquibase has been key in allowing to manage updates across all the tenants seamlessly and without any issues. Without Liquibase, I can see lots of complications with this approach. So Liquibase, Liquibase and more Liquibase.
I also suspect that we would need a more powerful hardware to manage a huge database with large joins across millions of records vs much lighter database with much smaller queries.
In case of problems, the service doesn't go down for everyone and there will be limited to one or few tenants.
In general, for our purposes, this has been a great architectural decision and we are benefiting from it every day. One time we had one customer that didn't have their archiving active and their database size grew to over 3 GB. With offshore teams and slower internet as well as storage/bandwidth prices, one can see how things may become complicated very quickly.
Hope this helps someone.
--Rex

Migrating a schema from one database to other

As part of some requirement, I need to migrate a schema from some existing database to a new schema in a different database. Some part of it is already done and now I need to compare the 2 schema and make changes in the new schema as per gap finding.
I am not using a tool and was trying to understand some details using syscat command but could not get much success.
Any pointer on what is the best way to solve this?
Regards,
Ramakant
A tool really is the best way to solve this – IBM Data Studio is free and can compare schemas between databases.
Assuming you are using DB2 for Linux/UNIX/Windows, you can do a rudimentary compare by looking at selected columns in SYSCAT.TABLES and SYSCAT.COLUMNS (for table definitions), and SYSCAT.INDEXES (for indexes). Exporting this data to files and using diff may be the easiest method. However, doing this for more complex structures (tables with range or database partitioning, foreign keys, etc) will become very complex very quickly as this information is spread across a lot of different system catalog tables.
An alternative method would be to extract DDL using the db2look utility. However, you can't specify the order that db2look outputs objects (db2look extracts DDL based on the objects' CREATE_TIME), so you can't extract DDL for an entire schema into a file and expect to use diff to compare. You would need to extract DDL into a separate file for each table.
Use SchemaCrawler for IBM DB2, a free open-source tool that is designed to produce text output that is designed to be diffed. You can get very detailed information about your schema, including view and stored procedure definitions. All of the information that you need will be output in a single file, and can be compared very easily using a standard diff tool.
Sualeh Fatehi, SchemaCrawler
unfortunately as per company policy, cannot use these tools at this point of time. So am writing some program using JDBC to get the details and do some comparison kind of stuff.

Amazon DynamoDB Item Size?

I'm just exploring the whole NoSQL concept. I've been playing around with Amazons DynamoDB and I really like the concept. However that said I am not quite sure how the data should be separated. By this I mean should I create a new Table for related data features like you would in a relational database or do I use a single table to store all the applications data?
As an example, in a relational DB I might have a table called users and a table called users_details. I would then for example, create a 1:1 relationship between the two tables. With the NoSQL concept I could theoretically create two tables as well but it strikes me as more efficient to have all the data in a single table.
If that is the case then when do you stop? Is the idea to store all the application data for a given user in a single table?
First ask yourself: why did I separate the users from the user details in RDBMS in the first place.
On a more general note, when talking about NoSQL you really shouldn't look at relationships between tables. You shouldn't think about "joining" information from different tables but rather prepare your data in a way that can be retrieved optimally.
In the user/user_details scenario, put all information in a users table, and query what you want by specifying the attributes to get.

Q's on pgsql

I'm a newbie to pgsql. I have few questionss on it:
1) I know it is possible to access columns by <schema>.<table_name>, but when I try to access columns like <db_name>.<schema>.<table_name> it throwing error like
Cross-database references are not implemented
How do I implement it?
2) We have 10+ tables and 6 of have 2000+ rows. Is it fine to maintain all of them in one database? Or should I create dbs to maintain them?
3) From above questions tables which have over 2000+ rows, for a particular process I need a few rows of data. I have created views to get those rows.
For example: a table contains details of employees, they divide into 3 types; manager, architect, and engineer. Very obvious thing this table not getting each every process... process use to read data from it...
I think there are two ways to get data SELECT * FROM emp WHERE type='manager', or I can create views for manager, architect n engineer and get data SELECT * FROM view_manager
Can you suggest any better way to do this?
4) Do views also require storage space, like tables do?
Thanx in advance.
Cross Database exists in PostGreSQL for years now. You must prefix the name of the database by the database name (and, of course, have the right to query on it). You'll come with something like this:
SELECT alias_1.col1, alias_2.col3 FROM table_1 as alias_1, database_b.table_2 as alias_2 WHERE ...
If your database is on another instance, then you'll need to use the dblink contrib.
This question doe not make sens. Please refine.
Generally, views are use to simplify the writing of other queries that reuse them. In your case, as you describe it, maybe that stored proceudre would better fits you needs.
No, expect the view definition.
1: A workaround is to open a connection to the other database, and (if using psql(1)) set that as your current connection. However, this will work only if you don't try to join tables in both databases.
1) That means it's not a feature Postgres supports. I do not know any way to create a query that runs on more than one database.
2) That's fine for one database. Single databases can contains billions of rows.
3) Don't bother creating views, the queries are simple enough anyway.
4) Views don't require space in the database except their query definition.