Create foreign key to table that doesn't exist in PostgreSQL

I've also asked this question on DBA Stack Exchange, but I'm not sure if that is the right place.
I'm trying to put together a solution where I have a bunch of script files that contain my schema creation. Something similar to Visual Studio's SQL Project, but for Postgres.
My problem is that the files won't necessarily be read in the correct order; for instance, the first file/script/table read may have a dependency on the second file/script/table.
I was hoping there was some way to disable the check that the referenced table/column exists, create all the tables, and then re-enable the check.
I'm running PostgreSQL 9.2.24.
My second solution would be to order the files in such a way that a table is only created when its dependencies already exist, but the method mentioned above would be preferred.
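One common workaround, sketched here with hypothetical parent/child tables, is to keep the REFERENCES clauses out of the CREATE TABLE scripts entirely and add every foreign key in a final pass with ALTER TABLE, so that file order no longer matters:

    -- Tables can be created in any order because they carry no foreign keys yet.
    CREATE TABLE child (
        id        integer PRIMARY KEY,
        parent_id integer
    );

    CREATE TABLE parent (
        id integer PRIMARY KEY
    );

    -- Once every table exists, wire up the constraints in a final script.
    ALTER TABLE child
        ADD CONSTRAINT child_parent_fk
        FOREIGN KEY (parent_id) REFERENCES parent (id);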

Related

Idempotency and foreign key constraints when using Postgres pg_dump files with Flyway

I am trying to use some pg_dump generated migration scripts with Flyway. The first migration script is for schema only. The other migration scripts load seed data into various tables using the Postgres COPY command. These seed-data scripts are going to exist as Flyway repeatable migration scripts. This setup presents two issues.
When Flyway loads the seed data from the migration scripts, I'm getting foreign key constraint violations since the various tables are not being seeded in the correct order. There are a large number of tables to deal with, so is there an easy way to work around this so that I don't have to reorder my COPY commands by hand?
Since the seed data is going to be in repeatable migration scripts, these need to be idempotent. Is there a way to do this with the Postgres COPY command? I'm trying to avoid having to convert this to INSERTs since it will hurt performance and also make my migration files huge.
The trick here for idempotency is to delete the existing data from the tables in the correct dependency order, and, when you've done that, to likewise load the data in the correct dependency order. The correct dependency order for deleting the data is worked out by obtaining the target tables for every foreign key constraint and ensuring that no data is ever deleted from a table while it is still the target of a table whose data is yet to be deleted. This list of tables in dependency order is usually called a 'manifest' and is also required for CREATE statements and for COPY. The public-domain, PowerShell-based Flyway Teamworks framework will create the manifest for you.
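As a sketch of the first step, the foreign-key edges that the manifest is built from can be pulled straight from the catalogs; a topological sort over these pairs then yields the dependency order (the query itself is illustrative, not part of Flyway Teamworks):

    SELECT conrelid::regclass  AS referencing_table,
           confrelid::regclass AS referenced_table
    FROM   pg_constraint
    WHERE  contype = 'f';

Deleting in reverse manifest order and COPYing in forward manifest order then avoids the constraint violations.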

How to implement schema migration for PostgreSQL database

I need to implement schema migration mechanism for PostgreSQL.
Just to remove ambiguity: by schema migration I mean that I need to upgrade my database structures to the latest version regardless of their current state on a particular server instance.
For example, in version one I created some tables, then in version two I renamed some columns, and in version three I removed one table and created another one. I have multiple servers, and some of them are on version one, some on version three, etc.
My idea:
Generate a hash of the output produced by
pg_dump --schema-only
every time before I change my database schema. This will be a reliable way to identify the database version to which a patch should apply.
Keep a list of patches with the associated hashes to which they should apply.
When I need to upgrade my database, I will run an application that searches for the hash corresponding to the current database structure (by calculating the hash of the local database and comparing it with the hash set that I have) and applies the associated patch.
Repeat until no matching hash is found.
Could you please point any weak sides of this approach?
Have you ever heard of https://pgmodeler.io? At the company where I work, we decided to go with it since it can perform a schema diff even between a local and a remote database. We are very satisfied with it.
Otherwise, if you prefer a free solution, you could develop a migration tool that applies migrations you store in a single repo. Furthermore, this tool could rely on a migration table kept in a separate schema, so that your DB(s) always know which migrations have been applied (see the sketch below).
The beauty of this approach is that migrations can cover both schema changes and data changes.
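A minimal sketch of such a tracking table, assuming a reasonably recent PostgreSQL and a hypothetical migrations schema and filename convention:

    CREATE SCHEMA IF NOT EXISTS migrations;

    CREATE TABLE IF NOT EXISTS migrations.applied_migration (
        id         serial PRIMARY KEY,
        filename   text NOT NULL UNIQUE,        -- e.g. 'V003__rename_columns.sql'
        applied_at timestamptz NOT NULL DEFAULT now()
    );

The tool would simply skip any file whose name is already recorded here and insert a row after each migration it applies successfully.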
I hope this can give you some ideas.

I need help understanding the syntax for BACKUP in MySQL Workbench

So, I am relatively new to MySQL, and recently I was asked to create a query that utilizes the BACKUP command in order to copy a table to a given destination folder. I was provided text from an SQL tutorial on w3schools.com; however, when I attempted to follow the format, I was informed that "BACKUP is not valid at this position, expecting: EOF, BEGIN, CATCH, CHECKSUM, COMMIT, DEALLOCATE,..". So I was wondering: what is the proper syntax for using the BACKUP command in a query?
I have attempted several actions in order to resolve the issue, but none of them were successful. I have tried:
1. Executing the query with and without the underlying table saved in a file folder.
2. Using BACKUP for a database, in case the problem was with tables.
3. Starting with BEGIN, DO, and mysqldump.
4. Removing TABLE.
5. Adding an opening parenthesis after the name of the table and a closing parenthesis after the name of the destination.
I do not feel comfortable sharing my own table and destination folder, but here is what I was supposed to use for reference; my code follows the same format:
[Screenshot: the w3schools BACKUP DATABASE example]
BACKUP DATABASE is not part of MySQL syntax. I believe you may be thinking of the SQL Server statement.
For MySQL you will likely want to use the mysqldump utility (which is a command-line tool, not an SQL statement), or possibly some solution involving the SELECT ... INTO OUTFILE variant of the SELECT ... INTO statement.
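As a sketch of the latter, assuming a hypothetical table my_table and that the server's secure_file_priv setting permits writing to the target directory (note the file is written on the server host, not the client):

    SELECT *
    INTO OUTFILE '/var/lib/mysql-files/my_table_backup.csv'
    FIELDS TERMINATED BY ',' ENCLOSED BY '"'
    LINES TERMINATED BY '\n'
    FROM my_table;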

PostgreSQL: update a schema when views from another schema depend on it

Here is my setup. I have two schemas: my_app and static_data. The latter is imported from a static dump. For the needs of my application logic, I made views that use the tables of static_data, and I stored them in the my_app schema.
It all works great. But I need to update the static_data schema with a new dump, and have my views use the new data. The problem is, whatever I do, my views will always reference the old schema!
I tried importing the new dump in a new schema, static_data_new, then trying to delete static_data and rename static_data_new to static_data. It doesn't work because my views depend on tables in static_data, therefore PostgreSQL won't let me delete it.
Then I tried setting search_path to static_data_new. But when I do that, the views still reference the old tables!
Is it possible to have views that reference tables using the search_path? Thanks.
Views are bound to the underlying objects. Renaming the object does not affect this link.
I see basically 3 different ways to deal with your problem:
DROP your views and re-CREATE them after you have your new table(s) in place. Simple and fast, as soon as you have your complete create script together. Don't forget to reset privileges, too. The recreate script may be tedious to compile, though.
Use table functions (functions RETURNING SETOF rows or RETURNING TABLE) instead of views. That way you get "late binding": the object names will be looked up in the system catalogs at execution time, not at creation time. It will be your responsibility that those objects can, in fact, be found.
The search_path can be pre-set per function; otherwise the search_path of the executing role will be in effect for objects that are not explicitly schema-qualified. Detailed instructions and links in this related answer on SO.
Functions are basically like prepared statements and behave subtly differently from views. Details in this related answer on dba.SE.
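A minimal sketch of this option in PL/pgSQL, assuming a hypothetical table cities(id integer, name text); because the function body is only parsed at execution time, cities is resolved through whatever search_path is in effect for the call:

    CREATE OR REPLACE FUNCTION my_app.cities()
      RETURNS TABLE (id integer, name text)
      LANGUAGE plpgsql STABLE AS
    $$
    BEGIN
      -- 'cities' is looked up via the current search_path on every call
      RETURN QUERY SELECT c.id, c.name FROM cities c;
    END
    $$;

After importing the new dump, SET search_path = static_data_new, public; makes SELECT * FROM my_app.cities(); read the new tables without touching the function.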
Take the TRUNCATE and INSERT route for the new data instead of DROP and CREATE. Then all references stay intact. Find a more detailed answer about that here.
If foreign keys reference your table, you have to use DELETE FROM instead, or drop and re-create the foreign key constraints. It will be your responsibility that referential integrity can be restored, or the re-creation of the foreign key will fail.
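A sketch of this route, assuming a hypothetical table static_data.cities being reloaded from a dump file:

    BEGIN;
    -- TRUNCATE would fail here if other tables reference cities via foreign keys
    DELETE FROM static_data.cities;
    COPY static_data.cities FROM '/path/to/new_dump/cities.csv' WITH (FORMAT csv);
    COMMIT;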

Data Warehousing Postgres

We're considering using SSIS to maintain a PostgreSQL data warehouse. I've used it before between SQL Servers with no problems, but am having a lot of difficulty getting it to play nicely with Postgres. I'm using the evaluation version of the OLEDB PGNP data provider (http://www.postgresql.org/about/news.1004).
I wanted to start with something simple like UPSERT on the fact table (10k-15k rows are updated/inserted daily), but this is proving very difficult (not to mention I’ll want to use surrogate keys in the future).
I’ve attempted (Link) and (http://consultingblogs.emc.com/jamiethomson/archive/2006/09/12/SSIS_3A00_-Checking-if-a-row-exists-and-if-it-does_2C00_-has-it-changed.aspx) which are effectively the same (except I don’t really understand the union all at the end when I’m trying to upsert) But I run into the same problem with parameters when doing the update using a OLEDb command – which I tried to overcome using (http://technet.microsoft.com/en-us/library/ms141773.aspx) but that just doesn’t seem to work, I get a validation error –
The external columns for complent.... are out of sync with the datasource columns... external column “Param_2” needs to be removed from the external columns.
(this error is repeated for the first two parameters as well; I never came across this using the SQL connection, as it supports named parameters)
Has anyone come across this?
AND:
The fact that this simple task is apparently so difficult to do in SSIS suggests I'm using the wrong tool for the job. Is there a better (and still flexible) way of doing this? Or would another ETL package be better for use between two Postgres databases? Other options include any listed on (http://en.wikipedia.org/wiki/Extract,_transform,_load#Open-source_ETL_frameworks). I could just go and write a load of SQL to do this for me, but I wanted a neat and easily maintainable solution.
I have used the Slowly Changing Dimension wizard for this with good success. It may give you what you are looking for:
http://msdn.microsoft.com/en-us/library/ms141715.aspx
On the "external columns out of sync" error: SSIS is case-sensitive. I encountered this issue multiple times and it makes me want to pull my hair out.
This simple task is going to take some work either way. SSIS is by no means an enterprise-class ETL product yet, but it does give you some quick and easy functionality, and is sufficient for most ETL work. I guess it is also about your level of comfort with it as well.
SCD is way too slow for what I want. I need to use set-based SQL.
It turned out that a lot of my problems were with bugs in the provider.
I opened a forum topic (http://www.pgoledb.com/forum/viewtopic.php?f=4&t=49) and had a useful discussion with the moderator/support/developer person.
Also, Postgres doesn't let you do cross-database queries, so I solved the problem this way:
1. Data Source from the production DB to a temp archive-DB table
2. Run a set-based query between the temp table and the archive table (see the sketch below)
3. Truncate the temp table
Note that the temp table is not actually a temp table, but a copy of the archive table's schema used to temporarily store data.
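A sketch of step 2 as a set-based upsert, with hypothetical staging_fact and archive_fact tables keyed on id (ON CONFLICT did not exist before PostgreSQL 9.5, so update-then-insert is used):

    -- Update rows that already exist in the archive
    UPDATE archive_fact a
    SET    amount = s.amount
    FROM   staging_fact s
    WHERE  a.id = s.id;

    -- Insert rows that are new
    INSERT INTO archive_fact (id, amount)
    SELECT s.id, s.amount
    FROM   staging_fact s
    WHERE  NOT EXISTS (SELECT 1 FROM archive_fact a WHERE a.id = s.id);

    TRUNCATE staging_fact;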
Took a while, but I got there in the end.
"SSIS is by no means an enterprise-class ETL product yet" – what enterprise ETL solution would you suggest?