My requirement is to be able to bulk insert data into multiple tables (which have foreign key constraints between them). I won't get the foreign key values until I submit the batch to the parent table. How can I achieve this using bulk copy?
I cannot use linked servers or OPENROWSET due to policy constraints.
You may be able to get away with turning off the constraints temporarily during the bulk load.
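A rough sketch of that approach (assuming a hypothetical child table Orders whose foreign key points at Customers; adapt the names to your schema):

ALTER TABLE Orders NOCHECK CONSTRAINT ALL;
-- run SqlBulkCopy / BULK INSERT against Orders here
ALTER TABLE Orders WITH CHECK CHECK CONSTRAINT ALL;

The WITH CHECK CHECK CONSTRAINT step re-validates the loaded rows, so it will fail if any row still lacks a matching parent.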
I'm using the Oracle foreign data wrapper and would like to keep local copies of some of my foreign tables. Is there another option besides materialized views that I refresh manually?
Not really, unless you want to add functionality in Oracle:
If you add a trigger on the Oracle table that records all data modifications in another table, you could define a foreign table on that table. Then you can regularly run a function in PostgreSQL that takes the changes since you checked last time and applies them to a PostgreSQL table.
If you understand how “materialized view logs” work in Oracle (I don't, and I think the documentation doesn't say), you could define a foreign table on that log and use it as above. That might be cheaper.
Both of these ideas would still require you to regularly run something in PostgreSQL, but they might be cheaper than a full refresh of a materialized view. Perhaps (if you have the money) you could use Oracle Heterogeneous Services to modify a PostgreSQL table whenever something changes in an Oracle table.
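As a very rough sketch of the first idea on the PostgreSQL side (the Oracle trigger and the change-log table are not shown; ora_change_log is assumed to be a foreign table over that log, local_tab the local copy, and sync_state a one-row bookkeeping table, all hypothetical names), a periodic apply function could look like this:

CREATE OR REPLACE FUNCTION apply_oracle_changes() RETURNS void
LANGUAGE plpgsql AS $$
DECLARE
    cutoff timestamp;
BEGIN
    -- take the high-water mark from the change log itself, not from now(), to avoid clock skew
    SELECT max(changed_at) INTO cutoff FROM ora_change_log;
    IF cutoff IS NULL THEN
        RETURN;  -- nothing logged yet
    END IF;

    -- copy rows recorded since the last run (this sketch only handles inserts;
    -- updates and deletes would need the log to record the operation type)
    INSERT INTO local_tab (id, val)
    SELECT l.id, l.val
    FROM   ora_change_log l, sync_state s
    WHERE  l.changed_at > s.last_sync
      AND  l.changed_at <= cutoff;

    UPDATE sync_state SET last_sync = cutoff;
END;
$$;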
I'm currently working on dumping one of our customers' databases in a way that allows us to create new databases from this customer's basic structure, but without bringing along their private data.
So far, I've had success with pg_dump combined with the --exclude-table and --exclude-table-data options, which allowed me to bring only the data I'll effectively need for this task.
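Roughly, the invocation looks like this (the table names, file name and database name are just placeholders):

pg_dump -Fc --exclude-table=internal_notes --exclude-table-data=audit_log -f customer.dump customerdb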
However, there are a few tables that mix rows referencing some of the data I left behind with other rows referencing data I had to bring, and this is causing me a few issues during the restore operation. Specifically, when the dump tries to enforce FOREIGN KEY constraints on certain columns of these tables, it fails because some rows have keys with no matching data in the respective foreign table - because I chose not to bring that table's data!
I know I can log into the database after the dump is complete, delete any rows that reference data that no longer exists and create the constraint myself, but I'd like to automate the process as much as possible. Is there a way to tell pg_dump or pg_restore (or any other program) not to bring rows from table A if they reference table B and table B's data was excluded from the backup? Or to tell Postgres that I'd like to have that specific foreign key be active before importing the table's data?
For reference, I'm working with PostgreSQL 9.2 on a RHEL 7 server.
What if you disable foreign key checking when you restore your database dump, and after that remove the orphaned rows from the referencing tables?
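Roughly like this, for a data-only restore, with hypothetical tables child (referencing parent via parent_id) and parent (whose data was excluded); --disable-triggers needs superuser rights:

pg_restore --data-only --disable-triggers -d targetdb customer.dump

-- afterwards, remove rows whose referenced parent row was not restored
DELETE FROM child c
WHERE c.parent_id IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM parent p WHERE p.id = c.parent_id);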
By the way, I recommend fixing your database schema so that there is no chance of wrong tuples being inserted into your database.
How does PostgreSQL handle multiple concurrent requests to foreign tables?
If two data consumers want to access the same foreign table, do they have to wait and execute their queries sequentially, or is query concurrency supported?
The following answer is mostly for the foreign data wrapper for PostgreSQL, postgres_fdw.
If you need information about other foreign data wrappers, that will vary with the implementation of the foreign data wrapper and the capabilities of the underlying data store. For example, to have concurrent (read) requests with file_fdw, you need a file system that allows two processes to open the file for reading simultaneously.
Concurrency of queries against the same foreign table is just like for local tables. It is the remote server that handles the SQL statements, locks modified rows until the transaction finishes, and similar.
So there can be arbitrarily many concurrent readers, and readers won't block writers and vice versa.
If you run UPDATEs or DELETEs with WHERE conditions that cannot be pushed down to the foreign server (check the execution plan), it can happen that you take more locks than when using a local table.
Imagine a query like this:
UPDATE remote_tab SET col = 0 WHERE <complicated condition that is true for only one row>;
On a local table, this would only lock a single row.
If the condition is too complicated to be pushed down to the foreign server, postgres_fdw will first run a query like this:
SELECT ctid, col FROM remote_tab FOR UPDATE;
That will retrieve and lock all rows of the table.
Then the WHERE condition will be applied locally, and the resulting row is updated on the foreign server:
UPDATE remote_tab SET col = 0 WHERE ctid = ...;
So in this case, concurrency and performance can suffer quite a lot.
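To see whether a condition gets pushed down, the verbose execution plan is enough; a quick check, reusing the remote_tab example above (the condition is a placeholder):

EXPLAIN (VERBOSE, COSTS OFF)
UPDATE remote_tab SET col = 0 WHERE <complicated condition>;
-- the "Remote SQL:" line shows the statement sent to the foreign server;
-- if the WHERE clause does not appear there, it is evaluated locally after a FOR UPDATE scan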
I have a database with around 20 normalized tables, and there are a lot of foreign key constraints between these tables. The user table, for instance, is referenced by 3 other tables, and the primary keys of those 3 tables are in turn referenced by other tables, whose primary keys are referenced by yet other tables.
Now, I need to implement functionality for archiving a user, which involves deleting the row from the user table along with all its direct and indirect linkages, while at the same time serializing everything into a JSON structure and dumping it into an archive table, so that if the need ever arises to unarchive the user, we can do that. I was thinking of writing a function for this, since it makes a lot more sense to do the entire activity in a single transaction. Is there a better and easier alternative for doing such a thing?
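To make this concrete, here is a rough sketch of what I had in mind, with hypothetical tables users and orders (orders referencing users with ON DELETE CASCADE), an archive table user_archive(user_id, payload, archived_at), and the jsonb functions of PostgreSQL 9.5+; the real version would have to cover every referencing table:

CREATE OR REPLACE FUNCTION archive_user(p_user_id integer) RETURNS void
LANGUAGE plpgsql AS $$
BEGIN
    -- serialize the user row and its direct linkages into one JSON document
    INSERT INTO user_archive (user_id, payload, archived_at)
    SELECT u.id,
           jsonb_build_object(
               'user',   to_jsonb(u),
               'orders', (SELECT coalesce(jsonb_agg(to_jsonb(o)), '[]'::jsonb)
                          FROM orders o WHERE o.user_id = u.id)
           ),
           now()
    FROM users u
    WHERE u.id = p_user_id;

    -- remove the live rows; ON DELETE CASCADE removes the linked rows
    -- (without cascading FKs, the child rows would have to be deleted explicitly first)
    DELETE FROM users WHERE id = p_user_id;
END;
$$;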
Currently, in Oracle, I run a procedure monthly to delete some data. For performance reasons I have used BULK COLLECT and FORALL .. DELETE to perform the deletes.
Does anyone know if there is anything similar in Postgres? Do I need to be concerned about performance if I use the following to delete a lot of data?
DELETE FROM sample WHERE id IN (SELECT id FROM test);
Use WHERE EXISTS, not WHERE IN.
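For the query above, that would look roughly like:

DELETE FROM sample s
WHERE EXISTS (SELECT 1 FROM test t WHERE t.id = s.id);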
Otherwise, it should be fine so long as sample isn't the target of any foreign key references. If it is, you'll need indexes on the referencing columns.
For really big deletes on foreign keys with ON DELETE CASCADE, it can be preferable to do a join to delete the referring side in a batch, then delete the referred-to side. That helps prevent millions of individual DELETE statements from having to run for the cascade.
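A rough sketch of that pattern, assuming a hypothetical child table sample_detail that references sample(id) via sample_id:

-- delete the referencing rows in one statement instead of one cascade per parent row
DELETE FROM sample_detail d
USING sample s
WHERE d.sample_id = s.id
  AND EXISTS (SELECT 1 FROM test t WHERE t.id = s.id);

-- then delete the parent rows
DELETE FROM sample s
WHERE EXISTS (SELECT 1 FROM test t WHERE t.id = s.id);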