How can I know when a set of constraints was added to a specific table? - postgresql

I am working with a table that has many constraints.
I am receiving errors while importing clean data into the table.
I want to know at what time these constraints were added to the table, so that I can tell whether it was before or after the bulk import into that table.
How can I find out when a constraint was added to a table (the creation date of the constraint)?
I use PostgreSQL 10.

PostgreSQL doesn't record this information in the metadata.
If you have log_statement = 'ddl', you might find the information in the log file.
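For example, to check the current setting and turn on DDL logging going forward (requires superuser privileges, and it cannot recover DDL that ran before logging was enabled):
SHOW log_statement;
ALTER SYSTEM SET log_statement = 'ddl';
SELECT pg_reload_conf();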

Related

How does postgresql manage added or changed columns involved in table publication?

I've managed to CREATE PUBLICATION (and a corresponding SUBSCRIPTION) for a set of tables in my database. It's all working wonderfully. Now someone wants to add a column to one of the base tables (ayayay), and I can't seem to find any documentation on how to (most easily) manage this situation; I'm hoping someone can lead me in the right direction.
Adding a column is easy:
First, add the column on the standby. That won't break replication; the new column will remain empty.
Then, add the column on the primary.
If you change the set of tables that is replicated, don't forget to run
ALTER SUBSCRIPTION ... REFRESH PUBLICATION;
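For illustration, with hypothetical table and subscription names the sequence would be:
-- on the standby (subscriber) first
ALTER TABLE mytable ADD COLUMN new_col integer;
-- then on the primary (publisher)
ALTER TABLE mytable ADD COLUMN new_col integer;
-- only needed if the set of replicated tables changed
ALTER SUBSCRIPTION mysub REFRESH PUBLICATION;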

Postgres parallel/efficient load huge amount of data psycopg

I want to load many rows from a CSV file.
The files contain data like "article_name,article_time,start_time,end_time".
There is a constraint on the table: for the same article name, I don't insert a new row if the new article_time falls in an existing range [start_time, end_time] for the same article.
i.e.: don't insert row y if there exists a range [start_time_x, end_time_x] such that article_time_y is inside [start_time_x, end_time_x], with article_name_y = article_name_x.
I tried with psycopg by selecting the existing article names and checking manually if there is an overlap --> too long.
I tried again with psycopg, this time by setting an 'exclude using ...' constraint and trying to insert with "on conflict do nothing" specified (so that it does not fail), but still too long.
I tried the same thing, but this time inserting many values per call to execute (psycopg): it got a little better (1M rows processed in almost 10 minutes), but still not as fast as it needs to be for the amount of data I have (500M+).
I tried to parallelize by calling the same script many times on different files, but the timing didn't get any better, I guess because of the locks on the table each time we want to write something.
Is there any way to create a lock only on rows containing the same article_name (and not a lock on the whole table)?
Could you please help with any idea to make this parallelizable and/or more time efficient?
Lots of thanks folks
Your idea with the exclusion constraint and INSERT ... ON CONFLICT is good.
You could improve the speed as follows:
Do it all in a single transaction.
Like Vao Tsun suggested, maybe COPY the data into a staging table first and do it all with a single SQL statement.
Remove all indexes except the exclusion constraint from the table where you modify data and re-create them when you are done.
Speed up insertion by disabling autovacuum and raising max_wal_size (or checkpoint_segments on older PostgreSQL versions) while you load the data.
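A minimal sketch of the staging-table approach (table, column and file names are assumptions, not taken from the question):
-- unlogged staging table with no indexes, filled with COPY
CREATE UNLOGGED TABLE staging (article_name text, article_time timestamptz, start_time timestamptz, end_time timestamptz);
COPY staging FROM '/path/to/articles.csv' WITH (FORMAT csv);
-- single INSERT into the target table; rows violating the exclusion constraint are skipped
INSERT INTO articles (article_name, article_time, start_time, end_time)
SELECT article_name, article_time, start_time, end_time FROM staging
ON CONFLICT DO NOTHING;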

Alter a SELECT Query to Limit

I'm working on a 1M+ row table. The software that inserts the data sometimes tries to select all rows. If it tries to do that, it crashes.
I'm not able to modify the software, so I'm trying to implement a fix on the PostgreSQL side.
I want PostgreSQL to limit the results of SELECT queries coming from a specific user to 1 row.
I tried to implement a RULE but haven't been able to do it with success. Any suggestions are welcome.
Br,
You could rename the table and create a view with the name of the table (selecting from the renamed table).
Then you can include a LIMIT clause in the view definition.
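A minimal sketch, assuming the table is called bigtable:
ALTER TABLE bigtable RENAME TO bigtable_real;
CREATE VIEW bigtable AS SELECT * FROM bigtable_real LIMIT 1;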
There is a chance you need an index. Let me give you a few scenarios:
There is a unique constraint on one of the fields but no corresponding index. In that case, when you insert a record, PostgreSQL has to scan the table to see if there is an existing record with the same value in that field.
Your software mimics a unique field constraint. Before inserting a new record, it scans the table for a record with the same value in one of the fields to check whether such a record already exists. An index on the right field would definitely help.
Your software wants to compute the next "id" value. In this case it runs SELECT MAX(id) to find the next available value. "id" needs an index.
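In any of these scenarios the fix is a single index; for example, for the SELECT MAX(id) case (table and column names are hypothetical):
CREATE INDEX ON mytable (id);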
Try to find out if indexing one of the table fields helps. You can also try to trace and analyze the queries submitted to the server and see whether those queries could benefit from an index. You can enable query logging as described in How to log PostgreSQL queries?
Another guess is that your software buffers all records before processing them. Reading 1M records into memory may crash it. Limiting the fetch size (e.g. if your software uses JDBC, you could add the defaultRowFetchSize connection parameter to the connection string) may help, though I realize you may not have the means to change the way the existing software fetches data from the DB.
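For example, with the PostgreSQL JDBC driver the fetch size can be set directly in the connection string (host and database names are placeholders); note that pgJDBC only honours it when autocommit is off:
jdbc:postgresql://dbhost/mydb?defaultRowFetchSize=1000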

Using IMPORT instead of LOAD in DB2

I wanted to prepare a load utility to load data into a DB2 table. The table has columns defined with the GENERATED ALWAYS attribute.
So I am not able to load data that was previously unloaded from the table.
Is it possible to use import for tables having columns with GENERATEDALWAYS set?
Steps I did:
1. db2 "export to tbl.txt of del modified by coldel| select * from <schema.table> where col=value"
2. db2 "delete from <schema.table> where col=value"
3. db2 "import from tbl.txt of del modified by coldel| allow write access warningcount1 insert into <schema.table>"
The columns with GENERATED ALWAYS have new values after the import. Is it possible to use import to populate the GENERATED ALWAYS columns with the old values?
Appreciate the assistance.
Thanks,
Mathew Liju
What you are asking is not possible. With IMPORT you can't override columns that have GENERATED ALWAYS. As @Peter Miehle suggests, you could alter the table to specify that the column is GENERATED BY DEFAULT, but this may break other applications.
Your question's title implies that you don't want to use the LOAD utility (but you don't mention anything about it in the actual question). However, LOAD is the only way to write data into the table and maintain the values for the generated column as they exist in the file:
db2 "load from tbl.txt of del modified by generatedoverride insert into schema.table"
If you do this, be aware that:
DB2 does not check if there are conflicts with existing rows in the table. You would need to define a unique index on the column(s) in question to resolve this; this would cause DB2 to delete the rows that you just loaded in the DELETE phase of the load.
If your generated column(s) are using IDENTITY, make sure that you alter the column to ensure that future generated values do not conflict with the rows that you just inserted into the table.
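For example, to resynchronize an identity column after the load (table, column and starting value are hypothetical):
db2 "alter table <schema.table> alter column id restart with 1000"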
Maybe you can drop the "generation" from the column and add it back, with the appropriate values, after importing.
@Ian Bjorhovde has given you the options.
IMPORT actually does INSERTs in the background - i.e., it first prepares an INSERT statement with parameter markers and uses the values in the input file for those markers.
In your SQL snapshot you will see the INSERT statement that is used.
Anything that is not possible in an INSERT statement isn't possible with IMPORT (more or less).
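For example, for a three-column table the statement IMPORT prepares behind the scenes would look something like this (column names are hypothetical):
INSERT INTO <schema.table> (col1, col2, col3) VALUES (?, ?, ?)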

PostgreSQL v7.4 ALTER TABLE to change column

I have a need to change the length of CHAR columns in tables in a PostgreSQL v7.4 database. This version did not support the ability to directly change the column type or size using the ALTER TABLE statement. So, directly altering a column from a CHAR(10) to CHAR(20) for instance isn't possible (yeah, I know, "use varchars", but that's not an option in my current circumstance). Anyone have any advice/tricks on how to best accomplish this? My initial thoughts:
-- Save the table's data in a new "save" table.
CREATE TABLE save_data AS SELECT * FROM table_to_change;
-- Drop the columns from the first column to be changed on down.
ALTER TABLE table_to_change DROP column_name1; -- for each column starting with the first one that needs to be modified
ALTER TABLE table_to_change DROP column_name2;
...
-- Add the columns back, using the new size for the CHAR column
ALTER TABLE table_to_change ADD column_name1 CHAR(new_size); -- for each column dropped above
ALTER TABLE table_to_change ADD column_name2...
-- Copy the data back from the "save" table
UPDATE table_to_change
SET column_name1=save_data.column_name1, -- for each column dropped/re-added above
column_name2=save_data.column_name2,
...
FROM save_data
WHERE table_to_change.primary_key=save_data.primary_key;
Yuck! Hopefully there's a better way? Any suggestions appreciated. Thanks!
Not PostgreSQL, but in Oracle I have changed a column's type by:
Add a new column with a temporary name (e.g. TMP_COL) and the new data type (e.g. CHAR(20))
run an update query: UPDATE TBL SET TMP_COL = OLD_COL;
Drop OLD_COL
Rename TMP_COL to OLD_COL
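The same sequence should also work with PostgreSQL 7.4 syntax (table and column names are placeholders):
ALTER TABLE tbl ADD COLUMN tmp_col CHAR(20);
UPDATE tbl SET tmp_col = old_col;
ALTER TABLE tbl DROP COLUMN old_col;
ALTER TABLE tbl RENAME COLUMN tmp_col TO old_col;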
I would dump the table contents to a flat file with COPY, drop the table, recreate it with the correct column setup, and then reload (with COPY again).
http://www.postgresql.org/docs/7.4/static/sql-copy.html
Is it acceptable to have downtime while performing this operation? Obviously what I've just described requires making the table unusable for a period of time, how long depends on the data size and hardware you're working with.
Edit: But COPY is quite a bit faster than INSERTs and UPDATEs. According to the docs you can make it even faster by using BINARY mode. BINARY makes it less compatible with other PGSQL installs but you won't care about that because you only want to load the data to the same instance that you dumped it from.
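A rough sketch of that approach (the file path is a placeholder):
COPY table_to_change TO '/tmp/table_to_change.dat';
DROP TABLE table_to_change;
-- recreate the table with the new CHAR(20) column definitions, then:
COPY table_to_change FROM '/tmp/table_to_change.dat';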
The best approach to your problem is to upgrade pg to something less archaic :)
Seriously. 7.4 is going to be removed from "supported versions" pretty soon, so I wouldn't wait for it to happen with 7.4 in production.