I'm wondering if I can use a trigger on a table to "ignore" columns that are in a COPY statement from STDIN but which are not in the target table. Sorry if the wording/syntax of the question is off, but here is and explanation of what I'm trying to say. I'm new to triggers so any advice is helpful.
I'm using the PostGIS Shapefile importer to copy shapefiles to the spatial tables in my PostgreSQL database.
This creates a COPY statement which contains all the fields in the shapefile something like:
COPY "public"."stations" ("column1","column2","column3","column4", geom) FROM stdin;
column1 and column2 are in the file but not in the target table, so the COPY fails.
Is there a way to create a trigger to create something that would have the same result as:
COPY "public"."stations" ("column3","column4", geom) FROM stdin;
No, you cannot skip columns that are present in the input file. This will error out, before triggers are even invoked. And you cannot use rules either. I quote the manual:
COPY FROM will invoke any triggers and check constraints on the
destination table. However, it will not invoke rules.
You can either edit the file or use a temporary staging table:
COPY to a temporary table with matching columns.
Use INSERT to write the desired columns to the final target table(s) - or the whole range of SQL DDL commands for more sophisticated matters.
Related
I have a data anonymization process that takes a production copy of a database and turns it into an anonymized copy by UPDATE-ing some columns.
Some of the tables contain several million rows so instead of UPDATE-ing the columns, which is very log intensive, I went down the way of
SELECT
Id,
CAST('Redacted' AS NVARCHAR(255)) [ColumnRequiringAnonymization]
INTO MyTable_New
FROM MyTable
EXEC sp_rename MyTable, MyTable_old
EXEC sp_rename MyTable_new, MyTable
DROP TABLE MyTable_old
The problem with this approach is that the "new" table no longer has any of the keys, indices and other dependent objects. I have figured out the keys and indices using SPs to generate the DROP and CREATE scripts. The SPs are based on manually written SQL as can be seen e.g. in this answer.
The next problem is that we have a schemabound view on top of this table, which has indices and a full-text index on its own. The number of SPs to generate scripts is growing and I am sure there will be mistakes.
Is there a way to completely script a table/view by using SQL commands only? ie. just like SSMS does when you click "Script table as - CREATE to" but within a stored procedure?
Right-click on the database, select Tasks; there is Generate Scripts there. Just follow prompts or Google for additional information.
I have a .csv file with ~300,000 rows, some of which violate certain constraints I set in my postgres database. Is there a way to copy my .csv file into the database and have postgres filter out the rows that violate the constraints? I do not want these rows to show up in the database.
If this is not possible, is there any other way to solve this problem?
what I'm doing right now is
COPY blocksequences from '/tmp/blocksequences.csv CSV HEADER;
And I get
'ERROR: new row for relation "blocksequences" violates check constraint "blocksequences_partid3_check"
DETAIL: Failing row contains (M001-M049-S186, M001, null, M049, S186).
CONTEXT: COPY blocksequences, line 680: "M001-M049-S186,M001,,M049,S186"
reason for the error: column that contains M049 is not allowed to have that string entered. Many other rows have violations like this.
I read a little about exception when check violation --do nothing am I on the right track here? seems like it's only a mysql thing maybe
Usually this is done in this way:
create a temporary table with the same structure as the destination one but without constraints,
copy data to the temporary table with COPY command,
copy rows that do fulfill constraints from temp table to the destination one, using INSERT command with conditions in the WHERE clause based on the table constraint,
drop the temporary table.
When dealing with really large CSV files or very limited server resources, use the extension file_fdw instead of temporary tables. It's much more efficient way but it requires server access to a CSV file (while copying to a temporary table can be done over the network).
In Postgres 12 you can use the WHERE clause in COPY FROM.
How to update table from csv file in PostgreSQL? (version 9.2.4)
Copy command is for insert. But I need to update table. How can I update table from csv file without temp table?
I don't want to copy to temp table from csv file and update table from temp table.
And no merge command like Oracle?
The simple and fast way is with a temporary staging table, like detailed in this closely related answer:
How to update selected rows with values from a CSV file in Postgres?
If you don't "want" that for some unknown reason, there are more ways:
A foreign data wrapper with file_fdw.
You can run UPDATE commands directly using this one.
pg_read_file(). For special use cases.
Details in this related answer:
Read data from a text file inside a trigger
There is no MERGE command in Postgres, even less for COPY.
Discussion about whether and how to add it is ongoing. Check out the Postgres Wiki for details.
I am trying to rename a table in db2 like so
rename table schema1.mytable to schema2.mytable
but getting the following error message:
the name "mytable" has the wrong number of qualifiers.. SQLCODE=-108,SQLSTATE=42601
what is the problem here.... I am using the exact syntax from IBM publib documentation.
You cannot change the schema of a given object. You have to recreate it.
There are severals ways to do that:
If you have only one table, you can export and import/load the table. If you use the IDX format, the DDL will be included in the generated file. If using another format, the table has be created.
You can recreate the table by using:
Create table schema2.mytable like schema1.mytable
You can extract the DDL with the db2look tool
If you are changing the schema name for a schema given, you can use ADMIN_COPY_SCHEMA
These last two options only create the table structure, and you still need to import the data. After having create the table, you insert the data by different ways:
Inserting directly
insert into schema2.mytable select * from schema1.mytable
Via load from cursor
Via a Load or import from file (The file exported in the previous step)
The problem is the foreign relations, because they have to be recreated.
Finally, you can create an alias. It is easier, and you do not have to deal with relations.
You can easily rename a table with this statement:
RENAME TABLE SCHEMA.TABLENAME TO NEWTABLENAME;
You're not renaming table in provided example, you're trying to move to different schema, it's not the same thing. Look into db2move tool for this.
if you want to rename a table in the same schema, you can use like this.
RENAME TABLE schema.table_name TO "new_table_name";
Otherwise, you can use tools like DBeaver to rename or copy tables in a db2 db.
What if you leave it as is and create an alias with the new name and schema.
Renaming a table means to rename a table within same schema .To rename in other schema ,db2 call its ALIAS:
db2 create alias for
I've a lot of records that are originally from MySQL. I massaged the data so it will be successfully inserted into PostgreSQL using ActiveRecord. This I can easily do with insertions on row basis i.e one row at a time. This is very slow I want to do bulk insert but this fails if any of the rows contains invalid data. Is there anyway I can achieve bulk insert and only the invalid rows failing instead of the whole bulk?
COPY
When using SQL COPY for bulk insert (or its equivalent \copy in the psql client), failure is not an option. COPY cannot skip illegal lines. You have to match your input format to the table you import to.
If data itself (not decorators) is violating your table definition, there are ways to make this a lot more tolerant though. For instance: create a temporary staging table with all columns of type text. COPY to it, then fix offending rows with SQL commands before converting to the actual data type and inserting into the actual target table.
Consider this related answer:
How to bulk insert only new rows in PostreSQL
Or this more advanced case:
"ERROR: extra data after last expected column" when using PostgreSQL COPY
If NULL values are offending, remove the NOT NULL constraint from your target table temporarily. Fix the rows after COPY, then reinstate the constraint. Or take the route with the staging table, if you cannot afford to soften your rules temporarily.
Sample code:
ALTER TABLE tbl ALTER COLUMN col DROP NOT NULL;
COPY ...
-- repair, like ..
-- UPDATE tbl SET col = 0 WHERE col IS NULL;
ALTER TABLE tbl ALTER COLUMN col SET NOT NULL;
Or you just fix the source table. COPY tells you the number of the offending line. Use an editor of your preference and fix it, then retry. I like to use vim for that.
INSERT
For an INSERT (like commented) the check for NULL values is trivial:
To skip a row with a NULL value:
INSERT INTO (col1, ...
SELECT col1, ...
WHERE col1 IS NOT NULL
To insert sth. else instead of a NULL value (empty string in my example):
INSERT INTO (col1, ...
SELECT COALESCE(col1, ''), ...
A common work-around for this is to import the data into a TEMPORARY or UNLOGGED table with no constraints and, where data in the input is sufficiently bogus, text typed columns.
You can then do INSERT INTO ... SELECT queries against the data to populate the real table with a big query that cleans up the data during import. You can use a lot of CASE statements for this. The idea is to transform the data in one pass.
You might be able to do many of the fixes in Ruby as you read the data in, then push the data to PostgreSQL using COPY ... FROM STDIN. This is possible with Ruby's Pg gem, see eg https://bitbucket.org/ged/ruby-pg/src/tip/sample/copyfrom.rb .
For more complicated cases, look at Pentaho Kettle or Talend Studio ETL tools.