PostgreSQL COPY FROM STDIN Expressions - postgresql

I am attempting to use COPY FROM STDIN to import data into my table. One of the columns in my table is of type geometry. My command looks something like this...
COPY "WeatherStations" ("Station_ID", "Station_Code", "Station_Name", "Station_Location") FROM stdin;
1 KAVP WILKES-BARRE ST_GeomFromText('POINT(41.338055 -75.724166)')
2 KOKV WINCHESTER ST_GeomFromText('POINT(39.143333 -78.144444)')
3 KSHD SHENANDOAH ST_GeomFromText('POINT(38.263611 -78.896388)')
...
However, I think it is attempting to insert the text "ST_GeomFromText('POINT..." and failing instead of evaluating the expression and inserting the result of the expression. Does anyone know what might be going on here and how I can get the actual geoms inserted?

I had a hard time figuring out how to bulk load geometry data into PostGIS using the COPY FROM STDIN command, and I couldn't find official documentation on this topic.
Altering the column during the bulk load (ALTER TABLE / SET DATA TYPE / USING) was not an option for me because it is only supported in PostGIS 2.0+ for the geometry type, and using a temporary table was not acceptable either.
There is indeed a direct way to do it (at least in PostGIS 1.5.2+).
You can simply rewrite the data for your COPY statement this way, using a plain WKT (Well-Known Text) representation for your geometry data:
1 KAVP WILKES-BARRE POINT(41.338055 -75.724166)
2 KOKV WINCHESTER POINT(39.143333 -78.144444)
3 KSHD SHENANDOAH POINT(38.263611 -78.896388)
If you have enforced an SRID constraint on the geometry column, you'll have to use the following syntax (in this example the SRID is 4326), known as EWKT (Extended Well-Known Text, a PostGIS-specific format):
1 KAVP WILKES-BARRE SRID=4326;POINT(41.338055 -75.724166)
2 KOKV WINCHESTER SRID=4326;POINT(39.143333 -78.144444)
3 KSHD SHENANDOAH SRID=4326;POINT(38.263611 -78.896388)
Closing note: there must be no space between "POINT" and the opening parenthesis "(", or COPY will still return an error saying your geometry data has an invalid format.
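If the rows are generated by a script, the EWKT form described above can be produced with a few lines of Python. This is only a sketch: the helper names `copy_text_escape` and `station_row` are made up for illustration, and it assumes COPY's default text format (tab-separated fields, backslash escapes). Note that WKT lists X before Y, so for SRID 4326 the first number is read as the longitude; the coordinates here are reproduced exactly as they appear in the question.

```python
def copy_text_escape(value: str) -> str:
    """Escape one field for COPY's default text format."""
    return (value.replace("\\", "\\\\")
                 .replace("\t", "\\t")
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def station_row(station_id, code, name, x, y, srid=None):
    """Build one tab-separated COPY line with a WKT or EWKT point."""
    wkt = f"POINT({x} {y})"           # no space before the parenthesis
    if srid is not None:
        wkt = f"SRID={srid};{wkt}"    # EWKT prefix for SRID-constrained columns
    fields = [str(station_id), code, name, wkt]
    return "\t".join(copy_text_escape(f) for f in fields)


stations = [
    (1, "KAVP", "WILKES-BARRE", 41.338055, -75.724166),
    (2, "KOKV", "WINCHESTER", 39.143333, -78.144444),
    (3, "KSHD", "SHENANDOAH", 38.263611, -78.896388),
]

lines = [station_row(*s, srid=4326) for s in stations]
print("\n".join(lines))
print("\\.")  # end-of-data marker expected after COPY ... FROM stdin in a script
```

The printed lines can be placed directly after the COPY ... FROM stdin; statement in a SQL script.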

You could omit the function wrapping the text, import into a temporary table with a text column, and then run INSERT/SELECT into the permanent table, with the function doing the conversion in that step.
INSERT INTO "WeatherStations"
("Station_ID", "Station_Code", "Station_Name", "Station_Location")
SELECT "Station_ID", "Station_Code", "Station_Name",
ST_GeomFromText("Station_Location")
FROM "TempWeatherStations";

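Keep all the values in a .csv file and try this:
cat /path/file/demo.csv | psql -U <username> -h <localhost> -d <database> \
-c 'COPY "WeatherStations" ("Station_ID", "Station_Code", "Station_Name", "Station_Location") FROM stdin WITH (FORMAT csv);'
This should work, as long as the geometry column in the file holds values the geometry type accepts as text input (for example WKT or EWKT) rather than function calls.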

Point's value looks something like this: 0101000020E6100000DA722EC555552B40CDCCCCCCCC0C4840.
I typically keep latitude and longitude columns in my tables and build spatial data with triggers.
I don't know how to copy POINTs from stdin otherwise.

Related

Trim/whitespace issue when load data from Db2 source to Postgresql DB using Talend Open source

We are seeing an issue with table values that are populated from DB2 (source) to Postgres (target).
I have included all the job details for each component here.
With the above approach, once the data has been populated, we run the queries below in the Postgres DB:
SELECT * FROM VMRCTTA1.VMRRCUST_SUMM where cust_gssn_cd='XY03666699' ;
SELECT * FROM VMRCTTA1.VMRRCUST_SUMM where cust_cntry_cd='847' ;
No records are returned. However, when we run the same queries with trim as below, they work:
SELECT * FROM VMRCTTA1.VMRRCUST_SUMM where trim(cust_gssn_cd)='XY03666699' ;
SELECT * FROM VMRCTTA1.VMRRCUST_SUMM where trim(cust_cntry_cd)='847' ;
Below are the ways we have tried to overcome this, with no luck:
Used a tMap between the source and target components.
Used trim in the source component under Advanced settings.
Changed the data type of cust_cntry_cd in the Postgres DB from char(5) to character varying, which allows values without any length restriction.
Please suggest what is missing, as we have this issue in almost all the tables where we have character/varchar columns.
We are using TOS (Talend Open Studio).
The data type is probably character(5) in DB2.
That means that the trailing spaces are part of the column value and will be migrated. You have to compare with the padded value (here padded with two spaces to a length of five characters)
cust_cntry_cd = '847  '
or cast the right argument to character(5):
cust_cntry_cd = CAST ('847' AS character(5))
Maybe you could delete all spaces in the advanced settings of the tDB2Input component.
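To make the padding issue concrete, here is a small Python illustration of what trimming on the way from source to target amounts to. It is not Talend code; the function name `strip_padding`, the `total` column, and the padded width of the first value are hypothetical and only show the idea.

```python
def strip_padding(row: dict) -> dict:
    """Return a copy of row with trailing spaces removed from string values."""
    return {
        key: value.rstrip(" ") if isinstance(value, str) else value
        for key, value in row.items()
    }


# A row as it might arrive from fixed-width CHAR columns (values are padded).
row = {"cust_gssn_cd": "XY03666699     ", "cust_cntry_cd": "847  ", "total": 12}

clean = strip_padding(row)
print(clean)

# The padded and unpadded strings are different values, which is why an
# equality test against '847' finds nothing until the padding is removed.
print(row["cust_cntry_cd"] == "847", clean["cust_cntry_cd"] == "847")
```

Once the values are stored without padding, the plain comparisons from the question no longer need trim().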

Timescaledb - How to display chunks of a hypertable in a specific schema

I have a table named conditions on a schema named test. I created a hypertable and inserted hundreds of rows.
When I run select show_chunks(), it works and displays chunks but I cannot use the table name as parameter as suggested in the manual. This does not work:
SELECT show_chunks("test"."conditions");
How can I fix this?
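PS: I also want to query a chunk itself by its name. How can I do this?
show_chunks expects a regclass, so the table name has to be passed as a single-quoted string; written with double quotes only, "test"."conditions" is parsed as a column reference. Depending on your current search path, you may also need to schema-qualify the table.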
The following should work:
SELECT public.show_chunks('test.conditions');
Double quotes are only necessary if your table name is a delimited identifier. For example, if your table name contains a space, you need double quotes around the identifier. You still need to wrap the whole name in single quotes, though:
SELECT public.show_chunks('test."equipment conditions"');
SELECT public.show_chunks('"test schema"."equipment conditions"');
For more information about identifier quoting:
https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS
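When the schema and table names come from a program, the quoting rules above can be applied mechanically. The following Python sketch is an assumption-laden illustration: `quote_ident` and `regclass_literal` are hypothetical helpers, and the "plain identifier" test is deliberately simplified (it does not know about reserved keywords, which also need double quotes).

```python
import re

# Simplified rule: lower-case letters, digits and underscores, not starting
# with a digit. Anything else gets double-quoted. Reserved words are NOT
# detected by this sketch.
_PLAIN = re.compile(r"[a-z_][a-z0-9_]*\Z")


def quote_ident(name: str) -> str:
    """Double-quote an identifier when it is not a plain lower-case name."""
    if _PLAIN.match(name):
        return name
    return '"' + name.replace('"', '""') + '"'


def regclass_literal(schema: str, table: str) -> str:
    """Build the single-quoted string that can be passed as a regclass."""
    qualified = f"{quote_ident(schema)}.{quote_ident(table)}"
    return "'" + qualified.replace("'", "''") + "'"


for schema, table in [("test", "conditions"),
                      ("test", "equipment conditions"),
                      ("test schema", "equipment conditions")]:
    print(f"SELECT public.show_chunks({regclass_literal(schema, table)});")
```

The three printed statements match the three forms shown above.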
Edit: Addressing the PS:
I want to query the chunk itself by its name? How can I do this?
feike=# SELECT public.show_chunks('test.conditions');
show_chunks
--------------------------------------------
_timescaledb_internal._hyper_28_1176_chunk
_timescaledb_internal._hyper_28_1177_chunk
[...]
SELECT * FROM _timescaledb_internal._hyper_28_1176_chunk;

Data correction exporting CSV file to Postgres

I am importing a csv file into postgres, and would like to know how to import the correct data type while using the COPY command. For instance, I have a column column_1 integer; and want to insert the value 6 into it from my csv file.
I run the command copy "Table" from 'path/to/csv' DELIMITERS ',' CSV; and every time I try to do this I get the error ERROR: invalid input syntax for integer: "column_1". I figured out that it's because it is automatically importing every piece of data from the csv file as a string or text. If I change the column type to text then it works successfully, but this defeats the purpose of using a number as I need it for various calculations. Is there a way to conserve the data type when transferring? Is there something I need to change in the csv file? Or is there another datatype to assign to column_1? Hope this makes sense. Thanks in advance!
I did this and it worked flawlessly.
I put the plain number in stack.csv
(stack.csv contains only one value, 6):
# create table stack(i int);
# \copy stack from 'stack.csv' with (format csv);
I read in your comment that you have 25 columns in your CSV file. You need to have at least 25 columns in your table, and every CSV column needs to be mapped to a table column. If the table has more than 25 columns, you need to list only the columns that are supplied by the CSV.
That's why it works with a text field: all the data is put into one cell of the row.
If your table has more columns than your CSV file has fields, the format is like this:
\copy stack(column1, column2, ..., column25) from 'stack.csv' with (format csv);
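The error in the question quotes "column_1" as the rejected value, which suggests the file starts with a header line that is being read as data; adding the header option to the CSV options makes COPY skip that line. As a sketch, the column list for the command above could be taken from that header line with Python. The helper `copy_command` is hypothetical, and it assumes the header names are exactly the table's column names and need no quoting.

```python
import csv
import io


def copy_command(table: str, csv_path: str, csv_text: str) -> str:
    """Build a psql \\copy command whose column list follows the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)))
    columns = ", ".join(header)
    # "header" tells COPY to skip the first line instead of loading it.
    return f"\\copy {table}({columns}) from '{csv_path}' with (format csv, header)"


sample = "column_1,column_2\n6,7\n"
command = copy_command("stack", "stack.csv", sample)
print(command)
```

The generated command lists the columns in file order, which is the order COPY uses to map fields to columns.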

How do I escape single quotes in data which is of hstore datatype using Pentaho

I am trying to read hstore data from source and insert into target hstore column. But for some weird reason the data has some single quotes in it and I cannot delete or remove them. Source hstore data looks something like
Value 1: "Target_Payment_Type"=>"Auto_Renew", "Target_Membership_term"=>"1 Year"
Value 2: "Target_Payment_Type"=>"'Auto_Renew'", "Target_Membership_term"=>"'1 Year'"
The transformation works fine with the first value but fails at Value 2. Could anyone suggest a way to escape the single quotes that may appear in the data, using Pentaho or PostgreSQL (the source and target database)? Thanks in advance.
At the very least, you can use the Postgres replace function in the Table Input step:
SELECT
all_your_non_string_columns
,replace(string_column, '''', '') -- note that '''' represents '
FROM
your_table
A real solution may be available in an up-to-date driver.
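If the quotes have to be kept rather than removed, the same doubling rule noted in the query above ('' stands for one ') can be applied when a value is embedded in a SQL string. The Python sketch below only illustrates that rule: `sql_literal` is a hypothetical helper, it assumes standard-conforming strings (backslashes are not special), and whether this is what makes the Pentaho step fail is an assumption. Where the driver supports bound parameters, those are usually the safer route.

```python
def sql_literal(value: str) -> str:
    """Wrap a string in single quotes, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


# Value 2 from the question, with single quotes inside the hstore values.
value_2 = ('"Target_Payment_Type"=>"\'Auto_Renew\'", '
           '"Target_Membership_term"=>"\'1 Year\'"')

literal = sql_literal(value_2)
print(literal)
```

Each embedded quote appears twice in the printed literal, and the database reads every doubled pair back as a single quote.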

COPY only some columns from an input CSV?

I have created a table in my database with name 'con' which has two columns with the name 'date' and 'kgs'. I am trying to extract data from this 'hi.rpt' file copied on this location 'H:Sir\data\reporting\hi.rpt' and want to store values in the table 'con' in my database.
I have tried this code in pgadmin
When I run:
COPY con (date,kgs)
FROM 'H:Sir\data\reporting\hi.rpt'
WITH DELIMITER ','
CSV HEADER
date AS 'Datum/Uhrzeit'
kgs AS 'Summe'
I get the error:
ERROR: syntax error at or near "date"
LINE 5: date AS 'Datum/Uhrzeit'
^
********** Error **********
ERROR: syntax error at or near "date"
SQL state: 42601
Character: 113
The "hi.rpt" file from which I am reading the data looks like this:
Datum/Uhrzeit,Sta.,Bez.,Unit,TBId,Batch,OrderNr,Mat1,Total1,Mat2,Total2,Mat3,Total3,Mat4,Total4,Mat5,Total5,Mat6,Total6,Summe
41521.512369(04.09.13 12:17:48),TB01,TB01,005,300,9553,,2,27010.47,0,0.00,0,0.00,3,1749.19,0,0.00,0,0.00,28759.66
41521.547592(04.09.13 13:08:31),TB01,TB01,005,300,9570,,2,27057.32,0,0.00,0,0.00,3,1753.34,0,0.00,0,0.00,28810.66
Is it possible to extract only two values from the 20 different fields that I have in this 'hi.rpt' file?
Or is there only a mistake in the syntax that I have written?
What is the correct way to write it?
I don't know where you got that syntax, but COPY doesn't take a list of column aliases like that. See the help:
COPY table_name [ ( column_name [, ...] ) ]
FROM { 'filename' | PROGRAM 'command' | STDIN }
[ [ WITH ] ( option [, ...] ) ]
(AS isn't one of the listed options; to see the full output run \h COPY in psql, or look at the manual for the COPY command online).
There is no mapping facility in COPY that lets you read only some columns of the input CSV. It'd be really useful, but nobody's had the time/interest/funding to implement it yet. It's really only one of many data transform/filtering tasks people want anyway.
PostgreSQL expects the column-list given in COPY to be in the same order, left-to-right, as what's in the CSV file, and have the same number of entries as the CSV file has columns. So if you write:
COPY con (date,kgs)
then PostgreSQL will expect an input CSV with exactly two columns. It'll use the first csv column for the "date" table column and the second csv column for the "kgs" table column. It doesn't care what the CSV headers are, they're ignored if you specify WITH (FORMAT CSV, HEADER ON), or treated as normal data rows if you don't specify HEADER.
PostgreSQL 9.3 adds FROM PROGRAM to COPY, so you could run a shell command to read the file and filter it. A simple Python or Perl script would do the job.
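A filter of that kind might look like the following Python sketch. The function name `keep_columns` is made up for illustration, and the demo reads from an in-memory sample (the header and first data row from the question) instead of a real file.

```python
import csv
import io


def keep_columns(src, dst, wanted):
    """Copy CSV from src to dst, keeping only the named header columns."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader)
    positions = [header.index(name) for name in wanted]
    writer.writerow(wanted)
    for row in reader:
        writer.writerow([row[i] for i in positions])


sample = (
    "Datum/Uhrzeit,Sta.,Bez.,Unit,TBId,Batch,OrderNr,Mat1,Total1,Mat2,Total2,"
    "Mat3,Total3,Mat4,Total4,Mat5,Total5,Mat6,Total6,Summe\n"
    "41521.512369(04.09.13 12:17:48),TB01,TB01,005,300,9553,,2,27010.47,0,0.00,"
    "0,0.00,3,1749.19,0,0.00,0,0.00,28759.66\n"
)

out = io.StringIO()
keep_columns(io.StringIO(sample), out, ["Datum/Uhrzeit", "Summe"])
filtered = out.getvalue()
print(filtered, end="")
```

As a standalone script the same function would be called with sys.stdin and sys.stdout, so that COPY con (date,kgs) reads the two-column output. The first field still has to be in a form the date column accepts.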
If it's a small file, just open a copy in the spreadsheet of your choice as a csv file, delete the unwanted columns, and save it, so only the date and kgs columns remain.
Alternately, COPY to a staging table that has all the same columns as the CSV, then do an INSERT INTO ... SELECT to transfer just the wanted data into the real target table.
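The staging-table route can be tried out without a PostgreSQL server. The sketch below uses Python's built-in sqlite3 module purely as a stand-in, so the loading step is an ordinary INSERT rather than COPY, and the table name `staging` is an assumption. The INSERT INTO ... SELECT at the end is the part that carries over.

```python
import csv
import io
import sqlite3

sample = (
    "Datum/Uhrzeit,Sta.,Bez.,Unit,TBId,Batch,OrderNr,Mat1,Total1,Mat2,Total2,"
    "Mat3,Total3,Mat4,Total4,Mat5,Total5,Mat6,Total6,Summe\n"
    "41521.512369(04.09.13 12:17:48),TB01,TB01,005,300,9553,,2,27010.47,0,0.00,"
    "0,0.00,3,1749.19,0,0.00,0,0.00,28759.66\n"
)
rows = list(csv.reader(io.StringIO(sample)))
header, data = rows[0], rows[1:]

db = sqlite3.connect(":memory:")

# Staging table: one text column per CSV column, named after the header.
column_defs = ", ".join(f'"{name}" TEXT' for name in header)
db.execute(f"CREATE TABLE staging ({column_defs})")
db.execute("CREATE TABLE con (date TEXT, kgs REAL)")

# Stand-in for COPY: load every CSV column into the staging table.
placeholders = ", ".join("?" for _ in header)
db.executemany(f"INSERT INTO staging VALUES ({placeholders})", data)

# Transfer only the wanted columns into the real target table.
db.execute('INSERT INTO con (date, kgs) SELECT "Datum/Uhrzeit", "Summe" FROM staging')

result = db.execute("SELECT date, kgs FROM con").fetchall()
print(result)
```

In PostgreSQL the staging table would be filled with COPY ... WITH (FORMAT CSV, HEADER ON), and the SELECT is also the natural place to convert the first field into a proper date.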