How to properly insert from stdin in PostgreSQL?

A normal insert:
insert into tfreeze(id,s) values(1,'foo');
I tried the following two ways; neither works:
copy tfreeze(id,s ) from stdin;
1 foo
\.
copy tfreeze(id,s ) from stdin;
1 'foo'
\.
There are only a few questions about FROM STDIN on Stack Overflow: https://stackoverflow.com/search?q=Postgres+Insert+statements+from+stdin
--
Error message:
ERROR: 22P02: invalid input syntax for type integer: "1 foo"
CONTEXT: COPY tfreeze, line 1, column id: "1 foo"
LOCATION: pg_strtoint32, numutils.c:320
I got the code from this book: https://postgrespro.ru/education/books/internals
Code source: https://prnt.sc/eEsRZ5AK-tjQ
So far I have tried:
1, foo, 1\t'foo', 1\tfoo

First, you have to use psql for that (which you are already doing).
You get that error because you are using the default text format, which requires the values to be separated by tab characters (ASCII 9). A space is not a delimiter, and a typed \t is read as an escape sequence inside the value, not as a column separator.
I recommend that you use the CSV format and separate the values with commas:
COPY tfreeze (id, s) FROM STDIN (FORMAT 'csv', FREEZE);
1,foo
\.
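If the input lines are generated by a script rather than typed, the default text format can also be produced reliably. The following is a minimal sketch in Python, not part of the original answer; the helper name copy_text_line is hypothetical. It renders one row with real tab delimiters, \N for NULL, and backslash escapes for special characters inside values.

```python
def copy_text_line(values):
    """Render one row in COPY's default text format (hypothetical helper)."""
    def esc(value):
        if value is None:
            return r"\N"  # default NULL marker in text format
        text = str(value)
        # Escape the backslash first, then the characters COPY treats specially.
        return (text.replace("\\", "\\\\")
                    .replace("\t", "\\t")
                    .replace("\n", "\\n")
                    .replace("\r", "\\r"))
    # Columns are joined with a real tab character (ASCII 9).
    return "\t".join(esc(v) for v in values)


if __name__ == "__main__":
    print(repr(copy_text_line([1, "foo"])))
```

The repr() call makes the tab visible, which is the detail that is easy to lose when pasting into a terminal.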

Related

Snowflake null values quoted in CSV breaks PostgreSQL unload

I am trying to move data from Snowflake to PostgreSQL, and to do so I first unload it to S3 in CSV format. Commas can appear in the text columns, so I use the FIELD_OPTIONALLY_ENCLOSED_BY Snowflake unloading option to quote the content of the problematic cells. However, when this is combined with null values, I can't manage to get a CSV that is valid for PostgreSQL.
I created a simple table to illustrate the issue:
CREATE OR REPLACE TABLE PUBLIC.TEST(
TEXT_FIELD VARCHAR(),
NUMERIC_FIELD INT
);
INSERT INTO PUBLIC.TEST VALUES
('A', 1),
(NULL, 2),
('B', NULL),
(NULL, NULL),
('Hello, world', NULL)
;
COPY INTO @STAGE/test
FROM PUBLIC.TEST
FILE_FORMAT = (
COMPRESSION = NONE,
TYPE = CSV,
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
NULL_IF = ''
)
OVERWRITE = TRUE;
From that, Snowflake creates the following CSV:
"A",1
"",2
"B",""
"",""
"Hello, world",""
But after that, it is impossible for me to copy this CSV into a PostgreSQL table as it is.
This is despite the PostgreSQL documentation saying the following about the NULL option:
Specifies the string that represents a null value. The default is \N (backslash-N) in text format, and an unquoted empty string in CSV format.
Not setting any COPY option on the PostgreSQL side results in a failed load. It cannot work anyway, because we also have to specify the quote character using QUOTE. Here it is QUOTE '"'.
Therefore, during the PostgreSQL load, using:
FORMAT csv, HEADER false, QUOTE '"' gives:
DataError: invalid input syntax for integer: "" CONTEXT: COPY test, line 3, column numeric_field: ""
FORMAT csv, HEADER false, NULL '""', QUOTE '"' gives:
NotSupportedError: CSV quote character must not appear in the NULL specification
FYI, to test the load from S3 I use these commands in PostgreSQL:
CREATE TABLE IF NOT EXISTS PUBLIC.TEST(
TEXT_FIELD VARCHAR,
NUMERIC_FIELD INT
);
CREATE EXTENSION IF NOT EXISTS aws_s3 CASCADE;
SELECT aws_s3.table_import_from_s3(
'PUBLIC.TEST',
'',
'(FORMAT csv, HEADER false, NULL ''""'', QUOTE ''"'')',
'bucket',
'test_0_0_0.csv',
'aws_region'
)
Thanks a lot for any ideas on how to make this work. I would love to find a solution that doesn't require modifying the CSV between Snowflake and Postgres. I think the issue is more on the Snowflake side, as it doesn't really make sense to quote null values, but PostgreSQL is not helping either.
When you set the NULL_IF value to '', you are actually telling Snowflake to convert NULLs to a blank, which then gets quoted. When you are copying out of Snowflake, the copy options are "backwards" in a sense, and NULL_IF acts more like an IFNULL.
This is the code that I'd use on the Snowflake side, which will result in an unquoted empty string in your CSV file:
FILE_FORMAT = (
COMPRESSION = NONE,
TYPE = CSV,
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
NULL_IF = ()
)
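To make the target shape concrete, here is a minimal sketch in Python of the kind of CSV line that PostgreSQL's CSV default accepts: text is quoted, while a NULL is written as an unquoted empty field. The function name pg_csv_line is hypothetical, and the sketch only illustrates the file shape; it does not reproduce Snowflake's unloader.

```python
def pg_csv_line(row):
    """Render one row so that None becomes an unquoted empty field (hypothetical helper)."""
    fields = []
    for value in row:
        if value is None:
            fields.append("")  # unquoted empty field: read as NULL by COPY ... CSV
        elif isinstance(value, str):
            # Always quote text and double any embedded quote character.
            fields.append('"' + value.replace('"', '""') + '"')
        else:
            fields.append(str(value))
    return ",".join(fields)


if __name__ == "__main__":
    rows = [("A", 1), (None, 2), ("B", None), (None, None), ("Hello, world", None)]
    for row in rows:
        print(pg_csv_line(row))
```

The point of the comparison is that "" (quoted) is an empty string to PostgreSQL, while nothing between two commas is a NULL.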

Quote strings and dates in psql query results output

The \pset [ option [ value ] ] section of the psql docs describes various settings that make query results more convenient to read.
I can, for example, approach a CSV-like output with:
\pset fieldsep ','
\pset footer off
\pset format unaligned
\pset null 'NULL'
Resulting in output like:
> WITH foo_tbl(foo,bar,baz)
> AS
> (
> VALUES
> ('foo', NULL, 1),
> (NULL, 'bar', 1)
> )
> SELECT * FROM foo_tbl;
foo,bar,baz
foo,NULL,1
NULL,bar,1
This is great, but I'd like strings and dates to be quoted, like this:
foo,bar,baz
'foo',NULL,1
NULL,'bar',1
Is this not possible with psql?
p.s. I know this kind of thing can be done with SQL clients like DBeaver, but that isn't in the scope of this question.
To generate CSV output, you can use the copy command rather than trying to tweak the output of a regular SELECT statement.
copy (
WITH foo_tbl (foo,bar,baz,dt) AS
(
VALUES
('foo', NULL, 1, date '2020-01-02'),
(NULL, 'bar', 1, date '2020-03-04')
)
SELECT *
FROM foo_tbl
) to stdout
with (format csv, quote '''', header, null 'NULL', force_quote (foo, dt) );
This generates the following output:
foo,bar,baz,dt
'foo',NULL,1,'2020-01-02'
NULL,bar,1,'2020-03-04'
I am not aware of an option that quotes only dates and strings but not numbers, so using force_quote and listing the columns to quote is the only way to have them always quoted.
copy (...) to stdout is easier to use than its psql sibling \copy because it allows multi-line queries.
To write everything into a file, you can use the \o command in psql:
postgres=> \o data.csv
postgres=> copy (...) to stdout with (...);
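If the quoting has to follow the value type rather than a fixed column list, one alternative outside psql is to format the rows client-side. The following is a minimal sketch in Python, assuming the rows have already been fetched as tuples of Python values; the function quote_row is hypothetical and not a psql or PostgreSQL feature.

```python
import datetime


def quote_row(row, null="NULL"):
    """Quote strings and dates with single quotes, leave numbers bare (hypothetical helper)."""
    out = []
    for value in row:
        if value is None:
            out.append(null)
        elif isinstance(value, (str, datetime.date)):
            # Double embedded single quotes, as in CSV with quote set to '.
            out.append("'" + str(value).replace("'", "''") + "'")
        else:
            out.append(str(value))
    return ",".join(out)


if __name__ == "__main__":
    print(quote_row(("foo", None, 1, datetime.date(2020, 1, 2))))
    print(quote_row((None, "bar", 1, datetime.date(2020, 3, 4))))
```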

How to skip empty line in psql \COPY in PostgreSQL

In PostgreSQL psql, how can I make the \copy command ignore empty lines in the input file?
Here is the code to reproduce it:
create table t1(
n1 int
);
echo "1
2
" > m.csv
psql> \copy t1(n1) FROM 'm.csv' (delimiter E'\t', NULL 'NULL', FORMAT CSV, HEADER false);
ERROR: invalid input syntax for integer: ""
CONTEXT: COPY t1, line 3, column n1: ""
There is an empty line in file m.csv
cat m.csv
1
2
<< empty line
PostgreSQL COPY is very strict, so there is no way to run COPY in a tolerant mode. If your setup allows it, you can use COPY FROM PROGRAM to filter the input first:
[pavel@nemesis ~]$ cat ~/data.csv
10,20,30
40,50,60
70,80,90
psql -c "\copy f from program ' sed ''/^\s*$/d'' ~/data.csv ' csv" postgres
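The same filtering can be done in a small script when sed is not available. This is a minimal sketch in Python, with a hypothetical function name; it drops lines that are empty or contain only whitespace, which is what the sed expression above does.

```python
import io


def drop_blank_lines(lines):
    """Yield only the lines that contain something other than whitespace."""
    for line in lines:
        if line.strip():
            yield line


if __name__ == "__main__":
    # In-memory stand-in for m.csv from the question: two values, then an empty line.
    sample = io.StringIO("1\n2\n\n")
    print("".join(drop_blank_lines(sample)), end="")
```

The filtered output can then be piped to psql and loaded with \copy ... FROM STDIN.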

How to deal with missing values when importing CSV into Postgres?

I would like to import a CSV file that has many missing values. I recoded them as NULL and tried to import the file. I suppose that the attributes containing the NULLs are now character values, but converting them to numeric is a bit complicated. Therefore I would like to import the whole table with:
\copy player_allstar FROM '/Users/Desktop/Rdaten/Data/player_allstar.csv' DELIMITER ';' CSV WITH NULL AS 'NULL' ';' HEADER
There must be a syntax error. But I tried different combinations and always get:
ERROR: syntax error at or near "WITH NULL"
LINE 1: COPY player_allstar FROM STDIN DELIMITER ';' CSV WITH NULL ...
I also tried:
\copy player_allstar FROM '/Users/Desktop/Rdaten/Data/player_allstar.csv' WITH(FORMAT CSV, DELIMITER ';', NULL 'NULL', HEADER);
and get:
ERROR: invalid input syntax for integer: "NULL"
CONTEXT: COPY player_allstar, line 2, column dreb: "NULL"
I suppose it is caused by preprocessing with R. The table came with NAs, so I changed them with:
data[data==NA] <- "NULL"
I'm not aware of a different way of changing them to NULL. I think this produces strings. Is there a different way to preprocess the data and keep the NAs (as NULLs in Postgres, of course)?
Sample:
pts dreb oreb reb asts stl
11 NULL NULL 8 3 NULL
4 5 3 8 2 1
3 NULL NULL 1 1 NULL
data type is integer
Given /tmp/sample.csv:
pts;dreb;oreb;reb;asts;stl
11;NULL;NULL;8;3;NULL
4;5;3;8;2;1
3;NULL;NULL;1;1;NULL
then with a table like:
CREATE TABLE player_allstar (pts integer, dreb integer, oreb integer, reb integer, asts integer, stl integer);
it works for me:
\copy player_allstar FROM '/tmp/sample.csv' WITH (FORMAT CSV, DELIMITER ';', NULL 'NULL', HEADER);
Your syntax is fine; the problem seems to be in the formatting of your data. Using your syntax I was able to load data with NULLs successfully:
mydb=# create table test(a int, b text);
CREATE TABLE
mydb=# \copy test from stdin WITH(FORMAT CSV, DELIMITER ';', NULL 'NULL', HEADER);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> col a header;col b header
>> 1;one
>> NULL;NULL
>> 3;NULL
>> NULL;four
>> \.
mydb=# select * from test;
 a |  b
---+------
 1 | one
   |
 3 |
   | four
(4 rows)
mydb=# select * from test where a is null;
 a |  b
---+------
   |
   | four
(2 rows)
In your case, you can use NULL 'NA' in the copy command if the original value is NA.
You should make sure that there are no spaces around your data values. For example, if your NULL is represented as NA in your data and fields are delimited with semicolons:
1;NA <-- good
1 ; NA <-- bad
1<tab>NA <-- bad
etc.
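The effect of stray spaces can be checked before loading. The following is a minimal sketch in Python, using the standard csv module and a hypothetical load_rows helper. It treats a field as NULL only when it equals the NULL marker exactly, so a padded field such as ' NULL' stays a string and would fail on an integer column.

```python
import csv
import io


def load_rows(text, delimiter=";", null="NULL"):
    """Parse delimited text; a field becomes None only on an exact match with the marker."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)  # first line holds the column names
    rows = [[None if field == null else field for field in record]
            for record in reader]
    return header, rows


if __name__ == "__main__":
    header, rows = load_rows("pts;dreb\n11;NULL\n4;5\n3 ; NULL\n")
    print(header)
    for row in rows:
        print(row)
```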

PostgreSQL change part of a string to uppercase

I have a field named rspec in a table trace.
Currently the field looks like "Vol3/data/20070204_191426_FXBS.v3a".
All I need is a query that changes it to the format "Vol3/data/20070204_191426_FXBS.V3A".
Assuming a current version of PostgreSQL:
select left(rspec, - 3)||upper(right(rspec, 3))
from trace
For older versions:
select substr(rspec, 1, length(rspec) - 3)||upper(substring(rspec from '...$'))
from trace
Or, to cover all possibilities, such as:
file extensions of variable length: abc123.jpeg
no file extension at all: abc123
dot as last character: abc123.
multiple dots: abc.123.jpg
SELECT CASE WHEN rspec ~~ '%.%'
THEN substring(rspec, E'^.*\\.')
|| upper(substring(rspec , E'([^.]*)$'))
ELSE rspec
END AS rspec
FROM (VALUES
('abc123.jpeg')
, ('abc123')
, ('abc123.')
, ('abc.123.jpg')
) AS x(rspec); -- test cases
Explanation:
If the string has no dot, use the string.
Else, take everything up to and including the last dot in the string.
Append everything after the last dot in upper case.
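The same three rules can be sketched outside the database, for instance to check expected results for the test cases. This is a minimal sketch in Python with a hypothetical function name; it mirrors the logic of the CASE expression rather than replacing the SQL.

```python
def upper_extension(name):
    """Upper-case everything after the last dot; leave names without a dot unchanged."""
    head, dot, ext = name.rpartition(".")
    if not dot:
        return name  # no dot at all: use the string as is
    # Everything up to and including the last dot, then the rest in upper case.
    return head + dot + ext.upper()


if __name__ == "__main__":
    for case in ("abc123.jpeg", "abc123", "abc123.", "abc.123.jpg"):
        print(upper_extension(case))
```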