Trying to create a query to export data as CSV - PostgreSQL

I have a PostgreSQL table I wish to export as CSV on demand using a query, without superuser rights.
I tried:
COPY myapp_currencyprice to STDOUT WITH (DELIMITER ',', FORMAT CSV, HEADER) \g /tmp/prices.csv
But I get a syntax error at "\g"
So I tried:
\copy myapp_currencyprice to '/tmp/prices.csv' with (DELIMITER ',', FORMAT CSV, HEADER)
But I also get a syntax error from "\copy"

You can do the following in psql.
SELECT 1 as one, 2 as two \g /tmp/1.csv
then in psql
\! cat /tmp/1.csv
or you can
copy (SELECT 1 as one, 2 as two) to '/tmp/1.csv' with (format csv , delimiter '|');
But you can't use both STDOUT and a filename, because the manual (https://www.postgresql.org/docs/current/sql-copy.html) gives the syntax as:
COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }
    TO { 'filename' | PROGRAM 'command' | STDOUT }
    [ [ WITH ] ( option [, ...] ) ]
and the vertical line | means you must choose exactly one alternative (source: https://www.postgresql.org/docs/14/notation.html).
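Putting the two working pieces together for the original table: a one-line \copy in psql writes the file through the client, so no superuser is needed (a sketch, reusing the table and path from the question; note that \copy must stay on a single line):
\copy myapp_currencyprice to '/tmp/prices.csv' with (format csv, delimiter ',', header)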


How to export to S3 from RDS / Aurora using `aws_s3.query_export_to_s3` with a tab delimiter?

Trying to run
SELECT *
FROM aws_s3.query_export_to_s3(
    'SELECT * FROM <tbl> WHERE <cond>',
    aws_commons.create_s3_uri(
        '<bucket_name>',
        '<file_name>',
        '<region>'
    ),
    options := 'format csv, HEADER true, delimiter $$\t$$'
);
The custom delimiter specification follows the AWS documentation
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/postgresql-s3-export.html#postgresql-s3-export-examples-custom-delimiter
However, it fails to export due to ERROR: COPY delimiter must be a single one-byte character
The tab delimiter provided in the query complies with the Postgres COPY command.
Any ideas?
You could use E''\t'' instead; it worked for me. See the code below:
SELECT * FROM aws_s3.query_export_to_s3(
    'select * from tb',
    aws_commons.create_s3_uri('s3-bucket', 'data.csv', 'us-east-1'),
    options := 'format csv, HEADER true, delimiter E''\t'' '
);
 rows_uploaded | files_uploaded | bytes_uploaded
---------------+----------------+----------------
             2 |              1 |             21
(1 row)
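The reason E''\t'' works while $$\t$$ does not: the options string is passed through to a server-side COPY, and dollar quoting does not interpret escapes, so $$\t$$ is the two characters \ and t, whereas the escape-string literal E'\t' is a single tab byte. If the nested quoting is hard to read, the same options string can be built with chr(9) (a sketch, reusing the hypothetical table and bucket from above):
SELECT * FROM aws_s3.query_export_to_s3(
    'select * from tb',
    aws_commons.create_s3_uri('s3-bucket', 'data.csv', 'us-east-1'),
    -- chr(9) is a literal tab; format() wraps it in the single quotes COPY expects
    options := format('format csv, HEADER true, delimiter ''%s''', chr(9))
);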

Quote strings and dates in psql query results output

Under the section \pset [ option [ value ] ] of the psql docs, I can set various settings to make my query results convenient for me.
I can, for example, approach a CSV-like output with:
\pset fieldsep ','
\pset footer off
\pset format unaligned
\pset null 'NULL'
Resulting in output like:
> WITH foo_tbl(foo,bar,baz)
> AS
> (
> VALUES
> ('foo', NULL, 1),
> (NULL, 'bar', 1)
> )
> SELECT * FROM foo_tbl;
foo,bar,baz
foo,NULL,1
NULL,bar,1
This is great, but I'd like strings and dates to be quoted, like this:
foo,bar,baz
'foo',NULL,1
NULL,'bar',1
Is this not possible with psql?
p.s. I know this kind of thing can be done with SQL clients like DBeaver, but that isn't in the scope of this question.
To generate CSV output, you can use the copy command rather than trying to tweak the output of a regular SELECT statement.
copy (
    WITH foo_tbl (foo, bar, baz, dt) AS
    (
        VALUES
            ('foo', NULL, 1, date '2020-01-02'),
            (NULL, 'bar', 1, date '2020-03-04')
    )
    SELECT *
    FROM foo_tbl
) to stdout
with (format csv, quote '''', header, null 'NULL', force_quote (foo, dt));
This will generate the following output:
foo,bar,baz,dt
'foo',NULL,1,'2020-01-02'
NULL,bar,1,'2020-03-04'
I am not aware of an option that will quote only dates and strings but not numbers, so using force_quote and specifying the columns to quote is the only way to get them quoted (always).
copy (...) to stdout is easier to use than its psql sibling \copy because it allows multi-line queries.
To write everything into a file, you can use the \o command in psql:
postgres=> \o data.csv
postgres=> copy (...) to stdout with (...);
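A bare \o afterwards closes the file and routes query output back to the terminal:
postgres=> \o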

How to skip empty line in psql \COPY in PostgreSQL

In PostgreSQL psql, how to make \copy command ignore empty lines in input file?
Here is the code to reproduce it,
create table t1 (
    n1 int
);
echo "1
2
" > m.csv
psql> \copy t1(n1) FROM 'm.csv' (delimiter E'\t', NULL 'NULL', FORMAT CSV, HEADER false);
ERROR: invalid input syntax for integer: ""
CONTEXT: COPY t1, line 3, column n1: ""
There is an empty line in file m.csv
cat m.csv
1
2
<< empty line
PostgreSQL's COPY is very strict, so there is no way to run COPY in a tolerant mode. If it is available to you, you can use COPY FROM PROGRAM to filter the input first:
[pavel@nemesis ~]$ cat ~/data.csv
10,20,30
40,50,60
70,80,90
psql -c "\copy f from program ' sed ''/^\s*$/d'' ~/data.csv ' csv" postgres
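Applied to the question's own table and file, the same approach would look like this (a sketch; [[:space:]] is the portable spelling of \s, and the doubled single quotes survive psql's argument parsing):
psql -c "\copy t1(n1) from program 'sed ''/^[[:space:]]*$/d'' m.csv' with (format csv)" postgres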

Complex sed Command for Insert Command

I have a bunch of PHP files which contain many INSERT commands.
Into each query, I want to insert a column admin_id with the variable value '$admin_id',
i.e., if the query is
insert into users (ch_id, num_value) values ('2', '100')
the query should be converted to
insert into users (admin_id, ch_id, num_value) values ($admin_id, '2', '100')
To do this, I have executed the following command
sed -i 's/\(insert.*into.*\) (\(.*values\)/\1 (admin_id, \2/' *.php
and
sed -i "s/\(insert.*into.*\) values (/\1 values ('\$admin_id', /" *.php
The above has worked successfully, but I am still facing problems with SQL queries that have no VALUES clause or no WHERE clause, i.e.,
insert into abctable (id, no)
to
insert into abctable (admin_id, id, no)
and
insert into abctable select $column from $tableperiod
to
insert into abctable select $column from $tableperiod where admin_id='$admin_id'
and
insert into abctable select $column from $tableperiod where abc != 'xyz'
to
insert into abctable select $column from $tableperiod where admin_id = '$admin_id' and abc != 'xyz'
How can I insert admin_id in these queries as well?
The queries in the PHP files are executed by passing the query to a function in the following way:
execute_query("insert * from $table order by username");
I can find the queries that are still left to be modified by executing:
grep 'execute_query' *| grep insert| grep -v admin_id > stillleft.txt
I have solved it by using the following command
sed -e "s/\(query.*insert.*select.*where\)/& admin_id='\$admin_id' and /g" -e t \
-e "s/\(query.*insert.*select.*\)\")/\1 where admin_id='\$admin_id\")'/g" -e t \
-e "s/\(query.*insert.*\)(\(.*\)values (/\1(admin_id, \2values ('\$admin_id', /g" -e t \
-e "s/\(query.*insert.*(\)/& admin_id, /g" \
-i *.php
I'm not sure my test cases are right, but I think this could help you.
I changed the first statement because I think it's easier this way, and it matches the first and the second command of your sed:
sed -i 's/\(insert into .* (\)\(.*) values (\)\(.*\)) /\1admin_id, \2\$admin_id, \3/' *.php
The second (the first you are looking for) should work with the following:
sed -i 's/\(insert into .* (\)\(.*) \)/\1admin_id, \2/' *.php
And the last two should work with this:
sed -i "s/\(insert into \w* select \$column from \$tableperiod\)/\1 where admin_id='\$admin_id'/" *.php
I hope this works for you; if not, please send a little bit more test data. I tested the commands with the text of your question as input.
If you use multiple sed commands, you'll traverse the complete file each time. You can do it in a single pass. Assuming an input file infile that looks like this:
insert into users (ch_id, num_value) values ('2', '100')
insert into abctable (id, no)
insert into abctable select $column from $tableperiod
insert into abctable select $column from $tableperiod where abc != 'xyz'
we can use the following sed script, sedscr:
/^insert into/ {
    s/\(([^)]*)\)(.*)\(([^)]*)\)/(admin_id, \1)\2($admin_id, \3)/
    s/^([^(]+)\(([^)]*)\)$/\1(admin_id, \2)/
    /\(.*\)/! {
        /where/s/$/ and admin_id ='$admin_id'/
        /where/!s/$/ where admin_id='$admin_id'/
    }
}
It does the following:
if a line starts with insert into, then
for all lines with two pairs of parentheses, insert admin_id into the first one and $admin_id into the second one
for lines with one pair of parentheses at the end, insert admin_id
if there are no parentheses, then
if there is a "where" clause, append and admin_id = '$admin_id'
else append where admin_id='$admin_id'
This can be called as follows:
$ sed -rf sedscr infile
insert into users (admin_id, ch_id, num_value) values ($admin_id, '2', '100')
insert into abctable (admin_id, id, no)
insert into abctable select $column from $tableperiod where admin_id='$admin_id'
insert into abctable select $column from $tableperiod where abc != 'xyz' and admin_id ='$admin_id'
If you can't use extended regular expressions (-r), the quoting of parentheses has to be inverted (all \( become ( etc.) and the + has to be replaced by \{1,\}.
The cumbersome regexes such as \(([^)]*)\) stand for "between literal parentheses, capture zero or more characters that are not a closing parenthesis" – this emulates non-greedy matching.

How to deal with missing values when importing CSV to Postgres?

I would like to import a CSV file which has multiple occurrences of missing values. I recoded them into NULL and tried to import the file. I suppose that my attributes which include the NULLs are character values; however, transforming them to numeric is a bit complicated. Therefore I would like to import all of my table as:
\copy player_allstar FROM '/Users/Desktop/Rdaten/Data/player_allstar.csv' DELIMITER ';' CSV WITH NULL AS 'NULL' ';' HEADER
There must be a syntax error. But I tried different combinations and always get:
ERROR: syntax error at or near "WITH NULL"
LINE 1: COPY player_allstar FROM STDIN DELIMITER ';' CSV WITH NULL ...
I also tried:
\copy player_allstar FROM '/Users/Desktop/Rdaten/Data/player_allstar.csv' WITH(FORMAT CSV, DELIMITER ';', NULL 'NULL', HEADER);
and get:
ERROR: invalid input syntax for integer: "NULL"
CONTEXT: COPY player_allstar, line 2, column dreb: "NULL"
I suppose it is caused by preprocessing with R. The table came with NAs, so I changed them to:
data[data==NA] <- "NULL"
I'm not aware of a different way of changing them to NULL. I think this turns them into strings. Is there a different way to preprocess the data and keep the NAs (as NULLs in Postgres, of course)?
Sample:
pts dreb oreb reb asts stl
11 NULL NULL 8 3 NULL
4 5 3 8 2 1
3 NULL NULL 1 1 NULL
The data type is integer.
Given /tmp/sample.csv:
pts;dreb;oreb;reb;asts;stl
11;NULL;NULL;8;3;NULL
4;5;3;8;2;1
3;NULL;NULL;1;1;NULL
then with a table like:
CREATE TABLE player_allstar (pts integer, dreb integer, oreb integer, reb integer, asts integer, stl integer);
it works for me:
\copy player_allstar FROM '/tmp/sample.csv' WITH (FORMAT CSV, DELIMITER ';', NULL 'NULL', HEADER);
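Querying the table afterwards should show the NULLs as empty cells (a sketch of the expected psql output for the sample above):
SELECT * FROM player_allstar;
 pts | dreb | oreb | reb | asts | stl
-----+------+------+-----+------+-----
  11 |      |      |   8 |    3 |
   4 |    5 |    3 |   8 |    2 |   1
   3 |      |      |   1 |    1 |
(3 rows)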
Your syntax is fine; the problem seems to be in the formatting of your data. Using your syntax, I was able to load data with NULLs successfully:
mydb=# create table test(a int, b text);
CREATE TABLE
mydb=# \copy test from stdin WITH(FORMAT CSV, DELIMITER ';', NULL 'NULL', HEADER);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> col a header;col b header
>> 1;one
>> NULL;NULL
>> 3;NULL
>> NULL;four
>> \.
mydb=# select * from test;
 a |  b
---+------
 1 | one
   |
 3 |
   | four
(4 rows)
mydb=# select * from test where a is null;
 a |  b
---+------
   |
   | four
(2 rows)
In your case you can substitute NULL 'NA' in the copy command, if the original value is 'NA'.
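That is, if the file still contains R's literal NA markers, the whole command would read (reusing the path from the question):
\copy player_allstar FROM '/Users/Desktop/Rdaten/Data/player_allstar.csv' WITH (FORMAT CSV, DELIMITER ';', NULL 'NA', HEADER);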
You should make sure that there are no spaces around your data values. For example, if your NULL is represented as NA in your data and fields are delimited with semicolons:
1;NA <-- good
1 ; NA <-- bad
1<tab>NA <-- bad
etc.