Postgres COPY TO CSV: array values are double double quoted

I am trying to export a Postgres table to CSV. My table contains a number of varchar[] fields, e.g.
{CA4 0,CA5 7}
When I export using the following command:
COPY (SELECT * FROM table) TO 'C:\output.csv' WITH (FORMAT CSV, HEADER);
The array becomes
"{""CA4 0"",""CA5 7""}"
I am not using FORCE_QUOTE on the field, and if I do, it has no impact. I would have expected the array to become
"{"CA4 0","CA5 7"}"
Can anyone help provide a way to get this as the output?
I am using a Windows 2016 Server and PostgreSQL 12.
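For what it's worth, the doubling is standard CSV escaping: a double quote inside a quoted field is written as "" and any conforming CSV reader folds it back to a single quote, so the expected output above would actually be malformed CSV. If the file is consumed by something that is not a CSV parser, a minimal sketch of a workaround is COPY's text format instead (table and path names here are placeholders):
-- Text format does not CSV-quote fields, so the array literal is written
-- as {"CA4 0","CA5 7"} verbatim (no outer quotes, no doubled quotes).
-- Caveats: the delimiter becomes tab, backslashes are escaped, and
-- HEADER is not supported for text format on PostgreSQL 12.
COPY (SELECT * FROM my_table) TO 'C:\output.txt' WITH (FORMAT TEXT);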

Related

Export Postgres data to s3 with headers on each row

I was able to export data from Postgres to AWS S3 by following this document, using the aws_commons extension.
The table columns are id and name. With this table I was able to export a CSV file using the query below:
SELECT * FROM aws_s3.query_export_to_s3('select * from sample_table',
    aws_commons.create_s3_uri('rds-export-bucket', 'rds-export', 'us-east-1'),
    options := 'format csv, HEADER true'
);
With that query I'm able to generate a CSV file like:
id,name
1,Monday
2,Tuesday
3,Wednesday
But is it possible to generate the CSV file in the format below?
id:1,name:Monday
id:2,name:Tuesday
id:3,name:Wednesday
I tried creating a different table with a jsonb column, with each row inserted as JSON, but then the export had curly braces and doubled double quotes in it.
Sample insertion commands:
CREATE TABLE unstructured_table (data JSONB NOT NULL);
INSERT INTO unstructured_table VALUES($$
{
"id": "1",
"name": "test"
}
$$);
After exporting from this table, I'm getting a CSV file like:
"{""id"": ""1"", ""name"": ""test""}"
Thanks in advance
JSON requires double quotes around strings, and CSV requires double quotes around fields that contain commas or double quotes, doubling any double quote inside such a field.
If your goal is to produce a comma-separated list of ColumnName:ColumnValue pairs, for all columns and rows, without any kind of quoting, then this requirement is not compatible with the CSV format.
It could however be generated in SQL relatively generically, for all columns and rows of any sample_table (id being the primary key), with a query like this:
select string_agg(k||':'||v, ',')
from sample_table t,
  lateral row_to_json(t) as r(j),
  lateral json_object_keys(j) as keys(k),
  lateral (select j->>k) as val(v)
group by id
order by id;
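Run against the sample_table above, each output row should come back as a single string in the desired shape:
id:1,name:Monday
id:2,name:Tuesday
id:3,name:Wednesday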
If you feed that query to aws_s3.query_export_to_s3 with a format csv option, it will enclose each output line with double quotes. That may be close enough to your goal.
Alternatively, the text format could be used. Then the lines would not be enclosed in double quotes, but backslash sequences might be emitted for control characters and backslashes themselves (see the text format in the COPY documentation).
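A minimal sketch of that alternative, reusing the bucket from the question; the only change is the options string, which falls back to COPY's text format:
SELECT * FROM aws_s3.query_export_to_s3(
    $$select string_agg(k||':'||v, ',')
      from sample_table t,
        lateral row_to_json(t) as r(j),
        lateral json_object_keys(j) as keys(k),
        lateral (select j->>k) as val(v)
      group by id order by id$$,
    aws_commons.create_s3_uri('rds-export-bucket', 'rds-export', 'us-east-1'),
    options := 'format text'
);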
Ideally the result of this query should be output verbatim into a file, not using COPY, as you would do locally in a shell with:
psql -Atc "select ..." > output-file.txt
But it doesn't seem like aws_s3.query_export_to_s3 provides an equivalent way to do that, since it's an interface on top of the COPY command.

Why does my postgresql csv export have more rows than the table?

I am trying to copy a table in a PostgreSQL database (version 10.12) via psql. One of the columns contains strings representing XML data. When I query the database for a row count with this query, I get a count of about 50,000:
select count(column) from table;
But when I try to export the data to a CSV file, the output has more than 1,000,000 lines! I don't understand how a CSV export could have a different number of rows than the table!
This is the copy command:
\copy (select column from table) to 'directory/output.csv' with csv;
It doesn't seem to matter if I change the delimiter or the quote character either. I've tried using | as a delimiter and ` as a quote, and the number of lines in the CSV was the same. Why is the row count different in the CSV export?
The row count is not different: the CSV output simply has linefeeds (LF, ASCII code 10) embedded in the fields, which is expected with XML data, so a single table row can span many lines of the file.
If you want one line per row with COPY, don't use CSV; use the text format, that is, just omit with csv. Then newlines are encoded as \n instead of being output verbatim.
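A minimal sketch, reusing the command from the question (the only change is dropping with csv; column and table are the asker's placeholders):
-- Text format writes an embedded linefeed as the two characters \n,
-- so each table row occupies exactly one line of the output file.
\copy (select column from table) to 'directory/output.txt'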

Export records from a Db2 (UDB) table without double quotes around the numeric columns

I am trying to remove the double quotes around the numeric columns in an EXPORT command by using the replace function, but it didn't work. Below is the query I used in a Linux environment:
EXPORT TO '/Staging/ebi/src/CLP/legal_bill_charge_adjustment11.csv' OF DEL
MESSAGES '/Staging/ebi/src/CLP/legal_bill_charge_adjustment11.log'
select
CLIENT_ID,
CLIENT_DIVISION_ID,
CLIENT_OFFICE_ID,
MATTER_ID,
LEGAL_BILL_CHARGE_ADJ_ID,
LEGAL_BILL_CHARGE_ID,
ADJUSTMENT_DT,
replace ( ORIGINAL_ADJUSTMENT_AMT,""),
replace (CURRENT_ADJUSTMENT_AMT,""),
replace (SYSTEM_ADJUSTMENT_AMT,""),
replace (CLIENT_ADJUSTMENT_AMT,""),
replace (DELETED_ADJUSTMENT,""),
FLAGGED_AMOUNT,
ADJUSTMENT_USER,
STATUS_DESC,
ADJUSTMENT_COMMENT,
WF_TASK_NAME,
WF_TASK_DESC from CLP.legal_bill_charge_adjustment1;
If anyone can suggest the exact Db2 query, it would be helpful.
Thanks in advance.
EXPORT does not put quotes around numeric data types. You have not provided any data type information, so I suspect your numeric content is stored in CHAR/VARCHAR columns.
Try casting the columns to numeric data types in the export SQL statement.
e.g.
SELECT cast(Textcol as integer) as colname
..
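A hedged sketch of what that could look like for the statement in the question, with casts in place of the replace() calls. The DECIMAL precision and scale are guesses, so adjust them to the real column definitions (and DELETED_ADJUSTMENT may be a flag rather than an amount):
-- Casting to a numeric type makes the DEL export write the value unquoted.
EXPORT TO '/Staging/ebi/src/CLP/legal_bill_charge_adjustment11.csv' OF DEL
MESSAGES '/Staging/ebi/src/CLP/legal_bill_charge_adjustment11.log'
SELECT CLIENT_ID,
  CLIENT_DIVISION_ID,
  CLIENT_OFFICE_ID,
  MATTER_ID,
  LEGAL_BILL_CHARGE_ADJ_ID,
  LEGAL_BILL_CHARGE_ID,
  ADJUSTMENT_DT,
  CAST(ORIGINAL_ADJUSTMENT_AMT AS DECIMAL(31,2)) AS ORIGINAL_ADJUSTMENT_AMT,
  CAST(CURRENT_ADJUSTMENT_AMT AS DECIMAL(31,2)) AS CURRENT_ADJUSTMENT_AMT,
  CAST(SYSTEM_ADJUSTMENT_AMT AS DECIMAL(31,2)) AS SYSTEM_ADJUSTMENT_AMT,
  CAST(CLIENT_ADJUSTMENT_AMT AS DECIMAL(31,2)) AS CLIENT_ADJUSTMENT_AMT,
  CAST(DELETED_ADJUSTMENT AS DECIMAL(31,2)) AS DELETED_ADJUSTMENT,
  FLAGGED_AMOUNT,
  ADJUSTMENT_USER,
  STATUS_DESC,
  ADJUSTMENT_COMMENT,
  WF_TASK_NAME,
  WF_TASK_DESC
FROM CLP.legal_bill_charge_adjustment1;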

Which delimiter to use when loading CSV data into Postgres?

I've come across a problem with loading some CSV files into my Postgres tables. I have data that looks like this:
ID,IS_ALIVE,BODY_TEXT
123,true,Hi Joe, I am looking for a new vehicle, can you help me out?
Now, the problem is that the text in what is supposed to be the BODY_TEXT column is unstructured email data that can contain any sort of characters. When I run the following COPY command, it fails because there are multiple , characters within BODY_TEXT.
COPY sent FROM 'my_file.csv' DELIMITER ',' CSV;
How can I resolve this so that everything in the BODY_TEXT column gets loaded as-is without the load command potentially using characters within it as separators?
In addition to fixing the source file format, you can also handle this in PostgreSQL itself.
Load all the lines from the file into a temporary table:
create temporary table t (x text);
copy t from 'foo.csv';  -- default text format: each input line becomes one row
Then you can split each string using a regexp like this:
select regexp_matches(x, '^([0-9]+),(true|false),(.*)$') from t;
regexp_matches
---------------------------------------------------------------------------
{123,true,"Hi Joe, I am looking for a new vehicle, can you help me out?"}
{456,false,"Hello, honey, there is what I want to ask you."}
(2 rows)
You can use this query to load the data into your destination table:
insert into sent(id, is_alive, body_text)
select x[1], x[2], x[3]
from (
  select regexp_matches(x, '^([0-9]+),(true|false),(.*)$') as x
  from t
) t;
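A side note on types: regexp_matches returns text[], so the insert above leans on PostgreSQL's assignment casts to turn x[1] into an integer and x[2] into a boolean. Being explicit costs nothing; the same insert with the casts spelled out:
insert into sent(id, is_alive, body_text)
select x[1]::int, x[2]::boolean, x[3]
from (
  select regexp_matches(x, '^([0-9]+),(true|false),(.*)$') as x
  from t
) t;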

Data type correction when importing a CSV file into Postgres

I am importing a CSV file into Postgres and would like to know how to preserve the correct data types while using the COPY command. For instance, I have a column column_1 integer; and want to insert the value 6 into it from my CSV file.
I run the command copy "Table" from 'path/to/csv' DELIMITERS ',' CSV; and every time I try this I get the error ERROR: invalid input syntax for integer: "column_1". I figured out that it's because every piece of data from the CSV file is being imported as a string or text. If I change the column type to text then it works, but this defeats the purpose, as I need the number for various calculations. Is there a way to preserve the data type when transferring? Is there something I need to change in the CSV file? Or is there another data type to assign to column_1? Hope this makes sense. Thanks in advance!
I did this and it worked flawlessly:
I put the plain number in stack.csv
(stack.csv contains only the single value 6)
# create table stack(i int);
# \copy stack from 'stack.csv' with (format csv);
I read in your comment that you have 25 columns in your CSV file. Your table needs to have at least 25 columns, and every field from the CSV needs to be mapped to a column. If the table has more than 25 columns, you need to map only the columns that come from the CSV.
That's why it works with a text field: all the data is put into one cell.
If your table has more columns than there are fields in your CSV file, then the format is like this:
\copy stack(column1, column2, ..., column25) from 'stack.csv' with (format csv);
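One more hedged guess, from the error text itself: invalid input syntax for integer: "column_1" suggests COPY tried to load the header line as data. If the first line of your CSV is a header row, tell COPY to skip it:
-- HEADER makes COPY skip the first line of the input file.
\copy "Table" from 'path/to/csv' with (format csv, header)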