Import Athena table to Postgres with quotes and commas - postgresql

I have an Athena table in text format. It looks like this in S3:
2022-06-05,55389e3b-3730-4cbb-85f2-d6f4de5123f4,{""05b7ede6-c9b7-4919-86d3-015dc5e77d40"":2",""1008b57c-fe53-4e3b-b84e-257eef70ce73"":2",""886e6dce-c40d-4c58-b87d-956b61382f18"":1",""e7c67b9b-3b01-4c3b-8411-f36659600bc3"":9}
2022-06-05,04e5b51e-8e16-4827-80c1-b50776bfb71b,{""04112c3e-0923-4c33-b92e-1183c06580b9"":1",""0f930559-0e66-45c0-bf9e-56edce16448d"":1",""1008b57c-fe53-4e3b-b84e-257eef70ce73"":70",""11e2e1cd-3078-4272-8b8b-55d62d2c0894"":2018",""19109d21-6150-4dd2-82e1-8bc5eee1d55c"":8",""1e58bb5f-cb5b-48d9-b752-f41dd5bd75bc"":32",""28276ff9-5221-4e41-be15-b7f9edee1162"":23",""2b70946f-1004-456b-9786-0c0274d31d1b"":1",""350b04d8-7910-4f19-b14b-d84e046f0bd6"":1",""3d4b0cb7-b701-4086-8fc8-22f957336670"":4",""3ed395b6-b354-4905-8d70-53174d68e293"":1",""41d99562-fd0b-4c1b-9e5b-66b82615b587"":1",""41e778fd-f6b9-4d71-8053-e2f2446b495e"":23",""44760b78-f700-4e4f-bb5b-cfe1de2b3771"":4",""4b01c168-e16d-499c-9e0e-483d7d86f679"":10",""5050d32f-6b4e-493b-bf37-876dc4cf7d4f"":5}
The columns are: DATE, UUID, JSONB
I have escaped the " and , characters, but Postgres still fails to import the file:
SELECT aws_s3.table_import_from_s3(
'my_table',
'd, id, j',
'(format csv)',
aws_commons.create_s3_uri(
'my-bucket',
'abc/20220608_172519_00015_d684z_740d0f86-1df0-4058-9d2c-7354a328dfcb.gz',
'us-west-2'
)
);
ERROR: extra data after last expected column

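The JSON column has to be written as a single quoted CSV field: wrap the whole value in double quotes and double every double quote inside it. The commas inside the JSON then need no escaping of their own. In Athena the value can be produced like this: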
SELECT '"' || replace(json_format(j), '"', '""') || '"' AS escaped_json
FROM my_table
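To see what a CSV parser expects, here is a minimal Python sketch using the standard csv module, whose default dialect uses the same convention as the Postgres csv format (field wrapped in double quotes, inner quotes doubled). The shortened keys and the helper name to_csv_line are made up for illustration. Python's parser is also more lenient than Postgres's, so it silently splits the badly quoted line where Postgres raises an error.

```python
import csv
import io
import json


def to_csv_line(row):
    """Serialize one row with csv.writer's default (doubled-quote) rules."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(row)
    return buf.getvalue()


# Shortened, made-up keys standing in for the UUIDs in the question.
payload = json.dumps({"05b7ede6": 2, "1008b57c": 2}, separators=(",", ":"))
good_line = to_csv_line(["2022-06-05", "55389e3b-3730-4cbb-85f2-d6f4de5123f4", payload])
print(good_line, end="")

# The whole JSON value comes back as one field, so the row has three columns.
good_fields = next(csv.reader(io.StringIO(good_line)))
print(len(good_fields), good_fields[2])

# The layout from the question: quotes are doubled, but the field itself is
# not wrapped in quotes, so the commas inside the JSON still split the row.
bad_line = '2022-06-05,55389e3b-3730-4cbb-85f2-d6f4de5123f4,{""05b7ede6"":2",""1008b57c"":2}\n'
bad_fields = next(csv.reader(io.StringIO(bad_line)))
print(len(bad_fields))
```

A row that parses into more than three fields is what Postgres reports as "extra data after last expected column".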

Related

How to escape a single double quote " in a data file for psql \COPY import?

A tab-delimited text file contains many standalone double quotes (") and needs to be loaded into PostgreSQL with the psql \copy command.
If I use the FORMAT CSV option, I have to specify QUOTE, and quote characters need to be paired.
Here is the code and its output:
create table t1(
c1 varchar(20),
n1 numeric
);
echo 'Alf_7" 5.12' > m.csv
psql> \copy t1 FROM 'm.csv' (FORMAT CSV, delimiter E'\t', NULL 'NULL', HEADER false);
ERROR: unterminated CSV quoted field
CONTEXT: COPY t1, line 1: "Alfa_7" 5.1
Use the FORMAT text option instead. The text format does not treat double quotes specially, so you do not have to specify QUOTE.
psql=> \copy t1 FROM 'm.csv' (FORMAT text, delimiter E'\t', NULL 'NULL', HEADER false);
COPY 1
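As a rough illustration of reading without quote processing, the Python sketch below parses a similar line with the csv module's QUOTE_NONE mode. This is only an analogy for the text format: Postgres's text format has its own rules (for example backslash escapes), and the sample line here is an assumption modelled on the question.

```python
import csv
import io

# A line like the one in the question: a lone double quote, then a tab.
data = 'Alf_7"\t5.12\n'

# quoting=csv.QUOTE_NONE tells the reader not to treat '"' specially,
# so an unpaired quote is just another character in the field.
reader = csv.reader(io.StringIO(data), delimiter="\t", quoting=csv.QUOTE_NONE)
row = next(reader)
print(row)
```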

For xml - similar function in postgresql [duplicate]

Let's say you have a SELECT id FROM table query (the real case is a complex query) that returns several rows.
The problem is how to get all the returned ids in a single row, comma separated.
SELECT string_agg(id::text, ',') FROM table
Requires PostgreSQL 9.0 but that's not a problem.
You can use the array() and array_to_string() functions together with your query.
With SELECT array( SELECT id FROM table ); you will get a result like: {1,2,3,4,5,6}
Then, if you wish to remove the {} signs, you can use the array_to_string() function with a comma as the separator, so: SELECT array_to_string( array( SELECT id FROM table ), ',' ) will give a result like: 1,2,3,4,5,6
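Both routes can be tried without a Postgres server. The sketch below uses Python's built-in sqlite3 module, with SQLite's group_concat standing in for string_agg (an assumption made only so the example runs anywhere); the table name t and its three rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
conn.executemany("INSERT INTO t (id) VALUES (?)", [(1,), (2,), (3,)])

# Aggregate route: group_concat is SQLite's counterpart of
# string_agg(id::text, ',') in PostgreSQL.
(joined,) = conn.execute("SELECT group_concat(id, ',') FROM t").fetchone()
print(joined)

# Array route: collect the ids first, then join them with a separator,
# which is what array() followed by array_to_string() does.
ids = [row[0] for row in conn.execute("SELECT id FROM t")]
joined_from_list = ",".join(str(i) for i in ids)
print(joined_from_list)

conn.close()
```

Neither query has an ORDER BY, so the order of the ids in the result is not guaranteed.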
You can generate a CSV from any SQL query using psql:
$ psql
> \o myfile.csv
> \f ','
> \a
> SELECT col1 AS column1, col2 AS column2 ... FROM ...
The resulting myfile.csv will have the SQL resultset column names as CSV column headers, and the query tuples as CSV rows.
h/t http://pookey.co.uk/wordpress/archives/51-outputting-from-postgres-to-csv
Use the array_to_string() and array() functions for this:
select array_to_string(array(select column_name from table_name where id=5), ', ');
The query below works and returns the ids as a single comma-separated string:
SELECT array_to_string(array_agg(id), ',') FROM table
Output: 1,2,3,4,5
If you want an array rather than a string, leave out the separator, since array_agg takes a single argument:
SELECT array_agg(id) FROM table
{1,2,3,4}
I am using Postgres 11, and EntityFramework fetches this as an array of integers.

PostgreSQL: How do I avoid unwanted characters in COPY command output?

I have a text column holding an array-like value in my PostgreSQL 9.5 database. A sample row looks like this:
my_array(text)
1,112,292,19.7
I am exporting this text array to a custom text file using the Postgres COPY command, like this:
COPY (
SELECT my_array
FROM my_table
ORDER BY my_table_id
) TO '~my_path/output.str' WITH DELIMITER ',';
I get the output:
1\,112\,292\,19.7\
How can I avoid these unwanted \ in my copy command output?
If the delimiter character (a comma in your case) occurs inside a string, it is escaped in the output (by default with a preceding \).
If you use a separator other than the comma, the commas in the data do not have to be escaped.
If you quote the string in the output, the commas do not have to be escaped either:
-- CREATE TABLE my_table(my_table_id SERIAL PRIMARY KEY, my_array text);
-- INSERT INTO my_table(my_array )VALUES ('1,112,292,19.7') ;
COPY ( SELECT my_array FROM my_table ORDER BY my_table_id)
TO '/tmp/output.str'
WITH CSV DELIMITER ',' QUOTE '"'
;
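The three cases can be reproduced outside the database. The Python sketch below uses the standard csv module as a stand-in for COPY's output formatting; the helper name write_row is made up, and the csv module only approximates what COPY does.

```python
import csv
import io


def write_row(value, **fmt):
    """Write a single-column row with the given csv formatting options."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n", **fmt).writerow([value])
    return buf.getvalue()


value = "1,112,292,19.7"

# No quoting, comma delimiter: every comma in the data has to be escaped.
escaped = write_row(value, delimiter=",", quoting=csv.QUOTE_NONE, escapechar="\\")

# No quoting, different delimiter: the commas are ordinary characters.
other_delim = write_row(value, delimiter="|", quoting=csv.QUOTE_NONE, escapechar="\\")

# CSV-style quoting: the field is wrapped in quotes instead of escaped.
quoted = write_row(value, delimiter=",", quotechar='"')

print(escaped, other_delim, quoted, sep="")
```

Only the first variant produces backslashes; the other two leave the commas untouched.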

How do I stop Postgres from escaping when I copy a CSV into it?

I want to copy a CSV file into Postgres. Some of the values are strings like this: "{\"foo\": 123}"
If I use COPY in Postgres directly, it processes the escapes in the string. When I select the value from Postgres, it comes back as "{foo: 123}", which is hard for me to handle. How can I keep the " characters from being processed? In other words, I want to get the original string "{\"foo\": 123}" back when I select it from Postgres.
CREATE TABLE meuk
( bagger varchar
);
COPY meuk(bagger) FROM stdin WITH CSV QUOTE '"' ESCAPE E'\\' ;
"{\"foo\": 123}"
\.
SELECT * from meuk;
Result:
CREATE TABLE
bagger
--------------
{"foo": 123}
(1 row)
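The same quote and escape settings can be tried in Python's csv module, as a rough sketch of what the COPY options above do to the field. The parameter mapping (escapechar for ESCAPE, quotechar for QUOTE) is an analogy, not the Postgres implementation.

```python
import csv
import io
import json

# The file content is: "{\"foo\": 123}"
line = '"{\\"foo\\": 123}"\n'

# escapechar plays the role of COPY's ESCAPE option; doublequote=False
# means a quote inside the field is only accepted when backslash-escaped.
reader = csv.reader(
    io.StringIO(line), quotechar='"', escapechar="\\", doublequote=False
)
(value,) = next(reader)
print(value)

# The outer quotes and the backslashes are gone, and what remains is
# valid JSON that can be parsed directly.
print(json.loads(value))
```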

TSQL Passing MultiValued Reporting Services Parameter into Dynamic SQL

Duplicate of: TSQL varchar string manipulation
I'm building a dynamic SQL statement out of parameters from a Reporting Services report. Reporting Services passes multi-value parameters in a basic CSV format. For example, a list of states may be represented as follows: AL,CA,NY,TN,VA
In a SQL statement this is OK:
WHERE customerState In (@StateList)
However, the dynamic variant isn't OK:
SET @WhereClause1 = @WhereClause1 + 'AND customerState IN (' + @StateList + ') '
This is because it translates to (invalid SQL):
AND customerState IN (AL,CA,NY,TN,VA)
To process it needs something like this:
AND customerState IN ('AL','CA','NY','TN','VA')
Is there some cool expression I can use to insert the single quotes into my dynamic SQL?
REPLACE didn't work for me when used with IN for some reason. I ended up using CHARINDEX, wrapping both the value and the list in commas so that only whole entries match:
WHERE CHARINDEX( ',' + customerState + ',', ',' + @StateList + ',' ) > 0
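The comma padding is what keeps this from matching partial entries. A minimal Python sketch of the same test follows; the function name in_csv_list is made up, and Python's comparison is case-sensitive whereas CHARINDEX follows the column's collation.

```python
def in_csv_list(value, csv_list):
    """Mimic CHARINDEX(',' + value + ',', ',' + list + ',') > 0."""
    return ("," + value + ",") in ("," + csv_list + ",")


state_list = "AL,CA,NY,TN,VA"

# Whole entries match, including the first and last ones.
print(in_csv_list("CA", state_list))
print(in_csv_list("AL", state_list), in_csv_list("VA", state_list))

# A fragment of an entry does not match, although a plain substring
# search without the padding would find it.
print(in_csv_list("A", state_list), "A" in state_list)
```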
For anyone attempting to use Dynamic SQL with a multi-valued parameter in the where clause AND use it to run an SSRS report, this is how I got around it...
create table #temp
(id int, FName varchar(100), LName varchar(100)) -- id type assumed; the original omitted it
declare @sqlQuery nvarchar(max)
set @sqlQuery =
'insert into #temp
select
id,
FName,
LName
from dbo.table'
exec (@sqlQuery)
select
FName, LName
from #temp
where #temp.id in (@id) -- @id being an SSRS parameter!
drop table #temp
Granted, the problem with this query is that the dynamic SQL will select everything from dbo.table, and then the select from #temp is where the filter kicks in, so if there's a large amount of data - it's probably not so great. But... I got frustrated trying to get REPLACE to work, or any other solutions others had posted.
This takes care of the middle, turning each comma into ',':
SET @StateList = REPLACE(@StateList, ',', ''',''')
Then quote the edges:
SET @WhereClause1 = @WhereClause1 + 'AND customerState IN (''' + @StateList + ''') '
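The two steps are easier to follow without the doubled T-SQL quotes. Here is a minimal Python sketch of the same string transformation; the function name quote_csv_list is made up, and like the T-SQL version it assumes the values contain no single quotes or commas of their own.

```python
def quote_csv_list(csv_list):
    """Mimic ''' + REPLACE(list, ',', ''',''') + ''' from the T-SQL above."""
    # Step 1: replace each comma with ',' to close one literal and open the next.
    middle = csv_list.replace(",", "','")
    # Step 2: add the opening and closing quotes at the edges.
    return "'" + middle + "'"


state_list = "AL,CA,NY,TN,VA"
where_clause = "AND customerState IN (" + quote_csv_list(state_list) + ") "
print(where_clause)
```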