Postgres - limit number of rows COPY FROM

Is there a way to limit the Postgres COPY FROM syntax to only the first row? There doesn't seem to be an option listed in the documentation.
I know there's that functionality in SQL Server; see the FIRSTROW and LASTROW options below:
BULK INSERT sometable
FROM 'E:\filefromabove.txt'
WITH
(
FIRSTROW = 2,
LASTROW = 4,
FIELDTERMINATOR = '|',
ROWTERMINATOR = '\n'
)

You could use the PROGRAM option to preprocess the file and have COPY read from the command's standard output.
To load only the first line, use:
Unix/Linux/Mac
COPY sometable FROM PROGRAM 'head -1 filefromabove.txt';
Windows
COPY sometable FROM PROGRAM 'set /p var= <filefromabove.txt && echo %var%';
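The same trick can mimic SQL Server's FIRSTROW = 2 (skip a header line) on Unix-like systems; a minimal sketch, assuming the file and table from above:
-- tail -n +2 prints the file starting at line 2, skipping the header
COPY sometable FROM PROGRAM 'tail -n +2 filefromabove.txt';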

Related

How to get the column headers from a select query using the psycopg2 client?

I have this Python 3 code:
import psycopg2

conn = psycopg2.connect( ... )
curr = conn.cursor()
curr.execute(code)
rows = curr.fetchall()
where 'code' holds the SELECT query statement.
After executing this, the 'rows' list contains lists of only the selected row values. How do I run 'curr.execute' so that I also get the respective column headers?
Meaning, if I have, say,
Select col1, col2 from table Where some_condition;
I want my 'rows' list to contain something like [['col1', 'col2'], [some_val_for_col1, some_val_for_col2] ...]. Any other way of getting these column headers is also fine, but the select query in 'code' shouldn't change.
You have to execute two commands:
curr.execute("Select * FROM people LIMIT 0")  # fetches no rows, but populates curr.description
colnames = [desc[0] for desc in curr.description]
curr.execute(code)
rows = [colnames] + curr.fetchall()  # curr.description is also set after this execute
You can also follow the steps described at https://kb.objectrocket.com/postgresql/get-the-column-names-from-a-postgresql-table-with-the-psycopg2-python-adapter-756
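If the query reads from a single known table in column order, the catalog can also supply the names; a hedged SQL sketch (the table name 'people' is taken from the snippet above, not from the original question):
-- Lists column names of one table in their defined order
SELECT column_name
FROM information_schema.columns
WHERE table_name = 'people'
ORDER BY ordinal_position;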

Copy into snowflake table from raw data file using Perl DBI

There's not much info out there for Perl DBI and Snowflake, so I'll give this a shot. I have a raw file whose headers are contained in line 1. This exact 'copy into' command works from the Snowflake GUI. I'm not sure if I can just take this exact command and put it into a Perl prepare and execute.
COPY INTO DBTABLE.LND_LND_STANDARD_DATA FROM (
SELECT SPLIT_PART(METADATA$FILENAME,'/',4) as SEAT_ID,
$1:auction_id_64 as AUCTION_ID_64,
DATEADD(S,$1:date_time,'1970-01-01') as DATE_TIME,
$1:user_tz_offset as USER_TZ_OFFSET,
$1:creative_width as CREATIVE_WIDTH,
$1:creative_height as CREATIVE_HEIGHT,
$1:media_type as MEDIA_TYPE,
$1:fold_position as FOLD_POSITION,
$1:event_type as EVENT_TYPE
FROM @DBTABLE.lnd.S3_STAGE_READY/pr/data/standard/data_dt=20200825/00/STANDARD_FILE.gz.parquet)
pattern = '.*.parquet' file_format = (TYPE = 'PARQUET' SNAPPY_COMPRESSION = TRUE)
ON_ERROR = 'SKIP_FILE_10%'
my $SQL = "COPY INTO DBTABLE.LND_LND_STANDARD_DATA FROM (
SELECT SPLIT_PART(METADATA\$FILENAME,'/',4) as SEAT_ID,
\$1:auction_id_64 as AUCTION_ID_64,
DATEADD(S,\$1:date_time,'1970-01-01') as DATE_TIME,
\$1:user_tz_offset as USER_TZ_OFFSET,
\$1:creative_width as CREATIVE_WIDTH,
\$1:creative_height as CREATIVE_HEIGHT,
\$1:media_type as MEDIA_TYPE,
\$1:fold_position as FOLD_POSITION,
\$1:event_type as EVENT_TYPE
FROM \@DBTABLE.lnd.S3_STAGE_READY/pr/data/standard/data_dt=20200825/00/STANDARD_FILE.gz.parquet)
pattern = '.*.parquet' file_format = (TYPE = 'PARQUET' SNAPPY_COMPRESSION = TRUE)
ON_ERROR = 'SKIP_FILE_10%'";
my $sth = $dbh->prepare($SQL);
$sth->execute;
Looking at the output from Snowflake, I see this error:
syntax error line 3 at position 4 unexpected '?'.
syntax error line 4 at position 13 unexpected '?'.
COPY INTO DBTABLE.LND_LND_STANDARD_DATA FROM (
SELECT SPLIT_PART(METADATA$FILENAME,'/',4) as SEAT_ID,
$1? as AUCTION_ID_64,
DATEADD(S,$1?,'1970-01-01') as DATE_TIME,
$1? as USER_TZ_OFFSET,
$1? as CREATIVE_WIDTH,
$1? as CREATIVE_HEIGHT,
$1? as MEDIA_TYPE
Do I need to create bind variables for each of the columns? I usually pull the data in from the file and put it into variables, but this is different, as I can't read the raw file first; it has to come directly from the COPY INTO command.
Any help would be appreciated.
DBI was interpreting the : as a bind-variable placeholder rather than as a value accessor in a variant. I used the bracket notation instead, like the following:
my $SQL = "COPY INTO DBTABLE.LND_LND_STANDARD_DATA FROM (
SELECT SPLIT_PART(METADATA\$FILENAME,'/',4) as SEAT_ID,
\$1['auction_id_64'] as AUCTION_ID_64,
DATEADD(S,\$1['date_time'],'1970-01-01') as DATE_TIME,
\$1['user_tz_offset'] as USER_TZ_OFFSET,
\$1['creative_width'] as CREATIVE_WIDTH,
etc...
That worked.
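For reference, the two accessor styles are equivalent in Snowflake SQL; the bracket form just avoids the :name sequence that DBI scans for as a named placeholder. A minimal sketch, where the stage name, file path, and file format are hypothetical:
-- Both expressions read the same field from a staged variant row;
-- '@my_stage/data.parquet' and 'my_parquet_format' are placeholders.
SELECT $1:auction_id_64    AS via_colon,
       $1['auction_id_64'] AS via_bracket
FROM @my_stage/data.parquet (FILE_FORMAT => 'my_parquet_format');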

Quote strings and dates in psql query results output

Under the \pset [ option [ value ] ] section of the psql docs, I can adjust various settings to make my query results convenient for me.
I can, for example, approach a CSV-like output with:
\pset fieldsep ','
\pset footer off
\pset format unaligned
\pset null 'NULL'
Resulting in output like:
> WITH foo_tbl(foo,bar,baz)
> AS
> (
> VALUES
> ('foo', NULL, 1),
> (NULL, 'bar', 1)
> )
> SELECT * FROM foo_tbl;
foo,bar,baz
foo,NULL,1
NULL,bar,1
This is great, but I'd like strings and dates to be quoted, like this:
foo,bar,baz
'foo',NULL,1
NULL,'bar',1
Is this not possible with psql?
p.s. I know this kind of thing can be done with SQL clients like DBeaver, but that isn't in the scope of this question.
To generate CSV output, you can use the copy command rather than trying to tweak the output of a regular SELECT statement.
copy (
WITH foo_tbl (foo,bar,baz,dt) AS
(
VALUES
('foo', NULL, 1, date '2020-01-02'),
(NULL, 'bar', 1, date '2020-03-04')
)
SELECT *
FROM foo_tbl
) to stdout
with (format csv, quote '''', header, null 'NULL', force_quote (foo, dt) );
This will generate the following output:
foo,bar,baz,dt
'foo',NULL,1,'2020-01-02'
NULL,bar,1,'2020-03-04'
I am not aware of an option that will quote only dates and strings but not numbers, so using force_quote and specifying the columns to quote is the only way to get them quoted consistently.
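As a side note, force_quote also accepts * to quote every non-NULL field, numbers included, which may be acceptable if the consumer tolerates quoted numerics:
copy (...) to stdout
with (format csv, quote '''', header, null 'NULL', force_quote *);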
copy (...) to stdout is easier to use than its psql sibling \copy because it allows multi-line queries.
To write everything into a file, you can use the \o command in psql (a bare \o afterwards sends output back to stdout):
postgres=> \o data.csv
postgres=> copy (...) to stdout with (...);
postgres=> \o

Use outcome of SQL query as value for variable

I have a query that extracts performance measurements for a number of APIs, and I want to save those over time to different files in one folder: say, one run and one output file every hour.
The Invantive scripting statement
local export results as "${exportfilename}" format xml
can do this when exportfilename is set up correctly.
With Oracle SQL*Plus you can store the outcome of a query in a variable with the column ... new_value syntax.
How can I set exportfilename using the outcome of an Invantive SQL query?
The solution was to use the ${outcome:row,column} syntax, as in:
local define outfolder "c:\temp"
select sdy3.value || '-' || lpad(year(sysdate), 4, '0') || lpad(month(sysdate), 2, '0') || lpad(day(sysdate), 2, '0') || lpad(hour(sysdate), 2, '0') || lpad(minute(sysdate), 2, '0') ||'.xml' file_name
from exactonlinerest..systemdatacontainerproperties sdy1
join exactonlinerest..systemdatacontainerproperties sdy2
on sdy2.data_container_alias = 'default'
and sdy2.name = 'provider-description'
join exactonlinerest..systemdatacontainerproperties sdy3
on sdy3.data_container_alias = 'default'
and sdy3.name = 'provider-short-name'
where sdy1.data_container_alias = 'default'
and sdy1.name = 'data-container-id'
local define exportfilename "${outfolder}\${outcome:0,0}"
<<< Run actual SQL>>>
local export results as "${exportfilename}" format xml
The ${outcome:row,column} syntax puts the string representation of the cell at the given row number (0..max) and column number (0..max) of the last result set into the indicated variable name.

How to import a CSV file through a stored function?

I have a sample CSV file which contains 10 records, and I want to upload the CSV file through a stored procedure. Is it possible to do it that way? This is my stored function:
FOR i IN 1..v_cnt LOOP
SELECT idx_date,file_path INTO v_idx_date,v_file_path FROM cloud10k.temp_idx_dates
WHERE is_updated IS FALSE LIMIT 1;
COPY cloud10k.temp_master_idx_new(header_section) FROM v_file_path;
DELETE FROM cloud10k.temp_master_idx_new WHERE header_section NOT ILIKE '%.txt%';
UPDATE cloud10k.temp_master_idx_new SET CIK = split_part( header_section,'|',1),
company_name = split_part( header_section,'|',2),
form_type = split_part( header_section,'|',3),
date_filed = split_part( header_section,'|',4)::DATE,
accession_number = replace(split_part(split_part( header_section,'|',5),'/',4),'.txt',''),
file_path = to_char(SUBSTRING(SPLIT_PART(v_file_path,'master.',2) FROM 1 FOR 8)::DATE,'YYYY')
||'/'||to_char(SUBSTRING(SPLIT_PART(v_file_path,'master.',2) FROM 1 FOR 8)::DATE,'MM')
||'/'||to_char(SUBSTRING(SPLIT_PART(v_file_path,'master.',2) FROM 1 FOR 8)::DATE,'DD')
||'/'||CONCAT_WS('.','master',SPLIT_PART(v_file_path,'master.',2) )
WHERE header_section ILIKE '%.txt%';
END LOOP;
But it's not executing. Can someone suggest how to do that?
Thanks,
Ramesh
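One likely culprit, offered as an assumption rather than a confirmed answer: COPY ... FROM expects a quoted literal filename, so referencing the PL/pgSQL variable v_file_path directly is a syntax error. The usual workaround is dynamic SQL inside the function; a minimal sketch:
-- Build the COPY statement dynamically so the filename held in
-- v_file_path can be spliced in; %L quotes it as a SQL literal.
EXECUTE format(
    'COPY cloud10k.temp_master_idx_new(header_section) FROM %L',
    v_file_path
);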