I want to load and run a SQL file from inside another SQL file.
My goal is to have one SQL file with all the table creations and, after each CREATE statement, run <table-name>-data.sql to insert the data for that table.
I just don't know the right SQL command for that, because COPY is only for CSV files
and LOAD is for shared libraries:
LOAD — load a shared library file
File with the data from a table:
<table-name>-data.sql:
INSERT INTO public.table VALUES ('2022-11-16');
INSERT INTO public.table VALUES ('2022-11-17');
INSERT INTO public.table VALUES ('2022-11-18');
INSERT INTO public.table VALUES ('2022-11-19');
File where I create the table and then load <table-name>-data.sql:
create.sql:
... sql for creating the table ...
run "c:/path/<table-name>-data.sql"
And in the end I can just run the create.sql file.
Solved by #Álvaro González,
Just use: \i path/file.sql
It also works inside a SQL file, as long as that file is run with psql: \i is a psql meta-command, not an SQL statement, so other clients will not understand it.
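A minimal sketch of what create.sql could look like with \i. The table name events, its single date column and the file name events-data.sql are made up for illustration; only \i and \ir themselves are psql features.

```sql
-- create.sql
-- Run with psql, e.g.:  psql -d <database> -f create.sql
-- (hypothetical table; substitute your own CREATE statements)
CREATE TABLE public.events (
    event_date date
);

-- \i resolves a relative path against the directory psql was started in.
\i events-data.sql

-- \ir resolves the path relative to the directory of the script that
-- contains it, which is usually what you want for nested includes:
-- \ir events-data.sql
```

Because psql interprets the \i line itself, the include must stay on its own line and the script has to be fed to psql rather than to another client.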
Related
I am using ora2pg to export the TABLE and INSERT types from an Oracle database. https://ora2pg.darold.net/documentation.html#:~:text=Ora2Pg%20consist%20of,QUERY%2C%20KETTLE%2C%20SYNONYM.
I have 2 questions.
1. The generated TABLE and INSERT statements put double quotes around table and column names, but I want to create them without double quotes. Is it possible to configure this in the .conf file?
2. The INSERT file generated by ora2pg does not have the statements in the right order. Because of foreign key constraints, the parent table's data must be inserted before the child table's data, but ora2pg does not take this into account: the child table's INSERT statements come before the parent table's, which causes errors. Is this how ora2pg works, or am I doing something wrong in the .conf file?
I am writing an extension for Postgres.
To do that, I make a plain-text backup of my functions, types, etc. and use this file for my extension.
Now I want to add an auxiliary table too. But the dump of the table in the file looks like this (after it has created the table "tAcero" and the sequence):
COPY sdmed."tAcero" (id, area, masa, tipo, tamanno) FROM stdin;
44 65.30 502.000 HEB 180
45 78.10 601.000 HEB 200
.....
more values
\.
I wonder whether it is possible to use this COPY statement to populate the table inside the extension, or whether I can only do it using INSERT.
Thank you.
You can indeed load tables in PostgreSQL using the COPY statement.
An example using the psql client and a CSV file:
CREATE TABLE test_of_copy (my_column text);
\copy test_of_copy FROM './a_file_stored_locally' CSV HEADER
Where the contents of a_file_stored_locally are:
my_column
"test_input"
Please have a read of the documentation: https://www.postgresql.org/docs/9.2/sql-copy.html. If you have any issues with this, perhaps add some more detail to your question.
Is there a way to COPY CSV file data directly into a JSON or JSONB array?
Example:
CREATE TABLE mytable (
    id serial PRIMARY KEY,
    info JSONB -- or JSON
);
COPY mytable(info) FROM '/tmp/myfile.csv' HEADER csv;
NOTE: each CSV line is mapped to a JSON array. It is a normal CSV.
Normal CSV (no embedded JSON)... /tmp/myfile.csv =
a,b,c
100,Mum,Dad
200,Hello,Bye
The correct COPY command must be equivalent to the usual COPY below.
Usual COPY (ugly but works fine)
CREATE TEMPORARY TABLE temp1 (
a int, b text, c text
);
COPY temp1(a,b,c) FROM '/tmp/myfile.csv' HEADER csv;
INSERT INTO mytable(info) SELECT json_build_array(a,b,c) FROM temp1;
It is ugly because:
It needs a priori knowledge of the fields, and a previous CREATE TABLE that lists them.
For "big data" it needs a big temporary table, which costs CPU, disk and my time; the table mytable has CHECK and UNIQUE constraints for each line.
It needs more than one SQL command.
Perfect solution!
There is no need to know all the CSV columns; extract only the ones you know.
First run CREATE EXTENSION plpythonu; in SQL. If the command produces an error like "could not open extension control file ... No such file", you need to install the PL/Python packages. On standard Ubuntu (16 LTS) this is simple: apt install postgresql-contrib postgresql-plpython.
CREATE FUNCTION get_csvfile(
file text,
delim_char char(1) = ',',
quote_char char(1) = '"')
returns setof text[] stable language plpythonu as $$
import csv
return csv.reader(
open(file, 'rb'),
quotechar=quote_char,
delimiter=delim_char,
skipinitialspace=True,
escapechar='\\'
)
$$;
INSERT INTO mytable(info)
SELECT jsonb_build_array(c[1],c[2],c[3])
FROM get_csvfile('/tmp/myfile1.csv') c;
The split_csv() function was defined here. The csv.reader is very reliable (!).
Not tested on very large CSV files, but Python is expected to do the job.
PostgreSQL workaround
It is not a perfect solution, but it solves the main problem, which is the
"... big temporary table, so lost CPU, disk and my time" ...
This is the way we do it, a workaround with file_fdw!
1. Adopt conventions of your own to avoid file-copy and file-permission confusion, such as a standard file path for the CSV. Example: /tmp/pg_myPrj_file.csv
2. Initialise your database or SQL script with the magic extension:
CREATE EXTENSION file_fdw;
CREATE SERVER files FOREIGN DATA WRAPPER file_fdw;
3. For each CSV file, myNewData.csv:
3.1. Make a symbolic link (or an scp remote copy) for your new file: ln -sf $PWD/myNewData.csv /tmp/pg_socKer_file.csv
3.2. Configure file_fdw for your new table (suppose mytable).
CREATE FOREIGN TABLE temp1 (a int, b text, c text)
SERVER files OPTIONS (
filename '/tmp/pg_socKer_file.csv',
format 'csv',
header 'true'
);
PS: if you hit a permission problem after running the SQL script with psql, change the owner of the link with sudo chown -h postgres:postgres /tmp/pg_socKer_file.csv.
3.3. Use the file_fdw table as the source (suppose you are populating mytable).
INSERT INTO mytable(info)
SELECT json_build_array(a,b,c) FROM temp1;
Thanks to #JosMac (and his tutorial)!
NOTE: if there is a STDIN way to do it (does one exist?), it would be easier, avoiding permission problems and the use of absolute paths. See this answer/discussion.
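On the STDIN question, one option worth trying is psql's \copy: it reads the file on the client side and sends it to the server as COPY ... FROM STDIN, so the server never opens the file and no absolute server path or postgres-owned link is needed. A rough sketch, reusing the names temp1, mytable and myNewData.csv from above; note that it still relies on a staging table, so it only addresses the permission and path problem, not the temporary-table cost.

```sql
-- Run inside psql, from the directory that holds myNewData.csv.
CREATE TEMPORARY TABLE temp1 (a int, b text, c text);

-- \copy is a psql meta-command: it must stay on a single line
-- and takes no terminating semicolon.
\copy temp1 (a, b, c) FROM 'myNewData.csv' WITH (FORMAT csv, HEADER true)

-- jsonb_build_array matches a JSONB column; use json_build_array
-- instead if info is declared as JSON.
INSERT INTO mytable (info)
SELECT jsonb_build_array(a, b, c) FROM temp1;
```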
Since Redshift is based on PostgreSQL, does it have an option to overwrite or append data in a table when copying from S3 to Redshift?
The only thing I found is triggers, but they don't accept any arguments.
All I need is to write a script that takes a yes/no argument (or similar) saying whether to overwrite when the data is already in the table.
When loading data from Amazon S3 into Amazon Redshift using the COPY command, data is appended to the target table.
Redshift does not have an "overwrite" option. If you wish to replace existing data with the data being loaded, you could:
1. Load the data into a temporary table.
2. Delete rows in the main table that match the incoming data, e.g.:
DELETE FROM main_table WHERE id IN (SELECT id FROM temp_table);
3. Copy the rows from the temporary table to the main table, e.g.:
INSERT INTO main_table SELECT * FROM temp_table;
See: Updating and Inserting New Data
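Put together, the three steps above might look like the following sketch. The names main_table, stage and the id key column are placeholders, and the S3 path and IAM role are left as unfilled templates; adjust all of them to your setup.

```sql
-- 1. Load the incoming data into a temporary staging table
--    that has the same columns as the target table.
CREATE TEMP TABLE stage (LIKE main_table);

COPY stage
FROM 's3://<bucket>/<prefix>'
IAM_ROLE 'arn:aws:iam::<aws-account-id>:role/<role-name>';

-- 2. and 3. Replace matching rows inside one transaction so readers
--    never see the table with the old rows deleted but the new ones missing.
BEGIN TRANSACTION;

DELETE FROM main_table
USING stage
WHERE main_table.id = stage.id;

INSERT INTO main_table
SELECT * FROM stage;

END TRANSACTION;

DROP TABLE stage;
```

To overwrite everything instead of only the matching rows, the DELETE can drop its USING and WHERE clauses; to append, skip the DELETE entirely.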
Redshift doesn't let you create triggers or events as other SQL databases do. The solution I found is to run the update (a SQL query) from an R script scheduled as a crontab task, though you can also use Python or another language.
As of May 2019, Redshift supports stored procedures so you can package up a set of queries/statements like this:
CREATE OR REPLACE PROCEDURE public.copy_and_cleanse_data(overwrite bool)
AS $$
BEGIN
IF overwrite THEN
DELETE FROM myredshifttable;
END IF;
copy myredshifttable
from 's3://awssampledbuswest2/tickit/category_pipe.txt'
iam_role 'arn:aws:iam::<aws-account-id>:role/<role-name>'
region 'us-west-2';
UPDATE myredshifttable SET myfield = REPLACE(myfield, 'foo', 'bar');
END;
$$ LANGUAGE plpgsql
SECURITY DEFINER;
Then use or schedule the following query:
CALL public.copy_and_cleanse_data(true);
I'm wondering if I can use a trigger on a table to "ignore" columns that appear in a COPY statement from STDIN but are not in the target table. Sorry if the wording or syntax of the question is off; here is an explanation of what I'm trying to say. I'm new to triggers, so any advice is helpful.
I'm using the PostGIS Shapefile importer to copy shapefiles to the spatial tables in my PostgreSQL database.
This creates a COPY statement that contains all the fields in the shapefile, something like:
COPY "public"."stations" ("column1","column2","column3","column4", geom) FROM stdin;
column1 and column2 are in the file but not in the target table, so the COPY fails.
Is there a way to create a trigger to create something that would have the same result as:
COPY "public"."stations" ("column3","column4", geom) FROM stdin;
No, you cannot skip columns that are present in the input file. This will error out, before triggers are even invoked. And you cannot use rules either. I quote the manual:
COPY FROM will invoke any triggers and check constraints on the
destination table. However, it will not invoke rules.
You can either edit the file or use a temporary staging table:
COPY to a temporary table with matching columns.
Use INSERT to write the desired columns to the final target table(s) - or the whole range of SQL DDL commands for more sophisticated matters.
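The staging-table route could be sketched as follows for the stations example. The staging table name stations_stage and the column types are assumptions (the question does not give them), and the geometry type presumes PostGIS is installed, as it is in the question's setup.

```sql
-- Staging table with every column the importer writes.
-- Column types are guesses; match them to the shapefile's fields.
CREATE TEMP TABLE stations_stage (
    column1 text,
    column2 text,
    column3 text,
    column4 text,
    geom    geometry
);

-- Point the importer's COPY at the staging table. The data rows and
-- the terminating \. line must follow the COPY line directly; do not
-- put comments or other statements between them.
COPY stations_stage (column1, column2, column3, column4, geom) FROM stdin;
\.

-- Keep only the columns that exist in the real target table.
INSERT INTO public.stations (column3, column4, geom)
SELECT column3, column4, geom
FROM stations_stage;
```

As written, the COPY receives no rows because the \. end marker follows immediately; in practice the rows generated by the importer sit between the two lines.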