Command to read a file and execute script with psql - postgresql

I am using PostgreSQL 9.0.3 on Windows. I have an Excel spreadsheet with a lot of data to load into a couple of tables.
I have written a script that reads the data from the input file and inserts it into some 15 tables. This can't be done with COPY or Import. I named the input file DATALD.
I found the psql options -d to select the database and -f to run the SQL script. But I need to know how to feed the input file to the script so that the data gets inserted into the tables.
For example this is what I have done:
begin
for emp in (select distinct w_name from DATALD where w_name <> 'w_name')
--insert in a loop
INSERT INTO tblemployer( id_employer, employer_name,date_created, created_by)
VALUES (employer_id,emp.w_name,now(),'SYSTEM1');
Can someone please help?

For an SQL script you must either:
have the data inlined in your script (in the same file), or
use COPY to import the data into Postgres.
I suppose you use a temporary staging table, since the format doesn't seem to fit the target tables. Code example:
How to bulk insert only new rows in PostreSQL
There are other options like pg_read_file(). But:
Use of these functions is restricted to superusers.
Intended for special purposes.
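The staging-table route could be sketched roughly as follows. This is only an illustration under assumptions that are not in the question: the spreadsheet has been exported to a file called datald.csv with a header row and a single w_name column, id_employer is filled by a column default such as a sequence, and the script name load_datald.sql is made up.

```shell
#!/bin/bash
set -euo pipefail

# Generate a psql script that stages the CSV in a temp table, then inserts from it.
# Assumed names (hypothetical): datald.csv, load_datald.sql, one column w_name.
write_load_script() {
    local csv_file=$1 sql_file=$2
    # Unquoted heredoc: $csv_file is expanded, and \\ becomes a single backslash.
    cat > "$sql_file" <<EOF
BEGIN;
CREATE TEMP TABLE datald (w_name text) ON COMMIT DROP;
\\copy datald FROM '$csv_file' WITH CSV HEADER
INSERT INTO tblemployer (employer_name, date_created, created_by)
SELECT DISTINCT w_name, now(), 'SYSTEM1'
FROM datald;
COMMIT;
EOF
}

write_load_script datald.csv load_datald.sql
```

The generated file would then be run with something like psql -d mydb -f load_datald.sql, where mydb is a placeholder. Because \copy reads the file on the client side, it does not need superuser rights; a real export with more columns needs a staging table with matching columns.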

Related

In PostgreSQL, is there a CLI command to copy the speed of a SELECT statement as well as the SELECT statement into a text file (without the data)?

I am currently comparing performance of PostgreSQL with several other SQL systems. I am aware of the \timing option to turn on timing queries. However, I would very much like to automate the process of copying the statements executed and the query speed below it. I imagine there is a simple way to log this?
Let's say I run:
CREATE TABLE t1 AS
SELECT itemID, prodCategory
FROM products
WHERE prodCategory = 'footwear';
I want to automatically save into a text file:
CREATE TABLE t1 AS
SELECT itemID, prodCategory
FROM products
WHERE prodCategory = 'footwear';
SELECT 7790
Time: 10.884 ms
If OS Specifications are needed, I am using MacOS.
I just learned that you can use the script command:
script filename
to save everything that is printed on your screen to a file. If \timing is on, this records both the queries and their timing output.
To stop recording, simply type exit.
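A non-interactive alternative, sketched here under assumptions, is to let psql echo the statements itself: the -a (--echo-all) option prints each input line as it is read, and \timing on inside the script adds the Time: lines. The file names bench.sql, bench_timed.sql and bench.log and the database name mydb are hypothetical.

```shell
#!/bin/bash
set -euo pipefail

# Emit a copy of a SQL file with "\timing on" prepended,
# so every statement in it reports its duration.
with_timing() {
    printf '%s\n' '\timing on'
    cat "$1"
}

# Demo input: the statement from the question, saved to a hypothetical file.
cat > bench.sql <<'EOF'
CREATE TABLE t1 AS
SELECT itemID, prodCategory
FROM products
WHERE prodCategory = 'footwear';
EOF

with_timing bench.sql > bench_timed.sql

# Run it with statement echo and capture everything in a log file:
#   psql -a -d mydb -f bench_timed.sql > bench.log 2>&1
```

The log should then contain each statement followed by its command tag and timing, without needing an interactive terminal session.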

How to ignore errors with psql \copy meta-command

I am using psql with a PostgreSQL database and the following copy command:
\COPY isa (np1, np2, sentence) FROM 'c:\Downloads\isa.txt' WITH DELIMITER '|'
I get:
ERROR: extra data after last expected column
How can I skip the lines with errors?
You cannot skip the errors without skipping the whole command up to and including Postgres 14. There is currently no more sophisticated error handling.
\copy is just a wrapper around SQL COPY that channels results through psql. The manual for COPY:
COPY stops operation at the first error. This should not lead to problems in the event of a COPY TO, but the target table will
already have received earlier rows in a COPY FROM. These rows will
not be visible or accessible, but they still occupy disk space. This
might amount to a considerable amount of wasted disk space if the
failure happened well into a large copy operation. You might wish to
invoke VACUUM to recover the wasted space.
Bold emphasis mine. And:
COPY FROM will raise an error if any line of the input file contains
more or fewer columns than are expected.
COPY is an extremely fast way to import / export data. Sophisticated checks and error handling would slow it down.
There was an attempt to add error logging to COPY in Postgres 9.0 but it was never committed.
Solution
Fix your input file instead.
If you have one or more additional columns in your input file and the file is otherwise consistent, you might add dummy columns to your table isa and drop those afterwards. Or (cleaner with production tables) import to a temporary staging table and INSERT selected columns (or expressions) to your target table isa from there.
Related answers with detailed instructions:
How to update selected rows with values from a CSV file in Postgres?
COPY command: copy only specific columns from csv
It is too bad that after 25 years Postgres still doesn't have an ignore-errors flag or option for the COPY command. In this era of big data you get a lot of dirty records, and it can be very costly for a project to fix every outlier.
I had to make a work-around this way:
Copy the original table and call it dummy_original_table
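On the dummy table, create a BEFORE INSERT trigger like this:
CREATE OR REPLACE FUNCTION on_insert_in_original_table() RETURNS trigger AS $$
DECLARE
    v_rec RECORD;
BEGIN
    -- we use the trigger to prevent 'duplicate index' error by returning NULL on duplicates
    SELECT * INTO v_rec FROM original_table WHERE primary_key = NEW.primary_key;
    IF FOUND THEN
        RETURN NULL;
    END IF;
    BEGIN
        INSERT INTO original_table(datum, primary_key) VALUES (NEW.datum, NEW.primary_key)
        ON CONFLICT DO NOTHING;
    EXCEPTION
        WHEN OTHERS THEN
            NULL; -- swallow any other error for this row
    END;
    -- returning NULL from a BEFORE trigger keeps the row out of the dummy table
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;
-- the trigger name is arbitrary
CREATE TRIGGER redirect_to_original_table
BEFORE INSERT ON dummy_original_table
FOR EACH ROW EXECUTE PROCEDURE on_insert_in_original_table();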
Run a copy into the dummy table. No record will be inserted there, but all of them will be inserted in the original_table
psql dbname -c "\copy dummy_original_table(datum,primary_key) FROM '/home/user/data.csv' DELIMITER E'\t'"
Workaround: remove the reported errant line using sed and run \copy again
Later versions of Postgres (including Postgres 13) report the line number of the error. You can then remove that line with sed and run \copy again, e.g.:
#!/bin/bash
bad_line_number=5 # assuming line 5 is the bad line
sed "${bad_line_number}d" input.csv > filtered.csv
[per the comment from #Botond_Balázs ]
Here's one solution -- import the batch file one line at a time. The performance can be much slower, but it may be sufficient for your scenario:
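#!/bin/bash
input_file=./my_input.csv
tmp_file=/tmp/one-line.csv
# Feed the input to \copy one line at a time, so a bad line only loses itself.
while IFS= read -r input_line; do
    printf '%s\n' "$input_line" > "$tmp_file"
    # \copy reads the file on the client; the path goes in single quotes, not backticks.
    psql my_database -c "\copy my_table FROM '$tmp_file' WITH DELIMITER '|' CSV"
done < "$input_file"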
Additionally, you could modify the script to capture the psql stdout/stderr and exit status, and if the exit status is non-zero, echo $input_line and the captured stdout/stderr to stdout and/or append them to a file.

Conditional Select using a file

I have a file with a list of identifying attributes, one on each line. I'd like to use these as part of a conditional select statement in psql. The kind of query I have in mind is:
SELECT * FROM mytable where mykey IN ('contents of file');
I'd like to point the IN construct to the file (which is being generated from another database and script.) Every entry in the file will be in mytable
Is there a way to do this in psql from the command line? My intention is to run this as part of a bash script on our servers.
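One possible approach, sketched under assumptions, is to have the bash script turn the file into a quoted, comma-separated list and splice it into the query. The helper name sql_in_list, the file name keys.txt, its sample contents and the database name mydb are all hypothetical; the sketch assumes plain text keys with Unix line endings and a non-empty file.

```shell
#!/bin/bash
set -euo pipefail

# Turn a file with one key per line into a quoted, comma-separated SQL list.
# Embedded single quotes are doubled so they stay inside the string literal.
sql_in_list() {
    sed -e "s/'/''/g" -e "s/.*/'&'/" "$1" | paste -sd, -
}

# Demo with a throwaway key file (hypothetical contents).
printf '%s\n' 'alpha' "o'brien" > keys.txt

query="SELECT * FROM mytable WHERE mykey IN ($(sql_in_list keys.txt));"
echo "$query"

# To run it against a database:
#   psql -d mydb -c "$query"
```

For very long key files, loading the keys into a temporary table with \copy and joining against it is likely to scale better than a huge IN list.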

PostgreSQL isset function

Is there a way to check whether a variable has already been set in my environment?
Example:
\set table_name countries
\i queries.sql
queries.sql:
SELECT * FROM :table_name;
I want queries.sql to be callable on its own, using a default table name that I would specify.
Is this possible or do I really need to create another SQL file through which I will call the queries (\i)?
My use case is usage of my SQL queries both in pgTAP unit tests (with some sample table names) and independently.
You could check the current value with:
SELECT :'table_name';
You can set it on the call to psql with --set=table_name=countries (or the short form -v table_name=countries) on the psql command line.
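If the default can live outside the SQL file, a small wrapper script is one way to supply it. The sketch below is an assumption-laden illustration: the environment variable TABLE_NAME and the default value countries are choices made for the example, and the psql call is only printed, not executed.

```shell
#!/bin/bash
set -euo pipefail

# Use the caller's TABLE_NAME if it is set, otherwise fall back to a default.
table_name=${TABLE_NAME:-countries}

# -v name=value (long form: --set) defines a psql variable before the script
# runs, so :table_name inside queries.sql expands to it.
cmd=(psql -v "table_name=$table_name" -f queries.sql)

printf '%s\n' "${cmd[*]}"   # show the command that would be run
# "${cmd[@]}"               # uncomment to execute it
```

Calling the wrapper as TABLE_NAME=cities ./run_queries.sh (script name hypothetical) would override the default, while the pgTAP tests can keep using \set before \i.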

stored procedures in postgresql

I want to know WHERE to write stored procedures in PostgreSQL.
I mean not how to write one, but the very basic thing: where do I go if I want to write one?
Is it written just like a query, or in some different sort of file?
I am fairly new to PostgreSQL, so please explain as much as possible.
Just use any text editor to create a (SQL) file containing the necessary CREATE FUNCTION statement.
Then run that file using psql.
As an alternative you can use a GUI tool like pgAdmin or something similar (Squirrel, DbVisualizer, SQL Workbench/J, ...) where you have the editor "built-in"
You can directly run the statement that you edit against the database.
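The text-editor-plus-psql workflow could look roughly like this. It is a minimal sketch: the function add_numbers, the file name create_add_numbers.sql and the database name mydb are invented for illustration.

```shell
#!/bin/bash
set -euo pipefail

# Write the function definition to a file. The quoted 'EOF' stops the shell
# from expanding $1 and $2, which belong to the SQL function body.
cat > create_add_numbers.sql <<'EOF'
CREATE OR REPLACE FUNCTION add_numbers(integer, integer)
RETURNS integer AS
$body$
    SELECT $1 + $2;
$body$
LANGUAGE sql;
EOF

# Load it into a database (mydb is a placeholder):
#   psql -d mydb -f create_add_numbers.sql
# and call it afterwards:
#   psql -d mydb -c "SELECT add_numbers(2, 3);"
```

Once loaded, the function is stored in the database itself; the .sql file is just a convenient place to keep its source under version control.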
Use the CREATE FUNCTION command in whatever your preferred PostgreSQL client is.
Something like this (pseudo SQL):
CREATE OR REPLACE FUNCTION
MyProc(text, text)
RETURNS
void
AS
$delimiter$
INSERT INTO MyTable (text_val_1, text_val_2)
VALUES ($1, $2);
$delimiter$
LANGUAGE SQL;
More info can be found here:
http://www.day32.com/MySQL/Meetup/Presentations/postgresql_stored_procedures.pdf
Open the pgAdmin application (install it first if you do not have it).
Then click the button marked in the screenshot, and a query editor will appear on the right side. You write your query, stored procedure, or function in this query editor.
See the attached screenshot: