PostgreSQL - batch + script + variable - postgresql

I am not a programmer, I am struggling a bit with this.
I have a batch file that connects to my PostgreSQL server and then runs a SQL script. Everything works as expected. My question is how to pass a variable (if possible) from the batch file to the script.
Here is my batch file:
set PGPASSWORD=xxxx
cls
@echo off
C:\Progra~1\PostgreSQL\8.3\bin\psql -d Total -h localhost -p 5432 -U postgres -f C:\TotalProteinImport.sql
And here's the script:
copy totalprotein from 'c:/TP.csv' DELIMITERS ',' CSV HEADER;
update anagrafica
set pt=(select totalprotein.resultvalue from totalprotein where totalprotein.accessionnbr=anagrafica.id)
where data_analisi = '12/23/2011';
delete from totalprotein;
This is working great, now the question is how could I pass a variable that would carry the date for data_analisi?
Like in the batch file, "Please enter date", and then the value is passed to the sql script.

You could create a function out of your SQL script like this:
CREATE OR REPLACE FUNCTION f_myfunc(date)
RETURNS void AS
$BODY$
CREATE TEMP TABLE t_tmp ON COMMIT DROP AS
SELECT * FROM totalprotein LIMIT 0; -- copy table-structure from table
COPY t_tmp FROM 'c:/TP.csv' DELIMITERS ',' CSV HEADER;
UPDATE anagrafica a
SET pt = t.resultvalue
FROM t_tmp t
WHERE a.data_analisi = $1
AND t.accessionnbr = a.id;
-- Temp table is dropped automatically at end of session
-- In this case (ON COMMIT DROP) after the transaction
$BODY$
LANGUAGE sql;
You can use language SQL for this kind of simple SQL batch.
As you can see I have made a couple of modifications to your script that should make it faster, cleaner and safer.
Major points
For reading data into an empty table temporarily, use a temporary table. Saves a lot of disc writes and is much faster.
To simplify the process I use your existing table totalprotein as template for the creation of the (empty) temp table.
If you want to delete all rows of a table use TRUNCATE instead of DELETE FROM. Much faster. In this particular case, you need neither. The temporary table is dropped automatically. See comments in function.
The way you updated anagrafica.pt you would set the column to NULL, if anything goes wrong in the process (date not found, wrong date, id not found ...). The way I rewrote the UPDATE, it only happens if matching data are found. I assume that is what you actually want.
Then ask for user input in your shell script and call the function with the date as parameter. This is how it could work in a Linux shell (as user postgres, with password-less access using the IDENT method in pg_hba.conf):
#! /bin/sh
# Ask for date. 'YYYY-MM-DD' = ISO date-format, valid with any postgres locale.
echo -n "Enter date in the form YYYY-MM-DD and press [ENTER]: "
read date
# check validity of $date ...
psql db -p5432 -c "SELECT f_myfunc('$date')"
-c makes psql execute a single SQL command and then exit. I wrote a lot more on psql and its command line options yesterday in a somewhat related answer.
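The "check validity of $date" step in the script above is left open. A minimal sketch of one way to fill it in, assuming a simple shape check is enough; the function name is_iso_date is made up for illustration:

```shell
#!/bin/sh
# is_iso_date (hypothetical helper): succeed only if $1 has the shape YYYY-MM-DD.
# This is a shape check, not a calendar check: 2011-02-31 still passes and is
# left for PostgreSQL to reject when the string is cast to date.
# Because only digits and dashes get through, a stray quote in the input
# cannot break out of the SQL string literal built from it.
is_iso_date() {
    case "$1" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}
```

In the script it would sit between the read and the psql call, for example: is_iso_date "$date" || { echo "Invalid date" >&2; exit 1; }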
Creating the corresponding Windows batch file remains as an exercise for you.
Call under Windows
The error message tells you:
Function tpimport(unknown) does not exist
Note the lower case letters: tpimport. I suspect you used mixed case letters to create the function, so now you have to enclose the function name in double quotes every time you use it.
Try this one (edited quotes!):
C:\Progra~1\PostgreSQL\8.3\bin\psql -d Total -h localhost -p 5432 -U postgres -c "SELECT ""TPImport""('%dateimport%')"
Note how I use single and double quotes here. I guess this could work under Windows. See here.
You made it hard for yourself when you chose to use mixed case identifiers in PostgreSQL - a folly which I never tire of warning against. Now you have to double quote the function name "TPImport" every time you use it. While perfectly legit, I would never do that. I use lower case letters for identifiers. Always. This way I never mix up lower / upper case and I never have to use double quotes.
The ultimate fix would be to recreate the function with a lower case name (just leave away the double quotes and it will be folded to lower case automatically). Then the function name will just work without any quoting.
Read the basics about identifiers here.
Also, consider upgrading to a more recent version of PostgreSQL. 8.3 is a bit rusty by now.

psql supports textual replacement variables. Within psql they can be set with \set and referenced as :varname.
\set xyz 'abcdef'
select :'xyz';
?column?
----------
abcdef
These variables can be set using command line arguments also:
psql -v xyz=value
The only problem is that these textual replacements always need some fiddling with quoting as shown by the first \set and select.
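Applied to the question above, this could look roughly like the following sketch. It assumes the hard-coded date in TotalProteinImport.sql has been replaced by :'analysis_date' (a variable name chosen here for illustration) and that psql is version 9.0 or later, since the :'name' quoting form is not available in 8.3. Variables are substituted in scripts read with -f, not in a command given with -c.

```shell
#!/bin/sh
# run_import (hypothetical helper): pass a date to the SQL script as the psql
# variable "analysis_date".
# Assumption: the script contains  WHERE data_analisi = :'analysis_date';
# PSQL can be pointed at another command to inspect the arguments (dry run).
run_import() {
    "${PSQL:-psql}" -d Total -h localhost -p 5432 -U postgres \
        -v analysis_date="$1" -f TotalProteinImport.sql
}
```

A call would then be run_import 2011-12-23, with the date taken from user input as in the other answer.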

After creating the function in Postgres, you must create a .bat file in the bin directory of your Postgres version, for example C:\Program Files\PostgreSQL\9.3\bin. Here you write:
@echo off
cd C:\Program Files\PostgreSQL\9.3\bin
psql -p 5432 -h localhost -d myDataBase -U postgres -c "select * from myFunction()"

Related

how to pass variable to copy command in Postgresql

I tried to make a variable in SQL statement in Postgresql, but it did not work.
There are many csv files stored under the path. I want to set path in Postgresql that can tell copy command where can find csv files.
SQL statement sample:
\set outpath '/home/clients/ats-dev/'
\COPY licenses (_id, name,number_seats ) FROM :outpath + 'licenses.csv' CSV HEADER DELIMITER ',';
\COPY uploaded_files (_id, added_date ) FROM :outpath + 'files.csv' CSV HEADER DELIMITER ',';
It did not work. I got the error: no such files. The two files licenses.csv and files.csv are stored under /home/clients/ats-dev on Ubuntu. I found a solution that uses "\set file 'license.csv'". It does not work for me because I have many csv files. I also tried "from : outpath || 'licenses.csv'". It did not work either. I would appreciate any help.
Using 9.3.
It looks like psql does not support :variable substitution within psql backslash commands.
test=> \set somevar fred
test=> \copy z from :somevar
:somevar: No such file or directory
so you will need to do this via an external tool like the Unix shell, e.g.
for f in *.csv; do
psql -c "\\copy $(basename "$f" .csv) FROM '$f' CSV HEADER"
done
You can try the COPY command:
\set outpath '\'/home/clients/ats-dev/'
COPY licenses (_id, name,number_seats ) FROM :outpath/licenses.csv' WITH CSV HEADER DELIMITER ',';
COPY uploaded_files (_id, added_date ) FROM :outpath/files.csv' WITH CSV HEADER DELIMITER ',';
Note: Files named in a COPY command are read or written directly by the server, not by the client application. Therefore, they must reside on or be accessible to the database server machine, not the client. They must be accessible to and readable or writable by the PostgreSQL user (the user ID the server runs as), not the client. Similarly, the command specified with PROGRAM is executed directly by the server, not by the client application, must be executable by the PostgreSQL user. COPY naming a file or command is only allowed to database superusers, since it allows reading or writing any file that the server has privileges to access.
Documentation: Postgresql 9.3 COPY
It may have been true when this was originally asked, that psql backslash commands didn't support variable interpolation, but in my PostgreSQL 14 instance that's no longer the case. However, the psql manpage is clear that \copy specifically does not support variable interpolation.
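Since \copy does not interpolate psql variables, one workaround is to let the shell expand the directory and feed the finished \copy lines to psql on stdin. A minimal sketch; the helper name copy_line and its argument order are made up for illustration:

```shell
#!/bin/sh
# copy_line (hypothetical helper): print one psql \copy meta-command.
#   $1 = table name, $2 = column list, $3 = directory, $4 = file name
# The shell builds the full path, so psql only ever sees a literal path.
# A trailing slash on the directory is dropped to avoid a double slash.
copy_line() {
    printf '%s\n' "\\copy $1 ($2) FROM '${3%/}/$4' CSV HEADER"
}
```

With outpath=/home/clients/ats-dev/ set in the shell, the two imports from the question could then be run as: { copy_line licenses '_id, name, number_seats' "$outpath" licenses.csv; copy_line uploaded_files '_id, added_date' "$outpath" files.csv; } | psql dbname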

Greenplum to file using PSQL

I'm trying to export data from Greenplum to a text file (on the client) with a pipe delimiter, using psql and \copy. In the output I see that a single backslash is converted to a double backslash and a tab is converted to \t.
Example
N\A is converted to N\\A
So how do I get just N\A instead of N\\A, and just spaces instead of \t?
Note: I'm only allowed to use \copy. Since my file is huge, I run into space issues when I use sed or Perl for find and replace.
Assuming you don't have any "^" characters, you could use that as the escape character.
copy tpcds.call_center to stdout with delimiter '|' escape '^';
More on copy can be found here: https://www.postgresql.org/docs/8.2/static/sql-copy.html
This technique will be relatively slow and put a burden on the Master. If you used gpfdist instead, you could leverage the parallelism in the cluster and bypass the master. This solution is ideal for unloading large amounts of data.
First, start the gpfdist process:
[gpadmin@gpdbsne ~]$ gpfdist -p 8888 > gpfdist_8888.log 2>&1 < gpfdist_8888.log &
[1] 2255
Now, you can create the external table.
[gpadmin@gpdbsne ~]$ psql
SET
Timing is on.
psql (8.2.15)
Type "help" for help.
gpadmin=# create writable external table tpcds.et_call_center
(like tpcds.call_center)
location ('gpfdist://gpdbsne:8888/call_center.txt')
format 'text' (delimiter '|' escape '^');
NOTICE: Table doesn't have 'distributed by' clause, defaulting to distribution columns from LIKE table
CREATE EXTERNAL TABLE
Time: 18.681 ms
Now, you insert the data:
gpadmin=# insert into tpcds.et_call_center select * from tpcds.call_center;
INSERT 0 6
Time: 72.653 ms
gpadmin=# \q
Verify:
[gpadmin@gpdbsne ~]$ wc -l call_center.txt
6 call_center.txt
In my example, I used the hostname "gpdbsne" which is accessible to all segments in this cluster. Typically, Greenplum uses a private network for communication between segments so this hostname will need to be connected to the private network.
Since the writable external table is written to with SQL, you can use whatever transformation logic you want in the SQL, so you can change tabs to spaces if you want. This eliminates the need for awk or sed to post-process the files. COPY can use SQL too but, like I said, it is slower than using writable external tables.

Postgres: Combining multiple COPY TO outputs to a postgres-importable file

I have my database hosted on heroku, and I want to download specific parts of the database (e.g. all the rows with id > x from table 1, all the rows with name = x from table 2, etc.) in a single file.
From some research and asking a question here it seems that some kind of modified pg_dump would solve my problem. However, I won't be able to use pg_dump because I won't have access to the command line (basically I want to be able to click a button in my web app and it will generate + download the database file).
So my new strategy is to use the postgres copy command. I'll go through the various tables in my server database, run COPY (Select * FROM ... WHERE ...) TO filename , where filename is just a temporary file that I will download when complete.
The issue is that this filename file will just have the rows, so I can't just turn around and import it into pgadmin. Assuming I have an 'empty' database set up (the schema, indices, and stuff are all already set up), is there a way I can format my filename file so that it can be easily imported into a postgres db?
Building on my comment about to/from stdout/stdin, and answering the actual question about including multiple tables in one file; you can construct the output file to interleave copy ... from stdin with actual data and load it via psql. For example, psql will support input files that look like this:
copy my_table (col1, col2, col3) from stdin;
foo bar baz
fizz buzz bizz
\.
(Note the trailing \. and that the separators should be tabs; you could also specify the delimiter option in the copy command).
psql will treat everything between the ';' and the '\.' as stdin. This essentially emulates what pg_dump does when you export table data and no schema (e.g., pg_dump -a -t my_table).
The resulting load could be as simple as psql mydb < output.dump.
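Generating such a file is mostly a matter of wrapping each table's rows in a header line and a terminator. A minimal sketch, assuming the rows arrive on stdin already in COPY text format (tab-separated, newline-terminated, with backslashes and tabs inside values escaped); the helper name emit_copy_block is made up for illustration:

```shell
#!/bin/sh
# emit_copy_block (hypothetical helper): wrap rows read from stdin in a
# "copy ... from stdin;" header and the "\." end-of-data marker.
#   $1 = table name, $2 = column list
# The last input row must end with a newline, otherwise the marker would be
# glued onto it.
emit_copy_block() {
    printf 'copy %s (%s) from stdin;\n' "$1" "$2"
    cat
    printf '%s\n' '\.'
}
```

Calling it once per table and appending all output to one file gives something that psql mydb < output.dump can load in a single pass.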

Passing null string value via environment variable to TSQL script

I have a DOS batch file I want to use to invoke a TSQL program.
I want to pass the names of the databases to use. This seems to work.
I want to pass the PREFIXES for the names of the tables I want to work with.
So for test tables I want to pass the name of a prefix to use the test table.
set svr=myserver
rem set db=myTESTdatabasename
set db=mydatabasename
rem set tp=TEST
set tp=
sqlcmd -S %svr% -d somename -i test01.sql
test01.sql looks like this:
use $(db)
go
select top 10 * into $(db).dbo.$(tp)dsttbl from $(db).dbo.$(tp)srctbl
It works fine for the test stuff, but for the real stuff, I just want to set the value of tp to null so that it will use the real table name and not the bogus table name.
The reason I'm doing this is because I don't know the names of everything that will be used on the actual databases. I'm trying to make it generic so I don't have to do a bunch of search replaces on what will be a very large sql program (the real sql program is already hundreds of lines).
In the test case, this would resolve to
select top 10 * into myTESTdatabasename.dbo.TESTdsttbl from myTESTdatabasename.dbo.TESTsrctbl
For the production runs, it should resolve to
select top 10 * into mydatabasename.dbo.dsttbl from mydatabasename.dbo.srctbl
The problem seems that it doesn't like null values for $(tp), or perhaps that it's getting an undefined variable.
I experimented some with the syntax and, as Preet Sangha pointed out, you should use the -v command line option.
The reason is that setting a variable to the empty string in a batch script undefines it.
If you want to set the database name in the top of the batch file you can still use set, like this:
set db_to_use=
Then you can use this (undefined) variable in the sqlcmd call using the -v option:
sqlcmd -S %svr% -d somename -v db="%db_to_use%" -i test01.sql
...or you can just set the value directly in the sqlcmd line:
sqlcmd -S %svr% -d somename -v db="" -i test01.sql

Automating PostgreSQL output to csv

mohpc04pp1: /h/u544835 % psql arco
Welcome to psql 8.1.17, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit
WARNING: You are connected to a server with major version 8.3,
but your psql client is major version 8.2. Some backslash commands,
such as \d, might not work properly.
dbname=> \o /h/u544835/data25000.csv
dbname=> select url from urltable where scoreid=1 limit 25000;
dbname=> \q
This is taken from a link online and is basically what I have been doing, but what I need is a script that I can use to produce csv files daily.
So the aim of the script is to connect to the db, run the \o etc. commands, and then close the connection,
but I'm having trouble scripting it to go into the psql arco database and then run those queries.
The command line to connect to the db is psql arco. Once the script recognises that I'm in that database, it should perform those commands to automate a query to a csv file.
If anyone can get me started or point me towards reading material to get me past that bit, it will be duly appreciated.
I'm running all this from a standard Windows XP machine, ssh'ing to a SLES web server that holds my PostgreSQL database, running psql version 8.1.17.
First of all you should fix your setup. As it turns out, we are dealing with PostgreSQL 8.1 here. This version reached end of life in 2010. You need to seriously think about upgrading, or at least remind the guys running the server. Current version is 9.1.
The command you are looking for:
psql arco -c "\copy (select url from urltable where scoreid=1 limit 25000) to '/h/u544835/data25000.csv'"
Assuming your db is named "arco". Adjusted for changed question (including changed port).
I now see version 8.1 popping up in your question, but it's all contradictory. You need Postgres 8.2 or later to use a query (instead of a table) with the \copy meta-command.
Details about psql in the fine manual.
Alternative approach that should work with obsolete PostgreSQL 8.1:
psql arco -o /h/u544835/data25000.csv -t -A -c 'SELECT url FROM urltable WHERE scoreid = 1 LIMIT 25000'
Find some more info about these command line options under this related question on dba.SE.
With function (syntax compatible with 8.1)
Another way would be to create a server side function (if you can!) that executes COPY from a temp table (old syntax - works with pg 8.1):
CREATE OR REPLACE FUNCTION f_copy_file()
RETURNS void AS
$BODY$
BEGIN
CREATE TEMP TABLE u_tmp AS (
SELECT url FROM urltable WHERE scoreid = 1 LIMIT 25000
);
COPY u_tmp TO '/h/u544835/data25000.csv';
DROP TABLE u_tmp;
END;
$BODY$
LANGUAGE plpgsql;
And then from the shell:
psql arco -c 'SELECT f_copy_file()'
Change the separator
\f sets the field separator. I quote the manual again:
-F separator
--field-separator=separator
Use separator as the field separator for unaligned output.
This is equivalent to \pset fieldsep or \f.
Or you can change the column separator in Excel, here are the instructions from MS.
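Because the field separator only applies to unaligned output, setting it has no visible effect unless unaligned mode (-A, or \a inside psql) is switched on as well. A minimal sketch of a wrapper that combines the options; the name export_delimited and its argument order are made up for illustration:

```shell
#!/bin/sh
# export_delimited (hypothetical helper): run one query and write
# comma-separated, unaligned output to a file.
#   $1 = database, $2 = output file, $3 = query
# -A switches to unaligned output (needed for -F to take effect),
# -t prints rows only, -F sets the field separator, -o names the output file.
# Values are not quoted, so this is only safe if no field contains a comma.
# PSQL can be pointed at another command to inspect the arguments (dry run).
export_delimited() {
    "${PSQL:-psql}" "$1" -A -t -F ',' -o "$2" -c "$3"
}
```

For a daily export, the output name could carry the date, for example: export_delimited arco "/h/u544835/data_$(date +%Y%m%d).csv" 'SELECT * FROM storage'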
Thanks to Erwin's help and a link he posted that I read up on, I managed to combine the two to get
#!/bin/sh
dbname='arco'
username='' # If you actually supply a username, you need to add the -U switch!
psql $dbname $username << EOF
\f ,
\o /h/u544835/showme.csv
SELECT * FROM storage;
EOF
which writes my query results to a csv file for me.
However, the output is not separated into fields, so if I load it straight into Excel, everything stays in the same column, which means I have a problem with the delimiter.
I've tried tab delimiters, and also , ; etc., but none of them separates the data the way I need.
Is there an option to see which delimiter psql is using? Or a different way of dumping the data from a query into a file that Excel can read, with a different column for each field?