Automating PostgreSQL output to csv - postgresql

mohpc04pp1: /h/u544835 % psql arco
Welcome to psql 8.1.17, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit
WARNING: You are connected to a server with major version 8.3,
but your psql client is major version 8.2. Some backslash commands,
such as \d, might not work properly.
dbname=> \o /h/u544835/data25000.csv
dbname=> select url from urltable where scoreid=1 limit 25000;
dbname=> \q
This is taken from a link online and is basically what I have been doing by hand, but I need a script that I can use to produce CSV files daily.
The aim is for the script to connect to the db, run the \o etc. commands, then close the connection,
but I'm having trouble scripting it to go into the psql arco database and then run those queries.
The command line to connect to the db is psql arco; once the script recognises that I'm in that database, it should perform those commands to automate a query to a CSV file.
If anyone can get me started or point me towards reading material to get past that bit, it will be duly appreciated.
I'm running all this from a standard Windows XP machine, ssh'ing to a SLES web server that holds my PostgreSQL database, running psql version 8.1.17.

First of all you should fix your setup. As it turns out, we are dealing with PostgreSQL 8.1 here. This version reached end of life in 2010. You need to seriously think about upgrading - or at least remind the guys running the server. Current version is 9.1.
The command you are looking for:
psql arco -c "\copy (select url from urltable where scoreid=1 limit 25000) to '/h/u544835/data25000.csv'"
Assuming your db is named "arco". Adjusted for changed question (including changed port).
I now see version 8.1 popping up in your question, but it's all contradictory. You need Postgres 8.2 or later to use a query (instead of a table) with the \copy meta-command.
Details about psql in the fine manual.
Alternative approach that should work with obsolete PostgreSQL 8.1:
psql arco -o /h/u544835/data25000.csv -t -A -c 'SELECT url FROM urltable WHERE scoreid = 1 LIMIT 25000'
Find some more info about these command line options under this related question on dba.SE.
With function (syntax compatible with 8.1)
Another way would be to create a server side function (if you can!) that executes COPY from a temp table (old syntax - works with pg 8.1):
CREATE OR REPLACE FUNCTION f_copy_file()
  RETURNS void AS
$BODY$
BEGIN
   CREATE TEMP TABLE u_tmp AS (
      SELECT url FROM urltable WHERE scoreid = 1 LIMIT 25000
   );
   COPY u_tmp TO '/h/u544835/data25000.csv';
   DROP TABLE u_tmp;
END;
$BODY$
  LANGUAGE plpgsql;
And then from the shell:
psql arco -c 'SELECT f_copy_file()'
Change the separator
\f sets the field separator. I quote the manual again:
-F separator
--field-separator=separator
Use separator as the field separator for unaligned output.
This is equivalent to \pset fieldsep or \f.
Or you can change the column separator in Excel, here are the instructions from MS.
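Since the manual text quoted above says the field separator applies to unaligned output, a comma separator only takes effect together with -A. A minimal sketch of a wrapper that combines the options already mentioned in this answer (-A, -t, -F, -o, -c); the function name export_query_csv is hypothetical, and values that themselves contain commas are not quoted by this approach:

```shell
#!/bin/sh
# Sketch: write the result of one query as comma-separated, unaligned output.
# export_query_csv is a hypothetical helper name, not part of psql.
export_query_csv() {
    db=$1
    query=$2
    outfile=$3
    # -A      unaligned output (needed for -F to have any effect)
    # -t      tuples only: no column header, no row-count footer
    # -F ','  use a comma as the field separator
    # -o      send query output to the given file
    psql "$db" -A -t -F ',' -o "$outfile" -c "$query"
}

# Example for a daily run with a date-stamped file name:
# export_query_csv arco 'SELECT * FROM storage' "/h/u544835/showme_$(date +%Y-%m-%d).csv"
```

Run from cron, the commented example would produce one file per day without an interactive psql session.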

Thanks to Erwin's help and a link he posted for me, I managed to combine the two to get:
#!/bin/sh
dbname='arco'
username='' # If you actually supply a username, you need to add the -U switch!
psql $dbname $username << EOF
\f ,
\o /h/u544835/showme.csv
SELECT * FROM storage;
EOF
which writes the results of my queries to a CSV file for me.
However, the output is not being split into fields, so if I load it straight into Excel everything stays in one column, which means I have a problem with the delimiter.
I've tried tab delimiters, and also , and ; etc., but none of them separate the columns the way I need.
Is there an option I can use to see which delimiter my psql is using? Or is there a different way of dumping the data from a query into a file that Excel can read, with each field in its own column?

Related

Run .sql file with PSQL [duplicate]

This question already has answers here:
Running SQL script through psql gives syntax errors that don't occur in PgAdmin
(4 answers)
Closed 3 years ago.
I have a .sql file that has several commands in it like this:
alter table mytable add codecolumn varchar(10);
update mytable set codecolumn = 'code';
alter table mytable set anothercolumn = 'anothervalue';
This is a PostgreSQL database and I'm trying to use psql to execute the file like so:
psql -h hostname -d dbname -w -f "C:\filepath\filename.sql"
The problem I'm noticing is that it's not executing the commands in the same order as it would if I opened up a query window and ran it that way - from top to bottom. I know this because it's saying that codecolumn doesn't exist.
My goal is to turn this into a batch file to run periodically without having to open up the database. Is there a better way to run this or tell it to run the commands in order?
specs: PostgreSQL 9.5.1, Windows 10 64 bit
UPDATE:
I noticed this strange character is being added to the beginning of the first statement:
LINE 1: alter table "mytable" add codecolumn varchar (10);
I believe it's skipping this initial statement as a result.
I found a workaround to this here:
https://www.postgresql-archive.org/Using-psql-f-to-load-a-UTF8-file-td5724735.html
The solution was to add an extra line containing only a semicolon at the start of the file, so that the encoding error is triggered on that line instead of on the first real statement.
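If the stray character is a UTF-8 byte order mark (an assumption: the title of the linked thread suggests it, but the character itself is not visible above), another option is to strip it before psql reads the file. A minimal sketch for a POSIX shell, which on Windows would mean something like Git Bash or WSL; strip_bom is a hypothetical helper name:

```shell
#!/bin/sh
# strip_bom FILE: print FILE to stdout without a leading UTF-8 BOM
# (the three bytes EF BB BF). Files without a BOM pass through unchanged.
strip_bom() {
    bom=$(printf '\357\273\277')
    # LC_ALL=C makes sed treat the input as plain bytes.
    LC_ALL=C sed "1s/^$bom//" "$1"
}

# Usage: feed the cleaned script to psql on standard input.
# strip_bom filename.sql | psql -h hostname -d dbname -w
```

Saving the file as UTF-8 without a BOM in the editor avoids the problem at the source.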

how to pass variable to copy command in Postgresql

I tried to use a variable in an SQL statement in PostgreSQL, but it did not work.
There are many CSV files stored under one path. I want to set that path once so that the copy command knows where to find the CSV files.
SQL statement sample:
\set outpath '/home/clients/ats-dev/'
\COPY licenses (_id, name,number_seats ) FROM :outpath + 'licenses.csv' CSV HEADER DELIMITER ',';
\COPY uploaded_files (_id, added_date ) FROM :outpath + 'files.csv' CSV HEADER DELIMITER ',';
It did not work. I got the error: no such file. The two files, licenses.csv and files.csv, are stored under /home/clients/ats-dev on Ubuntu. I found a solution that uses "\set file 'license.csv'". It does not work for me because I have many CSV files. I also tried "from : outpath || 'licenses.csv'". It did not work either. I appreciate any help.
Using 9.3.
It looks like psql does not support :variable substitution within psql backslash commands.
test=> \set somevar fred
test=> \copy z from :somevar
:somevar: No such file or directory
So you will need to do this via an external tool like the Unix shell, e.g.
for f in *.csv; do
    psql -c "\\copy $(basename "$f" .csv) FROM '$f' CSV HEADER"
done
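The same idea can keep the directory in a shell variable, which is what the question was trying to do with \set. A minimal sketch, with the shell doing the substitution before psql sees the \copy line; load_csv is a hypothetical helper name, and connection options are left to the environment:

```shell
#!/bin/sh
# load_csv DIR TARGET FILE
#   DIR     directory holding the CSV files (trailing slash optional)
#   TARGET  table name, optionally followed by a column list
#   FILE    CSV file name inside DIR
load_csv() {
    dir=$1
    target=$2
    file=$3
    # Inside double quotes "\\copy" reaches psql as \copy.
    psql -c "\\copy $target FROM '${dir%/}/$file' CSV HEADER DELIMITER ','"
}

# Usage with the tables from the question:
# outpath=/home/clients/ats-dev/
# load_csv "$outpath" 'licenses (_id, name, number_seats)' licenses.csv
# load_csv "$outpath" 'uploaded_files (_id, added_date)' files.csv
```

Because \copy reads the file on the client side, the path only has to be readable by the user running psql.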
You can try COPY command
\set outpath '\'/home/clients/ats-dev/'
COPY licenses (_id, name,number_seats ) FROM :outpath/licenses.csv' WITH CSV HEADER DELIMITER ',';
COPY uploaded_files (_id, added_date ) FROM :outpath/files.csv' WITH CSV HEADER DELIMITER ',';
Note: Files named in a COPY command are read or written directly by the server, not by the client application. Therefore, they must reside on or be accessible to the database server machine, not the client. They must be accessible to and readable or writable by the PostgreSQL user (the user ID the server runs as), not the client. Similarly, the command specified with PROGRAM is executed directly by the server, not by the client application, must be executable by the PostgreSQL user. COPY naming a file or command is only allowed to database superusers, since it allows reading or writing any file that the server has privileges to access.
Documentation: Postgresql 9.3 COPY
It may have been true when this was originally asked, that psql backslash commands didn't support variable interpolation, but in my PostgreSQL 14 instance that's no longer the case. However, the psql manpage is clear that \copy specifically does not support variable interpolation.

Greenplum to file using PSQL

I'm trying to export data from Greenplum to a text file (on the client) with a pipe delimiter, using psql and \copy. In the output I see that a single backslash is converted to a double backslash and a tab is converted to \t.
Example
N\A is converted to N\\A
So how do I get just N\A instead of N\\A, and just spaces instead of \t?
Note: I'm only allowed to use \copy. Since my file is huge, I run into space issues when I use sed or Perl for find and replace.
Assuming you don't have any "^" characters, you could use that as the escape character.
copy tpcds.call_center to stdout with delimiter '|' escape '^';
More on copy can be found here: https://www.postgresql.org/docs/8.2/static/sql-copy.html
This technique will be relatively slow and put a burden on the Master. If you used gpfdist instead, you could leverage the parallelism in the cluster and bypass the master. This solution is ideal for unloading large amounts of data.
First, start the gpfdist process:
[gpadmin@gpdbsne ~]$ gpfdist -p 8888 > gpfdist_8888.log 2>&1 < gpfdist_8888.log &
[1] 2255
Now, you can create the external table.
[gpadmin@gpdbsne ~]$ psql
SET
Timing is on.
psql (8.2.15)
Type "help" for help.
gpadmin=# create writable external table tpcds.et_call_center
(like tpcds.call_center)
location ('gpfdist://gpdbsne:8888/call_center.txt')
format 'text' (delimiter '|' escape '^');
NOTICE: Table doesn't have 'distributed by' clause, defaulting to distribution columns from LIKE table
CREATE EXTERNAL TABLE
Time: 18.681 ms
Now, you insert the data:
gpadmin=# insert into tpcds.et_call_center select * from tpcds.call_center;
INSERT 0 6
Time: 72.653 ms
gpadmin=# \q
Verify:
[gpadmin@gpdbsne ~]$ wc -l call_center.txt
6 call_center.txt
In my example, I used the hostname "gpdbsne" which is accessible to all segments in this cluster. Typically, Greenplum uses a private network for communication between segments so this hostname will need to be connected to the private network.
Since the writable external table is written to with SQL, you can use whatever transformation logic you want in the SQL, so you can change tabs to spaces if you want. This eliminates the need for awk or sed for post-processing the files. Copy can use SQL too but, like I said, it is slower than using writable external tables.

How can I use mwdumper with Postgresql command line

I was importing a MediaWiki database using mwdumper with MySQL. Now I need to do the same thing, but using PostgreSQL.
Basically, I get an archive from this link:
http://dumps.wikimedia.org/enwiki/20140903/
And I use the mwdumper program to extract the information and put it in my database.
This is the database script:
https://git.wikimedia.org/blob/mediawiki%2Fcore.git/HEAD/maintenance%2Fpostgres%2Ftables.sql
I created the database with this SQL, and now I need to use mwdumper to put data into my database.
I saw many links about this, but only for MySQL.
Does anyone know how to do this import with Postgres from the command line?
Mwdumper: www.mediawiki.org/wiki/Manual:MWDumper
I forgot about this question, but I found the solution. The command line to use mwdumper with Postgres is:
java -jar mwdumper-1.16.jar --format=pgsql:1.5 ARCHIVE.xml.gz | psql -U wikiUSER -d wikiDATABASE
The command isn't wrong; the errors occur because mwdumper-1.16 converts the XML to SQL with incorrect syntax.
This is an INSERT statement produced by the mwdumper conversion (XML -> PostgreSQL):
INSERT INTO revision (rev_id,rev_page,rev_text_id,rev_comment,rev_user,rev_user_text,rev_timestamp,rev_minor_edit,rev_deleted) VALUES (378187747,676,378187747,'there is no such thing as \"Jr.\" in Russian names. sincerely yours, X\\'ZZ\\'',0,'198.240.130.75','2010-08-10 14:55:48',0,0);
Analyzing the same insert in my MySQL database, the expected text in Postgres is:
INSERT INTO (...) ,' there is no such thing as "Jr." in Russian names. sincerely yours, X\''ZZ\'' ', (...).
For example:
To represent a double quote, mwdumper emits \" , but Postgres represents " without the backslash; it's just ". The same idea applies to the other syntax errors.
Once you fix all the syntax errors, the dump imports perfectly.

PostgreSQL - batch + script + variable

I am not a programmer, I am struggling a bit with this.
I have a batch file that connects to my PostgreSQL server and then runs an SQL script. Everything works as expected. My question is how to pass a variable (if possible) from one to the other.
Here is my batch file:
set PGPASSWORD=xxxx
cls
@echo off
C:\Progra~1\PostgreSQL\8.3\bin\psql -d Total -h localhost -p 5432 -U postgres -f C:\TotalProteinImport.sql
And here's the script:
copy totalprotein from 'c:/TP.csv' DELIMITERS ',' CSV HEADER;
update anagrafica
set pt=(select totalprotein.resultvalue from totalprotein where totalprotein.accessionnbr=anagrafica.id)
where data_analisi = '12/23/2011';
delete from totalprotein;
This is working great, now the question is how could I pass a variable that would carry the date for data_analisi?
Like in the batch file, "Please enter date", and then the value is passed to the sql script.
You could create a function out of your SQL script like this:
CREATE OR REPLACE FUNCTION f_myfunc(date)
RETURNS void AS
$BODY$
CREATE TEMP TABLE t_tmp ON COMMIT DROP AS
SELECT * FROM totalprotein LIMIT 0; -- copy table-structure from table
COPY t_tmp FROM 'c:/TP.csv' DELIMITERS ',' CSV HEADER;
UPDATE anagrafica a
SET pt = t.resultvalue
FROM t_tmp t
WHERE a.data_analisi = $1
AND t.accessionnbr = a.id;
-- Temp table is dropped automatically at end of session
-- In this case (ON COMMIT DROP) after the transaction
$BODY$
LANGUAGE sql;
You can use language SQL for this kind of simple SQL batch.
As you can see I have made a couple of modifications to your script that should make it faster, cleaner and safer.
Major points
For reading data into an empty table temporarily, use a temporary table. Saves a lot of disc writes and is much faster.
To simplify the process I use your existing table totalprotein as template for the creation of the (empty) temp table.
If you want to delete all rows of a table use TRUNCATE instead of DELETE FROM. Much faster. In this particular case, you need neither. The temporary table is dropped automatically. See comments in function.
The way you updated anagrafica.pt you would set the column to NULL, if anything goes wrong in the process (date not found, wrong date, id not found ...). The way I rewrote the UPDATE, it only happens if matching data are found. I assume that is what you actually want.
Then ask for user input in your shell script and call the function with the date as parameter. That's how it could work in a Linux shell (as user postgres, with password-less access using the IDENT method in pg_hba.conf):
#! /bin/sh
# Ask for date. 'YYYY-MM-DD' = ISO date-format, valid with any postgres locale.
echo -n "Enter date in the form YYYY-MM-DD and press [ENTER]: "
read date
# check validity of $date ...
psql db -p5432 -c "SELECT f_myfunc('$date')"
-c makes psql execute a single SQL command and then exit. I wrote a lot more on psql and its command line options yesterday in a somewhat related answer.
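The "check validity of $date" step in the script above is left open. One way to fill it, as a minimal sketch: accept only input shaped like YYYY-MM-DD before it is pasted into the SQL string. is_iso_date is a hypothetical helper name; it checks the shape only, so an impossible date such as 2011-13-45 still has to be rejected by Postgres itself.

```shell
#!/bin/sh
# is_iso_date VALUE: succeed only if VALUE has the shape YYYY-MM-DD.
# Anything else (quotes, spaces, other date formats) is rejected, which
# also keeps stray characters out of the SQL string built below.
is_iso_date() {
    case $1 in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage inside the script above, after "read date":
# if is_iso_date "$date"; then
#     psql db -p5432 -c "SELECT f_myfunc('$date')"
# else
#     echo "Not a date in the form YYYY-MM-DD: $date" >&2
#     exit 1
# fi
```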
The creation of the according Windows batch file remains as exercise for you.
Call under Windows
The error message tells you:
Function tpimport(unknown) does not exist
Note the lower case letters: tpimport. I suspect you used mixed case letters to create the function. So now you have to enclose the function name in double quotes every time you use it.
Try this one (edited quotes!):
C:\Progra~1\PostgreSQL\8.3\bin\psql -d Total -h localhost -p 5432 -U postgres
-c "SELECT ""TPImport""('%dateimport%')"
Note how I use single and double quotes here. I guess this could work under Windows. See here.
You made it hard for yourself when you chose to use mixed case identifiers in PostgreSQL - a folly which I never tire of warning against. Now you have to double quote the function name "TPImport" every time you use it. While perfectly legit, I would never do that. I use lower case letters for identifiers. Always. This way I never mix up lower / upper case and I never have to use double quotes.
The ultimate fix would be to recreate the function with a lower case name (just leave away the double quotes and it will be folded to lower case automatically). Then the function name will just work without any quoting.
Read the basics about identifiers here.
Also, consider upgrading to a more recent version of PostgreSQL. 8.3 is a bit rusty by now.
psql supports textual replacement variables. Within psql they can be set using \set and used using :varname.
\set xyz 'abcdef'
select :'xyz';
?column?
----------
abcdef
These variables can be set using command line arguments also:
psql -v xyz=value
The only problem is that these textual replacements always need some fiddling with quoting as shown by the first \set and select.
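Applied to the question, the date could be handed over with -v and referenced as :'name' in the SQL. A minimal sketch under some assumptions: the variable name analysis_date and the function name run_update are made up for illustration, the :'name' quoting form needs psql 9.0 or later (newer than the 8.3 client in the question), and the statement is fed on standard input because a string given with -c is sent to the server without psql variable substitution.

```shell
#!/bin/sh
# run_update DATE: pass DATE to psql as the variable "analysis_date"
# and let psql insert it as a properly quoted literal via :'analysis_date'.
run_update() {
    # The quoted heredoc delimiter keeps the shell from touching the SQL text.
    psql -d Total -h localhost -p 5432 -U postgres -v analysis_date="$1" <<'SQL'
UPDATE anagrafica a
SET    pt = t.resultvalue
FROM   totalprotein t
WHERE  a.data_analisi = :'analysis_date'
AND    t.accessionnbr = a.id;
SQL
}

# Usage:
# run_update 2011-12-23
```

The same -v option works with -f and a script file, so the SQL can stay in C:\TotalProteinImport.sql with only the date literal replaced by the variable.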
After creating the function in Postgres, you can create a .bat file in the bin directory of your Postgres version, for example C:\Program Files\PostgreSQL\9.3\bin. In it you write:
@echo off
cd C:\Program Files\PostgreSQL\9.3\bin
psql -p 5432 -h localhost -d myDataBase -U postgres -c "select * from myFunction()"