I have a csv with 90 columns that I need to import as a table to my pgsql database (and there are several more csv files with large numbers of columns that I would like to apply this method to). My aim is to avoid manually designating 90 separate columns with a CREATE TABLE query.
Column headers in the table should remain the same as in the csv and every column should be imported as a numeric data type with a precision of 2 decimal points.
So far, the only program that I've come across that does this is pgfutter which I have installed successfully. However, the database that I am connecting to is a remote one on AWS and it is unclear where to input the connection details. Also, after installing, I get an error when requesting help info:
$ ./pgfutter --help
-bash: ./pgfutter: Permission denied
Could anyone suggest a workaround in pgfutter or another method to import a csv file with straightforward numeric columns automatically to PostgreSQL ?
It is simple to write a shell script that constructs a CREATE TABLE statement from the first line of a CSV file.
Here is my solution:
#!/bin/bash
# makes a CREATE TABLE statement out of the first line of a CSV file
# usage: mktab <tabname> <CSV file>
if [ -z "$2" -o -n "$3" ]; then
echo "usage: mktab <tabname> <CSV file>" 1>&2
exit 1
fi
IFS=,
first=1
echo -n "CREATE TABLE \"$1\" ("
for col in $(head -1 "$2"); do
if [ $first -eq 1 ]; then
first=0
else
echo -n ', '
fi
echo -n "\"$col\" numeric(10,2)"
done
echo ');'
exit 0
Related
I have a pipe delimited data file with no headers. I need to import data into a PostgreSQL table starting with the data from second column in the file i.e skip the data before the first '|' for each line. How do I achieve this using the COPY command?
Use the cut command to remove the first column and then import.
cut -d "|" -f 2- file1.csv > file2.csv
psql -d test -h localhost -c "\copy table(f1,f2,f3) from 'file2.csv' delimiter '|' csv header;"
Not an answer as such related to postgresql but more about command line tools.
My input file is a csv file containing details as:
2233,anish sharma
2234,azad khan
2235,birbal singh
2236,chaitanya kumar
my expected output is display of the two details in two separate columns.
I executed following code. Full name is not getting displayed. The part after space doesn't appear. What changes should be done?
echo "Roll no updation"
tput cup 10 10
echo "Key in file name (rollno,name separated by comma)"
tput cup 12 10
read infile
for i in `cat $infile`
do
rollno=`echo $i|cut -d , -f1`
name=`echo $i|cut -d , -f2`
psql -U postgres -A -t -F, -c "update student set name = '$name' where rollno = '$rollno' current record" >bq
done
Your loop should be written in this fashion
# comma separates records
IFS=,
cat "$infile" | while read rollno name; do
psql -U postgres -A -t -F, -c \
"update student set name = '$name'
where rollno = '$rollno'" >bq
done
But you should be aware that this code is susceptible to SQL injection. Only use it if you can trust the source of the data!
Any ' in the file will cause errors and worse.
Is there any way to use copy command for batch data import and read data from a url. For example, copy command has a syntax like :
COPY sample_table
FROM 'C:\tmp\sample_data.csv' DELIMITER ',' CSV HEADER;
What I want is not to give a local path but a url. Is there any way?
It's pretty straightforward, provided you have an appropriate command-line tool available:
COPY sample_table FROM PROGRAM 'curl "http://www.example.com/file.csv"'
Since you appear to be on Windows, I think you'll need to install curl or wget yourself. There is an example using wget on Windows here which may be useful.
My solution is
cat $file |
tail -$numberLine |
sed 's/ / ,/g' |
psql -q -d $dataBaseName -c "COPY tableName FROM STDIN DELIMITER ','"
You can insert a awk between sed and psql to add missing column.
Interesting if already you know what to put in the missing column.
awk '{print $0" , "'info_about_missing_column'"\n"}'
I have done that and it works and faster than INSERT.
I have a SQL file my_query.sql:
select * from my_table
Using psql, I can read in this sql file:
\i my_query.sql
Or pass it in as an arg:
psql -f my_query.sql
And I can output the results of a query string to a csv:
\copy (select * from my_table) to 'output.csv' with csv header
Is there a way to combine these so I can output the results of a query from a SQL file to a CSV?
Unfortunately there's no baked-in functionality for this, so you need a little bash-fu to get this to work properly.
CONN="psql -U my_user -d my_db"
QUERY="$(sed 's/;//g;/^--/ d;s/--.*//g;' my_query.sql | tr '\n' ' ')"
echo "\\copy ($QUERY) to 'out.csv' with CSV HEADER" | $CONN
The sed fun removes all semicolons, comment lines, and end of line comments, and tr converts newlines to spaces (as mentioned in a comment by #abelisto):
-- my_query.sql
select *
from my_table
where timestamp < current_date -- only want today's records
limit 10;
becomes:
select * from my_table where timestamp < current_date limit 10
which then gets passed in to the valid psql command:
\copy (select * from my_table where timestamp < current_date) to 'out.csv' with csv header
Here's a script:
sql_to_csv.sh
#!/bin/bash
# sql_to_csv.sh
CONN="psql -U my_user -d my_db"
QUERY="$(sed 's/;//g;/^--/ d;s/--.*//g;' $1 | tr '\n' ' ')"
echo "$QUERY"
echo "\\copy ($QUERY) to '$2' with csv header" | $CONN > /dev/null
./sql_to_csv.sh my_query.sql out.csv
I think the simplest way is to take advantage of the shell's variable expansion capabilities:
psql -U my_user -d my_db -c "COPY ($(cat my_query.sql)) TO STDOUT WITH CSV HEADER" > my_query_results.csv
You could do it using a bash script.
dump_query_to_csv.sh:
#!/bin/bash
# Takes an sql query file as an argument and dumps its results
# to a CSV file using psql \copy command.
#
# Usage:
#
# dump_query_to_csv.sh <sql_query_file> [<csv_output_filesname>]
SQL_FILE=$1
[ -z $SQL_FILE ] && echo "Must supply query file" && exit
shift
OUT_FILE=$1
[ -z $OUT_FILE ] && OUT_FILE="output.csv" # default to "output.csv" if no argument is passed
TMP_TABLE=ttt_temp_table_xx # some table name that will not collide with existing tables
## Build a psql script to do the work
PSQL_SCRIPT=temp.psql
# create a temporary database table using the SQL from the query file
echo "DROP TABLE IF EXISTS $TMP_TABLE;CREATE TABLE $TMP_TABLE AS" > $PSQL_SCRIPT
cat $SQL_FILE >> $PSQL_SCRIPT
echo ";" >> $PSQL_SCRIPT
# copy the temporary table to the output CSV file
echo "\copy (select * from $TMP_TABLE) to '$OUT_FILE' with csv header" >> $PSQL_SCRIPT
# drop the temporary table
echo "DROP TABLE IF EXISTS $TMP_TABLE;" >> temp.sql
## Run psql script using psql
psql my_database < $PSQL_SCRIPT # replace my_database and add user login credentials as necessary
## Remove the psql script
rm $PSQL_SCRIPT
You'll need to edit the psql line in the script to connect to your database. The script could also be enhanced to take the database and account credentials as arguments.
The accepted solution is correct, but I had Windows and had to make it run via a batch (command) file. Posting it here if someone needs that
#echo off
echo 'Reading file %1'
set CONN="C:\Program Files\PostgreSQL\11\bin\psql.exe" -U dbusername -d mydbname
"C:\Program Files\Git\usr\bin\sed.exe" 's/;//g;/^--/ d;s/--.*//g;' %1 | "C:\Program Files\Git\usr\bin\tr.exe" '\n' ' ' > c:\temp\query.txt
set /p QUERY=<c:\temp\query.txt
echo %QUERY%
echo \copy (%QUERY%) to '%2' WITH (FORMAT CSV, HEADER) | %CONN%
I am trying to write a batch script that will query a Postgres database and output the results to a csv. Currently, it queries the database and saves the output as a pipe delimited csv.
I want the output to be tab delimited rather than pipe delimited, since I will eventually be importing the csv into Access. Does anyone know how this can be achieved?
Current code:
cd C:\Program Files\PostgreSQL\9.1\bin
psql -c "SELECT * from jivedw_day;" -U postgres -A -o sample.csv cscanalytics
postgres = username
cscanalytics = database
You should be using COPY to dump CSV:
psql -c "copy jivedw_day to stdout csv delimiter E'\t'" -o sample.csv -U postgres -d csvanalytics
The delimiter E'\t' part will get you your output with tabs instead of commas as the delimiter. There are other other options as well, please see the documentation for further details.
Using -A like you are just dumps the usual interactive output to sample.csv without the normal padding to making the columns line up, that's why you're seeing the pipes:
-A
--no-align
Switches to unaligned output mode. (The default output mode is otherwise aligned.)