Import data from a CSV file in Postgres

ERROR: could not open file "C:\Program Files\PostgreSQL\10\data\Data_copy\student.csv" for reading: No such file or directory
HINT: COPY FROM instructs the PostgreSQL server process to read a file. You may want a client-side facility such as psql's \copy.
SQL state: 58P01

The error message says it all.
With the SQL COPY command, the file to be loaded must be readable by the database server process, on the server's file system.
If that is not possible, use psql's \copy command instead: it runs on the client, so it can read a file on the client side.
Example:
$ cat t.csv
1,ONE
2,TWO
3,THREE
In psql:
# \d t
Table "public.t"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
x | integer | | |
y | text | | |
# \copy t from 't.csv' delimiter ',';
COPY 3
# select * from t;
x | y
---+-------
1 | ONE
2 | TWO
3 | THREE
(3 rows)

Using psql, run the following command:
\copy students from 'C:\Program Files\PostgreSQL\10\data\Data_copy\student.csv' csv;
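The same client-side idea can be scripted. The sketch below is only an illustration, not something from the answers above: it builds a psql command line that runs COPY ... FROM STDIN and feeds it a file opened on the client, so the server never needs to see the path. The helper names (copy_from_stdin_argv, load_csv) and the database name "postgres" are assumptions, and the table name is interpolated unquoted, so it must come from a trusted source.

```python
import subprocess
from typing import List


def copy_from_stdin_argv(table: str, dbname: str) -> List[str]:
    """Build a psql command line that loads CSV data from standard input.

    Hypothetical helper: the table name is inserted as-is, so only pass
    trusted values.
    """
    sql = f"COPY {table} FROM STDIN WITH (FORMAT csv)"
    # -X skips .psqlrc so local settings cannot change the behaviour.
    return ["psql", "-X", "-d", dbname, "-c", sql]


def load_csv(table: str, dbname: str, csv_path: str) -> None:
    """Open the file on the client and pipe its contents to psql."""
    with open(csv_path, "rb") as csv_file:
        subprocess.run(copy_from_stdin_argv(table, dbname),
                       stdin=csv_file, check=True)


# Only build the command here; calling load_csv() needs psql and a server.
argv = copy_from_stdin_argv("students", "postgres")
print(argv)
```

Calling load_csv("students", "postgres", r"C:\some\path\student.csv") would then stream the file from the client machine; the path in that call is a placeholder.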

Related

Postgres - character sets and encodings

I was wondering if someone can help me understand what's going on/wrong with my Postgres data please...
I'll explain things below, but ultimately the questions I have are:
What character set/encoding should I be using (i.e. what is best practice)?
If the answer is UTF8, will certain characters (e.g. UK pound symbols) always look "funny" in the database?
I've got a database that has a table with data about flights (although obviously it could be anything really), defined as follows...
CREATE TABLE public.flight (
flightid integer DEFAULT nextval('public.flight_seq'::regclass) NOT NULL,
tripid integer NOT NULL,
flightdatedeparted date NOT NULL,
flightairportdeparted text NOT NULL,
flightairportarrived text NOT NULL,
flightairline text NOT NULL,
flightdetails text,
flightdayflightnumber integer DEFAULT 1 NOT NULL,
flightdistance numeric
);
Now, when I enter data into it via a web front end connected to this database then I end up with data something like...
holidayinfo=# select distinct * from flight where flightid=97;
-[ RECORD 1 ]---------+---------------------------------------
flightid | 97
tripid | 36
flightdatedeparted | 2004-05-14
flightairportdeparted | LHR
flightairportarrived | WAW
flightairline | British Airways
flightdetails | Hotline, ┬ú82.40, BA850, 13:40 -> 17:05
flightdayflightnumber | 1
flightdistance | 912.7
However, the data that I'd entered into the web form for the field "flightdetails" was actually...
Hotline, £82.40, BA850, 13:40 -> 17:05
Now, when I dump the data and look at it in Notepad++, depending on what encoding I use then sometimes I see it correctly as the pound symbol (when I choose ANSI) and other times it's incorrect as xA3 (when I choose UTF8).
At least when it's stored in Postgres as the "funny" value then it also displays correctly on my webpage when I retrieve the data - so that's good.
If I try to manually update the value via psql then I get the following...
holidayinfo=# update flight set flightdetails='Hotline, £82.40, BA850, 13:40 -> 17:05' where flightid=97;
ERROR: invalid byte sequence for encoding "UTF8": 0x9c
In terms of how my database is created and what client encoding it's using, I've got the following...
holidayinfo=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-------------+----------+----------+-----------------------------+-----------------------------+-----------------------
holidayinfo | postgres | UTF8 | English_United Kingdom.1252 | English_United Kingdom.1252 |
leagueinfo | postgres | UTF8 | English_United Kingdom.1252 | English_United Kingdom.1252 |
postgres | postgres | UTF8 | English_United Kingdom.1252 | English_United Kingdom.1252 |
template0 | postgres | UTF8 | English_United Kingdom.1252 | English_United Kingdom.1252 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | English_United Kingdom.1252 | English_United Kingdom.1252 | =c/postgres +
| | | | | postgres=CTc/postgres
(5 rows)
holidayinfo=# show client_encoding;
client_encoding
-----------------
UTF8
(1 row)
Maybe this is all working as designed, but I'm just confused as to how things should be?
Ultimately, I'd love to be able to have the data stored so that I can see it as the pound sign AND be entered/retrieved/displayed as the pound sign.
The former is desirable so that if ever I need to look at the data then I can see what the real data is - not have to make assumptions on what character "┬ú" really means.
Also, this problem scales up when there are other characters with the same "issue", such as a hyphen (-) showing as "ÔÇô" or an apostrophe (') showing as "ÔÇÖ".
Thanks in advance!

You must be viewing the data with psql using cmd.exe with code page CP-850.
The data in your database are wrong, because the application that inserted them had client_encoding set to WIN1252 while feeding the database UTF-8 characters.
So £, which is 0xC2A3 in UTF-8, is interpreted as two characters, namely Â (0xC2) and £ (0xA3). They are converted to UTF-8 and stored in the database as 4 bytes (0xC382 and 0xC2A3). When you view them with psql, they are converted back to WINDOWS-1252, but cmd.exe interprets them as CP-850 and renders them as ┬ú.
The fix is to change client_encoding to UTF8 in the application that inserts the data into the database.
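To make the byte-level chain above concrete, here is a minimal Python sketch that replays it. It assumes Python's cp1252 and cp850 codecs behave like PostgreSQL's WIN1252 conversion and the cmd.exe code page; the variable names are illustrative only and are not taken from the asker's application.

```python
# Replay the double-encoding chain with Python's codecs.
# Assumption: "cp1252" stands in for WIN1252 and "cp850" for the console.

pound = "£"

# 1. The application sends the pound sign as UTF-8 bytes.
sent_bytes = pound.encode("utf-8")                      # b'\xc2\xa3'

# 2. With client_encoding = WIN1252, each byte is read as one character.
misread = sent_bytes.decode("cp1252")                   # 'Â£'

# 3. Those two characters are converted to UTF-8 for storage (4 bytes).
stored_bytes = misread.encode("utf-8")                  # b'\xc3\x82\xc2\xa3'

# 4. On the way out they are converted back to WIN1252 bytes.
returned_bytes = stored_bytes.decode("utf-8").encode("cp1252")  # b'\xc2\xa3'

# 5. A CP-850 console renders those two bytes as the "funny" characters.
console_view = returned_bytes.decode("cp850")           # '┬ú'

# A web page that treats the same bytes as UTF-8 shows the pound sign again.
web_view = returned_bytes.decode("utf-8")               # '£'

# Typing a pound sign in a CP-850 console produces the single byte 0x9c.
typed_in_console = pound.encode("cp850")                # b'\x9c'
```

The last line also suggests why the manual UPDATE failed with 0x9c: that is the byte CP-850 uses for the pound sign, and on its own it is not a valid UTF-8 sequence.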

Use date function to rename a database in postgres

I would like to know how to rename a database so that its name contains the current date.
Thanks for your help.

You may use dynamic SQL in a DO block. Here I use a date suffix in YYYYMMDD format for the database name.
knayak=# CREATE DATABASE mydatabase;
CREATE DATABASE
DO $$
BEGIN
-- Build the new name first so that %I quotes the whole identifier, suffix included.
EXECUTE format('ALTER DATABASE %I RENAME TO %I', 'mydatabase',
'mydatabase_' || to_char(current_date,'YYYYMMDD'));
END
$$;
knayak=#
knayak=# \l mydatabase*
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
---------------------+--------+----------+-------------+-------------+-------------------
mydatabase_20181214 | knayak | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
(1 row)
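If the rename is driven from a client-side script rather than a DO block, the statement can be built before it is sent. This is a rough sketch under that assumption; quote_ident and rename_with_date are hypothetical helpers written for this example, not part of any PostgreSQL driver.

```python
from datetime import date


def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def rename_with_date(db_name: str, today: date) -> str:
    """Build an ALTER DATABASE statement that appends a YYYYMMDD suffix."""
    new_name = f"{db_name}_{today:%Y%m%d}"
    return (f"ALTER DATABASE {quote_ident(db_name)} "
            f"RENAME TO {quote_ident(new_name)};")


# Fixed date so the result matches the listing above; use date.today() in practice.
statement = rename_with_date("mydatabase", date(2018, 12, 14))
print(statement)
```

Keep in mind that the statement has to be run while connected to a different database, because PostgreSQL does not allow renaming the database of the current session.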

psql non-select: how to remove formatting and show only certain columns?

I'm looking to remove all line drawing characters from:
PGPASSWORD="..." psql -d postgres -h "1.2.3.4" -p 9432 -c 'show pool_nodes' -U owner
node_id | hostname | port | status | lb_weight | role
---------+---------------+------+--------+-----------+---------
0 | 10.20.30.40 | 5432 | 2 | 0.500000 | primary
1 | 10.20.30.41 | 5432 | 2 | 0.500000 | standby
(2 rows)
Adding the -t option gets rid of the header and footer, but the vertical bars are still present:
PGPASSWORD="..." psql -t -d postgres -h "1.2.3.4" -p 9432 -c 'show pool_nodes' -U owner
0 | 10.20.30.40 | 5432 | 2 | 0.500000 | primary
1 | 10.20.30.41 | 5432 | 2 | 0.500000 | standby
Note that this question is specific to show pool_nodes and other similar non-select SQL statements.
My present workaround is to involve the Linux cut command:
<previous command> | cut -d '|' -f 4
The question has two parts:
Using psql only, how can the vertical bars above be removed?
Using psql only, how can just a specific column (for example, status) or set of columns be shown? For example, the result might be just two lines, each showing the number 2.
I'm using psql version psql (PostgreSQL) 9.2.18 on a CentOS 7 server.

For scripting psql, use psql -qAtX:
-q: quiet
-A: unaligned output
-t: tuples only
-X: do not read .psqlrc
To filter columns you must name them in the SELECT list. psql always outputs the full result set it gets from the server. E.g. SELECT status FROM pool_nodes.
Or you can cut to extract ordinal column numbers e.g.
psql -qAtX -c 'whatever' | cut -d '|' -f 1,2-4
(I have no idea how show pool_nodes can produce the output you show here, since SHOW returns a single scalar value...)
To change the delimiter from a pipe | to something else, use -F e.g. -F ','. But be warned, the delimiter is not escaped when it appears in output, this isn't CSV. You might want to consider a tab as a useful option; you have to enter a quoted literal tab to do this. (If doing it in an interactive shell, search for "how to enter literal tab in bash" when you get stuck).
Example showing all the above, given dummy data:
CREATE TABLE dummy_table (
a integer,
b integer,
c text,
d text
);
INSERT INTO dummy_table
VALUES
(1,1,'chicken','turkey'),
(2,2,'goat','cow'),
(3,3,'mantis','cricket');
Query, with a single space as the column delimiter (so you'd better not have spaces in your data!):
psql -qAtX -F ' ' -c 'SELECT a, b, d FROM dummy_table'
If for some reason you cannot generate a column-list for SELECT you can instead filter by column-ordinal with cut:
psql -qAtX -F '^' -c 'TABLE dummy_table' | cut -d '^' -f 1-2,4
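When the output is consumed by a script anyway, the column filtering that cut does above can also happen in the consuming program. The sketch below assumes the unaligned, tuples-only output has already been captured as a string; select_columns is a hypothetical helper, and the sample text is what the pool_nodes rows from the question would presumably look like with -qAtX.

```python
from typing import List


def select_columns(psql_output: str, columns: List[int],
                   sep: str = "|") -> List[List[str]]:
    """Keep only the given 1-based columns of each non-empty output line.

    Column numbers are 1-based to mirror `cut -f`.
    """
    rows = []
    for line in psql_output.splitlines():
        if not line:
            continue  # skip blank lines
        fields = line.split(sep)
        rows.append([fields[i - 1] for i in columns])
    return rows


# Assumed sample: the question's two rows in unaligned form.
sample = ("0|10.20.30.40|5432|2|0.500000|primary\n"
          "1|10.20.30.41|5432|2|0.500000|standby\n")

statuses = select_columns(sample, [4])
print(statuses)
```

The same caveat applies as with -F: the delimiter is not escaped in the data, so this only works while the separator never appears inside a value.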

Alternative to list command in postgres for scripting

Unfortunately, psql -l wraps long lines.
For example, look at the "Access privileges" column in this output:
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-------------------------+------------+----------+-------------+-------------+-----------------------
firstdb | postgres | UTF8 | de_DE.UTF-8 | de_DE.UTF-8 |
secnddb | scnduser | UTF8 | de_DE.UTF-8 | de_DE.UTF-8 |
thrddb | scnduser | UTF8 | de_DE.UTF-8 | de_DE.UTF-8 |
postgres | postgres | UTF8 | de_DE.UTF-8 | de_DE.UTF-8 |
template0 | postgres | UTF8 | de_DE.UTF-8 | de_DE.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | de_DE.UTF-8 | de_DE.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
(6 rows)
Even with some extra options, I can't get rid of the wrapping:
$ psql -Atlqn
firstdb|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|
secnddb|scnduser|UTF8|de_DE.UTF-8|de_DE.UTF-8|
thrddb|scnduser|UTF8|de_DE.UTF-8|de_DE.UTF-8|
postgres|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|
template0|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|=c/postgres
postgres=CTc/postgres
template1|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|=c/postgres
postgres=CTc/postgres
Question: is there another way to get the list of databases in the same form that \list prints it, so that I can parse it in scripts with e.g. awk?

Interesting issue. You're being bitten by hard linewraps in ACL entries. Those aren't the only place they can appear btw, just the most common.
Use a null-byte recordsep
Rather than trying to avoid the newlines, why not use a different record separator? A null byte (\0) can be good; that's what the -0 option is for. It's only useful if your client can deal with null bytes, though; good for xargs -0, not good for lots of other stuff. The handy thing about a null byte is that it won't appear in psql's output otherwise so there's no risk of conflict. gawk does support null-separated records, though it's woefully underdocumented.
Try, e.g:
psql -Atlqn -0 | gawk -v 'RS=\0' '{ gsub("\n", " "); print }'
which replaces newlines in database names (yes, they can appear there!), ACL entries, etc with a space.
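For clients other than gawk, the same null-separated approach might look like the following Python sketch. It assumes the bytes were captured from psql -Atlqn -0; split_records is a hypothetical helper, and the sample bytes are made up from the listing in the question rather than taken from a real run.

```python
from typing import List


def split_records(raw: bytes, record_sep: bytes = b"\0",
                  field_sep: str = "|") -> List[List[str]]:
    """Split null-separated psql output into records, then into fields.

    Newlines inside a record (e.g. in ACL entries) are replaced by spaces.
    """
    records = []
    for chunk in raw.split(record_sep):
        text = chunk.decode("utf-8")
        if not text.strip():
            continue  # ignore an empty chunk after a trailing separator
        records.append(text.replace("\n", " ").split(field_sep))
    return records


# Assumed sample: two databases, the second with a wrapped ACL entry.
raw = (b"firstdb|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|\0"
       b"template0|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|"
       b"=c/postgres\npostgres=CTc/postgres\0")

records = split_records(raw)
for record in records:
    print(record)
```

In a real script the raw bytes could come from something like subprocess.run(["psql", "-Atlqn", "-0"], capture_output=True).stdout. Field values are still unescaped, so a database name containing | would break the field split.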
Use a different recordsep
Alternately, use -R, e.g. -R '!' or -R '--SEPARATOR--' or whatever is easy for you to parse and not likely to appear in the output.
Query the catalogs yourself, escaping strings
Depending on the information you need, you can instead query the catalogs or information_schema directly, too. You'll still have to deal with funny chars, so you may want a regexp to escape any funny business.
Newlines and shell metacharacters
Beware that you still have to deal with unexpected newlines; consider what happens when some ##$# does this:
CREATE DATABASE "my
database";
Yes, that's a legal database name. So are both of:
CREATE DATABASE "$(rm -rf /gladthisisnotroot);";
CREATE DATABASE "$(createuser -l -s my_haxxor -W top_secret)"
Yes, both are legal database names. Yes, that's really, really bad if you don't escape your shell metacharacters properly, you made the mistake of running your shell scripts as root or the postgres user, and they're not feeling nice.

All the relevant data is in pg_database and pg_user; see http://www.postgresql.org/docs/current/static/catalog-pg-database.html
select pg_database.datname,pg_user.usename,pg_encoding_to_char(pg_database.encoding),pg_database.datcollate,pg_database.datctype,pg_database.datacl from pg_database,pg_user WHERE pg_database.datdba = pg_user.usesysid;
On the shell, wrapped in a psql command:
psql -AStnq -c "select [...]"
this returns correctly formatted output:
template1|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|{=c/postgres,postgres=CTc/postgres}
template0|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|{=c/postgres,postgres=CTc/postgres}
postgres|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|
firstdb|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|
secnddb|scnduser|UTF8|de_DE.UTF-8|de_DE.UTF-8|
thrddb|scnduser|UTF8|de_DE.UTF-8|de_DE.UTF-8|

Semicolon escaping in postgresql on Windows

I have an sql script defining a function in create_database_function.sql file:
CREATE OR REPLACE FUNCTION increment_display_count(in_host_id int, in_host_owner_id int, in_date_ date) RETURNS void as $$
BEGIN
UPDATE display_count set display_count = display_count + 1
WHERE host_id = in_host_id AND host_owner_id = in_host_owner_id AND date_ = in_date_;
IF FOUND THEN
RETURN;
END IF;
BEGIN
INSERT INTO display_count (host_id, host_owner_id, date_, display_count) VALUES (in_host_id, in_host_owner_id, in_date_, 1);
EXCEPTION WHEN OTHERS THEN
UPDATE display_count set display_count = display_count + 1
WHERE host_id = in_host_id AND host_owner_id = in_host_owner_id AND date_ = in_date_;
END;
RETURN;
END;
$$
LANGUAGE plpgsql;
I want to execute it on Windows. To do so, I run the usual command:
psql -h localhost -U postgres -d database -f create_database_function.sql
But this script gave me a huge number of syntax errors. An hour of googling didn't bear fruit, but I went on tinkering with the script and eventually found the problem.
The solution was to prefix every semicolon (;) with a backslash (\).
Though this solves the problem, it introduces another one: I work on the project with a colleague who uses Linux, and in his case the script must not have \ before the semicolons.
So why do I need to prefix ; with \ on Windows? Can this be avoided or done another way?
I googled for it a lot and haven't found any similar problem.
Update
The output when I use \; instead of ;:
C:\Xubuntu_shared\pixel\pixel\src\main\scripts>psql -h localhost -U postgres -d pixel_test -f create_database_function.sql
Password:
CREATE FUNCTION
The output with errors when I execute the script without backslash escaping:
C:\Xubuntu_shared\pixel\pixel\src\main\scripts>psql -h localhost -U postgres -d pixel_test -f create_database_function.sql
Password:
psql:create_database_function.sql:4: ERROR: unterminated dollar-quoted string at or near "$$
BEGIN
UPDATE display_count set display_count = display_count + 1
WHERE host_id = in_host_id AND host_owner_id = in_host_owner_id AND date_ = in_date_;" at character 121
psql:create_database_function.sql:6: ERROR: syntax error at or near "IF" at character 5
psql:create_database_function.sql:7: ERROR: syntax error at or near "IF" at character 9
psql:create_database_function.sql:9: ERROR: syntax error at or near "INSERT" at character 19
psql:create_database_function.sql:12: ERROR: syntax error at or near "EXCEPTION" at character 5
psql:create_database_function.sql:13: WARNING: there is no transaction in progress
COMMIT
psql:create_database_function.sql:14: ERROR: syntax error at or near "RETURN" at character 5
psql:create_database_function.sql:15: WARNING: there is no transaction in progress
COMMIT
psql:create_database_function.sql:17: ERROR: unterminated dollar-quoted string at or near "$$
LANGUAGE plpgsql;" at character 1
Other information that may be important:
create_database_function.sql encoding is UTF-8, without BOM. Line ending is in Windows format.
Update 2
Version
pixel=> SELECT version();
version
-------------------------------------------------------------
PostgreSQL 9.2.3, compiled by Visual C++ build 1600, 32-bit
(1 row)
pixel=>
Update 3
Output of the select name, setting, source from pg_settings where source <> 'default'; command:
Oleg#OLEG-PC /C/Xubuntu_shared/pixel/pixel/src/main/scripts (pixel-dev2)
$ psql -U postgres
Password:
Welcome to psql.exe 7.4.6, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? for help on internal slash commands
\g or terminate with semicolon to execute query
\q to quit
Warning: Console codepage (866) differs from windows codepage (1251)
8-bit characters will not work correctly. See PostgreSQL
documentation "Installation on Windows" for details.
postgres=# select name,setting,source from pg_settings where source <> 'default';
name | setting | source
----------------------------+------------------------------------------------------+----------------------
config_file | C:/Program Files/PostgreSQL/9.2/data/postgresql.conf | override
data_directory | C:/Program Files/PostgreSQL/9.2/data | override
DateStyle | ISO, MDY | configuration file
default_text_search_config | pg_catalog.english | configuration file
hba_file | C:/Program Files/PostgreSQL/9.2/data/pg_hba.conf | override
ident_file | C:/Program Files/PostgreSQL/9.2/data/pg_ident.conf | override
lc_collate | C | override
lc_ctype | C | override
lc_messages | C | configuration file
lc_monetary | C | configuration file
lc_numeric | C | configuration file
lc_time | C | configuration file
listen_addresses | * | configuration file
log_destination | stderr | configuration file
log_line_prefix | %t | configuration file
log_timezone | Europe/Moscow | configuration file
logging_collector | on | configuration file
max_connections | 100 | configuration file
max_stack_depth | 2048 | environment variable
port | 5432 | configuration file
server_encoding | UTF8 | override
shared_buffers | 4096 | configuration file
TimeZone | Europe/Moscow | configuration file
transaction_deferrable | off | override
transaction_isolation | read committed | override
transaction_read_only | off | override
wal_buffers | 128 | override
(27 rows)

For those who may encounter this rare problem:
It is caused by a mismatch between the Postgres server and client versions. I was using server version 9.2.3 with client version 7.4.6.
It is important to mention that the Postgres installer for Windows already includes the psql client, so there is no need to install another one.
I don't remember exactly why I installed a separate client, but I guess it was because psql didn't start from the console. I think that could have been cured by rebooting Windows after the PostgreSQL installation (as a lot of problems with this OS are solved this way) or by manually adding the path to psql.exe to the PATH environment variable.
So, if you face the same problem, check the versions of the server and the client. If they don't match, set the PATH environment variable to point to the client shipped with the original PostgreSQL installation.
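The version check suggested above can be automated in a few lines. This is a small sketch, assuming the client banner and the SELECT version() text have already been captured as strings; major_version is a hypothetical helper. It treats the first two numbers as the major version for releases before 10 and only the first number from 10 onwards, which matches PostgreSQL's numbering scheme.

```python
import re
from typing import Tuple


def major_version(version_text: str) -> Tuple[int, ...]:
    """Extract the major version from a psql banner or a version() string."""
    match = re.search(r"(\d+)(?:\.(\d+))?", version_text)
    if match is None:
        raise ValueError(f"no version number found in {version_text!r}")
    first = int(match.group(1))
    if first >= 10 or match.group(2) is None:
        return (first,)  # 10 and later: one number identifies the major release
    return (first, int(match.group(2)))  # before 10: e.g. 9.2 or 7.4


# Strings copied from the question above.
client = major_version(
    "Welcome to psql.exe 7.4.6, the PostgreSQL interactive terminal.")
server = major_version(
    "PostgreSQL 9.2.3, compiled by Visual C++ build 1600, 32-bit")

mismatch = client != server
print(client, server, "mismatch" if mismatch else "ok")
```

For the versions in this question the check reports a mismatch (7.4 client against a 9.2 server), which is the situation that produced the syntax errors.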
