Alternative to list command in postgres for scripting - postgresql

Unfortunately, psql -l wraps long values across several lines.
For example, look at the "Access privileges" column in this output:
                                          List of databases
 Name                    | Owner      | Encoding | Collate     | Ctype       | Access privileges
-------------------------+------------+----------+-------------+-------------+-----------------------
 firstdb                 | postgres   | UTF8     | de_DE.UTF-8 | de_DE.UTF-8 |
 secnddb                 | scnduser   | UTF8     | de_DE.UTF-8 | de_DE.UTF-8 |
 thrddb                  | scnduser   | UTF8     | de_DE.UTF-8 | de_DE.UTF-8 |
 postgres                | postgres   | UTF8     | de_DE.UTF-8 | de_DE.UTF-8 |
 template0               | postgres   | UTF8     | de_DE.UTF-8 | de_DE.UTF-8 | =c/postgres          +
                         |            |          |             |             | postgres=CTc/postgres
 template1               | postgres   | UTF8     | de_DE.UTF-8 | de_DE.UTF-8 | =c/postgres          +
                         |            |          |             |             | postgres=CTc/postgres
(6 rows)
Hint: even with some extra options, I can't get rid of the line wraps:
$ psql -Atlqn
firstdb|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|
secnddb|scnduser|UTF8|de_DE.UTF-8|de_DE.UTF-8|
thrddb|scnduser|UTF8|de_DE.UTF-8|de_DE.UTF-8|
postgres|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|
template0|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|=c/postgres
postgres=CTc/postgres
template1|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|=c/postgres
postgres=CTc/postgres
Question: Is there another way to get the list of databases in the same form that \list prints it, so that I can parse it in scripts with e.g. awk?

Interesting issue. You're being bitten by hard linewraps in ACL entries. Those aren't the only place they can appear btw, just the most common.
Use a null-byte recordsep
Rather than trying to avoid the newlines, why not use a different record separator? A null byte (\0) can be good; that's what the -0 option is for. It's only useful if your client can deal with null bytes, though; good for xargs -0, not good for lots of other stuff. The handy thing about a null byte is that it won't appear in psql's output otherwise so there's no risk of conflict. gawk does support null-separated records, though it's woefully underdocumented.
Try, e.g.:
psql -Atlqn -0 | gawk -v 'RS=\0' '{ gsub("\n", " "); print }'
which replaces newlines in database names (yes, they can appear there!), ACL entries, etc with a space.
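If gawk isn't at hand, the same idea can be sketched in Python. This is only an illustration: parse_records and list_databases are made-up helper names, and the sketch assumes the output is UTF-8, that psql is on the PATH and can connect without prompting, and that no field contains the | field separator.

```python
import subprocess


def parse_records(raw: bytes, field_sep: str = "|") -> list[list[str]]:
    """Split NUL-separated psql output into rows, flattening embedded newlines."""
    rows = []
    for record in raw.split(b"\0"):
        # Drop a stray newline around the record (e.g. after the last one).
        text = record.decode("utf-8").strip("\n")
        if not text:
            continue  # empty chunk after the final separator
        # Newlines inside a record (wrapped ACLs, odd names) become spaces.
        rows.append(text.replace("\n", " ").split(field_sep))
    return rows


def list_databases() -> list[list[str]]:
    """Run `psql -Atlqn -0` and parse its output (needs a reachable server)."""
    raw = subprocess.run(
        ["psql", "-Atlqn", "-0"], check=True, capture_output=True
    ).stdout
    return parse_records(raw)
```

Each row comes back as a list of fields, so the ACL of template0 ends up in one field instead of spilling onto a second line.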
Use a different recordsep
Alternately, use -R, e.g. -R '!' or -R '--SEPARATOR--' or whatever is easy for you to parse and not likely to appear in the output.
Query the catalogs yourself, escaping strings
Depending on the information you need, you can instead query the catalogs or information_schema directly, too. You'll still have to deal with funny chars, so you may want a regexp to escape any funny business.
Newlines and shell metacharacters
Beware that you still have to deal with unexpected newlines; consider what happens when some ##$# does this:
CREATE DATABASE "my
database";
Yes, that's a legal database name. So are both of:
CREATE DATABASE "$(rm -rf /gladthisisnotroot);";
CREATE DATABASE "$(createuser -l -s my_haxxor -W top_secret)";
Yes, both are legal database names. And yes, that's really, really bad if you don't escape your shell metacharacters properly, you made the mistake of running your shell scripts as root or as the postgres user, and whoever created the database isn't feeling nice.
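One way to sidestep the quoting problem is to never hand the name to a shell at all. A minimal Python sketch, assuming you drive the tools from a script: the helper names are hypothetical, and pg_dump merely stands in for whatever command you run per database.

```python
import shlex
import subprocess

EVIL_NAME = "$(rm -rf /gladthisisnotroot);"


def dump_command(dbname: str) -> list[str]:
    """Build an argument list; without shell=True no shell ever parses dbname."""
    # "--" ends option parsing, so a name starting with "-" is not read as a flag.
    return ["pg_dump", "--", dbname]


def quoted_for_shell(dbname: str) -> str:
    """If a shell command string is unavoidable, quote the name so it stays literal."""
    return "pg_dump -- " + shlex.quote(dbname)


def run_dump(dbname: str) -> None:
    """Hypothetical wrapper: runs pg_dump directly (assumes it is on PATH)."""
    subprocess.run(dump_command(dbname), check=True)
```

With the argument-list form the $(...) in the name is just text handed to the program; nothing expands it.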

All the relevant data is in pg_database and pg_user;
see http://www.postgresql.org/docs/current/static/catalog-pg-database.html
SELECT pg_database.datname,
       pg_user.usename,
       pg_encoding_to_char(pg_database.encoding),
       pg_database.datcollate,
       pg_database.datctype,
       pg_database.datacl
FROM pg_database
JOIN pg_user ON pg_database.datdba = pg_user.usesysid;
On the shell, wrapped in a psql command:
psql -AStnq -c "select [...]"
This returns correctly formatted output, with one database per line:
template1|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|{=c/postgres,postgres=CTc/postgres}
template0|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|{=c/postgres,postgres=CTc/postgres}
postgres|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|
firstdb|postgres|UTF8|de_DE.UTF-8|de_DE.UTF-8|
secnddb|scnduser|UTF8|de_DE.UTF-8|de_DE.UTF-8|
thrddb|scnduser|UTF8|de_DE.UTF-8|de_DE.UTF-8|
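Note that datacl arrives as a single array literal on one line. A rough Python sketch of taking such a line apart in a script: parse_datacl and parse_line are made-up helpers, and the sketch assumes no ACL entry contains a comma or a |, which unusual role names could violate.

```python
def parse_datacl(field: str) -> list[str]:
    """Turn '{=c/postgres,postgres=CTc/postgres}' into a list of ACL entries."""
    field = field.strip()
    if not field:
        return []  # NULL datacl is printed as an empty field
    if field.startswith("{") and field.endswith("}"):
        field = field[1:-1]
    return [item.strip('"') for item in field.split(",") if item]


def parse_line(line: str) -> dict:
    """Split one unaligned output line of the query into named fields."""
    # rsplit keeps any extra "|" inside the database name in the first field.
    name, owner, encoding, collate, ctype, acl = line.rstrip("\n").rsplit("|", 5)
    return {
        "name": name,
        "owner": owner,
        "encoding": encoding,
        "collate": collate,
        "ctype": ctype,
        "acl": parse_datacl(acl),
    }
```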

Related

UPPER function is not working properly on O umlaut characters in Postgres

I have a Postgres database set up as follows:
 Name      | Owner      | Encoding | Collate     | Ctype       | Access privileges
-----------+------------+----------+-------------+-------------+----------------------------
 my_db     | admin_user | UTF8     | de_DE.UTF-8 | C           | =Tc/admin_user            +
           |            |          |             |             | admin_user=CTc/admin_user +
           |            |          |             |             | my_readonly=c/admin_user
The UPPER function does not work properly on O-umlaut characters in this database.
Please advise whether any setting could be causing this.
LC_CTYPE* determines which characters count as letters or digits, and how lowercase letters map to uppercase ones. It needs to be something like de_DE.UTF-8 for UPPER to handle such letters. You have C at the moment.
By default a new database copies these settings from its template database, whose values initdb originally took from the operating system's environment variables. You can override them when you create the database.
*I read CTYPE as Character TYPE
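To make the difference concrete, here is a small Python emulation. It is not Postgres code and both function names are made up; it assumes that with LC_CTYPE = C only the ASCII letters a-z are treated as lowercase letters, which is the documented behaviour of the C locale.

```python
def upper_with_c_ctype(text: str) -> str:
    """Emulate UPPER under LC_CTYPE = C: only ASCII a-z are converted."""
    return "".join(ch.upper() if "a" <= ch <= "z" else ch for ch in text)


def upper_with_unicode_ctype(text: str) -> str:
    """Emulate UPPER under a locale such as de_DE.UTF-8 that knows about umlauts."""
    return text.upper()
```

Under the first function an o-umlaut passes through unchanged, which matches the symptom described in the question.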

Postgres - character sets and encodings

I was wondering if someone can help me understand what's going on/wrong with my Postgres data please...
I'll explain things below - but I guess ultimately the questions I have are :
What character set/encoding should I be using (i.e. what is best practice)?
If the answer is UTF8, will certain characters (e.g. UK pound symbols) always look "funny" in the database?
I've got a database that has a table with data about flights (although obviously it could be anything really), defined as follows...
CREATE TABLE public.flight (
    flightid integer DEFAULT nextval('public.flight_seq'::regclass) NOT NULL,
    tripid integer NOT NULL,
    flightdatedeparted date NOT NULL,
    flightairportdeparted text NOT NULL,
    flightairportarrived text NOT NULL,
    flightairline text NOT NULL,
    flightdetails text,
    flightdayflightnumber integer DEFAULT 1 NOT NULL,
    flightdistance numeric
);
Now, when I enter data into it via a web front end connected to this database then I end up with data something like...
holidayinfo=# select distinct * from flight where flightid=97;
-[ RECORD 1 ]---------+---------------------------------------
flightid | 97
tripid | 36
flightdatedeparted | 2004-05-14
flightairportdeparted | LHR
flightairportarrived | WAW
flightairline | British Airways
flightdetails | Hotline, ┬ú82.40, BA850, 13:40 -> 17:05
flightdayflightnumber | 1
flightdistance | 912.7
However, the data that I'd entered into the web form for the field "flightdetails" was actually...
Hotline, £82.40, BA850, 13:40 -> 17:05
Now, when I dump the data and look at it in Notepad++, depending on what encoding I use then sometimes I see it correctly as the pound symbol (when I choose ANSI) and other times it's incorrect as xA3 (when I choose UTF8).
At least when it's stored in Postgres as the "funny" value then it also displays correctly on my webpage when I retrieve the data - so that's good.
If I try to manually update the value via psql then I get the following...
holidayinfo=# update flight set flightdetails='Hotline, £82.40, BA850, 13:40 -> 17:05' where flightid=97;
ERROR: invalid byte sequence for encoding "UTF8": 0x9c
In terms of how my database is created and what client encoding it's using, I've got the following...
holidayinfo=# \l
                                                   List of databases
 Name        | Owner    | Encoding | Collate                     | Ctype                       | Access privileges
-------------+----------+----------+-----------------------------+-----------------------------+-----------------------
 holidayinfo | postgres | UTF8     | English_United Kingdom.1252 | English_United Kingdom.1252 |
 leagueinfo  | postgres | UTF8     | English_United Kingdom.1252 | English_United Kingdom.1252 |
 postgres    | postgres | UTF8     | English_United Kingdom.1252 | English_United Kingdom.1252 |
 template0   | postgres | UTF8     | English_United Kingdom.1252 | English_United Kingdom.1252 | =c/postgres          +
             |          |          |                             |                             | postgres=CTc/postgres
 template1   | postgres | UTF8     | English_United Kingdom.1252 | English_United Kingdom.1252 | =c/postgres          +
             |          |          |                             |                             | postgres=CTc/postgres
(5 rows)
holidayinfo=# show client_encoding;
client_encoding
-----------------
UTF8
(1 row)
Maybe this is all working as designed, but I'm just confused as to how things should be?
Ultimately, I'd love to be able to have the data stored so that I can see it as the pound sign AND be entered/retrieved/displayed as the pound sign.
The former is desirable so that if I ever need to look at the data I can see what the real data is, and not have to make assumptions about what character "┬ú" really means.
Also, this problem scales up when other characters have the same "issue", such as a hyphen (-) showing as "ÔÇô" or an apostrophe (') showing as "ÔÇÖ".
Thanks in advance!
You must be viewing the data with psql using cmd.exe with code page CP-850.
The data in your database are wrong, because the application that inserted them had client_encoding set to WIN1252 while feeding the database UTF-8 characters.
So £, which is 0xC2A3 in UTF-8, is interpreted as two characters, namely Â (0xC2) and £ (0xA3). They are converted to UTF-8 and stored in the database as 4 bytes (0xC382 and 0xC2A3). When you view them with psql, they are converted back to WINDOWS-1252, but cmd.exe interprets them as CP-850 and renders them as ┬ú.
The fix is to change client_encoding to UTF8 in the application that inserts the data into the database.
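The byte juggling described above can be replayed outside the database. A minimal Python sketch, assuming Python's cp1252 and cp850 codecs are fair stand-ins for WIN1252 and the cmd.exe code page; the function names are made up for illustration.

```python
def stored_text(typed: str, declared_client_encoding: str = "cp1252") -> str:
    """Text the server stores when the application sends UTF-8 bytes
    while client_encoding claims WIN1252."""
    return typed.encode("utf-8").decode(declared_client_encoding)


def console_rendering(stored: str, client_encoding: str = "cp1252",
                      console_codepage: str = "cp850") -> str:
    """What the console shows when psql emits bytes in client_encoding
    and the console decodes them with its own code page."""
    return stored.encode(client_encoding).decode(console_codepage)


def bytes_typed_in_console(typed: str, console_codepage: str = "cp850") -> bytes:
    """Bytes a CP-850 console hands to psql for a typed character."""
    return typed.encode(console_codepage)
```

Feeding the pound sign through stored_text gives the two characters Â and £ (the 4 bytes c3 82 c2 a3 in a UTF8 database), and console_rendering turns that into ┬ú. bytes_typed_in_console shows that a pound sign typed into a CP-850 console is the single byte 0x9c, which is not valid UTF-8 and so matches the error in the question.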

psql non-select: how to remove formatting and show only certain columns?

I'm looking to remove all line drawing characters from:
PGPASSWORD="..." psql -d postgres -h "1.2.3.4" -p 9432 -c 'show pool_nodes' -U owner
 node_id | hostname      | port | status | lb_weight | role
---------+---------------+------+--------+-----------+---------
 0       | 10.20.30.40   | 5432 | 2      | 0.500000  | primary
 1       | 10.20.30.41   | 5432 | 2      | 0.500000  | standby
(2 rows)
Adding the -t option gets rid of the header and footer, but the vertical bars are still present:
PGPASSWORD="..." psql -t -d postgres -h "1.2.3.4" -p 9432 -c 'show pool_nodes' -U owner
 0       | 10.20.30.40   | 5432 | 2      | 0.500000  | primary
 1       | 10.20.30.41   | 5432 | 2      | 0.500000  | standby
Note that this question is specific to show pool_nodes and other similar non-select SQL statements.
My present workaround is to involve the Linux cut command:
<previous command> | cut -d '|' -f 4
The question has two parts:
Using psql only, how can the vertical bars above be removed?
Using psql only, how can just a specific column (for example, status) or set of columns be shown? For example, the result might be just two lines, each showing the number 2.
I'm using psql version psql (PostgreSQL) 9.2.18 on a CentOS 7 server.
For scripting psql, use psql -qAtX:
-q: quiet
-A: unaligned output
-t: tuples only
-X: do not read .psqlrc
To filter columns you must name them in the SELECT list. psql always outputs the full result set it gets from the server. E.g. SELECT status FROM pool_nodes.
Or you can cut to extract ordinal column numbers e.g.
psql -qAtX -c 'whatever' | cut -d '|' -f 1,2-4
(I have no idea how show pool_nodes can produce the output you show here, since SHOW returns a single scalar value...)
To change the delimiter from a pipe | to something else, use -F e.g. -F ','. But be warned, the delimiter is not escaped when it appears in output, this isn't CSV. You might want to consider a tab as a useful option; you have to enter a quoted literal tab to do this. (If doing it in an interactive shell, search for "how to enter literal tab in bash" when you get stuck).
Example showing all the above, given dummy data:
CREATE TABLE dummy_table (
    a integer,
    b integer,
    c text,
    d text
);
INSERT INTO dummy_table
VALUES
    (1,1,'chicken','turkey'),
    (2,2,'goat','cow'),
    (3,3,'mantis','cricket');
Query, with a single space as the column delimiter (so you'd better not have spaces in your data!):
psql -qAtX -F ' ' -c 'SELECT a, b, d FROM dummy_table'
If for some reason you cannot generate a column-list for SELECT you can instead filter by column-ordinal with cut:
psql -qAtX -F '^' -c 'TABLE dummy_table' | cut -d '^' -f 1-2,4
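If the script is not a shell script, the literal-tab problem disappears, because the separator is passed as a normal argument. A hedged Python sketch of the same approach: psql_args, pick_columns and query_columns are made-up names, psql is assumed to be on the PATH with connection settings coming from the environment, and the data is assumed not to contain tabs or newlines.

```python
import subprocess


def psql_args(sql: str, field_sep: str = "\t") -> list[str]:
    """Build the psql command line: quiet, unaligned, tuples only, no .psqlrc."""
    return ["psql", "-qAtX", "-F", field_sep, "-c", sql]


def pick_columns(output: str, ordinals: list[int],
                 field_sep: str = "\t") -> list[list[str]]:
    """Keep only the given 1-based columns, like cut -f."""
    rows = []
    for line in output.split("\n"):
        if not line:
            continue  # skip the empty string after the final newline
        fields = line.split(field_sep)
        rows.append([fields[i - 1] for i in ordinals])
    return rows


def query_columns(sql: str, ordinals: list[int]) -> list[list[str]]:
    """Run the query through psql and return the selected columns."""
    out = subprocess.run(
        psql_args(sql), check=True, capture_output=True, text=True
    ).stdout
    return pick_columns(out, ordinals)
```

For example, query_columns('TABLE dummy_table', [1, 2, 4]) would correspond to the cut -f 1-2,4 pipeline above.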

Column sorting in PostgreSQL is different between macOS and Ubuntu using same collation

I created a database with UTF8 encoding and fr_FR collation on both my Mac and Ubuntu server like this:
CREATE DATABASE my_database OWNER 'admin' TEMPLATE 'template0' ENCODING 'UTF8' LC_COLLATE 'fr_FR.UTF-8' LC_CTYPE 'fr_FR.UTF-8';
On both, I queried the collation:
show lc_collate;
and obtained:
fr_FR.UTF-8
Then I ran the same sorted query on both machines and did not get the same results:
SELECT winery FROM usr_wines WHERE user_id=1 AND status=1 ORDER BY winery LIMIT 5;
1 - On macOS:
a space before the a
A New record
Aa
Altesinoo
Aé
2- On Ubuntu 14.04:
Aa
Aé
Altesino
A New Wine
a space before a
On Ubuntu, I have installed the desired locales and created a new collation:
CREATE COLLATION "fr_FR.utf8" (LOCALE = "fr_FR.utf8");
select * from pg_collation;
 collname   | collnamespace | collowner | collencoding | collcollate | collctype
------------+---------------+-----------+--------------+-------------+------------
 default    |            11 |        10 |           -1 |             |
 C          |            11 |        10 |           -1 | C           | C
 POSIX      |            11 |        10 |           -1 | POSIX       | POSIX
 C.UTF-8    |            11 |        10 |            6 | C.UTF-8     | C.UTF-8
 en_US      |            11 |        10 |            6 | en_US.utf8  | en_US.utf8
 en_US.utf8 |            11 |        10 |            6 | en_US.utf8  | en_US.utf8
 ucs_basic  |            11 |        10 |            6 | C           | C
 fr_FR      |          2200 |        10 |            6 | fr_FR.utf8  | fr_FR.utf8
On the mac, the fr_FR collation was already installed.
So why is the sorting different?
Another strange issue on Ubuntu: when I tried to force the collation in my query:
SELECT winery FROM usr_wines WHERE user_id=1 AND status=1 ORDER BY winery COLLATE "fr_FR" LIMIT 5;
I got:
ERROR: collation "fr_FR" for encoding "UTF8" does not exist
Any help is welcome.
COLLATE "C" will give you predictable results on all platforms. Additional collations may be available depending on operating system support, and their behaviour depends entirely on the OS, which is why the same fr_FR.UTF-8 locale can sort differently on macOS and Ubuntu.
https://www.postgresql.org/docs/current/static/collation.html:
On all platforms, the collations named default, C, and POSIX are
available. Additional collations may be available depending on
operating system support. The default collation selects the LC_COLLATE
and LC_CTYPE values specified at database creation time. The C and
POSIX collations both specify "traditional C" behavior, in which only
the ASCII letters "A" through "Z" are treated as letters, and sorting
is done strictly by character code byte values.
If the operating system provides support for using multiple locales
within a single program (newlocale and related functions), then when a
database cluster is initialized, initdb populates the system catalog
pg_collation with collations based on all the locales it finds on the
operating system at the time. For example, the operating system might
provide a locale named de_DE.utf8. initdb would then create a
collation named de_DE.utf8 for encoding UTF8 that has both LC_COLLATE
and LC_CTYPE set to de_DE.utf8. It will also create a collation with
the .utf8 tag stripped off the name. So you could also use the
collation under the name de_DE, which is less cumbersome to write and
makes the name less encoding-dependent. Note that, nevertheless, the
initial set of collation names is platform-dependent.
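As an illustration of what "strictly by character code byte values" means, here is a tiny Python emulation of ordering under the C collation. It is not Postgres code, sort_like_collate_c is a made-up name, and it assumes a UTF8 database so that the byte values compared are the UTF-8 bytes.

```python
def sort_like_collate_c(values: list[str]) -> list[str]:
    """Order strings by their UTF-8 bytes, as the C collation is documented to do."""
    return sorted(values, key=lambda value: value.encode("utf-8"))


# The five values from the Ubuntu result in the question.
WINERIES = ["Aa", "Aé", "Altesino", "A New Wine", "a space before a"]
```

Sorting WINERIES this way puts every uppercase "A..." entry before the lowercase one and the accented entry after the plain ASCII ones, on any operating system.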

Semicolon escaping in postgresql on Windows

I have an sql script defining a function in create_database_function.sql file:
CREATE OR REPLACE FUNCTION increment_display_count(in_host_id int, in_host_owner_id int, in_date_ date) RETURNS void as $$
BEGIN
    UPDATE display_count set display_count = display_count + 1
    WHERE host_id = in_host_id AND host_owner_id = in_host_owner_id AND date_ = in_date_;
    IF FOUND THEN
        RETURN;
    END IF;
    BEGIN
        INSERT INTO display_count (host_id, host_owner_id, date_, display_count) VALUES (in_host_id, in_host_owner_id, in_date_, 1);
    EXCEPTION WHEN OTHERS THEN
        UPDATE display_count set display_count = display_count + 1
        WHERE host_id = in_host_id AND host_owner_id = in_host_owner_id AND date_ = in_date_;
    END;
    RETURN;
END;
$$
LANGUAGE plpgsql;
I want to execute it on Windows. To do so, I run the usual command:
psql -h localhost -U postgres -d database -f create_database_function.sql
But this script gave me a huge number of syntax errors. An hour-long googling didn't bear fruit. But I went on tinkering with this script and eventually found the problem.
The solution was to prefix every semicolon (;) with a backslash (\).
Though this solves the problem, it introduces another one. I work on the project with another guy. But he works on Linux. In his case the script should be without \ before semicolons.
So, why do I need to prepend ; with \ on Windows? Can this be somehow avoided or done another way?
I googled for it a lot and haven't found any similar problem.
Update
The output when I use \; instead of ;:
C:\Xubuntu_shared\pixel\pixel\src\main\scripts>psql -h localhost -U postgres -d pixel_test -f create_database_function.sql
Password:
CREATE FUNCTION
The output with errors when I execute the script without backslash escaping:
C:\Xubuntu_shared\pixel\pixel\src\main\scripts>psql -h localhost -U postgres -d pixel_test -f create_database_function.sql
Password:
psql:create_database_function.sql:4: ERROR: unterminated dollar-quoted string at or near "$$
BEGIN
UPDATE display_count set display_count = display_count + 1
WHERE host_id = in_host_id AND host_owner_id = in_host_owner_id AND date_ = in_date_;" at character 121
psql:create_database_function.sql:6: ERROR: syntax error at or near "IF" at character 5
psql:create_database_function.sql:7: ERROR: syntax error at or near "IF" at character 9
psql:create_database_function.sql:9: ERROR: syntax error at or near "INSERT" at character 19
psql:create_database_function.sql:12: ERROR: syntax error at or near "EXCEPTION" at character 5
psql:create_database_function.sql:13: WARNING: there is no transaction in progress
COMMIT
psql:create_database_function.sql:14: ERROR: syntax error at or near "RETURN" at character 5
psql:create_database_function.sql:15: WARNING: there is no transaction in progress
COMMIT
psql:create_database_function.sql:17: ERROR: unterminated dollar-quoted string at or near "$$
LANGUAGE plpgsql;" at character 1
Other information that may be important:
create_database_function.sql encoding is UTF-8, without BOM. Line ending is in Windows format.
Update 2
Version
pixel=> SELECT version();
version
-------------------------------------------------------------
PostgreSQL 9.2.3, compiled by Visual C++ build 1600, 32-bit
(1 row)
pixel=>
Update 3
Output of the command select name, setting, source from pg_settings where source <> 'default';
Oleg@OLEG-PC /C/Xubuntu_shared/pixel/pixel/src/main/scripts (pixel-dev2)
$ psql -U postgres
Password:
Welcome to psql.exe 7.4.6, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? for help on internal slash commands
\g or terminate with semicolon to execute query
\q to quit
Warning: Console codepage (866) differs from windows codepage (1251)
8-bit characters will not work correctly. See PostgreSQL
documentation "Installation on Windows" for details.
postgres=# select name,setting,source from pg_settings where source <> 'default';
 name                       | setting                                              | source
----------------------------+------------------------------------------------------+----------------------
 config_file                | C:/Program Files/PostgreSQL/9.2/data/postgresql.conf | override
 data_directory             | C:/Program Files/PostgreSQL/9.2/data                 | override
 DateStyle                  | ISO, MDY                                             | configuration file
 default_text_search_config | pg_catalog.english                                   | configuration file
 hba_file                   | C:/Program Files/PostgreSQL/9.2/data/pg_hba.conf     | override
 ident_file                 | C:/Program Files/PostgreSQL/9.2/data/pg_ident.conf   | override
 lc_collate                 | C                                                    | override
 lc_ctype                   | C                                                    | override
 lc_messages                | C                                                    | configuration file
 lc_monetary                | C                                                    | configuration file
 lc_numeric                 | C                                                    | configuration file
 lc_time                    | C                                                    | configuration file
 listen_addresses           | *                                                    | configuration file
 log_destination            | stderr                                               | configuration file
 log_line_prefix            | %t                                                   | configuration file
 log_timezone               | Europe/Moscow                                        | configuration file
 logging_collector          | on                                                   | configuration file
 max_connections            | 100                                                  | configuration file
 max_stack_depth            | 2048                                                 | environment variable
 port                       | 5432                                                 | configuration file
 server_encoding            | UTF8                                                 | override
 shared_buffers             | 4096                                                 | configuration file
 TimeZone                   | Europe/Moscow                                        | configuration file
 transaction_deferrable     | off                                                  | override
 transaction_isolation      | read committed                                       | override
 transaction_read_only      | off                                                  | override
 wal_buffers                | 128                                                  | override
(27 rows)
For those who may encounter this rare problem:
It is caused by a mismatch between the Postgres server and client versions. I was using server version 9.2.3 and client version 7.4.6.
It is important to mention that the Postgres installer for Windows already includes the psql client, so there is no need to install another one.
I don't remember exactly why I installed a separate client, but I guess it was because psql didn't start from the console. I think that could have been cured by rebooting Windows after the PostgreSQL installation (as a lot of problems with this OS are solved this way) or by manually adding the path to psql.exe to the PATH environment variable.
So, if you face the same problem, check the versions of the server and the client. If they don't match, set the PATH environment variable so that it points to the client that shipped with your PostgreSQL installation.
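A quick way to automate that check is to compare the version banners. The sketch below is only an illustration with made-up function names; it assumes the first number in each banner is the version, and that releases before 10 use two components for the major version (9.2) while later ones use one (16).

```python
import re


def major_version(banner: str) -> str:
    """Extract the major version from a psql or server version banner."""
    match = re.search(r"(\d+)(?:\.(\d+))?", banner)
    if match is None:
        raise ValueError(f"no version number found in {banner!r}")
    first, second = match.group(1), match.group(2)
    if int(first) >= 10 or second is None:
        return first
    return f"{first}.{second}"


def versions_match(client_banner: str, server_banner: str) -> bool:
    """True when client and server report the same major version."""
    return major_version(client_banner) == major_version(server_banner)
```

The client banner can be taken from psql --version and the server banner from SELECT version(); with the banners shown in this question the check reports a mismatch.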