formatting the data while exporting to a file? - postgresql

I want the output file to be in the format below.
Expected OUTPUT:
Table CUSTOMERMSTR
---------------------------
custno cjd custtype
-------------- ----------- ------------
cust123 01-OCT-1900 1
cust123 08-SEP-1997 1
cust123 01-JAN-1996 1
3 rows
AS of NOW:
Table CUSTOMERMSTR
----------------------------
custno|to_char|custtype
cust123|01-OCT-1900|1
cust123|08-SEP-1997|1
cust123|01-JAN-1996|1
This is my expdata.psql file on the UNIX server.
It is not producing the expected format.
Please tell me what I have to add to my .psql script to get the desired output.
--col custno format 9999999 heading "custno"
--col cjd format A25 heading "cjd"
--col custtype format 999999999 heading "custtype"
\t off
\a
\echo
\echo Table CUSTOMERMSTR
\echo ----------------------------
\echo
select custno,TO_CHAR(cjd,'DD-MON-YYYY'),custtype from CUSTOMERMSTR;
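The layout in the expected output is just psql's default aligned format with a row-count footer, so here is a sketch of what should work (the AS cjd alias replaces the to_char heading; the \pset settings are standard psql meta-commands, but verify them against your psql version):
\t off
\pset format aligned
\pset footer on
\echo
\echo Table CUSTOMERMSTR
\echo ----------------------------
\echo
select custno, TO_CHAR(cjd,'DD-MON-YYYY') AS cjd, custtype from CUSTOMERMSTR;
Note that \a in the original script toggles psql into unaligned mode, which is what produces the pipe-separated output; dropping it (or forcing the format as above) restores the dashed column headers.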

Related

psql expanded display - avoid dashes

When I have a very wide column (like a json document) and I am using expanded display to make the contents at least partly readable, I still see extremely ugly record separators that seem to want to be as wide as the widest column, like so:
Is there a way to avoid the "Sea of Dashes"?
-[ RECORD 1 ]--+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id | 18
description | {json data xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx}
parameter | {json data xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx}
name | Foo
-[ RECORD 2 ]--+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id | 19
description | {}
parameter | {json data xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx}
name | CustomerRequestEventType
To avoid the sea of dashes, use \pset format unaligned, e.g.:
t=# \x
Expanded display is on.
t=# \pset format unaligned
Output format is unaligned.
t=# with ts as (select generate_series('2010-01-01'::timestamp,'2010-01-10'::timestamp,'1 day'::interval) s) select array_agg(s) from ts;
array_agg|{"2010-01-01 00:00:00","2010-01-02 00:00:00","2010-01-03 00:00:00","2010-01-04 00:00:00","2010-01-05 00:00:00","2010-01-06 00:00:00","2010-01-07 00:00:00","2010-01-08 00:00:00","2010-01-09 00:00:00","2010-01-10 00:00:00"}
Time: 0.250 ms
As you can see there are no dashes, but the long string is still wrapped across lines at the width of the window (or not wrapped at all). For an unformatted string this is the solution, but you mentioned json - it can be divided in a prettier way. To do so, instead of relying on the unaligned format in psql alone, use the jsonb_pretty function or the pretty flag of other functions, e.g. array_to_json(..., true):
t=# with ts as (select generate_series('2010-01-01'::timestamp,'2010-01-31'::timestamp,'1 day'::interval) s) select array_to_json(array_agg(s),true) from ts;
array_to_json|["2010-01-01T00:00:00",
"2010-01-02T00:00:00",
"2010-01-03T00:00:00",
"2010-01-04T00:00:00",
"2010-01-05T00:00:00",
"2010-01-06T00:00:00",
"2010-01-07T00:00:00",
"2010-01-08T00:00:00",
"2010-01-09T00:00:00",
"2010-01-10T00:00:00",
"2010-01-11T00:00:00",
"2010-01-12T00:00:00",
"2010-01-13T00:00:00",
"2010-01-14T00:00:00",
"2010-01-15T00:00:00",
"2010-01-16T00:00:00",
"2010-01-17T00:00:00",
"2010-01-18T00:00:00",
"2010-01-19T00:00:00",
"2010-01-20T00:00:00",
"2010-01-21T00:00:00",
"2010-01-22T00:00:00",
"2010-01-23T00:00:00",
"2010-01-24T00:00:00",
"2010-01-25T00:00:00",
"2010-01-26T00:00:00",
"2010-01-27T00:00:00",
"2010-01-28T00:00:00",
"2010-01-29T00:00:00",
"2010-01-30T00:00:00",
"2010-01-31T00:00:00"]
Time: 0.291 ms
Note that I still use the unaligned format to avoid the "+" continuation markers, though...
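For the json columns in the question the same idea applies to the stored values themselves: cast and pretty-print the column (the table name below is illustrative; id and description are taken from the question's expanded output, and the cast assumes the column holds valid json):
t=# \pset format unaligned
t=# select id, jsonb_pretty(description::jsonb) as description from my_table where id = 18;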

PostgreSQL full text search yielding weird results

I have a schema like this (simplified):
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name text NOT NULL
);
CREATE INDEX users_idx
ON users
USING GIN (to_tsvector('finnish', name));
But I'm getting completely invalid results with my queries:
# select name from users where to_tsvector('finnish', name) @@ to_tsquery('lemmin');
name
------
(0 rows)
# select name from users where to_tsvector('finnish', name) @@ to_tsquery('lemmink');
name
--------------------
Riitta ja Lemminki
Riitta ja Lemminki
(2 rows)
# select name from users where name ilike 'lemmink%';
name
----------------------
Lemminkäinen Matilda
Lemminkäinen Matias
Lemminkäinen Kyösti
Lemminkäinen Tuomas
(4 rows)
Another example:
# select name from users where to_tsvector('finnish', name) @@ to_tsquery('partu');
name
----------
Partuuna
(1 row)
# select name from users where to_tsvector('finnish', name) @@ to_tsquery('partur');
name
------------------------
Parturi-Kampaamo Raija
Parturi-Kampaamo Siema
(2 rows)
I was expecting to get the bottom two results on both queries...
Using the following version:
psql (9.4.6, server 9.5.2)
WARNING: psql major version 9.4, server major version 9.5.
Some psql features might not work.
I don't speak Finnish, but this seems to be the expected result. FTS looks for lexemes, not for parts of words. E.g., do is not a lexeme for dog, but dog is for dogs:
t=# select to_tsvector('english', 'Dogs eats bone') @@ to_tsquery('do');
NOTICE: text-search query contains only stop words or doesn't contain lexemes, ignored
?column?
----------
f
(1 row)
t=# select to_tsvector('english', 'Dogs eats bone') @@ to_tsquery('dog');
?column?
----------
t
(1 row)
So I believe that in Parturi the final i is an optional ending - right?..
Update:
from https://en.wiktionary.org/wiki/parturi :
partur[i], partur[eita] => lexeme will be partur
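One way to check which lexemes the Finnish configuration actually produces from these names is ts_lexize with the Snowball stemmer (finnish_stem is the stemming dictionary the built-in finnish configuration normally uses; the resulting lexemes depend on the stemmer, so no output is shown here):
t=# select ts_lexize('finnish_stem', 'Lemminkäinen');
t=# select ts_lexize('finnish_stem', 'Parturi');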

Importing bytea data into PostgreSQL by using COPY FROM stdin

I generated a (UTF-8) file with an external program for importing into PostgreSQL 9.6.1. The problem is the bytea field (PWHASH).
Snippet from this file (using TAB as the delimiter):
COPY USERS (ID,CODE,PWHASH,EMAIL) FROM stdin;
7 test1 E'\\\\x657B954D27B4AC56FA997D24A5FF2563' test@amce.org
\.
When importing with
psql mydb myrole -f test.sql
Everything goes well.
However, if I query the result, the byte array is not 16 bytes but 37 bytes:
select passwordhash,length(passwordhash) from users;
passwordhash | length
------------------------------------------------------------------------------+--------
\x45275c78363537423935344432374234414335364641393937443234413546463235363327 | 37
What is the correct syntax for this?
The format of the input file is wrong. It should be like this:
7 test1 \\x657B954D27B4AC56FA997D24A5FF2563 test@amce.org
I will have to "prepare" the data, I believe. Something like this:
t=# insert into u select 'x657B954D27B4AC56FA997D24A5FF2563';
INSERT 0 1
Time: 5990.809 ms
t=# select b from u;
b
----------------------------------------------------------------------
\x783635374239353444323742344143353646413939374432344135464632353633
(1 row)
Time: 0.234 ms
t=# insert into u select decode('657B954D27B4AC56FA997D24A5FF2563','hex');
INSERT 0 1
Time: 62.767 ms
t=# select b from u;
b
----------------------------------------------------------------------
\x783635374239353444323742344143353646413939374432344135464632353633
\x657b954d27b4ac56fa997d24a5ff2563
(2 rows)
Time: 0.208 ms
So in your case you can:
create table t as select ID,CODE,PWHASH::text,EMAIL from users where false;
COPY t (ID,CODE,PWHASH,EMAIL) FROM stdin;
insert into users select ID,CODE,decode(substr(PWHASH,4),'hex'),EMAIL from t;
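For what it's worth, the 37 bytes in the question are just the 37 characters of the text E'\x657B954D27B4AC56FA997D24A5FF2563' stored as bytea, which is why a decode step is needed. A quick sanity check that the plain hex text form parses to the expected 16 bytes (a minimal sketch in the same session style, no table needed):
t=# select length('\x657B954D27B4AC56FA997D24A5FF2563'::bytea);
 length
--------
     16
(1 row)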

Postgres-pgloader-transformation in columns

I am loading a flat file into a Postgres table. I need to do a few transformations while reading the file and loading it.
Like:
--> Check for characters; if present, default to some value. Reg_Exp can be used in Oracle - how can such functions be called in the syntax below?
--> TO_DATE function from text format
--> Check for NULL and default to some value
--> Trim functions
--> Only a few columns from the source file should be loaded
--> Defaulting values; say, for instance, the source file has only 3 columns but we need to load 4 columns - one column should be defaulted with some value
LOAD CSV
FROM 'filename'
INTO postgresql://role@host:port/database_name?tablename
TARGET COLUMNS
(
alphanm,alphnumnn,nmrc,dte
)
WITH truncate,
skip header = 0,
fields optionally enclosed by '"',
fields escaped by double-quote,
fields terminated by '|',
batch rows = 100,
batch size = 1MB,
batch concurrency = 64
SET work_mem to '32 MB', maintenance_work_mem to '64 MB';
Kindly help me - how can this be accomplished using pgloader?
Thanks
Here's a self-contained test case for pgloader that reproduces your use-case, as best as I could understand it:
/*
Sorry pgloader version "3.3.2" compiled with SBCL 1.2.8-1.el7 Doing kind
of POC, to implement in real time work. Sample data from file:
raj|B|0.5|20170101|ABCD Need to load only first,second,third and fourth
column; Table has three column, third column should be defaulted with some
value. Table structure: A B C-numeric D-date E-(Need to add default value)
*/
LOAD CSV
FROM inline
(
alphanm,
alphnumnn,
nmrc,
dte [date format 'YYYYMMDD'],
other
)
INTO postgresql:///pgloader?so.raja
(
alphanm,
alphnumnn,
nmrc,
dte,
col text using "constant value"
)
WITH truncate,
fields optionally enclosed by '"',
fields escaped by double-quote,
fields terminated by '|'
SET work_mem to '12MB',
standard_conforming_strings to 'on'
BEFORE LOAD DO
$$ drop table if exists so.raja; $$,
$$ create table so.raja (
alphanm text,
alphnumnn text,
nmrc numeric,
dte date,
col text
);
$$;
raj|B|0.5|20170101|ABCD
Now here's the extract from running the pgloader command:
$ pgloader 41287414.load
2017-08-15T12:35:10.258000+02:00 LOG Main logs in '/private/tmp/pgloader/pgloader.log'
2017-08-15T12:35:10.261000+02:00 LOG Data errors in '/private/tmp/pgloader/'
2017-08-15T12:35:10.261000+02:00 LOG Parsing commands from file #P"/Users/dim/dev/temp/pgloader-issues/stackoverflow/41287414.load"
2017-08-15T12:35:10.422000+02:00 LOG report summary reset
table name read imported errors total time
----------------------- --------- --------- --------- --------------
fetch 0 0 0 0.007s
before load 2 2 0 0.016s
----------------------- --------- --------- --------- --------------
so.raja 1 1 0 0.019s
----------------------- --------- --------- --------- --------------
Files Processed 1 1 0 0.021s
COPY Threads Completion 2 2 0 0.038s
----------------------- --------- --------- --------- --------------
Total import time 1 1 0 0.426s
And here's the content of the target table when the command is done:
$ psql -q -d pgloader -c 'table so.raja'
alphanm │ alphnumnn │ nmrc │ dte │ col
═════════╪═══════════╪══════╪════════════╪════════════════
raj │ B │ 0.5 │ 2017-01-01 │ constant value
(1 row)
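For the trim and NULL-defaulting requirements, pgloader also accepts per-field options in the source field list; here is a sketch of how the field list above could be extended, assuming your pgloader version supports the null if blanks and trim whitespace field options (check the reference manual for your release):
(
alphanm [null if blanks, trim both whitespace],
alphnumnn [null if blanks],
nmrc,
dte [date format 'YYYYMMDD'],
other
)
Regex-style checks and more involved defaulting are typically handled either with a using expression on the target column (as with the constant value above) or in an AFTER LOAD DO block of plain SQL.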

postgresql + textsearch + german umlauts + UTF8

I'm really at my wits' end with this problem, and I really hope someone can help me. I am using PostgreSQL 9.3. My database contains mostly German texts, but not only, so it's encoded in UTF-8. I want to set up a full text search that supports the German language - nothing special so far.
But the search is behaving really strangely, and I can't find out what I am doing wrong.
So, take the following table as an example:
select * from test;
a
-------------
ein Baum
viele Bäume
Überleben
Tisch
Tische
Café
\d test
Tabelle »public.test«
Spalte | Typ | Attribute
--------+------+-----------
a | text |
sintext=# \d
Liste der Relationen
Schema | Name | Typ | Eigentümer
--------+---------------------+---------+------------
(...)
public | test | Tabelle | paf
Now, lets have a look at some textsearch examples:
select * from test where to_tsvector('german', a) @@ plainto_tsquery('Baum');
a
-------------
ein Baum
viele Bäume
select * from test where to_tsvector('german', a) @@ plainto_tsquery('Bäume');
--> No Hits
select * from test where to_tsvector('german', a) @@ plainto_tsquery('Überleben');
--> No Hits
select * from test where to_tsvector('german', a) @@ plainto_tsquery('Tisch');
a
--------
Tisch
Tische
For reference, Tische is the plural of Tisch (table) and Bäume is the plural of Baum (tree). So obviously umlauts do not work, while the text search itself performs well.
But what really confuses me is that a) non-German special characters do match:
select * from test where to_tsvector('german', a) @@ plainto_tsquery('Café');
a
------
Café
and b) if I don't use the German dictionary, there is no problem with umlauts (but of course no real text search either):
select * from test where to_tsvector(a) @@ plainto_tsquery('Bäume');
a
-------------
viele Bäume
So if I use the German dictionary for text search, it's exactly the German special characters that don't work? Seriously? What the hell is wrong here? I really can't figure it out - please help!
You're explicitly using the German dictionary for the to_tsvector calls, but not for the to_tsquery or plainto_tsquery calls. Presumably your default dictionary isn't set to german; check with SHOW default_text_search_config.
Compare:
regress=> select plainto_tsquery('simple', 'Bäume'),
plainto_tsquery('english','Bäume'),
plainto_tsquery('german', 'Bäume');
plainto_tsquery | plainto_tsquery | plainto_tsquery
-----------------+-----------------+-----------------
'bäume' | 'bäume' | 'baum'
(1 row)
The language setting affects word simplification and root extraction, so a vector from one language won't necessarily match a query from another:
regress=> SELECT to_tsvector('german', 'viele Bäume'), plainto_tsquery('Bäume'),
to_tsvector('german', 'viele Bäume') @@ plainto_tsquery('Bäume');
to_tsvector | plainto_tsquery | ?column?
-------------------+-----------------+----------
'baum':2 'viel':1 | 'bäume' | f
(1 row)
If you use a consistent language setting, all is well:
regress=> SELECT to_tsvector('german', 'viele Bäume'), plainto_tsquery('german', 'Bäume'),
to_tsvector('german', 'viele Bäume') @@ plainto_tsquery('german', 'Bäume');
to_tsvector | plainto_tsquery | ?column?
-------------------+-----------------+----------
'baum':2 'viel':1 | 'baum' | t
(1 row)
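If you want the short plainto_tsquery('Bäume') form to use the German stemmer as well, set the default configuration (shown per session here; it can also be set per database or in postgresql.conf):
regress=> SET default_text_search_config = 'german';
SET
regress=> SELECT plainto_tsquery('Bäume');
 plainto_tsquery
-----------------
 'baum'
(1 row)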