postgresql crosstab simple example

I have a key-value based table where each key-value pair is assigned to an entity identified by an id:
|_id__|_key_______|_value_|
| 123 | FIRSTNAME | John |
| 123 | LASTNAME | Doe |
And I want to transform it into a structure like this:
|_id__|_firstName_|_lastName_|
| 123 | John | Doe |
I suppose one can use Postgres' built-in crosstab function to do it.
Can you show me how to do it and explain why it works?

First of all, activate the built-in tablefunc extension:
CREATE EXTENSION tablefunc;
Then create the table and add sample data:
CREATE TABLE example (
id int,
key text,
value text
);
INSERT INTO example VALUES
(123, 'firstName', 'John'),
(123, 'lastName', 'Doe');
Now let's prepare the crosstab statement:
SELECT *
FROM example
ORDER BY id ASC, key ASC;
It's important to have the ORDER BY here.
Result:
|_id__|_key_______|_value_|
| 123 | firstName | John  |
| 123 | lastName  | Doe   |
Solution
Now crosstab creates the table as we want:
SELECT *
FROM crosstab(
'SELECT *
FROM example
ORDER BY id ASC, key ASC;'
) AS ct(id INT, firstname TEXT, lastname TEXT);
Result:
|_id__|_firstName_|_lastName_|
| 123 | John | Doe |
How it works #1
However, to understand how it works, I found it easiest to just change the ORDER BY and see what happens:
SELECT *
FROM crosstab(
'SELECT *
FROM example
ORDER BY id ASC, key DESC;'
) AS ct(id INT, firstname TEXT, lastname TEXT);
Result:
|_id__|_firstName_|_lastName_|
| 123 | Doe | John |
Because we changed the sort order of the key, the crosstab function sees the keys in the opposite order and therefore fills the generated columns in reverse.
How it works #2
Another thing that helped me understand how it works: the column definition is all about positions:
SELECT *
FROM crosstab(
'SELECT *
FROM example
ORDER BY id ASC, key ASC;'
) AS ct(blablafirst INT, blablasecond TEXT, blablathird TEXT);
Result:
|_blablafirst__|_blablasecond_|_blablathird_|
| 123 | John | Doe |
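To see how crosstab groups rows, it also helps to add a second entity and run the same statement again; the rows for id 456 below are just made-up extra sample data:
INSERT INTO example VALUES
(456, 'firstName', 'Jane'),
(456, 'lastName', 'Roe');

SELECT *
FROM crosstab(
'SELECT *
FROM example
ORDER BY id ASC, key ASC;'
) AS ct(id INT, firstname TEXT, lastname TEXT);
Result: crosstab starts a new output row whenever the value of the first column (the row identifier) changes:
|_id__|_firstname_|_lastname_|
| 123 | John      | Doe      |
| 456 | Jane      | Roe      |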

Related

create Postgres view to combine multiple tables' data

I have multiple tables with different column names like below,
Customer Table: c_id, customer_name, phone_number, postal_code
Employee Table: employee_name, e_id, mobile_no, zip_code
Student Table: s_id, student_name, post_code
I need to create a view that combines those tables like below:
User Table: userId (c_id, e_id, s_id), userName (all name columns), zip_code, contact
Sample Data
Customer Table:
| c_id | Company_Name | Phone   |
|------|--------------|---------|
| 1    | Company 1    | ******* |

Employee Table:
| e_id | Employee_Name | Mobile  |
|------|---------------|---------|
| 1    | employee 1    | ******* |

Expected View:
| userId | user_Name  | contact |
|--------|------------|---------|
| c_1    | Company 1  | ******* |
| e_1    | employee 1 | ******* |
How do I create this view? Please help me, thanks in advance.
Just use UNION or UNION ALL, depending on whether duplicate rows should be removed. Based on the sample data:
CREATE VIEW view_name
AS
SELECT concat('c_',c_id) AS userId,
company_name AS user_Name,
phone AS contact
FROM customer
UNION
SELECT concat('e_',e_id),
employee_name,
mobile
FROM employee
dbfiddle
Based on the table definitions:
CREATE VIEW view_name
AS
SELECT c_id AS userId,
customer_name AS userName,
postal_code AS zip_code,
phone_number AS contact
FROM customer
UNION
SELECT e_id,
employee_name,
zip_code,
mobile_no
FROM employee
UNION
SELECT s_id,
student_name,
post_code,
NULL
FROM student
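To get the requested User Table shape (the prefixed userId together with zip_code and contact) in a single view, the two variants above can be combined. This is only a sketch assuming the column names listed in the question; user_view is just an example name:
CREATE VIEW user_view
AS
SELECT concat('c_', c_id) AS userId,
customer_name AS userName,
postal_code AS zip_code,
phone_number AS contact
FROM customer
UNION ALL
SELECT concat('e_', e_id),
employee_name,
zip_code,
mobile_no
FROM employee
UNION ALL
SELECT concat('s_', s_id),
student_name,
post_code,
NULL
FROM student;
UNION ALL keeps every row; switch to UNION if duplicates should be removed.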

order by case alphabetical ordering

The question is not too specific, but I'm not sure how to explain it well.
I have a table in my db with a names field.
I would like to order the names so that if a name starts with a certain letter, it is ordered first, and so on.
What I have now is
SELECT
(T.firstname||' '||T.lastname) as Full_Name
FROM
TABLE T
ORDER BY
CASE
WHEN LPAD(T.firstname, 1) = 'J' THEN T.firstname
WHEN LPAD(T.firstname, 1) = 'B' THEN T.firstname
END DESC,
Full_Name ASC
Now this returns roughly what I would like to see: names starting with 'J' are ordered first, then 'B', then the rest.
However, the result looks like
What I get      What I want
Full_Name       Full_Name
----------      ----------
Junior MR       James A
John Doe        Joe Bob
Joe Bob         John Doe
James A         Junior MR
Brad T          B Test
Bob Joe         Bb Test
Bb Test         Bob Joe
B Test          Brad T
A Test          A Test
Aa Test         Aa Test
AFLKJASDFJ      AFLKJASDFJ
Ann Doe         Ann Doe
But I want the J and B names to be sorted in alphabetical order as well; right now they are in reverse alphabetical order.
How can I specify the order inside of case?
I tried having 2 separate CASE statements, one for names starting with 'J' and one for 'B', but it shows me the same result.
Just make one extra column, either materialized (kept up to date by triggers) or volatile (a computed expression evaluated only when the SELECT runs), and then use it for sorting.
For the secondary sorting use the original components of the names, not the expression that glues both names together and thus destroys the information about which part was which.
Examples: https://dbfiddle.uk/?rdbms=firebird_3.0&fiddle=fbf89b3903d3271ae6c55589fd9cfe23
create table T (
firstname varchar(10),
lastname varchar(10),
fullname computed by (
Coalesce(firstname, '-') || ' ' || Coalesce(T.lastname, '-')
),
sorting_helper computed by (
CASE WHEN firstname starting with 'J' then 100
WHEN firstname starting with 'B' then 50
ELSE 0
END
)
)
Notice the important distinction: my helper expression is a "ranking" one. It yields one of several pre-defined ranks, putting "James" and "Joe" into the same bin with exactly the same ranking value. Your expression still yields the names themselves, thus erroneously keeping a difference between those names. But you do NOT want that difference: you said you want all J-names moved upwards and then sorted among themselves by the usual rules. So just do what you say: make an expression that pulls all J-names together WITHOUT distinguishing between them.
insert into T
select
'John', 'Doe'
from rdb$database union all select
'James', 'A'
from rdb$database union all select
'Aa ', 'Test'
from rdb$database union all select
'Ann', 'Doe'
from rdb$database union all select
'Bob', 'Joe'
from rdb$database union all select
'Brad', 'Test'
from rdb$database union all select
NULL, 'Smith'
from rdb$database union all select
'Ken', NULL
from rdb$database
8 rows affected
select * from T
FIRSTNAME | LASTNAME | FULLNAME | SORTING_HELPER
:-------- | :------- | :---------- | -------------:
John | Doe | John Doe | 100
James | A | James A | 100
Aa | Test | Aa Test | 0
Ann | Doe | Ann Doe | 0
Bob | Joe | Bob Joe | 50
Brad | Test | Brad Test | 50
null | Smith | - Smith | 0
Ken | null | Ken - | 0
Select FullName from T order by sorting_helper desc, firstname asc, lastname asc
| FULLNAME |
| :---------- |
| James A |
| John Doe |
| Bob Joe |
| Brad Test |
| - Smith |
| Aa Test |
| Ann Doe |
| Ken - |
Or without a computed-by column:
Select FullName from T order by (CASE WHEN firstname starting with 'J' then 0
WHEN firstname starting with 'B' then 1
ELSE 2
END) asc, firstname asc, lastname asc
| FULLNAME |
| :---------- |
| James A |
| John Doe |
| Bob Joe |
| Brad Test |
| - Smith |
| Aa Test |
| Ann Doe |
| Ken - |
For extra tuning of the positioning of the rows lacking name or surname you can also use NULLS FIRST or NULLS LAST option as described in Firebird docs at https://firebirdsql.org/file/documentation/reference_manuals/user_manuals/html/nullguide-sorts.html
The problem with this approach, however, on big enough tables, is that you won't be able to use indices built over names and surnames for sorting; instead you have to resort to un-sorted reading of the data (shown as NATURAL in the query PLAN) followed by sorting it into temporary files on disk, which can become very slow and volume-consuming on large enough data.
You can try to make it better by creating an "index by expression" using your ranking expression, and hope that the FB optimizer will use it (which is quite tricky with verbose expressions like CASE). Frankly, you would probably still be left without it (at least I did not manage to make FB 2.1 utilize an index-by-CASE-expression here).
You can "materialize" the ranking expression into a regular SMALLINT NOT NULL column instead of a COMPUTED BY one, and use a BEFORE INSERT OR UPDATE trigger to keep that column populated with proper data. Then you can create a regular index over that regular column; while it adds two bytes to each row, that is not much growth. A sketch of this variant is shown after the next paragraph.
But even then, an index with very few distinct values does not add much benefit; it has "low selectivity". Also, an index by expression cannot be a compound one (meaning it cannot include other columns past the expression).
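A minimal sketch of that materialized variant (the column, trigger and index names are just examples, not from the original answer):
ALTER TABLE T ADD sorting_rank SMALLINT DEFAULT 0 NOT NULL;

SET TERM ^ ;
CREATE TRIGGER t_set_sorting_rank FOR T
ACTIVE BEFORE INSERT OR UPDATE POSITION 0
AS
BEGIN
  -- keep the materialized rank in sync with firstname
  NEW.sorting_rank = CASE
      WHEN NEW.firstname STARTING WITH 'J' THEN 0
      WHEN NEW.firstname STARTING WITH 'B' THEN 1
      ELSE 2
    END;
END^
SET TERM ; ^

CREATE INDEX i_sorting_rank ON T (sorting_rank);
The query then becomes ORDER BY sorting_rank ASC, firstname ASC, lastname ASC instead of repeating the CASE expression.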
So for large data you are practically better off using THREE different queries fused together. Add the scaffolding, if you have not already:
create index i58647579_names on T58647579 ( firstname, lastname )
Then you can do triple-select like this:
WITH S1 as (
select FullName from T58647579
where firstname starting with 'J'
order by firstname asc, lastname asc
), S2 as (
select FullName from T58647579
where firstname starting with 'B'
order by firstname asc, lastname asc
), S3 as (
select FullName from T58647579
where (firstname is null)
or ( (firstname not starting with 'J')
and (firstname not starting with 'B')
)
order by firstname asc, lastname asc
)
SELECT * FROM S1
UNION ALL
SELECT * FROM S2
UNION ALL
SELECT * FROM S3
And while you would traverse the table thrice - you would do it by pre-sorted index:
PLAN (S1 T58647579 ORDER I58647579_NAMES INDEX (I58647579_NAMES))
PLAN (S2 T58647579 ORDER I58647579_NAMES INDEX (I58647579_NAMES))
PLAN (S3 T58647579 ORDER I58647579_NAMES)

How to return comma separated values of a column without having to loop through the result set

Let's say I have these 2 tables
+----+---------+ +----+-----------+----------------+
| Id | Country | | Id | CountryId | City |
+----+---------+ +----+-----------+----------------+
| 1 | USA | | 1 | 1 | Washington, DC |
+----+---------+ +----+-----------+----------------+
| 2 | Canada | | 2 | 2 | Ottawa |
+----+---------+ +----+-----------+----------------+
| 3 | 1 | New York |
+----+-----------+----------------+
| 4 | 1 | Baltimore |
+----+-----------+----------------+
I need to produce a result like:
Id | Country | Cities
---+---------+--------------------------------------
1 | USA | Washington, DC, New York, Baltimore
---+------------------------------------------------
2 | Canada | Ottawa
So far, I am looping through the left side table result like this:
DECLARE @table TABLE
(
    Id INT IDENTITY(1, 1),
    CountryId INT,
    City VARCHAR(50)
)
DECLARE @tableString TABLE
(
    Id INT IDENTITY(1, 1),
    CountryId INT,
    Cities VARCHAR(100)
)
INSERT INTO @table
SELECT Id, City
FROM tblCountries
DECLARE @city VARCHAR(50)
DECLARE @id INT
DECLARE @count INT
DECLARE @i INT = 1
SELECT @count = COUNT(*) FROM @table
WHILE (@i <= @count)
BEGIN
    SELECT @id = CountryId, @city = City FROM @table WHERE Id = @i
    IF (EXISTS(SELECT * FROM @tableString WHERE CountryId = @id))
    BEGIN
        UPDATE @tableString SET Cities = Cities + ', ' + @city WHERE CountryId = @id
    END
    ELSE
    BEGIN
        INSERT INTO @tableString (CountryId, Cities) VALUES (@id, @city)
    END
    SET @i = @i + 1
END
SELECT tc.Id, tc.Country, ts.Cities
FROM tblCountries tc
LEFT JOIN @tableString ts
    ON tc.Id = ts.CountryId
My concern is that with all that looping in T-SQL, it may be a performance killer. Even with few rows, it appears to be slow. Is there a better way to concatenate those strings without having to loop through the data set as if I were working in C#?
Thanks for helping
This has been answered many times, but I've got the feeling that some explanation might help you...
... am I missing something? It seems like this is related to XML
The needed functionality STRING_AGG() was introduced with SQL-Server 2017. The other direction STRING_SPLIT() came with v2016.
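On SQL Server 2017 or newer the whole task can therefore be a single grouped query; this is only a sketch, and it assumes the cities live in a hypothetical tblCities(Id, CountryId, City) table, which the question does not name:
SELECT c.Id,
       c.Country,
       STRING_AGG(ci.City, ', ') AS Cities
FROM tblCountries c
LEFT JOIN tblCities ci ON ci.CountryId = c.Id
GROUP BY c.Id, c.Country;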
But many people still use older versions (and will do so for years), so we need workarounds. There were approaches with loops, bad and slow... You might also use recursive CTEs. And - that's the point here! - we can use some abilities of XML to solve this.
Try this out:
DECLARE @xml XML=
N'<root>
<element>text1</element>
<element>text2</element>
<element>text3</element>
</root>';
--The query will return the first <element> below <root>: text1.
SELECT @xml.value(N'(/root/element)[1]','nvarchar(max)');
--But now try this:
SELECT @xml.value(N'(/root)[1]','nvarchar(max)')
The result is text1text2text3.
The reason for this: if you call .value() on an element without specifying exactly which part you want to read, you get back the concatenated text content of the element and all its children.
Now imagine an XML like this
DECLARE @xml2 XML=
N'<root>
<element>, text1</element>
<element>, text2</element>
<element>, text3</element>
</root>';
With the same query as above you'd get ", text1, text2, text3". The only thing left is to cut off the leading comma and the space. This is done - in most examples - with STUFF().
So the challenge is to create this XML. And this is what you find in the linked examples.
A general example is this: Read all tables and list their columns as a CSV-list:
SELECT TOP 10
TABLE_NAME
,STUFF(
(SELECT ',' + c.COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS c
WHERE c.TABLE_SCHEMA=t.TABLE_SCHEMA AND c.TABLE_NAME=t.TABLE_NAME
ORDER BY c.COLUMN_NAME
FOR XML PATH('')
),1,1,'') AS AllTableColumns
FROM INFORMATION_SCHEMA.TABLES t
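Applied to the tables from the question (again assuming the hypothetical tblCities(Id, CountryId, City), since the city table is not named there), the pre-2017 version looks roughly like this:
SELECT c.Id,
       c.Country,
       STUFF(
           (SELECT ', ' + ci.City
            FROM tblCities ci
            WHERE ci.CountryId = c.Id
            ORDER BY ci.Id
            FOR XML PATH('')
           ), 1, 2, '') AS Cities
FROM tblCountries c;
Here STUFF(..., 1, 2, '') cuts off the leading comma and space, exactly as described above.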

Fetch records with distinct value of one column while replacing another column's value when there are multiple records

I have 2 tables that I need to join on distinct rid, replacing the code column's value when a user has multiple rows with different values. It's better explained with the example set below.
CREATE TABLE usr (rid INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(12) NOT NULL,
email VARCHAR(20) NOT NULL);
CREATE TABLE usr_loc
(rid INT NOT NULL,
code CHAR NOT NULL,
loc_id INT NOT NULL,
PRIMARY KEY (rid, code, loc_id));
INSERT INTO usr VALUES
(1,'John','john#product'),
(2,'Linda','linda#product'),
(3,'Greg','greg#product'),
(4,'Kate','kate#product'),
(5,'Johny','johny#product'),
(6,'Mary','mary#test');
INSERT INTO usr_loc VALUES
(1,'A',4532),
(1,'I',4538),
(1,'I',4545),
(2,'I',3123),
(3,'A',4512),
(3,'A',4527),
(4,'I',4567),
(4,'A',4565),
(5,'I',4512),
(6,'I',4567),
(6,'I',4569);
Required Result Set
+-----+-------+------+-----------------+
| rid | name | Code | email |
+-----+-------+------+-----------------+
| 1 | John | B | 'john#product' |
| 2 | Linda | I | 'linda#product' |
| 3 | Greg | A | 'greg#product' |
| 4 | Kate | B | 'kate#product' |
| 5 | Johny | I | 'johny#product' |
| 6 | Mary | I | 'mary#test' |
+-----+-------+------+-----------------+
I have tried some queries to join and some to count, but I'm lost on one that exactly satisfies the whole scenario.
The query I came up with is
SELECT DISTINCT a.rid AS rid, a.name, a.email, 'B' AS code
FROM usr a
JOIN usr_loc b ON a.rid = b.rid
WHERE a.rid IN (SELECT rid FROM usr_loc GROUP BY rid HAVING COUNT(*) > 1);
You need to group by the users and count how many distinct codes each one has in usr_loc. If there is more than a single one, replace the code with B. See below:
select
rid,
name,
case when cnt > 1 then 'B' else min_code end as code,
email
from (
select u.rid, u.name, u.email, min(l.code) as min_code,
count(distinct l.code) as cnt
from usr u
join usr_loc l on l.rid = u.rid
group by u.rid, u.name, u.email
) x;
Seems to me that you are using MySQL, rather than IBM DB2. Is that so?

Postgres: rows to columns

I have a table which contains street names in several languages:
streetName(addressId uuid, languageCode text, name text);
with values:
addressid |languagecode |name |
-------------------------------------|-------------|----------------|
e5c8c25c-f21e-47df-9172-7f3c7e52d669 |cz |streetName_1_cz |
e5c8c25c-f21e-47df-9172-7f3c7e52d669 |en |streetName_1_en |
e5c8c25c-f21e-47df-9172-7f3c7e52d669 |fi |streetName_1_fi |
e5c8c25c-f21e-47df-9172-7f3c7e52d669 |sv |streetName_1_sv |
bff096cc-4d4d-4b2e-aac2-bdc6ab659a72 |fi |streetName_2_fi |
bff096cc-4d4d-4b2e-aac2-bdc6ab659a72 |cz |streetName_2_cz |
and I need to transform the street names in cz, fi, en into columns
(exactly those three languages, even if there are more languages in the table; and it can happen that the value for one of those three languages is missing).
So the expected result is:
addressid |streetNameCz |streetNameEn |streetNameFi |
-------------------------------------|----------------|----------------|----------------|
e5c8c25c-f21e-47df-9172-7f3c7e52d669 |streetName_1_cz |streetName_1_en |streetName_1_fi |
bff096cc-4d4d-4b2e-aac2-bdc6ab659a72 |streetName_2_cz | |streetName_2_fi |
How should I do it?
I tried to use crosstab, but it didn't work correctly because there are missing values for some languages,
so I had a result like:
addressid |streetNameCz |streetNameEn |streetNameFi |
-------------------------------------|----------------|----------------|----------------|
e5c8c25c-f21e-47df-9172-7f3c7e52d669 |streetName_1_cz |streetName_1_en |streetName_1_fi |
bff096cc-4d4d-4b2e-aac2-bdc6ab659a72 |streetName_2_cz |streetName_2_fi | |
which is not correct :-(.
This is the select I used:
SELECT *
FROM crosstab(
'select
"addressid"::uuid as rowid,
languagecode::text as attribute,
name::text as value
from streetName
where languageCode in (''cz'', ''en'', ''fi'')
order by 1, 2')
AS ct(row_name uuid, "streetNameCz" text, "streetNameEn" text, "streetNameFi" text);
Thanks for any advice.
Lange.
If you don't want to use crosstab you can simply do conditional aggregation:
SELECT addressid,
MAX( CASE WHEN languagecode = 'cz' THEN name END ) AS "streetNameCz",
MAX( CASE WHEN languagecode = 'en' THEN name END ) AS "streetNameEn",
MAX( CASE WHEN languagecode = 'fi' THEN name END ) AS "streetNameFi"
FROM streetName
GROUP BY addressid;
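For completeness, crosstab itself can also cope with the missing values if you use the two-argument form, where the second query pins the expected categories; a sketch against the streetName table from the question:
SELECT *
FROM crosstab(
  $$SELECT addressid, languagecode, name
    FROM streetName
    WHERE languagecode IN ('cz', 'en', 'fi')
    ORDER BY 1, 2$$,
  $$VALUES ('cz'), ('en'), ('fi')$$  -- fixed category list: a missing language simply yields NULL
) AS ct(addressid uuid, "streetNameCz" text, "streetNameEn" text, "streetNameFi" text);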