Postgres: rows to columns - postgresql

I have a table which contains street names in several languages:
streetName(addressId uuid, languageCode text, name text);
with values:
addressid |languagecode |name |
-------------------------------------|-------------|----------------|
e5c8c25c-f21e-47df-9172-7f3c7e52d669 |cz |streetName_1_cz |
e5c8c25c-f21e-47df-9172-7f3c7e52d669 |en |streetName_1_en |
e5c8c25c-f21e-47df-9172-7f3c7e52d669 |fi |streetName_1_fi |
e5c8c25c-f21e-47df-9172-7f3c7e52d669 |sv |streetName_1_sv |
bff096cc-4d4d-4b2e-aac2-bdc6ab659a72 |fi |streetName_2_fi |
bff096cc-4d4d-4b2e-aac2-bdc6ab659a72 |cz |streetName_2_cz |
I need to transform the street names in cz, en, fi into columns
(exactly those three languages, even if there are more languages in the table; it can happen that the value for one of those three languages is missing).
so expected result is:
addressid |streetNameCz |streetNameEn |streetNameFi |
-------------------------------------|----------------|----------------|----------------|
e5c8c25c-f21e-47df-9172-7f3c7e52d669 |streetName_1_cz |streetName_1_en |streetName_1_fi |
bff096cc-4d4d-4b2e-aac2-bdc6ab659a72 |streetName_2_cz | |streetName_2_fi |
How should I do it?
I tried to use crosstab, but it didn't work correctly because there are missing values for some languages,
so I got a result like:
addressid |streetNameCz |streetNameEn |streetNameFi |
-------------------------------------|----------------|----------------|----------------|
e5c8c25c-f21e-47df-9172-7f3c7e52d669 |streetName_1_cz |streetName_1_en |streetName_1_fi |
bff096cc-4d4d-4b2e-aac2-bdc6ab659a72 |streetName_2_cz |streetName_2_fi | |
which is not correct :-(.
This is the select I used:
SELECT *
FROM crosstab(
    'select
         "addressid"::uuid as rowid,
         languagecode::text as attribute,
         name::text as value
     from streetName
     where languageCode in (''cz'', ''en'', ''fi'')
     order by 1, 2')
AS ct(row_name uuid, "streetNameCz" text, "streetNameEn" text, "streetNameFi" text);
Thanks for any advice.
Lange.

If you don't want to use crosstab, you can simply do conditional aggregation:
SELECT addressid,
       MAX(CASE WHEN languagecode = 'cz' THEN name END) AS lng_cz,
       MAX(CASE WHEN languagecode = 'en' THEN name END) AS lng_en,
       MAX(CASE WHEN languagecode = 'fi' THEN name END) AS lng_fi
FROM YourTable
GROUP BY addressid;
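The crosstab attempt itself can also be repaired. The one-argument form assigns values to output columns purely by position, so when a language is missing the remaining values shift left. The two-argument form takes a second query that lists the categories, pinning each language to its column and filling gaps with NULL. A sketch, assuming the tablefunc extension is installed:

```sql
-- Two-argument crosstab: the second query fixes the category list,
-- so a missing language yields NULL instead of shifting columns.
SELECT *
FROM crosstab(
    $$SELECT addressid, languagecode, name
      FROM streetName
      WHERE languagecode IN ('cz', 'en', 'fi')
      ORDER BY 1$$,
    $$VALUES ('cz'), ('en'), ('fi')$$
) AS ct(addressid uuid, "streetNameCz" text, "streetNameEn" text, "streetNameFi" text);
```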

Related

How to return a comma separated values of column without having to loop through the result set

Let's say I have these 2 tables:
+----+---------+ +----+-----------+----------------+
| Id | Country | | Id | CountryId | City |
+----+---------+ +----+-----------+----------------+
| 1 | USA | | 1 | 1 | Washington, DC |
+----+---------+ +----+-----------+----------------+
| 2 | Canada | | 2 | 2 | Ottawa |
+----+---------+ +----+-----------+----------------+
| 3 | 1 | New York |
+----+-----------+----------------+
| 4 | 1 | Baltimore |
+----+-----------+----------------+
I need to produce a result like:
Id | Country | Cities
---+---------+--------------------------------------
1 | USA | Washington, DC, New York, Baltimore
---+------------------------------------------------
2 | Canada | Ottawa
So far, I am looping through the left side table result like this:
DECLARE @table TABLE
(
    Id INT IDENTITY(1, 1),
    CountryId INT,
    City VARCHAR(50)
)
DECLARE @tableString TABLE
(
    Id INT IDENTITY(1, 1),
    CountryId INT,
    Cities VARCHAR(100)
)
INSERT INTO @table
SELECT Id, City
FROM tblCountries
DECLARE @city VARCHAR(50)
DECLARE @id INT
DECLARE @count INT
DECLARE @i INT = 1
SELECT @count = COUNT(*) FROM @table
WHILE (@i <= @count)
BEGIN
    SELECT @id = CountryId, @city = City FROM @table WHERE Id = @i
    IF (EXISTS (SELECT * FROM @tableString WHERE CountryId = @id))
    BEGIN
        UPDATE @tableString SET Cities = Cities + ', ' + @city WHERE CountryId = @id
    END
    ELSE
    BEGIN
        INSERT INTO @tableString (CountryId, Cities) VALUES (@id, @city)
    END
    SET @i = @i + 1
END
SELECT tc.Id, tc.Country, ts.Cities
FROM tblCountries tc
LEFT JOIN @tableString ts
    ON tc.Id = ts.CountryId
My concern is that with all that looping in T-SQL, it may be a performance killer. Even with few rows, it appears to be slow. Is there a better way to concatenate those strings without having to loop through the data set as if I was working in C#?
Thanks for helping
This was answered many times, but I've got the feeling that some explanation might help you...
... am I missing something? It seems like this is related to XML
The needed functionality STRING_AGG() was introduced with SQL Server 2017. The other direction, STRING_SPLIT(), came with v2016.
But many people still use older versions (and will for years), so we need workarounds. There were approaches with loops: bad and slow. You might use recursive CTEs. And, that's the point here, we can use some abilities of XML to solve this.
Try this out:
DECLARE #xml XML=
N'<root>
<element>text1</element>
<element>text2</element>
<element>text3</element>
</root>';
--This query will pick the first <element> below <root> and return text1.
SELECT #xml.value(N'(/root/element)[1]','nvarchar(max)');
--But now try this:
SELECT #xml.value(N'(/root)[1]','nvarchar(max)')
The result is text1text2text3.
The reason for this: if you call .value() on an element without specifying in detail what you want to read, you'll get back the concatenated text of the whole element. Find details here.
Now imagine an XML like this
DECLARE #xml2 XML=
N'<root>
<element>, text1</element>
<element>, text2</element>
<element>, text3</element>
</root>';
With the same query as above you'd get , text1, text2, text3. The only thing left is to cut off the leading comma and the space. This is done - in most examples - with STUFF().
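STUFF(input, start, length, replacement) deletes length characters starting at position start and inserts the replacement there, so trimming the two-character separator is just:

```sql
-- Remove the leading ', ' (2 characters starting at position 1)
SELECT STUFF(', text1, text2, text3', 1, 2, '');
-- Result: text1, text2, text3
```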
So the challenge is to create this XML. And this is what you find in the linked examples.
A general example is this: read all tables and list their columns as a CSV list:
SELECT TOP 10
TABLE_NAME
,STUFF(
(SELECT ',' + c.COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS c
WHERE c.TABLE_SCHEMA=t.TABLE_SCHEMA AND c.TABLE_NAME=t.TABLE_NAME
ORDER BY c.COLUMN_NAME
FOR XML PATH('')
),1,1,'') AS AllTableColumns
FROM INFORMATION_SCHEMA.TABLES t
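On SQL Server 2017 and later, the same list can be built without the XML detour, using the STRING_AGG() mentioned above. A sketch of the equivalent query:

```sql
SELECT TOP 10
       t.TABLE_NAME,
       STRING_AGG(c.COLUMN_NAME, ',') WITHIN GROUP (ORDER BY c.COLUMN_NAME) AS AllTableColumns
FROM INFORMATION_SCHEMA.TABLES t
JOIN INFORMATION_SCHEMA.COLUMNS c
    ON c.TABLE_SCHEMA = t.TABLE_SCHEMA
   AND c.TABLE_NAME   = t.TABLE_NAME
GROUP BY t.TABLE_NAME;
```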

Fetch records with distinct value of one column while replacing another col's value when multiple records

I have 2 tables that I need to join on rid, keeping one row per distinct rid and replacing the code value when a user has multiple rows with different values. It is better explained with the example set below.
CREATE TABLE usr (rid INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(12) NOT NULL,
email VARCHAR(20) NOT NULL);
CREATE TABLE usr_loc
(rid INT NOT NULL,
code CHAR NOT NULL,
loc_id INT NOT NULL,
PRIMARY KEY (rid, code, loc_id));
INSERT INTO usr VALUES
(1,'John','john#product'),
(2,'Linda','linda#product'),
(3,'Greg','greg#product'),
(4,'Kate','kate#product'),
(5,'Johny','johny#product'),
(6,'Mary','mary#test');
INSERT INTO usr_loc VALUES
(1,'A',4532),
(1,'I',4538),
(1,'I',4545),
(2,'I',3123),
(3,'A',4512),
(3,'A',4527),
(4,'I',4567),
(4,'A',4565),
(5,'I',4512),
(6,'I',4567),
(6,'I',4569);
Required Result Set
+-----+-------+------+-----------------+
| rid | name | Code | email |
+-----+-------+------+-----------------+
| 1 | John | B | 'john#product' |
| 2 | Linda | I | 'linda#product' |
| 3 | Greg | A | 'greg#product' |
| 4 | Kate | B | 'kate#product' |
| 5 | Johny | I | 'johny#product' |
| 6 | Mary | I | 'mary#test' |
+-----+-------+------+-----------------+
I have tried some queries to join and some to count but lost with the one which exactly satisfies the whole scenario.
The query I came up with is
SELECT DISTINCT a.rid AS rid, a.name, a.email, 'B' AS code
FROM usr a
JOIN usr_loc b ON a.rid = b.rid
WHERE a.rid IN (SELECT rid FROM usr_loc GROUP BY rid HAVING COUNT(*) > 1);
You need to group by the users and count how many occurrences you have in usr_loc. If there is more than a single one, replace the code with B. See below:
select
rid,
name,
case when cnt > 1 then 'B' else min_code end as code,
email
from (
select u.rid, u.name, u.email, min(l.code) as min_code, count(*) as cnt
from usr u
join usr_loc l on l.rid = u.rid
group by u.rid, u.name, u.email
) x;
Seems to me that you are using MySQL, rather than IBM DB2. Is that so?

PostgreSQL create column for each array_agg element instead of comma-separate

I want to 1.) force a new column for each string_agg element returned from this query (i.e., 'Fiction, Mystery' would instead be 'Fiction' in one column and 'Mystery' in the next), and 2.) be able to expand the tag columns up to five tags max:
SELECT books.isbn_13 as "ISBN", title as "Title",
author as "Author",
string_agg(tag_name, ', ') as "Tags"
FROM books
LEFT JOIN book_tags on books.isbn_13 = book_tags.isbn_13
GROUP BY books.isbn_13;
Right now everything looks good, except I would like a column for each tag instead of comma-separated values. Here is my current result:
ISBN | Title | Author | Tags
1111111111111 | The Adventures of Steve | Russell Barron | Fiction, Mystery
2222222222222 | It's all a mystery to me | Mystery Man | Mystery
3333333333333 | Biography of a Programmer | Solo Artist | Biography
4444444444444 | Steve and Russel go to Mars | Russell Groupon |
6666666666666 | Newest Book you Must Have | Newbie Onthescene |
Desired result (separating tags into columns where there is more than one):
ISBN | Title | Author | Tag1 | Tag2 | Tag3 | Tag4
1111111111111 | The Adventures of Steve | Russell Barron | Fiction | Mystery | Male Protagonists | Fantasy|
2222222222222 | It's all a mystery to me | Mystery Man | Mystery
3333333333333 | Biography of a Programmer | Solo Artist | Biography
4444444444444 | Steve and Russel go to Mars | Russell Groupon |
6666666666666 | Newest Book you Must Have | Newbie Onthescene |
SCHEMA for books table (parent):
CREATE TABLE public.books
(
isbn_13 character varying(13) COLLATE pg_catalog."default" NOT NULL,
title character varying(100) COLLATE pg_catalog."default",
author character varying(80) COLLATE pg_catalog."default",
publish_date date,
price numeric(6,2),
content bytea,
CONSTRAINT books_pkey PRIMARY KEY (isbn_13)
)
SCHEMA book_tags table:
CREATE TABLE public.book_tags
(
isbn_13 character varying(13) COLLATE pg_catalog."default" NOT NULL,
tag_name character varying(30) COLLATE pg_catalog."default" NOT NULL,
CONSTRAINT book_tags_pkey PRIMARY KEY (isbn_13, tag_name),
CONSTRAINT book_tags_isbn_13_fkey FOREIGN KEY (isbn_13)
REFERENCES public.books (isbn_13) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE CASCADE
)
I've researched GROUP BY and crosstab/pivot resources for hours with no luck. This seems like it should be a simple thing to do, but I'm a very-beginner and haven't found an answer. Thanks in advance for any guidance.
WITH CTE AS (
    SELECT books.isbn_13 AS "ISBN",
           title AS "Title",
           author AS "Author",
           tag_name AS "Tag",
           row_number() OVER (PARTITION BY books.isbn_13 ORDER BY tag_name) AS rn
    FROM books
    LEFT JOIN book_tags
        ON books.isbn_13 = book_tags.isbn_13
)
SELECT "ISBN", "Title", "Author",
       MAX(CASE WHEN rn = 1 THEN "Tag" END) AS "Tag1",
       MAX(CASE WHEN rn = 2 THEN "Tag" END) AS "Tag2",
       MAX(CASE WHEN rn = 3 THEN "Tag" END) AS "Tag3",
       MAX(CASE WHEN rn = 4 THEN "Tag" END) AS "Tag4",
       MAX(CASE WHEN rn = 5 THEN "Tag" END) AS "Tag5"
FROM CTE
GROUP BY "ISBN", "Title", "Author";
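Since the question started from string_agg, an alternative sketch uses array_agg with array indexing instead of a window function; positions past the number of tags a book has simply come out NULL:

```sql
-- array_agg collects the tags per book in a fixed order;
-- subscripting the array picks out Tag1..Tag5.
SELECT b.isbn_13 AS "ISBN",
       b.title   AS "Title",
       b.author  AS "Author",
       (array_agg(t.tag_name ORDER BY t.tag_name))[1] AS "Tag1",
       (array_agg(t.tag_name ORDER BY t.tag_name))[2] AS "Tag2",
       (array_agg(t.tag_name ORDER BY t.tag_name))[3] AS "Tag3",
       (array_agg(t.tag_name ORDER BY t.tag_name))[4] AS "Tag4",
       (array_agg(t.tag_name ORDER BY t.tag_name))[5] AS "Tag5"
FROM books b
LEFT JOIN book_tags t ON t.isbn_13 = b.isbn_13
GROUP BY b.isbn_13, b.title, b.author;
```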

postgresql crosstab simple example

I got a key-value based table where each key-value pair is assigned to an entity which is identified by an id:
|_id__|_key_______|_value_|
| 123 | FIRSTNAME | John |
| 123 | LASTNAME | Doe |
And I want to transform it to a structure like this:
|_id__|_firstName_|_lastName_|
| 123 | John | Doe |
I suppose one can use Postgres' built-in crosstab function to do it.
Can you show me how to do it and explain why it works?
First of all, activate the built-in tablefunc extension:
CREATE EXTENSION tablefunc;
Then create table and add sample data:
CREATE TABLE example (
id int,
key text,
value text
);
INSERT INTO example VALUES
(123, 'firstName', 'John'),
(123, 'lastName', 'Doe');
Now let's prepare the crosstab statement:
SELECT *
FROM example
ORDER BY id ASC, key ASC;
It's important to have the ORDER BY here.
Result:
|_id__|_key_______|_value_|
| 123 | FIRSTNAME | John |
| 123 | LASTNAME | Doe |
Solution
Now crosstab creates the table as we want:
SELECT *
FROM crosstab(
'SELECT *
FROM example
ORDER BY id ASC, key ASC;'
) AS ct(id INT, firstname TEXT, lastname TEXT);
Result:
|_id__|_firstName_|_lastName_|
| 123 | John | Doe |
How it works #1
However, to understand how it works, I found it easiest to just change the ORDER BY and see what happens:
SELECT *
FROM crosstab(
'SELECT *
FROM example
ORDER BY id ASC, key DESC;'
) AS ct(id INT, firstname TEXT, lastname TEXT);
Result:
|_id__|_firstName_|_lastName_|
| 123 | Doe | John |
As we changed the sorting of the key, the crosstab function sees the keys sorted in the other direction, thus reversing the generated columns.
How it works #2
Another thing that helped me understand how it works: the column definition is all about positions:
SELECT *
FROM crosstab(
'SELECT *
FROM example
ORDER BY id ASC, key ASC;'
) AS ct(blablafirst INT, blablasecond TEXT, blablathird TEXT);
Result
|_blablafirst__|_blablasecond_|_blablathird_|
| 123 | John | Doe |
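One caveat of this positional one-argument form: if a key is missing for some id, the remaining values shift left into the wrong columns. The two-argument form of crosstab takes a second query that enumerates the expected keys, pinning each one to its column. A sketch against the same example table:

```sql
-- The second query lists the keys in column order;
-- a missing key for an id produces NULL in its column.
SELECT *
FROM crosstab(
    'SELECT id, key, value FROM example ORDER BY id',
    $$VALUES ('firstName'), ('lastName')$$
) AS ct(id INT, firstname TEXT, lastname TEXT);
```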

Insert output of query into another table in postgres?

I'm working in Postgres 9.4. I have two tables:
Table "public.parcel"
Column | Type | Modifiers
--------------+------------------------+------------------
ogc_fid | integer | not null default
wkb_geometry | geometry(Polygon,4326) |
county | character varying |
parcel_area | double precision |
Table "public.county"
Column | Type | Modifiers
--------+------------------------+-----------
name | character(1) |
chull | geometry(Polygon,4326) |
area | double precision |
I would like to find all the unique values of county in parcel and the total area of the attached parcels, and then insert them into the county table as name and area respectively.
I know how to do the first half of this:
SELECT county,
SUM(parcel_area) AS area
FROM inspire_parcel
GROUP BY county;
But what I don't know is how to insert these values into county. Can anyone advise?
I think it's something like:
UPDATE county SET name, area = (SELECT county, SUM(parcel_area) AS area
FROM inspire_parcel GROUP BY county)
You use INSERT INTO, with an explicit column list since county also has a chull column (it stays NULL). So, something like this:
INSERT INTO county (name, area)
SELECT county, SUM(parcel_area) AS area
FROM inspire_parcel
GROUP BY county;