Postgres values as columns - postgresql

I am working with PostgreSQL 9.3, and I have this:
PARENT_TABLE
ID | NAME
1 | N_A
2 | N_B
3 | N_C
CHILD_TABLE
ID | PARENT_TABLE_ID | KEY | VALUE
1 | 1 | K_A | V_A
2 | 1 | K_B | V_B
3 | 1 | K_C | V_C
5 | 2 | K_A | V_D
6 | 2 | K_C | V_E
7 | 3 | K_A | V_F
8 | 3 | K_B | V_G
9 | 3 | K_C | V_H
Note that I might add K_D in KEY's, it's completely dynamic.
What I want is a query that returns me the following:
QUERY_TABLE
ID | NAME | K_A | K_B | K_C | others K_...
1 | N_A | V_A | V_B | V_C | ...
2 | N_B | V_D | | V_E | ...
3 | N_C | V_F | V_G | V_H | ...
Is this possible to do ? If so, how ?

Since there can be values missing, you need the "safe" form of crosstab() with the column names as second parameter:
SELECT * FROM crosstab(
'SELECT p.id, p.name, c.key, c."value"
FROM parent_table p
LEFT JOIN child_table c ON c.parent_table_id = p.id
ORDER BY 1'
,$$VALUES ('K_A'::text), ('K_B'), ('K_C')$$)
AS t (id int, name text, k_a text, k_b text, k_c text; -- use actual data types
Details in this related answer:
PostgreSQL Crosstab Query
About adding "extra" columns:
Pivot on Multiple Columns using Tablefunc

Related

Create a PostgreSQL function that becomes a formula field of a table retrieving related data from other table

The example above can be done on a SQL Server. It is a function that performs the calculation on another table while getting the current table field Id to list data from other table, return a single value.
Question: how to do the exact thing with PostgreSQL
SELECT TOP(5) * FROM Artists;
+------------+------------------+--------------+-------------+
| ArtistId | ArtistName | ActiveFrom | CountryId |
|------------+------------------+--------------+-------------|
| 1 | Iron Maiden | 1975-12-25 | 3 |
| 2 | AC/DC | 1973-01-11 | 2 |
| 3 | Allan Holdsworth | 1969-01-01 | 3 |
| 4 | Buddy Rich | 1919-01-01 | 6 |
| 5 | Devin Townsend | 1993-01-01 | 8 |
+------------+------------------+--------------+-------------+
SELECT TOP(5) * FROM Albums;
+-----------+------------------------+---------------+------------+-----------+
| AlbumId | AlbumName | ReleaseDate | ArtistId | GenreId |
|-----------+------------------------+---------------+------------+-----------|
| 1 | Powerslave | 1984-09-03 | 1 | 1 |
| 2 | Powerage | 1978-05-05 | 2 | 1 |
| 3 | Singing Down the Lane | 1956-01-01 | 6 | 3 |
| 4 | Ziltoid the Omniscient | 2007-05-21 | 5 | 1 |
| 5 | Casualties of Cool | 2014-05-14 | 5 | 1 |
+-----------+------------------------+---------------+------------+-----------+
The function
CREATE FUNCTION [dbo].[ufn_AlbumCount] (#ArtistId int)
RETURNS smallint
AS
BEGIN
DECLARE #AlbumCount int;
SELECT #AlbumCount = COUNT(AlbumId)
FROM Albums
WHERE ArtistId = #ArtistId;
RETURN #AlbumCount;
END;
GO
Now, (at SQL Server), after update the first table fields with ALTER TABLE Artists ADD AlbumCount AS dbo.ufn_AlbumCount(ArtistId); whe can list and get the following result.
+------------+------------------+--------------+-------------+--------------+
| ArtistId | ArtistName | ActiveFrom | CountryId | AlbumCount |
|------------+------------------+--------------+-------------+--------------|
| 1 | Iron Maiden | 1975-12-25 | 3 | 5 |
| 2 | AC/DC | 1973-01-11 | 2 | 3 |
| 3 | Allan Holdsworth | 1969-01-01 | 3 | 2 |
| 4 | Buddy Rich | 1919-01-01 | 6 | 1 |
| 5 | Devin Townsend | 1993-01-01 | 8 | 3 |
| 6 | Jim Reeves | 1948-01-01 | 6 | 1 |
| 7 | Tom Jones | 1963-01-01 | 4 | 3 |
| 8 | Maroon 5 | 1994-01-01 | 6 | 0 |
| 9 | The Script | 2001-01-01 | 5 | 1 |
| 10 | Lit | 1988-06-26 | 6 | 0 |
+------------+------------------+--------------+-------------+--------------+
but how to achieve this on postgresql?
Postgres doesn't support "virtual" computed column (i.e. computed columns that are generated at runtime), so there is no exact equivalent. The most efficient solution is a view that counts this:
create view artists_with_counts
as
select a.*,
coalesce(t.album_count, 0) as album_count
from artists a
left join (
select artist_id, count(*) as album_count
from albums
group by artist_id
) t on a.artist_id = t.artist_id;
Another option is to create a function that can be used as a "virtual column" in a select - but as this is done row-by-row, this will be substantially slower than the view.
create function album_count(p_artist artists)
returns bigint
as
$$
select count(*)
from albums a
where a.artist_id = p_artist.artist_id;
$$
language sql
stable;
Then you can include this as a column:
select a.*, a.album_count
from artists a;
Using the function like that, requires to prefix the function reference with the table alias (alternatively, you can use album_count(a))
Online example

Select common values when using group by [Postgres]

I have three main tables meetings, persons, hobbies with two relational tables.
Table meetings
+---------------+
| id | subject |
+----+----------+
| 1 | Kickoff |
| 2 | Relaunch |
| 3 | Party |
+----+----------+
Table persons
+------------+
| id | name |
+----+-------+
| 1 | John |
| 2 | Anna |
| 3 | Linda |
+----+-------+
Table hobbies
+---------------+
| id | name |
+----+----------+
| 1 | Soccer |
| 2 | Tennis |
| 3 | Swimming |
+----+----------+
Relation Table meeting_person
+-----------------+-----------+
| id | meeting_id | person_id |
+----+------------+-----------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 3 |
| 4 | 2 | 1 |
| 5 | 2 | 2 |
| 6 | 3 | 1 |
+----+------------+-----------+
Relation Table person_hobby
+----------------+----------+
| id | person_id | hobby_id |
+----+-----------+----------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 3 |
| 4 | 2 | 1 |
| 5 | 2 | 2 |
| 6 | 3 | 1 |
+----+-----------+----------+
Now I want to to find the common hobbies of all person attending each meeting.
So the desired result would be:
+------------+-----------------+------------------------+
| meeting_id | persons | common_hobbies |
| | (Aggregated) | (Aggregated) |
+------------+-----------------+------------------------+
| 1 | John,Anna,Linda | Soccer |
| 2 | John,Anna | Soccer,Tennis |
| 3 | John | Soccer,Tennis,Swimming |
+------------+-----------------+------------------------+
My current work in progress is:
select
m.id as "meeting_id",
(
select string_agg(distinct p.name, ',')
from meeting_person mp
inner join persons p on mp.person_id = p.id
where m.id = mp.meeting_id
) as "persons",
string_agg(distinct h2.name , ',') as "common_hobbies"
from meetings m
inner join meeting_person mp2 on m.id = mp2.meeting_id
inner join persons p2 on mp2.person_id = p2.id
inner join person_hobby ph2 on p2.id = ph2.person_id
inner join hobbies h2 on ph2.hobby_id = h2.id
group by m.id
But this query lists not the common_hobbies but all hobbies which are at least once mentioned.
+------------+-----------------+------------------------+
| meeting_id | persons | common_hobbies |
+------------+-----------------+------------------------+
| 1 | John,Anna,Linda | Soccer,Tennis,Swimming |
| 2 | John,Anna | Soccer,Tennis,Swimming |
| 3 | John | Soccer,Tennis,Swimming |
+------------+-----------------+------------------------+
Does anyone have any hints for me, on how I could solve this problem?
Cheers
This problem can be solved by implement custom aggregation function (found it here):
create or replace function array_intersect(anyarray, anyarray)
returns anyarray language sql
as $$
select
case
when $1 is null then $2
when $2 is null then $1
else
array(
select unnest($1)
intersect
select unnest($2))
end;
$$;
create aggregate array_intersect_agg (anyarray)
(
sfunc = array_intersect,
stype = anyarray
);
So, the solution can be next:
select
meeting_id,
array_agg(ph.name) persons,
array_intersect_agg(hobby) common_hobbies
from meeting_person mp
join (
select p.id, p.name, array_agg(h.name) hobby
from person_hobby ph
join persons p on ph.person_id = p.id
join hobbies h on h.id = ph.hobby_id
group by p.id, p.name
) ph on ph.id = mp.person_id
group by meeting_id;
Look the example fiddle
Result:
meeting_id | persons | common_hobbies
-----------+-----------------------+--------------------------
1 | {John,Anna,Linda} | {Soccer}
3 | {John} | {Soccer,Tennis,Swimming}
2 | {John,Anna} | {Soccer,Tennis}

Find rows in relation with at least n rows in a different table without joins

I have a table as such (tbl):
+----+------+-----+
| pk | attr | val |
+----+------+-----+
| 0 | ohif | 4 |
| 1 | foha | 56 |
| 2 | slns | 2 |
| 3 | faso | 11 |
+----+------+-----+
And another table in n-to-1 relationship with tbl (tbl2):
+----+-----+
| pk | rel |
+----+-----+
| 0 | 0 |
| 1 | 1 |
| 2 | 0 |
| 3 | 2 |
| 4 | 2 |
| 5 | 3 |
| 6 | 1 |
| 7 | 2 |
+----+-----+
(tbl2.rel -> tbl.pk.)
I would like to select only the rows from tbl which are in relationship with at least n rows from tbl2.
I.e., for n = 2, I want this table:
+----+------+-----+
| pk | attr | val |
+----+------+-----+
| 0 | ohif | 4 |
| 1 | foha | 56 |
| 2 | slns | 2 |
+----+------+-----+
This is the solution I came up with:
SELECT DISTINCT ON (tbl.pk) tbl.*
FROM (
SELECT tbl.pk
FROM tbl
RIGHT OUTER JOIN tbl2 ON tbl2.rel = tbl.pk
GROUP BY tbl.pk
HAVING COUNT(tbl2.*) >= 2 -- n
) AS tbl_candidates
LEFT OUTER JOIN tbl ON tbl_candidates.pk = tbl.pk
Can it be done without selecting the candidates with a subquery and re-joining the table with itself?
I'm on Postgres 10. A standard SQL solution would be better, but a Postgres solution is acceptable.
OK, just join once, as below:
select
t1.pk,
t1.attr,
t1.val
from
tbl t1
join
tbl2 t2 on t1.pk = t2.rel
group by
t1.pk,
t1.attr,
t1.val
having(count(1)>=2) order by t1.pk;
pk | attr | val
----+------+-----
0 | ohif | 4
1 | foha | 56
2 | slns | 2
(3 rows)
Or just join once and use CTE(with clause), as below:
with tmp as (
select rel from tbl2 group by rel having(count(1)>=2)
)
select b.* from tmp t join tbl b on t.rel = b.pk order by b.pk;
pk | attr | val
----+------+-----
0 | ohif | 4
1 | foha | 56
2 | slns | 2
(3 rows)
Is the SQL clearer?

postgres sql : getting unified rows

I have one table where I dump all records from different sources (x, y, z) like below
+----+------+--------+
| id | source |
+----+--------+
| 1 | x |
| 2 | y |
| 3 | x |
| 4 | x |
| 5 | y |
| 6 | z |
| 7 | z |
| 8 | x |
| 9 | z |
| 10 | z |
+----+--------+
Then I have one mapping table where I map values between sources based on my usecase like below
+----+-----------+
| id | mapped_id |
+----+-----------+
| 1 | 2 |
| 1 | 9 |
| 3 | 7 |
| 4 | 10 |
| 5 | 1 |
+----+-----------+
I want merged results where I can see only unique results like
+-----+------------+
| id | mapped_ids |
+-----+------------+
| 1 | 2,9,5 |
| 3 | 7 |
| 4 | 10 |
| 6 | null |
| 8 | null |
+-----+------------+
I am trying different options but could not figure this out, is there way I can write joins to do this. I have to use the mapping table where associations are stored and identify unique records along with records which are not mapped anywhere.
My understanding is, you want to see all dump_table IDs that do not appear in the mapping_id column and then aggregate the mapped_ids for those that are left:
select d1.id,
array_agg(m1.mapped_id order by m1.mapped_id) filter (where m1.mapped_id is not null) as mapped_ids
from dump_table d1
left join mapping_table m1 using (id)
where not exists (select *
from mapping_table m2
where m2.mapped_id = d1.id)
group by d1.id;
Online example: https://rextester.com/JQZ17650
Try something like this:
SELECT id, name, ARRAY_AGG(mapped_id) AS mapped_ids
FROM table1 AS t1
LEFT JOIN table2 AS t2 USING (id)
GROUP BY id, name

Finding value difference in column pairs

I'm using SQL server 2008R2 and I have a view which returns the following:
+----+-------+-------+-------+-------+-------+-------+
| ID | col1A | col1B | col2A | col2B | col3A | col3B |
+----+-------+-------+-------+-------+-------+-------+
| 1 | 1 | 1 | 3 | 5 | 4 | 4 |
| 2 | 1 | 1 | 5 | 5 | 5 | 4 |
| 3 | 3 | 4 | 5 | 5 | 4 | 4 |
| 4 | 1 | 2 | 5 | 5 | 4 | 3 |
| 5 | 1 | 1 | 2 | 2 | 3 | 3 |
+----+-------+-------+-------+-------+-------+-------+
As you can see this view contains column pairs (col1A and col1B), (col2A and col2B), (col3A and col3B).
I need to query this view and find rows where the column pairs contain different values.
So I would be looking to return:
+----+------------+---+-----+
| ID | ColumnType | A | B |
+----+------------+---+-----+
| 1 | Col2 | 3 | 5 |
| 2 | Col3 | 5 | 4 |
| 3 | Col1 | 3 | 4 |
| 4 | Col1 | 1 | 2 |
| 4 | Col3 | 4 | 3 |
+----+------------+---+-----+
I think I need to use UNPIVOT but not sure how – appreciate any suggestions?
Since you are using SQL Server 2008+ you can use CROSS APPLY to unpivot the pair of columns and then you can easily compare the values in the A and B to return the rows that don't match:
select t.ID,
c.ColumnType,
c.A,
c.B
from [dbo].[yourview] t
cross apply
(
values
('Col1', Col1A, Col1B),
('Col2', Col2A, Col2B),
('Col3', Col3A, Col3B)
) c (ColumnType, A, B)
where c.A <> c.B;
If you have different datatypes in your columns, then you'll need to convert the data to the same type. You can do this conversion within the VALUES clause:
select t.ID,
c.ColumnType,
c.A,
c.B
from [dbo].[yourview] t
cross apply
(
values
('Col1', cast(Col1A as varchar(50)), Col1B),
('Col2', cast(Col2A as varchar(50)), Col2B),
('Col3', cast(Col3A as varchar(50)), Col3B)
) c (ColumnType, A, B)
where c.A <> c.B