postgres one to many flattened return - postgresql

I am trying to write a postgres query that uses 3 tables: people, attribute, and a people_attribute join
people table:
id, name
attribute table:
id, name, attr_group
people_attribute join:
people_id, attribute_id
desired output:
name | fav_colors | fav_music | fav_foods
-----------------------------------------------------------------
michael | red,blue,green | pop,hip-hop,jazz | pizza,burgers,tacos
bob | orange,green | null | tacos,steak,fish
...etc
The tags can vary from none to ~12 for each attr_group
Here is the query I am working with:
select
p.id,
p.name,
(case when a.attr_group like 'fav_colors' then string_agg(a.name, ',') else null end) as fav_colors,
(case when a.attr_group like 'fav_music' then string_agg(a.name, ',') else null end) as fav_music,
(case when a.attr_group like 'fav_foods' then string_agg(a.name, ',') else null end) as fav_foods
from people as p
join people_attribute as pa on pa.people_id = p.id
join "attribute" as a on a.id = pa.attribute_id
group by 1,2,a.attr_group
order by 1 asc;
which returns:
name | fav_colors | fav_music | fav_foods
-----------------------------------------------------------------
michael | red,blue,green | null | null
michael | null | pop,hip-hop,jazz | null
michael | null | null | pizza,burgers,tacos
bob | null | null | null
bob | orange,green | null | null
bob | null | null | tacos,steak,fish
I feel like I'm getting close, but am unsure how to flatten this out to achieve the desired output as shown above. Any help would be greatly appreciated!

You want to use filter for this:
select p.id,
p.name,
string_agg(a.name, ',') filter (where a.attr_group = 'fav_colors') as fav_colors,
string_agg(a.name, ',') filter (where a.attr_group = 'fav_music') as fav_music,
string_agg(a.name, ',') filter (where a.attr_group = 'fav_foods') as fav_foods
from people as p
join people_attribute as pa
on pa.people_id = p.id
join "attribute" as a
on a.id = pa.attribute_id
group by p.id, p.name
order by 1 asc;
Using filter passes only values that match the filter where condition into the aggregation.
The reason yours was showing three rows per people record is because you added attribute.attr_group to your group by. You had no choice since you were using attribute.attr_group in your case conditionals.
Using filter makes attribute.attr_group part of the aggregation, so you do not have to include it in your group by list.
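If filter is not available (it was added in PostgreSQL 9.4), much the same result can be had by moving the case inside the aggregate instead of wrapping it around one, because string_agg skips NULL inputs. The sketch below runs on made-up sample rows. It also switches to left joins so that people with no attributes still appear, and orders the names inside each list. Those choices are assumptions, not part of the answer above.

```sql
-- Hypothetical sample data mirroring the question's three tables.
create temp table people (id int primary key, name text);
create temp table "attribute" (id int primary key, name text, attr_group text);
create temp table people_attribute (people_id int, attribute_id int);

insert into people values (1, 'michael'), (2, 'bob'), (3, 'carol');
insert into "attribute" values
  (1, 'red',   'fav_colors'),
  (2, 'blue',  'fav_colors'),
  (3, 'jazz',  'fav_music'),
  (4, 'tacos', 'fav_foods');
insert into people_attribute values (1, 1), (1, 2), (1, 3), (2, 4);

-- The case expression yields NULL for rows of other groups, and string_agg
-- ignores NULLs, so each column only collects names from its own attr_group.
select p.id,
       p.name,
       string_agg(case when a.attr_group = 'fav_colors' then a.name end, ',' order by a.name) as fav_colors,
       string_agg(case when a.attr_group = 'fav_music'  then a.name end, ',' order by a.name) as fav_music,
       string_agg(case when a.attr_group = 'fav_foods'  then a.name end, ',' order by a.name) as fav_foods
from people as p
left join people_attribute as pa on pa.people_id = p.id
left join "attribute" as a on a.id = pa.attribute_id
group by p.id, p.name
order by p.id;
```

With this data michael gets blue,red and jazz, bob gets only tacos, and carol comes back with NULL in all three columns.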

Related

PostgreSQL How to merge two tables row to row without condition

I have two tables
The first table contains three text fields (username, email, num); the second has only one column, birth_date DATE, filled with random dates.
I need to merge tables without condition
For example
first table:
+----------+--------------+-----------+
| username | email | num |
+----------+--------------+-----------+
| 'user1' | 'user1#mail' | '+794949' |
| 'user2' | 'user2#mail' | '+799999' |
+----------+--------------+-----------+
second table:
+--------------+
| birth_date |
+--------------+
| '2001-01-01' |
| '2002-02-02' |
+--------------+
And I need result like
+----------+------------+-------------+--------------+
| username | email | num | birth_date |
+----------+------------+-------------+--------------+
| 'user1' | 'us1#mail' | '+7979797' | '2001-01-01' |
| 'user2' | 'us2#mail' | '+79898998' | '2002-02-02' |
+----------+------------+-------------+--------------+
The result table should have 100 rows too.
I tried different JOINs, but there is no join condition here.
Sure there is a join condition, about the simplest there is: join on true, or use a cross join. Either one merges the tables without a condition. However, this does not give you what you want, because it generates a result set of 10k rows (100 x 100). But you can then use limit:
select *
from table1
join table2 on true
order by random()
limit 100;
select *
from table1
cross join table2
order by random()
limit 100;
There is another option, which I think may be closer to what you want. Assign a number to each row of each table, then join on this assigned value:
select <column list>
from (select *, row_number() over() rn from table1) t1
join (select *, row_number() over() rn from table2) t2
on (t1.rn = t2.rn);
To eliminate the assigned value you must specifically list each column desired in the result. But that is the way it should be done anyway.
See demo here (the demo uses just 3 rows instead of 100).
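Note that row_number() over() with no ordering pairs the rows in whatever order the scan happens to return them. If the pairing should be repeatable, an ordering can be added inside each window. A minimal sketch, assuming made-up sample rows and an arbitrary choice to order by username and birth_date:

```sql
-- Hypothetical sample data shaped like the question's two tables.
create temp table table1 (username text, email text, num text);
create temp table table2 (birth_date date);

insert into table1 values
  ('user1', 'user1#mail', '+794949'),
  ('user2', 'user2#mail', '+799999');
insert into table2 values ('2001-01-01'), ('2002-02-02');

-- Number the rows of each table in a fixed order, then pair them up by number.
-- Listing the columns explicitly keeps the helper column rn out of the result.
select t1.username, t1.email, t1.num, t2.birth_date
from (select *, row_number() over (order by username) as rn from table1) t1
join (select *, row_number() over (order by birth_date) as rn from table2) t2
  on t1.rn = t2.rn
order by t1.rn;
```

Here user1 is paired with 2001-01-01 and user2 with 2002-02-02. Because this is an inner join, if the tables had different row counts the extra rows of the longer one would be dropped.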

Add Default Rows in Postgresql

I want to insert default rows into a result set if the LEFT JOIN is NULL.
For example if Jane has no roles, I want to return some default ones in the results.
A query like this will return the following:
SELECT * FROM employees LEFT OUTER JOIN roles ON roles.employee_id = employees.id
Employee ID | Employee Name | Role ID | Role Name
1 | John | 1 | Admin
1 | John | 2 | Standard
2 | Jane | NULL | NULL
I want to return:
Employee ID | Employee Name | Role ID | Role Name
1 | John | 1 | Admin
1 | John | 2 | Standard
2 | Jane | NULL | Admin
2 | Jane | NULL | Standard
Is there a good way to do this in PostgreSQL?
I think you're looking for
SELECT e.*, r.id, r.name
FROM employees e
JOIN roles r ON r.employee_id = e.id
UNION ALL
SELECT e.*, NULL, default_name
FROM employees e
CROSS JOIN (VALUES ('Admin'), ('Standard')) AS roles(default_name)
WHERE NOT EXISTS (
SELECT *
FROM roles r
WHERE r.employee_id = e.id
)
I don't think there's a (good) way around the UNION because a LEFT JOIN introduces only a single row per unmatched row. You might be able to lift out the join against the employees table though:
SELECT e.*, r.*
FROM employees e,
LATERAL (
SELECT r.id, r.name
FROM roles r
WHERE r.employee_id = e.id
UNION ALL
SELECT NULL, default_name
FROM (VALUES ('Admin'), ('Standard')) AS roles(default_name)
WHERE NOT EXISTS (
SELECT *
FROM roles r
WHERE r.employee_id = e.id
)
) AS r;
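The UNION ALL approach can be tried out on a small data set. The sketch below assumes hypothetical employees and roles tables shaped like the question's example. It spells the unconditional join as CROSS JOIN and lists the role columns explicitly so that both branches return the same number of columns.

```sql
-- Hypothetical tables matching the question's example output.
create temp table employees (id int primary key, name text);
create temp table roles (id int primary key, employee_id int, name text);

insert into employees values (1, 'John'), (2, 'Jane');
insert into roles values (1, 1, 'Admin'), (2, 1, 'Standard');

-- First branch: employees that have real roles.
select e.id, e.name, r.id as role_id, r.name as role_name
from employees e
join roles r on r.employee_id = e.id
union all
-- Second branch: every default role for employees that have no roles at all.
select e.id, e.name, null, d.default_name
from employees e
cross join (values ('Admin'), ('Standard')) as d(default_name)
where not exists (select 1 from roles r where r.employee_id = e.id)
order by 1, 4;
```

This returns John's two real roles followed by two rows for Jane, with a NULL role id and the names Admin and Standard.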

Merge rows postgres and replace values with latest when not null

I have a table that looks like this:
I am looking for a way to merge the rows on organization_core_id so that the query returns this:
organization_core_id, slug, name
1, dolphin, Dolphin v2
2, sea-horse-club, Sea Horse
How can I merge these columns and replace the latest value?
First group by organization_core_id to get the ids of the rows with the last not null values for slug and name and then join to the table:
select
t.organization_core_id,
t1.slug,
t2.name
from (
select
organization_core_id,
max(case when slug is not null then id end) slugid,
max(case when name is not null then id end) nameid
from tablename
group by organization_core_id
) t
left join tablename t1 on t1.id = t.slugid
left join tablename t2 on t2.id = t.nameid
See the demo.
Results:
> organization_core_id | slug | name
> -------------------: | :------------- | :---------
> 1 | dolphin | Dolphin v2
> 2 | sea-horse-club | Sea Horse
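To see what the inner grouping step produces before the two joins, it can be run on its own. The question's original table was not preserved, so the rows below are invented to be consistent with the result shown above, and they assume that a higher id means a newer row.

```sql
-- Hypothetical rows; invented for illustration only.
create temp table tablename (
  id int primary key,
  organization_core_id int,
  slug text,
  name text
);
insert into tablename values
  (1, 1, 'dolphin',        'Dolphin'),
  (2, 1, null,             'Dolphin v2'),
  (3, 2, 'sea-horse-club', null),
  (4, 2, null,             'Sea Horse');

-- For each organization, find the highest id that has a non-null slug
-- and, separately, the highest id that has a non-null name.
select organization_core_id,
       max(case when slug is not null then id end) as slugid,
       max(case when name is not null then id end) as nameid
from tablename
group by organization_core_id
order by organization_core_id;
```

For organization 1 this gives slugid 1 and nameid 2; for organization 2 it gives slugid 3 and nameid 4. Joining back on those ids picks up dolphin / Dolphin v2 and sea-horse-club / Sea Horse.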

EAV data in SQL Server

I have no control over the data or the database structure. I have this EAV type of data where a consultant can speak one or many languages and he can travel to 1 or many countries in Europe and he has many skills indeed.
FYI there are 10 different main categories in my data.
Some consultants speak 10 languages while others speak only one.
The data looks a bit like this
____________________________________________
| ConsultantID | Category | Value |
--------------------------------------------
| 1 | Language | English |
| 1 | Language | French (fluent) |
| 1 | Language | Spanish (working)|
| 1 | Country | Ireland |
| 1 | Country | Italy |
| 1 | Country | Germany |
| 1 | Country | Belgium |
| 456 | Language | French (working) |
| 456 | Country | Belgium |
| 847 | Language | English |
| 847 | Country | Belgium |
--------------------------------------------
I want to list all consultants willing to travel to Belgium and who speak French (working or fluent). Based on my current example that would be #1 and #456
I wrote the query below, which lists all values matching a category for a consultant (note this is not dynamic, as the number of values in my example is capped at 5, so already a poor design).
SELECT
ID, category,
MAX(CASE seq WHEN 1 THEN value ELSE '' END ) +
MAX(CASE seq WHEN 2 THEN ',' + value ELSE '' END ) +
MAX(CASE seq WHEN 3 THEN ',' + value ELSE '' END ) +
MAX(CASE seq WHEN 4 THEN ',' + value ELSE '' END ) +
MAX(CASE seq WHEN 5 THEN ',' + value ELSE '' END )
FROM
(SELECT
p1.ID, p1.category, p1.value,
(SELECT COUNT(*)
FROM tblWebPracticeInfo p2
WHERE p2.category = p1.category
AND p2.ID = P1.ID
AND p2.value <= p1.value)
FROM
tblWebPracticeInfo p1) D (ID, category, value, seq )
GROUP BY
ID, category
ORDER BY
ID;
I would then need to query this table...
But without even a where clause it takes already 2 seconds to execute
I have something else more basic (but similarly not efficient)
select *
from tblWebMemberInfo m
where
m.ID in (select p.id from tblWebPracticeInfo p
where p.category = 'Language' and p.value like 'French%')
and m.ID in (select p.id from tblWebPracticeInfo p
where p.category = 'Country' and p.value = 'Belgium')
order by m.ID
That's basically where I am. As you can see nothing genius and nothing which is really working.
Can you point me to the right track.
I'm using SQL Server 2005 - v9.00.1
Many thanks in advance for your time & help
If you just need to list the consultants then you can use exists():
select p.Id ...
from Person p /* Assuming you have a regular table for people,
if not, use distinct or group by */
where exists (
select 1
from tblWebPracticeInfo l
where l.Id = p.Id
and l.Category = 'Language'
and l.Value like 'French%' /* matches both 'French (fluent)' and 'French (working)' */
)
and exists (
select 1
from tblWebPracticeInfo c
where c.Id = p.Id
and c.Category = 'Country'
and c.Value = 'Belgium'
)
You could also use aggregation and having like so:
select p.ID
from tblWebPracticeInfo p
where (p.category = 'Language' and p.value like 'French%')
or (p.category = 'Country' and p.value = 'Belgium')
group by p.ID
having count(distinct p.category) = 2 /* both categories must match; distinct guards against several matching rows in one category */
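One caveat with the aggregation approach: counting rows can be fooled when a consultant has several matching rows in one category, so counting distinct categories is the safer test. A small sketch with made-up rows, written to run on SQL Server 2005 (hence the union all insert); the temp table name #practice and consultant 99 are hypothetical:

```sql
-- Hypothetical sample rows. Consultant 99 has two French rows and no Belgium row.
create table #practice (ID int, category varchar(20), value varchar(50));

insert into #practice (ID, category, value)
select 1,  'Language', 'French (fluent)'  union all
select 1,  'Country',  'Belgium'          union all
select 99, 'Language', 'French (fluent)'  union all
select 99, 'Language', 'French (working)';

-- Keep only consultants that matched in both categories.
select p.ID
from #practice p
where (p.category = 'Language' and p.value like 'French%')
   or (p.category = 'Country'  and p.value = 'Belgium')
group by p.ID
having count(distinct p.category) = 2;

drop table #practice;
```

This returns only ID 1. A plain row count of 2 would also let consultant 99 through, since its two French rows satisfy the where clause.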

Select query for selecting columns from those records from the inner query . where inner query and outer query have different columns

I have a group by query which fetches me some records. What if I wish to find other column details representing those records.
Suppose I have a query as follows: Select id,max(date) from records group by id;
to fetch the most recent entry in the table.
I wish to fetch another column representing those records .
I want to do something like this (This incorrect query is just for example) :
Select type from (Select id,max(date) from records group by id), but here type doesn't exist in the inner query.
I am not able to put the question more simply. I apologise for that.
Any help is appreciated.
EDIT :
Column | Type | Modifiers
--------+-----------------------+-----------
id | integer |
rdate | date |
type | character varying(20) |
Sample Data :
id | rdate | type
----+------------+------
1 | 2013-11-03 | E1
1 | 2013-12-12 | E1
2 | 2013-12-12 | A3
3 | 2014-01-11 | B2
1 | 2014-01-15 | A1
4 | 2013-12-23 | C1
5 | 2014-01-05 | C
7 | 2013-12-20 | D
8 | 2013-12-20 | D
9 | 2013-12-23 | A1
While I was trying something like this (I'm no good at sql) : select type from records as r1 inner join (Select id,max(rdate) from records group by id) r2 on r1.rdate = r2.rdate ;
or
select type from records as r1 ,(Select id,max(rdate) from records group by id) r2 inner join r1 on r1.rdate = r2.rdate ;
You can easily do this with a window function:
SELECT id, rdate, type
FROM (
SELECT id, rdate, type, rank() OVER (PARTITION BY id ORDER BY rdate DESC) AS rnk
FROM records
) foo
WHERE rnk = 1 -- filter in the outer query: rnk is not visible to the WHERE of the level that computes it
ORDER BY id;
The window definition OVER (PARTITION BY id ORDER BY rdate DESC) takes all records with the same id value, sorts them from most recent to least recent rdate, and assigns a rank to each row. The rank of 1 is the most recent, so it is equivalent to max(rdate). If two rows for the same id share the latest rdate, both get rank 1 and both are returned.
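PostgreSQL also has a DISTINCT ON shorthand for this "latest row per group" pattern. It is a different technique from the rank() query, shown here only as a sketch on a few rows taken from the question's sample data. Unlike rank(), it returns exactly one row per id even when two rows tie on rdate.

```sql
-- A subset of the question's sample data.
create temp table records (id int, rdate date, type varchar(20));
insert into records values
  (1, '2013-11-03', 'E1'),
  (1, '2013-12-12', 'E1'),
  (2, '2013-12-12', 'A3'),
  (3, '2014-01-11', 'B2'),
  (1, '2014-01-15', 'A1');

-- DISTINCT ON (id) keeps the first row of each id group in ORDER BY order,
-- so sorting by rdate descending within each id keeps the most recent row.
select distinct on (id) id, rdate, type
from records
order by id, rdate desc;
```

For id 1 this keeps the 2014-01-15 row with type A1; ids 2 and 3 each keep their only row.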
If I've understood the question right, then this should work (or at least get you something you can work with):
SELECT
b.id, b.maxdate, a.type
FROM
records a -- this is the records table, where you'll get the type
INNER JOIN -- now join it to the group by query
(select id, max(rdate) as maxdate FROM records GROUP BY id) b
ON -- join on both rdate and id, otherwise you'll get lots of duplicates
b.id = a.id
AND b.maxdate = a.rdate
Note that if you have records with different types for the same id and rdate combination you'll get duplicates.