How do I select a postgres Many-to-One relationship as a single row? [duplicate] - postgresql

I have a many-to-one relationship between Animals and their attributes. Because different Animals have different attributes, I want to be able to select all animals with their attribute name as a column header and NULL values where that animal does not have that attribute.
Like so...
TABLE_ANIMALS
ID | ANIMAL | DATE | MORE COLS....
1 | CAT | 2012-01-10 | ....
2 | DOG | 2012-01-10 | ....
3 | FROG | 2012-01-10 | ....
...
TABLE_ATTRIBUTES
ID | ANIMAL_ID | ATTRIBUTE_NAME | ATTRIBUTE_VALUE
1 | 1 | noise | meow
2 | 1 | legs | 4
3 | 1 | has_fur | TRUE
4 | 2 | noise | woof
5 | 2 | legs | 4
6 | 3 | noise | croak
7 | 3 | legs | 2
8 | 3 | has_fur | FALSE
...
QUERY RESULT
ID | ANIMAL | NOISE | LEGS | HAS_FUR
1 | CAT | meow | 4 | TRUE
2 | DOG | woof | 4 | NULL
3 | FROG | croak | 2 | FALSE
How would I do this? To reiterate, it's important that all the columns are there even if one Animal doesn't have that attribute, such as "DOG" and "HAS_FUR" in this example. If it doesn't have the attribute, it should just be null.

How about a simple join, aggregation and group by?
create table table_animals(id int, animal varchar(10), date date);
create table table_attributes(id int, animal_id int, attribute_name varchar(10), attribute_value varchar(10));
insert into table_animals values (1, 'CAT', '2012-01-10'),
(2, 'DOG', '2012-01-10'),
(3, 'FROG', '2012-01-10');
insert into table_attributes values (1, 1, 'noise', 'meow'),
(2, 1, 'legs', '4'),
(3, 1, 'has_fur', 'TRUE'),
(4, 2, 'noise', 'woof'),
(5, 2, 'legs', '4'),
(6, 3, 'noise', 'croak'),
(7, 3, 'legs', '2'),
(8, 3, 'has_fur', 'FALSE');
select ta.id, ta.animal,
max(tat.attribute_value) filter (where tat.attribute_name = 'noise') as noise,
max(tat.attribute_value) filter (where tat.attribute_name = 'legs') as legs,
max(tat.attribute_value) filter (where tat.attribute_name = 'has_fur') as has_fur
from table_animals ta
left join table_attributes tat on tat.animal_id = ta.id
group by ta.id, ta.animal
order by ta.id;
Alternatively, you can write the aggregation as MAX(CASE WHEN ... THEN ... END), but MAX(...) FILTER (WHERE ...), available since PostgreSQL 9.4, is more concise and usually performs slightly better.
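For comparison, here is a minimal sketch of the MAX(CASE WHEN ...) variant mentioned above. It assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL (only so the example is runnable without a server), stores every attribute value as text, and reuses the table and column names from the answer.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table_animals(id int, animal text);
    create table table_attributes(
        id int, animal_id int, attribute_name text, attribute_value text);
    insert into table_animals values (1, 'CAT'), (2, 'DOG'), (3, 'FROG');
    insert into table_attributes values
        (1, 1, 'noise', 'meow'), (2, 1, 'legs', '4'), (3, 1, 'has_fur', 'TRUE'),
        (4, 2, 'noise', 'woof'), (5, 2, 'legs', '4'),
        (6, 3, 'noise', 'croak'), (7, 3, 'legs', '2'), (8, 3, 'has_fur', 'FALSE');
""")

# One MAX(CASE ...) per output column; CASE without ELSE yields NULL,
# and MAX ignores NULLs, so a missing attribute stays NULL.
pivot_sql = """
    select ta.id, ta.animal,
           max(case when tat.attribute_name = 'noise'
                    then tat.attribute_value end) as noise,
           max(case when tat.attribute_name = 'legs'
                    then tat.attribute_value end) as legs,
           max(case when tat.attribute_name = 'has_fur'
                    then tat.attribute_value end) as has_fur
    from table_animals ta
    left join table_attributes tat on tat.animal_id = ta.id
    group by ta.id, ta.animal
    order by ta.id
"""
rows = conn.execute(pivot_sql).fetchall()
for row in rows:
    print(row)
```

The left join keeps animals that have no attributes at all, and DOG gets NULL (None in Python) for has_fur because no matching attribute row exists.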

Related

Aggregate all combinations of rows taken k at a time

I am trying to calculate an aggregate function for a field for a subset of rows in a table. The problem is that I'd like to find the mean of every combination of rows taken k at a time --- so for all the rows, I'd like to find (say) the mean of every combination of 10 rows. So:
id | count
----|------
1 | 5
2 | 3
3 | 6
...
30 | 16
should give me
mean of ids 1..10; ids 1, 3..11; ids 1, 4..12, and so on. I know this will yield a lot of rows.
There are SO answers for finding combinations from arrays. I could do this programmatically by taking 30 ids 10 at a time and then SELECTing them. Is there a way to do this with PARTITION BY, TABLESAMPLE, or another function (something like python's itertools.combinations())? (TABLESAMPLE by itself won't guarantee which subset of rows I am selecting as far as I can tell.)
The method described in the cited answer is static. A more convenient solution may be to use recursion.
Example data:
drop table if exists my_table;
create table my_table(id int primary key, number int);
insert into my_table values
(1, 5),
(2, 3),
(3, 6),
(4, 9),
(5, 2);
Query which finds 2 element subsets in 5 element set (k-combination with k = 2):
with recursive recur as (
    select
        id,
        array[id] as combination,
        array[number] as numbers,
        number as sum
    from my_table
    union all
    select
        t.id,
        r.combination || t.id,
        r.numbers || t.number,
        r.sum + t.number
    from my_table t
    join recur r on r.id < t.id
        and cardinality(r.combination) < 2 -- param k
)
select combination, numbers, sum/2.0 as average -- param k
from recur
where cardinality(combination) = 2; -- param k
combination | numbers | average
-------------+---------+--------------------
{1,2} | {5,3} | 4.0000000000000000
{1,3} | {5,6} | 5.5000000000000000
{1,4} | {5,9} | 7.0000000000000000
{1,5} | {5,2} | 3.5000000000000000
{2,3} | {3,6} | 4.5000000000000000
{2,4} | {3,9} | 6.0000000000000000
{2,5} | {3,2} | 2.5000000000000000
{3,4} | {6,9} | 7.5000000000000000
{3,5} | {6,2} | 4.0000000000000000
{4,5} | {9,2} | 5.5000000000000000
(10 rows)
The same query for k = 3 gives:
combination | numbers | average
-------------+---------+--------------------
{1,2,3} | {5,3,6} | 4.6666666666666667
{1,2,4} | {5,3,9} | 5.6666666666666667
{1,2,5} | {5,3,2} | 3.3333333333333333
{1,3,4} | {5,6,9} | 6.6666666666666667
{1,3,5} | {5,6,2} | 4.3333333333333333
{1,4,5} | {5,9,2} | 5.3333333333333333
{2,3,4} | {3,6,9} | 6.0000000000000000
{2,3,5} | {3,6,2} | 3.6666666666666667
{2,4,5} | {3,9,2} | 4.6666666666666667
{3,4,5} | {6,9,2} | 5.6666666666666667
(10 rows)
Of course, you can remove numbers from the query if you do not need them.
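Since the question mentions python's itertools.combinations(), here is a small sketch of the same computation done client-side, which can be handy for cross-checking the recursive query. The function name combination_means is made up for this illustration; the data is the example data from above.

```python
from itertools import combinations
from math import comb

# (id, number) pairs, same as my_table in the example above.
rows = [(1, 5), (2, 3), (3, 6), (4, 9), (5, 2)]

def combination_means(rows, k):
    """Return (ids, numbers, mean) for every k-combination of rows."""
    result = []
    for combo in combinations(rows, k):
        ids = tuple(r[0] for r in combo)
        numbers = tuple(r[1] for r in combo)
        result.append((ids, numbers, sum(numbers) / k))
    return result

for ids, numbers, average in combination_means(rows, 2):
    print(ids, numbers, average)

# Number of result rows for the case in the question (30 rows, k = 10).
print(comb(30, 10))
```

For 30 rows taken 10 at a time the result has 30045015 rows, so either approach will be slow and memory-hungry at that size.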

postgresql unique index preventing overlapping

My table permission looks like:
id serial,
person_id integer,
permission_id integer,
valid_from date,
valid_to date
I'd like to prevent creating permissions whose valid_from to valid_to date ranges overlap for the same person and permission.
E.g. given these rows:
1 | 1 | 1 | 2010-10-01 | 2999-12-31
2 | 1 | 2 | 2010-10-01 | 2020-12-31
3 | 2 | 1 | 2015-10-01 | 2999-12-31
this can be added:
4 | 1 | 3 | 2011-10-01 | 2999-12-31 - because no such permission
5 | 2 | 1 | 2011-10-10 | 2999-12-31 - because no such person
6 | 1 | 2 | 2021-01-01 | 2999-12-31 - because it doesn't overlap id:2
but these can't:
7 | 1 | 1 | 2009-10-01 | 2010-02-01 - because overlaps id:1
8 | 1 | 2 | 2019-01-01 | 2022-12-31 - because overlaps id:2
9 | 2 | 1 | 2010-01-01 | 2016-12-31 - because overlaps id:3
I can do the checking outside the database, but I wonder if it is possible to do it in the database.
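A unique constraint is based on an equality operator and cannot be used in this case, but you can use an exclusion constraint. The constraint combines the equality operator = on the integer columns with the range overlap operator && in a GiST index. GiST has no built-in operator classes for plain integers, hence you have to install the btree_gist extension.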
create extension if not exists btree_gist;
create table permission(
id serial,
person_id integer,
permission_id integer,
valid_from date,
valid_to date,
exclude using gist (
person_id with =,
permission_id with =,
daterange(valid_from, valid_to) with &&)
);
These inserts are successful:
insert into permission values
(1, 1, 1, '2010-10-01', '2999-12-31'),
(2, 1, 2, '2010-10-01', '2020-12-31'),
(3, 2, 1, '2015-10-01', '2999-12-31'),
(4, 1, 3, '2011-10-01', '2999-12-31'),
(5, 3, 1, '2011-10-10', '2999-12-31'), -- you meant person_id = 3 I suppose
(6, 1, 2, '2021-01-01', '2999-12-31'),
(7, 1, 1, '2009-10-01', '2010-02-01'); -- ranges do not overlap!
but this one is not:
insert into permission values
(8, 1, 2, '2019-01-01', '2022-12-31');
ERROR: conflicting key value violates exclusion constraint "permission_person_id_permission_id_daterange_excl"
DETAIL: Key (person_id, permission_id, daterange(valid_from, valid_to))=(1, 2, [2019-01-01,2022-12-31)) conflicts with existing key (person_id, permission_id, daterange(valid_from, valid_to))=(1, 2, [2010-10-01,2020-12-31)).
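To make the overlap rule concrete, here is a rough sketch of the check the exclusion constraint performs, written in plain Python. It assumes half-open ranges [valid_from, valid_to), which is what daterange(valid_from, valid_to) produces by default, and that valid_from < valid_to. The function names overlaps and conflicts are invented for this illustration.

```python
from datetime import date

def overlaps(a_from, a_to, b_from, b_to):
    """True if half-open ranges [a_from, a_to) and [b_from, b_to) share a day."""
    return a_from < b_to and b_from < a_to

def conflicts(existing, new):
    """Rows in `existing` with the same person and permission and an overlapping range."""
    _, person_id, permission_id, valid_from, valid_to = new
    return [
        row for row in existing
        if row[1] == person_id
        and row[2] == permission_id
        and overlaps(row[3], row[4], valid_from, valid_to)
    ]

# (id, person_id, permission_id, valid_from, valid_to)
existing = [
    (1, 1, 1, date(2010, 10, 1), date(2999, 12, 31)),
    (2, 1, 2, date(2010, 10, 1), date(2020, 12, 31)),
    (3, 2, 1, date(2015, 10, 1), date(2999, 12, 31)),
]

row_7 = (7, 1, 1, date(2009, 10, 1), date(2010, 2, 1))
row_8 = (8, 1, 2, date(2019, 1, 1), date(2022, 12, 31))

print(conflicts(existing, row_7))  # no conflict: it ends before id 1 starts
print(conflicts(existing, row_8))  # conflicts with id 2
```

Because the default range excludes its upper bound, a permission ending on a given day and another starting on that same day do not conflict. If valid_to should be inclusive, daterange(valid_from, valid_to, '[]') can be used in the constraint instead.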

Grouping by unique values inside a JSONB array

Consider the following table structure:
CREATE TABLE residences (id int, price int, categories jsonb);
INSERT INTO residences VALUES
(1, 3, '["monkeys", "hamsters", "foxes"]'),
(2, 5, '["monkeys", "hamsters", "foxes", "foxes"]'),
(3, 7, '[]'),
(4, 11, '["turtles"]');
SELECT * FROM residences;
id | price | categories
----+-------+-------------------------------------------
1 | 3 | ["monkeys", "hamsters", "foxes"]
2 | 5 | ["monkeys", "hamsters", "foxes", "foxes"]
3 | 7 | []
4 | 11 | ["turtles"]
Now I would like to know how many residences there are for each category, as well as their sum of prices. The only way I found to do this was using a sub-query:
SELECT category, SUM(price), COUNT(*) AS residences_no
FROM
residences a,
(
SELECT DISTINCT(jsonb_array_elements(categories)) AS category
FROM residences
) b
WHERE a.categories @> category
GROUP BY category
ORDER BY category;
category | sum | residences_no
------------+-----+---------------
"foxes" | 8 | 2
"hamsters" | 8 | 2
"monkeys" | 8 | 2
"turtles" | 11 | 1
Using jsonb_array_elements without subquery would return three residences for foxes because of the duplicate entry in the second row. Also the price of the residence would be inflated by 5.
Is there any way to do this without using the sub-query, or any better way to accomplish this result?
EDIT
Initially I did not mention the price column.
select category, count(distinct (id, category))
from residences, jsonb_array_elements(categories) category
group by category
order by category;
category | count
------------+-------
"foxes" | 2
"hamsters" | 2
"monkeys" | 2
"turtles" | 1
(4 rows)
You have to use a derived table to aggregate another column (the totals below assume every price is 10):
select category, count(*), sum(price) total
from (
select distinct id, category, price
from residences, jsonb_array_elements(categories) category
) s
group by category
order by category;
category | count | total
------------+-------+-------
"foxes" | 2 | 20
"hamsters" | 2 | 20
"monkeys" | 2 | 20
"turtles" | 1 | 10
(4 rows)
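The de-duplication step is the key idea: each (id, category) pair must be counted once before prices are summed. Here is a short Python sketch of the same logic, using the rows and prices from the question. The function name summarize is hypothetical.

```python
from collections import defaultdict

# (id, price, categories) as in the residences table from the question.
residences = [
    (1, 3, ["monkeys", "hamsters", "foxes"]),
    (2, 5, ["monkeys", "hamsters", "foxes", "foxes"]),
    (3, 7, []),
    (4, 11, ["turtles"]),
]

def summarize(residences):
    """Map each category to (number of residences, sum of prices)."""
    totals = defaultdict(lambda: [0, 0])
    for _id, price, categories in residences:
        # set() plays the role of SELECT DISTINCT id, category, price:
        # a category repeated within one row is counted once.
        for category in set(categories):
            totals[category][0] += 1
            totals[category][1] += price
    return {category: tuple(v) for category, v in sorted(totals.items())}

for category, (count, total) in summarize(residences).items():
    print(category, count, total)
```

With the question's prices this reproduces the sums 8, 8, 8 and 11, and the duplicate "foxes" entry in residence 2 no longer inflates the count or the total.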

How to create a pivot table from hstore data?

Imagining I have a table cars with a field data inside:
CARS
name | data
car 1 | { "doors" => "5", "engine" => "1.1" }
car 2 | { "doors" => "3", "engine" => "1.1", "air_conditioning" => "true" }
car 3 | { "doors" => "5", "engine" => "1.4" }
Assuming data keys are dynamic (more can be added), how can I create a pivot table from that data like this:
CROSSTAB
name | doors | engine | air_conditioning
car 1 | 5 | 1.1 |
car 2 | 3 | 1.1 | "true"
car 3 | 5 | 1.4 |
Here's how to get the result you asked for:
CREATE EXTENSION IF NOT EXISTS hstore; -- provides hstore() and the -> operator
CREATE TABLE hstore_test (id bigserial primary key, title text, doors integer, engine text, air_conditioning boolean);
INSERT INTO hstore_test (title, doors, engine, air_conditioning)
VALUES ('Car1', 2, '1.1', false), ('Car2', 4, '1.2', true), ('Car3', 3, '1.3', false), ('Car4', 5, '1.4', null);
DROP TABLE IF EXISTS hstore_persist;
CREATE TABLE hstore_persist AS
SELECT hstore(t) car_data FROM hstore_test AS t;
SELECT car_data->'title' "name", car_data->'doors' doors, car_data->'engine' engine, car_data->'air_conditioning' air_conditioning
FROM hstore_persist;
This will result in the table
name | doors | engine | air_conditioning
Car1 | 2 | 1.1 | f
Car2 | 4 | 1.2 | t
Car3 | 3 | 1.3 | f
Car4 | 5 | 1.4 |
There is nothing "crosstab" about it, though. This is just using the accessor methods of an hstore to display the data in the way you show in the example.
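The query above lists its columns explicitly, so it does not cover the "keys are dynamic" part of the question. A SQL statement needs its output columns fixed before it runs, so one common workaround is two steps: first collect the distinct keys, then build the column list from them. The sketch below shows that idea in plain Python on the question's data, with each hstore value represented as a dict. The function name pivot is made up for this example.

```python
# (name, data) rows; each dict stands in for an hstore value.
cars = [
    ("car 1", {"doors": "5", "engine": "1.1"}),
    ("car 2", {"doors": "3", "engine": "1.1", "air_conditioning": "true"}),
    ("car 3", {"doors": "5", "engine": "1.4"}),
]

def pivot(cars):
    """Return (header, rows) with one column per key seen in any row."""
    # Step 1: collect the distinct keys, in order of first appearance.
    columns = []
    for _name, data in cars:
        for key in data:
            if key not in columns:
                columns.append(key)
    # Step 2: emit one row per car; missing keys become None (NULL).
    header = ["name"] + columns
    rows = [[name] + [data.get(key) for key in columns] for name, data in cars]
    return header, rows

header, rows = pivot(cars)
print(header)
for row in rows:
    print(row)
```

In the database the same two steps would mean querying the keys first and then generating the SELECT list (or a crosstab column definition) from that result.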

Select statement with join, or subquery limit

For a few days now I have been trying to solve this problem.
I have two tables, group_user and group_name.
What I want to do is select a user's groups, then the description of each group (from group_name), and 10 other users from each group.
The first two are not a problem. The problem is that I cannot work out how to limit the users.
I can select the user's groups and the other users in those groups, but I don't know how to limit them.
Using:
SELECT a.g_id,b.group,b.userid
FROM group_user AS a
RIGHT JOIN
(SELECT g_id as group, u_id as userid FROM group_user) AS b ON a.g_id=b.group
WHERE u_id=112
It shows me my user's groups and the users in those groups. But when I try to put a LIMIT in the subquery, it limits the whole result, not each particular group.
I tried selecting users with IN, where the list was my user's groups, without luck.
I was thinking that maybe GROUP BY and HAVING would help, but I can't see how I could use them.
So my question is: how can I limit a subquery's result in MySQL when the subquery is built on the result of the outer query?
I think I'm overloaded and maybe I'm missing something.
UPDATE: to show what I really want to accomplish, here's another piece of code.
SELECT g_id FROM group_user WHERE user_id = 112
So I get all the groups that the user is in. Let's say each row of that select is a variable extra_group, so the second query would be
SELECT u_id FROM group_user WHERE group_id = extra_group LIMIT 10
I need to do the same as above, in one query.
Another UPDATE, after Mike's post.
I should add that a user can be in more than one group. So I think the real problem is that I don't have any clue how to select those groups and, in the same query, select 10 users for each selected group, so that the result could be
g_id u_id
1 | 2
1 | 3
1 | 4
3 | 3
3 | 8
where the g_id values are the user's groups from this query:
SELECT g_id FROM group_user WHERE user_id = 112
Create sample tables and add data:
CREATE TABLE `group_user` (
`u_id` int(11) DEFAULT NULL,
`g_id` int(11) DEFAULT NULL,
`apply_date` date DEFAULT NULL
);
CREATE TABLE `group_name` (
`g_id` int(11) DEFAULT NULL,
`g_name` varchar(255) DEFAULT NULL
);
INSERT INTO `group_name` VALUES
(1, 'Group 1'), (2, 'Group 2'), (3, 'Group 3'), (4, 'Group 4'), (5, 'Group 5');
INSERT INTO `group_user` VALUES
(1, 1, '2010-12-01'), (1, 2, '2010-12-01'), (1, 3, '2010-12-01'), (1, 4, '2010-12-01'), (1, 5, '2010-12-01'),
(2, 1, '2010-12-02'), (2, 2, '2010-12-02'),
(3, 1, '2010-12-03'), (3, 2, '2010-12-03'), (3, 3, '2010-12-03'), (3, 4, '2010-12-03'),
(4, 1, '2010-12-04'), (4, 2, '2010-12-04'),
(5, 1, '2010-12-05'), (5, 2, '2010-12-05'),
(6, 1, '2010-12-06'), (6, 2, '2010-12-06'),
(7, 1, '2010-12-07'), (7, 2, '2010-12-07'), (7, 3, '2010-12-07'), (7, 4, '2010-12-07'), (7, 5, '2010-12-07'),
(8, 1, '2010-12-08'), (8, 2, '2010-12-08'),
(9, 1, '2010-12-09'), (9, 2, '2010-12-09'), (9, 3, '2010-12-09'), (9, 4, '2010-12-09'), (9, 5, '2010-12-09');
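Select the groups of which user u_id = 1 is a member. Then for each group select a maximum of 4 members (excluding user u_id = 1), intended to be ordered by descending apply_date. Note that MySQL leaves the evaluation order of user variables undefined relative to ORDER BY, so which rows receive user_index 1 to 4 can depend on the execution plan; in the output below, groups 1 and 2 keep users 2 to 5 rather than the four most recent applicants (6 to 9):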
SELECT u3.g_id, g.g_name, u3.u_id, u3.apply_date
FROM (
SELECT
u1.g_id,
u1.u_id,
u1.apply_date,
IF( @prev_gid <> u1.g_id, @user_index := 1, @user_index := @user_index + 1 ) AS user_index,
@prev_gid := u1.g_id AS prev_gid
FROM group_user AS u1
JOIN (SELECT @prev_gid := 0, @user_index := NULL) AS vars
JOIN group_user AS u2
ON u2.g_id = u1.g_id
AND u2.u_id = 1
AND u1.u_id <> 1
ORDER BY u1.g_id, u1.apply_date DESC, u1.u_id
) AS u3
JOIN group_name AS g ON g.g_id = u3.g_id
WHERE u3.user_index <= 4
ORDER BY u3.g_id, u3.apply_date DESC, u3.u_id;
+------+---------+------+------------+
| g_id | g_name | u_id | apply_date |
+------+---------+------+------------+
| 1 | Group 1 | 5 | 2010-12-05 |
| 1 | Group 1 | 4 | 2010-12-04 |
| 1 | Group 1 | 3 | 2010-12-03 |
| 1 | Group 1 | 2 | 2010-12-02 |
| 2 | Group 2 | 5 | 2010-12-05 |
| 2 | Group 2 | 4 | 2010-12-04 |
| 2 | Group 2 | 3 | 2010-12-03 |
| 2 | Group 2 | 2 | 2010-12-02 |
| 3 | Group 3 | 9 | 2010-12-09 |
| 3 | Group 3 | 7 | 2010-12-07 |
| 3 | Group 3 | 3 | 2010-12-03 |
| 4 | Group 4 | 9 | 2010-12-09 |
| 4 | Group 4 | 7 | 2010-12-07 |
| 4 | Group 4 | 3 | 2010-12-03 |
| 5 | Group 5 | 9 | 2010-12-09 |
| 5 | Group 5 | 7 | 2010-12-07 |
+------+---------+------+------------+
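As a reference for what "at most N members per group, newest first" should return on this sample data, here is a small deterministic sketch in plain Python. The function name top_members and the memberships dict are invented for the illustration; the rows are the same as in the INSERT above.

```python
from collections import defaultdict

# user id -> groups, matching the sample INSERT; apply_date is 2010-12-<u_id>.
memberships = {
    1: [1, 2, 3, 4, 5], 2: [1, 2], 3: [1, 2, 3, 4], 4: [1, 2], 5: [1, 2],
    6: [1, 2], 7: [1, 2, 3, 4, 5], 8: [1, 2], 9: [1, 2, 3, 4, 5],
}
group_user = [
    (u_id, g_id, f"2010-12-{u_id:02d}")
    for u_id, groups in memberships.items()
    for g_id in groups
]

def top_members(group_user, user_id, limit):
    """For each group of user_id, return up to `limit` other members, newest first."""
    my_groups = {g for u, g, _ in group_user if u == user_id}
    by_group = defaultdict(list)
    for u, g, applied in group_user:
        if g in my_groups and u != user_id:
            by_group[g].append((u, applied))
    result = {}
    for g in sorted(by_group):
        ranked = sorted(by_group[g], key=lambda m: m[0])  # tie-break: u_id ascending
        ranked.sort(key=lambda m: m[1], reverse=True)     # stable sort: newest first
        result[g] = [u for u, _ in ranked[:limit]]
    return result

for g_id, users in top_members(group_user, user_id=1, limit=4).items():
    print(g_id, users)
```

Ranking inside each group before cutting off at the limit is the same idea as a per-group row number, and it does not depend on the order in which rows happen to be scanned.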