PostgreSQL unique index preventing overlapping - postgresql

My table permission looks like:
id serial,
person_id integer,
permission_id integer,
valid_from date,
valid_to date
I'd like to prevent creating permissions whose valid_from/valid_to date ranges overlap,
e.g.
1 | 1 | 1 | 2010-10-01 | 2999-12-31
2 | 1 | 2 | 2010-10-01 | 2020-12-31
3 | 2 | 1 | 2015-10-01 | 2999-12-31
these can be added:
4 | 1 | 3 | 2011-10-01 | 2999-12-31 - because there is no such permission
5 | 2 | 1 | 2011-10-10 | 2999-12-31 - because there is no such person
6 | 1 | 2 | 2021-01-01 | 2999-12-31 - because it doesn't overlap id:2
but these can't:
7 | 1 | 1 | 2009-10-01 | 2010-02-01 - because it overlaps id:1
8 | 1 | 2 | 2019-01-01 | 2022-12-31 - because it overlaps id:2
9 | 2 | 1 | 2010-01-01 | 2016-12-31 - because it overlaps id:3
I can do the check outside the database, but I wonder whether it is possible to do it in the database.

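A unique constraint is based on an equality operator and cannot be used in this case, but you can use an exclusion constraint. The constraint compares the integer columns with the btree operator = inside a GiST index, hence you have to install the btree_gist extension. Note that daterange(valid_from, valid_to) builds a half-open range [valid_from, valid_to), so valid_to itself is not part of the range; use daterange(valid_from, valid_to, '[]') if the end date should be inclusive.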
create extension if not exists btree_gist;
create table permission(
    id serial,
    person_id integer,
    permission_id integer,
    valid_from date,
    valid_to date,
    exclude using gist (
        person_id with =,
        permission_id with =,
        daterange(valid_from, valid_to) with &&)
);
These inserts are successful:
insert into permission values
(1, 1, 1, '2010-10-01', '2999-12-31'),
(2, 1, 2, '2010-10-01', '2020-12-31'),
(3, 2, 1, '2015-10-01', '2999-12-31'),
(4, 1, 3, '2011-10-01', '2999-12-31'),
(5, 3, 1, '2011-10-10', '2999-12-31'), -- you meant person_id = 3 I suppose
(6, 1, 2, '2021-01-01', '2999-12-31'),
(7, 1, 1, '2009-10-01', '2010-02-01'); -- ranges do not overlap!
but this one is not:
insert into permission values
(8, 1, 2, '2019-01-01', '2022-12-31');
ERROR: conflicting key value violates exclusion constraint "permission_person_id_permission_id_daterange_excl"
DETAIL: Key (person_id, permission_id, daterange(valid_from, valid_to))=(1, 2, [2019-01-01,2022-12-31)) conflicts with existing key (person_id, permission_id, daterange(valid_from, valid_to))=(1, 2, [2010-10-01,2020-12-31)).
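To see why row 7 is accepted while row 8 is rejected, here is a minimal Python sketch of the comparison the exclusion constraint performs, assuming the default half-open bounds of daterange. The helpers overlaps and conflicts are hypothetical names used only for illustration; they are not PostgreSQL functions.

```python
from datetime import date

def overlaps(a_from, a_to, b_from, b_to):
    """Mimic daterange(a_from, a_to) && daterange(b_from, b_to) for half-open [from, to) ranges."""
    # An empty range overlaps nothing (PostgreSQL itself raises an error when from > to).
    if a_from >= a_to or b_from >= b_to:
        return False
    return a_from < b_to and b_from < a_to

def conflicts(existing, new):
    """Return the ids of existing rows that the new row would collide with."""
    _, person, perm, v_from, v_to = new
    return [row[0] for row in existing
            if row[1] == person and row[2] == perm
            and overlaps(row[3], row[4], v_from, v_to)]

# (id, person_id, permission_id, valid_from, valid_to)
rows = [
    (1, 1, 1, date(2010, 10, 1), date(2999, 12, 31)),
    (2, 1, 2, date(2010, 10, 1), date(2020, 12, 31)),
]
row7 = (7, 1, 1, date(2009, 10, 1), date(2010, 2, 1))
row8 = (8, 1, 2, date(2019, 1, 1), date(2022, 12, 31))

print(conflicts(rows, row7))  # [] : ends before id 1 starts
print(conflicts(rows, row8))  # [2]: starts before id 2 ends
```

Because the upper bound is exclusive, two ranges that merely touch at an endpoint are not treated as overlapping in this sketch.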

historical aggregation of a column up until a specified time in each row in another column

I have two tables, login_attempts and checkouts, in Amazon Redshift. A user can have multiple (un)successful login attempts and multiple (un)successful checkouts, as shown in this example:
login_attempts
login_id | user_id | login | success
-------------------------------------------------------
1 | 1 | 2021-07-01 14:00:00 | 0
2 | 1 | 2021-07-01 16:00:00 | 1
3 | 2 | 2021-07-02 05:01:01 | 1
4 | 1 | 2021-07-04 03:25:34 | 0
5 | 2 | 2021-07-05 11:20:50 | 0
6 | 2 | 2021-07-07 12:34:56 | 1
and
checkouts
checkout_id | checkout_time | user_id | success
------------------------------------------------------------
1 | 2021-07-01 18:00:00 | 1 | 0
2 | 2021-07-02 06:54:32 | 2 | 1
3 | 2021-07-04 13:00:01 | 1 | 1
4 | 2021-07-08 09:05:00 | 2 | 1
Given this information, how can I get the following table with historical performance included for each checkout AS OF THAT TIME?
checkout_id | checkout | user_id | lastGoodLogin | lastFailedLogin | lastGoodCheckout | lastFailedCheckout |
---------------------------------------------------------------------------------------------------------------------------------------
1 | 2021-07-01 18:00:00 | 1 | 2021-07-01 16:00:00 | 2021-07-01 14:00:00 | NULL | NULL
2 | 2021-07-02 06:54:32 | 2 | 2021-07-02 05:01:01 | NULL | NULL | NULL
3 | 2021-07-04 13:00:01 | 1 | 2021-07-01 16:00:00 | 2021-07-04 03:25:34 | NULL | 2021-07-01 18:00:00
4 | 2021-07-08 09:05:00 | 2 | 2021-07-07 12:34:56 | 2021-07-05 11:20:50 | 2021-07-02 06:54:32 | NULL
Update: I was able to get lastFailedCheckout and lastGoodCheckout because those are window operations on the same table (checkouts), but I don't understand how best to join it with the login_attempts table to get the last[Good|Failed]Login fields.
P.S.: I am open to PostgreSQL suggestions as well.
Good start! A couple of things in your SQL: 1) You should really try to avoid inequality joins, as these can lead to data explosions and aren't needed in this case. Just put a CASE expression inside your window function so that it uses only the type of checkout (or login) you want. 2) You can use the frame clause so that a row does not select itself when finding previous checkouts.
Once you have this pattern you can use it to find the other 2 columns of data you are looking for. The first step is to UNION the tables together, not JOIN them. This means adding a few more columns so the data can live together, but that is easy. Now you have the userid and the time the "thing" happened all in the same data set. You just need to WINDOW 2 more times to pull the info you want. Lastly, you need to strip out the non-checkout rows with an outer select with a WHERE clause.
Like this:
create table login_attempts(
loginid smallint,
userid smallint,
login timestamp,
success smallint
);
create table checkouts(
checkoutid smallint,
userid smallint,
checkout_time timestamp,
success smallint
);
insert into login_attempts values
(1, 1, '2021-07-01 14:00:00', 0),
(2, 1, '2021-07-01 16:00:00', 1),
(3, 2, '2021-07-02 05:01:01', 1),
(4, 1, '2021-07-04 03:25:34', 0),
(5, 2, '2021-07-05 11:20:50', 0),
(6, 2, '2021-07-07 12:34:56', 1)
;
insert into checkouts values
(1, 1, '2021-07-01 18:00:00', 0),
(2, 2, '2021-07-02 06:54:32', 1),
(3, 1, '2021-07-04 13:00:01', 1),
(4, 2, '2021-07-08 09:05:00', 1)
;
SQL:
select *
from (
    select
        c.checkoutid,
        c.userid,
        c.checkout_time,
        max(case success when 0 then checkout_time end) over (
            partition by userid
            order by event_time
            rows between unbounded preceding and 1 preceding
        ) as lastFailedCheckout,
        max(case success when 1 then checkout_time end) over (
            partition by userid
            order by event_time
            rows between unbounded preceding and 1 preceding
        ) as lastGoodCheckout,
        max(case lsuccess when 0 then login end) over (
            partition by userid
            order by event_time
            rows between unbounded preceding and 1 preceding
        ) as lastFailedLogin,
        max(case lsuccess when 1 then login end) over (
            partition by userid
            order by event_time
            rows between unbounded preceding and 1 preceding
        ) as lastGoodLogin
    from (
        select checkout_time as event_time, checkoutid, userid,
               checkout_time, success,
               NULL as login, NULL as lsuccess
        from checkouts
        UNION ALL
        select login as event_time, NULL as checkoutid, userid,
               NULL as checkout_time, NULL as success,
               login, success as lsuccess
        from login_attempts
    ) c
) o
where o.checkoutid is not null
order by o.checkoutid;
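The union-then-window idea above can also be read as a single pass over one merged event stream. The following is a minimal Python sketch of that reading, not a replacement for the SQL; checkout_history is a hypothetical helper, and timestamps are assumed to be ISO-formatted strings so that they sort chronologically.

```python
def checkout_history(logins, checkouts):
    """logins: (login_id, user_id, time, success); checkouts: (checkout_id, user_id, time, success)."""
    # The "UNION ALL" step: one event list, tagged with the table each row came from.
    events = [(t, "login", None, user, ok) for _, user, t, ok in logins]
    events += [(t, "checkout", cid, user, ok) for cid, user, t, ok in checkouts]
    events.sort(key=lambda e: e[0])

    last_seen = {}  # (user, kind, success) -> latest time seen so far
    result = []
    for t, kind, cid, user, ok in events:
        if kind == "checkout":
            # Read the state BEFORE recording this event ("... and 1 preceding").
            result.append((
                cid, t, user,
                last_seen.get((user, "login", 1)),     # lastGoodLogin
                last_seen.get((user, "login", 0)),     # lastFailedLogin
                last_seen.get((user, "checkout", 1)),  # lastGoodCheckout
                last_seen.get((user, "checkout", 0)),  # lastFailedCheckout
            ))
        last_seen[(user, kind, ok)] = t
    # Only checkout rows were emitted, which mirrors the outer WHERE clause.
    return sorted(result, key=lambda r: r[0])

logins = [
    (1, 1, "2021-07-01 14:00:00", 0), (2, 1, "2021-07-01 16:00:00", 1),
    (3, 2, "2021-07-02 05:01:01", 1), (4, 1, "2021-07-04 03:25:34", 0),
    (5, 2, "2021-07-05 11:20:50", 0), (6, 2, "2021-07-07 12:34:56", 1),
]
checkouts = [
    (1, 1, "2021-07-01 18:00:00", 0), (2, 2, "2021-07-02 06:54:32", 1),
    (3, 1, "2021-07-04 13:00:01", 1), (4, 2, "2021-07-08 09:05:00", 1),
]
for row in checkout_history(logins, checkouts):
    print(row)
```

Since the events are processed in time order, the "latest value seen so far" is the same thing the SQL computes with max() over the preceding rows.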

How do I select a postgres Many-to-One relationship as a single row?

I have a many-to-one relationship between Animals and their attributes. Because different Animals have different attributes, I want to be able to select all animals with their attribute name as a column header and NULL values where that animal does not have that attribute.
Like so...
TABLE_ANIMALS
ID | ANIMAL | DATE | MORE COLS....
1 | CAT | 2012-01-10 | ....
2 | DOG | 2012-01-10 | ....
3 | FROG | 2012-01-10 | ....
...
TABLE_ATTRIBUTES
ID | ANIMAL_ID | ATTRIBUTE_NAME | ATTRIBUTE_VALUE
1 | 1 | noise | meow
2 | 1 | legs | 4
3 | 1 | has_fur | TRUE
4 | 2 | noise | woof
5 | 2 | legs | 4
6 | 3 | noise | croak
7 | 3 | legs | 2
8 | 3 | has_fur | FALSE
...
QUERY RESULT
ID | ANIMAL | NOISE | LEGS | HAS_FUR
1 | CAT | meow | 4 | TRUE
2 | DOG | woof | 4 | NULL
3 | FROG | croak | 2 | FALSE
How would I do this? To reiterate, it's important that all the columns are there even if one Animal doesn't have that attribute, such as "DOG" and "HAS_FUR" in this example. If it doesn't have the attribute, it should just be null.
How about a simple join, aggregation and group by?
create table table_animals(id int, animal varchar(10), date date);
create table table_attributes(id int, animal_id int, attribute_name varchar(10), attribute_value varchar(10));
insert into table_animals values (1, 'CAT', '2012-01-10'),
(2, 'DOG', '2012-01-10'),
(3, 'FROG', '2012-01-10');
insert into table_attributes values (1, 1, 'noise', 'meow'),
(2, 1, 'legs', 4),
(3, 1, 'has_fur', TRUE),
(4, 2, 'noise', 'woof'),
(5, 2, 'legs', 4),
(6, 3, 'noise', 'croak'),
(7, 3, 'legs', 2),
(8, 3, 'has_fur', FALSE);
select ta.id,
       ta.animal,
       max(attribute_value) filter (where attribute_name = 'noise') as noise,
       max(attribute_value) filter (where attribute_name = 'legs') as legs,
       max(attribute_value) filter (where attribute_name = 'has_fur') as has_fur
from table_animals ta
left join table_attributes tat on tat.animal_id = ta.id
group by ta.id, ta.animal
order by ta.id;
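Alternatively, you can write the aggregation as MAX(CASE WHEN ... END), which also works on PostgreSQL versions older than 9.4 where FILTER is not available, but the FILTER form is more readable and typically performs at least as well.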
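For readers who find the join-and-aggregate pivot easier to follow procedurally, here is a minimal Python sketch of the same reshaping. The function pivot and its columns parameter are hypothetical; unlike the SQL, which keeps max() when an attribute repeats, this sketch simply keeps the last value seen.

```python
def pivot(animals, attributes, columns=("noise", "legs", "has_fur")):
    """animals: (id, name); attributes: (animal_id, attribute_name, attribute_value)."""
    # Start every animal with all columns set to None, like the LEFT JOIN
    # that keeps animals even when an attribute is missing.
    rows = {aid: {"id": aid, "animal": name, **{c: None for c in columns}}
            for aid, name in animals}
    for animal_id, name, value in attributes:
        if animal_id in rows and name in columns:
            rows[animal_id][name] = value
    return [rows[aid] for aid in sorted(rows)]

animals = [(1, "CAT"), (2, "DOG"), (3, "FROG")]
attributes = [
    (1, "noise", "meow"), (1, "legs", "4"), (1, "has_fur", "TRUE"),
    (2, "noise", "woof"), (2, "legs", "4"),
    (3, "noise", "croak"), (3, "legs", "2"), (3, "has_fur", "FALSE"),
]
for row in pivot(animals, attributes):
    print(row)
```

The list of output columns has to be known up front, which is the same limitation the SQL version has: each pivoted column is written out by hand.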

PostgreSQL, two windowing functions at once

I have a typical table with data, say mytemptable.
DROP TABLE IF EXISTS mytemptable;
CREATE TEMP TABLE mytemptable
(mydate date, somedoc text, inqty int, outqty int);
INSERT INTO mytemptable (mydate, somedoc, inqty, outqty)
VALUES ('01.01.2016.', '123-13-24', 3, 0),
('04.01.2016.', '15-19-44', 2, 0),
('06.02.2016.', '15-25-21', 0, 1),
('04.01.2016.', '21-133-12', 0, 1),
('04.01.2016.', '215-11-51', 0, 2),
('05.01.2016.', '11-181-01', 0, 1),
('05.02.2016.', '151-80-8', 4, 0),
('04.01.2016.', '215-11-51', 0, 2),
('07.02.2016.', '34-02-02', 0, 2);
SELECT row_number() OVER(ORDER BY mydate) AS rn,
mydate, somedoc, inqty, outqty,
SUM(inqty-outqty) OVER(ORDER BY mydate) AS csum
FROM mytemptable
ORDER BY mydate;
In my SELECT query I try to order the result by date and add row numbers ('rn') and a cumulative (running) sum ('csum'), unsuccessfully of course.
I believe this is because I use two window functions in the query, which conflict in some way.
How can I write this query so that it is fast, well ordered, and produces the proper result in the 'csum' column (3, 5, 4, 2, 0, -1, 3, 2, 0)?
Since there is an ordering tie at 2016-04-01, all rows sharing that date get the same value: the sum accumulated through the last of the tied rows. If you want a different value per row, add tie-breaking columns to the ORDER BY.
From the manual:
There is another important concept associated with window functions: for each row, there is a set of rows within its partition called its window frame. Many (but not all) window functions act only on the rows of the window frame, rather than of the whole partition. By default, if ORDER BY is supplied then the frame consists of all rows from the start of the partition up through the current row, plus any following rows that are equal to the current row according to the ORDER BY clause. When ORDER BY is omitted the default frame consists of all rows in the partition
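Without a tie-breaking column you can use the generated row number in an outer query (note that row_number() numbers tied rows in an arbitrary order, so the order within one date is not guaranteed):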
set datestyle = 'dmy';
with mytemptable (mydate, somedoc, inqty, outqty) as (
values
('01-01-2016'::date, '123-13-24', 3, 0),
('04-01-2016', '15-19-44', 2, 0),
('06-02-2016', '15-25-21', 0, 1),
('04-01-2016', '21-133-12', 0, 1),
('04-01-2016', '215-11-51', 0, 2),
('05-01-2016', '11-181-01', 0, 1),
('05-02-2016', '151-80-8', 4, 0),
('04-01-2016', '215-11-51', 0, 2),
('07-02-2016', '34-02-02', 0, 2)
)
select *, sum(inqty-outqty) over(order by mydate, rn) as csum
from (
select
row_number() over(order by mydate) as rn,
mydate, somedoc, inqty, outqty
from mytemptable
) s
order by mydate;
rn | mydate | somedoc | inqty | outqty | csum
----+------------+-----------+-------+--------+------
1 | 2016-01-01 | 123-13-24 | 3 | 0 | 3
2 | 2016-04-01 | 15-19-44 | 2 | 0 | 5
3 | 2016-04-01 | 21-133-12 | 0 | 1 | 4
4 | 2016-04-01 | 215-11-51 | 0 | 2 | 2
5 | 2016-04-01 | 215-11-51 | 0 | 2 | 0
6 | 2016-05-01 | 11-181-01 | 0 | 1 | -1
7 | 2016-05-02 | 151-80-8 | 4 | 0 | 3
8 | 2016-06-02 | 15-25-21 | 0 | 1 | 2
9 | 2016-07-02 | 34-02-02 | 0 | 2 | 0
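The effect of the default frame on tied rows can be reproduced outside the database. This is a minimal Python sketch, assuming the rows are already sorted by date and using small integers as stand-ins for the dates; running_sum_rows and running_sum_range are hypothetical helper names.

```python
from itertools import groupby

def running_sum_rows(rows):
    """ROWS-style frame: each row sees only the rows up to and including itself."""
    total, out = 0, []
    for _, delta in rows:
        total += delta
        out.append(total)
    return out

def running_sum_range(rows):
    """Default RANGE-style frame: tied rows (peers) all see the whole peer group."""
    total, out = 0, []
    for _, peers in groupby(rows, key=lambda r: r[0]):
        peers = list(peers)
        total += sum(delta for _, delta in peers)
        out.extend([total] * len(peers))
    return out

# (sort key standing in for mydate, inqty - outqty); the four rows with key 2 are tied
rows = [(1, 3), (2, 2), (2, -1), (2, -2), (2, -2), (3, -1), (4, 4), (5, -1), (6, -2)]

print(running_sum_range(rows))  # [3, 0, 0, 0, 0, -1, 3, 2, 0]
print(running_sum_rows(rows))   # [3, 5, 4, 2, 0, -1, 3, 2, 0]
```

Ordering by (mydate, rn) makes every sort key unique, so each peer group has exactly one row and the two variants agree.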

Grouping by unique values inside a JSONB array

Consider the following table structure:
CREATE TABLE residences (id int, price int, categories jsonb);
INSERT INTO residences VALUES
(1, 3, '["monkeys", "hamsters", "foxes"]'),
(2, 5, '["monkeys", "hamsters", "foxes", "foxes"]'),
(3, 7, '[]'),
(4, 11, '["turtles"]');
SELECT * FROM residences;
id | price | categories
----+-------+-------------------------------------------
1 | 3 | ["monkeys", "hamsters", "foxes"]
2 | 5 | ["monkeys", "hamsters", "foxes", "foxes"]
3 | 7 | []
4 | 11 | ["turtles"]
Now I would like to know how many residences there are for each category, as well as the sum of their prices. The only way I found to do this was with a sub-query:
SELECT category, SUM(price), COUNT(*) AS residences_no
FROM
    residences a,
    (
        SELECT DISTINCT(jsonb_array_elements(categories)) AS category
        FROM residences
    ) b
WHERE a.categories @> category
GROUP BY category
ORDER BY category;
category | sum | residences_no
------------+-----+---------------
"foxes" | 8 | 2
"hamsters" | 8 | 2
"monkeys" | 8 | 2
"turtles" | 11 | 1
Using jsonb_array_elements without subquery would return three residences for foxes because of the duplicate entry in the second row. Also the price of the residence would be inflated by 5.
Is there any way to do this without using the sub-query, or any better way to accomplish this result?
EDIT
Initially I did not mention the price column.
select category, count(distinct (id, category))
from residences, jsonb_array_elements(categories) category
group by category
order by category;
category | count
------------+-------
"foxes" | 2
"hamsters" | 2
"monkeys" | 2
"turtles" | 1
(4 rows)
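You have to use a derived table to aggregate another column. The output below was produced with every price set to 10; with the sample data from the question the totals would be 8, 8, 8 and 11: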
select category, count(*), sum(price) total
from (
select distinct id, category, price
from residences, jsonb_array_elements(categories) category
) s
group by category
order by category;
category | count | total
------------+-------+-------
"foxes" | 2 | 20
"hamsters" | 2 | 20
"monkeys" | 2 | 20
"turtles" | 1 | 10
(4 rows)
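The key step in the derived table is removing duplicate categories within a single residence before aggregating. A minimal Python sketch of that step, using the prices from the question; per_category is a hypothetical helper name.

```python
import json
from collections import defaultdict

def per_category(residences):
    """residences: iterable of (id, price, categories_json)."""
    stats = defaultdict(lambda: [0, 0])  # category -> [residence count, total price]
    for _id, price, categories_json in residences:
        # set() drops duplicates inside one row, like SELECT DISTINCT id, category, price
        for category in set(json.loads(categories_json)):
            stats[category][0] += 1
            stats[category][1] += price
    return {c: tuple(v) for c, v in sorted(stats.items())}

residences = [
    (1, 3, '["monkeys", "hamsters", "foxes"]'),
    (2, 5, '["monkeys", "hamsters", "foxes", "foxes"]'),
    (3, 7, '[]'),
    (4, 11, '["turtles"]'),
]
print(per_category(residences))
# {'foxes': (2, 8), 'hamsters': (2, 8), 'monkeys': (2, 8), 'turtles': (1, 11)}
```

Without the set() call, residence 2 would be counted twice for "foxes" and its price of 5 would be added twice.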

Select statement with join, or subquery limit

For a few days now I've been trying to solve this problem.
I have the tables group_user and group_name.
What I want to do is select a user's groups, then the description of each group (from group_name), and 10 other users from each group.
The first two are not a problem. The problem is that I can't find a way to limit the users.
I can select the user's groups and the other users in those groups, but I don't know how to limit them.
Using:
SELECT a.g_id,b.group,b.userid
FROM group_user AS a
RIGHT JOIN
(SELECT g_id as group, u_id as userid FROM group_user) AS b ON a.g_id=b.group
WHERE u_id=112
it shows me my user's groups and the users in those groups. But when I try to add a limit in the subquery, it limits the whole result, not each particular group.
I tried selecting users with IN over the groups of my user, without luck.
I was thinking GROUP BY and HAVING might help, but I can't see how I could use them.
So my question is: how can I limit a subquery result in MySQL when the subquery is built on the result of the outer query?
I think I'm overloaded and maybe I'm missing something.
UPDATE: to show what I really want to accomplish, here's another piece of code.
SELECT g_id FROM group_user WHERE user_id = 112
So I get all the groups that the user is in. Say each result of that select is a variable extra_group; the second query would then be
SELECT u_id FROM group_user WHERE group_id = extra_group LIMIT 10
I need to do same as above, in one query.
Another UPDATE after MIKE's post.
I should add that a user can be in more than one group. So I think the real problem is that I don't have any clue how to select those groups and, in the same query, select 10 users for each selected group, so that the result could be
g_id u_id
1 | 2
1 | 3
1 | 4
3 | 3
3 | 8
where g_id is the user's groups from this query:
SELECT g_id FROM group_user WHERE user_id = 112
Create sample tables and add data:
CREATE TABLE `group_user` (
`u_id` int(11) DEFAULT NULL,
`g_id` int(11) DEFAULT NULL,
`apply_date` date DEFAULT NULL
);
CREATE TABLE `group_name` (
`g_id` int(11) DEFAULT NULL,
`g_name` varchar(255) DEFAULT NULL
);
INSERT INTO `group_name` VALUES
(1, 'Group 1'), (2, 'Group 2'), (3, 'Group 3'), (4, 'Group 4'), (5, 'Group 5');
INSERT INTO `group_user` VALUES
(1, 1, '2010-12-01'), (1, 2, '2010-12-01'), (1, 3, '2010-12-01'), (1, 4, '2010-12-01'), (1, 5, '2010-12-01'),
(2, 1, '2010-12-02'), (2, 2, '2010-12-02'),
(3, 1, '2010-12-03'), (3, 2, '2010-12-03'), (3, 3, '2010-12-03'), (3, 4, '2010-12-03'),
(4, 1, '2010-12-04'), (4, 2, '2010-12-04'),
(5, 1, '2010-12-05'), (5, 2, '2010-12-05'),
(6, 1, '2010-12-06'), (6, 2, '2010-12-06'),
(7, 1, '2010-12-07'), (7, 2, '2010-12-07'), (7, 3, '2010-12-07'), (7, 4, '2010-12-07'), (7, 5, '2010-12-07'),
(8, 1, '2010-12-08'), (8, 2, '2010-12-08'),
(9, 1, '2010-12-09'), (9, 2, '2010-12-09'), (9, 3, '2010-12-09'), (9, 4, '2010-12-09'), (9, 5, '2010-12-09');
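Select the groups of which user u_id == 1 is a member. Then for each group select a maximum of 4 members (excluding user u_id == 1), ordered by descending apply_date. Note that this technique relies on the evaluation order of user variables, which MySQL does not guarantee; on MySQL 8.0 or later, ROW_NUMBER() OVER (PARTITION BY g_id ORDER BY ...) is the more robust way to number rows per group.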
SELECT u3.g_id, g.g_name, u3.u_id, u3.apply_date
FROM (
    SELECT
        u1.g_id,
        u1.u_id,
        u1.apply_date,
        IF( @prev_gid <> u1.g_id, @user_index := 1, @user_index := @user_index + 1 ) AS user_index,
        @prev_gid := u1.g_id AS prev_gid
    FROM group_user AS u1
    JOIN (SELECT @prev_gid := 0, @user_index := NULL) AS vars
    JOIN group_user AS u2
        ON u2.g_id = u1.g_id
        AND u2.u_id = 1
        AND u1.u_id <> 1
    ORDER BY u1.g_id, u1.apply_date DESC, u1.u_id
) AS u3
JOIN group_name AS g ON g.g_id = u3.g_id
WHERE u3.user_index <= 4
ORDER BY u3.g_id, u3.apply_date DESC, u3.u_id;
+------+---------+------+------------+
| g_id | g_name | u_id | apply_date |
+------+---------+------+------------+
| 1 | Group 1 | 5 | 2010-12-05 |
| 1 | Group 1 | 4 | 2010-12-04 |
| 1 | Group 1 | 3 | 2010-12-03 |
| 1 | Group 1 | 2 | 2010-12-02 |
| 2 | Group 2 | 5 | 2010-12-05 |
| 2 | Group 2 | 4 | 2010-12-04 |
| 2 | Group 2 | 3 | 2010-12-03 |
| 2 | Group 2 | 2 | 2010-12-02 |
| 3 | Group 3 | 9 | 2010-12-09 |
| 3 | Group 3 | 7 | 2010-12-07 |
| 3 | Group 3 | 3 | 2010-12-03 |
| 4 | Group 4 | 9 | 2010-12-09 |
| 4 | Group 4 | 7 | 2010-12-07 |
| 4 | Group 4 | 3 | 2010-12-03 |
| 5 | Group 5 | 9 | 2010-12-09 |
| 5 | Group 5 | 7 | 2010-12-07 |
+------+---------+------+------------+
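The underlying "top N rows per group" logic can be sketched independently of MySQL user variables. The Python below is a minimal illustration under stated assumptions: other_members is a hypothetical helper, and the memberships list is made-up sample data, not the rows inserted above.

```python
from collections import defaultdict

def other_members(memberships, user, limit):
    """memberships: (u_id, g_id, apply_date) tuples with ISO date strings."""
    my_groups = {g for u, g, _ in memberships if u == user}
    by_group = defaultdict(list)
    for u, g, applied in memberships:
        if g in my_groups and u != user:
            by_group[g].append((u, applied))
    result = []
    for g in sorted(by_group):
        # Newest apply_date first, ties broken by u_id (two stable sorts).
        ranked = sorted(by_group[g], key=lambda m: m[0])
        ranked.sort(key=lambda m: m[1], reverse=True)
        # The slice plays the role of "user_index <= limit".
        result.extend((g, u, applied) for u, applied in ranked[:limit])
    return result

# Hypothetical data: user 1 is in groups 1 and 2, but not in group 3.
memberships = [
    (1, 1, "2010-12-01"), (2, 1, "2010-12-02"), (3, 1, "2010-12-03"), (4, 1, "2010-12-04"),
    (1, 2, "2010-12-01"), (3, 2, "2010-12-03"),
    (5, 3, "2010-12-05"),
]
print(other_members(memberships, user=1, limit=2))
# [(1, 4, '2010-12-04'), (1, 3, '2010-12-03'), (2, 3, '2010-12-03')]
```

The limit is applied separately inside each group, which is exactly what a single LIMIT on the subquery cannot express.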