Return Only Duplicate Rows with PostgreSQL

I have a bookings table and I am trying to show the unique guests who booked the same room.
booking_id | check_in_date | check_out_date | guest_id | room_number
------------+---------------+----------------+----------+-------------
1001 | 2018-01-01 | 2018-01-03 | 1 | 702
1002 | 2018-01-01 | 2018-01-05 | 2 | 1104
1003 | 2018-01-05 | 2018-01-07 | 4 | 509
1004 | 2018-01-05 | 2018-01-07 | 4 | 511
1005 | 2018-01-07 | 2018-01-08 | 2 | 404
1006 | 2018-01-07 | 2018-01-09 | 1 | 1104
1007 | 2018-01-10 | 2018-01-12 | 2 | 509
1008 | 2018-01-15 | 2018-01-18 | 6 | 404
1009 | 2018-01-15 | 2018-01-18 | 6 | 406
1010 | 2018-01-15 | 2018-01-17 | 4 | 511
1011 | 2018-01-20 | 2018-01-22 | 2 | 509
1012 | 2018-01-23 | 2018-01-25 | 4 | 511
So I'm attempting to return:
room_number 404 (room booked by unique guest_id 2 & 6)
room_number 509 (room booked by unique guest_id 2 & 4)
room_number 1104 (room booked by unique guest_id 1 & 2)
The closest I got was with this statement:
SELECT room_number, guest_id, COUNT(room_number)
FROM bookings
GROUP BY room_number, guest_id
ORDER BY room_number;
Which returns:
room_number | guest_id | count
-------------+----------+-------
404 | 2 | 1
404 | 6 | 1
406 | 6 | 1
509 | 4 | 1
509 | 2 | 2
511 | 4 | 3
702 | 1 | 1
1104 | 2 | 1
1104 | 1 | 1
I need to remove the room_number values that appear only once in this result (406, 511 and 702), since each of them was booked by a single guest.

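Try this. The subquery keeps only the rooms booked by more than one distinct guest; the join then returns every booking for those rooms, with count being the total number of bookings per room: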
select a.room_number,a.guest_id, b.count
FROM bookings a JOIN
(
SELECT room_number, count (*) as count
FROM bookings
GROUP BY room_number
HAVING count ( DISTINCT guest_id ) > 1
) b ON a.room_number = b.room_number
ORDER BY room_number
Results:
| room_number | guest_id | count |
|-------------|----------|-------|
| 404 | 6 | 2 |
| 404 | 2 | 2 |
| 509 | 2 | 3 |
| 509 | 4 | 3 |
| 509 | 2 | 3 |
| 1104 | 2 | 2 |
| 1104 | 1 | 2 |
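The join above repeats a guest once per booking (guest 2 shows up twice for room 509). If one row per room is enough, a possible variant is to aggregate the distinct guests directly. This is only a sketch against the bookings table from the question; the guest_ids alias is a name chosen here for illustration.

```sql
-- One row per room that was booked by more than one distinct guest.
-- guest_ids is an illustrative alias, not a column of the original table.
SELECT room_number,
       array_agg(DISTINCT guest_id ORDER BY guest_id) AS guest_ids
FROM bookings
GROUP BY room_number
HAVING count(DISTINCT guest_id) > 1
ORDER BY room_number;

-- With the sample data from the question this gives:
--  404  | {2,6}
--  509  | {2,4}
--  1104 | {1,2}
```

If one row per (room, guest) pair is preferred instead, adding DISTINCT to the outer SELECT of the join query and dropping the count column should have a similar effect.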

Maybe you can try something like this. (This query works in SQL Server; replace Resultados with your own table name.)
SELECT a_tabla.guest_id, COUNT(a_tabla.room_number)
FROM Resultados a_tabla
INNER JOIN (SELECT new_table.guest_id
FROM Resultados new_table
GROUP BY new_table.guest_id
HAVING COUNT(*) > 1) R ON (a_tabla.guest_id = R.guest_id) -- join on the subquery alias R
GROUP BY a_tabla.guest_id -- required because of COUNT() in the select list
ORDER BY a_tabla.guest_id
This query returns each guest_id that has more than 1 row (booking); if you change 1 to 2, it returns each guest_id that has more than 2.
I hope this is helpful.

Related

PostgREST: retrieve ranked results

I made a game, with levels and scores saved in an SQL table like this:
create table if not exists api.scores (
id serial primary key,
pseudo varchar(50),
level int,
score int,
created_at timestamptz default CURRENT_TIMESTAMP
);
I want to display the scores in the UI together with the rank of each score, based on the score column in descending order.
Here is some sample data:
id | pseudo | level | score | created_at
----+----------+-------+-------+-------------------------------
1 | test | 1 | 1 | 2020-05-01 11:25:20.446402+02
2 | test | 1 | 1 | 2020-05-01 11:28:11.04001+02
3 | szef | 1 | 115 | 2020-05-01 15:45:06.201135+02
4 | erg | 1 | 115 | 2020-05-01 15:55:19.621372+02
5 | zef | 1 | 115 | 2020-05-01 16:14:09.718861+02
6 | aa | 1 | 115 | 2020-05-01 16:16:49.369718+02
7 | zesf | 1 | 115 | 2020-05-01 16:17:42.504354+02
8 | zesf | 2 | 236 | 2020-05-01 16:18:07.070728+02
9 | zef | 1 | 115 | 2020-05-01 16:22:23.406013+02
10 | zefzef | 1 | 115 | 2020-05-01 16:23:49.720094+02
Here is what I want :
id | pseudo | level | score | created_at | rank
----+----------+-------+-------+-------------------------------+------
31 | zef | 7 | 730 | 2020-05-01 18:40:42.586224+02 | 1
50 | Cyprien | 5 | 588 | 2020-05-02 14:08:39.034112+02 | 2
49 | cyprien | 4 | 438 | 2020-05-01 23:35:13.440595+02 | 3
51 | Cyprien | 3 | 374 | 2020-05-02 14:13:41.071752+02 | 4
47 | cyprien | 3 | 337 | 2020-05-01 23:27:53.025475+02 | 5
45 | balek | 3 | 337 | 2020-05-01 19:57:39.888233+02 | 5
46 | cyprien | 3 | 337 | 2020-05-01 23:25:56.047495+02 | 5
48 | cyprien | 3 | 337 | 2020-05-01 23:28:54.190989+02 | 5
54 | Cyzekfj | 2 | 245 | 2020-05-02 14:14:34.830314+02 | 9
8 | zesf | 2 | 236 | 2020-05-01 16:18:07.070728+02 | 10
13 | zef | 1 | 197 | 2020-05-01 16:28:59.95383+02 | 11
14 | azd | 1 | 155 | 2020-05-01 17:53:30.372793+02 | 12
38 | balek | 1 | 155 | 2020-05-01 19:08:57.622195+02 | 12
I want to retrieve the rank computed over the full table, whatever the result set is.
I'm using the PostgREST web server.
How do I do that?
You are describing window function rank():
select t.*, rank() over(order by score desc) rnk
from mytable t
order by score desc
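Since the question mentions PostgREST, one way to make the rank available over HTTP might be to wrap the query in a view in the exposed schema. The following is a sketch only: the view name api.ranked_scores is an assumption made here, and the role that PostgREST uses still needs SELECT permission on it.

```sql
-- Hypothetical view name; api.scores is the table from the question.
-- rank() is evaluated over the whole table inside the view, so a filter
-- applied to the view afterwards should not change the rank values.
CREATE OR REPLACE VIEW api.ranked_scores AS
SELECT s.*,
       rank() OVER (ORDER BY s.score DESC) AS rank
FROM api.scores s;
```

Assuming the api schema is the one PostgREST exposes, a request such as GET /ranked_scores?pseudo=eq.zef&order=rank.asc would then return only the rows for that pseudo, each with its rank within the full table.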

How to sum different values without duplicate rows

How can I sum different values for the same ID, without getting a separate row for each distinct value in a column?
My SQL query:
SELECT
students.id AS student_id,
students.name,
COUNT(*) AS enrolled,
c2.price AS course_price,
(COUNT(*) * price) AS paid
FROM students
LEFT JOIN enrolls e on students.id = e.student_id
LEFT JOIN courses c2 on e.course_id = c2.id
WHERE student_id NOTNULL
GROUP BY students.id, students.name, c2.price
ORDER BY student_id ASC;
My result.
student_id | name | enrolled | paid
------------+---------------------+----------+------
1001 | Gulbadan Bálint | 1 | 90
1002 | Hanna Adair | 5 | 450
1003 | Taddeo Bhattacharya | 1 | 90
1004 | Persis Havlíček | 1 | 75
1004 | Persis Havlíček | 5 | 450
1005 | Tory Bateson | 1 | 90
1007 | Dávid Fèvre | 1 | 90
1008 | Masuyo Stoddard | 1 | 90
1009 | Iiris Levitt | 1 | 75
1009 | Iiris Levitt | 2 | 180
1013 | Artair Kovač | 1 | 30
1013 | Artair Kovač | 1 | 90
1015 | Matilda Guinness | 2 | 180
1017 | Margarita Ek | 1 | 90
1018 | Misti Zima | 3 | 270
1019 | Conall Ventura | 1 | 90
1020 | Vivian Monday | 2 | 180
My expected result.
student_id | name | enrolled | paid
------------+---------------------+----------+------
1001 | Gulbadan Bálint | 1 | 90
1002 | Hanna Adair | 5 | 450
1003 | Taddeo Bhattacharya | 1 | 90
1004 | Persis Havlíček | 6 | 525
1005 | Tory Bateson | 1 | 90
1007 | Dávid Fèvre | 1 | 90
1008 | Masuyo Stoddard | 1 | 90
1009 | Iiris Levitt | 3 | 255
1013 | Artair Kovač | 2 | 120
1015 | Matilda Guinness | 2 | 180
1017 | Margarita Ek | 1 | 90
1018 | Misti Zima | 3 | 270
1019 | Conall Ventura | 1 | 90
1020 | Vivian Monday | 2 | 180
I think the cause is the GROUP BY clause, but the query throws an error if I do not include price in the GROUP BY.
Perhaps you can use the SUM() function.
See the question linked below; it may be the same case as yours:
how to group by and return sum row in Postgres
You have left out the course_price column in both your current and your expected result. It seems you included it in the GROUP BY by mistake. Once price is no longer a grouping column, it has to be aggregated, so the amount paid becomes SUM(c2.price) instead of COUNT(*) * price.
SELECT
students.id AS student_id,
students.name,
COUNT(*) AS enrolled,
--c2.price AS course_price, --exclude this in o/p?
SUM(c2.price) AS paid --sum of the prices of all enrolled courses
FROM students
LEFT JOIN enrolls e on students.id = e.student_id
LEFT JOIN courses c2 on e.course_id = c2.id
WHERE e.student_id IS NOT NULL
GROUP BY students.id, students.name --,c2.price --and remove it from here
ORDER BY student_id ASC;

If a user's record in column x is not null, how do I count how many records that user has after the first time it is not null?

I would like to create a count per user of number of records after the first time that x is not null for that user.
I have a table that is similar to the following:
id | user_id | completed_at | x
----+---------+--------------+---
1 | 1001 | 2017-06-01 | 1
20 | 1001 | 2017-06-01 | 2
21 | 1001 | 2017-06-02 | 4
22 | 1001 | 2017-06-03 |
24 | 1001 | 2017-06-03 |
25 | 1001 | 2017-06-04 |
23 | 1001 | 2017-06-04 |
12 | 1001 | 2017-06-06 |
13 | 1001 | 2017-06-07 |
14 | 1001 | 2017-06-08 |
2 | 1002 | 2017-06-02 | 3
27 | 1002 | 2017-06-02 | 7
15 | 1002 | 2017-06-09 |
3 | 1003 | 2017-06-03 |
4 | 1004 | 2017-06-04 |
5 | 1005 | 2017-06-05 |
33 | 1005 | 2017-06-20 | 8
34 | 1006 | 2017-07-10 | 9
6 | 1006 | 2017-10-06 |
7 | 1007 | 2017-10-07 |
8 | 1008 | 2017-10-08 |
9 | 1009 | 2017-10-09 |
10 | 1010 | 2017-10-10 |
16 | 1011 | 2017-06-01 |
11 | 1011 | 2017-07-01 | 5
17 | 1012 | 2017-06-02 |
26 | 1012 | 2017-07-02 | 6
18 | 1013 | 2017-06-03 |
19 | 1014 | 2017-06-04 |
31 | 1014 | 2017-06-24 |
32 | 1014 | 2017-06-24 |
30 | 1014 | 2017-06-24 |
29 | 1014 | 2017-06-24 |
28 | 1014 | 2017-06-24 |
The expected output would look like this:
+------+------------+---------------+
| user | first_x | records_after |
+------+------------+---------------+
| 1001 | 2017-06-01 | 9 |
| 1002 | 2017-06-02 | 2 |
| 1005 | 2017-06-20 | 0 |
| 1011 | 2017-07-01 | 0 |
| 1012 | 2017-07-02 | 0 |
+------+------------+---------------+
Use a running count of the non-null x values per user, then a conditional count of the rows where that running count is > 0.
Sample:
WITH flags AS (
SELECT
user_id,
completed_at,
sum(CASE WHEN x IS NULL THEN 0 ELSE 1 END) OVER (PARTITION BY user_id ORDER BY completed_at ROWS BETWEEN UNBOUNDED PRECEDING AND 0 FOLLOWING) AS flag
FROM users
),
completed AS (
SELECT DISTINCT ON (user_id)
user_id,
completed_at AS first_x
FROM flags
WHERE flag > 0
ORDER BY user_id, completed_at
)
SELECT DISTINCT
user_id AS user,
first_x,
count(flag) FILTER (WHERE flag>0) - 1 AS records_after
FROM flags
NATURAL JOIN completed
GROUP BY 1, 2
ORDER BY 1
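A different formulation, without window functions, is sketched below. It is not the query above rewritten: it first finds the earliest completed_at with a non-null x per user and then counts that user's rows on or after that date. It assumes the same users table name as the sample query, and it treats all rows sharing the first date as "on or after", which may differ from the running-count version when several rows fall on the same day.

```sql
-- first_x: earliest date per user on which x is not null
WITH first_x AS (
    SELECT user_id, min(completed_at) AS first_x
    FROM users
    WHERE x IS NOT NULL
    GROUP BY user_id
)
SELECT u.user_id,
       f.first_x,
       count(*) - 1 AS records_after   -- subtract the first non-null row itself
FROM users u
JOIN first_x f USING (user_id)
WHERE u.completed_at >= f.first_x
GROUP BY u.user_id, f.first_x
ORDER BY u.user_id;
```

Users that never have a non-null x drop out because of the inner join, as they do in the expected output.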

Select rows so that each value of one column is repeated at most N times

My table is:
id | sub_id | datetime | resource
---|--------|------------|---------
1 | 10 | 04/03/2009 | 399
2 | 11 | 04/03/2009 | 244
3 | 10 | 04/03/2009 | 555
4 | 10 | 03/03/2009 | 300
5 | 11 | 03/03/2009 | 200
6 | 11 | 03/03/2009 | 500
7 | 11 | 24/12/2008 | 600
8 | 13 | 01/01/2009 | 750
9 | 10 | 01/01/2009 | 760
10 | 13 | 01/01/2009 | 570
11 | 11 | 01/01/2009 | 870
12 | 13 | 01/01/2009 | 670
13 | 13 | 01/01/2009 | 703
14 | 13 | 01/01/2009 | 705
I need to select at most 2 rows for each sub_id.
The result would be:
id | sub_id | datetime | resource
---|--------|------------|---------
1 | 10 | 04/03/2009 | 399
3 | 10 | 04/03/2009 | 555
5 | 11 | 03/03/2009 | 200
6 | 11 | 03/03/2009 | 500
8 | 13 | 01/01/2009 | 750
10 | 13 | 01/01/2009 | 570
How can I achieve this result in Postgres?
Use the window function row_number():
select id, sub_id, datetime, resource
from (
select *, row_number() over (partition by sub_id order by id)
from my_table
) s
where row_number < 3;
Pay attention to the ORDER BY column in the window (I use id to match your sample):
t=# with data as (select *,count(1) over (partition by sub_id order by id) from t)
select id,sub_id,datetime,resource from data where count <3;
id | sub_id | datetime | resource
----+--------+------------+----------
1 | 10 | 2009-03-04 | 399
3 | 10 | 2009-03-04 | 555
2 | 11 | 2009-03-04 | 244
5 | 11 | 2009-03-03 | 200
8 | 13 | 2009-01-01 | 750
10 | 13 | 2009-01-01 | 570
(6 rows)

Rank based on row number SQL Server 2008 R2

I want to rank my table data in groups by row count. The first 12 rows, ordered by date, for each ProductID would get value = 1. The next 12 rows would get value = 2, and so on.
How the table structure looks:
For ProductID = 1267, the associated dates are below:
02-01-2016
03-01-2016
.
. (skipping months..table has one date per month)
.
12-01-2016
02-01-2017
.
.
.
02-01-2018
Use row_number() over() with some arithmetic to calculate groups of 12 ordered by date (per productid). Change the sort to ASCending or DESCending to suit your needs.
select *
, (11 + row_number() over(partition by productid order by somedate DESC)) / 12 as rnk
from mytable
GO
myTableID | productid | somedate | rnk
--------: | :------------- | :------------------ | :--
9 | 123456 | 2018-11-12 08:24:25 | 1
8 | 123456 | 2018-10-02 12:29:04 | 1
7 | 123456 | 2018-09-09 02:39:30 | 1
2 | 123456 | 2018-09-02 08:49:37 | 1
1 | 123456 | 2018-07-04 12:25:06 | 1
5 | 123456 | 2018-06-06 11:38:50 | 1
12 | 123456 | 2018-05-23 21:12:03 | 1
18 | 123456 | 2018-04-02 03:59:16 | 1
3 | 123456 | 2018-01-02 03:42:24 | 1
17 | 123456 | 2017-11-29 03:19:32 | 1
10 | 123456 | 2017-11-10 00:45:41 | 1
13 | 123456 | 2017-11-05 09:53:38 | 1
16 | 123456 | 2017-10-20 15:39:42 | 2
4 | 123456 | 2017-10-14 19:25:30 | 2
20 | 123456 | 2017-09-21 21:31:06 | 2
6 | 123456 | 2017-04-06 22:10:58 | 2
14 | 123456 | 2017-03-24 23:35:52 | 2
19 | 123456 | 2017-01-22 05:07:23 | 2
11 | 123456 | 2016-12-13 19:17:08 | 2
15 | 123456 | 2016-12-02 03:22:32 | 2
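The arithmetic in the answer relies on integer division: adding 11 before dividing by 12 maps row numbers 1 to 12 onto 1, 13 to 24 onto 2, and so on. A small illustration, using a few hand-picked row numbers rather than the real table (the VALUES list and the alias t are made up for this sketch):

```sql
-- rn stands in for the row_number() value; the division is integer division.
SELECT rn, (11 + rn) / 12 AS rnk
FROM (VALUES (1), (12), (13), (24), (25)) AS t(rn);

-- rn = 1  -> 12 / 12 = 1
-- rn = 12 -> 23 / 12 = 1
-- rn = 13 -> 24 / 12 = 2
-- rn = 24 -> 35 / 12 = 2
-- rn = 25 -> 36 / 12 = 3
```

For a different group size n, the same pattern would be (n - 1 + row_number) / n.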