How to list rows with duplicate columns

How to list rows with duplicate columns - postgresql

I have a table with the fields id, name, birthday, clinic:
id | name | birthday | clinic
1 | mary | 2020-01-01 | clin 1
2 | mary | 2020-01-01 | clin 1
3 | mary | 2020-01-01 | clin 2
4 | john | 2021-01-01 | clin 1
5 | pete | 2020-01-05 | clin 1
6 | pete | 2020-01-05 | clin 2
7 | pete | 2020-01-05 | clin 3
I want to get all records whose name and birthday are duplicated, like:
id | name | birthday | clinic
1 | mary | 2020-01-01 | clin 1
2 | mary | 2020-01-01 | clin 1
3 | mary | 2020-01-01 | clin 2
5 | pete | 2020-01-05 | clin 1
6 | pete | 2020-01-05 | clin 2
7 | pete | 2020-01-05 | clin 3
Mary and Pete each have more than one record with the same name and birthday.

Using COUNT() as an analytical function, we can try:
WITH cte AS (
SELECT *, COUNT(*) OVER (PARTITION BY name, birthday) cnt
FROM yourTable
)
SELECT id, name, birthday, clinic
FROM cte
WHERE cnt > 1;

Try a row-value IN with a grouped subquery (replace <table> with your table name):
select * from <table>
where (name, birthday) in (
  select name, birthday
  from <table>
  group by name, birthday
  having count(*) > 1);

You can use an EXISTS condition:
select t1.*
from the_table t1
where exists (select *
from the_table t2
where t1.id <> t2.id
and (t1.name, t1.birthday) = (t2.name, t2.birthday));
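
To sanity-check that the three approaches above agree, here is a minimal sketch that runs them against the sample rows. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, which is an assumption of this sketch, and the table name patients is hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
# "patients" is a hypothetical table name; the question does not give one.
conn.execute("CREATE TABLE patients (id INTEGER, name TEXT, birthday TEXT, clinic TEXT)")
conn.executemany(
    "INSERT INTO patients VALUES (?, ?, ?, ?)",
    [
        (1, "mary", "2020-01-01", "clin 1"),
        (2, "mary", "2020-01-01", "clin 1"),
        (3, "mary", "2020-01-01", "clin 2"),
        (4, "john", "2021-01-01", "clin 1"),
        (5, "pete", "2020-01-05", "clin 1"),
        (6, "pete", "2020-01-05", "clin 2"),
        (7, "pete", "2020-01-05", "clin 3"),
    ],
)

QUERIES = {
    # Window function: count rows per (name, birthday) and keep groups > 1.
    "window": """
        SELECT id FROM (
            SELECT id, COUNT(*) OVER (PARTITION BY name, birthday) AS cnt
            FROM patients
        ) AS counted
        WHERE cnt > 1
        ORDER BY id
    """,
    # Row-value IN against the grouped duplicates.
    "in_group_by": """
        SELECT id FROM patients
        WHERE (name, birthday) IN (
            SELECT name, birthday FROM patients
            GROUP BY name, birthday
            HAVING COUNT(*) > 1
        )
        ORDER BY id
    """,
    # EXISTS: another row with a different id but the same name and birthday.
    "exists": """
        SELECT t1.id FROM patients t1
        WHERE EXISTS (
            SELECT 1 FROM patients t2
            WHERE t1.id <> t2.id
              AND t1.name = t2.name
              AND t1.birthday = t2.birthday
        )
        ORDER BY t1.id
    """,
}

results = {label: [row[0] for row in conn.execute(sql)] for label, sql in QUERIES.items()}
for label, ids in results.items():
    print(label, ids)
```

On the sample data all three return ids 1, 2, 3, 5, 6 and 7, so john (id 4) is the only row left out.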

Related

PostgreSQL - Check if column value exists in any previous row

I'm working on a problem where I need to check if an ID exists in any previous records within another ID set, and create a tag if it does.
Suppose I have the following table
| client_id | order_date | supplier_id |
| 1 | 2022-01-01 | 1 |
| 1 | 2022-02-01 | 2 |
| 1 | 2022-03-01 | 1 |
| 1 | 2022-04-01 | 3 |
| 2 | 2022-05-01 | 1 |
| 2 | 2022-06-01 | 1 |
| 2 | 2022-07-01 | 2 |
And I want to create a column with an "is new supplier" tag (for each client):
| client_id | order_date | supplier_id | is_new_supplier |
| 1 | 2022-01-01 | 1 | True |
| 1 | 2022-02-01 | 2 | True |
| 1 | 2022-03-01 | 1 | False |
| 1 | 2022-04-01 | 3 | True |
| 2 | 2022-05-01 | 1 | True |
| 2 | 2022-06-01 | 1 | False |
| 2 | 2022-07-01 | 2 | True |
First I tried doing this by creating a dense_rank and filtering out repeated ranks, but it didn't work:
with aux as (SELECT client_id,
order_date,
supplier_id
FROM table)
SELECT *, dense_rank() over (
partition by client_id
order by supplier_id
) as _dense_rank
FROM aux
Another way I thought about doing this, is by creating an auxiliary id with client_id + supplier_id, ordering by date and checking if the aux id exists in any previous row, but I don't know how to do this in SQL.

You are on the right track.
Instead of dense_rank, use row_number, add supplier_id to the partition by, and order by order_date. The first row in each (client_id, supplier_id) partition is then the client's first order from that supplier:
with aux as (
  SELECT client_id,
         order_date,
         supplier_id,
         row_number() over (
           partition by client_id, supplier_id
           order by order_date
         ) as rank
  FROM your_table -- placeholder name; "table" itself is a reserved word in PostgreSQL
)
SELECT client_id,
       order_date,
       supplier_id,
       rank,
       (rank = 1) as is_new_supplier
FROM aux
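
As a quick check of the row_number idea against the sample data, here is a minimal sketch. It runs on Python's built-in sqlite3 module rather than PostgreSQL (an assumption of this sketch), and the table name orders is hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
# "orders" is a hypothetical table name; the question only says "table".
conn.execute("CREATE TABLE orders (client_id INTEGER, order_date TEXT, supplier_id INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2022-01-01", 1),
        (1, "2022-02-01", 2),
        (1, "2022-03-01", 1),
        (1, "2022-04-01", 3),
        (2, "2022-05-01", 1),
        (2, "2022-06-01", 1),
        (2, "2022-07-01", 2),
    ],
)

rows = conn.execute("""
    WITH aux AS (
        SELECT client_id, order_date, supplier_id,
               ROW_NUMBER() OVER (
                   PARTITION BY client_id, supplier_id
                   ORDER BY order_date
               ) AS rn
        FROM orders
    )
    SELECT client_id, order_date, supplier_id, rn = 1 AS is_new_supplier
    FROM aux
    ORDER BY client_id, order_date
""").fetchall()

# SQLite returns 1/0 for boolean expressions, so convert to Python bools.
flags = [bool(row[3]) for row in rows]
for row, flag in zip(rows, flags):
    print(row[0], row[1], row[2], flag)
```

The flags come out in the same order as the expected is_new_supplier column in the question.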

SQL 5.7 Lead Function

I'm struggling to emulate a LEAD function to calculate the difference between the next row's date and the current row's date.
I'm using MySQL 5.7 for this. I have looked at various answers on Stack Overflow, but I'm not sure how to get the result.
This is what I want:
What I currently have now is the same thing without the days column.
I would also like to know how to get a column of dates that grabs the date after the current date.

This seems to work (except for the unclear row=4):
DROP TABLE IF EXISTS table4;
CREATE TABLE table4 (id integer, user_id integer, product varchar(10), `date` date);
INSERT INTO table4 VALUES
(1,1,'item1','2020-01-01'),
(2,1,'item2','2020-01-01'),
(3,1,'item3','2020-01-02'),
(4,1,'item4','2020-01-02'),
(5,2,'item5','2020-01-06'),
(6,2,'item6','2020-01-09'),
(7,2,'item7','2020-01-09'),
(8,2,'item8','2020-01-10');
SELECT
  id,
  user_id,
  product,
  date,
  -- ORDER BY makes LIMIT 1 pick the next id deterministically
  (SELECT date FROM table4 t4 WHERE t4.id>t1.id ORDER BY t4.id LIMIT 1) x,
  COALESCE(DATEDIFF((SELECT date FROM table4 t4 WHERE t4.id>t1.id ORDER BY t4.id LIMIT 1),date),0) as days
FROM table4 t1
output:
+ ------- + ------------ + ------------ + --------- + ----------- + --------- +
| id | user_id | product | date | x | days |
+ ------- + ------------ + ------------ + --------- + ----------- + --------- +
| 1 | 1 | item1 | 2020-01-01 | 2020-01-01 | 0 |
| 2 | 1 | item2 | 2020-01-01 | 2020-01-02 | 1 |
| 3 | 1 | item3 | 2020-01-02 | 2020-01-02 | 0 |
| 4 | 1 | item4 | 2020-01-02 | 2020-01-06 | 4 |
| 5 | 2 | item5 | 2020-01-06 | 2020-01-09 | 3 |
| 6 | 2 | item6 | 2020-01-09 | 2020-01-09 | 0 |
| 7 | 2 | item7 | 2020-01-09 | 2020-01-10 | 1 |
| 8 | 2 | item8 | 2020-01-10 | | 0 |
+ ------- + ------------ + ------------ + ---------- + ---------- + --------- +
The column x is only here to show which date the subquery returns; it is not needed for the final result.
DBFIDDLE
EDIT: when there are no "gaps" in the numbering of id, you could use a self-join instead, which should perform better:
SELECT
t1.id,
t1.user_id,
t1.product,
t1.date,
COALESCE(DATEDIFF(t2.date,t1.date),0) as days
FROM table4 t1
LEFT JOIN table4 t2 on t2.id = t1.id+1
I added this to the DBFIDDLE
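
The correlated-subquery emulation of LEAD can also be tried outside MySQL. Below is a minimal sketch using Python's built-in sqlite3 module (an assumption of this sketch, since SQLite has no DATEDIFF); the next date is fetched in SQL and the day difference is computed in Python. Like the queries above, it looks at the next id across the whole table and does not partition by user_id.

```python
import sqlite3
from datetime import date

# In-memory SQLite database used as a stand-in for MySQL 5.7 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table4 (id INTEGER, user_id INTEGER, product TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO table4 VALUES (?, ?, ?, ?)",
    [
        (1, 1, "item1", "2020-01-01"),
        (2, 1, "item2", "2020-01-01"),
        (3, 1, "item3", "2020-01-02"),
        (4, 1, "item4", "2020-01-02"),
        (5, 2, "item5", "2020-01-06"),
        (6, 2, "item6", "2020-01-09"),
        (7, 2, "item7", "2020-01-09"),
        (8, 2, "item8", "2020-01-10"),
    ],
)

# Emulated LEAD(date): the date of the row with the smallest larger id.
rows = conn.execute("""
    SELECT t1.id,
           t1.date,
           (SELECT t2.date FROM table4 t2
            WHERE t2.id > t1.id
            ORDER BY t2.id
            LIMIT 1) AS next_date
    FROM table4 t1
    ORDER BY t1.id
""").fetchall()

days = []
for row_id, current, next_date in rows:
    if next_date is None:
        # Last row has no successor; mirror COALESCE(..., 0).
        days.append(0)
    else:
        days.append((date.fromisoformat(next_date) - date.fromisoformat(current)).days)
    print(row_id, current, next_date, days[-1])
```

The resulting days column matches the output table shown above.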

Postgres join when only one row is equal

I have two tables and I want to do an inner join between table_1 and table_2, but only when exactly one row in table_2 meets the join criteria.
For example:
table_1
id | name | age |
-----------------+------------------+--------------+
1 | john jones | 10 |
2 | pete smith | 15 |
3 | mary lewis | 12 |
4 | amy roberts | 13 |
table_2
id | name | age | hair | height |
-----------------+------------------+--------------+--------------+--------------+
1 | john jones | 10 | brown | 100 |
2 | john jones | 10 | blonde | 132 |
3 | mary lewis | 12 | brown | 146 |
4 | pete smith | 15 | black | 171 |
So I want to do a join when name is equal, but only when there is exactly one matching name in table_2.
So my results would look like this:
id | name | age | hair |
-----------------+------------------+--------------+--------------+
2 | pete smith | 15 | black |
3 | mary lewis | 12 | brown |
As you can see, John Jones isn't in the results as there are two corresponding rows in table_2.
My initial code looks like this:
select tb.id,tb.name,tb.age,sc.hair
from table_1 tb
inner join table_2 sc
on tb.name = sc.name and tb.age = sc.age
Can I apply a clause within the join so that it only joins on rows which are unique matches?

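Group by the table_1 columns only and apply having count(*) = 1. If sc.hair is part of the group by, John Jones's brown and blonde rows land in separate groups of one row each and both pass the filter. Since every surviving group contains exactly one table_2 row, min(sc.hair) simply returns that row's value:
select tb.id, tb.name, tb.age, min(sc.hair) as hair
from table_1 tb
join table_2 sc
on tb.name = sc.name and tb.age = sc.age
group by tb.id, tb.name, tb.age
having count(*) = 1
The interesting thing to note is that you don't need the aggregate expression (in this case count(*)) in the select clause.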
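
One point worth checking against the sample data is which columns go into the GROUP BY. The sketch below compares grouping that includes sc.hair with grouping on the table_1 columns only. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, which is an assumption of this sketch.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (id INTEGER, name TEXT, age INTEGER)")
conn.execute("CREATE TABLE table_2 (id INTEGER, name TEXT, age INTEGER, hair TEXT, height INTEGER)")
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?, ?)",
    [(1, "john jones", 10), (2, "pete smith", 15), (3, "mary lewis", 12), (4, "amy roberts", 13)],
)
conn.executemany(
    "INSERT INTO table_2 VALUES (?, ?, ?, ?, ?)",
    [
        (1, "john jones", 10, "brown", 100),
        (2, "john jones", 10, "blonde", 132),
        (3, "mary lewis", 12, "brown", 146),
        (4, "pete smith", 15, "black", 171),
    ],
)

# Grouping by hair as well: each John Jones row forms its own group of size 1.
with_hair = conn.execute("""
    SELECT tb.id, tb.name, tb.age, sc.hair
    FROM table_1 tb
    JOIN table_2 sc ON tb.name = sc.name AND tb.age = sc.age
    GROUP BY tb.id, tb.name, tb.age, sc.hair
    HAVING COUNT(*) = 1
    ORDER BY tb.id, sc.hair
""").fetchall()

# Grouping by the table_1 columns only: John Jones has a group of size 2.
unique_only = conn.execute("""
    SELECT tb.id, tb.name, tb.age, MIN(sc.hair) AS hair
    FROM table_1 tb
    JOIN table_2 sc ON tb.name = sc.name AND tb.age = sc.age
    GROUP BY tb.id, tb.name, tb.age
    HAVING COUNT(*) = 1
    ORDER BY tb.id
""").fetchall()

print("group by incl. hair:", with_hair)
print("group by table_1 columns:", unique_only)
```

Only the second query leaves John Jones out, which is the result the question asks for.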

postgres tablefunc, sales data grouped by product, with crosstab of months

TIL about tablefunc and crosstab. At first I wanted to "group data by columns" but that doesn't really mean anything.
My product sales look like this
product_id | units | date
-----------------------------------
10 | 1 | 1-1-2018
10 | 2 | 2-2-2018
11 | 3 | 1-1-2018
11 | 10 | 1-2-2018
12 | 1 | 2-1-2018
13 | 10 | 1-1-2018
13 | 10 | 2-2-2018
I would like to produce a table of products with months as columns
product_id | 01-01-2018 | 02-01-2018 | etc.
-----------------------------------
10 | 1 | 2
11 | 13 | 0
12 | 0 | 1
13 | 20 | 0
First I would group by month, then invert and group by product, but I cannot figure out how to do this.

After enabling the tablefunc extension,
SELECT product_id, coalesce("2018-1-1", 0) as "2018-1-1"
, coalesce("2018-2-1", 0) as "2018-2-1"
FROM crosstab(
$$SELECT product_id, date_trunc('month', date)::date as month, sum(units) as units
FROM test
GROUP BY product_id, month
ORDER BY 1$$
, $$VALUES ('2018-1-1'::date), ('2018-2-1')$$
) AS ct (product_id int, "2018-1-1" int, "2018-2-1" int);
yields
| product_id | 2018-1-1 | 2018-2-1 |
|------------+----------+----------|
| 10 | 1 | 2 |
| 11 | 13 | 0 |
| 12 | 0 | 1 |
| 13 | 10 | 10 |
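
If the tablefunc extension is not available, the same pivot can be written with conditional aggregation, which is a different technique from crosstab. Here is a minimal sketch on Python's built-in sqlite3 module (an assumption of this sketch). It also assumes the sample dates are month-day-year and stores them as ISO strings; the column aliases jan_2018 and feb_2018 are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (product_id INTEGER, units INTEGER, date TEXT)")
# Sample rows from the question, read as month-day-year (assumption)
# and rewritten as ISO dates.
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (10, 1, "2018-01-01"),
        (10, 2, "2018-02-02"),
        (11, 3, "2018-01-01"),
        (11, 10, "2018-01-02"),
        (12, 1, "2018-02-01"),
        (13, 10, "2018-01-01"),
        (13, 10, "2018-02-02"),
    ],
)

# One SUM(CASE ...) per month column; months with no sales become 0.
pivot = conn.execute("""
    SELECT product_id,
           SUM(CASE WHEN strftime('%Y-%m', date) = '2018-01' THEN units ELSE 0 END) AS jan_2018,
           SUM(CASE WHEN strftime('%Y-%m', date) = '2018-02' THEN units ELSE 0 END) AS feb_2018
    FROM test
    GROUP BY product_id
    ORDER BY product_id
""").fetchall()

for row in pivot:
    print(row)
```

The trade-off is the same as with crosstab: the month columns have to be listed by hand, so new months mean editing the query.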

How to get all days in one table a date range even if no data exists also in SQL Server

I have a table called Tab1. I would like to get a row for every date in the range, even for days that are missing from the table.
+-------------------+--------------------------+
|Name | dateCheck |
+-------------------+--------------------------+
| 1 | 2016-01-01 00:00:00.000 |
| 2 | 2016-01-02 00:00:00.000 |
| 3 | 2016-01-05 00:00:00.000 |
| 4 | 2016-01-07 00:00:00.000 |
+-------------------+--------------------------+
I need output like below :
+-------------------+--------------------------+
|Name | dateCheck |
+-------------------+--------------------------+
| 1 | 2016-01-01 00:00:00.000 |
| 2 | 2016-01-02 00:00:00.000 |
| 0 | 2016-01-03 00:00:00.000 |
| 0 | 2016-01-04 00:00:00.000 |
| 3 | 2016-01-05 00:00:00.000 |
| 0 | 2016-01-06 00:00:00.000 |
| 4 | 2016-01-07 00:00:00.000 |
+-------------------+--------------------------+

You may use a calendar table:
SELECT
COALESCE(t2.Name, 0) AS Name,
t1.dateCheck
FROM
(
SELECT '2016-01-01' AS dateCheck UNION ALL
SELECT '2016-01-02' UNION ALL
SELECT '2016-01-03' UNION ALL
SELECT '2016-01-04' UNION ALL
SELECT '2016-01-05' UNION ALL
SELECT '2016-01-06' UNION ALL
SELECT '2016-01-07'
) t1
LEFT JOIN yourTable t2
ON t1.dateCheck = t2.dateCheck;
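
For longer ranges, the hard-coded list of dates can be generated with a recursive CTE instead. The sketch below shows the idea on Python's built-in sqlite3 module, which is an assumption of this sketch: SQL Server would use DATEADD and omits the RECURSIVE keyword. It also stores dateCheck as a plain date string without the time part.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tab1 (Name INTEGER, dateCheck TEXT)")
conn.executemany(
    "INSERT INTO Tab1 VALUES (?, ?)",
    [(1, "2016-01-01"), (2, "2016-01-02"), (3, "2016-01-05"), (4, "2016-01-07")],
)

# Build the calendar one day at a time, then left join the real rows onto it.
rows = conn.execute("""
    WITH RECURSIVE calendar(dateCheck) AS (
        SELECT '2016-01-01'
        UNION ALL
        SELECT date(dateCheck, '+1 day')
        FROM calendar
        WHERE dateCheck < '2016-01-07'
    )
    SELECT COALESCE(t.Name, 0) AS Name, c.dateCheck
    FROM calendar c
    LEFT JOIN Tab1 t ON t.dateCheck = c.dateCheck
    ORDER BY c.dateCheck
""").fetchall()

names = [row[0] for row in rows]
for row in rows:
    print(row)
```

Days without a matching row in Tab1 come back with Name 0, as in the requested output.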

We Keep Coding
