postgresql select rows from same table twice - postgresql

I want to compare each person's deposit with their previous deposit and return all rows where the deposit has decreased.
Here is what I have done so far.
The customer table is:
person_id employee_id deposit ts
101 201 44 2021-09-30 10:12:19+00
100 200 45 2021-09-30 10:12:19+00
101 201 47 2021-09-30 09:12:19+00
100 200 21 2021-09-29 10:12:19+00
104 203 54 2021-09-27 10:12:19+00
The result I want is:
person_id employee_id deposit ts
101 201 44 2021-09-30 10:12:19+00
SELECT person_id,
employee_id,
deposit,
ts,
lag(deposit) over client_window as pre_deposit,
lag(ts) over client_window as pre_ts
FROM customer
WINDOW client_window as (partition by person_id order by ts)
ORDER BY person_id , ts
It returns the following results:
person_id employee_id deposit ts pre_deposit pre_ts
101 201 44 2021-09-30 10:12:19+00 47 2021-09-30 09:12:19+00
100 200 45 2021-09-30 10:12:19+00 21 2021-09-29 10:12:19+00
101 201 47 2021-09-30 09:12:19+00 null null
100 200 21 2021-09-29 10:12:19+00 null null
104 203 54 2021-09-27 10:12:19+00 null null
SELECT person_id,
employee_id,
deposit,
ts,
lag(deposit) over client_window as pre_deposit,
lag(ts) over client_window as pre_ts
FROM customer
WINDOW client_window as (partition by person_id order by ts)
WHERE pre_deposit > deposit -- this returns "column not found" for pre_deposit
ORDER BY person_id , ts
So it seems I need to select from the same table again to be able to apply this condition:
where pre_deposit > deposit
What makes sense here: a union, an outer join, a left join, or a right join?

Use your query as a subquery and filter the results:
SELECT person_id, employee_id, deposit, ts
FROM (
SELECT *, lag(deposit) over client_window as pre_deposit
FROM customer
WINDOW client_window as (partition by person_id order by ts)
) t
WHERE deposit < pre_deposit
ORDER BY person_id, ts;
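The original WHERE clause fails because window functions are evaluated after WHERE, so the alias pre_deposit does not exist yet when the filter runs. Below is a minimal sketch of the same fix written as a CTE instead of a derived table. It runs against SQLite through Python's sqlite3 module purely as a stand-in for PostgreSQL (an assumption made so the example is self-contained; it needs SQLite 3.25 or later for window functions), and the CTE name with_prev is hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (person_id INT, employee_id INT, deposit INT, ts TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (101, 201, 44, "2021-09-30 10:12:19+00"),
        (100, 200, 45, "2021-09-30 10:12:19+00"),
        (101, 201, 47, "2021-09-30 09:12:19+00"),
        (100, 200, 21, "2021-09-29 10:12:19+00"),
        (104, 203, 54, "2021-09-27 10:12:19+00"),
    ],
)

# The window function is computed inside the CTE; the outer query can then
# filter on its alias. "with_prev" is a hypothetical name.
query = """
WITH with_prev AS (
    SELECT *,
           lag(deposit) OVER (PARTITION BY person_id ORDER BY ts) AS pre_deposit
    FROM customer
)
SELECT person_id, employee_id, deposit, ts
FROM with_prev
WHERE deposit < pre_deposit
ORDER BY person_id, ts
"""
decreased = conn.execute(query).fetchall()
print(decreased)
```

Rows with no earlier deposit have a NULL pre_deposit, so the comparison is not true for them and they drop out without an explicit NULL check.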

Related

Window Function For Consecutive Dates

I want to know how many users were active for 3 consecutive days on any given day.
For example, on 2022-11-03, 1 user (user_id = 111) was active 3 days in a row. Could someone please advise what kind of window function (if any) would be needed?
This is my dataset:
user_id active_date
111 2022-11-01
111 2022-11-02
111 2022-11-03
222 2022-11-01
333 2022-11-01
333 2022-11-09
333 2022-11-10
333 2022-11-11
If you are confident there are no duplicate user_id + active_date rows in the source data, then you can use two LAG functions like this:
SELECT user_id,
active_date,
CASE WHEN DATEADD(day, -1, active_date) = LAG(active_date, 1) OVER (PARTITION BY user_id ORDER BY active_date)
AND DATEADD(day, -2, active_date) = LAG(active_date, 2) OVER (PARTITION BY user_id ORDER BY active_date)
THEN 'Yes'
ELSE 'No'
END AS rowof3
FROM your_table
ORDER BY user_id, active_date;
If there might be duplication, use this FROM clause instead:
FROM (SELECT DISTINCT user_id, active_date :: DATE AS active_date FROM your_table) AS t
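To get from the per-row flag to the count the question asks for (users on a 3-day streak per day), the flagged rows can be filtered and grouped by date. The sketch below is one way to do that, run against SQLite through Python's sqlite3 module as a stand-in for the original database. That is an assumption: SQLite has no DATEADD, so the date arithmetic uses SQLite's date() function instead, and the aliases prev1, prev2 and users_on_3_day_streak are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (user_id INT, active_date TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [
        (111, "2022-11-01"), (111, "2022-11-02"), (111, "2022-11-03"),
        (222, "2022-11-01"),
        (333, "2022-11-01"), (333, "2022-11-09"),
        (333, "2022-11-10"), (333, "2022-11-11"),
    ],
)

# For each row, fetch the user's previous two active dates, keep the rows
# where they are exactly 1 and 2 days earlier, then count users per day.
query = """
SELECT active_date, COUNT(*) AS users_on_3_day_streak
FROM (
    SELECT user_id,
           active_date,
           LAG(active_date, 1) OVER w AS prev1,
           LAG(active_date, 2) OVER w AS prev2
    FROM (SELECT DISTINCT user_id, active_date FROM your_table)
    WINDOW w AS (PARTITION BY user_id ORDER BY active_date)
) AS t
WHERE prev1 = date(active_date, '-1 day')
  AND prev2 = date(active_date, '-2 days')
GROUP BY active_date
ORDER BY active_date
"""
streaks = conn.execute(query).fetchall()
print(streaks)
```

Days on which nobody completes a 3-day streak do not appear in the output; a calendar table would be needed to report them as zero.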

Getting duplicate records from 2 sql tables

I have 2 SQL tables
Table #1
account product expiry-date
101 prod1 2021-01-30
102 prod2 2021-02-20
103 prod3 2021-03-09
103 prod3 2021-03-19
104 prod4 2021-03-15
105 prod5 2021-04-23
105 prod5 2021-04-24
106 prod6 2021-04-25
Table #2
account
101
106
From the above 2 tables I want to get only unmatched records from Table1 and avoid duplicate records.
Result:
account product expiry-date
102 prod2 2021-02-20
103 prod3 2021-03-09
104 prod4 2021-03-15
105 prod5 2021-04-23
I tried the query below, but I get duplicate rows, because the same account can have more than one expiry date.
SQL query I tried:
select distinct (a.account, a.product, a.expiry-date)
from table1 a
where a.account not in (select account from table2)
Result:
account product expiry-date
102 prod2 2021-02-20
103 prod3 2021-03-09
103 prod3 2021-03-19
104 prod4 2021-03-15
105 prod5 2021-04-23
105 prod5 2021-04-24
You can keep the same query and add aggregation:
SELECT a.account
,a.product
,MIN(a.expiry) expiry
FROM table1 a
WHERE a.account NOT IN (
SELECT account
FROM table2
)
GROUP BY a.account
,a.product
You can use an anti-join and then ROW_NUMBER(). For example:
select *
from (
select a.*, row_number() over(partition by a.account order by a.expiry) as rn
from table1 a
left join table2 b on b.account = a.account
where b.account is null
) x
where rn = 1
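A minimal runnable sketch of the anti-join plus ROW_NUMBER() approach on the question's data. It uses SQLite through Python's sqlite3 module as a stand-in for the original database (an assumption for self-containment), and names the date column expiry, as the answers do, rather than expiry-date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (account INT, product TEXT, expiry TEXT)")
conn.execute("CREATE TABLE table2 (account INT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (101, "prod1", "2021-01-30"), (102, "prod2", "2021-02-20"),
        (103, "prod3", "2021-03-09"), (103, "prod3", "2021-03-19"),
        (104, "prod4", "2021-03-15"), (105, "prod5", "2021-04-23"),
        (105, "prod5", "2021-04-24"), (106, "prod6", "2021-04-25"),
    ],
)
conn.executemany("INSERT INTO table2 VALUES (?)", [(101,), (106,)])

# LEFT JOIN ... IS NULL keeps only accounts missing from table2 (the anti-join);
# rn = 1 then keeps the earliest expiry per remaining account.
query = """
SELECT account, product, expiry
FROM (
    SELECT a.*,
           row_number() OVER (PARTITION BY a.account ORDER BY a.expiry) AS rn
    FROM table1 a
    LEFT JOIN table2 b ON b.account = a.account
    WHERE b.account IS NULL
) x
WHERE rn = 1
ORDER BY account
"""
unmatched = conn.execute(query).fetchall()
print(unmatched)
```

One reason to prefer the anti-join over NOT IN: if table2.account ever contains a NULL, the NOT IN form returns no rows at all, while the LEFT JOIN form is unaffected.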

Ranking in PostgreSQL

I have a query that looks like this:
select
restaurant_id,
rank() OVER (PARTITION BY restaurant_id order by churn desc) as rank_churn,
churn,
orders,
rank() OVER (PARTITION BY restaurant_id order by orders desc) as rank_orders
from data
I would expect that this ranking function will order my data and provide a column that has 1,2,3,4 according to the values of the column.
However the outcome is always 1 in the ranking.
restaurant_id rank_churn churn orders rank_orders
2217 1 75 182 1
2249 1 398 896 1
2526 1 11 56 1
2596 1 89 139 1
What am I doing wrong?
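One likely cause, assuming each restaurant_id appears only once in data (as in the sample output): PARTITION BY restaurant_id restarts the ranking for every restaurant, so each one-row partition gets rank 1. A minimal sketch with the PARTITION BY removed, so that restaurants are ranked against each other, run on SQLite through Python's sqlite3 module as a stand-in for PostgreSQL (an assumption for self-containment):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (restaurant_id INT, churn INT, orders INT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [(2217, 75, 182), (2249, 398, 896), (2526, 11, 56), (2596, 89, 139)],
)

# No PARTITION BY: the whole result set is a single partition, so rank()
# compares every restaurant with every other one.
query = """
SELECT restaurant_id,
       rank() OVER (ORDER BY churn DESC)  AS rank_churn,
       churn,
       orders,
       rank() OVER (ORDER BY orders DESC) AS rank_orders
FROM data
ORDER BY restaurant_id
"""
ranked = conn.execute(query).fetchall()
print(ranked)
```

If the real table holds several rows per restaurant and the goal is to rank within each restaurant, PARTITION BY restaurant_id is the right choice and the sample output would point to a different problem.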

TSQL: How to apply Condition to sub grouping

Imagine I have the following table, with multiple codes for a single person over different periods (id is the primary key):
id code Name Start Finish
325 1353 Bob NULL 2012-07-03 16:21:16.067
1742 1353 Bob 2012-07-03 16:21:16.067 2012-08-03 15:56:29.897
1803 1353 Bob 2012-08-03 15:56:29.897 NULL
17 575 Bob NULL NULL
270 834 Bob NULL 2012-07-20 15:51:19.913
1780 834 Bob 2012-07-20 15:51:19.913 2012-07-26 16:26:54.413
1789 834 Bob 2012-07-26 16:26:54.413 2012-08-21 15:36:58.940
1830 834 Bob 2012-08-21 15:36:58.940 2012-08-24 14:26:05.890
1835 834 Bob 2012-08-24 14:26:05.890 2012-08-30 12:01:05.313
1838 123 Bob 2012-08-30 12:01:05.313 2012-09-05 09:29:02.497
1844 900 Bob 2012-09-05 09:29:02.497 NULL
What I want to do is update the table so that the code is taken from the latest person:
id code Name Start Finish
325 900 Bob NULL 2012-07-03 16:21:16.067
1742 900 Bob 2012-07-03 16:21:16.067 2012-08-03 15:56:29.897
1803 900 Bob 2012-08-03 15:56:29.897 NULL
17 900 Bob NULL NULL
270 900 Bob NULL 2012-07-20 15:51:19.913
1780 900 Bob 2012-07-20 15:51:19.913 2012-07-26 16:26:54.413
1789 900 Bob 2012-07-26 16:26:54.413 2012-08-21 15:36:58.940
1830 900 Bob 2012-08-21 15:36:58.940 2012-08-24 14:26:05.890
1835 900 Bob 2012-08-24 14:26:05.890 2012-08-30 12:01:05.313
1838 900 Bob 2012-08-30 12:01:05.313 2012-09-05 09:29:02.497
1844 900 Bob 2012-09-05 09:29:02.497 NULL
The latest person is defined as the row with the latest (max?) Start AND (Finish IS NULL OR Finish >= GetDate()) WITHIN the group of people with the same Name AND Code.
In the above example that is the row where id = 1844 (within the group of Bobs it has the latest Start and its Finish is NULL).
I'm pretty sure this is possible with a single statement, but I can't see how to define 'Latest Person' so that I can join it back to the rows I want to update.
Edit: Please note that I cannot rely on the ordering of the Id column, only on the date columns.
Something like this will do:
update this set code = (
select top (1) that.code from table1 that
where that.name = this.name -- match on name
and (that.Finish is null or that.Finish >= getdate()) -- filter for current rows only
order by that.Start desc, that.id desc -- rank by start, break ties with id
)
from table1 this
I hope your table is well indexed, and/or not too big, because this is expensive to do in one step.
Alternate form, using OUTER APPLY, and more easily extensible:
update this set code = that.code
from table1 this
outer apply (
select top (1) that.code from table1 that
where that.name = this.name -- match on name
and (that.Finish is null or that.Finish >= getdate()) -- filter for current rows
order by that.Start desc, that.id desc -- rank by start, break ties with id
) that
Alternate method using windowing functions, without a join:
update this set code = _latest_code
from (
-- identify the latest code per name
select *, _latest_code = max(
case
when (finish is null or finish >= getdate())
and _row_number = 1
then code else null
end
) over (partition by name)
from (
-- identify the latest row per name
select *, _row_number = row_number() over (
partition by name order by
case when finish is null or finish >= getdate() then 0 else 1 end
, start desc, id desc)
from table1
) this
) this
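To check the logic on the question's data, here is a minimal sketch of the first form (the correlated subquery) ported to SQLite and run through Python's sqlite3 module. The port is an assumption made only to keep the example self-contained: SQLite has no TOP or getdate(), so it uses LIMIT 1 and datetime('now') instead, and it is not T-SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (id INT PRIMARY KEY, code INT, name TEXT, start TEXT, finish TEXT)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?, ?)",
    [
        (325, 1353, "Bob", None, "2012-07-03 16:21:16.067"),
        (1742, 1353, "Bob", "2012-07-03 16:21:16.067", "2012-08-03 15:56:29.897"),
        (1803, 1353, "Bob", "2012-08-03 15:56:29.897", None),
        (17, 575, "Bob", None, None),
        (270, 834, "Bob", None, "2012-07-20 15:51:19.913"),
        (1780, 834, "Bob", "2012-07-20 15:51:19.913", "2012-07-26 16:26:54.413"),
        (1789, 834, "Bob", "2012-07-26 16:26:54.413", "2012-08-21 15:36:58.940"),
        (1830, 834, "Bob", "2012-08-21 15:36:58.940", "2012-08-24 14:26:05.890"),
        (1835, 834, "Bob", "2012-08-24 14:26:05.890", "2012-08-30 12:01:05.313"),
        (1838, 123, "Bob", "2012-08-30 12:01:05.313", "2012-09-05 09:29:02.497"),
        (1844, 900, "Bob", "2012-09-05 09:29:02.497", None),
    ],
)

# For every row, look up the code of the current row (finish NULL or in the
# future) with the latest start for the same name; ties are broken by id.
conn.execute(
    """
    UPDATE table1 SET code = (
        SELECT that.code
        FROM table1 AS that
        WHERE that.name = table1.name
          AND (that.finish IS NULL OR that.finish >= datetime('now'))
        ORDER BY that.start DESC, that.id DESC
        LIMIT 1
    )
    """
)
codes = {row[0] for row in conn.execute("SELECT code FROM table1")}
print(codes)
```

Every row ends up with the code of id 1844, because that is the only current row with a non-NULL, latest start. A name with no current row at all would have its code set to NULL by this form, which may or may not be what is wanted.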

help req in basic t-sql

Below is a table.
stu_id meet_doc_id doc_name stu_name dob value date
101 0104 AD AM 15/06/1950 LMDO 2011-02-15
101 0105 AD AM 15/06/1950 CLEAR 2011-02-18
101 0106 AD AM 15/06/1950 CLEAR 2011-02-25
102 0107 AD AK 12/08/1987 CLEAR 2011-03-28
102 0108 AD AK 12/08/1987 LDMO 2011-04-29
103 0109 PK LMP 13/07/1970 CLEAR 2011-03-28
103 0110 PK LMP 13/07/1970 CLEAR 2011-05-12
What query would return the following result set?
stu_id meet_doc_id doc_name stu_name dob value date
101 0104 AD AM 15/06/1950 LMDO 2011-02-15
102 0107 AD AK 12/08/1987 CLEAR 2011-03-28
103 0110 PK LMP 13/07/1970 CLEAR 2011-05-12
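Your table looks heavily denormalised, with lots of repeating groups, but I think you need the query below, which keeps the row with the lowest meet_doc_id for each stu_id. For stu_id 103 that is 0109, not the 0110 shown in the expected output; the question does not state a rule that would pick 0110.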
;WITH CTE
AS (SELECT stu_id,
meet_doc_id,
doc_name,
stu_name,
dob,
value,
date,
ROW_NUMBER() OVER (PARTITION BY stu_id ORDER BY meet_doc_id) AS RN
FROM YourTable)
select stu_id,
meet_doc_id,
doc_name,
stu_name,
dob,
value,
date
FROM CTE
WHERE RN = 1