SQL 5.7 Lead Function - mysql-workbench

I'm struggling emulating a lead function to calculate the difference of (after date - current date)
I'm currently using mysql 5.7 to accomplish this. I have tried looking at various sources on stack overflow but I'm not sure how to get the result.
This is what I want:
What I currently have now is the same thing without the days column.
I would also like to know how to get a column of dates that grabs the date after the current date.

This seems to work (except for the unclear row=4):
DROP TABLE IF EXISTS table4;
CREATE TABLE table4 (id integer, user_id integer, product varchar(10), `date` date);
INSERT INTO table4 VALUES
(1,1,'item1','2020-01-01'),
(2,1,'item2','2020-01-01'),
(3,1,'item3','2020-01-02'),
(4,1,'item4','2020-01-02'),
(5,2,'item5','2020-01-06'),
(6,2,'item6','2020-01-09'),
(7,2,'item7','2020-01-09'),
(8,2,'item8','2020-01-10');
SELECT
id,
user_id,
product,
date,
(SELECT date FROM table4 t4 WHERE t4.id>t1.id LIMIT 1) x,
COALESCE(DATEDIFF((SELECT date FROM table4 t4 WHERE t4.id>t1.id LIMIT 1),date),0) as days
FROM table4 t1
output:
+ ------- + ------------ + ------------ + --------- + ----------- + --------- +
| id | user_id | product | date | x | days |
+ ------- + ------------ + ------------ + --------- + ----------- + --------- +
| 1 | 1 | item1 | 2020-01-01 | 2020-01-01 | 0 |
| 2 | 1 | item2 | 2020-01-01 | 2020-01-02 | 1 |
| 3 | 1 | item3 | 2020-01-02 | 2020-01-02 | 0 |
| 4 | 1 | item4 | 2020-01-02 | 2020-01-06 | 4 |
| 5 | 2 | item5 | 2020-01-06 | 2020-01-09 | 3 |
| 6 | 2 | item6 | 2020-01-09 | 2020-01-09 | 0 |
| 7 | 2 | item7 | 2020-01-09 | 2020-01-10 | 1 |
| 8 | 2 | item8 | 2020-01-10 | | 0 |
+ ------- + ------------ + ------------ + ---------- + ---------- + --------- +
The column x is only here for to see which date is returned from the subquery, and not really needed for the final result.
DBFIDDLE
EDIT: when there are no "gaps" in the numbering of id, you could do this to get a solution which should have more performance:
SELECT
t1.id,
t1.user_id,
t1.product,
t1.date,
COALESCE(DATEDIFF(t2.date,t1.date),0) as days
FROM table4 t1
LEFT JOIN table4 t2 on t2.id = t1.id+1
I added this to the DBFIDDLE

Related

SELECT DATA FROM A PREVIOUS LINE

Table:
worker_id | created_at | state_id
---------- ------------- ----------
1 | 14-6-2021 | 12
2 | 14-6-2021 | 12
3 | 13-6-2021 | 12
4 | 12-6-2021 | 12
3 | 10-6-2021 | 4
2 | 9-6-2021 | 4
4 | 8-6-2021 | 12
4 | 1-6-2021 | 4
1 | 1-6-2021 | 12
What I want
worker_id | created_at | state_id
---------- ------------- ----------
2 | 14-6-2021 | 12
3 | 13-6-2021 | 12
I need to obtain the worker_id of the workers that have state_id = 12, and that comply with their previous state_id = 4. I have made multiple attempts but none of them work.
First of all, please write your post in english.
To get the result you want you can use window functions and CTE.
with w as (
select *, lag(state_id) over (partition by worker_id order by worker_id, created_at) as previous_state
from workers
)
select * from w where previous_state = 4
Result here

How can I write a function with two tables inputs and one table output in PostgreSQL?

I want to create a function that can create a table, in which part of the columns is derived from the other two tables.
input table1:
This is a static table for each loan. Each loan has only one row with information related to that loan. For example, original unpaid balance, original interest rate...
| id | loan_age | ori_upb | ori_rate | ltv |
| --- | -------- | ------- | -------- | --- |
| 1 | 360 | 1500 | 4.5 | 0.6 |
| 2 | 360 | 2000 | 3.8 | 0.5 |
input table2:
This is a dynamic table for each loan. Each loan has seraval rows show the loan performance in each month. For example, current unpaid balance, current interest rate, delinquancy status...
| id | month| cur_upb | cur_rate |status|
| ---| --- | ------- | -------- | --- |
| 1 | 01 | 1400 | 4.5 | 0 |
| 1 | 02 | 1300 | 4.5 | 0 |
| 1 | 03 | 1200 | 4.5 | 1 |
| 2 | 01 | 2000 | 3.8 | 0 |
| 2 | 02 | 1900 | 3.8 | 0 |
| 2 | 03 | 1900 | 3.8 | 1 |
| 2 | 04 | 1900 | 3.8 | 2 |
output table:
The output table contains information from table1 and table2. Payoffupb is the last record of cur_upb in table2. This table is built for model development.
| id | loan_age | ori_upb | ori_rate | ltv | payoffmonth| payoffupb | payoffrate |lastStatus | modification |
| ---| -------- | ------- | -------- | --- | ---------- | --------- | ---------- |---------- | ------------ |
| 1 | 360 | 1500 | 4.5 | 0.6 | 03 | 1200 | 4.5 | 1 | null |
| 2 | 360 | 2000 | 3.8 | 0.5 | 04 | 1900 | 3.8 | 2 | null |
Most columns in the output table can directly get or transferred from columns in the two input tables, but some columns can not get then leave blank.
My main question is how to write a function to take two tables as inputs and output another table?
I already wrote the feature transformation part for data files in 2018, but I need to do the same thing again for data files in some other years. That's why I want to create a function to make things easier.
As you want to insert the latest entry of table2 against each entry of table1 try this
insert into table3 (id, loan_age, ori_upb, ori_rate, ltv,
payoffmonth, payoffupb, payoffrate, lastStatus )
select distinct on (t1.id)
t1.id, t1.loan_age, t1.ori_upb, t1.ori_rate, t1.ltv, t2.month, t2.cur_upb,
t2.cur_rate, t2.status
from
table1 t1
inner join
table2 t2 on t1.id=t2.id
order by t1.id , t2.month desc
DEMO1
EDIT for your updated question:
Function to do the above considering table1, table2, table3 structure will be always identical.
create or replace function insert_values(table1 varchar, table2 varchar, table3 varchar)
returns int as $$
declare
count_ int;
begin
execute format('insert into %I (id, loan_age, ori_upb, ori_rate, ltv, payoffmonth, payoffupb, payoffrate, lastStatus )
select distinct on (t1.id) t1.id, t1.loan_age, t1.ori_upb,
t1.ori_rate,t1.ltv,t2.month,t2.cur_upb, t2.cur_rate, t2.status
from %I t1 inner join %I t2 on t1.id=t2.id order by t1.id , t2.month desc',table3,table1,table2);
GET DIAGNOSTICS count_ = ROW_COUNT;
return count_;
end;
$$
language plpgsql
and call above function like below which will return the number of inserted rows:
select * from insert_values('table1','table2','table3');
DEMO2

postgres tablefunc, sales data grouped by product, with crosstab of months

TIL about tablefunc and crosstab. At first I wanted to "group data by columns" but that doesn't really mean anything.
My product sales look like this
product_id | units | date
-----------------------------------
10 | 1 | 1-1-2018
10 | 2 | 2-2-2018
11 | 3 | 1-1-2018
11 | 10 | 1-2-2018
12 | 1 | 2-1-2018
13 | 10 | 1-1-2018
13 | 10 | 2-2-2018
I would like to produce a table of products with months as columns
product_id | 01-01-2018 | 02-01-2018 | etc.
-----------------------------------
10 | 1 | 2
11 | 13 | 0
12 | 0 | 1
13 | 20 | 0
First I would group by month, then invert and group by product, but I cannot figure out how to do this.
After enabling the tablefunc extension,
SELECT product_id, coalesce("2018-1-1", 0) as "2018-1-1"
, coalesce("2018-2-1", 0) as "2018-2-1"
FROM crosstab(
$$SELECT product_id, date_trunc('month', date)::date as month, sum(units) as units
FROM test
GROUP BY product_id, month
ORDER BY 1$$
, $$VALUES ('2018-1-1'::date), ('2018-2-1')$$
) AS ct (product_id int, "2018-1-1" int, "2018-2-1" int);
yields
| product_id | 2018-1-1 | 2018-2-1 |
|------------+----------+----------|
| 10 | 1 | 2 |
| 11 | 13 | 0 |
| 12 | 0 | 1 |
| 13 | 10 | 10 |

1th and 7th row in grouping

I have this table named Samples. The Date column values are just symbolic date values.
+----+------------+-------+------+
| Id | Product_Id | Price | Date |
+----+------------+-------+------+
| 1 | 1 | 100 | 1 |
| 2 | 2 | 100 | 2 |
| 3 | 3 | 100 | 3 |
| 4 | 1 | 100 | 4 |
| 5 | 2 | 100 | 5 |
| 6 | 3 | 100 | 6 |
...
+----+------------+-------+------+
I want to group by product_id such that I have the 1'th sample in descending date order and a new colomn added with the Price of the 7'th sample row in each product group. If the 7'th row does not exist, then the value should be null.
Example:
+----+------------+-------+------+----------+
| Id | Product_Id | Price | Date | 7thPrice |
+----+------------+-------+------+----------+
| 4 | 1 | 100 | 4 | 120 |
| 5 | 2 | 100 | 5 | 100 |
| 6 | 3 | 100 | 6 | NULL |
+----+------------+-------+------+----------+
I belive I can achieve the table without the '7thPrice' with the following
SELECT * FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY Product_Id ORDER BY date DESC) r, * FROM Samples
) T WHERE T.r = 1
Any suggestions?
You can try something like this. I used your query to create a CTE. Then joined rank1 to rank7.
;with sampleCTE
as
(SELECT ROW_NUMBER() OVER (PARTITION BY Product_Id ORDER BY date DESC) r, * FROM Samples)
select *
from
(select * from samplecte where r = 1) a
left join
(select * from samplecte where r=7) b
on a.product_id = b.product_id

Grouping in t-sql with latest dates

I have a table like this
Event ID | Contract ID | Event date | Amount |
----------------------------------------------
1 | 1 | 2009-01-01 | 100 |
2 | 1 | 2009-01-02 | 20 |
3 | 1 | 2009-01-03 | 50 |
4 | 2 | 2009-01-01 | 80 |
5 | 2 | 2009-01-04 | 30 |
For each contract I need to fetch the latest event and amount associated with the event and get something like this
Event ID | Contract ID | Event date | Amount |
----------------------------------------------
3 | 1 | 2009-01-03 | 50 |
5 | 2 | 2009-01-04 | 30 |
I can't figure out how to group the data correctly. Any ideas?
Thanks in advance.
SQL 2k5/2k8:
with cte_ranked as (
select *
, row_number() over (
partition by ContractId order by EvantDate desc) as [rank]
from [table])
select *
from cte_ranked
where [rank] = 1;
SQL 2k:
select t.*
from table as t
join (
select max(EventDate) as MaxDate
, ContractId
from table
group by ContractId) as mt
on t.ContractId = mt.ContractId
and t.EventDate = mt.MaxDate