How can I write a function with two table inputs and one table output in PostgreSQL? - postgresql

I want to create a function that creates a table in which some of the columns are derived from two other tables.
input table1:
This is a static table for each loan. Each loan has only one row with information related to that loan. For example, original unpaid balance, original interest rate...
| id | loan_age | ori_upb | ori_rate | ltv |
| --- | -------- | ------- | -------- | --- |
| 1 | 360 | 1500 | 4.5 | 0.6 |
| 2 | 360 | 2000 | 3.8 | 0.5 |
input table2:
This is a dynamic table for each loan. Each loan has several rows showing the loan performance in each month. For example, current unpaid balance, current interest rate, delinquency status...
| id | month | cur_upb | cur_rate | status |
| --- | ----- | ------- | -------- | ------ |
| 1 | 01 | 1400 | 4.5 | 0 |
| 1 | 02 | 1300 | 4.5 | 0 |
| 1 | 03 | 1200 | 4.5 | 1 |
| 2 | 01 | 2000 | 3.8 | 0 |
| 2 | 02 | 1900 | 3.8 | 0 |
| 2 | 03 | 1900 | 3.8 | 1 |
| 2 | 04 | 1900 | 3.8 | 2 |
output table:
The output table contains information from table1 and table2. Payoffupb is the last record of cur_upb in table2. This table is built for model development.
| id | loan_age | ori_upb | ori_rate | ltv | payoffmonth| payoffupb | payoffrate |lastStatus | modification |
| ---| -------- | ------- | -------- | --- | ---------- | --------- | ---------- |---------- | ------------ |
| 1 | 360 | 1500 | 4.5 | 0.6 | 03 | 1200 | 4.5 | 1 | null |
| 2 | 360 | 2000 | 3.8 | 0.5 | 04 | 1900 | 3.8 | 2 | null |
Most columns in the output table can be taken directly or derived from columns in the two input tables; columns that cannot be derived are left blank.
My main question is how to write a function to take two tables as inputs and output another table?
I already wrote the feature transformation part for data files in 2018, but I need to do the same thing again for data files in some other years. That's why I want to create a function to make things easier.

Since you want to insert the latest entry of table2 for each entry of table1, try this:
insert into table3 (id, loan_age, ori_upb, ori_rate, ltv,
                    payoffmonth, payoffupb, payoffrate, lastStatus)
select distinct on (t1.id)
       t1.id, t1.loan_age, t1.ori_upb, t1.ori_rate, t1.ltv,
       t2.month, t2.cur_upb, t2.cur_rate, t2.status
from table1 t1
inner join table2 t2 on t1.id = t2.id
order by t1.id, t2.month desc;
EDIT for your updated question:
Here is a function that does the above, assuming the structure of table1, table2 and table3 is always identical:
create or replace function insert_values(table1 varchar, table2 varchar, table3 varchar)
returns int as $$
declare
    count_ int;
begin
    -- %I quotes each table name as an identifier
    execute format('insert into %I (id, loan_age, ori_upb, ori_rate, ltv, payoffmonth, payoffupb, payoffrate, lastStatus)
        select distinct on (t1.id) t1.id, t1.loan_age, t1.ori_upb,
               t1.ori_rate, t1.ltv, t2.month, t2.cur_upb, t2.cur_rate, t2.status
        from %I t1 inner join %I t2 on t1.id = t2.id order by t1.id, t2.month desc', table3, table1, table2);
    GET DIAGNOSTICS count_ = ROW_COUNT;
    return count_;
end;
$$
language plpgsql;
and call above function like below which will return the number of inserted rows:
select * from insert_values('table1','table2','table3');
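For readers who want to try the idea outside PostgreSQL, here is a minimal sketch of the same "two tables in, one table out" pattern. It is not the answer's code: it uses Python's sqlite3 module only so that it runs standalone, it swaps DISTINCT ON for ROW_NUMBER() (which needs SQLite 3.25 or later), and the names build_output and quote_ident are made up for this example.

```python
import sqlite3


def quote_ident(name):
    """Quote a table name so it can be spliced into SQL text (hypothetical helper)."""
    return '"' + name.replace('"', '""') + '"'


def build_output(conn, static_table, monthly_table, output_table):
    """Create output_table from the static columns plus each loan's latest monthly row."""
    s, m, o = (quote_ident(n) for n in (static_table, monthly_table, output_table))
    # Table names cannot be bound as parameters, hence the quoted f-string.
    conn.execute(f"""
        CREATE TABLE {o} AS
        SELECT id, loan_age, ori_upb, ori_rate, ltv,
               payoffmonth, payoffupb, payoffrate, lastStatus,
               NULL AS modification
        FROM (
            SELECT t1.id AS id, t1.loan_age AS loan_age, t1.ori_upb AS ori_upb,
                   t1.ori_rate AS ori_rate, t1.ltv AS ltv,
                   t2.month AS payoffmonth, t2.cur_upb AS payoffupb,
                   t2.cur_rate AS payoffrate, t2.status AS lastStatus,
                   ROW_NUMBER() OVER (PARTITION BY t1.id
                                      ORDER BY t2.month DESC) AS rn
            FROM {s} t1
            JOIN {m} t2 ON t1.id = t2.id
        )
        WHERE rn = 1
    """)
    return conn.execute(f"SELECT COUNT(*) FROM {o}").fetchone()[0]


# Sample data copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, loan_age INTEGER, ori_upb INTEGER, ori_rate REAL, ltv REAL);
    CREATE TABLE table2 (id INTEGER, month TEXT, cur_upb INTEGER, cur_rate REAL, status INTEGER);
    INSERT INTO table1 VALUES (1, 360, 1500, 4.5, 0.6), (2, 360, 2000, 3.8, 0.5);
    INSERT INTO table2 VALUES
        (1, '01', 1400, 4.5, 0), (1, '02', 1300, 4.5, 0), (1, '03', 1200, 4.5, 1),
        (2, '01', 2000, 3.8, 0), (2, '02', 1900, 3.8, 0), (2, '03', 1900, 3.8, 1),
        (2, '04', 1900, 3.8, 2);
""")
n = build_output(conn, "table1", "table2", "table3")
rows = conn.execute(
    "SELECT id, payoffmonth, payoffupb, lastStatus FROM table3 ORDER BY id"
).fetchall()
print(n, rows)
```

The month column is stored as zero-padded text here, so ordering it descending picks the latest month; with a different month format the ORDER BY would need adjusting.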

Related

SQL 5.7 Lead Function

I'm struggling to emulate a lead function to calculate the difference (next date - current date).
I'm currently using MySQL 5.7 to accomplish this. I have looked at various sources on Stack Overflow but I'm not sure how to get the result.
This is what I want:
What I currently have now is the same thing without the days column.
I would also like to know how to get a column of dates that grabs the date after the current date.
This seems to work (except that the expected result for row 4 is unclear):
DROP TABLE IF EXISTS table4;
CREATE TABLE table4 (id integer, user_id integer, product varchar(10), `date` date);
INSERT INTO table4 VALUES
(1,1,'item1','2020-01-01'),
(2,1,'item2','2020-01-01'),
(3,1,'item3','2020-01-02'),
(4,1,'item4','2020-01-02'),
(5,2,'item5','2020-01-06'),
(6,2,'item6','2020-01-09'),
(7,2,'item7','2020-01-09'),
(8,2,'item8','2020-01-10');
SELECT
id,
user_id,
product,
date,
(SELECT date FROM table4 t4 WHERE t4.id>t1.id ORDER BY t4.id LIMIT 1) x,
COALESCE(DATEDIFF((SELECT date FROM table4 t4 WHERE t4.id>t1.id ORDER BY t4.id LIMIT 1),date),0) as days
FROM table4 t1
output:
+ ------- + ------------ + ------------ + --------- + ----------- + --------- +
| id | user_id | product | date | x | days |
+ ------- + ------------ + ------------ + --------- + ----------- + --------- +
| 1 | 1 | item1 | 2020-01-01 | 2020-01-01 | 0 |
| 2 | 1 | item2 | 2020-01-01 | 2020-01-02 | 1 |
| 3 | 1 | item3 | 2020-01-02 | 2020-01-02 | 0 |
| 4 | 1 | item4 | 2020-01-02 | 2020-01-06 | 4 |
| 5 | 2 | item5 | 2020-01-06 | 2020-01-09 | 3 |
| 6 | 2 | item6 | 2020-01-09 | 2020-01-09 | 0 |
| 7 | 2 | item7 | 2020-01-09 | 2020-01-10 | 1 |
| 8 | 2 | item8 | 2020-01-10 | | 0 |
+ ------- + ------------ + ------------ + ---------- + ---------- + --------- +
The column x is only there to show which date the subquery returns; it is not needed for the final result.
EDIT: when there are no "gaps" in the numbering of id, you could do this instead, which should perform better:
SELECT
t1.id,
t1.user_id,
t1.product,
t1.date,
COALESCE(DATEDIFF(t2.date,t1.date),0) as days
FROM table4 t1
LEFT JOIN table4 t2 on t2.id = t1.id+1
I added this to the DBFIDDLE
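To check the correlated-subquery emulation of LEAD against the sample rows, here is a small sketch. It runs on SQLite through Python's sqlite3 module rather than MySQL 5.7, so julianday() arithmetic stands in for DATEDIFF; that substitution is an assumption of this example, not part of the answer. The subquery orders by id so that "next row" is well defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table4 (id INTEGER, user_id INTEGER, product TEXT, date TEXT);
    INSERT INTO table4 VALUES
        (1, 1, 'item1', '2020-01-01'),
        (2, 1, 'item2', '2020-01-01'),
        (3, 1, 'item3', '2020-01-02'),
        (4, 1, 'item4', '2020-01-02'),
        (5, 2, 'item5', '2020-01-06'),
        (6, 2, 'item6', '2020-01-09'),
        (7, 2, 'item7', '2020-01-09'),
        (8, 2, 'item8', '2020-01-10');
""")

# For each row, look up the date of the next id and subtract the current date.
# The last row has no successor, so the subquery yields NULL and COALESCE gives 0.
rows = conn.execute("""
    SELECT t1.id,
           COALESCE(
               CAST(julianday((SELECT t4.date
                               FROM table4 t4
                               WHERE t4.id > t1.id
                               ORDER BY t4.id
                               LIMIT 1))
                    - julianday(t1.date) AS INTEGER),
               0) AS days
    FROM table4 t1
    ORDER BY t1.id
""").fetchall()
print(rows)
```

The days values match the days column in the output table above. Note that this version does not restrict the lookup to the same user_id, which is why row 4 picks up the first date of user 2.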

Make sure every distinct value of Column1 has a row with every distinct value of Column2, by populating a table with 0s - postgresql

Here's a crude example I've made up to illustrate what I want to achieve:
table1:
| Shop | Product | QuantityInStock |
| a | Prod1 | 13 |
| a | Prod3 | 13 |
| b | Prod2 | 13 |
| b | Prod3 | 13 |
| b | Prod4 | 13 |
table1 becomes:
| Shop | Product | QuantityInStock |
| a | Prod1 | 13 |
| a | Prod2 | 0 | -- new
| a | Prod3 | 13 |
| a | Prod4 | 0 | -- new
| b | Prod1 | 0 | -- new
| b | Prod2 | 13 |
| b | Prod3 | 13 |
| b | Prod4 | 13 |
In this example, I want every Shop/Product combination to be represented:
every Shop {a,b} should have a row for every Product {Prod1, Prod2, Prod3, Prod4}.
QuantityInStock=13 has no significance, I just wanted a placeholder number :)
Use a calendar table cross join approach:
SELECT s.Shop, p.Product, COALESCE(t1.QuantityInStock, 0) AS QuantityInStock
FROM (SELECT DISTINCT Shop FROM table1) s
CROSS JOIN (SELECT DISTINCT Product FROM table1) p
LEFT JOIN table1 t1
ON t1.Shop = s.Shop AND
t1.Product = p.Product
ORDER BY
s.Shop,
p.Product;
The idea here is to generate an intermediate table containing all shop/product combinations via a cross join. Then, we left join this to table1. Any shop/product combinations which do not have a match in the actual table are assigned a zero stock quantity.
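If the missing rows should actually be written back into table1, as the question's "table1 becomes" suggests, the same cross join can feed an INSERT that keeps only the unmatched combinations. The sketch below is one possible way to do that, run on SQLite through Python's sqlite3 module purely so it is self-contained; it is an extension of the answer, not something the answer states.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (Shop TEXT, Product TEXT, QuantityInStock INTEGER);
    INSERT INTO table1 VALUES
        ('a', 'Prod1', 13), ('a', 'Prod3', 13),
        ('b', 'Prod2', 13), ('b', 'Prod3', 13), ('b', 'Prod4', 13);
""")

# Build every shop/product pair, keep only the pairs with no existing row
# (the LEFT JOIN finds no match, so t1.Shop is NULL), and insert them with 0.
conn.execute("""
    INSERT INTO table1 (Shop, Product, QuantityInStock)
    SELECT s.Shop, p.Product, 0
    FROM (SELECT DISTINCT Shop FROM table1) s
    CROSS JOIN (SELECT DISTINCT Product FROM table1) p
    LEFT JOIN table1 t1
           ON t1.Shop = s.Shop AND t1.Product = p.Product
    WHERE t1.Shop IS NULL
""")

rows = conn.execute(
    "SELECT Shop, Product, QuantityInStock FROM table1 ORDER BY Shop, Product"
).fetchall()
for row in rows:
    print(row)
```

Running the INSERT a second time adds nothing, because every combination then has a matching row.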

SUM values from two tables with GROUP BY and WHERE

I have two tables below named sent_table and received_table. I am attempting to combine them in a query to produce output_table. All my attempts so far result in a huge number of duplicates and totally bogus sum values.
I am assuming I would need to use GROUP BY and WHERE to achieve this goal. I want to be able to filter based on the user's name.
sent_table
+----+------+-------+----------+
| id | name | value | order_id |
+----+------+-------+----------+
| 1 | dave | 100 | 1 |
| 2 | dave | 200 | 1 |
| 3 | dave | 300 | 2 |
+----+------+-------+----------+
received_table
+----+------+-------+----------+
| id | name | value | order_id |
+----+------+-------+----------+
| 1 | dave | 400 | 1 |
| 2 | dave | 500 | 2 |
| 3 | dave | 600 | 2 |
+----+------+-------+----------+
output table
+------+----------+----------+
| sent | received | order_id |
+------+----------+----------+
| 300 | 400 | 1 |
| 300 | 1100 | 2 |
+------+----------+----------+
I tried the following with no joy. This does not impose any restrictions on how I would desire to solve this problem. It is just how I attempted to do it.
SELECT *
FROM
( select SUM(value) as sent, order_id FROM sent_table WHERE name='dave' GROUP BY order_id) A
CROSS JOIN
( select SUM(value) as received, order_id FROM received_table WHERE name='dave' GROUP BY order_id) B
Any help would be greatly appreciated.
Do the sums on each table, grouping by order_id, then join the results. To get the rows even if one side is missing, do a FULL OUTER JOIN:
SELECT COALESCE(s.order_id, r.order_id) AS order_id, s.sent, r.received
FROM (
SELECT order_id, SUM(value) AS sent
FROM sent
GROUP BY order_id
) s
FULL OUTER JOIN (
SELECT order_id, SUM(value) AS received
FROM received
GROUP BY order_id
) r
USING (order_id)
ORDER BY 1
Result:
| order_id | sent | received |
| -------- | ---- | -------- |
| 1 | 300 | 400 |
| 2 | | 1100 |
Note the COALESCE on the order_id: if it's missing from sent it will be taken from received, so that value will never be NULL.
If you want to have 0 in place of NULL (e.g. when there is no record for that order_id in either sent or received), you would do COALESCE(s.sent, 0) AS sent, COALESCE(r.received, 0) AS received.
https://www.db-fiddle.com/f/nq3xYrcys16eUrBRHT6xLL/2
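The question also asks to filter by the user's name. One way to add that, sketched below, is a different technique from the FULL OUTER JOIN above: stack both tables with UNION ALL, padding the other side with 0, then group once by order_id. It runs on SQLite through Python's sqlite3 module so it is self-contained; the function name totals_for is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sent_table (id INTEGER, name TEXT, value INTEGER, order_id INTEGER);
    CREATE TABLE received_table (id INTEGER, name TEXT, value INTEGER, order_id INTEGER);
    INSERT INTO sent_table VALUES (1, 'dave', 100, 1), (2, 'dave', 200, 1), (3, 'dave', 300, 2);
    INSERT INTO received_table VALUES (1, 'dave', 400, 1), (2, 'dave', 500, 2), (3, 'dave', 600, 2);
""")


def totals_for(conn, name):
    """Return (order_id, sent, received) totals for one user."""
    return conn.execute("""
        SELECT order_id, SUM(sent) AS sent, SUM(received) AS received
        FROM (
            SELECT order_id, value AS sent, 0 AS received
            FROM sent_table WHERE name = ?
            UNION ALL
            SELECT order_id, 0 AS sent, value AS received
            FROM received_table WHERE name = ?
        )
        GROUP BY order_id
        ORDER BY order_id
    """, (name, name)).fetchall()


dave = totals_for(conn, "dave")
nobody = totals_for(conn, "nobody")
print(dave)
```

Because each source row appears exactly once in the stacked set, the sums cannot be inflated by join duplicates, and an order that exists on only one side shows 0 for the other.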

how to get data from other table to one table and show it?

table1
no | date |
J001 | 06 June |
table2
no | code | qty | /// AVGprice | Total
J001 | B001 | 5 | /// 1500 | 7500
J001 | B003 | 7 | /// 1000 | 7000
table3
code | name | AVGPrice
B001 | procc | 1500
B002 | motherboard | 2000
B003 | VGA card | 1000
table4
no | code | Price
M001 | B001 | 1000
M001 | B002 | 2000
M002 | B001 | 2000
M002 | B003 | 1000
I get AVGprice from this query
select t.code, t.name, t.avg
from (select table3.code, table3.name, (
select avg(table4.price)
from table4
where table4.code=table3.code)as 'avg'
from table3
)as t
The result that I can produce is
no | date | Info
J001| 06 June | ABCDEFG
with this query
select t.no, t.date, t.info
from (select table1.no, table1.date, 'ABCDEFG' as info
from table1
)as t
result that I want is
no | date | Info | Total
J001| 06 June | ABCDEFG | 14500 --> from sum of Total
I don't know where to put my avg query and how to sum it...
The following adds a correlated subquery that pulls the average price per code from table4,
multiplies it by qty for each matching row of table2, and sums the result to give Total.
select t.no,
       t.date,
       t.info,
       (select sum(table2.qty * (select avg(table4.price)
                                 from table4
                                 where table4.code = table2.code))
        from table2
        where table2.no = t.no) as Total
from (select table1.no, table1.date, 'ABCDEFG' as info
      from table1
     ) as t

Grouping in t-sql with latest dates

I have a table like this
Event ID | Contract ID | Event date | Amount |
----------------------------------------------
1 | 1 | 2009-01-01 | 100 |
2 | 1 | 2009-01-02 | 20 |
3 | 1 | 2009-01-03 | 50 |
4 | 2 | 2009-01-01 | 80 |
5 | 2 | 2009-01-04 | 30 |
For each contract I need to fetch the latest event and amount associated with the event and get something like this
Event ID | Contract ID | Event date | Amount |
----------------------------------------------
3 | 1 | 2009-01-03 | 50 |
5 | 2 | 2009-01-04 | 30 |
I can't figure out how to group the data correctly. Any ideas?
Thanks in advance.
SQL Server 2005/2008:
with cte_ranked as (
    select *
    , row_number() over (
        partition by ContractId order by EventDate desc) as [rank]
    from [table])
select *
from cte_ranked
where [rank] = 1;
SQL Server 2000:
select t.*
from [table] as t
join (
    select max(EventDate) as MaxDate
    , ContractId
    from [table]
    group by ContractId) as mt
on t.ContractId = mt.ContractId
and t.EventDate = mt.MaxDate;
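As a quick check of the max-date join against the sample rows, here is a small sketch. It runs on SQLite through Python's sqlite3 module instead of SQL Server, and the table name events and the column names without spaces are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (EventId INTEGER, ContractId INTEGER, EventDate TEXT, Amount INTEGER);
    INSERT INTO events VALUES
        (1, 1, '2009-01-01', 100),
        (2, 1, '2009-01-02', 20),
        (3, 1, '2009-01-03', 50),
        (4, 2, '2009-01-01', 80),
        (5, 2, '2009-01-04', 30);
""")

# Find the latest date per contract, then join back to fetch the full row.
rows = conn.execute("""
    SELECT t.EventId, t.ContractId, t.EventDate, t.Amount
    FROM events AS t
    JOIN (
        SELECT ContractId, MAX(EventDate) AS MaxDate
        FROM events
        GROUP BY ContractId
    ) AS mt
      ON t.ContractId = mt.ContractId
     AND t.EventDate = mt.MaxDate
    ORDER BY t.ContractId
""").fetchall()
print(rows)
```

One difference between the two approaches is worth keeping in mind: if a contract has two events on its latest date, this join returns both rows, whereas the row_number() version returns exactly one.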