SUM values from two tables with GROUP BY and WHERE - postgresql

I have two tables below named sent_table and received_table. I am attempting to mash them together in a query to achieve output_table. All my attempts so far result in a huge amount of duplicates and totally bogus sum values.
I am assuming I would need to use GROUP BY and WHERE to achieve this goal. I want to be able to filter based on the users name.
sent_table
+----+------+-------+----------+
| id | name | value | order_id |
+----+------+-------+----------+
| 1 | dave | 100 | 1 |
| 2 | dave | 200 | 1 |
| 3 | dave | 300 | 2 |
+----+------+-------+----------+
received_table
+----+------+-------+----------+
| id | name | value | order_id |
+----+------+-------+----------+
| 1 | dave | 400 | 1 |
| 2 | dave | 500 | 2 |
| 3 | dave | 600 | 2 |
+----+------+-------+----------+
output table
+------+----------+----------+
| sent | received | order_id |
+------+----------+----------+
| 300 | 400 | 1 |
| 300 | 1100 | 2 |
+------+----------+----------+
I tried the following with no joy. This does not impose any restrictions on how I would desire to solve this problem. It is just how I attempted to do it.
SELECT *
FROM
( select SUM(value) as sent, order_id FROM sent_table WHERE name='dave' GROUP BY order_id) A
CROSS JOIN
( select SUM(value) as received, order_id FROM received_table WHERE name='dave' GROUP BY order_id) B
Any help would be greatly appreciated.

Do the sums on each table, grouping by order_id, then join the results. To get the rows even if one side is missing, do a FULL OUTER JOIN:
SELECT COALESCE(s.order_id, r.order_id) AS order_id, s.sent, r.received
FROM (
SELECT order_id, SUM(value) AS sent
FROM sent
GROUP BY order_id
) s
FULL OUTER JOIN (
SELECT order_id, SUM(value) AS received
FROM received
GROUP BY order_id
) r
USING (order_id)
ORDER BY 1
Result:
| order_id | sent | received |
| -------- | ---- | -------- |
| 1 | 300 | 400 |
| 2 | | 1100 |
Note the COALESCE on the order_id, so that if it's missing from sent it will be taken from recevied, so that that value will never be NULL.
If you want to have 0 in place of NULL (when e.g. there is no record for that order_id in either sent or received), you would do COALESCE(s.sent, 0) AS sent, COALESCE(r.received, 0) AS received.
https://www.db-fiddle.com/f/nq3xYrcys16eUrBRHT6xLL/2

Related

PostgreSQL - Check if column value exists in any previous row

I'm working on a problem where I need to check if an ID exists in any previous records within another ID set, and create a tag if it does.
Suppose I have the following table
| client_id | order_date | supplier_id |
| 1 | 2022-01-01 | 1 |
| 1 | 2022-02-01 | 2 |
| 1 | 2022-03-01 | 1 |
| 1 | 2022-04-01 | 3 |
| 2 | 2022-05-01 | 1 |
| 2 | 2022-06-01 | 1 |
| 2 | 2022-07-01 | 2 |
And I want to create a column with a "is new supplier" tag (for each client):
| client_id | order_date | supplier_id | is_new_supplier|
| 1 | 2022-01-01 | 1 | True
| 1 | 2022-02-01 | 2 | True
| 1 | 2022-03-01 | 1 | False
| 1 | 2022-04-01 | 3 | True
| 2 | 2022-05-01 | 1 | True
| 2 | 2022-06-01 | 1 | False
| 2 | 2022-07-01 | 2 | True
First I tried doing this by creating a dense_rank and filtering out repeated ranks, but it didn't work:
with aux as (SELECT client_id,
order_date,
supplier_id
FROM table)
SELECT *, dense_rank() over (
partition by client_id
order by supplier_id
) as _dense_rank
FROM aux
Another way I thought about doing this, is by creating an auxiliary id with client_id + supplier_id, ordering by date and checking if the aux id exists in any previous row, but I don't know how to do this in SQL.
You are on the right track.
Instead of dense_rank, you can just use row_number and on your partition by add supplier id..
Don't forget to order by order_date
with aux as (SELECT client_id,
order_date,
supplier_id,
row_number() over (
partition by client_id, supplier_id
order by order_date
) as rank
FROM table)
SELECT client_id,
order_date,
supplier_id,
rank,
(rank = 1) as is_new_supplier
FROM aux

Postgres Hierarchy output

im struggling on how to get the correct output using hierarchy query.
I have one table which loads per day all product and its price. during time this can cancel and being activate again.
I believe with oracle we could use the Connect By.
WITH RECURSIVE cte AS (
select min(event_date) event_date, item_code,sum(price::numeric)/1024/1024 price, 1 AS level
from rdpidevdat.raid_r_cbs_offer_accttype_map where product_type='cars' and item_code in ('Renault')
group by item_code
UNION ALL
SELECT e.event_date, e.item_code, e.price, cte.level + 1
from (select event_date, item_code,sum(price::numeric)/1024/1024 price
from rdpidevdat.raid_r_cbs_offer_accttype_map where product_type='cars' and item_code in ('9859')
group by event_date,item_code) e join cte ON e.event_date = cte.event_date and e.item_code = cte.item_code
)
SELECT *
FROM cte where item_code in ('Renault') ;
how do i put an ouput where will have the range of each product during time?
if we have the data:
EVENT_DATE | ITEM_COD| PRICE
20210910 | Renaut | 2500
20210915 | Renaut | 2500
20210920 | Renaut | 2600
20211020 | Renaut | 2900
20220101 | Renaut | 2500
the expected output should be:
-------------------------------------------------
FROM_EVENT_DATE | TO_EVENT_DATE | ITEM_COD| PRICE
20210910 | 20210915 | Renaut | 2500
20210915 | 20210920 | Renaut | 2600
20210920 | 20211020 | Renaut | 2900
20211020 | 20220101 | Renaut | 2500
Thanks in Advance and Regards!
I already found the solution. Using the Lag and lastvalue function. no need to use the hierarchy one.

How can I `SUM()` in PostgreSQL based on certain condition? For summing debits and credits in accounting journal table

I have a database full with accounting journals. There is table for accounting journal itself (the accounting journal's metadata) and there is a table for accounting journal line (for each account with its debit or credit).
I have database like this:
+----+---------------+--------+---------+
| ID | JOURNAL_NAME | DEBIT | CREDIT |
+----+---------------+--------+---------+
| | | | |
| 1 | INV/0001 | 100 | 0 |
| | | | |
| 2 | INV/0001 | 0 | 100 |
| | | | |
| 3 | INV/0002 | 200 | 0 |
| | | | |
| 4 | INV/0002 | 0 | 200 |
+----+---------------+--------+---------+
I want to have all journal with the same name to be summed in one, their debits and credits. So from the above table... I want to have a query that makes something like this:
+--------------+--------+---------+
| JOURNAL_NAME | DEBIT | CREDIT |
+--------------+--------+---------+
| | | |
| INV/0001 | 100 | 100 |
| | | |
| INV/0002 | 200 | 200 |
+--------------+--------+---------+
I have tried with:
SELECT DISTINCT ON (accounting_journal.id)
accounting_journal.name,
accounting_journal_line.debit,
accounting_journal_line.credit
FROM accounting_journal_line
JOIN accounting_journal ON accounting_journal.id = accounting_journal_line.move_id
ORDER BY accounting_journal.id ASC
LIMIT 3;
With the above query, I have all the journal and the journal lines. I just need to have the above query to sum the debits and credits for every same accounting_journal.name.
I have tried with SUM() but it always stuck in GROUP BY` clause.
SELECT DISTINCT ON (accounting_journal.id)
accounting_journal.name,
accounting_journal.ref,
accounting_journal_line.name,
SUM(accounting_journal_line.debit),
SUM(accounting_journal_line.credit)
FROM accounting_journal_line
JOIN accounting_journal ON accounting_journal.id = accounting_journal_line.move_id
ORDER BY accounting_journal.id ASC
LIMIT 3;
The error:
Error in query (7): ERROR: column "accounting_journal.name" must appear in the GROUP BY clause or be used in an aggregate function
LINE 2: accounting_journal.name,
I hope I can get assistance or pointer where I need to look at, here. Thanks!
When you are using any aggregation function with normal columns then your have to mention all the non-aggregating column in group by clause,
So try This:
SELECT DISTINCT ON (accounting_journal.id)
accounting_journal.name,
accounting_journal.ref,
accounting_journal_line.name,
SUM(accounting_journal_line.debit),
SUM(accounting_journal_line.credit)
FROM accounting_journal_line
JOIN accounting_journal ON accounting_journal.id = accounting_journal_line.move_id
group by 1,2,3
ORDER BY accounting_journal.id ASC
LIMIT 3;
In your query you are having 3 non-aggregation column so you can mention column number in group by clause to achieve it.
You can use the Sum Window Function, it does not require "group by". So:
select aj.id journal_id
aj.name journal_name,
aj.ref journal_ref,
ajl.name line_name,
sum(ajl.debit) over(partition by aj.id) total_debit,
sum(ajl.credit) over(partition by aj.id) total_credit
from accounting_journal_line ajl
join accounting_journal aj
on aj.id = ajl.move_id
order by aj.id;
See fiddle for a working example.

Reset column with numeric value that represents the order when destroying a row

I have a table of users that has a column called order that represents the order in they will be elected.
So, for example, the table might look like:
| id | name | order |
|-----|--------|-------|
| 1 | John | 2 |
| 2 | Mike | 0 |
| 3 | Lisa | 1 |
So, say that now Lisa gets destroyed, I would like that in the same transaction that I destroy Lisa, I am able to update the table so the order is still consistent, so the expected result would be:
| id | name | order |
|-----|--------|-------|
| 1 | John | 1 |
| 2 | Mike | 0 |
Or, if Mike were the one to be deleted, the expected result would be:
| id | name | order |
|-----|--------|-------|
| 1 | John | 1 |
| 3 | Lisa | 0 |
How can I do this in PostgreSQL?
If you are just deleting one row, one option uses a cte and the returning clause to then trigger an update
with del as (
delete from mytable where name = 'Lisa'
returning ord
)
update mytable
set ord = ord - 1
from del d
where mytable.ord > d.ord
As a more general approach, I would really recommend trying to renumber the whole table after every delete. This is inefficient, and can get tedious for multi-rows delete.
Instead, you could build a view on top of the table:
create view myview as
select id, name, row_number() over(order by ord) ord
from mytable

Grouping in t-sql with latest dates

I have a table like this
Event ID | Contract ID | Event date | Amount |
----------------------------------------------
1 | 1 | 2009-01-01 | 100 |
2 | 1 | 2009-01-02 | 20 |
3 | 1 | 2009-01-03 | 50 |
4 | 2 | 2009-01-01 | 80 |
5 | 2 | 2009-01-04 | 30 |
For each contract I need to fetch the latest event and amount associated with the event and get something like this
Event ID | Contract ID | Event date | Amount |
----------------------------------------------
3 | 1 | 2009-01-03 | 50 |
5 | 2 | 2009-01-04 | 30 |
I can't figure out how to group the data correctly. Any ideas?
Thanks in advance.
SQL 2k5/2k8:
with cte_ranked as (
select *
, row_number() over (
partition by ContractId order by EvantDate desc) as [rank]
from [table])
select *
from cte_ranked
where [rank] = 1;
SQL 2k:
select t.*
from table as t
join (
select max(EventDate) as MaxDate
, ContractId
from table
group by ContractId) as mt
on t.ContractId = mt.ContractId
and t.EventDate = mt.MaxDate