Combine two different tables Without Duplicating using Postgres SQL - postgresql

I am working on a query where I should combine 2 tables and get each user as a separate entry (The user should not be duplicate). For the date, I need to get the latest out of those 2 tables
table 1
table 2
Expected output ( I need to combine both tables and get the data's of the user as a single entry and for the date, i need to get the latest date out of those 2 tables)
user_id name date
----------------------------------
1 John 2020-10-29 --The latest date--
2 Tom 2020-11-15 --The latest date--
3 Peter 2020-12-10 --The latest date--
Actual Output
My postgresql
SELECT user_id, name, date
FROM
table_1
UNION
SELECT user_id, name, date
FROM
table_2
I tried many ways but nothing worked. The datas are duplicating when doing the union. Can someone help me

Use combine two tables using UNION ALL then apply ROW_NUMBER() for serializing user_id wise value with descending date. Then retrieve last record by using CTE. Using UNION ALL for avoiding extra ordering.
-- PostgreSQL
WITH c_cte AS (
SELECT t.*
, ROW_NUMBER() OVER (PARTITION BY t.user_id ORDER BY t.date DESC) row_num
FROM (SELECT user_id, name, date
FROM table_1
UNION ALL
SELECT user_id, name, date
FROM table_2) t
)
SELECT user_id, name, date
FROM c_cte
WHERE row_num = 1
ORDER BY user_id
Also another way for doing same thing without CTE
SELECT u.user_id, u.name, u.date
FROM (SELECT t.*
, ROW_NUMBER() OVER (PARTITION BY t.user_id ORDER BY t.date DESC) row_num
FROM (SELECT user_id, name, date
FROM table_1
UNION ALL
SELECT user_id, name, date
FROM table_2) t
) u
WHERE u.row_num = 1
ORDER BY u.user_id

Related

Converting counts inside query result tables to percentages of total

I have a table and want to calculate the percentage of total by store_id which each (category_id, store_id) subtotal represents. My code is below:
WITH
example_table (name, store_id)
AS
(
select name, store_id
from category
join film_category using (category_id)
join film using (film_id)
join inventory using (film_id)
join rental using (inventory_id)
)
SELECT name, store_id, cast(count(*) as numeric)/(SELECT count(*) FROM example_table)
FROM example_table
GROUP BY name, store_id
ORDER BY name, store_id
This code actually works, as in, it doesn't throw an error, only they're not the results I'm looking for. Here each of the subtotals is divided by the total across both stores and all 16 names. Instead, I want the subtotals divided by their respective store totals or divided by their respective name totals.
I'm wondering how to perform calculations on those subtotals in general.
Thanks in advance,
I believe you need to explore the possibilities of using aggregate functions combined with an OVER(PARTITION BY ...) e.g.
SELECT DISTINCT
name, store_id, store_id_count, name_count
FROM (
select name, store_id
, count(*) over(partition by store_id) as store_id_count
, count(*) over(partition by name) as name_count
from category
join film_category using (category_id)
join film using (film_id)
join inventory using (film_id)
join rental using (inventory_id)
) AS example_table
When using aggregate function with the over clause you get the wanted counts on each row of the result, and it seems that in this case you need this. Note that select distinct has been used simply to reduce the final number of rows returned, you might still need to use a group by but I am not sure if you do.
Once you have the needed values within the derived table (aliases as example_table) then it should be a simple matter of some arithmetic in the overall select clause.

select last of an item for each user in postgres

I want to get the last entry for each user but the customer_id is a hash 'ASAG#...' order by customer_id destroys the query. Is there an alternative?
Select Distinct On (l.customer_id)
l.customer_id
,l.created_at
,l.text
From likes l
Order By l.customer_id, l.created_at Desc
Your current query already appears to be working, q.v. here:
Demo
I don't know why your current query is not generating the results you would expect. It should return one distinct record for every customer, corresponding to the more recent one, given your ORDER BY statement.
In any case, if it does not do what you want, an alternative would be to use ROW_NUMBER() here with a partition by user. The inner query assigns a row number to each user, with the value 1 going to the most recent record for each user. Then the outer query retains only the latest record.
SELECT
t.customer_id,
t.created_at,
t.text
FROM
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at DESC) rn
FROM likes
) t
WHERE t.rn = 1
To speed up the inner query which uses ROW_NUMBER() you can try adding a composite index on the customer_id and created_at columns:
CREATE INDEX yourIdx ON likes (customer_id, created_at);

SQL SELECT in another table with most recent date

I have a list of Matter data in Table1 that I need to query, as well as get the most recent Invoice Number in Table2 that is tied to the original Matter. I'm having extreme difficulty in joining these tables together and only getting one result for each Matter as I only want the most recent Invoice #.
Any and all help would be greatly appreciated.
Table1
Table2
RESULT
The following assigns numbers to each invoice row in order of date, and selects only the most recent. Note that this assumes InvoiceDate is stored as a date,datetime, or something else that will sort chronologically, and that in the event of two invoices for the same date, returning either will be fine. If you need to return both invoices in the event of ties, replace row_number with rank.
Select * from Table1 a
inner join
(Select *
, row_number() over (partition by MatterID order by InvoiceDate desc) as RN
from Table2) b
on a.MatterID = b.MatterID and b.RN = 1

PostgreSQL pulling the MAX value from a set of data

I have the following query:
SELECT user_id,
greatest(created_at),
note
FROM user_notes
Which outputs
user_id greatest latest_note
12345 2012-09-05 note1
23456 2013-09-01 note2
23456 2013-09-02 note3
etc. etc. etc.
I thought this query would eliminate duplicates from the user_id row. I want each user_id to only have one "greatest" result. I can't seem to figure out why there are multiple "greatests" for the same user_id.
I think you want MAX() not GREATEST():
WITH latest AS (
SELECT user_id, max(created_at) AS created_at
FROM user_notes
GROUP BY user_id)
SELECT user_notes.*
FROM user_notes
INNER JOIN latest
USING (user_id, created_at)

PostgreSQL: Select first row as column inside select

I got 2 tables like Customers and Orders, in table Customers I got columns id, name, in table Orders I got columns id, customer_id, order_date.
Now I need to make one select that will return me each Customer's id, name and the last order_date.
I tried to make like this:
select
Customers.id,
Customers.name,
(select Orders.order_date from Orders where Orders.customer_id = Customer.id order by order_date desc) as last_order_date
from
Customers
But it get the wrong index and takes forever to execute.
Whats the best way to make this select in PostgreSQL?
Thanks in advanced.
If not restricting by customer_id, then the query will end up having to scan the entire orders table.
SELECT c.id
,c.name
,MAX(o.order_date) AS last_order_date
FROM Customers c
LEFT OUTER JOIN Orders o ON (o.customer_id = c.id)
GROUP BY c.id, c.name