PostgreSQL pulling the MAX value from a set of data - postgresql

I have the following query:
SELECT user_id,
greatest(created_at),
note
FROM user_notes
Which outputs
user_id greatest latest_note
12345 2012-09-05 note1
23456 2013-09-01 note2
23456 2013-09-02 note3
etc. etc. etc.
I thought this query would eliminate duplicates from the user_id column. I want each user_id to have only one "greatest" result, and I can't figure out why there are multiple "greatests" for the same user_id.

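I think you want MAX(), not GREATEST(). GREATEST() compares its arguments within a single row, so with one argument it simply returns created_at for every row; MAX() is the aggregate that collapses the rows of each group into one value: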
WITH latest AS (
SELECT user_id, max(created_at) AS created_at
FROM user_notes
GROUP BY user_id)
SELECT user_notes.*
FROM user_notes
INNER JOIN latest
USING (user_id, created_at)
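As a self-contained sketch of the same idea, the query below uses PostgreSQL's DISTINCT ON instead of the join. The sample rows are hypothetical stand-ins for user_notes, copied from the output shown in the question. Unlike the join, which returns every row tied for the latest created_at, this keeps exactly one row per user_id.

```sql
-- Hypothetical sample rows standing in for the user_notes table.
WITH user_notes (user_id, created_at, note) AS (
    VALUES (12345, DATE '2012-09-05', 'note1'),
           (23456, DATE '2013-09-01', 'note2'),
           (23456, DATE '2013-09-02', 'note3')
)
-- DISTINCT ON keeps the first row of each user_id group,
-- and the ORDER BY makes that first row the most recent note.
SELECT DISTINCT ON (user_id)
       user_id, created_at, note
FROM user_notes
ORDER BY user_id, created_at DESC;
-- Returns (12345, 2012-09-05, note1) and (23456, 2013-09-02, note3).
```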

Related

Combine two different tables Without Duplicating using Postgres SQL

I am working on a query that should combine two tables and return each user as a single entry (users should not be duplicated). For the date, I need the latest of the dates in the two tables.
table 1
table 2
Expected output (I need to combine both tables, get each user's data as a single entry, and take the latest date from the two tables):
user_id name date
----------------------------------
1 John 2020-10-29 --The latest date--
2 Tom 2020-11-15 --The latest date--
3 Peter 2020-12-10 --The latest date--
Actual Output
My PostgreSQL query:
SELECT user_id, name, date
FROM
table_1
UNION
SELECT user_id, name, date
FROM
table_2
I tried many ways but nothing worked. The rows are duplicated when doing the union. Can someone help me?
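Combine the two tables with UNION ALL, then apply ROW_NUMBER() partitioned by user_id and ordered by date descending, and keep only the row numbered 1 for each user. The CTE holds the numbered rows. UNION ALL is used instead of UNION because it skips the extra duplicate-elimination work (a sort or hash) that UNION performs.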
-- PostgreSQL
WITH c_cte AS (
SELECT t.*
, ROW_NUMBER() OVER (PARTITION BY t.user_id ORDER BY t.date DESC) row_num
FROM (SELECT user_id, name, date
FROM table_1
UNION ALL
SELECT user_id, name, date
FROM table_2) t
)
SELECT user_id, name, date
FROM c_cte
WHERE row_num = 1
ORDER BY user_id
The same thing can also be done without a CTE, using a derived table:
SELECT u.user_id, u.name, u.date
FROM (SELECT t.*
, ROW_NUMBER() OVER (PARTITION BY t.user_id ORDER BY t.date DESC) row_num
FROM (SELECT user_id, name, date
FROM table_1
UNION ALL
SELECT user_id, name, date
FROM table_2) t
) u
WHERE u.row_num = 1
ORDER BY u.user_id
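If only the latest date is needed and each user_id always carries the same name in both tables (an assumption, not something the question states), a plain GROUP BY with max() over the UNION ALL is a simpler sketch. The sample rows below are hypothetical, chosen so the result matches the expected output in the question.

```sql
-- Hypothetical sample data standing in for table_1 and table_2.
WITH table_1 (user_id, name, date) AS (
    VALUES (1, 'John',  DATE '2020-10-29'),
           (2, 'Tom',   DATE '2020-11-01'),
           (3, 'Peter', DATE '2020-12-10')
),
table_2 (user_id, name, date) AS (
    VALUES (1, 'John',  DATE '2020-10-01'),
           (2, 'Tom',   DATE '2020-11-15'),
           (3, 'Peter', DATE '2020-12-01')
)
-- One row per user, carrying the latest date found in either table.
SELECT user_id, name, max(date) AS date
FROM (SELECT user_id, name, date FROM table_1
      UNION ALL
      SELECT user_id, name, date FROM table_2) t
GROUP BY user_id, name
ORDER BY user_id;
```

If a user's name can differ between the two tables, this would return one row per (user_id, name) pair, and the ROW_NUMBER() version above is the safer choice.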

In PostgreSQL how to write a query returning multiple rows and columns

My original query was like this:
SELECT *,
(SELECT COUNT(*), user_id FROM user_sessions WHERE company_id=companies.id GROUP BY user_id) as user_sessions
FROM companies
Which comes back with this error:
error: more than one row returned by a subquery used as an expression
I found a way past that error with this:
SELECT *,
ARRAY (SELECT COUNT(*), user_id FROM user_sessions WHERE company_id=companies.id GROUP BY user_id) as user_sessions
FROM companies
But then it has this error:
error: subquery must return only one column
If I remove either COUNT(*) or user_id from the returned columns it works, however I need both sets of data. How do I return more than one column in a sub-query like this?
I guess a join should do the trick:
select * from
companies
join
( select count(*) as session_count, company_id, user_id
from user_sessions
group by company_id, user_id
) as user_sessions
on companies.id = user_sessions.company_id
For anyone who runs into this in the future: I tested it, and the best way to do this is what @Matt mentioned in the comments:
Replace COUNT(*), user_id with ARRAY[COUNT(*), user_id]
Works perfectly
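Spelled out in full, the ARRAY[...] suggestion could look like the sketch below. The sample rows and the name column are hypothetical. It relies on ARRAY(subquery) accepting an array-valued column and returning an array one dimension higher, which PostgreSQL supports from version 9.5, and it assumes user_id is an integer type so that it can share an array with count(*).

```sql
-- Hypothetical sample data standing in for companies and user_sessions.
WITH companies (id, name) AS (
    VALUES (1, 'Acme'), (2, 'Globex')
),
user_sessions (company_id, user_id) AS (
    VALUES (1, 10), (1, 10), (1, 11), (2, 20)
)
SELECT c.*,
       -- Each inner ARRAY[...] is one (session count, user_id) pair;
       -- the outer ARRAY(...) collects the pairs into a 2-D array.
       ARRAY(SELECT ARRAY[count(*), s.user_id]
             FROM user_sessions s
             WHERE s.company_id = c.id
             GROUP BY s.user_id
             ORDER BY s.user_id) AS user_sessions
FROM companies c;
```

Because both values end up in one array they must share a type, so this only suits numeric pairs; for mixed types the join shown earlier is the more general route.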

Converting counts inside query result tables to percentages of total

I have a table and want to calculate the percentage of total by store_id which each (category_id, store_id) subtotal represents. My code is below:
WITH
example_table (name, store_id)
AS
(
select name, store_id
from category
join film_category using (category_id)
join film using (film_id)
join inventory using (film_id)
join rental using (inventory_id)
)
SELECT name, store_id, cast(count(*) as numeric)/(SELECT count(*) FROM example_table)
FROM example_table
GROUP BY name, store_id
ORDER BY name, store_id
This code works in the sense that it doesn't throw an error, but the results are not what I'm looking for. Here each of the subtotals is divided by the total across both stores and all 16 names. Instead, I want the subtotals divided by their respective store totals or by their respective name totals.
I'm wondering how to perform calculations on those subtotals in general.
Thanks in advance,
I believe you need to explore the possibilities of using aggregate functions combined with an OVER(PARTITION BY ...) e.g.
SELECT DISTINCT
name, store_id, store_id_count, name_count
FROM (
select name, store_id
, count(*) over(partition by store_id) as store_id_count
, count(*) over(partition by name) as name_count
from category
join film_category using (category_id)
join film using (film_id)
join inventory using (film_id)
join rental using (inventory_id)
) AS example_table
When you use an aggregate function with the OVER clause, you get the wanted counts on each row of the result, which seems to be what you need in this case. Note that SELECT DISTINCT is used simply to reduce the final number of rows returned; you might still need a GROUP BY, but I am not sure that you do.
Once you have the needed values within the derived table (aliased as example_table), it should be a simple matter of some arithmetic in the overall select clause.
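One way to sketch that arithmetic is to keep the GROUP BY and apply a window function on top of the aggregate: sum(count(*)) OVER (PARTITION BY ...) yields the store total or name total next to each subtotal. The sample rows and the output column names below are hypothetical; in the real query, example_table would be the CTE from the question.

```sql
-- Hypothetical sample rows standing in for the joined rental data.
WITH example_table (name, store_id) AS (
    VALUES ('Action', 1), ('Action', 1), ('Action', 2),
           ('Comedy', 1), ('Comedy', 2), ('Comedy', 2), ('Comedy', 2)
)
SELECT name,
       store_id,
       count(*) AS subtotal,
       -- Subtotal as a percentage of all rows for the same store.
       round(100.0 * count(*) / sum(count(*)) OVER (PARTITION BY store_id), 2) AS pct_of_store,
       -- Subtotal as a percentage of all rows for the same name.
       round(100.0 * count(*) / sum(count(*)) OVER (PARTITION BY name), 2) AS pct_of_name
FROM example_table
GROUP BY name, store_id
ORDER BY name, store_id;
```

The window functions run after grouping, so each one sees a single row per (name, store_id) pair and sums the subtotals within its partition.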

PostgreSQL Query: How to get Group-By to Roll Up

Data:
There are 2 possible statuses, 0 or 1
There are a variety of different Sub Statuses and sources
I am looking to create a query that will accomplish the following:
By Week/Year, show the number of unique instances of each status, sub status and source combination. I do have UIDs I can count for "number of instances".
I have written the following:
SELECT date_part('week', date) as week, date_part('year', date) as year, active_date, status, sub_status, source, id
FROM public.users
WHERE status < 2
GROUP BY created_at, active_date, status, sub_status, source, id
ORDER BY created_at DESC
Which accomplished the following:
How do I get these to roll up?
Thanks!
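It turns out the rows were not rolling up because the query grouped by created_at and id, which keeps every row in its own group. I renamed the aliases to 'theweek' and 'theyear' to avoid any clash with keywords ('year' is a keyword in PostgreSQL, although a non-reserved one), grouped by those aliases instead, and added a count function. With that, I was able to roll these up.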
SELECT date_part('week', created_at) as theweek, date_part('year', created_at) as theyear, status, sub_status, source, Count(*)
FROM public.users
WHERE status < 2
GROUP BY theweek, theyear, status, sub_status, source
ORDER BY theweek DESC
Results looked like the following:
Thanks everyone!
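A variation worth considering is grouping on date_trunc('week', ...) rather than on separate week and year parts. date_part('week', ...) returns the ISO week number, which can disagree with date_part('year', ...) around New Year (the ISO counterpart is 'isoyear'), whereas date_trunc gives a single week-start timestamp. The sketch below uses hypothetical sample rows and assumes id is the UID to count.

```sql
-- Hypothetical sample rows standing in for public.users.
WITH users (id, created_at, status, sub_status, source) AS (
    VALUES (1, TIMESTAMP '2019-01-07 10:00', 0, 'new',  'web'),
           (2, TIMESTAMP '2019-01-09 12:00', 0, 'new',  'web'),
           (3, TIMESTAMP '2019-01-15 09:00', 1, 'paid', 'ads')
)
SELECT date_trunc('week', created_at) AS week_start,  -- Monday of that week
       status,
       sub_status,
       source,
       count(DISTINCT id) AS instances
FROM users
WHERE status < 2
GROUP BY week_start, status, sub_status, source
ORDER BY week_start DESC;
```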

select last of an item for each user in postgres

I want to get the last entry for each user, but the customer_id is a hash ('ASAG#...') and ordering by customer_id breaks the query. Is there an alternative?
Select Distinct On (l.customer_id)
l.customer_id
,l.created_at
,l.text
From likes l
Order By l.customer_id, l.created_at Desc
Your current query already appears to be working, q.v. here:
Demo
I don't know why your current query is not generating the results you expect. It should return one distinct record for every customer, namely the most recent one, given your ORDER BY clause.
In any case, if it does not do what you want, an alternative would be to use ROW_NUMBER() here with a partition by customer. The inner query assigns a row number to each record within a customer's partition, with the value 1 going to the most recent record for each customer. Then the outer query retains only that latest record.
SELECT
t.customer_id,
t.created_at,
t.text
FROM
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at DESC) rn
FROM likes
) t
WHERE t.rn = 1
To speed up the inner query which uses ROW_NUMBER() you can try adding a composite index on the customer_id and created_at columns:
CREATE INDEX yourIdx ON likes (customer_id, created_at);
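For a quick way to try both the index and the original DISTINCT ON query, the sketch below builds a temporary likes table with hypothetical rows. The customer_id values are invented text keys, and the index name and the DESC ordering on created_at are assumptions. Text ids sort like any other string, and the leading ORDER BY customer_id is only there because DISTINCT ON requires it to match.

```sql
-- Temporary table with hypothetical rows; column names follow the question.
CREATE TEMP TABLE likes (
    customer_id text,
    created_at  timestamp,
    text        text
);

INSERT INTO likes VALUES
    ('hash_a', '2021-01-01 10:00', 'first'),
    ('hash_a', '2021-01-02 10:00', 'second'),
    ('hash_b', '2021-01-01 09:00', 'only');

-- Composite index whose order matches the query's ORDER BY.
CREATE INDEX likes_customer_created_idx
    ON likes (customer_id, created_at DESC);

-- One row per customer: the most recent like.
SELECT DISTINCT ON (customer_id)
       customer_id, created_at, text
FROM likes
ORDER BY customer_id, created_at DESC;
-- Returns the 'second' row for hash_a and the 'only' row for hash_b.
```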