PostgreSQL Query: How to get Group-By to Roll Up

Data:
There are 2 possible statuses, 0 or 1
There are a variety of different Sub Statuses and sources
I am looking to create a query that will accomplish the following:
By Week/Year, show the number of unique instances of each status, sub status and source combination. I do have UIDs I can count for "number of instances".
I have written the following:
SELECT date_part('week', date) as week, date_part('year', date) as year, active_date, status, sub_status, source, id
FROM public.users
WHERE status < 2
GROUP BY created_at, active_date, status, sub_status, source, id
ORDER BY created_at DESC
Which accomplished the following:
How do I get these to roll up?
Thanks!

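The fix was to group by the derived week and year instead of by created_at and id (grouping by id puts every row in its own group), and to add a count function. I also renamed the aliases to 'theweek' and 'theyear' because I suspected 'week' and 'year' of being reserved words, although PostgreSQL accepts both as column aliases after AS. With those changes I was able to roll these up.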
SELECT date_part('week', created_at) as theweek, date_part('year', created_at) as theyear, status, sub_status, source, Count(*)
FROM public.users
WHERE status < 2
GROUP BY theweek, theyear, status, sub_status, source
ORDER BY theweek DESC
Results looked like the following:
Thanks everyone!
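For readers who want to try the roll-up without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite only stands in for PostgreSQL here, so strftime replaces date_part, and '%W' counts weeks from the first Monday of the year rather than the ISO week that date_part('week', ...) returns. The sample rows are hypothetical, not the asker's data.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, created_at TEXT, "
    "status INTEGER, sub_status TEXT, source TEXT)"
)
# Hypothetical sample rows: three in one week, one in the following week.
conn.executemany(
    "INSERT INTO users (created_at, status, sub_status, source) "
    "VALUES (?, ?, ?, ?)",
    [
        ("2021-03-01", 1, "trial", "web"),
        ("2021-03-02", 1, "trial", "web"),
        ("2021-03-03", 0, "lapsed", "app"),
        ("2021-03-10", 1, "trial", "web"),
    ],
)

# Group by the derived year/week, not by created_at or id,
# so rows that share a week and combination collapse into one count.
rollup = conn.execute(
    """
    SELECT strftime('%Y', created_at) AS theyear,
           strftime('%W', created_at) AS theweek,
           status, sub_status, source,
           COUNT(*) AS instances
    FROM users
    WHERE status < 2
    GROUP BY theyear, theweek, status, sub_status, source
    ORDER BY theyear DESC, theweek DESC, status DESC
    """
).fetchall()

for row in rollup:
    print(row)
```

Ordering by the year before the week keeps the newest weeks first once the data spans more than one year.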

Related

Combine two different tables Without Duplicating using Postgres SQL

I am working on a query that should combine 2 tables and return each user as a single entry (no user should be duplicated). For the date, I need the latest of the dates from those 2 tables.
table 1
table 2
Expected output (I need to combine both tables, return each user's data as a single entry and, for the date, take the latest date from the 2 tables):
user_id name date
----------------------------------
1 John 2020-10-29 --The latest date--
2 Tom 2020-11-15 --The latest date--
3 Peter 2020-12-10 --The latest date--
Actual Output
My PostgreSQL query:
SELECT user_id, name, date
FROM
table_1
UNION
SELECT user_id, name, date
FROM
table_2
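I tried many ways but nothing worked. The rows are duplicated when doing the union. Can someone help me?
Combine the two tables with UNION ALL, then apply ROW_NUMBER() partitioned by user_id and ordered by date descending, so that each user's latest row gets number 1. Then select only the rows numbered 1 from the CTE. UNION ALL is used because it skips the duplicate-elimination step that UNION performs; the row number already takes care of keeping one row per user.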
-- PostgreSQL
WITH c_cte AS (
SELECT t.*
, ROW_NUMBER() OVER (PARTITION BY t.user_id ORDER BY t.date DESC) row_num
FROM (SELECT user_id, name, date
FROM table_1
UNION ALL
SELECT user_id, name, date
FROM table_2) t
)
SELECT user_id, name, date
FROM c_cte
WHERE row_num = 1
ORDER BY user_id
Another way to do the same thing without a CTE:
SELECT u.user_id, u.name, u.date
FROM (SELECT t.*
, ROW_NUMBER() OVER (PARTITION BY t.user_id ORDER BY t.date DESC) row_num
FROM (SELECT user_id, name, date
FROM table_1
UNION ALL
SELECT user_id, name, date
FROM table_2) t
) u
WHERE u.row_num = 1
ORDER BY u.user_id
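The UNION ALL plus ROW_NUMBER() approach can be tried end to end with the sketch below. It is only an illustration under assumptions: it uses Python's sqlite3 module (SQLite 3.25 or later for window functions) in place of PostgreSQL, and the older dates in the sample rows are invented so that each user's latest date matches the expected output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for PostgreSQL
conn.executescript(
    """
    CREATE TABLE table_1 (user_id INTEGER, name TEXT, date TEXT);
    CREATE TABLE table_2 (user_id INTEGER, name TEXT, date TEXT);
    -- Hypothetical rows; only the latest date per user comes from the question.
    INSERT INTO table_1 VALUES
        (1, 'John', '2020-10-01'),
        (2, 'Tom', '2020-11-15'),
        (3, 'Peter', '2020-12-10');
    INSERT INTO table_2 VALUES
        (1, 'John', '2020-10-29'),
        (2, 'Tom', '2020-11-01');
    """
)

# Number each user's rows from newest to oldest and keep row 1.
latest = conn.execute(
    """
    WITH c_cte AS (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY t.user_id
                                  ORDER BY t.date DESC) AS row_num
        FROM (SELECT user_id, name, date FROM table_1
              UNION ALL
              SELECT user_id, name, date FROM table_2) t
    )
    SELECT user_id, name, date
    FROM c_cte
    WHERE row_num = 1
    ORDER BY user_id
    """
).fetchall()

# Simpler alternative when name is identical in both tables:
# aggregate with MAX(date) instead of ranking rows.
latest_by_max = conn.execute(
    """
    SELECT user_id, name, MAX(date) AS date
    FROM (SELECT user_id, name, date FROM table_1
          UNION ALL
          SELECT user_id, name, date FROM table_2) t
    GROUP BY user_id, name
    ORDER BY user_id
    """
).fetchall()

print(latest)
```

The ROW_NUMBER() form is the more general of the two, because it still returns one row per user when other columns (such as name) differ between the tables.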

postgresql get weekly average of cases with daily data

I have a table called Table1. I am trying to get the weekly average, but I only have daily data. My table contains the following attributes: caseID, date, status and some other (irrelevant) attributes. With the following query, I made the following table which comes close to what I want:
However, I would like to add an average per week of the number of cases. I have looked everywhere, but I am not sure how to include that. Does anybody have any clues on how to add it?
Thanks.
To expand on @luuk's answer...
SELECT
date,
COUNT(id) as countcase,
EXTRACT(WEEK FROM date) AS weeknbr,
AVG(COUNT(id)) OVER (PARTITION BY EXTRACT(WEEK FROM date)) as weeklyavg
FROM table1
GROUP BY date, weeknbr
ORDER BY date, weeknbr
This is possible as the Aggregation / GROUP BY is applied before the window/analytic function.
select
date,
countcase,
extract(week from date) as weeknbr,
avg(countcase) over (partition by extract(week from date)) as weeklyavg
from table1;
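The two stages (daily counts first, then a window average over those counts) can be made explicit with a CTE. The sketch below is a hedged illustration: it runs on Python's sqlite3 module rather than PostgreSQL, uses strftime('%W', ...) where the answer uses EXTRACT(WEEK FROM ...), and fills table1 with hypothetical rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for PostgreSQL
conn.execute(
    "CREATE TABLE table1 (id INTEGER PRIMARY KEY, date TEXT, status TEXT)"
)
# Hypothetical daily case counts: two days in one week, one day in the next.
cases_per_day = [("2021-03-01", 2), ("2021-03-02", 4), ("2021-03-08", 5)]
conn.executemany(
    "INSERT INTO table1 (date, status) VALUES (?, 'open')",
    [(day,) for day, n in cases_per_day for _ in range(n)],
)

weekly = conn.execute(
    """
    WITH daily AS (
        -- Stage 1: one row per day with its case count.
        SELECT date,
               COUNT(id) AS countcase,
               strftime('%W', date) AS weeknbr
        FROM table1
        GROUP BY date
    )
    -- Stage 2: average the daily counts within each week.
    SELECT date, countcase, weeknbr,
           AVG(countcase) OVER (PARTITION BY weeknbr) AS weeklyavg
    FROM daily
    ORDER BY date
    """
).fetchall()

for row in weekly:
    print(row)
```

If the data spans several years, partitioning by year as well as week keeps week 10 of one year from being averaged with week 10 of another.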

Get the first live record on each quarter

We have a pricing table and I need to get the first live record in each quarter. The table structure is like this:
record_id (int)
start_date (date)
price (decimal)
live (boolean)
I need to be able to get the first "live" record on each quarter.
So far, I've been able to do this:
SELECT DISTINCT EXTRACT(QUARTER FROM start_date::TIMESTAMP) as quarter,
EXTRACT(YEAR FROM start_date::TIMESTAMP) as year,
distinct start_date,
live
FROM record_pricing rp
group by year, quarter,record_instance_uid
order by year,quarter;
I get this:
As you can see, the results contain both live and non-live records. I just need the first live record in each quarter, as highlighted in the picture above as an example.
You can use:
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY year, quarter ORDER BY start_date ASC) AS rn
FROM (
SELECT EXTRACT(QUARTER FROM start_date::TIMESTAMP) AS quarter,
EXTRACT(YEAR FROM start_date::TIMESTAMP) AS year,
record_instance_uid, live, start_date
FROM record_pricing rp
WHERE live -- rank only live records
) tab
) ranked
WHERE ranked.rn = 1 -- a window alias cannot be filtered at the level where it is defined
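As a runnable illustration of picking the first live record per quarter, here is a small sketch on Python's sqlite3 module. Several things in it are assumptions: SQLite stands in for PostgreSQL, the quarter is derived from the month because SQLite has no EXTRACT(QUARTER ...), live is stored as 0/1, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for PostgreSQL
conn.executescript(
    """
    CREATE TABLE record_pricing (
        record_id INTEGER, start_date TEXT, price REAL, live INTEGER
    );
    -- Hypothetical rows: the earliest record in Q1 is not live.
    INSERT INTO record_pricing VALUES
        (1, '2021-01-05', 10.0, 0),
        (2, '2021-01-20', 11.0, 1),
        (3, '2021-02-10', 12.0, 1),
        (4, '2021-04-02', 13.0, 1),
        (5, '2021-05-01', 14.0, 0);
    """
)

first_live = conn.execute(
    """
    WITH live_rows AS (
        -- Filter to live records before ranking them.
        SELECT strftime('%Y', start_date) AS year,
               (CAST(strftime('%m', start_date) AS INTEGER) + 2) / 3 AS quarter,
               record_id, start_date
        FROM record_pricing
        WHERE live = 1
    ),
    ranked AS (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY year, quarter
                                  ORDER BY start_date) AS rn
        FROM live_rows
    )
    -- The row number is filtered one level above where it is computed.
    SELECT year, quarter, record_id, start_date
    FROM ranked
    WHERE rn = 1
    ORDER BY year, quarter
    """
).fetchall()

print(first_live)
```

In PostgreSQL itself, SELECT DISTINCT ON (...) with a matching ORDER BY is another common way to keep the first row per group.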

PostgreSQL subquery not working

What's wrong with this query?
select extract(week from created_at) as week,
count(*) as received,
(select count(*) from bugs where extract(week from updated_at) = a.week) as done
from bugs as a
group by week
The error message is:
column a.week does not exist
UPDATE:
following the suggestion of the first comment, I tried this:
select a.extract(week from created_at) as week,
count(*) as received, (select count(*)
from bugs
where extract(week from updated_at) = a.week) as done from bugs as a group by week
But it doesn't seem to work:
ERROR: syntax error at or near "from"
LINE 1: select a.extract(week from a.created_at) as week, count(*) a...
As far as I can tell you don't need the sub-select at all:
select extract(week from created_at) as week,
count(*) as received,
sum( case when extract(week from updated_at) = extract(week from created_at) then 1 end) as done
from bugs
group by week
This counts all bugs per week and counts those that are updated in the same week as "done".
Note that your query will only report correct values if you never have more than one year in your table.
If you have more than one year of data in the table you need to include the year in the comparison as well:
select to_char(created_at, 'iyyy-iw') as week,
count(*) as received,
sum( case when to_char(created_at, 'iyyy-iw') = to_char(updated_at, 'iyyy-iw') then 1 end) as done
from bugs
group by week
Note that I used IYYY and IW to cater for the ISO definition of the year and the week around the year end/start.
Maybe a little explanation on why your original query did not work would be helpful:
The "outer" query uses two aliases:
- a table alias for bugs named a
- a column alias for the expression extract(week from created_at) named week
The only places where the column alias week can be used are the group by and order by clauses.
To the sub-select (select count(*) from bugs where extract(week from updated_at) = a.week) the alias a is visible, but not the alias week (that's how the SQL standard is defined).
To get your subselect working (in terms of column visibility) you would need to reference the full expression of the "outer" column:
(select count(*) from bugs b where extract(week from b.updated_at) = extract(week from a.created_at))
Note that I introduced another table alias b in order to make it clear which column stems from which alias.
But even then you'd have a problem with the grouping as you can't reference an ungrouped column like that.
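The conditional-aggregation idea (count every bug per creation week, and add 1 to "done" only when the update falls in the same week) can be checked with the sketch below. It is an illustration under assumptions: Python's sqlite3 module replaces PostgreSQL, strftime('%Y-%W', ...) plays the role of to_char(..., 'iyyy-iw') without being ISO-week exact, and the bug rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for PostgreSQL
conn.execute(
    "CREATE TABLE bugs (id INTEGER PRIMARY KEY, created_at TEXT, updated_at TEXT)"
)
# Hypothetical bugs: (created, updated).
conn.executemany(
    "INSERT INTO bugs (created_at, updated_at) VALUES (?, ?)",
    [
        ("2021-03-01", "2021-03-02"),  # updated in its creation week
        ("2021-03-01", "2021-03-09"),  # updated a week later
        ("2021-03-08", "2021-03-10"),  # updated in its creation week
    ],
)

weekly = conn.execute(
    """
    SELECT strftime('%Y-%W', created_at) AS week,
           COUNT(*) AS received,
           -- Conditional aggregation: 1 for a same-week update, else 0.
           SUM(CASE WHEN strftime('%Y-%W', updated_at)
                         = strftime('%Y-%W', created_at)
                    THEN 1 ELSE 0 END) AS done
    FROM bugs
    GROUP BY week
    ORDER BY week
    """
).fetchall()

for row in weekly:
    print(row)
```

Using ELSE 0 makes "done" come back as 0 rather than NULL for a week in which no bug was updated.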
This could work as well:
with origin as (
select extract(week from created_at) as week, count(*) as received
from bugs
group by week
)
select week, received,
(select count(*) from bugs where week = extract(week from updated_at) )
from origin
It should perform well.

PostgreSQL pulling the MAX value from a set of data

I have the following query:
SELECT user_id,
greatest(created_at),
note
FROM user_notes
Which outputs
user_id greatest latest_note
12345 2012-09-05 note1
23456 2013-09-01 note2
23456 2013-09-02 note3
etc. etc. etc.
I thought this query would eliminate duplicates from the user_id row. I want each user_id to only have one "greatest" result. I can't seem to figure out why there are multiple "greatests" for the same user_id.
I think you want MAX() not GREATEST():
WITH latest AS (
SELECT user_id, max(created_at) AS created_at
FROM user_notes
GROUP BY user_id)
SELECT user_notes.*
FROM user_notes
INNER JOIN latest
USING (user_id, created_at)
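The reason for the duplicates is that GREATEST() compares its arguments within a single row, so with one argument it simply returns that value for every row, whereas MAX() aggregates across the rows of a group. The sketch below reruns the MAX() plus join approach on the three rows shown in the question. It is only an illustration: it uses Python's sqlite3 module in place of PostgreSQL, and the column types are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for PostgreSQL
conn.executescript(
    """
    CREATE TABLE user_notes (user_id INTEGER, created_at TEXT, note TEXT);
    -- The three rows shown in the question's output.
    INSERT INTO user_notes VALUES
        (12345, '2012-09-05', 'note1'),
        (23456, '2013-09-01', 'note2'),
        (23456, '2013-09-02', 'note3');
    """
)

latest_notes = conn.execute(
    """
    WITH latest AS (
        -- One row per user holding that user's newest timestamp.
        SELECT user_id, MAX(created_at) AS created_at
        FROM user_notes
        GROUP BY user_id
    )
    -- Join back to fetch the full row that carries that timestamp.
    SELECT user_notes.user_id, user_notes.created_at, user_notes.note
    FROM user_notes
    INNER JOIN latest USING (user_id, created_at)
    ORDER BY user_notes.user_id
    """
).fetchall()

print(latest_notes)
```

If two notes for the same user share the exact same created_at, the join returns both of them; a ROW_NUMBER() ranking with a tie-breaker column avoids that.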