Aggregate count by several weeks after field data in PostgreSQL - postgresql

I have a query that returns something like this:
registered_at - date of user registration;
action_at - date of some kind of action.
| registered_at | user_id | action_at |
-------------------------------------------------------
| 2015-05-01 12:00:00 | 1 | 2015-05-04 12:00:00 |
| 2015-05-01 12:00:00 | 1 | 2015-05-10 12:00:00 |
| 2015-05-01 12:00:00 | 1 | 2015-05-16 12:00:00 |
| 2015-04-01 12:00:00 | 2 | 2015-04-04 12:00:00 |
| 2015-04-01 12:00:00 | 2 | 2015-04-05 12:00:00 |
| 2015-04-01 12:00:00 | 2 | 2015-04-10 12:00:00 |
| 2015-04-01 12:00:00 | 2 | 2015-04-30 12:00:00 |
I'm trying to write a query that returns something like this:
weeks_after_registration - limited to 3 in this example; in the real task it will be limited to 6.
| user_id | weeks_after_registration | action_counts |
-------------------------------------------------------
| 1 | 1 | 1 |
| 1 | 2 | 1 |
| 1 | 3 | 1 |
| 2 | 1 | 2 |
| 2 | 2 | 1 |
| 2 | 3 | 0 |

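You can use extract(days from (action_at - registered_at) / 7)+1 to get the week number: dividing the interval by 7 and taking its days part gives the number of whole weeks elapsed since registration, and adding 1 makes the first week number 1. Then count the actions grouped by user and week number.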
select user_id, wk, count(*) actions
from (select user_id, extract(days from (action_at - registered_at) / 7)+1 wk from Table1) a
where wk <= 3
group by user_id, wk
If you must display rows where action_counts = 0 in the result, you need to combine all possible week numbers (1, 2, 3) with all possible user_ids (1, 2) and left join the counts onto that grid, like:
select b.user_id, a.wk, coalesce(c.actions, 0) actions
from (select * from generate_series(1, 3) wk) a
join (select distinct user_id from Table1) b on true
left join (
    select user_id, wk, count(*) actions
    from (select user_id, extract(days from (action_at - registered_at) / 7)+1 wk from Table1) a
    where wk <= 3
    group by user_id, wk
) c on a.wk = c.wk and b.user_id = c.user_id
order by b.user_id, a.wk;
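To make the bucketing and zero-filling logic easier to check by hand, here is a minimal Python sketch of what the query computes. It is not part of the original answer; the function name weekly_action_counts and the variable sample_rows are made up for illustration, and the rows are the sample data from the question.

```python
from collections import Counter
from datetime import datetime


def weekly_action_counts(rows, max_weeks=3):
    """rows: iterable of (registered_at, user_id, action_at) datetimes."""
    counts = Counter()
    users = set()
    for registered_at, user_id, action_at in rows:
        users.add(user_id)
        # Whole weeks elapsed since registration, numbered from 1.
        week = (action_at - registered_at).days // 7 + 1
        if 1 <= week <= max_weeks:
            counts[(user_id, week)] += 1
    # Every (user, week) pair is emitted, so missing weeks show up as 0.
    return [(user, week, counts[(user, week)])
            for user in sorted(users)
            for week in range(1, max_weeks + 1)]


def ts(text):
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


sample_rows = [
    (ts("2015-05-01 12:00:00"), 1, ts("2015-05-04 12:00:00")),
    (ts("2015-05-01 12:00:00"), 1, ts("2015-05-10 12:00:00")),
    (ts("2015-05-01 12:00:00"), 1, ts("2015-05-16 12:00:00")),
    (ts("2015-04-01 12:00:00"), 2, ts("2015-04-04 12:00:00")),
    (ts("2015-04-01 12:00:00"), 2, ts("2015-04-05 12:00:00")),
    (ts("2015-04-01 12:00:00"), 2, ts("2015-04-10 12:00:00")),
    (ts("2015-04-01 12:00:00"), 2, ts("2015-04-30 12:00:00")),
]

for row in weekly_action_counts(sample_rows):
    print(row)
```

Unlike the SQL, this sketch also drops actions that happen before registration (week numbers below 1); that is an extra assumption, not something the question states.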

Related

PostgreSQL - Check if column value exists in any previous row

I'm working on a problem where I need to check if an ID exists in any previous records within another ID set, and create a tag if it does.
Suppose I have the following table
| client_id | order_date | supplier_id |
| 1 | 2022-01-01 | 1 |
| 1 | 2022-02-01 | 2 |
| 1 | 2022-03-01 | 1 |
| 1 | 2022-04-01 | 3 |
| 2 | 2022-05-01 | 1 |
| 2 | 2022-06-01 | 1 |
| 2 | 2022-07-01 | 2 |
And I want to create a column with an "is new supplier" tag (for each client):
| client_id | order_date | supplier_id | is_new_supplier|
| 1 | 2022-01-01 | 1 | True
| 1 | 2022-02-01 | 2 | True
| 1 | 2022-03-01 | 1 | False
| 1 | 2022-04-01 | 3 | True
| 2 | 2022-05-01 | 1 | True
| 2 | 2022-06-01 | 1 | False
| 2 | 2022-07-01 | 2 | True
First I tried doing this by creating a dense_rank and filtering out repeated ranks, but it didn't work:
with aux as (SELECT client_id,
order_date,
supplier_id
FROM table)
SELECT *, dense_rank() over (
partition by client_id
order by supplier_id
) as _dense_rank
FROM aux
Another way I thought about doing this, is by creating an auxiliary id with client_id + supplier_id, ordering by date and checking if the aux id exists in any previous row, but I don't know how to do this in SQL.
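You are on the right track.
Instead of dense_rank, you can just use row_number and add supplier_id to the partition by, so that rows are numbered within each (client_id, supplier_id) pair.
Don't forget to order by order_date; the first row of each pair (rank = 1) is then the first time the client ordered from that supplier.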
with aux as (SELECT client_id,
order_date,
supplier_id,
row_number() over (
partition by client_id, supplier_id
order by order_date
) as rank
FROM table)
SELECT client_id,
order_date,
supplier_id,
rank,
(rank = 1) as is_new_supplier
FROM aux
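The question also mentions the idea of checking whether a client_id + supplier_id pair appeared in any previous row. A minimal Python sketch of that idea, assuming the rows can be sorted by client and date, is shown here; tag_new_suppliers and sample_orders are illustrative names, not part of the original answer.

```python
def tag_new_suppliers(orders):
    """orders: iterable of (client_id, order_date, supplier_id).

    order_date is an ISO 'YYYY-MM-DD' string, so plain string sorting
    puts the rows in chronological order.
    """
    seen = set()
    tagged = []
    for client_id, order_date, supplier_id in sorted(orders, key=lambda o: (o[0], o[1])):
        pair = (client_id, supplier_id)
        # A supplier is "new" the first time this client orders from it.
        tagged.append((client_id, order_date, supplier_id, pair not in seen))
        seen.add(pair)
    return tagged


sample_orders = [
    (1, "2022-01-01", 1),
    (1, "2022-02-01", 2),
    (1, "2022-03-01", 1),
    (1, "2022-04-01", 3),
    (2, "2022-05-01", 1),
    (2, "2022-06-01", 1),
    (2, "2022-07-01", 2),
]

for row in tag_new_suppliers(sample_orders):
    print(row)
```

This produces the same flags as (rank = 1) in the query above, because the first occurrence of a pair is exactly the row that row_number() numbers 1.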

PostgreSQL build working range from one date column

I'm using PostgreSQL v. 11.2
I have a table
|id | bot_id | date |
| 1 | 1 | 2020-04-20 16:00:00|
| 2 | 2 | 2020-04-22 12:00:00|
| 3 | 3 | 2020-04-24 04:00:00|
| 4 | 1 | 2020-04-27 09:00:00|
And, for example, I have a DateTime range from 2020-03-30 00:00:00 to 2020-04-30 00:00:00.
I need to get the working ranges so I can count the total working hours of each bot.
Like this:
|bot_id | start_date | end_date |
| 1 | 2020-03-30 00:00:00 | 2020-04-20 16:00:00 |
| 2 | 2020-04-20 16:00:00 | 2020-04-22 12:00:00 |
| 3 | 2020-04-22 12:00:00 | 2020-04-24 04:00:00 |
| 1 | 2020-04-24 04:00:00 | 2020-04-27 09:00:00 |
| 1 | 2020-04-27 09:00:00 | 2020-04-30 00:00:00 |
I've tried to use LAG(date), but I'm not getting the first and last dates of the range.
You could use a UNION ALL, with one part building the start_date/end_date pairs from your values (LAG's third argument supplies the start of the range for the first row) and the other part filling in the last period (from the last date to 2020-04-30 00:00:00):
WITH values (id, bot_id, date) AS (
VALUES (1, 1, '2020-04-20 16:00:00'::TIMESTAMP)
, (2, 2, '2020-04-22 12:00:00')
, (3, 3, '2020-04-24 04:00:00')
, (4, 1, '2020-04-27 09:00:00')
)
(
SELECT bot_id
, LAG(date, 1, '2020-03-30 00:00:00') OVER (ORDER BY id) AS start_date
, date AS end_date
FROM values
)
UNION ALL
(
SELECT bot_id
, date AS start_date
, '2020-04-30 00:00:00' AS end_date
FROM values
ORDER BY id DESC
LIMIT 1
)
+------+--------------------------+--------------------------+
|bot_id|start_date |end_date |
+------+--------------------------+--------------------------+
|1 |2020-03-30 00:00:00.000000|2020-04-20 16:00:00.000000|
|2 |2020-04-20 16:00:00.000000|2020-04-22 12:00:00.000000|
|3 |2020-04-22 12:00:00.000000|2020-04-24 04:00:00.000000|
|1 |2020-04-24 04:00:00.000000|2020-04-27 09:00:00.000000|
|1 |2020-04-27 09:00:00.000000|2020-04-30 00:00:00.000000|
+------+--------------------------+--------------------------+
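The same pairing can be sketched outside SQL: each row ends a range that starts at the previous row's timestamp, and one extra range runs from the last timestamp to the end of the requested window. The Python below is only an illustration under that reading of the answer; working_ranges and sample_events are hypothetical names.

```python
def working_ranges(events, range_start, range_end):
    """events: iterable of (id, bot_id, timestamp), with unique ids."""
    ordered = sorted(events)  # ordered by id, like ORDER BY id
    ranges = []
    previous = range_start    # plays the role of LAG's default value
    for _id, bot_id, timestamp in ordered:
        ranges.append((bot_id, previous, timestamp))
        previous = timestamp
    if ordered:
        # Closing range: last row's bot, from its timestamp to range_end.
        ranges.append((ordered[-1][1], previous, range_end))
    return ranges


sample_events = [
    (1, 1, "2020-04-20 16:00:00"),
    (2, 2, "2020-04-22 12:00:00"),
    (3, 3, "2020-04-24 04:00:00"),
    (4, 1, "2020-04-27 09:00:00"),
]

for row in working_ranges(sample_events, "2020-03-30 00:00:00", "2020-04-30 00:00:00"):
    print(row)
```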

How to get all days in a date range from one table, even if no data exists, in SQL Server

I have one table called Tab1. I would like to get every date in the range, even if some of the days are missing from the table.
+-------------------+--------------------------+
|Name | dateCheck |
+-------------------+--------------------------+
| 1 | 2016-01-01 00:00:00.000 |
| 2 | 2016-01-02 00:00:00.000 |
| 3 | 2016-01-05 00:00:00.000 |
| 4 | 2016-01-07 00:00:00.000 |
+-------------------+--------------------------+
I need output like below :
+-------------------+--------------------------+
|Name | dateCheck |
+-------------------+--------------------------+
| 1 | 2016-01-01 00:00:00.000 |
| 2 | 2016-01-02 00:00:00.000 |
| 0 | 2016-01-03 00:00:00.000 |
| 0 | 2016-01-04 00:00:00.000 |
| 3 | 2016-01-05 00:00:00.000 |
| 0 | 2016-01-06 00:00:00.000 |
| 4 | 2016-01-07 00:00:00.000 |
You may use a calendar table:
SELECT
COALESCE(t2.Name, 0) AS Name,
t1.dateCheck
FROM
(
SELECT '2016-01-01' AS dateCheck UNION ALL
SELECT '2016-01-02' UNION ALL
SELECT '2016-01-03' UNION ALL
SELECT '2016-01-04' UNION ALL
SELECT '2016-01-05' UNION ALL
SELECT '2016-01-06' UNION ALL
SELECT '2016-01-07'
) t1
LEFT JOIN yourTable t2
ON t1.dateCheck = t2.dateCheck;
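The calendar table above lists every day by hand. As a rough illustration of the same left-join idea with the calendar generated instead of typed out, here is a small Python sketch; fill_missing_days and sample_tab1 are made-up names, and the rows are the sample data from the question.

```python
from datetime import date, timedelta


def fill_missing_days(rows, start, end):
    """rows: iterable of (name, day). Returns one (name, day) per calendar day."""
    by_day = {day: name for name, day in rows}
    filled = []
    day = start
    while day <= end:
        # Days without a matching row get 0, like COALESCE(t2.Name, 0).
        filled.append((by_day.get(day, 0), day))
        day += timedelta(days=1)
    return filled


sample_tab1 = [
    (1, date(2016, 1, 1)),
    (2, date(2016, 1, 2)),
    (3, date(2016, 1, 5)),
    (4, date(2016, 1, 7)),
]

for name, day in fill_missing_days(sample_tab1, date(2016, 1, 1), date(2016, 1, 7)):
    print(name, day.isoformat())
```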

How to list data for every day of a month in PostgreSQL

I use psql v.10.5
and I have a table with a structure like this:
| date | total |
-------------------------
| 01-01-2018 | 50 |
| 05-01-2018 | 90 |
| 30-01-2018 | 20 |
How can I get a recap by month in which every day of the month is shown, even days with no data? I want the data shown like this:
| date | total |
-------------------------
| 01-01-2018 | 50 |
| 02-01-2018 | 0 |
| 03-01-2018 | 0 |
| 04-01-2018 | 0 |
| 05-01-2018 | 90 |
.....
| 29-01-2018 | 0 |
| 30-01-2018 | 20 |
I've tried this query:
SELECT * FROM date
WHERE EXTRACT(month FROM "date") = 1 -- dynamically
AND EXTRACT(year FROM "date") = 2018 -- dynamically
but the result is not what I expected. Note that the month and year parameters are set dynamically.
Any help will be appreciated.
Use the function generate_series(start, stop, step interval), e.g.:
select d::date, coalesce(total, 0) as total
from generate_series('2018-01-01', '2018-01-31', '1 day'::interval) d
left join my_table t on d::date = t.date
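Since the question says the month and year are supplied dynamically, the start and end of the series have to be derived from those two parameters. The Python sketch below only illustrates that derivation and the zero-filling; it is not the answer's method, and daily_totals_for_month and sample_totals are hypothetical names.

```python
import calendar
from datetime import date


def daily_totals_for_month(rows, year, month):
    """rows: iterable of (day, total). Returns (day, total) for every day of the month."""
    totals = {}
    for day, total in rows:
        totals[day] = totals.get(day, 0) + total
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    days = [date(year, month, d) for d in range(1, last_day + 1)]
    # Days with no row get 0, like coalesce(total, 0).
    return [(day, totals.get(day, 0)) for day in days]


sample_totals = [
    (date(2018, 1, 1), 50),
    (date(2018, 1, 5), 90),
    (date(2018, 1, 30), 20),
]

for day, total in daily_totals_for_month(sample_totals, 2018, 1):
    print(day.strftime("%d-%m-%Y"), total)
```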

Calculating the forecasts by month for the last 3 months in postgres

I have a table called forecasts where we store the forecasts for all the products for the next 6 months. For example when we are in November we create the forecast for December, January, February, March, April and May. The forecasts table looks something like the one below
+----------------+---------------+--------------+----------+
| product_number | forecasted_on | forecast_for | quantity |
+----------------+---------------+--------------+----------+
| Prod 1 | 2016-11-01 | 2016-12-01 | 100 |
| Prod 1 | 2016-11-01 | 2017-01-01 | 200 |
| Prod 1 | 2016-11-01 | 2017-02-01 | 300 |
| Prod 1 | 2016-11-01 | 2017-03-01 | 400 |
| Prod 1 | 2016-11-01 | 2017-04-01 | 500 |
| Prod 1 | 2016-11-01 | 2017-05-01 | 600 |
+----------------+---------------+--------------+----------+
The table contains the product number, the date on which the forecast was created (forecasted_on), the month the forecast is for (forecast_for), and the forecasted quantity.
Each month, data is added for the next 6 months. So when forecasted_on is 1 December 2016, forecasts are created for January through June.
I am trying to create a report that shows how the total forecasts have varied for the last 3 months. Something like this
+------------+----------------+---------------+----------------+
| | 0 months prior | 1 month prior | 2 months prior |
+------------+----------------+---------------+----------------+
| 2016-12-01 | 200 | 150 | 250 |
| 2017-01-01 | 300 | 250 | 150 |
| 2017-02-01 | 100 | 150 | 100 |
+------------+----------------+---------------+----------------+
Currently I am using a lot of repetitive code in Rails to generate this table. I wanted to see if there was an easier way to do it directly with a SQL query.
Any help would be greatly appreciated.
Use a pivot query (conditional aggregation):
select forecast_for,
sum( case when forecasted_on + interval '1' month = forecast_for
then quantity end ) q_0,
sum( case when forecasted_on + interval '2' month = forecast_for
then quantity end ) q_1,
sum( case when forecasted_on + interval '3' month = forecast_for
then quantity end ) q_2,
sum( case when forecasted_on + interval '4' month = forecast_for
then quantity end ) q_3,
sum( case when forecasted_on + interval '5' month = forecast_for
then quantity end ) q_4,
sum( case when forecasted_on + interval '6' month = forecast_for
then quantity end ) q_5
from Table1
group by forecast_for
order by 1
;
Demo: http://sqlfiddle.com/#!15/30e5e/1
| forecast_for | q_0 | q_1 | q_2 | q_3 | q_4 | q_5 |
|----------------------------|--------|--------|--------|--------|--------|--------|
| December, 01 2016 00:00:00 | 100 | (null) | (null) | (null) | (null) | (null) |
| January, 01 2017 00:00:00 | (null) | 200 | (null) | (null) | (null) | (null) |
| February, 01 2017 00:00:00 | (null) | (null) | 300 | (null) | (null) | (null) |
| March, 01 2017 00:00:00 | (null) | (null) | (null) | 400 | (null) | (null) |
| April, 01 2017 00:00:00 | (null) | (null) | (null) | (null) | 500 | (null) |
| May, 01 2017 00:00:00 | (null) | (null) | (null) | (null) | (null) | 600 |
Assuming that (product_number, forecasted_on, forecast_for) is unique (so no aggregation is required), this should do the job:
WITH forecast_dates AS (
    SELECT DISTINCT product_number, forecast_for
    FROM forecasts
)
SELECT
    fd.forecast_for AS "forecast for",
    m1.quantity AS "one month prior",
    m2.quantity AS "two months prior",
    m3.quantity AS "three months prior"
FROM forecast_dates fd
-- Each join must also match product_number, otherwise other products' forecasts for the same month join in.
LEFT JOIN forecasts m1 ON fd.product_number = m1.product_number AND fd.forecast_for = m1.forecast_for AND fd.forecast_for = m1.forecasted_on + INTERVAL '1 month'
LEFT JOIN forecasts m2 ON fd.product_number = m2.product_number AND fd.forecast_for = m2.forecast_for AND fd.forecast_for = m2.forecasted_on + INTERVAL '2 month'
LEFT JOIN forecasts m3 ON fd.product_number = m3.product_number AND fd.forecast_for = m3.forecast_for AND fd.forecast_for = m3.forecasted_on + INTERVAL '3 month'
WHERE fd.product_number = 'Prod 1'
ORDER BY fd.forecast_for;
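Both answers place each quantity in a column according to how many months separate forecasted_on from forecast_for. A minimal Python sketch of that pivot is given here for illustration only; pivot_forecasts and sample_forecasts are made-up names, column 0 corresponds to a forecast made one month ahead (q_0 in the first answer), and the month gap is computed from year and month alone, which assumes all dates fall on the first of a month as in the sample.

```python
from collections import defaultdict
from datetime import date


def pivot_forecasts(rows, max_prior=3):
    """rows: iterable of (product_number, forecasted_on, forecast_for, quantity).

    Returns {forecast_for: [q_0, q_1, ...]} with None where no forecast exists.
    """
    table = defaultdict(lambda: [None] * max_prior)
    for _product, forecasted_on, forecast_for, quantity in rows:
        # Number of months between creating the forecast and its target month.
        lead = ((forecast_for.year - forecasted_on.year) * 12
                + (forecast_for.month - forecasted_on.month))
        column = lead - 1  # one month ahead goes to column 0
        if 0 <= column < max_prior:
            cells = table[forecast_for]
            cells[column] = (cells[column] or 0) + quantity
    return dict(sorted(table.items()))


sample_forecasts = [
    ("Prod 1", date(2016, 11, 1), date(2016, 12, 1), 100),
    ("Prod 1", date(2016, 11, 1), date(2017, 1, 1), 200),
    ("Prod 1", date(2016, 11, 1), date(2017, 2, 1), 300),
    ("Prod 1", date(2016, 11, 1), date(2017, 3, 1), 400),
    ("Prod 1", date(2016, 11, 1), date(2017, 4, 1), 500),
    ("Prod 1", date(2016, 11, 1), date(2017, 5, 1), 600),
]

for forecast_for, cells in pivot_forecasts(sample_forecasts).items():
    print(forecast_for.isoformat(), cells)
```

With only one forecasted_on date in the sample, each month fills a single column; the other cells stay None, matching the (null) entries in the demo output above.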