PostgreSQL: run a query for each day in an interval

I have a query that I need to run for every day in an interval, for example for each day of the last 2 years. The table has no day column, so I think I need to do it in a loop:
select *
from (
select distinct on (osu.order_id) osu.order_id, osu.order_state, osu.created_at
from stock_management.order_state_updates osu
where osu.created_at < '2021-01-26 22:00:00'
order by osu.order_id desc, osu.created_at desc
) temp
where temp.order_state = 'Filter1';
Here the date '2021-01-26 22:00:00' should step through each day of the interval. Thank you.
https://docs.google.com/spreadsheets/d/1B2xx-c3wWZYaEN76LxjYhHnlrPRUx4TG8vsAkZ1X_Vs/edit?usp=sharing

You can generate a calendar and join it to your query. I'm not sure this will return the right data, because I don't have sample data or the expected result.
with d as (select * from generate_series('20210101', '20220427', interval '1 day') as date)
select *
from (
select distinct on (osu.order_id) osu.order_id, osu.order_state, osu.created_at::date
from stock_management.order_state_updates osu right join d on osu.created_at::date = d.date
order by osu.order_id desc, osu.created_at desc
) temp
where temp.order_state = 'Filter1';
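If the goal is to repeat the original query once per day, with that day's cutoff in place of the fixed timestamp, the calendar can be joined on the inequality itself instead of on equal dates. The following is only a sketch under assumptions: the 22:00 cutoff is taken from the question, the one-month range is illustrative, and the column name cutoff is made up here.

```sql
-- One cutoff timestamp per day (range and 22:00 time are assumptions).
with days as (
    select generate_series(timestamp '2021-01-01 22:00:00',
                           timestamp '2021-01-31 22:00:00',
                           interval '1 day') as cutoff
)
select *
from (
    -- For each (cutoff, order) keep the latest state update before the cutoff.
    select distinct on (d.cutoff, osu.order_id)
           d.cutoff, osu.order_id, osu.order_state, osu.created_at
    from days d
    join stock_management.order_state_updates osu
      on osu.created_at < d.cutoff
    order by d.cutoff, osu.order_id desc, osu.created_at desc
) temp
where temp.order_state = 'Filter1'
order by temp.cutoff, temp.order_id;
```

Each order appears once per day on which its latest state at the cutoff was 'Filter1'. Note that the inequality join pairs every update with every later day, so over two years it may be slow on a large table.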

Related

PostgreSQL - SQL function to loop through all months of the year and pull 10 random records from each

I am attempting to pull 10 random records from each month of this year using the query below, but I get the error "ERROR: relation "c1" does not exist".
I'm not sure where I'm going wrong. I think I may be using MySQL syntax instead, but how do I resolve this?
My desired output is like this:
Month    Another header
2021-01  random email 1
2021-01  random email 2
That is, a total of ten random emails from January, then ten more for each month of this year (up to November, of course, as December has yet to happen).
With CTE AS
(
Select month,
email,
Row_Number() Over (Partition By month Order By FLOOR(RANDOM()*(1-1000000+1))) AS RN
From (
SELECT
DISTINCT(TO_CHAR(DATE_TRUNC('month', timestamp ), 'YYYY-MM')) AS month
,CASE
WHEN
JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (form_data,0),'name') = 'email'
THEN
JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (form_data,0),'value')
END AS email
FROM form_submits_y2 fs
WHERE fs.website_id IN (791)
AND month LIKE '2021%'
GROUP BY 1,2
ORDER BY 1 ASC
)
)
SELECT *
FROM CTE C1
LEFT JOIN
(SELECT RN
,month
,email
FROM CTE C2
WHERE C2.month = C1.month
ORDER BY RANDOM() LIMIT 10) C3
ON C1.RN = C3.RN
ORDER BY month ASC
You can't reference an outer table inside a derived table with a regular join. You need to use LEFT JOIN LATERAL to make that work.
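As a minimal sketch of what a lateral join looks like for "N random rows per group": the table submissions and its sample rows are hypothetical and exist only to make the example self-contained. They are not the asker's schema.

```sql
-- Hypothetical table, for illustration only.
create temp table submissions (month text, email text);
insert into submissions values
    ('2021-01', 'a@example.com'),
    ('2021-01', 'b@example.com'),
    ('2021-02', 'c@example.com');

select m.month, s.email
from (select distinct month from submissions) m
left join lateral (
    select s2.email
    from submissions s2
    where s2.month = m.month   -- outer reference is allowed because of LATERAL
    order by random()
    limit 10                   -- at most 10 random rows per month
) s on true
order by m.month;
```

The subquery is evaluated once per row of m, which is what makes the reference to m.month legal. Without the LATERAL keyword the same reference is rejected.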
I did end up finding a more elegant solution to my query, via this source from GitHub:
SELECT
month
,email
FROM
(
Select month,
email,
Row_Number() Over (Partition By month Order By FLOOR(RANDOM()*(1-1000000+1))) AS RN
From (
SELECT
TO_CHAR(DATE_TRUNC('month', timestamp ), 'YYYY-MM') AS month
,CASE
WHEN JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (form_data,0),'name') = 'email'
THEN JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (form_data,0),'value')
END AS email
FROM form_submits_y2 fs
WHERE fs.website_id IN (791)
AND month LIKE '2021%'
GROUP BY 1,2
ORDER BY 1 ASC
)
) q
WHERE
RN <=10
ORDER BY month ASC

How to fix column "sv.last_updated" must appear in the GROUP BY clause or be used in an aggregate function in PostgreSQL

What I'm trying to achieve is to get the total count of transactions and their total amount for every hour of a given day.
I have written a query in PostgreSQL that uses a cast. When I execute it, it gives:
column "sv.last_updated" must appear in the GROUP BY clause or be used in an aggregate function
select cast('00:00' as time) + g.h * interval '1 hour' as time,
count(sv.id) as counts,
sum(sv.amount) as amount
from generate_series(0, 23, 1) g(h)
left join paymentvirtualization.summery_virtualizer sv
on extract(hour from sv.last_updated) = g.h
and date_trunc('day', sv.last_updated) = '2021-09-28'
and sv.guid = '1aecb2ba5c3941fe9cdab0cbf0c64937'
group by g.h
order by g.h,sv.last_updated desc limit 1;
What is this issue and how can I fix it?
Thanks.
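A likely cause, assuming the goal is one row per hour: the ORDER BY references sv.last_updated, which is neither in the GROUP BY nor inside an aggregate, so it has no single value per group. The LIMIT 1 would also cut the result down to a single hour. A sketch with both removed, otherwise unchanged from the question:

```sql
select cast('00:00' as time) + g.h * interval '1 hour' as time,
       count(sv.id)   as counts,   -- 0 for hours without transactions
       sum(sv.amount) as amount    -- NULL for hours without transactions
from generate_series(0, 23, 1) g(h)
left join paymentvirtualization.summery_virtualizer sv
       on extract(hour from sv.last_updated) = g.h
      and date_trunc('day', sv.last_updated) = '2021-09-28'
      and sv.guid = '1aecb2ba5c3941fe9cdab0cbf0c64937'
group by g.h
order by g.h;   -- order only by the grouped column
```

If the most recent update per hour is also needed, max(sv.last_updated) can be selected, since an aggregate is allowed where the bare column is not.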

Is there a SQL code for cumulative count of SaaS customer over months?

I have a table with:
ID (client id), date_start (start of the SaaS subscription), date_end (either a date value or NULL).
I need a cumulative count of active clients month by month.
Any idea how to write that in Postgres and achieve this result?
I'm starting from this, but I don't know how to proceed:
select
date_trunc('month', c.date_start)::date,
count(*)
from customer
Please check the following solution:
select
subscrubed_date,
subscrubed_customers,
unsubscrubed_customers,
coalesce(subscrubed_customers, 0) - coalesce(unsubscrubed_customers, 0) cumulative
from (
select distinct
date_trunc('month', c.date_start)::date subscrubed_date,
sum(1) over (order by date_trunc('month', c.date_start)) subscrubed_customers
from customer c
order by subscrubed_date
) subscribed
left join (
select distinct
date_trunc('month', c.date_end)::date unsubscrubed_date,
sum(1) over (order by date_trunc('month', c.date_end)) unsubscrubed_customers
from customer c
where date_end is not null
order by unsubscrubed_date
) unsubscribed on subscribed.subscrubed_date = unsubscribed.unsubscrubed_date;
You have a table of customers with a start date and sometimes an end date. As you want to group by date, but there are two dates in the table, you need to split these first.
Then, you may have months where only customers came and others where only customers left. So, you'll want a full outer join of the two sets.
For a cumulative sum (also called a running total), use SUM OVER.
with came as
(
select date_trunc('month', date_start) as month, count(*) as cnt
from customer
group by date_trunc('month', date_start)
)
, went as
(
select date_trunc('month', date_end) as month, count(*) as cnt
from customer
where date_end is not null
group by date_trunc('month', date_end)
)
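select
month,
came.cnt as cust_new,
went.cnt as cust_gone,
-- coalesce: with a full outer join, a month may exist in only one of the two sets
sum(coalesce(came.cnt, 0) - coalesce(went.cnt, 0)) over (order by month) as cust_active
from came full outer join went using (month)
order by month;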

Filter duplicates on row_number results

I'm trying to write a PostgreSQL query that gives me the 10 longest-running jobs for each month (excluding the current month). The query below is what I have so far, but it returns duplicate job names. How can I filter these out?
SELECT job, month, duration
FROM (
SELECT
month,
job,
duration,
ROW_NUMBER() OVER (PARTITION BY month ORDER BY duration DESC) AS RN
FROM
run_history
WHERE
owner = 'john'
) x
WHERE RN <= 10
AND month < TO_CHAR(CURRENT_DATE, 'yyyymm')
Sounds like there can be multiple rows per (owner, month, job) and you want to work with the maximum duration per month for each job.
If so, aggregate computing max(duration) first, then use row_number() on top of it:
SELECT job, month, max_duration
FROM (
SELECT month, job, max(duration) AS max_duration
, row_number() OVER (PARTITION BY month ORDER BY max(duration) DESC NULLS LAST) AS rn
FROM run_history
WHERE owner = 'john'
AND month < to_char(CURRENT_DATE, 'yyyymm')
GROUP BY month, job
) sub
WHERE rn <= 10
ORDER BY month DESC, rn;
Aside: consider integer or date instead of text for the column month: cleaner and more efficient.

PostgreSQL SELECT date before max(DATE)

I need to select the rows for which the difference between max(date) and the date just before max(date) is smaller than 366 days. I know about SELECT MAX(date) FROM table to get the latest date, but how could I get the date before it?
I would need a query of this kind:
SELECT code, MAX(date) - before_date FROM troncon WHERE MAX(date) - before_date < 366 ;
NB: before_date does not refer to anything; it is a placeholder to be replaced by something that works.
Edit: Example of the table I'm testing it on:
CREATE TABLE troncon (code TEXT, ope_date DATE) ;
INSERT INTO troncon (code, ope_date) VALUES
('C086000-T10001', '2014-11-11'),
('C086000-T10001', '2014-11-11'),
('C086000-T10002', '2014-12-03'),
('C086000-T10002', '2014-01-03'),
('C086000-T10003', '2014-08-11'),
('C086000-T10003', '2014-03-03'),
('C086000-T10003', '2012-02-27'),
('C086000-T10004', '2014-08-11'),
('C086000-T10004', '2013-12-30'),
('C086000-T10004', '2013-06-01'),
('C086000-T10004', '2012-07-31'),
('C086000-T10005', '2013-10-01'),
('C086000-T10005', '2012-11-01'),
('C086000-T10006', '2014-04-01'),
('C086000-T10006', '2014-05-15'),
('C086000-T10001', '2014-07-05'),
('C086000-T10003', '2014-03-03');
Many thanks!
The subquery returns every row together with the maximum date for its code; the outer query then keeps only the rows whose difference from that maximum date is smaller than 366 days:
select * from
(
SELECT id, date, max(date) over(partition by code) max_date FROM your_table
) A
-- subtracting two DATE values yields an integer number of days in PostgreSQL
where max_date - date < 366
PS: As @a_horse_with_no_name said, you can partition by code to get the maximum date for each code.
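The query above compares every row with the maximum date. To compare the maximum only with the date just before it, one option is lag() together with row_number(). This is a sketch, not the answerer's method. It assumes that "the date before" means the previous distinct date for the same code, so duplicate rows are removed first. The temp table reuses a few rows from the question's sample data.

```sql
-- Subset of the question's sample data, with code stored as text.
create temp table troncon (code text, ope_date date);
insert into troncon (code, ope_date) values
    ('C086000-T10002', '2014-12-03'),
    ('C086000-T10002', '2014-01-03'),
    ('C086000-T10003', '2014-08-11'),
    ('C086000-T10003', '2014-03-03'),
    ('C086000-T10003', '2014-03-03'),
    ('C086000-T10003', '2012-02-27');

select code,
       ope_date             as max_date,
       prev_date,
       ope_date - prev_date as gap_days   -- date - date is an integer
from (
    select code,
           ope_date,
           -- previous distinct date for the same code
           lag(ope_date) over (partition by code order by ope_date) as prev_date,
           -- rn = 1 marks the latest date per code
           row_number() over (partition by code order by ope_date desc) as rn
    from (select distinct code, ope_date from troncon) t
) ranked
where rn = 1
  and ope_date - prev_date < 366;
```

Codes with a single date have a NULL prev_date and are filtered out by the comparison, which seems to match the intent of the question.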