How to order by TO_CHAR(datefield,'MM-YYYY') in DB2 - db2

I need to order by month and year in ascending order format. I tried to use ORDER BY readingdate however it resulted in eoor. Below is the query
select A,B,C,(TO_CHAR(readingdate,'MM-YYYY'))mth
from TABLE1
inner join TABLE2
left join TABLE3
where (Readingdate >= DATE ('2019-11-01')
AND Readingdate < DATE ('2020-01-31') + 1 DAY)
group by A,B,C,TO_CHAR(readingdate,'MM-YYYY')
order by mth;
Output
A Mth
200 01-2020
200 11-2019
200 12-2019
Expected output
A Mth
200 11-2019
200 12-2019
200 01-2020

You have to group by a value that keeps the ordering you need and then transform, for example :
select A,B,C, TO_CHAR(mth,'MM-YYYY') mthchar
from (
select A,B,C, last_day(readingdate) mth
from TABLE1
inner join TABLE2
left join TABLE3
where (Readingdate >= DATE ('2019-11-01')
AND Readingdate < DATE ('2020-01-31') + 1 DAY)
group by A,B,C,last_day(readingdate)
)
order by mth;

Related

PostgreSQL - SQL function to loop through all months of the year and pull 10 random records from each

I am attempting to pull 10 random records from each month of this year using this query here but I get an error "ERROR: relation "c1" does not exist
"
Not sure where I'm going wrong - I think it may be I'm using Mysql syntax instead, but how do I resolve this?
My desired output is like this
Month
Another header
2021-01
random email 1
2021-01
random email 2
total of ten random emails from January, then ten more for each month this year (til November of course as Dec yet to happen)..
With CTE AS
(
Select month,
email,
Row_Number() Over (Partition By month Order By FLOOR(RANDOM()*(1-1000000+1))) AS RN
From (
SELECT
DISTINCT(TO_CHAR(DATE_TRUNC('month', timestamp ), 'YYYY-MM')) AS month
,CASE
WHEN
JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (form_data,0),'name') = 'email'
THEN
JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (form_data,0),'value')
END AS email
FROM form_submits_y2 fs
WHERE fs.website_id IN (791)
AND month LIKE '2021%'
GROUP BY 1,2
ORDER BY 1 ASC
)
)
SELECT *
FROM CTE C1
LEFT JOIN
(SELECT RN
,month
,email
FROM CTE C2
WHERE C2.month = C1.month
ORDER BY RANDOM() LIMIT 10) C3
ON C1.RN = C3.RN
ORDER By month ASC```
You can't reference an outer table inside a derived table with a regular join. You need to use left join lateral to make that work
I did end up finding a more elegant solution to my query here via this source from github :
SELECT
month
,email
FROM
(
Select month,
email,
Row_Number() Over (Partition By month Order By FLOOR(RANDOM()*(1-1000000+1))) AS RN
From (
SELECT
TO_CHAR(DATE_TRUNC('month', timestamp ), 'YYYY-MM') AS month
,CASE
WHEN JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (form_data,0),'name') = 'email'
THEN JSON_EXTRACT_PATH_TEXT(json_extract_array_element_text (form_data,0),'value')
END AS email
FROM form_submits_y2 fs
WHERE fs.website_id IN (791)
AND month LIKE '2021%'
GROUP BY 1,2
ORDER BY 1 ASC
)
) q
WHERE
RN <=10
ORDER BY month ASC

Get an average monthly view of active members (Postgresql)

I am working with members data. I have the responsible Coach, the coachee entry, exit status and date. Because some coachees might graduate/leave during a month I want to calculate a daily number and then get a monthly average of active members for each coach. That means that I need to take in the account all coachees from previous months, that are still active that current month. This is my data:
I am thinking of creating a variable first where I can get the daily active member count for each coach. This is my first approach:
with all_years as (
select y.year, m.month, d.day
from generate_series(2019, 2022) as y(year)
cross join generate_series(1, 12) as m(month)
cross join generate_series(1, 31) as d(day) --<<*not sure how to adjust for days with less than 31 days??*
select ay.*, coach, coachee, entry_status, entry_date, exit_reason, exit_date, sum(count) over (partition by ay.coach order by ay.year, ay.month, ay.day)
from all_years ay
left join table t
on --.... *not sure what I can join on in this case*;
I am open to an easier approach, this logic is just an idea.
You can cross join the list of distinct coaches with all dates to generat combinations, then bring the table with a left join:
select d.dt, c.coach, count(t.coach) no_coachees
from (select distinct coach from mytable) c
cross join generate_series('2019-01-01'::date, '2022-12-31'::date, '1 day':: interval) d(dt)
left join mytable t on t.coach = c.coach and t.entry_date <= d.dt and t.exit_date > d.dt
group by d.dt, c.coach
Then you can use another level of aggregation to get the monthly average:
select date_trunc('month', d.dt) d_month, coach, avg(no_coachees) avg_coaches
from (
select d.dt, c.coach, count(t.coach) no_coachees
from (select distinct coach from mytable) c
cross join generate_series('2019-01-01'::date, '2022-12-31'::date, '1 day':: interval) d(dt)
left join mytable t on t.coach = c.coach and t.entry_date <= d.dt and t.exit_date > d.dt
group by d.dt, c.coach
) t
group by date_trunc('month', d.dt), coach

Sub query in SELECT - ungrouped column from outer query

I have to calculate the ARPU (Revenue / # users) but I got this error:
subquery uses ungrouped column "usage_records.date" from outer query
LINE 7: WHERE created_at <= date_trunc('day', usage_records.d... ^
Expected results:
Revenue(day) = SUM(quantity_eur) for that day
Users Count (day) = Total signed up users before that day
Postgresql (Query)
SELECT
date_trunc('day', usage_records.date) AS day,
SUM(usage_records.quantity_eur) as Revenue,
( SELECT
COUNT(users.id)
FROM users
WHERE created_at <= date_trunc('day', usage_records.date)
) as users_count
FROM users
INNER JOIN ownerships ON (ownerships.user_id = users.id)
INNER JOIN profiles ON (profiles.id = ownerships.profile_id)
INNER JOIN usage_records ON (usage_records.profile_id = profiles.id)
GROUP BY DAY
ORDER BY DAY asc
your subquery (executed for each row ) cointain a column nont mentioned in group by but not involeved in aggregation ..
this produce error
but you could refactor your query using a contional also for this value
SELECT
date_trunc('day', usage_records.date) AS day,
SUM(usage_records.quantity_eur) as Revenue,
sum( case when created_at <= date_trunc('day', usage_records.date)
AND users.id is not null
then 1 else 0 end ) users_count
FROM users
INNER JOIN ownerships ON (ownerships.user_id = users.id)
INNER JOIN profiles ON (profiles.id = ownerships.profile_id)
INNER JOIN usage_records ON (usage_records.profile_id = profiles.id)
GROUP BY DAY
ORDER BY DAY asc

POSTGRESQL Time1 = Time2 and values 2 from time2 + 1 hour interval

Hy Coders, i got a question.
i have a table1 with 2 colums (timestamp,value1) and 3 rows. A table 2 with 2 columes( timestamp, value2) with 1000 rows
Table1 are LAb values from samples Table2 are values from an online measurement with 3min interval.
i want to mean all values2 from table 2 in an interval from timestamp1 = timestamp2 + 1 hour before
has anyone an idea?
You can use a condition like:
t2.timestamp BETWEEN t1.timestamp - '1 hour'::interval AND t1.timestamp
This condition can be used to JOIN the tables like:
SELECT * -- the columns you need
FROM table1
LEFT JOIN table2 t2
ON t2.timestamp BETWEEN t1.timestamp - '1 hour'::interval AND t1.timestamp
To get the average of values from table2, you can turn on aggregation:
SELECT t1.timestamp, t1.value, AVG(t2.value) avg_value2
FROM table1
LEFT JOIN table2 t2
ON t2.timestamp BETWEEN t1.timestamp - '1 hour'::interval AND t1.timestamp
GROUP BY t1.timestamp, t1.value

Grouping consecutive dates in PostgreSQL

I have two tables which I need to combine as sometimes some dates are found in table A and not in table B and vice versa. My desired result is that for those overlaps on consecutive days be combined.
I'm using PostgreSQL.
Table A
id startdate enddate
--------------------------
101 12/28/2013 12/31/2013
Table B
id startdate enddate
--------------------------
101 12/15/2013 12/15/2013
101 12/16/2013 12/16/2013
101 12/28/2013 12/28/2013
101 12/29/2013 12/31/2013
Desired Result
id startdate enddate
-------------------------
101 12/15/2013 12/16/2013
101 12/28/2013 12/31/2013
Right. I have a query that I think works. It certainly works on the sample records you provided. It uses a recursive CTE.
First, you need to merge the two tables. Next, use a recursive CTE to get the sequences of overlapping dates. Finally, get the start and end dates, and join back to the "merged" table to get the id.
with recursive allrecords as -- this merges the input tables. Add a unique row identifier
(
select *, row_number() over (ORDER BY startdate) as rowid from
(select * from table1
UNION
select * from table2) a
),
path as ( -- the recursive CTE. This gets the sequences
select rowid as parent,rowid,startdate,enddate from allrecords a
union
select p.parent,b.rowid,b.startdate,b.enddate from allrecords b join path p on (p.enddate + interval '1 day')>=b.startdate and p.startdate <= b.startdate
)
SELECT id,g.startdate,g.enddate FROM -- outer query to get the id
-- inner query to get the start and end of each sequence
(select parent,min(startdate) as startdate, max(enddate) as enddate from
(
select *, row_number() OVER (partition by rowid order by parent,startdate) as row_number from path
) a
where row_number = 1 -- We only want the first occurrence of each record
group by parent)g
INNER JOIN allrecords a on a.rowid = parent
The below fragment does what you intend. (but it will probably be very slow) The problem is that detecteng (non)overlapping dateranges is impossible with standard range operators, since a range could be split into two parts.
So, my code does the following:
split the dateranges from table_A into atomic records, with one date per record
[the same for table_b]
cross join these two tables (we are only interested in A_not_in_B, and B_not_in_A) , remembering which of the L/R outer join wings it came from.
re-aggregate the resulting records into date ranges.
-- EXPLAIN ANALYZE
--
WITH RECURSIVE ranges AS (
-- Chop up the a-table into atomic date units
WITH ar AS (
SELECT generate_series(a.startdate,a.enddate , '1day'::interval)::date AS thedate
, 'A'::text AS which
, a.id
FROM a
)
-- Same for the b-table
, br AS (
SELECT generate_series(b.startdate,b.enddate, '1day'::interval)::date AS thedate
, 'B'::text AS which
, b.id
FROM b
)
-- combine the two sets, retaining a_not_in_b plus b_not_in_a
, moments AS (
SELECT COALESCE(ar.id,br.id) AS id
, COALESCE(ar.which, br.which) AS which
, COALESCE(ar.thedate, br.thedate) AS thedate
FROM ar
FULL JOIN br ON br.id = ar.id AND br.thedate = ar.thedate
WHERE ar.id IS NULL OR br.id IS NULL
)
-- use a recursive CTE to re-aggregate the atomic moments into ranges
SELECT m0.id, m0.which
, m0.thedate AS startdate
, m0.thedate AS enddate
FROM moments m0
WHERE NOT EXISTS ( SELECT * FROM moments nx WHERE nx.id = m0.id AND nx.which = m0.which
AND nx.thedate = m0.thedate -1
)
UNION ALL
SELECT rr.id, rr.which
, rr.startdate AS startdate
, m1.thedate AS enddate
FROM ranges rr
JOIN moments m1 ON m1.id = rr.id AND m1.which = rr.which AND m1.thedate = rr.enddate +1
)
SELECT * FROM ranges ra
WHERE NOT EXISTS (SELECT * FROM ranges nx
-- suppress partial subassemblies
WHERE nx.id = ra.id AND nx.which = ra.which
AND nx.startdate = ra.startdate
AND nx.enddate > ra.enddate
)
;