Showing percentage for the group using MS SQL - mssql-jdbc

I have the following query:
SELECT machine, start_date, sum(duration),
st.status_code, st.status_text
FROM er_table er
LEFT JOIN status_table st on er.status_code=st.status_code
where machine in ('mach1','mach2','mach3')
group by machine, start_date, st.status_code, st.status_text
order by machine, start_date, status_text
It produces the following result:
However, I need to add a percentage within each group of machine and date. E.g. on 15 Sep, mach1 was idle for 20 secs, so 20/(20+800) gives about 2% idle time.
This is the result I need to get:
I found a similar question in another post and modified my code as follows, but it didn't give the result I'm looking for:
SELECT machine, start_date, sum(duration),
SUM(duration) * 100.0 / SUM(SUM(duration)) OVER () AS Percentage,
st.status_code, st.status_text
FROM er_table er
LEFT JOIN status_table st on er.status_code=st.status_code
where machine in ('mach1','mach2','mach3')
group by machine, start_date, st.status_code, st.status_text
order by machine, start_date, status_text
Any help is very much appreciated. Thank you.

SELECT machine, start_date, sum(duration),
-- the query is grouped, so the window SUM has to wrap the aggregate SUM(duration)
SUM(duration) * 100.0 / SUM(SUM(duration)) OVER (PARTITION BY machine, start_date) AS Percentage,
st.status_code, st.status_text
FROM er_table er
LEFT JOIN status_table st on er.status_code=st.status_code
where machine in ('mach1','mach2','mach3')
group by machine, start_date, st.status_code, st.status_text
order by machine, start_date, status_text
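How about doing it like this?
I used PARTITION BY in SUM() OVER(), so each row is divided by the total for its own machine and start_date rather than by the total of all rows.
Result: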
machine  start_date  duration  percentage       status_code  status_text
mach1    2021-09-15  20        2.439024390243   1            IDLE
mach1    2021-09-15  800       97.560975609756  1            RUNNING
mach1    2021-09-16  40        4.255319148936   1            IDLE
mach1    2021-09-16  900       95.744680851063  1            RUNNING
mach2    2021-09-15  100       12.500000000000  1            IDLE
mach2    2021-09-15  700       87.500000000000  1            RUNNING
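To see the difference between OVER () and OVER (PARTITION BY ...) side by side, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and rows are assumptions that mirror the result above (the status_table join is dropped for brevity), and SQLite stands in for SQL Server, so treat it as an illustration of the window logic rather than of the exact T-SQL.

```python
import sqlite3

# Hypothetical sample rows mirroring the result table above (not the asker's real data).
rows = [
    ("mach1", "2021-09-15", 20, "IDLE"),
    ("mach1", "2021-09-15", 800, "RUNNING"),
    ("mach1", "2021-09-16", 40, "IDLE"),
    ("mach1", "2021-09-16", 900, "RUNNING"),
    ("mach2", "2021-09-15", 100, "IDLE"),
    ("mach2", "2021-09-15", 700, "RUNNING"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE er_table (machine TEXT, start_date TEXT, duration INTEGER, status_text TEXT)"
)
conn.executemany("INSERT INTO er_table VALUES (?, ?, ?, ?)", rows)

query = """
SELECT machine, start_date, status_text,
       SUM(duration) AS total,
       -- denominator: total per machine and date
       SUM(duration) * 100.0
           / SUM(SUM(duration)) OVER (PARTITION BY machine, start_date) AS pct_of_group,
       -- denominator: total over every row returned
       SUM(duration) * 100.0
           / SUM(SUM(duration)) OVER () AS pct_of_all
FROM er_table
GROUP BY machine, start_date, status_text
ORDER BY machine, start_date, status_text
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With the empty OVER () every row is divided by the grand total (2560 here), which is why the attempt in the question did not produce per-machine, per-date percentages.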

Related

A query to get per month data for all months and calculate percentage per month per type

From the DB (PostgreSQL) I want to get, for every month, the percentage of stock items with a certain condition. So the total of the whole month is 100%, and each condition is a percentage of that. I have tried all kinds of 'partition by' queries, but I can't quite get it right.
In the example there would be an extra column, and each row would show its percentage of that month. So the value of the new column for the first row would be 25/506*100.
What I have right now, which works, is:
select to_char(created_at, 'YYYY-MM') as maand, count(si.id) as aantal,
case
when condition_id=1 then 'Nieuw'
when condition_id=2 then 'Als nieuw'
when condition_id=3 then 'Goed'
when condition_id=4 then 'Redelijk'
when condition_id=5 then 'Matig'
else 'Onbepaald'
end
from stock_items si
group by maand, condition_id
order by maand desc, condition_id asc
maand    aantal  case       new column
2022-01  25      Nieuw      25/506*100
2022-01  234     Als nieuw  234/506*100
2022-01  127     Goed       127/506*100
2022-01  16      Redelijk   16/506*100
2022-01  104     Matig      104/506*100
2021-12  456     Nieuw      other month
I hope it's all clear. Thanks!
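For reference, the arithmetic behind the requested column can be checked by hand. This is a small Python sketch that uses only the 2022-01 counts from the example table; the variable names are made up for illustration, and rounding to two decimals is an assumption.

```python
# Counts for 2022-01, copied from the example table in the question.
counts = {"Nieuw": 25, "Als nieuw": 234, "Goed": 127, "Redelijk": 16, "Matig": 104}

# The month total is the denominator shared by every row of that month.
month_total = sum(counts.values())

# Each condition's share of the month, as a percentage rounded to 2 decimals.
percentages = {label: round(n / month_total * 100, 2) for label, n in counts.items()}

for label, pct in percentages.items():
    print(label, pct)
```

Any query that answers the question should reproduce these values for 2022-01, with the month total of 506 as the divisor.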
I got what I wanted. I have since realised that I want it a little different, but this is the answer to my question.
select
to_char(created_at, 'YYYY-MM') as maand,
count(id) as aantal,
round((count(id) / (sum(count(id)) over (partition by to_char(created_at, 'YYYY-MM'))) * 100), 2) as percentage,
case
when condition_id=1 then 'Nieuw'
when condition_id=2 then 'Als nieuw'
when condition_id=3 then 'Goed'
when condition_id=4 then 'Redelijk'
when condition_id=5 then 'Matig'
else 'Onbepaald'
end
from stock_items
group by maand, condition_id
order by maand desc, condition_id asc
Just wrap it in a CTE.
with a as (
select to_char(created_at, 'YYYY-MM') as maand, count(si.id) as aantal,
case
when condition_id=1 then 'Nieuw'
when condition_id=2 then 'Als nieuw'
when condition_id=3 then 'Goed'
when condition_id=4 then 'Redelijk'
when condition_id=5 then 'Matig'
else 'Onbepaald'
end as case
from stock_items si
group by maand, condition_id
order by maand desc, condition_id asc)
select a.*, aantal * 100 / sum(aantal) over (PARTITION BY maand) as aantal_rate from a;

Quartiles calculation in Postgresql Query

I am having a hard time trying to get this done. I have the following table:
cod_prod seller price date
A Andres 10 anydate
A Paul 5 anydate
A Mike 2.5 anydate
A Josh 1.75 anydate
A Karen 7.5 anydate
.... ..... ... .......
I am trying to calculate quartiles of the price for each product and classify each seller's price into 4 quartiles.
The output I am expecting is:
Cod_Prod Seller Price Quartile 1stQ 2ndQ 3rdQ 4thQ
A Andres 10 4 2.5 5 7.5 10
A Karen 7.5 3 2.5 5 7.5 10
A Paul 5 2 2.5 5 7.5 10
A Mike 2.5 1 2.5 5 7.5 10
A Josh 1.75 1 2.5 5 7.5 10
.. ..... .... .... .... .. ... ...
This table has thousands of distinct cod_prod and thousands of sellers.
I am trying this query:
with cte as (
select seller, cod_prod, sum(price) as sum_price
from tablename
group by 2,1
)
select seller,
cod_prod,
sum_price,
ntile(4) over (partition by seller order by sum_price asc) quartile
from cte
But this is not doing what I expect, and it is still missing the 1stQ to 4thQ bin boundaries.
I tried many different things but this is the closest I got from what I want.
Can someone help me to solve it?
I am not sure if this query is exactly what you want, but I think it can help you.
I calculated the quartiles grouped by cod_prod, and I partition ntile(4) by cod_prod as well instead of by seller.
WITH cte AS (SELECT seller, cod_prod, sum(price) as sum_price
FROM t
GROUP BY seller, cod_prod),
quartiles AS (SELECT
cod_prod,
percentile_cont(0.25) within group (order by sum_price asc) as "1stQ",
percentile_cont(0.50) within group (order by sum_price asc) as "2ndQ",
percentile_cont(0.75) within group (order by sum_price asc) as "3rdQ",
percentile_cont(1) within group (order by sum_price asc) as "4thQ"
FROM cte
GROUP BY cod_prod)
SELECT cte.*,
ntile(4) over (PARTITION BY cte.cod_prod ORDER BY sum_price ASC) quartile,
quartiles.*
FROM cte
INNER JOIN quartiles ON cte.cod_prod = quartiles.cod_prod;
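To check what the two functions should return for the sample rows, here is a rough Python sketch of the logic. The helper names percentile_cont and ntile are my own re-implementations for illustration (linear interpolation between neighbouring values, and bucket sizes that put the remainder rows in the earliest buckets); they are not PostgreSQL's code, and the ntile helper assumes distinct prices.

```python
import math

def percentile_cont(values, fraction):
    """Percentile with linear interpolation between adjacent sorted values."""
    ordered = sorted(values)
    position = fraction * (len(ordered) - 1)  # 0-based fractional row position
    lower = math.floor(position)
    upper = math.ceil(position)
    weight = position - lower
    return ordered[lower] + weight * (ordered[upper] - ordered[lower])

def ntile(values, buckets):
    """Map each (distinct) value to a bucket 1..buckets in ascending order."""
    ordered = sorted(values)
    base, extra = divmod(len(ordered), buckets)
    assignment = {}
    index = 0
    for bucket in range(1, buckets + 1):
        # The first `extra` buckets each take one additional row.
        size = base + (1 if bucket <= extra else 0)
        for value in ordered[index:index + size]:
            assignment[value] = bucket
        index += size
    return assignment

# Prices for product A from the question.
prices = {"Andres": 10, "Paul": 5, "Mike": 2.5, "Josh": 1.75, "Karen": 7.5}

cut_points = [percentile_cont(prices.values(), f) for f in (0.25, 0.5, 0.75, 1)]
quartile_of = ntile(prices.values(), 4)

for seller, price in sorted(prices.items(), key=lambda item: -item[1]):
    print(seller, price, quartile_of[price], cut_points)
```

For product A this gives cut points 2.5, 5, 7.5 and 10, and puts both Josh and Mike in quartile 1, which matches the expected output in the question.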

Monthly retention in Amazon redshift

I'm trying to calculate monthly retention rate in Amazon Redshift and have come up with the following query:
Query 1
SELECT EXTRACT(year FROM activity.created_at) AS Year,
EXTRACT(month FROM activity.created_at) AS Month,
COUNT(DISTINCT activity.member_id) AS active_users,
COUNT(DISTINCT future_activity.member_id) AS retained_users,
COUNT(DISTINCT future_activity.member_id) / COUNT(DISTINCT activity.member_id)::float AS retention
FROM ads.fbs_page_view_staging activity
LEFT JOIN ads.fbs_page_view_staging AS future_activity
ON activity.mongo_id = future_activity.mongo_id
AND datediff ('month',activity.created_at,future_activity.created_at) = 1
GROUP BY Year,
Month
ORDER BY Year,
Month
For some reason this query returns zero retained_users and zero retention. I'd appreciate any help regarding why this may be happening or maybe a completely different query for monthly retention would work.
I modified the query as per another SO post and here it goes:
Query 2
WITH t AS (
SELECT member_id
,date_trunc('month', created_at) AS month
,count(*) AS item_transactions
,lag(date_trunc('month', created_at)) OVER (PARTITION BY member_id
ORDER BY date_trunc('month', created_at))
= date_trunc('month', created_at) - interval '1 month'
OR NULL AS repeat_transaction
FROM ads.fbs_page_view_staging
WHERE created_at >= '2016-01-01'::date
AND created_at < '2016-04-01'::date -- time range of interest.
GROUP BY 1, 2
)
SELECT month
,sum(item_transactions) AS num_trans
,count(*) AS num_buyers
,count(repeat_transaction) AS repeat_buyers
,round(
CASE WHEN sum(item_transactions) > 0
THEN count(repeat_transaction) / sum(item_transactions) * 100
ELSE 0
END, 2) AS buyer_retention
FROM t
GROUP BY 1
ORDER BY 1;
This query gives me the following error:
An error occurred when executing the SQL command:
WITH t AS (
SELECT member_id
,date_trunc('month', created_at) AS month
,count(*) AS item_transactions
,lag(date_trunc('m...
[Amazon](500310) Invalid operation: Interval values with month or year parts are not supported
Details:
-----------------------------------------------
error: Interval values with month or year parts are not supported
code: 8001
context: interval months: "1"
query: 616822
location: cg_constmanager.cpp:145
process: padbmaster [pid=15116]
-----------------------------------------------;
I have a feeling that Query 2 would fare better than Query 1, so I'd prefer to fix the error on that.
Any help would be much appreciated.
Query 1 looks good; I tried a similar one (see below). The problem is that you are self-joining the table (ads.fbs_page_view_staging) on mongo_id and then comparing the same column (created_at) on both sides. Assuming mongo_id is unique, each row joins only to itself, so datediff('month', ...) always returns 0 and the condition datediff ('month',activity.created_at,future_activity.created_at) = 1 is always false.
-- Count distinct events of join_col_id that have lapsed for one month.
SELECT count(distinct E.join_col_id) dist_ct
FROM public.fact_events E
JOIN public.dim_table Z
ON E.join_col_id = Z.join_col_id
WHERE datediff('month', event_time, sysdate) = 1;
-- 2771654 -- dist_ct
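The effect of the join column can be reproduced on a tiny data set. The sketch below uses Python's sqlite3 with a made-up page_views table and four hypothetical rows; a year*12+month difference stands in for Redshift's datediff('month', ...), and joining on member_id is my assumption about what the self join was meant to do, not something confirmed in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (mongo_id INTEGER, member_id TEXT, created_at TEXT)")
# Hypothetical rows: m1 is active in January and February, m2 only in January, m3 only in February.
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?)",
    [
        (1, "m1", "2016-01-10"),
        (2, "m1", "2016-02-05"),
        (3, "m2", "2016-01-20"),
        (4, "m3", "2016-02-11"),
    ],
)

def retention(join_column):
    """Monthly active and retained users, self-joining on the given column."""
    sql = f"""
    SELECT strftime('%Y-%m', a.created_at) AS month,
           COUNT(DISTINCT a.member_id) AS active_users,
           COUNT(DISTINCT f.member_id) AS retained_users
    FROM page_views a
    LEFT JOIN page_views f
      ON a.{join_column} = f.{join_column}
     -- month difference of exactly 1 (stand-in for datediff('month', a, f) = 1)
     AND (CAST(strftime('%Y', f.created_at) AS INTEGER) * 12
          + CAST(strftime('%m', f.created_at) AS INTEGER))
       - (CAST(strftime('%Y', a.created_at) AS INTEGER) * 12
          + CAST(strftime('%m', a.created_at) AS INTEGER)) = 1
    GROUP BY month
    ORDER BY month
    """
    return conn.execute(sql).fetchall()

print(retention("mongo_id"))   # unique key: every row only matches itself
print(retention("member_id"))  # member key: rows from different months can match
```

With the unique mongo_id no row ever finds a partner one month later, so retained_users stays at zero; with member_id, m1 is counted as retained for January.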

Obtaining a date-bound running total on postgresql

I have a database query running on Postgresql 9.3 that looks like this in order to obtain a running balance of accounting entries:
select *,(sum(amount) over(partition
by
ae.account_id
order by
ae.date_posted,
ae.account_id
)) as formula0_1_
from
account_entry as ae
-- where ae.date_posted> '2014-01-01'
order by account_id desc, date_posted asc
expected output without the where clause would be:
id | date | amount | running balance
1 2014-01-01 10 10
2 2014-01-02 10 20
what I'm getting with the where clause:
id | date | amount | running balance
2 2014-01-02 10 10
How can I make this query return the same correct running balance when I filter by a date range (the bit commented out above)?
You need to select and calculate your running balances first over all the data, and then put a WHERE clause in an outer SELECT.
SELECT
*
FROM
(SELECT
*,
SUM(amount) OVER (
PARTITION BY
ae.account_id
ORDER BY
ae.date_posted,
ae.account_id
) AS formula0_1_
FROM
account_entry AS ae) AS total
WHERE
total.date_posted > '2014-01-01'
ORDER BY
account_id DESC,
date_posted ASC;
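The reason is that WHERE runs before window functions are evaluated, so rows filtered out never reach the running sum. Here is a small sketch with Python's sqlite3 that reproduces both outputs from the question; the two rows come from the question, while the single account_id of 1 and the shortened column list are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account_entry (id INTEGER, account_id INTEGER, date_posted TEXT, amount INTEGER)"
)
# The two entries from the question, assumed to belong to one account.
conn.executemany(
    "INSERT INTO account_entry VALUES (?, ?, ?, ?)",
    [(1, 1, "2014-01-01", 10), (2, 1, "2014-01-02", 10)],
)

# Filter first: the window only sees the rows that survive the WHERE clause.
filtered_first = conn.execute("""
    SELECT id, date_posted, amount,
           SUM(amount) OVER (PARTITION BY account_id ORDER BY date_posted) AS running_balance
    FROM account_entry
    WHERE date_posted > '2014-01-01'
""").fetchall()

# Window first, filter in the outer query: the balance includes the earlier rows.
filtered_last = conn.execute("""
    SELECT *
    FROM (SELECT id, date_posted, amount,
                 SUM(amount) OVER (PARTITION BY account_id ORDER BY date_posted) AS running_balance
          FROM account_entry) AS total
    WHERE date_posted > '2014-01-01'
""").fetchall()

print(filtered_first)
print(filtered_last)
```

The first query reports a balance of 10 for entry 2, the second reports 20, because only the second one sums over the entry from 2014-01-01 before discarding it.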

need to capture NULL in a query

I have a query that shows the success rate for staff, and it works splendidly with one exception: if staff member "Bob" has had no activity in the date range, he does not appear in the results. If he had at least one code in the query, he would show up with 0% or 100%, but with no codes attached to his name he is missing entirely. I have seen an example that uses
ISNULL(s.code, 'No Entry') AS NoContact, but I guess I am not using it correctly,
and I cannot figure out how to add it to the query. Can someone assist?
Here is the current query, which works great but omits any staff who do not have any of the codes:
SELECT st.staff_id
,round((count(s.code IN ('10401','10402','10403') OR NULL) * 100.0)
/ count(*), 1) AS successes
-- unsuccessful code is 10405
FROM notes n
JOIN services s ON s.zzud_service = n.zrud_service
JOIN staff st ON st.zzud_staff = n.zrud_staff
WHERE n.date_service >= DATE '07/01/2014' AND n.date_service <= CURRENT_DATE
-- n.date_service BETWEEN (now() - '30 days'::interval) AND now()
AND s.code IN ('10401','10402','10403','10405')
GROUP BY st.staff_id;
Here is a sample result:
Staff SuccessRate Explanation
Sam 100% (has 1 successful and 0 unsuccessful)
Joe 50% (has 1 successful and 1 unsuccessful)
Amy 0% (has 1 unsuccessful)
Bob does not show ( no discharges in the date range)
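Since the staff table comes last in the FROM clause, you need to RIGHT JOIN it and move the filter conditions from the WHERE clause into the join conditions. A filter left in WHERE is applied after the join and removes the NULL-extended row that the outer join produces for staff without notes.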
select
st.staff_id,
round(
count(s.code in ('10401','10402','10403') or null) * 100.0
/
count(*)
, 1) as successes
-- unsuccessful code is 10405
from
notes n
inner join
services s on
s.zzud_service = n.zrud_service and
n.date_service >= date '07/01/2014' and
n.date_service <= current_date
right join
staff st on
st.zzud_staff = n.zrud_staff
-- n.date_service between (now() - '30 days'::interval) and now()
and s.code in ('10401','10402','10403','10405')
group by st.staff_id;
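Here is a reduced sketch of the same idea in Python's sqlite3, to show the effect of moving the code filter from WHERE into ON. The schema is a simplified assumption (notes carries staff_id and code directly, with no services table or date filter), and it uses staff LEFT JOIN notes, which is the mirror image of the RIGHT JOIN above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staff_id TEXT)")
conn.execute("CREATE TABLE notes (staff_id TEXT, code TEXT)")
conn.executemany("INSERT INTO staff VALUES (?)", [("Sam",), ("Joe",), ("Amy",), ("Bob",)])
# Hypothetical notes matching the sample result: Bob has none.
conn.executemany(
    "INSERT INTO notes VALUES (?, ?)",
    [("Sam", "10401"), ("Joe", "10401"), ("Joe", "10405"), ("Amy", "10405")],
)

# Same success-rate expression as in the answer.
rate = """ROUND(COUNT(n.code IN ('10401','10402','10403') OR NULL) * 100.0
                / COUNT(*), 1)"""

# Filter in WHERE: Bob's NULL-extended row fails the test and is dropped.
with_where = conn.execute(f"""
    SELECT st.staff_id, {rate} AS successes
    FROM staff st
    LEFT JOIN notes n ON n.staff_id = st.staff_id
    WHERE n.code IN ('10401','10402','10403','10405')
    GROUP BY st.staff_id
    ORDER BY st.staff_id
""").fetchall()

# Filter in ON: the outer join keeps Bob even though nothing matches.
with_on = conn.execute(f"""
    SELECT st.staff_id, {rate} AS successes
    FROM staff st
    LEFT JOIN notes n ON n.staff_id = st.staff_id
                     AND n.code IN ('10401','10402','10403','10405')
    GROUP BY st.staff_id
    ORDER BY st.staff_id
""").fetchall()

print(with_where)
print(with_on)
```

Note that Bob comes back as 0.0 rather than NULL, because COUNT(*) counts his single NULL-extended row; dividing by COUNT(n.code) instead would be one way to tell "no entries" apart from "no successes".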