Count Divided by a Count - tsql

How do you divide a count by another count? I have seen a few different methods, but I am unable to get them to work for my purposes. The code that I am working on currently is:
Select(select COUNT(lUsers)
FROM Tlocation
WHERE dLastUpdated IS NOT NULL
AND dRemovalDate BETWEEN DATEADD(day,-7,GETDATE()) and GETDATE())
/
(SELECT COUNT(lUsers)
FROM Tlocation
WHERE dLastUpdated < GETDATE()
AND dRemovalDate IS NULL OR dRemovalDate > GETDATE())
But this just returns a 0 every time.

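It's because COUNT() returns an INT, and dividing one INT by another performs integer division, so any result between 0 and 1 is truncated to 0. Cast one of the operands to a decimal type before dividing. Here's how you could do it: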
SELECT
(
SELECT 1562
)
/
CAST(
(
SELECT 92825
)
AS DECIMAL(20, 10)
)
;
Or for your query:
Select(select COUNT(lUsers)
FROM Tlocation
WHERE dLastUpdated IS NOT NULL
AND dRemovalDate BETWEEN DATEADD(day,-7,GETDATE()) and GETDATE())
/
CAST(
(SELECT COUNT(lUsers)
FROM Tlocation
WHERE dLastUpdated < GETDATE()
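-- Parentheses added on the assumption that the OR should only apply to dRemovalDate; AND binds tighter than OR.
AND (dRemovalDate IS NULL OR dRemovalDate > GETDATE()))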
AS DECIMAL(20, 10)
)
;
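The truncation is easy to reproduce outside SQL Server. Below is a minimal sketch using Python's built-in sqlite3 module as a stand-in. This is an assumption for illustration only: SQLite also divides two integers as integers, but it is not T-SQL, so the sketch casts to REAL rather than DECIMAL. The NULLIF call shows one possible way to get NULL instead of a divide-by-zero error when the second count is 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def scalar(sql):
    """Run a single-value query and return that value."""
    return conn.execute(sql).fetchone()[0]

# Integer / integer truncates toward zero, just like two COUNT() results.
int_ratio = scalar("SELECT 1562 / 92825")

# Casting one operand to a non-integer type keeps the fractional part.
real_ratio = scalar("SELECT 1562 / CAST(92825 AS REAL)")

# NULLIF turns a zero denominator into NULL, so the result is NULL.
safe_ratio = scalar("SELECT 1562 / NULLIF(CAST(0 AS REAL), 0)")

print(int_ratio, round(real_ratio, 6), safe_ratio)
```

The same idea carries over to the query above: only one side of the division needs the cast.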

Related

Count the number of instances the time is above average time

Here is my code:
arrival_cluster_raw as (
SELECT
routes.uc_id ,
cg.cluster_id ,
cg.cluster_centroid ,
routes.imei ,
routes.time_created::date as campaign_date,
min(routes.time_created) as m_per_imei_cluster
FROM cluster_groups as cg
group by 1,2,3,4,5
)
,
arrival_cluster_final as
(
select uc_id, campaign_date, cluster_id, cluster_centroid , date_trunc('second', AVG(m_per_imei_cluster::TIME)) as avg_arrival_time,
count(case when m_per_imei_cluster::TIME < (select AVG(m_per_imei_cluster::TIME) from arrival_cluster_raw) then 1 else null END) as "num_of_arrival_teams_before_avg_time"
,count(case when m_per_imei_cluster::TIME > (select AVG(m_per_imei_cluster::TIME) from arrival_cluster_raw) then 1 else null END) as "num_of_arrival_teams_after_avg_time"
FROM arrival_cluster_raw
group by uc_id,cluster_id, cluster_centroid ,campaign_date
)
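The problem is that in arrival_cluster_final each row is compared against the average over the entire arrival_cluster_raw CTE, whereas I want to compare it against the average for its own combination of uc_id, cluster_id, cluster_centroid, campaign_date.
Can you try this one? It takes the average per uc_id, cluster_id, cluster_centroid, campaign_date partition with a window function instead of a subquery over the whole CTE.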
WITH arrival_cluster_raw AS (
SELECT
routes.uc_id,
cg.cluster_id,
cg.cluster_centroid,
routes.imei,
routes.time_created::date AS campaign_date,
min(routes.time_created) AS m_per_imei_cluster
FROM
cluster_groups AS cg
JOIN routes ON routes.uc_id = cg.id -- assumed join condition; adjust it to how you actually want to join.
GROUP BY
1,2,3,4,5
),
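arrival_with_avg AS (
-- Attach each group's average to its rows first; a window function cannot be nested inside an aggregate.
SELECT
uc_id,
cluster_id,
cluster_centroid,
campaign_date,
m_per_imei_cluster::TIME AS arrival_time,
avg(m_per_imei_cluster::TIME) OVER w AS group_avg_time -- AVG over ::TIME is kept as in the question
FROM
arrival_cluster_raw
WINDOW w AS (PARTITION BY uc_id,
cluster_id,
cluster_centroid,
campaign_date)),
arrival_cluster_final AS (
SELECT
uc_id,
cluster_id,
cluster_centroid,
campaign_date,
date_trunc('second', min(group_avg_time)) AS avg_arrival_time
-- "before" means the arrival time is earlier than the group average
,count(CASE WHEN arrival_time < group_avg_time THEN 1 END) AS num_of_arrival_teams_before_avg_time
,count(CASE WHEN arrival_time > group_avg_time THEN 1 END) AS num_of_arrival_teams_after_avg_time
FROM
arrival_with_avg
GROUP BY
uc_id, cluster_id, cluster_centroid, campaign_date)
SELECT * FROM arrival_cluster_final ORDER BY 1;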
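To see why the partitioning matters, here is a small sketch in plain Python rather than SQL. The group keys and arrival times (minutes after midnight) are made-up values, not data from the question. It counts rows before and after the average twice: once against each group's own average and once against the overall average.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (group key, arrival time in minutes after midnight) rows.
rows = [("g1", 10), ("g1", 20), ("g1", 60), ("g2", 100), ("g2", 200)]

groups = defaultdict(list)
for key, minutes in rows:
    groups[key].append(minutes)

def before_after(values, reference):
    """Count values strictly before and strictly after the reference."""
    return (sum(v < reference for v in values), sum(v > reference for v in values))

# Per-group average: what PARTITION BY (or GROUP BY) gives you.
per_group = {key: before_after(vals, mean(vals)) for key, vals in groups.items()}

# Overall average: what a subquery over the whole CTE gives you.
overall_avg = mean(minutes for _, minutes in rows)
against_overall = {key: before_after(vals, overall_avg) for key, vals in groups.items()}

print(per_group)        # {'g1': (2, 1), 'g2': (1, 1)}
print(against_overall)  # {'g1': (3, 0), 'g2': (0, 2)}
```

With the overall average, every row of a group can land on the same side, which matches the symptom described in the question.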

How can I increment the numerical value in my WHERE clause using a loop?

I am currently using the UNION ALL workaround below to calculate the old_eps_tfq regression slope of each ticker for a given rownum cutoff (see WHERE rownum < x). I want to know the result when rownum < 4, then increment 4 by 1 to get the result when rownum < 5, and so on (there are ~20 rownum values).
Could I use PL/pgSQL for this?
SELECT * FROM(
WITH regression_slope AS(
SELECT
ROW_NUMBER() OVER ( PARTITION BY ticker ORDER BY earnings_growths_ped) AS rownum,
*
FROM "ANALYTICS"."vEARNINGS_GROWTHS"
--WHERE ticker = 'ACN'
ORDER BY ticker )
SELECT
ticker,
current_period_end_date,
max(earnings_growths_ped) AS max_earnings_growths_ped,
--max(rownum) AS max_rownum,
round(regr_slope(old_eps_tfq, rownum)::numeric, 2) AS slope,
round(regr_intercept(old_eps_tfq, rownum)::numeric, 2) AS y_intercept,
round(regr_r2(old_eps_tfq, rownum)::numeric, 3) AS r_squared
FROM regression_slope
WHERE rownum < 4
GROUP BY ticker, current_period_end_date
ORDER BY ticker asc ) q
UNION ALL
SELECT * FROM(
WITH regression_slope AS(
SELECT
ROW_NUMBER() OVER ( PARTITION BY ticker ORDER BY earnings_growths_ped) AS rownum,
*
FROM "ANALYTICS"."vEARNINGS_GROWTHS"
--WHERE ticker = 'ACN'
ORDER BY ticker )
SELECT
ticker,
current_period_end_date,
max(earnings_growths_ped) AS max_earnings_growths_ped,
--max(rownum) AS max_rownum,
round(regr_slope(old_eps_tfq, rownum)::numeric, 2) AS slope,
round(regr_intercept(old_eps_tfq, rownum)::numeric, 2) AS y_intercept,
round(regr_r2(old_eps_tfq, rownum)::numeric, 3) AS r_squared
FROM regression_slope
WHERE rownum < 5
GROUP BY ticker, current_period_end_date
ORDER BY ticker asc ) q
Here is my table
The outer query SELECT * FROM (...) q looks unnecessary.
You can instead join against generate_series, so that every cutoff is evaluated in a single query:
WITH regression_slope AS(
SELECT
ROW_NUMBER() OVER ( PARTITION BY ticker ORDER BY earnings_growths_ped) AS rownum,
*
FROM "ANALYTICS"."vEARNINGS_GROWTHS"
--WHERE ticker = 'ACN'
ORDER BY ticker )
SELECT
max,
ticker,
current_period_end_date,
max(earnings_growths_ped) AS max_earnings_growths_ped,
--max(rownum) AS max_rownum,
round(regr_slope(old_eps_tfq, rownum)::numeric, 2) AS slope,
round(regr_intercept(old_eps_tfq, rownum)::numeric, 2) AS y_intercept,
round(regr_r2(old_eps_tfq, rownum)::numeric, 3) AS r_squared
FROM regression_slope
INNER JOIN generate_series(4, 24) AS max -- the range 4 to 24 can be adjusted to the need
ON rownum < max
GROUP BY max, ticker, current_period_end_date
ORDER BY max asc, ticker asc
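What the join does can be sketched in plain Python: each value of the series selects the rows with rownum below it, and the regression is computed once per value. The data points and the slope helper below are hypothetical and written only for this illustration. The helper uses the usual least-squares formula, which is what regr_slope(Y, X) computes.

```python
def slope(points):
    """Least-squares slope of y over x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx if sxx else None

# Hypothetical (rownum, old_eps_tfq) rows for a single ticker.
rows = [(1, 1.0), (2, 2.0), (3, 3.0), (4, 5.0), (5, 7.0)]

# Equivalent of: JOIN generate_series(4, 6) AS max ON rownum < max ... GROUP BY max
slopes = {}
for max_rownum in range(4, 7):  # generate_series is inclusive, so 4, 5, 6
    subset = [(x, y) for x, y in rows if x < max_rownum]
    slopes[max_rownum] = round(slope(subset), 2)

print(slopes)  # {4: 1.0, 5: 1.3, 6: 1.5}
```

Each row is reused for every cutoff above its rownum, which is why no loop or PL/pgSQL is needed on the SQL side.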

Performance tune tSQL Query count(*) & subqueries

I know that there's a better way to do what I'm trying to accomplish here. Though the query works, I fear its performance will suffer as the datasets it is applied to grow.
I don't even necessarily need someone to rewrite what I have; if they would just be willing to point me in the direction of the topic I should study, I would greatly appreciate it.
What I'm trying to return with this query is a count of the number of records at or above a certain status.
Thanks in advance for your help!
SELECT
( SELECT count(*)
FROM TABLE1 c1
WHERE ( c1.U_KEY3 NOT LIKE 'z%' AND (c1.U_KEY1 = '' or c1.U_KEY1 IS NULL) )
) AS 'STATUS is EMPTY'
,
( SELECT count(*)
FROM TABLE1 c1
WHERE ( c1.U_KEY3 NOT LIKE 'z%' AND LEFT(c1.U_KEY1,2) >= '70' )
) AS 'STATUS > 70'
,
( SELECT count(*)
FROM TABLE1 c1
WHERE ( c1.U_KEY3 NOT LIKE 'z%' AND LEFT(c1.U_KEY1,2) >= '50' )
) AS 'STATUS > 50'
,
( SELECT count(*)
FROM TABLE1 c1
WHERE ( c1.U_KEY3 NOT LIKE 'z%' AND LEFT(c1.U_KEY1,2) >= '30' )
) AS 'STATUS > 30'
,
( SELECT count(*)
FROM TABLE1 c1
WHERE ( c1.U_KEY3 NOT LIKE 'z%' AND LEFT(c1.U_KEY1,2) >= '10' )
) AS 'STATUS > 10'
You could roll all the subqueries into a single query using CASE expressions inside the aggregates:
SELECT
SUM(CASE WHEN c1.U_KEY1 = '' OR c1.U_KEY1 IS NULL THEN 1 ELSE 0 END) AS 'STATUS IS EMPTY',
SUM(CASE WHEN LEFT(c1.U_KEY1,2) >= '70' THEN 1 ELSE 0 END) AS 'STATUS > 70',
SUM(CASE WHEN LEFT(c1.U_KEY1,2) >= '50' THEN 1 ELSE 0 END) AS 'STATUS > 50',
SUM(CASE WHEN LEFT(c1.U_KEY1,2) >= '30' THEN 1 ELSE 0 END) AS 'STATUS > 30',
SUM(CASE WHEN LEFT(c1.U_KEY1,2) >= '10' THEN 1 ELSE 0 END) AS 'STATUS > 10'
FROM TABLE1 c1
WHERE c1.U_KEY3 NOT LIKE 'z%'
But this might not run as fast as the individual subqueries.
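A quick way to convince yourself that the two forms return the same numbers is to run both on a tiny table. The sketch below uses Python's built-in sqlite3 module as a stand-in, with made-up rows. Two things are assumptions forced by the stand-in: SQLite has no LEFT(), so substr(u_key1, 1, 2) is used instead, and the table and column names are lowercased.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (u_key1 TEXT, u_key3 TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    # Hypothetical rows; the 'z9' row must be filtered out everywhere.
    [(None, "a1"), ("", "a2"), ("75x", "a3"), ("55x", "a4"), ("35x", "z9"), ("15x", "a5")],
)

# One pass over the table with conditional aggregation.
combined = conn.execute("""
    SELECT
        SUM(CASE WHEN u_key1 = '' OR u_key1 IS NULL THEN 1 ELSE 0 END),
        SUM(CASE WHEN substr(u_key1, 1, 2) >= '70' THEN 1 ELSE 0 END),
        SUM(CASE WHEN substr(u_key1, 1, 2) >= '50' THEN 1 ELSE 0 END),
        SUM(CASE WHEN substr(u_key1, 1, 2) >= '10' THEN 1 ELSE 0 END)
    FROM table1
    WHERE u_key3 NOT LIKE 'z%'
""").fetchone()

# One of the original-style subqueries, for comparison.
separate_50 = conn.execute("""
    SELECT count(*) FROM table1
    WHERE u_key3 NOT LIKE 'z%' AND substr(u_key1, 1, 2) >= '50'
""").fetchone()[0]

print(combined, separate_50)  # (2, 1, 2, 3) 2
```

Whether the single pass is actually faster on SQL Server depends on the table and its indexes, so it is worth comparing both forms on real data.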
I would turn the question around like this:
DECLARE @t TABLE (Id INT, U_Key1 VARCHAR(4) null);
INSERT INTO @t (id,U_Key1)
VALUES
(1,null),
(2,'902'),
(3,'452'),
(4,'401'),
(5,'103'),
(6,'359'),
(7,'335'),
(8,'772'),
(9,'143'),
(10,'222'),
(11,'664'),
(12,'992'),
(13,'122'),
(14,'332'),
(15,'421'),
(16,'622'),
(17,'982'),
(18,'1234'),
(19,null),
(20,'012');
WITH A AS (
SELECT CAST(LEFT(U_Key1,2) AS INT) val FROM @t
), limits AS (
SELECT 10 limitval, 'Status >= 10' limittext
UNION ALL
SELECT 30 , 'Status >= 30'
UNION ALL
SELECT 50 , 'Status >= 50'
UNION ALL
SELECT 70 , 'Status >= 70'
), Counts AS (
SELECT 'Status is empty' Limittext, COUNT(id) Count FROM @t
WHERE U_Key1 IS null
UNION ALL
SELECT l.limittext, COUNT( A.val) Count FROM A
CROSS JOIN limits l
WHERE A.val >= l.limitval
GROUP BY l.limittext
)
SELECT * FROM Counts
That produces the result:
Status is empty 2
Status >= 10 17
Status >= 30 12
Status >= 50 6
Status >= 70 4

TSQL get overlapping periods from datetime ranges

I have a table with date ranges and I need the sum of the overlapping periods (in hours) between its rows.
This is a schema example:
create table period (
id int,
starttime datetime,
endtime datetime,
type varchar(64)
);
insert into period values (1,'2013-04-07 8:00','2013-04-07 13:00','Work');
insert into period values (2,'2013-04-07 14:00','2013-04-07 17:00','Work');
insert into period values (3,'2013-04-08 8:00','2013-04-08 13:00','Work');
insert into period values (4,'2013-04-08 14:00','2013-04-08 17:00','Work');
insert into period values (5,'2013-04-07 10:00','2013-04-07 11:00','Holyday'); /* 1h overlapping with 1*/
insert into period values (6,'2013-04-08 10:00','2013-04-08 20:00','Transfer'); /* 6h overlapping with 3 and 4*/
insert into period values (7,'2013-04-08 11:00','2013-04-08 12:00','Test'); /* 1h overlapping with 3 and 6*/
And its fiddle: http://sqlfiddle.com/#!6/9ca31/10
I expect a sum of 8h overlapping hours:
1h (id 5 over id 1)
6h (id 6 over id 3 and 4)
1h (id 7 over id 3 and 6)
I checked this question: "select overlapping datetime events with SQL", but it does not seem to do what I need.
Thank you.
select sum(datediff(hh, case when t2.starttime > t1.starttime then t2.starttime else t1.starttime end,
case when t2.endtime > t1.endtime then t1.endtime else t2.endtime end))
from period t1
join period t2 on t1.id < t2.id
where t2.endtime > t1.starttime and t2.starttime < t1.endtime;
Updated to handle several overlaps:
select sum(datediff(hh, start, fin))
from (select distinct
case when t2.starttime > t1.starttime then t2.starttime else t1.starttime end as start,
case when t2.endtime > t1.endtime then t1.endtime else t2.endtime end as fin
from period t1
join period t2 on t1.id < t2.id
where t2.endtime > t1.starttime and t2.starttime < t1.endtime
) as overlaps;
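The difference between the two queries can be checked by hand on the sample rows. The sketch below redoes the same pairwise logic in plain Python, with the rows copied from the question: it takes every pair with t1.id < t2.id, keeps the later start and the earlier end, and keeps only pairs where that span is non-empty. It measures elapsed hours, whereas DATEDIFF(hh, ...) counts hour boundaries crossed; the two agree here because every timestamp is on the hour.

```python
from datetime import datetime
from itertools import combinations

def dt(text):
    return datetime.strptime(text, "%Y-%m-%d %H:%M")

# (id, starttime, endtime) rows from the question, in id order.
periods = [
    (1, dt("2013-04-07 08:00"), dt("2013-04-07 13:00")),
    (2, dt("2013-04-07 14:00"), dt("2013-04-07 17:00")),
    (3, dt("2013-04-08 08:00"), dt("2013-04-08 13:00")),
    (4, dt("2013-04-08 14:00"), dt("2013-04-08 17:00")),
    (5, dt("2013-04-07 10:00"), dt("2013-04-07 11:00")),
    (6, dt("2013-04-08 10:00"), dt("2013-04-08 20:00")),
    (7, dt("2013-04-08 11:00"), dt("2013-04-08 12:00")),
]

# Self-join on t1.id < t2.id, keeping only pairs that really overlap.
pairwise = []
for (_, s1, e1), (_, s2, e2) in combinations(periods, 2):
    start, fin = max(s1, s2), min(e1, e2)
    if start < fin:
        pairwise.append((start, fin))

def hours(spans):
    return sum((fin - start).total_seconds() / 3600 for start, fin in spans)

total_pairwise = hours(pairwise)       # first query: every overlapping pair counted
total_distinct = hours(set(pairwise))  # second query: identical spans counted once

print(total_pairwise, total_distinct)  # 9.0 8.0
```

The 11:00 to 12:00 span on 2013-04-08 appears twice (id 7 with id 3 and id 7 with id 6), which is the extra hour that DISTINCT removes to reach the expected 8.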
I have a somewhat "dirty" solution. Hope this helps :)
with src as (
select
convert(varchar, starttime, 112) [start_date]
, cast(left(convert(varchar, starttime, 108), 2) as int) [start_time]
, convert(varchar, endtime, 112) [end_date]
, cast(left(convert(varchar, endtime, 108), 2) as int) [end_time]
, id
from [period]),
[gr] as (
select
row_number() over(order by s1.[start_date], s1.[start_time], s1.[end_time], s2.[start_time], s2.[end_time]) [no]
, s1.[start_date] [date]
, s1.[start_time] [t1]
, s1.[end_time] [t2]
, s2.[start_time] [t3]
, s2.[end_time] [t4]
from src s1
join src s2 on s1.[start_date] = s2.[start_date]
and s1.[end_date] = s2.[end_date]
and (s1.[start_time] between s2.[start_time] and s2.[end_time] or s1.[end_time] between s2.[start_time] and s2.[end_time])
and s1.id != s2.id),
[raw] as (
select [no], [date], [t1] [h] from [gr] union all
select [no], [date], [t2] from [gr] union all
select [no], [date], [t3] from [gr] union all
select [no], [date], [t4] from [gr]),
[max_min] as (
select [no], [date], max(h) [max_h], min(h) [min_h]
from [raw]
group by [no], [date]
),
[result] as (
select [raw].*
from [raw]
left join [max_min] on [raw].[no] = [max_min].[no]
and ([raw].h = [max_min].[max_h] or [raw].h = [max_min].[min_h])
where [max_min].[no] is null),
[final] as (
select distinct r1.[date], r1.h [start_h], r2.h [end_h], abs(r1.h - r2.h) [dif]
from [result] r1
join [result] r2 on r1.[no] = r2.[no]
where abs(r1.h - r2.h) > 0
and r1.h > r2.h)
select sum(dif) [overlapping hours] from [final]

How to exclude nights from a TSQL query?

I'm writing a TSQL query to find the next available datetime from a list of appointments. So far, what I've managed to get working does find the gaps between appointments, but I can't seem to find a good way to exclude nights (after 7pm, let's say).
;WITH CTE
AS ( SELECT
ID,StartAptDate,EndAptDate,
RowNumber = ROW_NUMBER() OVER( ORDER BY StartAptDate ASC )
FROM Appointments WHERE StylistId = 1 AND StartAptDate > CAST( CONVERT( CHAR(8), GetDate(), 112) AS DATETIME)
)
SELECT FirstApptAvail = min( a.EndAptDate )
FROM CTE a
INNER JOIN CTE b
ON a.RowNumber = b.RowNumber - 1
WHERE datediff( minute, a.EndAptDate, b.StartAptDate) >= 15 AND ...
A little pseudo code for the ... would be something like this
(a.StartAptDate < GETDATE @ 7pm AND a.StartAptDate > GETDATE + 1 @ 8am)
The part I can't seem to get right is constructing the right side of each comparison. I need to exclude anything that might be returned between 7pm that night - 8am the next morning.
Thank you in advance
Thanks for the quick feedback. It looks like I was able to get the desired result using the BETWEEN approach mentioned in the comments. I first reduced the start and end datetimes in question to their time portion (the date part becomes 1900-01-01, so it no longer matters). This way I could compare on the time ONLY.
;WITH CTE
AS ( SELECT
ID,StartAptDate,EndAptDate,
RowNumber = ROW_NUMBER() OVER( ORDER BY StartAptDate ASC )
FROM Appointments WHERE StylistId = 1 AND StartAptDate > CAST( CONVERT( CHAR(8), GetDate() - 5, 112) AS DATETIME)
)
SELECT FirstApptAvail = min( a.EndAptDate )
FROM CTE a
INNER JOIN CTE b
ON a.RowNumber = b.RowNumber - 1
WHERE datediff( minute, a.EndAptDate, b.StartAptDate) >= 15
AND (CAST ( CONVERT( CHAR(8), a.StartAptDate, 108) AS DATETIME) BETWEEN '1900-01-01 07:59:59' AND '1900-01-01 18:59:59'
AND CAST ( CONVERT( CHAR(8), a.EndAptDate, 108) AS DATETIME) BETWEEN '1900-01-01 07:59:59' AND '1900-01-01 18:59:59')
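The idea of comparing on the time of day only can be sketched in plain Python. The appointments, the 8am to 7pm opening hours and the first_available helper are all assumptions made up for this illustration. The helper walks consecutive appointments in start order, as the ROW_NUMBER self-join does, and returns the end of the first daytime appointment that is followed by a gap of at least 15 minutes.

```python
from datetime import datetime, time, timedelta

# Assumed opening hours; the question excludes 7pm to 8am.
OPEN, CLOSE = time(8, 0), time(19, 0)

def first_available(appointments, min_gap=timedelta(minutes=15)):
    """Return the end of the first daytime appointment followed by a gap >= min_gap."""
    ordered = sorted(appointments)
    for (start_a, end_a), (start_b, _) in zip(ordered, ordered[1:]):
        gap_ok = start_b - end_a >= min_gap
        # Compare on the time of day only, ignoring the date part.
        daytime = OPEN <= start_a.time() < CLOSE and OPEN <= end_a.time() < CLOSE
        if gap_ok and daytime:
            return end_a
    return None

# Hypothetical appointments as (start, end) pairs.
appointments = [
    (datetime(2024, 1, 8, 17, 0), datetime(2024, 1, 8, 18, 0)),
    (datetime(2024, 1, 8, 18, 0), datetime(2024, 1, 8, 19, 30)),
    (datetime(2024, 1, 9, 9, 0), datetime(2024, 1, 9, 10, 0)),
    (datetime(2024, 1, 9, 10, 30), datetime(2024, 1, 9, 11, 0)),
]

slot = first_available(appointments)
print(slot)  # 2024-01-09 10:00:00
```

Without the time-of-day check, the overnight gap after the 19:30 appointment would be returned first.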