T-SQL Union returns duplicated row (date) - tsql

I am trying to return a sum for each date in a specific date range, whether or not data is present for that day.
The best approach I have found is to use a table pre-populated with all dates, select the date range from it, and union it with my data.
For some reason I couldn't get a left join to work, but the union works almost perfectly. The only issue is that it returns duplicate rows for dates that do have data in my data table.
SELECT NULL AS Visitors
    ,D.DATE AS Day
FROM Database.support.dates D
WHERE D.DATE BETWEEN @Start_date
        AND @End_date
UNION
SELECT count(V.id) AS Visitors
    ,DATEADD(day, 0, DATEDIFF(day, 0, V.CreateTime)) AS Day
FROM Database.Clients.Clients V
WHERE V.CreateTime BETWEEN @Start_date
        AND @End_date
    AND V.WADID = @WADID
    AND (
        V.WAPID = @WAPID
        OR @WAPID IS NULL
        )
GROUP BY DATEADD(day, 0, DATEDIFF(day, 0, V.CreateTime))
ORDER BY Day DESC
I spent all of yesterday researching this and still can't get it working :/

I think that left joining your calendar table to your current query actually was the right thing to do. That said, you can fix your current query by simply aggregating on the day, e.g. using MAX. MAX ignores NULL values, so the NULL placeholder row for a day drops out whenever a non-NULL visitor count is present for that day:
WITH cte AS (
    SELECT NULL AS Visitors, D.DATE AS Day
    FROM Database.support.dates D
    WHERE D.DATE BETWEEN @Start_date AND @End_date
    UNION
    SELECT COUNT(V.id), DATEADD(day, 0, DATEDIFF(day, 0, V.CreateTime))
    FROM Database.Clients.Clients V
    WHERE V.CreateTime BETWEEN @Start_date AND @End_date AND
          V.WADID = @WADID AND (V.WAPID = @WAPID OR @WAPID IS NULL)
    GROUP BY DATEADD(day, 0, DATEDIFF(day, 0, V.CreateTime))
)
SELECT Day, MAX(Visitors) AS Visitors -- filter off unwanted NULL values
FROM cte
GROUP BY Day
ORDER BY Day DESC;
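For comparison, here is a rough sketch of the left join approach mentioned above. It runs against an in-memory SQLite database from Python rather than SQL Server, and the `dates` and `clients` tables, their columns, and the sample rows are all made up for illustration. The point is only the shape of the query: the calendar table goes on the left, and `COUNT` on a column from the right-hand table yields 0 for days without a match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dates (date TEXT PRIMARY KEY);          -- hypothetical calendar table
    CREATE TABLE clients (id INTEGER, create_time TEXT); -- hypothetical data table
    INSERT INTO dates VALUES ('2024-01-01'), ('2024-01-02'), ('2024-01-03');
    INSERT INTO clients VALUES
        (1, '2024-01-01 09:15:00'),
        (2, '2024-01-01 17:40:00'),
        (3, '2024-01-03 08:00:00');
""")

rows = conn.execute("""
    SELECT d.date AS day,
           COUNT(c.id) AS visitors      -- counts only matched rows, so empty days give 0
    FROM dates d
    LEFT JOIN clients c ON date(c.create_time) = d.date
    WHERE d.date BETWEEN ? AND ?        -- the range filter is on the calendar table
    GROUP BY d.date
    ORDER BY d.date DESC
""", ("2024-01-01", "2024-01-03")).fetchall()

for day, visitors in rows:
    print(day, visitors)
```

Every calendar day appears exactly once, so no second aggregation step is needed to remove duplicates.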

Related

Is there a SQL code for cumulative count of SaaS customer over months?

I have a table with:
ID (id client), date_start (subscription of SaaS), date_end (could be a date value or be NULL).
So I need a cumulative count of active clients month by month.
Any idea how to write that in Postgres and achieve this result?
I am starting from this, but I don't know how to proceed:
select
date_trunc('month', c.date_start)::date,
count(*)
from customer
Please check the following solution:
select
    subscribed_date,
    subscribed_customers,
    unsubscribed_customers,
    coalesce(subscribed_customers, 0) - coalesce(unsubscribed_customers, 0) cumulative
from (
    select distinct
        date_trunc('month', c.date_start)::date subscribed_date,
        sum(1) over (order by date_trunc('month', c.date_start)) subscribed_customers
    from customer c
    order by subscribed_date
) subscribed
left join (
    select distinct
        date_trunc('month', c.date_end)::date unsubscribed_date,
        sum(1) over (order by date_trunc('month', c.date_end)) unsubscribed_customers
    from customer c
    where date_end is not null
    order by unsubscribed_date
) unsubscribed on subscribed.subscribed_date = unsubscribed.unsubscribed_date;
You have a table of customers with a start date and sometimes an end date. As you want to group by date, but there are two dates in the table, you need to split these into two sets first.
Then, you may have months where customers only came and others where customers only left. So, you'll want a full outer join of the two sets.
For a cumulative sum (also called a running total), use SUM OVER.
with came as
(
  select date_trunc('month', date_start) as month, count(*) as cnt
  from customer
  group by date_trunc('month', date_start)
)
, went as
(
  select date_trunc('month', date_end) as month, count(*) as cnt
  from customer
  where date_end is not null
  group by date_trunc('month', date_end)
)
select
  month,
  coalesce(came.cnt, 0) as cust_new,
  coalesce(went.cnt, 0) as cust_gone,
  -- coalesce: after the full outer join, a month missing from one side has a NULL count
  sum(coalesce(came.cnt, 0) - coalesce(went.cnt, 0)) over (order by month) as cust_active
from came full outer join went using (month)
order by month;

Include value from CTE when it has no match

In my table I have some entries which, according to the table's date column, are not older than 2016-01-04 (January 4, 2016).
Now I would like to write a query which counts the number of rows for each date value, but I'd like this query to return a 0 count for dates not present in the table.
I have this:
with date_count as (
    select '2016-01-01'::date + CAST(offs || ' days' as interval) as date
    from generate_series(0, 6, 1) AS offs
)
select date_count.date, count(allocation_id) as packs_used
from medicine_allocation, date_count
where site_id = 1
  and allocation_id is not null
  and timestamp between date_count.date and date_count.date + interval '1 days'
group by date_count.date
order by date_count.date;
This gives me a nice aggregated view of the data in my table, but since no rows are from before January 4, 2016, those dates don't show in the result:
"2016-01-04 00:00:00";1
"2016-01-05 00:00:00";2
"2016-01-06 00:00:00";4
"2016-01-07 00:00:00";3
I would like this:
"2016-01-01 00:00:00";0
"2016-01-02 00:00:00";0
"2016-01-03 00:00:00";0
"2016-01-04 00:00:00";1
"2016-01-05 00:00:00";2
"2016-01-06 00:00:00";4
"2016-01-07 00:00:00";3
I have also tried right join on the cte, but this yields the same result. I cannot quite grasp how to do this... any help out there?
Best,
Janus
You simply need a left join:
with date_count as (
    select '2016-01-01'::date + CAST(offs || ' days' as interval) as date
    from generate_series(0, 6, 1) AS offs
)
select dc.date, count(ma.allocation_id) as packs_used
from date_count dc left join
     medicine_allocation ma
     on ma.site_id = 1 and ma.allocation_id is not null and
        ma.timestamp between dc.date and dc.date + interval '1 days'
group by dc.date
order by dc.date;
A word of advice: Never use commas in the FROM clause. Always use explicit JOIN syntax.
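You will also notice that the where conditions were moved to the ON clause. That is necessary because they apply to the second table: in a WHERE clause they would reject the NULL-extended rows for dates without a match, which turns the left join back into an inner join.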
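A quick way to convince yourself of the ON versus WHERE difference is to run both variants side by side. This is only a sketch on an in-memory SQLite database from Python, with made-up `days` and `allocation` tables standing in for the CTE and `medicine_allocation`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE days (day TEXT);                          -- stand-in for the date CTE
    CREATE TABLE allocation (site_id INTEGER, day TEXT);   -- stand-in for the data table
    INSERT INTO days VALUES ('2016-01-03'), ('2016-01-04');
    INSERT INTO allocation VALUES (1, '2016-01-04'), (2, '2016-01-03');
""")

# Filter on the right-hand table inside ON: unmatched days survive with a count of 0.
in_on = conn.execute("""
    SELECT d.day, COUNT(a.site_id)
    FROM days d
    LEFT JOIN allocation a ON a.day = d.day AND a.site_id = 1
    GROUP BY d.day
    ORDER BY d.day
""").fetchall()

# Same filter in WHERE: it runs after the join and removes the other rows entirely.
in_where = conn.execute("""
    SELECT d.day, COUNT(a.site_id)
    FROM days d
    LEFT JOIN allocation a ON a.day = d.day
    WHERE a.site_id = 1
    GROUP BY d.day
    ORDER BY d.day
""").fetchall()

print(in_on)     # both days are present
print(in_where)  # only the day with a site_id = 1 row is present
```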

How do I get a recursive daily average for one month?

I need a daily average for an entire month, but the trick is that all of the clients have different start and end dates. For example, some clients are only enrolled for part of the month. Assume client A is enrolled from 4/3/13-4/8/13, client B from 4/6-4/30, client C from 4/1-5/1, etc. How can I achieve this? Here is my current code which returns super low counts because it assumes all clients are enrolled the entire month:
if exists (
select * from tempdb.dbo.sysobjects o where o.xtype in ('U') and o.id = object_id(N'tempdb..#enrollments_PreviousMonth2')
) DROP TABLE #enrollments_PreviousMonth2;
Select
people_id,
program_modifier,
program_modifier_id,
DATEADD(dd, 0, DATEDIFF(dd, 0, actual_date)) as enroll_midnight_date,
actual_date as enroll_start_date,
end_date as enroll_end_date
INTO #enrollments_PreviousMonth2
From
program_modifier_enrollment_view pmev with(nolock)
Where
program_modifier_id = 'E1AA7A36-0500-4BAE-A0AA-D9E0BC91A6F3' and
actual_date <= '4/30/13' and (end_date >= '4/1/13' or end_date is null)
;with cte as (
select cast(enroll_start_date as date) as actual_date,
count(people_id) cnt
From #enrollments_PreviousMonth2 en
left join Calendar c on en.enroll_midnight_date = c.dt
where program_modifier_id = 'E1AA7A36-0500-4BAE-A0AA-D9E0BC91A6F3'
AND enroll_start_date <= '4/30/13' and (enroll_end_date >= '4/1/13' or enroll_end_date is null)
Group by enroll_start_date--, enroll_end_date, program_modifier_id, program_modifier
)
select
sum(cnt*1.0)
from cte
I prefer to not use a CURSOR for the solution, however.

counting all occurrences in the last year

I have a question, although I can't really go into specifics.
Will the following query:
SELECT DISTINCT tableOuter.Property, (SELECT COUNT(ID) FROM table AS tableInner WHERE tableInner.Property = tableOuter.Property)
FROM table AS tableOuter
WHERE tableOuter.DateTime > DATEADD(year, -1, GETDATE())
AND tableOuter.Property IN (
...
)
select one instance of each property in the IN clause, together with how often a row with that property occurred in the last year?
I just read up on Correlated Subqueries on MSDN, but am not sure if I got it right.
If I understand you correctly, you want to count all occurrences of each Property in the last year, am I right?
Then filter the rows in a WHERE clause and use GROUP BY. The date condition cannot go into HAVING, because DateTime is neither grouped nor aggregated:
SELECT tableOuter.Property, COUNT(*) AS Count
FROM table AS tableOuter
WHERE tableOuter.DateTime > DATEADD(year, -1, GETDATE())
AND tableOuter.Property IN ( .... )
GROUP BY tableOuter.Property
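Here is a small runnable sketch of counting per property over the last year with a grouped query. It uses an in-memory SQLite database from Python, so the date arithmetic is SQLite's `date(?, '-1 year')` rather than `DATEADD`, and a fixed `as_of` date replaces `GETDATE()` to keep the output reproducible. The `events` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER, property TEXT, occurred_at TEXT);  -- hypothetical table
    INSERT INTO events VALUES
        (1, 'A', '2023-12-01'),
        (2, 'A', '2023-06-15'),
        (3, 'A', '2021-01-01'),   -- older than one year, must not be counted
        (4, 'B', '2023-11-20'),
        (5, 'C', '2023-10-10');   -- not in the IN list, must not appear
""")

as_of = "2024-01-01"  # fixed reference date instead of the current date
rows = conn.execute("""
    SELECT property, COUNT(*) AS occurrences
    FROM events
    WHERE occurred_at > date(?, '-1 year')   -- row filter happens before grouping
      AND property IN ('A', 'B')
    GROUP BY property
    ORDER BY property
""", (as_of,)).fetchall()

print(rows)
```

Unlike the correlated subquery in the question, the count here only covers rows inside the one-year window, because the filter is applied before the rows are grouped.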

Two questions about my SQL script. How to add subtotal/total lines, and a sorting issue

I have some T-SQL that generates a nice report giving a summary of some stuff by month.
I have two questions. Is there a way to get it to sort the months in calendar order, not alphabetically? And how can I add a total line for each year, and a total line for the whole report?
SELECT
CASE WHEN tmpActivity.Year IS NULL THEN
CASE WHEN tmpCreated.Year IS NULL THEN
CASE WHEN tmpContactsCreated.Year IS NULL THEN
null
ELSE tmpContactsCreated.Year END
ELSE tmpCreated.Year END
ELSE tmpActivity.Year END As Year,
CASE WHEN tmpActivity.Month IS NULL THEN
CASE WHEN tmpCreated.Month IS NULL THEN
CASE WHEN tmpContactsCreated.Month IS NULL THEN
null
ELSE DateName(month, DateAdd(month, tmpContactsCreated.Month - 1, '1900-01-01' )) END
ELSE DateName(month, DateAdd(month, tmpCreated.Month - 1, '1900-01-01' )) END
ELSE DateName(month, DateAdd(month, tmpActivity.Month - 1, '1900-01-01' )) END As Month,
CASE WHEN tmpActivity.ActiveAccounts IS NULL THEN 0 ELSE tmpActivity.ActiveAccounts END AS ActiveAccounts,
CASE WHEN tmpCreated.NewAccounts IS NULL THEN 0 ELSE tmpCreated.NewAccounts END AS NewAccounts,
CASE WHEN tmpContactsCreated.NewContacts IS NULL THEN 0 ELSE tmpContactsCreated.NewContacts END AS NewContacts
FROM
(
SELECT YEAR(LastLogon) As Year, MONTH(LastLogon) As Month, COUNT(*) As ActiveAccounts
FROM Users
WHERE LastLogon >= '1/1/1800'
GROUP BY YEAR(LastLogon), MONTH(LastLogon)
) as tmpActivity
FULL JOIN
(
SELECT YEAR(Created) As Year, MONTH(Created) As Month, COUNT(*) As NewAccounts
FROM Users
WHERE Created >= '1/1/1800'
GROUP BY YEAR(Created), MONTH(Created)
) as tmpCreated ON tmpCreated.Year = tmpActivity.Year AND tmpCreated.Month = tmpActivity.Month
FULL JOIN
(
SELECT YEAR(Created) As Year, MONTH(Created) As Month, COUNT(*) As NewContacts
FROM Contacts
WHERE Created >= '1/1/1800'
GROUP BY YEAR(Created), MONTH(Created)
) as tmpContactsCreated ON tmpContactsCreated.Year = tmpCreated.Year AND tmpContactsCreated.Month = tmpCreated.Month
Order By Year DESC, Month DESC
To order by the month use the following:
ORDER BY DATEPART(Month,Created) ASC
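DATEPART() returns an integer for the specified part: 1 for January, 2 for February, and so on. In your query the subqueries already expose the month number in their Month columns, so sorting on that number instead of the month name gives calendar order.
Your SQL could also be simplified with the COALESCE() and ISNULL() functions. This is equivalent to the select list of your query, except that Month stays numeric here: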
SELECT
COALESCE(tmpActivity.Year,tmpCreated.Year,tmpContactsCreated.Year) as Year,
COALESCE(tmpActivity.Month,tmpCreated.Month,tmpContactsCreated.Month) as Month,
ISNULL(tmpActivity.ActiveAccounts,0) AS ActiveAccounts,
ISNULL(tmpCreated.NewAccounts,0) AS NewAccounts,
ISNULL(tmpContactsCreated.NewContacts,0) AS NewContacts
I think there is a bug in your select. I believe your last line has to be this:
) as tmpContactsCreated ON (tmpContactsCreated.Year = tmpCreated.Year AND tmpContactsCreated.Month = tmpCreated.Month) OR
(tmpContactsCreated.Year = tmpActivity.Year AND tmpContactsCreated.Month = tmpActivity.Month)
But I would have to test this to be sure.
Adding in rollups is hard to do -- typically this is done externally to the SQL in the control or whatever displays the results. You could do something like this (contrived example):
SELECT 1 as reportOrder, date, amount, null as total
FROM invoices
UNION ALL
SELECT 2 , null, null, sum(amount)
FROM invoices
ORDER BY reportOrder, date
Alternatively, you could drop the "extra" total column and put the sum in the amount column.
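To make the UNION ALL idea a bit more concrete for the year subtotal and grand total case, here is a sketch on an in-memory SQLite database from Python. The `monthly` table, its columns, and the numbers are invented, and `report_order` plays the same role as `reportOrder` above: 0 for detail rows, 1 for a year subtotal, 2 for the grand total. The compound select is wrapped in a subquery so the outer ORDER BY can sort on an expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE monthly (year INTEGER, month INTEGER, new_accounts INTEGER);  -- hypothetical
    INSERT INTO monthly VALUES (2022, 11, 5), (2022, 12, 7), (2023, 1, 3), (2023, 2, 4);
""")

rows = conn.execute("""
    SELECT year, month, new_accounts, report_order
    FROM (
        SELECT year, month, new_accounts, 0 AS report_order FROM monthly
        UNION ALL
        SELECT year, NULL, SUM(new_accounts), 1 FROM monthly GROUP BY year   -- year subtotals
        UNION ALL
        SELECT NULL, NULL, SUM(new_accounts), 2 FROM monthly                 -- grand total
    )
    ORDER BY report_order = 2,   -- grand total last
             year DESC,
             report_order,       -- detail rows before their year's subtotal
             month DESC          -- numeric month, so calendar order rather than alphabetical
""").fetchall()

for row in rows:
    print(row)
```

Keeping the month as a number until display time is what makes the calendar ordering work; the month name can be derived in the outermost select or in the reporting layer.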