TSQL: Need to Count Multiple Columns and Group by their Contents - tsql

I have the following dataset:
StartDate EnterDate Order#
---------- ---------- ------
2018-01-01 2018-01-01 1
2018-01-01 2018-01-01 2
2018-01-01 2018-01-02 3
2018-01-02 2018-01-02 4
2018-01-02 2018-01-03 5
2018-01-02 2018-01-03 6
2018-01-03 2018-01-04 7
2018-01-03 2018-01-04 8
2018-01-03 2018-01-04 9
2018-01-03 2018-01-05 10
I need to COUNT the number of dates in each column.
Example output:
Date StartDate EnterDate
---------- --------- ---------
01-01-2018 3 2
01-02-2018 3 2
01-03-2018 4 2
01-04-2018 0 3
01-05-2018 0 1
NULL can be substituted for 0.

You can use a full join to achieve that:
select
Date = isnull(t.StartDate, q.EnterDate), StartDate = isnull(t.cnt, 0), EnterDate = isnull(q.cnt, 0)
from (
select
StartDate, count(*) cnt
from
myTable
group by StartDate
) t
full join (
select
EnterDate, count(*) cnt
from
myTable
group by EnterDate
) q on t.StartDate = q.EnterDate
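The same counts can also be produced without a full join by stacking both date columns with UNION ALL and summing flags per date. The following is only a sketch that runs the idea against the question's sample data, using Python's built-in sqlite3 module as a stand-in for SQL Server; the column name OrderNo and the helper columns is_start and is_enter are assumptions made for the example.

```python
import sqlite3

# Sample rows from the question: (StartDate, EnterDate, order number).
ROWS = [
    ("2018-01-01", "2018-01-01", 1),
    ("2018-01-01", "2018-01-01", 2),
    ("2018-01-01", "2018-01-02", 3),
    ("2018-01-02", "2018-01-02", 4),
    ("2018-01-02", "2018-01-03", 5),
    ("2018-01-02", "2018-01-03", 6),
    ("2018-01-03", "2018-01-04", 7),
    ("2018-01-03", "2018-01-04", 8),
    ("2018-01-03", "2018-01-04", 9),
    ("2018-01-03", "2018-01-05", 10),
]

conn = sqlite3.connect(":memory:")
# OrderNo is an assumed name for the "Order#" column.
conn.execute("CREATE TABLE myTable (StartDate TEXT, EnterDate TEXT, OrderNo INTEGER)")
conn.executemany("INSERT INTO myTable VALUES (?, ?, ?)", ROWS)

# Stack both date columns into one column, flagging where each value came from,
# then sum the flags per date.
QUERY = """
SELECT d AS Date,
       SUM(is_start) AS StartDate,
       SUM(is_enter) AS EnterDate
FROM (
    SELECT StartDate AS d, 1 AS is_start, 0 AS is_enter FROM myTable
    UNION ALL
    SELECT EnterDate AS d, 0 AS is_start, 1 AS is_enter FROM myTable
) AS u
GROUP BY d
ORDER BY d
"""

counts = conn.execute(QUERY).fetchall()
for row in counts:
    print(row)
```

Dates that appear in only one column get 0 for the other without any ISNULL handling, because the flag for the missing side is never set.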

Related

Is there a way to do a selective sum using a time interval in Postgres?

I have two tables. The first table has columns id, start_date, and end_date. The second table has columns id, timestamp, and value. Is there a way to sum the values in table 2 based on the intervals in table 1?
Table 1:
id start_date          end_date
-- ------------------- -------------------
5  2000-01-01 01:00:00 2000-01-05 02:45:00
5  2000-01-10 01:00:00 2000-01-15 02:45:00
6  2000-01-01 01:00:00 2000-01-05 02:45:00
6  2000-01-11 01:00:00 2000-01-12 02:45:00
6  2000-01-15 01:00:00 2000-01-20 02:45:00
Table 2:
id timestamp           value
-- ------------------- -----
5  2000-01-01 05:00:00 1
5  2000-01-01 06:00:00 2
6  2000-01-01 05:00:00 1
6  2000-01-11 05:00:00 2
6  2000-01-15 05:00:00 2
6  2000-01-15 05:30:00 2
Desired result:
id start_date          end_date            Sum
-- ------------------- ------------------- ----
5  2000-01-01 01:00:00 2000-01-05 02:45:00 3
5  2000-01-10 01:00:00 2000-01-15 02:45:00 null
6  2000-01-01 01:00:00 2000-01-05 02:45:00 1
6  2000-01-11 01:00:00 2000-01-12 02:45:00 2
6  2000-01-15 01:00:00 2000-01-20 02:45:00 4
Try this:
SELECT a.id, a.start_date, a.end_date, sum(b.value) AS sum
FROM table1 AS a
LEFT JOIN table2 AS b
ON b.id = a.id
AND b.timestamp >= a.start_date
AND b.timestamp < a.end_date
GROUP BY a.id, a.start_date, a.end_date
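To check the LEFT JOIN approach against the sample data, here is a small sketch using Python's sqlite3 module in place of Postgres. The column name ts (instead of timestamp) and the alias total (instead of sum) are assumptions made for the example, and timestamps are stored as ISO-8601 text so that string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, start_date TEXT, end_date TEXT)")
# ts is an assumed name standing in for the question's "timestamp" column.
conn.execute("CREATE TABLE table2 (id INTEGER, ts TEXT, value INTEGER)")

conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", [
    (5, "2000-01-01 01:00:00", "2000-01-05 02:45:00"),
    (5, "2000-01-10 01:00:00", "2000-01-15 02:45:00"),
    (6, "2000-01-01 01:00:00", "2000-01-05 02:45:00"),
    (6, "2000-01-11 01:00:00", "2000-01-12 02:45:00"),
    (6, "2000-01-15 01:00:00", "2000-01-20 02:45:00"),
])
conn.executemany("INSERT INTO table2 VALUES (?, ?, ?)", [
    (5, "2000-01-01 05:00:00", 1),
    (5, "2000-01-01 06:00:00", 2),
    (6, "2000-01-01 05:00:00", 1),
    (6, "2000-01-11 05:00:00", 2),
    (6, "2000-01-15 05:00:00", 2),
    (6, "2000-01-15 05:30:00", 2),
])

# The interval condition lives in the ON clause, so intervals without any
# matching rows are kept by the LEFT JOIN and get a NULL sum.
QUERY = """
SELECT a.id, a.start_date, a.end_date, SUM(b.value) AS total
FROM table1 AS a
LEFT JOIN table2 AS b
       ON b.id = a.id
      AND b.ts >= a.start_date
      AND b.ts <  a.end_date
GROUP BY a.id, a.start_date, a.end_date
ORDER BY a.id, a.start_date
"""

sums = conn.execute(QUERY).fetchall()
for row in sums:
    print(row)
```

If the interval condition were moved to a WHERE clause instead, the second interval for id 5 would disappear from the result rather than showing null.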

Fetch records created within 24 hours in DB2

I need to fetch the records created within the last 24 hours. I wrote the query below, but it's not giving the desired result.
SELECT a,b,enddate,status
FROM data WHERE a='1013'AND c ='1250'and (TIMESTAMPDIFF(8,char(timestamp(enddate)-
TIMESTAMP(CURRENT_DATE)))) between 0 and 24
Below is the data present in the table
A B C Enddate
1013 Test1 1250 28-March-2020 11:00 AM
1013 Test2 1000 28-March-2020 15:00 PM
1013 Test3 1250 29-March-2020 05:00 AM
1013 Test4 1250 29-March-2020 13:00 PM
1013 Test5 2500 29-March-2020 17:00 PM
1013 Test6 1250 31-March-2020 19:00 PM
Assuming that CURRENT_DATE = 29-March-2020 19:00 PM, the query should return 2 rows: Test3 and Test4. The above query does not return any rows.
SELECT B, TS
FROM
(
VALUES
('Test1', TIMESTAMP('2020-03-28-11.00.00'))
, ('Test2', TIMESTAMP('2020-03-28-15.00.00'))
, ('Test3', TIMESTAMP('2020-03-29-05.00.00'))
, ('Test4', TIMESTAMP('2020-03-29-13.00.00'))
, ('Test5', TIMESTAMP('2020-03-29-17.00.00'))
, ('Test6', TIMESTAMP('2020-03-31-19.00.00'))
) T (B, TS)
WHERE TS BETWEEN TIMESTAMP('2020-03-29-19.00.00') - 24 HOURS AND TIMESTAMP('2020-03-29-19.00.00');
The result is:
|B |TS |
|-----|--------------------------|
|Test3|2020-03-29-05.00.00.000000|
|Test4|2020-03-29-13.00.00.000000|
|Test5|2020-03-29-17.00.00.000000|
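The result above includes Test5 because the example only applies the 24-hour window and leaves out the question's filters on a and c. As a rough sketch of how the two combine, the snippet below uses Python's sqlite3 module as a stand-in for DB2: SQLite's datetime(..., '-24 hours') plays the role of `- 24 HOURS`, and the fixed reference time NOW replaces the current timestamp. Both substitutions are assumptions for illustration only.

```python
import sqlite3

# Rows from the question: (a, b, c, enddate), with enddate as ISO-8601 text.
ROWS = [
    ("1013", "Test1", "1250", "2020-03-28 11:00:00"),
    ("1013", "Test2", "1000", "2020-03-28 15:00:00"),
    ("1013", "Test3", "1250", "2020-03-29 05:00:00"),
    ("1013", "Test4", "1250", "2020-03-29 13:00:00"),
    ("1013", "Test5", "2500", "2020-03-29 17:00:00"),
    ("1013", "Test6", "1250", "2020-03-31 19:00:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (a TEXT, b TEXT, c TEXT, enddate TEXT)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", ROWS)

# Fixed reference time taken from the question, used instead of "now".
NOW = "2020-03-29 19:00:00"
WINDOW = "enddate BETWEEN datetime(:now, '-24 hours') AND :now"

# Only the time window, as in the answer above.
last_24h = [r[0] for r in conn.execute(
    f"SELECT b FROM data WHERE {WINDOW} ORDER BY enddate", {"now": NOW})]

# Time window plus the question's filters on a and c.
filtered = [r[0] for r in conn.execute(
    f"SELECT b FROM data WHERE a = '1013' AND c = '1250' AND {WINDOW} "
    "ORDER BY enddate", {"now": NOW})]

print(last_24h)
print(filtered)
```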

Data from last 12 months each month with trailing 12 months

This is TSQL and I'm trying to calculate the repeat purchase rate for the last 12 months. This is achieved by comparing the sum for customers who have bought more than once in the last 12 months with the total for all customers in the last 12 months.
The SQL code below gives me just that, but I would like to do this dynamically for each of the last 12 months. This is the part where I'm stuck and not sure how best to achieve it.
Each month should include data going back 12 months. I.e. June should hold data between June 2018 and June 2019, May should hold data from May 2018 till May 2019.
[Order Date] is a normal datefield (yyyy-mm-dd hh:mm:ss)
DECLARE @startdate1 DATETIME
DECLARE @enddate1 DATETIME
SET @enddate1 = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE())-1, 0) -- Ending June 2019
SET @startdate1 = DATEADD(mm,DATEDIFF(mm,0,GETDATE())-13,0) -- Starting June 2018
;
with dataset as (
select [Phone No_] as who_identifier,
count(distinct([Order No_])) as mycount
from [MyCompany$Sales Invoice Header]
where [Order Date] between @startdate1 and @enddate1
group by [Phone No_]
),
frequentbuyers as (
select who_identifier, sum(mycount) as frequentbuyerscount
from dataset
where mycount > 1
group by who_identifier),
allpurchases as (
select who_identifier, sum(mycount) as allpurchasescount
from dataset
group by who_identifier
)
select sum(frequentbuyerscount) as frequentbuyercount, (select sum(allpurchasescount) from allpurchases) as allpurchasecount
from frequentbuyers
I'm hoping to achieve end result looking something like this:
...Dec, Jan, Feb, March, April, May, June each month holding both values for frequentbuyercount and allpurchasescount.
Here is the code. I made a small modification for frequentbuyerscount and allpurchasescount: if you use a SUMIF-like expression, you don't need a second CTE.
if object_id('tempdb.dbo.#tmpMonths') is not null drop table #tmpMonths
create table #tmpMonths ( MonthID datetime, StartDate datetime, EndDate datetime)
declare @MonthCount int = 12
declare @Month datetime = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0)
while @MonthCount > 0 begin
insert into #tmpMonths( MonthID, StartDate, EndDate )
select @Month, dateadd(month, -12, @Month), @Month
set @Month = dateadd(month, -1, @Month)
set @MonthCount = @MonthCount - 1
end
;with dataset as (
select m.MonthID as MonthID, [Phone No_] as who_identifier,
count(distinct([Order No_])) as mycount
from [MyCompany$Sales Invoice Header]
inner join #tmpMonths m on [Order Date] between m.StartDate and m.EndDate
group by m.MonthID, [Phone No_]
),
buyers as (
select MonthID, who_identifier
, sum(iif(mycount > 1, mycount, 0)) as frequentbuyerscount --sum only if count > 1
, sum(mycount) as allpurchasescount
from dataset
group by MonthID, who_identifier
)
select
b.MonthID
, max(tm.StartDate) StartDate, max(tm.EndDate) EndDate
, sum(b.frequentbuyerscount) as frequentbuyercount
, sum(b.allpurchasescount) as allpurchasecount
from buyers b inner join #tmpMonths tm on tm.MonthID = b.MonthID
group by b.MonthID
Be aware that the code was only tested for syntax.
With the test data, this is the result:
MonthID | StartDate | EndDate | frequentbuyercount | allpurchasecount
-----------------------------------------------------------------------------
2018-08-01 | 2017-08-01 | 2018-08-01 | 340 | 3702
2018-09-01 | 2017-09-01 | 2018-09-01 | 340 | 3702
2018-10-01 | 2017-10-01 | 2018-10-01 | 340 | 3702
2018-11-01 | 2017-11-01 | 2018-11-01 | 340 | 3702
2018-12-01 | 2017-12-01 | 2018-12-01 | 340 | 3703
2019-01-01 | 2018-01-01 | 2019-01-01 | 340 | 3703
2019-02-01 | 2018-02-01 | 2019-02-01 | 2 | 8
2019-03-01 | 2018-03-01 | 2019-03-01 | 2 | 3
2019-04-01 | 2018-04-01 | 2019-04-01 | 2 | 3
2019-05-01 | 2018-05-01 | 2019-05-01 | 2 | 3
2019-06-01 | 2018-06-01 | 2019-06-01 | 2 | 3
2019-07-01 | 2018-07-01 | 2019-07-01 | 2 | 3
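The month windows that the WHILE loop writes into #tmpMonths can also be generated in a single set-based statement. The sketch below shows the idea with a recursive CTE, run through Python's sqlite3 module as a stand-in for SQL Server; SQLite's date() modifiers replace DATEADD/DATEDIFF, and the anchor date 2019-07-15 is a hypothetical "today" chosen so the windows line up with the result table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical "today"; in T-SQL this role is played by GETDATE().
ANCHOR = "2019-07-15"

# Start at the first day of the anchor month and step back one month
# at a time until 12 months have been produced.
QUERY = """
WITH RECURSIVE months(n, month_id) AS (
    SELECT 0, date(:anchor, 'start of month')
    UNION ALL
    SELECT n + 1, date(month_id, '-1 month')
    FROM months
    WHERE n < 11
)
SELECT month_id                      AS MonthID,
       date(month_id, '-12 months')  AS StartDate,
       month_id                      AS EndDate
FROM months
ORDER BY month_id
"""

windows = conn.execute(QUERY, {"anchor": ANCHOR}).fetchall()
for row in windows:
    print(row)
```

Each row is one trailing 12-month window, which could then be joined to the invoice table on [Order Date] in the same way as #tmpMonths.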

Show complete date range with NULL in PostgreSQL

I'm trying to write a query that returns every date in a range, with NULLs in the other columns when a date does not exist in the table.
For example this is my tbl_example
Original data:
id | userid(str) | comment(str) | mydate(date)
1 0001 sample1 2019-06-20T16:00:00.000Z
2 0002 sample2 2019-06-21T16:00:00.000Z
3 0003 sample3 2019-06-24T16:00:00.000Z
4 0004 sample4 2019-06-25T16:00:00.000Z
5 0005 sample5 2019-06-26T16:00:00.000Z
Then:
select * from tbl_example where mydate between '2019-06-20' AND
DATE('2019-06-20') + interval '5 day'
How can I output all the dates in the range, with NULLs for the missing dates, like this?
Expected output:
id | userid(str) | comment(str) | mydate(date)
1 0001 sample1 2019-06-20T16:00:00.000Z
2 0002 sample2 2019-06-21T16:00:00.000Z
null null null 2019-06-22T16:00:00.000Z
null null null 2019-06-23T16:00:00.000Z
4 0003 sample3 2019-06-24T16:00:00.000Z
5 0004 sample4 2019-06-25T16:00:00.000Z
This is my sample test environment: http://www.sqlfiddle.com/#!17/f5285/2
See my SQL below:
with all_dates as (
select generate_series(min(mydate),max(mydate),'1 day'::interval) as dates from tbl_example
)
,null_dates as (
select
a.dates
from
all_dates a
left join
tbl_example t on a.dates = t.mydate
where
t.mydate is null
)
select null as id, null as userid, null as comment, dates as mydate from null_dates
union
select * from tbl_example order by mydate;
id | userid | comment | mydate
----+--------+---------+---------------------
1 | 0001 | sample1 | 2019-06-20 16:00:00
2 | 0002 | sample1 | 2019-06-21 16:00:00
| | | 2019-06-22 16:00:00
| | | 2019-06-23 16:00:00
3 | 0003 | sample1 | 2019-06-24 16:00:00
4 | 0004 | sample1 | 2019-06-25 16:00:00
5 | 0005 | sample1 | 2019-06-26 16:00:00
(7 rows)
Alternatively, you can pass the date arguments you want directly to generate_series, as below:
select generate_series('2019-06-20 16:00:00','2019-06-20 16:00:00'::timestamp + '5 days'::interval,'1 day'::interval) as dates
SELECT id, userid, "comment", d.mydate
FROM generate_series('2019-06-20'::date, '2019-06-25'::date, INTERVAL '1 day') d (mydate)
LEFT JOIN tbl_example ON d.mydate = tbl_example.mydate
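The calendar-plus-LEFT-JOIN idea can be tried out on the sample data with the sketch below. It uses Python's sqlite3 module as a stand-in for Postgres, so a recursive CTE takes the place of generate_series, and the join compares only the date part of mydate because the sample values carry a 16:00 time component. The names days, first and last are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_example (id INTEGER, userid TEXT, comment TEXT, mydate TEXT)")
conn.executemany("INSERT INTO tbl_example VALUES (?, ?, ?, ?)", [
    (1, "0001", "sample1", "2019-06-20 16:00:00"),
    (2, "0002", "sample2", "2019-06-21 16:00:00"),
    (3, "0003", "sample3", "2019-06-24 16:00:00"),
    (4, "0004", "sample4", "2019-06-25 16:00:00"),
    (5, "0005", "sample5", "2019-06-26 16:00:00"),
])

# Build one row per calendar day, then LEFT JOIN the table onto it so that
# days without data still appear, with NULLs in the table's columns.
QUERY = """
WITH RECURSIVE days(day) AS (
    SELECT date(:first)
    UNION ALL
    SELECT date(day, '+1 day') FROM days WHERE day < date(:last)
)
SELECT t.id, t.userid, t.comment, d.day
FROM days AS d
LEFT JOIN tbl_example AS t ON date(t.mydate) = d.day
ORDER BY d.day
"""

filled = conn.execute(QUERY, {"first": "2019-06-20", "last": "2019-06-25"}).fetchall()
for row in filled:
    print(row)
```

Starting from the generated days rather than from the table is what guarantees a row for every date; the UNION-based answer above reaches the same result by adding the missing dates afterwards.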

How to query with lead() values not in current range

I'm having problems querying when lead() values fall outside the range selected for the current rows: rows on the edge of the range return null lead() values.
Let’s say I have a simple table to keep track of continuous counters
create table anytable
( wseller integer NOT NULL,
wday date NOT NULL,
wshift smallint NOT NULL,
wcounter numeric(9,1) )
with the following values
wseller wday wshift wcounter
1 2016-11-30 1 100.5
1 2017-01-03 1 102.5
1 2017-01-25 2 103.2
1 2017-02-05 2 106.1
2 2015-05-05 2 81.1
2 2017-01-01 1 92.1
2 2017-01-01 2 93.1
3 2016-12-01 1 45.2
3 2017-01-05 1 50.1
and want net units for current year
wseller wday wshift units
1 2017-01-03 1 2
1 2017-01-25 2 0.7
1 2017-02-05 2 2.9
2 2017-01-01 1 11
2 2017-01-01 2 1
3 2017-01-05 1 4.9
If I use
select wseller, wday, wshift, wcounter-lead(wcounter) over (partition by wseller order by wseller, wday desc, wshift desc)
from anytable
where wday>='2017-01-01'
this gives me nulls for the first row in each wseller partition. I'm using this query within a large CTE.
What am I doing wrong?
Window functions are evaluated after the WHERE clause, so lead() only sees the rows that pass the filter. Move the condition to an outer query:
select *
from (
select
wseller, wday, wshift,
wcounter- lead(wcounter) over (partition by wseller order by wday desc, wshift desc)
from anytable
) s
where wday >= '2017-01-01'
order by wseller, wday, wshift
wseller | wday | wshift | ?column?
---------+------------+--------+----------
1 | 2017-01-03 | 1 | 2.0
1 | 2017-01-25 | 2 | 0.7
1 | 2017-02-05 | 2 | 2.9
2 | 2017-01-01 | 1 | 11.0
2 | 2017-01-01 | 2 | 1.0
3 | 2017-01-05 | 1 | 4.9
(6 rows)
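To see the difference between filtering before and after the window function, here is a small sketch that runs both variants on the question's data. It uses Python's sqlite3 module as a stand-in for Postgres (window functions need SQLite 3.25 or newer), stores wcounter as REAL, and rounds the difference to one decimal to hide floating-point noise. The names DIFF, filtered_first and filtered_last are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE anytable (wseller INTEGER NOT NULL, wday TEXT NOT NULL, "
    "wshift INTEGER NOT NULL, wcounter REAL)"
)
conn.executemany("INSERT INTO anytable VALUES (?, ?, ?, ?)", [
    (1, "2016-11-30", 1, 100.5),
    (1, "2017-01-03", 1, 102.5),
    (1, "2017-01-25", 2, 103.2),
    (1, "2017-02-05", 2, 106.1),
    (2, "2015-05-05", 2, 81.1),
    (2, "2017-01-01", 1, 92.1),
    (2, "2017-01-01", 2, 93.1),
    (3, "2016-12-01", 1, 45.2),
    (3, "2017-01-05", 1, 50.1),
])

# Current counter minus the previous reading of the same seller.
DIFF = ("round(wcounter - lead(wcounter) OVER "
        "(PARTITION BY wseller ORDER BY wday DESC, wshift DESC), 1)")

# Filter applied before the window function: readings from earlier years
# are removed first, so the earliest remaining row has nothing to lead() to.
filtered_first = conn.execute(
    f"SELECT wseller, wday, wshift, {DIFF} AS units FROM anytable "
    "WHERE wday >= '2017-01-01' ORDER BY wseller, wday, wshift"
).fetchall()

# Window function computed over all rows, filter applied afterwards.
filtered_last = conn.execute(
    f"SELECT * FROM (SELECT wseller, wday, wshift, {DIFF} AS units FROM anytable) AS s "
    "WHERE wday >= '2017-01-01' ORDER BY wseller, wday, wshift"
).fetchall()

print(filtered_first)
print(filtered_last)
```

In the first result the earliest 2017 row of each seller has no units value, while the second result has a value in every row.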