Adjusting start and end dates - tsql

Given a data set in MS SQL Server 2016:
StoreID  PurchaseID  ShopID  LocationID  Starttime            Endtime
1020     20200102    9856    0010        2020-01-08 09:08:53  2020-01-08 09:11:52
1021     20200102    9856    0020        2020-01-08 09:09:48  2020-01-08 09:11:52
1022     20200102    9856    0030        2020-01-09 09:08:53  2020-01-09 09:12:52
1023     20200102    9856    0040        2020-01-10 09:09:48  2020-01-10 09:13:52
Here StoreID is the primary key. I'm looking for a query that changes the end time of the first record to the start time of the next record. To be precise, I need to find records that occurred on the same day for the same PurchaseID and ShopID combination but with different LocationID values, then take the starttime of the later record and write it into the Endtime of the earlier row.
Note: The sample here shows just two such records, but my data set has many more cases like the above.
The change should apply only to records that occurred on that particular day. The logic should not update the end time of earlier records that did not occur on the same day. In other words, only rows generated on the same day with a different LocationID should be updated.
CREATE TABLE [dbo].[TestTab1](
StoreID [int] NOT NULL,
PurchaseID [int] NOT NULL,
ShopID [int] NOT NULL,
LocationID [int] NOT NULL,
starttime [datetime] NOT NULL,
Endtime [datetime] NOT NULL,
) ON [PRIMARY]
INSERT INTO [TestTab1]
VALUES (1020,20200102,9856,0010,'2020-01-08 09:08:53','2020-01-08 09:11:52'),
(1021,20200102,9856,0020,'2020-01-08 09:09:48','2020-01-08 09:11:52'),
(1022,20200102,9856,0030,'2020-01-09 09:08:53','2020-01-09 09:11:52'),
(1023,20200102,9856,0040,'2020-01-10 09:09:48','2020-01-10 09:11:52')
Existing Data:
StoreID PurchaseID ShopID LocationID starttime Endtime
1020 20200102 9856 10 2020-01-08 09:08:53.000 2020-01-08 09:11:52.000
1021 20200102 9856 20 2020-01-08 09:09:48.000 2020-01-08 09:11:52.000
1022 20200102 9856 30 2020-01-09 09:08:53.000 2020-01-09 09:12:52.000
1023 20200102 9856 40 2020-01-10 09:09:48.000 2020-01-10 09:13:52.000
Final Result set:
StoreID PurchaseID ShopID LocationID starttime Endtime
1020 20200102 9856 10 2020-01-08 09:08:53.000 2020-01-08 09:09:48.000
1021 20200102 9856 20 2020-01-08 09:09:48.000 2020-01-08 09:11:52.000
1022 20200102 9856 30 2020-01-09 09:08:53.000 2020-01-09 09:12:52.000
1023 20200102 9856 40 2020-01-10 09:09:48.000 2020-01-10 09:13:52.000

I think this is what you are looking for, although the Endtime values in the last two rows of your expected output do not appear in the data you inserted. Give this a go and see if it gets you what you need:
UPDATE T1
SET Endtime = T2.NewEndDate
FROM TestTab1 T1
INNER JOIN
(
    -- next start time within the same shop, purchase and calendar day;
    -- falls back to the row's own end time when there is no later row
    SELECT *,
           LEAD(Starttime, 1, Endtime) OVER (PARTITION BY ShopID, PurchaseID, CAST(StartTime AS DATE) ORDER BY StartTime) NewEndDate
    FROM TestTab1
) T2 ON T1.StoreID = T2.StoreID
WHERE T2.NewEndDate <> T2.Endtime
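Before running the UPDATE, it can help to preview which rows it would touch. The sketch below is not SQL Server: it uses SQLite through Python's sqlite3 module as a stand-in (an assumption made only so the example is self-contained; it needs SQLite 3.25 or later for window functions), with date(starttime) playing the role of CAST(StartTime AS DATE). The rows are the ones from the INSERT in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TestTab1 (StoreID INTEGER PRIMARY KEY, PurchaseID INTEGER, "
    "ShopID INTEGER, LocationID INTEGER, starttime TEXT, Endtime TEXT)"
)
conn.executemany(
    "INSERT INTO TestTab1 VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1020, 20200102, 9856, 10, "2020-01-08 09:08:53", "2020-01-08 09:11:52"),
        (1021, 20200102, 9856, 20, "2020-01-08 09:09:48", "2020-01-08 09:11:52"),
        (1022, 20200102, 9856, 30, "2020-01-09 09:08:53", "2020-01-09 09:11:52"),
        (1023, 20200102, 9856, 40, "2020-01-10 09:09:48", "2020-01-10 09:11:52"),
    ],
)

# Same window as the answer: next start time per shop, purchase and day,
# defaulting to the row's own end time when there is no later row.
preview = conn.execute(
    """
    SELECT StoreID, Endtime,
           LEAD(starttime, 1, Endtime) OVER (
               PARTITION BY ShopID, PurchaseID, date(starttime)
               ORDER BY starttime
           ) AS NewEndDate
    FROM TestTab1
    ORDER BY StoreID
    """
).fetchall()

# Only rows whose end time would actually change.
to_change = [(store_id, new_end) for store_id, old_end, new_end in preview if new_end != old_end]
print(to_change)  # [(1020, '2020-01-08 09:09:48')]
```

Only StoreID 1020 has a later row on the same day, so it is the only row that would be updated.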
EDIT:
This query only considers rows with different locations. You can adjust the innermost query to use either MIN or MAX of StoreID, depending on whether you want the earliest or the latest record to be updated:
UPDATE T1
SET Endtime = T2.NewEndDate
FROM TestTab1 T1
INNER JOIN
(
    SELECT T1.*,
           LEAD(T1.Starttime, 1, T1.Endtime) OVER (PARTITION BY T1.ShopID, T1.PurchaseID, CAST(T1.StartTime AS DATE) ORDER BY T1.StartTime) NewEndDate
    FROM TestTab1 T1
    INNER JOIN
    (
        -- keep one row per PurchaseID / ShopID / LocationID
        SELECT MIN(StoreID) StoreID, PurchaseID, ShopID, LocationID, MAX(StartTime) StartTime
        FROM TestTab1
        GROUP BY PurchaseID, ShopID, LocationID
    ) T3 ON T3.StoreID = T1.StoreID
) T2 ON T1.StoreID = T2.StoreID
WHERE T2.NewEndDate <> T2.Endtime
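A different way to honour the "different LocationID" condition is to look at the next row's location directly with a second LEAD, rather than de-duplicating per location as the query above does. This is a sketch of that alternative, not the answer's method. It runs on SQLite via Python's sqlite3 (an assumption for self-containment; SQLite 3.25 or later), and StoreID 1024 is a hypothetical extra row added to show a same-day, same-location pair being left alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TestTab1 (StoreID INTEGER PRIMARY KEY, PurchaseID INTEGER, "
    "ShopID INTEGER, LocationID INTEGER, starttime TEXT, Endtime TEXT)"
)
conn.executemany(
    "INSERT INTO TestTab1 VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1020, 20200102, 9856, 10, "2020-01-08 09:08:53", "2020-01-08 09:11:52"),
        (1021, 20200102, 9856, 20, "2020-01-08 09:09:48", "2020-01-08 09:11:52"),
        (1022, 20200102, 9856, 30, "2020-01-09 09:08:53", "2020-01-09 09:11:52"),
        (1023, 20200102, 9856, 40, "2020-01-10 09:09:48", "2020-01-10 09:11:52"),
        # hypothetical row: same day and same location as 1023
        (1024, 20200102, 9856, 40, "2020-01-10 09:12:00", "2020-01-10 09:15:00"),
    ],
)

# A row qualifies only if a later same-day row exists AND its location differs.
candidates = conn.execute(
    """
    SELECT StoreID, next_start
    FROM (
        SELECT StoreID, LocationID, Endtime,
               LEAD(starttime)  OVER w AS next_start,
               LEAD(LocationID) OVER w AS next_location
        FROM TestTab1
        WINDOW w AS (
            PARTITION BY ShopID, PurchaseID, date(starttime)
            ORDER BY starttime
        )
    )
    WHERE next_start IS NOT NULL
      AND next_location <> LocationID
      AND next_start <> Endtime
    ORDER BY StoreID
    """
).fetchall()

# Apply the change row by row, keyed on the primary key.
conn.executemany(
    "UPDATE TestTab1 SET Endtime = ? WHERE StoreID = ?",
    [(new_end, store_id) for store_id, new_end in candidates],
)
end_times = dict(conn.execute("SELECT StoreID, Endtime FROM TestTab1"))
print(candidates)  # [(1020, '2020-01-08 09:09:48')]
```

Row 1023 keeps its end time because the next row on that day (the hypothetical 1024) is at the same location.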

Related

Joining 2 subsets of same table with different conditions

ID   Timestamp            type  account
212  2021-01-06 14:47:35  019   ALA058748
212  2021-01-07 18:34:44  021   API305575
212  2021-01-07 22:34:48  021   XYZ565656
212  2021-01-08 00:31:25  021   API305575
212  2021-01-08 00:31:31  021   API305575
212  2021-01-08 00:34:44  020   API305575
123  2021-05-21 03:34:44  021   API305575
123  2021-05-21 05:34:44  019   API305575
123  2021-05-21 09:34:44  021   API305575
123  2021-05-21 03:34:44  020   PQR464646
I have a table like the one above.
I need to select only those IDs for which:
Step 1) X = MIN(Timestamp) with type = 021 for the ID
Step 2) Y = the Timestamp with type = 020 and the same ID and account as X
where (Y - X) in minutes > 30.
In this example only ID 212 is selected: for ID 123, the account of the row with MIN(Timestamp) where type = 021 differs from the account of the type = 020 row.
Thank You
Schema:
create table t
(
ID int,
Timestamp datetime,
type int,
account varchar(50)
);
insert into t values(212, '2021-01-06 14:47:35', 019, 'ALA058748');
insert into t values(212, '2021-01-07 18:34:44', 021, 'API305575');
insert into t values(212, '2021-01-07 22:34:48', 021, 'XYZ565656');
insert into t values(212, '2021-01-08 00:31:25', 021, 'API305575');
insert into t values(212, '2021-01-08 00:31:31', 021, 'API305575');
insert into t values(212, '2021-01-08 00:34:44', 020, 'API305575');
insert into t values(123, '2021-05-21 03:34:44', 021, 'API305575');
insert into t values(123, '2021-05-21 05:34:44', 019, 'API305575');
insert into t values(123, '2021-05-21 09:34:44', 021, 'API305575');
insert into t values(123, '2021-05-21 03:34:44', 020, 'PQR464646');
Query #1 for MySQL:
select id
from t
group by id
having TIMESTAMPDIFF(minute, min(case when type = '021' then timestamp end),
min(case when type = '020' then timestamp end))>30
Query #2 for SQL Server:
select id
from t
group by id
having datediff(minute, min(case when type = '021' then timestamp end),
min(case when type = '020' then timestamp end))>30
Output:
id
212
You can do what you want using aggregation and filtering. Date functions are notoriously database dependent, so the "30 minutes" logic probably differs on your database:
select id, account
from t
group by id, account
having min(timestamp) = min(case when type = '021' then timestamp end) and
min(timestamp) + interval '30 minute' < min(case when type = '020' then timestamp end)
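The question's two steps can also be followed literally: pick the earliest type 021 row per ID, then look for a type 020 row with the same ID and account that is more than 30 minutes later. The sketch below does that on SQLite through Python's sqlite3 (an assumption for self-containment, not one of the databases named above; SQLite 3.25 or later). The names first_021, x and rn are illustrative, and julianday() stands in for the dialect-specific minute difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER, Timestamp TEXT, type INTEGER, account TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (212, "2021-01-06 14:47:35", 19, "ALA058748"),
        (212, "2021-01-07 18:34:44", 21, "API305575"),
        (212, "2021-01-07 22:34:48", 21, "XYZ565656"),
        (212, "2021-01-08 00:31:25", 21, "API305575"),
        (212, "2021-01-08 00:31:31", 21, "API305575"),
        (212, "2021-01-08 00:34:44", 20, "API305575"),
        (123, "2021-05-21 03:34:44", 21, "API305575"),
        (123, "2021-05-21 05:34:44", 19, "API305575"),
        (123, "2021-05-21 09:34:44", 21, "API305575"),
        (123, "2021-05-21 03:34:44", 20, "PQR464646"),
    ],
)

selected_ids = [
    row[0]
    for row in conn.execute(
        """
        WITH first_021 AS (
            -- Step 1: earliest type 021 row per ID (rn = 1), with its account
            SELECT ID, account, Timestamp AS x,
                   ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Timestamp) AS rn
            FROM t
            WHERE type = 21
        )
        SELECT DISTINCT f.ID
        FROM first_021 AS f
        -- Step 2: a type 020 row with the same ID and account
        JOIN t AS y
          ON y.ID = f.ID AND y.account = f.account AND y.type = 20
        WHERE f.rn = 1
          -- julianday() differences are in days; convert to minutes
          AND (julianday(y.Timestamp) - julianday(f.x)) * 24 * 60 > 30
        ORDER BY f.ID
        """
    )
]
print(selected_ids)  # [212]
```

ID 123 drops out at the join, because its only type 020 row belongs to a different account than its earliest type 021 row.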

Fetch value of a field based on first and last record of a month in DB2

I have a query that returns results for every month. From this query I need the sum of the readings in a month and the first and last reading of that month. I was able to compute the sum using CASE, but I'm unable to fetch the first and last reading; these should be returned only when TYPE='YYY'.
SELECT g.id, g.DATE_MONTH,
(CASE
WHEN g.TYPE ='XXX'
THEN (g.reading)
ELSE NULL
END ) AS fsum
FROM
(select to_char(DATE, 'Mon YYYY', 'en_US') DATE_MONTH ,
year(DATE) DATE_Y ,
month(DATE) DATE_M ,
min(DATE) as DATE_MIN ,
max(DATE) as DATE_MAX, id,sum(reading) AS reading, TYPE
from CXDATA
group by to_char(DATE, 'Mon YYYY', 'en_US'),
year(DATE), month(DATE),id, TYPE ) g
Data should be like below
ID    READING  STARTDATE    TYPE
1010  250      05-Jan-2020  XXX
1010  500      12-Jan-2020  XXX
1010  680      20-Jan-2020  XXX
1011  100      08-Feb-2020  YYY
1011  340      11-Feb-2020  YYY
1011  180      12-Feb-2020  YYY
OUTPUT
ID    DATE_MONTH  FSUM  FIRSTREADING  LASTREADING  TYPE
1010  JAN 2020    1430  NULL          NULL         XXX
1011  FEB 2020    NULL  100           180          YYY
Try this:
WITH
TAB (ID, READING, STARTDATE, TYPE) AS
(
VALUES
(1010, 250, DATE('2020-01-05'), 'XXX')
, (1010, 500, DATE('2020-01-12'), 'XXX')
, (1010, 680, DATE('2020-01-20'), 'XXX')
, (1011, 100, DATE('2020-02-08'), 'YYY')
, (1011, 340, DATE('2020-02-11'), 'YYY')
, (1011, 180, DATE('2020-02-12'), 'YYY')
)
SELECT
ID, DATE_MONTH, SUM(CASE TYPE WHEN 'XXX' THEN READING END) FSUM, FIRSTREADING, LASTREADING, TYPE
FROM
(
SELECT
ID
, to_char(STARTDATE, 'Mon YYYY', 'en_US') DATE_MONTH
, STARTDATE, READING, TYPE
, FIRST_VALUE (CASE TYPE WHEN 'YYY' THEN READING END, 'IGNORE NULLS') OVER (PARTITION BY ID, to_char(STARTDATE, 'Mon YYYY', 'en_US') ORDER BY STARTDATE) FIRSTREADING
, FIRST_VALUE (CASE TYPE WHEN 'YYY' THEN READING END, 'IGNORE NULLS') OVER (PARTITION BY ID, to_char(STARTDATE, 'Mon YYYY', 'en_US') ORDER BY STARTDATE DESC) LASTREADING
FROM TAB
)
GROUP BY ID, DATE_MONTH, FIRSTREADING, LASTREADING, TYPE;
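For engines without an 'IGNORE NULLS' option on FIRST_VALUE, a similar result can be had by adding TYPE to the window partition so that 'YYY' rows only ever see other 'YYY' rows. The sketch below shows that variation on SQLite through Python's sqlite3 (an assumption for self-containment, not Db2; SQLite 3.25 or later). The month is formatted as YYYY-MM with strftime instead of to_char, and the table name tab mirrors the CTE name above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER, reading INTEGER, startdate TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?)",
    [
        (1010, 250, "2020-01-05", "XXX"),
        (1010, 500, "2020-01-12", "XXX"),
        (1010, 680, "2020-01-20", "XXX"),
        (1011, 100, "2020-02-08", "YYY"),
        (1011, 340, "2020-02-11", "YYY"),
        (1011, 180, "2020-02-12", "YYY"),
    ],
)

result = conn.execute(
    """
    SELECT DISTINCT id, month, type,
           -- sum only for type XXX; NULL otherwise
           SUM(CASE WHEN type = 'XXX' THEN reading END)
               OVER (PARTITION BY id, month, type) AS fsum,
           -- first and last reading only for type YYY
           CASE WHEN type = 'YYY' THEN FIRST_VALUE(reading)
               OVER (PARTITION BY id, month, type ORDER BY startdate) END AS first_reading,
           CASE WHEN type = 'YYY' THEN FIRST_VALUE(reading)
               OVER (PARTITION BY id, month, type ORDER BY startdate DESC) END AS last_reading
    FROM (
        SELECT id, reading, startdate, type,
               strftime('%Y-%m', startdate) AS month
        FROM tab
    )
    ORDER BY id
    """
).fetchall()
for row in result:
    print(row)
# (1010, '2020-01', 'XXX', 1430, None, None)
# (1011, '2020-02', 'YYY', None, 100, 180)
```

Taking the last reading as FIRST_VALUE over a descending order, as the answer does, avoids the default window frame issue that LAST_VALUE has with an ORDER BY.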

update a column with the latest hiredate of an employee

I have an employee status history table.
I need to add one more column that repeats the min(EffectiveStartDate) on each row until the employee is rehired. I need it to calculate the employee's length of service as of a date passed in from the UI.
How can I achieve this in SQL Server 2014?
This answer makes a few assumptions.
Assumptions
The data set is only for one employee at a time. If it is not, and there is another column such as EmployeeID, specify that column in a partition by clause inside the over clause, where my comments indicate.
The EmployeeStatusCatalog values have the following meanings:
A: Active
L: Leave (of Absence)
I: Inactive
A "Hire" or "Rehire" transaction is considered to happen either at the initial A status or after an I status has ended.
Sample Data Setup
I did not include the EmployeeStatusId column, as I assume it is not relevant to producing the expected outcome.
create table #employee
(
EffectiveStartDate date not null
, EffectiveEndDate date not null
, EmployeeStatusCatalog char(1) not null
)
insert into #employee
values ('2008-02-29', '2016-05-31', 'A')
, ('2016-06-01', '2016-06-30', 'A')
, ('2016-07-01', '2016-07-30', 'L')
, ('2016-07-31', '2016-09-02', 'A')
, ('2016-09-03', '2016-10-09', 'I')
, ('2016-10-10', '2016-11-01', 'A')
, ('2016-11-02', '2016-12-02', 'L')
, ('2016-12-03', '2016-12-05', 'I')
, ('2016-12-06', '2016-12-06', 'A')
, ('2016-12-07', '2017-01-01', 'L')
, ('2017-01-02', '9999-12-31', 'A')
Answer
As you may or may not know, this is a classic gaps and islands scenario, where each segment between Hire/Rehire dates is an island (there are no gaps in this example).
I used a CTE to move the I status forward one row (via the LAG function), and then took a running count of the I rows to give each island an "ID" number.
After that, I used the min function, partitioned by the island number, to determine the minimum EffectiveStartDate for each island.
; with inactive_dts as
(
--move the I status forward one row
select e.EffectiveStartDate
, e.EffectiveEndDate
, e.EmployeeStatusCatalog
, lag(e.EmployeeStatusCatalog, 1, 'A') over (/*partition by here*/ order by e.EffectiveStartDate asc) as prev_status
from #employee as e
where 1=1
)
, active_island_nbr as
(
--get the running count of the number of I rows
select a.EffectiveStartDate
, a.EffectiveEndDate
, a.EmployeeStatusCatalog
, a.prev_status
, sum(case a.prev_status when 'I' then 1 else 0 end) over (/*partition by here*/ order by a.EffectiveStartDate asc) as ActiveIslandNbr
from inactive_dts as a
)
select min(a.EffectiveStartDate) over (partition by a.ActiveIslandNbr) as HireRehireDate
, a.EffectiveStartDate
, a.EffectiveEndDate
, a.EmployeeStatusCatalog
from active_island_nbr as a
Results
HireRehireDate EffectiveStartDate EffectiveEndDate EmployeeStatusCatalog
2008-02-29 2008-02-29 2016-05-31 A
2008-02-29 2016-06-01 2016-06-30 A
2008-02-29 2016-07-01 2016-07-30 L
2008-02-29 2016-07-31 2016-09-02 A
2008-02-29 2016-09-03 2016-10-09 I
2016-10-10 2016-10-10 2016-11-01 A
2016-10-10 2016-11-02 2016-12-02 L
2016-10-10 2016-12-03 2016-12-05 I
2016-12-06 2016-12-06 2016-12-06 A
2016-12-06 2016-12-07 2017-01-01 L
2016-12-06 2017-01-02 9999-12-31 A
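The first assumption above (one employee at a time) can be lifted by partitioning every window by an employee key, and the hire date can then feed the length-of-service calculation the question asks for. The sketch below is one way to do that under stated assumptions: it runs on SQLite through Python's sqlite3 rather than SQL Server (SQLite 3.25 or later), the EmployeeID column and the shortened sample rows are hypothetical, and as_of stands in for the date passed from the UI.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (EmployeeID INTEGER, EffectiveStartDate TEXT, "
    "EffectiveEndDate TEXT, EmployeeStatusCatalog TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (1, "2008-02-29", "2016-09-02", "A"),
        (1, "2016-09-03", "2016-10-09", "I"),
        (1, "2016-10-10", "9999-12-31", "A"),  # rehire after the I period
        (2, "2015-01-01", "2015-06-30", "A"),
        (2, "2015-07-01", "9999-12-31", "L"),  # leave does not start a new island
    ],
)

service = conn.execute(
    """
    WITH shifted AS (
        -- move the status forward one row, per employee
        SELECT EmployeeID, EffectiveStartDate, EffectiveEndDate,
               LAG(EmployeeStatusCatalog, 1, 'A') OVER (
                   PARTITION BY EmployeeID ORDER BY EffectiveStartDate
               ) AS prev_status
        FROM employee
    ),
    islands AS (
        -- running count of rows that follow an I row = island number
        SELECT EmployeeID, EffectiveStartDate, EffectiveEndDate,
               SUM(CASE prev_status WHEN 'I' THEN 1 ELSE 0 END) OVER (
                   PARTITION BY EmployeeID ORDER BY EffectiveStartDate
               ) AS island_nbr
        FROM shifted
    ),
    hired AS (
        SELECT EmployeeID, EffectiveStartDate, EffectiveEndDate,
               MIN(EffectiveStartDate) OVER (
                   PARTITION BY EmployeeID, island_nbr
               ) AS HireRehireDate
        FROM islands
    )
    SELECT EmployeeID, HireRehireDate,
           CAST(julianday(:as_of) - julianday(HireRehireDate) AS INTEGER) AS service_days
    FROM hired
    WHERE :as_of BETWEEN EffectiveStartDate AND EffectiveEndDate
    ORDER BY EmployeeID
    """,
    {"as_of": "2016-10-20"},
).fetchall()
print(service)  # [(1, '2016-10-10', 10), (2, '2015-01-01', 658)]
```

As in the results above, a row with status I still carries the hire date of the island that precedes it, so an as-of date inside an inactive period would need separate handling if that is not wanted.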

Getting attendance of an employee with a date series in a particular range in Postgres

I have an attendance table with employee_id, date and punch-in time.
Emp_Id PunchTime
101 10/10/2016 07:15
101 10/10/2016 12:20
101 10/10/2016 12:50
101 10/10/2016 16:31
102 10/10/2016 07:15
Here I have rows only for the working days. I want to get the attendance list of an employee for every date in a given period, including the day name. The result should look as follows:
date       | day    | employee_id | intime             | outtime
2016-10-09 | sunday | 101         |                    |
2016-10-10 | monday | 101         | 2016-10-10 7:15 AM | 2016-10-10 4:31 PM
You can generate a list of dates and then do an outer join on them:
The following displays all days in October:
select d.date, a.emp_id,
min(punchtime) as intime,
max(punchtime) as outtime
from generate_series(date '2016-10-01', date '2016-11-01' - 1, interval '1' day) as d (date)
left join attendance a on d.date = a.punchtime::date
group by d.date, a.emp_id
order by d.date, a.emp_id;
Because you want the first and last timestamp of each day, a simple group by query is enough.
However, this will not repeat the emp_id for days that have no punch records.
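If the emp_id should appear on days without punches too, one option is to cross join the date series with the distinct employees before the outer join. The sketch below illustrates that idea under stated assumptions: it runs on SQLite through Python's sqlite3 rather than Postgres, so a recursive CTE replaces generate_series, the punch times are stored as ISO text, and the day name is derived in Python. The names dates, emps and DAY_NAMES are illustrative.

```python
import sqlite3
from datetime import date

DAY_NAMES = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attendance (emp_id INTEGER, punchtime TEXT)")
conn.executemany(
    "INSERT INTO attendance VALUES (?, ?)",
    [
        (101, "2016-10-10 07:15"),
        (101, "2016-10-10 12:20"),
        (101, "2016-10-10 12:50"),
        (101, "2016-10-10 16:31"),
        (102, "2016-10-10 07:15"),
    ],
)

rows = conn.execute(
    """
    WITH RECURSIVE dates(d) AS (
        -- one row per calendar day in the requested range
        SELECT :first_day
        UNION ALL
        SELECT date(d, '+1 day') FROM dates WHERE d < :last_day
    ),
    emps AS (SELECT DISTINCT emp_id FROM attendance)
    SELECT dates.d, emps.emp_id, MIN(a.punchtime), MAX(a.punchtime)
    FROM dates
    CROSS JOIN emps                      -- every employee on every day
    LEFT JOIN attendance AS a
           ON a.emp_id = emps.emp_id AND date(a.punchtime) = dates.d
    GROUP BY dates.d, emps.emp_id
    ORDER BY dates.d, emps.emp_id
    """,
    {"first_day": "2016-10-09", "last_day": "2016-10-10"},
).fetchall()

# Add the day name; weekday() returns 0 for Monday.
report = [
    (d, DAY_NAMES[date.fromisoformat(d).weekday()], emp, intime, outtime)
    for d, emp, intime, outtime in rows
]
for line in report:
    print(line)
```

With the sample rows, employee 101 appears on 2016-10-09 with empty in and out times and on 2016-10-10 with 07:15 and 16:31.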
Something like the following will generate a list of the range of dates (starting and ending with whatever range is found in your punchtime table), with employees and intime, outtime for each. Check the SQL fiddle here:
http://sqlfiddle.com/#!15/d93bd/1
WITH RECURSIVE minmax AS
(
SELECT MIN(CAST(time AS DATE)) AS min, MAX(CAST(time as DATE)) AS max
FROM emp_time
),
dates AS
(
-- anchor row: the earliest date in the table
SELECT m.min as datepart
FROM minmax m
UNION ALL
SELECT d.datepart + 1 FROM dates d, minmax mm
WHERE d.datepart + 1 <= mm.max
)
SELECT d.datepart as date, e.emp, MIN(e.time) as intime, MAX(e.time) as outtime FROM dates d
LEFT JOIN emp_time e ON d.datepart = CAST(e.time as DATE)
GROUP BY d.datepart, e.emp
ORDER BY d.datepart;

DateDiff Rows Where UserID is a match

On SQL Server 2012 (T-SQL), I would like to analyse the difference between each start date and the previous end date for the same UserID, to see whether there is a gap of twelve months or more.
In other words, for which ContractID is the start date >= 12 months after the previous end date?
ContractID  UserID  StartDate   EndDate     12m Lapse
1           779     01/01/2000  01/01/2010  False
2           779     01/01/2010  01/01/2015  False
3           779     01/01/2016  NULL        True
4           1021    09/03/2008  NULL        False
One thing to note: the real table is not ordered by UserID, only by ContractID.
Using a CTE and the LAG() window function it's quite easy:
Create sample data:
CREATE TABLE #T
(
ContractID int,
UserID int,
StartDate date,
EndDate date
)
INSERT INTO #T VALUES
(1, 779, '01/01/2000', '01/01/2010'),
(2, 779, '01/01/2010', '01/01/2015'),
(3, 779, '01/01/2016', NULL),
(4, 1021, '09/03/2008', NULL)
The query:
;WITH CTE AS
(
SELECT ContractID,
UserID,
StartDate,
EndDate,
LAG(EndDate) OVER(PARTITION BY UserId ORDER BY StartDate) As PreviousEndDate
FROM #T
)
SELECT ContractID,
UserID,
StartDate,
EndDate,
CASE WHEN DATEDIFF(MONTH, ISNULL(PreviousEndDate, StartDate), StartDate) >= 12 THEN
'True'
ELSE
'False'
END As '12m Lapse'
FROM CTE
Results:
ContractID UserID StartDate EndDate 12m Lapse
----------- ----------- ---------- ---------- ---------
1 779 2000-01-01 2010-01-01 False
2 779 2010-01-01 2015-01-01 False
3 779 2016-01-01 NULL True
4 1021 2008-09-03 NULL False
SELECT * FROM Table WHERE DATEDIFF(M,StartDate,EndDate) >=12
Starting with SQL Server 2012, there is a function called LAG that will help you get what you need.
The partition by of the window function makes sure the rows are separated by userID, and the order by makes sure they are in ContractID order.
with prevEndDate as
(
select t.contractID
, t.userID
, t.startDate
, t.endDate
, lag(t.endDate,1,NULL) over (partition by t.userID order by t.contractID asc) as prevEndDate
from db_name.dbo.myTable as t
)
select p.contractID
, p.userID
, p.startDate
, p.endDate
, case when datediff(m,p.prevEndDate, p.startDate) >= 12 then 'True' else 'False' end as [12m Lapse]
from prevEndDate as p
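One caveat with DATEDIFF(MONTH, ...) is that it counts month boundaries crossed rather than full months elapsed, so a contract ending 2015-01-31 followed by one starting 2016-01-01 counts as 12. If a full calendar gap is wanted, the start date can instead be compared with the previous end date plus twelve months. The sketch below shows that variation on SQLite through Python's sqlite3 (an assumption for self-containment, not SQL Server; SQLite 3.25 or later). Contracts 5 and 6 for user 500 are hypothetical rows added to show the boundary case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contracts (ContractID INTEGER, UserID INTEGER, StartDate TEXT, EndDate TEXT)"
)
conn.executemany(
    "INSERT INTO contracts VALUES (?, ?, ?, ?)",
    [
        (1, 779, "2000-01-01", "2010-01-01"),
        (2, 779, "2010-01-01", "2015-01-01"),
        (3, 779, "2016-01-01", None),
        (4, 1021, "2008-09-03", None),
        # hypothetical rows: gap of about 11 months that crosses 12 month boundaries
        (5, 500, "2014-01-01", "2015-01-31"),
        (6, 500, "2016-01-01", None),
    ],
)

lapses = dict(
    conn.execute(
        """
        WITH prev AS (
            SELECT ContractID, StartDate,
                   LAG(EndDate) OVER (PARTITION BY UserID ORDER BY StartDate) AS PrevEndDate
            FROM contracts
        )
        SELECT ContractID,
               -- a NULL previous end date makes the comparison NULL, hence 'False'
               CASE WHEN StartDate >= date(PrevEndDate, '+12 months')
                    THEN 'True' ELSE 'False' END AS lapse_12m
        FROM prev
        ORDER BY ContractID
        """
    )
)
print(lapses)
# {1: 'False', 2: 'False', 3: 'True', 4: 'False', 5: 'False', 6: 'False'}
```

Contract 6 is reported as 'False' here, whereas a month-boundary count would treat it as a 12 month lapse; which behaviour is right depends on how the gap is defined.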