Create trip number on tracking data

Create trip number on tracking data - postgresql

I am currently working on a Postgres database with data for car tracking which looks similar to this:
+----+--------+------------+----------+
| id | car_id | date | time |
+----+--------+------------+----------+
| 11 | 1 | 2014-12-20 | 12:12:12 |
| 12 | 1 | 2014-12-20 | 12:12:13 |
| 13 | 1 | 2014-12-20 | 12:12:14 |
| 23 | 1 | 2015-12-20 | 23:42:10 |
| 24 | 1 | 2015-12-20 | 23:42:11 |
| 31 | 2 | 2014-12-20 | 15:12:12 |
| 32 | 2 | 2014-12-20 | 15:12:14 |
+----+--------+------------+----------+
Here is the setup:
CREATE TABLE test (
id int
, car_id int
, date text
, time text
);
INSERT INTO test VALUES
(11, 1, '2014-12-20', '12:12:12'),
(12, 1, '2014-12-20', '12:12:13'),
(13, 1, '2014-12-20', '12:12:14'),
(23, 1, '2015-12-20', '23:42:10'),
(24, 1, '2015-12-20', '23:42:11'),
(31, 2, '2014-12-20', '15:12:12'),
(32, 2, '2014-12-20', '15:12:14');
I want to create a column where the traces are assigned a trip number sorted by id
id car_id date time (trip)
11 1 2014-12-20 12:12:12 1
12 1 2014-12-20 12:12:13 1
13 1 2014-12-20 12:12:14 1
23 1 2015-12-20 23:42:10 2 (trip +1 because time difference is bigger then 5 sec)
24 1 2015-12-20 23:42:11 2
31 2 2014-12-20 15:12:12 3 (trip +1 because car id is different)
32 2 2014-12-20 15:12:14 3 `
I have put op following rules
first row (lowest id) gets the value trip = 1
for the following rows: if car_id is equal to the row above and time
difference between the row and the row above is smaller then 5 then trip is
the same as the row above, else trip is the row above +1
I have tried with the following
Create table test as select
"id", "date", "time", car_id,
extract(epoch from "date" + "time") - lag(extract(epoch from "date" + "time")) over (order by "id") as diff,
Case
when t_diff < 5 and car_id - lag(car_id) over (order by "id") = 0
then lag(trip) over (order by "id")
else lag(trip) over (order by "id") + 1
end as trip
From road_1 order by "id"
but it does not work :( How can I compute the trip column?

First, use (date || ' ' || time)::timestamp AS datetime to form a timestamp out of date and time
SELECT id, test.car_id
, (date || ' ' || time)::timestamp AS datetime
FROM test
which yields
| id | car_id | datetime |
|----+--------+---------------------|
| 11 | 1 | 2014-12-20 12:12:12 |
| 12 | 1 | 2014-12-20 12:12:13 |
| 13 | 1 | 2014-12-20 12:12:14 |
| 23 | 1 | 2015-12-20 23:42:10 |
| 24 | 1 | 2015-12-20 23:42:11 |
| 31 | 2 | 2014-12-20 15:12:12 |
| 32 | 2 | 2014-12-20 15:12:14 |
It is helpful to do this since we'll be using datetime - prev > '5 seconds'::interval
to identify rows which are 5 seconds apart. Notice that
2014-12-20 23:59:59 and 2014-12-21 00:00:00 are 5 seconds apart
but it would be difficult/tedious to determine this if all we had were separate date and time columns.
Now we can express the rule that the trip is increased by 1 when
NOT ((car_id = prev_car_id) AND (datetime-prev_date <= '5 seconds'::interval))
(More on why the condition is expressed in this seemingly backwards way, below).
SELECT id, car_id, prev_car_id, datetime, prev_date
, (CASE WHEN ((car_id = prev_car_id) AND (datetime-prev_date <= '5 seconds'::interval)) THEN 0 ELSE 1 END) AS new_trip
FROM (
SELECT id, car_id, datetime
, lag(datetime) OVER () AS prev_date
, lag(car_id) OVER () AS prev_car_id
FROM (
SELECT id, car_id
, (date || ' ' || time)::timestamp AS datetime
FROM test ) t1
) t2
yields
| id | car_id | prev_car_id | datetime | prev_date | new_trip |
|----+--------+-------------+---------------------+---------------------+----------|
| 11 | 1 | | 2014-12-20 12:12:12 | | 1 |
| 12 | 1 | 1 | 2014-12-20 12:12:13 | 2014-12-20 12:12:12 | 0 |
| 13 | 1 | 1 | 2014-12-20 12:12:14 | 2014-12-20 12:12:13 | 0 |
| 23 | 1 | 1 | 2015-12-20 23:42:10 | 2014-12-20 12:12:14 | 1 |
| 24 | 1 | 1 | 2015-12-20 23:42:11 | 2015-12-20 23:42:10 | 0 |
| 31 | 2 | 1 | 2014-12-20 15:12:12 | 2015-12-20 23:42:11 | 1 |
| 32 | 2 | 2 | 2014-12-20 15:12:14 | 2014-12-20 15:12:12 | 0 |
Now trip can be expressed as the cumulative sum over the new_trip column:
SELECT id, car_id, datetime, sum(new_trip) OVER (ORDER BY datetime) AS trip
FROM (
SELECT id, car_id, prev_car_id, datetime, prev_date
, (CASE WHEN ((car_id = prev_car_id) AND (datetime-prev_date <= '5 seconds'::interval)) THEN 0 ELSE 1 END) AS new_trip
FROM (
SELECT id, car_id, datetime
, lag(datetime) OVER () AS prev_date
, lag(car_id) OVER () AS prev_car_id
FROM (
SELECT id, car_id
, (date || ' ' || time)::timestamp AS datetime
FROM test ) t1
) t2
) t3
yields
| id | car_id | datetime | trip |
|----+--------+---------------------+------|
| 11 | 1 | 2014-12-20 12:12:12 | 1 |
| 12 | 1 | 2014-12-20 12:12:13 | 1 |
| 13 | 1 | 2014-12-20 12:12:14 | 1 |
| 31 | 2 | 2014-12-20 15:12:12 | 2 |
| 32 | 2 | 2014-12-20 15:12:14 | 2 |
| 23 | 1 | 2015-12-20 23:42:10 | 3 |
| 24 | 1 | 2015-12-20 23:42:11 | 3 |
I used
(CASE WHEN ((car_id = prev_car_id) AND (datetime-prev_date <= '5 seconds'::interval)) THEN 0 ELSE 1 END)
instead of
(CASE WHEN ((car_id != prev_car_id) OR (datetime-prev_date > '5 seconds'::interval)) THEN 1 ELSE 0 END)
because prev_car_id and prev_date may be NULL. Thus, on the first row, (car_id != prev_car_id) returns NULL when instead we want TRUE.
By expressing the condition in the opposite way, we can identify the unintersting rows correctly:
((car_id = prev_car_id) AND (datetime-prev_date <= '5 seconds'::interval))
and use the ELSE clause to return 1 when the condition is TRUE or NULL. You can see the difference here:
SELECT id
, (CASE WHEN ((car_id = prev_car_id) AND (datetime-prev_date <= '5 seconds'::interval)) THEN 0 ELSE 1 END) AS new_trip
, (CASE WHEN ((car_id != prev_car_id) OR (datetime-prev_date > '5 seconds'::interval)) THEN 1 ELSE 0 END) AS new_trip_wrong
, car_id, prev_car_id, datetime, prev_date
FROM (
SELECT id, car_id, datetime
, lag(datetime) OVER () AS prev_date
, lag(car_id) OVER () AS prev_car_id
FROM (
SELECT id, car_id
, (date || ' ' || time)::timestamp AS datetime
FROM test ) t1
) t2
yields
| id | new_trip | new_trip_wrong | car_id | prev_car_id | datetime | prev_date |
|----+----------+----------------+--------+-------------+---------------------+---------------------|
| 11 | 1 | 0 | 1 | | 2014-12-20 12:12:12 | |
| 12 | 0 | 0 | 1 | 1 | 2014-12-20 12:12:13 | 2014-12-20 12:12:12 |
| 13 | 0 | 0 | 1 | 1 | 2014-12-20 12:12:14 | 2014-12-20 12:12:13 |
| 23 | 1 | 1 | 1 | 1 | 2015-12-20 23:42:10 | 2014-12-20 12:12:14 |
| 24 | 0 | 0 | 1 | 1 | 2015-12-20 23:42:11 | 2015-12-20 23:42:10 |
| 31 | 1 | 1 | 2 | 1 | 2014-12-20 15:12:12 | 2015-12-20 23:42:11 |
| 32 | 0 | 0 | 2 | 2 | 2014-12-20 15:12:14 | 2014-12-20 15:12:12 |
Note the difference in the new_trip versus new_trip_wrong columns.

Related

Cumulative sum of multiple window functions

I have a table with the structure:
id | date | player_id | score
--------------------------------------
1 | 2019-01-01 | 1 | 1
2 | 2019-01-02 | 1 | 1
3 | 2019-01-03 | 1 | 0
4 | 2019-01-04 | 1 | 0
5 | 2019-01-05 | 1 | 1
6 | 2019-01-06 | 1 | 1
7 | 2019-01-07 | 1 | 0
8 | 2019-01-08 | 1 | 1
9 | 2019-01-09 | 1 | 0
10 | 2019-01-10 | 1 | 0
11 | 2019-01-11 | 1 | 1
I want to create two more columns, 'total_score', 'last_seven_days'.
total_score is a rolling sum of the player_id score
last_seven_days is the score for the last seven days including to and prior to the date
I have written the following SQL query:
SELECT id,
date,
player_id,
score,
sum(score) OVER all_scores AS all_score,
sum(score) OVER last_seven AS last_seven_score
FROM scores
WINDOW all_scores AS (PARTITION BY player_id ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),
last_seven AS (PARTITION BY player_id ORDER BY id ROWS BETWEEN 7 PRECEDING AND 1 PRECEDING);
and get the following output:
id | date | player_id | score | all_score | last_seven_score
------------------------------------------------------------------
1 | 2019-01-01 | 1 | 1 | |
2 | 2019-01-02 | 1 | 1 | 1 | 1
3 | 2019-01-03 | 1 | 0 | 2 | 2
4 | 2019-01-04 | 1 | 0 | 2 | 2
5 | 2019-01-05 | 1 | 1 | 2 | 2
6 | 2019-01-06 | 1 | 1 | 3 | 3
7 | 2019-01-07 | 1 | 0 | 4 | 4
8 | 2019-01-08 | 1 | 1 | 4 | 4
9 | 2019-01-09 | 1 | 0 | 5 | 4
10 | 2019-01-10 | 1 | 0 | 5 | 3
11 | 2019-01-11 | 1 | 1 | 5 | 3
I have realised that I need to change this
last_seven AS (PARTITION BY player_id ORDER BY id ROWS BETWEEN 7 PRECEDING AND 1 PRECEDING)
to instead of being 7, to use some sort of date format because just having the number 7 will introduce errors.
i.e. it would be nice to be able to do date - 2days or date - 6days
I also would like to add columns such as 3 months, 6 months, 12 months later down the track and so need it to be able to be dynamic.
DEMO

demo:db<>fiddle
Solution for Postgres 11+:
Using RANGE interval as #LaurenzAlbe did
Solution for Postgres <11:
(just presenting the "days" part, the "all_scores" part is the same)
Joining the table against itself on the player_id and the relevant date range:
SELECT s1.*,
(SELECT SUM(s2.score)
FROM scores s2
WHERE s2.player_id = s1.player_id
AND s2."date" BETWEEN s1."date" - interval '7 days' AND s1."date" - interval '1 days')
FROM scores s1

You need to use a window by RANGE:
last_seven AS (PARTITION BY player_id
ORDER BY date
RANGE BETWEEN INTERVAL '7 days' PRECEDING
AND INTERVAL '1 day' PRECEDING)
This solution will work only from v11 on.

Month Not Printing When No Transaction In Particular Month

I had written a code for getting employee attrition details, Showing employee count of opening, join, left & closing, Month wise.
Issue here is that if there is no value in any above four column, system is not generating the month.
Please suggest the solution.
OUTPUT:
yyear | mmonth | charmonth | opening | incoming | relived | closing
-------+--------+-----------+---------+----------+---------+---------
2018 | 4 | Apr-18 | 14 | 2 | 0 | 16
2018 | 5 | May-18 | 16 | 1 | 0 | 17
2018 | 8 | Aug-18 | 17 | 3 | 0 | 20
2018 | 9 | Sep-18 | 20 | 1 | 0 | 21
2018 | 10 | Oct-18 | 21 | 23 | 4 | 40
2018 | 11 | Nov-18 | 40 | 5 | 1 | 44
2018 | 12 | Dec-18 | 44 | 2 | 0 | 46
2019 | 1 | Jan-19 | 46 | 1 | 0 | 47
2019 | 2 | Feb-19 | 47 | 1 | 0 | 48
2019 | 3 | Mar-19 | 48 | 6 | 1 | 53
2019 | 4 | Apr-19 | 53 | 1 | 0 | 54
2019 | 5 | May-19 | 54 | 3 | 1 | 56
2019 | 6 | Jun-19 | 56 | 2 | 0 | 58
(13 rows)
If you see the sequence of month, June-18, July-18 is missing.
Code:
WITH table_1 AS (
select
startdate as ddate,
enddate as lastday,
extract('month' from startdate) as mmonth,
extract('year' from startdate) as yyear,
to_char(to_timestamp(startdate),'Mon-YY') as months
from shr_period
where startdate >= DATE('2018-01-01')
and enddate <= DATE('2019-07-01')
and ad_org_id = 'C9D035B52FAF46329D9654B1ECA0289F'
)
SELECT
table_1.yyear,
table_1.mmonth,
table_1.months as charmonth,
(SELECT
COUNT(*)
FROM shr_emp_job OPENING
WHERE OPENING.dateofjoining < table_1.ddate
and OPENING.relieveddate is null
and OPENING.ad_org_id = 'C9D035B52FAF46329D9654B1ECA0289F'
) AS OPENING,
count(*) as incoming,
(select count(*)
from shr_emp_job rel
where rel.relieveddate is not null
and rel.dateofjoining <= table_1.lastday
and rel.dateofjoining >= table_1.ddate
and rel.ad_org_id = 'C9D035B52FAF46329D9654B1ECA0289F'
) as relived,
(SELECT COUNT(*)
FROM shr_emp_job CLOSING
WHERE CLOSING.dateofjoining <= table_1.lastday
and relieveddate is null
and CLOSING.ad_org_id = 'C9D035B52FAF46329D9654B1ECA0289F'
) AS CLOSING
FROM
shr_emp_job
JOIN table_1 ON table_1.mmonth = extract('month' from shr_emp_job.dateofjoining)
AND table_1.yyear = extract('year' from shr_emp_job.dateofjoining)
where shr_emp_job.ad_org_id = 'C9D035B52FAF46329D9654B1ECA0289F'
GROUP BY table_1.mmonth, table_1.yyear, table_1.ddate, table_1.lastday, charmonth
ORDER BY table_1.yyear, table_1.mmonth;

As a quick look try changing your JOIN from an inner join to an outer join. So instead of
FROM
shr_emp_job
JOIN table_1 ON
do
FROM
shr_emp_job
RIGHT OUTER JOIN table_1 ON
This tells Postgres to keep the selected columns from the right mentioned table (table_1) even when there is no matching values in the left mentioned table (shr_emp_job). For those conditions NULL is supplied for the missing values.

postgres tablefunc, sales data grouped by product, with crosstab of months

TIL about tablefunc and crosstab. At first I wanted to "group data by columns" but that doesn't really mean anything.
My product sales look like this
product_id | units | date
-----------------------------------
10 | 1 | 1-1-2018
10 | 2 | 2-2-2018
11 | 3 | 1-1-2018
11 | 10 | 1-2-2018
12 | 1 | 2-1-2018
13 | 10 | 1-1-2018
13 | 10 | 2-2-2018
I would like to produce a table of products with months as columns
product_id | 01-01-2018 | 02-01-2018 | etc.
-----------------------------------
10 | 1 | 2
11 | 13 | 0
12 | 0 | 1
13 | 20 | 0
First I would group by month, then invert and group by product, but I cannot figure out how to do this.

After enabling the tablefunc extension,
SELECT product_id, coalesce("2018-1-1", 0) as "2018-1-1"
, coalesce("2018-2-1", 0) as "2018-2-1"
FROM crosstab(
$$SELECT product_id, date_trunc('month', date)::date as month, sum(units) as units
FROM test
GROUP BY product_id, month
ORDER BY 1$$
, $$VALUES ('2018-1-1'::date), ('2018-2-1')$$
) AS ct (product_id int, "2018-1-1" int, "2018-2-1" int);
yields
| product_id | 2018-1-1 | 2018-2-1 |
|------------+----------+----------|
| 10 | 1 | 2 |
| 11 | 13 | 0 |
| 12 | 0 | 1 |
| 13 | 10 | 10 |

Compare Consecutive Records and Get Earliest Start Date

A Customer can be in multiple positions over the lifetime and can only have one active position (marked by start date and end date). A position is part of a cost centre.
If a customer over the lifetime had 18 positions, i have to check if any of those positions in a consecutive order, were part of the same cost centre. if they were, then i have use the start date from the earliest position (same cost centre). i have written something like this:

By using 2 row_number() calculations on slightly different partitions it is possible to get a calculation(rn) that allows us to group for each consecutive set of positions in the same cost centre. You already have one such row_number when you setup the temp table. I included rn1 & rn2 so you could investigate how it works.
SQL Fiddle
MS SQL Server 2014 Schema Setup:
CREATE TABLE TempTbl
([ConsecutivePositions] int, [CustomerID] int, [PositionID] int, [CustomerPositionId] int, [StartDate] datetime, [EndDate] varchar(23), [CostCentreID] int)
;
INSERT INTO TempTbl
([ConsecutivePositions], [CustomerID], [PositionID], [CustomerPositionId], [StartDate], [EndDate], [CostCentreID])
VALUES
(1, 2734, 195, 31860, '2013-10-17 16:08:53', '2015-03-06 11:51:09.440', 5),
(2, 2734, 29, 39405, '2015-03-06 11:51:09', '2016-01-27 13:10:19.720', 3),
(3, 2734, 271, 23599, '2012-04-05 16:21:41', '2012-12-04 11:32:47.433', 13),
(4, 2734, 107, 26479, '2012-12-04 11:32:47', '2013-03-19 09:07:13.633', 14),
(5, 2734, 297, 28497, '2013-03-19 09:07:13', '2013-10-17 16:08:53.120', 14),
(6, 2734, 154, 2723, '2007-11-27 00:00:00', '2009-07-10 15:44:16.640', 3),
(7, 2734, 145, 19436, '2011-03-15 00:00:00', '2011-10-18 15:42:36.877', 906),
(8, 2734, 146, 17453, '2010-09-12 00:00:00', '2010-11-11 15:58:25.043', 13),
(9, 2734, 8, 18180, '2010-11-11 00:00:00', '2011-03-15 17:57:48.027', 13),
(10, 2734, 8, 21606, '2011-10-18 15:42:36', '2011-11-11 16:42:54.787', 13),
(11, 2734, 8, 21982, '2011-11-14 11:18:24', '2012-04-05 16:21:41.230', 13),
(12, 2734, 264, 21958, '2011-11-11 16:42:54', '2011-11-14 11:18:24.057', 906),
(13, 2734, 5, 12785, '2009-07-10 00:00:00', '2009-07-29 09:30:52.430', 3),
(14, 2734, 5, 12999, '2009-07-29 00:00:00', '2010-03-04 13:00:30.223', 3),
(15, 2734, 149, 15165, '2010-03-04 00:00:00', '2010-08-16 12:13:30.703', 3),
(16, 2734, 8, 17044, '2010-08-16 00:00:00', '2010-09-12 16:29:01.203', 13),
(17, 2734, 891, 45453, '2016-01-27 13:10:19', NULL, 906)
;
Query 1:
with cte as (
select
*
, row_number() over(partition by CustomerID order by StartDate) rn1
, row_number() over(partition by CustomerID, CostCentreID order by StartDate) rn2
, row_number() over(partition by CustomerID order by StartDate)
- row_number() over(partition by CustomerID, CostCentreID order by StartDate) rn3
from temptbl
)
select
CustomerID
, CostCentreID
, rn3
, count(*) c
, min(StartDate) StartDate
, max(EndDate) EndDate
from cte
group by
CustomerID, CostCentreID, rn3
order by
CustomerID, StartDate
Results:
| CustomerID | CostCentreID | rn3 | c | StartDate | EndDate |
|------------|--------------|-----|---|----------------------|-------------------------|
| 2734 | 3 | 0 | 4 | 2007-11-27T00:00:00Z | 2010-08-16 12:13:30.703 |
| 2734 | 13 | 4 | 3 | 2010-08-16T00:00:00Z | 2011-03-15 17:57:48.027 |
| 2734 | 906 | 7 | 1 | 2011-03-15T00:00:00Z | 2011-10-18 15:42:36.877 |
| 2734 | 13 | 5 | 1 | 2011-10-18T15:42:36Z | 2011-11-11 16:42:54.787 |
| 2734 | 906 | 8 | 1 | 2011-11-11T16:42:54Z | 2011-11-14 11:18:24.057 |
| 2734 | 13 | 6 | 2 | 2011-11-14T11:18:24Z | 2012-12-04 11:32:47.433 |
| 2734 | 14 | 12 | 2 | 2012-12-04T11:32:47Z | 2013-10-17 16:08:53.120 |
| 2734 | 5 | 14 | 1 | 2013-10-17T16:08:53Z | 2015-03-06 11:51:09.440 |
| 2734 | 3 | 11 | 1 | 2015-03-06T11:51:09Z | 2016-01-27 13:10:19.720 |
| 2734 | 906 | 14 | 1 | 2016-01-27T13:10:19Z | (null) |
----
Instead of a group by query, use more window functions, and you can get the all details in the temp table as well as the wanted cost center related dates.
with cte as (
select
*
, row_number() over(partition by CustomerID order by StartDate) rn1
, row_number() over(partition by CustomerID, CostCentreID order by StartDate) rn2
, row_number() over(partition by CustomerID order by StartDate)
- row_number() over(partition by CustomerID, CostCentreID order by StartDate) rn3
from temptbl
)
, cte2 as (
select
*
, min(StartDate) over(partition by CustomerID, CostCentreID, rn3) MinStartDate
, max(EndDate) over(partition by CustomerID, CostCentreID, rn3) MaxEndDate
from cte
)
select
*
from cte2
;
Results:
| ConsecutivePositions | CustomerID | PositionID | CustomerPositionId | StartDate | EndDate | CostCentreID | rn1 | rn2 | rn3 | MinStartDate | MaxEndDate |
|----------------------|------------|------------|--------------------|----------------------|-------------------------|--------------|-----|-----|-----|----------------------|-------------------------|
| 6 | 2734 | 154 | 2723 | 2007-11-27T00:00:00Z | 2009-07-10 15:44:16.640 | 3 | 1 | 1 | 0 | 2007-11-27T00:00:00Z | 2010-08-16 12:13:30.703 |
| 13 | 2734 | 5 | 12785 | 2009-07-10T00:00:00Z | 2009-07-29 09:30:52.430 | 3 | 2 | 2 | 0 | 2007-11-27T00:00:00Z | 2010-08-16 12:13:30.703 |
| 14 | 2734 | 5 | 12999 | 2009-07-29T00:00:00Z | 2010-03-04 13:00:30.223 | 3 | 3 | 3 | 0 | 2007-11-27T00:00:00Z | 2010-08-16 12:13:30.703 |
| 15 | 2734 | 149 | 15165 | 2010-03-04T00:00:00Z | 2010-08-16 12:13:30.703 | 3 | 4 | 4 | 0 | 2007-11-27T00:00:00Z | 2010-08-16 12:13:30.703 |
| 16 | 2734 | 8 | 17044 | 2010-08-16T00:00:00Z | 2010-09-12 16:29:01.203 | 13 | 5 | 1 | 4 | 2010-08-16T00:00:00Z | 2011-03-15 17:57:48.027 |
| 8 | 2734 | 146 | 17453 | 2010-09-12T00:00:00Z | 2010-11-11 15:58:25.043 | 13 | 6 | 2 | 4 | 2010-08-16T00:00:00Z | 2011-03-15 17:57:48.027 |
| 9 | 2734 | 8 | 18180 | 2010-11-11T00:00:00Z | 2011-03-15 17:57:48.027 | 13 | 7 | 3 | 4 | 2010-08-16T00:00:00Z | 2011-03-15 17:57:48.027 |
| 10 | 2734 | 8 | 21606 | 2011-10-18T15:42:36Z | 2011-11-11 16:42:54.787 | 13 | 9 | 4 | 5 | 2011-10-18T15:42:36Z | 2011-11-11 16:42:54.787 |
| 11 | 2734 | 8 | 21982 | 2011-11-14T11:18:24Z | 2012-04-05 16:21:41.230 | 13 | 11 | 5 | 6 | 2011-11-14T11:18:24Z | 2012-12-04 11:32:47.433 |
| 3 | 2734 | 271 | 23599 | 2012-04-05T16:21:41Z | 2012-12-04 11:32:47.433 | 13 | 12 | 6 | 6 | 2011-11-14T11:18:24Z | 2012-12-04 11:32:47.433 |
| 7 | 2734 | 145 | 19436 | 2011-03-15T00:00:00Z | 2011-10-18 15:42:36.877 | 906 | 8 | 1 | 7 | 2011-03-15T00:00:00Z | 2011-10-18 15:42:36.877 |
| 12 | 2734 | 264 | 21958 | 2011-11-11T16:42:54Z | 2011-11-14 11:18:24.057 | 906 | 10 | 2 | 8 | 2011-11-11T16:42:54Z | 2011-11-14 11:18:24.057 |
| 2 | 2734 | 29 | 39405 | 2015-03-06T11:51:09Z | 2016-01-27 13:10:19.720 | 3 | 16 | 5 | 11 | 2015-03-06T11:51:09Z | 2016-01-27 13:10:19.720 |
| 4 | 2734 | 107 | 26479 | 2012-12-04T11:32:47Z | 2013-03-19 09:07:13.633 | 14 | 13 | 1 | 12 | 2012-12-04T11:32:47Z | 2013-10-17 16:08:53.120 |
| 5 | 2734 | 297 | 28497 | 2013-03-19T09:07:13Z | 2013-10-17 16:08:53.120 | 14 | 14 | 2 | 12 | 2012-12-04T11:32:47Z | 2013-10-17 16:08:53.120 |
| 1 | 2734 | 195 | 31860 | 2013-10-17T16:08:53Z | 2015-03-06 11:51:09.440 | 5 | 15 | 1 | 14 | 2013-10-17T16:08:53Z | 2015-03-06 11:51:09.440 |
| 17 | 2734 | 891 | 45453 | 2016-01-27T13:10:19Z | (null) | 906 | 17 | 3 | 14 | 2013-10-17T16:08:53Z | 2015-03-06 11:51:09.440 |

Valid periods - SQL VIEW

I have 2 tables (actually there are 4, but for now lets say it's 2) with data like this:
Table PersonA
ClientID ID From Till
1 10 1.1.2017 30.4.2017
1 12 1.8.2017 2.1.2018
Table PersonB
ClientID ID From Till
1 6 1.3.2017 30.6.2017
And I need to generate view that would show something like this:
ClientID From Till PersonA PersonB
1 1.1.2017 28.2.2017 10 NULL
1 1.3.2017 30.4.2017 10 6
1 1.5.2017 30.6.2017 NULL 6
1 1.8.2017 02.1.2018 12 NULL
So basically I need to create view that would show what "persons" each client had in given period.
So when there is an overlap, client have both PersonA and PersonB (same should apply for PersonC and PersonD).
So in the final view one client can't have any overlapping dates.
I don't know how to approach this.

In an adaptation of this algorithm, we can already handle the overlaps:
declare #PersonA table(ClientID int, ID int, [From] date, Till date);
insert into #PersonA values (1,10,'20170101','20170430'),(1,12,'20170801','20180112');
declare #PersonB table(ClientID int, ID int, [From] date, Till date);
insert into #PersonB values (1,6,'20170301','20170630');
declare #PersonC table(ClientID int, ID int, [From] date, Till date);
insert into #PersonC values (1,12,'20170401','20170625');
declare #PersonD table(ClientID int, ID int, [From] date, Till date);
insert into #PersonD values (1,14,'20170501','20170525'),(1,14,'20170510','20171122');
with X(ClientID,EdgeDate)
as (select ClientID
,case
when toggle = 1
then Till
else [From]
end as EdgeDate
from
(
select ClientID,[From],Till from #PersonA
union all
select ClientID,[From],Till from #PersonB
union all
select ClientID,[From],Till from #PersonC
union all
select ClientID,[From],Till from #PersonD
) as concated
cross join
(
select-1 as toggle
union all
select 1 as toggle
) as toggler
),merged
as (select distinct
S.ClientID
,S.EdgeDate as [From]
,min(E.EdgeDate) as Till
from
X as S
inner join X as E
on S.ClientID = E.ClientID
and S.EdgeDate < E.EdgeDate
group by S.ClientID
,S.EdgeDate
),prds
as (select distinct
merged.ClientID
,merged.[From]
,merged.Till
,A.ID as PersonA
,B.ID as PersonB
,C.ID as PersonC
,D.ID as PersonD
from
merged
left join #PersonA as A
on merged.ClientID = A.ClientID
and A.[From] <= merged.[From]
and merged.Till <= A.Till
left join #PersonB as B
on merged.ClientID = B.ClientID
and B.[From] <= merged.[From]
and merged.Till <= B.Till
left join #PersonC as C
on merged.ClientID = C.ClientID
and C.[From] <= merged.[From]
and merged.Till <= C.Till
left join #PersonD as D
on merged.ClientID = D.ClientID
and D.[From] <= merged.[From]
and merged.Till <= D.Till
where not(A.ID is null
and B.ID is null
and C.ID is null
and D.ID is null
)
)
select ClientID
,[From]
,case
when Till = lead([From]
) over(order by Till)
then dateadd(d,-1,Till)
else Till
end as Till
,PersonA
,PersonB
,PersonC
,PersonD
from
prds
order by ClientID
,[From]
,Till;
Output with just the two Person tables given in the question:
+----------+------------+------------+---------+---------+
| ClientID | From | Till | PersonA | PersonB |
+----------+------------+------------+---------+---------+
| 1 | 2017-01-01 | 2017-02-28 | 10 | NULL |
| 1 | 2017-03-01 | 2017-04-29 | 10 | 6 |
| 1 | 2017-04-30 | 2017-06-30 | NULL | 6 |
| 1 | 2017-08-01 | 2018-01-12 | 12 | NULL |
+----------+------------+------------+---------+---------+
Output of script as it is above, with four Person tables:
+----------+------------+------------+---------+---------+---------+---------+
| ClientID | From | Till | PersonA | PersonB | PersonC | PersonD |
+----------+------------+------------+---------+---------+---------+---------+
| 1 | 2017-01-01 | 2017-02-28 | 10 | NULL | NULL | NULL |
| 1 | 2017-03-01 | 2017-03-31 | 10 | 6 | NULL | NULL |
| 1 | 2017-04-01 | 2017-04-29 | 10 | 6 | 12 | NULL |
| 1 | 2017-04-30 | 2017-04-30 | NULL | 6 | 12 | NULL |
| 1 | 2017-05-01 | 2017-05-09 | NULL | 6 | 12 | 14 |
| 1 | 2017-05-10 | 2017-05-24 | NULL | 6 | 12 | 14 |
| 1 | 2017-05-25 | 2017-06-24 | NULL | 6 | 12 | 14 |
| 1 | 2017-06-25 | 2017-06-29 | NULL | 6 | NULL | 14 |
| 1 | 2017-06-30 | 2017-07-31 | NULL | NULL | NULL | 14 |
| 1 | 2017-08-01 | 2017-11-21 | 12 | NULL | NULL | 14 |
| 1 | 2017-11-22 | 2018-01-12 | 12 | NULL | NULL | NULL |
+----------+------------+------------+---------+---------+---------+---------+