How do I produce a report to show the number of occurrences an employee has been absent from work - tsql

I have been asked to generate a report to show the number of occurrences an employee is absent from work sick.
If an employee is absent from work for 3 consecutive days this will be counted as 1 occurrence. If they then return to work and are then absent again for another 2 consecutive days this will be recorded as 2 occurrences.
I need to generate a report to show the number of occurrences an employee is away from work sick within a 6 month period.
I have set out an example below of the data showing an employee's absence records and how I need the report to look.
How the data appears in the database:
Name Absence Dates
John Smith 01-Sep-19
John Smith 02-Sep-19
John Smith 03-Sep-19
John Smith 10-Sep-19
John Smith 11-Sep-19
How I would like the report to look:
Name Occurrences
John Smith 2
I would be grateful for any assistance with writing the code to achieve this result.

Not a full answer, as you should really do some of this yourself. However, based on what you have detailed in your question, you could use the approach below to count up any spells of absence within a 6 month period.
This assumes you are using SQL Server and that each spell is stored as one row with an absence date and a return date.
declare @absences table (empid int, [abs date] date, [ret date] date);
declare @staff table ([empid] int, [name1] nvarchar(50), [name2] nvarchar(50), [surname] nvarchar(50));
-- put some test values in the staff table to work with
insert into @staff
values
(1, 'John', 'Lewis', 'Smith'), -- using a unique ID here; in any good system this should be an incremental number for each new staff member added to the table
(2, 'James', 'Thomas', 'Brown');
-- put some test values in the absences table to work with
insert into @absences
values
(1, '2019-07-01', '2019-07-04'), -- empid, absence date & return date
(1, '2019-08-04', '2019-08-06'),
(2, '2019-07-02', '2019-07-05'),
(2, '2019-08-05', '2019-08-07');
-- count the spells of absence per employee, broken down by length in days
select count(*) spellsoff, empid, name1, name2, surname, [days absent]
from
(
    select
        s.empid,
        s.name1,
        s.name2,
        s.surname,
        a.[abs date],
        a.[ret date],
        datediff(d, a.[abs date], a.[ret date]) [days absent]
    from @staff s
    left join @absences a
        on s.empid = a.empid
    where a.[abs date] >= DATEADD(M, -6, GETDATE()) -- pull back those employees that have been absent in the last 6 months from today's date
) doff
group by empid, name1, name2, surname, [days absent];
Gives you the following breakdown:
spellsoff empid name1 name2 surname days absent
1 1 John Lewis Smith 2
1 1 John Lewis Smith 3
1 2 James Thomas Brown 2
1 2 James Thomas Brown 3
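The question stores one row per absent day rather than a start and return date, so the consecutive days have to be grouped first. Below is a minimal sketch of one way to do that, run through Python's sqlite3 module so it can be tried without a server. The table and column names (absences, absence_date) and the window_start value are assumptions made for the illustration, and it treats "consecutive" as consecutive calendar days. SQL Server has LAG and DATEDIFF, which could play the role that LAG and julianday play here, but the syntax below is SQLite's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE absences (name TEXT, absence_date TEXT)")
conn.executemany(
    "INSERT INTO absences VALUES (?, ?)",
    [
        ("John Smith", "2019-09-01"),
        ("John Smith", "2019-09-02"),
        ("John Smith", "2019-09-03"),
        ("John Smith", "2019-09-10"),
        ("John Smith", "2019-09-11"),
    ],
)

# A row starts a new occurrence when the employee has no earlier absence in
# the window (gap_days IS NULL) or the previous absence was more than one
# day before it (gap_days > 1).
query = """
SELECT name, COUNT(*) AS occurrences
FROM (
    SELECT name,
           julianday(absence_date)
           - julianday(LAG(absence_date) OVER (
                 PARTITION BY name ORDER BY absence_date)) AS gap_days
    FROM absences
    WHERE absence_date >= :window_start  -- hypothetical start of the 6 month period
) AS d
WHERE gap_days IS NULL OR gap_days > 1
GROUP BY name
ORDER BY name
"""
occurrences = conn.execute(query, {"window_start": "2019-06-01"}).fetchall()
print(occurrences)  # [('John Smith', 2)]
```

Window functions need SQLite 3.25 or later. If weekends or other non-working days should not break a spell, the gap test would need a calendar table, which this sketch does not attempt.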

Related

SQL Subquery for each

I have following tables
create table players
(
name varchar(30) not null primary key
);
create table injuries
(
bId int not null primary key,
date DATE not null,
name varchar(30),
foreign key(name) references players
);
create table sportsBegins
(
cId int not null primary key,
date DATE,
sportname varchar(20),
name varchar(30),
foreign key(name) references players
);
Following example data:
players
name
John
Jane
George
shows players in db
sportsBegins
cId | date | sportname | name
1 2020-01-01 Basketball John
2 2020-02-02 Basketball John
3 2020-01-01 Soccer John
4 2020-02-02 Basketball Jane
5 2020-01-03 Basketball George
6 2020-01-04 Badminton George
shows what date players begin playing a sport
injuries
bId | date | name
1 2020-01-01 John
2 2020-02-03 Jane
3 2020-01-05 George
shows the date these players reported injuries.
I want to count the number of DISTINCT players that have experienced an injury in Basketball AFTER the first day they got assigned the sport (not the same day).
So for each player, I need to grab only the first date they started playing basketball. Then for that player, I need to compare his name AND date to the name AND date in the injuries table to see if he ever reported an injury after the date he got the sport assigned.
Example
In the example data I provided this would be the output
Total basketball injuries
2
Explanation of answer
John got assigned basketball twice, so only look at the first date he got assigned basketball. Then look at the injuries table. He only reported an injury on that day, but never after, so ignore him. Jane and George reported injuries after the first day they were assigned basketball, so count them.
This should get you the desired result
SELECT count(distinct injuries.name)
FROM injuries
INNER JOIN (
    SELECT name, min(date) as startDate
    FROM sportsBegins
    WHERE sportname = 'Basketball'
    GROUP BY name
) as startDates
    ON injuries.name = startDates.name
    AND injuries.date > startDates.startDate
Quick explanation:
startDates extracts the first date each player started playing basketball
the join condition filters only injuries which happened after the first start date for each player
count(distinct injuries.name) ensures each player only gets counted once even if he/she reported more than one injury after the first start date
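To check the query against the sample data from the question, here is a small sketch that loads the rows into an in-memory SQLite database through Python's sqlite3 module. Storing the dates as ISO text is an assumption of the sketch (SQLite has no DATE type), and it works here because ISO dates sort the same way as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sportsBegins (cId INTEGER PRIMARY KEY, date TEXT, sportname TEXT, name TEXT);
CREATE TABLE injuries (bId INTEGER PRIMARY KEY, date TEXT NOT NULL, name TEXT);
INSERT INTO sportsBegins VALUES
    (1, '2020-01-01', 'Basketball', 'John'),
    (2, '2020-02-02', 'Basketball', 'John'),
    (3, '2020-01-01', 'Soccer', 'John'),
    (4, '2020-02-02', 'Basketball', 'Jane'),
    (5, '2020-01-03', 'Basketball', 'George'),
    (6, '2020-01-04', 'Badminton', 'George');
INSERT INTO injuries VALUES
    (1, '2020-01-01', 'John'),
    (2, '2020-02-03', 'Jane'),
    (3, '2020-01-05', 'George');
""")

# Same shape as the answer: first basketball date per player, then injuries
# strictly after that date, counting each player once.
total = conn.execute("""
SELECT COUNT(DISTINCT injuries.name)
FROM injuries
INNER JOIN (
    SELECT name, MIN(date) AS startDate
    FROM sportsBegins
    WHERE sportname = 'Basketball'
    GROUP BY name
) AS startDates
    ON injuries.name = startDates.name
   AND injuries.date > startDates.startDate
""").fetchone()[0]
print(total)  # 2
```

John is excluded because his only injury falls on his first basketball date, and the join uses a strict greater-than comparison.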

Replace content in 'order' column with sequential numbers

I have an 'order' column in a table in a postgres database that has a lot of missing numbers in the sequence. I am having a problem figuring out how to replace the numbers currently in the column, with new ones that are incremental (see examples).
What I have:
id order name
---------------
1 50 Anna
2 13 John
3 2 Bruce
4 5 David
What I want:
id order name
---------------
1 4 Anna
2 3 John
3 1 Bruce
4 2 David
The row containing the lowest order number in the old version of the column should get the new order number '1', the next after that should get '2' etc.
You can use the window function row_number() to calculate the new numbers. The result of that can be used in an update statement:
update the_table
set "order" = t.rn
from (
select id, row_number() over (order by "order") as rn
from the_table
) t
where t.id = the_table.id;
This assumes that id is the primary key of that table.
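As a rough way to try the renumbering on the sample rows, the sketch below uses Python's sqlite3 module with an in-memory database. It is not the same statement as above: to stay within syntax that older SQLite builds accept, it first reads the row_number() result and then applies it with a parameterised UPDATE. That two-step split is a choice made for this sketch only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE the_table (id INTEGER PRIMARY KEY, "order" INTEGER, name TEXT);
INSERT INTO the_table VALUES
    (1, 50, 'Anna'),
    (2, 13, 'John'),
    (3, 2, 'Bruce'),
    (4, 5, 'David');
""")

# Step 1: compute the new sequential number for every id.
new_numbers = conn.execute("""
SELECT row_number() OVER (ORDER BY "order") AS rn, id
FROM the_table
""").fetchall()

# Step 2: write the numbers back, matching rows on the primary key.
conn.executemany('UPDATE the_table SET "order" = ? WHERE id = ?', new_numbers)

rows = conn.execute('SELECT id, "order", name FROM the_table ORDER BY id').fetchall()
print(rows)  # [(1, 4, 'Anna'), (2, 3, 'John'), (3, 1, 'Bruce'), (4, 2, 'David')]
```

If two rows share the same old order value, row_number() breaks the tie arbitrarily; adding id as a second sort key would make the result deterministic.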

PostgreSQL showing people that are born the same month

So let's say I have a table of:
Name Born
John 1994-01-01
John 1994-02-08
Jack 1995-03-09
Bob 1992-03-10
Tom 1995-07-13
Ronda 1984-01-25
And I want to make it that it only shows
John 1994-01-01
Ronda 1984-01-25
Jack 1995-03-09
Bob 1992-03-10
Because they are born in the same months.
I've tried different selects with EXTRACT and such, but it doesn't seem to work for me.
You can do this with window functions:
select t.*
from (select t.*,
count(*) over (partition by extract(month from born)) as cnt
from t
) t
where cnt > 1
order by extract(month from born);
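The same idea can be tried on the sample rows with Python's sqlite3 module, as sketched below. SQLite has no EXTRACT, so the sketch swaps in strftime('%m', born) to get the month, and it stores the dates as ISO text; both are assumptions of the sketch rather than part of the PostgreSQL query above. The extra sort on born is only there to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (name TEXT, born TEXT);
INSERT INTO t VALUES
    ('John', '1994-01-01'),
    ('John', '1994-02-08'),
    ('Jack', '1995-03-09'),
    ('Bob', '1992-03-10'),
    ('Tom', '1995-07-13'),
    ('Ronda', '1984-01-25');
""")

# Count how many rows share each birth month, then keep months seen more than once.
rows = conn.execute("""
SELECT name, born
FROM (
    SELECT t.*,
           COUNT(*) OVER (PARTITION BY strftime('%m', born)) AS cnt
    FROM t
) AS counted
WHERE cnt > 1
ORDER BY strftime('%m', born), born
""").fetchall()
print(rows)
# [('Ronda', '1984-01-25'), ('John', '1994-01-01'), ('Bob', '1992-03-10'), ('Jack', '1995-03-09')]
```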

How to calculate average number and give subquery label

I have two tables, "book" and "authorCollection". Because a book may have multiple authors, I want to get the average number of authors per book for the books in table "book" that were published in or after the year 2000.
For example:
Table Book:
key year
1 2000
2 2001
3 2002
4 1999
Table authorCollection:
key author
1 Tom
1 John
1 Alex
1 Mary
2 Alex
3 Tony
4 Mary
The result should be (4 + 1 + 1) / 3 = 2 (key 4 was published before the year 2000).
I wrote the following query, but it is not right. I need the count from the subquery's result, but I cannot give the subquery the label "b". How can I solve this problem and get the average number of authors? I am also still confused about what "COUNT(*) as count" means. Thanks.
SELECT COUNT(*) as count, b.COUNT(*) AS total
FROM A
WHERE key IN (SELECT key
FROM Book
WHERE year >= 2000
) b
GROUP BY key;
First, count number of authors for a key in a subquery. Next, aggregate needed values:
select avg(coalesce(ct, 0))
from book b
left join (
select key, count(*) ct
from authorcollection
group by 1
) a
using (key)
where year >= 2000;
An alternative that also guards against a 'divide by zero' error:
select case when count(distinct book.key)=0
then null
else count(authorCollection.key) * 1.0 / count(distinct book.key) -- count(col) skips the NULLs from unmatched books; * 1.0 avoids integer division
end as avg_after_2000
from book
left join authorCollection on(book.key=authorCollection.key)
where book.year >= 2000
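Here is a small sketch of the first query (the left join onto a per-key author count) run against the sample data, using Python's sqlite3 module. The join is written with ON instead of USING and the key column is quoted, purely as a precaution for SQLite. Book 5, which has no authors, is a hypothetical row added to show why the LEFT JOIN and COALESCE matter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book ("key" INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE authorCollection ("key" INTEGER, author TEXT);
INSERT INTO book VALUES (1, 2000), (2, 2001), (3, 2002), (4, 1999);
INSERT INTO authorCollection VALUES
    (1, 'Tom'), (1, 'John'), (1, 'Alex'), (1, 'Mary'),
    (2, 'Alex'), (3, 'Tony'), (4, 'Mary');
""")

AVG_QUERY = """
SELECT AVG(COALESCE(a.ct, 0))
FROM book AS b
LEFT JOIN (
    SELECT "key", COUNT(*) AS ct
    FROM authorCollection
    GROUP BY "key"
) AS a ON a."key" = b."key"
WHERE b.year >= 2000
"""

avg_sample = conn.execute(AVG_QUERY).fetchone()[0]
print(avg_sample)  # 2.0, i.e. (4 + 1 + 1) / 3

# Hypothetical extra book with no authors: it still counts in the denominator.
conn.execute("INSERT INTO book VALUES (5, 2005)")
avg_with_empty = conn.execute(AVG_QUERY).fetchone()[0]
print(avg_with_empty)  # 1.5, i.e. (4 + 1 + 1 + 0) / 4
```

With an inner join, book 5 would drop out of the average entirely and the second result would stay at 2.0.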

How to find the average of certain records T-SQL

I have a table variable that I am dumping data into:
DECLARE @TmpTbl_SKUs AS TABLE
(
Vendor VARCHAR (255),
Number VARCHAR(4),
SKU VARCHAR(20),
PurchaseOrderDate DATETIME,
LastReceivedDate DATETIME,
DaysDifference INT
)
Some records don't have a purchase order date or last received date, so the days difference is null as well. I have tried a lot of self joins, but the query either takes too long or returns incorrect results most of the time.
Is it possible to get the average days difference per SKU? How would I check whether there is only 1 record for that SKU? If there is only 1 record, then I have to find the average at the champvendor level.
Here is the structure:
Vendor has many Numbers and Numbers has many SKUs
Any help would be great, I can't seem to crack this one, nor can I find anything related to this online. Thanks in advance.
Here is some sample data:
Vendor Number SKU PurchaseOrderDate LastReceivedDate DaysDifference
OTHER PMDD 1111 OP1111 2009-08-21 00:00:00.000 2009-09-02 00:00:00.000 12
OTHER PMDD 1111 OP1112 2009-12-09 00:00:00.000 2009-12-17 00:00:00.000 8
MANTOR 3333 MA1111 2006-02-15 00:00:00.000 2006-02-23 00:00:00.000 8
MANTOR 3333 MA1112 2006-02-15 00:00:00.000 2006-02-23 00:00:00.000 8
I'm sorry I may have written this wrong. If there is only 1 SKU for a record, then I want to return the DaysDifference (if it's not null), if it has more than 1 record and they are not null, then return the average days difference. If it is all nulls, then at a vendor level check for the average of the skus that are not null, otherwise it should just return 7. This is what I have tried:
SELECT t1.SKU, ISNULL
(
    AVG(t1.DaysDifference),
    (
        SELECT ISNULL(AVG(t2.DaysDifference), 7)
        FROM @TmpTbl_SKUs t2
        WHERE t2.SKU=t1.SKU
        GROUP BY t2.ChampVendor, t2.VendorNumber, t2.SKU
    )
)
FROM @TmpTbl_SKUs t1
GROUP BY t1.SKU
I keep playing with this. I have it partly working, but I don't understand how to check whether a SKU has multiple records, or how to check at the vendor level.
Try this:
EDITED: added NULLIF(..., 0) to treat 0s as NULLs.
SELECT
    t1.SKU,
    COALESCE(
        NULLIF(AVG(t1.DaysDifference), 0),
        NULLIF(t2.AvgDifferenceVendor, 0),
        7
    ) AS AvgDiff
FROM @TmpTbl_SKUs t1
INNER JOIN (
    SELECT Vendor, AVG(DaysDifference) AS AvgDifferenceVendor
    FROM @TmpTbl_SKUs
    GROUP BY Vendor
) t2 ON t1.Vendor = t2.Vendor
GROUP BY t1.SKU, t2.AvgDifferenceVendor
EDIT 2: how I tested the script.
For testing I'm using the sample data posted with the question.
DECLARE @TmpTbl_SKUs AS TABLE
(
Vendor VARCHAR (255),
Number VARCHAR(4),
SKU VARCHAR(20),
PurchaseOrderDate DATETIME,
LastReceivedDate DATETIME,
DaysDifference INT
)
INSERT INTO @TmpTbl_SKUs
(Vendor, Number, SKU, PurchaseOrderDate, LastReceivedDate, DaysDifference)
SELECT 'OTHER PMDD', '1111', 'OP1111', '2009-08-21 00:00:00.000', '2009-09-02 00:00:00.000', 12
UNION ALL
SELECT 'OTHER PMDD', '1111', 'OP1112', '2009-12-09 00:00:00.000', '2009-12-17 00:00:00.000', 8
UNION ALL
SELECT 'MANTOR', '3333', 'MA1111', '2006-02-15 00:00:00.000', '2006-02-23 00:00:00.000', 8
UNION ALL
SELECT 'MANTOR', '3333', 'MA1112', '2006-02-15 00:00:00.000', '2006-02-23 00:00:00.000', 8;
First I'm running the script on the unmodified data. Here's the result:
SKU AvgDiff
-------------------- -----------
MA1111 8
MA1112 8
OP1111 12
OP1112 8
AvgDiff for every SKU is identical to the original DaysDifference for that SKU, because there's only one row for each one.
Now I'm changing DaysDifference for SKU='MA1111' to 0 and running the script again. The result is:
SKU AvgDiff
-------------------- -----------
MA1111 4
MA1112 8
OP1111 12
OP1112 8
Now AvgDiff for MA1111 is 4. Why? Because the average for the SKU is 0, and so the average by Vendor is taken, which has been calculated as (0 + 8) / 2 = 4.
Next step is to set DaysDifference to 0 for all the SKUs of the same Vendor. In this case I'm setting it for SKUs MA1111 and MA1112. Here's the result of the script for this change:
SKU AvgDiff
-------------------- -----------
MA1111 7
MA1112 7
OP1111 12
OP1112 8
So now AvgDiff is 7 for both MA1111 and MA1112. How has it become so? Both have DaysDifference = 0. That means that the average by Vendor should be taken for each one. But Vendor average is 0 too in this case. According to the requirement, the average here should default to 7, which is what the script has returned.
So the script seems to be working correctly. I understand that it's either me having missed something or you having forgotten to mention some details. In any case, I would be glad to see where this script fails to solve your problem.
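The walkthrough above uses zeros. Since the question is about NULL values, here is a minimal sketch of the same COALESCE/NULLIF fallback chain run against rows that contain NULLs, using Python's sqlite3 module. The table name skus and the vendor and SKU values are made up for the illustration, and SQLite's AVG returns a floating-point value where SQL Server's AVG over an INT column returns an integer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE skus (Vendor TEXT, SKU TEXT, DaysDifference INTEGER);
INSERT INTO skus VALUES
    ('A', 'A1', 5),      -- SKU with two records: expect their average
    ('A', 'A1', 7),
    ('A', 'A2', NULL),   -- SKU with no usable value: expect vendor A's average
    ('B', 'B1', NULL);   -- vendor with no usable values at all: expect 7
""")

# AVG ignores NULLs, so an all-NULL group yields NULL and COALESCE moves on
# to the vendor average, and finally to the default of 7.
rows = conn.execute("""
SELECT
    t1.SKU,
    COALESCE(
        NULLIF(AVG(t1.DaysDifference), 0),
        NULLIF(t2.AvgDifferenceVendor, 0),
        7
    ) AS AvgDiff
FROM skus t1
INNER JOIN (
    SELECT Vendor, AVG(DaysDifference) AS AvgDifferenceVendor
    FROM skus
    GROUP BY Vendor
) t2 ON t1.Vendor = t2.Vendor
GROUP BY t1.SKU, t2.AvgDifferenceVendor
ORDER BY t1.SKU
""").fetchall()
print(rows)  # [('A1', 6.0), ('A2', 6.0), ('B1', 7)]
```

No separate check for "only 1 record" is needed in this approach, because the average of a single value is that value.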