DATEDIFF missing a few records - tsql

I am using the DATEDIFF function:
SELECT stName
,stId
,stDob --(varchar(15))
,stJoinDt --(datetime)
FROM student stu
WHERE
DATEDIFF(yy,stu.stDob,stu.stJoinDt) between 18 and 75
Since the BETWEEN operator did not return the rows I expected, I also changed the code to:
SELECT stName
,stId
,stDob
,stJoinDt
FROM student stu
WHERE
DATEDIFF(yy,stu.stDob,stu.stJoinDt) >= 18
AND DATEDIFF(yy,stu.stDob,stu.stJoinDt) < 75
Is there a more effective way to write the DATEDIFF condition so that it captures all the missing records?
The missing records are:
stDob stJoinDt
10/08/1925 2011-01-03
04/18/1935 2011-01-19
12/11/1928 2011-06-06
1/24/1927 2011-04-18
04/18/1918 2011-04-20

Those records are missing because the number of years between stDob and stJoinDt is not between 18 and 75, and your condition filters out every row outside that range:
with student as (
select 'Bob' as stName, 1 as stId, '10/08/1925' as stDob, '2011-01-03' as stJoinDt
union select 'Bob' as stName, 2 as stId, '04/18/1935', '2011-01-19'
union select 'Bob' as stName, 3 as stId, '12/11/1928', '2011-06-06'
union select 'Bob' as stName, 4 as stId, '1/24/1927 ', '2011-04-18'
union select 'Bob' as stName, 5 as stId, '04/18/1918', '2011-04-20'
)
SELECT stName
,stId
,stDob --(varchar(15))
,stJoinDt --(datetime)
,datediff(yy, stu.stDob, stu.stJoinDt) as DiffYears
FROM student stu
Output:
stName stId stDob stJoinDt DiffYears
Bob 1 10/08/1925 2011-01-03 *86* (>75)
Bob 2 04/18/1935 2011-01-19 *76* (>75)
Bob 3 12/11/1928 2011-06-06 *83* (>75)
Bob 4 1/24/1927 2011-04-18 *84* (>75)
Bob 5 04/18/1918 2011-04-20 *93* (>75)
My guess is that you want to capture all records where the person is at least 18 years old. In that case, remove the 75 part from the filter:
WHERE
DATEDIFF(yy,stu.stDob,stu.stJoinDt) >= 18
-- STOP HERE
Technically, though, this does not perform the correct calculation, because it only finds the difference between the year values and does not take the day and month into account. For instance, a date of birth of 12/31/1990 and a join date of 1/1/2008 would register as 18 years even though the person is only 17 years and 1 day old. I would recommend instead using the solution provided in this question:
where
(DATEDIFF(YY, stu.stDob, stu.stJoinDt) -
CASE WHEN(
(MONTH(stDob)*100 + DAY(stDob)) > (MONTH(stJoinDt)*100 + DAY(stJoinDt))
) THEN 1 ELSE 0 END
) >= 18
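To make the difference concrete, here is a small Python sketch (not T-SQL). The helper names year_boundaries and age_in_years are made up for illustration: the first mimics what DATEDIFF(yy, ...) counts, and the second applies the same month/day correction as the WHERE clause above.

```python
from datetime import date

def year_boundaries(start: date, end: date) -> int:
    """Mimic DATEDIFF(yy, start, end): only the year parts are compared."""
    return end.year - start.year

def age_in_years(dob: date, on: date) -> int:
    """Completed years: subtract 1 if the birthday has not yet occurred in the end year."""
    before_birthday = (dob.month * 100 + dob.day) > (on.month * 100 + on.day)
    return year_boundaries(dob, on) - (1 if before_birthday else 0)

dob, joined = date(1990, 12, 31), date(2008, 1, 1)
print(year_boundaries(dob, joined))  # 18
print(age_in_years(dob, joined))     # 17
```

Under the corrected calculation, the 04/18/1935 row from the question comes out as 75 completed years rather than 76, so an upper bound of 75 would include it.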

Related

How do I write a GROUP BY query in PostgreSQL

I'm getting errors with PostgreSQL when I write a GROUP BY query.
I'm sure someone will tell me to put all the columns I've selected in the GROUP BY, but that will not give me the correct results.
I am writing a query that selects all the vehicles in the database and groups the results by vehicle, giving me the total distance and cost for a given period.
Here is how I am doing the query:
SELECT i.vehicle AS vehicle,
i.costcenter AS costCenter,
i.department AS department,
SUM(i.quantity) AS liters,
SUM(i.totalcost) AS Totalcost,
v.model AS model,
v.vtype AS vtype
FROM fuelissuances AS i
LEFT JOIN vehicles AS v ON i.vehicle = v.id
WHERE i.dates::text LIKE '%2019-03%' AND i.deleted_at IS NULL
GROUP BY i.vehicle;
If I put all the columns that are in the SELECT into the GROUP BY, the results will not be correct.
How do I go about this without putting all the columns in the GROUP BY or creating sub-queries?
The fuel table looks like:
vehicle dates department quantity totalcost
1 2019-01-01 102 12 1200
1 2019-01-05 102 15 1500
1 2019-01-13 102 18 1800
1 2019-01-22 102 10 1000
2 2019-01-01 102 12 1260
2 2019-01-05 102 19 1995
2 2019-01-13 102 28 2940
Vehicle Table
id model vtype
1 1 2
2 4 6
2 5 7
These are the results I expect from the query:
vehicle dates department quantity totalcost model vtype
1 2019-01-01 102 12 1200 1 2
1 2019-01-05 102 15 1500 1 2
1 2019-01-13 102 18 1800 1 2
1 2019-01-22 102 10 1000 1 2
1 2019-01-18 102 10 1000 1 2
1 65 6500
2 2019-01-01 102 12 1260 5 7
2 2019-01-05 102 19 1995 5 7
2 2019-01-13 102 28 2940 5 7
1 45 6195
Your query doesn't really make sense. Apparently there can be multiple departments and costcenters per vehicle in the fuelissuances table - which of those should be returned?
One way to deal with that is to return all of them, e.g. as an array:
SELECT i.vehicle,
array_agg(i.costcenter) as costcenters,
array_agg(i.department) as departments,
SUM(i.quantity) AS liters,
SUM(i.totalcost) AS Totalcost,
v.model,
v.vtype
FROM fuelissuances AS i
LEFT JOIN vehicles AS v ON i.vehicle = v.id
WHERE i.dates >= date '2019-03-01'
and i.dates < date '2019-04-01'
AND i.deleted_at IS NULL
group by i.vehicle, v.model, v.vtype;
Instead of an array, you could also return a comma-separated list of those values, e.g. string_agg(i.costcenter, ',') as costcenters.
Adding the columns v.model and v.vtype to the GROUP BY won't (shouldn't) change anything: each group contains a single vehicle, so the model and vtype are the same for every row in that group.
Note that I removed the useless aliases and replaced the condition on the date with a proper range condition that can make use of an index on the dates column.
Edit
Based on your new sample data, you want a running total rather than a "regular" aggregation. This can easily be done using window functions:
SELECT i.vehicle,
i.costcenter,
i.department,
SUM(i.quantity) over (w) AS liters,
SUM(i.totalcost) over (w) AS Totalcost,
v.model,
v.vtype
FROM fuelissuances AS i
LEFT JOIN vehicles AS v ON i.vehicle = v.id
WHERE i.dates >= date '2019-01-01'
and i.dates < date '2019-02-01'
AND i.deleted_at IS NULL
window w as (partition by i.vehicle order by i.dates)
order by i.vehicle, i.dates;
I would not create those "total" lines using SQL, but rather in the front end that displays the data.
Online example: https://rextester.com/CRJZ27446
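As a rough way to check the running-total idea without a PostgreSQL server, the sketch below runs an equivalent window query through Python's built-in sqlite3 module. This is a stand-in, not the query above: it assumes SQLite 3.25 or newer (for window functions), stores the dates as ISO text, leaves out the vehicles join, and uses only the fuel rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fuelissuances (vehicle INTEGER, dates TEXT, quantity INTEGER, totalcost INTEGER)"
)
# Sample rows copied from the question's fuel table.
conn.executemany("INSERT INTO fuelissuances VALUES (?, ?, ?, ?)", [
    (1, "2019-01-01", 12, 1200),
    (1, "2019-01-05", 15, 1500),
    (1, "2019-01-13", 18, 1800),
    (1, "2019-01-22", 10, 1000),
    (2, "2019-01-01", 12, 1260),
    (2, "2019-01-05", 19, 1995),
    (2, "2019-01-13", 28, 2940),
])

# Running totals per vehicle, ordered by date within each vehicle.
running = conn.execute("""
    SELECT vehicle, dates,
           SUM(quantity)  OVER w AS liters,
           SUM(totalcost) OVER w AS totalcost
    FROM fuelissuances
    WINDOW w AS (PARTITION BY vehicle ORDER BY dates)
    ORDER BY vehicle, dates
""").fetchall()

for row in running:
    print(row)
```

The last row of each vehicle carries that vehicle's total (55 liters and 5500 for vehicle 1, 59 liters and 6195 for vehicle 2), which is what a front end could display as the "total" line.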
You need to use a nested query to get those SUM you want inside that query.
SELECT i.vehicle AS vehicle,
i.costcenter AS costCenter,
i.department AS department,
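(SELECT SUM(f.quantity) FROM fuelissuances AS f
WHERE f.vehicle = i.vehicle AND f.dates::text LIKE '%2019-03%' AND f.deleted_at IS NULL) AS liters,
(SELECT SUM(f.totalcost) FROM fuelissuances AS f
WHERE f.vehicle = i.vehicle AND f.dates::text LIKE '%2019-03%' AND f.deleted_at IS NULL) AS Totalcost,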
v.model AS model,
v.vtype AS vtype
FROM fuelissuances AS i
LEFT JOIN vehicles AS v ON i.vehicle = v.id
WHERE i.dates::text LIKE '%2019-03%' AND i.deleted_at IS NULL;

PostgreSQL: calculation between first and last

I'm trying to write a query that calculates the number of days between the first and last score per id.
The data sample:
id date score
11 1/1/2017 25.34
4 1/2/2017 34.34
25 1/2/2017 15.78
4 3/2/2017 47.2
25 7/3/2017 65.21
11 9/3/2017 96.09
25 10/3/2017 11.3
4 10/3/2017 27.12
Which is far from what I need, but I'm really lost. Clueless to be honest. Any idea?
Thanks
Try this:
SELECT
customer_id,
date(last_score) - date(first_score) AS days_between_last_and_first_score,
total_score::float/(date(last_score) - date(first_score)) AS score_per_day
FROM
(
select customer_id,
MAX(date(purchase_date)) as last_score,
MIN(date(purchase_date)) as first_score,
SUM(score) AS total_score
FROM candidate_test_q1
group by customer_id
) AS sub_query

How can I evaluate data over time in Postgresql?

I need to find users who have posted three times or more, three months in a row. I wrote this query:
select count(id), owneruserid, extract(month from creationdate) as postmonth from posts
group by owneruserid, postmonth
having count(id) >=3
order by owneruserid, postmonth
And I get this:
count owneruserid postmonth
36 -1 1
23 -1 2
45 -1 3
41 -1 4
18 -1 5
24 -1 6
31 -1 7
78 -1 8
83 -1 9
17 -1 10
88 -1 11
127 -1 12
3 6 11
3 7 12
4 8 1
8 8 12
4 12 4
3 12 5
3 22 2
4 22 4
(truncated)
Which is great. How can I query for users who posted three times or more, three months or more in a row? Thanks.
This is called the Islands and Gaps problem, specifically it's an Island problem with a date range. You should,
Fix this question up.
Flag it to be sent to dba.stackexchange.com
To solve this,
Create a pseudo column with a window that has 1 if the row preceding it does not correspond to the preceding month
Create groups out of that with COUNT()
Check to make sure the count(*) for the group is greater than or equal to three.
Query,
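SELECT l.owneruserid, l.creationdaterange, count(*)
FROM (
SELECT t.owneruserid,
t.postmonth,
-- Running count of resets: each island of consecutive months gets its own number.
count(t.range_reset) OVER (PARTITION BY t.owneruserid ORDER BY t.postmonth) AS creationdaterange
FROM (
SELECT m.owneruserid,
m.postmonth,
CASE
WHEN m.postmonth - interval '1 month' = lag(m.postmonth) OVER (PARTITION BY m.owneruserid ORDER BY m.postmonth)
THEN NULL -- continues the island of the previous month
ELSE 1 -- starts a new island
END AS range_reset
FROM (
-- One row per user and month with three or more posts.
SELECT owneruserid,
date_trunc('month', creationdate) AS postmonth
FROM posts
GROUP BY owneruserid, date_trunc('month', creationdate)
HAVING count(*) >= 3
) AS m
) AS t
) AS l
GROUP BY l.owneruserid, l.creationdaterange
HAVING count(*) >= 3;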
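The island idea itself can be sketched outside SQL. The Python below is only an illustration: the function name consecutive_runs, the user names, and the month data are all made up, and months are encoded as year * 12 + month so that December and the following January count as consecutive.

```python
from itertools import groupby

def consecutive_runs(months):
    """Split a collection of month indexes into runs of consecutive values."""
    ordered = sorted(set(months))
    # Within a run of consecutive values, (value - position) stays constant.
    keyed = groupby(enumerate(ordered), key=lambda pair: pair[1] - pair[0])
    return [[month for _, month in group] for _, group in keyed]

# Hypothetical data: months in which each user posted three or more times.
qualifying = {
    "a": [2019 * 12 + m for m in (1, 2, 3, 7)],
    "b": [2019 * 12 + m for m in (4, 5, 9)],
    "c": [2018 * 12 + 12, 2019 * 12 + 1, 2019 * 12 + 2],
}

# Keep users that have at least one run of three or more months.
winners = sorted(
    user for user, months in qualifying.items()
    if any(len(run) >= 3 for run in consecutive_runs(months))
)
print(winners)  # ['a', 'c']
```

User "b" is dropped because the longest run is two months, while "c" qualifies because the run crosses the year boundary.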

How do I select the min opendate from a list of duplicates?

I have 3 columns. SSN|AccountNumber|OpenDate
1 SSN may have multiple AccountNumbers
Each AccountNumber has a corresponding OpenDate
In my list I have many SSN's, each containing several account numbers which may have been opened on different days.
I want the results of my query to be SSN|earliest OpenDate|AccountNumber that corresponds with the earliest OpenDate.
I'm dealing with about 200,000 records.
EDIT: First I did
select SSN, min(OpenDate), AcctNumber from Table Group By SSN, AccountNumber
but that didn't quite give me the correct data.
The raw data gives me something like this:
SSN | AcctNumber | OpenDate
---------------------------
10 101 Jan
10 102 Feb
10 103 Mar
From this I got 10, Jan, and AcctNumber 102, which is not the account number associated with the Jan OpenDate. After looking at others, I found that the account number I got was just one of the account numbers associated with that SSN rather than the one that corresponds to min(OpenDate).
WITH CTE AS (
SELECT SSN, AcctNumber, OpenDate, ROW_NUMBER() OVER (PARTITION BY SSN ORDER BY OpenDate ASC) AS RN
FROM tbl -- tbl is a placeholder for your table name
)
SELECT SSN, AcctNumber, OpenDate FROM CTE WHERE RN = 1;
If your table is like this:
SSN | AcctNumber | OpenDate
---------------------------
10 101 April
10 101 May
10 102 April
20 201 June
20 201 July
Do you want your query to return this?
SSN | AcctNumber | OpenDate
---------------------------
10 101 April
10 102 April
20 201 June
Then you would use this query:
select ssn, min(OpenDate), acctNumber from tbl group by ssn, acctNumber
You can try this:
select SSN , AcctNumber, OpenDate
from (SELECT SSN , AcctNumber, OpenDate
, ROW_NUMBER() OVER ( PARTITION BY SSN ORDER BY OpenDate ASC ) AS RN
FROM table) AS temp
WHERE temp.RN= 1
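A quick way to convince yourself that the ROW_NUMBER() approach returns the account that actually belongs to the earliest date is to run it against a toy table. The sketch below uses Python's built-in sqlite3 module (SQLite 3.25 or newer assumed for window functions). The table name accounts and the ISO dates are assumptions, since the question only gives month names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (SSN INTEGER, AcctNumber INTEGER, OpenDate TEXT)")
# Rows deliberately inserted out of date order; dates are made-up stand-ins for Jan/Feb/Mar.
conn.executemany("INSERT INTO accounts VALUES (?, ?, ?)", [
    (10, 102, "2020-02-01"),
    (10, 101, "2020-01-01"),
    (10, 103, "2020-03-01"),
])

# Number the rows of each SSN from earliest to latest OpenDate and keep the first.
earliest = conn.execute("""
    SELECT SSN, AcctNumber, OpenDate
    FROM (SELECT SSN, AcctNumber, OpenDate,
                 ROW_NUMBER() OVER (PARTITION BY SSN ORDER BY OpenDate ASC) AS RN
          FROM accounts) AS temp
    WHERE temp.RN = 1
""").fetchall()
print(earliest)  # [(10, 101, '2020-01-01')]
```

Because the whole row is selected by its rank, AcctNumber and OpenDate always come from the same record, unlike min(OpenDate) combined with an unrelated AcctNumber.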

How do I add totals/subtotals to a set of results without grouping the row data?

I'm constructing a SQL query for a business report. I need to have both subtotals (grouped by file number) and grand totals on the report.
I'm entering unknown SQL territory, so this is a bit of a first attempt. The query I made is almost working. The only problem is that the entries are being grouped -- I need them separated in the report.
Here is my sample data:
FileNumber Date Cost Charge
3 Dec 22/09 5 10
3 Jan 13/10 6 15
3B Mar 28/10 1 3
3B Mar 28/10 5 10
When I run this query
SELECT
CASE
WHEN (GROUPING(FileNumber) = 1) THEN NULL
ELSE FileNumber
END AS FileNumber,
CASE
WHEN (GROUPING(Date) = 1) THEN NULL
ELSE Date
END AS Date,
SUM(Cost) AS Cost,
SUM(Charge) AS Charge
FROM SubtotalTesting
GROUP BY FileNumber, Date WITH ROLLUP
ORDER BY
(CASE WHEN FileNumber IS NULL THEN 1 ELSE 0 END), -- Put NULLs after data
FileNumber,
(CASE WHEN Date IS NULL THEN 1 ELSE 0 END), -- Put NULLs after data
Date
I get the following:
FileNumber Date Cost Charge
3 Dec 22/09 5 10
3 Jan 13/10 6 15
3 NULL 11 25
3B Mar 28/10 6 13 <--
3B NULL 6 13
NULL NULL 17 38
What I want is:
FileNumber Date Cost Charge
3 Dec 22/09 5 10
3 Jan 13/10 6 15
3 NULL 11 25
3B Mar 28/10 1 3 <--
3B Mar 28/10 5 10 <--
3B NULL 6 13
NULL NULL 17 38
I can clearly see why the entries are being grouped, but I have no idea how to separate them while still returning the subtotals and grand total.
I'm a bit green when it comes to doing advanced SQL queries like this, so if I'm taking the wrong approach to the problem by using WITH ROLLUP, please suggest some preferred alternatives -- you don't have to write the whole query for me, I just need some direction. Thanks!
WITH SubtotalTesting (FileNumber, Date, Cost, Charge) AS
(
SELECT '3', CAST('20091222' AS DATETIME), 5, 10
UNION ALL
SELECT '3', '20100113', 6, 15
UNION ALL
SELECT '3B', '20100328', 1, 3
UNION ALL
SELECT '3B', '20100328', 5, 10
),
q AS (
SELECT *,
ROW_NUMBER() OVER (ORDER BY filenumber) AS rn
FROM SubTotalTesting
)
SELECT rn,
CASE
WHEN (GROUPING(FileNumber) = 1) THEN NULL
ELSE FileNumber
END AS FileNumber,
CASE
WHEN (GROUPING(Date) = 1) THEN NULL
ELSE Date
END AS Date,
SUM(Cost) AS Cost,
SUM(Charge) AS Charge
FROM q
GROUP BY
FileNumber, Date, rn WITH ROLLUP
HAVING GROUPING(rn) <= GROUPING(Date)
ORDER BY
(CASE WHEN FileNumber IS NULL THEN 1 ELSE 0 END),
FileNumber,
(CASE WHEN Date IS NULL THEN 1 ELSE 0 END),
Date