t-sql subquery and groupby


My query is below. It counts all jobs per full name.
SELECT COUNT(sy.FullName) [Count Job],
sy.FullName [FullName],
MIN(CAST(i.vrp_notificationdate AS DATE)) [Oldest Date]
FROM BusinessUnit AS b
INNER JOIN SystemUser AS sy
ON b.BusinessUnitId = sy.BusinessUnitId
INNER JOIN Incident AS i
ON i.OwnerId = sy.SystemUserId
GROUP BY sy.FullName
This query returns the following table:
---------------------------------
Count Job FullName Oldest Date
10 a 2011-10-11
20 B 2011-10-11
55 C 2011-10-11
---------------------------------
But I want to produce a table like the one below, for example:
--------------------------------------------------------------
Count Job FullName Oldest Date Open Job Close Job
10 A 2011-10-11 5 5
20 B 2011-10-11 13 7
55 C 2011-10-11 48 7
------------------------------------------------------------
My Incident table has a status code column; a status code of 5 means the job is closed. When I add the status code to the GROUP BY, I get the table below, which is not what I want.
---------------------------------
Count Job FullName Oldest Date
10 a 2011-10-11
13 B 2011-10-11
48 C 2011-10-11
7 B 2011-10-11
7 C 2011-10-11
---------------------------------
When I use UNION in my T-SQL, I get this error: "all queries combined using a UNION, INTERSECT or EXCEPT operator must have an equal number of expressions in their target lists."
How exactly can I solve this? Any suggestions?
Thanks.

How about using CASE and SUM?
SELECT COUNT(sy.FullName) [Count Job],
sy.FullName [FullName],
MIN(CAST(i.vrp_notificationdate AS DATE)) [Oldest Date],
SUM(CASE i.status
WHEN 5 THEN 0
ELSE 1 END) [Open Jobs],
SUM(CASE i.status
WHEN 5 THEN 1
ELSE 0 END) [Closed Jobs]
FROM BusinessUnit AS b
INNER JOIN SystemUser AS sy
ON b.BusinessUnitId = sy.BusinessUnitId
INNER JOIN Incident AS i
ON i.OwnerId = sy.SystemUserId
GROUP BY sy.FullName
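For anyone who wants to try the CASE/SUM idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The incident table, its columns, and the rows are made up for illustration (they are not the asker's schema); status 5 is treated as closed, as described in the question.

```python
import sqlite3

# Hypothetical, simplified data: one row per job; status 5 means closed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE incident (owner TEXT, status INTEGER, notified TEXT);
INSERT INTO incident VALUES
  ('A', 5, '2011-10-11'), ('A', 1, '2011-10-12'), ('A', 5, '2011-10-13'),
  ('B', 1, '2011-10-11'), ('B', 2, '2011-10-14');
""")

# One row per owner: the CASE expressions turn each job into 0 or 1,
# and SUM adds them up inside the group.
rows = conn.execute("""
SELECT owner,
       COUNT(*)      AS count_job,
       MIN(notified) AS oldest_date,
       SUM(CASE WHEN status = 5 THEN 0 ELSE 1 END) AS open_jobs,
       SUM(CASE WHEN status = 5 THEN 1 ELSE 0 END) AS closed_jobs
FROM incident
GROUP BY owner
ORDER BY owner
""").fetchall()

for row in rows:
    print(row)
```

Because the status is counted inside the aggregate rather than added to the GROUP BY, each owner stays on a single row.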

Related

Query to assign max date among child items to parent item

I've data in two Postgres tables as below
table1
wid w.name owner
1 abc own1
2 def own2
3 ghi own3
table2
vid wid vname date
9 1 vnam1 10-7-2020
10 1 vnam1 10-8-2018
11 1 vnam2 10-9-2019
12 1 vnam2 10-8-2020
13 2 vnam3 10-10-2017
14 2 vnam3 10-08-2020
15 2 vnam4 10-10-2018
16 2 vnam4 10-10-2019
17 3 vnam5 10-06-2016
18 3 vnam5 10-07-2020
19 3 vnam6 10-08-2020
I was able to get the max date for each table2 vname related to a w.name, but I'm looking for a result like the one below, so that I can determine the max date for each w.name.
wid w.name owner vname maxdate
1 abc own1 vnam2 10-08-2020 (Max date out of 4 values of vnames)
2 def own2 vnam3 10-08-2020
3 ghi own3 vnam6 10-08-2020

Use DISTINCT ON to achieve this.
select distinct on (t1.wid)
t1.wid, t1."w.name", t1.owner, t2.vname, t2.date
from table1 t1
join table2 t2 on t2.wid = t1.wid
order by t1.wid, t2.date desc;
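DISTINCT ON is PostgreSQL-specific. As a rough sketch of the same "latest row per wid" idea in a form that can be run anywhere, the example below swaps in ROW_NUMBER() and runs it through Python's sqlite3 module (this needs SQLite 3.25 or later for window functions). The column names wname and vdate and the ISO-formatted dates are assumptions made for the example, not the asker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (wid INTEGER, wname TEXT, owner TEXT);
CREATE TABLE table2 (vid INTEGER, wid INTEGER, vname TEXT, vdate TEXT);
INSERT INTO table1 VALUES (1, 'abc', 'own1'), (2, 'def', 'own2');
INSERT INTO table2 VALUES
  (9,  1, 'vnam1', '2020-07-10'), (10, 1, 'vnam1', '2018-08-10'),
  (11, 1, 'vnam2', '2019-09-10'), (12, 1, 'vnam2', '2020-08-10'),
  (13, 2, 'vnam3', '2017-10-10'), (14, 2, 'vnam3', '2020-08-10'),
  (15, 2, 'vnam4', '2018-10-10'), (16, 2, 'vnam4', '2019-10-10');
""")

# Number the child rows of each wid from newest to oldest, then keep row 1.
rows = conn.execute("""
SELECT wid, wname, owner, vname, vdate
FROM (
  SELECT t1.wid, t1.wname, t1.owner, t2.vname, t2.vdate,
         ROW_NUMBER() OVER (PARTITION BY t1.wid ORDER BY t2.vdate DESC) AS rn
  FROM table1 t1
  JOIN table2 t2 ON t2.wid = t1.wid
)
WHERE rn = 1
ORDER BY wid
""").fetchall()

for row in rows:
    print(row)
```

Note that sorting text dates only works here because they are stored as ISO yyyy-mm-dd strings; in PostgreSQL a real date column avoids that concern.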

How do I write a GROUP BY query in PostgreSQL

I'm getting errors from PostgreSQL when writing a GROUP BY query.
I'm sure someone will tell me to put all the selected columns in the GROUP BY, but that will not give me the correct results.
I'm writing a query that selects all the vehicles in the database and groups the results by vehicle, giving me the total distance and cost for a given period.
Here is my query:
SELECT i.vehicle AS vehicle,
i.costcenter AS costCenter,
i.department AS department,
SUM(i.quantity) AS liters,
SUM(i.totalcost) AS Totalcost,
v.model AS model,
v.vtype AS vtype
FROM fuelissuances AS i
LEFT JOIN vehicles AS v ON i.vehicle = v.id
WHERE i.dates::text LIKE '%2019-03%' AND i.deleted_at IS NULL
GROUP BY i.vehicle;
If I put all the columns from the SELECT list in the GROUP BY, the results will not be correct.
How do I go about this without putting all the columns in the GROUP BY or creating sub-queries?
The fuel table looks like:
vehicle dates department quantity totalcost
1 2019-01-01 102 12 1200
1 2019-01-05 102 15 1500
1 2019-01-13 102 18 1800
1 2019-01-22 102 10 1000
2 2019-01-01 102 12 1260
2 2019-01-05 102 19 1995
2 2019-01-13 102 28 2940
Vehicle Table
id model vtype
1 1 2
2 4 6
2 5 7
These are the results I expect from the query:
vehicle dates department quantity totalcost model vtype
1 2019-01-01 102 12 1200 1 2
1 2019-01-05 102 15 1500 1 2
1 2019-01-13 102 18 1800 1 2
1 2019-01-22 102 10 1000 1 2
1 2019-01-18 102 10 1000 1 2
1 65 6500
2 2019-01-01 102 12 1260 5 7
2 2019-01-05 102 19 1995 5 7
2 2019-01-13 102 28 2940 5 7
1 45 6195

Your query doesn't really make sense. Apparently there can be multiple departments and costcenters per vehicle in the fuelissuances table - which of those should be returned?
One way to deal with that is to return all of them, e.g. as an array:
SELECT i.vehicle,
array_agg(i.costcenter) as costcenters,
array_agg(i.department) as departments,
SUM(i.quantity) AS liters,
SUM(i.totalcost) AS Totalcost,
v.model,
v.vtype
FROM fuelissuances AS i
LEFT JOIN vehicles AS v ON i.vehicle = v.id
WHERE i.dates >= date '2019-03-01'
and i.dates < date '2019-04-01'
AND i.deleted_at IS NULL
group by i.vehicle, v.model, v.vtype;
Instead of an array, you could also return a comma-separated list of those values, e.g. string_agg(i.costcenter, ',') as costcenters.
Adding the columns v.model and v.vtype won't (shouldn't) change anything as the group by i.vehicle will only return a single vehicle anyway and thus the model and vtype won't change for that in the group.
Note that I removed the useless aliases and replaced the condition on the date with a proper range condition that can make use of an index on the dates column.
Edit
Based on your new sample data, you want a running total rather than a "regular" aggregation. This can easily be done using window functions:
SELECT i.vehicle,
i.costcenter,
i.department,
SUM(i.quantity) over (w) AS liters,
SUM(i.totalcost) over (w) AS Totalcost,
v.model,
v.vtype
FROM fuelissuances AS i
LEFT JOIN vehicles AS v ON i.vehicle = v.id
WHERE i.dates >= date '2019-01-01'
and i.dates < date '2019-02-01'
AND i.deleted_at IS NULL
window w as (partition by i.vehicle order by i.dates)
order by i.vehicle, i.dates;
I would not create those "total" lines using SQL, but rather in the front end that displays the data.
Online example: https://rextester.com/CRJZ27446
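To see what the running total looks like on a few rows, here is a small sketch that runs the same kind of window query through Python's sqlite3 module (SQLite 3.25 or later). The table is cut down to four columns and the rows are taken from the first lines of the sample data in the question; everything else about the setup is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fuelissuances (vehicle INTEGER, dates TEXT,
                            quantity INTEGER, totalcost INTEGER);
INSERT INTO fuelissuances VALUES
  (1, '2019-01-01', 12, 1200), (1, '2019-01-05', 15, 1500),
  (1, '2019-01-13', 18, 1800),
  (2, '2019-01-01', 12, 1260), (2, '2019-01-05', 19, 1995);
""")

# With ORDER BY inside the window, SUM accumulates row by row
# within each vehicle instead of collapsing the group.
rows = conn.execute("""
SELECT vehicle, dates,
       SUM(quantity)  OVER w AS liters,
       SUM(totalcost) OVER w AS totalcost
FROM fuelissuances
WINDOW w AS (PARTITION BY vehicle ORDER BY dates)
ORDER BY vehicle, dates
""").fetchall()

for row in rows:
    print(row)
```

The last row of each vehicle carries that vehicle's grand total, which is one way a front end could pick up the "total" line.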

You need to use a nested query to get the SUMs you want inside that query.
SELECT i.vehicle AS vehicle,
i.costcenter AS costCenter,
i.department AS department,
(SELECT SUM(i.quantity) FROM TABLES WHERE CONDITIONS GROUP BY vehicle) AS liters,
(SELECT SUM(i.totalcost) FROM TABLES WHERE CONDITIONS GROUP BY vehicle) AS Totalcost,
v.model AS model,
v.vtype AS vtype
FROM fuelissuances AS i
LEFT JOIN vehicles AS v ON i.vehicle = v.id
WHERE i.dates::text LIKE '%2019-03%' AND i.deleted_at IS NULL;

PostgreSQL window function & difference between dates

Suppose I have data formatted in the following way (FYI, total row count is over 30K):
customer_id order_date order_rank
A 2017-02-19 1
A 2017-02-24 2
A 2017-03-31 3
A 2017-07-03 4
A 2017-08-10 5
B 2016-04-24 1
B 2016-04-30 2
C 2016-07-18 1
C 2016-09-01 2
C 2016-09-13 3
I need a 4th column, call it days_since_last_order, which is 0 where order_rank = 1 and otherwise the number of days since the previous order (the one with rank n-1).
So, the above would return:
customer_id order_date order_rank days_since_last_order
A 2017-02-19 1 0
A 2017-02-24 2 5
A 2017-03-31 3 35
A 2017-07-03 4 94
A 2017-08-10 5 38
B 2016-04-24 1 0
B 2016-04-30 2 6
C 2016-07-18 1 0
C 2016-09-01 2 45
C 2016-09-13 3 12
Is there an easier way to calculate the above with a window function (or similar) rather than join the entire dataset against itself (eg. on A.order_rank = B.order_rank - 1) and doing the calc?
Thanks!

Use the LAG window function:
SELECT
customer_id
, order_date
, order_rank
, COALESCE(
DATE(order_date)
- DATE(LAG(order_date) OVER (PARTITION BY customer_id ORDER BY order_date))
, 0) AS days_since_last_order
FROM <table_name>
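Here is a quick way to check the LAG approach on a few of the sample rows, sketched with Python's sqlite3 module (SQLite 3.25 or later). SQLite has no date type, so julianday() stands in for PostgreSQL's date subtraction; the table name orders is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customer_id TEXT, order_date TEXT);
INSERT INTO orders VALUES
  ('A', '2017-02-19'), ('A', '2017-02-24'), ('A', '2017-03-31'),
  ('C', '2016-07-18'), ('C', '2016-09-01');
""")

# LAG returns NULL for the first order of each customer,
# so COALESCE turns that missing difference into 0.
rows = conn.execute("""
SELECT customer_id,
       order_date,
       COALESCE(
         CAST(julianday(order_date)
              - julianday(LAG(order_date) OVER (
                    PARTITION BY customer_id ORDER BY order_date))
              AS INTEGER),
         0) AS days_since_last_order
FROM orders
ORDER BY customer_id, order_date
""").fetchall()

for row in rows:
    print(row)
```

The PARTITION BY is what keeps customer C's first order from being compared with another customer's last order.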

TSQL Calculating daily sum across records with overlapping date ranges

I have 3 tables. The first (PortfolioInstrument) holds the instruments (Instrument) held in a portfolio, with the holding (Holding) across a date range (DateAdded, DateRemoved).
Another (Price) holds daily (TradeDate) closing prices ([Close]) for each instrument ( Instrument).
A 3rd may be useful, (CalcDate) holds the dates (CalcDate) that we re-calculate the holdings and add and delete instruments from the portfolio.
SELECT SUM([Close]*Holding), TradeDate
FROM Price p1 INNER JOIN PortfolioInstrument pio ON pio.Instrument = p1.Instrument
AND pio.Portfolio = 3
WHERE EXISTS (SELECT TradeDate FROM Price p
INNER JOIN PortfolioInstrument pi ON pi.Instrument = p.Instrument AND Portfolio = 3
WHERE TradeDate >= pi.DateAdded AND
(TradeDate < pi.DateRemoved OR pi.DateRemoved IS NULL)
AND p1.ID = p.ID GROUP BY TradeDate) GROUP BY TradeDate
Here is a sample of the PortfolioInstrument data set
ID Portfolio Instrument Holding DateAdded DateRemoved
16256 3 410 714.28571 2007-10-01 00:00:00.0 2007-11-01 00:00:00.0
16257 3 611 564.97174 2007-10-01 00:00:00.0 2007-11-01 00:00:00.0
16258 3 538 1,797.75281 2007-10-01 00:00:00.0 2007-11-01 00:00:00.0
...
16302 3 5352 1,067,319.75 2008-02-01 00:00:00.0 2008-04-01 00:00:00.0
16303 3 5353 1,057,800.875 2008-02-01 00:00:00.0 2008-04-01 00:00:00.0
16304 3 11952 0 2008-02-29 00:00:00.0 2008-04-01 00:00:00.0
16305 3 11952 261,484,400 2008-04-01 00:00:00.0 2008-05-01 00:00:00.0
...
16315 3 8374 14,199.99902 2009-01-30 00:00:00.0 <null>
16316 3 11952 246,102,960 2009-01-30 00:00:00.0 2009-02-27 00:00:00.0
16317 3 11952 246,148,912 2009-02-27 00:00:00.0 2009-04-01 00:00:00.0
The problem with this is that it includes all Holdings that have a DateRemoved < TradeDate, so there is a jump at each re-calculation date, where they should be removed from the set. I had a look at various DateDiff methods on Stack Overflow but cannot work out how to group using them in this case. Also note that the cash instrument (Instrument = 11952) comes into the portfolio at some point and then gets an entry for every month thereafter. As you can see, it reduces to 0 for some months, but I think this should not matter in the SQL produced.
Thx.
David

It is not very clear why you are using another instance of the same join like that. If you want to exclude particular holdings where DateRemoved <= TradeDate, you could just check that directly in the WHERE clause:
SELECT SUM(p1.[Close]*pio.Holding), TradeDate
FROM Price p1
INNER JOIN PortfolioInstrument pio
ON pio.Instrument = p1.Instrument AND pio.Portfolio = 3
WHERE p1.TradeDate >= pio.DateAdded
AND (p1.TradeDate < pio.DateRemoved OR pio.DateRemoved IS NULL)
GROUP BY p1.TradeDate
;
However, if you want to discard an entire group of same TradeDate rows in which at least one row satisfies the condition DateRemoved <= TradeDate, you could use a HAVING clause, like this:
SELECT SUM(p1.[Close]*pio.Holding), TradeDate
FROM Price p1
INNER JOIN PortfolioInstrument pio
ON pio.Instrument = p1.Instrument AND pio.Portfolio = 3
GROUP BY p1.TradeDate
HAVING COUNT(CASE WHEN pio.DateRemoved <= p1.TradeDate THEN 1 END) = 0
;
Unlike a WHERE clause, which applies to individual rows, HAVING is evaluated for a group of rows. In this case, the COUNT() function counts how many rows in the group have pio.DateRemoved <= p1.TradeDate. If there is at least one, the group is discarded from the output, because the requirement I'm assuming here is that there be no such rows.
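To illustrate the per-row filter from the first query (keep a holding only on trade dates inside its DateAdded/DateRemoved range), here is a small sketch using Python's sqlite3 module. The two instruments, their prices, and the holdings are invented for the example, and the Portfolio column is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE price (instrument INTEGER, tradedate TEXT, close REAL);
CREATE TABLE portfolioinstrument (instrument INTEGER, holding REAL,
                                  dateadded TEXT, dateremoved TEXT);
INSERT INTO price VALUES
  (410, '2007-10-15', 2.0), (410, '2007-11-15', 3.0),
  (611, '2007-10-15', 1.0), (611, '2007-11-15', 1.0);
-- 410 leaves the portfolio on 2007-11-01; 611 is still held (NULL).
INSERT INTO portfolioinstrument VALUES
  (410, 100.0, '2007-10-01', '2007-11-01'),
  (611,  50.0, '2007-10-01', NULL);
""")

# A price row contributes only while the holding is active on that date.
rows = conn.execute("""
SELECT p1.tradedate, SUM(p1.close * pio.holding) AS value
FROM price p1
JOIN portfolioinstrument pio ON pio.instrument = p1.instrument
WHERE p1.tradedate >= pio.dateadded
  AND (p1.tradedate < pio.dateremoved OR pio.dateremoved IS NULL)
GROUP BY p1.tradedate
ORDER BY p1.tradedate
""").fetchall()

for row in rows:
    print(row)
```

On the second trade date only instrument 611 is counted, which is the "removed holdings drop out" behaviour the question asks for.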

I could not find a way forward with this and ended up pulling in the raw rows and doing the calculations in code.

Retrieve information dynamically from multiple CTE

I have multiple CTEs and I want to retrieve some information from a couple of them into next CTE.
So, I have this information from one of the CTEs:
PeriodID StartDate
1 2006-01-01
2 2007-04-25
3 2008-08-16
4 2009-12-08
5 2011-04-17
and this from other:
RecordID Date
100 2007-04-15
101 2008-05-21
102 2008-06-06
103 2008-07-01
104 2009-11-12
And I need to show in next one:
RecordID Date PeriodID
100 2007-04-15 1
101 2008-05-21 2
102 2008-06-06 2
103 2008-07-01 2
104 2009-11-12 3
I can use a CASE/WHEN statement to decide whether the date of a record falls in period 1, 2, 3, 4 or 5, but in some situations the first CTE can return a different number of periods.
Is there a way to do this in the above context?

You can have multiple CTEs defined as follows, and then select from and join them as you would any other table.
with cte1 as (select * ...),
cte2 as (select * ...)
select
cte2.*,
periodid
from cte2
cross apply
(select top 1 * from cte1 where cte2.recorddate > cte1.startdate order by startdate desc) v
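As a rough illustration of the lookup (for each record, take the latest period that started before the record's date), here is a sketch using Python's sqlite3 module. SQLite has no CROSS APPLY or TOP, so a correlated subquery with LIMIT 1 is swapped in; the CTE names periods and records and the column recorddate are assumptions, and the rows come from the question's sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two CTEs stand in for the asker's earlier CTEs; the final SELECT
# picks, per record, the most recent period that started before it.
rows = conn.execute("""
WITH periods(periodid, startdate) AS (
  VALUES (1, '2006-01-01'), (2, '2007-04-25'),
         (3, '2008-08-16'), (4, '2009-12-08')
),
records(recordid, recorddate) AS (
  VALUES (100, '2007-04-15'), (101, '2008-05-21'), (104, '2009-11-12')
)
SELECT r.recordid,
       r.recorddate,
       (SELECT p.periodid
        FROM periods p
        WHERE p.startdate < r.recorddate
        ORDER BY p.startdate DESC
        LIMIT 1) AS periodid
FROM records r
ORDER BY r.recordid
""").fetchall()

for row in rows:
    print(row)
```

Since the period is found by comparing dates rather than by a fixed CASE list, it keeps working when the first CTE returns more or fewer periods.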