TSQL Conditional joins & referencing table dimensions - tsql

I'm trying to find out how long a sales opportunity spends in each of 5 different stages. The problem comes down to flattening data from Salesforce. The current structure is below:
Table: OppHistory
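TaskID | OppID | STAGE | IQ | DISCO | TECH | COMMERCIAL | CLOSE
1      | Op1   | IQ    | 5  | 0     | 0    | 0          | 0
2      | Op1   | DISCO | 0  | 10    | 0    | 0          | 0
3      | Op1   | TECH  | 0  | 0     | 15   | 0          | 0
4      | Op1   | COMM  | 0  | 0     | 0    | 5          | 0
5      | Op2   | IQ    | 2  | 0     | 0    | 0          | 0
6      | Op2   | CLOSE | 0  | 0     | 0    | 0          | 3
7      | Op3   | IQ    | 3  | 0     | 0    | 0          | 0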
7 Op3 IQ 3 0 0 0 0
The above table has ~417k rows. Only 49k individual opportunities have ever been created, so 49k is the target row count for my table.
I've tried the joins below. I feel as if I'm on the 1-yard line and just need a push.
The query below gives me a table with 39M rows:
SELECT a.OpportunityID, b.IQ, c.Disco, d.Tech, e.Commercial, f.Close
FROM OppHistory a
JOIN OppHistory b ON a.OppID = b.OppID
JOIN OppHistory c ON b.OppID = c.OppID
JOIN OppHistory d ON c.OppID = d.OppID
JOIN OppHistory e ON d.OppID = e.OppID
JOIN OppHistory f ON e.OppID = f.OppID
The below gives me a table with ~50k rows, however, only a select few opportunities are included, and each OppID is repeated around 500 times
SELECT DISTINCT a.OpportunityID, b.IQ, c.Disco, d.Tech, e.Commercial,
f.Close
FROM OppHistory a
JOIN OppHistory b ON a.OppID = b.OppID
JOIN OppHistory c ON b.OppID = c.OppID
JOIN OppHistory d ON c.OppID = d.OppID
JOIN OppHistory e ON d.OppID = e.OppID
JOIN OppHistory f ON e.OppID = f.OppID
The below gives me 0 results in my table
SELECT b.OppID, Duration AS IQDuration, Duration AS DiscoDuration,
Duration AS TEDuration, Duration AS CommercialsDuration, Duration AS
ClosedDuration
FROM OppHistory b,(
SELECT q.IQ AS Duration FROM OppHistory q
UNION
SELECT d.Disco AS Duration FROM OppHistory d
UNION
SELECT t.Tech AS Duration FROM OppHistory t
UNION
SELECT c.Comm AS Duration FROM OppHistory c
UNION
SELECT l.Closed AS Duration FROM OppHistory l
)a
What I need is the duration per individual stage in separate columns, with one row for each distinct opportunity:
OppID | IQ | Disco | TechEval | Commercial | Close
Op1   | 5  | 10    | 15       | 5          | 0
Op2   | 2  | 0     | 0        | 0          | 3
Op3   | 3  | 0     | 0        | 0          | 0

I think you're after this:
SELECT OppID, SUM(IQ) AS IQ, SUM(Disco) AS Disco, SUM(Tech) AS Tech, SUM(Commercial) AS Commercial, SUM([Close]) AS [Close] -- CLOSE is a reserved word in T-SQL, so it is bracketed
FROM OppHistory
GROUP BY OppID
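To check the GROUP BY approach against the sample rows from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, which is an assumption made only so the example is self-contained; the function name flatten_opp_history is hypothetical, and the column names follow the question's table header.

```python
import sqlite3

# Sample rows copied from the question's OppHistory table.
SAMPLE_ROWS = [
    (1, "Op1", "IQ", 5, 0, 0, 0, 0),
    (2, "Op1", "DISCO", 0, 10, 0, 0, 0),
    (3, "Op1", "TECH", 0, 0, 15, 0, 0),
    (4, "Op1", "COMM", 0, 0, 0, 5, 0),
    (5, "Op2", "IQ", 2, 0, 0, 0, 0),
    (6, "Op2", "CLOSE", 0, 0, 0, 0, 3),
    (7, "Op3", "IQ", 3, 0, 0, 0, 0),
]


def flatten_opp_history(rows):
    """Return one row per OppID with the per-stage durations summed."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
    # "Close" is quoted because it is a reserved word in T-SQL.
    conn.execute(
        "CREATE TABLE OppHistory (TaskID INTEGER, OppID TEXT, Stage TEXT, "
        "IQ INTEGER, Disco INTEGER, Tech INTEGER, Commercial INTEGER, "
        '"Close" INTEGER)'
    )
    conn.executemany(
        "INSERT INTO OppHistory VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows
    )
    # No self-join: grouping by OppID collapses the stage rows into one.
    result = conn.execute(
        "SELECT OppID, SUM(IQ), SUM(Disco), SUM(Tech), SUM(Commercial), "
        'SUM("Close") FROM OppHistory GROUP BY OppID ORDER BY OppID'
    ).fetchall()
    conn.close()
    return result
```

Calling flatten_opp_history(SAMPLE_ROWS) yields one tuple per opportunity. The self-joins in the question multiply rows because every history row of an opportunity matches every other one, whereas a single GROUP BY never produces more rows than there are distinct OppID values.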

Related

Conditional Counting record in PostgreSQL

I have a table such as the following
SP MA SL NG
jame j001 1 20200715 |
jame j001 -1 20200715 | -> count is 0
pink p002 3 20200730 }
pink p002 -3 20200730 } => count is 0
jack j002 12 20200731 | => count is 1
jack j002 -2 20200731 |
jack j002 12 20200801 } => count is 1
I want to count records and get a result like:
SP count
jame 0
pink 0
jack 2
I could do with some help, please. Thank you!
How the result is to be reached:
If SP, MA, and NG are the same, sum SL.
If the sum is 0, the count is 0; if the sum is not 0, the count is 1.
If NG or SP is not the same, the count is 1.
As I understand your requirements:
If SP, MA, and NG are the same, sum SL.
If the sum is 0, the count is 0; if the sum is not 0, the count is 1.
If NG or SP is not the same, the count is 1.
Try the query below:
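with cte as (
-- one row per (sp, ma, ng) group whose SL values do not cancel out
select sp, ma, ng, sum(sl) as total from example group by sp, ma, ng having sum(sl) <> 0
),
cte1 as (
-- every sp, so that names with no surviving group still show up with 0
select distinct sp from example
)
select
t1.sp,
sum(case when t2.total <> 0 then 1 else 0 end) as "count"
from cte1 t1 left join cte t2 on t1.sp = t2.sp
group by t1.sp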
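The counting rules can be tried out on the question's sample rows with a small sketch. It runs the SQL through Python's sqlite3 module rather than PostgreSQL (an assumption for self-containment), uses the table name example from the answer, and collapses the two CTEs into one by keeping the zero-sum groups and counting only the non-zero ones. The function name count_nonzero_groups is hypothetical.

```python
import sqlite3

# Rows from the question: (SP, MA, SL, NG).
SAMPLE_ROWS = [
    ("jame", "j001", 1, 20200715),
    ("jame", "j001", -1, 20200715),
    ("pink", "p002", 3, 20200730),
    ("pink", "p002", -3, 20200730),
    ("jack", "j002", 12, 20200731),
    ("jack", "j002", -2, 20200731),
    ("jack", "j002", 12, 20200801),
]


def count_nonzero_groups(rows):
    """Count, per SP, the (SP, MA, NG) groups whose SL values do not sum to 0."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE example (sp TEXT, ma TEXT, sl INTEGER, ng INTEGER)")
    conn.executemany("INSERT INTO example VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        WITH grp AS (
            -- step 1: sum SL within each (sp, ma, ng) group
            SELECT sp, ma, ng, SUM(sl) AS total
            FROM example
            GROUP BY sp, ma, ng
        )
        -- step 2: a group counts as 1 only when its sum is not 0
        SELECT sp, SUM(CASE WHEN total <> 0 THEN 1 ELSE 0 END) AS cnt
        FROM grp
        GROUP BY sp
        ORDER BY sp
        """
    ).fetchall()
    conn.close()
    return result
```

Because the zero-sum groups stay in the intermediate result, every SP still appears in the output, so no separate list of distinct SP values and no LEFT JOIN is needed in this variant.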

sum of pivoted columns into new column and insert all records to a temporary table

I want to get the sum of the pivoted column values in a new column and insert the output records into a temporary table.
select *
from (
    select v.JobNo as JobNo,
           aj.VehicleNumber as VehicleNo,
           isnull(g.ImageCount, 0) as ImageCount,
           s.ParamKey as ImageType
    from dbo.Visits v
    inner join (select VisitId as visit, paramkey, Value from dbo.VisitParams) s
        on s.visit = v.visitid
    left outer join (select VisitId, FieldId, count(*) as ImageCount
                     from dbo.vw_ImageGallery
                     group by FieldId, VisitId) g
        on s.visit = g.VisitId and g.FieldId = s.ParamKey
    inner join Users u on u.UserId = v.CreatedBy
    inner join AssignedJobs aj on aj.CSRCode = u.Code and aj.JobNumber = v.JobNo
    where v.VisitType = 1
      and v.TimeVisited >= '2019-03-01'
      and v.TimeVisited <= '2019-04-01'
) as a
pivot (max([ImageCount]) for [ImageType] in ([5], [20], [21])) as pvt
order by [JobNo]
My actual output is:
job no vehicleno 1 2 5
---------------------------------------------------------
BL1052385 648792 0 8 0
BL1054161 CAT2410 2 8 0
BL1107290 NB 0134 0 5 0
BL1174714 GP 3784 1 7 3
I expect output like:
job no vehicleno 1 2 5 Total Count
----------------------------------------------------------
BL1052385 648792 0 8 0 8
BL1054161 CAT2410 2 8 0 10
BL1107290 NB 0134 0 5 0 5
BL1174714 GP 3784 1 7 3 11
I prefer using conditional aggregation rather than pivot. It is more flexible:
select v.JobNo, aj.VehicleNumber,
sum(case when vp.ParamKey = 5 then coalesce(g.ImageCount, 0) else 0 end) as imagetype_5,
sum(case when vp.ParamKey = 20 then coalesce(g.ImageCount, 0) else 0 end) as imagetype_20,
sum(case when vp.ParamKey = 21 then coalesce(g.ImageCount, 0) else 0 end) as imagetype_21,
sum(coalesce(g.ImageCount, 0)) as total -- sum of the three image counts, not a row count
from dbo.Visits v join
dbo.VisitParams vp
on vp.VisitId = v.VisitId join
Users u
on u.UserId = v.CreatedBy join
AssignedJobs aj
on aj.CSRCode = u.Code and
aj.JobNumber = v.JobNo left outer join
(select VisitId, FieldId, count(*) as ImageCount
from dbo.vw_ImageGallery ig
group by FieldId, VisitId
) g
on vp.VisitId = g.VisitId and
g.FieldId = vp.ParamKey
where v.VisitType = 1 and
v.TimeVisited >= '2019-03-01' and
v.TimeVisited <= '2019-04-01' and
vp.ParamKey in (5, 20, 21)
group by v.JobNo, aj.VehicleNumber
order by v.JobNo;
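As a rough illustration of how conditional aggregation produces the per-type columns and a total in one pass, here is a sketch on a simplified, hypothetical table image_counts(JobNo, ParamKey, ImageCount) that stands in for the joined Visits/VisitParams/vw_ImageGallery result. It runs on SQLite through Python's sqlite3 module rather than SQL Server, and it assumes the three output columns correspond to ParamKey values 5, 20 and 21.

```python
import sqlite3


def image_totals(rows):
    """Pivot image counts per job into three columns plus a total column."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical flattened table; the real query joins several tables.
    conn.execute(
        "CREATE TABLE image_counts (JobNo TEXT, ParamKey INTEGER, ImageCount INTEGER)"
    )
    conn.executemany("INSERT INTO image_counts VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT JobNo,
               SUM(CASE WHEN ParamKey = 5  THEN ImageCount ELSE 0 END) AS imagetype_5,
               SUM(CASE WHEN ParamKey = 20 THEN ImageCount ELSE 0 END) AS imagetype_20,
               SUM(CASE WHEN ParamKey = 21 THEN ImageCount ELSE 0 END) AS imagetype_21,
               SUM(ImageCount) AS total  -- total across the filtered types
        FROM image_counts
        WHERE ParamKey IN (5, 20, 21)
        GROUP BY JobNo
        ORDER BY JobNo
        """
    ).fetchall()
    conn.close()
    return result
```

In SQL Server, the same SELECT could be written with an INTO #SomeTempTable clause before FROM to land the rows in a temporary table; the table name there is only a placeholder.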

Refer to current row in window function

Is it possible to refer to the current row in a window partition? I want to do something like the following:
SELECT min(ABS(variable - CURRENT.variable)) over (order by criterion RANGE UNBOUNDED PRECEDING)
That is, I want to find, in the given partition, the variable that is closest to the current value. Is it possible to do something like that?
As an example, from:
criterion | variable
1 2
2 4
3 2
4 7
5 6
We would obtain:
null
2
0
3
1
Thanks
As far as I know, this cannot be done with window functions.
But it can be done with a self join:
SELECT a.criterion,
a.variable,
min(abs(a.variable - b.variable)) AS min_delta
FROM mydata a
LEFT JOIN mydata b
ON (b.criterion < a.criterion) -- only rows that precede the current one
GROUP BY a.criterion, a.variable
ORDER BY a.criterion;
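The self-join idea can be checked against the question's five sample rows. This is a minimal sketch that runs the query on SQLite via Python's sqlite3 module, keyed on the criterion column from the question; the function name closest_preceding is hypothetical.

```python
import sqlite3


def closest_preceding(rows):
    """For each row, return the smallest |variable difference| to any earlier row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mydata (criterion INTEGER, variable INTEGER)")
    conn.executemany("INSERT INTO mydata VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT a.criterion,
               MIN(ABS(a.variable - b.variable)) AS min_delta
        FROM mydata a
        -- LEFT JOIN keeps the first row, which has no predecessor (NULL result)
        LEFT JOIN mydata b ON b.criterion < a.criterion
        GROUP BY a.criterion
        ORDER BY a.criterion
        """
    ).fetchall()
    conn.close()
    return result
```

For the question's data, [(1, 2), (2, 4), (3, 2), (4, 7), (5, 6)], this gives NULL, 2, 0, 3, 1, matching the expected output. The join compares every row with all earlier rows, so the work grows quadratically with the table size.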
If I understand correctly:
with t (v) as (values (-5),(-2),(0),(1),(3),(10))
select v,
least(
v - lag(v) over (order by v),
lead(v) over (order by v) - v
) as closest
from t
;
v | closest
----+---------
-5 | 3
-2 | 2
0 | 1
1 | 1
3 | 2
10 | 7
I hope this helps (watch out for performance problems on large tables).
I tried this in MSSQL (at bottom you'll find POSTGRESQL version):
CREATE TABLE TX (CRITERION INT, VARIABILE INT);
INSERT INTO TX VALUES (1,2), (2,4),(3,2),(4,7), (5,6);
SELECT CRITERION, MIN_DELTA FROM
(
SELECT TX.CRITERION
, MIN(ABS(B.TX2_VAR - TX.VARIABILE)) OVER (PARTITION BY TX.CRITERION) AS MIN_DELTA
, RANK() OVER (PARTITION BY TX.CRITERION ORDER BY ABS(B.TX2_VAR - TX.VARIABILE) ) AS MIN_RANK
FROM TX
CROSS APPLY (SELECT TX2.CRITERION AS TX2_CRIT, TX2.VARIABILE AS TX2_VAR FROM TX TX2 WHERE TX2.CRITERION < TX.CRITERION) B
) C
WHERE MIN_RANK=1
ORDER BY CRITERION
;
Output:
CRITERION MIN_DELTA
----------- -----------
2 2
3 0
4 3
5 1
POSTGRESQL Version (tested on Rextester http://rextester.com/VMGJ87600):
CREATE TABLE TX (CRITERION INT, VARIABILE INT);
INSERT INTO TX VALUES (1,2), (2,4),(3,2),(4,7), (5,6);
SELECT * FROM TX;
SELECT CRITERION, MIN_DELTA FROM
(
SELECT TX.CRITERION
, MIN(ABS(B.TX2_VAR - TX.VARIABILE)) OVER (PARTITION BY TX.CRITERION) AS MIN_DELTA
, RANK() OVER (PARTITION BY TX.CRITERION ORDER BY ABS(B.TX2_VAR - TX.VARIABILE) ) AS MIN_RANK
FROM TX
LEFT JOIN LATERAL (SELECT TX2.CRITERION AS TX2_CRIT, TX2.VARIABILE AS TX2_VAR FROM TX TX2 WHERE TX2.CRITERION < TX.CRITERION) B ON TRUE
) C
WHERE MIN_RANK=1
ORDER BY CRITERION
;
DROP TABLE TX;
Output:
criterion variabile
1 1 2
2 2 4
3 3 2
4 4 7
5 5 6
criterion min_delta
1 1 NULL
2 2 2
3 3 0
4 4 3
5 5 1

How to insert row data between consecutive dates in HIVE?

Sample Data:
customer txn_date tag
A 1-Jan-17 1
A 2-Jan-17 1
A 4-Jan-17 1
A 5-Jan-17 0
B 3-Jan-17 1
B 5-Jan-17 0
Need to fill every missing txn_date between date range (1-Jan-17 to 5-Jan-2017). Just like below:
Output should be:
customer txn_date tag
A 1-Jan-17 1
A 2-Jan-17 1
A 3-Jan-17 0 (inserted)
A 4-Jan-17 1
A 5-Jan-17 0
B 1-Jan-17 0 (inserted)
B 2-Jan-17 0 (inserted)
B 3-Jan-17 1
B 4-Jan-17 0 (inserted)
B 5-Jan-17 0
select c.customer
,d.txn_date
,coalesce(t.tag,0) as tag
from (select date_add (from_date,i) as txn_date
from (select date '2017-01-01' as from_date
,date '2017-01-05' as to_date
) p
lateral view
posexplode(split(space(datediff(p.to_date,p.from_date)),' ')) pe as i,x
) d
cross join (select distinct
customer
from t
) c
left join t
on t.customer = c.customer
and t.txn_date = d.txn_date
;
c.customer d.txn_date tag
A 2017-01-01 1
A 2017-01-02 1
A 2017-01-03 0
A 2017-01-04 1
A 2017-01-05 0
B 2017-01-01 0
B 2017-01-02 0
B 2017-01-03 1
B 2017-01-04 0
B 2017-01-05 0
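The same shape of query (a generated calendar, cross-joined with the distinct customers, then left-joined to the data) can be sketched outside Hive. The example below runs on SQLite through Python's sqlite3 module and swaps in a recursive CTE for the posexplode trick, since SQLite has no posexplode; dates are stored as ISO strings, and the function name fill_missing_dates is hypothetical.

```python
import sqlite3


def fill_missing_dates(rows, start, end):
    """Return one row per customer per day in [start, end], tag 0 where missing."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (customer TEXT, txn_date TEXT, tag INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        WITH RECURSIVE d(txn_date) AS (
            SELECT ?                                  -- first day of the range
            UNION ALL
            SELECT date(txn_date, '+1 day') FROM d    -- next day
            WHERE txn_date < ?                        -- stop at the last day
        )
        SELECT c.customer, d.txn_date, COALESCE(t.tag, 0) AS tag
        FROM d
        CROSS JOIN (SELECT DISTINCT customer FROM t) c
        LEFT JOIN t
               ON t.customer = c.customer
              AND t.txn_date = d.txn_date
        ORDER BY c.customer, d.txn_date
        """,
        (start, end),
    ).fetchall()
    conn.close()
    return result
```

The query only selects the filled-in rows. To persist them, the result would still have to be written into a table, for example with an INSERT ... SELECT.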
Put just the delta content, i.e. the missing rows, in a file (input.txt), using the same delimiter you specified when you created the table.
Then use the load data command to insert these records into the table.
load data local inpath '/tmp/input.txt' into table tablename;
Your data won't be in the order you have shown; the new rows are appended at the end. You can restore the order by adding order by txn_date to the select query.

TSQL A recursive update?

I'm wondering whether a recursive update exists in T-SQL (using a CTE).
ID parentID value
-- -------- -----
1 NULL 0
2 1 0
3 2 0
4 3 0
5 4 0
6 5 0
Is it possible to update the value column recursively, e.g. using a CTE, from ID = 6 up to the topmost row?
Yes, it should be. MSDN gives an example:
USE AdventureWorks;
GO
WITH DirectReports(EmployeeID, NewVacationHours, EmployeeLevel)
AS
(SELECT e.EmployeeID, e.VacationHours, 1
FROM HumanResources.Employee AS e
WHERE e.ManagerID = 12
UNION ALL
SELECT e.EmployeeID, e.VacationHours, EmployeeLevel + 1
FROM HumanResources.Employee as e
JOIN DirectReports AS d ON e.ManagerID = d.EmployeeID
)
UPDATE HumanResources.Employee
SET VacationHours = VacationHours * 1.25
FROM HumanResources.Employee AS e
JOIN DirectReports AS d ON e.EmployeeID = d.EmployeeID;
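Applied to the ID/parentID table from the question, the same pattern walks upward from a starting row instead of downward from a manager. The sketch below runs on SQLite via Python's sqlite3 module rather than SQL Server. The table name tree, the function name bump_ancestors, and the update itself (value + 1) are assumptions, because the question does not say what should be written.

```python
import sqlite3


def bump_ancestors(start_id):
    """Increment value on start_id and every ancestor up to the root."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tree (ID INTEGER PRIMARY KEY, parentID INTEGER, value INTEGER)"
    )
    # The six-row chain from the question: 1 <- 2 <- 3 <- 4 <- 5 <- 6.
    conn.executemany(
        "INSERT INTO tree VALUES (?, ?, 0)",
        [(1, None), (2, 1), (3, 2), (4, 3), (5, 4), (6, 5)],
    )
    conn.execute(
        """
        WITH RECURSIVE chain(ID, parentID) AS (
            SELECT ID, parentID FROM tree WHERE ID = ?   -- anchor: starting row
            UNION ALL
            SELECT t.ID, t.parentID                      -- step: move to the parent
            FROM tree t
            JOIN chain c ON t.ID = c.parentID
        )
        UPDATE tree
        SET value = value + 1            -- hypothetical update for illustration
        WHERE ID IN (SELECT ID FROM chain)
        """,
        (start_id,),
    )
    result = conn.execute("SELECT ID, value FROM tree ORDER BY ID").fetchall()
    conn.close()
    return result
```

The recursion stops on its own at the root, because a NULL parentID matches no row in the join. In T-SQL the CTE would be written without the RECURSIVE keyword, and the UPDATE could join to the CTE in a FROM clause as in the MSDN example.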