TSQL Conditional joins & referencing table dimensions

I'm trying to find out how long a sales opportunity spends in each of 5 different stages. The problem comes down to flattening data from Salesforce. The current structure is below:
Table: OppHistory
TaskID | OppID | STAGE | IQ | DISCO | TECH | COMMERCIAL | CLOSE
1      | Op1   | IQ    | 5  | 0     | 0    | 0          | 0
2      | Op1   | DISCO | 0  | 10    | 0    | 0          | 0
3      | Op1   | TECH  | 0  | 0     | 15   | 0          | 0
4      | Op1   | COMM  | 0  | 0     | 0    | 5          | 0
5      | Op2   | IQ    | 2  | 0     | 0    | 0          | 0
6      | Op2   | CLOSE | 0  | 0     | 0    | 0          | 3
7      | Op3   | IQ    | 3  | 0     | 0    | 0          | 0
The above table has ~417k rows. Only 49k individual opportunities have ever been created, so 49k is the target row count for my result.
I've tried the joins below. I feel as if I'm on the 1-yard line and just need a push.
The query below gives me a result with 39M rows:
SELECT a.OpportunityID, b.IQ, c.Disco, d.Tech, e.Commercial, f.Close
FROM OppHistory a
JOIN OppHistory b ON a.OppID = b.OppID
JOIN OppHistory c ON b.OppID = c.OppID
JOIN OppHistory d ON c.OppID = d.OppID
JOIN OppHistory e ON d.OppID = e.OppID
JOIN OppHistory f ON e.OppID = f.OppID
The query below gives me ~50k rows; however, only a few opportunities are included, and each OppID is repeated around 500 times:
SELECT DISTINCT a.OpportunityID, b.IQ, c.Disco, d.Tech, e.Commercial,
f.Close
FROM OppHistory a
JOIN OppHistory b ON a.OppID = b.OppID
JOIN OppHistory c ON b.OppID = c.OppID
JOIN OppHistory d ON c.OppID = d.OppID
JOIN OppHistory e ON d.OppID = e.OppID
JOIN OppHistory f ON e.OppID = f.OppID
The query below gives me 0 rows:
SELECT b.OppID, Duration AS IQDuration, Duration AS DiscoDuration,
Duration AS TEDuration, Duration AS CommercialsDuration, Duration AS
ClosedDuration
FROM OppHistory b,(
SELECT q.IQ AS Duration FROM OppHistory q
UNION
SELECT d.Disco AS Duration FROM OppHistory d
UNION
SELECT t.Tech AS Duration FROM OppHistory t
UNION
SELECT c.Comm AS Duration FROM OppHistory c
UNION
SELECT l.Closed AS Duration FROM OppHistory l
)a
What I need is the duration of each individual stage in separate columns, with one row per distinct opportunity:
OppID | IQ | Disco | TechEval | Commercial | Close
Op1   | 5  | 10    | 15       | 5          | 0
Op2   | 2  | 0     | 0        | 0          | 3
Op3   | 3  | 0     | 0        | 0          | 0

I think you're after this (Close is bracketed because CLOSE is a reserved keyword in T-SQL):
SELECT OppID, SUM(IQ) AS IQ, SUM(Disco) AS Disco, SUM(Tech) AS Tech, SUM(Commercial) AS Commercial, SUM([Close]) AS [Close]
FROM OppHistory
GROUP BY OppID
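
To see why the GROUP BY collapses the data while the chained self joins explode it, here is a minimal sketch that loads the seven sample rows from the question into an in-memory SQLite database (an assumption for illustration; the original runs on SQL Server, and the connection and variable names are made up). Each self join multiplies the rows of an opportunity by its own row count, so an opportunity with n history rows produces n^6 rows across the six aliases.

```python
import sqlite3

# Sample rows copied from the question's OppHistory table.
rows = [
    (1, "Op1", "IQ",    5, 0,  0,  0, 0),
    (2, "Op1", "DISCO", 0, 10, 0,  0, 0),
    (3, "Op1", "TECH",  0, 0,  15, 0, 0),
    (4, "Op1", "COMM",  0, 0,  0,  5, 0),
    (5, "Op2", "IQ",    2, 0,  0,  0, 0),
    (6, "Op2", "CLOSE", 0, 0,  0,  0, 3),
    (7, "Op3", "IQ",    3, 0,  0,  0, 0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE OppHistory (TaskID INTEGER, OppID TEXT, Stage TEXT, '
    'IQ INTEGER, Disco INTEGER, Tech INTEGER, Commercial INTEGER, "Close" INTEGER)'
)
conn.executemany("INSERT INTO OppHistory VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# One row per opportunity: every stage column is summed within its OppID group.
flattened = conn.execute(
    'SELECT OppID, SUM(IQ), SUM(Disco), SUM(Tech), SUM(Commercial), SUM("Close") '
    "FROM OppHistory GROUP BY OppID ORDER BY OppID"
).fetchall()

# The chained self join from the question, counted instead of selected.
self_join_rows = conn.execute(
    "SELECT COUNT(*) FROM OppHistory a "
    "JOIN OppHistory b ON a.OppID = b.OppID "
    "JOIN OppHistory c ON b.OppID = c.OppID "
    "JOIN OppHistory d ON c.OppID = d.OppID "
    "JOIN OppHistory e ON d.OppID = e.OppID "
    "JOIN OppHistory f ON e.OppID = f.OppID"
).fetchone()[0]

for row in flattened:
    print(row)
print("rows from the 6-way self join:", self_join_rows)  # 4**6 + 2**6 + 1**6
```

On this sample the aggregate returns three rows, one per OppID, while the self join returns 4161 rows from only seven input rows.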

Related

Conditional Counting record in PostgreSQL

I have a table such as the following
SP MA SL NG
jame j001 1 20200715 |
jame j001 -1 20200715 | -> count is 0
pink p002 3 20200730 }
pink p002 -3 20200730 } => count is 0
jack j002 12 20200731 | => count is 1
jack j002 -2 20200731 |
jack j002 12 20200801 } => count is 1
I want to count records and get a result like:
SP count
jame 0
pink 0
jack 2
I could do with some help, please. Thank you!
How the result is reached:
If SP, MA and NG are the same, sum SL.
If the sum is 0 the count is 0; if the sum is not 0 the count is 1.
If NG or SP is not the same, the count is 1.

As I understand your requirements:
If SP, MA and NG are the same, sum SL.
If the sum is 0 the count is 0; if the sum is not 0 the count is 1.
If NG or SP is not the same, the count is 1.
Try the query below (the test is <> 0 rather than > 0 so that negative sums are counted too):
with cte as (
select sp, ma, ng, sum(sl) as total from example group by sp, ma, ng having sum(sl) <> 0
),
cte1 as (
select distinct sp from example
)
select
t1.sp,
sum(case when t2.total <> 0 then 1 else 0 end) as cnt
from cte1 t1 left join cte t2 on t1.sp = t2.sp
group by t1.sp
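
The counting rule can be checked against the sample rows with a small sketch. It uses an in-memory SQLite database rather than PostgreSQL (an assumption for illustration) and a single-CTE variant of the query: because the CASE expression already maps zero sums to 0, the HAVING filter and the separate list of distinct sp values are not needed here. The table name example follows the answer; the variable names are made up.

```python
import sqlite3

# Sample rows copied from the question: (SP, MA, SL, NG).
rows = [
    ("jame", "j001", 1,  20200715),
    ("jame", "j001", -1, 20200715),
    ("pink", "p002", 3,  20200730),
    ("pink", "p002", -3, 20200730),
    ("jack", "j002", 12, 20200731),
    ("jack", "j002", -2, 20200731),
    ("jack", "j002", 12, 20200801),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (sp TEXT, ma TEXT, sl INTEGER, ng INTEGER)")
conn.executemany("INSERT INTO example VALUES (?, ?, ?, ?)", rows)

# Step 1: sum SL per (sp, ma, ng). Step 2: count the non-zero sums per sp.
counts = conn.execute(
    """
    WITH grouped AS (
        SELECT sp, ma, ng, SUM(sl) AS total
        FROM example
        GROUP BY sp, ma, ng
    )
    SELECT sp, SUM(CASE WHEN total <> 0 THEN 1 ELSE 0 END) AS cnt
    FROM grouped
    GROUP BY sp
    ORDER BY sp
    """
).fetchall()

print(counts)
```

For the sample data this gives jack 2, jame 0 and pink 0, matching the result the question asks for.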

sum of pivoted columns into new column and insert all records to a temporary table

I want to add the sum of the pivoted column values as a new column and insert the output records into a temporary table.
select *
from (
    select v.JobNo as JobNo,
           aj.VehicleNumber as VehicleNo,
           isnull(g.ImageCount, 0) as ImageCount,
           s.ParamKey as ImageType
    from dbo.Visits v
    inner join (select VisitId as visit, paramkey, Value from dbo.VisitParams) s
        on s.visit = v.visitid
    left outer join (select VisitId, FieldId, count(*) as ImageCount
                     from dbo.vw_ImageGallery
                     group by FieldId, VisitId) g
        on s.visit = g.VisitId and g.FieldId = s.ParamKey
    inner join Users u on u.UserId = v.CreatedBy
    inner join AssignedJobs aj on aj.CSRCode = u.Code and aj.JobNumber = v.JobNo
    where v.VisitType = 1
      and v.TimeVisited >= '2019-03-01'
      and v.TimeVisited <= '2019-04-01'
) as a
pivot (max([ImageCount]) for [ImageType] in ([5], [20], [21])) as pvt
order by [JobNo]
My actual output is:
job no vehicleno 1 2 5
---------------------------------------------------------
BL1052385 648792 0 8 0
BL1054161 CAT2410 2 8 0
BL1107290 NB 0134 0 5 0
BL1174714 GP 3784 1 7 3
I expect output like:
job no vehicleno 1 2 5 Total Count
----------------------------------------------------------
BL1052385 648792 0 8 0 8
BL1054161 CAT2410 2 8 0 10
BL1107290 NB 0134 0 5 0 5
BL1174714 GP 3784 1 7 3 11

I prefer using conditional aggregation rather than pivot. It is more flexible, and the total is just one more aggregate:
select v.JobNo, aj.VehicleNumber,
sum(case when vp.ParamKey = 5 then coalesce(g.ImageCount, 0) else 0 end) as imagetype_5,
sum(case when vp.ParamKey = 20 then coalesce(g.ImageCount, 0) else 0 end) as imagetype_20,
sum(case when vp.ParamKey = 21 then coalesce(g.ImageCount, 0) else 0 end) as imagetype_21,
sum(coalesce(g.ImageCount, 0)) as total
from dbo.Visits v join
dbo.VisitParams vp
on vp.VisitId = v.VisitId join
Users u
on u.UserId = v.CreatedBy join
AssignedJobs aj
on aj.CSRCode = u.Code and
aj.JobNumber = v.JobNo left outer join
(select VisitId, FieldId, count(*) as ImageCount
from dbo.vw_ImageGallery ig
group by FieldId, VisitId
) g
on vp.VisitId = g.VisitId and
g.FieldId = vp.ParamKey
where v.VisitType = 1 and
v.TimeVisited >= '2019-03-01' and
v.TimeVisited <= '2019-04-01' and
vp.ParamKey in (5, 20, 21)
group by v.JobNo, aj.VehicleNumber
order by v.JobNo;
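
As a rough illustration of conditional aggregation with a total column, the sketch below rebuilds the counts from the question's expected output as a simplified long table and aggregates them in SQLite. The table image_counts, its columns, and the use of SQLite are assumptions for illustration only; the real query joins several tables on SQL Server. The image types 1, 2 and 5 follow the column headers in the question's output.

```python
import sqlite3

# (job_no, image_type, image_count), derived from the expected output in the question.
rows = [
    ("BL1052385", 2, 8),
    ("BL1054161", 1, 2),
    ("BL1054161", 2, 8),
    ("BL1107290", 2, 5),
    ("BL1174714", 1, 1),
    ("BL1174714", 2, 7),
    ("BL1174714", 5, 3),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE image_counts (job_no TEXT, image_type INTEGER, image_count INTEGER)"
)
conn.executemany("INSERT INTO image_counts VALUES (?, ?, ?)", rows)

# One CASE per pivoted column, plus a plain SUM over all types for the total.
pivoted = conn.execute(
    """
    SELECT job_no,
           SUM(CASE WHEN image_type = 1 THEN image_count ELSE 0 END) AS type_1,
           SUM(CASE WHEN image_type = 2 THEN image_count ELSE 0 END) AS type_2,
           SUM(CASE WHEN image_type = 5 THEN image_count ELSE 0 END) AS type_5,
           SUM(image_count) AS total
    FROM image_counts
    GROUP BY job_no
    ORDER BY job_no
    """
).fetchall()

for row in pivoted:
    print(row)
```

The last column reproduces the Total Count values from the expected output (8, 10, 5, 11).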

Refer to current row in window function

Is it possible to refer to the current row in a window partition? I want to do something like the following:
SELECT min(ABS(variable - CURRENT.variable)) over (order by criterion RANGE UNBOUNDED PRECEDING)
That is, I want to find, within the given window, the value of variable that is closest to the current row's value. Is it possible to do something like that?
As an example, from:
criterion | variable
1 2
2 4
3 2
4 7
5 6
We would obtain:
null
2
0
3
1
Thanks

As far as I know, this cannot be done with window functions.
But it can be done with a self join (using the criterion column from your example as the key):
SELECT a.criterion,
a.variable,
min(abs(a.variable - b.variable))
FROM mydata a
LEFT JOIN mydata b
ON (b.criterion < a.criterion)
GROUP BY a.criterion, a.variable
ORDER BY a.criterion;
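
A quick way to check the self-join approach against the example in the question is to run it on the five sample rows. This sketch uses an in-memory SQLite database (an assumption; the question does not name an engine) and keys the rows by criterion; the table name mydata follows the answer.

```python
import sqlite3

# (criterion, variable) pairs copied from the question.
rows = [(1, 2), (2, 4), (3, 2), (4, 7), (5, 6)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mydata (criterion INTEGER, variable INTEGER)")
conn.executemany("INSERT INTO mydata VALUES (?, ?)", rows)

# Each row is paired with every earlier row; the LEFT JOIN keeps the first row,
# which has no predecessors, so its minimum comes out as NULL (None in Python).
closest = conn.execute(
    """
    SELECT a.criterion,
           MIN(ABS(a.variable - b.variable)) AS min_delta
    FROM mydata a
    LEFT JOIN mydata b ON b.criterion < a.criterion
    GROUP BY a.criterion
    ORDER BY a.criterion
    """
).fetchall()

print(closest)
```

The result column is NULL, 2, 0, 3, 1, which is the output the question asks for. The join compares every row with all earlier rows, so the work grows quadratically with the table size.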

If I understand correctly:
with t (v) as (values (-5),(-2),(0),(1),(3),(10))
select v,
least(
v - lag(v) over (order by v),
lead(v) over (order by v) - v
) as closest
from t
;
v | closest
----+---------
-5 | 3
-2 | 2
0 | 1
1 | 1
3 | 2
10 | 7

Hope this helps (watch out for performance problems).
I tried this in MSSQL (the PostgreSQL version is at the bottom):
CREATE TABLE TX (CRITERION INT, VARIABILE INT);
INSERT INTO TX VALUES (1,2), (2,4),(3,2),(4,7), (5,6);
SELECT CRITERION, MIN_DELTA FROM
(
SELECT TX.CRITERION
, MIN(ABS(B.TX2_VAR - TX.VARIABILE)) OVER (PARTITION BY TX.CRITERION) AS MIN_DELTA
, RANK() OVER (PARTITION BY TX.CRITERION ORDER BY ABS(B.TX2_VAR - TX.VARIABILE) ) AS MIN_RANK
FROM TX
CROSS APPLY (SELECT TX2.CRITERION AS TX2_CRIT, TX2.VARIABILE AS TX2_VAR FROM TX TX2 WHERE TX2.CRITERION < TX.CRITERION) B
) C
WHERE MIN_RANK=1
ORDER BY CRITERION
;
Output:
CRITERION MIN_DELTA
----------- -----------
2 2
3 0
4 3
5 1
POSTGRESQL Version (tested on Rextester http://rextester.com/VMGJ87600):
CREATE TABLE TX (CRITERION INT, VARIABILE INT);
INSERT INTO TX VALUES (1,2), (2,4),(3,2),(4,7), (5,6);
SELECT * FROM TX;
SELECT CRITERION, MIN_DELTA FROM
(
SELECT TX.CRITERION
, MIN(ABS(B.TX2_VAR - TX.VARIABILE)) OVER (PARTITION BY TX.CRITERION) AS MIN_DELTA
, RANK() OVER (PARTITION BY TX.CRITERION ORDER BY ABS(B.TX2_VAR - TX.VARIABILE) ) AS MIN_RANK
FROM TX
LEFT JOIN LATERAL (SELECT TX2.CRITERION AS TX2_CRIT, TX2.VARIABILE AS TX2_VAR FROM TX TX2 WHERE TX2.CRITERION < TX.CRITERION) B ON TRUE
) C
WHERE MIN_RANK=1
ORDER BY CRITERION
;
DROP TABLE TX;
Output:
criterion variabile
1 1 2
2 2 4
3 3 2
4 4 7
5 5 6
criterion min_delta
1 1 NULL
2 2 2
3 3 0
4 4 3
5 5 1

How to insert row data between consecutive dates in HIVE?

Sample Data:
customer txn_date tag
A 1-Jan-17 1
A 2-Jan-17 1
A 4-Jan-17 1
A 5-Jan-17 0
B 3-Jan-17 1
B 5-Jan-17 0
I need to fill in every missing txn_date in the date range (1-Jan-17 to 5-Jan-17) for each customer.
The output should be:
customer txn_date tag
A 1-Jan-17 1
A 2-Jan-17 1
A 3-Jan-17 0 (inserted)
A 4-Jan-17 1
A 5-Jan-17 0
B 1-Jan-17 0 (inserted)
B 2-Jan-17 0 (inserted)
B 3-Jan-17 1
B 4-Jan-17 0 (inserted)
B 5-Jan-17 0

select c.customer
,d.txn_date
,coalesce(t.tag,0) as tag
from (select date_add (from_date,i) as txn_date
from (select date '2017-01-01' as from_date
,date '2017-01-05' as to_date
) p
lateral view
posexplode(split(space(datediff(p.to_date,p.from_date)),' ')) pe as i,x
) d
cross join (select distinct
customer
from t
) c
left join t
on t.customer = c.customer
and t.txn_date = d.txn_date
;
c.customer d.txn_date tag
A 2017-01-01 1
A 2017-01-02 1
A 2017-01-03 0
A 2017-01-04 1
A 2017-01-05 0
B 2017-01-01 0
B 2017-01-02 0
B 2017-01-03 1
B 2017-01-04 0
B 2017-01-05 0
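
The query above relies on a row-generation trick: space(n) builds a string of n spaces, splitting it on ' ' yields n + 1 empty strings, and posexplode numbers them 0 to n, giving one offset per day in the range. The following Python sketch mirrors those three steps (calendar, cross join with customers, left join with default 0) on the sample data. It only illustrates the logic and is not Hive code; all variable names are made up.

```python
from datetime import date, timedelta

from_date, to_date = date(2017, 1, 1), date(2017, 1, 5)

# space(datediff(...)) + split: 4 spaces split on " " give 5 empty strings.
n_days = (to_date - from_date).days
pieces = (" " * n_days).split(" ")

# posexplode: the position i of each piece becomes the day offset.
calendar = [from_date + timedelta(days=i) for i, _ in enumerate(pieces)]

# Existing rows from the question, keyed by (customer, txn_date).
t = {
    ("A", date(2017, 1, 1)): 1,
    ("A", date(2017, 1, 2)): 1,
    ("A", date(2017, 1, 4)): 1,
    ("A", date(2017, 1, 5)): 0,
    ("B", date(2017, 1, 3)): 1,
    ("B", date(2017, 1, 5)): 0,
}

# Cross join customers with the calendar, then "left join" with coalesce(tag, 0).
customers = sorted({customer for customer, _ in t})
filled = [(c, d, t.get((c, d), 0)) for c in customers for d in calendar]

for customer, txn_date, tag in filled:
    print(customer, txn_date.isoformat(), tag)
```

Every customer ends up with one row per calendar day, and the days missing from the source table carry tag 0.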

Put just the delta content, i.e. the missing rows, in a file (input.txt), delimited with the same delimiter you specified when you created the table.
Then use the load data command to insert these records into the table:
load data local inpath '/tmp/input.txt' into table tablename;
Your data won't be in the order you showed; the new rows are appended at the end. You can get that order back by adding order by txn_date to the select query.

TSQL A recursive update?

I'm wondering whether a recursive update exists in T-SQL (using a CTE).
ID parentID value
-- -------- -----
1 NULL 0
2 1 0
3 2 0
4 3 0
5 4 0
6 5 0
Is it possible to update the column value recursively, e.g. using a CTE, from ID = 6 up to the topmost row?

Yes, it should be. MSDN gives an example:
USE AdventureWorks;
GO
WITH DirectReports(EmployeeID, NewVacationHours, EmployeeLevel)
AS
(SELECT e.EmployeeID, e.VacationHours, 1
FROM HumanResources.Employee AS e
WHERE e.ManagerID = 12
UNION ALL
SELECT e.EmployeeID, e.VacationHours, EmployeeLevel + 1
FROM HumanResources.Employee as e
JOIN DirectReports AS d ON e.ManagerID = d.EmployeeID
)
UPDATE HumanResources.Employee
SET VacationHours = VacationHours * 1.25
FROM HumanResources.Employee AS e
JOIN DirectReports AS d ON e.EmployeeID = d.EmployeeID;
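
Applied to the table in the question, the same pattern walks from ID = 6 up through parentID and updates every row on that path. The sketch below runs it in an in-memory SQLite database, which is an assumption for illustration: SQLite needs the RECURSIVE keyword and filters with IN, whereas T-SQL uses a plain WITH and can join the CTE in UPDATE ... FROM as in the MSDN example. The table name nodes, the new value 1, and the extra row with ID 7 (a child of 6 that should stay untouched) are hypothetical additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (ID INTEGER PRIMARY KEY, parentID INTEGER, value INTEGER)"
)
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?)",
    [
        (1, None, 0), (2, 1, 0), (3, 2, 0), (4, 3, 0), (5, 4, 0), (6, 5, 0),
        (7, 6, 0),  # hypothetical extra row below the starting point
    ],
)

# Anchor: the starting row. Recursive step: follow parentID to the parent row.
conn.execute(
    """
    WITH RECURSIVE chain(ID, parentID) AS (
        SELECT ID, parentID FROM nodes WHERE ID = 6
        UNION ALL
        SELECT n.ID, n.parentID
        FROM nodes n
        JOIN chain c ON n.ID = c.parentID
    )
    UPDATE nodes SET value = 1 WHERE ID IN (SELECT ID FROM chain)
    """
)

values = conn.execute("SELECT ID, value FROM nodes ORDER BY ID").fetchall()
print(values)
```

Rows 1 to 6 are updated, and the row added below the starting point keeps its original value.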
