Change row values till first match - tsql

There is a table with the following structure:
(Date, Shop, Exists, Status)
2012-10-09 Shop1 0 Trial
2012-10-23 Shop1 0 New
2012-10-30 Shop1 0 New
2012-11-13 Shop1 0 New
2012-11-27 Shop1 1 New
2012-12-11 Shop1 0 New
2012-12-18 Shop1 0 New
I need to convert it to the following result:
2012-10-09 Shop1 0 Trial
2012-10-23 Shop1 0 Trial
2012-10-30 Shop1 0 Trial
2012-11-13 Shop1 0 Trial
2012-11-27 Shop1 1 New
2012-12-11 Shop1 0 New
2012-12-18 Shop1 0 New
The rule is: Status is set to Trial until the Exists column = 1.
There are many more shops, so looking up the cut-off date (2012-11-27 in my case) for every single shop by hand doesn't look very sane. Any clues?

You can get a list of all the 'New' rows before the first ShopExists date using the query below. It includes the date the shop first existed as the last column. You should be able to use the results to update your table accordingly.
select
    s1.*, s2.FirstExistsDate
from
    Shops s1
    inner join
    ( select s.ShopName, MIN(ShopDate) as FirstExistsDate
      from Shops s
      where ShopExists = 1
      group by s.ShopName
    ) s2 on s1.ShopName = s2.ShopName
        and s1.ShopDate < s2.FirstExistsDate
        and s1.ShopStatus = 'New'
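If you go that route, the follow-up UPDATE could look roughly like this (just a sketch, assuming the same Shops table and column names as in the query above):
update s1
set ShopStatus = 'Trial'
from Shops s1
inner join
    ( select s.ShopName, MIN(s.ShopDate) as FirstExistsDate
      from Shops s
      where s.ShopExists = 1
      group by s.ShopName
    ) s2 on s1.ShopName = s2.ShopName
where s1.ShopDate < s2.FirstExistsDate
  and s1.ShopStatus = 'New'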

If you're not supposed to order the rows in any way other than how they already sit in the database, then when you compute the row number (you need it to find all rows before the row that has 1 for Exists), use ORDER BY (SELECT 1) in ROW_NUMBER(). That takes the rows as they are in your table and gives them row numbers.
Then find the first row with 1 for Exists and store its row number in a variable. You can then join the temp table that has the row numbers to your original table and do the update below.
That should do it for you!
-- Build the sample data
SELECT '2012-10-09' AS ShopDate, 'Shop1' AS ShopName, '0' AS ShopExists, 'Trial' AS ShopStatus INTO #Shops UNION
SELECT '2012-10-23', 'Shop1', '0', 'New' UNION
SELECT '2012-10-30', 'Shop1', '0', 'New' UNION
SELECT '2012-11-13', 'Shop1', '0', 'New' UNION
SELECT '2012-11-27', 'Shop1', '1', 'New' UNION
SELECT '2012-12-11', 'Shop1', '0', 'New' UNION
SELECT '2012-12-18', 'Shop1', '0', 'New'

-- Number the rows; use ORDER BY (SELECT 1) instead of ShopDate if there is no ordering column
SELECT s.*
    , ROW_NUMBER() OVER (ORDER BY ShopDate) AS RowNumber
INTO #temp
FROM #Shops s

SELECT *
FROM #temp

-- Row number of the first row where ShopExists = 1
DECLARE @RowNumber INT = (SELECT TOP 1 RowNumber FROM #temp WHERE ShopExists = 1 ORDER BY RowNumber ASC)

-- Set everything before that row to 'Trial'
--SELECT *
UPDATE s SET ShopStatus = 'Trial'
FROM #temp t
JOIN #Shops s
    ON s.ShopDate = t.ShopDate
    AND s.ShopName = t.ShopName
    AND s.ShopExists = t.ShopExists
    AND s.ShopStatus = t.ShopStatus
WHERE t.RowNumber < @RowNumber

SELECT *
FROM #Shops

DROP TABLE #Shops
DROP TABLE #temp
This was the end result:
ShopDate ShopName ShopExists ShopStatus
2012-10-09 Shop1 0 Trial
2012-10-23 Shop1 0 Trial
2012-10-30 Shop1 0 Trial
2012-11-13 Shop1 0 Trial
2012-11-27 Shop1 1 New
2012-12-11 Shop1 0 New
2012-12-18 Shop1 0 New

Related

BigQuery SQL: Group rows with shared ID that occur within 7 days of each other, and return values from most recent occurrence

I have a table of datestamped events that I need to bundle into 7-day groups, starting with the earliest occurrence of each event_id.
The final output should return each bundle's start and end date and the 'value' column of the most recent event in each bundle.
There is no predetermined start date, and the '7-day' windows are arbitrary, not 'week of the year'.
I've tried a ton of examples from other posts, but none quite fit my needs, or they use things I'm not sure how to refactor for BigQuery.
Sample data:
Event_Id Event_Date Value
1 2022-01-01 010203
1 2022-01-02 040506
1 2022-01-03 070809
1 2022-01-20 101112
1 2022-01-23 131415
2 2022-01-02 161718
2 2022-01-08 192021
3 2022-02-12 212223
Expected output:
Event_Id Start_Date End_Date Value
1 2022-01-01 2022-01-03 070809
1 2022-01-20 2022-01-23 131415
2 2022-01-02 2022-01-08 192021
3 2022-02-12 2022-02-12 212223
You might consider the approach below.
-- Returns the bin index of the last element in a running array of day gaps:
-- a new bin starts once the cumulative gap within the current bin exceeds 6 days (i.e. a new 7-day window)
CREATE TEMP FUNCTION cumsumbin(a ARRAY<INT64>) RETURNS INT64
LANGUAGE js AS """
  bin = 0;
  a.reduce((c, v) => {
    if (c + Number(v) > 6) { bin += 1; return 0; }
    else return c += Number(v);
  }, 0);
  return bin;
""";
WITH sample_data AS (
  SELECT 1 event_id, DATE '2022-01-01' event_date, '010203' value UNION ALL
  SELECT 1 event_id, '2022-01-02' event_date, '040506' value UNION ALL
  SELECT 1 event_id, '2022-01-03' event_date, '070809' value UNION ALL
  SELECT 1 event_id, '2022-01-20' event_date, '101112' value UNION ALL
  SELECT 1 event_id, '2022-01-23' event_date, '131415' value UNION ALL
  SELECT 2 event_id, '2022-01-02' event_date, '161718' value UNION ALL
  SELECT 2 event_id, '2022-01-08' event_date, '192021' value UNION ALL
  SELECT 3 event_id, '2022-02-12' event_date, '212223' value
),
binning AS (
  -- diff = days since the previous event of the same event_id; cumsumbin turns the running diffs into a bundle number
  SELECT *, cumsumbin(ARRAY_AGG(diff) OVER w1) bin
  FROM (
    SELECT *, DATE_DIFF(event_date, LAG(event_date) OVER w0, DAY) AS diff
    FROM sample_data
    WINDOW w0 AS (PARTITION BY event_id ORDER BY event_date)
  ) WINDOW w1 AS (PARTITION BY event_id ORDER BY event_date)
)
-- per bundle: earliest date as start_date, plus the end_date and value of the most recent event
SELECT event_id,
       MIN(event_date) start_date,
       ARRAY_AGG(
         STRUCT(event_date AS end_date, value) ORDER BY event_date DESC LIMIT 1
       )[OFFSET(0)].*
FROM binning GROUP BY event_id, bin;

Conditional Counting record in PostgreSQL

I have a table such as the following
SP MA SL NG
jame j001 1 20200715 |
jame j001 -1 20200715 | -> count is 0
pink p002 3 20200730 }
pink p002 -3 20200730 } => count is 0
jack j002 12 20200731 | => count is 1
jack j002 -2 20200731 |
jack j002 12 20200801 } => count is 1
I want to count records and get a result like:
SP count
jame 0
pink 0
jack 2
I could do with some help, please. Thank you!
How the result is reached:
If SP, MA, NG are the same, then sum SL.
If the sum is 0 then the count is 0; if the sum is not 0 then the count is 1.
If NG, SP are not the same, then the count is 1.
As I understood your requirements:
If SP, MA, NG are the same, then sum SL.
If the sum is 0 then the count is 0; if the sum is not 0 then the count is 1.
If NG, SP are not the same, then the count is 1.
Try the query below:
with cte as (
    -- one row per (sp, ma, ng) group whose sum of sl is non-zero
    select sp, ma, ng, sum(sl) as sl_sum
    from example
    group by sp, ma, ng
    having sum(sl) <> 0
),
cte1 as (
    select distinct sp from example
)
select
    t1.sp,
    sum(case when t2.sl_sum <> 0 then 1 else 0 end) as count
from cte1 t1
left join cte t2 on t1.sp = t2.sp
group by t1.sp
Demo on Fiddle
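For reference, a more compact variant doing the same thing in one statement (a sketch, assuming the same example table): group once per (sp, ma, ng), then count the groups whose sum is non-zero for each sp.
select sp,
       count(*) filter (where sl_sum <> 0) as count
from (
    select sp, ma, ng, sum(sl) as sl_sum
    from example
    group by sp, ma, ng
) g
group by sp;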

How to insert row data between consecutive dates in HIVE?

Sample Data:
customer txn_date tag
A 1-Jan-17 1
A 2-Jan-17 1
A 4-Jan-17 1
A 5-Jan-17 0
B 3-Jan-17 1
B 5-Jan-17 0
I need to fill in every missing txn_date within the date range (1-Jan-17 to 5-Jan-17), like below:
Output should be:
customer txn_date tag
A 1-Jan-17 1
A 2-Jan-17 1
A 3-Jan-17 0 (inserted)
A 4-Jan-17 1
A 5-Jan-17 0
B 1-Jan-17 0 (inserted)
B 2-Jan-17 0 (inserted)
B 3-Jan-17 1
B 4-Jan-17 0 (inserted)
B 5-Jan-17 0
select      c.customer
           ,d.txn_date
           ,coalesce(t.tag,0) as tag        -- 0 for the filled-in (missing) dates

            -- d: one row per date in the range 2017-01-01 .. 2017-01-05
from       (select date_add (from_date,i) as txn_date
            from  (select date '2017-01-01' as from_date
                         ,date '2017-01-05' as to_date
                   ) p
                   lateral view
                   posexplode(split(space(datediff(p.to_date,p.from_date)),' ')) pe as i,x
            ) d

            -- c: one row per customer
cross join (select distinct customer
            from   t
            ) c

            -- keep the original tag where a transaction exists for that customer and date
left join   t
            on  t.customer = c.customer
            and t.txn_date = d.txn_date
;
c.customer d.txn_date tag
A 2017-01-01 1
A 2017-01-02 1
A 2017-01-03 0
A 2017-01-04 1
A 2017-01-05 0
B 2017-01-01 0
B 2017-01-02 0
B 2017-01-03 1
B 2017-01-04 0
B 2017-01-05 0
Just put the delta content, i.e. the missing data, in a file (input.txt) delimited with the same delimiter you specified when you created the table.
Then use the load data command to insert these records into the table.
load data local inpath '/tmp/input.txt' into table tablename;
Your data won't be stored in the order you have shown; it will get appended at the end. You can retrieve it in order by adding ORDER BY txn_date to your SELECT query.
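For example (a sketch; the table name tablename is a placeholder, as above):
select customer, txn_date, tag
from tablename
order by customer, txn_date;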

Get column of table for results having sum(a_int)=0 and order by date and group by another column

Think of a table like below, with columns:
unique_id, a_column, b_column, a_int, b_int, date_created
Let's say the data looks like:
-unique_id -a_column -b_column -a_int -b_int -date_created
1z23 abc 444 0 1 27.12.2016 18:03:00
2c31 abc 444 0 0 26.12.2016 13:40:00
2e22 qwe 333 0 1 28.12.2016 15:45:00
1b11 qwe 333 1 1 27.12.2016 19:00:00
3a33 rte 333 0 1 15.11.2016 11:00:00
4d44 rte 333 0 1 27.09.2016 18:00:00
6e66 irt 333 0 1 22.12.2016 13:00:00
7q77 aaa 555 1 0 27.12.2016 18:00:00
I want to get the unique_ids where b_int is 1 and b_column is 333, and, considering a_column, a_int must always be 0: if an a_column group has any record with a_int = 1, then none of its records must be shown in the result, even if it also has records with a_int = 0. The desired result is "3a33, 6e66": grouped by a_column, ordered by date_created, taking the top 1 for each distinct a_column.
I tried lots of "with ties" and "over(partition by" samples, searched questions, but couldn't manage to do it. This is what I could do:
select unique_id
from the_table
where b_column = '333'
and b_int = 1
and a_column in (select a_column
from the_table
where b_column = '333'
and b_int = 1
group by a_column
having sum(a_int) = 0)
order by date_created desc;
This query returns "3a33, 4d44, 6e66", but I don't want "4d44".
You were on the right track with the partitions and window functions. This solution uses ROW_NUMBER to number the rows within each a_column so we can see where there is more than one; row number 1 goes to the most recent date_created. Then you select from the result set where row_counter is 1.
;WITH CTE
AS (
SELECT unique_id
, a_column
, ROW_NUMBER() OVER (
PARTITION BY a_column ORDER BY date_created DESC
) AS row_counter --This assigns a 1 to the most recent date_created and partitions by a_column
FROM #test
WHERE a_column IN (
SELECT a_column
FROM #test
WHERE b_column = '333'
AND b_int = 1
GROUP BY a_column
HAVING MAX(a_int) < 1
)
)
SELECT unique_ID
FROM cte
WHERE row_counter = 1

Conditional summarizing columns

I have the following situation
ID Value
1 50
1 60
2 70
2 80
1 0
2 50
I need to run a query that returns the summed value, grouped by ID. The catch is that if any value in a group is 0, then the entire sum for that ID should be 0.
Query results would be
ID Value
1 0
2 200
I tried
select ID, case
when Value> 0 then sum(Value) * 1
when Value= 0 then sum(value) * 0
end
from table
but that did not work.
select ID,
       -- min(abs(value)) is 0 exactly when the group contains a 0, and sign() turns that into a 0/1 multiplier
       sum(value)*sign(min(abs(value))) as [sum(value)]
from YourTable
group by ID
With a case if you like:
select ID,
case sign(min(abs(value)))
when 0 then 0
else sum(value)
end as [sum(value)]
from YourTable
group by ID
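For reference, a self-contained version against the sample data from the question (a sketch in SQL Server syntax; the table name is a placeholder):
declare @YourTable table (ID int, Value int);
insert into @YourTable (ID, Value)
values (1, 50), (1, 60), (2, 70), (2, 80), (1, 0), (2, 50);

select ID,
       sum(Value) * sign(min(abs(Value))) as [sum(value)]
from @YourTable
group by ID;
-- ID 1 -> 0 (its group contains a 0), ID 2 -> 200, matching the expected results above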