TSQL, remove negative values in a group

I have a table with following data:
Id Name Value
1 John 100
2 John -500
3 John 500
4 Smith 10
5 Smith 20
6 Smith -20
7 Stuart -10
8 Wills 25
I am looking for an efficient TSQL query that removes John -500 and Smith -20, i.e. records with a negative value that have a matching positive value in the same group (grouped by name).

I think this is what you need.
delete y
from mytable y join (
select id,name, value
from mytable x
where value > 0) z on y.name = z.name and y.value = -1 * z.value
select * from mytable
1 John 100
3 John 500
4 Smith 10
5 Smith 20
7 Stuart -10
8 Wills 25

delete a
from mytable a
join mytable b
on b.name = a.name
and a.value < 0
and b.value = -1 * a.value
Almost the same as Kaf +1
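To check the delete logic against the sample data without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. Note the assumptions: SQLite stands in for SQL Server, and because SQLite has no DELETE ... JOIN syntax, the join is rewritten as a correlated EXISTS. The matching rule itself (same name, opposite value) is the one from the answers above.

```python
import sqlite3

# Sample rows from the question: (id, name, value)
rows = [(1, "John", 100), (2, "John", -500), (3, "John", 500),
        (4, "Smith", 10), (5, "Smith", 20), (6, "Smith", -20),
        (7, "Stuart", -10), (8, "Wills", 25)]

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (id integer, name text, value integer)")
conn.executemany("insert into mytable values (?, ?, ?)", rows)

# Assumption: SQLite has no DELETE ... JOIN, so the self-join is
# expressed as a correlated EXISTS with the same matching condition.
conn.execute("""
    delete from mytable
    where value < 0
      and exists (select 1 from mytable b
                  where b.name = mytable.name
                    and b.value = -1 * mytable.value)
""")

remaining_ids = [r[0] for r in conn.execute("select id from mytable order by id")]
print(remaining_ids)
```

Stuart's -10 survives because there is no +10 for Stuart, which matches the expected output in the answer.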


PostgreSQL cumulative sum max condition

I have the following table:
size id name
100 1 John
200 2 Mary
300 3 Jane
400 4 Anne
100 5 Mike
600 6 Joanne
I want to partition the rows in groups, where the sum of size <= 600.
Expected result:
group size id name
1 100 1 John
1 200 2 Mary
1 300 3 Jane
2 400 4 Anne
2 100 5 Mike
3 600 6 Joanne
I don't know how to do the partition and add the condition.
I cannot think of how to do this without recursion. I left the running_total in the result to make it easier to follow:
with recursive rns as ( -- Assign row numbers as rn in case of gaps in id
select *, row_number() over (order by id) as rn
from account
), sumgrp as ( -- Start with first row
select size, id, name, rn, 1 as grp, size as running_total
from rns
where rn = 1
union all
select n.size, n.id, n.name, n.rn,
case -- Increment the grp when running_total exceeds 600
when p.running_total + n.size > 600 then p.grp + 1
else p.grp
end as grp,
case -- Reset the running_total when it exceeds 600
when p.running_total + n.size > 600 then n.size
else p.running_total + n.size
end as running_total
from sumgrp p
join rns n on n.rn = p.rn + 1
)
select grp, size, id, name, running_total
from sumgrp
order by id;
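The recursion walks the rows in order and carries two values forward: the current group number and the running total. As a rough illustration of that same rule outside the database, here is a small Python sketch. The function name assign_groups and the limit parameter are made up for this example.

```python
def assign_groups(sizes, limit=600):
    """Give each size a group number, starting a new group whenever
    adding the size would push the running total above the limit."""
    groups = []
    grp, running_total = 1, 0
    for i, size in enumerate(sizes):
        # The first row always opens group 1, like the anchor of the recursive CTE.
        if i > 0 and running_total + size > limit:
            grp += 1
            running_total = size      # reset the total for the new group
        else:
            running_total += size
        groups.append(grp)
    return groups

sizes = [100, 200, 300, 400, 100, 600]   # the size column from the question
groups = assign_groups(sizes)
print(groups)
```

Each row's group depends on the group and total of the previous row, which is why a plain window sum is not enough and the SQL answer falls back on recursion.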

Redshift - Get a value from one column A for each ID in the grouping ID column B based on max value in another column C

I have a SQL problem (on Redshift): for each id in column id, I need the value of column index at the row where final_score is highest, placed in a new column fav_index. score2 is the value of score1 at the next index for the same id; for example, for id = abc1 and index = 0 (score1 = 10), score2 is the value of score1 at index = 1. final_score is the difference score2 - score1.
It is easier to follow with the table score below, which is the result of a SQL query shown further down.
id index score1 score2 final_score
abc1 0 10 20 10
abc1 1 20 45 25
abc1 2 45 (null) (null)
abc2 0 5 10 5
abc2 1 10 (null) (null)
abc3 0 50 30 -20
abc3 1 30 (null) (null)
So, the resulting table containing column fav_index should look like this:
id index score1 score2 final_score fav_index
abc1 0 10 20 10 0
abc1 1 20 45 25 1
abc1 2 45 (null) (null) 0
abc2 0 5 10 5 0
abc2 1 10 (null) (null) 0
abc3 0 50 30 -20 0
abc3 1 30 (null) (null) 0
Below is the script to generate table score from table story:
max(m.max) as score1,
round(fmt.score2 - max(m.max), 1) as final_score
case when sv.story_number % 2 = 0 then cast(sv.story_number / 2 - 1 as int) else cast(floor(sv.story_number/2) as int) end as index,
story as sv
group by
order by
) as m
left join
case when sv.story_number % 2 = 0 then cast(sv.story_number / 2 - 1 as int) else cast(floor(sv.story_number/2) as int) end as index,
max(score1) as score2
story as sv
group by
) as fmt
m.id = fmt.id
m.index = fmt.index - 1
group by
Table story is as below:
id story_number score1
abc1 1 10
abc1 2 10
abc1 3 20
abc1 4 20
abc1 5 45
abc1 6 45
The only solution I can think of is to do something like,
select id, max(final_score) from score group by id
and then join it back to the long script above (which was used to generate table score). I really want to avoid writing such a long script to get just 1 extra column of information that I need.
Is there a better way to do this?
Thank you!
Update: answer in mysql is also accepted. thanks!
After spending more hours on this and asking people around, I finally figured out a solution by referring to this window function documentation - PostgreSQL https://www.postgresql.org/docs/9.1/static/tutorial-window.html
I basically added 2 x select statements at the top and 1 x where statement at the very bottom. The where statement is to take care of the rows where final_score = null because otherwise the rank() function will rank them as 1.
My code then becomes:
id, index, final_score, rank, case when rank = 1 then index else null end as fav_index
id, index, final_score, rank() over (partition by id order by final_score desc)
max(m.max) as score1,
round(fmt.score2 - max(m.max), 1) as final_score
case when sv.story_number % 2 = 0 then cast(sv.story_number / 2 - 1 as int) else cast(floor(sv.story_number/2) as int) end as index,
story as sv
group by
order by
) as m
left join
case when sv.story_number % 2 = 0 then cast(sv.story_number / 2 - 1 as int) else cast(floor(sv.story_number/2) as int) end as index,
max(score1) as score2
story as sv
group by
) as fmt
m.id = fmt.id
m.index = fmt.index - 1
group by
final_score is not null)
And the result is as follows:
id index final_score rank fav_index
abc1 0 10 2 (null)
abc1 1 25 1 1
abc2 0 5 1 0
abc3 0 -20 1 0
Result is slightly different than what I stated in the question, however, the fav_index for each id is identified and this is what I needed really. Hope this might help someone. Cheers
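For anyone who wants to try just the ranking step, here is a minimal sketch that starts from the score table instead of the long script. Assumptions: SQLite (through Python's sqlite3 module, version 3.25 or later for window functions) stands in for Redshift, and the column index is renamed idx because INDEX is a keyword in SQLite.

```python
import sqlite3

# (id, idx, final_score) taken from the score table in the question
score = [("abc1", 0, 10), ("abc1", 1, 25), ("abc1", 2, None),
         ("abc2", 0, 5), ("abc2", 1, None),
         ("abc3", 0, -20), ("abc3", 1, None)]

conn = sqlite3.connect(":memory:")
conn.execute("create table score (id text, idx integer, final_score integer)")
conn.executemany("insert into score values (?, ?, ?)", score)

# Rank rows per id by final_score, skipping the null rows first
# so they cannot take rank 1.
fav = conn.execute("""
    select id, idx, final_score, rnk,
           case when rnk = 1 then idx end as fav_index
    from (select id, idx, final_score,
                 rank() over (partition by id order by final_score desc) as rnk
          from score
          where final_score is not null)
    order by id, idx
""").fetchall()

for row in fav:
    print(row)
```

The rows come out in the same shape as the result table above, with fav_index filled in only on the top-ranked row of each id.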

Return all records regardless if there is a match

A record in my Table 1 may have a matching entry in Table 2 whose address column is null, or it may have no matching entry in Table 2 at all.
I want to present all the records in Table 1 but also present corresponding entries from Table 2. My RESULT is what I am trying to achieve.
Table 1
ID First Last
1 John Smith
2 Bob Long
3 Bill Davis
4 Sam Bird
5 Tom Fenton
6 Mary Willis
Table 2
RefID ID Address
1 1 123 Main
2 2 555 Center
3 3 626 Smith
4 4 412 Walnut
5 1
6 2 555 Center
7 3
8 4 412 Walnut
RESULT
Id First Last Address
1 John Smith 123 Main
2 Bob Long 555 Center
3 Bill Davis 626 Smith
4 Sam Bird 412 Walnut
5 Tom Fenton
6 Mary Willis
You need an outer join for this:
SELECT * FROM Table1 t1 LEFT OUTER JOIN Table2 t2 ON t1.ID = t2.ID
How do you join those two tables? If Table 2 has more than one matching address, how do you want to display them? Please clarify in your question.
Here is a query based on my assumptions.
SELECT ID, First, Last,
Address = (SELECT MAX(Address) FROM Table2 t2 WHERE t1.ID = t2.ID)
FROM Table1 t1
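Here is a quick way to sanity-check the MAX(Address) approach against the sample data. This is only a sketch under a few assumptions: SQLite (via Python's sqlite3) replaces SQL Server, the blank addresses in Table 2 are treated as NULL, the columns First and Last are renamed first_name and last_name to stay clear of SQLite keywords, and the T-SQL "Address = (...)" alias is written as "(...) as Address".

```python
import sqlite3

table1 = [(1, "John", "Smith"), (2, "Bob", "Long"), (3, "Bill", "Davis"),
          (4, "Sam", "Bird"), (5, "Tom", "Fenton"), (6, "Mary", "Willis")]
# (RefID, ID, Address); blank addresses are assumed to be NULL
table2 = [(1, 1, "123 Main"), (2, 2, "555 Center"), (3, 3, "626 Smith"),
          (4, 4, "412 Walnut"), (5, 1, None), (6, 2, "555 Center"),
          (7, 3, None), (8, 4, "412 Walnut")]

conn = sqlite3.connect(":memory:")
conn.execute("create table Table1 (ID integer, first_name text, last_name text)")
conn.execute("create table Table2 (RefID integer, ID integer, Address text)")
conn.executemany("insert into Table1 values (?, ?, ?)", table1)
conn.executemany("insert into Table2 values (?, ?, ?)", table2)

# One row per person; MAX ignores NULLs and returns NULL when nothing matches.
result = conn.execute("""
    select t1.ID, t1.first_name, t1.last_name,
           (select max(t2.Address) from Table2 t2 where t2.ID = t1.ID) as Address
    from Table1 t1
    order by t1.ID
""").fetchall()

for row in result:
    print(row)
```

Because the address comes from a scalar subquery rather than a join, each person appears exactly once, even though Table 2 holds two rows for IDs 1 to 4.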

Build a query that pulls records based on a value in a column

My table has a parent/child relationship, along the lines of parent.id,id. There is also a column that contains a quantity, and another ID representing a grand-parent, like so:
id parent.id qty Org
1 1 1 100
2 1 0 100
3 1 4 100
4 4 1 101
5 4 2 101
6 6 1 102
7 6 0 102
8 6 1 102
What this is supposed to show is that ID 1 is the parent, IDs 2 and 3 are children that belong to ID 1, and IDs 1, 2, and 3 all belong to the grandparent 100.
I would like to know if any child or parent has QTY = 0, what are all the other id's associated to that parent, and what are all the other parents associated with that grandparent?
For example, I would want to see a report that shows me this:
Org id parent.id qty
100 1 1 1
100 2 1 0
100 3 1 4
102 6 6 1
102 7 6 0
102 8 6 1
I would much appreciate any help you can offer in building a MS SQL 2000 (yeah, I know) query to handle this.
Try this
select * from tablename a
where exists (select 1 from tablename x
where x.parent_id = a.parent_id and qty = 0)
;with cte as
( select 1 id,1 parent_id, 1 qty, 100 org
union all select 2,1,0,100
union all select 3,1,4,100
union all select 4,4,1,101
union all select 5,4,2,101
union all select 6,6,1,102
union all select 7,6,0,102
union all select 8,6,1,102
)
select * from cte a
where exists (select 1 from cte x
where x.parent_id = a.parent_id and qty = 0)
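If you want to run the EXISTS filter without a SQL Server at hand, here is a small sketch using Python's sqlite3 module. SQLite is only a stand-in engine here, and tablename and parent_id are the placeholder names used in the query above.

```python
import sqlite3

# (id, parent_id, qty, org) from the question
data = [(1, 1, 1, 100), (2, 1, 0, 100), (3, 1, 4, 100),
        (4, 4, 1, 101), (5, 4, 2, 101),
        (6, 6, 1, 102), (7, 6, 0, 102), (8, 6, 1, 102)]

conn = sqlite3.connect(":memory:")
conn.execute("create table tablename (id integer, parent_id integer, qty integer, org integer)")
conn.executemany("insert into tablename values (?, ?, ?, ?)", data)

# Keep every row whose parent group contains at least one row with qty = 0.
report = conn.execute("""
    select a.org, a.id, a.parent_id, a.qty
    from tablename a
    where exists (select 1 from tablename x
                  where x.parent_id = a.parent_id and x.qty = 0)
    order by a.id
""").fetchall()

report_ids = [r[1] for r in report]
print(report_ids)
```

Parent 4 (org 101) drops out because none of its rows has qty = 0, which agrees with the report requested in the question.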

Extract Unique Time Slices in Oracle

I use Oracle 10g and I have a table that stores a snapshot of data on a person for a given day. Every night an outside process adds new rows to the table for any person who has had any changes to their core data (stored elsewhere). This allows a query to be written using a date to find out what a person 'looked' like on some past day. A new row is added to the table even if only a single aspect of the person has changed, the implication being that many columns have duplicate values from slice to slice since not every detail changed in each snapshot.
Below is a data sample:
SliceID PersonID StartDt Detail1 Detail2 Detail3 Detail4 ...
1 101 08/20/09 Red Vanilla N 23
2 101 08/31/09 Orange Chocolate N 23
3 101 09/15/09 Yellow Chocolate Y 24
4 101 09/16/09 Green Chocolate N 24
5 102 01/10/09 Blue Lemon N 36
6 102 01/11/09 Indigo Lemon N 36
7 102 02/02/09 Violet Lemon Y 36
8 103 07/07/09 Red Orange N 12
9 104 01/31/09 Orange Orange N 12
10 104 10/20/09 Yellow Orange N 13
I need to write a query that pulls out time-slice records where some pertinent bits, not the whole record, have changed. So, referring to the above, if I only want to know the slices in which Detail3 has changed from its previous value, then I would expect to get only the rows having SliceID 1, 3 and 4 for PersonID 101, SliceID 5 and 7 for PersonID 102, SliceID 8 for PersonID 103, and SliceID 9 for PersonID 104.
I'm thinking I should be able to use some sort of Oracle Hierarchical Query (using CONNECT BY [PRIOR]) to get what I want, but I have not figured out how to write it yet. Perhaps YOU can help.
Thank you for your time and consideration.
Here is my take on the LAG() solution, which is basically the same as that of egorius, but I show my workings ;)
SQL> select * from
  (
    select sliceid
    , personid
    , startdt
    , detail3 as new_detail3
    , lag(detail3) over (partition by personid
                         order by startdt) prev_detail3
    from some_table
  )
  where prev_detail3 is null
  or ( prev_detail3 != new_detail3 )
/
---------- ---------- --------- - -
1 101 20-AUG-09 N
3 101 15-SEP-09 Y N
4 101 16-SEP-09 N Y
5 102 10-JAN-09 N
7 102 02-FEB-09 Y N
8 103 07-JUL-09 N
9 104 31-JAN-09 N
7 rows selected.
The point about this solution is that it hauls in results for 103 and 104, who don't have slice records where detail3 has changed. If that is a problem we can apply an additional filtration, to return only rows with changes:
SQL> with subq as (
    select t.*
    , row_number () over (partition by personid
                          order by sliceid ) rn
    from
    (
      select sliceid
      , personid
      , startdt
      , detail3 as new_detail3
      , lag(detail3) over (partition by personid
                           order by startdt) prev_detail3
      from some_table
    ) t
    where t.prev_detail3 is null
    or ( t.prev_detail3 != t.new_detail3 )
  )
  select sliceid
  , personid
  , startdt
  , new_detail3
  , prev_detail3
  from subq sq
  where exists ( select null from subq x
                 where x.personid = sq.personid
                 and x.rn > 1 )
  order by sliceid
/
---------- ---------- --------- - -
1 101 20-AUG-09 N
3 101 15-SEP-09 Y N
4 101 16-SEP-09 N Y
5 102 10-JAN-09 N
7 102 02-FEB-09 Y N
As egorius points out in the comments, the OP does want hits for all users, even if they haven't changed, so the first version of the query is the correct solution.
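To replay the first LAG() query on the sample data without Oracle, here is a minimal sketch using Python's sqlite3 module. Assumptions: SQLite 3.25 or later (for window functions) stands in for Oracle, only the columns the query needs are loaded, and the dates are stored as ISO strings so that ordering by startdt works as text.

```python
import sqlite3

# (sliceid, personid, startdt, detail3) from the data sample in the question
slices = [(1, 101, "2009-08-20", "N"), (2, 101, "2009-08-31", "N"),
          (3, 101, "2009-09-15", "Y"), (4, 101, "2009-09-16", "N"),
          (5, 102, "2009-01-10", "N"), (6, 102, "2009-01-11", "N"),
          (7, 102, "2009-02-02", "Y"), (8, 103, "2009-07-07", "N"),
          (9, 104, "2009-01-31", "N"), (10, 104, "2009-10-20", "N")]

conn = sqlite3.connect(":memory:")
conn.execute("create table some_table (sliceid integer, personid integer, startdt text, detail3 text)")
conn.executemany("insert into some_table values (?, ?, ?, ?)", slices)

# Compare each row's detail3 with the previous row for the same person;
# keep the first slice per person (no previous value) and every change.
changed = conn.execute("""
    select sliceid, personid, startdt, new_detail3, prev_detail3
    from (select sliceid, personid, startdt,
                 detail3 as new_detail3,
                 lag(detail3) over (partition by personid
                                    order by startdt) as prev_detail3
          from some_table)
    where prev_detail3 is null
       or prev_detail3 != new_detail3
    order by sliceid
""").fetchall()

changed_ids = [r[0] for r in changed]
print(changed_ids)
```

The slice ids that come back are the seven rows shown in the first result listing, including the single starting slices for persons 103 and 104.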
In addition to OMG Ponies' answer: if you need to query slices for all persons, you'll need partition by:
SELECT s.sliceid
, s.personid
FROM (SELECT t.sliceid,
t.personid,
t.detail3,
LAG(t.detail3) OVER (
PARTITION BY t.personid ORDER BY t.startdt
) prev_val
FROM t) s
WHERE (s.prev_val IS NULL OR s.prev_val != s.detail3)
I think you'll have better luck with the LAG function:
SELECT s.sliceid
FROM (SELECT t.sliceid,
t.personid,
t.detail3,
LAG(t.detail3) OVER (PARTITION BY t.personid ORDER BY t.startdt) prev_val
FROM t) s -- t: the snapshot table (name assumed)
WHERE s.personid = 101
AND (s.prev_val IS NULL OR s.prev_val != s.detail3)
Subquery Factoring alternative:
WITH slices AS (
SELECT t.sliceid,
t.personid,
t.detail3,
LAG(t.detail3) OVER (PARTITION BY t.personid ORDER BY t.startdt) prev_val
FROM t) -- t: the snapshot table (name assumed)
SELECT s.sliceid
FROM slices s
WHERE s.personid = 101
AND (s.prev_val IS NULL OR s.prev_val != s.detail3)