Trying to partition to remove rows where two columns don't match sql - group-by

How can I filter out rows within a group that do not have matching values in two columns?
I have a table A like:
CODE US_ID US_PRICE NON_US_ID NON_US_PRICE
5109 57 10 75 10
0206 85 11 58 11
0206 85 15 33 14
0206 85 41 22 70
T100 20 10 49 NULL
T100 20 38 64 38
Within each CODE group, I want to check whether US_PRICE = NON_US_PRICE and remove that row from the resulting table.
I tried:
SELECT *,
CASE WHEN US_PRICE != NON_US_PRICE OVER (PARTITION BY CODE) END
FROM A;
but I think I am missing something when I try to partition by CODE.
I want the resulting table to look like
CODE US_ID US_PRICE NON_US_ID NON_US_PRICE
0206 85 15 33 14
0206 85 41 22 70
T100 20 10 49 NULL

For the provided sample, a simple WHERE clause produces this result; no partitioning is needed because the comparison is made within each row:
SELECT *
FROM A
WHERE US_PRICE IS DISTINCT FROM NON_US_PRICE;
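Unlike the != operator, IS DISTINCT FROM handles NULLs: a NULL compared with a non-NULL value counts as different, so the T100 row with the NULL price is kept.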
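To see the difference between the two operators, here is a minimal sketch that runs the sample data through SQLite from Python. It is only an illustration under assumptions: the in-memory database is made up for the demo, and SQLite's IS NOT operator is used as a stand-in for IS DISTINCT FROM, since both treat NULL as an ordinary comparable value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE A (CODE TEXT, US_ID INT, US_PRICE INT, NON_US_ID INT, NON_US_PRICE INT)"
)
conn.executemany(
    "INSERT INTO A VALUES (?, ?, ?, ?, ?)",
    [
        ("5109", 57, 10, 75, 10),
        ("0206", 85, 11, 58, 11),
        ("0206", 85, 15, 33, 14),
        ("0206", 85, 41, 22, 70),
        ("T100", 20, 10, 49, None),
        ("T100", 20, 38, 64, 38),
    ],
)

# != evaluates to NULL (not true) when either side is NULL,
# so the T100 row with the NULL price is silently dropped.
plain = conn.execute(
    "SELECT CODE, US_PRICE, NON_US_PRICE FROM A "
    "WHERE US_PRICE != NON_US_PRICE ORDER BY rowid"
).fetchall()

# IS NOT is SQLite's NULL-safe inequality (assumed stand-in for IS DISTINCT FROM).
null_safe = conn.execute(
    "SELECT CODE, US_PRICE, NON_US_PRICE FROM A "
    "WHERE US_PRICE IS NOT NON_US_PRICE ORDER BY rowid"
).fetchall()

print(plain)
print(null_safe)
```

The NULL-safe query returns the three rows from the desired result, while the plain != query returns only the two 0206 rows.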

Related

Usage of DISTINCT in reversed int pairs duplicates elimination

I have the following question:
create table memorization_word_translation
(
id serial not null,
from_word_id integer not null,
to_word_id integer not null
);
This table stores pairs of integers, which often also appear in reverse order, for example:
35 36
35 37
36 35
37 35
37 39
39 37
Question is - if I make a query, for example:
select * from memorization_word_translation
where from_word_id = 35 or to_word_id = 35
I would get
35 36
35 37
36 35 - duplicate of 35 36
37 35 - duplicate of 35 37
How can I use DISTINCT in this example to filter out all duplicates, even if they are reversed?
I want to keep it only like this:
35 36
35 37
You can do it with ROW_NUMBER() window function:
select from_word_id, to_word_id
from (
select *,
row_number() over (
partition by least(from_word_id, to_word_id),
greatest(from_word_id, to_word_id)
order by (from_word_id > to_word_id)::int
) rn
from memorization_word_translation
where 35 in (from_word_id, to_word_id)
) t
where rn = 1
You could try it with a small sorting step (here, a comparison) in combination with DISTINCT ON.
The DISTINCT ON clause works on arbitrary columns or expressions, e.g. on a tuple. This CASE expression sorts the two columns into an ordered tuple, so a pair and its reverse produce the same tuple and only one row per tuple is kept. The source columns can still be returned in your SELECT statement:
select distinct on (
CASE
WHEN (from_word_id >= to_word_id) THEN (from_word_id, to_word_id)
ELSE (to_word_id, from_word_id)
END
)
*
from memorization_word_translation
where from_word_id = 35 or to_word_id = 35
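Both answers rest on the same idea: order each pair so that a pair and its reverse become identical, then keep one row per ordered pair. A minimal sketch of that idea in SQLite from Python follows. It swaps in a different technique, plain SELECT DISTINCT over SQLite's two-argument min() and max() (the counterparts of least() and greatest()), because SQLite has no DISTINCT ON; the column aliases low_id and high_id are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memorization_word_translation ("
    "id INTEGER PRIMARY KEY, "
    "from_word_id INTEGER NOT NULL, "
    "to_word_id INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO memorization_word_translation (from_word_id, to_word_id) VALUES (?, ?)",
    [(35, 36), (35, 37), (36, 35), (37, 35), (37, 39), (39, 37)],
)

# Normalise every pair to (smaller, larger); reversed duplicates then collapse
# under DISTINCT. low_id / high_id are hypothetical aliases for this sketch.
pairs = conn.execute(
    """
    SELECT DISTINCT
           min(from_word_id, to_word_id) AS low_id,
           max(from_word_id, to_word_id) AS high_id
    FROM memorization_word_translation
    WHERE 35 IN (from_word_id, to_word_id)
    ORDER BY low_id, high_id
    """
).fetchall()

print(pairs)
```

Note that this variant returns the normalised pair rather than one of the original rows, so it fits only when the original column order does not matter; the ROW_NUMBER() and DISTINCT ON queries keep an actual source row.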

TSQL Select TOP and Distinct from one table into a TEMP table

I have the following table:
Data nr1 nr2 nr3 nr4 nr5 nr6
2020-09-12 6 15 36 42 67 78
2020-09-10 46 48 67 78 80 87
2020-09-08 23 27 28 31 69 89
2020-09-05 7 14 27 56 72 83
2020-09-03 16 17 38 39 68 84
2020-09-01 10 22 28 45 48 71
2020-08-29 1 3 35 42 55 61
2020-08-27 37 49 52 53 75 87
2020-08-25 15 24 31 70 83 84
2020-08-22 7 12 45 47 73 87
2020-08-20 7 17 30 39 41 67
2020-08-18 13 22 28 58 65 77
2020-08-17 5 9 26 62 77 79
2020-08-13 4 5 49 57 66 75
2020-08-11 7 9 38 68 78 80
2020-08-08 6 16 22 55 58 83
2020-08-06 21 37 40 46 69 80
2020-08-04 5 19 21 25 45 82
2020-08-01 4 14 17 18 26 45
2020-07-30 4 15 19 26 28 55
2020-07-28 23 45 49 71 80 82
2020-07-25 18 30 42 70 78 80
2020-07-23 10 29 37 49 56 57
2020-07-21 4 34 46 54 55 62
2020-07-18 18 33 49 76 80 84
I have to do the following task:
Select into a #TEMP table with only one column DistinctNumbers all distinct numbers of the above table because some numbers in the above table might be repeated across rows and columns.
Select into another #TEMP table all numbers in the range from 1 to 99 which are not in the original table.
What is the best way of accomplishing these two tasks?
You should unpivot the original table first:
1. Unpivot the original table into a #temp table.
2. Now you have all numbers in one column.
3. Use a WHILE loop from 1 to 99 and insert the counter into the #RESULT table when it is not in #temp (the unpivoted table).
SELECT DISTINCT(num) num INTO #TEMP_DISTINCT_NUMBERS FROM ORIGINAL_TABLE UNPIVOT (
num
FOR PivotColumn IN (nr1,nr2,nr3,nr4,nr5,nr6)
) AS UNPIVOTE_TABLE
CREATE TABLE #RESULT(NUM INT)
DECLARE @COUNTER INT = 1;
WHILE (@COUNTER <= 99)
BEGIN
INSERT INTO #RESULT SELECT @COUNTER WHERE @COUNTER NOT IN (SELECT num FROM
#TEMP_DISTINCT_NUMBERS)
SET @COUNTER = @COUNTER + 1
END
SELECT * FROM #RESULT
You can try this set-based alternative, which avoids the loop:
;WITH tally
AS (SELECT 1 AS num
UNION ALL
SELECT num + 1
FROM tally
WHERE num < 99)
SELECT DISTINCT tally.num
FROM tally
LEFT JOIN
( SELECT num FROM #dataset --your dataset
CROSS APPLY (VALUES (nr1),(nr2),(nr3),(nr4),(nr5),(nr6)) AS B (num)
) AS dataset
ON tally.num = dataset.num
WHERE dataset.num IS NULL
The code above:
Creates the [tally] recursive common table expression with the sequence from 1 to 99
Left joins tally with your unpivoted dataset ...
test here: https://rextester.com/YEB57637
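For readers without a SQL Server instance at hand, the same two steps (unpivot to distinct numbers, then anti-join against a 1 to 99 tally) can be sketched in SQLite from Python. This is an illustration under assumptions: the table name draws is made up, only the first three sample rows are loaded, and the unpivot is done with UNION because SQLite has neither UNPIVOT nor CROSS APPLY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "draws" is a hypothetical name for the original table.
conn.execute(
    "CREATE TABLE draws (Data TEXT, nr1 INT, nr2 INT, nr3 INT, nr4 INT, nr5 INT, nr6 INT)"
)
conn.executemany(
    "INSERT INTO draws VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("2020-09-12", 6, 15, 36, 42, 67, 78),
        ("2020-09-10", 46, 48, 67, 78, 80, 87),
        ("2020-09-08", 23, 27, 28, 31, 69, 89),
    ],
)

# Unpivot: one SELECT per column; UNION (not UNION ALL) removes duplicates.
unpivot_sql = " UNION ".join(
    f"SELECT nr{i} AS num FROM draws" for i in range(1, 7)
)

# Task 1: all distinct numbers in a single column.
distinct_numbers = [
    row[0] for row in conn.execute(unpivot_sql + " ORDER BY num")
]

# Task 2: numbers from 1 to 99 that never occur, via a recursive tally
# and a LEFT JOIN that keeps only the unmatched tally rows.
missing = [
    row[0]
    for row in conn.execute(
        f"""
        WITH RECURSIVE tally(num) AS (
            SELECT 1
            UNION ALL
            SELECT num + 1 FROM tally WHERE num < 99
        ),
        seen(num) AS ({unpivot_sql})
        SELECT tally.num
        FROM tally
        LEFT JOIN seen ON seen.num = tally.num
        WHERE seen.num IS NULL
        ORDER BY tally.num
        """
    )
]

print(distinct_numbers)
print(missing)
```

In T-SQL the two result sets would be written with SELECT ... INTO #temp tables, as in the answers above; the Python lists here only stand in for them.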

Recursive CTE with multiple valid same parent child relationships

I am working on an equipment inventory application. The piece of equipment is my top level, and it contains assemblies, sub-assemblies and parts. I am trying to use a recursive CTE to display the parent/child relationships. The issue is that some assemblies can contain multiple identical sub-assemblies, meaning there is no difference in the part numbers. This causes my query to not show the correct relationship based on my ORDER BY clause. This is the first time I have used a CTE, so I have been relying heavily on what I learned on the web.
PartNumberID 174 is used twice in this assembly.
Sample Table
equipmentID parentPartNumberID partNumberID
17 1 281
17 281 156
17 156 161
17 161 224
17 281 174
17 174 192
17 192 56
17 174 193
17 281 174
17 174 192
17 192 56
17 174 193
17 281 283
17 283 183
17 283 277
17 283 173
Results of Query
PARENT CHILD PARTLEVEL HIERARCHY
1 281 0 281
281 156 1 281.156
156 161 2 281.156.161
161 224 3 281.156.161.224
281 174 1 281.174
281 174 1 281.174
174 192 2 281.174.192
174 192 2 281.174.192
192 56 3 281.174.192.56
192 56 3 281.174.192.56
174 193 2 281.174.193
174 193 2 281.174.193
281 283 1 281.283
283 173 2 281.283.173
283 183 2 281.283.183
283 277 2 281.283.277
As you can see, the hierarchy is created correctly, but it is not being returned in the correct order because there is nothing unique about these 2 assemblies for the ORDER BY clause to use.
The Code:
with parts(PARENT,CHILD,PARTLEVEL,HIERARCHY) as (select parentPartNumberID,
--- Used to get rid of duplicates
CASE WHEN ROW_NUMBER() OVER (PARTITION BY partNumberID ORDER BY partNumberID) > 1
THEN NULL
ELSE partNumberID END AS partNumberID,
0,
CAST( partNumberID as nvarchar) as PARTLEVEL
FROM db.tbl_ELEMENTS
WHERE parentPartNumberID=1 and equiptmentID=17
UNION ALL
SELECT parts1.parentPartNumberId,
--- Used to get rid of duplicates
CASE WHEN ROW_NUMBER() OVER (PARTITION BY parts1.partNumberID ORDER BY parts1.partNumberID) > 1
THEN 10000 + parts1.partNumberID
ELSE parts1.partNumberID END,
PARTLEVEL+1,
cast(parts.hierarchy + '.' + CAST(parts1.partNumberID as nvarchar) as nvarchar)
from dbo.tbl_BOM_Elements as parts1 inner
join parts on parts1.parentPartNumberID=parts.CHILD
where id =17)
select CASE WHEN PARENT > 10000
THEN PARENT - 10000
ELSE PARENT END AS PARENT,
CASE WHEN CHILD > 10000
THEN CHILD - 10000
ELSE CHILD END AS CHILD,
PARTLEVEL,HIERARCHY
from parts
order by hierarchy
I tried to create a unique ID to order but was not successful. Any suggestions would be greatly appreciated.
I'll start by just answering the part about getting a sequential id.
If you have control over the schema, you could just add a unique Id to your source table. Having a surrogate primary key would be pretty typical here.
You could instead use a second CTE before the recursive one and add the row numbers there using ROW_NUMBER() OVER (ORDER BY equipmentID, parentPartNumberID, partNumberID). Then build your recursive CTE off of that rather than the source table directly.
Better might be to use the first CTE to instead GROUP BY equipmentID, parentPartNumberID, partNumberID and add a COUNT(1) field. This would let you use the count in your hierarchy rather than getting the duplicates. Something like 281.283.277x2 or whatever.
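The GROUP BY suggestion could look roughly like the sketch below, run against SQLite from Python with the sample rows from the question. Treat it as an assumption-laden illustration rather than a drop-in fix: the table name elements, the CTE name grouped and the qty column are made up, and the x2 suffix is simply the raw row count per (parent, child) pair, so in this flat sample the children of the repeated assembly are marked x2 as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "elements" is a hypothetical stand-in for the question's source table.
conn.execute(
    "CREATE TABLE elements (equipmentID INT, parentPartNumberID INT, partNumberID INT)"
)
conn.executemany(
    "INSERT INTO elements VALUES (?, ?, ?)",
    [
        (17, 1, 281), (17, 281, 156), (17, 156, 161), (17, 161, 224),
        (17, 281, 174), (17, 174, 192), (17, 192, 56), (17, 174, 193),
        (17, 281, 174), (17, 174, 192), (17, 192, 56), (17, 174, 193),
        (17, 281, 283), (17, 283, 183), (17, 283, 277), (17, 283, 173),
    ],
)

hierarchy = conn.execute(
    """
    WITH RECURSIVE grouped AS (
        -- Collapse identical (parent, child) rows and remember how many there were.
        SELECT parentPartNumberID AS parent,
               partNumberID       AS child,
               COUNT(*)           AS qty
        FROM elements
        WHERE equipmentID = 17
        GROUP BY equipmentID, parentPartNumberID, partNumberID
    ),
    parts(parent, child, partlevel, hierarchy) AS (
        -- Anchor: the top-level assembly under parent 1.
        SELECT parent, child, 0,
               CAST(child AS TEXT) || CASE WHEN qty > 1 THEN 'x' || qty ELSE '' END
        FROM grouped
        WHERE parent = 1
        UNION ALL
        -- Recursive step: attach each child to its parent's path.
        SELECT g.parent, g.child, p.partlevel + 1,
               p.hierarchy || '.' || CAST(g.child AS TEXT)
                   || CASE WHEN g.qty > 1 THEN 'x' || g.qty ELSE '' END
        FROM grouped AS g
        JOIN parts AS p ON g.parent = p.child
    )
    SELECT parent, child, partlevel, hierarchy
    FROM parts
    ORDER BY hierarchy
    """
).fetchall()

for row in hierarchy:
    print(row)
```

Because the duplicates are collapsed before the recursion starts, every hierarchy string is unique and the ORDER BY has nothing ambiguous left to sort.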

Ordering row having same values in two columns on top

I have data similar to one given below:
ID UserID PlayerID Name
1 56 21 A
2 57 34 B
3 77 77 C
4 65 23 D
5 77 77 E
I want the rows with same value in UserID and PlayerID column to be at the top.
I have currently done this:
select * from tblTest
order by abs(UserID - PlayerID ) asc
Any better way to achieve this result?
Try this
SELECT * From tblTest
Order By Case When UserID = PlayerID Then 0 Else 1 End
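A quick way to check the CASE ordering is to run it on the sample rows, here with SQLite from Python as a minimal sketch. The secondary sort on ID is an addition for the demo so that the output order is deterministic; it is not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblTest (ID INT, UserID INT, PlayerID INT, Name TEXT)")
conn.executemany(
    "INSERT INTO tblTest VALUES (?, ?, ?, ?)",
    [
        (1, 56, 21, "A"),
        (2, 57, 34, "B"),
        (3, 77, 77, "C"),
        (4, 65, 23, "D"),
        (5, 77, 77, "E"),
    ],
)

# Matching rows get sort key 0, all others 1; ID breaks ties (demo assumption).
ordered = conn.execute(
    """
    SELECT ID, Name
    FROM tblTest
    ORDER BY CASE WHEN UserID = PlayerID THEN 0 ELSE 1 END, ID
    """
).fetchall()

print(ordered)
```

Compared with ordering by abs(UserID - PlayerID), the CASE expression only separates matching from non-matching rows and does not rank the remaining rows by how far apart the two ids are.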

Subselect and Max

Alright, I've been trying to conceptualize this for the better part of the afternoon and still cannot figure out how to structure this subselect.
The data that I need to report are ages for a given student major grouped by the past 3 fiscal years. Each fiscal year has 3 semesters (summer, fall, spring). I need to have my query grouped on the fiscalyear and agerange fields and then count the distinct student id's.
I currently have this for my SQL statement:
Select COUNT(distinct StuID), AgeRange, FiscalYear
from tblStatic
where Campus like 'World%' and (enrl_act like 'REG%' or enrl_act like 'SCH%')
and StuMaj = 'LAWSC' and FiscalYear IN ('09/10', '10/11', '11/12')
group by FiscalYear, AgeRange
order by FiscalYear, AgeRange
So this is all fine and dandy, except it doesn't match my headcount of students for the fiscal year. The reason is that a student may cross from one age range into the next during the fiscal year, and is then counted twice.
How can I use a subselect to resolve this double counting? The field I have been trying to get working is my semester field, using a max to find the latest semester during a fiscal year for a given student.
Data Sample:
Count AgeRange FiscalYear
3 1 to 19 09/10
20 20 to 23 09/10
60 24 to 29 09/10
96 30 to 39 09/10
34 40 to 49 09/10
14 50 to 59 09/10
3 60+ 09/10
2 1 to 19 10/11
24 20 to 23 10/11
73 24 to 29 10/11
109 30 to 39 10/11
43 40 to 49 10/11
11 50 to 59 10/11
2 60+ 10/11
1 1 to 19 11/12
17 20 to 23 11/12
75 24 to 29 11/12
123 30 to 39 11/12
44 40 to 49 11/12
14 50 to 59 11/12
2 60+ 11/12
Solution: (Just got this working, and it produced headcounts that match what they are supposed to be)
Select COUNT(distinct S.StuID), AR.AgeRange, S.FiscalYear
from tblStatic S
INNER JOIN
( Select S.StuID, MIN(AgeRange) as AgeRange
From tblStatic S
Group By S.StuID) AR on S.StuID=AR.StuID
where Campus like 'World%' and (enrl_act like 'REG%' or
enrl_act like 'SCH%')
and StuMaj = 'LAWSC' and FiscalYear IN ('09/10', '10/11', '11/12')
group by S.FiscalYear, AR.AgeRange
order by S.FiscalYear, AR.AgeRange
Replace each student's age range with its maximum (or minimum, if you like) age range that fiscal year, then count them:
;WITH sourceData AS (
SELECT
StuID,
MaxAgeRangeThisFiscalYear = MAX(AgeRange) OVER
(PARTITION BY StuID, FiscalYear),
FiscalYear
FROM tblStatic
WHERE Campus LIKE 'World%'
AND (enrl_act LIKE 'REG%' OR enrl_act LIKE 'SCH%')
AND StuMaj = 'LAWSC'
AND FiscalYear IN ('09/10', '10/11', '11/12')
)
SELECT
FiscalYear,
AgeRange = MaxAgeRangeThisFiscalYear,
Count = COUNT(DISTINCT StuID)
FROM sourceData
GROUP BY
FiscalYear,
MaxAgeRangeThisFiscalYear
ORDER BY
FiscalYear,
MaxAgeRangeThisFiscalYear
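The effect of pinning each student to a single age range per fiscal year can be checked on a tiny made-up dataset. The sketch below uses SQLite from Python and, as a deliberate substitution, a grouped subquery instead of the MAX(...) OVER window function; the two students and their rows are invented for the demo, and MAX on the AgeRange text only works here because these labels happen to sort correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblStatic (StuID INT, AgeRange TEXT, FiscalYear TEXT, Semester TEXT)"
)
# Hypothetical rows: student 1 crosses from '1 to 19' into '20 to 23' mid-year.
conn.executemany(
    "INSERT INTO tblStatic VALUES (?, ?, ?, ?)",
    [
        (1, "1 to 19", "09/10", "summer"),
        (1, "1 to 19", "09/10", "fall"),
        (1, "20 to 23", "09/10", "spring"),
        (2, "20 to 23", "09/10", "fall"),
        (2, "20 to 23", "09/10", "spring"),
    ],
)

# Naive grouping: student 1 is counted once in each age range.
naive = conn.execute(
    """
    SELECT FiscalYear, AgeRange, COUNT(DISTINCT StuID)
    FROM tblStatic
    GROUP BY FiscalYear, AgeRange
    ORDER BY FiscalYear, AgeRange
    """
).fetchall()

# Fixed: first reduce to one age range per student and fiscal year, then count.
fixed = conn.execute(
    """
    SELECT m.FiscalYear, m.AgeRange, COUNT(DISTINCT m.StuID)
    FROM (
        SELECT StuID, FiscalYear, MAX(AgeRange) AS AgeRange
        FROM tblStatic
        GROUP BY StuID, FiscalYear
    ) AS m
    GROUP BY m.FiscalYear, m.AgeRange
    ORDER BY m.FiscalYear, m.AgeRange
    """
).fetchall()

print(naive)
print(fixed)
```

The naive counts add up to three for two students, while the fixed query reports both students once in the 20 to 23 range.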