Self join to lowest occurrence of group

I have a T-SQL problem that I find difficult to solve.
I have a table of records grouped by key1 and key2, and I order each group chronologically by date. For each record, I want to check whether an earlier record exists (within the same group, with a lower date) whose "datafield" forms an allowed combination with the current record's "datafield". The allowed combinations are stored in a table called AllowedDataCombinationsTable.
I wrote the following code to achieve this:
WITH Source AS (
    SELECT key1, key2, datafield, date1,
           ROW_NUMBER() OVER(PARTITION BY key1, key2 ORDER BY date1 ASC) AS dateorder
    FROM table
)
SELECT L.key1, L.key2, L.datafield, DC.datafield2
FROM Source AS L
LEFT JOIN AllowedDataCombinationsTable DC
       ON D.datafield1 = L.datafield
LEFT JOIN Source AS R
       ON R.Key1 = L.Key1
      AND R.Key2 = L.Key2
      AND R.dateorder < L.dateorder
      AND DC.datafield2 = L.datafield
      -- AND "pick the one record with lowest dateorder"
Now for each of these possible combination records, I want to pick the first one (see placeholder in code). How can I do it most efficiently?
EDIT: OK let's say for the source, only showing group (1, 1):
Key1  Key2  Datafield  Date        DateOrder
--------------------------------------------
1     1     "Horse"    1-Jan-2010  1
1     1     "Horse"    2-Jan-2010  2
1     1     "Sheep"    3-Jan-2010  3
1     1     "Dog"      4-Jan-2010  4
1     1     "Cat"      5-Jan-2010  5
AllowedDataCombinationsTable:
Datafield1  Datafield2
----------------------
Cat         Sheep   (and Sheep Cat)
Cat         Horse   (and Horse Cat)
Dog         Horse   (and Horse Dog)
After my join I have now:
Key1  Key2  Datafield  Date        DateOrder  JoinedCombination  JoinedCombinationDateOrder
-------------------------------------------------------------------------------------------
1     1     "Horse"    1-Jan-2010  1          NULL               NULL
1     1     "Horse"    2-Jan-2010  2          NULL               NULL
1     1     "Sheep"    3-Jan-2010  3          NULL               NULL
1     1     "Dog"      4-Jan-2010  4          "Horse"            1
1     1     "Dog"      4-Jan-2010  4          "Horse"            2
1     1     "Cat"      5-Jan-2010  5          "Horse"            1
1     1     "Cat"      5-Jan-2010  5          "Horse"            2
1     1     "Cat"      5-Jan-2010  5          "Sheep"            3
I want to display only the first "Horse" for record 4 "Dog", and also only the first "Horse" for record 5 "Cat".
Get it? ;)

I think this may do it, though I don't have data set up to test the query. See the comments for the rationale.
WITH Source AS (
    SELECT key1, key2, datafield, date1,
           ROW_NUMBER() OVER(PARTITION BY key1, key2 ORDER BY date1 ASC) AS dateorder
    FROM table
)
SELECT L.key1, L.key2, L.datafield, DC.datafield2
FROM Source AS L
LEFT JOIN AllowedDataCombinationsTable DC
       ON DC.datafield1 = L.datafield    -- DC alias (was D)
LEFT JOIN Source AS R
       ON R.Key1 = L.Key1
      AND R.Key2 = L.Key2
      AND DC.datafield2 = R.datafield    -- Changed alias from L to R
      AND R.dateorder = 1                -- Only the group's first row (dateorder = 1) can match
      AND R.dateorder < L.dateorder      -- Make sure it's not the same one
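
The `R.dateorder = 1` condition above finds a partner only when that partner is the very first row of its group. As a rough sketch of one alternative, the query below keeps the earliest allowed earlier record per partner value by aggregating with MIN. It assumes the asker wants one row per (record, allowed partner) pair, compares date1 directly instead of a ROW_NUMBER column, and runs on SQLite through Python's sqlite3 purely so that it is self-contained; it has not been run on SQL Server. The sample rows are group (1, 1) from the question, with dates rewritten as ISO strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Source (key1 INT, key2 INT, datafield TEXT, date1 TEXT);
INSERT INTO Source VALUES
  (1, 1, 'Horse', '2010-01-01'),
  (1, 1, 'Horse', '2010-01-02'),
  (1, 1, 'Sheep', '2010-01-03'),
  (1, 1, 'Dog',   '2010-01-04'),
  (1, 1, 'Cat',   '2010-01-05');
CREATE TABLE AllowedDataCombinationsTable (datafield1 TEXT, datafield2 TEXT);
INSERT INTO AllowedDataCombinationsTable VALUES
  ('Cat', 'Sheep'), ('Sheep', 'Cat'),
  ('Cat', 'Horse'), ('Horse', 'Cat'),
  ('Dog', 'Horse'), ('Horse', 'Dog');
""")

# One output row per (record, allowed partner); MIN picks the earliest
# earlier record carrying that partner value, or NULL if there is none.
rows = conn.execute("""
SELECT L.datafield, L.date1, DC.datafield2, MIN(R.date1) AS first_partner_date
FROM Source AS L
LEFT JOIN AllowedDataCombinationsTable AS DC
       ON DC.datafield1 = L.datafield
LEFT JOIN Source AS R
       ON R.key1 = L.key1
      AND R.key2 = L.key2
      AND R.datafield = DC.datafield2
      AND R.date1 < L.date1
GROUP BY L.key1, L.key2, L.datafield, L.date1, DC.datafield2
ORDER BY L.date1, DC.datafield2
""").fetchall()

# Keep only the records that actually found an earlier partner.
matches = [row for row in rows if row[3] is not None]
for row in matches:
    print(row)
```

With the sample data, "Dog" and "Cat" are each paired with the first "Horse" only, and "Cat" is additionally paired with "Sheep".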

Well, I don't use WITH or OVER, so this is a different approach. I might be oversimplifying something, but without the data in front of me, this is what I came up with:
SELECT DISTINCT a.Key1, a.Key2, a.Datafield,
       ISNULL(b.Datafield, '') AS Datafield1,
       ISNULL(b.Date, a.Date) AS [Date],   -- T-SQL delimits identifiers with brackets, not backticks
       MIN(a.DateOrder) AS DateOrder
FROM Source a
LEFT JOIN Source b
       ON a.Key1 = b.Key1
      AND a.Key2 = b.Key2
      AND a.DateOrder <> b.DateOrder
LEFT JOIN AllowedDataCombinationsTable c
       ON a.Datafield = c.datafield2
      AND b.Datafield = c.datafield1
GROUP BY a.Key1, a.Key2, a.Datafield, ISNULL(b.Datafield, ''), ISNULL(b.Date, a.Date)

Related

How to convert timestamp to numbers

Suppose I have a table like this:
Id  Types  Timestamp
1   A      2014-02-04 00:00:00
2   A      2014-02-05 00:00:00
1   A      2014-02-05 03:59:00
3   C      2014-05-06 03:59:00
1   B      2014-02-04 03:00:00
2   D      2014-02-05 00:40:00
I would like the output to look like this:
Id  1  2     3     4     5     etc
1   A  B     A     C     D     ...
2   A  D     NULL  NULL  NULL
3   C  NULL  NULL  NULL  NULL
Is it possible to have the timestamp determine the order of the types?
Thanks for any hints.

Preliminary comments:
A SQL query can only return a predefined number of columns. IMHO, the best you can get is the values concatenated into an array.
I have named your input table MyTable and renamed the column Timestamp to MyTimestamp to avoid a conflict with the keyword of the corresponding type.
You have put C and D in the first row of your output. I will treat that as a typo (they do not belong to ID = 1).
WITH RECURSIVE ConcatAndOrder(ID, MyResult, RowNumForOrder, RowCountForOrder) AS (
    -- Anchor: the earliest row of each ID starts the array
    SELECT ID, ARRAY[Type], RowNumForOrder, RowCountForOrder
    FROM IndexedTable
    WHERE RowNumForOrder = 1
    UNION ALL
    -- Step: append the next row (by timestamp) of the same ID
    SELECT I.ID, MyResult || I.Type, I.RowNumForOrder, I.RowCountForOrder
    FROM IndexedTable I
    JOIN ConcatAndOrder C ON I.ID = C.ID AND I.RowNumForOrder = C.RowNumForOrder + 1
), IndexedTable(ID, Type, RowNumForOrder, RowCountForOrder) AS (
    SELECT ID, Types,   -- the source column is called Types in the question
           row_number() OVER (PARTITION BY ID ORDER BY MyTimestamp),
           count(*) OVER (PARTITION BY ID)
    FROM MyTable
)
SELECT ID, MyResult
FROM ConcatAndOrder
WHERE RowNumForOrder = RowCountForOrder   -- keep only the complete array per ID
ORDER BY ID
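
As a cross-check on the expected result, here is a minimal sketch of a simpler, client-side variant: the database only sorts by Id and timestamp, and the per-Id lists are assembled in application code. This is a swapped-in technique, not the recursive CTE above, and it uses SQLite through Python's sqlite3 only to stay self-contained. The table and column names follow the answer's renaming (MyTable, MyTimestamp).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable (Id INT, Types TEXT, MyTimestamp TEXT);
INSERT INTO MyTable VALUES
  (1, 'A', '2014-02-04 00:00:00'),
  (2, 'A', '2014-02-05 00:00:00'),
  (1, 'A', '2014-02-05 03:59:00'),
  (3, 'C', '2014-05-06 03:59:00'),
  (1, 'B', '2014-02-04 03:00:00'),
  (2, 'D', '2014-02-05 00:40:00');
""")

# Let the database do the ordering; ISO timestamps sort correctly as text.
rows = conn.execute(
    "SELECT Id, Types FROM MyTable ORDER BY Id, MyTimestamp"
).fetchall()

# Collect the types of each Id in timestamp order.
ordered = {}
for id_, type_ in rows:
    ordered.setdefault(id_, []).append(type_)

print(ordered)  # {1: ['A', 'B', 'A'], 2: ['A', 'D'], 3: ['C']}
```

The list position of each type plays the role of the numbered columns in the desired output.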

PostgreSQL get all possible origin/destination node sequence

I have this PostgreSQL table with the nodes of a directed graph:
node_id | node_sequence
-----------------------
1       | 1
2       | 2
3       | 3
I'd like to return a table with all possible origin/destination sequences (in one direction only) between the nodes:
(1,2); (1,2,3); (2,3). So the output table should be:
node_id
----
1
2
1
2
3
2
3
Maybe WITH RECURSIVE is the right tool, but I cannot work out how to use it.

Edit from initial answer:
You seem to have 2 constraints you do not mention in your question:
You want sequences of at least 2 elements
Elements in a sequence must be in ascending order and consecutive
Here is a simple query that does it (CTE GraphNode should be replaced with your table):
WITH RECURSIVE GraphPath AS (
    SELECT G2.Node, ARRAY[G1.Node, G2.Node] AS GraphPath /* Start with 2 elements */
    FROM GraphNode G1
    JOIN GraphNode G2 ON G1.Node + 1 = G2.Node
    UNION ALL
    SELECT N.Node, P.GraphPath || N.Node
    FROM GraphNode N
    JOIN GraphPath P ON N.Node = 1 + P.Node
), GraphNode AS (
    SELECT UNNEST(ARRAY[1,2,3]) AS Node
)
SELECT GraphPath
FROM GraphPath
ORDER BY GraphPath
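
To make the recursion easier to follow, here is a minimal runnable sketch of the same idea on SQLite via Python's sqlite3. SQLite has no array type, so the path is built as a comma-separated string instead of a PostgreSQL array; that substitution, the GraphNode table definition, and the column names last_node and path are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GraphNode (node INTEGER PRIMARY KEY);
INSERT INTO GraphNode VALUES (1), (2), (3);
""")

paths = [row[0] for row in conn.execute("""
WITH RECURSIVE GraphPath(last_node, path) AS (
    -- Anchor: every pair of consecutive nodes is a 2-element path
    SELECT g2.node, g1.node || ',' || g2.node
    FROM GraphNode g1
    JOIN GraphNode g2 ON g2.node = g1.node + 1
    UNION ALL
    -- Step: extend each path by the node that follows its last node
    SELECT n.node, p.path || ',' || n.node
    FROM GraphNode n
    JOIN GraphPath p ON n.node = p.last_node + 1
)
SELECT path FROM GraphPath ORDER BY path
""")]

print(paths)  # ['1,2', '1,2,3', '2,3']
```

Each recursion step only needs the last node of the path so far, which is why the CTE carries it alongside the accumulated path.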

aggregate function to keep specific value, depending of other columns

I have data of the following format
id_A    id_B    val
--------------------------------
1       1       1
1       2       2
2       1       3
2       3       4
Is there a nice way to group by id_A while keeping the value of the line where id_A = Id_B ?
The reason I need to aggregate is that if there is no such line, I want the average.
The result should look like this:
id_A val
-----------------
1 1
2 3.5
I've come up with the following, but that case looks ugly and hacky to me.
Select id_A,
       Coalesce(
           avg(case when id_A = id_B then val else null end),
           avg(val)
       ) as value
From myTable
Group by id_A;

With PostgreSQL 9.4+ you can use the FILTER clause for aggregates and window functions. Something like this:
Select id_A,
       Coalesce(
           avg(val) filter(where id_A = id_B),
           avg(val)
       ) as value
From myTable
Group by id_A;
Details here: http://www.postgresql.org/docs/current/static/sql-expressions.html
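
As a quick sanity check, the sketch below runs the CASE form and the FILTER form on the question's sample rows and compares the results. It uses SQLite through Python's sqlite3 only to be self-contained, not PostgreSQL; SQLite accepts FILTER on aggregates from version 3.30, so on older builds the sketch falls back to the CASE result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE myTable (id_A INT, id_B INT, val REAL);
INSERT INTO myTable VALUES (1, 1, 1), (1, 2, 2), (2, 1, 3), (2, 3, 4);
""")

case_sql = """
SELECT id_A,
       COALESCE(AVG(CASE WHEN id_A = id_B THEN val END), AVG(val)) AS value
FROM myTable GROUP BY id_A ORDER BY id_A
"""
filter_sql = """
SELECT id_A,
       COALESCE(AVG(val) FILTER (WHERE id_A = id_B), AVG(val)) AS value
FROM myTable GROUP BY id_A ORDER BY id_A
"""

case_rows = conn.execute(case_sql).fetchall()
# FILTER on aggregates needs SQLite 3.30+; reuse the CASE result otherwise.
if sqlite3.sqlite_version_info >= (3, 30, 0):
    filter_rows = conn.execute(filter_sql).fetchall()
else:
    filter_rows = case_rows

print(case_rows)  # [(1, 1.0), (2, 3.5)]
```

Both forms rely on AVG ignoring NULLs: when no row has id_A = id_B, the first average is NULL and COALESCE falls through to the plain average.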

TSQL Update value to 1 if Max value between two table else 0

I have two tables.
TABLE 1 : Stage_product
PRODUCT_ID  SYS_ROWDATETIMEUTC
1           2015-03-13 06:09:30.040
2           ....
3
TABLE 2 : DIM_Product
PRODUCT_ID  SYS_ROWSTARTDATETIMEUTC  SYS_ROWISCURRENT
1           2014-03-13 06:09:30.040  0
2           2015-03-13 06:09:30.040  1
I want to write an UPDATE statement such that, if the value of SYS_ROWDATETIMEUTC in the first table is more recent than the value of SYS_ROWSTARTDATETIMEUTC in the second table, SYS_ROWISCURRENT in the second table is set to 0, and otherwise to 1.

You can use the following query:
UPDATE t2
SET t2.SYS_ROWISCURRENT = CASE
        WHEN t1.SYS_ROWDATETIMEUTC > t2.SYS_ROWSTARTDATETIMEUTC THEN 0
        ELSE 1
    END
FROM DIM_Product t2
INNER JOIN Stage_product t1 ON t2.PRODUCT_ID = t1.PRODUCT_ID
I assume you want to compare dates between the two tables for the same product.
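
The same rule can be tried out with the minimal sketch below, which runs on SQLite through Python's sqlite3 so that it is self-contained. SQLite's UPDATE syntax differs from T-SQL, so the join is written as a correlated subquery plus an EXISTS filter; this is an equivalent formulation under the assumption of one staging row per product, not the T-SQL statement itself. The staging timestamp for product 2 is elided in the question, so the value used here is hypothetical, and the starting flags are deliberately set wrong so that the update is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Stage_product (PRODUCT_ID INT, SYS_ROWDATETIMEUTC TEXT);
CREATE TABLE DIM_Product (
    PRODUCT_ID INT,
    SYS_ROWSTARTDATETIMEUTC TEXT,
    SYS_ROWISCURRENT INT
);
-- Product 2's staging timestamp is a hypothetical value.
INSERT INTO Stage_product VALUES
  (1, '2015-03-13 06:09:30.040'),
  (2, '2015-03-13 06:09:30.040');
INSERT INTO DIM_Product VALUES
  (1, '2014-03-13 06:09:30.040', 1),
  (2, '2015-03-13 06:09:30.040', 0);
""")

# 0 when the staging row is newer than the dimension row, else 1.
# Only products that exist in Stage_product are touched (inner-join semantics).
conn.execute("""
UPDATE DIM_Product
SET SYS_ROWISCURRENT = CASE
        WHEN (SELECT s.SYS_ROWDATETIMEUTC
              FROM Stage_product s
              WHERE s.PRODUCT_ID = DIM_Product.PRODUCT_ID)
             > SYS_ROWSTARTDATETIMEUTC THEN 0
        ELSE 1
    END
WHERE EXISTS (SELECT 1 FROM Stage_product s
              WHERE s.PRODUCT_ID = DIM_Product.PRODUCT_ID)
""")

flags = conn.execute(
    "SELECT PRODUCT_ID, SYS_ROWISCURRENT FROM DIM_Product ORDER BY PRODUCT_ID"
).fetchall()
print(flags)  # [(1, 0), (2, 1)]
```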

How to sum items from subtable in SQL

Let's say I have a table orders
id  name
1   order1
2   order2
3   order3
and a subtable items
id  parent  amount  price
1   1       1       10
2   1       3       20
3   2       2       5
4   2       5       1
I would like to write a query that returns each order with an added column, value, calculated from all of the order's items (amount * price):
id  name    value
1   order1  70
2   order2  15
3   order3  0
Is this possible with T-SQL?

GROUP BY and SUM will do it. You need a LEFT JOIN and ISNULL because not every order has items.
SELECT o.id, o.name, ISNULL(SUM(i.amount * i.price), 0) AS value
FROM orders o
LEFT JOIN items i
       ON o.id = i.parent
GROUP BY o.id, o.name
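
The query above can be checked against the question's sample data with the minimal sketch below. It runs on SQLite through Python's sqlite3 to stay self-contained, so ISNULL is replaced by the standard COALESCE; everything else follows the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INT, name TEXT);
CREATE TABLE items (id INT, parent INT, amount INT, price INT);
INSERT INTO orders VALUES (1, 'order1'), (2, 'order2'), (3, 'order3');
INSERT INTO items VALUES (1, 1, 1, 10), (2, 1, 3, 20), (3, 2, 2, 5), (4, 2, 5, 1);
""")

# LEFT JOIN keeps order3, which has no items; COALESCE turns its NULL sum into 0.
totals = conn.execute("""
SELECT o.id, o.name, COALESCE(SUM(i.amount * i.price), 0) AS value
FROM orders o
LEFT JOIN items i ON o.id = i.parent
GROUP BY o.id, o.name
ORDER BY o.id
""").fetchall()

print(totals)  # [(1, 'order1', 70), (2, 'order2', 15), (3, 'order3', 0)]
```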

I think you're looking for something like this:
SELECT o.name, ISNULL(i.Value, 0) AS Value
FROM orders o WITH (NOLOCK)
LEFT JOIN (
    SELECT parent, SUM(amount * price) AS Value   -- an item's value is amount * price
    FROM items WITH (NOLOCK)
    GROUP BY parent
) i ON o.id = i.parent
...seems like RADAR beat me to the answer.
EDIT: added the missing ON line.
