Greatest N per group in Open SQL - greatest-n-per-group

Selecting the rows from a table by (partial) key with the maximum value in a particular column is a common task in SQL. This question has some excellent answers that cover a variety of approaches to it. Unfortunately I'm struggling to replicate this in my ABAP program.
None of the commonly used approaches seem to be supported:
Joining on a subquery is not supported in syntax: SELECT * FROM X as x INNER JOIN ( SELECT ... ) AS y
Using IN for a composite key is not supported in syntax as far as I know: SELECT * FROM X WHERE (key1, key2) IN ( SELECT key1 key2 FROM ... )
Left join to itself with smaller-than comparison is not supported, outer joins only support EQ comparisons: SELECT * FROM X AS x LEFT JOIN X as xmax ON x-key1 = xmax-key1 AND x-key2 < xmax-key2 WHERE xmax-key IS INITIAL
After trying each of these solutions in turn only to discover that ABAP doesn't seem to support them and being unable to find any equivalents I'm starting to think that I'll have no choice but to dump the data of the subquery to an itab.
What is the best practice for this common programming requirement in ABAP development?

First of all, specific requirement, would give you a better answer. As it happens I bumped into this question when working on a program, that uses 3 distinct methods of pseudo-grouping, (while looking for alternatives) and ALL 3 can be used to answer your question, depending on what exactly you need to do. I'm sure there are more ways to do it.
For instance, you can pull maximum values within a group by simply selecting max( your_field ) and grouping by some fields, if that's all you need.
select bname, nation, max( date_from ) from adrp group by bname, nation. "selects highest "from" date for each bname
If you need to use that max value as a filter condition within a query, you can do it by performing pseudo-grouping using sub-query and max within sub-query like this (notice how I move out the BNAME check into sub query, which means I don't have to check both fields using in (subquery) addition):
select ... from adrp as b_adrp "Pulls the latest person info for a user (some conditions are missing, but this is a part of an actual query)
where b_adrp~date_from in (
select max( date_from ) "Highest date_from where both dates are valid
from adrp where persnumber = b_adrp~persnumber and nation = b_adrp~nation and date_from <= #sy-datum )
The query above allows you to select selects all user info from base query and (where the first one only allows to take aggregated and grouped data).
Finally, If you need to check based on composite key and compare it to multiple agregate function results, the implementation will heavily depend on specifics of your requirement (and since your question has none, I'll provide a generic one). Easiest option is to use exists / not exists instead of in (subquery), in exact same way and form the subquery to check for existance of specific key or condition rather than pull a list ( you can nest subqueries if you have to ):
select * from bkpf where exists ( select 1 from bkpf as b where belnr = bkpf~belnr and gjahr = bkpf~gjahr group by belnr, gjahr having max( budat ) = bkpf~budat ) "Took an available example, that I had in testing program.
All 3 queries will get you max value of a column within a group and in fact, all 3 can use joins to achieve identical results.

please find my answers below your questions.
Joining on a subquery is not supported in syntax: SELECT * FROM X as x INNER JOIN ( SELECT ... ) AS y
Putting the subquery in your where condition should do the work SELECT * FROM X AS x INNER JOIN Y AS y ON x~a = y~b WHERE ( SELECT * FROM y WHERE ... )
Using IN for a composite key is not supported in syntax as far as I know: SELECT * FROM X WHERE (key1, key2) IN ( SELECT key1 key2 FROM ... )
You have to split your WHERE clause: SELECT * FROM X WHERE key1 IN ( SELECT key1 FROM y ) AND key2 IN ( SELECT key2 FROM y )
Left join to itself with smaller-than comparison is not supported, outer joins only support EQ comparisons.
Yes, thats right at the moment.

Left join to itself with smaller-than comparison is not supported, outer joins only support EQ comparisons:
SELECT * FROM X AS x LEFT JOIN X as xmax ON x-key1 = xmax-key1 AND x-key2 < xmax-key2 WHERE xmax-key IS INITIAL
This is not true. This SELECT is perfectly valid:
SELECT b1~budat
INTO TABLE lt_bkpf
FROM bkpf AS b1
LEFT JOIN bkpf AS b2
ON b2~belnr < b1~belnr
WHERE b1~bukrs <> ''.
And was valid at least since 7.40 SP08, since July 2013, so at the time you asked this question it was valid as well.

Related

SQL Server: How to customize one column in Order By clause out of many other columns in order by

I have store procedure which return data fine and it was developed by some one else who now not in touch.
Output now looks like
Here i am attaching a part of the query which return data.
SET #sql = '
Select XX.*,'''' scale,Isnull(AllowComma,''FALSE'') AllowComma,Isnull(AllowedDecimalPlace,''0'') AllowedDecimalPlace,
Isnull(AllowPercentageSign,''FALSE'') AllowPercentageSign,Isnull(CurrencySign,'''') CurrencySign,Isnull(BM_Denominator,'''') BM_Denominator
From
(
---- Broker Detail
Select AA.Section,AA.LineItem,Csm.DisplayInCSM ,AA.BrokerCode Broker,AA.BrokerName,'''' BM_Element,'''' BM_Code,AA.Ord,AA.[Revise Date],AA.LineItemId,
Csm.ID,[FontName],[FontStyle],[FontSize],[UnderLine],[BGColor],[FGColor],[Indent],[Box],[HeadingSubHeading],
'+#PeriodCols+','+#PeriodColsComment +',LineItem_Comment,BrokerName_Comment,Date_Comment
From tblCSM_ModelDetails Csm LEFT OUTER JOIN (
Select b.*,L.ID LineItemId
From #TmpAll_Broker_LI b
INNER JOIN TblLineItemTemplate L ON TickerID='''+#TickerID+''' AND b.LineItem= L.LineItem
) AA ON Csm.LineItemId=AA.LineItemId
WHERE Csm.CSM_ID='+TRIM(CONVERT(CHAR(10),#CSM_Id))+' AND Csm.BMID=0 AND Type !=''SHEET''
UNION
----- Consensus
Select Section, b.LineItem,DisplayInCSM, '''' Broker,'''' BrokerName,'''' BM_Element,'''' BM_Code, Ord,'''' [Revise Date],L.ID LineItemID,
Csm.ID,[FontName],[FontStyle],[FontSize],[UnderLine],[BGColor],[FGColor],[Indent],[Box],[HeadingSubHeading],
'+#PeriodCols+','+#PeriodColsComment +',LineItem_Comment,BrokerName_Comment,Date_Comment
From #TmpZacksCons b
INNER JOIN TblLineItemTemplate L ON TickerID='''+#TickerID+''' AND b.LineItem= L.LineItem
INNER JOIN tblCSM_ModelDetails Csm ON Csm.LineItemID=L.ID
WHERE Csm.CSM_ID='+TRIM(CONVERT(CHAR(10),#CSM_Id))+' AND Csm.BMID=0
---- Blue Metrics
UNION
Select Section, b.LineItem,DisplayInCSM,'''' Broker,'''' BrokerName,BM_Element,Code BM_Code, Ord,'''' [Revise Date],L.ID LineItemID,
Csm.ID,[FontName],[FontStyle],[FontSize],[UnderLine],[BGColor],[FGColor],[Indent],[Box],[HeadingSubHeading],
'+#PeriodCols+','+#PeriodColsComment +',LineItem_Comment,BrokerName_Comment,Date_Comment
From #TmpBM b
INNER JOIN TblLineItemTemplate L ON TickerID='''+#TickerID+''' AND b.LineItem= L.LineItem
INNER JOIN tblCSM_ModelDetails Csm ON Csm.BMID=b.code AND Csm.LineItemID=L.ID
WHERE Csm.CSM_ID='+TRIM(CONVERT(CHAR(10),#CSM_Id))+'
AND Ord IS NOT NULL
) XX
Left Outer Join tblLiConfig ZZ
On XX.Section=ZZ.Section And XX.LineItem=ZZ.LI And ZZ.Ticker='''+#Ticker+'''
Order by ID,Ord,BM_Code,LineItem,BrokerName'
Now broker Name is not coming as alphabetical order and it is the issue.
see this line at the bottom Order by ID,Ord,BM_Code,LineItem,BrokerName
When i try to change this order by like Order by ID,Ord,BM_Code,LineItem,BrokerName IN (SELECT BrokerName FROM #Brokers ORDER BY BrokerName ASC)' then getting error like clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP, OFFSET or FOR XML is also specified.
in my order by there are many columns and data is getting order by that way but i need to show broker name in alphabetical order but i am not being able. so please some one guide me how can i customize this sql.
Here i have not attached my full store procedure code because it is very large. looking for suggestion & help. Thanks
Short version
The ORDER BY is doing what is expected - ordering first by ID, then Ord, then BM_Code, then LineItem, then BrokerName.
Within ID 76187, the next field to order by is Ord - which it sorts from 30911, to 31097.
If it previously ordered by BrokerName, it was only by chance - or that Ord was ordered the same way as BrokerName.
My initial suggestion is to re-order your sort e.g., ORDER BY ID, BM_Code, LineItem, BrokerName, Ord
Longer explanation of issue
In SQL, underlying data is treated as a set and ordering doesn't matter.
For example, if you have a variable #x and you were testing IF #x IN (1,2,3,4,5) will produce the same result as IF #x in (5,4,3,2,1).
In your example, you're putting an ORDER BY into the sub-query you're checking with the IN e.g., ORDER BY ... BrokerName IN (SELECT BrokerName FROM #Brokers ORDER BY BrokerName ASC). The order of that sub-query isn't allowed, and wouldn't do anything anyway.
The only sort that matters (other than for a few things like TOP) is the final sort - when displaying the data.
That being said, even if you removed the ORDER BY in the sub-query, it wouldn't help you with your issue
The SQL is not likely to work anyway - ORDER BY needs a value - you may have needed to make it CASE WHEN BrokerName IN (...) THEN 0 ELSE 1 END
Which also won't help, as the issue is that Ord is sorted before BrokerName anyway.
UPDATE following comment
Fundamentally, the statement that provides the actual report is
SET #sql = '
Select XX.*,'''' scale,Isnull(AllowComma,''FALSE'') AllowComma,Isnull(AllowedDecimalPlace,''0'') AllowedDecimalPlace,
Isnull(AllowPercentageSign,''FALSE'') AllowPercentageSign,Isnull(CurrencySign,'''') CurrencySign,Isnull(BM_Denominator,'''') BM_Denominator
From (<a whole lot of calculations/cpode>) XX
Left Outer Join tblLiConfig ZZ
On XX.Section=ZZ.Section And XX.LineItem=ZZ.LI And ZZ.Ticker='''+#Ticker+'''
Order by ID,Ord,BM_Code,LineItem,BrokerName'
The last line on there provides the ordering of the data coming from this procedure.
To get a different order, you need to change the order of the fields shown - moving BrokerName more towards the start of the list, and Ord towards the end.
e.g.,
SET #sql = '
Select XX.*,'''' scale,Isnull(AllowComma,''FALSE'') AllowComma,Isnull(AllowedDecimalPlace,''0'') AllowedDecimalPlace,
Isnull(AllowPercentageSign,''FALSE'') AllowPercentageSign,Isnull(CurrencySign,'''') CurrencySign,Isnull(BM_Denominator,'''') BM_Denominator
From (<a whole lot of calculations/cpode>) XX
Left Outer Join tblLiConfig ZZ
On XX.Section=ZZ.Section And XX.LineItem=ZZ.LI And ZZ.Ticker='''+#Ticker+'''
Order by ID,BrokerName,BM_Code,LineItem,Ord'
The above probably sorts by BrokerName too early - but it's up to you to determine what you need.

PostgreSQL array_agg order for window functions

The answer to my question was almost here: PostgreSQL array_agg order
Except that I wanted to array_agg over a window function:
select distinct c.concept_name,
array_agg(c2.vocabulary_id||':'||c2.concept_name
order by c2.vocabulary_id, c2.concept_name)
over (partition by ca.min_levels_of_separation),
ca.min_levels_of_separation
from concept c
join concept_ancestor ca on c.concept_id = ca.descendant_concept_id
and max_levels_of_separation > 0
join concept c2 on ca.ancestor_concept_id = c2.concept_id
where
c.concept_code = '44054006'
order by min_levels_of_separation;
So, maybe this will work in some future version, but I get this error
ERROR: aggregate ORDER BY is not implemented for window functions
LINE 2: select distinct c.concept_name, array_agg(c2.vocabulary_id||...
^
I should probably be selecting from a subquery like the first answer to the quoted question above suggests. I was hoping for something as simple as the order by (in that question's second answer). Or maybe I'm just being lazy about the query and should be doing a group by instead of select distinct.
I did try putting the order by in the windowing function (over (partition by ca.min_levels_of_separation order by c2.vocabulary_id, c2.concept_name)), but I get these sort of repeated rows that way:
"Type 2 diabetes mellitus";"{"MedDRA:Diabetes mellitus"}";1
"Type 2 diabetes mellitus";"{"MedDRA:Diabetes mellitus","MedDRA:Diabetes mellitus (incl subtypes)"}";1
"Type 2 diabetes mellitus";"{"MedDRA:Diabetes mellitus","MedDRA:Diabetes mellitus (incl subtypes)","SNOMED:Diabetes mellitus"}";1
(btw: http://www.ohdsi.org/ if you happen to be curious about where I got the medical vocabulary tables)
Yes, it does look like I was being muddle-headed and didn't need the window function. This seems to work:
select c.concept_name,
array_agg(c2.vocabulary_id||':'||c2.concept_name
order by c2.vocabulary_id, c2.concept_name),
ca.min_levels_of_separation
from concept c
join concept_ancestor ca on c.concept_id = ca.descendant_concept_id
and max_levels_of_separation > 0
join concept c2 on ca.ancestor_concept_id = c2.concept_id
where c.concept_code = '44054006'
group by c.concept_name, ca.min_levels_of_separation
order by min_levels_of_separation
I won't accept my answer for a while since it just avoids the question instead of actually answering it, and someone might have something more useful to say on the matter.
like this :
select distinct c.concept_name,
array_agg(c2.vocabulary_id||':'||c2.concept_name ) over (partition by ca.min_levels_of_separation order by c2.vocabulary_id, c2.concept_name),
ca.min_levels_of_separation
from concept c
join concept_ancestor ca on c.concept_id = ca.descendant_concept_id
and max_levels_of_separation > 0
join concept c2 on ca.ancestor_concept_id = c2.concept_id
where
c.concept_code = '44054006'
order by min_levels_of_separation;

OrientDB SQL Check if multiple pairs of vertices are connected

I haven't been able to find an answer for the SQL for this.
Given pairs of vertices (record ids) and edge types between them, I want to check if all pairs exists.
V1 --E1--> V2
V3 --E2--> V4
... and so on. The answer I want is true / false or something equivalent. ALL connections must be present in order to evaluate to true, so at least one edge (of correct type) must exist for each pair.
Pseudo, the question would be:
Does V1 have edge <E1EdgeType> to V2?
AND
Does V3 have edge <E2EdgeType> to V4?
AND
... and so on
Does anyone know what the orientDB SQL would be to achieve this?
UPDATE
I did already have one way of checking if one single edge exists between known vertices. It's perhaps not very pretty either, but it works:
SELECT FROM (
SELECT EXPAND(out('TestEdge')) FROM #12:0
) WHERE #rid=#12:1
This will return the destination record (#12:0) if an edge of type 'TestEdge' exists from #12:0 to #12:1. However, if I have two of those, how can I query for one single result for both queries. Something like:
SELECT <something with $c> LET
$a = (SELECT FROM (SELECT EXPAND(out('TestEdge')) FROM #12:0) WHERE #rid=#12:1)
$b = (SELECT FROM (SELECT EXPAND(out('AnotherTestEdge')) FROM #12:2) WHERE #rid=#12:3)
$c = <something that checks that both a and b yield results>
That's what I aim towards doing. Please tell me if I'm solving this the wrong way. I'm not even sure what the gain is to merge queries like this compared to just repeat queries.
Given a pair of vertices, say #11:0 and #12:0, the following query will effectively check whether there is an edge of type E from #11:0
to #12:0
select from (select #this, out(E) from #11:0 unwind out) where out = #12:0
----+------+-----+-----
# |#CLASS|this |out
----+------+-----+-----
0 |null |#11:0|#12:0
----+------+-----+-----
This is highly inelegant and I would encourage you to think about formulating an enhancement request accordingly at https://github.com/orientechnologies/orientdb/issues
One way to incorporate the boolean tests you have in mind is illustrated by the following:
select from
(select $a.size() as a, $b.size() as b
let a=(select count(*) as e from (select out(E) from #11:0 unwind out)
where out = #12:0),
b=(select count(*) as e from (select out(E) from #11:1 unwind out)
where out = #12:2))
where a > 0 and b > 0
Yes, inelegance again :-(
It might be useful to you the following query
SELECT eval('sum($a.size(),$b.size())==2') as existing_edges
let $a = ( SELECT from TestEdge where out = #12:0 and in = #12:1 limit 1),
$b = ( SELECT from AnotherTestEdge where out = #12:2 and in = #12:3 limit 1)
Hope it helps.

How to use GROUP BY with Firebird?

I'm trying create a SELECT with GROUP BY in Firebird but I can't have any success. How could I do this ?
Exception
Can't format message 13:896 -- message file C:\firebird.msg not found.
Dynamic SQL Error.
SQL error code = -104.
Invalid expression in the select list (not contained in either an aggregate function or the GROUP BY clause).
(49,765 sec)
trying
SELECT FA_DATA, FA_CODALUNO, FA_MATERIA, FA_TURMA, FA_QTDFALTA,
ALU_CODIGO, ALU_NOME,
M_CODIGO, M_DESCRICAO,
FT_CODIGO, FT_ANOLETIVO, FT_TURMA
FROM FALTAS Falta
INNER JOIN ALUNOS Aluno ON (Falta.FA_CODALUNO = Aluno.ALU_CODIGO)
INNER JOIN MATERIAS Materia ON (Falta.FA_MATERIA = Materia.M_CODIGO)
INNER JOIN FORMACAOTURMAS Turma ON (Falta.FA_TURMA = Turma.FT_CODIGO)
WHERE (Falta.FA_CODALUNO = 238) AND (Turma.FT_ANOLETIVO = 2015)
GROUP BY Materia.M_CODIGO
Simple use of group by in firebird,group by all columns
select * from T1 t
where t.id in
(SELECT t.id FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id)
Using GROUP BY doesn't make sense in your example code. It is only useful when using aggregate functions (+ some other minor uses). In any case, Firebird requires you to specify all columns from the SELECT column list except those with aggregate functions in the GROUP BY clause.
Note that this is more restrictive than the SQL standard, which allows you to leave out functionally dependent columns (ie if you specify a primary key or unique key, you don't need to specify the other columns of that table).
You don't specify why you want to group (because it doesn't make much sense to do it with this query). Maybe instead you want to ORDER BY, or you want the first row for each M_CODIGO.

PostgreSQL - select the results of two subqueries

I have 2 complex queries that are both subqueries in postgres, the results of which are:
q1_results = id , delta , metric_1
q2_results = id , delta , metric_2
i'd like to combine the results of the queries, so the outer query can access either:
results_a = id , delta , metric_1 , metric_2
results_b = id , delta , combined_metric
i can't figure out how to do this. online searches keep leading me to UNION , but that keeps the metrics in the same column. i need to keep them split.
It's not entirely clear what you're asking in the question and the comments, but it sounds like you might be looking for a full join with a bunch of coalesce statements, e.g.:
-- create view at your option, e.g.:
-- create view combined_query as
select coalesce(a.id, b.id) as id,
coalesce(a.delta, b.delta) as delta,
a.metric1 as metric1,
b.metric2 as metric2,
coalesce(a.metric1,0) + coalesce(b.metric2,0) as combined
from (...) as results_a a
full join (...) as results_b b on a.id = b.id -- and a.delta = b.delta maybe?