Query table by a value in the second dimension of a two dimensional array column - postgresql

WHAT I HAVE
I have a table with the following definition:
CREATE TABLE "Highlights"
(
id uuid,
chunks numeric[][]
)
WHAT I NEED TO DO
I need to query the data in the table using the following predicate:
... WHERE id = 'some uuid' and chunks[????????][1] > 10 chunks[????????][3] < 20
What should I put instead of [????????] in order to scan all items in the first dimension of the array?
Notes
I'm not entirely sure that chunks[][1] even close to something I need.
All I need is to test a row, whether its chunks column contains a two dimensional array, that has in any of its tuples some specific values.

May be there's better alternative, but this might do - you just go over first dimension of each array and testing your condition:
select *
from highlights as h
where
exists (
select
from generate_series(1, array_length(h.chunks, 1)) as tt(i)
where
-- your condition goes here
h.chunks[tt.i][1] > 10 and h.chunks[tt.i][3] < 20
)
db<>fiddle demo
update as #arie-r pointed out, it'd be better to use generate_subscripts function:
select *
from highlights as h
where
exists (
select *
from generate_subscripts(h.chunks, 1) as tt(i)
where
h.chunks[tt.i][3] = 6
)
db<>fiddle demo

Related

Redshift how to split a stringified array into separate parts

Say I have a varchar column let's say religions that looks like this: ["Christianity", "Buddhism", "Judaism"] (yes it has a bracket in the string) and I want the string (not array) split into multiple rows like "Christianity", "Buddhism", "Judaism" so it can be used in a WHERE clause.
Eventually I want to use the results of the query in a where clause like this:
SELECT ...
FROM religions
WHERE name in
(
<this subquery>
)
How can one do this?
You can use the function JSON_PARSE to convert the varchar string into an array. Then you can use the strategy described in Convert varchar array to rows in redshift - Stack Overflow to convert the array to separate rows.
You can do the following.
Create a temporary table with sequence of numbers
Using the sequence and split_part function available in redshift, you can split the values based on the numbers generated in the temporary table by doing a cross join.
To replace the double quote and square brackets, you can use the regexp_replace function in Redshift.
create temp table seq as
with recursive numbers(NUMBER) as
(
select 1 UNION ALL
select NUMBER + 1 from numbers where NUMBER < 28
)
select * from numbers;
select regexp_replace(split_part(val,',',seq.number),'[]["]','') as value
from
(select '["christianity","Buddhism","Judaism"]' as val) -- You can select the actual column from the table here.
cross join
seq
where seq.number <= regexp_count(val,'[,]')+1;

PostgreSQL calculate prefix combinations after split

I do have a string as entry, of the form foo:bar:something:221. I'm looking for a way to generate a table with all prefixes for this string, like:
foo
foo:bar
foo:bar:something
foo:bar:something:221
I wrote the following query to split the string, but can't figure out where to go from there:
select unnest(string_to_array('foo:bar:something:221', ':'));
An option is to simulate a loop over all elements, then take the sub-array from the input for each element index:
with data(input) as (
values (string_to_array('foo:bar:something:221', ':'))
)
select array_to_string(input[1:g.idx], ':')
from data
cross join generate_series(1, cardinality(input)) as g(idx);
generate_series(1, cardinality(input)) generates as many rows as the array has elements. And the expression input[1:g.idx] takes the "sub-array" starting with the first up to the "idx" one. As the output is an array, I use array_to_string to re-create the representation with the :
You can use string_agg as a window function. The default frame is from the beginning of the partition to the current row:
SELECT string_agg(s, ':') OVER (ORDER BY n)
FROM unnest(string_to_array('foo:bar:something:221', ':')) WITH ORDINALITY AS u(s, n);
string_agg
-----------------------
foo
foo:bar
foo:bar:something
foo:bar:something:221
(4 rows)

Greatest N per group in Open SQL

Selecting the rows from a table by (partial) key with the maximum value in a particular column is a common task in SQL. This question has some excellent answers that cover a variety of approaches to it. Unfortunately I'm struggling to replicate this in my ABAP program.
None of the commonly used approaches seem to be supported:
Joining on a subquery is not supported in syntax: SELECT * FROM X as x INNER JOIN ( SELECT ... ) AS y
Using IN for a composite key is not supported in syntax as far as I know: SELECT * FROM X WHERE (key1, key2) IN ( SELECT key1 key2 FROM ... )
Left join to itself with smaller-than comparison is not supported, outer joins only support EQ comparisons: SELECT * FROM X AS x LEFT JOIN X as xmax ON x-key1 = xmax-key1 AND x-key2 < xmax-key2 WHERE xmax-key IS INITIAL
After trying each of these solutions in turn only to discover that ABAP doesn't seem to support them and being unable to find any equivalents I'm starting to think that I'll have no choice but to dump the data of the subquery to an itab.
What is the best practice for this common programming requirement in ABAP development?
First of all, specific requirement, would give you a better answer. As it happens I bumped into this question when working on a program, that uses 3 distinct methods of pseudo-grouping, (while looking for alternatives) and ALL 3 can be used to answer your question, depending on what exactly you need to do. I'm sure there are more ways to do it.
For instance, you can pull maximum values within a group by simply selecting max( your_field ) and grouping by some fields, if that's all you need.
select bname, nation, max( date_from ) from adrp group by bname, nation. "selects highest "from" date for each bname
If you need to use that max value as a filter condition within a query, you can do it by performing pseudo-grouping using sub-query and max within sub-query like this (notice how I move out the BNAME check into sub query, which means I don't have to check both fields using in (subquery) addition):
select ... from adrp as b_adrp "Pulls the latest person info for a user (some conditions are missing, but this is a part of an actual query)
where b_adrp~date_from in (
select max( date_from ) "Highest date_from where both dates are valid
from adrp where persnumber = b_adrp~persnumber and nation = b_adrp~nation and date_from <= #sy-datum )
The query above allows you to select selects all user info from base query and (where the first one only allows to take aggregated and grouped data).
Finally, If you need to check based on composite key and compare it to multiple agregate function results, the implementation will heavily depend on specifics of your requirement (and since your question has none, I'll provide a generic one). Easiest option is to use exists / not exists instead of in (subquery), in exact same way and form the subquery to check for existance of specific key or condition rather than pull a list ( you can nest subqueries if you have to ):
select * from bkpf where exists ( select 1 from bkpf as b where belnr = bkpf~belnr and gjahr = bkpf~gjahr group by belnr, gjahr having max( budat ) = bkpf~budat ) "Took an available example, that I had in testing program.
All 3 queries will get you max value of a column within a group and in fact, all 3 can use joins to achieve identical results.
please find my answers below your questions.
Joining on a subquery is not supported in syntax: SELECT * FROM X as x INNER JOIN ( SELECT ... ) AS y
Putting the subquery in your where condition should do the work SELECT * FROM X AS x INNER JOIN Y AS y ON x~a = y~b WHERE ( SELECT * FROM y WHERE ... )
Using IN for a composite key is not supported in syntax as far as I know: SELECT * FROM X WHERE (key1, key2) IN ( SELECT key1 key2 FROM ... )
You have to split your WHERE clause: SELECT * FROM X WHERE key1 IN ( SELECT key1 FROM y ) AND key2 IN ( SELECT key2 FROM y )
Left join to itself with smaller-than comparison is not supported, outer joins only support EQ comparisons.
Yes, thats right at the moment.
Left join to itself with smaller-than comparison is not supported, outer joins only support EQ comparisons:
SELECT * FROM X AS x LEFT JOIN X as xmax ON x-key1 = xmax-key1 AND x-key2 < xmax-key2 WHERE xmax-key IS INITIAL
This is not true. This SELECT is perfectly valid:
SELECT b1~budat
INTO TABLE lt_bkpf
FROM bkpf AS b1
LEFT JOIN bkpf AS b2
ON b2~belnr < b1~belnr
WHERE b1~bukrs <> ''.
And was valid at least since 7.40 SP08, since July 2013, so at the time you asked this question it was valid as well.

need to copy all rows with C_PROV_TYPE ='014' and C_SPECILTY = '300' and insert back 3 rows with same data + max sequence number + 1 i.e =4,5,6

need 3 rows for each one of the two valid rows displayed below output like below:
Primary key is C_PROCEDURE + C_PROV_TYPE + SPEC_SEQ_NO!
output shall be like bELOW
You could try something like this:
INSERT INTO YourTable (
C_PROCEDURE,
C_PROV_TYPE,
I_PT_SPEC_SEQ_NO,
C_SPECIALTY
)
SELECT
s.C_PROCEDURE,
s.C_PROV_TYPE,
s.MaxSeq + ROW_NUMBER() OVER (
PARTITION BY s.C_PROCEDURE, s.C_PROV_TYPE
ORDER BY v.rn, s.I_PT_SPEC_SEQ_NO),
s.C_SPECIALTY + v.rn
FROM (
SELECT
*,
MAX(I_PT_SPEC_SEQ_NO) OVER (
PARTITION BY C_PROCEDURE, C_PROV_TYPE
) AS MaxSeq
FROM YourTable
) s
CROSS JOIN (
VALUES (1), (2), (3)
) v (rn)
WHERE s.C_PROV_TYPE = '014'
AND s.C_SPECIALTY = '300'
;
Basically, the subquery returns all the YourTable rows supplied with the maximum values of I_PT_SPEC_SEQ_NO for every partition of (C_PROCEDURE, C_PROV_TYPE) using the windowing MAX() function (MAX(...) OVER (...)).
The resulting set of that subquery is then cross-joined to an inline 3-row table (which produces three copies of every row returned) and filtered by the specified values of C_PROV_TYPE and C_SPECIALTY.
New data rows pull C_PROCEDURE and C_PROV_TYPE directly from the subquery. The new C_SPECIALTY values are produced using those from the subquery and the rn values of the inline table. The new sequence numbers are generated with the help of the ROW_NUMBER() function and the maximum sequence numbers returned by the subquery.
As I didn't have access to a working installation of DB2, I was testing my script in SQL Server 2008, trying to stick to features that I understood DB2 supported as well as SQL Server. This SQL Fiddle demo also uses a SQL Server 2008 instance to demonstrate how the query works.

Using two different rows from the same table in an expression

I'm using PostgreSQL + PostGIS.
In table I have a point and line geometry in the same column of the same table, in different rows. To get the line I run:
SELECT the_geom
FROM filedata
WHERE id=3
If i want to take point I run:
SELECT the_geom
FROM filedata
WHERE id=4
I want take point and line together, like they're shown in this WITH expression, but using a real query against the table instead:
WITH data AS (
SELECT 'LINESTRING (50 40, 40 60, 50 90, 30 140)'::geometry AS road,
'POINT (60 110)'::geometry AS poi)
SELECT ST_AsText(
ST_Line_Interpolate_Point(road, ST_Line_Locate_Point(road, poi))) AS projected_poi
FROM data;
You see in this example data comes from a hand-created WITH expression. I want take it from my filedata table. My problem is i dont know how to work with data from two different rows of one table at the same time.
One possible way:
A subquery to retrieve another value from a different row.
SELECT ST_AsText(
ST_Line_Interpolate_Point(
the_geom
,ST_Line_Locate_Point(
the_geom
,(SELECT the_geom FROM filedata WHERE id = 4)
)
)
) AS projected_poi
FROM filedata
WHERE id = 3;
Use a self-join:
SELECT ST_AsText(
ST_Line_Interpolate_Point(fd_road.the_geom, ST_Line_Locate_Point(
fd_road.the_geom,
fd_poi.the_geom
)) AS projected_poi
FROM filedata fd_road, filedata fd_poi
WHERE fd_road.id = 3 AND fd_poi.id = 4;
Alternately use a subquery to fetch the other row, as Erwin pointed out.
The main options for using multiple rows from one table in a single expression are:
Self-join the table with two different aliases as shown above, then filter the rows;
Use a subquery expression to get a value for all but one of the rows, as Erwin's answer shows;
Use a window function like lag() and lead() to get a row relative to the current row within the query result; or
JOIN on a subquery that returns a table
The latter two are more advanced options that solve problems that're difficult or inefficient to solve with the simpler self-join or subquery expression.