The values to pair are defined in two arrays, array[1,2,3] and array['A','B','C'].
What I need to do is merge these two arrays, pairing each element with the one at the same index in the other, so the result is array[[1,'A'],[2,'B'],[3,'C']].
How can I do that?
You could use UNNEST ... WITH ORDINALITY and join the two results this way:
Schema (PostgreSQL v14)
CREATE TABLE array_zip (
    id INT,
    a1 INT[],
    a2 TEXT[]
);

INSERT INTO array_zip
VALUES (1, ARRAY[1, 2, 3], ARRAY['A', 'B', 'C'])
     , (2, ARRAY[4, 5, 6], ARRAY['D', 'E', 'F', 'G']) -- different number of elements
;
Query #1
SELECT id, zip
FROM array_zip
CROSS JOIN LATERAL (
    SELECT array_agg((aa1.v, aa2.v)) AS zip
    FROM UNNEST(a1) WITH ORDINALITY AS aa1(v, i)
    -- use an INNER JOIN instead to stop the zip at the first missing element
    FULL JOIN UNNEST(a2) WITH ORDINALITY AS aa2(v, i)
        USING (i)
) AS f;
 id | zip
----+----------------------------------
  1 | {"(1,A)","(2,B)","(3,C)"}
  2 | {"(4,D)","(5,E)","(6,F)","(,G)"}
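As the comment in the query says, swapping the FULL JOIN for an INNER JOIN stops the zip at the first missing element instead of padding with NULLs. A minimal sketch of that variant:

SELECT id, zip
FROM array_zip
CROSS JOIN LATERAL (
    SELECT array_agg((aa1.v, aa2.v)) AS zip
    FROM UNNEST(a1) WITH ORDINALITY AS aa1(v, i)
    JOIN UNNEST(a2) WITH ORDINALITY AS aa2(v, i)
        USING (i)
) AS f;
-- row 2 would then yield {"(4,D)","(5,E)","(6,F)"}: the unmatched 'G' is dropped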
Query #2
You could also avoid the JOIN between the two arrays by using index-based direct access, which should be faster (though probably not by much, unless the arrays are pretty big):
SELECT id, zip
FROM array_zip
CROSS JOIN LATERAL (
    SELECT array_agg((a1[i], a2[i])) AS zip
    -- use LEAST instead to stop the zip at the first missing element
    FROM generate_series(1, GREATEST(cardinality(a1), cardinality(a2))) AS i
) AS f;
 id | zip
----+----------------------------------
  1 | {"(1,A)","(2,B)","(3,C)"}
  2 | {"(4,D)","(5,E)","(6,F)","(,G)"}
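And the corresponding variant of Query #2, using LEAST as the comment suggests, so the zip stops at the shorter array (a sketch of the one-line change):

SELECT id, zip
FROM array_zip
CROSS JOIN LATERAL (
    SELECT array_agg((a1[i], a2[i])) AS zip
    FROM generate_series(1, LEAST(cardinality(a1), cardinality(a2))) AS i
) AS f;
-- row 2 would then yield {"(4,D)","(5,E)","(6,F)"}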
The last option I can think of is writing a function (in PL/pgSQL, PL/v8, ...). The code would probably be easier to understand (especially if you need this feature in multiple queries), and you could handle the len(arr1) != len(arr2) case by raising an error if you want/need to.
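A minimal PL/pgSQL sketch of such a function; the name zip_arrays and the RAISE behavior are illustrative assumptions, not part of the original answer:

-- Hypothetical helper: zips two arrays, raising on a length mismatch.
CREATE OR REPLACE FUNCTION zip_arrays(a1 int[], a2 text[])
RETURNS TABLE (v1 int, v2 text)
LANGUAGE plpgsql
AS $$
BEGIN
    IF cardinality(a1) <> cardinality(a2) THEN
        RAISE EXCEPTION 'arrays differ in length: % vs %',
            cardinality(a1), cardinality(a2);
    END IF;
    RETURN QUERY
        SELECT a1[i], a2[i]
        FROM generate_series(1, cardinality(a1)) AS i;
END;
$$;

-- Usage: SELECT * FROM zip_arrays(ARRAY[1, 2, 3], ARRAY['A', 'B', 'C']);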
Related
How do I use a for loop inside a for loop (in PostgreSQL) if I need parallel values for both loops, like (i, j) = (1,1), (2,2), (3,3)?
I tried this piece of code:
FOR i IN COALESCE(ARRAY_LOWER(ids, 1), 1) .. COALESCE(ARRAY_UPPER(ids, 1), 0)
LOOP
    FOR j IN COALESCE(ARRAY_LOWER(cids, 1), 1) .. COALESCE(ARRAY_UPPER(cids, 1), 0)
    LOOP
        -- add the link between ids and cids
        INSERT INTO intervals (
            columns)
        VALUES (
            --cids[j],
            --ids[i]
        );
    END LOOP;
END LOOP;
It is not quite clear what you are trying to do. But if you simply want to insert the Cartesian product of two one-dimensional arrays into your table, you just need a CROSS JOIN:
INSERT INTO intervals (column_a, column_b)
SELECT
    ids,
    cids
FROM unnest(ids) as ids          -- 1
CROSS JOIN unnest(cids) as cids;
unnest() extracts the array elements into single rows.
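If, instead of the Cartesian product, you want the parallel pairs (i, j) = (1,1), (2,2), (3,3) from the question, you can join the two unnested arrays on their ordinality, as in the zip queries earlier. A sketch, reusing the same hypothetical column names:

INSERT INTO intervals (column_a, column_b)
SELECT i.id, c.cid
FROM unnest(ids) WITH ORDINALITY AS i(id, n)
JOIN unnest(cids) WITH ORDINALITY AS c(cid, n)
    USING (n);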
I am trying to run a query against a table in AWS Redshift (which is PostgreSQL-based). Below is a simplified definition of the table:
CREATE TABLE some_schema.some_table (
    row_id int
    ,productid_level1 char(1)
    ,productid_level2 char(1)
    ,productid_level3 char(1)
);
INSERT INTO some_schema.some_table
VALUES
  (1, 'a', 'b', 'c')
, (2, 'd', 'c', 'e')
, (3, 'c', 'f', 'g')
, (4, 'e', 'h', 'i')
, (5, 'f', 'j', 'k')
, (6, 'g', 'l', 'm')
;
I need to return a de-duped, single-column table of a given productid and all of its children. "Children" means any productid at a higher "level" than the given productid (in a given row), plus that child's own children, and so on.
For example, for productid 'c', I expect to return...
'c' (because it's found in rows 1, 2, and 3)
'e' (because it's a child of 'c' in row 2)
'f' and 'g' (because they're children of 'c' in row 3)
'h' and 'i' (because they're children of 'e' in row 4)
'j' and 'k' (because they're children of 'f' in row 5)
and 'l' and 'm' (because they're children of 'g' in row 6)
Visually, I expect to return the following:
productid
---------
c
e
f
g
h
i
j
k
l
m
The actual table has about 3M rows and has about 20 "levels".
I think there are two parts to this query: (1) a recursive CTE to build out the hierarchy and (2) an unpivot operation.
I have not attempted (1) yet. For (2), I have tried a query like the following, but it hasn't returned even after 3 minutes. As this will be used for an operational report, I need it to return in < 15 seconds.
select
    b.productid
    ,b.product_level
from
    some_schema.some_table as a
cross join lateral (
    values
        (a.productid_level1, 1)
        ,(a.productid_level2, 2)
        ...
        ,(a.productid_level20, 20)
) as b(productid, product_level)
How can I write the query to achieve (1) and (2) and be very performant?
I would avoid using the term "hierarchy" here, as that usually implies each node having at most a single parent.
I admit I'm lost as to the nature of the graph/network this table represents. But you might benefit from a little brute force and code repetition.
Whatever eventually works for you, I think you'll need to persist/materialise/cache the results, as repeating this at report time is unlikely to ever be a good idea.
I'm a data engineer by trade, and I'm sure they have good reasons for what they've done (or, like me, they maybe screwed up). Either way, there are many good reasons to ask them to materialise the graph in more than just one form, each suited to different use cases. So, asking them for a traditional adjacency list, as well as the table you already have, is a reasonable request. Or, at the very least, a good starting point for a conversation.
So, a brute force approach?
WITH
adjacency AS
(
    SELECT level01, level02 FROM some_table WHERE level02 IS NOT NULL
    UNION
    SELECT level02, level03 FROM some_table WHERE level03 IS NOT NULL
    UNION
    ...
    UNION
    SELECT level19, level20 FROM some_table WHERE level20 IS NOT NULL
)
The WHERE clause eliminates any sparse data before it enters the map.
The UNION (without ALL) ensures duplicate links are eliminated. You should also test UNION ALL with a SELECT DISTINCT wrapped around it (or similar).
Then you can use that adjacency list in the usual recursive walk, to find all children of a given node. (Taking care that there aren't any cyclic paths.)
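A hedged sketch of that recursive walk, built on the adjacency CTE above; the seed value 'c' comes from the question, the remaining UNION branches are elided as before, and cycles are assumed absent:

WITH RECURSIVE
adjacency AS
(
    SELECT level01 AS parent, level02 AS child FROM some_table WHERE level02 IS NOT NULL
    UNION
    SELECT level02, level03 FROM some_table WHERE level03 IS NOT NULL
    -- ... remaining levels, as sketched above ...
),
walk (productid) AS
(
    SELECT 'c'::text               -- the product to start from
    UNION ALL
    SELECT a.child
    FROM walk w
    JOIN adjacency a ON a.parent = w.productid
)
SELECT DISTINCT productid
FROM walk;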
In PostgreSQL, I have a data store with a single JSONB column:
data
----------------------------
{"foo": [1,2,3,4]}
{"foo": [10,20,30,40,50,60]}
...
I need to convert consecutive pairs of values into data points, essentially calling the array variant of ST_MakeLine like this: ST_MakeLine(ARRAY[ST_MakePoint(10,20), ST_MakePoint(30,40), ST_MakePoint(50,60)]) for each row of the source data.
Needed result (note that the x,y order of each point might need to be reversed):
data geometry (after decoding)
---------------------------- --------------------------
{"foo": [1,2,3,4]} LINE (1 2, 3 4)
{"foo": [10,20,30,40,50,60]} LINE (10 20, 30 40, 50 60)
...
Partial solution
I can already iterate over individual array values, but it is the pairing that is giving me trouble. Also, I am not certain if I need to introduce any ordering into the query to preserve the original ordering of the array elements.
SELECT ARRAY(
    SELECT elem::int
    FROM jsonb_array_elements(data -> 'foo') elem
) arr
FROM mytable;
You can achieve this by using window functions lead or lag, then picking only every second row:
SELECT (
    SELECT array_agg((a, b) ORDER BY o)
    FROM (
        SELECT elem::int AS a, lead(elem::int) OVER (ORDER BY o) AS b, o
        FROM jsonb_array_elements(data -> 'foo') WITH ORDINALITY els(elem, o)
    ) AS pairs
    WHERE o % 2 = 1
) AS arr
FROM example;
And yes, I would recommend specifying the ordering explicitly, making use of WITH ORDINALITY.
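To complete the original task, the same pairing can feed PostGIS directly. A hedged sketch, assuming the postgis extension is installed and keeping the answer's example table name (swap x and y if your coordinate order is reversed):

SELECT data,
    ST_MakeLine(ARRAY(
        SELECT ST_MakePoint(x, y)
        FROM (
            SELECT elem::float8 AS x,
                   lead(elem::float8) OVER (ORDER BY o) AS y,
                   o
            FROM jsonb_array_elements(data -> 'foo') WITH ORDINALITY els(elem, o)
        ) AS pairs
        WHERE o % 2 = 1
        ORDER BY o
    )) AS geometry
FROM example;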
How can I cast an array of strings into an array of integers?
Below is my array
["6", "5"]
I want to convert it into an int array:
[6, 5]
SELECT
array_agg(elems::int)
FROM unnest(ARRAY['5', '6']) as elems
This expands the array into one record per element, then re-aggregates the values, casting them to int.
To ensure the original order, you need to add WITH ORDINALITY, which adds an index to the original array:
SELECT
array_agg(elems.value::int ORDER BY elems.index)
FROM unnest(ARRAY['5', '6']) WITH ORDINALITY as elems(value, index)
If you have a JSON array instead, the algorithm is the same, only the used functions have different names:
SELECT
json_agg(elems.value::int ORDER BY elems.index)
FROM json_array_elements_text('["5", "6"]'::json) WITH ORDINALITY as elems(value, index)
EDIT: According to a comment:

this is my query: SELECT data->>'pid' FROM user_data WHERE data->>'pid' IS NOT NULL. How can I update pid to an array of integers?
You have to expand and re-aggregate nonetheless:
SELECT
    json_agg(elems::int)                                 -- 2: re-aggregate, casting to int
FROM user_data,
    json_array_elements_text(data -> 'pid') as elems     -- 1: expand the JSON array
WHERE data->>'pid' IS NOT NULL
GROUP BY id
You can’t “cast”, but you can get something very close to a cast:
array(select unnest(myArray)::int)
As a testable query:
select array(select unnest(array['5', '6'])::int)
When applying to a column of a selected table:
select
array(select unnest(myArrayCol)::int)
from myTable
This syntax preserves order.
I have a string as input, of the form foo:bar:something:221. I'm looking for a way to generate a table with all prefixes of this string, like:
foo
foo:bar
foo:bar:something
foo:bar:something:221
I wrote the following query to split the string, but can't figure out where to go from there:
select unnest(string_to_array('foo:bar:something:221', ':'));
An option is to simulate a loop over all elements, then take the sub-array from the input for each element index:
with data(input) as (
    values (string_to_array('foo:bar:something:221', ':'))
)
select array_to_string(input[1:g.idx], ':')
from data
cross join generate_series(1, cardinality(input)) as g(idx);
generate_series(1, cardinality(input)) generates as many rows as the array has elements. The expression input[1:g.idx] takes the "sub-array" from the first element up to the idx-th one. As the output is an array, I use array_to_string to re-create the string representation with the : separator.
You can use string_agg as a window function. The default frame is from the beginning of the partition to the current row:
SELECT string_agg(s, ':') OVER (ORDER BY n)
FROM unnest(string_to_array('foo:bar:something:221', ':')) WITH ORDINALITY AS u(s, n);
string_agg
-----------------------
foo
foo:bar
foo:bar:something
foo:bar:something:221
(4 rows)