array_agg DISTINCT and ORDER - postgresql

I'm trying to write a PostgreSQL query that combines results from two (or more) tables using LEFT JOIN LATERAL. I need one row per record of entidad_a_ (the main table), with all matching records from entidad_b_ collected into a single field by array_agg. The array must have duplicate elements removed and must preserve the order of the array stored in the main table.
I need to execute this SQL query:
SELECT entidad_a_._id_ AS "_id", CASE WHEN count(entidadB) > 0 THEN array_agg(DISTINCT entidadB._id,ordinality order by ordinality)
ELSE NULL END AS "entidadB"
FROM entidad_a_ as entidad_a_, unnest(entidad_a_.entidad_b_) WITH ORDINALITY AS u(entidadb_id, ordinality)
LEFT JOIN LATERAL (
SELECT entidad_b_3._id_ AS "_id", entidad_b_3.label_ AS "label"
FROM entidad_b_ as entidad_b_3
WHERE entidad_b_3._id_ = entidadb_id
GROUP BY entidad_b_3._id_
LIMIT 1000 OFFSET 0
) entidadB ON TRUE
GROUP BY entidad_a_._id_
LIMIT 1000 OFFSET 0
But it fails with errors.
How can I get these results?
Edited:
My error is:
ERROR: function array_agg (integer, bigint) does not exist
SQL state: 42883
Hint: No function matches the given name and argument types. You might need to add explicit type casts.
Character: 69
If the query is:
......array_agg (DISTINCT entidadB._id order by ordinality).....
The error is:
ERROR: in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list
SQL state: 42P10
Character: 110
My problem is the combination of array_agg, DISTINCT, and ORDER by

Solved! I created a PostgreSQL extension with a custom aggregate:
CREATE AGGREGATE array_agg_dist (anyelement)
(
sfunc = array_agg_transfn_dist,
stype = internal,
finalfunc = array_agg_finalfn_dist,
finalfunc_extra
);
This also required creating the transition and final functions and writing the C code behind them.
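For readers who would rather avoid a C extension, the same result can probably be reached in plain SQL by deduplicating before aggregating. The sketch below is not the poster's solution. It assumes that entidad_a_.entidad_b_ is an integer array of ids referencing entidad_b_._id_, and the sample rows are made up for illustration. Each id keeps the position of its first occurrence, and array_agg then orders by that position.

```sql
-- Hypothetical sample data; the column types are an assumption.
CREATE TEMP TABLE entidad_b_ (_id_ integer PRIMARY KEY, label_ text);
CREATE TEMP TABLE entidad_a_ (_id_ integer PRIMARY KEY, entidad_b_ integer[]);
INSERT INTO entidad_b_ VALUES (1, 'one'), (2, 'two'), (3, 'three');
INSERT INTO entidad_a_ VALUES (10, '{3,1,3,2,1}'), (20, '{}');

SELECT a._id_ AS "_id",
       (SELECT array_agg(d.entidadb_id ORDER BY d.first_pos)
        FROM (
            -- One row per distinct id, remembering where it first appeared.
            SELECT u.entidadb_id, min(u.ordinality) AS first_pos
            FROM unnest(a.entidad_b_) WITH ORDINALITY AS u(entidadb_id, ordinality)
            JOIN entidad_b_ AS b ON b._id_ = u.entidadb_id
            GROUP BY u.entidadb_id
        ) AS d) AS "entidadB"
FROM entidad_a_ AS a
ORDER BY a._id_;
-- Row 10 yields {3,1,2}; row 20 yields NULL because array_agg over no rows is NULL.
```

Because array_agg returns NULL for an empty input, the CASE WHEN count(...) > 0 wrapper from the question is not needed in this form.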

Subquery with `WHERE` on function calls with outer query grouped by the function calls gives "subquery uses ungrouped column from outer query"

Consider this situation where age_group(.) is a function that returns an age bracket for an age (0-17: 'minor', 18-64: 'adult' etc.)
SELECT
date_of_data,
age_group(age),
count(1),
(SELECT avg(salary)
FROM tbl2
WHERE age_group(tbl2.age) = age_group(tbl1.age)
AND tbl2.date_of_file = tbl1.date_of_file
AND type = 'junior') AS average_salary_as_junior,
(SELECT avg(salary)
FROM tbl2
WHERE age_group(tbl2.age) = age_group(tbl1.age)
AND tbl2.date_of_file = tbl1.date_of_file
AND type = 'senior') AS average_salary_as_senior,
(SELECT avg(salary)
FROM tbl2
WHERE age_group(tbl2.age) = age_group(tbl1.age)
AND tbl2.date_of_file = tbl1.date_of_file
AND type = 'principal') AS average_salary_as_principal,
-- 15 more types to go
FROM tbl1
GROUP BY
date_of_data, age_group(age);
This will not work unless the outer query is grouped by age rather than by age_group(age), because the subquery passes tbl1.age as a function argument, even though it is the same function call that appears in the GROUP BY clause:
subquery uses ungrouped column "tbl1.age" from outer query
If I group by age instead of age_group(age), the output contains redundant identical records.
Conditional aggregation might be a solution, and so is using DISTINCT on the whole output, albeit inefficiently. I am not sure whether there are other techniques that achieve the same result, but I am wondering whether there is a way to make Postgres realise that the same function call exists in the GROUP BY clause and permit such a query to execute.
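The conditional aggregation mentioned above could look roughly like the following. This is only a sketch. It assumes that age_group() exists as described in the question and that tbl1.date_of_data is the column meant to match tbl2.date_of_file (the question uses both names). The CTE names and the age_bracket and n aliases are made up for illustration. Each table is grouped once, and the per-type averages come from FILTER clauses instead of one correlated subquery per type.

```sql
-- Sketch only: relies on tbl1, tbl2 and age_group() from the question.
WITH counts AS (
    SELECT date_of_data,
           age_group(age) AS age_bracket,
           count(*)       AS n
    FROM tbl1
    GROUP BY date_of_data, age_group(age)
),
salaries AS (
    SELECT date_of_file,
           age_group(age) AS age_bracket,
           avg(salary) FILTER (WHERE type = 'junior')    AS average_salary_as_junior,
           avg(salary) FILTER (WHERE type = 'senior')    AS average_salary_as_senior,
           avg(salary) FILTER (WHERE type = 'principal') AS average_salary_as_principal
    FROM tbl2
    GROUP BY date_of_file, age_group(age)
)
SELECT c.date_of_data,
       c.age_bracket,
       c.n,
       s.average_salary_as_junior,
       s.average_salary_as_senior,
       s.average_salary_as_principal
FROM counts AS c
LEFT JOIN salaries AS s
       ON s.date_of_file = c.date_of_data
      AND s.age_bracket  = c.age_bracket;
```

Further types would each add one more avg(...) FILTER (...) line to the salaries CTE, so tbl2 is still scanned only once.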

Literal SQL works: Array value must start with "{" or dimension information

I am trying to add an array to an existing jsonb array. The new array should be added to element [0] of the existing array. When I hardcode the details it works, but when I try to do it dynamically it fails with the above error. What am I doing wrong?
The database server is PostgreSQL 13.
with whatposition as (select position pos from users cross join lateral
jsonb_array_elements(user_details->'Profile') with ordinality arr(elem,position)
where display_ok=false)
update users set user_details=jsonb_set(
user_details,concat('ARRAY[''userProfile'',''',(select pos-1 from whatposition)::text,'''',',''DocumentDetails'']')::text[],
'[{"y":"supernewValue"}]')
where display_ok=false;
SQL Error [22P02]: ERROR: malformed array literal:
"ARRAY['userProfile','0','DocumentDetails']" Detail: Array value
must start with "{" or dimension information.
This is the with subquery output.
with whatposition as (select position pos from users cross join lateral
jsonb_array_elements(user_details->'userProfile') with ordinality arr(elem,position)
where display_ok=false)
select concat('ARRAY[''userProfile'',''',(select pos-1 from whatposition)::text,'''',',''DocumentDetails'']');
OUTPUT OF THE ABOVE SQL
ARRAY['userProfile','0','DocumentDetails']
But when I pass the value as a literal to the above SQL it works just fine.
with whatposition as (select position pos from users cross join lateral
jsonb_array_elements(user_details->'userProfile') with ordinality arr(elem,position)
where display_ok=false)
update users set user_details=jsonb_set(
user_details,ARRAY['userProfile','0','DocumentDetails'],'[{"y":"cccValue"}]')
where display_ok=false;
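You shouldn't put the ARRAY[…] syntax in a literal value. ARRAY[…] is a constructor expression that has to be written in the query itself, whereas a string cast to text[] must use the '{…}' array literal format. Build the path with the constructor directly: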
with whatposition as (
select position pos
from users
cross join lateral jsonb_array_elements(user_details->'Profile') with ordinality arr(elem,position)
where display_ok=false
)
update users
set user_details=jsonb_set(
user_details,
ARRAY['userProfile', (select pos-1 from whatposition)::text, 'DocumentDetails'],
'[{"y":"supernewValue"}]'
)
where display_ok=false;
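The difference between the two notations can be checked without touching the users table. The statements below are a small sketch using made-up literal values. They show that the constructor and the '{…}' literal produce the same text[] path, and that the string from the question cannot be cast.

```sql
-- The ARRAY constructor accepts expressions, so the index can be computed.
SELECT ARRAY['userProfile', (1 - 1)::text, 'DocumentDetails']
       = '{userProfile,0,DocumentDetails}'::text[] AS same_path;   -- true

-- jsonb_set takes the path as text[]; the sample document is hypothetical.
SELECT jsonb_set(
           '{"userProfile": [{"name": "x"}]}'::jsonb,
           '{userProfile,0,DocumentDetails}',
           '[{"y": "cccValue"}]'
       ) AS updated;

-- This is what the failing statement effectively did; it raises 22P02:
-- SELECT 'ARRAY[''userProfile'',''0'',''DocumentDetails'']'::text[];
```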
The query you are trying is broken beyond the superficial syntax error (which is addressed by Bergi).
If the CTE returns multiple rows (as expected), the ARRAY constructor will fail because the nested subselect is only allowed to return a single value in this place.
To "upsert" (insert or update) the property "DocumentDetails": [{"y": "cccValue"}] into the first element (the one with subscript 0) of the nested JSON array user_details->'userProfile':
Postgres 14 or later
Make use of JSONB subscripting:
UPDATE users
SET user_details['userProfile'][0]['DocumentDetails'] = '[{"y":"cccValue"}]'
WHERE display_ok = FALSE;
Postgres 13
Use jsonb_set() - exactly like you already have in your last code example, only without the unneeded CTE:
UPDATE users
SET user_details = jsonb_set(user_details, '{userProfile, 0, DocumentDetails}', '[{"y":"cccValue"}]')
WHERE display_ok = FALSE;

How to check an ascending ordered column value in where clause in postgresql?

I am new to PostgreSQL. I want to join two tables where a geometry in the first table is contained by a geometry in the second table. I wrote this part of the query as follows, and it runs fine.
select edge.start_id, cls.gid
from edge_table edge
inner join cluster_info cls on st_contains(cls.geom,st_setsrid(edge.start_geom,3067));
But it returns each start_id and the id of the geometry containing it (cls.gid in the query) in no particular order, for example:
start_id gid
26040 2493
43323 2490
26208 2400
42754 2433
43537 2434
1379 2434
43570 2904
42887 2475
43689 2495
43211 2904
I need to store the result in another column, start_cls, of my edge table, so I need to identify the row into which each cls.gid should go. That means checking the start_id of each row and putting the cls.gid that corresponds to that start_id into that row. Assume four rows of my edge table look like this:
gid start_id end_id start_geom end_geom start_cls end_cls
1 81608 81608 01010000007368912D8B622341E5D022EBEAF65A41 01010000007368912D8B622341E5D022EBEAF65A41
2 81557 81520 010100000085EB51F89C0723418B6CE7DB9F8E5A41 0101000000986E1203DE0723416DE7FB51A38E5A41
3 189898 80812 01010000006F1283C0A093214179E926F1A1005B41 0101000000BE9F1A6FF3942141022B871EEC005B41
4 80952 80476 0101000000666666E67F832341F2D24DBA38B45A41 0101000000736891EDB48423413BDF4F755AB45A41
I need to fill the start_cls column first. The cls.gid value for 81608 (the first start_id) should therefore end up in the first row under start_cls. So I added a WHERE clause as follows:
select edge.start_id, cls.gid
from edge_table edge
inner join cluster_info cls on st_contains(cls.geom,st_setsrid(edge.start_geom,3067))
where (select start_id from edge_table) = edge.start_id;
But it gives the following error:
ERROR: more than one row returned by a subquery used as an expression
SQL state: 21000
I also tried the following query, with no luck.
select edge.start_id, cls.gid
from edge_table edge
inner join cluster_info cls on st_contains(cls.geom,st_setsrid(edge.start_geom,3067))
where (select start
from (select start_id as start
from edge_table) as s) = edge.start_id;
Please help with this query. It involves geometry, but the main problem is how to organise the PostgreSQL query, so I have asked this question on Stack Overflow instead of gis.stackexchange.
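One way to write the join result back, offered as a sketch and not a tested answer, is PostgreSQL's UPDATE ... FROM. It matches each edge row to its cluster inside the statement, so no ordering or subquery on start_id is needed. The table and column names are taken from the question. The sketch assumes that PostGIS is installed and that start_cls can hold the value of cls.gid.

```sql
-- Sketch only: relies on edge_table and cluster_info from the question.
UPDATE edge_table AS edge
SET start_cls = cls.gid
FROM cluster_info AS cls
WHERE st_contains(cls.geom, st_setsrid(edge.start_geom, 3067));
```

If a start point lies inside more than one cluster geometry, only one of the matching rows is used for the update and which one is not predictable. The same pattern with end_geom and end_cls would presumably fill the second column.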

compare aggregate sum function to number in postgres

I have the following query, which does not work:
UPDATE item
SET popularity= (CASE
WHEN (select SUM(io.quantity) from item i NATURAL JOIN itemorder io GROUP BY io.item_id) > 3 THEN TRUE
ELSE FALSE
END);
Here I want to compare the SUM value from each row of the inner SELECT with 3 and update popularity accordingly. But SQL gives this error:
ERROR: more than one row returned by a subquery used as an expression
I understand that the inner SELECT returns many values, but can somebody help me compare each row individually? In other words, how do I make a loop?
When using a subquery you need to get a single row back, so you're effectively doing a query for each record in the item table.
UPDATE item i
SET popularity = (SELECT SUM(io.quantity) FROM itemorder io
WHERE io.item_id = i.item_id) > 3;
An alternative (which is a postgresql extension) is to use a derived table in a FROM clause.
UPDATE item i2
SET popularity = x.orders > 3
FROM (select i.item_id, SUM(io.quantity) as orders
from item i NATURAL JOIN itemorder io GROUP BY i.item_id)
as x(item_id,orders)
WHERE i2.item_id = x.item_id;
Here you're doing a single group clause as you had, and we're joining the table to be updated with the results of the group.
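One detail differs from the original CASE expression. For an item with no rows in itemorder, SUM returns NULL, so the correlated form sets popularity to NULL, and the derived-table form leaves the row untouched. If such items should become FALSE, as the ELSE FALSE branch in the question suggests, a COALESCE around the sum is one option. The sketch assumes that popularity is a boolean column.

```sql
-- Sketch only: relies on item and itemorder from the question.
UPDATE item AS i
SET popularity = COALESCE(
        (SELECT SUM(io.quantity)
         FROM itemorder AS io
         WHERE io.item_id = i.item_id),
        0) > 3;   -- items without orders compare 0 > 3 and become FALSE
```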

DB2 subquery not working using IN statement SQLCODE 115

I'm trying to execute a query in DB2. But it throws following error:
Error: DB2 SQL Error: SQLCODE=-115, SQLSTATE=42601, SQLERRMC=IN, DRIVER=4.8.86
SQLState: 42601
ErrorCode: -115
Error: DB2 SQL Error: SQLCODE=-514, SQLSTATE=26501, SQLERRMC=SQL_CURSH200C1; STMT0001, DRIVER=4.8.86
SQLState: 26501
ErrorCode: -514
This doesn't make sense to me, as my query looks correct:
SELECT ROW_NUMBER() OVER() AS ID,
CONCAT(TRIM(TB1.ROW1),CONCAT('_',TRIM(TB1.ROW2))) AS CODE_DESCRIPTION,
CASE
WHEN TRIM(TB1.ROW1) IN (SELECT T1.ROW1 FROM DB1.TABLE1 T1 WHERE T1.ROW3 = 'TEST')
THEN 'Valid'
ELSE 'Invalid'
END,
TB1.* FROM DB1.TABLE1 TB1
WHERE TB1.ROW3 = 'CLASS1';
SQLCODE -115 means the comparison is invalid, but as far as I can tell it is not.
Update:
Here is what I'm trying to accomplish. I have a table, Table1 (name changed for simplicity). The following is part of its content.
**Row3** **Row1** **Row2**
KSASPREM SRQ 0 0 Auto Carry SRQ
KSASPREM SCG 0 0 BRT Buses SCG
KSASPREM SCE 0 0 Buses SCE
KSASPREM SRR 0 0 Buses SRR
KSASPREM SDC 0 0 Domestic All Risks SDC
KSASPREM SDA 0 0 Domestic Buildings SDA
Task to accomplish:
Retrieve all the rows from Table1 where Row3 is KSASPREM.
The result should contain one extra column, 'Valid', whose value (Yes/No) depends on whether the row's Row1 value appears among the Row1 values retrieved from Table1 where Row3 is 'TEST'.
Hope I made myself clear and not more confusing ?
Any Help ?
Thanks
Ps. Updated the Query
As with so many things, a JOIN (here, LEFT JOIN) is the answer. Specifically, we need to put the (slightly modified) subquery as the table reference:
LEFT JOIN (SELECT DISTINCT row1, 'Valid' as valid
FROM Table1
WHERE row3 = 'TEST') AS Test
ON Test.row1 = TB1.row1
LEFT JOIN tells the query engine that "rows in this other table aren't required".
DISTINCT says, "for all value combinations in these columns, give me just one row"
Using a constant value - 'Valid' - returns that constant value.
... so this gets us a (virtual, temp) table containing unique row1 entries where row3 = 'TEST'.
Here's the full query:
SELECT ROW_NUMBER() OVER(ORDER BY TB1.row1) AS ID,
TRIM(TB1.ROW1) || '_' || TRIM(TB1.ROW2) AS CODE_DESCRIPTION,
COALESCE(Test.valid, 'Invalid') AS valid,
TB1.row3, TB1.row1, TB1.row2
FROM Table1 TB1
LEFT JOIN (SELECT DISTINCT row1, 'Valid' as valid
FROM Table1
WHERE row3 = 'TEST') Test
ON Test.row1 = TB1.row1
WHERE TB1.ROW3 = 'KSASPREM'
COALESCE(...) returns the first non-null value in its argument list. If there is no matching Test row, Test.valid is null, so this outputs 'Invalid' for TB1 rows without a corresponding Test row. (I believe it is implemented internally as a CASE expression; this just makes it prettier.)
Note that:
I've put an ORDER BY into the OVER clause to return (mostly) consistent results. If you only ever plan on running this once it doesn't matter, but if you need to run it multiple times and get consistent IDs, you'll need to order by something that won't be shuffled.
DB2 (and apparently PostgreSQL) supports || as a concatenation operator. It makes statements much easier to read.
Never use SELECT *; it isn't safe, for several reasons. Always specify which columns you want.