Subquery with `WHERE` on function calls with outer query grouped by the function calls gives "subquery uses ungrouped column from outer query" - postgresql

Consider this situation, where age_group(.) is a function that returns an age bracket for an age (0-17: 'minor', 18-64: 'adult', etc.):
SELECT
date_of_data,
age_group(age),
count(1),
(SELECT avg(salary)
FROM tbl2
WHERE age_group(tbl2.age) = age_group(tbl1.age)
AND tbl2.date_of_file = tbl1.date_of_file
AND type = 'junior') AS average_salary_as_junior,
(SELECT avg(salary)
FROM tbl2
WHERE age_group(tbl2.age) = age_group(tbl1.age)
AND tbl2.date_of_file = tbl1.date_of_file
AND type = 'senior') AS average_salary_as_senior,
(SELECT avg(salary)
FROM tbl2
WHERE age_group(tbl2.age) = age_group(tbl1.age)
AND tbl2.date_of_file = tbl1.date_of_file
AND type = 'principal') AS average_salary_as_principal,
-- 15 more types to go
FROM tbl1
GROUP BY
date_of_data, age_group(age);
This will not work unless the outer query is grouped by age rather than by age_group(age), because the subquery passes tbl1.age as an argument to a function, even though it is the same function call that appears in the GROUP BY clause:
subquery uses ungrouped column "tbl1.age" from outer query
If I group by age instead of age_group(age), there will be redundant identical records in the output.
Conditional aggregation might be a solution, and so might DISTINCT on the whole output, albeit inefficiently. I am not sure whether there are other techniques to achieve the same result, but I am wondering whether there is a way to make Postgres realise that the same function call exists in the GROUP BY clause and permit such a query to execute.
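For reference, here is a minimal sketch of the conditional-aggregation idea mentioned above. It runs against SQLite through Python's sqlite3 module purely so that it is self-contained; it is not PostgreSQL. The age_group implementation, the 'older' label, the sample rows, and the use of date_of_file in both tables are assumptions made for illustration. Each table is grouped once by (date_of_file, age_group(age)) and the two grouped results are joined, so no correlated subquery has to reference tbl1.age.

```python
import sqlite3


def age_group(age):
    """Hypothetical stand-in for the question's age_group() function."""
    if age < 18:
        return "minor"
    if age < 65:
        return "adult"
    return "older"  # assumed label; the question only says "etc."


def conditional_aggregation_demo():
    conn = sqlite3.connect(":memory:")
    # Register the Python function so SQL can call age_group(age).
    conn.create_function("age_group", 1, age_group)
    conn.executescript(
        """
        CREATE TABLE tbl1 (date_of_file TEXT, age INTEGER);
        CREATE TABLE tbl2 (date_of_file TEXT, age INTEGER, type TEXT, salary REAL);
        INSERT INTO tbl1 VALUES
            ('2020-01-01', 10), ('2020-01-01', 12), ('2020-01-01', 30);
        INSERT INTO tbl2 VALUES
            ('2020-01-01', 15, 'junior', 100),
            ('2020-01-01', 16, 'junior', 200),
            ('2020-01-01', 40, 'senior', 500);
        """
    )
    query = """
        WITH counts AS (
            -- one row per (date, age bracket) from tbl1
            SELECT date_of_file, age_group(age) AS grp, count(*) AS n
            FROM tbl1
            GROUP BY date_of_file, age_group(age)
        ),
        salaries AS (
            -- one pass over tbl2; each CASE picks out a single type
            SELECT date_of_file, age_group(age) AS grp,
                   avg(CASE WHEN type = 'junior' THEN salary END) AS avg_junior,
                   avg(CASE WHEN type = 'senior' THEN salary END) AS avg_senior
            FROM tbl2
            GROUP BY date_of_file, age_group(age)
        )
        SELECT c.date_of_file, c.grp, c.n, s.avg_junior, s.avg_senior
        FROM counts c
        LEFT JOIN salaries s
          ON s.date_of_file = c.date_of_file AND s.grp = c.grp
        ORDER BY c.date_of_file, c.grp
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in conditional_aggregation_demo():
        print(row)
```

In PostgreSQL the CASE expressions could also be written as avg(salary) FILTER (WHERE type = 'junior'); the remaining types would each add one more aggregate column in the same way.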

Related

How to reference a column in the select clause in the order clause in SQLAlchemy like you do in Postgres instead of repeating the expression twice

In Postgres, if one of your columns is a big complicated expression, you can just say ORDER BY 3 DESC, where 3 is the position of the column containing the complicated expression. Is there any way to do this in SQLAlchemy?
As Gord Thompson observes in this comment, you can pass the column index as a text object to group_by or order_by:
q = sa.select(sa.func.count(), tbl.c.user_id).group_by(sa.text('2')).order_by(sa.text('2'))
serialises to
SELECT count(*) AS count_1, posts.user_id
FROM posts GROUP BY 2 ORDER BY 2
There are other techniques that don't require re-typing the expression.
You could use the selected_columns property:
q = sa.select(tbl.c.col1, tbl.c.col2, tbl.c.col3)
q = q.order_by(q.selected_columns[2]) # order by col3
You could also order by a label (but this will affect the names of result columns):
q = sa.select(tbl.c.col1, tbl.c.col2, tbl.c.col3.label('c')).order_by('c')

array_agg DISTINCT and ORDER

I'm trying to write a PostgreSQL query that combines results from two (or more) tables using LEFT JOIN LATERAL. I need one row per record of entidad_a_ (the main table), with all the matching records from entidad_b_ collected into a single field by array_agg. Within that array, duplicate elements must be removed and the order of the array stored in the main table must be preserved.
I need to execute this SQL query:
SELECT entidad_a_._id_ AS "_id", CASE WHEN count(entidadB) > 0 THEN array_agg(DISTINCT entidadB._id,ordinality order by ordinality)
ELSE NULL END AS "entidadB"
FROM entidad_a_ as entidad_a_, unnest(entidad_a_.entidad_b_) WITH ORDINALITY AS u(entidadb_id, ordinality)
LEFT JOIN LATERAL (
SELECT entidad_b_3._id_ AS "_id", entidad_b_3.label_ AS "label"
FROM entidad_b_ as entidad_b_3
WHERE entidad_b_3._id_ = entidadb_id
GROUP BY entidad_b_3._id_
LIMIT 1000 OFFSET 0
) entidadB ON TRUE
GROUP BY entidad_a_._id_
LIMIT 1000 OFFSET 0
But I get errors.
How can I get these results?
Edited:
My error is:
ERROR: function array_agg (integer, bigint) does not exist
SQL state: 42883
Hint: No function matches the given name and argument types. You might need to add explicit type casts.
Character: 69
If the query is:
......array_agg (DISTINCT entidadB._id order by ordinality).....
The error is:
ERROR: in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list
SQL state: 42P10
Character: 110
My problem is the combination of array_agg, DISTINCT, and ORDER BY.
Solved!! I've created a postgres extension with a custom aggregation.
CREATE AGGREGATE array_agg_dist (anyelement)
(
sfunc = array_agg_transfn_dist,
stype = internal,
finalfunc = array_agg_finalfn_dist,
finalfunc_extra
);
I also wrote the functions, and the C code behind them, that this custom aggregate uses.
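The C code is not shown in the post. As a rough illustration of the behaviour being asked for (remove duplicates while keeping the order of the source array), here is a small Python sketch. The function name is hypothetical, and this is not the extension's implementation, only the semantics it is meant to provide.

```python
def distinct_in_order(ids):
    """Drop duplicates from ids, keeping each value at its first position."""
    # dict keys keep insertion order (Python 3.7+), so the first occurrence
    # of every value determines where it ends up in the result.
    return list(dict.fromkeys(ids))


if __name__ == "__main__":
    print(distinct_in_order([3, 1, 3, 2, 1]))
```

Applied to [3, 1, 3, 2, 1], the sketch keeps 3, 1 and 2 in that order, which is what array_agg(DISTINCT ...) alone cannot guarantee because DISTINCT sorts by the aggregated value.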

What does a column assignment using an aggregate in the columns area of a select do?

I'm trying to decipher another programmer's code who is long-gone, and I came across a select statement in a stored procedure that looks like this (simplified) example:
SELECT #Table2.Col1, Table2.Col2, Table2.Col3, MysteryColumn = CASE WHEN y.Col3 IS NOT NULL THEN #Table2.MysteryColumn - y.Col3 ELSE #Table2.MysteryColumn END
INTO #Table1
FROM #Table2
LEFT OUTER JOIN (
SELECT Table3.Col1, Table3.Col2, Col3 = SUM(#Table3.Col3)
FROM Table3
INNER JOIN #Table4 ON Table4.Col1 = Table3.Col1 AND Table4.Col2 = Table3.Col2
GROUP BY Table3.Col1, Table3.Col2
) AS y ON #Table2.Col1 = y.Col1 AND #Table2.Col2 = y.Col2
WHERE #Table2.Col2 < #EnteredValue
My question: what does the fourth column of the primary selection do? Does it produce a boolean value that checks whether the values are equal? Does it set #Table2.MysteryColumn to some value and then insert it into #Table1? Or does it just update #Table2.MysteryColumn without outputting a value into #Table1?
The same thing seems to happen in the third column of the subquery, and I am equally at a loss as to what it does there.
MysteryColumn = gives the expression a name, also called a column alias. The fact that a column in #Table2 has the same name is beside the point.
Because the statement uses INTO, the alias also becomes the column's name in the resulting temporary table. See the SELECT clause documentation (note the | column_alias = expression form) and the INTO clause documentation.

How to use GROUP BY with Firebird?

I'm trying to create a SELECT with GROUP BY in Firebird, but without success. How can I do this?
Exception
Can't format message 13:896 -- message file C:\firebird.msg not found.
Dynamic SQL Error.
SQL error code = -104.
Invalid expression in the select list (not contained in either an aggregate function or the GROUP BY clause).
(49,765 sec)
trying
SELECT FA_DATA, FA_CODALUNO, FA_MATERIA, FA_TURMA, FA_QTDFALTA,
ALU_CODIGO, ALU_NOME,
M_CODIGO, M_DESCRICAO,
FT_CODIGO, FT_ANOLETIVO, FT_TURMA
FROM FALTAS Falta
INNER JOIN ALUNOS Aluno ON (Falta.FA_CODALUNO = Aluno.ALU_CODIGO)
INNER JOIN MATERIAS Materia ON (Falta.FA_MATERIA = Materia.M_CODIGO)
INNER JOIN FORMACAOTURMAS Turma ON (Falta.FA_TURMA = Turma.FT_CODIGO)
WHERE (Falta.FA_CODALUNO = 238) AND (Turma.FT_ANOLETIVO = 2015)
GROUP BY Materia.M_CODIGO
A simple use of GROUP BY in Firebird: group by all columns.
select * from T1 t
where t.id in
(SELECT t.id FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id)
Using GROUP BY doesn't make sense in your example code. It is only useful when using aggregate functions (+ some other minor uses). In any case, Firebird requires you to specify all columns from the SELECT column list except those with aggregate functions in the GROUP BY clause.
Note that this is more restrictive than the SQL standard, which allows you to leave out functionally dependent columns (i.e., if you group by a primary key or unique key, you don't need to list the other columns of that table).
You don't specify why you want to group (because it doesn't make much sense to do it with this query). Maybe instead you want to ORDER BY, or you want the first row for each M_CODIGO.
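To make the rule concrete, here is a hedged sketch that assumes the goal was the total number of absences per subject for one student; the question does not actually say so. It runs on SQLite via Python's sqlite3 module only to be self-contained (SQLite is more lenient than Firebird about ungrouped columns), the tables are cut down to the columns used, and the sample rows are invented. The point is the shape of the query: every non-aggregated column in the select list also appears in GROUP BY.

```python
import sqlite3


def absences_per_subject(student_id):
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Reduced versions of the question's tables; sample data is made up.
        CREATE TABLE MATERIAS (M_CODIGO INTEGER, M_DESCRICAO TEXT);
        CREATE TABLE FALTAS (FA_CODALUNO INTEGER, FA_MATERIA INTEGER,
                             FA_QTDFALTA INTEGER);
        INSERT INTO MATERIAS VALUES (1, 'Math'), (2, 'History');
        INSERT INTO FALTAS VALUES (238, 1, 2), (238, 1, 3), (238, 2, 1),
                                  (999, 1, 7);
        """
    )
    rows = conn.execute(
        """
        SELECT Materia.M_CODIGO, Materia.M_DESCRICAO,
               SUM(Falta.FA_QTDFALTA) AS TOTAL_FALTAS
        FROM FALTAS Falta
        INNER JOIN MATERIAS Materia ON Falta.FA_MATERIA = Materia.M_CODIGO
        WHERE Falta.FA_CODALUNO = ?
        -- both non-aggregated select columns are listed here
        GROUP BY Materia.M_CODIGO, Materia.M_DESCRICAO
        ORDER BY Materia.M_CODIGO
        """,
        (student_id,),
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(absences_per_subject(238))
```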

Compare aggregate sum function to number in Postgres

I have the following query, which does not work:
UPDATE item
SET popularity= (CASE
WHEN (select SUM(io.quantity) from item i NATURAL JOIN itemorder io GROUP BY io.item_id) > 3 THEN TRUE
ELSE FALSE
END);
Here I want to compare the SUM value from each row of the inner SELECT with 3 and update popularity accordingly. But SQL gives an error:
ERROR: more than one row returned by a subquery used as an expression
I understand that the inner SELECT returns many values, but can somebody help me with how to compare each row? In other words, how to make a loop.
A subquery used as an expression must return a single row, so correlate it with the row being updated; you're then effectively running one query for each record in the item table.
UPDATE item i
SET popularity = (SELECT SUM(io.quantity) FROM itemorder io
WHERE io.item_id = i.item_id) > 3;
An alternative (which is a postgresql extension) is to use a derived table in a FROM clause.
UPDATE item i2
SET popularity = x.orders > 3
FROM (select i.item_id, SUM(io.quantity) as orders
from item i NATURAL JOIN itemorder io GROUP BY i.item_id)
as x(item_id,orders)
WHERE i2.item_id = x.item_id
Here the grouping is done once, as in your original query, and the table being updated is joined to the grouped results.
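The correlated-subquery form can be tried out with the following sketch. It uses SQLite through Python's sqlite3 module as a stand-in for PostgreSQL, so the boolean result is stored as 1 or 0, and the table alias is dropped in favour of the plain table name. The table definitions and sample rows are assumptions. Note that an item with no orders ends up with NULL, because SUM over no rows is NULL and NULL > 3 is NULL.

```python
import sqlite3


def update_popularity_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Assumed minimal schema and made-up sample data.
        CREATE TABLE item (item_id INTEGER PRIMARY KEY, popularity BOOLEAN);
        CREATE TABLE itemorder (item_id INTEGER, quantity INTEGER);
        INSERT INTO item (item_id) VALUES (1), (2), (3);
        INSERT INTO itemorder VALUES (1, 2), (1, 3), (2, 1);
        """
    )
    # The subquery is correlated on item.item_id, so it returns exactly one
    # value for each row being updated.
    conn.execute(
        """
        UPDATE item
        SET popularity = (SELECT SUM(io.quantity)
                          FROM itemorder io
                          WHERE io.item_id = item.item_id) > 3
        """
    )
    rows = conn.execute(
        "SELECT item_id, popularity FROM item ORDER BY item_id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(update_popularity_demo())
```

If items without orders should be marked FALSE instead of NULL, wrapping the sum in COALESCE(..., 0) before the comparison is one option.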