Having(count) in a CakePHP 3 query fails on PostgreSQL

I have the following table structure with matching relations:
,---------.   ,--------------.   ,---------.
| Threads |   | ThreadsUsers |   | Users   |
|---------|   |--------------|   |---------|
| id      |   | id           |   | id      |
'---------'   | thread_id    |   '---------'
              | user_id      |
              '--------------'
This custom query in ThreadsTable is meant to find threads with a given number of participants. It works fine on MySQL:
public function findWithUserCount(Query $query, array $options)
{
    return $query
        ->matching('Users')
        ->select([
            'Threads.id',
            'count' => 'COUNT(Users.id)'
        ])
        ->group('Threads.id HAVING count = ' . $options['count']);
}
However, it fails on PostgreSQL with the following error:
PDOException: SQLSTATE[42703]: Undefined column: 7
ERROR: column "count" does not exist
LINE 1: ...ThreadsUsers.user_id)) GROUP BY Threads.id HAVING count = 2

In PostgreSQL, the HAVING clause cannot reference column aliases defined in the SELECT clause. The documentation says:
Each column referenced in condition must unambiguously reference a grouping column, unless the reference appears within an aggregate function or the ungrouped column is functionally dependent on the grouping columns.
Since count is neither a "grouping column" (i.e. the subject of the GROUP BY clause) nor an aggregate function, it can't be used there.
So the correct form would presumably be (I don't know CakePHP, and the fact that you can inject SQL into the group call at all seems like a massively broken design for a query builder):
->group('Threads.id HAVING COUNT(Users.id) = ' . (int)$options['count']);
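A minimal sketch of the corrected query shape, run against an in-memory SQLite database from Python rather than PostgreSQL or CakePHP (an assumption made only so the example is self-contained). The lowercased table names, the sample rows and the helper `threads_with_user_count` are hypothetical. SQLite is more lenient than PostgreSQL about aliases in HAVING, so this does not reproduce the error. It only shows that repeating the aggregate, with the count passed as a bound parameter instead of being concatenated into the SQL, returns the intended threads.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE threads (id INTEGER PRIMARY KEY);
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE threads_users (
        id INTEGER PRIMARY KEY,
        thread_id INTEGER REFERENCES threads(id),
        user_id INTEGER REFERENCES users(id)
    );
    INSERT INTO threads (id) VALUES (1), (2);
    INSERT INTO users (id) VALUES (10), (11), (12);
    -- hypothetical data: thread 1 has two participants, thread 2 has three
    INSERT INTO threads_users (thread_id, user_id)
    VALUES (1, 10), (1, 11), (2, 10), (2, 11), (2, 12);
""")


def threads_with_user_count(conn, count):
    """Return the ids of threads that have exactly `count` participants."""
    sql = """
        SELECT threads.id, COUNT(users.id) AS user_count
        FROM threads
        JOIN threads_users ON threads_users.thread_id = threads.id
        JOIN users ON users.id = threads_users.user_id
        GROUP BY threads.id
        HAVING COUNT(users.id) = ?   -- repeat the aggregate, not the alias
        ORDER BY threads.id
    """
    return [row[0] for row in conn.execute(sql, (count,))]


two_participants = threads_with_user_count(conn, 2)
three_participants = threads_with_user_count(conn, 3)
```

The same `GROUP BY ... HAVING COUNT(...) = <value>` form is what the PostgreSQL error message is asking for; how best to express it through the CakePHP query builder is not covered by this sketch.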

Related

Convert comma-separated fields to concat() function

I have a table_product that contains comma-separated strings:
id | products
-----------
1 | tv,phone,tablet
2 | computer,tv
3 | printer,tablet,radio
To avoid manual concatenation such as concat(tv,',',phone,',',tablet), I want to select the data from table_product.products as a concat() statement.
I tried this, but I get an error:
select concat(select products from table_product where id=1) from table_sales
Is there a short, simple way to perform this query?

Aggregate function to extract all fields based on maximum date

In one table I have duplicate values that I would like to group, exporting only the fields from the row whose "published_at" value is the most recent (the latest date possible). Do I understand correctly that when I use the MAX aggregate function, the other fields I extract will come from the row where the maximum was found, or will they be taken from the first row found in the table?
Let me demonstrate this with a simple example (in the real-world case I am also joining two different tables). I would like to group by id and extract all fields, but only from the row with the maximum published_at. My query would be:
SELECT "t1"."id", "t1"."field", MAX("t1"."published_at") as "published_at"
FROM "t1"
GROUP BY "t1"."id"
| id | field     | published_at |
---------------------------------
| 1  | document1 | 2022-01-10   |
| 1  | document2 | 2022-01-11   |
| 1  | document3 | 2022-01-12   |
The result I want is:
1 - document3 - 2022-01-12
One more question: why am I getting the error "ERROR: column "t1"."field" must appear in the GROUP BY clause or be used in an aggregate function"? And can I use the MAX function on a string-type column?
If you want the latest row for each id, you can use PostgreSQL's DISTINCT ON. For example:
select distinct on (id) *
from t1
order by id, published_at desc
If you just want the latest row in the whole result set, you can use LIMIT. For example:
select *
from t1
order by published_at desc
limit 1
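The question also asks which row the other fields come from when MAX is used, and whether MAX works on strings. As a rough sketch, assuming an in-memory SQLite database driven from Python (not PostgreSQL) and two hypothetical extra rows for id 2, the following shows that each MAX is computed independently per column, so the values need not come from the same row. It also shows a portable alternative to DISTINCT ON that joins back on the maximum date. Dates are stored as ISO-8601 text here, which sorts correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, field TEXT, published_at TEXT);
    INSERT INTO t1 VALUES
        (1, 'document1', '2022-01-10'),
        (1, 'document2', '2022-01-11'),
        (1, 'document3', '2022-01-12'),
        -- hypothetical extra rows: the latest row has the smaller string
        (2, 'zeta',  '2022-01-01'),
        (2, 'alpha', '2022-01-05');
""")

# Each MAX is evaluated on its own column within the group, so the two
# values for id 2 come from different rows.
independent_max = conn.execute("""
    SELECT id, MAX(field), MAX(published_at)
    FROM t1
    GROUP BY id
    ORDER BY id
""").fetchall()

# Whole latest row per id: keep the rows whose date equals the group maximum.
latest_rows = conn.execute("""
    SELECT t1.id, t1.field, t1.published_at
    FROM t1
    JOIN (
        SELECT id, MAX(published_at) AS max_published_at
        FROM t1
        GROUP BY id
    ) AS latest
      ON latest.id = t1.id
     AND latest.max_published_at = t1.published_at
    ORDER BY t1.id
""").fetchall()
```

Note that the join-back form returns more than one row for an id when several rows share the same maximum date, whereas DISTINCT ON always keeps exactly one.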

group by in postgres sql with error must appear in the GROUP BY clause or be used in an aggregate function [duplicate]

I've been migrating some of my MySQL queries to PostgreSQL to use Heroku. Most of my queries work fine, but I keep having a similar recurring error when I use group by:
ERROR: column "XYZ" must appear in the GROUP BY clause or be used in
an aggregate function
Could someone tell me what I'm doing wrong?
MySQL which works 100%:
SELECT `availables`.*
FROM `availables`
INNER JOIN `rooms` ON `rooms`.id = `availables`.room_id
WHERE (rooms.hotel_id = 5056 AND availables.bookdate BETWEEN '2009-11-22' AND '2009-11-24')
GROUP BY availables.bookdate
ORDER BY availables.updated_at
PostgreSQL error:
ActiveRecord::StatementInvalid: PGError: ERROR: column
"availables.id" must appear in the GROUP BY clause or be used in an
aggregate function:
SELECT "availables".* FROM "availables" INNER
JOIN "rooms" ON "rooms".id = "availables".room_id WHERE
(rooms.hotel_id = 5056 AND availables.bookdate BETWEEN E'2009-10-21'
AND E'2009-10-23') GROUP BY availables.bookdate ORDER BY
availables.updated_at
Ruby code generating the SQL:
expiration = Available.find(:all,
:joins => [ :room ],
:conditions => [ "rooms.hotel_id = ? AND availables.bookdate BETWEEN ? AND ?", hostel_id, date.to_s, (date+days-1).to_s ],
:group => 'availables.bookdate',
:order => 'availables.updated_at')
Expected Output (from working MySQL query):
+-----+-------+-------+------------+---------+---------------+---------------+
| id | price | spots | bookdate | room_id | created_at | updated_at |
+-----+-------+-------+------------+---------+---------------+---------------+
| 414 | 38.0 | 1 | 2009-11-22 | 1762 | 2009-11-20... | 2009-11-20... |
| 415 | 38.0 | 1 | 2009-11-23 | 1762 | 2009-11-20... | 2009-11-20... |
| 416 | 38.0 | 2 | 2009-11-24 | 1762 | 2009-11-20... | 2009-11-20... |
+-----+-------+-------+------------+---------+---------------+---------------+
3 rows in set
MySQL's totally non standards compliant GROUP BY can be emulated by Postgres' DISTINCT ON. Consider this:
MySQL:
SELECT a,b,c,d,e FROM table GROUP BY a
This delivers 1 row per value of a (which one, you don't really know). Well actually you can guess, because MySQL doesn't know about hash aggregates, so it will probably use a sort... but it will only sort on a, so the order of the rows could be random. Unless it uses a multicolumn index instead of sorting. Well, anyway, it's not specified by the query.
Postgres:
SELECT DISTINCT ON (a) a,b,c,d,e FROM table ORDER BY a,b,c
This delivers 1 row per value of a; this row will be the first one in the sort order given by the query's ORDER BY. Simple.
Note that here, it's not an aggregate I'm computing. So GROUP BY actually makes no sense. DISTINCT ON makes a lot more sense.
Rails is married to MySQL, so I'm not surprised that it generates SQL that doesn't work in Postgres.
PostgreSQL is more SQL compliant than MySQL. Every field in the output, except fields computed with an aggregate function, must be present in the GROUP BY clause (or be functionally dependent on the grouped columns, for example when grouping by the table's primary key).
MySQL's GROUP BY can be used without an aggregate function (which is contrary to the SQL standard), and returns some row from the group (I don't know based on what criteria), while PostgreSQL requires an aggregate function (MAX, SUM, etc.) on every selected column that is not listed in the GROUP BY clause.
Correct, the solution to fixing this is to use :select and to select each field that you wish to decorate the resulting object with and group by them.
Nasty - but it is how group by should work as opposed to how MySQL works with it by guessing what you mean if you don't stick fields in your group by.
If I remember correctly, in PostgreSQL you have to add every column you fetch from the table where the GROUP BY clause applies to the GROUP BY clause.
Not the prettiest solution, but changing the group parameter to output every column in model works in PostgreSQL:
expiration = Available.find(:all,
:joins => [ :room ],
:conditions => [ "rooms.hotel_id = ? AND availables.bookdate BETWEEN ? AND ?", hostel_id, date.to_s, (date+days-1).to_s ],
:group => Available.column_names.collect{|col| "availables.#{col}"},
:order => 'availables.updated_at')
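The rule the answers above describe (every selected column is either a grouping column or wrapped in an aggregate) can be sketched as follows. This is an illustration under assumptions: it uses an in-memory SQLite database from Python, a cut-down `availables` table without the `rooms` join, and one hypothetical extra row (id 417). SQLite, like older MySQL, also accepts the non-standard form, so the sketch cannot reproduce the PostgreSQL error; it only shows a query shape that stricter engines accept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE availables (
        id INTEGER PRIMARY KEY, price REAL, spots INTEGER,
        bookdate TEXT, room_id INTEGER
    );
    INSERT INTO availables VALUES
        (414, 38.0, 1, '2009-11-22', 1762),
        (415, 38.0, 1, '2009-11-23', 1762),
        (416, 38.0, 2, '2009-11-24', 1762),
        -- hypothetical second room on the first date
        (417, 40.0, 3, '2009-11-22', 1763);
""")

# One row per bookdate: the grouping column is selected as-is and every
# other output column is an aggregate.
per_date = conn.execute("""
    SELECT bookdate,      -- grouping column
           MIN(price),    -- aggregate
           SUM(spots)     -- aggregate
    FROM availables
    GROUP BY bookdate
    ORDER BY bookdate
""").fetchall()
```

If whole rows are needed rather than per-group summaries, grouping is the wrong tool and a "pick one row per group" technique such as DISTINCT ON is the better fit.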
According to MySQL's "Debunking GROUP BY Myths" (http://dev.mysql.com/tech-resources/articles/debunking-group-by-myths.html), SQL (the 2003 version of the standard) doesn't require columns referenced in the SELECT list of a query to also appear in the GROUP BY clause.
For others looking for a way to order by any field, including a joined field, in PostgreSQL, use a subquery:
SELECT * FROM (
    SELECT DISTINCT ON (availables.bookdate) availables.*
    FROM availables INNER JOIN rooms ON rooms.id = availables.room_id
    WHERE (rooms.hotel_id = 5056
        AND availables.bookdate BETWEEN '2009-11-22' AND '2009-11-24')
) AS distinct_selected
ORDER BY distinct_selected.updated_at
or arel:
subquery = SomeRecord.select("distinct on(xx.id) xx.*, jointable.order_field")
  .where("").joins("")
result = SomeRecord.select("*").from("(#{subquery.to_sql}) AS distinct_selected").order(" xx.order_field ASC, jointable.order_field ASC")
I think that .uniq [1] will solve your problem.
[1] Available.select('...').uniq
Take a look at http://guides.rubyonrails.org/active_record_querying.html#selecting-specific-fields

Postgresql query results to depend on few rows of same table

I'm working on an application, and we're using Postgres as our DB. I don't have a lot of experience with SQL, and I've now run into a problem that I can't find an answer to.
So here's the problem:
We have privacy settings stored in a separate table, and the accessibility of each row of data depends on a few rows of this privacy table.
Basically, the structure of the privacy table is:
entityId | entityType | privacyId | privacyType | allow | deletedAt
-------------------------------------------------------------------
5        | user       | 6         | user        | f     |           //example entry
5        | user       | 1         | user_all    | t     |
In short, these settings mean that user id 5 allows everybody to access his data except user id 6.
So I get the available data with a query like:
SELECT <some_relevant_fields> FROM <table>
JOIN <join>
WHERE
    (privacy."privacyId"=6 AND privacy."privacyType"='user' AND privacy.allow=true)
    OR (
        (privacy."privacyType"='user_all' AND privacy."deletedAt" IS NOT NULL)
        AND
        (privacy."privacyType"='user' AND privacy."privacyId"=6 AND privacy.allow!=false)
    );
I know that this query is incorrect in this form, but I want you to get the idea of what I'm trying to achieve.
So it must check for a row with the matching type/id and allow=true, OR check that user_all is not deleted (the deletedAt field is null) and that there is no row restricting access for this user with allow=false.
But it seems Postgres applies all the conditions to the same row, so privacy."privacyType"='user_all' conflicts with 'user' at the end of the expression. It either returns no results, or returns data even if the user is "blocked", because 'user_all' exists.
Is there a way to write a WHERE clause that returns a result if an AND expression is true across 2 different rows? For example, in the code above, (privacy."privacyType"='user_all' AND privacy."deletedAt" IS NOT NULL) is true for one row AND (privacy."privacyType"='user' AND privacy."privacyId"=6 AND privacy.allow!=false) is true for another. Or maybe check for the absence of a row with these values?
Is this what you want?
select <some_fields> from <table> where
"privacyType"='user_all' AND "deletedAt" IS NOT NULL
union
select <some_fields> from <table> where
"privacyType"='user' AND "privacyId"=6 AND allow<>'f';
You can left join the table with itself and use the WHERE clause to find the rows that have no matching blocking row.
SELECT p1.*
FROM privacy p1
LEFT JOIN privacy p2
    ON p1."entityId" = p2."entityId"
    AND p2."privacyType" = 'user'
    AND p2."privacyId" = 6
    AND p2.allow = false
WHERE p1."privacyType" = 'user_all'
    AND p1."deletedAt" IS NULL
    AND p2."privacyId" IS NULL
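The access rule described in the question spans several rows, which is what EXISTS and NOT EXISTS subqueries are designed for. Below is a minimal sketch, assuming an in-memory SQLite database driven from Python instead of PostgreSQL: `allow` is stored as 0/1 because SQLite has no boolean type (PostgreSQL would use true/false), and the helper `can_access` and its parameter names are hypothetical. It encodes the rule as stated: an explicit allow row for the viewer, or a non-deleted user_all row with no blocking row for the viewer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE privacy (
        "entityId" INTEGER, "entityType" TEXT,
        "privacyId" INTEGER, "privacyType" TEXT,
        allow INTEGER, "deletedAt" TEXT
    );
    INSERT INTO privacy VALUES
        (5, 'user', 6, 'user',     0, NULL),  -- user 5 blocks user 6
        (5, 'user', 1, 'user_all', 1, NULL);  -- user 5 allows everybody else
""")

ACCESS_SQL = """
    SELECT
        EXISTS (SELECT 1 FROM privacy p
                WHERE p."entityId" = :entity AND p."privacyType" = 'user'
                  AND p."privacyId" = :viewer AND p.allow = 1)
        OR (
            EXISTS (SELECT 1 FROM privacy p
                    WHERE p."entityId" = :entity
                      AND p."privacyType" = 'user_all'
                      AND p."deletedAt" IS NULL)
            AND NOT EXISTS (SELECT 1 FROM privacy p
                            WHERE p."entityId" = :entity
                              AND p."privacyType" = 'user'
                              AND p."privacyId" = :viewer AND p.allow = 0)
        )
"""


def can_access(conn, entity_id, viewer_id):
    """Return True if viewer_id may see data owned by entity_id."""
    params = {"entity": entity_id, "viewer": viewer_id}
    return bool(conn.execute(ACCESS_SQL, params).fetchone()[0])


blocked_viewer = can_access(conn, 5, 6)
other_viewer = can_access(conn, 5, 7)
```

In a real query the same three subqueries would normally sit in the WHERE clause of the data query, correlated on the owning entity's id, rather than in a standalone SELECT.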

Is it possible in PL/pgSQL to evaluate a string as an expression, not a statement?

I have two database tables:
# \d table_1
     Table "public.table_1"
   Column   |  Type   | Modifiers
------------+---------+-----------
 id         | integer |
 value      | integer |
 date_one   | date    |
 date_two   | date    |
 date_three | date    |
# \d table_2
     Table "public.table_2"
   Column   |  Type   | Modifiers
------------+---------+-----------
 id         | integer |
 table_1_id | integer |
 selector   | text    |
The values in table_2.selector can be one of one, two, or three, and are used to select one of the date columns in table_1.
My first implementation used a CASE:
SELECT value
FROM table_1
INNER JOIN table_2 ON table_2.table_1_id = table_1.id
WHERE CASE table_2.selector
        WHEN 'one' THEN
            table_1.date_one
        WHEN 'two' THEN
            table_1.date_two
        WHEN 'three' THEN
            table_1.date_three
        ELSE
            table_1.date_one
    END BETWEEN ? AND ?
The values for selector are such that I could identify the column of interest as eval(date_#{table_2.selector}), if PL/pgSQL allows evaluation of strings as expressions.
The closest I've been able to find is EXECUTE string, which evaluates entire statements. Is there a way to evaluate expressions?
In a PL/pgSQL function you can build any expression dynamically. That does not help in the case you described, however: the query text must be fully defined before it is executed, whereas the choice of column has to happen row by row while the query is running.
Your query is the best approach. You could wrap the logic in a function, but that brings no benefit, since the essence of the issue remains unchanged.
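Because the column choice happens per row, the CASE expression has to be part of the query text before execution. If the repetition is the concern, one option is to generate that text in the calling code from a fixed list of selectors. The following is a sketch under assumptions: it uses Python with an in-memory SQLite database rather than PL/pgSQL, dates are stored as ISO-8601 text, and the helper `build_date_case` and the sample rows are hypothetical.

```python
import sqlite3

SELECTORS = ("one", "two", "three")  # fixed whitelist, never user input


def build_date_case(selectors, default="one"):
    """Build the CASE expression mapping table_2.selector to a date column."""
    branches = " ".join(
        f"WHEN '{name}' THEN table_1.date_{name}" for name in selectors
    )
    return f"CASE table_2.selector {branches} ELSE table_1.date_{default} END"


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (id INTEGER, value INTEGER,
                          date_one TEXT, date_two TEXT, date_three TEXT);
    CREATE TABLE table_2 (id INTEGER, table_1_id INTEGER, selector TEXT);
    -- hypothetical sample rows
    INSERT INTO table_1 VALUES
        (1, 100, '2020-01-01', '2021-01-01', '2022-01-01'),
        (2, 200, '2020-06-01', '2021-06-01', '2022-06-01');
    INSERT INTO table_2 VALUES (1, 1, 'three'), (2, 2, 'one');
""")

# The CASE text is fixed before execution; only the date range is bound.
sql = f"""
    SELECT table_1.value
    FROM table_1
    INNER JOIN table_2 ON table_2.table_1_id = table_1.id
    WHERE {build_date_case(SELECTORS)} BETWEEN ? AND ?
    ORDER BY table_1.value
"""
values_2022 = [row[0] for row in conn.execute(sql, ("2022-01-01", "2022-12-31"))]
values_2020 = [row[0] for row in conn.execute(sql, ("2020-01-01", "2020-12-31"))]
```

The generated statement is the same CASE query as in the question, so this only removes the hand-written repetition; it does not change how the database evaluates it.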