Update is ignoring the Limit clause - postgresql

GORM doesn't seem to support combining Limit with Update, and it doesn't throw an error either:
resUpdate := tx.Model(&daos.Voucher{}).
    Where("status = ?", models.VoucherStatusAvailable).
    Limit(quantity).
    Scan(&vouchers).
    Update("status", models.VoucherStatusBooked)
This query updates every matching row in my table from VoucherStatusAvailable to VoucherStatusBooked, without any regard for the Limit.
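For reference, the executed SQL comes out roughly like this (a sketch; exact quoting and updated_at handling vary by GORM version, and the literals stand in for the models.* constants). The LIMIT is simply dropped:
UPDATE vouchers SET status = 'booked'
WHERE status = 'available' AND deleted_at IS NULL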
Someone quickly addressed the subject here: https://penkovski.com/post/gorm-update-returning/
But their solution of putting the Limit in the Where clause:
resUpdate := tx.Model(&daos.Voucher{}).
    Where("status = ? LIMIT ?", models.VoucherStatusAvailable, quantity).
    Update("status", models.VoucherStatusBooked)
doesn't work when using GORM soft delete, since GORM puts parentheses around the WHERE condition and appends the AND deleted_at IS NULL filter after it.
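The generated SQL then comes out roughly like this (a sketch, with literal values standing in for the models.* constants and the quantity), which is invalid because LIMIT cannot appear inside the parenthesized condition:
UPDATE vouchers SET status = 'booked'
WHERE (status = 'available' LIMIT 10) AND deleted_at IS NULL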
Initially I was doing it with two separate queries (a SELECT, then an UPDATE), but in that case there is a chance of a conflict between the two.
Does anyone have an idea of how I should select X rows of the table and then update their status, without risking a conflict on the selected rows?
Edit : The DBMS I'm using is PostgreSQL

Several points were answered:
PostgreSQL doesn't support LIMIT or ORDER BY in an UPDATE statement, so I needed an alternative.
The right alternative seemed to be CTEs, as pointed out in a comment.
GORM supports custom plugins, and there is a plugin that implements CTEs: https://github.com/WinterYukky/gorm-extra-clause-plugin. Unfortunately, it isn't compatible with the Update clause.
So ultimately, the only solution I found is to write the CTE in raw SQL:
var updated []Voucher
resUpdate := tx.Raw(`WITH v AS (
    SELECT * FROM vouchers WHERE status = ?
    LIMIT ?
)
UPDATE vouchers SET status = ?
WHERE EXISTS (SELECT * FROM v WHERE vouchers.id = v.id)
RETURNING *`,
    models.VoucherStatusAvailable, quantity, models.VoucherStatusBooked).
    Scan(&updated)
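Note that two transactions running this concurrently can still pick overlapping rows before either one commits. On PostgreSQL 9.5+, the inner SELECT can take row locks with FOR UPDATE SKIP LOCKED (the same idea as the SKIP LOCKED answer further down this page); a sketch of the adjusted raw SQL:
WITH v AS (
    SELECT id FROM vouchers WHERE status = ?
    LIMIT ?
    FOR UPDATE SKIP LOCKED
)
UPDATE vouchers SET status = ?
WHERE EXISTS (SELECT * FROM v WHERE vouchers.id = v.id)
RETURNING *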

Related

Finding a non existing value in a column PostgreSQL

I'm working on a DSpace 5.10 repository with PostgreSQL 9.x. The problem is that, when harvested, there are a lot of items that lack metadata required by the regulating entity of my country. Is there a way to find out which item IDs don't have a specific field?
For example:
I need a query that gives me as a result all the resource_id values that don't have a metadata_field_id = X. The same resource_id has many metadata_field_id entries.
Thanks a lot.
If I'm understanding you properly:
You're looking to return every resource_id that doesn't have X in the metadata_field_id field.
There are multiple rows per resource_id, and only some of those rows lack X in their metadata_field_id column.
If so, try this:
SELECT distinct resource_id
FROM your_table_name
WHERE metadata_field_id != 'X'
By using distinct, you remove all duplicate rows. In this way, you'll only return unique resource_id. Without using distinct, you will return duplicate entries for resource_id in your result.
Here is the PostgreSQL documentation for distinct.
EDIT: DISTINCT is supported on all PostgreSQL versions in current use, including the 9.x series, so it is not a version concern here.
You need to get the list of all items that don't have a specific metadata value, so the easiest way is to exclude from the complete list the ones that actually have such metadata:
select item_id from item where item_id not in
(
    select resource_id from resourcepolicy
    where resource_type_id = 2 and metadata_field_id = ?
);
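If resource_id can ever be NULL in that subquery, NOT IN will return no rows at all; a NOT EXISTS formulation (same table and column names as in the answer above) should be equivalent and avoids that pitfall:
select i.item_id from item i
where not exists (
    select 1 from resourcepolicy rp
    where rp.resource_id = i.item_id
    and rp.resource_type_id = 2
    and rp.metadata_field_id = ?
);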

MemSql > workaround for SELECT ... FOR UPDATE

I am using MemSQL as my DB and I need SELECT ... FOR UPDATE functionality. However, it is not supported in version 6.5, which I am using. Is there any workaround for this problem?
My problem is as follows: multiple processes pick a single record (one that has not been processed yet) from the same table, do some work outside of SQL, then run an UPDATE to mark the record as processed. If I could do SELECT ... FOR UPDATE, I could lock the record to ensure that only one process picks it.
A workaround I can think of is using a LockToken column and doing something like
UPDATE Tbl SET LockToken = 'a_unique_token' WHERE LockToken IS NULL LIMIT 1;
SELECT * FROM Tbl WHERE LockToken = 'a_unique_token';
but in this case I get
Error Code: 1749. Feature 'UPDATE...LIMIT must be constrained to a single partition' is not supported by MemSQL Distributed.
I could also do the job with LOCK TABLES, but according to this, they are not supported either.
Is there any workaround to this type of problem?
Yes, your workaround is a good idea. One way you could work around that error is to pick a specific row to lock instead of using LIMIT 1, like UPDATE Tbl SET LockToken = 'a_unique_token' WHERE LockToken IS NULL AND id = (SELECT id FROM Tbl WHERE LockToken IS NULL LIMIT 1). (Or you could use (SELECT min(id) FROM Tbl WHERE LockToken IS NULL) or something similar, depending on which id you want to pick.) This should work well if you have an index on id.
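Written out, the suggested workaround would look something like this (assuming id is the table's primary key, as above):
UPDATE Tbl SET LockToken = 'a_unique_token'
WHERE LockToken IS NULL
AND id = (SELECT id FROM Tbl WHERE LockToken IS NULL LIMIT 1);

SELECT * FROM Tbl WHERE LockToken = 'a_unique_token';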
Also, you could check out version 6.7, where SELECT ... FOR UPDATE is now supported: https://docs.memsql.com/sql-reference/v6.7/select/.

LIMIT clause is not enforced in UPDATE subquery

I'm seeing a strange issue since upgrading to Postgres 10 from 9.4. This parameterised update query used to work reliably, but now it regularly (but not always) fails to enforce the LIMIT clause:
UPDATE tablename SET someValue = ? WHERE myKey IN (
    SELECT myKey
    FROM tablename
    WHERE status = 'good'
    ORDER BY timestamp ASC
    LIMIT ? FOR UPDATE
)
Try this:
SELECT ... FOR UPDATE SKIP LOCKED
LIMIT ? FOR UPDATE SKIP LOCKED is available from PostgreSQL 9.5 onwards.
With this clause, every thread only sees the records that are not locked FOR UPDATE by other threads, so no race occurs.
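Applied to the query from the question, that would look roughly like this:
UPDATE tablename SET someValue = ? WHERE myKey IN (
    SELECT myKey
    FROM tablename
    WHERE status = 'good'
    ORDER BY timestamp ASC
    LIMIT ? FOR UPDATE SKIP LOCKED
)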

Postgresql Increment if exist or Create a new row

Hello, I have a simple table like this:
+------------+------------+----------------------+----------------+
|id (serial) | date(date) | customer_fk(integer) | value(integer) |
+------------+------------+----------------------+----------------+
I want to use every row like a daily accumulator: when a value arrives for a customer, if no record exists for that customer and date, create a new row for that customer and date, but if one exists, just increment the value.
I don't know how to implement something like that; I only know how to increment a value using SET, but more logic is required here. Thanks in advance.
I'm using version 9.4
It sounds like what you want to do is an UPSERT.
http://www.postgresql.org/docs/devel/static/sql-insert.html
In this type of query, you update the record if it exists or you create a new one if it does not. The key in your table would consist of customer_fk and date.
This would be a normal insert, but with ON CONFLICT DO UPDATE SET value = value + 1.
NOTE: This only works as of Postgres 9.5. It is not possible in previous versions. For versions prior to 9.1, the only solution is two steps. For 9.1 or later, a CTE may be used as well.
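On 9.5 or later, a minimal sketch (assuming a unique constraint on (date, customer_fk) and the mytable naming used in the CTE example below):
INSERT INTO mytable (date, customer_fk, value)
VALUES (CURRENT_DATE, 24, 1)
ON CONFLICT (date, customer_fk)
DO UPDATE SET value = mytable.value + 1;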
For earlier versions of Postgres, you will need to perform an UPDATE first, with customer_fk and date in the WHERE clause. From there, check whether the number of affected rows is 0; if it is, do the INSERT. The only problem with this is the chance of a race condition if the operation happens twice at nearly the same time (common in a web environment), since the INSERT can then fail for one of the two, and your count always has a chance of being slightly off.
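As a sketch of that two-step approach (placeholder values; the affected-row check happens in application code):
UPDATE mytable SET value = value + 1
WHERE date = CURRENT_DATE AND customer_fk = 24;

-- only if the UPDATE reports 0 affected rows:
INSERT INTO mytable (date, customer_fk, value)
VALUES (CURRENT_DATE, 24, 1);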
If you are using Postgres 9.1 or above, you can use an updatable CTE as cleverly pointed out here: Insert, on duplicate update in PostgreSQL?
This solution is less likely to result in a race condition since it's executed in one step.
WITH new_values (date, customer_fk, value) AS (
    VALUES (CURRENT_DATE, 24, 1)
),
upsert AS (
    UPDATE mytable m
    SET value = m.value + 1
    FROM new_values nv
    WHERE m.date = nv.date AND m.customer_fk = nv.customer_fk
    RETURNING m.*
)
INSERT INTO mytable (date, customer_fk, value)
SELECT date, customer_fk, value
FROM new_values
WHERE NOT EXISTS (SELECT 1
                  FROM upsert up
                  WHERE up.date = new_values.date
                  AND up.customer_fk = new_values.customer_fk)
This contains two CTEs. One holds the data you are inserting (new_values) and the other holds the result of an UPDATE query using those values (upsert). The last part uses these two tables to check whether the records in new_values are absent from upsert, which would mean the UPDATE matched nothing, and performs an INSERT to create the record instead.
As a side note, if you were doing this in another SQL engine that conforms to the standard, you would use a MERGE query instead. [ https://en.wikipedia.org/wiki/Merge_(SQL) ]

PostgreSQL - must appear in the GROUP BY clause or be used in an aggregate function

I am getting this error in pg production mode, but it's working fine in sqlite3 development mode.
ActiveRecord::StatementInvalid in ManagementController#index
PG::Error: ERROR: column "estates.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT "estates".* FROM "estates" WHERE "estates"."Mgmt" = ...
^
: SELECT "estates".* FROM "estates" WHERE "estates"."Mgmt" = 'Mazzey' GROUP BY user_id
@myestate = Estate.where(:Mgmt => current_user.Company).group(:user_id).all
If user_id is the PRIMARY KEY then you need to upgrade PostgreSQL; newer versions will correctly handle grouping by the primary key.
If user_id is neither unique nor the primary key for the 'estates' relation in question, then this query doesn't make much sense, since PostgreSQL has no way to know which value to return for each column of estates where multiple rows share the same user_id. You must use an aggregate function that expresses what you want, like min, max, avg, string_agg, array_agg, etc or add the column(s) of interest to the GROUP BY.
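For the query in the question, one rewrite along those lines might look like this (assuming, purely for illustration, that a per-user count is what's wanted):
SELECT user_id, count(*) AS estates_count
FROM estates
WHERE "Mgmt" = 'Mazzey'
GROUP BY user_id;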
Alternatively, you can rephrase the query to use DISTINCT ON and an ORDER BY if you really do want to pick a somewhat arbitrary row, though I really doubt it's possible to express that via ActiveRecord.
Some databases - including SQLite and MySQL - will just pick an arbitrary row. This is considered incorrect and unsafe by the PostgreSQL team, so PostgreSQL follows the SQL standard and considers such queries to be errors.
If you have:
col1 col2
fred 42
bob 9
fred 44
fred 99
and you do:
SELECT col1, col2 FROM mytable GROUP BY col1;
then it's obvious that you should get the row:
bob 9
but what about the result for fred? There is no single correct answer to pick, so the database will refuse to execute such unsafe queries. If you wanted the greatest col2 for any col1 you'd use the max aggregate:
SELECT col1, max(col2) AS max_col2 FROM mytable GROUP BY col1;
I recently moved from MySQL to PostgreSQL and encountered the same issue. Just for reference, the best approach I've found is to use DISTINCT ON as suggested in this SO answer:
Elegant PostgreSQL Group by for Ruby on Rails / ActiveRecord
This will let you get one record for each unique value in your chosen column that matches the other query conditions:
MyModel.where(:some_col => value).select("DISTINCT ON (unique_col) *")
I prefer DISTINCT ON because I can still get all the other column values in the row. DISTINCT alone will only return the value of that specific column.
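In raw SQL, against the question's estates table, that would look something like this (the ORDER BY determines which row wins for each user_id):
SELECT DISTINCT ON (user_id) *
FROM estates
WHERE "Mgmt" = 'Mazzey'
ORDER BY user_id, id;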
After often receiving the error myself, I realised that Rails (I am using Rails 4) automatically adds an 'order by id' at the end of your grouping query. This often results in the error above, so make sure you append your own .order(:group_by_column) at the end of your Rails query. You will then have something like this:
@problems = Problem.select('problems.username, sum(problems.weight) as weight_sum').group('problems.username').order('problems.username')
@myestate1 = Estate.where(:Mgmt => current_user.Company)
@myestate = @myestate1.select("DISTINCT(user_id)")
This is what I did.