Postgres Update table with limit - postgresql

Postgres fails when I use limit with update statement
I want to update just the first record of a table
UPDATE test1 SET name = 'user101' LIMIT 1;
This fails in Postgres because UPDATE does not accept a LIMIT clause.
I can do this instead:
UPDATE test1 SET name = 'user101'
WHERE ID = (SELECT ID FROM test1 LIMIT 1);
But to use the above sql I should know the column name (ID) and I don't have access to column name in my case.
I need a sql statement that does not require column name but updates only the first record.
Update:
I am using ruby #db.exec(query) to execute query
We pass different sql statements like below to the method which calls #db.exec(query)
update table1 set t1_name = "123" limit 1;
update table2 set something = "xyz" limit 1;
update table3 set som = "abc" limit 1;
now before calling #db.exec(query)
I want to modify the query such that the query won't use limit
In this case I have access to table name, column name that I want to update. But I don't have access to any other column names (ID).

Although I think this approach has a very bad smell (what kind of application sends updates for random rows to the database), you can use the internal ctid column for this which is always available.
UPDATE test1
SET name = 'user101'
WHERE ctid = (SELECT ctid FROM test1 LIMIT 1)
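Since the question asks for the query to be modified before it is executed, the rewrite into the ctid form can be sketched as a small string transformation. This is only a sketch under assumptions: the function name rewrite_limited_update is hypothetical, the incoming statement is assumed to be a plain "UPDATE <table> SET ... LIMIT <n>" with no WHERE clause, and string literals are assumed to already use single quotes as Postgres requires. Without an ORDER BY, which row counts as "first" is arbitrary.

```python
import re

# Matches only the simple shape: UPDATE <table> SET <assignments> LIMIT <n>[;]
# Assumption: no WHERE clause is present in the incoming statement.
_LIMITED_UPDATE = re.compile(
    r"^\s*update\s+(?P<table>[A-Za-z_][A-Za-z0-9_.]*)\s+set\s+"
    r"(?P<assignments>.+?)\s+limit\s+(?P<n>\d+)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def rewrite_limited_update(query: str) -> str:
    """Turn 'UPDATE t SET ... LIMIT n' into an equivalent ctid-based UPDATE.

    Statements that do not match the simple shape are returned unchanged.
    """
    match = _LIMITED_UPDATE.match(query)
    if match is None:
        return query
    table = match["table"]
    # IN (rather than =) keeps the rewrite valid when the limit is above 1.
    return (
        f"UPDATE {table} SET {match['assignments']} "
        f"WHERE ctid IN (SELECT ctid FROM {table} LIMIT {match['n']});"
    )


if __name__ == "__main__":
    print(rewrite_limited_update("update table1 set t1_name = '123' limit 1;"))
```

The same regular expression and string template carry over directly to Ruby if the rewrite has to live next to the existing exec call.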

Related

Selecting an entry from PostgreSQL table based on time and id using psycopg2

I have the following table in PostgreSQL DB:
[image: DB excerpt]
I need a PostgreSQL command to get a specific value from tbl column, based on time_launched and id columns. More precisely, I need to get a value from tbl column which corresponds to a specific id and latest (time-wise) value from time_launched column. Consequently, the request should return "x" as an output.
I've tried those requests (using psycopg2 module) but they did not work:
db_object.execute("SELECT * FROM check_ids WHERE id = %s AND MIN(time_launched)", (id_variable,))
db_object.execute(SELECT DISTINCT on(id, check_id) id, check_id, time_launched, tbl, tbl_1 FROM check_ids order by id, check_id time_launched desc)
Looks like a simple ORDER BY with a LIMIT 1 should do the trick:
SELECT tbl
FROM check_ids
WHERE id = %s
ORDER BY time_launched DESC
LIMIT 1
The WHERE clause filters results by the provided id, the ORDER BY clause sorts them in reverse chronological order, and LIMIT 1 returns only the first (most recent) row.
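For completeness, here is a minimal sketch of running that query through psycopg2. The helper name latest_tbl_value is hypothetical, and the cursor is assumed to be an ordinary psycopg2 cursor on a connection that already exists. The first attempt in the question fails because aggregate functions such as MIN are not allowed in a WHERE clause.

```python
LATEST_TBL_SQL = """
SELECT tbl
FROM check_ids
WHERE id = %s
ORDER BY time_launched DESC
LIMIT 1
"""


def latest_tbl_value(cursor, id_value):
    """Return tbl from the most recently launched row for id_value, or None."""
    # The id is passed as a parameter; psycopg2 handles the quoting.
    cursor.execute(LATEST_TBL_SQL, (id_value,))
    row = cursor.fetchone()
    # fetchone() returns None when no row matches the id.
    return None if row is None else row[0]
```

With a real connection this would be called as latest_tbl_value(conn.cursor(), id_variable).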

How can I use a UNION statement or an OR statement inside psql's UPDATE command?

I have an app that vends a 'code' to users through an api. A code belongs to a pool of codes that when a user hits an endpoint, he/she will get a code from this 'pool'. At the moment there is only 1 'pool' of codes where a code can be vended. That idea is best expressed below in the following sql.
<<-SQL
UPDATE codes SET vended_at = NOW()
WHERE id = (
SELECT "codes"."id"
FROM "codes"
INNER JOIN "code_batches" ON "code_batches"."id" = "codes"."code_batch_id"
WHERE "codes"."vended_at" IS NULL
AND "code_batches"."active" = true
ORDER BY "code_batches"."end_at" ASC
FOR UPDATE OF "codes" SKIP LOCKED
LIMIT 1
)
RETURNING *;
SQL
So basically, when the end point is pinged, I am returning a code that is active and its vended_at field is NULL.
Now what I need to do is to build off of this sql so that a user can get a code from this pool or from a second pool. So for example, lets say that if the user couldn't get a code from this pool (we will call it A represented by the above sql), I need to vend a code from another pool (we will call it B).
I looked up the documentation of postgresql and I think what I want to do is to either 1). Use a UNION somehow to combine pools A and B into one megapool to vend a code or if I can't vend a code through pool A, use postgresql's OR clause to select from pool B.
The problem is that I can't seem to be able to use either of these syntaxes. I've tried something along the lines like this, tweaking it with different variations.
<<-SQL
UPDATE codes SET vended_at = NOW()
WHERE id = (
SELECT "codes"."id"
FROM "codes"
INNER JOIN "code_batches" ON "code_batches"."id" = "codes"."code_batch_id"
WHERE "codes"."vended_at" IS NULL
AND "code_batches"."active" = true
ORDER BY "code_batches"."end_at" ASC
FOR UPDATE OF "codes" SKIP LOCKED
LIMIT 1
) UNION (
######## SELECT SOME OTHER STUFF #########
)
RETURNING *;
SQL
or
<<-SQL
UPDATE codes SET vended_at = NOW()
WHERE id = (
SELECT "codes"."id"
FROM "codes"
INNER JOIN "code_batches" ON "code_batches"."id" = "codes"."code_batch_id"
WHERE "codes"."vended_at" IS NULL
AND "code_batches"."active" = true
ORDER BY "code_batches"."end_at" ASC
FOR UPDATE OF "codes" SKIP LOCKED
LIMIT 1
) OR (
######## SELECT SOME OTHER STUFF USING OR #########
)
RETURNING *;
SQL
So far the syntax is off and I'm starting to wonder if I can even use this approach for what I'm trying to do. I can't determine if my approach is wrong or if maybe I am using UNION, OR, and SUB-SELECTS wrong. Does anyone have any advice I can try to accomplish my goal? Thank you.
####### EDIT ########
To illustrate and make the concept even easier, I essentially want to do this.
<<-SQL
UPDATE codes SET vended_at = NOW()
WHERE id = (
CRITERIA 1
)
OR/UNION
(
CRITERIA 2
)
RETURNING *;
SQL
Use one table to store both pools.
Add a pool_number column to the codes table to indicate which pool the code is in, then just add
ORDER BY pool_number
to your existing query.
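As a rough sketch of what that could look like, the query below adds the suggested column to the existing ORDER BY. It assumes pool_number is 1 for pool A and 2 for pool B, so that pool A is always tried first, and it keeps the original end_at ordering as a tie-breaker; both choices, and the helper name vend_code, are assumptions rather than part of the answer. It is written as a Python string for a psycopg2-style cursor, but the SQL can be dropped into the Ruby heredoc unchanged.

```python
VEND_SQL = """
UPDATE codes SET vended_at = NOW()
WHERE id = (
    SELECT codes.id
    FROM codes
    INNER JOIN code_batches ON code_batches.id = codes.code_batch_id
    WHERE codes.vended_at IS NULL
      AND code_batches.active = true
    -- Assumed: lower pool_number means higher priority (pool A before pool B).
    ORDER BY codes.pool_number ASC, code_batches.end_at ASC
    LIMIT 1
    FOR UPDATE OF codes SKIP LOCKED
)
RETURNING *
"""


def vend_code(cursor):
    """Vend one code, preferring the lowest pool_number; None if none are left."""
    cursor.execute(VEND_SQL)
    return cursor.fetchone()
```

Because both pools live in one table, no UNION or OR is needed: a code from pool B is only picked once no unvended, unlocked code from pool A remains.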

Update a very large table in PostgreSQL without locking

I have a very large table with 100M rows in which I want to update a column with a value on the basis of another column. The example query to show what I want to do is given below:
UPDATE mytable SET col2 = 'ABCD'
WHERE col1 is not null
This is a master DB in a live environment with multiple slaves and I want to update it without locking the table or affecting the performance of the live environment. What would be the most effective way to do it? I'm thinking of writing a procedure that updates rows in batches of 1000 or 10000 using something like LIMIT, but I'm not quite sure how to do it as I'm not that familiar with Postgres and its pitfalls. Neither column has an index, although other columns of the table do.
I would appreciate a sample procedure code.
Thanks.
There is no update without locking, but you can strive to keep the row locks few and short.
You could simply run batches of this:
UPDATE mytable
SET col2 = 'ABCD'
FROM (SELECT id
FROM mytable
WHERE col1 IS NOT NULL
AND col2 IS DISTINCT FROM 'ABCD'
LIMIT 10000) AS part
WHERE mytable.id = part.id;
Just keep repeating that statement until it modifies fewer than 10000 rows, then you are done.
Note that mass updates don't lock the table, but of course they lock the updated rows, and the more of them you update, the longer the transaction, and the greater the risk of a deadlock.
To make that performant, an index like this would help:
CREATE INDEX ON mytable (col2) WHERE col1 IS NOT NULL;
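The "repeat until a batch comes back short" loop could be driven from a client script along these lines. This is a sketch assuming a psycopg2-style connection and the id column used in the statement above; the function name update_in_batches is hypothetical. Committing after every batch is what keeps each transaction, and therefore each set of row locks, short.

```python
BATCH_SQL = """
UPDATE mytable
SET col2 = 'ABCD'
FROM (SELECT id
      FROM mytable
      WHERE col1 IS NOT NULL
        AND col2 IS DISTINCT FROM 'ABCD'
      LIMIT %s) AS part
WHERE mytable.id = part.id
"""


def update_in_batches(conn, batch_size=10000):
    """Run the batched UPDATE until a batch modifies fewer than batch_size rows.

    Returns the total number of rows updated.
    """
    total = 0
    while True:
        with conn.cursor() as cur:
            cur.execute(BATCH_SQL, (batch_size,))
            updated = cur.rowcount
        # One transaction per batch, so locks are released between batches.
        conn.commit()
        total += updated
        if updated < batch_size:
            return total
```

A short pause between batches is an optional addition if the replicas need time to keep up.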
Just an off-the-wall, out-of-the-box idea. Since the requirement that both col1 and col2 be null precludes using an index, building a pseudo-index might be an option. This 'index' would be a regular table, but it would only exist for a short period. It also relieves the worry about lock time.
create table indexer (mytable_id integer primary key);
insert into indexer(mytable_id)
select mytable_id
from mytable
where col1 is null
and col2 is null;
The above creates our 'index', which contains only the qualifying rows. Now wrap an update/delete statement in an SQL function. This function updates the main table, deletes the updated rows from the 'index', and returns the number of rows remaining.
create or replace function set_mytable_col2(rows_to_process_in integer)
returns bigint
language sql
as $$
with idx as
( update mytable
set col2 = 'ABCD'
where col2 is null
and mytable_id in (select mytable_id
from indexer
limit rows_to_process_in
)
returning mytable_id
)
delete from indexer
where mytable_id in (select mytable_id from idx);
select count(*) from indexer;
$$;
When the function returns 0, all rows initially selected have been processed. At this point, repeat the entire process to pick up any rows added or updated that the initial selection didn't identify. That should be a small number, and the process is still available if needed later.
Like I said just an off-the-wall idea.
Edited
I must have read something into it that wasn't there concerning col1. However, the idea remains the same: just change the INSERT statement for 'indexer' to meet your requirements. As for setting the value in the 'index': no, the 'index' contains a single column, the primary key of the big table (and of itself).
Yes you would need to run multiple times unless you give it the total number rows to process as the parameter. The below is a DO block that would satisfy your condition. It processes 200,000 on each pass. Change that to fit your need.
Do $$
declare
rows_remaining bigint;
begin
loop
rows_remaining = set_mytable_col2(200000);
commit;
exit when rows_remaining = 0;
end loop;
end; $$;

Postgresql Increment if exist or Create a new row

Hello I have a simple table like that:
+------------+------------+----------------------+----------------+
|id (serial) | date(date) | customer_fk(integer) | value(integer) |
+------------+------------+----------------------+----------------+
I want to use every row as a daily accumulator: when a customer value arrives
and no record exists for that customer and date, create a new row for that customer and date; if one already exists, only increment the value.
I don't know how to implement something like that. I only know how to increment a value using SET, but more logic is required here. Thanks in advance.
I'm using version 9.4
It sounds like what you are wanting to do is an UPSERT.
http://www.postgresql.org/docs/devel/static/sql-insert.html
In this type of query, you update the record if it exists or you create a new one if it does not. The key in your table would consist of customer_fk and date.
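This would be a normal insert, but with ON CONFLICT (customer_fk, date) DO UPDATE SET value = mytable.value + 1. The conflict target requires a unique constraint or index on (customer_fk, date), and the column on the right-hand side has to be qualified with the table name to avoid an ambiguous reference.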
NOTE: This only works as of Postgres 9.5. It is not possible in previous versions. For versions prior to 9.1, the only solution is two steps. For 9.1 or later, a CTE may be used as well.
For earlier versions of Postgres, you will need to perform an UPDATE first with customer_fk and date in the WHERE clause. From there, check to see if the number of affected rows is 0. If it is, then do the INSERT. The only problem with this is there is a chance of a race condition if this operation happens twice at nearly the same time (common in a web environment) since the INSERT has a chance of failing for one of them and your count will always have a chance of being slightly off.
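The two-step approach for older versions might look like the sketch below. It assumes a psycopg2-style cursor and the table name mytable; the function name increment_daily_value is hypothetical. The race condition described above still applies: two concurrent callers can both see zero affected rows and both attempt the INSERT.

```python
def increment_daily_value(cursor, customer_fk, day, amount=1):
    """UPDATE the accumulator row first; INSERT it if nothing was updated.

    Returns "updated" or "inserted" to show which branch ran.
    """
    cursor.execute(
        "UPDATE mytable SET value = value + %s "
        "WHERE customer_fk = %s AND date = %s",
        (amount, customer_fk, day),
    )
    if cursor.rowcount == 0:
        # No row for this customer and date yet, so create it.
        cursor.execute(
            "INSERT INTO mytable (date, customer_fk, value) "
            "VALUES (%s, %s, %s)",
            (day, customer_fk, amount),
        )
        return "inserted"
    return "updated"
```

A unique constraint on (customer_fk, date) at least turns the losing INSERT into an error that the caller can catch and retry as an UPDATE, instead of silently creating a duplicate row.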
If you are using Postgres 9.1 or above, you can use an updatable CTE as cleverly pointed out here: Insert, on duplicate update in PostgreSQL?
This solution is less likely to result in a race condition since it's executed in one step.
WITH new_values (date, customer_fk, value) AS (
VALUES
(current_date, 24, 1)
),
upsert AS (
UPDATE mytable m
SET value = m.value + nv.value
FROM new_values nv
WHERE m.date = nv.date AND m.customer_fk = nv.customer_fk
RETURNING m.*
)
INSERT INTO mytable (date, customer_fk, value)
SELECT date, customer_fk, value
FROM new_values
WHERE NOT EXISTS (SELECT 1
FROM upsert up
WHERE up.date = new_values.date
AND up.customer_fk = new_values.customer_fk);
This contains two CTE tables. One contains the data you are inserting (new_values) and the other contains the results of an UPDATE query using those values (upsert). The last part uses these two tables to check if the records in new_values are not present in upsert, which would mean the UPDATE failed, and performs an INSERT to create the record instead.
As a side note, if you were doing this in another SQL engine that conforms to the standard, you would use a MERGE query instead. [ https://en.wikipedia.org/wiki/Merge_(SQL) ]

how to use results from first query in second query

I've been reading about mysqli multi_query and couldn't find a way to do this (if it's possible)
$db->multi_query("SELECT id FROM table WHERE session='1';
UPDATE table SET last_login=NOW() WHERE id=table.id");
It doesn't seem to work. I am trying to use the id from the first query in the second. Is this possible?
UPDATE table
SET last_login = NOW()
WHERE id IN (SELECT id
FROM table2
WHERE session = '1')
That will update all your records with session = '1', even if the subquery returns more than one row, which from what I can see, it will.
This also allows you to drop the multi_query() method, as it's just a single query.
In response to the comment:
According to http://lists.mysql.com/mysql/219882 this doesn't appear to be possible with MySQL. Although I suppose you could go for something like:
$db->multi_query(
"UPDATE table
SET last_login = NOW()
WHERE id IN (SELECT id
FROM table2
WHERE session = '1');
SELECT id
FROM table2
WHERE session = '1';"
);
Which is ugly, performing the same query twice, but should do what you want.