Two concurrent statements writing to database - postgresql

+----+----------+-------------+
| id | title    | description |
+----+----------+-------------+
| 1  | The King | Jonh X      |
+----+----------+-------------+
Two concurrent statements:
update book set title = 'aaa', description = 'aaa' where id = 1
update book set title = 'bbb', description = 'bbb' where id = 1
Is the following result theoretically possible?
+----+-------+-------------+
| id | title | description |
+----+-------+-------------+
| 1  | aaa   | bbb         |
+----+-------+-------------+
Similarly, can a concurrent SELECT observe a half-applied UPDATE?
update book set title = 'aaa', description = 'aaa' where id = 1
select title, description from book -> (The King, aaa)?
These statements are not wrapped in an explicit transaction.
How do popular database systems like SQL Server and Postgres behave?

Generally impossible in any ACID-compliant database.
ACID stands for atomicity, consistency, isolation, durability.
In particular, Postgres takes a write-lock on affected rows before the UPDATE and does not release it until the end of the transaction. (And every UPDATE runs inside a transaction, implicitly or explicitly.) Concurrent transactions trying to write to the same row must wait and re-evaluate filters once the lock is released. They may then change the row once more - or come up empty if the filters do not apply any more.
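The effect of the row lock can be illustrated with a toy model. The sketch below is not PostgreSQL: a plain Python lock stands in for the row-level write lock, and the names `row`, `row_lock` and `update_row` are made up for illustration. It only shows why a mixed result such as (aaa, bbb) cannot appear when each writer holds the lock for its whole update.

```python
import threading

# Toy model, not PostgreSQL: one lock stands in for the row-level write lock.
row = {"title": "The King", "description": "Jonh X"}
row_lock = threading.Lock()


def update_row(title, description):
    # Both columns change while the lock is held, so no other writer
    # can interleave between the two assignments.
    with row_lock:
        row["title"] = title
        row["description"] = description


def run_concurrent_updates():
    writers = [
        threading.Thread(target=update_row, args=("aaa", "aaa")),
        threading.Thread(target=update_row, args=("bbb", "bbb")),
    ]
    for writer in writers:
        writer.start()
    for writer in writers:
        writer.join()
    return row["title"], row["description"]


final = run_concurrent_updates()
```

Whichever writer runs last wins, but `final` is always one of the two consistent pairs, never a mixture.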

Related

UPDATE from temp table picking the "last" row per group

Suppose there is a table with data:
+----+-------+
| id | value |
+----+-------+
| 1 | 0 |
| 2 | 0 |
+----+-------+
I need to do a bulk update. I use COPY FROM STDIN for a fast insert into a temp table without constraints, so it can contain duplicate values in the id column.
Temp table to update from:
+----+-------+
| id | value |
+----+-------+
| 1 | 1 |
| 2 | 1 |
| 1 | 2 |
| 2 | 2 |
+----+-------+
If I simply run a query like this:
UPDATE test target SET value = source.value FROM tmp_test source WHERE target.id = source.id;
I get the wrong result:
+----+-------+
| id | value |
+----+-------+
| 1 | 1 |
| 2 | 1 |
+----+-------+
I need the target table to contain the values that appeared last in the temporary table.
What is the most efficient way to do this, given that the target table may contain millions of records, and the temporary table may contain tens of thousands?
Assuming you want to take the value from the row that was inserted last into the temp table, physically, you can (ab-)use the system column ctid, signifying the physical location:
UPDATE test AS target
SET    value = source.value
FROM  (
   SELECT DISTINCT ON (id)
          id, value
   FROM   tmp_test
   ORDER  BY id, ctid DESC
   ) source
WHERE  target.id = source.id
AND    target.value <> source.value;  -- skip empty updates
About DISTINCT ON:
Select first row in each GROUP BY group?
This builds on an implementation detail and is not backed by the SQL standard. If some insert method does not write rows in sequence (like a future "parallel" INSERT), it breaks. Currently, it should work. About ctid:
How do I decompose ctid into page and row numbers?
If you want a safe way, you need to add a user column to signify the order of rows, like a serial column. But do you really care? Your tiebreaker seems rather arbitrary. See:
Temporary sequence within a SELECT
AND target.value <> source.value
skips empty updates - assuming both columns are NOT NULL. Else, use:
AND target.value IS DISTINCT FROM source.value
See:
How do I (or can I) SELECT DISTINCT on multiple columns?
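The "safe way" with an explicit ordering column can be sketched as follows. This is a minimal illustration that runs against an in-memory SQLite database rather than Postgres, so it uses a portable correlated subquery instead of DISTINCT ON. The column name `seq` and the function name `apply_last_values` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (id INTEGER PRIMARY KEY, value INTEGER NOT NULL);
    -- seq is a hypothetical serial-like column recording insertion order.
    CREATE TABLE tmp_test (
        seq   INTEGER PRIMARY KEY AUTOINCREMENT,
        id    INTEGER NOT NULL,
        value INTEGER NOT NULL
    );
    INSERT INTO test VALUES (1, 0), (2, 0);
    INSERT INTO tmp_test (id, value) VALUES (1, 1), (2, 1), (1, 2), (2, 2);
""")


def apply_last_values(connection):
    """Copy, for every id, the value of the row with the highest seq."""
    with connection:  # commit on success, roll back on error
        connection.execute("""
            UPDATE test
            SET value = (SELECT s.value
                         FROM tmp_test AS s
                         WHERE s.id = test.id
                         ORDER BY s.seq DESC
                         LIMIT 1)
            WHERE EXISTS (SELECT 1 FROM tmp_test AS s WHERE s.id = test.id)
        """)


apply_last_values(conn)
result = conn.execute("SELECT id, value FROM test ORDER BY id").fetchall()
```

Because the order comes from `seq` and not from physical row placement, the outcome does not depend on how the rows were written to the temp table.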

Querying data with additional column that creates a number for ordering purposes

I am trying to create a "queue" system by adding an arbitrary column that creates a number based on a condition and date, to sort the importance of a row.
For example, below is the query result I pulled in Postgres:
Table: task
Result:
description | status/condition| task_created |
bla | A | 2019-12-01 07:00:00|
pikachu | A | 2019-12-01 16:32:10|
abcdef | B | 2019-12-02 18:34:22|
doremi | B | 2019-12-02 15:09:43|
lalala | A | 2019-12-03 22:10:59|
In the above, each task has a date/timestamp and status/condition applied to them. I would like to create another column that gives a number to a row where it prioritises the older tasks first, BUT if the condition is B, then we take the older task of those in B as first priority.
The expected end result (based on the example) should be:
Table1: task
description | status/condition| task_created | priority index
bla | A | 2019-12-01 07:00:00| 3
pikachu | A | 2019-12-01 16:32:10| 4
abcdef | B | 2019-12-02 18:34:22| 2
doremi | B | 2019-12-02 15:09:43| 1
lalala | A | 2019-12-03 22:10:59| 5
For priority number, 1 being most urgent to do/resolve, while 5 being the least.
How would I go about adding this additional column into the existing query? especially since there's another condition apart from just the task_created date/time.
Any help is appreciated. Many thanks!
You probably want the rank() or dense_rank() window function, depending on your needs.
If you don't need a conditional order on the status, you can use this one:
SELECT *,
       rank() OVER (
           ORDER BY status DESC, task_created
       ) AS priority_index
FROM task
If you need a custom order based on the value of the status:
SELECT *,
       rank() OVER (
           ORDER BY
               CASE status
                   WHEN 'B' THEN 1
                   WHEN 'A' THEN 2
                   WHEN 'C' THEN 3
                   ELSE 4
               END, task_created
       ) AS priority_index
FROM task
If you have only a few values, this is good enough, because you can simply spell out the custom order in the query. But if you have a lot of values and the ordering information is fixed, that information should live in its own table.
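A rough sketch of the "own table" idea, run against an in-memory SQLite database for convenience. The table name `status_priority` and its column `sort_order` are hypothetical. The numbering is done with `enumerate`, which behaves like row_number() and not like rank(): rows that tie on both sort keys would still get distinct numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical lookup table holding the custom order of each status.
    CREATE TABLE status_priority (status TEXT PRIMARY KEY, sort_order INTEGER NOT NULL);
    CREATE TABLE task (description TEXT, status TEXT, task_created TEXT);
    INSERT INTO status_priority VALUES ('B', 1), ('A', 2);
    INSERT INTO task VALUES
        ('bla',     'A', '2019-12-01 07:00:00'),
        ('pikachu', 'A', '2019-12-01 16:32:10'),
        ('abcdef',  'B', '2019-12-02 18:34:22'),
        ('doremi',  'B', '2019-12-02 15:09:43'),
        ('lalala',  'A', '2019-12-03 22:10:59');
""")


def prioritized_tasks(connection):
    """Return (priority_index, description, status, task_created) tuples."""
    rows = connection.execute("""
        SELECT t.description, t.status, t.task_created
        FROM task AS t
        JOIN status_priority AS sp ON sp.status = t.status
        ORDER BY sp.sort_order, t.task_created
    """).fetchall()
    # Number the sorted rows starting at 1 (most urgent first).
    return [(index, *row) for index, row in enumerate(rows, start=1)]


prioritized = prioritized_tasks(conn)
```

Changing the priority of a status then means updating one row in the lookup table instead of editing every query that contains the CASE expression.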

joining with a DISTINCT ON on an ordered subquery in sqlalchemy

Here is (an extremely simplified version of) my problem.
I'm using Postgresql as the backend and trying to build a sqlalchemy query
from another query.
Table setup
Here are the tables with some random data for the example.
You can assume that each table was declared in sqlalchemy declaratively, with
the name of the mappers being respectively Item and ItemVersion.
At the end of the question you can find a link where I put the code for
everything in this question, including the table definitions.
Some items.
item
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
+----+
A table containing versions of each item. Each has at least one.
item_version
+----+---------+---------+-----------+
| id | item_id | version | text |
+----+---------+---------+-----------+
| 1 | 1 | 0 | item_1_v0 |
| 2 | 1 | 1 | item_1_v1 |
| 3 | 2 | 0 | item_2_v0 |
| 4 | 3 | 0 | item_3_v0 |
+----+---------+---------+-----------+
The query
Now, for a given sqlalchemy query over Item, I want a function that returns
another query, but this time over (Item, ItemVersion), where the Items are
the same as in the original query (and in the same order!), and where the
ItemVersion are the corresponding latest versions for each Item.
Here is an example in SQL, which is pretty straightforward:
First a random query over the item table
SELECT item.id as item_id
FROM item
WHERE item.id != 2
ORDER BY item.id DESC
which corresponds to
+---------+
| item_id |
+---------+
| 3 |
| 1 |
+---------+
Then from that query, if I want to join the right versions, I can do
SELECT sq2.item_id AS item_id,
       sq2.item_version_id AS item_version_id,
       sq2.item_version_text AS item_version_text
FROM (
    SELECT DISTINCT ON (sq.item_id)
           sq.item_id AS item_id,
           iv.id AS item_version_id,
           iv.text AS item_version_text
    FROM (
        SELECT item.id AS item_id
        FROM item
        WHERE id != 2
        ORDER BY id DESC) AS sq
    JOIN item_version AS iv
      ON iv.item_id = sq.item_id
    ORDER BY sq.item_id, iv.version DESC) AS sq2
ORDER BY sq2.item_id DESC
Note that it has to be wrapped in a subquery a second time because the
DISTINCT ON discards the ordering.
Now the challenge is to write a function that does that in sqlalchemy.
Here is what I have so far.
First the initial sqlalchemy query over the items:
session.query(Item).filter(Item.id != 2).order_by(desc(Item.id))
Then I'm able to build my second query but without the original ordering. In
other words I don't know how to do the second subquery wrapping that I did in
SQL to get back the ordering that was discarded by the DISTINCT ON.
def join_version(session, query):
    sq = aliased(Item, query.subquery('sq'))
    sq2 = session.query(sq, ItemVersion) \
        .distinct(sq.id) \
        .join(ItemVersion) \
        .order_by(sq.id, desc(ItemVersion.version))
    return sq2
I think this SO question could be part of the answer but I'm not quite
sure how.
The code to run everything in this question (database creation, population and
a failing unit test with what I have so far) can be found here. Normally
if you can fix the join_version function, it should make the test pass!
Ok so I found a way. It's a bit of a hack but still only queries the database twice so I guess I will survive! Basically I'm querying the database for the Items first, and then I do another query for the ItemVersions, filtering on item_id, and then reordering with a trick I found here (this is also relevant).
Here is the code:
def join_version(session, query):
    # First query: fetch the items in their original order.
    items = query.all()
    item_ids = [i.id for i in items]
    # Second query: latest version per item via DISTINCT ON.
    items_v_sq = session.query(ItemVersion) \
        .distinct(ItemVersion.item_id) \
        .filter(ItemVersion.item_id.in_(item_ids)) \
        .order_by(ItemVersion.item_id, desc(ItemVersion.version)) \
        .subquery('sq')
    sq = aliased(ItemVersion, items_v_sq)
    # Restore the original item order by each id's position in item_ids.
    items_v = session.query(sq) \
        .order_by('idx(array{}, sq.item_id)'.format(item_ids))
    return zip(items, items_v)
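If the reordering does not have to happen in the database, the pairing step can also be done on the client. The sketch below is a plain-Python alternative to the idx() trick, not what the code above does: it works on simple tuples instead of mapped objects, and the function name `pair_with_latest_version` is made up for the example.

```python
def pair_with_latest_version(item_ids, versions):
    """Pair each item id with the text of its latest version.

    item_ids: item ids in the order of the original query.
    versions: iterable of (item_id, version, text) tuples, in any order.
    """
    latest = {}
    for item_id, version, text in versions:
        # Keep only the highest version number seen for each item.
        if item_id not in latest or version > latest[item_id][0]:
            latest[item_id] = (version, text)
    # Walk the ids in their original order so that ordering is preserved.
    return [(item_id, latest[item_id][1]) for item_id in item_ids if item_id in latest]


versions = [
    (1, 0, "item_1_v0"),
    (1, 1, "item_1_v1"),
    (2, 0, "item_2_v0"),
    (3, 0, "item_3_v0"),
]
pairs = pair_with_latest_version([3, 1], versions)
```

With the sample data from the question, item 3 comes first and item 1 is paired with its version 1 text.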

Update a single value in a database table through form submission

Here is my table in the database :
id | account_name | account_number | account_type | address | email | ifsc_code | is_default_account | phone_num | User
-----+--------------+----------------+--------------+---------+------------------------------+-----------+--------------------+-------------+----------
201 | helloi32irn | 55265766432454 | Savings | | mypal.appa99721989#gmail.com | 5545 | f | 98654567876 | abc
195 | hello | 55265766435523 | Savings | | mypal.1989#gmail.com | 5545 | t | 98654567876 | axyz
203 | what | 01010101010101 | Current | | guillaume#sample.com | 6123 | f | 09099990 | abc
The form in the view posts only a single parameter, name="activate" in my case, which corresponds to the column "is_default_account" in the table.
I want to change the value of "is_default_account" from "t" to "f". For example, in the table above it is "t" for account_name "hello". I want to deactivate that account, i.e. set it to "f", and activate whichever other account was sent through the form.
This will update your table and make account 'what' the default (assuming that is_default_account is a BOOLEAN field):
UPDATE table
SET is_default_account = (account_name = 'what')
You may want to limit the update if the table holds more than just the few rows you listed, like this:
UPDATE table
SET is_default_account = (account_name = 'what')
WHERE is_default_account != (account_name = 'what')
AND <limit updates by some other criteria like user name>
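The boolean-expression UPDATE can be tried out with a small sketch like the one below. It runs against an in-memory SQLite database, where booleans are stored as 0 and 1. The table name `account`, the function `set_default_account`, and the choice to identify the account by id and scope the update by user are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (
        id                 INTEGER PRIMARY KEY,
        account_name       TEXT,
        is_default_account INTEGER,  -- 0 = f, 1 = t
        "User"             TEXT
    );
    INSERT INTO account VALUES
        (201, 'helloi32irn', 0, 'abc'),
        (195, 'hello',       1, 'axyz'),
        (203, 'what',        0, 'abc');
""")


def set_default_account(connection, user, account_id):
    """Make account_id the only default account among that user's rows."""
    with connection:  # commit on success, roll back on error
        connection.execute(
            'UPDATE account SET is_default_account = (id = ?) WHERE "User" = ?',
            (account_id, user),
        )


set_default_account(conn, "abc", 203)
flags = conn.execute(
    "SELECT id, is_default_account FROM account ORDER BY id"
).fetchall()
```

The comparison `(id = ?)` is true for exactly one row and false for the user's other rows, so a single statement both activates the chosen account and deactivates the rest. Rows of other users are not touched.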
I think that to accomplish what you want, you should send at least two values from the form: one for the id of the account you want to update and one for the action (activate here). You can also just send the id and have it toggle. There are many ways to do this, but I can't figure out exactly what you are trying to do or whether you want SQL or Play Framework code. Without limiting your update somehow (for example by id), you can't precisely control which rows get updated. Please clarify your question and add some more code if you want help on the Play Framework side, which I would think you do.

Zend DB inserting relational data

I've been using the Zend Framework database relationships for a couple of weeks now. My first impression is pretty good, but I do have a question about inserting related data into multiple tables. For a little test application I've related two tables to each other by using a fuse table.
+---------------+ +---------------+ +---------------+
| Pages | | Fuse | | Prints |
+---------------+ +---------------+ +---------------+
| pageid | | fuseid | | printid |
| page_active | | fuse_page | | print_title |
| page_author | | fuse_print | | print_content |
| page_created | | fuse_locale | | ... |
| ... | | ... | +---------------+
+---------------+ +---------------+
Above is an example of my DB architecture
Now, my problem is how to insert related data into two separate tables and insert the two newly created IDs into the fuse table at the same time. If someone could explain this or point me to a related tutorial, I would appreciate it!
I assume you have a separate model for each table. Then simply insert the row into the Prints table and store the returned ID in a variable. Next, insert the row into the Pages table and store its returned ID in another variable. Finally, insert both IDs into your Fuse table. You do not need any "at the same time" (atomic) operation here. The ID of a newly inserted row is returned by insert() (I assume you use auto-increment fields for this).
$printsModel = new Application_Model_Prints();
$pagesModel = new Application_Model_Pages();
$fuseModel = new Application_Model_Fuse();
$printData = array('print_title'=>'foo',
...);
$printId = $printsModel->insert( $printData );
$pagesData = array('page_author'=>'bar',
...);
$pageId = $pagesModel->insert($pagesData);
$fuseData = array('fuse_page' => $pageId,
'fuse_print' => $printId,
...);
$fuseId = $fuseModel->insert($fuseData);
This is pseudo code, so you may want to move the inserts into your models and add steps such as normalisation.
I also suggest paying more attention to your field naming convention. It usually helps: right now you have fuseid but also fuse_page, so it should be either fuse_id or fusepage (and since I suspect this field stores an ID, fuse_page_id or fusepageid would be better still).
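The same insert-then-link sequence can be sketched outside Zend as well. The example below uses Python with an in-memory SQLite database, not Zend_Db, and the function name `insert_page_with_print` is hypothetical. Wrapping the three inserts in one transaction is an addition that the answer above does not require; it only ensures that a failure partway through leaves no orphaned rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages  (pageid  INTEGER PRIMARY KEY AUTOINCREMENT, page_author TEXT);
    CREATE TABLE prints (printid INTEGER PRIMARY KEY AUTOINCREMENT, print_title TEXT);
    CREATE TABLE fuse   (fuseid  INTEGER PRIMARY KEY AUTOINCREMENT,
                         fuse_page INTEGER, fuse_print INTEGER);
""")


def insert_page_with_print(connection, page_author, print_title):
    """Insert a print and a page, then link them through the fuse table."""
    with connection:  # one transaction: commit on success, roll back on error
        print_id = connection.execute(
            "INSERT INTO prints (print_title) VALUES (?)", (print_title,)
        ).lastrowid
        page_id = connection.execute(
            "INSERT INTO pages (page_author) VALUES (?)", (page_author,)
        ).lastrowid
        # Both generated ids are known now, so the link row can be written.
        fuse_id = connection.execute(
            "INSERT INTO fuse (fuse_page, fuse_print) VALUES (?, ?)",
            (page_id, print_id),
        ).lastrowid
    return page_id, print_id, fuse_id


page_id, print_id, fuse_id = insert_page_with_print(conn, "bar", "foo")
links = conn.execute("SELECT fuse_page, fuse_print FROM fuse").fetchall()
```

Here `lastrowid` plays the role of the ID that the table's insert call returns in the PHP code.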
Prints and Pages are two entities. Create a row class for each:
class Model_Page extends Zend_Db_Table_Row_Abstract
{
    public function addPrint($print)
    {
        $fuseTb = new Table_Fuse();
        $fuse = $fuseTb->createRow();
        $fuse->fuse_page = $this->pageid;
        $fuse->fuse_print = $print->printid;
        $fuse->save();
        return $fuse;
    }
}
Now, when you create a page:
$page = $pageTb->createRow() ; //instance of Model_Page is returned
$page->addPrint($printTb->find(1)->current());