How to profile Azure SQL deadlocks? - tsql

I know this question was asked here but 1) it's relatively old and 2) It didn't help me much.
I am running into a relatively large number of deadlocks with a few operations on my database. The setup is as follows:
Tables:
Table A with foreign key into Table B.
Operations:
Insert into table A
Insert into table B
Update row in table B
Delete row in table B
Delete row in table A
Problem:
These operations can happen in essentially any order because I have multiple worker roles, so the operations must be idempotent; however, each worker role works with a different primary key from table A. I am still trying to wrap my head around the concept of locks on tables. From what I understand, any delete on A will first lock table B, delete the relevant rows there, and then delete the row from A. I currently assume that is an atomic operation and that there is no window to acquire additional locks between locking table B and locking table A, because I can't imagine a way to get around that.
I am currently able to catch an exception in Microsoft Visual Studio of the following format:
Transaction (Process ID xxx) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
This exception seems like it can happen on any of the above operations.
My question is: How do I know which locks/transactions are the ones causing the deadlock? Does anyone know any queries that would be useful AFTER we get the exception?

sys.event_log is the answer here.
It lives in your server's master database and should contain an entry with all of the deadlock graphs your database has hit in the last month.
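For example, a query along these lines (run while connected to the master database; treat the column list as a sketch and verify it against the sys.event_log documentation for your service tier) pulls out the deadlock entries and their graphs:

-- Connect to the logical master database of the server, then:
SELECT start_time,
       end_time,
       database_name,
       event_count,
       additional_data      -- should carry the deadlock graph XML
FROM sys.event_log
WHERE event_type = 'deadlock'
ORDER BY start_time DESC;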
Armed with the deadlock graph, there are many tutorials on SQL Server deadlock graph debugging.

Currently, profiling tools for SQL Azure are practically non-existent.
The locking problem shouldn't differ much between the standard SQL Server and SQL Azure worlds, so I would suggest trying to repro the problem in the 'old' world using standard techniques such as good old Profiler; there are some quite useful articles on that.
If that approach doesn't prove fruitful, a dirty solution could be to work on catch/retry logic.
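For example, a minimal retry sketch (error 1205 is the deadlock-victim error; the body of the transaction is a placeholder for your own statements):

DECLARE @retries int = 3;
WHILE @retries > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;
        -- ... your insert/update/delete statements go here ...
        COMMIT TRANSACTION;
        SET @retries = 0;                   -- success, stop looping
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
        IF ERROR_NUMBER() = 1205 AND @retries > 1
            SET @retries = @retries - 1;    -- chosen as deadlock victim: retry
        ELSE
            THROW;                          -- any other error: re-raise
    END CATCH
END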

I ran into similar issues recently.
Try running your updates with the WITH (UPDLOCK) hint.
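For example (a sketch only; the table, column and key names are stand-ins for your own), taking the update lock when the row is first read so a later write in the same transaction doesn't have to upgrade a shared lock, which is a common deadlock pattern:

DECLARE @key int = 42;          -- placeholder key value

BEGIN TRANSACTION;

SELECT SomeColumn
FROM TableB WITH (UPDLOCK, ROWLOCK)
WHERE TableBId = @key;

UPDATE TableB
SET SomeColumn = SomeColumn + 1
WHERE TableBId = @key;

COMMIT TRANSACTION;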

To try and find the root cause:
Start by just running a single worker role.
Then check:
Are you locking at the right level: table lock, page lock, or row lock?
Are you releasing the locks?
Is your system designed in such a way that all edits to the same row will be done by the same machine?
There is a blog post on finding blocking queries here: http://blogs.msdn.com/b/sqlazure/archive/2010/08/13/10049896.aspx
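If that link goes stale, one common way to see current blocking from the DMVs (this should also work on SQL Azure, though not every DMV is exposed there) is roughly:

-- Sessions that are currently blocked, and who is blocking them.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time,
       t.text AS blocked_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;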

Related

Is there a way to give write permissions only in a transaction in Postgres?

I work on software that is used by a lot of different clients in several countries, with different needs, rules and constraints on their data.
When I make a change to the database's structure, I have a tool to test it on every client's database, obviously with read-only rights. This means that the best I can do to test a query like UPDATE table SET x = y WHERE condition
is to run its "read-only part": SELECT x FROM table WHERE condition.
It works, but it's not ideal: sometimes it is the writing of the data that causes problems (mostly deadlocks or timeouts), meaning I can't see the problem until a client suffers from it.
I'm wondering if there is a way to grant write permissions in Postgres, but only when inside a transaction, and force a rollback on every transaction. This way, changes could be tested accurately on real data and still prevent any dev from editing it.
Any ideas?
Edit: the volumes are too large to consider cloning data for every dev who needs to run a query
This sounds similar to creating an audit table to record information about transactions. I would consider using a trigger to write a copy of the data to a "rollback" table/row and then copy the "rollback" table/row back on completion of the update.
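A rough sketch of that idea (the table name and layout here are purely hypothetical, not taken from the question):

-- Keep a copy of every pre-update row so it can be restored afterwards.
CREATE TABLE some_table_rollback (LIKE some_table);

CREATE OR REPLACE FUNCTION some_table_save_old() RETURNS trigger AS $$
BEGIN
    INSERT INTO some_table_rollback SELECT OLD.*;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER some_table_before_update
BEFORE UPDATE ON some_table
FOR EACH ROW EXECUTE PROCEDURE some_table_save_old();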

How does postgresql lock tables when inserting and selecting?

I'm migrating data from one table to another in an environment where any long locks or downtime is not acceptable, in total about 80000 rows. Essentially the query boils down to this simple case:
INSERT INTO table_2
SELECT * FROM table_1
JOIN table_3 on table_1.id = table_3.id
All 3 tables are being read from and could have an insert at any time. I want to just run the query above, but I'm not sure how the locking works and whether the tables will be totally inaccessible during the operation. My understanding tells me that only the affected rows (newly inserted) will be locked. Table 1 is just being selected, so no harm, and concurrent inserts are safe so table 2 should be freely accessible.
Is this understanding correct, and can I run this query in a production environment without fear? If it's not safe, what is the standard way to accomplish this?
You're fine.
If you're interested in the details, you can read up on multiversion concurrency control, or on the details of the Postgres MVCC implementation, or how its various locking modes interact, but the implications for your case are nicely summarised in the docs:
reading never blocks writing and writing never blocks reading
In short, every record stored in the database has some version number attached to it, and every query knows which versions to consider and which to ignore.
This means that an INSERT can safely write to a table without locking it, as any concurrent queries will simply ignore the new rows until the inserting transaction decides to commit.
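A hypothetical two-session walk-through of what that means for the migration query (no names beyond those in the question are assumed):

-- Session 1:
BEGIN;
INSERT INTO table_2
SELECT * FROM table_1
JOIN table_3 ON table_1.id = table_3.id;
-- transaction left open; the new rows are not yet committed

-- Session 2, at the same time (neither statement blocks):
SELECT count(*) FROM table_2;   -- does not see session 1's uncommitted rows
-- concurrent inserts into table_1, table_2 or table_3 also proceed normally

-- Session 1:
COMMIT;   -- the migrated rows now become visible to new snapshots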

CREATE SCHEMA IF NOT EXISTS raises duplicate key error

To give some context, the command is issued inside a task, and many tasks might issue the same command from multiple workers at the same time.
Each task tries to create a Postgres schema. I often get the following error:
IntegrityError: (IntegrityError) duplicate key value violates unique constraint "pg_namespace_nspname_index"
DETAIL: Key (nspname)=(9621584361) already exists.
'CREATE SCHEMA IF NOT EXISTS "9621584361"'
Postgres version is PostgreSQL 9.4rc1.
Is it a bug in Postgres?
This is a bit of a wart in the implementation of IF NOT EXISTS for tables and schemas. Basically, they're an upsert attempt, and PostgreSQL doesn't handle the race conditions cleanly. It's safe, but ugly.
If the schema is being concurrently created in another session but isn't yet committed, then it both exists and does not exist, depending on who you are and how you look. It's not possible for other transactions to "see" the new schema in the system catalogs because it's uncommitted, so its entry in pg_namespace is not visible to other transactions. So CREATE SCHEMA / CREATE TABLE tries to create it because, as far as it's concerned, the object doesn't exist.
However, that inserts a row into a table with a unique constraint. Unique constraints must be able to see uncommitted rows in order to function. So the insert blocks (stops) until the first transaction that did the CREATE either commits or rolls back. If it commits, the second transaction aborts, because it tried to insert a row that violates a unique constraint. CREATE SCHEMA isn't smart enough to catch this case and re-try.
To properly fix this PostgreSQL would probably need predicate locking, where it could lock the potential for a row. This might get added as part of the current work going on for implementing UPSERT.
For these particular commands, PostgreSQL could probably do a dirty read of the system catalogs, where it can see uncommitted changes. Then it could wait for the uncommitted transaction to commit or roll back, re-do the dirty read to see if someone else is waiting, and retry. But this would have a race condition where someone else might create the schema between when you do the read to check for it and when you try to create it.
So the IF NOT EXISTS variants would have to:
Check to see if the object exists; if it does, finish without doing anything.
Attempt to create the object.
If creation fails due to a unique-constraint error, retry from the start.
If creation succeeds, finish.
As far as I know nobody's implemented that, or they tried and it wasn't accepted. There would be possible issues with transaction ID burn rate, etc., with this approach.
I think this is a bug of sorts, but it's a "yeah, we know" kind of bug, not a "we'll get right on fixing that" kind of bug. Feel free to post to pgsql-bugs about it; at the very least the documentation should mention this caveat about IF NOT EXISTS.
I don't recommend doing DDL concurrently like that.
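If you do need to issue it concurrently anyway, a hedged sketch of the catch-and-carry-on idea, using the schema name from the error message, could look like this; if another session wins the race, the schema exists by the time the error is raised, which is all we wanted:

DO $$
BEGIN
    CREATE SCHEMA IF NOT EXISTS "9621584361";
EXCEPTION
    WHEN unique_violation OR duplicate_schema THEN
        NULL;   -- created concurrently by another session; nothing to do
END;
$$;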
I needed to work around this limitation in an application where schemas are created concurrently. What worked for me was adding
LOCK TABLE pg_catalog.pg_namespace
in the transaction that includes the CREATE SCHEMA. It looks like a dirty and unsafe thing to do, but it helped me solve the problem, which occurred only in tests anyway.
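In other words, something along these lines:

BEGIN;
-- Serializes schema creation: concurrent CREATE SCHEMA commands queue behind
-- each other instead of racing on the pg_namespace insert.
LOCK TABLE pg_catalog.pg_namespace;
CREATE SCHEMA IF NOT EXISTS "9621584361";
COMMIT;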

ALTER query very slow on tiny table in PostgreSQL

I've got PostgreSQL 9.2 and a tiny database with just a bit of seed data for a website that I'm working on.
The following query seems to run forever:
ALTER TABLE diagnose_bodypart ADD description text NOT NULL;
diagnose_bodypart is a table with less than 10 rows. I've let the query run for over a minute with no results. What could be the problem? Any recommendations for debugging this?
Adding a column does not require rewriting a table (unless you specify a DEFAULT). It is a quick operation absent any locks. pg_locks is the place to check, as Craig pointed out.
In general the most likely cause is long-running transactions. I would be looking at what workflows are hitting these tables and how long the transactions stay open. Locks of this sort are typically transactional, so committing the transactions will usually fix the problem.
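For example, something along these lines (the regclass cast just uses the table name from the question) shows who holds or is waiting for locks on the table and what those sessions are doing:

SELECT l.pid,
       l.mode,
       l.granted,
       a.state,
       a.xact_start,
       a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation = 'diagnose_bodypart'::regclass;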

How to prevent Write Ahead Logging on just one table in PostgreSQL?

I am considering log-shipping of Write Ahead Logs (WAL) in PostgreSQL to create a warm-standby database. However, I have one table in the database that receives a huge number of INSERTs/DELETEs each day, but whose data I don't care about protecting. To reduce the amount of WAL produced, I was wondering: is there a way to prevent any activity on one table from being recorded in the WAL?
Ran across this old question, which now has a better answer. Postgres 9.1 introduced "Unlogged Tables", which are tables that don't log their DML changes to WAL. See the docs for more info, but at least now there is a solution for this problem.
See Waiting for 9.1 - UNLOGGED tables by depesz, and the 9.1 docs.
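For example (the table definition is hypothetical; only the UNLOGGED keyword matters). Note that because nothing is written to WAL, such a table is not replicated to the warm standby at all and is truncated after a crash, which fits the "data I don't care about protecting" scenario:

CREATE UNLOGGED TABLE busy_events (
    id      bigserial PRIMARY KEY,
    payload text
);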
Unfortunately, I don't believe there is. The WAL logging operates on the page level, which is much lower than the table level and doesn't even know which page holds data from which table. In fact, the WAL files don't even know which pages belong to which database.
You might consider moving your high activity table to a completely different instance of PostgreSQL. This seems drastic, but I can't think of another way off the top of my head to avoid having that activity show up in your WAL files.
To offer one option to my own question: there are temp tables - "temporary tables are automatically dropped at the end of a session, or optionally at the end of the current transaction (see ON COMMIT below)" - which I think don't generate WAL. Even so, this might not be ideal, as the table creation & design will have to be in the code.
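For example (hypothetical definition; ON COMMIT DELETE ROWS empties the table at every commit):

CREATE TEMPORARY TABLE busy_scratch (
    id      bigint,
    payload text
) ON COMMIT DELETE ROWS;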
I'd consider memcached for use-cases like this. You can even spread the load over a bunch of cheap machines.