ALTER query very slow on tiny table in PostgreSQL

I've got PostgreSQL 9.2 and a tiny database with just a bit of seed data for a website that I'm working on.
The following query seems to run forever:
ALTER TABLE diagnose_bodypart ADD description text NOT NULL;
diagnose_bodypart is a table with fewer than 10 rows. I've let the query run for over a minute with no result. What could be the problem? Any recommendations for debugging this?

Adding a column does not require rewriting a table (unless you specify a DEFAULT). It is a quick operation as long as it does not have to wait for a lock: ALTER TABLE needs an ACCESS EXCLUSIVE lock on the table, so it queues behind any transaction that still holds a lock on it, even one that has only read from it. pg_locks is the place to check, as Craig pointed out.
In general, the most likely cause is a long-running transaction. I would look at which workflows hit these tables and how long their transactions stay open. Locks of this sort are held until the end of the transaction, so committing (or rolling back) the open transactions will usually fix the problem.
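As a rough sketch of the pg_locks check described above, the query below lists every session that holds or is waiting for a lock on the table, together with what that session is doing. It assumes PostgreSQL 9.2 or later (where pg_stat_activity exposes the pid, state and query columns) and that it is run in the same database as the table.

```sql
-- Who holds or waits for a lock on diagnose_bodypart?
-- granted = false marks the waiting session (here, the ALTER TABLE).
SELECT l.pid,
       l.mode,
       l.granted,
       a.state,
       a.xact_start,
       a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation = 'diagnose_bodypart'::regclass
  AND l.database = (SELECT oid FROM pg_database
                    WHERE datname = current_database())
ORDER BY a.xact_start;
```

A session whose state is "idle in transaction" with an old xact_start is the usual suspect: it has finished its last statement but never committed.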

Related

How does postgresql lock tables when inserting and selecting?

I'm migrating about 80,000 rows from one table to another in an environment where long locks or downtime are not acceptable. Essentially the query boils down to this simple case:
INSERT INTO table_2
SELECT * FROM table_1
JOIN table_3 ON table_1.id = table_3.id;
All 3 tables are being read from and could have an insert at any time. I want to just run the query above, but I'm not sure how the locking works and whether the tables will be totally inaccessible during the operation. My understanding tells me that only the affected rows (newly inserted) will be locked. Table 1 is just being selected, so no harm, and concurrent inserts are safe so table 2 should be freely accessible.
Is this understanding correct, and can I run this query in a production environment without fear? If it's not safe, what is the standard way to accomplish this?
You're fine.
If you're interested in the details, you can read up on multiversion concurrency control, or on the details of the Postgres MVCC implementation, or how its various locking modes interact, but the implications for your case are nicely summarised in the docs:
reading never blocks writing and writing never blocks reading
In short, every record stored in the database has some version number attached to it, and every query knows which versions to consider and which to ignore.
This means that an INSERT can safely write to a table without blocking other sessions (it takes only a ROW EXCLUSIVE lock, which does not conflict with ordinary reads or writes), as any concurrent queries will simply ignore the new rows until the inserting transaction commits.
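One way to convince yourself of this is to try it with two psql sessions on a test copy. The sketch below is illustrative only: it assumes table_2 has the same columns as table_1 (hence table_1.* rather than the question's SELECT *), and the session labels are just comments.

```sql
-- Session A: start the migration but do not commit yet.
BEGIN;
INSERT INTO table_2
SELECT table_1.*
FROM table_1
JOIN table_3 ON table_1.id = table_3.id;

-- Session B, while A is still open: neither statement waits for A.
SELECT count(*) FROM table_2;      -- does not see A's uncommitted rows
SELECT count(*) FROM table_1;      -- reads are not blocked either

-- Session A: make the new rows visible to everyone.
COMMIT;
```

The one case where concurrent inserts can wait on each other is when two transactions insert the same key into a unique index; the second one then waits to see whether the first commits.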

is it safe to enable autovacuum for a table in PostgreSQL

I'm a newbie in PostgreSQL (version 9.2) database development. While looking at one of my tables I saw an option called autovacuum.
Many of my tables contain 20,000+ rows. For testing purposes I've altered one of those tables as shown below:
ALTER TABLE theTable SET (
autovacuum_enabled = true
);
So, I'd like to know the benefits and disadvantages (if any) of autovacuuming a table.
Autovacuum is enabled by default in current versions of Postgres (and has been for a while). It's generally a good thing to have enabled for performance and other reasons.
Prior to autovacuuming, you would need to explicitly vacuum tables yourself (via cronjobs which executed psql commands to vacuum them, or similar) in order to get rid of dead tuples, etc. Postgres has for a while now managed this for you via autovacuum.
In some cases, with tables that have immense churn (i.e. very high rates of insertions and deletions), I have still found it necessary to vacuum explicitly via cron to keep the dead tuple count low and performance high, because autovacuum doesn't kick in fast enough. This is something of a niche case, though.
More info: http://www.postgresql.org/docs/current/static/runtime-config-autovacuum.html
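To judge whether autovacuum is keeping up on a high-churn table, something along these lines can help. It is a sketch, not a recommendation: the threshold and scale-factor values are arbitrary illustrations, and theTable is the table name from the question.

```sql
-- Tables with the most dead tuples, and when autovacuum last processed them.
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;

-- Make autovacuum trigger earlier on one busy table
-- (example values only; tune them against your own workload).
ALTER TABLE theTable SET (
    autovacuum_vacuum_scale_factor = 0.05,
    autovacuum_vacuum_threshold = 1000
);

-- The manual fallback a cron job would run.
VACUUM ANALYZE theTable;
```

Lowering the per-table settings is often worth trying before adding a cron job, since it keeps the work inside the autovacuum daemon.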

postgresql 9.2 never vacuumed and analyze

I have been given a Postgres 9.2 database of around 20 GB.
I looked through the database and saw that vacuum and/or analyze has never been run on any of its tables.
Autovacuum is on, and transaction ID wraparound is a long way off (only about 1% of the limit has been used).
I know nothing about the data activity (number of deletes, inserts, updates), but I can see that it uses a lot of indexes and sequences.
My question is:
Does the lack of vacuum and/or analyze affect data integrity (for example, a select not returning all the rows that match, whether read from a table or from an index)? The speed of queries and writes doesn't matter.
Is it possible that, after the vacuum and/or analyze, the same query gives a different answer than it would have given before?
I'm fairly new to PG, thank you for your help!!
Regards,
Figaro88
Running vacuum and/or analyze will not change the result set produced by any select operation (barring a bug in PostgreSQL). They may affect the order of the results if you do not supply an ORDER BY clause.
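If you want to verify the two observations from the question yourself, the following queries are one way to do it. They only read statistics and catalog data, so they are safe to run on a live system.

```sql
-- How far each database is from transaction ID wraparound
-- (the hard limit is roughly 2 billion transactions).
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;

-- When each table was last vacuumed or analyzed, manually or automatically.
-- NULL means it has not happened since the statistics were last reset.
SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY relname;
```

Both queries use ORDER BY, which is also the fix for the ordering caveat above: without it, row order is unspecified and may change after a vacuum.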

How to profile azure SQL deadlocks?

I know this question was asked here, but 1) it's relatively old and 2) it didn't help me much.
I am running into a relatively large number of deadlocks with a few operations on my database. The setup is as follows:
Tables:
Table A with foreign key into Table B.
Operations:
Insert into table A
Insert into table B
Update row in table B
Delete row in table B
Delete row in table A
Problem:
These operations can happen in essentially any order because I have multiple worker roles, so they must be idempotent. However, each worker role works with a different primary key from table A. I am still trying to wrap my head around how table locks work. From what I understand, any delete on A will first lock table B, delete the relevant rows there, and then delete the row from A. I currently assume that this is an atomic operation and that no other lock can be taken between locking table B and locking table A, because I can't imagine a way around that.
I am currently able to catch an exception in Microsoft Visual Studio of the following format:
Transaction (Process ID xxx) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
This exception seems like it can happen on any of the above operations.
My question is: how do I know which locks/transactions are causing the deadlock? Does anyone know any queries that would be useful AFTER we get the exception?
sys.event_log is the answer here.
It lives in your server's master database and should contain entries with the deadlock graphs your database has hit in the last month.
Armed with a deadlock graph, you can follow any of the many tutorials on debugging SQL Server deadlock graphs.
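A minimal sketch of such a lookup is shown below. It assumes you are connected to the logical server's master database and that the view exposes the columns listed in its documentation; on newer service versions the additional_data column may be NULL for deadlock events, in which case the graph has to be retrieved by other means.

```sql
-- Run while connected to the master database of the Azure SQL server.
-- additional_data is documented as holding the deadlock graph XML.
SELECT start_time,
       database_name,
       event_count,
       additional_data
FROM sys.event_log
WHERE event_type = 'deadlock'
ORDER BY start_time DESC;
```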
Currently, profiling tools for SQL Azure are practically non-existent.
The locking problem shouldn't differ much between standard SQL Server and the SQL Azure world, so I would suggest trying to reproduce the problem in the 'old' world using standard techniques such as good old Profiler: quite useful article & this.
If that approach doesn't prove fruitful, a dirty solution could be to add catch/retry logic.
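For what it's worth, catch/retry logic can be sketched directly in T-SQL. The example below is hypothetical throughout: dbo.TableB, SomeColumn, Id and the retry count of 3 are placeholders, and THROW requires SQL Server 2012 or later (it is available in Azure SQL). Error 1205 is the deadlock-victim error quoted in the question.

```sql
DECLARE @retries INT = 3;

WHILE @retries > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

        -- Placeholder for the real work (hypothetical table and columns).
        UPDATE dbo.TableB SET SomeColumn = 1 WHERE Id = 42;

        COMMIT TRANSACTION;
        SET @retries = 0;              -- success: leave the loop
    END TRY
    BEGIN CATCH
        -- Clean up whatever is left of the failed transaction.
        IF XACT_STATE() <> 0
            ROLLBACK TRANSACTION;

        IF ERROR_NUMBER() = 1205 AND @retries > 1
            SET @retries = @retries - 1;   -- deadlock victim: try again
        ELSE
        BEGIN
            THROW;                     -- other error, or out of retries
        END
    END CATCH
END;
```

Retrying only hides the symptom, so it is best treated as a stopgap while the deadlock graph is being analysed.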
I ran into similar issues recently.
Try adding the WITH (UPDLOCK) table hint to your updates.
To try and find the root cause:
Start by just running a single worker role.
Then check:
Are you locking at the right level: table lock, page lock or row lock?
Are you releasing the locks?
Is your system designed in such a way that all edits to the same row will be done by the same machine?
There is a blog post on finding blocking queries here: http://blogs.msdn.com/b/sqlazure/archive/2010/08/13/10049896.aspx

UPDATE vs INSERT statement performance in PostgreSQL

I am working with a database of approximately a million rows, using Python to parse documents and populate the table with terms. The insert statements work fine, but the update statements become extremely time-consuming as the table grows.
It would be great if someone could explain this phenomenon and also tell me if there is a faster way to do updates.
Thanks,
Arnav
Sounds like you have an indexing problem. Whenever I hear about problems getting worse as table size grows, it makes me wonder if you're doing a table scan whenever you interact with a table.
Check to see if you have a primary key and meaningful indexes on that table. Look at the WHERE clause you have on that UPDATE and make sure there's an index on those columns to make finding that record as fast as possible.
UPDATE: Write a SELECT query using the WHERE clause you use in the UPDATE and ask the database engine to EXPLAIN it. If you see a sequential scan (shown as "Seq Scan" in PostgreSQL's plan output), you'll know what to do.
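As an illustration of that check, here is a sketch using a hypothetical table terms with a text column term; the real table, column and index names will differ.

```sql
-- Step 1: see how the rows targeted by the UPDATE are located.
-- A plan node beginning with "Seq Scan on terms" means the whole
-- table is read for every lookup.
EXPLAIN SELECT * FROM terms WHERE term = 'example';

-- Step 2: index the column(s) used in the WHERE clause.
CREATE INDEX terms_term_idx ON terms (term);

-- Step 3: refresh planner statistics and check the plan again;
-- on a large table it should now use the index.
ANALYZE terms;
EXPLAIN SELECT * FROM terms WHERE term = 'example';
```

Inserts stay fast without the index because they never have to find an existing row, which is why only the updates slow down as the table grows.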