I'm reading this from the Postgres docs:
Building Indexes Concurrently
...
PostgreSQL supports building indexes without locking out writes. This
method is invoked by specifying the CONCURRENTLY option of CREATE
INDEX. When this option is used, PostgreSQL must perform two scans of
the table, and in addition it must wait for all existing transactions
that could potentially modify or use the index to terminate. Thus this
method requires more total work than a standard index build and takes
significantly longer to complete. However, since it allows normal
operations to continue while the index is built, this method is useful
for adding new indexes in a production environment....
In a concurrent index build, the index is actually entered into the
system catalogs in one transaction, then two table scans occur in two
more transactions. Before each table scan, the index build must wait
for existing transactions that have modified the table to terminate.
After the second scan, the index build must wait for any transactions
that have a snapshot (see Chapter 13) predating the second scan to
terminate. Then finally the index can be marked ready for use, and the
CREATE INDEX command terminates. Even then, however, the index may not
be immediately usable for queries: in the worst case, it cannot be
used as long as transactions exist that predate the start of the index
build.
What is a system catalog? What is a table scan? So it sounds like the index is built first, then it must wait for existing transactions (ones that occurred during the index build?) to terminate, and then wait for any transactions that have a snapshot predating the second scan to terminate (what does this mean? How does it differ from the first statement?). What are these scans?
What is a system catalog? - https://www.postgresql.org/docs/current/static/catalogs.html
What is a table scan? - It reads the table to get the values of the column you are building the index on.
ones that occurred during the index build? - no, ones that could change data after the first table scan
what does this mean? It means it waits for those transactions to end.
What are these scans? The first scan reads the table before the concurrent index build starts, while still allowing changes to the table so that no lock is needed. When the build is done, a second scan picks up, roughly speaking, the difference; then a short lock is taken and the index is marked as usable. It differs from plain CREATE INDEX in that the latter locks the table and permits no changes to the data, while CONCURRENTLY scans twice but allows the data to change while the index is being built.
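For illustration (table and column names below are made up):
CREATE INDEX CONCURRENTLY orders_customer_id_idx ON orders (customer_id);
-- if the concurrent build fails, an INVALID index is left behind;
-- it shows up here and has to be dropped and retried
SELECT indexrelid::regclass AS invalid_index
FROM pg_index
WHERE NOT indisvalid;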
Related
I have a queue-like table t1 holding timestamp-ordered data. One of its columns is a foreign key ext_id. I have a number of workers that process rows from t1 and remove them after their job is done. The outcome of the process is upserting rows into another table t2. t2 also references ext_id, but in this case the relation is unique: if a row pointing to the particular ext_id already exists, I want to update it instead of inserting.
As long as a single worker is processing the data, the task is fairly simple. When multiple workers are brought into play, the SKIP LOCKED clause comes to the rescue: each thread locks the row it is processing and makes it invisible to other threads. The SKIP LOCKED clause guarantees that threads do not interfere with each other as far as the source table t1 is concerned. The problem is that they can still try to simultaneously insert rows into table t2. Since there is a uniqueness constraint on t2, this can yield an error if multiple workers select t1 rows sharing an ext_id. Since the constraint raises an error I can simply retry processing of that particular row, but then I lose the guarantee of processing order (not to mention that exception-based flow control feels like a serious anti-pattern).
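A minimal sketch of the per-worker fetch I am doing (id and created_at are just stand-ins for my real columns):
BEGIN;
SELECT id, ext_id
FROM t1
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED;   -- rows already locked by other workers are silently skipped
-- ... upsert the result into t2, then DELETE FROM t1 WHERE id = <the fetched id> ...
COMMIT;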
I considered adding an auxiliary "synchronization" table (let's call it sync) that would hold an entry for each ext_id currently being processed. The processing becomes more complicated then: I need to commit the sync row insertion before actually starting to process, so that other threads can use this information to select t1 rows that are safe to process. The t1 row selection can join the auxiliary table and match the first row whose ext_id is not present in the sync table. It is still possible that concurrent threads will select consecutive rows and try inserting synchronization rows pointing to the same ext_id. If that happens I need to retry the t1 row selection.
The second approach solves concurrent processing of conflicting t1 rows and guarantees that row order is maintained (within partitions defined by ext_id values). What it fails to solve is the dirty flow-control structure based on failed insertions.
PostgreSQL provides advisory locks mechanism, which allows building of application-specific custom synchronization logic.
Quote from explicit locking documentation:
For example, a common use of advisory locks is to emulate pessimistic locking strategies typical of so-called “flat file” data management systems. While a flag stored in a table could be used for the same purpose, advisory locks are faster, avoid table bloat, and are automatically cleaned up by the server at the end of the session.
Usage of advisory locks solves exactly this problem and, according to the documentation, should yield better performance. When a process selects a row, it also obtains an advisory lock parametrized with ext_id. If another process tries to select a conflicting row, it will have to wait until the other lock is released.
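In code I imagine it roughly like this (only a sketch; 42 is an arbitrary namespace I would pick for the two-integer lock form, and created_at stands in for whatever column defines the queue order):
BEGIN;
WITH next_row AS (
    SELECT id, ext_id
    FROM t1
    ORDER BY created_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
SELECT id,
       ext_id,
       pg_advisory_xact_lock(42, ext_id)   -- blocks until no other worker holds the lock for this ext_id
FROM next_row;
-- ... upsert into t2, DELETE the processed row from t1 ...
COMMIT;   -- the advisory lock is released automatically at commit/rollback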
This is much better, but in some cases it will prohibit a subset of workers from performing their tasks simultaneously and make them wait to perform the tasks in sequence. What those workers could do instead of waiting is to try fetching another row: something that the sync-based solution solved by excluding t1 rows with an outer join.
After this lengthy introduction, finally, the question:
Existing advisory locks can be inspected by querying the pg_locks view. This view can be joined like a regular relation within queries. It is tempting to join it while fetching the next t1 row, to exclude rows that are currently unprocessable due to an existing lock. Since pg_locks is not a regular table, I have some doubts whether this approach is safe.
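Concretely, the kind of query I have in mind (a sketch; it assumes the locks are taken with the two-argument form pg_advisory_xact_lock(42, ext_id) as above and that ext_id fits in a 32-bit integer):
SELECT t.id, t.ext_id
FROM t1 AS t
WHERE NOT EXISTS (
    SELECT 1
    FROM pg_locks AS l
    WHERE l.locktype = 'advisory'
      AND l.classid = 42                -- first argument of pg_advisory_xact_lock(42, ext_id)
      AND l.objid = t.ext_id::oid       -- second argument
      AND l.objsubid = 2                -- marks the two-integer key form
      AND l.granted
)
ORDER BY t.created_at                   -- assumed ordering column
LIMIT 1
FOR UPDATE SKIP LOCKED;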
Is it?
I have a big table (bo_sip_cti_event) which is too large to even run queries on, so I made an identical table (bo_sip_cti_event_day), added an AFTER INSERT trigger on bo_sip_cti_event that copies all the same values into bo_sip_cti_event_day, and now I am wondering whether I have significantly slowed down inserts into bo_sip_cti_event.
So generally, does an AFTER INSERT trigger slow down operations on this table?
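For reference, the trigger is essentially of this shape (function and trigger names are placeholders, and it assumes both tables have identical column lists):
CREATE OR REPLACE FUNCTION copy_event_to_day() RETURNS trigger AS $$
BEGIN
    -- relies on both tables having exactly the same columns in the same order
    INSERT INTO bo_sip_cti_event_day VALUES (NEW.*);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER bo_sip_cti_event_to_day
AFTER INSERT ON bo_sip_cti_event
FOR EACH ROW EXECUTE PROCEDURE copy_event_to_day();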
Yes, the trigger must slow down inserts.
The reason is that relational databases are ACID compliant: all actions, including side effects like triggers, must be completed before the transaction completes. So triggers must be executed synchronously, and that consumes CPU, and in your case I/O too, which ultimately takes more time. There's no getting around it.
The answer is yes: it is additional overhead, so the transaction obviously takes longer to finish with the additional trigger execution.
Your design makes me wonder if:
You explored all options to speed up your large table. Even billions of rows can be handled quite well if you have proper indexes etc. But it all depends on the table, the design, the data and the queries.
What exactly your trigger is doing. The table name "_day" raises the question of when, where and how exactly that table is cleaned out at midnight. Hopefully not inside the trigger function, and hopefully not with a DELETE FROM.
I'm fairly inexperienced with SQL (or here PostgreSQL) and I'm trying to understand and use indices correctly.
PostgreSQL has a CONCURRENTLY option for CREATE INDEX and the documentation says:
"When this option is used, PostgreSQL must perform two scans of the table, and in addition it must wait for all existing transactions that could potentially use the index to terminate. Thus this method requires more total work than a standard index build and takes significantly longer to complete. However, since it allows normal operations to continue while the index is built, this method is useful for adding new indexes in a production environment."
Does this mean that an INDEX is only created at startup or during a migration process?
I know that one can re-index tables if they get fragmented over time (not sure how this actually happens and why an index is just not kept "up-to-date") and that re-indexing helps the database to get more efficient again.
Can I benefit from CONCURRENTLY during such a re-index process?
and besides that I'm asking myself
Are there situations where I should avoid CONCURRENTLY, or would it hurt to use CONCURRENTLY on every INDEX I create?
If it was sensible to always create index ... concurrently it'd be the default.
What it does is build the index with weaker locks held on the table being indexed, so you can continue to insert, update, delete, etc.
This comes at a price:
You can't use create index ... concurrently in a transaction, unlike almost all other DDL
The index build can take longer
The index built may be less efficiently laid out (slower, bigger)
Rarely, the create index can fail, so you have to drop and recreate the index
You can't easily use this to re-create an existing index. PostgreSQL doesn't yet support reindex ... concurrently. There are workarounds where you create a new index, then swap old and new indexes (see the sketch below), but it's very difficult if you're trying to do it for a unique index or primary key that's the target of a foreign key constraint.
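For a plain, non-unique index the workaround looks roughly like this (placeholder names):
CREATE INDEX CONCURRENTLY my_index_new ON my_table (my_column);
DROP INDEX CONCURRENTLY my_index;            -- DROP ... CONCURRENTLY needs 9.2 or later
ALTER INDEX my_index_new RENAME TO my_index;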
Unless you know you need it, just use create index without concurrently.
I am looking up query optimization in Postgres.
I don't understand this statement:
Index scans involve random disk access and still have to read the underlying data blocks for visibility checks.
what does "visibility check" mean here?
PostgreSQL uses a technique called Multi-Version Concurrency Control (MVCC) for managing concurrent access to data. Data is not visible until the transaction that inserted it commits. In other cases, the data is silently ignored by other transactions so they don't see it (except in rare cases, with explicit locks, or at higher isolation levels).
What this means is that PostgreSQL must check the transaction IDs of the actual rows to make sure they are visible to all transactions. Now, 9.2 (iirc) and higher allow PostgreSQL to skip this check if all tuples in a page are visible; otherwise it has to check per row.
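For example, on a hypothetical table orders with an index on id:
EXPLAIN (ANALYZE, BUFFERS)
SELECT id FROM orders WHERE id BETWEEN 1000 AND 2000;
-- an Index Only Scan still reports "Heap Fetches: N" for rows on pages
-- not yet marked all-visible; those heap fetches are the visibility checks

VACUUM orders;   -- updates the visibility map so later scans can skip most checks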
I have two types of queries I run often on two large datasets. They run much slower than I would expect them to.
The first type is a sequential scan updating all records:
Update rcra_sites Set street = regexp_replace(street,'/','','i')
rcra_sites has 700,000 records. It takes 22 minutes from pgAdmin! I wrote a vb.net function that loops through each record and sends an update query for each record (yes, 700,000 update queries!) and it runs in less than half the time. Hmmm....
The second type is a simple update with a relation and then a sequential scan:
Update rcra_sites as sites
Set violations='No'
From narcra_monitoring as v
Where sites.agencyid=v.agencyid and v.found_violation_flag='N'
narcra_monitoring has 1,700,000 records. This takes 8 minutes. The query planner refuses to use my indexes. The query runs much faster if I start with a set enable_seqscan = false;. I would prefer if the query planner would do its job.
I have appropriate indexes, and I have vacuumed and analyzed. I optimized my shared_buffers and effective_cache_size as best I know how to use more memory, since I have 4GB. My hardware is pretty darn good. I am running v8.4 on Windows 7.
Is PostgreSQL just this slow? Or am I still missing something?
Possibly try reducing your random_page_cost (default: 4) compared to seq_page_cost: this will reduce the planner's preference for seq scans by making random-accesses driven by indices more attractive.
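For example, for the current session only (the value is illustrative):
SET random_page_cost = 2;

EXPLAIN
UPDATE rcra_sites AS sites
SET violations = 'No'
FROM narcra_monitoring AS v
WHERE sites.agencyid = v.agencyid
  AND v.found_violation_flag = 'N';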
Another thing to bear in mind is that MVCC means that updating a row is fairly expensive. In particular, updating every row in a table requires doubling the amount of storage for the table, until it can be vacuumed. So in your first query, you may want to qualify your update:
UPDATE rcra_sites SET street = regexp_replace(street,'/','','i')
WHERE street ~ '/'
(afaik postgresql doesn't automatically suppress the update if it looks like you're not actually updating anything. I seem to recall there was a standard trigger function added in 8.4 (?) to allow you to do that, but it's perhaps better to address it on the client side)
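If I remember right, the function in question is suppress_redundant_updates_trigger(); hooking it up would look something like:
CREATE TRIGGER z_min_update
BEFORE UPDATE ON rcra_sites
FOR EACH ROW EXECUTE PROCEDURE suppress_redundant_updates_trigger();
-- silently skips UPDATEs that would not change the row; the "z" prefix is the
-- documented convention so it fires after any other triggers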
When a row is updated, a new row version is written.
If the new row does not fit in the same disk block, then every index entry pointing to the old row needs to be updated to point to the new row.
It is not just indexes on the updated data that need updating.
If you have a lot of indexes on rcra_sites, and only one or two frequently updated fields, then you might gain by separating the frequently updated fields into a table of their own.
You can also reduce the fillfactor percentage below its default of 100, so that some of the updates can result in new row versions being written to the same block, which means the indexes pointing to that block do not need to be updated.
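For example (the value is illustrative):
ALTER TABLE rcra_sites SET (fillfactor = 80);
-- only newly written pages reserve the free space; existing data benefits
-- after a table rewrite such as VACUUM FULL or CLUSTER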