Postgres: include REINDEX in UPDATE statement

I have a database with a table that is incrementally patched and has many indexes. But sometimes the patching doesn't happen for a while and the next patch becomes very large, which in practice makes it smarter to delete the indexes, patch the table, and recreate the indexes. But this seems horrible, and with users actively using the table it is not an option. So I thought there might be a possibility to rebuild the indexes during the UPDATE statement, or even better, have Postgres itself check whether that would be optimal. (I'm using Postgres 10; this might be a problem that is solved by upgrading.)
I hope you can help me.

No, there is no good solution, nor any on the horizon for future versions.
Either you keep the indexes and must maintain them during the "patch"; or you drop them in the same transaction as the "patch" and rebuild them later in which case the table is locked against all other uses; or you drop them in a separate transaction and rebuild them later in which case other sessions can see the table in an unindexed state.
There are in principle ways this could be improved (for example, ending the "patch" the same way create-index-concurrently ends, with a merge join between the index and the table. But since CIC must be in its own transaction, it is not clear how these could be shoehorned together), but I am not aware of any promising work going on currently.
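The drop-and-rebuild option the answer describes can be sketched as follows. This is an illustration only; the table, index, and column names are made up, and the patch is assumed to arrive in a `patch` table:

```sql
BEGIN;

-- Drop the indexes so the bulk "patch" doesn't have to maintain them.
-- Note: DROP INDEX takes an ACCESS EXCLUSIVE lock, so other sessions
-- are blocked on this table until COMMIT.
DROP INDEX idx_big_table_a;
DROP INDEX idx_big_table_b;

-- Apply the large patch in one set-based statement.
UPDATE big_table b
SET    value = p.value
FROM   patch p
WHERE  b.id = p.id;

-- Rebuild the indexes before anyone else sees the table.
CREATE INDEX idx_big_table_a ON big_table (a);
CREATE INDEX idx_big_table_b ON big_table (b);

COMMIT;
```

As the answer notes, doing all of this in one transaction means other sessions never see the table unindexed, at the cost of locking it for the whole duration; splitting it across transactions avoids the long lock but exposes the unindexed state.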

Related

PostgreSQL Query Planner/Optimizer: Is there a way to get candidate plans?

In PostgreSQL, we can use "EXPLAIN ANALYZE" on a query to get its query plan. While this is useful, is there any way to get information on other candidate plans that the optimizer generated (and subsequently discarded)?
This is so that we can do our own analysis of some of the candidates (e.g. the top 3) generated by the DBMS.
No. The planner discards incipient plans as early as it can, before they are even completely formed. Once it decides a plan can't be the best, it never finishes constructing it, so it can't display it.
You can usually use the various enable_* settings or the *_cost settings to force it to make a different choice and show the plan for that, but it can be hard to control exactly what that different choice is.
You can also temporarily drop an index to see what the planner would do without it. If you DROP an index inside a transaction, then do the EXPLAIN, then ROLLBACK the transaction, the ROLLBACK undoes the DROP INDEX, so the index doesn't need to be rebuilt; it is simply revived. But be warned that DROP INDEX takes a strong lock on the table and holds it until the ROLLBACK, so this method is not completely free of consequences.
If you just want to see what the other plan is, you just need EXPLAIN, not EXPLAIN ANALYZE. This is faster and, if the statement has side effects, also safer.
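The two techniques from this answer can be sketched like this (the table, column, and index names are hypothetical):

```sql
-- Technique 1: discourage a plan type to see what the planner picks instead.
SET enable_seqscan = off;   -- similar knobs: enable_indexscan, enable_hashjoin, ...
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
RESET enable_seqscan;

-- Technique 2: hide an index inside a transaction, then roll back.
BEGIN;
DROP INDEX orders_customer_id_idx;   -- ACCESS EXCLUSIVE lock held until ROLLBACK
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
ROLLBACK;                            -- the index is revived, no rebuild needed
```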

Postgres: is it possible to lock some rows for changing?

I have pretty old tables which hold records of clients' payments and commissions for several years. In the regular business cycle it is sometimes necessary to recalculate commissions and update the table, but usually the recalculation period is 1 or 2 months back, not more.
Recently, as a result of a bug in a PHP script, our developer recalculated commissions since the very beginning. The recalculation process is really complicated, so it can't be undone just by grabbing yesterday's backup: the data changes in numerous databases, so restoring it is a really complicated and awfully expensive procedure. Add complaints from clients and changes in accounting... you know. Horror.
We can't split the tables by periods. (Well, we can, but it would take a year to rework all the data selects.)
What I'm thinking about is setting up an update trigger that would check the date of the record being changed against an allowed date, which should be less than the update date. So in case of a mistake or bug, anyone trying to update such a 'restricted' row would get an exception, and the data would stay unchanged.
Is that a good approach? And how can it be done, I mean as a trigger?
For Postgres you can use a check constraint to ensure the allowed_date is always less than the update_date:
ALTER TABLE mytable ADD CONSTRAINT datecheck CHECK (allowed_date < update_date);
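If you specifically want the trigger the question asks about, a minimal sketch might look like this. The table name, column name, and the two-month cutoff are made-up examples; adjust them to your schema (`EXECUTE PROCEDURE` is the syntax through Postgres 10, `EXECUTE FUNCTION` from 11 on):

```sql
CREATE OR REPLACE FUNCTION forbid_old_row_changes() RETURNS trigger AS $$
BEGIN
    -- Refuse to touch rows older than the allowed recalculation window.
    IF OLD.payment_date < now() - interval '2 months' THEN
        RAISE EXCEPTION 'row dated % is locked against changes', OLD.payment_date;
    END IF;
    IF TG_OP = 'DELETE' THEN
        RETURN OLD;  -- allow the delete for recent rows
    END IF;
    RETURN NEW;      -- allow the update for recent rows
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER protect_old_rows
    BEFORE UPDATE OR DELETE ON commissions
    FOR EACH ROW EXECUTE PROCEDURE forbid_old_row_changes();
```

Any UPDATE or DELETE that touches a row outside the window then raises an exception and aborts the statement, leaving the data unchanged.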

PostgreSQL: Alternative to UPDATE (As COPY to INSERT)

I've noticed that UPDATE speed in PostgreSQL doesn't meet my expectations, especially when I update many rows at the same time, say 10K rows. Is there any fast alternative to UPDATE, the way fast COPY is an alternative to INSERT?
Thanks before.
Unlike INSERT, UPDATE has no COPY-style bulk path, but it can still perform well for large writes. I have certainly had cases where I was updating tens of thousands of records and had it run reasonably fast. The normal caveats for bulk operations apply, of course:
Indexing doesn't always help and in fact will not help if updating your entire table. You may find it faster to drop indexes, update, and recreate them.
NOT EXISTS is painfully slow in updates over large sets. Find a way of making things work with a left join instead.
Normal performance rules apply (please look at query plans, etc).
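A common pattern that combines these points is to load the new values with COPY into a staging table and then apply one set-based join update instead of 10K single-row updates. All names and the file path below are illustrative:

```sql
-- Load the new values quickly via COPY.
CREATE TEMP TABLE staging (id int PRIMARY KEY, new_value text);
COPY staging FROM '/tmp/new_values.csv' WITH (FORMAT csv);

-- One join update instead of thousands of per-row statements.
UPDATE target t
SET    value = s.new_value
FROM   staging s
WHERE  t.id = s.id;
```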

Synchronizing two tables, best practice

I need to synchronize two tables across databases, whenever either one changes, on either update, delete or insert. The tables are NOT identical.
So far the easiest and best solution I have been able to find is adding SQL triggers.
I have slowly started adding them, and it seems to be working fine. But before I continue and finish it, I want to be sure that this is a good idea, and in general good practice.
If not, what is a better option for this scenario?
Thank you in advance
Regards
Daniel.
Triggers will work, but there are quite a few different options available to consider.
Are all data modifications to these tables done through stored procedures? If so, consider putting the logic in the stored procedures instead of in a trigger.
Do the updates have to be real-time? If not, consider a job that regularly synchronizes the tables instead of a trigger. This probably gets tricky with deletes, though. Not impossible, just tricky.
We had one situation where the tables were very similar, but had slightly different column names or orders. In that case, we created a view to the original table that let the application use the view instead of the second copy of the table. We were also able to use a Synonym one time to point to the original table, but that requires that the table structures be the same.
Generally speaking, a lot of people try to avoid unnecessary triggers as they're just too easy to miss when doing other work in the database. That doesn't make them bad, but can lead to interesting times when trying to troubleshoot problems.
In your scenario, I'd probably briefly explore other options before continuing with the triggers. Just watch out for cascading trigger effects where your one update results in the second table updating, passing the update back to the first table, then the second, etc. You can guard for this a little with nesting levels. Otherwise you run the risk of hitting that maximum recursion level and throwing errors.
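If you do stay with triggers, the cascading-trigger guard mentioned above can be sketched like this in Postgres syntax. The table and column names are assumptions, as is the use of `pg_trigger_depth()` (available since Postgres 9.2); SQL Server would use `TRIGGER_NESTLEVEL()` instead:

```sql
CREATE OR REPLACE FUNCTION sync_a_to_b() RETURNS trigger AS $$
BEGIN
    -- Guard: if a mirror trigger on table_b fired us, stop the ping-pong.
    IF pg_trigger_depth() > 1 THEN
        RETURN NEW;
    END IF;
    INSERT INTO table_b (id, name)
    VALUES (NEW.id, NEW.name)
    ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER sync_a_to_b
    AFTER INSERT OR UPDATE ON table_a
    FOR EACH ROW EXECUTE PROCEDURE sync_a_to_b();
```

A mirror trigger on table_b would use the same depth guard, so one user update propagates exactly once in each direction.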

Propagated delete in code or database?

I'm working on an iPhone application with a few data relationships (Author -> Books for example). When a user deletes an Author object from the application, I have a few SQLite triggers that run on the delete to remove any books from the database that have a foreign key matching the Author's primary key.
I'm also using a trigger to insert some data when a new item is created.
I can't help but shake the feeling that this might be bad design or lead to some problems down the road I am not thinking of. That said, should I rely on code in my app to handle propagating the deletes like this when the database has the capability built in to handle it?
What say you?
True. Use the built-in capabilities of the database as much as possible. At least try to start off like that, and only compromise when things really demand it.
I would make use of the database's features to ensure relational integrity, especially with respect to updates/deletes. There are cases where I might use a trigger to insert some additional data (auditing comes to mind), though I would tend to avoid this and insert all of the data from my application. If you are doing multiple inserts, though, make sure to wrap it all in a single transaction so that you don't end up with a partial insert which could lead to loss of relational integrity.
I like the idea of using the database's built-in functionality (I am not familiar with how it works), but I would worry: if I went back to the code a year from now, would I remember how it worked? (Given that the code isn't right in front of me.)
I imagine if you add a lot of comments to remind yourself about how it works now, if anything goes wrong in the future, at least you won't need to relearn the database features when you need to go do some debugging.
You're a few steps ahead of me: I recently learned about how to do that stuff with triggers and I am tempted to use them myself.
Based on the other answers here, it seems like a philosophical choice. It would probably be fine to use either triggers or code, but best to be consistent. So don't use triggers for cascading deletes on one table but then C code for another table.
Since you tagged the question iphone, I think the most important difference would be relative performance of C code versus a trigger. You'd probably have to code both and experiment to determine the difference, if any.
Another thing that comes to mind is that, of all the horror stories that I read on thedailywtf.com, about half of them seem to involve database triggers.
Unfortunately SQLite does NOT support ON DELETE CASCADE etc. From the SQLite documentation:
http://www.sqlite.org/omitted.html
FOREIGN KEY constraints are parsed but are not enforced. However, the equivalent constraint enforcement can be achieved using triggers. The SQLite source tree contains source code and documentation for a C program that will read an SQLite database, analyze the foreign key constraints, and generate appropriate triggers automatically.
There is some support for triggers but it is not complete. Missing subfeatures include FOR EACH STATEMENT triggers (currently all triggers must be FOR EACH ROW), INSTEAD OF triggers on tables (currently INSTEAD OF triggers are only allowed on views), and recursive triggers - triggers that trigger themselves.
Therefore, the only way to get ON DELETE CASCADE etc. in SQLite is with triggers.
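(Note that the quoted page describes older SQLite; since SQLite 3.6.19, foreign key actions such as ON DELETE CASCADE are enforced when `PRAGMA foreign_keys = ON` is set.) The trigger-based emulation this answer describes can be sketched with Python's built-in sqlite3 module; the Author/Books schema follows the question, and the names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);

    -- Emulate ON DELETE CASCADE with a trigger, as the quoted docs suggest.
    CREATE TRIGGER cascade_author_delete
    AFTER DELETE ON authors
    FOR EACH ROW
    BEGIN
        DELETE FROM books WHERE author_id = OLD.id;
    END;
""")

conn.execute("INSERT INTO authors VALUES (1, 'Le Guin')")
conn.execute("INSERT INTO books VALUES (1, 1, 'The Dispossessed')")
conn.execute("INSERT INTO books VALUES (2, 1, 'The Left Hand of Darkness')")

# Deleting the author fires the trigger and removes the books too.
conn.execute("DELETE FROM authors WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
print(remaining)  # 0
```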
Kind regards,
Code goes in your app.
Triggers are code. The functionality goes in your app. Not in the database.
I think that databases should be used for data, not processing. I think apps should be used for processing, not data.
Database processing features merely muddy the water.