My team is thinking of switching from 9.1 to 9.4, and as part of the evaluation we would like to measure how much of an improvement there is for INSERT INTO table ... statements with 3-4 columns of fixed-length types like INT and DOUBLE PRECISION. We are using unbatched INSERTs, the tables are logged and not temporary, and fsync is set to on.
Q0: Are there any grounds to think that 9.4 would be faster than 9.1 on this particular statement?
For example based on improved WAL performance:
https://momjian.us/main/writings/pgsql/features.pdf
Clearly the best answer would be to check against our own data and run an experiment, but let's allow some speculation.
Q1: Are there performance evaluations that you are aware of?
Q2: How much of INSERT is taken by WAL?
Settings on the server (copied verbatim from 9.1 config file)
#fsync = off
#synchronous_commit = on
#wal_sync_method = fsync
#full_page_writes = on
#wal_buffers = -1
#wal_writer_delay = 200ms
shared_buffers = 15GB
temp_buffers = 1024MB
work_mem = 1024MB
The link you provided is based on information found in Section E.2.3.1.2, General Performance - a great read, by the way. Based on your suggested test, I would not expect any real performance difference, because you will not take advantage of parallel or partial writes (regarding WAL files). That said, you might run across this in the future. Also, 9.4 (well, post-9.1 really) provides many useful tools and performance enhancements that, in my opinion, justify a switch from 9.1 to 9.4. For instance, since 9.2, JSON is a datatype, index-only scans are possible, and in-memory sorting has been improved by as much as 25%. 9.3 saw the introduction of materialized views (with concurrent refresh in 9.4) and updatable "simple" views (the definition of "simple" was expanded slightly in 9.4). In 9.4, aggregates are enhanced and ALTER SYSTEM was added - the ability to change config settings with a SQL command (values go into postgresql.auto.conf, which is read last, ensuring they override postgresql.conf values).
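For example (the setting and value here are just illustrative):
-- ALTER SYSTEM writes the value to postgresql.auto.conf ...
ALTER SYSTEM SET work_mem = '64MB';
-- ... then reload the configuration so it takes effect
SELECT pg_reload_conf();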
Of note, default logging has also changed. For instance, when creating a table, you won't receive messages about implicit index and sequence creation (set the log level to DEBUG1 to get them back - this drove me crazy when I switched from 9.1 to 9.3, especially during lectures).
In regards to question 1, I'd run a TPC benchmark test (TPC-C and TPC-VMS are the only ones that aren't free). For question 2, that really depends on your WAL settings, but from what I see in your config file, it shouldn't matter in regards to version performance. I'd also run pgtune on your system (link below) to ensure your config file is as close to optimal as possible before testing.
As the other commenters said, just build it out and see what happens. You might not get much, if any, difference with straight inserts, so I'd try large multi-table joins, huge sorts, and lots of transaction simulation (e.g., lots of inserts, updates, and deletes - just use plpgsql for simplicity). The TPC queries will also do a pretty good job of performance testing.
Links:
To find the "What's new" PostgreSQL wiki pages, add the version number to the end of the following URL.
You can find a GUI version of pgtune at pgtune.leopard.in.ua; the standalone download from pgFoundry is hit or miss because the site always seems to be down.
Background: so far I've always used Django with its ORM to build small websites, so which database (MySQL vs PostgreSQL) was doing all the work behind the curtains wasn't really an issue.
Recently I decided to learn more about the differences between the two. I've just finished reading this (long) article, which explores how indexes work in PostgreSQL, and I am really shocked about the following fact:
"For instance, if we have a table with a dozen indexes defined on it, an update to a field that is only covered by a single index must be propagated into all 12 indexes to reflect the ctid for the new row."
I'm not an expert at all, but it sounds insane to me that such overhead should happen by design when updating fields not covered by an index.
Moreover, the article goes on to explain how PostgreSQL's replication strategy works not at the logical level but at the on-disk level, i.e. the master sends the slaves a byte-by-byte list of all changes to apply on disk, rather than more abstract instructions such as UPDATE <fields> ON <table> WHERE ....
Although many short articles on the web comparing MySQL and PostgreSQL generally tend to claim that PostgreSQL is technically more advanced (ACID, JSON support, etc.), these two problems seem like serious drawbacks to me. Can you confirm those statements and possibly point out further resources about these issues?
Thank you.
About indexes and performance
It is certainly true that PostgreSQL has to do more work on indexes when a row is updated. This is due to the fact that an UPDATE actually creates a new row version in the table, and the indexes have to point to that new row version.
There is, however, a way to mitigate the impact: if you set fillfactor to less than 100, so that there is free space in the data pages, and no indexed column is updated, PostgreSQL can create a “heap only tuple”, and such a HOT update will not need to touch any index.
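For illustration (the table name is hypothetical, and 70 is just an example value):
-- leave 30% of each page free so HOT updates have room
ALTER TABLE my_table SET (fillfactor = 70);
-- rewrite the table so existing pages honour the new fillfactor
VACUUM FULL my_table;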
MySQL's InnoDB with its secondary indexes (that reference the primary key index) has to do less work updating indexes. You pay the price for that with every index scan: first, you have to scan the secondary index to find the primary key, then you have to scan the primary key index to find the table row.
So there is a trade-off, but I think it is one-sided to unconditionally say that one solution is better.
About replication
MySQL had a replication solution much earlier than PostgreSQL did. It uses the binary log for replication, which is a slightly deceptive name, as it actually contains SQL statements.
Version 9.0 of PostgreSQL introduced streaming replication, which ships the transaction log to the standby. This information is on the physical level, so primary and standby are kept physically identical. This is often more wasteful than shipping SQL statements (index updates!), but it is a very stable solution that leaves no room for replication conflicts.
PostgreSQL v10 has introduced logical replication, which generates an abstract description of the change, similar to an SQL statement. This allows for more flexible replication scenarios.
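A minimal sketch of logical replication (all names and the connection string are hypothetical):
-- on the publishing (primary) server:
CREATE PUBLICATION my_pub FOR TABLE my_table;

-- on the subscribing server, where the table must already exist:
CREATE SUBSCRIPTION my_sub
    CONNECTION 'host=primary.example.com dbname=mydb user=repuser'
    PUBLICATION my_pub;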
So the article you are referencing has become somewhat outdated in this respect.
We're using PostgreSQL 9.1.4 as our db server. I've been trying to speed up my test suite, so I've started profiling the db a bit to see exactly what's going on. We are using database_cleaner to truncate tables at the end of tests. YES, I know transactions are faster; I can't use them in certain circumstances, so I'm not concerned with that.
What I AM concerned with, is why TRUNCATION takes so long (longer than using DELETE) and why it takes EVEN LONGER on my CI server.
Right now, locally (on a MacBook Air) a full test suite takes 28 minutes. Tailing the logs, each time we truncate tables, i.e.:
TRUNCATE TABLE table1, table2 -- ... etc
it takes over 1 second to perform the truncation. Tailing the logs on our CI server (Ubuntu 10.04 LTS), it takes a full 8 seconds to truncate the tables, and a build takes 84 minutes.
When I switched over to the :deletion strategy, my local build took 20 minutes and the CI server went down to 44 minutes. This is a significant difference, and I'm really blown away as to why this might be. I've tuned the DB on the CI server; it has 16 GB of system RAM, 4 GB of shared_buffers... and an SSD. All the good stuff. How is it possible:
a. that it's SO much slower than my MacBook Air with 2 GB of RAM, and
b. that TRUNCATE is so much slower than DELETE when the PostgreSQL docs state explicitly that it should be much faster?
Any thoughts?
This has come up a few times recently, both on SO and on the PostgreSQL mailing lists.
The TL;DR for your last two points:
(a) The bigger shared_buffers may be why TRUNCATE is slower on the CI server. Different fsync configuration or the use of rotational media instead of SSDs could also be at fault.
(b) TRUNCATE has a fixed cost, but it is not necessarily lower than the cost of a DELETE, and TRUNCATE does more work. See the detailed explanation that follows.
UPDATE: A significant discussion on pgsql-performance arose from this post. See this thread.
UPDATE 2: Improvements have been added to 9.2beta3 that should help with this, see this post.
Detailed explanation of TRUNCATE vs DELETE FROM:
While I'm not an expert on the topic, my understanding is that TRUNCATE has a nearly fixed cost per table, while DELETE is at least O(n) for n rows - worse if there are any foreign keys referencing the table being deleted.
I always assumed that the fixed cost of a TRUNCATE was lower than the cost of a DELETE on a near-empty table, but this isn't true at all.
TRUNCATE table; does more than DELETE FROM table;
The state of the database after a TRUNCATE table is much the same as if you'd instead run:
DELETE FROM table;
VACUUM (FULL, ANALYZE) table; (9.0+ only, see footnote)
... though of course TRUNCATE doesn't actually achieve its effects with a DELETE and a VACUUM.
The point is that DELETE and TRUNCATE do different things, so you're not just comparing two commands with identical outcomes.
A DELETE FROM table; allows dead rows and bloat to remain, allows the indexes to carry dead entries, doesn't update the table statistics used by the query planner, etc.
A TRUNCATE gives you a completely new table and indexes, as if they were freshly created. It's like you deleted all the records, reindexed the table, and did a VACUUM FULL.
If you don't care if there's crud left in the table because you're about to go and fill it up again, you may be better off using DELETE FROM table;.
Because you aren't running VACUUM, you will find that dead rows and index entries accumulate as bloat that must be scanned and then ignored; this slows all your queries down. If your tests don't actually create and delete all that much data, you may not notice or care, and you can always do a VACUUM or two part-way through your test run if you do. Better yet, configure aggressive autovacuum settings so that autovacuum does it for you in the background (for example, as shown below).
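Autovacuum can be made more aggressive per table; the parameters here are real, but the table name and values are illustrative:
-- vacuum once ~5% of rows are dead instead of the default 20%
ALTER TABLE my_table SET (
    autovacuum_vacuum_scale_factor = 0.05,
    autovacuum_vacuum_threshold    = 50
);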
You can still TRUNCATE all your tables after the whole test suite runs to make sure no effects build up across many runs. On 9.0 and newer, a VACUUM (FULL, ANALYZE); run globally on the database is at least as good if not better, and it's a whole lot easier.
IIRC Pg has a few optimisations that mean it might notice when your transaction is the only one that can see the table and immediately mark the blocks as free anyway. In testing, when I've wanted to create bloat I've had to have more than one concurrent connection to do it. I wouldn't rely on this, though.
DELETE FROM table; is very cheap for small tables with no f/k refs
To DELETE all records from a table with no foreign key references to it, all Pg has to do is a sequential table scan that sets the xmax of the tuples it encounters. This is a very cheap operation - basically a linear read and a semi-linear write. AFAIK it doesn't have to touch the indexes; they continue to point to the dead tuples until they're cleaned up by a later VACUUM, which also marks blocks containing only dead tuples as free.
DELETE only gets expensive if there are lots of records, if there are lots of foreign key references that must be checked, or if you count the subsequent VACUUM (FULL, ANALYZE) table; needed to match TRUNCATE's effects as part of the cost of your DELETE.
In my tests here, a DELETE FROM table; was typically 4x faster than TRUNCATE at 0.5ms vs 2ms. That's a test DB on an SSD, running with fsync=off because I don't care if I lose all this data. Of course, DELETE FROM table; isn't doing all the same work, and if I follow up with a VACUUM (FULL, ANALYZE) table; it's a much more expensive 21ms, so the DELETE is only a win if I don't actually need the table pristine.
TRUNCATE table; does a lot more fixed-cost work and housekeeping than DELETE
By contrast, a TRUNCATE has to do a lot of work. It must allocate new files for the table, its TOAST table if any, and every index the table has. Headers must be written into those files, and the system catalogs may need updating too (not sure on that point, haven't checked). It then has to replace the old files with the new ones or remove the old ones, and has to ensure the file system has caught up with the changes with a synchronization operation - fsync() or similar - that usually flushes all buffers to disk. I'm not sure whether the sync is skipped if you're running with the (data-eating) option fsync=off.
I learned recently that TRUNCATE must also flush all PostgreSQL's buffers related to the old table. This can take a non-trivial amount of time with huge shared_buffers. I suspect this is why it's slower on your CI server.
The balance
Anyway, you can see that a TRUNCATE of a table that has an associated TOAST table (most do) and several indexes could take a few moments. Not long, but longer than a DELETE from a near-empty table.
Consequently, you might be better off doing a DELETE FROM table;.
--
Note: on DBs before 9.0, CLUSTER table_id_seq ON table; ANALYZE table; or VACUUM FULL ANALYZE table; REINDEX table; would be a closer equivalent to TRUNCATE. The VACUUM FULL implementation changed to a much better one in 9.0.
Brad, just to let you know. I've looked fairly deeply into a very similar question.
Related question: 30 tables with few rows - TRUNCATE the fastest way to empty them and reset attached sequences?
Please also look at this issue and this pull request:
https://github.com/bmabey/database_cleaner/issues/126
https://github.com/bmabey/database_cleaner/pull/127
Also this thread: http://archives.postgresql.org/pgsql-performance/2012-07/msg00047.php
I am sorry for writing this as an answer, but I didn't find any comment links, maybe because there are too many comments there already.
I've encountered a similar issue lately, i.e.:
The time to run a test suite which used DatabaseCleaner varied widely between different systems with comparable hardware.
Changing the DatabaseCleaner strategy to :deletion provided a ~10x improvement.
The root cause of the slowness was a filesystem with journaling (ext4) used for database storage. During the TRUNCATE operation the journaling daemon (jbd2) was using ~90% of disk IO capacity. I am not sure if this is a bug, an edge case, or actually normal behaviour in these circumstances. It does explain, however, why TRUNCATE was a lot slower than DELETE: it generated a lot more disk writes. As I did not want to actually use DELETE, I resorted to setting fsync=off, and that was enough to mitigate the issue (data safety was not important in this case).
A couple of alternate approaches to consider:
Create an empty database with static "fixture" data in it, and run the tests in that. When you are done, just drop the database, which should be fast.
Create a new table called "test_ids_to_delete" that contains columns for table names and primary key ids. Update your deletion logic to insert the ids/table names into this table instead, which will be much faster than running deletes. Then write a script to run "offline" to actually delete the data, either after an entire test run has finished, or overnight (a sketch follows below).
The former is a "clean room" approach, while the latter means some test data will persist in the database for longer. The "dirty" approach with offline deletes is what I'm using for a test suite with about 20,000 tests. Yes, there are sometimes problems due to having "extra" test data in the dev database. But at times this "dirtiness" has helped us find and fix bugs, because the "messiness" better simulated a real-world situation in a way the clean-room approach never will.
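A rough sketch of the second approach, with hypothetical table and column names (test_ids_to_delete, users, id):
-- queue of rows to remove later
CREATE TABLE test_ids_to_delete (
    table_name text   NOT NULL,
    row_id     bigint NOT NULL
);

-- during tests, record rows instead of deleting them:
INSERT INTO test_ids_to_delete (table_name, row_id) VALUES ('users', 42);

-- offline (after the run, or overnight), per table:
DELETE FROM users
WHERE id IN (SELECT row_id FROM test_ids_to_delete WHERE table_name = 'users');
DELETE FROM test_ids_to_delete WHERE table_name = 'users';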
I am attempting to restart a postgresql db which has stopped/is down and requires a VACUUM.
http://suwala.eu/blog/2010/10/09/how-to-vacuum-postgresql/
Following the above sequence of commands, I can't seem to get the last line to execute right.
$ postgres -D /var/lib/pgsql/data YOUR_DATABASE_NAME < /tmp/fix.sql
This gives me an error that says
postgres: invalid argument: "YOUR_DATABASE_NAME"
Try "postgres --help" for more information.
Any idea why?
CLARIFICATION
The 'YOUR_DATABASE_NAME' and the data directory I used on my server are the correct ones.
The "how-to-vacuum-postgresql" page referenced in the question gives some very bad advice when it recommends VACUUM FULL. All that is needed is a full-database vacuum, which is simply a VACUUM run as the database superuser against the entire database (i.e., you don't specify any table name).
A VACUUM FULL works differently based on the version, but it eliminates all the space within the heap files that the database holds for quick re-use and releases it to the OS. This can be slower than the minimum needed to get back to a usable database by orders of magnitude. And since any inserts or updates after the VACUUM FULL require OS calls to re-allocate space to the database, it can cause slower execution afterward, unless your database had a lot of bloat. (Although, if you turned off autovacuum, it might be in horrible shape; you probably want to get back on your feet first and sort that out later.)
Another issue with VACUUM FULL before version 9.0 is that while it eliminates bloat in a table's heap files, it tends to increase bloat in its index files, sometimes dramatically. If you issue a VACUUM FULL, you should normally follow it with a REINDEX to get the indexes back into good shape.
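On those older versions, the follow-up would look like this (the table name is hypothetical):
-- pre-9.0 pattern: reclaim heap space, then rebuild the now-bloated indexes
VACUUM FULL my_table;
REINDEX TABLE my_table;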
The page referenced in the question also fails to heed the advice given in the PostgreSQL docs at http://www.postgresql.org/docs/8.3/interactive/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND to use single-user mode:
since the system will not execute commands once it has gone into the safety shutdown mode, the only way to do this is to stop the server and use a single-user backend to execute VACUUM. The shutdown mode is not enforced by a single-user backend. See the postgres reference page for details about using a single-user backend.
As others have mentioned -- there is almost no use case where turning off autovacuum is beneficial. It may be useful to supplement the autovacuum activity with explicit vacuums on large tables, or you may want to adjust autovacuum configuration, but really -- don't turn it off or you will see bloat which saps performance and you'll run into transaction ID wraparound problems periodically. People who notice a performance hit when autovacuum is performing maintenance sometimes have an instinct to make it less aggressive in triggering, but that is usually counter-productive. It is generally better to adjust the autovacuum cost limitation parameters to pace the work, rather than have it neglect tables which need maintenance.
This appears to be an issue in PostgreSQL: according to the documentation for 9.0 and 8.3, it should work with those versions, but it doesn't.
However, using the --single switch makes it work:
postgres --single -D [path-to-data-dir] [db-name] < /tmp/fix.sql
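For what it's worth, you can also run the vacuum interactively in single-user mode rather than piping in a file (the data directory and database name here are placeholders):
$ postgres --single -D /var/lib/pgsql/data db_name
backend> VACUUM;
Press Ctrl-D to exit the single-user backend once the VACUUM completes.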
Background:
I have a PostgreSQL (v8.3) database that is heavily optimized for OLTP.
I need to extract data from it on a semi-real-time basis (someone is bound to ask what semi-real-time means; the answer is as frequently as I reasonably can, but I will be pragmatic - as a benchmark, let's say we are hoping for every 15 min) and feed it into a data warehouse.
How much data? At peak times we are talking approx 80-100k rows per min hitting the OLTP side; off-peak this drops significantly to 15-20k. The most frequently updated rows are ~64 bytes each, but there are various tables etc., so the data is quite diverse and can range up to 4000 bytes per row. The OLTP side is active 24x5.5.
Best Solution?
From what I can piece together the most practical solution is as follows:
Create a TRIGGER to write all DML activity to a rotating CSV log file (a rough sketch follows this list)
Perform whatever transformations are required
Use the native DW data pump tool to efficiently pump the transformed CSV into the DW
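For illustration, the trigger in step 1 could look something like this - a sketch in modern PostgreSQL syntax with hypothetical names (dml_log, log_dml, orders), logging to a table that a periodic job exports and rotates, since plain plpgsql cannot write files directly:
CREATE TABLE dml_log (
    logged_at  timestamptz NOT NULL DEFAULT now(),
    table_name text        NOT NULL,
    operation  text        NOT NULL,
    row_data   text        NOT NULL
);

CREATE OR REPLACE FUNCTION log_dml() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO dml_log (table_name, operation, row_data)
        VALUES (TG_TABLE_NAME, TG_OP, OLD::text);
        RETURN OLD;
    END IF;
    INSERT INTO dml_log (table_name, operation, row_data)
    VALUES (TG_TABLE_NAME, TG_OP, NEW::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_log_dml
AFTER INSERT OR UPDATE OR DELETE ON orders
FOR EACH ROW EXECUTE PROCEDURE log_dml();

-- every 15 minutes: export the batch to CSV, then clear it (a real job
-- would bound the export by logged_at so rows written during the COPY
-- are not lost to the TRUNCATE)
COPY (SELECT * FROM dml_log) TO '/tmp/dml_batch.csv' WITH CSV;
TRUNCATE dml_log;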
Why this approach?
TRIGGERS allow selective tables to be targeted rather than being system-wide, their output is configurable (i.e. into a CSV), and they are relatively easy to write and deploy. SLONY uses a similar approach, and the overhead is acceptable.
CSV easy and fast to transform
Easy to pump CSV into the DW
Alternatives considered ....
Using native logging (http://www.postgresql.org/docs/8.3/static/runtime-config-logging.html). The problem with this is that it looked very verbose relative to what I needed, and it was a little trickier to parse and transform. However, it could be faster, as I presume there is less overhead compared to a TRIGGER. Certainly it would make the admin easier, as it is system-wide, but again, I don't need some of the tables captured (some are used for persistent storage of JMS messages which I do not want to log).
Querying the data directly via an ETL tool such as Talend and pumping it into the DW... the problem is the OLTP schema would need to be tweaked to support this, and that has many negative side-effects.
Using a tweaked/hacked SLONY - SLONY does a good job of logging and migrating changes to a slave, so the conceptual framework is there, but the proposed solution just seems easier and cleaner.
Using the WAL
Has anyone done this before? Want to share your thoughts?
Assuming that your tables of interest have (or can be augmented with) a unique, indexed, sequential key, then you will get much, much better value out of simply issuing SELECT ... FROM table ... WHERE key > :last_max_key with output to a file, where last_max_key is the last key value from the last extraction (0 if it is the first extraction). This incremental, decoupled approach avoids introducing trigger latency in the insertion datapath (be it custom triggers or modified Slony), and depending on your setup could scale better with the number of CPUs etc. (However, if you also have to track UPDATEs, and the sequential key was added by you, then your UPDATE statements should SET the key column to NULL so it gets a new value and gets picked up by the next extraction. You would not be able to track DELETEs without a trigger.) Is this what you had in mind when you mentioned Talend?
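A minimal sketch of that incremental pull, under assumed names (orders, its id key, and an etl_watermark tracking table are all hypothetical):
-- one-time setup: a single-row table remembering where the last run stopped
CREATE TABLE etl_watermark (last_max_key bigint NOT NULL);
INSERT INTO etl_watermark VALUES (0);

-- each run: export everything above the watermark to a file ...
COPY (SELECT *
      FROM orders
      WHERE id > (SELECT last_max_key FROM etl_watermark))
TO '/tmp/orders_delta.csv' WITH CSV;

-- ... then advance the watermark; a real job would pin max(id) before
-- the COPY so rows inserted in between are not skipped
UPDATE etl_watermark
SET last_max_key = (SELECT coalesce(max(id), 0) FROM orders);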
I would not use the logging facility unless you cannot implement the solution above; logging most likely involves locking overhead to ensure log lines are written sequentially and do not overlap/overwrite each other when multiple backends write to the log (check the Postgres source.) The locking overhead may not be catastrophic, but you can do without it if you can use the incremental SELECT alternative. Moreover, statement logging would drown out any useful WARNING or ERROR messages, and the parsing itself will not be instantaneous.
Unless you are willing to parse WALs (including transaction state tracking, and being ready to rewrite the code every time you upgrade Postgres), I would not necessarily use the WALs either - that is, unless you have the extra hardware available, in which case you could ship the WALs to another machine for extraction (on the second machine you can use triggers shamelessly - or even statement logging - since whatever happens there does not affect INSERT/UPDATE/DELETE performance on the primary machine). Note that performance-wise (on the primary machine), unless you can write the logs to a SAN, you'd get a comparable performance hit (in terms of thrashing the filesystem cache, mostly) from shipping WALs to a different machine as from running the incremental SELECT.
If you can think of a 'checksum table' that contains only the ids and a 'checksum' of each row, you can not only do a quick select of the new records but also detect the changed and deleted ones.
The checksum could be computed by any checksum function you like, such as crc32.
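A sketch of that idea with hypothetical names (orders, id); PostgreSQL has no built-in crc32, so md5 stands in here:
-- snapshot table: one row per source row, id plus content checksum
CREATE TABLE orders_checksum AS
SELECT id, md5(orders::text) AS checksum FROM orders;

-- new records: present in the source but not in the snapshot
SELECT o.id
FROM orders o LEFT JOIN orders_checksum c ON c.id = o.id
WHERE c.id IS NULL;

-- changed records: same id, different checksum
SELECT o.id
FROM orders o JOIN orders_checksum c ON c.id = o.id
WHERE md5(o::text) <> c.checksum;

-- deleted records: present in the snapshot but gone from the source
SELECT c.id
FROM orders_checksum c LEFT JOIN orders o ON o.id = c.id
WHERE o.id IS NULL;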
The new ON CONFLICT clause in PostgreSQL has changed the way I do many updates. I pull the new data (based on a row_update_timestamp) into a temp table, then in one SQL statement INSERT into the target table with ON CONFLICT ... DO UPDATE. If your target table is partitioned, you need to jump through a couple of hoops (i.e. hit the partition table directly). The ETL can happen as you load the temp table (most likely) or in the ON CONFLICT SQL (if trivial). Compared to other "UPSERT" systems (update, then insert if zero rows, etc.), this shows a huge speed improvement. In our particular DW environment we don't need/want to accommodate DELETEs. Check out the ON CONFLICT docs - it gives Oracle's MERGE a run for its money!
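A minimal sketch of the pattern (the table and column names are hypothetical; requires PostgreSQL 9.5 or later):
INSERT INTO target_table AS t (id, val, row_update_timestamp)
SELECT id, val, row_update_timestamp
FROM staging_temp
ON CONFLICT (id) DO UPDATE
SET val = EXCLUDED.val,
    row_update_timestamp = EXCLUDED.row_update_timestamp
WHERE t.row_update_timestamp < EXCLUDED.row_update_timestamp;  -- only apply newer rows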
I have a Windows Server 2003 machine which I will be using as a Postgres database server. The machine is a dual-core 3.0 GHz Xeon with 4 GB of ECC memory and 4 x 120 GB 10K RPM SAS drives, all striped.
I have read that the default Postgres install is configured to run nicely on a 486 with 32MB RAM, and I have read several web pages about configuration optimizations - but was hoping for something more concrete from my Stackoverflow peeps.
Generally, it's only going to serve one database (potentially one or two more), but the catch is that the database has one table in particular which is massive (hundreds of millions of records with only a few columns). Presently, with the default configuration, it's not slow, but I think it could potentially be even faster.
Can people please give me some guidance and recommendations for configuration settings which you would use for a server such as this?
A 4-drive stripe set (RAID 0) is a bad idea - if any of those drives fails, you'll lose all your data, and even SAS drives sometimes fail; with 4 drives a failure is 4 times more likely than with 1 drive. You should go for RAID 1+0.
Use the latest version of Postgres, 8.3.7 now; there are many performance improvements in every major version.
Set shared_buffers parameter in postgresql.conf to about 1/4 of your memory.
Set effective_cache_size to about 1/2 of your memory.
Set checkpoint_segments to about 32 (checkpoint every 512MB) and checkpoint_completion_target to about 0.8.
Set default_statistics_target to about 100 (all of these settings are collected into a sample snippet below).
Migrate to Enterprise Linux or FreeBSD: Postgres works much better on Unix-type systems - Windows support is a recent addition and not very mature.
You can read more on this page: Tuning Your PostgreSQL Server — PostgreSQL Wiki
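Putting the numbers above together for this 4 GB machine, the postgresql.conf changes would look roughly like this (values are the rough guidelines from above, not measured optima):
shared_buffers = 1GB                  # ~1/4 of RAM
effective_cache_size = 2GB            # ~1/2 of RAM
checkpoint_segments = 32              # checkpoint every ~512 MB of WAL
checkpoint_completion_target = 0.8
default_statistics_target = 100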
My experience suggests that (within limits) the hardware is typically the least important factor in database performance.
Assuming that you have enough memory to keep commonly used data in cache, your CPU speed may make a 10-50% difference between a top-of-the-line machine and a common-or-garden box.
However, a missing index in an important search, or a poorly written recursive trigger could easily make a difference of 1,000% or 10,000% or more in your response times.
Without knowing your exact table structure and row counts, I think anybody would say that your hardware looks amply sufficient. It is only your database structure which will kill you. :)
UPDATE:
Without knowing the specific queries and your index details, there's not much more we can do. And in general, even knowing the queries, it's often very difficult to optimize without actually installing and running the queries with realistic data sets.
Given the cost of the server, and the cost of your time, I think you need to invest thirty bucks in a book. Then install your database with test data, run the queries, and see what runs well and what runs badly. Fix, rinse, and repeat.
Both of these books are specific to SQL Server and both have high ratings:
http://www.amazon.com/Inside-Microsoft%C2%AE-SQL-Server-2005/dp/0735621969/ref=sr_1_1
http://www.amazon.com/Server-Performance-Tuning-Distilled-Second/dp/B001GAQ53E/ref=sr_1_5