Why does Perl makes the system very slow when I made more than 4,000 database connections? - perl

I was writing a code to find the speed of my database using a Perl script.
My intention was to make a 4,000 database connection after each fork (which would act as a 4,000 different clients) and sleep, and I issue the update command when I get the signal
but the system itself becomes very slow and almost hangs for making the connections itself and even I couldn't send the signal using my terminal.
I am using DBI module, I have 4GB RAM in my system where Postgres 8.3 is running in a different machine.

I'm not entirely clear on whether you're saying you wanted to a) Open 4,000 connections, fork, open 4,000 more connections, etc. or b) Fork 4,000 times and open one connection from each process, but 4,000 database connections or 4,000 processes is some pretty serious resource consumption either way. I'm not at all surprised that it's slowing your system to a crawl - I would expect that to be the end result regardless of the language used.
What are you actually attempting to achieve by creating all of these processes and/or connections? There's probably a better way to do it that won't be quite so resource-intensive.

I've seen pgpool in use on production systems where the number of postgres connections could not be limited to something reasonable. You may wish to look into using this yourself to mitigate against poor application design by your developers.
Essentially, pgpool acts as a proxy to postgres. It multiplexes queries on lots of connections to a much smaller (and manageable) number to the back-end database.

That is relativity speaking a lot of connections to have at once, but not unheard of by any means. How much memory do you have on the database server? Each connection takes resources, if you don't have a database server setup to handle that volume of connections it will be slow no matter what language you use to connect.
A simple analogy would be if you had a Toyota Prius (old days I would had said Ford Pinto) pulling a semi trailer with 80,000 lbs (typical legal weight in a lot of the states) of weight in it. It would burn that little Prius up in a heartbeat like you are seeing. To do it right you need to buy your self a big rig and hook it to that trailer to move that amount of weight.

Ignoring the wisdom of doing 4000 connection forks, you should work through your performance issues with something akin to Devel::NYTProf.
I would alternatively setup persistent workers in gearman and do my gearman client requests. Persistence and your scheduled forks on demand.

Related

How many parallel processes?

I am running some code in parallel by using a forking module in perl called Parallel::ForkManager. I have currently setting the maximum number of processes to 30:
my $pm = Parallel::ForkManager->new(30);
What would be an advisable maximum number of processes to create? I am doing this on a commercial grade Solaris server, but I still don't want to overload the system.
In downloading files, this really depends on
how many different hosts you're downloading from, and
how fast they will give you the requested files compared to your maximum bandwidth.
If you're downloading files from a single machine to a single machine on a local network, 2-3 is about max. If you're downloading files from 30 different servers on the internet, all of which are slow, but you have a fat pipe, then 30 might be reasonable.
There is no one universal right answer here. Unless you count "it depends."
The purpose of "downloading files" was mentioned, but in comments a while ago and I take the question as stated, to also be more general.
The only relevant measure is when you start reaching saturation in performance gains, with particular software on that system. The formal limits are huge and meaningless while rules of thumb are very general.
Let's imagine to run 10 processes and the time to complete the job drops 10 times. Increase to 20 processes and the time drops 20 times -- but for 30 processes the gain is the factor of 10. At this point we have loaded the system. Push further and the performance will degrade rapidly, and for everyone. At that point the server is overloaded, even though it allows, say, 1024 processes per user (and really ten or more times that for a server).
With a few processes per core the machine is engaged and I'd say that that is a good rule of thumb. However, it is too general. I doubt that you'd gain much in performance by going to that many processes, given the many other factors that affect it.
Accessing one web server The server's capability is the gospel. They may have posted how many requests per seconds they are happy with. Or they may have a limit on number of processes per user, say 10 or 20. If that means that many simultaneous downloads then that's your limit. But I'd be careful -- if the site is close and fast a request may complete in as little as 0.1 or 0.2 seconds. Then, with 10 processes you may be hitting the server 100 times a second. I do not recommend that. If there is no information I'd say keep it to a few requests per second. The performance and server load also depend on the content -- big downloads are different from pulling many skinny web pages. The I/O on your side may matter but I'd expect the server to set the limit. If you are going to use their service a lot why not send an email and ask what they are OK with.
I/O, network (many servers) or disk With network the performance depends on every piece of hardware in the path as well as on software. Nobody can tell without trying it out. The disk I/O is very complex. To add to trouble it is unclear whether it'd be your disks or network that is the bottleneck. I'd expect clear performance gains up to a few tens of processes, and probably fewer.
CPU or memory bound This may be easiest -- processing that can be broken up in parallel on 30 cores can enjoy close to a factor of 30 speedup (given no other bottlenecks). Going beyond the number of cores clearly leads to reduced performance gain. Concurrent (but not parallel) processing is far more complicated. If your code is memory intensive that is yet completely different.
Useful basic tools for assessing above components are iostat -xzn, netstat -I, and vmstat. But there is a bit of a curve to learning how to interpret their output and hopefully it doesn't come to that.
The conclusion is that you have to time it. Take your real application and time it running in one process. Do this 3 to 5 times and see the average (throw away obvious outliers). Then repeat with 5 processes, then with 10, etc. I'd expect that the trend will start slowing down far sooner than the 30 processors you mention. Once it gets to that the system is loaded and whoever works on it will notice. Very soon after that the performance will likely degrade rapidly. Proper benchmarking tools, like Benchmark, are far more sophisticated but this may well settle the issue. If you see strange or inconsistent behavior you may have to dig into details, starting with tools mentioned above.
What "overloaded" means is a bit unclear. I like to cap my use of resources well before other people are affected. But it may be possible to push it, in particular if you can run when it's quiet. I doubt that you'll keep having a worthy gain all the way to the number of available processors.
So there is no concern about "overloading" the server if you first time things. The performance limit will tell you when to stop. I'd say that your limit of 30 is very reasonable. Unless this is really about downloading files, in which case the web server is likely all that matters.
You should set the maximum number of processes to 60.

Does it make sense to use pg_pconnect (php-fpm)

I have about 11000 hits a second on 10 servers with php-fpm. I'm migrating to postgres from mysql, so my question is Does it make sense to use pg_*p*connect?
It's better to use a dedicated connection pooler like PgBouncer.
Performance should be comparable to pg_pconnect, but PgBouncer will allow to perform a cleanup after an error in PHP code. pg_pconnect will not automatically clean open transactions, locks, prepared statements etc.
Establishing a connection to a PostgreSQL server is expected to be significantly more expensive than to a MySQL server. This is due to different design choices of these databases in how they handle resource allocation and privilege separation between independent connections.
Therefore, for a website, it totally makes sense to reuse connections to PostgreSQL whenever possible.
The way generally recommended is not to use pg_pconnect but rather an external connection pooler like pgBouncer or pgPoolII which are better suited for this task. When using PHP-FPM however, you already have a middleware that lets you control somehow the number of open connections through the fpm process manager options, so it may be good enough. You may consider setting pm.max_requests to a non-zero value to make sure that connections get cleaned up at a reasonable frequency and avoid keeping a pile of unused connections during off-peak hours.
Well, pg_pconnect will mean you have one connection per PHP backend, so it depends how many backends you have. With a traditional Apache mod-php setup it'd be a non-starter but you might get away with it.
The database server can handle hundreds of idle connections, but almost certainly grind to a halt if they all have queries being issued concurrently. I've seen a rule-of-thumb of no more than two connections per core - that's assuming I/O doesn't limit you first.
The common approach is to run a connection pooler like pgbouncer and have php connect per-request. That reduces your connection overhead while keeping concurrency plausible.

Postgres causing swapping on CentOS

All,
I am running CentOS 6.0 with Postgresql 8.4 and can't seem to figure out how to prevent so much disc swap from occurring. I have 12 gigs of RAM and 4 processors and I am doing some simple updates (1 table at a time). I thought for a minute that the inserts happening in parallel from a script I wrong was causing the large memory usage but when I saw the simple update causing it too I basically threw in the towel and decided to ask for help.
I pasted the conf file here. http://pastebin.com/e0jdBu0J
You can see that I set the buffers relatively low and the connection amounts high. The DB service will not start if I set the shared buffers any higher than 64 megs. Anyone have an idea what may be causing this for me?
Thanks,
Adam
If you're going into swap, increasing shared_buffers will make the problem worse; you'll be taking RAM away from the part that's running out and swapping, instead dedicating memory to the database caching. It's worth fixing SHMMAX etc. just on general principle and for later tuning work, but that's not going to help with this problem.
Guessing at the identify of your memory gobbling source is a crapshoot. Far better to look at data from "top -c" and ps to find which processes are using a lot of it. It's possible for a really bad query to consume way more memory than it should. If you see memory use spike up for a PostgreSQL process running something, check the process ID against the information in pg_stat_tables to see what it's doing.
There are a couple of things that can cause this sort of issue that often surprise people. If you are doing a large number of row updates in a single transaction, and there are foreign key checks or triggers involved, that can run out of memory. The queue of things to check in each of those cases is kept in RAM, and can be surprisingly big.
There are two problems with your PostgreSQL settings that might be related. Databases don't actually work very well if you have a lot more active connections than cores in the server; best performance is normally 2 to 3 active clients per core. And all sorts of things go wrong once you've got more than a few hundred connection. There is some connections^2 behavior that gets ugly there performance wise, and there are some memory issues too. If you really need 1250 connections, you should be using a connection pooler such as pgBouncer or pgpool-II.
And effective_io_concurrency = 1000 is way too high for any hardware on the planet. Useful values for that in a small multiple of how many disks you have in the server. I have no idea what happens as far as memory usage goes when you set it that high, but it's not been tested very well at that range. Normal settings more like 1 to 25. The parameters outlined at Tuning Your PostgreSQL Server are much more important than it is; the concurrency value only impacts one particular type of table scan.
Centos 6 seems to have a very conservative shmmax as a default
Set your shared buffers to that recommended by postgres tuning resources
see for explanation and how to set.
To experiment you can (as root) use sysctl -w kernel.shmmax = n
where n is the value that the startup error message that postgres is trying to allocate on startup. When you identify the value you wish to use permanently then set that in /etc/sysctl.conf

Slony-I replication CPU usage

I have recently had to install slony (version 2.0.2) at work. Everything works fine, however, my boss would like to lower the cpu usage on slave nodes during replication. Searching on the net does not reveal any blatantly obvious answers to this. Any suggestions that would help reduce CPU usage (or spread the update out over a longer period) would be very much appreciated!
Have you looked into general PostgreSQL tuning here? The server can waste a lot of CPU cycles doing redundant work if it's not given enough resources to work with, and the default config is extremely small. Tuning Your PostgreSQL Server is a useful guide here, shared_buffers and checkpoint_segments are the two parameters you might get some significant improvement from on a slave (many of the rest only really help for improving query time).
Magnus might be right, this could very well just be a symptom of the fact that your database has very high traffic. Slony effectively multiplies the resource usage of any given DML operation: not only is data CRUD'ed to the replication master, but every time that happens, a Slony trigger (think of it as a change listener) generates an identical transaction and forwards it to the Slon process, which runs it on other members of the cluster.
However, there are two other possible explanations/solutions to this issue:
A possible solution might be to run the slon processes on a separate machine from your database hosts. Even if you have a single-master/single-slave replication scheme, it is advantageous in terms of stability, role-segregation, and performance (that’s you) to run the slon replication daemons on a physically different set of hardware (on the same LAN segment, ideally). There is nothing about Slony that says it has to run on the same machine as a given database host, so putting it in a different location (think “traffic controller”) might relieve some of the resource load on your database hosts. This is also a good idea in terms of both machine stability and scalability.
There's also a chance that this is only a temporary problem caused by the fact that you recently started using Slony. When you first subscribe a new node to a replication set, that node (and, to some extent, its parent) experiences VERY heavy CPU load (and possibly disk load as well) during the subscription process. I'm not sure how it works under the covers, but, depending on how much data was already on the node subscribed, Slony will either check the master’s data against every single piece of data present on the slave in tables that are replicated, and copy data down to the slave if it is missing or different. These are potentially CPU-intensive operations. Especially in large databases, the process of subscription can take a very long time (it took over a day for me, but our database is over 20GB), during which CPU load will be very high. A simple way to see what Slony is up to is to use pgAdmin’s Server Status viewer, which, while limited, will give you some useful info here. If there are a lot of “prepare table for replication” or “cleanup table after replication” operations in progress on the node that has a high CPU load, it’s probably because a subscription isn’t complete. pgAdmin’s status viewer isn’t too informative, however; there are more reliable ways of checking subscription progress using Slony directly. Section 4.7.6.4 in the Slony log-analysis documentation might help with that, as would reading the doc for SUBSCRIBE SET (pay special attention to the boxed warning message, and the "Dangerous/Unintuitive Behavior" section. A simple yet definitive hack to tell whether a set is still in the process of subscriptions is to run a MERGE SET and try to merge it with an empty (or not) other set. MERGE SET will fail with a "subscriptions in progress" error if subscription is still running. However, that hack won't work on Slony 2.1; MERGE SET will just wait until subscriptions are finished.
The best way to reduce the CPU usage would be to put less data into the database :-)
Other than that, you can experiment with sync_interval. It may be what you're looking for.

PostgreSQL consuming large amount of memory for persistent connection

I have a C++ application which is making use of PostgreSQL 8.3 on Windows. We use the libpq interface.
We have a multi-threaded app where each thread opens a connection and keeps using without PQFinish it.
We notice that for each query (especially the SELECT statements) postgres.exe memory consumption would go up. It goes up as high as 1.3 GB. Eventually, postgres.exe crashes and forces our program to create a new connection.
Has anyone experienced this problem before?
EDIT: shared_buffer is currently set to be 128MB in our conf. file.
EDIT2: a workaround that we have in place right now is to call PQfinish for every transaction. But then, this slows down our processing a bit since establishing a connection every time is quite slow.
In PostgreSQL, each connection has a dedicated backend. This backend not only holds connection and session state, but is also an execution engine. Backends aren't particularly cheap to leave lying around, and they cost both memory and synchronization overhead even when idle.
There's an optimum number of actively working backends for any given Pg server on any given workload, where adding more working backends slows things down rather than speeding it up. You want to find that point, and limit the number of backends to around that level. Unfortunately there's no magic recipe for this, it mostly involves benchmarking - on your hardware and with your workload.
If you need more connections than that, you should use a proxy or pooling system that allows you to separate "connection state" from "execution engine". Two popular choices are PgBouncer and PgPool-II . You can maintain light-weight connections from your app to the proxy/pooler, and let it schedule the workload to keep the database server working at its optimum load. If too many queries come in, some wait before being executed instead of competing for resources and slowing down all queries on the server.
See the postgresql wiki.
Note that if your workload is read-mostly, and especially if it has items that don't change often for which you can determine a reliable cache invalidation scheme, you can also potentially use memcached or Redis to reduce your database workload. This requires application changes. PostgreSQL's LISTEN and NOTIFY will help you do sane cache invalidation.
Many database engines have some separation of execution engine and connection state built in to the core database engine's design. Sybase ASE certainly does, and I think Oracle does too, but I'm not too sure about the latter. Unfortunately, because of PostgreSQL's one-process-per-connection model it's not easy for it to pass work around between backends, making it harder for PostgreSQL to do this natively, so most people use a proxy or pool.
I strongly recommend that you read PostgreSQL High Performance. I don't have any relationship/affiliation with Greg Smith or the publisher*, I just think it's great and will be very useful if you're concerned about your DB's performance.
* ... well, I didn't when I wrote this. I work for the same company now.
The memory usage is not necessarily a problem. PostgreSQL uses shared memory for some caching, and this memory does not count towards the size of the process memory usage until it's actually used. The more you use the process, the larger parts of the shared buffers will be active in it's address space.
If you have a large value for shared_buffers, this will happen. If you have it too large, the process can run out of address space and crash, yes.
The problem is probably that you don't close the transaction,
In PostgreSQL even if you do only selects without DML it runs in transaction which need to be rollback.
By adding rollback at the end of the transaction will reduce your memory problem