I want to use pgpool or pgbouncer as an external connection pooler with my flask app. The flask-sqlalchemy extension does not seem to expose a way to change the connection pooler to NullPool. Is there some way to do this?
While it should be possible with the apply_driver_hacks method, I would strongly recommend against it.
The TCP overhead is negligible on a local machine, but the authentication and negotiation (encoding, for example) certainly aren't. Keeping a pool is always useful within Flask, and if needed it can be configured with the SQLALCHEMY_POOL_SIZE, SQLALCHEMY_POOL_TIMEOUT, SQLALCHEMY_POOL_RECYCLE and SQLALCHEMY_MAX_OVERFLOW settings.
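For example, a minimal sketch of tuning the built-in pool with those settings (assuming Flask-SQLAlchemy 2.x, where these config keys are read directly; the URI and numbers are placeholders):

```python
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "postgresql://user:pass@localhost/mydb"  # placeholder
app.config["SQLALCHEMY_POOL_SIZE"] = 10       # connections kept open in the pool
app.config["SQLALCHEMY_POOL_TIMEOUT"] = 30    # seconds to wait for a free connection
app.config["SQLALCHEMY_POOL_RECYCLE"] = 1800  # recycle connections older than this (seconds)
app.config["SQLALCHEMY_MAX_OVERFLOW"] = 5     # temporary connections allowed beyond the pool size
db = SQLAlchemy(app)
```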
If you simply want to cut down on overhead (albeit completely negligible) and your one-instance Flask app is the only thing connecting to Postgres, then removing PgBouncer/PgPool from the mix would be better.
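That said, if you do decide to hand pooling entirely to PgBouncer/PgPool, the apply_driver_hacks route mentioned above could look roughly like this. This is only a sketch, assuming Flask-SQLAlchemy 2.x where that hook exists (its signature and return value have changed between versions, and newer releases expose SQLALCHEMY_ENGINE_OPTIONS for the same purpose); the URI is a placeholder pointing at the pooler.

```python
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy.pool import NullPool

class NullPoolSQLAlchemy(SQLAlchemy):
    def apply_driver_hacks(self, app, info, options):
        # Give every engine a NullPool so each checkout is a fresh connection,
        # which PgBouncer/PgPool is then expected to multiplex.
        options["poolclass"] = NullPool
        return super().apply_driver_hacks(app, info, options)

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "postgresql://user:pass@127.0.0.1:6432/mydb"  # placeholder
db = NullPoolSQLAlchemy(app)
```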
I’m using Postgres provisioned by Google Cloud SQL.
Recently we have seen the number of connections increase by a lot.
I had to raise the limit from 200 to 500, then to 1000. In the Google Cloud console, Postgres reports 800 current connections.
However, I have no idea where these connections come from. We have one App Engine service with not a lot of traffic accessing it at the moment, another application hosted on Kubernetes, and a dozen or so batch jobs that connect to it. Clearly there must be some connection leakage somewhere.
Is there any way I can see where these connections originate?
All applications connecting to it are Java based at the moment.
They use the HikariCP connection pool. I’m considering changing the “test query” run upon connection to insert a record into a log table, so I could perhaps find out where the connections originate.
But are there better ways available?
Thanks,
Consider monitoring connection activity with pg_stat_activity, e.g. SELECT * FROM pg_stat_activity;
As per the documentation:
Connections that show an IP address, such as 1.2.3.4, are connecting using IP. Connections with cloudsqlproxy~1.2.3.4 are using the Cloud SQL Proxy, or else they originated from App Engine. Connections from localhost are usually to a First Generation instance from App Engine, although that path is also used by some internal Cloud SQL processes.
Also, take a look at the best practices for managing database connections, which contain information on opening and closing connections, connection count, and how to set a connection duration in the Java programming language.
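As a starting point, a quick aggregation over pg_stat_activity shows which hosts, users and applications the connections belong to. A minimal sketch, assuming psycopg2 and a placeholder DSN (e.g. through the Cloud SQL Proxy):

```python
import psycopg2

# Placeholder DSN: point this at the Cloud SQL instance.
conn = psycopg2.connect("host=127.0.0.1 port=5432 dbname=postgres user=postgres")
try:
    with conn.cursor() as cur:
        # Group current connections by origin so a leaking client stands out.
        cur.execute("""
            SELECT client_addr, usename, application_name, state, count(*) AS n
            FROM pg_stat_activity
            GROUP BY client_addr, usename, application_name, state
            ORDER BY n DESC
        """)
        for client_addr, usename, application_name, state, n in cur.fetchall():
            print(n, client_addr, usename, application_name, state)
finally:
    conn.close()
```

If each Java service also sets a distinct application_name (PgJDBC exposes this as the ApplicationName connection parameter), this breakdown usually answers the question without resorting to a log-table test query.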
In Java, application servers like JBoss EAP have the option to periodically verify the connections in a database pool (https://access.redhat.com/documentation/en-us/red_hat_jboss_enterprise_application_platform/6.4/html/administration_and_configuration_guide/sect-database_connection_validation). This has been very useful for removing stale connections.
I'm now looking at an ADO.NET application, and I was wondering if there is any similar functionality that could be used with Microsoft SQL Server?
I ended up finding this post by redgate that describes some of the validation that goes on when connections are taken from the pool:
If the connection has died because a router has decided that it no longer wants to forward your packets and no other routers like you either then there is no way to know this unless you try to send some data and don’t get a response.

If you create a connection and a connection pool is created and connections are put into the pool and not used, the longer they are in there, the bigger the chance of something bad happening to it.

When you go to use a connection there is nothing to warn you that a router has stopped forwarding your packets until you go to use it; so until you use it, you do not know that there is a problem.

This was an issue with connection pooling that was fixed in the first .Net 4 reliability update (see issue 14 which vaguely describes this) with a feature called “Connection Pool Resiliency”. The update meant that when a connection is about to be taken from the pool, it is checked for TCP validity and only returned if it is in a good state.
I am setting up two computers to run a web application. web-host hosts a MongoDB database and a NodeJS web server, while worker runs some more demanding processes and populates the database. An SSH tunnel from worker makes web-host:27017 accessible as localhost:9999 on worker. web-host:80 has been set up to be accessible on http://our.corporate.site/my_site/.
At the moment MongoDB has no authentication on it - anything that can contact web-host:27017 can read or write anything to the database.
With this setup, how paranoid should I be about authenticating requests to MongoDB? The answers to this question seemed to suggest not very. Considering access is only possible from localhost, it seems about as secure as the local file system. In MySQL I usually have a special 'web' user with limited privileges to limit the damage of an injection attack in case I make a mistake sanitizing input; however, MongoDB seems less vulnerable to injection (or at least easier to sanitize) compared with MySQL.
Here's the issue: if you do set up Mongo authentication, you are going to need to store the keys on the machine that accesses it.
So assuming that web-host:80 is compromised, the keys are also vulnerable.
There are some mitigation processes you can use to secure your environment, but there is no silver bullet if an attacker gains root access to your environment.
First, I would consider putting MongoDB on a separate machine on a private internal network that can only be accessed by machines in a DMZ (the part of the network where machines can communicate with both your internal network and the outside world).
Next, assuming you are running a Linux-based system, you should be able to use AppArmor or SELinux to limit which processes are allowed to make outbound network requests. In this case only your webapp process should be able to initiate network requests such as connecting to your Mongo database.
If an attacker was able to get non-root access on your machine, the SELinux/AppArmor system policy would prevent them from initiating a connection to your database from their own script.
Using this architecture, you should be more secure than simply augmenting your current architecture with authentication. In a choice between SELinux and AppArmor, I would use SELinux, since it was much more mature and had much more granular control the last time I checked.
I was using MySQL Workbench and I am not able to figure out the difference between the following:
1. Server instance
2. Connection to server
In general I want to know if we can use Open Connection to start querying without creating a server instance for the connection we are trying to connect to. Are these two things independent?
You need one or two connections depending on what you want to do with your server. For MySQL work (i.e. running queries) you need a MySQL connection. For server work (e.g. shutting the MySQL server down or managing other aspects that require shell access) you need a second connection (which is called a Server Instance).
Beginning with MySQL Workbench 6.0 we merged both connection settings into one interface.
I've set up PGBouncer and configured it to connect to my Postgres DB, and it all connects just fine; however, I'm not sure if it's actually working.
I have a php script that runs as a daemon and picks up beanstalk jobs. The problem is that for every distinct user/action on the system it opens a new connection to Postgres and then leaves that connection idling, because the daemon doesn't actually stop running, so the connection is never terminated (a quick fix for this was to reset the connection at the end of the script loop, but that is going to be inefficient with many connects).
Anyway, this caused postgres to eventually run out of connections and lock up...
So PGBouncer seems like the answer.
But now when I run it, I see the same db connection multiple times when I do ps ax | grep postgres.
Isn't PGBouncer supposed to only ever have 1 connection open to the DB and route all traffic through that connection? Then open a new connection if that one is full?
At present I have 3 for one database (my access control system) and 2 for the other database (my client-specific data).
To me, it feels like if I roll out these changes I will just be faced with the same problem: the connections will just get eaten up again because they're not being released.
I hope that explains enough for someone to offer any advice.
It sounds like an important step here is to fix the scripts so they release the connections when they're done. Once you do that, PgBouncer will help reduce the connection setup/teardown overhead, but in connection pooling mode it won't give you the ability to maintain more connections to Pg than you otherwise could.
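As an illustration of the "release when done" pattern, here is a sketch in Python with psycopg2 (your daemon is PHP, but the shape is the same; the DSN and the jobs table are placeholders): acquire the connection per job and always give it back, rather than leaving it idling for the daemon's lifetime.

```python
import psycopg2

def handle_job(job_id):
    # Borrow a connection only for the duration of one job.
    conn = psycopg2.connect("host=127.0.0.1 port=6432 dbname=mydb user=myuser")  # via PgBouncer
    try:
        with conn:                    # one transaction: commit on success, rollback on error
            with conn.cursor() as cur:
                cur.execute("UPDATE jobs SET status = 'done' WHERE id = %s", (job_id,))
    finally:
        conn.close()                  # always released, so idle backends don't pile up
```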
However, you can also use PgBouncer in transaction-pooling mode. When used for transaction pooling, PgBouncer keeps a pool of idle backend connections and only assigns one when a client does a BEGIN. The connection is returned to the pool after the client does a COMMIT or ROLLBACK. That means that in transaction pooling mode you can have large numbers of open connections to PgBouncer; they don't each need a corresponding connection to a PostgreSQL backend, so long as most of them are idle at any point in time and don't have any transaction open.
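For reference, a sketch of the relevant pgbouncer.ini settings for this mode (the database names, port and sizes below are placeholders, not taken from your setup):

```ini
[databases]
; Placeholders: map each logical database to the real PostgreSQL server.
accesscontrol = host=127.0.0.1 port=5432 dbname=accesscontrol
clientdata = host=127.0.0.1 port=5432 dbname=clientdata

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; Return the server connection to the pool after each COMMIT/ROLLBACK.
pool_mode = transaction
; Many mostly-idle client connections are fine in this mode ...
max_client_conn = 500
; ... because only this many PostgreSQL backends exist per database/user pair.
default_pool_size = 20
```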
The only real downside of transaction pooling mode is that it breaks applications that expect to SET session-level variables, keep prepared statements across transactions, etc. Applications may need modification, e.g. to use SET LOCAL.
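For example, a small sketch (assuming psycopg2 and a placeholder table) of keeping such settings compatible with transaction pooling by scoping them to one transaction with SET LOCAL:

```python
import psycopg2

conn = psycopg2.connect("host=127.0.0.1 port=6432 dbname=mydb user=myuser")  # via PgBouncer
try:
    with conn:                                                  # one transaction: BEGIN ... COMMIT
        with conn.cursor() as cur:
            cur.execute("SET LOCAL statement_timeout = '5s'")   # expires with this transaction
            cur.execute("SELECT count(*) FROM some_table")      # placeholder table
            print(cur.fetchone()[0])
finally:
    conn.close()
```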