I am trying to build a messaging app in which each message will be inserted into the database. My backend will also hit the database every second to retrieve the latest messages. I am worried that if many users are using the messaging feature, I will quickly hit the database's maximum number of connections and the app will stop working for other users. So, how can I make sure this problem does not happen?
I am also wondering: if I created two (db-g1-small) instances, would I have twice the number of connections (1,000 per instance)? Google's documentation says a (db-g1-small) instance has 1,000 maximum connections.
How can I keep track of the number of connections? What will happen if the number of connections to the database reaches the maximum?
https://cloud.google.com/sql/pricing#2nd-gen-instance-pricing
You shouldn't have a unique connection per user to your database. Instead, your backend should use connection pooling to maintain a consistent number of connections to your instance. You can view some examples of how best to do this on the Managing Database Connections page.
It's incredibly unlikely that you'll need 1,000 open connections. Most applications use far, far fewer for optimal performance. You can check out this article about benchmarking different connection pool sizes.
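As an illustration, a shared pool in a Java backend might look like the minimal sketch below, using HikariCP; the JDBC URL, credentials, and pool size of 10 are placeholders for illustration, not values taken from the question or the Cloud SQL docs:

```java
// Minimal sketch of a single shared connection pool using HikariCP.
// The JDBC URL, credentials, and pool size are hypothetical placeholders.
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class Database {

    private static final HikariDataSource POOL;

    static {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:mysql://10.0.0.5:3306/messages"); // hypothetical instance IP and database name
        config.setUsername("app_user");                           // hypothetical credentials
        config.setPassword("change-me");
        config.setMaximumPoolSize(10);                            // small fixed pool shared by all users
        POOL = new HikariDataSource(config);
    }

    public static HikariDataSource pool() {
        return POOL;
    }
}
```

Every request borrows a connection from this one pool and returns it when done, so the total number of connections to the instance stays fixed no matter how many users are active.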
Related
We have a setup wherein a Database instance is shared between multiple users.
We are trying to implement some form of throttling or rate limiting for a shared PostgreSQL instance so that one user cannot starve the other users by consuming all the resources.
One approach we can think of is adding connection pools and fixing the number of connections we give each tenant.
But one user can still starve all the resources over a few connections. Is there a way to throttle resource usage per connection or per user in PostgreSQL?
No, the Postgres documentation makes it clear that this is not possible using Postgres alone.
It's usually a (very) bad sign if your application allows one user to starve resources from others - it suggests you've got a bottleneck in your application, and that bottleneck will appear when you least want it to.
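If you do go the connection-pool route from the question, a minimal sketch in Java with HikariCP might look like the following; the one-Postgres-role-per-tenant mapping, the JDBC URL, and the per-tenant cap of 5 are assumptions for illustration only. As noted above, this only caps the number of connections per tenant; it does not stop one tenant from saturating CPU or I/O through those few connections.

```java
// Minimal sketch of the "fixed pool per tenant" idea using HikariCP.
// Tenant-to-role mapping, the JDBC URL, and the cap of 5 are hypothetical.
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TenantPools {

    private final Map<String, HikariDataSource> pools = new ConcurrentHashMap<>();

    public HikariDataSource poolFor(String tenant) {
        return pools.computeIfAbsent(tenant, t -> {
            HikariConfig config = new HikariConfig();
            config.setJdbcUrl("jdbc:postgresql://db-host:5432/shared_db");
            config.setUsername(t);              // assumes one Postgres role per tenant
            config.setPassword("change-me");
            config.setMaximumPoolSize(5);       // hard cap on connections per tenant
            return new HikariDataSource(config);
        });
    }
}
```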
I apologise if the question is naive. I wanted to understand what could be a few possible use cases of the live query feature.
Let's say my database state changes, but not every minute (or even every hour). If I execute a live query against my database/class/cluster, I'm not really expecting the callback to be called anytime soon. But, hey, I would still want to be notified when there's a state change.
My need with OrientDB is more along the lines of ElasticSearch's percolator bundled with a publish-subscribe system.
Is live query meant to cater to such use cases too? Or is my understanding of live query very limited? What could be a few possible use cases for the live query feature?
Thanks!
Whether or not Live Queries will be appropriate for your use case depends on a few things. There are several reasons why live queries make sense. A few questions to ask are:
How frequently does the data change?
How soon after the data changes do you need to know about it?
How many different groups of data (e.g. classes, clusters) do you need to deal with?
How many clients are connected to the server?
If the data does not change very often, or if you can wait a set period of time before an update, or if you don't have many clients hitting the DB directly, or if you only have one thing feeding the database, then you might want to just do polling. There is a balance between holding open a connection that you send a message on very infrequently (live queries) and polling too often.
For example, it's possible that you have an application server (Tomcat, Node, etc.) and that your clients connect via web sockets. Now let's say your app server opens one (or a few pooled) live queries to the database, and the database gets an update. That update might just go from the database to the app server (e.g. Node). Node may now be responsible for fanning out that message across 100 web sockets (one for each connected client). In this case, the fact that Node is connected to the database in a persistent way with a live query open is not that big of a deal.
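To make the fan-out concrete, here is a minimal, schematic Java sketch. ClientSession and onDatabaseChange are hypothetical placeholders standing in for your websocket layer and your live-query callback; none of this is OrientDB API:

```java
// Schematic fan-out: one live-query callback on the app server pushes each
// change to every connected websocket client. All names are hypothetical.
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ChangeFanOut {

    /** Stand-in for whatever session object your websocket layer provides. */
    interface ClientSession {
        void send(String message);
    }

    private final Set<ClientSession> clients = ConcurrentHashMap.newKeySet();

    public void register(ClientSession session)   { clients.add(session); }
    public void unregister(ClientSession session) { clients.remove(session); }

    /** Called once per database change by the (single) live-query listener. */
    public void onDatabaseChange(String changeAsJson) {
        for (ClientSession client : clients) {
            client.send(changeAsJson);   // one DB connection, N websockets
        }
    }
}
```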
The question is: if you have thousands of clients connected, do they all need an immediate update? If so, are you planning on having them poll at a short interval? If so, you probably could benefit from a live query. Lots of clients polling at a short interval will generate a lot of unnecessary traffic and queries.
Unfortunately, at the end of the day, the answer is: it depends. You probably need to prototype and then instrument under load to see what your trade-offs are. But in principle, it is less about how frequently updates come, and more about how often you would have clients poll and how many clients you have. If the answer is "short intervals and a lot of clients", give live queries a try.
As far as I know, database connectivity technologies like Entity Framework open and close connections automatically to enhance scalability. (Managing Connections and Transactions)
For example, a form using ASP.NET MVC and Entity Framework will connect to retrieve a record and then will immediately disconnect, remaining disconnected until I modify the data in the controls and save it.
I wonder if the same behavior applies to an Access 2013 form linked via ODBC to SQL Server. Once a record is retrieved, is the connection closed until my next operation, or does the connection remain open until I close the form? Is the behavior configurable?
The mere fact or existence of a connection does not change nor increase scalability for typical applications. So if you have 10, or 1,000, connections and those connections are NOT doing anything, then SQL Server is not doing any work, and hence no increase in scalability will come from closing them in these typical cases.
And OFTEN there is additional chatter over the network to open the connection, pull the data, and close the connection.
Then when you write the data back, you AGAIN have 3 steps: you open the connection, write the data, and then close the connection!
In fact, keeping the connection open means you don't waste network bandwidth opening and closing the connection!
The MAIN reason for disconnected datasets is that they work far more reliably when you have a poor or less-than-ideal connection (such as over the internet, or via Wi-Fi at a coffee shop). In these cases, if the open-connection command fails, then the connection does NOT occur, and you don't pull any data. And if a bit of time delay or a re-try occurs as the connection is re-attempted, then no big deal. So you grab that data and close the connection.
However, this frequent opening and closing, as noted, causes additional overhead. Given how the internet works (as opposed to a typical office network), this disconnected approach is very much the norm for pulling data over the internet, or when using something like Wi-Fi. So the approach is one of expecting that a minor disconnect can and will occur.
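For concreteness, the open/pull/close round trip described above might look like the following in plain JDBC; this is a minimal sketch, and the SQL Server URL and the Invoices table are hypothetical examples:

```java
// Minimal sketch of the 3-step disconnected pattern (open, pull, close) over JDBC.
// The SQL Server URL, credentials, and Invoices table/columns are hypothetical.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class InvoiceLookup {

    public static String fetchCustomer(int invoiceNumber) throws SQLException {
        String url = "jdbc:sqlserver://db-host;databaseName=Sales;user=app;password=change-me";
        String sql = "SELECT CustomerName FROM Invoices WHERE InvoiceNumber = ?";
        // try-with-resources: the connection is opened, the one row is pulled,
        // and the connection is closed again before the method returns.
        try (Connection conn = DriverManager.getConnection(url);
             PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setInt(1, invoiceNumber);
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }
}
```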
The second "common" reason for this 3-step process is that other development platforms "promote" the use of disconnected data because the forms are NOT bound to the actual data tables (or bound to a query). The downside of this disconnected approach is that you thus in general have to write code to pull the data down to the client, and THEN render the data from the recordset object to the form. The result is a TON of additional work to edit data in a form. So expect the typical ASP.NET application to cost 5 or even 10 times as much as writing that application in Access.
The Access bound-form model eliminates the need for the developer to code the data pull, to load that data into some object, and to then close that connection. Once Access establishes a connection to the SQL server, that connection remains open until you shut down the Access application.
Keeping the connection open and active has the advantage of rapid application development due to the bound-forms model. So you DO NOT need to write code to pull data from the server and THEN transfer that data from some type of object into the form.
So the downside to the Access approach has little (if anything) to do with scalability. The downside is that a simple break in the connection is NOT at all well handled by Access.
So if you build a form in Access that is bound to a SQL Server table of, say, 1 million records, and you launch that form with a where clause of InvoiceNumber = 12356, then Access is smart and ONLY pulls down the ONE record from SQL Server. So in terms of scalability and performance, the use of a disconnected system as opposed to the bound, connected model in Access will not result in a performance difference.
However, because Access keeps that connection open, any breakage in that connection will result in an ODBC error. Worse, Access is NOT able to recover from such errors when using bound forms – your only recourse is to restart Access.
It is certainly possible to build un-bound forms in Access, but this is NOT how Access was designed. In fact, if one is going to adopt a disconnected data model, then Access is the wrong tool, since it has no wizards or "developer aids" for such an approach. Say, .NET has wizards built around the disconnected system, and Access has tools built around the connected system. And a LOT of the functionality that makes Access such a great rapid application development tool is lost if you build un-bound forms in Access. Bound Access forms have MANY additional events that you do not find in, say, .NET forms.
So it is not scalability or performance that is lost with the bound-forms approach in Access; simple ease of development is the main feature and gain that the Access development approach results in.
A developer will STILL need to, and should, LIMIT the number of records pulled into a form (by use of the form's "where" clause). (.NET forms don't have such indulgences as a where clause.)
So the major shortcoming is that Access does not recover from ODBC disconnections since it was designed to keep such connections open.
We are evaluating different alternatives for multi-tenancy in our platform. We think that one database per customer is the way to go as data structure and requirements are completely different from one customer to another, and we want to keep them as isolated as possible.
However we are facing the question of how to manage the connection to multiple databases. We don't want to have one app instance per customer. Instead we want to have a pool of app instances handling requests for all our customers and use the correct database depending on the customer.
Our concern is whether keeping connections open to many (maybe thousands of) databases will cause a performance issue. We are mostly worried about memory usage, so we are wondering what the overhead is on the client side when opening a connection to the MongoDB server.
We are also thinking about moving the database access to a different service, which would be responsible for handling the database connections for all customers. In this case, is there an existing tool that allows that kind of "multiplexing" of MongoDB databases?
Some additional notes:
We discarded sharding. It won't fit our needs. We need different databases.
Databases will be on different servers with reserved resources. This means each database runs in its own mongod process, and we need separate connections.
We use the Java driver.
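For what it's worth, with the Java driver the usual pattern is one MongoClient per mongod server (each client maintains its own internal connection pool) and a cheap getDatabase() call per tenant. A minimal sketch, assuming the sync driver (mongodb-driver-sync 4.x); the host names and tenant database names are hypothetical:

```java
// Minimal sketch: one pooled MongoClient per server, shared across all tenants
// whose databases live on that server. Hosts and database names are placeholders.
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoDatabase;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TenantDatabases {

    private final Map<String, MongoClient> clientsByHost = new ConcurrentHashMap<>();

    public MongoDatabase databaseFor(String host, String tenantDbName) {
        MongoClient client = clientsByHost.computeIfAbsent(
                host, h -> MongoClients.create("mongodb://" + h + ":27017"));
        // getDatabase() is cheap: it does not open a new connection,
        // it just routes operations through the client's existing pool.
        return client.getDatabase(tenantDbName);
    }
}
```

The memory cost then scales with the number of servers (one client and pool each), not with the number of tenant databases.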
I am going to have a website with 20k+ concurrent users.
I am going to use MongoDB with one management node and 3 or more nodes for data sharding.
Now my problem is maximum connections. If I have that many users accessing the database, how can I make sure they don't reach the maximum limit? Also, do I have to change anything, maybe on the kernel, to increase the connections?
Basically, the database will be used to keep track of the users connected to the site, so there are going to be heavy read/write operations.
Thank you in advance.
You don't want to open a new database connection each time a new user connects. I don't know if you'll be able to scale to 20k+ concurrent users easily, since MongoDB uses a new thread for each new connection. You want your web app backend to keep open just one to a few database connections and use those as a pool, particularly since web usage is very asynchronous and event-driven.
see: http://www.mongodb.org/display/DOCS/Connections
The server will use one thread per TCP connection, therefore it is highly recommended that your application use some sort of connection pooling. Luckily, most drivers handle this for you behind the scenes. One notable exception is setups where your app spawns a new process for each request, such as CGI and some configurations of PHP.
Whatever driver you're using, you'll have to find out how it handles connections and whether it pools or not. For instance, Node's Mongoose is non-blocking, and so you usually use one connection per app. This is the kind of thing you probably want.
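If you end up on the Java driver, the pool is configured on the single shared client. A minimal sketch, assuming the sync driver (4.x); the host and the maxSize of 100 are illustrative values, not recommendations:

```java
// Minimal sketch: one shared, pooled MongoClient for the whole app.
// The connection string and pool size are illustrative placeholders.
import com.mongodb.ConnectionString;
import com.mongodb.MongoClientSettings;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;

public class PooledClient {

    public static MongoClient create() {
        MongoClientSettings settings = MongoClientSettings.builder()
                .applyConnectionString(new ConnectionString("mongodb://localhost:27017"))
                .applyToConnectionPoolSettings(pool -> pool.maxSize(100)) // cap connections per host
                .build();
        // Create this once at startup and share it; the driver pools internally.
        return MongoClients.create(settings);
    }
}
```

With a fixed cap like this, 20k concurrent site users still translate to at most 100 connections per mongod host from each app server, which is what keeps you away from the server's connection limit.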