I wasn't able to find the answer in the configuration manual, here, or on Google. Are HAProxy stick-tables preallocated at the time they are declared, or does the table grow over time (up to the max size specified)?
We are using Amazon RDS to host our PostgreSQL databases. Our production instance (db.t3.xlarge, Single-AZ) was running smoothly until suddenly Read IOPS, Read Latency, Read Throughput and Disk Queue Depth metrics in the AWS console increased rapidly and stayed high afterward (with a lower variability) whereas Write IOPS and Write Throughput were normal.
[Graphs: Read IOPS, Read Throughput, Disk Queue Depth, Write IOPS]
There were no code changes or deployments on the date of the increase. There were no significant increases in user activity either.
About our DB structure: we have a single table that holds all of our data, with these fields: id as UUID (primary key), type as VARCHAR, data as JSONB (holds the actual data), and createdAt and updatedAt as timestamp with time zone. Most of our data values are larger than 2 KB, so most of them are stored in the TOAST table. We have 20 (BTREE) indexes created on frequently used fields in the JSONB.
So far we have tried VACUUM ANALYZE and also completely rebuilding our table: creating a new table, copying all data from the old table, creating all indexes. They didn't change the behavior.
We also tried increasing storage, which also increases IOPS performance. It helped a bit, but it is still not the same as before.
What could be the root cause of this problem? How can we fix it permanently (without increasing storage or instance type)? For now, we are looking for easy changes and we will improve our data model in the future.
T3 instances are not suitable for production. Try moving to another family like a C or M type. You may have hit some burst limits that are now causing odd behaviour.
Why should I decrease max_connections in PostgreSQL when I use PgBouncer? Will there be a difference if I set max_connections in PostgreSQL's config to 100 or to 1000 when I use PgBouncer to limit connections below either?
Each possible connection reserves some resources in shared memory, and some backend private memory is also scaled to it. Reserving this memory when it will never be used is a waste of resources. This was more of an issue in the past, when shared memory resources were much more fiddly than they are on modern OS.
Also, there is some code which needs to iterate over all of those resources, possibly while holding locks, so it takes more time to do that if there is more data to iterate over. The exact nature of the iteration and the locks has changed from version to version, as code was optimized to make it more scalable to large numbers of CPUs.
Neither of these effects is likely to be huge when most of the possible connections are not actually used. Maybe the most important reason to lower max_connections is to get instant diagnosis in case PgBouncer has been misconfigured and is not doing its job correctly.
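To make the "instant diagnosis" point concrete, here is a minimal sketch that compares the worst-case number of server connections PgBouncer could open against what PostgreSQL allows. The function name and the pool sizes are hypothetical, and it assumes one pool per database/user pair and a superuser_reserved_connections value of 3; treat it as an illustration, not a sizing tool.

```python
def server_connection_headroom(max_connections, reserved, pool_sizes):
    """Return spare PostgreSQL connection slots if every PgBouncer pool fills up.

    max_connections: PostgreSQL's max_connections setting
    reserved:        slots kept for superusers (assumed value, see below)
    pool_sizes:      hypothetical per-pool limits configured in PgBouncer
    """
    worst_case = sum(pool_sizes)          # every pool fully used at once
    usable = max_connections - reserved   # slots ordinary clients can take
    return usable - worst_case            # negative means the pools can overrun


# Hypothetical PgBouncer pools: two of 20 connections and one of 10.
pool_sizes = [20, 20, 10]

print(server_connection_headroom(100, 3, pool_sizes))   # 47
print(server_connection_headroom(1000, 3, pool_sizes))  # 947
```

With either setting the pools fit. The difference is that with max_connections at 100, a misconfigured pooler that lets many more connections through is refused quickly by PostgreSQL, whereas at 1000 the problem can go unnoticed for much longer.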
I have a large number of collections being created during high bursts of traffic. I generally delete these collections once I'm done processing the data in them, but during sudden bursts I sometimes run into namespace issues.
Can I increase nssize to handle this, and what values of nssize are OK? By default, it is 16 MB. I increased it to 100 MB and still hit the issue. Can I still increase it without worrying?
Also, I have a lot of databases where the data is around 1 MB but mongo preallocates 64 MB of space. How do I fix this? If I run compact, does it hurt mongo performance?
You can increase the namespace size, up to 2047MB. Each namespace file is per database and the default size should be fine for about 24000 collections.
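As a rough back-of-the-envelope check, the capacity of a namespace file can be estimated from its size. The sketch below assumes about 628 bytes per namespace entry (the figure given in the MongoDB limits documentation for this storage format) and that every index counts as a namespace alongside its collection; the function names are made up for illustration.

```python
NAMESPACE_ENTRY_BYTES = 628  # assumption: size of one entry in the .ns file


def max_namespaces(nssize_mb):
    """Upper bound on namespaces (collections plus indexes) for a given nssize."""
    return (nssize_mb * 1024 * 1024) // NAMESPACE_ENTRY_BYTES


def namespaces_needed(collections, indexes_per_collection):
    """Each collection uses one namespace, plus one per index (including _id)."""
    return collections * (1 + indexes_per_collection)


print(max_namespaces(16))            # raw upper bound for the 16 MB default
print(max_namespaces(100))           # raw upper bound for 100 MB
print(namespaces_needed(10000, 2))   # 10000 collections with 2 indexes each
```

The raw division comes out somewhat above the "about 24000" quoted for the default, so treat the result as an upper bound. Because indexes count too, a burst of collections with several indexes each uses up namespaces faster than the collection count alone suggests.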
What are the issues you're seeing, exactly? Do you have log lines or error messages? The numbers don't look like they should be a problem.
For more about nsSize, see the docs.
As for your second question, please see the link in the first comment as it has a good explanation and links to more info.
In fact, the following is what I really want to ask:
The chunk size is set to 1 MB. When one config server is down, all of the config servers become read-only. The cluster currently has only one chunk. If I want to insert a lot of data into this cluster, more than 1 MB in total, can I insert it successfully?
If yes, does that mean the real chunk size can exceed the configured chunk size?
Thanks!
Short answer: yes you can, but you should fix your config server(s) asap to avoid unbalancing your shards for long.
Chunks are split automatically when they reach their size threshold - stated here.
However, during a config server failure, chunks cannot be split. Even if just one server fails. See here.
Edits
As stated by Sergio Tulentsev, you should fix your config server(s) before performing your insert. The system's metadata will continue to be readonly until then.
As Adam C's link points out, your shard will become unbalanced if you were to perform an insert like you describe before fixing your config server(s).
In the MongoDB documentation for auto-sharding it says: "Sharding is performed on a per-collection basis. Small collections need not be sharded."
Our business has many databases (~100), with many small collections (~30), each with a document count of 1 - 3000. Our DB system is looking at approximately 100,000,000 page views per month.
In that scenario, will sharding ever activate, given that the collections are never big enough even though the DB usage and site traffic are certainly high enough to require load balancing? From the docs I can't seem to find a clear answer.
Whether it makes sense to shard depends a little bit on whether you have mostly writes or reads to the database. Sharding is primarily used for write-scaling, but if you are not doing a lot of writes, then simply using replica sets with "slaveOkay" for the reads might work just as well.
From the numbers that you provided you seem to get about 9 million documents, but are they large documents? If they easily fit in memory, then there is most likely not even going to be a need for replica sets besides for failover capabilities.
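The "about 9 million documents" figure follows directly from the numbers in the question. The sketch below reproduces it and adds two rough estimates; the 1 KB average document size and the 30-day month are assumptions for illustration only, not values from the question.

```python
def total_documents(databases, collections_per_db, docs_per_collection):
    """Upper bound on documents across all databases."""
    return databases * collections_per_db * docs_per_collection


def average_views_per_second(views_per_month, days_per_month=30):
    """Average request rate, assuming traffic is spread evenly over the month."""
    return views_per_month / (days_per_month * 24 * 60 * 60)


def total_size_gib(documents, avg_doc_bytes):
    """Data size in GiB for a given (assumed) average document size."""
    return documents * avg_doc_bytes / 1024**3


docs = total_documents(100, 30, 3000)
print(docs)                                             # 9000000
print(round(average_views_per_second(100_000_000), 1))  # 38.6
print(round(total_size_gib(docs, 1024), 2))             # 8.58, assuming 1 KB docs
```

Under that document-size assumption the whole data set is only a few gigabytes, which is the kind of size that can fit in memory on a single machine. Peak traffic will of course be higher than the monthly average.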
This is hard to answer without knowing more about your use case, but I'll give it a shot.
Are you sure sharding is what you need? What does your insert rate look like?
If you are going to have a static set of data, or even a relatively static set, then you probably don't need to shard; you could simply use more secondaries and enable slaveOK reads. The reads will be distributed to the various secondaries and scale up your read capacity.
If that is not the case, and you do need to shard, then there are options. But first, to explain briefly and at a high level how automatic sharding works:
The mongos process is responsible for splitting and migrating chunks in general. These are two separate operations - splitting and balancing.
Splits occur when the mongos sees that a certain portion of the maximum chunk size has been written, it initiates a split if there is in fact enough data to warrant it. Over time, with enough data written, the number of chunks grows.
Balancing occurs when there is an imbalance of chunks (currently 8 in 2.0, though moving to a more dynamic heuristic in 2.2). The balancer migrates the chunks around the shards until a balance is achieved.
So, you need to be writing enough data relative to the max chunk size (default is 64MB in 2.0) to generate the chunks needed for the balancer to move them around appropriately. If that is not going to happen with your data, then you can look at:
Decreasing the chunk size (has drawbacks too - http://www.mongodb.org/display/DOCS/Sharding+Administration#ShardingAdministration-ChunkSizeConsiderations)
Manually splitting/moving the chunks
For the manual instructions see:
http://www.mongodb.org/display/DOCS/Splitting+Shard+Chunks
http://www.mongodb.org/display/DOCS/Moving+Chunks
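The splitting and balancing behavior described above can be sketched as a toy model. It is not how mongos is actually implemented: it simply assumes that a collection ends up with roughly one chunk per max-chunk-size of data, and that the balancer acts once the difference in chunk counts between shards reaches the threshold of 8 mentioned for 2.0. The function names and the 3 MB collection size are hypothetical.

```python
import math


def estimated_chunks(collection_mb, chunk_mb=64):
    """Rough chunk count for one sharded collection (at least one chunk)."""
    return max(1, math.ceil(collection_mb / chunk_mb))


def balancer_would_migrate(chunks_per_shard, threshold=8):
    """True if the gap between the fullest and emptiest shard reaches the threshold."""
    return max(chunks_per_shard) - min(chunks_per_shard) >= threshold


# A small collection (assumed ~3 MB) never grows past a single 64 MB chunk,
# so there is nothing for the balancer to move.
print(estimated_chunks(3))                 # 1
print(balancer_would_migrate([1, 0]))      # False

# Shrinking the chunk size to 1 MB yields more chunks, but still too few
# to reach the migration threshold on a two-shard cluster.
print(estimated_chunks(3, chunk_mb=1))     # 3
print(balancer_would_migrate([3, 0]))      # False

# A much larger collection produces enough chunks to trigger balancing.
print(estimated_chunks(1000))              # 16
print(balancer_would_migrate([16, 0]))     # True
```

This is why small collections tend to stay on one shard unless the chunk size is reduced considerably or the chunks are split and moved by hand.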