When to increase VARNISH_STORAGE_SIZE?


I use Varnish 4 on CentOS 6.6. Storage is set to malloc, with a storage size of 1G.
How can I check whether that is enough, or whether I should increase the storage size? I know there is the varnishstat -1 command, but I'm not entirely sure what to look for. I would like to avoid a situation where 1 GB of storage is not enough.

You are looking for n_lru_nuked (shown as MAIN.n_lru_nuked in Varnish 4). If it is larger than zero, Varnish had to evict old objects before their TTL expired to make space for new ones. That points either to bad cache design (for example, caching everything for 10 years) or simply to not enough space.
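A minimal sketch of how that check could be scripted, assuming the `varnishstat -1` output has the counter name in the first column and its value in the second. The helper names and the sample text are made up for illustration; they are not real output from any server.

```python
import subprocess


def parse_counter(stat_output: str, name: str = "n_lru_nuked"):
    """Return the integer value of a counter from `varnishstat -1` text, or None."""
    for line in stat_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # Varnish 4 prefixes counters with "MAIN."; accept the bare name too.
        if fields[0] in (name, "MAIN." + name):
            return int(fields[1])
    return None


def read_varnishstat() -> str:
    """Run `varnishstat -1`; this needs a running varnishd on the host."""
    result = subprocess.run(
        ["varnishstat", "-1"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Hypothetical sample text, used so the sketch runs without Varnish installed.
SAMPLE = """\
MAIN.n_object            1523          .   object structs made
MAIN.n_lru_nuked           42         0.00 Number of LRU nuked objects
"""

nuked = parse_counter(SAMPLE)
if nuked:
    print("objects were evicted early:", nuked)
else:
    print("no early evictions seen")
```

On a real server you would pass `read_varnishstat()` instead of `SAMPLE`, and watch whether the value keeps growing over time rather than reading it once.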

Related

What is the downside to increase shared buffer in PostgreSQL

I've noticed a significant performance drop when querying PostgreSQL if the data is not already loaded into shared_buffers; the difference can be almost 100 times. So, while optimizing the query, I wondered whether there is any way to improve performance by increasing shared_buffers.
Then I started to investigate shared_buffers in PostgreSQL, and I found that the recommended value is 25% of the system memory, with PostgreSQL relying on the OS cache to accelerate queries. But from what I've seen with my own database, reading from disk versus shared_buffers makes a huge difference, so I would like queries to be served from shared_buffers most of the time.
So I wondered: what is the downside of increasing shared_buffers in PostgreSQL? What if I only increase it on my read-only instance?

A downside of increasing the buffer cache is double buffering. When PostgreSQL needs to read a page into shared_buffers, it might first have to evict an existing page to make room for it. But the OS cache might then need to evict a page of its own to make room to read the page from the actual disk. You end up with the same page held in both places, which wastes cache space. As a result, instead of reading a page from the OS cache, you are more likely to have to read it from the actual disk, which is far slower. From a double-buffering perspective, you probably want shared_buffers to be either much less than half of the system RAM (using the OS cache as the main cache) or much more than half (using shared_buffers as the main cache).
Another downside is that if it is too large, you might start to get out-of-memory errors or invoke the OOM killer or otherwise destabilize the system.
Another problem is that after some operations, like DROP TABLE, TRUNCATE, or the ending of a COPY in some circumstances, PostgreSQL needs to invalidate a lot of buffers and chooses to do so by scouring the entire buffer cache. If you do a lot of those operations, that time can really add up with large buffer cache settings.

Some workloads (I know about DROP TABLE, but there may be others) perform better with a smaller shared_buffers. But essentially, it is a matter of trial and error (or better yet: reproducible performance tests).
If you can make shared_buffers big enough that it can hold everything you need from the database, that is probably a good choice.

dealing with full mongodb-atlas database

I have set up a free-tier MongoDB Atlas database and have a script that stores tweets in it. db.collection.stats() reports a storage size of 32768, which will fill up quite fast. Firstly, what happens when you exceed this limit? Are new entries rejected, or something else? Secondly, is there a way to deal with this without upgrading? For example, is it possible to clear entries before exceeding capacity?

When you exceed the limit, the Atlas cluster node that exceeded it will become unavailable. The entire cluster may go down, in which case you will need to contact MongoDB support to bring it back up.
The best option is to upgrade to the next tier for more storage capacity. If you don't want to do that, you can write a script that deletes old data from your cluster; after deleting the data, make sure to run the compact command to reclaim the storage.
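As a rough illustration of the "delete old data" idea, here is a sketch of only the selection step: given each document's id, size and creation time, pick the oldest ones to remove until the total fits under a budget. The tuple layout, the `ids_to_prune` name and the sample values are assumptions for the example; the actual delete against the cluster and the compact step are not shown.

```python
from typing import Iterable, List, Tuple

# (document id, size in bytes, creation timestamp); a hypothetical layout.
Doc = Tuple[str, int, float]


def ids_to_prune(docs: Iterable[Doc], max_bytes: int) -> List[str]:
    """Return ids of the oldest documents to delete so the total fits in max_bytes."""
    ordered = sorted(docs, key=lambda d: d[2])  # oldest first
    total = sum(size for _, size, _ in ordered)
    doomed = []
    for doc_id, size, _ in ordered:
        if total <= max_bytes:
            break
        doomed.append(doc_id)
        total -= size
    return doomed


tweets = [("a", 400, 1.0), ("b", 300, 2.0), ("c", 500, 3.0)]
print(ids_to_prune(tweets, 900))  # ['a']
```

Running the selection well before the limit is reached (for example at 80% of capacity) leaves headroom between scheduled runs.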

GridFS disk management

In my environments I can have DBs of 5-10 GB or DBs of 10 TB (video recordings).
Focusing on the 5-10 GB case: if I keep the default settings for preallocation and smallfiles, I can actually lose 20-40% of the disk space to allocations.
In my production environments the disk size can be 512 GB, but the user can limit the DB allocation to only 10 GB.
To implement this, I have a scheduled task that deletes old documents from the DB when the DB's dataSize reaches a certain threshold.
I can't use a capped collection (GridFS, sharding limitations, cannot delete arbitrary documents...), and I can't use the --noprealloc/--smallfiles flags, because I need file inserts to be efficient.
So what happens is this: if dataSize gets to 10 GB, fileSize will be at least 12 GB, so I need to take that into consideration and lower the threshold by 2 GB (and lose a lot of disk space).
What I actually want is to tell mongo to preallocate all of the 10 GB the user requested, and to disable further preallocation.
For example, running mongod with --noprealloc and --smallfiles, but preallocating all 10 GB in advance.
Another protection I gain here is against sudden disk-full errors. If the user regularly downloads Game of Thrones episodes to the same drive, they can't take space from the DB's 10 GB, since it's already preallocated.
(using C# driver)

I think I found a solution: you might want to look at the --quota and --quotaFiles command line options. In your case, you might also want to add the --smallfiles option. So
mongod --smallfiles --quota --quotaFiles 11
should give you a size of exactly 10224 MB for your data, which, adding the default namespace file size of 16 MB, equals your target size of 10 GB, excluding indices.

The following applies to regular collections as per documentation. But since metadata can be attached to files, it might very well apply to GridFS as well.
MongoDB stores data in what is called a record. A record consists of two parts: the actual data and something called "padding". The padding is basically unused space that is consumed if the document grows in size. The reason is that, to keep queries fast, a document (or a file chunk in GridFS) is never fragmented. Without padding, a document or file chunk that grows would have to be moved to a different location in the data file(s) every time it is modified, which can be very costly in terms of IO and time. With the default settings, growth uses up the padding instead, which reduces the need to move data around within the data files and thereby improves performance. Only if the growth exceeds the preallocated padding is the document or file chunk moved within the data file(s).
The default strategy for preallocating padding space is "usePowersOf2Sizes", which rounds the document size up to the next power of two and preallocates that much for the document. Say we have a 47-byte document: the usePowersOf2Sizes strategy would preallocate 64 bytes for it, resulting in 17 bytes of padding.
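The rounding can be sketched in a few lines. This only illustrates the "next power of two" arithmetic described above; it ignores any record header overhead or minimum record size the server may apply, and the function name is made up for the example.

```python
def power_of_2_allocation(doc_size: int) -> int:
    """Round a document size in bytes up to the next power of two."""
    if doc_size < 1:
        raise ValueError("doc_size must be positive")
    # (n - 1).bit_length() is the exponent of the smallest power of two >= n.
    return 1 << (doc_size - 1).bit_length()


for size in (47, 400):
    allocated = power_of_2_allocation(size)
    print(size, allocated, allocated - size)
# 47 64 17
# 400 512 112
```

A size that is already a power of two is kept as is under this rule, so a 64-byte document gets no padding at all in this simplified model.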
There is another preallocation strategy, however, called "exactFit". It determines the padding space by multiplying the document size by a dynamically computed "paddingFactor". As far as I understand, the padding factor is derived from the average document growth in the respective collection. Since we are talking about static files in your case, the padding factor should stay at its minimum (no extra padding), so there should no longer be any "lost" space.
So I think a possible solution would be to change the allocation strategy for both the files and the chunks collection to exactFit. Could you try that and share your findings with us?

Does MongoDB reuse deleted space?

First off, I know about this question:
Auto compact the deleted space in mongodb?
My question is not about shrinking DB file sizes though, but more about the reuse of deleted space. Say I have 100K documents in a collection, I then delete 50K of those. Will Mongo reuse the space within its data file that the deleted documents have freed? Or are they simply "marked" as deleted?
I don't care so much about the actual size of the file on disk; it's more about "does it just grow and grow".

Update (Mar 2015): As of the 3.0 release, there are multiple storage engines available in MongoDB. This answer applies to the MMAP storage engine (still the default in MongoDB 3.0), the answer for other engines (WiredTiger for example) is quite different and may well be tunable and adjustable. Hence if you are using another engine, please read the relevant docs for that storage engine to determine what your space re-use defaults and options are.
With the MMAP storage engine, when documents are deleted the space left behind is put into a free list. However, to use the space there will need to be similarly sized documents inserted later, and MongoDB will need to find an appropriate space for that document within a certain time frame (once it times out looking at the list, it will just append) otherwise the space re-use is not going to happen very often. This deletion is done within the data files, so there is no disk space reclamation happening here - all of this is done internally within the existing data files.
If you subsequently do a repair, or resync a secondary from scratch, the data files are rewritten and the space on disk will be reclaimed (any padding on docs is also removed). This is where you will see actual space reclamation on-disk. For any other actions (compact included) the on disk usage will not change and may even increase.
With 2.2+ you can now use the collMod command and the usePowersOf2Sizes option to make the re-use of deleted space more likely (note that this is the default in 2.6+). This means that the initial space allocation for a document is a bit less efficient (512 bytes for a 400 byte doc for example) but means that when a new doc is inserted it is more likely to be able to re-use that space. If you are deleting (or growing and hence moving) documents a lot, then this will be more efficient in the long term.
For anyone that is interested, one of the people that wrote a lot of the storage code (Mathias Stearn) has a great presentation about the storage internals, which can be found here

How much can SQLite store on the iPhone?

I have an idea for a webapp for the iPhone, but I don't know how much data can be stored in mobile Safari's SQLite DB. I tried searching through the Apple docs but found nothing:
Safari Client-Side Storage and Offline Applications Programming Guide: Using the JavaScript Database

Most of these answers are totally wrong. Safari will not allow you to create SQLite databases over 50MB (or expand existing databases beyond that size).
This is a limit imposed by Safari - as other people have noted, SQLite itself supports much larger databases that you can use from native apps. But webapps are limited to 50MB.
It might be useful to note that this is per database - if you really need the extra space, you can create multiple databases, although this would obviously cause a lot of hassle.

It's as the other posters say. You're only limited by the drive space on the device.
You also need to consider your in-memory footprint, though. There is a finite amount of memory on the iPhone, and in general it's quite small, so the amount of data/hydrated objects you'll be able to have in memory is another potential limitation for your app.

There are a LOT of people answering that have clearly never tested it. I am on the latest version of iOS (4.3.3) and have set up a system to create multiple databases and keep them under 45 MB but found that the 50 MB cap is for the site as a whole. So, no matter how much you split the data up, it still restricts it to an aggregated cap of 50 MB.

The database size limit on mobile Safari is 50 MB per site, not per database. I have tested this: even if you have an extra empty database, you cannot add to it once the total size of all databases on a single site is 50 MB.
What's worth noting as well is that characters are saved as double bytes in WebSQL; that is, 2 million characters will take 4 megabytes on disk, not 2.
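To make the double-byte arithmetic concrete, here is a small sketch that models "double bytes" as UTF-16, which is an assumption about the on-disk encoding rather than something the answer states. Under UTF-16, characters outside the Basic Multilingual Plane would take four bytes, so the figure is a lower bound for such text.

```python
def utf16_size_bytes(text: str) -> int:
    """Size of text in bytes when each code unit takes two bytes (UTF-16, no BOM)."""
    return len(text.encode("utf-16-le"))


sample = "a" * 2_000_000
print(utf16_size_bytes(sample))      # 4000000
print(len(sample.encode("utf-8")))   # 2000000
```

When budgeting against a 50 MB cap, that puts the ceiling at roughly 25 million characters of plain text, before any database overhead.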

You are only limited by the amount of free space on the device.

I'm not sure. If you were doing your own application you'd be limited by free space on the device and to some extent in memory footprint (as Bryan McLemore points out).
However, since you're looking at using JavaScript inside of Safari, there's no easy way to tell. According to the document you found, it looks like it may be limited by site, but there's nothing telling you how much. I'd suggest writing a quick script to fill up the database and figure out how much it actually is. After that, I'd probably halve that value and assume I'd always be able to use that much.
Be sure to report back so we'll all know!

It's most likely 32 terabytes... which is well over the available disk space.
I reached this number by multiplying the maximum page size by the maximum page count listed at the bottom of the SQLite limits page.

Limits In SQLite
"Limits" in the context of this article means sizes or quantities that can not be exceeded. We are concerned with things like the maximum number of bytes in a BLOB or the maximum number of columns in a table.
SQLite was originally designed with a policy of avoiding arbitrary limits. Of course, every program that runs on a machine with finite memory and disk space has limits of some kind. But in SQLite, those limits were not well defined. The policy was that if it would fit in memory and you could count it with a 32-bit integer, then it should work.
Unfortunately, the no-limits policy has been shown to create problems. Because the upper bounds were not well defined, they were not tested, and bugs (including possible security exploits) were often found when pushing SQLite to extremes. For this reason, newer versions of SQLite have well-defined limits and those limits are tested as part of the test suite.
As of version 3.6.19 (all statistics in the report are against that release of SQLite), the SQLite library consists of approximately 65.7 KSLOC of C code. (KSLOC means thousands of "Source Lines Of Code" or, in other words, lines of code excluding blank lines and comments.) By comparison, the project has 690 times as much test code and test scripts - 45409.7 KSLOC.

The default storage limit on iPhone seems to be 5 MB.

davibe has done some work to raise the limit up to 1GB with his PhoneGap plugin.
https://github.com/davibe/Phonegap-SQLitePlugin
The plugin calls the native sqlite3 API, with a wrapper on the Javascript side.
The relevant statements extracted from sqlite.js are:
update origins set quota = '999999999999' where origin = 'file__0';
"update databases set estimatedSize = '999999999999' where name = '" + dbName + "';";

Caution: my iphone is jailbroken! But I don't suspect that this changes anything.
The limit of 50MB is no longer correct.
On my iPhone 4S with iOS 6.1 I have a database of 58.66 MB (448496 records) for my webclip (website pinned to the springboard).
No special tricks, just standard HTML5 usage.

Maximum Database Size
Please refer to the official SQLite site.
Every database consists of one or more "pages". Within a single database, every page is the same size, but different databases can have page sizes that are powers of two between 512 and 65536, inclusive. The maximum size of a database file is 2147483646 pages. At the maximum page size of 65536 bytes, this translates into a maximum database size of approximately 1.4e+14 bytes (140 terabytes, or 128 tebibytes, or 140,000 gigabytes or 128,000 gibibytes).
This particular upper bound is untested since the developers do not have access to hardware capable of reaching this limit. However, tests do verify that SQLite behaves correctly and sanely when a database reaches the maximum file size of the underlying filesystem (which is usually much less than the maximum theoretical database size) and when a database is unable to grow due to disk space exhaustion.
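The arithmetic behind that figure is easy to reproduce. The sketch below simply multiplies the page count and page size quoted above; the constant and function names are chosen for the example.

```python
MAX_PAGE_COUNT = 2_147_483_646  # maximum pages per database file, as quoted above
MAX_PAGE_SIZE = 65_536          # maximum page size in bytes, as quoted above


def max_db_bytes(page_count: int = MAX_PAGE_COUNT, page_size: int = MAX_PAGE_SIZE) -> int:
    """Theoretical maximum database size in bytes."""
    return page_count * page_size


size = max_db_bytes()
print(size)                     # 140737488224256
print(round(size / 1e12, 1))    # 140.7 (terabytes)
print(round(size / 2**40, 1))   # 128.0 (tebibytes)
```

The same function with other limits shows how an earlier answer could arrive at a different number: the result depends entirely on which page size and page count maximums applied to the SQLite version in question.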
