Why does memcached impose a 30-day limit on the lifetime of cache entries?
In my system, I am always setting the lifetime to be 30 days, since that's the max allowed value. Setting it to a value much greater than 30 days would be ideal for my app.
Is there a way to change the "30-day" value to something else?
I am considering downloading the memcached source and recompiling it for my own use. I would either change the "30" to "300" or perhaps get rid of that check entirely. If I were to do this, would I be changing something that would cause memcached to malfunction or perform poorly? My expectation is that items would be allowed to remain in the cache for longer, and that items would be removed from the cache when it gets full.
30 days is the limit at which we consider the time you specified to be a TTL from now.
If you want longer than 30 days, it's fine, just use an absolute time (time() + whatever).
If you want no time-based expiration, as ConroyP says, just use 0.
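To make the rule concrete, here is a minimal sketch of how a client could pick the expiry value to send. The helper name `exptime_for` and its `now` parameter are made up for illustration; the only thing it relies on is the cutoff described above (values up to 30 days are read as seconds from now, larger values as an absolute Unix timestamp, and 0 as "no time-based expiration").

```python
import time

# 30 days in seconds: the cutoff between "relative" and "absolute" expiry values
THIRTY_DAYS = 60 * 60 * 24 * 30


def exptime_for(lifetime_seconds, now=None):
    """Return the expiry value to send for a desired lifetime (hypothetical helper)."""
    if lifetime_seconds == 0:
        return 0  # no time-based expiration
    if lifetime_seconds <= THIRTY_DAYS:
        return lifetime_seconds  # read by the server as seconds from now
    # Longer than 30 days: send an absolute Unix timestamp instead
    now = int(time.time()) if now is None else now
    return now + lifetime_seconds
```

So a 300-day lifetime would be passed as `exptime_for(300 * 86400)`, with no need to recompile anything.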
30 days is the maximum length of time for which you can specify a relative expiry, but if you're thinking of eliminating the expiry check altogether, would it not be simpler to set the expiry time to 0? This should mean that the data is stored until the cache is full and it is evicted to make room for newer items.
From the PHP Memcache docs:
Parameter expire is expiration time in seconds. If it's 0, the item never expires (but memcached server doesn't guarantee this item to be stored all the time, it could be deleted from the cache to make place for other items).
Hello, I am implementing a simple garbage collector over memcached in Perl.
I want to delete all rows (each key maps to a serialized (payload, date) value) dated before or after a given date.
What is the most efficient implementation? Get all the data and then check the date in a for loop? (The data could get very big, so I think that would be slow and inefficient.)
Any other ideas or opinions?
Thanks, Cospel
You cannot iterate over memcached keys efficiently (there is no "good" way to do that). The best solution is to set a proper expiry on each entry, so entries expire and are deleted automatically. It is also good to delete a key as soon as it is no longer needed.
Internally memcached uses LRU, so when no memory is available, the least recently used items are discarded. These can be entries with a long TTL (expiry time), so that is probably a parameter to tune for your needs.
I have a web application that would perform well if some database operations were cached. The data is static, and new data is added every day. To reduce database reads I'll be using memcached.
What will happen if I don't give an expiry time for the data I put in memcached? Will it affect performance by consuming more RAM? Is it a good idea to omit the expiry time when adding data to the cache?
PS: We use AWS to deploy the web app with nginx, PHP, and MySQL.
Presumably when your app is still running in the year 2050, some things that you put in cache way back in 2012 will no longer be relevant. If you don't provide some sort of expiration, only a cache reset (e.g. server restart) will end up flushing the cache.
Unless you have infinite memory (and I'm pretty sure AWS doesn't provide that ;-) it is wise to add some expiration time to cached items.
Even though memcached will evict items based on a least recently used mechanism (thanks @mikewied for pointing that out), the cache will still fill entirely before memcached begins evicting items based on LRU. Unfortunately, memcached's LRU algorithm is per slab, not global. This means that LRU-based evictions can be less than optimal. See Memcached Memory Allocation and Optimization.
When the available memory in memcache is full, memcache uses the LRU (least recently used) algorithm to free memory.
My question is: will the LRU algorithm delete entries that have not been used for some time (least recently used) in preference to expired items?
Expiring entries are not deleted at that exact moment but the next time someone tries to access them (AFAIR). So will the LRU algorithm (also) account for the expiry of keys?
To understand how memcached does LRU you must go deeper and understand how memcached stores items. Items are stored according to their size: simply put, all your items that are, let's say, 100k get stored in the same slab, while items that are 200k are stored in a different slab.
When memory gets full and you try to store a 100k item, memcached will apply LRU within that slab. If there are expired or unused keys in the 200k slab, they remain there, while if the 100k slab holds only hot keys, one of those will be evicted according to the algorithm.
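A toy model may make the per-slab behaviour easier to see. This is only a sketch and not how memcached is implemented: the class `SlabCache`, its size classes, and the fixed item count per class are all invented for illustration, and it ignores expiry entirely. It just keeps one LRU list per size class and evicts only within the class being written to.

```python
from collections import OrderedDict


class SlabCache:
    """Toy model: one LRU list per size class, fixed item capacity per class."""

    def __init__(self, size_classes, items_per_class):
        self.size_classes = sorted(size_classes)
        self.capacity = items_per_class
        self.slabs = {size: OrderedDict() for size in self.size_classes}
        self.where = {}  # key -> size class currently holding it

    def _class_for(self, nbytes):
        # Smallest size class that can hold the item
        for size in self.size_classes:
            if nbytes <= size:
                return size
        raise ValueError("item too large for any slab class")

    def delete(self, key):
        size = self.where.pop(key, None)
        if size is not None:
            del self.slabs[size][key]

    def set(self, key, value):
        """Store value; return the key evicted to make room, or None."""
        self.delete(key)
        size = self._class_for(len(value))
        slab = self.slabs[size]
        evicted = None
        if len(slab) >= self.capacity:
            # Evict the least recently used item of THIS class only
            evicted, _ = slab.popitem(last=False)
            del self.where[evicted]
        slab[key] = value
        self.where[key] = size
        return evicted

    def get(self, key):
        size = self.where.get(key)
        if size is None:
            return None
        slab = self.slabs[size]
        slab.move_to_end(key)  # mark as most recently used
        return slab[key]


cache = SlabCache(size_classes=[100, 200], items_per_class=2)
cache.set("big-idle", b"x" * 150)    # lands in the 200-byte class, never read again
cache.set("a", b"x" * 80)
cache.set("b", b"x" * 80)
cache.get("a")                       # "a" is now hotter than "b"
evicted = cache.set("c", b"x" * 80)  # the 100-byte class is full
```

In this run `evicted` is `"b"`, a fairly recent key, while the idle `"big-idle"` item survives because it lives in a different class.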
Back to your question, when memory is full and you try to store an item, memcached will look first for expired items in the slab you are trying to write to, then look for the least used items. So yes, it does take into account the expiry of the keys, or better yet, expired keys go first before LRU.
Also, when you try to get an item which is past its expiration date, that item is evicted and the memory reclaimed.
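The "reclaimed on get" behaviour can be sketched like this. Again this is a toy model under my own assumptions, not memcached's code: `LazyExpiryCache` and the injected `clock` function are hypothetical, and the clock is passed in only so the example can be run without waiting.

```python
class LazyExpiryCache:
    """Toy model: expired entries are only removed when someone asks for them."""

    def __init__(self, clock):
        self.clock = clock  # function returning the current time in seconds
        self.items = {}     # key -> (value, expires_at or None)

    def set(self, key, value, ttl=0):
        # ttl == 0 means no time-based expiration
        expires_at = self.clock() + ttl if ttl else None
        self.items[key] = (value, expires_at)

    def get(self, key):
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self.clock() >= expires_at:
            del self.items[key]  # reclaimed lazily, at access time
            return None
        return value


now = [1000]
lazy = LazyExpiryCache(clock=lambda: now[0])
lazy.set("k", "v", ttl=10)
now[0] = 1010                      # the entry is past its expiry...
still_stored = "k" in lazy.items   # ...but nothing has removed it yet
value_after_expiry = lazy.get("k") # the get notices and drops it
```

Until the `get`, the expired entry still occupies its slot, which is why expiry alone does not free memory at the exact moment an item expires.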
More details (there is plenty on Google about memcached memory allocation, which also explains LRU, so there is lots to read on this):
http://returnfoo.com/2012/02/memcached-memory-allocation-and-optimization-2/
http://www.adayinthelifeof.nl/2011/02/06/memcache-internals/
And a really nice tool, which I recommend on every memcached topic:
http://code.google.com/p/phpmemcacheadmin/
Hope it helps!
From what I know this statement is not true.
"Back to your question, when memory is full and you try to store an item, memcached will look first for expired items in the slab you are trying to write to, then look for the least used items. So yes, it does take into account the expiry of the keys, or better yet, expired keys go first before LRU."
Memcache will evict items according to LRU: it doesn't matter whether there are expired items, as long as they were used more recently than another key (even a valid one).
Tested a while ago on Memcache 1.4.4.
I've used memcache before and decided to try out APC. I'm having problems with it actually reading values and respecting expiry times. I can set a 10-minute expiry on a piece of data. I refresh the page, which runs a MySQL query and caches the result under a key. On the next load, it checks whether the key is set, and if it is, it grabs the data from the cache instead of the DB. Except it doesn't always do that: it still runs the query about half the time, regardless of whether the key is set. The keys that are set don't always expire when they are supposed to, either. And the command that deletes a key from the cache doesn't always do that either.
I haven't had these problems with memcache, which performed like clockwork.
Make sure APC isn't full -- it's possible that your keys are being pushed out of memory. The default configuration on many systems only allocates 32 megabytes, which is actually extremely easy to fill with PHP bytecode alone.
The best way to gain visibility into your APC cache utilization is via the apc.php script that ships with APC.
We are trying to update memcached objects when we write to the database to avoid having to read them from database after inserts/updates.
For our forum post object we have a ViewCount field containing the number of times a post is viewed.
We are afraid that we are introducing a race condition by updating the memcached object, as the same post could be viewed at the same time on another server in the farm.
Any idea how to deal with this kind of issue? It would seem that some sort of locking is needed, but how can it be done reliably across servers in a farm?
If you're dealing with data that doesn't necessarily need to be updated realtime, and to me the view count is one of them, then you could add an expires field to the objects that are stored in memcache.
Once that expiration happens, it'll go back to the database and read the new value, but until then it will leave it alone.
Of course for new posts you may want this updated more often, but you can code for this.
Memcache only stores one copy of your object in one of its instances, not in many of them, so I wouldn't worry about object locking or anything. That is for the database to handle, not your cache.
Edit:
Memcache offers no guarantee that when you're getting and setting from varied servers that your data won't get clobbered.
From memcache docs:
A series of commands is not atomic. If you issue a 'get' against an item, operate on the data, then wish to 'set' it back into memcached, you are not guaranteed to be the only process working on that value. In parallel, you could end up overwriting a value set by something else.
Race conditions and stale data
One thing to keep in mind as you design your application to cache data, is how to deal with race conditions and occasional stale data.
Say you cache the latest five comments for display on a sidebar in your application. You decide that the data only needs to be refreshed once per minute. However, you neglect to remember that this sidebar display is rendered 50 times per second! Thus, once 60 seconds rolls around and the cache expires, suddenly 10+ processes are running the same SQL query to repopulate that cache. Every time the cache expires, a sudden burst of SQL traffic will result.
Worse yet, you have multiple processes updating the same data, and the wrong one ends up updating the cache. Then you have stale, outdated data floating about.
One should be mindful about possible issues in populating or repopulating our cache. Remember that the process of checking memcached, fetching SQL, and storing into memcached, is not atomic at all!
I'm thinking: could a solution be to store the view count separately from the Post object, and then do an INCR on it? Of course this would require reading 2 separate values from memcached when displaying the information.
Individual memcached operations are atomic. The server serves each command completely before the next one touches the same item, so there's no need for locking around a single command.
Edit: memcached has an increment command, which is atomic. You just have to store the counter as a separate value in the cache.
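To show why the separate counter matters, here is a small in-process stand-in for the cache. It is a sketch only: `ToyCache` and the key name `post:42:views` are made up, and a lock plays the role of the server handling one command at a time. Each single command is atomic, but a get followed by a set from two clients can still lose an update, while two increments cannot.

```python
import threading


class ToyCache:
    """In-process stand-in: each single command runs atomically under a lock."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def incr(self, key, delta=1):
        with self._lock:
            if key not in self._data:
                return None  # incr does not create a missing counter
            self._data[key] += delta
            return self._data[key]


KEY = "post:42:views"  # hypothetical key name
cache = ToyCache()

# Two web servers do get, add one, set. Both read before either writes.
cache.set(KEY, 10)
seen_by_a = cache.get(KEY)
seen_by_b = cache.get(KEY)
cache.set(KEY, seen_by_a + 1)
cache.set(KEY, seen_by_b + 1)
total_with_get_set = cache.get(KEY)  # one view has been lost

# The same two views recorded with the atomic increment instead.
cache.set(KEY, 10)
cache.incr(KEY)
cache.incr(KEY)
total_with_incr = cache.get(KEY)
```

With get-then-set the counter ends at 11 even though two views happened; with `incr` it ends at 12.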
We encountered this in our system. We modified get so that:
If the value is unset, it sets it with a flag ('g') and [8] second TTL, and returns false so the calling function generates it.
If the value is not flagged (!== 'g') then unserialize and return it.
If the value is flagged (==='g') then wait 1 second and try again until it's not flagged. It will eventually be set by the other process, or expired by the TTL.
Our database load dropped by a factor of 100 when we implemented this.
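// $m is the cache client, passed in because it is not in scope inside the function.
// Assumes the three-argument set($key, $value, $expiration) form.
function get($m, $key) {
    $value = $m->get($key);
    if ($value === false) {
        // Miss: flag the key for 8 seconds so other callers wait instead of hitting the DB
        $m->set($key, 'g', 8);
        return false; // the caller generates the value and stores it
    }
    while ($value === 'g') {
        // Another process is generating the value; poll until it is set or the flag expires
        sleep(1);
        $value = $m->get($key);
    }
    return $value; // false here means the flag expired and the caller must regenerate
}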