Under what conditions can memcached run into an allocation failure?

Each slab class has at least one page, and when there is no memory available to allocate, memcached evicts items from the "tails" list. So why can it end up in the MEMCACHED_MEMORY_ALLOCATION_FAILURE state?

I think this is a libmemcached state, not a memcached state. In any case it would occur when the application requests memory from the underlying allocator (malloc) and malloc returns no heap memory. Since the request for memory cannot be completed you would likely get an error like this since the application would not be able to proceed with your request.

This error occurs when the client calls libmemcached. libmemcached returns it in several cases:
1. A call to realloc, malloc, or calloc fails.
2. A computed buffer size turns out to be smaller than the size requested, as in the following source code segment:
...
new_size= sizeof(char) * (size_t)((adjust * MEMCACHED_BLOCK_SIZE) + string->current_size);
if (new_size < need)
  return MEMCACHED_MEMORY_ALLOCATION_FAILURE;
...
Hopefully the second case can never happen in practice.
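To see when a check like `new_size < need` can actually fire, here is a minimal, self-contained sketch of the same arithmetic. It is not the libmemcached source: `grown_capacity` and `BLOCK_SIZE` are hypothetical names (the latter a stand-in for MEMCACHED_BLOCK_SIZE), and the only point is that unsigned size arithmetic wraps around for absurdly large requests, which leaves a result smaller than `need`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 1024u /* hypothetical stand-in for MEMCACHED_BLOCK_SIZE */

/* Computes the capacity a buffer must grow to so that `need` more bytes fit.
 * Returns 0 and stores the capacity in *new_size, or -1 if the size
 * arithmetic wrapped around (the "allocation failure" case). */
static int grown_capacity(size_t current_size, size_t used, size_t need,
                          size_t *new_size)
{
    assert(used <= current_size);

    size_t free_space = current_size - used;
    if (need <= free_space) {
        *new_size = current_size; /* already fits, no growth needed */
        return 0;
    }

    /* Number of extra blocks, rounded up. */
    size_t adjust = (need - free_space) / BLOCK_SIZE + 1;
    size_t candidate = adjust * BLOCK_SIZE + current_size;

    /* If the multiplication or addition wrapped, candidate is tiny. */
    if (candidate < need)
        return -1;

    *new_size = candidate;
    return 0;
}
```

With ordinary sizes the function just rounds up to the next block; only a request close to SIZE_MAX triggers the failure path, which fits the remark that it should not happen in practice.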

What happens if a thread is in the critical section or entering the critical section?

I am trying to better understand a chapter and am confused about what happens when a thread is in a critical section or is entering one. Could someone explain, or give me an idea of, what the thread goes through in such circumstances? Thank you.
As an example, let's assume that you have an array and multiple threads that read and write to it. If different threads read and write to the array at the same time, they would see inconsistent data, which causes problems. To prevent those problems you protect the array with some kind of lock: before doing anything with the array a thread acquires the array's lock, and when it has finished using the array the thread releases the lock.
For example:
acquire_array_lock();
/** Critical section (code that does something with the array) **/
release_array_lock();
There's nothing special about the code in the critical section. It does whatever it was designed to do (maybe sorting the array, maybe adding up all the numbers in the array, maybe displaying the array, etc) using code that's no different to code that you might use to do the same thing in a single-threaded system without locks.
The only special parts are the code to acquire and release the lock.
There are many types of locks (spinlocks, mutexes, semaphores), but they all have the same fundamental principle - when acquiring it you have something (e.g. a variable) to determine if a thread can/can't continue, then either (if the thread can't continue) some kind of waiting or (if the thread can continue) some kind of change to let others know they need to wait; and when releasing you have something to let others know they can stop waiting.
The main difference between different kinds of locks is the implementation details - what kind of data is used to determine if a thread can/can't continue, and how a thread waits.
For the simplest kind of lock (a spinlock) you might just have a single "yes/no" flag, a little bit like this (but not literally like this):
void acquire_lock(void) {
    while (myLock != 0) {
        // lock is held by another thread: do nothing, then retry
    }
    myLock = 1;
}
void release_lock(void) {
    myLock = 0;
}
However this won't work because two or more threads can see that myLock == 0 at the same time and think they can both continue (and then do the myLock = 1 after it's too late). To fix this you need assembly language or special language support for atomic operations (e.g. a special function for "test and set" or "compare and exchange").
The reason this is called a "spinlock" is that (if a thread needs to wait) it wastes CPU time continually checking ("spinning") to see if it can continue. Instead of doing this (to avoid wasting CPU time), a thread could tell a scheduler not to give it any CPU time until the lock is released; and this is how a mutex works.
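As a rough illustration of the "special language support" mentioned above, here is a minimal spinlock sketch using C11 atomics. The names `my_lock`, `try_acquire_lock`, `acquire_lock` and `release_lock` are illustrative only, and a production lock would normally add back-off or yield while spinning. The key point is that the test and the set happen in one atomic exchange, so two threads can never both see the lock as free.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool my_lock = false; /* false = free, true = held */

/* One attempt: atomically set the flag and learn its previous value.
 * Returns true only for the thread that changed it from free to held. */
static bool try_acquire_lock(void)
{
    return !atomic_exchange_explicit(&my_lock, true, memory_order_acquire);
}

static void acquire_lock(void)
{
    while (!try_acquire_lock()) {
        /* spin: another thread holds the lock, so retry */
    }
}

static void release_lock(void)
{
    /* Releasing a lock that is not held would be a bug in the caller. */
    assert(atomic_load_explicit(&my_lock, memory_order_relaxed));
    atomic_store_explicit(&my_lock, false, memory_order_release);
}
```

The acquire/release memory orders make sure that writes done inside the critical section are visible to the next thread that takes the lock.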

Memory leak issue in (Scala + Akka HTTP) micro service

I have deployed my microservices, written with Akka HTTP (https://github.com/theiterators/akka-http-microservice) and Scala, in a Docker container on an AWS server. Since deploying the service to AWS I have been facing memory leaks and performance problems.
I noticed that memory usage increases as the server receives more requests (340 MB, 410 MB, 422 MB, ...), depending on the load, and then automatically comes back down to its normal level (230 MB). However, memory usage keeps increasing as the server receives more requests, and the service fails to release unused memory even after CPU usage has returned to normal. Eventually it reaches its maximum (512 MB) and crashes.
We could avoid this if the unused memory and resources were released properly. The JVM should manage memory by itself using its garbage collectors, but it fails to clear unneeded objects once a request has been processed. I am using the code below to clear the Akka HTTP actor objects.
try {
  <-- code block -->
} catch {
  case e: Exception =>
    sys.addShutdownHook(system.shutdown())
} finally {
  sys.addShutdownHook(system.shutdown())
}
How can I remove unused memory/resources immediately after the execution process has been completed?
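Please suggest a way to resolve this as soon as possible, as it is a very critical problem for us.
First, you should remove the code that is supposed to solve your problem. sys.addShutdownHook only registers a hook that runs when the JVM exits; it frees nothing while the service is running.
BTW, if you do this on every request, the shutdown hooks themselves can eat up all the memory, because each call registers another hook.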

Potential memory leak using TrieMap in Scala and Tomcat

I am using a scala.collection.concurrent.TrieMap wrapped in an object to store configuration values that are fetched remotely.
import scala.collection.concurrent.TrieMap

object persistentMemoryMap {
  val storage: TrieMap[String, CacheEntry] = TrieMap[String, CacheEntry]()
}
It works just fine, but I have noticed that when Tomcat is shut down it logs some alarming messages about potential memory leaks:
2013-jun-27 08:58:22 org.apache.catalina.loader.WebappClassLoader checkThreadLocalMapForLeaks
SEVERE: The web application [] created a ThreadLocal with key of type [scala.concurrent.forkjoin.ThreadLocalRandom$1] (value [scala.concurrent.forkjoin.ThreadLocalRandom$1#5d529976]) and a value of type [scala.concurrent.forkjoin.ThreadLocalRandom] (value [scala.concurrent.forkjoin.ThreadLocalRandom#59d941d7]) but failed to remove it when the web application was stopped. Threads are going to be renewed over time to try and avoid a probable memory leak
I am guessing this thread will terminate on its own eventually, but I am wondering whether there is some way to kill it, or should I just leave it alone?
The scala.concurrent.forkjoin.ThreadLocalRandom's value is created only once per thread. It does not hold any references to objects other than the random value generator used by that thread -- the memory it consumes has a fixed size. Once the thread is garbage collected, its thread local random value will be collected as well -- you should just let the GC do its work.
You could still remove it manually by using Java reflection to remove the private modifier on the static field localRandom in the ThreadLocalRandom class:
https://github.com/scala/scala/blob/master/src/forkjoin/scala/concurrent/forkjoin/ThreadLocalRandom.java#L62
You could then call localRandom.set(null) to null out the reference to the random number generator. You should also then ensure that TrieMap is no longer used from that thread, otherwise ThreadLocalRandom will break, because it assumes that the random number generator is not null.
Seems hacky to me, and I think you should just stick to letting the GC collect the thread local value.

The reason why Task deletion of uCOS should not occur during ISR

I'm modifying some functionality (mainly scheduling) of uC/OS-II,
and I found that the OSTaskDel function does nothing when it is called from an ISR.
Although I have learned some basic features of operating systems, I really don't understand why that should be prohibited.
All it does is remove the task from the ready list and release acquired resources such as the TCB or semaphores...
Is there any reason for this to be banned while handling an interrupt?
It is not clear from the documentation why it is prohibited in this case, but OSTaskDel() explicitly calls OS_Sched(), and in an ISR this should only happen when the outermost nested interrupt handler exits (handled by OSIntExit()).
I don't think the following is advisable, because there may be other reasons why this is prohibited, but you could remove the:
if (OSIntNesting > 0) {
    return (OS_TASK_DEL_ISR);
}
then make the OS_Sched() call conditional as follows:
if (OSIntNesting == 0) {
    OS_Sched();
}
If this dies horribly, remember I said it was ill-advised!
This operation will extend your interrupt processing time in any case so is probably a bad idea if only for that reason.
It is a bad idea in general (not just from an ISR) to asynchronously delete another task regardless of that task's state or resource usage. uC/OS-II provides the OSTaskDelReq() function to manage task deletion in a way that allows a task to delete itself on request and therefore correctly release all its resources. Even without that, sending a request via the task's normal IPC mechanisms is usually better (and more portable).
If a task is not designed for self-deletion on demand, then you might simply use OSTaskSuspend().
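The "delete itself on request" idea can be sketched in portable C without any RTOS. This is only a model of the pattern, not uC/OS-II code: `worker_t`, `worker_init`, `worker_request_delete` and `worker_step` are hypothetical names, and in uC/OS-II the role of the request flag would be played by OSTaskDelReq(). The requester merely records the request; the task notices it at a safe point in its own loop and frees what it owns.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical task state; none of these names come from uC/OS-II. */
typedef struct {
    atomic_bool delete_requested; /* set by the requester, polled by the task */
    char *buffer;                 /* a resource that only the task itself frees */
    bool alive;
} worker_t;

static bool worker_init(worker_t *w, size_t buffer_size)
{
    assert(w != NULL);
    atomic_init(&w->delete_requested, false);
    w->buffer = malloc(buffer_size);
    w->alive = (w->buffer != NULL);
    return w->alive;
}

/* Called by another task: only records the request, deletes nothing. */
static void worker_request_delete(worker_t *w)
{
    atomic_store(&w->delete_requested, true);
}

/* One pass of the task's main loop. Returns false once the task has
 * released its resources and stopped. */
static bool worker_step(worker_t *w)
{
    if (atomic_load(&w->delete_requested)) {
        free(w->buffer);
        w->buffer = NULL;
        w->alive = false;
        return false;
    }
    /* ... normal work using w->buffer would go here ... */
    return true;
}
```

Because the cleanup runs in the task's own context, it never happens while the task is halfway through using the resource, which is the main hazard of deleting a task from outside.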
Generally, you cannot do a few things in ISRs:
- block on a semaphore and the like
- block while acquiring a spin lock, if it's a single-CPU system
- cause a page fault that has to be resolved by the virtual memory subsystem (with virtual on-disk memory, that is)
If you do any of the above in an ISR, you'll have a deadlock.
OSTaskDel() is probably doing some of those things.

Volatile vars and multi-core thread synchronization!

I have several threads executing concurrently and checking a value of a field in their own object. The field is set by the launch thread like this:
for (i = 0; i < ThreadCount; i++)
{
    ThreadContext[i].MyField = 1;
}
Within each thread I then check the value of this field:
if (MyField == 1)
{
    ...//do something
}
However, I noticed that on a 4-core CPU, some of the (4) running threads need several milliseconds or even longer to see the changed MyField. MyField is a single char field. What appears to be happening is that when the memory bus is maxed out by the first thread that detects the change, all other threads may stall for almost the entire run of the first thread (assuming there is enough memory pressure). Only when the first thread eases off on memory (and does more work in registers) do the other threads get to see the new value.
I checked the asm and there is no compiler optimization in the way here; the accesses go directly to memory. How can this be fixed?
Thanks!
Jam
I got feedback from Intel: Yes, that's how it works (no easy fix).
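For reference, the flag handoff described in the question can be written with C11 atomics instead of a plain or volatile field. This is a sketch under assumptions: the question does not say which language it uses, and `thread_context_t`, `publish` and `try_consume` are hypothetical names. It makes the ordering between the flag and the data well defined, but it does not remove the delay caused by a saturated memory bus, which is the part for which there is no easy fix.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-thread context; my_field plays the role of MyField. */
typedef struct {
    atomic_char my_field; /* 0 = not ready, 1 = go */
    int payload;          /* ordinary data published together with the flag */
} thread_context_t;

/* Launch thread: write the data first, then set the flag with release
 * ordering so the data is visible to whoever observes the flag. */
static void publish(thread_context_t *ctx, int payload)
{
    ctx->payload = payload;
    atomic_store_explicit(&ctx->my_field, 1, memory_order_release);
}

/* Worker thread: returns 1 and copies the payload once the flag is set,
 * otherwise returns 0 without touching *out. */
static int try_consume(thread_context_t *ctx, int *out)
{
    assert(out != NULL);
    if (atomic_load_explicit(&ctx->my_field, memory_order_acquire) == 1) {
        *out = ctx->payload;
        return 1;
    }
    return 0;
}
```

A worker would call `try_consume` in its polling loop exactly where the original code tests `MyField == 1`.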