Linux warning: __get_request: dev 8:0: request aux data allocation failed, iosched may be disturbed

I am playing with test code that submits a BIO from my own kernel module:
If I use submit_bio(&bio), everything works fine.
If I use bdev->bd_queue->make_request_fn(bdev->bd_queue, &bio), then
I get this in dmesg:
__get_request: dev 8:0: request aux data allocation failed, iosched may be disturbed
My primary goal is to submit BIOs to a stackable device driver without calling the submit_bio() routine. Any ideas or pointers?

Our hero Tom Caputi of ZFS encryption fame figured this out.
Basically, the scheduler expects an I/O context in the task struct of the thread that is running your request.
As you'll see here, the I/O context is created in generic_make_request_checks():
https://elixir.bootlin.com/linux/latest/source/block/blk-core.c#L2323
If it is never created for the task struct that is running your request, you will see the message "iosched may be disturbed". A lousy message if ever there was one. "Scheduler context was not allocated for this task" would have made the problem a bit more obvious.
I'm not the kernel guy that Tom is, but basically, by doing this:
bdev->bd_queue->make_request_fn
your request is being handled by another thread, one that doesn't have that context allocated.
Now, create_io_context is not exported, so you can't call it directly.
But if you call this:
https://elixir.bootlin.com/linux/latest/source/block/blk-ioc.c#L319
which is exported, the I/O context will be allocated and the warning message will go away.
I imagine there will also be some I/O improvement, because the scheduler has a context to work with.

Related

Flink task scheduling: how to place subtasks in a slot of a specified task manager?

Recently I have been studying task scheduling in Flink. My goal is to schedule subtasks to a slot on a specified node, according to my own needs, by modifying the source code of the scheduling part. Through remote debugging and reading the source code, I found the following method call stack. I can't understand most of it (there are few comments), especially this method: org.apache.flink.runtime.jobmaster.slotpool.SchedulerImpl#allocateMultiTaskSlot. I guess the code that allocates slots to tasks is around here. Because the source code is too difficult to read, I have to ask for help. Of course, if there is a better way to achieve what I need, please point it out. I sincerely look forward to your reply. Thank you very much!
The method call stack is as follows (the Flink version I use is 1.11.1):
org.apache.flink.runtime.jobmaster.JobMaster#startJobExecution
org.apache.flink.runtime.jobmaster.JobMaster#resetAndStartScheduler
org.apache.flink.runtime.jobmaster.JobMaster#startScheduling
org.apache.flink.runtime.scheduler.SchedulerBase#startScheduling
org.apache.flink.runtime.scheduler.DefaultScheduler#startSchedulingInternal
org.apache.flink.runtime.scheduler.strategy.EagerSchedulingStrategy#startScheduling
(This resembles the method call chain of the PipelinedRegionSchedulingStrategy class. I wrote it as the call chain of the EagerSchedulingStrategy class for simplicity; it should make no difference.)
org.apache.flink.runtime.scheduler.strategy.EagerSchedulingStrategy#allocateSlotsAndDeploy
org.apache.flink.runtime.scheduler.DefaultScheduler#allocateSlotsAndDeploy
org.apache.flink.runtime.scheduler.DefaultScheduler#allocateSlots
org.apache.flink.runtime.scheduler.DefaultExecutionSlotAllocator#allocateSlotsFor
org.apache.flink.runtime.executiongraph.SlotProviderStrategy.NormalSlotProviderStrategy#allocateSlot
org.apache.flink.runtime.jobmaster.slotpool.SchedulerImpl#allocateSlot
org.apache.flink.runtime.jobmaster.slotpool.SchedulerImpl#allocateSlotInternal
org.apache.flink.runtime.jobmaster.slotpool.SchedulerImpl#internalAllocateSlot
org.apache.flink.runtime.jobmaster.slotpool.SchedulerImpl#allocateSharedSlot
org.apache.flink.runtime.jobmaster.slotpool.SchedulerImpl#allocateMultiTaskSlot
(I feel that this is the key to allocating a slot for a subtask, that is, an execution vertex, but there are no comments and I can't follow the logic.)

Context switch by random system call

I know that an interrupt causes the OS to switch a CPU away from its current task and run a kernel routine. In this case, the system has to save the current context of the process running on the CPU.
However, I would like to know whether or not a context switch occurs when any random process makes a system call.
"I would like to know whether or not a context switch occurs when any random process makes a system call."
Not precisely. Recall that a process can only make a system call if it's currently running -- there's no need to make a context switch to a process that's already running.
If a process makes a blocking system call (e.g., sleep()), there will be a context switch to the next runnable process, since the current process is now sleeping. But that's another matter.
There are generally two ways to cause a context switch: (1) a timer interrupt invokes the scheduler, which forcibly makes a context switch, or (2) the process yields. Most operating systems have a number of system services that will cause the process to yield the CPU.
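One rough way to see the difference from user space is to read the voluntary context switch counter that getrusage() reports around a batch of non-blocking system calls and around one blocking call. This is only a sketch, assuming Linux and a single-threaded process; the helper names voluntary_switches_during, cheap_syscalls and blocking_syscall are made up for illustration.

```c
#include <sys/resource.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: voluntary context switches charged to this
 * process while fn() runs, as reported by getrusage(). */
long voluntary_switches_during(void (*fn)(void))
{
    struct rusage before, after;

    getrusage(RUSAGE_SELF, &before);
    fn();
    getrusage(RUSAGE_SELF, &after);
    return after.ru_nvcsw - before.ru_nvcsw;
}

/* Many system calls that return immediately without blocking. */
void cheap_syscalls(void)
{
    for (int i = 0; i < 1000; i++)
        (void)getppid();
}

/* One blocking system call: the process gives up the CPU while it sleeps. */
void blocking_syscall(void)
{
    struct timespec ten_ms = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };

    nanosleep(&ten_ms, NULL);
}
```

Printing voluntary_switches_during(cheap_syscalls) next to voluntary_switches_during(blocking_syscall) should show the blocking call accounting for the voluntary switches, though the exact counts depend on the kernel and on how the counter is implemented in your environment.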
Well, I got your point. First, let me clear up a very basic idea about system calls.
When a process makes a syscall, it traps into the kernel to invoke the syscall handler. The kernel stack is loaded via the TSS, and control jumps to the syscall function table.
It is effectively the same as running a different part of the program itself; the only major change is that the kernel plays a role here and that piece of code is executed in ring 0.
Now your question: "what will happen if a context switch happens while a random process is making a syscall?"
Well, nothing special will happen. Things will work the same way as before. Instead of a normal address in the TSS, you will have an address pointing to the kernel stack and the syscall function table address in that random process's TSS.

Can the instructions in a thread change during execution? (OS)

I'm currently researching threads in the context of the operating system and I'm unsure if a thread is a set sequence of instructions that can be repeatedly executed or if it is filled and replaced with new instructions by the user or the operating system.
Thanks a bundle!
-Tom
I'm not quite sure what you mean - the compiled instructions for a program are stored in memory and are not changed at runtime (at least for languages which are not JIT-compiled).
A thread is an entirely separate concept from the code itself. A thread gives you the ability to be running at "two places at once" in the code. At a conceptual level, a thread is simply a container for the context that you need at any point in the execution of some code. This means that each thread has a call stack and a set of registers (which are either actually stored in the registers of a processor if the thread is running, or elsewhere if the thread is paused).
Almost all thread libraries work such that a new thread will execute some user-defined function and will then exit. This function can be long-running, just like main() (which is the function executed by the first thread in your process).
If the threads are supported by the OS (i.e., they are not "green threads"/"fibers"), they will exit by calling an OS API which tells the OS it can deallocate any data associated with that thread.
Sometimes, abstractions are built on top of this mechanism such that a thread or pool of threads will execute a function which simply loops over a queue of tasks to run, but the fundamental mechanism is the same. However, these abstractions are provided by user libraries built on top of the OS threading mechanisms, not by the OS itself.
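To make the "thread that loops over a queue of tasks" idea concrete, here is a minimal sketch using POSIX threads. It assumes a single worker and a small fixed-size queue; task_queue, worker_main, queue_push and run_tasks_on_worker are illustrative names, not part of any library. Note that the worker thread still just executes one ordinary function; that function happens to be a loop.

```c
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP 16

typedef void (*task_fn)(void *arg);
struct task { task_fn fn; void *arg; };

struct task_queue {
    struct task items[QUEUE_CAP];
    size_t head, count;
    int closed;                     /* set once no more tasks will be pushed */
    pthread_mutex_t lock;
    pthread_cond_t changed;
};

/* The one function the worker thread executes: loop over queued tasks. */
static void *worker_main(void *arg)
{
    struct task_queue *q = arg;

    for (;;) {
        pthread_mutex_lock(&q->lock);
        while (q->count == 0 && !q->closed)
            pthread_cond_wait(&q->changed, &q->lock);
        if (q->count == 0) {        /* closed and drained */
            pthread_mutex_unlock(&q->lock);
            return NULL;            /* returning ends the thread */
        }
        struct task t = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        pthread_mutex_unlock(&q->lock);
        t.fn(t.arg);                /* run the task outside the lock */
    }
}

/* Append a task; returns 1 on success, 0 if the queue is full. */
static int queue_push(struct task_queue *q, task_fn fn, void *arg)
{
    int ok = 0;

    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_CAP) {
        q->items[(q->head + q->count) % QUEUE_CAP] = (struct task){ fn, arg };
        q->count++;
        ok = 1;
        pthread_cond_signal(&q->changed);
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

struct add_job { int value; int *total; };

/* Example task: add one value to a running total. */
static void add_value(void *arg)
{
    struct add_job *job = arg;

    *job->total += job->value;
}

/* Queue the values 1..n (at most QUEUE_CAP of them) as tasks, let one
 * worker thread run them, and return their sum (-1 if no thread started). */
int run_tasks_on_worker(int n)
{
    struct task_queue q = { .head = 0 };
    struct add_job jobs[QUEUE_CAP];
    pthread_t worker;
    int total = 0;

    pthread_mutex_init(&q.lock, NULL);
    pthread_cond_init(&q.changed, NULL);
    if (pthread_create(&worker, NULL, worker_main, &q) != 0)
        return -1;

    for (int i = 0; i < n && i < QUEUE_CAP; i++) {
        jobs[i] = (struct add_job){ .value = i + 1, .total = &total };
        queue_push(&q, add_value, &jobs[i]);
    }

    pthread_mutex_lock(&q.lock);
    q.closed = 1;                   /* tell the worker to exit once drained */
    pthread_cond_signal(&q.changed);
    pthread_mutex_unlock(&q.lock);

    pthread_join(worker, NULL);     /* after this, total is safe to read */
    pthread_mutex_destroy(&q.lock);
    pthread_cond_destroy(&q.changed);
    return total;
}
```

Calling run_tasks_on_worker(4) queues the values 1 through 4 and returns their sum once the worker has drained the queue. The instructions of worker_main never change; only the data in the queue does.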

Memcached memory leak

I'm building an application in which I have to use memcached.
I found quite a nice client:
net.spy.memcached.MemcachedClient
With this client everything works great except one thing: I have a problem closing the connection, and after a while I start fighting a memory leak.
I was looking for a way to close the connection, and I found the "shutdown" method. But if I use this method like this:
MemcachedClient c = new MemcachedClient(
        new InetSocketAddress(memcachedIp, memcachedPort));
c.set(something, sessionLifeTime, memcache.toJSONString());
c.shutdown();
I have a problem adding anything to memcached: in the logs I see that this method opens the connection and then closes it before anything is added to memcached.
Do you have any idea what to do?
Additionally, I found the method c.shutdown(2, TimeUnit.SECONDS), which should close the connection after 2 seconds. But I have connected a JMX monitor to my Tomcat, and I see that the Memcached thread isn't finished after 2 seconds. This thread doesn't finish at all...
The reason you are having trouble adding things to memcached this way is that the set(...) function is asynchronous: all it does is put the operation into a queue to be sent to memcached. Since you call shutdown right after this, the operation doesn't have time to make it out onto the wire. You need to call set(...).get() to make your application thread wait for the operation to complete before calling shutdown.
Also, I haven't experienced IO threads not dying after calling shutdown with a timeout. One way to confirm that this is an actual bug is to run a standalone program with Spymemcached. If the process doesn't terminate when it has completed, then you've found an issue.

How to do a "kill_proc()" in Linux Kernel 2.6.31.5

Trying this free forum for developers. I am migrating a serial driver to kernel 2.6.31.5. I have used various books and articles to solve problems coming from 2.4.
Now I have a couple of kill_proc calls, which are not supported anymore in kernel 2.6.31.5.
What would be the fastest way to migrate these to the kernel 2.6.31.5 way of killing a thread? The books say to use kill(), but that does not seem to be the case in 2.6.31.5. Using send_signal would be a good way, but how do I do this? There must be a task_struct or something. I wish I could just provide my PID and SIGTERM and go ahead and kill my thread, but it seems more complicated, requiring me to set up a struct with parameters I do not know.
If anyone has a real example, or a link to a place with up-to-date info on kernel 2.6.31, I would be very thankful. Simply put, I need to kill my thread, and this is not supposed to be hard. ;)
This is my code now:
kill_proc(ex_pid, SIGTERM, 1);
/Jörgen
For use with kthreads, there is now kthread_stop, which the caller (e.g. the module's exit function) can invoke. The kthread itself has to check kthread_should_stop in its loop. Examples of that are readily available in the kernel source tree.
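The handshake between kthread_stop and kthread_should_stop is a cooperative stop flag: the stopper sets it and waits, and the thread polls it and returns. Kernel code cannot be run outside a module, so what follows is only a user-space analogue of that pattern with pthreads and C11 atomics, not the kernel API itself. The names struct worker, worker_start and worker_stop are hypothetical; in a real module, the flag check corresponds to kthread_should_stop() and the set-and-join step to kthread_stop().

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

struct worker {
    pthread_t thread;
    atomic_bool should_stop;    /* plays the role of kthread_should_stop() */
    atomic_long iterations;     /* how many times the loop body ran */
};

/* Thread body: keep working until asked to stop, then return. */
static void *worker_loop(void *arg)
{
    struct worker *w = arg;
    const struct timespec pause = { .tv_sec = 0, .tv_nsec = 1000000 }; /* 1 ms */

    while (!atomic_load(&w->should_stop)) {
        atomic_fetch_add(&w->iterations, 1);    /* stand-in for real work */
        nanosleep(&pause, NULL);
    }
    return NULL;
}

/* Start the worker; returns 0 on success, like pthread_create(). */
int worker_start(struct worker *w)
{
    atomic_init(&w->should_stop, false);
    atomic_init(&w->iterations, 0);
    return pthread_create(&w->thread, NULL, worker_loop, w);
}

/* Analogue of kthread_stop(): request a stop, then wait for the thread
 * to notice the flag and exit. Returns the number of loop iterations. */
long worker_stop(struct worker *w)
{
    atomic_store(&w->should_stop, true);
    pthread_join(w->thread, NULL);
    return atomic_load(&w->iterations);
}
```

The point of the pattern is that the thread is never killed from outside; it exits on its own at a safe point in its loop, which is why no signal or PID lookup is needed.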