Is there any way to monitor the task queue of scala.concurrent.ExecutionContext.Implicits.global? ie., see the number of tasks waiting for a thread to be released?
JDK comes along with jconsole and jmc. You can use them to see thread usage. You can see,
Thread state,
blocked count
thread allocated bytes etc
scala implicit threads name start with scala-execution-context-global-n.
jmc screenshot:
Related: what is the best way to get the number of futures running in background in an execution context?
Related
I have started using Akka. Please clarify the following queries;
I see around 8 default-dispatcher threads are created. Where is that number defined?
I see two types of dispatchers namely default-dispatcher and internal-dispatcher are created in my setup. How is this decided?
We could see the dispatcher threads in the Debug mode of any IDE. Is there any way to visualize the Akka objects such as Actors, Routers, Receptionist, etc?
I have observed that the dispatcher threads die automatically when I leave the program running for some time. Please explain the reason for this. Does actor system auto-create and auto-delete the dispatchers?
I see a number after default-dispatcher in the logs. What does this number indicate? Does it indicate the thread number being allocated by the dispatcher for an actor?
Example: 2022-09-02 10:39:25.482 [MyApp-akka.actor.default-dispatcher-5] DEBUG
The default dispatcher configuration for Akka can be found in the reference.conf file for the akka-actor package. By default, the default dispatcher (on which user actors run if not otherwise configured), is a fork-join pool with a minimum of 8 threads and a maximum of 64 threads. The internal dispatcher (introduced in Akka 2.6 to prevent user actors from starving system actors) is also, IIRC, a fork-join pool. Other dispatcher types and configurations are possible: the comments in reference.conf go through them, though for most purposes the defaults are reasonable.
As far as I know, there are no existing tools for visualizing the actors etc. in an Akka application.
The threads are managed by the thread-pool underlying the dispatcher. The fork-join pool will terminate threads which have been idle for a certain length of time and create new threads as needed.
The threads are (by default... at the very least a custom dispatcher could override this) named dispatcher-name-threadNumber. If you log inside an actor and have your log message format include the thread name, you can see which thread the actor was scheduled onto. Note that, typically, an actor being scheduled onto one thread does not imply it will never be scheduled on another thread (in particular, things like thread-local storage are unlikely to work); it's also worth noting that as Akka dispatchers are also Scala ExecutionContexts and Java Executors, the threads can be consumed by things which aren't actors (e.g. Future callbacks in Scala).
I was wondering if ProcessorContext.schedule is thread-safe so that I can spawn new thread to execute the Punctuator callback?
Also, if a consumer consumes just 1 partition but we set num.stream.threads=2. Does this automatically spawn a new thread for the scheduler?
After trying it a bit I found the answer may be "no".
Then what's the recommended the way to make spawning new thread for scheduler thread-safe?
Registering a punctuation will not spawn a new thread. The number of used threads in determined by num.stream.threads configuration only. Hence, if you register a punctuation, it's executed on the same thread as the topology and thus it is thread safe.
If you configure more threads than available input topic partitions, some threads would not get any work assigned, and thus, they would not execute any punctuations.
Should we use a thread pool for long running threads or start our own threads? Is there some design pattern?
Unfortunately, it depends. There is no hard and fast rule saying that you should always employ thread pools.
Thread pools offer two main things:
Delegated creation/reuse of threads.
Back-pressure
IMO, it's the back-pressure property that's interesting, but often the most poorly understood. Your machine runs on a limited set of resources. If you have (say) 8 CPU cores and they are all busy working, you would like to signal that in some way that adding more work (submitting more tasks) isn't going to help, at least not in terms of latency.
This is the reason java.util.concurrent.ExecutorService implementations allow you to specify a java.util.concurrent.BlockingQueue of your choice. When this queue grows full, invoking threads will block until the thread pool has managed to complete tasks in progress.
Whether or not to have long-running threads inside the thread pool depends on what it's doing. If the thread is constantly busy (meaning it will never complete) then it will always occupy a slot in the thread pool, which is kind of pointless.
Regarding delegated creation/reuse of threads; maybe you could have two pools, one for long-running tasks and one for other tasks. Or perhaps a long-running thread pool with one single slot, this will prevent two long-running tasks from running at the same time, provided that is what you want.
As you can see, there is no single good answer. It really boils down to what you are trying to achieve and how you want to use the resources at hand.
In my netty server, I create threadpools as follows.
ChannelFactory factory =
new NioServerSocketChannelFactory(
Executors.newCachedThreadPool(threadFactory),
Executors.newCachedThreadPool(threadFactory);
Sometimes, I noticed that after a certain number of connections are being worked on by the server, the subsequent connections wait for one of the priors threads to finish.
From the documentation of newCachedThreadPool, I was under the assumption that the thread pool creates new threads as needed. Could someone help me understand why some of my connections being blocked till the prior connections finish? Would netty not create a new thread for new connection, as all the existing ones are busy?
How do I fix this?
Any help is appreciated!
Creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue.
At any point, at most nThreads threads will be active processing tasks.
If additional tasks are submitted when all threads are active, they will wait in the queue until a thread is available. If any thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.
The threads in the pool will exist until it is explicitly shutdown.
from Oracle Java Doc for newCachedThreadPool!
so the thread number is fixed by Executors.newCachedThreadPool
in netty default number is processer_number *2
:)
If a function in a thread is going to return, how can we describe this behavior.
The thread returns.
The thread is dying.
What's the meaning of "thread is dead"?
In my understanding, threads are basically kernel data structures. You can create and destroy threads through the system APIs. If you just create a thread, start it executing, and it runs out of code, the kernel will probably put it into a non-executing state. In unmanaged code you still have to release that resource.
Then there's the thread pool. In that case, you queue up work to be done by the thread pool, and the platform takes care of picking a thread and executing your work. When the work is complete, the thread is returned to the thread pool. The platform takes care of creating and destroying threads to balance the available threads against the workload and the system resources.
As of Java 1.3 the six-state thread model was introduced. This includes the following states:
Ready-to-run: The thread is created and waiting for being picked for running by the thread scheduler
Running: The thread is executing.
Waiting: The thread is in blocked state while waiting for some external processing to finish (like I/O).
Sleeping: The thread is forced to sleep via .sleep()
Blocked: On I/O: Will move into state 1 after finished (e.g. reading a byte of data). On sync: Will move into state 1 after a lock is acquired.
Dead (Terminated): The thread has finished working and cannot be resumed.
The term "Dead" is rarely used today, almost totally changed to "Terminated". These two are equivalent.
Most thread APIs work by asking the operating system to run a particular function, supplied by you, on your behalf. When this function eventually returns (via for example a return statement or reaching the end of its code) the operationg system ends the thread.
As for "dead" threads - that's not a term I've seen used in thread APIs.