How is HawtDispatch different from Java's Executors (and Netty)?

Frustratingly, HawtDispatch's website describes it as "thread pooling and NIO event notification framework API."
Let's take the 'thread pooling' part first. Most of the Executors provided by Java are also basically thread pools. How is HawtDispatch different?
It is also apparently an "NIO event notification framework API." I'm assuming it is a thin layer on top of NIO that takes incoming data, hands it to its notion of a 'thread pool,' and passes it to a consumer when the thread-pool scheduler finds the time. Correct? (Any improvement over NIO is welcome.) Has anyone done any performance analysis of Netty vs. HawtDispatch?

HawtDispatch is designed around a single, system-wide, fixed-size thread pool. It provides two flavors of Java Executor:
Global Dispatch Queue: submitted Runnable objects are executed concurrently (you get the same effect using an Executors.newFixedThreadPool(n) executor)
Serial Dispatch Queue: submitted Runnable objects are executed serially (you get the same effect using an Executors.newSingleThreadExecutor() executor)
Unlike the Java executor model, all global and serial dispatch queues share a single fixed-size thread pool. You can use thousands of serial dispatch queues without increasing your thread count. Serial dispatch queues can be used like Erlang mailboxes to drive reactive, actor-style applications (see the sketch below).
Since HawtDispatch uses a fixed-size thread pool for processing all global and serial queue executions, every Runnable task it executes must be non-blocking. In a way this is similar to the NodeJS architecture, except that it uses multiple threads instead of just one.
In comparison to Netty, HawtDispatch is not a framework for actually processing socket data. It does not provide a framework for how to encode/decode, buffer, and process socket data. All it does is execute a user-configured Runnable when data can be read or written on a non-blocking socket. It is then up to your application to actually read/write the socket data.
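For a feel of the contrast with plain java.util.concurrent, here is a minimal Scala sketch. It assumes HawtDispatch's org.fusesource.hawtdispatch.Dispatch entry point, whose queues act as ordinary java.util.concurrent.Executors (queue names are invented; Scala 2.12+ lambda-to-Runnable conversion assumed):

    import org.fusesource.hawtdispatch.{Dispatch, DispatchQueue}

    object QueueSketch extends App {
      // One system-wide, fixed-size pool backs everything below.
      val global: DispatchQueue = Dispatch.getGlobalQueue()

      // Thousands of serial queues cost no extra threads; each behaves
      // like a newSingleThreadExecutor() without owning a thread.
      val mailboxes: Seq[DispatchQueue] =
        (1 to 1000).map(i => Dispatch.createQueue(s"mailbox-$i"))

      // Tasks on the same serial queue run one at a time, in order...
      mailboxes.head.execute(() => println("first, in order"))
      mailboxes.head.execute(() => println("second, in order"))

      // ...while tasks on the global queue may run concurrently.
      global.execute(() => println("runs on any pooled thread"))
    }

With Executors.newSingleThreadExecutor() the same thousand mailboxes would pin a thousand threads; here they all multiplex over the one shared pool.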

Related

Number of dispatcher threads created in Akka & Viewing Akka Actors in an IDE

I have started using Akka. Please clarify the following queries:
I see around 8 default-dispatcher threads are created. Where is that number defined?
I see two types of dispatchers namely default-dispatcher and internal-dispatcher are created in my setup. How is this decided?
We could see the dispatcher threads in the Debug mode of any IDE. Is there any way to visualize the Akka objects such as Actors, Routers, Receptionist, etc?
I have observed that the dispatcher threads die automatically when I leave the program running for some time. Please explain why. Does the actor system auto-create and auto-delete dispatchers?
I see a number after default-dispatcher in the logs. What does this number indicate? Does it indicate the thread number being allocated by the dispatcher for an actor?
Example: 2022-09-02 10:39:25.482 [MyApp-akka.actor.default-dispatcher-5] DEBUG
The default dispatcher configuration for Akka can be found in the reference.conf file for the akka-actor package. By default, the default dispatcher (on which user actors run if not otherwise configured) is a fork-join pool with a minimum of 8 threads and a maximum of 64. The internal dispatcher (introduced in Akka 2.6 to prevent user actors from starving system actors) is also, IIRC, a fork-join pool. Other dispatcher types and configurations are possible: the comments in reference.conf go through them, though for most purposes the defaults are reasonable.
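To illustrate, a hedged Scala sketch of declaring a custom fork-join dispatcher alongside the defaults (the my-dispatcher name and the Worker actor are invented for the example; the configuration keys mirror the fork-join-executor settings documented in reference.conf):

    import akka.actor.{Actor, ActorSystem, Props}
    import com.typesafe.config.ConfigFactory

    class Worker extends Actor {
      def receive = { case msg => println(s"got $msg") }
    }

    object DispatcherDemo extends App {
      // Same knobs the default dispatcher uses, tightened for this pool.
      val config = ConfigFactory.parseString(
        """
        my-dispatcher {
          type = Dispatcher
          executor = "fork-join-executor"
          fork-join-executor {
            parallelism-min = 2
            parallelism-factor = 1.0
            parallelism-max = 16
          }
        }
        """).withFallback(ConfigFactory.load())

      val system = ActorSystem("MyApp", config)
      // Actors opt in to the custom dispatcher by name.
      val worker = system.actorOf(Props[Worker]().withDispatcher("my-dispatcher"))
      worker ! "hello"
    }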
As far as I know, there are no existing tools for visualizing the actors etc. in an Akka application.
The threads are managed by the thread-pool underlying the dispatcher. The fork-join pool will terminate threads which have been idle for a certain length of time and create new threads as needed.
The threads are (by default... at the very least a custom dispatcher could override this) named dispatcher-name-threadNumber. If you log inside an actor and have your log message format include the thread name, you can see which thread the actor was scheduled onto. Note that, typically, an actor being scheduled onto one thread does not imply it will never be scheduled on another thread (in particular, things like thread-local storage are unlikely to work); it's also worth noting that as Akka dispatchers are also Scala ExecutionContexts and Java Executors, the threads can be consumed by things which aren't actors (e.g. Future callbacks in Scala).
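A quick way to observe this scheduling is to print the thread name from inside an actor. A minimal sketch (println stands in for a real logger):

    import akka.actor.{Actor, ActorSystem, Props}

    class Echo extends Actor {
      def receive = { case msg =>
        // Prints something like MyApp-akka.actor.default-dispatcher-5:
        // the dispatcher name plus a pool thread number, not an actor id.
        println(s"[${Thread.currentThread().getName}] $msg")
      }
    }

    object ThreadNameDemo extends App {
      val system = ActorSystem("MyApp")
      val echo = system.actorOf(Props[Echo]())
      (1 to 5).foreach(i => echo ! s"message $i")
    }

Run it a few times and the same actor will typically show up on different dispatcher threads.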

Core Data: would it be safe to use NSConfinementConcurrencyType and dispatch all operations using this context on a serial dispatch queue?

Strictly speaking, this concurrency type requires a specific thread, but using a serial queue would be easier. Is it safe to use a context with the NSConfinementConcurrencyType concurrency type on a serial dispatch queue?
As long as you're sure you only use that queue with the context, yes, that's completely fine.
Core Data doesn't care about the thread so much as it cares about concurrent access. If you serialize access, you're safe, however you choose to do it. You could use NSRecursiveLock or semaphores or whatever works for you.
Note that the newer concurrency models are queue based. NSPrivateQueueConcurrencyType does not guarantee that operations are always performed on the same thread, even when you use performBlock:. They happen on a private queue and might run on different threads at different times. If you can manage your queue and your access well enough to do this yourself, it's reasonable to do so.
No, having a serial queue does not guarantee the operations will execute on the same thread:
The Concurrency Programming Guide specifies:
"Serial queues (also known as private dispatch queues) execute one task at a time in the order in which they are added to the queue. The currently executing task runs on a distinct thread (which can vary from task to task) that is managed by the dispatch queue. Serial queues are often used to synchronize access to a specific resource."
Why don't you just use NSPrivateQueueConcurrencyType? It will make your code cleaner and thread-safe. You just need to call -performBlock: or -performBlockAndWait: when accessing the context from somewhere other than the block that initialized it.

Using Scala Akka framework for blocking CLI calls

I'm relatively new to Akka & Scala, but I would like to use Akka as a generic framework to pull together information from various web tools and CLI commands.
I understand the general principle that in an actor model it is highly desirable not to have the actors block. In the case of the HTTP requests, there are async HTTP clients (such as Spray), which means I can handle the requests asynchronously within the actor framework.
However, I'm unsure what is the best approach when combining actors with existing blocking API calls such as the scala ProcessBuilder/ProcessIO libraries. In terms of issuing these CLI commands I expect a relatively small amount of concurrency, e.g. perhaps executing a max of 10 concurrent CLI invocations on a 12 core machine.
Is it better to have a single actor managing these CLI commands, farming the actual work off to Futures that are created as needed? Or would it be cleaner just to maintain a set of separate actors backed by a PinnedDispatcher? Or something else?
From the Akka documentation ( http://doc.akka.io/docs/akka/snapshot/general/actor-systems.html#Blocking_Needs_Careful_Management ):
"
Blocking Needs Careful Management
In some cases it is unavoidable to do blocking operations, i.e. to put a thread to sleep for an indeterminate time, waiting for an external event to occur. Examples are legacy RDBMS drivers or messaging APIs, and the underlying reason is typically that (network) I/O occurs under the covers. When facing this, you may be tempted to just wrap the blocking call inside a Future and work with that instead, but this strategy is too simple: you are quite likely to find bottlenecks or run out of memory or threads when the application runs under increased load.
The non-exhaustive list of adequate solutions to the “blocking problem” includes the following suggestions:
Do the blocking call within an actor (or a set of actors managed by a router [Java, Scala]), making sure to configure a thread pool which is either dedicated for this purpose or sufficiently sized.
Do the blocking call within a Future, ensuring an upper bound on the number of such calls at any point in time (submitting an unbounded number of tasks of this nature will exhaust your memory or thread limits).
Do the blocking call within a Future, providing a thread pool with an upper limit on the number of threads which is appropriate for the hardware on which the application runs.
Dedicate a single thread to manage a set of blocking resources (e.g. a NIO selector driving multiple channels) and dispatch events as they occur as actor messages.
The first possibility is especially well-suited for resources which are single-threaded in nature, like database handles which traditionally can only execute one outstanding query at a time and use internal synchronization to ensure this. A common pattern is to create a router for N actors, each of which wraps a single DB connection and handles queries as sent to the router. The number N must then be tuned for maximum throughput, which will vary depending on which DBMS is deployed on what hardware."
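To make the Future-based suggestions concrete for the CLI case, a rough Scala sketch: a dedicated fixed-size pool bounds how many process invocations block at once (the pool size of 10 matches the question's estimate; CliRunner and runCli are invented names):

    import java.util.concurrent.Executors
    import scala.concurrent.{ExecutionContext, Future}
    import scala.sys.process._

    object CliRunner {
      // Dedicated pool: at most 10 CLI calls block at any moment, and the
      // actor system's own dispatcher threads are never tied up.
      private val blockingEc: ExecutionContext =
        ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(10))

      def runCli(command: Seq[String]): Future[String] =
        Future {
          command.!! // blocks this dedicated thread until the process exits
        }(blockingEc)
    }

An actor can then fire runCli(Seq("ls", "-l")) and deliver the result back to itself as a message, so the actor itself never blocks.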

JBoss Messaging WorkerThread#: what are these threads?

I am load testing a JBoss Messaging install with 5 producers producing 100,000 100k messages. I am seeing significant bottlenecking. When I monitor the profiler, I see 15 threads named WorkerThread#. These threads are allocated 100% with no waits. I think they may be related. Does anyone know what function these threads serve and whether there is a thread-pool setting? I am using a supported configuration:
JBoss Enterprise Application Server 4.3 CP08
JBoss Enterprise Service Bus 4.4 CP04
JBoss Transactions 4.2.3._CP07
JBoss Messaging 1.4.0.SP3-CP09
JBoss Rules 4.0.7
JBoss jBPM 3.2.9
JBoss Web Services 2.0.1.SP2_CP07
I've figured it out. It's not a pool of threads. In the jboss-messaging.sar/remoting-bisocket.xml file that defines the remoting connector for JBoss Messaging, you see a few values, mainly clientMaxPool, maxPoolSize, and numAcceptThreads.
In Remoting, when a socket is established, threads are created to monitor that socket, up to the value of numAcceptThreads. All such a thread does is read data from the socket and hand it off to a thread in the client pool (governed by maxPoolSize).
The threads called WorkerThread#[] are the accept threads. The reason I see more when I create more producers is that the bisocket transport for JBoss Messaging apparently creates three sockets per producer. Initially there are 3, but when I create 5 producers that number increases to 15 (or 5*3 for those not mathematically inclined :)). The reason they are 100% allocated is that while I am sending all those messages, each thread reads from the socket, hands off to a server thread, and goes back to reading from the socket (where there is always data).
So the short answer is that there is no pool governing these threads. You can have more than one accept thread, but it would almost never make sense: the job is so minimal (read the data, hand it off, read the data...) that more threads would just add synchronization overhead.
This is from http://download.oracle.com/javase/tutorial/uiswing/concurrency/worker.html; hope it helps.
When a Swing program needs to execute a long-running task, it usually uses one of the worker threads, also known as the background threads. Each task running on a worker thread is represented by an instance of javax.swing.SwingWorker. SwingWorker itself is an abstract class; you must define a subclass in order to create a SwingWorker object; anonymous inner classes are often useful for creating very simple SwingWorker objects.
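For completeness, a minimal SwingWorker sketch in Scala (the long-running work is simulated with a sleep):

    import javax.swing.SwingWorker

    val worker = new SwingWorker[String, Unit] {
      // Runs on a background worker thread, off the event dispatch thread.
      override def doInBackground(): String = {
        Thread.sleep(2000) // stand-in for the long-running task
        "finished"
      }
      // Called back on the event dispatch thread once doInBackground completes.
      override def done(): Unit = println(get())
    }
    worker.execute()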

Scala: Why are Actors lightweight?

What makes actors so lightweight?
I'm not even sure how they work. Aren't they separate threads?
When they say lightweight, they mean that each actor is not mapped to a single thread.
The JVM offers shared-memory threads with locks as the primary form of concurrency abstraction. But shared-memory threads are quite heavyweight and incur severe performance penalties from context-switching overhead. For an actor implementation based on a one-to-one mapping with JVM threads, the process payload per Scala actor would not be so lightweight that we could spawn a million instances of an actor for a specific computation. Hence Scala actors have been designed as lightweight event objects, which get scheduled and executed on an underlying worker thread pool that is automatically resized when all threads block on long-running operations. In fact, Scala implements a unified model of actors: thread-based and event-based. Scala actors offer two forms of suspension mechanism: a full stack-frame suspension (implemented as receive) and a suspension based on a continuation closure (implemented as react). In the case of event-based actors, a wait on react is represented by a continuation closure, i.e. a closure that captures the rest of the actor's computation. When the suspended actor receives a message that matches one of the patterns specified in the actor, the continuation is executed by scheduling the task to one of the worker threads from the underlying thread pool. The paper "Actors that Unify Threads and Events" by Haller and Odersky discusses the details of the implementation.
Source
Important reference: Haller and Odersky, "Actors that Unify Threads and Events".
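As a rough illustration of the two suspension styles in the (long-deprecated) scala.actors library, where receive holds the worker thread and react suspends via a continuation closure (a sketch assuming an old Scala version with scala.actors on the classpath):

    import scala.actors.Actor._

    // Thread-based style: receive blocks, holding a full stack frame
    // (and thus a worker thread) until a message arrives.
    val heavy = actor {
      receive { case msg => println("heavy got " + msg) }
    }

    // Event-based style: react suspends as a continuation closure;
    // no thread is held while the actor waits.
    val light = actor {
      react { case msg => println("light got " + msg) }
    }

    heavy ! "hello"
    light ! "hello"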
I don't think we should overstate how lightweight actors are.
Firstly, thread-based actors are one actor per thread, so they are not lightweight at all.
Event-based actors are where actors start to feel lightweight. They are lightweight because a worker thread never sits blocked waiting and gets switched out; it simply moves from one piece of work to the next, spending its time on useful computation.