Vert.x standard verticle thread safety - vert.x

I'm just going through vert.x documentation and got confused by the part about standard verticles:
No more worrying about synchronized and volatile any more, and you also avoid many other cases of race conditions and deadlock so prevalent when doing hand-rolled 'traditional' multi-threaded application development.
This is the link to it: https://vertx.io/docs/vertx-core/java/#_standard_verticles
Is this statement true only if I deploy only one instance of a standard verticle, and if my Vert.x application isn't clustered?

only if I deploy only one instance of a standard verticle, and if my Vert.x application isn't clustered?
Each verticle deployed is single-threaded. So if you have 3 instances, each of them is individually single-threaded.
Vert.x application isn't clustered?
Not related. Clustering is across processes/machines; here we are talking about threads within a single process.
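To make the guarantee concrete, here is a minimal sketch (the verticle, address, and counter are invented for illustration). All handlers of one verticle instance are called on the same event-loop thread, so the mutable field needs no synchronization; with 3 instances you get 3 independent single-threaded counters rather than one shared one:

import io.vertx.core.AbstractVerticle;
import io.vertx.core.DeploymentOptions;
import io.vertx.core.Vertx;

public class CounterVerticle extends AbstractVerticle {

    // Confined to this instance's event-loop thread: no volatile or
    // synchronized needed.
    private int counter;

    @Override
    public void start() {
        vertx.eventBus().consumer("counter.incr", msg -> msg.reply(++counter));
    }

    public static void main(String[] args) {
        // Three instances: each is single-threaded on its own event loop,
        // and each has its own 'counter' field; they share nothing.
        Vertx.vertx().deployVerticle(CounterVerticle.class.getName(),
                new DeploymentOptions().setInstances(3));
    }
}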

Related

Why/how to deploy multiple instances of a verticle

While reading the documentation for the Vert.x Mongo client I came across the following line:
In most cases you will want to share a pool between different client instances.
E.g. you scale your application by deploying multiple instances of your verticle and you want (...)
It is the last line that caught my attention. I didn't know I should scale my application by deploying multiple instances of the verticle. I plan to make a MongoDbVerticle class that will listen for queries on the event bus.
Questions are:
Am I really supposed to deploy this verticle several times?
How many times? Based on what criteria? Or have I misunderstood some basic concept? I'm new to Vert.x, so that might well be.
What happens is that Vert.x will route your request to one of the verticle instances that you have deployed. Since Vert.x can be deployed over several machines, you can in practice load-balance verticles that have long-running operations (such as talking to a database, writing to a file, etc.).
If I remember correctly, Vert.x uses round-robin to route the requests. That means that if you have two Mongo verticles, a and b, it will first select a, then b, then a again, and so on.
To deploy a verticle you just use the command vertx run <verticle>.
Note: This is not as simple if you run your vertx instance as a fat-jar.
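Programmatically, the same thing can be done with DeploymentOptions; a hedged sketch (the verticle's class name is hypothetical), where the event bus round-robins messages across the consumers the instances register:

import io.vertx.core.DeploymentOptions;
import io.vertx.core.Vertx;

public class Deployer {
    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();
        // Deploy 4 instances of the Mongo verticle; tune the count against
        // measured throughput rather than picking it up front.
        vertx.deployVerticle("com.example.MongoDbVerticle",
                new DeploymentOptions().setInstances(4));
    }
}

The vertx run command also accepts an -instances option to the same effect.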

Service with background jobs, how to ensure jobs only run periodically ONCE per cluster

I have a Play Framework-based service that is stateless and intended to be deployed across many machines for horizontal scaling.
This service is handling HTTP JSON requests and responses, and is using CouchDB as its data store, again for maximum scalability.
We have a small number of background jobs that need to run every X seconds across the whole cluster. It is vital that each job executes only once per interval across the cluster, not concurrently on every machine.
To execute the jobs we're using Actors and the Akka Scheduler (since we're using Scala):
Akka.system().scheduler.schedule(
  Duration.create(0, TimeUnit.MILLISECONDS),
  Duration.create(10, TimeUnit.SECONDS),
  Akka.system().actorOf(LoggingJob.props),
  "tick")
(etc)

object LoggingJob {
  def props = Props[LoggingJob]
}

class LoggingJob extends UntypedActor {
  override def onReceive(message: Any) {
    Logger.info("Job executed! " + message.toString())
  }
}
Is there:
any built-in trickery in Akka/Actors/Play that I've missed that will do this for me?
OR a recognised algorithm that I can put on top of Couchbase (distributed mutex? not quite?) to do this?
I do not want to make any of the instances 'special' as it needs to be very simple to deploy and manage.
Check out Akka's Cluster Singleton Pattern.
For some use cases it is convenient and sometimes also mandatory to ensure that you have exactly one actor of a certain type running somewhere in the cluster.
Some examples:
single point of responsibility for certain cluster-wide consistent decisions, or coordination of actions across the cluster system
single entry point to an external system
single master, many workers
centralized naming service, or routing logic
Using a singleton should not be the first design choice. It has several drawbacks, such as single-point of bottleneck. Single-point of failure is also a relevant concern, but for some cases this feature takes care of that by making sure that another singleton instance will eventually be started.
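A hedged Java sketch of wiring this up, assuming a recent Akka where the pattern lives in the akka.cluster.singleton package (older releases ship it in akka-contrib) and reusing the LoggingJob actor from the question:

import akka.actor.ActorSystem;
import akka.actor.PoisonPill;
import akka.actor.Props;
import akka.cluster.singleton.ClusterSingletonManager;
import akka.cluster.singleton.ClusterSingletonManagerSettings;

public class SingletonSetup {
    public static void main(String[] args) {
        // Assumes akka.actor.provider = "cluster" and seed nodes are
        // configured in application.conf.
        ActorSystem system = ActorSystem.create("ClusterSystem");
        // Started on every node, the manager guarantees at most one
        // LoggingJob actor runs somewhere in the cluster, and starts a
        // replacement elsewhere if the hosting node goes down.
        system.actorOf(
                ClusterSingletonManager.props(
                        Props.create(LoggingJob.class),
                        PoisonPill.getInstance(),
                        ClusterSingletonManagerSettings.create(system)),
                "loggingJobSingleton");
    }
}

The periodic tick can then be scheduled inside the singleton actor itself, so the job fires once per cluster without any instance being configured as 'special'.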

How can I implement a job queue with a greedy-worker-pool in Java EE 6 in a correct way?

I'm looking for a correct way to do the following in Java EE 6, if possible with vanilla Java EE 6 only.
I want to put a job in a job queue and have a fixed pool of worker objects, which should pull a job from the queue, if they are idle.
The worker objects are in a fixed relation to a legacy system, so it is not possible to use one worker object in multiple threads for all jobs and it is also not possible to instantiate a new worker object for every job.
The greedy worker pattern looks perfect, but that's only true for Java SE. In EE, I'm not sure what the correct way to implement this is.
Any suggestions?
Thanks in Advance.
M.
The first thing to notice is that, per the spec, you must not create and start your own threads in Java EE.
Concerning your setup, I'm not completely sure how it works in your system: do the workers have a fixed relation to clients all the time, or are there only occasional jobs to execute, each doing work for one client?
In both cases you can use stateful EJBs, so that one EJB serves a specific client system. In the first case the EJB serves the client for its whole lifecycle; in the second case you can invoke asynchronous EJB methods to do the work.
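As a hedged sketch of the second case, using vanilla Java EE 6 (EJB 3.1) only; LegacyConnection, Job, and Result stand in for the real legacy API:

import java.util.concurrent.Future;
import javax.ejb.AsyncResult;
import javax.ejb.Asynchronous;
import javax.ejb.Stateful;

@Stateful
public class LegacyWorkerBean {

    // One bean instance wraps one legacy worker object, preserving the
    // fixed one-to-one relation to the legacy system.
    private final LegacyConnection connection = LegacyConnection.open();

    // The container runs this method on one of its own managed threads;
    // the caller gets a Future back instead of blocking.
    @Asynchronous
    public Future<Result> process(Job job) {
        Result result = connection.execute(job); // blocking legacy call
        return new AsyncResult<Result>(result);
    }
}

Because @Asynchronous methods run on container-managed threads, this stays within the spec's rule that the application must not start threads itself.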

Using Scala Akka framework for blocking CLI calls

I'm relatively new to Akka and Scala, but I would like to use Akka as a generic framework to pull together information from various web tools and CLI commands.
I understand the general principle that in an actor model it is highly desirable not to have the actors block. In the case of HTTP requests, there are async HTTP clients (such as Spray), which means I can handle the requests asynchronously within the actor framework.
However, I'm unsure of the best approach when combining actors with existing blocking API calls such as the Scala ProcessBuilder/ProcessIO libraries. In terms of issuing these CLI commands I expect a relatively small amount of concurrency, e.g. perhaps a maximum of 10 concurrent CLI invocations on a 12-core machine.
Is it better to have a single actor managing these CLI commands, farming the actual work off to Futures that are created as needed? Or would it be cleaner just to maintain a set of separate actors backed by a PinnedDispatcher? Or something else?
From the Akka documentation ( http://doc.akka.io/docs/akka/snapshot/general/actor-systems.html#Blocking_Needs_Careful_Management ):
"
Blocking Needs Careful Management
In some cases it is unavoidable to do blocking operations, i.e. to put a thread to sleep for an indeterminate time, waiting for an external event to occur. Examples are legacy RDBMS drivers or messaging APIs, and the underlying reason is typically that (network) I/O occurs under the covers. When facing this, you may be tempted to just wrap the blocking call inside a Future and work with that instead, but this strategy is too simple: you are quite likely to find bottle-necks or run out of memory or threads when the application runs under increased load.
The non-exhaustive list of adequate solutions to the “blocking problem” includes the following suggestions:
Do the blocking call within an actor (or a set of actors managed by a router [Java, Scala]), making sure to configure a thread pool which is either dedicated for this purpose or sufficiently sized.
Do the blocking call within a Future, ensuring an upper bound on the number of such calls at any point in time (submitting an unbounded number of tasks of this nature will exhaust your memory or thread limits).
Do the blocking call within a Future, providing a thread pool with an upper limit on the number of threads which is appropriate for the hardware on which the application runs.
Dedicate a single thread to manage a set of blocking resources (e.g. a NIO selector driving multiple channels) and dispatch events as they occur as actor messages.
The first possibility is especially well-suited for resources which are single-threaded in nature, like database handles which traditionally can only execute one outstanding query at a time and use internal synchronization to ensure this. A common pattern is to create a router for N actors, each of which wraps a single DB connection and handles queries as sent to the router. The number N must then be tuned for maximum throughput, which will vary depending on which DBMS is deployed on what hardware."
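A hedged Java sketch of the first suggestion applied to the ~10 concurrent CLI calls (the dispatcher name and worker class are invented, and exact dispatcher config keys vary slightly across Akka versions):

import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;
import akka.actor.UntypedActor;
import akka.routing.RoundRobinPool;
import com.typesafe.config.Config;
import com.typesafe.config.ConfigFactory;

public class BlockingCliSetup {

    // Each worker performs the blocking call; since it runs on the
    // dedicated dispatcher, the default dispatcher stays responsive.
    public static class CliWorker extends UntypedActor {
        @Override
        public void onReceive(Object command) throws Exception {
            // e.g. new ProcessBuilder("sh", "-c", (String) command).start().waitFor();
        }
    }

    public static void main(String[] args) {
        // A dedicated pool capped at 10 threads, matching the expected
        // maximum of 10 concurrent CLI invocations.
        Config config = ConfigFactory.parseString(
                "cli-blocking-dispatcher {\n"
                + "  type = Dispatcher\n"
                + "  executor = \"thread-pool-executor\"\n"
                + "  thread-pool-executor.fixed-pool-size = 10\n"
                + "  throughput = 1\n"
                + "}").withFallback(ConfigFactory.load());
        ActorSystem system = ActorSystem.create("cli", config);
        ActorRef router = system.actorOf(
                new RoundRobinPool(10).props(
                        Props.create(CliWorker.class)
                                .withDispatcher("cli-blocking-dispatcher")),
                "cliRouter");
        router.tell("ls -l", ActorRef.noSender());
    }
}

This bounds both the number of threads and the number of in-flight blocking calls, which is precisely the failure mode the quoted warning describes.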

JBoss Messaging WorkerThread#: what are these threads?

I am load testing a JBoss Messaging install with 5 producers producing 100,000 100k messages. I am seeing significant bottlenecking. When I monitor the profiler, I see there are 15 threads named WorkerThread#. These threads are allocated 100% with no waits. I think they may be related. Does anyone know what function these threads serve, and whether there is a thread-pool setting? I am using a supported stack:
JBoss Enterprise Application Server 4.3 CP08
JBoss Enterprise Service Bus 4.4 CP04
JBoss Transactions 4.2.3._CP07
JBoss Messaging 1.4.0.SP3-CP09
JBoss Rules 4.0.7
JBoss jBPM 3.2.9
JBoss Web Services 2.0.1.SP2_CP07
I've figured it out. It's not a pool of threads. In the jboss-messaging.sar/remoting-bisocket.xml file that defines the remoting connector for JBoss Messaging, you see a couple of values, mainly clientMaxPool, maxPoolSize, and numAcceptThreads.
In Remoting, when a socket is established, threads are created to monitor that socket, up to the value of numAcceptThreads. All such a thread does is read data from the socket and hand it off to a thread in the client pool (governed by maxPoolSize).
The threads called WorkerThread#[] refer to these accept threads. The reason I see more of them when I create more producers is that the bisocket transport for JBoss Messaging apparently creates three sockets per connection. Initially there are 3, but when I create 5 producers that number increases to 15 (or 5*3, for those not mathematically inclined :)). The reason they are 100% allocated is that while I am sending all those messages, each thread reads from the socket, hands off to a server thread, and goes back to reading from the socket (where there is always data).
So the short answer is that there is no pool governing these threads. You can have more than one accept thread, but it would almost never make sense: the thread's job is minimal (read the data, hand it off, read the data again), so more threads would just add synchronization overhead.
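To illustrate the mechanism in plain Java (a generic sketch, not JBoss code): a single read thread drains the socket and immediately hands each message to a bounded pool, which is why extra read threads would buy nothing:

import java.io.DataInputStream;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AcceptThreadSketch {

    public static void main(String[] args) throws Exception {
        ExecutorService clientPool = Executors.newFixedThreadPool(10); // plays the role of maxPoolSize
        Socket socket = new Socket("localhost", 9999); // placeholder endpoint
        DataInputStream in = new DataInputStream(socket.getInputStream());
        // The lone "accept thread": read, hand off, go back to reading.
        while (true) {
            int length = in.readInt();
            byte[] message = new byte[length];
            in.readFully(message);
            clientPool.execute(() -> handle(message));
        }
    }

    private static void handle(byte[] message) {
        // processing happens on a pool thread, not the read thread
    }
}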
This is from http://download.oracle.com/javase/tutorial/uiswing/concurrency/worker.html; hope it helps.
When a Swing program needs to execute a long-running task, it usually uses one of the worker threads, also known as the background threads. Each task running on a worker thread is represented by an instance of javax.swing.SwingWorker. SwingWorker itself is an abstract class; you must define a subclass in order to create a SwingWorker object; anonymous inner classes are often useful for creating very simple SwingWorker objects.
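A minimal sketch of such a subclass (method bodies are placeholders):

import javax.swing.SwingWorker;

public class LoadTask extends SwingWorker<String, Void> {

    @Override
    protected String doInBackground() throws Exception {
        // Long-running work: runs on a background worker thread, off the
        // Event Dispatch Thread.
        return expensiveComputation();
    }

    @Override
    protected void done() {
        try {
            updateUi(get()); // done() is invoked back on the EDT
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private String expensiveComputation() { return "result"; }

    private void updateUi(String s) { /* update Swing components here */ }
}

Calling new LoadTask().execute() schedules the task. Note, though, that these Swing worker threads are unrelated to the JBoss Remoting WorkerThread# instances discussed above.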