In vert.x, does it make sense to create multiple HttpServer instances in a runtime?

I created a verticle named HttpServerVerticle and had it create an HttpServer instance via vertx.createHttpServer(). Then, in my main verticle, I deployed this HTTP verticle with more than one instance via vertx.deployVerticle("xxx.xxx.HttpServerVerticle", deploymentOptionsOf(instances = 2)).
Does it make sense to have multiple HttpServer instances in a runtime? If it does, why did I not see an error message like "port 8080 is already in use"?

Vert.x will actually round-robin between your HttpServer instances listening on the same port:
When several HTTP servers listen on the same port, vert.x orchestrates the request handling using a round-robin strategy...
So, when [a] verticle is instantiated multiple times as with: vertx run io.vertx.examples.http.sharing.HttpServerVerticle -instances 2, what’s happening? If both verticles would bind to the same port, you would receive a socket exception. Fortunately, vert.x is handling this case for you. When you deploy another server on the same host and port as an existing server it doesn’t actually try and create a new server listening on the same host/port. It binds only once to the socket. When receiving a request it calls the server handlers following a round robin strategy...
Consequently the servers can scale over available cores while each Vert.x verticle instance remains strictly single threaded, and you don’t have to do any special tricks like writing load-balancers in order to scale your server on your multi-core machine.
So it is both safe and encouraged to create multiple HttpServer instances, if required, to scale across cores.
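For reference, a minimal Java sketch of the deployment described in the question (the Kotlin deploymentOptionsOf call does the same thing):

import io.vertx.core.DeploymentOptions;
import io.vertx.core.Vertx;

public class Main {
  public static void main(String[] args) {
    Vertx vertx = Vertx.vertx();
    // Deploy two instances; Vert.x binds the port only once and
    // round-robins incoming requests across the instances.
    vertx.deployVerticle("xxx.xxx.HttpServerVerticle",
        new DeploymentOptions().setInstances(2));
  }
}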

Related

Does gRPC server spin up a new thread for each request?

I tried profiling a gRPC Java server, and I see the following thread pools, mainly:
grpc-default-executor threads: one created for each incoming request.
grpc-default-worker-ELG threads: seemingly there to listen for incoming gRPC requests and hand them off to the "grpc-default-executor" threads above.
Overall, is the gRPC Java server Netty-style or Jetty/Tomcat-style? Or can it be configured to run in both ways?
gRPC Java server is exposed closer to Jetty/Tomcat style, except that it is asynchronous. That is, in normal Servlets each request consumes a thread until it is complete. While newer Servlet versions let you detach from the dedicated thread and continue work asynchronously (freeing the thread for other use) that is more uncommon. In gRPC you are free to work in either style. Note that gRPC uses a cachedThreadPool by default to reuse threads; on server-side it's a good idea to replace the default executor with your own, generally fixed-size, pool via ServerBuilder.executor().
Internally gRPC Java uses the Netty-style. That means fully non-blocking. You may use ServerBuilder.directExecutor() to run on the Netty threads. Although in that case you may want to specify the NettyServerBuilder.bossEventLoopGroup(), workerEventLoopGroup(), and for compatibility channelType().
As far as I know, you can specify directExecutor() when building the gRPC server/client, which ensures all work is done on the I/O thread and so threads are shared. This is not the default, for safety reasons: you have to be very careful about what you do on the I/O thread (for example, you should never block there).
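As a minimal sketch of the executor advice above, here is how one might install a fixed-size pool when building the server (MyServiceImpl and the port are placeholders, not from the question):

import java.io.IOException;
import java.util.concurrent.Executors;
import io.grpc.Server;
import io.grpc.ServerBuilder;

public class GrpcServerMain {
  public static void main(String[] args) throws IOException, InterruptedException {
    Server server = ServerBuilder.forPort(8080)
        // Replace the default cachedThreadPool with a bounded pool
        .executor(Executors.newFixedThreadPool(16))
        .addService(new MyServiceImpl()) // hypothetical generated service implementation
        .build()
        .start();
    server.awaitTermination();
  }
}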

Vertx | Global state of Verticles in a cluster

Newbie alert.
I'm trying to write a simple module in Vert.x that polls the database (Postgres) every 10 seconds and pushes the results to clients. I'm thinking of confining the blocking code (queries to the database via JDBC) to a worker verticle, with the rest of the layers above it completely non-blocking and async.
This module will be packaged as a jar and distributed to different apps (typically webapps) which can subscribe to the event bus via the JavaScript bridge.
My question is: in a clustered environment where I have 5 processes of the webapp running with the Vert.x modules, how can I ensure that only one verticle queries the database? I don't want all the verticles querying the database and adding more load. Or is there a different way to think about solving this problem? I'm using Vert.x version 3.4.1.
So there are two ways your verticle can be multiplied:
If you instantiate multiple instances when you deploy your verticle
If you cluster your Vert.x instances across different JVMs or different hosts
You could try to control the number of instances of the verticle which executes the query, i.e. ensure that this verticle exists in only one of your Vert.x instances and is deployed with only one instance.
But this has several drawbacks:
your deployment is not transparent, meaning your cluster nodes differ in their deployment structure.
if the cluster node where the query verticle is running dies, you have no fallback.
So the best approach is to deploy the verticle on all instances and synchronize them.
I see 3 possibilities:
1) Use Hazelcast (the cluster manager of Vert.x) to synchronize (see the sketch after this list): http://vertx.io/docs/apidocs/io/vertx/spi/cluster/hazelcast/HazelcastClusterManager.html#getLockWithTimeout-java.lang.String-long-io.vertx.core.Handler- There are also data structures available which are synchronized across the cluster: http://vertx.io/docs/apidocs/io/vertx/spi/cluster/hazelcast/HazelcastClusterManager.html#getSyncMap-java.lang.String-
2) Use your database as the synchronization point. You could add a simple table which stores the last execution time in millis. The polling modules first check whether it is time to execute the next poll; the module that executes the poll also updates the time. This has to be done in one transaction, with an explicit lock on the time table.
3) Use Redis with the GETSET functionality (https://redis.io/commands/getset). You can store the time in millis in a key and use GETSET to ensure the update of the time is atomic, so only the polling module that managed to set the key in Redis executes the poll.
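A minimal sketch of option 1, assuming a clustered Vert.x 3.x instance; the lock name and the pollDatabase() helper are hypothetical:

// requires io.vertx.core.shareddata.Lock
vertx.sharedData().getLockWithTimeout("db-poll-lock", 5000, res -> {
  if (res.succeeded()) {
    Lock lock = res.result();
    pollDatabase(); // hypothetical method that runs the query and publishes the results
    lock.release();
  }
  // if the lock could not be acquired, another node is currently polling
});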
I'm giving my naive solution here; I don't know if it completely solves your problem, but here is my thought process.
1) For the polling bit: yes indeed, you can have a worker verticle for blocking calls [or you could go async here too, IMHO, since an async Postgres JDBC client already exists] for the every-10-seconds part. A code snippet like this can help:
vertx.setPeriodic(10000, id -> {
  // This handler will get called every 10 seconds
  JsonObject jdbcObject = fetchFromJdbc(); // placeholder for the actual JDBC query
  vertx.eventBus().publish("INTERESTED_PARTIES", jdbcObject);
});
2) For the listening part, all the other verticles can subscribe to the event bus, listen on that address, and receive the message whenever something happens, for example:
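A subscriber on the same address might look like this (a sketch matching the snippet above):

vertx.eventBus().consumer("INTERESTED_PARTIES", message -> {
  JsonObject body = (JsonObject) message.body();
  // react to the freshly polled data here
});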
3) This is for the ensuring part, so that not all running instances of your jar start polling the database. I think the best way to handle this would be not to deploy the verticle in any jar, but to run it standalone using the vertx command at runtime, like:
vertx run DatabasePoller.java -cluster
And if you really want to be fancy, you could throw in Service Discovery for the ensuring part: if the verticle's service is already registered, no other deployments would trigger registrations.
But I want to give you a thumbs-up for considering events; it is a much better way of handling inter-system communication.

Akka Cluster: calling actors by path

My use case is that I want to set up a cluster of nodes which run Akka actors. Each actor would be an instance of the same actor class, handling a WebSocket connection to a certain user.
Each actor would register itself with a unique path. On a non-clustered setup I can simply call an actor by its path like system.actorSelection(s"user/$client") where $client is a unique name to an actor instance. I have to pass messages to these actors so they can then send it back to their respective WebSocket client.
Apparently Akka Cluster offers a variety of setup: http://doc.akka.io/docs/akka/current/scala/cluster-usage.html
I want to run my nodes on Kubernetes, where I can't reliably configure instance names/domains, as instances will be coming and going.
What is the simplest set up for Akka Cluster in this scenario?
Kubernetes shouldn't have any impact here. For your case, I think Akka Cluster Sharding is exactly what you need: use the shard region to get the proper sharded actor and send messages to it directly. Each Docker instance just needs to join the cluster as a node; then you don't need a fixed address to find an actor, and instances can dynamically join and leave without issue.
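A hedged sketch of that setup with the classic Java Cluster Sharding API; ClientActor, ClientEnvelope, and the shard count of 100 are hypothetical illustration names, not from the question:

import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;
import akka.cluster.sharding.ClusterSharding;
import akka.cluster.sharding.ClusterShardingSettings;
import akka.cluster.sharding.ShardRegion;

public class ShardingExample {
  // Envelope carrying the target client id plus the payload
  public static final class ClientEnvelope {
    public final String clientId;
    public final Object payload;
    public ClientEnvelope(String clientId, Object payload) {
      this.clientId = clientId;
      this.payload = payload;
    }
  }

  public static void main(String[] args) {
    ActorSystem system = ActorSystem.create("cluster");

    ShardRegion.MessageExtractor extractor = new ShardRegion.MessageExtractor() {
      @Override public String entityId(Object message) {
        return ((ClientEnvelope) message).clientId;
      }
      @Override public Object entityMessage(Object message) {
        return ((ClientEnvelope) message).payload;
      }
      @Override public String shardId(Object message) {
        return String.valueOf(Math.abs(entityId(message).hashCode()) % 100);
      }
    };

    ActorRef shardRegion = ClusterSharding.get(system).start(
        "Client",                        // type name
        Props.create(ClientActor.class), // hypothetical per-connection actor class
        ClusterShardingSettings.create(system),
        extractor);

    // The shard region routes the message to the right actor,
    // wherever it lives in the cluster:
    shardRegion.tell(new ClientEnvelope("client-42", "hello"), ActorRef.noSender());
  }
}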

How to make a Vert.x server serve requests in parallel?

How can I make a Vert.x server serve requests in parallel? If, say, 50 users submit HTTP requests to the Vert.x server, I want all of their requests to be served in parallel.
Asking in the context of the Vert.x 2 manual.
As far as I know, it is the same as in Vert.x 3: HTTP servers handle requests in parallel, unless you block the event loop.
For all 50 user requests to be served in parallel, run your verticle with an increased number of instances, which will scale your application.
Run 50 instances of the Java verticle:
vertx run MyVerticle.java -instances 50
From vert.x manual,
-instances : The number of instances of the verticle to instantiate in the Vert.x server. Each verticle instance is strictly single threaded so to scale your application across available cores you might want to deploy more than one instance.
Analogy: one user request / one verticle instance.
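To illustrate the "unless you block the event loop" caveat from the first answer, a minimal sketch assuming Vert.x 3.8+ (where executeBlocking takes a Promise); slowJdbcCall() is a hypothetical blocking method:

vertx.createHttpServer().requestHandler(req -> {
  // Offload the blocking work so the event loop stays free for other requests
  vertx.executeBlocking(promise -> {
    promise.complete(slowJdbcCall()); // hypothetical blocking call
  }, res -> req.response().end(String.valueOf(res.result())));
}).listen(8080);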

Distributed Actors in Akka

I'm fairly new to Akka and new to distributed programming in general. Using Akka's Mist component, I've created supervised actors to handle HTTP requests asynchronously. Everything is currently running on one physical machine with local actors. What I don't understand is how to build a truly fault-tolerant system with more than one box. As stated in the Akka docs:
Also, you (usually) need to know if one box is down and/or the service you are talking to on the other box is down. Here actor supervision/linking is a critical tool for not only monitoring the health of remote services, but to actually manage the service, do something about the problem if the actor or node is down. Such as restarting actors on the same node or on another node.
How do I do this? I'm looking for an example or pointers on how to begin making my application distributed. Other services in our group use Apache gateways in front of multiple Tomcat instances, so the event of a Tomcat server going down is transparent to the user. I'm deploying my service to the Akka microkernel and need to achieve a similar level of high availability across more than one physical box.
I'm using Akka 1.1.3.
Remote supervision works only with client-managed remote actors for the Akka 1.x series.
Akka 2.0 that is currently under development will support transparent clustering, cluster-wide supervision and cluster-wide lifecycle monitoring.
You might consider putting an HTTP load balancer in front of Akka Microkernel instances running Mist, this would match what your group does with 'Apache gateways'.
Another approach would be to expose remote actors on a number of instances and then use Akka's LoadBalancer or Actor Pool to send messages around, see here
The second approach is a bit of a pain if you have a dynamic pool of machines, because the pool needs to be specified programmatically. Akka 2.0 addresses this with cluster support that is set up in the akka.conf file.
As far as the release date of 2.0 goes, for what it's worth, 1.2 was just recently released on 2011-Sept-19.