Vertx | Global state of Verticles in a cluster - vert.x

Newbie alert.
I'm trying to write a simple module in Vertx that polls the database (PostGres) every 10 seconds and pushes the results to the clients. I'm thinking of confining the blocking code (queries the database via JDBC) in a worker verticle and rest of the above layers are completely non-blocking and async.
This module will be packaged as a jar and distributed to a different apps (typically webapps) which can subscribe to the event bus via the javascript bridge.
My question here is in a clustered environment where I have 5 processes of the webapp running with the vertx modules, how can I ensure that there's only one vertx verticle querying the database. I don't want all the verticles querying the database and add more load. Or is there a different way to think to solve this problem. I'm using Vertx version 3.4.1

So there are 2 ways how your verticle can be multiplied:
If you instantiate multiple instances when you deploy your verticle
If you start to cluster your vert.x instances in different jvm's or different hosts
You could try to control the number of instances of your verticle which executes the query. Means you ensure, that the verticle only exists in one of your vert.x instances and your verticle is deployed with only one instance.
But this has several drawbacks:
your deployment is not transparent, means your cluster nodes differ in the deployment structure.
if your cluster node dies, where the query verticle is running, then you have no fallback.
So the best thing is, to deploy the verticle on all instances and synchronize it.
I see 3 possibilites:
Use hazelcast (the clustermanager of vert.x) to synchronize
http://vertx.io/docs/apidocs/io/vertx/spi/cluster/hazelcast/HazelcastClusterManager.html#getLockWithTimeout-java.lang.String-long-io.vertx.core.Handler-
There are also datastructures available, which are synchronized over
the cluster
http://vertx.io/docs/apidocs/io/vertx/spi/cluster/hazelcast/HazelcastClusterManager.html#getSyncMap-java.lang.String-
Use your database as synchronization point. you could add a simple
table which stores the last execution time in millis. The polling
modules, will first check if it is time to execute the next poll. If
the polling module executes the poll it also updates the time. This
has to be done in one transaction with a explicit lock on the time
table.
You use redis with the https://redis.io/commands/getset
functionality. You can store the time in millis in a key and ensure
with the getset method, that the upgrade of the time is atomic. So only the polling module which could set the key in redis, will execute the poll.

I'm giving out my naive solution here, I don't know if it would completely solve your problem or not but here is my thought process.
1) Polling bit, yes indeed you can have a worker verticle for blocking call's [ or else you could use Async bit here too IMHO because you already have Async Postgress JDBC client ] for the every 10secs part. code snippet like this can help you
vertx.setPeriodic(10000, id -> {
// This handler will get called every 10 seconds
JsonObject jdbcObject = fetchFromJdbc();
eventBus.publish("INTRESTED_PARTIES", jdbcObject);
});
2) For the listening part all the other verticles can subscribe to event bus and listen for the that address and would be getting the message whenever things would happen
3) This is for ensuring part that not all running instances of your jar start polling the database, for this I think the best possible way to handle would be not deploying the verticle in any jar and running the verticle in an standalone way using runtime vertx command like
vertx run DatabasePoller.java -cluster
And if you really want to be very fancy you could throw in Service Discovery for ensuring part that if the service of the verticle is already register then no other deployments would trigger registrations.
But I want to give you thumbs up on considering the events for getting that information much better way for handling inter-system communication.

Related

How to create a cron job in a Kubernetes deployed app without duplicates?

I am trying to find a solution to run a cron job in a Kubernetes-deployed app without unwanted duplicates. Let me describe my scenario, to give you a little bit of context.
I want to schedule jobs that execute once at a specified date. More precisely: creating such a job can happen anytime and its execution date will be known only at that time. The job that needs to be done is always the same, but it needs parametrization.
My application is running inside a Kubernetes cluster, and I cannot assume that there always will be only one instance of it running at the any moment in time. Therefore, creating the said job will lead to multiple executions of it due to the fact that all of my application instances will spawn it. However, I want to guarantee that a job runs exactly once in the whole cluster.
I tried to find solutions for this problem and came up with the following ideas.
Create a local file and check if it is already there when starting a new job. If it is there, cancel the job.
Not possible in my case, since the duplicate jobs might run on other machines!
Utilize the Kubernetes CronJob API.
I cannot use this feature because I have to create cron jobs dynamically from inside my application. I cannot change the cluster configuration from a pod running inside that cluster. Maybe there is a way, but it seems to me there have to be a better solution than giving the application access to the cluster it is running in.
Would you please be as kind as to give me any directions at which I might find a solution?
I am using a managed Kubernetes Cluster on Digital Ocean (Client Version: v1.22.4, Server Version: v1.21.5).
After thinking about a solution for a rather long time I found it.
The solution is to take the scheduling of the jobs to a central place. It is as easy as building a job web service that exposes endpoints to create jobs. An instance of a backend creating a job at this service will also provide a callback endpoint in the request which the job web service will call at the execution date and time.
The endpoint in my case links back to the calling backend server which carries the logic to be executed. It would be rather tedious to make the job service execute the logic directly since there are a lot of dependencies involved in the job. I keep a separate database in my job service just to store information about whom to call and how. Addressing the startup after crash problem becomes trivial since there is only one instance of the job web service and it can just re-create the jobs normally after retrieving them from the database in case the service crashed.
Do not forget to take care of failing jobs. If your backends are not reachable for some reason to take the callback, there must be some reconciliation mechanism in place that will prevent this failure from staying unnoticed.
A little note I want to add: In case you also want to scale the job service horizontally you run into very similar problems again. However, if you think about what is the actual work to be done in that service, you realize that it is very lightweight. I am not sure if horizontal scaling is ever a requirement, since it is only doing requests at specified times and is not executing heavy work.

Load balancer and celery result backends

I have a task that takes approximately 3 minutes to run. It pulls data from a remote server and makes cpu-intensive analysis on it. This task will be invoked by an api call. Upon the api call, i am planning to give client a unique task id and assign the task to a celery worker. Then the client will poll the server with the given task id to see if the task is completed by celery worker and its result it saved to a result backend. I think of using nginx, gunicorn, flask and dockerize them for a easy deploy in case i need to distribute this architecture across multiple machines.
The problem is that the client may poll different servers due to load balancer and if not handled well, the polled server’s celery’s result backend might not have the task’s result but other server’s celery result backend has it.
Is it possible to use a single result backend over multiple celery instances and make different celery instances wuery the same result backend? What might be other possible ways to solve this other than using cloud storage like S3?
Would I have this problem only if I have multiple machines or would it happen even if I have multiple gunicorn instances in a single machine where nginx acts as a load balancer on them?
Not that it is possible to use a single result backend by all Celery workers, but that is the only setting that makes sense! Same goes for the broker in most cases, unless you have a complicated Celery infrastructure with exchanges, and complicated routes...

How to make Vertx server to serve request in parallel?

How to make Vertx server to serve request in parallel? If lets say there are 50 user who are submitting HTTP request to vertx server then I want all user request should be served in parallel ?
Asking in context of vertx 2 manual
As far as I know, it is the same than vertx 3: http servers handle requests in parallel, unless you block the event loop.
For all 50 user requests to be served in parallel, run your verticle with increased number of instances which will in short scale your application.
Run 50 instances of Java verticle,
vertx run MyVerticle.java -instances 50
From vert.x manual,
-instances : The number of instances of the verticle to instantiate in the Vert.x server. Each verticle instance is strictly single threaded so to scale your application across available cores you might want to deploy more than one instance.
Analogy : one user request/one vert.x instance

JBoss 7.1.1 and the EJB 3.1 Timer Service

I am thinking about porting a Spring Quartz based application to EJB 3.1 to see if EJB has improved. I am having problems understanding how fail-over works with the Schedule Timer Service. In Quartz, there are database tables which clustered Quartz instances use. If one node in your cluster crashes, jobs will still get executed on other nodes.
I have been looking at how the Timer Service persists things and it appears to use the file system of the server the Timer was created on. Is this true? I do not see how this would be possible as it would render the Timer Service unusable since it would not support failover.
So i must be missing something. Can anyone help me out with this?
The EJB timer service is simply not as advanced as Quartz (with or without Spring).
EJB timers are persisted to an unknown location. It may happen to be the file-system, but it could also be the Windows registry if you happen to be running on Windows, or it could be an LDAP server or whatever.
There was an issue on the EJB spec JIRA for some time about this, and it was discussed on the spec mailing list, but then it was brutally dropped and closed because no one bothered to reply anyone (perhaps because a lot of people were on vacation at the time). It's one of the lamest reasons to close an issue if you'd ask me, but I guess the spec lead sometimes must resort to such measures.
Anyway, in JBoss AS persisting happens to an embedded relational datasource, that on its turn writes to the filesystem. Via propriatary configuration you can point this datasource to any remote DB. Fail-over would have to come from propriatary JBoss functionality as well. Although EJB forbids lots of things for the sake of potential clustering, there's no explicit clustering support in the spec and thus specifically EJB timers are not cluster aware.
Not sure if this was available at the time of the question but you can use the 'cluster-ha-singleton' for this, it allows you to create a singleton timer that is invoked from a single cluster node, in case of failover of the chosen node a new node is elected to run the singleton (and therefore the timers)
http://www.jboss.org/quickstarts/eap/cluster-ha-singleton/
It mentions EAP but I am running on AS 7.2.0 fine, the jars are already included in /modules/org/jboss/

Spring schedulers in a load balanced environment

I have multiple quartz cron jobs in a load balanced environment. Currently these jobs are running on each node, which is not desirable. I want a node to run only a particular scheduler and if the node crashes, another node should run the scheduler intended for the node that crashed.
How can this be done with spring 2.5.6/tomcat load balancer.
I think there's a few aspects to this question.
Firstly, Quartz has API methods for pausing and resuming the Scheduler, or even individual triggers and jobs
e.g.
http://www.jarvana.com/jarvana/view/opensymphony/quartz/1.6.1/quartz-1.6.1-javadoc.jar!/org/quartz/Scheduler.html#standby()
I would create a spring bean with a reference to the Quartz scheduler or trigger, and a simple isMasterNode boolean member for storing state. I'd then expose 2 [restricted-access] web service calls: makeMaster and makeSlave, which will call Scheduler.resume() or standby/pause, respectively.
Finall, the big question is how & with what you determine that another node has 'crashed'.
If you're using a hardware loadbalancer to manage this, you could configure it to call the 'makeMaster' WS on the new 'primary' node, which in turn calls Scheduler.resume() or similar.
hth