Axon Server auto-scaling split/merge delay - kubernetes

I am implementing auto-scaling in an application using Axon Server, and running in k8s.
I have created REST endpoints in the application itself, which look at the local configuration (for processors and thread counts) and then speak to the Axon Server REST API in order to split/merge the processors appropriately. The intent is to use container lifecycle hooks to trigger them.
As a result, if a new instance (pod) of an application is launched, configured for 2 threads on ProcessorA, then my code will make 2 requests to the /v1/components/blah/processors/ProcessorA/segments/split?context=default endpoint on the server, to make full use of the 2 new threads.
Likewise, when the pod is shut down, it makes 2 similar requests to the merge endpoint on the server.
When scaling up I see the processor split twice, as expected. However, on shutdown I don't see the merge twice unless I put a long (5s) wait between requests. This isn't likely to be particularly stable, so I'm wondering if there's something else I need to be doing.
Perhaps I ought to request the merge, then loop waiting for it to occur, then request another. This seems like it's going to be excessively slow.
There was another somewhat related question on SO, Automatically scale Axon's tracking event processors, where Steven commented that there was no inbuilt auto-scaling in Axon Server at that point in time. I've not seen anything more recent either.

As it stands, work is underway to improve the split/merge functionality. For one, the result of a split/merge will be returned, which has been resolved under issue #1001.
This should mean you no longer have to wait for the statuses to be updated, which is likely why it (seems to) take so long. This functionality will be part of Axon Framework / Server 4.4, by the way, which should be released relatively soon.
Subsequently, discussions are still underway to allow for auto-scaling. One requirement deemed important is the capability of a TrackingEventProcessor to process several segments per thread (issue #1434). This will ensure that the TEP can take over several segments to transition the boundary when scaling, for example.
Eventually though, Axon Server should be able to do this for you. It's just not there yet.
So for now, I think the most pragmatic solution is indeed to wait for the result to show up in the statuses. As said, I trust 4.4 will improve upon this by returning the result of the split/merge operation when called. Lastly, the Axon team is aware this can be improved upon further, hence discussions on the matter are underway.
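To make that concrete, here is a rough sketch of such a shutdown hook using the JDK's built-in HttpClient (Java 11+). The split/merge paths are taken from the question; the PATCH verb, the status endpoint, and the way segments are counted from its response are assumptions you would need to verify against your Axon Server version's REST API:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ScaleDownHook {

    private static final HttpClient HTTP = HttpClient.newHttpClient();
    private static final String SERVER = "http://axonserver:8024"; // assumed address

    // Called from the pre-stop lifecycle hook: one merge per local thread.
    public static void mergeForShutdown(String component, String processor,
                                        String context, int threads) throws Exception {
        for (int i = 0; i < threads; i++) {
            int before = segmentCount(component, context);
            patch("/v1/components/" + component + "/processors/" + processor
                    + "/segments/merge?context=" + context);
            // Poll (with a cap) until the merge is reflected in the status,
            // instead of sleeping a fixed 5s between requests.
            for (int tries = 0; tries < 40 && segmentCount(component, context) >= before; tries++) {
                Thread.sleep(250);
            }
        }
    }

    private static void patch(String path) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(SERVER + path))
                .method("PATCH", HttpRequest.BodyPublishers.noBody())
                .build();
        HTTP.send(request, HttpResponse.BodyHandlers.discarding());
    }

    // Assumption: a status endpoint listing processors and their segments as
    // JSON. This crudely counts "segmentId" occurrences in the raw body; real
    // code would parse the JSON and filter on the processor name.
    private static int segmentCount(String component, String context) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(SERVER
                + "/v1/components/" + component + "/processors?context=" + context))
                .GET().build();
        String body = HTTP.send(request, HttpResponse.BodyHandlers.ofString()).body();
        int count = 0, idx = 0;
        while ((idx = body.indexOf("segmentId", idx)) >= 0) {
            count++;
            idx += "segmentId".length();
        }
        return count;
    }
}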

Related

How to persist and replay NestJS CQRS event and saga across restart?

I am making an application which will need to use NestJS' CQRS module, as the requirements naturally lend themselves to that pattern.
Updates to the application logic are expected to be frequent and to happen during busy hours (that's just how my management works...), so the application needs to be able to restart gracefully. However, this means that events started just before the shutdown may not finish, or even if they do, some sagas may not trigger because some of their events happened before the restart... I'd like to ensure that doesn't happen.
I'm aware of NestJS' OnApplicationShutdown and OnApplicationBootstrap hooks, which are exactly for this purpose, but what I'm not sure about is what I should do there. How can I capture all events that have unfinished handlers and sagas? Then after a restart, how can I make the event bus aware of the events monitored by sagas, without re-executing the already-executed handlers?
I guess the second part could be worked around with a random ID per event/handler combo that is looked up in a log: if present, the handler is skipped; if not, it is executed and added to the log... But even with such a workaround, I don't see how I could do the first part. There will be a lot of events, and sagas (by definition) execute commands, meaning they have side effects... Even if all commands can become idempotent, the sheer quantity of events and the frequent restarts mean that restarting from the very first command is a no-go.
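To pin down that dedup idea, here is a minimal sketch. The pattern is framework-agnostic (shown in Java rather than TypeScript purely for illustration), and all names are made up:

import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class DedupingDispatcher {

    // In practice this must be persistent (e.g. a DB table keyed on
    // eventId + handler), or it won't survive the very restart it exists for.
    private final Set<String> processedLog = ConcurrentHashMap.newKeySet();

    // Runs the handler only if this (eventId, handlerName) pair is new.
    public void dispatch(String eventId, String handlerName, Runnable handler) {
        // add() returns false if the key was already present: already handled.
        if (processedLog.add(eventId + ":" + handlerName)) {
            handler.run();
        }
    }
}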
I've seen this package but I'm not sure if it solves this particular use case, or if it's really just logging the events, and pretty much nothing more.

OSB: Analyzing memory of proxy service

I have multiple proxies in a message flow. Is there a way in OSB by which I can monitor the memory utilization of each proxy? I'm getting OOM errors and want to investigate which proxy is eating most of the memory.
Thanks!
If you're getting OOME, then it's either because a proxy is not freeing up all the memory it uses (so it will eventually fail even with one request at a time), or because you use too much memory per invocation, so it dies over a certain load threshold but is fine under low load. Do you know which it is?
Either way, you will want to generate a heap dump on OOME so you can investigate what's going on. It's annoying but sometimes necessary. A colleague had to do that recently to fix some issues (one problem was an SB-transport platform bug, one was a thread-starvation issue due to a platform work manager bug, and the last one was due to a Muxer bug when used on Exalogic).
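For a HotSpot JVM, the usual way to capture that dump automatically is to add these flags to the server's start arguments (e.g. via JAVA_OPTIONS); note that JRockit, which some OSB installs run on, has its own mechanism (jrcmd) instead:

java ... -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/path/to/dumps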
If it just performs poorly under load, then you'll need to do the usual OSB optimisations: use fewer Assign steps (but assign more variables per step), and do a lot more in XQuery rather than in proxy steps, especially loops that don't need a service callout, since those can easily be rolled into a for loop in XQuery; you know, all the standard stuff.

Scheduling/delaying of jobs/tasks in Play framework 2.x app

In a typical web application, there are some things that I would prefer to run as delayed jobs/tasks. They tend to have some or all of the following properties:
Takes a long time (anywhere from multiple seconds to multiple minutes to multiple hours).
Occupy some resource heavily (CPU, network, disk, external API limits, etc.)
Result not immediately necessary. Can complete HTTP response without it. OK (and possibly even preferable) to delay until later.
Can be (and is possibly preferable to be) run on a different machine (or machines) than the web server(s). The machine(s) are potentially dedicated job/task runners.
Should be run in response to other event(s), or started periodically.
What would be the preferred way(s) to set up, enqueue, schedule, and run delayed jobs/tasks in a Scala + Play Framework 2.x app?
For more details...
The pattern I have used in the past, and which I would like to replicate if applicable, is:
In handler of web request, or in cron-like call, enqueue job(s)
In job runner(s), repeatedly dequeue and run one job at a time
Possibly handle recording job results
This seems to be a relatively simple yet still relatively flexible pattern.
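To pin down the shape of that pattern, here is a minimal in-process sketch in plain Java (names are made up; a real setup would back the queue with something durable, such as a database or Redis, so that dedicated runner machines can consume it):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class JobQueueSketch {

    public static void main(String[] args) {
        BlockingQueue<Runnable> jobs = new LinkedBlockingQueue<>();

        // The job runner: repeatedly dequeue and run one job at a time.
        Thread runner = new Thread(() -> {
            try {
                while (true) {
                    Runnable job = jobs.take(); // blocks until a job arrives
                    job.run();                  // record job results here too
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        runner.start();

        // Use-case 1: a web request handler enqueues a job.
        jobs.add(() -> System.out.println("analytics call for request 42"));

        // Use-case 2: periodically enqueue a task, which is then treated
        // exactly like any other job by the runner.
        ScheduledExecutorService cron = Executors.newSingleThreadScheduledExecutor();
        cron.scheduleAtFixedRate(
                () -> jobs.add(() -> System.out.println("purge expired sessions")),
                0, 1, TimeUnit.HOURS);
    }
}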
Examples I have encountered in the past include:
Updating derived data in DB
Analytics/tracking API calls for a web request
Delete expired sessions or other stale/outdated DB records
Periodic batch ETLs
In other languages/frameworks, I would typically use a job/task framework. Examples include:
Resque in a Ruby + Rails app
Celery in a Python + Django app
I have found the following existing materials, but unfortunately, I don't think they fit my use case directly.
Play 1.x asynchronous jobs API (+ various SO questions referencing it). Appears to have been removed in the 2.x line, with no reference to what replaced it.
Play 2.x Akka integration. Seems very general-purpose. I'd imagine it's possible to use Akka for the above, but I'd prefer not to write a jobs/tasks framework if one already exists. Also, no info on how to separate the job runner machine(s) from your web server(s).
This SO answer. Seems potentially promising for the "short to medium duration IO bound" case, e.g. analytics calls, but not necessarily for the "CPU bound" case (probably shouldn't tie up CPU on web server, prefer to ship off to different node), the "lots of network" case, or the "multiple hour" case (probably shouldn't leave that in the background on the web server, even if it isn't eating up too many resources).
This SO question, and related questions. Similar to above, it seems to me that this covers only the cases where it would be appropriate to run on the same web server.
Some further clarification on use-cases (as per commenters' request). There are two main use-cases that I have experienced with something like resque or celery that I am trying to replicate here:
Some event on the site (Most often, an incoming web request causes task to be enqueued.)
Task should run periodically. (Most often, this is implemented as: periodically, enqueue task to be run as above.)
In the case of resque or celery, the tasks enqueued by both use-cases enter queues the same way and are treated the same way by the runner/worker process. Barring other Scala or Play-specific considerations, that would be my initial guess for how to approach this.
Some further clarification on why I do not believe the Akka scheduler fits my use case out-of-the-box (as per commenters' request):
While it is no doubt possible to construct a fitting solution using some combination of the Akka scheduler (for periodic jobs), akka-remote and akka-cluster (for communicating between the job caller and the job runner), that approach requires a certain amount of glue code which is almost a delayed job framework in and of itself. If it exists, I would prefer to use an existing out-of-the-box solution rather than reinvent the wheel.

Is my middle-tier MSMQ queue really necessary?

My scenario is this:
I have multiple webservers that:
need to communicate with the backend (IBus.Publish/IBus.Subscribe)
need to communicate with each-other (IBus.Publish/IBus.Subscribe)
Aside from the webservers, I have a number of windows services that consume the same messages.
In order to make this work, I have the webservers send messages to a central hub, whose sole responsibility is to wrap the message in a new message type and publish it to all subscribers.
Can I somehow avoid this, so I can publish the messages directly from the webservers?
EDIT (Added some code) - Current situation:
... WebServer
_bus.Send(new Message{Body="SomethingChanged"});
... Hub
public void Handle(Message message){
_bus.Publish(new WrappedMessage{Message = message});
}
... Handlers (WebServers, WindowsServices etc)
public void Handle(WrappedMessage message){
//Actually do important stuff
}
Wanted situation:
... WebServer
_bus.Publish(new Message{Body="SomethingChanged"});
... Handlers (WebServers, WindowsServices etc)
public void Handle(Message message){
//Do important stuff
}
Well, there isn't anything that technically prevents you from publishing messages inside your web application, and likewise there's nothing that prevents you from subscribing to those messages in all instances of the same web application. The question is whether you should :)
Without knowing the details of your problem, my immediate feeling is that you would be better off using some kind of shared persistent storage for whatever it is that you're trying to synchronize (a cache?), possibly using some kind of read replication if you'd like to scale out and make reads really fast.
Again, without knowing the details of your problem, I'll try and suggest something, and then you can see if that could inspire you into an even better solution... here goes:
Use MongoDB (possibly as a replica set if you want to scale out your read operations) as the persistent storage of the thing you're caching
Whenever something happens in the web application, bus.Send a message to your backend
In your backend message handler, you update Mongo (which automatically will replicate to read slaves)
Whenever you need to query your data, you just query your Mongo set (using slaveOk=true whenever you can accept slightly stale values)
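As an aside, slaveOk is the old driver-level flag; current drivers express the same thing as a read preference. For example, with the MongoDB Java driver (the .NET driver has the same concept; connection string and names here are made up):

import com.mongodb.ReadPreference;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class CacheReads {
    public static void main(String[] args) {
        MongoClient client =
                MongoClients.create("mongodb://node1,node2,node3/?replicaSet=rs0");
        // secondaryPreferred is the modern slaveOk: reads may come from a
        // replica, so slightly stale values must be acceptable.
        MongoCollection<Document> cache = client.getDatabase("app")
                .getCollection("cache")
                .withReadPreference(ReadPreference.secondaryPreferred());
        Document doc = cache.find(new Document("key", "somethingChanged")).first();
        System.out.println(doc);
    }
}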
The reason I'm suggesting this alternative solution is that web applications (at least in .NET land) have this funny transient nature where IIS dictates their lifecycle, and at any given time you can have n instances of them. This complicates matters if you keep state in the application. It makes me think of the web application as a client, not a publisher.
A simpler solution is to keep state in something that does not come & go, e.g. a database. And the reason I'm suggesting Mongo is that my guess is you're worried about being able to serve web requests fast; since MongoDB is fairly easy to install as a replica set where read operations will be pretty fast (and, more importantly, horizontally scalable), my guess is that this setup would make everything much simpler.
How does that sound?

NSOperationQueue many threads

In our iPhone application we have several tabs, and selecting each tab triggers a network connection. In the past we were just detaching a new thread for each connection, and after several very quick tab switches the application would become unresponsive.
Now we have decided to use an operation queue, which is supposed to control the number of threads and keep the application from becoming unresponsive. But now the app becomes unresponsive with even fewer quick switches (although it now recovers from the unresponsiveness more quickly).
I ran the app on a device from Xcode and paused it after several quick switches to see the number of threads. What I found is that there are several threads with the following stack:
0 __workq_kernreturn
2 _init_cpu_capabilities
Any idea what are these threads and how to get rid of them?
One of the big benefits of using NSOperationQueue is that you can forget about threads and let the system worry about that for you. It sounds like the root of your problem is that you've got several operations running simultaneously that are no longer needed. Rather than worrying about the specific threads, consider getting those operations to terminate so that they're no longer using up computing resources.
For what it's worth, my guess is that those threads are being managed by Grand Central Dispatch. GCD will create worker threads to process blocks (and operations), and it'll be as efficient as it can about that.
The important part of your problem does not likely lie in the internal/private implementation of the worker threads. A good implementation will likely employ a thread pool, because creating one thread per operation would cost a lot; operations can reuse and hold on to idle threads.
The important part (likely) lies in your use of the public APIs of the implementation you have chosen.
One obvious technique to employ in this case is operation cancellation: -[NSOperation cancel]. When somebody navigates away from a view which has a pending/unfinished request, simply cancel it (unless you'll need the data for caching).
Many implementations may also benefit from making requests less often. For example: if your server's results only update about once per hour, then it doesn't make sense to request them about every minute.
Last point: a connection can use a worker thread itself; check the APIs you are using to reduce this if it's a problem.