How to run periodic tasks in an Apache Storm topology? - scheduled-tasks

I have an Apache Storm topology and would like to perform a certain action every once in a while. I'm not sure how to approach this in a way which would be natural and elegant.
Should it be a Bolt or a Spout using ScheduledExecutorService, or something else?

Tick tuples are a decent option: https://kitmenke.com/blog/2014/08/04/tick-tuples-within-storm/
Edit: Here's the essential code for your bolt:
@Override
public Map<String, Object> getComponentConfiguration() {
    // configure how often a tick tuple will be sent to our bolt
    Config conf = new Config();
    conf.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, 300);
    return conf;
}
Then you can use TupleUtils.isTick(tuple) in execute to check whether the received tuple is a tick tuple.

I don't know if this is a correct approach, but it seems to be working fine:
At the end of the prepare method of a Bolt, I added a call to intiScheduler(), which contains the following code:
Calendar calendar = Calendar.getInstance();
ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
scheduler.scheduleAtFixedRate(
    new PeriodicAction(),        // class implementing Runnable
    millisToFullHour(calendar),  // initial delay: start at the top of the hour
    60 * 60 * 1000,              // period: run every hour
    TimeUnit.MILLISECONDS);
Use this with caution, though: a bolt can have multiple instances depending on your setup, and each instance would start its own scheduler.
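The millisToFullHour helper is not shown in the answer. Below is a minimal sketch of what it might look like, assuming it should return the delay until the next top of the hour. The class name HourAlignment is hypothetical, and the body is a guess at the intent rather than the author's actual code.

```java
import java.util.Calendar;

public class HourAlignment {

    // Returns the milliseconds from `now` until the next full hour.
    // If `now` is exactly on the hour, this returns one full hour.
    static long millisToFullHour(Calendar now) {
        Calendar next = (Calendar) now.clone();
        next.add(Calendar.HOUR_OF_DAY, 1);
        next.set(Calendar.MINUTE, 0);
        next.set(Calendar.SECOND, 0);
        next.set(Calendar.MILLISECOND, 0);
        return next.getTimeInMillis() - now.getTimeInMillis();
    }

    public static void main(String[] args) {
        Calendar now = Calendar.getInstance();
        System.out.println("Delay until the next full hour (ms): " + millisToFullHour(now));
    }
}
```

The returned value would be passed as the initial delay to scheduleAtFixedRate, so the first run lands on the hour and later runs follow at the fixed period.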

Related

Use kafka to detect changes on values

I have a streaming application that continuously takes in a stream of coordinates along with some custom metadata that also includes a bitstring. This stream is produced onto a Kafka topic using the Producer API. Another application needs to process this stream (using the Streams API), store a specific bit from the bitstring, and generate alerts when this bit changes.
Below is the continuous stream of messages that need to be processed:
{"device_id":"1","status_bit":"0"}
{"device_id":"2","status_bit":"1"}
{"device_id":"1","status_bit":"0"}
{"device_id":"3","status_bit":"1"}
{"device_id":"1","status_bit":"1"} // need to generate alert with change: 0->1
{"device_id":"3","status_bit":"1"}
{"device_id":"2","status_bit":"1"}
{"device_id":"3","status_bit":"0"} // need to generate alert with change 1->0
Now I would like to write these alerts to another kafka topic like
{"device_id":1,"init":0,"final":1,"timestamp":"somets"}
{"device_id":3,"init":1,"final":0,"timestamp":"somets"}
I can save the current bit in the state store using something like
streamsBuilder
.stream("my-topic")
.mapValues((key, value) -> value.getStatusBit())
.groupByKey()
.windowedBy(TimeWindows.of(Duration.ofMinutes(1)))
.reduce((oldAggValue, newMessageValue) -> newMessageValue, Materialized.as("bit-temp-store"));
but I am unable to understand how I can detect a change from the existing bit. Do I need to query the state store somehow inside the processor topology? If yes, how? If not, what else could be done?
Any suggestions or ideas that I can try (maybe completely different from what I am thinking) are also appreciated. I am new to Kafka, and thinking in terms of event-driven streams is eluding me.
Thanks in advance.
I am not sure this is the best approach, but in a similar task I used an intermediate entity to capture the state change. In your case it would be something like
streamsBuilder.stream("my-topic").groupByKey()
    .aggregate(DeviceState::new, new Aggregator<String, Device, DeviceState>() {
        public DeviceState apply(String key, Device newValue, DeviceState state) {
            // flag the state only when this record's bit differs from the stored one
            state.setChanged(!newValue.getStatusBit().equals(state.getStatusBit()));
            state.setStatusBit(newValue.getStatusBit());
            state.setDeviceId(newValue.getDeviceId());
            state.setKey(key);
            return state;
        }
    }, TimeWindows.of(…) …).filter((s, t) -> (t.changed())).toStream();
The resulting topic will contain the changes. You can also add some attributes to DeviceState to initialise it first, depending on whether you want to send an event when the first record for a device arrives, etc.
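The compare-with-previous-value idea can be tried without Kafka at all. The following is a minimal sketch in plain Java, assuming a HashMap stands in for the state store and that the first record for a device only initialises the state. The class BitChangeDetector and its methods are hypothetical, and the timestamp field of the alert is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BitChangeDetector {

    // Last status bit seen per device; stands in for a Kafka Streams state store.
    private final Map<String, String> lastBit = new HashMap<>();

    // Returns an alert if the bit changed for this device, otherwise null.
    // The first record for a device only initialises the state and raises no alert.
    public String process(String deviceId, String statusBit) {
        String previous = lastBit.put(deviceId, statusBit);
        if (previous != null && !previous.equals(statusBit)) {
            return "{\"device_id\":" + deviceId
                    + ",\"init\":" + previous
                    + ",\"final\":" + statusBit + "}";
        }
        return null;
    }

    // Replays the sample messages from the question and collects the alerts.
    public static List<String> replaySample() {
        String[][] messages = {
            {"1", "0"}, {"2", "1"}, {"1", "0"}, {"3", "1"},
            {"1", "1"}, {"3", "1"}, {"2", "1"}, {"3", "0"}
        };
        BitChangeDetector detector = new BitChangeDetector();
        List<String> alerts = new ArrayList<>();
        for (String[] m : messages) {
            String alert = detector.process(m[0], m[1]);
            if (alert != null) {
                alerts.add(alert);
            }
        }
        return alerts;
    }

    public static void main(String[] args) {
        replaySample().forEach(System.out::println);
    }
}
```

Replaying the sample yields one alert for device 1 (0 to 1) and one for device 3 (1 to 0). In a real topology the map would be replaced by a fault-tolerant store keyed by device id, so the previous bit survives restarts.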

How to set sequence numbers manually in QuickFixJ?

I'm acting as an acceptor. Is there a way to set sequence numbers manually?
The first idea I had was to modify the .seqnums files, but it does not work.
Google mentions the existence of setNextSenderMsgSeqNum and setNextTargetMsgSeqNum methods; however, I can't tell on which object I should call them (using quickfixj 1.4).
I'm aware that setting sequence numbers by hand is discouraged and that there are a bunch of flags like ResetOnLogon and ResetOnDisconnect, but I have no control over the initiator, and there are a bunch of other acceptors (test tools) which are using the same session.
Application myApp = new FIXSender();
settings = new SessionSettings(sessionConfig);
MessageFactory messageFactory = new MessageFactory();
MessageStoreFactory storeFactory = new FileStoreFactory(settings);
LogFactory logFactory = new FileLogFactory(settings);
Acceptor acceptor = new SocketAcceptor(myApp, storeFactory, settings, logFactory, messageFactory);
acceptor.start();
First of all, you need to explore the QuickFIX/J code to see how it is done.
Secondly, what is the reason for using such an old version of QuickFIX/J? Why not upgrade to the most recent version?
Thirdly, you should be very wary of changing sequence numbers if you don't properly understand how they are used in the communication. If you don't, you are guaranteed to get into murky problems.
You can do something like
Session.lookupSession(sessionID).setNextSenderMsgSeqNum(nextSeqNum); // nextSeqNum: the int sequence number to send next
But before you do it, it is very important to understand how sequence numbers are used.
You can also set FIX fields by overriding the toAdmin callback:
@Override
public void toAdmin(Message message, SessionID sessionId) {
    message.setBoolean(ResetSeqNumFlag.FIELD, true);
}

apache storm missing event detection based on time

I want to detect a missing event in a data stream (e.g. detect a customer request that has not been responded to within 1 hour of its reception).
Here, I want to detect that the "Response" event is missing and raise an alert.
I tried using tick tuples by setting TOPOLOGY_TICK_TUPLE_FREQ_SECS, but the frequency is configured at the bolt level, so a tick is not aligned with any particular request and might arrive, say, 15 minutes after a customer request was received.
@Override
public Map getComponentConfiguration() {
    Config conf = new Config();
    conf.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, 1800);
    return conf;
}
^ this doesn't work.
Let me know in comments if any other information is required. Thanks in advance for the help.
This might help: http://storm.apache.org/releases/1.0.3/Windowing.html
You can define 5-minute windows, check the status of the events in the last window, and alert based on what was received.
Alternatively, create an intermediate bolt which maintains these windows and sends normal alert tuples (instead of tick tuples) in case of timeouts.
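The bookkeeping such an intermediate bolt would need can be sketched in plain Java, independent of Storm. This is only an illustration under assumptions: the class PendingRequestTracker and its method names are hypothetical, requests and responses are matched by a request id, and onTick would be called whenever a tick tuple (or a window boundary) arrives.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class PendingRequestTracker {

    private final long timeoutMillis;
    // Request id -> arrival time in epoch milliseconds.
    private final Map<String, Long> pending = new HashMap<>();

    public PendingRequestTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Remember when a request arrived.
    public void onRequest(String requestId, long nowMillis) {
        pending.put(requestId, nowMillis);
    }

    // A response arrived in time, so the request is no longer pending.
    public void onResponse(String requestId) {
        pending.remove(requestId);
    }

    // Called on every tick: returns and forgets requests that have waited too long.
    public List<String> onTick(long nowMillis) {
        List<String> timedOut = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() >= timeoutMillis) {
                timedOut.add(entry.getKey());
                it.remove();
            }
        }
        return timedOut;
    }

    public static void main(String[] args) {
        PendingRequestTracker tracker = new PendingRequestTracker(60L * 60 * 1000);
        tracker.onRequest("req-1", 0L);
        System.out.println("Timed out after one hour: " + tracker.onTick(60L * 60 * 1000));
    }
}
```

With this structure the tick frequency only bounds how late an alert can be: a 60-second tick with a one-hour timeout raises the alert at most about a minute after the deadline, whereas an 1800-second tick could be up to half an hour late.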

Non blocking REST with Spring Boot and Java 8

I need assistance.
An issue with one of my endpoints timing out is causing me distress.
I did some performance tweaking with SQL and other REST services I am using but it only helps a little bit.
A nice solution for this problem, I thought, would be to use some of the Async capabilities of Spring Boot and Java 8 and perform some sort of "Fire and forget" action.
I tried something like that, but it is no good: the "Time to rock!" message gets printed all right, but it seems that the getLyrics() method is not invoked at all!
//MyController.java
@GET
@Path("na/na/na")
@Produces({"application/json"})
public Response getLyrics() {
    final String lyrics = delegate.getLyrics();
    return Response.ok().entity(lyrics.isEmpty()).build();
}
//MyDelegate.java
@Async("threadPoolTaskExecutor")
public Future<Boolean> getLyrics() {
    LOG.info("Time to rock!");
    final boolean result = lyricsService.getLyrics();
    return new AsyncResult<Boolean>(result);
}
//MyAsyncConfig.java
@Configuration
@EnableAsync
public class MyAsyncConfig {
    @Bean(name = "threadPoolTaskExecutor")
    public Executor threadPoolTaskExecutor() {
        return new ThreadPoolTaskExecutor();
    }
}
So, lyricsService.getLyrics() (that is not being called for some reason) does all the work, calls other services, fetches stuff from the SQL database, and performs calls against some other REST endpoints. All of this takes time and sometimes* causes a timeout. I would like it to process in peace, and if possible, return some sort of response when possible.
I tried several variations of this solution as it seems to be close to what I need, but can't seem to get why it is not working for me.
*often
I think Spring futures have to wait until the operation is done, but Java's CompletableFuture is more powerful; maybe you can try that.
Future<String> a = ...;
while(!a.isDone()){
}
Here is a sample:
https://spring.io/guides/gs/async-method/
You may use Spring's DeferredResult along with Java 8's CompletableFuture to make your controller non-blocking, delegating the long-running task to one of CompletableFuture's async methods. Here is a working example:
https://github.com/kazi-imran/TransactionStatistics
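To illustrate the CompletableFuture suggestion without any Spring dependency, here is a minimal plain-Java sketch. It assumes the slow work can be wrapped in a Supplier; the class AsyncLyrics and the method slowGetLyrics are hypothetical stand-ins for the delegate and lyricsService.getLyrics() from the question.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncLyrics {

    // Hypothetical stand-in for the slow lyricsService.getLyrics() call.
    static boolean slowGetLyrics() {
        try {
            TimeUnit.MILLISECONDS.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return true;
    }

    // Starts the work on the given executor and returns immediately.
    static CompletableFuture<Boolean> getLyricsAsync(ExecutorService executor) {
        return CompletableFuture.supplyAsync(AsyncLyrics::slowGetLyrics, executor);
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        CompletableFuture<Boolean> future = getLyricsAsync(executor);
        System.out.println("Request accepted, done yet? " + future.isDone());
        // Register a callback instead of busy-waiting on isDone();
        // join() is only here so the demo does not exit early.
        future.thenAccept(result -> System.out.println("Lyrics fetched: " + result)).join();
        executor.shutdown();
    }
}
```

The caller gets the future back right away and reacts in a callback, which avoids the busy-wait loop. In a Spring MVC controller the same callback could complete a DeferredResult; that wiring is not shown here.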

Help and advice needed working with Quartz.NET NthIncludedDayTrigger

I've started using Quartz.NET recently, and so far, it's been really
helpful. Now, I'm trying to use it to create a job that runs once a
month using a NthIncludedDayTrigger (I want to use the
NthIncludedDayTrigger as eventually I will be specifying a calendar to
exclude weekends/holidays).
To familiarise myself with the code, I've
set up a simple console application to create an NthIncludedDayTrigger
where the first fire time will be 15 seconds from now:
static void Main(string[] args)
{
IScheduler scheduler = StdSchedulerFactory.DefaultScheduler;
scheduler.Start();
var jobDetail = new JobDetail("Job name", "Group name", typeof(SomeIJobImplementation));
var trigger = new NthIncludedDayTrigger();
trigger.Name = "Trigger name";
trigger.MisfireInstruction = MisfireInstruction.NthIncludedDayTrigger.DoNothing;
trigger.IntervalType = NthIncludedDayTrigger.IntervalTypeMonthly;
//I'm using the following while experimenting with the code (AddHour(1) to account for BST):
trigger.FireAtTime = DateTime.UtcNow.AddHours(1).AddSeconds(15).ToString("HH:mm:ss");
//I'm using the following while experimenting with the code:
trigger.N = DateTime.Today.Day;
Console.WriteLine("Started, press any key to stop ...");
Console.ReadKey();
scheduler.Shutdown(false);
}
...
public class SomeIJobImplementation : IJob
{
public void Execute(JobExecutionContext context)
{
Logger.Write(String.Format(
"Job executed called at {0}",
DateTime.Now.ToString("dd-MMM-yyyy HH:mm:ss")), null, 1,
TraceEventType.Information);
}
}
Running this results in the job being executed multiple times
(approximately once per second) for one minute. I'm using an ADO.NET
job store and can see in my database that QRTZ_TRIGGERS.NEXT_FIRE_TIME
is set to the last executed time, i.e. doesn't seem to be scheduled to
run again.
I expected the above code to run the job once (after about 15
seconds), then schedule the job to run again in one month's time.
Perhaps the issue is just with the way I'm using Quartz.NET whilst
I've been experimenting or, maybe, my expectations are wrong? Either
way, I would be most grateful for any help/suggestions to explain the
behaviour I've observed, and what I need to change to get the
behaviour I want.
I may be late, but I was trying to implement the same solution and ended up here.
I reckon you should start the scheduler after you've defined the jobs and triggers.