Rx.NET Subject OnNext exception is losing downstream observers - system.reactive

Disclaimer: I am a newbie to Rx.NET.
I want to understand the best way to consume events from the server using Rx.NET. Currently, I have a consumer class that contains an Rx Subject, which delegates each consumed update to downstream consumers as follows:
Event Listener/Processor:
public IObservable<IUpdate> UpdateStream => _subject?.AsObservable();

try
{
    // ... processing ...
    _subject.OnNext(update); // update is the variable
}
catch (Exception ex)
{
    _subject.OnError(ex);
}
Downstream-subscribers:
public void Subscribe()
{
    _eventListener.UpdateStream.Subscribe(
        update =>
        {
            _fooProcessor.Process(update);
        },
        ex =>
        {
            // log
            Subscribe(); // an effort to resubscribe the lost subscription
        },
        () =>
        {
            // log completion (optional)
        });
}
I have noticed that the subject throws an exception in OnNext ("An item with the same key has already been added"), at which point the subject.HasObservers property is false (in other words, the downstream subscription list is lost). The OnError line does get hit, but the downstream subscribers are not notified (because the subscription was lost).
I tried using Observable.FromEventPattern to listen to the consuming event and create the observable to be consumed by the downstream subscribers, but that did not work either (I could not pinpoint the point of failure in this case).
Is there a pattern to resubscribe from downstream consumers (in different DLLs) in such cases?
Appreciate any help.
Thanks!

I found that the downstream subscriber was throwing an exception inside its OnNext handler, which caused the subscription to be dropped. This is not an issue now.
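For reference, one way to keep the subscription alive when the handler itself can throw is to guard inside OnNext (a minimal sketch reusing the names from the question; swallowing the exception here is deliberate so it cannot tear down the subscription):

_eventListener.UpdateStream.Subscribe(update =>
{
    try
    {
        _fooProcessor.Process(update);
    }
    catch (Exception ex)
    {
        // Log and continue; an exception escaping OnNext propagates back into
        // the producer's _subject.OnNext call, where the catch block calls
        // OnError and terminates the Subject for every observer.
        Console.Error.WriteLine(ex);
    }
});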
Thanks. Related: How to handle exceptions in OnNext when using ObserveOn?

Related

Kafka Streams - Transformers with State in Fields and Task / Threading Model

I have a Transformer with a state store that uses punctuate to operate on said state store.
After a few iterations of punctuate, the operation may have finished, so I'd like to cancel the punctuate -- but only for the Task that has actually finished the operation on the partition's respective state store. The punctuate operations for the Tasks that are not done yet should keep running. To that purpose my transformer keeps a reference to the Cancellable returned by schedule().
As far as I can tell, every Task always gets its own isolated Transformer instance, and every Task gets its own isolated scheduled punctuate() within that instance. Is that correct?
However, since this is effectively state, but not inside a state store, I'm not sure how safe it is. For instance, are there scenarios in which one Transformer instance might be shared across tasks (meaning absolutely no state must be kept outside of state stores)?
public class CoolTransformer implements Transformer {

    private KeyValueStore stateStore;
    private Cancellable taskPunctuate; // <----- Will this lead to conflicts between tasks?

    public void init(ProcessorContext context) {
        this.stateStore = (KeyValueStore) context.getStateStore(...);
        this.taskPunctuate = context.schedule(Duration.ofMillis(...),
                PunctuationType.WALL_CLOCK_TIME, this::scheduledOperation);
    }

    private void scheduledOperation(long timestamp) {
        stateStore.get(...);
        // do stuff...
        if (done) {
            this.taskPunctuate.cancel(); // <----- Will this lead to conflicts between tasks?
        }
    }

    public KeyValue transform(Object key, Object value) {
        // do stuff
        stateStore.put(key, value);
        return null; // returning null forwards nothing downstream
    }

    public void close() {
        taskPunctuate.cancel();
    }
}
You might look into TransformerSupplier, specifically TransformerSupplier#get(): it must return a new Transformer whenever tasks should be kept independent. Transformers should also not share objects, so be careful with your Cancellable taskPunctuate. If either of these rules is violated, you should see errors like org.apache.kafka.streams.errors.StreamsException: Current node is unknown, ConcurrentModificationException, or InstanceAlreadyExistsException.
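For illustration, the supplier contract looks roughly like this (a sketch using the same raw types as the question's CoolTransformer):

public class CoolTransformerSupplier implements TransformerSupplier {

    @Override
    public Transformer get() {
        // Must return a fresh instance on every call, so each task gets its
        // own state store reference and its own Cancellable; returning a
        // cached or shared instance is exactly what breaks task isolation.
        return new CoolTransformer();
    }
}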

Samza: Delay processing of messages until timestamp

I'm processing messages from a Kafka topic with Samza. Some of the messages come with a timestamp in the future and I'd like to postpone the processing until after that timestamp. In the meantime, I'd like to keep processing other incoming messages.
What I tried is making my Task queue the messages and implementing WindowableTask to periodically check whether the messages' timestamps allow processing them. The basic idea looks like this:
public class MyTask implements StreamTask, WindowableTask {

    private HashSet<MyMessage> waitingMessages = new HashSet<>();

    @Override
    public void process(IncomingMessageEnvelope incomingMessageEnvelope, MessageCollector messageCollector, TaskCoordinator taskCoordinator) {
        byte[] message = (byte[]) incomingMessageEnvelope.getMessage();
        MyMessage parsedMessage = MyMessage.parseFrom(message);
        if (parsedMessage.getValidFromDateTime().isBeforeNow()) {
            // Do the processing
        } else {
            waitingMessages.add(parsedMessage);
        }
    }

    @Override
    public void window(MessageCollector messageCollector, TaskCoordinator taskCoordinator) {
        for (MyMessage message : waitingMessages) {
            if (message.getValidFromDateTime().isBeforeNow()) {
                // Do the processing and remove the message from the set
            }
        }
    }
}
This obviously has some downsides: I'd lose my waiting in-memory messages whenever I redeploy my task. So I'd like to know the best practice for delaying the processing of messages with Samza. Do I need to re-emit the messages to the same topic again and again until I can finally process them? We're talking about delaying the processing for a few minutes up to 1-2 hours here.
It's important to keep in mind when dealing with message queues that they perform a very specific function in a system: they hold messages while the processor(s) are busy processing preceding messages. A properly functioning message queue is expected to deliver messages on demand, which implies that as soon as a message reaches the head of the queue, the next pull on the queue will yield that message.
Notice that delay is not a configurable part of the equation. Instead, delay is an output variable of a system with a queue. In fact, Little's Law (L = λW: the average number of items in the system equals the arrival rate times the average time an item spends in the system) offers some interesting insights into this; the wait time is determined by load, not set by configuration.
So, in a system where a delay is necessary (for example, to join/wait for a parallel operation to complete), you should be looking at other methods. Typically a queryable database would make sense in this particular instance. If you find yourself keeping messages in a queue for a pre-set period of time, you're actually using the message queue as a database - a function it was not designed to provide. Not only is this risky, but it also has a high likelihood of hurting the performance of your message broker.
I think you could use Samza's key-value store to keep the state of your task instance instead of an in-memory Set.
It should look something like this:
public class MyTask implements StreamTask, WindowableTask, InitableTask {

    private KeyValueStore<String, MyMessage> waitingMessages;

    @SuppressWarnings("unchecked")
    @Override
    public void init(Config config, TaskContext context) throws Exception {
        this.waitingMessages = (KeyValueStore<String, MyMessage>) context.getStore("messages-store");
    }

    @Override
    public void process(IncomingMessageEnvelope incomingMessageEnvelope, MessageCollector messageCollector,
            TaskCoordinator taskCoordinator) {
        byte[] message = (byte[]) incomingMessageEnvelope.getMessage();
        MyMessage parsedMessage = MyMessage.parseFrom(message);
        if (parsedMessage.getValidFromDateTime().isBeforeNow()) {
            // Do the processing
        } else {
            waitingMessages.put(parsedMessage.getId(), parsedMessage);
        }
    }

    @Override
    public void window(MessageCollector messageCollector, TaskCoordinator taskCoordinator) {
        KeyValueIterator<String, MyMessage> all = waitingMessages.all();
        try {
            while (all.hasNext()) {
                MyMessage message = all.next().getValue();
                // Do the processing and remove the message from the store
            }
        } finally {
            all.close(); // KeyValueIterator holds resources and must be closed
        }
    }
}
If you redeploy your task, Samza should recreate the state of the key-value store (Samza keeps the values in a special Kafka changelog topic tied to the key-value store). You do, of course, need to provide some extra configuration for your store (in the above example, for messages-store).
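For completeness, the store wiring in the job's properties file looks roughly like this (a hedged sketch; the serde names are assumptions and need matching serializers.registry entries elsewhere in the config):

# Register the store under the name the task looks up via context.getStore("messages-store")
stores.messages-store.factory=org.apache.samza.storage.kv.RocksDbKeyValueStorageEngineFactory
# Back the store with a changelog topic so state survives redeployments
stores.messages-store.changelog=kafka.messages-store-changelog
stores.messages-store.key.serde=string
stores.messages-store.msg.serde=json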
You can read about the key-value store here (for Samza 0.14):
https://samza.apache.org/learn/documentation/0.14/container/state-management.html

QuickFIX/J: Not receiving ExecutionReport messages

After opening an order with our brokerage firm, we desire to obtain the fill price from the ExecutionReport messages. Below you will find the callback code used.
The MarketDataSnapshotFullRefresh messages are received properly, but the second if block is never triggered. Strangely, the corresponding messages.log file does contain multiple 35=8 messages.
We use QuickFIX/J as our FIX engine.
@Override
public void fromApp(Message message, SessionID sessionID) throws FieldNotFound, IncorrectDataFormat, IncorrectTagValue, UnsupportedMessageType {
    if (message instanceof MarketDataSnapshotFullRefresh) {
        // do stuff with MarketDataSnapshotFullRefresh
    }
    if (message instanceof ExecutionReport) {
        // do stuff with ExecutionReport
    }
}
Message handling is ideally done by a quickfix.MessageCracker, though sometimes handling messages directly in fromApp is the way to go.
You can read more about message cracking here: QuickFIX/J User Manual - Receiving Messages.
I'll outline both ways:
In fromApp
Messages coming in fromApp are not of specific message types as defined in the QuickFIX/J library, but are of type quickfix.Message. If you wanted to process them the way you are doing now (from fromApp), you would have to inspect the MsgType manually:
MsgType msgType = (MsgType) message.getHeader().getField(new MsgType());
Based on the type retrieved, you would call a handler method for the specific message type:
if (msgType.valueEquals(MsgType.MARKET_DATA_SNAPSHOT_FULL_REFRESH))
    handleMarketDataSnapshotFullRefresh(message, sessionID);
else if ...
...

private void handleMarketDataSnapshotFullRefresh(quickfix.Message msg, SessionID sessionID) {
    // handler implementation
}
Using MessageCracker
Another way to handle incoming messages, as mentioned before, is through a MessageCracker. You would, e.g., extend the class that implements quickfix.Application with quickfix.MessageCracker.
Add an onMessage method with two parameters, the first being the message type and the second a SessionID. Call crack from the fromApp method, which will route the message to the appropriate handler.
import quickfix.*;

public class MyApplication extends MessageCracker implements Application
{
    public void fromApp(Message message, SessionID sessionID)
            throws FieldNotFound, UnsupportedMessageType, IncorrectTagValue {
        crack(message, sessionID);
    }

    @Handler
    public void onMessage(quickfix.fix44.MarketDataSnapshotFullRefresh mdsfr, SessionID sessionID) {
        // handler implementation
    }
}
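Since the question is specifically about ExecutionReport, the corresponding handler might look like this (a sketch; reading AvgPx (tag 6) as the fill price is an assumption based on what the question asks for):

@Handler
public void onMessage(quickfix.fix44.ExecutionReport report, SessionID sessionID) throws FieldNotFound {
    // AvgPx typically carries the average fill price of the order
    double fillPrice = report.getAvgPx().getValue();
    // ... further processing ...
}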
Why are you doing the message processing in the wrong place? If you check what QuickFIX recommends, you will see that message processing should happen in onMessage (which you might not have implemented), and only a message cracker should live in the fromApp method.
Otherwise your fromApp method is going to become a hotchpotch of code, and the next person handling your code is not going to be a happy soul.

Verticles and uncaught exceptions

Consider the scenario where one of the verticles throws an uncaught exception.
What happens next?
If the verticle is removed from the system, is there some mechanism, similar to Erlang supervisors, to restart it?
The documentation is not very clear about this aspect.
Update based on comments:
What interests me the most is the situation where an exception is thrown from the processing handlers of a received message (through the bus).
Regards
I have answered part of my own question (with the help of a test program).
When an exception is thrown in an event handler, the exception is caught by Vert.x and swallowed (ignored). The event handler will process the next message.
Update: The app can register an exception handler and have all uncaught Throwables delivered to it. There you can perform additional general processing.
Update 2: Use Vertx.exceptionHandler to register the handler.
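A minimal sketch of that registration in lambda form (assuming Vert.x 3; a fuller anonymous-class version appears in a later answer in this thread):

vertx.exceptionHandler(t -> {
    // invoked for uncaught throwables on contexts that have no handler of their own
    t.printStackTrace();
});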
Vert.x is all about the same style of asynchronous programming, which mainly revolves around callback handlers.
To handle the deployment-failure case, you first have to go the programmatic way, i.e. deploy your verticle programmatically through, say, a deployment verticle, providing a completion handler that will be populated with the deployment result. Below is a sample in Java (since you haven't opted for a specific language, I'll go with my best), where:
MainVerticle: your deployment verticle (used mainly to deploy other verticles)
some.package.MyVerticle: your real verticle; note that I used the identifier here, not an instance.
public class MainVerticle extends AbstractVerticle {
    public void start() {
        vertx.deployVerticle("some.package.MyVerticle", res -> {
            if (res.succeeded()) {
                // Do whatever if deployment succeeded
            } else {
                // Handle deployment failure here...
            }
        });
    }
}
Now when it comes to messaging failures, it is harder to point at a specific case, since failures can occur in many places and on behalf of both messaging ends.
If you want to register a handler for failures when sending a message, you can instantiate a MessageProducer<T> representing the stream the message can be written to, then register an exception handler on it:
EventBus eb = vertx.eventBus();
MessageProducer<String> sender = eb.sender("someAddress");
sender.exceptionHandler(e -> {
    System.out.println("An error occurred: " + e.getCause());
});
sender.write("Hello...");
On the other side, you can handle the failure case when reading received messages in much the same way, this time using a MessageConsumer<T>:
EventBus eb = vertx.eventBus();
MessageConsumer<String> receiver = eb.consumer("someAddress");
receiver.exceptionHandler(e -> {
    System.out.println("An error occurred while reading data: " + e.getCause());
}).handler(msg -> {
    System.out.println("A message has been received: " + msg.body());
});
To add a bit to the previous answer: if you want to react to all uncaught exceptions, register a handler on the vertx object, as follows:
vertx.exceptionHandler(new Handler<Throwable>() {
    @Override
    public void handle(Throwable event) {
        // do what you meant to do on an uncaught exception, e.g.:
        System.err.println("Error: " + event);
        // assuming an SLF4J-style logger: pass the Throwable itself so the full
        // stack trace is logged, rather than the stack-trace array's toString()
        someLogger.error("Uncaught exception", event);
    }
});
I ran into something similar to this. When an exception happens as part of processing a message in a Verticle, I just wanted to reply with the Exception.
The idea is to just bubble up the exceptions all the way back to the entry point in the app where a decision can be made about what to do with the failure, while capturing the entire stack along the way.
To accomplish this, I wrote the following function:
protected ReplyException buildReplyException(Throwable cause, String message)
{
    ReplyException ex = new ReplyException(ReplyFailure.RECIPIENT_FAILURE, -1, message);
    ex.initCause(cause);
    return ex;
}
Which I then use to build handlers, or reply handlers, like this:
reply -> {
    if (reply.succeeded()) {
        message.reply(true);
    } else {
        message.reply(buildReplyException(reply.cause(), ""));
    }
});
This way, the sender of the original message gets a failed response whose cause is an exception with the full stack trace populated.
This approach worked very well for me for handling errors while processing messages.
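On the sending side, the failure then surfaces in the reply handler, roughly like this (a sketch against the Vert.x 3 API; someAddress and payload are placeholders):

EventBus eb = vertx.eventBus();
eb.send("someAddress", payload, ar -> {
    if (ar.failed()) {
        // ar.cause() is the ReplyException built above, with the remote
        // cause attached via initCause()
        ar.cause().printStackTrace();
    }
});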

How do I warm up an actor's state from the database when starting up?

My requirement is to start a long-running process to tag all the products that have expired. This runs every night at 1:00 AM. Customers may be accessing some of the products on the website around the time the job runs, so those products have live actor instances. The others live in persistent storage and have no instances yet, because no customers are accessing them.
Where should I hook up the logic to read the latest state of an actor from persistent storage and create a brand-new actor? Should I make that call in the PreStart override? If so, how can I tell the ProductActor that a new actor is being created?
Or should I send the ProductActor a message like LoadMeFromAzureTable, which will load the state from persistent storage after the actor is created?
There are different ways to do it depending on what you need; there isn't precisely one "right" answer.
You could use a Persistent Actor to recover state from a durable store automatically on startup (or after a crash, to recover). Or, if you don't want to use that module (still in beta as of July 2015), you could do it yourself in one of two ways:
1) You could load your state in PreStart, but I'd only go with this if you can make the operation async via your database client and use the PipeTo pattern to send the results back to yourself incrementally. But if you need to have ALL the state resident in memory before you start doing work, then you need to...
2) Make a finite state machine using behavior switching. Start in a gated state, send yourself a message to load your data, and stash everything that comes in. Then switch to a receiving state and unstash all messages when your state is done loading. This is the approach I prefer.
Example (just mocking the DB load with a Task):
public class ProductActor : ReceiveActor, IWithUnboundedStash
{
    public IStash Stash { get; set; }

    public ProductActor()
    {
        // begin in gated state
        BecomeLoading();
    }

    private void BecomeLoading()
    {
        Become(Loading);
        LoadInitialState();
    }

    private void Loading()
    {
        Receive<DoneLoading>(done =>
        {
            BecomeReady();
        });

        // stash any messages that come in until we're done loading
        ReceiveAny(o =>
        {
            Stash.Stash();
        });
    }

    private void LoadInitialState()
    {
        // load your state here async & send back to self via PipeTo
        Task.Run(() =>
        {
            // database loading task here
            return new Object();
        }).ContinueWith(tr =>
        {
            // do whatever (e.g. error handling)
            return new DoneLoading();
        }).PipeTo(Self);
    }

    private void BecomeReady()
    {
        Become(Ready);

        // our state is ready! put all those stashed messages back in the mailbox
        Stash.UnstashAll();
    }

    private void Ready()
    {
        // handle those unstashed + new messages...
        ReceiveAny(o =>
        {
            // do whatever you need to do...
        });
    }
}

/// <summary>
/// Marker message class.
/// </summary>
public class DoneLoading {}
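For comparison, the Persistent Actor route mentioned at the top would look roughly like this (a hedged sketch against Akka.Persistence, which was still in beta at the time; the command, event, and actor names are made up for illustration):

using System.Collections.Generic;
using Akka.Persistence;

// Hypothetical messages for illustration only.
public class ExpireProduct { public readonly string ProductId; public ExpireProduct(string id) { ProductId = id; } }
public class ProductExpired { public readonly string ProductId; public ProductExpired(string id) { ProductId = id; } }

public class PersistentProductActor : ReceivePersistentActor
{
    private readonly HashSet<string> _expired = new HashSet<string>();

    public override string PersistenceId => "product-actor";

    public PersistentProductActor()
    {
        // On startup (or after a crash) Akka.Persistence replays journaled
        // events through Recover, rebuilding in-memory state before any new
        // commands are processed.
        Recover<ProductExpired>(evt => _expired.Add(evt.ProductId));

        // New work arrives as commands; persist an event, then apply it.
        Command<ExpireProduct>(cmd =>
            Persist(new ProductExpired(cmd.ProductId), evt => _expired.Add(evt.ProductId)));
    }
}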