Placement of observeOn gives me different behavior - rx-java2

When I run the following snippet I don't see the backpressure.
public static void main(String[] args) throws InterruptedException {
MyFileProcessor pro = new MyFileProcessor();
Timer t = new Timer();
t.start();
Disposable x = pro
.generateFlowable(
new File("path\\to\\file.raw"))
.subscribeOn(Schedulers.io(), false).observeOn(Schedulers.io()).map(y -> {
System.out.println(Thread.currentThread().getName() + " xxx");
return y;
})
.subscribe(onNext -> {
System.out.println(Thread.currentThread().getName() + " " + new String(onNext));
Thread.sleep(100);
}, Throwable::printStackTrace, () -> {
System.out.println("Done");
t.end();
System.out.println(t.getTotalTime());
});
Thread.sleep(1000000000);
}
When I run the class above I get alternating lines of
RxCachedThreadScheduler-1 xxx
RxCachedThreadScheduler-1 Line1
....
It's using the same thread.
Now when I move the observeOn to just before the subscribe, I see a bunch of
RxCachedThreadScheduler-1 xxx
Followed by a bunch of
RxCachedThreadScheduler-1 Line1
I am assuming this is back pressure but still the thread used is the same.
Why am I seeing this behavior?
Why is only one thread being utilized?
There is no operator as such for the observeOn to operate on, so why am I seeing this behavior?
[edit]
public Flowable<byte[]> generateFlowable(File file) {
return Flowable.generate(() -> new BufferedInputStream(new FileInputStream(file)), (bufferedIs, output) -> {
try {
byte[] data = getMessageRawData(bufferedIs);
if (data != null)
output.onNext(data);
else
output.onComplete();
}
catch (Exception e) {
output.onError(e);
}
return bufferedIs;
}, bufferedIs -> {
try {
bufferedIs.close();
}
catch (IOException ex) {
RxJavaPlugins.onError(ex);
}
});
}

Why is only one thread being utilized?
This works correctly: you check the running thread after observeOn, so you are supposed to see the same thread there and below, no matter what happens above it. subscribeOn affects generateFlowable, where I suppose you don't print the current thread, so you don't see that it runs on a different IO thread.
Now when I move the observeOn to just before the subscribe
There shouldn't be any difference unless something odd happens in generateFlowable.
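The rule "everything below the hand-off runs on the hand-off thread" can be illustrated without RxJava. The following is only a JDK analogy under stated assumptions: the class HandOffDemo and the thread names source-1 and downstream-1 are made up for this sketch, with source-1 standing in for the subscribeOn thread and downstream-1 for the observeOn thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// JDK-only analogy, not RxJava. "source-1" plays the subscribeOn thread,
// "downstream-1" plays the observeOn thread; both names are hypothetical.
class HandOffDemo {
    /** Returns "stage@thread" entries recorded by each stage. */
    static List<String> run() {
        ExecutorService source = Executors.newSingleThreadExecutor(r -> new Thread(r, "source-1"));
        ExecutorService downstream = Executors.newSingleThreadExecutor(r -> new Thread(r, "downstream-1"));
        List<String> seen = Collections.synchronizedList(new ArrayList<>());
        try {
            // The generator runs on the source thread...
            Future<List<Future<?>>> produced = source.submit(() -> {
                List<Future<?>> handOffs = new ArrayList<>();
                for (int i = 1; i <= 3; i++) {
                    String line = "Line" + i;
                    seen.add("generate@" + Thread.currentThread().getName());
                    // ...and every item is handed to the downstream thread,
                    // so both later stages report the same thread name.
                    handOffs.add(downstream.submit(() -> {
                        seen.add("map " + line + "@" + Thread.currentThread().getName());
                        seen.add("consume " + line + "@" + Thread.currentThread().getName());
                    }));
                }
                return handOffs;
            });
            for (Future<?> handOff : produced.get()) {
                handOff.get(); // wait until the downstream work has finished
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            source.shutdown();
            downstream.shutdown();
        }
        return seen;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Printing the thread name only in the map and consume stages, as the question does, can never reveal the source thread; a print inside the generator would be needed for that.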


Vert.x: How to wait for a future to complete

Is there a way to wait for a future to complete without blocking the event loop?
An example of a use case with querying Mongo:
Future<Result> dbFut = Future.future();
mongo.findOne("myusers", myQuery, new JsonObject(), res -> {
if(res.succeeded()) {
...
dbFut.complete(res.result());
}
else {
...
dbFut.fail(res.cause());
}
});
// Here I need the result of the DB query
if(dbFut.succeeded()) {
doSomethingWith(dbFut.result());
}
else {
error();
}
I know the doSomethingWith(dbFut.result()); can be moved to the handler, yet if it's long, the code will get unreadable (callback hell?). Is that the right solution? Is that the only solution without additional libraries?
I'm aware that rxJava simplifies the code, but as I don't know it, learning Vert.x and rxJava is just too much.
I also wanted to give a try to vertx-sync. I put the dependency in the pom.xml; everything got downloaded fine but when I started my app, I got the following error
maurice@mickey> java \
-javaagent:~/.m2/repository/co/paralleluniverse/quasar-core/0.7.5/quasar-core-0.7.5-jdk8.jar \
-jar target/app-dev-0.1-fat.jar \
-conf conf/config.json
Error opening zip file or JAR manifest missing : ~/.m2/repository/co/paralleluniverse/quasar-core/0.7.5/quasar-core-0.7.5-jdk8.jar
Error occurred during initialization of VM
agent library failed to init: instrument
I know what the error means in general, but I don't know in that context... I tried to google for it but didn't find any clear explanation about which manifest to put where. And as previously, unless mandatory, I prefer to learn one thing at a time.
So, back to the question : is there a way with "basic" Vert.x to wait for a future without perturbation on the event loop ?
You can set a handler for the future to be executed upon completion or failure:
Future<Result> dbFut = Future.future();
mongo.findOne("myusers", myQuery, new JsonObject(), res -> {
if(res.succeeded()) {
...
dbFut.complete(res.result());
}
else {
...
dbFut.fail(res.cause());
}
});
dbFut.setHandler(asyncResult -> {
if(asyncResult.succeeded()) {
// your logic here
}
});
This is a pure Vert.x way that doesn't block the event loop.
I agree that you should not block in the Vertx processing pipeline, but I make one exception to that rule: Start-up. By design, I want to block while my HTTP server is initialising.
This code might help you:
/**
* @return null when waiting on {@code Future<Void>}
*/
@Nullable
public static <T>
T awaitComplete(Future<T> f)
throws Throwable
{
final Object lock = new Object();
final AtomicReference<AsyncResult<T>> resultRef = new AtomicReference<>(null);
synchronized (lock)
{
// We *must* be locked before registering a callback.
// If result is ready, the callback is called immediately!
f.onComplete(
(AsyncResult<T> result) ->
{
resultRef.set(result);
synchronized (lock) {
lock.notify();
}
});
// Check before waiting: if the callback already ran during onComplete above,
// waiting first would block forever because the notify has already happened.
// The loop also re-checks after a spurious wake-up. wait() releases the lock,
// so the callback can acquire it from another thread.
// Ref: https://stackoverflow.com/a/249907/257299
while (null == resultRef.get()) {
// @Blocking
lock.wait();
}
}
final AsyncResult<T> result = resultRef.get();
@Nullable
final Throwable t = result.cause();
if (null != t) {
throw t;
}
@Nullable
final T x = result.result();
return x;
}
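For the start-up case, the same "block until the callback fires" idea can also be sketched with a CountDownLatch, which removes the manual wait/notify bookkeeping. This is a JDK-only sketch under assumptions, not Vert.x code: the class LatchAwait and the register parameter are hypothetical, with register standing in for something like future.onComplete(handler). It must never be called on an event-loop thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

class LatchAwait {
    /**
     * Blocks the calling thread until the callback passed to {@code register} is invoked.
     * {@code register} is a hypothetical stand-in for a callback-registration method.
     */
    static <T> T await(Consumer<Consumer<T>> register) {
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<T> result = new AtomicReference<>();
        register.accept(value -> {
            result.set(value);
            done.countDown(); // safe even if this runs before await() below
        });
        try {
            done.await(); // returns immediately if the count is already zero
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return result.get();
    }

    public static void main(String[] args) {
        // Callback invoked synchronously, before the wait starts.
        String ready = LatchAwait.<String>await(cb -> cb.accept("ready"));
        // Callback invoked later from another thread.
        String later = LatchAwait.<String>await(cb -> new Thread(() -> cb.accept("later")).start());
        System.out.println(ready + " " + later);
    }
}
```

Because the latch remembers that it was counted down, a callback that runs immediately cannot be missed, which is the case the hand-written lock has to treat specially.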

Repeat Single based on onSuccess() value

I want to repeat a Single based on the single value emitted in onSuccess(). Here is a working example
import org.reactivestreams.Publisher;
import io.reactivex.Flowable;
import io.reactivex.Single;
import io.reactivex.functions.Function;
public class Temp {
void main() {
Job job = new Job();
Single.just(job)
.map(this::processJob)
.repeatWhen(new Function<Flowable<Object>, Publisher<?>>() {
@Override
public Publisher<?> apply(Flowable<Object> objectFlowable) throws Exception {
// TODO repeat when Single emits false
return null;
}
})
.subscribe();
}
/**
* returns true if process succeeded, false if failed
*/
boolean processJob(Job job) {
return true;
}
class Job {
}
}
I understand how repeatWhen works for Observables by relying on the "complete" notification. However since Single doesn't receive that notification I'm not sure what the Flowable<Object> is really giving me. Also why do I need to return a Publisher from this function?
Instead of relying on a boolean value, you could make your job throw an exception when it fails:
class Job {
var isSuccess: Boolean = false
}
fun processJob(job: Job): String {
if (job.isSuccess) {
return "job succeeds"
} else {
throw Exception("job failed")
}
}
val job = Job()
Single.just(job)
.map { processJob(it) }
.retry() // will resubscribe until your job succeeds
.subscribe(
{ value -> print(value) },
{ error -> print(error) }
)
i saw a small discrepancy in the latest docs and your code, so i did a little digging...
(side note - i think the semantics of retryWhen seem like the more appropriate operator for your case, so i've substituted it in for your usage of repeatWhen. but i think the root of your problem remains the same in either case).
the signature for retryWhen is:
retryWhen(Function<? super Flowable<Throwable>,? extends Publisher<?>> handler)
that parameter is a factory function whose input is a source that emits anytime onError is called upstream, giving you the ability to insert custom retry logic that may be influenced through interrogation of the underlying Throwable. this begins to answer your first question of "I'm not sure what the Flowable<Object> is really giving me" - it shouldn't be Flowable<Object> to begin with, it should be Flowable<Throwable> (for the reason i just described).
so where did Flowable<Object> come from? i managed to reproduce IntelliJ's generation of this code through its auto-complete feature using RxJava version 2.1.17. upgrading to 2.2.0, however, produces the correct result of Flowable<Throwable>. so, see if upgrading to the latest version generates the correct result for you as well.
as for your second question of "Also why do I need to return a Publisher from this function?" - this is used to determine if re-subscription should happen. if the factory function returns a Publisher that emits a terminal state (ie calls onError() or onComplete()) re-subscription will not happen. however, if onNext() is called, it will. (this also explains why the Publisher isn't typed - the type doesn't matter. the only thing that does matter is what kind of notification it publishes).
another way to rewrite this, incorporating the above, might be as follows:
// just some type to use as a signal to retry
private class SpecialException extends RuntimeException {}
// job processing results in a Completable that either completes or
// doesn't (by way of an exception)
private Completable rxProcessJob(Job job) {
return Completable.complete();
// return Completable.error(new SpecialException());
}
...
rxProcessJob(new Job())
.retryWhen(errors -> {
return errors.flatMap(throwable -> {
if(throwable instanceof SpecialException) {
return Flowable.just(1);
}
return Flowable.error(throwable);
});
})
.subscribe(
() -> {
System.out.println("## onComplete()");
},
error -> {
System.out.println("## onError(" + error.getMessage() + ")");
}
);
i hope that helps!
The accepted answer would work, but is hackish. You don't need to throw an error; simply filter the output of processJob which converts the Single to a Maybe, and then use the repeatWhen handler to decide how many times, or with what delay, you may want to resubscribe. See Kotlin code below from a working example, you should be able to easily translate this to Java.
filter { it }
.repeatWhen { handler ->
handler.zipWith(1..3) { _, i -> i }
.flatMap { retryCount -> Flowable.timer(retryDelay.toDouble().pow(retryCount).toLong(), TimeUnit.SECONDS) }
.doOnNext { log.warn("Retrying...") }
}
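The control flow of "repeat while the job returns false, at most three times, with a growing delay" can be sketched in plain Java. This is a JDK-only illustration, not the RxJava operator chain: the class RepeatUntilTrue and its parameters are hypothetical, and the delays are recorded rather than slept so the sketch stays fast.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

class RepeatUntilTrue {
    /**
     * Runs {@code job} until it returns true, repeating at most {@code maxRepeats} times.
     * Returns the delays (in seconds) that would be waited before each repeat.
     */
    static List<Long> run(BooleanSupplier job, int maxRepeats, long retryDelay) {
        List<Long> delays = new ArrayList<>();
        // The job is evaluated first; a repeat is scheduled only when it returned false.
        for (int repeat = 1; !job.getAsBoolean() && repeat <= maxRepeats; repeat++) {
            delays.add((long) Math.pow(retryDelay, repeat)); // retryDelay^repeat
        }
        return delays;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Hypothetical job that succeeds on its third run.
        List<Long> delays = run(() -> ++calls[0] == 3, 3, 2);
        System.out.println("runs=" + calls[0] + " delays=" + delays);
    }
}
```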

How to ensure reactor flux process all the messages supplied

Suppose we want to have Flux pipeline to process all messages supplied from several threads. Let's consider the code below:
@Test
public void testFluxCreate() throws InterruptedException {
EmitterProcessor<String> processor = EmitterProcessor.create();
CountDownLatch latch = new CountDownLatch(1);
AtomicLong counter = new AtomicLong();
AtomicLong batch = new AtomicLong();
Flux<List<String>> flux = processor
.doOnSubscribe(ss -> System.out.println(nm() + " : subscribing to " + ss))
.onBackpressureError()
.buffer(7)
.publishOn(Schedulers.immediate())
.doOnNext(it -> {
counter.addAndGet(it.size());
System.out.println(batch.incrementAndGet() + " : " + nm() + "Batch: " + it.size());
})
;
CompletableFuture<Void> producer = CompletableFuture.runAsync(() -> {
IntStream.range(1, 1001).forEach(it -> {
//sleep();
processor.onNext("Message-" + it);
});
});
CompletableFuture<Void> producer2 = CompletableFuture.runAsync(() -> {
IntStream.range(1, 1001).forEach(it -> {
//sleep();
processor.onNext("Message2-" + it);
});
});
CompletableFuture<Void> future = CompletableFuture.allOf(producer, producer2).thenAccept(it -> processor.onComplete());
flux.doOnComplete(latch::countDown).subscribe();
future.join();
latch.await();
System.out.println("Total: " + counter);
}
The counter shows us that each time we execute this code the actual number of messages processed is different.
What's wrong with this implementation?
How can we ensure that all the messages were processed before the program ends?
What's wrong with this implementation?
When I run the code I get the following in the logs early after start:
18:39:12.590 [ForkJoinPool.commonPool-worker-1] DEBUG reactor.core.publisher.Operators - Duplicate Subscription has been detected
java.lang.IllegalStateException: Spec. Rule 2.12 - Subscriber.onSubscribe MUST NOT be called more than once (based on object equality)
at reactor.core.Exceptions.duplicateOnSubscribeException(Exceptions.java:162)
at reactor.core.publisher.Operators.reportSubscriptionSet(Operators.java:502)
at reactor.core.publisher.Operators.setOnce(Operators.java:607)
at reactor.core.publisher.EmitterProcessor.onNext(EmitterProcessor.java:245)
at de.schauder.reactivethreads.demo.StackoverflowQuicky.lambda$null$2(StackoverflowQuicky.java:54)
at java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:110)
at java.util.stream.IntPipeline$Head.forEach(IntPipeline.java:557)
at de.schauder.reactivethreads.demo.StackoverflowQuicky.lambda$main$3(StackoverflowQuicky.java:52)
I'm not familiar with EmitterProcessor, but it seems onNext is not thread safe, and I strongly suspect that this is the cause of the missing events.
How can we ensure that all the messages were processed before the program ends?
I'd use two separate Producers and merge those. Also I think you don't need the count down latch.
public static void main(String[] args) {
AtomicLong counter = new AtomicLong();
AtomicLong batch = new AtomicLong();
EmitterProcessor<String> processor1 = EmitterProcessor.create();
EmitterProcessor<String> processor2 = EmitterProcessor.create();
Thread thread1 = constructThread(processor1);
Thread thread2 = constructThread(processor2);
Flux<List<String>> flux = processor1.mergeWith(processor2)
.buffer(7)
.onBackpressureError()
.publishOn(Schedulers.immediate())
.doOnNext(it -> {
counter.addAndGet(it.size());
System.out.println(batch.incrementAndGet() + " : Batch: " + it.size());
}).doOnComplete(() -> {
System.out.println("Total count: " + counter.get());
});
thread1.start();
thread2.start();
flux.blockLast();
}
private static Thread constructThread(EmitterProcessor<String> processor) {
return new Thread(() -> {
IntStream.range(1, 1001).forEach(it -> {
processor.onNext("Message2-" + it);
});
processor.onComplete();
});
}
Note about my comment:
onBackpressureError() causes the Flux to emit an error when the subscriber can't handle all the events fast enough, so this could explain the mismatch, but you'd see an exception.
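The underlying idea, that several producer threads must be funnelled through something thread safe before a single consumer batches them, can be checked with the JDK alone. The sketch below is not Reactor code: the class SerializedBatching and the DONE marker are made up for illustration, and a LinkedBlockingQueue takes the place of the merge step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.stream.IntStream;

class SerializedBatching {
    private static final String DONE = "<done>"; // hypothetical end-of-stream marker

    /** Returns {total messages, number of batches} for two producers of 1000 messages each. */
    static long[] run() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Thread first = producer(queue, "Message-");
        Thread second = producer(queue, "Message2-");
        first.start();
        second.start();

        long total = 0;
        long batches = 0;
        int finished = 0;
        List<String> batch = new ArrayList<>();
        try {
            // Only this thread touches the batch, so no message can be lost to a race.
            while (finished < 2) {
                String message = queue.take();
                if (DONE.equals(message)) {
                    finished++;
                    continue;
                }
                batch.add(message);
                if (batch.size() == 7) {
                    total += batch.size();
                    batches++;
                    batch.clear();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        if (!batch.isEmpty()) { // flush the last, partial batch
            total += batch.size();
            batches++;
        }
        return new long[] {total, batches};
    }

    private static Thread producer(BlockingQueue<String> queue, String prefix) {
        return new Thread(() -> {
            IntStream.range(1, 1001).forEach(i -> queue.add(prefix + i));
            queue.add(DONE);
        });
    }

    public static void main(String[] args) {
        long[] result = run();
        System.out.println("Total: " + result[0] + ", batches: " + result[1]);
    }
}
```

With 2000 messages and a batch size of 7 the consumer sees 285 full batches plus one batch of 5, on every run.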

Rxjava User-Retry observable with .cache operator?

I have an observable that I create with the following code.
Observable.create(new Observable.OnSubscribe<ReturnType>() {
@Override
public void call(Subscriber<? super ReturnType> subscriber) {
try {
if (!subscriber.isUnsubscribed()) {
subscriber.onNext(performRequest());
}
subscriber.onCompleted();
} catch (Exception e) {
subscriber.onError(e);
}
}
});
performRequest() will perform a long running task as you might expect.
Now, since I might be launching the same Observable twice or more in a very short amount of time, I decided to write this transformer:
protected Observable.Transformer<ReturnType, ReturnType> attachToRunningTaskIfAvailable() {
return origObservable -> {
synchronized (mapOfRunningTasks) {
// If not in maps
if ( ! mapOfRunningTasks.containsKey(getCacheKey()) ) {
Timber.d("Cache miss for %s", getCacheKey());
mapOfRunningTasks.put(
getCacheKey(),
origObservable
.doOnTerminate(() -> {
Timber.d("Removed from tasks %s", getCacheKey());
synchronized (mapOfRunningTasks) {
mapOfRunningTasks.remove(getCacheKey());
}
})
.cache()
);
} else {
Timber.d("Cache Hit for %s", getCacheKey());
}
return mapOfRunningTasks.get(getCacheKey());
}
};
}
Which basically puts the original .cache observable in a HashMap<String, Observable>.
This basically disallows multiple requests with the same getCacheKey() (Example login) to call performRequest() in parallel. Instead, if a second login request arrives while another is in progress, the second request observable gets "discarded" and the already-running will be used instead. => All the calls to onNext are going to be cached and sent to both subscribers actually hitting my backend only once.
Now, suppose this code:
// Observable loginTask
public void doLogin(Observable<UserInfo> loginTask) {
loginTask.subscribe(
(userInfo) -> {},
(throwable) -> {
if (userWantsToRetry()) {
doLogin(loginTask);
}
}
);
}
Where loginTask was composed with the previous transformer. Well, when an error occurs (might be connectivity) and userWantsToRetry() returns true, I'll basically re-call the method with the same observable. Unfortunately that has been cached, so I'll receive the same error without hitting performRequest() again since the sequence gets replayed.
Is there a way I could have both the "same requests grouping" behavior that the transformer provides me AND the retry button?
Your question has a lot going on and it's hard to put it into direct terms. I can make a couple recommendations though. Firstly your Observable.create can be simplified by using an Observable.defer(Func0<Observable<T>>). This will run the func every time a new subscriber is subscribed and catch and channel any exceptions to the subscriber's onError.
Observable.defer(() -> {
return Observable.just(performRequest());
});
Next, you can use observable.repeatWhen(Func1<Observable<Void>, Observable<?>>) to decide when you want to retry. Repeat operators will re-subscribe to the observable after an onComplete event. This particular overload will send an event to a subject when an onComplete event is received. The function you provide will receive this subject. Your function should use something like takeWhile(predicate) so that the returned observable completes when you do not want to repeat again.
Observable.just(1,2,3).flatMap((Integer num) -> {
final AtomicInteger tryCount = new AtomicInteger(0);
return Observable.just(num)
.repeatWhen((Observable<? extends Void> notifications) ->
notifications.takeWhile((x) -> num == 2 && tryCount.incrementAndGet() != 3));
})
.subscribe(System.out::println);
Output:
1
2
2
2
3
The above example shows that repeats are allowed only when the event is 2, and up to a max of 2 repeats. If you switch to a retryWhen then the flatMap would contain your decision as to use a cached observable or the realWork observable. Hope this helps!
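The combination asked for in the question (share a request while it is running, but start a fresh one when the user retries after a failure) comes down to forgetting the task before its result is published. The following is a JDK-only sketch of that idea using CompletableFuture instead of RxJava; the class InFlightRequests and its method names are hypothetical, and one thread per request is used only to keep the example short.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

class InFlightRequests<T> {
    private final ConcurrentMap<String, CompletableFuture<T>> running = new ConcurrentHashMap<>();

    /** Joins the running request for {@code key} if there is one, otherwise starts {@code request}. */
    CompletableFuture<T> run(String key, Supplier<T> request) {
        CompletableFuture<T> mine = new CompletableFuture<>();
        CompletableFuture<T> existing = running.putIfAbsent(key, mine);
        if (existing != null) {
            return existing; // "cache hit": attach to the request already in flight
        }
        new Thread(() -> {
            try {
                T value = request.get();
                running.remove(key, mine); // forget the task before publishing the result...
                mine.complete(value);
            } catch (RuntimeException e) {
                running.remove(key, mine); // ...so a retry after an error starts a new request
                mine.completeExceptionally(e);
            }
        }).start();
        return mine;
    }

    public static void main(String[] args) {
        InFlightRequests<String> requests = new InFlightRequests<>();
        // Hypothetical login request that fails on the first attempt.
        requests.run("login", () -> { throw new IllegalStateException("offline"); })
                .exceptionally(error -> "failed: " + error.getMessage())
                .join();
        // The failed task is gone from the map, so this call performs the request again.
        System.out.println(requests.run("login", () -> "user-info").join());
    }
}
```

Callers that arrive while the request is running still get the same future, so the backend is hit once per attempt rather than once per caller.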

Monotouch data sync - why does my code sometimes cause sqlite errors?

I have the following calls (actually a few more than this - it's the overall method that's in question here):
ThreadPool.QueueUserWorkItem(Database.Instance.RefreshEventData);
ThreadPool.QueueUserWorkItem(Database.Instance.RefreshLocationData);
ThreadPool.QueueUserWorkItem(Database.Instance.RefreshActData);
1st point is - is it OK to call methods that call WCF services like this? I tried daisy chaining them and it was a mess.
An example of one of the refresh methods being called above is (they all follow the same pattern, just call different services and populate different tables):
public void RefreshEventData (object state)
{
Console.WriteLine ("in RefreshEventData");
var eservices = new AppServicesClient (new BasicHttpBinding (), new EndpointAddress (this.ServciceUrl));
//default the delta to an old date so that if this is first run we get everything
var eventsLastUpdated = DateTime.Now.AddDays (-100);
try {
eventsLastUpdated = (from s in GuideStar.Data.Database.Main.Table<GuideStar.Data.Event> ()
orderby s.DateUpdated descending
select s).ToList ().FirstOrDefault ().DateUpdated;
} catch (Exception ex1) {
Console.WriteLine (ex1.Message);
}
try {
eservices.GetAuthorisedEventsWithExtendedDataAsync (this.User.Id, this.User.Password, eventsLastUpdated);
} catch (Exception ex) {
Console.WriteLine ("error updating events: " + ex.Message);
}
eservices.GetAuthorisedEventsWithExtendedDataCompleted += delegate(object sender, GetAuthorisedEventsWithExtendedDataCompletedEventArgs e) {
try {
List<Event> newEvents = e.Result.ToList ();
GuideStar.Data.Database.Main.EventsAdded = e.Result.Count ();
lock (GuideStar.Data.Database.Main) {
GuideStar.Data.Database.Main.Execute ("BEGIN");
foreach (var s in newEvents) {
GuideStar.Data.Database.Main.InsertOrUpdateEvent (new GuideStar.Data.Event {
Name = s.Name,
DateAdded = s.DateAdded,
DateUpdated = s.DateUpdated,
Deleted = s.Deleted,
StartDate = s.StartDate,
Id = s.Id,
Lat = s.Lat,
Long = s.Long
});
}
GuideStar.Data.Database.Main.Execute ("COMMIT");
LocationsCount = 0;
}
} catch (Exception ex) {
Console.WriteLine("error InsertOrUpdateEvent " + ex.Message);
} finally {
OnDatabaseUpdateStepCompleted (EventArgs.Empty);
}
};
}
OnDatabaseUpdateStepCompleted - just increments an updateComplete counter when it's called and when it knows that all of the services have come back ok it removes the waiting spinner and the app carries on.
This works OK 1st time 'round - but then sometimes it doesn't with one of these: http://monobin.com/__m6c83107d
I think the 1st question is - is all this OK? I'm not used to using threading and locks so I am wandering into new ground for me. Is using QueueUserWorkItem like this ok? Should I even be using lock before doing the bulk insert/update? An example of which:
public void InsertOrUpdateEvent(Event festival){
try {
if (!festival.Deleted) {
Main.Insert(festival, "OR REPLACE");
}else{
Main.Delete<Event>(festival);
}
} catch (Exception ex) {
Console.WriteLine("InsertOrUpdateEvent failed: " + ex.Message);
}
}
Then the next question is - what am I doing wrong that is causing these sqlite issues?
w://
SQLite is not thread safe.
If you want to access Sqlite from more than one thread, you must take a lock before you access any SQLite related structures.
Like this:
lock (db){
// Do your query or insert here
}
Sorry, no specific answers, but some thoughts:
Is SQLite even threadsafe? I'm not sure - it may be that it's not (or the wrapper isn't). Can you lock on a more global object, so no two threads are inserting at the same time?
It's possible that the MT GC is getting a little overenthusiastic, and releasing your string before it's been used. Maybe keep a local reference to it around during the insert? I've had this happen with view controllers, where I had them in an array (tabcontrollers, specifically), but if I didn't keep a member variable around with the reference, they got GC'ed.
Could you get the data in a threaded manner, then queue everything up and insert them in a single thread? At least as a test anyway.
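The last suggestion, fetching in parallel but funnelling every write through one thread, can be sketched as follows. The sketch is in Java rather than C# and uses only the JDK; the class SingleWriter is hypothetical, and an ArrayList stands in for the database connection that must not be used from two threads at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SingleWriter {
    // Stand-in for a connection that is not safe to share between threads.
    private final List<String> table = new ArrayList<>();
    private final ExecutorService writer = Executors.newSingleThreadExecutor();

    /** May be called from any fetch thread; the write itself runs on the single writer thread. */
    void insertAll(List<String> rows) {
        writer.execute(() -> table.addAll(rows));
    }

    /** Waits for the queued writes to finish and returns the number of rows stored. */
    int closeAndCount() {
        writer.shutdown();
        try {
            if (!writer.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writer did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return table.size();
    }

    /** Three fetch threads produce 100 rows each and hand them to the writer. */
    static int demo() {
        SingleWriter db = new SingleWriter();
        List<Thread> fetchers = new ArrayList<>();
        for (String name : new String[] {"events", "locations", "acts"}) {
            Thread fetcher = new Thread(() -> {
                List<String> rows = new ArrayList<>();
                for (int i = 0; i < 100; i++) {
                    rows.add(name + "-" + i);
                }
                db.insertAll(rows);
            });
            fetchers.add(fetcher);
            fetcher.start();
        }
        try {
            for (Thread fetcher : fetchers) {
                fetcher.join(); // all rows must be queued before the writer is shut down
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return db.closeAndCount();
    }

    public static void main(String[] args) {
        System.out.println("rows stored: " + demo());
    }
}
```

Compared with taking a lock around every database call, the single writer makes it impossible to forget the lock in one code path, at the cost of making every write asynchronous.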