I have a function that looks like this:
public Flowable<Integer> max(int a, int b){
// *** Part 1 - start ***
int max = Math.max(a,b);
// *** Part 1 - end ***
return Flowable.defer(() -> {
// *** Part 2 start ***
return Flowable.just(max);
// *** Part 2 end ***
});
}
When I now subscribe like this:
EDITED:
max(3,4).subscribeOn(Schedulers.io()).subscribe();
Will the code from Part 1 run on Schedulers.io()?
What are problems that can happen when you write a function like this?
What code runs in which thread?
What is the difference for part 1 if it's not in the stream? Or is it in the stream?
When I now subscribe like this:
That code doesn't subscribe, you have to call subscribe().
Will the code from Part 1 run on Schedulers.io()?
The code in max() runs as soon as it is invoked on some thread: it calculates the max and creates a Flowable capturing the bigger value.
What are problems that can happen when you write a function like this?
Part 1 executes on the caller thread which may not be what you wanted. At that point, RxJava isn't even involved.
What code runs in which thread?
max() runs on the caller thread and nothing else gets executed.
What is the difference for part 1 if it's not in the stream? Or is it in the stream?
Part 1 is outside the stream. You have to put that computation into the stream, for example via fromCallable, although such trivial operations may not be worth putting into a stream.
public Flowable<Integer> max(int a, int b){
return Flowable.fromCallable(() ->
// *** Part 1 - start ***
Math.max(a, b)
// *** Part 1 - end ***
);
}
max(3, 4)
.subscribeOn(Schedulers.io())
.subscribe(v -> {
System.out.println(Thread.currentThread());
System.out.println(v);
});
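The same split between "runs when the method is called" and "runs when the lambda is executed" can be sketched with the plain JDK, without RxJava. This is only an analogue: the class name EagerVsDeferred and the "worker-1" thread are made up for illustration, and the single-thread executor merely stands in for Schedulers.io().

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Plain-JDK analogue (not RxJava): shows which thread runs each part.
final class EagerVsDeferred {

    // Returns { thread that ran part 1, what the part 2 thread reported }.
    static String[] run(int a, int b) {
        // Hypothetical single worker thread standing in for Schedulers.io().
        ExecutorService worker =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "worker-1"));
        try {
            // Part 1: an ordinary statement, executed right here by the caller.
            int max = Math.max(a, b);
            String part1Thread = Thread.currentThread().getName();

            // Part 2: the lambda body only runs once the worker picks it up.
            CompletableFuture<String> part2 = CompletableFuture.supplyAsync(
                    () -> Thread.currentThread().getName() + " saw max=" + max, worker);

            return new String[] { part1Thread, part2.join() };
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] threads = run(3, 4);
        System.out.println("part 1 ran on: " + threads[0]);
        System.out.println("part 2 ran on: " + threads[1]);
    }
}
```

Part 1 always reports the caller's thread, no matter which executor is chosen for the lambda, which mirrors why subscribeOn cannot affect code that sits outside the stream.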
Will the code from Part 1 run on Schedulers.io()?
Ans: Part 1 will run on the main thread or whichever thread calls max(), because you didn't put it inside an RxJava operator.
What code runs in which thread?
Ans: Part 2 runs on the thread that subscribes. Without subscribeOn that is the caller thread (the main thread or some worker thread); with subscribeOn(Schedulers.io()) it is an io scheduler thread.
What is the difference for part 1 if it's not in the stream? Or is it in the stream?
Ans: It is outside the stream. You still get its result, but that doesn't depend on RxJava.
Put all of the code inside the RxJava operators; otherwise the flow of the stream may differ from what you expect. A simple calculation like this can stay on the main thread. Use RxJava when working with a database or a REST API call.
I have some async (ZIO) code, which I need to test. If I create a testing part using Thread.sleep() it works fine and I always get response:
for {
saved <- database.save(smth)
result <- eventually {
Thread.sleep(20000)
database.search(...)
}
} yield result
But if I implement the same logic using timeout and interval from eventually, it never works correctly (I get timeouts):
for {
saved <- database.save(smth)
result <- eventually(timeout(Span(20, Seconds)), interval(Span(20, Seconds))) {
database.search(...)
}
} yield result
I do not understand why timeout and interval work differently than Thread.sleep. They should be doing exactly the same thing. Can someone explain this to me and tell me how I should change this code so that it does not need Thread.sleep()?
Assuming database.search(...) returns a ZIO value:
eventually { database.search(...) } most probably succeeds immediately on the first try.
It successfully creates a task to query the database.
The database is then queried without any retry logic.
Regarding how to make it work:
val search: ZIO[Any, Throwable, String] = ???
val retried: ZIO[Any with Clock, Throwable, Option[String]] = search.retry(Schedule.spaced(Duration.fromMillis(1000))).timeout(Duration.fromMillis(20000))
Something like that should work. But I believe that more elegant solutions exist.
The other answer from @simpadjo addresses the "what" quite succinctly. I'll add some additional context as to why you might see this behavior.
for {
saved <- database.save(smth)
result <- eventually {
Thread.sleep(20000)
database.search(...)
}
} yield result
There are three different technologies being mixed here, which is causing some confusion.
First is ZIO, which is an asynchronous programming library that uses its own custom runtime and execution model to perform tasks. The second is eventually, which comes from ScalaTest and is useful for checking asynchronous computations by effectively polling the state of a value. And thirdly, there is Thread.sleep, which is a Java API that literally suspends the current thread and prevents task progression until the timer expires.
eventually uses a simple retry mechanism that differs based on whether you are using a normal value or a Future from the Scala standard library. Basically, it runs the code in the block and, if it throws, sleeps the current thread and then retries based on some interval configuration, eventually timing out. Notably, in this case the behavior is entirely synchronous, meaning that as long as the code in the {} doesn't throw an exception it won't keep retrying.
Thread.sleep is a heavyweight operation, and in this case it is effectively blocking the function being passed to eventually from progressing for 20 seconds. This means that by the time database.search is called, the save operation has likely completed.
The second variant is different: it executes the code in the eventually block immediately, and only if that code throws an exception will it attempt it again based on the interval/timeout logic that you provide. In this scenario the save may not have completed (or propagated, if the database is eventually consistent). Because you are returning a ZIO, which is designed not to throw, and eventually doesn't understand ZIO, it will simply return the search attempt with no retry logic.
The accepted answer:
val retried: ZIO[Any with Clock, Throwable, Option[String]] = search.retry(Schedule.spaced(Duration.fromMillis(1000))).timeout(Duration.fromMillis(20000))
works because the retry and timeout are using the built-in ZIO operators which do understand how to actually retry and timeout a ZIO. Meaning that if search fails the retry will handle it until it succeeds.
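To make the mechanism concrete, here is a rough blocking sketch in plain Java of what "retry with a fixed spacing, give up after a timeout" amounts to. It is not ZIO and not how ZIO implements it: retryWithTimeout and RetrySketch are hypothetical names, the loop blocks the calling thread, and unlike ZIO's timeout it only checks the deadline between attempts rather than interrupting an attempt that is already running.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

final class RetrySketch {

    // Re-runs `attempt` whenever it fails, waiting `spacing` between tries.
    // Returns Optional.empty() once another wait would pass the deadline.
    // Assumes `attempt` never returns null.
    static <T> Optional<T> retryWithTimeout(Callable<T> attempt,
                                            Duration spacing,
                                            Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            T value;
            try {
                value = attempt.call();
            } catch (Exception failure) {
                // The failure itself is what triggers the retry.
                if (deadline - System.nanoTime() <= spacing.toNanos()) {
                    return Optional.empty();
                }
                try {
                    Thread.sleep(spacing.toMillis());
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
                continue;
            }
            return Optional.of(value);
        }
    }

    // Simulated search that only "sees" the saved row on the third attempt.
    static String demo() {
        AtomicInteger calls = new AtomicInteger();
        Optional<String> result = RetrySketch.<String>retryWithTimeout(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("not visible yet");
            }
            return "row";
        }, Duration.ofMillis(10), Duration.ofSeconds(2));
        return result.orElse("timed out") + " after " + calls.get() + " attempts";
    }

    // A search that never succeeds ends in an empty result, not an exception.
    static boolean givesUp() {
        Optional<String> result = RetrySketch.<String>retryWithTimeout(() -> {
            throw new IllegalStateException("never visible");
        }, Duration.ofMillis(10), Duration.ofMillis(50));
        return !result.isPresent();
    }
}
```

The key point is that the retry is driven by the failure of the attempt itself. Wrapping a lazy, non-throwing description of the search in a throw-based retry block gives that block nothing to react to.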
object GetDetails{
def apply(outStream: java.io.Writer, someObject: SomeClass){
outStream.write(someObject.detail1) //detail1 is string
outStream.write(someObject.detail2) //detail2 is also string
outStream.flush()
}
}
If this implementation is not thread-safe how do I make it thread-safe?
This function is going to be called simultaneously with different inputs.
One thing to think about is how the "state" of your application might change when two different threads call the same function, or interact with the same object. In this case, your "state" might be "what's been written to outStream".
Consider the following scenario:
Thread 1                            Thread 2
--------------------------------    --------------------------------
GetDetails(outStream, object1)
                                    GetDetails(outStream, object2)
outStream.write(object1.detail1)
                                    outStream.write(object2.detail1)
outStream.write(object1.detail2)
                                    outStream.write(object2.detail2)
outStream.flush()
                                    outStream.flush()
Two separate threads both call GetDetails, sharing the same outStream. This is a potential concurrency problem, as the data that gets written to outStream is not guaranteed to be in any particular order. You might get [object1.detail1, object2.detail1, object1.detail2, object2.detail2], or [object2.detail1, object1.detail1, object1.detail2, object2.detail2], and so on.
GetDetails.apply does not change any of GetDetails's state, but it does change the state of the Writer that you pass; in order to ensure thread-safety you must take efforts to avoid using the same Writer at the same time (i.e. the scenario above).
As a counter-point, here's a pretty thread-unsafe method:
object NotThreadSafe {
// mutable state
private var currentPrefix = ""
def countUp(prefix: String) = {
// red flag: changing mutable state, then referring to it
currentPrefix = prefix
for(i <- 1 to 5) println(s"$currentPrefix $i")
}
}
If thread 1 calls NotThreadSafe.countUp("hello") and thread 2 calls NotThreadSafe.countUp("goodbye"), the output will depend on which currentPrefix = prefix happens last.
You could end up with
hello 1
hello 2
goodbye 3 // should have been hello 3, but currentPrefix got changed
goodbye 1
goodbye 4 // should have been hello 4
goodbye 5 // should have been hello 5
goodbye 2
goodbye 3
goodbye 4
goodbye 5
or one of many other permutations.
This is one reason why Scala tends to prefer "immutability" and "statelessness", because those are tools that simply don't need to worry about this kind of issue.
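As a small illustration of the stateless alternative, here is a sketch in plain Java (the class name StatelessCounter is made up) where the prefix is only ever a parameter and a local, so concurrent calls have nothing to overwrite.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class StatelessCounter {

    // No shared field: every call works only on its own parameter and locals.
    static List<String> countUp(String prefix) {
        List<String> lines = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            lines.add(prefix + " " + i);
        }
        return lines;
    }

    // Runs two calls concurrently and checks that neither saw the other's prefix.
    static boolean demo() {
        CompletableFuture<List<String>> hello =
                CompletableFuture.supplyAsync(() -> countUp("hello"));
        CompletableFuture<List<String>> goodbye =
                CompletableFuture.supplyAsync(() -> countUp("goodbye"));
        return hello.join().stream().allMatch(line -> line.startsWith("hello "))
                && goodbye.join().stream().allMatch(line -> line.startsWith("goodbye "));
    }
}
```

Returning the lines instead of printing them also moves the remaining shared resource (standard output) out of the method, so the caller decides how to serialize the printing.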
If you do find yourself forced to deal with mutable state, often the simplest way to ensure thread safety is to ensure your method can only be called once at a time, i.e. by using synchronized. At a finer-grained level, you want to ensure that a specific sequence of steps is not interleaved with another copy of that sequence on another thread (e.g. the GetDetails scenario I described at the beginning).
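For the GetDetails scenario, a minimal sketch of the synchronized approach in plain Java could look like the following. It assumes that every thread writes through this one method and that locking on the Writer instance is acceptable; SynchronizedDetails, writeDetails and the "A1"/"B1" payloads are illustrative names, not part of the original code.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

final class SynchronizedDetails {

    // The write-write-flush sequence is one critical section per Writer,
    // so two callers sharing a Writer cannot interleave their details.
    static void writeDetails(Writer out, String detail1, String detail2) {
        synchronized (out) {
            try {
                out.write(detail1);
                out.write(detail2);
                out.flush();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Two threads write 1000 records each to the same Writer.
    static String demo() {
        StringWriter shared = new StringWriter();
        Thread first = new Thread(() -> {
            for (int i = 0; i < 1000; i++) writeDetails(shared, "A1", "A2");
        });
        Thread second = new Thread(() -> {
            for (int i = 0; i < 1000; i++) writeDetails(shared, "B1", "B2");
        });
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.toString();
    }

    // True if the output is a sequence of intact "A1A2" / "B1B2" records.
    static boolean isWellFormed(String output) {
        if (output.length() % 4 != 0) return false;
        for (int i = 0; i < output.length(); i += 4) {
            String record = output.substring(i, i + 4);
            if (!record.equals("A1A2") && !record.equals("B1B2")) return false;
        }
        return true;
    }
}
```

The order of whole records is still unspecified; the lock only guarantees that detail1 and detail2 of one object stay adjacent.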
You could also look into semaphores but that's beyond the scope of this answer.
It should be simple, but I have no idea how to do it. I want to run ScalaZ Task in the current thread. I was surprised task.run doesn't run on the current thread, as it is synchronous.
Is it possible to run it on the current thread, and how to do it?
There were some updates and deprecations since http://timperrett.com/2014/07/20/scalaz-task-the-missing-documentation/.
Right now the recommended way of calling task synchronously is:
task.unsafePerformSync // returns result or throws exception
task.unsafePerformSyncAttempt // returns -\/(error) or \/-(result)
Keep in mind, though, that it is not exactly done in the caller's thread - the execution is performed in a thread pool defined for a task, but the caller's thread blocks until the execution is finished. There is no way of making the task run exactly in the same thread.
In general, if Task.async is used - there is no way to make composite Task always stay in the same thread as cb (callback) can be called from any place (any thread), so that in a chain like:
Task
.delay("aaa")
.map(_ + "bbb")
.flatMap(x => Task.async(cb => completeCallBackSomewhereElse(cb, x)))
.map(_ + "ccc")
.unsafePerformSync
_ + "bbb" is going to be executed in the caller's thread.
_ + "ccc" is going to be executed in whatever thread completeCallBackSomewhereElse invokes the callback from, as scalaz has no control over it.
Basically, this is what allows a Task to be a powerful instrument for asynchronous operations: it might not even know about the underlying thread pools, and the behavior might even be implemented without plain threads and wait/notify.
However, there are special cases where it might work as caller-runs:
1) No Strategy/Task.async related stuff:
Task.delay("aaa").map(_ + "bbb").unsafePerformSync
unsafePerformSync uses a CountDownLatch to await the result of runAsync, so if there are no async/non-deterministic operations along the way, runAsync will use the caller's thread:
/**
* Run this `Future`, passing the result to the given callback once available.
* Any pure, non-asynchronous computation at the head of this `Future` will
* be forced in the calling thread. At the first `Async` encountered, control
* switches to whatever thread backs the `Async` and this function returns.
*/
def runAsync(cb: A => Unit): Unit =
listen(a => Trampoline.done(cb(a)))
2) You have control over execution strategies. So this simple Java trick will help. Besides, it's already implemented in scalaz and called Strategy.sequential
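Both special cases can be imitated in plain Java. The sketch below is not scalaz code: runAsync and runSync are hypothetical stand-ins, and the direct executor (Runnable::run) is only an assumption about the kind of caller-runs trick meant above. It shows a latch-based blocking wait, and that the callback runs on the caller's thread only when the executor itself runs tasks in place.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

final class CallerRuns {

    // Hypothetical async entry point: does the work on `executor`
    // and reports the name of the thread that actually ran it.
    static void runAsync(Executor executor, Consumer<String> callback) {
        executor.execute(() -> callback.accept(Thread.currentThread().getName()));
    }

    // Blocks the caller on a latch until the callback has fired.
    static String runSync(Executor executor) {
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<String> ranOn = new AtomicReference<>();
        runAsync(executor, threadName -> {
            ranOn.set(threadName);
            done.countDown();
        });
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ranOn.get();
    }

    static boolean demo() {
        // Runs every task in place, i.e. on whichever thread calls execute().
        Executor direct = Runnable::run;
        // Hands every task to a fresh thread, outside the caller's control.
        Executor elsewhere = task -> new Thread(task, "other-thread").start();

        String caller = Thread.currentThread().getName();
        return runSync(direct).equals(caller)
                && runSync(elsewhere).equals("other-thread");
    }
}
```

In both cases the caller blocks until the result is available; only the executor decides where the work itself happens.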
P.S.
1) If you simply want to start a computation as soon as possible use task.now/Task.unsafeStart.
2) If you want something that relies less heavily on asynchronous machinery but is still lazy and stack-safe, you might take a look here (it's for the Cats library): http://eed3si9n.com/herding-cats/Eval.html
3) If you just need to encapsulate side-effects - take a look at scalaz.effect
I was looking over some Scala server code and I saw this async/await block:
async {
while (cancellationToken.nonCancelled) {
val (request, exchange) = await(listener.nextRequest)
respond(exchange, cancellationToken, handler(request))
}
}
How can this be correct syntax?
As I understand it:
For every execution of the while loop
Thread 1 will execute the code from the while loop except the one in the await clause.
Thread 2 will go in the await clause.
But then Thread 1 will have val (request, exchange) uninstantiated in case Thread 2 doesn't finish computing.
These values will be passed to the respond and handler methods uninstantiated.
So how can you have an assignment in two different threads?
So how can you have an assignment in two different threads?
async-await's main goal is to allow you to do asynchronous programming in a synchronous fashion.
What really happens is that the awaited call executes listener.nextRequest and asynchronously waits for its completion; it doesn't execute the next line of code until then. This guarantees that if the next line of code is executed, its values are populated. The assignment happens where it is visible to the next line of code in the method.
This is possible due to the fact that the async macro actually transforms this code into a state-machine, where the first part is the execution up until the first await, and the next part is everything after.
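A hand-written equivalent of that split can be sketched with the JDK's CompletableFuture. This is only an approximation of what the macro generates: nextRequest, handle and serveOne are hypothetical names, and a single request is shown instead of the loop.

```java
import java.util.concurrent.CompletableFuture;

final class AwaitSketch {

    // Hypothetical async source of requests.
    static CompletableFuture<String> nextRequest() {
        return CompletableFuture.supplyAsync(() -> "GET /");
    }

    // Hypothetical handler; only ever sees a fully populated request.
    static String handle(String request) {
        return "handled " + request;
    }

    static CompletableFuture<String> serveOne() {
        // State 0: everything up to the await.
        CompletableFuture<String> pending = nextRequest();

        // State 1: everything after the await. It is registered as a
        // continuation and runs only once `pending` has produced its value,
        // so `request` can never be observed uninitialized.
        return pending.thenApply(request -> handle(request));
    }
}
```

No thread sits waiting on the assignment: the code after the await is simply not scheduled until the value exists.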
This is a question about Scala continuations. Can resets be nested? If they can, what are nested resets useful for? Is there any example of nested resets?
Yes, resets can be nested, and, yes, it can be useful. As an example, I recently prototyped an API for the scalagwt project that would allow GWT developers to write asynchronous RPCs (remote procedure calls) in a direct style (as opposed to the callback-passing style that is used in GWT for Java). For example:
field1 = "starting up..." // 1
field2 = "starting up..." // 2
async { // (reset)
val x = service.someAsyncMethod() // 3 (shift)
field1 = x // 5
async { // (reset)
val y = service.anotherAsyncMethod() // 6 (shift)
field2 = y // 8
}
field2 = "waiting..." // 7
}
field1 = "waiting..." // 4
The comments indicate the order of execution. Here, the async method performs a reset, and each service call performs a shift (you can see the implementation on my github fork, specifically Async.scala).
Note how the nested async changes the control flow. Without it, the line field2 = "waiting" would not be executed until after successful completion of the second RPC.
When an RPC is made, the implementation captures the continuation up to the inner-most async boundary, and suspends it for execution upon successful completion of the RPC. Thus, the nested async block allows control to flow immediately to the line after it as soon as the second RPC is made. Without that nested block, on the other hand, the continuation would extend all the way to the end of the outer async block, in which case all the code within the outer async would block on each and every RPC.
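The numbered order can be reproduced with explicit callbacks, which is roughly the code that the continuation transform spares you from writing by hand. The following Java sketch is an emulation, not the scalagwt implementation: the two RPCs are manually completed CompletableFutures and the log strings are made up to match the comments in the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class NestedAsyncOrder {

    static List<String> run() {
        List<String> log = new ArrayList<>();
        // Stand-ins for the two RPCs; completed by hand at the end.
        CompletableFuture<String> firstRpc = new CompletableFuture<>();
        CompletableFuture<String> secondRpc = new CompletableFuture<>();

        log.add("1 field1 = starting up");
        log.add("2 field2 = starting up");
        log.add("3 first RPC issued");
        // Continuation captured up to the outer async boundary.
        firstRpc.thenAccept(x -> {
            log.add("5 field1 = " + x);
            log.add("6 second RPC issued");
            // Continuation captured only up to the inner async boundary.
            secondRpc.thenAccept(y -> log.add("8 field2 = " + y));
            // Runs right away because the inner boundary ended the capture.
            log.add("7 field2 = waiting");
        });
        log.add("4 field1 = waiting");

        firstRpc.complete("x");  // first RPC succeeds: runs steps 5, 6, 7
        secondRpc.complete("y"); // second RPC succeeds: runs step 8
        return log;
    }

    // Concatenates the step numbers in the order they were logged.
    static String order() {
        StringBuilder steps = new StringBuilder();
        for (String entry : run()) {
            steps.append(entry.charAt(0));
        }
        return steps.toString();
    }
}
```

Moving the step 7 line into the inner callback, after step 8, models what happens when the nested boundary is removed: it would then wait for the second RPC.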
reset forms an abstraction so that code outside is not affected by the fact that the code inside is implemented with continuation magic. So if you're writing code with reset and shift, it can call other code which may or may not be implemented with reset and shift as well. In this sense they can be nested.