Future map's (waiting) execution context. Stops execution with FixedThreadPool - scala

// 1 fixed thread
implicit val waitingCtx = scala.concurrent.ExecutionContext.fromExecutor(Executors.newFixedThreadPool(1))
// "map" will use waitingCtx
val ss = (1 to 1000).map {n => // if I change it to 10 000 program will be stopped at some point, like locking forever
service1.doServiceStuff(s"service ${n}").map{s =>
service1.doServiceStuff(s"service2 ${n}")
}
}
Each doServiceStuff(name: String) call takes 5 seconds. doServiceStuff does not take an implicit ExecutionContext as a parameter; it uses its own execution context internally and runs Future { blocking { .. } } on it.
In the end program prints:
took: 5.775849753 seconds for 1000 x 2 stuffs
If I change 1000 to 10000, adding even more tasks (val ss = (1 to 10000)), then the program stops:
About 17,027 lines are printed (out of 20,000). No "ERROR" message is printed and no "took" message is printed.
The program does not process anything further.
But if I change the execution context to ExecutionContext.fromExecutor(null: Executor) (the global one), then it ends in about 10 seconds (but not normally).
~17249 lines printed
ERROR: java.util.concurrent.TimeoutException: Futures timed out after [10 seconds]
took: 10.646309398 seconds
That's the question: why does the program stop silently with the fixed-pool execution context, but terminate with an error message with the global execution context?
Also, sometimes it is not reproducible.
UPDATE: I do see "ERROR" and "took" if I increase the pool size from 1 to N. It does not matter how high N is; it will still be the ERROR.
The code is here: https://github.com/Sergey80/scala-samples/tree/master/src/main/scala/concurrency/apptmpl
and here, doManagerStuff2()

I think I have an idea of what's going on. If you squint enough, you'll see that the work done inside map is extremely lightweight: it just fires off a new future (because doServiceStuff returns a Future). I bet the behavior will change if you switch to flatMap, which will actually flatten the nested future and thus wait for the second doServiceStuff call to complete.
Since you're not flattening these futures, all your awaits downstream are awaiting the wrong thing, and you are not catching it because here you're discarding whatever the service returns.
Update
Ok, I misinterpreted your question, although I still think that that nested Future is a bug.
When I try your code with both executors and 10000 tasks, I get an OutOfMemoryError when creating threads in the ForkJoin execution context (i.e. for service tasks), which I'd expect. Did you use any specific memory settings?
With 1000 tasks they both do complete successfully.
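To make the map versus flatMap point concrete, here is a minimal sketch in Java rather than Scala, where CompletableFuture.thenApply plays the role of map and thenCompose the role of flatMap. The doServiceStuff method below is a hypothetical stand-in for the service in the question, not its real implementation.

```java
import java.util.concurrent.CompletableFuture;

class NestedFutureDemo {
    // Hypothetical stand-in for service1.doServiceStuff: completes asynchronously.
    static CompletableFuture<String> doServiceStuff(String name) {
        return CompletableFuture.supplyAsync(() -> name + " done");
    }

    // thenApply (like map) wraps the second future inside the first:
    // the outer future completes once the second call has been started,
    // not when it has finished.
    static CompletableFuture<CompletableFuture<String>> nested(int n) {
        return doServiceStuff("service " + n)
                .thenApply(s -> doServiceStuff("service2 " + n));
    }

    // thenCompose (like flatMap) flattens the result, so the returned
    // future completes only when the second call has completed.
    static CompletableFuture<String> flattened(int n) {
        return doServiceStuff("service " + n)
                .thenCompose(s -> doServiceStuff("service2 " + n));
    }

    public static void main(String[] args) {
        // Waiting on the flattened future waits for both calls.
        System.out.println(flattened(1).join());
        // With the nested version two joins are needed to reach the value.
        System.out.println(nested(1).join().join());
    }
}
```

Awaiting only the outer future of the nested version says nothing about whether the inner call finished, which is the "awaiting the wrong thing" situation described above.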

Related

ThreadPoolExecutor AbortPolicy not throwing exception

I'm using ThreadPoolExecutor where the default RejectedExecutionHandler is AbortPolicy.
As I understand it, AbortPolicy throws a RejectedExecutionException once the queue is full and I keep pushing entries to it.
This is what I'm using to process something.
CompletableFuture<Void> processTask(Executor executorB) {
    return CompletableFuture.runAsync(
        () -> {
            // doSomething(...);
        }, executorB);
}
// Blocks the worker thread for a long time, then logs.
void doSomething() throws InterruptedException {
    Thread.sleep(1000000L);
    System.out.println("doing something");
}
I have another executor (let's say executor A), which is calling the processTask method in a loop 10 times (1 sec delay).
For executorB, my queue size is 3, and the max and core pool sizes are both 1. When it is called for the first time, it goes into Thread.sleep. As executorA is continuously sending messages, I should get the rejection exception after the 4th message, which I'm not seeing anywhere.
Interestingly, the log "doing something" appeared 4 times, which means the tasks submitted after the queue filled up were rejected (1 handled by the first call, and 3 in the queue).
Can someone explain to me why I'm not seeing any exceptions?
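A minimal sketch of where the rejection surfaces, assuming the pool sizes described above (core 1, max 1, queue 3). CompletableFuture.runAsync hands the task to the executor's execute method, so with AbortPolicy the RejectedExecutionException is thrown synchronously in the thread that calls runAsync. The class and method names are made up for illustration, and a latch replaces the long sleep so the demo finishes quickly.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class RejectionDemo {
    // Submits `tasks` blocking tasks and counts how many submissions
    // were rejected at the call site.
    static int countRejections(int tasks) {
        ThreadPoolExecutor executorB = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(3),
                new ThreadPoolExecutor.AbortPolicy());
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < tasks; i++) {
            try {
                CompletableFuture.runAsync(() -> {
                    try {
                        release.await(); // stands in for the long sleep
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }, executorB);
            } catch (RejectedExecutionException e) {
                // Thrown in the submitting thread, not inside the task.
                rejected++;
            }
        }
        release.countDown(); // let the accepted tasks finish
        executorB.shutdown();
        try {
            executorB.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("rejected: " + countRejections(10));
    }
}
```

If the loop that calls processTask itself runs inside a task submitted to executor A, the exception may end up stored in that task's Future instead of being logged. That is an assumption about the calling code, which the question does not show.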

How to trigger handle_info due to timeout in erlang?

I am using a gen_server behaviour and trying to understand how handle_info/2 can be triggered by a timeout occurring in a handle_call, for example:
-module(server).
-export([init/1,handle_call/3,handle_info/2,terminate/2]).
-export([start/0,stop/1]).
init(Data)->
{ok,33}.
start()->
gen_server:start_link(?MODULE,?MODULE,[]).
stop(Pid)->
gen_server:stop(Pid).
handle_call(Request,From,State)->
% The fourth element is the timeout in milliseconds.
Return={reply,State,State,5000},
Return.
handle_info(Request,State)->
% Stop the server, using the received message as the reason.
{stop,Request,State}.
terminate(Reason,State)->
{ok,S}=file:open("D:/Erlang/Supervisor/err.txt",[read,write]),
io:format(S,"~s~n",[Reason]),
ok.
What I want to do:
I was expecting that if I launch the server and would not use gen_server:call/2 for 5 seconds (in my case) then handle_info would be called, which would in turn issue the stop thus calling terminate.
I see it does not happen this way, in fact handle_info is not called at all.
In examples such as this I see the timeout is set in the return value of init/1. What I can deduce is that handle_info gets triggered only if I initialize the server and issue nothing (neither cast nor call) for N seconds. If so, why can I provide a Timeout in the return of both handle_cast/2 and handle_call/3?
Update:
I was trying to get the following functionality:
If no call is issued in X seconds trigger handle_info/2
If no cast is issued in Y seconds trigger handle_info/2
I thought these timeouts could be set in the return of handle_call and handle_cast:
{reply,Reply,State,X} % for call
{noreply,State,Y} % for cast
If not, when are those timeouts triggered since they are returns?
To initiate timeout handling from gen_server:handle_call/3 callback, this callback has to be called in the first place. Your Return={reply,State,State,5000}, is not executed at all.
Instead, if you want to “launch the server and would not use gen_server:call/2 for 5 seconds then handle_info/2 would be called”, you might return {ok,State,Timeout} tuple from gen_server:init/1 callback.
init(Data)->
{ok,33,5000}.
You cannot set different timeouts for calls and for casts. As stated by Alexey Romanov in the comments,
Having different timeouts for different types of messages just isn’t something any gen_* behavior does and would have to be simulated by maintaining them inside state.
If one returns a {reply,Reply,State,Timeout} tuple from handle_call/3 or a {noreply,State,Timeout} tuple from handle_cast/2, the timeout fires if no message arrives in the mailbox of this process within Timeout milliseconds.
I suggest you read the source code of gen_server.erl:
% gen_server.erl
% line 400
loop(Parent, Name, State, Mod, Time, HibernateAfterTimeout, Debug) ->
Msg = receive
Input ->
Input
after Time ->
timeout
end,
decode_msg(Msg, Parent, Name, State, Mod, Time, HibernateAfterTimeout, Debug, false).
It helps to understand the Timeout parameter.
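For readers less familiar with Erlang, the receive ... after loop quoted above can be approximated in Java with a blocking queue. This is only an analogy, not how gen_server is implemented: the names are hypothetical, and a single fixed timeout is used where gen_server takes the timeout from each callback's return value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class MailboxLoopDemo {
    // Waits up to timeoutMs for the next message; mirrors
    // `receive Input -> Input after Time -> timeout end`.
    static String receive(BlockingQueue<String> mailbox, long timeoutMs) {
        try {
            String msg = mailbox.poll(timeoutMs, TimeUnit.MILLISECONDS);
            return msg != null ? msg : "timeout";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Handles messages until the mailbox stays empty for timeoutMs,
    // returning everything that was handled, in order.
    static List<String> run(BlockingQueue<String> mailbox, long timeoutMs) {
        List<String> handled = new ArrayList<>();
        while (true) {
            // The wait restarts for every message, so the timeout only
            // fires after a quiet period following the last message.
            String msg = receive(mailbox, timeoutMs);
            handled.add(msg);
            if (msg.equals("timeout")) {
                return handled;
            }
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();
        mailbox.add("call");
        mailbox.add("cast");
        System.out.println(run(mailbox, 50));
    }
}
```

The point of the sketch is that the timeout measures silence after the most recent message of any kind, which is why separate timeouts per message type have to be tracked in the state.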

Gatling load testing and running scenarios

I am looking to create three scenarios:
The first scenario will run a bunch of GET requests for 30s
The second and third scenarios will run in parallel and wait until the first is finished.
I want the requests from the first scenario to be excluded from the report.
I have the basic outline of what I want to achieve but not seeing expected results:
val myFeeder = csv("somefile.csv")
val scenario1 = scenario("Get stuff")
.feed(myFeeder)
.during(30 seconds) {
exec(
http("getStuff(${csv_colName})").get("/someEndpoint/${csv_colName}")
)
}
val scenario2 = ...
val scenario3 = ...
setUp(
scenario1.inject(
constantUsersPerSec(20) during (30 seconds)
).protocols(firstProtocaol),
scenario2.inject(
nothingFor(30 seconds), //wait 30s
...
).protocols(secondProt),
scenario3.inject(
nothingFor(30 seconds), //wait 30s
...
).protocols(thirdProt)
)
I am seeing the first scenario being run throughout the entire test. It doesn't stop after the 30s?
For the first scenario I would like to cycle through the CSV file and perform a request for each line. Perhaps 5-10 requests per second, how do I achieve that?
I would also like it to stop after the 30s and then run the other two in parallel. Hence the nothingFor in last two scenarios above.
Also how do I exclude from report, is it possible?
You are likely not getting the expected results due to the combination of settings between your injection profile and your "Get Stuff" scenario.
constantUsersPerSec(20) during (30 seconds)
will start 20 users on scenario "Get Stuff" every second for 30 seconds. So even during the 30th second, 20 users will START "Get Stuff". The injection profile only controls when a user starts, not how long they are active. So when a user executes the "Get Stuff" scenario, they make the 'get' request repeatedly over the course of 30 seconds due to the .during loop.
So at the very least, you will have users executing "Get Stuff" for 60 seconds, well into the execution of your other scenarios. Depending on the execution time of your getStuff call, it may be even longer.
To avoid this, you could work out exactly how long you want the "Get Stuff" scenario to run, set that in the injection profile and have no looping in the scenario. Alternatively, you could just set your 'nothingFor' values to be >60s.
To exclude the Get Stuff calls from reports, you can add silencing to the protocol definition (assuming it's not shared with your other requests). More details at https://gatling.io/docs/3.2/http/http_protocol/#silencing
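The 60-second figure above follows from simple arithmetic, sketched here in plain Java (no Gatling API involved; the class and method names are made up). It assumes the last users start at the very end of the injection window and ignores request latency, so it is a lower bound.

```java
class InjectionTimeline {
    // Users started in total by a constant-rate injection profile.
    static int totalUsers(int usersPerSec, int injectionSeconds) {
        return usersPerSec * injectionSeconds;
    }

    // Lower bound on when the scenario is fully finished: the last
    // users start at the end of the injection window and then loop
    // for the full `during` duration.
    static int lastUserFinishSeconds(int injectionSeconds, int loopSeconds) {
        return injectionSeconds + loopSeconds;
    }

    public static void main(String[] args) {
        System.out.println("users started: " + totalUsers(20, 30));
        System.out.println("finished no earlier than (s): "
                + lastUserFinishSeconds(30, 30));
    }
}
```

Any nothingFor value for the other scenarios would need to exceed that bound for the phases not to overlap.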

Kafka Streams - time window close delay?

I'm new to Kafka Streams.
I use the suppress method of KTable in order to handle only the final result of a window like this:
myStream
.windowedBy(TimeWindows.of(Duration.ofSeconds(10)).grace(Duration.ofMillis(500)))
.aggregate(new Aggregation(),
(k, v, a) -> a, // Disabled the actual aggregation in order to eliminate possibilities of latency
materialized.withLoggingDisabled())
.suppress(untilWindowCloses(Suppressed.BufferConfig.unbounded()))
.toStream().peek((k, v) -> log.info("delay " + (System.currentTimeMillis() - k.window().endTime().toEpochMilli())));
This way I get a log with the delay every 10 seconds with the difference between the window end and the actual time the peek was called.
I would expect a very small number here, since this code practically does nothing...
Nevertheless, I get delay of 4-20 sec for each key/window.
I use a thread per task (5 threads for this topic).
Can someone please point out if I'm doing anything wrong?
Thanks!
Edit:
Using VisualVM shows that ~99% of the time is spent in sun.nio.ch.SelectorImpl.select(). This means, AFAIU, that the process is "idle" most of the time.
Edit:
It seems that changing "commit.interval.ms" (which was by default 30000) reduced the delay drastically.
Still, the delay has peaks of even 15 seconds, so the problem isn't solved yet...
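For reference, the commit.interval.ms change mentioned in the edit can be expressed as a plain properties entry. This is a minimal sketch using only java.util.Properties; the value 1000 is an illustrative assumption, not the value used in the question, and a real application would also need its other required settings before passing the properties to Kafka Streams.

```java
import java.util.Properties;

class CommitIntervalConfig {
    // Builds a properties object carrying only the commit interval.
    static Properties streamsProps(long commitIntervalMs) {
        Properties props = new Properties();
        // "commit.interval.ms" is the setting named in the edit above.
        props.put("commit.interval.ms", Long.toString(commitIntervalMs));
        return props;
    }

    public static void main(String[] args) {
        // 1000 ms is an example value chosen for illustration.
        Properties props = streamsProps(1000L);
        System.out.println(props.getProperty("commit.interval.ms"));
    }
}
```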

Rx Extensions - Proper way to use delay to avoid unnecessary observables from executing?

I'm trying to use delay and amb to execute a sequence of the same task separated by time.
All I want is for a download attempt to execute some time in the future only if the same task failed before in the past. Here's how I have things set up, but unlike what I'd expect, all three downloads seem to execute without delay.
Observable.amb([
Observable.catch(redditPageStream, Observable.empty()).delay(0 * 1000),
Observable.catch(redditPageStream, Observable.empty()).delay(30 * 1000),
Observable.catch(redditPageStream, Observable.empty()).delay(90 * 1000),
# Observable.throw(new Error('Failed to retrieve reddit page content')).delay(10000)
# Observable.create(
# (observer) ->
# throw new Error('Failed to retrieve reddit page content')
# )
]).defaultIfEmpty(Observable.throw(new Error('Failed to retrieve reddit page content')))
full code can be found here. src
I was hoping that the first successful observable would cancel out the ones still in delay.
Thanks for any help.
delay doesn't actually stop the execution of whatever you are doing; it just delays when the events are propagated. If you want to delay execution, you would need to do something like:
redditPageStream.delaySubscription(1000)
Since your source produces immediately, the above delays the actual subscription to the underlying stream, which effectively delays when it begins producing.
I would suggest, though, that you use one of the retry operators to handle your retry logic rather than rolling your own through the amb operator.
redditPageStream.delaySubscription(1000).retry(3);
This gives you a constant retry delay. However, if you want to implement a linear backoff, you can use the retryWhen() operator instead, which lets you apply whatever logic you want to the backoff.
redditPageStream.retryWhen(errors => {
return errors
//Only take 3 errors
.take(3)
//Use timer to implement a linear back off and flatten it
.flatMap((e, i) => Rx.Observable.timer(i * 30 * 1000));
});
Essentially retryWhen will create an Observable of errors, each event that makes it through is treated as a retry attempt. If you error or complete the stream then it will stop retrying.
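The linear backoff schedule used in the retryWhen example (wait i * 30 seconds after the i-th error, counting from zero) can also be sketched without Rx. The following plain-Java loop is an analogy, not the Rx implementation: the retry helper and its sleeper parameter are hypothetical, and the sleeper is injected so the demo records delays instead of actually waiting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;
import java.util.function.Supplier;

class LinearBackoffRetry {
    // Runs the task; after the i-th failure (0-based) waits i * stepMs
    // and tries again, giving up after maxRetries retries.
    static <T> T retry(Supplier<T> task, int maxRetries, long stepMs,
                       LongConsumer sleeper) {
        for (int failures = 0; ; failures++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                if (failures >= maxRetries) {
                    throw e; // out of retries: propagate the last error
                }
                sleeper.accept(failures * stepMs);
            }
        }
    }

    public static void main(String[] args) {
        List<Long> delays = new ArrayList<>();
        int[] calls = {0};
        // Simulated download that fails twice and then succeeds.
        String page = retry(() -> {
            if (calls[0]++ < 2) {
                throw new IllegalStateException("download failed");
            }
            return "page content";
        }, 3, 30_000L, ms -> delays.add(ms));
        System.out.println(page + " after delays " + delays);
    }
}
```

In real use the sleeper would block or schedule (for example with Thread.sleep), and a later attempt only runs if the earlier one failed, which is the behavior the question was after.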