API Observable with dynamic caching

The API I'm polling returns a field, cachedUntil, that says until when the returned value is cached. The goal is to create an Observable that polls the API and emits an event every time the cache has expired. What distinguishes this case is that the caching interval is not regular, so Observable.interval does not apply.
In what ways is it possible to implement an Observable with this behaviour?
The following function polls the API, emits the fetched contracts to the subscriber, and returns the delay in milliseconds until cachedUntil, i.e. until the next call should be made:
def getContracts(subscriber: Subscriber[Set[EveContract]]): Option[Long] = {
  logger.debug("Fetching new contracts")
  try {
    val response = parser.getResponse(auth)
    if (response == null) {
      subscriber.onError(new RuntimeException("Unable to fetch contracts from EVE servers"))
      None
    } else if (response.hasError) {
      logger.error(response.getError.toString)
      subscriber.onError(new RuntimeException(response.getError.toString))
      None
    } else {
      subscriber.onNext(response.getAll.toSet) // Emit new polled data
      Some(response.getCachedUntil.getTime - new Date().getTime) // Return the cache delay
    }
  } catch {
    case aex: ApiException ⇒
      logger.error("An error occurred when querying the EVE API.")
      logger.debug("ApiException: ", aex)
      subscriber.onError(aex)
      None
  }
}

It is possible to use Scheduler workers to reschedule a call to getContracts:
Observable[Set[EveContract]](observer ⇒ {
  val worker = Schedulers.newThread().createWorker()
  def scheduleContracts(delay: Long): Unit = {
    worker.schedule(new Action0 {
      override def call(): Unit = {
        if (!observer.isUnsubscribed) {
          // Poll once; the result is the delay until the cache expires.
          getContracts(observer) match {
            // Reschedule a contract fetch after time d has passed.
            case Some(d) ⇒
              logger.debug(s"Rescheduling contract fetch in: ${d / 1000} s")
              scheduleContracts(d)
            case _ ⇒
              // Otherwise do nothing
              logger.debug("Not rescheduling contract fetch, an error has occurred.")
          }
        } else {
          logger.trace("Subscriber has unsubscribed.")
        }
      }
    }, delay, TimeUnit.MILLISECONDS)
  }
  scheduleContracts(0L)
})
However, I'm very interested in other possible solutions.
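For comparison, the self-rescheduling idea can be sketched without any Rx types, using only the JDK's ScheduledExecutorService. This is a minimal sketch, not the question's code: VariableDelayPoller, start and the poll callback are hypothetical names, and poll stands in for getContracts (it returns the delay in milliseconds before the next call, or None to stop).

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical helper: runs `poll` now, then again after whatever delay it returns.
object VariableDelayPoller {
  /** Returns a flag; setting it to true stops any further polling. */
  def start(scheduler: ScheduledExecutorService)(poll: () => Option[Long]): AtomicBoolean = {
    val cancelled = new AtomicBoolean(false)

    def scheduleNext(delayMs: Long): Unit = {
      scheduler.schedule(new Runnable {
        def run(): Unit =
          if (!cancelled.get) {
            // A negative delay (cache already expired) is treated as "poll again now".
            poll().foreach(d => scheduleNext(math.max(0L, d)))
          }
      }, delayMs, TimeUnit.MILLISECONDS)
      ()
    }

    scheduleNext(0L)
    cancelled
  }
}
```

Each run schedules at most one follow-up task, so the interval can change on every poll. An exception thrown by poll ends the loop silently in this sketch, so real code would need to catch and report it.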

Related

ScalaTest asserting multiple futures using AsyncFunSuiteLike

I've been trying to perform a test that uses a mock Http server to respond and a function that returns a Future[String] or an Exception if the Http server response isn't 200.
I'm trying to achieve a test without using Awaits, but instead AsyncFunSuiteLike.
However the following test seems impossible to resolve without doing it synchronously:
test("Error responses") {
  Future.sequence {
    NanoHTTPD.Response.Status.values().toList.filter(status => status.getRequestStatus >= 400).map {
      status => {
        httpService.setStatusCode(status)
        val responseBody = s"Request failed with status $status"
        httpService.setResponseContent(responseBody)
        val errorMessage = s"Error response (${status.getRequestStatus}) from http service: $responseBody"
        recoverToExceptionIf[ServiceException] {
          myObject.httpCall("123456")
        }.map {
          ex => assert(ex.getMessage === errorMessage)
        }
      }
    }
  }.map(assertions => assert(assertions.forall(_ == Succeeded)))
}
Basically, the problem is that by the time the Futures are evaluated, the NanoHTTPD server holds the last value set in the map, so all ex.getMessage values are the same. If I run those status codes one by one I do get the desired results, but is there a way to do all of this in a single async test?

From the looks of it, NanoHTTPD is stateful, so you have a race between the .set... calls and the .httpCall.
If you can spin up a new httpService within each Future, then you should be able to parallelize the tests (unless the state in question would be shared across instances, in which case you're likely out of luck).
So you'd have something like (replace Status with the type of status in your code and HTTPService with the type of httpService):
// The following code was composed on the fly and has not been run through the compiler...
def spinUpHTTPService(status: Status, body: String): Future[HTTPService] = Future {
  // insert the code outside of the test which creates httpService
  httpService.setStatusCode(status)
  httpService.setResponseContent(body)
  httpService
}
test("Error responses") {
  Future.sequence(
    NanoHTTPD.Response.Status.values().toList.filter(status => status.getRequestStatus >= 400).map { status =>
      // Defined here so that both the service setup and the expected message can use it.
      val responseBody = s"Request failed with status $status"
      spinUpHTTPService(status, responseBody)
        .flatMap { httpService =>
          val errorMessage = s"Error response (${status.getRequestStatus}) from http service: $responseBody"
          recoverToExceptionIf[ServiceException] {
            myObject.httpCall("123456")
          } map {
            ex => assert(ex.getMessage === errorMessage)
          }
        } // Future.flatMap
    } // List.map
  ).map { assertions => assert(assertions.forall(_ == Succeeded)) }
}

How to handle removed data from state

I have a sessionization use case. I keep my sessions in memory using mapWithState() and update them for each incoming log. When a session ends, which is signaled by a specific log, I want to retrieve it and remove it from my State.
The problem I stumble upon is that I cannot both retrieve AND remove (remove()) my session at the end of each batch, because retrieval happens outside the updateFunction() and removal happens within it. Once removed, the session cannot be retrieved, and once a session has ended there should not be any more logs for it, so its key never reaches the update function again.
I can still retrieve my ended sessions if I never remove them, but then the number of "dead" sessions keeps growing, and the state will eventually overflow and threaten the system itself. That solution is not acceptable.
As this seems like a common use case, I was wondering if anyone has come up with a solution?
EDIT
Sample code below:
def mapWithStateContainer(iResultParsing: DStream[(String, SessionEvent)]) = {
  val lStateSpec = StateSpec.function(stateUpdateFunction _).timeout(Seconds(TIMEOUT))
  val lResultMapWithState: DStream[(String, Session)] =
    iResultParsing.mapWithState(lStateSpec).stateSnapshots()
  val lClosedSession: DStream[(String, Session)] =
    lResultMapWithState.filter(_._2.mTimeout)
  // ideally remove here lClosedSession from the state
}
private def stateUpdateFunction(iKey: String,
                                iValue: Option[SessionEvent],
                                iState: State[Session]): Option[(String, Session)] = {
  var lResult = None: Option[(String, Session)]
  if (iState.isTimingOut()) {
    val lClosedSession = iState.get()
    lClosedSession.mTimeout = true
    lResult = Some((iKey, lClosedSession))
  } else if (iState.exists) {
    // lCurrentSession was undefined in the original snippet; reading it from the state is assumed.
    val lCurrentSession = iState.get()
    val lUpdatedSession = updateSession(lCurrentSession, iValue)
    iState.update(lUpdatedSession)
    lResult = Some((iKey, lUpdatedSession))
    // we wish to remove the lUpdatedSession from the state once retrieved with lResult
    /*if (lUpdatedSession.mTimeout) {
      iState.remove()
      lResult = None
    }*/
  } else {
    val lInitialState = initSession(iValue)
    iState.update(lInitialState)
    lResult = Some((iKey, lInitialState))
  }
  lResult
}
private def updateSession(iCurrentSession: Session,
                          iNewData: Option[SessionEvent]): Session = {
  // user disconnects manually (a missing event is treated as "no disconnection")
  if (iNewData.exists(_.mDisconnection)) {
    iCurrentSession.mTimeout = true
  }
  iCurrentSession
}

Instead of calling MapWithStateDStream.stateSnapshots(), you can return the updated state as the return value of your mapWithState operation. This way, the finalized state is always available outside your stateful DStream.
This means that you can do:
else if (iState.exists) {
  val lUpdatedSession = updateSession(iState.get(), iValue)
  iState.update(lUpdatedSession)
  if (lUpdatedSession.mTimeout) {
    iState.remove()
  }
  Some((iKey, lUpdatedSession))
}
And now change your graph to:
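val lResultMapWithState = iResultParsing
  .mapWithState(lStateSpec)
  // the update function returns Option[(String, Session)], so unwrap it before filtering
  .flatMap(_.toList)
  .filter { case (_, session) => session.mTimeout }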
What happens now is that the state is removed internally, but because you're returning it from your StateSpec function, it's still available to you outside for further processing.
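The "remove from state, but still return it" idea can be illustrated without Spark. The following is only a plain-Scala model under stated assumptions: SessionModel, Session and update are hypothetical names, and the mutable map merely stands in for Spark's keyed state, it is not the Spark API.

```scala
import scala.collection.mutable

// Plain-Scala model of the idea; the map is a stand-in for State[Session].
object SessionModel {
  final case class Session(events: Int, closed: Boolean)

  val state = mutable.Map.empty[String, Session]

  /** Updates the session for `key`. A closed session is dropped from `state`
    * but is still returned, so the caller sees it exactly once. */
  def update(key: String, disconnect: Boolean): (String, Session) = {
    val previous = state.getOrElse(key, Session(0, closed = false))
    val updated = Session(previous.events + 1, closed = disconnect)
    if (updated.closed) state.remove(key) else state.update(key, updated)
    (key, updated)
  }
}
```

Feeding the events ("a", open), ("b", open), ("a", disconnect) through update and filtering the returned pairs on closed yields only the finished session for "a", while state keeps just "b".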

RxJava user-retry observable with .cache operator?

I have an observable that I create with the following code.
Observable.create(new Observable.OnSubscribe<ReturnType>() {
    @Override
    public void call(Subscriber<? super ReturnType> subscriber) {
        try {
            if (!subscriber.isUnsubscribed()) {
                subscriber.onNext(performRequest());
            }
            subscriber.onCompleted();
        } catch (Exception e) {
            subscriber.onError(e);
        }
    }
});
performRequest() performs a long-running task, as you might expect.
Now, since I might be launching the same Observable twice or more in a very short amount of time, I decided to write the following transformer:
protected Observable.Transformer<ReturnType, ReturnType> attachToRunningTaskIfAvailable() {
    return origObservable -> {
        synchronized (mapOfRunningTasks) {
            // If not in maps
            if ( ! mapOfRunningTasks.containsKey(getCacheKey()) ) {
                Timber.d("Cache miss for %s", getCacheKey());
                mapOfRunningTasks.put(
                    getCacheKey(),
                    origObservable
                        .doOnTerminate(() -> {
                            Timber.d("Removed from tasks %s", getCacheKey());
                            synchronized (mapOfRunningTasks) {
                                mapOfRunningTasks.remove(getCacheKey());
                            }
                        })
                        .cache()
                );
            } else {
                Timber.d("Cache Hit for %s", getCacheKey());
            }
            return mapOfRunningTasks.get(getCacheKey());
        }
    };
}
This basically puts the original observable, with .cache() applied, in a HashMap<String, Observable>.
It prevents multiple requests with the same getCacheKey() (for example, login) from calling performRequest() in parallel. Instead, if a second login request arrives while another is in progress, the second request's observable is discarded and the already-running one is used instead. All calls to onNext are cached and sent to both subscribers, so my backend is actually hit only once.
Now, suppose this code:
// Observable loginTask
public void doLogin(Observable<UserInfo> loginTask) {
    loginTask.subscribe(
        (userInfo) -> {},
        (throwable) -> {
            if (userWantsToRetry()) {
                doLogin(loginTask);
            }
        }
    );
}
Here loginTask was composed with the previous transformer. When an error occurs (for example, a connectivity problem) and userWantsToRetry() returns true, I basically call the method again with the same observable. Unfortunately that observable has been cached, so I receive the same error without hitting performRequest() again, because the sequence gets replayed.
Is there a way I could have both the "same requests grouping" behavior that the transformer provides me AND the retry button?

Your question has a lot going on and it's hard to put it into direct terms. I can make a couple of recommendations though. Firstly, your Observable.create can be simplified by using Observable.defer(Func0<Observable<T>>). This runs the func every time a new subscriber subscribes, and catches any exceptions and channels them to the subscriber's onError.
Observable.defer(() -> {
    return Observable.just(performRequest());
});
Next, you can use observable.repeatWhen(Func1<Observable<Void>, Observable<?>>) to decide when you want to retry. Repeat operators re-subscribe to the observable after an onCompleted event. This particular overload sends an event to a subject each time an onCompleted event is received, and the function you provide receives this subject. Your function should apply something like takeWhile(predicate), so that the stream it returns completes when you do not want to retry again.
Observable.just(1, 2, 3).flatMap((Integer num) -> {
    final AtomicInteger tryCount = new AtomicInteger(0);
    return Observable.just(num)
        .repeatWhen((Observable<? extends Void> notifications) ->
            notifications.takeWhile((x) -> num == 2 && tryCount.incrementAndGet() != 3));
})
.subscribe(System.out::println);
Output:
1
2
2
2
3
The above example shows that repeats are allowed only when the event is 2, up to a maximum of 2 repeats (three emissions in total). If you switch to repeatWhen, then the flatMap would contain your decision whether to use a cached observable or the realWork observable. Hope this helps!
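Independently of the operators above, the behaviour the question asks for (share a request only while it is running, so that a later retry starts a fresh one) can be sketched with plain Scala Futures instead of RxJava. This swaps in a different technique for illustration: InFlight and run are hypothetical names, not part of RxJava or of the question's code.

```scala
import scala.collection.concurrent.TrieMap
import scala.concurrent.{ExecutionContext, Future, Promise}

// Hypothetical helper: concurrent callers with the same key share one running request.
class InFlight[K, V](ec: ExecutionContext) {
  private implicit val executor: ExecutionContext = ec
  private val running = TrieMap.empty[K, Future[V]]

  def run(key: K)(request: () => Future[V]): Future[V] = {
    val promise = Promise[V]()
    running.putIfAbsent(key, promise.future) match {
      case Some(existing) =>
        existing // a request for this key is already running: join it
      case None =>
        // Start the request; Future.unit.flatMap also captures synchronous throws.
        Future.unit.flatMap(_ => request()).onComplete { result =>
          // Remove the entry before completing, so a retry triggered by the
          // failure callback starts a new request instead of replaying this result.
          running.remove(key, promise.future)
          promise.complete(result)
        }
        promise.future
    }
  }
}
```

Because the entry is only kept while the request is in flight, a failed login is never replayed: calling run again with the same key after the failure executes the request function a second time.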

Execute some logic asynchronously in spray routing

Here is my simple routing application:
object Main extends App with SimpleRoutingApp {
  implicit val system = ActorSystem("my-system")
  startServer(interface = "0.0.0.0", port = System.getenv("PORT").toInt) {
    import format.UsageJsonFormat._
    import spray.httpx.SprayJsonSupport._
    path("") {
      get {
        complete("OK")
      }
    } ~
    path("meter" / JavaUUID) {
      meterUUID => pathEnd {
        post {
          entity(as[Usage]) {
            usage =>
              // execute some logic asynchronously
              // do not wait for the result
              complete("OK")
          }
        }
      }
    }
  }
}
What I want to achieve is to execute some logic asynchronously in my path directive, without waiting for the result, and return HTTP 200 OK immediately.
I am quite new to Scala and spray and am wondering if there is a spray way to solve this specific problem. Otherwise I would go in the direction of creating an Actor for every request and letting it do the job. Please advise.

There's no special way of handling this in spray: simply fire your async action (a method returning a Future, a message sent to an actor, whatever) and call complete right after.
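// Requires an implicit ExecutionContext in scope, e.g. import system.dispatcher
def doStuffAsync(): Future[Unit] = Future {
  // literally anything
}
path("meter" / JavaUUID) { meterUUID =>
  pathEnd {
    post {
      entity(as[Usage]) { usage =>
        doStuffAsync() // started here; the result is deliberately ignored
        complete("OK")
      }
    }
  }
}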
Conversely, if you need to wait for an async action to complete before sending the response, you can use spray-specific directives for working with Futures or Actors.
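The fire-and-forget pattern itself does not depend on spray, so here is a minimal sketch using only scala.concurrent. FireAndForget and handle are hypothetical names, and the returned "OK" string stands in for the HTTP response.

```scala
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical stand-in for a request handler that must not wait for its side work.
object FireAndForget {
  /** Starts `work` on `ec` and returns at once. A failure is only logged,
    * because nobody is waiting on the Future. */
  def handle(work: () => Unit)(implicit ec: ExecutionContext): String = {
    Future(work()).failed.foreach(e => println(s"background task failed: $e"))
    "OK"
  }
}
```

Attaching a failure callback is worth the extra line: with a discarded Future, an exception in the background work would otherwise go unnoticed.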

Discard all messages except the last one in a Scala actor

I have a SwingWorker actor which computes a plot for display from a parameters object that is sent to it, then draws the plot on the EDT thread. Some GUI elements can tweak the parameters for this plot. When they change, I generate a new parameters object and send it to the worker.
This works so far.
Now, when moving a slider, many events are created and queue up in the worker's mailbox. But I only need to compute the plot for the very last set of parameters. Is there a way to drop all messages from the inbox except the last one, and process only that?
Currently the code looks like this:
val worker = new SwingWorker {
  def act() {
    while (true) {
      receive {
        case params: ExperimentParameters => {
          // somehow expensive
          val result = RunExperiments.generateExperimentData(params)
          Swing.onEDT{ GuiElement.redrawWith(result) }
        }
      }
    }
  }
}

Meanwhile I have found a solution. You can check the mailbox size of the actor and simply skip the message if it is not 0.
val worker = new SwingWorker {
  def act() {
    while (true) {
      receive {
        case params: ExperimentParameters => {
          // only do the work if no newer message is waiting
          if (mailboxSize == 0) {
            // somehow expensive
            val result = RunExperiments.generateExperimentData(params)
            Swing.onEDT{ GuiElement.redrawWith(result) }
          }
        }
      }
    }
  }
}

Remember the last event without processing it, use a very short timeout, and process the last event when you get the timeout.
It could look like this (not tested):
while (true) {
  var lastReceived: Option[ExperimentParameters] = None
  // Block until at least one message arrives.
  receive { case params: ExperimentParameters => lastReceived = Some(params) }
  while (lastReceived.isDefined) {
    // Drain the mailbox without waiting; TIMEOUT fires once it is empty.
    receiveWithin(0) {
      case params: ExperimentParameters => lastReceived = Some(params)
      case TIMEOUT =>
        // do your job with lastReceived.get, then reset so the inner loop ends
        lastReceived = None
    }
  }
}
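The same "keep only the newest message" idea can be shown outside the actor library with a JDK blocking queue. This is a sketch under assumptions: LatestOnly and takeLatest are hypothetical names, and the queue plays the role of the actor's mailbox.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.annotation.tailrec

// Hypothetical helper: the queue stands in for the actor's mailbox.
object LatestOnly {
  /** Blocks until one item is available, then discards every queued item
    * except the newest one and returns it. */
  def takeLatest[A](queue: LinkedBlockingQueue[A]): A = {
    @tailrec
    def drain(latest: A): A =
      Option(queue.poll()) match { // poll() returns null when the queue is empty
        case Some(newer) => drain(newer)
        case None        => latest
      }
    drain(queue.take())
  }
}
```

A worker loop would call takeLatest, run the expensive computation on the result, and repeat, so parameters that arrive during a computation collapse into a single follow-up run.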