Only Take Last Item After X Seconds - reactive-programming

I am trying to skip items from a stream until n seconds have passed, and then take the last item that was passed in the stream. This is what I have so far:
const delayedState$ = state$.delay(1000);
state$.buffer(
  delayedState$
).filter(
  (buffer) => buffer && buffer.length > 0
).publishReplay(1).refCount().map(
  (buffer) => buffer.slice(-1).pop()
).subscribe((state) => {
  saveState({
    buttonCount: state.buttonCount
  });
})
But this seems messy, and doesn't seem to work when the stream has many changes in very short succession. I am basically trying to follow this:
https://github.com/tayiorbeii/egghead.io_idiomatic_redux_course_notes/blob/master/03-Persisting_the_State_to_the_Local_Storage.md
My constraint: it has to be the last item after n seconds, not the first item followed by an n-second wait.

In this demo the source Observable emits 10 values with a 1s delay between them.
All values are ignored until the inner Observable, delayed by 5s, emits a value. Then only the last value is passed to the subscriber once the source completes (the last value is 9 because interval() counts from 0):
let source = Observable.interval(1000).take(10);
source.skipUntil(Observable.of(true).delay(5000))
  .takeLast(1)
  .subscribe(val => console.log(val));
See live demo: http://plnkr.co/edit/zklJWnnKzHu3smNmr0T2?p=preview
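One caveat worth keeping in mind: takeLast(1) emits only when the source completes, so it will not fire on a state stream that never completes. RxJS 5 also ships an auditTime operator that emits the most recent value once a time window has elapsed. The following is only a minimal plain-JavaScript sketch of that idea, not RxJS itself; createTrailingSampler and its schedule parameter are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper (not part of RxJS): the first value opens a window of
// `ms`; when the window closes, only the most recent value is passed on.
function createTrailingSampler(ms, onEmit, schedule = setTimeout) {
  let windowOpen = false; // true while a timer is pending
  let latest;             // most recent value seen in the open window
  return function push(value) {
    latest = value;
    if (windowOpen) return; // window already open: just remember the value
    windowOpen = true;
    schedule(() => {
      windowOpen = false;
      onEmit(latest);
    }, ms);
  };
}

// Usage with a manual scheduler, so the timing is explicit.
const timers = [];
const emitted = [];
const push = createTrailingSampler(
  1000,
  (state) => emitted.push(state),
  (fn) => timers.push(fn) // stand-in for setTimeout
);

push({ buttonCount: 1 });
push({ buttonCount: 2 });
push({ buttonCount: 3 });
timers.shift()(); // simulate the 1000 ms timer firing
// emitted now holds only the last state of the burst
```

With the default setTimeout scheduler, the same helper could wrap a saveState call so that a burst of changes results in a single save.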

Related

Is there an Operation to block onComplete?

I am trying to learn reactive programming, so forgive me if I ask a silly question. I'm also open to advice on changing my design.
I am working in scala-swing to display the results of a simulator. With one setting, a chart is displayed as a histogram; with the other setting the chart is displayed as the cumulative sum. (I'm probably using the wrong word; in the first setting you might have bin1=2, bin2=5, bin3=3; in the second setting the first height is 2, the second is 2 + 5, the third is 2 + 5 + 3, etc.). The simulator can be slow, so I originally used a Future to compute it, and then set the data into the chart. I decided to try a reactive approach, so my requirements are: 1. I don't want to recreate the data when I change the display mode, and 2. I want to set the Observable once for the chart and have the chart listen to the same Observable permanently.
I got this to work when I started the chain with a PublishSubject and the Future set the data into the start of the chain. When the display mode changed, I created a new PublishSubject().map(newRenderingLogic).subscribe(theChartsObservable). I am now trying to do what looks like the "right way," but it's not working correctly. I've tried to simplify what I have done:
val textObservable: Subject[String] = PublishSubject()
textObservable.subscribe(text => {
  println(s"Text: ${text}")
})
var textSubscription: Option[Subscription] = None
val start = Observable.from(Future {
  "Base text"
}).cache
var i = 0
val button = new Button() {
  text = "Click"
  reactions += {
    case event => {
      i += 1
      if (textSubscription.isDefined) {
        textSubscription.get.unsubscribe()
      }
      textSubscription = Some(start.map(((j: Int) => { (base: String) => s"${base} ${j}" })(i)).subscribe(textObservable))
    }
  }
}
On start, an Observable is created and logic to print some text is added to it. Then, an Observable with the generated data is created and a cache is added so that the result is replayed if the next subscription comes in after its results are generated. Then, a button is created. Then on button clicks a middle observable is chained with unique logic (it's a function that creates a function to append the value of i into the string, run with the current value of i; I tried to make something that couldn't just be reused) that is supposed to change with each click. Then the first Observable is subscribed to it so that the results of the whole chain end up being printed.
In theory, the cache operation takes care of not regenerating the data, and this works once, but onComplete is called on textObservable and then it can't be used again. It works if I subscribe it like this:
textSubscription = Some(start.map(((j: Int) => { (base: String) => s"${base} ${j}" })(i)).subscribe(text => textObservable.onNext(text)))
because the call to onComplete is intercepted, but this looks wrong and I wanted to know if there was a more typical way to do this, or architect it. It makes me think that I don't understand how this is supposed to be done if there isn't an out-of-the-box operation to do this.
Thank you.
I'm not 100% sure if I got the essence of your question right, but: if you have an Observable that may complete and you want to turn it into an Observable that never completes, you can just concatenate it with Observable.never.
For example:
// will complete after emitting those three elements:
val completes = Observable.from(List(1, 2, 3))
// will emit those three elements, but will never complete:
val wontComplete = completes ++ Observable.never
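To illustrate why concatenating with a never-ending stream hides completion, here is a minimal sketch in plain JavaScript. It is a toy model, not RxScala: an "observable" is just a function that takes an observer, and fromArray, never and concat are hypothetical helpers written for this example.

```javascript
// Toy observable: a function that pushes events into an observer.
const fromArray = (items) => (observer) => {
  items.forEach((x) => observer.next(x));
  observer.complete();
};

// Emits nothing and never completes.
const never = () => (_observer) => {};

// Forwards values from `first`; when `first` completes, it subscribes the
// same observer to `second` instead of forwarding the completion.
const concat = (first, second) => (observer) => {
  first({
    next: (x) => observer.next(x),
    complete: () => second(observer),
  });
};

// Records everything an observable sends.
const record = (observable) => {
  const events = [];
  observable({
    next: (x) => events.push(x),
    complete: () => events.push("complete"),
  });
  return events;
};

const completes = record(fromArray([1, 2, 3]));
const wontComplete = record(concat(fromArray([1, 2, 3]), never()));
```

In this model the downstream observer sees the three values in both cases, but only the first recording contains a completion event.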

how to detect duplicated line using scala akka stream

We have a Scala application that reads lines from a text file and processes them using Akka Stream. For better performance we set parallelism to 5. The problem is that if multiple lines contain the same email, we should keep only one of them, treat the others as duplicates, and report an error. I tried to use a Java ConcurrentHashMap to detect duplicates but it didn't work. Here is my code:
allIdentifiers = new ConcurrentHashMap[String, Int]()
Source(rows)
.mapAsync(config.parallelism.value) {
case (dataRow, index) => {
val eventResendResult: EitherT[Future, NonEmptyList[ResendError], ResendResult] =
for {
cleanedRow <- EitherT.cond[Future](
!allIdentifiers.containsKey(dataRow.lift(emailIndex)), {
allIdentifiers.put(dataRow.lift(emailIndex),index)
dataRow
}, {
NonEmptyList.of(
DuplicatedError(
s"Duplicated record at row $index",
List(identifier)
)
)
}
)
_ = logger.debug(
LoggingMessage(
requestId = RequestId(),
message = s"allIdentifiers: $allIdentifiers"
)
)
... more process step ...
} yield foldResponses(sent)
eventResendResult
.leftMap(errors => ResendResult(errors.toList, List.empty))
.merge
}
}
.runWith(Sink.reduce { (result1: ResendResult, result2: ResendResult) =>
ResendResult(
result1.errors ++ result2.errors,
result1.results ++ result2.results
)
})
We have config.parallelism.value set to 5, which means that at any moment it will process up to 5 lines concurrently. What I observed is that if duplicated lines are right next to each other, detection does not work. Example:
line 0 contains email1
line 1 contains email1
line 2 contains email2
line 3 contains email2
line 4 contains email3
From the log I see the ConcurrentHashMap was populated with entries, but all lines passed the duplicate check and moved on to the next processing step.
So is Akka Stream's parallelism not the same thing as Java's multithreading? How can I detect duplicated lines in this case?
The problem is in the following snippet:
cleanedRow <- EitherT.cond[Future](
!allIdentifiers.containsKey(dataRow.lift(emailIndex)), {
allIdentifiers.put(dataRow.lift(emailIndex),index)
dataRow
}, {
NonEmptyList.of(
DuplicatedError(
s"Duplicated record at row $index",
List(identifier)
)
)
}
)
In particular, imagine two threads simultaneously processing rows with the same email, which should be deduplicated. The following sequence is possible:
1. The first thread checks containsKey and finds the email is not in the map.
2. The second thread checks containsKey and finds the email is not in the map.
3. The first thread adds the email to the map (based on the result from step 1) and passes the row through.
4. The second thread adds the email to the map (based on the result from step 2) and passes the row through.
In other words, you need to check the map for the key and update it in one atomic step. This is a common requirement, and ConcurrentHashMap's put covers it: it atomically updates the value at the key and returns the previous value it replaced, or null if there was none.
I'm not too familiar with the combinators in Cats, so the following might not be idiomatic. However, note how it inserts and checks for a previous value in one atomic step.
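cleanedRow <- EitherT(Future.successful {
  // put is atomic and returns the previous value, or null if the key was absent.
  // Assumption: the map's value type is a reference type such as java.lang.Integer;
  // with a Scala Int value type a null result is unboxed to 0.
  val previous = allIdentifiers.put(dataRow.lift(emailIndex), index)
  Either.cond(
    previous == null, // Right(dataRow) only when the email was not seen before
    dataRow,
    NonEmptyList.of(
      DuplicatedError(
        s"Duplicated record at row $index",
        List(identifier)
      )
    )
  )
})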
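The interleaving described above can be replayed deterministically. The following is a small sketch in plain JavaScript, which is single-threaded, so the race is simulated by running the check and insert steps by hand in the problematic order. The putIfAbsent function is a hypothetical helper written for this example; its body runs without interruption, which stands in for the atomicity the Java map provides.

```javascript
// Non-atomic variant: check and insert are two separate steps.
const seen = new Map();
const check = (email) => !seen.has(email);
const insert = (email, row) => seen.set(email, row);

// Replay the problematic order for rows 0 and 1, which share an email.
const firstIsNew = check("email1");  // worker 1 checks: not in the map
const secondIsNew = check("email1"); // worker 2 checks: still not in the map
insert("email1", 0);                 // worker 1 inserts and passes the row on
insert("email1", 1);                 // worker 2 inserts and passes the row on
const racyAccepted = [firstIsNew, secondIsNew]; // both rows were accepted

// Atomic variant: a single call inserts and reports the previous value.
// Hypothetical helper; returns undefined when the key was absent.
function putIfAbsent(map, key, value) {
  if (map.has(key)) return map.get(key);
  map.set(key, value);
  return undefined;
}

const seenAtomic = new Map();
const atomicAccepted = [0, 1].map(
  (row) => putIfAbsent(seenAtomic, "email1", row) === undefined
);
// Only the first row is accepted; the second is flagged as a duplicate.
```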

RxJS interleaving merged observables (priority queue?)

UPDATE
I think I've figured out the solution. I explain it in this video. Basically, use timeoutWith, and some tricks with zip (within zip).
https://youtu.be/0A7C1oJSJDk
If I have a single observable like this:
A-1-2--B-3-4-5-C--D--6-7-E
I want to put the "numbers" as lower priority; it should wait until the "letters" is filled up (a group of 2 for example) OR a timeout is reached, and then it can emit. Maybe the following illustration (of the desired result) can help:
A------B-1-----C--D-2----E-3-4-5-6-7
I've been experimenting with some ideas... one of them: first step is to split that stream (groupBy), one containing letters, and the other containing numbers..., then "something in the middle" happen..., and finally those two (sub)streams get merged.
It's that "something in the middle" what I'm trying to figure out.
How to achieve it? Is that even possible with RxJS (ver 5.5.6)? If not, what's the closest one? I mean, what I want to avoid is having the "numbers" flooding the stream, and not giving enough chance for the "letters" to be processed in timely manner.
Probably this video I made of my efforts so far can clarify as well:
Original problem statement: https://www.youtube.com/watch?v=mEmU4JK5Tic
So far: https://www.youtube.com/watch?v=HWDI9wpVxJk&feature=youtu.be
My solution so far (delaying each emission in the "numbers" substream using .delay) is suboptimal, because it keeps ticking at a slow pace (10 seconds) even after the "characters" (sub)stream has gone quiet (not completed; there is no clear boundary here, it just receives no more values for an indeterminate amount of time). What I really need is for the "numbers" substream to raise its pace (to 2 seconds) once that happens.
Unfortunately I don't know RxJS 5 that well; I use xstream myself (authored by one of the contributors to RxJS 5), which is a little simpler in terms of the number of operators.
With this I crafted the following example:
(Note: the operators are pretty much the same as in Rx5; the main difference is flatten, which is more or less like switch but seems to handle synchronous streams differently.)
const xs = require("xstream").default;
const input$ = xs.of("A",1,2,"B",3,4,5,"C","D",6,7,"E");
const initialState = { $: xs.never(), count: 0, buffer: [] };
const state$ = input$
.fold((state, value) => {
const t = typeof value;
if (t === "string") {
return {
...state,
$: xs.of(value),
count: state.count + 1
};
}
if (state.count >= 2) {
const l = state.buffer.length;
return {
...state,
$: l > 0 ? xs.of(state.buffer[0]) : xs.of(value) ,
count: 0,
buffer: state.buffer.slice(1).concat(value)
};
}
return {
...state,
$: xs.never(),
buffer: state.buffer.concat(value),
};
}, initialState);
xs
.merge(
state$
.map(s => s.$),
state$
.last()
.map(s => xs.of.apply(xs, s.buffer))
)
.flatten()
.subscribe({
next: console.log
});
Which gives me the result you are looking for.
It works by folding the stream on itself, looking at the type of each value and emitting a new stream depending on it. When you need to wait because not enough letters have been dispatched, I emit a silent stream (xs.never(): emits no value, no error, no completion) as a "placeholder".
You could instead of emitting this empty stream emit something like
xs.empty().endsWith(xs.periodic(timeout)).last().mapTo(value):
// stream that will emit a value only after a specified timeout.
// Because the streams are **not** flattened concurrently you can
// use this as a "pending" stream that may or may not be eventually
// consumed
where value is the last received number, in order to implement timeout-related conditions. However, you would then need to introduce some kind of reflexivity, with either a Subject in Rx or xs.imitate in xstream, because you would need to notify your state that your "pending" stream has been consumed. That makes the communication bidirectional, whereas streams / observables are unidirectional.
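To make the intended ordering concrete, here is a simplified synchronous sketch in plain JavaScript. It is not the xstream fold above and it ignores timeouts entirely: prioritize is a hypothetical function that passes letters through immediately, buffers numbers, releases one buffered number after each group of letters, and flushes the rest when the input ends.

```javascript
// Hypothetical helper: reorders a finite input so that letters take priority.
// groupSize is the number of letters that must pass before one number may.
function prioritize(input, groupSize = 2) {
  const output = [];
  const buffer = []; // numbers waiting for their turn
  let letters = 0;   // letters seen since the last released number

  for (const value of input) {
    if (typeof value === "string") {
      output.push(value); // letters are never delayed
      letters += 1;
      if (letters >= groupSize && buffer.length > 0) {
        output.push(buffer.shift()); // release the oldest waiting number
        letters = 0;
      }
    } else if (letters >= groupSize && buffer.length === 0) {
      output.push(value); // quota already met and nothing waiting
      letters = 0;
    } else {
      buffer.push(value);
    }
  }
  return output.concat(buffer); // end of input: flush what is left
}

const result = prioritize(["A", 1, 2, "B", 3, 4, 5, "C", "D", 6, 7, "E"]);
// result: A, B, 1, C, D, 2, E, 3, 4, 5, 6, 7
```

The result follows the same order as the desired marble diagram in the question; the timing between emissions is not modelled here.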
The key here is the use of timeoutWith to switch to the more aggressive "pacer" when the "event" kicks in. In this case the "event" is "idle detected in the higher-priority stream".
The video: https://youtu.be/0A7C1oJSJDk

How to sessionize stream with Apache Flink?

I want to sessionize this stream: 1,1,1,2,2,2,2,2,3,3,3,3,3,3,3,0,3,3,3,5, ... to these sessions:
1,1,1
2,2,2,2,2
3,3,3,3,3,3,3
0
3,3,3
5
I've written a CustomTrigger to detect when stream elements change from 1 to 2 (2 to 3, 3 to 0 and so on) and then fire the trigger. But this is not the solution, because when I process the first element of the 2's and fire the trigger, the window will be [1,1,1,2], whereas I need to fire the trigger on the last element of the 1's.
Here is the pseudocode of the onElement function in my custom trigger class:
override def onElement(element: Session, timestamp: Long, window: W, ctx: TriggerContext): TriggerResult = {
  if (prevState == element.value) {
    prevState = element.value
    TriggerResult.CONTINUE
  } else {
    prevState = element.value
    TriggerResult.FIRE
  }
}
How can I solve this problem?
I think a FlatMapFunction with a ListState is the easiest way to implement this use-case.
When a new element arrives (i.e., the flatMap() method is called), you check if the value changed. If the value did not change, you append the element to the state. If the value changed, you emit the current list state as a session, clear the list, and insert the new element as the first element of the list state.
However, you should keep in mind that this assumes that the order of elements is preserved. Flink ensures this within a partition, i.e., as long as elements are not shuffled and all operators run with the same parallelism.
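The logic described above can be sketched without Flink. The following plain-JavaScript model is only an illustration: Sessionizer is a hypothetical class whose current array plays the role of the ListState, and flush is an addition for finite inputs (on an unbounded stream the last session would simply stay in state until the value changes).

```javascript
// Hypothetical stand-in for a stateful flatMap: `current` models the list state.
class Sessionizer {
  constructor() {
    this.current = [];
  }

  // Called once per element; returns the sessions to emit (zero or one).
  flatMap(value) {
    const out = [];
    if (this.current.length > 0 && this.current[0] !== value) {
      out.push(this.current); // value changed: emit the finished session
      this.current = [];      // clear the state
    }
    this.current.push(value); // append (or start the new session)
    return out;
  }

  // Emits whatever is left once a finite input is exhausted.
  flush() {
    const out = this.current.length > 0 ? [this.current] : [];
    this.current = [];
    return out;
  }
}

const input = [1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3, 3, 5];
const sessionizer = new Sessionizer();
const sessions = input
  .flatMap((value) => sessionizer.flatMap(value))
  .concat(sessionizer.flush());
// sessions: [1,1,1], [2,2,2,2,2], [3,3,3,3,3,3,3], [0], [3,3,3], [5]
```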

Why I am getting only one item out of this Observable?

I have a cold observable with a static number of items, and I needed some time delay between each item, so I have combined it with another IObservable that I got through Observable.Timer. I am using Zip.
var ob1 = Observable.Range(1, 100);
var ob2 = Observable.Timer(TimeSpan.FromSeconds(1.0));
var myObservable = Observable.Zip(ob1, ob2, (a, b) => b);
myObservable.Subscribe(a => Console.WriteLine("Item encountered"));
///Allow enough time for Timer observable to give back multiple ticks
Thread.Sleep(3000);
But output only prints "Item encountered" once. What am I missing ?
To confirm the commentary, Observable.Interval is the way to go for just a single argument - and thus it has always been!
I found the solution. For my scenario, Observable.Timer takes two arguments: the first is the due time for the first item, and the second is the period between all subsequent items. If only one TimeSpan argument is supplied, it yields only one item.
Observable.Timer(TimeSpan.FromSeconds(1.0), TimeSpan.FromSeconds(1.0));
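The reason a one-shot timer limits the output is that Zip pairs items by position, so the result can never be longer than the shorter input. Here is a rough array-based model in plain JavaScript. It is not Rx and has no timing; zip and range are hypothetical helpers, and the arrays stand in for the items each observable would emit.

```javascript
// Hypothetical helpers modelling observables as finite arrays of items.
const range = (start, count) =>
  Array.from({ length: count }, (_, i) => start + i);

// Pairs items by position and stops when the shorter input runs out.
const zip = (a, b, selector) => {
  const length = Math.min(a.length, b.length);
  const out = [];
  for (let i = 0; i < length; i++) out.push(selector(a[i], b[i]));
  return out;
};

const items = range(1, 100);

// A one-shot timer produces a single tick, so only one pair is formed.
const oneShotTimer = [0];
const withTimer = zip(items, oneShotTimer, (a, _tick) => a);

// A periodic timer keeps ticking; here the first three ticks are modelled.
const periodicTicks = [0, 1, 2];
const withPeriodic = zip(items, periodicTicks, (a, _tick) => a);
```

In this model withTimer has one element and withPeriodic has three, which mirrors "Item encountered" being printed once with the single-argument timer.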