RxJava2, how to emit a list of all emits so far on each emit

I have a stream of objects emitted from an Observable.
Each time an item is emitted, I want to map it to a list of all items emitted so far.
I know I can do this manually by storing a global list, but I was wondering whether there is an operator that does this out of the box.
a -> [a]
b -> [a,b]
c -> [a,b,c]
a -> [a,b,c,a]
d -> [a,b,c,a,d]
f -> [a,b,c,a,d,f]
b -> [a,b,c,a,d,f,b]
g -> [a,b,c,a,d,f,b,g]
When I use toList(), it only sends a single list once the stream completes.
EDIT 1:
here is the way my flow looks right now:
source.itemStream()
.doOnNext( item -> handleItem())
.subscribe()
and what I would prefer is more along the lines of:
source.itemStream()
.someOperator() // this will map to a stream of all items so far.
.doOnNext( item -> handleItem())
.subscribe()
or alternatively
source.itemStream()
.compose(x -> listBuildingFunction(x)) // this will map to a stream of all items so far.
.doOnNext( item -> handleItem())
.subscribe()

Collect the items into a list and emit a copy of that list:
Observable.defer(() -> {
List<Integer> seen = new LinkedList<>();
return Observable.range(1, 10)
.map(v -> {
seen.add(v);
return new ArrayList<>(seen);
});
})
.subscribe(System.out::println);
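Why the copy matters can be shown without RxJava at all. The following is a minimal plain-Java sketch, not part of the original answer; the class and method names (RunningList, snapshots, aliased) are hypothetical and chosen only for illustration. It compares emitting a fresh copy on every item with handing out the same mutable list each time.

```java
import java.util.ArrayList;
import java.util.List;

class RunningList {
    // For each input item, record a snapshot of everything seen so far.
    static <T> List<List<T>> snapshots(List<T> items) {
        List<T> seen = new ArrayList<>();
        List<List<T>> out = new ArrayList<>();
        for (T item : items) {
            seen.add(item);
            out.add(new ArrayList<>(seen)); // defensive copy, as in the answer's map step
        }
        return out;
    }

    // Same loop without the copy: every entry is the same list object,
    // so earlier "emissions" change when later items are added.
    static <T> List<List<T>> aliased(List<T> items) {
        List<T> seen = new ArrayList<>();
        List<List<T>> out = new ArrayList<>();
        for (T item : items) {
            seen.add(item);
            out.add(seen); // no copy: all entries alias one list
        }
        return out;
    }
}
```

With the input a, b, c, snapshots yields [a], [a, b], [a, b, c], while every entry returned by aliased ends up with all three items. That is the reason the answer returns new ArrayList<>(seen) rather than seen itself.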

Using akarnokd's answer above, I did the following (in Kotlin):
private fun toPartialList(): ObservableTransformer<Item, List<Item>> {
return ObservableTransformer { observable ->
Observable.defer {
val tempList = ArrayList<Item>()
observable.map { item ->
tempList.add(item)
return@map ArrayList(tempList)
}
}
}
}
and then used it in my stream:
source.itemStream()
.compose(toPartialList())
.doOnNext { handleItems(it) }
.subscribe()

Related

An unobvious behavior of collect() method

private lazy var dispatchQueue = DispatchQueue(
label: "\(type(of: self))",
attributes: .concurrent
)
func makeArrayGreatAgain(elem0: SomeStruct, elem1: SomeStruct, elem2: SomeStruct, elem3: SomeStruct) {
[elem0, elem1, elem2, elem3]
.publisher
.receive(on: dispatchQueue)
.tryMap { print("R", $0) } // Void
.eraseToAnyPublisher() // just copy-pasted
.collect(4)
.sink { result in print("E", result) }
receiveValue: { result in print("S", result) }
}
from time to time I see different logs.
For example:
R
R
R
S [(), (), ()]
E
R
or
R
R
R
R
S [(), (), (), ()]
E
Why? I want to obtain the whole result:
R
R
R
R
S [(), (), (), ()]
E
How can I achieve this?
Comment: the same problem appears for collect() without arguments.
This has nothing to do with .collect() and everything to do with receive(on:) that receives, in your example, on a concurrent queue.
So the values and, crucially for this question, the completion signal are processed concurrently, which means the order in which they reach the next step is not necessarily the order in which they were emitted.
As a result, the completion signal sometimes arrives at the next step before all of the values have arrived.
If the next step is a collect publisher, when it sees the completion, it emits an array of values that it received up to that point and also completes.
To ensure that you receive the values in order, use a serial queue:
var serialQ = DispatchQueue(
label: "foo"
)
let c = ["1", "2", "3", "4"]
.publisher
.receive(on: serialQ) // or receive(on: DispatchQueue.main)
.map { print("R", $0) }
.collect()
.sink (receiveCompletion: { print("E", $0) },
receiveValue: { print("S", $0) })
You will always get:
R 1
R 2
R 3
R 4
S [(), (), (), ()]
E finished

Why does `Publishers.Map` consume upstream values eagerly?

Suppose I have a custom subscriber that requests one value on subscription and then an additional value three seconds after it receives the previous value:
class MySubscriber: Subscriber {
typealias Input = Int
typealias Failure = Never
private var subscription: Subscription?
func receive(subscription: Subscription) {
print("Subscribed")
self.subscription = subscription
subscription.request(.max(1))
}
func receive(_ input: Int) -> Subscribers.Demand {
print("Value: \(input)")
DispatchQueue.main.asyncAfter(deadline: .now() + .seconds(3)) {
self.subscription?.request(.max(1))
}
return .none
}
func receive(completion: Subscribers.Completion<Never>) {
print("Complete")
subscription = nil
}
}
If I use this to subscribe to an infinite range publisher, back pressure is handled gracefully, with the publisher waiting 3 seconds each time until it receives the next demand to send a value:
(1...).publisher.subscribe(MySubscriber())
// Prints values infinitely with ~3 seconds between each:
//
// Subscribed
// Value: 1
// Value: 2
// Value: 3
// ...
But if I add a map operator then MySubscriber never even receives a subscription; map appears to have synchronously requested Demand.Unlimited upon receiving its subscription and the app infinitely spins as map tries to exhaust the infinite range:
(1...).publisher
.map { value in
print("Map: \(value)")
return value * 2
}
.subscribe(MySubscriber())
// The `map` transform is executed infinitely with no delay:
//
// Map: 1
// Map: 2
// Map: 3
// ...
My question is, why does map behave this way? I would have expected map to just pass its downstream demand to the upstream. Since map is supposed to be for transformation rather than side effects, I don't understand what the use case is for its current behavior.
EDIT
I implemented a version of map to show how I think it ought to work:
extension Publishers {
struct MapLazily<Upstream: Publisher, Output>: Publisher {
typealias Failure = Upstream.Failure
let upstream: Upstream
let transform: (Upstream.Output) -> Output
init(upstream: Upstream, transform: @escaping (Upstream.Output) -> Output) {
self.upstream = upstream
self.transform = transform
}
public func receive<S: Subscriber>(subscriber: S) where S.Input == Output, S.Failure == Upstream.Failure {
let mapSubscriber = Subscribers.LazyMapSubscriber(downstream: subscriber, transform: transform)
upstream.receive(subscriber: mapSubscriber)
}
}
}
extension Subscribers {
class LazyMapSubscriber<Input, DownstreamSubscriber: Subscriber>: Subscriber {
let downstream: DownstreamSubscriber
let transform: (Input) -> DownstreamSubscriber.Input
init(downstream: DownstreamSubscriber, transform: @escaping (Input) -> DownstreamSubscriber.Input) {
self.downstream = downstream
self.transform = transform
}
func receive(subscription: Subscription) {
downstream.receive(subscription: subscription)
}
func receive(_ input: Input) -> Subscribers.Demand {
downstream.receive(transform(input))
}
func receive(completion: Subscribers.Completion<DownstreamSubscriber.Failure>) {
downstream.receive(completion: completion)
}
}
}
extension Publisher {
func mapLazily<Transformed>(transform: @escaping (Output) -> Transformed) -> AnyPublisher<Transformed, Failure> {
Publishers.MapLazily(upstream: self, transform: transform).eraseToAnyPublisher()
}
}
Using this operator, MySubscriber receives the subscription immediately and the mapLazily transform is only executed when there is demand:
(1...).publisher
.mapLazily { value in
print("Map: \(value)")
return value * 2
}
.subscribe(MySubscriber())
// Only transforms the values when they are demanded by the downstream subscriber every 3 seconds:
//
// Subscribed
// Map: 1
// Value: 2
// Map: 2
// Value: 4
// Map: 3
// Value: 6
// Map: 4
// Value: 8
My guess is that the particular overload of map defined for Publishers.Sequence is using some kind of shortcut to enhance performance. This breaks for infinite sequences, but even for finite sequences eagerly exhausting the sequence regardless of the downstream demand messes with my intuition. In my view, the following code:
(1...3).publisher
.map { value in
print("Map: \(value)")
return value * 2
}
.subscribe(MySubscriber())
ought to print:
Subscribed
Map: 1
Value: 2
Map: 2
Value: 4
Map: 3
Value: 6
Complete
but instead prints:
Map: 1
Map: 2
Map: 3
Subscribed
Value: 2
Value: 4
Value: 6
Complete
Here's a simpler test that doesn't involve any custom subscribers:
(1...).publisher
//.map { $0 }
.flatMap(maxPublishers: .max(1)) {
(i:Int) -> AnyPublisher<Int,Never> in
Just<Int>(i)
.delay(for: 3, scheduler: DispatchQueue.main)
.eraseToAnyPublisher()
}
.sink { print($0) }
.store(in: &storage)
It works as expected, but then if you uncomment the .map you get nothing, because the .map operator is accumulating the infinite upstream values without publishing anything.
On the basis of your hypothesis that map is somehow optimizing for a preceding sequence publisher, I tried this workaround:
(1...).publisher.eraseToAnyPublisher()
.map { $0 }
// ...
And sure enough, it fixed the problem! By hiding the sequence publisher from the map operator, we prevent the optimization.

How to block the return until a timer expires using RxJava

I'm not seeing anything ever get returned by the scan. I know it's because the mutableList gets returned right away, but how do I block the return until the time expires?
Basically, all I want to do is fill up the mutable list for as long as the take() permits then return that mutableList to the calling function.
This is what I have tried.
private val timeoutScheduler: Scheduler = Schedulers.computation()
fun scanForAllDevicesStartingWith(devicePrefix: String): List<String> {
Log.d(TAG, "Scanning for devices starting with $devicePrefix")
val mutableList = mutableListOf<String>()
val result = scanForDevices()
.take(3, TimeUnit.SECONDS, timeoutScheduler)
.subscribe { scanResult ->
val name = scanResult.bleDevice.name
Logger.d(TAG, "Potential device named $name found")
if(name != null) {
if(name.startsWith(prefix = devicePrefix)) {
Logger.d(TAG, "Match found $name")
mutableList.plus(name)
}
}
}
return mutableList
}
private fun scanForDevices(): Observable<ScanResult>
= rxBleClient.scanBleDevices(
ScanSettings.Builder()
.setScanMode(ScanSettings.SCAN_MODE_LOW_LATENCY)
.setCallbackType(ScanSettings.CALLBACK_TYPE_ALL_MATCHES)
.build(),
ScanFilter.Builder()
.build())
}
OK, here it is boiled down for the next person who wants to do this kind of thing. RxJava has the Single type, which emits exactly one value (or an error). In my case I needed a list of String values, so I used a Single<List<String>>: it emits just one element, which happens to be a list of Strings. The code looks like this:
fun returnAllDevicesStartingWith(devicePrefix: String): Single<List<String>> {
return scanForDevices()
.take(3, TimeUnit.SECONDS, timeoutScheduler)
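.filter { it.bleDevice.name != null } // RxJava 2 does not allow null items, so drop unnamed devices first
.map { it.bleDevice.name!! }
.filter { it.startsWith(devicePrefix) }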
.toList()
}
The function that calls it (written in Java instead of Kotlin) looks like this:
List<String> devices = bleUtility.returnAllDevicesStartingWith(prefix).blockingGet();
I tested it using a mocked function like this:
//Begin test code
var emittedList: List<String> = listOf("dev1-1", "dev1-2", "dev2-1", "dev2-2", "dev3-1", "dev3-2")
private fun scanForRoomDevices(): Observable<FoundDevice> = Observable
.intervalRange(0, emittedList.size.toLong(), 0, 1, TimeUnit.SECONDS, timeoutScheduler)
.map { index -> FoundDevice(emittedList[index.toInt()], BleDevice(emittedList[index.toInt()])) }
data class FoundDevice(val controllerId: String, val bleDevice: BleDevice)
data class BleDevice(val name: String)
Hope this helps others.

Checking a list and timing out if all entries not found in RxJava/RxKotlin

I have a scenario where I have a function, scanForTargets, that returns an Observable of type FoundNumber. In FoundNumber I just need an ID field I can grab out of it. As each element comes back in the scanResults Observable, I want to check to see if the name field matches one of the names on a target list. If so, then I want to emit that. For example, if I am looking for numbers 1, and 2, and scanForTargets() emits back 1, 2, 3, and 4, then I want scanForValues to emit back only 1 and 2.
The caveat is that I only want to continue doing this until either:
1) A time period elapses (in which case I throw an error)
2) All items on the String list are found before the timeout.
What I have so far looks like this, but I cannot get it to work for me mostly due to the shortcut of stopping once/if all of the targets are found before the timeout.
fun scanForValues(targetList: List<String>): Observable<FoundNumber> {
val scanResult = scanForTargets()
return scanResult.doOnNext {scanResult -> Log.d(TAG, "Found potential target: " + scanResult.name) }
.filter(TargetPredicate(targetList)) //See if it's one of those we want
.timeout(5, TimeUnit.SECONDS) //Wait a max of 5 seconds to find all items
.doOnError { Log.w(TAG, "Failed to scan") }
.map{s->scanResult.name}
}
class TargetPredicate(private val targetList: List<String>) : Predicate<ScanResult> { override fun test(scanResult: ScanResult): Boolean {
if(scanResult == null) {
return false
}
return scanResult.name in targetList
}
}
How can I also add the check to stop if I find all of the items in the list? I can't just add another predicate right?
Thanks.
Update: As requested, here is some data to show what I mean.
Let's say that the scanForTargets() and supporting code looks like this:
var emittedList: List<String?> = listOf(null, "0", "1", "2", "3")
fun scanForTargets(): Observable<FoundNumber> = Observable
.intervalRange(0, emittedList.size.toLong(), 0, 1, TimeUnit.SECONDS)
.map { index -> FoundNumber(emittedList[index.toInt()]) }
data class FoundNumber(val targetId: String?)
Now if scanForValues was called with a list of 1 and 2, then it should emit back an Observable of 1 and then 2.
No, it is not as simple as adding another filter.
A possible solution is to use scan to remove items from a set containing your targets, and complete when the set becomes empty.
Example:
val targets = listOf("a", "b", "c")
fun scanForTarget(): Observable<String> = Observable.just("a", "b")
fun scanForValues(targets: List<String>): Completable {
val initial = targets.toMutableSet()
return scanForTarget()
.timeout(5, TimeUnit.SECONDS)
.scan(initial) { acc, next -> acc.remove(next); acc }
.filter { it.isEmpty() }
.firstOrError() // finish on the first empty set; singleOrError() fails if more empty sets follow
.toCompletable()
}
Note: a Completable is a special type of publisher that can only signal onComplete or onError.
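The idea above (remove each incoming item from the set of targets and finish once the set is empty) can be sketched in plain Java without RxJava. This is only an illustration under stated assumptions: the class and method names (TargetTracker, itemsUntilAllFound) are hypothetical, a List stands in for the stream, and the timeout is left out.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TargetTracker {
    // Returns how many items were consumed before every target had been seen,
    // or -1 if the input ended first (the case where the Rx version signals an error).
    static int itemsUntilAllFound(List<String> items, Set<String> targets) {
        Set<String> remaining = new HashSet<>(targets); // private copy, the caller's set is untouched
        if (remaining.isEmpty()) {
            return 0; // nothing to wait for
        }
        int consumed = 0;
        for (String item : items) {
            consumed++;
            remaining.remove(item); // no effect for non-targets and repeats
            if (remaining.isEmpty()) {
                return consumed; // all targets found: stop early
            }
        }
        return -1;
    }
}
```

For the items x, a, b, a, c, y and the targets a, b, c, the method returns 5: it stops at c and never looks at y. For the items a, b with the same targets it returns -1.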
Update: response to question update.
The new example in your question won't work, because null values are not allowed in RxJava2.
Assuming you fix that, the following solution may help you.
fun scanForValues(targets: List<String>): Observable<FoundNumber> {
val accumulator: Pair<Set<String>, String?> = targets.toSet() to null
return scanForTarget()
.timeout(5, TimeUnit.SECONDS)
.scan(accumulator) { acc, next ->
val (set, _) = acc // the previously emitted item is not needed
val item = if (next in set) next else null
(set - next) to item // return the remaining set and the nullable item
}
.filter { it.second != null } // item not null
.take(targets.size.toLong()) // limit to the number of targets
.map { it.second!! } // unwrap the item from the pair
.map { FoundNumber(it) } // wrap in your class
}
Instead of using only the Set<String> as the accumulator, now we also add the item.
The item is nullable, this allows us to check if a given item was present or not.
Notice that no null values are passed through the observable flow. In this case null values are wrapped inside Pair<Set<String>, String?> which are never null themselves.
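The same bookkeeping can be sketched in plain Java without RxJava: keep the set of targets still missing, pass on an item only the first time it matches, and stop once nothing is missing. The names here (TargetFilter, matchesUntilAllFound) are hypothetical, a List stands in for the stream, and the timeout is not modelled.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TargetFilter {
    // Collects the targets in the order they first appear and stops
    // as soon as every target has been found.
    static List<String> matchesUntilAllFound(List<String> items, Set<String> targets) {
        Set<String> remaining = new HashSet<>(targets);
        List<String> found = new ArrayList<>();
        for (String item : items) {
            if (remaining.isEmpty()) {
                break; // everything found: ignore the rest of the input
            }
            // remove() returns true only if the item was still a missing target,
            // so non-targets and repeated targets are skipped.
            if (remaining.remove(item)) {
                found.add(item);
            }
        }
        return found;
    }
}
```

For the items 0, 1, 1, 2, 3 and the targets 1 and 2, this returns [1, 2]; the repeated 1 is skipped and 3 is never examined.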

Why does map(_:) in Swift Playground return a String and not a tuple?

I am attempting to use map(_:) and enumerated() in a Swift Playground to walk through an array of orders, returning the first perfect match to a customer's goods.
However, when testing in the Playground, the map(_:) function appears to return a string when it should return a tuple.
I'm attempting to retrieve both the index and the value of the matching element.
Right now, my current solution is this:
let orders = [4,2,7]
let goods = 2
var matching:Int = (orders.filter{ $0 == goods }.first) ?? 0 as Int
In this example, the answer is 2; however it doesn't give me the index of the array.
My second attempt in Swift Playground is this:
var r = (orders.filter{ $0 == goods }).enumerated().map { (index, element) -> (Int,Int) in
return (index, element)
}
print (r.first!) // This should report (0,2)
However, Swift Playground prints this in the sidebar panel:
"(0, 2)\n"
Why does the sidebar report that this is a string?
Is there a way to get the index and element correctly in this example?
Get Index of Orders that matches Goods
To get index & element:
var r = orders.enumerated().map { ( index, element) -> (Int, Int) in
return (index, element)
}.filter { (index, element) -> Bool in
if element == goods {
return true
}
return false
}
or more compact:
var r = orders.enumerated().map { index, element in (index, element) }
.filter { _, element in element == goods }
print("r: \(r.first!)")
Prints:
r: (1, 2)
If you really want to find the first match only, a for loop is more efficient, as you can break after the first match is found.
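The early-exit loop mentioned above looks roughly like this. It is a minimal sketch written in Java rather than Swift, and the names (FirstMatch, indexOfFirst) are hypothetical; the point is only that the loop returns at the first match instead of transforming and filtering the whole array.

```java
class FirstMatch {
    // Returns the index of the first order equal to goods, or -1 if there is none.
    static int indexOfFirst(int[] orders, int goods) {
        for (int i = 0; i < orders.length; i++) {
            if (orders[i] == goods) {
                return i; // stop at the first match; later elements are never visited
            }
        }
        return -1;
    }
}
```

For the orders 4, 2, 7 and goods 2, this returns index 1, matching the (1, 2) result above.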
Playground
What you see "(0, 2)\n" is the result of print. What it prints out in console is (0, 2) plus newline.
If you want to see the actual value of r.first! in the sidebar, drop the print call and write the bare expression instead:
r.first!
Others have already covered the answer, this is just a side note. I suggest you break down your statement to make it clearer.
var r = orders.filter{ $0 == goods }
.enumerated()
.map{ index, element in (index, element) }
The sidebar shows what print wrote as a string, which is why the output appears in quotes with the line break \n at the end.
If you write r.first!, you will see that it actually is a tuple.