Combining parts of Stream - system.reactive

I've got an observable watching a log that is continuously being written to. Each line is a new onNext call. Sometimes the log outputs a single log item over multiple lines. Detecting this is easy; I just can't find the right Rx operator.
I'd like to find a way to collect the single log items into a List of lines, and onNext the list when the single log item is complete.
Buffer doesn't seem right as this isn't time based, it's algorithm based.
GroupBy might be what I want, but the documentation is confusing for it. It also seems that the observables it creates probably won't have onComplete called until the completion of the source observable.
This solution can't delay the log much (preferably not at all). I need to be reading the log as close to real time as possible, and order matters.
Any push in the right direction would be great.

This is a typical reactive parsing problem. You could use Rxx Parsers, or for a native solution you can build your own state machine with either Scan or by defining an async iterator. Scan is preferable for simple parsers and often uses a Scan-Where-Select pattern.
Async iterator state machine example: Turnstile
Scan parser example (untested):
IObservable<string> lines = ReadLines();
IObservable<IReadOnlyList<string>> parsed = lines.Scan(
    new
    {
        ParsingItem = (IEnumerable<string>)null,
        Item = (IEnumerable<string>)null
    },
    (state, line) =>
        // I'm assuming here that items never span lines partially.
        IsItem(line)
            ? IsItemLastLine(line)
                ? new
                {
                    ParsingItem = (IEnumerable<string>)null,
                    // Concat expects a sequence, so the single line is wrapped in an array.
                    Item = (state.ParsingItem ?? Enumerable.Empty<string>()).Concat(new[] { line })
                }
                : new
                {
                    ParsingItem = (state.ParsingItem ?? Enumerable.Empty<string>()).Concat(new[] { line }),
                    Item = (IEnumerable<string>)null
                }
            : new
            {
                ParsingItem = (IEnumerable<string>)null,
                // The cast keeps all three anonymous objects the same type.
                Item = (IEnumerable<string>)new[] { line }
            })
    .Where(result => result.Item != null)
    .Select(result => result.Item.ToList().AsReadOnly());
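The same Scan-Where-Select shape can be sketched in plain JavaScript, without any Rx library, to make the state transitions easier to follow. This is only an illustration: makeStep and parseLines are hypothetical helpers, and the two predicates (lines of a multi-line item start with ">" and the last one ends with "<") are an invented log format standing in for IsItem and IsItemLastLine.

```javascript
// Builds the step function a scan would use: (previous state, line) -> next state.
// `item` is non-null only on the step where a log item has just been completed.
function makeStep(isItem, isItemLastLine) {
  return function step(state, line) {
    if (!isItem(line)) {
      // A stand-alone line is a complete item by itself.
      return { parsing: null, item: [line] };
    }
    const collected = (state.parsing || []).concat([line]);
    return isItemLastLine(line)
      ? { parsing: null, item: collected }   // item finished, emit it
      : { parsing: collected, item: null };  // still collecting, emit nothing
  };
}

// Synchronous stand-in for Scan-Where-Select over an array of lines.
function parseLines(lines, step) {
  const items = [];
  let state = { parsing: null, item: null };
  for (const line of lines) {
    state = step(state, line);       // Scan: advance the state machine
    if (state.item !== null) {       // Where: keep only completed items
      items.push(state.item);        // Select: project the state to the item
    }
  }
  return items;
}

// Hypothetical format: ">" starts a multi-line item's lines, "<" ends the last one.
const demoStep = makeStep(
  (line) => line.startsWith(">"),
  (line) => line.endsWith("<")
);
const demoItems = parseLines(["plain", ">a", ">b", ">c<", "other"], demoStep);
console.log(demoItems);
```

Because an item is emitted on the same step that consumes its last line, nothing is delayed beyond the arrival of that line, which matches the near-real-time requirement in the question.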

Related

Is there an Operation to block onComplete?

I am trying to learn reactive programming, so forgive me if I ask a silly question. I'm also open to advice on changing my design.
I am working in scala-swing to display the results of a simulator. With one setting, a chart is displayed as a histogram; with the other setting the chart is displayed as the cumulative sum. (I'm probably using the wrong word; in the first setting you might have bin1=2, bin2=5, bin3=3; in the second setting the first height is 2, the second is 2 + 5, the third is 2 + 5 + 3, etc.). The simulator can be slow, so I originally used a Future to compute it, and then set the data into the chart. I decided to try a reactive approach, so my requirements are: 1. I don't want to recreate the data when I change the display mode, and 2. I want to set the Observable once for the chart and have the chart listen to the same Observable permanently.
I got this to work when I started the chain with a PublishSubject and the Future set the data into the start of the chain. When the display mode changed, I created a new PublishSubject().map(newRenderingLogic).subscribe(theChartsObservable). I am now trying to do what looks like the "right way," but it's not working correctly. I've tried to simplify what I have done:
val textObservable: Subject[String] = PublishSubject()
textObservable.subscribe(text => {
  println(s"Text: ${text}")
})
var textSubscription: Option[Subscription] = None
val start = Observable.from(Future {
  "Base text"
}).cache
var i = 0
val button = new Button() {
  text = "Click"
  reactions += {
    case event => {
      i += 1
      if (textSubscription.isDefined) {
        textSubscription.get.unsubscribe()
      }
      textSubscription = Some(start.map(((j: Int) => { (base: String) => s"${base} ${j}" })(i)).subscribe(textObservable))
    }
  }
}
On start, an Observable is created and logic to print some text is added to it. Then, an Observable with the generated data is created and a cache is added so that the result is replayed if the next subscription comes in after its results are generated. Then, a button is created. Then on button clicks a middle observable is chained with unique logic (it's a function that creates a function to append the value of i into the string, run with the current value of i; I tried to make something that couldn't just be reused) that is supposed to change with each click. Then the first Observable is subscribed to it so that the results of the whole chain end up being printed.
In theory, the cache operation takes care of not regenerating the data, and this works once, but onComplete is called on textObservable and then it can't be used again. It works if I subscribe it like this:
textSubscription = Some(start.map(((j: Int) => { (base: String) => s"${base} ${j}" })(i)).subscribe(text => textObservable.onNext(text)))
because the call to onComplete is intercepted, but this looks wrong and I wanted to know if there was a more typical way to do this, or architect it. It makes me think that I don't understand how this is supposed to be done if there isn't an out-of-the-box operation to do this.
Thank you.
I'm not 100% sure if I got the essence of your question right, but: if you have an Observable that may complete and you want to turn it into an Observable that never completes, you can just concatenate it with Observable.never.
For example:
// will complete after emitting those three elements:
val completes = Observable.from(List(1, 2, 3))
// will emit those three elements, but will never complete:
val wontComplete = completes ++ Observable.never
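Why concatenating with a never-ending observable suppresses completion can be seen in a small hand-rolled sketch. This is plain JavaScript, not the RxScala API: an "observable" here is just a function that accepts an observer, and fromArray, never and concat are hypothetical stand-ins written for this illustration.

```javascript
// A tiny hand-rolled observable: a function that accepts an observer.
// Emits every value, then signals completion.
const fromArray = (values) => (observer) => {
  values.forEach((value) => observer.next(value));
  observer.complete();
};

// Never emits and never completes.
const never = (_observer) => {};

// Subscribes to `first`; when it completes, subscribes to `second`
// instead of passing the completion downstream.
const concat = (first, second) => (observer) => {
  first({
    next: (value) => observer.next(value),
    complete: () => second(observer),
  });
};

const events = [];
const wontComplete = concat(fromArray([1, 2, 3]), never);
wontComplete({
  next: (value) => events.push(`next ${value}`),
  complete: () => events.push("complete"),
});
console.log(events); // the three values arrive, "complete" is never recorded
```

The completion of the first source is consumed by concat as the signal to move on, so the downstream observer only ever sees a completion if the last source produces one.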

How to combine the elements of an arbitrary number of dependent Fluxes?

In the non reactive world the following code snippet is nothing special:
interface Enhancer {
Result enhance(Result result);
}
Result result = Result.empty();
result = fooEnhancer.enhance(result);
result = barEnhancer.enhance(result);
result = bazEnhancer.enhance(result);
There are three different Enhancer implementations taking a Result instance, enhancing it and returning the enhanced result. Let's assume the order of the enhancer calls matters.
Now what if these methods are replaced by reactive variants returning a Flux<Result>? Because the methods depend on the result(s) of the preceding method, we cannot use combineLatest here.
A possible solution could be:
Flux.just(Result.empty())
    .switchMap(initial -> first(initial)
        .switchMap(afterFirst -> second(afterFirst)
            .switchMap(afterSecond -> third(afterSecond))))
    .subscribe(result -> doSomethingWith(result));
Note that the switchMap calls are nested. As we are only interested in the final result, we let switchMap switch to the next flux as soon as new events are emitted in preceding fluxes.
Now let's try to do it with a dynamic number of fluxes. Non reactive (without fluxes), this would again be nothing special:
List<Enhancer> enhancers = <ordered list of different Enhancer impls>;
Result result = Result.empty();
for (Enhancer enhancer : enhancers) {
result = enhancer.enhance(result);
}
But how can I generalize the above reactive example with three fluxes to deal with an arbitrary number of fluxes?
I found a solution using recursion:
@FunctionalInterface
interface FluxProvider {
    Flux<Result> get(Result result);
}
// recursive method creating the final Flux
private Flux<Result> cascadingSwitchMap(Result input, List<FluxProvider> fluxProviders, int idx) {
    if (idx < fluxProviders.size()) {
        return fluxProviders.get(idx).get(input).switchMap(result -> cascadingSwitchMap(result, fluxProviders, idx + 1));
    }
    return Flux.just(input);
}
// code using the recursive method
List<FluxProvider> fluxProviders = new ArrayList<>();
fluxProviders.add(fooEnhancer::enhance);
fluxProviders.add(barEnhancer::enhance);
fluxProviders.add(bazEnhancer::enhance);
cascadingSwitchMap(Result.empty(), fluxProviders, 0)
    .subscribe(result -> doSomethingWith(result));
But maybe there is a more elegant solution using an operator/feature of project-reactor. Does anybody know such a feature? In fact, the requirement doesn't seem to be such an unusual one, is it?
switchMap feels inappropriate here. If you have a List<Enhancer> by the time the Flux pipeline is declared, why not apply a logic close to what you had in imperative style:
List<Enhancer> enhancers = <ordered list of different Enhancer impls>;
Mono<Result> resultMono = Mono.just(Result.empty());
for (Enhancer enhancer : enhancers) {
    resultMono = resultMono.map(enhancer::enhance); // previousValue -> enhancer.enhance(previousValue)
}
return resultMono;
That can even be performed later at subscription time for even more dynamic resolution of the enhancers by wrapping the whole code above in a Mono.defer(() -> {...}) block.
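The loop above is a left fold over the enhancer list: each iteration wraps the pipeline built so far in one more stage. A minimal plain-JavaScript sketch of that shape follows, with ordinary functions standing in for Mono and map, so it shows only the composition and not Reactor's API. The three enhancers and buildPipeline are hypothetical names invented for the example.

```javascript
// Hypothetical enhancers: each takes a result and returns an enhanced copy.
const fooEnhancer = (result) => ({ steps: [...result.steps, "foo"] });
const barEnhancer = (result) => ({ steps: [...result.steps, "bar"] });
const bazEnhancer = (result) => ({ steps: [...result.steps, "baz"] });

// Folds the ordered list into one function. Nothing runs until the returned
// function is called, which loosely mirrors a pipeline that is declared
// first and executed only on subscription.
function buildPipeline(enhancers) {
  return enhancers.reduce(
    (pipeline, enhance) => (input) => enhance(pipeline(input)),
    (input) => input // identity: an empty list leaves the result unchanged
  );
}

const pipeline = buildPipeline([fooEnhancer, barEnhancer, bazEnhancer]);
const enhanced = pipeline({ steps: [] });
console.log(enhanced.steps); // enhancers were applied in list order
```

Building the list of enhancers inside the function that creates the pipeline, rather than ahead of time, would correspond to the deferred variant mentioned above.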

RxJS interleaving merged observables (priority queue?)

UPDATE
I think I've figured out the solution. I explain it in this video. Basically, use timeoutWith, and some tricks with zip (within zip).
https://youtu.be/0A7C1oJSJDk
If I have a single observable like this:
A-1-2--B-3-4-5-C--D--6-7-E
I want to put the "numbers" as lower priority; it should wait until the "letters" is filled up (a group of 2 for example) OR a timeout is reached, and then it can emit. Maybe the following illustration (of the desired result) can help:
A------B-1-----C--D-2----E-3-4-5-6-7
I've been experimenting with some ideas... one of them: first step is to split that stream (groupBy), one containing letters, and the other containing numbers..., then "something in the middle" happen..., and finally those two (sub)streams get merged.
It's that "something in the middle" what I'm trying to figure out.
How to achieve it? Is that even possible with RxJS (ver 5.5.6)? If not, what's the closest one? I mean, what I want to avoid is having the "numbers" flooding the stream, and not giving enough chance for the "letters" to be processed in timely manner.
Probably this video I made of my efforts so far can clarify as well:
Original problem statement: https://www.youtube.com/watch?v=mEmU4JK5Tic
So far: https://www.youtube.com/watch?v=HWDI9wpVxJk&feature=youtu.be
My solution so far (delaying each emission in the "numbers" substream using .delay) is suboptimal, because it keeps clocking at a slow pace (10 seconds) even after the "characters" (sub)stream has ended (not completed, as there is no clear boundary here; it just stops producing values for an indeterminate amount of time). What I really need is for the "numbers" substream to raise its pace (to 2 seconds) once that happens.
Unfortunately I don't know RxJS 5 that well and use xstream myself (authored by one of the contributors to RxJS 5), which is a little simpler in terms of the number of operators.
With this I crafted the following example:
(Note: the operators are pretty much the same as in Rx5; the main difference is with flatten, which is more or less like switch but seems to handle synchronous streams differently.)
const xs = require("xstream").default;
const input$ = xs.of("A",1,2,"B",3,4,5,"C","D",6,7,"E");
const initialState = { $: xs.never(), count: 0, buffer: [] };
const state$ = input$
  .fold((state, value) => {
    const t = typeof value;
    if (t === "string") {
      return {
        ...state,
        $: xs.of(value),
        count: state.count + 1
      };
    }
    if (state.count >= 2) {
      const l = state.buffer.length;
      return {
        ...state,
        $: l > 0 ? xs.of(state.buffer[0]) : xs.of(value),
        count: 0,
        buffer: state.buffer.slice(1).concat(value)
      };
    }
    return {
      ...state,
      $: xs.never(),
      buffer: state.buffer.concat(value),
    };
  }, initialState);
xs
  .merge(
    state$
      .map(s => s.$),
    state$
      .last()
      .map(s => xs.of.apply(xs, s.buffer))
  )
  .flatten()
  .subscribe({
    next: console.log
  });
Which gives me the result you are looking for.
It works by folding the stream on itself, looking at the type of each value and emitting a new stream depending on it. When you need to wait because not enough letters were dispatched, I emit an empty stream (emits no value, no error, no complete) as a "placeholder".
Instead of emitting this empty stream, you could emit something like
xs.empty().endsWith(xs.periodic(timeout)).last().mapTo(value):
// stream that will emit a value only after a specified timeout.
// Because the streams are **not** flattened concurrently you can
// use this as a "pending" stream that may or may not be eventually
// consumed
where value is the last received number, in order to implement timeout-related conditions. However, you would then need to introduce some kind of reflexivity, with either a Subject in Rx or xs.imitate in xstream, because you would need to notify your state that your "pending" stream has been consumed, which makes the communication bidirectional, whereas streams / observables are unidirectional.
The key here is the use of timeoutWith to switch to the more aggressive "pacer" when the "event" kicks in. In this case the "event" is "idle detected in the higher-priority stream".
The video: https://youtu.be/0A7C1oJSJDk

Need help - How to loop through a list and/or a map

Scala is pretty new to me and I run into problems as soon as I leave the Gatling DSL.
In my case I call an API (Mailhog) which responds with a lot of mails in JSON format. I can’t grab all the values.
I need to extract some of them with “jsonPath” and others with “regex”.
That gives me a map and a list which I need to iterate through, saving each value.
.check(jsonPath("$[*]").ofType[Map[String,Any]].findAll.saveAs("id_map"))
.check(regex("href=3D\\\\\"(.*?)\\\\\"").findAll.saveAs("url_list"))
At first I wanted to loop the “checks”, but I didn’t find any way to repeat them without repeating the “get” request too. So it’s a map and a list.
1) I need every value of the map and was able to solve the problem with the following foreach loop.
.foreach("${id_map}", "idx") {
  exec(session => {
    val idMap = session("idx").as[Map[String,Any]]
    val ID = idMap("ID")
    session.set("ID", ID)
  })
  .exec(http("Test")
    .get("/${ID}"))
}
2) I need every 3rd value of the list and make a get-request on them. Before I can do this, I need to replace a part of the string. I tried to replace parts of the string while checking for them. But it won’t work with findAll.
.check(regex("href=3D\\\\\"(.*?)\\\\\"").findAll.transform(raw => raw.replace("""=\r\n""","")).saveAs("url"))
How can I replace a part of every string in my list?
Also, how can I make a get-request on every 3rd element in the list?
I can't get it to work with the same foreach structure as above.
I was able to solve the problem by myself. At first I made a little change to my check(regex ...) part.
.check(regex("href=3D\\\\\"(.*?)\\\\\"").findAll.transform(_.map(raw => raw.replace("""=\r\n""",""))).saveAs("url_list"))
Then I wanted to make a Get-Request only on every third element of my list (because the URLs I extracted appeared three times per Mail).
.exec(session => {
  val url_list =
    session("url_list").as[List[Any]].grouped(3).map(_.head).toList
  session.set("url_list", url_list)
})
At the end I iterate through my final list with a foreach-loop.
foreach("${url_list}", "urls") {
  exec(http("Activate User")
    .get("${urls}")
  )
}
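The two list transformations used above (cleaning every string, then keeping the head of each group of three) are independent of Gatling and can be sketched in plain JavaScript. The sample URLs are invented, and the sketch assumes the soft line break shows up as the literal text =\r\n (backslash sequences, as in a raw JSON body), which is what the triple-quoted Scala string matches.

```javascript
// Removes the literal text "=\r\n" (an equals sign followed by the
// backslash sequences, not actual control characters) from every URL.
const stripSoftBreaks = (urls) =>
  urls.map((url) => url.split("=\\r\\n").join(""));

// Keeps the first element of each consecutive group of n,
// like grouped(n).map(_.head) in Scala.
const everyNth = (list, n) => list.filter((_, index) => index % n === 0);

// Invented sample data: each URL appears three times per mail.
const rawUrls = [
  "http://host/act=\\r\\nivate/1",
  "http://host/act=\\r\\nivate/1",
  "http://host/act=\\r\\nivate/1",
  "http://host/activate/2",
  "http://host/activate/2",
  "http://host/activate/2",
];

const urlsToRequest = everyNth(stripSoftBreaks(rawUrls), 3);
console.log(urlsToRequest);
```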

NodeJS: What is the proper way to handle TCP socket streams? Which delimiter should I use?

From what I understood here, "V8 has a generational garbage collector. Moves objects around randomly. Node can’t get a pointer to raw string data to write to socket." so I shouldn't store data that comes from a TCP stream in a string, especially if that string becomes bigger than Math.pow(2,16) bytes. (I hope I'm right so far.)
What is then the best way to handle all the data that's coming from a TCP socket? So far I've been trying to use _:_:_ as a delimiter because I think it's somehow unique and won't interfere with other things.
A sample of the data that would come would be something_:_:_maybe a large text_:_:_ maybe tons of lines_:_:_more and more data
This is what I tried to do:
net = require('net');
var server = net.createServer(function (socket) {
  socket.on('connect',function() {
    console.log('someone connected');
    buf = new Buffer(Math.pow(2,16)); //new buffer with size 2^16
    socket.on('data',function(data) {
      if (data.toString().search('_:_:_') === -1) { // If there's no separator in the data that just arrived...
        buf.write(data.toString()); // ... write it on the buffer. it's part of another message that will come.
      } else { // if there is a separator in the data that arrived
        parts = data.toString().split('_:_:_'); // the first part is the end of a previous message, the last part is the start of a message to be completed in the future. Parts between separators are independent messages
        if (parts.length == 2) {
          msg = buf.toString('utf-8',0,4) + parts[0];
          console.log('MSG: '+ msg);
          buf = (new Buffer(Math.pow(2,16))).write(parts[1]);
        } else {
          msg = buf.toString() + parts[0];
          for (var i = 1; i <= parts.length -1; i++) {
            if (i !== parts.length-1) {
              msg = parts[i];
              console.log('MSG: '+msg);
            } else {
              buf.write(parts[i]);
            }
          }
        }
      }
    });
  });
});
server.listen(9999);
Whenever I try to console.log('MSG' + msg), it will print out the whole buffer, so it's useless to see if something worked.
How can I handle this data the proper way ? Would the lazy module work, even if this data is not line oriented ? Is there some other module to handle streams that are not line oriented ?
It has indeed been said that there's extra work going on because Node has to take that buffer and then push it into v8/cast it to a string. However, doing a toString() on the buffer isn't any better. There's no good solution to this right now, as far as I know, especially if your end goal is to get a string and fool around with it. It's one of the things Ryan mentioned @ nodeconf as an area where work needs to be done.
As for delimiter, you can choose whatever you want. A lot of binary protocols choose to include a fixed header, such that you can put things in a normal structure, which a lot of times includes a length. In this way, you slice apart a known header and get information about the rest of the data without having to iterate over the entire buffer. With a scheme like that, one can use a tool like:
node-binary - https://github.com/substack/node-binary
node-ctype - https://github.com/rmustacc/node-ctype
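The fixed-header-with-length scheme described above can be sketched with nothing but Node's built-in Buffer. This is a minimal illustration under stated assumptions, not a protocol anyone in this thread specified: the frame layout (a 4-byte big-endian length followed by the payload) and the helper names encodeFrame and createFrameDecoder are invented for the example.

```javascript
// Assumed frame layout: 4-byte big-endian payload length, then the payload.
function encodeFrame(text) {
  const payload = Buffer.from(text, "utf8");
  const header = Buffer.alloc(4);
  header.writeUInt32BE(payload.length, 0);
  return Buffer.concat([header, payload]);
}

// Returns a function to call with every chunk that arrives. Chunks may split
// a frame anywhere, so unconsumed bytes are kept until a frame is complete.
function createFrameDecoder(onMessage) {
  let pending = Buffer.alloc(0);
  return function feed(chunk) {
    pending = Buffer.concat([pending, chunk]);
    while (pending.length >= 4) {
      const length = pending.readUInt32BE(0);
      if (pending.length < 4 + length) {
        break; // payload not fully received yet
      }
      // Only a complete payload is converted to a string.
      onMessage(pending.subarray(4, 4 + length).toString("utf8"));
      pending = pending.subarray(4 + length);
    }
  };
}

// Demo: two frames delivered in three arbitrary chunks.
const demoMessages = [];
const demoFeed = createFrameDecoder((message) => demoMessages.push(message));
const demoBytes = Buffer.concat([encodeFrame("hello"), encodeFrame("more data")]);
demoFeed(demoBytes.subarray(0, 3));
demoFeed(demoBytes.subarray(3, 12));
demoFeed(demoBytes.subarray(12));
console.log(demoMessages);
```

With a real connection the decoder would be wired up as socket.on('data', feed). Because the length is read from the header, the payload never has to be scanned for a separator and may contain any bytes, including text that looks like a delimiter.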
As an aside, buffers can be accessed via array syntax, and they can also be sliced apart with .slice().
Lastly, check here: https://github.com/joyent/node/wiki/modules -- find a module that parses a simple tcp protocol and seems to do it well, and read some code.
You should use the new stream2 api. http://nodejs.org/api/stream.html
Here are some very useful examples: https://github.com/substack/stream-handbook
https://github.com/lvgithub/stick