GroupBy in fs2? - scala

Is there a function like groupBy (from rxjs) in fs2?
I'd like to use it to convert a stream of messages into a stream of streams, where each inner stream carries the messages of one chat, keyed by the message's chatId.
Example in rxjs:
function chats(messages$: Observable<Message>): Observable<GroupedObservable<number, Message>> {
  return messages$.pipe(groupBy((m) => m.chatId));
}

Related

Get the last records of KStream

I'm very new to Kafka Stream API.
I have a KStream like this:
KStream<Long, String> joinStream = builder.stream("output");
The record values in the KStream look like this:
The stream is updated every 1s.
I need to build a REST API whose response is calculated from the profit and spotPrice values.
But I've struggled to get the value of the last record.
I am assuming that by "the last value" you mean the maximum value of the stream, since values are continuously arriving. In that case you can use the reduce transformation to keep the output stream updated with the max value.
final StreamsBuilder builder = new StreamsBuilder();
KStream<Long, String> stream = builder.stream("INPUT_TOPIC", Consumed.with(Serdes.Long(), Serdes.String()));
stream
    .mapValues(value -> Long.valueOf(value))
    .groupByKey()
    .reduce(new Reducer<Long>() {
        @Override
        public Long apply(Long currentMax, Long v) {
            // keep the larger of the current maximum and the newly arrived value
            return (currentMax > v) ? currentMax : v;
        }
    })
    .toStream().to("OUTPUT_TOPIC");
return builder.build();
And in case you want to retrieve it in a REST API, I suggest taking a look at Spring Cloud Stream + Kafka Streams (https://cloud.spring.io/spring-cloud-stream-binder-kafka/spring-cloud-stream-binder-kafka.html), which lets you exchange messages with Spring Web.
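Alternatively, plain Kafka Streams interactive queries can serve the latest value directly. A minimal sketch, assuming the reduce() above is materialized as a queryable store named "max-store" (e.g. via Materialized.as("max-store")) and that streams is the running KafkaStreams instance; neither assumption is in the original snippet:
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

public class MaxValueQuery {
    public static Long currentMax(KafkaStreams streams, Long key) {
        // Look up the local state store backing the reduce() and read the latest max for the key
        ReadOnlyKeyValueStore<Long, Long> store = streams.store(
            StoreQueryParameters.fromNameAndType("max-store", QueryableStoreTypes.keyValueStore()));
        return store.get(key); // serve this value from your REST endpoint
    }
}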

How can I use the "Multi" stream to parse output from JPQL resultList() to send individual items to a topic instead of the entire list object

Trying to create a reactive method (using the @Outgoing annotation) that sends a list of events to a Kafka topic,
e.g.,
@Outgoing("kafkatopic01")
public Multi<List<Thing>> poll() {
    return Multi.createFrom()
        .ticks()
        .every(Duration.ofSeconds(10))
        .onOverflow().drop()
        .map(tick -> (List<Thing>) ds.getData())
        [...]
The "ds.getData()" - in the above example - returns a list of events ("Thing"s) - from a JPQL namedquery - to send to a topic.
QUESTION: How can I code the above...
"return Multi.createFrom()..."
...such that the returned list is not sent as a single object to the "#Outgoing" topic?
In other words, how can I modify the above "Multi" stream such that the list of "Thing" events are sent individually, and not as a single object
kafka
quarkus 1.11.0.CR1
java 11
Looks like this ".onItem().disjoint()" accomplishes what I'm looking for...
return Multi.createFrom()
    .ticks()
    .every(Duration.ofSeconds(10))
    .onOverflow().drop()
    .map(tick -> {
        List<Thing> list = (List<Thing>) polldata.get(time.get("datetime").atZone(ZoneId.of("America/New_York")));
        time.put("datetime", (list.size() == 0 ? time.get("datetime") : Instant.now()));
        return list;
    })
    .onItem()
    .<Thing>disjoint();
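For reference, disjoint() turns a Multi<List<T>> into a Multi<T> by emitting each list element as an individual item, which is what lets the connector publish one record per "Thing". A stripped-down, self-contained sketch (names are made up, not from the code above):
import io.smallrye.mutiny.Multi;
import java.util.List;

public class DisjointSketch {
    public static Multi<String> items() {
        return Multi.createFrom()
            .items(List.of("a", "b"), List.of("c"))  // Multi<List<String>>
            .onItem()
            .<String>disjoint();                     // emits "a", "b", "c" as individual items
    }
}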

How to merge stream using StreamZip in Dart

I have two streams:
Stream<List<Order>> stream1 = pendingStream();
Stream<List<Order>> stream2 = preparingStream();
I'm trying to use StreamZip from the package:async/async.dart package to merge the streams like so...
Stream<List<Order>> getData() {
  Stream<List<Order>> stream1 = pendingStream();
  Stream<List<Order>> stream2 = preparingStream();
  return StreamZip([stream1, stream2]);
}
However, it won't compile, saying:
The element type 'Stream<List<Order>>' can't be assigned to the list type 'Stream<Order>'.
From what I understand, StreamZip should accept the two streams? What am I doing wrong?
You are creating a StreamZip<T>, which emits a List<T> containing one event from each of its merged streams, as described in the documentation.
Each of your merged streams emits a List<Order>, so the merged stream will emit a list of lists.
Basically, you only need to change your return type from Stream<List<Order>> to Stream<List<List<Order>>>.

Kafka Stream producing custom list of messages based on certain conditions

We have the following stream processing requirement.
Source Stream ->
transform(condition check - If (true) then generate MULTIPLE ADDITIONAL messages else just transform the incoming message) ->
output kafka topic
Example:
If the condition is true for message B (D, E, F are the additional messages produced):
A,B,C -> A,D,E,F,C -> Sink Kafka Topic
If the condition is false:
A,B,C -> A,B,C -> Sink Kafka Topic
Is there a way we can achieve this in Kafka streams?
You can use the flatMap() or flatMapValues() methods. These methods take one record and produce zero, one, or more records.
flatMap() can modify the keys, values, and their data types, while flatMapValues() retains the original keys and changes only the value and the value data type.
Here is example pseudocode, considering that the new messages "C", "D", "E" will have new keys.
KStream<byte[], String> inputStream = builder.stream("inputTopic");
KStream<byte[], String> outStream = inputStream.flatMap(
    (key, value) -> {
        List<KeyValue<byte[], String>> result = new LinkedList<>();
        // If the message value is "B". Otherwise, place your own condition based on the data
        if (value.equals("B")) {
            result.add(KeyValue.pair("<new key for message C>".getBytes(), "C"));
            result.add(KeyValue.pair("<new key for message D>".getBytes(), "D"));
            result.add(KeyValue.pair("<new key for message E>".getBytes(), "E"));
        } else {
            result.add(KeyValue.pair(key, value));
        }
        return result;
    });
outStream.to("sinkTopic");
You can read more about this here:
https://docs.confluent.io/current/streams/developer-guide/dsl-api.html#streams-developer-guide-dsl-transformations-stateless
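If the generated messages can keep the original key, the same branching logic with flatMapValues() would look roughly like this (a sketch reusing inputStream from the snippet above, not code from the linked docs; only values are produced, the incoming key is kept for every output record):
// Sketch: flatMapValues() retains the incoming key for each produced record
KStream<byte[], String> valuesOnly = inputStream.flatMapValues(value -> {
    List<String> result = new LinkedList<>();
    if (value.equals("B")) {
        result.add("C");
        result.add("D");
        result.add("E");
    } else {
        result.add(value);
    }
    return result;
});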

How to Stream to a Global Kafka Table

I have a Kafka Streams application that needs to join an incoming stream against a global table, then after some processing, write out the result of an aggregate back to that table:
KeyValueBytesStoreSupplier supplier = Stores.persistentKeyValueStore(storeName);
Materialized<String, String, KeyValueStore<Bytes, byte[]>> m = Materialized.as(supplier);
GlobalKTable<String, String> table = builder.globalTable(
    topic, m.withKeySerde(Serdes.String()).withValueSerde(Serdes.String())
);
stream.leftJoin(
    table,
    ...
).groupByKey().aggregate(
    ...
).toStream().through(
    topic, Produced.with(Serdes.String(), Serdes.String())
);
However, when I try to stream into the KTable changelog, I get the following error: Invalid topology: Topic 'topic' has already been registered by another source.
If I try to aggregate to the store itself, I get the following error: InvalidStateStoreException: Store 'store' is currently closed.
How can I both join against the table and write back to its changelog?
If this isn't possible, a solution that involves filtering incoming logs against the store would also work.
Calling through() is a shortcut for
stream.to("topic");
KStream stream2 = builder.stream("topic");
Because you use builder.stream("topic") already, you get Invalid topology: Topic 'topic' has already been registered by another source. because each topic can only be consumed once. If you want to feed the data of a stream/topic into different part, you need to reuse the original KStream you created for this topic:
KStream stream = builder.stream("topic");
// this won't work
KStream stream2 = stream.through("topic");
// rewrite to
stream.to("topic");
KStream stream2 = stream; // or just omit `stream2` and reuse `stream`
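Applied to the topology in the question, that should mean replacing the final .through(topic, Produced.with(Serdes.String(), Serdes.String())) with .to(topic, Produced.with(Serdes.String(), Serdes.String())): the aggregate is still written to the topic, and the GlobalKTable, which already consumes that topic, picks up the written records without a second source being registered.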
Not sure what you mean by
If I try to aggregate to the store itself