How do I send multiple messages as my source - scala

I'm just trying out this sample stream that currently has a single TextMessage as a source:
// print each incoming strict text message
val printSink: Sink[Message, Future[Done]] =
  Sink.foreach {
    case message: TextMessage.Strict =>
      println(message.text)
    case _ =>
      // ignore other message types
  }
val helloSource: Source[Message, NotUsed] =
  Source.single(TextMessage("hello world!"))
// the Future[Done] is the materialized value of Sink.foreach
// and it is completed when the stream completes
val flow: Flow[Message, Message, Future[Done]] =
  Flow.fromSinkAndSourceMat(printSink, helloSource)(Keep.left)
I want to send 2 messages, so I tried this:
val source1 = Source.single(TextMessage("hello"))
val source2 = Source.single(TextMessage("world"))
val helloSource: Source[Message, NotUsed] =
  Source.combine(source1, source2)
But I get this error:
polymorphic expression cannot be instantiated to expected type;
[error] found : [U](strategy: Int => akka.stream.Graph[akka.stream.UniformFanInShape[akka.http.scaladsl.model.ws.TextMessage.Strict,U],akka.NotUsed]): akka.stream.scaladsl.Source[U,akka.NotUsed]
[error] required: akka.stream.scaladsl.Source[akka.http.scaladsl.model.ws.Message,akka.NotUsed]
[error] Source.combine(source1, source2)
[error] ^
[error] one error found
What exactly should I be doing instead?

Source.combine is a flexible way to combine multiple sources, but it takes a second parameter list that specifies the strategy for combining them, as described in its documentation. The compiler error appears because that strategy argument is missing.
In this case, where you want to have one finite source be followed by another, you can use the Concat strategy.
val helloSource: Source[Message, NotUsed] =
Source.combine(source1, source2)(Concat(_))
As a simpler alternative, you can use the concat method on the first source:
val helloSource: Source[Message, NotUsed] =
source1.concat(source2)
However, for this example, where you have a fixed set of hardcoded elements, it's even simpler to avoid the creation of multiple sources and create only a single source from an Iterable with Source.apply:
val helloSource: Source[Message, NotUsed] =
Source(Seq(TextMessage("hello"), TextMessage("world")))
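To compare the three variants side by side, here is a minimal sketch. It assumes Akka 2.6 or later (where an implicit ActorSystem is enough to run a stream) and uses plain String elements instead of TextMessage so that it does not need akka-http; the object name and the `collect` helper are made up for illustration.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Concat, Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object ConcatSourcesSketch {
  // Runs the three variants and returns what each one emitted.
  def run(): (Seq[String], Seq[String], Seq[String]) = {
    implicit val system: ActorSystem = ActorSystem("concat-sketch")
    try {
      val source1 = Source.single("hello")
      val source2 = Source.single("world")

      // Variant 1: Source.combine with an explicit Concat strategy
      val viaCombine: Source[String, NotUsed] =
        Source.combine(source1, source2)(Concat(_))
      // Variant 2: the concat method on the first source
      val viaConcat: Source[String, NotUsed] = source1.concat(source2)
      // Variant 3: a single source built from a fixed collection
      val viaApply: Source[String, NotUsed] = Source(List("hello", "world"))

      // Hypothetical helper: drain a source into a Seq, blocking for the sketch only
      def collect(s: Source[String, NotUsed]): Seq[String] =
        Await.result(s.runWith(Sink.seq), 5.seconds)

      (collect(viaCombine), collect(viaConcat), collect(viaApply))
    } finally system.terminate()
  }
}
```

All three variants should emit the same elements in the same order, so the choice between them is mostly about readability.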

Related

How can I use a value in an Akka Stream to instantiate a GooglePubSub Flow?

I'm attempting to create a Flow to be used with a Source queue. I would like this to work with the Alpakka Google PubSub connector: https://doc.akka.io/docs/alpakka/current/google-cloud-pub-sub.html
In order to use this connector, I need to create a Flow that depends on the topic name provided as a String, as shown in the above link and in the code snippet.
val publishFlow: Flow[PublishRequest, Seq[String], NotUsed] =
GooglePubSub.publish(topic, config)
The question
I would like to be able to set up a Source queue that receives the topic and message required for publishing a message. I first create the necessary PublishRequest out of the message String. I then want to run this through the Flow that is instantiated by running GooglePubSub.publish(topic, config). However, I don't know how to get the topic to that part of the flow.
val gcFlow: Flow[(String, String), PublishRequest, NotUsed] = Flow[(String, String)]
  .map(messageData => {
    PublishRequest(Seq(
      PubSubMessage(new String(Base64.getEncoder.encode(messageData._1.getBytes))))
    )
  })
  .via(GooglePubSub.publish(topic, config))
val bufferSize = 10
val elementsToProcess = 5
// newSource is a Source[PublishRequest, NotUsed]
val (queue, newSource) = Source
  .queue[(String, String)](bufferSize, OverflowStrategy.backpressure)
  .via(gcFlow)
  .preMaterialize()
I'm not sure if there's a way to get the topic into the queue without it being a part of the initial data stream. And I don't know how to get the stream value into the dynamic Flow.
If I have improperly used some terminology, please keep in mind that I'm new to this.
You can achieve it by using flatMapConcat and generating a new Source within it:
// using tuple assuming (Topic, Message)
val gcFlow: Flow[(String, String), (String, PublishRequest), NotUsed] = Flow[(String, String)]
  .map(messageData => {
    val pr = PublishRequest(immutable.Seq(
      PubSubMessage(new String(Base64.getEncoder.encode(messageData._2.getBytes)))))
    // output flow shape of (String, PublishRequest)
    (messageData._1, pr)
  })
val publishFlow: Flow[(String, PublishRequest), Seq[String], NotUsed] =
  Flow[(String, PublishRequest)].flatMapConcat {
    case (topic: String, pr: PublishRequest) =>
      // Create a Source[PublishRequest]
      Source.single(pr).via(GooglePubSub.publish(topic, config))
  }
// wire it up
val (queue, newSource) = Source
  .queue[(String, String)](bufferSize, OverflowStrategy.backpressure)
  .via(gcFlow)
  .via(publishFlow)
  .preMaterialize()
Optionally, you could replace the tuple with a case class to document it better:
case class Something(topic: String, payload: PublishRequest)
// output flow shape of Something
Something(messageData._1, pr)
Flow[Something].flatMapConcat { s =>
  Source.single(s.payload)... // etc
}
Explanation:
gcFlow emits tuples of (String, PublishRequest), which publishFlow takes as its input. Inside flatMapConcat, each tuple is turned into a new single-element Source[PublishRequest] that is run through GooglePubSub.publish for that element's topic.
Creating a new Source for every item adds a slight overhead, but this shouldn't have a measurable impact on performance.
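The per-element flatMapConcat pattern can be tried without a Pub/Sub account. The sketch below assumes Akka 2.6 or later and swaps GooglePubSub.publish for a made-up `fakePublish` flow that only prefixes the payload with its topic; `encode` is likewise a hypothetical helper showing the Base64 step with an explicit charset.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Sink, Source}
import java.nio.charset.StandardCharsets.UTF_8
import java.util.Base64
import scala.concurrent.Await
import scala.concurrent.duration._

object PerElementFlowSketch {
  // Hypothetical stand-in for GooglePubSub.publish(topic, config):
  // it just echoes the topic together with the payload.
  def fakePublish(topic: String): Flow[String, String, NotUsed] =
    Flow[String].map(payload => s"$topic:$payload")

  // Base64-encode a message, using UTF-8 explicitly instead of the platform default
  def encode(message: String): String =
    Base64.getEncoder.encodeToString(message.getBytes(UTF_8))

  def run(): Seq[String] = {
    implicit val system: ActorSystem = ActorSystem("per-element-flow")
    try {
      val published = Source(List("topic-a" -> "hi", "topic-b" -> "yo"))
        // keep the topic next to the encoded payload
        .map { case (topic, message) => (topic, encode(message)) }
        // build a one-element source per item and run it through a topic-specific flow
        .flatMapConcat { case (topic, payload) =>
          Source.single(payload).via(fakePublish(topic))
        }
        .runWith(Sink.seq)
      Await.result(published, 5.seconds)
    } finally system.terminate()
  }
}
```

Because flatMapConcat finishes each inner source before starting the next, the results come out in the same order as the queued items.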

Akka Stream - How to Stream from multiple SQS Sources

This is a follow-up to Akka Stream - Select Sink based on Element in Flow.
Assume I have multiple SQS queues I'd like to stream from. I'm using the AWS SQS Connector of Alpakka to create the sources.
implicit val sqsClient: AmazonSQSAsync = ???
val queueUrls: List[String] = ???
val sources: List[Source[Message, NotUsed]] = queueUrls.map(url => SqsSource(url))
Now, I'd like to combine the sources to merge them. However, the Source.combine method doesn't accept a list as a parameter; it only supports varargs.
def combine[T, U](first: Source[T, _], second: Source[T, _], rest: Source[T, _]*)(strategy: Int ⇒ Graph[UniformFanInShape[T, U], NotUsed])
Of course, I can type out all the source parameters by hand, but the argument list will get pretty long if I have 10 source queues.
Is there a way to combine sources from a list of sources?
[Supplement]
As Ramon J Romero y Vigil pointed out, it's better practice to keep the stream "a thin veneer". In this particular case, however, I use a single sqsClient for all the SqsSource initialization.
You could use foldLeft to concatenate or merge the sources:
val sources: List[Source[Message, NotUsed]] = ???
val concatenated: Source[Message, NotUsed] = sources.foldLeft(Source.empty[Message])(_ ++ _)
// the same as sources.foldLeft(Source.empty[Message])(_ concat _)
val merged: Source[Message, NotUsed] = sources.foldLeft(Source.empty[Message])(_ merge _)
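Alternatively, you could use Source.zipN with flatMapConcat. Note that zipN waits for one element from every source before emitting, and completes as soon as any of the sources completes, so it interleaves the sources in lockstep rather than merging them freely: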
val combined: Source[Message, NotUsed] = Source.zipN(sources).flatMapConcat(Source.apply)
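The difference between these options shows up when the sources have different lengths. A minimal sketch, assuming Akka 2.6 or later and using Int sources in place of SqsSource (the object name and `collect` helper are made up for illustration):

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object CombineSourceListSketch {
  // Returns the elements emitted by the concat, merge and zipN variants.
  def run(): (Seq[Int], Seq[Int], Seq[Int]) = {
    implicit val system: ActorSystem = ActorSystem("combine-list")
    try {
      val sources: List[Source[Int, NotUsed]] =
        List(Source(List(1, 2, 3)), Source(List(10, 20)), Source(List(100)))

      // One source after the other, in list order
      val concatenated = sources.foldLeft(Source.empty[Int])(_ ++ _)
      // All sources at once; every element arrives, but the order is not fixed
      val merged = sources.foldLeft(Source.empty[Int])(_ merge _)
      // One element from each source per round; ends with the shortest source
      val zipped = Source.zipN(sources).flatMapConcat(batch => Source(batch))

      // Hypothetical helper: drain a source into a Seq, blocking for the sketch only
      def collect(s: Source[Int, NotUsed]): Seq[Int] =
        Await.result(s.runWith(Sink.seq), 5.seconds)

      (collect(concatenated), collect(merged), collect(zipped))
    } finally system.terminate()
  }
}
```

For long-lived queue sources that never complete, concatenation would never get past the first source, so merge is probably the variant that matches the question.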

How to switch between multiple Sources?

Suppose I have two infinite sources of the same type which could be connected to one graph. I want to switch between them from outside the already materialized graph, perhaps in the same way that it is possible to shut one of them down with a KillSwitch.
val source1: Source[ByteString, NotUsed] = ???
val source2: Source[ByteString, NotUsed] = ???
val (switcher: Switcher, source: Source[ByteString, NotUsed]) =
Source.combine(source1,source2).withSwitcher.run()
switcher.switch()
By default I want to use source1 and after switch I want to consume data from source2
source1
       \
        switcher ~> source
       /
source2
Is it possible to implement this logic with Akka Streams?
OK, after some time I found a solution.
I can use the same principle as in VLANs: tag each source, then pass them all through a MergeHub. After that, it's easy to filter the elements by tag and expose the right result as a Source.
All I need to do to switch from one Source to the other is change the filter condition.
source1.map(s => (tag1, s))
\
MergeHub.filter(_._1 == tagX).map(_._2) -> Source
/
source2.map(s => (tag2, s))
Here is an example:
object SomeSource {
  // assumes an implicit Materializer (or ActorSystem) is in scope for run() and runWith()
  // read by the stream and written by callers on other threads, hence @volatile
  @volatile private var current = "tag1"
  val source1: Source[ByteString, NotUsed] = ???
  val source2: Source[ByteString, NotUsed] = ???
  def switch(): Unit = {
    current = if (current == "tag1") "tag2" else "tag1"
  }
  val (sink: Sink[(String, ByteString), NotUsed],
       source: Source[ByteString, NotUsed]) =
    MergeHub.source[(String, ByteString)]
      .filter(_._1 == current) // drop elements from the inactive source
      .map(_._2)               // strip the tag again
      .toMat(BroadcastHub.sink[ByteString])(Keep.both).run()
  source1.map(s => ("tag1", s)).runWith(sink)
  source2.map(s => ("tag2", s)).runWith(sink)
}
SomeSource.source // do something with Source
SomeSource.switch() // then switch

How do I dynamically add Source to existing Graph?

What is the alternative to dynamically changing a running graph? Here is my situation. I have a graph that ingests articles into a DB. The articles come from 3 plugins in different formats, so I have several flows:
val converterFlow1: Flow[ImpArticle, Article, NotUsed]
val converterFlow2: Flow[NewsArticle, Article, NotUsed]
val sinkDB: Sink[Article, Future[Done]]
// These are being created every time I poll plugins
val sourceContentProvider : Source[ImpArticle, NotUsed]
val sourceNews : Source[NewsArticle, NotUsed]
val sourceCit : Source[Article, NotUsed]
val merged = Source.combine(
  sourceContentProvider.via(converterFlow1),
  sourceNews.via(converterFlow2),
  sourceCit)(Merge(_))
val res = merged
  .buffer(10, OverflowStrategy.backpressure)
  .toMat(sinkDB)(Keep.both)
  .run()
The problem is that I get data from the content provider once every 24 hours and from news once every 2 hours, while the last source may deliver at any time because it comes from humans.
I realize that graphs are immutable, but how can I periodically attach new instances of Source to my graph so that I have a single point of throttling for the ingestion process?
UPDATE: You could say my data is a stream of Sources, three in my case. But I cannot change that, because I get the Source instances from external classes (so-called plugins). These plugins work independently of my ingestion class, and I can't combine them into one gigantic class to get a single Source.
Okay, in general the correct way would be to join a stream of sources into a single source, i.e. go from Source[Source[T, _], Whatever] to Source[T, Whatever]. This can be done with flatMapConcat or with flatMapMerge. Therefore, if you can get a Source[Source[Article, NotUsed], NotUsed], you can use one of flatMap* variants and obtain a final Source[Article, NotUsed]. Do it for each of your sources (no pun intended), and then your original approach should work.
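As a rough illustration of the two flatMap* variants, the sketch below flattens a small Source of Sources both ways. It assumes Akka 2.6 or later; the Int elements, the object name and the breadth of 2 are arbitrary choices for the example.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object FlattenSourcesSketch {
  // Returns the elements seen via flatMapConcat and via flatMapMerge.
  def run(): (Seq[Int], Seq[Int]) = {
    implicit val system: ActorSystem = ActorSystem("flatten-sources")
    try {
      // A stream whose elements are themselves sources
      val nested: Source[Source[Int, NotUsed], NotUsed] =
        Source(List(Source(List(1, 2)), Source(List(3, 4)), Source(List(5))))

      // flatMapConcat drains each inner source fully before starting the next
      val inOrder = nested.flatMapConcat(inner => inner).runWith(Sink.seq)
      // flatMapMerge runs up to `breadth` inner sources at once; order is not guaranteed
      val interleaved = nested.flatMapMerge(2, inner => inner).runWith(Sink.seq)

      (Await.result(inOrder, 5.seconds), Await.result(interleaved, 5.seconds))
    } finally system.terminate()
  }
}
```

flatMapConcat keeps the inner sources strictly sequential, which gives a single throttling point at the cost of one slow source delaying the others; flatMapMerge trades ordering for concurrency.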
I've implemented code based on the answer given by Vladimir Matveev and want to share it with others, since it looks like a common use case to me.
I knew about Source.queue, which Viktor Klang mentioned, but I wasn't aware of flatMapConcat. It's pure awesomeness.
import akka.{Done, NotUsed}
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, OverflowStrategy}
import akka.stream.scaladsl.{Flow, Keep, Sink, Source}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration.Duration
import scala.util.{Failure, Success}

implicit val system = ActorSystem("root")
implicit val executor = system.dispatcher
implicit val materializer = ActorMaterializer()
case class ImpArticle(text: String)
case class NewsArticle(text: String)
case class Article(text: String)
val converterFlow1: Flow[ImpArticle, Article, NotUsed] = Flow[ImpArticle].map(a => Article("a:" + a.text))
val converterFlow2: Flow[NewsArticle, Article, NotUsed] = Flow[NewsArticle].map(a => Article("a:" + a.text))
// simulates a slow database write
val sinkDB: Sink[Article, Future[Done]] = Sink.foreach { a =>
  Thread.sleep(1000)
  println(a)
}
// These are being created every time I poll plugins
val sourceContentProvider: Source[ImpArticle, NotUsed] = Source(List(ImpArticle("cp1"), ImpArticle("cp2")))
val sourceNews: Source[NewsArticle, NotUsed] = Source(List(NewsArticle("news1"), NewsArticle("news2")))
val sourceCit: Source[Article, NotUsed] = Source(List(Article("a1"), Article("a2")))
// the queue accepts whole sources; flatMapConcat flattens them into one stream of articles
val (queue, completionFut) = Source
  .queue[Source[Article, NotUsed]](10, OverflowStrategy.backpressure)
  .flatMapConcat(identity)
  .buffer(2, OverflowStrategy.backpressure)
  .toMat(sinkDB)(Keep.both)
  .run()
queue.offer(sourceContentProvider.via(converterFlow1))
queue.offer(sourceNews.via(converterFlow2))
queue.offer(sourceCit)
queue.complete()
completionFut.onComplete {
  case Success(res) =>
    println(res)
    system.terminate()
  case Failure(ex) =>
    ex.printStackTrace()
    system.terminate()
}
Await.result(system.whenTerminated, Duration.Inf)
I'd still check the success of the Future returned by queue.offer, but in my case these calls will be pretty infrequent.
If you cannot model it as a Source[Source[_,_],_], then I'd consider using a Source.queue[Source[T,_]](queueSize, overflowStrategy).
What you'll have to be careful about, though, is what happens if submission fails.
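One way to handle failed submissions is to inspect the QueueOfferResult that each offer completes with. The following is a minimal sketch under the assumption of Akka 2.6 or later; the `describe` helper and the Int payloads are made up for illustration, and a real application would react to the non-Enqueued cases (retry, log, alert) rather than just turning them into strings.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.{OverflowStrategy, QueueOfferResult}
import akka.stream.scaladsl.{Keep, Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object CheckedOfferSketch {
  // Hypothetical helper: turn every possible offer outcome into a readable label
  def describe(result: QueueOfferResult): String = result match {
    case QueueOfferResult.Enqueued       => "enqueued"
    case QueueOfferResult.Dropped        => "dropped"
    case QueueOfferResult.QueueClosed    => "queue closed"
    case QueueOfferResult.Failure(cause) => s"failed: ${cause.getMessage}"
  }

  def run(): (Seq[String], Seq[Int]) = {
    implicit val system: ActorSystem = ActorSystem("checked-offer")
    try {
      val (queue, collected) = Source
        .queue[Source[Int, NotUsed]](4, OverflowStrategy.backpressure)
        .flatMapConcat(identity)
        .toMat(Sink.seq[Int])(Keep.both)
        .run()

      val batches = List(Source(List(1, 2)), Source(List(3, 4)))
      // Offer one batch at a time and wait for its outcome before offering the next
      val outcomes = batches.map(b => describe(Await.result(queue.offer(b), 5.seconds)))
      queue.complete()

      (outcomes, Await.result(collected, 5.seconds))
    } finally system.terminate()
  }
}
```

Waiting for each offer before sending the next keeps the sketch simple; with the backpressure strategy it also avoids having several offers pending at once.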

Akka Stream - Source.fromPublisher

I'm trying to execute the following code based on the akka stream quick start guide:
implicit val system = ActorSystem("QuickStart")
implicit val materializer = ActorMaterializer()
val songs = Source.fromPublisher(SongsService.stream)
val count: Flow[Song, Int, NotUsed] = Flow[Song].map(_ => 1)
val sumSink: Sink[Int, Future[Int]] = Sink.fold[Int, Int](0)(_ + _)
val counterGraph: RunnableGraph[Future[Int]] =
songs
.via(count)
.toMat(sumSink)(Keep.right)
val sum: Future[Int] = counterGraph.run()
sum.foreach(c => println(s"Total songs processed: $c"))
The problem here is that the future never returns a result. The biggest difference from the documentation example is my Source.
I have a Play enumerator, which I'm converting to an Akka Publisher, resulting in this SongsService.stream.
When using a defined list as a Source like:
val songs = Source(list)
It works, but using Source.fromPublisher does not.
The publisher itself does not seem to be the problem, though, because I can do a simple operation and it works:
val songs = Source.fromPublisher(SongsService.stream)
songs.runForeach(println)
It goes through the database, creates the Play enumerator, converts it to a publisher, and I can iterate over the elements.
Any ideas?
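Your publisher is likely never completing. Sink.fold only completes its Future once the upstream completes, whereas runForeach prints each element as it arrives, which is why the second snippet appears to work.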
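The effect can be reproduced without Play or a database. The sketch below assumes Akka 2.6 or later and stands in for SongsService.stream with a source that emits three elements and then stays open (Source.maybe never completes unless its promise is fulfilled). Bounding the stream with take is just one possible way to force completion for a diagnostic run, not necessarily the right fix for the real publisher.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._
import scala.util.Try

object NeverCompletingSketch {
  // Returns (did the unbounded count time out?, count of the bounded stream)
  def run(): (Boolean, Int) = {
    implicit val system: ActorSystem = ActorSystem("never-completing")
    try {
      // Stand-in for a publisher that emits three songs and then never completes
      val songs = Source(List("a", "b", "c")).concat(Source.maybe[String])
      val count = Sink.fold[Int, String](0)((n, _) => n + 1)

      // The fold result only becomes available on completion, so this wait times out
      val unbounded = songs.runWith(count)
      val timedOut = Try(Await.ready(unbounded, 500.millis)).isFailure

      // take(3) completes the stream after three elements, so the fold can finish
      val bounded = Await.result(songs.take(3).runWith(count), 5.seconds)

      (timedOut, bounded)
    } finally system.terminate()
  }
}
```

If the real stream behaves like the unbounded case here, the place to look is whatever should signal the end of the enumerator to the publisher.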