How to emulate Sink in akka streams?

I have a simple "save" function that uses akka-stream-alpakka's multipartUpload; it looks like this:
def save(fileName: String): Future[AWSLocation] = {
  val uuid: String = s"${UUID.randomUUID()}"
  val s3Sink: Sink[ByteString, Future[MultipartUploadResult]] =
    s3Client.multipartUpload(s"$bucketName", s"$uuid/$fileName")
  val file = Paths.get(s"/tmp/$fileName")
  FileIO.fromPath(file).runWith(s3Sink).map { res =>
    AWSLocation(uuid, fileName, res.key)
  }.recover {
    case ex: S3Exception =>
      logger.error("Upload to S3 failed with s3 exception", ex)
      throw ex
    case ex: Throwable =>
      logger.error("Upload to S3 failed with an unknown exception", ex)
      throw ex
  }
}
I want to test this function in two cases:
that multipartUpload succeeds and I get an AWSLocation (my case class) back.
that multipartUpload fails and I get an S3Exception.
So I thought I would spy on multipartUpload and return my own sink, like this:
val mockAmazonS3ProxyService: S3ClientProxy = mock[S3ClientProxy]
val s3serviceMock: S3Service = mock[S3Service]

override val fakeApplication: Application = GuiceApplicationBuilder()
  .overrides(bind[S3ClientProxy].toInstance(mockAmazonS3ProxyService))
  .router(Router.empty).build()

"test" in {
  when(mockAmazonS3ProxyService.multipartUpload(anyString(), anyString())) thenReturn
    Sink(ByteString.empty, Future.successful(MultipartUploadResult(Uri(""), "", "myKey123", "", Some(""))))
  val res = s3serviceMock.save("someFileName").futureValue
  res.key shouldBe "myKey123"
}
The issue is that I get Error:(47, 93) akka.stream.scaladsl.Sink.type does not take parameters. I understand I can't create a Sink like this, but how can I?
Or what could be a better way of testing this?
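For the narrow question of how to construct such a sink: one can take a sink that ignores its input and swap in a canned materialized value, which is the same technique the answer below uses; a minimal sketch:
// A sink that consumes and discards every ByteString, but materializes a
// pre-baked MultipartUploadResult, so save(...) sees a successful upload:
val fakeS3Sink: Sink[ByteString, Future[MultipartUploadResult]] =
  Sink.ignore.mapMaterializedValue(_ =>
    Future.successful(MultipartUploadResult(Uri(""), "", "myKey123", "", Some(""))))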

Consider redesigning your save method so it becomes more testable and makes it possible to inject a specific sink that produces different outcomes for different tests (as mentioned by Bennie Krijger).
def save(fileName: String): Future[AWSLocation] = {
  val uuid: String = s"${UUID.randomUUID()}"
  save(fileName, uuid)(() => s3Client.multipartUpload(s"$bucketName", s"$uuid/$fileName"))
}

// uuid is passed explicitly so this overload can build the AWSLocation
def save(fileName: String, uuid: String)(
    createS3UploadSink: () => Sink[ByteString, Future[MultipartUploadResult]]
): Future[AWSLocation] = {
  val s3Sink: Sink[ByteString, Future[MultipartUploadResult]] = createS3UploadSink()
  val file = Paths.get(s"/tmp/$fileName")
  FileIO
    .fromPath(file)
    .runWith(s3Sink)
    .map(res => AWSLocation(uuid, fileName, res.key))
    .recover {
      case ex: S3Exception =>
        logger.error("Upload to S3 failed with s3 exception", ex)
        throw ex
      case ex: Throwable =>
        logger.error("Upload to S3 failed with an unknown exception", ex)
        throw ex
    }
}
The test can look like this:
class MultipartUploadSpec extends TestKit(ActorSystem("multipartUpload")) with FunSpecLike {
  implicit val mat: Materializer = ActorMaterializer()

  describe("multipartUpload") {
    it("should pass failure") {
      val result = save("someFile", "uuid")(() =>
        Sink.ignore.mapMaterializedValue(_ => Future.failed(new RuntimeException)))
      // assert result
    }
    it("should pass successfully") {
      val result = save("someFile", "uuid")(() =>
        Sink.ignore.mapMaterializedValue(_ => Future.successful(MultipartUploadResult(???))))
      // assert result
    }
  }
}

Related

Why does the stream never get triggered?

I have the following stream that never reaches the map after flatMapConcat.
private def stream[A](ref: ActorRef[ServerHealthStreamer])(
    implicit system: ActorSystem[A]): KillSwitch = {
  implicit val materializer = ActorMaterializer()
  implicit val dispatcher = materializer.executionContext

  system.log.info("=============> Start KafkaDetectorStream <=============")

  val addr = system
    .settings
    .config
    .getConfig("kafka")
    .getString("servers")

  val sink: Sink[ServerHealthEvent, NotUsed] =
    ActorSink.actorRefWithAck[ServerHealthEvent, ServerHealthStreamer, Ack](
      ref = ref,
      onCompleteMessage = Complete,
      onFailureMessage = Fail.apply,
      messageAdapter = Message.apply,
      onInitMessage = Init.apply,
      ackMessage = Ack)

  Source.tick(1.seconds, 5.seconds, NotUsed)
    .flatMapConcat(_ => Source.fromFuture(health(addr)))
    .map {
      case true  => KafkaActiveConfirmed
      case false => KafkaInactiveConfirmed
    }
    .viaMat(KillSwitches.single)(Keep.right)
    .to(sink)
    .run()
}
private def health(server: String)(implicit executor: ExecutionContext): Future[Boolean] = {
  val props = new Properties
  props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, server)
  props.put(AdminClientConfig.CONNECTIONS_MAX_IDLE_MS_CONFIG, "10000")
  props.put(AdminClientConfig.REQUEST_TIMEOUT_MS_CONFIG, "5000")

  Future {
    AdminClient
      .create(props)
      .listTopics()
      .names()
      .get()
  }
    .map(_ => true)
    .recover {
      case _: Throwable => false
    }
}
What I mean is that this part:
.map {
  case true  => KafkaActiveConfirmed
  case false => KafkaInactiveConfirmed
}
never gets executed and I do not know the reason. The method health executes as expected.
Try adding .log between flatMapConcat and map to see each emitted element. log can also report errors and stream cancellation.
https://doc.akka.io/docs/akka/current/stream/operators/Source-or-Flow/log.html
Note that .log uses an implicit logger.
Also, your .flatMapConcat(_ => Source.fromFuture(health(addr))) seems tricky;
try .mapAsyncUnordered(1)(_ => health(addr)) instead.
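Put together, a minimal sketch of the suggested change, reusing the names from the question (the "kafka-health" label is arbitrary):
Source.tick(1.seconds, 5.seconds, NotUsed)
  .mapAsyncUnordered(1)(_ => health(addr)) // replaces flatMapConcat + Source.fromFuture
  .log("kafka-health")                     // logs each element, plus failures and cancellation
  .map {
    case true  => KafkaActiveConfirmed
    case false => KafkaInactiveConfirmed
  }
  .viaMat(KillSwitches.single)(Keep.right)
  .to(sink)
  .run()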

How to use Akka Streams with Akka HTTP to stream the response

I'm new to Akka Streams. I used the following code for CSV parsing:
class CsvParser(config: Config)(implicit system: ActorSystem) extends LazyLogging with NumberValidation {
  import system.dispatcher

  private val importDirectory = Paths.get(config.getString("importer.import-directory")).toFile
  private val linesToSkip = config.getInt("importer.lines-to-skip")
  private val concurrentFiles = config.getInt("importer.concurrent-files")
  private val concurrentWrites = config.getInt("importer.concurrent-writes")
  private val nonIOParallelism = config.getInt("importer.non-io-parallelism")

  def save(r: ValidReading): Future[Unit] = {
    Future()
  }

  def parseLine(filePath: String)(line: String): Future[Reading] = Future {
    val fields = line.split(";")
    val id = fields(0).toInt
    try {
      val value = fields(1).toDouble
      ValidReading(id, value)
    } catch {
      case t: Throwable =>
        logger.error(s"Unable to parse line in $filePath:\n$line: ${t.getMessage}")
        InvalidReading(id)
    }
  }

  val lineDelimiter: Flow[ByteString, ByteString, NotUsed] =
    Framing.delimiter(ByteString("\n"), 128, allowTruncation = true)

  val parseFile: Flow[File, Reading, NotUsed] =
    Flow[File].flatMapConcat { file =>
      val src = FileSource.fromFile(file).getLines()
      val source: Source[String, NotUsed] = Source.fromIterator(() => src)
      // val gzipInputStream = new GZIPInputStream(new FileInputStream(file))
      source
        .mapAsync(parallelism = nonIOParallelism)(parseLine(file.getPath))
    }

  val computeAverage: Flow[Reading, ValidReading, NotUsed] =
    Flow[Reading].grouped(2).mapAsyncUnordered(parallelism = nonIOParallelism) { readings =>
      Future {
        val validReadings = readings.collect { case r: ValidReading => r }
        val average = if (validReadings.nonEmpty) validReadings.map(_.value).sum / validReadings.size else -1
        ValidReading(readings.head.id, average)
      }
    }

  val storeReadings: Sink[ValidReading, Future[Done]] =
    Flow[ValidReading]
      .mapAsyncUnordered(concurrentWrites)(save)
      .toMat(Sink.ignore)(Keep.right)

  val processSingleFile: Flow[File, ValidReading, NotUsed] =
    Flow[File]
      .via(parseFile)
      .via(computeAverage)

  def importFromFiles = {
    implicit val materializer = ActorMaterializer()

    val files = importDirectory.listFiles.toList
    logger.info(s"Starting import of ${files.size} files from ${importDirectory.getPath}")

    val startTime = System.currentTimeMillis()

    val balancer = GraphDSL.create() { implicit builder =>
      import GraphDSL.Implicits._

      val balance = builder.add(Balance[File](concurrentFiles))
      val merge = builder.add(Merge[ValidReading](concurrentFiles))

      (1 to concurrentFiles).foreach { _ =>
        balance ~> processSingleFile ~> merge
      }

      FlowShape(balance.in, merge.out)
    }

    Source(files)
      .via(balancer)
      .withAttributes(ActorAttributes.supervisionStrategy { e =>
        logger.error("Exception thrown during stream processing", e)
        Supervision.Resume
      })
      .runWith(storeReadings)
      .andThen {
        case Success(_) =>
          val elapsedTime = (System.currentTimeMillis() - startTime) / 1000.0
          logger.info(s"Import finished in ${elapsedTime}s")
        case Failure(e) => logger.error("Import failed", e)
      }
  }
}
I want to use Akka HTTP to return all the ValidReading entities parsed from the CSV, but I couldn't figure out how to do that.
The above code fetches a file from the server and parses each line to produce a ValidReading.
How can I upload a CSV via akka-http, parse the file, and stream the resulting response back to the endpoint?
The "essence" of the solution is something like this:
import akka.http.scaladsl.server.Directives._

val route = fileUpload("csv") {
  case (metadata, byteSource) =>
    val source = byteSource.map(x => x)
    complete(HttpResponse(entity = HttpEntity(ContentTypes.`text/csv(UTF-8)`, source)))
}
You detect that the uploaded entity is multipart/form-data with a part named "csv", and you get the byteSource from that part. Do the calculation (insert your logic in place of the .map(x => x) part), convert your data back to ByteString, and complete the request with the new source. This will make your endpoint act like a proxy; a fleshed-out sketch follows below.
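For instance, a hedged sketch that reuses the question's lineDelimiter and parseLine definitions (the parallelism of 4 and the id;value output format are assumptions, not part of the original code):
val route = fileUpload("csv") {
  case (metadata, byteSource) =>
    val parsed: Source[ByteString, Any] =
      byteSource
        .via(lineDelimiter)                            // split the raw bytes into lines
        .map(_.utf8String)
        .mapAsync(4)(parseLine(metadata.fileName))     // reuse the question's parser
        .collect { case r: ValidReading => r }         // drop InvalidReading entries
        .map(r => ByteString(s"${r.id};${r.value}\n")) // serialize back to ByteString
    complete(HttpResponse(entity = HttpEntity(ContentTypes.`text/csv(UTF-8)`, parsed)))
}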

Is there a way to get the item being mapped when an exception is thrown in Akka Streams?

I want to be able to log certain attributes of the item being mapped if an exception is thrown, so I was wondering: is there a way to get the item being mapped when an exception is thrown in Akka Streams?
If I have:
val decider: Supervision.Decider = { e =>
  //val item = getItemThatCausedException
  logger.error("Exception in stream with itemId:" + item.id, e)
  Supervision.Resume
}

implicit val actorSystem = ActorSystem()
val materializerSettings = ActorMaterializerSettings(actorSystem).withSupervisionStrategy(decider)
implicit val materializer = ActorMaterializer(materializerSettings)(actorSystem)

Source(List(item1, item2, item3)).map { item =>
  if (item.property < 0) {
    throw new RuntimeException("Error")
  } else {
    item
  }
}
Is there a way of getting the failed item in the Supervision.Decider, or after the map is done?
Not with a Supervision.Decider, but you can achieve it in a different way.
Check out this program:
object Streams extends App {
  implicit val system = ActorSystem("test")
  implicit val mat = ActorMaterializer()

  val source = Source(List("1", "2", "3")).map { item =>
    Try {
      if (item == "2") {
        throw new RuntimeException("Error")
      } else {
        item
      }
    }
  }

  source
    .alsoTo(
      Flow[Try[String]]
        .filter(_.isFailure)
        .to(Sink.foreach(t => println("failure: " + t))))
    .to(
      Flow[Try[String]]
        .filter(_.isSuccess)
        .to(Sink.foreach(t => println("success " + t)))).run()
}
Outputs:
success Success(1)
failure: Failure(java.lang.RuntimeException: Error)
success Success(3)
This is somewhat convoluted, but you can do it by wrapping your mapping function in a stream and using flatMapConcat, like so:
Source(List(item1, item2, item3)).flatMapConcat { item =>
  Source(List(item))
    .map(mapF)
    .withAttributes(ActorAttributes.supervisionStrategy { e: Throwable =>
      logger.error("Exception in stream with itemId:" + item.id, e)
      Supervision.Resume
    })
}

def mapF(item: Item) =
  if (item.property < 0) {
    throw new RuntimeException("Error")
  } else {
    item
  }
This is possible because each stream stage can have its own supervision strategy.
You can use Supervision.Decider to log those attributes.
object Test extends App {
  implicit val system = ActorSystem("test")
  implicit val mat = ActorMaterializer()

  val testSupervisionDecider: Supervision.Decider = {
    case ex: RuntimeException =>
      println(s"some run time exception ${ex.getMessage}")
      Supervision.Resume
    case ex: Exception =>
      // if you want to stop the stream
      Supervision.Stop
  }

  val source = Source(List("1", "2", "3")).map { item =>
    if (item == "2") {
      throw new RuntimeException(s"$item")
    } else {
      item
    }
  }

  source
    .to(Sink.foreach(println(_)))
    .withAttributes(ActorAttributes.supervisionStrategy(testSupervisionDecider))
    .run
}
The output is:
1
some run time exception 2
3
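As a variation on that last answer, the failed element can be recovered in a typed way by wrapping it in a custom exception rather than encoding it in the message string; a minimal sketch (ItemFailedException is a made-up name, not an Akka API):
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, ActorMaterializerSettings, Supervision}
import akka.stream.scaladsl.{Sink, Source}

object WrappedFailureTest extends App {
  implicit val system = ActorSystem("test")

  // Hypothetical wrapper exception that carries the failed element.
  final case class ItemFailedException(item: String, cause: Throwable)
    extends RuntimeException(s"Failed on item: $item", cause)

  val decider: Supervision.Decider = {
    case ItemFailedException(item, cause) =>
      println(s"Exception in stream for item $item: ${cause.getMessage}")
      Supervision.Resume
    case _ => Supervision.Stop
  }

  implicit val mat = ActorMaterializer(
    ActorMaterializerSettings(system).withSupervisionStrategy(decider))

  Source(List("1", "2", "3"))
    .map { item =>
      try { if (item == "2") throw new RuntimeException("Error") else item }
      catch { case e: Throwable => throw ItemFailedException(item, e) }
    }
    .runWith(Sink.foreach(println))
}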

Upload file across different clusters using akka-http

I am trying to upload a file using akka-http. It works when I upload to the same system or cluster, but how do I upload it to a specific remote server?
(path("/uploadFile") & post) {
extractRequestContext {
ctx => {
implicit val materializer = ctx.materializer
implicit val ec = ctx.executionContext
fileUpload("fileUpload") {
case (fileInfo, fileStream) =>
val sink = FileIO.toPath(Paths.get("/tmp/sample.jar") resolve fileInfo.fileName)
val writeResult = fileStream.runWith(sink)
onSuccess(writeResult) { result =>
result.status match {
case Success(_) => complete(s"Successfully written ${result.count} bytes")
case Failure(e) => throw e
}
}
}
}
}
}
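One way to get the bytes to a different server is to forward the incoming stream as a new multipart request instead of writing it to disk; a minimal sketch, where the target URI is a made-up placeholder:
fileUpload("fileUpload") {
  case (fileInfo, fileStream) =>
    // Wrap the incoming byte stream in a new multipart request and forward it.
    val part = Multipart.FormData.BodyPart(
      "fileUpload",
      HttpEntity.IndefiniteLength(ContentTypes.`application/octet-stream`, fileStream),
      Map("filename" -> fileInfo.fileName))
    val forwarded = Http().singleRequest(
      HttpRequest(
        method = HttpMethods.POST,
        uri = "http://other-host:8080/uploadFile", // hypothetical target server
        entity = Multipart.FormData(Source.single(part)).toEntity()))
    onSuccess(forwarded) { response =>
      complete(s"Forwarded upload, upstream responded ${response.status}")
    }
}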

How to download an HTTP resource to a file with Akka Streams and HTTP?

Over the past few days I have been trying to figure out the best way to download an HTTP resource to a file using Akka Streams and HTTP.
Initially I started with the Future-Based Variant and that looked something like this:
def downloadViaFutures(uri: Uri, file: File): Future[Long] = {
  val request = Get(uri)
  val responseFuture = Http().singleRequest(request)
  responseFuture.flatMap { response =>
    val source = response.entity.dataBytes
    source.runWith(FileIO.toFile(file))
  }
}
That was kind of okay, but once I learnt more about pure Akka Streams I wanted to try to use the Flow-Based Variant to create a stream starting from a Source[HttpRequest]. At first this completely stumped me, until I stumbled upon the flatMapConcat flow transformation. This ended up a little more verbose:
def responseOrFail[T](in: (Try[HttpResponse], T)): (HttpResponse, T) = in match {
  case (responseTry, context) => (responseTry.get, context)
}

def responseToByteSource[T](in: (HttpResponse, T)): Source[ByteString, Any] = in match {
  case (response, _) => response.entity.dataBytes
}

def downloadViaFlow(uri: Uri, file: File): Future[Long] = {
  val request = Get(uri)
  val source = Source.single((request, ()))
  val requestResponseFlow = Http().superPool[Unit]()
  source
    .via(requestResponseFlow)
    .map(responseOrFail)
    .flatMapConcat(responseToByteSource)
    .runWith(FileIO.toFile(file))
}
Then I wanted to get a little tricky and use the Content-Disposition header.
Going back to the Future-Based Variant:
def destinationFile(downloadDir: File, response: HttpResponse): File = {
  val fileName = response.header[ContentDisposition].get.value
  val file = new File(downloadDir, fileName)
  file.createNewFile()
  file
}

def downloadViaFutures2(uri: Uri, downloadDir: File): Future[Long] = {
  val request = Get(uri)
  val responseFuture = Http().singleRequest(request)
  responseFuture.flatMap { response =>
    val file = destinationFile(downloadDir, response)
    val source = response.entity.dataBytes
    source.runWith(FileIO.toFile(file))
  }
}
But now I have no idea how to do this with the Flow-Based Variant. This is as far as I got:
def responseToByteSourceWithDest[T](in: (HttpResponse, T), downloadDir: File): Source[(ByteString, File), Any] = in match {
  case (response, _) =>
    val source = responseToByteSource(in)
    val file = destinationFile(downloadDir, response)
    source.map((_, file))
}

def downloadViaFlow2(uri: Uri, downloadDir: File): Future[Long] = {
  val request = Get(uri)
  val source = Source.single((request, ()))
  val requestResponseFlow = Http().superPool[Unit]()
  val sourceWithDest: Source[(ByteString, File), Unit] = source
    .via(requestResponseFlow)
    .map(responseOrFail)
    .flatMapConcat(responseToByteSourceWithDest(_, downloadDir))
  sourceWithDest.runWith(???)
}
So now I have a Source that will emit one or more (ByteString, File) elements for each File (I say each File since there is no reason the original Source has to be a single HttpRequest).
Is there any way to take these and route them to a dynamic Sink?
I'm thinking of something like flatMapConcat, such as:
def runWithMap[T, Mat2](f: T => Graph[SinkShape[Out], Mat2])(implicit materializer: Materializer): Mat2 = ???
So that I could complete downloadViaFlow2 with:
def destToSink(destination: File): Sink[(ByteString, File), Future[Long]] = {
  val sink = FileIO.toFile(destination, true)
  Flow[(ByteString, File)].map(_._1).toMat(sink)(Keep.right)
}

sourceWithDest.runWithMap {
  case (_, file) => destToSink(file)
}
The solution does not require a flatMapConcat. If you don't need any return values from the file writing then you can use Sink.foreach:
def writeFile(downloadDir: File)(httpResponse: HttpResponse): Future[Long] = {
  val file = destinationFile(downloadDir, httpResponse)
  httpResponse.entity.dataBytes.runWith(FileIO.toFile(file))
}

def downloadViaFlow2(uri: Uri, downloadDir: File): Future[Done] = {
  val request = HttpRequest(uri = uri)
  val source = Source.single((request, ()))
  val requestResponseFlow = Http().superPool[Unit]()
  source.via(requestResponseFlow)
    .map(responseOrFail)
    .map(_._1)
    .runWith(Sink.foreach(writeFile(downloadDir)))
}
Note that the Sink.foreach creates Futures from the writeFile function. Therefore there's not much back-pressure involved. The writeFile could be slowed down by the hard drive but the stream would keep generating Futures. To control this you can use Flow.mapAsyncUnordered (or Flow.mapAsync) :
val parallelism = 10

source.via(requestResponseFlow)
  .map(responseOrFail)
  .map(_._1)
  .mapAsyncUnordered(parallelism)(writeFile(downloadDir))
  .runWith(Sink.ignore)
If you want to accumulate the Long values for a total count you need to combine with a Sink.fold:
source.via(requestResponseFlow)
  .map(responseOrFail)
  .map(_._1)
  .mapAsyncUnordered(parallelism)(writeFile(downloadDir))
  .runWith(Sink.fold(0L)(_ + _))
The fold will keep a running sum and emit the final value when the source of requests has dried up.
Using the Play WS client injected as ws, remembering to import scala.concurrent.duration._:
def downloadFromUrl(url: String)(ws: WSClient): Future[Try[File]] = {
  // createTempFile takes (prefix, suffix, directory)
  val file = File.createTempFile("my-prefix", ".tmp", new File("/tmp"))
  file.deleteOnExit()
  val futureResponse: Future[WSResponse] =
    ws.url(url).withMethod("GET").withRequestTimeout(5.minutes).stream()
  futureResponse.flatMap { res =>
    res.status match {
      case 200 =>
        val outputStream = java.nio.file.Files.newOutputStream(file.toPath)
        val sink = Sink.foreach[ByteString] { bytes => outputStream.write(bytes.toArray) }
        res.bodyAsSource.runWith(sink).andThen {
          case result =>
            outputStream.close()
            result.get
        } map (_ => Success(file))
      case other =>
        Future(Failure[File](new Exception("HTTP Failure, response code " + other + " : " + res.statusText)))
    }
  }
}