How to process Akka stream based on some condition? - scala

Suppose there is a stream of some files to be processed and only a specific file should be processed(consumed) when a condition is met.
i.e. Only if the stream contains a file named "aaa", process a file named "bbb"
SomeFile(name: String)
What would be the correct(recommended) way to do this?

Okay, here's an example. Be careful about building up too big a buffer here before the trigger hits
class FileFinder {
def matchFiles(triggerName: String,
matchName: String): Flow[SomeFile, SomeFile, NotUsed] =
Flow[SomeFile].statefulMapConcat(
statefulMatcher(matches(triggerName), matches(matchName)))
private def matches(matchName: String): SomeFile => Boolean = {
case SomeFile(name) if name == matchName => true
case _ => false
}
private def statefulMatcher(
triggerFilter: => SomeFile => Boolean,
sendFilter: SomeFile => Boolean): () => SomeFile => List[SomeFile] = {
var found = false
var sendFiles: List[SomeFile] = Nil
() => file: SomeFile =>
{
file match {
case f if triggerFilter(f) =>
found = true
val send = sendFiles
sendFiles = Nil
send
case f if sendFilter(f) =>
if (found)
List(f)
else {
sendFiles = f :: sendFiles
Nil
}
case _ => Nil
}
}
}
}
object FileFinder extends FileFinder {
def main(args: Array[String]): Unit = {
implicit val system: ActorSystem = ActorSystem("finder")
implicit val materializer: ActorMaterializer = ActorMaterializer()
implicit val executor: ExecutionContextExecutor =
materializer.executionContext
implicit val loggingAdapter: LoggingAdapter = system.log
val files = List(SomeFile("aaa"), SomeFile("bbb"), SomeFile("aaa"))
Source(files)
.via(matchFiles("bbb", "aaa"))
.runForeach(println(_))
.andThen({
case Success(_) =>
println("Success")
system.terminate()
case Failure(ex) =>
loggingAdapter.error("Shouldn't happen...", ex)
system.terminate()
})
}
}
case class SomeFile(name: String)

Related

how to delete a java file created from akka streams

I am using akka streams and am following tutorial. The code is trying to delete the created file but it's not getting deleted. I was unable to know which resource is left opened. file.delete() is returning false. the file has the permission of read, write and execute
Here is the example code
final case class FileField(fieldName: String, fileNameF: FileInfo ⇒ File)
final case class PartsAndFiles(form: immutable.Map[String, List[String]], files: immutable.Seq[(FileInfo, File)]) {
final def addForm(fieldName: String, content: String): PartsAndFiles = this.copy(
form = {
val existingContent: List[String] = this.form.getOrElse(fieldName, List.empty)
val newContents: List[String] = content :: existingContent
this.form + (fieldName -> newContents)
}
)
final def addFile(info: FileInfo, file: File): PartsAndFiles = this.copy(
files = this.files :+ ((info, file))
)
}
object PartsAndFiles {
val Empty = PartsAndFiles(immutable.Map.empty, immutable.Seq.empty)
}
def formAndFiles(
fileFields: immutable.Seq[FileField]
): Directive1[PartsAndFiles] = entity(as[Multipart.FormData]).flatMap { formData ⇒
extractRequestContext.flatMap { ctx ⇒
implicit val mat = ctx.materializer
implicit val ec = ctx.executionContext
val uploadingSink =
Sink.foldAsync[PartsAndFiles, Multipart.FormData.BodyPart](PartsAndFiles.Empty) {
(acc, part) ⇒
def discard(p: Multipart.FormData.BodyPart): Future[PartsAndFiles] = {
p.entity.discardBytes()
Future.successful(acc)
}
part.filename.map { fileName ⇒
fileFields.find(_.fieldName == part.name)
.map {
case FileField(fieldName, destFn) ⇒
val fileInfo = FileInfo(part.name, fileName, part.entity.contentType)
val dest = destFn(fileInfo)
part.entity.dataBytes.runWith(FileIO.toPath(dest.toPath)).map { _ ⇒
acc.addFile(fileInfo, dest)
}
}.getOrElse(discard(part))
} getOrElse {
part.entity match {
case HttpEntity.Strict(ct, data) if ct.isInstanceOf[ContentType.NonBinary] ⇒
val charsetName = ct.asInstanceOf[ContentType.NonBinary].charset.nioCharset.name
val partContent = data.decodeString(charsetName)
Future.successful(acc.addForm(part.name, partContent))
case _ ⇒
discard(part)
}
}
}
val uploadedF = formData.parts.runWith(uploadingSink)
onSuccess(uploadedF)
}
}
val fileFields = scala.collection.immutable.Seq(FileField("picture", _ => new File("/tmp/uploaded")))
val routes: Route = formAndFiles(fileFields) {
case PartsAndFiles(fields, files) =>
files.foreach(_._2.delete())
val body = s"""
|File: ${files.head._2.getAbsolutePath}
|Form: ${fields}
""".stripMargin
complete(OK, body)
}
here
files.foreach(_._2.delete())
is returning false
files.foreach(_._2.isExists)
is returning true
i want to delete this file how can i do it ?

Using existing methods in macro

Suppose i have some class with some methods
class Clz ... {
def someMethod: Map[String, Long] = ...
def id: Long = 0L
}
i'm need to reuse someMethod and overwrite id
i don't know why but it's throw Stackoverflow and also i'm need to do something with params/methods of Clz without returning the result
what i've tried:
object Macros {
def impl(c: whitebox.Context)(annottees: c.Tree*): c.Tree = {
import c.universe._
annottees match {
case (cls # q"$_ class $tpname[..$_] $_(...$paramss) extends { ..$_ } with ..$_ { $_ => ..$stats }") :: Nil =>
val someMethodM = stats
.filter(case q"$_ def $methodName: $_ = $_" => methodName.toString == "someMethod")
.map {
case q"$_ def $_: $_ = $res" => res
}
q"""
$cls
object ClzCompanion {
val someMethodsRef: Map[String, Int] = Map.apply(..$someMethodM)
..${
paramss
.flatten
.foreach {
case q"$_ val $nname: $tpt = $valuee" =>
// simply do something with valuee and nothing else
// "fire and forget"
}
}
..${
paramss
.flatten
.map {
case q"$_ val $nname: $tpt = $valuee" =>
// here logic, if this nname in someMethodsRef then
// someMethodsRef.find({ case (key, _) => key == nname) match {
// case Some(_) => "def $nname: ($tpt, Int) = ... // new method with body
// case None => do nothing...
}
}
}
"""
}
}
}
how i can overwrite the id method at Clz?
and why it throws StackOverflow??

Convert Traversable[T] to Stream[T] without traversing or stack overflow

I am using a library that provides a Traversable[T] that pages through database results. I'd like to avoid loading the whole thing into memory, so I am trying to convert it to a Stream[T].
From what I can tell, the built in "asStream" method loads the whole Traversable into a Buffer, which defeats my purpose. My attempt (below) hits a StackOverflowException on large results, and I can't tell why. Can someone help me understand what is going on? Thanks!
def asStream[T](traversable: => Traversable[T]): Stream[T] = {
if (traversable.isEmpty) Empty
else {
lazy val head = traversable.head
lazy val tail = asStream(traversable.tail)
head #:: tail
}
}
Here's a complete example that reproduces this, based on a suggestion by #SCouto
import scala.collection.immutable.Stream.Empty
object StreamTest {
def main(args: Array[String]) = {
val bigVector = Vector.fill(90000)(1)
val optionStream = asStream(bigVector).map(v => Some(v))
val zipped = optionStream.zipAll(optionStream.tail, None, None)
}
def asStream[T](traversable: => Traversable[T]): Stream[T] = {
#annotation.tailrec
def loop(processed: => Stream[T], pending: => Traversable[T]): Stream[T] = {
if (pending.isEmpty) processed
else {
lazy val head = pending.head
lazy val tail = pending.tail
loop(processed :+ head, tail)
}
}
loop(Empty, traversable)
}
}
Edit: After some interesting ideas from #SCouto, I learned this could also be done with trampolines to keep the result as a Stream[T] that is in the original order
object StreamTest {
def main(args: Array[String]) = {
val bigVector = Range(1, 90000).toVector
val optionStream = asStream(bigVector).map(v => Some(v))
val zipped = optionStream.zipAll(optionStream.tail, None, None)
zipped.take(10).foreach(println)
}
def asStream[T](traversable: => Traversable[T]): Stream[T] = {
sealed trait Traversal[+R]
case class More[+R](result: R, next: () => Traversal[R]) extends Traversal[R]
case object Done extends Traversal[Nothing]
def next(currentTraversable: Traversable[T]): Traversal[T] = {
if (currentTraversable.isEmpty) Done
else More(currentTraversable.head, () => next(currentTraversable.tail))
}
def trampoline[R](body: => Traversal[R]): Stream[R] = {
def loop(thunk: () => Traversal[R]): Stream[R] = {
thunk.apply match {
case More(result, next) => Stream.cons(result, loop(next))
case Done => Stream.empty
}
}
loop(() => body)
}
trampoline(next(traversable))
}
}
Try this:
def asStream[T](traversable: => Traversable[T]): Stream[T] = {
#annotation.tailrec
def loop(processed: Stream[T], pending: Traversable[T]): Stream[T] = {
if (pending.isEmpty) processed
else {
lazy val head = pending.head
lazy val tail = pending.tail
loop(head #:: processed, tail)
}
}
loop(Empty, traversable)
}
The main point is to ensure that your recursive call is the last action of your recursive function.
To ensure this you can use both a nested method (called loop in the example) and the tailrec annotation which ensures your method is tail-safe.
You can find info about tail rec here and in this awesome answer here
EDIT
The problem was that we were adding the element at the end of the Stream. If you add it as head of the Stream as in your example it will work fine. I updated my code. Please test it and let us know the result.
My tests:
scala> val optionStream = asStream(Vector.fill(90000)(1)).map(v => Some(v))
optionStream: scala.collection.immutable.Stream[Some[Int]] = Stream(Some(1), ?)
scala> val zipped = optionStream.zipAll(optionStream.tail, None, None)
zipped: scala.collection.immutable.Stream[(Option[Int], Option[Int])] = Stream((Some(1),Some(1)), ?)
EDIT2:
According to your comments, and considering the fpinscala example as you said. I think this may help you. The point is creating a case class structure with lazy evaluation. Where the head is a single element, and the tail a traversable
sealed trait myStream[+T] {
def head: Option[T] = this match {
case MyEmpty => None
case MyCons(h, _) => Some(h())
}
def tail: myStream[T] = this match {
case MyEmpty => MyEmpty
case MyCons(_, t) => myStream.cons(t().head, t().tail)
}
}
case object MyEmpty extends myStream[Nothing]
case class MyCons[+T](h: () => T, t: () => Traversable[T]) extends myStream[T]
object myStream {
def cons[T](hd: => T, tl: => Traversable[T]): myStream[T] = {
lazy val head = hd
lazy val tail = tl
MyCons(() => head, () => tail)
}
def empty[T]: myStream[T] = MyEmpty
def apply[T](as: T*): myStream[T] = {
if (as.isEmpty) empty
else cons(as.head, as.tail)
}
}
Some Quick tests:
val bigVector = Vector.fill(90000)(1)
myStream.cons(bigVector.head, bigVector.tail)
res2: myStream[Int] = MyCons(<function0>,<function0>)
Retrieving head:
res2.head
res3: Option[Int] = Some(1)
And the tail:
res2.tail
res4: myStream[Int] = MyCons(<function0>,<function0>)
EDIT3
The trampoline solution by the op:
def asStream[T](traversable: => Traversable[T]): Stream[T] = {
sealed trait Traversal[+R]
case class More[+R](result: R, next: () => Traversal[R]) extends Traversal[R]
case object Done extends Traversal[Nothing]
def next(currentTraversable: Traversable[T]): Traversal[T] = {
if (currentTraversable.isEmpty) Done
else More(currentTraversable.head, () => next(currentTraversable.tail))
}
def trampoline[R](body: => Traversal[R]): Stream[R] = {
def loop(thunk: () => Traversal[R]): Stream[R] = {
thunk.apply match {
case More(result, next) => Stream.cons(result, loop(next))
case Done => Stream.empty
}
}
loop(() => body)
}
trampoline(next(traversable))
}
}
Stream doesn't keep the data in memory because you declare how to generate each item. It's very likely that your database data is not been procedurally generated so what you need is to fetch the data the first time you ask for it (something like def getData(index: Int): Future[Data]).
The biggest problem rise in, since you are fetching data from a database, you are probably using Futures so, even if you are able to achieve it, you would have a Future[Stream[Data]] object which is not that nice to use or, much worst, block it.
Wouldn't be much more worthy just to paginate your database data query?

stopping akka stream Source created from file on reaching file's end

I've created an akka-stream source by reading contents from a file.
implicit val system = ActorSystem("reactive-process")
implicit val materializer = ActorMaterializer()
implicit val ec: ExecutionContextExecutor = system.dispatcher
case class RequestCaseClass(data: String)
def httpPoolFlow(host: String, port: String) = Http().cachedHostConnectionPool[RequestCaseClass](host, port)(httpMat)
def makeFileSource(filePath: String) = FileIO.fromFile(new File(filePath))
.via(Framing.delimiter(ByteString(System.lineSeparator), 10000, allowTruncation = false))
.map(_.utf8String)
def requestFramer = Flow.fromFunction((dataRow: String) =>
(getCmsRequest(RequestCaseClass(dataRow)), RequestCaseClass(dataRow))
).filter(_._1.isSuccess).map(z => (z._1.get, z._2))
def responseHandler = Flow.fromFunction(
(in: (Try[HttpResponse], RequestCaseClass)) => {
in._1 match {
case Success(r) => r.status.intValue() match {
case 200 =>
r.entity.dataBytes.runWith(Sink.ignore)
logger.info(s"Success response for: ${in._2.id}")
true
case _ =>
val b = Await.result(r.entity.dataBytes.runFold(ByteString.empty)(_ ++ _)(materializer)
.map(bb => new String(bb.toArray))(materializer.executionContext), 60.seconds)
logger.warn(s"Non-200 response for: ${in._2.id} r: $b")
false
}
case Failure(t) =>
logger.error(s"Failed completing request for: ${in._2.id} e: ${t.getMessage}")
false
}
}
Now when using it in a runnable graph, I want the entire stream to stop on reaching End Of File being read using makeFileSource(fp).
makeFileSource(fp)
.via(requestFramer)
.via(httpPoolFlow)
.via(responseHandler)
.runWith(Sink.ignore)
The stream currently doesn't stop on reaching end of file.

PersistentActor cannot invoke persist handler in Future onComplete

I am pretty new using PersistentActor ,
when I try to call updateState from a future onComplete, fails , nothing happanes , tried to debug it and I do get to the persist call but not into the updateState
trait Event
case class Cmd(data: String)
case class Evt(data: String) extends Event
class BarActor extends PersistentActor{
implicit val system = context.system
implicit val executionContext = system.dispatcher
def updateState(event: Evt): Unit ={
println("Updating state")
state = state.updated(event)
sender() ! state
}
def timeout(implicit ec: ExecutionContext) =
akka.pattern.after(duration = 2 seconds, using = system.scheduler)(Future.failed(new TimeoutException("Got timed out!")))
val receiveCommand: Receive = {
case Cmd(data) =>
def anotherFuture(i: Int)(implicit system: ActorSystem) = {
val realF = Future {
i % 2 match {
case 0 =>
Thread.sleep(100)
case _ =>
Thread.sleep(500)
}
i
}
Future.firstCompletedOf(Seq(realF, timeout))
.recover {
case _ => -1
}
}
val res = (1 to 10).map(anotherFuture(_))
val list = Future.sequence(res)
list.onComplete{
case _ =>
persist(Evt("testing"))(updateState)
}
}
}
You could try this:
list.onComplete {
case _ => self ! Evt("testing")
}
And add this to receiveCommand
case evt: Evt =>
persist(Evt("testing"))(updateStates)