multipart form data in Lagom - scala

I want to have a service that receives an item object containing a name, description, price, and picture.
The other attributes are strings, which can easily be sent as a JSON object, but what is the best way to include the picture?
If multipart form data is the best solution, how is it handled in Lagom?

You may want to check the file upload example in the lagom-recipes repository on GitHub.
Basically, the idea is to create an additional Play router. After that, we have to tell Lagom to use it, as noted in the reference documentation (this feature has been available since 1.5.0). Here is what the router might look like:
class FileUploadRouter(action: DefaultActionBuilder,
                       parser: PlayBodyParsers,
                       implicit val exCtx: ExecutionContext) {
  private def fileHandler: FilePartHandler[File] = {
    case FileInfo(partName, filename, contentType, _) =>
      val tempFile = {
        val f = new java.io.File("./target/file-upload-data/uploads", UUID.randomUUID().toString).getAbsoluteFile
        f.getParentFile.mkdirs()
        f
      }
      val sink: Sink[ByteString, Future[IOResult]] = FileIO.toPath(tempFile.toPath)
      val acc: Accumulator[ByteString, IOResult] = Accumulator(sink)
      acc.map {
        case akka.stream.IOResult(_, _) =>
          FilePart(partName, filename, contentType, tempFile)
      }
  }
  val router = Router.from {
    case POST(p"/api/files") =>
      action(parser.multipartFormData(fileHandler)) { request =>
        val files = request.body.files.map(_.ref.getAbsolutePath)
        Results.Ok(files.mkString("Uploaded[", ", ", "]"))
      }
  }
}
And then, we simply tell Lagom to use it:
override lazy val lagomServer =
  serverFor[FileUploadService](wire[FileUploadServiceImpl])
    .additionalRouter(wire[FileUploadRouter].router)
Alternatively, we can make use of the PlayServiceCall class. Here is a simple sketch of how to do that, provided by James Roper from the Lightbend team:
// The type of the service call is NotUsed because we are handling it out of band
def myServiceCall: ServiceCall[NotUsed, Result] = PlayServiceCall { wrapCall =>
  // Create a Play action to handle the request
  EssentialAction { requestHeader =>
    // Now we create the sink for where we want to stream the request to - eg it could
    // go to a file, a database, some other service. The way Play gives you a request
    // body is that you need to return a sink from EssentialAction, and when it gets
    // that sink, it streams the request body into that sink.
    val sink: Sink[ByteString, Future[Done]] = ...
    // Play wraps sinks in an abstraction called accumulator, which makes it easy to
    // work with the result of handling the sink. An accumulator is like a future,
    // but rather than just being a value that will be available in future, it is a
    // value that will be available once you have passed a stream of data into it.
    // We wrap the sink in an accumulator here.
    val accumulator: Accumulator[ByteString, Done] = Accumulator.forSink(sink)
    // Now we have an accumulator, but we need the accumulator to, when it's done,
    // produce an HTTP response. Right now, it's just producing akka.Done (or whatever
    // your sink materialized to). So we flatMap it, to handle the result.
    accumulator.flatMap { done =>
      // At this point we create the ServiceCall, the reason we do that here is it means
      // we can access the result of the accumulator (in this example, it's just Done so
      // not very interesting, but it could be something else).
      val wrappedAction = wrapCall(ServiceCall { notUsed =>
        // Here is where we can do any of the actual business logic, and generate the
        // result that can be returned to Lagom to be serialized like normal
        ...
      })
      // Now we invoke the wrapped action, and run it with no body (since we've already
      // handled the request body with our sink/accumulator).
      // Note: the value in scope here is requestHeader, not request.
      wrappedAction(requestHeader).run()
    }
  }
}
Generally speaking, it probably isn't a good idea to use Lagom for that purpose. As noted on the GitHub issue on PlayServiceCall documentation:
Many use cases where we fallback to PlayServiceCall are related to presentation or HTTP-specific use (I18N, file upload, ...) which indicate: coupling of the lagom service to the presentation layer or coupling of the lagom service to the transport.
Quoting James Roper again (a few years back):
So currently, multipart/form-data is not supported in Lagom, at least not out of the box. You can drop down to a lower level Play API to handle it, but perhaps it would be better to handle it in a web gateway, where any files handled are uploaded directly to a storage service such as S3, and then a Lagom service might store the meta data associated with it.
You can also check the discussion here, which provides some more insight.
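The gateway approach from the quote above can be sketched in plain Scala. This is only an illustration under assumptions: `PictureRef`, `ItemMetadata`, `InMemoryStorage`, and `storePicture` are hypothetical names, and the in-memory map merely stands in for a real storage service such as S3. The point is that the Lagom service would only ever see the metadata, which serializes as ordinary JSON.

```scala
import java.util.UUID

// Hypothetical reference to a picture that lives in external storage.
final case class PictureRef(key: String)

// Hypothetical metadata record: this is what the Lagom service would receive
// and store. It holds no binary data, only a reference to the picture.
final case class ItemMetadata(name: String, description: String, price: BigDecimal, picture: PictureRef)

// In-memory stand-in for the storage service (an assumption for this sketch).
final class InMemoryStorage {
  private var blobs = Map.empty[String, Array[Byte]]

  // Stores the bytes under a fresh key and returns a reference to them.
  def storePicture(bytes: Array[Byte]): PictureRef = {
    val key = UUID.randomUUID().toString
    blobs = blobs.updated(key, bytes)
    PictureRef(key)
  }

  // Size in bytes of a stored picture, if the reference is known.
  def size(ref: PictureRef): Option[Int] = blobs.get(ref.key).map(_.length)
}

val storage = new InMemoryStorage
// Step 1 (web gateway): upload the picture and get a reference back.
val ref = storage.storePicture(Array[Byte](1, 2, 3))
// Step 2 (Lagom service): receive only the metadata, including the reference.
val item = ItemMetadata("Lamp", "A desk lamp", BigDecimal("19.99"), ref)
```

With this split, the multipart handling stays in the gateway and the service contract remains a plain JSON message.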

Related

scala ZIO foreachPar

I'm new to parallel programming and ZIO. I'm trying to get data from an API using parallel requests.
import sttp.client._
import zio.{Task, ZIO}
ZIO.foreach(files) { file =>
getData(file)
Task(file.getName)
}
def getData(file: File) = {
val data: String = readData(file)
val request = basicRequest.body(data).post(uri"$url")
.headers(content -> "text", char -> "utf-8")
.response(asString)
implicit val backend: SttpBackend[Identity, Nothing, NothingT] = HttpURLConnectionBackend()
request.send().body
resquest.Response match {
case Success(value) => {
val src = new PrintWriter(new File(filename))
src.write(value.toString)
src.close()
}
case Failure(exception) => log error
}
When I execute the program sequentially, it works as expected.
If I try to run it in parallel by changing ZIO.foreach to ZIO.foreachPar,
the program terminates prematurely. I gather that I'm missing something basic here;
any help figuring out the issue is appreciated.
Generally speaking, I wouldn't recommend mixing synchronous blocking code, as you have here, with the asynchronous non-blocking code that ZIO is primarily designed for. There are some great talks out there on how to effectively use ZIO with the "world", so to speak.
There are two key points I would make. First, ZIO lets you manage resources effectively by attaching allocation and finalization steps. Second, "effects", which we could describe as "things which actually interact with the world", should be wrapped in the tightest scope possible*.
So let's go through this example a bit. First of all, I would not suggest using the default Identity-based backend with ZIO; I would recommend using the AsyncHttpClientZioBackend instead.
import java.io.{File, IOException, PrintWriter}
import sttp.client._
import sttp.client.asynchttpclient.zio.AsyncHttpClientZioBackend
import zio.{Managed, Task, UIO, ZIO, ZManaged}
import zio.blocking.effectBlocking
// Extract the common elements of the request
val baseRequest = basicRequest.post(uri"$url")
  .headers(content -> "text", char -> "utf-8")
  .response(asString)
// Produces a writer which is wrapped in a `Managed` allowing it to be properly
// closed after being used
def managedWriter(filename: String): Managed[IOException, PrintWriter] =
  ZManaged.fromAutoCloseable(UIO(new PrintWriter(new File(filename))))
// This returns an effect which produces an `SttpBackend`, thus we flatMap over it
// to extract the backend.
val program = AsyncHttpClientZioBackend().flatMap { implicit backend =>
  ZIO.foreachPar(files) { file =>
    for {
      // Wrap the synchronous reading of data in a `Task` that runs on a "blocking"
      // thread pool instead of blocking the main one.
      data <- effectBlocking(readData(file))
      // `send` will return a `Task` because it is using the implicit backend in scope
      resp <- baseRequest.body(data).send()
      // Build the managed writer, then "use" it to produce an effect; at the end of
      // `use` it will automatically close the writer.
      _ <- managedWriter("").use(w => Task(w.write(resp.body.toString)))
    } yield ()
  }
}
At this point you will just have the program, which you will need to run using one of the unsafe methods or, if you are using a zio.App, through the main method.
* Not always possible or convenient, but it is useful because it prevents resource hogging by yielding tasks back to the runtime for scheduling.
When you use a purely functional IO library like ZIO, you must not call any side-effecting functions (like getData) except when calling factory methods like Task.effect or Task.apply.
ZIO.foreach(files) { file =>
  Task {
    getData(file)
    file.getName
  }
}
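To see why the placement of the side effect matters, here is a toy model in plain Scala. It is not ZIO and not how ZIO is implemented: `Deferred` is a hypothetical stand-in for `Task` that simply captures a block and runs it on demand, and this `getData` is a hypothetical function that only records a log entry. The sketch shows that a call made outside the wrapper runs while the description is being built, whereas a call made inside runs only when the description is executed.

```scala
import scala.collection.mutable.ListBuffer

// Toy stand-in for a Task: a description of a computation that runs only
// when `unsafeRun()` is called. This is NOT ZIO's implementation.
final class Deferred[A](thunk: () => A) {
  def unsafeRun(): A = thunk()
}
object Deferred {
  // By-name parameter: the body is captured, not evaluated.
  def apply[A](body: => A): Deferred[A] = new Deferred(() => body)
}

val log = ListBuffer.empty[String]

// Hypothetical side-effecting function, playing the role of getData.
def getData(name: String): Unit = { log += s"fetched $name"; () }

// Side effect OUTSIDE the wrapper: runs while the description is being built.
def outside(name: String): Deferred[String] = {
  getData(name)
  Deferred(name)
}

// Side effect INSIDE the wrapper: runs only when the description is executed.
def inside(name: String): Deferred[String] = Deferred {
  getData(name)
  name
}

val a = outside("a.txt") // the log already contains an entry for a.txt
val b = inside("b.txt")  // nothing has been logged for b.txt yet
val logBeforeRun = log.toList
b.unsafeRun()            // now the side effect for b.txt happens
val logAfterRun = log.toList
```

Only the second form lets the runtime decide when and where the work happens, which is what a parallel operator needs.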

Akka Streams, break tuple item apart?

Using the superPool from akka-http, I have a stream that passes down a tuple. I would like to pipeline it to the Alpakka Google Pub/Sub connector. At the end of the HTTP processing, I encode everything for the pub/sub connector and end up with
(PublishRequest, Long) // long is a timestamp
but the interface of the connector is
Flow[PublishRequest, Seq[String], NotUsed]
One first approach is to kill one part:
.map{ case(publishRequest, timestamp) => publishRequest }
.via(publishFlow)
Is there an elegant way to create this pipeline while keeping the Long information?
EDIT: added my not-so-elegant solution in the answers. More answers welcome.
I don't see anything inelegant about your solution using GraphDSL.create(), which I think has the advantage of visualizing the stream structure via the diagrammatic ~> clauses. I do see a problem in your code, though. For example, I don't think publisher should be defined by add-ing a flow to the builder.
Below is a skeletal version (briefly tested) of what I believe publishAndRecombine should look like:
val publishFlow: Flow[PublishRequest, Seq[String], NotUsed] = ???
val publishAndRecombine = Flow.fromGraph(GraphDSL.create() { implicit b =>
  import GraphDSL.Implicits._
  val bcast = b.add(Broadcast[(PublishRequest, Long)](2))
  val zipper = b.add(Zip[Seq[String], Long])
  val publisher = Flow[(PublishRequest, Long)].
    map { case (pr, _) => pr }.
    via(publishFlow)
  val timestamp = Flow[(PublishRequest, Long)].
    map { case (_, ts) => ts }
  bcast.out(0) ~> publisher ~> zipper.in0
  bcast.out(1) ~> timestamp ~> zipper.in1
  FlowShape(bcast.in, zipper.out)
})
There is now a much nicer solution for this which will be released in Akka 2.6.19 (see https://github.com/akka/akka/pull/31123).
In order to use the aforementioned unsafeViaData, you would first have to represent (PublishRequest, Long) using FlowWithContext/SourceWithContext. FlowWithContext/SourceWithContext is an abstraction that was specifically designed to solve this problem (see https://doc.akka.io/docs/akka/current/stream/stream-context.html). The problem is that you have a stream with a data part, which is typically what you want to operate on (in your case the PublishRequest), and a context (aka metadata) part, which you typically just pass along unmodified (in your case the Long).
So in the end you would have something like this:
val myFlow: FlowWithContext[PublishRequest, Long, PublishRequest, Long, NotUsed] =
  FlowWithContext.fromTuples(originalFlowAsTuple) // Original flow that has `(PublishRequest, Long)` as an output
myFlow.unsafeViaData(publishFlow)
In contrast to the GraphDSL-based answers to this question, this solution not only involves much less boilerplate, since it is part of Akka, but it also retains the materialized value rather than losing it and always ending up with NotUsed.
For people wondering why the method unsafeViaData has unsafe in its name: it's because the Flow that you pass into this method must not add, drop, or reorder any of the elements in the stream (doing so would mean that the context no longer properly corresponds to the data part of the stream). Ideally we would use Scala's type system to catch such errors at compile time, but doing so would require a lot of changes to akka-stream, especially if the changes need to remain backwards compatible (which, when dealing with Akka, they do). More details are in the PR mentioned earlier.
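The reason for the "must not add, drop, or reorder" constraint can be modelled with ordinary lists. This is a simplified illustration, not how Akka Streams works internally: strings stand in for PublishRequest so that the sketch needs no dependencies, and splitting and re-zipping the list mimics data and context travelling separately.

```scala
// Elements paired with their context, mirroring (PublishRequest, Long).
// Strings stand in for PublishRequest to keep the sketch dependency-free.
val elements: List[(String, Long)] = List(("a", 1L), ("b", 2L), ("c", 3L))

// Separate the data part from the context part.
val (data, contexts) = elements.unzip

// A one-in, one-out transformation keeps data and context aligned.
val aligned: List[(String, Long)] = data.map(_.toUpperCase).zip(contexts)

// A transformation that drops an element shifts every later context:
// "C" originally travelled with 3L but is now paired with 2L.
val misaligned: List[(String, Long)] =
  data.filter(_ != "b").map(_.toUpperCase).zip(contexts)
```

Nothing in the types distinguishes the two transformations, which is why the check cannot easily be pushed to compile time.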
My not-so-elegant solution uses a custom flow that recombines things:
val publishAndRecombine = Flow.fromGraph(GraphDSL.create() { implicit b =>
  import GraphDSL.Implicits._ // needed for ~> and for map on an outlet
  val bc = b.add(Broadcast[(PublishRequest, Long)](2))
  val publisher = b.add(Flow[(PublishRequest, Long)]
    .map { case (pr, _) => pr }
    .via(publishFlow))
  val zipper = b.add(Zip[Seq[String], Long])
  bc.out(0) ~> publisher ~> zipper.in0
  bc.out(1).map { case (pr, long) => long } ~> zipper.in1
  FlowShape(bc.in, zipper.out)
})

How to create an Akka flow with backpressure and Control

I need to create a function with the following Interface:
import akka.kafka.scaladsl.Consumer.Control
object ItemConversionFlow {
def build(config: StreamConfig): Flow[Item, OtherItem, Control] = {
// Implementation goes here
}
My problem is that I don't know how to define the flow in a way that it fits the interface above.
When I am doing something like this
val flow = Flow[Item]
  .map(item => doConversion(item))
  .filter(_.isDefined)
  .map(_.get)
the resulting type is Flow[Item, OtherItem, NotUsed]. I haven't found anything in the Akka documentation so far. Also, the functions on akka.stream.scaladsl.Flow only offer NotUsed instead of Control. It would be great if someone could point me in the right direction.
Some background: I need to set up several pipelines which differ only in the conversion part. These pipelines are sub-streams of a main stream, which might be stopped for some reason (a corresponding message arrives in some Kafka topic). Therefore I need the Control part. The idea would be to create a Graph template where I just insert the mentioned flow as an argument (a factory returning it). For a specific case we have a solution which works. To generalize it, I need this kind of flow.
You actually have backpressure already. However, think about what you really need from backpressure: you are not using asynchronous stages to increase your throughput, for example. Backpressure keeps fast producers from overwhelming subscribers (https://doc.akka.io/docs/akka/2.5/stream/stream-rate.html). In your sample, don't worry about it: your stream will ask the publisher for new elements depending on how long doConversion takes to complete.
If you want to obtain the result of the stream, use toMat or viaMat. For example, if your stream emits Item and transforms these into OtherItem:
val str = Source.fromIterator(() => List(Item(Some(1))).toIterator)
  .map(item => doConversion(item))
  .filter(_.isDefined)
  .map(_.get)
  .toMat(Sink.fold(List[OtherItem]())((a, b) => {
    // Examine the result of your stream
    b :: a
  }))(Keep.right)
  .run()
str will be a Future[List[OtherItem]]. Try to extrapolate this to your case.
Or use viaMat with KillSwitches. To quote the Akka API documentation: "Creates a new [[Graph]] of [[FlowShape]] that materializes to an external switch that allows external completion of that unique materialization. Different materializations result in different, independent switches."
def build(config: StreamConfig): Flow[Item, OtherItem, UniqueKillSwitch] = {
  Flow[Item]
    .map(item => doConversion(item))
    .filter(_.isDefined)
    .map(_.get)
    .viaMat(KillSwitches.single)(Keep.right)
}
val stream =
  Source.fromIterator(() => List(Item(Some(1))).toIterator)
    .viaMat(build(StreamConfig(1)))(Keep.right)
    .toMat(Sink.ignore)(Keep.both).run
// This stops the stream
stream._1.shutdown()
// When it finishes
stream._2 onComplete(_ => println("Done"))
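How the materialized value ends up as a pair in the last snippet can be modelled without Akka. The sketch below is a simplified illustration, not Akka's implementation: `keepLeft`, `keepRight`, and `keepBoth` are hypothetical stand-ins for Keep.left, Keep.right, and Keep.both, and plain strings stand in for the materialized values of each stage.

```scala
// Stand-ins for Akka's Keep combiners: each decides which materialized
// value survives when two stages are connected.
def keepLeft[L, R](l: L, r: R): L = l
def keepRight[L, R](l: L, r: R): R = r
def keepBoth[L, R](l: L, r: R): (L, R) = (l, r)

// Strings standing in for what each stage materializes to.
val sourceMat = "NotUsed"
val flowMat = "UniqueKillSwitch"
val sinkMat = "Future[Done]"

// Source.viaMat(flow)(Keep.right): the flow's value replaces the source's.
val afterVia = keepRight(sourceMat, flowMat)

// .toMat(sink)(Keep.both): keep what we have and add the sink's value.
val afterTo = keepBoth(afterVia, sinkMat)
```

This mirrors why `stream._1` is the kill switch and `stream._2` is the completion future in the answer's code.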

Obtain the URI of a Controller method call

I'm writing a Play 2.3.2 application in Scala.
In my application I'm writing a method that calls another controller method, like the following:
def addTagToUser = CorsAction.async { request =>
implicit val userRestFormat = UserFormatters.restFormatter
implicit val inputFormat = InputFormatters.restFormatter
implicit val outputWriter = OutputFormatters.restWriter
//update the tag of a user
def updateTagToUserDB(value: JsValue): Future[Boolean] = {
val holder : WSRequestHolder = WS.url("http://localhost:9000/recommendation/ advise")
val complexHolder = holder.withHeaders("Content-Type" -> "application/json")
complexHolder.post(value).map(response => response.status match {//handle the response
case 200 => true
case _ => false
}
)
}
val jsonData = request.body.asJson //get the json data
jsonData match {
case Some(x) => x.validate[Input] match {
case JsSuccess(input, _) => updateTagToUserDB(x).flatMap(status => status match {
case true => Future{Ok}
case _ => Future{InternalServerError("Error on update the users tags")}
})
case e: JsError => Future{BadRequest("json bad formed")}
}
case None => Future{BadRequest("need a json value")}
}
}
The problem with this code is that the URL is hard-coded. Is it possible to get the absolute URI of a controller method in Play?
How can I do that?
As mentioned in the reverse routing section of the Play docs, you can achieve this with the following method call:
routes.Application.advise()
Note that routes lives in the controllers package, so if you are in the controllers package you can simply access reverse routes with routes.ControllerName.methodName.
From other parts of the code you need to use the fully qualified name, i.e. controllers.routes.Application.advise().
If the controller method takes a parameter, you need to pass the desired argument to get the actual route, for example routes.Application.someMethod(10).
Reverse routing is a powerful asset in the Play toolbox which frees you from repeating yourself. It's future-proof in the sense that if you change your route, the change will automatically be reflected across the whole application.
Alternative
Calling your own endpoint over HTTP may not be the best approach.
Redirecting to another controller makes sense, but sending a request to another controller that resides inside the same web app is overkill and unnecessary. It would be wiser for your web app to serve responses to the outside than to request a response from itself.
You can easily avoid it by putting the common logic somewhere else and using it from both controllers. According to best practices, a good controller is a thin one! With better layering, life will be much easier.
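A minimal sketch of that layering, using only the standard library: `TagService`, `RecommendationController`, `UserController`, and their methods are hypothetical names, and response strings stand in for Play results. Both "controllers" call the shared service directly instead of one calling the other over HTTP.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical service holding the logic that both controllers need.
object TagService {
  // Stand-in for the real update; succeeds when the tag is non-empty.
  def updateTags(user: String, tag: String): Future[Boolean] =
    Future(tag.nonEmpty)
}

// Hypothetical thin controllers: each only translates the service result
// into a response, here represented as a plain string.
object RecommendationController {
  def advise(user: String, tag: String): Future[String] =
    TagService.updateTags(user, tag).map(ok => if (ok) "200 OK" else "500 Internal Server Error")
}

object UserController {
  def addTagToUser(user: String, tag: String): Future[String] =
    TagService.updateTags(user, tag).map(ok => if (ok) "200 OK" else "500 Internal Server Error")
}

val okResponse = Await.result(UserController.addTagToUser("alice", "scala"), 5.seconds)
val errorResponse = Await.result(RecommendationController.advise("alice", ""), 5.seconds)
```

With this structure there is no URL to build at all, so the question of an absolute URI does not arise for internal calls.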

Parallel file processing in Scala

Suppose I need to process files in a given folder in parallel. In Java I would create a FolderReader thread to read file names from the folder and a pool of FileProcessor threads. FolderReader reads file names and submits the file processing function (Runnable) to the pool executor.
In Scala I see two options:
create a pool of FileProcessor actors and schedule a file processing function with Actors.Scheduler.
create an actor for each file name while reading the file names.
Does it make sense? What is the best option?
Depending on what you're doing, it may be as simple as
for (file <- files.par) {
  // process the file
}
I strongly suggest staying as far away from raw threads as you can. Luckily, we have better abstractions that take care of what is happening underneath. In your case, it seems to me that you do not need actors (although you could use them); you can use a simpler abstraction called Futures. They are part of the Akka open source library, and I think they will become part of the Scala standard library in the future as well.
A Future[T] is simply something that will return a T in the future.
All you need to run a future is an implicit ExecutionContext, which you can derive from a Java executor service. Then you will be able to enjoy the elegant API and the fact that a future is a monad, so you can transform collections into collections of futures, collect the results, and so on. I suggest you take a look at http://doc.akka.io/docs/akka/2.0.1/scala/futures.html
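import akka.dispatch.{Await, ExecutionContext, Future}
import akka.util.duration._
import java.util.concurrent.Executors
object TestingFutures {
  implicit val executorService = Executors.newFixedThreadPool(20)
  implicit val executorContext = ExecutionContext.fromExecutorService(executorService)
  def testFutures(myList: List[String]): List[String] = {
    // A single future that completes once every element has been processed
    val listOfFutures: Future[List[String]] = Future.traverse(myList) {
      aString => Future {
        aString.reverse
      }
    }
    // Block until all results are available (at most one minute)
    val result: List[String] = Await.result(listOfFutures, 1 minute)
    result
  }
}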
There's a lot going on here:
I am using Future.traverse, which takes a collection M[T] (where M[T] <: Traversable[T]) as its first parameter and a function T => Future[T] (or, if you prefer, a Function1[T, Future[T]]) as its second, and returns a Future[M[T]].
I am using the Future.apply method to create an anonymous class of type Future[T].
There are many other reasons to look at Akka futures:
Futures can be mapped because they are monads, i.e. you can chain the execution of Futures:
Future { 3 }.map { _ * 2 }.map { _.toString }
Futures have callbacks: future.onComplete, onSuccess, onFailure, andThen, etc.
Futures support not only traverse but also for-comprehensions.
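For readers on a current Scala version, the same idea can be sketched with scala.concurrent, where futures live in the standard library since Scala 2.10. This is an adaptation under assumptions rather than the code above: `processFile` is a hypothetical placeholder for the real per-file work, the file names are made up, and the pool size is arbitrary.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, ExecutionContextExecutorService, Future}
import scala.concurrent.duration._

// A fixed pool plays the role of the Java "FileProcessor" thread pool.
val pool = Executors.newFixedThreadPool(4)
implicit val ec: ExecutionContextExecutorService = ExecutionContext.fromExecutorService(pool)

// Hypothetical per-file work; here it only measures the name length.
def processFile(name: String): Int = name.length

val fileNames = List("a.txt", "bb.txt", "ccc.txt")

// One Future per file name, gathered into a single Future of all results.
// The result list keeps the order of the input list.
val all: Future[List[Int]] = Future.traverse(fileNames)(name => Future(processFile(name)))

// Wait for completion, then release the pool's threads.
val lengths = try Await.result(all, 1.minute) finally ec.shutdown()
```

Blocking with Await is only there to keep the example short; in an application you would normally compose on the resulting Future instead.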
Ideally you should use two actors. One for reading the list of files, and one for actually reading the file.
You start the process by simply sending a single "start" message to the first actor. The actor can then read the list of files, and send a message to the second actor. The second actor then reads the file and processes the contents.
Having multiple actors, which might seem complicated, is actually a good thing in the sense that you have a bunch of objects communicating with each other, like in a theoretical OO system.
Edit: you REALLY shouldn't be doing concurrent reading of a single file.
I was going to write up exactly what @Edmondo1984 did, except he beat me to it. :) I second his suggestion in a big way. I'll also suggest that you read the documentation for Akka 2.0.2. As well, I'll give you a slightly more concrete example:
import akka.dispatch.{ExecutionContext, Future, Await}
import akka.util.duration._
import java.util.concurrent.Executors
import java.io.File
val execService = Executors.newCachedThreadPool()
implicit val execContext = ExecutionContext.fromExecutorService(execService)
val tmp = new File("/tmp/")
val files = tmp.listFiles()
val workers = files.map { f =>
  Future {
    f.getAbsolutePath()
  }
}.toSeq
val result = Future.sequence(workers)
result.onSuccess {
  case filenames =>
    filenames.foreach { fn =>
      println(fn)
    }
}
// Artificial just to make things work for the example
Thread.sleep(100)
execContext.shutdown()
Here I use sequence instead of traverse, but the difference is going to depend on your needs.
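The relationship between the two can be shown side by side. This sketch uses the standard library's scala.concurrent rather than the akka.dispatch imports above, which is an assumption about the reader's setup; the input list is made up.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

val names = List("a", "b", "c")

// sequence: start from a collection of already-created futures.
val viaSequence: Future[List[String]] =
  Future.sequence(names.map(n => Future(n.toUpperCase)))

// traverse: create the futures and gather them in one step.
val viaTraverse: Future[List[String]] =
  Future.traverse(names)(n => Future(n.toUpperCase))

val fromSequence = Await.result(viaSequence, 10.seconds)
val fromTraverse = Await.result(viaTraverse, 10.seconds)
```

Use sequence when the futures already exist, and traverse when you are starting from plain values.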
Go with the Future, my friend; the Actor is just a more painful approach in this instance.
But if we do use actors, what's wrong with that?
Suppose we have to read from and write to some property files. Here is my Java example, still using Akka actors.
Let's say we have an actor ActorFile that represents one file. Hm... it probably cannot represent just one file, right? (It would be nice if it could.) So instead it represents several files, like a PropertyFilesActor.
Why not use something like this:
public class PropertyFilesActor extends UntypedActor {
    Map<String, String> filesContent = new LinkedHashMap<String, String>();
    { // here we should use real files of course
        filesContent.put("file1.xml", "");
        filesContent.put("file2.xml", "");
    }
    @Override
    public void onReceive(Object message) throws Exception {
        if (message instanceof WriteMessage) {
            // Append the new text to the stored content of the file
            WriteMessage writeMessage = (WriteMessage) message;
            String content = filesContent.get(writeMessage.fileName);
            String newContent = content + writeMessage.stringToWrite;
            filesContent.put(writeMessage.fileName, newContent);
        }
        else if (message instanceof ReadMessage) {
            ReadMessage readMessage = (ReadMessage) message;
            String currentContent = filesContent.get(readMessage.fileName);
            // Send the current content back to the sender
            getSender().tell(new ReadMessage(readMessage.fileName, currentContent), getSelf());
        }
        else unhandled(message);
    }
}
...a message carries a parameter (fileName).
The actor has its own inbox, accepting messages like:
WriteLine(fileName, string)
ReadLine(fileName, string)
Those messages are stored in the inbox in order, one after another. The actor does its work by receiving messages from the inbox, storing or reading, and meanwhile sending feedback to the sender (sender ! message).
Thus, let's say we write to the property file and want to show its content on a web page. We can start showing the page right after we send the message to store the data in the file and, as soon as we receive the feedback, update part of the page with the data from the just-updated file (via Ajax).
Well, grab your files and stick them in a parallel structure
scala> new java.io.File("/tmp").listFiles.par
res0: scala.collection.parallel.mutable.ParArray[java.io.File] = ParArray( ... )
Then...
scala> res0 map (_.length)
res1: scala.collection.parallel.mutable.ParArray[Long] = ParArray(4943, 1960, 4208, 103266, 363 ... )