Multipart Form Errors with Lagom (Scala)

Most of our Lagom entrypoints don't use multipart form requests, but one does. Since Lagom doesn't currently support multipart requests natively, the general suggestion I have seen is to call the underlying Play API, using the PlayServiceCall mechanism.
We have done that, and it works most of the time. But we experience intermittent errors, especially when submitting large files. These are always java.util.zip.ZipException errors (of various kinds), and they look as if the file was not fully received before processing.
Here's how the entrypoint looks in the code; in particular, the Play wrapping mechanism:
def upload = PlayServiceCall[NotUsed, UUID] { wrapCall =>
  Action.async(multipartFormData) { request =>
    wrapCall(ServiceCall { _ =>
      val upload = request.body.file("upload")
      val input = new FileInputStream(upload.get.ref.file)
      val filename = upload.get.filename
      // ...
      // other code to actually process the file
      // ...
    })(request).run
  }
}
Here are just two examples of exceptions we're seeing:
Caused by: java.util.zip.ZipException: invalid code lengths set
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:164)
at java.util.zip.ZipInputStream.read(ZipInputStream.java:194)
at org.apache.poi.openxml4j.util.ZipSecureFile$ThresholdInputStream.read(ZipSecureFile.java:214)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
etc.
Caused by: java.util.zip.ZipException: invalid distance too far back
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:164)
at java.util.zip.ZipInputStream.read(ZipInputStream.java:194)
at org.apache.poi.openxml4j.util.ZipSecureFile$ThresholdInputStream.read(ZipSecureFile.java:214)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
etc.
We use Lagom 1.3.8, in Scala. Any suggestions, please?

Try using the new service gateway based on Akka HTTP.
You can enable this by adding the following to your build.sbt:
lagomServiceGatewayImpl in ThisBuild := "akka-http"
The new service gateway is still disabled by default in Lagom 1.3.8, but Lagom users that have experienced this problem have reported that it is resolved by enabling the akka-http gateway. This will become the default implementation in Lagom 1.4.0.

Related

Is there support for compression in ReactiveMongo?

I am using ReactiveMongo as the connector for an Akka HTTP / Akka Streams project. I am creating the MongoConnection as shown below, but the data in the database is compressed using Snappy. No matter where I look, I can't find any mention of compression support in the ReactiveMongo documentation, and when I try to connect to the Mongo database using a URL with the compressors=snappy flag, it throws an exception.
I looked through the source code, and indeed it appears to have no mention of compression support at all. At this point I'm willing to accept a hacky workaround.
Can anyone help me, please?
MongoConnection.fromString("mongodb://localhost:27017?compressors=snappy").flatMap(uri => driver.connect(uri))
Exception:
23:09:15.311 [default-akka.actor.default-dispatcher-6] ERROR akka.actor.ActorSystemImpl - Error during processing of request: 'The connection URI contains unsupported options: compressors'. Completing with 500 Internal Server Error response. To change default exception handling behavior, provide a custom ExceptionHandler.
java.lang.IllegalArgumentException: The connection URI contains unsupported options: compressors
at reactivemongo.api.AsyncDriver.connect(AsyncDriver.scala:227)
at reactivemongo.api.AsyncDriver.connect(AsyncDriver.scala:203)
at reactivemongo.api.AsyncDriver.connect(AsyncDriver.scala:252)
If you need a workable example, you can try this:
(You don't actually need a MongoDB container running locally for the error to be thrown)
object ReactiveMongoCompressorIssue extends App {
  import scala.concurrent.{Await, ExecutionContextExecutor}
  import scala.concurrent.duration._
  import akka.actor.ActorSystem
  import reactivemongo.api.{AsyncDriver, MongoConnection}

  implicit val actorSystem: ActorSystem = ActorSystem("ReactiveMongoCompressorIssue")
  implicit val dispatcher: ExecutionContextExecutor = actorSystem.dispatcher

  final val driver = AsyncDriver()
  val url = "mongodb://localhost:27017/?compressors=snappy"
  val connection = Await.result(MongoConnection.fromString(url).flatMap(uri => driver.connect(uri)), 3.seconds)
  assert(connection.active)
}
Thanks to what @cchantep said about how compression in MongoDB is handled on the server side (see the MongoDB docs here), I went back through the ReactiveMongo source code to see if there was a way to either bypass the check or remove the flag from the URL myself and connect without it.
Indeed, I found that there is a boolean flag called strictMode which determines whether ignoredOptions such as the compressors flag should cause an exception to be thrown or not. So now my connection looks like this:
MongoConnection.fromString(url).flatMap(uri => driver.connect(uri, None, strictMode = false))
The None is the optional name of a connection pool; the connect overload I was using before doesn't take one either, so this works fine.
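For the other workaround I considered (removing the flag from the URL before connecting), a minimal sketch could look like this; the stripOption helper is hypothetical and not part of ReactiveMongo:

```scala
// Hypothetical helper: drop an unsupported query option (e.g. "compressors")
// from a Mongo connection string before handing it to MongoConnection.fromString.
def stripOption(uri: String, option: String): String =
  uri.split("\\?", 2) match {
    case Array(base, query) =>
      // keep every query option except the one we want to remove
      val kept = query.split("&").filterNot(_.startsWith(option + "=")).mkString("&")
      if (kept.isEmpty) base else s"$base?$kept"
    case _ => uri // no query string, nothing to strip
  }
```

With this, `stripOption("mongodb://localhost:27017/?compressors=snappy", "compressors")` yields a URI that ReactiveMongo will accept even in strict mode.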
Thank you for the help!

User agent parser (ua-parser) slows down Spark on EMR

I am using ua-parser in my UDFs to parse user-agent info, and I noticed that these jobs are very slow compared to the ones without the parser. Here is an example:
import org.uaparser.scala.Parser
val parser: Parser = Parser.default
val parseDeviceUDF = udf((ua: String) => Try(parser.parse(ua).device.family).toOption.orNull)
The strange thing is that when I submit the job as an EMR step it is slow, but when I run the same code in Zeppelin or the Spark shell it works fine. I write the data to Parquet files, and that is the stage where it gets stuck.
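One frequent cause of this symptom (fast in the shell, slow as a submitted step) is that the parser ends up being re-created per task or per record instead of once per executor JVM; whether that applies here is an assumption. The usual mitigation is to keep the expensive resource in a @transient lazy val inside a serializable holder object. A generic, illustrative sketch (ExpensiveResource stands in for a holder around Parser.default):

```scala
// Illustrative pattern only: a @transient lazy val inside a serializable holder
// is rebuilt lazily after deserialization, so the expensive initialization runs
// at most once per JVM rather than once per record.
object ExpensiveResource extends Serializable {
  var initCount = 0 // only to demonstrate single initialization; not thread-safe

  @transient lazy val resource: String = {
    initCount += 1 // in the real case this would be Parser.default
    "parser"
  }
}
```

The UDF would then refer to ExpensiveResource.resource instead of closing over a locally constructed parser instance.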
The answer I am about to give is not about an open-source project, but it does provide information that whoever is researching how to parse the user-agent string to obtain device intelligence will want to know about.
WURFL is a time-honored tool to do User-Agent (and more generally HTTP request) analysis and obtain easily consumable device/browser information. ScientiaMobile has recently released a version of WURFL (called WURFL Microservice) that can be obtained from the major marketplaces of AWS, Azure and GCP (in addition to ScientiaMobile itself of course).
In the case at hand, the (Java) code that would bring a Spark user from HTTP logs to device data would look something like this.
JavaDStream enrichedEvents = events.map(evs -> {
    WmClient wmClient = WmClientProvider.getOrCreate(wmServerHost, "80");
    for (EnrichedEventData evItem : evs) {
        ...
        HttpServletRequestMock request = new HttpServletRequestMock(evItem.getHeaders());
        Model.JSONDeviceData device = wmClient.lookupRequest(request);
        evItem.setWurflCompleteName(device.capabilities.get("complete_device_name"));
        evItem.setWurflDeviceMake(device.capabilities.get("brand_name"));
        evItem.setWurflDeviceModel(device.capabilities.get("model_name"));
        evItem.setWurflFormFactor(device.capabilities.get("form_factor"));
        evItem.setWurflDeviceOS(device.capabilities.get("device_os") + " "
            + device.capabilities.get("device_os_version"));
        ...
    }
    return evs;
});
More information about how Spark and WURFL are integrated can be found in this article.
Disclaimer: I am the CTO of ScientiaMobile and original creator of WURFL.

Play WS OAuth content length required

I am having trouble connecting to the Evernote API using the OAuth wrapper bundled with Play 2.6.10 WS.
I am currently using sbt 0.13.15, Oracle JDK 1.8, and Scala 2.12.3.
The relevant piece of code from my OAuth Play controller:
import play.api.libs.oauth._

val KEY = ConsumerKey("KEY", "SECRET")

val EVERNOTE = OAuth(
  ServiceInfo(
    "https://sandbox.evernote.com/oauth",
    "https://sandbox.evernote.com/oauth",
    "https://sandbox.evernote.com/OAuth.action",
    key = KEY
  ),
  use10a = false
)

// Step 1: Request temporary token
EVERNOTE.retrieveRequestToken(CALLBACK_URL) match {
  case Right(t: RequestToken) =>
    // Step 2: Request user authorization; pass temporary token from Step 1
    // Also, store temporary token and secret for later use
    Redirect(EVERNOTE.redirectUrl(t.token)).withSession("token" -> t.token, "secret" -> t.secret)
  // TODO: check this out!
  case Left(e) => throw e
}
The application crashes due to the exception thrown from the Either returned by retrieveRequestToken. The exact exception is:
OAuthCommunicationException: Communication with the service provider failed: Service provider responded in error: 411 (Length Required)
After some snooping around, it seems this issue is common with OAuth and requires the POST request headers to contain a Content-Length (typically set to 0); see, for example: Why I get 411 Length required error?. But as far as I can tell, Play WS does not expose this option from Signpost (the OAuth library under the hood), so I was not able to try this solution.
Of course, I may be overlooking something here. Has anyone experienced a similar issue? I just want to make sure before creating a new issue on the WS repo.
Thanks.
Evernote requires a Content-Length header for its API calls, so I think that's the cause. See also: Getting 411 error bad request in Evernote.

Play WS Scala server hangs up on request after 120 seconds - which options to use?

I am pretty sure that this is a config problem, so I'll post my code and the relevant application.conf options of my play app.
I have a Play server that needs to interact with another server "B" (basically a multi-file upload to B). The interaction happens inside an async Action which should result in an OK with B's responses to the upload. This is the reduced code:
def authenticateAndUpload(url: String) = Action.async(parse.multipartFormData) { implicit request =>
  val form = authForm.bindFromRequest.get
  val (user, pass) = (form.user, form.pass)
  // the whole following interaction with the other server happens in a Future,
  // i.e. login returns a Future[Option[WSCookie]] which is then used
  login(user, pass, url).flatMap {
    case Some(cookie) => // use the cookie to upload the files and collect the result, i.e. server responses
      // this may take a few minutes and happens in yet another Future, which eventually produces the result
      result.map(cc => Ok(s"The server under url $url responded with $cc"))
    case None =>
      Future.successful(Forbidden(s"Unable to log into $url, please go back and try again with other credentials."))
  }
}
I am pretty sure that the code itself works, since I can see my server log, which nicely prints B's responses every few seconds and proceeds until everything is correctly uploaded. The only problem is that the browser hangs up with a "server overloaded" message after 120 seconds, which seems to be a Play default value - but for which config parameter?
I tried to get rid of it by setting every play.server.http.* timeout option I could get my hands on, and even added play.ws, specific Akka, and other options of which I am quite sure that they are not necessary... however, the problem remains. Here is the relevant part of my current application.conf:
ws.timeout.idle = "3600s"
ws.timeout.request = "3600s"
ws.timeout.response = "3600s"
play.ws.timeout.idle = "3600s"
play.ws.timeout.request = "3600s"
play.ws.timeout.response = "3600s"
play.server.http.connectionTimeout = "3600s"
play.server.http.idleTimeout = "3600s"
play.server.http.requestTimeout = "3600s"
play.server.http.responseTimeout = "3600s"
play.server.http.keepAlive = "true"
akka.http.host-connection-pool.idle-timeout = "3600s"
akka.http.host-connection-pool.client.idle-timeout = "3600s"
The browser hang-up happened in both Safari and Chrome; Chrome additionally started a second communication with B after about 120 seconds. Both of these communications succeeded and produced the expected logs; only the browsers hung up.
I am using Scala 2.12.2 with Play 2.6.2 in an sbt environment. The server is under development and is started via run; I read that in this mode it may not pick up the application.conf options, but it did pick up some file-size customizations. Can someone tell me the correct config options, or what is wrong with my run process?
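One thing that may be relevant here (an assumption about the setup, since the server is started via run): in Play 2.6 development mode, the embedded server reads its server-side configuration from PlayKeys.devSettings in build.sbt rather than from application.conf, so the play.server.* timeouts above may never be applied in dev mode. A hypothetical build.sbt fragment:

```scala
// Hypothetical build.sbt fragment: in dev mode ("sbt run") the embedded server
// takes server settings from devSettings, not from application.conf.
PlayKeys.devSettings += "play.server.http.idleTimeout" -> "3600s"
```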

Spray.io log leaks sensitive information

I'm using Spray client to consume a third-party API. Unfortunately, the API I'm consuming is not very secure and utilizes an authentication method using GET query parameters.
Sometimes we get timeouts or connection issues, which we know how to deal with at the application level. The problem is that Spray logs these at WARN level, and the URL, including the sensitive query parameters, is written to our log files.
Here's an example of the log file.
2015-05-19 12:23:17,024 WARN HttpHostConnectionSlot - Connection attempt to 10.10.10.10:443 failed in response to GET request to /api/?type=keygen&user=test_user&password=S3kret! with 2 retries left, retrying...
2015-05-19 12:23:17,084 WARN HttpHostConnectionSlot - Connection attempt to 10.10.10.10:443 failed in response to GET request to /api/?type=keygen&user=test_user&password=S3kret! with 1 retries left, retrying...
Is there any way to filter this? (Maybe in Akka?)
Spray reuses Akka's logging facilities for all its logging groundwork.
In akka you can redeclare a custom event logger in application config:
akka {
  # event-handlers = ["akka.event.Logging$DefaultLogger"] // default one
  event-handlers = ["com.example.PrivacyLogger"] // custom one
  # Options: ERROR, WARNING, INFO, DEBUG
  loglevel = "DEBUG"
}
It may look like this:
class PrivacyLogger extends DefaultLogger {
  override def receive: Receive = {
    case InitializeLogger(_) ⇒ sender() ! LoggerInitialized
    case event: LogEvent ⇒ print(stripSecret(event))
  }

  private def stripSecret(event: LogEvent) = ...
}
But you can always implement your own message-processing logic here instead of simply printing.
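As an illustration of what stripSecret might do, here is a sketch of a regex-based masker; the parameter names are only examples taken from the log lines above, and in the real logger you would apply this to the formatted event message before printing:

```scala
// Illustrative only: mask the values of sensitive query parameters in a log line.
def maskSensitive(line: String, params: Set[String] = Set("password", "user")): String =
  params.foldLeft(line) { (acc, p) =>
    // replace e.g. "password=S3kret!" with "password=*****"
    acc.replaceAll("(?i)(" + p + "=)[^&\\s]+", "$1*****")
  }
```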
PS. If you use slf4j for logging, the solution will look mostly the same, but with some minor differences, such as overriding akka.event.slf4j.Slf4jEventHandler instead of DefaultLogger.