akka.http.scaladsl.model.ParsingException: Unexpected end of multipart entity while uploading a large file to S3 using akka http - scala

I am trying to upload a large file (90 MB for now) to S3 using Akka HTTP with the Alpakka S3 connector. It works fine for small files (25 MB), but when I try to upload a large file (90 MB) I get the following error:
akka.http.scaladsl.model.ParsingException: Unexpected end of multipart entity
at akka.http.scaladsl.unmarshalling.MultipartUnmarshallers$$anonfun$1.applyOrElse(MultipartUnmarshallers.scala:108)
at akka.http.scaladsl.unmarshalling.MultipartUnmarshallers$$anonfun$1.applyOrElse(MultipartUnmarshallers.scala:103)
at akka.stream.impl.fusing.Collect$$anon$6.$anonfun$wrappedPf$1(Ops.scala:227)
at akka.stream.impl.fusing.SupervisedGraphStageLogic.withSupervision(Ops.scala:186)
at akka.stream.impl.fusing.Collect$$anon$6.onPush(Ops.scala:229)
at akka.stream.impl.fusing.GraphInterpreter.processPush(GraphInterpreter.scala:523)
at akka.stream.impl.fusing.GraphInterpreter.processEvent(GraphInterpreter.scala:510)
at akka.stream.impl.fusing.GraphInterpreter.execute(GraphInterpreter.scala:376)
at akka.stream.impl.fusing.GraphInterpreterShell.runBatch(ActorGraphInterpreter.scala:606)
at akka.stream.impl.fusing.GraphInterpreterShell$AsyncInput.execute(ActorGraphInterpreter.scala:485)
at akka.stream.impl.fusing.GraphInterpreterShell.processEvent(ActorGraphInterpreter.scala:581)
at akka.stream.impl.fusing.ActorGraphInterpreter.akka$stream$impl$fusing$ActorGraphInterpreter$$processEvent(ActorGraphInterpreter.scala:749)
at akka.stream.impl.fusing.ActorGraphInterpreter.akka$stream$impl$fusing$ActorGraphInterpreter$$shortCircuitBatch(ActorGraphInterpreter.scala:739)
at akka.stream.impl.fusing.ActorGraphInterpreter$$anonfun$receive$1.applyOrElse(ActorGraphInterpreter.scala:765)
at akka.actor.Actor.aroundReceive(Actor.scala:539)
at akka.actor.Actor.aroundReceive$(Actor.scala:537)
at akka.stream.impl.fusing.ActorGraphInterpreter.aroundReceive(ActorGraphInterpreter.scala:671)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:614)
at akka.actor.ActorCell.invoke(ActorCell.scala:583)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:268)
at akka.dispatch.Mailbox.run(Mailbox.scala:229)
at akka.dispatch.Mailbox.exec(Mailbox.scala:241)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Although I get a success message at the end, the file is not uploaded completely; only about 45-50 MB of it make it to S3.
I am using the code below:
S3Utility.scala
class S3Utility(implicit as: ActorSystem, m: Materializer) {
  private val bucketName = "test"

  def sink(fileInfo: FileInfo): Sink[ByteString, Future[MultipartUploadResult]] = {
    val fileName = fileInfo.fileName
    S3.multipartUpload(bucketName, fileName)
  }
}
Routes:
def uploadLargeFile: Route =
  post {
    path("import" / "file") {
      extractMaterializer { implicit materializer =>
        withoutSizeLimit {
          fileUpload("file") {
            case (metadata, byteSource) =>
              logger.info(s"Request received to import large file: ${metadata.fileName}")
              val uploadFuture = byteSource.runWith(s3Utility.sink(metadata))
              onComplete(uploadFuture) {
                case Success(result) =>
                  logger.info(s"Successfully uploaded file")
                  complete(StatusCodes.OK)
                case Failure(ex) =>
                  println(ex, "Error in uploading file")
                  complete(StatusCodes.FailedDependency, ex.getMessage)
              }
          }
        }
      }
    }
  }
Any help would be appreciated. Thanks

Strategy 1
You can break the file into smaller chunks and retry each part; here is sample code using the AWS SDK for Java:
AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
        .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration("some-kind-of-endpoint"))
        .withCredentials(new AWSStaticCredentialsProvider(new BasicAWSCredentials("user", "pass")))
        .disableChunkedEncoding()
        .withPathStyleAccessEnabled(true)
        .build();

// Create a list of UploadPartResponse objects. You get one of these for each part upload.
List<PartETag> partETags = new ArrayList<PartETag>();

// Step 1: Initialize.
InitiateMultipartUploadRequest initRequest =
        new InitiateMultipartUploadRequest("bucket", "key");
InitiateMultipartUploadResult initResponse =
        s3Client.initiateMultipartUpload(initRequest);

File file = new File("filepath");
long contentLength = file.length();
long partSize = 5242880; // Set part size to 5 MB.

try {
    // Step 2: Upload parts.
    long filePosition = 0;
    for (int i = 1; filePosition < contentLength; i++) {
        // Last part can be less than 5 MB. Adjust part size.
        partSize = Math.min(partSize, (contentLength - filePosition));

        // Create a request to upload a part.
        UploadPartRequest uploadRequest = new UploadPartRequest()
                .withBucketName("bucket").withKey("key")
                .withUploadId(initResponse.getUploadId()).withPartNumber(i)
                .withFileOffset(filePosition)
                .withFile(file)
                .withPartSize(partSize);

        // Upload part and add response to our list.
        partETags.add(s3Client.uploadPart(uploadRequest).getPartETag());

        filePosition += partSize;
    }

    // Step 3: Complete.
    CompleteMultipartUploadRequest compRequest = new CompleteMultipartUploadRequest(
            "bucket",
            "key",
            initResponse.getUploadId(),
            partETags);

    s3Client.completeMultipartUpload(compRequest);
} catch (Exception e) {
    s3Client.abortMultipartUpload(new AbortMultipartUploadRequest(
            "bucket", "key", initResponse.getUploadId()));
}
Strategy 2
Increase the idle-timeout of the Akka HTTP server (just set it to infinite), like the following:
akka.http.server.idle-timeout=infinite
This increases how long the server will keep a connection open while waiting for data. By default its value is 60 seconds, and if the upload does not complete within that period, the server closes the connection and the "Unexpected end of multipart entity" error is thrown.
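If you prefer not to change application.conf, the same override can be applied programmatically when binding the server. This is only a sketch under assumptions: the interface, port, and the use of uploadLargeFile as the bound route are placeholders, and the implicit ActorSystem and Materializer from the question's setup are assumed to be in scope.

import akka.http.scaladsl.Http
import akka.http.scaladsl.settings.ServerSettings
import com.typesafe.config.ConfigFactory

// Override only the idle-timeout and fall back to the regular configuration
// for everything else. Interface and port below are placeholders.
val serverSettings = ServerSettings(
  ConfigFactory
    .parseString("akka.http.server.idle-timeout = infinite")
    .withFallback(ConfigFactory.load())
)

Http().bindAndHandle(uploadLargeFile, "0.0.0.0", 8080, settings = serverSettings)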

Related

How can I handle a multipart POST request with Akka HTTP?

I want to handle a multipart request.
If I accept the request using a route like this:
val routePutData = path("api" / "putFile" / Segment) { subDir =>
  entity(as[String]) { str =>
    complete(str)
  }
}
I get the following text (I am trying to send a log4j config):
Content-Disposition: form-data; name="file"; filename="log4j.properties"
Content-Type: application/binary
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd hh:mm:ss} %t %-5p %c{1} - %m%n
----gc0pMUlT1B0uNdArYc0p--
How can I get the array of bytes and the file name from the file I send?
I tried to use entity(as[Multipart.FormData]) and the formFields directive, but it didn't help.
You should keep up with the Akka docs, although there are not many examples in the file-uploading section. Anyway, you don't need to extract the entity as a string or a byte array; Akka already has a directive called fileUpload. It takes a parameter called fieldName, which is the key to look for in the multipart request, and a function that decides what to do given the metadata and the content of the file. Something like this:
post {
  extractRequestContext { ctx =>
    implicit val mat = ctx.materializer
    implicit val ec = ctx.executionContext // needed for the .map on the Future below

    fileUpload(fieldName = "myfile") {
      case (metadata, byteSource) =>
        val fileName = metadata.fileName
        val futureBytes = byteSource
          .mapConcat[Byte] { byteString =>
            collection.immutable.Iterable.from(byteString.iterator)
          }
          .toMat(Sink.fold(Array.emptyByteArray) {
            case (arr, newByte) => arr :+ newByte
          })(Keep.right)
          .run()

        val filePath = Files.createFile(Paths.get(s"/DIR/TO/SAVE/FILE/$fileName"))
        onSuccess(futureBytes.map(bytes => Files.write(filePath, bytes))) { _ =>
          complete(s"wrote file to: ${filePath.toUri.toString}")
        }
    }
  }
}
While the above solution works, there is also the storeUploadedFile directive, which achieves the same with less code, something like:
path("upload") {
def tempDestination(fileInfo: FileInfo): File = File.createTempFile(fileInfo.fileName, ".tmp.server")
storeUploadedFile("myfile", tempDestination) {
case (metadataFromClient: FileInfo, uploadedFile: File) =>
println(s"Server stored uploaded tmp file with name: ${uploadedFile.getName} (Metadata from client: $metadataFromClient)")
complete(HttpResponse(StatusCodes.OK))
}
}

How to upload files and get formfields in akka-http

I am trying to upload a file via akka-http, and have gotten it to work with the following snippet
def tempDestination(fileInfo: FileInfo): File =
  File.createTempFile(fileInfo.fileName, ".tmp")

val route =
  storeUploadedFile("csv", tempDestination) {
    case (metadata, file) =>
      // Do my operation on the file.
      complete("File Uploaded. Status OK")
  }
But I'd also like to send param1/param2 in the posted form.
I tried the following, and it works, but I am having to send the parameters via the URL (http://host:port/csv-upload?userid=arvind):
(post & path("csv-upload")) {
storeUploadedFile("csv", tempDestination) {
case (metadata, file) =>
parameters('userid) { userid =>
//logic for processing the file
complete(OK)
}
}
}
The files can be up to around 200-300 MB, so I added the following property to my conf:
akka {
  http {
    parsing {
      max-content-length = 200m
    }
  }
}
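For what it's worth, the limit can apparently also be raised for a single route with the withSizeLimit directive instead of (or in addition to) the global setting. A rough sketch, reusing the tempDestination helper from above; the 300 MB figure and the route value name are just illustrative:

import akka.http.scaladsl.server.Directives._

// Sketch only: lift the request-entity size limit for this route alone.
val csvUploadRoute =
  (post & path("csv-upload")) {
    withSizeLimit(300L * 1024 * 1024) { // ~300 MB, illustrative value
      storeUploadedFile("csv", tempDestination) {
        case (metadata, file) =>
          // process the file as before
          complete("File Uploaded. Status OK")
      }
    }
  }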
Is there a way I can get the parameters via the formFields directive?
I tried the following:
fileUpload("csv") {
case (metadata, byteSource) =>
formFields('userid) { userid =>
onComplete(byteSource.runWith(FileIO.toPath(Paths.get(metadata.fileName)))) {
case Success(value) =>
logger.info(s"${metadata}")
complete(StatusCodes.OK)
case Failure(exception) =>
complete("failure")
But with the above code, I hit the following exception:
java.lang.IllegalStateException: Substream Source cannot be materialized more than once
at akka.stream.impl.fusing.SubSource$$anon$13.setCB(StreamOfStreams.scala:792)
at akka.stream.impl.fusing.SubSource$$anon$13.preStart(StreamOfStreams.scala:802)
at akka.stream.impl.fusing.GraphInterpreter.init(GraphInterpreter.scala:306)
at akka.stream.impl.fusing.GraphInterpreterShell.init(ActorGraphInterpreter.scala:593)
Thanks,
Arvind
I got this working with something like:
path("upload") {
formFields(Symbol("payload")) { payload =>
println(s"Server received request with additional payload: $payload")
def tempDestination(fileInfo: FileInfo): File = File.createTempFile(fileInfo.fileName, ".tmp.server")
storeUploadedFile("binary", tempDestination) {
case (metadataFromClient: FileInfo, uploadedFile: File) =>
println(s"Server stored uploaded tmp file with name: ${uploadedFile.getName} (Metadata from client: $metadataFromClient)")
complete(Future(FileHandle(uploadedFile.getName, uploadedFile.getAbsolutePath, uploadedFile.length())))
}
}
}
Full example:
https://github.com/pbernet/akka_streams_tutorial/blob/master/src/main/scala/akkahttp/HttpFileEcho.scala

How to stream downloads using Scalaj-Http and Hadoop HttpFs

My question is how to use a buffered stream with Scalaj-Http.
I have written the following code, which is a complete working example that downloads a file from Hadoop HDFS using HttpFS. My goal is to handle very large files, and this requires a buffered approach with multiple I/O writes to a local file.
I have not been able to find documentation on how to use a stream with the Scalaj-Http interface. I am interested in an example for both download and upload that can handle large multi-GB files. My code below uses in-memory buffering, which is appropriate only for prototyping.
import scalaj.http._
import ujson.Js
import java.text.SimpleDateFormat
import java.net.SocketTimeoutException
import java.io.InputStream
import java.io.BufferedOutputStream
import java.io.FileOutputStream
import java.io.FileNotFoundException

object CopyFileFromHdfs {
  def main(args: Array[String]) {
    val host = "hadoop.example.com"
    val user = "root"
    var dstFile = ""
    var srcFile = ""
    val operation = "OPEN"
    val port = 14000

    System.setProperty("sun.net.http.allowRestrictedHeaders", "true")

    if (args.length != 2) {
      println("Error: Missing or too many arguments")
      println("Usage: CopyFileFromHdfs <srcfile> <dstfile>")
      System.exit(1)
    }

    srcFile = args(0)
    dstFile = args(1)

    // ********************************************************************************
    // Create the URL string that we will use to connect to Hadoop HttpFS
    //
    // The string will look like this:
    // http://root#123.456.789.012:14000/webhdfs/v1/?user.name=root&op=OPEN
    // ********************************************************************************
    val url = makeHttpfsUrl(host, user, srcFile, operation, port)

    // ********************************************************************************
    // Using HTTP, call the HttpFS server
    //
    // Exceptions:
    //   java.net.SocketTimeoutException
    //   java.net.UnknownHostException
    //   java.lang.IllegalArgumentException
    // Remote Exceptions:
    //   java.io.FileNotFoundException
    //   com.sun.jersey.api.NotFoundException
    // ********************************************************************************
    try {
      var response = Http(url)
        .timeout(connTimeoutMs = 1000, readTimeoutMs = 5000)
        .asBytes

      // ********************************************************************************
      // Check for an error. We are expecting an HTTP 200 response
      // ********************************************************************************
      if (response.code < 200 || response.code > 299) {
        val data = ujson.read(response.body)
        printf("Error: Cannot download file: %s\n", dstFile)
        println(removeQuotes(data("RemoteException")("message").str))
        println(removeQuotes(data("RemoteException")("exception").str))
        System.exit(1)
      }

      val is = new FileOutputStream(dstFile)
      val bs = new BufferedOutputStream(is)
      bs.write(response.body, 0, response.body.length)
      bs.close()
      is.close()
    } catch {
      case e: SocketTimeoutException => {
        printf("Error: Cannot connect to host %s on port %d\n", host, port)
        println(e)
        System.exit(1)
      }
      case e: Exception => {
        printf("Error (other): Cannot download file %s\n", srcFile)
        println(e)
        System.exit(1)
      }
    }

    printf("Success: File downloaded. %s -> %s\n", srcFile, dstFile)
    System.exit(0)
  }

  // ********************************************************************************
  // The Json strings are surrounded by quotes.
  // This function will remove them (only at the start and the end).
  // ********************************************************************************
  def removeQuotes(str: String): String = {
    // This expression will delete quotes at the beginning and end of a string
    return str.replaceAll("^\"|\"$", "")
  }

  // ********************************************************************************
  // Create the URL string that we will use to connect to Hadoop HttpFS
  //
  // The string will look like this:
  // http://root#123.456.789.012:14000/webhdfs/v1/?user.name=root&op=LISTSTATUS
  // ********************************************************************************
  def makeHttpfsUrl(
      host: String,
      user: String,
      hdfsPath: String,
      operation: String,
      port: Integer): String = {

    var url = "http://" + user + "#" + host + ":" + port.toString + "/webhdfs/v1"

    if (hdfsPath(0) == '/')
      url += hdfsPath
    else
      url += "/" + hdfsPath

    url += "?user.name=" + user + "&op=" + operation

    return url
  }
}
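A streaming variant is sketched below. It assumes Scalaj-Http's execute method, which hands the raw response body to a parser function as an InputStream, so the bytes can be copied to disk in fixed-size chunks instead of being loaded into memory with .asBytes. The helper name, the buffer size and the absence of the error handling from the original code are simplifications.

import java.io.{BufferedOutputStream, FileOutputStream, InputStream}
import scalaj.http._

// Sketch: copy the response body to a local file in 64 KB chunks as it arrives.
def downloadToFile(url: String, dstFile: String): HttpResponse[Unit] =
  Http(url)
    .timeout(connTimeoutMs = 1000, readTimeoutMs = 5000)
    .execute { is: InputStream =>
      val out = new BufferedOutputStream(new FileOutputStream(dstFile))
      try {
        val buffer = new Array[Byte](64 * 1024)
        var read = is.read(buffer)
        while (read != -1) {
          out.write(buffer, 0, read)
          read = is.read(buffer)
        }
      } finally {
        out.close()
      }
    }

The returned HttpResponse still exposes response.code, so the HTTP status check from the original code can be kept around the call.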

SSLHandshakeException happens during file upload to AWS S3 via Alpakka

I'm trying to set up Alpakka S3 for file uploads. Here are my configs:
alpakka s3 dependency:
...
"com.lightbend.akka" %% "akka-stream-alpakka-s3" % "0.20"
...
Here is application.conf:
akka.stream.alpakka.s3 {
  buffer = "memory"

  proxy {
    host = ""
    port = 8000
    secure = true
  }

  aws {
    credentials {
      provider = default
    }
  }

  path-style-access = false
  list-bucket-api-version = 2
}
File upload code example:
private val awsCredentials = new BasicAWSCredentials("my_key", "my_secret_key")
private val awsCredentialsProvider = new AWSStaticCredentialsProvider(awsCredentials)
private val regionProvider = new AwsRegionProvider { def getRegion: String = "us-east-1" }
private val settings = new S3Settings(MemoryBufferType, None, awsCredentialsProvider, regionProvider, false, None, ListBucketVersion2)
private val s3Client = new S3Client(settings)(system, materializer)

import system.dispatcher // execution context for map/recover below

val fileSource = Source.single(ByteString("ololo blabla bla"))
val fileName = UUID.randomUUID().toString
val s3Sink: Sink[ByteString, Future[MultipartUploadResult]] = s3Client.multipartUpload("my_basket", fileName)

fileSource.runWith(s3Sink)
  .map { result =>
    println(s"${result.location}")
  } recover {
    case ex: Exception => println(s"$ex")
  }
When I run this code I get:
javax.net.ssl.SSLHandshakeException: General SSLEngine problem
What could be the reason?
The certificate problem arises for bucket names containing dots.
You may switch to
akka.stream.alpakka.s3.path-style-access = true to get rid of this.
We're considering making it the default: https://github.com/akka/alpakka/issues/1152
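If you construct S3Settings in code as in the snippet above rather than via application.conf, the same switch appears to be the boolean constructor argument (pathStyleAccess in Alpakka 0.20), so a sketch based on that snippet would be:

// Same settings as in the question, but with path-style access enabled so the
// bucket name ends up in the request path instead of the hostname, and the
// wildcard certificate matches even when the bucket name contains dots.
private val settings = new S3Settings(
  MemoryBufferType,
  None,                   // proxy
  awsCredentialsProvider,
  regionProvider,
  true,                   // pathStyleAccess (was false in the question)
  None,                   // endpoint URL
  ListBucketVersion2
)
private val s3Client = new S3Client(settings)(system, materializer)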

Save attachments from response in SoapUI

I get 2 files in the response to a SOAP request. I try to save these files with the following Groovy script, which I use as a script assertion on the test step. The first file is saved successfully during execution, but I can't find the second one.
def fileName = "C:\\<mydirectory>"+'/test.pdf'
def fileName1 = "C:\\<mydirectory>"+'/test1.pdf'
def response = messageExchange.response
assert null != response, "response is null"
def outFile = new FileOutputStream(new File(fileName))
def outFile1 = new FileOutputStream(new File(fileName1))
def ins = messageExchange.responseAttachments[0]?.inputStream
def ins1 = messageExchange.responseAttachments[0]?.inputStream
if (ins) {
com.eviware.soapui.support.Tools.writeAll(outFile, ins)
}
ins.close()
outFile.close()
if (ins1) {
com.eviware.soapui.support.Tools.writeAll(outFile1, ins)
}
ins1.close()
outFile1.close()