I have code like this:
val extractInfo: (Array[Byte] => String) = (fp: Array[Byte]) => {
val parser:Parser = new AutoDetectParser()
val handler:BodyContentHandler = new BodyContentHandler(Integer.MAX_VALUE)
val config:TesseractOCRConfig = new TesseractOCRConfig()
val pdfConfig:PDFParserConfig = new PDFParserConfig()
val inputstream:InputStream = new ByteArrayInputStream(fp)
val metadata:Metadata = new Metadata()
val parseContext:ParseContext = new ParseContext()
parseContext.set(classOf[TesseractOCRConfig], config)
parseContext.set(classOf[PDFParserConfig], pdfConfig)
parseContext.set(classOf[Parser], parser)
parser.parse(inputstream, handler, metadata, parseContext)
handler.toString
}
A function literal that parses text from PDFs using Apache Tika.
What I want, though, is a Try block in here that runs on parser.parse and returns an empty string if it cannot execute. I am not sure how to construct this sort of logic in Scala.
I think what you are looking for is Try.
val extractInfo: (Array[Byte] => String) = (fp: Array[Byte]) => Try {
val parser:Parser = new AutoDetectParser()
...
handler.toString
} getOrElse("")
What this does is catch any non-fatal exception thrown in the body and recover from it by returning the empty string.
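A minimal self-contained sketch of the same pattern, assuming a hypothetical `parse` function in place of the Tika call. Note that `Try` only catches non-fatal exceptions; errors such as `OutOfMemoryError` still propagate.

```scala
import scala.util.Try

object TryGetOrElseSketch {
  // Hypothetical stand-in for parser.parse: fails on empty input.
  def parse(bytes: Array[Byte]): String =
    if (bytes.isEmpty) throw new IllegalArgumentException("nothing to parse")
    else new String(bytes, "UTF-8")

  // Same shape as extractInfo above: a non-fatal exception becomes "".
  val extractInfo: Array[Byte] => String =
    (fp: Array[Byte]) => Try(parse(fp)).getOrElse("")
}
```

Calling `TryGetOrElseSketch.extractInfo(Array.empty[Byte])` yields the empty string instead of throwing.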
You can just write
try {
val parser:Parser = new AutoDetectParser()
val handler:BodyContentHandler = new BodyContentHandler(Integer.MAX_VALUE)
val config:TesseractOCRConfig = new TesseractOCRConfig()
val pdfConfig:PDFParserConfig = new PDFParserConfig()
val inputstream:InputStream = new ByteArrayInputStream(fp)
val metadata:Metadata = new Metadata()
val parseContext:ParseContext = new ParseContext()
parseContext.set(classOf[TesseractOCRConfig], config)
parseContext.set(classOf[PDFParserConfig], pdfConfig)
parseContext.set(classOf[Parser], parser)
parser.parse(inputstream, handler, metadata, parseContext)
handler.toString
} catch {
case e: Exception => ""
}
because try is an expression in Scala, just like if or match. However, if you intend to use "" as a sentinel value (that is, check later whether an error happened by checking if the result is empty), don't; use Option[String] or Try[String] as the return type instead.
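As a rough sketch of that last suggestion, the function can return `Try[String]` so that the caller sees the failure explicitly. The `parse` and `describe` names here are hypothetical and only stand in for the Tika-based code and its caller.

```scala
import scala.util.{Failure, Success, Try}

object TryResultSketch {
  // Hypothetical stand-in for the Tika-based extraction.
  def parse(bytes: Array[Byte]): String =
    if (bytes.isEmpty) throw new IllegalArgumentException("nothing to parse")
    else new String(bytes, "UTF-8")

  // The failure stays visible in the type instead of hiding behind "".
  def extractInfo(fp: Array[Byte]): Try[String] = Try(parse(fp))

  // Callers decide what a failure means, for example when reporting.
  def describe(fp: Array[Byte]): String = extractInfo(fp) match {
    case Success(text) => s"parsed ${text.length} characters"
    case Failure(e)    => s"failed: ${e.getMessage}"
  }
}
```

If only success or absence matters, `extractInfo(fp).toOption` gives an `Option[String]` instead.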
How can we implement unit test cases for an AWS Lambda serverless function?
My code is:
object Test1 extends RequestHandler[APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent] with ResponseObjProcess {
override def handleRequest(input: APIGatewayProxyRequestEvent, context: Context): APIGatewayProxyResponseEvent = {
var response = new APIGatewayProxyResponseEvent()
val gson = new Gson
val requestHttpMethod = input.getHttpMethod
val requestBody = input.getBody
val requestHeaders = input.getHeaders
val requestPath = input.getPath
val requestPathParameters = input.getPathParameters
val requestQueryStringParameters = input.getQueryStringParameters
val parsedBody = JSON.parseFull(requestBody).getOrElse(0).asInstanceOf[Map[String, String]]
println(" parsedBody is:: " + parsedBody)
val active = parsedBody.get("active").getOrElse("false")
val created = parsedBody.get("created").getOrElse("0").toLong
val updated = parsedBody.get("updated").getOrElse("0").toLong
requestHttpMethod match {
case "PUT" =>
println(" PUT Request method ")
// insertRecords("alert_summary_report", requestBody)
response.setStatusCode(200)
response.setBody(gson.toJson("PUT"))
case _ =>
println("")
response.setStatusCode(400)
response.setBody(gson.toJson("None"))
}
response
}
}
I tried to implement unit test cases for the above code.
The test code is below:
test("testing record success case") {
var request = new APIGatewayProxyRequestEvent();
request.setHttpMethod(Constants.PUTREQUESTMETHOD)
DELETEREQUESTBODY.put("id", "")
request.setBody(EMPTYREQUESTBODY)
request.setPathParameters(DELETEREQUESTBODY)
println(s"body = ${request.getBody}")
println(s"headers = ${request.getHeaders}")
val response = ProxyRequestMain.handleRequest(subject, testContext)
val assertEqual = response.getStatusCode.equals(200)
assertEqual
}
Actually, response.getStatusCode is 400 (bad request), but the test case still passes. How can I handle this?
I am looking at your test code and it's not clear to me what you are trying to achieve with your assertions. I think you might have mixed up quite a few things. In the code as it currently stands, the last expression is a plain Boolean val, not an assertion, so the test passes regardless of its value. I'd encourage you to have a look at the relevant docs and research the options available to you:
http://www.scalatest.org/user_guide/using_assertions
http://www.scalatest.org/user_guide/using_matchers
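To illustrate the difference between a Boolean value and an assertion, here is a small dependency-free sketch. It uses Scala's built-in `Predef.assert` only to stay self-contained; in a ScalaTest suite you would use the `assert` that ScalaTest provides, as described in the first link. The `statusCode` value is a hypothetical stand-in for `response.getStatusCode`.

```scala
import scala.util.Try

object AssertionSketch {
  // Hypothetical status code, standing in for response.getStatusCode.
  val statusCode: Int = 400

  // A bare Boolean expression: evaluates to false and throws nothing,
  // so a test body that ends with it still passes.
  val bareComparison: Try[Boolean] = Try(statusCode.equals(200))

  // An assertion throws when the condition is false, which is what
  // makes a test fail.
  val realAssertion: Try[Unit] = Try(assert(statusCode == 200))
}
```

Here `bareComparison` is a successful `Try` holding `false`, while `realAssertion` is a failure.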
I'm using IO (cats/scalaz does not matter). And I want to use bracket to close InputStream after I'm done with it. The problem is that I'm reading gzipped files. Here is what I tried:
I (Incorrect).
val io1 = IO(Files.newInputStream(Paths.get("/tmp/file")))
val io2 = io1.map(is => new GZIPInputStream(is))
val io3 = io2.bracket{_ =>
IO(println("use"))
//empty usage
}{ is =>
println("close")
IO(is.close())
}
This is incorrect because if /tmp/file is a broken gzip file with an invalid magic number, the GZIPInputStream constructor throws and we never reach the "resource release" part of the bracket.
II (Incorrect).
val io1 = IO(Files.newInputStream(Paths.get("/tmp/file")))
val io3 = io1.bracket{is =>
val gzis = new GZIPInputStream(is)
IO(println("use"))
//empty usage
}{ is =>
println("close")
IO(is.close())
}
This is incorrect because we are closing the underlying stream, but not the GzipInputStream so we may end up losing some buffered data inside.
In java I could simply do this without flushing:
var is: InputStream = null
try{
is = Files.newInputStream(Paths.get("/tmp/file"))
is = new GZIPInputStream(is)
//use
} finally {
if(is ne null)
is.close()
}
Can you suggest some approach for dealing with GzipInputStream?
It is not a problem to call close on an input stream several times, so you can close the InputStream and the GZIPInputStream separately.
In Java it is common to let try-with-resources handle both streams:
try (InputStream is = Files.newInputStream(Paths.get("/tmp/file"));
GZIPInputStream gzis = new GZIPInputStream(is)){
//use gzis
}
// both streams are closed in the implicit finally clause
You can translate this approach to IO brackets
val io1 = IO(Files.newInputStream(Paths.get("/tmp/file")))
val io2 = io1.bracket { is =>
IO(new GZIPInputStream(is)).bracket { gzis =>
IO(println("using gzis"))
}(gzis => IO(gzis.close()))
}(is => IO(is.close()))
To avoid nested brackets you can use Resource:
def openFile(path: Path) = Resource(IO {
val is = Files.newInputStream(path)
(is, IO(is.close()))
})
def openGZIP(is: InputStream) = Resource(IO {
val gzis = new GZIPInputStream(is)
(gzis, IO(gzis.close()))
})
val gzip: Resource[IO, GZIPInputStream] = for {
is <- openFile(Paths.get("/tmp/file"))
gzis <- openGZIP(is)
} yield gzis
gzip.use {
gzis => IO(println("using gzis"))
}
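For comparison, outside of IO the same "acquire two streams, close both" idea can be sketched with the standard library's `scala.util.Using.Manager` (Scala 2.13+). This is a different technique from bracket and Resource, shown only because it runs without extra dependencies; the `readGzip` and `writeGzipTemp` names are hypothetical.

```scala
import java.io.InputStream
import java.nio.file.{Files, Path}
import java.util.zip.{GZIPInputStream, GZIPOutputStream}
import scala.io.Source
import scala.util.{Try, Using}

object GzipUsingSketch {
  // Reads a gzipped file fully. Each stream is registered with the
  // manager as soon as it is opened and closed in reverse order, so the
  // file stream is closed even if GZIPInputStream's constructor throws.
  def readGzip(path: Path): Try[String] = Using.Manager { use =>
    val is: InputStream = use(Files.newInputStream(path))
    val gzis = use(new GZIPInputStream(is))
    Source.fromInputStream(gzis, "UTF-8").mkString
  }

  // Helper for trying it out: writes text to a gzipped temp file.
  def writeGzipTemp(text: String): Path = {
    val path = Files.createTempFile("sketch", ".gz")
    Using.resource(new GZIPOutputStream(Files.newOutputStream(path))) { out =>
      out.write(text.getBytes("UTF-8"))
    }
    path
  }
}
```

A file with an invalid gzip header makes `readGzip` return a `Failure` rather than leak the underlying stream.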
I am trying to read xml using the following method to extract data from xml
def xmlparser(xml:String): (String,List[String]) =
Try {
val documentbuilder=DocumentBuilderFactory.newInstance.newDocumentBuilder
val xmldocument = documentbuilder.parse(new InputSource(new java.io.StringReader(xml)))
val nodesofchild=xmldocument.getChildNodes
val xmlvalues=extractvalues(nodesofchild)
("xmlname",xmlvalues)
}
I need to return ("xmlname", xmlvalues) if the XML is valid; otherwise I need to return ("xmlname", null). I tried using ".toOption.orNull", but it returns only null. Could somebody help me return ("xmlname", null) instead of null?
Instead of your current code, keep the Try as a value and build the tuple from it:
def xmlparser(xml:String): (String, Option[List[String]]) = {
val values = Try {
val documentbuilder=DocumentBuilderFactory.newInstance.newDocumentBuilder
val xmldocument = documentbuilder.parse(new InputSource(new java.io.StringReader(xml)))
val nodesofchild=xmldocument.getChildNodes
// the last expression is the value of the Try block
extractvalues(nodesofchild)
}
("xmlname", values.toOption)
}
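A self-contained sketch of this approach follows, assuming a hypothetical `extractValues` that simply collects the names of the document's top-level nodes (the original `extractvalues` is not shown in the question). It includes both the `Option` form and the `null` form the question asked for; the point is that the tuple is built outside the `Try`, so only its second element depends on the outcome.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.NodeList
import org.xml.sax.InputSource
import scala.util.Try

object XmlParserSketch {
  // Hypothetical stand-in for extractvalues: names of the given nodes.
  def extractValues(nodes: NodeList): List[String] =
    (0 until nodes.getLength).map(i => nodes.item(i).getNodeName).toList

  // Parsing invalid XML throws, which Try turns into a Failure.
  def parseValues(xml: String): Try[List[String]] = Try {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    val document = builder.parse(new InputSource(new StringReader(xml)))
    extractValues(document.getChildNodes)
  }

  // Preferred: the absence of values is explicit in the type.
  def xmlparser(xml: String): (String, Option[List[String]]) =
    ("xmlname", parseValues(xml).toOption)

  // The shape asked for in the question: the tuple is always built,
  // and only its second element falls back to null.
  def xmlparserOrNull(xml: String): (String, List[String]) =
    ("xmlname", parseValues(xml).toOption.orNull)
}
```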
I'm using some sample Scala code to make a server that receives a file over websocket, stores the file temporarily, runs a bash script on it, and then returns stdout by TextMessage.
Sample code was taken from this github project.
I edited the code slightly within echoService so that it runs another function that processes the temporary file.
object WebServer {
def main(args: Array[String]) {
implicit val actorSystem = ActorSystem("akka-system")
implicit val flowMaterializer = ActorMaterializer()
val interface = "localhost"
val port = 3000
import Directives._
val route = get {
pathEndOrSingleSlash {
complete("Welcome to websocket server")
}
} ~
path("upload") {
handleWebSocketMessages(echoService)
}
val binding = Http().bindAndHandle(route, interface, port)
println(s"Server is now online at http://$interface:$port\nPress RETURN to stop...")
StdIn.readLine()
binding.flatMap(_.unbind()).onComplete(_ => actorSystem.shutdown())
println("Server is down...")
}
implicit val actorSystem = ActorSystem("akka-system")
implicit val flowMaterializer = ActorMaterializer()
val echoService: Flow[Message, Message, _] = Flow[Message].mapConcat {
case BinaryMessage.Strict(msg) => {
val decoded: Array[Byte] = msg.toArray
val imgOutFile = new File("/tmp/" + "filename")
val fileOuputStream = new FileOutputStream(imgOutFile)
fileOuputStream.write(decoded)
fileOuputStream.close()
TextMessage(analyze(imgOutFile))
}
case BinaryMessage.Streamed(stream) => {
stream
.limit(Int.MaxValue) // Max frames we are willing to wait for
.completionTimeout(50 seconds) // Max time until last frame
.runFold(ByteString(""))(_ ++ _) // Merges the frames
.flatMap { (msg: ByteString) =>
val decoded: Array[Byte] = msg.toArray
val imgOutFile = new File("/tmp/" + "filename")
val fileOuputStream = new FileOutputStream(imgOutFile)
fileOuputStream.write(decoded)
fileOuputStream.close()
Future(Source.single(""))
}
TextMessage(analyze(imgOutFile))
}
private def analyze(imgfile: File): String = {
val p = Runtime.getRuntime.exec(Array("./run-vision.sh", imgfile.toString))
val br = new BufferedReader(new InputStreamReader(p.getInputStream, StandardCharsets.UTF_8))
try {
val result = Stream
.continually(br.readLine())
.takeWhile(_ ne null)
.mkString
result
} finally {
br.close()
}
}
}
}
During testing using Dark WebSocket Terminal, case BinaryMessage.Strict works fine.
Problem: However, case BinaryMessage.Streamed doesn't finish writing the file before running the analyze function, resulting in a blank response from the server.
I'm trying to wrap my head around how Futures are being used here with the Flows in Akka-HTTP, but I'm not having much luck outside trying to get through all the official documentation.
Currently, .mapAsync seems promising, or basically finding a way to chain futures.
I'd really appreciate some insight.
Yes, mapAsync will help you in this case. It is a combinator that executes Futures (potentially in parallel) in your stream and presents their results on the output side.
In your case, to make things homogeneous and keep the type checker happy, you'll need to wrap the result of the Strict case in a Future.successful.
A quick fix for your code could be:
val echoService: Flow[Message, Message, _] = Flow[Message].mapAsync(parallelism = 5) {
case BinaryMessage.Strict(msg) => {
val decoded: Array[Byte] = msg.toArray
val imgOutFile = new File("/tmp/" + "filename")
val fileOuputStream = new FileOutputStream(imgOutFile)
fileOuputStream.write(decoded)
fileOuputStream.close()
Future.successful(TextMessage(analyze(imgOutFile)))
}
case BinaryMessage.Streamed(stream) =>
stream
.limit(Int.MaxValue) // Max frames we are willing to wait for
.completionTimeout(50 seconds) // Max time until last frame
.runFold(ByteString(""))(_ ++ _) // Merges the frames
.flatMap { (msg: ByteString) =>
val decoded: Array[Byte] = msg.toArray
val imgOutFile = new File("/tmp/" + "filename")
val fileOuputStream = new FileOutputStream(imgOutFile)
fileOuputStream.write(decoded)
fileOuputStream.close()
Future.successful(TextMessage(analyze(imgOutFile)))
}
}
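The underlying idea, independent of Akka, is that work depending on a Future's result has to be chained inside `map` or `flatMap`, and that an already-computed value is lifted with `Future.successful` so both branches share one type. A minimal sketch with plain `scala.concurrent` follows; `writeFile` and `analyze` here are hypothetical stand-ins for the file write and the shell script call.

```scala
import scala.concurrent.{ExecutionContext, Future}

object FutureChainingSketch {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical stand-in for "write the uploaded bytes to disk".
  def writeFile(bytes: Array[Byte]): Future[Int] = Future {
    Thread.sleep(50) // simulate slow I/O
    bytes.length
  }

  // Hypothetical stand-in for running the analysis script.
  def analyze(size: Int): String = s"analyzed $size bytes"

  // analyze runs only after writeFile has completed, because it is
  // chained with map; code placed after the chain would run at once.
  def handleStreamed(bytes: Array[Byte]): Future[String] =
    writeFile(bytes).map(size => analyze(size))

  // The already-available result is lifted into a Future so that both
  // cases return Future[String].
  def handleStrict(bytes: Array[Byte]): Future[String] =
    Future.successful(analyze(bytes.length))
}
```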
I am upgrading Play Framework from 2.3 to 2.4. After changing the versions, compiling the same code gives the following error. Since I am a novice at Scala, I am trying to learn it to solve this issue, but I still don't know what the problem is. What I want to do is add a request header value to the original request headers. Any help will be appreciated.
[error] /mnt/garner/project/app-service/app/com/company/playframework/filters/LoggingFilter.scala:26: not enough arguments for constructor Headers: (headers: Seq[(String, String)])play.api.mvc.Headers.
[error] Unspecified value parameter headers.
[error] val newHeaders = new Headers { val data = (requestHeader.headers.toMap
The LoggingFilter class
class LoggingFilter extends Filter {
val logger = AccessLogger.getInstance();
def apply(next: (RequestHeader) => Future[Result])(requestHeader: RequestHeader): Future[Result] = {
val startTime = System.currentTimeMillis
val requestId = logger.createLog();
val newHeaders = new Headers { val data = (requestHeader.headers.toMap
+ (AccessLogger.X_HEADER__REQUEST_ID -> Seq(requestId))).toList }
val newRequestHeader = requestHeader.copy(headers = newHeaders)
next(newRequestHeader).map { result =>
val endTime = System.currentTimeMillis
val requestTime = endTime - startTime
val bytesToString: Enumeratee[ Array[Byte], String ] = Enumeratee.map[Array[Byte]]{ bytes => new String(bytes) }
val consume: Iteratee[String,String] = Iteratee.consume[String]()
val resultBody : Future[String] = result.body |>>> bytesToString &>> consume
resultBody.map {
body =>
logger.finish(requestId, result.header.status, requestTime, body)
}
result;
}
}
}
Edit
I updated the code as follows and it compiled.
The following code was changed from
val newHeaders = new Headers { val data = (requestHeader.headers.toMap
+ (AccessLogger.X_HEADER__REQUEST_ID -> Seq(requestId))).toList }
to
val newHeaders = new Headers((requestHeader.headers.toSimpleMap
+ (AccessLogger.X_HEADER__REQUEST_ID -> requestId)).toList)
It simply states that if you want to construct Headers with new, you need to supply a parameter named headers of type Seq[(String, String)]. If you omit the initial new, you will be using the apply function of the companion object for Headers, which takes a vararg of (String, String); an existing sequence has to be expanded with : _* when passed that way. If you look at the documentation https://www.playframework.com/documentation/2.4.x/api/scala/index.html#play.api.mvc.Headers and flip between the docs for object and class, it should become clear.
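The constructor-versus-apply distinction can be sketched with a small stand-in class. `SimpleHeaders` below is hypothetical and only mirrors the shape described above; it is not Play's own `Headers` class, and the header names and values are made up for illustration.

```scala
object HeadersShapeSketch {
  // Hypothetical stand-in: a constructor taking one sequence of pairs.
  class SimpleHeaders(val headers: Seq[(String, String)])

  object SimpleHeaders {
    // Companion apply taking varargs, used when `new` is omitted.
    def apply(headers: (String, String)*): SimpleHeaders =
      new SimpleHeaders(headers)
  }

  val original = List("Accept" -> "text/html")
  val extra = "X-Request-Id" -> "abc123"

  // With `new`: pass the whole sequence as a single argument.
  val viaConstructor = new SimpleHeaders(original :+ extra)

  // Without `new`: pass pairs directly, or expand a sequence with `: _*`.
  val viaApply = SimpleHeaders((original :+ extra): _*)
}
```

Both values end up holding the same two header pairs.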