Play Iteratees: error for simple file iteration - scala

I'm currently trying to wrap my head around the idea of Enumerators and Iteratees. I decided to start off by looking at Play 2.0's iteratee library, which I've added to my test project with the following lines in my build.sbt file. (I am using Scala 2.10) (docs here)
resolvers += "Typesafe repository" at "http://repo.typesafe.com/typesafe/releases/"
libraryDependencies += "play" %% "play-iteratees" % "2.1.1"
My goal is to create an Enumerator over the bytes of a file, and eventually attach some parsing logic to it, but when I try what appears to be a simple thing, I get an exception.
My code looks like this:
val instr = getClass.getResourceAsStream(...)
val streamBytes = for {
  chunk <- Enumerator fromStream instr
  byte <- Enumerator enumerate chunk
} yield byte

val printer = Iteratee.foreach[Byte](println)
streamBytes.apply(printer)
What happens is that (what I assume is) all of the bytes in the file get printed, and then I get an IllegalStateException saying "Promise already completed".
java.lang.IllegalStateException: Promise already completed.
at scala.concurrent.Promise$class.complete(Promise.scala:55)
at scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:58)
at scala.concurrent.Promise$class.failure(Promise.scala:107)
at scala.concurrent.impl.Promise$DefaultPromise.failure(Promise.scala:58)
at scala.concurrent.Future$$anonfun$flatMap$1.liftedTree3$1(Future.scala:283)
at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:277)
at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:274)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:29)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
Since the stack trace doesn't point to anywhere in my code, and this is unfamiliar territory, I have no idea what's going wrong. Can anyone offer some insight or a solution to this problem?

See if this works for you. I was getting exceptions with your code too, but when I unwound your for-comprehension, things worked. I'm not 100% sure why, because I thought the for-comprehension desugared to this code anyway, but I must be missing something:
val bytes = Enumerator fromStream instr flatMap (Enumerator enumerate _)
val printer = Iteratee.foreach[Byte](b => println(b))
bytes |>> printer
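For what it's worth, the for-comprehension doesn't desugar to exactly that flatMap: the yield adds a map over the inner enumerator. Here is a sketch of the desugared form, which is the only difference I can point to (whether that extra map is what trips up the iteratee machinery, I can't say):

// What the for-comprehension desugars to: note the trailing .map(byte => byte)
// on the inner enumerator, which the unwound version above does not have.
val streamBytes = (Enumerator fromStream instr)
  .flatMap(chunk => (Enumerator enumerate chunk).map(byte => byte))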


Isabelle/HOL theory (HOL.Imperative_HOL.ex.Imperative_Quicksort) as JSON with scala-isabelle and the Lift framework

I am using https://github.com/dominique-unruh/scala-isabelle/ to digest the Isabelle/HOL formalization of the quicksort algorithm, https://isabelle.in.tum.de/library/HOL/HOL-Imperative_HOL/Imperative_Quicksort.html. I managed to import the Quicksort theory into a Context via
val ctxt = Context("HOL.Imperative_HOL.ex.Imperative_Quicksort")
Now I expect that ctxt contains the AST of Imperative_Quicksort.thy, which I would like to convert into a JSON object tree. I am using the Lift framework for that. My build.sbt contains
libraryDependencies ++= {
  val liftVersion = "3.4.3"
  Seq(
    "net.liftweb" %% "lift-webkit" % liftVersion % "compile",
    "ch.qos.logback" % "logback-classic" % "1.2.3"
  )
}
And the code is
val ctxt = Context("HOL.Imperative_HOL.ex.Imperative_Quicksort")
import net.liftweb.json._
import net.liftweb.json.Serialization.write
implicit val formats = net.liftweb.json.DefaultFormats
val jsonString = write(ctxt)
println("before jsonString")
println(jsonString)
println("after jsonString")
which gives only this meagre output:
before jsonString
{"mlValue":{"id":{"_fun":{},"_ec":{},"_arg":null,"_xform":2}}}
after jsonString
I guess this is a JSON serialization issue. ctxt definitely contains the Imperative_Quicksort theory, but there is some problem with the translation to JSON.
How can I output the entire theory as a JSON object tree for the AST of Imperative_Quicksort.thy?
There are several problems with this approach:
Using Lift to translate scala-isabelle objects: This will generally not work. I assume that Lift uses reflection (I don't know whether runtime or compile-time) to serialize the internal structure of the objects. (It even encodes fields that are private, i.e., not part of the API.) However, many objects in scala-isabelle (including Context or Term) have a more complex internal structure. For example, a Context simply contains a reference to an object stored inside the Isabelle process. (I guess that "_xform":2 is the ID referencing the object inside the Isabelle process.) An Isabelle context is not serializable in principle (it is a datatype that contains closures); the only way to access it is by applying the various ML functions that Isabelle provides (and that can be mirrored on the Scala side).

A Term, on the other hand, can be serialized. On the Isabelle side it is a simple datatype. However, the scala-isabelle Term is a bit more complex for efficiency reasons: data from the Isabelle process is transferred only on demand. This is why something that simply uses reflection will not get the whole term unless it has already been transferred. You can serialize a Term by writing a simple recursive function using pattern matching (see the doc, and the sketch below). Note, however, that a term can be a huge data structure with a lot of repetition: for example, type information is repeated over and over and blows up the term hugely.
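As a minimal sketch of that recursive approach (assuming scala-isabelle's documented Term constructors and an implicit Isabelle instance in scope, and deliberately dropping the types for the reason just mentioned), you could build a Lift JValue by hand:

import scala.concurrent.ExecutionContext.Implicits.global
import de.unruh.isabelle.control.Isabelle
import de.unruh.isabelle.pure._
import net.liftweb.json.JsonAST.JValue
import net.liftweb.json.JsonDSL._

// Sketch only: recursively convert a scala-isabelle Term into a Lift JValue.
// Type annotations are deliberately omitted; including them blows up the output.
def termToJson(t: Term)(implicit isabelle: Isabelle): JValue = t match {
  case Const(name, _)      => ("kind" -> "Const") ~ ("name" -> name)
  case Free(name, _)       => ("kind" -> "Free") ~ ("name" -> name)
  case Var(name, index, _) => ("kind" -> "Var") ~ ("name" -> name) ~ ("index" -> index)
  case Bound(index)        => ("kind" -> "Bound") ~ ("index" -> index)
  case Abs(name, _, body)  => ("kind" -> "Abs") ~ ("binder" -> name) ~ ("body" -> termToJson(body))
  case App(fun, arg)       => ("kind" -> "App") ~ ("fun" -> termToJson(fun)) ~ ("arg" -> termToJson(arg))
}

This works for terms, but not for a Context, which, as said above, cannot be serialized in principle.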
Getting an AST of the Isabelle theory:
I feel there is a misconception here about what an Isabelle context is. It does not contain an AST of a theory (or anything related to the source code of the theory). Instead, it is the result of evaluating the commands in the theory.

The Isabelle processing model works very roughly as follows: the theory file is split into commands (e.g., lemma ..., apply ..., etc.). Each command comes with its own parser that returns a function (a closure), not an AST. This function is then applied to the current state of the theory/proof and transforms it (e.g., adds a new definition to it). At no point is an AST generated. (The state of this processing is ToplevelState in scala-isabelle, and it may contain a context or a theory or other things, depending on the last command.)

So I am doubtful that there is a way to get an AST of a theory at all (no matter whether scala-isabelle is used or whether it is done directly in Isabelle/ML). The only way, as far as I know, is to implement your own parser that imperfectly mimics Isabelle's parsing and constructs an AST.

How to read a file via SFTP in Scala

I am looking for a simple way to read a file (and maybe a directory) via the SFTP protocol in Scala.
My attempts so far:
I've looked at the Alpakka library, which is part of Akka. But it works with streams, which is a complex topic I am not familiar with, and it seems like too much effort for this.
Then there is spark-sftp: this needs Spark, which would be a bit much just to load a file.
There is the JSch library for Java that could do the job, but I could not get it to work.
I am looking for actual working code that uses a library and SFTP, rather than the plain scp I am currently forced to use. I've found that there are not many examples of this on the web, and the ones I have found are much more complex.
Here is a working example, using sshj:
import net.schmizz.sshj.SSHClient
import net.schmizz.sshj.sftp.SFTPClient

object Main extends App {
  val hostname = "myServerName"
  val username = "myUserName"
  val password = "thePassword"
  val destinationFile = "C:/Temp/Test.txt"
  val sourceFile = "./Test.txt"

  val ssh = new SSHClient()
  // Trust this specific host key fingerprint instead of a known_hosts file
  ssh.addHostKeyVerifier("xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx")
  ssh.connect(hostname)
  ssh.authPassword(username, password)
  val sftp: SFTPClient = ssh.newSFTPClient()
  sftp.get(sourceFile, destinationFile) // download sourceFile to destinationFile
  sftp.close()
  ssh.disconnect()
}
I tested this on Scala 2.13.4 with the following entries in build.sbt:
libraryDependencies += "com.hierynomus" % "sshj" % "0.31.0"
libraryDependencies += "ch.qos.logback" % "logback-classic" % "1.2.3"
I would not recommend actually using it this way. Some of these steps should be wrapped in a Try, with error checking for the cases where the file doesn't exist, the connection fails, and so on. I intentionally left that out for clarity; a sketch of such a wrapper follows.
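For illustration, here is a minimal sketch of such a wrapper on Scala 2.13, using scala.util.Using so that both clients are closed even when something fails. The download helper and its parameters are my invention, not part of sshj:

import scala.util.{Try, Using}
import net.schmizz.sshj.SSHClient

// Sketch only: failures (unreachable host, wrong password, missing file)
// surface as a Failure, and both clients are closed in all cases.
def download(hostname: String, username: String, password: String,
             sourceFile: String, destinationFile: String): Try[Unit] =
  Using(new SSHClient()) { ssh =>
    ssh.addHostKeyVerifier("xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx")
    ssh.connect(hostname)
    ssh.authPassword(username, password)
    Using.resource(ssh.newSFTPClient()) { sftp =>
      sftp.get(sourceFile, destinationFile)
    }
  }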
I am not saying that this is the only or the right library for this task; it is just the first one that worked for me. The addHostKeyVerifier method in particular was very helpful in my case.
There are also other libraries like JSch, jassh, scala-ssh, and scala-ftp which could very well do the job.

sbt illegal dynamic reference in runMain

I'm trying to run a code generator and pass it the filename to which it should write its output:
resourceGenerators in (proj, Compile) += Def.task {
  val file = (resourceManaged in (proj, Compile)).value / "swagger.yaml"
  (runMain in (proj, Compile)).toTask(s"api.swagger.SwaggerDump $file").value
  Seq(file)
}.value
However, this gives me:
build.sbt:172: error: Illegal dynamic reference: file
(runMain in (proj, Compile)).toTask(s"api.swagger.SwaggerDump $file").value
Your code snippet has two problems:
You use { ... }.value instead of { ... }.taskValue. The type of resource generators is Seq[Task[Seq[File]]], and when you call value you get Seq[File], not Task[Seq[File]]. That causes a legitimate compile error.
The dynamic variable file is used as the argument of toTask, which the current macro implementation prohibits.
Why static?
sbt forces task implementations to have static dependencies on other tasks. Otherwise, sbt cannot perform task deduplication and cannot provide correct information in the inspect commands. That means that whatever task evaluation you perform inside a task cannot depend on a variable (a value known only at runtime), as your file in toTask does.
To overcome this limitation, there exist dynamic tasks, whose body allows you to return a task. Every "dynamic dependency" has to be defined inside a dynamic task, and you can then depend on the hoisted-up dynamic values in the task that you return.
Dynamic solution
The following Scastie is the correct implementation of your task. I copy-paste the code so that folks can have a quick look, but go to that Scastie to check that it successfully compiles and runs.
resourceGenerators in (proj, Compile) += Def.taskDyn {
  val file = (resourceManaged in (proj, Compile)).value / "swagger.yaml"
  Def.task {
    (runMain in (proj, Compile))
      .toTask(s"api.swagger.SwaggerDump $file")
      .value
    Seq(file)
  }
}.taskValue
Discussion
If you had fixed the taskValue error, should your task implementation have compiled correctly?
In my opinion, yes, but I haven't looked at the internal implementation closely enough to assert that your task implementation does not hinder task deduplication and dependency extraction. If it does not, the illegal-reference check should disappear.
This is a current limitation of sbt that I would like to get rid of, either by improving the whole macro implementation (hoisting up values and making sure that the dependency analysis covers more cases) or by making the illegal-reference checks less pessimistic. However, this is a hard problem, it takes time, and it's not likely to happen in the short term.
If this is an issue for you, please file a ticket in sbt/sbt. This is the only way to know the urgency of fixing this issue, if any. For now, the best we can do is to document it.

Spark Task not Serializable with simple accumulator?

I am running this simple code:
val accum = sc.accumulator(0, "Progress")
listFilesPar.foreach { filepath =>
  accum += 1
}
listFilesPar is an RDD[String].
This throws the following error:
org.apache.spark.SparkException: Task not serializable
Right now I don't understand what's happening. I use braces rather than parentheses because I need to write a lengthy function; I am just doing unit testing.
The typical cause of this is that the closure unexpectedly captures something: something you did not include in your paste, because you would never expect it to be serialized.
You can try to reduce your code until you find it, or just turn on serialization debug logging with -Dsun.io.serialization.extendedDebugInfo=true. You will probably see in the output that Spark tries to serialize something silly.
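For illustration, here is a hypothetical shape of the problem and the usual fix. If the accumulator is a field of an enclosing class (a test suite, say), the closure captures this, and Spark then tries to serialize the whole non-serializable instance; copying the field into a local val first avoids that. The ProgressTest class is invented for the example:

import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Invented example: the enclosing class itself is not serializable.
class ProgressTest(sc: SparkContext, listFilesPar: RDD[String]) {
  val accum = sc.accumulator(0, "Progress")

  // Throws "Task not serializable": accum is a field, so the closure
  // captures `this` and Spark attempts to serialize all of ProgressTest.
  def broken(): Unit = listFilesPar.foreach { _ => accum += 1 }

  // Works: only the local val (and hence just the accumulator) is captured.
  def fixed(): Unit = {
    val localAccum = accum
    listFilesPar.foreach { _ => localAccum += 1 }
  }
}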

Forbid calling `get` on Option and generate a compile error

If I want to generate a compile-time error when calling .get on any Option value, how do I go about doing this?
I haven't written any custom macros, but I guess it's about time for it? Any pointers?
There is a compiler plugin called wartremover that provides what you want.
https://github.com/typelevel/wartremover
It has errors and warnings for some Scala functions that should be avoided for safety reasons.
This is the description of the OptionPartial wart from the GitHub readme page:
scala.Option has a get method which will throw if the value is None. The program should be refactored to use scala.Option#fold to explicitly handle both the Some and None cases.
compiler plugin
To add wartremover as a plugin to scalac, add this to your project/plugins.sbt:
resolvers += Resolver.sonatypeRepo("releases")
addSbtPlugin("org.brianmckenna" % "sbt-wartremover" % "0.11")
And activate it in your build.sbt:
wartremoverErrors ++= Warts.unsafe
macro
https://github.com/typelevel/wartremover/blob/master/OTHER-WAYS.md describes other ways to use the plugin, one of them being as a macro, as mentioned in the question.
Add wartremover as a library to your build.sbt:
resolvers += Resolver.sonatypeRepo("releases")
libraryDependencies += "org.brianmckenna" %% "wartremover" % "0.11"
You can make any wart into a macro, like so:
scala> import language.experimental.macros
import language.experimental.macros
scala> import org.brianmckenna.wartremover.warts.Unsafe
import org.brianmckenna.wartremover.warts.Unsafe
scala> def safe(expr: Any):Any = macro Unsafe.asMacro
safe: (expr: Any)Any
scala> safe { 1.some.get }
<console>:10: error: Option#get is disabled - use Option#fold instead
safe { 1.some.get }
The example is adapted from the wartremover GitHub page.
Not strictly an answer to your question, but you might prefer to use Scalaz's Maybe type, which avoids this problem by not having a .get method.
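A small sketch of what that looks like, assuming scalaz 7.1+ where Maybe lives: cata forces you to handle both cases, and there is simply no get to reach for.

import scalaz.Maybe

val m: Maybe[Int] = Maybe.just(42)

// No .get exists on Maybe; both cases must be handled explicitly.
val doubled: Int = m.cata(n => n * 2, 0)           // 84
val fallback: Int = Maybe.empty[Int].getOrElse(-1) // -1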