Pattern match in foreach and then do a final step - scala

Is it possible to do anything after a pattern match in a foreach statement?
I want to do a post match step e.g. to set a variable. I also want to force a Unit return as my foreach is String => Unit, and by default Scala wants to return the last statement.
Here is some code:
Iteratee.foreach[String](_ match {
case "date" => out.push("Current date: " + new Date().toString + "<br/>")
case "since" => out.push("Last command executed: " + (ctm - last) + "ms before now<br/>")
case unknow => out.push("Command: " + unknown + " not recognized <br/>")
} // here I would like to set "last = ctm" (will be a Long)
)
UPDATED:
New code and context. Also new questions added :) They are embedded in the comments.
def socket = WebSocket.using[String] { request =>
// Comment from an answer bellow but what are the side effects?
// By convention, methods with side effects takes an empty argument list
def ctm(): Long = System.currentTimeMillis
var last: Long = ctm
// Command handlers
// Comment from an answer bellow but what are the side effects?
// By convention, methods with side effects takes an empty argument list
def date() = "Current date: " + new Date().toString + "<br/>"
def since(last: Long) = "Last command executed: " + (ctm - last) + "ms before now<br/>"
def unknown(cmd: String) = "Command: " + cmd + " not recognized <br/>"
val out = Enumerator.imperative[String] {}
// How to transform into the mapping strategy given in lpaul7's nice answer.
lazy val in = Iteratee.foreach[String](_ match {
case "date" => out.push(date)
case "since" => out.push(since(last))
case unknown => out.push(unknown)
} // Here I want to update the variable last to "last = ctm"
).mapDone { _ =>
println("Disconnected")
}
(in, out)
}

I don't know what your ctm is, but you could always do this:
val xs = List("date", "since", "other1", "other2")
xs.foreach { str =>
str match {
case "date" => println("Match Date")
case "since" => println("Match Since")
case unknow => println("Others")
}
println("Put your post step here")
}
Note you should use {} instead of () when you want use a block of code as the argument of foreach().

I will not answer your question, but I should note that reassigning variables in Scala is a bad practice. I suggest you to rewrite your code to avoid vars.
First, transform your strings to something else:
val strings = it map {
case "date" => "Current date: " + new Date().toString + "<br/>"
case "since" => "Last command executed: " + (ctm - last) + "ms before now<br/>"
case unknow => "Command: " + unknown + " not recognized <br/>"
}
Next, push it
strings map { out.push(_) }
It looks like your implementation of push has side effects. Bad for you, because such methods makes your program unpredictable. You can easily avoid side effects by making push return a tuple:
def push(s: String) = {
...
(ctm, last)
}
And using it like:
val (ctm, last) = out.push(str)
Update:
Of course side effects are needed to make programs useful. I only meant that methods depending on outer variables are less predictable than pure one, it is hard to reason about it. It is easier to test methods without side effects.
Yes, you should prefer vals over vars, it makes your program more "functional" and stateless. Stateless algorithms are thread safe and very predictable.
It seems like your program is stateful by nature. At least, try to stay as "functional" and stateless as you can :)
My suggested solution of your problem is:
// By convention, methods with side effects takes an empty argument list
def ctm(): Long = // Get current time
// Command handlers
def date() = "Current date: " + new Date().toString + "<br/>"
def since(last: Long) = "Last command executed: " + (ctm() - last) + "ms before now<br/>"
def unknown(cmd: String) = "Command: " + unknown + " not recognized <br/>"
// In your cmd processing loop
// First, map inputs to responses
val cmds = inps map {
case "date" => date()
case "since" => since(last)
case unk => unknown(unk)
}
// Then push responses and update state
cmds map { response =>
out.push(response)
// It is a good place to update your state
last = ctm()
}
It is hard to test this without context of your code, so you should fit it to your needs yourself. I hope I've answered your question.

Related

Use of Scala Loan pattern in Success Case

I'm following the tutorial from Alvin Alexander to use Loan Pattern
Here is the code what I use -
val year = 2016
val nationalData = {
val source = io.Source.fromFile(s"resources/Babynames/names/yob$year.txt")
// names is iterator of String, split() gives the array
//.toArray & toSeq is a slow process compare to .toSet // .toSeq gives Stream Closed error
val names = source.getLines().filter(_.nonEmpty).map(_.split(",")(0)).toSet
source.close()
names
// println(names.mkString(","))
}
println("Names " + nationalData)
val info = for (stateFile <- new java.io.File("resources/Babynames/namesbystate").list(); if stateFile.endsWith(".TXT")) yield {
val source = io.Source.fromFile("resources/Babynames/namesbystate/" + stateFile)
val names = source.getLines().filter(_.nonEmpty).map(_.split(",")).
filter(a => a(2).toInt == year).map(a => a(3)).toArray // .toSet
source.close()
(stateFile.take(2), names)
}
println(info(0)._2.size + " names from state "+ info(0)._1)
println(info(1)._2.size + " names from state "+ info(1)._1)
for ((state, sname) <- info) {
println("State: " +state + " Coverage of name in "+ year+" "+ sname.count(n => nationalData.contains(n)).toDouble / nationalData.size) // Set doesn't have length method
}
This is how I applied readTextFile, readTextFileWithTry on the above code to learn/experiment Loan Pattern in the above code
def using[A <: { def close(): Unit }, B](resource: A)(f: A => B): B =
try {
f(resource)
} finally {
resource.close()
}
def readTextFile(filename: String): Option[List[String]] = {
try {
val lines = using(fromFile(filename)) { source =>
(for (line <- source.getLines) yield line).toList
}
Some(lines)
} catch {
case e: Exception => None
}
}
def readTextFileWithTry(filename: String): Try[List[String]] = {
Try {
val lines = using(fromFile(filename)) { source =>
(for (line <- source.getLines) yield line).toList
}
lines
}
}
val year = 2016
val data = readTextFile(s"resources/Babynames/names/yob$year.txt") match {
case Some(lines) =>
val n = lines.filter(_.nonEmpty).map(_.split(",")(0)).toSet
println(n)
case None => println("couldn't read file")
}
val data1 = readTextFileWithTry("resources/Babynames/namesbystate")
data1 match {
case Success(lines) => {
val info = for (stateFile <- data1; if stateFile.endsWith(".TXT")) yield {
val source = fromFile("resources/Babynames/namesbystate/" + stateFile)
val names = source.getLines().filter(_.nonEmpty).map(_.split(",")).
filter(a => a(2).toInt == year).map(a => a(3)).toArray // .toSet
(stateFile.take(2), names)
println(names)
}
}
But in the second case, readTextFileWithTry, I am getting the following error -
Failed, message is: java.io.FileNotFoundException: resources\Babynames\namesbystate (Access is denied)
I guess the reason for the failure is from SO what I understand -
I am trying to open the same file on each iteration of the for loop
Apart from that, I have few concerns regarding how I use -
Is it the good way to use? Can some help me how can I use the TRY on multiple occasions?
I tried to change the return type of readTextFileWithTry like Option[A] or Set/Map or Scala Collection to apply higher-order functions later on that. but not able to succeed. Not sure that is a good practice or not.
How can I use higher-order functions in Success case, as there are multiple operations and in Success case the code blocks get bigger? I can't use any field outside of Success case.
Can someone help me to understand?
I think that you problem has nothing to do with "I am trying to open the same file on each iteration of the for loop" and it is actually the same as in the accepted answer
Unfortunately you didn't provide stack trace so it is not clear on which line this happens. I would guess that the falling call is
val data1 = readTextFileWithTry("resources/Babynames/namesbystate")
And looking at your first code sample:
val info = for (stateFile <- new java.io.File("resources/Babynames/namesbystate").list(); if stateFile.endsWith(".TXT")) yield {
it looks like the path "resources/Babynames/namesbystate" points to a directory. But in your second example you are trying to read it as a file and this is the reason for the error. It comes from the fact that your readTextFileWithTry is not a valid substitute for java.io.File.list call. And File.list doesn't need a wrapper because it doesn't use any intermediate closeable/disposable entity.
P.S. it might make more sense to use File.list(FilenameFilter filter) instead of if stateFile.endsWith(".TXT"))

What advantages does scala.util.Try have over try..catch?

Searching online for the answer gives two prominent posts (Codacy's and Daniel Westheide's), and both give the same answer as Scala's official documentation for Try:
An important property of Try shown in the above example is its ability to pipeline, or chain, operations, catching exceptions along the way.
The example referenced above is:
import scala.io.StdIn
import scala.util.{Try, Success, Failure}
def divide: Try[Int] = {
val dividend = Try(StdIn.readLine("Enter an Int that you'd like to divide:\n").toInt)
val divisor = Try(StdIn.readLine("Enter an Int that you'd like to divide by:\n").toInt)
val problem = dividend.flatMap(x => divisor.map(y => x/y))
problem match {
case Success(v) =>
println("Result of " + dividend.get + "/"+ divisor.get +" is: " + v)
Success(v)
case Failure(e) =>
println("You must've divided by zero or entered something that's not an Int. Try again!")
println("Info from the exception: " + e.getMessage)
divide
}
}
But I can just as easily pipeline operations using the conventional try block:
def divideConventional: Int = try {
val dividend = StdIn.readLine("Enter an Int that you'd like to divide:\n").toInt
val divisor = StdIn.readLine("Enter an Int that you'd like to divide by:\n").toInt
val problem = dividend / divisor
println("Result of " + dividend + "/"+ divisor +" is: " + problem)
problem
} catch {
case (e: Throwable) =>
println("You must've divided by zero or entered something that's not an Int. Try again!")
println("Info from the exception: " + e.getMessage)
divideConventional
}
(Note: divide and divideConventional differ slightly in behavior in that the latter errors out at the first sign of trouble, but that's about it. Try entering "10a" as input to dividend to see what I mean.)
I'm trying to see scala.util.Try's pipelining advantage, but to me it seems the two methods are equal. What am I missing?
I think you're having difficulty seeing the composition abilities of Try[T] because you're handling exceptions locally in both cases. What if you wanted to compose divideConventional with an additional operation?
We'd having some like:
def weNeedAnInt(i: Int) = i + 42
Then we'd have something like:
weNeedAnInt(divideConventional())
But let's say you want to max out the number of retries you allow the user to input (which is usually what you have in real life scenarios, where you can't re-enter a method forever? We'd have to additionally wrap the invocation of weNeedAnInt itself with a try-catch:
try {
weNeedAnInt(divideConventional())
} catch {
case NonFatal(e) => // Handle?
}
But if we used divide, and let's say it didn't handle exceptions locally and propogated the internal exception outwards:
def yetMoreIntsNeeded(i: Int) = i + 64
val result = divide.map(weNeedAnInt).map(yetMoreIntsNeeded) match {
case Failure(e) => -1
case Success(myInt) => myInt
}
println(s"Final output was: $result")
Is this not simpler? Perhaps, I think this has some subjectivity to the answer, I find it cleaner. Imagine we had a long pipeline of such operations, we can compose each Try[T] to the next, and only worry about the problems when once the pipeline completes.

Scala Future, flatMap that works on Either

Is there really a way to transform an object of type Future[Either[Future[T1], Future[T2]]] to and object of type Either[Future[T1], Future[T2]] ??
Maybe something like flatMap that works on Either....
I'm trying to make this code work (I have similar code that implements wrapped chain-of actions, but it doesn't involve future. It works, much simpler). The code below is based on that, with necessary modification to make it work for situation that involves futures.
case class WebServResp(msg: String)
case class WebStatus(code: Int)
type InnerActionOutType = Either[Future[Option[WebServResp]], Future[WebStatus]]
type InnerActionSig = Future[Option[WebServResp]] => Either[Future[Option[WebServResp]], Future[WebStatus]]
val chainOfActions: InnerActionSig = Seq(
{prevRespOptFut =>
println("in action 1: " + prevRespOptFut)
//dont care about prev result
Left(Future.successful(Some(WebServResp("result from 1"))))
},
{prevRespOptFut =>
println("in action 2: " + prevFutopt)
prevRespOptFut.map {prevRespOpt =>
//i know prevResp contains instance of WebServResp. so i skip the opt-matching
val prevWebServResp = prevRespOpt.get
Left(Some(prevWebServResp.msg + " & " + " additional result from 2"))
}
//But the outcome of the map above is: Future[Left(...)]
//What I want is Left(Future[...])
}
)
type WrappedActionSig = InnerActionOutType => InnerActionOutType
val wrappedChainOfActions = chainOfActions.map {innerAction =>
val wrappedAction: WrappedActionSig = {respFromPrevWrappedAction =>
respFromPrevWrappedAction match {
case Left(wsRespOptFut) => {
innerAction(wsRespOptFut)
}
case Right(wsStatusFut) => {
respFromPrevWrappedAction
}
}
}
wrappedAction
}
wrappedChainOfActions.fold(identity[WrappedActionIOType] _) ((l, r) => l andThen r).apply(Left(None))
UPDATE UPDATE UPDATE
Based on comments from Didier below ( Scala Future, flatMap that works on Either )... here's a code that works:
//API
case class WebRespString(str: String)
case class WebStatus(code: Int, str: String)
type InnerActionOutType = Either[Future[Option[WebRespString]], Future[WebStatus]]
type InnerActionSig = Future[Option[WebRespString]] => InnerActionOutType
type WrappedActionSig = InnerActionOutType => InnerActionOutType
def executeChainOfActions(chainOfActions: Seq[InnerActionSig]): Future[WebStatus] = {
val wrappedChainOfActions : Seq[WrappedActionSig] = chainOfActions.map {innerAction =>
val wrappedAction: WrappedActionSig = {respFromPrevWrappedAction =>
respFromPrevWrappedAction match {
case Left(wsRespOptFut) => {
innerAction(wsRespOptFut) }
case Right(wsStatusFut) => {
respFromPrevWrappedAction
}
}
}
wrappedAction
}
val finalResultPossibilities = wrappedChainOfActions.fold(identity[InnerActionOutType] _) ((l, r) => l andThen r).apply(Left(Future.successful(None)))
finalResultPossibilities match {
case Left(webRespStringOptFut) => webRespStringOptFut.map {webRespStringOpt => WebStatus(200, webRespStringOpt.get.str)}
case Right(webStatusFut) => webStatusFut
}
}
//API-USER
executeChainOfActions(Seq(
{prevRespOptFut =>
println("in action 1: " + prevRespOptFut)
//dont care about prev result
Left(Future.successful(Some(WebRespString("result from 1"))))
},
{prevRespOptFut =>
println("in action 2: " + prevRespOptFut)
Left(prevRespOptFut.map {prevRespOpt =>
val prevWebRespString = prevRespOpt.get
Some(WebRespString(prevWebRespString.str + " & " + " additional result from 2"))
})
}
)).map {webStatus =>
println(webStatus.code + ":" + webStatus.str)
}
executeChainOfActions(Seq(
{prevRespOptFut =>
println("in action 1: " + prevRespOptFut)
//Let's short-circuit here
Right(Future.successful(WebStatus(404, "resource non-existent")))
},
{prevRespOptFut =>
println("in action 2: " + prevRespOptFut)
Left(prevRespOptFut.map {prevRespOpt =>
val prevWebRespString = prevRespOpt.get
Some(WebRespString(prevWebRespString.str + " & " + " additional result from 2"))
})
}
)).map {webStatus =>
println(webStatus.code + ":" + webStatus.str)
}
Thanks,
Raka
The type Future[Either[Future[T1], Future[T2]]] means that sometimes later (that's future) one gets an Either, so at that time, one will know which way the calculation will go, and whether one will, still later, gets a T1 or a T2.
So the knowledge of which branch will be chosen (Left or Right) will come later. The type Either[Future[T1], Future[T2] means that one has that knowledge now (don't know what the result will be, but knows already what type it will be). The only way to get out of the Future is to wait.
No magic here, the only way for later to become now is to wait, which is done with result on the Future, and not recommended. `
What you can do instead is say that you are not too interested in knowing which branch is taken, as long has it has not completed, so Future[Either[T1, T2]] is good enough. That is easy. Say you have the Either and you would rather not look not but wait for the actual result :
def asFuture[T1, T2](
either: Either[Future[T1], Future[T2]])(
implicit ec: ExecutionContext)
: Future[Either[T1, T2] = either match {
case Left(ft1) => ft1 map {t1 => Left(t1)}
case Right(ft2) => ft2 map {t2 => Right(t2)}
}
You don't have the Either yet, but a future on that, so just flatMap
f.flatMap(asFuture) : Future[Either[T1, T2]]
(will need an ExecutionContext implicitly available)
It seems like you don't actually need the "failure" case of the Either to be a Future? In which case we can use scalaz (note that the "success" case of an either should be on the right):
import scalaz._
import scalaz.Scalaz._
def futureEitherFutureToFuture[A, B](f: Future[Either[A, Future[B]]])(
implicit ec: ExecutionContext): Future[Either[A, B]] =
f.flatMap(_.sequence)
But it's probably best to always keep the Future on the outside in your API, and to flatMap in your code rather than in the client. (Here it's part of the foldLeftM):
case class WebServResp(msg: String)
case class WebStatus(code: Int)
type OWSR = Option[WebServResp]
type InnerActionOutType = Future[Either[WebStatus, OWSR]]
type InnerActionSig = OWSR => InnerActionOutType
def executeChain(chain: List[InnerActionSig]): InnerActionOutType =
chain.foldLeftM(None: OWSR) {
(prevResp, action) => action(prevResp)
}
//if you want that same API
def executeChainOfActions(chainOfActions: Seq[InnerActionSig]) =
executeChain(chainOfActions.toList).map {
case Left(webStatus) => webStatus
case Right(webRespStringOpt) => WebStatus(200, webRespStringOpt.get.str)
}
(If you need "recovery" type actions, so you really need OWSR to be an Either, then you should still make InnerActionOutType a Future[Either[...]], and you can use .traverse or .sequence in your actions as necessary. If you have an example of an "error-recovery" type action, I can put an example of that here)

Parsing very large xml lazily

I have a huge xml file (40 gbs). I would like to extract some fields from it without loading the entire file into memory. Any suggestions?
A quick example with XMLEventReader based on a tutorial for SAXParser here (as posted by Rinat Tainov).
I'm sure it can be done better but just to show basic usage:
import scala.io.Source
import scala.xml.pull._
object Main extends App {
val xml = new XMLEventReader(Source.fromFile("test.xml"))
def printText(text: String, currNode: List[String]) {
currNode match {
case List("firstname", "staff", "company") => println("First Name: " + text)
case List("lastname", "staff", "company") => println("Last Name: " + text)
case List("nickname", "staff", "company") => println("Nick Name: " + text)
case List("salary", "staff", "company") => println("Salary: " + text)
case _ => ()
}
}
def parse(xml: XMLEventReader) {
def loop(currNode: List[String]) {
if (xml.hasNext) {
xml.next match {
case EvElemStart(_, label, _, _) =>
println("Start element: " + label)
loop(label :: currNode)
case EvElemEnd(_, label) =>
println("End element: " + label)
loop(currNode.tail)
case EvText(text) =>
printText(text, currNode)
loop(currNode)
case _ => loop(currNode)
}
}
}
loop(List.empty)
}
parse(xml)
}
User SAXParser, it will not load entire xml to memory. Here good java example, easily can be used in scala.
If you are happy looking at alternative xml libraries then Scales Xml provides three main pull parsing approaches:
Iterator based - simply use hasNext, next to get more items
iterate function - provides an Iterator but for trees identified by a simple path
Iteratee based - allows combinations of multiple paths
The focus of the upcoming 0.5 version is asynchronous parsing via aalto-xml, allowing for additional non-blocking control options.
In all cases you can control both memory usage and how the document is processed with Scales.

scala overriding None

My first Scala program and I am stuck.
So basically i am trying to override the potential None value of "last" to a 0l in the declaration of past.
import java.util.Date;
object TimeUtil {
var timerM = Map( "" -> new Date().getTime() );
def timeit(seq:String, comment:String) {
val last = timerM.get(seq)
val cur = new Date().getTime()
timerM += seq -> cur;
println( timerM )
if( last == None ) return;
val past = ( last == None ) ? 0l : last ;
Console.println("Time:" + seq + comment + ":" + (cur - past)/1000 )
}
def main(args : Array[String]) {
timeit("setup ", "mmm")
timeit("setup ", "done")
}
}
You should probably try and distil the problem a little more, but certainly include your actual error!
However, there is no ternary operator (? :) in Scala so instead you can use:
if (pred) a else b //instead of pred ? a : b
This is because (almost - see Kevin's comments below) everything in scala is an expression with a return type. This is not true in Java where some things are statements (with no type). With scala, it's always better to use composition, rather than forking (i.e. the if (expr) return; in your example). So I would re-write this as:
val last = timerM.get(seq)
val cur = System.currentTimeMillis
timerM += (seq -> cur)
println(timerM)
last.foreach{l => println("Time: " + seq + comment + ":" + (l - past)/1000 ) }
Note that the ternary operation in your example is extraneous anyway, because last cannot be None at that point (you have just returned if it were). My advice is to use explicit types for a short while, as you get used to Scala. So the above would then be:
val last: Option[Long] = timerM.get(seq)
val cur: Long = System.currentTimeMillis
timerM += (seq -> cur)
println(timerM)
last.foreach{l : Long => println("Time: " + seq + comment + ":" + (l - past)/1000 ) }
(It seems in your comment that you might be trying to assign a Long value to last, which would be an error of course, because last is of type Option[Long], not Long)
You have a couple of "code smells" in there, which suggest that a better, shinier design is probably just around the corner:
You're using a mutable variable, var instead of val
timeit works exclusively by side effects, it modifies state outside of the function and successive calls with the same input can have different results.
seq is slightly risky as a variable name, it's far too close to the (very common & popular) Seq type from the standard library.
So going back to first principles, how else might the same results be achieved in a more "idiomatic" style? Starting with the original design (as I understand it):
the first call to timeit(seq,comment) just notes the current time
subsequent calls with the same value of seq println the time elapsed since the previous call
Basically, you just want to time how long a block of code takes to run. If there was a way to pass a "block of code" to a function, then maybe, just maybe... Fortunately, scala can do this, just use a by-name param:
def timeit(block: => Unit) : Long = {
val start = System.currentTimeMillis
block
System.currentTimeMillis - start
}
Just check out that block parameter, it looks a bit like a function with no arguments, that's how by-name params are written. The last expression of the function System.currentTimeMillis - start is used as the return value.
By wrapping the parameter list in {} braces instead of () parentheses, you can make it look like a built-in control structure, and use it like this:
val duration = timeit {
do stuff here
more stuff
do other stuff
}
println("setup time:" + duration + "ms")
Alternatively, you can push the println behaviour back into the timeit function, but that makes life harder if you later want to reuse it for timing something without printing it to the console:
def timeit(description: String)(block: => Unit) : Unit = {
val start = System.currentTimeMillis
block
val duration System.currentTimeMillis - start
println(description + " took " + duration + "ms")
}
This is another trick, multiple parameter blocks. It allows you to use parentheses for the first block, and braces for the second:
timeit("setup") {
do stuff here
more stuff
do other stuff
}
// will print "setup took XXXms"
Of course, there's a million other variants you can make on this pattern, with varying degrees of sophistication/complexity, but it should be enough to get you started...