I am trying to execute the following Scala code on Spark, but for some reason the function selective is not getting called.
var lines = sc.textFile(inputPath+fileName,1)
val lines2 =lines.map(l=>selective(
func,List(2,3),List(1,26),l,";",0
))
lines2.toArray().foreach(l=>out.write(l))
.......
The selective function is defined as follows
def selective(f: (String,Boolean) => String , phoneFields: Seq[Int], codeFields: Seq[Int], in: String, delimiter:String, iMode:Int /* 0 for enc, 1 for dec */) :String =
in.split(delimiter,-1).zipWithIndex
.map {
case (str, ix)
if( phoneFields.contains(ix)||codeFields.contains(ix)) =>
var output=f(str,codeFields.contains(ix))
var sTemp=str+":"+output+"\n"
if((iMode==0)&&codeFields.contains(ix)&&(str.compareTo("")!=0) )
CodeDictString+=sTemp
else if(str.compareTo("")!=0)
PhoneDictString+=sTemp
output
case other => other._1
}.mkString(";").+("\n")
The println statement is not executing. Furthermore, the function is not returning anything.
sc is the SparkContext object.
Are you running this in local mode or on a cluster? The function passed to lines.map is evaluated by the Spark workers, so that println will appear in the worker's stdout logs if you're running on a cluster (these logs are viewable via Spark's web UI).
This function does not compile. The syntax
{
some statement
case ... => ...
is not valid. Case statements can only appear like this:
{
case ... => ...
...
case ... => ...
...
}
Since you obviously got something to compile, I bet there's a case statement before that println, and that case statement is not being selected.
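To make the syntax rule above concrete, here is a minimal sketch of a pattern-matching block in which every statement sits after the `=>` of some case. It is a simplified, side-effect-free variant of the question's selective, not the asker's actual code: `mask` is a hypothetical stand-in for `func`, and the dictionary updates and mode flag are left out.

```scala
object SelectiveSketch {
  // Hypothetical stand-in for the asker's `func`: tags the field by kind.
  def mask(field: String, isCode: Boolean): String =
    if (isCode) s"C($field)" else s"P($field)"

  def selective(f: (String, Boolean) => String,
                phoneFields: Seq[Int],
                codeFields: Seq[Int],
                in: String,
                delimiter: String): String =
    in.split(delimiter, -1).zipWithIndex.map {
      // Statements (including println) belong to the case they follow.
      case (str, ix) if phoneFields.contains(ix) || codeFields.contains(ix) =>
        println(s"transforming field $ix")
        f(str, codeFields.contains(ix))
      // Fields that are neither phone nor code fields pass through unchanged.
      case (str, _) => str
    }.mkString(delimiter)
}
```

For example, `SelectiveSketch.selective(SelectiveSketch.mask, Seq(2), Seq(1), "a;b;c;d", ";")` transforms only the fields at index 1 and 2. If the println never appears when this runs locally, the guarded case is not being selected.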
Related
I used traverse to execute a collection of futures like this:
val result: Future[List[Either[Error, Int]]] = Future.traverse(urls)(foo(_))
I end up with a Future[List[Either[Error, Int]]]. How can I check that one of these futures resulted in an Error?
I tried to do this, but I think it is wrong, because I have read that you cannot substitute variables for futures.
val check: Future[Boolean] = result.map{
fut => fut.exists(c => c.isLeft)
}
check.map{
b => b match {
case true => // do something
case false => // do something
}
}
You can convert the result to a list of errors like this:
val errors: Future[List[Error]] = result.map(_.collect{ case Left(err) => err })
It is then possible to use Await.result to extract these error values, but that is nearly always a bad idea because it blocks the current thread.
It is better to ask "What do I want to do once the Future is complete but returns errors?". Then implement that behaviour in a map or foreach on the errors Future.
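As a rough illustration of staying inside the Future, the sketch below derives the error list and then reacts to it with map, without blocking. The names are assumptions for the example only: `foo` is a hypothetical stand-in that fails for non-http URLs, and String stands in for the question's Error type.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object TraverseErrors {
  // Hypothetical stand-in for `foo`: Left for anything that is not an http URL.
  def foo(url: String): Future[Either[String, Int]] =
    Future(if (url.startsWith("http")) Right(url.length) else Left("bad url: " + url))

  def summarize(urls: List[String]): Future[String] = {
    val result: Future[List[Either[String, Int]]] = Future.traverse(urls)(foo(_))
    // Keep only the Left values; order follows the input list.
    val errors: Future[List[String]] = result.map(_.collect { case Left(err) => err })
    // Decide what to do once the errors are known, still inside the Future.
    errors.map {
      case Nil => "all succeeded"
      case errs =>
        val joined = errs.mkString(", ")
        s"${errs.size} failed: $joined"
    }
  }
}
```

The caller gets a Future[String] back and can chain further work on it; Await is only needed at the very edge of a program, for example in a test.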
I'm trying to emit a for-yield block from a blackbox macro, but I don't understand how to create the block with valid syntax.
Below, source is a hardcoded parameter name, because this block is later inserted inside a method that has a parameter with that name. params is just params: Seq[c.universe.ValDef], enclosing the case class fields.
def extract(source: Source): Option[CaseClass] = { ... }
val extractors = accessors(c)(params) map {
case (nm, tpe) => {
val newTerm = TermName(nm.toString + "Opt")
q"""$newTerm <- DoStuff[$tpe].apply("$nm", source)"""
}
}
val extractorNames = accessors(c)(params) map {
case (nm, tpe) => TermName(nm.toString + "Opt")
}
This takes a case class and outputs a for-yield block that recreates the case class from a comprehension.
Every field in the case class of the form name: Type is transformed to a set of extractors that yield the same case class instance back if the for comprehension is successful.
case class Test(id: Int, text: String)
Will be macro transformed to the following, where Extract is just a type class and Extract.apply[T : Extract] is just materialising the context bound with implicitly[Extract[T]]:
for {
idOpt <- Extract[Int].apply("id", source): Option[Int]
textOpt <- Extract[String].apply("text", source): Option[String]
} yield Test(idOpt, textOpt)
The problem comes in having to quote the inner for-yield expressions and output the a <- b blocks.
def extract(source: Source): Option[$typeName] = {
for {(..$extractors)} yield $companion.apply(..$extractorNames)
}
The error is ';' expected but '<-' found, which is pretty obvious as a <- b is invalid Scala by itself. What is the correct way to generate and quasiquote the expression block such that the above would work?
Here is a list of all the different kinds of quasiquotes.
There you can see that to express the a <- b syntax you need the fq interpolator.
So that code will probably become:
val extractors = accessors(c)(params) map {
case (nm, tpe) => {
val newTerm = TermName(nm.toString + "Opt")
fq"""$newTerm <- DoStuff[$tpe].apply("$nm", source)"""
}
}
And then with the normal interpolator:
q"for (..$extractors) yield $companion.apply(..$extractorNames)"
I am trying to incorporate a database into my http-microservice.
The microservice has a function getValueFromInternet(val: Foo): Future[Value], which was being called by my microservice on a GET request. Now I want a function getValue(val: Foo): Future[Value] to query a db first and, if the database returns no results, call getValueFromInternet. The database query returns a Future[Seq[Value2]], and I can convert Value2 to Value using a function. If no entry is found corresponding to that value, an empty Vector is returned.
This is what I have tried so far:
def getValue(val: Foo): Future[Value] = {
val resultFuture = db.getValue(val)
// 1st attempt. Clearly wrong
resultFuture onComplete {
case Success(Vector()) => getValueFromInternet(val)
case Success(vec) => convertValue2to1(vec.head)
}
// 2nd attempt. This is also wrong
resultFuture match {
case Future(Success(Vector())) => getValueFromInternet(val)
case Future(Success(vec)) => convertValue2to1(vec.head)
}
}
I would be grateful for any help suggesting how I can do this.
I have implemented the database and microservice independently and you can find them here and here
You have to use flatMap, since the thing you want to do if the first operation does not return a result also returns a future.
This is as close to your code as possible while still compiling. Note that you can't have identifiers called val in Scala, since that is a keyword.
def getValue(v: Foo)(implicit ec: ExecutionContext): Future[Value] = {
val resultFuture: Future[Seq[Value2]] = db.getValue(v)
resultFuture.flatMap { vec =>
if(vec.isEmpty)
getValueFromInternet(v)
else
Future.successful(convertValue2to1(vec.head))
}
}
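For readers who want to run the idea end to end, here is a self-contained sketch of the same flatMap fallback. Everything except the shape of getValue is an assumption made for the example: the case classes, the in-memory `dbGetValue`, and the canned `getValueFromInternet` are hypothetical stand-ins for the asker's services. It uses headOption instead of isEmpty/head, which is only a stylistic variation.

```scala
import scala.concurrent.{ExecutionContext, Future}

object FallbackSketch {
  // Hypothetical stand-ins for the asker's types.
  final case class Foo(key: String)
  final case class Value(data: String)
  final case class Value2(raw: String)

  def convertValue2to1(v2: Value2): Value = Value(v2.raw)

  // Pretend database: it only has a row for the key "cached".
  def dbGetValue(v: Foo): Future[Seq[Value2]] =
    Future.successful(if (v.key == "cached") Vector(Value2("from db")) else Vector.empty)

  // Pretend remote lookup.
  def getValueFromInternet(v: Foo): Future[Value] =
    Future.successful(Value("from internet"))

  def getValue(v: Foo)(implicit ec: ExecutionContext): Future[Value] =
    dbGetValue(v).flatMap { rows =>
      rows.headOption match {
        case Some(row) => Future.successful(convertValue2to1(row)) // db hit
        case None      => getValueFromInternet(v)                  // fall back
      }
    }
}
```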
I have the following list of functions that I need to execute in order.
val steps: List[() => StepResult] = List(step1 _, step2 _, step3 _)
Each step will return a StepResult, which contains the boolean status, and a message:
case class StepResult(success: Boolean, message: String)
The idea is to execute each step in order, but stop going over the list if any of the steps fails. What would be the best way of doing this?
I can go over each step, and execute it:
val results = steps.map { step => step() }
But I'm missing the part of stopping if any of the steps fail. Ideally, I should end up with a List[StepResult] that I can then inspect.
You can use a view to run a map and a takeWhile without iterating through the list twice:
steps.view.map(_()).takeWhile(_.success).force
Views evaluate lazily, and are really handy when you want to call several methods on a collection but only iterate through it once, or only evaluate its contents once. Read more about them here. You can accomplish similar functionality by calling toIterator or toStream instead of view, since those collections operate similarly.
For example:
val step1 = () => { println("running step1"); StepResult(true, "") }
val step2 = () => { println("running step2"); StepResult(true, "") }
val step3 = () => { println("running step3"); StepResult(false, "") }
val step4 = () => { println("running step4"); StepResult(true, "") }
val steps = List(step1, step2, step3, step4)
steps.view.map(s => s()).takeWhile(_.success).force
This will print
running step1
running step2
running step3
Note that running step4 is not printed, since, when using view, the map and takeWhile are used in a single loop. Contrast this with the naive version:
steps.map(s => s()).takeWhile(_.success).toList
Since this doesn't use view, it will run all 4 steps, and print the fourth statement.
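The difference can be checked without reading console output. The sketch below records which steps actually ran; it uses iterator rather than view (the answer mentions toIterator as a similar option), and the `run` helper and its Boolean inputs are made up for the illustration.

```scala
import scala.collection.mutable.ListBuffer

object LazySteps {
  final case class StepResult(success: Boolean, message: String)

  // Builds one step per outcome, runs them lazily, and returns
  // (results kept by takeWhile, indices of the steps that were executed).
  def run(outcomes: List[Boolean]): (List[StepResult], List[Int]) = {
    val executed = ListBuffer.empty[Int]
    val steps: List[() => StepResult] = outcomes.zipWithIndex.map { case (ok, i) =>
      () => { executed += i; StepResult(ok, s"step${i + 1}") }
    }
    // map and takeWhile are fused: each step runs only when its result is needed.
    val results = steps.iterator.map(_()).takeWhile(_.success).toList
    (results, executed.toList)
  }
}
```

With `LazySteps.run(List(true, true, false, true))`, the executed indices are 0, 1 and 2, so the fourth step never runs. Note that takeWhile drops the failing result itself, so the returned list holds only the two successful results; if the failed StepResult is needed for inspection, the recursive or foldLeft variants are easier to adapt.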
If this is a method, you can also use a foldLeft together with a nonlocal return:
def getResults(steps: Seq[() => StepResult]): Seq[StepResult] =
(Seq.empty[StepResult] /: steps) { case (soFar, next) =>
val nextRes = next()
if (nextRes.success) {
soFar :+ nextRes
} else return soFar
}
Or recursively, as explained by Ryan's answer.
You can use recursion:
def doIt(steps: List[() => StepResult]): List[StepResult] = steps match {
case Nil => Nil
case head :: tail =>
val result = head()
if (result.success)
result :: doIt(tail)
else
result :: Nil
}
The Option monad is a great, expressive way to deal with something-or-nothing values in Scala. But what if one needs to log a message when "nothing" occurs? According to the Scala API documentation,
The Either type is often used as an alternative to scala.Option where Left represents failure (by convention) and Right is akin to Some.
However, I had no luck finding best practices for using Either, or good real-world examples that use Either for processing failures. Finally, I came up with the following code for my own project:
def logs: Array[String] = {
def props: Option[Map[String, Any]] = configAdmin.map{ ca =>
val config = ca.getConfiguration(PID, null)
config.properties getOrElse immutable.Map.empty
}
def checkType(any: Any): Option[Array[String]] = any match {
case a: Array[String] => Some(a)
case _ => None
}
def lookup: Either[(Symbol, String), Array[String]] =
for {val properties <- props.toRight('warning -> "ConfigurationAdmin service not bound").right
val logsParam <- properties.get("logs").toRight('debug -> "'logs' not defined in the configuration").right
val array <- checkType(logsParam).toRight('warning -> "unknown type of 'logs' configuration parameter").right}
yield array
lookup.fold(failure => { failure match {
case ('warning, msg) => log(LogService.WARNING, msg)
case ('debug, msg) => log(LogService.DEBUG, msg)
case _ =>
}; new Array[String](0) }, success => success)
}
(Please note this is a snippet from a real project, so it will not compile on its own)
I'd be grateful to know how you are using Either in your code and/or better ideas on refactoring the above code.
Either is used to return one of two possible meaningful results, unlike Option, which is used to return a single meaningful result or nothing.
An easy to understand example is given below (circulated on the Scala mailing list a while back):
def throwableToLeft[T](block: => T): Either[java.lang.Throwable, T] =
try {
Right(block)
} catch {
case ex: Throwable => Left(ex)
}
As the function name implies, if the execution of "block" is successful, it will return "Right(<result>)". Otherwise, if a Throwable is thrown, it will return "Left(<throwable>)". Use pattern matching to process the result:
var s = "hello"
throwableToLeft { s.toUpperCase } match {
case Right(s) => println(s)
case Left(e) => e.printStackTrace
}
// prints "HELLO"
s = null
throwableToLeft { s.toUpperCase } match {
case Right(s) => println(s)
case Left(e) => e.printStackTrace
}
// prints NullPointerException stack trace
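On Scala 2.12 and later, much the same helper can be sketched with the standard library's scala.util.Try and its toEither method. One difference worth knowing: Try only catches non-fatal throwables, so this is not an exact replacement for the catch-all version above. The names `attempt` and `describe` are made up for the example.

```scala
import scala.util.Try

object TryToEither {
  // Like throwableToLeft, but fatal errors (e.g. OutOfMemoryError) still propagate.
  def attempt[T](block: => T): Either[Throwable, T] = Try(block).toEither

  def describe(s: String): String =
    attempt(s.toUpperCase) match {
      case Right(upper) => s"ok: $upper"
      case Left(e)      => s"failed: ${e.getClass.getSimpleName}"
    }
}
```

Calling `TryToEither.describe("hello")` takes the Right branch, while `TryToEither.describe(null)` takes the Left branch because toUpperCase throws a NullPointerException.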
Hope that helps.
The Scalaz library has something similar to Either, named Validation. It is more idiomatic than Either for use as "get either a valid result or a failure".
Validation also allows errors to be accumulated.
Edit: "similar to" Either is completely false, because Validation is an applicative functor, and the Scalaz Either, named \/ (pronounced "disjunction" or "either"), is a monad.
Validation can accumulate errors because of that nature. On the other hand, \/ has a "stop early" nature, stopping at the first -\/ (read it "left", or "error") it encounters. There is a perfect explanation here: http://typelevel.org/blog/2014/02/21/error-handling.html
See: http://scalaz.googlecode.com/svn/continuous/latest/browse.sxr/scalaz/example/ExampleValidation.scala.html
As requested by the comment, copy/paste of the above link (some lines removed):
// Extracting success or failure values
val s: Validation[String, Int] = 1.success
val f: Validation[String, Int] = "error".fail
// It is recommended to use fold rather than pattern matching:
val result: String = s.fold(e => "got error: " + e, s => "got success: " + s.toString)
s match {
case Success(a) => "success"
case Failure(e) => "fail"
}
// Validation is a Monad, and can be used in for comprehensions.
val k1 = for {
i <- s
j <- s
} yield i + j
k1.toOption assert_≟ Some(2)
// The first failing sub-computation fails the entire computation.
val k2 = for {
i <- f
j <- f
} yield i + j
k2.fail.toOption assert_≟ Some("error")
// Validation is also an Applicative Functor, if the type of the error side of the validation is a Semigroup.
// A number of computations are tried. If they all succeed, a function can combine them into a Success. If any
// of them fails, the individual errors are accumulated.
// Use the NonEmptyList semigroup to accumulate errors using the Validation Applicative Functor.
val k4 = (fNel <**> fNel){ _ + _ }
k4.fail.toOption assert_≟ some(nel1("error", "error"))
The snippet you posted seems very contrived. You use Either in a situation where:
It's not enough to just know the data isn't available.
You need to return one of two distinct types.
Turning an exception into a Left is, indeed, a common use case. Over try/catch, it has the advantage of keeping the code together, which makes sense if the exception is an expected result. The most common way of handling Either is pattern matching:
result match {
case Right(res) => ...
case Left(res) => ...
}
Another interesting way of handling Either is when it appears in a collection. When doing a map over a collection, throwing an exception might not be viable, and you may want to return some information other than "not possible". Using an Either enables you to do that without overburdening the algorithm:
val list = (
library
\\ "books"
map (book =>
if (book \ "author" isEmpty)
Left(book)
else
Right((book \ "author" toList) map (_ text))
)
)
Here we get a list of all authors in the library, plus a list of books without an author. So we can then further process it accordingly:
val authorCount = (
(Map[String,Int]() /: (list filter (_ isRight) flatMap (_.right.get)))
((map, author) => map + (author -> (map.getOrElse(author, 0) + 1)))
toList
)
val problemBooks = list flatMap (_.left.toSeq) // thanks to Azarov for this variation
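The same split can be sketched with collect, which avoids .right.get and the fold operator. The data here is hypothetical and replaces the XML library with a plain list: Right holds a book's authors and Left holds the title of a book without any.

```scala
object SplitEithers {
  // Hypothetical data standing in for the result of mapping over the library.
  val list: List[Either[String, List[String]]] = List(
    Right(List("Odersky", "Spoon")),
    Left("Untitled manuscript"),
    Right(List("Odersky"))
  )

  // Books that had no author.
  val problemBooks: List[String] = list.collect { case Left(title) => title }

  // Number of books per author, counted over all Right entries.
  val authorCount: Map[String, Int] =
    list.collect { case Right(authors) => authors }
      .flatten
      .groupBy(identity)
      .map { case (author, hits) => author -> hits.size }
}
```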
So, basic Either usage goes like that. It's not a particularly useful class, but if it were you'd have seen it before. On the other hand, it's not useless either.
Cats has a nice way to create an Either from exception-throwing code:
val either: Either[NumberFormatException, Int] =
Either.catchOnly[NumberFormatException]("abc".toInt)
// either: Either[NumberFormatException,Int] = Left(java.lang.NumberFormatException: For input string: "abc")
See https://typelevel.org/cats/datatypes/either.html#working-with-exception-y-code