recursive variable needs type - scala

I have some code in which I want to update an RDD as below:
val xRDD = xRDD.zip(tempRDD)
This gave me the error: recursive value x needs type
I want to maintain xRDD over iterations, modifying it with tempRDD in each iteration. How can I achieve this?
Thanks in advance.

The compiler is telling you that you're attempting to define a value in terms of itself: xRDD appears in its own definition. To say this another way, the right-hand side uses something that doesn't exist yet in order to define it.
Edit:
If you have a list of actions that produce new RDD that you'd like to zip together, perhaps you should look at a Fold:
listMyActions.foldLeft(origRDD) { (rdd, f) =>
  val tempRDD = f(rdd)
  rdd.zip(tempRDD)
}
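A minimal sketch of the same fold on plain Scala Lists, so that it runs without a Spark context. The names FoldSketch, origList and steps are hypothetical. Note that zip turns a collection of values into a collection of pairs, so this sketch maps each pair back to a single value to keep the accumulator type the same from one step to the next; whether summing is the right way to combine the pairs depends on your data.

```scala
object FoldSketch {
  // Lists stand in for RDDs here (assumption: the real code would use an RDD[Int]).
  val origList: List[Int] = List(1, 2, 3)

  // Hypothetical "actions": each one derives a temporary collection from the current one.
  val steps: List[List[Int] => List[Int]] = List(
    xs => xs.map(_ * 10),
    xs => xs.map(_ + 1)
  )

  // foldLeft threads the accumulator through every step, so no val ever refers to itself.
  val result: List[Int] = steps.foldLeft(origList) { (acc, f) =>
    val temp = f(acc)
    // zip yields pairs; summing each pair keeps the accumulator a List[Int].
    acc.zip(temp).map { case (a, b) => a + b }
  }
}
```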

Don't forget that vals are immutable: you can't reassign a previously defined val. If you really need reassignment you can use a var instead, although that is not recommended. Note that this question is about a Scala feature rather than an Apache Spark one. For more information, you can consult the post "Use of def, val, and var in Scala".
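To illustrate the difference, here is a small sketch that again uses Lists in place of RDDs. The names ReassignSketch, xs0, xs1 and iterate are made up for this example. In the original line, the xRDD on the right-hand side refers to the very val being defined, which is why the compiler complains; giving each step a new name, or using a var, avoids that.

```scala
object ReassignSketch {
  val temp: List[Int] = List(10, 20, 30)

  // Option 1: stay with vals and give each step its own name.
  val xs0: List[Int] = List(1, 2, 3)
  val xs1: List[(Int, Int)] = xs0.zip(temp)

  // Option 2: a var can be reassigned, as long as its type stays the same.
  def iterate(times: Int): List[Int] = {
    var xs = List(1, 2, 3)
    for (_ <- 1 to times) {
      // Map the zipped pairs back to Int so the assignment type-checks.
      xs = xs.zip(temp).map { case (a, b) => a + b }
    }
    xs
  }
}
```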

Related

How to define a infinite Lazy List using a given function or property in Scala?

I was recently watching an advanced Scala course on Rock the JVM and, in one lesson, Daniel proposed creating a set from properties (functions from A to Boolean). The implementation of this Set can be found here.
For example, he was able to create a set "containing" all the naturals by doing this:
val naturals = PBSet[Int](_ => true)
Then he could verify if an input was contained inside that set by doing naturals.contains(input).
My question is, is there any way to accomplish this using Lazy Lists or even better, Lazy Vectors or Lazy Maps?
For instance, given a fibonacci(n) function that returns the nth Fibonacci number, a lazy list containing all the possible outputs of that function would look something like this:
val allFibonacciNumbers: LazyList[Long] = LazyList.generate(n => fibonacci(n))
I know something similar could be done by doing this:
val allFibonacciNumbersV2: LazyList[Long] = LazyList.iterate(0L)(n => n + 1).map(n => fibonacci(n))
The problem with that implementation is the start value of the iterate function: it is not going to give all the possible outputs of any function, just the ones from that start value onwards.
So, how could such a task be accomplished using a combination of the property-based Set and the Lazy List? Or even better, with a Lazy Vector or Lazy Map?
I couldn't find anything similar, the closest I could find was something called property based test but that's about it.
Thank you for the immense help and for reading my question. Have a great day!
Well, there is no "LazyMap" out of the box, but it is rather trivial to just roll your own.
Your comments sound like you already know how to compute Fibonacci with a LazyList, from there, you just need to memoize the result:
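import scala.collection.mutable

object FF {
  // Cache of results that have already been requested, keyed by index
  val fm = mutable.Map.empty[Int, BigInt]
  // Infinite, lazily evaluated Fibonacci sequence
  val fib: LazyList[BigInt] = BigInt(0) #:: BigInt(1) #::
    fib.zip(fib.tail).map(p => p._1 + p._2)
  def apply(n: Int) = fm.getOrElseUpdate(n, fib(n))
}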
Now, things like FF(100) are linear the first time, and constant time after that.
If you do FF(100) and then FF(99), the second call is still linear the first time, though. Technically, it could be optimized to be constant time too, because fib(99) is already available, but I don't think it's worth the trouble ... and extra storage.
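As for the "contains" part of the question, one possible sketch (assuming Scala 2.13 or later for LazyList) relies on the Fibonacci sequence being non-decreasing: skip everything smaller than the candidate and compare the first remaining element. The names FibSketch and containsFib are hypothetical, and the trick only works for sequences that never decrease; for an arbitrary function there is no way to stop searching an infinite list.

```scala
object FibSketch {
  // Same self-referential definition as above; LazyList caches every cell it has evaluated.
  val fib: LazyList[BigInt] =
    BigInt(0) #:: BigInt(1) #:: fib.zip(fib.tail).map { case (a, b) => a + b }

  // Hypothetical helper: membership test that relies on fib being non-decreasing.
  // It stops at the first element >= x, so it terminates even though fib is infinite.
  def containsFib(x: BigInt): Boolean =
    fib.dropWhile(_ < x).head == x
}
```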

Possible to make use of Scala's Option flatMap method more concise?

I'm admittedly very new to Scala, and I'm having trouble with the syntactical sugar I see in many Scala examples.
It often results in a very concise statement, but honestly so far (for me) a bit unreadable.
So I wish to take a typical use of the Option class, safe-dereferencing, as a good place to start for understanding, for example, the use of the underscore in a particular example I've seen.
I found a really nice article showing examples of the use of Option to avoid the case of null.
https://medium.com/#sinisalouc/demystifying-the-monad-in-scala-cc716bb6f534#.fhrljf7nl
He describes a use as so:
trait User {
val child: Option[User]
}
By the way, you can also write those functions as in-place lambda
functions instead of defining them a priori. Then the code becomes
this:
val result = UserService.loadUser("mike")
.flatMap(user => user.child)
.flatMap(user => user.child)
That looks great! Maybe not as concise as one can do in groovy, but not bad.
So I thought I'd try to apply it to a case I am trying to solve.
I have a type Person where the existence of a Person is optional, but if we have a person, his attributes are guaranteed. For that reason, there is no use of the Option type within the Person type itself.
The Person has a PID which is of type Id. The Id type consists of two String values: the Id-Type and the Id-Value.
I've used the Scala console to test the following:
class Id(val idCode : String, val idVal : String)
class Person(val pid : Id, val name : String)
val anId: Id = new Id("Passport_number", "12345")
val person: Person = new Person(anId, "Sean")
val operson : Option[Person] = Some(person)
OK. That set up my person and its optional instance.
I learned from the above linked article that I could get the Person's Id-Val by using flatMap, like this:
val result = operson.flatMap(person => Some(person.pid)).flatMap(pid => Some(pid.idVal)).getOrElse("NoValue")
Great! That works. And if I in fact have no person, my result is "NoValue".
I used flatMap (and not map) because, unless I misunderstand (and my tests with map were incorrect), if I use map I have to provide an alternate or default Person instance. That I didn't want to have to do.
OK, so, flatMap is the way to go.
However, that is really not a very concise statement.
If I were writing that in more of a Groovy style, I guess I'd be able to do something like this:
val result = person?.pid.idVal
Wow, that's a bit nicer!
Surely Scala has a means to provide something at least nearly as nice as Groovy?
In the above linked example, he was able to make his statement more concise using some of that syntactical sugar I mentioned before. The underscore:
or even more concise:
val result = UserService.loadUser("mike")
.flatMap(_.child)
.flatMap(_.child)
So, it seems in this case the underscore character allows you to skip specifying the type (as the type is inferred) and replace it with underscore.
However, when I try the same thing with my example:
val result = operson.flatMap(Some(_.pid)).flatMap(Some(_.idVal)).getOrElse("NoValue")
Scala complains.
<console>:15: error: missing parameter type for expanded function ((x$2) => x$2.idVal)
val result = operson.flatMap(Some(_.pid)).flatMap(Some(_.idVal)).getOrElse("NoValue")
Can someone help me along here?
How am I misunderstanding this?
Is there a short-hand method of writing my above lengthy statement?
Is flatMap the best way to achieve what I am after? Or is there a better more concise and/or readable way to do it ?
thanks in advance!
Why do you insist on using flatMap? I'd just use map for your example instead:
val result = operson.map(_.pid).map(_.idVal).getOrElse("NoValue")
or even shorter:
val result = operson.map(_.pid.idVal).getOrElse("NoValue")
You should only use flatMap with functions that return Options. Your pid and idVals are not Options, so just map them instead.
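As for the compiler error: the underscore expands to a function at the innermost enclosing expression, so Some(_.pid) means Some(x => x.pid), that is, an Option holding a function whose parameter type cannot be inferred. The sketch below, which reuses the classes from the question inside a hypothetical PersonSketch object, shows the flatMap version with explicit lambdas next to the map version.

```scala
object PersonSketch {
  class Id(val idCode: String, val idVal: String)
  class Person(val pid: Id, val name: String)

  val operson: Option[Person] = Some(new Person(new Id("Passport_number", "12345"), "Sean"))
  val nobody: Option[Person] = None

  // flatMap needs a function that itself returns an Option, hence the explicit Some(...).
  val viaFlatMap: String =
    operson.flatMap(p => Some(p.pid)).flatMap(id => Some(id.idVal)).getOrElse("NoValue")

  // map re-wraps the result for us, so the placeholder syntax works directly.
  val viaMap: String = operson.map(_.pid.idVal).getOrElse("NoValue")

  // On None the function is never applied and getOrElse supplies the default.
  val missing: String = nobody.map(_.pid.idVal).getOrElse("NoValue")
}
```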
You said
I have a type Person where the existence of a Person is optional, but if we have a person, his attributes are guaranteed. For that reason, there is no use of the Option type within the Person type itself.
This is the essential difference between your example and the User example. In the User example, both the existence of a User instance, and its child field are options. This is why, to get a child, you need to flatMap. However, since in your example, only the existence of a Person is not guaranteed, after you've retrieved an Option[Person], you can safely map to any of its fields.
Think of flatMap as a map, followed by a flatten (hence its name). If I mapped on child:
val ouser: Option[User] = UserService.loadUser("mike")
val child: Option[Option[User]] = ouser.map(_.child)
I would end up with an Option[Option[User]]. I need to flatten that to a single Option level, that's why I use flatMap in the first place.
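A small self-contained sketch of that equivalence. The case class here is a hypothetical concrete stand-in for the User trait, with a name field added only so the result can be inspected.

```scala
object FlattenSketch {
  // Hypothetical concrete User so that instances can be created directly.
  case class User(name: String, child: Option[User])

  val ouser: Option[User] = Some(User("parent", Some(User("kid", None))))

  // map alone leaves a nested Option ...
  val nested: Option[Option[User]] = ouser.map(_.child)
  // ... which flatten collapses into a single level.
  val flattened: Option[User] = nested.flatten
  // flatMap performs both steps at once.
  val direct: Option[User] = ouser.flatMap(_.child)
}
```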
If you are looking for the most concise solution, consider this:
val result = operson.fold("NoValue")(_.pid.idVal)
Some readers may find it unclear or confusing, though.

How to interpret a val in Scala that is of type Option[T]

I am trying to analyze Scala code written by someone else, and in doing so, I would like to be able to write Unit Tests (that were not written before the code was written, unfortunately).
Being a relative Newbie to Scala, especially in the Futures concept area, I am trying to understand the following line of code.
val niceAnalysis:Option[(niceReport) => Future[niceReport]] = None
Update:
The above line of code should be:
val niceAnalysis:Option[(NiceReport) => Future[NiceReport]] = None
- Where NiceReport is a case class
-----------Update ends here----------------
Since I am trying to mock up an Actor, I created this new Actor where I introduce my niceAnalysis val as a field.
The first problem I see with this "niceAnalysis" thing is that it looks like an anonymous function.
How do I "initialize" this val, that is, give it an initial value?
My goal is to create a test in my test class, where I am going to pass in this initialized val value into my test actor's receive method.
My naive approach to accomplish this looked like:
val myActorUnderTestRef = TestActorRef(new MyActorUnderTest("None))
IntelliJ doesn't like it, and my SBT compile and test fail.
So, I need to understand the "niceAnalysis" declaration first and then understand how to give it an initial value. Please advise.
You are correct that this is a value that might contain a function from NiceReport to Future[NiceReport]. You can pass either an anonymous function or a reference to an existing method. The method reference might be the easiest to understand, so I will show that first; in the longer term the anonymous function is most likely the more convenient form, so I will show that second:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
def strToFuture(x: String) = Future{ x } //merely wrap the string in a future
val foo = Option(strToFuture _) //the trailing underscore turns the method into a function value (Scala 2 needs it here)
Alternatively, the one-liner is as follows:
val foo = Option((x:String)=>Future{x})
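Applied to the declaration from the question, a sketch might look like the following. The summary field of NiceReport and the names NiceAnalysisSketch, upperCaseAnalysis, run and runBlocking are assumptions made for illustration, since the real case class is not shown. If MyActorUnderTest takes this value as a constructor argument, the test would presumably pass None or Some(function) rather than a string.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object NiceAnalysisSketch {
  // Hypothetical shape; the real NiceReport case class is not shown in the question.
  case class NiceReport(summary: String)

  // "No analysis configured", exactly as in the original declaration.
  val noAnalysis: Option[NiceReport => Future[NiceReport]] = None

  // An initialised value: Some wrapping a function literal.
  val upperCaseAnalysis: Option[NiceReport => Future[NiceReport]] =
    Some((report: NiceReport) =>
      Future.successful(report.copy(summary = report.summary.toUpperCase)))

  // Apply the analysis if there is one, otherwise pass the report through unchanged.
  def run(analysis: Option[NiceReport => Future[NiceReport]],
          report: NiceReport): Future[NiceReport] =
    analysis.fold(Future.successful(report))(f => f(report))

  // Blocking on the Future is only acceptable in a test or demo like this one.
  def runBlocking(analysis: Option[NiceReport => Future[NiceReport]],
                  report: NiceReport): NiceReport =
    Await.result(run(analysis, report), 1.second)
}
```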

apply method on Map object?

First of all, let me apologize in advance for what is my very first question posted on Stack Overflow, and probably quite a stupid one.
Since a Map in scala is instantiated using the following syntax:
val myMap = Map(1 -> "value1", 2 -> "value2")
I was expecting the Map object from scala.collection.immutable to provide an apply method with a signature roughly looking like:
def apply[A,B](entries :(A,B)*):Map[A,B]
I must be blind, but I can’t find such a method. Where is it defined?
Furthermore, could someone give me information about the purpose of the Map1, Map2, Map3 and Map4 classes defined in the Map object? Should they be used by the developer, or are they only used internally by the language and/or the compiler? Are they related somehow to the map instantiation scheme I was asking about above?
Thanks in advance for your help.
apply looks like it is defined on scala.collection.generic.GenMapFactory, a superclass of the scala.collection.immutable.Map companion object. For some reason, the Scaladoc for 2.9.2 simply omits this method. (As Rogach notes, it was there in 2.9.1.)
Map1…Map4 (together with EmptyMap, which is private) are simply optimisations. These are defined inside Map.scala and really just hold up to four keys and values directly without any further indirection. If one tries to add to a Map4, a HashMap will automatically be created.
You normally do not need to create a Map[1-4] manually:
scala> Map('a -> 1)
res0: scala.collection.immutable.Map[Symbol,Int] = Map('a -> 1)
scala> res0.isInstanceOf[scala.collection.immutable.Map.Map1[_,_]]
res1: Boolean = true
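The same check can be sketched for the larger sizes. This assumes a recent standard library (Scala 2.13 or later); which concrete class backs a small map is an implementation detail and may differ between versions, so treat the sketch as an observation rather than a guarantee. The object name MapSketch is hypothetical.

```scala
import scala.collection.immutable.{HashMap, Map}

object MapSketch {
  // Map(...) is sugar for Map.apply(...), which accepts any number of key/value pairs.
  val sameMap: Boolean = Map.apply("a" -> 1, "b" -> 2) == Map("a" -> 1, "b" -> 2)

  val one: Map[String, Int] = Map("a" -> 1)
  val four: Map[String, Int] = Map("a" -> 1, "b" -> 2, "c" -> 3, "d" -> 4)
  // Adding a fifth entry to a Map4 switches to a HashMap.
  val five: Map[String, Int] = four + ("e" -> 5)

  val oneIsMap1: Boolean = one.isInstanceOf[Map.Map1[_, _]]
  val fourIsMap4: Boolean = four.isInstanceOf[Map.Map4[_, _]]
  val fiveIsHashMap: Boolean = five.isInstanceOf[HashMap[_, _]]
}
```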

Get only the file names using listFiles in Scala

Is there an idiomatic Scala solution to obtain only the file names from File.listFiles?
Perhaps something like:
val names = new File(dir).listFiles.somemagicTrait(_getName)
and have names become a List[String]?
I know I can just loop and add them to a mutable list.
How about:
new File(dir).listFiles.map(_.getName).toList
I'm always wary of answering the wrong part of the question, but as Jean-Phillipe commented, you can get an array of the names from
new File(dir).list
and, if you really need a List, call toList on that.
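One caveat worth sketching: both listFiles and list return null when the path is not a directory or cannot be read. A defensive variant, with the hypothetical names FileNamesSketch and fileNames, wraps the call in Option and also drops subdirectories so that only file names remain.

```scala
import java.io.File

object FileNamesSketch {
  // listFiles returns null when dir is not a readable directory,
  // so Option(...) turns that case into an empty list instead of an exception.
  def fileNames(dir: File): List[String] =
    Option(dir.listFiles)
      .map(_.toList)
      .getOrElse(Nil)
      .filter(_.isFile) // keep regular files, skip subdirectories
      .map(_.getName)
}
```

For example, FileNamesSketch.fileNames(new File(dir)) yields a List[String], which is empty if dir does not exist.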