Cannot access function that is inside a case class

I am trying to make an RDD of k-nearest neighbors from points inside each Bounding Box. I have these case classes inside the object KNN:
object KNN extends App {
  def buildtree[T](points: Seq[Seq[T]], depth: Int = 0): Option[KdNode[T]] = ??? // body not shown

  case class BoundingBox[T](lowerleft_X: T, lowerleft_Y: T, upperright_X: T, upperright_Y: T)

  case class KdNode[T](value: Seq[T], left: Option[KdNode[T]], right: Option[KdNode[T]], axis: Int) {
    def Knearest(to: Seq[T]): Seq[Nearest[T]] = ??? // body elided here ({....})
  }

  case class Nearest[T](value: Seq[T], to: Seq[T], distance: Double)
}
First, I made a Pair RDD of kdtree with key = BoundingBox and Value = Option[KdNode[T]] by using the below line of code. Here, buildtree is a function defined inside the object and PointsRDD is also a PairRDD. This works fine:
val kdtree = PointsRDD.mapValues(p => buildtree(p))
The type of kdtree is RDD[(BoundingBox[T], Option[KdNode[T]])].
Then, when I try to call the Knearest function on kdtree, the Knearest symbol is not recognized.
I tried calling Knearest with mapValues and also in a for loop, but neither approach works: both give the error "cannot resolve symbol Knearest".
val knn = kdtree.mapValues(node => Knearest(node))
for (kd <- kdtree; knearest = kd Knearest to)
where to is the point from which the k-nearest neighbors are to be searched.
I am doing something wrong here, but I cannot figure out what, since I am fairly new to Scala. Please help in this regard. Thanks.
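A likely cause, sketched under assumptions: Knearest is a method of KdNode, so it has to be called on a KdNode instance (node.Knearest(to)) rather than as a free function (Knearest(node)). Because each value is an Option[KdNode[T]], the call also has to go through the Option. The sketch below leaves Spark out and uses a plain Option as a stand-in for one value of the pair RDD; the body of Knearest is a placeholder, since the real one is not shown.

```scala
object KnnSketch {
  case class Nearest[T](value: Seq[T], to: Seq[T], distance: Double)

  case class KdNode[T](value: Seq[T], left: Option[KdNode[T]], right: Option[KdNode[T]], axis: Int) {
    // Placeholder body (assumption): returns only this node at distance 0.0.
    // The real search is elided in the question.
    def Knearest(to: Seq[T]): Seq[Nearest[T]] = Seq(Nearest(value, to, 0.0))
  }

  // Stand-in for one value of the pair RDD (what buildtree returns).
  val tree: Option[KdNode[Int]] = Some(KdNode(Seq(1, 2), None, None, 0))
  val to: Seq[Int] = Seq(0, 0)

  // Call the method on the node inside the Option, not as a free function.
  val knn: Seq[Nearest[Int]] =
    tree.map(node => node.Knearest(to)).getOrElse(Seq.empty)

  // An empty tree simply yields no neighbors.
  val none: Seq[Nearest[Int]] =
    (None: Option[KdNode[Int]]).map(_.Knearest(to)).getOrElse(Seq.empty)
}
```

On the actual pair RDD the analogous call would presumably be kdtree.mapValues(opt => opt.map(_.Knearest(to))), with to defined outside the closure.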

Related

Scala Dynamically built Function Chain with varying inputs

I'm trying to dynamically create a chain of functions to perform on a numeric value. The chain is created at runtime from text instructions.
The trick is that the functions vary in what types they produce. Some functions produce a Double, some produce a Long.
Update
The core issue is that I have a massive amount of data to process, but different values require different processing. In addition to the data I have specifications on how to extract and manipulate values to their final form, such as applying a polynomial, using a lookup table, changing the binary format (like two's complement), etc. These specs are in a file of some sort (I'm creating the file from a database, but that's not important to the conversation), and I can apply these specs to multiple data files.
So with functions like these (these are just examples; there are tons of them):
def Multiply(input: Long, factor: Double): Double = input * factor
def Poly(input: Double, co: Array[Double]): Double = ??? // do some polynomial math
I can manually create a chain like this:
val poly = (x: Double) => EUSteps.Poly(x,Array[Double](1,2))
val mult = (x: Long) => EUSteps.Multiply(x, 1.5)
val chain = mult andThen poly
And if I call chain(1) I get 4
Now I want to be able to parse a string like "MULT(1.5);POLY(1,2)" and get that same chain. The idea is that I can define the chain however I want. Maybe it's "MULT(1.5);MULT(2);POLY(1,2,3)", for example. So I can make the functions generic, like this:
def Multiply[A](value: A, factor: Double)(implicit num: Numeric[A]) = num.toDouble(value) * factor
def Poly[A](value: A, co: Array[Double])(implicit num: Numeric[A]): Double = ??? // do some poly math
Parsing the string isn't hard as it's very simple.
How can I build the chain dynamically?
If it helps, the input is always going to be Long for the first step in the chain. The result could be Long or Double, and I'm OK with it if I have to do two versions based on the end result, so one that goes Long to Long, the other that goes Long to Double.
What I've tried
If I define my functions as having the same signature, like this:
def Multiply(value: Double, factor:Double) = value*factor
def Poly(value: Double, co: Array[Double]): Double = ??? // body not shown
I can do it as part of a map operation:
def ParseList(instruction: String) = {
  val instructions = instruction.split(';')
  instructions.map(inst => {
    // "MULT(1.5)" splits into Array("MULT", "1.5")
    val instParts = inst.split(Array(',', '(', ')'))
    val instruction = instParts(0).toUpperCase()
    val instArgs = instParts.drop(1).map(arg => arg.toDouble)
    instruction match {
      case "POLY" => (x: Double) => EUSteps.Poly(x, instArgs)
      case "MULT" => (x: Double) => Multiply(x, instArgs(0))
    }
  }).reduceLeft((a, b) => a andThen b)
}
However, that breaks as soon as I change one of the arguments or return types to Long:
def Multiply(value: Long, factor:Double) = value*factor
And change my case
    instruction match {
      case "POLY" => (x: Double) => EUSteps.Poly(x, instArgs)
      case "MULT" => (x: Long) => Multiply(x, instArgs(0))
    }
  }).reduceLeft((a, b) => a andThen b)
Now reduceLeft complains, because it expects every step to be Double => Double and one of them is now Long => Double.
Update 2
The way I solved it was to do what Levi suggested in the comments. I'm sure this is not very Scala-y, but when in doubt I go back to my OO roots. I suspect there is a more elegant way to do it though.
I declared an abstract class called ParamVal:
abstract class ParamVal {
  def toDouble(): Double
  def toLong(): Long
}
Then Long and Double types to go with it that implement the conversions:
case class DoubleVal(value: Double) extends ParamVal {
  override def toDouble(): Double = value
  override def toLong(): Long = value.toLong
}
case class LongVal(value: Long) extends ParamVal {
  override def toDouble(): Double = value.toDouble
  override def toLong(): Long = value
}
This lets me define all function inputs as ParamVal, and since each one expects a certain input type it's easy to just call toDouble or toLong as needed.
NOTE: The app that creates these instructions already makes sure the chain is correct.
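For illustration, here is a minimal sketch of how the chain could then be built, since every step now has the same type ParamVal => ParamVal. The names multiply, poly and parseChain are hypothetical, the accessors are declared without parentheses (a small variation on the classes above), and the polynomial is assumed to be co(0) + co(1)*x + co(2)*x^2 + ..., which is consistent with chain(1) giving 4 earlier.

```scala
object ChainSketch {
  sealed trait ParamVal {
    def toDouble: Double
    def toLong: Long
  }
  case class DoubleVal(value: Double) extends ParamVal {
    def toDouble: Double = value
    def toLong: Long = value.toLong
  }
  case class LongVal(value: Long) extends ParamVal {
    def toDouble: Double = value.toDouble
    def toLong: Long = value
  }

  // Hypothetical step: expects a Long input, produces a Double.
  def multiply(factor: Double): ParamVal => ParamVal =
    in => DoubleVal(in.toLong * factor)

  // Hypothetical step: assumed coefficient order is co(i) * x^i.
  def poly(co: Array[Double]): ParamVal => ParamVal =
    in => {
      val x = in.toDouble
      DoubleVal(co.zipWithIndex.map { case (c, i) => c * math.pow(x, i) }.sum)
    }

  // Every step has the same function type, so reduceLeft can compose them.
  def parseChain(spec: String): ParamVal => ParamVal =
    spec.split(';').map { inst =>
      val parts = inst.split(Array(',', '(', ')'))
      val args = parts.drop(1).map(_.toDouble)
      parts(0).toUpperCase match {
        case "MULT" => multiply(args(0))
        case "POLY" => poly(args)
      }
    }.reduceLeft((a, b) => a andThen b)

  val chain: ParamVal => ParamVal = parseChain("MULT(1.5);POLY(1,2)")
  val result: Double = chain(LongVal(1)).toDouble
}
```

The cost of this design is that type mismatches between steps are no longer caught by the compiler; each step silently converts its input, which is acceptable here only because the instructions are already validated upstream.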
Some ideas:
Analyze the string chain upfront and figure out what will be the type of the final result and then use it for all steps all along. You will need a family of functions for each type.
Try to use Either[Long, Double] in the reduce part.

Error when I use my val in function, but not when I access val on its own in REPL.

I am writing Scala code to search a QuadTree. I'm fairly new to Scala, so this may be a simple question: Why do I get the error "value rec is not a member of QuadTree" when I try to use tree.rec in a function, but if I ask the REPL to evaluate tree1.rec, it successfully returns the data?
Anyway, I've got these classes to start with:
case class Rectangle (minx: Int, maxx: Int, miny: Int, maxy: Int)
abstract class QuadTree
case class Node (value : Int, nw : QuadTree, ne: QuadTree, sw: QuadTree, se: QuadTree, rec: Rectangle) extends QuadTree
case class Leaf (value : Int, rec: Rectangle) extends QuadTree
case class Empty (value: Int, rec: Rectangle) extends QuadTree
So I'm defining a Rectangle, and then a QuadTree, which is either a Node, a Leaf, or Empty, each carrying a value and a Rectangle.
Next I have a function called rectangles_Overlap that checks whether two rectangles overlap: it returns true if the two input Rectangles overlap and false if they do not.
If I define
val query = Rectangle(1, 8, 8, 16)
val tree1 = Node(13,Node(12,Leaf(7,Rectangle(1,4,12,16)),Leaf(3,Rectangle(4,8,12,16)),Leaf(2,Rectangle(1,4,8,12)),Empty(0,Rectangle(4,8,8,12)),Rectangle(1,8,8,16)),Empty(0,Rectangle(8,16,8,16)),Empty(0,Rectangle(1,8,1,8)),Leaf(1,Rectangle(8,16,1,8)),Rectangle(1,16,1,16))
If I want to get the size of the tree, I can type into the REPL
scala> tree1.rec
res0: Rectangle = Rectangle(1,16,1,16)
I can also use my rectangles_Overlap method to see if the query Rectangle overlaps with the tree Rectangle.
scala> rectangles_Overlap(query, tree1.rec)
res1: Boolean = true
But if I try to use rectangles_Overlap(query, tree.rec) in another function, it fails:
def queryBoolean(query: Rectangle, tree: QuadTree): Boolean = {
  if (rectangles_Overlap(query, tree.rec)) {
    println("Yay they overlap so I can do other stuff...")
    // I want to add other code here after I get this working
    true
  } else {
    println("Nah, they don't overlap, don't need to do anything")
    false
  }
}
When I then call the function with queryBoolean(query, tree1), I get this error:
<console>:17: error: value rec is not a member of QuadTree
if (rectangles_Overlap(query, tree.rec)) {
So my question again is: Why can I evaluate tree1.rec in the REPL, but cannot use tree.rec in a function without an error?
My original thought was that perhaps I was overloading something... but rec is always a Rectangle in all of the case classes that extend QuadTree. So I don't believe I am overloading anything?
That is true: rec, of type Rectangle, is not a member of QuadTree. If you change the parameter type from QuadTree to Node, it will work. Otherwise rec must be declared in QuadTree too.
def queryBoolean(query: Rectangle, tree: Node): Boolean
To declare it in QuadTree, add it as a val (val is needed to make it accessible), then override it in each subclass and pass it to the super constructor as below:
abstract class QuadTree(val rec: Rectangle)
case class Node (value : Int, nw : QuadTree, ne: QuadTree, sw: QuadTree, se: QuadTree, override val rec: Rectangle) extends QuadTree(rec)
case class Leaf (value : Int, override val rec: Rectangle) extends QuadTree(rec)
case class Empty (value: Int, override val rec: Rectangle) extends QuadTree(rec)
This is because rec is not defined for class QuadTree, but only for its subclasses, therefore there is no value to call. The reason that this works properly on REPL is because of the way it is defined. You define it as
val tree1 = Node(/*some params*/)
Which means it has type Node, instead of QuadTree. As Node has a defined rec parameter, everything works out fine. However in the method, you pass in a raw QuadTree, so no rec is defined.
To get it to work, either add rec to the QuadTree class, or pass in a specific subtype, which has the correct parameters defined.
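As a sketch of the first option, rec can also be declared as an abstract member of QuadTree, which the case class parameters then implement without any override or super-constructor call. The rectanglesOverlap function below is a hypothetical stand-in, since the question's rectangles_Overlap is not shown; it treats rectangles that only touch at an edge as non-overlapping, which is an assumption.

```scala
object QuadTreeSketch {
  case class Rectangle(minx: Int, maxx: Int, miny: Int, maxy: Int)

  sealed abstract class QuadTree {
    def rec: Rectangle // abstract: every subclass must provide a rec
  }
  // A case class parameter is a val, so it implements the abstract member.
  case class Node(value: Int, nw: QuadTree, ne: QuadTree, sw: QuadTree, se: QuadTree, rec: Rectangle) extends QuadTree
  case class Leaf(value: Int, rec: Rectangle) extends QuadTree
  case class Empty(value: Int, rec: Rectangle) extends QuadTree

  // Hypothetical stand-in for rectangles_Overlap (strict overlap, touching edges do not count).
  def rectanglesOverlap(a: Rectangle, b: Rectangle): Boolean =
    a.minx < b.maxx && b.minx < a.maxx && a.miny < b.maxy && b.miny < a.maxy

  // tree is statically a QuadTree, and tree.rec now type-checks.
  def queryBoolean(query: Rectangle, tree: QuadTree): Boolean =
    rectanglesOverlap(query, tree.rec)

  val query: Rectangle = Rectangle(1, 8, 8, 16)
  val inside: QuadTree = Leaf(7, Rectangle(1, 4, 12, 16))
  val outside: QuadTree = Leaf(1, Rectangle(8, 16, 1, 8))
}
```

Making the class sealed is optional; it just lets the compiler check pattern matches over Node, Leaf and Empty for exhaustiveness.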

How to write a currying Scala Function trait?

Issue
First approach
I would like to have
trait Distance extends ((SpacePoint, SpacePoint) => Double)
object EuclideanDistance extends Distance {
override def apply(sp1: SpacePoint, sp2: SpacePoint): Double = ???
}
trait Kernel extends (((Distance)(SpacePoint, SpacePoint)) => Double)
object GaussianKernel extends Kernel {
override def apply(distance: Distance)(sp1: SpacePoint, sp2: SpacePoint): Double = ???
}
However, the apply of object GaussianKernel extends Kernel is not an accepted override of the apply of trait Kernel.
Second approach - EDIT: turns out this works after all...
Alternatively I could write
trait Kernel extends ((Distance) => ( (SpacePoint, SpacePoint) => Double))
object GaussianKernel extends Kernel {
override def apply(distance: Distance): (SpacePoint, SpacePoint) => Double =
(sp1: SpacePoint, sp2: SpacePoint) =>
math.exp(-math.pow(distance(sp1, sp2), 2) / (2))
}
but am not sure this is currying...
EDIT: Turns out that I can use this second approach in a currying fashion. I think it is exactly what the typical currying is, only without the syntactic sugar.
Explanation of the idea
The idea is this: for my algorithm I need a Kernel. This kernel calculates a metric for two vectors in space, here SpacePoints. For that, the Kernel requires a way to calculate the distance between the two SpacePoints. Both distance and kernel should be exchangeable (open-closed principle), thus I declare them as traits (in Java I had them declared as interfaces). Here I use the Euclidean Distance (not shown) and the Gaussian Kernel. Why the currying? Later, when using those things, the distance is going to be more or less the same for all measurements, while the SpacePoints will change all the time. Again, I am trying to stay true to the open-closed principle. Thus, in a first step I would like the GaussianKernel to be pre-configured (if you will) with a distance and return a Function that can be fed with the SpacePoints later in the program (I am sure the code is wrong, it is just to give you an idea of what I am aiming at):
val myFirstKernel = GaussianKernel(EuclideanDistance)
val mySecondKernel = GaussianKernel(FancyDistance)
val myThirdKernel = EpanechnikovKernel(EuclideanDistance)
// ... lots of code ...
val firstOtherClass = new OtherClass(myFirstKernel)
val secondOtherClass = new OtherClass(mySecondKernel)
val thirdOtherClass = new OtherClass(myThirdKernel)
// ... meanwhile in "OtherClass" ...
class OtherClass(kernel: Kernel) {
  val thisSpacePoint = ??? // ... fancy stuff going on ...
  val thatSpacePoint = ??? // ... fancy stuff going on ...
val calculatedKernel = kernel(thisSpacePoint, thatSpacePoint)
}
Questions
How do I build my trait?
Since distance can be different for different GaussianKernels - should GaussianKernel be a class instead of an object?
Should I partially apply GaussianKernel instead of currying?
Is my approach bad and GaussianKernel should be a class that stores the distance in a field?
I would just use functions. All this extra stuff is just complexity and making things traits doesn't seem to add anything.
def euclideanDistance(p1: SpacePoint, p2: SpacePoint): Double = ???
class MyClass(kernel: (SpacePoint, SpacePoint) => Double) { ??? }
val myClass = new MyClass(euclideanDistance)
So just pass the kernel as a function that will compute your distance given two points.
I'm on my phone, so can't fully check, but this will give you an idea.
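This will allow you to partially apply the functions if you have the need. Imagine you have a base calculate method...
def calc(settings: Settings)(p1: SpacePoint, p2: SpacePoint): Double = ???
// The explicit function type lets the compiler eta-expand the partially applied method
val standardCalc: (SpacePoint, SpacePoint) => Double = calc(defaultSettings)
val customCalc: (SpacePoint, SpacePoint) => Double = calc(customSettings)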
I would start with modeling everything as functions first, then roll up commonality into traits only if needed.
Answers
1. How do I build my trait?
The second approach is the way to go. You just can't use the syntactic sugar of currying as usual, but this is the same as currying:
GaussianKernel(ContinuousEuclideanDistance)(2, sp1, sp2)
GaussianKernel(ContinuousManhattanDistance)(2, sp1, sp2)
val eKern = GaussianKernel(ContinuousEuclideanDistance)
eKern(2, sp1, sp2)
eKern(2, sp1, sp3)
val mKern = GaussianKernel(ContinuousManhattanDistance)
mKern(2, sp1, sp2)
mKern(2, sp1, sp3)
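To make the second approach concrete, here is a small self-contained sketch following the traits from the question (without the extra leading parameter used in the calls above). SpacePoint is not shown in the original, so the Vector-based definition and the Euclidean formula are assumptions for illustration only.

```scala
object KernelSketch {
  // Hypothetical minimal SpacePoint; the original does not show its definition.
  case class SpacePoint(coords: Vector[Double])

  trait Distance extends ((SpacePoint, SpacePoint) => Double)

  object EuclideanDistance extends Distance {
    override def apply(sp1: SpacePoint, sp2: SpacePoint): Double =
      math.sqrt(sp1.coords.zip(sp2.coords).map { case (a, b) => (a - b) * (a - b) }.sum)
  }

  // A Kernel takes a Distance and returns a function of two SpacePoints.
  trait Kernel extends (Distance => ((SpacePoint, SpacePoint) => Double))

  object GaussianKernel extends Kernel {
    override def apply(distance: Distance): (SpacePoint, SpacePoint) => Double =
      (sp1, sp2) => math.exp(-math.pow(distance(sp1, sp2), 2) / 2)
  }

  // First step: fix the distance once.
  val eKern: (SpacePoint, SpacePoint) => Double = GaussianKernel(EuclideanDistance)

  // Second step: feed in the points as often as needed.
  val origin: SpacePoint = SpacePoint(Vector(0.0, 0.0))
  val same: Double = eKern(origin, origin)                       // distance 0
  val far: Double = eKern(origin, SpacePoint(Vector(3.0, 4.0)))  // distance 5
}
```

Calling GaussianKernel(EuclideanDistance)(p, q) in one go also works, so the call sites look the same as with a curried method.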
Why the first approach does not work
Because currying is only possible for methods (duh...). The issue starts with the notion that a Function is very much like a method, only that the actual method is the apply method, which is invoked by calling the Function's "constructor".
First of all: If an object has an apply method, it already has this ability - no need to extend a Function. Extending a Function only forces the object to have an apply method. When I say "object" here I mean both a singleton Scala object (declared with the keyword object) and an instance of a class.
If it is a class MyClass, then the call MyClass(...) refers to the constructor (thus a new before it is required) and the apply is masked. However, after the instantiation, I can use the resulting object in the way mentioned: val myClass = new MyClass(...), where myClass is an object (a class instance). Now I can write myClass(...), calling the apply method.
If it is a singleton object, then I already have an object and can directly write MyObject(...) to call the apply method. Of course an object (in both senses) does not have a constructor, and thus the apply is not masked and can be used. When this is done, it just looks the same as a constructor call, but it isn't (that's Scala syntax for you - just because it looks similar doesn't mean it's the same thing).
Second of all: Conceptually, currying is syntactic sugar (the compiler does not literally rewrite methods this way, but the shapes correspond):
def mymethod(a: Int)(b: Double): String = ???
is syntactic sugar for
def mymethod(a: Int): ((Double) => String) = ???
which is syntactic sugar for
def mymethod(a: Int): Function1[Double, String] = ???
thus
def mymethod(a: Int): Function1[Double, String] = {
  new Function1[Double, String] {
    def apply(b: Double): String = ???
  }
}
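A quick way to see the correspondence, as a sketch with the hypothetical names curried and returning: a method with two parameter lists and a method that returns a function can both be turned into the same Double => String value and behave identically.

```scala
object CurrySketch {
  // Method with two parameter lists.
  def curried(a: Int)(b: Double): String = s"$a-$b"

  // Method with one parameter list that returns a function.
  def returning(a: Int): Double => String = b => s"$a-$b"

  // With an expected function type, the partially applied method is eta-expanded.
  val f: Double => String = curried(1)
  val g: Double => String = returning(1)
}
```

The two are interchangeable at the call site (curried(1)(2.0) and returning(1)(2.0) both type-check), but only the second has a return type that can be named in an extends clause.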
(If we extend a FunctionN[T1, T2, ..., Tn+1] it works like this: The last type Tn+1 is the output type of the apply method, the first N types are the input types.)
Now, the apply method here is supposed to be curried:
object GaussianKernel extends Kernel {
override def apply(distance: Distance)(sp1: SpacePoint, sp2: SpacePoint): Double = ???
}
which translates to
object GaussianKernel extends Kernel {
  def apply(distance: Distance): Function2[SpacePoint, SpacePoint, Double] = {
    new Function2[SpacePoint, SpacePoint, Double] {
      def apply(sp1: SpacePoint, sp2: SpacePoint): Double = ???
    }
  }
}
So what should GaussianKernel extend (or what is GaussianKernel)? It should extend
Function1[Distance, Function2[SpacePoint, SpacePoint, Double]]
(which is the same as Distance => ((SpacePoint, SpacePoint) => Double), the second approach).
The issue here is that this cannot be written in curried form, because it is a type description and not a method's signature. After discussing all this, it seems obvious, but before discussing it, it might not have been. The type description seemed to translate directly into the signature of the apply method (the first, or only one, depending on how one takes the syntactic sugar apart), but it doesn't. To be fair, this is something that could have been implemented in the compiler: recognizing the type description and the apply method's signature as equal.
2. Since distance can be different for different GaussianKernels - should GaussianKernel be a class instead of an object?
Both are valid implementations. Using them later differs only in the presence or absence of new.
If one does not like the new, one can use a companion object as a factory.
3. Should I partially apply GaussianKernel instead of currying?
In general this is preferred according to http://www.vasinov.com/blog/on-currying-and-partial-function-application/#toc-use-cases
An advantage of currying would be the nicer code without _: ??? for the missing parameters.
4. Is my approach bad and GaussianKernel should be a class that stores the distance in a field?
See 2.

How to sum up all nodes in this scala fold_tree function

I am a beginner with scala. I have been given a fold_tree_preorder function that implements the higher order function fold on a binary tree. The tree, node and leaf definitions are below
abstract class Tree[+A]
case class Leaf[A](value: A) extends Tree[A]
case class Node[A](value: A, left: Tree[A], right: Tree[A]) extends Tree[A]
This is the function I have been given
def fold_tree_preorder [Z,A](f:(Z,A)=>Z) (z:Z) (t:Tree[A]) : Z =
  t match {
    case Leaf(value) => f(z, value)
    case Node(value , lt, rt) => {
      val z1 = f(z,value)
      val z2 = fold_tree_preorder (f) (z1) (lt)
      fold_tree_preorder (f) (z2) (rt)
    }
  }
I am not sure how to actually call this function. I am trying to do something like the following:
def count_tree [A](t:Tree[A]) : Int =
fold_tree_preorder[A,A=>A]((z,a)=>(z+a))(0)(t)
But I am getting errors like type mismatch error. I don't think the parameters themselves are correct either, but I'm not even sure how to test what the output would look like because I can't figure out the correct way of calling the fold_tree_preorder function. How can I input the correct syntax to call this function?
Z in the fold_tree_preorder function is the output type you are expecting, which here is Int.
Use the function as below, assuming that count_tree counts the number of nodes of the tree:
def count_tree [A](t:Tree[A]) : Int =
fold_tree_preorder[Int, A]((z,a) => z + 1 )(0)(t)
Just add 1 to z on visiting each node to count the number of nodes.
def fold_tree_preorder [Z,A](f:(Z,A)=>Z) (z:Z) (t:Tree[A]) : Z
The first argument, f, is a function that takes the result so far (of type Z) and the value contained in your tree (of type A).
def count_tree [A](t:Tree[A]) : Int
In your function you're promising to return an Int based on a tree whose element type you don't know, parameterized as A. This leads you to add an Int to an A.
Summing and counting are different things, if you decide to count the number of values, you do not need to know anything about A. If you decide to sum the values, you need to know you have a + operator defined for A.
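To illustrate the difference, here is a sketch that shows both. count_tree needs nothing from A, while the hypothetical sum_tree asks for a Numeric[A] so that addition and a zero are available; the tree definitions and fold_tree_preorder are repeated from the question so the snippet stands alone.

```scala
object FoldSketch {
  abstract class Tree[+A]
  case class Leaf[A](value: A) extends Tree[A]
  case class Node[A](value: A, left: Tree[A], right: Tree[A]) extends Tree[A]

  def fold_tree_preorder[Z, A](f: (Z, A) => Z)(z: Z)(t: Tree[A]): Z =
    t match {
      case Leaf(value) => f(z, value)
      case Node(value, lt, rt) =>
        val z1 = f(z, value)
        val z2 = fold_tree_preorder(f)(z1)(lt)
        fold_tree_preorder(f)(z2)(rt)
    }

  // Counting: Z = Int, and the element itself is ignored.
  def count_tree[A](t: Tree[A]): Int =
    fold_tree_preorder[Int, A]((z, _) => z + 1)(0)(t)

  // Summing (hypothetical helper): Z = A, which requires a Numeric[A].
  def sum_tree[A](t: Tree[A])(implicit num: Numeric[A]): A =
    fold_tree_preorder[A, A]((z, a) => num.plus(z, a))(num.zero)(t)

  val t: Tree[Int] = Node(1, Leaf(2), Node(3, Leaf(4), Leaf(5)))
  val count: Int = count_tree(t)
  val sum: Int = sum_tree(t)
}
```

Note that count_tree counts leaves as well as inner nodes, because f is applied to every value in the tree.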
You might need to learn more about scala's types.
https://twitter.github.io/scala_school/advanced-types.html

accept multiple types and numbers of parameters scala

I have something akin to the following code (only with more parameters) in a program I am developing:
class Particle {
  //variables
  def this(position: Position2D, velocity: Vector2D) = {
    this()
    //constructors
  }
  def this(xPos: Double, yPos: Double, magnitude: Double, angle: Double) = {
    this(new Position2D(xPos, yPos), new Vector2D(magnitude, angle))
  }
}
And I would like to make it so that the program is able to accept the Position2D object for the first parameter and two Doubles for the second parameter, or two Doubles for the first parameter and a Vector2D object for the 2nd parameter, without creating more this statements for each combination of parameters. I know that it's possible to use something like:
def this(posObj: Either[Position2D, Array[Double]], velObj: Either[Vector2D, Array[Double]]) = {...}
And then test to see what type posObj and velObj are; however, I was curious if there was a way to do this without requiring the 2nd part of the Either to be just one item such as an Array, so that you could initialize Particle like the following:
val a = new Particle(new Position2D(3, 6), 30, 5)
val b = new Particle(3, 6, new Vector2D(30, 5))
val c = new Particle(new Position2D(3, 6), new Vector2D(30, 5))
val d = new Particle(3, 6, 30, 5)
The short answer is no, the constructors are rigid in the sense that the number of parameters (assuming you don't want them in a Seq) for a method (or constructor in this case) has to be quite specific.
Potentially this would be possible by enclosing the naked pairs in a Tuple2 and creating a typeclass instance to unpack each one, but it would be an order of magnitude more complicated than just adding the constructors.
A possible solution for this would be to create your class with 2 parameter lists in the constructor, one taking the Position2D and one taking the Vector2D and then create implicit conversions from Tuple2 to both of them:
case class Vector2D(x: Double, y: Double)
case class Position2D(x: Double, y: Double)
implicit def tuple2Vector2D(t: (Double,Double)): Vector2D = Vector2D(t._1, t._2)
implicit def tuple2Position2D(t: (Double,Double)): Position2D = Position2D(t._1, t._2)
class Particle(val v: Vector2D)(val p: Position2D)
scala> new Particle(1.1,3.1)(6.1,0.3)
res15: Particle = Particle@15a74f84
scala> new Particle(new Vector2D(1.1,3.1))(6.1,0.3)
res16: Particle = Particle@47283198
scala> new Particle(1.1,3.1)(new Position2D(6.1,0.3))
res17: Particle = Particle@2571e404
scala> new Particle(new Vector2D(1.1,3.1))(new Position2D(6.1,0.3))
res18: Particle = Particle@38be7bc0
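The same idea as a compilable sketch outside the REPL. It differs from the session above in two ways, both of which are choices made here rather than part of the answer: the pairs are written as explicit tuples instead of relying on the compiler adapting two arguments into a tuple, and the language feature import is added to silence the implicit-conversion warning.

```scala
import scala.language.implicitConversions

object ParticleSketch {
  case class Vector2D(x: Double, y: Double)
  case class Position2D(x: Double, y: Double)

  // Each pair of Doubles is converted according to the expected parameter type.
  implicit def tuple2Vector2D(t: (Double, Double)): Vector2D = Vector2D(t._1, t._2)
  implicit def tuple2Position2D(t: (Double, Double)): Position2D = Position2D(t._1, t._2)

  class Particle(val v: Vector2D)(val p: Position2D)

  val a = new Particle((1.1, 3.1))((6.1, 0.3))             // both from tuples
  val b = new Particle(Vector2D(1.1, 3.1))((6.1, 0.3))     // object, then tuple
  val c = new Particle((1.1, 3.1))(Position2D(6.1, 0.3))   // tuple, then object
}
```

The two parameter lists are what keep this unambiguous: each list has exactly one expected type, so the compiler knows which conversion to apply to each pair.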