Array of case class as parameter T - scala

If I have these 2 or N case classes as follows:
case class dbep1 (a1:String, b1:String, ts1: Integer)
case class dbep2 (a2:String, b2:String, ts2: Integer)
var dbep1D = new ArrayBuffer[dbep1]()
dbep1D = ArrayBuffer(dbep1("Mark", "Hamlin", 2), dbep1("Kumar", "XYZ", 3), dbep1("Tom", "Poolsoft", 4))
var dbep2D = new ArrayBuffer[dbep2]()
dbep2D = ArrayBuffer(dbep2("Pjotr", "Hamming", 4), dbep2("Kumar", "ABNC", 7), dbep2("Tom", "Gregory", 3))
How can I write a def that accepts both of these as an array of T, like so:
import scala.collection.mutable.ArrayBuffer
def printArray[a: ArrayBuffer[T]]() = {
// do anything with ArrayBuffer[T]
println(a)
}
Clearly that is not correct. Passing a single case class as a type parameter does work, as shown below:
def doSomeThing[T]() = {
// ...
}
case class SomeClassA(id: UUID, time: Date)
doSomeThing[SomeClassA]()
But what if I want an ArrayBuffer of a given case class as input?
Maybe it is just not possible.

Your doSomeThing function does not take an argument. It takes a generic argument but no actual values, and with no arguments at all (not even an implicit ClassTag), there's absolutely nothing you can do with it.
I'll assume you meant
def doSomeThing[T](arg: T) = {
// ...
}
If you want your function to take an arbitrary array buffer instead, you don't replace the thing inside brackets; that's still a free variable. You change the argument type.
def printArray[T](arg: ArrayBuffer[T]) = {
// ...
}
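As a minimal sketch of that signature in use, the following reuses the two case classes from the question (with Int instead of Integer for brevity). The wrapper object `PrintArrayDemo` and the helper `describe` are names made up for this illustration.

```scala
import scala.collection.mutable.ArrayBuffer

object PrintArrayDemo {
  case class dbep1(a1: String, b1: String, ts1: Int)
  case class dbep2(a2: String, b2: String, ts2: Int)

  // T is inferred from the argument, so any ArrayBuffer is accepted.
  def describe[T](arg: ArrayBuffer[T]): String =
    s"${arg.length} element(s): ${arg.mkString(", ")}"

  def printArray[T](arg: ArrayBuffer[T]): Unit = println(describe(arg))

  def main(args: Array[String]): Unit = {
    // The same method handles buffers of either case class.
    printArray(ArrayBuffer(dbep1("Mark", "Hamlin", 2), dbep1("Kumar", "XYZ", 3)))
    printArray(ArrayBuffer(dbep2("Pjotr", "Hamming", 4)))
  }
}
```

No type argument is needed at the call site because the compiler infers T from the buffer that is passed in.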

Object instantiation variants in scala

I am really new to scala, and I am currently making my way through the tour (https://docs.scala-lang.org/tour/variances.html).
Now, looking at some library (akka-http), I stumbled across some code like this:
def fetchItem(itemId: Long): Future[Option[Item]] = Future {
orders.find(o => o.id == itemId)
}
And I don't quite understand the syntax, or more precisely, the = Future { part. As I learned, the syntax for methods is def [methodName]([Arguments])(:[returnType]) = [codeblock].
However, the above seems to differ in that it has the Future in front of the "codeblock". Is this some kind of object instantiation? Because I could not find documentation about this syntax, I tried things like this in my play code:
{
val myCat:Cat = new Cat("first cat")
val myOtherCat:Cat = Cat { "second cat" }
val myThirdCat:Cat = MyObject.getSomeCat
}
...
object MyObject
{
def getSomeCat: Cat = Cat
{
"blabla"
}
}
And all of this works, in that it creates a new Cat object. So it seems like new Cat(args) is equivalent to Cat { args }.
But shouldn't def getSomeCat: Cat = Cat define a method with a code block, not instantiate a new Cat object? I am confused.
I think there are a couple of things here:
1.
The [codeblock] in method syntax doesn't have to be enclosed in {}. If there's only one expression, it's allowed to omit them.
E.g.
def add(x: Int, y: Int) = x + y
or
def add(x: Int, y: Int) = Future { x + y }
2.
Each class can have its companion object define an apply() method, which can be invoked without explicitly saying "apply" (this is special Scala syntactic sugar). This allows us to construct instances of the class by going through the companion object, and since "apply" can be omitted, at first glance it looks like going through the class itself, just without the "new" keyword.
Without the object:
class Cat(s: String)
val myFirstCat: Cat = new Cat("first cat") // OK
val mySecondCat: Cat = Cat("second cat") // error in Scala 2 (Scala 3 allows it via universal apply methods)
And now with the object:
class Cat(s: String)
object Cat {
def apply(s: String) = new Cat(s)
}
val myFirstCat: Cat = new Cat("first cat") // OK
val mySecondCat: Cat = Cat.apply("second cat") // OK
val myThirdCat: Cat = Cat("third cat") // OK (uses apply under the hood)
val myFourthCat: Cat = Cat { "fourth cat" } // OK as well
Note how fourth cat invocation works just fine with curly braces, because methods can be passed codeblocks (last evaluated value in the block will be passed, just like in functions).
3.
Case classes are another slightly "special" Scala construct in a sense that they give you convenience by automatically providing some stuff for you "behind the curtain", including an associated companion object with apply().
case class Cat(s: String)
val myFirstCat: Cat = new Cat("first cat") // OK
val mySecondCat: Cat = Cat.apply("second cat") // OK
val myThirdCat: Cat = Cat("third cat") // OK
What happens in your case with Future is number 2, identical to "fourth cat". Regarding your question about new Cat(args) being equivalent to Cat { args }, it's most likely situation number 3 - Cat is a case class. Either that, or its companion object explicitly defines the apply() method.
The short answer is yes, that Future code is an object instantiation.
Your Cat class has a single String argument and can be created using Cat(<string>). If you want to compute a value for the string you can put it in a block using {} as you did in your example. This block can contain arbitrary code, and the value of the block will be the value of the last expression in the block which must be type String.
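A Future[T] is created through the apply method of the Future companion object, which takes a single argument of type T (plus an implicit ExecutionContext), so you can write Future(value). You can pass an arbitrary block of code as before, as long as it returns a value of type T.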
So creating a Future is just like creating any other object. The fetchItem code is just creating a Future object and returning it.
However there is a subtlety with Future in that the parameter is defined as a "call-by-name" parameter not the default "call-by-value" parameter. This means that it is not evaluated until it is used, and it is evaluated every time it is used. In the case of a Future the parameter is evaluated once at a later time and potentially on a different thread. If you use a block to compute the parameter then the whole block will be executed each time the parameter is used.
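The difference between call-by-name and call-by-value can be sketched without Future at all. In the example below, the object `ByNameDemo`, its counter `evaluations`, and the methods `twice` and `twiceByValue` are all hypothetical names used only to count how often the block runs.

```scala
object ByNameDemo {
  // Counts how many times the argument block has been executed.
  var evaluations = 0

  // `body` is call-by-name: the block is re-run at each use inside the method.
  def twice(body: => Int): Int = body + body

  // `value` is call-by-value: the block runs once, before the method is entered.
  def twiceByValue(value: Int): Int = value + value

  def main(args: Array[String]): Unit = {
    val a = twice { evaluations += 1; 21 }
    println(s"by-name: result $a after $evaluations evaluation(s)")

    evaluations = 0
    val b = twiceByValue { evaluations += 1; 21 }
    println(s"by-value: result $b after $evaluations evaluation(s)")
  }
}
```

Both calls return the same result, but the by-name version runs the block twice because `body` is referenced twice.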
Scala has very powerful syntax and some useful shortcuts, but it can take a while to get used to it!
A typical method structure will look like:
case class Bar(name: String)
def foo(param: String): Bar = {
// code block.
}
However, Scala is quite flexible when it comes to method definitions. One such flexibility is that if your method body contains a single expression, you can omit the curly braces { }.
def foo(param: String): Bar = {
Bar("This is Bar") //Block only contains single expression.
}
// Can be written as:
def foo(param: String): Bar = Bar("This is Bar")
// In case of multiple expressions:
def foo(param: String): Bar = {
println("Returning Bar...")
Bar("This is Bar")
}
def foo(param: String): Bar = println("Returning Bar...") Bar("This is Bar") //Fails
def foo(param: String): Bar = println("Returning Bar..."); Bar("This is Bar") //Fails
def foo(param: String): Bar = {println("Returning Bar..."); Bar("This is Bar")} // Works
Similarly, in your code, fetchItem contains only a single expression, Future {orders.find(o => o.id == itemId)}, which returns a new Future (an instance of Future) of Option[Item], so the braces { } are optional. However, if you want, you can write it inside braces as below:
def fetchItem(itemId: Long): Future[Option[Item]] = {
Future {
orders.find(o => o.id == itemId)
}
}
Similarly, if a method takes only a single parameter, you can use curly braces at the call site, i.e. you can invoke fetchItem as:
fetchItem(10)
fetchItem{10}
fetchItem{
10
}
So, why use curly braces { } instead of parentheses ( )?
Because you can put multiple statements inside braces, which is needed when you have to perform several computations and pass the result of that computation as the value. For example:
fetchItem{
val x = 2
val y = 3
val z = 2
(x + y)*z //final result is 10 which is passed as parameter.
}
// In addition, using curly braces makes your code cleaner, e.g. with higher-order functions.
def test(id: String => Unit) = ???
test {
id => {
val result: List[String] = getDataById(id)
val updated = result.map(_.toUpperCase)
updated.mkString(",")
}
}
Now, coming to your case: when you invoke Future {...}, you are invoking the apply(...) method of the Scala Future companion object, which takes a by-name parameter body: => T and returns a new Future.
// The code below:
Future {
orders.find(o => o.id == itemId)
}
//Can be written as:
Future(orders.find(o => o.id == itemId))
Future.apply(orders.find(o => o.id == itemId))
// However, braces are not allowed with multiple parameters.
def foo(a:String, b:String) = ???
foo("1","2") // works
foo{"1", "2"} // won't work
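One common way around that restriction is to split the parameters into separate parameter lists, since braces can stand in for any list that holds exactly one argument. This is a minimal sketch; `BracesDemo` and `join` are illustrative names, not part of the original code.

```scala
object BracesDemo {
  // Two parameter lists, each holding a single parameter.
  def join(a: String)(b: String): String = a + b

  def main(args: Array[String]): Unit = {
    // The second argument is supplied as a block; its last expression is passed.
    val s = join("1") {
      val n = 1 + 1
      n.toString
    }
    println(s) // prints 12
  }
}
```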

Define return value in Spark Scala UDF

Imagine the following code:
def myUdf(arg: Int) = udf((vector: MyData) => {
// complex logic that returns a Double
})
How can I define the return type for myUdf so that people looking at the code will know immediately that it returns a Double?
I see two ways to do it, either define a method first and then lift it to a function
def myMethod(vector:MyData) : Double = {
// complex logic that returns a Double
}
val myUdf = udf(myMethod _)
or define a function first with explicit type:
val myFunction: Function1[MyData,Double] = (vector:MyData) => {
// complex logic that returns a Double
}
val myUdf = udf(myFunction)
I normally use the first approach for my UDFs.
Spark functions define several udf methods that have the following modifier/type: static <RT,A1, ..., A10> UserDefinedFunction
You can specify the input/output data types in square brackets as follows:
def myUdf(arg: Int) = udf[Double, MyData]((vector: MyData) => {
// complex logic that returns a Double
})
You can pass a type parameter to udf but you need to seemingly counter-intuitively pass the return type first, followed by the input types like [ReturnType, ArgTypes...], at least as of Spark 2.3.x. Using the original example (which seems to be a curried function based on arg):
def myUdf(arg: Int) = udf[Double, Seq[Int]]((vector: Seq[Int]) => {
13.37 // whatever
})
There is nothing special about UDF with lambda functions, they behave just like scala lambda function (see Specifying the lambda return type in Scala) so you could do:
def myUdf(arg: Int) = udf(((vector: MyData) => {
// complex logic that returns a Double
}): (MyData => Double))
or instead explicitly define your function:
def myFuncWithArg(arg: Int): MyData => Double = {
def myFunc(vector: MyData): Double = {
// complex logic that returns a Double. Use arg here
}
myFunc _
}
def myUdf(arg: Int) = udf(myFuncWithArg(arg))
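Both techniques can be tried without Spark on the classpath, since they are plain Scala. The sketch below leaves out the udf wrapper entirely; `MyData` here is a hypothetical stand-in for the type in the question, and summing its values is an arbitrary placeholder for the "complex logic".

```scala
object LambdaTypeDemo {
  // Hypothetical stand-in for the MyData type from the question.
  case class MyData(values: Seq[Double])

  // Option A: ascribe the function type to the lambda expression itself.
  val viaAscription = ((v: MyData) => v.values.sum): (MyData => Double)

  // Option B: a method that returns a function, with the return type spelled out.
  def withArg(arg: Int): MyData => Double =
    (v: MyData) => v.values.sum * arg

  def main(args: Array[String]): Unit = {
    val data = MyData(Seq(1.0, 2.0))
    println(viaAscription(data))
    println(withArg(2)(data))
  }
}
```

In both cases the compiler rejects the definition if the body stops returning a Double, which is the documentation effect the question asks for.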

can we have methods as arguments in a class definition?

The following is a snippet of code from a Scala book I was going through.
One of the parameters of the Car class is "def color: String".
I didn't understand this. "def" is a keyword to define a method. How can it be used in parameters?
scala> abstract class Car {
| val year: Int
| val automatic: Boolean = true
| def color: String
| }
A function that takes other functions as arguments is known as a higher-order function. Here is an example of that:
// A function that takes a list of Ints, and a function that takes an Int and returns a boolean.
def filterList(list: List[Int], filter: (Int) => Boolean) = { /* implementation */ }
// You might call it like this
def filter(i: Int) = i > 0
val list = List(-5, 0, 5, 10)
filterList(list, filter)
// Or shorthand like this
val list = List(-5, 0, 5, 10)
filterList(list, _ > 0)
However, that is not what is happening in your example. In your example the Car class has three class members: two of them are values (val), and one of them is a method (def). If you were to extend your abstract class, you could test the values out:
abstract class Car {
val year: Int
val automatic: Boolean = true
def color: String
}
case class Sedan(year: Int) extends Car {
def color = "red"
}
val volkswagen = Sedan(2012)
volkswagen.year // 2012
volkswagen.automatic // true
volkswagen.color // red
Here, having color as a function (using def) doesn't make too much sense, because with my implementation, the color is always going to be "red".
A better example for using a function for a class member would be for some value that is going to change:
class BrokenClock {
val currentTime = DateTime.now()
}
class Clock {
def currentTime = DateTime.now()
}
// This would always print the same time, because it is a value that was computed once when you create a new instance of BrokenClock
val brokenClock = new BrokenClock()
brokenClock.currentTime // 2017-03-31 22:51:00
brokenClock.currentTime // 2017-03-31 22:51:00
brokenClock.currentTime // 2017-03-31 22:51:00
// This will be a different value every time, because each time we are calling a function that is computing a new value for us
val clock = new Clock()
clock.currentTime // 2017-03-31 22:51:00
clock.currentTime // 2017-03-31 22:52:00
clock.currentTime // 2017-03-31 22:53:00
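The same val versus def behavior can be reproduced deterministically by swapping the wall clock for a counter. In this sketch `Ticker` is a hypothetical time source that returns the next integer on each call; it replaces DateTime.now() only so the result is predictable.

```scala
object ValVsDefDemo {
  // Hypothetical time source: each call to tick() returns the next integer.
  final class Ticker {
    private var n = 0
    def tick(): Int = { n += 1; n }
  }

  // val: tick() runs once, when the instance is constructed.
  class BrokenClock(t: Ticker) { val currentTime: Int = t.tick() }

  // def: tick() runs again on every access.
  class Clock(t: Ticker) { def currentTime: Int = t.tick() }

  def main(args: Array[String]): Unit = {
    val broken = new BrokenClock(new Ticker)
    println(s"${broken.currentTime} ${broken.currentTime}") // prints 1 1

    val clock = new Clock(new Ticker)
    println(s"${clock.currentTime} ${clock.currentTime}") // prints 1 2
  }
}
```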

How can I extend Scala collections with member values?

Say I have the following data structure:
case class Timestamped[M, CC[X] <: Seq[X]](elems : CC[M], timestamp : String)
So it's essentially a sequence with an attribute -- a timestamp -- attached to it. This works fine and I could create new instances with the syntax
val t = Timestamped(Seq(1,2,3,4),"2014-02-25")
t.elems.head // 1
t.timestamp // "2014-02-25"
The syntax is unwieldy and instead I want to be able to do something like:
val t = Timestamped(1,2,3,4)("2014-02-25")
t.head // 1
t.timestamp // "2014-02-25"
Where Timestamped is just an extension of a Seq and its implementation SeqLike, with a single attribute val timestamp : String.
This seems easy to do; just use a Seq with a mixin TimestampMixin { val timestamp : String }. But I can't figure out how to create the constructor. My question is: how do I create a constructor in the companion object, that creates a sequence with an extra member value? The signature is as follows:
object Timestamped {
def apply[M](elems: M*)(timestamp : String) : Seq[M] with TimestampMixin = ???
}
You'll see that it's not straightforward; collections use Builders to instantiate themselves, so I can't simply call the constructor and override some vals.
Scala collections are very complicated structures when it comes down to it. Extending Seq requires implementing apply, length, and iterator methods. In the end, you'll probably end up duplicating existing code for List, Set, or something else. You'll also probably have to worry about CanBuildFroms for your collection, which in the end I don't think is worth it if you just want to add a field.
Instead, consider an implicit conversion from your Timestamped type to Seq.
case class Timestamped[A](elems: Seq[A])(val timestamp: String)
object Timestamped {
implicit def toSeq[A](ts: Timestamped[A]): Seq[A] = ts.elems
}
Now, whenever I try to call a method from Seq, the compiler will implicitly convert Timestamped to Seq, and we can proceed as normal.
scala> val ts = Timestamped(List(1,2,3,4))("1/2/34")
ts: Timestamped[Int] = Timestamped(List(1, 2, 3, 4))
scala> ts.filter(_ > 2)
res18: Seq[Int] = List(3, 4)
There is one major drawback here, and it's that we're now stuck with Seq after performing operations on the original Timestamped.
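One possible way to soften that drawback is to add a method that transforms the elements and re-wraps the result with the original timestamp. This is only a sketch: `mapElems` is a hypothetical helper, not something the collections library provides, and `timestamp` is declared as a val so that it is accessible from outside.

```scala
import scala.language.implicitConversions

object TimestampedDemo {
  case class Timestamped[A](elems: Seq[A])(val timestamp: String) {
    // Hypothetical helper: transform the elements but keep the timestamp.
    def mapElems[B](f: Seq[A] => Seq[B]): Timestamped[B] =
      Timestamped(f(elems))(timestamp)
  }

  object Timestamped {
    implicit def toSeq[A](ts: Timestamped[A]): Seq[A] = ts.elems
  }

  def main(args: Array[String]): Unit = {
    val ts = Timestamped(List(1, 2, 3, 4))("1/2/34")
    val plain: Seq[Int] = ts.filter(_ > 2)  // implicit conversion, timestamp is lost
    val kept = ts.mapElems(_.filter(_ > 2)) // still a Timestamped[Int]
    println(plain)
    println(kept.timestamp)
  }
}
```

The cost is that callers have to remember to go through `mapElems` whenever they want to keep the timestamp.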
Go the other way... extend Seq, it only has 3 abstract members:
case class Stamped[T](elems: Seq[T], stamp: Long) extends Seq[T] {
override def apply(i: Int) = elems.apply(i)
override def iterator = elems.iterator
override def length = elems.length
}
val x = Stamped(List(10,20,30), 15L)
println(x.head) // 10
println(x.stamp) // 15
println(x.map { _ * 10}) // List(100, 200, 300)
println(x.filter { _ > 20}) // List(30)
Keep in mind that this only works as long as Seq is specific enough for your use cases; if you later find you need more complex collection behavior, this may become untenable.
EDIT: Added a version closer to the signature you were trying to create. Not sure if this helps you any more:
case class Stamped[T](elems: T*)(stamp: Long) extends Seq[T] {
def timeStamp = stamp
override def apply(i: Int) = elems.apply(i)
override def iterator = elems.iterator
override def length = elems.length
}
val x = Stamped(10,20,30)(15L)
println(x.head) // 10
println(x.timeStamp) // 15
println(x.map { _ * 10}) // List(100, 200, 300)
println(x.filter { _ > 20}) // List(30)
Where elems would end up being a generically created WrappedArray (an ArraySeq as of Scala 2.13).

Override default argument only when condition is fulfilled

Given
case class Foo (
x: Int = 1,
y: String,
)
What is the best way to instantiate said class, overwriting default params only if a local condition is fulfilled (e.g. the local variable corresponding to the constructor parameter is not None)
object Test {
/* Let's pretend I cannot know the state of x,y
* because they come from the network, a file... */
val x: Option[Int] = getX()
val y: Option[String] = getY()
Foo(
x=???, // how to get x=if(x.isDefined) x else "use default of Foo" here
y=y.get,
)
}
The straightforward solution of checking the condition and instantiating the case class differently does not scale (seven arguments with default values -> 128 different instantiations)
Solution 1) One solution I know is:
object Foo {
val Defaults = Foo(y="")
}
object Test {
val x: Option[Int] = getX()
val y: Option[String] = getY()
Foo(
x=x.getOrElse(Foo.Defaults.x),
y=y.get
)
}
This works ok-ish. When y is None I get the NoSuchElementException, but that's OK because it is a mandatory constructor argument. However, this is clearly a hack and has the distinct drawback that it is possible to write:
Foo(
x=x.getOrElse(Foo.Defaults.x),
y=y.getOrElse(Foo.Defaults.y)
)
When y is None you get a nonsensical default value for y.
Solution 2) Another solution is something like:
sealed trait Field
case object X extends Field
object Foo {
private val template = Foo(y="")
val Defaults: Field => Int = {
case X => template.x
}
}
object Test {
val x: Option[Int] = getX()
val y: Option[String] = getY()
Foo(
x=x.getOrElse(Foo.Defaults(X)),
y=y.get
)
}
This is a bit better type safety-wise, but now I have to create a type for each default parameter.
What would a correct and concise solution look like?
Not clear to me how you can do better than the following:
object Foo { def apply(optX: Option[Int], optY: Option[String]): Foo = Foo(optX.getOrElse(1), optY.get) }
case class Foo private(x: Int, y: String)
Foo(Some(5), None) // fails with NoSuchElementException
Foo(Some(5), Some("Hi")) // succeeds in creating Foo(5, "Hi")
Foo(None, Some("Hi")) // succeeds in creating Foo(1, "Hi"), note the default value of x
Whether a parameter is required or optional with a default is encoded via the invocation of get for the former and getOrElse for the latter.
Of course, you could wrap the Option's get method to provide a more meaningful message when required parameters are omitted.
I realize that is not far from your solution 1 and may not meet your needs.
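Wrapping get to produce a more meaningful message could look roughly like the sketch below. The names `FooDemo`, `required`, and `build` are hypothetical, and the choice of IllegalArgumentException is an assumption rather than anything the question prescribes.

```scala
object FooDemo {
  case class Foo(x: Int, y: String)

  // Hypothetical helper: like Option.get, but with a descriptive error.
  def required[A](name: String, opt: Option[A]): A =
    opt.getOrElse(
      throw new IllegalArgumentException(s"missing required parameter: $name"))

  // Optional parameters fall back to a default; required ones go through `required`.
  def build(optX: Option[Int], optY: Option[String]): Foo =
    Foo(optX.getOrElse(1), required("y", optY))

  def main(args: Array[String]): Unit = {
    println(build(None, Some("Hi")))    // x falls back to its default
    println(build(Some(5), Some("Hi")))
  }
}
```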