scala: Is collection object mutable? - scala

This question has no practical purpose; I'm just trying to get a better understanding of Scala.
All collections inherit from Iterable, but Iterable has no isMutable method.
I'd like input on whether the isMutable below is the most efficient way to assess mutability. It seems crude, but I couldn't find an alternative other than testing against every mutable collection class, which is not ideal since new mutable classes could be added in the future. (I would define the method as an implicit extension, but didn't for simplicity.)
import scala.collection.{immutable, mutable}
object IsMutable extends App {
  val mutableMap: mutable.Map[String, Int] = mutable.Map("Apples" -> 4, "Pineapples" -> 1, "Oranges" -> 10, "Grapes" -> 7)
  val immutableMap: immutable.Map[String, Int] = Map("Apples" -> 4, "Pineapples" -> 1, "Oranges" -> 10, "Grapes" -> 7)
  def isMutable[A](obj: Iterable[A]): Boolean = obj.getClass.toString.startsWith("class scala.collection.mutable")
  println(isMutable(mutableMap))
  println(isMutable(immutableMap))
}

I don't think relying on the class name is a good approach. What I'm proposing is probably not the most elegant way to find out whether a collection is mutable either, but you can use this:
def isMutable[T](iterable: scala.collection.Iterable[T]): Boolean =
  iterable.isInstanceOf[scala.collection.mutable.Iterable[T]]
I think it would work fine for most of the types (the weird interface below is because I'm using ammonite as REPL, which is pretty cool :D).
@ val immutableMap = scala.collection.immutable.Map[String, String]()
immutableMap: Map[String, String] = Map()
@ isMutable(immutableMap)
res14: Boolean = false
@ val mutableMap = scala.collection.mutable.Map[String, String]()
mutableMap: mutable.Map[String, String] = HashMap()
@ isMutable(mutableMap)
res16: Boolean = true

AminMal's idea is fine. Mutable collections do extend mutable.Iterable, so checking whether your collection is an instance of it is self-explanatory.
As an alternative: most mutable collections mix in two traits that allow them to be modified in place, Growable and Shrinkable. Growable means elements can be added with the += operator, while Shrinkable means elements can be removed with the -= operator.
On a side note, there is a trick for using these operators on immutable collections too: the reference must be declared with var so that it can be reassigned. With mutable collections no reassignment is needed, because the operations are provided by the two traits mentioned, which is why mutable collections can be declared with val.
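To make the difference between the two uses of += concrete, here is a minimal sketch. The object and method names (PlusEqualsDemo, growImmutable, growMutable) are made up for illustration; only the collection operations themselves come from the standard library.

```scala
import scala.collection.mutable

object PlusEqualsDemo {
  // Immutable set held in a var: the compiler rewrites `frozen += 4`
  // to `frozen = frozen + 4`, so a new set is built and the var is re-pointed.
  def growImmutable(): (Set[Int], Set[Int]) = {
    var frozen = Set(1, 2, 3)
    val before = frozen          // keeps a reference to the original set
    frozen += 4
    (before, frozen)             // `before` still has three elements
  }

  // Mutable set held in a val: `+=` is an ordinary method (from Growable)
  // that updates the same object in place, so no reassignment is needed.
  def growMutable(): (mutable.Set[Int], Boolean) = {
    val live = mutable.Set(1, 2, 3)
    val alias = live             // second reference to the same object
    live += 4
    (live, alias eq live)        // still the very same instance
  }
}
```

In the first method the original set is untouched and only the variable moves; in the second the object itself changes while the reference stays put.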
Checking if your collection is an instance of either one of these 2 traits means it is mutable:
import scala.collection.mutable

val myMap: mutable.Map[String, Int] = mutable.Map(
  "Apples" -> 4,
  "Pineapples" -> 1,
  "Oranges" -> 10,
  "Grapes" -> 7
)
val mySet: mutable.Set[Int] = mutable.Set(1, 2, 3)
val myMap2: Map[String, Int] = Map(
  "Apples" -> 4,
  "Pineapples" -> 1,
  "Oranges" -> 10,
  "Grapes" -> 7
)
val mySet2: Set[Int] = Set(1, 2, 3)
println(myMap.isInstanceOf[mutable.Growable[_]]) // true
println(myMap2.isInstanceOf[mutable.Shrinkable[_]]) // false
println(mySet.isInstanceOf[mutable.Shrinkable[_]]) // true
println(mySet2.isInstanceOf[mutable.Shrinkable[_]]) // false
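The question mentions defining the check as an implicit. A possible sketch of that, built on the mutable.Iterable test suggested above, could look like the following. MutabilitySyntax and MutabilityOps are hypothetical names, and the check is only as reliable as the assumption that every mutable collection you care about extends mutable.Iterable.

```scala
import scala.collection.mutable

object MutabilitySyntax {
  // Extension method: adds `isMutable` to anything that is an Iterable.
  implicit class MutabilityOps[A](private val coll: scala.collection.Iterable[A]) extends AnyVal {
    // Runtime check, so it also works when the static type is the
    // general scala.collection.Iterable / Map / Seq.
    def isMutable: Boolean = coll match {
      case _: mutable.Iterable[_] => true
      case _                      => false
    }
  }
}
```

After `import MutabilitySyntax._`, a call such as `mutable.Map("a" -> 1).isMutable` goes through the extension method, while the same call on a default `Map` or a `List` takes the second branch.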

Related

Scala Map as parameters for spark ML models

I have developed a tool using PySpark. In that tool, the user provides a dict of model parameters, which is then passed to a spark.ml model such as LogisticRegression in the form of LogisticRegression(**params).
Since I am transferring to Scala now, I was wondering how this can be done in Spark using Scala? Coming from Python, my intuition is to pass a Scala Map such as:
val params = Map("regParam" -> 100)
val model = new LogisticRegression().set(params)
Obviously, it's not as trivial as that. It seems that in Scala we need to set every single parameter separately, like:
val model = new LogisticRegression()
.setRegParam(0.3)
I really want to avoid being forced to iterate over all user input parameters and set the appropriate parameters with tons of if clauses.
Any ideas how to solve this as elegantly as in Python?
According to the LogisticRegression API you need to set each param individually via setter:
Users can set and get the parameter values through setters and
getters, respectively.
An idea is to build your own mapping function to dynamically call the corresponding param setter using reflection.
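The reflection idea can be sketched without Spark on the classpath. Everything named here is an assumption for illustration: FakeEstimator is a hypothetical stand-in with fluent setters (it is not Spark's LogisticRegression), and applyParams assumes the convention that a parameter called regParam has a one-argument setter called setRegParam.

```scala
// Hypothetical stand-in for an estimator with fluent setters; not a Spark class.
class FakeEstimator {
  var regParam: Double = 0.0
  var maxIter: Int = 100
  def setRegParam(v: Double): this.type = { regParam = v; this }
  def setMaxIter(v: Int): this.type = { maxIter = v; this }
}

object ReflectiveSetters {
  // For each (name, value) pair, look up a public one-argument method
  // called "set" + capitalized name and invoke it via Java reflection.
  def applyParams[T <: AnyRef](target: T, params: Map[String, Any]): T = {
    params.foreach { case (name, value) =>
      val setterName = "set" + name.capitalize
      val setter = target.getClass.getMethods
        .find(m => m.getName == setterName && m.getParameterCount == 1)
        .getOrElse(throw new IllegalArgumentException(s"No setter named $setterName"))
      // Values are passed boxed; reflection unboxes them for primitive parameters.
      setter.invoke(target, value.asInstanceOf[AnyRef])
    }
    target
  }
}
```

A call such as `ReflectiveSetters.applyParams(new FakeEstimator, Map("regParam" -> 0.3, "maxIter" -> 10))` then replaces the chain of if clauses. The trade-off is that a wrong name or a value of the wrong type only fails at runtime.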
Scala is a statically typed language and, by design, doesn't have anything like Python's **params. As already considered, you can store the parameters in a Map[K, Any], but the static types of the values are then lost, and because of type erasure on the JVM their generic type arguments cannot be recovered at runtime either.
Shapeless provides some neat mixed-type features that can circumvent the problem. An alternative is to use Scala's TypeTag to preserve type information, as in the following example:
import scala.reflect.runtime.universe._
case class Params[K]( m: Map[(K, TypeTag[_]), Any] ) extends AnyVal {
  def add[V](k: K, v: V)(implicit vt: TypeTag[V]) = this.copy(
    m = this.m + ((k, vt) -> v)
  )
  def grab[V](k: K)(implicit vt: TypeTag[V]) = m((k, vt)).asInstanceOf[V]
}
val params = Params[String](Map.empty).
  add[Int]("a", 100).
  add[String]("b", "xyz").
  add[Double]("c", 5.0).
  add[List[Int]]("d", List(1, 2, 3))
// params: Params[String] = Params( Map(
// (a,TypeTag[Int]) -> 100, (b,TypeTag[String]) -> xyz, (c,TypeTag[Double]) -> 5.0,
// (d,TypeTag[scala.List[Int]]) -> List(1, 2, 3)
// ) )
params.grab[Int]("a")
// res1: Int = 100
params.grab[String]("b")
// res2: String = xyz
params.grab[Double]("c")
// res3: Double = 5.0
params.grab[List[Int]]("d")
// res4: List[Int] = List(1, 2, 3)

Why the val x = mutable.Map(...) is mutable in scala? [duplicate]

Considering the following examples
case 1:
scala> val x = 1
x: Int = 1
scala> x = 2
<console>:11: error: reassignment to val
x=2
^
case 2:
scala> val name = new scala.collection.mutable.HashMap[String, Int]
name: scala.collection.mutable.HashMap[String,Int] = Map()
scala> name("Hello") = 1
scala> name
res1: scala.collection.mutable.HashMap[String,Int] = Map(Hello -> 1)
I can understand case 1 because x is a val. In case 2, name is also a val, yet the map can be modified. How can this be explained?
mutable.HashMap is mutable by nature, whether you bind it with val or var.
What val controls is reassignment of the reference: val does not allow the name to be re-bound to another instance, whereas var does.
For example, mutating the data is allowed:
scala> val mutableMap = new scala.collection.mutable.HashMap[String, Int]
mutableMap: scala.collection.mutable.HashMap[String,Int] = Map()
scala> mutableMap += ("some name" -> 8888)
res3: mutableMap.type = Map(some name -> 8888)
but reassigning the reference is not allowed because of val:
scala> mutableMap = new scala.collection.mutable.HashMap[String, Int]
<console>:12: error: reassignment to val
mutableMap = new scala.collection.mutable.HashMap[String, Int]
^
If you want a map whose contents cannot change, use scala.collection.immutable.Map (the default Map).
scala> val immutableMap = scala.collection.immutable.Map("prayagupd" -> 1000)
immutableMap: scala.collection.immutable.Map[String,Int] = Map(prayagupd -> 1000)
In your second case, name is a reference pointing to a mutable.HashMap, and you are editing the HashMap with
name("Hello") = 1
You explicitly defined the HashMap as mutable, which is why you could edit it. But you won't be able to make the same name refer to another object, as in
name = new scala.collection.mutable.HashMap[String, Int]
But if you define it with var, as in
var name = new scala.collection.mutable.HashMap[String, Int]
you can point it at any other object.
scala> val name = new scala.collection.mutable.HashMap[String, Int]
Here the reference is immutable but the Map collection is mutable: you can add and remove elements because it is of type mutable.HashMap. If you want an immutable version, use the default Scala Map.
Quoting for more details: http://docs.scala-lang.org/overviews/collections/overview.html
Scala collections systematically distinguish between mutable and immutable collections. A mutable collection can be updated or extended in place. This means you can change, add, or remove elements of a collection as a side effect. Immutable collections, by contrast, never change. You have still operations that simulate additions, removals, or updates, but those operations will in each case return a new collection and leave the old collection unchanged.
In your example, the val name is an immutable reference to a mutable instance. The HashMap is mutable because you explicitly requested a mutable one. Because name is a val rather than a var, you cannot rebind it: it will always point to the same object, though that object's contents may change.
You could create a var reference to an immutable map to get the opposite effect.
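The two combinations can be put side by side in a short sketch. The names (ValVarDemo, fixedRef, movableRef, snapshot) are illustrative only.

```scala
import scala.collection.mutable

object ValVarDemo {
  // val + mutable map: the binding is fixed, the contents are not.
  val fixedRef: mutable.HashMap[String, Int] = mutable.HashMap.empty[String, Int]
  fixedRef("Hello") = 1                        // allowed: mutates the object in place
  // fixedRef = mutable.HashMap.empty[String, Int]   // would not compile: reassignment to val

  // var + immutable map: the contents are fixed, the binding is not.
  var movableRef: Map[String, Int] = Map("Hello" -> 1)
  val snapshot: Map[String, Int] = movableRef  // second reference to the original map
  movableRef = movableRef + ("World" -> 2)     // allowed: points the var at a new map
}
```

After this runs, snapshot still holds only the original entry, because the immutable map it refers to never changed; only movableRef was re-pointed.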

scala map in a method argument can not add key-value

In Scala, how do I pass a map to a method as a reference so that I can add key-value pairs to it? I tried this code, but it doesn't work.
var rtnMap = Map[Int, String]()
def getUserInputs(rtnMap: Map[Int, String]) {
rtnMap += (1-> "ss") //wrong here
}
I understand that, by default, a method argument is a val, like final in Java, which provides some safety. But it should at least allow us to insert a new entry. Do you have any ideas?
Welcome to functional programming
First of all, your use case is possible with a mutable map. You are using an immutable map because that is what Scala makes available by default: everything in scala.Predef, including its alias for the immutable Map, is in scope without an import.
The code below works as expected.
import scala.collection.mutable.Map
val gMap = Map[Int, String]()
def getUserInputs(lMap: Map[Int, String]) = {
  lMap += (1 -> "ss")
}
The call below changes the contents of gMap:
getUserInputs(gMap)
Here is the proof:
scala> import scala.collection.mutable.Map
import scala.collection.mutable.Map
scala>
| val gMap = Map[Int, String]()
gMap: scala.collection.mutable.Map[Int,String] = Map()
scala>
| def getUserInputs(lMap: Map[Int, String]) = {
| lMap += (1-> "ss")
| }
getUserInputs: (lMap: scala.collection.mutable.Map[Int,String])scala.collection.mutable.Map[Int,String]
scala> getUserInputs(gMap)
res2: scala.collection.mutable.Map[Int,String] = Map(1 -> ss)
scala> gMap
res3: scala.collection.mutable.Map[Int,String] = Map(1 -> ss)
In the last REPL output, notice the contents of gMap: it contains the added item.
General code improvements
Do not use mutable collections unless you have a strong reason to.
With immutable collections, an operation that would change the data structure returns a new instance instead, so the existing data structure does not change. This is good in many ways: it makes programs easier to reason about and preserves referential transparency (worth reading about).
So your program should ideally look like this:
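// Assumes the default immutable Map (no mutable Map import in scope)
val gMap = Map.empty[Int, String] // same as Map[Int, String]()
def getUserInputs(lMap: Map[Int, String]) = {
  lMap + (1 -> "ss") // returns a new map; lMap itself is unchanged
}
val newMap = getUserInputs(gMap)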
The contents are added to newMap, and the old map stays intact. This is very useful because code holding on to gMap need not worry about changes happening to the underlying structure (for example, in multi-threaded scenarios).
Keeping the original structure intact and creating a new instance for the changed state is the general way of dealing with state in functional programming, so it's important to understand and practice it.
Deprecated procedure syntax
You declared your function like this:
def getUserInputs(lMap: Map[Int, String]) { // no = here
  lMap += (1 -> "ss")
}
In the above definition there is no = after the closing parenthesis. This "procedure syntax" is deprecated as of Scala 2.13 and has been dropped in Scala 3. Don't use it: it silently gives the method the result type Unit, which can lead to misleading compilation errors.
The correct way is this:
def getUserInputs(lMap: Map[Int, String]) = {
  lMap += (1 -> "ss")
}
Notice the = in this syntax.
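To see why the missing = matters, the sketch below spells out what the procedure form amounts to, namely an explicit Unit result type, next to the version with an inferred result. The object and method names are hypothetical, and an immutable Map is used so that the discarded result is easy to spot.

```scala
object ReturnTypes {
  // What `def f(...) { ... }` means: the result type is Unit,
  // so the new map built by `+` is computed and then thrown away.
  def addEntryUnit(m: Map[Int, String]): Unit = {
    m + (1 -> "ss")
  }

  // With `=`, the last expression becomes the result of the method.
  def addEntry(m: Map[Int, String]): Map[Int, String] = {
    m + (1 -> "ss")
  }
}
```

A caller of addEntryUnit gets nothing back to work with, whereas addEntry hands back the extended map.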
To pass a reference to a map and change that map, you need to use a mutable Map implementation (the default in Scala is an immutable map, regardless of whether it is declared as a val or a var). So an alternative to @Esardes's answer (one that also works if rtnMap is not in scope where getUserInputs is defined) would be:
import scala.collection.mutable
def getUserInputs(map: mutable.Map[Int, String]): Unit = {
  map += (1 -> "ss") // mutating "map"
}
val rtnMap = mutable.Map[Int, String]() // notice this can be a val
getUserInputs(rtnMap)
println(rtnMap)
// Map(1 -> ss) (shown as HashMap(1 -> ss) on Scala 2.13+)
You can either wrap your value in an object that you then pass as a parameter or, if rtnMap is in the scope of your function, write directly:
def getUserInputs = {
  rtnMap += (1 -> "ss")
}
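The first option, wrapping the value in an object, is not shown above. One way it might look is sketched here; UserInputs and HolderDemo are hypothetical names, and the holder keeps an immutable Map in a var field.

```scala
// Hypothetical holder class; the name and field are illustrative only.
final class UserInputs(var entries: Map[Int, String] = Map.empty)

object HolderDemo {
  // The holder is passed by reference, so reassigning its field
  // is visible to the caller even though the Map itself is immutable.
  def getUserInputs(inputs: UserInputs): Unit = {
    inputs.entries = inputs.entries + (1 -> "ss")
  }
}
```

The caller creates one UserInputs instance, passes it in, and reads entries afterwards; each update replaces the map stored in the holder rather than modifying a map in place.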

Scala combining maps while handling collision with custom type with Scalaz semigroup

case class Special(foo: Bar) {
}
case class SpecialMap(bar: Map[SpecialKey, Special]) {
  // I want to be able to do
  def combineValuesMap(that: SpecialMap) : SpecialMap = {
    this.bar |+| that.bar
  }
}
I have tried overloading + in Special, but that did not enable me to do this. How can I create a Special type that lets me use the semigroup for this?
Reference for |+| : Best way to merge two maps and sum the values of same key?
You'll need to provide a semigroup instance for your Special type. The "best" place for instances like this is the companion object for the type, since instances defined there will automatically be available:
case class Special(foo: String)
object Special {
  import scalaz.Semigroup
  implicit val semigroupSpecial: Semigroup[Special] = Semigroup.instance(
    (specialA, specialB) => Special(s"(${specialA.foo}-${specialB.foo})")
  )
}
(Note that I'm simplifying a bit, but the idea will be the same in your case.)
And then:
import scalaz.std.map._, scalaz.syntax.semigroup._
val m1 = Map("a" -> Special("a"), "b" -> Special("b"))
val m2 = Map("c" -> Special("c"), "b" -> Special("another b"))
val m3 = m1 |+| m2
This will use the combination function you've provided to combine values in the case of a collision:
scala> m3.foreach(println)
(c,Special(c))
(b,Special((b-another b)))
(a,Special(a))
Which is what you want.
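For readers without Scalaz at hand, the mechanism can be approximated in plain Scala. This is only a sketch of the idea, not Scalaz's implementation: SimpleSemigroup and MapMerge are hypothetical names, with SimpleSemigroup playing the role of scalaz.Semigroup and merge playing the role of |+| on maps.

```scala
// Hypothetical minimal type class standing in for scalaz.Semigroup.
trait SimpleSemigroup[A] {
  def combine(x: A, y: A): A
}

final case class Special(foo: String)

object Special {
  // Instance in the companion object, so it is found without an import.
  implicit val specialSemigroup: SimpleSemigroup[Special] = new SimpleSemigroup[Special] {
    def combine(x: Special, y: Special): Special = Special(s"(${x.foo}-${y.foo})")
  }
}

object MapMerge {
  // Keeps every key from both maps; on a collision the two values
  // are combined with the semigroup instance for V.
  def merge[K, V](left: Map[K, V], right: Map[K, V])(implicit sg: SimpleSemigroup[V]): Map[K, V] =
    right.foldLeft(left) { case (acc, (k, v)) =>
      val merged = acc.get(k).map(existing => sg.combine(existing, v)).getOrElse(v)
      acc.updated(k, merged)
    }
}
```

Calling `MapMerge.merge(m1, m2)` with the two maps from the answer combines the values under the shared key "b" and copies the other entries unchanged.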

Scala - Multiple ways of initializing containers

I am new to Scala and was wondering what is the difference between initializing a Map data structure using the following three ways:
private val currentFiles: HashMap[String, Long] = new HashMap[String, Long]()
private val currentJars = new HashMap[String, Long]
private val currentVars = Map[String, Long]()
There are two different parts to your question.
First, the difference between using an explicit type or not (cases 1 and 2) applies to any class, not just containers.
val x = 1
Here the type is not explicit, and the compiler will try to figure it out using type inference. The type of x will be Int.
val x: Int = 1
Same as above, but now explicit. If whatever is on the right of = doesn't conform to Int, you will get a compiler error.
val x: Any = 1
Here we still store a 1, but the static type of the variable is a supertype (polymorphism).
The second part of your question is about initialization. The basic form is the same as in Java:
val x = new scala.collection.mutable.ListBuffer[Int]()
This calls the class constructor and returns a new instance of the exact class.
Now, there is a special method called apply that you can define and then call with just parentheses, like this:
val x = Seq[Int]()
This is a shortcut for this:
val x = Seq.apply[Int]()
Notice this is a method on the Seq object. Its return type is whatever the method declares; it is just another method. It is mostly used to return a new instance of the given type, but there are no guarantees, so check the documentation to be sure of the contract.
That said, in the case of val x = Map[String, Long]() the implementation returns an instance of one of the immutable Map implementations; for an empty map this is a shared empty-map object rather than an immutable.HashMap.
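The apply mechanism described above is easy to reproduce with a class of your own. The sketch below uses a hypothetical Temperature class; the only point is that the three construction forms end up creating the same kind of object.

```scala
// Hypothetical example class; the name is illustrative only.
class Temperature(val celsius: Double)

object Temperature {
  // Factory method on the companion object; the compiler rewrites
  // `Temperature(21.5)` to `Temperature.apply(21.5)`.
  def apply(celsius: Double): Temperature = new Temperature(celsius)
}

object ApplyDemo {
  val viaNew: Temperature   = new Temperature(21.5)   // calls the constructor directly
  val viaSugar: Temperature = Temperature(21.5)       // sugar for the next line
  val viaApply: Temperature = Temperature.apply(21.5) // explicit call to apply
}
```

Because apply is an ordinary method, it is free to return a subclass or a cached instance instead, which is exactly why the runtime class behind Map(...) can vary.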
Map and HashMap are almost equivalent, but not exactly the same thing.
Map is a trait, and HashMap is a class, although under the hood they may be the same thing (scala.collection.immutable.HashMap) (more on that later).
When using
private val currentVars = Map[String, Long]()
You get a Map instance. In Scala, () is syntactic sugar: under the hood you are actually calling the apply() method of the Map object. This is equivalent to:
private val currentVars = Map.apply[String, Long]()
Using
private val currentJars = new HashMap[String, Long]()
You get a HashMap instance.
In the third statement:
private val currentJars: HashMap[String, Long] = new HashMap[String, Long]()
You are simply no longer relying on type inference. This is exactly the same as the second statement:
private val currentJars: HashMap[String, Long] = new HashMap[String, Long]()
private val currentJars = new HashMap[String, Long]() // same thing
When / Which I use / Why
As for type inference, I would recommend relying on it. IMHO in this case it removes verbosity from the code where it is not really needed. But if you really miss Java-like code, then include the type :).
Now, about the two constructors...
Map vs HashMap
Short answer
You should probably always go with Map(): it is shorter, already imported and returns a trait (like a java interface). This last reason is nice because when passing this Map around you won't rely on implementation details since Map is just an interface of what you want or need.
On the other side, HashMap is an implementation.
Long answer
Map is not always a HashMap.
As seen in Programming in Scala, Map.apply[K, V]() can return a different class depending on how many key-value pairs you pass to it:
Number of elements    Implementation
0                     scala.collection.immutable.EmptyMap
1                     scala.collection.immutable.Map1
2                     scala.collection.immutable.Map2
3                     scala.collection.immutable.Map3
4                     scala.collection.immutable.Map4
5 or more             scala.collection.immutable.HashMap
When you have fewer than 5 elements you get a special class for each of these small collections, and when you have an empty Map you get a singleton object.
This is done mostly to get better performance.
You can try it out in repl:
import scala.collection.immutable.HashMap
val m2 = Map(1 -> 1, 2 -> 2)
m2.isInstanceOf[HashMap[Int, Int]]
// false
val m5 = Map(1 -> 1, 2 -> 2, 3 -> 3, 4 -> 4, 5 -> 5, 6 -> 6)
m5.isInstanceOf[HashMap[Int, Int]]
// true
If you are really curious you can even take a look at the source code.
So, even for performance reasons, you should probably stick with Map().
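If you want to see which implementation your own Scala version picks for each size, a small helper like the following prints the runtime class name. MapImplementations and implementationFor are hypothetical names, and the exact class names printed differ between Scala versions, so no expected output is listed.

```scala
object MapImplementations {
  // Builds an immutable Map with n entries through Map.apply
  // and reports the name of the runtime class that was chosen.
  def implementationFor(n: Int): String = {
    val entries = (1 to n).map(i => i -> i)
    val m: Map[Int, Int] = Map(entries: _*)
    m.getClass.getName
  }

  // Prints one line per size from 0 to 6.
  def report(): Unit =
    (0 to 6).foreach(n => println(s"$n -> ${implementationFor(n)}"))
}
```

Running `MapImplementations.report()` in a REPL shows where the switch from the small specialised classes to the hash-based implementation happens.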