Scala collections, single key multiple values - scala

I have a list of parent keys, each of which could possibly have zero or more associated values. I am not sure which collection to use.
I am using Map[Int,List[String]]
I am declaring the Map as
var nodes = new HashMap[Int, List[String]]
Then I have two methods to handle adding new elements. The first is to add new keys addNode and the second is to add new values addValue. Initially, the key will not have any values associated with it. Later on, during execution, new values will be associated.
def addNode(key: Int) = nodes += (key -> "")
def addValue(key: Int, value: String) = ???
I am not sure how to implement addValue.
Update:
In response to @oxbow_lakes' answer, this is the error I am receiving. Please note that keys need not have values associated with them.
scala> var nodes = Map.empty[Int, List[String]]
nodes: scala.collection.immutable.Map[Int,List[String]] = Map()
scala> nodes += (1->null)
scala> nodes += (1 -> ("one" :: (nodes get 1 getOrElse Nil)))
java.lang.NullPointerException
at .<init>(<console>:9)
at .<clinit>(<console>)
at .<init>(<console>:11)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:704)
at scala.tools.nsc.interpreter.IMain$Request$$anonfun$14.apply(IMain.scala:920)
at scala.tools.nsc.interpreter.Line$$anonfun$1.apply$mcV$sp(Line.scala:43)
at scala.tools.nsc.io.package$$anon$2.run(package.scala:25)
at java.lang.Thread.run(Thread.java:680)
Update 2:
The problem with the code above is the line nodes += (1->null): the key should be associated with Nil instead. Below is the working code.
scala> var nodes = Map.empty[Int, List[String]]
nodes: scala.collection.immutable.Map[Int,List[String]] = Map()
scala> nodes += (1->Nil)
scala> nodes += (1 -> ("one" :: (nodes get 1 getOrElse Nil)))
scala> nodes
res27: scala.collection.immutable.Map[Int,List[String]] = Map(1 -> List(one))

Using MultiMap
You possibly want to use MultiMap, which is a mutable collection isomorphic to Map[K, Set[V]]. Use as follows:
import collection.mutable
val mm = new mutable.HashMap[Int, mutable.Set[String]] with mutable.MultiMap[Int, String]
Then you add your nodes:
mm addBinding (key, value)
Without MultiMap
The alternative is to stick with immutable values. Assuming you want to avoid using lenses (see scalaz), you can add nodes as follows:
nodes += (key -> (value :: (nodes get key getOrElse Nil)))
Here it is working (in response to your comment):
scala> var nodes = Map.empty[Int, List[String]]
nodes: scala.collection.immutable.Map[Int,List[String]] = Map()
scala> def addNode(key: Int, value: String) =
| nodes += (key -> (value :: (nodes get key getOrElse Nil)))
addNode: (key: Int, value: String)Unit
scala> addNode(1, "Hi")
scala> addNode(1, "Bye")
scala> nodes
res2: scala.collection.immutable.Map[Int,List[String]] = Map(1 -> List(Bye, Hi))
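On Scala 2.13 or later, the same "look up, then prepend" step can also be expressed with updatedWith, which the immutable Map offers from that version on. The following is only a sketch: NodeStore is a hypothetical wrapper object that is not part of the question, and it assumes the map is kept in a var as above.

```scala
object NodeStore {
  // Immutable map held in a var; every update replaces the whole map.
  var nodes: Map[Int, List[String]] = Map.empty

  // Registers a key with no values; an existing list is left untouched.
  def addNode(key: Int): Unit =
    nodes = nodes.updatedWith(key)(existing => Some(existing.getOrElse(Nil)))

  // Prepends a value, creating the list if the key is not present yet.
  def addValue(key: Int, value: String): Unit =
    nodes = nodes.updatedWith(key)(existing => Some(value :: existing.getOrElse(Nil)))
}
```

Calling addNode on a key that already has values keeps those values, so it is safe to call it repeatedly.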
Using Scalaz
Using the scalaz library, you can realize that this is simply using the Empty pattern:
nodes += (key -> (value :: ~(nodes get key)))
Or you could take advantage of the fact that Map is a monoid:
nodes = nodes |+| Map(key -> List(value))

In addition to @oxbow_lakes' answer, here's an idea for how you could use an addMap method that correctly adds two maps together (i.e., combining lists for matching keys, adding new lists for new keys):
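import scala.language.implicitConversions

class EnhancedListMap(self: Map[Int, List[String]]) {
  // Flatten both maps into (key, value) pairs, then regroup them by key.
  def addMap(other: Map[Int, List[String]]): Map[Int, List[String]] =
    (ungroup() ++ enhanceListMap(other).ungroup())
      .groupBy(_._1)
      .map { case (k, pairs) => k -> pairs.map(_._2) } // strict, unlike mapValues on newer Scala versions

  def ungroup(): List[(Int, String)] =
    self.toList.flatMap { case (k, vs) => vs.map(k -> _) }
}

implicit def enhanceListMap(self: Map[Int, List[String]]): EnhancedListMap = new EnhancedListMap(self)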
And you'd use it like this:
val a = Map(1 -> List("a","b"), 2 -> List("c","d"))
val b = Map(2 -> List("e","f"), 3 -> List("g","h"))
a addMap b
//Map(3 -> List(g, h), 1 -> List(a, b), 2 -> List(c, d, e, f))
You can include addNode, addValue, and addValues the same way (to EnhancedListMap above):
def addNode(key: Int) =
  if (self contains key) self else self + (key -> Nil)

def addValue(key: Int, value: String) =
  self + (key -> (value :: (self get key getOrElse Nil)))

def addValues(key: Int, values: List[String]) =
  self + (key -> (values ::: (self get key getOrElse Nil)))
And then use them together:
var nodes = Map.empty[Int, List[String]]
// Map()
nodes = nodes.addNode(1)
// Map(1 -> List())
nodes = nodes.addValue(1,"a")
// Map(1 -> List(a))
nodes = nodes.addValue(2,"b")
// Map(1 -> List(a), 2 -> List(b))
nodes = nodes.addValues(2,List("c","d"))
// Map(1 -> List(a), 2 -> List(c, d, b))
nodes = nodes.addValues(3,List("e","f"))
// Map(1 -> List(a), 2 -> List(c, d, b), 3 -> List(e, f))
nodes = nodes.addMap(Map(3 -> List("g","h"), 4-> List("i","j")))
// Map(1 -> List(a), 2 -> List(c, d, b), 3 -> List(e, f, g, h), 4 -> List(i, j))
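If the implicit wrapper is more machinery than you need, the same merge can be sketched as a plain function that folds one map into the other. ListMapMerge and merge are illustrative names, not part of the answer above.

```scala
object ListMapMerge {
  // Folds every entry of `b` into `a`, appending to the list already stored
  // under a key, or starting from Nil when the key is new.
  def merge(a: Map[Int, List[String]], b: Map[Int, List[String]]): Map[Int, List[String]] =
    b.foldLeft(a) { case (acc, (k, vs)) =>
      acc.updated(k, acc.getOrElse(k, Nil) ++ vs)
    }
}
```

With the maps a and b from the earlier example, ListMapMerge.merge(a, b) yields the same contents as a addMap b, without flattening and regrouping every pair.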

I quite like the getOrElseUpdate method provided by mutable maps:
import scala.collection.mutable._
private val nodes = new HashMap[Int, Buffer[String]]
def addNode(key: Int): Unit =
nodes.getOrElseUpdate(key, new ArrayBuffer)
def addValue(key: Int, value: String): Unit =
nodes.getOrElseUpdate(key, new ArrayBuffer) += value
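To see how this behaves in use, here is a small self-contained sketch of the same idea. The MutableNodes object and its snapshot helper are hypothetical additions for illustration; only getOrElseUpdate comes from the answer above.

```scala
import scala.collection.mutable

object MutableNodes {
  private val nodes = mutable.HashMap.empty[Int, mutable.Buffer[String]]

  // Ensures the key exists; an existing buffer is kept as it is.
  def addNode(key: Int): Unit = {
    nodes.getOrElseUpdate(key, mutable.ArrayBuffer.empty[String])
    ()
  }

  // Appends to the key's buffer, creating the buffer on first use.
  def addValue(key: Int, value: String): Unit = {
    nodes.getOrElseUpdate(key, mutable.ArrayBuffer.empty[String]) += value
    ()
  }

  // Immutable copy of the current contents, convenient for inspection.
  def snapshot: Map[Int, List[String]] =
    nodes.map { case (k, vs) => k -> vs.toList }.toMap
}
```

Unlike the immutable List version, values here are appended, so they come back in insertion order.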

Related

How does foldLeft work with a Seq of tuples?

I'm new to Scala and struggling to understand the syntax. Please check the code below.
def myDef(entityMap: Seq[(DataName.Value, DataFrame)]): Seq[Map[Int, Info]] = {
  val depenInfo = Seq[Map[Int, Info]]()
  entityMap.foldLeft(depenInfo)((info, entity) => {
    val (dataName: DataName.Value, df: DataFrame) = entity
    info ++ df.createDepenInfo(dataName)
  })
}
What I understand so far: the parameter is a Seq of tuples of type (DataName.Value, DataFrame), and the return type of myDef is a Seq of Map.
The code then creates an empty Seq of Map and passes it to entityMap.foldLeft so that more values can be added to it.
I am stuck on the remaining part. Can anyone please help me understand what is happening, if possible with a very simple example similar to the above, including its output?
Thanks
Since there are many user-defined classes I don't know, I tried to mock your data types as follows:
import scala.collection.{immutable, Seq}

object Example {
  object DataName {
    type Value = Int
  }

  case class DataFrame(fakeData: String) {
    def createDepenInfo(value: DataName.Value): Seq[Map[Int, Info]] = Seq(Map(value -> fakeData))
  }

  type Info = String

  def myDef(entityMap: Seq[(DataName.Value, DataFrame)]): Seq[Map[Int, Info]] = {
    val depenInfo = Seq[Map[Int, Info]]()
    entityMap.foldLeft(depenInfo)((info: Seq[Map[Int, Info]], entity: (DataName.Value, DataFrame)) => {
      // Pattern matching on tuples:
      // here we extract (dataName: DataName.Value, df: DataFrame) from the tuple entity: (DataName.Value, DataFrame)
      // see: https://docs.scala-lang.org/tour/tuples.html
      val (dataName: DataName.Value, df: DataFrame) = entity
      // ++ is a method of Seq; it concatenates two Seqs into one
      // e.g. Seq(1,2,3) ++ Seq(4,5,6) = Seq(1,2,3,4,5,6)
      info ++ df.createDepenInfo(dataName)
    })
  }

  def main(args: Array[String]): Unit = {
    val data: immutable.Seq[(DataName.Value, DataFrame)] = (1 to 5).map(i => (i, DataFrame((i + 'a').toChar.toString)))
    // Vector((1,DataFrame(b)), (2,DataFrame(c)), (3,DataFrame(d)), (4,DataFrame(e)), (5,DataFrame(f)))
    println(data)
    val res = myDef(data)
    // List(Map(1 -> b), Map(2 -> c), Map(3 -> d), Map(4 -> e), Map(5 -> f))
    println(res)
  }
}
raw data: Vector((1,DataFrame(b)), (2,DataFrame(c)), (3,DataFrame(d)), (4,DataFrame(e)), (5,DataFrame(f)))
Let's call the value of info ++ df.createDepenInfo(dataName) the result of each step:
info = Seq(), entity = (1,DataFrame(b)), result=Seq(Map(1 -> b))
info = Seq(Map(1 -> b)), entity = (2,DataFrame(c)), result=Seq(Map(1 -> b), Map(2 -> c))
info = Seq(Map(1 -> b), Map(2 -> c)), entity = (3,DataFrame(d)), result=Seq(Map(1 -> b), Map(2 -> c), Map(3 -> d))
and so on...
You see, during each calculation, the value info is "saved" (starting from the initial value depenInfo), and the entity value is "read" from entityMap.
So the final result is List(Map(1 -> b), Map(2 -> c), Map(3 -> d), Map(4 -> e), Map(5 -> f))
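The step-by-step trace above can also be produced by the collection itself: scanLeft takes the same arguments as foldLeft but keeps every intermediate accumulator. This is a simplified sketch with assumed stand-ins (plain (Int, String) pairs instead of DataName.Value and DataFrame, and a hypothetical FoldTrace object), not the asker's real types.

```scala
object FoldTrace {
  // Stand-ins for the (DataName.Value, DataFrame) tuples of the question.
  val entries: Seq[(Int, String)] = Seq(1 -> "b", 2 -> "c", 3 -> "d")

  // Plays the role of depenInfo: the initial, empty accumulator.
  val initial: Seq[Map[Int, String]] = Seq.empty

  // One fold step: destructure the tuple and append a one-entry map.
  def step(info: Seq[Map[Int, String]], entity: (Int, String)): Seq[Map[Int, String]] = {
    val (name, data) = entity
    info ++ Seq(Map(name -> data))
  }

  // scanLeft returns the initial value followed by every intermediate accumulator.
  val steps: Seq[Seq[Map[Int, String]]] = entries.scanLeft(initial)(step)

  // foldLeft returns only the last of those accumulators.
  val result: Seq[Map[Int, String]] = entries.foldLeft(initial)(step)
}
```

Printing FoldTrace.steps line by line shows the accumulator growing by one map per entry, and its last element is what foldLeft returns.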
In your code, info is the accumulator, depenInfo is the initial value, and entity is a map item (i.e. a key-value tuple). Here's a simpler example where acc is the accumulator and kv is the key-value pair being "read" from the map.
Map(1->2).foldLeft(0)((acc, kv) => {
  val (k, v) = kv
  println(s"$k, $v, $acc")
  acc + k + v
})
// prints: 1, 2, 0
// result: 3
To read about the accumulator pattern: https://www.arothuis.nl/posts/accumulators-and-folds/
As for the ++, that is the operator/method in Seq (sequence) which concatenates this sequence to another sequence. Simple example with concatenating length 1 sequences together:
Seq(1) ++ Seq(2)
// Seq(1, 2)

updated method on ListMap

I'm using ListMap because I need to keep the insertion order in place. After initializing it seems it works. but when I call updated on it the order gets messed up. 1- Why is that? 2- Is there any other MapLike that doesn't have this problem, if not how should I update the map without problem?
scala> import scala.collection.immutable.ListMap
import scala.collection.immutable.ListMap
scala> val a = ListMap(0 -> "A", 1 -> "B", 2 ->"C")
a: scala.collection.immutable.ListMap[Int,String] = Map(0 -> A, 1 -> B, 2 -> C)
scala> a.foreach(println)
(0,A)
(1,B)
(2,C)
scala> val b = a.updated(1, "D")
b: scala.collection.immutable.ListMap[Int,String] = Map(0 -> A, 2 -> C, 1 -> D)
scala> b.foreach(println)
(0,A)
(2,C)
(1,D)
I could not find any existing immutable collection with the desired property, but one can be crafted manually.
import scala.collection.immutable.{IntMap, Map, MapLike}

class OrderedMap[K, +V] private[OrderedMap](backing: Map[K, V], val order: IntMap[K], coorder: Map[K, Int], extSize: Int)
  extends Map[K, V] with MapLike[K, V, OrderedMap[K, V]] {

  def +[B1 >: V](kv: (K, B1)): OrderedMap[K, B1] = {
    val (k, v) = kv
    if (backing contains k)
      new OrderedMap(backing + kv, order, coorder, extSize)
    else new OrderedMap(backing + kv, order + (extSize -> k), coorder + (k -> extSize), extSize + 1)
  }

  def get(key: K): Option[V] = backing.get(key)

  def iterator: Iterator[(K, V)] = for (key <- order.valuesIterator) yield (key, backing(key))

  def -(key: K): OrderedMap[K, V] = if (backing contains key) {
    val index = coorder(key)
    new OrderedMap(backing - key, order - index, coorder - key, extSize)
  } else this

  override def empty: OrderedMap[K, V] = OrderedMap.empty[K, V]
}

object OrderedMap {
  def empty[K, V] = new OrderedMap[K, V](Map.empty, IntMap.empty, Map.empty, 0)

  def apply[K, V](assocs: (K, V)*): OrderedMap[K, V] = assocs.foldLeft(empty[K, V])(_ + _)
}
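Here order is a map from insertion index to key that preserves insertion order (possibly with "holes" left by removals). The extra field coorder maps each key back to its index, which is needed to handle element removal efficiently. extSize is basically order.lastKey + 1, but keeping it as a field is more straightforward.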
Now you can verify that
val a = OrderedMap(0 -> "A", 1 -> "B", 2 -> "C")
a.foreach(println)
val b = a.updated(1, "D")
b.foreach(println)
prints
(0,A)
(1,B)
(2,C)
and
(0,A)
(1,D)
(2,C)
From the scala doc for updated
"This method allows one to create a new map with an additional mapping
from key to value."
Note it does not say "with a different value of an existing key". So when you updated with 1->D, that's a new/additional mapping. So it appears at the end of the list, preserving insertion order. The old mapping 1->B is no longer present in the map.
So it's not "messed up" and it's not a problem. It's doing what it's documented to do, the mappings are in insertion order.
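If a mutable collection is acceptable, scala.collection.mutable.LinkedHashMap keeps insertion order and overwrites the value of an existing key in place. A minimal sketch follows; OrderDemo is a hypothetical name, and this swaps the immutable requirement of the question for a mutable map.

```scala
import scala.collection.mutable

object OrderDemo {
  val m = mutable.LinkedHashMap(0 -> "A", 1 -> "B", 2 -> "C")

  m(1) = "D" // existing key: the value changes, the position does not
  m(3) = "E" // new key: appended at the end

  val entries: List[(Int, String)] = m.toList
  // entries == List((0,A), (1,D), (2,C), (3,E))
}
```

The trade-off is that every holder of the map sees the update, so this only fits where shared mutable state is not a concern.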

Updating immutable map as the side effect of getOrElse

Sometimes I use a Map as a memoization cache. With mutable maps, I use getOrElseUpdate:
mutableMap.getOrElseUpdate(key, {
val value = <compute the value>
value
})
Immutable maps don't have getOrElseUpdate. So I want to do this
immutableMap.getOrElse(key, {
val value = <compute the value>
immutableMap += key -> value
value
})
This seems to work in practice, I have good arguments to believe it works in theory, and it's more or less readable -- is it a terrible idea for some reason I'm missing?
The other alternatives I'm considering are
immutableMap.get(key) match {
case Some(value) => value
case None =>
val value = <compute the value>
immutableMap += key -> value
value
}
which is not much different and is more cumbersome, or
if (immutableMap.contains(key)) {
immutableMap(key)
} else {
val value = <compute the value>
immutableMap += key -> value
value
}
which is the dumbest and probably least idiomatic.
In principle I would rather not go for a solution that uses a helper to return the value and the updated map, unless it's unarguably the superior way.
Sure, it seems reasonable except for one small issue... it's not updating your collection! If you're using an immutable Map, then that Map is immutable. You can not change it, ever.
In fact, the immutable Map in the Scala collections does not even define a += method, see immutable.Map. All the methods which "append" or "add" new values to the Map actually return a new Map. So what you've written above only compiles if immutableMap is a var, in which case the compiler rewrites immutableMap += kv to immutableMap = immutableMap + kv; the map itself is never modified.
To do this with an immutable map, you'll need to work with a var and replace that var with the new Map (which can lead to issues with threading) or you have to adopt a State Monad type pattern in which you return not only the new value but also the new Map.
def getOrCalc[Key, Value](m: Map[Key, Value], k: Key)(f: Key => Value): (Map[Key, Value], Value) = {
  if (m.contains(k)) (m, m(k))
  else {
    val value = f(k)
    // + returns a new map that also contains the computed value
    (m + (k -> value), value)
  }
}
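To make the "return the new map together with the value" pattern concrete, here is a small sketch of how the cache is threaded from one call to the next. ImmutableMemo, square and the calls counter are hypothetical names used only to show that the computation runs once.

```scala
object ImmutableMemo {
  // Returns the (possibly extended) cache together with the requested value.
  def getOrCalc[K, V](m: Map[K, V], k: K)(f: K => V): (Map[K, V], V) =
    m.get(k) match {
      case Some(v) => (m, v)
      case None =>
        val v = f(k)
        (m.updated(k, v), v)
    }

  // Counts how often the "expensive" computation actually runs.
  var calls = 0
  def square(n: Int): Int = { calls += 1; n * n }

  val cache0 = Map.empty[Int, Int]
  val (cache1, first) = getOrCalc(cache0, 4)(square)  // computes and stores 16
  val (cache2, second) = getOrCalc(cache1, 4)(square) // served from cache1
}
```

The caller has to keep passing the latest map along (cache0, cache1, cache2); forgetting to do so silently loses the cached entry, which is the main cost of this style.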
My only recommendation (whatever your reasons for choosing a var over mutable.Map or Java's ConcurrentMap) is to wrap it in a DSL, like:
case class Mutable[K,V](var m: Map[K,V]) {
  def orElseUpdate(key: K, compute: => V) = m.getOrElse(key, {
    val value = compute
    m += key -> value
    value
  })
}
scala> val a = Mutable(Map(1 -> 2))
a: Mutable[Int,Int] = Mutable(Map(1 -> 2))
scala> a.orElseUpdate(2, 4)
res10: Int = 4
scala> a.orElseUpdate(2, 6)
res11: Int = 4
scala> a.orElseUpdate(3, 6)
res12: Int = 6
Another option (if your computation is lightweight) is just:
m += key -> m.getOrElse(key, compute)
m(key)
Example:
scala> var m = Map(1 -> 2)
m: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2)
scala> m += 3 -> m.getOrElse(3, 5)
scala> m
res1: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 3 -> 5)
scala> m += 3 -> m.getOrElse(3, 5)
scala> m
res3: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 3 -> 5)
scala> m += 3 -> m.getOrElse(3, 6)
scala> m
res5: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 3 -> 5)
You can wrap it into DSL as well:
implicit class RichMap[K,V](m: Map[K,V]) {
def kvOrElse(k: K, v: V) = k -> m.getOrElse(k, v)
}
scala> m += m.kvOrElse(3, 7)
scala> m
res7: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 3 -> 5)
scala> m += m.kvOrElse(4, 7)
scala> m
res9: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 3 -> 5, 4 -> 7)

how to create the following map of map in Scala

I am a Scala newbie: how can I create the following "map of map" in Scala:
"Outer" map
+--------------------------+------------------------------+
| Key                      | Value                        |
+--------------------------+------------------------------+
| (employeeID, currencyID) | (valueDate, set of CashFlow) |
+--------------------------+------------------------------+
The value in the "outer" map is also a map:
"Inner" map
+-----------+-----------------+
| Key       | Value           |
+-----------+-----------------+
| valueDate | set of CashFlow |
+-----------+-----------------+
with the following types:
employeeID: Int
currencyID: Int
valueDate: java.util.Date
set of CashFlow: Set[com.company.CashFlow]
The following won't compile (need to override +=, -=, etc):
myMap = new mutable.HashMap[(Int, Int), java.util.TreeMap[Date, mutable.Set[CashFlow]] with mutable.MultiMap[Date, CashFlow].withDefaultValue(new util.TreeMap[Date, mutable.Set[CashFlow]] with mutable.MultiMap[Date, CashFlow]
Requirements:
map must be mutable
sorted by valueDate
for one valueDate, I can have multiple cash flows
I would like to avoid checking for key existence on (employeeID, currencyID), i.e. myMap(emp1, ccy1).addBinding(date1, cashFlow1) shouldn't fail if the key doesn't exist.
Instead, it should automatically create a new empty sorted MultiMap and initialize it with (date1, cashFlow1)
You can create it without MultiMap:
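import scala.collection.mutable
import scala.collection.mutable._
import scala.collection.JavaConverters._
import java.util.{Date, TreeMap}
val myMap = new mutable.HashMap[(Int, Int), mutable.Map[Date, mutable.Set[CashFlow]]] {
  // Create and store a fresh sorted inner map the first time a key is accessed.
  override def apply(k: (Int, Int)) =
    getOrElseUpdate(k, new TreeMap[Date, mutable.Set[CashFlow]].asScala)
}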
myMap: scala.collection.mutable.Map[(Int, Int),scala.collection.mutable.Map[java.util.Date,scala.collection.mutable.Set[CashFlow]]] = Map()
The override of apply is needed because withDefaultValue would hand back the same single instance for every missing key, and would not store it in the map.
It's also possible to create MultiMap from java's TreeMap using standard wrappers:
import scala.collection.convert.Wrappers._
//This wrapper will hold your Java's TreeMap inside, and delegate all operations to it
scala> def newTreeMultiMap[K, V]: MultiMap[K, V] = new JMapWrapper(new TreeMap[K, Set[V]]) with MultiMap[K, V]
newTreeMultiMap: [K, V]=> scala.collection.mutable.MultiMap[K,V]
scala> val myMap = new HashMap[(Int, Int), MultiMap[Date, CashFlow]]{ override def apply(k: (Int, Int)) = getOrElseUpdate(k, newTreeMultiMap[Date, CashFlow]) }
myMap: scala.collection.mutable.Map[(Int, Int),scala.collection.mutable.MultiMap[java.util.Date,CashFlow]] = Map()
I think something like this should be in the standard library, but I didn't find anything.
Examples (I used Int instead of Date to show that ordering works):
scala> val myMap = new HashMap[(Int, Int), MultiMap[Int, String]]{ override def apply(k: (Int, Int)) = getOrElseUpdate(k, newTreeMultiMap[Int, String]) }
myMap: scala.collection.mutable.Map[(Int, Int),scala.collection.mutable.MultiMap[Int,String]] = Map()
scala> myMap(0 -> 0).addBinding(4, "aaa") //no exceptions on myMap(0 -> 0) with empty Map
res27: scala.collection.mutable.MultiMap[Int,String] = Map(4 -> Set(aaa))
scala> myMap(0 -> 0).addBinding(2, "aaa") //should be before 4
res28: scala.collection.mutable.MultiMap[Int,String] = Map(2 -> Set(aaa), 4 -> Set(aaa))
scala> myMap(0 -> 0).addBinding(5, "aaa") //should be after 4
res29: scala.collection.mutable.MultiMap[Int,String] = Map(2 -> Set(aaa), 4 -> Set(aaa), 5 -> Set(aaa))
scala> myMap(0 -> 0).addBinding(4, "bbb")
res30: scala.collection.mutable.MultiMap[Int,String] = Map(2 -> Set(aaa), 4 -> Set(aaa, bbb), 5 -> Set(aaa))
scala> myMap(0 -> 0).toList //finally use the order
res31: List[(Int, scala.collection.mutable.Set[String])] = List((2,Set(aaa)), (4,Set(aaa, bbb)), (5,Set(aaa)))
scala> myMap(0 -> 1).addBinding(2, "aaa")
res18: scala.collection.mutable.MultiMap[Int,String] = Map(2 -> Set(aaa))
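For comparison, here is a sketch that avoids both MultiMap and the Java TreeMap wrapper by nesting Scala's own mutable.TreeMap (available since Scala 2.12) and calling getOrElseUpdate at each level. CashFlowIndex, add and datesFor are hypothetical names, and Int and String stand in for Date and CashFlow, as in the examples above.

```scala
import scala.collection.mutable

object CashFlowIndex {
  type Key = (Int, Int) // (employeeID, currencyID)

  private val byKey =
    mutable.HashMap.empty[Key, mutable.TreeMap[Int, mutable.Set[String]]]

  // Creates the inner sorted map and the set on demand, then adds the flow.
  def add(key: Key, date: Int, flow: String): Unit = {
    val inner = byKey.getOrElseUpdate(key, mutable.TreeMap.empty[Int, mutable.Set[String]])
    inner.getOrElseUpdate(date, mutable.Set.empty[String]) += flow
    ()
  }

  // Dates for a key in ascending order; empty if the key was never used.
  def datesFor(key: Key): List[Int] =
    byKey.get(key).map(_.keys.toList).getOrElse(Nil)

  def flowsFor(key: Key, date: Int): Set[String] =
    byKey.get(key).flatMap(_.get(date)).map(_.toSet).getOrElse(Set.empty)
}
```

Reads go through get rather than apply, so looking up an unknown key does not create an empty inner map as a side effect.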

Get value with the lowest key value from Map[Int, String]

Say I have a map: Map[Int, String]. How would I get the value [String] with the lowest key [Int]? I've been trying to implement this functionally, but just can't figure out how to do this.
The following code will get you the value with the lowest key (ignoring some corner cases).
def lowestKeyMember[A](m: Map[Int,A]): A = m(m.keys.min)
This will break ties arbitrarily and throw on an empty map. If you need to do this operation frequently and/or on large maps, you should look into SortedMap.
Maps are not normally sorted. You could, however, use a SortedMap; the map is then sorted by key, so its head is the entry with the lowest key. All you need to do is retrieve the head and take its value.
map.head._2
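A short sketch of the SortedMap approach, using headOption so that an empty map gives None instead of throwing. SortedLookup and its field names are made up for this illustration.

```scala
import scala.collection.immutable.SortedMap

object SortedLookup {
  // Entries are kept ordered by key, regardless of the order they are given in.
  val m: SortedMap[Int, String] = SortedMap(3 -> "drei", 1 -> "eins", 2 -> "zwei")

  // The head of a SortedMap is the (key, value) pair with the lowest key.
  val lowest: Option[String] = m.headOption.map(_._2)

  // An empty map has no head, so the result is None.
  val none: Option[String] = SortedMap.empty[Int, String].headOption.map(_._2)
}
```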
Come on, people! "Functionally" is code word for "folding".
scala> val m = Map(1->"eins",2->"zwei",3->"drei")
m: scala.collection.immutable.Map[Int,String] = Map(1 -> eins, 2 -> zwei, 3 -> drei)
scala> m.foldLeft(Int.MaxValue -> "") { case (min,p) => if (min._1 <= p._1) min else p }
res0: (Int, String) = (1,eins)
But an 8-char operator?
Let's see, is that enough parens? Don't tell me -> is like - and /: is like /.
scala> (Int.MaxValue -> "" /: m) { case (min,p) => if (min._1 <= p._1) min else p }
<console>:9: error: missing arguments for method /: in trait TraversableOnce;
follow this method with `_' if you want to treat it as a partially applied function
(Int.MaxValue -> "" /: m) { case (min,p) => if (min._1 <= p._1) min else p }
^
Oh, well, OK.
scala> ((Int.MaxValue -> "") /: m) { case (min,p) => if (min._1 <= p._1) min else p }
res2: (Int, String) = (1,eins)
Or,
scala> import math.Ordering.Implicits._
import math.Ordering.Implicits._
scala> ((Int.MaxValue -> "") /: m) { case (min,p) if min <= p => min case (_, p) => p }
res5: (Int, String) = (1,eins)
A variant of the _.keys.min solution that works with Options (i.e. will not throw on an empty map):
scala> val a : Map[Int, String]=Map(1 -> "1", 2 -> "2")
a: Map[Int,String] = Map(1 -> 1, 2 -> 2)
scala> val b : Map[Int, String]=Map()
b: Map[Int,String] = Map()
scala> def valueForMinKey[K,V](a : Map[K,V])(implicit cmp : Ordering[K]) = a.keys.reduceOption(cmp.min(_, _)).map(a(_))
valueForMinKey: [K, V](a: Map[K,V])(implicit cmp: Ordering[K])Option[V]
scala> valueForMinKey(a)
res27: Option[String] = Some(1)
scala> valueForMinKey(b)
res28: Option[String] = None
In this example, the implicit parameter cmp will be satisfied by Ordering.Int. The example will work with any Map where the keys can be ordered (and a matching implicit can be found by the compiler).
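On Scala 2.13 or later, the same Option-returning lookup can be sketched more compactly with minByOption, which exists on collections from that version on. MinKey is a hypothetical wrapper object; the function name is reused from the answer above.

```scala
object MinKey {
  // Picks the entry with the smallest key, if any, and returns its value.
  def valueForMinKey[K: Ordering, V](m: Map[K, V]): Option[V] =
    m.minByOption(_._1).map(_._2)
}
```

This traverses the map once and never throws on an empty map; for repeated lookups on a large map, a SortedMap is still likely the better fit.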