How to convert Map[String,Seq[String]] to Map[String,String] - scala

I have a Map[String,Seq[String]] and want to basically covert it to a Map[String,String] since I know the sequence will only have one value.

Someone else already mentioned mapValues, but if I were you I would do it like this:
scala> val m = Map(1 -> Seq(1), 2 -> Seq(2))
m: scala.collection.immutable.Map[Int,Seq[Int]] = Map(1 -> List(1), 2 -> List(2))
scala> m.map { case (k,Seq(v)) => (k,v) }
res0: scala.collection.immutable.Map[Int,Int] = Map(1 -> 1, 2 -> 2)
Two reasons:
The mapValues method produces a view of the result Map, meaning that the function will be recomputed every time you access an element. Unless you plan on accessing each element exactly once, or you only plan on accessing a very small percentage of them, you don't want that recomputation to take place.
Using a case with (k,Seq(v)) ensures that an exception will be thrown if the function ever sees a Seq that doesn't contain exactly one element. Using _(0) or _.head will throw an exception if there are zero elements, but will not complain if you had more than one, which will likely result in mysterious bugs later on when things go missing without errors.

You can use mapValues().
scala> Map("a" -> Seq("aaa"), "b" -> Seq("bbb"))
res0: scala.collection.immutable.Map[java.lang.String,Seq[java.lang.String]] = M
ap(a -> List(aaa), b -> List(bbb))
scala> res0.mapValues(_(0))
res1: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(a
-> aaa, b -> bbb)

I think I got it by doing the following:
mymap.flatMap(x => Map(x._1 -> x._2.head))

Yet another suggestion:
m mapValues { _.mkString }
This one's agnostic to whether the Seq has multiple elements -- it'll just concatenate all the strings together. If you're concerned about the recomputation of each value, you can make it happen up-front:
(m mapValues { _.mkString }).view.force

Related

Understanding foldLeft with Map instead of List

I fould like to understand how foldLeft works for Maps. I do understand how it works if I have a List and call on it foldLeft with a zero-element and a function:
val list1 = List(1,2,3)
list1.foldLeft(0)((a,b) => a + b)
Where I add the zero-element 0 with the first element of list1 and then add the second element of list1 and so on. So output becomes the new input and the first input is the zero-element.
Now I got the code
val map1 = Map(1 -> 2.0, 3 -> 4.0, 5 -> 6.2) withDefaultValue 0.0
val map2 = Map(0 -> 3.0, 3 -> 7.0) withDefaultValue 0.0
def myfct(terms: Map[Int, Double], term: (Int, Double)): Map[Int, Double] = ???
map1.foldLeft(map2)(myfct)
So my first element here is a Tuple2, but since map2 is a Map and not a Tuple2, what is the zero-element?
When we had a List, namely list1, we always "took the next element in list1". What is the "next element in map1? Is it another pair of map1?
In this context, you can think of a Map as a list of tuples. You can create a list like this: List(1 -> 2.0, 3 -> 4.0, 5 -> 6.2), and call foldLeft on it (that is more or less exactly what Map.foldLeft does). If you understand how foldLeft works with lists, then now you know how it works with Maps too :)
To answer your specific questions:
The first parameter of foldLeft can be of any type. You could pass in a Map instead of an Int in your first example too. It does not have to be of the same type as elements of the collection you are processing (although, it could be), as you have in your first example, nor does it need to be the same type as the collection itself, as you have it in the last example.
Consider this for the sake of example:
List(1,2,3,4,5,6).foldLeft(Map.empty[String,Int]) { case(map,elem) =>
map + (elem.toString -> elem)
}
This produces the same result as list.map { x => x.toString -> x }.toMap. As you can see, the first parameter here is a Map, that is neither List nor an Int.
The type you pass to foldLeft is also the type it returns, and the type that the function you pass in returns. It is not "element zero".
foldLeft will pass that parameter to your reducer function, along with the first element of the list. Your function will combine the two elements, and produce a new value of the same type as the first param. That value is passed in again, again with the second element ... etc.
Perhaps, inspecting the signature of foldLeft would be helpful:
foldLeft[B](z: B)(op: (B, A) ⇒ B): B
Here A is the type of your collection elements, and B can be anything, the only requirement is that the four places where it appears have the same type.
Here is another example, that is (almost) equivalent to list.mkString(","):
List(1,2,3,4,5,6).foldLeft("") {
case("", i) => i.toString
case(s,i) => s + "," + i
}
As I explained in the beginning, a map in this context is a kind of a list (a sequence rather). Just like "we always took the next element of the list" when we dealt with lists, we will take the "next element of the map" in this case. You said it yourself, the elements of the map are tuples, so that's what the type of the next element will be:
Map("one" -> 1, "two" -> 2, "three" -> 3)
.foldLeft("") {
case("", (key,value)) => key + "->" + value
case(s, (key,value)) => s + ", " + key + "->" + value
}
As mentioned above, the signature of the foldLeft operation is as shown:
foldLeft[B](z: B)(op: (B, A) ⇒ B): B
The resulting type of the operation is B. So, this answers your first question:
So my first element here is a Tuple2, but since map2 is a Map and not
a Tuple2, what is the zero-element?
The zero-element is whatever you want to output from the foldLeft. It could be an Int or a Map or anything else for that matter.
op in the function signature (i.e the second argument), is how you would like to operate on that for each element of map1, which is a (key,value) pair, or another way of writing it is as key -> value.
Let's try and understand it by building it up with more simpler operations.
val map1 = Map(1 -> 1.0, 4 -> 4.0, 5 -> 5.0) withDefaultValue 0.0
val map2 = Map(0 -> 0.0, 3 -> 3.0) withDefaultValue 0.0
// Here we're just making the output an Int
// So we just add the first element of the key-value pair.
def myOpInt(z: Int, term: (Int, Double)): Int = {
z + term._1
}
// adds all the first elements of the key-value pairs of map1 together
map1.foldLeft(0)(myOpInt) // ... output is 10
// same as above, but for map2 ...
map2.foldLeft(0)(myOpInt) // ... output is 3
Now, taking it to the next step by using a zero element (z) as an existing map...we would essentially add elements to that map we are using as z.
val map1 = Map(1 -> 1.0, 4 -> 4.0, 5 -> 5.0) withDefaultValue 0.0
val map2 = Map(0 -> 0.0, 3 -> 3.0) withDefaultValue 0.0
// Here z is a Map of Int -> Double
// We expect a pair of (Int, Double) as the terms to fold left with z
def myfct(z: Map[Int, Double], term: (Int, Double)): Map[Int, Double] = {
z + term
}
map1.foldLeft(map2)(myfct)
// output is Map(0 -> 0.0, 5 -> 5.0, 1 -> 1.0, 3 -> 3.0, 4 -> 4.0)
// i.e. same elements of both maps concatenated together, albeit out of order, since Maps don't guarantee order ...
When we had a List, namely list1, we always "took the next element in
list1". What is the "next element in map1? Is it another pair of map1?
Yes, it's another key-value pair (k,v) in map1.

Checking sameness/equality in Scala

As I asked in other post (Unique id for Scala object), it doesn't seem like that I can have id just like Python.
I still need to check the sameness in Scala for unittest. I run a test and compare the returned value of some nested collection object (i.e., List[Map[Int, ...]]) with the one that I create.
However, the hashCode for mutable map is the same as that of immutable map. As a result (x == y) returns True.
scala> val x = Map("a" -> 10, "b" -> 20)
x: scala.collection.immutable.Map[String,Int] = Map(a -> 10, b -> 20)
scala> x.hashCode
res0: Int = -1001662700
scala> val y = collection.mutable.Map("b" -> 20, "a" -> 10)
y: scala.collection.mutable.Map[String,Int] = Map(b -> 20, a -> 10)
scala> y.hashCode
res2: Int = -1001662700
In some cases, it's OK, but in other cases, I may need to make it failed test. So, here comes my question.
Q1: What is the normally used method for comparing two values (including very complicated data types) are the same? I may compare the toString() results, but I don't think this is a good idea.
Q2: Is it a general rule that mutable data structure has the same hashCode with immutable counterpart?
You are looking for AnyRef.eq which does reference equality (which is as close as you can get to Python's id function and is identical if you just want to compare references and you don't care about the actual ID):
scala> x == y
true
scala> x eq y
false

How to find the number of (key , value) pairs in a map in scala?

I need to find the number of (key , value) pairs in a Map in my Scala code. I can iterate through the map and get an answer but I wanted to know if there is any direct function for this purpose or not.
you can use .size
scala> val m=Map("a"->1,"b"->2,"c"->3)
m: scala.collection.immutable.Map[String,Int] = Map(a -> 1, b -> 2, c -> 3)
scala> m.size
res3: Int = 3
Use Map#size:
The size of this traversable or iterator.
The size method is from TraversableOnce so, barring infinite sequences or sequences that shouldn't be iterated again, it can be used over a wide range - List, Map, Set, etc.

When does Scala actually copy objects?

Background
I have a chunk of code that looks like this:
val big_obj = new BigObj
big_obj.recs[5].foo()
... // other code
big_obj.recs[7].bar()
Problem
I want to do something like this
val big_obj = new BigObj
alias ref = big_obj.recs // looking for something like an alias
ref[5].foo()
... // other code
ref[7].bar()
because I am afraid of making copies of big objects (coming from C++). But then I realised that Scala is probably smart and if I simply do this:
val big_obj = new BigObj
val ref = big_obj.recs // no copies made?
the compiler is probably smart enough to not copy anyways, since it's all read-only.
Question
This got me wondering about Scala's memory model.
Under what situations will copies be made/not made?
I am looking for a simple answer or rule-of-thumb which I can keep in my mind when I deal with really_big_objects, whenever I make assignments, including passing arguments.
Just like Java (and python and probably a lot of other languages), copies are never made of objects. When you assign an object or pass it as an argument, it only copies the reference to the object; the actual object just sits in memory and has an extra thing pointing to it. The only things that would get copied are primitives (integers, doubles, etc).
As you pointed out, this is obviously good for immutable objects, but it's true of all objects, even mutable ones:
scala> val a = collection.mutable.Map(1 -> 2)
a: scala.collection.mutable.Map[Int,Int] = Map(1 -> 2)
scala> val b = a
b: scala.collection.mutable.Map[Int,Int] = Map(1 -> 2)
scala> b += (2 -> 4)
res41: b.type = Map(2 -> 4, 1 -> 2)
scala> a
res42: scala.collection.mutable.Map[Int,Int] = Map(2 -> 4, 1 -> 2)
scala> def addTo(m: collection.mutable.Map[Int,Int]) { m += (3 -> 9) }
addTo: (m: scala.collection.mutable.Map[Int,Int])Unit
scala> addTo(b)
scala> a
res44: scala.collection.mutable.Map[Int,Int] = Map(2 -> 4, 1 -> 2, 3 -> 9)

Update values of Map

I have a Map like:
Map("product1" -> List(Product1ObjectTypes), "product2" -> List(Product2ObjectTypes))
where ProductObjectType has a field usage. Based on the other field (counter) I have to update all ProductXObjectTypes.
The issue is that this update depends on previous ProductObjectType, and I can't find a way to get previous item when iterating over mapValues of this map. So basically, to update current usage I need: CurrentProduct1ObjectType.counter - PreviousProduct1ObjectType.counter.
Is there any way to do this?
I started it like:
val reportsWithCalculatedUsage =
reportsRefined.flatten.flatten.toList.groupBy(_._2.product).mapValues(f)
but I don't know in mapValues how to access previous list item.
I'm not sure if I understand completely, but if you want to update the values inside the lists based on their predecessors, this can generally be done with a fold:
case class Thing(product: String, usage: Int, counter: Int)
val m = Map(
"product1" -> List(Thing("Fnord", 10, 3), Thing("Meep", 0, 5))
//... more mappings
)
//> Map(product1 -> List(Thing(Fnord,10,3), Thing(Meep,0,5)))
m mapValues { list => list.foldLeft(List[Thing]()){
case (Nil, head) =>
List(head)
case (tail, head) =>
val previous = tail.head
val current = head copy (usage = head.usage + head.counter - previous.counter)
current :: tail
} reverse }
//> Map(product1 -> List(Thing(Fnord,10,3), Thing(Meep,2,5)))
Note that regular map is an unordered collection, you need to use something like TreeMap to have predictable order of iteration.
Anyways, from what I understand you want to get pairs of all values in a map. Try something like this:
scala> val map = Map(1 -> 2, 2 -> 3, 3 -> 4)
scala> (map, map.tail).zipped.foreach((t1, t2) => println(t1 + " " + t2))
(1,2) (2,3)
(2,3) (3,4)