groupBy on String in Scala

I am doing a groupBy on a string as follows:
"message".groupBy("message".count(_.toChar))
I was expecting it to yield a map as:
{1 => "mag" , 2 => "es"}
However, the above code doesn't even compile. Where am I going wrong? I want to produce a map keyed by how many times each character occurs in the string.

You can do:
("message".groupBy(identity).mapValues(_.size)
.groupBy(_._2).mapValues(_.foldLeft("")(_+_._1)))
// res8: scala.collection.immutable.Map[Int,String] = Map(2 -> es, 1 -> amg)
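The original call fails to compile because groupBy expects a function from each Char to a key, while "message".count(_.toChar) hands count a Char => Char where it needs a Char => Boolean predicate. A minimal sketch of the same two-step idea, written with map instead of mapValues so it does not depend on how mapValues behaves in a given Scala version (the names counts and byCount are only illustrative, and the sorting is added here to make the result deterministic):

```scala
// Step 1: count how often each character occurs.
val counts: Map[Char, Int] =
  "message".groupBy(identity).map { case (c, occurrences) => (c, occurrences.length) }

// Step 2: group the characters by their count and join them into a string.
// Sorting the characters makes the resulting strings independent of map iteration order.
val byCount: Map[Int, String] =
  counts.groupBy { case (_, n) => n }
    .map { case (n, group) => (n, group.keys.toSeq.sorted.mkString) }
```

With the characters sorted, the value for 1 comes out as "agm" rather than in the order the characters appear in the word.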

Related

Aggregate/Reduce by key function for a map in scala

I have a map as given below in scala.
Map("x"-> "abc", "y"->"adc","z"->"abc", "l"-> "ert","h"->"dfg", "p"-> "adc")
I want the output as follows:
Map("abc"->["x","z"],"adc"->["y" , "p"], "ert"->"l", "dfg"->"h")
So the output has a list as the value for those keys that shared the same value in the initial map. How can I get that done optimally?
A groupBy followed by some manipulation of the values it outputs should do.
scala> m.groupBy(x => x._2).mapValues(_.keys.toList)
res10: scala.collection.immutable.Map[String,List[String]]
= Map(abc -> List(x, z), dfg -> List(h), ert -> List(l), adc -> List(y, p))
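The answer above uses m without defining it. A self-contained sketch of the same groupBy approach, assuming m is the map from the question, might look like this (inverted is an illustrative name, and the keys are sorted here only so the lists have a predictable order):

```scala
val m = Map("x" -> "abc", "y" -> "adc", "z" -> "abc", "l" -> "ert", "h" -> "dfg", "p" -> "adc")

// Group the entries by value, then keep only the original keys of each group.
val inverted: Map[String, List[String]] =
  m.groupBy { case (_, v) => v }
    .map { case (v, group) => (v, group.keys.toList.sorted) }
```

Every value is a List here, including the single-element ones, which keeps the result type uniform.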

Scala - map function - Only returned last element of a Map

I am new to Scala and trying out the map function on a Map.
Here is my Map:
scala> val map1 = Map ("abc" -> 1, "efg" -> 2, "hij" -> 3)
map1: scala.collection.immutable.Map[String,Int] =
Map(abc -> 1, efg -> 2, hij -> 3)
Here is a map function and the result:
scala> val result1 = map1.map(kv => (kv._1.toUpperCase, kv._2))
result1: scala.collection.immutable.Map[String,Int] =
Map(ABC -> 1, EFG -> 2, HIJ -> 3)
Here is another map function and the result:
scala> val result1 = map1.map(kv => (kv._1.length, kv._2))
result1: scala.collection.immutable.Map[Int,Int] = Map(3 -> 3)
The first map function returns all the members as expected; however, the second map function returns only the last member of the Map. Can someone explain why this is happening?
Thanks in advance!
In Scala, a Map cannot have duplicate keys. When you add a new key -> value pair to a Map, if that key already exists, you overwrite the previous value. If you're creating maps from functional operations on collections, then you're going to end up with the value corresponding to the last instance of each unique key. In the example you wrote, each string key of the original map map1 has the same length, and so all your string keys produce the same integer key 3 for result1. What's happening under the hood to calculate result1 is:
A new, empty map is created
You map "abc" -> 1 to 3 -> 1 and add it to the map. Result now contains 3 -> 1.
You map "efg" -> 2 to 3 -> 2 and add it to the map. Since the key is the same, you overwrite the existing value for key = 3. Result now contains 3 -> 2.
You map "hij" -> 3 to 3 -> 3 and add it to the map. Since the key is the same, you overwrite the existing value for key = 3. Result now contains 3 -> 3.
Return the result, which is Map(3 -> 3).
Note: I made a simplifying assumption that the order of the elements in the map iterator is the same as the order you wrote in the declaration. The order is determined by hash bin and will probably not match the order you added elements, so don't build anything that relies on this assumption.
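The collision described in the steps above can be reproduced, and avoided, with a short sketch. The names collapsed, allPairs and grouped are illustrative only. Which value survives in collapsed depends on iteration order, so the sketch does not rely on it:

```scala
val map1 = Map("abc" -> 1, "efg" -> 2, "hij" -> 3)

// Mapping a Map to a Map: all three keys become 3, so only one entry survives.
val collapsed: Map[Int, Int] = map1.map { case (k, v) => (k.length, v) }

// Converting to a List first keeps every pair, duplicate keys included.
val allPairs: List[(Int, Int)] = map1.toList.map { case (k, v) => (k.length, v) }

// Grouping by the new key keeps all values, collected under each key.
val grouped: Map[Int, List[Int]] =
  map1.toList
    .groupBy { case (k, _) => k.length }
    .map { case (len, pairs) => (len, pairs.map(_._2)) }
```

If the duplicates matter, either of the last two forms preserves them; which one fits depends on whether you need key lookup afterwards.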

Update values of Map

I have a Map like:
Map("product1" -> List(Product1ObjectTypes), "product2" -> List(Product2ObjectTypes))
where ProductObjectType has a field usage. Based on the other field (counter) I have to update all ProductXObjectTypes.
The issue is that this update depends on previous ProductObjectType, and I can't find a way to get previous item when iterating over mapValues of this map. So basically, to update current usage I need: CurrentProduct1ObjectType.counter - PreviousProduct1ObjectType.counter.
Is there any way to do this?
I started it like:
val reportsWithCalculatedUsage =
reportsRefined.flatten.flatten.toList.groupBy(_._2.product).mapValues(f)
but I don't know in mapValues how to access previous list item.
I'm not sure if I understand completely, but if you want to update the values inside the lists based on their predecessors, this can generally be done with a fold:
case class Thing(product: String, usage: Int, counter: Int)
val m = Map(
"product1" -> List(Thing("Fnord", 10, 3), Thing("Meep", 0, 5))
//... more mappings
)
//> Map(product1 -> List(Thing(Fnord,10,3), Thing(Meep,0,5)))
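m mapValues { list => list.foldLeft(List[Thing]()) {
// First element: nothing precedes it, so keep it unchanged.
case (Nil, first) =>
List(first)
// acc holds the processed elements in reverse order, so acc.head is the predecessor.
case (acc, current) =>
val previous = acc.head
val updated = current copy (usage = current.usage + current.counter - previous.counter)
updated :: acc
}.reverse }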
//> Map(product1 -> List(Thing(Fnord,10,3), Thing(Meep,2,5)))
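As an alternative to the fold, the same predecessor-based update can be sketched by zipping the list with its own tail. This is only one possible formulation. It assumes the Thing case class from the answer above, and withUsage is a hypothetical helper name:

```scala
case class Thing(product: String, usage: Int, counter: Int)

// Pair every element with its predecessor and adjust usage by the counter difference.
// The first element has no predecessor and is kept unchanged.
def withUsage(list: List[Thing]): List[Thing] = list match {
  case Nil => Nil
  case first :: rest =>
    first :: list.zip(rest).map { case (previous, current) =>
      current.copy(usage = current.usage + current.counter - previous.counter)
    }
}

val updated = withUsage(List(Thing("Fnord", 10, 3), Thing("Meep", 0, 5)))
```

Because only usage is rewritten and the difference is taken over counter, reading the predecessor from the original list gives the same result as reading it from the already updated one.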
Note that a regular Map is an unordered collection; you need to use something like TreeMap to get a predictable order of iteration.
Anyways, from what I understand you want to get pairs of all values in a map. Try something like this:
scala> val map = Map(1 -> 2, 2 -> 3, 3 -> 4)
scala> (map, map.tail).zipped.foreach((t1, t2) => println(t1 + " " + t2))
(1,2) (2,3)
(2,3) (3,4)
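Combining both points, a sketch that first fixes the iteration order with a TreeMap and then pairs each entry with its successor could look like the following. The names ordered, entries and consecutive are illustrative:

```scala
import scala.collection.immutable.TreeMap

// TreeMap iterates in key order regardless of insertion order.
val ordered = TreeMap(3 -> 4, 1 -> 2, 2 -> 3)
val entries: List[(Int, Int)] = ordered.toList

// Pair each entry with the one that follows it; drop(1) is safe on an empty list.
val consecutive: List[((Int, Int), (Int, Int))] = entries.zip(entries.drop(1))
```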

How to convert Map[String,Seq[String]] to Map[String,String]

I have a Map[String,Seq[String]] and want to basically convert it to a Map[String,String] since I know the sequence will only have one value.
Someone else already mentioned mapValues, but if I were you I would do it like this:
scala> val m = Map(1 -> Seq(1), 2 -> Seq(2))
m: scala.collection.immutable.Map[Int,Seq[Int]] = Map(1 -> List(1), 2 -> List(2))
scala> m.map { case (k,Seq(v)) => (k,v) }
res0: scala.collection.immutable.Map[Int,Int] = Map(1 -> 1, 2 -> 2)
Two reasons:
The mapValues method produces a view of the result Map, meaning that the function will be recomputed every time you access an element. Unless you plan on accessing each element exactly once, or you only plan on accessing a very small percentage of them, you don't want that recomputation to take place.
Using a case with (k,Seq(v)) ensures that an exception will be thrown if the function ever sees a Seq that doesn't contain exactly one element. Using _(0) or _.head will throw an exception if there are zero elements, but will not complain if you had more than one, which will likely result in mysterious bugs later on when things go missing without errors.
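The second reason can be illustrated with a small sketch. The helper names strict and lenient and the sample maps are made up for this example:

```scala
import scala.util.Try

val ok  = Map("a" -> Seq("aaa"), "b" -> Seq("bbb"))
val bad = Map("a" -> Seq("aaa", "extra"))

// Matches only one-element sequences; anything else raises a MatchError.
def strict(m: Map[String, Seq[String]]): Map[String, String] =
  m.map { case (k, Seq(v)) => (k, v) }

// Takes the first element and silently ignores any others.
def lenient(m: Map[String, Seq[String]]): Map[String, String] =
  m.map { case (k, vs) => (k, vs.head) }

val strictOk   = strict(ok)        // both entries converted
val strictBad  = Try(strict(bad))  // a Failure wrapping a MatchError
val lenientBad = lenient(bad)      // "extra" is dropped without any error
```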
You can use mapValues().
scala> Map("a" -> Seq("aaa"), "b" -> Seq("bbb"))
res0: scala.collection.immutable.Map[java.lang.String,Seq[java.lang.String]] = Map(a -> List(aaa), b -> List(bbb))
scala> res0.mapValues(_(0))
res1: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(a -> aaa, b -> bbb)
I think I got it by doing the following:
mymap.flatMap(x => Map(x._1 -> x._2.head))
Yet another suggestion:
m mapValues { _.mkString }
This one's agnostic to whether the Seq has multiple elements -- it'll just concatenate all the strings together. If you're concerned about the recomputation of each value, you can make it happen up-front:
(m mapValues { _.mkString }).view.force

Scala: How do I use fold* with Map?

I have a Map[String, String] and want to concatenate the values to a single string.
I can see how to do this using a List...
scala> val l = List("te", "st", "ing", "123")
l: List[java.lang.String] = List(te, st, ing, 123)
scala> l.reduceLeft[String](_+_)
res8: String = testing123
fold* or reduce* seem to be the right approach, but I just can't get the syntax right for a Map.
Folds on a map work the same way they would on a list of pairs. You can't use reduce because then the result type would have to be the same as the element type (i.e. a pair), but you want a string. So you use foldLeft with the empty string as the neutral element. You also can't just use _+_ because then you'd try to add a pair to a string. You have to instead use a function that adds the accumulated string, the first value of the pair and the second value of the pair. So you get this:
scala> val m = Map("la" -> "la", "foo" -> "bar")
m: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(la -> la, foo -> bar)
scala> m.foldLeft("")( (acc, kv) => acc + kv._1 + kv._2)
res14: java.lang.String = lalafoobar
Explanation of the first argument to fold:
As you know the function (acc, kv) => acc + kv._1 + kv._2 gets two arguments: the second is the key-value pair currently being processed. The first is the result accumulated so far. However what is the value of acc when the first pair is processed (and no result has been accumulated yet)? When you use reduce the first value of acc will be the first pair in the list (and the first value of kv will be the second pair in the list). However this does not work if you want the type of the result to be different than the element types. So instead of reduce we use fold where we pass the first value of acc as the first argument to foldLeft.
In short: the first argument to foldLeft says what the starting value of acc should be.
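To make the contrast between fold and reduce concrete, here is a small sketch using the map from the answer above. The names viaFold and viaReduce are illustrative. reduce becomes usable only after each pair has been projected to a String, so that element type and result type agree:

```scala
val m = Map("la" -> "la", "foo" -> "bar")

// foldLeft: the seed "" is the value of acc when the first pair is processed.
val viaFold: String = m.foldLeft("") { case (acc, (k, v)) => acc + k + v }

// reduce needs result and element types to match, so turn each pair into a String first.
val viaReduce: String = m.map { case (k, v) => k + v }.reduce(_ + _)
```

Both traverse the map in the same iteration order, so they produce the same string. Note that reduce throws on an empty collection, while foldLeft simply returns the seed.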
As Tom pointed out, you should keep in mind that maps don't necessarily maintain insertion order (Map2 and co. do, but hashmaps do not), so the string may list the elements in a different order than the one in which you inserted them.
The question has been answered already, but I'd like to point out that there are easier ways to produce those strings, if that's all you want. Like this:
scala> val l = List("te", "st", "ing", "123")
l: List[java.lang.String] = List(te, st, ing, 123)
scala> l.mkString
res0: String = testing123
scala> val m = Map(1 -> "abc", 2 -> "def", 3 -> "ghi")
m: scala.collection.immutable.Map[Int,java.lang.String] = Map((1,abc), (2,def), (3,ghi))
scala> m.values.mkString
res1: String = abcdefghi