How to use Reduce on Scala - scala

I am using scala to implement an algorithm. I have a case where I need to implement such scenario:
test = Map(t -> List((t,2)), B -> List((B,3), (B,1)), D -> List((D,1)))
I need to some the second member of every common tuples.
The desired result :
Map((t,2),(B,4),(D,1))
val resReduce = test.foldLeft(Map.empty[String, List[Map.empty[String, Int]]){(count, tup) => count + (tup -> (count.getOrElse(tup, 0) + 1))
I am trying to use "Reduce", I have to go through every group I did and sum their second member. Any idea how to do that.

If you know that all lists are nonempty and start with the same key (e.g. they were produced by groupBy), then you can just
test.mapValues(_.map(_._2).sum).toMap
Alternatively, you might want an intermediate step that allows you to perform error-checking:
test.map{ case(k,xs) =>
val v = {
if (xs.exists(_._1 != k)) ??? // Handle key-mismatch case
else xs.reduceOption((l,r) => l.copy(_2 = l._2 + r._2))
}
v.getOrElse(??? /* Handle empty-list case */)
}

You could do something like this:
test collect{
case (key, many) => (key, many.map(_._2).sum)
}
wherein you do not have to assume that the list has any members. However, if you want to exclude empty lists, add a guard
case (key, many) if many.nonEmpty =>
like that.

scala> val test = Map("t" -> List(("t",2)), "B" -> List(("B",3), ("B",1)), "D" -> List(("D",1)))
test: scala.collection.immutable.Map[String,List[(String, Int)]] = Map(t -> List((t,2)), B -> List((B,3), (B,1)), D -> List((D,1)))
scala> test.map{case (k,v) => (k, v.map(t => t._2).sum)}
res32: scala.collection.immutable.Map[String,Int] = Map(t -> 2, B -> 4, D -> 1)

Yet another approach, in essence quite similar to what has already been suggested,
implicit class mapAcc(val m: Map[String,List[(String,Int)]]) extends AnyVal {
def mapCount() = for ( (k,v) <- m ) yield { (k,v.map {_._2}.sum) }
}
Then for a given
val test = Map("t" -> List(("t",2)), "B" -> List(("B",3), ("B",1)), "D" -> List(("D",1)))
a call
test.mapCount()
delivers
Map(t -> 2, B -> 4, D -> 1)

Related

How to transform input data into following format? - groupby

What I have is the following input data for a function in a piece of scala code I'm writing:
List(
(1,SubScriptionState(CNN,ONLINE,Seq(12))),
(1,SubScriptionState(SKY,ONLINE,Seq(12))),
(1,SubScriptionState(FOX,ONLINE,Seq(12))),
(2,SubScriptionState(CNN,ONLINE,Seq(12))),
(2,SubScriptionState(SKY,ONLINE,Seq(12))),
(2,SubScriptionState(FOX,ONLINE,Seq(12))),
(2,SubScriptionState(CNN,OFFLINE,Seq(13))),
(2,SubScriptionState(SKY,ONLINE,Seq(13))),
(2,SubScriptionState(FOX,ONLINE,Seq(13))),
(3,SubScriptionState(CNN,OFFLINE,Seq(13))),
(3,SubScriptionState(SKY,ONLINE,Seq(13))),
(3,SubScriptionState(FOX,ONLINE,Seq(13)))
)
SubscriptionState is just a case class here:
case class SubscriptionState(channel: Channel, state: ChannelState, subIds: Seq[Long])
I want to transform it into this:
Map(
1 -> Map(
SubScriptionState(SKY,ONLINE,Seq(12)) -> 1,
SubScriptionState(CNN,ONLINE,Seq(12)) -> 1,
SubScriptionState(FOX,ONLINE,Seq(12)) -> 1),
2 -> Map(
SubScriptionState(SKY,ONLINE,Seq(12,13)) -> 2,
SubScriptionState(CNN,ONLINE,Seq(12)) -> 1,
SubScriptionState(FOX,ONLINE,Seq(12,13)) -> 2,
SubScriptionState(CNN,OFFLINE,Seq(13)) -> 1),
3 -> Map(
SubScriptionState(SKY,ONLINE,Seq(13)) -> 1,
SubScriptionState(FOX,ONLINE,Seq(13)) -> 1,
SubScriptionState(CNN,OFFLINE,Seq(13)) -> 1)
)
How would I go about doing this in scala?
Here is my approach to the problem. I think it may not be a perfect solution, but it works as you would expect.
val result: Map[Int, Map[SubscriptionState, Int]] = list
.groupBy(_._1)
.view
.mapValues { statesById =>
statesById
.groupBy { case (_, subscriptionState) => (subscriptionState.channel, subscriptionState.state) }
.map { case (_, groupedStatesById) =>
val subscriptionState = groupedStatesById.head._2 // groupedStatesById should contain at least one element
val allSubIds = groupedStatesById.flatMap(_._2.subIds)
val updatedSubscriptionState = subscriptionState.copy(subIds = allSubIds)
updatedSubscriptionState -> allSubIds.size
}
}.toMap
This is a "simple" solution using groupMap and groupMapReduce
list
.groupMap(_._1)(_._2)
.view
.mapValues{
_.groupMapReduce(ss => (ss.channel, ss.state))(_.subIds)(_ ++ _)
.map{case (k,v) => SubScriptionState(k._1, k._2, v) -> v.length}
}
.toMap
The groupMap converts the data to a Map[Int, List[SubScriptionState]] and the mapValues converts each List to the appropriate Map. (The view and toMap wrappers make mapValues more efficient and safe.)
The groupMapReduce converts the List[SubScriptionState] into a Map[(Channel, ChannelState), List[SubId]].
The map on this inner Map juggles these values around to make Map[SubScriptionState, Int] as required.
I'm not clear what the purpose of inner Map is. The value is the length of the subIds field so it could be obtained directly from the key rather than needing to look it up in the Map
An attempt using foldLeft:
list.foldLeft(Map.empty[Int, Map[SubscriptionState, Int]]) { (acc, next) =>
val subMap = acc.getOrElse(next._1, Map.empty[SubscriptionState, Int])
val channelSub = subMap.find { case (sub, _) => sub.channel == next._2.channel && sub.state == next._2.state }
acc + (next._1 -> channelSub.fold(subMap + (next._2 -> next._2.subIds.length)) { case (sub, _) =>
val subIds = sub.subIds ++ next._2.subIds
(subMap - sub) + (sub.copy(subIds = subIds) -> subIds.length)
})
}
I noticed that count is not used while folding and can be calculated using storeIds. Also, as storeIds can vary, the inner Map is rather useless as you will have to use find instead of get to fetch values from Map. So if you have control over your ADTs, you could use an intermediary ADT like:
case class SubscriptionStateWithoutIds(channel: Channel, state: ChannelState)
then you can rewrite your foldLeft as follows:
list.foldLeft(Map.empty[Int, Map[SubscriptionStateWithoutIds, Seq[Long]]]) { (acc, next) =>
val subMap = acc.getOrElse(next._1, Map.empty[SubscriptionStateWithoutIds, Seq[Long]])
val withoutId = SubscriptionStateWithoutIds(next._2.channel, next._2.state)
val channelSub = subMap.get(withoutId)
acc + (next._1 -> (subMap + channelSub.fold(withoutId -> next._2.subIds) { seq => withoutId -> (seq ++ next._2.subIds) }))
}
The biggest advantage of intermediary ADT is you can have a cleaner groupMapReduce version:
list.groupMap(_._1)(sub => SubscriptionStateWithoutIds(sub._2.channel, sub._2.state) -> sub._2.subIds)
.map { case (key, value) => key -> value.groupMapReduce(_._1)(_._2)(_ ++ _) }

How to update value of a key in a immutable tree map in Scala

I'm using Scala 2.11 and i'm trying to update the value of a key in the tree map. I tried using updated:
private val xyz = List(0, 100000, 500000, 1000000)
private val abc = List (0, 5, 25, 50)
private var a = TreeMap.empty[Int, TreeMap[Int, Int]] ++ xyz.map {
aa => aa -> (TreeMap.empty[Int, Int] ++ abc.map(bb => bb -> 0))
}
a(xyz(0)).foreach {
case (key, value) =>
if (key < 50) {
a(xyz(0)) = a(xyz(0)).updated(key, 5)
}
}
And got the error:
value update is not a member of scala.collection.immutable.TreeMap[Int,scala.collection.immutable.TreeMap[Int,Int]]
Is it possible to update it? Or could someone please help me replicate the logic using a Java Tree Map since that will also allow me to use floorEntry and ceilingEntry functions. I tried converting to java tree map and it generated a regular map, not a tree map:
private var a = TreeMap.empty[Int, TreeMap[Int, Int]] ++ xyz.map {
aa => aa -> (TreeMap.empty[Int, Int] ++ abc.map(bb => bb -> 0)).asJava
}
private var b = a.asJava
You are getting confused between var/val and mutable/immutable.
I think you correctly understood the difference between val and var, that the former is an immutable variable and later is mutable. ie, if you try to reassign the object assigned as val you will get an error.
import scala.collection.immutable.TreeMap
val tm = TreeMap(1 -> 1, 2 -> 2, 3 -> 3)
tm = TreeMap(1->2)
^
error: reassignment to val
But a var can be mutated:
import scala.collection.immutable.TreeMap
var tm = TreeMap(1 -> 1, 2 -> 2, 3 -> 3)
tm = TreeMap(1->2)
// mutated tm
Notice that in the latter case, even though we are mutating the variable, we are not mutating the collection itself, we are assigning a new TreeMap. As we were using scala.collection.immutable.TreeMap it cant be mutated.
Instead, if we had used scala.collection.mutable.TreeMap, it has an update function
import scala.collection.mutable.TreeMap
val tm = TreeMap(1 -> 1, 2 -> 2, 3 -> 3)
tm.update(1, 5)
tm //TreeMap(1 -> 5, 2 -> 2, 3 -> 3)
Once you change scala.collection.immutable.TreeMap to scala.collection.mutable.TreeMap, this will work
a(xyz(0)).foreach{ case (key, value) =>
if(key < 50){
a(xyz(0)) = a(xyz(0)).updated(key, 5) //addOne(key, 5) if 2.13+
}
}
EDIT using java.util.TreeMap
private val xyz = List(0, 100000, 500000, 1000000)
private val abc = List(0, 5, 25, 50)
import java.util.{TreeMap => JTreeMap}
val jTreeMap = xyz.foldLeft(new JTreeMap[Int, JTreeMap[Int, Int]]()) { (acc, elem) =>
acc.put(
elem,
abc.foldLeft(new JTreeMap[Int, Int]()) { (acc2, elem2) =>
acc2.put(elem2, 0)
acc2
}
)
acc
}
//Map created
jTreeMap.get(xyz.head).replaceAll{
//hack for scala 2.11.x
new java.util.function.BiFunction[Int, Int, Int]{
def apply(key: Int, value: Int) = if (value < 5) 5 else value
}
}
//value edited
It is not possible to update an immutable object, you can only create a new immutable object from the old one. So the code needs to create a new TreeMap from the original one with different values as necessary.
The code looks like this:
val newMap = a.map{
case (k, v) if k == xyz(0) =>
k -> v.map {
case (k2, v2) if k2 < 50 =>
k2 -> 5
case (k2, v2) =>
k2 -> v2
}
case (k, v) =>
k -> v
}
This breaks down to an outer map that looks for matching keys in the outer TreeMap, and an inner map that looks for matching keys in the inner TreeMap. Pattern matching (case) is used to implement the match tests, and also to extract the keys and values.
Each map has one case that selects the values to be modified, and a second case that leaves other values unchanged. The first case returns the original key with a modified value while the second case just returns the original values (k -> v).
Also note that var applies to a variable, not the contents of a variable. It indicates whether the variable can be updated to refer to a different object, but says nothing about whether the object that the variable refers to can be updated. var is rarely used in Scala because it goes against a clean functional design.

Efficiently combine two lists resulting in a set of new lists

I have two sets of lists:
A: {1 -> B -> 3, 4 -> B -> 5 }
B: { 7 -> 8 }
which I want to combine resulting in
A' : {1 -> 7 -> 8 -> 3, 4 -> 7 -> 8 -> 5 }
Note: There might be multiple references to B in the same list.
A: {B -> 1 -> B}
B: {2,3}
A' : {2 -> 1 -> 2, 2 -> 1 -> 3, 3 -> 1 ->2, 3 -> 1 -> 3}
My current approach is recursive and works on each of the individual lists of A:
object dummy {
trait Step
// I owe you referencing another set
case class IOU(ref : String)
// a simple step within the list
case class Step(id : Long)
def recursiveResolve(
id: String,
to: Set[Seq[Step]],
on: Seq[Step]): Set[Seq[Step]] = {
on match {
// here we are done and can return the empty set to resolve the recursion tree
case Nil => Set()
// we have still work left and now work on the first element remaining
case mul :: tiple =>
val step: Set[Seq[Step]] = mul match {
// if we have an I owe you that references the replace in list
case x: IOU if x.ref == id =>
// return the replace in set of lists
to
case x => Set(Seq(x))
}
val next: Set[Seq[Step]] = recursiveResolve(id, to, tiple)
// if the next step returned non empty we have to merge
if (next.nonEmpty) {
step.flatMap { elem: Seq[Step] =>
next.map { nextElem: Seq[Step] =>
elem ++ nextElem
}
}
} else {
step
}
}
}
}
val A : Set[Seq[Step]] = Set( Seq( Step(1), IOU("B"), Step(3)), Seq( Step(4), IOU("B"), Step(5) ))
val B : Set[Seq[Step]] = Set( Seq(Step(7), Step(8) ) )
val result : Set[Seq[Step]] = A.flatMap(elem => recursiveResolve("B",B,elem))
This works but is obviously horribly inefficient as each step of the list involves consing and merging and flatMapping. Furthermore, the problem I am trying to solve involves larger lists, sometimes with multiple references per list to different or the same set.
Is there a more efficient approach?
Ugh, yeah .. that becomes expensive (but still polynomial - I think, it's O(A*N*B^2))
def resolve(bs: Set[Seq[Step]])(s: Seq[Step]): Set[Seq[Step]] = s match {
case Seq(IOU(_), tail#_*) => bs.flatMap { b => resolve(bs)(tail).map(b ++: _) }
case Seq(head, tail#_*) => resolve(bs)(tail).map(head +: _)
case Seq() => Set(Seq())
}
A.flatMap(resolve(B))

How to combine all the values with the same key in Scala?

I have a map like :
val programming = Map(("functional", 1) -> "scala", ("functional", 2) -> "perl", ("orientedObject", 1) -> "java", ("orientedObject", 2) -> "C++")
with the same first element of key appearing multiple times.
How to regroup all the values corresponding to the same first element of key ? Which would turn this map into :
Map("functional" -> List("scala","perl"), "orientedObject" -> List("java","C++"))
UPDATE: This answer is based upon your original question. If you need the more complex Map definition, using a tuple as the key, then the other answers will address your requirements. You may still find this approach simpler.
As has been pointed out, you can't actually have multiple keys with the same value in a map. In the REPL, you'll note that your declaration becomes:
scala> val programming = Map("functional" -> "scala", "functional" -> "perl", "orientedObject" -> "java", "orientedObject" -> "C++")
programming: scala.collection.immutable.Map[String,String] = Map(functional -> perl, orientedObject -> C++)
So you end up missing some values. If you make this a List instead, you can get what you want as follows:
scala> val programming = List("functional" -> "scala", "functional" -> "perl", "orientedObject" -> "java", "orientedObject" -> "C++")
programming: List[(String, String)] = List((functional,scala), (functional,perl), (orientedObject,java), (orientedObject,C++))
scala> programming.groupBy(_._1).map(p => p._1 -> p._2.map(_._2)).toMap
res0: scala.collection.immutable.Map[String,List[String]] = Map(functional -> List(scala, perl), orientedObject -> List(java, C++))
Based on your edit, you have a data structure that looks something like this
val programming = Map(("functional", 1) -> "scala", ("functional", 2) -> "perl",
("orientedObject", 1) -> "java", ("orientedObject", 2) -> "C++")
and you want to scrap the numerical indices and group by the string key. Fortunately, Scala provides a built-in that gets you close.
programming groupBy { case ((k, _), _) => k }
This will return a new map which contains submaps of the original, grouped by the key that we return from the "partial" function. But we want a map of lists, so let's ignore the keys in the submaps.
programming groupBy { case ((k, _), _) => k } mapValues { _.values }
This gets us a map of... some kind of Iterable. But we really want lists, so let's take the final step and convert to a list.
programming groupBy { case ((k, _), _) => k } mapValues { _.values.toList }
You should try the .groupBy method
programming.groupBy(_._1._1)
and you will get
scala> programming.groupBy(_._1._1)
res1: scala.collection.immutable.Map[String,scala.collection.immutable.Map[(String, Int),String]] = Map(functional -> Map((functional,1) -> scala, (functional,2) -> perl), orientedObject -> Map((orientedObject,1) -> java, (orientedObject,2) -> C++))
you can now "clean" by doing something like:
scala> res1.mapValues(m => m.values.toList)
res3: scala.collection.immutable.Map[String,List[String]] = Map(functional -> List(scala, perl), orientedObject -> List(java, C++))
Read the csv file and create a map that contains key and list of values.
val fileStream = getClass.getResourceAsStream("/keyvaluepair.csv")
val lines = Source.fromInputStream(fileStream).getLines
var mp = Seq[List[(String, String)]]();
var codeMap=List[(String, String)]();
var res = Map[String,List[String]]();
for(line <- lines )
{
val cols=line.split(",").map(_.trim())
codeMap ++= Map(cols(0)->cols(1))
}
res = codeMap.groupBy(_._1).map(p => p._1 -> p._2.map(_._2)).toMap
Since no one has put in the specific ordering he asked for:
programming.groupBy(_._1._1)
.mapValues(_.toSeq.map { case ((t, i), l) => (i, l) }.sortBy(_._1).map(_._2))

Scala iterate over map and turn singleton List into just the singleton

I am trying to extract a value of type List[T] to just T in a Map. So for instance:
val c = Map(1->List(1), 2-> List(2), 3->List(3));
would turn into
Map(1->1,2->2,3->3);
Here is what I have written so far:
val Some(values) = request.body.asFormUrlEncoded.foreach {
case (key,value) =>
Map(key->value.head);
};
and here is the error I am receiving:
constructor cannot be instantiated to expected type; found : (T1, T2) required: scala.collection.immutable.Map[String,Seq[String]]
EDIT: This is ocurring wrt to this line:
case (key,value) =>
EDIT2:
request.body.asFormUrlEncoded example output
Some(Map(test -> List(324)))
Some(Map(SpO2 -> List(456), ETCO2 -> List(123)))
Are you sure that you will always have exactly one element in the list? If so, you should do this, which is clear, and has the benefit that it will throw an error if you get a bad list (doesn't have exactly one element) by accident.
c.map { case (k, List(v)) => k -> v }
// Map(1 -> 1, 2 -> 2, 3 -> 3)
If your lists can have more than one element, and you just want the first, you can do this (which will error on empty lists):
val d = Map(1 -> List(1), 2 -> List(2,4,6), 3 -> List(3))
d.map { case (k, List(v, _*)) => k -> v }
// Map(1 -> 1, 2 -> 2, 3 -> 3)
If your lists may not have exactly one element, and you want to ignore any non-singleton lists instead of throwing errors, use collect instead of map:
val e = Map(1 -> List(1), 2 -> List(2,4,6), 3 -> List(3), 4 -> List())
e.collect { case (k, List(v)) => k -> v }
// Map(1 -> 1, 3 -> 3)
As for your code:
val Some(values) = request.body.asFormUrlEncoded.foreach {
case (key,value) =>
Map(key->value.head);
};
This doesn't really make any sense.
First off, foreach doesn't return anything, so assigning its result to a variable will never work. You probably want this to be a map instead, so that it returns a collection.
Second, your use of Some makes it seem like you don't understand Options, so you might want to read up on that.
Third, if you want the result to be a Map (a collection of pairs), then you'll just want to return the pair, key->value.head, and not a Map.
Fourth, if you're getting errors matching on case (key,value), then probably asFormUrlEncoded doesn't actually return a collection of pairs. You should see what its type actually is.
Lastly, the semicolons are unnecessary. You should remove them.
EDIT based on your comment:
Since request.body.asFormUrlEncoded actually returns things like Some(Map("test" -> List(324))), here is how your code should look.
If asFormUrlEncoded might return None, and you don't have any way of handling that, then you should guard against it:
val a = Some(Map("test" -> List(324)))
val value = a match {
case Some(m) => m.collect { case (k, List(v)) => k -> v }
case None => sys.error("expected something, got nothing")
}
If you're sure that asFormUrlEncoded will already return Some, then you can just do this:
val a = Some(Map("test" -> List(324)))
val Some(value) = a.map(_.collect { case (k, List(v)) => k -> v })