Merge two lists which contain case class objects in Scala

I have two lists which contain case class objects:
case class Balance(id: String, in: Int, out: Int)
val l1 = List(Balance("a", 0, 0), Balance("b", 10, 30), Balance("c", 20, 0))
val l2 = List(Balance("a", 10, 0), Balance("b", 40, 0))
I want to sum the in and out values of the elements that share an id, and combine the lists like below:
List(Balance(a, 10, 0), Balance(b, 50, 30), Balance(c, 20, 0))
I have come up with the following solution:
// create list of tuples with 'id' as key
val a = l1.map(b => (b.id, (b.in, b.out)))
val b = l2.map(b => (b.id, (b.in, b.out)))
// combine the lists
val bl = (a ++ b).groupBy(_._1).mapValues(_.unzip._2.unzip match {
case (ll1, ll2) => (ll1.sum, ll2.sum)
}).toList.map(b => Balance(b._1, b._2._1, b._2._2))
// output
// List(Balance(a, 10, 0), Balance(b, 50, 30), Balance(c, 20, 0))
Is there a shorter way to do this?

You don't really need to create the tuple lists.
(l1 ++ l2).groupBy(_.id)
.mapValues(_.foldLeft((0,0)){
case ((a,b),Balance(id,in,out)) => (a+in,b+out)})
.map{
case (k,(in,out)) => Balance(k,in,out)}
.toList
// res0: List[Balance] = List(Balance(b,50,30), Balance(a,10,0), Balance(c,20,0))
You'll note that the result appears out of order because of the intermediate representation as a Map, which, by definition, has no order.
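If the first-seen order of the ids matters, one option is to keep the Map only for the sums and walk the distinct ids of the concatenated list to rebuild the result. This is a minimal sketch, not part of the answer above; the names `MergeBalances` and `mergeInOrder` are made up for illustration.

```scala
object MergeBalances {
  final case class Balance(id: String, in: Int, out: Int)

  // Sum `in` and `out` per id, then emit the ids in first-seen order.
  def mergeInOrder(lists: List[Balance]*): List[Balance] = {
    val all = lists.flatten.toList
    // Unordered intermediate Map: id -> summed Balance.
    val summed: Map[String, Balance] =
      all.groupBy(_.id).map { case (id, bs) =>
        id -> Balance(id, bs.map(_.in).sum, bs.map(_.out).sum)
      }
    // `distinct` keeps the first occurrence of each id, which restores the order.
    all.map(_.id).distinct.map(summed)
  }
}
```

With the l1 and l2 from the question, `MergeBalances.mergeInOrder(l1, l2)` should give the a, b, c ordering the question asked for.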

Another approach would be to add a Semigroup instance for Balance and use that for the combine logic. The advantage is that the code lives in one place only, rather than being sprinkled wherever you need to combine lists or maps of Balances.
So, you first add the instance:
import cats.Semigroup
import cats.implicits._
implicit val semigroupBalance: Semigroup[Balance] = new Semigroup[Balance] {
override def combine(x: Balance, y: Balance): Balance =
if(x.id == y.id) // I am arbitrarily deciding this: you can adapt the logic to your
// use case, but if you only need it in the scenario you asked for,
// the case where y.id and x.id are different will never happen.
Balance(x.id, x.in + y.in, x.out + y.out)
else x
}
Then, the code to combine multiple lists becomes simpler (using your example data):
(l1 ++ l2).groupBy(_.id).mapValues(_.reduce(_ |+| _)) //Map(b -> Balance(b,50,30), a -> Balance(a,10,0), c -> Balance(c,20,0))
N.B. As @jwvh already noted, the result will not be in order in this simple case, because groupBy returns the default unordered Map. That could be fixed, if needed.
N.B. You might want to use Monoid instead of Semigroup, if you have a meaningful empty value for Balance.
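To make the Monoid remark concrete without pulling in cats, here is a dependency-free sketch. The `Monoid` trait, the `Totals` alias and the `combineAll` helper are defined locally for illustration only and are not cats' API. The idea it shows is that a map of per-id totals has a natural empty value (the empty map), whereas a single Balance does not, because of its id.

```scala
object BalanceMonoidSketch {
  final case class Balance(id: String, in: Int, out: Int)

  // Hypothetical minimal type class, standing in for a library Monoid.
  trait Monoid[A] {
    def empty: A
    def combine(x: A, y: A): A
  }

  // Per-id totals of (in, out).
  type Totals = Map[String, (Int, Int)]

  implicit val totalsMonoid: Monoid[Totals] = new Monoid[Totals] {
    val empty: Totals = Map.empty
    // Merge y into x, adding the totals of ids present in both.
    def combine(x: Totals, y: Totals): Totals =
      y.foldLeft(x) { case (acc, (id, (in, out))) =>
        val (accIn, accOut) = acc.getOrElse(id, (0, 0))
        acc.updated(id, (accIn + in, accOut + out))
      }
  }

  // Fold any list with the Monoid in scope, starting from `empty`.
  def combineAll[A](as: List[A])(implicit m: Monoid[A]): A =
    as.foldLeft(m.empty)(m.combine)

  def merge(balances: List[Balance]): Map[String, Balance] =
    combineAll(balances.map(b => Map(b.id -> ((b.in, b.out)))))
      .map { case (id, (in, out)) => id -> Balance(id, in, out) }
}
```

Because the fold starts from `empty`, merging an empty list is well defined here, which is what a Monoid adds over a Semigroup.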

For those who need to merge two lists of case class objects and get the result ordered by id (which here matches the original ordering), here's my solution, which is based on jwvh's answer to this question and this answer.
import scala.collection.immutable.SortedMap
val mergedList: List[Balance] = l1 ++ l2
val sortedListOfBalances: List[Balance] =
SortedMap(mergedList.groupBy(_.id).toSeq:_*)
.mapValues(_.foldLeft((0,0)){
case ((a,b),Balance(id,in,out)) => (a+in,b+out)
})
.map{
case (k,(in,out)) => Balance(k,in,out)
}
.toList
This will return List(Balance(a,10,0), Balance(b,50,30), Balance(c,20,0)), whereas without SortedMap we get List(Balance(b,50,30), Balance(a,10,0), Balance(c,20,0)).
The default Map iterates in an unspecified order unless we specifically use SortedMap (or one of its subtypes), which iterates in key order.
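One caveat worth illustrating: a SortedMap orders entries by key, not by where they first appeared in the input. The small sketch below (the `SortedByKey` object and `groupedKeys` helper are illustrative names) groups a list of ids that is not already sorted and lists the keys as the SortedMap iterates them.

```scala
import scala.collection.immutable.SortedMap

object SortedByKey {
  // Group the ids, then return the keys in SortedMap iteration order.
  def groupedKeys(ids: List[String]): List[String] =
    SortedMap(ids.groupBy(identity).toSeq: _*).keys.toList
}

// SortedByKey.groupedKeys(List("c", "a", "c", "b")) gives List(a, b, c),
// i.e. key order rather than the c, a, b order of first appearance.
```

So this approach keeps the input order only when the ids already arrive sorted, as they do in the question's data.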

Related

Scala: Combine some elements in a list when they have the same property

How do I combine some elements in a list when they have the same property?
E.g. let say I have the following:
case class Foo(year: Int, amount: Int)
val list = List(Foo(2015, 10), Foo(2015, 15), Foo(2019, 55))
How do I transform list into List(Foo(2015, 25), Foo(2019, 55)) the Scala way?
As you can see, both Foo(2015, 10) and Foo(2015, 15) are merged into Foo(2015, 25).
There is a similar question, Combining elements in the same list, but that's for C#/LINQ.
If you're on Scala 2.13+, consider using groupMapReduce:
list.groupMapReduce(_.year)(_.amount)(_ + _).
map{ case (y, a) => Foo(y, a) }
// res1: scala.collection.immutable.Iterable[Foo] = List(Foo(2019,55), Foo(2015,25))
Use groupBy to arrange list by year, then map over the results to get it in the proper shape and sum the amount of each Foo.
scala> list.groupBy(foo => foo.year).map(m => Foo(m._1, m._2.map(foo => foo.amount).sum))
res5: scala.collection.immutable.Iterable[Foo] = List(Foo(2015,25), Foo(2019,55))
This is just a refactoring of Brian's answer. I use pattern matching to name the values properly,
which I think makes the code easier to read.
list
.groupBy{case Foo(year, _) => year}
.map{ case (year, foos) =>
Foo(year,
foos.map{ case Foo(_, amount) => amount}.sum)
}
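The answers above return an Iterable whose order depends on the intermediate Map. If a List in a predictable order is wanted, a final sort by year is one way to get it. A self-contained sketch (the `CombineFoos` wrapper and `combine` name are illustrative):

```scala
object CombineFoos {
  final case class Foo(year: Int, amount: Int)

  // Group by year, sum the amounts, then sort so the output order is deterministic.
  def combine(list: List[Foo]): List[Foo] =
    list
      .groupBy(_.year)
      .map { case (year, foos) => Foo(year, foos.map(_.amount).sum) }
      .toList
      .sortBy(_.year)
}
```

For the list in the question this should yield List(Foo(2015,25), Foo(2019,55)).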

groupBy on List as LinkedHashMap instead of Map

I am processing XML using scala, and I am converting the XML into my own data structures. Currently, I am using plain Map instances to hold (sub-)elements, however, the order of elements from the XML gets lost this way, and I cannot reproduce the original XML.
Therefore, I want to use LinkedHashMap instances instead of Map, however I am using groupBy on the list of nodes, which creates a Map:
For example:
def parse(n:Node): Unit =
{
val leaves:Map[String, Seq[XmlItem]] =
n.child
.filter(node => { ... })
.groupBy(_.label)
.map((tuple:Tuple2[String, Seq[Node]]) =>
{
val items = tuple._2.map(node =>
{
val attributes = ...
if (node.text.nonEmpty)
XmlItem(Some(node.text), attributes)
else
XmlItem(None, attributes)
})
(tuple._1, items)
})
...
}
In this example, I want leaves to be of type LinkedHashMap to retain the order of n.child. How can I achieve this?
Note: I am grouping by label/tagname because elements can occur multiple times, and for each label/tagname, I keep a list of elements in my data structures.
Solution
As answered by @jwvh, I am using foldLeft as a substitute for groupBy. Also, I decided to go with LinkedHashMap instead of ListMap.
def parse(n:Node): Unit =
{
val leaves:mutable.LinkedHashMap[String, Seq[XmlItem]] =
n.child
.filter(node => { ... })
.foldLeft(mutable.LinkedHashMap.empty[String, Seq[Node]])((m, sn) =>
{
m.update(sn.label, m.getOrElse(sn.label, Seq.empty[Node]) ++ Seq(sn))
m
})
.map((tuple:Tuple2[String, Seq[Node]]) =>
{
val items = tuple._2.map(node =>
{
val attributes = ...
if (node.text.nonEmpty)
XmlItem(Some(node.text), attributes)
else
XmlItem(None, attributes)
})
(tuple._1, items)
})
...
}
To get the rough equivalent to .groupBy() in a ListMap you could fold over your collection. The problem is that ListMap preserves the order of elements as they were appended, not as they were encountered.
import collection.immutable.ListMap
List('a','b','a','c').foldLeft(ListMap.empty[Char,Seq[Char]]){
case (lm,c) => lm.updated(c, c +: lm.getOrElse(c, Seq()))
}
//res0: ListMap[Char,Seq[Char]] = ListMap(b -> Seq(b), a -> Seq(a, a), c -> Seq(c))
To fix this you can foldRight instead of foldLeft. The result is the original order of elements as encountered (scanning left to right) but in reverse.
List('a','b','a','c').foldRight(ListMap.empty[Char,Seq[Char]]){
case (c,lm) => lm.updated(c, c +: lm.getOrElse(c, Seq()))
}
//res1: ListMap[Char,Seq[Char]] = ListMap(c -> Seq(c), b -> Seq(b), a -> Seq(a, a))
This isn't necessarily a bad thing since a ListMap is more efficient with last and init ops, O(1), than it is with head and tail ops, O(n).
To process the ListMap in the original left-to-right order you could .toList and .reverse it.
List('a','b','a','c').foldRight(ListMap.empty[Char,Seq[Char]]){
case (c,lm) => lm.updated(c, c +: lm.getOrElse(c, Seq()))
}.toList.reverse
//res2: List[(Char, Seq[Char])] = List((a,Seq(a, a)), (b,Seq(b)), (c,Seq(c)))
A purely immutable solution would be quite slow, so I'd go with:
import collection.mutable.{ArrayBuffer, LinkedHashMap}
implicit class ExtraTraversableOps[A](seq: collection.TraversableOnce[A]) {
def orderedGroupBy[B](f: A => B): collection.Map[B, collection.Seq[A]] = {
val map = LinkedHashMap.empty[B, ArrayBuffer[A]]
for (x <- seq) {
val key = f(x)
map.getOrElseUpdate(key, ArrayBuffer.empty) += x
}
map
}
}
To use, just change .groupBy in your code to .orderedGroupBy.
The returned Map can't be mutated using this type (though it can be cast to mutable.Map or to mutable.LinkedHashMap), so it's safe enough for most purposes (and you could create a ListMap from it at the end if really desired).
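For a quick check of the ordering behaviour, here is a standalone variant of the same idea written as a plain function instead of an implicit class. It is only a sketch: the `OrderedGroupBy` object and the choice to return an immutable List of pairs are assumptions made for the example, not part of the answer above.

```scala
import scala.collection.mutable.{ArrayBuffer, LinkedHashMap}

object OrderedGroupBy {
  // Group by key, keeping keys in the order they are first encountered.
  def orderedGroupBy[A, B](xs: Iterable[A])(f: A => B): List[(B, List[A])] = {
    val map = LinkedHashMap.empty[B, ArrayBuffer[A]]
    // LinkedHashMap remembers insertion order; later hits only append to the buffer.
    for (x <- xs) map.getOrElseUpdate(f(x), ArrayBuffer.empty[A]) += x
    map.iterator.map { case (k, v) => (k, v.toList) }.toList
  }
}

// OrderedGroupBy.orderedGroupBy(List('a', 'b', 'a', 'c'))(identity)
// gives List((a,List(a, a)), (b,List(b)), (c,List(c)))
```

Returning an immutable snapshot avoids exposing the mutable buffers, at the cost of one extra copy.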

reduce a list in scala by value

How can I concisely reduce a list like the one below
Seq[Temp] = List(Temp(a,1), Temp(a,2), Temp(b,1))
to
List(Temp(a,2), Temp(b,1))
That is, keep only one Temp object per unique first parameter: the one with the maximum second parameter.
My current solution uses a lot of groupBys and reduces, which makes it lengthy.
You have to:
1. groupBy
2. sortBy values in ascending order
3. get the last one, which is the largest
Example:
scala> final case class Temp (a: String, value: Int)
defined class Temp
scala> val data : Seq[Temp] = List(Temp("a",1), Temp("a",2), Temp("b",1))
data: Seq[Temp] = List(Temp(a,1), Temp(a,2), Temp(b,1))
scala> data.groupBy(_.a).map { case (k, group) => group.sortBy(_.value).last }
res0: scala.collection.immutable.Iterable[Temp] = List(Temp(b,1), Temp(a,2))
Or, instead of sortBy(fn).last, you can use maxBy(fn):
scala> data.groupBy(_.a).map { case (k, group) => group.maxBy(_.value) }
res1: scala.collection.immutable.Iterable[Temp] = List(Temp(b,1), Temp(a,2))
You can generate a Map with groupBy, compute the max in mapValues and convert it back to the Temp classes as in the following example:
case class Temp(id: String, value: Int)
List(Temp("a", 1), Temp("a", 2), Temp("b", 1)).
groupBy(_.id).mapValues( _.map(_.value).max ).
map{ case (k, v) => Temp(k, v) }
// res1: scala.collection.immutable.Iterable[Temp] = List(Temp(b,1), Temp(a,2))
It is worth noting that the maxBy solution in the other answer is more efficient, as it needs fewer intermediate transformations.
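You can do this using foldLeft (assuming Temp(id, value) as defined in the answer above):
import scala.math.max
// The default of 0 assumes that all values are non-negative.
data.foldLeft(Map[String, Int]().withDefaultValue(0))((map, tmp) => {
map.updated(tmp.id, max(map(tmp.id), tmp.value))
}).map{case (i,v) => Temp(i, v)}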
This is essentially combining the logic of groupBy with the max operation in a single pass.
Note: this may be less efficient, because groupBy uses a mutable.Map internally, which avoids constantly re-creating a new map. If you care about performance and are prepared to use mutable data, this is another option:
import scala.collection.mutable
import scala.math.max
val tmpMap = mutable.Map[String, Int]().withDefaultValue(0)
data.foreach(tmp => tmpMap(tmp.id) = max(tmp.value, tmpMap(tmp.id)))
tmpMap.map{case (i,v) => Temp(i, v)}.toList
Use a ListMap if you need to retain the data order, or sort at the end if you need a particular ordering.
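As one possible way to retain the data order, the single-pass idea can be written with a mutable LinkedHashMap, which keeps keys in first-insertion order even when their values are later replaced. This is a sketch under that assumption; `MaxPerKey` and `maxPerId` are illustrative names. Storing the whole Temp instead of an Int with a default of 0 also means negative values are handled.

```scala
import scala.collection.mutable

object MaxPerKey {
  final case class Temp(id: String, value: Int)

  // Keep, for each id, the Temp with the largest value, in first-seen id order.
  def maxPerId(data: Seq[Temp]): List[Temp] = {
    val best = mutable.LinkedHashMap.empty[String, Temp]
    data.foreach { t =>
      best.get(t.id) match {
        case Some(current) if current.value >= t.value => () // existing entry wins
        case _                                         => best(t.id) = t
      }
    }
    best.values.toList
  }
}
```

For the data in the question this should return List(Temp(a,2), Temp(b,1)).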

Decompose Scala sequence into member values

I'm looking for an elegant way of accessing two items in a Seq at the same time. I've checked earlier in my code that the Seq will have exactly two items. Now I would like to be able to give them names, so they have meaning.
records
.sliding(2) // makes sure we get `Seq` with two items
.map(recs => {
// Something like this...
val (former, latter) = recs
})
Is there an elegant and/or idiomatic way to achieve this in Scala?
I'm not sure if it is any more elegant, but you can also unpick the sequence like this:
val former +: latter +: _ = recs
You can access the elements by their index:
map { recs => {
val (former, latter) = (recs(0), recs(1))
}}
You can use pattern matching to decompose the structure of your list:
val records = List("first", "second")
records match {
case first +: second +: Nil => println(s"1: $first, 2: $second")
case _ => // Won't happen (you can omit this)
}
will output
1: first, 2: second
sliding returns an Iterator whose elements are Lists (when records is a List). Using a pattern match, you can give names to these elements like this:
map{ case List(former, latter) =>
...
}
Note that since it's a pattern match, you need to use {} instead of ().
For records of a known type (for example, Int):
records.sliding (2).map (_ match {
case List (former:Int, latter:Int) => former + latter })
Note that this combines elements (0, 1), then (1, 2), (2, 3), and so on. To combine disjoint pairs instead, use sliding (2, 2):
val pairs = records.sliding (2, 2).map (_ match {
case List (former: Int, latter: Int) => former + latter
case List (one: Int) => one
}).toList
Note that you now need an extra case for a single element, in case the size of records is odd.
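The difference between the two window styles can be sketched in a self-contained form as follows. The object and method names are illustrative, and `collect` is used in the first method so that a window shorter than two elements is skipped rather than causing a MatchError.

```scala
object SlidingPairs {
  // Overlapping windows: elements (0,1), (1,2), (2,3), ...
  def adjacentSums(records: List[Int]): List[Int] =
    records.sliding(2).collect { case List(former, latter) => former + latter }.toList

  // Disjoint windows: elements (0,1), (2,3), ...; a trailing single element is kept as-is.
  def pairwiseSums(records: List[Int]): List[Int] =
    records.sliding(2, 2).map {
      case List(former, latter) => former + latter
      case window               => window.sum // window of size 1 when the length is odd
    }.toList
}

// SlidingPairs.adjacentSums(List(1, 2, 3, 4, 5)) gives List(3, 5, 7, 9)
// SlidingPairs.pairwiseSums(List(1, 2, 3, 4, 5)) gives List(3, 7, 5)
```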

How to extract elements from 4 lists in scala?

case class TargetClass(key: Any, value: Number, lowerBound: Double, upperBound: Double)
val keys: List[Any] = List("key1", "key2", "key3")
val values: List[Number] = List(1,2,3);
val lowerBounds: List[Double] = List(0.1, 0.2, 0.3)
val upperBounds: List[Double] = List(0.5, 0.6, 0.7)
Now I want to construct a List[TargetClass] from the 4 lists. Does anyone know how to do it efficiently? Is using a for loop to add elements one by one very inefficient?
I tried to use zipped, but it seems that this only applies for combining up to 3 lists.
Thank you very much!
One approach:
keys.zipWithIndex.map {
case (item,i)=> TargetClass(item,values(i),lowerBounds(i),upperBounds(i))
}
You may want to consider using the lift method to deal with lists of unequal lengths (and thereby provide a default if keys is longer than any of the other lists).
I realise this doesn't address your question of efficiency. You could fairly easily run some tests on different approaches.
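A sketch of the lift idea, for illustration: `lift(i)` returns an Option, so an index past the end of a shorter list yields None instead of throwing. In this version a key with any missing counterpart is simply dropped; replacing a `lift(i)` with `lift(i).getOrElse(someDefault)` would supply a default instead. The `ZipWithLift` and `build` names are made up for the example. Note that indexing into a List is linear in the index, so this approach is quadratic on long lists.

```scala
object ZipWithLift {
  final case class TargetClass(key: Any, value: Number, lowerBound: Double, upperBound: Double)

  // Build a TargetClass per key; keys whose index is missing in any other list are skipped.
  def build(keys: List[Any], values: List[Number],
            lowerBounds: List[Double], upperBounds: List[Double]): List[TargetClass] =
    keys.zipWithIndex.flatMap { case (k, i) =>
      for {
        v <- values.lift(i)
        l <- lowerBounds.lift(i)
        u <- upperBounds.lift(i)
      } yield TargetClass(k, v, l, u)
    }
}
```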
You can apply zipped to the first two lists, to the last two lists, then to the results of the previous zips, then map to your class, like so:
val z12 = (keys, values).zipped
val z34 = (lowerBounds, upperBounds).zipped
val z1234 = (z12.toList, z34.toList).zipped
val targs = z1234.map { case ((k,v),(l,u)) => TargetClass(k,v,l,u) }
// targs = List(TargetClass(key1,1,0.1,0.5), TargetClass(key2,2,0.2,0.6), TargetClass(key3,3,0.3,0.7))
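On Scala 2.13 or later, `lazyZip` can be chained to cover four collections, which may avoid building the intermediate lists. The following is a sketch that assumes 2.13+; the `LazyZipFour` wrapper and `build` name are illustrative. Like zip, it stops at the shortest input.

```scala
object LazyZipFour {
  final case class TargetClass(key: Any, value: Number, lowerBound: Double, upperBound: Double)

  // Zip the four lists lazily and map each 4-tuple of elements to a TargetClass.
  def build(keys: List[Any], values: List[Number],
            lowerBounds: List[Double], upperBounds: List[Double]): List[TargetClass] =
    keys.lazyZip(values).lazyZip(lowerBounds).lazyZip(upperBounds)
      .map((k, v, l, u) => TargetClass(k, v, l, u))
}
```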
How about:
keys zip values zip lowerBounds zip upperBounds map {
case (((k, v), l), u) => TargetClass(k, v, l, u)
}
Example:
scala> val zipped = keys zip values zip lowerBounds zip upperBounds
zipped: List[(((Any, Number), Double), Double)] = List((((key1,1),0.1),0.5), (((key2,2),0.2),0.6), (((key3,3),0.3),0.7))
scala> zipped map { case (((k, v), l), u) => TargetClass(k, v, l, u) }
res6: List[TargetClass] = List(TargetClass(key1,1,0.1,0.5), TargetClass(key2,2,0.2,0.6), TargetClass(key3,3,0.3,0.7))
It would be nice if .transpose worked on a Tuple of Lists.
for (List(k, v:Number, l:Double, u:Double) <-
List(keys, values, lowerBounds, upperBounds).transpose)
yield TargetClass(k,v,l,u)
I think that, from an efficiency point of view, whatever you use will have to traverse each of the lists. The only question is whether you do it by hand or, for the sake of readability, use Scala idioms and let Scala do the dirty work for you :)
Other approaches are not necessarily more efficient. You can change the order of zipping and the order of assembling the return value of the map function as you like.
Here is a more functional way, but I am not sure it will be more efficient. See the comments on @wwkudu's (zip with index) answer.
val res1 = keys zip lowerBounds zip values zip upperBounds
res1.map {
x=> (x._1._1._1,x._1._1._2, x._1._2, x._2)
//Of course, you can return an instance of TargetClass
//here instead of the tuple I am returning.
}
I am curious, why do you need a "TargetClass"? Would a tuple work?