Perform arithmetic on a list in Scala

I am new to Scala, so please forgive me if I am overlooking something extremely basic here. I have the following:
case class Record(
ID: String,
Count: Double)
val records = List(Record("ID1",10.0),Record("ID1",60.0),Record("ID2",50.0),Record("ID3",100.0),Record("ID3",20.0),Record("ID3",10.0))
where x is the ID and y is the Count in Record(x,y). I am able to print this list to the console with println(records).
I am trying to output the following:
ID1,70.0
ID2,50.0
ID3,130.0
which is a summation of the count per ID group. I would like to try the groupBy approach, but I am struggling to be able to parse the ID from each Record in my list in order to begin grouping the counts.
For example, I have considered:
val grouped = records.groupBy(<some_logic_here>)
but the problem is that the objects in the list have Record(x,y) wrapped around them.
Thank you for your help.

case class Record(ID: String,
Count: Double)
val records = List(Record("ID1", 10.0), Record("ID1", 60.0), Record("ID2", 50.0), Record("ID3", 100.0), Record("ID3", 20.0),
Record("ID3", 10.0))
Here is a one-liner:
val ans = records.groupBy(_.ID).mapValues(_.map(_.Count).sum)
ans.foreach(x => println(s"${x._1},${x._2}"))

case class Record(ID: String, Count: Double)
List(Record("1", 12), Record("1", 13), Record("2", 13))
.groupBy(_.ID)
.map(e => Record(e._1, e._2.map(e => e.Count).sum))
You need to groupBy(_.ID), which gives you a list of records per ID, and then compute the sum of the Count values in each list.
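As a minimal, self-contained sketch of the group-then-sum idea from the answers above, the snippet below reuses the question's Record data. The names totals and lines are illustrative only, and the sort is added just to make the printed order predictable, since a Map does not guarantee one.

```scala
case class Record(ID: String, Count: Double)

val records = List(
  Record("ID1", 10.0), Record("ID1", 60.0), Record("ID2", 50.0),
  Record("ID3", 100.0), Record("ID3", 20.0), Record("ID3", 10.0)
)

// Group the records by ID, then sum the Count of each group.
val totals: Map[String, Double] =
  records.groupBy(_.ID).map { case (id, group) => id -> group.map(_.Count).sum }

// Sort by ID only to get a stable output order (hypothetical helper value).
val lines: List[String] =
  totals.toList.sortBy(_._1).map { case (id, total) => s"$id,$total" }

lines.foreach(println)
```

Using map with a pattern match instead of mapValues sidesteps the lazy view that mapValues returns in Scala 2.13.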

Related

How to combine two objects in a List by summing a member

Given this case class:
case class Categories(fruit: String, amount: Double, mappedTo: String)
I have a list containing the following:
List(
Categories("Others",22.38394964594807,"Others"),
Categories("Others",77.6160503540519,"Others")
)
I want to combine two elements in the list by summing up their amount if they are in the same category, so that the end result in this case would be:
List(Categories("Others",99.99999999999997,"Others"))
How can I do that?
Since groupMapReduce was introduced in Scala 2.13, I'll try to provide another approach to Martinjn's great answer.
Assuming we have:
case class Categories(Fruit: String, amount: Double, mappedTo: String)
val categories = List(
Categories("Apple",22.38394964594807,"Others"),
Categories("Apple",77.6160503540519,"Others")
)
If you want to aggregate by both mappedTo and Fruit
val result = categories.groupBy(c => (c.Fruit, c.mappedTo)).map {
case ((fruit, mappedTo), categories) => Categories(fruit, categories.map(_.amount).sum, mappedTo)
}
Code run can be found at Scastie.
If you want to aggregate only by mappedTo, and pick an arbitrary Fruit (here, the first one in each group), you can do:
val result = categories.groupBy(c => c.mappedTo).map {
case (mappedTo, categories) => Categories(categories.head.Fruit, categories.map(_.amount).sum, mappedTo)
}
Code run can be found at Scastie
You want to group your list entries by category, and reduce them to a single value. There is groupMapReduce for that, which groups entries, and then maps the group (you don't need this) and reduces the group to a single value.
given
case class Category(category: String, amount: Double)
if you have a val myList: List[Category], then you want to group on Category#category, and reduce them by merging the members, summing up the amount.
that gives
myList.groupMapReduce(_.category)(identity) { // group by category; identity map, since we don't need to map
  case (Category(name, amount1), Category(_, amount2)) =>
    Category(name, amount1 + amount2) // reduce: combine two elements by keeping the name and summing the amounts
}
In theory just a groupReduce would have been enough, but that doesn't exist, so we're stuck with the identity here.
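For illustration, here is a small runnable sketch of that groupMapReduce call (Scala 2.13+). The sample data and the name merged are made up for the example. groupMapReduce returns a Map keyed by category, so .values is used to get back to a list, and the order of that list is not guaranteed.

```scala
case class Category(category: String, amount: Double)

// Hypothetical sample data, chosen so the sums are exact.
val myList = List(
  Category("Others", 22.5),
  Category("Others", 77.5),
  Category("Fruit", 1.0)
)

val merged: List[Category] =
  myList
    .groupMapReduce(_.category)(identity) { (a, b) =>
      // Both elements share the same category, so keep it and add the amounts.
      Category(a.category, a.amount + b.amount)
    }
    .values
    .toList
```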

Scala: Combine some elements in a list when they have the same property

How do I combine some elements in a list when they have the same property?
E.g. let say I have the following:
case class Foo(year: Int, amount: Int)
val list = List(Foo(2015, 10), Foo(2015, 15), Foo(2019, 55))
How do I transform list into List(Foo(2015, 25), Foo(2019, 55)) the Scala way?
As you can see, both Foo(2015, 10) and Foo(2015, 15) are merged into Foo(2015, 25).
There is a similar question, "Combining elements in the same list", but that's for C#/LINQ.
If you're on Scala 2.13+, consider using groupMapReduce:
list.groupMapReduce(_.year)(_.amount)(_ + _).
map{ case (y, a) => Foo(y, a) }
// res1: scala.collection.immutable.Iterable[Foo] = List(Foo(2019,55), Foo(2015,25))
Use groupBy to arrange list by year, then map over the results to get it in the proper shape and sum the amount of each Foo.
scala> list.groupBy(foo => foo.year).map(m => Foo(m._1, m._2.map(foo => foo.amount).sum))
res5: scala.collection.immutable.Iterable[Foo] = List(Foo(2015,25), Foo(2019,55))
Just a Refactoring of Brian's answer. I use Pattern Matching to properly name the values.
I think it helps read the code.
list
.groupBy{case Foo(year, _) => year}
.map{ case (year, foos) =>
Foo(year,
foos.map{ case Foo(_, amount) => amount}.sum)
}
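Note that the two REPL results above list the merged elements in different orders, because the intermediate Map has no guaranteed ordering. If a predictable order matters, one option is to sort afterwards; the sketch below assumes that ordering by year is what is wanted, and mergedByYear is just an illustrative name.

```scala
case class Foo(year: Int, amount: Int)

val list = List(Foo(2015, 10), Foo(2015, 15), Foo(2019, 55))

val mergedByYear: List[Foo] =
  list
    .groupBy(_.year)
    .map { case (year, foos) => Foo(year, foos.map(_.amount).sum) }
    .toList
    .sortBy(_.year) // make the order explicit instead of relying on Map iteration order
```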

Merge two lists which contains case class objects scala

I have two lists which contains case class objects
case class Balance(id: String, in: Int, out: Int)
val l1 = List(Balance("a", 0, 0), Balance("b", 10, 30), Balance("c", 20, 0))
val l2 = List(Balance("a", 10, 0), Balance("b", 40, 0))
I want to sum up the in and out values of the elements that share an id, and combine the lists like below:
List(Balance(a, 10, 0), Balance(b, 50, 30), Balance(c, 20, 0))
I have come up with the following solution:
// create list of tuples with 'id' as key
val a = l1.map(b => (b.id, (b.in, b.out)))
val b = l2.map(b => (b.id, (b.in, b.out)))
// combine the lists
val bl = (a ++ b).groupBy(_._1).mapValues(_.unzip._2.unzip match {
case (ll1, ll2) => (ll1.sum, ll2.sum)
}).toList.map(b => Balance(b._1, b._2._1, b._2._2))
// output
// List(Balance(a, 10, 0), Balance(b, 50, 30), Balance(c, 20, 0))
Is there any shorter way to do this?
You don't really need to create the tuple lists.
(l1 ++ l2).groupBy(_.id)
.mapValues(_.foldLeft((0,0)){
case ((a,b),Balance(id,in,out)) => (a+in,b+out)})
.map{
case (k,(in,out)) => Balance(k,in,out)}
.toList
// res0: List[Balance] = List(Balance(b,50,30), Balance(a,10,0), Balance(c,20,0))
You'll note that the result appears out of order because of the intermediate representation as a Map, which, by definition, has no order.
Another approach would be to add a Semigroup instance for Balance and use that for the combine logic. The advantage of this is that the code is in one place only, rather than sprinkled wherever you need to combine lists or maps of Balances.
So, you first add the instance:
import cats.Semigroup
import cats.implicits._
implicit val semigroupBalance : Semigroup[Balance] = new Semigroup[Balance]
{
override def combine(x: Balance, y: Balance): Balance =
if(x.id == y.id) // I am arbitrarily deciding this: you can adapt the logic to your
// use case, but if you only need it in the scenario you asked for,
// the case where y.id and x.id are different will never happen.
Balance(x.id, x.in + y.in, x.out + y.out)
else x
}
Then, the code to combine multiple lists becomes simpler (using your example data):
(l1 ++ l2).groupBy(_.id).mapValues(_.reduce(_ |+| _)) //Map(b -> Balance(b,50,30), a -> Balance(a,10,0), c -> Balance(c,20,0))
N.B. As @jwvh already noted, the result will not be in order, in this simple case, because of the default unordered Map that groupBy returns. That could be fixed, if needed.
N.B. You might want to use Monoid instead of Semigroup, if you have a meaningful empty value for Balance.
For those who need to merge two lists of case class objects while keeping the result ordered by id (which matches the original ordering in this example), here's my solution, which is based on jwvh's answer to this question and this answer.
import scala.collection.immutable.SortedMap
val mergedList: List[Balance] = l1 ++ l2
val sortedListOfBalances: List[Balance] =
SortedMap(mergedList.groupBy(_.id).toSeq:_*)
.mapValues(_.foldLeft((0,0)){
case ((a,b),Balance(id,in,out)) => (a+in,b+out)
})
.map{
case (k,(in,out)) => Balance(k,in,out)
}
.toList
This will return List(Balance(a,10,0), Balance(b,50,30), Balance(c,20,0)) while when not using SortedMap we get List(Balance(b,50,30), Balance(a,10,0), Balance(c,20,0)).
A Map iterates in an unspecified order unless we specifically use a SortedMap (or one of its subtypes), which iterates in key order.
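SortedMap orders by key, which only coincides with the input order when the ids are already sorted. If the goal is to keep ids in the order they first appear, a sketch along the following lines might work. It uses only the standard library; summed and inFirstSeenOrder are illustrative names, not part of the answers above.

```scala
case class Balance(id: String, in: Int, out: Int)

val l1 = List(Balance("a", 0, 0), Balance("b", 10, 30), Balance("c", 20, 0))
val l2 = List(Balance("a", 10, 0), Balance("b", 40, 0))

val all = l1 ++ l2

// Sum in and out per id; the resulting Map itself has no guaranteed order.
val summed: Map[String, Balance] =
  all.groupBy(_.id).map { case (id, bs) =>
    id -> Balance(id, bs.map(_.in).sum, bs.map(_.out).sum)
  }

// Walk the ids in first-appearance order and look up each summed Balance.
val inFirstSeenOrder: List[Balance] = all.map(_.id).distinct.map(summed)
```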

Update (or Replace) item(s) in immutable collection in Scala

What is the best practice to update (or replace) an item in a Seq?
case class Minion(id: Int, name: String, motivation: Int)
val minions: Seq[Minion] = Seq(
Minion(1, "Bob", 50),
Minion(2, "Kevin", 50),
Minion(3, "Stuart", 50))
I'd like to obtain a new collection:
Seq(
Minion(1, "Bob", 50),
Minion(2, "Kevin", 50),
Minion(3, "Stuart", 100))
What's the best way?
Use updated:
// first argument is index (zero-based) - so using 2 to replace 3rd item:
scala> minions.updated(2, Minion(3, "Stuart", 100))
res0: Seq[Minion] = List(Minion(1,Bob,50), Minion(2,Kevin,50), Minion(3,Stuart,100))
Or, without repeating the unchanged attributes of the new Minion:
scala> minions.updated(2, minions(2).copy(motivation = 100))
res1: Seq[Minion] = List(Minion(1,Bob,50), Minion(2,Kevin,50), Minion(3,Stuart,100))
map also works, and might be a little bit easier to read than updated:
minions.map {
case m @ Minion(3, _, _) => m.copy(motivation = 100) // matches on id 3 (Stuart), not on the index
case m => m
}
One benefit of this over updated besides readability is that you can modify several elements in one go.
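To illustrate modifying several elements in one go, here is a minimal sketch that reuses the question's Minion data. The choice of ids 1 and 3 and the name boosted are arbitrary assumptions made for the example.

```scala
case class Minion(id: Int, name: String, motivation: Int)

val minions: Seq[Minion] = Seq(
  Minion(1, "Bob", 50),
  Minion(2, "Kevin", 50),
  Minion(3, "Stuart", 50)
)

// One pass over the collection: every minion whose id matches gets a new
// motivation via copy; all others are returned unchanged.
val boosted: Seq[Minion] = minions.map {
  case m if m.id == 1 || m.id == 3 => m.copy(motivation = 100)
  case m                           => m
}
```

The original minions value is left untouched, since map builds a new collection.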

Scala, finding max value in arrays

First time I've had to ask a question here, there is not enough info on Scala out there for a newbie like me.
Basically what I have is a file filled with hundreds of thousands of lists formatted like this:
(type, date, count, object)
Rows look something like this:
(food, 30052014, 400, banana)
(food, 30052014, 2, pizza)
All I need to do is find the one row with the highest count.
I know I did this a couple of months ago but can't seem to wrap my head around it now. I'm sure I can do this without a function too. All I want to do is set a value and put that row in it but I can't figure it out.
I think basically what I want to do is a Math.max on the 3rd element in the lists, but I just can't get it.
Any help will be kindly appreciated. Sorry if my wording or formatting of this question isn't the best.
EDIT: There's some extra info I've left out that I should probably add:
All the records are stored in a tsv file. I've done this to split them:
val split_food = food.map(_.split("/t"))
so basically I think I need to use split_food... somehow
Modified version of @Szymon's answer with your edit addressed:
val split_food = food.map(_.split("\t")) // note: "\t" is the tab character; "/t" would split on the literal text /t
val max_food = split_food.maxBy(tokens => tokens(2).toInt)
or, analogously:
val max_food = split_food.maxBy { case Array(_, _, count, _) => count.toInt }
If you're using Apache Spark's RDD, which offers only a limited subset of the usual Scala collection methods, you can go with reduce:
val max_food = split_food.reduce { (max: Array[String], current: Array[String]) =>
val curCount = current(2).toInt
val maxCount = max(2).toInt // you probably would want to preprocess all items,
// so .toInt will not be called again and again
if (curCount > maxCount) current else max
}
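Following up on the comment about preprocessing, the sketch below parses each row once and then picks the maximum, so toInt is not called repeatedly. The two sample rows are taken from the question and assumed to be tab-separated; parsed, maxRow, and viaReduce are illustrative names.

```scala
// Sample rows from the question, assumed to be tab-separated.
val food = List(
  "food\t30052014\t400\tbanana",
  "food\t30052014\t2\tpizza"
)

// Parse once: pair each row's fields with its count as an Int.
val parsed: List[(Int, Array[String])] =
  food.map(_.split("\t")).map(fields => (fields(2).toInt, fields))

// maxBy on the pre-parsed count.
val (maxCount, maxRow) = parsed.maxBy(_._1)

// The same result with reduce, for collections that lack maxBy.
val viaReduce = parsed.reduce((best, cur) => if (cur._1 > best._1) cur else best)

println(maxRow.mkString("(", ", ", ")"))
```

Both maxBy and reduce throw on an empty collection, so an empty input would need separate handling.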
You should use the maxBy function:
case class Purchase(category: String, date: Long, count: Int, name: String)
object Purchase {
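// Parses one tab-separated row. The explicit result type is required because apply is overloaded.
def apply(s: String): Purchase = s.split("\t") match {
case Array(cat, date, count, name) => Purchase(cat, date.toLong, count.toInt, name)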
}
}
foodRows.map(row => Purchase(row)).maxBy(_.count)
Simply:
case class Record(food:String, date:String, count:Int)
val l = List(Record("ciccio", "x", 1), Record("buffo", "y", 4), Record("banana", "z", 3))
l.maxBy(_.count)
>>> res8: Record = Record(buffo,y,4)
Not sure if you got the answer yet but I had the same issues with maxBy. I found once I ran the package... import scala.io.Source I was able to use maxBy and it worked.