Merge info from one list to another in Scala

I have two lists:
val generalInfo: List[GeneralInfo]
val countInfo: List[CountInfo]
case class GeneralInfo(id: String, source: String, languages: Array[String], var count: BigDecimal)
case class CountInfo(id: String, count: BigDecimal)
Every GeneralInfo object is initialized with count = 0;
I need to fill in the count variable of each GeneralInfo object with the count value from the CountInfo object that has the same id. (Not all the ids in the GeneralInfo list appear in the CountInfo list.)
I am quite new to Scala. Is there an elegant way to do this without using a dictionary?

If you know that there's a one-to-one relation, you can go through every CountInfo, find the corresponding GeneralInfo object, and set the count for that GeneralInfo.
countInfo.foreach(c => generalInfo.find(_.id == c.id).foreach(_.count = c.count))
If one id can be shared by many GeneralInfo objects, use filter instead of find:
countInfo.foreach(c =>
generalInfo.filter(_.id == c.id).foreach(_.count = c.count)
)
You can also do it the other way:
generalInfo.foreach(g => countInfo.find(_.id == g.id).foreach(c => g.count = c.count))
Demo in Scastie
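To make the first variant concrete, here is a minimal self-contained sketch. The sample ids, sources and counts are made up for illustration, and `languages` is a `List` here (an assumption, not the question's `Array`) only so the values print and compare nicely. The update is a side effect, so the sketch uses `foreach` on the `Option` returned by `find`:

```scala
object MergeCountsDemo {
  // Same shape as in the question; `count` stays a var so it can be updated in place.
  case class GeneralInfo(id: String, source: String, languages: List[String], var count: BigDecimal)
  case class CountInfo(id: String, count: BigDecimal)

  // Hypothetical sample data: id "c" has no matching CountInfo.
  val generalInfo: List[GeneralInfo] = List(
    GeneralInfo("a", "web", List("en"), BigDecimal(0)),
    GeneralInfo("b", "api", List("de", "fr"), BigDecimal(0)),
    GeneralInfo("c", "web", List("es"), BigDecimal(0))
  )
  val countInfo: List[CountInfo] =
    List(CountInfo("a", BigDecimal(10)), CountInfo("b", BigDecimal(25)))

  // For every CountInfo, update the first GeneralInfo with the same id, if there is one.
  countInfo.foreach { c =>
    generalInfo.find(_.id == c.id).foreach(g => g.count = c.count)
  }

  // Counts per id after the update, collected only to inspect the result.
  val countsById: Map[String, BigDecimal] =
    generalInfo.map(g => g.id -> g.count).toMap
}
```

Entries without a matching CountInfo simply keep their initial count of 0.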

Related

How can I group by the individual elements of a list of elements in Scala

Forgive me if I'm not naming things by their actual name, I've just started to learn Scala. I've been looking around for a while, but can not find a clear answer to my question.
Suppose I have a list of objects, each object has two fields: x: Int and l: List[String], where the Strings, in my case, represent categories.
The l lists can be of arbitrary length, so an object can belong to multiple categories. Furthermore, various objects can belong to the same category. My goal is to group the objects by the individual categories, where the categories are the keys. This means that if an object is linked to say "N" categories, it will occur in "N" of the key-value pairs.
So far I managed to groupBy the lists of categories through:
objectList.groupBy(x => x.l)
However, this obviously groups the objects by list of categories rather than by categories.
I'm trying to do this with immutable collections avoiding loops etc.
If anyone has some ideas that would be much appreciated!
Thanks
EDIT:
By request the actual case class and what I am trying.
case class Car(make: String, model: String, fuelCapacity: Option[Int], category:Option[List[String]])
Once again, a car can belong to multiple categories. Let's say List("SUV", "offroad", "family").
I want to group by category elements rather than by the whole list of categories, and have the fuelCapacity as the values, in order to be able to extract average fuelCapacity per category amongst other metrics.
Using your EDIT as a guide.
case class Car( make: String
, model: String
, fuelCapacity: Option[Int]
, category:Option[List[String]] )
val cars: List[Car] = ???
//all currently known category strings
val cats: Set[String] = cars.flatMap(_.category).flatten.toSet
//category -> list of cars in this category
val catMap: Map[String,List[Car]] =
cats.map(cat => (cat, cars.filter(_.category.exists(_.contains(cat))))).toMap
//category -> average fuel capacity for cars in this category
val fcAvg: Map[String,Double] =
catMap.map{case (cat, cars) =>
val fcaps: List[Int] = cars.flatMap(_.fuelCapacity)
if (fcaps.lengthIs < 1) (cat, -1d)
else (cat, fcaps.sum.toDouble / fcaps.length)
}
Something like the following?
objectList // Seq[YourType]
.flatMap(o => o.l.map(c => c -> o)) // Seq[(String, YourType)]
.groupBy { case (c,_) => c } // Map[String,Seq[(String,YourType)]]
.mapValues { items => items.map { case (_, o) => o } } // Map[String, Seq[YourType]]
(Deliberately "heavy" to help you understand the idea behind it)
EDIT, or as of Scala 2.13 thanks to groupMap:
objectList // Seq[YourType]
.flatMap(o => o.l.map(c => c -> o)) // Seq[(String, YourType)]
.groupMap { case (c,_) => c } { case (_, o) => o } // Map[String,Seq[YourType]]
You are very close. You just need to split each object into its individual categories before grouping, for example like this:
// I used a Set instead of a List, since I don't think
// the order of categories matters, and having the same
// category twice doesn't make sense either.
final case class MyObject(x: Int, categories: Set[String] = Set.empty) {
def addCategory(category: String): MyObject =
this.copy(categories = this.categories + category)
}
def groupByCategories(data: List[MyObject]): Map[String, List[Int]] =
data
.flatMap(o => o.categories.map(c => c -> o.x))
.groupMap(_._1)(_._2)
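Applied to the Car case class from the question's EDIT, the same flatten-then-group idea might look like the sketch below. It assumes Scala 2.13+ (for groupMap); the sample cars are hypothetical, and cars without a fuel capacity or without categories are simply skipped, which is one possible choice rather than the only one:

```scala
object FuelByCategoryDemo {
  case class Car(make: String, model: String, fuelCapacity: Option[Int], category: Option[List[String]])

  // Hypothetical sample data.
  val cars: List[Car] = List(
    Car("Make1", "ModelA", Some(80), Some(List("SUV", "offroad", "family"))),
    Car("Make2", "ModelB", Some(60), Some(List("family"))),
    Car("Make3", "ModelC", None, Some(List("SUV"))),
    Car("Make4", "ModelD", Some(50), None)
  )

  // One (category, capacity) pair per category of each car that has a known capacity.
  val pairs: List[(String, Int)] =
    for {
      car      <- cars
      capacity <- car.fuelCapacity.toList
      cat      <- car.category.getOrElse(Nil)
    } yield cat -> capacity

  // Group the capacities by category, then average each (non-empty) group.
  val avgByCategory: Map[String, Double] =
    pairs
      .groupMap(_._1)(_._2)
      .map { case (cat, caps) => cat -> caps.sum.toDouble / caps.size }
}
```

Because the pairs are built first, every group is non-empty by construction, so no sentinel value is needed for categories without data.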

Passing a method and parameters to a Scala case class?

I am parsing an XML document and store its data in various other structured document formats. In this XML document, the elements reference other elements, such as:
<myCar id="12" name="Porsche XYZ" ...>
<connected refId="3" />
</myCar>
...
<myCar id="3" name="Audi XYZ" ...>
...
</myCar>
Here, refId maps to id. When creating the myCar instance with id 12, I cannot yet reference the myCar with id 3, because it has not been parsed.
Obviously, the easy solution would be to parse the document twice (instantiate references in the second run, after all elements have been parsed and created). However, for performance reasons I only want to parse the document once. Thus, I thought I could just store the relevant reference data in a case class and build a list of instances that is passed from one method to another, in order to process it after having parsed the entire document.
However, my problem is that the logic for creating the references varies to a great extent. So, I cannot use something like this:
case class Ref (a: String, b: String)
val refs: List[Ref] = List.empty
// 1. fill the list with references during parsing
// 2. after parsing the document, process all references in the list
I think what I need is to move all my reference creation logic to separate methods, and then when parsing the document maintain a list with "pointers" to these methods including the appropriate parameters. In this way, I could just iterate through the list after parsing the entire document, and call every method with the correct parameters.
How can this be achieved?
I'm not 100% sure what you're asking, but I think you want a way to link connected cars to the base case class, so that after parsing the XML you have a list of cars and, for any given car, can resolve the refId attribute of its connected tag to a full car object.
Here is a simple approach:
given:
val xml = <root>
<myCar id="12" name="Porsche XYZ">
<connected refId="3" />
</myCar>
<myCar id="3" name="Audi XYZ">
</myCar>
</root>
First we'll make a case class to model a myCar:
case class Car(
id: String,
name: String,
connectedId: Option[String]
)
Then we parse the XML into Car instances. I'm going to parse it into a Map[String, Car] where the key is Car.id:
val result = (xml \ "myCar").foldLeft(Map.empty[String, Car]) {
case (acc, next) =>
val id: String = (next \@ "id")
val name = (next \@ "name")
val connectionStr = (next \ "connected" \@ "refId")
val connection = Option.unless(connectionStr.isEmpty)(connectionStr)
val car = Car(
id,
name,
connection
)
acc + (id -> car)
}
Next we need a way to turn connectedId into an actual car. I did this by adding a method to Car changing the case class to:
case class Car(
id: String,
name: String,
connectedId: Option[String]
) {
def getConnected(cars: Map[String, Car]): Option[Car] = {
connectedId.flatMap { id =>
cars.get(id)
}
}
}
This method (getConnected) takes the Map produced in the previous step.
Get the list of cars with:
result.values // Iterable(Car(12,Porsche XYZ,Some(3)), Car(3,Audi XYZ,None))
Get the connected car for the first car in the list:
result.values.head.getConnected(result) // Some(Car(3,Audi XYZ,None))
If you want to "fill in" the connected cars add a field to hold the connected car (pass None in the initial foldLeft above):
case class Car(
id: String,
name: String,
connectedId: Option[String],
connected: Option[Car],
) {
def getConnected(cars: Map[String, Car]): Option[Car] = {
connectedId.flatMap { id =>
cars.get(id)
}
}
}
Then just map over the list, adding the connections:
result.values
  .map { car =>
    // getConnected already handles a missing connectedId, so no extra flatMap is needed
    car.copy(connected = car.getConnected(result))
  }
This produces:
List(Car(12,Porsche XYZ,Some(3),Some(Car(3,Audi XYZ,None,None))), Car(3,Audi XYZ,None,None))
This does not recursively fill in the connected cars. To do that, you'd have to either make this step recursive somehow, or use a var in Car to track the connected car and modify references instead of using .copy. I haven't thought about this too much though.
Full working code here: https://scastie.scala-lang.org/YuhNdszQROKTaNExMchaCg
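The question also asks literally for a list of "pointers" to methods together with their parameters. In Scala that can be modelled as a list of closures of type `() => Unit`, each capturing whatever it needs and run once parsing is done. The sketch below is only an illustration under assumptions: the mutable `Car` class, the `parsedElements` list (standing in for the XML parser's output) and all other names are hypothetical, not taken from the question.

```scala
import scala.collection.mutable

object DeferredRefsDemo {
  // Mutable on purpose: `connected` is filled in after parsing has finished.
  final class Car(val id: String, val name: String) {
    var connected: Option[Car] = None
  }

  // Hypothetical stand-in for the parser's output: (id, name, optional refId), in document order.
  val parsedElements: List[(String, String, Option[String])] =
    List(("12", "Porsche XYZ", Some("3")), ("3", "Audi XYZ", None))

  val carsById = mutable.Map.empty[String, Car]
  // Each entry is a closure that captures its own parameters and is run later.
  val pending = mutable.ListBuffer.empty[() => Unit]

  // Single pass: create the cars and queue the reference resolution.
  parsedElements.foreach { case (id, name, refId) =>
    val car = new Car(id, name)
    carsById += (id -> car)
    refId.foreach { ref =>
      pending += (() => car.connected = carsById.get(ref))
    }
  }

  // After the pass every id is known, so the queued closures can run.
  pending.foreach(action => action())
}
```

Since each closure can contain arbitrary logic, different kinds of references can queue different actions in the same list, and the document is still parsed only once.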

Scala, create a bidimensional array to manage users

I have a problem. I'm having trouble creating a method that takes a user's name as input and keeps track of how many times each user is named. Basically I have this sequence as data:
private var players: Seq[GamePlayer] = _
and game player:
case class GamePlayer(override val id: String, username: String, override val actorRef: ActorRef) extends Player
From this sequence I have to create a method that, given a user's name as input, builds a two-dimensional array Array[String][Int] in which each name is associated with the number of times that user is named. Any ideas about it?
I managed to write the code below, but I just don't know how to create and manage a two-dimensional array in Scala.
Sorry, but I'm new to Scala and I'm only starting to understand things now ^^"
Thanks
private def manageVote(username: String): Unit = {
//var matrixOfVotes = Array.ofDim[String][Int](this.numberOfPlayers,2)
var numberOfVotes = this.numberOfPlayers
var votes = new Array[Int](this.numberOfPlayers)
isEmpty(username) match {
case true => numberOfVotes=numberOfVotes-1
case false => numberOfVotes=numberOfVotes-1
votes.add(players.indexOf(username))
}
}
and isEmpty:
private def isEmpty(x: String) = Option(x).forall(_.isEmpty)
First of all, there is no such thing in Java/Scala arrays as the Array[String][Int] you describe. A two-dimensional array is, as the name suggests, an array of arrays, like Array[Array[Int]].
What you want is a Map[String, Int]. If you would like to know how many times a player with a specific name appears in Seq[GamePlayer], you can just use the groupBy operation, like:
players.groupBy(_.username).map {
case(username, players) => username -> players.size
}
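As a runnable illustration of that snippet, the sketch below uses a simplified GamePlayer without the ActorRef and the Player trait (an assumption made only so the example is self-contained) and a few made-up players. It also shows an equivalent single-pass form for Scala 2.13+ using groupMapReduce:

```scala
object VoteCountDemo {
  // Simplified stand-in for the question's GamePlayer (no ActorRef, no Player trait).
  case class GamePlayer(id: String, username: String)

  // Hypothetical sample data.
  val players: Seq[GamePlayer] = Seq(
    GamePlayer("1", "alice"),
    GamePlayer("2", "bob"),
    GamePlayer("3", "alice")
  )

  // groupBy, then take the size of each group.
  val counts: Map[String, Int] =
    players.groupBy(_.username).map { case (username, group) => username -> group.size }

  // Scala 2.13+: map every player to 1 and sum the ones per username.
  val counts213: Map[String, Int] =
    players.groupMapReduce(_.username)(_ => 1)(_ + _)
}
```

Looking up a single name is then `counts.getOrElse(name, 0)`, which also covers names that never occur.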

How to combine two objects in a List by summing a member

Given this case class:
case class Categories(fruit: String, amount: Double, mappedTo: String)
I have a list containing the following:
List(
Categories("Others",22.38394964594807,"Others"),
Categories("Others",77.6160503540519,"Others")
)
I want to combine two elements in the list by summing up their amount if they are in the same category, so that the end result in this case would be:
List(Categories("Others",99.99999999999997,"Others"))
How can I do that?
Since groupMapReduce was introduced in Scala 2.13, I'll try to provide another approach to Martinjn's great answer.
Assuming we have:
case class Categories(Fruit: String, amount: Double, mappedTo: String)
val categories = List(
Categories("Apple",22.38394964594807,"Others"),
Categories("Apple",77.6160503540519,"Others")
)
If you want to aggregate by both mappedTo and Fruit:
val result = categories.groupBy(c => (c.Fruit, c.mappedTo)).map {
case ((fruit, mappedTo), categories) => Categories(fruit, categories.map(_.amount).sum, mappedTo)
}
Code run can be found at Scastie.
If you want to aggregate only by mappedTo and keep an arbitrary Fruit (here, the first one in each group), you can do:
val result = categories.groupBy(c => c.mappedTo).map {
case (mappedTo, categories) => Categories(categories.head.Fruit, categories.map(_.amount).sum, mappedTo)
}
Code run can be found at Scastie
You want to group your list entries by category, and reduce them to a single value. There is groupMapReduce for that, which groups entries, and then maps the group (you don't need this) and reduces the group to a single value.
given
case class Category(category: String, amount: Double)
if you have a val myList: List[Category], then you want to group on Category#category, and reduce them by merging the members, summing up the amount.
that gives
myList.groupMapReduce(_.category)(identity) { // group by category; the map step is the identity, since nothing needs transforming
  // reduce: combine two elements by keeping the name and summing the amounts
  case (Category(name, amount1), Category(_, amount2)) =>
    Category(name, amount1 + amount2)
} // yields a Map[String, Category]; call .values.toList to get a List[Category]
In theory just a groupReduce would have been enough, but that doesn't exist, so we're stuck with the identity here.
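For the three-field Categories class from the question, one way to put the map step to use is to extract only the amount and rebuild the instances afterwards. This is a sketch assuming Scala 2.13+ and a composite (fruit, mappedTo) key; the sample amounts are made up and chosen so that the sums are exact in floating point:

```scala
object SumByCategoryDemo {
  case class Categories(fruit: String, amount: Double, mappedTo: String)

  // Hypothetical sample data.
  val categories: List[Categories] = List(
    Categories("Others", 22.5, "Others"),
    Categories("Others", 77.5, "Others"),
    Categories("Apple", 1.0, "Fruit")
  )

  // Group on (fruit, mappedTo), keep only the amount, and sum the amounts per key.
  val totals: Map[(String, String), Double] =
    categories.groupMapReduce(c => (c.fruit, c.mappedTo))(_.amount)(_ + _)

  // Rebuild case class instances from the keys and their totals.
  val combined: List[Categories] =
    totals.map { case ((fruit, mappedTo), amount) => Categories(fruit, amount, mappedTo) }.toList
}
```

The order of `combined` follows the Map's iteration order, so sort it explicitly if a stable order matters.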

Scala : How to pass a class field into a method

I'm new to Scala and attempting to do some data analysis.
I have a CSV file with a few headers, let's say item no., item type, month, items sold.
I have made an Item class with the fields of the headers.
I split the CSV into a list with each iteration of the list being a row of the CSV file being represented by the Item class.
I am attempting to make a method that will create maps based off of the parameter I send in. For example if I want to group the items sold by month, or by item type. However I am struggling to send the Item.field into a method.
For example, what I am attempting is something like:
makemaps(Item.month);
makemaps(Item.itemtype);
def makemaps(Item.field):
if (item.field==Item.month){}
else (if item.field==Item.itemType){}
However my logic for this appears to be wrong. Any ideas?
def makeMap[T](items: Iterable[Item])(extractKey: Item => T): Map[T, Iterable[Item]] =
items.groupBy(extractKey)
So given this example Item class:
case class Item(month: String, itemType: String, quantity: Int, description: String)
You could have (the explicit type arguments should be optional here, since T can be inferred from the extractor function):
val byMonth = makeMap[String](items)(_.month)
val byType = makeMap[String](items)(_.itemType)
val byQuantity = makeMap[Int](items)(_.quantity)
val byDescription = makeMap[String](items)(_.description)
Note that _.month, for instance, creates a function taking an Item which results in the String contained in the month field (simplifying a little).
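Put together as a runnable sketch, with a few made-up rows standing in for the parsed CSV (the sample values and the `soldPerMonth` name are assumptions for illustration), grouping by month and then totalling the items sold could look like this:

```scala
object MakeMapDemo {
  case class Item(month: String, itemType: String, quantity: Int, description: String)

  // Groups the items by whatever key the extractor function returns.
  def makeMap[T](items: Iterable[Item])(extractKey: Item => T): Map[T, Iterable[Item]] =
    items.groupBy(extractKey)

  // Hypothetical rows standing in for the parsed CSV.
  val items: List[Item] = List(
    Item("Jan", "book", 3, "paperback"),
    Item("Jan", "pen", 10, "blue ink"),
    Item("Feb", "book", 5, "hardcover")
  )

  val byMonth: Map[String, Iterable[Item]] = makeMap[String](items)(_.month)
  val byType: Map[String, Iterable[Item]]  = makeMap[String](items)(_.itemType)

  // Items sold per month: sum the quantities within each group.
  val soldPerMonth: Map[String, Int] =
    byMonth.map { case (month, group) => month -> group.map(_.quantity).sum }
}
```

The same `makeMap` call works for any field, so the if/else on the field name from the question is not needed.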
You could, if so inclined, save the functions used for extracting keys in the companion object:
object Item {
val month: Item => String = _.month
val itemType: Item => String = _.itemType
val quantity: Item => Int = _.quantity
val description: Item => String = _.description
// Allows us to determine if using a predefined extractor or using an ad hoc one
val extractors: Set[Item => Any] = Set(month, itemType, quantity, description)
}
Then you can pass those around like so:
val byMonth = makeMap[String](items)(Item.month)
The only real change semantically is that you explicitly avoid possible extra construction of lambdas at runtime, at the cost of having the lambdas stick around in memory the whole time. A fringe benefit is that you might be able to cache the maps by extractor if you're sure that the source Items never change: for lambdas, equality is reference equality. This might be particularly useful if you have some class representing the collection of Items as opposed to just using a standard collection, like so:
import scala.collection.immutable

object Items {
  def makeMap[T](items: Iterable[Item])(extractKey: Item => T): Map[T, Iterable[Item]] =
    items.groupBy(extractKey)
}
class Items(val underlying: immutable.Seq[Item]) {
def makeMap[T](extractKey: Item => T): Map[T, Iterable[Item]] =
if (Item.extractors.contains(extractKey)) {
if (extractKey == Item.month) groupedByMonth.asInstanceOf[Map[T, Iterable[Item]]]
else if (extractKey == Item.itemType) groupedByItemType.asInstanceOf[Map[T, Iterable[Item]]]
else if (extractKey == Item.quantity) groupedByQuantity.asInstanceOf[Map[T, Iterable[Item]]]
else if (extractKey == Item.description) groupedByDescription.asInstanceOf[Map[T, Iterable[Item]]]
else throw new AssertionError("Shouldn't happen!")
} else {
Items.makeMap(underlying)(extractKey)
}
lazy val groupedByMonth = Items.makeMap[String](underlying)(Item.month)
lazy val groupedByItemType = Items.makeMap[String](underlying)(Item.itemType)
lazy val groupedByQuantity = Items.makeMap[Int](underlying)(Item.quantity)
lazy val groupedByDescription = Items.makeMap[String](underlying)(Item.description)
}
(that is almost certainly a personal record for asInstanceOfs in a small block of code... I'm not sure if I should be proud or ashamed of this snippet)