Scala map extract value repeated maximum times - scala

I have a Map[String,String], so the keys are distinct but most values are repeated.
For example: Map("car" -> "This is a car", "truck" -> "This is a car", "fruit" -> "This is a fruit")
It should return "This is a car" because that value is repeated twice.

I did something like this. Hope it helps.
val j = x.groupBy(_._2)
Then
j.maxBy(_._2.size)
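where x is your original map. The first call returns a Map from each distinct value to the sub-map of entries that share that value. The second call picks the key-value pair whose sub-map has the most entries, so the first element of that pair is the most repeated value.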

val m1 = Map("this" -> "that", "what" -> "that", "who" -> "me", "you" -> "who")
m1.groupBy(_._2).maxBy(_._2.size)
res0: ... = (that,Map(this -> that, what -> that))
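If only the value itself is needed, the first element of that pair can be taken directly. The following is a minimal sketch of that idea. The helper name mostFrequentValue is made up for illustration, and it returns an Option because maxBy throws on an empty collection.

```scala
// Hypothetical helper: the most repeated value of a map, or None if the map is empty.
def mostFrequentValue[K, V](m: Map[K, V]): Option[V] =
  if (m.isEmpty) None
  else Some(m.groupBy(_._2).maxBy(_._2.size)._1) // key of the largest group

val x = Map("car" -> "This is a car", "truck" -> "This is a car", "fruit" -> "This is a fruit")
val mostFrequent = mostFrequentValue(x)
```

When several values are tied for the highest count, which one maxBy returns depends on the iteration order of the grouped map.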

Another solution
map.values.groupBy(t => t ).values.maxBy(_.size).head

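Not the most elegant, but my solution goes like this. It returns every value that occurs more than once:
val list = Map("car" -> "This is a car", "truck" -> "This is a car", "fruit" -> "This is a fruit")
// collect keeps only the values that are shared by more than one key
list.collect {
case (_, v) if list.count { case (_, value) => value == v } > 1 => v
}.toSet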

Related

Better way to create new key in map if key doesn't exist in Scala?

I have the following list of maps in Scala:
val list = List(Map( "age" -> 25, "city" -> "London", "last_name" -> "Smith"),
Map("city" -> "Berlin", "last_name" -> "Robinson"))
I wish to iterate through the list of maps and check whether the key "age" exists. If it doesn't, I want to create the entry and put it in the map. So far I have tried:
val tmp = list.map( item => if (!item.contains("age")) item.updated("age",5) else item)
print(tmp)
which works fine. I just wanted to know if there is a more efficient way to do it (perhaps with comprehensions or anything else). Any advice would be appreciated.
I'd do it this way.
val newList =
list.map{m => if (m.keySet("age")) m else m + ("age" -> "5")}
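Note that if the other values in a map are all Strings and you don't make the value 5 a String, then the result is widened to Map[String,Any], which is usually not what you want. (The maps in the question already mix Int and String values, so they are Map[String,Any] to begin with.)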
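Another way to express "add the key only if it is missing" is to merge each map onto a map of defaults, since with ++ the entries of the right-hand map win. This is only a sketch: the name defaults is introduced here for illustration, and the value type is declared as Any because the question's maps mix Int and String values.

```scala
val list: List[Map[String, Any]] = List(
  Map("age" -> 25, "city" -> "London", "last_name" -> "Smith"),
  Map("city" -> "Berlin", "last_name" -> "Robinson")
)

// Hypothetical defaults map; entries already present in m override these.
val defaults: Map[String, Any] = Map("age" -> 5)

val withDefaults = list.map(m => defaults ++ m)
```

This does the same work per map as the contains check, but it scales to several default keys without extra conditions.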

How to combine all the values with the same key in Scala?

I have a map like :
val programming = Map(("functional", 1) -> "scala", ("functional", 2) -> "perl", ("orientedObject", 1) -> "java", ("orientedObject", 2) -> "C++")
with the same first element of the key appearing multiple times.
How can I regroup all the values corresponding to the same first element of the key? This would turn the map into:
Map("functional" -> List("scala","perl"), "orientedObject" -> List("java","C++"))
UPDATE: This answer is based upon your original question. If you need the more complex Map definition, using a tuple as the key, then the other answers will address your requirements. You may still find this approach simpler.
As has been pointed out, you can't actually have the same key more than once in a map: a later entry replaces an earlier one with the same key. In the REPL, you'll note that your declaration becomes:
scala> val programming = Map("functional" -> "scala", "functional" -> "perl", "orientedObject" -> "java", "orientedObject" -> "C++")
programming: scala.collection.immutable.Map[String,String] = Map(functional -> perl, orientedObject -> C++)
So you end up missing some values. If you make this a List instead, you can get what you want as follows:
scala> val programming = List("functional" -> "scala", "functional" -> "perl", "orientedObject" -> "java", "orientedObject" -> "C++")
programming: List[(String, String)] = List((functional,scala), (functional,perl), (orientedObject,java), (orientedObject,C++))
scala> programming.groupBy(_._1).map(p => p._1 -> p._2.map(_._2)).toMap
res0: scala.collection.immutable.Map[String,List[String]] = Map(functional -> List(scala, perl), orientedObject -> List(java, C++))
Based on your edit, you have a data structure that looks something like this
val programming = Map(("functional", 1) -> "scala", ("functional", 2) -> "perl",
("orientedObject", 1) -> "java", ("orientedObject", 2) -> "C++")
and you want to scrap the numerical indices and group by the string key. Fortunately, Scala provides a built-in that gets you close.
programming groupBy { case ((k, _), _) => k }
This will return a new map which contains submaps of the original, grouped by the key that we return from the "partial" function. But we want a map of lists, so let's ignore the keys in the submaps.
programming groupBy { case ((k, _), _) => k } mapValues { _.values }
This gets us a map of... some kind of Iterable. But we really want lists, so let's take the final step and convert to a list.
programming groupBy { case ((k, _), _) => k } mapValues { _.values.toList }
You should try the .groupBy method
programming.groupBy(_._1._1)
and you will get
scala> programming.groupBy(_._1._1)
res1: scala.collection.immutable.Map[String,scala.collection.immutable.Map[(String, Int),String]] = Map(functional -> Map((functional,1) -> scala, (functional,2) -> perl), orientedObject -> Map((orientedObject,1) -> java, (orientedObject,2) -> C++))
you can now "clean" by doing something like:
scala> res1.mapValues(m => m.values.toList)
res3: scala.collection.immutable.Map[String,List[String]] = Map(functional -> List(scala, perl), orientedObject -> List(java, C++))
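On Scala 2.13 or later, where calling mapValues directly on a Map is deprecated in favour of .view.mapValues, the grouping and the clean-up can also be done in one step with groupMap. The following is only a sketch under that version assumption. Converting to a List first is one way to make the grouped values come out as Lists.

```scala
// Requires Scala 2.13+ (groupMap was added to the collections in 2.13).
val programming = Map(
  ("functional", 1) -> "scala", ("functional", 2) -> "perl",
  ("orientedObject", 1) -> "java", ("orientedObject", 2) -> "C++"
)

// First function: the grouping key. Second function: what to keep from each entry.
val grouped: Map[String, List[String]] =
  programming.toList.groupMap { case ((k, _), _) => k } { case (_, v) => v }
```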
Read the CSV file and create a map that contains each key and its list of values.
import scala.io.Source

val fileStream = getClass.getResourceAsStream("/keyvaluepair.csv")
val lines = Source.fromInputStream(fileStream).getLines()
// Turn each "key,value" line into a (key, value) pair
val codeMap: List[(String, String)] = lines.map { line =>
val cols = line.split(",").map(_.trim)
cols(0) -> cols(1)
}.toList
// Group the pairs by key and keep only the values
val res: Map[String, List[String]] = codeMap.groupBy(_._1).map(p => p._1 -> p._2.map(_._2))
Since no one has put in the specific ordering he asked for:
programming.groupBy(_._1._1)
.mapValues(_.toSeq.map { case ((t, i), l) => (i, l) }.sortBy(_._1).map(_._2))

Scala map and groupby a Map to another Map, with a smaller keyset

Let's say there is a Scala Map, e.g.:
val map: Map[String, List[String]] = Map("Apple" -> List("Red", "Tasty"), "Orange" -> List("Sour", "Orange"), "Banana" -> List("Yellow"), "Mango" -> List("Best", "Yellow", "Favorite"))
Now I want to convert it to the following map, which uses the length of each key (map._1.size) as the key instead, and groups together the values of keys that have the same length.
Map(5 -> List("Best", "Yellow", "Favorite", "Red", "Tasty"), 6 -> List("Sour", "Orange", "Yellow"))
So how to do it?
map.groupBy(_._1.length).map { case (length, m) =>
length -> m.values.flatten
}
First you group by length, and you'll get a Map[Int, Map[String, List[String]]].
Then you need to flatten those map values to get the final result. You keep the key (length).
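Put together with the data from the question, the two steps can be sketched as follows. The toList at the end is an addition here so that the values are Lists, as in the desired output, rather than a general Iterable. The name byKeyLength is illustrative.

```scala
val map: Map[String, List[String]] = Map(
  "Apple" -> List("Red", "Tasty"),
  "Orange" -> List("Sour", "Orange"),
  "Banana" -> List("Yellow"),
  "Mango" -> List("Best", "Yellow", "Favorite")
)

// Step 1: group the entries by key length.
// Step 2: flatten the value lists of each group into a single list.
val byKeyLength: Map[Int, List[String]] =
  map.groupBy(_._1.length).map { case (length, m) =>
    length -> m.values.flatten.toList
  }
```

The order of elements inside each list follows the iteration order of the map, so it should not be relied upon for larger maps.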
You just group by the size as you said and then process the resulting collections further.
For example:
map.groupBy(_._1.size).mapValues(_.values).mapValues(_.flatten)
Edit: Insan-e's answer is superior since it avoids one iteration through the collection, even if the ever-present Scala pattern matching makes it a bit more verbose.
Another possible solution could be:
scala> map.groupBy(_._1.size).map(x => (x._1,x._2.values.flatten))
res89: scala.collection.immutable.Map[Int,Iterable[String]] = Map(5 -> List(Red, Tasty, Best, Yellow, Favorite), 6 -> List(Sour, Orange, Yellow))
val result = map.map {case (string,list) => list.map( x => (string.length,x) ) }.flatten.groupBy(_._1).map{ case (int,list) => (int,list.map(_._2)) }

Scala modify mapValues Set of Strings Delete column

If I have a Map which maps from String to (String,String,String),
how can I remove the second String from the list, so that the Map would look like Map(String -> (String,String))?
Example:
var mp = Map(
"K1" -> List("K1_C1","K1_C2","K1_C3"),
"K2" -> List("K2_C1","K2_C2","K2_C3"),
"K3" -> List("K3_C1","K3_C2","K3_C3")
)
How can I reach this:
Map(
"K1" -> List("K1_C1","K1_C3"),
"K2" -> List("K2_C1","K2_C3"),
"K3" -> List("K3_C1","K3_C3")
)
This is what I've tried, but it didn't work:
mp.mapValues( _.map(_.drop(2)))
I've also tried to convert the (String,String,String) to a List, but it didn't work either:
mp.mapValues(_.map(_.toList.remove(2)))
This is pretty much my first time writing Scala because I have to, and I'm totally using it the way I write Java.
You were pretty close with the drop function, but I suggest you take a look at its documentation. It drops the given number of elements from the beginning of the list.
What you actually want is to take the first element and takeRight the last one:
mp.mapValues(list => list.take(1) ++ list.takeRight(1))
This is pretty ugly, however. If you are certain that your values are always 3-element lists, I suggest pattern matching:
mp.mapValues {
case List(first, _, third) => List(first, third)
}
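If the lists are not guaranteed to have exactly three elements, the pattern match would throw a MatchError on any other length. A hedged alternative is to remove the element by index with patch. This sketch assumes the column to delete is always at index 1, and it uses a plain map over the entries so that it does not depend on mapValues. The name withoutSecond is illustrative.

```scala
val mp = Map(
  "K1" -> List("K1_C1", "K1_C2", "K1_C3"),
  "K2" -> List("K2_C1", "K2_C2", "K2_C3"),
  "K3" -> List("K3_C1", "K3_C2", "K3_C3")
)

// patch(from, replacement, replaced): replace 1 element starting at index 1 with nothing
val withoutSecond = mp.map { case (k, v) => k -> v.patch(1, Nil, 1) }
```

A list that is too short to have an element at index 1 is returned unchanged rather than causing an error.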
It looks like your map has lists of tuples, not lists of strings. Something like this should work:
m.mapValues { case List((a,b,c)) => (a,c) }
or
m.mapValues { case List((a,b,c)) => List((a,c)) }
or
m.mapValues { case List((a,b,c)) => List(a,c) }
... depending on what type of output you want to end up with.

Filtering maps in iterator

I have the following code:
val rows: Iterator[Map[String,String]] = CSVDictReader(file.getInputStream)
val parsedProducts = rows.map(x => Product(name = x.get("NAME"), id = x.get("ID")))
I would like to get rid of map entries whose value is an empty string. With a map alone I could use:
filter(_._2.trim.nonEmpty)
I cannot get my head around how to do this in a nice way without introducing some helper function to return None in case value is empty string.
Edit: In my example I have only name and id but in the real code there are easily over ten columns of data. Also, I would need to have None instead of empty string value. So name=Option("") should be replaced with name=None
You can filter Options as well.
Let's say your x.get("NAME") returns a Some("") or even Some(" ").
Then you may do something like this: x.get("NAME").filter(_.trim.nonEmpty)
Hope I understood your question correctly
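Applied to every column, this could look like the sketch below. The case class ParsedProduct is a stand-in for the question's Product with Option fields, and the helper nonBlank is a made-up name. Unlike the one-liner above, the helper also trims the value it keeps, which is an assumption about what is wanted.

```scala
// Stand-in for the question's Product, assuming its fields are Options.
final case class ParsedProduct(name: Option[String], id: Option[String])

// Hypothetical helper: None if the key is missing or its value is blank.
def nonBlank(row: Map[String, String], key: String): Option[String] =
  row.get(key).map(_.trim).filter(_.nonEmpty)

val rows: Iterator[Map[String, String]] =
  Iterator(Map("NAME" -> " ", "ID" -> "foo"), Map("NAME" -> "Widget"))

val parsedProducts: List[ParsedProduct] =
  rows.map(row => ParsedProduct(nonBlank(row, "NAME"), nonBlank(row, "ID"))).toList
```

With ten or more columns, each field of the case class becomes one call to the same helper, so no per-column special casing is needed.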
Something like this?
val rows: Iterator[Map[String,String]] = CSVDictReader(file.getInputStream)
val parsedProducts = for {
row <- rows
name <- row.get("NAME")
id <- row.get("ID")
} yield Product(name, id)
Here, if row.get("NAME") or row.get("ID") returns None, the corresponding row will not be yielded.
I'm not sure if this is what you're looking for, but the following code snippet:
val rows: Iterator[Map[String,String]] = Iterator(Map("NAME" -> " ", "ID" -> "foo"), Map("NAME" -> " ", "ID" -> ""))
val fieldNames = List("NAME","ID","ANOTHER COLUMN")
val cleanedRows = rows map { row =>
fieldNames map { fieldName =>
Map ( fieldName -> row.get(fieldName).filter (_.trim.nonEmpty) )
}
}
while(cleanedRows.hasNext) {
println(cleanedRows.next)
}
Would print out:
List(Map(NAME -> None), Map(ID -> Some(foo)), Map(ANOTHER COLUMN -> None))
List(Map(NAME -> None), Map(ID -> None), Map(ANOTHER COLUMN -> None))
So at this point cleanedRows would have the entries you need to create your Product instances.