Scala map and/or groupby functions - scala

I am new to Scala and I am trying to figure out some scala syntax.
So I have a list of strings.
wordList: List[String] = List("this", "is", "a", "test")
I have a function that returns a list of pairs that contains consonants and vowels counts per word:
def countFunction(words: List[String]): List[(String, Int)]
So, for example:
countFunction(List("test")) => List(("Consonants", 3), ("Vowels", 1))
I now want to take a list of words and group them by count signatures:
def mapFunction(words: List[String]): Map[List[(String, Int)], List[String]]
//using wordList from above
mapFunction(wordList) => Map(
  List(("Consonants", 3), ("Vowels", 1)) -> List("this", "test"),
  List(("Consonants", 1), ("Vowels", 1)) -> List("is"),
  List(("Consonants", 0), ("Vowels", 1)) -> List("a"))
I'm thinking I need to use GroupBy to do this:
def mapFunction(words: List[String]): Map[List[(String, Int)], List[String]] = {
  words.groupBy(F: (A) => K)
}
I've read the Scala API docs for groupBy and see that F is a discriminator function and K is the type of the keys you want returned. So I tried this:
words.groupBy(countFunction => List[(String, Int)]
However, Scala doesn't accept this syntax. I looked up some examples for groupBy, but nothing seems to fit my use case. Any ideas?

Based on your description, your count function should take a single word instead of a list of words. I would have defined it like this:
def countFunction(word: String): List[(String, Int)]
If you do that you should be able to call words.groupBy(countFunction), which is the same as:
words.groupBy(word => countFunction(word))
If you cannot change the signature of countFunction, then you should be able to call group by like this:
words.groupBy(word => countFunction(List(word)))
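To make the first suggestion concrete, here is a minimal sketch. The body of countFunction is an assumption, since the question only gives its signature; the object name GroupBySignature is likewise made up, and non-letter characters are simply ignored here.

```scala
object GroupBySignature {
  private val vowels = Set('a', 'e', 'i', 'o', 'u')

  // Hypothetical implementation: counts consonants and vowels in one word.
  def countFunction(word: String): List[(String, Int)] = {
    val letters = word.toLowerCase.filter(_.isLetter)
    val vowelCount = letters.count(vowels.contains)
    List("Consonants" -> (letters.length - vowelCount), "Vowels" -> vowelCount)
  }

  // groupBy only needs the function itself; the key type is inferred.
  def mapFunction(words: List[String]): Map[List[(String, Int)], List[String]] =
    words.groupBy(countFunction)

  def main(args: Array[String]): Unit =
    mapFunction(List("this", "is", "a", "test")).foreach(println)
}
```

Because List compares by value, two words with the same counts produce equal keys and land in the same group.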

You shouldn't put the return type of the function in the call. The compiler can figure this out itself. You should just call it like this:
words.groupBy(countFunction)
If that doesn't work, please post your countFunction implementation.
Update:
I tested it in the REPL and this works (note that my countFunction has a slightly different signature from yours):
scala> def isVowel(c: Char) = "aeiou".contains(c)
isVowel: (c: Char)Boolean
scala> def isConsonant(c: Char) = ! isVowel(c)
isConsonant: (c: Char)Boolean
scala> def countFunction(s: String) = (('Consonants, s count isConsonant), ('Vowels, s count isVowel))
countFunction: (s: String)((Symbol, Int), (Symbol, Int))
scala> List("this", "is", "a", "test").groupBy(countFunction)
res1: scala.collection.immutable.Map[((Symbol, Int), (Symbol, Int)),List[java.lang.String]] = Map((('Consonants,0),('Vowels,1)) -> List(a), (('Consonants,1),('Vowels,1)) -> List(is), (('Consonants,3),('Vowels,1)) -> List(this, test))
You can include the type of the function passed to groupBy, but like I said you don't need it. If you want to pass it in you do it like this:
words.groupBy(countFunction: String => ((Symbol, Int), (Symbol, Int)))

Convert List[(Int,String)] into List[Int] in scala

My goal is to map every word in a text (given as (index, line) pairs) to a list containing the indices of every line the word occurs in. I managed to write a function that returns a list of all words, each paired with a line index.
The following function should do the rest (map a list of indices to every word):
def mapIndicesToWords(l:List[(Int,String)]):Map[String,List[Int]] = ???
If I do this:
l.groupBy(x => x._2)
it returns a Map[String, List[(Int,String)]]. Now I just want to change the value type to List[Int].
I thought of using .mapValues(...) and folding the list somehow, but I'm new to Scala and don't know the correct approach for this.
So how do I convert the list?
You can also use foldLeft. You just need to specify an accumulator (in your case a Map[String, List[Int]]), which is returned as the result, and write the update logic inside. Here is my implementation:
def mapIndicesToWords(l: List[(Int, String)]): Map[String, List[Int]] =
  l.foldLeft(Map[String, List[Int]]())((map, entry) =>
    map.get(entry._2) match {
      case Some(list) => map + (entry._2 -> (entry._1 :: list))
      case None => map + (entry._2 -> List(entry._1))
    }
  )
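With foldLeft, however, the indices for each word end up in reverse order, because every new index is prepended to its list. To keep the original order, use foldRight instead: change foldLeft to foldRight and swap the lambda parameters from (map, entry) to (entry, map).
Be aware that foldRight on a List can be slower: it is implemented by reversing the list and then folding from the left, so it makes an extra pass over the data.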
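The foldRight variant described above could look like the following sketch. The object name FoldRightIndices is made up, and getOrElse replaces the explicit match purely for brevity.

```scala
object FoldRightIndices {
  def mapIndicesToWords(l: List[(Int, String)]): Map[String, List[Int]] =
    // foldRight visits the last entry first, so prepending each index
    // leaves every list in the original (ascending) order.
    l.foldRight(Map.empty[String, List[Int]]) { (entry, map) =>
      val (index, word) = entry
      map + (word -> (index :: map.getOrElse(word, Nil)))
    }
}
```

For example, FoldRightIndices.mapIndicesToWords(List((0, "a"), (1, "b"), (2, "a"))) maps "a" to List(0, 2) and "b" to List(1).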
scala> val myMap: Map[String,List[(Int, String)]] = Map("a" -> List((1,"line1"), (2, "line")))
myMap: Map[String,List[(Int, String)]] = Map(a -> List((1,line1), (2,line)))
scala> myMap.mapValues(lst => lst.map(pair => pair._1))
res0: scala.collection.immutable.Map[String,List[Int]] = Map(a -> List(1, 2))
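Putting the groupBy and the value conversion together, a sketch of the requested function might look like this. It uses a plain map over the entries instead of mapValues, since mapValues is deprecated on Map as of Scala 2.13 and yields a lazy view; the object name GroupByIndices is made up.

```scala
object GroupByIndices {
  def mapIndicesToWords(l: List[(Int, String)]): Map[String, List[Int]] =
    l.groupBy(_._2)                    // Map[String, List[(Int, String)]]
      .map { case (word, pairs) =>
        word -> pairs.map(_._1)        // keep only the line indices
      }
}
```

groupBy keeps the elements of each group in their original list order, so the indices come out in the order they appeared.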

Anonymous comparator function in scala with deconstructed tuples?

I have a list of tuples, that I want to sort ascending according to the second element in the tuple. I do it with this code:
freqs.sortWith( _._2 < _._2 )
But I don't like the ._2 naming; I would prefer a nice name for the second element, like freqs.sortWith( _.weight < _.weight ).
Any ideas how to do this?
You don't need to repeat everything twice in sortWith, you can use sortBy(_._2) instead.
If you want to have a nice name, create a custom case class that has member variables of this name:
case class Foo(whatever: String, weight: Double)
val list: List[Foo] = ???
list.sortBy(_.weight)
It takes just a single line.
Alternatively, you can "pimp" the tuples locally:
class WeightOps(val whatever: String, val weight: Double)
implicit def tupleToWeightOps(t: (String, Double)): WeightOps =
  new WeightOps(t._1, t._2)
then you can use .weight on tuples directly:
val list: List[(String, Double)] = ???
list.sortBy(_.weight)
Don't forget to keep the implicit scope as small as possible.
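As a sketch of keeping that scope small, the same enrichment can be written as an implicit class (available since Scala 2.10) and imported only where it is needed. The names WeightSyntax, WeightedPair and sortByWeight are made up for illustration.

```scala
object WeightSyntax {
  // Hypothetical wrapper that adds a readable accessor to (String, Double).
  implicit class WeightedPair(private val pair: (String, Double)) {
    def weight: Double = pair._2
  }
}

object WeightSyntaxDemo {
  def sortByWeight(freqs: List[(String, Double)]): List[(String, Double)] = {
    import WeightSyntax._ // the conversion is visible only inside this method
    freqs.sortBy(_.weight)
  }
}
```

Outside sortByWeight, tuples have no .weight member, so the implicit cannot surprise unrelated code.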
scala> val list: List[(String, Int)] = List (("foo", 7), ("bar", 3), ("foobar", 5))
list: List[(String, Int)] = List((foo,7), (bar,3), (foobar,5))
scala> list.sortBy {case (ignore, price) => price }
res70: List[(String, Int)] = List((bar,3), (foobar,5), (foo,7))
A case extractor can be used to put meaningful names on variables.

applying partial function on a tuple field, maintaining the tuple structure

I have a PartialFunction[String,String] and a Map[String,String].
I want to apply the partial function to the map values and collect the entries for which it is applicable.
i.e. given:
val m = Map( "a"->"1", "b"->"2" )
val pf : PartialFunction[String,String] = {
  case "1" => "11"
}
I'd like to somehow combine _._2 with pf and be able to do this:
val composedPf : PartialFunction[(String,String),(String,String)] = /*someMagicalOperator(_._2,pf)*/
val collected : Map[String,String] = m.collect( composedPf )
// collected should be Map( "a"->"11" )
So far, the best I've got is this:
val composedPf = new PartialFunction[(String,String),(String,String)]{
  override def isDefinedAt(x: (String, String)): Boolean = pf.isDefinedAt(x._2)
  override def apply(v1: (String, String)): (String,String) = v1._1 -> pf(v1._2)
}
Is there a better way?
Here is the magical operator:
val composedPf: PartialFunction[(String, String), (String, String)] =
  { case (k, v) if pf.isDefinedAt(v) => (k, pf(v)) }
Another option, without creating a composed function, is this:
m.filter(e => pf.isDefinedAt(e._2)).mapValues(pf)
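If this pattern comes up more than once, the guard-based version can be wrapped in a small generic helper. This is only a sketch: onSecond and SecondLift are made-up names, not part of the standard library.

```scala
object SecondLift {
  // Hypothetical helper: lifts a partial function on values to one on
  // (key, value) pairs, leaving the key untouched.
  def onSecond[K, A, B](pf: PartialFunction[A, B]): PartialFunction[(K, A), (K, B)] = {
    case (k, v) if pf.isDefinedAt(v) => (k, pf(v))
  }

  val m: Map[String, String] = Map("a" -> "1", "b" -> "2")
  val pf: PartialFunction[String, String] = { case "1" => "11" }

  // The type annotation lets the compiler infer K = String.
  val composedPf: PartialFunction[(String, String), (String, String)] = onSecond(pf)

  val collected: Map[String, String] = m.collect(composedPf)
}
```

Here collected contains only the entry "a" -> "11", since pf is not defined for "2".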
Scalaz has a function that does exactly that: second
scala> m collect pf.second
res0: scala.collection.immutable.Map[String,String] = Map(a -> 11)
This works because PartialFunction is an instance of the Arrow typeclass (a generalized function), and second is one of the common operations defined for arrows.

Simplest way to extract Option from Scala collections

Imagine you have a Map[Option[Int], String] and you want a Map[Int, String], discarding any entry that has None as its key.
Another, similar example is transforming a List[(Option[Int], String)] into a List[(Int, String)], again discarding any tuple that has None as its first element.
What's the best approach?
collect is your friend here:
example data definition
val data = Map(Some(1) -> "data", None -> "")
solution for Map
scala> data collect { case ( Some(i), s) => (i,s) }
res4: scala.collection.immutable.Map[Int,String] = Map(1 -> data)
the same approach works for a list of tuples
scala> data.toList collect { case ( Some(i), s) => (i,s) }
res5: List[(Int, String)] = List((1,data))

Scala: Count Words

I am trying to write a function countWords(ws) that counts the frequency of words in a list of words ws, returning a map from words to occurrences.
Here ws is a List[String], and I should produce a Map[String, Int]. An example of what the function should do:
def test {
  expect (Map("aa" -> 2, "bb" -> 1)) {
    countWords(List("aa", "aa", "bb"))
  }
}
This is just preparation for a test; it's not an assignment. I have been stuck on this function for a while now. This is what I have so far:
object Solution {
  // define function countWords
  def countWords(ws : List[String]) : Map[String,Int] = ws match {
    case List() => List()
  }
}
which gives a type mismatch. I am not quite sure how to use the Scala Map type here. For example, when ws is an empty list, what should the function return so that it matches Map[String, Int]? I have been trying for a while, which is why I am posting here to get some help. Thank you.
Another way to do it is using groupBy which outputs Map(baz -> List(baz, baz, baz), foo -> List(foo, foo), bar -> List(bar)). Then you can map the values of the Map with mapValues to get a count of the number of times each word appears.
scala> List("foo", "foo", "bar", "baz", "baz", "baz")
res0: List[String] = List(foo, foo, bar, baz, baz, baz)
scala> res0.groupBy(x => x).mapValues(_.size)
res1: scala.collection.immutable.Map[String,Int] = Map(baz -> 3, foo -> 2, bar -> 1)
Regarding the type mismatch in your program: countWords declares Map[String, Int] as its return type, but the first (and only) case returns an empty List, whose type is List[Nothing]. If you change the case to case List() => Map[String, Int]() the type error goes away. The compiler will still warn about a non-exhaustive pattern match, and the function obviously won't return the correct output for a non-empty list.
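For completeness, the pattern-matching approach from the question can be finished by adding a case for the non-empty list. This is a sketch rather than the recommended solution: it is not tail-recursive, so very long lists could overflow the stack. The object name RecursiveCount is made up.

```scala
object RecursiveCount {
  def countWords(ws: List[String]): Map[String, Int] = ws match {
    // Base case: an empty list yields an empty map of the declared type.
    case Nil => Map.empty[String, Int]
    // Recursive case: count the tail, then bump the count of the head word.
    case word :: rest =>
      val counts = countWords(rest)
      counts + (word -> (counts.getOrElse(word, 0) + 1))
  }
}
```

With both cases present, the match is exhaustive and the warning disappears.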
The easiest solution is this:
def countWords(ws: List[String]): Map[String, Int] = {
  ws.toSet.map((word: String) => (word, ws.count(_ == word))).toMap
}
But it's not the fastest one, since it scans the whole list once for every distinct word.
edit:
A faster way is to use a mutable HashMap, which needs only a single pass over the list:
def countWords(ws: List[String]): Map[String, Int] = {
  val map = scala.collection.mutable.HashMap.empty[String, Int]
  for (word <- ws) {
    val n = map.getOrElse(word, 0)
    map += (word -> (n + 1))
  }
  map.toMap
}
Use fold to go through your list, starting with an empty map:
ws.foldLeft(Map.empty[String, Int]){
  (count, word) => count + (word -> (count.getOrElse(word, 0) + 1))
}
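Wrapped into the function the question asks for, the fold could be sketched as below. The main method compares it against the groupBy approach as a sanity check; the object name FoldCount is made up.

```scala
object FoldCount {
  def countWords(ws: List[String]): Map[String, Int] =
    // Single pass: each word increments its own entry in the accumulator.
    ws.foldLeft(Map.empty[String, Int]) { (counts, word) =>
      counts + (word -> (counts.getOrElse(word, 0) + 1))
    }

  def main(args: Array[String]): Unit = {
    val words = List("foo", "foo", "bar", "baz", "baz", "baz")
    val viaGroupBy = words.groupBy(identity).map { case (w, group) => w -> group.size }
    println(countWords(words))
    println(countWords(words) == viaGroupBy) // both approaches should agree
  }
}
```

The accumulator stays immutable, so this version needs no mutable state while still visiting each element only once.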