val valueCountsMap: mutable.Map[String, Int] = mutable.Map[String, Int]()
valueCountsMap("a") = 1
valueCountsMap("b") = 1
valueCountsMap("c") = 1
val maxOccurredValueNCount: (String, Int) = valueCountsMap.maxBy(_._2)
// maxOccurredValueNCount: (String, Int) = (b,1)
How can I get None when there is no clear winner (a tie for the maximum value) when I call maxBy on the values? I am wondering whether Scala's mutable Map already provides a native solution for this.
No, there's no native solution for what you've described.
Here's how I might go about it.
implicit class UniqMax[K,V:Ordering](m: Map[K,V]) {
def uniqMaxByValue: Option[(K,V)] = {
m.headOption.fold(None:Option[(K,V)]){ hd =>
val ev = implicitly[Ordering[V]]
val (count, max) = m.tail.foldLeft((1,hd)) {case ((c, x), v) =>
if (ev.gt(v._2, x._2)) (1, v)
else if (v._2 == x._2) (c+1, x)
else (c, x)
}
if (count == 1) Some(max) else None
}
}
}
Usage:
Map("a"->11, "b"->12, "c"->11).uniqMaxByValue //res0: Option[(String, Int)] = Some((b,12))
Map(2->"abc", 1->"abx", 0->"ab").uniqMaxByValue //res1: Option[(Int, String)] = Some((1,abx))
Map.empty[Long,Boolean].uniqMaxByValue //res2: Option[(Long, Boolean)] = None
Map('c'->2.2, 'w'->2.2, 'x'->2.1).uniqMaxByValue //res3: Option[(Char, Double)] = None
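Since the question uses a mutable Map, here is a shorter two-pass alternative as a rough sketch. The helper name uniqueMaxByValue is hypothetical (it is not part of the standard library), and it accepts any collection.Map so that both mutable and immutable maps work. It trades the single pass of the fold above for brevity.

```scala
import scala.collection.mutable

// Hypothetical helper: returns the entry holding the largest value,
// but only if no other entry shares that value.
def uniqueMaxByValue[K, V](m: collection.Map[K, V])(implicit ord: Ordering[V]): Option[(K, V)] =
  if (m.isEmpty) None
  else {
    val best = m.values.max                       // first pass: find the maximum value
    m.filter { case (_, v) => ord.equiv(v, best) } // second pass: keep entries equal to it
      .toList match {
        case single :: Nil => Some(single)        // exactly one winner
        case _             => None                // tie
      }
  }

val tied  = mutable.Map("a" -> 1, "b" -> 1, "c" -> 1)
val clear = mutable.Map("a" -> 1, "b" -> 2, "c" -> 1)

val r1 = uniqueMaxByValue(tied)   // None
val r2 = uniqueMaxByValue(clear)  // Some((b,2))
```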
val adjList = Map("Logging" -> List("Networking", "Game"))
// val adjList: Map[String, List[String]] = Map(Logging -> List(Networking, Game))
adjList.flatMap { case (v, vs) => vs.map(n => (v, n)) }.toList
// val res7: List[(String, String)] = List((Logging,Game))
adjList.map { case (v, vs) => vs.map(n => (v, n)) }.flatten.toList
// val res8: List[(String, String)] = List((Logging,Networking), (Logging,Game))
I am not sure what is happening here. I was expecting the same result from both of them.
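.flatMap here is Map's own .flatMap (the function returns key-value pairs, so the result is built as a Map), whereas .map is Iterable's .map (the function returns a List rather than a pair, so the result is a plain Iterable).
In a Map, "Logging" -> "Networking" and "Logging" -> "Game" collapse into just the latter, "Logging" -> "Game", because both entries have the same key.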
val adjList: Map[String, List[String]] = Map("Logging" -> List("Networking", "Game"))
val x0: Map[String, String] = adjList.flatMap { case (v, vs) => vs.map(n => (v, n)) }
//Map(Logging -> Game)
val x: List[(String, String)] = x0.toList
//List((Logging,Game))
val adjList: Map[String, List[String]] = Map("Logging" -> List("Networking", "Game"))
val y0: immutable.Iterable[List[(String, String)]] = adjList.map { case (v, vs) => vs.map(n => (v, n)) }
//List(List((Logging,Networking), (Logging,Game)))
val y1: immutable.Iterable[(String, String)] = y0.flatten
//List((Logging,Networking), (Logging,Game))
val y: List[(String, String)] = y1.toList
//List((Logging,Networking), (Logging,Game))
Also https://users.scala-lang.org/t/map-flatten-flatmap/4180
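If the goal is to keep every pair, one option is to leave the Map before flat-mapping, so that the result is built as a List and duplicate keys cannot overwrite each other. A minimal sketch (the names edges and edges2 are made up for illustration):

```scala
val adjList = Map("Logging" -> List("Networking", "Game"))

// Converting to a List first means flatMap builds a List of pairs, not a Map,
// so pairs that share a key are all kept.
val edges: List[(String, String)] =
  adjList.toList.flatMap { case (v, vs) => vs.map(n => (v, n)) }
// List((Logging,Networking), (Logging,Game))

// The same thing written as a for-comprehension over the List.
val edges2: List[(String, String)] =
  for ((v, vs) <- adjList.toList; n <- vs) yield (v, n)
```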
What is the easiest way to implement a function that finds the first matching element in a Scala collection and removes it immutably (returning a new collection)?
case class A(a: Int, b: Int)
val s = Seq(A(1,5), A(4,6), A(2,3), A(5,1), A(2,7))
val (s1, r) = s.findAndRemove(_.a == 2)
Result: s1 = Seq(A(1,5), A(4,6), A(5,1), A(2,7)) , r = Some(A(2,3))
It finds the first element that matches and preserves order. It could be improved by using List instead of Seq.
case class A(a: Int, b: Int)
val s = Seq(A(1,5), A(4,6), A(2,3), A(5,1), A(2,7))
val (s1, r) = s.findAndRemove(_.a == 2)
println(s1)
println(r)
implicit class SeqOps[T](s:Seq[T]) {
def findAndRemove(f:T => Boolean):(Seq[T], Option[T]) = {
s.foldLeft((Seq.empty[T], Option.empty[T])) {
case ((l, None), elem) => if(f(elem)) (l, Option(elem)) else (l :+ elem, None)
case ((l, x), elem) => (l :+ elem, x)
}
}
}
Yeah, a little late to the party, but I thought I'd throw this in.
Minimum invocations of the predicate.
Works with most popular collection types: Seq, List, Array, Vector. Even Set and Map (but for those the collection has no order to preserve and there's no telling which element the predicate will find first). Doesn't work for Iterator or String.
import scala.collection.generic.CanBuildFrom
import scala.language.higherKinds
implicit class CollectionOps[U, C[_]](xs :C[U]) {
def findAndRemove(p :U=>Boolean
)(implicit bf :CanBuildFrom[C[U], U, C[U]]
,ev :C[U] => collection.TraversableLike[U, C[U]]
) :(C[U], Option[U]) = {
val (before, after) = xs.span(!p(_))
before ++ after.drop(1) -> after.headOption
}
}
usage:
case class A(a: Int, b: Int)
val (as, a) = Seq(A(1,5), A(4,6), A(2,3), A(5,1), A(2,7)).findAndRemove(_.a==2)
//as: Seq[A] = List(A(1,5), A(4,6), A(5,1), A(2,7))
//a: Option[A] = Some(A(2,3))
val (cs, c) = Array('g','t','e','y','b','e').findAndRemove(_<'f')
//cs: Array[Char] = Array(g, t, y, b, e)
//c: Option[Char] = Some(e)
val (ns, n) = Stream.from(9).findAndRemove(_ > 10)
//ns: Stream[Int] = Stream(9, ?)
//n: Option[Int] = Some(11)
ns.take(5).toList //List[Int] = List(9, 10, 12, 13, 14)
Try something like this
def findAndRemove(as: Seq[A])(fn: A => Boolean): (Seq[A], Option[A]) = {
val index = as.indexWhere(fn)
if(index == -1) as -> None
else as.patch(index, Nil, 1) -> as.lift(index)
}
val (s1, r) = findAndRemove(s)(_.a == 2)
My version:
def findAndRemove(s:Seq[A])(p:A => Boolean):(Seq[A], Option[A])={
val i = s.indexWhere(p)
if(i >= 0){
val (l1, l2) = s.splitAt(i)
(l1++l2.tail, Some(l2.head))
}else{
(s, None)
}
}
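One detail worth checking in the index-based versions: indexWhere returns -1 when nothing matches, and 0 is a valid position (a match on the first element), so the "found" test should be >= 0. A small generic sketch that exercises both cases; the name removeFirst is hypothetical:

```scala
// Hypothetical generic variant; removeFirst is not a standard library method.
def removeFirst[T](s: Seq[T])(p: T => Boolean): (Seq[T], Option[T]) = {
  val i = s.indexWhere(p)                       // -1 when no element satisfies p
  if (i >= 0) (s.patch(i, Nil, 1), Some(s(i)))  // drop exactly one element at index i
  else (s, None)
}

val atHead  = removeFirst(Seq(2, 4, 6))(_ == 2) // match at index 0: (List(4, 6), Some(2))
val missing = removeFirst(Seq(2, 4, 6))(_ == 5) // no match: (List(2, 4, 6), None)
```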
I have the following:
val x : List[(String, Int)] = List(("mealOne",2), ("mealTwo",1), ("mealThree",2))
I want to replace each String with its price (a Double), using the values below, with a map:
val mealOne = 5.99; val mealTwo = 6.99; val mealThree = 7.99
x.map{ x => if (x._1 == "mealOne") mealOne
else if (x._1 == "mealTwo") mealTwo
else mealThree
}
Result:
List[Double] = List(5.99, 6.99, 7.99)
but I want this:
List[(Double, Int)] = List((5.99,2), (6.99,1), (7.99,2))
So how can I achieve the above?
Thanks
Just don't drop the second component of the tuple then:
x.map{ x => (
if (x._1 == "mealOne") mealOne
else if (x._1 == "mealTwo") mealTwo
else mealThree,
x._2
)}
Of course, this works for arbitrary mappings from Strings to Doubles:
def replaceNamesByPrices(
nameToPrice: String => Double,
xs: List[(String, Int)]
): List[(Double, Int)] =
for ((name, amount) <- xs) yield (nameToPrice(name), amount)
so that you can then store the mapping of names to prices in a map, i.e.
val priceTable = Map(
"mealOne" -> 42.99,
"mealTwo" -> 5.99,
"mealThree" -> 2345.65
)
so that
replaceNamesByPrices(priceTable, x)
yields the desired result.
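Passing the Map directly works because an immutable Map[String, Double] is itself a function from String to Double. The sketch below uses the prices from the question and also shows one possible way to skip names that have no price instead of throwing; the names order, priced and pricedKnown are illustrative only.

```scala
val priceTable = Map("mealOne" -> 5.99, "mealTwo" -> 6.99, "mealThree" -> 7.99)
val order = List(("mealOne", 2), ("mealTwo", 1), ("mealThree", 2))

// priceTable(name) throws NoSuchElementException for an unknown name.
val priced: List[(Double, Int)] =
  order.map { case (name, qty) => (priceTable(name), qty) }
// List((5.99,2), (6.99,1), (7.99,2))

// Safer variant (an assumption about the desired behaviour):
// silently drop entries whose name has no price.
val pricedKnown: List[(Double, Int)] =
  (("unknownMeal", 3) :: order).flatMap { case (name, qty) =>
    priceTable.get(name).map(price => (price, qty))
  }
```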
This works as follows (simplified further, thanks to Andrey Tyukin):
for(m<-x) yield (y(m._1),m._2)
for((m,n)<-x) yield (y(m),n)
or
x.map(t=>(y(t._1),t._2))
x.map{case (m,n)=>(y(m),n)}
Your input lists (y is changed to a Map):
val x = List(("mealOne",2), ("mealTWo",1), ("mealThree",2))
val y = Map(("mealOne",5.99), ("mealTWo",6.99), ("mealThree",7.99))
In Scala REPL:
scala> for(m<-x) yield (y(m._1),m._2)
res35: List[(Double, Int)] = List((5.99,2), (6.99,1), (7.99,2))
scala> for((m,n)<-x) yield (y(m),n)
res60: List[(Double, Int)] = List((5.99,2), (6.99,1), (7.99,2))
scala> x.map(t=>(y(t._1),t._2))
res57: List[(Double, Int)] = List((5.99,2), (6.99,1), (7.99,2))
scala> x.map{case (m,n)=>(y(m),n)}
res59: List[(Double, Int)] = List((5.99,2), (6.99,1), (7.99,2))
So the title of this should be confusing enough so I will do my best to explain. I am trying to break this function up into defined functions for better visibility into how the aggregateByKey works for other teams that will be writing to my code. I have the following aggregate:
val firstLetter = stringRDD.aggregateByKey(Map[Char, Int]())(
(accumCount, value) => accumCount.get(value.head) match {
case None => accumCount + (value.head -> 1)
case Some(count) => accumCount + (value.head -> (count + 1))
},
(accum1, accum2) => accum1 ++ accum2.map{case(k,v) => k -> (v + accum1.getOrElse(k, 0))}
).collect()
I've been wanting to break this up as follows:
val firstLet = Map[Char, Int]
def fSeq(accumCount:?, value:?) = {
accumCount.get(value.head) match {
case None => accumCount + (value.head -> 1)
case Some(count) => accumCount + (value.head -> (count + 1))
}
}
def fComb(accum1:?, accum2:?) = {
accum1 ++ accum2.map{case(k,v) => k -> (v + accum1.getOrElse(k, 0))}
}
Because the initial value is a Map[Char, Int], I am not sure which data types to declare for accumCount and value. I've tried different things, but nothing seems to work. Can someone help me define the data types and explain how you determined them?
seqOp takes an accumulator of the same type as the initial value as its first argument, and a value of the same type as the values in your RDD as its second.
combOp takes two accumulators of the same type as the initial value.
Assuming you want to aggregate RDD[(T,U)]:
def fSeq(accumCount: Map[Char, Int], value: U): Map[Char, Int] = ???
def fComb(accum1: Map[Char, Int], accum2: Map[Char, Int]): Map[Char, Int] = ???
I guess in your case U is simply a String, so you should adjust the fSeq signature accordingly.
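To see how the types line up without a cluster, here is a plain-Scala stand-in for what happens to the values of a single key. It does not use Spark at all; the two "partitions" are hypothetical local lists, and the names seqOp and combOp simply follow the wording above.

```scala
// Accumulator type = type of the zero value; value type = String (the RDD's values).
def seqOp(acc: Map[Char, Int], value: String): Map[Char, Int] =
  acc + (value.head -> (acc.getOrElse(value.head, 0) + 1))

// Both arguments and the result have the accumulator type.
def combOp(a: Map[Char, Int], b: Map[Char, Int]): Map[Char, Int] =
  b.foldLeft(a) { case (m, (k, v)) => m + (k -> (m.getOrElse(k, 0) + v)) }

// Two hypothetical partitions holding values that belong to the same key.
val partition1 = List("apple", "avocado", "banana")
val partition2 = List("blueberry", "apricot")

val zero = Map.empty[Char, Int]
val partial1 = partition1.foldLeft(zero)(seqOp) // a counted twice, b once
val partial2 = partition2.foldLeft(zero)(seqOp) // b counted once, a once
val merged   = combOp(partial1, partial2)       // a -> 3, b -> 2
```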
BTW, you can provide a default mapping and simplify your functions:
val firstLet = Map[Char, Int]().withDefault(x => 0)
def fSeq(accumCount: Map[Char, Int], value: String): Map[Char, Int] = {
accumCount + (value.head -> (accumCount(value.head) + 1))
}
def fComb(accum1: Map[Char, Int], accum2: Map[Char, Int]): Map[Char, Int] = {
val accum = (accum1.keys ++ accum2.keys).map(k => (k, accum1(k) + accum2(k)))
accum.toMap.withDefault(x => 0)
}
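A short sketch of how withDefault behaves may help explain why fComb re-applies it at the end. As far as I know, the default only affects apply, survives + on an immutable Map, and is lost when the map is rebuilt (for example through toMap). The value names below are made up for illustration.

```scala
import scala.util.Try

val counts = Map.empty[Char, Int].withDefault(_ => 0)

val viaApply = counts('z')      // 0: apply falls back to the default
val viaGet   = counts.get('z')  // None: get ignores the default

// Adding an entry with + keeps the default function.
val updated     = counts + ('a' -> (counts('a') + 1))
val afterUpdate = updated('q')  // 0

// Rebuilding the map produces a plain Map without the default,
// so a missing key throws again.
val rebuilt     = updated.toList.toMap
val defaultLost = Try(rebuilt('q')).isFailure // true
```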
Finally it could be more efficient to use scala.collection.mutable.Map:
import scala.collection.mutable.{Map => MMap}
def firstLetM = MMap[Char, Int]().withDefault(x => 0)
def fSeqM(accumCount: MMap[Char, Int], value: String): MMap[Char, Int] = {
accumCount += (value.head -> (accumCount(value.head) + 1))
}
def fCombM(accum1: MMap[Char, Int], accum2: MMap[Char, Int]): MMap[Char, Int] = {
accum2.foreach{case (k, v) => accum1 += (k -> (accum1(k) + v))}
accum1
}
Test:
def randomChar() = (scala.util.Random.nextInt.abs % 58 + 65).toChar
def randomString() = {
(Seq(randomChar) ++ Iterator.iterate(randomChar)(_ => randomChar)
.takeWhile(_ => scala.util.Random.nextFloat > 0.1)).mkString
}
val stringRDD = sc.parallelize(
(1 to 500000).map(_ => (scala.util.Random.nextInt.abs % 60, randomString)))
val firstLetter = stringRDD.aggregateByKey(Map[Char, Int]())(
(accumCount, value) => accumCount.get(value.head) match {
case None => accumCount + (value.head -> 1)
case Some(count) => accumCount + (value.head -> (count + 1))
},
(accum1, accum2) => accum1 ++ accum2.map{
case(k,v) => k -> (v + accum1.getOrElse(k, 0))}
).collectAsMap()
val firstLetter2 = stringRDD
.aggregateByKey(firstLet)(fSeq, fComb)
.collectAsMap
val firstLetter3 = stringRDD
.aggregateByKey(firstLetM)(fSeqM, fCombM)
.mapValues(_.toMap)
.collectAsMap
firstLetter == firstLetter2
firstLetter == firstLetter3
def combinations(occurrences: Occurrences): List[Occurrences] = occurrences match {
case List() => List(List())
case occ :: occs =>
for {
occSub <- (0 to occ._2).map((occ._1, _)).toList // the line being asked about
occsCombination <- combinations(occs)
} yield (occSub :: occsCombination).filter(x => x._2 != 0)
}
.map((occ._1, _)) is short for .map(i => (occ._1, i)).
For each element between 0 and occ._2, it creates a Tuple as above. So this returns a list of tuples with the first element fixed and the second going from 0 to occ._2.
For example:
scala> val occ = (42,5)
occ: (Int, Int) = (42,5)
scala> (0 to occ._2).map(i => (occ._1, i)).toList
res0: List[(Int, Int)] = List((42,0), (42,1), (42,2), (42,3), (42,4), (42,5))
scala> (0 to occ._2).map((occ._1, _)).toList
res1: List[(Int, Int)] = List((42,0), (42,1), (42,2), (42,3), (42,4), (42,5))
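To see that line in context, here is a runnable sketch of the whole function on a small input. The alias Occurrences is not defined in the snippet above, so List[(Char, Int)] is an assumption made here.

```scala
type Occurrences = List[(Char, Int)] // assumed definition, not shown in the question

def combinations(occurrences: Occurrences): List[Occurrences] = occurrences match {
  case List() => List(List())
  case occ :: occs =>
    for {
      // every count from 0 up to occ._2 for this character
      occSub <- (0 to occ._2).map((occ._1, _)).toList
      // every combination of the remaining characters
      occsCombination <- combinations(occs)
    } yield (occSub :: occsCombination).filter(x => x._2 != 0) // drop zero counts
}

val result = combinations(List(('a', 2), ('b', 1)))
// 3 choices for 'a' (0, 1, 2) times 2 choices for 'b' (0, 1) = 6 combinations
```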