Lift algebird aggregator to consume (and return) Map - scala

The example in the README is very elegant:
scala> Map(1 -> Max(2)) + Map(1 -> Max(3)) + Map(2 -> Max(4))
res0: Map[Int,Max[Int]] = Map(2 -> Max(4), 1 -> Max(3))
Essentially the use of Map here is equivalent to SQL's group by.
But how do I do the same with an arbitrary Aggregator? For example, to achieve the same thing as the code above (but without the Max wrapper class):
scala> import com.twitter.algebird._
scala> val mx = Aggregator.max[Int]
mx: Aggregator[Int,Int,Int] = MaxAggregator(scala.math.Ordering$Int$#78c77)
scala> val mxOfMap = // what goes here?
mxOfMap: Aggregator[Map[Int,Int],Map[Int,Int],Map[Int,Int]] = ...
scala> mxOfMap.reduce(List(Map(1 -> 2), Map(1 -> 3), Map(2 -> 4)))
res0: Map[Int,Int] = Map(2 -> 4, 1 -> 3)
In other words, how do I convert (or "lift") an Aggregator that operates on values of type T into an Aggregator that operates on values of type Map[K,T] (for some arbitrary K)?

It looks like this can be done fairly easily, at least for a Semigroup. This should be sufficient when there is no additional logic in the "prepare" or "present" phases of the aggregator that needs to be preserved (a Semigroup can be obtained from an Aggregator, discarding prepare/present).
The code to answer the original question is:
scala> val sgOfMap = Semigroup.mapSemigroup[Int,Int](mx.semigroup)
scala> val mxOfMap = Aggregator.fromSemigroup(sgOfMap)
scala> mxOfMap.reduce(List(Map(1 -> 2), Map(1 -> 3), Map(2 -> 4)))
res0: Map[Int,Int] = Map(2 -> 4, 1 -> 3)
In practice, though, it would be better to construct the Semigroup directly rather than building an Aggregator merely to extract its semigroup:
scala> import com.twitter.algebird._
scala> val mx = Semigroup.from { (x: Int, y: Int) => Math.max(x, y) }
scala> val mxOfMap = Semigroup.mapSemigroup[Int,Int](mx)
scala> mxOfMap.sumOption(List(Map(1 -> 2), Map(1 -> 3), Map(2 -> 4)))
res33: Option[Map[Int,Int]] = Some(Map(2 -> 4, 1 -> 3))
If an Aggregator is needed, convert the result with Aggregator.fromSemigroup(mxOfMap).
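To make the idea of "lifting" concrete without the library, here is a minimal plain-Scala sketch of a key-wise merge, which is roughly the behaviour the map semigroup shows in the transcripts above. The name liftToMap is a hypothetical helper, not an Algebird function, and the sketch makes no claim about how Algebird implements this internally.

```scala
// Hypothetical helper (not part of Algebird): lifts a combine function on V
// into a combine function on Map[K, V] that merges the two maps key by key.
def liftToMap[K, V](combine: (V, V) => V): (Map[K, V], Map[K, V]) => Map[K, V] =
  (left, right) =>
    right.foldLeft(left) { case (acc, (k, v)) =>
      // Keys present on both sides are combined; new keys are simply added.
      acc.updated(k, acc.get(k).fold(v)(existing => combine(existing, v)))
    }

val maxOfMap = liftToMap[Int, Int]((a: Int, b: Int) => math.max(a, b))

// Same input as in the question; merged == Map(1 -> 3, 2 -> 4)
val merged = List(Map(1 -> 2), Map(1 -> 3), Map(2 -> 4)).reduce(maxOfMap)
```

Any associative function can be passed in place of max, which is the property a Semigroup captures.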

Related

In Scala, what does x => x._1._1 denote?

In the following snippet, I know that x._1 denotes the first element of a tuple, but I can't work out what x._1._1 represents. I'm not very familiar with Scala, so apologies if this is a naive question.
val a = b.groupBy(x=> x._1._1)
Here is a quick example in the REPL of a nested tuple
scala> val t = ((1, 2), 3)
t: ((Int, Int), Int) = ((1,2),3)
scala> t._1 // Get the first part of the tuple
res0: (Int, Int) = (1,2)
scala> t._2 // Get the second part of the tuple
res1: Int = 3
scala> t._1._1 // Get the first part of the first part
res2: Int = 1
And here is an example with a sequence to demonstrate the groupBy:
scala> val s = Seq(((1, 2), 3), ((1, 5), 6), ((2, 4), 32))
s: Seq[((Int, Int), Int)] = List(((1,2),3), ((1,5),6), ((2,4),32))
scala> s.groupBy
def groupBy[K](f: (((Int, Int), Int)) => K): scala.collection.immutable.Map[K,Seq[((Int, Int), Int)]]
scala> s.groupBy(x => x._1._1)
res3: scala.collection.immutable.Map[Int,Seq[((Int, Int), Int)]] = Map(2 -> List(((2,4),32)), 1 -> List(((1,2),3), ((1,5),6)))
In this case the first element of the first element is the target for the grouping. Here's the result in an easier-to-read format:
Map(
  2 -> List(
    ((2,4),32)),
  1 -> List(
    ((1,2),3),
    ((1,5),6))
)
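As a small sketch of an alternative spelling, the same grouping can be written with a pattern match that names the nested element instead of chaining ._1 accessors. The variable names below are illustrative only.

```scala
val s = Seq(((1, 2), 3), ((1, 5), 6), ((2, 4), 32))

// Accessor style, as in the question.
val byAccessor = s.groupBy(x => x._1._1)

// Pattern-match style: destructure the outer pair, then the inner pair,
// and group by the first component of the inner pair.
val byPattern = s.groupBy { case ((first, _), _) => first }
```

Both calls group by the same key; the pattern version just makes the nesting visible in the code.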
It means x._1 itself is a tuple.
Example:
val b = Seq((("subTuple_1", "subTuple_2"), "tuple_2"))
val a = b.groupBy(x=> x._1._1)
As you mentioned, ._1 gives you the first element of the tuple; if that first element is itself a tuple, you can call ._1 on it again.
For example:
scala> Map(("a" -> "b") -> 100, ("c" -> "d") -> 200).map(_._1)
res31: scala.collection.immutable.Map[String,String] = Map(a -> b, c -> d)
scala> Map(("a" -> "b") -> 100, ("c" -> "d") -> 200).map(_._1._1)
res32: scala.collection.immutable.Iterable[String] = List(a, c)
And with groupBy:
scala> Map(("a" -> "b") -> 100, ("a" -> "c") -> 200).groupBy(_._1._1)
res19: scala.collection.immutable.Map[String,scala.collection.immutable.Map[(String, String),Int]] = Map(a -> Map((a,b) -> 100, (a,c) -> 200))

How to set value in scala Map?

I am new to Scala. I have a Map and want to set a value for a particular key. Here is the code I am writing:
var mp: Map[Int, ParticipationStateTransition] = Map.empty[Int, ParticipationStateTransition]
val change: ParticipationStateTransition = new ParticipationStateTransition
mp(ri.userID) = change
The error reported on the third line is:
application does not take parameters
What am I doing wrong? Thanks in advance.
Use .updated:
scala> val m = Map(1 -> 2)
m: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2)
scala> val n = m.updated(1, 3)
n: scala.collection.immutable.Map[Int,Int] = Map(1 -> 3)
scala> m
res0: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2)
scala> n
res1: scala.collection.immutable.Map[Int,Int] = Map(1 -> 3)
Note that Scala's default Map is immutable, so .updated does not change the original map; it returns a new one, and you need to keep the returned value.
If you want to change the map in place, use collection.mutable.Map and call update:
scala> val m = collection.mutable.Map(1 -> 2)
m: scala.collection.mutable.Map[Int,Int] = Map(1 -> 2)
scala> m.update(1, 3)
scala> m
res3: scala.collection.mutable.Map[Int,Int] = Map(1 -> 3)
If you want to set multiple values at once, you can use ++:
scala> val m = Map(1 -> 2)
m: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2)
scala> val n = m ++ List((1 -> 3), (2 -> 4)) // also accepts an Array, a Map, …
n: scala.collection.immutable.Map[Int,Int] = Map(1 -> 3, 2 -> 4)
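Applied to the code in the question, the two options might look like the sketch below. Transition is a hypothetical stand-in for the asker's ParticipationStateTransition class, and the keys 42 and 43 stand in for ri.userID; neither comes from the original code.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the asker's ParticipationStateTransition class.
final case class Transition(name: String)

// Option 1: keep the immutable Map in a var and reassign it.
var immutableMp: Map[Int, Transition] = Map.empty
immutableMp = immutableMp.updated(42, Transition("joined"))
// For a var holding an immutable Map, += is sugar for
// immutableMp = immutableMp + (43 -> ...).
immutableMp += (43 -> Transition("left"))

// Option 2: use a mutable Map. Here mp(key) = value compiles,
// because it desugars to mp.update(key, value).
val mutableMp = mutable.Map.empty[Int, Transition]
mutableMp(42) = Transition("joined")
```

The assignment syntax from the question only works in the second form, since the immutable Map has no update method.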

How to avoid the strange order in which maps are concatenated? (A++B++C ---> BAC)

Concatenating three maps a, b and c, I would expect the result to be in the same order as its respective original maps. But, as shown below, the result is like the maps were b, a and c:
Welcome to Scala version 2.10.0 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_26).
Type in expressions to have them evaluated.
Type :help for more information.
scala> import collection.mutable
import collection.mutable
scala> val a = mutable.Map(1->2)
a: scala.collection.mutable.Map[Int,Int] = Map(1 -> 2)
scala> val b = mutable.Map(2->2)
b: scala.collection.mutable.Map[Int,Int] = Map(2 -> 2)
scala> val c = mutable.Map(3->2)
c: scala.collection.mutable.Map[Int,Int] = Map(3 -> 2)
scala> a ++ b ++ c
res0: scala.collection.mutable.Map[Int,Int] = Map(2 -> 2, 1 -> 2, 3 -> 2)
With four maps the result comes out as b, d, a, c; with two, as b, a. The resulting map always has the same order, regardless of the order in which the maps are concatenated.
Testing the answer:
scala> import collection.mutable.LinkedHashMap
import collection.mutable.LinkedHashMap
scala> val a = LinkedHashMap(1 -> 2)
a: scala.collection.mutable.LinkedHashMap[Int,Int] = Map(1 -> 2)
scala> val b = LinkedHashMap(2 -> 2)
b: scala.collection.mutable.LinkedHashMap[Int,Int] = Map(2 -> 2)
scala> val c = LinkedHashMap(3 -> 2)
c: scala.collection.mutable.LinkedHashMap[Int,Int] = Map(3 -> 2)
scala> a ++ b ++ c
res0: scala.collection.mutable.Map[Int,Int] = Map(1 -> 2, 2 -> 2, 3 -> 2)
Scala's Map (like Java's) does not have a defined iteration order. If you need to maintain insertion order, you can use a ListMap (which is immutable) or a LinkedHashMap (which is not):
scala> import collection.mutable.LinkedHashMap
import collection.mutable.LinkedHashMap
scala> val a = LinkedHashMap(1 -> 2)
a: scala.collection.mutable.LinkedHashMap[Int,Int] = Map(1 -> 2)
scala> a += (2 -> 2)
res0: a.type = Map(1 -> 2, 2 -> 2)
scala> a += (3 -> 2)
res1: a.type = Map(1 -> 2, 2 -> 2, 3 -> 2)
scala> a
res2: scala.collection.mutable.LinkedHashMap[Int,Int] = Map(1 -> 2, 2 -> 2, 3 -> 2)
But in general if you care about the order of your elements, you're probably better off with a different data structure.
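A minimal sketch of concatenating several maps while keeping insertion order is to append them into one fresh LinkedHashMap. The keys are deliberately unsorted here so that insertion order can be told apart from key order; the values are made up for illustration.

```scala
import scala.collection.mutable.LinkedHashMap

val a = LinkedHashMap(3 -> "c")
val b = LinkedHashMap(1 -> "a")
val c = LinkedHashMap(2 -> "b")

// Appending into an explicitly typed LinkedHashMap keeps the static type
// as LinkedHashMap, so insertion order is part of the declared contract.
val combined = LinkedHashMap.empty[Int, String]
combined ++= a
combined ++= b
combined ++= c

// Keys come back in the order they were inserted: 3, then 1, then 2.
val keysInOrder = combined.keys.toList
```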

Composing two maps

Is there a function in Scala to compose two maps or is flatMap a sensible approach?
scala> val caps: Map[String, Int] = Map(("A", 1), ("B", 2))
caps: Map[String,Int] = Map(A -> 1, B -> 2)
scala> val lower: Map[Int, String] = Map((1, "a"), (2, "b"))
lower: Map[Int,String] = Map(1 -> a, 2 -> b)
scala> caps.flatMap {
     |   case (cap, idx) => Map((cap, lower(idx)))
     | }
res1: scala.collection.immutable.Map[String,String] = Map(A -> a, B -> b)
Some syntactic sugar would be great!
If you know lower will contain keys for all the values in caps, you can use mapValues:
scala> caps mapValues lower
res0: scala.collection.immutable.Map[String,String] = Map(A -> a, B -> b)
If you don't want or need a new collection, just a mapping, it's a little more idiomatic to use andThen:
scala> val composed = caps andThen lower
composed: PartialFunction[String,String] = <function1>
scala> composed("A")
res1: String = a
This also assumes that every value in caps is a key in lower.
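If that assumption does not hold, one possible sketch is to drop the entries that have no counterpart in lower rather than throw. The extra "C" -> 3 entry is added here only to show the missing-key case.

```scala
val caps: Map[String, Int] = Map("A" -> 1, "B" -> 2, "C" -> 3)
val lower: Map[Int, String] = Map(1 -> "a", 2 -> "b")

// Keep only the entries whose intermediate value is a key in lower;
// "C" -> 3 is silently dropped because lower has no key 3.
val composed: Map[String, String] =
  caps.collect { case (cap, idx) if lower.contains(idx) => cap -> lower(idx) }
```

Whether dropping is the right behaviour depends on the use case; failing fast with lower(idx), as in the versions above, may be preferable when a missing key indicates a bug.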

How to convert a mutable HashMap into an immutable equivalent in Scala?

Inside a function of mine I construct a result set by filling a new mutable HashMap with data (if there is a better way - I'd appreciate comments). Then I'd like to return the result set as an immutable HashMap. How to derive an immutable from a mutable?
Discussion about returning immutable.Map vs. immutable.HashMap notwithstanding, what about simply using the toMap method:
scala> val m = collection.mutable.HashMap(1 -> 2, 3 -> 4)
m: scala.collection.mutable.HashMap[Int,Int] = Map(3 -> 4, 1 -> 2)
scala> m.toMap
res22: scala.collection.immutable.Map[Int,Int] = Map(3 -> 4, 1 -> 2)
As of 2.9, this uses the method toMap in TraversableOnce, which is implemented as follows:
def toMap[T, U](implicit ev: A <:< (T, U)): immutable.Map[T, U] = {
  val b = immutable.Map.newBuilder[T, U]
  for (x <- self)
    b += x
  b.result
}
Another option is to concatenate the mutable map onto an empty immutable HashMap:
scala> val m = collection.mutable.HashMap(1->2,3->4)
m: scala.collection.mutable.HashMap[Int,Int] = Map(3 -> 4, 1 -> 2)
scala> collection.immutable.HashMap() ++ m
res1: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 3 -> 4)
or
scala> collection.immutable.HashMap(m.toSeq:_*)
res2: scala.collection.immutable.HashMap[Int,Int] = Map(1 -> 2, 3 -> 4)
If you have a mutable map, e.g. logMap: mutable.Map[String, String], you only need to call logMap.toMap (without parentheses, since toMap takes only an implicit parameter).
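On the "is there a better way" part of the question, one sketch is to skip the mutable HashMap altogether and build the immutable result directly. The words list and the length computation are hypothetical example data, not from the question.

```scala
// Hypothetical input data; any collection of items would do.
val words = List("apple", "kiwi", "fig")

// Option 1: produce key/value pairs and call toMap; no mutable map involved.
val lengths: Map[String, Int] = words.map(w => w -> w.length).toMap

// Option 2: use a builder when entries are produced imperatively.
// The builder is mutable, but result() hands back an immutable Map.
val builder = Map.newBuilder[String, Int]
for (w <- words) builder += (w -> w.length)
val built: Map[String, Int] = builder.result()
```

The builder form is close to what the toMap implementation quoted above does internally.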