Combining a filter within a map - Scala

I have a list which I am converting to a map by calling a value-computation function for each key. I am using collection.breakOut to avoid creating unnecessary intermediate collections, since what I am doing is a bit combinatorial and every saved iteration helps.
I need to filter out certain tuples from the map, in my case those where the value is less than 0. Is it possible to do this in the map step itself rather than with a filter afterwards (which iterates once again)?
val myMap: Map[Key, Int] = keyList.map(key => key -> computeValue(key))(collection.breakOut)
val myFilteredMap = myMap.filter(_._2 >= 0)
In other words, I wish to obtain the second map in one go, ideally by filtering out the tuples I don't want in the first call to map(). Is this possible in any way?

You can easily do this with a foldLeft:
keyList.foldLeft(Map[Key, Int]()) { (map, key) =>
  val value = computeValue(key)
  if (value >= 0) {
    map + (key -> value)
  } else {
    map
  }
}
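For illustration, here is the fold in action with made-up inputs (the Key alias and computeValue below are my own placeholders, not from the question):
// Hypothetical sample data, for illustration only
type Key = Int
def computeValue(k: Key): Int = k * 2 - 2
val keyList = List(0, 1, 2)

val myFilteredMap = keyList.foldLeft(Map[Key, Int]()) { (map, key) =>
  val value = computeValue(key)
  if (value >= 0) map + (key -> value) else map
}
// myFilteredMap == Map(1 -> 0, 2 -> 2); key 0 is dropped because its value is -2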

It would probably be best to do a flatMap:
import collection.breakOut

type Key = Int
val keyList = List(-1, 0, 1, 2, 3)
def computeValue(i: Int) = i * 2

val myMap: Map[Key, Int] =
  keyList.flatMap { key =>
    val v = computeValue(key)
    if (v >= 0) Some(key -> v)
    else None
  }(breakOut)
You can use collect
val myMap: Map[Key, Int] =
  keyList.collect {
    case key if computeValue(key) >= 0 => key -> computeValue(key)
  }(breakOut)
But that requires re-computing computeValue(key), which is wasteful. collect is better when you filter first and then map.
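collect does shine when the predicate only needs the key, so the value is computed once per kept element. A sketch, reusing the keyList/computeValue definitions from the flatMap example above:
// Filter on the key first, then map: computeValue runs only once per kept key
val nonNegativeKeys: Map[Key, Int] =
  keyList.collect {
    case key if key >= 0 => key -> computeValue(key)
  }(breakOut)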
Or make your own method:
import scala.collection.generic.CanBuildFrom
import scala.collection.TraversableLike

implicit class EnrichedWithMapfilter[A, Repr](val self: TraversableLike[A, Repr]) extends AnyVal {
  def maptofilter[B, That](f: A => B)(p: B => Boolean)(implicit bf: CanBuildFrom[Repr, (A, B), That]): That = {
    val b = bf(self.repr)
    b.sizeHint(self)
    for (x <- self) {
      val v = f(x)
      if (p(v))
        b += x -> v   // reuse the computed value instead of calling f(x) again
    }
    b.result
  }
}
val myMap: Map[Key, Int] = keyList.maptofilter(computeValue)(_ >= 0)(breakOut)


map on a TreeMap returns a Map and not a TreeMap in Scala

I am new to Scala, and I am implementing a TreeMap with a multidimensional key like this:
class dimSet(val d: Vector[Int]) extends IndexedSeq[Int] {
  def apply(idx: Int) = d(idx)
  def length: Int = d.length
}
…
var vals: TreeMap[dimSet, A] = TreeMap[dimSet, A]()(orddimSet)
I have this method
def appOp0(t: TreeMap[dimSet, A], t1: TreeMap[dimSet, A], op: (A, A) => A, unop: (A) => A): TreeMap[dimSet, A] = {
  if (t.isEmpty) t1.map((e: Tuple2[dimSet, A]) => (e._1, unop(e._2)))
  else if (t1.isEmpty) t.map((t: Tuple2[dimSet, A]) => (t._1, unop(t._2)))
  else {
    val h = t.head
    val h1 = t1.head
    if ((h._1) == (h1._1)) appOp0(t.tail, t1.tail, op, unop) + ((h._1, op(h._2, h1._2)))
    else if (orddimSet.compare(h._1, h1._1) == 1) appOp0(t, t1.tail, op, unop) + ((h1._1, unop(h1._2)))
    else appOp0(t.tail, t1, op, unop) + ((h._1, unop(h._2)))
  }
}
But the map method on the TreeMaps (second and third lines) returns a Map, not a TreeMap
I tried a simpler example on the REPL and got this:
scala> val t = TreeMap[dimSet, Double]( (new dimSet(Vector(1,1)), 5.1), (new dimSet(Vector(1,2)), 6.3), (new dimSet(Vector(3,1)), 7.1), (new dimSet(Vector(2,2)), 8.4)) (orddimSet)
scala> val tsq = t.map[(dimSet,Double), TreeMap[dimSet,Double]]((v:Tuple2[dimSet, Double]) => ((v._1, v._2 * v._2)))
<console>:41: error: Cannot construct a collection of type scala.collection.immutable.TreeMap[dimSet,Double] with elements of type (dimSet, Double) based on a collection of type scala.collection.immutable.TreeMap[dimSet,Double].
val tsq = t.map[(dimSet,Double), TreeMap[dimSet,Double]]((v:Tuple2[dimSet, Double]) => ((v._1, v._2 * v._2)))
^
scala> val tsq = t.map((v:Tuple2[dimSet, Double]) => ((v._1, v._2 * v._2)))
tsq: scala.collection.immutable.Map[dimSet,Double] = Map((1, 1) -> 26.009999999999998, (1, 2) -> 39.69, (2, 2) -> 70.56, (3, 1) -> 50.41)
I think CanBuildFrom cannot build my TreeMap as it can with other TreeMaps, but I couldn't find out why. What can I do to return a TreeMap?
Thanks
The problem probably is that there is no implicit Ordering[dimSet] available when you call map. That call requires a CanBuildFrom, which in turn requires an implicit Ordering for TreeMap keys: see the docs.
So make orddimSet implicitly available before calling map:
implicit val ev = orddimSet
if (t.isEmpty) t1.map((e:Tuple2[dimSet, A]) => (e._1, unop(e._2)))
Or you can make an Ordering[dimSet] always automatically implicitly available, if you define an implicit Ordering in dimSet's companion object:
object dimSet {
  implicit val orddimSet: Ordering[dimSet] = ??? // your code here
}
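For instance, one possible definition (my assumption: compare the wrapped Vectors lexicographically via the standard Iterable ordering; adapt if you need different semantics):
implicit val orddimSet: Ordering[dimSet] =
  Ordering.by((ds: dimSet) => ds.d.toIterable)(Ordering.Iterable[Int]) // lexicographic over the underlying Vector[Int]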

Two methods in one - Scala

Starting my first project with Scala: a poker framework.
So I have the following class
class Card(rank1: CardRank, suit1: Suit) {
  val rank = rank1
  val suit = suit1
}
And a Utils object which contains two methods that do almost the same thing: they count the number of cards of each rank or suit.
def getSuits(cards: List[Card]) = {
  def getSuits(cards: List[Card], suits: Map[Suit, Int]): Map[Suit, Int] = {
    if (cards.isEmpty)
      return suits
    val suit = cards.head.suit
    val value = if (suits.contains(suit)) suits(suit) + 1 else 1
    getSuits(cards.tail, suits + (suit -> value))
  }
  getSuits(cards, Map[Suit, Int]())
}

def getRanks(cards: List[Card]): Map[CardRank, Int] = {
  def getRanks(cards: List[Card], ranks: Map[CardRank, Int]): Map[CardRank, Int] = {
    if (cards.isEmpty)
      return ranks
    val rank = cards.head.rank
    val value = if (ranks.contains(rank)) ranks(rank) + 1 else 1
    getRanks(cards.tail, ranks + (rank -> value))
  }
  getRanks(cards, Map[CardRank, Int]())
}
Is there any way I can "unify" these two methods in a single one with "field/method-as-parameter"?
Thanks
Yes, that requires a higher-order function (that is, a function that takes a function as a parameter) plus type parameters/genericity:
def groupAndCount[A, B](elements: List[A], toCount: A => B): Map[B, Int] = {
  // could be your implementation above (see the filled-in sketch below),
  // just with `key` instead of suit/rank: change `val suit = ...` or `val rank = ...`
  // to `val key = toCount(elements.head)`
}
then
def getSuits(cards: List[Card]) = groupAndCount(cards, {c : Card => c.suit})
def getRanks(cards: List[Card]) = groupAndCount(cards, {c: Card => c.rank})
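The skeleton above, filled in along the lines of the original getSuits (a sketch based on the question's code, not the answerer's exact implementation):
def groupAndCount[A, B](elements: List[A], toCount: A => B): Map[B, Int] = {
  // same recursion as getSuits/getRanks, generalised over the key-extraction function
  def loop(rest: List[A], counts: Map[B, Int]): Map[B, Int] =
    if (rest.isEmpty) counts
    else {
      val key = toCount(rest.head)
      val value = if (counts.contains(key)) counts(key) + 1 else 1
      loop(rest.tail, counts + (key -> value))
    }
  loop(elements, Map[B, Int]())
}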
You do not need the type parameter A; you could force the method to work only on Card, but that would be a pity.
For extra credit, you can use two parameter lists, and have
def groupAndCount[A,B](elements: List[A])(toCount: A => B): Map[B, Int] = ...
That is a little peculiarity of Scala's type inference: with two parameter lists, you do not need to annotate the card argument when defining the function:
def getSuits(cards: List[Card]) = groupAndCount(cards)(c => c.suit)
or just
def getSuits(cards: List[Card]) = groupAndCount(cards)(_.suit)
Of course, the library can help you with the implementation
def groupAndCount[A, B](l: List[A])(toCount: A => B): Map[B, Int] =
  l.groupBy(toCount).map { case (k, elems) => (k, elems.length) }
although a hand made implementation might be marginally faster.
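As a quick sanity check with plain Ints (my own example, applied to the curried, groupBy-based version above):
groupAndCount(List(1, 2, 2, 3))(x => x) // Map(1 -> 1, 2 -> 2, 3 -> 1)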
A minor note: Card should be declared as a case class:
case class Card(rank: CardRank, suit: Suit)
// declaration done, nothing else needed

Map whose keys are compared using eq [duplicate]

I have an object which stores information about specific instances. For that, I would like to use a Map, but the keys are not compared by reference (they aren't, right?) but via the hashes provided by the hashCode method. For better understanding:
import collection.mutable._
import java.util.Random

object Foo {
  var myMap = HashMap[AnyRef, Int]()

  def doSomething(ar: AnyRef): Int = {
    myMap.get(ar) match {
      case Some(x) => x
      case None => {
        myMap += ar -> new Random().nextInt()
        doSomething(ar)
      }
    }
  }
}

object Main {
  def main(args: Array[String]) {
    case class ExampleClass(x: String)
    val o1 = ExampleClass("test1")
    val o2 = ExampleClass("test1")
    println(o2 == o1) // true
    println(o2 eq o1) // false
    // I want the following two lines to yield different numbers,
    // and I do not have control over the classes; messing with their
    // equals implementation is not possible.
    println(Foo.doSomething(o1))
    println(Foo.doSomething(o2))
  }
}
In cases where I have instances with the same hash code, the "caching" of the random value will return the same value for both instances even though they are not the same object. Which data structure is best used in this situation?
Clarification/Edit
I know how this works normally, based on the hashCode and equals methods. But that is exactly what I want to avoid. I updated my example to make that clearer. :)
EDIT: Based on clarifications to the question, you can create your own Map implementation, and override elemEquals().
The original implementation (in HashMap)
protected def elemEquals(key1: A, key2: A): Boolean = (key1 == key2)
Change this to:
protected def elemEquals(key1: A, key2: A): Boolean = (key1 eq key2)
class MyHashMap[A <: AnyRef, B] extends scala.collection.mutable.HashMap[A, B] {
  protected override def elemEquals(key1: A, key2: A): Boolean = (key1 eq key2)
}
Note that to use eq, you need to restrict the key to be an AnyRef, or do a match in the elemEquals() method.
case class Foo(i: Int)

val f1 = new Foo(1)
val f2 = new Foo(1)

val map = new MyHashMap[Foo, String]()
map += (f1 -> "f1")
map += (f2 -> "f2")

map.get(f1) // Some("f1")
map.get(f2) // Some("f2")
--
Original answer
Map works with hashCode() and equals(). Have you implemented equals() correctly in your objects? Note that in Scala, == gets translated to a call to equals(). To get the behaviour of Java's ==, use the Scala operator eq:
case class Foo(i: Int)

val f1 = new Foo(1)
val f2 = new Foo(1)
f1 == f2      // true
f1.equals(f2) // true
f1 eq f2      // false

val map = Map(f1 -> "f1", f2 -> "f2")
map.get(f1) // Some("f2")
map.get(f2) // Some("f2")
Here, the case class implements equals() as object equivalence, in this case:
f1.i == f2.i
You need to override equals() in your objects to use reference equality, i.e. something like:
override def equals(o: Any) = { o.asInstanceOf[AnyRef] eq this }
This should still work with the same hashCode().
You can also use IdentityHashMap together with scala.collection.JavaConversions.
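A minimal sketch of that route (my own illustration, reusing ExampleClass from the question and the pre-2.13 JavaConversions wrapper):
import java.util.IdentityHashMap
import scala.collection.JavaConversions.mapAsScalaMap
import scala.collection.mutable

// IdentityHashMap compares keys with reference equality (eq), not equals()
val byIdentity: mutable.Map[AnyRef, Int] = new IdentityHashMap[AnyRef, Int]()

case class ExampleClass(x: String)
val o1 = ExampleClass("test1")
val o2 = ExampleClass("test1")
byIdentity(o1) = 1
byIdentity(o2) = 2
// byIdentity(o1) == 1 and byIdentity(o2) == 2: equal but distinct instances get separate entries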
Ah, based on the comment... You could use a wrapper that overrides equals to get reference semantics:
class EqWrap[T <: AnyRef](val value: T) {
  override def hashCode() = if (value == null) 0 else value.hashCode
  override def equals(a: Any) = a match {
    case ref: EqWrap[_] => ref.value eq value
    case _ => false
  }
}

object EqWrap {
  def apply[T <: AnyRef](t: T) = new EqWrap(t)
}

case class A(i: Int)

val x = A(0)
val y = A(0)

val map = Map[EqWrap[A], Int](EqWrap(x) -> 1)
val xx = map.get(EqWrap(x))
val yy = map.get(EqWrap(y))
// xx: Option[Int] = Some(1)
// yy: Option[Int] = None
Original answer (based on not understanding the question - I have to leave this so that the comment makes sense...)
Map already has this semantic (unless I don't understand your question).
scala> val x = A(0)
x: A = A(0)
scala> val y = A(0)
y: A = A(0)
scala> x == y
res0: Boolean = true // objects are equal
scala> x.hashCode
res1: Int = -2081655426
scala> y.hashCode
res2: Int = -2081655426 // same hash code
scala> x eq y
res3: Boolean = false // not the same object
scala> val map = Map(x -> 1)
map: scala.collection.immutable.Map[A,Int] = Map(A(0) -> 1)
scala> map(y)
res8: Int = 1 // return the mapping based on hash code and equal semantic

Is there such a thing as bidirectional maps in Scala?

I'd like to link 2 columns of unique identifiers and be able to get a first column value by a second column value as well as a second column value by a first column value. Something like
Map(1 <-> "one", 2 <-> "two", 3 <-> "three")
Is there such a facility in Scala?
Actually I need even more: 3 columns to select any in a triplet by another in a triplet (individual values will never be met more than once in the entire map). But a 2-column bidirectional map can help too.
Guava has a BiMap that you can use along with
import scala.collection.JavaConversions._
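For example, a minimal sketch of the Guava route (my own illustration; the JavaConversions import is only needed if you want to view the result as a Scala collection):
import com.google.common.collect.HashBiMap

val bi = HashBiMap.create[Int, String]()
bi.put(1, "one")
bi.put(2, "two")
bi.get(1)             // "one"
bi.inverse.get("one") // 1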
My BiMap approach:
object BiMap {
  private[BiMap] trait MethodDistinctor
  implicit object MethodDistinctor extends MethodDistinctor
}

case class BiMap[X, Y](map: Map[X, Y]) {
  def this(tuples: (X, Y)*) = this(tuples.toMap)
  private val reverseMap = map map (_.swap)
  require(map.size == reverseMap.size, "no 1 to 1 relation")
  def apply(x: X): Y = map(x)
  def apply(y: Y)(implicit d: BiMap.MethodDistinctor): X = reverseMap(y)
  val domain = map.keys
  val codomain = reverseMap.keys
}

val biMap = new BiMap(1 -> "A", 2 -> "B")
println(biMap(1))   // A
println(biMap("B")) // 2
Of course one can add syntax for <-> instead of ->.
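For instance, a small enrichment of my own (not part of the answer) that produces ordinary tuples, so BiMap's constructor can consume them:
implicit class BiArrow[X](val x: X) extends AnyVal {
  def <->[Y](y: Y): (X, Y) = x -> y
}

val biMap2 = new BiMap(1 <-> "A", 2 <-> "B")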
Here's a quick Scala wrapper for Guava's BiMap.
import com.google.common.{collect => guava}
import scala.collection.JavaConversions._
import scala.collection.mutable
import scala.language.implicitConversions

class MutableBiMap[A, B] private (
    private val g: guava.BiMap[A, B] = guava.HashBiMap.create[A, B]()) {
  def inverse: MutableBiMap[B, A] = new MutableBiMap[B, A](g.inverse)
}

object MutableBiMap {
  def empty[A, B]: MutableBiMap[A, B] = new MutableBiMap()
  implicit def toMap[A, B](x: MutableBiMap[A, B]): mutable.Map[A, B] = x.g
}
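A short usage sketch (my own, relying on the implicit toMap conversion above):
val m = MutableBiMap.empty[Int, String]
m(1) = "one"         // update goes through the implicit mutable.Map view
m(2) = "two"
m.inverse.get("two") // Some(2), again via the implicit conversion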
I have a really simple BiMap in Scala:
import scala.reflect.ClassTag

case class BiMap[A, B](elems: (A, B)*) {
  def groupBy[X, Y](pairs: Seq[(X, Y)]) = pairs groupBy {_._1} mapValues {_ map {_._2} toSet}
  val (left, right) = (groupBy(elems), groupBy(elems map {_.swap}))
  def apply(key: A) = left(key)
  def apply[C: ClassTag](key: B) = right(key)
}
Usage:
val biMap = BiMap(1 -> "x", 2 -> "y", 3 -> "x", 1 -> "y")
assert(biMap(1) == Set("x", "y"))
assert(biMap("x") == Set(1, 3))
I don't think it exists out of the box, because the generic behaviour is not easy to extract:
how do you handle values matching several keys in a clean API?
However, for specific cases here is a good exercise that might help. It would need to be updated, because no hashing is used, so getting a key or value is O(n).
But the idea is to let you write something similar to what you propose, only using Seq instead of Map...
With the help of implicits and traits, plus find, you can emulate what you need with a reasonably clean API (fromKey, fromValue).
The caveat is that a value is not supposed to appear in several places... in this implementation at least.
trait BiMapEntry[K, V] {
  def key: K
  def value: V
}

trait Sem[K] {
  def k: K
  def <->[V](v: V): BiMapEntry[K, V] = new BiMapEntry[K, V]() { val key = k; val value = v }
}

trait BiMap[K, V] {
  def fromKey(k: K): Option[V]
  def fromValue(v: V): Option[K]
}

object BiMap {
  implicit def fromInt(i: Int): Sem[Int] = new Sem[Int] {
    def k = i
  }
  implicit def fromSeq[K, V](s: Seq[BiMapEntry[K, V]]) = new BiMap[K, V] {
    def fromKey(k: K): Option[V] = s.find(_.key == k).map(_.value)
    def fromValue(v: V): Option[K] = s.find(_.value == v).map(_.key)
  }
}

object test extends App {
  import BiMap._

  val a = 1 <-> "a"
  val s = Seq(1 <-> "a", 2 <-> "b")

  println(s.fromKey(2))
  println(s.fromValue("a"))
}
Scala collections are immutable and values are stored by reference, not copied, so the memory footprint is only that of the references/pointers. It is therefore better to keep two maps, one keyed by type A and one keyed by type B (mapping to B and A respectively), than to swap a single map at run time. A swapping implementation has its own memory footprint: the newly swapped hash map also stays in memory until the enclosing callback finishes and the garbage collector runs. If the swap is needed frequently, you effectively use as much or more memory than the straightforward two-map implementation.
One more approach you can try with a single map (it only works for getting a key from a mapped value):
def getKeyByValue[A, B](map: Map[A, B], value: B): Option[A] =
  map.find { case (_, b) => b == value }.map(_._1)
Code for Scala implementation of find by key:
/** Find entry with given key in table, null if not found.
 */
@deprecatedOverriding("No sensible way to override findEntry as private findEntry0 is used in multiple places internally.", "2.11.0")
protected def findEntry(key: A): Entry =
  findEntry0(key, index(elemHashCode(key)))

private[this] def findEntry0(key: A, h: Int): Entry = {
  var e = table(h).asInstanceOf[Entry]
  while (e != null && !elemEquals(e.key, key)) e = e.next
  e
}

abstracting over a collection

Recently I wrote an iterator for the cartesian product of Anys. I started with a List of Lists, but recognized that I could easily switch to the more abstract trait Seq.
I know, you like to see the code. :)
class Cartesian(val ll: Seq[Seq[_]]) extends Iterator[Seq[_]] {
  def combicount: Int = (1 /: ll) (_ * _.length)

  val last = combicount
  var iter = 0

  override def hasNext(): Boolean = iter < last
  override def next(): Seq[_] = {
    val res = combination(ll, iter)
    iter += 1
    res
  }

  def combination(xx: Seq[Seq[_]], i: Int): List[_] = xx match {
    case Nil     => Nil
    case x :: xs => x(i % x.length) :: combination(xs, i / x.length)
  }
}
And a client of that class:
object Main extends Application {
  val illi = new Cartesian(List("abc".toList, "xy".toList, "AB".toList))
  // val ivvi = new Cartesian(Vector(Vector(1, 2, 3), Vector(10, 20)))
  val issi = new Cartesian(Seq(Seq(1, 2, 3), Seq(10, 20)))
  // val iaai = new Cartesian(Array(Array(1, 2, 3), Array(10, 20)))

  (0 to 5).foreach(dummy => println(illi.next()))
  // (0 to 5).foreach(dummy => println(issi.next()))
}
/*
List(a, x, A)
List(b, x, A)
List(c, x, A)
List(a, y, A)
List(b, y, A)
List(c, y, A)
*/
The code works well for Seqs and Lists (which are Seqs), but of course not for Arrays or Vectors, which can't be deconstructed with the cons pattern '::'.
But the logic could be used for such collections too.
I could try to write implicit conversions to and from Seq for Vector, Array, and the like, or write my own, similar implementation, or write a wrapper which transforms the collection to a Seq of Seqs, calls 'hasNext' and 'next' on the inner collections, and converts the result back to an Array, Vector or whatever. (I tried to implement such workarounds, but I have to admit: it's not that easy. For a real-world problem I would probably rewrite the iterator independently.)
However, the whole thing gets a bit out of control if I have to deal with Arrays of Lists or Lists of Arrays and other mixed cases.
What would be the most elegant way to write the algorithm in the broadest possible way?
There are two solutions. The first is to not require the containers to be a subclass of some generic super class, but to be convertible to one (by using implicit function arguments). If the container is already a subclass of the required type, there's a predefined identity conversion which only returns it.
import collection.mutable.Builder
import collection.TraversableLike
import collection.SeqLike
import collection.generic.CanBuildFrom

class Cartesian[T, ST[T], TT[S]](val ll: TT[ST[T]])(
    implicit cbf: CanBuildFrom[Nothing, T, ST[T]],
    seqLike: ST[T] => SeqLike[T, ST[T]],
    traversableLike: TT[ST[T]] => TraversableLike[ST[T], TT[ST[T]]]
) extends Iterator[ST[T]] {

  def combicount(): Int = (1 /: ll) (_ * _.length)

  val last = combicount - 1
  var iter = 0

  override def hasNext(): Boolean = iter < last
  override def next(): ST[T] = {
    val res = combination(ll, iter, cbf())
    iter += 1
    res
  }

  def combination(xx: TT[ST[T]], i: Int, builder: Builder[T, ST[T]]): ST[T] =
    if (xx.isEmpty) builder.result
    else combination(xx.tail, i / xx.head.length, builder += xx.head(i % xx.head.length))
}
This sort of works:
scala> new Cartesian[String, Vector, Vector](Vector(Vector("a"), Vector("xy"), Vector("AB")))
res0: Cartesian[String,Vector,Vector] = empty iterator
scala> new Cartesian[String, Array, Array](Array(Array("a"), Array("xy"), Array("AB")))
res1: Cartesian[String,Array,Array] = empty iterator
I needed to explicitly pass the types because of bug https://issues.scala-lang.org/browse/SI-3343
One thing to note is that this is better than using existential types, because calling next on the iterator returns the right type, and not Seq[Any].
There are several drawbacks here:
If the container is not a subclass of the required type, it is converted to one, which costs in performance
The algorithm is not completely generic. We need types to be converted to SeqLike or TraversableLike only to use a subset of functionality these types offer. So making a conversion function can be tricky.
What if some capabilities can be interpreted differently in different contexts? For example, a rectangle has two 'length' properties (width and height)
Now for the alternative solution. We note that we don't actually care about the types of collections, just their capabilities:
TT should have foldLeft, get(i: Int) (to get head/tail)
ST should have length, get(i: Int) and a Builder
So we can encode these:
import scala.collection.SeqLike

trait HasGet[T, CC[_]] {
  def get(cc: CC[T], i: Int): T
}

object HasGet {
  implicit def seqLikeHasGet[T, CC[X] <: SeqLike[X, _]] = new HasGet[T, CC] {
    def get(cc: CC[T], i: Int): T = cc(i)
  }
  implicit def arrayHasGet[T] = new HasGet[T, Array] {
    def get(cc: Array[T], i: Int): T = cc(i)
  }
}

trait HasLength[CC] {
  def length(cc: CC): Int
}

object HasLength {
  implicit def seqLikeHasLength[CC <: SeqLike[_, _]] = new HasLength[CC] {
    def length(cc: CC) = cc.length
  }
  implicit def arrayHasLength[T] = new HasLength[Array[T]] {
    def length(cc: Array[T]) = cc.length
  }
}

trait HasFold[T, CC[_]] {
  def foldLeft[A](cc: CC[T], zero: A)(op: (A, T) => A): A
}

object HasFold {
  implicit def seqLikeHasFold[T, CC[X] <: SeqLike[X, _]] = new HasFold[T, CC] {
    def foldLeft[A](cc: CC[T], zero: A)(op: (A, T) => A): A = cc.foldLeft(zero)(op)
  }
  implicit def arrayHasFold[T] = new HasFold[T, Array] {
    def foldLeft[A](cc: Array[T], zero: A)(op: (A, T) => A): A = {
      var i = 0
      var result = zero
      while (i < cc.length) {
        result = op(result, cc(i))
        i += 1
      }
      result
    }
  }
}
(Strictly speaking, HasFold is not required since it could be implemented in terms of length and get, but I added it here so the algorithm translates more cleanly.)
Now the algorithm is:
class Cartesian[T, ST[_], TT[Y]](val ll: TT[ST[T]])(
    implicit cbf: CanBuildFrom[Nothing, T, ST[T]],
    stHasLength: HasLength[ST[T]],
    stHasGet: HasGet[T, ST],
    ttHasFold: HasFold[ST[T], TT],
    ttHasGet: HasGet[ST[T], TT],
    ttHasLength: HasLength[TT[ST[T]]]
) extends Iterator[ST[T]] {

  def combicount(): Int = ttHasFold.foldLeft(ll, 1)((a, l) => a * stHasLength.length(l))

  val last = combicount - 1
  var iter = 0

  override def hasNext(): Boolean = iter < last
  override def next(): ST[T] = {
    val res = combination(ll, 0, iter, cbf())
    iter += 1
    res
  }

  def combination(xx: TT[ST[T]], j: Int, i: Int, builder: Builder[T, ST[T]]): ST[T] =
    if (ttHasLength.length(xx) == j) builder.result
    else {
      val head = ttHasGet.get(xx, j)
      val headLength = stHasLength.length(head)
      combination(xx, j + 1, i / headLength, builder += stHasGet.get(head, i % headLength))
    }
}
And use:
scala> new Cartesian[String, Vector, List](List(Vector("a"), Vector("xy"), Vector("AB")))
res6: Cartesian[String,Vector,List] = empty iterator
scala> new Cartesian[String, Array, Array](Array(Array("a"), Array("xy"), Array("AB")))
res7: Cartesian[String,Array,Array] = empty iterator
Scalaz probably has all of this predefined for you, unfortunately, I don't know it well.
(again I need to pass the types because inference doesn't infer the right kind)
The benefit is that the algorithm is now completely generic and that there is no need for implicit conversions from Array to WrappedArray in order for it to work
Exercise: define it for tuples ;-)