How to compute inverse of a multi-map - scala

I have a Scala Map:
x: [b,c]
y: [b,d,e]
z: [d,f,g,h]
I want inverse of this map for look-up.
b: [x,y]
c: [x]
d: [y,z] and so on.
Is there a way to do it without using intermediate mutable maps?
If it's not a multi-map, the following works:
typeMap.flatMap { case (k, v) => v.map(vv => (vv, k))}

EDIT: fixed answer to include what Marth rightfully pointed out. My answer is a bit lengthier than his, as I go through each step for educational purposes rather than relying on the magic provided by flatMap; his is more straightforward :)
I'm unsure about your notation. I assume that what you have is something like:
val myMap = Map[T, Set[T]] (
x -> Set(b, c),
y -> Set(b, d, e),
z -> Set(d, f, g, h)
)
You can achieve the reverse lookup as follows:
val instances = for {
keyValue <- myMap.toList
value <- keyValue._2
}
yield (value, keyValue._1)
At this point, your instances variable is a List of the type:
(b, x), (c, x), (b, y) ...
If you now do:
val groupedLookups = instances.groupBy(_._1)
You get:
b -> ((b, x), (b, y)),
c -> ((c, x)),
d -> ((d, y), (d, z)) ...
Now we want to reduce the values so that they only contain the second part of each pair. Therefore we do:
val reverseLookup = groupedLookups.map(kv => kv._1 -> kv._2.map(_._2))
Which means that for every pair we maintain the original key, but we map the list of arguments to something that only has the second value of the pair.
And there you have your result.
(You can also avoid assigning to an intermediate result, but I thought it was clearer like this)

Here is my simplification as a function:
def reverseMultimap[T1, T2](map: Map[T1, Seq[T2]]): Map[T2, Seq[T1]] =
map.toSeq
.flatMap { case (k, vs) => vs.map((_, k)) }
.groupBy(_._1)
.mapValues(_.map(_._2))
The above was derived from @Diego Martinoia's answer, corrected and reproduced below in function form:
def reverseMultimap[T1, T2](myMap: Map[T1, Seq[T2]]): Map[T2, Seq[T1]] = {
val instances = for {
keyValue <- myMap.toList
value <- keyValue._2
} yield (value, keyValue._1)
val groupedLookups = instances.groupBy(_._1)
val reverseLookup = groupedLookups.map(kv => kv._1 -> kv._2.map(_._2))
reverseLookup
}
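On Scala 2.13 or later, the grouping and the projection can be done in one step with groupMap, which also avoids mapValues (a lazy view in that version). A minimal sketch, assuming 2.13+; the name reverseMultimap213 is only for illustration:

```scala
// Assumes Scala 2.13+, where groupMap is available on collections.
def reverseMultimap213[T1, T2](map: Map[T1, Seq[T2]]): Map[T2, Seq[T1]] =
  map.toSeq
    .flatMap { case (k, vs) => vs.map(v => (v, k)) } // flatten to (value, key) pairs
    .groupMap(_._1)(_._2)                            // group by value, keep only the keys

val inverted = reverseMultimap213(Map("x" -> Seq("b", "c"), "y" -> Seq("b", "d")))
```

Here inverted maps "b" to both "x" and "y", while "c" and "d" each map to a single key.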

Related

Merge two LinkedHashMap in Scala

Having this code
def mergeWith[K, X, Y, Z](xs: mutable.LinkedHashMap[K, X], ys: mutable.LinkedHashMap[K, Y])(f: (X, Y) => Z): mutable.LinkedHashMap[K, Z] =
xs.flatMap {
case (k, x) => ys.get(k).map(k -> f(x, _))
}
it gives me this:
val map1 = LinkedHashMap(4 -> (4), 7 -> (4,7))
val map2 = LinkedHashMap(3 -> (3), 6 -> (3,6), 7 -> (3,7))
val merged = mergeWith(map1,map2){ (x, y) => (x, y) }
merged: scala.collection.mutable.LinkedHashMap[Int,(Any, Any)] = Map(7 -> ((4,7),(3,7)))
But what I want is this:
merged: scala.collection.mutable.LinkedHashMap[Int,(Any, Any)] = Map(3 -> (3), 4 -> (4), 6 -> (3,6), 7 -> ((4,7),(3,7)))
How to modify my code to obtain it?
It can't be done with the current mergeWith() signature. In particular, you're trying to create a LinkedHashMap[K,Z] but there is no Z input. The only way to get a Z is to invoke f() which requires both X and Y as passed parameters.
So if xs is type LinkedHashMap[Int,Char] and has element (2 -> 'w'), and ys is type LinkedHashMap[Int,Long] and has element (8 -> 4L), how are you going to invoke f(c:Char, l:Long) so that you have a [K,Z] entry for both keys 2 and 8? Not possible.
If the mergeWith() signature can be simplified you might do something like this.
def mergeWith[K,V](xs: collection.mutable.LinkedHashMap[K, V]
,ys: collection.mutable.LinkedHashMap[K, V]
)(f: (V, V) => V): collection.mutable.LinkedHashMap[K,V] = {
val ns = collection.mutable.LinkedHashMap[K,V]()
(xs.keySet ++ ys.keySet).foreach{ k =>
if (!xs.isDefinedAt(k)) ns.update(k, ys(k))
else if (!ys.isDefinedAt(k)) ns.update(k, xs(k))
else ns.update(k, f(xs(k), ys(k)))
}
ns
}
This produces the desired result for the example you've given, but it has a number of undesirable qualities, not the least of which is the mutable data structures.
BTW, parentheses around a single value do not create a Tuple1, so (4) is the same thing as 4. And whenever you see type Any, it's a pretty good sign that your design needs a re-think.
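If the mutable structures are the main concern, the simplified signature can also be written as a fold over an immutable Map. This is only a sketch: mergeWithImmutable is a hypothetical name, and it uses a plain Map, so it makes no attempt to reproduce any particular key ordering.

```scala
// Hypothetical immutable variant of the simplified mergeWith.
def mergeWithImmutable[K, V](xs: Map[K, V], ys: Map[K, V])(f: (V, V) => V): Map[K, V] =
  ys.foldLeft(xs) { case (acc, (k, y)) =>
    // Combine with f when the key is already present, otherwise just add the entry.
    acc.updated(k, acc.get(k).map(x => f(x, y)).getOrElse(y))
  }

val mergedSums = mergeWithImmutable(Map(4 -> 4, 7 -> 11), Map(3 -> 3, 6 -> 9, 7 -> 10))(_ + _)
```

Keys found in only one map are carried over unchanged; only the shared key 7 goes through f.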

Scala type mismatch when adding elements to hashmap

I am representing a graph's adjacency list in Scala in the variable a.
val a = new HashMap[Int, Vector[Tuple2[Int, Int]]] withDefaultValue Vector.empty
for(i <- 1 to N) {
val Array(x, y, r) = readLine.split(" ").map(_.toInt)
a(x) += new Tuple2(y, r)
a(y) += new Tuple2(x, r)
}
I am reading each edge in turn(x and y are nodes, while r is the cost of the edge). After reading it, I am adding it to the adjacency list.
However, when adding the Tuples containing a neighbouring node and a cost to the HashMap I get:
Solution.scala:17: error: type mismatch;
found : (Int, Int)
required: String
a(x) += new Tuple2(y, r)
I don't understand why it wants String. I haven't specified String anywhere.
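a(x) += ... expands to a(x) = a(x) + ..., and Vector has no + method, so the compiler falls back to String concatenation, which is why it asks for a String.
You would probably want to do something like: a.update(x, a.getOrElse(x, Vector()) :+ ((y, r))).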
Also, you are writing Java code in Scala. It compiles, but amounts to abuse of the language :/
Consider doing something like this next time:
val a = (1 to N)
.map { _ => readLine.split(" ").map (_.toInt) }
.flatMap { case Array(x, y, r) =>
Seq(x -> (y, r), y -> (x, r))
}
.groupBy(_._1)
.mapValues { _.map ( _._2) }
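The same adjacency list can also be built with a fold over an immutable Map that has a default value, which stays close to the shape of the original loop. A minimal sketch, assuming the edges arrive as a sequence of "x y r" strings instead of being read from standard input; buildAdjacency is a hypothetical helper:

```scala
// Hypothetical helper: builds an undirected adjacency list from "x y r" lines.
def buildAdjacency(lines: Seq[String]): Map[Int, Vector[(Int, Int)]] = {
  val empty: Map[Int, Vector[(Int, Int)]] =
    Map.empty[Int, Vector[(Int, Int)]].withDefaultValue(Vector.empty)
  lines.foldLeft(empty) { (acc, line) =>
    val Array(x, y, r) = line.trim.split(" ").map(_.toInt)
    // :+ appends to the Vector; the double parentheses pass the tuple as one argument.
    val withX = acc.updated(x, acc(x) :+ ((y, r)))
    withX.updated(y, withX(y) :+ ((x, r)))
  }
}

val adj = buildAdjacency(Seq("1 2 5", "2 3 7"))
```

Each edge is recorded under both endpoints, so node 2 ends up with neighbours 1 and 3.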

Scala: How to map a subset of a seq to a shorter seq

I am trying to map a subset of a sequence using another (shorter) sequence while preserving the elements that are not in the subset. A toy example below tries to give a flower to females only:
def giveFemalesFlowers(people: Seq[Person], flowers: Seq[Flower]): Seq[Person] = {
require(people.count(_.isFemale) == flowers.length)
magic(people, flowers)(_.isFemale)((p, f) => p.withFlower(f))
}
def magic(people: Seq[Person], flowers: Seq[Flower])(predicate: Person => Boolean)
(mapping: (Person, Flower) => Person): Seq[Person] = ???
Is there an elegant way to implement the magic?
Use an iterator over flowers, consume one each time the predicate holds; the code would look like this,
val it = flowers.iterator
people.map ( p => if (predicate(p)) p.withFlower(it.next()) else p )
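Filled in as a complete magic, the iterator idea might look like the sketch below. Person and Flower are hypothetical stand-ins for the question's types, and the hasNext guard is an addition so that a shortage of flowers leaves the remaining people unchanged instead of throwing.

```scala
// Person and Flower are hypothetical stand-ins for the question's types.
final case class Flower(name: String)
final case class Person(name: String, isFemale: Boolean, flower: Option[Flower] = None) {
  def withFlower(f: Flower): Person = copy(flower = Some(f))
}

def magic(people: Seq[Person], flowers: Seq[Flower])(
    predicate: Person => Boolean)(
    mapping: (Person, Flower) => Person): Seq[Person] = {
  val it = flowers.iterator
  // toList forces a strict, in-order traversal so the iterator is consumed predictably.
  people.toList.map(p => if (predicate(p) && it.hasNext) mapping(p, it.next()) else p)
}

val withFlowers = magic(
  Seq(Person("Al", isFemale = false), Person("Bea", isFemale = true), Person("Cy", isFemale = true)),
  Seq(Flower("rose"), Flower("tulip"))
)(_.isFemale)((p, f) => p.withFlower(f))
```

The iterator is hidden mutable state, so this relies on map visiting elements once and in order; that holds for strict collections but is worth keeping in mind for lazy ones.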
What about zip (aka zipWith) ?
scala> val people = List("m","m","m","f","f","m","f")
people: List[String] = List(m, m, m, f, f, m, f)
scala> val flowers = List("f1","f2","f3")
flowers: List[String] = List(f1, f2, f3)
scala> def comb(xs:List[String],ys:List[String]):List[String] = (xs,ys) match {
| case (x :: xs, y :: ys) if x=="f" => (x+y) :: comb(xs,ys)
| case (x :: xs,ys) => x :: comb(xs,ys)
| case (Nil,Nil) => Nil
| }
scala> comb(people, flowers)
res1: List[String] = List(m, m, m, ff1, ff2, m, ff3)
If the order is not important, you can get this elegant code:
scala> val (men,women) = people.partition(_=="m")
men: List[String] = List(m, m, m, m)
women: List[String] = List(f, f, f)
scala> men ++ (women,flowers).zipped.map(_+_)
res2: List[String] = List(m, m, m, m, ff1, ff2, ff3)
I am going to presume you want to retain all the starting people (not simply filter out the females and lose the males), and in the original order, too.
Hmm, bit ugly, but what I came up with was:
def giveFemalesFlowers(people: Seq[Person], flowers: Seq[Flower]): Seq[Person] = {
require(people.count(_.isFemale) == flowers.length)
people.foldLeft((List[Person]() -> flowers)){ (acc, p) => p match {
case pp: Person if pp.isFemale => ( (pp.withFlower(acc._2.head) :: acc._1) -> acc._2.tail)
case pp: Person => ( (pp :: acc._1) -> acc._2)
} }._1.reverse
}
Basically, a fold-left, initialising the 'accumulator' with a pair made up of an empty list of people and the full list of flowers, then cycling through the people passed in.
If the current person is female, pass it the head of the current list of flowers (field 2 of the 'accumulator'), then set the updated accumulator to be the updated person prepended to the (growing) list of processed people, and the tail of the (shrinking) list of flowers.
If male, just prepend to the list of processed people, leaving the flowers unchanged.
By the end of the fold, field 2 of the 'accumulator' (the flowers) should be an empty list, while field one holds all the people (with any females having each received their own flower), in reverse order, so finish with ._1.reverse
Edit: attempt to clarify the code (and substitute a test more akin to @elm's to replace the match, too) - hope that makes it clearer what is going on, @Felix! (and no, no offence taken):
def giveFemalesFlowers(people: Seq[Person], flowers: Seq[Flower]): Seq[Person] = {
require(people.count(_.isFemale) == flowers.length)
val start: (List[Person], Seq[Flower]) = (List[Person](), flowers)
val result: (List[Person], Seq[Flower]) = people.foldLeft(start){ (acc, p) =>
val (pList, fList) = acc
if (p.isFemale) {
(p.withFlower(fList.head) :: pList, fList.tail)
} else {
(p :: pList, fList)
}
}
result._1.reverse
}
I'm obviously missing something but isn't it just
people map {
case p if p.isFemale => p.withFlower(f)
case p => p
}

Idiomatic Scala for applying functions in a chain if Option(s) are defined

Is there a pre-existing / Scala-idiomatic / better way of accomplishing this?
def sum(x: Int, y: Int) = x + y
var x = 10
x = applyOrBypass(target=x, optValueToApply=Some(22), sum)
x = applyOrBypass(target=x, optValueToApply=None, sum)
println(x) // will be 32
My applyOrBypass could be defined like this:
def applyOrBypass[A, B](target: A, optValueToApply: Option[B], func: (A, B) => A) = {
optValueToApply map { valueToApply =>
func(target, valueToApply)
} getOrElse {
target
}
}
Basically I want to apply operations depending on whether certain Option values are defined or not. If they are not, I should get the pre-existing value. Ideally I would like to chain these operations without having to use a var.
My intuition tells me that folding or reducing would be involved, but I am not sure how it would work. Or maybe there is another approach with monadic-fors...
Any suggestions / hints appreciated!
Scala has a way to do this with for comprehensions (the syntax is similar to Haskell's do notation, if you are familiar with it):
(for( v <- optValueToApply )
yield func(target, v)).getOrElse(target)
Of course, this is more useful if you have several variables that you want to check the existence of:
(for( v1 <- optV1
; v2 <- optV2
; v3 <- optV3
) yield func(target, v1, v2, v3)).getOrElse(target)
If you are trying to accumulate a value over a list of options, then I would recommend a fold, so your optional sum would look like this:
val vs = List(Some(1), None, None, Some(2), Some(3))
(target /: vs) ( (x, v) => x + v.getOrElse(0) )
// => 6 + target
You can generalise this, under the condition that your operation func has some identity value, identity:
(target /: vs) ( (x, v) => func(x, v.getOrElse(identity)) )
Mathematically speaking this condition is that (func, identity) forms a Monoid. But that's by-the-by. The actual effect is that whenever a None is reached, applying func to it and x will always produce x, (None's are ignored, and Some values are unwrapped and applied as normal), which is what you want.
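When func has no convenient identity value, one alternative is to drop the None entries first and fold over what remains. A minimal sketch; applyAll is a hypothetical name, not something from the standard library:

```scala
// Hypothetical helper: applies func once for every defined option, in order.
def applyAll[A, B](target: A, optValues: Seq[Option[B]])(func: (A, B) => A): A =
  optValues.flatten.foldLeft(target)(func) // flatten removes the None entries

def sum(x: Int, y: Int): Int = x + y
val total = applyAll(10, Seq(Some(22), None))(sum)
```

This reproduces the question's example (10 plus the single defined value 22) without a var.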
What I would do in a case like this is use partially applied functions and identity:
def applyOrBypass[A, B](optValueToApply: Option[B], func: B => A => A): A => A =
optValueToApply.map(func).getOrElse(identity)
You would apply it like this:
def sum(x: Int)(y: Int) = x + y
var x = 10
x = applyOrBypass(optValueToApply=Some(22), sum)(x)
x = applyOrBypass(optValueToApply=None, sum)(x)
println(x)
Yes, you can use fold. If you have multiple optional operands, there are some useful abstractions in the Scalaz library I believe.
var x = 10
x = Some(22).fold(x)(sum(_, x))
x = None .fold(x)(sum(_, x))
If you have multiple functions, it can be done with Scalaz.
There are several ways to do it, but here is one of the most concise.
First, add your imports:
import scalaz._, Scalaz._
Then, create your functions (this way isn't worth it if your functions are always the same, but if they are different, it makes sense)
val s = List(Some(22).map((i: Int) => (j: Int) => sum(i,j)),
None .map((i: Int) => (j: Int) => multiply(i,j)))
Finally, apply them all:
(s.flatten.foldMap(Endo(_)))(x)
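Without Scalaz, a similar effect can be had from the standard library's Function.chain, which applies a sequence of A => A functions in list order. This is a sketch rather than a drop-in replacement for the Endo version; sum and multiply are defined here only so that the example is self-contained.

```scala
def sum(i: Int, j: Int): Int = i + j
def multiply(i: Int, j: Int): Int = i * j

// Each defined option contributes one Int => Int step; None contributes nothing.
val steps: List[Option[Int => Int]] = List(
  Some(22).map(i => (j: Int) => sum(i, j)),
  Option.empty[Int].map(i => (j: Int) => multiply(i, j))
)

val combined: Int => Int = Function.chain(steps.flatten)
val chained = combined(10)
```

Only the sum step survives flatten here, so the starting value 10 has 22 added to it.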

Functional equivalent of if (p(f(a), f(b))) a else b

I'm guessing that there must be a better functional way of expressing the following:
def foo(i: Any) : Int
if (foo(a) < foo(b)) a else b
So in this example f == foo and p == _ < _. There's bound to be some masterful cleverness in scalaz for this! I can see that using BooleanW I can write:
p(f(a), f(b)).option(a).getOrElse(b)
But I was sure that I would be able to write some code which only referred to a and b once. If this exists it must be on some combination of Function1W and something else but scalaz is a bit of a mystery to me!
EDIT: I guess what I'm asking here is not "how do I write this?" but "What is the correct name and signature for such a function and does it have anything to do with FP stuff I do not yet understand like Kleisli, Comonad etc?"
Just in case it's not in Scalaz:
def x[T,R](f : T => R)(p : (R,R) => Boolean)(x : T*) =
x reduceLeft ((l, r) => if(p(f(l),f(r))) r else l)
scala> x(Math.pow(_ : Int,2))(_ < _)(-2, 0, 1)
res0: Int = -2
Alternative with some overhead but nicer syntax.
class MappedExpression[T,R](i : (T,T), m : (R,R)) {
def select(p : (R,R) => Boolean ) = if(p(m._1, m._2)) i._1 else i._2
}
class Expression[T](i : (T,T)){
def map[R](f: T => R) = new MappedExpression(i, (f(i._1), f(i._2)))
}
implicit def tupleTo[T](i : (T,T)) = new Expression(i)
scala> ("a", "bc") map (_.length) select (_ < _)
res0: java.lang.String = a
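For the specific case in the question, where p is _ < _, the standard library's minBy already names a and b only once. The sketch below shows that alongside a generic helper; selectBy is a hypothetical name, and foo is redefined on String only to keep the example self-contained.

```scala
def foo(s: String): Int = s.length

// Generic form: f and p as in the question title; a and b are each named once at the call site.
def selectBy[A, B](a: A, b: A)(f: A => B)(p: (B, B) => Boolean): A =
  if (p(f(a), f(b))) a else b

val viaSelect = selectBy("a", "bc")(foo)(_ < _)
val viaMinBy = Seq("a", "bc").minBy(foo)
```

The two forms agree when the mapped values differ; when foo(a) == foo(b) the original expression returns b, and minBy may not, so check tie behaviour if it matters.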
I don't think that Arrows or any other special type of computation can be useful here. After all, you're calculating with normal values, and you can usually lift a pure computation into the special type of computation (using arr for arrows or return for monads).
However, one very simple arrow is the plain function: arr a b is simply a function a -> b. You could then use arrows to split your code into more primitive operations. However, there is probably no reason for doing that and it only makes your code more complicated.
You could for example lift the call to foo so that it is done separately from the comparison. Here is a simple definition of arrows in F# - it declares *** and >>> arrow combinators and also arr for turning pure functions into arrows:
type Arr<'a, 'b> = Arr of ('a -> 'b)
let arr f = Arr f
let ( *** ) (Arr fa) (Arr fb) = Arr (fun (a, b) -> (fa a, fb b))
let ( >>> ) (Arr fa) (Arr fb) = Arr (fa >> fb)
Now you can write your code like this:
let calcFoo = arr <| fun a -> (a, foo a)
let compareVals = arr <| fun ((a, fa), (b, fb)) -> if fa < fb then a else b
(calcFoo *** calcFoo) >>> compareVals
The *** combinator takes a pair of inputs and applies the first function to the first element and the second function to the second. >>> then composes this arrow with the one that does the comparison.
But as I said - there is probably no reason at all for writing this.
Here's the Arrow based solution, implemented with Scalaz. This requires trunk.
You don't get a huge win from using the arrow abstraction with plain old functions, but it is a good way to learn them before moving to Kleisli or Cokleisli arrows.
import scalaz._
import Scalaz._
def mod(n: Int)(x: Int) = x % n
def mod10 = mod(10) _
def first[A, B](pair: (A, B)): A = pair._1
def selectBy[A](p: (A, A))(f: (A, A) => Boolean): A = if (f.tupled(p)) p._1 else p._2
def selectByFirst[A, B](f: (A, A) => Boolean)(p: ((A, B), (A, B))): (A, B) =
selectBy(p)(f comap first) // comap adapts the input to f with function first.
val pair = (7, 16)
// Using the Function1 arrow to apply two functions to a single value, resulting in a Tuple2
((mod10 &&& identity) apply 16) assert_≟ (6, 16)
// Using the Function1 arrow to perform mod10 and identity respectively on the first and second element of a `Tuple2`.
val pairs = ((mod10 &&& identity) product) apply pair
pairs assert_≟ ((7, 7), (6, 16))
// Select the tuple with the smaller value in the first element.
selectByFirst[Int, Int](_ < _)(pairs)._2 assert_≟ 16
// Using the Function1 Arrow Category to compose the calculation of mod10 with the
// selection of desired element.
val calc = ((mod10 &&& identity) product) ⋙ selectByFirst[Int, Int](_ < _)
calc(pair)._2 assert_≟ 16
Well, I searched Hoogle for a type signature like the one in Thomas Jung's answer, and it turned up on. This is what I searched for:
(a -> b) -> (b -> b -> Bool) -> a -> a -> a
Where (a -> b) is the equivalent of foo, (b -> b -> Bool) is the equivalent of <. Unfortunately, the signature for on returns something else:
(b -> b -> c) -> (a -> b) -> a -> a -> c
This is almost the same, if you replace c with Bool and a in the two places it appears, respectively.
So, right now, I suspect it doesn't exist. It occurred to me that there's a more general type signature, so I tried it as well:
(a -> b) -> ([b] -> b) -> [a] -> a
This one yielded nothing.
EDIT:
Now I don't think I was that far at all. Consider, for instance, this:
Data.List.maximumBy (on compare length) ["abcd", "ab", "abc"]
The signature of maximumBy is (a -> a -> Ordering) -> [a] -> a, which, combined with on, is pretty close to what you originally specified, given that Ordering has three values -- almost a boolean! :-)
So, say you wrote on in Scala:
def on[A, B, C](f: ((B, B) => C), g: A => B): (A, A) => C = (a: A, b: A) => f(g(a), g(b))
Then you could write select like this:
def select[A](p: (A, A) => Boolean)(a: A, b: A) = if (p(a, b)) a else b
And use it like this:
select(on((_: Int) < (_: Int), (_: String).length))("a", "ab")
Which really works better with currying and dot-free notation. :-) But let's try it with implicits:
implicit def toFor[A, B](g: A => B) = new {
def For[C](f: (B, B) => C) = (a1: A, a2: A) => f(g(a1), g(a2))
}
implicit def toSelect[A](t: (A, A)) = new {
def select(p: (A, A) => Boolean) = t match {
case (a, b) => if (p(a, b)) a else b
}
}
Then you can write
("a", "ab") select (((_: String).length) For (_ < _))
Very close. I haven't figured out any way to remove the type qualifier from there, though I suspect it is possible. I mean, without going the way of Thomas's answer. But maybe that is the way. In fact, I think on (_.length) select (_ < _) reads better than map (_.length) select (_ < _).
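For comparison with the maximumBy (on compare length) example above, Scala's standard library covers much the same ground with Ordering.by, which plays the role of on compare f. A minimal sketch using only standard-library calls; the value names are illustrative:

```scala
// Ordering.by builds an Ordering[String] that compares strings by their length.
val byLength: Ordering[String] = Ordering.by((s: String) => s.length)

val longest = List("abcd", "ab", "abc").max(byLength) // analogue of maximumBy
val shorter = byLength.min("a", "ab")                  // two-argument selection
```

The two-argument min is the closest standard-library match to select with _ < _, with the element type fixed once when the Ordering is built.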
This expression can be written very elegantly in the Factor programming language - a language where function composition is the way of doing things, and most code is written in point-free manner. The stack semantics and row polymorphism facilitate this style of programming. This is what the solution to your problem will look like in Factor:
# We find the longer of two lists here. The expression returns { 4 5 6 7 8 }
{ 1 2 3 } { 4 5 6 7 8 } [ [ length ] bi@ > ] 2keep ?
# We find the shorter of two lists here. The expression returns { 1 2 3 }.
{ 1 2 3 } { 4 5 6 7 8 } [ [ length ] bi@ < ] 2keep ?
Of interest here is the combinator 2keep. It is a "preserving dataflow-combinator", which means that it retains its inputs after the given function is performed on them.
Let's try to translate (sort of) this solution to Scala.
First of all, we define an arity-2 preserving combinator.
scala> def keep2[A, B, C](f: (A, B) => C)(a: A, b: B) = (f(a, b), a, b)
keep2: [A, B, C](f: (A, B) => C)(a: A, b: B)(C, A, B)
And an eagerIf combinator. if, being a control structure, cannot be used in function composition; hence this construct.
scala> def eagerIf[A](cond: Boolean, x: A, y: A) = if(cond) x else y
eagerIf: [A](cond: Boolean, x: A, y: A)A
Also, the on combinator. Since it clashes with a method with the same name from Scalaz, I'll name it upon instead.
scala> class RichFunction2[A, B, C](f: (A, B) => C) {
| def upon[D](g: D => A)(implicit eq: A =:= B) = (x: D, y: D) => f(g(x), g(y))
| }
defined class RichFunction2
scala> implicit def enrichFunction2[A, B, C](f: (A, B) => C) = new RichFunction2(f)
enrichFunction2: [A, B, C](f: (A, B) => C)RichFunction2[A,B,C]
And now put this machinery to use!
scala> def length: List[Int] => Int = _.length
length: List[Int] => Int
scala> def smaller: (Int, Int) => Boolean = _ < _
smaller: (Int, Int) => Boolean
scala> keep2(smaller upon length)(List(1, 2), List(3, 4, 5)) |> Function.tupled(eagerIf)
res139: List[Int] = List(1, 2)
scala> def greater: (Int, Int) => Boolean = _ > _
greater: (Int, Int) => Boolean
scala> keep2(greater upon length)(List(1, 2), List(3, 4, 5)) |> Function.tupled(eagerIf)
res140: List[Int] = List(3, 4, 5)
This approach does not look particularly elegant in Scala, but at least it shows you one more way of doing things.
There's a nice-ish way of doing this with on and Monad, but Scala is unfortunately very bad at point-free programming. Your question is basically: "can I reduce the number of points in this program?"
Imagine if on and if were differently curried and tupled:
def on2[A,B,C](f: A => B)(g: (B, B) => C): ((A, A)) => C = {
case (a, b) => f.on(g, a, b)
}
def if2[A](b: Boolean): ((A, A)) => A = {
case (p, q) => if (b) p else q
}
Then you could use the reader monad:
on2(f)(_ < _) >>= if2
The Haskell equivalent would be:
on' (<) f >>= if'
where on' f g = uncurry $ on f g
if' x (y,z) = if x then y else z
Or...
flip =<< flip =<< (if' .) . on (<) f
where if' x y z = if x then y else z
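The reader-monad idea can also be sketched in plain Scala without Scalaz by writing the bind for functions by hand. Everything here is an assumption for illustration: bind stands in for >>= on R => A, and on2 is written out directly as g(f(a), f(b)) instead of delegating to a library on.

```scala
// on2 and if2 follow the tupled signatures above; on2 is implemented directly here.
def on2[A, B, C](f: A => B)(g: (B, B) => C): ((A, A)) => C = {
  case (a, b) => g(f(a), f(b))
}

def if2[A](b: Boolean): ((A, A)) => A = {
  case (p, q) => if (b) p else q
}

// Hand-written reader bind: run m on the environment, then feed the same environment to k's result.
def bind[R, A, B](m: R => A)(k: A => R => B): R => B =
  r => k(m(r))(r)

val pick: ((String, String)) => String =
  bind(on2[String, Int, Boolean](_.length)(_ < _))(b => if2[String](b))

val picked = pick(("a", "bc"))
```

The pair is the shared environment: on2 reads it to produce the Boolean, and if2 reads it again to select an element, so neither a nor b is named in the definition of pick.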