Inverting a key to values mapping - scala

Let's say I have a set of Action objects, actions: Set[Action], and each Action has a val consequences: Set[Consequence], where Consequence is a case class.
I wish to get a map from Consequence to Set[Action] to determine which actions cause a specific Consequence. Obviously since an Action can have multiple Consequences it can appear in multiple sets in the map.
I have been trying to get my head around this (I am new to Scala), wondering if I can do it with something like map() and groupBy(), but a bit lost. I don't wish to revert to imperative programming, especially if there is some Scala mapping function that can help.
What is the best way to achieve this?

Not exactly elegant because groupBy doesn't handle the case of operating already on a Tuple2, so you end up doing a lot of tupling and untupling:
case class Conseq()
case class Action(conseqs: Set[Conseq])
def gimme(actions: Seq[Action]): Map[Conseq, Set[Action]] =
  actions.flatMap(a => a.conseqs.map(_ -> a))
    .groupBy(_._1)
    .mapValues(_.map(_._2)(collection.breakOut))
The first line "zips" each action with all of its consequences, yielding a Seq[(Conseq, Action)]. Grouping this by the first product element gives a Map[Conseq, Seq[(Conseq, Action)]]. So the last step needs to transform the map's values from Seq[(Conseq, Action)] to Set[Action], which can be done with mapValues. Without the explicit builder factory, the inner map would produce a Seq[Action], so one would have to write .mapValues(_.map(_._2).toSet). Passing collection.breakOut in the second parameter list of map saves that step and makes map produce the Set collection type directly.
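On Scala 2.13 and later, collection.breakOut no longer exists, so a variant of the same pipeline that avoids it may be useful. The following is a minimal sketch rather than the answer's original code: the name fields on Conseq and Action and the function name invert are illustrative assumptions, added only so the example has distinguishable values.

```scala
// Illustrative types; the `name` fields are assumptions, not from the question.
final case class Conseq(name: String)
final case class Action(name: String, conseqs: Set[Conseq])

// Invert "action -> consequences" into "consequence -> actions".
def invert(actions: Set[Action]): Map[Conseq, Set[Action]] =
  actions.toSeq
    .flatMap(a => a.conseqs.toSeq.map(c => c -> a)) // Seq[(Conseq, Action)]
    .groupBy(_._1)                                   // Map[Conseq, Seq[(Conseq, Action)]]
    .map { case (c, pairs) => c -> pairs.map(_._2).toSet }

val fire  = Conseq("fire")
val smoke = Conseq("smoke")
val strikeMatch = Action("strike match", Set(fire, smoke))
val smolder     = Action("smolder", Set(smoke))

// smoke maps to both actions, fire only to strikeMatch.
val byConsequence = invert(Set(strikeMatch, smolder))
```

Mapping over the grouped entries instead of calling mapValues builds a strict Map; on 2.13, mapValues is deprecated and returns a lazy view.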
Another possibility is to use nested folds:
def gimme2(actions: Seq[Action]) = (Map.empty[Conseq, Set[Action]] /: actions) {
  (m, a) => (m /: a.conseqs) {
    (m1, c) => m1.updated(c, m1.getOrElse(c, Set.empty) + a)
  }
}
This is perhaps more readable. We start with an empty result map, traverse the actions, and in the inner fold traverse each action's consequences which get merged into the result map.
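The same nested fold can also be spelled with the named foldLeft method, which some readers find easier to follow than the symbolic /: alias (deprecated since Scala 2.13). This is a sketch under the same assumptions as the answer's types, with a hypothetical id field added to Action so that two actions can be told apart.

```scala
// Illustrative types; `name` and `id` are assumptions for the example.
final case class Conseq(name: String)
final case class Action(id: Int, conseqs: Set[Conseq])

def invertWithFolds(actions: Iterable[Action]): Map[Conseq, Set[Action]] =
  // Outer fold: thread the result map through every action.
  actions.foldLeft(Map.empty[Conseq, Set[Action]]) { (acc, action) =>
    // Inner fold: register this action under each of its consequences.
    action.conseqs.foldLeft(acc) { (m, c) =>
      m.updated(c, m.getOrElse(c, Set.empty[Action]) + action)
    }
  }

val wet  = Conseq("wet")
val cold = Conseq("cold")
val swim = Action(1, Set(wet, cold))
val snow = Action(2, Set(cold))

val inverted = invertWithFolds(List(swim, snow))
```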

Related

Monadic way to get first Right to result from getting an Either from items of a list?

Up front: I know how to just write a custom function that will do this, but I swear there's a built-in thing whose name I'm just forgetting, to handle it in an idiomatic way. (Also, in my actual use case I'm likely to be using more complex monads involving state and assorted nonsense, and I feel like the answer I'm looking for will handle those as well, while the hand-coded one would need to be updated.)
I have a list items: List[A] and a function f: A => Either[Error, B]. I vaguely recall there's an easy dedicated function that will apply f to each item in items and return the first Right(b) that results, without applying f to the remaining items (or return Left(error) with the last error, if nothing succeeds).
For example, if you had f(items(0)) result in Left("random error"), f(items(1)) result in Right("Find this one!"), and f(items(2)) result in launchTheNukes(); Right("Uh oh."), then the return should be Right("Find this one!") and no nukes should be launched.
It's sort of like what's happening in a for comprehension, where you could do:
for {
  res0 <- f(items(0))
  res1 <- f(items(1))
  res2 <- f(items(2))
} yield res2
Which would return either the first error or the final result - so I want that, but to handle an arbitrary list rather than hard-coded, and returning the first success, not the first error. (The answer I'm looking for might be two functions, one to swap the sides of an Either, and one to automatically chain foldLefts across a list... I think there's a single-step solution though.)
Code snippet for commented solution:
def tester(i : Int) : Either[String, Int] = {if (i % 2 == 0) Right(100 / (4 - i)) else Left(i.toString)}
(1 to 5).collectFirst(tester)
I'm assuming (from your mention of more complex monads such as State) that you're using the Cats library.
You probably want one of the methods that come from Traverse.
For example, its sequence and traverse methods are two variations of the "I have a list of things, and a thing-to-monad function, and I want a monad of things". Since Either is a monad whose flatMap aborts early upon encountering a Left, you could .swap your Eithers so that the flatMap aborts early upon encountering a Right, and then .swap the result back at the end.
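import cats.implicits._ // brings traverse into scope for List

def tester(i: Int): Either[String, Int] = /* from your question */
val items = (1 to 5).toList
items.traverse(tester(_).swap).swap // Right(50)
// The element type is annotated so that swap and traverse see an Either, not a Left
val allLeft: List[Either[String, Int]] = List(Left("oh no"), Left("uh oh"))
allLeft.traverse(_.swap).swap // Left(List("oh no", "uh oh"))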
How about list.iterator.map(f).collectFirst { case Right(x) => x }? This returns an Option containing the x from the first Right(x) it finds. It could return Option(Right(x)) instead, but that seems redundant.
Or you might go back to either:
list.iterator.map(f).collectFirst { case x @ Right(_) => x }.getOrElse(Left("oops"))
If you insist on getting the last Left in case there are no Rights (doesn't seem to be very meaningful actually), then it seems like a simple recursive scan (that you said you knew how to write) is the best option.
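For completeness, the recursive scan mentioned above might look roughly like the sketch below. The name firstRight and the noItems parameter (the error to report for an empty list) are assumptions made for this example. The tester function is the one from the question, extended with a call log so the short-circuiting can be observed.

```scala
import scala.annotation.tailrec

// Returns the first Right produced by f, or the last Left if none succeeds.
// `noItems` is a hypothetical fallback error for an empty input list.
def firstRight[A, E, B](items: List[A], noItems: E)(f: A => Either[E, B]): Either[E, B] = {
  @tailrec
  def loop(rest: List[A], lastError: E): Either[E, B] = rest match {
    case Nil => Left(lastError)
    case head :: tail =>
      f(head) match {
        case Right(b) => Right(b)     // stop here; f is not applied to tail
        case Left(e)  => loop(tail, e) // remember the most recent error
      }
  }
  loop(items, noItems)
}

// Records which inputs were actually evaluated.
var calls = List.empty[Int]
def tester(i: Int): Either[String, Int] = {
  calls = calls :+ i
  if (i % 2 == 0) Right(100 / (4 - i)) else Left(i.toString)
}

val found    = firstRight((1 to 5).toList, "no items")(tester)
val seen     = calls // only 1 and 2 were evaluated
val allFails = firstRight(List(1, 3, 5), "no items")(tester)
```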

Map and fold a collection in Scala

I have a fold that iterates through elements, modifies them one by one (each modification depends on the state left by the previous ones), and at the same time modifies their parent.
Currently I replace the elements in their parent one by one in the fold. There are only a few of them, so it is not a real performance issue, but I wonder if there is a nicer way to express this.
case class Behavior(x: Int) {
  def simulate(s: Entity): Behavior = copy(x = x + (if (s.alternate) 2 else 1))
}
case class Entity(alternate: Boolean, behaviors: List[Behavior]) {
  def replaceBehavior(o: Behavior, n: Behavior): Entity = {
    copy(behaviors = behaviors.patch(behaviors.indexOf(o), Seq(n), 1))
  }
  def toggleAlternate: Entity = copy(alternate = !alternate)
  def simulate: Entity = {
    behaviors.foldLeft(this) { (e, b) =>
      e.replaceBehavior(b, b.simulate(e)).toggleAlternate
    }
  }
}
val entity = Entity(false, List(Behavior(10), Behavior(20), Behavior(30)))
entity.simulate
Is there some operation, or perhaps some clever use of scan or something similar, that would allow me to perform a foldLeft and a map that depends on the foldLeft result in one step? (I would prefer the vanilla Scala standard library, but using functional frameworks is possible too.)
Folds (fold, foldLeft, foldRight, ...) usually turn some Collection[A] into B.
You could map over A before folding the result to B: foldMap maps A => B and assumes the existence of a Monoid[B] (this is available in Cats in the Foldable type class), so you would perform the transformation Collection[A] --using f--> Collection[B] --using Monoid[B]--> B (the code can optimize this to do everything in one step, e.g. by using foldLeft internally).
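To make the idea concrete without pulling in a library, here is a hand-rolled sketch of foldMap. The Monoid trait and the foldMap function below are simplified stand-ins written for this example, not the Cats definitions.

```scala
// Minimal stand-in for a Monoid type class; Cats provides a richer one.
trait Monoid[B] {
  def empty: B
  def combine(x: B, y: B): B
}

// Map each element with f and combine the results in a single pass.
def foldMap[A, B](as: Iterable[A])(f: A => B)(implicit m: Monoid[B]): B =
  as.foldLeft(m.empty)((acc, a) => m.combine(acc, f(a)))

// Integers under addition form a monoid with 0 as the identity.
implicit val intAddition: Monoid[Int] = new Monoid[Int] {
  def empty: Int = 0
  def combine(x: Int, y: Int): Int = x + y
}

// Sums the string lengths: 1 + 2 + 3.
val totalLength = foldMap(List("a", "bb", "ccc"))(_.length)
```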
Reversing the order of operations (fold first, then map) is in general impossible, because nothing lets us assume that the fold step leaves us with something that is a Functor.
Depending on your specific use case we might try using foldMap to achieve your goal.
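One vanilla option for the question's use case, offered as a sketch rather than as what this answer had in mind, is to carry both the evolving flag and the list of already-updated elements in the foldLeft accumulator. It assumes Behavior.simulate only needs the alternate flag, so its parameter is changed from Entity to Boolean here.

```scala
final case class Behavior(x: Int) {
  // Assumption: only the flag is needed, so it is passed directly.
  def simulate(alternate: Boolean): Behavior =
    copy(x = x + (if (alternate) 2 else 1))
}

final case class Entity(alternate: Boolean, behaviors: List[Behavior]) {
  def simulate: Entity = {
    // Accumulator: (current flag, updated behaviors in reverse order).
    val (finalAlternate, reversed) =
      behaviors.foldLeft((alternate, List.empty[Behavior])) {
        case ((alt, done), b) => (!alt, b.simulate(alt) :: done)
      }
    Entity(finalAlternate, reversed.reverse)
  }
}

val entity    = Entity(false, List(Behavior(10), Behavior(20), Behavior(30)))
val simulated = entity.simulate
```

This avoids the indexOf and patch lookups, since each element is rebuilt in order instead of being replaced inside the parent.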

Scala practices: lists and case classes

I've just started using Scala/Spark, and having come from a Java background, I'm still trying to wrap my head around immutability and other Scala best practices.
This is a very small segment of code from a larger program:
intersections is an RDD[(Key, (String, String))]
obs is a (Key, (String, String))
Data is just a case class I've defined above.
val intersections = map1 join map2
var listOfDatas = List[Data]()
intersections take NumOutputs foreach (obs => {
  listOfDatas ::= ParseInformation(obs._1.key, obs._2._1, obs._2._2)
})
listOfDatas foreach println
This code works and does what I need it to do, but I was wondering if there was a better way of making this happen. I'm using a variable list and rewriting it with a new list every single time I iterate, and I'm sure there has to be a better way to create an immutable list that's populated with the results of the ParseInformation method call. Also, I remember reading somewhere that instead of accessing the tuple values directly, the way I have done, you should use case classes within functions (as partial functions I think?) to improve readability.
Thanks in advance for any input!
This works, but only because take brings the first NumOutputs results back to the driver, so the foreach runs locally. Calling foreach directly on the RDD would not work once distributed, because listOfDatas is passed to each worker as a copy. The better way of doing this, IMO, is:
val processedData = intersections map { case (key, (item1, item2)) =>
  // Same call as in the question, with the tuple destructured by the pattern
  ParseInformation(key.key, item1, item2)
}
processedData foreach println
A note for developers new to functional programming: if all you are trying to do is transform the data in an iterable (such as a List), forget foreach. Use map instead, which runs your transformation on each item and returns a new iterable of the results.
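The same pattern can be tried on ordinary collections without a Spark cluster. In this sketch the Data fields and the sample pairs are made up for illustration, and a plain Seq stands in for the joined RDD.

```scala
// Hypothetical shape for the question's Data case class.
final case class Data(key: String, left: String, right: String)

// Stand-in for the joined data: (key, (value1, value2)) pairs.
val joined: Seq[(String, (String, String))] =
  Seq(("k1", ("a", "b")), ("k2", ("c", "d")), ("k3", ("e", "f")))

// No var and no foreach: map builds the new list, and the pattern
// names the tuple parts instead of using _1 and _2.
val parsed: List[Data] =
  joined.take(2).map { case (key, (left, right)) => Data(key, left, right) }.toList
```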
What's the type of intersections? It looks like you can replace foreach with map:
val listOfDatas: List[Data] =
  intersections.take(NumOutputs).map { obs =>
    ParseInformation(obs._1.key, obs._2._1, obs._2._2)
  }.toList // take on an RDD returns an Array, hence toList

What's the best way to open up a list with 0 or 1 options?

In Scala I have a List with an optional Option. This arises for example when you use for comprehension on a List and your yield returns an Option. In my case I was processing a JSON object and using for comprehension on the list of fields (List[JField]).
What's the best way to open up the list and map List() to None and List(Some(a)) to Some(a)?
A first approach would be
def headOrNone[A](list: List[Option[A]]) =
  list match {
    case Nil => None
    case a :: Nil => a
  }
Another approach
def headOrNone[A](list:List[Option[A]]) = list.headOption.getOrElse(None)
A third approach (a variation on the headOption implementation)
def headOrNone[A](list:List[Option[A]]) = if (list.isEmpty) None else list.head
I personally prefer the third approach. Is there a better name for this function than headOrNone and what is the idiomatic scala way to write it?
You're solving a problem that probably shouldn't have been created. Instead, you probably want
for (x <- list) yield f(x) // Yields Option
to be
list.flatMap(f)
and then you'll have either zero or one things in your list to begin with (which you can extract using headOption).
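A small sketch of the difference, using a hypothetical parsePositive function as the Option-returning f:

```scala
import scala.util.Try

// Hypothetical f: parses a positive integer, or yields None.
def parsePositive(s: String): Option[Int] =
  Try(s.toInt).toOption.filter(_ > 0)

val fields = List("x", "42", "-1")

// for/yield keeps the Option wrapper around every element.
val nested: List[Option[Int]] = for (s <- fields) yield parsePositive(s)

// flatMap drops the Nones and unwraps the Somes.
val flat: List[Int] = fields.flatMap(s => parsePositive(s))

// headOption then gives the single result, if there is one.
val first: Option[Int] = flat.headOption
```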
How about this:
def headOrNone[A](list: List[Option[A]]) = list.flatten.headOption
headOrNone(List(Some(4))) // Some(4)
headOrNone(List()) // None
Though the first choice has the advantage of giving you an error if you happen to have a list with more than one item, which, according to your description, seems like an error condition.
But personally, I would re-evaluate the code that produces the List[Option[A]] and see if there's a way to just have it return the right thing in the first place!

Converting a Scala Map to a List

I have a map that I need to map to a different type, and the result needs to be a List. I have two ways (seemingly) to accomplish what I want, since calling map on a map seems to always result in a map. Assuming I have some map that looks like:
val input = Map[String, List[Int]]("rk1" -> List(1,2,3), "rk2" -> List(4,5,6))
I can either do:
val output = input.map{ case(k,v) => (k.getBytes, v) } toList
Or:
val output = input.foldRight(List[Pair[Array[Byte], List[Int]]]()){ (el, res) =>
  (el._1.getBytes, el._2) :: res
}
In the first example I convert the type, and then call toList. I assume the runtime is something like O(n*2) and the space required is n*2. In the second example, I convert the type and generate the list in one go. I assume the runtime is O(n) and the space required is n.
My question is, are these essentially identical or does the second conversion cut down on memory/time/etc? Additionally, where can I find information on storage and runtime costs of various scala conversions?
Thanks in advance.
My favorite way to do this kind of things is like this:
input.map { case (k,v) => (k.getBytes, v) }(collection.breakOut): List[(Array[Byte], List[Int])]
With this syntax, you are passing to map the builder it needs to construct the resulting collection. (Actually, not a builder, but a builder factory. Read more about Scala's CanBuildFrom if you are interested.) collection.breakOut is exactly what you use when you want to change from one collection type to another while doing a map, flatMap, etc. The only bad part is that you need a full type annotation for it to be effective (here, I used a type ascription after the expression). Then no intermediate collection is built, and the list is constructed while mapping.
Mapping over a view in the first example could cut down on the space requirement for a large map:
val output = input.view.map{ case(k,v) => (k.getBytes, v) } toList
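A variant of the same idea that also compiles where collection.breakOut is unavailable is to map over the map's iterator and build the list at the end. This is a sketch, and the explicit "UTF-8" charset is an assumption added so the result does not depend on the platform default.

```scala
val input = Map("rk1" -> List(1, 2, 3), "rk2" -> List(4, 5, 6))

// The iterator is lazy, so no intermediate Map is built;
// toList materializes the result in one pass.
val output: List[(Array[Byte], List[Int])] =
  input.iterator.map { case (k, v) => (k.getBytes("UTF-8"), v) }.toList

// Arrays compare by reference, so decode the keys to inspect the result.
val decoded: Map[String, List[Int]] =
  output.map { case (bytes, v) => (new String(bytes, "UTF-8"), v) }.toMap
```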