Processing sequence with duplicates concurrently - scala

Suppose I've got a function fab: A => B , a sequence of A and need to get a sequence of pairs (A, B) like this:
def foo(fab: A => B, as: Seq[A]): Seq[(A, B)] = as.zip(as.map(fab))
Now I want to run fab concurrently using scala.concurrent.Future but I want to run fab only once for all duplicate elements in as. For instance,
val fab: A => B = ...
val a1: A = ...
val a2: A = ...
val as = a1 :: a1 :: a2 :: a1 :: a2 :: Nil
foo(fab, as) // invokes fab twice and run these invocations concurrently
How would you implement it ?

def foo[A, B](as: Seq[A])(f: A => B)(implicit exc: ExecutionContext)
: Future[Seq[(A, B)]] = {
Future
.traverse(as.toSet)(a => Future((a, (a, f(a)))))
.map(abs => as map abs.toMap)
}
Explanation:
as.toSet ensures that f is invoked only once for each a
The (a, (a, f(a))) gives you a set with nested tuples of shape (a, (a, b))
Mapping the original sequence of as by a Map with pairs (a, (a, b)) gives you a sequence of (a, b)s.
Since your f is not asynchronous anyway, and since you don't mind using futures, you might consider using par-collections as well:
def foo2[A, B](as: Seq[A])(f: A => B): Seq[(A, B)] = {
as map as.toSet.par.map((a: A) => a -> (a, f(a))).seq.toMap
}

Related

flatmapping a nested Map in scala

Suppose I have val someMap = Map[String -> Map[String -> String]] defined as such:
val someMap =
Map(
("a1" -> Map( ("b1" -> "c1"), ("b2" -> "c2") ) ),
("a2" -> Map( ("b3" -> "c3"), ("b4" -> "c4") ) ),
("a3" -> Map( ("b5" -> "c5"), ("b6" -> "c6") ) )
)
and I would like to flatten it to something that looks like
List(
("a1","b1","c1"),("a1","b2","c2"),
("a2","b3","c3"),("a2","b4","c4"),
("a3","b5","c5"),("a3","b6","c6")
)
What is the most efficient way of doing this? I was thinking about creating some helper function that processes each (a_i -> Map(String,String)) key value pair and return
def helper(key: String, values: Map[String -> String]): (String,String,String)
= {val sublist = values.map(x => (key,x._1,x._2))
return sublist
}
then flatmap this function over someMap. But this seems somewhat unnecessary to my novice scala eyes, so I was wondering if there was a more efficient way to parse this Map.
No need to create helper function just write nested lambda:
val result = someMap.flatMap { case (k, v) => v.map { case (k1, v1) => (k, k1, v1) } }
Or
val y = someMap.flatMap(x => x._2.map(y => (x._1, y._1, y._2)))
Since you're asking about efficiency, the most efficient yet functional approach I can think of is using foldLeft and foldRight.
You need foldRight since :: constructs the immutable list in reverse.
someMap.foldRight(List.empty[(String, String, String)]) { case ((a, m), acc) =>
m.foldRight(acc) {
case ((b, c), acc) => (a, b, c) :: acc
}
}
Here, assuming Map.iterator.reverse is implemented efficiently, no intermediate collections are created.
Alternatively, you can use foldLeft and then reverse the result:
someMap.foldLeft(List.empty[(String, String, String)]) { case (acc, (a, m)) =>
m.foldLeft(acc) {
case (acc, (b, c)) => (a, b, c) :: acc
}
}.reverse
This way a single intermediate List is created, but you don't rely on the implementation of the reversed iterator (foldLeft uses forward iterator).
Note: one liners, such as someMap.flatMap(x => x._2.map(y => (x._1, y._1, y._2))) are less efficient, as, in addition to the temporary buffer to hold intermediate results of flatMap, they create and discard additional intermediate collections for each inner map.
UPD
Since there seems to be some confusion, I'll clarify what I mean. Here is an implementation of map, flatMap, foldLeft and foldRight from TraversibleLike:
def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That = {
def builder = { // extracted to keep method size under 35 bytes, so that it can be JIT-inlined
val b = bf(repr)
b.sizeHint(this)
b
}
val b = builder
for (x <- this) b += f(x)
b.result
}
def flatMap[B, That](f: A => GenTraversableOnce[B])(implicit bf: CanBuildFrom[Repr, B, That]): That = {
def builder = bf(repr) // extracted to keep method size under 35 bytes, so that it can be JIT-inlined
val b = builder
for (x <- this) b ++= f(x).seq
b.result
}
def foldLeft[B](z: B)(op: (B, A) => B): B = {
var result = z
this foreach (x => result = op(result, x))
result
}
def foldRight[B](z: B)(op: (A, B) => B): B =
reversed.foldLeft(z)((x, y) => op(y, x))
It's clear that map and flatMap create intermediate buffer using corresponding builder, while foldLeft and foldRight reuse the same user-supplied accumulator object, and only use iterators.

Programming a state monad in Scala

The theory of how a state monad looks like I borrow from Philip Wadler's Monads for Functional Programming:
type M a = State → (a, State)
type State = Int
unit :: a → M a
unit a = λx. (a, x)
(*) :: M a → (a → M b) → M b
m * k = λx.
let (a, y) = m x in
let (b, z) = k a y in
(b, z)
The way I would like to use a state monad is as follows:
Given a list L I want different parts of my code to get this list and update this list by adding new elements at its end.
I guess the above would be modified as:
type M = State → (List[Data], State)
type State = List[Data]
def unit(a: List[Data]) = (x: State) => (a,x)
def star(m: M, k: List[Data] => M): M = {
(x: M) =>
val (a,y) = m(x)
val (b,z) = k(a)(y)
(b,z)
}
def get = ???
def update = ???
How do I fill in the details, i.e.?
How do I instantiate my hierarchy to work on a concrete list?
How do I implement get and update in terms of the above?
Finally, how would I do this using Scala's syntax with flatMap and unit?
Your M is defined incorrectly. It should take a/A as a parameter, like so:
type M[A] = State => (A, State)
You've also missed that type parameter elsewhere.
unit should have a signature like this:
def unit[A](a: A): M[A]
star should have a signature like this:
def star[A, B](m: M[A], k: A => M[B]): M[B]
Hopefully, that makes the functions more clear.
Your implementation of unit was pretty much the same:
def unit[A](a: A): M[A] = x => (a, x)
However, in star, the parameter of your lambda (x) is of type State, not M, because M[B] is basically State => (A, State). The rest you got right:
def star[A, B](m: M[A])(k: A => M[B]): M[B] =
(x: State) => {
val (a, y) = m(x)
val (b, z) = k(a)(y)
(b, z)
}
Edit: According to #Luis Miguel Mejia Suarez:
It would probably be easier to implement if you make your State a class and define flatMap inside it. And you can define unit in the companion object.
He suggested final class State[S, A](val run: S => (A, S)), which would also allow you to use infix functions like >>=.
Another way to do it would be to define State as a type alias for a function S => (A, S) and extend it using an implicit class.
type State[S, A] = S => (A, S)
object State {
//This is basically "return"
def unit[S, A](a: A): State[S, A] = s => (a, s)
}
implicit class StateOps[S, A](private runState: S => (A, S)) {
//You can rename this to ">>=" or "flatMap"
def *[B](k: A => State[S, B]): State[S, B] = s => {
val (a, s2) = runState(s)
k(a)(s2)
}
}
If your definition of get is
set the result value to the state and leave the state unchanged
(borrowed from Haskell Wiki), then you can implement it like this:
def get[S]: State[S, S] = s => (s, s)
If you mean that you want to extract the state (in this case a List[Data]), you can use execState (define it in StateOps):
def execState(s: S): S = runState(s)._2
Here's a terrible example of how you can add elements to a List.
def addToList(n: Int)(list: List[Int]): ((), List[Int]) = ((), n :: list)
def fillList(n: Int): State[List[Int], ()] =
n match {
case 0 => s => ((), s)
case n => fillList(n - 1) * (_ => addToList(n))
}
println(fillList(10)(List.empty)) gives us this (the second element can be extracted with execState):
((),List(10, 9, 8, 7, 6, 5, 4, 3, 2, 1))

Getting and setting state in scala

Here is some code from the Functional Programming in Scala book:
import State._
case class State[S, +A](run: S => (A, S)) {
def map[B](f: A => B): State[S, B] =
flatMap(a => unit(f(a)))
def map2[B, C](sb: State[S, B])(f: (A, B) => C): State[S, C] =
flatMap(a => sb.map(b => f(a, b)))
def flatMap[B](f: A => State[S, B]): State[S, B] = State(s => {
val (a, s1) = run(s)
f(a).run(s1)
})
}
object State {
type Rand[A] = State[RNG, A]
def unit[S, A](a: A): State[S, A] =
State(s => (a, s))
// The idiomatic solution is expressed via foldRight
def sequenceViaFoldRight[S, A](sas: List[State[S, A]]): State[S, List[A]] =
sas.foldRight(unit[S, List[A]](List.empty[A]))((f, acc) => f.map2(acc)(_ :: _))
// This implementation uses a loop internally and is the same recursion
// pattern as a left fold. It is quite common with left folds to build
// up a list in reverse order, then reverse it at the end.
// (We could also use a collection.mutable.ListBuffer internally.)
def sequence[S, A](sas: List[State[S, A]]): State[S, List[A]] = {
def go(s: S, actions: List[State[S, A]], acc: List[A]): (List[A], S) =
actions match {
case Nil => (acc.reverse, s)
case h :: t => h.run(s) match {
case (a, s2) => go(s2, t, a :: acc)
}
}
State((s: S) => go(s, sas, List()))
}
// We can also write the loop using a left fold. This is tail recursive like the
// previous solution, but it reverses the list _before_ folding it instead of after.
// You might think that this is slower than the `foldRight` solution since it
// walks over the list twice, but it's actually faster! The `foldRight` solution
// technically has to also walk the list twice, since it has to unravel the call
// stack, not being tail recursive. And the call stack will be as tall as the list
// is long.
def sequenceViaFoldLeft[S, A](l: List[State[S, A]]): State[S, List[A]] =
l.reverse.foldLeft(unit[S, List[A]](List()))((acc, f) => f.map2(acc)(_ :: _))
def modify[S](f: S => S): State[S, Unit] = for {
s <- get // Gets the current state and assigns it to `s`.
_ <- set(f(s)) // Sets the new state to `f` applied to `s`.
} yield ()
def get[S]: State[S, S] = State(s => (s, s))
def set[S](s: S): State[S, Unit] = State(_ => ((), s))
}
I have spent hours thinking about why the get and set methods look the way they do, but I just don't understand.
Could anyone enlighten me, please?
The key is on the 3rd line:
case class State[S, +A](run: S => (A, S))
The stateful computation is expressed with the run function. This function represent a transition from one state S to another state S. A is a value we could produce when moving from one state to the other.
Now, how can we take the state S out of the state-monad? We could make a transition that doesn't go to a different state and we materialise the state as A with the function s => (s, s):
def get[S]: State[S, S] = State(s => (s, s))
How to set the state? All we need is a function that goes to a state s: ??? => (???, s):
def set[S](s: S): State[S, Unit] = State(_ => ((), s))
EDIT I would like to add an example to see get and set in action:
val statefullComputationsCombined = for {
a <- State.get[Int]
b <- State.set(10)
c <- State.get[Int]
d <- State.set(100)
e <- State.get[Int]
} yield (a, c, e)
Without looking further down this answer, what is the type of statefullComputationsCombined?
Must be a State[S, A] right? S is of type Int but what is A? Because we are yielding (a, c, e) must be a 3-tuple made by the As of the flatmap steps (<-).
We said that get "fill" A with the state S so the a, c ,d are of type S, so Int. b, d are Unit because def set[S](s: S): State[S, Unit].
val statefullComputationsCombined: State[Int, (Int, Int, Int)] = for ...
To use statefullComputationsCombined we need to run it:
statefullComputationsCombined.run(1)._1 == (1,10,100)
If we want the state at the end of the computation:
statefullComputationsCombined.run(1)._2 == 100

How to compose two different `State Monad`?

When I learn State Monad, I'm not sure how to compose two functions with different State return types.
State Monad definition:
case class State[S, A](runState: S => (S, A)) {
def flatMap[B](f: A => State[S, B]): State[S, B] = {
State(s => {
val (s1, a) = runState(s)
val (s2, b) = f(a).runState(s1)
(s2, b)
})
}
def map[B](f: A => B): State[S, B] = {
flatMap(a => {
State(s => (s, f(a)))
})
}
}
Two different State types:
type AppendBang[A] = State[Int, A]
type AddOne[A] = State[String, A]
Two methods with differnt State return types:
def addOne(n: Int): AddOne[Int] = State(s => (s + ".", n + 1))
def appendBang(str: String): AppendBang[String] = State(s => (s + 1, str + " !!!"))
Define a function to use the two functions above:
def myAction(n: Int) = for {
a <- addOne(n)
b <- appendBang(a.toString)
} yield (a, b)
And I hope to use it like this:
println(myAction(1))
The problem is myAction is not compilable, it reports some error like this:
Error:(14, 7) type mismatch;
found : state_monad.State[Int,(Int, String)]
required: state_monad.State[String,?]
b <- appendBang(a.toString)
^
How can I fix it? Do I have to define some Monad transformers?
Update: The question may be not clear, let me give an example
Say I want to define another function, which uses addOne and appendBang internally. Since they all need existing states, I have to pass some to it:
def myAction(n: Int)(addOneState: String, appendBangState: Int): ((String, Int), String) = {
val (addOneState2, n2) = addOne(n).runState(addOneState)
val (appendBangState2, n3) = appendBang(n2.toString).runState(appendBangState)
((addOneState2, appendBangState2), n3)
}
I have to run addOne and appendBang one by one, passing and getting the states and result manually.
Although I found it can return another State, the code is not improved much:
def myAction(n: Int): State[(String, Int), String] = State {
case (addOneState: String, appendBangState: Int) =>
val (addOneState2, n2) = addOne(n).runState(addOneState)
val (appendBangState2, n3) = appendBang(n2.toString).runState( appendBangState)
((addOneState2, appendBangState2), n3)
}
Since I'm not quite familiar with them, just wondering is there any way to improve it. The best hope is that I can use for comprehension, but not sure if that's possible
Like I mentioned in my first comment, it will be impossible to use a for comprehension to do what you want, because it can not change the type of the state (S).
Remember that a for comprehension can be translated to a combination of flatMaps, withFilter and one map. If we look at your State.flatMap, it takes a function f to change a State[S,A] into State[S, B]. We can use flatMap and map (and thus a for comprehension) to chain together operations on the same state, but we can't change the type of the state in this chain.
We could generalize your last definition of myAction to combine, compose, ... two functions using state of a different type. We can try to implement this generalized compose method directly in our State class (although this is probably so specific, it probably doesn't belong in State). If we look at State.flatMap and myAction we can see some similarities:
We first call runState on our existing State instance.
We then call runState again
In myAction we first use the result n2 to create a State[Int, String] (AppendBang[String] or State[S2, B]) using the second function (appendBang or f) on which we then call runState. But our result n2 is of type String (A) and our function appendBang needs an Int (B) so we need a function to convert A into B.
case class State[S, A](runState: S => (S, A)) {
// flatMap and map
def compose[B, S2](f: B => State[S2, B], convert: A => B) : State[(S, S2), B] =
State( ((s: S, s2: S2) => {
val (sNext, a) = runState(s)
val (s2Next, b) = f(convert(a)).runState(s2)
((sNext, s2Next), b)
}).tupled)
}
You then could define myAction as :
def myAction(i: Int) = addOne(i).compose(appendBang, _.toString)
val twoStates = myAction(1)
// State[(String, Int),String] = State(<function1>)
twoStates.runState(("", 1))
// ((String, Int), String) = ((.,2),2 !!!)
If you don't want this function in your State class you can create it as an external function :
def combineStateFunctions[S1, S2, A, B](
a: A => State[S1, A],
b: B => State[S2, B],
convert: A => B
)(input: A): State[(S1, S2), B] = State(
((s1: S1, s2: S2) => {
val (s1Next, temp) = a(input).runState(s1)
val (s2Next, result) = b(convert(temp)).runState(s2)
((s1Next, s2Next), result)
}).tupled
)
def myAction(i: Int) =
combineStateFunctions(addOne, appendBang, (_: Int).toString)(i)
Edit : Bergi's idea to create two functions to lift a State[A, X] or a State[B, X] into a State[(A, B), X].
object State {
def onFirst[A, B, X](s: State[A, X]): State[(A, B), X] = {
val runState = (a: A, b: B) => {
val (nextA, x) = s.runState(a)
((nextA, b), x)
}
State(runState.tupled)
}
def onSecond[A, B, X](s: State[B, X]): State[(A, B), X] = {
val runState = (a: A, b: B) => {
val (nextB, x) = s.runState(b)
((a, nextB), x)
}
State(runState.tupled)
}
}
This way you can use a for comprehension, since the type of the state stays the same ((A, B)).
def myAction(i: Int) = for {
x <- State.onFirst(addOne(i))
y <- State.onSecond(appendBang(x.toString))
} yield y
myAction(1).runState(("", 1))
// ((String, Int), String) = ((.,2),2 !!!)

Convert expression to polish notation in Scala

I would like to convert an expression such as: a.meth(b) to a function of type (A, B) => C that performs that exact computation.
My best attempt so far was along these lines:
def polish[A, B, C](symb: String): (A, B) => C = { (a, b) =>
// reflectively check if "symb" is a method defined on a
// if so, reflectively call symb, passing b
}
And then use it like this:
def flip[A, B, C](f : (A, B) => C): (B, A) => C = {(b, a) => f(a,b)}
val op = flip(polish("::"))
def reverse[A](l: List[A]): List[A] = l reduceLeft op
As you can pretty much see, it is quite ugly and you have to do a lot of type checking "manually".
Is there an alternative ?
You can achieve it easily with plain old subtype polymorphism. Just declare interface
trait Iface[B, C] {
def meth(b: B): C
}
Then you could implement polish easily
def polish[B, C](f: (Iface[B, C], B) => C): (Iface[B, C], B) => C = { (a, b) =>
f(a, b)
}
Using it is completely typesafe
object IfaceImpl extends Iface[String, String] {
override def meth(b: String): String = b.reverse
}
polish((a: Iface[String, String], b: String) => a meth b)(IfaceImpl, "hello")
Update:
Actually, you could achieve it using closures only
def polish[A, B, C](f: (A, B) => C): (A, B) => C = f
class Foo {
def meth(b: String): String = b.reverse
}
polish((_: Foo) meth (_: String))(new Foo, "hello")
Or without helper function at all :)
val polish = identity _ // Magic at work
((_: Foo) meth (_: String))(new Foo, "hello")