This doesn't work:
val res = myOption flatMap (value => Seq(value, "blo"))
But this does:
val res = myOption.toSeq flatMap (value => Seq(value, "blo"))
Don't you think flatMap on Option should take a GenTraversableOnce, just like Seq does?
Or is this code bad for readability, and should I use a match or a map/getOrElse instead?
Edit: We also get the same issue on for/yield.
Cheers
Option.flatMap returns an Option, which is "like" a sequence, but cannot contain more than one element. If it were allowed to take a function that returns a Seq, and that function returned a Seq containing more than one element, what would the return value of flatMap be (remember, it needs to be an Option)?
Why does flatMap need to return an Option in the first place? Well, all flatMap implementations return the same type they started with. It makes sense: if I have an Option of something, and want to transform the contents somehow, the most common use case is that I want to end up with another Option. If flatMap returned me a Seq, what would I do? .headOption? That is not a very good idea, because it would potentially silently discard data. if(seq.size < 2) seq.headOption else throw ....? Well, this is a little bit better, but looks ugly, and isn't enforceable at compile time.
Converting an Option to a Seq when you need it on the other hand, is very easy and entirely safe: just do .toSeq.
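To make the .toSeq route concrete, here is a minimal sketch. The object name and the sample values are made up for illustration; only toSeq and flatMap come from the discussion above.

```scala
object OptionToSeqExample {
  // A present value: converting to Seq first lets the function return a Seq.
  val myOption: Option[String] = Some("bla")
  val res: Seq[String] = myOption.toSeq.flatMap(value => Seq(value, "blo"))

  // An absent value becomes an empty Seq, so nothing is produced.
  val none: Option[String] = None
  val resNone: Seq[String] = none.toSeq.flatMap(value => Seq(value, "blo"))
}
```

Nothing is silently dropped here: the result type is a Seq from the start, so it can hold both elements.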
The overall semantics of flatMap are those of the monadic bind operation, which means it tends to have a signature like
[A]this:T[A].flatMap[B](f: A => T[B]): T[B]
Sometimes (SeqLike) this signature is generalized to
[A]this:T[A].flatMap[B](f: A => F[B]): T[B]
where F[B] is something easily convertible to T[B]
So not only Option, but also concurrent.Future, util.Try and BindOps (the extension syntax for scalaz monads) have a flatMap method that does not accept any traversable, only the same wrapper type.
I.e. flatMap is more a thing from the monads world than from collections.
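A small sketch of the two signature shapes described above, using only standard-library types. The object and value names are illustrative; Future is left out only because it would need an ExecutionContext.

```scala
import scala.util.Try

object SameWrapperFlatMap {
  // Option.flatMap: the function has to return an Option.
  val viaOption: Option[Int] =
    Some(4).flatMap(x => if (x % 2 == 0) Some(x / 2) else None)

  // Try.flatMap: the function has to return a Try.
  val viaTry: Try[Int] =
    Try("12".toInt).flatMap(n => Try(96 / n))

  // List.flatMap is the generalized form: the function may return another
  // collection-like type (here an Option), and the result is still a List.
  val viaList: List[Int] =
    List(1, 2, 3).flatMap(x => if (x > 1) Some(x * 10) else None)
}
```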
Related
I have been looking into FP languages (off and on) for some time and have played with Scala, Haskell, F#, and some others. I like what I see and understand some of the fundamental concepts of FP (with absolutely no background in Category Theory - so don't talk Math, please).
So, given a type M[A] we have map which takes a function A=>B and returns a M[B]. But we also have flatMap which takes a function A=>M[B] and returns a M[B]. We also have flatten which takes a M[M[A]] and returns a M[A].
In addition, many of the sources I have read describe flatMap as map followed by flatten.
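The "map followed by flatten" description can be checked directly. A minimal sketch with made-up functions f and half, for List and for Option:

```scala
object MapThenFlatten {
  val xs: List[Int] = List(1, 2, 3)
  val f: Int => List[Int] = x => List(x, x * 10)

  // Two steps: map gives a nested List[List[Int]], flatten removes one level.
  val twoSteps: List[Int] = xs.map(f).flatten
  // One step: flatMap does both at once.
  val oneStep: List[Int] = xs.flatMap(f)

  // The same relationship holds for Option.
  val half: Int => Option[Int] = x => if (x % 2 == 0) Some(x / 2) else None
  val optTwoSteps: Option[Int] = Option(8).map(half).flatten
  val optOneStep: Option[Int] = Option(8).flatMap(half)
}
```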
So, given that flatMap seems to be equivalent to flatten compose map, what is its purpose? Please don't say it is to support 'for comprehensions', as this question really isn't Scala-specific. And I am less concerned with the syntactic sugar than with the concept behind it. The same question arises with Haskell's bind operator (>>=). I believe they are both related to some Category Theory concept, but I don't speak that language.
I have watched Brian Beckman's great video Don't Fear the Monad more than once and I think I see that flatMap is the monadic composition operator but I have never really seen it used the way he describes this operator. Does it perform this function? If so, how do I map that concept to flatMap?
BTW, I had a long writeup on this question with lots of listings showing experiments I ran trying to get to the bottom of the meaning of flatMap and then ran into this question which answered some of my questions. Sometimes I hate Scala implicits. They can really muddy the waters. :)
FlatMap, known as "bind" in some other languages, is, as you said yourself, for function composition.
Imagine for a moment that you have some functions like these:
def foo(x: Int): Option[Int] = Some(x + 2)
def bar(x: Int): Option[Int] = Some(x * 3)
The functions work great, calling foo(3) returns Some(5), and calling bar(3) returns Some(9), and we're all happy.
But now you've run into the situation that requires you to do the operation more than once.
foo(3).map(x => foo(x)) // or just foo(3).map(foo) for short
Job done, right?
Except not really. The output of the expression above is Some(Some(7)), not Some(7), and if you now want to chain another map on the end you can't because foo and bar take an Int, and not an Option[Int].
Enter flatMap
foo(3).flatMap(foo)
Will return Some(7), and
foo(3).flatMap(foo).flatMap(bar)
Returns Some(21), because foo(3) is Some(5), foo(5) is Some(7), and bar(7) is Some(21).
This is great! Using flatMap lets you chain functions of the shape A => M[B] to oblivion (in the previous example A and B are Int, and M is Option).
More technically speaking; flatMap and bind have the signature M[A] => (A => M[B]) => M[B], meaning they take a "wrapped" value, such as Some(3), Right('foo), or List(1,2,3) and shove it through a function that would normally take an unwrapped value, such as the aforementioned foo and bar. It does this by first "unwrapping" the value, and then passing it through the function.
I've seen the box analogy being used for this, so observe my expertly drawn MSPaint illustration:
This unwrapping and re-wrapping behavior means that if I were to introduce a third function that doesn't return an Option[Int] and tried to flatMap it to the sequence, it wouldn't work because flatMap expects you to return a monad (in this case an Option)
def baz(x: Int): String = x + " is a number"
foo(3).flatMap(foo).flatMap(bar).flatMap(baz) // <<< ERROR
To get around this, if your function doesn't return a monad, you'd just have to use the regular map function
foo(3).flatMap(foo).flatMap(bar).map(baz)
Which would then return Some("21 is a number")
It's the same reason you provide more than one way to do anything: it's a common enough operation that you may want to wrap it.
You could ask the opposite question: why have map and flatten when you already have flatMap and a way to store a single element inside your collection? That is,
x map f
x filter p
can be replaced by
x flatMap ( xi => x.take(0) :+ f(xi) )
x flatMap ( xi => if (p(xi)) x.take(0) :+ xi else x.take(0) )
so why bother with map and filter?
In fact, there are various minimal sets of operations you need to reconstruct many of the others (flatMap is a good choice because of its flexibility).
Pragmatically, it's better to have the tool you need. Same reason why there are non-adjustable wrenches.
The simplest reason is to compose an output set where each entry in the input set may produce more than one (or zero!) outputs.
For example, consider a program which outputs addresses for people to generate mailers. Most people have one address. Some have two or more. Some people, unfortunately, have none. Flatmap is a generalized algorithm to take a list of these people and return all of the addresses, regardless of how many come from each person.
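The mailer scenario could look roughly like this. The Person type and the sample data are invented for the illustration:

```scala
object MailerExample {
  // Hypothetical model: a person with zero or more postal addresses.
  final case class Person(name: String, addresses: List[String])

  val people: List[Person] = List(
    Person("Ann", List("1 Main St")),              // one address
    Person("Bob", List("2 Oak Ave", "3 Pine Rd")), // two addresses
    Person("Cy", Nil)                              // no address
  )

  // flatMap collects every address, however many each person contributes.
  val allAddresses: List[String] = people.flatMap(_.addresses)
}
```

With map instead of flatMap the result would be a List[List[String]], still grouped per person.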
The zero output case is particularly useful for monads, which often (always?) return exactly zero or one results (think Maybe: it returns zero results if the computation fails, or one if it succeeds). In that case you want to perform an operation on "all of the results", which it just so happens may be one or many.
The "flatMap", or "bind", method, provides an invaluable way to chain together methods that provide their output wrapped in a Monadic construct (like List, Option, or Future). For example, suppose you have two methods that produce a Future of a result (eg. they make long-running calls to databases or web service calls or the like, and should be used asynchronously):
def fn1(input1: A): Future[B] // (for some types A and B)
def fn2(input2: B): Future[C] // (for some types B and C)
How to combine these? With flatMap, we can do this as simply as:
def fn3(input3: A): Future[C] = fn1(input3).flatMap(b => fn2(b))
In this sense, we have "composed" a function fn3 out of fn1 and fn2 using flatMap, which has the same general structure (and so can be composed in turn with further similar functions).
The map method would give us a not-so-convenient - and not readily chainable - Future[Future[C]]. Certainly we can then use flatten to reduce this, but the flatMap method does it in one call, and can be chained as far as we wish.
This is so useful a way of working, in fact, that Scala provides the for-comprehension as essentially a short-cut for this (Haskell, too, provides a short-hand way of writing a chain of bind operations - I'm not a Haskell expert, though, and don't recall the details) - hence the talk you will have come across about for-comprehensions being "de-sugared" into a chain of flatMap calls (along with possible filter calls and a final map call for the yield).
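As a rough sketch of that de-sugaring, here is the same computation written both ways. Option is used instead of Future only so the example needs no ExecutionContext; parse and reciprocal are made-up helpers.

```scala
import scala.util.Try

object ForDesugaring {
  // Hypothetical steps that may each fail to produce a value.
  def parse(s: String): Option[Int] = Try(s.trim.toInt).toOption
  def reciprocal(n: Int): Option[Double] = if (n == 0) None else Some(1.0 / n)

  // The for-comprehension form.
  def sugared(s: String): Option[String] =
    for {
      n <- parse(s)
      r <- reciprocal(n)
    } yield s"1/$n = $r"

  // Roughly what it expands to: flatMap for every generator but the last,
  // and a final map for the yield.
  def desugared(s: String): Option[String] =
    parse(s).flatMap(n => reciprocal(n).map(r => s"1/$n = $r"))
}
```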
Well, one could argue, you don't need .flatten either. Why not just do something like
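import scala.annotation.tailrec

@tailrec
def flatten[T](in: Seq[Seq[T]], out: Seq[T] = Nil): Seq[T] = in match {
  case Seq() => out                                // nothing left: return what we accumulated
  case head +: tail => flatten(tail, out ++ head)  // +: works for any Seq, :: only for List
}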
Same can be said about map:
@tailrec // requires: import scala.annotation.tailrec
def map[A, B](in: Seq[A], out: Seq[B] = Nil)(f: A => B): Seq[B] = in match {
  case Seq() => out
  case head +: tail => map(tail, out :+ f(head))(f)
}
So, why are .flatten and .map provided by the library? Same reason .flatMap is: convenience.
There is also .collect, which is really just
list.filter(f.isDefinedAt _).map(f)
.reduce is actually nothing more than list.tail.foldLeft(list.head)(f),
.headOption is
list match {
case Nil => None
case head :: _ => Some(head)
}
Etc ...
Consider the following type:
case class Subscriber(books: List[String])
And its instance wrapped in an option:
val s = Option(Subscriber(List("one", "two", "three")))
And an attempt to println all books for a subscriber:
s.flatMap(_.books).foreach(println(_))
This fails, due to:
Error: type mismatch;
found : List[String]
required: Option[?]
result.flatMap(_.books).foreach(println(_))
^
This is kind of expected, because flatMap must return a type compatible with its source object, and one can easily avoid this error by doing:
s.toList.flatMap(_.books).foreach(println(_))
I could also avoid flatMap, but that's not the point.
But isn't there some smart method of achieving this without explicit toList conversion? Intuitively, None and List.empty have a lot in common. And during compilation of s.flatMap, s is implicitly converted to Traversable.
Something in scalaz, maybe?
I think a "smart" implementation of flatMap would be a bit misleading.
Since flatMap is well defined, such an implementation would break the monadic behavior one expects when using it, and would in fact be broken.
The problems with the smart method are:
How should the compiler know which type to use?
How could it infer whether you wanted a List[_] or an Option[_] as a result?
What should it do with other, unknown types, and what kind of conversion should it apply to them?
You can achieve the same result as with .toList in the following way, because in the given example the types don't matter at all (unless you are aiming for a list of Units); the only output is a side effect:
s foreach ( _.books foreach println )
One direction to go might be an implicit conversion using structural types, but you would pay a performance penalty.
The smarter way to do that would be to use map.
val s = Option(Subscriber(List("one", "two", "three")))
s.map(_.books.map(println))
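foreach exists purely for its side effects and returns Unit, while map returns the transformed value (here an Option[List[Unit]], since println returns Unit).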
Which is better in practice? Having an optional List or having optional items in the list?
Currently I'm following an optional list.
List[Option[T]] or Option[List[T]]?
Edit:
The problem I'm running into is that I have CRUD operations that I'm returning optional types from. I have a method that does a single lookup, and I want to leverage it to make a function that returns a list. I'm still getting my feet wet with Scala, so I'm curious what the best practice is.
Example:
def findOne(id: Int): Option[T]
Regardless of implementation, I want to use it for something like these, but which is better? They both seem weird to map from. Maybe there's something I'm missing altogether:
def find(ids: List[Int]) : Option[List[T]]
vs
def find(ids: List[Int]) : List[Option[T]]
Those two types mean very different things. List[Option[T]] looks like an intermediate result that you would flatten. (I'd look into using flatMap in this case.)
The second type, Option[List[T]] says there may or may not be a list. This would be a good type to use when you need to distinguish between the "no result" case and the "result is an empty list" case.
I can't think of a situation where both types would make sense.
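A small sketch of the difference between the two shapes, with invented sample values:

```scala
object OptionalListVsListOfOptions {
  // List[Option[T]]: one slot per lookup, each present or absent.
  val lookups: List[Option[String]] = List(Some("a"), None, Some("c"))
  // Flattening keeps only the values that were found.
  val found: List[String] = lookups.flatten

  // Option[List[T]]: "no result at all" and "an empty result" are distinct.
  val noResult: Option[List[String]] = None
  val emptyResult: Option[List[String]] = Some(Nil)
}
```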
If you want to retrieve several things that might exist, and it's sensible for some of them to exist and some of them to not exist, it's List[Option[T]] - a list of several entries, each of which is present or not. This would make sense in e.g. a "search" situation, where you want to display whichever ones exist, possibly only some of the requested things. You could implement that method as:
def find(ids: List[Int]) = ids map findOne
If you're using Option to represent something like an "error" case, and you want "if any of them failed then the whole thing is a failure", then you want Option[List[T]]: either a complete list, or nothing at all. You could implement that, using Scalaz, as:
def find(ids: List[Int]) = ids traverse findOne
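For readers without Scalaz on the classpath, here is a plain-Scala sketch of both variants. The lookup table and the names findEach and findAll are invented for the example; the fold is a hand-written stand-in for traverse, not the Scalaz implementation.

```scala
object TraverseSketch {
  // Hypothetical lookup standing in for the question's findOne.
  val table: Map[Int, String] = Map(1 -> "one", 2 -> "two")
  def findOne(id: Int): Option[String] = table.get(id)

  // "Whichever exist": one Option per requested id.
  def findEach(ids: List[Int]): List[Option[String]] = ids.map(findOne)

  // "All or nothing": None as soon as any single lookup is missing.
  def findAll(ids: List[Int]): Option[List[String]] =
    ids.foldRight(Option(List.empty[String])) { (id, acc) =>
      for {
        x  <- findOne(id)
        xs <- acc
      } yield x :: xs
    }
}
```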
DaoWen already got to the point regarding your considerations.
List[Option[T]] doesn't even encode more information than List[T] without the implicit knowledge that your list of ids is in the same order as your result list.
I'd actually favour
def find(ids: Seq[Int]): Seq[T]
or
def find(ids: Seq[Int]): Option[NonEmptyList[T]]
where NonEmptyList is a type of sequence that actually enforces being non-empty, e.g. in Scalaz. Unless you really want to point out the difference between None and some empty list.
By the way, you might want to use the more general Seq[T] for your interface instead of List[T].
Suppose I have some type with an associative binary operation that feels a lot like append except that the operation may fail. For example, here's a wrapper for List[Int] that only allows us to "add" lists with the same length:
case class Foo(xs: List[Int]) {
def append(other: Foo): Option[Foo] =
if (xs.size != other.xs.size) None else Some(
Foo(xs.zip(other.xs).map { case (a, b) => a + b })
)
}
This is a toy example, but one of the things it has in common with my real use case is that we could in principle use the type system to make the operation total—in this case by tracking the length of the lists with something like Shapeless's Sized, so that adding lists of unequal lengths would be a compile-time error instead of a runtime failure. That's not too bad, but in my real use case managing the constraints in the type system would require a lot more work and isn't really practical.
(In my use case I have a sensible identity, unlike in this toy example, but we can ignore that for now.)
Is there some principled way to do this kind of thing? Searching for a -> a -> m a or a -> a -> Maybe a on Hoogle doesn't turn up anything interesting. I know I can write an ad-hoc append method that just returns its result wrapped in whatever type I'm using to model failure, but it'd be nice to have something more general that would give me foldMap1, etc. for free—especially since this isn't the first time I've found myself wanting this kind of thing.
I'm a relative Scala beginner and would like some advice on how to proceed on an implementation that seems like it can be done either with a function returning Option or with PartialFunction. I've read all the related posts I could find (see bottom of question), but these seem to involve the technical details of using PartialFunction or converting one to the other; I am looking for an answer of the type "if the circumstances are X,Y,Z, then use A else B, but also consider C".
My example use case is a path search between locations using a library of path finders. Say the locations are of type L, a path is of type P and the desired path search result would be an Iterable[P]. The path search result should be assembled by asking all the path finders (in something like Google Maps these might be Bicycle, Car, Walk, Subway, etc.) for their path suggestions, which may or may not be defined for a particular start/end location pair.
There seem to be two ways to go about this:
(a) define a path finder as f: (L,L) => Option[P] and then get the result via something like finders.map( _.apply(l1,l2) ).filter( _.isDefined ).map( _.get )
(b) define a path finder as f: PartialFunction[(L,L),P] and then get the result via something like finders.filter( _.isDefinedAt( (l1,l2) ) ).map( _.apply( (l1,l2) ) )
It seems like using a function returning Option[P] would avoid double evaluation of results, so for an expensive computation this may be preferable unless one caches the results. It also seems like using Option one can have an arbitrary input signature, whereas PartialFunction expects a single argument. But I am particularly interested in hearing from someone with practical experience about less immediate, more "bigger picture" considerations, such as the interaction with the Scala library. Would using a PartialFunction have significant benefits in making available certain methods of the collections API that might pay off in other ways? Would such code generally be more concise?
Related but different questions:
Inverse of PartialFunction's lift method
Is the PartialFunction design inefficient?
How to convert X => Option[R] to PartialFunction[X,R]
Is there a nicer way of lifting a PartialFunction in Scala?
costly computation occuring in both isDefined and Apply of a PartialFunction
It's not all that well known, but since 2.8 Scala has had a collect method defined on its collections. collect is similar to filter, but takes a partial function and has the semantics you describe.
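A minimal sketch of collect, with a made-up partial function safeSqrt, next to the filter-then-map version it replaces:

```scala
object CollectExample {
  // Defined only for non-negative inputs.
  val safeSqrt: PartialFunction[Double, Double] = {
    case x if x >= 0 => math.sqrt(x)
  }

  val inputs: List[Double] = List(4.0, -1.0, 9.0)

  // collect keeps the elements where the partial function is defined
  // and applies it to them in one pass.
  val roots: List[Double] = inputs.collect(safeSqrt)

  // The two-step equivalent.
  val sameThing: List[Double] = inputs.filter(safeSqrt.isDefinedAt).map(safeSqrt)
}
```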
It feels like Option might fit your use case better.
My interpretation is that Partial Functions work well to be combined over input ranges. So if f is defined over (SanDiego,Irvine) and g is defined over (Paris,London) then you can get a function that is defined over the combined input (SanDiego,Irvine) and (Paris,London) by doing f orElse g.
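A sketch of that combination; the route descriptions returned by f and g are invented for the example:

```scala
object OrElseExample {
  type Route = (String, String)

  // Each partial function covers its own part of the input space.
  val f: PartialFunction[Route, String] = {
    case ("SanDiego", "Irvine") => "drive north"
  }
  val g: PartialFunction[Route, String] = {
    case ("Paris", "London") => "take the train"
  }

  // orElse tries f first and falls back to g where f is not defined.
  val combined: PartialFunction[Route, String] = f orElse g

  // lift turns the result into an Option, which is None outside both domains.
  val known: Option[String] = combined.lift(("Paris", "London"))
  val unknown: Option[String] = combined.lift(("Rome", "Oslo"))
}
```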
But in your case it seems, things happen for a given (l1,l2) location tuple and then you do some work...
If you find yourself writing a lot of {case (L,M) => ... case (P,Q) => ...} then it may be the sign that partial functions are a better fit.
Otherwise options work well with the rest of the collections and can be used like this instead of your (a) proposal:
val processedPaths = for {
f <- finders
p <- f(l1, l2)
} yield process(p)
Within the for comprehension p is lifted into a Traversable, so you don't even have to call filter, isDefined or get to skip the finders without results.
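Here is a self-contained version of the same idea that can be compiled as-is. The finders, the String aliases for L and P, and process are all stand-ins invented for the sketch:

```scala
object FindersExample {
  // Assumption: locations and paths are plain strings here.
  type L = String
  type P = String

  // Hypothetical finders: one always finds a path, one never does.
  val byCar: (L, L) => Option[P] = (a, b) => Some(s"drive $a -> $b")
  val bySubway: (L, L) => Option[P] = (_, _) => None
  val finders: List[(L, L) => Option[P]] = List(byCar, bySubway)

  // Hypothetical post-processing step.
  def process(p: P): P = p.toUpperCase

  // Finders that return None simply contribute nothing to the result.
  val processedPaths: List[P] = for {
    f <- finders
    p <- f("A", "B")
  } yield process(p)
}
```

Each finder is evaluated only once per query, which was one of the concerns with the PartialFunction variant.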