Option as a singleton collection - real life use cases

Option as a singleton collection - real life use cases - scala

The title pretty much sums it up. Option as a singleton collection can sometimes be confusing, but sometimes it allows for an interesting application. I have one example on top of my head, and would like to learn more of such examples.
My only example is running for comprehension on the Option[List[T]]. We can do the following:
val v = Some(List(1, 2, 3))
for {
list <- v.toList
elem <- list
} yield elem + 1
Without having Option.toList, it wouldn't be possible to stay in the same for comprehension, and I'd be forced to write something like this:
for {
list <- v
} yield for {
elem <- list
} yield elem + 1
The first example is cleaner, and it's an advantage of Option being a collection. Of course, the result type will be different in these 2 examples, but let's assume it doesn't matter for the sake of discussion.
Any other examples? I'd especially like to concentrate on collection-like usage, and not usage of Option's monadic properties - those are pretty much obvious. In other words, map and flatMap functions are out of scope of this question. They're definitely very useful, just coming from elsewhere.

I find that working with Option[T] as a collection's main benefit is that you get to use operations defined on a collection, such as map, flatmap, filter, foreach etc. This makes it easier to do operations on a given option, instead of using pattern matching or checking Option[T].isDefined to see if a value exists.
For example, let's take the user repository example from Daniel Westheide blog post about Option[T]:
Say you have a UserRepository object which returns users based on their ID. The user may or may not exist, hence it returns an Option[Person]. Now let's say we want to search a person by id and then filter their age. We can do:
val age: Some[Int] = UserRepository.findById(1).map(_.age)
Now let's say that a Person also has a gender property of type Option[String]. If you wanted to extract that out, you could use map:
val gender: Option[Option[String]] = UserRepository.findById(1).map(_.gender)
But working with nested options isn't too convenient. For that, you have flatMap:
val gender: Option[String] = UserRepository.findById(1).flatMap(_.gender)
And if we want to print out the gender if it exists, we can use foreach:
gender.foreach(println)
You'll find yourself working with scala types that have nested Option[T] fields defined and it's really handy to have collection like methods which help you remove out boilerplate and noise for extracting the actual value out of the operation.
A more real life use case I just encountered the other day was working with the awscala SDK, where I wanted to retrieve an object from S3 storage:
val bucket: Option[Bucket] = s3.bucket(amazonConfig.bucketName)
val result: Option[S3Object] = bucket.flatMap(_.get(amazonConfig.offsetKey))
result.flatMap(s3Object =>
Source.fromInputStream(s3Object.content).mkString.decodeOption[Array[KafkaOffset]])
So what happens here is that you query the S3 service for a bucket, which may or may not exist. Then, you want to extract an S3Object out of it which actually contains the data, but the API itself returns an Option[S3Object], so it's handy to use flatMap to flat out get an Option[S3Object] instead of Option[Option[S3Object]]. Finally, I want to deserialize the S3Object which actually contains a JSON, and using the Argonaut library, it returns an Option[MyObject], so then again using flatMap to the rescue of extracting the inner option type.
Edit:
As you pointed out, map and flatMap belong to the monadic property of Option[T]. I've written a blog post describing the reduction of two options where the final solution was:
def reduce[T](a: Option[T], b: Option[T], f: (T, T) => T): Option[T] = {
(a ++ b).reduceLeftOption(f)
}
Which takes advantage of the ++ operator defined on any collection which is also specifically defined on Option[T], being a collection.

I'd suggest to take a look at the corresponding chapter of The Neophyte's Guide to Scala.
In my experience, most useful use-cases of Option-as-collection are to filter an option and to make flatMap that implicitly filters None values.

Related

Possible to make use of Scala's Option flatMap method more concise?

I'm admittedly very new to Scala, and I'm having trouble with the syntactical sugar I see in many Scala examples.
It often results in a very concise statement, but honestly so far (for me) a bit unreadable.
So I wish to take a typical use of the Option class, safe-dereferencing, as a good place to start for understanding, for example, the use of the underscore in a particular example I've seen.
I found a really nice article showing examples of the use of Option to avoid the case of null.
https://medium.com/#sinisalouc/demystifying-the-monad-in-scala-cc716bb6f534#.fhrljf7nl
He describes a use as so:
trait User {
val child: Option[User]
}
By the way, you can also write those functions as in-place lambda
functions instead of defining them a priori. Then the code becomes
this:
val result = UserService.loadUser("mike")
.flatMap(user => user.child)
.flatMap(user => user.child)
That looks great! Maybe not as concise as one can do in groovy, but not bad.
So I thought I'd try to apply it to a case I am trying to solve.
I have a type Person where the existence of a Person is optional, but if we have a person, his attributes are guaranteed. For that reason, there are no use of the Option type within the Person type itself.
The Person has an PID which is of type Id. The Id type consists of two String types; the Id-Type and the Id-Value.
I've used the Scala console to test the following:
class Id(val idCode : String, val idVal : String)
class Person(val pid : Id, val name : String)
val anId: Id = new Id("Passport_number", "12345")
val person: Person = new Person(anId, "Sean")
val operson : Option[Person] = Some(person)
OK. That setup my person and it's optional instance.
I learned from the above linked article that I could get the Persons Id-Val by using flatMap; Like this:
val result = operson.flatMap(person => Some(person.pid)).flatMap(pid => Some(pid.idVal)).getOrElse("NoValue")
Great! That works. And if I infact have no person, my result is "NoValue".
I used flatMap (and not Map) because, unless I misunderstand (and my tests with Map were incorrect) if I use Map I have to provide an alternate or default Person instance. That I didn't want to have to do.
OK, so, flatMap is the way to go.
However, that is really not a very concise statement.
If I were writing that in more of a groovy style, I guess i'd be able to do something like this:
val result = person?.pid.idVal
Wow, that's a bit nicer!
Surely Scala has a means to provide something at least nearly as nice as Groovy?
In the above linked example, he was able to make his statement more concise using some of that syntactical sugar I mentioned before. The underscore:
or even more concise:
val result = UserService.loadUser("mike")
.flatMap(_.child)
.flatMap(_.child)
So, it seems in this case the underscore character allows you to skip specifying the type (as the type is inferred) and replace it with underscore.
However, when I try the same thing with my example:
val result = operson.flatMap(Some(_.pid)).flatMap(Some(_.idVal)).getOrElse("NoValue")
Scala complains.
<console>:15: error: missing parameter type for expanded function ((x$2) => x$2.idVal)
val result = operson.flatMap(Some(_.pid)).flatMap(Some(_.idVal)).getOrElse("NoValue")
Can someone help me along here?
How am I misunderstanding this?
Is there a short-hand method of writing my above lengthy statement?
Is flatMap the best way to achieve what I am after? Or is there a better more concise and/or readable way to do it ?
thanks in advance!

Why do you insist on using flatMap? I'd just use map for your example instead:
val result = operson.map(_.pid).map(_.idVal).getOrElse("NoValue")
or even shorter:
val result = operson.map(_.pid.idVal).getOrElse("NoValue")
You should only use flatMap with functions that return Options. Your pid and idVals are not Options, so just map them instead.
You said
I have a type Person where the existence of a Person is optional, but if we have a person, his attributes are guaranteed. For that reason, there are no use of the Option type within the Person type itself.
This is the essential difference between your example and the User example. In the User example, both the existence of a User instance, and its child field are options. This is why, to get a child, you need to flatMap. However, since in your example, only the existence of a Person is not guaranteed, after you've retrieved an Option[Person], you can safely map to any of its fields.
Think of flatMap as a map, followed by a flatten (hence its name). If I mapped on child:
val ouser = Some(new User())
val child: Option[Option[User]] = ouser.map(_.child)
I would end up with an Option[Option[User]]. I need to flatten that to a single Option level, that's why I use flatMap in the first place.

If you looking for the most concise solution, consider this:
val result = operson.fold("NoValue")(_.pid.idVal)
Though one could find it not clear or confusing

Scala: Option[Seq[String]] vs Seq[Option[String]]?

I'm creating a method to retrieve a list of users from a database by ID.
I'm trying to decide on the pros and cons of declaring the ids parameter as Option[Seq[String]] vs Seq[Option[String]]?
In what cases should I favour one over the other?

A list of users in neither well represented as an Option[Seq[String]] nor as a Seq[Option[String]]. I would expect something like a List[User] as a list of users. Or maybe a Vector or Seq
If your string represents your user, and the None case does nothing, you could consider filtering those out. You can do this with
val dbresult: Seq[Option[String]] = ???
val strings = dbresult collect { case Some(str) => str }
or
val strings = dbresult.flatten
but it's difficult to give good advice without knowing what the Option[String] or Option[Seq] represents

As usual this strongly depends on the use case.
A Seq[Option[String]] will be useful if the size of the sequence is relevant (eg., because you want to zip it with another sequence).
If this is not the case I would opt for flattening the sequence in order to just have a Seq[String]. This will likely be a better choice than Option[Seq[String]], as the sequence can also be of zero length.
In fact an Option can usually be treated as if it where an array that can have either length zero or one. Therefore wrapping an Iterable in an Option often only adds unnecessary complexity.

List/Array with optionals

Which is better in practice? Having an optional List or having optional items in the list?
Currently I'm following an optional list.
List[Option[T]] or Option[List[T]]?
Edit:
The problem I'm running into is that i have crud operations that i'm returning optional types from. I have a situation where I have a method that does a single lookup and i want to leverage it to make a function to return a list. I'm still getting my feet wet with scala so I'm curious what the best practice is.
Example:
def findOne(id: Int): Option[T]
regardless of implementation I want to use it for something like these, but which is better? They both seem to be weird to map from. Maybe there's something i'm missing all together:
def find(ids: List[Int]) : Option[List[T]]
vs
def find(ids: List[Int]) : List[Option[T]]

Those two types mean very different things. List[Option[T]] looks like an intermediate result that would you would flatten. (I'd look into using flatMap in this case.)
The second type, Option[List[T]] says there may or may not be a list. This would be a good type to use when you need to distinguish between the "no result" case and the "result is an empty list" case.
I can't think of a situation where both types would make sense.

If you want to retrieve several things that might exist, and it's sensible for some of them to exist and some of them to not exist, it's List[Option[T]] - a list of several entries, each of which is present or not. This would make sense in e.g. a "search" situation, where you want to display whichever ones exist, possibly only some of the requested things. You could implement that method as:
def find(ids: List[Int]) = ids map findOne
If you're using Option to represent something like an an "error" case, and you want "if any of them failed then the whole thing is a failure", then you want Option[List[T]] - either a complete list, or nothing at all. You could implement that, using Scalaz, as:
def find(ids: List[Int]) = ids traverse findOne

DaoWen already got to the point regarding your considerations.
List[Option[T]] doesn't even encode more information than List[T] without the implicit knowledge that your list of ids is in the same order than your result list.
I'd actualy favour
def find(ids: Seq[Int]): Seq[T]
or
def find(ids: Seq[Int]): Option[NonEmptyList[T]]
where NonEmptyList is a type of sequence that actually enforces being not empty, e.g. in Scalaz. Unless you really wanna point out the difference between None and some empty list.
Btw, you might wanna use the more general Seq[T] for your interface instead of List[T]

Scala: selecting function returning Option versus PartialFunction

I'm a relative Scala beginner and would like some advice on how to proceed on an implementation that seems like it can be done either with a function returning Option or with PartialFunction. I've read all the related posts I could find (see bottom of question), but these seem to involve the technical details of using PartialFunction or converting one to the other; I am looking for an answer of the type "if the circumstances are X,Y,Z, then use A else B, but also consider C".
My example use case is a path search between locations using a library of path finders. Say the locations are of type L, a path was of type P and the desired path search result would be an Iterable[P]. The patch search result should be assembled by asking all the path finders (in something like Google maps these might be Bicycle, Car, Walk, Subway, etc.) for their path suggestions, which may or may not be defined for a particular start/end location pair.
There seem to be two ways to go about this:
(a) define a path finder as f: (L,L) => Option[P] and then get the result via something like finders.map( _.apply(l1,l2) ).filter( _.isDefined ).map( _.get )
(b) define a path finder as f: PartialFunction[(L,L),P] and then get the result via something likefinders.filter( _.isDefined( (l1,l2) ) ).map( _.apply( (l1,l2)) )`
It seems like using a function returning Option[P] would avoid double evaluation of results, so for an expensive computation this may be preferable unless one caches the results. It also seems like using Option one can have an arbitrary input signature, whereas PartialFunction expects a single argument. But I am particularly interested in hearing from someone with practical experience about less immediate, more "bigger picture" considerations, such as the interaction with the Scala library. Would using a PartialFunction have significant benefits in making available certain methods of the collections API that might pay off in other ways? Would such code generally be more concise?
Related but different questions:
Inverse of PartialFunction's lift method
Is the PartialFunction design inefficient?
How to convert X => Option[R] to PartialFunction[X,R]
Is there a nicer way of lifting a PartialFunction in Scala?
costly computation occuring in both isDefined and Apply of a PartialFunction

It's not all that well known, but since 2.8 Scala has a collect method defined on it's collections. collect is similar to filter, but takes a partial function and has the semantics you describe.

It feels like Option might fit your use case better.
My interpretation is that Partial Functions work well to be combined over input ranges. So if f is defined over (SanDiego,Irvine) and g is defined over (Paris,London) then you can get a function that is defined over the combined input (SanDiego,Irvine) and (Paris,London) by doing f orElse g.
But in your case it seems, things happen for a given (l1,l2) location tuple and then you do some work...
If you find yourself writing a lot of {case (L,M) => ... case (P,Q) => ...} then it may be the sign that partial functions are a better fit.
Otherwise options work well with the rest of the collections and can be used like this instead of your (a) proposal:
val processedPaths = for {
f <- finders
p <- f(l1, l2)
} yield process(p)
Within the for comprehension p is lifted into an Traversable, so you don't even have to call filter, isDefined or get to skip the finders without results.

Use-cases for Streams in Scala

In Scala there is a Stream class that is very much like an iterator. The topic Difference between Iterator and Stream in Scala? offers some insights into the similarities and differences between the two.
Seeing how to use a stream is pretty simple but I don't have very many common use-cases where I would use a stream instead of other artifacts.
The ideas I have right now:
If you need to make use of an infinite series. But this does not seem like a common use-case to me so it doesn't fit my criteria. (Please correct me if it is common and I just have a blind spot)
If you have a series of data where each element needs to be computed but that you may want to reuse several times. This is weak because I could just load it into a list which is conceptually easier to follow for a large subset of the developer population.
Perhaps there is a large set of data or a computationally expensive series and there is a high probability that the items you need will not require visiting all of the elements. But in this case an Iterator would be a good match unless you need to do several searches, in that case you could use a list as well even if it would be slightly less efficient.
There is a complex series of data that needs to be reused. Again a list could be used here. Although in this case both cases would be equally difficult to use and a Stream would be a better fit since not all elements need to be loaded. But again not that common... or is it?
So have I missed any big uses? Or is it a developer preference for the most part?
Thanks

The main difference between a Stream and an Iterator is that the latter is mutable and "one-shot", so to speak, while the former is not. Iterator has a better memory footprint than Stream, but the fact that it is mutable can be inconvenient.
Take this classic prime number generator, for instance:
def primeStream(s: Stream[Int]): Stream[Int] =
Stream.cons(s.head, primeStream(s.tail filter { _ % s.head != 0 }))
val primes = primeStream(Stream.from(2))
It can be easily be written with an Iterator as well, but an Iterator won't keep the primes computed so far.
So, one important aspect of a Stream is that you can pass it to other functions without having it duplicated first, or having to generate it again and again.
As for expensive computations/infinite lists, these things can be done with Iterator as well. Infinite lists are actually quite useful -- you just don't know it because you didn't have it, so you have seen algorithms that are more complex than strictly necessary just to deal with enforced finite sizes.

In addition to Daniel's answer, keep in mind that Stream is useful for short-circuiting evaluations. For example, suppose I have a huge set of functions that take String and return Option[String], and I want to keep executing them until one of them works:
val stringOps = List(
(s:String) => if (s.length>10) Some(s.length.toString) else None ,
(s:String) => if (s.length==0) Some("empty") else None ,
(s:String) => if (s.indexOf(" ")>=0) Some(s.trim) else None
);
Well, I certainly don't want to execute the entire list, and there isn't any handy method on List that says, "treat these as functions and execute them until one of them returns something other than None". What to do? Perhaps this:
def transform(input: String, ops: List[String=>Option[String]]) = {
ops.toStream.map( _(input) ).find(_ isDefined).getOrElse(None)
}
This takes a list and treats it as a Stream (which doesn't actually evaluate anything), then defines a new Stream that is a result of applying the functions (but that doesn't evaluate anything either yet), then searches for the first one which is defined--and here, magically, it looks back and realizes it has to apply the map, and get the right data from the original list--and then unwraps it from Option[Option[String]] to Option[String] using getOrElse.
Here's an example:
scala> transform("This is a really long string",stringOps)
res0: Option[String] = Some(28)
scala> transform("",stringOps)
res1: Option[String] = Some(empty)
scala> transform(" hi ",stringOps)
res2: Option[String] = Some(hi)
scala> transform("no-match",stringOps)
res3: Option[String] = None
But does it work? If we put a println into our functions so we can tell if they're called, we get
val stringOps = List(
(s:String) => {println("1"); if (s.length>10) Some(s.length.toString) else None },
(s:String) => {println("2"); if (s.length==0) Some("empty") else None },
(s:String) => {println("3"); if (s.indexOf(" ")>=0) Some(s.trim) else None }
);
// (transform is the same)
scala> transform("This is a really long string",stringOps)
1
res0: Option[String] = Some(28)
scala> transform("no-match",stringOps)
1
2
3
res1: Option[String] = None
(This is with Scala 2.8; 2.7's implementation will sometimes overshoot by one, unfortunately. And note that you do accumulate a long list of None as your failures accrue, but presumably this is inexpensive compared to your true computation here.)

I could imagine, that if you poll some device in real time, a Stream is more convenient.
Think of an GPS tracker, which returns the actual position if you ask it. You can't precompute the location where you will be in 5 minutes. You might use it for a few minutes only to actualize a path in OpenStreetMap or you might use it for an expedition over six months in a desert or the rain forest.
Or a digital thermometer or other kinds of sensors which repeatedly return new data, as long as the hardware is alive and turned on - a log file filter could be another example.

Stream is to Iterator as immutable.List is to mutable.List. Favouring immutability prevents a class of bugs, occasionally at the cost of performance.
scalac itself isn't immune to these problems: http://article.gmane.org/gmane.comp.lang.scala.internals/2831
As Daniel points out, favouring laziness over strictness can simplify algorithms and make it easier to compose them.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse