'tee' operation on Scala's option type? - scala

Is there some sort of 'tee' operation on Option in Scala's standard library available? The best I could find is foreach, however its return type is Unit, therefore it cannot be chained.
This is what I am looking for: given an Option instance, perform some operation with side effects on its value if the option is not empty (Some[A]), otherwise do nothing; return the option in any case.
I have a custom implementation using an implicit class, but I am wondering whether there is a more common way to do this without implicit conversion:
object OptionExtensions {
implicit class TeeableOption[A](value: Option[A]) {
def tee(action: A => Unit): Option[A] = {
value foreach action
value
}
}
}
Example code:
import OptionExtensions._
val option: Option[Int] = Some(42)
option.tee(println).foreach(println) // will print 42 twice
val another: Option[Int] = None
another.tee(println).foreach(println) // does nothing
Any suggestions?

In order to avoid implicit conversion, instead of using method chaining you can use function composition with k-combinator.
k-combinator gives you an idiomatic way to communicate the fact that you are going to perform a side effect.
Here is a short example:
object KCombinator {
def tap[A](a: A)(action: A => Any): A = {
action(a)
a
}
}
import KCombinator._
val func = ((_: Option[Int]).getOrElse(0))
.andThen(tap(_)(println))
.andThen(_ + 3)
.andThen(tap(_)(println))
If we call our func with an argument of Option(3) the result will be an Int with the value of 6
and this is how the console will look like:
3
6

There is not an existing way to accomplish this in the standard library, because side effects are minimized and isolated in functional programming. Depending on what your actual goals are, there are a couple different ways to idiomatically accomplish your task.
In the case of doing a lot of println commands, instead of sprinkling them throughout your algorithm, you would typically gather them in a collection, then do one foreach println at the end. This minimizes the side effects to the smallest possible impact. That goes with any other side effect. Try to find a way to squeeze it into the smallest possible space.
If you are trying to chain a series of "actions," you should look into futures. Futures basically treat an action as a value, and provide a lot of useful functions to work with them.

Simply use map, and make your side effecting functions conform to action: A => A rather than action: A => Unit.
def tprintln[A](a: A): A = {
println(a)
a
}
another.map(tprintln).foreach(println)

Related

How to design abstract classes if methods don't have the exact same signature?

This is a "real life" OO design question. I am working with Scala, and interested in specific Scala solutions, but I'm definitely open to hear generic thoughts.
I am implementing a branch-and-bound combinatorial optimization program. The algorithm itself is pretty easy to implement. For each different problem we just need to implement a class that contains information about what are the allowed neighbor states for the search, how to calculate the cost, and then potentially what is the lower bound, etc...
I also want to be able to experiment with different data structures. For instance, one way to store a logic formula is using a simple list of lists of integers. This represents a set of clauses, each integer a literal. We can have a much better performance though if we do something like a "two-literal watch list", and store some extra information about the formula in general.
That all would mean something like this
object BnBSolver[S<:BnBState]{
def solve(states: Seq[S], best_state:Option[S]): Option[S] = if (states.isEmpty) best_state else
val next_state = states.head
/* compare to best state, etc... */
val new_states = new_branches ++ states.tail
solve(new_states, new_best_state)
}
class BnBState[F<:Formula](clauses:F, assigned_variables) {
def cost: Int
def branches: Seq[BnBState] = {
val ll = clauses.pick_variable
List(
BnBState(clauses.assign(ll), ll :: assigned_variables),
BnBState(clauses.assign(-ll), -ll :: assigned_variables)
)
}
}
case class Formula[F<:Formula[F]](clauses:List[List[Int]]) {
def assign(ll: Int) :F =
Formula(clauses.filterNot(_ contains ll)
.map(_.filterNot(_==-ll))))
}
Hopefully this is not too crazy, wrong or confusing. The whole issue here is that this assign method from a formula would usually take just the current literal that is going to be assigned. In the case of two-literal watch lists, though, you are doing some lazy thing that requires you to know later what literals have been previously assigned.
One way to fix this is you just keep this list of previously assigned literals in the data structure, maybe as a private thing. Make it a self-standing lazy data structure. But this list of the previous assignments is actually something that may be naturally available by whoever is using the Formula class. So it makes sense to allow whoever is using it to just provide the list every time you assign, if necessary.
The problem here is that we cannot now have an abstract Formula class that just declares a assign(ll:Int):Formula. In the normal case this is OK, but if this is a two-literal watch list Formula, it is actually an assign(literal: Int, previous_assignments: Seq[Int]).
From the point of view of the classes using it, it is kind of OK. But then how do we write generic code that can take all these different versions of Formula? Because of the drastic signature change, it cannot simply be an abstract method. We could maybe force the user to always provide the full assigned variables, but then this is a kind of a lie too. What to do?
The idea is the watch list class just becomes a kind of regular assign(Int) class if I write down some kind of adapter method that knows where to take the previous assignments from... I am thinking maybe with implicit we can cook something up.
I'll try to make my answer a bit general, since I'm not convinced I'm completely following what you are trying to do. Anyway...
Generally, the first thought should be to accept a common super-class as a parameter. Obviously that won't work with Int and Seq[Int].
You could just have two methods; have one call the other. For instance just wrap an Int into a Seq[Int] with one element and pass that to the other method.
You can also wrap the parameter in some custom class, e.g.
class Assignment {
...
}
def int2Assignment(n: Int): Assignment = ...
def seq2Assignment(s: Seq[Int]): Assignment = ...
case class Formula[F<:Formula[F]](clauses:List[List[Int]]) {
def assign(ll: Assignment) :F = ...
}
And of course you would have the option to make those conversion methods implicit so that callers just have to import them, not call them explicitly.
Lastly, you could do this with a typeclass:
trait Assigner[A] {
...
}
implicit val intAssigner = new Assigner[Int] {
...
}
implicit val seqAssigner = new Assigner[Seq[Int]] {
...
}
case class Formula[F<:Formula[F]](clauses:List[List[Int]]) {
def assign[A : Assigner](ll: A) :F = ...
}
You could also make that type parameter at the class level:
case class Formula[A:Assigner,F<:Formula[A,F]](clauses:List[List[Int]]) {
def assign(ll: A) :F = ...
}
Which one of these paths is best is up to preference and how it might fit in with the rest of the code.

Map an instance using function in Scala

Say I have a local method/function
def withExclamation(string: String) = string + "!"
Is there a way in Scala to transform an instance by supplying this method? Say I want to append an exclamation mark to a string. Something like:
val greeting = "Hello"
val loudGreeting = greeting.applyFunction(withExclamation) //result: "Hello!"
I would like to be able to invoke (local) functions when writing a chain transformation on an instance.
EDIT: Multiple answers show how to program this possibility, so it seems that this feature is not present on an arbitraty class. To me this feature seems incredibly powerful. Consider where in Java I want to execute a number of operations on a String:
appendExclamationMark(" Hello! ".trim().toUpperCase()); //"HELLO!"
The order of operations is not the same as how they read. The last operation, appendExclamationMark is the first word that appears. Currently in Java I would sometimes do:
Function.<String>identity()
.andThen(String::trim)
.andThen(String::toUpperCase)
.andThen(this::appendExclamationMark)
.apply(" Hello "); //"HELLO!"
Which reads better in terms of expressing a chain of operations on an instance, but also contains a lot of noise, and it is not intuitive to have the String instance at the last line. I would want to write:
" Hello "
.applyFunction(String::trim)
.applyFunction(String::toUpperCase)
.applyFunction(this::withExclamation); //"HELLO!"
Obviously the name of the applyFunction function can be anything (shorter please). I thought backwards compatibility was the sole reason Java's Object does not have this.
Is there any technical reason why this was not added on, say, the Any or AnyRef classes?
You can do this with an implicit class which provides a way to extend an existing type with your own methods:
object StringOps {
implicit class RichString(val s: String) extends AnyVal {
def withExclamation: String = s"$s!"
}
def main(args: Array[String]): Unit = {
val m = "hello"
println(m.withExclamation)
}
}
Yields:
hello!
If you want to apply any functions (anonymous, converted from methods, etc.) in this way, you can use a variation on Yuval Itzchakov's answer:
object Combinators {
implicit class Combinators[A](val x: A) {
def applyFunction[B](f: A => B) = f(x)
}
}
A while after asking this question, I noticed that Kotlin has this built in:
inline fun <T, R> T.let(block: (T) -> R): R
Calls the specified function block with this value as its argument and returns
its result.
A lot more, quite useful variations of the above function are provided on all types, like with, also, apply, etc.

What is the advantage of using Option.map over Option.isEmpty and Option.get?

I am a new to Scala coming from Java background, currently confused about the best practice considering Option[T].
I feel like using Option.map is just more functional and beautiful, but this is not a good argument to convince other people. Sometimes, isEmpty check feels more straight forward thus more readable. Is there any objective advantages, or is it just personal preference?
Example:
Variation 1:
someOption.map{ value =>
{
//some lines of code
}
} orElse(foo)
Variation 2:
if(someOption.isEmpty){
foo
} else{
val value = someOption.get
//some lines of code
}
I intentionally excluded the options to use fold or pattern matching. I am simply not pleased by the idea of treating Option as a collection right now, and using pattern matching for a simple isEmpty check is an abuse of pattern matching IMHO. But no matter why I dislike these options, I want to keep the scope of this question to be the above two variations as named in the title.
Is there any objective advantages, or is it just personal preference?
I think there's a thin line between objective advantages and personal preference. You cannot make one believe there is an absolute truth to either one.
The biggest advantage one gains from using the monadic nature of Scala constructs is composition. The ability to chain operations together without having to "worry" about the internal value is powerful, not only with Option[T], but also working with Future[T], Try[T], Either[A, B] and going back and forth between them (also see Monad Transformers).
Let's try and see how using predefined methods on Option[T] can help with control flow. For example, consider a case where you have an Option[Int] which you want to multiply only if it's greater than a value, otherwise return -1. In the imperative approach, we get:
val option: Option[Int] = generateOptionValue
var res: Int = if (option.isDefined) {
val value = option.get
if (value > 40) value * 2 else -1
} else -1
Using collections style method on Option, an equivalent would look like:
val result: Int = option
.filter(_ > 40)
.map(_ * 2)
.getOrElse(-1)
Let's now consider a case for composition. Let's say we have an operation which might throw an exception. Additionaly, this operation may or may not yield a value. If it returns a value, we want to query a database with that value, otherwise, return an empty string.
A look at the imperative approach with a try-catch block:
var result: String = _
try {
val maybeResult = dangerousMethod()
if (maybeResult.isDefined) {
result = queryDatabase(maybeResult.get)
} else result = ""
}
catch {
case NonFatal(e) => result = ""
}
Now let's consider using scala.util.Try along with an Option[String] and composing both together:
val result: String = Try(dangerousMethod())
.toOption
.flatten
.map(queryDatabase)
.getOrElse("")
I think this eventually boils down to which one can help you create clear control flow of your operations. Getting used to working with Option[T].map rather than Option[T].get will make your code safer.
To wrap up, I don't believe there's a single truth. I do believe that composition can lead to beautiful, readable, side effect deferring safe code and I'm all for it. I think the best way to show other people what you feel is by giving them examples as we just saw, and letting them feel for themselves the power they can leverage with these sets of tools.
using pattern matching for a simple isEmpty check is an abuse of pattern matching IMHO
If you do just want an isEmpty check, isEmpty/isDefined is perfectly fine. But in your case you also want to get the value. And using pattern matching for this is not abuse; it's precisely the basic use-case. Using get allows to very easily make errors like forgetting to check isDefined or making the wrong check:
if(someOption.isEmpty){
val value = someOption.get
//some lines of code
} else{
//some other lines
}
Hopefully testing would catch it, but there's no reason to settle for "hopefully".
Combinators (map and friends) are better than get for the same reason pattern matching is: they don't allow you to make this kind of mistake. Choosing between pattern matching and combinators is a different question. Generally combinators are preferred because they are more composable (as Yuval's answer explains). If you want to do something covered by a single combinator, I'd generally choose them; if you need a combination like map ... getOrElse, or a fold with multi-line branches, it depends on the specific case.
It seems similar to you in case of Option but just consider the case of Future. You will not be able to interact with the future's value after going out of Future monad.
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Promise
import scala.util.{Success, Try}
// create a promise which we will complete after sometime
val promise = Promise[String]();
// Now lets consider the future contained in this promise
val future = promise.future;
val byGet = if (!future.value.isEmpty) {
val valTry = future.value.get
valTry match {
case Success(v) => v + " :: Added"
case _ => "DEFAULT :: Added"
}
} else "DEFAULT :: Added"
val byMap = future.map(s => s + " :: Added")
// promise was completed now
promise.complete(Try("PROMISE"))
//Now lets print both values
println(byGet)
// DEFAULT :: Added
println(byMap)
// Success(PROMISE :: Added)

Are there any types with side-effecting methods that return the original type?

Often I find myself wanting to chain a side-effecting function to the end of another method call in a more functional-looking way, but I don't want to transform the original type to Unit. Suppose I have a read method that searches a database for a record, returning Option[Record].
def read(id: Long): Option[Record] = ...
If read returns Some(record), then I might want to cache that value and move on. I could do something like this:
read(id).map { record =>
// Cache the record
record
}
But, I would like to avoid the above code and end up with something more like this to make it more clear as to what's happening:
read(id).withSideEffect { record =>
// Cache the record
}
Where withSideEffect returns the same value as read(id). After searching high and low, I can't find any method on any type that does something like this. The closest solution I can come up with is using implicit magic:
implicit class ExtendedOption[A](underlying: Option[A]) {
def withSideEffect(op: A => Unit): Option[A] = {
underlying.foreach(op)
underlying
}
}
Are there any Scala types I may have overlooked with methods like this one? And are there are any potential design flaws from using such a method?
Future.andThen (scaladoc) takes a side-effect and returns a future of the current value to facilitate fluent chaining.
The return type is not this.type.
See also duplicate questions about tap.
You can use scalaz for "explicit annotation" of side-effectful functions. In scalaz 7.0.6 it's IO monad: http://eed3si9n.com/learning-scalaz/IO+Monad.html
It's deprecated in scalaz 7.1. I would do something like that with Task
val readAndCache = Task.delay(read(id)).map(record => cacheRecord(record); record)
readAndCache.run // Run task for it's side effects

costly computation occuring in both isDefined and Apply of a PartialFunction

It is quite possible that to know whether a function is defined at some point, a significant part of computing its value has to be done. In a PartialFunction, when implementing isDefined and apply, both methods will have to do that. What to do is this common job is costly?
There is the possibility of caching its result, hoping that apply will be called after isDefined. Definitely ugly.
I often wish that PartialFunction[A,B] would be Function[A, Option[B]], which is clearly isomorphic. Or maybe, there could be another method in PartialFunction, say applyOption(a: A): Option[B]. With some mixins, implementors would have a choice of implementing either isDefined and apply or applyOption. Or all of them to be on the safe side, performance wise. Clients which test isDefined just before calling apply would be encouraged to use applyOption instead.
However, this is not so. Some major methods in the library, among them collect in collections require a PartialFunction. Is there a clean (or not so clean) way to avoid paying for computations repeated between isDefined and apply?
Also, is the applyOption(a: A): Option[B] method reasonable? Does it sound feasible to add it in a future version? Would it be worth it?
Why is caching such a problem? In most cases, you have a local computation, so as long as you write a wrapper for the caching, you needn't worry about it. I have the following code in my utility library:
class DroppedFunction[-A,+B](f: A => Option[B]) extends PartialFunction[A,B] {
private[this] var tested = false
private[this] var arg: A = _
private[this] var ans: Option[B] = None
private[this] def cache(a: A) {
if (!tested || a != arg) {
tested = true
arg = a
ans = f(a)
}
}
def isDefinedAt(a: A) = {
cache(a)
ans.isDefined
}
def apply(a: A) = {
cache(a)
ans.get
}
}
class DroppableFunction[A,B](f: A => Option[B]) {
def drop = new DroppedFunction(f)
}
implicit def function_is_droppable[A,B](f: A => Option[B]) = new DroppableFunction(f)
and then if I have an expensive computation, I write a function method A => Option[B] and do something like (f _).drop to use it in collect or whatnot. (If you wanted to do it inline, you could create a method that takes A=>Option[B] and returns a partial function.)
(The opposite transformation--from PartialFunction to A => Option[B]--is called lifting, hence the "drop"; "unlift" is, I think, a more widely used term for the opposite operation.)
Have a look at this thread, Rethinking PartialFunction. You're not the only one wondering about this.
This is an interesting question, and I'll give my 2 cents.
First of resist the urge for premature optimization. Make sure the partial function is the problem. I was amazed at how fast they are on some cases.
Now assuming there is a problem, where would it come from?
Could be a large number of case clauses
Complex pattern matching
Some complex computation on the if causes
One option I'd try to find ways to fail fast. Break the pattern matching into layer, then chain partial functions. This way you can fail the match early. Also extract repeated sub matching. For example:
Lets assume OddEvenList is an extractor that break a list into a odd list and an even list:
var pf1: PartialFuntion[List[Int],R] = {
case OddEvenList(1::ors, 2::ers) =>
case OddEvenList(3::ors, 4::ors) =>
}
Break to two part, one that matches the split then one that tries to match re result (to avoid repeated computation. However this may require some re-engineering
var pf2: PartialFunction[(List[Int],List[Int],R) = {
case (1 :: ors, 2 :: ers) => R1
case (3 :: ors, 4 :: ors) => R2
}
var pf1: PartialFuntion[List[Int],R] = {
case OddEvenList(ors, ers) if(pf2.isDefinedAt(ors,ers) => pf2(ors,ers)
}
I have used this when progressively reading XML files that hard a rather inconstant format.
Another option is to compose partial functions using andThen. Although a quick test here seamed to indicate that only the first was is actually tests.
There is absolutely nothing wrong with caching mechanism inside the partial function, if:
the function returns always the same input, when passed the same argument
it has no side effects
it is completely hidden from the rest of the world
Such cached function is not distiguishable from a plain old pure partial function...