scala currying by nested functions or by multiple parameter lists - scala

In Scala, I can define a function with two parameter lists.
def myAdd(x :Int)(y :Int) = x + y
This makes it easy to define a partially applied function.
val plusFive = myAdd(5) _
But, I can accomplish something similar by defining and returning a nested function.
def myOtherAdd(x :Int) = {
def f(y :Int) = x + y
f _
}
Cosmetically, I've moved the underscore, but this still feels like currying.
val otherPlusFive = myOtherAdd(5)
What criteria should I use to prefer one approach over the other?

There are at least four ways to accomplish the same thing:
def myAddA(x: Int, y: Int) = x + y
val plusFiveA: Int => Int = myAddA(5,_)
def myAddB(x: Int)(y : Int) = x + y
val plusFiveB = myAddB(5) _
def myAddC(x: Int) = (y: Int) => x + y
val plusFiveC = myAddC(5)
def myAddD(x: Int) = {
def innerD(y: Int) = x + y
innerD _
}
val plusFiveD = myAddD(5)
You might want to know which is most efficient or which is the best style (for some non-performance based measure of best).
As far as efficiency goes, it turns out that all four are essentially equivalent. The first two cases actually emit exactly the same bytecode; the JVM doesn't know anything about multiple parameter lists, so once the compiler figures it out (you need to help it with a type annotation on the case A), it's all the same under the hood. The third case is also extremely close, but since it promises up front to return a function and specifies it on the spot, it can avoid one internal field. The fourth case is pretty much the same as the first two in terms of work done; it just does the conversion to Function1 inside the method instead of outside.
In terms of style, I suggest that B and C are the best ways to go, depending on what you're doing. If your primary use case is to create a function, not to call in-place with both parameter lists, then use C, because it tells you what it's going to do. (This version is also particularly familiar to people coming from Haskell, for instance.) On the other hand, if you are mostly going to call it in place but will only occasionally curry it, then use B. Again, it says more clearly what it's expected to do.

You could also do this:
def yetAnotherAdd(x: Int) = x + (_: Int)
You should choose the API based on intention. The main reason in Scala to have multiple parameter lists is to help type inference. For instance:
def f[A](x: A)(f: A => A) = ...
f(5)(_ + 5)
One can also use it to have multiple varargs, but I have never seen code like that. And, of course, there's the need for the implicit parameter list, but that's pretty much another matter.
Now, there are many ways you can have functions returning functions, which is pretty much what currying does. You should use them if the API should be thought of as a function which returns a function.
I think it is difficult to get any more precise than this.

Another benefit of having a method return a function directly (instead of using partial application) is that it leads to much cleaner code when using infix notation, allowing you to avoid a bucketload of parentheses and underscores in more complex expressions.
Consider:
val list = List(1,2,3,4)
def add1(a: Int)(b: Int) = a + b
list map { add1(5) _ }
//versus
def add2(a: Int) = a + (_: Int)
list map add2(5)

Related

How to implement memoization in Scala without mutability?

I was recently reading Category Theory for Programmers and in one of the challenges, Bartosz proposed to write a function called memoize which takes a function as an argument and returns the same one with the difference that, the first time this new function is called, it stores the result of the argument and then returns this result each time it is called again.
def memoize[A, B](f: A => B): A => B = ???
The problem is, I can't think of any way to implement this function without resorting to mutability. Moreover, the implementations I have seen uses mutable data structures to accomplish the task.
My question is, is there a purely functional way of accomplishing this? Maybe without mutability or by using some functional trick?
Thanks for reading my question and for any future help. Have a nice day!
is there a purely functional way of accomplishing this?
No. Not in the narrowest sense of pure functions and using the given signature.
TLDR: Use mutable collections, it's okay!
Impurity of g
val g = memoize(f)
// state 1
g(a)
// state 2
What would you expect to happen for the call g(a)?
If g(a) memoizes the result, an (internal) state has to change, so the state is different after the call g(a) than before.
As this could be observed from the outside, the call to g has side effects, which makes your program impure.
From the Book you referenced, 2.5 Pure and Dirty Functions:
[...] functions that
always produce the same result given the same input and
have no side effects
are called pure functions.
Is this really a side effect?
Normally, at least in Scala, internal state changes are not considered side effects.
See the definition in the Scala Book
A pure function is a function that depends only on its declared inputs and its internal algorithm to produce its output. It does not read any other values from “the outside world” — the world outside of the function’s scope — and it does not modify any values in the outside world.
The following examples of lazy computations both change their internal states, but are normally still considered purely functional as they always yield the same result and have no side effects apart from internal state:
lazy val x = 1
// state 1: x is not computed
x
// state 2: x is 1
val ll = LazyList.continually(0)
// state 1: ll = LazyList(<not computed>)
ll(0)
// state 2: ll = LazyList(0, <not computed>)
In your case, the equivalent would be something using a private, mutable Map (as the implementations you may have found) like:
def memoize[A, B](f: A => B): A => B = {
val cache = mutable.Map.empty[A, B]
(a: A) => cache.getOrElseUpdate(a, f(a))
}
Note that the cache is not public.
So, for a pure function f and without looking at memory consumption, timings, reflection or other evil stuff, you won't be able to tell from the outside whether f was called twice or g cached the result of f.
In this sense, side effects are only things like printing output, writing to public variables, files etc.
Thus, this implementation is considered pure (at least in Scala).
Avoiding mutable collections
If you really want to avoid var and mutable collections, you need to change the signature of your memoize method.
This is, because if g cannot change internal state, it won't be able to memoize anything new after it was initialized.
An (inefficient but simple) example would be
def memoizeOneValue[A, B](f: A => B)(a: A): (B, A => B) = {
val b = f(a)
val g = (v: A) => if (v == a) b else f(v)
(b, g)
}
val (b1, g) = memoizeOneValue(f, a1)
val (b2, h) = memoizeOneValue(g, a2)
// ...
The result of f(a1) would be cached in g, but nothing else. Then, you could chain this and always get a new function.
If you are interested in a faster version of that, see #esse's answer, which does the same, but more efficient (using an immutable map, so O(log(n)) instead of the linked list of functions above, O(n)).
Let's try(Note: I have change the return type of memoize to store the cached data):
import scala.language.existentials
type M[A, B] = A => T forSome { type T <: (B, A => T) }
def memoize[A, B](f: A => B): M[A, B] = {
import scala.collection.immutable
def withCache(cache: immutable.Map[A, B]): M[A, B] = a => cache.get(a) match {
case Some(b) => (b, withCache(cache))
case None =>
val b = f(a)
(b, withCache(cache + (a -> b)))
}
withCache(immutable.Map.empty)
}
def f(i: Int): Int = { print(s"Invoke f($i)"); i }
val (i0, m0) = memoize(f)(1) // f only invoked at first time
val (i1, m1) = m0(1)
val (i2, m2) = m1(1)
Yes there is pure functional ways to implement polymorphic function memoization. The topic is surprisingly deep and even summons the Yoneda Lemma, which is likely what Bartosz had in mind with this exercise.
The blog post Memoization in Haskell gives a nice introduction by simplifying the problem a bit: instead of looking at arbitrary functions it restricts the problem to functions from the integers.
The following memoize function takes a function of type Int -> a and
returns a memoized version of the same function. The trick is to turn
a function into a value because, in Haskell, functions are not
memoized but values are. memoize converts a function f :: Int -> a
into an infinite list [a] whose nth element contains the value of f n.
Thus each element of the list is evaluated when it is first accessed
and cached automatically by the Haskell runtime thanks to lazy
evaluation.
memoize :: (Int -> a) -> (Int -> a)
memoize f = (map f [0 ..] !!)
Apparently the approach can be generalised to function of arbitrary domains. The trick is to come up with a way to use the type of the domain as an index into a lazy data structure used for "storing" previous values. And this is where the Yoneda Lemma comes in and my own understanding of the topic becomes flimsy.

Scala: currying vs closure

I can use closure to define function like this:
def product(x: Int) = (y: Int) => x * y
val productBy3 = product(3)
println(productBy3(6)) // 18
OR, using currying:
def curriedProduct(x: Int)(y: Int) = x * y
val productBy3 = curriedProduct(3)_
println(productBy3(6)) // 18
Any advantage/disadvantage one approach has over other?
The first is an example of a method returning a function. The second is an example of a method with multiple parameter lists.
The way you use them, there is no difference.
When called as product(3)(6), the second may be a bit faster, but not to an extent that would normally be a concern.
I would use the first form when the expected way to call it would be with product(3), and use the second form if the normal way to call it would be product(3)(6).
Lastly, I'd like to suggest the possibility of
def product(i: Int, j: Int) = i * j
val productBy3 = product(3, _)
println(productBy3(6)) //18
I don't really see any upsides of using the second form instead of either this alternative or the first alternative in this situation. Using multiple parameter lists may help type inference in scala 2 (see https://docs.scala-lang.org/tour/multiple-parameter-lists.html), but there is no problematic inference here anyway.

Binary operator with Option arguments

In scala, how do I define addition over two Option arguments? Just to be specific, let's say they're wrappers for Int types (I'm actually working with maps of doubles but this example is simpler).
I tried the following but it just gives me an error:
def addOpt(a:Option[Int], b:Option[Int]) = {
a match {
case Some(x) => x.get
case None => 0
} + b match {
case Some(y) => y.get
case None => 0
}
}
Edited to add:
In my actual problem, I'm adding two maps which are standins for sparse vectors. So the None case returns Map[Int, Double] and the + is actually a ++ (with the tweak at stackoverflow.com/a/7080321/614684)
Monoids
You might find life becomes a lot easier when you realize that you can stand on the shoulders of giants and take advantage of common abstractions and the libraries built to use them. To this end, this question is basically about dealing with
monoids (see related questions below for more about this) and the library in question is called scalaz.
Using scalaz FP, this is just:
def add(a: Option[Int], b: Option[Int]) = ~(a |+| b)
What is more this works on any monoid M:
def add[M: Monoid](a: Option[M], b: Option[M]) = ~(a |+| b)
Even more usefully, it works on any number of them placed inside a Foldable container:
def add[M: Monoid, F: Foldable](as: F[Option[M]]) = ~as.asMA.sum
Note that some rather useful monoids, aside from the obvious Int, String, Boolean are:
Map[A, B: Monoid]
A => (B: Monoid)
Option[A: Monoid]
In fact, it's barely worth the bother of extracting your own method:
scala> some(some(some(1))) #:: some(some(some(2))) #:: Stream.empty
res0: scala.collection.immutable.Stream[Option[Option[Option[Int]]]] = Stream(Some(Some(Some(1))), ?)
scala> ~res0.asMA.sum
res1: Option[Option[Int]] = Some(Some(3))
Some related questions
Q. What is a monoid?
A monoid is a type M for which there exists an associative binary operation (M, M) => M and an identity I under this operation, such that mplus(m, I) == m == mplus(I, m) for all m of type M
Q. What is |+|?
This is just scalaz shorthand (or ASCII madness, ymmv) for the mplus binary operation
Q. What is ~?
It is a unary operator meaning "or identity" which is retrofitted (using scala's implicit conversions) by the scalaz library onto Option[M] if M is a monoid. Obviously a non-empty option returns its contents; an empty option is replaced by the monoid's identity.
Q. What is asMA.sum?
A Foldable is basically a datastructure which can be folded over (like foldLeft, for example). Recall that foldLeft takes a seed value and an operation to compose successive computations. In the case of summing a monoid, the seed value is the identity I and the operation is mplus. You can hence call asMA.sum on a Foldable[M : Monoid]. You might need to use asMA because of the name clash with the standard library's sum method.
Some References
Slides and Video of a talk I gave which gives practical examples of using monoids in the wild
def addOpts(xs: Option[Int]*) = xs.flatten.sum
This will work for any number of inputs.
If they both default to 0 you don't need pattern matching:
def addOpt(a:Option[Int], b:Option[Int]) = {
a.getOrElse(0) + b.getOrElse(0)
}
(Repeating comment above in an answer as requested)
You don't extract the content of the option the proper way. When you match with case Some(x), x is the value inside the option(type Int) and you don't call get on that. Just do
case Some(x) => x
Anyway, if you want content or default, a.getOrElse(0) is more convenient
def addOpt(ao: Option[Int], bo: Option[Int]) =
for {
a <- ao
b <- bo
} yield a + b

How can I write f(g(h(x))) in Scala with fewer parentheses?

Expressions like
ls map (_ + 1) sum
are lovely because they are left-to-right and not nested. But if the functions in question are defined outside the class, it is less pretty.
Following an example I tried
final class DoublePlus(val self: Double) {
def hypot(x: Double) = sqrt(self*self + x*x)
}
implicit def doubleToDoublePlus(x: Double) =
new DoublePlus(x)
which works fine as far as I can tell, other than
A lot of typing for one method
You need to know in advance that you want to use it this way
Is there a trick that will solve those two problems?
You can call andThen on a function object:
(h andThen g andThen f)(x)
You can't call it on methods directly though, so maybe your h needs to become (h _) to transform the method into a partially applied function. The compiler will translate subsequent method names to functions automatically because the andThen method accepts a Function parameter.
You could also use the pipe operator |> to write something like this:
x |> h |> g |> f
Enriching an existing class/interface with an implicit conversion (which is what you did with doubleToDoublePlus) is all about API design when some classes aren't under your control. I don't recommend to do that lightly just to save a few keystrokes or having a few less parenthesis. So if it's important to be able to type val h = d hypot x, then the extra keystrokes should not be a concern. (there may be object allocations concerns but that's different).
The title and your example also don't match:
f(g(h(x))) can be rewritten asf _ compose g _ compose h _ apply x if your concern is about parenthesis or f compose g compose h apply x if f, g, h are function objects rather than def.
But ls map (_ + 1) sum aren't nested calls as you say, so I'm not sure how that relates to the title. And although it's lovely to use, the library/language designers went through a lot of efforts to make it easy to use and under the hood is not simple (much more complex than your hypot example).
def fgh (n: N) = f(g(h(n)))
val m = fgh (n)
Maybe this, observe how a is provided:
def compose[A, B, C](f: B => C, g: A => B): A => C = (a: A) => f(g(a))
basically like the answer above combine the desired functions to a intermediate one which you then can use easily with map.
Starting Scala 2.13, the standard library provides the chaining operation pipe which can be used to convert/pipe a value with a function of interest.
Using multiple pipes we can thus build a pipeline which as mentioned in the title of your question, minimizes the number of parentheses:
import scala.util.chaining._
x pipe h pipe g pipe f

Most concise way to combine sequence elements

Say we have two sequences and we and we want to combine them using some method
val a = Vector(1,2,3)
val b = Vector(4,5,6)
for example addition could be
val c = a zip b map { i => i._1 + i._2 }
or
val c = a zip b map { case (i, j) => i + j }
The repetition in the second part makes me think this should be possible in a single operation. I can't see any built-in method for this. I suppose what I really want is a zip method that skips the creation and extraction of tuples.
Is there a prettier / more concise way in plain Scala, or maybe with Scalaz? If not, how would you write such a method and pimp it onto sequences so I could write something like
val c = a zipmap b (_+_)
There is
(a,b).zipped.map(_ + _)
which is probably close enough to what you want to not bother with an extension. (You can't use it point-free, unfortunately, since the implicits on zipped don't like that.)
Rex's answer is certainly the easier way out for most cases. However, zipped is more limited than zip, so you might stumble upon cases where it won't work.
For those cases, you might try this:
val c = a zip b map (Function tupled (_+_))
Or, alternatively, if you do have a function or method that does what you want, you have this option as well:
def sumFunction = (a: Int, b: Int) => a + b
def sumMethod(a: Int, b: Int) = a + b
val c1 = a zip b map sumFunction.tupled
val c2 = a zip b map (sumMethod _).tupled
Using .tupled won't work in the first case because Scala won't be able to infer the type of the function.