Efficient implementation of Catamorphisms in Scala

For a datatype representing the natural numbers:
sealed trait Nat
case object Z extends Nat
case class S(pred: Nat) extends Nat
In Scala, here is an elementary way of implementing the corresponding catamorphism:
def cata[A](z: A)(l: Nat)(f: A => A): A = l match {
  case Z => z
  case S(xs) => f(cata(z)(xs)(f))
}
However, since the recursive call to cata isn't in tail position, this can easily trigger a stack overflow.
What are alternative implementation options that will avoid this? I'd rather not go down the route of F-algebras unless the interface ultimately presented by the code can look pretty much as simple as the above.
EDIT: Looks like this might be directly relevant: Is it possible to use continuations to make foldRight tail recursive?

If you were implementing a catamorphism on lists, that would be what in Haskell we call a foldr. We know that foldr does not have a tail-recursive definition, but foldl does. So if you insist on a tail-recursive program, the right thing to do is reverse the list argument (tail-recursively, in linear time), then use a foldl in place of the foldr.
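In Scala terms, that recipe for lists might look like this (a quick sketch using only the standard library; foldRightViaLeft is my own name for it):
def foldRightViaLeft[A, B](xs: List[A])(z: B)(f: (A, B) => B): B =
  xs.reverse.foldLeft(z)((acc, a) => f(a, acc))  // reverse first, then a tail-recursive foldLeft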
Your example uses the simpler data type of naturals (and a truly "efficient" implementation would use machine integers, but we'll agree to leave that aside). What is the reverse of one of your natural numbers? Just the number itself, because we can think of it as a list with no data in each node, so we can't tell the difference when it is reversed! And what's the equivalent of the foldl? It's the program (forgive the pseudocode)
def cata(z, a, f) = {
  var x = a, y = z;
  while (x != Z) {
    y = f(y);
    x = pred(x)
  }
  return y
}
Or as a Scala tail-recursion,
@annotation.tailrec
def cata[A](z: A)(a: Nat)(f: A => A): A = a match {
  case Z => z
  case S(b) => cata(f(z))(b)(f)
}
Will that do?
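For what it's worth, a small usage check (toInt is my own helper name, assuming the Nat encoding from the question):
def toInt(n: Nat): Int = cata(0)(n)(_ + 1)
// toInt(S(S(S(Z)))) == 3, and the accumulator grows instead of the stack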

Yes, this is exactly the motivating example in the paper Clowns to the left of me, jokers to the right (Dissecting Data Structures) (an updated, better, but non-free version is available here: http://dl.acm.org/citation.cfm?id=1328474).
The basic idea is that you want to turn your recursive function into a loop, so you need to figure out a data structure that keeps track of the state of the procedure, which is
What you've calculated so far.
What you have left to do.
The type of this state depends on the structure of the type you're doing the fold over: at any point in the fold you are at some node of the tree, and you need to remember the tree structure of "the rest of the tree".
The paper shows how you can calculate that state type mechanically. If you do this for Lists, you get that the state you need to keep track of is
The operation run on all the previous values.
The list of elements left to process.
Which is exactly what foldl keeps track of, so it's kind of a coincidence that foldl and foldr can be given the same type.
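To make the idea concrete, here is a hedged sketch of what that mechanically derived state can look like for a small binary tree type (Tree, Frame, GoRight and Combine are my own names, not the paper's): the stack of Frames is precisely "the rest of the tree" that the dissection captures, and the whole fold runs in a loop.
sealed trait Tree
case object Leaf extends Tree
case class Node(l: Tree, v: Int, r: Tree) extends Tree

def cataTree[A](leaf: A)(node: (A, Int, A) => A)(t: Tree): A = {
  sealed trait Frame
  case class GoRight(v: Int, r: Tree) extends Frame  // right subtree still to be folded
  case class Combine(l: A, v: Int) extends Frame     // left result ready, waiting for the right one
  // Either descending into a subtree (Left) or returning with a result (Right).
  var focus: Either[Tree, A] = Left(t)
  var stack: List[Frame] = Nil                       // "the rest of the tree"
  while (true) {
    (focus, stack) match {
      case (Left(Leaf), _)                => focus = Right(leaf)
      case (Left(Node(l, v, r)), s)       => focus = Left(l); stack = GoRight(v, r) :: s
      case (Right(a), Nil)                => return a
      case (Right(a), GoRight(v, r) :: s) => focus = Left(r); stack = Combine(a, v) :: s
      case (Right(b), Combine(a, v) :: s) => focus = Right(node(a, v, b)); stack = s
    }
  }
  sys.error("unreachable")
}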

Related

How to implement memoization in Scala without mutability?

I was recently reading Category Theory for Programmers and in one of the challenges, Bartosz proposed to write a function called memoize which takes a function as an argument and returns the same function, with the difference that, the first time this new function is called, it stores the result for that argument and then returns this cached result each time it is called again.
def memoize[A, B](f: A => B): A => B = ???
The problem is, I can't think of any way to implement this function without resorting to mutability. Moreover, the implementations I have seen use mutable data structures to accomplish the task.
My question is, is there a purely functional way of accomplishing this? Maybe without mutability or by using some functional trick?
Thanks for reading my question and for any future help. Have a nice day!
is there a purely functional way of accomplishing this?
No. Not in the narrowest sense of pure functions and using the given signature.
TLDR: Use mutable collections, it's okay!
Impurity of g
val g = memoize(f)
// state 1
g(a)
// state 2
What would you expect to happen for the call g(a)?
If g(a) memoizes the result, an (internal) state has to change, so the state is different after the call g(a) than before.
As this could be observed from the outside, the call to g has side effects, which makes your program impure.
From the Book you referenced, 2.5 Pure and Dirty Functions:
[...] functions that
always produce the same result given the same input and
have no side effects
are called pure functions.
Is this really a side effect?
Normally, at least in Scala, internal state changes are not considered side effects.
See the definition in the Scala Book
A pure function is a function that depends only on its declared inputs and its internal algorithm to produce its output. It does not read any other values from “the outside world” — the world outside of the function’s scope — and it does not modify any values in the outside world.
The following examples of lazy computations both change their internal states, but are normally still considered purely functional as they always yield the same result and have no side effects apart from internal state:
lazy val x = 1
// state 1: x is not computed
x
// state 2: x is 1
val ll = LazyList.continually(0)
// state 1: ll = LazyList(<not computed>)
ll(0)
// state 2: ll = LazyList(0, <not computed>)
In your case, the equivalent would be something using a private, mutable Map (as the implementations you may have found) like:
import scala.collection.mutable

def memoize[A, B](f: A => B): A => B = {
  val cache = mutable.Map.empty[A, B]
  (a: A) => cache.getOrElseUpdate(a, f(a))
}
Note that the cache is not public.
So, for a pure function f and without looking at memory consumption, timings, reflection or other evil stuff, you won't be able to tell from the outside whether f was called twice or g cached the result of f.
In this sense, side effects are only things like printing output, writing to public variables, files etc.
Thus, this implementation is considered pure (at least in Scala).
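For instance, with the memoize above and a pure f, the caching is unobservable:
val g = memoize((i: Int) => i + 1)
g(1) // 2: computed via f and cached
g(1) // 2: served from the cache; indistinguishable from recomputing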
Avoiding mutable collections
If you really want to avoid var and mutable collections, you need to change the signature of your memoize method.
This is because, if g cannot change internal state, it won't be able to memoize anything new after it has been initialized.
An (inefficient but simple) example would be
def memoizeOneValue[A, B](f: A => B)(a: A): (B, A => B) = {
  val b = f(a)
  val g = (v: A) => if (v == a) b else f(v)
  (b, g)
}
val (b1, g) = memoizeOneValue(f)(a1)
val (b2, h) = memoizeOneValue(g)(a2)
// ...
The result of f(a1) would be cached in g, but nothing else. Then, you could chain this and always get a new function.
If you are interested in a faster version of that, see esse's answer below, which does the same but more efficiently (using an immutable map, so O(log(n)) lookups instead of the O(n) linked list of functions above).
Let's try (note: I have changed the return type of memoize to store the cached data):
import scala.language.existentials

type M[A, B] = A => T forSome { type T <: (B, A => T) }

def memoize[A, B](f: A => B): M[A, B] = {
  import scala.collection.immutable
  def withCache(cache: immutable.Map[A, B]): M[A, B] = a => cache.get(a) match {
    case Some(b) => (b, withCache(cache))
    case None =>
      val b = f(a)
      (b, withCache(cache + (a -> b)))
  }
  withCache(immutable.Map.empty)
}
def f(i: Int): Int = { print(s"Invoke f($i)"); i }
val (i0, m0) = memoize(f)(1) // f is only invoked the first time
val (i1, m1) = m0(1)
val (i2, m2) = m1(1)
Yes, there are purely functional ways to implement polymorphic function memoization. The topic is surprisingly deep and even summons the Yoneda Lemma, which is likely what Bartosz had in mind with this exercise.
The blog post Memoization in Haskell gives a nice introduction by simplifying the problem a bit: instead of looking at arbitrary functions it restricts the problem to functions from the integers.
The following memoize function takes a function of type Int -> a and returns a memoized version of the same function. The trick is to turn a function into a value because, in Haskell, functions are not memoized but values are. memoize converts a function f :: Int -> a into an infinite list [a] whose nth element contains the value of f n. Thus each element of the list is evaluated when it is first accessed and cached automatically by the Haskell runtime thanks to lazy evaluation.
memoize :: (Int -> a) -> (Int -> a)
memoize f = (map f [0 ..] !!)
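The same trick carries over to Scala (my translation, not from the blog post): a LazyList caches each element once it has been forced, so indexing into it memoizes f for non-negative arguments.
def memoizeNat[A](f: Int => A): Int => A = {
  val cache = LazyList.from(0).map(f)  // elements are computed on first access, then cached
  n => cache(n)
}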
Apparently the approach can be generalised to functions of arbitrary domains. The trick is to come up with a way to use the type of the domain as an index into a lazy data structure used for "storing" previous values. And this is where the Yoneda Lemma comes in, and my own understanding of the topic becomes flimsy.

How to model if-expressions with actor systems?

I'm trying to emulate a simple functional language using an actor based execution model where issues arose modelling if-expression.
Actor systems nowadays are used basically for speeding up all kinds of stuff by avoiding OS locks and stalled threads, or to make microservices less painful, but initially they were supposed to be an alternative model of computation in general [1][2]; a contemporary take on this may be propagation networks. So this should be capable of covering any programming language construct, and certainly an if, right?
While I'm aware that this is occasionally met with irritation, I saw one timid attempt to move towards recursive algorithms represented using Akka actors (I've refurbished it and added further examples, including the one given below). That attempt halted at function calls, but why not go further and also model operators and if conditions with actors? In fact, the Smalltalk language applies this model and yields a precursor of the actor concept, as has been pointed out in the accepted answers below.
Surprisingly, recursive function calls aren't much of an issue, but if1 is, due to its potentially stateful nature.
Given the clause C: if a then x else y, here's the problem:
My initial idea was that C is an actor behaving like a function with 3 parameters (a, x, y) that returns either x or y depending on a. Being maximally parallel [2], a, x and y would be evaluated simultaneously and passed as messages to C. Now, this isn't really good if C is the exit condition of a recursive function f, sending one branch of f into an infinite recursion. Also, if x or y have side effects, one can't just evaluate both of them. Let's take this recursive sum (which is not the usual factorial, stupid as such, and could be made tail recursive, but that's not the point):
f(n) = if n <= 0
         0
       else
         n + f(n - 1)
Note, that I'd like to create an if-expression resembling the one of Scala, see the (spec, p. 88), or Haskell, for that matter, rather than an if-statement that relies on side-effects.
f(0) would cause 3 concurrent evaluations
n <= 0 (ok)
0 (ok)
n + f(n - 1) (bad, introducing the weird behavior that the call to f(n) actually will return (yielding 0), but the evaluation of its branches continues infinitely)
I can see these options from here:
The whole computation becomes stateful and the evaluation of either x or y only happens after a has been calculated (mandatory if x or y have side effects).
Some guarding mechanism gets introduced that renders x or y not applicable for arguments outside a certain range upon the call of f. They might evaluate to some "not applicable" marker instead of a value, which will not be used in C anyway, since it comes from a branch which isn't relevant.
I'm not sure at this point whether I've missed the question at a fundamental level and there are obvious other approaches that I just don't see. Input appreciated :)
Btw. see this for an exhaustive list of conditional branching in different languages (without giving their semantics); the wiki page on conditionals (with semantics, though not exhaustive); and this for a discussion of how the question at hand is an issue down to the level of hardware.
1 I'm aware that an if could be seen as a special case of pattern matching, but then the question is how to model the different cases of a match expression using actors. But maybe that wasn't even intended in the first place; matching is just something that every actor can do without referring to other specialized "match-actors". On the other hand it has been stated that "everything is an actor", which is rather confusing [2]. Btw. does anybody have a clear notion of what the [#message whatever] notation is meant to be in that paper? # is irritatingly undefined. Maybe Smalltalk gives a hint; there it indicates a symbol.
There is a little bit of a misconception in your question. In functional languages, if is not necessarily a function of three parameters. Rather, it is sometimes two functions of two parameters.
In particular, that is how the Church Encoding of Booleans works in λ-calculus: there are two functions, let's call them True and False. Both functions have two parameters. True simply returns the first argument, False simply returns the second argument.
First, let's define two functions called true and false. We could define them any way we want, they are completely arbitrary, but we will define them in a very special way which has some advantages as we will see later (I will use ECMAScript as a somewhat reasonable approximation of λ-calculus that is probably readable by a bigger portion of visitors to this site than λ-calculus itself):
const tru = (thn, _ ) => thn,
fls = (_ , els) => els;
tru is a function with two parameters which simply ignores its second argument and returns the first. fls is also a function with two parameters which simply ignores its first argument and returns the second.
Why did we encode tru and fls this way? Well, this way, the two functions not only represent the two concepts of true and false, no, at the same time, they also represent the concept of "choice", in other words, they are also an if/then/else expression! We evaluate the if condition and pass it the then block and the else block as arguments. If the condition evaluates to tru, it will return the then block, if it evaluates to fls, it will return the else block. Here's an example:
tru(23, 42);
// => 23
This returns 23, and this:
fls(23, 42);
// => 42
returns 42, just as you would expect.
There is a wrinkle, however:
tru(console.log("then branch"), console.log("else branch"));
// then branch
// else branch
This prints both then branch and else branch! Why?
Well, it returns the return value of the first argument, but it evaluates both arguments, since ECMAScript is strict and always evaluates all arguments to a function before calling the function. IOW: it evaluates the first argument which is console.log("then branch"), which simply returns undefined and has the side-effect of printing then branch to the console, and it evaluates the second argument, which also returns undefined and prints to the console as a side-effect. Then, it returns the first undefined.
In λ-calculus, where this encoding was invented, that's not a problem: λ-calculus is pure, which means it doesn't have any side-effects; therefore you would never notice that the second argument also gets evaluated. Plus, λ-calculus is lazy (or at least, it is often evaluated under normal order), meaning, it doesn't actually evaluate arguments which aren't needed. So, IOW: in λ-calculus the second argument would never be evaluated, and if it were, we wouldn't notice.
ECMAScript, however, is strict, i.e. it always evaluates all arguments. Well, actually, not always: the if/then/else, for example, only evaluates the then branch if the condition is true and only evaluates the else branch if the condition is false. And we want to replicate this behavior with our iff. Thankfully, even though ECMAScript isn't lazy, it has a way to delay the evaluation of a piece of code, the same way almost every other language does: wrap it in a function, and if you never call that function, the code will never get executed.
So, we wrap both blocks in a function, and at the end call the function that is returned:
tru(() => console.log("then branch"), () => console.log("else branch"))();
// then branch
prints then branch and
fls(() => console.log("then branch"), () => console.log("else branch"))();
// else branch
prints else branch.
We could implement the traditional if/then/else this way:
const iff = (cnd, thn, els) => cnd(thn, els);
iff(tru, 23, 42);
// => 23
iff(fls, 23, 42);
// => 42
Again, we need some extra function wrapping when calling the iff function and the extra function call parentheses in the definition of iff, for the same reason as above:
const iff = (cnd, thn, els) => cnd(thn, els)();
iff(tru, () => console.log("then branch"), () => console.log("else branch"));
// then branch
iff(fls, () => console.log("then branch"), () => console.log("else branch"));
// else branch
Now that we have those two definitions, we can implement or. First, we look at the truth table for or: if the first operand is truthy, then the result of the expression is the same as the first operand. Otherwise, the result of the expression is the result of the second operand. In short: if the first operand is true, we return the first operand, otherwise we return the second operand:
const orr = (a, b) => iff(a, () => a, () => b);
Let's check out that it works:
orr(tru,tru);
// => tru(thn, _) {}
orr(tru,fls);
// => tru(thn, _) {}
orr(fls,tru);
// => tru(thn, _) {}
orr(fls,fls);
// => fls(_, els) {}
Great! However, that definition looks a little ugly. Remember, tru and fls already act like a conditional all by themselves, so really there is no need for iff, and thus all of that function wrapping at all:
const orr = (a, b) => a(a, b);
There you have it: or (plus other boolean operators) defined with nothing but function definitions and function calls in just a handful of lines:
const tru = (thn, _ ) => thn,
fls = (_ , els) => els,
orr = (a , b ) => a(a, b),
nnd = (a , b ) => a(b, a),
ntt = a => a(fls, tru),
xor = (a , b ) => a(ntt(b), b),
iff = (cnd, thn, els) => cnd(thn, els)();
Unfortunately, this implementation is rather useless: there are no functions or operators in ECMAScript which return tru or fls, they all return true or false, so we can't use them with our functions. But there's still a lot we can do. For example, this is an implementation of a singly-linked list:
const cons = (hd, tl) => which => which(hd, tl),
car = l => l(tru),
cdr = l => l(fls);
You may have noticed something peculiar: tru and fls play a double role, they act both as the data values true and false, but at the same time, they also act as a conditional expression. They are data and behavior, bundled up into one … uhm … "thing" … or (dare I say) object! Does this idea of identifying data and behavior remind us of anything?
Indeed, tru and fls are objects. And, if you have ever used Smalltalk, Self, Newspeak or other pure object-oriented languages, you will have noticed that they implement booleans in exactly the same way: two objects true and false which have a method named if that takes two blocks (functions, lambdas, whatever) as arguments and evaluates one of them.
Here's an example of what it might look like in Scala:
sealed trait Buul {
  def apply[T, U <: T, V <: T](thn: ⇒ U)(els: ⇒ V): T
  def &&&(other: ⇒ Buul): Buul
  def |||(other: ⇒ Buul): Buul
  def ntt: Buul
}

case object Tru extends Buul {
  override def apply[T, U <: T, V <: T](thn: ⇒ U)(els: ⇒ V): U = thn
  override def &&&(other: ⇒ Buul) = other
  override def |||(other: ⇒ Buul): this.type = this
  override def ntt = Fls
}

case object Fls extends Buul {
  override def apply[T, U <: T, V <: T](thn: ⇒ U)(els: ⇒ V): V = els
  override def &&&(other: ⇒ Buul): this.type = this
  override def |||(other: ⇒ Buul) = other
  override def ntt = Tru
}
object BuulExtension {
  import scala.language.implicitConversions
  implicit def boolean2Buul(b: ⇒ Boolean): Buul = if (b) Tru else Fls
}
import BuulExtension._
(2 < 3) { println("2 is less than 3") } { println("2 is greater than 3") }
// 2 is less than 3
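Since &&& and ||| are also defined on Buul, the same implicit conversion lets us chain comparisons; a small extra usage of the code above (the output comment assumes the definitions exactly as given):
((2 < 3) &&& (4 < 3)) { println("both hold") } { println("at least one fails") }
// at least one fails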
Given the very close relationship between OO and actors (they are pretty much the same thing, actually), which is not historically surprising (Alan Kay based Smalltalk on Carl Hewitt's PLANNER; Carl Hewitt based Actors on Alan Kay's Smalltalk), I wouldn't be surprised if this turned out to be a step in the right direction to solve your problem.
Q: Didn't (I) miss out on the question at a fundamental level?
Yes, you happened to have missed a cardinal point: even functional languages, which may otherwise enjoy forms of AND- and/or OR-based fine-grained parallelism, do not grow so wild as not to respect the strictly [SERIAL] nature of if ( expression1 ) expression2 [ else expression3 ].
You have spent much effort arguing about the recursion case(s), whereas the principal property was left out of your view. Statefulness is the mother nature of computing (these toys are nothing but finite state automata; no matter how large the state space might be, it is and always will remain finite).
Even the cited Scala spec, p. 88, confirms this: "The conditional expression is evaluated by evaluating first e1. If this evaluates to true, the result of evaluating e2 is returned, otherwise the result of evaluating e3 is returned." This is a purely [SERIAL] process recipe (one step after another).
One may remember that even the evaluation of expression1 may have (and does have) state-change effects, not just "side effects": a PRNG steps into a new state whenever a random number is generated, and many similar situations.
Thus if e1 then e2 else e3 has to obey a purely [SERIAL] implementation, no matter what benefits fine-grained {AND|OR}-based parallelism may bring (working examples thereof can be seen in languages that have used it since the late '70s and early '80s).

What kind of morphism is `filter` in category theory?

In category theory, is the filter operation considered a morphism? If yes, what kind of morphism is it? Example (in Scala):
val myNums: Seq[Int] = Seq(-1, 3, -4, 2)
myNums.filter(_ > 0)
// Seq[Int] = List(3, 2) // result = subset, same type
myNums.filter(_ > -99)
// Seq[Int] = List(-1, 3, -4, 2) // result = identical to original
myNums.filter(_ > 99)
// Seq[Int] = List() // result = empty, same type
One interesting way of looking at this matter involves not picking filter as a primitive notion. There is a Haskell type class called Filterable which is aptly described as:
Like Functor, but it [includes] Maybe effects.
Formally, the class Filterable represents a functor from Kleisli Maybe to Hask.
The morphism mapping of the "functor from Kleisli Maybe to Hask" is captured by the mapMaybe method of the class, which is indeed a generalisation of the homonymous Data.Maybe function:
mapMaybe :: Filterable f => (a -> Maybe b) -> f a -> f b
The class laws are simply the appropriate functor laws (note that Just and (<=<) are, respectively, identity and composition in Kleisli Maybe):
mapMaybe Just = id
mapMaybe (g <=< f) = mapMaybe g . mapMaybe f
The class can also be expressed in terms of catMaybes...
catMaybes :: Filterable f => f (Maybe a) -> f a
... which is interdefinable with mapMaybe (cf. the analogous relationship between sequenceA and traverse)...
catMaybes = mapMaybe id
mapMaybe g = catMaybes . fmap g
... and amounts to a natural transformation between the Hask endofunctors Compose f Maybe and f.
What does all of that have to do with your question? Firstly, a functor is a morphism between categories, and a natural transformation is a morphism between functors. That being so, it is possible to talk of morphisms here in a sense that is less boring than the "morphisms in Hask" one. You won't necessarily want to do so, but in any case it is an existing vantage point.
Secondly, filter is, unsurprisingly, also a method of Filterable, its default definition being:
filter :: Filterable f => (a -> Bool) -> f a -> f a
filter p = mapMaybe $ \a -> if p a then Just a else Nothing
Or, to spell it using another cute combinator:
filter p = mapMaybe (ensure p)
That indirectly gives filter a place in this particular constellation of categorical notions.
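For the Scala-inclined, the type class translates roughly as follows. This is my own sketch, not a standard library trait, and the law checks are omitted:
import scala.language.higherKinds

trait Filterable[F[_]] {
  def mapMaybe[A, B](f: A => Option[B])(fa: F[A]): F[B]
  def filter[A](p: A => Boolean)(fa: F[A]): F[A] =
    mapMaybe((a: A) => if (p(a)) Some(a) else None)(fa)
}

val listFilterable: Filterable[List] = new Filterable[List] {
  def mapMaybe[A, B](f: A => Option[B])(fa: List[A]): List[B] =
    fa.flatMap(a => f(a).toList)
}
// listFilterable.filter((n: Int) => n > 0)(List(-1, 3, -4, 2)) == List(3, 2)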
To answer a question like this, I'd like to first understand what the essence of filtering is.
For instance, does it matter that the input is a list? Could you filter a tree? I don't see why not! You'd apply a predicate to each node of the tree and discard the ones that fail the test.
But what would be the shape of the result? Node deletion is not always defined or it's ambiguous. You could return a list. But why a list? Any data structure that supports appending would work. You also need an empty member of your data structure to start the appending process. So any unital magma would do. If you insist on associativity, you get a monoid. Looking back at the definition of filter, the result is a list, which is indeed a monoid. So we are on the right track.
So filter is just a special case of what's called Foldable: a data structure over which you can fold while accumulating the results in a monoid. In particular, you could use the predicate to either output a singleton list, if it's true; or an empty list (identity element), if it's false.
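Spelled out in Scala, that reading might look like this (a sketch using List concatenation as the monoid; filterViaFoldMap is my name for it):
def filterViaFoldMap[A](p: A => Boolean)(xs: List[A]): List[A] =
  xs.map(a => if (p(a)) List(a) else Nil)  // singleton on success, the monoid's identity on failure
    .foldLeft(List.empty[A])(_ ++ _)       // accumulate with the monoid operation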
If you want a categorical answer, then a fold is an example of a catamorphism, an example of a morphism in the category of algebras. The (recursive) data structure you're folding over (a list, in the case of filter) is an initial algebra for some functor (the list functor, in this case), and your predicate is used to define an algebra for this functor.
In this answer, I will assume that you are talking about filter on Set (the situation seems messier for other datatypes).
Let's first fix what we are talking about. I will talk specifically about the following function (in Scala):
def filter[A](p: A => Boolean): Set[A] => Set[A] =
  s => s filter p
When we write it down this way, we see clearly that it's a polymorphic function with type parameter A that maps predicates A => Boolean to functions that map Set[A] to other Set[A]. To make it a "morphism", we would have to find some categories first in which this thing could be a "morphism". One might hope that it's a natural transformation, and therefore a morphism in the category of endofunctors on the "default ambient category-esque structure" usually referred to as "Hask" (or "Scal"? "Scala"?). To show that it's natural, we would have to check that the following diagram commutes for every f: B => A:
                      - o f
Hom[A, Boolean] ---------------------> Hom[B, Boolean]
       |                                    |
       | filter[A]                          | filter[B]
       |                                    |
       V                 ???                V
Hom[Set[A], Set[A]] ---------------> Hom[Set[B], Set[B]]
however, here we fail immediately, because it's not clear what to even put on the horizontal arrow at the bottom, since the assignment A -> Hom[Set[A], Set[A]] doesn't even seem functorial (for the same reasons why A -> End[A] is not functorial, see here and also here).
The only "categorical" structure that I see here for a fixed type A is the following:
Predicates on A can be considered to be a partially ordered set with implication, that is p LEQ q if p implies q (i.e. either p(x) must be false, or q(x) must be true for all x: A).
Analogously, on functions Set[A] => Set[A], we can define a partial order with f LEQ g whenever for each set s: Set[A] it holds that f(s) is subset of g(s).
Then filter[A] would be monotonic, and therefore a functor between poset-categories. But that's somewhat boring.
Of course, for each fixed A, it (or rather its eta-expansion) is also just a function from A => Boolean to Set[A] => Set[A], so it's automatically a "morphism" in the "Hask-category". But that's even more boring.
filter can be written in terms of foldRight as:
def filter[A](p: A => Boolean)(ys: List[A]): List[A] =
  ys.foldRight(List.empty[A])((x, xs) => if (p(x)) x :: xs else xs)
foldRight on lists is a map of T-algebras (where here T is the List datatype functor), so filter is a map of T-algebras.
The two algebras in question here are the initial list algebra
[nil, cons]: 1 + A x List(A) ----> List(A)
and, let's say the "filter" algebra,
[nil, f]: 1 + A x List(A) ----> List(A)
where f(x, xs) = if (p(x)) x :: xs else xs.
Let's call filter(p, _) the unique map from the initial algebra to the filter algebra in this case (it is called fold in the general case). The fact that it is a map of algebras means that the following equations are satisfied:
filter(p, nil) = nil
filter(p, x::xs) = f(x, filter(p, xs))
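Unfolding the two equations on a tiny input shows how the usual recursive definition of filter falls out (a worked check, with p left symbolic):
filter(p, 1 :: 2 :: nil)
  = f(1, filter(p, 2 :: nil))     (second equation)
  = f(1, f(2, filter(p, nil)))    (second equation again)
  = f(1, f(2, nil))               (first equation)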

What is meant by parallelism in this case?

In this blog post about parallel collections in Scala http://beust.com/weblog/2011/08/15/scalas-parallel-collections/
This is mentioned in a comment by Daniel Spiewak:
Other people have already commented about the mkString example, so I’m going to leave it alone. It does reflect a much larger point about collection semantics in general. Basically, this is it: in the absence of side-effects (in user code), parallel collections have the exact same semantics as the sequential collections. Put another way:
forAll { (xs: Vector[A], f: A => B) =>
  (xs.par map f) == (xs map f)
}
Does this mean that if there are no side effects, parallelism is not achieved? If this is true, can this point be expanded on to explain why this is the case?
Does this mean that if there are no side effects parallelism is not achieved?
No, that's not what it means. When Daniel Spiewak says that
Basically, this is it: in the absence of side-effects (in user code), parallel collections have the exact same semantics as the sequential collections.
It means that if your function has no side-effects then using it to map over a simple collection or a parallel collection will yield the same outcome. Which is why:
Put another way: forAll { (xs: Vector[A], f: A => B) => (xs.par map f) == (xs map f) }
if f is side-effect free.
So, it's actually the opposite: if there are side effects parallelism is not a good idea since the outcome will be inconsistent.
It will always be run in parallel, but the result may differ when side-effects are there.
Let's say A = Int and B = Int with the following code:
var tmp = false
def f(i: Int) = if (!tmp) { tmp = true; 0 } else i + 1
So here we have a function f with a side-effect. I assume that tmp is false before running the code.
Running Vector(1,2,3).map(f) will always result in Vector(0,3,4)
But Vector(1,2,3).par.map(f) can have different results. It can be Vector(0,3,4), but since it's parallel, the second element may be mapped first, etc. So something like Vector(2,0,4) might happen.
You can be sure that the result will be the same only when f has no side effects.
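By contrast, a side-effect-free function gives the same result either way. A quick sketch (note that in Scala 2.13, .par lives in the separate scala-parallel-collections module and needs import scala.collection.parallel.CollectionConverters._):
def g(i: Int) = i + 1
Vector(1, 2, 3).map(g)      // Vector(2, 3, 4)
Vector(1, 2, 3).par.map(g)  // ParVector(2, 3, 4): same elements, same order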

How can I write f(g(h(x))) in Scala with fewer parentheses?

Expressions like
ls map (_ + 1) sum
are lovely because they are left-to-right and not nested. But if the functions in question are defined outside the class, it is less pretty.
Following an example I tried
import scala.math.sqrt

final class DoublePlus(val self: Double) {
  def hypot(x: Double) = sqrt(self*self + x*x)
}

implicit def doubleToDoublePlus(x: Double) =
  new DoublePlus(x)
which works fine as far as I can tell, other than
A lot of typing for one method
You need to know in advance that you want to use it this way
Is there a trick that will solve those two problems?
You can call andThen on a function object:
(h andThen g andThen f)(x)
You can't call it on methods directly though, so maybe your h needs to become (h _) to transform the method into a partially applied function. The compiler will translate subsequent method names to functions automatically because the andThen method accepts a Function parameter.
You could also use the pipe operator |> to write something like this:
x |> h |> g |> f
Enriching an existing class/interface with an implicit conversion (which is what you did with doubleToDoublePlus) is all about API design when some classes aren't under your control. I don't recommend doing that lightly just to save a few keystrokes or have a few fewer parentheses. So if it's important to be able to type val h = d hypot x, then the extra keystrokes should not be a concern (there may be object-allocation concerns, but that's different).
The title and your example also don't match:
f(g(h(x))) can be rewritten as f _ compose g _ compose h _ apply x if your concern is about parentheses, or f compose g compose h apply x if f, g, h are function objects rather than defs.
But ls map (_ + 1) sum aren't nested calls, as you say, so I'm not sure how that relates to the title. And although it's lovely to use, the library/language designers went through a lot of effort to make it easy to use; under the hood it is not simple (much more complex than your hypot example).
def fgh(n: N) = f(g(h(n)))
val m = fgh(n)
Maybe this; observe how a is provided:
def compose[A, B, C](f: B => C, g: A => B): A => C = (a: A) => f(g(a))
Basically, like the answer above, combine the desired functions into an intermediate one which you can then use easily with map.
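For instance, with some throwaway f, g and h chosen purely for illustration:
val h = (x: Int) => x + 1
val g = (x: Int) => x * 2
val f = (x: Int) => x - 3
val fgh = compose(f, compose(g, h))
List(1, 2, 3).map(fgh)  // List(1, 3, 5)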
Starting in Scala 2.13, the standard library provides the chaining operation pipe, which can be used to convert/pipe a value with a function of interest.
Using multiple pipes we can thus build a pipeline which, as mentioned in the title of your question, minimizes the number of parentheses:
import scala.util.chaining._
x pipe h pipe g pipe f