Functional Programming Principles

Functional Programming Principles - scala

So, I am new to functional programming and I am still trying to digest the fundamental principles. So far, I can appreciate that one should ideally code without mutable variables, assignments, loops and other imperative control structures. So I have a question. Between the following two code snipets:
def enrich(xRDD: RDD[xObject], yRDD: RDD[yObject], zRDD: RDD[zObject]): RDD[Result] = {
val temp = functionA(xRDD, yRDD)
functionB(temp, zRDD)
}
and
def enrich(xRDD: RDD[xObject], yRDD: RDD[yObject], zRDD: RDD[zObject]): RDD[Result] = {
functionB(functionA(xRDD, yRDD), zRDD)
}
which one should I opt for and why? My guess is the second one since it avoids assigning data locally to a temporary val. Is this all there is to it? Did I get it right? Am I missing something?

Both ways are good but depends on usecase which one to go for.
There is nothing wrong about both the ways. You can use any one of the above ways. But as you are not using the value returned by functionA any where except in functionB. Second way looks good (there is no extra variable). Extra variable is less of a concern (memory consumed by reference is insignificant for practical purposes.)
Misconception
Assignments are OK in functional programming. Reassignments are not OK.
Capturing the result using a variable is OK in functional programming. But using var and reassigning the var is not functional programming.

If you are ok with a bit of reading.
https://mitpress.mit.edu/sicp/full-text/book/book-Z-H-20.html
You will need to dig a bit around compilers theory to see benefits and disadvantages which cause variables.

Related

Public read-only access to a private var in Scala

Preamble: I'm teaching a course in object-functional programming using Scala and one of the things we do is to take sample problems and compare how they might be implemented using object-functional programming and using state-based, object-oriented programming, which is the background most of the students have.
So I want to implement a simple class in Scala that has a private var with a public accessor method (a very common idiom in state-based, object-oriented programming). Looking at Alvin Alexander's "Scala Cookbook" the recommended code to do this is pretty ghastly:
class Person(private var _age: Int):
def incrAge() = _age += 1
def age = _age
I say "ghastly" because I'm having to invent two names that essentially represent the age field, one used in the constructor and another used in the class interface. I'm curious if people more familiar with Scala would know of a simpler syntax that would avoid this?
EDIT: It seems clear to me now that Scala combines the val/var declaration with the given visibility (public/private), so for a var either both accessor&mutator are public or both are private. Depending on perspective, you might find this inflexible, or feel it rightly punishes you for using var 🙂.

Yes, a better way of doing it is not using var
class Person(val age: Int) {
def incrAge = new Person(age+1)
}
If you are going to write idiomatic scala code, you should start with pretending that certain parts of it simply do not exist: mostly vars, nulls and returns, but also mutable structures or collections, arrays, and certain methods like .get on a Try or an Option, or the Await object. Oh, and also isInstanceOf and asInstance.
You may ask "why do these things exist if they are not supposed to be used?". Well, because sometimes, in a very few very specific cases they are actually useful for achieving a very limited very specific purpose. But that would be probably fewer than 0.1% of the cases you will come across in your career, unless you are involved in some hard core low level library development (in which case, you would not be posting questions like this here).
So, until you acquire enough command of the language to be able to definitively distinguish those 0.1% of the cases from the other 99.9%, you are much better off simply ignoring those language features, and pretending they do not exist (if you can't figure out how to achieve a certain task without using one of those, post a question here, and people will gladly help you).
You said "Having to create two names to manage a single field is ugly." Indeed. But you know what's uglier? Using vars.
(Btw, the way you typically do this in java is getAge and setAge – still two names. The ugliness is rooted in allowing the value labeled with the given name to be different at different points of program execution, not in how specifically the semantics of mutation looks like).

Is this bad style to replace map or flatMap with return and get on Try in Scala?

I'm wondering if this is a bad code style to replace map with return and get on Try for readability? Say I have some variable with Try inside it and then I need to do anything on it.
val myData: Try[String]
I can do:
myData.flatMap{
some long code
}
Or I can do:
if (myData.isFailure) return myData
val myString = myData.get
some long code that use myString

Yes I would regard this as bad style. One of the biggest gains I have made as a programmer have come from replacing statements with expressions. So I would rewrite your example using pattern matching.
myData match {
case Success(myString) =>
some long code that use myString
case Failure(_) => tFileDf
}
I understand that if-guards are very common in languages like java and C# and they even make the code easier to read in these cases. Since moving to Scala I find less cases where if-guards would improve my code. However the move to expressions has greatly improved the quality of all of my code.
Once you make this change you will gradually use less mutable state and side effects. Gradually your methods will become pure functions. They will probably become shorter. Then you will discover the benefits of total functions. These things improve all of your code where if-guards are local optimisations that will start to clash with modern scala code, and even make it error prone to change.
In the case above I would probably consider exposing the Try, this exposes the failure case more explicitly than returning a default or error value of the same return type.
Another reason that return is discouraged is that it does not alway play nicely in functions.
My advice is to embrace the more functional aspects of Scala and see where it takes you. It will take you more time to write than the equivalent in Java it pay dividends very quickly.

Scala uses mutable variables to implement its apis

I am in the process of learning Scala through Coursera course (progfun).
We are being learned to think functionally and use tail recursions when possible to implement functions/methods.
And as an example for foreach on a list function, we have taught to implement it like:
def foreach[T](list: List[T], f: [T] => Unit) {
if (!list.isEmpty) foreach(list.tail) else f(list.head)
}
Then I was surprised when I found the following implementation in some Scala apis:
override /*IterableLike*/
def foreach[B](f: A => B) {
var these = this
while (!these.isEmpty) {
f(these.head)
these = these.tail
}
}
So how come we are being learned to use recursion and avoid using mutable variables and the api is being implemented by opposite techniques?
Have a look at scala.collection.LinearSeqOptimized where scala.collection.immutable.List extend. (similar implementation found in the List class itself)

Don't forget that Scala is intended to be a multiparadigm language. For educational purposes, it's good to know how to read and write tail-call recursive functions. But when using the language day-to-day, it's important to remember that it's not pure FP.
It's possible that part of the library predated TCO and the #tailrec annotation. You'd have to look at commit history to find out.
That implementation of foreach might use a mutable var, but from the outside, it appears to be pure. Ultimately, this is exactly what TCO would do behind the scenes.

There are two parts to your question:
So how come we are being learned to use recursion and avoid using mutable variables
Because the teachers assume that you either already know about imperative programming with mutable state and loops or will be exposed to it sometime during your career anyway, so they would rather focus on teaching you the things you are less likely to pick up on your own.
Also, imperative programming with mutable state is much harder to reason about, much harder to understand and thus much harder to teach.
and the api is being implemented by opposite techniques?
Because the Scala standard library is intended to be a high-performance industrial-strength library, not a teaching example. Maybe the person who wrote that code profiled it and measured it to be 0.001% percent faster than the tail-recursive version. Maybe, when that code was written, the compiler couldn't yet reliably optimize the tail-recursive version.
Don't forget that Iterable and friends are the cornerstone of Scala's collections library, those methods you are looking at are probably among the most often executed methods in the entire Scala universe. Even the tiniest performance optimization pays out in a method that is executed billions of times.

Is there any advantage to avoiding while loops in Scala?

Reading Scala docs written by the experts one can get the impression that tail recursion is better than a while loop, even when the latter is more concise and clearer. This is one example
object Helpers {
implicit class IntWithTimes(val pip:Int) {
// Recursive
def times(f: => Unit):Unit = {
#tailrec
def loop(counter:Int):Unit = {
if (counter >0) { f; loop(counter-1) }
}
loop(pip)
}
// Explicit loop
def :#(f: => Unit) = {
var lc = pip
while (lc > 0) { f; lc -= 1 }
}
}
}
(To be clear, the expert was not addressing looping at all, but in the example they chose to write a loop in this fashion as if by instinct, which is what the raised the question for me: should I develop a similar instinct..)
The only aspect of the while loop that could be better is the iteration variable should be local to the body of the loop, and the mutation of the variable should be in a fixed place, but Scala chooses not to provide that syntax.
Clarity is subjective, but the question is does the (tail) recursive style offer improved performance?

I'm pretty sure that, due to the limitations of the JVM, not every potentially tail-recursive function will be optimised away by the Scala compiler as so, so the short (and sometimes wrong) answer to your question on performance is no.
The long answer to your more general question (having an advantage) is a little more contrived. Note that, by using while, you are in fact:
creating a new variable that holds a counter.
mutating that variable.
Off-by-one errors and the perils of mutability will ensure that, on the long run, you'll introduce bugs with a while pattern. In fact, your times function could easily be implemented as:
def times(f: => Unit) = (1 to pip) foreach f
Which not only is simpler and smaller, but also avoids any creation of transient variables and mutability. In fact, if the type of the function you are calling would be something to which the results matter, then the while construction would start to be even more difficult to read. Please attempt to implement the following using nothing but whiles:
def replicate(l: List[Int])(times: Int) = l.flatMap(x => List.fill(times)(x))
Then proceed to define a tail-recursive function that does the same.
UPDATE:
I hear you saying: "hey! that's cheating! foreach is neither a while nor a tail-rec call". Oh really? Take a look into Scala's definition of foreach for Lists:
def foreach[B](f: A => B) {
var these = this
while (!these.isEmpty) {
f(these.head)
these = these.tail
}
}
If you want to learn more about recursion in Scala, take a look at this blog post. Once you are into functional programming, go crazy and read Rúnar's blog post. Even more info here and here.

In general, a directly tail recursive function (i.e., one that always calls itself directly and cannot be overridden) will always be optimized into a while loop by the compiler. You can use the #tailrec annotation to verify that the compiler is able to do this for a particular function.
As a general rule, any tail recursive function can be rewritten (usually automatically by the compiler) as a while loop and vice versa.
The purpose of writing functions in a (tail) recursive style is not to maximize performance or even conciseness, but to make the intent of the code as clear as possible, while simultaneously minimizing the chance of introducing bugs (by eliminating mutable variables, which generally make it harder to keep track of what the "inputs" and "outputs" of the function are). A properly written recursive function consists of a series of checks for terminating conditions (using either cascading if-else or a pattern match) with the recursive call(s) (plural only if not tail recursive) made if none of the terminating conditions are met.
The benefit of using recursion is most dramatic when there are several different possible terminating conditions. A series of if conditionals or patterns is generally much easier to comprehend than a single while condition with a whole bunch of (potentially complex and inter-related) boolean expressions &&'d together, especially if the return value needs to be different depending on which terminating condition is met.

Did these experts say that performance was the reason? I'm betting their reasons are more to do with expressive code and functional programming. Could you cite examples of their arguments?
One interesting reason why recursive solutions can be more efficient than more imperative alternatives is that they very often operate on lists and in a way that uses only head and tail operations. These operations are actually faster than random-access operations on more complex collections.
Anther reason that while-based solutions may be less efficient is that they can become very ugly as the complexity of the problem increases...
(I have to say, at this point, that your example is not a good one, since neither of your loops do anything useful. Your recursive loop is particularly atypical since it returns nothing, which implies that you are missing a major point about recursive functions. The functional bit. A recursive function is much more than another way of repeating the same operation n times.)
While loops do not return a value and require side effects to achieve anything. It is a control structure which only works at all for very simple tasks. This is because each iteration of the loop has to examine all of the state to decide what to next. The loops boolean expression may also have to be come very complex if there are multiple potential exit paths (or that complexity has to be distributed throughout the code in the loop, which can be ugly and obfuscatory).
Recursive functions offer the possibility of a much cleaner implementation. A good recursive solution breaks a complex problem down in to simpler parts, then delegates each part on to another function which can deal with it - the trick being that that other function is itself (or possibly a mutually recursive function, though that is rarely seen in Scala - unlike the various Lisp dialects, where it is common - because of the poor tail recursion support). The recursively called function receives in its parameters only the simpler subset of data and only the relevant state; it returns only the solution to the simpler problem. So, in contrast to the while loop,
Each iteration of the function only has to deal with a simple subset of the problem
Each iteration only cares about its inputs, not the overall state
Sucess in each subtask is clearly defined by the return value of the call that handled it.
State from different subtasks cannot become entangled (since it is hidden within each recursive function call).
Multiple exit points, if they exist, are much easier to represent clearly.
Given these advantages, recursion can make it easier to achieve an efficient solution. Especially if you count maintainability as an important factor in long-term efficiency.
I'm going to go find some good examples of code to add. Meanwhile, at this point I always recommend The Little Schemer. I would go on about why but this is the second Scala recursion question on this site in two days, so look at my previous answer instead.

Why do immutable objects enable functional programming?

I'm trying to learn scala and I'm unable to grasp this concept. Why does making an object immutable help prevent side-effects in functions. Can anyone explain like I'm five?

Interesting question, a bit difficult to answer.
Functional programming is very much about using mathematics to reason about programs. To do so, one needs a formalism that describe the programs and how one can make proofs about properties they might have.
There are many models of computation that provide such formalisms, such as lambda calculus and turing machines. And there's a certain degree of equivalency between them (see this question, for a discussion).
In a very real sense, programs with mutability and some other side effects have a direct mapping to functional program. Consider this example:
a = 0
b = 1
a = a + b
Here are two ways of mapping it to functional program. First one, a and b are part of a "state", and each line is a function from a state into a new state:
state1 = (a = 0, b = ?)
state2 = (a = state1.a, b = 1)
state3 = (a = state2.a + state2.b, b = state2.b)
Here's another, where each variable is associated with a time:
(a, t0) = 0
(b, t1) = 1
(a, t2) = (a, t0) + (b, t1)
So, given the above, why not use mutability?
Well, here's the interesting thing about math: the less powerful the formalism is, the easier it is to make proofs with it. Or, to put it in other words, it's too hard to reason about programs that have mutability.
As a consequence, there's very little advance regarding concepts in programming with mutability. The famous Design Patterns were not arrived at through study, nor do they have any mathematical backing. Instead, they are the result of years and years of trial and error, and some of them have since proved to be misguided. Who knows about the other dozens "design patterns" seen everywhere?
Meanwhile, Haskell programmers came up with Functors, Monads, Co-monads, Zippers, Applicatives, Lenses... dozens of concepts with mathematical backing and, most importantly, actual patterns of how code is composed to make up programs. Things you can use to reason about your program, increase reusability and improve correctness. Take a look at the Typeclassopedia for examples.
It's no wonder people not familiar with functional programming get a bit scared with this stuff... by comparison, the rest of the programming world is still working with a few decades-old concepts. The very idea of new concepts is alien.
Unfortunately, all these patterns, all these concepts, only apply with the code they are working with does not contain mutability (or other side effects). If it does, then their properties cease to be valid, and you can't rely on them. You are back to guessing, testing and debugging.

In short, if a function mutates an object then it has side effects. Mutation is a side effect. This is just true by definition.
In truth, in a purely functional language it should not matter if an object is technically mutable or immutable, because the language will never "try" to mutate an object anyway. A pure functional language doesn't give you any way to perform side effects.
Scala is not a pure functional language, though, and it runs in the Java environment in which side effects are very popular. In this environment, using objects that are incapable of mutation encourages you to use a pure functional style because it makes a side-effect oriented style impossible. You are using data types to enforce purity because the language does not do it for you.
Now I will say a bunch of other stuff in the hope that it helps this make sense to you.
Fundamental to the concept of a variable in functional languages is referential transparency.
Referential transparency means that there is no difference between a value, and a reference to that value. In a language where this is true, it makes it much simpler to think about a program works, since you never have to stop and ask, is this a value, or a reference to a value? Anyone who's ever programmed in C recognizes that a great part of the challenge of learning that paradigm is knowing which is which at all times.
In order to have referential transparency, the value that a reference refers to can never change.
(Warning, I'm about to make an analogy.)
Think of it this way: in your cell phone, you have saved some phone numbers of other people's cell phones. You assume that whenever you call that phone number, you will reach the person you intend to talk to. If someone else wants to talk to your friend, you give them the phone number and they reach that same person.
If someone changes their cell phone number, this system breaks down. Suddenly, you need to get their new phone number if you want to reach them. Maybe you call the same number six months later and reach a different person. Calling the same number and reaching a different person is what happens when functions perform side effects: you have what seems to be the same thing, but you try to use it, it turns out it's different now. Even if you expected this, what about all the people you gave that number to, are you going to call them all up and tell them that the old number doesn't reach the same person anymore?
You counted on the phone number corresponding to that person, but it didn't really. The phone number system lacks referential transparency: the number isn't really ALWAYS the same as the person.
Functional languages avoid this problem. You can give out your phone number and people will always be able to reach you, for the rest of your life, and will never reach anybody else at that number.
However, in the Java platform, things can change. What you thought was one thing, might turn into another thing a minute later. If this is the case, how can you stop it?
Scala uses the power of types to prevent this, by making classes that have referential transparency. So, even though the language as a whole isn't referentially transparent, your code will be referentially transparent as long as you use immutable types.
Practically speaking, the advantages of coding with immutable types are:
Your code is simpler to read when the reader doesn't have to look out for surprising side effects.
If you use multiple threads, you don't have to worry about locking because shared objects can never change. When you have side effects, you have to really think through the code and figure out all the places where two threads might try to change the same object at the same time, and protect against the problems that this might cause.
Theoretically, at least, the compiler can optimize some code better if it uses only immutable types. I don't know if Java can do this effectively, though, since it allows side effects. This is a toss-up at best, anyway, because there are some problems that can be solved much more efficiently by using side effects.

I'm running with this 5 year old explanation:
class Account(var myMoney:List[Int] = List(10, 10, 1, 1, 1, 5)) {
def getBalance = println(myMoney.sum + " dollars available")
def myMoneyWithInterest = {
myMoney = myMoney.map(_ * 2)
println(myMoney.sum + " dollars will accru in 1 year")
}
}
Assume we are at an ATM and it is using this code to give us account information.
You do the following:
scala> val myAccount = new Account()
myAccount: Account = Account#7f4a6c40
scala> myAccount.getBalance
28 dollars available
scala> myAccount.myMoneyWithInterest
56 dollars will accru in 1 year
scala> myAccount.getBalance
56 dollars available
We mutated the account balance when we wanted to check our current balance plus a years worth of interest. Now we have an incorrect account balance. Bad news for the bank!
If we were using val instead of var to keep track of myMoney in the class definition, we would not have been able to mutate the dollars and raise our balance.
When defining the class (in the REPL) with val:
error: reassignment to val
myMoney = myMoney.map(_ * 2
Scala is telling us that we wanted an immutable value but are trying to change it!
Thanks to Scala, we can switch to val, re-write our myMoneyWithInterest method and rest assured that our Account class will never alter the balance.

One important property of functional programming is: If I call the same function twice with the same arguments I'll get the same result. This makes reasoning about code much easier in many cases.
Now imagine a function returning the attribute content of some object. If that content can change the function might return different results on different calls with the same argument. => no more functional programming.

First a few definitions:
A side effect is a change in state -- also called a mutation.
An immutable object is an object which does not support mutation, (side effects).
A function which is passed mutable objects (either as parameters or in the global environment) may or may not produce side effects. This is up to the implementation.
However, it is impossible for a function which is passed only immutable objects (either as parameters or in the global environment) to produce side effects. Therefore, exclusive use of immutable objects will preclude the possibility of side effects.

Nate's answer is great, and here is some example.
In functional programming, there is an important feature that when you call a function with same argument, you always get same return value.
This is always true for immutable objects, because you can't modify them after create it:
class MyValue(val value: Int)
def plus(x: MyValue) = x.value + 10
val x = new MyValue(10)
val y = plus(x) // y is 20
val z = plus(x) // z is still 20, plus(x) will always yield 20
But if you have mutable objects, you can't guarantee that plus(x) will always return same value for same instance of MyValue.
class MyValue(var value: Int)
def plus(x: MyValue) = x.value + 10
val x = new MyValue(10)
val y = plus(x) // y is 20
x.value = 30
val z = plus(x) // z is 40, you can't for sure what value will plus(x) return because MyValue.value may be changed at any point.

Why do immutable objects enable functional programming?
They don't.
Take one definition of "function," or "prodecure," "routine" or "method," which I believe applies to many programming languages: "A section of code, typically named, accepting arguments and/or returning a value."
Take one definition of "functional programming:" "Programming using functions." The ability to program with functions is indepedent of whether state is modified.
For instance, Scheme is considered a functional programming language. It features tail calls, higher-order functions and aggregate operations using functions. It also has mutable objects. While mutability destroys some nice mathematical qualities, it does not necessarily prevent "functional programming."

I've read all the answers and they don't satisfy me, because they mostly talk about "immutability", and not about its relation to FP.
The main question is:
Why do immutable objects enable functional programming?
So I've searched a bit more and I have another answer, I believe the easy answer to this question is: "Because Functional Programming is basically defined on the basis of functions that are easy to reason about". Here's the definition of Functional Programming:
The process of building software by composing pure functions.
If a function is not pure -- which means receiving the same input, it's not guaranteed to always produce the same output (e.g., if the function relies on a global object, or date and time, or a random number to compute the output) -- then that function is unpredictable, that's it! Now exactly the same story goes about "immutability" as well, if objects are not immutable, a function with the same object as its input may have different results (aka side effects) each time used, and this will make it hard to reason about the program.
I first tried to put this in a comment, but it got longer than the limit, I'm by no means a pro so please take this answer with a grain of salt.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse