May a while loop be used with yield in scala - scala

Here is the standard format for a for/yield in scala: notice it expects a collection - whose elements drive the iteration.
for (blah <- blahs) yield someThingDependentOnBlah
I have a situation where an indeterminate number of iterations will occur in a loop. The inner loop logic determines how many will be executed.
while (condition) { some logic that affects the triggering condition } yield blah
Each iteration will generate one element of a sequence - just like a yield is programmed to do. What is a recommended way to do this?

You can
Iterator.continually{ some logic; blah }.takeWhile(condition)
to get pretty much the same thing. You'll need to use something mutable (e.g. a var) for the logic to impact the condition. Otherwise you can
Iterator.iterate((blah, whatever)){ case (_,w) => (blah, some logic on w) }.
takeWhile(condition on _._2).
map(_._1)

Using for comprehensions is the wrong thing for that. What you describe is generally done by unfold, though that method is not present in Scala's standard library. You can find it in Scalaz, though.

Another way similar to suggestion by #rexkerr:
blahs.toIterator.map{ do something }.takeWhile(condition)
This feels a bit more natural than the Iterator.continually

Related

Is there any advantage to avoiding while loops in Scala?

Reading Scala docs written by the experts one can get the impression that tail recursion is better than a while loop, even when the latter is more concise and clearer. This is one example
object Helpers {
implicit class IntWithTimes(val pip:Int) {
// Recursive
def times(f: => Unit):Unit = {
#tailrec
def loop(counter:Int):Unit = {
if (counter >0) { f; loop(counter-1) }
}
loop(pip)
}
// Explicit loop
def :#(f: => Unit) = {
var lc = pip
while (lc > 0) { f; lc -= 1 }
}
}
}
(To be clear, the expert was not addressing looping at all, but in the example they chose to write a loop in this fashion as if by instinct, which is what the raised the question for me: should I develop a similar instinct..)
The only aspect of the while loop that could be better is the iteration variable should be local to the body of the loop, and the mutation of the variable should be in a fixed place, but Scala chooses not to provide that syntax.
Clarity is subjective, but the question is does the (tail) recursive style offer improved performance?
I'm pretty sure that, due to the limitations of the JVM, not every potentially tail-recursive function will be optimised away by the Scala compiler as so, so the short (and sometimes wrong) answer to your question on performance is no.
The long answer to your more general question (having an advantage) is a little more contrived. Note that, by using while, you are in fact:
creating a new variable that holds a counter.
mutating that variable.
Off-by-one errors and the perils of mutability will ensure that, on the long run, you'll introduce bugs with a while pattern. In fact, your times function could easily be implemented as:
def times(f: => Unit) = (1 to pip) foreach f
Which not only is simpler and smaller, but also avoids any creation of transient variables and mutability. In fact, if the type of the function you are calling would be something to which the results matter, then the while construction would start to be even more difficult to read. Please attempt to implement the following using nothing but whiles:
def replicate(l: List[Int])(times: Int) = l.flatMap(x => List.fill(times)(x))
Then proceed to define a tail-recursive function that does the same.
UPDATE:
I hear you saying: "hey! that's cheating! foreach is neither a while nor a tail-rec call". Oh really? Take a look into Scala's definition of foreach for Lists:
def foreach[B](f: A => B) {
var these = this
while (!these.isEmpty) {
f(these.head)
these = these.tail
}
}
If you want to learn more about recursion in Scala, take a look at this blog post. Once you are into functional programming, go crazy and read Rúnar's blog post. Even more info here and here.
In general, a directly tail recursive function (i.e., one that always calls itself directly and cannot be overridden) will always be optimized into a while loop by the compiler. You can use the #tailrec annotation to verify that the compiler is able to do this for a particular function.
As a general rule, any tail recursive function can be rewritten (usually automatically by the compiler) as a while loop and vice versa.
The purpose of writing functions in a (tail) recursive style is not to maximize performance or even conciseness, but to make the intent of the code as clear as possible, while simultaneously minimizing the chance of introducing bugs (by eliminating mutable variables, which generally make it harder to keep track of what the "inputs" and "outputs" of the function are). A properly written recursive function consists of a series of checks for terminating conditions (using either cascading if-else or a pattern match) with the recursive call(s) (plural only if not tail recursive) made if none of the terminating conditions are met.
The benefit of using recursion is most dramatic when there are several different possible terminating conditions. A series of if conditionals or patterns is generally much easier to comprehend than a single while condition with a whole bunch of (potentially complex and inter-related) boolean expressions &&'d together, especially if the return value needs to be different depending on which terminating condition is met.
Did these experts say that performance was the reason? I'm betting their reasons are more to do with expressive code and functional programming. Could you cite examples of their arguments?
One interesting reason why recursive solutions can be more efficient than more imperative alternatives is that they very often operate on lists and in a way that uses only head and tail operations. These operations are actually faster than random-access operations on more complex collections.
Anther reason that while-based solutions may be less efficient is that they can become very ugly as the complexity of the problem increases...
(I have to say, at this point, that your example is not a good one, since neither of your loops do anything useful. Your recursive loop is particularly atypical since it returns nothing, which implies that you are missing a major point about recursive functions. The functional bit. A recursive function is much more than another way of repeating the same operation n times.)
While loops do not return a value and require side effects to achieve anything. It is a control structure which only works at all for very simple tasks. This is because each iteration of the loop has to examine all of the state to decide what to next. The loops boolean expression may also have to be come very complex if there are multiple potential exit paths (or that complexity has to be distributed throughout the code in the loop, which can be ugly and obfuscatory).
Recursive functions offer the possibility of a much cleaner implementation. A good recursive solution breaks a complex problem down in to simpler parts, then delegates each part on to another function which can deal with it - the trick being that that other function is itself (or possibly a mutually recursive function, though that is rarely seen in Scala - unlike the various Lisp dialects, where it is common - because of the poor tail recursion support). The recursively called function receives in its parameters only the simpler subset of data and only the relevant state; it returns only the solution to the simpler problem. So, in contrast to the while loop,
Each iteration of the function only has to deal with a simple subset of the problem
Each iteration only cares about its inputs, not the overall state
Sucess in each subtask is clearly defined by the return value of the call that handled it.
State from different subtasks cannot become entangled (since it is hidden within each recursive function call).
Multiple exit points, if they exist, are much easier to represent clearly.
Given these advantages, recursion can make it easier to achieve an efficient solution. Especially if you count maintainability as an important factor in long-term efficiency.
I'm going to go find some good examples of code to add. Meanwhile, at this point I always recommend The Little Schemer. I would go on about why but this is the second Scala recursion question on this site in two days, so look at my previous answer instead.

Is recursion in scala very necessary?

In the coursera scala tutorial, most examples are using top-down iterations. Partially, as I can see, iterations are used to avoid for/while loops. I'm from C++ and feel a little confused about this.
Is iteration chosen over for/while loops? Is it practical in production? Any risk of stackoverflow? How about efficiency? How about bottom up Dynamic Programming (especially when they are not tail-recusions)?
Also, should I use less "if" conditions, instead use more "case" and subclasses?
Truly high-quality Scala will use very little iteration and only slightly more recursion. What would be done with looping in lower-level imperative languages is usually best done with higher-order combinators, map and flatmap most especially, but also filter, zip, fold, foreach, reduce, collect, partition, scan, groupBy, and a good few others. Iteration is best done only in performance critical sections, and recursion done only in a some deep edge cases where the higher-order combinators don't quite fit (which usually aren't tail recursive, fwiw). In three years of coding Scala in production systems, I used iteration once, recursion twice, and map about five times per day.
Hmm, several questions in one.
Necessity of Recursion
Recursion is not necessary, but it can sometimes provide a very elegant solution.
If the solution is tail recursive and the compiler supports tail call optimisation, then the solution can even be efficient.
As has been well said already, Scala has many combinator functions which can be used to perform the same tasks more expressively and efficiently.
One classic example is writing a function to return the nth Fibonacci number. Here's a naive recursive implementation:
def fib (n: Long): Long = n match {
case 0 | 1 => n
case _ => fib( n - 2) + fib( n - 1 )
}
Now, this is inefficient (definitely not tail recursive) but it is very obvious how its structure relates to the Fibonacci sequence. We can make it properly tail recursive, though:
def fib (n: Long): Long = {
def fibloop(current: Long, next: => Long, iteration: Long): Long = {
if (n == iteration)
current
else
fibloop(next, current + next, iteration + 1)
}
fibloop(0, 1, 0)
}
That could have been written more tersely, but it is an efficient recursive implementation. That said, it is not as pretty as the first and it's structure is less clearly related to the original problem.
Finally, stolen shamelessly from elsewhere on this site is Luigi Plinge's streams-based implementation:
val fibs: Stream[Int] = 0 #:: fibs.scanLeft(1)(_ + _)
Very terse, efficient, elegant and (if you understand streams and lazy evaluation) very expressive. It is also, in fact, recursive; #:: is a recursive function, but one that operates in a lazily-evaluated context. You certainly have to be able to think recursively to come up with this kind of solution.
Iteration compared to For/While loops
I'm assuming you mean the traditional C-Style for, here.
Recursive solutions can often be preferable to while loops because C/C++/Java-style while loops do not return a value and require side effects to achieve anything (this is also true for C-Style for and Java-style foreach). Frankly, I often wish Scala had never implemented while (or had implemented it as syntactic sugar for something like Scheme's named let), because it allows classically-trained Java developers to keep doing things the way they always did. There are situations where a loop with side effects, which is what while gives you, is a more expressive way of achieving something but I had rather Java-fixated devs were forced to reach a little harder for it (e.g. by abusing a for comprehension).
Simply, traditional while and for make clunky imperative coding much too easy. If you don't care about that, why are you using Scala?
Efficiency and risk of Stackoverflow
Tail optimisation eliminates the risk of stackoverflow. Rewriting recursive solutions to be properly tail recursive can make them very ugly (particularly in any language running on the JVM).
Recursive solutions can be more efficient than more imperative solutions, sometimes suprisingly so. One reason is that they often operate on lists, in a way that only involves head and tail access. Head and tail operations on lists are actually faster than random access operations on more structured collections.
Dynamic Programming
A good recursive algorithm typically reduces a complex problem to a small set of simpler problems, picks one to solve and delegates the rest to another function (usually a recursive call to itself). Now, to me this sounds like a great fit for dynamic programming. Certainly, if I am trying a recursive approach to a problem, I often start with a naive solution which I know can't solve every case, see where it fails, add that pattern to the solution and iterate towards success.
The Little Schemer has many examples of this iterative approach to recursive programming, particularly because it re-uses earlier solutions as sub-components for later, more complex ones. I would say it is the epitome of the Dynamic Programming approach. (It is also one of the best-written educational books about software ever produced). I can recommend it, not least because it teaches you Scheme at the same time. If you really don't want to learn Scheme (why? why would you not?), it has been adapted for a few other languages
If versus Match
if expressions, in Scala, return values (which is very useful and why Scala has no need for a ternary operator). There is no reason to avoid simple
if (something)
// do something
else
// do something else
expressions. The principle reason to match instead of a simple if...else is to use the power of case statements to extract information from complex objects. Here is one example.
On the other hand, if...else if...else if...else is a terrible pattern
There's no easy way to see if you covered all the possibilities properly, even with a final else in place.
Unintentionally nested if expressions are hard to spot
It is too easy to link unrelated conditions together (accidentally or through bone-headed design)
Wherever you find you have written else if, look for an alternative. match is a good place to start.
I'm assuming that, since you say "recursion" in your title, you also mean "recursion" in your question, and not "iteration" (which cannot be chosen "over for/while loops", because those are iterative :D).
You might be interested in reading Effective Scala, especially the section on control structures, which should mostly answer your question. In short:
Recursion isn't "better" than iteration. Often it is easier to write a recursive algorithm for a given problem, then it is to write an iterative algorithm (of course there are cases where the opposite applies). When "tail call optimization" can be applied to a problem, the compiler actually converts it to an iterative algorithm, thus making it impossible for a StackOverflow to happen, and without performance impact. You can read about tail call optimization in Effective Scala, too.
The main problem with your question is that it is very broad. There are many many resources available on functional programming, idiomatic scala, dynamic programming and so on, and no answer here on Stack Overflow would be able to cover all those topics. It'd be probably a good idea to just roam the interwebs for a while, and then come back with more concrete questions :)
One of the main benefits of recursion is that it lets you create solutions without mutation. for following example, you have to calculate the sum of all the elements of a List.
One of the many ways to solve this problem is as below. The imperative solution to this problem uses for loop as shown:
scala> var total = 0
scala> for(f <- List(1,2,3)) { total += f }
And recursion solution would look like following:
def total(xs: List[Int]): Int = xs match {
case Nil => 0
case x :: ys => x + total(ys)
}
The difference is that a recursive solution doesn’t use any mutable temporary variables by letting you break the problem into smaller pieces. Because Functional programming is all about writing side effect free programs it's always encourage to use recursion vs loops (that use mutating variables).
Head recursion is a traditional way of doing recursion, where you perform the recursive call first and then take the return value from the recursive function and calculate the result.
Generally when you call a function an entry is added to the call stack of a currently running thread. The downside is that the call stack has a defined size so quickly you may get StackOverflowError exception. This is why Java prefers to iterate rather than recurse. Because Scala runs on the JVM, Scala also suffers from this problem. But starting with Scala 2.8.1, Scala gets away this limitation by doing tail call optimization. you can do tail recursion in Scala.
To recap recursion is preferred way in functional programming to avoid using mutation and secondly tail recursion is supported in Scala so you don't get into StackOverFlow exceptions which you get in Java.
Hope this helps.
As for stack overflow, a lot of the time you can get away with it because of tail call elimination.
The reason scala and other function paradigms avoid for/while loops they are highly dependent on state and time. That makes it much harder to reason about complex "loops" in a formal and precise manor.

if (Option.nonEmpty) vs Option.foreach

I want to perform some logic if the value of an option is set.
Coming from a java background, I used:
if (opt.nonEmpty) {
//something
}
Going a little further into scala, I can write that as:
opt.foreach(o => {
//something
})
Which one is better? The "foreach" one sounds more "idiomatic" and less Java, but it is less readable - "foreach" applied to a single value sounds weird.
Your example is not complete and you don't use minimal syntax. Just compare these two versions:
if (opt.nonEmpty) {
val o = opt.get
// ...
}
// vs
opt foreach {
o => // ...
}
and
if (opt.nonEmpty)
doSomething(opt.get)
// vs
opt foreach doSomething
In both versions there is more syntactic overhead in the if solution, but I agree that foreach on an Option can be confusing if you think of it only as an optional value.
Instead foreach describes that you want to do some sort of side effects, which makes a lot of sense if you think of Option being a monad and foreach just a method to transform it. Using foreach has furthermore the great advantage that it makes refactorings easier - you can just change its type to a List or any other monad and you will not get any compiler errors (because of Scalas great collections library you are not constrained to use only operations that work on monads, there are a lot of methods defined on a lot of types).
foreach does make sense, if you think of Option as being like a List, but with a maximum of one element.
A neater style, IMO, is to use a for-comprehension:
for (o <- opt) {
something(o)
}
foreach makes sense if you consider Option to be a list that can contain at most a single value. This also leads to a correct intuition about many other methods that are available to Option.
I can think of at least one important reason you might want to prefer foreach in this case: it removes possible run-time errors. With the nonEmpty approach, you'll at one point have to do a get*, which can crash your program spectacularly if you by accident forget to check for emptiness one time.
If you completely erase get from your mind to avoid bugs of that kind, a side effect is that you also have less use for nonEmpty! And you'll start to enjoy foreach and let the compiler take care of what should happen if the Option happens to be empty.
You'll see this concept cropping up in other places. You would never do
if (age.nonEmpty)
Some(age.get >= 18)
else
None
What you'll rather see is
age.map(_ >= 18)
The principle is that you want to avoid having to write code that handles the failure case – you want to offload that burden to the compiler. Many will tell you that you should never use get, however careful you think you are about pre-checking. So that makes things easier.
* Unless you actually just want to know whether or not the Option contains a value and don't really care for the value, in which case nonEmpty is just fine. In that case it serves as a sort of toBoolean.
Things I didn't find in the other answers:
When using if, I prefer if (opt isDefined) to if (opt nonEmpty) as the former is less collection-like and makes it more clear we're dealing with an option, but that may be a matter of taste.
if and foreach are different in the sense that if is an expression that will return a value while foreach returns Unit. So in a certain way using foreach is even more java-like than if, as java has a foreach loop and java's if is not an expression. Using a for comprehension is more scala-like.
You can also use pattern matching (but this is also also less idiomatic)
You can use the fold method, which takes two functions so you can evaluate one expression in the Some case and another in the None case. You may need to explicitly specify the expected type because of how type inference works (as shown here). So in my opinion it may sometimes still be clearer to use either pattern matching or val result = if (opt isDefined) expression1 else expression2.
If you don't need a return value and really have no need to handle the not-defined case, you can use foreach.

Either monadic operations

I just start to be used to deal with monadic operations.
For the Option type, this Cheat Sheet of Tony Morris helped:
http://blog.tmorris.net/posts/scalaoption-cheat-sheet/
So in the end it seems easy to understand that:
map transforms the value of inside an option
flatten permits to transform Option[Option[X]] in Option[X]
flatMap is somehow a map operation producing an Option[Option[X]] and then flattened to Option[X]
At least it is what I understand until now.
For Either, it seems a bit more difficult to understand since Either itself is not right biaised, does not have map / flatMap operations... and we use projection.
I can read the Scaladoc but it is not as clear as the Cheat Sheet on Options.
Can someone provide an Either Sheet Cheat to describe the basic monadic operations?
It seems to me that Either.joinRight is a bit like RightProjection.flatMap and seems to be the equivalent of Option.flatten for Either.
It seems to me that if Either was Right biaised, then Either.flatten would be Either.joinRight no?
In this question: Either, Options and for comprehensions I ask about for comprehension with Eiher, and one of the answers says that we can't mix monads because of the way it is desugared into map/flatMap/filter.
When using this kind of code:
def updateUserStats(user: User): Either[Error,User] = for {
stampleCount <- stampleRepository.getStampleCount(user).right
userUpdated <- Right(copyUserWithStats(user,stampleCount)).right
userSaved <- userService.update(userUpdated).right
} yield userSaved
Does this mean that all my 3 method calls must always return Either[Error,Something]?
I mean if I have a method call Either[Throwable,Something] it won't work right?
Edit:
Is Try[Something] exactly the same as a right-biaised Either[Throwable,Something]?
Either was never really meant to be an exception handling based structure. It was meant to represent a situation where a function really could possible return one of two distinct types, but people started the convention where the left type is a supposed to be a failed case and the right is success. If you want to return a biased type for some pass/fail type business checks logic, then Validation from scalaz works well. If you have a function that could return a value or a Throwable, then Try would be a good choice. Either should be used for situations where you really might get one of two possible types, and now that I am using Try and Validation (each for different types of situations), I never use Either any more.

why use foldLeft instead of procedural version?

So in reading this question it was pointed out that instead of the procedural code:
def expand(exp: String, replacements: Traversable[(String, String)]): String = {
var result = exp
for ((oldS, newS) <- replacements)
result = result.replace(oldS, newS)
result
}
You could write the following functional code:
def expand(exp: String, replacements: Traversable[(String, String)]): String = {
replacements.foldLeft(exp){
case (result, (oldS, newS)) => result.replace(oldS, newS)
}
}
I would almost certainly write the first version because coders familiar with either procedural or functional styles can easily read and understand it, while only coders familiar with functional style can easily read and understand the second version.
But setting readability aside for the moment, is there something that makes foldLeft a better choice than the procedural version? I might have thought it would be more efficient, but it turns out that the implementation of foldLeft is actually just the procedural code above. So is it just a style choice, or is there a good reason to use one version or the other?
Edit: Just to be clear, I'm not asking about other functions, just foldLeft. I'm perfectly happy with the use of foreach, map, filter, etc. which all map nicely onto for-comprehensions.
Answer: There are really two good answers here (provided by delnan and Dave Griffith) even though I could only accept one:
Use foldLeft because there are additional optimizations, e.g. using a while loop which will be faster than a for loop.
Use fold if it ever gets added to regular collections, because that will make the transition to parallel collections trivial.
It's shorter and clearer - yes, you need to know what a fold is to understand it, but when you're programming in a language that's 50% functional, you should know these basic building blocks anyway. A fold is exactly what the procedural code does (repeatedly applying an operation), but it's given a name and generalized. And while it's only a small wheel you're reinventing, but it's still a wheel reinvention.
And in case the implementation of foldLeft should ever get some special perk - say, extra optimizations - you get that for free, without updating countless methods.
Other than a distaste for mutable variable (even mutable locals), the basic reason to use fold in this case is clarity, with occasional brevity. Most of the wordiness of the fold version is because you have to use an explicit function definition with a destructuring bind. If each element in the list is used precisely once in the fold operation (a common case), this can be simplified to use the short form. Thus the classic definition of the sum of a collection of numbers
collection.foldLeft(0)(_+_)
is much simpler and shorter than any equivalent imperative construct.
One additional meta-reason to use functional collection operations, although not directly applicable in this case, is to enable a move to using parallel collection operations if needed for performance. Fold can't be parallelized, but often fold operations can be turned into commutative-associative reduce operations, and those can be parallelized. With Scala 2.9, changing something from non-parallel functional to parallel functional utilizing multiple processing cores can sometimes be as easy as dropping a .par onto the collection you want to execute parallel operations on.
One word I haven't seen mentioned here yet is declarative:
Declarative programming is often defined as any style of programming that is not imperative. A number of other common definitions exist that attempt to give the term a definition other than simply contrasting it with imperative programming. For example:
A program that describes what computation should be performed and not how to compute it
Any programming language that lacks side effects (or more specifically, is referentially transparent)
A language with a clear correspondence to mathematical logic.
These definitions overlap substantially.
Higher-order functions (HOFs) are a key enabler of declarativity, since we only specify the what (e.g. "using this collection of values, multiply each value by 2, sum the result") and not the how (e.g. initialize an accumulator, iterate with a for loop, extract values from the collection, add to the accumulator...).
Compare the following:
// Sugar-free Scala (Still better than Java<5)
def sumDoubled1(xs: List[Int]) = {
var sum = 0 // Initialized correctly?
for (i <- 0 until xs.size) { // Fenceposts?
sum = sum + (xs(i) * 2) // Correct value being extracted?
// Value extraction and +/* smashed together
}
sum // Correct value returned?
}
// Iteration sugar (similar to Java 5)
def sumDoubled2(xs: List[Int]) = {
var sum = 0
for (x <- xs) // We don't need to worry about fenceposts or
sum = sum + (x * 2) // value extraction anymore; that's progress
sum
}
// Verbose Scala
def sumDoubled3(xs: List[Int]) = xs.map((x: Int) => x*2). // the doubling
reduceLeft((x: Int, y: Int) => x+y) // the addition
// Idiomatic Scala
def sumDoubled4(xs: List[Int]) = xs.map(_*2).reduceLeft(_+_)
// ^ the doubling ^
// \ the addition
Note that our first example, sumDoubled1, is already more declarative than (most would say superior to) C/C++/Java<5 for loops, because we haven't had to micromanage the iteration state and termination logic, but we're still vulnerable to off-by-one errors.
Next, in sumDoubled2, we're basically at the level of Java>=5. There are still a couple things that can go wrong, but we're getting pretty good at reading this code-shape, so errors are quite unlikely. However, don't forget that a pattern that's trivial in a toy example isn't always so readable when scaled up to production code!
With sumDoubled3, desugared for didactic purposes, and sumDoubled4, the idiomatic Scala version, the iteration, initialization, value extraction and choice of return value are all gone.
Sure, it takes time to learn to read the functional versions, but we've drastically foreclosed our options for making mistakes. The "business logic" is clearly marked, and the plumbing is chosen from the same menu that everyone else is reading from.
It is worth pointing out that there is another way of calling foldLeft which takes advantages of:
The ability to use (almost) any Unicode symbol in an identifier
The feature that if a method name ends with a colon :, and is called infix, then the target and parameter are switched
For me this version is much clearer, because I can see that I am folding the expr value into the replacements collection
def expand(expr: String, replacements: Traversable[(String, String)]): String = {
(expr /: replacements) { case (r, (o, n)) => r.replace(o, n) }
}