This is what I'm doing now:
private var accounts = Vector.empty[Account]
def removeAccount(account: Account): Unit = {
  accounts = accounts.filterNot(_ == account)
}
Is there a more readable solution? Ideally, I'd like to write accounts = accounts.remove(account).
I'd use this:
accounts filterNot account.==
Which reads pretty well to me, but ymmv. I'd also like a count that doesn't take a predicate, but the collection library is really lacking in specialized methods where one with a predicate can generalize the operation.
Until 2.8.x, there was a - method, which got deprecated, iirc, because of semantic issues. It could actually have come back in 2.10 if memory serves, but it didn't. Edit: I checked, and - is now reserved for a mutable method that modifies the collection it is applied to. I'd be all in favor of -: / :- on sequences, though, where it makes sense to drop the first or last element equal to something. Anyone willing to front a ticket for that? I'd upvote it. :-)
There unfortunately is not, and worse still (perhaps), if the same account is present twice, filterNot will remove both of them. The only thing I can offer for readability is to use
accounts.filter(_ != account)
Another possibility is to use a collection type that does have a remove operation, such as a TreeSet (where it is called -). If you don't have duplicate entries anyway, a Set is perfectly okay. (It is slower for some operations, of course, but it probably is a better fit to the application--it's more efficient at removing individual entries; with a filterNot you basically have to build the entire Vector again.)
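For illustration, a minimal sketch with Ints standing in for accounts:
import scala.collection.immutable.TreeSet

val s = TreeSet(1, 2, 3)
s - 2                              // TreeSet(1, 3): removal in O(log n)
Vector(1, 2, 3).filterNot(_ == 2)  // Vector(1, 3): rebuilds the whole vector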
You could do something like this:
def removeFirst[T](xs: Vector[T], x: T) = {
  val i = xs.indexOf(x)
  if (i == -1) xs else xs.patch(i, Nil, 1)
}
then
accounts = removeFirst(accounts, account)
I think the nub of the problem, though, is that a Vector probably isn't the right collection type for a set of items where you want to pull things out (hint: try Set). If you want to index on an ID or an insertion index then Map could be what you're after (which does have a - method). If you want to index on multiple things efficiently, you need a database!
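For instance, if accounts were keyed by ID (illustrative types, not from the question):
val byId = Map(1 -> "alice", 2 -> "bob")
byId - 1  // Map(2 -> "bob"): removes the entry with key 1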
You can use the diff method which is defined for all sequences. It computes the multiset difference between two sequences - meaning it will remove as many occurrences of an element as you need.
Vector(1, 2, 1, 3, 2).diff(Seq(1)) =>
Vector(2, 1, 3, 2)
Vector(1, 2, 1, 3, 2).diff(Seq(1, 1)) =>
Vector(2, 3, 2)
Vector(1, 2, 1, 3, 2).diff(Seq(1, 1, 2)) =>
Vector(3, 2)
If you prefer not to use the filterNot closure, you could use the more verbose yet more explicit for-comprehension style instead.
private var accounts = Vector.empty[Account]
def removeAccount(account: Account): Unit = {
  accounts = for {
    a <- accounts
    if a != account
  } yield a
}
It's a matter of personal preference whether this is felt to be better in this case.
Certainly for more complex expressions involving nested flatMaps etc, I agree with Martin Odersky's advice that for-comprehensions can be quite a bit easier to read, especially for novices.
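For example, here is a small sketch of the equivalence (illustrative values):
val xs = List(1, 2); val ys = List(10, 20); val zs = List(100, 200)

// Nested flatMap/map calls:
xs.flatMap(x => ys.flatMap(y => zs.map(z => x + y + z)))

// The equivalent for-comprehension reads linearly, top to bottom:
for { x <- xs; y <- ys; z <- zs } yield x + y + z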
Coming from a Java background I am learning Scala, and the following has me very confused. Why is the type returned different in these two (very similar but different) constructs, which vary only in how the source collection was built?
val seq1: IndexedSeq[Int] = for (i <- 1 to 10) yield i
vs.
val seq2: Array[Int] = for (i <- Array(1, 2, 3)) yield i
Please point me to the right literature so that I can understand the core fundamentals at play here.
There are, in general, two different styles of collection operation libraries:
type-preserving: that is what you are confused about in your question
generic (not in the "parametric polymorphism" sense but in the standard English sense of the word), or maybe "homogeneous"
Type-preserving collection operations try to preserve the type exactly for operations like filter, take, drop, etc. that only take existing elements unmodified. For operations like map, it tries to find the closest super type that can still hold the result. E.g. mapping over an IntSet with a function from Int to String can obviously not result in an IntSet, but only in a Set. Mapping an IntSet to Boolean could be represented in a BitSet, but I know of no collections framework that is clever enough to actually do that.
Generic / homogeneous collection operations always return the same type. Usually, this type is chosen to be very general, to accommodate the widest range of use cases. For example, In .NET, collection operations return IEnumerable, in Java, they return Streams, in C++, they return iterators, in Ruby, they return arrays.
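Scala sits firmly in the type-preserving camp. A quick REPL-style sketch:
List(1, 2, 3).filter(_ > 1)    // List(2, 3): still a List
Vector(1, 2, 3).filter(_ > 1)  // Vector(2, 3): still a Vector
Set(1, 2, 3).map(_ + 1)        // Set(2, 3, 4): still a Set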
Until recently, it was only possible to implement type-preserving collection operations by duplicating all operations for all types. For example, the Smalltalk collections framework is type-preserving, and it does this by having every single collections class re-implement every single collections operation. This results in a lot of duplicated code and is a maintenance nightmare. (It is no coincidence that many new object-oriented abstractions that get invented have their first paper written about how it can be applied to the Smalltalk collections framework. See Traits: Composable Units of Behaviour for an example.)
To my knowledge, the Scala 2.8 re-design of the collections framework (see also this answer on SO) was the first time someone managed to create type-preserving collections operations while minimizing (though not eliminating) duplication. However, the Scala 2.8 collections framework was widely criticized as being overly complex, and it has required constant work over the last decade. In fact, it actually led to a complete re-design of the Scala documentation system as well, just to be able to hide the very complex type signatures that the type-preserving operations require. But this still wasn't enough, so the collections framework was completely thrown out and re-designed yet again in Scala 2.13. (And this re-design took several years.)
So, the simple answer to your question is this: Scala tries as much as possible to preserve the type of the collection.
In your second case, the type of the collection is Array, and when you map over an Array, you get back an Array.
In your first case, the type of the collection is Range. Now, a Range doesn't actually have elements, though. It only has a beginning and an end and a step, and it produces the elements on demand while you are iterating over it. So, it is not that easy to produce a new Range with the new elements. The map function would basically need to be able to "reverse engineer" your mapping function to figure out what the new beginning and end and step should be. (Which is equivalent to solving the Halting Problem, or in other words impossible.) And what if you do something like this:
val seq1: IndexedSeq[Int] = for (i <- 1 to 10) yield scala.util.Random.nextInt(i)
Here, there isn't even a well-defined step, so it is actually impossible to build a Range that does this.
So, clearly, mapping over a Range cannot return a Range. So, it does the next best thing: It returns the most precise super type of Range that can contain the mapped values. In this case, that happens to be IndexedSeq.
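You can check this at the REPL (output shown for Scala 2.13):
val r = 1 to 10  // r: Range.Inclusive
r.map(_ * 2)     // IndexedSeq[Int] = Vector(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)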
There is a wrinkle, in that type-preserving collections operations challenge what we consider to be part of the contract of certain operations. For example, most people would argue that the cardinality of a collection should be invariant under map, in other words, map should map each element to exactly one new element and thus map should never change the size of the collection. But, what about this code:
Set(1, 2, 3).map { _ % 2 == 0 }
//=> Set(true, false)
Here, you get back a collection with fewer elements from a map, which is only supposed to transform elements, not remove them. But, since we decided we want type-preserving collections, and a Set cannot have duplicate values, the two false values are actually the same value, so there is only one of them in the set.
[It could be argued that this actually only demonstrates that Sets aren't collections and shouldn't be treated as collections. Sets are predicates ("Is this element a member?") rather than collections ("Give me all your elements!")]
This happens because the construction:
for (x <-someSeq) yield x
is the same as:
someSeq.map(x => x)
for () yield is just syntactic sugar for the flatMap/map functions.
As we know, map doesn't change the type of the container; it changes the elements inside it.
So, in your example, 1 to 10 has the type Range.Inclusive, which extends Range, and Range extends IndexedSeq. A mapped IndexedSeq is still an IndexedSeq:
for (i <- 1 to 10) yield i is the same as (1 to 10).map(x => x)
In the second case, for (i <- Array(1, 2, 3)) yield i, you have an Array, and a mapped Array is also an Array:
for (i <- Array(1, 2, 3)) yield i is the same as Array(1, 2, 3).map(x => x)
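The same desugaring applies when a guard is involved (a small sketch):
for (i <- 1 to 10 if i % 2 == 0) yield i * i
// desugars (roughly) to:
(1 to 10).withFilter(_ % 2 == 0).map(i => i * i)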
I think the right literature would be Scala for the Impatient by Cay Horstmann. The first edition is a little outdated but it has held up pretty well. The book is a fairly easy read almost to the end (I admit I don't quite understand lexical parsers or actors, but that's probably on me, not Horstmann).
One of the things that Horstmann's book explains early on is that although you can use for like in Java, it can actually do much more sophisticated things. As a toy example, consider this Java procedure:
public static void main(String[] args) {
    HashSet<Integer> squaresMod10 = new HashSet<>();
    for (int i = 1; i < 11; i++) {
        squaresMod10.add(i * i % 10);
    }
    for (Integer j : squaresMod10) {
        System.out.print(j + " ");
    }
}
If you're using Java 8 or later, your IDE's probably advising you that you "can use functional operators." NetBeans rewrote it thus for me:
squaresMod10.forEach((j) -> {
    System.out.print(j + " ");
});
In Scala, you can use "functional operations" for both the i and j loops in this example. Rather than fire up IntelliJ just for this, I'll use the local Scala REPL on my system.
scala> (1 to 10).map(i => i * i % 10)
res2: IndexedSeq[Int] = Vector(1, 4, 9, 6, 5, 6, 9, 4, 1, 0)
scala> (1 to 10).map(i => i * i % 10).toSet
res3: scala.collection.immutable.Set[Int] = HashSet(0, 5, 1, 6, 9, 4)
scala> for (j <- res3) System.out.print(j + " ")
^
warning: method + in class Int is deprecated (since 2.13.0):
Adding a number and a String is deprecated. Use the string interpolation `s"$num$str"`
0 5 1 6 9 4
In Java, what would you be going for with seq1 and seq2? With a standard for loop in Java, you're essentially guiding the computer through the process of looking at each element of an ad hoc collection one by one and performing a certain operation on it.
Lambdas like the one NetBeans wrote for me still become If and Goto constructs at the JVM level, just like regular for loops in Java, as do a lot of what Horstmann calls "for comprehensions" in Scala. Either way, you're delegating the nuts and bolts of how exactly that happens to the compiler. In other words, you're not micro-managing.
However, as your seq2 example shows, it's still possible to wrap collections unnecessarily.
scala> 1 to 10
res5: scala.collection.immutable.Range.Inclusive = Range 1 to 10
scala> res5 == seq1
res6: Boolean = true
scala> Array(1, 2, 3)
res7: Array[Int] = Array(1, 2, 3)
scala> res7 == seq2
res8: Boolean = false
Okay, that didn't quite go the way I thought it would, but my point stands: in this vacuum, it's unnecessary to wrap 1 to 10 into another IndexedSeq[Int] and it's unnecessary to wrap an Array[Int] into another Array[Int]. Your for "statements" are merely wrapping unnamed collections into named collections.
In some cases, applying view to a collection before doing map/filter/... can decrease performance. However, those situations are (afaik) quite seldom; for example, when there is a single operation.
But most of the time, appending .view can give a performance boost by preventing the creation of intermediate collections.
So why is view not applied to collections by default? Am I missing some hidden costs of it?
Short answer: Views are not always beneficial, and strict behavior by default can prevent surprises.
Consider this snippet from the early collections documentation:
... you might wonder why have strict collections at all? One reason is that performance comparisons do not always favor lazy over strict collections. For smaller collection sizes the added overhead of forming and applying closures in views is often greater than the gain from avoiding the intermediary data structures.
Consider:
List(1, 2).map(_ + 1)
// vs
List(1, 2).view.map(_ + 1)
The view uses just a little more overhead, and with some crude measurements is about 4% slower (a small, but noticeable difference).
The documentation continues to say:
A probably more important reason is that evaluation in views can be very confusing if the delayed operations have side effects.
Consider:
for(i <- (1 to 10)) println(i)
If (1 to 10) were actually a view (it is not), nothing would happen until it was forced to do so. Code that looks straightforward might not be.
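A small sketch of how a real view defers work (Scala 2.13 views behave this way):
val v = (1 to 5).view.map { i => println(s"mapping $i"); i * 2 }
// Nothing has printed yet: the map is deferred.
v.head    // prints "mapping 1", returns 2
v.toList  // forces everything, printing "mapping 1" through "mapping 5" again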
Worse, you may end up in surprising situations where you re-evaluate code you did not need to. Take this naive example:
// Represents List(2, 3, 4, 5); prints "eval" during each evaluation
val list = List(1, 2, 3, 4).view.map { i => println("eval"); i + 1 }

// For each element in the list, prints the elements that are *not* that element (as a view)
def printList[A](list: Traversable[A]): Unit =
  list foreach { i => println(list.filter(_ != i)) }
Because list is a view, printList evaluates list n - 1 more times than it should!
scala> printList(list)
eval
SeqViewMF(...)
eval
SeqViewMF(...)
eval
SeqViewMF(...)
eval
SeqViewMF(...)
Now imagine list were not clearly labelled as a view (i.e. if collections were views by default). I can picture many strange bugs or bottlenecks occurring from unintended re-evaluation.
This is not to say that strict evaluation doesn't come with its own surprises. It was just a design choice that strict evaluation is less surprising than lazy evaluation (Paul Phillips may disagree).
Given that Seq.view returns a SeqView, I would have expected Set.view to return a SetView, but no such view exists; Set.view instead returns an IterableView.
Unfortunately, IterableView is missing some methods, such as contains. Compare these, for example:
Seq(1, 2, 3).view.map(_ * 2).contains(4) // returns true
Set(1, 2, 3).view.map(_ * 2).contains(4) // error
Is there any particular reason why no SetView class exists?
Also, is there any reason why Iterable doesn't have a contains method (given this is basically a special case of find)?
Given the situation above, is there a better alternative to this when working with sets (in other words, what is the best practice in Scala):
Set(1, 2, 3).view.map(_ * 2).find(_ == 4).isDefined
There is no SetView because views are a pain in the neck to implement and test, and that effort is less worth it since the nice properties of sets generally require you to have already eagerly created the entire set (e.g. O(1) or O(log n) lookup).
contains isn't in Iterable precisely because Set extends Iterable and a Set contains should not type check unless you ask for something that could be in the set. Since Iterable is covariant, its contains would have to admit you asking for anything (as the contains for Seq does).
As a workaround, you can note that contains(x) does the same thing as exists(_ == x), and exists is on Iterable.
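So the pair of examples above can be unified like this:
Seq(1, 2, 3).view.map(_ * 2).exists(_ == 4)  // true
Set(1, 2, 3).view.map(_ * 2).exists(_ == 4)  // true: exists is available on Iterable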
What is the motivation for Scala assignment evaluating to Unit rather than the value assigned?
A common pattern in I/O programming is to do things like this:
while ((bytesRead = in.read(buffer)) != -1) { ...
But this is not possible in Scala because...
bytesRead = in.read(buffer)
.. returns Unit, not the new value of bytesRead.
Seems like an interesting thing to leave out of a functional language.
I am wondering why it was done so?
I advocated for having assignments return the value assigned rather than Unit. Martin and I went back and forth on it, but his argument was that putting a value on the stack just to pop it off 95% of the time was a waste of byte-codes and would have a negative impact on performance.
I'm not privy to inside information on the actual reasons, but my suspicion is very simple. Scala makes side-effectful loops awkward to use so that programmers will naturally prefer for-comprehensions.
It does this in many ways. For instance, you don't have a for loop where you declare and mutate a variable. You can't (easily) mutate state on a while loop at the same time you test the condition, which means you often have to repeat the mutation just before it, and at the end of it. Variables declared inside a while block are not visible from the while test condition, which makes do { ... } while (...) much less useful. And so on.
Workaround:
while ({bytesRead = in.read(buffer); bytesRead != -1}) { ...
For whatever it is worth.
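Spelled out as a self-contained sketch (the file name is hypothetical):
import java.io.FileInputStream

val in = new FileInputStream("data.bin")
val buffer = new Array[Byte](4096)
var bytesRead = 0
// The last expression of the block is the loop condition:
while ({ bytesRead = in.read(buffer); bytesRead != -1 }) {
  // process buffer(0), ..., buffer(bytesRead - 1)
}
in.close()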
As an alternate explanation, perhaps Martin Odersky had to face a few very ugly bugs deriving from such usage, and decided to outlaw it from his language.
EDIT
David Pollack has answered with some actual facts, which are clearly endorsed by the fact that Martin Odersky himself commented on his answer, giving credence to the performance-related argument put forth by Pollack.
This happened as part of Scala having a more "formally correct" type system. Formally-speaking, assignment is a purely side-effecting statement and therefore should return Unit. This does have some nice consequences; for example:
class MyBean {
  private var internalState: String = _

  def state = internalState
  def state_=(state: String) = internalState = state
}
The state_= method returns Unit (as would be expected for a setter) precisely because assignment returns Unit.
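Usage then looks like ordinary field access (a quick sketch using the class above):
val bean = new MyBean
bean.state = "ready"  // sugar for bean.state_=("ready"), which returns Unit
println(bean.state)   // prints "ready"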
I agree that for C-style patterns like copying a stream or similar, this particular design decision can be a bit troublesome. However, it's actually relatively unproblematic in general and really contributes to the overall consistency of the type system.
Perhaps this is due to the command-query separation principle?
CQS tends to be popular at the intersection of OO and functional programming styles, as it creates an obvious distinction between object methods that do or do not have side-effects (i.e., that alter the object). Applying CQS to variable assignments is taking it further than usual, but the same idea applies.
A short illustration of why CQS is useful: Consider a hypothetical hybrid F/OO language with a List class that has methods Sort, Append, First, and Length. In imperative OO style, one might want to write a function like this:
func foo(x):
    var list = new List(4, -2, 3, 1)
    list.Append(x)
    list.Sort()
    # list now holds a sorted, five-element list
    var smallest = list.First()
    return smallest + list.Length()
Whereas in more functional style, one would more likely write something like this:
func bar(x):
    var list = new List(4, -2, 3, 1)
    var smallest = list.Append(x).Sort().First()
    # list still holds an unsorted, four-element list
    return smallest + list.Length()
These seem to be trying to do the same thing, but obviously one of the two is incorrect, and without knowing more about the behavior of the methods, we can't tell which one.
Using CQS, however, we would insist that if Append and Sort alter the list, they must return the unit type, thus preventing us from creating bugs by using the second form when we shouldn't. The presence of side effects therefore also becomes implicit in the method signature.
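In Scala terms, the same discipline might be sketched like this (my illustration, not part of the original argument):
import scala.collection.mutable.ArrayBuffer

val list = ArrayBuffer(4, -2, 3, 1)
list(0) = 9                // command: update mutates the buffer and returns Unit
val sorted = list.sorted   // query: returns a new sorted buffer; list is unchanged
val smallest = sorted.head // query: no side effects
smallest + list.length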
I'd guess this is in order to keep the program / the language free of side effects.
What you describe is the intentional use of a side effect which in the general case is considered a bad thing.
It is not the best style to use an assignment as a boolean expression. You perform two things at the same time, which often leads to errors. And the accidental use of "=" instead of "==" is avoided with Scala's restriction.
By the way: I find the initial while-trick stupid, even in Java. Why not something like this?
for (int bytesRead = in.read(buffer); bytesRead != -1; bytesRead = in.read(buffer)) {
    // do something
}
Granted, the assignment appears twice, but at least bytesRead is in the scope it belongs to, and I'm not playing with funny assignment tricks...
You can have a workaround for this as long as you have a reference type for indirection. In a naïve implementation, you can use the following for arbitrary types.
case class Ref[T](var value: T) {
  def :=(newval: => T)(pred: T => Boolean): Boolean = {
    this.value = newval
    pred(this.value)
  }
}
Then, under the constraint that you’ll have to use ref.value to access the reference afterwards, you can write your while predicate as
val bytesRead = Ref(0) // maybe there is a way to get rid of this line
val bytesRead = Ref(0) // maybe there is a way to get rid of this line
while ((bytesRead := in.read(buffer))(_ != -1)) { // ...
  println(bytesRead.value)
}
and you can do the checking against bytesRead in a more implicit manner without having to type it.
I'm pretty new to Scala; before this I mostly used Java. Right now I have warnings all over my code saying that I should "Avoid mutable local variables", and I have a simple question: why?
Suppose I have small problem - determine max int out of four. My first approach was:
def max4(a: Int, b: Int, c: Int, d: Int): Int = {
  var subMax1 = a
  if (b > a) subMax1 = b
  var subMax2 = c
  if (d > c) subMax2 = d
  if (subMax1 > subMax2) subMax1
  else subMax2
}
After taking into account this warning message I found another solution:
def max4(a: Int, b: Int, c: Int, d: Int): Int = {
  max(max(a, b), max(c, d))
}

def max(a: Int, b: Int): Int = {
  if (a > b) a
  else b
}
It looks prettier, but what is the ideology behind this?
Whenever I approach a problem I'm thinking about it like: "Ok, we start from this and then we incrementally change things and get the answer". I understand that the problem is that I try to change some initial state to get an answer and do not understand why changing things at least locally is bad? How to iterate over collection then in functional languages like Scala?
Like an example: Suppose we have a list of ints, how to write a function that returns sublist of ints which are divisible by 6? Can't think of solution without local mutable variable.
In your particular case there is another solution:
def max4(a: Int, b: Int, c: Int, d: Int): Int = {
  val submax1 = if (a > b) a else b
  val submax2 = if (c > d) c else d
  if (submax1 > submax2) submax1 else submax2
}
Isn't it easier to follow? Of course I am a bit biased, but I tend to think it is. BUT don't follow that rule blindly. If you see that some code might be written more readably and concisely in mutable style, do it this way -- the great strength of Scala is that you don't need to commit to either the immutable or the mutable approach; you can swing between them (btw the same applies to return keyword usage).
Like an example: Suppose we have a list of ints, how to write a
function that returns the sublist of ints which are divisible by 6?
Can't think of solution without local mutable variable.
It is certainly possible to write such a function using recursion but, again, if a mutable solution looks good and works well, why not?
It's not so much about Scala as about functional programming methodology in general. The idea is the following: if you have constant variables (final in Java), you can use them without any fear that they are going to change. In the same way, you can parallelize your code without worrying about race conditions or thread-unsafe code.
It is not so important in your example; however, imagine the following:
val variable = ...
Future { function1(variable) }
Future { function2(variable) }
Using immutable values you can be sure that there will not be any problem. Otherwise, you would have to check the main thread and both function1 and function2.
Of course, it's possible to obtain the same result with mutable variables if you do not ever change them. But using immutable ones you can be sure that this will be the case.
Edit to answer your edit:
Local mutables are not bad; that's the reason you can use them. However, if you try to think of approaches without them, you can arrive at solutions like the one you posted, which is cleaner and can be parallelized very easily.
How to iterate over collection then in functional languages like Scala?
You can always iterate over an immutable collection, as long as you do not change anything. For example:
val list = Seq(1, 2, 3)
for (n <- list)
  println(n)
With respect to the second thing that you said: you have to stop thinking in a traditional way. In functional programming the usage of map, filter, reduce, etc. is normal, as is pattern matching and other concepts that are not typical in OOP. For the example you give:
Like an example: Suppose we have a list of ints, how to write a function that returns sublist of ints which are divisible by 6?
val list = Seq(1,6,10,12,18,20)
val result = list.filter(_ % 6 == 0)
Firstly you could rewrite your example like this:
def max(first: Int, others: Int*): Int =
  if (others.isEmpty) first
  else max(Math.max(first, others.head), others.tail: _*)
This uses varargs and tail recursion to find the largest number. Of course there are many other ways of doing the same thing.
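For example, with the empty-varargs guard above:
max(3, 9, 2, 7)  // 9
max(5)           // 5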
To answer your question: it's a good question, and one that I thought about myself when I first started to use Scala. Personally I think the whole immutable/functional programming approach is somewhat overhyped. But for what it's worth, here are the main arguments in favour of it:
Immutable code is easier to read (subjective)
Immutable code is more robust - it's certainly true that changing mutable state can lead to bugs. Take this for example:
for (int i = 0; i < 100; i++) {
    for (int j = 0; j < 100; i++) {
        System.out.println("i is " + i + " and j is " + j);
    }
}
This is an over simplified example but it's still easy to miss the bug and the compiler won't help you
Mutable code is generally not thread safe. Even trivial and seemingly atomic operations are not safe. Take i++ for example: it looks like an atomic operation, but it's actually equivalent to:
int tempI = i;     // read the current value
tempI = tempI + 1; // increment it
i = tempI;         // write it back
Immutable data structures won't allow you to do something like this so you would need to explicitly think about how to handle it. Of course as you point out local variables are generally threadsafe, but there is no guarantee. It's possible to pass a ListBuffer instance variable as a parameter to a method for example
However there are downsides to immutable and functional programming styles:
Performance. It is generally slower in both compilation and runtime. The compiler must enforce the immutability and the JVM must allocate more objects than would be required with mutable data structures. This is especially true of collections.
Most Scala examples show something like val numbers = List(1,2,3), but in the real world hard-coded values are rare. We generally build collections dynamically (from a database query etc.). While Scala lets you "modify" an immutable collection, it must still create a new collection object every time you do. If you want to add 1000 elements to a Scala List (immutable), the JVM will need to allocate (and then GC) 1000 objects.
Hard to maintain. Functional code can be very hard to read; it's not uncommon to see code like this:
val data = numbers.foreach(_.map(a => doStuff(a).flatMap(somethingElse)).foldLeft("", (a: Int, b: Int) => a + b))
I don't know about you but I find this sort of code really hard to follow!
Hard to debug. Functional code can also be hard to debug. Try putting a breakpoint halfway into my (terrible) example above
My advice would be to use a functional/immutable style where it genuinely makes sense and you and your colleagues feel comfortable doing it. Don't use immutable structures because they're cool or it's "clever". Complex and challenging solutions will get you bonus points at Uni but in the commercial world we want simple solutions to complex problems! :)
Your two main questions:
Why warn against local state changes?
How can you iterate over collections without mutable state?
I'll answer both.
Warnings
The compiler warns against the use of mutable local variables because they are often a cause of error. That doesn't mean this is always the case. However, your sample code is pretty much a classic example of where mutable local state is used entirely unnecessarily, in a way that not only makes it more error prone and less clear but also less efficient.
Your first code example is more inefficient than your second, functional solution. Why potentially make two assignments to submax1 when you only ever need to assign one? You ask which of the two inputs is larger anyway, so why not ask that first and then make one assignment? Why was your first approach to temporarily store partial state only halfway through the process of asking such a simple question?
Your first code example is also inefficient because of unnecessary code duplication. You're repeatedly asking "which is the biggest of two values?" Why write out the code for that 3 times independently? Needlessly repeating code is a known bad habit in OOP every bit as much as FP and for precisely the same reasons. Each time you needlessly repeat code, you open a potential source of error. Adding mutable local state (especially when so unnecessary) only adds to the fragility and to the potential for hard to spot errors, even in short code. You just have to type submax1 instead of submax2 in one place and you may not notice the error for a while.
Your second, FP solution removes the code duplication, dramatically reducing the chance of error, and shows that there was simply no need for mutable local state. It's also, as you yourself say, cleaner and clearer - and better than the alternative solution in om-nom-nom's answer.
(By the way, the idiomatic Scala way to write such a simple function is
def max(a: Int, b: Int) = if (a > b) a else b
This terser style emphasises its simplicity and makes the code less verbose.)
Your first solution was inefficient and fragile, but it was your first instinct. The warning caused you to find a better solution. The warning proved its value. Scala was designed to be accessible to Java developers and is taken up by many with a long experience of imperative style and little or no knowledge of FP. Their first instinct is almost always the same as yours. You have demonstrated how that warning can help improve code.
There are cases where using mutable local state can be faster but the advice of Scala experts in general (not just the pure FP true believers) is to prefer immutability and to reach for mutability only where there is a clear case for its use. This is so against the instincts of many developers that the warning is useful even if annoying to experienced Scala devs.
It's funny how often some kind of max function comes up in "new to FP/Scala" questions. The questioner is very often tripping up on errors caused by their use of local state... which both demonstrates the often obtuse addiction to mutable state among some devs and leads me on to your other question.
Functional Iteration over Collections
There are three functional ways to iterate over collections in Scala
For Comprehensions
Explicit Recursion
Folds and other Higher Order Functions
For Comprehensions
Your question:
Suppose we have a list of ints, how to write a function that returns sublist of ints which are divisible by 6? Can't think of solution without local mutable variable
Answer: assuming xs is a list (or some other sequence) of integers, then
for (x <- xs; if x % 6 == 0) yield x
will give you a sequence (of the same type as xs) containing only those items which are divisible by 6, if any. No mutable state required. Scala just iterates over the sequence for you and returns anything matching your criteria.
If you haven't yet learned the power of for comprehensions (also known as sequence comprehensions), you really should. It's a very expressive and powerful part of Scala syntax. You can even use them with side effects and mutable state if you want (look at the final example in the tutorial I just linked to). That said, there can be unexpected performance penalties and they are overused by some developers.
Explicit Recursion
In the question I linked to at the end of the first section, I give in my answer a very simple, explicitly recursive solution to returning the largest Int from a list.
def max(xs: List[Int]): Option[Int] = xs match {
  case Nil => None
  case List(x: Int) => Some(x)
  case x :: y :: rest => max((if (x > y) x else y) :: rest)
}
I'm not going to explain how the pattern matching and explicit recursion work (read my other answer or this one). I'm just showing you the technique. Most Scala collections can be iterated over recursively, without any need for mutable state. If you need to keep track of what you've been up to along the way, you pass along an accumulator. (In my example code, I stick the accumulator at the front of the list to keep the code smaller but look at the other answers to those questions for more conventional use of accumulators).
But here is a (naive) explicitly recursive way of finding those integers divisible by 6
def divisibleByN(n: Int, xs: List[Int]): List[Int] = xs match {
  case Nil => Nil
  case x :: rest if x % n == 0 => x :: divisibleByN(n, rest)
  case _ :: rest => divisibleByN(n, rest)
}
I call it naive because it isn't tail recursive and so could blow your stack. A safer version can be written using an accumulator list and an inner helper function but I leave that exercise to you. The result will be less pretty code than the naive version, no matter how you try, but the effort is educational.
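For reference, here is one way the accumulator version might look (a sketch; do try it yourself first):
import scala.annotation.tailrec

def divisibleByN(n: Int, xs: List[Int]): List[Int] = {
  @tailrec
  def loop(rest: List[Int], acc: List[Int]): List[Int] = rest match {
    case Nil => acc.reverse
    case x :: tail if x % n == 0 => loop(tail, x :: acc)
    case _ :: tail => loop(tail, acc)
  }
  loop(xs, Nil)
}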
Recursion is a very important technique to learn. That said, once you have learned to do it, the next important thing to learn is that you can usually avoid using it explicitly yourself...
Folds and other Higher Order Functions
Did you notice how similar my two explicit recursion examples are? That's because most recursions over a list have the same basic structure. If you write a lot of such functions, you'll repeat that structure many times. Which makes it boilerplate; a waste of your time and a potential source of error.
Now, there are any number of sophisticated ways to explain folds but one simple concept is that they take the boilerplate out of recursion. They take care of the recursion and the management of accumulator values for you. All they ask is that you provide a seed value for the accumulator and the function to apply at each iteration.
For example, here is one way to use fold to extract the highest Int from the list xs
xs.tail.foldRight(xs.head) {(a, b) => if (a > b) a else b}
I know you aren't familiar with folds, so this may seem gibberish to you but surely you recognise the lambda (anonymous function) I'm passing in on the right. What I'm doing there is taking the first item in the list (xs.head) and using it as the seed value for the accumulator. Then I'm telling the rest of the list (xs.tail) to iterate over itself, comparing each item in turn to the accumulator value.
This kind of thing is a common case, so the Collections api designers have provided a shorthand version:
xs.reduce {(a, b) => if (a > b) a else b}
(If you look at the source code, you'll see they have implemented it using a fold).
Anything you might want to do iteratively to a Scala collection can be done using a fold. Often, the api designers will have provided a simpler higher-order function which is implemented, under the hood, using a fold. Want to find those divisible-by-six Ints again?
xs.foldRight(Nil: List[Int]) {(x, acc) => if (x % 6 == 0) x :: acc else acc}
That starts with an empty list as the accumulator, iterates over every item, only adding those divisible by 6 to the accumulator. Again, a simpler fold-based HoF has been provided for you:
xs filter { _ % 6 == 0 }
Folds and related higher-order functions are harder to understand than for comprehensions or explicit recursion, but very powerful and expressive (to anybody else who understands them). They eliminate boilerplate, removing a potential source of error. Because they are implemented by the core language developers, they can be more efficient (and that implementation can change, as the language progresses, without breaking your code). Experienced Scala developers use them in preference to for comprehensions or explicit recursion.
tl;dr
Learn For comprehensions
Learn explicit recursion
Don't use them if a higher-order function will do the job.
It is always nicer to use immutable variables since they make your code easier to read. Writing recursive code can help solve your problem.
def max(x: List[Int]): Int = {
  if (x.isEmpty) 0
  else Math.max(x.head, max(x.tail))
}

val a_list = List(a, b, c, d)
val max_value = max(a_list)
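If you'd rather not invent a value (0) for the empty list, a reduceOption variant avoids it (a small sketch):
def maxOption(xs: List[Int]): Option[Int] = xs.reduceOption(_ max _)

maxOption(List(3, 1, 4))  // Some(4)
maxOption(Nil)            // None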