I recently became interested in functional programming, and I'm working on some toy scripts.
One example: take a list of integers, add them up in sequence, and get the index at which the rolling sum reaches a certain value, say -1.
This is what I have now:
@tailrec
def someFunc(currentSum: Int, list: List[Int], index: Int): Int = {
  if (currentSum == -1) return index
  // Return -1 if the value is never reached
  if (index >= list.length) return -1
  val value = list(index)
  someFunc(currentSum + value, list, index + 1)
}
This works, and if the rolling sum never reaches -1, the function returns -1.
I'm not exactly happy with returning -1 in this case. After some reading, I was introduced to the concept of Option: basically, I can have either Some value or None. I figured this might be the proper solution in this scenario, so I changed my code to this:
@tailrec
def someFunc(currentSum: Int, list: List[Int], index: Int): Option[Int] = {
  if (currentSum == -1) return Some(index)
  if (index >= list.length) return None
  val value = list(index)
  someFunc(currentSum + value, list, index + 1)
}
Now my question is, is there anything I can do to avoid adding Some(index) in the 3rd line? In this case it seems trivial, but for more complex situation, it seems a bit unnecessary to add Some everywhere down in the chain.
Also I'm also wondering if this is the proper functional way to handle this type of situation?
What you are asking is this: "I should just be able to type return index instead of return Some(index), and the Scala compiler should understand that, since this function returns an Option, I actually mean the latter."
it seems a bit unnecessary to add Some everywhere
It is not unnecessary. If what you ask for were granted, it would result in ambiguity. Consider this scenario:
def hypotheticalFunction(....): Option[Option[Int]] = {
  if (someCondition) return None // ambiguous
  // more code
}
Does the return None mean return Some(None) or return None? We have no way of knowing.
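To make the ambiguity concrete, here is a minimal sketch (the object and value names are made up for illustration). It shows that the two readings of "return None" are different values of type Option[Option[Int]], so the compiler could not pick one silently:

```scala
object NestedOptionDemo {
  // The outer Option is empty: there is no result at all.
  val outerMissing: Option[Option[Int]] = None
  // The outer Option is present, but it wraps an empty inner Option.
  val innerMissing: Option[Option[Int]] = Some(None)

  // Pattern matching can tell the three shapes apart.
  def describe(r: Option[Option[Int]]): String = r match {
    case None          => "outer None"
    case Some(None)    => "Some(None)"
    case Some(Some(n)) => s"Some(Some($n))"
  }
}
```

Because describe distinguishes the first two cases, an implicit wrapping rule would have to guess which one the programmer meant.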
==========================================
Also I'm also wondering if this is the proper functional way to handle this type of situation?
I can think of at least 2 improvements:
Generally, return statements are discouraged: arbitrarily breaking the function's flow with a return hurts readability. You can use this instead:
if (condition) Some(index)
else if (condition) None
else {
}
Since the earlier parts of the list are not needed in the recursive calls, you can pass only the tail of the list to the recursive function. This way, you don't have to walk the list by calling list(index). A snippet:
@tailrec
def someFunc(currentSum: Int, list: List[Int], index: Int): Option[Int] = {
  if (currentSum == -1) Some(index)
  else if (list.isEmpty) None
  else {
    val value = list.head
    someFunc(currentSum + value, list.tail, index + 1)
  }
}
Not sure what you mean by "everywhere down the chain". There are pretty much just two cases - either Some or None. And yeah, you have to specify which is which.
A pattern match can make this case separation a bit clearer (it also fixes a problem that makes your implementation quadratic: List is a linked list, and list(index) is linear, so don't do that):
def someFunc(
list: List[Int],
currentSum: Int = 0,
index: Int=0
): Option[Int] = list match {
case _ if currentSum == -1 => Some(index)
case Nil => None
case head :: tail => someFunc(tail, currentSum + head, index+1)
}
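If the goal is mainly to avoid writing Some and None by hand, one possible alternative is to let the collection library do the search and wrap the result once at the end. This is only a sketch, not the approach from the answers above: the name firstIndexReaching and the target parameter (generalising the hard-coded -1) are assumptions made for illustration.

```scala
object RollingSumDemo {
  // Hypothetical helper: index at which the running sum first equals `target`.
  def firstIndexReaching(list: List[Int], target: Int): Option[Int] = {
    // scanLeft yields the running sums lazily: 0, a0, a0 + a1, ...
    val idx = list.iterator.scanLeft(0)(_ + _).indexOf(target)
    // indexOf returns -1 when nothing matches; convert that to None in one place.
    Option(idx).filter(_ >= 0)
  }
}
```

The index counts how many elements were added before the sum hit the target, which matches the index argument of the recursive versions.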
I'm new to Scala. Is there a better way to express this using only the most basic knowledge?
def findMax(xs: List[Int]): Int = {
xs match {
case x :: tail => (if (tail.length==0) x else (if(x>findMax(tail)) x else (findMax(tail))))
}
}
There are two problems here. First, you call tail.length, which is an O(N) operation, so in the worst case this will cost you N*N steps, where N is the length of the sequence. The second is that your function is not tail-recursive: you nest the findMax calls "from outside to inside".
The usual strategy to write the correct recursive function is
to think about each possible pattern case: here you have either the empty list Nil or the non-empty list head :: tail. This solves your first problem.
to carry along the temporary result (here the current guess of the maximum value) as another argument of the function. This solves your second problem.
This gives:
import scala.annotation.tailrec
@tailrec
def findMax(xs: List[Int], max: Int): Int = xs match {
  case head :: tail => findMax(tail, if (head > max) head else max)
  case Nil => max
}
val z = util.Random.shuffle((1 to 100).toList)
assert(findMax(z, Int.MinValue) == 100)
If you don't want to expose this additional argument, you can write an auxiliary inner function.
def findMax(xs: List[Int]): Int = {
  @tailrec
  def loop(ys: List[Int], max: Int): Int = ys match {
    case head :: tail => loop(tail, if (head > max) head else max)
    case Nil => max
  }
  loop(xs, Int.MinValue)
}
val z = util.Random.shuffle((1 to 100).toList)
assert(findMax(z) == 100)
For simplicity we return Int.MinValue if the list is empty. A better solution might be to throw an exception for this case.
The @tailrec annotation here is optional; it simply makes the compiler verify that we have indeed defined a tail-recursive function. This has the advantage that we cannot produce a stack overflow if the list is extremely long.
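As a rough sketch of the "throw an exception" variant mentioned above (the wrapper object FindMaxDemo and the choice of NoSuchElementException are assumptions, not part of the original answer), the loop can be seeded with the first element instead of Int.MinValue:

```scala
import scala.annotation.tailrec

object FindMaxDemo {
  def findMax(xs: List[Int]): Int = {
    // The compiler rejects this annotation if loop is not tail-recursive.
    @tailrec
    def loop(ys: List[Int], max: Int): Int = ys match {
      case head :: tail => loop(tail, if (head > max) head else max)
      case Nil          => max
    }
    xs match {
      // Seed the accumulator with the first element.
      case head :: tail => loop(tail, head)
      case Nil          => throw new NoSuchElementException("findMax of empty list")
    }
  }
}
```

Since loop is compiled to a plain loop, a list with a million elements should be handled without growing the stack.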
Any time you're reducing a collection to a single value, consider using one of the fold functions instead of explicit recursion.
List(3,7,1).fold(Int.MinValue)(Math.max)
// 7
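A related sketch, in case Int.MinValue as the result for an empty list is not acceptable: reduceOption is a standard collection method that returns None for an empty collection. The wrapper names below are made up for illustration.

```scala
object MaxOptionDemo {
  // Returns None for an empty list instead of a sentinel value.
  def maxOption(xs: List[Int]): Option[Int] = xs.reduceOption(_ max _)
}
```

For example, MaxOptionDemo.maxOption(List(3, 7, 1)) gives Some(7), while an empty list gives None.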
I too am new to Scala (I'm into Haskell though!).
My attempt at this would be as below.
Note that I assume a non-empty list, since the max of an empty list does not make sense.
I first define a helper method which simply returns the max of 2 numbers.
def maxOf2 (x:Int, y:Int): Int = {
if (x >= y) x
else y
}
Armed with this simple function, we can build a recursive function to find the 'max' as below:
def findMax(xs: List[Int]): Int = {
if (xs.tail.isEmpty)
xs.head
else
maxOf2(xs.head, findMax(xs.tail))
}
I feel this is a pretty 'clear'(though not 'efficient') way to do it.
I wanted to make the concept of recursion obvious.
Hope this helps!
Elaborating on @fritz's answer: if you pass in an empty list, it will throw a java.lang.UnsupportedOperationException: tail of empty list
So, keeping the algorithm intact, I made this adjustment:
def max(xs: List[Int]): Int = {
def maxOfTwo(x: Int, y: Int): Int = {
if (x >= y) x else y
}
if (xs.isEmpty) throw new UnsupportedOperationException("What man?")
else if (xs.size == 1) xs.head
else maxOfTwo(xs.head, max(xs.tail))
}
@fritz Thanks for the answer
Using pattern matching and recursion,
def top(xs: List[Int]): Int = xs match {
case Nil => sys.error("no max in empty list")
case x :: Nil => x
case x :: xs => math.max(x, top(xs))
}
Pattern matching is used to decompose the list into head and rest. A single element list is denoted with x :: Nil. We recurse on the rest of the list and compare for maximum on the head item of the list at each recursive stage. To make the cases exhaustive (to make a well-defined function) we consider also empty lists (Nil).
def maxl(xl: List[Int]): Int = {
  if (xl.tail.isEmpty)
    xl.head
  else if (xl.head > xl.tail.head)
    // Drop the smaller second element and keep comparing with the head
    maxl(xl.head :: xl.tail.tail)
  else
    maxl(xl.tail)
}
I'm fairly new to Scala, coming from Java, and also pretty new to pattern matching. One of the things I'm trying to get my head around is when to use it and what its costs/benefits are. For example this
def myThing(a: Int): Int = a match {
case a: Int if a > 0 => a
case _ => myThing(a + 1)
}
Does the same thing as this (unless I've really misunderstood something)
def myThing(a: Int): Int = {
if (a > 0) a
else myThing(a + 1)
}
So my actual question:
But do they run the same way? Is my pattern matched example tail recursive? And if not, then why not when it is in the second example?
Are there any other things I should worry about, like resources? Or should I pretty much always try to use pattern matching?
I've searched around for these answers but haven't found any "best practices" for this!
Edit: I'm aware that the example used is a bit contrived - I've just added it to be clear about the question below it - thanks!
Yes, they run the same way. The best practice for any syntactic sugar is the same: use it whenever it gives more readable or flexible code. In your example, with the if expression you can omit the braces and write just
def myThing(a: Int): Int = if (a > 0) a else myThing(a + 1)
Which is definitely more handy than pattern matching. Pattern matching is handy in situations where:
You have 3 or more alternatives
You should unpack/check values through extractors (check this question)
You should check the types
Also, to ensure your function is tail-recursive, you can use the @tailrec annotation
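A small sketch of that check applied to both versions from the question (the object and method names are invented here). If either method were not tail-recursive, the annotation would turn that into a compile error:

```scala
import scala.annotation.tailrec

object TailrecDemo {
  // Pattern-matching version: the recursive call is the last thing the case does.
  @tailrec
  def myThingMatch(a: Int): Int = a match {
    case n if n > 0 => n
    case _          => myThingMatch(a + 1)
  }

  // if/else version of the same logic.
  @tailrec
  def myThingIf(a: Int): Int = if (a > 0) a else myThingIf(a + 1)
}
```

Both definitions are accepted with the annotation in place, which supports the claim that the two forms behave the same with respect to tail calls.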
Another 'Scala' way to do it would be to define an extractor for a positive number
def myThing(a: Int): Int = a match {
case PositiveNum(positive) => positive
case negative => myThing(negative + 1)
}
object PositiveNum {
def unapply(n: Int): Option[Int] = if (n > 0) Some(n) else None
}
Yet another way to pattern-match against the evaluated predicate (condition),
def myThing(a: Int): Int = a > 0 match {
case true => a
case _ => myThing(a + 1)
}
where matches include no (additional) guards or type declarations.
I am learning Scala through the functional programming course on Coursera. I noticed that the automatic style checker tells me that the use of 'return' is a bad habit. Why is that? To me it seems like the use of return would make the code more readable, because any other programmer can instantly see that the function returns and what it returns.
Example, why is this;
def sum(xs: List[Int]): Int = {
if( xs.length == 0){
return 0
}else{
return xs.head + sum(xs.tail)
}
}
Considered to be worse than this;
def sum(xs: List[Int]): Int = {
if( xs.length == 0){
0
}else{
xs.head + sum(xs.tail)
}
}
I am used to javascript, so that might be a reason why I feel uneasy about it. Still, can anybody make it obvious why the addition of the return statement makes my code worse? If so, why is there a return statement in the language?
In Java, Javascript and other imperative languages if...else is a flow-control statement.
This means you can do this
public int doStuff(final boolean flag) {
if(flag)
return 1;
else
return 5;
}
But you cannot do this
public int doStuff(final boolean flag) {
return if(flag)
1;
else
5;
}
Because if...else is a statement and not an expression. To accomplish this you need to use the ternary operator (strictly speaking the "conditional operator"), something like:
public int doStuff(final boolean flag) {
return flag ? 1 : 5;
}
In Scala, this is different. The if...else construct is an expression, so more akin to the conditional operator in the languages you are used to. So, in fact your code is better written as:
def sum(xs: List[Int]): Int = {
return if(xs.length == 0) {
0
} else {
xs.head + sum(xs.tail)
}
}
Further, the last expression in a function is automatically returned, so the return is redundant. In fact, as the code only has single expressions, the curly brackets are redundant too:
def sum(xs: List[Int]): Int =
if(xs.length == 0) 0
else xs.head + sum(xs.tail)
So, to answer your question: this is discouraged because it is a misinterpretation of the nature of the if...else construct in Scala.
But this is all a little beside the point; you should really be using pattern matching
def sum(xs: List[Int]): Int = xs match {
case Nil => 0
case head::tail => head + sum(tail)
}
This is much more idiomatic Scala. Learn how to use (and abuse) pattern matching and you will save yourself a huge number of lines of code.
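One caveat worth a sketch: the pattern-matching sum above is not tail-recursive, because the addition happens after the recursive call returns. Assuming very long lists matter, an accumulator-based variant could look like this (the names SumDemo and loop are illustrative only):

```scala
import scala.annotation.tailrec

object SumDemo {
  def sum(xs: List[Int]): Int = {
    // acc carries the partial sum, so the recursive call is the final action.
    @tailrec
    def loop(rest: List[Int], acc: Int): Int = rest match {
      case Nil          => acc
      case head :: tail => loop(tail, acc + head)
    }
    loop(xs, 0)
  }
}
```

The pattern match still does the case analysis; only the place where the addition happens has moved.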
I think another answer for the question why
why is the use of return a bad habit in scala
is that return when used in a closure will return from the method not from the closure itself.
For example, consider this code:
def sumElements(xs: List[Int]): Int = {
val ys: List[Int] = xs.map { x =>
return x + 1
}
return ys.sum
}
It can easily be missed that when this code is invoked with sumElements(List(1, 2, 3, 4)) the result will be 2 and not 14. This is because the return within map returns from sumElements, not from the function passed to map.
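The difference can be seen side by side in the following sketch (method names are made up; newer Scala 3 compilers may additionally warn that this kind of non-local return is deprecated):

```scala
object ReturnDemo {
  // The return inside the lambda exits sumWithReturn on the first element.
  def sumWithReturn(xs: List[Int]): Int = {
    val ys: List[Int] = xs.map { x =>
      return x + 1
    }
    ys.sum
  }

  // Without return, the lambda just yields x + 1 and map runs to completion.
  def sumWithoutReturn(xs: List[Int]): Int = {
    val ys = xs.map(x => x + 1)
    ys.sum
  }
}
```

With List(1, 2, 3, 4) the first method stops at the first element and yields 1 + 1, while the second sums 2 + 3 + 4 + 5.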
In Scala, nearly every construct is an expression, not a statement. Statements generally don't have a return value, but expressions do.
The last result of a block will be the returned value, and so the style guide operates on this assumption. A return would be an exceptional exit from a block.
def sum(xs: List[Int]): Int = {
if(xs.isEmpty) return 0
xs.head + sum(xs.tail)
}
The return statement there would cause the function to bail out at that return, and generally leads to less understandable code than if you wrote it with the if/else logic, as you did earlier. I believe the rationale behind the style decision is to discourage this type of programming, as it makes programs more difficult to understand.
I have the following recursive function in Scala that should return the maximum size integer in the List. Is anyone able to tell me why the largest value is not returned?
def max(xs: List[Int]): Int = {
var largest = xs.head
println("largest: " + largest)
if (!xs.tail.isEmpty) {
var next = xs.tail.head
println("next: " + next)
largest = if (largest > next) largest else next
var remaining = List[Int]()
remaining = largest :: xs.tail.tail
println("remaining: " + remaining)
max(remaining)
}
return largest
}
Print out statements show me that I've successfully managed to bring back the largest value in the List as the head (which was what I wanted) but the function still returns back the original head in the list. I'm guessing this is because the reference for xs is still referring to the original xs list, problem is I can't override that because it's a val.
Any ideas what I'm doing wrong?
You should use the return value of the inner call to max and compare that to the local largest value.
Something like the following (removed println just for readability):
def max(xs: List[Int]): Int = {
  var largest = xs.head
  if (!xs.tail.isEmpty) {
    // Recurse on the rest of the list and keep the larger of the two values
    val next = max(xs.tail)
    largest = if (largest > next) largest else next
  }
  return largest
}
Bye.
I have an answer to your question but first...
This is the most minimal recursive implementation of max I've ever been able to think up:
def max(xs: List[Int]): Option[Int] = xs match {
case Nil => None
case List(x: Int) => Some(x)
case x :: y :: rest => max( (if (x > y) x else y) :: rest )
}
(OK, my original version was ever so slightly more minimal but I wrote that in Scheme which doesn't have Option or type safety etc.) It doesn't need an accumulator or a local helper function because it compares the first two items on the list and discards the smaller, a process which - performed recursively - inevitably leaves you with a list of just one element which must be bigger than all the rest.
OK, why your original solution doesn't work... It's quite simple: you do nothing with the return value from the recursive call to max. All you had to do was change the line
max(remaining)
to
largest = max(remaining)
and your function would work. It wouldn't be the prettiest solution, but it would work. As it is, your code looks as if it assumes that changing the value of largest inside the recursive call will change it in the outside context from which it was called. But each new call to max creates a completely new version of largest which only exists inside that new iteration of the function. Your code then throws away the return value from max(remaining) and returns the original value of largest, which hasn't changed.
Another way to solve this would have been to use a local (inner) function after declaring var largest. That would have looked like this:
def max(xs: List[Int]): Int = {
var largest = xs.head
def loop(ys: List[Int]): Unit = {
if (!ys.isEmpty) {
var next = ys.head
largest = if (largest > next) largest else next
loop(ys.tail)
}
}
loop(xs.tail)
return largest
}
Generally, though, it is better to have recursive functions be entirely self-contained (that is, not to look at or change external variables but only at their input) and to return a meaningful value.
When writing a recursive solution of this kind, it often helps to think in reverse. Think first about what things are going to look like when you get to the end of the list. What is the exit condition? What will things look like and where will I find the value to return?
If you do this, then the case which you use to exit the recursive function (by returning a simple value rather than making another recursive call) is usually very simple. The other case matches just need to deal with a) invalid input and b) what to do if you are not yet at the end. a) is usually simple and b) can usually be broken down into just a few different situations, each with a simple thing to do before making another recursive call.
If you look at my solution, you'll see that the first case deals with invalid input, the second is my exit condition and the third is "what to do if we're not at the end".
In many other recursive solutions, Nil is the natural end of the recursion.
This is the point at which I (as always) recommend reading The Little Schemer. It teaches you recursion (and basic Scheme) at the same time (both of which are very good things to learn).
It has been pointed out that Scala has some powerful functions which can help you avoid recursion (or hide the messy details of it), but to use them well you really do need to understand how recursion works.
The following is a typical way to solve this sort of problem. It uses an inner tail-recursive function that includes an extra "accumulator" value, which in this case will hold the largest value found so far:
def max(xs: List[Int]): Int = {
def go(xs: List[Int], acc: Int): Int = xs match {
case Nil => acc // We've emptied the list, so just return the final result
case x :: rest => if (acc > x) go(rest, acc) else go(rest, x) // Keep going, with remaining list and updated largest-value-so-far
}
go(xs, Int.MinValue)
}
Never mind, I've resolved the issue...
I finally came up with:
def max(xs: List[Int]): Int = {
var largest = 0
var remaining = List[Int]()
if (!xs.isEmpty) {
largest = xs.head
if (!xs.tail.isEmpty) {
var next = xs.tail.head
largest = if (largest > next) largest else next
remaining = largest :: xs.tail.tail
}
}
if (remaining.nonEmpty && remaining.tail.nonEmpty) max(remaining) else largest
}
Kinda glad we have loops - this is an excessively complicated solution and hard to get your head around, in my opinion. I resolved the problem by making sure the recursive call is the last expression in the function; otherwise largest is returned as the result once there is no second member left in the list.
The most concise but also clear version I have ever seen is this:
def max(xs: List[Int]): Int = {
def maxIter(a: Int, xs: List[Int]): Int = {
if (xs.isEmpty) a
else a max maxIter(xs.head, xs.tail)
}
maxIter(xs.head, xs.tail)
}
This has been adapted from the solutions to a homework assignment of the official Scala Coursera course: https://github.com/purlin/Coursera-Scala/blob/master/src/example/Lists.scala
but here I use the rich operator max to return the largest of its two operands. This saves having to redefine this function within the def max block.
What about this?
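def max(xs: List[Int]): Int = {
  // Seed with the first element so that negative values are handled
  maxRecursive(xs.tail, xs.head)
}
def maxRecursive(xs: List[Int], max: Int): Int = {
  if (xs.isEmpty)
    max
  else if (xs.head > max)
    maxRecursive(xs.tail, xs.head)
  else
    maxRecursive(xs.tail, max)
}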
What about this one ?
def max(xs: List[Int]): Int = {
var largest = xs.head
if( !xs.tail.isEmpty ) {
if(xs.head < max(xs.tail)) largest = max(xs.tail)
}
largest
}
My answer using recursion is:
def max(xs: List[Int]): Int =
xs match {
case Nil => throw new NoSuchElementException("empty list is not allowed")
case head :: Nil => head
case head :: tail =>
if (head >= tail.head)
if (tail.length > 1)
max(head :: tail.tail)
else
head
else
max(tail)
}
One way is this
list.distinct.size != list.size
Is there any better way? It would have been nice to have a containsDuplicates method
Assuming "better" means "faster", see the alternative approaches benchmarked in this question, which seems to show some quicker methods (although note that distinct uses a HashSet and is already O(n)). YMMV of course, depending on specific test case, scala version etc. Probably any significant improvement over the "distinct.size" approach would come from an early-out as soon as a duplicate is found, but how much of a speed-up is actually obtained would depend strongly on how common duplicates actually are in your use-case.
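As a sketch of the early-out idea (not one of the benchmarked approaches; the names DupDemo and hasDuplicates are invented here), a mutable HashSet can be combined with exists, which stops at the first element for which the predicate holds:

```scala
import scala.collection.mutable

object DupDemo {
  def hasDuplicates[A](xs: Iterable[A]): Boolean = {
    val seen = mutable.HashSet.empty[A]
    // add returns false when the element was already in the set,
    // so exists stops at the first repeated element.
    xs.exists(x => !seen.add(x))
  }
}
```

The mutable set stays local to the function, so callers still see a pure function from a collection to a Boolean.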
If you mean "better" in that you want to write list.containsDuplicates instead of containsDuplicates(list), use an implicit:
implicit def enhanceWithContainsDuplicates[T](s:List[T]) = new {
def containsDuplicates = (s.distinct.size != s.size)
}
assert(List(1,2,2,3).containsDuplicates)
assert(!List("a","b","c").containsDuplicates)
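On Scala 2.10 and later, the same enrichment can also be written as an implicit class, which avoids the anonymous structural type returned by new { ... }. This is a sketch; DuplicateSyntax and DuplicateOps are hypothetical names:

```scala
object DuplicateSyntax {
  // Adds containsDuplicates to any List[T] once DuplicateSyntax._ is imported.
  implicit class DuplicateOps[T](s: List[T]) {
    def containsDuplicates: Boolean = s.distinct.size != s.size
  }
}

import DuplicateSyntax._
```

With the import in scope, List(1, 2, 2, 3).containsDuplicates can be called as if it were a method of List.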
You can also write:
list.toSet.size != list.size
But the result will be the same because distinct is already implemented with a Set. In both cases the time complexity should be O(n): you must traverse the list, and Set insertion is O(1).
I think this would stop as soon as a duplicate was found and is probably more efficient than doing distinct.size - since I assume distinct keeps a set as well:
@annotation.tailrec
def containsDups[A](list: List[A], seen: Set[A] = Set[A]()): Boolean =
list match {
case x :: xs => if (seen.contains(x)) true else containsDups(xs, seen + x)
case _ => false
}
containsDups(List(1,1,2,3))
// Boolean = true
containsDups(List(1,2,3))
// Boolean = false
I realize you asked for easy and I don't know that this version is, but finding a duplicate is also finding whether there is an element that has been seen before:
def containsDups[A](list: List[A]): Boolean = {
list.iterator.scanLeft(Set[A]())((set, a) => set + a) // incremental sets
.zip(list.iterator)
.exists{ case (set, a) => set contains a }
}
@annotation.tailrec
def containsDuplicates [T] (s: Seq[T]) : Boolean =
if (s.size < 2) false else
s.tail.contains (s.head) || containsDuplicates (s.tail)
I didn't measure this, and think it is similar to huynhjl's solution, but a bit more simple to understand.
It returns early, if a duplicate is found, so I looked into the source of Seq.contains, whether this returns early - it does.
In SeqLike, 'contains (e)' is defined as 'exists (_ == e)', and exists is defined in TraversableLike:
def exists (p: A => Boolean): Boolean = {
var result = false
breakable {
for (x <- this)
if (p (x)) { result = true; break }
}
result
}
I'm curious how to speed things up with parallel collections on multi cores, but I guess it is a general problem with early-returning, while another thread will keep running, because it doesn't know, that the solution is already found.
Summary:
I've written a very efficient function which returns both List.distinct and a List consisting of each element which appeared more than once and the index at which the element duplicate appeared.
Note: This answer is a straight copy of the answer on a related question.
Details:
If you need a bit more information about the duplicates themselves, like I did, I have written a more general function which iterates across a List (as ordering was significant) exactly once and returns a Tuple2 consisting of the original List deduped (all duplicates after the first are removed; i.e. the same as invoking distinct) and a second List showing each duplicate and an Int index at which it occurred within the original List.
Here's the function:
def filterDupes[A](items: List[A]): (List[A], List[(A, Int)]) = {
def recursive(remaining: List[A], index: Int, accumulator: (List[A], List[(A, Int)])): (List[A], List[(A, Int)]) =
if (remaining.isEmpty)
accumulator
else
recursive(
remaining.tail
, index + 1
, if (accumulator._1.contains(remaining.head))
(accumulator._1, (remaining.head, index) :: accumulator._2)
else
(remaining.head :: accumulator._1, accumulator._2)
)
val (distinct, dupes) = recursive(items, 0, (Nil, Nil))
(distinct.reverse, dupes.reverse)
}
And below is an example which might make it a bit more intuitive. Given this List of String values:
val withDupes =
List("a.b", "a.c", "b.a", "b.b", "a.c", "c.a", "a.c", "d.b", "a.b")
...and then performing the following:
val (deduped, dupeAndIndexs) =
filterDupes(withDupes)
...the results are:
deduped: List[String] = List(a.b, a.c, b.a, b.b, c.a, d.b)
dupeAndIndexs: List[(String, Int)] = List((a.c,4), (a.c,6), (a.b,8))
And if you just want the duplicates, you simply map across dupeAndIndexs and invoke distinct:
val dupesOnly =
dupeAndIndexs.map(_._1).distinct
...or all in a single call:
val dupesOnly =
filterDupes(withDupes)._2.map(_._1).distinct
...or if a Set is preferred, skip distinct and invoke toSet...
val dupesOnly2 =
dupeAndIndexs.map(_._1).toSet
...or all in a single call:
val dupesOnly2 =
filterDupes(withDupes)._2.map(_._1).toSet
This is a straight copy of the filterDupes function out of my open source Scala library, ScalaOlio. It's located at org.scalaolio.collection.immutable.List_._.
If you're trying to check for duplicates in a test then ScalaTest can be helpful.
import org.scalatest.Inspectors._
import org.scalatest.Matchers._
forEvery(list.distinct) { item =>
withClue(s"value $item, the number of occurences") {
list.count(_ == item) shouldBe 1
}
}
// example:
scala> val list = List(1,2,3,4,3,2)
list: List[Int] = List(1, 2, 3, 4, 3, 2)
scala> forEvery(list.distinct) { item => withClue(s"value $item, the number of occurences") { list.count(_ == item) shouldBe 1 } }
org.scalatest.exceptions.TestFailedException: forEvery failed, because:
at index 1, value 2, the number of occurences 2 was not equal to 1 (<console>:19),
at index 2, value 3, the number of occurences 2 was not equal to 1 (<console>:19)
in List(1, 2, 3, 4)