I'm new to Scala and I want to write a higher-order function (say "partition2") that takes a list of integers and a function that returns either true or false. The output would be a list of values for which the function is true and a list of values for which the function is false. I'd like to implement this using a fold. I know something like this would be a really straightforward way to do this:
val (passed, failed) = List(49, 58, 76, 82, 88, 90) partition ( _ > 60 )
I'm wondering how this same logic could be applied using a fold.
You can start by thinking about what you want your accumulator to look like. In many cases it'll have the same type as the thing you want to end up with, and that works here—you can use two lists to keep track of the elements that passed and failed. Then you just need to write the cases and add the element to the appropriate list:
List(49, 58, 76, 82, 88, 90).foldRight((List.empty[Int], List.empty[Int])) {
  case (i, (passed, failed)) if i > 60 => (i :: passed, failed)
  case (i, (passed, failed))           => (passed, i :: failed)
}
I'm using a right fold here because prepending to a list is nicer than the alternative, but you could easily rewrite it to use a left fold.
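For completeness, the left-fold rewrite could look something like this (a sketch): with foldLeft the accumulator tuple comes first in the function, and because elements are prepended in traversal order, both lists come out reversed and need a final reverse.

```scala
val (passed, failed) =
  List(49, 58, 76, 82, 88, 90).foldLeft((List.empty[Int], List.empty[Int])) {
    case ((passed, failed), i) if i > 60 => (i :: passed, failed)
    case ((passed, failed), i)           => (passed, i :: failed)
  }

// foldLeft prepends as it goes, so each list is in reverse order at the end.
val result = (passed.reverse, failed.reverse)
// result: (List(76, 82, 88, 90), List(49, 58))
```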
You can do this:
List(49, 58, 76, 82, 88, 90).foldLeft((Vector.empty[Int], Vector.empty[Int])){
  case ((passed, failed), x) =>
    if (x > 60) (passed :+ x, failed)
    else (passed, failed :+ x)
}
Basically you have two accumulators, and as you visit each element, you add it to the appropriate accumulator.
For example, you might have millions of elements but typically only need to examine the first M (e.g. you are accumulating a sum that saturates at some max value, or you are building some other complex data structure and are finished after examining the first M elements). foldLeft always forces you to iterate over the entire sequence. Ideally, you could supply a predicate that lets foldLeft know you are done.
If scanLeft is evaluated lazily (?), perhaps scanLeft combined with find (find the first valid element) can accomplish this. I believe something like this would work in Haskell, but I'm not sure about Scala.
numbers.scanLeft(0)((a, b) => a + b).find(_ >= 100)
So if numbers = List(100,0,9,10), then scanLeft will only look at the first element.
scanLeft produces a lazy collection when applied to an already lazy collection, like Iterator or LazyList (Stream prior to 2.13). Thus, you can use it to abort early.
For example:
LazyList.continually(100)
.scanLeft(0) { case (acc, n) => acc + n }
.takeWhile(_ < 1000)
.toList
// res: List[Int] = List(0, 100, 200, 300, 400, 500, 600, 700, 800, 900)
List(0, 100, 5, 300)
.iterator
.map(i => { println(i); i })
.scanLeft(0) { case (acc, n) => acc + n }
.find(_ >= 100)
// 0
// 100
// res: Option[Int] = Some(100)
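If you'd rather keep a fold-shaped API, a small helper can wrap an iterator and stop as soon as a predicate on the accumulator holds. The name foldLeftUntil is made up here; this is just a sketch of the idea:

```scala
// Hypothetical helper: folds until `done` holds for the accumulator,
// then stops without visiting the remaining elements.
def foldLeftUntil[A, B](xs: Iterable[A])(z: B)(done: B => Boolean)(op: (B, A) => B): B = {
  val it  = xs.iterator
  var acc = z
  while (it.hasNext && !done(acc)) acc = op(acc, it.next())
  acc
}

val sum = foldLeftUntil(List(100, 0, 9, 10))(0)(_ >= 100)(_ + _)
// sum: Int = 100 -- only the first element is examined
```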
I expected the following code to output Seq(0); instead it returns a function?
# Seq(0).orElse(Seq(1))
res2: PartialFunction[Int, Int] = <function1>
At first I suspected that, via syntactic sugar, orElse was being called on the apply function, but that isn't it, as this attempt shows:
# Seq(0).apply.orElse(Seq(1))
cmd3.sc:1: missing argument list for method apply in trait SeqLike
....(omit)
I checked in IntelliJ that there's no implicit conversion.
What happens?
EDIT:
What I want is:
Seq.empty.orElse(Seq(1)) == Seq(1)
Seq(0).orElse(Seq(1)) == Seq(0)
Thanks to @AndreyTyukin's answer.
In one line: orElse has different semantics on different types. Seq inherits from PartialFunction, not Option, so it gets PartialFunction's orElse behavior.
The Seq(0) is treated as a PartialFunction that is defined only at index 0, and produces as result the constant value 0 if it is given the only valid input 0.
When you invoke orElse with Seq(1), a new partial function is constructed that first tries to apply Seq(0), and, if the input is not in the domain of definition of Seq(0), falls back to Seq(1). Since the domain of Seq(1) is the same as the domain of Seq(0) (namely just {0}), the orElse does essentially nothing in this case, and returns a partial function equivalent to Seq(0).
So, the result is again a partial function defined at 0 that gives 0 if it is passed the only valid input 0.
Here is a non-degenerate example with sequences of different length, which hopefully makes it easier to understand what the orElse method is for:
val f = Seq(1,2,3).orElse(Seq(10, 20, 30, 40, 50))
is a partial function:
f: PartialFunction[Int,Int] = <function1>
Here is how it maps values 0 to 4:
0 to 4 map f
// Output: Vector(1, 2, 3, 40, 50)
That is, it uses first three values from the first sequence, and falls back to the second sequence passed to orElse for inputs 3 and 4.
This also works with arbitrary partial functions, not only sequences:
scala> val g = Seq(42,43,44).orElse[Int, Int]{ case n => n * n }
g: PartialFunction[Int,Int] = <function1>
scala> 0 to 10 map g
res7 = Vector(42, 43, 44, 9, 16, 25, 36, 49, 64, 81, 100)
If you wanted to select between two sequences without treating them as partial functions, you might consider using
Option(Seq(0)).getOrElse(Seq(1))
This will return Seq(0), if this is what you wanted.
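Note that Option(Seq(0)) only falls back when the reference itself is null; it does not give you the empty-sequence fallback from the question's EDIT, since Option of a non-null empty Seq is still a Some. One way to get those semantics (a sketch, with a made-up helper name) is to filter on nonEmpty:

```scala
// Falls back to the alternative only when the first sequence is empty.
def firstNonEmpty[A](s: Seq[A], fallback: Seq[A]): Seq[A] =
  Some(s).filter(_.nonEmpty).getOrElse(fallback)

firstNonEmpty(Seq.empty[Int], Seq(1)) // Seq(1)
firstNonEmpty(Seq(0), Seq(1))         // Seq(0)
```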
When working with large collections, we usually hear the term "lazy evaluation". I want to better demonstrate the difference between strict and lazy evaluation, so I tried the following example - getting the first two even numbers from a list:
scala> var l = List(1, 47, 38, 53, 51, 67, 39, 46, 93, 54, 45, 33, 87)
l: List[Int] = List(1, 47, 38, 53, 51, 67, 39, 46, 93, 54, 45, 33, 87)
scala> l.filter(_ % 2 == 0).take(2)
res0: List[Int] = List(38, 46)
scala> l.toStream.filter(_ % 2 == 0).take(2)
res1: scala.collection.immutable.Stream[Int] = Stream(38, ?)
I noticed that when I'm using toStream, I'm getting Stream(38, ?). What does the "?" mean here? Does this have something to do with lazy evaluation?
Also, what are some good example of lazy evaluation, when should I use it and why?
One benefit of using lazy collections is to "save" memory, e.g. when mapping to large data structures. Consider this:
val r = (1 to 10000)
.map(_ => Seq.fill(10000)(scala.util.Random.nextDouble))
.map(_.sum)
.sum
And using lazy evaluation:
val r = (1 to 10000).toStream
.map(_ => Seq.fill(10000)(scala.util.Random.nextDouble))
.map(_.sum)
.sum
The first statement generates 10000 Seqs of size 10000 and keeps them all in memory, while in the second case only one Seq at a time needs to exist in memory, which saves a lot of memory (and the reduced allocation pressure can also make it faster).
Another use case is when only a part of the data is actually needed. I often use lazy collections together with take, takeWhile, etc.
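For example (a small sketch), an Iterator stops pulling elements as soon as it has collected what was asked for; the counter here just makes the short-circuiting visible:

```scala
// Count how many elements are inspected while taking the first two even numbers.
var inspected = 0
val firstTwoEvens = List(1, 47, 38, 53, 46, 93, 54)
  .iterator
  .map { x => inspected += 1; x }   // side effect only to observe laziness
  .filter(_ % 2 == 0)
  .take(2)
  .toList
// firstTwoEvens: List(38, 46); only 5 elements were inspected,
// the rest of the list was never visited.
```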
Let's take a real life scenario - Instead of having a list, you have a big log file that you want to extract first 10 lines that contains "Success".
The straight forward solution would be reading the file line-by-line, and once you have a line that contains "Success", print it and continue to the next line.
But since we love functional programming, we don't want to use the traditional loops. Instead, we want to achieve our goal by composing functions.
First attempt:
Source.fromFile("log_file").getLines.toList.filter(_.contains("Success")).take(10)
Let's try to understand what actually happened here:
we read the whole file
we filtered the relevant lines
we took the first 10 elements
If we try to print Source.fromFile("log_file").getLines.toList, we will get the whole file, which is obviously a waste, since not all lines are relevant for us.
Why did we get all the lines first and only then perform the filtering? Because List is a strict data structure: calling toList evaluates it immediately, and only once the whole file is in memory is the filter applied.
Luckily, Scala provides lazy data structures, and Stream is one of them:
Source.fromFile("log_file").getLines.toStream.filter(_.contains("Success")).take(10)
In order to demonstrate the difference, let's try:
Source.fromFile("log_file").getLines.toStream
Now we get something like:
scala.collection.immutable.Stream[String] = Stream(That's the first line, ?)
toStream evaluates only one element - the first line of the file. The next element is represented by "?", which indicates that the stream has not yet evaluated it. That's because toStream is lazy, and the next item is evaluated only when it is used.
Now, after we apply the filter function, the stream will keep reading lines until it reaches the first line that contains "Success":
> var res = Source.fromFile("log_file").getLines.toStream.filter(_.contains("Success"))
scala.collection.immutable.Stream[String] = Stream(First line contains Success!, ?)
Now we apply the take function. Still no action is performed; the stream just knows it should pick at most 10 lines, and nothing is evaluated until we use the result:
res foreach println
Finally, if we now print res, we'll get a Stream containing the first 10 matching lines, as we expected.
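To make the same pattern runnable without an actual file, an in-memory iterator can stand in for getLines (the log lines here are fabricated for illustration):

```scala
// Stand-in for Source.fromFile("log_file").getLines: an iterator over made-up lines.
val lines = Iterator("Start", "Success: job 1", "Failure: job 2", "Success: job 3", "End")

// Lazily keeps only matching lines and stops after the first two matches;
// lines after the second match are never examined.
val firstSuccesses = lines
  .filter(_.contains("Success"))
  .take(2)
  .toList
// firstSuccesses: List(Success: job 1, Success: job 3)
```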
I would like to write succinct code to map over a list, accumulating a value as I go and using that value in the output list.
Using a recursive function and pattern matching, this is straightforward (see below). But I was wondering if there is a way to do this using the functional programming family of combinators like map and fold, etc. Obviously, map and fold are no good unless you use a mutable variable defined outside the call and modify it in the body.
Perhaps I could do this with a State Monad but was wondering if there is a way to do it that I'm missing, and that utilizes the Scala standard library.
// accumulate(List(10, 20, 20, 30, 20))
// => List(10, 30, 50, 80, 100)
def accumulate(weights: List[Int], sum: Int = 0, acc: List[Int] = List.empty): List[Int] =
  weights match {
    case hd :: tl =>
      val total = hd + sum
      accumulate(tl, total, total :: acc)
    case Nil =>
      acc.reverse
  }
You may also use foldLeft:
def accumulate(seq: Seq[Int]) =
  seq.foldLeft(Vector.empty[Int]) { (result, e) =>
    result :+ (result.lastOption.getOrElse(0) + e)
  }
accumulate(List(10, 20, 20, 30, 20))
// => Vector(10, 30, 50, 80, 100)
This could be done with scan:
val result = list.scanLeft(0) { case (acc, item) => acc + item }
scanLeft includes the initial value 0 in the output, so you have to drop it:
result.drop(1)
As pointed out in @Nyavro's answer, the operation you are looking for (the sums of the prefixes of the list) is called a prefix sum, and its generalization to any binary operation is called scan and is included in the Scala standard library:
val l = List(10, 20, 20, 30, 20)
l.scan(0) { _ + _ }
//=> List(0, 10, 30, 50, 80, 100)
l.scan(0)(_ + _).drop(1)
//=> List(10, 30, 50, 80, 100)
This has already been answered, but I wanted to address a misconception in your question:
Obviously map and fold are no good unless you use a mutable variable defined outside the call and modify that in the body.
That is not true. fold is a general method of iteration. Everything you can do by iterating over a collection, you can do with fold. If fold were the only method in your List class, you could still do everything you can do now. Here's how to solve your problem with fold:
l.foldLeft(List(0)) { (list, el) ⇒ list.head + el :: list }.reverse.drop(1)
And a general implementation of scan:
def scan[A](l: List[A])(z: A)(op: (A, A) ⇒ A) =
  l.foldLeft(List(z)) { (list, el) ⇒ op(list.head, el) :: list }.reverse
Think of it this way: a collection can be either empty or not. fold has two arguments, one which tells it what to do when the list is empty, and one which tells it what to do when the list is not empty. Those are the only two cases, so every possible case is handled. Therefore, fold can do anything! (More precisely in Scala, foldLeft and foldRight can do anything, while fold is restricted to associative operations.)
Or a different viewpoint: a collection is a stream of instructions, either the EMPTY instruction or the ELEMENT(value) instruction. foldLeft / foldRight are skeleton interpreters for that instruction set, and you as a programmer can supply the implementation for the interpretation of both those instructions, namely the two arguments to foldLeft / foldRight are the interpretation of those instructions.
Remember: while foldLeft / foldRight reduces a collection to a single value, that value can be arbitrarily complex, including being a collection itself!
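To illustrate that universality (a sketch, not part of the original answer): map and filter fall out of foldRight directly, with the accumulator being the collection under construction.

```scala
// map expressed purely as a right fold: apply f and prepend.
def mapViaFold[A, B](l: List[A])(f: A => B): List[B] =
  l.foldRight(List.empty[B])((el, acc) => f(el) :: acc)

// filter expressed purely as a right fold: prepend only when the predicate holds.
def filterViaFold[A](l: List[A])(p: A => Boolean): List[A] =
  l.foldRight(List.empty[A])((el, acc) => if (p(el)) el :: acc else acc)

mapViaFold(List(1, 2, 3))(_ * 2)            // List(2, 4, 6)
filterViaFold(List(1, 2, 3, 4))(_ % 2 == 0) // List(2, 4)
```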
1) Is it possible to iterate through an Array using while loop in Scala?
2) How to find the numbers that are greater than 50 using reduce loop?
val reduce_left_list=List(12,34,54,50,82,34,78,90,3,45,43,1,2343,234)
val greatest_num=reduce_left_list.reduceLeft((x:Int)=> { for(line <- reduce_left_list) line > 50)
1) Is it possible to iterate through an Array using while loop in Scala?
That depends on your definition of "iterating through an array". You can certainly do the same thing that you would do in C, for example: take an integer, increase it by 1 in every iteration of the loop, stop when it equals the size of the array, and use this integer as an index into the array:
val anArray = Array('A, 'B, 'C, 'D)
var i = 0
val s = anArray.size
while (i < s) {
  println(anArray(i))
  i += 1
}
// 'A
// 'B
// 'C
// 'D
But I wouldn't call this "iterating through an array". You are iterating through integers, not the array.
And besides, why would you want to do that, if you can just tell the array to iterate itself?
anArray foreach println
// 'A
// 'B
// 'C
// 'D
If you absolutely insist on juggling indices yourself (but again, why would you want to), there are much better ways available than using a while loop. You could, for example, iterate over a Range:
(0 until s) foreach (i ⇒ println(anArray(i)))
Or written using a for comprehension:
for (i ← 0 until s) println(anArray(i))
Loops are never idiomatic in Scala. While Scala does allow side-effects, it is generally idiomatic to avoid them and strive for referential transparency. Albert Einstein is quoted as saying "Insanity is doing the same thing and expecting a different result", but that's exactly what we expect a loop to do: the loop executes the same code over and over, but we expect it to do a different thing every time (or at least once, namely, stop the loop). According to Einstein, loops are insane, and who are we to defy Einstein?
Seriously, though: loops cannot work without side-effects, but the Scala community tries to avoid side-effects, so the Scala community tries to avoid loops.
2) How to find the numbers that are greater than 50 using reduce loop?
There is no such thing as a "reduce loop" in Scala. I assume you mean the reduce method.
The answer is: No. The types don't line up. reduce returns a value of the same type as the element type of the collection, but you want to return a collection of elements.
You can, however, use a fold, more precisely, a right fold:
(reduce_left_list :\ List.empty[Int])((el, acc) => if (el > 50) el :: acc else acc)
//=> List(54, 82, 78, 90, 2343, 234)
You can also use a left fold if you reverse the result afterwards:
(List.empty[Int] /: reduce_left_list)((acc, el) => if (el > 50) el :: acc else acc).reverse
//=> List(54, 82, 78, 90, 2343, 234)
If you try appending the element to the result instead, your runtime will be quadratic instead of linear:
(List.empty[Int] /: reduce_left_list)((acc, el) => if (el > 50) acc :+ el else acc)
//=> List(54, 82, 78, 90, 2343, 234)
However, saying that "you can do this using a left/right fold" is tautological: left/right fold is universal, which means that anything you can do with a collection, can be done with a left/right fold. Which means that using a left/right fold is not very intention-revealing: since a left/right fold can do anything, seeing a left/right fold in the code doesn't tell the reader anything about what's going on.
So, whenever possible, you should use a more specialized operation with a more intention-revealing name. In this particular case, you want to filter out some particular elements that satisfy a predicate. And the Scala collections API actually has a method that filters, and it is called (surprise!) filter:
reduce_left_list filter (_ > 50)
//=> List(54, 82, 78, 90, 2343, 234)
Alternatively, you can use withFilter instead:
reduce_left_list withFilter (_ > 50)
The difference is that filter returns a new list, whereas withFilter returns an instance of FilterMonadic, which is a view of the existing list that only includes the elements that satisfy the predicate.
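A quick sketch of that difference: withFilter defers the predicate until a subsequent map (or foreach), so the predicate and the mapping fuse into a single pass with no intermediate list.

```scala
val l = List(12, 34, 54, 50, 82)

// filter builds an intermediate list, which map then traverses again.
val strict = l.filter(_ > 50).map(_ * 10)

// withFilter applies the predicate lazily inside the map: one traversal,
// no intermediate list is materialized.
val fused = l.withFilter(_ > 50).map(_ * 10)

// Both yield List(540, 820).
```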
Maybe you want to try filter:
List(12,34,54,50,82,34,78,90,3,45,43,1,2343,234).filter(_ > 50)