In the book "Scala for the Impatient", it says on page 16:
In Scala, a { } block contains a sequence of expressions, and the
result is also an expression. The value of the block is the value of
the last expression.
OK, then let's create a block and let the last value of the block be assigned:
scala> val evens = for (elem <- 1 to 10 if elem%2==0) {
| elem
| }
val evens: Unit = ()
I would have expected that evens is at least the last value of the sequence (i.e. 10). But why not?
You need to yield the value; then it's a for expression:
val evens = for (elem <- 1 to 10 if elem % 2 == 0) yield elem
Without that it's just a statement (does not return anything) and is translated to foreach.
P.S.: Of course this will return a collection of all the elements that fulfill the predicate and not the last one.
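For comparison, here is roughly what the yield version produces in the REPL (output from a recent Scala 2.13 REPL; the exact collection type shown can vary between versions):
scala> val evens = for (elem <- 1 to 10 if elem % 2 == 0) yield elem
val evens: IndexedSeq[Int] = Vector(2, 4, 6, 8, 10)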
When in doubt, just run it through the typechecker to peek under the hood:
scala -Xprint:typer -e 'val evens = for (elem <- 1 to 10 if elem%2==0) { elem }'
reveals
val evens: Unit =
scala.Predef
.intWrapper(1)
.to(10)
.withFilter(((elem: Int) => elem.%(2).==(0)))
.foreach[Int](((elem: Int) => elem))
where we see that foreach is the last step in the chain, and its signature is
def foreach[U](f: A => U): Unit
where we see it returns Unit. You can even do this straight from within the REPL by executing the following command
scala> :settings -Xprint:typer
and now you will get real-time desugaring of Scala expressions at the same time they are interpreted. You can even take it a step further and get at the JVM bytecode itself
scala> :javap -
For-comprehensions are some of the most prevalent syntactic sugar in Scala, so I would suggest drilling them as much as possible, perhaps by writing them in both their sugared and desugared forms at the same time until it clicks: https://docs.scala-lang.org/tutorials/FAQ/yield.html
Unit is the exception to the rule stated in your book. Unit basically says "ignore whatever type the block would have returned, because I only intended to execute the block for its side effects." Otherwise, in order to get it to typecheck, you'd have to add a unit value () to the end of any block that was supposed to return Unit:
val evens = for (elem <- 1 to 10 if elem % 2 == 0) {
  elem
  ()
}
This throwing away of type information is one reason people tend to avoid imperative for loops and similar in Scala.
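A common idiom that keeps the type information (a sketch of the usual style, not something from the book) is to build the collection with yield first and perform the side effects in a separate step, where Unit is expected anyway:
// The type (IndexedSeq[Int]) is preserved here
val evens = for (elem <- 1 to 10 if elem % 2 == 0) yield elem
// Side effects go in a separate step; foreach returning Unit is fine here
evens.foreach(println)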
I have a:
val a : Stream[Boolean] = ...
When I foldLeft it as follows
val b = a.foldLeft(false)(_||_)
Will it terminate when it finds the first true value in the stream? If not, how do I make it do so?
It would not terminate on the first true. You can use exists instead:
val b = a.exists(identity)
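A quick sanity check on an infinite stream (exists stops as soon as the predicate holds, so this returns immediately):
val a: Stream[Boolean] = Stream.continually(true)
val b = a.exists(identity)  // true, terminates right away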
No, it won't terminate early. This is easy to illustrate:
val a : Stream[Boolean] = Stream.continually(true)
// won't terminate because the stream is infinite
val b = a.foldLeft(false)(_||_)
stew showed that a simple solution to terminate early, in your specific case, is
val b = a.exists(identity)
Even simpler, this is equivalent to:
val b = a.contains(true)
A more general solution, which (unlike the above) is also applicable if you actually need a fold, is to use recursion (note that here I am assuming the stream is non-empty, for simplicity):
def myReduce( s: Stream[Boolean] ): Boolean = s.head || myReduce( s.tail )
val b = myReduce(a)
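A quick check that the short-circuiting of || really gives early termination here (assuming, as stated above, a non-empty stream):
val alwaysTrue: Stream[Boolean] = Stream.continually(true)
myReduce(alwaysTrue)  // true -- || never forces myReduce(s.tail)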
Now the interesting thing about the recursive solution is how it can be used in a more general use case where you actually need to accumulate the values in some way (which is what fold is for) and still terminate early. Say that you want to add the values of a stream of ints using an add method that will "terminate" early in a way similar to || (in this case, it does not evaluate its right-hand side if the left-hand side is >= 100):
def add(x: Int, y: => Int) = if ( x >= 100 ) x else x + y
val a : Stream[Int] = Stream.range(0, Int.MaxValue)
val b = a.foldLeft(0)(add(_, _))
The last line won't terminate, much like in your example. But you can fix it like this:
def myReduce( s: Stream[Int] ): Int = add( s.head, myReduce( s.tail ) )
val b = myReduce(a)
WARNING: there is a significant downside to this approach though: myReduce here is not tail recursive, meaning that it will blow the stack if it iterates over too many elements of the stream.
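If you need a genuine fold with both early termination and stack safety, one option is to make the recursion explicit and tail-recursive. This is only a sketch, and foldLeftUntil is a hypothetical helper, not a standard library method:
import scala.annotation.tailrec
// Folds from the left, but stops as soon as the accumulator satisfies `done`
def foldLeftUntil[A, B](s: Stream[A])(zero: B)(done: B => Boolean)(op: (B, A) => B): B = {
  @tailrec
  def loop(rest: Stream[A], acc: B): B =
    if (done(acc) || rest.isEmpty) acc
    else loop(rest.tail, op(acc, rest.head))
  loop(s, zero)
}
// Terminates even on the effectively infinite stream from the example above
val b = foldLeftUntil(Stream.range(0, Int.MaxValue))(0)(_ >= 100)(_ + _)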
Yet another solution, which does not blow the stack, is this:
val b = a.takeWhile(_ <= 100).foldLeft(0)(_ + _)
But I fear I have gone really too far on the off topic side, so I'd better stop now.
You could use takeWhile to extract the prefix of the Stream on which you want to operate and then apply foldLeft to that.
Can someone please explain the following output from the REPL?
I'm defining two (infinite) Streams that are otherwise identical in their definition, except that map is preceded by a . (period) in one definition and by a space in the other.
I can see that this would cause map to bind differently, but what happens to the 1 in the output from the second definition?
Thanks.
scala> lazy val infinite: Stream[Int] = 1 #:: infinite.map(_+1)
infinite: Stream[Int] = <lazy>
scala> val l = infinite.take(10).toList.mkString(",")
l: String = 1,2,3,4,5,6,7,8,9,10
scala> lazy val infinite2: Stream[Int] = 1 #:: infinite2 map(_+1)
infinite2: Stream[Int] = <lazy>
scala> val l2 = infinite2.take(10).toList.mkString(",")
l2: String = 2,3,4,5,6,7,8,9,10,11
It's about operator precedence (how infix method calls are grouped). This:
1 #:: infinite.map(_+1)
is quite straightforward while this:
1 #:: infinite2 map(_+1)
is interpreted by the compiler as:
(1 #:: infinite2) map(_+1)
1 #:: infinite2 is your desired stream, but before you return it, you apply a lazy transformation that adds one to every item. This explains why 1 never appears in the result: after the transformation it becomes 2.
For more details see: Operator precedence in Scala. Since #:: starts with the special character #, it has higher precedence than the alphanumeric map, so 1 #:: infinite2 is grouped first and map is applied to the whole resulting stream.
In the infinite2 case, what you've expressed is equivalent to the following:
lazy val infinite2: Stream[Int] = (1 #:: infinite2) map(_ + 1)
Since the map adds 1 to every element and the underlying stream starts with 1, the first element you see is 2, and 1 never appears.
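One quick way to convince yourself is to restore the intended grouping with parentheses; then the 1 shows up again (a minimal REPL check):
lazy val infinite3: Stream[Int] = 1 #:: (infinite3 map (_ + 1))
infinite3.take(5).toList  // List(1, 2, 3, 4, 5)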
I found this Scala issue: https://issues.scala-lang.org/browse/SI-4939
It seems we can define a value whose name is a number:
scala> object Foo { val 1 = 2 }
defined module Foo
But we can't invoke it:
scala> Foo.1
<console>:1: error: ';' expected but double literal found.
Foo.1
And we can invoke it inside the object:
scala> object O { val 1 = 1; def x = 1 }
defined module O
scala> O.x
res1: Int = 1
And the following will throw an error:
scala> object O { val 1 = 2; def x = 1 }
defined module O
scala> O.x
scala.MatchError: 2
at O$.<init>(<console>:5)
at O$.<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at RequestResult$.<init>(<console>:9)
I used scalac -Xprint:typer to see the generated code; the val 1 = 2 part is:
<synthetic> private[this] val x$1: Unit = (2: Int(2) #unchecked) match {
case 1 => ()
}
From this we can see the name was changed to x$1, and it can only be used inside that object.
And the resolution of that issue is: Won't Fix
I want to know: is there any reason to allow a number to be used as a name here? Is there any case where we would need such a "number" definition?
There is no name "1" being bound here. val 1 = 2 is a pattern-matching expression, in much the same way val (x, 2) = (1, 2) binds x to 1 (and would throw a MatchError if the second element were not the same). It's allowed because there's no real reason to add a special case to forbid it; this way, val pattern matching works (almost) exactly the same way as match pattern matching.
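To make the parallel concrete, every val whose left-hand side is a pattern works this way (a few small examples of my own, not from the question):
val x :: rest = List(1, 2, 3)  // x = 1, rest = List(2, 3)
val Some(n) = Option(42)       // n = 42
val 1 = 2                      // compiles, but throws a MatchError at runtime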
There are usually two factors in this kind of decision:
There are many bugs in Scalac that are much higher priority, and bug fixing resources are limited. This behavior is benign and therefore low priority.
There's a long term cost to any increases in the complexity of the language specification, and the current behavior is consistent with the spec. Once things start getting special cased, there can be an avalanche effect.
It's some combination of these two.
Update. Here's what seems strange to me:
val pair = (1, 2)
object Foo
object Bar
val (1, 2) = pair // Pattern matching on constants 1 and 2
val (Foo, Bar) = pair // Pattern matching on stable ids Foo and Bar
val (foo, bar) = pair // Binds foo and bar because they are lowercase
val 1 = 1 // Pattern matching on constant 1
val Foo = 1 // *Not* pattern matching; binds Foo
If val 1 = 1 is pattern matching, then why should val Foo = 1 bind Foo rather than pattern match?
Update 2. Daniel Sobral pointed out that this is a special exception, and Martin Odersky recently wrote the same.
Here are a few examples to show how the LHS of an assignment is more than just a name:
val pair = (1, 2)
val (a1, b1) = pair // LHS of the = is a pattern
val (1, b2) = pair // okay, b2 is bound to the value 2
val (0, b3) = pair // MatchError, as 0 != 1
val a4 = 1 // okay, a4 is bound to the value 1
val 1 = 1 // okay, but useless, no names are bound
val a @ 1 = 1 // well, we can bind a name to a pattern with @
val 1 = 0 // MatchError
As always, you can use backticks to escape the name. I see no problem in supporting such names: either you use them and they work for you, or they don't work for you and you don't use them.
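For illustration, this is roughly what the backtick escape looks like (a sketch; a backquoted identifier may contain characters that are not legal in a plain name):
object Foo { val `1` = 2 }  // backticks make "1" an ordinary identifier
Foo.`1`                     // 2 -- unlike val 1 = 2, this member is selectable from outside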
I understand Ruby and Python's yield. What does Scala's yield do?
I think the accepted answer is great, but it seems many people have failed to grasp some fundamental points.
First, Scala's for comprehensions are equivalent to Haskell's do notation, and they are nothing more than syntactic sugar for the composition of multiple monadic operations. As this statement will most likely not help anyone who needs help, let's try again… :-)
Scala's for comprehensions are syntactic sugar for composition of multiple operations with map, flatMap and filter. Or foreach. Scala actually translates a for-expression into calls to those methods, so any class providing them, or a subset of them, can be used with for comprehensions.
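As a tiny illustration of that last point, here is a hypothetical container (not a standard class) that becomes usable in for comprehensions just by defining map and flatMap:
final case class Box[A](value: A) {
  def map[B](f: A => B): Box[B] = Box(f(value))
  def flatMap[B](f: A => Box[B]): Box[B] = f(value)
}
val sum = for {
  a <- Box(1)
  b <- Box(2)
} yield a + b  // Box(3)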
First, let's talk about the translations. There are very simple rules:
This
for(x <- c1; y <- c2; z <- c3) {...}
is translated into
c1.foreach(x => c2.foreach(y => c3.foreach(z => {...})))
This
for(x <- c1; y <- c2; z <- c3) yield {...}
is translated into
c1.flatMap(x => c2.flatMap(y => c3.map(z => {...})))
This
for(x <- c; if cond) yield {...}
is translated on Scala 2.7 into
c.filter(x => cond).map(x => {...})
or, on Scala 2.8, into
c.withFilter(x => cond).map(x => {...})
with a fallback into the former if method withFilter is not available but filter is. Please see the section below for more information on this.
This
for(x <- c; y = ...) yield {...}
is translated into
c.map(x => (x, ...)).map { case (x, y) => {...} }
When you look at very simple for comprehensions, the map/foreach alternatives look, indeed, better. Once you start composing them, though, you can easily get lost in parenthesis and nesting levels. When that happens, for comprehensions are usually much clearer.
I'll show one simple example, and intentionally omit any explanation. You can decide which syntax was easier to understand.
l.flatMap(sl => sl.filter(el => el > 0).map(el => el.toString.length))
or
for {
  sl <- l
  el <- sl
  if el > 0
} yield el.toString.length
withFilter
Scala 2.8 introduced a method called withFilter, whose main difference is that, instead of returning a new, filtered collection, it filters on demand. The filter method has its behavior defined based on the strictness of the collection. To understand this better, let's take a look at some Scala 2.7 examples with List (strict) and Stream (non-strict):
scala> var found = false
found: Boolean = false
scala> List.range(1,10).filter(_ % 2 == 1 && !found).foreach(x => if (x == 5) found = true else println(x))
1
3
7
9
scala> found = false
found: Boolean = false
scala> Stream.range(1,10).filter(_ % 2 == 1 && !found).foreach(x => if (x == 5) found = true else println(x))
1
3
The difference happens because filter is immediately applied with List, returning a list of odds -- since found is false. Only then is foreach executed, but by this time, changing found is meaningless, as filter has already executed.
In the case of Stream, the condition is not immediately applied. Instead, as each element is requested by foreach, filter tests the condition, which enables foreach to influence it through found. Just to make it clear, here is the equivalent for-comprehension code:
for (x <- List.range(1, 10); if x % 2 == 1 && !found)
if (x == 5) found = true else println(x)
for (x <- Stream.range(1, 10); if x % 2 == 1 && !found)
if (x == 5) found = true else println(x)
This caused many problems, because people expected the if to be considered on-demand, instead of being applied to the whole collection beforehand.
Scala 2.8 introduced withFilter, which is always non-strict, no matter the strictness of the collection. The following example shows List with both methods on Scala 2.8:
scala> var found = false
found: Boolean = false
scala> List.range(1,10).filter(_ % 2 == 1 && !found).foreach(x => if (x == 5) found = true else println(x))
1
3
7
9
scala> found = false
found: Boolean = false
scala> List.range(1,10).withFilter(_ % 2 == 1 && !found).foreach(x => if (x == 5) found = true else println(x))
1
3
This produces the result most people expect, without changing how filter behaves. As a side note, Range was changed from non-strict to strict between Scala 2.7 and Scala 2.8.
It is used in sequence comprehensions (like Python's list-comprehensions and generators, where you may use yield too).
It is applied in combination with for and writes a new element into the resulting sequence.
Simple example (from scala-lang)
/** Turn command line arguments to uppercase */
object Main {
  def main(args: Array[String]): Unit = {
    val res = for (a <- args) yield a.toUpperCase
    println("Arguments: " + res.mkString(" "))
  }
}
The corresponding expression in F# would be
[ for a in args -> a.ToUpper() ]
or
from a in args select a.ToUpper()
in Linq.
Ruby's yield has a different effect.
Yes, as Earwicker said, it's pretty much the equivalent to LINQ's select and has very little to do with Ruby's and Python's yield. Basically, where in C# you would write
from ... select ???
in Scala you have instead
for ... yield ???
It's also important to understand that for-comprehensions don't just work with sequences, but with any type which defines certain methods, just like LINQ (see the sketch after this list):
If your type defines just map, it allows for-expressions consisting of a single generator.
If it defines flatMap as well as map, it allows for-expressions consisting of several generators.
If it defines foreach, it allows for-loops without yield (both with single and multiple generators).
If it defines filter, it allows for-filter expressions starting with an if in the for expression.
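Here is a rough sketch of how each of those cases desugars, using Option and List from the standard library (the compiler output is a little more involved, but this is the shape):
val o = Option(2)
for (x <- o) yield x + 1                   // o.map(x => x + 1)
for (x <- o; y <- Option(3)) yield x + y   // o.flatMap(x => Option(3).map(y => x + y))
for (x <- List(1, 2, 3)) println(x)        // List(1, 2, 3).foreach(println)
for (x <- o if x > 1) yield x              // o.withFilter(x => x > 1).map(x => x)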
Unless you get a better answer from a Scala user (which I'm not), here's my understanding.
It only appears as part of an expression beginning with for, which states how to generate a new list from an existing list.
Something like:
var doubled = for (n <- original) yield n * 2
So there's one output item for each input (although I believe there's a way of dropping duplicates).
This is quite different from the "imperative continuations" enabled by yield in other languages, where it provides a way to generate a list of any length, from some imperative code with almost any structure.
(If you're familiar with C#, it's closer to LINQ's select operator than it is to yield return).
Consider the following for-comprehension
val A = for (i <- Int.MinValue to Int.MaxValue; if i > 3) yield i
It may be helpful to read it out loud as follows
"For each integer i, if it is greater than 3, then yield (produce) i and add it to the list A."
In terms of mathematical set-builder notation, the above for-comprehension is analogous to
A = { i ∈ ℤ : i > 3 }
which may be read as
"For each integer i, if it is greater than 3, then it is a member of the set A."
or alternatively as
"A is the set of all integers i, such that each i is greater than 3."
The keyword yield in Scala is simply syntactic sugar which can be easily replaced by a map, as Daniel Sobral already explained in detail.
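For example, the simplest single-generator case is literally a map (a minimal comparison; both lines produce List(2, 4, 6)):
val a = for (n <- List(1, 2, 3)) yield n * 2
val b = List(1, 2, 3).map(n => n * 2)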
On the other hand, yield is absolutely misleading if you are looking for generators (or continuations) similar to those in Python. See this SO thread for more information: What is the preferred way to implement 'yield' in Scala?
yield works as if the for loop had a buffer we cannot see: on each iteration, the next yielded value is added to that buffer, and when the for loop finishes running, it returns the collection of all the yielded values. yield can be used with simple arithmetic expressions or in combination with multiple collections.
Here are two simple examples for your better understanding
scala> for (i <- 1 to 5) yield i * 3
res: scala.collection.immutable.IndexedSeq[Int] = Vector(3, 6, 9, 12, 15)
scala> val nums = Seq(1,2,3)
nums: Seq[Int] = List(1, 2, 3)
scala> val letters = Seq('a', 'b', 'c')
letters: Seq[Char] = List(a, b, c)
scala> val res = for {
| n <- nums
| c <- letters
| } yield (n, c)
res: Seq[(Int, Char)] = List((1,a), (1,b), (1,c), (2,a), (2,b), (2,c), (3,a), (3,b), (3,c))
Hope this helps!!
val aList = List( 1,2,3,4,5 )
val res3 = for ( al <- aList if al > 3 ) yield al + 1
val res4 = aList.filter(_ > 3).map(_ + 1)
println( res3 )
println( res4 )
These two pieces of code are equivalent.
val res3 = for (al <- aList) yield al + 1 > 3
val res4 = aList.map( _+ 1 > 3 )
println( res3 )
println( res4 )
These two pieces of code are also equivalent.
Map is as flexible as yield and vice-versa.
val doubledNums = for (n <- nums) yield n * 2
val ucNames = for (name <- names) yield name.capitalize
Notice that both of those for-expressions use the yield keyword:
Using yield after for is the “secret sauce” that says, “I want to yield a new collection from the existing collection that I’m iterating over in the for-expression, using the algorithm shown.”
taken from here
The Scala documentation clearly says "yield a new collection from the existing collection".
Another Scala documentation says, "Scala offers a lightweight notation for expressing sequence comprehensions. Comprehensions have the form for (enums) yield e, where enums refers to a semicolon-separated list of enumerators. An enumerator is either a generator which introduces new variables, or it is a filter. "
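Putting the quoted for (enums) yield e form to work with one generator and one filter (a small example of my own, not from the documentation):
val squaresOfEvens = for (n <- 1 to 10 if n % 2 == 0) yield n * n
// Vector(4, 16, 36, 64, 100)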
yield, combined with a filter in the for, is more flexible than a plain map(); see the example below:
val aList = List( 1,2,3,4,5 )
val res3 = for ( al <- aList if al > 3 ) yield al + 1
val res4 = aList.map( _ + 1 > 3 )
println( res3 )
println( res4 )
The yield version will produce a result like List(5, 6), which is probably what you want,
while the map() version will return a result like List(false, false, true, true, true), which probably is not what you intend.