What is Scala's yield? - scala

I understand Ruby and Python's yield. What does Scala's yield do?

I think the accepted answer is great, but it seems many people have failed to grasp some fundamental points.
First, Scala's for comprehensions are equivalent to Haskell's do notation, and it is nothing more than a syntactic sugar for composition of multiple monadic operations. As this statement will most likely not help anyone who needs help, let's try again… :-)
Scala's for comprehensions is syntactic sugar for composition of multiple operations with map, flatMap and filter. Or foreach. Scala actually translates a for-expression into calls to those methods, so any class providing them, or a subset of them, can be used with for comprehensions.
First, let's talk about the translations. There are very simple rules:
This
for(x <- c1; y <- c2; z <-c3) {...}
is translated into
c1.foreach(x => c2.foreach(y => c3.foreach(z => {...})))
This
for(x <- c1; y <- c2; z <- c3) yield {...}
is translated into
c1.flatMap(x => c2.flatMap(y => c3.map(z => {...})))
This
for(x <- c; if cond) yield {...}
is translated on Scala 2.7 into
c.filter(x => cond).map(x => {...})
or, on Scala 2.8, into
c.withFilter(x => cond).map(x => {...})
with a fallback into the former if method withFilter is not available but filter is. Please see the section below for more information on this.
This
for(x <- c; y = ...) yield {...}
is translated into
c.map(x => (x, ...)).map((x,y) => {...})
When you look at very simple for comprehensions, the map/foreach alternatives look, indeed, better. Once you start composing them, though, you can easily get lost in parenthesis and nesting levels. When that happens, for comprehensions are usually much clearer.
I'll show one simple example, and intentionally omit any explanation. You can decide which syntax was easier to understand.
l.flatMap(sl => sl.filter(el => el > 0).map(el => el.toString.length))
or
for {
sl <- l
el <- sl
if el > 0
} yield el.toString.length
withFilter
Scala 2.8 introduced a method called withFilter, whose main difference is that, instead of returning a new, filtered, collection, it filters on-demand. The filter method has its behavior defined based on the strictness of the collection. To understand this better, let's take a look at some Scala 2.7 with List (strict) and Stream (non-strict):
scala> var found = false
found: Boolean = false
scala> List.range(1,10).filter(_ % 2 == 1 && !found).foreach(x => if (x == 5) found = true else println(x))
1
3
7
9
scala> found = false
found: Boolean = false
scala> Stream.range(1,10).filter(_ % 2 == 1 && !found).foreach(x => if (x == 5) found = true else println(x))
1
3
The difference happens because filter is immediately applied with List, returning a list of odds -- since found is false. Only then foreach is executed, but, by this time, changing found is meaningless, as filter has already executed.
In the case of Stream, the condition is not immediatelly applied. Instead, as each element is requested by foreach, filter tests the condition, which enables foreach to influence it through found. Just to make it clear, here is the equivalent for-comprehension code:
for (x <- List.range(1, 10); if x % 2 == 1 && !found)
if (x == 5) found = true else println(x)
for (x <- Stream.range(1, 10); if x % 2 == 1 && !found)
if (x == 5) found = true else println(x)
This caused many problems, because people expected the if to be considered on-demand, instead of being applied to the whole collection beforehand.
Scala 2.8 introduced withFilter, which is always non-strict, no matter the strictness of the collection. The following example shows List with both methods on Scala 2.8:
scala> var found = false
found: Boolean = false
scala> List.range(1,10).filter(_ % 2 == 1 && !found).foreach(x => if (x == 5) found = true else println(x))
1
3
7
9
scala> found = false
found: Boolean = false
scala> List.range(1,10).withFilter(_ % 2 == 1 && !found).foreach(x => if (x == 5) found = true else println(x))
1
3
This produces the result most people expect, without changing how filter behaves. As a side note, Range was changed from non-strict to strict between Scala 2.7 and Scala 2.8.

It is used in sequence comprehensions (like Python's list-comprehensions and generators, where you may use yield too).
It is applied in combination with for and writes a new element into the resulting sequence.
Simple example (from scala-lang)
/** Turn command line arguments to uppercase */
object Main {
def main(args: Array[String]) {
val res = for (a <- args) yield a.toUpperCase
println("Arguments: " + res.toString)
}
}
The corresponding expression in F# would be
[ for a in args -> a.toUpperCase ]
or
from a in args select a.toUpperCase
in Linq.
Ruby's yield has a different effect.

Yes, as Earwicker said, it's pretty much the equivalent to LINQ's select and has very little to do with Ruby's and Python's yield. Basically, where in C# you would write
from ... select ???
in Scala you have instead
for ... yield ???
It's also important to understand that for-comprehensions don't just work with sequences, but with any type which defines certain methods, just like LINQ:
If your type defines just map, it allows for-expressions consisting of a
single generator.
If it defines flatMap as well as map, it allows for-expressions consisting
of several generators.
If it defines foreach, it allows for-loops without yield (both with single and multiple generators).
If it defines filter, it allows for-filter expressions starting with an if
in the for expression.

Unless you get a better answer from a Scala user (which I'm not), here's my understanding.
It only appears as part of an expression beginning with for, which states how to generate a new list from an existing list.
Something like:
var doubled = for (n <- original) yield n * 2
So there's one output item for each input (although I believe there's a way of dropping duplicates).
This is quite different from the "imperative continuations" enabled by yield in other languages, where it provides a way to generate a list of any length, from some imperative code with almost any structure.
(If you're familiar with C#, it's closer to LINQ's select operator than it is to yield return).

Consider the following for-comprehension
val A = for (i <- Int.MinValue to Int.MaxValue; if i > 3) yield i
It may be helpful to read it out loud as follows
"For each integer i, if it is greater than 3, then yield (produce) i and add it to the list A."
In terms of mathematical set-builder notation, the above for-comprehension is analogous to
which may be read as
"For each integer , if it is greater than , then it is a member of the set ."
or alternatively as
" is the set of all integers , such that each is greater than ."

The keyword yield in Scala is simply syntactic sugar which can be easily replaced by a map, as Daniel Sobral already explained in detail.
On the other hand, yield is absolutely misleading if you are looking for generators (or continuations) similar to those in Python. See this SO thread for more information: What is the preferred way to implement 'yield' in Scala?

Yield is similar to for loop which has a buffer that we cannot see and for each increment, it keeps adding next item to the buffer. When the for loop finishes running, it would return the collection of all the yielded values. Yield can be used as simple arithmetic operators or even in combination with arrays.
Here are two simple examples for your better understanding
scala>for (i <- 1 to 5) yield i * 3
res: scala.collection.immutable.IndexedSeq[Int] = Vector(3, 6, 9, 12, 15)
scala> val nums = Seq(1,2,3)
nums: Seq[Int] = List(1, 2, 3)
scala> val letters = Seq('a', 'b', 'c')
letters: Seq[Char] = List(a, b, c)
scala> val res = for {
| n <- nums
| c <- letters
| } yield (n, c)
res: Seq[(Int, Char)] = List((1,a), (1,b), (1,c), (2,a), (2,b), (2,c), (3,a), (3,b), (3,c))
Hope this helps!!

val aList = List( 1,2,3,4,5 )
val res3 = for ( al <- aList if al > 3 ) yield al + 1
val res4 = aList.filter(_ > 3).map(_ + 1)
println( res3 )
println( res4 )
These two pieces of code are equivalent.
val res3 = for (al <- aList) yield al + 1 > 3
val res4 = aList.map( _+ 1 > 3 )
println( res3 )
println( res4 )
These two pieces of code are also equivalent.
Map is as flexible as yield and vice-versa.

val doubledNums = for (n <- nums) yield n * 2
val ucNames = for (name <- names) yield name.capitalize
Notice that both of those for-expressions use the yield keyword:
Using yield after for is the “secret sauce” that says, “I want to yield a new collection from the existing collection that I’m iterating over in the for-expression, using the algorithm shown.”
taken from here

According to the Scala documentation, it clearly says "yield a new collection from the existing collection".
Another Scala documentation says, "Scala offers a lightweight notation for expressing sequence comprehensions. Comprehensions have the form for (enums) yield e, where enums refers to a semicolon-separated list of enumerators. An enumerator is either a generator which introduces new variables, or it is a filter. "

yield is more flexible than map(), see example below
val aList = List( 1,2,3,4,5 )
val res3 = for ( al <- aList if al > 3 ) yield al + 1
val res4 = aList.map( _+ 1 > 3 )
println( res3 )
println( res4 )
yield will print result like: List(5, 6), which is good
while map() will return result like: List(false, false, true, true, true), which probably is not what you intend.

Related

In Scala returning a value from a for loop block

In the book "Scala for the impatient", it says on page 16
In Scala, a { } block contains a sequence of expressions, and the
result is also an expression. The value of the block is the value of
the last expression.
OK, then let's create a block and let the last value of the block be assigned:
scala> val evens = for (elem <- 1 to 10 if elem%2==0) {
| elem
| }
val evens: Unit = ()
I would have expected that evens is at least the last value of the sequence (i.e. 10). But why not?
You need to yield the value, then it's a for expression:
val evens = for (elem <- 1 to 10 if elem % 2 == 0) yield elem
Without that it's just a statement (does not return anything) and is translated to foreach.
P.S.: Of course this will return a collection of all the elements that fulfill the predicate and not the last one.
When in doubt just run it through the typechecker to peek under the hood
scala -Xprint:typer -e 'val evens = for (elem <- 1 to 10 if elem%2==0) { elem }'
reveals
val evens: Unit =
scala.Predef
.intWrapper(1)
.to(10)
.withFilter(((elem: Int) => elem.%(2).==(0)))
.foreach[Int](((elem: Int) => elem))
where we see foreach to be the last step in the chain, and its signature is
def foreach[U](f: A => U): Unit
where we see it returns Unit. You can even do this straight from within the REPL by executing the following command
scala> :settings -Xprint:typer
and now you will get real-time desugaring of Scala expressions at the same time they are interpreted. You can even take it a step further and get at the JVM bytecode itself
scala> :javap -
For-comprehensions are some of the most prevalent syntactic sugar in Scala so I would suggest to drill them as much as possible by perhaps trying to write them at the same time in both their suggared and desugared from until it clicks: https://docs.scala-lang.org/tutorials/FAQ/yield.html
Unit is the exception to the rule stated in your book. Unit basically says "ignore whatever type the block would have returned because I only intended to execute the block for the side effects." Otherwise, in order to get it to typecheck, you'd have to add a unit value to the end of any block that was supposed to return Unit:
val evens = for (elem <- 1 to 10 if elem%2==0) {
elem
()
}
This throwing away of type information is one reason people tend to avoid imperative for loops and similar in Scala.

Difference between applicative and monadic computation in scala

Given this simple computation i can not clearly see the difference between using applicative style over monadic style. Are there some better examples out there ( in scala ) when to use the one over the other.
println( (3.some |#| none[Int] |#| 4.some )( (a:Int,b:Int,c:Int) => { a + b + c } ) ) // prints None
println( for(
a <- Some(3);
b <- none[Int];
c <- Some(4)
) yield( a + b + c ) ) // prints None
Both computations ending up in a None so the end result is the same. The only difference i can see ist that there is no temporaray access to those vars in the for comprehension when using the applicative syntax.
Furthermore having one None value stops the whole computation. I thought applicative means "not dependent on the result of the computation before"
The applicative builder syntax will evaluate each term and can not use the result of a prior computation. However, even if the first result is None, all the other expressions will still be evaluated.
Whereas, with the for comprehension, it will 'fail fast' (it will not evaluate any further expressions after a None, in your case), plus you can access the results of previous computations.
Don't think of these things as simply different styles, they are calling different functions with different behaviours: i.e. flatMap vs apply
Monads represent sequential computations where each next computation depends on previous ones (if previous computation is empty you can't proceed, so you "fail fast"), more generic example of monadic computation:
println( for(
a <- Some(1);
b <- Some(a);
c <- Some(a + b)
) yield( a + b + c ) ) //=> 4
Applicative is just fmap on steroids where not only an argument, but a mapping function itself can be empty. In your case it can be rewritten as:
4.some <*>
{ none[Int] <*>
{ 3.some <*>
{ (_: Int) + (_: Int) + (_: Int) }.curried.some } }
On some step your function becomes Option[Int => Int] = None, but it doesn't stop from applying it to 4.some, only the result is None as expected. You still need to know the value of 4.some.

Extract elements from one list that aren't in another

Simply, I have two lists and I need to extract the new elements added to one of them.
I have the following
val x = List(1,2,3)
val y = List(1,2,4)
val existing :List[Int]= x.map(xInstance => {
if (!y.exists(yInstance =>
yInstance == xInstance))
xInstance
})
Result :existing: List[AnyVal] = List((), (), 3)
I need to remove all other elements except the numbers with the minimum cost.
Pick a suitable data structure, and life becomes a lot easier.
scala> x.toSet -- y
res1: scala.collection.immutable.Set[Int] = Set(3)
Also beware that:
if (condition) expr1
Is shorthand for:
if (condition) expr1 else ()
Using the result of this, which will usually have the static type Any or AnyVal is almost always an error. It's only appropriate for side-effects:
if (condition) buffer += 1
if (condition) sys.error("boom!")
retronym's solution is okay IF you don't have repeated elements that and you don't care about the order. However you don't indicate that this is so.
Hence it's probably going to be most efficient to convert y to a set (not x). We'll only need to traverse the list once and will have fast O(log(n)) access to the set.
All you need is
x filterNot y.toSet
// res1: List[Int] = List(3)
edit:
also, there's a built-in method that is even easier:
x diff y
(I had a look at the implementation; it looks pretty efficient, using a HashMap to count ocurrences.)
The easy way is to use filter instead so there's nothing to remove;
val existing :List[Int] =
x.filter(xInstance => !y.exists(yInstance => yInstance == xInstance))
val existing = x.filter(d => !y.exists(_ == d))
Returns
existing: List[Int] = List(3)

General comprehensions in Scala

As far as I understand, the Scala for-comprehension notation relies on the first generator to define how elements are to be combined. Namely, for (i <- list) yield i returns a list and for (i <- set) yield i returns a set.
I was wondering if there was a way to specify how elements are combined independently of the properties of the first generator. For instance, I would like to get "the set of all elements from a given list", or "the sum of all elements from a given set". The only way I have found is to first build a list or a set as prescribed by the for-comprehension notation, then apply a transformation function to it - building a useless data structure in the process.
What I have in mind is a general "algebraic" comprehension notation as it exists for instance in Ateji PX:
`+ { i | int i : set } // the sum of all elements from a given set
set() { i | int i : list } // the set of all elements from a given list
concat(",") { s | String s : list } // string concatenation with a separator symbol
Here the first element (`+, set(), concat(",")) is a so-called "monoid" that defines how elements are combined, independently of the structure of the first generator (there can be multiple generators and filters, I just tried to keep the examples concise).
Any idea about how to achieve a similar result in Scala while keeping a nice and concise notation ? As far as I understand, the for-comprehension notation is hard-wired in the compiler and cannot be upgraded.
Thanks for your feedback.
About the for comprehension
The for comprehension in scala is syntactic sugar for calls to flatMap, filter, map and foreach. In exactly the same way as calls to those methods, the type of the target collection leads to the type of the returned collection. That is:
list map f //is a List
vector map f // is a Vector
This property is one of the underlying design goals of the scala collections library and would be seen as desirable in most situations.
Answering the question
You do not need to construct any intermediate collection of course:
(list.view map (_.prop)).toSet //uses list.view
(list.iterator map (_.prop)).toSet //uses iterator
(for { l <- list.view} yield l.prop).toSet //uses view
(Set.empty[Prop] /: coll) { _ + _.prop } //uses foldLeft
Will all yield Sets without generating unnecessary collections. My personal preference is for the first. In terms of idiomatic scala collection manipulation, each "collection" comes with these methods:
//Conversions
toSeq
toSet
toArray
toList
toIndexedSeq
iterator
toStream
//Strings
mkString
//accumulation
sum
The last is used where the element type of a collection has an implicit Numeric instance in scope; such as:
Set(1, 2, 3, 4).sum //10
Set('a, 'b).sum //does not compile
Note that the String concatenation example in scala looks like:
list.mkString(",")
And in the scalaz FP library might look something like (which uses Monoid to sum Strings):
list.intercalate(",").asMA.sum
Your suggestions do not look anything like Scala; I'm not sure whether they are inspired by another language.
foldLeft? That's what you're describing.
The sum of all elements from a given set:
(0 /: Set(1,2,3))(_ + _)
the set of all elements from a given list
(Set[Int]() /: List(1,2,3,2,1))((acc,x) => acc + x)
String concatenation with a separator symbol:
("" /: List("a", "b"))(_ + _) // (edit - ok concat a bit more verbose:
("" /: List("a", "b"))((acc,x) => acc + (if (acc == "") "" else ",") + x)
You can also force the result type of the for comprehension by explicitly supplying the implicit CanBuildFrom parameter as scala.collection.breakout and specifying the result type.
Consider this REPL session:
scala> val list = List(1, 1, 2, 2, 3, 3)
list: List[Int] = List(1, 1, 2, 2, 3, 3)
scala> val res = for(i <- list) yield i
res: List[Int] = List(1, 1, 2, 2, 3, 3)
scala> val res: Set[Int] = (for(i <- list) yield i)(collection.breakOut)
res: Set[Int] = Set(1, 2, 3)
It results in a type error when not specifying the CanBuildFrom explicitly:
scala> val res: Set[Int] = for(i <- list) yield i
<console>:8: error: type mismatch;
found : List[Int]
required: Set[Int]
val res: Set[Int] = for(i <- list) yield i
^
For a deeper understanding of this I suggest the following read:
http://www.scala-lang.org/docu/files/collections-api/collections-impl.html
If you want to use for comprehensions and still be able to combine your values in some result value you could do the following.
case class WithCollector[B, A](init: B)(p: (B, A) => B) {
var x: B = init
val collect = { (y: A) => { x = p(x, y) } }
def apply(pr: (A => Unit) => Unit) = {
pr(collect)
x
}
}
// Some examples
object Test {
def main(args: Array[String]): Unit = {
// It's still functional
val r1 = WithCollector[Int, Int](0)(_ + _) { collect =>
for (i <- 1 to 10; if i % 2 == 0; j <- 1 to 3) collect(i + j)
}
println(r1) // 120
import collection.mutable.Set
val r2 = WithCollector[Set[Int], Int](Set[Int]())(_ += _) { collect =>
for (i <- 1 to 10; if i % 2 == 0; j <- 1 to 3) collect(i + j)
}
println(r2) // Set(9, 10, 11, 6, 13, 4, 12, 3, 7, 8, 5)
}
}

scala loop through a linkedlist

In scala, what is a good way to loop through a linked list(scala.collection.mutable.LinkedList) of objects? For example, I want to have 'for' loop traverse through each object on the linked list and process it.
With foreach:
Welcome to Scala version 2.8.0.final (Java HotSpot(TM) Client VM, Java 1.6.0_21).
Type in expressions to have them evaluated.
Type :help for more information.
scala> val ll = scala.collection.mutable.LinkedList[Int](1,2,3)
ll: scala.collection.mutable.LinkedList[Int] = LinkedList(1, 2, 3)
scala> ll.foreach(i => println(i * 2))
2
4
6
or, if your processing of each object returns a new value, use map:
scala> ll.map(_ * 2)
res3: scala.collection.mutable.LinkedList[Int] = LinkedList(2, 4, 6)
Some people prefer for comprehensions instead of foreach and map. They look like this:
scala> for (i <- ll) println(i)
1
2
3
scala> for (i <- ll) yield i * 2
res5: scala.collection.mutable.LinkedList[Int] = LinkedList(2, 4, 6)
To expand on the previous answer...
for, foreach and map are all higher-order functions - they can all take a function as an argument, so starting here:
val list = List(1,2,3)
list.foreach(i => println(i * 2))
You have a number of ways that you can make the code more declarative in nature, and cleaner at the same time.
First, you don't really need to use the name - i - for each member of the collection, you can use _ as a placeholder instead:
list.foreach(println(_ * 2))
You can also separate the logic out into a distinct method, and continue to use placeholder syntax:
def printTimesTwo(i:Int) = println(i * 2)
list.foreach(printTimesTwo(_))
Even cleaner, just pass the raw function without specifying parameters (look ma, no placeholders!)
list.foreach(printTimesTwo)
And to take it to a logical conclusion, this can be made cleaner still by using infix syntax. Which I show here working with a standard library method. Note: you could even use a method imported from a java library, if you wanted:
list foreach println
This thinking extends to anonymous functions and partially-applied functions and also to the map operation:
// "2 *" creates an anonymous function that will double its one-and-only argument
list map { 2 * }
For-comprehensions aren't really very useful when working at this level, they just add boilerplate. But they do come into their own when working with deeper nested structures:
//a list of lists, print out all the numbers
val grid = List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 9))
grid foreach { _ foreach println } //hmm, could get confusing
for(line <- grid; cell <- line) println(cell) //that's clearer
I didn't need the yield keyword there, as nothing is being returned. But if I wanted to get back a list of Strings (un-nested):
for(line <- grid; cell <- line) yield { cell.toString }
With lots of generators, you'll want to split them over multiple lines:
for {
listOfGrids <- someMasterCollection
grid <- listOfGrids
line <- grid
cell <- line
} yield {
cell.toString
}