Can you define a value (in an if) in a for comprehension in Scala for use in yield - scala

Is it possible to define a value (in an if) in a for comprehension in Scala for use in yield?
I want to do this to avoid evaluating a potentially expensive expression twice.
An example to illustrate:
for {
bar <- bars if expensive(bar) > 5
} yield (bar, expensive(bar))

How about this:
for {
bar <- bars
exp = expensive(bar)
if exp > 5
} yield (bar, exp)

Yes, you can:
scala> List(1,2,3,4,5)
res0: List[Int] = List(1, 2, 3, 4, 5)
scala> for (n <- res0; b = n % 2; if b == 1) yield b
res2: List[Int] = List(1, 1, 1)
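To check that the value definition really avoids the second evaluation, here is a minimal sketch that counts calls. The names MidstreamDefinition, calls, and the body of expensive are made up for illustration; only the shape of the for comprehension comes from the answers above.

```scala
object MidstreamDefinition {
  // Hypothetical stand-in for the costly function; it counts its own calls.
  var calls = 0
  def expensive(n: Int): Int = { calls += 1; n * 2 }

  val bars = List(1, 2, 3, 4)

  // exp is computed once per element and is visible in both the guard and the yield.
  val result: List[(Int, Int)] =
    for {
      bar <- bars
      exp = expensive(bar)
      if exp > 5
    } yield (bar, exp)
}
```

With these assumed inputs, expensive runs once per element of bars rather than once for the guard and again for the yield.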

Related

Scala: Indexed Seq instead of List in for loop

The following code shows a type mismatch error :
def f(arr:List[Int]): List[Int] =
for(num <- 0 to arr.length-1; if num % 2 == 1) yield arr(num)
It says that it found an IndexedSeq instead of a List. The following works:
def f(arr:List[Int]): List[Int] =
for(num <- (0 to arr.length-1).toList; if num % 2 == 1) yield arr(num)
I have used i <- a to b in a for loop before but haven't seen this error. Can someone please explain why the form i <- a to b cannot be used here?
Because 0 to arr.length-1 is a Range, which is an IndexedSeq[Int], the for ... yield expression also produces a result of type IndexedSeq[Int].
A correct definition of the function:
def f(arr:List[Int]):IndexedSeq[Int] = for( num <- 0 to arr.length-1 if num%2==1) yield arr(num)
And
for( num <- 0 to arr.length-1 if num%2==1) yield arr(num)
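is translated to roughly the following (the compiler actually calls withFilter rather than filter, but the result type is the same):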
scala> def f(arr:List[Int]) = (0 to arr.length-1).filter(i => i%2==1).map(i => arr(i))
f: (arr: List[Int])scala.collection.immutable.IndexedSeq[Int]
So we can see that the return type is decided by the type of 0 to arr.length-1.
Calling (0 to arr.length-1).toList changes the generator from IndexedSeq[Int] to List[Int], so the for ... yield expression produces a List[Int].
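The point about the generator deciding the result type can be sketched as follows. The object name and the sample list are assumptions for illustration; the type annotations are there so the compiler confirms each claim.

```scala
object GeneratorDrivesResultType {
  val arr = List(10, 11, 12, 13, 14)

  // A Range generator produces an IndexedSeq.
  val fromRange: IndexedSeq[Int] =
    for (i <- 0 until arr.length if i % 2 == 1) yield arr(i)

  // Hand-written equivalent of the comprehension above.
  val desugared: IndexedSeq[Int] =
    (0 until arr.length).withFilter(i => i % 2 == 1).map(i => arr(i))

  // A List generator produces a List.
  val fromList: List[Int] =
    for (i <- (0 until arr.length).toList if i % 2 == 1) yield arr(i)

  // Alternatively, keep the Range generator and convert the result at the end.
  val converted: List[Int] =
    (for (i <- 0 until arr.length if i % 2 == 1) yield arr(i)).toList
}
```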
In Scala, yield produces one value per iteration of the for loop, and those values are collected into the result. The result collection generally has the same kind as the one you iterate over, so a List yields a List, an IndexedSeq yields an IndexedSeq, and so on.
The type of (0 to arr.length-1) is scala.collection.immutable.Range, which inherits from scala.collection.immutable.IndexedSeq[Int]. So in the first case the result is an IndexedSeq[Int], but the declared return type of f is List[Int], so it does not compile. In the second case a List yields a List, which matches the return type of f.
You can also write function f as follows:
def f(arr: List[Int]): IndexedSeq[Int] = for( a <- 1 to arr.length-1; if a % 2 == 1) yield arr(a)
Another example:
scala> for (i <- 1 to 5) yield i
res0: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 2, 3, 4, 5)
scala> for (e <- Array(1, 2, 3, 4, 5)) yield e
res1: Array[Int] = Array(1, 2, 3, 4, 5)
In Scala, for is syntactic sugar, where:
for (i <- a to b) yield func(i)
translates to:
RichInt(a).to(b).map({ i => func(i) })
RichInt.to returns a Range.
Range.map returns an IndexedSeq.
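If the goal is simply the elements at odd positions as a List, one possible alternative is to avoid the index range altogether. This is a sketch, not something proposed in the answers above; it uses zipWithIndex and collect from the standard library, and the object name is made up. It also sidesteps arr(num), which is a linear-time lookup on a List.

```scala
object OddIndexedElements {
  // Pair every element with its index, then keep the values at odd indices.
  // The result stays a List because the source collection is a List.
  def f(arr: List[Int]): List[Int] =
    arr.zipWithIndex.collect { case (value, index) if index % 2 == 1 => value }
}
```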

Scala List Operation

Given a List of Int and a variable X of type Int, what is the best functional way in Scala to retain only the leading values of the List (starting from the beginning) such that their sum is less than or equal to X?
This is pretty close to a one-liner:
def takeWhileLessThan(x: Int)(l: List[Int]): List[Int] =
l.scan(0)(_ + _).tail.zip(l).takeWhile(_._1 <= x).map(_._2)
Let's break that into smaller pieces.
First you use scan to create a list of cumulative sums. Here's how it works on a small example:
scala> List(1, 2, 3, 4).scan(0)(_ + _)
res0: List[Int] = List(0, 1, 3, 6, 10)
Note that the result includes the initial value, which is why we take the tail in our implementation.
scala> List(1, 2, 3, 4).scan(0)(_ + _).tail
res1: List[Int] = List(1, 3, 6, 10)
Now we zip the entire thing against the original list. Taking our example again, this looks like the following:
scala> List(1, 2, 3, 4).scan(0)(_ + _).tail.zip(List(1, 2, 3, 4))
res2: List[(Int, Int)] = List((1,1), (3,2), (6,3), (10,4))
Now we can use takeWhile to take as many values as we can from this list before the cumulative sum is greater than our target. Let's say our target is 5 in our example:
scala> res2.takeWhile(_._1 <= 5)
res3: List[(Int, Int)] = List((1,1), (3,2))
This is almost what we want—we just need to get rid of the cumulative sums:
scala> res2.takeWhile(_._1 <= 5).map(_._2)
res4: List[Int] = List(1, 2)
And we're done. It's worth noting that this isn't very efficient, since it computes the cumulative sums for the entire list, etc. The implementation could be optimized in various ways, but as it stands it's probably the simplest purely functional way to do this in Scala (and in most cases the performance won't be a problem, anyway).
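One of the possible optimizations hinted at here is to compute the running sums lazily, so that the work stops at the first sum over the limit. The following is only a sketch under that assumption; the names LazyPrefixSum and takeWhileSumAtMost are made up, and it relies on Iterator.scanLeft being lazy.

```scala
object LazyPrefixSum {
  def takeWhileSumAtMost(x: Int)(l: List[Int]): List[Int] = {
    // Running sums are produced on demand; drop(1) skips the initial 0.
    val fitting = l.iterator.scanLeft(0)(_ + _).drop(1).takeWhile(_ <= x).size
    // fitting is the length of the longest prefix whose running sums all stay within x.
    l.take(fitting)
  }
}
```

Like the one-liner, it stops at the first running sum that exceeds the limit, so with negative numbers in the list it does not look for a longer prefix further on.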
In addition to Travis' answer (and for the sake of completeness), you can always implement this type of operation as a foldLeft:
def takeWhileLessThanOrEqualTo(maxSum: Int)(list: Seq[Int]): Seq[Int] = {
// Tuple3: the sum of elements so far; the accumulated list; whether we have gone over maxSum, in other words whether we are finished
val startingState = (0, Seq.empty[Int], false)
val (_, accumulatedNumbers, _) = list.foldLeft(startingState) {
case ((sum, accumulator, finished), nextNumber) =>
if(!finished) {
if (sum + nextNumber > maxSum) (sum, accumulator, true) // We are over the sum limit, finish
else (sum + nextNumber, accumulator :+ nextNumber, false) // We are still under the limit, add it to the list and sum
} else (sum, accumulator, finished) // We are in a finished state, just keep iterating over the list
}
accumulatedNumbers
}
This only iterates over the list once, so it should be more efficient, but is more complicated and requires a bit of reading code to understand.
I will go with something like this, which is more functional and should be efficient.
def takeSumLessThan(x: Int, l: List[Int]): List[Int] = (x, l) match {
  case (_, Nil) => Nil
  case (x, lh :: _) if lh > x => Nil // the next element would push the sum over the limit, so stop
  case (x, lh :: lt) => lh :: takeSumLessThan(x - lh, lt) // x is the remaining budget
}
Edit 1 : Adding tail recursion and implicit for shorter call notation
import scala.annotation.tailrec
implicit class MyList(l: List[Int]) {
  def takeSumLessThan(x: Int): List[Int] = {
    @tailrec
    def f(x: Int, l: List[Int], acc: List[Int]): List[Int] = (x, l) match {
      case (_, Nil) => acc
      case (x, lh :: _) if lh > x => acc // the next element would push the sum over the limit, so stop
      case (x, lh :: lt) => f(x - lh, lt, acc :+ lh)
    }
    f(x, l, Nil)
  }
}
Now you can use this like
List(1,2,3,4,5,6,7,8).takeSumLessThan(10)
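Appending to the accumulator on every step copies it each time, which makes the loop quadratic in the length of the kept prefix. A common variation, sketched here with made-up names (PrefixWithinBudget, takeWhileSumAtMost, loop), is to prepend and reverse once at the end. It assumes the intended rule is "keep elements while the running sum stays at or below the limit".

```scala
import scala.annotation.tailrec

object PrefixWithinBudget {
  def takeWhileSumAtMost(limit: Int, xs: List[Int]): List[Int] = {
    @tailrec
    def loop(remaining: Int, rest: List[Int], acc: List[Int]): List[Int] = rest match {
      // The head still fits in the remaining budget: keep it and continue.
      case head :: tail if head <= remaining => loop(remaining - head, tail, head :: acc)
      // Either the list is exhausted or the head does not fit: restore the original order.
      case _ => acc.reverse
    }
    loop(limit, xs, Nil)
  }
}
```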

How to sum adjacent elements in scala

I want to sum adjacent elements in scala and I'm not sure how to deal with the last element.
So I have a list:
val x = List(1,2,3,4)
And I want to sum adjacent elements using indices and map:
val size = x.indices.size
val y = x.indices.map(i =>
if (i < size - 1)
x(i) + x(i+1))
The problem is that this approach creates a Unit value at the end, so the element type becomes AnyVal:
res1: scala.collection.immutable.IndexedSeq[AnyVal] = Vector(3, 5, 7, ())
and if I try to sum the elements or another numeric method of the collection, it doesn't work:
error: could not find implicit value for parameter num: Numeric[AnyVal]
I tried to filter out the element using:
y diff List(Unit) or y diff List(AnyVal)
but it doesn't work.
Is there a better approach in Scala to do this type of adjacent sum without using a for loop?
For a more functional solution, you can use sliding to group the elements together in twos (or any number of them), then map to their sum.
scala> List(1, 2, 3, 4).sliding(2).map(_.sum).toList
res80: List[Int] = List(3, 5, 7)
What sliding(2) will do is create an intermediate iterator of lists like this:
Iterator(
List(1, 2),
List(2, 3),
List(3, 4)
)
So when we chain map(_.sum), we will map each inner List to its own sum. toList will convert the Iterator back into a List.
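One edge case worth knowing: when the list is shorter than the window, sliding still returns a single partial window, so a one-element list would produce its own value as a "sum". A small sketch that guards against this by only accepting full pairs (the names AdjacentSums and adjacentSums are made up):

```scala
object AdjacentSums {
  // collect keeps only complete two-element windows and skips a partial one.
  def adjacentSums(xs: List[Int]): List[Int] =
    xs.sliding(2).collect { case List(a, b) => a + b }.toList

  // For comparison: the unguarded version on a one-element list.
  val unguardedOnSingleton: List[Int] = List(1).sliding(2).map(_.sum).toList
}
```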
You can try pattern matching and tail recursion also.
import scala.annotation.tailrec
@tailrec
def f(l: List[Int], r: List[Int] = Nil): List[Int] = {
  l match {
    case x :: y :: _ =>
      f(l.tail, r :+ (x + y)) // add the sum of the first two elements, then advance by one
    case _ => r
  }
}
scala> f(List(1,2,3,4))
res4: List[Int] = List(3, 5, 7)
With a for comprehension by zipping two lists, the second with the first item dropped,
for ( (a,b) <- x zip x.drop(1) ) yield a+b
which results in
List(3, 5, 7)

verifying a probability distribution with variable arguments sums to 1

I was wondering how you would write a method in Scala that takes a function f and a list of arguments args where each arg is a range. Suppose I have three arguments (Range(0,2), Range(0,10), and Range(1, 5)). Then I want to iterate over f with all the possibilities of those three arguments.
var sum = 0.0
for (a <- args(0)) {
  for (b <- args(1)) {
    for (c <- args(2)) {
      sum += f(a, b, c)
    }
  }
}
However, I want this method to work for functions with a variable number of arguments. Is this possible?
Edit: is there any way to do this when the function does not take a list, but rather takes a standard parameter list or is curried?
That's a really good question!
You want to run flatMap in sequence over a list of elements of arbitrary size. When you don't know how long your list is, you can process it with recursion, or equivalently, with a fold.
scala> def sequence[A](lss: List[List[A]]) = lss.foldRight(List(List[A]())) {
| (m, n) => for (x <- m; xs <- n) yield x :: xs
| }
scala> sequence(List(List(1, 2), List(4, 5), List(7)))
res2: List[List[Int]] = List(List(1, 4, 7), List(1, 5, 7), List(2, 4, 7), List(2, 5, 7))
(If you can't figure out the code, don't worry, learn how to use Hoogle and steal it from Haskell)
You can do this with Scalaz (in general it starts with an F[G[X]] and returns a G[F[X]], given that the type constructors F and G have the Traverse and Applicative capabilities respectively).
scala> import scalaz._
import scalaz._
scala> import Scalaz._
import Scalaz._
scala> List(List(1, 2), List(4, 5), List(7)).sequence
res3: List[List[Int]] = List(List(1, 4, 7), List(1, 5, 7), List(2, 4, 7), List(2, 5, 7))
scala> Seq(some(1), some(2)).sequence
res4: Option[Seq[Int]] = Some(List(1, 2))
scala> Seq(some(1), none[Int]).sequence
res5: Option[Seq[Int]] = None
That would more or less do the job (without applying f, which you can do separately)
def crossProduct[A](xxs: Seq[A]*): Seq[Seq[A]] =
  xxs.foldLeft(Vector(Vector[A]())) { (res, xs) =>
    for (r <- res; x <- xs) yield r :+ x
  }
You can then just map your function on that. I'm not sure it's a very efficient implementation though.
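Putting the pieces together for the original goal, here is a hedged sketch of checking that a function sums to 1 over all argument combinations. Everything named here (DistributionCheck, sumsToOne, fromThreeArgs, the tolerance, the uniform example) is an assumption for illustration, and crossProduct is a variation that takes a Seq of sequences instead of varargs. The small adapter is one possible answer to the question's edit about functions with a standard parameter list.

```scala
object DistributionCheck {
  // Same fold as above, but taking the sequences as a single Seq.
  def crossProduct[A](xxs: Seq[Seq[A]]): Seq[Seq[A]] =
    xxs.foldLeft(Seq(Seq.empty[A])) { (res, xs) =>
      for (r <- res; x <- xs) yield r :+ x
    }

  // Sums f over every combination and compares with 1 up to a floating-point tolerance.
  def sumsToOne(f: Seq[Int] => Double, ranges: Seq[Range], tolerance: Double = 1e-9): Boolean = {
    val total = crossProduct[Int](ranges).map(f).sum
    math.abs(total - 1.0) <= tolerance
  }

  // Adapter for a fixed-arity function; it throws a MatchError on any other arity.
  def fromThreeArgs(g: (Int, Int, Int) => Double): Seq[Int] => Double = {
    case Seq(a, b, c) => g(a, b, c)
  }

  // Example: a uniform distribution over 2 * 10 * 4 = 80 combinations.
  val ranges: Seq[Range] = Seq(0 until 2, 0 until 10, 1 until 5)
  val uniform: (Int, Int, Int) => Double = (a, b, c) => 1.0 / 80
}
```

A curried function can be adapted the same way, for example with Function.uncurried before passing it to the adapter.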
Here is an answer from a recursive perspective. Unfortunately, it is not as short as the others.
def foo(f: List[Int] => Int, args: Range*): Double = {
  var sum = 0.0
  // ints holds the values chosen so far, in the same order as args
  def rec(ranges: List[Range], ints: List[Int]): Unit = {
    if (ranges.nonEmpty)
      for (i <- ranges.head)
        rec(ranges.tail, ints :+ i)
    else
      sum += f(ints)
  }
  rec(args.toList, Nil)
  sum
}
Have a look at this answer. I use this code for exactly this purpose. It's slightly optimized. I think I could produce a faster version if you need one.

Multiple yields in sequence comprehension?

I'm trying to learn Scala and tried to write a sequence comprehension that extracts unigrams, bigrams and trigrams from a sequence. E.g., [1,2,3,4] should be transformed to (not Scala syntax)
[1; _,1; _,_,1; 2; 1,2; _,1,2; 3; 2,3; 1,2,3; 4; 3,4; 2,3,4]
In Scala 2.8, I tried the following:
def trigrams(tokens : Seq[T]) = {
var t1 : Option[T] = None
var t2 : Option[T] = None
for (t3 <- tokens) {
yield t3
yield (t2,t3)
yield (t1,t2,Some(t3))
t1 = t2
t2 = t3
}
}
But this doesn't compile as, apparently, only one yield is allowed in a for-comprehension (no block statements either). Is there any other elegant way to get the same behavior, with only one pass over the data?
You can't have multiple yields in a for loop because for loops are syntactic sugar for the map (or flatMap) operations:
for (i <- collection) yield( func(i) )
translates into
collection map {i => func(i)}
Without a yield at all
for (i <- collection) func(i)
translates into
collection foreach {i => func(i)}
So the entire body of the for loop is turned into a single closure, and the presence of the yield keyword determines whether the function called on the collection is map or foreach (or flatMap). Because of this translation, the following are forbidden:
Using imperative statements next to a yield to determine what will be yielded.
Using multiple yields
(Not to mention that your proposed version will return a List[Any] because the tuples and the 1-gram are all of different types. You probably want to get a List[List[Int]] instead.)
Try the following instead (which puts the n-grams in the order they appear):
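Since a second generator turns the comprehension into a flatMap, the effect of several yields per element can be approximated by adding an inner generator over what should be yielded. The sketch below is one way to get the ordering from the question, with Option padding for the missing positions; the names NGramsInOrder, ngramsInOrder, and maxN are made up for illustration.

```scala
object NGramsInOrder {
  def ngramsInOrder[T](tokens: Seq[T], maxN: Int = 3): List[Seq[Option[T]]] =
    if (tokens.isEmpty) Nil
    else {
      // Pad the front with None so that every token has maxN - 1 predecessors.
      val padded: Seq[Option[T]] =
        Seq.fill(maxN - 1)(Option.empty[T]) ++ tokens.map(t => Some(t): Option[T])
      for {
        window <- padded.sliding(maxN).toList // one window per token, ending at that token
        n <- 1 to maxN                        // inner generator: several results per window
      } yield window.takeRight(n)
    }
}
```

For the tokens 1 and 2 this yields, in order, the 1-gram, 2-gram, and 3-gram ending at 1, followed by those ending at 2.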
val basis = List(1,2,3,4)
val slidingIterators = 1 to 4 map (basis sliding _)
for {onegram <- basis
ngram <- slidingIterators if ngram.hasNext}
yield (ngram.next)
or
import scala.collection.mutable.ListBuffer
val basis = List(1,2,3,4)
val slidingIterators = 1 to 4 map (basis sliding _)
val first = slidingIterators.head
val buf = new ListBuffer[List[Int]]
while (first.hasNext)
  for (i <- slidingIterators)
    if (i.hasNext)
      buf += i.next()
If you prefer the n-grams to be in length order, try:
val basis = List(1,2,3,4)
(1 to 4) flatMap { n => (basis sliding n).toList }
scala> val basis = List(1, 2, 3, 4)
basis: List[Int] = List(1, 2, 3, 4)
scala> val nGrams = (basis sliding 1).toList ::: (basis sliding 2).toList ::: (basis sliding 3).toList
nGrams: List[List[Int]] = ...
scala> nGrams foreach (println _)
List(1)
List(2)
List(3)
List(4)
List(1, 2)
List(2, 3)
List(3, 4)
List(1, 2, 3)
List(2, 3, 4)
I guess I should have given this more thought.
def trigrams[T](tokens : Seq[T]) : Seq[(Option[T],Option[T],T)] = {
var t1 : Option[T] = None
var t2 : Option[T] = None
for (t3 <- tokens)
yield {
val tri = (t1,t2,t3)
t1 = t2
t2 = Some(t3)
tri
}
}
Then extract the unigrams and bigrams from the trigrams. But can anyone explain to me why 'multi-yields' are not permitted, and if there's any other way to achieve their effect?
val basis = List(1, 2, 3, 4)
val nGrams = basis.map(x => (x)) ::: (for (a <- basis; b <- basis) yield (a, b)) ::: (for (a <- basis; b <- basis; c <- basis) yield (a, b, c))
nGrams: List[Any] = ...
nGrams foreach (println(_))
1
2
3
4
(1,1)
(1,2)
(1,3)
(1,4)
(2,1)
(2,2)
(2,3)
(2,4)
(3,1)
(3,2)
(3,3)
(3,4)
(4,1)
(4,2)
(4,3)
(4,4)
(1,1,1)
(1,1,2)
(1,1,3)
(1,1,4)
(1,2,1)
(1,2,2)
(1,2,3)
(1,2,4)
(1,3,1)
(1,3,2)
(1,3,3)
(1,3,4)
(1,4,1)
(1,4,2)
(1,4,3)
(1,4,4)
(2,1,1)
(2,1,2)
(2,1,3)
(2,1,4)
(2,2,1)
(2,2,2)
(2,2,3)
(2,2,4)
(2,3,1)
(2,3,2)
(2,3,3)
(2,3,4)
(2,4,1)
(2,4,2)
(2,4,3)
(2,4,4)
(3,1,1)
(3,1,2)
(3,1,3)
(3,1,4)
(3,2,1)
(3,2,2)
(3,2,3)
(3,2,4)
(3,3,1)
(3,3,2)
(3,3,3)
(3,3,4)
(3,4,1)
(3,4,2)
(3,4,3)
(3,4,4)
(4,1,1)
(4,1,2)
(4,1,3)
(4,1,4)
(4,2,1)
(4,2,2)
(4,2,3)
(4,2,4)
(4,3,1)
(4,3,2)
(4,3,3)
(4,3,4)
(4,4,1)
(4,4,2)
(4,4,3)
(4,4,4)
You could try a functional version without assignments:
def trigrams[T](tokens : Seq[T]) = {
val s1 = tokens.map { Some(_) }
val s2 = None +: s1
val s3 = None +: s2
s1 zip s2 zip s3 map {
case ((t1, t2), t3) => (List(t1), List(t1, t2), List(t1, t2, t3))
}
}