Scala: iterate over a set of objects

I have a sealed trait which is implemented by 3 objects
sealed trait MyTrait {
...
}
object A extends MyTrait { ... }
object B extends MyTrait { ... }
object C extends MyTrait { ... }
I'm using Scalaz's validation mechanism, wherein the apply methods of the objects A, B and C return a validated type. The objects A, B and C each contain some logic, and I want to apply this logic sequentially: first apply A and check its result, and based on that decide whether to call B or just return the validated result. I want to repeat this until I hit C, after which I simply return whatever calling C produces.
Currently I have a static approach where I first call A, pass the result of A to a utility method that checks it, and then call B.
def apply(request: Request): Validated[Result] = {
val vResultA = run(request, A)
val vResultB = if (isResultOk(vResultA)) run(request, B) else vResultA
if (isResultOk(vResultB)) run(request, C) else vResultB
}
Is there a better way to do this? Any suggestions or any patterns that I can apply?

Let's define succeeded results as results that are OK, and failed results as results that are not OK.
First, A, B, and C are all objects extending MyTrait. Therefore, they can be grouped into an Array or a List of MyTrait.
val objects = Array(A, B, C) /* You can use List instead if you want. */
Then the type of objects is Array[MyTrait].
Next, we have to iterate on this Array.
However, simply calling map on this Array runs every object, even after an earlier one has already produced a result for which isResultOk() is false.
Therefore, we will use Stream instead of Array.
Let's see how using Stream can stop calling map if some condition is satisfied.
Array(1, 2, 3, 4, 5).map(i => {
println(i)
i + 100
}).takeWhile(_ <= 103).foreach(println(_))
The output of the above code will be:
1
2
3
4
5
101
102
103
So map() runs to completion first, and only then does takeWhile() run -- takeWhile() has no effect on how many times map() is called.
However, if we do the same operations on the Stream,
Array(1, 2, 3, 4, 5).toStream.map(i => {
println(i)
i + 100
}).takeWhile(_ <= 103).foreach(println(_))
The output will be:
1
101
2
102
3
103
4
So the calls interleave: map() -> takeWhile() -> foreach() -> map() -> takeWhile() -> ...
At the end, 4 is printed by map, but 4 + 100 = 104 > 103 is cut off by takeWhile().
The remaining elements are never accessed.
So, should we just use takeWhile?
objects.toStream.map(run(request, _)).takeWhile(isResultOk(_))
This discards the failed results, but we need the first failed result whenever a failure occurs.
(i.e. this is a problem if any result is not OK.)
How about the opposite function, dropWhile()?
objects.toStream.map(run(request, _)).dropWhile(isResultOk(_))
This discards the leading succeeded results, but we need the last succeeded result when every result is OK.
(i.e. this is a problem if all results are OK, because nothing is left to return.)
So, we will use span().
c.span(p) = (c.takeWhile(p), c.dropWhile(p))
We will test if there are results that are not OK.
If there is a result that is not OK, then we will return the first such result.
Otherwise, we will return the last result that is OK.
val (succ, fail) = objects.toStream.map(run(request, _)).span(isResultOk(_))
fail.headOption.getOrElse(succ.last)
fail.headOption will return Some(fail's first element) if fail is not empty, otherwise None.
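As a minimal, self-contained illustration of the span/headOption combination (using plain numbers instead of validated results, with "OK" meaning even here):
val results = Stream(2, 4, 5, 6)             // 5 is the first "failed" result
val (succ, fail) = results.span(_ % 2 == 0)  // succ contains 2, 4; fail contains 5, 6
fail.headOption.getOrElse(succ.last)         // 5 -- the first failure

val allOk = Stream(2, 4, 6)
val (s2, f2) = allOk.span(_ % 2 == 0)        // f2 is empty
f2.headOption.getOrElse(s2.last)             // 6 -- the last success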
In summary,
val objects = Array(A, B, C)
def apply(request: Request): Validated[Result] = {
val (succ, fail) = objects.toStream.map(run(request, _)).span(isResultOk(_))
fail.headOption.getOrElse(succ.last)
}
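One caveat worth mentioning: on Scala 2.13 and later, Stream is deprecated in favour of LazyList. Assuming the same run and isResultOk helpers, the sketch above carries over directly:
val objects = List(A, B, C)
def apply(request: Request): Validated[Result] = {
  val (succ, fail) = objects.to(LazyList).map(run(request, _)).span(isResultOk(_))
  fail.headOption.getOrElse(succ.last)
}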

Related

How to count the number of iterations in a for comprehension in Scala?

I am using a for-comprehension on a stream and I would like to know how many iterations it took to get to the final result.
In code:
var count = 0
for {
  xs <- xs_generator
  x <- xs
  count = count + 1 // doesn't work!!
  if (prop(x))
} yield x
Is there a way to achieve this?
Edit: If you don't want to return only the first item, but the entire stream of solutions, take a look at the second part.
Edit-2: Shorter version with zipWithIndex appended.
It's not entirely clear what you are attempting to do. To me it seems as if you are trying to find something in a stream of lists, and additionally save the number of checked elements.
If this is what you want, consider doing something like this:
/** Returns the `x` that satisfies predicate `prop`,
  * as well as the total number of tested `x`s.
  */
def findTheX(): (Int, Int) = {
val xs_generator = Stream.from(1).map(a => (1 to a).toList).take(1000)
var count = 0
def prop(x: Int): Boolean = x % 317 == 0
for (xs <- xs_generator; x <- xs) {
count += 1
if (prop(x)) {
return (x, count)
}
}
throw new Exception("No solution exists")
}
println(findTheX())
// prints:
// (317,50403)
Several important points:
Scala's for-comprehensions have nothing to do with Python's "yield". Just in case you thought they did: re-read the documentation on for-comprehensions.
There is no built-in syntax for breaking out of for-comprehensions. It's better to wrap the loop in a function and then call return. There is also breakable, but it works by throwing an exception (see the sketch after these points).
The function returns the found item and the total count of checked items, therefore the return type is (Int, Int).
The exception thrown after the for-comprehension ensures that the expression's type is Nothing <: (Int, Int) instead of Unit, which is not a subtype of (Int, Int).
Think twice when you want to use Stream in this way: a Stream memoizes the elements it has generated and keeps them in memory. This can lead to "GC overhead limit exceeded" errors if the Stream isn't used carefully.
Just to emphasize it again: the yield in Scala for-comprehensions is unrelated to Python's yield. Scala has no built-in support for coroutines and generators. You don't need them as often as you might think, but it takes some readjustment.
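As mentioned above, scala.util.control.Breaks is the other escape hatch. A minimal sketch of it, assuming the same xs_generator and prop as in findTheX, could look like this:
import scala.util.control.Breaks._

var count = 0
var found: Option[Int] = None
breakable {
  for (xs <- xs_generator; x <- xs) {
    count += 1
    if (prop(x)) { found = Some(x); break() }  // break() throws a control exception caught by breakable
  }
}
// found == Some(317), count == 50403 -- same result as the return-based version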
EDIT
I've re-read your question. If you want an entire stream of solutions together with a counter of how many different xs have been checked, you might use something like this instead:
val xs_generator = Stream.from(1).map(a => (1 to a).toList)
var count = 0
def prop(x: Int): Boolean = x % 317 == 0
val xsWithCounter = for {
xs <- xs_generator;
x <- xs
_ = { count = count + 1 }
if (prop(x))
} yield (x, count)
println(xsWithCounter.take(10).toList)
// prints:
// List(
// (317,50403), (317,50721), (317,51040), (317,51360), (317,51681),
// (317,52003), (317,52326), (317,52650), (317,52975), (317,53301)
// )
Note the _ = { ... } part. There is a limited number of things that can occur in a for-comprehension:
generators (the x <- things)
filters/guards (if-s)
value definitions
Here, we sort-of abuse the value-definition syntax to update the counter. We use the block { count = count + 1 } as the right-hand side of the definition. It returns Unit. Since we don't need the result of the block, we use _ as the left-hand side. This way, the block is executed once for every x.
EDIT-2
If mutating the counter is not your main goal, you can of course use zipWithIndex directly:
val xsWithCounter =
xs_generator.flatten.zipWithIndex.filter{x => prop(x._1)}
It gives almost the same result as the previous version, but the counts are off by one: zipWithIndex produces zero-based indices, not the number of tried x-s.
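If you want the counts to match the mutable-counter version, you can shift the index by one, for example:
val xsWithCounter =
  xs_generator.flatten.zipWithIndex
    .filter { case (x, _) => prop(x) }
    .map    { case (x, i) => (x, i + 1) }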

Why does iterating over multiple streams only iterate over the first element?

I've recently run into a bug in my code in which iterating over multiple streams causes them to iterate only through the first item. I converted my streams to buffers (I wasn't even aware that the implementation of the function I was calling returns a stream) and the problem was fixed. I found this hard to believe, so I created a minimal verifiable example:
def f(as: Seq[String], bs: Seq[String]): Unit =
for {
a <- as
b <- bs
} yield println((a, b))
val seq = Seq(1, 2, 3).map(_.toString)
f(seq, seq)
println()
val stream = Stream.iterate(1)(_ + 1).map(_.toString).take(3)
f(stream, stream)
A function that prints every combination of its inputs, and is invoked with the Seq [1, 2, 3] and the Stream [1, 2, 3].
The result with the seq is:
(1,1)
(1,2)
(1,3)
(2,1)
(2,2)
(2,3)
(3,1)
(3,2)
(3,3)
And the result with the stream is:
(1,1)
I've only been able to replicate this when iterating through multiple generators, iterating through a single stream seems to work fine.
So my questions are: why does this happen, and how can I avoid this kind of glitch? That is, short of using .toBuffer or .to[Vector] before every multi-generator iteration?
Thanks.
The manner in which you're using the for-comprehension (with the println in the yield) is a bit strange and probably not what you want to do. If you really just want to print out the entries, then just use foreach, which will force lazy sequences like Stream:
def f_strict(as: Seq[String], bs: Seq[String]): Unit = {
for {
a <- as
b <- bs
} println((a, b))
}
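Called with the Stream from the question, this forces both generators, so all nine pairs are printed:
val stream = Stream.iterate(1)(_ + 1).map(_.toString).take(3)
f_strict(stream, stream)
// (1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3)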
The reason you're getting the strange behavior with your f is that Streams are lazy: elements are only computed (and then memoized) as needed. Since you never use the Stream created by f (necessarily, because your f returns Unit), only the head ever gets computed, which is why you're seeing the single (1,1). If you instead have it return the sequence it generates (which will have type Seq[Unit]), i.e.
def f_new(as: Seq[String], bs: Seq[String]): Seq[Unit] = {
for {
a <- as
b <- bs
} yield println((a, b))
}
Then you'll get the following behavior which should hopefully help to elucidate what's going on:
val xs = Stream(1, 2, 3)
val result = f_new(xs.map(_.toString), xs.map(_.toString))
//prints out (1, 1) as a result of evaluating the head of the resulting Stream
result.foreach(aUnit => {})
//prints out the other elements as the rest of the entries of Stream are computed, i.e.
//(1,2)
//(1,3)
//(2,1)
//...
result.foreach(aUnit => {})
//probably won't print out anything because elements of Stream have been computed,
//memoized and probably don't need to be computed again at this point.
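A small, self-contained demo of that memoization (the nums name and values are just for illustration):
def nums(i: Int): Stream[Int] = { println(s"computing $i"); i * 10 } #:: nums(i + 1)

val s = nums(1)    // prints "computing 1" -- a Stream's head is evaluated strictly
s.take(3).toList   // prints "computing 2" and "computing 3", returns List(10, 20, 30)
s.take(3).toList   // prints nothing more: those elements are now memoized in s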

Getting an error trying to map through a list in Scala

I'm trying to print out all the factors of every number in a list.
Here is my code:
def main(args: Array[String])
{
val list_of_numbers = List(1,4,6)
def get_factors(list_of_numbers:List[Int]) : Int =
{
return list_of_numbers.foreach{(1 to _).filter {divisor => _ % divisor == 0}}
}
println(get_factors(list_of_numbers));
}
I want the end result to contain a single list that will hold all the numbers which are factors of any of the numbers in the list. So the final result should be (1,2,3,4,6). Right now, I get the following error:
error: missing parameter type for expanded function ((x$1) => 1.to(x$1))
return list_of_numbers.foreach{(1 to _).filter {divisor => _ % divisor == 0}}
How can I fix this?
Each _ in a function literal stands for a new, separate parameter, so you can't use the shorthand to refer to the same value twice, and it doesn't always expand at the scope you expect.
Try spelling it out instead:
list_of_numbers.foreach { n =>
(1 to n).filter { divisor => n % divisor == 0 }
}
This will compile.
There are other problems with your code though.
foreach returns Unit, but your method declares Int as its result type, for example.
Perhaps you wanted a .map rather than a .foreach, but that would still give you a List, not an Int.
A few things are wrong here.
First, foreach takes a function A => Unit as an argument, meaning that it's really just for causing side effects.
Second, regarding your use of _: you can only use _ when the function uses each argument exactly once.
Lastly, your expected output seems to get rid of duplicates (1 is a factor of all 3 inputs, but it appears only once).
list_of_numbers flatMap { i => (1 to i) filter {i % _ == 0 }} distinct
will do what you are looking for.
flatMap takes a function A => List[B] and produces a flattened List[B] as output; .distinct then gets rid of the duplicates.
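For the List(1, 4, 6) from the question this gives List(1, 2, 4, 3, 6); adding .sorted on the end yields the ordering the question asks for:
(list_of_numbers flatMap { i => (1 to i) filter { i % _ == 0 } }).distinct.sorted
// List(1, 2, 3, 4, 6)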
Actually, there are several problems with your code.
First, foreach is a method which yields Unit (like void in Java). You want to yield something, so you should use a for-comprehension.
Second, in your divisor-test function you've specified both the unnamed parameter (_) and the named parameter (divisor).
The third problem is that you expect the result to be an Int in the code but a List[Int] in your description.
The following code will do what you want (although it will repeat factors, so you might want to pass it through distinct before using the result):
def main(args: Array[String]) {
val list_of_numbers = List(1, 4, 6)
def get_factors(list_of_numbers: List[Int]) = for (n <- list_of_numbers; r = 1 to n; f <- r.filter(n%_ == 0)) yield f
println(get_factors(list_of_numbers))
}
Note that you need two generators ("<-") in the for comprehension in order that you end up with simply a List. If you instead implemented the filter part in the yield expression, you would get a List[List[Int]].
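For comparison, a sketch of the nested variant with a single generator and the filter inside the yield (the get_nested_factors name is just for illustration):
def get_nested_factors(list_of_numbers: List[Int]): List[List[Int]] =
  for (n <- list_of_numbers) yield (1 to n).filter(n % _ == 0).toList
// List(List(1), List(1, 2, 4), List(1, 2, 3, 6))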

Executing for comprehension in parallel

I have written this code
def getParallelList[T](list : List[T]) : ParSeq[T] = {
val parList = list.par
parList.tasksupport = new ForkJoinTaskSupport(new scala.concurrent.forkjoin.ForkJoinPool(10))
parList
}
for {
a <- getList1
b <- getList2
c = b.calculateSomething
d <- getParallelList(getList3)
} { ... }
I want to know if this is a good (or the best) way to make the for loop execute in parallel, or whether I should explicitly use futures inside the loop.
I tested this and it seemed to work, but I am not sure it is the best way. I am also worried about what happens to the values of a, b and c for the different threads processing d: if one thread finishes earlier, does it change the value of a, b and c for the others?
If getList3 is referentially transparent, i.e. it returns the same value every time it is called, it's a better idea to calculate it once, since invoking .par on a List has to turn it into a ParVector, which takes O(n) time (a List is a linked list and can't be converted to a Vector-like structure immediately). Here is an example:
val list3 = getParallelList(getList3)
for {
a <- getList1
b <- getList2
c = b.calculateSomething
d <- list3
} { ... }
In the for-comprehension, the values of (a, b, c) remain the same while the d values are being processed.
For best performance, you might consider making getList1 or getList2 parallel instead, depending on how evenly the work is split across the a/b/c values; see the sketch below.
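A sketch of that suggestion, reusing the names from the question (which generator to parallelise is a judgement call that depends on your workload):
val list3 = getParallelList(getList3)

for {
  a <- getParallelList(getList1)   // parallelise the outermost generator
  b <- getList2
  c = b.calculateSomething
  d <- list3
} { /* ... */ }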

Scala Stream processing

I wrote the following code in Scala using Streams:
def foo(x: Int) : Stream[Int] = {println("came inside"); (2 * x)} #:: foo(x + 1)
foo(1).takeWhile(_ < 6).toList.foreach(x => println(s"+++ $x"))
This works and produces the following output
came inside
came inside
came inside
+++ 2
+++ 4
but I wanted the processing to happen like
came inside
+++ 2
came inside
+++ 4
came inside
Basically I want to process one element at a time, until the termination condition of < 6 is met. I guess it's my toList call which first creates the whole list and only then processes it.
First, format your code in a more readable fashion. Then remove the toList: all it does is pull your entire stream into a single list, and doing so forces all the values to be calculated. Since every value is calculated at that point, the 'inside' printlns all execute before the 'outside' ones do. You want to store the stream your function builds (including its starting value) as a lazily evaluated value. This should work:
def river(x: Int) : Stream[Int] = {
println("Inside function.")
(2 * x) #:: river(x + 1)
}
lazy val creek = river(1)
creek.takeWhile(_ < 6).foreach(x => println(s"+++ $x"))
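With these definitions, the output should interleave as requested:
Inside function.
+++ 2
Inside function.
+++ 4
Inside function.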