Scala Futures with DB - postgresql

I'm writing code in scala/play with anorm/postgres for match generation based on users profiles. The following code works, but I've commented out the section that is causing problems, the while loop. I noticed while running it that the first 3 Futures seem to work synchronously but the problem comes when I'm retrieving the count of rows in the table in the fourth step.
The fourth step returns the count before the above insert's actually happened. As far as I can tell, steps 1-3 are being queued up for postgres synchronously, but the call to retrieve the count seems to return BEFORE the first 3 steps complete, which makes no sense to me. If the first 3 steps get queued up in the correct order, why wouldn't the fourth step wait to return the count until after the inserts happen?
When I uncomment the while loop, the match generation and insert functions are called until memory runs out, as the count returned is continually below the desired threshold.
I know the format itself is subpar, but my question is not about how to write the most elegant scala code, but merely how to get it to work for now.
def matchGeneration(email:String,itNum:Int) = {
var currentIterationNumber = itNum
var numberOfMatches = MatchData.numberOfCurrentMatches(email)
while(numberOfMatches < 150){
Thread.sleep(25000)//delay while loop execution time
generateUsers(email) onComplete {
case(s) => {
print(s">>>>>>>>>>>>>>>>>>>>>>>>>>>STEP 1")
Thread.sleep(5000)//Time for initial user generation to take place
genDemoMatches(email, currentIterationNumber) onComplete {
case (s) => {
print(s">>>>>>>>>>>>>>>>>>>>>>>>>>>STEP 2")
genIntMatches(email,currentIterationNumber) onComplete {
case(s) => {
print(s">>>>>>>>>>>>>>>>>>>>>>>>>>>STEP 3")
genSchoolWorkMatches(email,currentIterationNumber) onComplete {
case(s) => {
Thread.sleep(10000)
print(s">>>>>>>>>>>>>>>>>>>>>>>>>>>STEP 4")
incrementNumberOfMatches(email) onComplete {
case(s) => {
currentIterationNumber+=1
println(s"current number of matches: $numberOfMatches")
println(s"current Iteration: $currentIterationNumber")
}
}
}
}
}
}
}
}
}
}
//}
}
The match functions are defined as futures, such as :
def genSchoolWorkMatches(email:String,currentIterationNumber:Int):Future[Unit]=
Future(genUsersFromSchoolWorkData(email, currentIterationNumber))
genUsersFromSchoolWorkData(email:String) follows the same form as the other two. It is a function that initially gets all the school/work fields that a user has filled out in their profile ( SELECT major FROM school_work where email='$email') and it generates a dummyUser that contains one of those fields in common with this user of email:String. It would take about 30-40 lines of code to print this function so I can explain it further if need be.
I have edited my code, the only way I found so far to get this to work was by hacking it with Thread.sleep(). I think the problem may lie with anorm
as my Future logic constructs did work as I expected, but the problem lies in the inconsistency of when writes occur versus what the read returns. The numberOfCurrentMatches(email:String) function returns the number of matches as it is a simple SELECT count(email) from table where email='$email'. The problem is that sometimes after inserting 23 matches the count returns as 0, then after a second iteration it will return 46. I assumed that the onComplete() would bind to the underlying anorm function defined with DB.withConnection() but apparently it may be too far removed to accomplish this. I am not really sure at this point what to research or look up further to try to get around this problem, rather than writing a separate sort of supervisor function to return at a value closer to 150.
UPDATE
Thanks to the advice of user's here, and trying to understand Scala's documentation at this link: Scala Futures and Promises
I have updated my code to be a bit more readable and scala-esque:
def genMatchOfTypes(email:String,iterationNumber:Int) = {
genDemoMatches(email,iterationNumber)
genIntMatches(email,iterationNumber)
genSchoolWorkMatches(email,iterationNumber)
}
def matchGeneration(email:String) = {
var currentIterationNumber = 0
var numberOfMatches = MatchData.numberOfCurrentMatches(email)
while (numberOfMatches < 150) {
println(s"current number of matches: $numberOfMatches")
Thread.sleep(30000)
generateUsers(email)
.flatMap(users => genMatchOfTypes(email,currentIterationNumber))
.flatMap(matches => incrementNumberOfMatches(email))
.map{
result =>
currentIterationNumber += 1
println(s"current Iteration2: $currentIterationNumber")
numberOfMatches = MatchData.numberOfCurrentMatches(email)
println(s"current number of matches2: $numberOfMatches")
}
}
}
I still am heavily dependent upon the Thread.sleep(30000) to provide enough time to run through the while loop before it tries to loop back again. It's still an unwieldy hack. When I uncomment the Thread.sleep()
my output in bash looks like this:
users for match generation createdcurrent number of matches: 0
[error] c.MatchDataController - here is the list: jnkj
[error] c.MatchDataController - here is the list: hbhjbjjnkjn
current number of matches: 0
current number of matches: 0
current number of matches: 0
current number of matches: 0
current number of matches: 0
This of course is a truncated output. It runs like this over and over until I get errors about too many open files and the JVM/play server crashes entirely.

One solution is to use Future.traverse for known iteration count
Implying
object MatchData {
def numberOfCurrentMatches(email: String) = ???
}
def generateUsers(email: String): Future[Unit] = ???
def incrementNumberOfMatches(email: String): Future[Int] = ???
def genDemoMatches(email: String, it: Int): Future[Unit] = ???
def genIntMatches(email: String, it: Int): Future[Unit] = ???
def genSchoolWorkMatches(email: String, it: Int): Future[Unit] = ???
You can write code like
def matchGeneration(email: String, itNum: Int) = {
val numberOfMatches = MatchData.numberOfCurrentMatches(email)
Future.traverse(Stream.range(itNum, 150 - numberOfMatches + itNum)) { currentIterationNumber => for {
_ <- generateUsers(email)
_ = print(s">>>>>>>>>>>>>>>>>>>>>>>>>>>STEP 1")
_ <- genDemoMatches(email, currentIterationNumber)
_ = print(s">>>>>>>>>>>>>>>>>>>>>>>>>>>STEP 2")
_ <- genIntMatches(email, currentIterationNumber)
_ = print(s">>>>>>>>>>>>>>>>>>>>>>>>>>>STEP 3")
_ <- genSchoolWorkMatches(email, currentIterationNumber)
_ = Thread.sleep(15000)
_ = print(s">>>>>>>>>>>>>>>>>>>>>>>>>>>STEP 4")
numberOfMatches <- incrementNumberOfMatches(email)
_ = println(s"current number of matches: $numberOfMatches")
_ = println(s"current Iteration: $currentIterationNumber")
} yield ()
}
Update
If you urged to check some condition each time, one way is to use cool monadic things from scalaz library. It have definition of monad for scala.Future so we can replace word monadic with asynchronous when we want to
For example StreamT.unfoldM can create conditional monadic(asynchronous) loop, even if we don need elements of resulting collection we still can use it just for iteration.
Lets define your
def generateAll(email: String, iterationNumber: Int): Future[Unit] = for {
_ <- generateUsers(email)
_ <- genDemoMatches(email, iterationNumber)
_ <- genIntMatches(email, iterationNumber)
_ <- genSchoolWorkMatches(email, iterationNumber)
} yield ()
Then iteration step
def generateStep(email: String, limit: Int)(iterationNumber: Int): Future[Option[(Unit, Int)]] =
if (MatchData.numberOfCurrentMatches(email) >= limit) Future(None)
else for {
_ <- generateAll(email, iterationNumber)
_ <- incrementNumberOfMatches(email)
next = iterationNumber + 1
} yield Some((), next)
Now our resulting function simplifies to
import scalaz._
import scalaz.std.scalaFuture._
def matchGeneration(email: String, itNum: Int): Future[Unit] =
StreamT.unfoldM(0)(generateStep(email, 150) _).toStream.map(_.force: Unit)
It looks like synchronous method MatchData.numberOfCurrentMatches is reacting on your asynchronous modification inside the incrementNumberOfMatches. Note that generally it could lead to disastrous results and you probably need to move that state inside some actor or something like that

Related

Functional way of interrupting lazy iteration depedning on timeout and comparisson between previous and next, while, LazyList vs Stream

Background
I have the following scenario. I want to execute the method of a class from an external library, repeatedly, and I want to do so until a certain timeout condition and result condition (compared to the previous result) is met. Furthermore I want to collect the return values, even on the "failed" run (the run with the "failing" result condition that should interrupt further execution).
Thus far I have accomplished this with initializing an empty var result: Result, a var stop: Boolean and using a while loop that runs while the conditions are true and modifying the outer state. I would like to get rid of this and use a functional approach.
Some context. Each run is expected to run from 0 to 60 minutes and the total time of iteration is capped at 60 minutes. Theoretically, there's no bound to how many times it executes in this period but in practice, it's generally 2-60 times.
The problem is, the runs take a long time so I need to stop the execution. My idea is to use some kind of lazy Iterator or Stream coupled with scanLeft and Option.
Code
Boiler plate
This code isn't particularly relevant but used in my approach samples and provide identical but somewhat random pseudo runtime results.
import scala.collection.mutable.ListBuffer
import scala.util.Random
val r = Random
r.setSeed(1)
val sleepingTimes: Seq[Int] = (1 to 601)
.map(x => Math.pow(2, x).toInt * r.nextInt(100))
.toList
.filter(_ > 0)
.sorted
val randomRes = r.shuffle((0 to 600).map(x => r.nextInt(10)).toList)
case class Result(val a: Int, val slept: Int)
class Lib() {
def run(i: Int) = {
println(s"running ${i}")
Thread.sleep(sleepingTimes(i))
Result(randomRes(i), sleepingTimes(i))
}
}
case class Baz(i: Int, result: Result)
val lib = new Lib()
val timeout = 10 * 1000
While approach
val iteratorStart = System.currentTimeMillis()
val iterator = for {
i <- (0 to 600).iterator
if System.currentTimeMillis() < iteratorStart + timeout
f = Baz(i, lib.run(i))
} yield f
val iteratorBuffer = ListBuffer[Baz]()
if (iterator.hasNext) iteratorBuffer.append(iterator.next())
var run = true
while (run && iterator.hasNext) {
val next = iterator.next()
run = iteratorBuffer.last.result.a < next.result.a
iteratorBuffer.append(next)
}
Stream approach (Scala.2.12)
Full example
val streamStart = System.currentTimeMillis()
val stream = for {
i <- (0 to 600).toStream
if System.currentTimeMillis() < streamStart + timeout
} yield Baz(i, lib.run(i))
var last: Option[Baz] = None
val head = stream.headOption
val tail = if (stream.nonEmpty) stream.tail else stream
val streamVersion = (tail
.scanLeft((head, true))((x, y) => {
if (x._1.exists(_.result.a > y.result.a)) (Some(y), false)
else (Some(y), true)
})
.takeWhile {
case (baz, continue) =>
if (!baz.eq(head)) last = baz
continue
}
.map(_._1)
.toList :+ last).flatten
LazyList approach (Scala 2.13)
Full example
val lazyListStart = System.currentTimeMillis()
val lazyList = for {
i <- (0 to 600).to(LazyList)
if System.currentTimeMillis() < lazyListStart + timeout
} yield Baz(i, lib.run(i))
var last: Option[Baz] = None
val head = lazyList.headOption
val tail = if (lazyList.nonEmpty) lazyList.tail else lazyList
val lazyListVersion = (tail
.scanLeft((head, true))((x, y) => {
if (x._1.exists(_.result.a > y.result.a)) (Some(y), false)
else (Some(y), true)
})
.takeWhile {
case (baz, continue) =>
if (!baz.eq(head)) last = baz
continue
}
.map(_._1)
.toList :+ last).flatten
Result
Both approaches appear to yield the correct end result:
List(Baz(0,Result(4,170)), Baz(1,Result(5,208)))
and they interrupt execution as desired.
Edit: The desired outcome is to not execute the next iteration but still return the result of the iteration that caused the interruption. Thus the desired result is
List(Baz(0,Result(4,170)), Baz(1,Result(5,208)), Baz(2,Result(2,256))
and lib.run(i) should only run 3 times.
This is achieved by the while approach, as well as the LazyList approach but not the Stream approach which executes lib.run 4 times (Bad!).
Question
Is there another stateless approach, which is hopefully more elegant?
Edit
I realized my examples were faulty and not returning the "failing" result, which it should, and that they kept executing beyond the stop condition. I rewrote the code and examples but I believe the spirit of the question is the same.
I would use something higher level, like fs2.
(or any other high-level streaming library, like: monix observables, akka streams or zio zstreams)
def runUntilOrTimeout[F[_]: Concurrent: Timer, A](work: F[A], timeout: FiniteDuration)
(stop: (A, A) => Boolean): Stream[F, A] = {
val interrupt =
Stream.sleep_(timeout)
val run =
Stream
.repeatEval(work)
.zipWithPrevious
.takeThrough {
case (Some(p), c) if stop(p, c) => false
case _ => true
} map {
case (_, c) => c
}
run mergeHaltBoth interrupt
}
You can see it working here.

Cats Writer Vector is empty

I wrote this simple program in my attempt to learn how Cats Writer works
import cats.data.Writer
import cats.syntax.applicative._
import cats.syntax.writer._
import cats.instances.vector._
object WriterTest extends App {
type Logged2[A] = Writer[Vector[String], A]
Vector("started the program").tell
val output1 = calculate1(10)
val foo = new Foo()
val output2 = foo.calculate2(20)
val (log, sum) = (output1 + output2).pure[Logged2].run
println(log)
println(sum)
def calculate1(x : Int) : Int = {
Vector("came inside calculate1").tell
val output = 10 + x
Vector(s"Calculated value ${output}").tell
output
}
}
class Foo {
def calculate2(x: Int) : Int = {
Vector("came inside calculate 2").tell
val output = 10 + x
Vector(s"calculated ${output}").tell
output
}
}
The program works and the output is
> run-main WriterTest
[info] Compiling 1 Scala source to /Users/Cats/target/scala-2.11/classes...
[info] Running WriterTest
Vector()
50
[success] Total time: 1 s, completed Jan 21, 2017 8:14:19 AM
But why is the vector empty? Shouldn't it contain all the strings on which I used the "tell" method?
When you call tell on your Vectors, each time you create a Writer[Vector[String], Unit]. However, you never actually do anything with your Writers, you just discard them. Further, you call pure to create your final Writer, which simply creates a Writer with an empty Vector. You have to combine the writers together in a chain that carries your value and message around.
type Logged[A] = Writer[Vector[String], A]
val (log, sum) = (for {
_ <- Vector("started the program").tell
output1 <- calculate1(10)
foo = new Foo()
output2 <- foo.calculate2(20)
} yield output1 + output2).run
def calculate1(x: Int): Logged[Int] = for {
_ <- Vector("came inside calculate1").tell
output = 10 + x
_ <- Vector(s"Calculated value ${output}").tell
} yield output
class Foo {
def calculate2(x: Int): Logged[Int] = for {
_ <- Vector("came inside calculate2").tell
output = 10 + x
_ <- Vector(s"calculated ${output}").tell
} yield output
}
Note the use of for notation. The definition of calculate1 is really
def calculate1(x: Int): Logged[Int] = Vector("came inside calculate1").tell.flatMap { _ =>
val output = 10 + x
Vector(s"calculated ${output}").tell.map { _ => output }
}
flatMap is the monadic bind operation, which means it understands how to take two monadic values (in this case Writer) and join them together to get a new one. In this case, it makes a Writer containing the concatenation of the logs and the value of the one on the right.
Note how there are no side effects. There is no global state by which Writer can remember all your calls to tell. You instead make many Writers and join them together with flatMap to get one big one at the end.
The problem with your example code is that you're not using the result of the tell method.
If you take a look at its signature, you'll see this:
final class WriterIdSyntax[A](val a: A) extends AnyVal {
def tell: Writer[A, Unit] = Writer(a, ())
}
it is clear that tell returns a Writer[A, Unit] result which is immediately discarded because you didn't assign it to a value.
The proper way to use a Writer (and any monad in Scala) is through its flatMap method. It would look similar to this:
println(
Vector("started the program").tell.flatMap { _ =>
15.pure[Logged2].flatMap { i =>
Writer(Vector("ended program"), i)
}
}
)
The code above, when executed will give you this:
WriterT((Vector(started the program, ended program),15))
As you can see, both messages and the int are stored in the result.
Now this is a bit ugly, and Scala actually provides a better way to do this: for-comprehensions. For-comprehension are a bit of syntactic sugar that allows us to write the same code in this way:
println(
for {
_ <- Vector("started the program").tell
i <- 15.pure[Logged2]
_ <- Vector("ended program").tell
} yield i
)
Now going back to your example, what I would recommend is for you to change the return type of compute1 and compute2 to be Writer[Vector[String], Int] and then try to make your application compile using what I wrote above.

Efficient way to fold list in scala, while avoiding allocations and vars

I have a bunch of items in a list, and I need to analyze the content to find out how many of them are "complete". I started out with partition, but then realized that I didn't need to two lists back, so I switched to a fold:
val counts = groupRows.foldLeft( (0,0) )( (pair, row) =>
if(row.time == 0) (pair._1+1,pair._2)
else (pair._1, pair._2+1)
)
but I have a lot of rows to go through for a lot of parallel users, and it is causing a lot of GC activity (assumption on my part...the GC could be from other things, but I suspect this since I understand it will allocate a new tuple on every item folded).
for the time being, I've rewritten this as
var complete = 0
var incomplete = 0
list.foreach(row => if(row.time != 0) complete += 1 else incomplete += 1)
which fixes the GC, but introduces vars.
I was wondering if there was a way of doing this without using vars while also not abusing the GC?
EDIT:
Hard call on the answers I've received. A var implementation seems to be considerably faster on large lists (like by 40%) than even a tail-recursive optimized version that is more functional but should be equivalent.
The first answer from dhg seems to be on-par with the performance of the tail-recursive one, implying that the size pass is super-efficient...in fact, when optimized it runs very slightly faster than the tail-recursive one on my hardware.
The cleanest two-pass solution is probably to just use the built-in count method:
val complete = groupRows.count(_.time == 0)
val counts = (complete, groupRows.size - complete)
But you can do it in one pass if you use partition on an iterator:
val (complete, incomplete) = groupRows.iterator.partition(_.time == 0)
val counts = (complete.size, incomplete.size)
This works because the new returned iterators are linked behind the scenes and calling next on one will cause it to move the original iterator forward until it finds a matching element, but it remembers the non-matching elements for the other iterator so that they don't need to be recomputed.
Example of the one-pass solution:
scala> val groupRows = List(Row(0), Row(1), Row(1), Row(0), Row(0)).view.map{x => println(x); x}
scala> val (complete, incomplete) = groupRows.iterator.partition(_.time == 0)
Row(0)
Row(1)
complete: Iterator[Row] = non-empty iterator
incomplete: Iterator[Row] = non-empty iterator
scala> val counts = (complete.size, incomplete.size)
Row(1)
Row(0)
Row(0)
counts: (Int, Int) = (3,2)
I see you've already accepted an answer, but you rightly mention that that solution will traverse the list twice. The way to do it efficiently is with recursion.
def counts(xs: List[...], complete: Int = 0, incomplete: Int = 0): (Int,Int) =
xs match {
case Nil => (complete, incomplete)
case row :: tail =>
if (row.time == 0) counts(tail, complete + 1, incomplete)
else counts(tail, complete, incomplete + 1)
}
This is effectively just a customized fold, except we use 2 accumulators which are just Ints (primitives) instead of tuples (reference types). It should also be just as efficient a while-loop with vars - in fact, the bytecode should be identical.
Maybe it's just me, but I prefer using the various specialized folds (.size, .exists, .sum, .product) if they are available. I find it clearer and less error-prone than the heavy-duty power of general folds.
val complete = groupRows.view.filter(_.time==0).size
(complete, groupRows.length - complete)
How about this one? No import tax.
import scala.collection.generic.CanBuildFrom
import scala.collection.Traversable
import scala.collection.mutable.Builder
case class Count(n: Int, total: Int) {
def not = total - n
}
object Count {
implicit def cbf[A]: CanBuildFrom[Traversable[A], Boolean, Count] = new CanBuildFrom[Traversable[A], Boolean, Count] {
def apply(): Builder[Boolean, Count] = new Counter
def apply(from: Traversable[A]): Builder[Boolean, Count] = apply()
}
}
class Counter extends Builder[Boolean, Count] {
var n = 0
var ttl = 0
override def +=(b: Boolean) = { if (b) n += 1; ttl += 1; this }
override def clear() { n = 0 ; ttl = 0 }
override def result = Count(n, ttl)
}
object Counting extends App {
val vs = List(4, 17, 12, 21, 9, 24, 11)
val res: Count = vs map (_ % 2 == 0)
Console println s"${vs} have ${res.n} evens out of ${res.total}; ${res.not} were odd."
val res2: Count = vs collect { case i if i % 2 == 0 => i > 10 }
Console println s"${vs} have ${res2.n} evens over 10 out of ${res2.total}; ${res2.not} were smaller."
}
OK, inspired by the answers above, but really wanting to only pass over the list once and avoid GC, I decided that, in the face of a lack of direct API support, I would add this to my central library code:
class RichList[T](private val theList: List[T]) {
def partitionCount(f: T => Boolean): (Int, Int) = {
var matched = 0
var unmatched = 0
theList.foreach(r => { if (f(r)) matched += 1 else unmatched += 1 })
(matched, unmatched)
}
}
object RichList {
implicit def apply[T](list: List[T]): RichList[T] = new RichList(list)
}
Then in my application code (if I've imported the implicit), I can write var-free expressions:
val (complete, incomplete) = groupRows.partitionCount(_.time != 0)
and get what I want: an optimized GC-friendly routine that prevents me from polluting the rest of the program with vars.
However, I then saw Luigi's benchmark, and updated it to:
Use a longer list so that multiple passes on the list were more obvious in the numbers
Use a boolean function in all cases, so that we are comparing things fairly
http://pastebin.com/2XmrnrrB
The var implementation is definitely considerably faster, even though Luigi's routine should be identical (as one would expect with optimized tail recursion). Surprisingly, dhg's dual-pass original is just as fast (slightly faster if compiler optimization is on) as the tail-recursive one. I do not understand why.
It is slightly tidier to use a mutable accumulator pattern, like so, especially if you can re-use your accumulator:
case class Accum(var complete = 0, var incomplete = 0) {
def inc(compl: Boolean): this.type = {
if (compl) complete += 1 else incomplete += 1
this
}
}
val counts = groupRows.foldLeft( Accum() ){ (a, row) => a.inc( row.time == 0 ) }
If you really want to, you can hide your vars as private; if not, you still are a lot more self-contained than the pattern with vars.
You could just calculate it using the difference like so:
def counts(groupRows: List[Row]) = {
val complete = groupRows.foldLeft(0){ (pair, row) =>
if(row.time == 0) pair + 1 else pair
}
(complete, groupRows.length - complete)
}

How to yield a single element from for loop in scala?

Much like this question:
Functional code for looping with early exit
Say the code is
def findFirst[T](objects: List[T]):T = {
for (obj <- objects) {
if (expensiveFunc(obj) != null) return /*???*/ Some(obj)
}
None
}
How to yield a single element from a for loop like this in scala?
I do not want to use find, as proposed in the original question, i am curious about if and how it could be implemented using the for loop.
* UPDATE *
First, thanks for all the comments, but i guess i was not clear in the question. I am shooting for something like this:
val seven = for {
x <- 1 to 10
if x == 7
} return x
And that does not compile. The two errors are:
- return outside method definition
- method main has return statement; needs result type
I know find() would be better in this case, i am just learning and exploring the language. And in a more complex case with several iterators, i think finding with for can actually be usefull.
Thanks commenters, i'll start a bounty to make up for the bad posing of the question :)
If you want to use a for loop, which uses a nicer syntax than chained invocations of .find, .filter, etc., there is a neat trick. Instead of iterating over strict collections like list, iterate over lazy ones like iterators or streams. If you're starting with a strict collection, make it lazy with, e.g. .toIterator.
Let's see an example.
First let's define a "noisy" int, that will show us when it is invoked
def noisyInt(i : Int) = () => { println("Getting %d!".format(i)); i }
Now let's fill a list with some of these:
val l = List(1, 2, 3, 4).map(noisyInt)
We want to look for the first element which is even.
val r1 = for(e <- l; val v = e() ; if v % 2 == 0) yield v
The above line results in:
Getting 1!
Getting 2!
Getting 3!
Getting 4!
r1: List[Int] = List(2, 4)
...meaning that all elements were accessed. That makes sense, given that the resulting list contains all even numbers. Let's iterate over an iterator this time:
val r2 = (for(e <- l.toIterator; val v = e() ; if v % 2 == 0) yield v)
This results in:
Getting 1!
Getting 2!
r2: Iterator[Int] = non-empty iterator
Notice that the loop was executed only up to the point were it could figure out whether the result was an empty or non-empty iterator.
To get the first result, you can now simply call r2.next.
If you want a result of an Option type, use:
if(r2.hasNext) Some(r2.next) else None
Edit Your second example in this encoding is just:
val seven = (for {
x <- (1 to 10).toIterator
if x == 7
} yield x).next
...of course, you should be sure that there is always at least a solution if you're going to use .next. Alternatively, use headOption, defined for all Traversables, to get an Option[Int].
You can turn your list into a stream, so that any filters that the for-loop contains are only evaluated on-demand. However, yielding from the stream will always return a stream, and what you want is I suppose an option, so, as a final step you can check whether the resulting stream has at least one element, and return its head as a option. The headOption function does exactly that.
def findFirst[T](objects: List[T], expensiveFunc: T => Boolean): Option[T] =
(for (obj <- objects.toStream if expensiveFunc(obj)) yield obj).headOption
Why not do exactly what you sketched above, that is, return from the loop early? If you are interested in what Scala actually does under the hood, run your code with -print. Scala desugares the loop into a foreach and then uses an exception to leave the foreach prematurely.
So what you are trying to do is to break out a loop after your condition is satisfied. Answer here might be what you are looking for. How do I break out of a loop in Scala?.
Overall, for comprehension in Scala is translated into map, flatmap and filter operations. So it will not be possible to break out of these functions unless you throw an exception.
If you are wondering, this is how find is implemented in LineerSeqOptimized.scala; which List inherits
override /*IterableLike*/
def find(p: A => Boolean): Option[A] = {
var these = this
while (!these.isEmpty) {
if (p(these.head)) return Some(these.head)
these = these.tail
}
None
}
This is a horrible hack. But it would get you the result you wished for.
Idiomatically you'd use a Stream or View and just compute the parts you need.
def findFirst[T](objects: List[T]): T = {
def expensiveFunc(o : T) = // unclear what should be returned here
case class MissusedException(val data: T) extends Exception
try {
(for (obj <- objects) {
if (expensiveFunc(obj) != null) throw new MissusedException(obj)
})
objects.head // T must be returned from loop, dummy
} catch {
case MissusedException(obj) => obj
}
}
Why not something like
object Main {
def main(args: Array[String]): Unit = {
val seven = (for (
x <- 1 to 10
if x == 7
) yield x).headOption
}
}
Variable seven will be an Option holding Some(value) if value satisfies condition
I hope to help you.
I think ... no 'return' impl.
object TakeWhileLoop extends App {
println("first non-null: " + func(Seq(null, null, "x", "y", "z")))
def func[T](seq: Seq[T]): T = if (seq.isEmpty) null.asInstanceOf[T] else
seq(seq.takeWhile(_ == null).size)
}
object OptionLoop extends App {
println("first non-null: " + func(Seq(null, null, "x", "y", "z")))
def func[T](seq: Seq[T], index: Int = 0): T = if (seq.isEmpty) null.asInstanceOf[T] else
Option(seq(index)) getOrElse func(seq, index + 1)
}
object WhileLoop extends App {
println("first non-null: " + func(Seq(null, null, "x", "y", "z")))
def func[T](seq: Seq[T]): T = if (seq.isEmpty) null.asInstanceOf[T] else {
var i = 0
def obj = seq(i)
while (obj == null)
i += 1
obj
}
}
objects iterator filter { obj => (expensiveFunc(obj) != null } next
The trick is to get some lazy evaluated view on the colelction, either an iterator or a Stream, or objects.view. The filter will only execute as far as needed.

Functional Alternative to Game Loop

I'm just starting out with the Scala and am trying a little toy program - in this case a text based TicTacToe. I wrote a working version based on what I know about scala, but noticed it was mostly imperative and my classes were mutable.
I'm going through and trying to implement some functional idioms and have managed to at least make the classes representing the game state immutable. However, I'm left with a class responsible for performing the game loop relying on mutable state and imperative loop as follows:
var board: TicTacToeBoard = new TicTacToeBoard
def start() {
var gameState: GameState = new XMovesNext
outputState(gameState)
while (!gameState.isGameFinished) {
val position: Int = getSelectionFromUser
board = board.updated(position, gameState.nextTurn)
gameState = getGameState(board)
outputState(gameState)
}
}
What would be a more idiomatic way to program what I'm doing imperatively in this loop?
Full source code is here https://github.com/whaley/TicTacToe-in-Scala/tree/master/src/main/scala/com/jasonwhaley/tictactoe
imho for Scala, the imperative loop is just fine. You can always write a recursive function to behave like a loop. I also threw in some pattern matching.
def start() {
def loop(board: TicTacToeBoard) = board.state match {
case Finished => Unit
case Unfinished(gameState) => {
gameState.output()
val position: Int = getSelectionFromUser()
loop(board.updated(position))
}
}
loop(new TicTacToeBoard)
}
Suppose we had a function whileSome : (a -> Option[a]) a -> (), which runs the input function until its result is None. That would strip away a little boilerplate.
def start() {
def step(board: TicTacToeBoard) = {
board.gameState.output()
val position: Int = getSelectionFromUser()
board.updated(position) // returns either Some(nextBoard) or None
}
whileSome(step, new TicTacToeBoard)
}
whileSome should be trivial to write; it is simply an abstraction of the former pattern. I'm not sure if it's in any common Scala libs, but in Haskell you could grab whileJust_ from monad-loops.
You could implement it as a recursive method. Here's an unrelated example:
object Guesser extends App {
val MIN = 1
val MAX = 100
readLine("Think of a number between 1 and 100. Press enter when ready")
def guess(max: Int, min: Int) {
val cur = (max + min) / 2
readLine("Is the number "+cur+"? (y/n) ") match {
case "y" => println("I thought so")
case "n" => {
def smallerGreater() {
readLine("Is it smaller or greater? (s/g) ") match {
case "s" => guess(cur - 1, min)
case "g" => guess(max, cur + 1)
case _ => smallerGreater()
}
}
smallerGreater()
}
case _ => {
println("Huh?")
guess(max, min)
}
}
}
guess(MAX, MIN)
}
How about something like:
Stream.continually(processMove).takeWhile(!_.isGameFinished)
where processMove is a function that gets selection from user, updates board and returns new state.
I'd go with the recursive version, but here's a proper implementation of the Stream version:
var board: TicTacToeBoard = new TicTacToeBoard
def start() {
def initialBoard: TicTacToeBoard = new TicTacToeBoard
def initialGameState: GameState = new XMovesNext
def gameIterator = Stream.iterate(initialBoard -> initialGameState) _
def game: Stream[GameState] = {
val (moves, end) = gameIterator {
case (board, gameState) =>
val position: Int = getSelectionFromUser
val updatedBoard = board.updated(position, gameState.nextTurn)
(updatedBoard, getGameState(board))
}.span { case (_, gameState) => !gameState.isGameFinished }
(moves ::: end.take(1)) map { case (_, gameState) => gameState }
}
game foreach outputState
}
This looks weirder than it should. Ideally, I'd use takeWhile, and then map it afterwards, but it won't work as the last case would be left out!
If the moves of the game could be discarded, then dropWhile followed by head would work. If I had the side effect (outputState) instead the Stream, I could go that route, but having side-effect inside a Stream is way worse than a var with a while loop.
So, instead, I use span which gives me both takeWhile and dropWhile but forces me to save the intermediate results -- which can be real bad if memory is a concern, as the whole game will be kept in memory because moves points to the head of the Stream. So I had to encapsulate all that inside another method, game. That way, when I foreach through the results of game, there won't be anything pointing to the Stream's head.
Another alternative would be to get rid of the other side effect you have: getSelectionFromUser. You can get rid of that with an Iteratee, and then you can save the last move and reapply it.
OR... you could write yourself a takeTo method and use that.