Substitute while loop with functional code - scala

I am refactoring some Scala code and I am having problems with a while loop. The old code was:
for (s <- sentences){
// ...
while (/*Some condition*/){
// ...
function(trees, ...)
}
}
I have translated that code into the following, using foldLeft to traverse sentences:
sentences./:(initialSeed){
(seed, s) =>
// ...
// Here I've replaced the while with another foldLeft
trees./:(seed){
(v, n) =>
// ....
val updatedVariable = function(...., v)
}
}
Now, I may need to stop traversing trees (the inner foldLeft) before it has been traversed entirely. For that I've found this question:
Abort early in a fold
But I also have the following problem:
As I traverse trees, I need to accumulate values into the variable v. function takes v and returns an updated v, called updatedVariable here. I have the feeling that this is not a proper way of coding this functionality.
Could you recommend a functional/immutable way of doing this?
NOTE: I've simplified the code to show the actual problem, the complete code is this:
val trainVocabulart = sentences./:(Vocabulary()){
(vocab, s) =>
var trees = s.tree
var i = 0
var noConstruction = false
trees./:(vocab){
(v, n) =>
if (i == trees.size - 1) {
if (noConstruction) return v
noConstruction = true
i = 0
} else {
// Build vocabulary
val updatedVocab = buildVocabulary(trees, v, i, Config.LeftCtx, Config.RightCtx)
val y = estimateTrainAction(trees, i)
val (newI, newTrees) = takeAction(trees, i, y)
i = newI
trees = newTrees
// Execute the action and modify the trees
if (y != Shift)
noConstruction = false
Vocabulary(v.positionVocab ++ updatedVocab.positionVocab,
v.positionTag ++ updatedVocab.positionTag,
v.chLVocab ++ updatedVocab.chLVocab,
v.chLTag ++ updatedVocab.chLTag,
v.chRVocab ++ updatedVocab.chRVocab,
v.chRTag ++ updatedVocab.chRTag)
}
v
}
}
And the old one:
for (s <- sentences) {
var trees = s.tree
var i = 0
var noConstruction = false
var exit = false
while (trees.nonEmpty && !exit) {
if (i == trees.size - 1) {
if (noConstruction) exit = true
noConstruction = true
i = 0
} else {
// Build vocabulary
buildVocabulary(trees, i, LeftCtx, RightCtx)
val y = estimateTrainAction(trees, i)
val (newI, newTrees) = takeAction(trees, i, y)
i = newI
trees = newTrees
// Execute the action and modify the trees
if (y != Shift)
noConstruction = false
}
}
}

1st - You don't make this easy. Neither your simplified nor your complete example is complete enough to compile.
2nd - You include a link to some reasonable solutions to the break-out-early problem. Is there a reason why none of them look workable for your situation?
3rd - Does that complete example actually work? You're folding over a var ...
trees./:(vocab){
... and inside that operation you modify/update that var ...
trees = newTrees
According to my tests that's a meaningless statement. The original iteration is unchanged by updating the collection.
4th - I'm not convinced that fold is what you want here. fold iterates over a collection and reduces it to a single value, but your aim here doesn't appear to be finding that single value. The result of your /: is thrown away. There is no val result = trees./:(vocab){...
One solution you might look at is: trees.forall{ ... At the end of each iteration you just return true if the next iteration should proceed.
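Another way to keep the early exit without a fold is to carry the loop variables forward as explicit state in a tail-recursive function. The following is only a sketch under assumptions: State, step and run are hypothetical names, and the list of integers stands in for the question's trees and vocabulary.

```scala
import scala.annotation.tailrec

object LoopState {
  // Hypothetical state: everything the while loop used to mutate.
  final case class State(items: List[Int], acc: Int, done: Boolean)

  // One iteration of the old loop body, returning a new State instead of mutating.
  def step(s: State): State = s.items match {
    case Nil       => s.copy(done = true)                  // nothing left: signal exit
    case h :: tail => State(tail, s.acc + h, done = false) // consume one item
  }

  // Replaces `while (!exit)`: recurse until the state says stop.
  @tailrec
  def run(s: State): State =
    if (s.done) s else run(step(s))
}
```

The exit condition lives in the state itself, so no var, return or exception is needed to stop early.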

Related

Is there any way to replace nested For loop with Higher order methods in scala

I have a MutableList and want to take the sum of all of its rows and replace its rows with other values based on some criteria. The code below works fine for me, but I want to ask whether there is any way to get rid of the nested for loops, as for loops slow down performance. I want to use Scala higher-order methods instead of nested for loops. I tried the foldLeft() higher-order method to replace a single for loop, but could not work out how to replace the nested for loop.
def func(nVect : Int , nDim : Int) : Unit = {
var Vector = MutableList.fill(nVect,nDimn)(math.random)
var V1Res =0.0
var V2Res =0.0
var V3Res =0.0
for(i<- 0 to nVect -1) {
for (j <- i +1 to nVect -1) {
var resultant = Vector(i).zip(Vector(j)).map{case (x,y) => x + y}
V1Res = choice(Vector(i))
V2Res = choice(Vector(j))
V3Res = choice(resultant)
if(V3Res > V1Res){
Vector(i) = res
}
if(V3Res > V2Res){
Vector(j) = res
}
}
}
}
There are no "for loops" in this code; the for statements are already converted to foreach calls by the compiler, so it is already using higher-order methods. These foreach calls could be written out explicitly, but it would make no difference to the performance.
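To illustrate the point, here is a small sketch showing a for statement with two generators next to the nested foreach calls it corresponds to. The object and method names are made up for this example.

```scala
import scala.collection.mutable.ListBuffer

object ForDesugar {
  // Nested generators written as a for statement.
  def viaFor(n: Int): List[(Int, Int)] = {
    val buf = ListBuffer.empty[(Int, Int)]
    for {
      i <- 0 until n
      j <- i + 1 until n
    } buf += ((i, j))
    buf.toList
  }

  // Roughly what the compiler generates: nested foreach calls.
  def viaForeach(n: Int): List[(Int, Int)] = {
    val buf = ListBuffer.empty[(Int, Int)]
    (0 until n).foreach { i =>
      (i + 1 until n).foreach { j =>
        buf += ((i, j))
      }
    }
    buf.toList
  }
}
```

Both methods visit the same index pairs in the same order.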
Making the code compile and then cleaning it up gives this:
def func(nVect: Int, nDim: Int): Unit = {
val vector = Array.fill(nVect, nDim)(math.random)
for {
i <- 0 until nVect
j <- i + 1 until nVect
} {
val res = vector(i).zip(vector(j)).map { case (x, y) => x + y }
val v1Res = choice(vector(i))
val v2Res = choice(vector(j))
val v3Res = choice(res)
if (v3Res > v1Res) {
vector(i) = res
}
if (v3Res > v2Res) {
vector(j) = res
}
}
}
Note that using a single for does not make any difference to the result; it just looks better!
At this point it gets difficult to make further improvements. The only parallelism possible is with the inner map call, but vectorising this is almost certainly a better option. If choice is expensive then the results could be cached, but this cache needs to be updated when vector is updated.
If the choice could be done in a second pass after all the cross-sums have been calculated then it would be much more parallelisable, but clearly that would also change the results.
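The caching idea mentioned above could look something like the sketch below. It assumes a hypothetical choice function (here simply the sum of the row, since the question does not show the real one) and keeps one cached score per row, refreshing it whenever the row is replaced.

```scala
object CachedChoice {
  // Hypothetical scoring function; the real `choice` is not shown in the question.
  def choice(v: Array[Double]): Double = v.sum

  def func(vector: Array[Array[Double]]): Unit = {
    // Cache choice(row) for every row once, up front.
    val scores = vector.map(choice)
    for {
      i <- vector.indices
      j <- i + 1 until vector.length
    } {
      val res = vector(i).zip(vector(j)).map { case (x, y) => x + y }
      val s = choice(res)
      // Whenever a row is replaced, its cached score is replaced too.
      if (s > scores(i)) { vector(i) = res; scores(i) = s }
      if (s > scores(j)) { vector(j) = res; scores(j) = s }
    }
  }
}
```

This calls choice once per pair instead of three times, at the cost of keeping the cache in step with vector.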

Get First nonrecurring element in a list using scala

Getting a compilation error - forward reference extends over definition of value lst:
val lt = List(1,2,3,3,2,4,5,1,5,7,8,7)
var cond = false
do
{
var cond = if (lt.tail contains lt.head) true else false
if (cond == true) {
val lst : List[Int]= lt.filter(_!=lt.head)
val lt = lst
}
else {
println(lt.head)
}
}
while(cond == false)
You can implement "Get first" using find and you can implement "non-recurring" using count == 1 so the code is
lt.find(x => lt.count(_ == x) == 1)
This will return an Option[Int] that can be unpicked in the usual way.
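For example, the Option can be handled with pattern matching or with getOrElse. This is a small sketch using the list from the question; the names describe and orDefault, and the fallback value -1, are arbitrary choices for illustration.

```scala
object FirstUnique {
  val lt = List(1, 2, 3, 3, 2, 4, 5, 1, 5, 7, 8, 7)

  // First element that occurs exactly once in the whole list.
  val first: Option[Int] = lt.find(x => lt.count(_ == x) == 1)

  // Pattern matching handles both cases explicitly.
  def describe(o: Option[Int]): String = o match {
    case Some(v) => s"first unique element: $v"
    case None    => "no unique element"
  }

  // getOrElse supplies a fallback value (-1 is an arbitrary choice here).
  val orDefault: Int = first.getOrElse(-1)
}
```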
This algorithm is clear but not efficient, so for a very long list you might want to pre-compute the count, or use a recursive function to implement your original algorithm. This would be less clear but more efficient, so avoid it unless you can prove that the inefficiency is causing a problem.
Update
Here is an example of pre-computing the count for each value. This is potentially faster for long lists because Map operations are typically O(log n) or better (effectively constant time for the hash-based mutable.Map used here), so the function is at most O(n log n) rather than O(n^2) for the previous version.
import scala.collection.mutable

def firstUniq[A](in: Seq[A]): Option[A] = {
val m = mutable.Map.empty[A, Int]
for (elem <- in) {
m.update(elem, m.getOrElseUpdate(elem, 0) + 1)
}
val singles = m.filter(_._2 == 1)
in.find(singles.contains)
}
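The same pre-counting idea can also be written without a mutable map. This is a sketch of an alternative, not the answer's own code; it uses groupBy to build an immutable count table first.

```scala
object FirstUniqImmutable {
  def firstUniq[A](in: Seq[A]): Option[A] = {
    // Count occurrences of each value once, up front.
    val counts: Map[A, Int] =
      in.groupBy(identity).map { case (k, v) => k -> v.size }
    // Scan in original order for the first value seen exactly once.
    in.find(x => counts(x) == 1)
  }
}
```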
first non recurring element in whole list
You can use filter and count:
val firstNonRecurringValue = lt.filter(x => lt.count(_ == x) == 1)(0)
so firstNonRecurringValue is 4
first non recurring element in the list after the element
But looking at your do while code, it seems that you are trying to print the first element that does not recur after its own position. For that, the following code should work:
val firstNonRecurringValue = lt.zipWithIndex.filter(x => lt.drop(x._2).count(_ == x._1) == 1)(0)._1
Now firstNonRecurringValue should be 3

Scala - avoid use mutable variables

I have a function that performs a calculation, but I'm using a var to receive the value of a recursive function, and I would like to avoid mutable variables.
def scoreNode(node:Option[Tree], score:Double = 0, depth:Int = 0):Double = {
node.map(n => {
var points = score
n.children.filter(n => n.valid == Some(true)).foreach(h => {
points = scoreNode(Some(h), 10, depth+1)
})
points
}).getOrElse(score)
}
How can I rewrite this piece of code without a mutable variable? I've tried
What you are essentially doing is summing something over all the nodes in a tree. Try to write more idiomatic code, like this:
def scoreNode(node:Option[Tree], depth:Int = 0):Double =
(for {
n <- node.toList // an Option generator cannot be followed directly by a collection generator
h <- n.children
if h.valid == Some(true)
res = scoreNode(Some(h), depth + 1) + scala.math.pow(0.8, depth)
} yield res).sum
I do not guarantee this works completely. It is your homework to make it right.
You can use a fold. Because the accumulator is a Double rather than a Tree, this needs foldLeft rather than fold:
def scoreNode(node:Option[Tree], score:Double = 0, depth:Int = 0):Double =
node
.map(_.children.filter(n => n.valid == Some(true)).foldLeft(score)((acc, h) => scoreNode(Some(h), acc + scala.math.pow(0.8, depth), depth + 1)))
.getOrElse(score)
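To try the fold-based version in isolation, here is a self-contained sketch. The Tree case class is an assumption, since the question does not show the real class; only the valid and children members used above are modelled.

```scala
object ScoreDemo {
  // Hypothetical Tree shape; the question does not show the real class.
  final case class Tree(valid: Option[Boolean], children: List[Tree] = Nil)

  def scoreNode(node: Option[Tree], score: Double = 0, depth: Int = 0): Double =
    node
      .map { n =>
        n.children
          .filter(_.valid.contains(true)) // keep only children marked valid
          // Thread the running score through each valid child in turn.
          .foldLeft(score)((acc, h) => scoreNode(Some(h), acc + math.pow(0.8, depth), depth + 1))
      }
      .getOrElse(score)
}
```

With a root that has two valid leaf children and one invalid one, each valid child adds 0.8^0 = 1.0, so the result is 2.0.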

workaround for prepending to a LinkedHashMap in Scala?

I have a LinkedHashMap which I've been using in a typical way: adding new key-value
pairs to the end, and accessing them in order of insertion. However, now I have a
special case where I need to add pairs to the "head" of the map. I think there's
some functionality inside the LinkedHashMap source for doing this, but it has private
accessibility.
I have a solution where I create a new map, add the pair, then add all the old mappings.
In Java syntax:
newMap.put(newKey, newValue)
newMap.putAll(this.map)
this.map = newMap
It works. But the problem here is that I then need to make my main data structure
(this.map) a var rather than a val.
Can anyone think of a nicer solution? Note that I definitely need the fast lookup
functionality provided by a Map collection. The performance of prepending is not
such a big deal.
More generally, as a Scala developer how hard would you fight to avoid a var
in a case like this, assuming there's no foreseeable need for concurrency?
Would you create your own version of LinkedHashMap? Looks like a hassle frankly.
This will work but is not especially nice either:
import scala.collection.mutable.LinkedHashMap
def prepend[K,V](map: LinkedHashMap[K,V], kv: (K, V)) = {
val copy = map.toList // a List keeps insertion order; an immutable Map may not for larger maps
map.clear
map += kv
map ++= copy
}
val map = LinkedHashMap('b -> 2)
prepend(map, 'a -> 1)
map == LinkedHashMap('a -> 1, 'b -> 2)
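Since the whole point is iteration order, it is worth checking the order with more than a handful of entries. The sketch below is a variant of the copy-and-rebuild approach, written under the assumption that copying through a List is acceptable; it also drops any existing entry for the prepended key so the key really ends up first.

```scala
import scala.collection.mutable.LinkedHashMap

object PrependDemo {
  def prepend[K, V](map: LinkedHashMap[K, V], kv: (K, V)): Unit = {
    // Copy the existing entries in insertion order, minus any old entry for this key.
    val rest = map.toList.filterNot(_._1 == kv._1)
    map.clear()
    map += kv   // the new pair goes in first
    map ++= rest // then everything else, in the original order
  }
}
```

For example, prepending "a" -> 1 to a map holding "b" through "f" yields the keys a, b, c, d, e, f in that order.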
Have you taken a look at the code of LinkedHashMap? The class has a field firstEntry, and just by taking a quick peek at updateLinkedEntries, it should be relatively easy to create a subclass of LinkedHashMap which only adds a new method prepend and updateLinkedEntriesPrepend resulting in the behavior you need, e.g. (not tested):
private def updateLinkedEntriesPrepend(e: Entry) {
if (firstEntry == null) { firstEntry = e; lastEntry = e }
else {
val oldFirstEntry = firstEntry
firstEntry = e
firstEntry.later = oldFirstEntry
oldFirstEntry.earlier = e
}
}
Here is a sample implementation I threw together real quick (that is, not thoroughly tested!):
class MyLinkedHashMap[A, B] extends LinkedHashMap[A,B] {
def prepend(key: A, value: B): Option[B] = {
val e = findEntry(key)
if (e == null) {
val e = new Entry(key, value)
addEntry(e)
updateLinkedEntriesPrepend(e)
None
} else {
// The key already exists, so we might as well call LinkedHashMap#put
put(key, value)
}
}
private def updateLinkedEntriesPrepend(e: Entry) {
if (firstEntry == null) { firstEntry = e; lastEntry = e }
else {
val oldFirstEntry = firstEntry
firstEntry = e
firstEntry.later = oldFirstEntry
oldFirstEntry.earlier = firstEntry
}
}
}
Tested like this:
object Main {
def main(args:Array[String]) {
val x = new MyLinkedHashMap[String, Int]();
x.prepend("foo", 5)
x.prepend("bar", 6)
x.prepend("olol", 12)
x.foreach(x => println("x:" + x._1 + " y: " + x._2 ));
}
}
Which, on Scala 2.9.0 (yeah, need to update) results in
x:olol y: 12
x:bar y: 6
x:foo y: 5
A quick benchmark shows a difference of several orders of magnitude in performance between the extended built-in class and the "map rewrite" approach (I used the code from Debilski's answer in "ExternalMethod" and mine in "BuiltIn"):
benchmark length us linear runtime
ExternalMethod 10 1218.44 =
ExternalMethod 100 1250.28 =
ExternalMethod 1000 19453.59 =
ExternalMethod 10000 349297.25 ==============================
BuiltIn 10 3.10 =
BuiltIn 100 2.48 =
BuiltIn 1000 2.38 =
BuiltIn 10000 3.28 =
The benchmark code:
def timeExternalMethod(reps: Int) = {
var r = reps
while(r > 0) {
for(i <- 1 to 100) prepend(map, (i, i))
r -= 1
}
}
def timeBuiltIn(reps: Int) = {
var r = reps
while(r > 0) {
for(i <- 1 to 100) map.prepend(i, i)
r -= 1
}
}
Using a scala benchmarking template.

Initializing a val to be used in a different scope

How can I initialize a val that is to be used in another scope? In the example below, I am forced to make myOptimizedList a var, since it is initialized in the if (iteration == 5){} scope and used in the if (iteration > 5){} scope.
val myList:A = List(...)
var myOptimizedList:A = null
for (iteration <- 1 to 100) {
if (iteration < 5) {
process(myList)
} else if (iteration == 5) {
myOptimizedList = optimize(myList)
}
if (iteration > 5) {
process(myOptimizedList)
}
}
This may have been asked before, but I wonder if there is an elegant solution that uses Option[A].
It seems that you have taken this code example out of context, so this solution may not suit your real code, but you can use foldLeft to simplify it:
val myOptimizedList = (1 to 100).foldLeft (myList) {
case (list, 5) => optimize(list)
case (list, _) => process(list); list
}
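A runnable version of this fold, with stand-in definitions, might look like the sketch below. Here process and optimize are hypothetical (the question does not define them): process only counts its calls, for demonstration, and optimize sorts the list.

```scala
object FoldInit {
  var processed = 0 // demonstration-only counter of process calls

  // Hypothetical stand-ins for the question's functions.
  def process(xs: List[Int]): Unit = processed += 1
  def optimize(xs: List[Int]): List[Int] = xs.sorted

  val myList = List(3, 1, 2)

  // The accumulator is the current list: replaced at iteration 5, passed through otherwise.
  val myOptimizedList: List[Int] = (1 to 100).foldLeft(myList) {
    case (list, 5) => optimize(list)
    case (list, _) => process(list); list
  }
}
```

The optimized list is the result of the fold, so it can be a val, and process runs on the other 99 iterations.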
You can almost always rewrite some sort of looping construct as a (tail) recursive function:
@annotation.tailrec def processLists(xs: List[A], start: Int, stop: Int): Unit = {
val next = start + 1
if (start < 5) { process(xs); processLists(xs, next, stop) }
else if (start == 5) { processLists(optimize(xs), next, stop) }
else if (start <= stop) { process(xs); processLists(xs, next, stop) }
}
processLists(myList, 1, 100)
Here, you pass forward that data which you would otherwise have mutated. If you need to mutate a huge number of things it becomes unwieldy, but for one or two it is often as clear or clearer than doing the mutation.
It's often the case that you can rework your code to avoid the problem. Consider the simple, and common, example here:
var x = 0
if(something)
x = 5
else
x = 6
println(x)
This would be a pretty common pattern in most languages, but Scala has a better way of doing it. Specifically, if-statements can return values, so the better way is:
val x =
if(something)
5
else
6
println(x)
So we can make x a val after all.
Now, clearly your code can be rewritten to use all vals:
val myList:A = List(...)
for (iteration <- 1 to 4)
process(myList)
val myOptimizedList = optimize(myList) // this takes the place of iteration 5
for (iteration <- 6 to 100)
process(myOptimizedList)
But I suspect this is simply an example, not your real case. But if you're unsure how you might rearrange your real code to accomplish something similar, please show us what it looks like.
There's another technique (perhaps a trick in this case) to delay initialization of myOptimizedList, which is to use a lazy val. Your example is very specific, but the principle still applies: delay the assignment of a val until it is first referenced.
val myList = List(A(), A(), A())
lazy val myOptimizedList = optimize(myList)
for (iteration <- 1 to 100) {
if (iteration < 5)
process(myList)
else if (iteration > 5)
process(myOptimizedList)
}
Note that the case iteration == 5 is ignored.
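The deferred evaluation can be observed directly. In this sketch the names LazyDemo, expensive and evaluations are made up, and the counter exists only to show when the initializer runs.

```scala
object LazyDemo {
  var evaluations = 0 // demonstration-only counter

  // The body runs on first access, and only once.
  lazy val expensive: Int = {
    evaluations += 1
    42
  }
}
```

Until expensive is read, evaluations stays at 0; after any number of reads it is 1.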