Breaking out of a for loop in Scala?

I need to get out of a for loop in Scala, but when I try to change the value of i past its limit my IDE says that i is a val, so I can't change it. How do I get around this?
Also, if i is a val, can I not use it as an index into lists because it will always be the same value?
I'm trying to go through a list and, if the list contains a key (which is a string), remove it from the list. However, if the list has multiple instances of this string I only want to remove one, so I want to get out of the for loop after I find the first instance of the key.
for (i <- 0 to d.length-1){
  if (key == d(i)){
    d = d.patch(i, Nil, 1)
    i = d.length // rejected by the IDE: i is a val
  }
}

In Scala, a variable declared as val is immutable -- it can never be changed. In each iteration of your for loop, the index variable i is similarly immutable: a fresh i is bound on every iteration, so it still takes each value in the range and works fine as an index, but you cannot assign to it. Idiomatic Scala relies heavily on this paradigm of immutability. Collections are usually also declared as val, and the result of a map, flatMap, filter, or other operation is assigned to a new val rather than written back.
For your example, you might do something like this:
val data = Seq("foo", "bar", "bar", "bar", "baz", "qux")
val newData = data diff Seq("bar")
Or:
val (first, second) = data.splitAt(data.indexOf("bar"))
val newData = first ++ second.tail
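Now newData is data with one instance of "bar" removed. (The second version assumes "bar" is present: if it is not, indexOf returns -1 and the first element is dropped instead.) There are many other ways to do this, many of which are documented in this similar question.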
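If you would rather keep the indexOf/patch idea from the question, a minimal sketch could look like the following. The helper name removeFirst is made up for illustration; it leaves the sequence untouched when the key is absent.

```scala
// Hypothetical helper (not a standard library method):
// remove only the first occurrence of `key`, if there is one.
def removeFirst[A](xs: Seq[A], key: A): Seq[A] = {
  val i = xs.indexOf(key)          // -1 when key is not present
  if (i < 0) xs                    // nothing to remove
  else xs.patch(i, Nil, 1)         // replace 1 element at i with nothing
}

val data = Seq("foo", "bar", "bar", "baz")
val newData = removeFirst(data, "bar")
```

No loop and no early exit are needed, because indexOf already stops at the first match.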

Related

Scala: Update Array inside a Map

I am creating a Map which has an Array inside it. I need to keep adding values to that Array. How do I do that?
var values: Map[String, Array[Float]] = Map()
I tried several ways such as:
myobject.values.getOrElse("key1", Array()).++(Array(float1))
I tried a few other ways too, but nothing updates the array inside the Map.
There is a problem with this code:
values.getOrElse("key1", Array()).++(Array(float1))
This does not update the Map in values, it just creates a new Array and then throws it away.
You need to replace the original Map with a new, updated Map, like this:
values = values.updated("key1", values.getOrElse("key1", Array.empty[Float]) :+ float1)
To understand this you need to be clear on the distinction between mutable variables and mutable data.
var is used to create a mutable variable which means that the variable can be assigned a new value, e.g.
var name = "John"
name = "Peter" // Would not work if name was a val
By contrast, mutable data is held in objects whose contents can be changed:
val a = Array(1,2,3)
a(0) = 12 // Works even though a is a val not a var
In your example values is a mutable variable but the Map is immutable so it can't be changed. You have to create a new, immutable, Map and assign it to the mutable var.
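To make the difference concrete, here is a small sketch under the same setup as the question. The helper name appended is an assumption for illustration, not part of any library.

```scala
var values: Map[String, Array[Float]] = Map()

// Builds a new array and discards it: the map itself is untouched.
values.getOrElse("key1", Array.empty[Float]) :+ 1.0f
val stillEmpty = values.isEmpty

// Hypothetical helper: returns a NEW map with x appended under key.
def appended(m: Map[String, Array[Float]], key: String, x: Float): Map[String, Array[Float]] =
  m.updated(key, m.getOrElse(key, Array.empty[Float]) :+ x)

// Reassigning the var is what makes the change visible.
values = appended(values, "key1", 1.0f)
values = appended(values, "key1", 2.0f)
val stored = values("key1").toList
```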
From what I can see (judging by ++), you would like to append one more element to the Array. But Array is a fixed-length structure, so I'd recommend using Vector instead. And because, I suppose, you are using an immutable Map, you need to update the map as well.
So the final solution might look like:
var values: Map[String, Vector[Float]] = Map()
val key = "key1"
val value = 1.0f
values = values + (key -> (values.getOrElse(key, Vector.empty[Float]) :+ value))
Hope this helps!
You can use Scala 2.13's transform function to transform your map any way you want.
val values = Map("key" -> Array(1f, 2f, 3f), "key2" -> Array(4f,5f,6f))
values.transform {
case ("key", v) => v ++ Array(6f)
case (_,v) => v
}
Result:
Map(key -> Array(1.0, 2.0, 3.0, 6.0), key2 -> Array(4.0, 5.0, 6.0))
Note that appending to arrays takes linear time so you might want to consider a more efficient data structure such as Vector or Queue or even a List (if you can afford to prepend rather than append).
Update:
However, if it is only one key you want to update, it is probably better to use updatedWith:
values.updatedWith("key")(_.map(_ ++ Array(6f)))
which will give the same result. The nice thing about the above code is that if the key does not exist, it leaves the map unchanged and does not throw an error.
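If you do want the entry created when the key is missing, updatedWith can handle both cases in one place. This is only a sketch: it assumes Scala 2.13 or later, it uses Vector instead of Array, and appendOrCreate is a made-up helper name.

```scala
// Assumes Scala 2.13+, where Map.updatedWith is available.
val values = Map("key" -> Vector(1f, 2f, 3f))

// Hypothetical helper: append x under key, creating the entry when absent.
def appendOrCreate(m: Map[String, Vector[Float]], key: String, x: Float): Map[String, Vector[Float]] =
  m.updatedWith(key) {
    case Some(xs) => Some(xs :+ x)   // key present: append
    case None     => Some(Vector(x)) // key absent: start a new Vector
  }

val existing = appendOrCreate(values, "key", 6f)
val created  = appendOrCreate(values, "other", 6f)
```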
Immutable vs Mutable Collections
You need to choose which kind of collection to use: immutable or mutable. Both are great, and they work totally differently. I guess you are familiar with the mutable kind (from other languages), but immutable collections are the default in Scala, and you are probably using one in your code (because it doesn't need any imports). An immutable Map cannot be changed... you can only create a new one with updated values (Tim's and Ivan's answers cover that).
There are a few ways to solve your problem, and all are good depending on the use case.
See the implementations below (m1 to m6):
//just for convenience
type T = String
type E = Long
import scala.collection._
//immutable map with immutable seq (default).
var m1 = immutable.Map.empty[T,List[E]]
//mutable map with immutable seq. This is great for most use-cases.
val m2 = mutable.Map.empty[T,List[E]]
//mutable concurrent map with immutable seq.
//should be fast and threadsafe (if you know how to deal with it)
val m3 = collection.concurrent.TrieMap.empty[T,List[E]]
//mutable map with mutable seq.
//should be fast but could be unsafe. This is the default in most imperative languages (PHP/JS/Java and more).
//Probably this is what you have tried to do
val m4 = mutable.Map.empty[T,mutable.ArrayBuffer[E]]
//immutable map with mutable seq.
//still could be unsafe
val m5 = immutable.Map.empty[T,mutable.ArrayBuffer[E]]
//immutable map with mutable seq v2 (used in the next snippet)
var m6 = immutable.Map.empty[T,mutable.ArrayBuffer[E]]
//Oh... and NEVER DO THAT, this is wrong
//I mean... don't keep mutable Map in `var`
//var mX = mutable.Map.empty[T,...]
Other answers show immutable.Map with immutable.Seq, and this is the preferred way (or the default at least). It costs something, but for most apps it is perfectly OK. Here is a nice source of info about immutable data structures: https://stanch.github.io/reftree/talks/Immutability.html.
Each variant has its own pros and cons. Each deals with updates differently, which makes this question much harder than it looks at first glance.
Solutions
val k = "The Ultimate Answer"
val v: E = 42L
//immutable map with immutable seq (default).
m1 = m1.updated(k, v :: m1.getOrElse(k, Nil))
//mutable map with immutable seq.
m2.update(k, v :: m2.getOrElse(k, Nil))
//mutable concurrent map with immutable seq.
//m3 is a bit harder to do in Scala 2.12... sorry :)
//mutable map with mutable seq.
m4.getOrElseUpdate(k, mutable.ArrayBuffer.empty[E]) += v
//immutable map with mutable seq.
//m5 is a val, so the updated map is bound to a new name
val m5Updated = m5.updated(k, {
  val col = m5.getOrElse(k, mutable.ArrayBuffer.empty[E])
  col += v
  col
})
//or another implementation of immutable map with mutable seq.
m6.get(k) match {
  case None => m6 = m6.updated(k, mutable.ArrayBuffer(v))
  case Some(col) => col += v
}
Check the ScalaFiddle with these implementations: https://scalafiddle.io/sf/WFBB24j/3.
This is a great tool (PS: you can always save your changes with CTRL+S and share the link when writing a question about your snippet).
Oh... and if you care about concurrency (the m3 case), then write another question. Such a topic deserves a separate thread :)
(im)mutable api VS (im)mutable Collections
You can have a mutable collection and still use an immutable API that copies the original seq. For example, Array is mutable:
val example = Array(1,2,3)
example(0) = 33 //edit in place
println(example.mkString(", ")) //33, 2, 3
But some functions on it (e.g. ++) create a new sequence rather than changing the existing one:
val example2 = example ++ Array(42, 41) //++ is immutable operator
println(example.mkString(", ")) //33, 2, 3 //example stays unchanged
println(example2.mkString(", ")) //33, 2, 3, 42, 41 //but new sequence is created
There is a method updateWith that mutates and exists only on mutable maps. There is also updatedWith, which exists on both immutable AND mutable collections, and if you are not careful enough you will use the wrong one (yeah... one letter more).
This means you need to be careful which functions you are using, the immutable or the mutable ones. Most of the time you can distinguish them by result type. If something returns a collection, then it is probably some kind of copy of the original seq. If the result is Unit, then it is mutating for sure.
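The result-type rule of thumb can be seen with updated and update, which differ by a single letter. This is just an illustrative sketch; the names frozen and live are made up.

```scala
import scala.collection.mutable

// Immutable Map: updated returns a new Map and leaves the original alone.
val frozen = Map("a" -> 1)
val frozen2 = frozen.updated("a", 2)

// Mutable Map: update returns Unit and changes the map in place.
val live = mutable.Map("a" -> 1)
live.update("a", 2)
```

Here frozen still maps "a" to 1, while live now maps "a" to 2.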

Scala type mismatch when adding an element to an array

I have the following array:
var as = Array.empty[Tuple2[Int, Int]]
I am adding an element to it like this:
var nElem = Tuple2(current, current)
as += nElem
current is a var of type Int
However, I am getting this error:
Solution.scala:51: error: type mismatch;
found : (Int, Int)
required: String
as += nElem
I don't understand why this is appearing. I haven't declared a String anywhere.
Array has no += method, so the compiler rewrites as += nElem to as = as + nElem, and + on an arbitrary object resolves to string concatenation. That is why a String is required.
You are looking for :+ to append to an array. Note that an Array's length is fixed, so as :+= nElem builds a new array with nElem appended and assigns it to the as variable; the original array stays unchanged (take this as a hint that you are likely doing something in a suboptimal way).
Note, that if you find yourself using var, that is almost always a sign of a bad design in your code. Mutable objects and variables are considered really bad taste in functional programming. Sometimes, you can't get away without using them, but those are rare corner cases. Most of the time, you should not need mutability.
Also, do not use Tuple2. Just do Array.empty[(Int, Int)], nElem = (current, current) etc.
Use :+= to update the variable in place. However, remember this: using both var and a mutable data structure (like Array) at the same time is a sign of really bad programming. Either is sometimes fine, though.
However, note that this operation is O(n), so pushing n elements like that is going to be slow, O(n²). Arrays are not meant to have elements pushed to the back like that. You can alternatively use a var Vector and call .toArray on it at the end, or use a val holding a mutable ArrayBuffer. However, prefer a functional style of programming unless it produces less readable code.
Also, avoid typing Tuple2 explicitly. Use Array.empty[(Int, Int)] and var nElem = (current, current).
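A minimal sketch of the ArrayBuffer route mentioned above; the loop over 1 to 3 merely stands in for however current is produced in the real code.

```scala
import scala.collection.mutable.ArrayBuffer

// A val holding a mutable buffer: appends are amortized constant time.
val buf = ArrayBuffer.empty[(Int, Int)]

// Stand-in for the real loop that produces `current`.
for (current <- 1 to 3) {
  buf += ((current, current)) // extra parentheses: a single tuple argument
}

// Convert once at the end if an Array is really required.
val pairsArray: Array[(Int, Int)] = buf.toArray
```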
The semantics of + are weird because of the automatic conversion to String in certain cases. To append to an array, use the :+ method:
as :+= nElem

How to remove duplicates from collection (without creating new ones in-between)?

So first up, I'm fully aware mutation is a bad idea, but I need to keep object creation down to a minimum as I have an incredibly huge amount of data to process (keeps GC hang time down and speeds up my code).
What I want is a scala collection that has a method like distinct or similar, or possibly a library or code snippet (but native scala preferred) such that the method is side effecting / mutating the collection rather than creating a new collection.
I've explored the usual suspects like ArrayBuffer, mutable.List, Array, MutableList, Vector and they all "create a new sequence" from the original rather than mutate the original in place. Am I trying to find something that does not exist? Will I just have to write my own??
I think this exists in C++ http://www.cplusplus.com/reference/algorithm/unique/
Also, mega mega bonus points if there is some kind of awesome tail recursive way of doing this so that any bookkeeping structures created are kept in a single stack frame that is thus deallocated from memory once the method exits. The reason this would be uber cool is then even if the method creates some instances of things in order to perform the removal of duplicates, those instance will not need to be garbage collected and therefore not contribute to massive GC hangs. It doesn't have to be recursion, as long as it's likely to cause the objects to go on the stack (see escape analysis here http://www.ibm.com/developerworks/java/library/j-jtp09275/index.html)
(Also if I can specify and fix the capacity (size in memory) of the collection that would also be great)
The algorithm you mentioned (for C++) removes consecutive duplicates. So if you need it for consecutive duplicates, you could use some LinkedList, but Scala's mutable linked lists were deprecated. On the other hand, if you want something memory-efficient and can live with linear access, you could wrap your collection (mutable or immutable) with a distinct iterator (O(N)):
def toConsDist[T](c: Traversable[T]) = new Iterator[T] {
  val i = c.toIterator
  var prev: Option[T] = None
  var _nxt: Option[T] = None
  def nxt = {
    if (_nxt.isEmpty) _nxt = i.find(x => !prev.toList.contains(x))
    prev = _nxt
    _nxt
  }
  def hasNext = nxt.nonEmpty
  def next = {
    val next = nxt.get
    _nxt = None
    next
  }
}
scala> toConsDist(List(1,1,1,2,2,3,3,3,2,2)).toList
res44: List[Int] = List(1, 2, 3, 2)
If you need to remove all duplicates, it will be O(N*N), but you can't use Scala collections for that, because of https://github.com/scala/scala/commit/3cc99d7b4aa43b1b06cc837a55665896993235fc (see the LinkedList part), https://stackoverflow.com/a/27645224/1809978.
But you may use Java's LinkedList:
import scala.collection.JavaConverters._
scala> val mlist = new java.util.LinkedList[Integer]
mlist: java.util.LinkedList[Integer] = []
scala> mlist.asScala ++= List(1,1,1,2,2,3,3,3,2,2)
res74: scala.collection.mutable.Buffer[Integer] = Buffer(1, 1, 1, 2, 2, 3, 3, 3, 2, 2)
scala> var i = 0
i: Int = 0
scala> for(x <- mlist.asScala){ if (mlist.indexOf(x) != i) mlist.set(i, null); i+=1} //O(N*N)
scala> while(mlist.remove(null)){} //O(N*N)
scala> mlist
res77: java.util.LinkedList[Integer] = [1, 2, 3]
mlist.asScala just creates a wrapper without any copying. You can't modify Java's LinkedList during iteration, which is why I used nulls. You may try Java's ConcurrentLinkedQueue, but it doesn't support indexOf, so you will have to implement it yourself (Scala maps it to an Iterator, so asScala.indexOf won't work).
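A different trade-off, offered only as a sketch and not taken from the answer above: if one auxiliary HashSet is acceptable, an ArrayBuffer can be compacted in place in a single pass. The helper name distinctInPlace is hypothetical; nothing like it is assumed to exist in the standard library.

```scala
import scala.collection.mutable

// Hypothetical helper: keep the first occurrence of each element,
// overwriting the buffer in place instead of building a new sequence.
def distinctInPlace[A](buf: mutable.ArrayBuffer[A]): Unit = {
  val seen = mutable.HashSet.empty[A] // the only extra allocation
  var write = 0                       // next slot to keep an element in
  var read = 0
  while (read < buf.length) {
    val x = buf(read)
    if (seen.add(x)) {                // add returns true if x was new
      buf(write) = x
      write += 1
    }
    read += 1
  }
  // Drop the leftover tail beyond the last kept element.
  if (write < buf.length) buf.remove(write, buf.length - write)
}

val numbers = mutable.ArrayBuffer(1, 1, 1, 2, 2, 3, 3, 3, 2, 2)
distinctInPlace(numbers)
```

This runs in roughly linear time but still allocates the set, so whether it helps with GC pressure depends on how many distinct values there are.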
By definition, immutability forces you to create new objects whenever you want to change your collection.
What Scala provides for some collections are buffers, which let you build a collection through a mutable interface and finally return an immutable version. But once you have your immutable collection you can't change its references in any way, and that includes filtering such as distinct. The furthest you can go concerning mutability in an immutable collection is changing the state of its elements when these are mutable objects.
On the other hand, some collections such as Vector are implemented as trees (in this case as a trie), and insert or delete operations are implemented not by copying the entire tree but just the required branches.
From Martin Odersky's Programming in Scala:
Updating an element in the middle of a vector can be done by copying
the node that contains the element, and every node that points to it,
starting from the root of the tree. This means that a functional
update creates between one and five nodes that each contain up to 32
elements or subtrees. This is certainly more expensive than an
in-place update in a mutable array, but still a lot cheaper than
copying the whole vector.
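From the caller's side, such a functional update looks like this (a small sketch; the sizes and values are arbitrary):

```scala
// A reasonably large immutable Vector.
val original = Vector.tabulate(100000)(identity)

// updated returns a new Vector that shares most of its structure
// with the original; the original itself is not modified.
val changed = original.updated(50000, -1)
```

Both vectors remain usable afterwards: original(50000) is still 50000, while changed(50000) is -1.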

Creating a Map from a Set of keys

I have a set of keys, say Set[MyKey], and for each of the keys I want to compute the value through some value function, let's say computeValueOf(key: MyKey). In the end I want to have a Map which maps key -> value.
What is the most efficient way to do this without iterating too much?
A collection of Tuple2s can be converted to a Map, where the tuple's first element will be the key and the second element will be the value.
val setOfKeys = Set[MyKey]()
setOfKeys.map(key => (key, computeValueOf(key))).toMap
This is actually a pretty neat application for collection.breakOut, one of my favorite pieces of bizarre Scala voodoo:
type MyKey = Int
def computeValueOf(key: MyKey) = "value" * key
val mySet: Set[MyKey] = Set(1, 2, 3)
val myMap: Map[MyKey, String] =
mySet.map(k => k -> computeValueOf(k))(collection.breakOut)
See this answer for some discussion of what's going on here. Unlike the version with toMap, this won't construct an intermediate Set, saving you some allocations and a traversal. It's also much less readable, though—I only offer it because you mentioned that you wanted to avoid "iterating too much".
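Note that collection.breakOut was removed in Scala 2.13. A sketch that avoids the intermediate Set on both older and newer versions is to map over an iterator instead, reusing the same illustrative definitions:

```scala
type MyKey = Int
def computeValueOf(key: MyKey): String = "value" * key

val mySet: Set[MyKey] = Set(1, 2, 3)

// iterator.map is lazy, so no intermediate Set of pairs is built;
// toMap consumes the pairs directly.
val myMap: Map[MyKey, String] =
  mySet.iterator.map(k => k -> computeValueOf(k)).toMap
```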

Flattening a Set of pairs of sets to one pair of sets

I have a for-comprehension with a generator from a Set[MyType]
This MyType has a lazy val variable called factsPair which returns a pair of sets:
(Set[MyFact], Set[MyFact]).
I wish to loop through all of them and unify the facts into one flattened pair (Set[MyFact], Set[MyFact]) as follows, however I am getting No implicit view available ... and not enough arguments for flatten: implicit (asTraversable ... errors. (I am a bit new to Scala so still trying to get used to the errors).
lazy val allFacts =
(for {
mytype <- mytypeList
} yield mytype.factsPair).flatten
What do I need to specify to flatten for this to work?
Scala's flatten only works on a collection whose elements are themselves collections. You have a Seq[(Set[MyFact], Set[MyFact])], a collection of tuples, which can't be flattened.
I would recommend learning the foldLeft function, because it's very general and quite easy to use as soon as you get the hang of it:
lazy val allFacts = mytypeList.foldLeft((Set[MyFact](), Set[MyFact]())) {
  case (accumulator, next) =>
    val pairs1 = accumulator._1 ++ next.factsPair._1
    val pairs2 = accumulator._2 ++ next.factsPair._2
    (pairs1, pairs2)
}
The first parameter is the initial element that the other elements will be appended to. We start with an empty Tuple2[Set[MyFact], Set[MyFact]] initialized like this: (Set[MyFact](), Set[MyFact]()).
Next we have to specify the function that takes the accumulator and appends the next element to it and returns with the new accumulator that has the next element in it. Because of all the tuples, it doesn't look nice, but works.
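The same fold can be tried on plain values. In this sketch Int and String stand in for MyFact, and the input list is made up:

```scala
// Each element is a pair of sets, like factsPair.
val pairs = List(
  (Set(1), Set("a")),
  (Set(2), Set("a", "b"))
)

// Start from a pair of empty sets and merge each side separately.
val merged = pairs.foldLeft((Set.empty[Int], Set.empty[String])) {
  case ((accLeft, accRight), (left, right)) =>
    (accLeft ++ left, accRight ++ right)
}
```

Destructuring both tuples in the pattern avoids the _1 and _2 accessors.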
You won't be able to use flatten for this, because flatten on a collection returns a collection, and a tuple is not a collection.
You can, of course, just split, flatten, and join again:
val pairs = for {
mytype <- mytypeList
} yield mytype.factsPair
val (first, second) = pairs.unzip
val allFacts = (first.flatten, second.flatten)
A tuple isn't traversable, so you can't flatten over it. You need to return something that can be iterated over, like a List, for example:
List((1,2), (3,4)).flatten // bad
List(List(1,2), List(3,4)).flatten // good
I'd like to offer a more algebraic view. What you have here can be nicely solved using monoids. For each monoid there is a zero element and an operation to combine two elements into one.
In this case, sets form a monoid: the zero element is the empty set and the operation is union. And if we have two monoids, their Cartesian product is also a monoid, where the operations are defined pairwise (see examples on Wikipedia).
Scalaz defines monoids for sets as well as tuples, so we don't need to do anything there. We'll just need a helper function that combines multiple monoid elements into one, which is implemented easily using folding:
import scalaz._, Scalaz._
def msum[A](ps: Iterable[A])(implicit m: Monoid[A]): A =
  ps.foldLeft(m.zero)(m.append(_, _))
(perhaps there already is such a function in Scala, I didn't find it). Using msum we can easily define
def pairs(ps: Iterable[MyType]): (Set[MyFact], Set[MyFact]) =
msum(ps.map(_.factsPair))
using Scalaz's implicit monoids for tuples and sets.
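For readers without Scalaz on the classpath, the idea can be sketched with a hand-rolled type class. Everything here (the Monoid trait, MonoidInstances, the sample data) is a simplified stand-in written for illustration, not Scalaz's actual definitions.

```scala
// Minimal stand-in for a monoid type class.
trait Monoid[A] {
  def zero: A
  def append(x: A, y: A): A
}

object MonoidInstances {
  // Sets: zero is the empty set, append is union.
  implicit def setMonoid[T]: Monoid[Set[T]] = new Monoid[Set[T]] {
    def zero: Set[T] = Set.empty[T]
    def append(x: Set[T], y: Set[T]): Set[T] = x union y
  }
  // Pairs: combine each component with its own monoid.
  implicit def pairMonoid[A, B](implicit ma: Monoid[A], mb: Monoid[B]): Monoid[(A, B)] =
    new Monoid[(A, B)] {
      def zero: (A, B) = (ma.zero, mb.zero)
      def append(x: (A, B), y: (A, B)): (A, B) =
        (ma.append(x._1, y._1), mb.append(x._2, y._2))
    }
}
import MonoidInstances._

def msum[A](ps: Iterable[A])(implicit m: Monoid[A]): A =
  ps.foldLeft(m.zero)(m.append(_, _))

// Int stands in for MyFact in this sketch.
val factPairs = List((Set(1, 2), Set(10)), (Set(2, 3), Set(20)))
val combined = msum(factPairs)
```

The compiler assembles the monoid for the pair from the two set monoids, which is the same mechanism the Scalaz version relies on.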