Creating a Map from a Set of keys - scala

I have a set of keys, say Set[MyKey] and for each of the keys I want to compute the value through some value function, lets say computeValueOf(key: MyKey). In the end I want to have a Map which maps key -> value
What is the most efficient way to do this without iterating too much?

A collection of Tuple2s can be converted to a Map, where the tuple's first element will be the key and the second element will be the value.
val setOfKeys = Set[MyKey]()
setOfKeys.map(key => (key, computeValueOf(key)).toMap

This is actually a pretty neat application for collection.breakOut, one of my favorite pieces of bizarre Scala voodoo:
type MyKey = Int
def computeValueOf(key: MyKey) = "value" * key
val mySet: Set[MyKey] = Set(1, 2, 3)
val myMap: Map[MyKey, String] =
mySet.map(k => k -> computeValueOf(k))(collection.breakOut)
See this answer for some discussion of what's going on here. Unlike the version with toMap, this won't construct an intermediate Set, saving you some allocations and a traversal. It's also much less readable, though—I only offer it because you mentioned that you wanted to avoid "iterating too much".

Related

Getting the key of a key/value pair in Scala/Spark

I created a key-value pair using the arrow operator in Scala/Spark:
val picardsShip = "Picard" -> "Enterprise-D"
I extracted the value by using println(picardsShip._2) and that returns the value Enterprise-D, as expected. However, I want to extract the key in this case to get Picard. When I type picardsShip. in the IntelliJ IDEA editor that I am using, I don't see anything suggesting an obvious attribute or method to get the key. Any ideas? Thank you.
Careful here, you haven't created a key/value pair. You've created a tuple. When you use the REPL, you can see what's happening:
scala> val picardsShip = "Picard" -> "Enterprise-D"
picardsShip: (String, String) = (Picard,Enterprise-D)
to access the first value, use what #Krzysztof Atłasik suggested and use ._1:
scala> picardsShip._1
res1: String = Picard
Yes, you can create a key/value pair using the -> operator, but if you want it to operate on it like you have a key/value pair - you have to be a bit more specific by wrapping it in a map. And then you no longer have 1 key/value pair, but theoretically none to many key value pairs, which means you have basically a kind of Seq which is what Scala is really built around. The Map gives you access to an API that basically says "okay, here is a series of items that can be treated as key-value pairs" and will allow you to operate on it as you would expect key-value pairs to look at:
scala> val shipMap: Map[String, String] = Map(("Picard" -> "Enterprise-D"))
shipMap: Map[String,String] = Map(Picard -> Enterprise-D)
scala> shipMap("Picard")
res2: String = Enterprise-D
scala> shipMap.keys.head
res3: String = Picard
scala> shipMap.foreach( kv => println(kv._1))
Picard

Scala: Update Array inside a Map

I am creating a Map which has an Array inside it. I need to keep adding values to that Array. How do I do that?
var values: Map[String, Array[Float]] = Map()
I tried several ways such as:
myobject.values.getOrElse("key1", Array()).++(Array(float1))
Few other ways to but nothing updates the array inside the Map.
There is a problem with this code:
values.getOrElse("key1", Array()).++(Array(float1))
This does not update the Map in values, it just creates a new Array and then throws it away.
You need to replace the original Map with a new, updated Map, like this:
values = values.updated("key1", values.getOrElse("key1", Array.empty[Float]) :+ float1)
To understand this you need to be clear on the distinction between mutable variables and mutable data.
var is used to create a mutable variable which means that the variable can be assigned a new value, e.g.
var name = "John"
name = "Peter" // Would not work if name was a val
By contrast mutable data is held in objects whose contents can be changed
val a = Array(1,2,3)
a(0) = 12 // Works even though a is a val not a var
In your example values is a mutable variable but the Map is immutable so it can't be changed. You have to create a new, immutable, Map and assign it to the mutable var.
From what I can see (according to ++), you would like to append Array, with one more element. But Array fixed length structure, so instead I'd recommend to use Vector. Because, I suppose, you are using immutable Map you need update it as well.
So the final solution might look like:
var values: Map[String, Vector[Float]] = Map()
val key = "key1"
val value = 1.0
values = values + (key -> (values.getOrElse(key, Vector.empty[Float]) :+ value))
Hope this helps!
You can use Scala 2.13's transform function to transform your map anyway you want.
val values = Map("key" -> Array(1f, 2f, 3f), "key2" -> Array(4f,5f,6f))
values.transform {
case ("key", v) => v ++ Array(6f)
case (_,v) => v
}
Result:
Map(key -> Array(1.0, 2.0, 3.0, 6.0), key2 -> Array(4.0, 5.0, 6.0))
Note that appending to arrays takes linear time so you might want to consider a more efficient data structure such as Vector or Queue or even a List (if you can afford to prepend rather than append).
Update:
However, if it is only one key you want to update, it is probably better to use updatedWith:
values.updatedWith("key")(_.map(_ ++ Array(6f)))
which will give the same result. The nice thing about the above code is that if the key does not exist, it will not change the map at all without throwing any error.
Immutable vs Mutable Collections
You need to choose what type of collection you will use immutable or mutable one. Both are great and works totally differently. I guess you are familiar with mutable one (from other languages), but immutable are default in scala and probably you are using it in your code (because it doesn't need any imports). Immutable Map cannot be changed... you can only create new one with updated values (Tim's and Ivan's answers covers that).
There are few ways to solve your problem and all are good depending on use case.
See implementation below (m1 to m6):
//just for convenience
type T = String
type E = Long
import scala.collection._
//immutable map with immutable seq (default).
var m1 = immutable.Map.empty[T,List[E]]
//mutable map with immutable seq. This is great for most use-cases.
val m2 = mutable.Map.empty[T,List[E]]
//mutable concurrent map with immutable seq.
//should be fast and threadsafe (if you know how to deal with it)
val m3 = collection.concurrent.TrieMap.empty[T,List[E]]
//mutable map with mutable seq.
//should be fast but could be unsafe. This is default in most imperative languages (PHP/JS/JAVA and more).
//Probably this is what You have tried to do
val m4 = mutable.Map.empty[T,mutable.ArrayBuffer[E]]
//immutable map with mutable seq.
//still could be unsafe
val m5 = immutable.Map.empty[T,mutable.ArrayBuffer[E]]
//immutable map with mutable seq v2 (used in next snipped)
var m6 = immutable.Map.empty[T,mutable.ArrayBuffer[E]]
//Oh... and NEVER DO THAT, this is wrong
//I mean... don't keep mutable Map in `var`
//var mX = mutable.Map.empty[T,...]
Other answers show immutable.Map with immutable.Seq and this is preferred way (or default at least). It costs something but for most apps it is perfectly ok. Here You have nice source of info about immutable data structures: https://stanch.github.io/reftree/talks/Immutability.html.
Each variant has it's own Pros and Cons. Each deals with updates differently, and it makes this question much harder than it looks at the first glance.
Solutions
val k = "The Ultimate Answer"
val v = 42f
//immutable map with immutable seq (default).
m1 = m1.updated(k, v :: m1.getOrElse(k, Nil))
//mutable map with immutable seq.
m2.update(k, v :: m2.getOrElse(k, Nil))
//mutable concurrent map with immutable seq.
//m3 is bit harder to do in scala 2.12... sorry :)
//mutable map with mutable seq.
m4.getOrElseUpdate(k, mutable.ArrayBuffer.empty[Float]) += v
//immutable map with mutable seq.
m5 = m5.updated(k, {
val col = m5.getOrElse(k, c.mutable.ArrayBuffer.empty[E])
col += v
col
})
//or another implementation of immutable map with mutable seq.
m6.get(k) match {
case None => m6 = m6.updated(k, c.mutable.ArrayBuffer(v))
case Some(col) => col += v
}
check scalafiddle with this implementations. https://scalafiddle.io/sf/WFBB24j/3.
This is great tool (ps: you can always save CTRL+S your changes and share link to write question about your snippet).
Oh... and if You care about concurrency (m3 case) then write another question. Such topic deserve to be in separate thread :)
(im)mutable api VS (im)mutable Collections
You can have mutable collection and still use immutable api that will copy orginal seq. For example Array is mutable:
val example = Array(1,2,3)
example(0) = 33 //edit in place
println(example.mkString(", ")) //33, 2, 3
But some functions on it (e.g. ++) will create new sequence... not change existing one:
val example2 = example ++ Array(42, 41) //++ is immutable operator
println(example.mkString(", ")) //33, 2, 3 //example stays unchanged
println(example2.mkString(", ")) //33, 2, 3, 42, 41 //but new sequence is created
There is method updateWith that is mutable and will exist only in mutable sequences. There is also updatedWith and it exists in both immutable AND mutable collections and if you are not careful enough you will use wrong one (yea ... 1 letter more).
This means you need to be careful which functions you are using, immutable or mutable one. Most of the time you can distinct them by result type. If something returns collection then it will be probably some kind of copy of original seq. It result is unit then it is mutable for sure.

What's the difference between a set and a mapping to boolean?

In Scala, I sometimes use a Map[A, Boolean], sometimes a Set[A]. There's really not much difference between these two concepts, and an implementation might use the same data structure to implement them. So why bother to have Sets? As I said, this question occurred to me in connection with Scala, but it would arise in any programming language whose library implements a Set abstraction.
The are some specific convenient methods defined on Set (intersect, diff and more). Not a big deal, but often useful.
My first thoughts are two:
efficiency: if you only want to signal presence, why bothering with a flag that can either be true or false?
meaning: a set is about the existence of something, a map is about a correlation between a key and value (generally speaking); these two ideas are quite different and should be used accordingly to simplify reading and understanding the code
Furthermore, the semantics of application change:
val map: Map[String, Bool] = Map("hello" -> true, "world" -> false)
val set: Set[String] = Set("hello")
map("hello") // true
map("world") // false
map("none!") // EXCEPTION
set("hello") // true
set("world") // false
set("none!") // false
Without having to actually store an extra pair to indicate absence (not to mention the boolean that actually indicates such absence).
Sets are very good to indicate the presence of something, which makes them very good for filtering:
val map = (0 until 10).map(_.toString).zipWithIndex.toMap
val set = (3 to 5).map(_.toString).toSet
map.filterKeys(set) // keeps pairs 3rd through 5th
Maps, in terms of processing collections, are good to indicate relationships, which makes them very good for collecting:
set.collect(map) // more or less equivalent as above, but only values are returned
You can read more about using collections as functions to process other collections here.
There are several reasons:
1) It is easier to think/work with a data structure that only has single elements as opposed to mapping to dummy true,
For example, it is easier to convert a list to Set, then to Map:
scala> val myList = List(1,2,3,2,1)
myList: List[Int] = List(1, 2, 3, 2, 1)
scala> myList.toSet
res9: scala.collection.immutable.Set[Int] = Set(1, 2, 3)
scala> myList.map(x => (x, true)).toMap
res1: scala.collection.immutable.Map[Int,Boolean] = Map(1 -> true, 2 -> true, 3 -> true)
2) As Kombajn zbożowy pointed out, Sets have additional helper methods, union, intersect, diff, subsetOf.
3) Sense Set does not have mapping to dummy variable the size of a set in memory is smaller, this is more noticeable for small sized keys.
Having said the above, not all languages have Set data structure, Go for example does not.

Is there a findall in Scala maps?

In a Scala Map, how do I get all keys in a Map that have the same value?
For example in my Map I have 3 keys which have the value 27
Eg:
large -> 27
indispensable -> 27
most -> 27
I tried
val adj1value = htAdjId.find(_._2 == value1).getOrElse(default)._1
but that gives me only the first key "large" (as is the definition of find). I searched a lot but I am not able to find a "findall" function. Does it exist in Scala? If not, can someone please advice me on how to solve this problem?
You can filter the collection and extract all keys using keys:
val map = Map("h" -> 27, "b" -> 2, "c" -> 27)
map.filter { case (key, value) => value == 27 }.keys
Yields
res0: Iterable[String] = Set(h, c)
Though I'd argue that if you need to iterate the entire Map each time, perhaps it isn't the right data structure to begin with, perhaps a List[(String, Int)] will suffice and save the overhead incurred by using a Map.
You can treat the map as a Iterable[K,V] then groupBy the value like this..
# Map(
"large" -> 27,
"indispensable" -> 27,
"most" -> 27
).groupBy(_._2).mapValues(_.keys)
res4: Map[Int, Iterable[String]] = Map(27 -> Set(
"large",
"indispensable",
"most"))
[answered by a friend of mine offline] Hashmaps are usually built to do only the forward lookup efficiently. Unless you know that the underlying data structure used in the hashmap supports this, you probably don't want to do this "reverse lookup" because it will be very inefficient.
Consider putting your data into a bi-directional map or "bi-map" from the start if you are going to access it in both directions. OR, use two hashmaps: one in regular direction, one inverted (values become keys, keys become values) OR use different data structure altogether.
i.e two maps is a good idea if the map is large or you're going to be doing this kind of check a lot. Otherwise try filter instead of find

Scala: Getting the index of an element in a Map

I have a Map[String, Object], where the key is an ID. That ID is now referenced within another object and I have to know what index (what position in the Map) that ID has. I can't believe there isn't an .indexOf
How can I accomplish that?
Do I really have to build myself a list with all the keys or another Map with ID1 -> 1, ID2 -> 2,... ?
I have to get the ID's indexes multiple times. Would a List or that Map be more efficient?
#Dora, as everyone mentioned, maps are unordered so there is no way to index them in place and store id with them.
It's hard to guess use case of storing (K,V) pairs in map and then getting unique id for every K. So, these are few suggestions based on my understanding -
1. You could use LinkedHashMap instead of Map which will maintain the insertion order so you will get stable iteration. Get KeysIterator on this map and convert it into a list which give you an unique index for every key in you map. Something like this-
import scala.collection.mutable.LinkedHashMap
val myMap = LinkedHashMap("a"->11, "b"->22, "c"->33)
val l = myMap.keysIterator.toList
l.indexOf("a") //get index of key a
myMap.+=("d"->44) //insert new element
val l = myMap.keysIterator.toList
l.indexOf("a") //index of a still remains 0 due to linkedHashMap
l.indexOf("d") //get index of newly inserted element.
Obviously, it is expensive to insert elements in linkedHashMap compared to HashMaps.
Deleting element from Map would automatically shift indexes to left.
myMap.-=("b")
val l = myMap.keysIterator.toList
l.indexOf("c") // Index of "c" moves from 2 to 1.
Change you map (K->V) to (K->(index, v)) and generate index manually while inserting new elements.
class ValueObject(val index: Int, val value: Int)
val myMap = scala.collection.mutable.Map[String, ValueObject]()
myMap.+=("a"-> new ValueObject(myMap.size+1, 11))
myMap("a").index<br/> // get index of key a
myMap.+=("b"-> new ValueObject(myMap.size+1, 22))
myMap.+=("c"-> new ValueObject(myMap.size+1, 33))
myMap("c").index<br/> // get index of key c
myMap("b").index<br/> // get index of key b
deletion would be expensive if we need indexes with no gaps, as we need to update all keys accordingly. However, keys insertion and search will be faster.
This problem can be solved efficiently if we know exactly what you need so please explain if above solutions doesn't work for you !!! (May be you really don't need map for solving your problem)
Maps do not have indexes, by definition.
What you can do is enumerate the keys in a map, but this is not guaranteed to be stable/repeatable. Add a new element, and it may change the numbers of every other element randomly due to rehashing. If you want a stable mapping of key to index (or the opposite), you will have to store this mapping somewhere, e.g. by serializing the map into a list.
Convert the Map[A,B] to an ordered collection namely for instance Seq[(A,B)] equipped with indexOf method. Note for
val m = Map("a" -> "x", "b" -> "y")
that
m.toSeq
Seq[(String, String)] = ArrayBuffer((a,x), (b,y))
From the extensive discussions above note this conversion does not guarantee any specific ordering in the resulting ordered collection. This collection can be sorted as necessary.
Alternatively you can index the set of keys in the map, namely for instance,
val idxm = for ( ((k,v),i) <- m.zipWithIndex ) yield ((k,i),v)
Map[(String, Int),String] = Map((a,0) -> x, (b,1) -> y)
where an equivalent to indexOf would be for example
idxm.find { case ((_,i),v) => i == 1 }
Option[((String, Int), String)] = Some(((b,1),y))