Scala: Getting the index of an element in a Map - scala

I have a Map[String, Object], where the key is an ID. That ID is now referenced within another object and I have to know what index (what position in the Map) that ID has. I can't believe there isn't an .indexOf
How can I accomplish that?
Do I really have to build myself a list with all the keys or another Map with ID1 -> 1, ID2 -> 2,... ?
I have to get the ID's indexes multiple times. Would a List or that Map be more efficient?

#Dora, as everyone mentioned, maps are unordered so there is no way to index them in place and store id with them.
It's hard to guess use case of storing (K,V) pairs in map and then getting unique id for every K. So, these are few suggestions based on my understanding -
1. You could use LinkedHashMap instead of Map which will maintain the insertion order so you will get stable iteration. Get KeysIterator on this map and convert it into a list which give you an unique index for every key in you map. Something like this-
import scala.collection.mutable.LinkedHashMap
val myMap = LinkedHashMap("a"->11, "b"->22, "c"->33)
val l = myMap.keysIterator.toList
l.indexOf("a") //get index of key a
myMap.+=("d"->44) //insert new element
val l = myMap.keysIterator.toList
l.indexOf("a") //index of a still remains 0 due to linkedHashMap
l.indexOf("d") //get index of newly inserted element.
Obviously, it is expensive to insert elements in linkedHashMap compared to HashMaps.
Deleting element from Map would automatically shift indexes to left.
myMap.-=("b")
val l = myMap.keysIterator.toList
l.indexOf("c") // Index of "c" moves from 2 to 1.
Change you map (K->V) to (K->(index, v)) and generate index manually while inserting new elements.
class ValueObject(val index: Int, val value: Int)
val myMap = scala.collection.mutable.Map[String, ValueObject]()
myMap.+=("a"-> new ValueObject(myMap.size+1, 11))
myMap("a").index<br/> // get index of key a
myMap.+=("b"-> new ValueObject(myMap.size+1, 22))
myMap.+=("c"-> new ValueObject(myMap.size+1, 33))
myMap("c").index<br/> // get index of key c
myMap("b").index<br/> // get index of key b
deletion would be expensive if we need indexes with no gaps, as we need to update all keys accordingly. However, keys insertion and search will be faster.
This problem can be solved efficiently if we know exactly what you need so please explain if above solutions doesn't work for you !!! (May be you really don't need map for solving your problem)

Maps do not have indexes, by definition.
What you can do is enumerate the keys in a map, but this is not guaranteed to be stable/repeatable. Add a new element, and it may change the numbers of every other element randomly due to rehashing. If you want a stable mapping of key to index (or the opposite), you will have to store this mapping somewhere, e.g. by serializing the map into a list.

Convert the Map[A,B] to an ordered collection namely for instance Seq[(A,B)] equipped with indexOf method. Note for
val m = Map("a" -> "x", "b" -> "y")
that
m.toSeq
Seq[(String, String)] = ArrayBuffer((a,x), (b,y))
From the extensive discussions above note this conversion does not guarantee any specific ordering in the resulting ordered collection. This collection can be sorted as necessary.
Alternatively you can index the set of keys in the map, namely for instance,
val idxm = for ( ((k,v),i) <- m.zipWithIndex ) yield ((k,i),v)
Map[(String, Int),String] = Map((a,0) -> x, (b,1) -> y)
where an equivalent to indexOf would be for example
idxm.find { case ((_,i),v) => i == 1 }
Option[((String, Int), String)] = Some(((b,1),y))

Related

Merge entries in Map without using loop in scala

Could someone suggest a best way to merge entries in map for below use case in scala (possibly without using loop)?
From
val map1 = Map(((1,"case0")->List(1,2,3)), ((2,"case0")->List(3,4,5)), ((1,"case1")->List(2,4,6)), ((2,"case1")->List(3)))
To
Map(((1,"nocase")->List(2)), ((2,"nocase")->List(3)))
You can do it as follows:
map1.groupBy(_._1._1).map{case (key, elements) => ((key, "nocase"), elements.values.reduce(_ intersect _ ))}
With the group you group the elements by the first element of the key, then with the map, you build the new key, with the "nocase" string as in your example. With elements.value you get all the elements for the given keys and you can reduce them with the intersect you get the expected output
Output:
Map[(Int, String),List[Int]] = Map((2,nocase) -> List(3), (1,nocase) -> List(2))

I have a map that i want to delete entries from by value

I have a map(WeakHashMap) that i am using for some caching. I want to delete some entries from the map depending on the value.
I am unable to figure out a way to do this.
The easy way is to use filter and filter based on the key. This will create a new Map and not modify the original.
val newMap = map.filter((key, value) => <true/false based on value>)
If it is possible to structure your application to work with this I would recommend it.
If you need to delete elements from a mutable map (modifying the original instead of creating a new map you should use foldLeft to loop over your map and create a list of the keys corresponding to the values you want to remove
val keyList = map.foldLeft(List[KeyType]())((keys, pair) => if(pair._2 <should be removed>) pair._1 :: keys else keys)
keyList.foreach(map.remove)
pair is a tuple of the current element _1 is the key, _2 is the value

Map to List[Tuple2] in Scala - removes duplicates?

Am trying to convert a map to list of Tuples, given a map like below
Map("a"->2,"a"->4,"a"->5,"b"->6,"b"->1,"c"->3)
i want output like
List(("a",2),("a",4),("a",5),("b",6),("b",1),("c",3))
I tried following
val seq = inputMap.toList //output (a,5)(b,1)(c,3)
var list:List[(String,Int)] = Nil
for((k,v)<-inputMap){
(k,v) :: list
} //output (a,5)(b,1)(c,3)
Why does it remove duplicates? I dont see other tuples that has "a" as key.
That's because a Map doesn't allow duplicate keys:
val map = Map("a"->2,"a"->4,"a"->5,"b"->6,"b"->1,"c"->3)
println(map) // Map(a -> 5, b -> 1, c -> 3)
Since map has duplicate keys, it will remove duplicate entries while map creation itself.
Map("a"->2,"a"->4,"a"->5,"b"->6,"b"->1,"c"->3)
it will turn into,
Map(a -> 5, b -> 1, c -> 3)
so other operations will be performed on short map
The problem is with Map, whose keys are a Set, so you cannot have twice the same key. This is because Maps are dictionaries, which are made to access a value by its key, so keys MUST be unique. The builder thus keeps only the last value given with the key "a".
By the way, Map already has a method toList, that do exactly what you implemented.

Creating a Map from a Set of keys

I have a set of keys, say Set[MyKey] and for each of the keys I want to compute the value through some value function, lets say computeValueOf(key: MyKey). In the end I want to have a Map which maps key -> value
What is the most efficient way to do this without iterating too much?
A collection of Tuple2s can be converted to a Map, where the tuple's first element will be the key and the second element will be the value.
val setOfKeys = Set[MyKey]()
setOfKeys.map(key => (key, computeValueOf(key)).toMap
This is actually a pretty neat application for collection.breakOut, one of my favorite pieces of bizarre Scala voodoo:
type MyKey = Int
def computeValueOf(key: MyKey) = "value" * key
val mySet: Set[MyKey] = Set(1, 2, 3)
val myMap: Map[MyKey, String] =
mySet.map(k => k -> computeValueOf(k))(collection.breakOut)
See this answer for some discussion of what's going on here. Unlike the version with toMap, this won't construct an intermediate Set, saving you some allocations and a traversal. It's also much less readable, though—I only offer it because you mentioned that you wanted to avoid "iterating too much".

Sort a list by an ordered index

Let us assume that I have the following two sequences:
val index = Seq(2,5,1,4,7,6,3)
val unsorted = Seq(7,6,5,4,3,2,1)
The first is the index by which the second should be sorted. My current solution is to traverse over the index and construct a new sequence with the found elements from the unsorted sequence.
val sorted = index.foldLeft(Seq[Int]()) { (s, num) =>
s ++ Seq(unsorted.find(_ == num).get)
}
But this solution seems very inefficient and error-prone to me. On every iteration it searches the complete unsorted sequence. And if the index and the unsorted list aren't in sync, then either an error will be thrown or an element will be omitted. In both cases, the not in sync elements should be appended to the ordered sequence.
Is there a more efficient and solid solution for this problem? Or is there a sort algorithm which fits into this paradigm?
Note: This is a constructed example. In reality I would like to sort a list of mongodb documents by an ordered list of document Id's.
Update 1
I've selected the answer from Marius Danila because it seems the more fastest and scala-ish solution for my problem. It doesn't come with a not in sync item solution, but this could be easily implemented.
So here is the updated solution:
def sort[T: ClassTag, Key](index: Seq[Key], unsorted: Seq[T], key: T => Key): Seq[T] = {
val positionMapping = HashMap(index.zipWithIndex: _*)
val inSync = new Array[T](unsorted.size)
val notInSync = new ArrayBuffer[T]()
for (item <- unsorted) {
if (positionMapping.contains(key(item))) {
inSync(positionMapping(key(item))) = item
} else {
notInSync.append(item)
}
}
inSync.filterNot(_ == null) ++ notInSync
}
Update 2
The approach suggested by Bask.cc seems the correct answer. It also doesn't consider the not in sync issue, but this can also be easily implemented.
val index: Seq[String]
val entities: Seq[Foo]
val idToEntityMap = entities.map(e => e.id -> e).toMap
val sorted = index.map(idToEntityMap)
val result = sorted ++ entities.filterNot(sorted.toSet)
Why do you want to sort collection, when you already have sorted index collection? You can just use map
Concerning> In reality I would like to sort a list of mongodb documents by an ordered list of document Id's.
val ids: Seq[String]
val entities: Seq[Foo]
val idToEntityMap = entities.map(e => e.id -> e).toMap
ids.map(idToEntityMap _)
This may not exactly map to your use case, but Googlers may find this useful:
scala> val ids = List(3, 1, 0, 2)
ids: List[Int] = List(3, 1, 0, 2)
scala> val unsorted = List("third", "second", "fourth", "first")
unsorted: List[String] = List(third, second, fourth, first)
scala> val sorted = ids map unsorted
sorted: List[String] = List(first, second, third, fourth)
I do not know the language that you are using. But irrespective of the language this is how i would have solved the problem.
From the first list (here 'index') create a hash table taking key as the document id and the value as the position of the document in the sorted order.
Now when traversing through the list of document i would lookup the hash table using the document id and then get the position it should be in the sorted order. Then i would use this obtained order to sort in a pre allocated memory.
Note: if the number of documents is small then instead of using hashtable u could use a pre allocated table and index it directly using the document id.
Flat Mapping the index over the unsorted list seems to be a safer version (if the index isn't found it's just dropped since find returns a None):
index.flatMap(i => unsorted.find(_ == i))
It still has to traverse the unsorted list every time (worst case this is O(n^2)). With you're example I'm not sure that there's a more efficient solution.
In this case you can use zip-sort-unzip:
(unsorted zip index).sortWith(_._2 < _._2).unzip._1
Btw, if you can, better solution would be to sort list on db side using $orderBy.
Ok.
Let's start from the beginning.
Besides the fact you're rescanning the unsorted list each time, the Seq object will create, by default a List collection. So in the foldLeft you're appending an element at the end of the list each time and this is a O(N^2) operation.
An improvement would be
val sorted_rev = index.foldLeft(Seq[Int]()) { (s, num) =>
unsorted.find(_ == num).get +: s
}
val sorted = sorted_rev.reverse
But that is still an O(N^2) algorithm. We can do better.
The following sort function should work:
def sort[T: ClassTag, Key](index: Seq[Key], unsorted: Seq[T], key: T => Key): Seq[T] = {
val positionMapping = HashMap(index.zipWithIndex: _*) //1
val arr = new Array[T](unsorted.size) //2
for (item <- unsorted) { //3
val position = positionMapping(key(item))
arr(position) = item
}
arr //6
}
The function sorts a list of items unsorted by a sequence of indexes index where the key function will be used to extract the id from the objects you're trying to sort.
Line 1 creates a reverse index - mapping each object id to its final position.
Line 2 allocates the array which will hold the sorted sequence. We're using an array since we need constant-time random-position set performance.
The loop that starts at line 3 will traverse the sequence of unsorted items and place each item in it's meant position using the positionMapping reverse index
Line 6 will return the array converted implicitly to a Seq using the WrappedArray wrapper.
Since our reverse-index is an immutable HashMap, lookup should take constant-time for regular cases. Building the actual reverse-index takes O(N_Index) time where N_Index is the size of the index sequence. Traversing the unsorted sequence takes O(N_Unsorted) time where N_Unsorted is the size of the unsorted sequence.
So the complexity is O(max(N_Index, N_Unsorted)), which I guess is the best you can do in the circumstances.
For your particular example, you would call the function like so:
val sorted = sort(index, unsorted, identity[Int])
For the real case, it would probably be like this:
val sorted = sort(idList, unsorted, obj => obj.id)
The best I can do is to create a Map from the unsorted data, and use map lookups (basically the hashtable suggested by a previous poster). The code looks like:
val unsortedAsMap = unsorted.map(x => x -> x).toMap
index.map(unsortedAsMap)
Or, if there's a possibility of hash misses:
val unsortedAsMap = unsorted.map(x => x -> x).toMap
index.flatMap(unsortedAsMap.get)
It's O(n) in time*, but you're swapping time for space, as it uses O(n) space.
For a slightly more sophisticated version, that handles missing values, try:
import scala.collection.JavaConversions._
import scala.collection.mutable.ListBuffer
val unsortedAsMap = new java.util.LinkedHashMap[Int, Int]
for (i <- unsorted) unsortedAsMap.add(i, i)
val newBuffer = ListBuffer.empty[Int]
for (i <- index) {
val r = unsortedAsMap.remove(i)
if (r != null) newBuffer += i
// Not sure what to do for "else"
}
for ((k, v) <- unsortedAsMap) newBuffer += v
newBuffer.result()
If it's a MongoDB database in the first place, you might be better retrieving documents directly from the database by index, so something like:
index.map(lookupInDB)
*technically it's O(n log n), as Scala's standard immutable map is O(log n), but you could always use a mutable map, which is O(1)