Immutable Scala Map implementation that preserves insertion order [duplicate] - scala

LinkedHashMap is used to preserve insertion order in the map, but this only works for mutable maps. Which is the immutable Map implementation that preserves insertion order?

ListMap implements an immutable map using a list-based data structure, and thus preserves insertion order.
scala> import collection.immutable.ListMap
import collection.immutable.ListMap
scala> ListMap(1 -> 2) + (3 -> 4)
res31: scala.collection.immutable.ListMap[Int,Int] = Map(1 -> 2, 3 -> 4)
scala> res31 + (6 -> 9)
res32: scala.collection.immutable.ListMap[Int,Int] = Map(1 -> 2, 3 -> 4, 6 -> 9)
The following extension method, Seq#toListMap, can be quite useful when working with ListMaps.
scala> import scalaz._, Scalaz._, Liskov._
import scalaz._
import Scalaz._
import Liskov._
scala> :paste
// Entering paste mode (ctrl-D to finish)
implicit def seqW[A](xs: Seq[A]) = new SeqW(xs)
class SeqW[A](xs: Seq[A]) {
def toListMap[B, C](implicit ev: A <~< (B, C)): ListMap[B, C] = {
ListMap(co[Seq, A, (B, C)](ev)(xs) : _*)
}
}
// Exiting paste mode, now interpreting.
seqW: [A](xs: Seq[A])SeqW[A]
defined class SeqW
scala> Seq((2, 4), (11, 89)).toListMap
res33: scala.collection.immutable.ListMap[Int,Int] = Map(2 -> 4, 11 -> 89)
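If pulling in scalaz just for this is not desirable, a similar helper can probably be written with the standard library's `<:<` evidence, the same mechanism `toMap` relies on. The names `ListMapSyntax` and `SeqToListMap` are made up for illustration; this is a minimal sketch, not the code from the answer above.

```scala
import scala.collection.immutable.ListMap

object ListMapSyntax {
  // Hypothetical wrapper: adds toListMap to any Seq whose elements are pairs.
  implicit class SeqToListMap[A](private val xs: Seq[A]) {
    // ev proves that A is a pair type and also serves as the function A => (B, C).
    def toListMap[B, C](implicit ev: A <:< (B, C)): ListMap[B, C] =
      ListMap(xs.map(ev): _*)
  }
}

// Usage:
//   import ListMapSyntax._
//   Seq((2, 4), (11, 89)).toListMap
```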

While ListMap will preserve insertion order, it is not very efficient: lookup time, for example, is linear. I suggest you create a new collection class which wraps both an immutable.HashMap and an immutable.TreeMap. The hash map should be parametrized as immutable.HashMap[Key, (Value, Long)], where the Long in the tuple is the key of the corresponding entry in the TreeMap[Long, Key]. You also keep an entry counter on the side. Because the tree map is sorted by that counter, it orders the entries by insertion.
You implement insertion and lookup in the straightforward way: to insert, increment the counter, insert into the hash map, and insert the counter-key pair into the tree map. You use the hash map for lookup.
You implement iteration by using the tree map.
To implement remove, you have to remove the key-value pair from the hash map and use the index from the tuple to remove the corresponding entry from the tree map.
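The design described above could be sketched roughly as follows. The class name `InsertionOrderedMap` and its method names are hypothetical, it does not implement the full `Map` interface, and it assumes that updating an existing key should keep the key's original position.

```scala
import scala.collection.immutable.{HashMap, TreeMap}

// Hypothetical sketch: a HashMap for lookup plus a TreeMap that records insertion order.
final class InsertionOrderedMap[K, V] private (
    entries: HashMap[K, (V, Long)], // key -> (value, insertion index)
    order: TreeMap[Long, K],        // insertion index -> key
    counter: Long                   // next insertion index
) {
  // Lookup goes through the hash map only.
  def get(key: K): Option[V] = entries.get(key).map(_._1)

  def updated(key: K, value: V): InsertionOrderedMap[K, V] =
    entries.get(key) match {
      case Some((_, idx)) =>
        // Assumption: an existing key keeps its original position.
        new InsertionOrderedMap(entries.updated(key, (value, idx)), order, counter)
      case None =>
        new InsertionOrderedMap(
          entries.updated(key, (value, counter)),
          order.updated(counter, key),
          counter + 1)
    }

  // Remove from the hash map, then use the stored index to remove from the tree map.
  def removed(key: K): InsertionOrderedMap[K, V] =
    entries.get(key) match {
      case Some((_, idx)) => new InsertionOrderedMap(entries - key, order - idx, counter)
      case None           => this
    }

  // Iteration walks the tree map, which is sorted by insertion index.
  def iterator: Iterator[(K, V)] =
    order.valuesIterator.map(k => k -> entries(k)._1)
}

object InsertionOrderedMap {
  def empty[K, V]: InsertionOrderedMap[K, V] =
    new InsertionOrderedMap(HashMap.empty[K, (V, Long)], TreeMap.empty[Long, K], 0L)
}
```

On Scala 2.13 or later, the standard library's immutable.VectorMap also preserves insertion order and may make a hand-written class unnecessary.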

Related

how to convert List(String,String) to ListMap[String,String]?

I have a list of type List[(String, String)] and I want to convert it to a map. When I used the toMap method, I found that it does not preserve the order of the data in the list. However, my goal is to convert the list to a Map while keeping the data in the same order as in the list. I learned that ListMap preserves insertion order (but it is immutable), so I could use a LinkedHashMap and a map function to insert the data sequentially into it, but that means I need to iterate over all the elements, which is a pain. Can anyone please suggest a better approach?
Thanks
This should do it:
val listMap = ListMap(list : _*)
In Scala 2.13 or later:
scala> import scala.collection.immutable.ListMap
import scala.collection.immutable.ListMap
scala> val list = List((1,2), (3,4), (5,6), (7,8), (9,0))
list: List[(Int, Int)] = List((1,2), (3,4), (5,6), (7,8), (9,0))
scala> list.to(ListMap)
res3: scala.collection.immutable.ListMap[Int,Int] = ListMap(1 -> 2, 3 -> 4, 5 -> 6, 7 -> 8, 9 -> 0)
Don't use a ListMap. They perform very poorly: since they are structured as lists, lookup is linear (https://docs.scala-lang.org/overviews/collections/performance-characteristics.html).
I'd advise instantiating a mutable LinkedHashMap and then assigning it to a val typed as collection.Map. The collection.Map interface doesn't expose mutating methods, so the map is effectively read-only to any code that accesses it through that type.
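A minimal sketch of that suggestion, assuming the pairs arrive as a Seq; the object and method names (`ReadOnlyOrderedMap`, `build`) are made up for illustration:

```scala
import scala.collection.mutable.LinkedHashMap

object ReadOnlyOrderedMap {
  // Builds a LinkedHashMap (insertion-ordered, hashed lookup) and returns it
  // typed as scala.collection.Map, which has no mutating methods.
  def build[K, V](pairs: Seq[(K, V)]): scala.collection.Map[K, V] = {
    val m = LinkedHashMap.empty[K, V]
    m ++= pairs
    m
  }
}
```

The underlying object is still mutable, so this only protects against callers that stick to the collection.Map type; code that downcasts can still modify it.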

Efficient way to check if a traversable has more than 1 element in Scala

I need to check if a Traversable (which I already know to be nonEmpty) has a single element or more.
I could use size, but (tell me if I'm wrong) I suspect that this could be O(n), and traverse the collection to compute it.
I could check if tail.nonEmpty, or if .head != .last
Which are the pros and cons of the two approaches? Is there a better way? (for example, will .last do a full iteration as well?)
All approaches that cut elements from the beginning of the collection and return the tail can be inefficient, depending on the collection. For example, tail for List is O(1), while tail for Array is O(N). The same goes for drop.
I propose using take:
list.take(2).size == 1 // list is singleton
take is specified to return the whole collection if its length is less than take's argument, so there is no error if the collection is empty or has only one element. On the other hand, even if the collection is huge, take(2) does a bounded amount of work: internally it starts iterating the collection, takes two steps and stops, putting the elements in a new collection to return.
UPD: I changed condition to exactly match the question
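The same "take two steps and stop" idea can also be written directly against an iterator, which avoids building the small intermediate collection. This is a sketch that assumes the value is available as an Iterable (a bare Traversable does not expose iterator); the names `SizeCheck` and `hasMoreThanOne` are hypothetical.

```scala
object SizeCheck {
  // Advances the iterator at most twice, regardless of the collection's size.
  def hasMoreThanOne[A](xs: Iterable[A]): Boolean = {
    val it = xs.iterator
    it.hasNext && { it.next(); it.hasNext }
  }
}
```

On Scala 2.13 or later, `xs.sizeIs > 1` expresses the same check without a helper.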
Not all will be the same, but let's take a worst case scenario where it's a List. last will consume the entire List just to access that element, as will size.
tail.nonEmpty is obtained from a head :: tail pattern match, which doesn't need to consume the entire List. If you already know the list to be non-empty, this should be the obvious choice.
But not every collection's tail takes constant time the way List's does; see the Scala Collections Performance overview.
You can take a view of a traversable. You can slice the TraversableView lazily.
The initial star is because the REPL prints some output.
scala> val t: Traversable[Int] = Stream continually { println("*"); 42 }
*
t: Traversable[Int] = Stream(42, ?)
scala> t.view.slice(0,2).size
*
res1: Int = 2
scala> val t: Traversable[Int] = Stream.fill(1) { println("*"); 42 }
*
t: Traversable[Int] = Stream(42, ?)
scala> t.view.slice(0,2).size
res2: Int = 1
The advantage is that there is no intermediate collection.
scala> val t: Traversable[_] = Map((1 to 10) map ((_, "x")): _*)
t: Traversable[_] = Map(5 -> x, 10 -> x, 1 -> x, 6 -> x, 9 -> x, 2 -> x, 7 -> x, 3 -> x, 8 -> x, 4 -> x)
scala> t.take(2)
res3: Traversable[Any] = Map(5 -> x, 10 -> x)
That returns an unoptimized Map, for instance:
scala> res3.getClass
res4: Class[_ <: Traversable[Any]] = class scala.collection.immutable.HashMap$HashTrieMap
scala> Map(1->"x",2->"x").getClass
res5: Class[_ <: scala.collection.immutable.Map[Int,String]] = class scala.collection.immutable.Map$Map2
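What about pattern matching? (Note that the :: pattern only matches a List, so any other collection falls through to the second case.)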
itrbl match { case _::Nil => "one"; case _=>"more" }

How to find the number of (key , value) pairs in a map in scala?

I need to find the number of (key , value) pairs in a Map in my Scala code. I can iterate through the map and get an answer but I wanted to know if there is any direct function for this purpose or not.
You can use .size:
scala> val m=Map("a"->1,"b"->2,"c"->3)
m: scala.collection.immutable.Map[String,Int] = Map(a -> 1, b -> 2, c -> 3)
scala> m.size
res3: Int = 3
Use Map#size:
The size of this traversable or iterator.
The size method is from TraversableOnce so, barring infinite sequences or sequences that shouldn't be iterated again, it can be used over a wide range - List, Map, Set, etc.

How to use priority queues in Scala?

I am trying to implement A* search in Scala (version 2.10), but I've run into a brick wall: I can't figure out how to use Scala's PriorityQueue.
I have a set of squares, represented by (Int, Int)s, and I need to insert them with priorities represented by Ints. In Python you just have a list of key, value pairs and use the heapq functions to sort it.
So how do you do this?
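There is actually a predefined lexicographic ordering for tuples. It is resolved implicitly from Ordering's companion object; if you also want comparison operators such as < on tuples, import them: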
import scala.math.Ordering.Implicits._
Moreover, you can define your own ordering.
Suppose I want to arrange tuples, based on the difference between first and second members of the tuple:
scala> import scala.collection.mutable.PriorityQueue
// import scala.collection.mutable.PriorityQueue
scala> def diff(t2: (Int,Int)) = math.abs(t2._1 - t2._2)
// diff: (t2: (Int, Int))Int
scala> val x = new PriorityQueue[(Int, Int)]()(Ordering.by(diff))
// x: scala.collection.mutable.PriorityQueue[(Int, Int)] = PriorityQueue()
scala> x.enqueue(1 -> 1)
scala> x.enqueue(1 -> 2)
scala> x.enqueue(1 -> 3)
scala> x.enqueue(1 -> 4)
scala> x.enqueue(1 -> 0)
scala> x
// res5: scala.collection.mutable.PriorityQueue[(Int, Int)] = PriorityQueue((1,4), (1,3), (1,2), (1,1), (1,0))
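For the A* use case in the question, one possible arrangement is to store (priority, square) pairs and order them by priority only. PriorityQueue dequeues the largest element first, so the sketch below reverses the ordering to get the lowest priority out first. The names `FrontierDemo`, `Square`, `byLowestPriority` and `drain` are assumptions made for illustration.

```scala
import scala.collection.mutable.PriorityQueue

object FrontierDemo {
  type Square = (Int, Int)

  // Order entries by their Int priority only; reverse so the smallest priority is dequeued first.
  val byLowestPriority: Ordering[(Int, Square)] =
    Ordering.by[(Int, Square), Int](_._1).reverse

  // Enqueues all items, then dequeues them one by one, lowest priority first.
  def drain(items: Seq[(Int, Square)]): List[(Int, Square)] = {
    val frontier = new PriorityQueue[(Int, Square)]()(byLowestPriority)
    frontier ++= items
    val out = List.newBuilder[(Int, Square)]
    while (frontier.nonEmpty) out += frontier.dequeue()
    out.result()
  }
}
```

Only dequeue is guaranteed to respect the ordering; printing or iterating the queue shows its internal heap layout.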
Indeed, there is no implicit ordering on pairs of integers (a, b). What would it be? Perhaps they are both positive and you can use (a - 1.0/b)? Or they are not, and you can use, what, (a + atan(b/pi))? If you have an ordering in mind, you can consider wrapping your pairs in a type that has your ordering.

Nested Default Maps in Scala

I'm trying to construct nested maps in Scala, where both the outer and inner map use the "withDefaultValue" method. For example, the following :
val m = HashMap.empty[Int, collection.mutable.Map[Int,Int]].withDefaultValue( HashMap.empty[Int,Int].withDefaultValue(3))
m(1)(2)
res: Int = 3
m(1)(2) = 5
m(1)(2)
res: Int = 5
m(2)(3) = 6
m
res : scala.collection.mutable.Map[Int,scala.collection.mutable.Map[Int,Int]] = Map()
So the map, when addressed by the appropriate keys, gives me back what I put in. However, the map itself appears empty! Even m.size returns 0 in this example. Can anyone explain what's going on here?
Short answer
It's definitely not a bug.
Long answer
The behavior of withDefaultValue is to store a default value (in your case, a mutable map) inside the Map, to be returned when the key does not exist. This is not the same as a value that is inserted into the Map when the key is not found.
Let's look closely at what's happening. It will be easier to understand if we pull the default map out as a separate variable so we can inspect it at will; let's call it default:
import collection.mutable.HashMap
val default = HashMap.empty[Int,Int].withDefaultValue(3)
So default is a mutable map (that has its own default value). Now we can create m and give default as the default value.
import collection.mutable.{Map => MMap}
val m = HashMap.empty[Int, MMap[Int,Int]].withDefaultValue(default)
Now whenever m is accessed with a missing key, it will return default. Notice that this is the exact same behavior as you have because withDefaultValue is defined as:
def withDefaultValue (d: B): Map[A, B]
Notice that it's d: B and not d: => B, so it will not create a new map each time the default is accessed; it will return the same exact object, what we've called default.
So let's see what happens:
m(1) // Map()
Since key 1 is not in m, the default value, default, is returned. At this point default is an empty Map.
m(1)(2) = 5
Since m(1) returns default, this operation stores 5 as the value for key 2 in default. Nothing is written to the Map m because m(1) resolves to default which is a separate Map entirely. We can check this by viewing default:
default // Map(2 -> 5)
But as we said, m is left unchanged
m // Map()
Now, how to achieve what you really wanted? Instead of using withDefaultValue, you want to make use of getOrElseUpdate:
def getOrElseUpdate (key: A, op: ⇒ B): B
Notice how we see op: => B? This means that the argument op will be re-evaluated each time it is needed. This allows us to put a new Map in there and have it be a separate new Map for each invalid key. Let's take a look:
val m2 = HashMap.empty[Int, MMap[Int,Int]]
No default values needed here.
m2.getOrElseUpdate(1, HashMap.empty[Int,Int].withDefaultValue(3)) // Map()
Key 1 doesn't exist, so we insert a new HashMap and return that new value. We can check that it was inserted as we expected. Notice that 1 maps to the newly added empty map and that the 3 was not added anywhere, because of the behavior explained above.
m2 // Map(1 -> Map())
Likewise, we can update the Map as expected:
m2.getOrElseUpdate(1, HashMap.empty[Int,Int].withDefaultValue(1))(2) = 6
and check that it was added:
m2 // Map(1 -> Map(2 -> 6))
withDefaultValue is used to return a value when the key is not found. It does not populate the map, so your map stays empty. It is somewhat like using getOrElse(a, b), where b is provided by withDefaultValue.
I just had the exact same problem, and was happy to find dhg's answer. Since typing getOrElseUpdate all the time is not very concise, I came up with this little extension of the idea that I want to share:
You can declare a class that uses getOrElseUpdate as default behavior for the () operator:
import scala.collection.mutable.HashMap

// A HashMap whose apply inserts the computed default when the key is missing.
class DefaultDict[K, V](defaultFunction: K => V) extends HashMap[K, V] {
  override def default(key: K): V = defaultFunction(key)
  override def apply(key: K): V = getOrElseUpdate(key, default(key))
}
Now you can do what you want to do like this:
val map = new DefaultDict[Int, DefaultDict[Int, Int]](_ => new DefaultDict(_ => 3))
map(1)(2) = 5
map(1)(2) = 5
Which does now result in map containing 5 (or rather: containing a DefaultDict containing the value 5 for the key 2).
What you're seeing is the effect of having created a single Map[Int, Int] that is the default value whenever the key isn't in the outer map.
scala> val m = HashMap.empty[Int, collection.mutable.Map[Int,Int]].withDefaultValue( HashMap.empty[Int,Int].withDefaultValue(3))
m: scala.collection.mutable.Map[Int,scala.collection.mutable.Map[Int,Int]] = Map()
scala> m(2)(2)
res1: Int = 3
scala> m(1)(2) = 5
scala> m(2)(2)
res2: Int = 5
To get the effect that you're looking for, you'll have to wrap the Map with an implementation that actually inserts the default value when a key isn't found in the Map.
Edit:
I'm not sure what your actual use case is, but you may have an easier time using a pair for the key to a single Map.
scala> val m = HashMap.empty[(Int, Int), Int].withDefaultValue(3)
m: scala.collection.mutable.Map[(Int, Int),Int] = Map()
scala> m((1, 2))
res0: Int = 3
scala> m((1, 2)) = 5
scala> m((1, 2))
res3: Int = 5
scala> m
res4: scala.collection.mutable.Map[(Int, Int),Int] = Map((1,2) -> 5)
I know it's a bit late, but I've just seen this post while trying to solve the same problem.
The API has probably changed since the 2012 version, but you may want to use withDefault instead of withDefaultValue.
The difference is that withDefault takes a function as a parameter, which is executed every time a missing key is requested ;)
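One caveat worth checking before relying on this: like withDefaultValue, withDefault returns the computed value without storing it in the map, so an update made through a defaulted inner map is lost unless the inner map is inserted explicitly. The sketch below illustrates this for the nested case; the names `WithDefaultDemo`, `inner`, `lost` and `kept` are made up for illustration.

```scala
import scala.collection.mutable

object WithDefaultDemo {
  // Hypothetical helper: a fresh inner map with default value 3.
  def inner(): mutable.Map[Int, Int] =
    mutable.HashMap.empty[Int, Int].withDefaultValue(3)

  // The outer map creates a new inner map on every miss, but does not store it.
  val m: mutable.Map[Int, mutable.Map[Int, Int]] =
    mutable.HashMap.empty[Int, mutable.Map[Int, Int]].withDefault(_ => inner())

  m(1)(2) = 5                          // writes into a fresh inner map that is then discarded
  val lost: Int = m(1)(2)              // another fresh inner map, so this is the default again

  m.getOrElseUpdate(1, inner())(2) = 5 // stores the inner map in m first, then updates it
  val kept: Int = m(1)(2)              // now reads from the stored inner map
}
```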