Will using "+" preserve insertion order in a Map? - scala

scala> var test2 : Map[String , String] = Map("a"->"b","c"->"d")
test2: Map[String,String] = Map(a -> b, c -> d)
scala> test2 = test2 + ("e"->"f" , "g"->"h")
test2: Map[String,String] = Map(a -> b, c -> d, e -> f, g -> h)
And so on. As I understand it, Map is not supposed to preserve insertion order (that is what LinkedHashMap is for), so why do the results show the order being preserved? Is this a mere coincidence, or is there more than meets the eye?
Thanks in advance!

It's coincidence that holds only for the first 4 items.
val m = Map('a' -> 1, 'b' -> 2, 'c' -> 3, 'd' -> 4)
// m: immutable.Map[Char,Int] = Map(a -> 1, b -> 2, c -> 3, d -> 4)
m + ('e' -> 5)
// immutable.Map[Char,Int] = Map(e -> 5, a -> 1, b -> 2, c -> 3, d -> 4)
The reason is that there are special optimized implementations for small maps (up to four entries) that do preserve insertion order (e.g. adding to a Map of one pair simply appends), but once you cross that boundary the map switches to a hash-based implementation and the order is no longer preserved.
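To see the switch between implementations, you can inspect the runtime class of the map. This is only a sketch: the class names are implementation details of the standard library (on Scala 2.12 and 2.13 I would expect a Map4 for four entries and some HashMap variant above that), so treat them as illustrative rather than guaranteed. The names small, big, smallClass, bigClass and sameContent are made up for the example.

```scala
// Four entries: expected to be backed by one of the specialised small-map classes.
val small = Map('a' -> 1, 'b' -> 2, 'c' -> 3, 'd' -> 4)
// Five entries: expected to fall back to the general hash-based implementation.
val big = small + ('e' -> 5)

// Runtime class names (implementation details, may differ between versions).
val smallClass = small.getClass.getName
val bigClass = big.getClass.getName

// Map equality ignores iteration order, so lookups and comparisons never depend on it.
val sameContent = big == Map('e' -> 5, 'd' -> 4, 'c' -> 3, 'b' -> 2, 'a' -> 1)
```

Printing smallClass and bigClass in the REPL shows which implementation you actually got on your version.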

Coincidental. The ordering may be preserved, but preservation of ordering is not guaranteed and should not be relied upon. In fact, one should not consider maps to be ordered at all.

Related

find unique elements amongst the values of a map in scala

I have Map[String,Seq[String]].
I want to find the unique elements among all the values in the map. I want to do this in Scala.
Say, I have
Map('a' -> Seq(1,2,3),
    'b' -> Seq(2,3),
    'c' -> Seq(4))
I want the desired result to be
Map('a' -> Seq(1), 'c' -> Seq(4))
Any idea on how to do this?
Thanks!
If you are looking for the elements unique to each list, you can use currentList.diff(rest_of_the_list)
Given
scala> val input = Map('a' -> Seq(1,2,3), 'b' -> Seq(2,3), 'c' -> Seq(4))
input: scala.collection.immutable.Map[Char,Seq[Int]] = Map(a -> List(1, 2, 3), b -> List(2, 3), c -> List(4))
Find the rest of the elements for each key,
scala> val unions = input.map(elem => elem._1 -> input.filter(!_._1.equals(elem._1)).flatMap(_._2).toSet)
unions: scala.collection.immutable.Map[Char,scala.collection.immutable.Set[Int]] = Map(a -> Set(2, 3, 4), b -> Set(1, 2, 3, 4), c -> Set(1, 2, 3))
Then, iterate over the input map and find the unique elements in each list
scala> input.map(x => x._1 -> x._2.diff(unions(x._1).toList))
res18: scala.collection.immutable.Map[Char,Seq[Int]] = Map(a -> List(1), b -> List(), c -> List(4))
If you don't want empty keys (b in above example)
scala> input.map(x => x._1 -> x._2.diff(unions(x._1).toList)).filter(_._2.nonEmpty)
res21: scala.collection.immutable.Map[Char,Seq[Int]] = Map(a -> List(1), c -> List(4))
Alternatively, find the non-unique elements by flattening all the values and keeping those that occur more than once. Then remove every non-unique element from each key's list.
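val input = Map('a' -> Seq(1,2,3),
                'b' -> Seq(2,3),
                'c' -> Seq(4))
// Elements that occur more than once across all values
val nonUnique = input.values.flatten
  .groupBy(identity)
  .filter(_._2.size > 1)
  .keys.toSeq
// Drop the non-unique elements, then drop keys that are left with no elements
input.mapValues(x => x.diff(nonUnique)).filter(_._2.nonEmpty)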
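The two approaches above can be folded into one small function. This is a sketch rather than a definitive version: uniquePerKey and keyCount are names I made up, and I am assuming that a value repeated inside a single list should still count as unique to that key.

```scala
// Hypothetical helper: keeps, per key, only the values that occur under no other key.
def uniquePerKey[K, V](input: Map[K, Seq[V]]): Map[K, Seq[V]] = {
  // Count under how many keys each value occurs
  // (distinct makes duplicates within one list count once).
  val keyCount: Map[V, Int] =
    input.values.toSeq
      .flatMap(_.distinct)
      .groupBy(identity)
      .map { case (value, occurrences) => value -> occurrences.size }

  input
    .map { case (key, values) => key -> values.filter(v => keyCount(v) == 1) }
    .filter { case (_, values) => values.nonEmpty } // drop keys with nothing unique
}

val input = Map('a' -> Seq(1, 2, 3), 'b' -> Seq(2, 3), 'c' -> Seq(4))
val result = uniquePerKey(input)
```

It uses map rather than mapValues so that the result is a strict Map on both Scala 2.12 and 2.13.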

How to update a nested immutable map

I'm trying to find a cleaner way to update nested immutable structures in Scala. I think I'm looking for something similar to assoc-in in Clojure. I'm not sure how much types factor into this.
For example, in Clojure, to update the "city" attribute of a nested map I'd do:
> (def person {:name "john", :dob "1990-01-01", :home-address {:city "norfolk", :state "VA"}})
#'user/person
> (assoc-in person [:home-address :city] "richmond")
{:name "john", :dob "1990-01-01", :home-address {:state "VA", :city "richmond"}}
What are my options in Scala?
val person = Map("name" -> "john", "dob" -> "1990-01-01",
"home-address" -> Map("city" -> "norfolk", "state" -> "VA"))
As indicated in the other answer, you can leverage case classes to get cleaner, typed data objects. But in case what you need is simply to update a map:
val m = Map("A" -> 1, "B" -> 2)
val m2 = m + ("A" -> 3)
The result (in a worksheet):
m: scala.collection.immutable.Map[String,Int] = Map(A -> 1, B -> 2)
m2: scala.collection.immutable.Map[String,Int] = Map(A -> 3, B -> 2)
The + operator on a Map adds the new key-value pair, overwriting the value if the key already exists. Note that because m is an immutable Map held in a val, + returns a new map that you have to bind to a new val; the original is left unchanged.
Because, in your example, you're rewriting a nested value, doing this manually becomes somewhat more onerous:
val m = Map("A" -> 1, "B" -> Map("X" -> 2, "Y" -> 4))
val m2 = m + ("B" -> Map("X" -> 3))
This loses data (the nested Y value disappears):
m: scala.collection.immutable.Map[String,Any] = Map(A -> 1, B -> Map(X -> 2, Y -> 4))
m2: scala.collection.immutable.Map[String,Any] = Map(A -> 1, B -> Map(X -> 3)) // Note that 'Y' has gone away.
This forces you to copy the original nested value, update it, and then assign it back:
val m = Map("A" -> 1, "B" -> Map("X" -> 2, "Y" -> 4))
val b = m.get("B") match {
  case Some(b: Map[String, Any] @unchecked) => b + ("X" -> 3) // Updates `X` while keeping the other key-value pairs
  case _ => Map("X" -> 3) // `B` is missing or is not a Map
}
val m2 = m + ("B" -> b)
This yields the 'expected' result, but is obviously a lot of code:
m: scala.collection.immutable.Map[String,Any] = Map(A -> 1, B -> Map(X -> 2, Y -> 4))
b: scala.collection.immutable.Map[String,Any] = Map(X -> 3, Y -> 4)
m2: scala.collection.immutable.Map[String,Any] = Map(A -> 1, B -> Map(X -> 3, Y -> 4))
In short, when you 'update' any immutable data structure you are really copying all the pieces you want to keep and substituting updated values where appropriate. If the structure is complicated, this can get onerous. Hence the recommendation in the other answer to use a lens library such as Monocle.
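If you do want to stay with nested maps, the copy-and-reassign steps above can be generalised into a small recursive helper in the spirit of Clojure's assoc-in. This is a sketch under assumptions: assocIn is a hypothetical function (nothing like it ships with the Scala standard library), it only handles Map[String, Any], and it replaces any non-map value found along the path with a fresh map.

```scala
// Hypothetical helper modelled on Clojure's assoc-in; not a standard library function.
def assocIn(m: Map[String, Any], path: List[String], value: Any): Map[String, Any] =
  path match {
    case Nil        => m                     // empty path: nothing to update
    case key :: Nil => m.updated(key, value) // last step: set the value
    case key :: rest =>
      // Reuse the nested map if there is one, otherwise start from an empty map.
      val child = m.get(key) match {
        case Some(nested: Map[String, Any] @unchecked) => nested
        case _                                         => Map.empty[String, Any]
      }
      m.updated(key, assocIn(child, rest, value))
  }

val person: Map[String, Any] = Map(
  "name" -> "john",
  "dob" -> "1990-01-01",
  "home-address" -> Map("city" -> "norfolk", "state" -> "VA")
)
val moved = assocIn(person, List("home-address", "city"), "richmond")
```

The price is the same as before: everything is typed as Any, so the compiler cannot check the path or the value type for you.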
Scala is a statically typed language, so you may first want to increase the safety of your code by moving away from any-string-to-any-string.
case class Address(city: String, state: String)
case class Person(name: String, dob: java.util.Date, homeAddress: Address)
(Yes, there are better alternatives for java.util.Date).
Then you create an update like this:
val person = Person(name = "john", dob = new java.util.Date(90, 0, 1),
homeAddress = Address(city = "norfolk", state = "VA"))
person.copy(homeAddress = person.homeAddress.copy(city = "richmond"))
To avoid this nested copy, you would use a lens library, like Monocle or Quicklens (there are many others).
import com.softwaremill.quicklens._
person.modify(_.homeAddress.city).setTo("richmond")
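For a rough idea of what such a library does for you, here is a minimal hand-rolled lens. It is only a sketch: Lens, andThen and the lens values below are names I made up and are not the Monocle or Quicklens API, and Person is simplified (no dob field) to keep the example short.

```scala
case class Address(city: String, state: String)
case class Person(name: String, homeAddress: Address)

// Hypothetical minimal lens: a getter plus a copying setter for one field.
final case class Lens[S, A](get: S => A, set: (S, A) => S) {
  // Compose with a lens on the inner value to reach one level deeper.
  def andThen[B](inner: Lens[A, B]): Lens[S, B] =
    Lens[S, B](
      s => inner.get(get(s)),
      (s, b) => set(s, inner.set(get(s), b))
    )
}

val homeAddress = Lens[Person, Address](_.homeAddress, (p, a) => p.copy(homeAddress = a))
val city = Lens[Address, String](_.city, (a, c) => a.copy(city = c))
val personCity = homeAddress.andThen(city)

val person = Person("john", Address("norfolk", "VA"))
val moved = personCity.set(person, "richmond")
```

The nested copy calls are still there, but they are written once per field instead of once per update.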
The other two answers nicely sum up the importance of correctly modelling your problem so we don't end up having to deal with Map[String, Object] type of collection.
Just adding my two cents here with a brute-force solution that uses Scala's quite powerful function pipelining and higher-order functions. The ugly asInstanceOf cast is needed because the Map values are of different types, so Scala treats the Map's type as Map[String,Any].
val person: Map[String,Any] = Map("name" -> "john", "dob" -> "1990-01-01", "home-address" -> Map("city" -> "norfolk", "state" -> "VA"))
val newperson = person.map({case(k,v) => if(k == "home-address") k -> v.asInstanceOf[Map[String,String]].updated("city","richmond") else k -> v})

scala map += operator with five pairs

I am having an issue with appending pairs to an existing Map. Once I reach the fifth pair of the Map, the Map reorders itself. The order is correct with 4 pairs, but as soon as the 5th is added it shifts itself. See example below (assuming I built the 4 pair Map one pair at a time.):
scala> var a = Map("a1" -> 1, "a2" -> 1, "a3" -> 1, "a4" -> 1)
a: scala.collection.immutable.Map[String,Int] = Map(a1 -> 1, a2 -> 1, a3 -> 1, a4 -> 1)
scala> a += ("a5" -> 1)
scala> a
res26: scala.collection.immutable.Map[String,Int] = Map(a5 -> 1, a4 -> 1, a3 -> 1, a1 -> 1, a2 -> 1)
The added fifth element jumped to the front of the Map and shifts the others around. Is there a way to keep the elements in order (1, 2, 3, 4, 5) ?
Thanks
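By default, Scala's immutable.Map switches to a HashMap once it holds more than four entries.
Scala's immutable HashMap is a separate implementation from Java's, but the same caveat applies. From http://docs.oracle.com/javase/6/docs/api/java/util/HashMap.html: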
This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time
So a map is really not a table that contains "a1" -> 1, but a table that contains hash("a1") -> 1. The map reorders its keys based on the hash of the key rather than the key you put in it.
As was recommended in the comments, use LinkedHashMap or ListMap:
Scala Map implementation keeping entries in insertion order?
PS: You might be interested in reading this article: http://howtodoinjava.com/2012/10/09/how-hashmap-works-in-java/
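A minimal sketch of the ListMap suggestion, using the keys from the question (the names four, five and keysInOrder are made up here). ListMap keeps insertion order but has linear-time lookups, so it is best kept for small maps.

```scala
import scala.collection.immutable.ListMap

// Same four pairs as in the question, but in an insertion-ordered map.
val four = ListMap("a1" -> 1, "a2" -> 1, "a3" -> 1, "a4" -> 1)
// Adding a fifth pair appends it instead of reshuffling the keys.
val five = four + ("a5" -> 1)

val keysInOrder = five.keys.toList
```

If you need a mutable map instead, scala.collection.mutable.LinkedHashMap gives the same ordering guarantee.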

Scala troubles with sorting

I am still learning Scala and have run into a problem that I would like to solve.
What I have at the moment is a Seq of items of type X. Now I want to write a function that returns a map from each count to the set of items that appear that many times in the original Seq.
Here is a small example of what I want to do:
val exampleSeq = Seq('a', 'b', 'd', 'd', 'c', 'b', 'd')
val exampleSeq2 = Seq('a', 'a', 'a', 'c', 'c', 'b', 'b', 'c')
myMagicalFunction(exampleSeq)  // should return Map(1 -> Set(a, c), 2 -> Set(b), 3 -> Set(d))
myMagicalFunction(exampleSeq2) // should return Map(2 -> Set(b), 3 -> Set(a, c))
So far I have been able to create a function that maps the item with the times it appears:
def function[X](seq: Seq[X]) = seq.groupBy(item => item).mapValues(_.size)
For my exampleSeq, it returns
Map(a -> 1, b -> 2, c -> 1, d -> 3)
Thank you for answers :)
One approach, for
val a = Seq('a', 'b', 'd', 'd', 'c', 'b', 'd')
this
val b = for ( (k,v) <- a.groupBy(identity).mapValues(_.size).toArray )
yield (v,k)
delivers
Array((2,b), (3,d), (1,a), (1,c))
and so
b.groupBy(_._1).mapValues(_.map(_._2).toSet)
res: Map(2 -> Set(b), 1 -> Set(a, c), 3 -> Set(d))
Note seq.groupBy(item => item) is equivalent to seq.groupBy(identity).
You are almost there! Starting from the collection element -> count, you only need one more transformation to get to count -> Set[elem].
Let's say that freqItem = Map(a -> 1, b -> 2, c -> 1, d -> 3); you would do something like:
val freqSet = freqItem.toSeq.map(_.swap).groupBy(_._1).mapValues(_.map(_._2).toSet)
Note that we transform the Map into a Seq before swapping the (k,v) into (v,k) because mapping over a Map preserves the semantics of key uniqueness and you'd lose one of (1 -> a), (1 -> c) otherwise.
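Putting the counting step and the inversion step together, the whole function could look like the sketch below. The name itemsByFrequency and the intermediate counts are made up for the example, and I use map instead of mapValues so the result is a strict Map on both Scala 2.12 and 2.13.

```scala
// Hypothetical name; maps each count to the set of items occurring that many times.
def itemsByFrequency[X](seq: Seq[X]): Map[Int, Set[X]] = {
  // Step 1: item -> number of occurrences.
  val counts: Map[X, Int] =
    seq.groupBy(identity).map { case (item, occurrences) => item -> occurrences.size }

  // Step 2: group the (item, count) pairs by count and keep only the items.
  // toSeq comes first so that pairs sharing a count are not collapsed.
  counts.toSeq
    .groupBy { case (_, count) => count }
    .map { case (count, pairs) => count -> pairs.map(_._1).toSet }
}

val example1 = itemsByFrequency(Seq('a', 'b', 'd', 'd', 'c', 'b', 'd'))
val example2 = itemsByFrequency(Seq('a', 'a', 'a', 'c', 'c', 'b', 'b', 'c'))
```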
You can write your function as:
def f[T](l: Seq[T]): Map[Int, Set[T]] = {
  l.map {
    x => (x, l.count(_ == x))
  }.distinct.groupBy(_._2).mapValues(_.map(_._1).toSet)
}
val l = List("a","a","a","b","b","b","b","c","c","d","e")
f(l)
res0: Map[Int,Set[String]] = Map(2 -> Set(c), 4 -> Set(b), 1 -> Set(d, e), 3 -> Set(a))
scala> case class A(name:String,age:Int)
defined class A
scala> val l = List(new A("a",1),new A("b",2),new A("a",1),new A("c",1) )
l: List[A] = List(A(a,1), A(b,2), A(a,1), A(c,1))
scala> f[A](l)
res1: Map[Int,Set[A]] = Map(2 -> Set(A(a,1)), 1 -> Set(A(b,2), A(c,1)))

Convert map to get size of lists in each element

I have a Map where each key has a List as its value
e.g.
Map(a -> List(a, a), b -> List(b, b), l -> List(l, l, l), h -> List(h))
I want to convert this so that each value is the size of List e.g.
Map(a -> 2, b -> 2, l -> 3, h -> 1)
I try:
myMap.map(x => x.size())
which gives...
error: value size is not a member of (Char, List[Char])
Any tips how I do this?
Thanks.
Quick solution: myMap.mapValues(x => x.size). The standard map iterates over key-value pairs, so in your attempt x is a (Char, List[Char]) tuple, which has no size member.
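If you prefer to stick with map, you can pattern match on the pair instead. A small sketch with the data from the question (the name sizes is made up); unlike mapValues, which returns a lazy view, this builds a strict Map.

```scala
val myMap: Map[Char, List[Char]] = Map(
  'a' -> List('a', 'a'),
  'b' -> List('b', 'b'),
  'l' -> List('l', 'l', 'l'),
  'h' -> List('h')
)

// Destructure each (key, list) pair and keep the key with the list's size.
val sizes: Map[Char, Int] = myMap.map { case (key, values) => key -> values.size }
```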