toMap when keys are repeated with different values - scala

I have a list
val data = List(2, 4, 3, 2, 1, 1, 1,7)
from which I want to create a map whose keys are the values of the list above and whose values are their indices. I tried
scala> data.zipWithIndex.toMap
res5: scala.collection.immutable.Map[Int,Int] = Map(1 -> 6, 2 -> 3, 7 -> 7, 3 -> 2, 4 -> 1)
but it gives res5(1) as 6, whereas I want it to be 4 (the index of the first occurrence).
I could solve it by
data.zipWithIndex groupBy (_._1) mapValues (w=>w.map(tuple=>tuple._2) min)
but is there any way to pass a function f to toMap so that it builds the map the way I want?

toMap is going to add each pair to the map in the order of the zipped list, and when you add a mapping k -> v to a map that already contains a k, the old value is simply replaced.
An easy fix is just to reverse the list after zipping the indices and before converting to a map:
data.zipWithIndex.reverse.toMap
Now the mappings 1 -> 6 and 1 -> 5 will be added before 1 -> 4, which means 1 -> 4 is the one you'll see in the result.
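On the original question of passing a function: toMap itself takes no merge function, but a small helper can play that role. The sketch below is only an illustration; toMapWith and the ToMapWith object are hypothetical names, not part of the standard library. It folds over the pairs and applies the supplied function whenever a key is already present.

```scala
object ToMapWith {
  // Hypothetical helper (not a library method): build a Map from pairs,
  // resolving repeated keys with the caller-supplied `merge` function.
  def toMapWith[K, V](pairs: Iterable[(K, V)])(merge: (V, V) => V): Map[K, V] =
    pairs.foldLeft(Map.empty[K, V]) { case (acc, (k, v)) =>
      // If the key exists, combine old and new value; otherwise insert as-is.
      acc.updated(k, acc.get(k).map(merge(_, v)).getOrElse(v))
    }

  val data = List(2, 4, 3, 2, 1, 1, 1, 7)

  // The smallest index wins for every repeated key.
  val firstIndex: Map[Int, Int] = toMapWith(data.zipWithIndex)(_ min _)
}
```

With this, ToMapWith.firstIndex(1) is 4, and swapping `_ min _` for `_ max _` reproduces what plain toMap does here.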

Related

Getting all key value pairs having the maximum value from a Scala map

I have seen a similar post here, which returns a single key-value pair that has the maximum value in the entire Map.
But I would like to get a List of all pairs that have the maximum value (the maximum value is shared by several pairs).
Ex : Map(1 -> 7, 2 -> 1, 4 -> 7, 3 -> 2)
Expected Output : List(1 -> 7, 4 -> 7)
This (Map(1 -> 7, 2 -> 1, 4 -> 7, 3 -> 2).maxBy(x => x._2)) gives only the first occurrence, 1 -> 7.
Using map.filter(_._2 == map.values.max) will do the trick.
val maxValue = map.values.max
map.filter(_._2 == maxValue).toList
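One detail worth guarding against is that values.max throws on an empty map. A minimal sketch that wraps the same filter in a function and handles that case follows; maxEntries and the MaxEntries object are names made up for this example.

```scala
object MaxEntries {
  // Hypothetical helper: all entries whose value equals the maximum value.
  // Returns Nil for an empty map instead of letting `max` throw.
  def maxEntries[K, V: Ordering](m: Map[K, V]): List[(K, V)] =
    if (m.isEmpty) Nil
    else {
      val maxValue = m.values.max // computed once, not once per entry
      m.filter(_._2 == maxValue).toList
    }
}
```

The order of the resulting list follows the map's iteration order, so it should not be relied on for a hash-based Map.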

How to remove duplicates from particular column in Scala by reading textfile

I am new to Scala. I am reading a text file from local disk, and I want to count the duplicate values in a particular column, as in the example below.
Input File:
1,2,3
2,3,4
1,3,4
2,4,5
3,4,5
I need output like this:
Select first column
1->2
2->3
3->1
program is:
val file=scala.io.Source.fromFile("D:/Files/test.txt").getLines().mkString("\n")
val d=file.groupBy(identity).mapValues(_.size)
println(d)
But I am getting output Like this
Map(-> 5, 4 -> 1, 9 -> 1, 5 -> 3, , -> 12, 1 -> 3, 0 -> 1, 2 -> 5, 3 -> 4)
It's counting all the data, but I want to count duplicates in a particular column only.
The issue is that once mkString is called, the separation into lines is lost: the result is one long String, so groupBy runs over its individual characters. Another approach is to call toArray instead.
val file = scala.io.Source.fromFile("D:/Files/test.txt")
val lines = file.getLines().toArray
In the above example, lines would be an array of strings:
Array(1,2,3, 2,3,4, 1,3,4, 2,4,5, 3,4,5)
Then, to extract the first column before grouping, you could use something like the slice method on each string (this assumes the first column is exactly one character wide):
lines.map(_.slice(0,1)).groupBy(identity).mapValues(_.size)
Also, remember to close the file :)
Full example:
val file = scala.io.Source.fromFile("D:/Files/test.txt")
val lines = file.getLines().toArray
val grouping = lines.map(_.slice(0,1)).groupBy(identity).mapValues(_.size)
file.close
If I understand your question correctly, shouldn't the duplicate counts of the 1st column be (1->2, 2->2, 3->1)?
Here's one approach to get the counts:
// Create a list of split-column arrays
val list = scala.io.Source.
fromFile("/Users/leo/projects/scala/files/testfile.txt").
getLines.
map(_.split(",")).
toList
list: List[Array[String]] = List(Array(1, 2, 3), Array(2, 3, 4), Array(1, 3, 4), Array(2, 4, 5), Array(3, 4, 5))
// Count duplicates of the 1st split-column
val d = list.
groupBy(_(0)).
mapValues(_.size)
d: scala.collection.immutable.Map[String,Int] = Map(2 -> 2, 1 -> 2, 3 -> 1)
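The split-based approach can be combined with the earlier reminder to close the file. The following is a sketch under a few assumptions: the names ColumnCounts, countColumn and countColumnInFile are invented for illustration, the separator is passed to String.split (so it is treated as a regular expression), and blank lines or rows too short for the requested column are skipped.

```scala
import scala.io.Source

object ColumnCounts {
  // Count how often each value occurs in the given zero-based column.
  def countColumn(lines: Iterable[String], column: Int, sep: String = ","): Map[String, Int] =
    lines
      .filter(_.trim.nonEmpty)        // ignore blank lines
      .map(_.split(sep))              // split each row into its columns
      .filter(_.length > column)      // skip rows that lack the column
      .groupBy(_(column))             // group rows by the column's value
      .map { case (key, rows) => key -> rows.size }

  // Read the file, count, and always close the underlying source.
  def countColumnInFile(path: String, column: Int): Map[String, Int] = {
    val source = Source.fromFile(path)
    try countColumn(source.getLines().toList, column)
    finally source.close()
  }
}
```

For the sample input, ColumnCounts.countColumn(lines, 0) yields the counts 1 -> 2, 2 -> 2, 3 -> 1, and unlike slice(0,1) it also copes with multi-character columns.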

scala map += operator with five pairs

I am having an issue with appending pairs to an existing Map. Once I reach the fifth pair of the Map, the Map reorders itself. The order is correct with 4 pairs, but as soon as the 5th is added it shifts itself. See example below (assuming I built the 4 pair Map one pair at a time.):
scala> var a = Map("a1" -> 1, "a2" -> 1, "a3" -> 1, "a4" -> 1)
a: scala.collection.immutable.Map[String,Int] = Map(a1 -> 1, a2 -> 1, a3 -> 1, a4 -> 1)
scala> a += ("a5" -> 1)
scala> a
res26: scala.collection.immutable.Map[String,Int] = Map(a5 -> 1, a4 -> 1, a3 -> 1, a1 -> 1, a2 -> 1)
The added fifth element jumped to the front of the Map and shifts the others around. Is there a way to keep the elements in order (1, 2, 3, 4, 5) ?
Thanks
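Scala's immutable.Map uses small specialised classes for up to four entries, which happen to keep insertion order; from the fifth entry on it switches to a HashMap.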
From http://docs.oracle.com/javase/6/docs/api/java/util/HashMap.html:
This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time
So a hash map does not keep "a1" -> 1 in the position where you inserted it; it places each entry according to the hash of its key. The iteration order therefore follows the hashes of the keys rather than the order in which you added them.
As was recommended in the comments, use LinkedHashMap or ListMap:
Scala Map implementation keeping entries in insertion order?
PS: You might be interested in reading this article: http://howtodoinjava.com/2012/10/09/how-hashmap-works-in-java/
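As a rough illustration of the ListMap and LinkedHashMap suggestion, the sketch below rebuilds the five-pair example with both. It assumes Scala 2.12 or later for the ListMap iteration order; the InsertionOrderDemo object and its member names are made up for this example.

```scala
import scala.collection.immutable.ListMap
import scala.collection.mutable

object InsertionOrderDemo {
  // Immutable option: ListMap iterates in insertion order (Scala 2.12+ assumed).
  val base: ListMap[String, Int] = ListMap("a1" -> 1, "a2" -> 1, "a3" -> 1, "a4" -> 1)
  val extended: ListMap[String, Int] = base + ("a5" -> 1)
  val immutableKeys: List[String] = extended.keys.toList

  // Mutable option: LinkedHashMap also remembers insertion order.
  def mutableKeys: List[String] = {
    val m = mutable.LinkedHashMap("a1" -> 1, "a2" -> 1, "a3" -> 1, "a4" -> 1)
    m += ("a5" -> 1)
    m.keys.toList
  }
}
```

ListMap lookups are linear in the number of entries, so it suits small maps; for larger ones the mutable LinkedHashMap is usually the more practical choice.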

When applying filter to a scala Map, how can I also check what entries were removed?

I am learning scala, and at one point I want to remove entries from a map based on the value (not the key). I also want to know how many entries were removed - my program expects that exactly one entry should be removed.
Removing entries by their values can be done by applying filterNot, ok -- but how can I verify that exactly one entry was removed?
So far the only way I saw to achieve that is to run the predicate twice -- once for the "count" method (to count how often the predicate matches), and then with filterNot, to actually remove the entries.
What is the Scala way of achieving that in one go?
The only other solution I found is to first use filter(...) to get the values to be removed ,and then use "-" to throw out the elements by their keys - but again, this requires two runs.
As long as you don't mind creating a collection out of the information that is removed you can use partition:
scala> Map(1 -> 1, 2 -> 2, 3 -> 3)
res0: scala.collection.immutable.Map[Int,Int] = Map(1 -> 1, 2 -> 2, 3 -> 3)
scala> res0.partition { case (k, v) => v % 2 == 0 }
res3: (Map(2 -> 2),Map(1 -> 1, 3 -> 3))
If you just want to know if only one entry was removed then you can use size to get the size before and after the filter operation. The difference should be one.
scala> Map(1 -> 1, 2 -> 2, 3 -> 3)
res0: scala.collection.immutable.Map[Int,Int] = Map(1 -> 1, 2 -> 2, 3 -> 3)
scala> res0.filterNot { case (k, v) => v % 2 == 0 }
res1: scala.collection.immutable.Map[Int,Int] = Map(1 -> 1, 3 -> 3)
scala> res0.size - res1.size
res2: Int = 1
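To tie the partition idea back to the "exactly one entry" requirement, here is a minimal sketch that evaluates the predicate once per entry and then checks the size of the removed part. The names RemoveExactlyOne and removeExactlyOne, and the choice of Either for reporting the failure, are assumptions made for this example.

```scala
object RemoveExactlyOne {
  // Hypothetical helper: remove the entries whose value satisfies `p`,
  // succeeding only if exactly one entry was removed.
  def removeExactlyOne[K, V](m: Map[K, V])(p: V => Boolean): Either[String, Map[K, V]] = {
    // A single pass splits the map into removed and kept entries.
    val (removed, kept) = m.partition { case (_, v) => p(v) }
    if (removed.size == 1) Right(kept)
    else Left(s"expected exactly one removal, got ${removed.size}")
  }
}
```

For Map(1 -> 1, 2 -> 2, 3 -> 3) and the predicate `_ % 2 == 0` this returns Right(Map(1 -> 1, 3 -> 3)); a predicate that matches zero or several values gives a Left.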
Consider groupBy on the mapped values, as follows; let
val a = ((1 to 3) zip (11 to 33)).toMap
a: Map(1 -> 11, 2 -> 12, 3 -> 13)
then
a.groupBy(_._2 % 2 == 0)
res: Map(false -> Map(1 -> 11, 3 -> 13), true -> Map(2 -> 12))

Make a tuple of three integers in Scala

I have a problem where I need to make a tuple of three elements. Suppose I have a list, and I managed to build pairs (tuples of two elements):
val list = (1 to 10).toList
val map1 = list.foldLeft(Map.empty[Int,String])( (map, value) => map + (value -> value.toString) )
Map(5 -> 5, 10 -> 10, 1 -> 1, 6 -> 6, 9 -> 9, 2 -> 2, 7 -> 7, 3 -> 3, 8 -> 8, 4 -> 4)
I want to make a tuple of three elements. How can I do that?
I tried this code:
val map1 = list.foldLeft(Map.empty[Int,String])( (map, value, s) => map + (value -> value.toString -> value.toString) )
Map(5 -> 5 -> 5, 10 -> 10-> 10, 1 -> 1-> 1, 6 -> 6-> 6, 9 -> 9-> 9, 2 -> 2-> 2, 7 -> 7-> 7, 3 -> 3-> 3, 8 -> 8-> 8, 4 -> 4-> 4)
-> is just syntactic sugar for a pair (a tuple of two items). The universal notation for tuples of any arity is a comma-delimited list in parentheses. E.g. (1,2,3) is a tuple of three integers, whereas the expression 1 -> 2 -> 3 from your example desugars to ((1,2),3), which is a pair made of a pair of two ints and an int.
What you're trying to achieve with your code simply doesn't make any sense. A Map can be constructed from a list of pairs, treating the first element of the tuple as a key and the second as a value. Tuples of any other arities are not supported and wouldn't make sense in that case. You can however construct collections of other types (e.g., a List) containing tuples of any arities.
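A short sketch may make the distinction concrete. It shows the nested pair that chained -> produces, a list of genuine three-element tuples, and, if a Map is still wanted, a Map whose value is itself a pair. The TupleDemo object and its member names are invented for illustration.

```scala
object TupleDemo {
  // Chained -> nests pairs: (1 -> 2) -> 3
  val nested: ((Int, Int), Int) = 1 -> 2 -> 3

  // A real three-element tuple uses parentheses and commas.
  val triple: (Int, Int, Int) = (1, 2, 3)

  val list: List[Int] = (1 to 10).toList

  // A List can hold tuples of any arity.
  val triples: List[(Int, String, String)] =
    list.map(v => (v, v.toString, v.toString))

  // A Map still needs pairs, but the value part may be a tuple.
  val byKey: Map[Int, (String, String)] =
    list.map(v => (v, (v.toString, v.toString))).toMap
}
```

Here TupleDemo.byKey(3) is the pair ("3", "3"), while TupleDemo.triples keeps all three components side by side.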
In general to convert a range into a Tuple3 you could do something like this:
(0 to 10) map (x=>(x,x*2,x+10))
res0: scala.collection.immutable.IndexedSeq[(Int, Int, Int)] = Vector((0,0,10), (1,2,11), (2,4,12), (3,6,13), (4,8,14), (5,10,15), (6,12,16), (7,14,17), (8,16,18), (9,18,19), (10,20,20))
To join 2 Seqs as a Tuple2 you zip them:
(1 to 5) zip (10 to 15)
res3: scala.collection.immutable.IndexedSeq[(Int, Int)] = Vector((1,10), (2,11), (3,12), (4,13), (5,14))
Scala has built-in support for zipping up to arity 3:
((0 to 3),(4 to 6),(7 to 9)).zipped.toList
res6: List[(Int, Int, Int)] = List((0,4,7), (1,5,8), (2,6,9))
If you need to do something similar to higher arities there's product-collections:
(0 to 3) flatZip (4 to 6) flatZip (7 to 9) flatZip (10 to 12)
res7: org.catch22.collections.immutable.CollSeq4[Int,Int,Int,Int] =
CollSeq((0,4,7,10),
(1,5,8,11),
(2,6,9,12))
And finally there's shapeless which does lots of cool things but has a moderate learning curve.