The Scala program iterates through a list of words, appending the word to the existing value if the key is already present, or adding a new key -> word entry otherwise. It is expected to produce a Map but produces a List[Map] instead.
val hashmap: Map[List[(Char, Int)], List[String]] = Map()
for (word <- dictionary) yield {
  val word_occ = wordOccurrences(word)
  hashmap + (if (hashmap.contains(word_occ)) (word_occ -> (hashmap(word_occ) ++ List(word))) else (word_occ -> List(word)))
}
Note that in this case you probably want to build the Map in a single pass rather than modifying a mutable Map:
val hashmap: Map[List[(Char, Int)], List[String]] =
  dictionary
    .map(x => (wordOccurrences(x), x))
    .groupBy(_._1)
    .map { case (k, v) => k -> v.map(_._2) }
In Scala 2.13 you can replace the last two lines with
.groupMap(_._1)(_._2)
You can also use a view on the dictionary to avoid creating the intermediate list if performance is a significant issue.
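For example, a rough sketch of the view-based variant (assuming Scala 2.13 and that dictionary is a List[String]; the groupBy still materialises the groups, the view just avoids building the intermediate list of pairs):
val hashmap: Map[List[(Char, Int)], List[String]] =
  dictionary.view
    .map(word => (wordOccurrences(word), word))
    .groupBy(_._1)
    .map { case (occ, pairs) => occ -> pairs.map(_._2).toList }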
A for comprehension with a single <- generator de-sugars to a map() call on the original collection. And, as you'll recall, map() can change the elements of a collection, but it won't change the collection type itself.
So if dictionary is a List then what you end up with will be a List. The yield specifies what is to be the next element in the resulting List.
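For instance (a small sketch using the names from the question):
for (word <- dictionary) yield wordOccurrences(word)
// desugars to
dictionary.map(word => wordOccurrences(word))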
In your case the code is creating a new single-element Map for each element in the dictionary. Probably not what you want. I'd suggest you try using foldLeft().
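For example, a minimal foldLeft sketch, reusing the dictionary and wordOccurrences names from the question:
val hashmap: Map[List[(Char, Int)], List[String]] =
  dictionary.foldLeft(Map.empty[List[(Char, Int)], List[String]]) { (acc, word) =>
    val occ = wordOccurrences(word)
    // append the word to the list already stored for this occurrence key, or start a new list
    acc + (occ -> (acc.getOrElse(occ, Nil) :+ word))
  }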
I have a situation here. I have two strings:
val keyMap = "anrodiApp,key1;iosApp,key2;xyz,key3"
val tentMap = "androidApp,tenant1; iosApp,tenant1; xyz,tenant2"
What I want is to create a nested immutable map like this:
tenant1 -> (androidApp -> key1, iosApp -> key2),
tenant2 -> (xyz -> key3)
So basically I want to group by tenant and, for each tenant, build a map from the entries in keyMap.
Here is what I tried, but it uses a mutable map, which I don't want. Is there a way to create this using an immutable map?
case class TenantSetting() {
val requesterKeyMapping = new mutable.HashMap[String, String]()
}
val requesterKeyMapping = keyMap.split(";")
.map { keyValueList => keyValueList.split(',')
.filter(_.size==2)
.map(keyValuePair => (keyValuePair[0],keyValuePair[1]))
.toMap
}.flatten.toMap
val config = new mutable.HashMap[String, TenantSetting]
tentMap.split(";")
.map { keyValueList => keyValueList.split(',')
.filter(_.size==2)
.map { keyValuePair =>
val requester = keyValuePair[0]
val tenant = keyValuePair[1]
if (!config.contains(tenant)) config.put(tenant, new TenantSetting)
config.get(tenant).get.requesterKeyMapping.put(requester, requesterKeyMapping.get(requester).get)
}
}
The logic to break the strings into a map can be the same for both as it's the same syntax.
What you had for the first string was not quite right: the filter was being applied to each string from the split result rather than to the array result itself. This also showed in your use of [] on keyValuePair, which was of type String and not Array[String] as I think you were expecting (and in Scala, indexing uses (), not []). You also needed a trim in there to cope with the spaces in the second string, and you might want to trim the key and value as well to avoid other whitespace issues.
Additionally in this case the combination of map and filter can be more succinctly done with collect as shown here:
How to convert an Array to a Tuple?
The use of the pattern with 2 elements ensures you filter out anything with length other than 2 as you wanted.
The iterator is to make the combination of map and collect more efficient by only requiring one iteration of the collection returned from the first split (see comments below).
With both strings turned into a map, it just needs the right use of groupBy to group the first map by the value in the second map for the same key, to get what you wanted. Obviously this only works if every key in the first map is also present in the second map.
def toMap(str: String): Map[String, String] =
  str
    .split(";")
    .iterator
    .map(_.trim.split(','))
    .collect { case Array(key, value) => (key.trim, value.trim) }
    .toMap
val keyMap = toMap("androidApp,key1;iosApp,key2;xyz,key3")
val tentMap = toMap("androidApp,tenant1; iosApp,tenant1; xyz,tenant2")
val finalMap = keyMap.groupBy { case (k, _) => tentMap(k) }
Printing out finalMap gives:
Map(tenant2 -> Map(xyz -> key3), tenant1 -> Map(androidApp -> key1, iosApp -> key2))
Which is what you wanted.
I have a case class with a parameter a which is a list of Int tuples. I want to iterate over a and define operations on it.
I have tried the following:
case class XType (a: List[(Int, Int)]) {
  for (x <- a) {
    assert(x._2 >= 0)
  }

  def op(): XType = {
    for (x <- XType(a))
      yield (x._1, x._2)
  }
}
However, I am getting the error:
"Value map is not a member of XType."
How can I access the integers of tuples and define operations on them?
You're running into an issue with for comprehensions, which are really another way of expressing things like foreach and map (and flatMap and withFilter/filter). See here and here for more explanation.
Your first for comprehension (the one with asserts) is equivalent to
a.foreach(x => assert(x._2 >= 0))
a is a List, x is an (Int, Int), everything's good.
However, the second one (in op) translates to
XType(a).map(x => x)
which doesn't make sense--XType doesn't know what to do with map, like the error said.
An instance of XType refers to its a as simply a (or this.a), so a.map(x => x) would be just fine in op (and then turn the result into a new XType).
As a general rule, for comprehensions are handy for nested maps (or flatMaps or whatever), rather than as a 1-1 equivalent for for loops in other languages--just use map instead.
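For example, a minimal sketch of the corrected op (the identity mapping here just mirrors the original yield (x._1, x._2)):
def op(): XType = XType(a.map { case (x, y) => (x, y) })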
You can access the tuple list like this:
def op(): XType = {
  XType(a.map(...))
}
I have the following map in Scala:
mutable.Map[String, mutable.Map[String, App]]()
Assume App contains a field called token, which is not the key of the inner map.
What is the best practice to extract a map of token -> App from this nested map?
I did
val result = mutable.Map[String, AppKey]()
myMap foreach(x=>x._2 foreach(y=>result.put(y._2.token, y._2)))
You could use a combination of flatMap and map:
val result = myMap.flatMap { case (_, mp) => mp.map { case (_, app) => app.token -> app }}
If I'm understanding properly, the keys of the resulting map should be the tokens and the values should be the AppKeys, and you throw away the keys of the original maps.
As noted in the other answer, map (or actually flatMap) would be the more usual choice here:
val result = myMap.flatMap{ case (_, v) => v.map{ case (_, ak) => ak.token -> ak }}
.toMap
The inner map extracts the key-value pairs from one of the inner maps and the flatMap effectively concatenates these lists of pairs.
This solution is the same as the others, but uses a for comprehension, which is syntactic sugar for a chain of nested map and flatMap (and withFilter) calls:
for {
  (_, innerMap) <- myMap
  (_, app) <- innerMap
} yield (app.token, app)
You can read more about it in the Scala Tour.
The result is of type Map[String, App], assuming app.token: String. You had it as AppKey, which is probably a typo (or otherwise you had a different intent).
As noted by @leo-c, you should take into account that when two pairs in a Map have the same key, the latter wins, so if those tokens are not unique you'll just lose some information. If that's the case, you may want the result to have type Map[String, List[AppKey]] to group all apps with the same token. You can get it with this code:
val apps = myMap.values.flatMap(_.values)
val result = apps.groupBy(_.token)
See documentation for the groupBy method.
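If you want List values rather than the Iterable values that groupBy produces, a small follow-up conversion works (a sketch assuming Scala 2.13):
val asLists: Map[String, List[App]] = result.view.mapValues(_.toList).toMap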
Item is a custom type.
I have an Iterable of pairs (Item, Item). The first element in every pair is the same, so I want to reduce the list to a single pair of type (Item, Array[Item]).
// list: Iterable[(Item, Item)]
// First attempt
val res = list.foldLeft((null, Array[Item]()))((p1, p2) => {
  (p2._1, p1._2 :+ p2._2)
})
// Second attempt
val r = list.unzip
val res = (r._1.head, r._2.toArray)
1. I don't know how to correctly set up the zero value in the first ("foldLeft") solution. Is there any way to do something like this?
2. Other than the second solution, is there a better way to reduce a list of custom object tuples to a single tuple?
If you are sure the first element in every pair is the same, why don't you use that information to simplify?
(list.head._1, list.map(_._2))
should do the job.
If there are cases where the first element is different, you may want to try:
list.groupBy(_._1).map { case (common, lst) => (common, lst.map(_._2)) }
I have a Spark RDD of type (Array[breeze.linalg.DenseVector[Double]], breeze.linalg.DenseVector[Double]). I wish to flatten its key to transform it into an RDD of type (breeze.linalg.DenseVector[Double], breeze.linalg.DenseVector[Double]). I am currently doing:
val newRDD = oldRDD.flatMap(ob => anonymousOrdering(ob))
The signature of anonymousOrdering() is String => (Array[DenseVector[Double]], DenseVector[Double]).
This fails with a type mismatch error: required: TraversableOnce[?]. The Python code doing the same thing is:
newRDD = oldRDD.flatMap(lambda point: [(tile, point) for tile in anonymousOrdering(point)])
How can I do the same thing in Scala? I generally use flatMapValues, but here I need to flatten the key.
If I understand your question correctly, you can do:
val newRDD = oldRDD.flatMap(ob => anonymousOrdering(ob))
// newRDD is RDD[(Array[DenseVector], DenseVector)]
In that case, you can "flatten" the Array portion of the tuple using pattern matching and a for/yield statement:
newRDD = newRDD.flatMap{case (a: Array[DenseVector[Double]], b: DenseVector[Double]) => for (v <- a) yield (v, b)}
// newRDD is RDD[(DenseVector, DenseVector)]
Although it's still not clear to me where/how you want to use groupByKey.
Change the code to use map instead of flatMap:
val newRDD = oldRDD.map(ob => anonymousOrdering(ob)).groupByKey()
You would only want to use flatMap here if anonymousOrdering returned a list of tuples and you wanted it flattened down.
As anonymousOrdering() is a function you have in your code, update it so that it returns a Seq[(breeze.linalg.DenseVector[Double], breeze.linalg.DenseVector[Double])]. It is like doing [(tile, point) for tile in anonymousOrdering(point)] but directly at the end of the function itself. The flatMap will then take care of producing one output element for each element of the returned sequences.
As a general rule, avoid having a collection as a key in an RDD.
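For illustration, a hypothetical sketch of the call site once anonymousOrdering has been changed to end with something like tiles.map(tile => (tile, point)), so that it returns a Seq[(DenseVector[Double], DenseVector[Double])]:
// assumes the updated anonymousOrdering returns Seq[(DenseVector[Double], DenseVector[Double])]
val newRDD = oldRDD.flatMap(point => anonymousOrdering(point))
// newRDD is RDD[(DenseVector[Double], DenseVector[Double])]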