Question 1: Can I use a tuple as a key of a Map in Scala?
Question 2: If yes, how can I create a map with a tuple as the key?
Question 3: I want to convert my Scala map to an RDD. How would I do that in the following case? I am trying it this way:
var mapRDD = sc.parallelize(map.toList)
Is this the right way to do it?
Question 4: In the code snippet below, println on map prints no values.
I have not included the whole code. Basically, mapAgainstValue contains the userId as key and the list of friends as value. I want to create a new map RDD by applying the following transformation to the key.
Why is the map empty?
var mapAgainstValue = logData.map(x=>x.split("\t")).filter(x => x.length == 2).map(x => (x(0),x(1).split(",")))
var map:Map[String,List[String]] = Map()
var changedMap = mapAgainstValue.map{
line =>
var key ="";
for(userIds <- line._2){
if(line._1.toInt < userIds.toInt){
key =line._1.concat("-"+userIds);
}
else {
key = userIds.concat("-" + line._1);
}
map += (key -> line._2.toList)
}
}
changedMap.collect()
map.foreach(println)
Yes, you can use a tuple as a key in a Map.
For example:
val userMap = Map(
(1, 25) -> "shankar",
(2, 35) -> "ramesh")
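To convert the map to an RDD, parallelize its entries. You can then print the elements using foreach (sparkContext here is your SparkContext, called sc in the question):
val userMapRDD = sparkContext.parallelize(userMap.toSeq, 2)
userMapRDD.foreach(element => {
println(element)
})
If you want to transform userMapRDD into something else, use map. The following code returns only the age and name as a tuple: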
val mappedRDD = userMapRDD.map {
case ((empId: Int, age: Int), name: String) => {
(age, name)
}
}
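On Question 4: the closure passed to mapAgainstValue.map typically runs against a serialized copy of map on the executors, so the driver-side map that gets printed is never updated. The usual fix is to return the pairs from the transformation instead of mutating shared state. Below is a minimal sketch of that idea using plain Scala collections so it runs without Spark. friendsByUser is hypothetical sample data standing in for mapAgainstValue, and the assumption (not tested here) is that the same flatMap body carries over to an RDD. It also uses a tuple key, as in Questions 1 and 2.

```scala
// Hypothetical sample data: userId -> friends, standing in for mapAgainstValue.
val friendsByUser: List[(String, Array[String])] = List(
  "1" -> Array("2", "3"),
  "2" -> Array("1")
)

// Emit one (key, friends) pair per friend; the smaller id goes first in the tuple key.
val pairs: List[((Int, Int), List[String])] =
  friendsByUser.flatMap { case (user, friends) =>
    friends.toList.map { friend =>
      val (a, b) = (user.toInt, friend.toInt)
      val key = if (a < b) (a, b) else (b, a)
      key -> friends.toList
    }
  }

// Tuple keys behave like any other key; a later pair with the same key overwrites an earlier one.
val byPair: Map[(Int, Int), List[String]] = pairs.toMap
```

Because nothing is mutated from inside the closure, the result is whatever the transformation returns, which is the shape an RDD pipeline expects.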
I have a situation here.
I have two strings:
val keyMap = "androidApp,key1;iosApp,key2;xyz,key3"
val tentMap = "androidApp,tenant1; iosApp,tenant1; xyz,tenant2"
What I want is to create a nested immutable map like this:
tenant1 -> (androidApp -> key1, iosApp -> key2),
tenant2 -> (xyz -> key3)
So basically I want to group by tenant and create a map of the keyMap entries.
Here is what I tried, but it uses a mutable map, which I do not want. Is there a way to create this using an immutable map?
case class TenantSetting() {
val requesterKeyMapping = new mutable.HashMap[String, String]()
}
val requesterKeyMapping = keyMap.split(";")
.map { keyValueList => keyValueList.split(',')
.filter(_.size==2)
.map(keyValuePair => (keyValuePair[0],keyValuePair[1]))
.toMap
}.flatten.toMap
val config = new mutable.HashMap[String, TenantSetting]
tentMap.split(";")
.map { keyValueList => keyValueList.split(',')
.filter(_.size==2)
.map { keyValuePair =>
val requester = keyValuePair[0]
val tenant = keyValuePair[1]
if (!config.contains(tenant)) config.put(tenant, new TenantSetting)
config.get(tenant).get.requesterKeyMapping.put(requester, requesterKeyMapping.get(requester).get)
}
}
The logic to break each string into a map can be the same for both, as they share the same syntax.
What you had for the first string was not quite right: the filter was applied to each string produced by the inner split, not to the resulting array itself. This also showed in your use of [] on keyValuePair, which was of type String and not Array[String] as I think you were expecting. You also needed a trim to cope with the spaces in the second string. You might want to trim the key and value as well to avoid other whitespace issues.
Additionally, in this case the combination of map and filter can be written more succinctly with collect, as shown in "How to convert an Array to a Tuple?".
The pattern with 2 elements ensures that anything with a length other than 2 is filtered out, as you wanted.
The iterator makes the combination of map and collect more efficient, since the collection returned by the first split is traversed only once.
With both strings turned into maps, it just needs the right use of groupBy: group the first map by the value that the second map holds for the same key. Obviously this only works if every key of the first map is also present in the second map.
def toMap(str: String): Map[String, String] =
str
.split(";")
.iterator
.map(_.trim.split(','))
.collect { case Array(key, value) => (key.trim, value.trim) }
.toMap
val keyMap = toMap("androidApp,key1;iosApp,key2;xyz,key3")
val tentMap = toMap("androidApp,tenant1; iosApp,tenant1; xyz,tenant2")
val finalMap = keyMap.groupBy { case (k, _) => tentMap(k) }
Printing out finalMap gives:
Map(tenant2 -> Map(xyz -> key3), tenant1 -> Map(androidApp -> key1, iosApp -> key2))
Which is what you wanted.
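If some keys might be missing from the second map, tentMap(k) would throw a NoSuchElementException. A possible variation, sketched under the assumption that such entries should simply be skipped, looks the tenant up with get before grouping. The names keysByApp and tenantsByApp, and the extra "orphan" entry, are made up for this illustration.

```scala
// Hypothetical inputs: "orphan" has a key but no tenant.
val keysByApp = Map("androidApp" -> "key1", "iosApp" -> "key2", "xyz" -> "key3", "orphan" -> "key4")
val tenantsByApp = Map("androidApp" -> "tenant1", "iosApp" -> "tenant1", "xyz" -> "tenant2")

val byTenant: Map[String, Map[String, String]] =
  keysByApp.toList
    // Keep only apps that have a tenant; get returns an Option instead of throwing.
    .flatMap { case (app, key) => tenantsByApp.get(app).map(tenant => (tenant, app -> key)) }
    // Group the (tenant, app -> key) pairs by tenant.
    .groupBy { case (tenant, _) => tenant }
    // Drop the tenant from each grouped pair and rebuild the inner map.
    .map { case (tenant, entries) => tenant -> entries.map(_._2).toMap }
```

Everything here stays immutable; the apps without a tenant are dropped silently, which may or may not be what you want.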
I have some records in a List.
Now I want to create a new Map (a mutable Map) from that List, with a unique key for each record. I want to achieve this by reading the List and calling the higher-order method map in Scala.
records.txt is my input file
100,Surender,2015-01-27
100,Surender,2015-01-30
101,Raja,2015-02-19
Expected Output :
Map(0-> 100,Surender,2015-01-27, 1 -> 100,Surender,2015-01-30,2 ->101,Raja,2015-02-19)
Scala Code :
object SampleObject{
def main(args:Array[String]) ={
val mutableMap = scala.collection.mutable.Map[Int,String]()
var i:Int =0
val myList=Source.fromFile("D:\\Scala_inputfiles\\records.txt").getLines().toList;
println(myList)
val resultList= myList.map { x =>
{
mutableMap(i) =x.toString()
i=i+1
}
}
println(mutableMap)
}
}
But I am getting output like below
Map(1 -> 101,Raja,2015-02-19)
I want to understand why it keeps only the last record.
Could someone help me?
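import scala.io.Source
// Scala 2.12 and earlier only: collection.breakOut was removed in Scala 2.13.
// getLines returns an Iterator, whose map takes no builder, so materialize it with toList first.
val mm: Map[Int, String] = Source.fromFile(filename).getLines.toList
.zipWithIndex
.map({ case (line, i) => i -> line })(collection.breakOut)
Here the (collection.breakOut) builds the Map directly, avoiding the extra pass that toMap would need.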
Consider
(for {
(line, i) <- Source.fromFile(filename).getLines.zipWithIndex
} yield i -> line).toMap
where we read each line, associate an index value starting from zero and create a map out of each association.
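The same idea can be tried without a file. This is a small sketch, assuming the three records from the question are already in a List (the name lines is made up here); zipWithIndex pairs each line with its position and toMap builds an immutable Map, so no counter or mutable map is needed.

```scala
// The three sample records from the question, held in memory instead of read from records.txt.
val lines = List(
  "100,Surender,2015-01-27",
  "100,Surender,2015-01-30",
  "101,Raja,2015-02-19"
)

// Pair each line with its index (starting at 0), flip to index -> line, and build the Map.
val indexed: Map[Int, String] =
  lines.zipWithIndex.map { case (line, i) => i -> line }.toMap
```

If a mutable Map is really required, the pairs could be added to one afterwards, but the immutable result is usually enough.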
I need to form this:
{
"item" : "hardcoded_value",
"item2" : "hardcoded_value",
"item3" : "hardcoded_value",
}
In the exec block I am trying:
// list with items ["item1", "item2", "item3"]
val theList = session("itemNames").as[List[String]]
val theMap = Map.empty[String,String] // empty map
// add items from list in map
theList.foreach{ key =>
theMap += key -> "hardcoded_value"
}
But I get an error at the += position.
Also tried:
theList.foreach(key => theMap += key -> "hardcoded_value" )
How do I insert a key and value into a map by iterating over a list? I am new to Gatling and Scala.
After looking at your question in more detail, I realize that you're not just asking about turning the Collection into a Map. Combining the answers from How can I use map and receive an index as well in Scala? and Scala best way of turning a Collection into a Map-by-key?, you can do the following:
List("first", "second", "third").zipWithIndex.map {
case (item, index) => ("item" + index) -> item.toString
}
>res0: List[(String, String)] = List((item0,first), (item1,second), (item2,third))
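As for the error at +=: theMap is a val holding an immutable Map. An immutable Map has no += method, and the compiler can only rewrite theMap += x to theMap = theMap + x when theMap is a var. Here is a minimal sketch of two ways around that, using a plain list in place of the session attribute; itemNames and accumulated are names chosen for this illustration.

```scala
// Stand-in for session("itemNames").as[List[String]].
val itemNames = List("item1", "item2", "item3")

// Option 1: build the immutable Map in one expression, with no mutation at all.
val theMap: Map[String, String] =
  itemNames.map(key => key -> "hardcoded_value").toMap

// Option 2: keep the foreach, but declare a var so that += can reassign it.
var accumulated = Map.empty[String, String]
itemNames.foreach(key => accumulated += (key -> "hardcoded_value"))
```

Both end up with the same three entries; the first form is usually preferred because nothing is reassigned.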
I have an application which has to gather some external data first, then turn them into objects. Afterwards, it will do some analysis on the data.
I managed to gather the data and put it into a map. The map contains a unique key for each of the future objects, and a ListBuffer of the data needed to build the object.
Now I want to create a list of objects from this map, and don't know how to get my data out of the map. I haven't worked with maps before (yes, I am that new to the language), but found a question which says that, when I want to access an element of the map with head, I get a tuple of the key and the value. I hoped that I get the same when I iterate over the map with map (the method), but this doesn't appear to work. And I looked in Programming with Scala, but couldn't find a place saying what I get when I iterate over a map.
Here is an MWE for what I want to do:
//This code will gather number names from different languages and then create objects of type Number containing each name.
import scala.collection.mutable
import scala.collection.mutable.ListBuffer
class Number (val theNumber: Int, val names: List[String]) {
override def toString = theNumber + " is known as " + names.mkString(", ") + "."
}
// Construct a map holding example data
val numbersAsMap = mutable.Map.empty[Int, ListBuffer[String]]
numbersAsMap += (1 -> new ListBuffer[String])
numbersAsMap += (2 -> new ListBuffer[String])
numbersAsMap += (3 -> new ListBuffer[String])
numbersAsMap(1) += "one"
numbersAsMap(1) += "eins"
numbersAsMap(1) += "uno"
numbersAsMap(2) += "two"
numbersAsMap(2) += "zwei"
numbersAsMap(2) += "due"
numbersAsMap(3) += "three"
numbersAsMap(3) += "drei"
numbersAsMap(3) += "tre"
// Create a list of numbers
numbersAsMap map ((key, value) => new Number(key, value.toList)).toList
// error: missing parameter type
// obviously I'm not getting tuples, let's try it another way
numbersAsMap.keys map (key => new Number(key, numbersAsMap(key).toList)).toList
// it throws the same error as above :(
The map method of Map is consistent with the map method of other collections, so the function you pass to it takes only one parameter. In the case of a Map, this is a tuple consisting of the key and the value.
So you can write:
numbersAsMap.map(kv => new Number(kv._1, kv._2.toList)).toList
If you want to name the tuple values:
numbersAsMap.map {
kv =>
val (key, value) = kv
new Number(key, value.toList)
}.toList
But there is another option to write it nicely in a single line: Use a partial function:
numbersAsMap.map { case (key, value) => new Number(key, value.toList) }.toList
A { case ... } defines a partial function; and this way you can extract the values of the tuple.
Here are two possible ways to do the map operation on your Map:
val result = numbersAsMap.map{
case (key, value) =>
new Number(key, value.toList)
}.toList
val result2 = numbersAsMap.map(kv => new Number(kv._1, kv._2.toList)).toList
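One thing to keep in mind is that a mutable.Map makes no promise about iteration order, so the resulting list may not come out sorted by key. This is a small self-contained sketch, assuming a sorted result is wanted; NamedNumber is a hypothetical stand-in for the Number class from the question (renamed to avoid clashing with java.lang.Number).

```scala
import scala.collection.mutable
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for the question's Number class.
final case class NamedNumber(value: Int, names: List[String])

val numbersAsMap = mutable.Map(
  2 -> ListBuffer("two", "zwei", "due"),
  1 -> ListBuffer("one", "eins", "uno")
)

val numbers: List[NamedNumber] =
  numbersAsMap.toList
    // Sort the (key, value) pairs by key before building the objects.
    .sortBy { case (key, _) => key }
    // Destructure each pair with a case pattern, as in the answers above.
    .map { case (key, names) => NamedNumber(key, names.toList) }
```

Sorting the pairs first keeps the order deterministic regardless of how the map stores its entries.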
I have following code:
val rows: Iterator[Map[String,String]] = CSVDictReader(file.getInputStream)
val parsedProducts = rows.map(x => Product(name = x.get("NAME"), id = x.get("ID")))
And I would like to replace map entries whose value is an empty string with None. With a map alone I could use:
filter(_._2.trim.nonEmpty)
I cannot figure out how to do this in a nice way without introducing a helper function that returns None when the value is an empty string.
Edit: In my example I have only name and id, but in the real code there are easily over ten columns of data. Also, I need None instead of an empty string value, so name=Option("") should be replaced with name=None.
You can filter Options as well.
Let's say your x.get("NAME") returns a Some("") or even Some(" ").
Then you may do something like this: x.get("NAME").filter(_.trim.nonEmpty)
Hope I understood your question correctly
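With ten or more columns, repeating the filter on every field gets noisy, so a tiny wrapper may still be worth it. This is a sketch only: ProductRow is a hypothetical stand-in for the Product class from the question, and nonBlank is a made-up name. Note that, unlike the filter above, it also trims the value that is kept.

```scala
// Hypothetical stand-in for the question's Product, with Option fields.
final case class ProductRow(name: Option[String], id: Option[String])

// Look the column up, trim it, and turn blank values into None.
def nonBlank(row: Map[String, String], column: String): Option[String] =
  row.get(column).map(_.trim).filter(_.nonEmpty)

val row = Map("NAME" -> "   ", "ID" -> " 42 ")
val product = ProductRow(name = nonBlank(row, "NAME"), id = nonBlank(row, "ID"))
```

A missing column and a blank column both end up as None, which matches the requirement in the edit to the question.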
something like this?
val rows: Iterator[Map[String,String]] = CSVDictReader(file.getInputStream)
val parsedProducts = for {
row <- rows
name <- row.get("NAME")
id <- row.get("ID")
} yield Product(name, id)
Here, if row.get("NAME") or row.get("ID") returns None, no Product is yielded for that row.
I'm not sure if this is what you're looking for, but the following code snippet:
val rows: Iterator[Map[String,String]] = Iterator(Map("NAME" -> " ", "ID" -> "foo"), Map("NAME" -> " ", "ID" -> ""))
val fieldNames = List("NAME","ID","ANOTHER COLUMN")
val cleanedRows = rows map { row =>
fieldNames map { fieldName =>
Map ( fieldName -> row.get(fieldName).filter (_.trim.nonEmpty) )
}
}
while(cleanedRows.hasNext) {
println(cleanedRows.next)
}
Would print out:
List(Map(NAME -> None), Map(ID -> Some(foo)), Map(ANOTHER COLUMN -> None))
List(Map(NAME -> None), Map(ID -> None), Map(ANOTHER COLUMN -> None))
So at this point cleanedRows would have the entries you need to create your Product instances.
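If a list of single-entry maps is awkward to consume, a variation is to build one Map[String, Option[String]] per row. A minimal sketch, assuming the same fieldNames list as above; cleanRow is a name invented for this example.

```scala
val fieldNames = List("NAME", "ID", "ANOTHER COLUMN")

// One map per row: every expected field is present, blank or missing values become None.
def cleanRow(row: Map[String, String]): Map[String, Option[String]] =
  fieldNames.map { fieldName =>
    fieldName -> row.get(fieldName).filter(_.trim.nonEmpty)
  }.toMap

val cleaned = cleanRow(Map("NAME" -> " ", "ID" -> "foo"))
```

Each field can then be read with cleaned("NAME"), cleaned("ID") and so on when constructing the Product.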