Map not applied correctly - scala

In Scala, I have a method whose return type is:
HashMap[String, String]
And this variable:
var bestMatch = new HashMap[String, (String, Int)]
At the end of the method, I am trying to return this value:
bestMatch.map((x, (y, count)) => (x, y))
However, I am getting the error:
Cannot resolve reference map with such signature
Why am I applying it incorrectly?

It should be something like this:
bestMatch.map(tuple => ( tuple._1, tuple._2._1))
You can't unpack the nested (String, Int) tuple directly in the lambda's parameter list. map passes each entry to your function as a single tuple, so the lambda has to take that tuple as one parameter. Writing out the parameter type may make this clearer:
bestMatch.map((tuple: (String,(String,Int))) => ( tuple._1, tuple._2._1))
Also, in your case it might be better to use mapValues, since you're not doing anything with the key. Then you can use this:
bestMatch.mapValues(tuple => tuple._1)
Which is much more readable if you ask me. You could even go further and say:
bestMatch.mapValues(_._1)

You can write
bestMatch map {case (x, (y, count)) => (x, y)}
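To compare the two working forms side by side, here is a minimal sketch. The sample entries and the object name BestMatchDemo are made up for illustration, and an immutable HashMap is assumed (the question does not say which HashMap is imported).

```scala
import scala.collection.immutable.HashMap

object BestMatchDemo {
  // Hypothetical sample data; the question does not show the real contents.
  val bestMatch: HashMap[String, (String, Int)] =
    HashMap(("apple", ("appel", 3)), ("pear", ("peer", 1)))

  // One parameter (the whole entry), unpacked with tuple accessors.
  val viaAccessors: Map[String, String] =
    bestMatch.map(entry => (entry._1, entry._2._1))

  // A pattern-matching function literal: `case` destructures the nested tuple.
  val viaPattern: Map[String, String] =
    bestMatch.map { case (key, (value, _)) => (key, value) }

  def main(args: Array[String]): Unit = {
    println(viaAccessors)
    println(viaPattern)
  }
}
```

Both values hold the same key-to-string mapping; the pattern-matching form just names the parts instead of using _1 and _2.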

Related

How to extract values from Some() in Scala

I have an array of Some() values wrapping Map[String, String], such as
Array[Option[Any]] = Array(Some(Map(String, String)))
I want to return it as
Array(Map(String, String))
I've tried a few different ways of extracting it.
Let's say
val x = Array(Some(Map(String, String)))
val x1 = for (i <- 0 until x.length) yield { x.apply(i) }
but this returns IndexedSeq(Some(Map)), which is not what I want.
I tried pattern matching,
x.foreach { i =>
  i match {
    case Some(value) => value
    case _ => println("nothing")
  }
}
Another thing I tried, which was somewhat successful, was
x.apply(0).get.asInstanceOf[Map[String, String]]
This does roughly what I want, but it only gets the element at index 0 of the array, and I want all the maps in the array.
How can I extract Map type out of Some?
If you want an Array[Any] from your Array[Option[Any]], you can use this for expression:
for {
  opt <- x
  value <- opt
} yield value
This will put the values of all the non-empty Options inside a new array.
It is equivalent to this:
x.flatMap(_.toArray[Any])
Here, all options will be converted to an array of either 0 or 1 element. All these arrays will then be flattened back to one single array containing all the values.
Generally, the pattern is to use transformations on the Option[T], like map, flatMap, filter, etc.
The problem is that we need a type cast to retrieve the underlying Map[String, String] from Any. So we'll use flatten to drop any None values and unwrap the Options, and asInstanceOf to recover the type:
scala> val y = Array(Some(Map("1" -> "1")), Some(Map("2" -> "2")), None)
y: Array[Option[scala.collection.immutable.Map[String,String]]] = Array(Some(Map(1 -> 1)), Some(Map(2 -> 2)), None)
scala> y.flatten.map(_.asInstanceOf[Map[String, String]])
res7: Array[Map[String,String]] = Array(Map(1 -> 1), Map(2 -> 2))
Also, if you are dealing with just a single value, you can try Some("test").head, and for null simply Some(null).flatten
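As an alternative to asInstanceOf, a pattern match inside collect can both drop the None entries and check the element types. The following is only a sketch: the sample array and the object name ExtractMaps are assumptions made for illustration.

```scala
object ExtractMaps {
  // Hypothetical input: two entries are unusable (a None and a non-map value).
  val x: Array[Option[Any]] =
    Array(Some(Map("1" -> "1")), None, Some("not a map"), Some(Map("2" -> "2")))

  // Keep only the Some values that hold a Map, and within each map keep only
  // the entries whose key and value are both Strings.
  val maps: Array[Map[String, String]] = x.collect {
    case Some(m: Map[_, _]) =>
      m.collect { case (k: String, v: String) => k -> v }
  }

  def main(args: Array[String]): Unit =
    maps.foreach(println)
}
```

Because the inner collect checks each key and value, no unchecked cast is needed, at the cost of silently dropping entries that are not String pairs.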

Access tuple inside a tuple for anonymous map job in Spark

This post is essentially about how to build joint and marginal histograms from a (String, String) RDD. I posted the code that I eventually used below as the answer.
I have an RDD that contains tuples of type (String, String). Since they aren't unique, I want to see how many times each (String, String) combination occurs, so I use countByValue like so:
val PairCount = Pairs.countByValue().toSeq
which gives me tuples of the form ((String, String), Long), where the Long is the number of times that the (String, String) tuple appeared.
These Strings can be repeated in different combinations, and I essentially want to run a word count on this PairCount variable, so I tried something like this to start:
PairCount.map(x => (x._1._1, x._2))
But the output this produces is String1->1, String2->1, String3->1, etc.
How do I output a key-value pair from a map job in this case, where the key is one of the String values from the inner tuple and the value is the Long value from the outer tuple?
Update:
@vitalii gets me almost there. The answer gets me to a Seq[(String, Long)], but what I really need is to turn that into a map so that I can run reduceByKey on it afterwards. When I run
PairCount.flatMap { case ((x, y), n) => Seq(x -> n) }.toMap
for each unique x I get x->1.
For example, the above line of code generates mom->1, dad->1 even if the tuples coming out of the flatMap included (mom,30), (dad,59), (mom,2), (dad,14), in which case I would expect toMap to provide mom->30, dad->59, mom->2, dad->14. However, I'm new to Scala, so I might be misinterpreting the functionality.
How can I get the Tuple2 sequence converted to a map so that I can reduce on the map keys?
If I understand the question correctly, you need flatMap:
val pairCountRDD = pairs.countByValue() // RDD[((String, String), Int)]
val res: RDD[(String, Int)] = pairCountRDD.flatMap { case ((s1, s2), n) =>
  Seq(s1 -> n, s2 -> n)
}
Update: I didn't quite understand what your final goal is, but here are a few more examples that may help you. By the way, the code above is incorrect: I missed the fact that countByValue returns a Map, not an RDD:
val pairs = sc.parallelize(
  List(
    "mom" -> "dad", "dad" -> "granny", "foo" -> "bar", "foo" -> "baz", "foo" -> "foo"
  )
)
// don't use countByValue: if pairs is large you will run out of memory
val pairCountRDD = pairs.map(x => (x, 1)).reduceByKey(_ + _)
val wordCount = pairs.flatMap { case (a,b) => Seq(a -> 1, b ->1)}.reduceByKey(_ + _)
wordCount.take(10)
// count how many pairs each word occurs in, as either key or value:
val wordPairCount = pairs.flatMap { case (a, b) =>
  if (a == b) {
    Seq(a -> 1)
  } else {
    Seq(a -> 1, b -> 1)
  }
}.reduceByKey(_ + _)
wordPairCount.take(10)
To get the histograms for the (String, String) RDD, I used this code:
val Hist_X = histogram.map(x => (x._1-> 1.0)).reduceByKey(_+_).collect().toMap
val Hist_Y = histogram.map(x => (x._2-> 1.0)).reduceByKey(_+_).collect().toMap
val Hist_XY = histogram.map(x => (x-> 1.0)).reduceByKey(_+_)
where histogram was the (String,String) RDD
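On the toMap question raised in the update: a Map holds at most one value per key, so converting a sequence with repeated keys to a Map keeps only one entry per key rather than all of them. The sketch below uses plain Scala collections instead of Spark so that it runs on its own; the sample numbers come from the question, and the object name ToMapDemo is made up.

```scala
object ToMapDemo {
  // The (word, count) pairs from the question's example.
  val pairs: Seq[(String, Long)] =
    Seq("mom" -> 30L, "dad" -> 59L, "mom" -> 2L, "dad" -> 14L)

  // toMap keeps one value per key; later entries overwrite earlier ones.
  val collapsed: Map[String, Long] = pairs.toMap

  // Grouping by key first and summing keeps every count, which is roughly
  // what reduceByKey(_ + _) does on an RDD of pairs.
  val summed: Map[String, Long] =
    pairs.groupBy(_._1).map { case (word, entries) => word -> entries.map(_._2).sum }

  def main(args: Array[String]): Unit = {
    println(collapsed)
    println(summed)
  }
}
```

In Spark itself, reduceByKey is defined on RDDs of pairs, so the reduction would normally happen on the RDD before any conversion to a local Map.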

Apply a sequence of functions to value and get the final result

I wish to apply a sequence of functions to an object (each of the functions may return the same or modified object) and get the ultimate result returned by the last function.
Is there an idiomatic Scala way to turn this (pseudocode):
val pipeline = ListMap(("a" -> obj1), ("b" -> obj2), ("c" -> obj3))
into this?
val initial_value = Something("foo", "bar")
val result = obj3.func(obj2.func(obj1.func(initial_value)))
The pipeline is initialized at runtime and contains an undetermined number of "manglers".
I tried with foreach but it requires an intermediate var to store the result, and foldLeft only works on types of ListMap, while the initial value and the result are of type Something.
Thanks
This should do it:
pipeline.foldLeft(initial_value){case (acc, (k,obj)) => obj.func(acc)}
No idea why pipeline contains pairs, though.
Assuming input and output types are the same, I'd go with a reduceLeft and composition by andThen:
def pipe[A](a: A, funcs: List[A => A]): A = funcs.reduceLeft(_ andThen _)(a)
I think foldLeft is the right choice:
val pipeline = List("a"-> func1, "b"-> func2, "c"-> func3)
...
val result = pipeline.foldLeft(initial_value) {case (acc,(key,func)) => func(acc)}
Get rid of your keys, first:
pipeline.values.foldLeft(initial_value)((a, f) => f.func(a))
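A self-contained sketch of the foldLeft approach follows. The Something case class, the Mangler trait and the three sample manglers are hypothetical stand-ins for the objects in the question, which does not show their definitions.

```scala
import scala.collection.immutable.ListMap

object PipelineDemo {
  // Hypothetical stand-ins for the question's types.
  final case class Something(a: String, b: String)
  trait Mangler { def func(s: Something): Something }

  val upper: Mangler = new Mangler {
    def func(s: Something): Something = s.copy(a = s.a.toUpperCase)
  }
  val swap: Mangler = new Mangler {
    def func(s: Something): Something = Something(s.b, s.a)
  }
  val exclaim: Mangler = new Mangler {
    def func(s: Something): Something = s.copy(b = s.b + "!")
  }

  // ListMap iterates in insertion order, so the manglers run as a, b, c.
  val pipeline: ListMap[String, Mangler] =
    ListMap("a" -> upper, "b" -> swap, "c" -> exclaim)

  // Thread the value through every mangler; the accumulator is a Something.
  def run(initial: Something, stages: Iterable[Mangler]): Something =
    stages.foldLeft(initial)((acc, mangler) => mangler.func(acc))

  val result: Something = run(Something("foo", "bar"), pipeline.values)

  def main(args: Array[String]): Unit = println(result)
}
```

One difference from the reduceLeft variant: foldLeft returns the initial value unchanged for an empty pipeline, whereas reduceLeft throws on an empty collection.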

flatMap on a map gives error: wrong number of parameters; expected = 1

I have a map m
val m = Map(1->2, 3->4, 5->6, 7->8, 4->4, 9->9, 10->12, 11->11)
Now I want a map containing only the entries whose keys are equal to their values. So I do this:
def eq(k: Int, v: Int) = if (k == v) Some(k->v) else None
m.flatMap((k,v) => eq(k,v))
This gives me the error
error: wrong number of parameters; expected = 1
m.flatMap((k,v) => eq(k,v))
What's wrong with the above code? flatMap expects a one-argument function, and here I am passing one argument, which is a pair of integers.
Also this works
m.flatMap {
  case (k,v) => eq(k,v)
}
but this does not
m.flatMap {
  (k,v) => eq(k,v)
}
Looks like I am missing something. Help?
There is no such syntax:
m.flatMap((k,v) => eq(k,v))
Well, in fact there is such syntax, but actually it is used in functions that accept two arguments (like reduce):
List(1,2,3,4).reduce((acc, x) => acc + x)
The
m.flatMap {
  case (k,v) => eq(k,v)
}
syntax works because in fact it is something like this:
val temp: PartialFunction[Tuple2[X,Y], Tuple2[Y,X]] = {
  case (k,v) => eq(k,v) // using literal expression to construct function
}
m.flatMap(temp) // with braces omitted
The key thing here is the use of the case keyword (actually, there is a discussion about enabling your very syntax), which turns an ordinary braces expression like { ... } into a full-blown anonymous partial function.
(If you want to simply fix the error you're getting, see the 2nd solution (with flatMap); if you want a generally nicer solution, read from the beginning.)
What you need instead is filter not flatMap:
def eq(k: Int, v: Int) = k == v
val m = Map(1->2, 3->4, 5->6, 7->8, 4->4, 9->9, 10->12, 11->11)
m.filter((eq _).tupled)
...which of course reduces to just the following, without the need for eq:
m.filter { case (k, v) => k == v }
result:
Map(9 -> 9, 11 -> 11, 4 -> 4)
OR... If you want to stick with flatMap
First you must know that flatMap will pass to your function TUPLES not keys and values as separate arguments.
Additionally, you must change the Option returned by eq to something that can be fed back to flatMap on sequences such as List or Map (actually any GenTraversableOnce to be precise):
def eq(k: Int, v: Int) = if (k == v) List(k -> v) else Nil
m.flatMap { case (k,v) => eq(k,v) } // use pattern matching to unpack the tuple
or the uglier but equivalent:
m.flatMap { x => eq(x._1, x._2) }
alternatively, you can convert eq to take a tuple instead:
m.flatMap((eq _).tupled)
I think that what you want is a single argument that is a pair, not two arguments. Something like this may work:
m.flatMap(k => eq(k._1, k._2))
The code snippet that works uses pattern matching. You give names to both elements of your pair. It's a partial function and can be used here in your flatMap.
You have to do:
m.flatMap { case (k,v) => eq(k,v) }
Note that here I switch to curly braces, which indicate a function block rather than a parameter list, and the function here is a case clause. This means that the function block I'm passing to flatMap is a PartialFunction that is only invoked for items that match the case statement.
Your eq function takes two parameters, that is why you are getting the type error. Try:
def f(p: (Int, Int)) = if (p._1 == p._2) Some(p) else None
m flatMap f
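The variants suggested in these answers can be checked against each other with a small sketch. The helper is named same here rather than eq purely as an illustrative choice, because every object already inherits a method called eq from AnyRef; the object name EqDemo is also made up.

```scala
object EqDemo {
  val m: Map[Int, Int] =
    Map(1 -> 2, 3 -> 4, 5 -> 6, 7 -> 8, 4 -> 4, 9 -> 9, 10 -> 12, 11 -> 11)

  // Two-parameter predicate, as in the question (renamed from eq).
  def same(k: Int, v: Int): Boolean = k == v

  // Pattern-matching literal: one tuple argument, destructured by `case`.
  val viaCase: Map[Int, Int] = m.filter { case (k, v) => same(k, v) }

  // tupled turns the (Int, Int) => Boolean function into ((Int, Int)) => Boolean.
  val viaTupled: Map[Int, Int] = m.filter((same _).tupled)

  // Plain one-parameter lambda with tuple accessors.
  val viaAccessors: Map[Int, Int] = m.filter(p => same(p._1, p._2))

  // flatMap version: each entry becomes a list of zero or one pairs.
  val viaFlatMap: Map[Int, Int] =
    m.flatMap { case (k, v) => if (same(k, v)) List(k -> v) else Nil }

  def main(args: Array[String]): Unit = println(viaCase)
}
```

All four values contain the same three entries; the choice between them is mostly a matter of readability.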

JSON to XML in Scala and dealing with Option() result

Consider the following from the Scala interpreter:
scala> JSON.parseFull("""{"name":"jack","greeting":"hello world"}""")
res6: Option[Any] = Some(Map(name -> jack, greeting -> hello world))
Why is the Map returned inside Some()? And how do I work with it?
I want to put the values in an xml template:
<test>
<name>name goes here</name>
<greeting>greeting goes here</greeting>
</test>
What is the Scala way of getting my map out of Some(thing) and getting those values in the xml?
You should probably use something like this:
res6 collect { case x: Map[String, String] => renderXml(x) }
Where:
def renderXml(m: Map[String, String]) =
<test><name>{m.get("name") getOrElse ""}</name></test>
The collect method on Option[A] takes a PartialFunction[A, B] and is a combination of filter (by a predicate) and map (by a function). That is:
opt collect pf
opt filter (a => pf isDefinedAt a) map (a => pf(a))
These two expressions are equivalent. When you have an optional value, you should use map, flatMap, filter, collect etc. to transform the option in your program, avoiding extracting the option's contents either via a pattern match or via the get method. You should never, ever use Option.get; it is the canonical sign that you are doing it wrong. Pattern matching should be avoided because it represents a fork in your program and hence adds to cyclomatic complexity. The only time you might wish to do this is for performance.
Actually, you have the issue that the result of parseFull is an Option[Any] (presumably it is an Option because the parsing may not succeed, and Option is a more graceful way of signalling that than, well, null).
But the issue with my code above is that case x: Map[String, String] cannot be fully checked at runtime due to type erasure (i.e. Scala can check that the option contains a Map, but not that the Map's type parameters are both String). The code will give you an unchecked warning.
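The stated equivalence between collect and filter followed by map can be checked on a few sample values. This is only a sketch: the partial function lengthOfString and the object name CollectDemo are made up for illustration.

```scala
object CollectDemo {
  // Hypothetical partial function: defined only for String inputs.
  val lengthOfString: PartialFunction[Any, Int] = { case s: String => s.length }

  // collect applies the partial function where it is defined, else yields None.
  def viaCollect(opt: Option[Any]): Option[Int] =
    opt collect lengthOfString

  // The same thing spelled out as a filter followed by a map.
  def viaFilterMap(opt: Option[Any]): Option[Int] =
    opt filter (a => lengthOfString isDefinedAt a) map (a => lengthOfString(a))

  def main(args: Array[String]): Unit = {
    val samples: List[Option[Any]] = List(Some("jack"), Some(42), None)
    samples.foreach(o => println(s"$o: ${viaCollect(o)} / ${viaFilterMap(o)}"))
  }
}
```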
An Option is returned because parseFull has different possible return values depending on the input, or it may fail to parse the input at all (giving None). So, aside from an optional Map which associates keys with values, an optional List can be returned as well if the JSON string denoted an array.
Example:
scala> import scala.util.parsing.json.JSON._
import scala.util.parsing.json.JSON._
scala> parseFull("""{"name":"jack"}""")
res4: Option[Any] = Some(Map(name -> jack))
scala> parseFull("""[ 100, 200, 300 ]""")
res6: Option[Any] = Some(List(100.0, 200.0, 300.0))
You might need pattern matching in order to achieve what you want, like so:
scala> parseFull("""{"name":"jack","greeting":"hello world"}""") match {
| case Some(m) => Console println ("Got a map: " + m)
| case _ =>
| }
Got a map: Map(name -> jack, greeting -> hello world)
Now, if you want to generate XML output, you can use the above to iterate over the key/value pairs:
import scala.xml.XML
parseFull("""{"name":"jack","greeting":"hello world"}""") match {
  case Some(m: Map[_,_]) =>
    <test>
      {
        m map { case (k,v) =>
          XML.loadString("<%s>%s</%s>".format(k,v,k))
        }
      }
    </test>
  case _ =>
}
parseFull returns an Option because the string may not be valid JSON (in which case it will return None instead of Some).
The usual way to get the value out of a Some is to pattern match against it like this:
result match {
  case Some(map) =>
    doSomethingWith(map)
  case None =>
    handleTheError()
}
If you're certain the input will always be valid and so you don't need to handle the case of invalid input, you can use the get method on the Option, which will throw an exception when called on None.
You have two separate problems.
It's typed as Any.
Your data is inside an Option and a Map.
Let's suppose we have the data:
val x: Option[Any] = Some(Map("name" -> "jack", "greeting" -> "hi"))
and suppose that we want to return the appropriate XML if there is something to return, but not otherwise. Then we can use collect to gather those parts that we know how to deal with:
val y = x collect {
  case m: Map[_,_] => m collect {
    case (key: String, value: String) => key -> value
  }
}
(Note how we've taken each entry in the map apart to make sure it maps a string to a string; we wouldn't know how to proceed otherwise.) We get:
y: Option[scala.collection.immutable.Map[String,String]] =
Some(Map(name -> jack, greeting -> hi))
Okay, that's better! Now if you know which fields you want in your XML, you can ask for them:
val z = for (m <- y; name <- m.get("name"); greet <- m.get("greeting")) yield {
<test><name>{name}</name><greeting>{greet}</greeting></test>
}
which in this (successful) case produces
z: Option[scala.xml.Elem] =
Some(<test><name>jack</name><greeting>hi</greeting></test>)
and in an unsuccessful case would produce None.
If you instead want to wrap whatever you happen to find in your map in the form <key>value</key>, it's a bit more work because Scala doesn't have a good abstraction for tags:
val z = for (m <- y) yield <test>{ m.map { case (tag, text) => xml.Elem(null, tag, xml.Null, xml.TopScope, xml.Text(text)) }}</test>
which again produces
z: Option[scala.xml.Elem] =
Some(<test><name>jack</name><greeting>hi</greeting></test>)
(You can use get to get the contents of an Option, but it will throw an exception if the Option is empty (i.e. None).)
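The collect and for-comprehension steps above can be combined into one function. The sketch below is a variation, not the answer's exact code: it builds the markup with string interpolation instead of XML literals so that it runs without the scala-xml module, it does no XML escaping, and the names GreetingXml and render are made up.

```scala
object GreetingXml {
  // Takes the Option[Any] shape that parseFull returns and yields the markup
  // only when a map with both expected String fields is present.
  def render(parsed: Option[Any]): Option[String] = {
    // Narrow Any to Map[String, String], keeping only String-to-String entries.
    val typed: Option[Map[String, String]] = parsed.collect {
      case m: Map[_, _] => m.collect { case (k: String, v: String) => k -> v }
    }
    for {
      m        <- typed
      name     <- m.get("name")
      greeting <- m.get("greeting")
    } yield s"<test><name>$name</name><greeting>$greeting</greeting></test>"
  }

  def main(args: Array[String]): Unit =
    println(render(Some(Map("name" -> "jack", "greeting" -> "hi"))))
}
```

A missing field, a non-map payload, or a failed parse (None) all end up as None, so the caller has a single failure case to handle."))
assert(GreetingXml.render(Some(Map("name" -> "jack"))) == None)
assert(GreetingXml.render(Some(List(100.0, 200.0))) == None)
assert(GreetingXml.render(None) == None)
</test>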