I'm running a ScalaTest test that asserts my actor returns the right data.
The actor, named "testActor", converts data from SortedMap[Long, SortedMap[String, Double]] to SortedMap[String, Array[Double]].
The current code is:
val data: SortedMap[Long, SortedMap[String, Double]] = SortedMap(1000L -> SortedMap("c1" -> 1.0, "c2" -> 2.1), 2000L -> SortedMap("c1" -> 1.1), 3000L -> SortedMap("c1" -> 0.95))
val expectedResult = SortedMap("c1" -> Array(1.0, 1.1, 0.95), "c2" -> Array(2.1))
actor ! testActor(data)
expectMsg(replyTestActor(expectedResult))
For some reason the assertion seems to compare the arrays by reference (their identity hash codes) rather than by content, i.e.
assertion failed: expected replyTestActor(Map(c1 -> [D#60b8a8f6, c2 -> [D#7b5ce015),2,2000), found replyTestActor(Map(c1 -> [D#e7bc1f9, c2 -> [D#5efbc9dc),2,2000)
I should add that in debug mode, when I use "Expression Evaluation" at a breakpoint, the actor's reply and expectedResult look identical.
The problem is the values in your SortedMap.
> Array(42) == Array(42)
res0: Boolean = false
Array does not provide a friendly equals implementation; == on two arrays is reference equality.
Edit: also, Array is a mutable structure, and it's usually not recommended to pass mutable data in messages between actors.
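A minimal sketch of one way around this, assuming replyTestActor simply wraps the SortedMap as constructed in the question (adjust the pattern if your case class has more fields): match the reply with TestKit's expectMsgPF and compare the arrays element-wise with sameElements instead of relying on ==.
actor ! testActor(data)
// expectMsg uses ==, which is reference equality for Array,
// so match the reply and compare each array's contents instead
expectMsgPF() {
  case replyTestActor(actual) =>
    assert(actual.keySet == expectedResult.keySet)
    expectedResult.foreach { case (key, values) =>
      assert(actual(key).sameElements(values))
    }
}
Alternatively, switching the message payload from Array[Double] to an immutable Vector[Double] gives structural equality for free, and plain expectMsg works again.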
I have one map like
val strMap = Map[String, String]("a" -> "a1", "b" -> "b1") // Map(a -> a1, b -> b1)
and I want to create another map with the same keys but different values, based on the values in strMap. For example:
case class Data(data: String) {}
var dataMap = scala.collection.mutable.Map[String, Data]()
strMap.foreach (keyVal => {dataMap(keyVal._1) = Data(keyVal._2)})
val dataMapToUse = dataMap.toMap // Map(a -> Data(a1), b -> Data(b1))
but writing this in an imperative style forces me to create a mutable var dataMap even though I want an immutable map, and then call toMap at the end to get one.
How can I achieve the same thing in a functional style?
Scala Version: 2.11
Why not simply use:
val dataMapToUse = strMap.map { case (k, v) => k -> Data(v) }
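This maps each key/value pair directly into a new immutable Map, so no var or toMap is needed; for the strMap above it yields:
// dataMapToUse: Map[String, Data] = Map(a -> Data(a1), b -> Data(b1))
strMap.mapValues(Data) looks even shorter, but on 2.11 mapValues returns a lazy view over the original map, which can surprise you later (e.g. during serialization), so the explicit map is usually the safer choice.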
After reading two JSON files, I have the following two maps:
val m1 = Map("events" -> List(Map("id" -> "Beatles", "when" -> "Today"), Map("id"->"Elvis", "when"->"Tomorrow")))
val m2 = Map("events" -> List(Map("id" -> "Beatles", "desc"-> "The greatest band"), Map("id"->"BeachBoys","desc"-> "The second best band")))
I want to merge them in a generic way (without referencing the specific structure of these two particular maps) such that the result would be:
val m3 = Map("events" -> List(Map("id" -> "Beatles", "when" -> "Today", "desc"->"The greatest band")))
That is, first intersect by id and then join (both on the same depth level). It would be fine if it only works for a max depth of one as in this example (but of course, a fully recursive solution that could handle arbitrarily nested lists of maps / maps would be even better). This needs to be done in a completely generic way (otherwise it would be trivial), as the keys (like "events", "id", "when", ...) in both source JSON files will change.
I tried the (standard) Monoid/Semigroup addition in Scalaz/Cats, but this of course only concatenates the list elements and does not intersect/join.
val m3 = m1.combine(m2) // Cats
// Map(events -> List(Map(id -> Beatles, when -> Today), Map(id -> Elvis, when -> Tomorrow), Map(id -> Beatles, desc -> The greatest band), Map(id -> BeachBoys, desc -> The second best band)))
EDIT: The only assumption of the map structure is that there might be an "id" field. If it is present, then intersect and finally join.
Some background: I have two kinds of JSON files. One with static information (e.g. a description of a band) and one with dynamic information (e.g. the date of the next concert). After reading the files, I get the two maps presented above. I want to avoid exploiting the specific structure of the JSON files (e.g. by creating a domain model via case classes), because there are different scenarios with completely different source file structures which are likely to change, and I don't want to create a dependency on those file structures in source code. Therefore, I need a generic way to merge these two maps.
So you have these two maps.
val m1 = Map("events" -> List(Map("id" -> "Beatles", "when" -> "Today"), Map("id"->"Elvis", "when"->"Tomorrow")))
val m2 = Map("events" -> List(Map("id" -> "Beatles", "desc"-> "The greatest band"), Map("id"->"BeachBoys","desc"-> "The second best band")))
And it looks like you are trying to group the events by id and form event groups.
Your domain model can be represented with the following case classes:
case class EventDetails(title: String, desc: String)
case class Event(subjectId: String, eventDetails: EventDetails)
case class EventGroup(subjectId: String, eventDetailsList: List[EventDetails])
Let's convert our Maps into more meaningful domain objects:
def eventMapToEvent(eventMap: Map[String, String]): Option[Event] = {
  val subjectIdOpt = eventMap.get("id")
  val (titleOpt, descOpt) = (eventMap - "id").toList.headOption match {
    case Some((title, desc)) => (Some(title), Some(desc))
    case _ => (None, None)
  }
  (subjectIdOpt, titleOpt, descOpt) match {
    case (Some(subjectId), Some(title), Some(desc)) => Some(Event(subjectId, EventDetails(title, desc)))
    case _ => None
  }
}
val m1Events = m1.getOrElse("events", List()).flatMap(eventMapToEvent)
val m2Events = m2.getOrElse("events", List()).flatMap(eventMapToEvent)
val events = m1Events ++ m2Events
Now the world makes more sense than when we were dealing with raw maps, and we can proceed with the grouping.
val eventGroups = events.groupBy(event => event.subjectId).map({
  case (subjectId, eventList) => EventGroup(subjectId, eventList.map(event => event.eventDetails))
}).toList
// eventGroups: List[EventGroup] = List(EventGroup(BeachBoys,List(EventDetails(desc,The second best band))), EventGroup(Elvis,List(EventDetails(when,Tomorrow))), EventGroup(Beatles,List(EventDetails(when,Today), EventDetails(desc,The greatest band))))
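The question also asks for an intersection on id (only keep entries that occur in both sources). A small follow-up sketch on top of the values computed above:
// keep only the subject ids that appear in both m1 and m2
val commonIds = m1Events.map(_.subjectId).toSet intersect m2Events.map(_.subjectId).toSet
val intersectedGroups = eventGroups.filter(group => commonIds.contains(group.subjectId))
// for the sample data this keeps only the Beatles group, with both the "when" and "desc" details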
I have just started a project in work where we are migrating some C# tooling across to a new Scala project. This is my first exposure to the language (and functional programming in general) so in the interest of not just writing Java style code in Scala, I am wondering what the correct approach to handling the following scenario is.
We have two map objects which represent tabular data with the following structure:
map1: key | date | mapping val
map2: key | number
The mapping value in the first object is not always populated. Currently these are represented by Map[String, Array[String]] and Map[String, Double] types.
In the C# tool we have the following approach:
Loop through key set in first map
For every key, check to see if the mapping val is blank
If no mapping then fetch the number from map 2 and return
If mapping exists then recursively call method to get full range of mapping values and their numbers, summing as you go. E.g. key 1 might have a mapping to key 4, key 4 might have a mapping to key 5 etc and we want to sum all of the values for these keys in map2.
Is there a clever way to do this in Scala which would avoid updating a list from within a for loop and recursively walking the map?
Is this what you are after?
@annotation.tailrec
def recurse(key: String, count: Double, map1: Map[String, String], map2: Map[String, Double]): Double = {
  map1.get(key) match {
    case Some(mappingVal) if mappingVal == "" =>
      count + map2.getOrElse(mappingVal, 0.0)
    case Some(mappingVal) =>
      recurse(mappingVal, count + map2.getOrElse(mappingVal, 0.0), map1, map2)
    case None => count
  }
}
example use:
val m1: Map[String, String] = Map("1" -> "4", "4" -> "5", "5" -> "6", "8" -> "")
val m2: Map[String, Double] = Map("1" -> 1.0, "4" -> 4.0, "6" -> 10.0)
m1.map {
  case (k, _) => k -> recurse(k, 0.0, m1, m2)
}.foreach(println)
Output:
(1,14.0)
(4,10.0)
(5,10.0)
(8,0.0)
Note that there is no cycle detection - this will never terminate if map1 has a cycle.
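If cycles are a real possibility in your data, a sketch of one way to guard against them is to thread a set of already-visited keys through the recursion (still tail recursive):
@annotation.tailrec
def recurseSafe(key: String, count: Double, visited: Set[String],
                map1: Map[String, String], map2: Map[String, Double]): Double = {
  map1.get(key) match {
    // stop on a blank mapping or when we loop back to a key we've already seen
    case Some(mappingVal) if mappingVal == "" || visited.contains(mappingVal) =>
      count
    case Some(mappingVal) =>
      recurseSafe(mappingVal, count + map2.getOrElse(mappingVal, 0.0), visited + key, map1, map2)
    case None => count
  }
}
// e.g. a cyclic mapping that would hang the original version now terminates:
// recurseSafe("9", 0.0, Set.empty, Map("9" -> "10", "10" -> "9"), m2)  // 0.0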
I'm new to Spark and Scala, and I've run into a compile error:
Let's say we have an RDD whose elements are maps like this:
val rawData = someRDD.map {
  // some ops
  Map(
    "A" -> someInt_var1,  // Int
    "B" -> someInt_var2,  // Int
    "C" -> somelong_var   // Long
  )
}
Then, I want to get histogram info of these vars. So, here is my code:
rawData.map{row => row.get("A")}.histogram(10)
And the compile error says:
value histogram is not a member of org.apache.spark.rdd.RDD[Option[Any]]
I'm wondering why rawData.map{row => row.get("A")} is org.apache.spark.rdd.RDD[Option[Any]] and how I can transform it to RDD[Int]?
I have tried like this:
rawData.map{row => row.get("A")}.map{_.toInt}.histogram(10)
But it fails to compile:
value toInt is not a member of Option[Any]
I'm totally confused and seeking help here.
You get an Option because Map.get returns an Option: it is None if the key doesn't exist in the Map. The Option[Any] also reflects the mixed value types in the Map; since you have both Int and Long, in my case it is inferred as Option[AnyVal] rather than Option[Any].
A possible solution is to use getOrElse to get rid of the Option by providing a default value for when the key doesn't exist, and if you are sure A's value is always an Int, you can cast it from AnyVal to Int with asInstanceOf[Int].
A simplified example follows:
val rawData = sc.parallelize(Seq(Map("A" -> 1, "B" -> 2, "C" -> 4L)))
rawData.map(_.get("A"))
// res6: org.apache.spark.rdd.RDD[Option[AnyVal]] = MapPartitionsRDD[9] at map at <console>:27
rawData.map(_.getOrElse("A", 0).asInstanceOf[Int]).histogram(10)
// res7: (Array[Double], Array[Long]) = (Array(1.0, 1.0),Array(1))
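If you'd rather drop missing keys instead of defaulting them to 0, a small variation on the same idea (using the same simplified rawData) is to flatMap over the Option and keep only the Int values:
// None values disappear in the flatMap; collect keeps only the entries that are really Int
rawData.flatMap(_.get("A"))
  .collect { case i: Int => i }
  .histogram(10)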
I'm trying to send an object to a remote actor and I got this exception:
ERROR akka.remote.EndpointWriter - Transient association error (association remains live)
java.io.NotSerializableException: scala.collection.immutable.MapLike$$anon$2
The object being serialized is a case class:
case class LocationReport(idn: String, report: String, timestamp: Option[String], location: Attr, status: Attr, alarms: Attr, network: Attr, sensors: Attr) extends Message(idn) {
val ts = timestamp getOrElse location("fix_timestamp")
def json =
(report ->
("TIME" -> ts) ~
("location" -> location) ~
("alarms" -> alarms) ~
("network" -> network) ~
("sensors" -> ((status ++ sensors) + ("CUSTOMCLOCK" -> Report.decodeTimestamp(ts)))))
}
And Attr is a type re-definition:
type Attr = Map[String, String]
The Message class is pretty simple:
abstract class Message(idn: String) {
def topic = idn
def json(): JValue
}
I'm wondering if the type alias/redefinition is confusing the serializer. I think I'm using ProtoBuf serialization, but I do see JavaSerializer in the stacktrace.
More Debugging Info
I newed up a JavaSerializer and individually serialized each of the Maps. Only one (alarms) failed to serialize. Here's the toString of each of them:
This one failed:
alarms = Map(LOWBATTERY -> 1373623446000)
These succeeded:
location = Map(a_value -> 6, latitude -> 37.63473, p_value -> 4, longitude -> -97.41459, fix_timestamp -> 3F0AE7FF, status -> OK, fix_type -> MSBL, CUSTOMCLOCK -> 1373644159000)
network = Map(SID -> 1271, RSSI -> 85)
sensors = Map(HUMIDITY -> -999, PRESSURE -> -999, LIGHT -> -999)
status = Map(TEMPERATURE_F -> 923, CYCLE -> 4, TEMPERATURE1_C -> 335, CAP_REMAINING -> 560, VOLTAGE -> 3691, CAP_FULL -> 3897)
The problem is that Map.mapValues produces an object that's not serializable. When alarms was created, it was run through something like alarms.mapValues(hex2Int). The problem and workaround are described here:
https://issues.scala-lang.org/browse/SI-7005
In short, the solution is to do alarms.mapValues(hex2Int).map(identity)
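A minimal, self-contained sketch of the failure mode and the workaround; hex2Int here is a hypothetical stand-in for whatever conversion actually builds alarms, not the real project code:
// hypothetical conversion, just to produce a mapValues result
def hex2Int(hex: String): String = java.lang.Long.parseLong(hex, 16).toString

val raw: Map[String, String] = Map("LOWBATTERY" -> "53E1D2B8")

val lazyView = raw.mapValues(hex2Int)               // MapLike$$anon$2 on Scala 2.10.x: not serializable
val plainMap = raw.mapValues(hex2Int).map(identity) // forces a real immutable Map: serializable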
Not sure whether this works in all cases, but my workaround was simply to convert the map into a sequence (just .toSeq) before serialization. Calling toMap after deserialization should give you the same map back.
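Roughly, that looks like this (a sketch; alarms stands for whichever Attr map fails to serialize):
val alarmsSeq: Seq[(String, String)] = alarms.toSeq    // a plain Seq of pairs serializes fine
// ... put alarmsSeq in the message instead of the map ...
val alarmsAgain: Map[String, String] = alarmsSeq.toMap // rebuild the Map on the receiving side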