Using Scala groupBy(), from method fetchUniqueCodesForARootCode(). I want to get a map from rootCodes to lists of uniqueCodes - scala

I want to to return Future[Map[String, List[String]]] from fetchUniqueCodesForARootCode method
import scala.concurrent._
import ExecutionContext.Implicits.global
case class DiagnosisCode(rootCode: String, uniqueCode: String, description: Option[String] = None)
object Database {
private val data: List[DiagnosisCode] = List(
DiagnosisCode("A00", "A001", Some("Cholera due to Vibrio cholerae")),
DiagnosisCode("A00", "A009", Some("Cholera, unspecified")),
DiagnosisCode("A08", "A080", Some("Rotaviral enteritis")),
DiagnosisCode("A08", "A083", Some("Other viral enteritis"))
)
def getAllUniqueCodes: Future[List[String]] = Future {
Database.data.map(_.uniqueCode)
}
def fetchDiagnosisForUniqueCode(uniqueCode: String): Future[Option[DiagnosisCode]] = Future {
Database.data.find(_.uniqueCode.equalsIgnoreCase(uniqueCode))
}
}
getAllUniqueCodes returns all unique codes from data List.
fetchDiagnosisForUniqueCode returns DiagnosisCode when uniqueCode matches.
From fetchDiagnosisForUniqueCodes, I am returningFuture[List[DiagnosisCode]] using getAllUniqueCodes() and fetchDiagnosisForUniqueCode(uniqueCode).*
def fetchDiagnosisForUniqueCodes: Future[List[DiagnosisCode]] = {
val xa: Future[List[Future[DiagnosisCode]]] = Database.getAllUniqueCodes.map { (xs:
List[String]) =>
xs.map { (uq: String) =>
Database.fetchDiagnosisForUniqueCode(uq)
}
}.map(n =>
n.map(y=>
y.map(_.get)))
}
xa.flatMap {
listOfFuture =>
Future.sequence(listOfFuture)
}}
Now, def fetchUniqueCodesForARootCode should return Future[Map[String, List[DiagnosisCode]]] using fetchDiagnosisForUniqueCodes and groupBy
Here is the method
def fetchUniqueCodesForARootCode: Future[Map[String, List[String]]] = {
fetchDiagnosisForUniqueCodes.map { x =>
x.groupBy(x => (x.rootCode, x.uniqueCode))
}
}
Need to get the below result from fetchUniqueCodesForARootCode:-
A00 -> List(A001, A009), H26 -> List(H26001, H26002), B15 -> List(B150, B159), H26 -> List(H26001, H26002)

It's hard to decode from the question description, what the problem is. But if I understood correctly, you want to get a map from rootCodes to lists of uniqueCodes.
The groupBy method takes a function that for every element returns its key. So first you have to group by the rootCodes and then you have to use map to get the correct values.
groupBy definition: https://dotty.epfl.ch/api/scala/collection/IterableOps.html#groupBy-f68
scastie: https://scastie.scala-lang.org/KacperFKorban/PL1X3joNT3qNOTm6OQ3VUQ

Related

Scala Action Map Implementation Issue (follow up)

This is a fairly long winded question and a follow up to my last one.
I have the following code for an application being built - I am looking to call the function in handleOne but it is not working in the action map. I think this is due to the unit assigned to statesVotes in the handler. The goal is to create a menu driven application that performs a set of desired functions. The function in question here is: Get all the state values and display suitably formatted.
Potentially have to make the states into a map but looking for the same functionality of the case class.
import scala.io.StdIn.readInt
object myApp3 extends App{
val dataRE = "([^(]+) \\((\\d+)\\),(.+)".r
val pVotes = "([^:]+):(\\d+)".r
case class State(name : String
,code : Int
,parties : Array[(String,Int)])
val states: List[State] =
util.Using(io.Source.fromFile("filename.txt"))(_.getLines().toList)
.get //will throw if read file fails
.collect{case dataRE(name,code,votes) =>
State(name.trim
,code.toInt
,votes.split(",")
.collect{case pVotes(p,v) => (p,v.toInt)}
)
}
val actionMap = Map[Int, () => Boolean](1 -> handleOne)
var opt = 0
do{
opt = readOption
} while (menu(opt))
def readOption: Int = {
println(
"""|Please select one of the following:
| 1 - Show All States and Votes
| 2 - CW Option 2
| 3 - quit""".stripMargin)
readInt()
}
def menu(option: Int): Boolean = {
actionMap.get(option) match {
case Some(f) => f()
case None =>
println("Command not recognized!")
true
}
}
// handle one calls function mnuShowStatesVotes, which invokes function statesVotes
def handleOne(): Boolean = {
mnuShowStatesVotes(statesVotes : List[State])
true
}
def mnuShowStatesVotes(f:() => List[State]) = {
f() foreach(println())
}
def statesVotes = states.sortBy(_.name) //alphabetical order of states
.foreach{ st =>
println(st.name) //show line by split by state name
st.parties
.sortBy(-_._2) //sorts parties by votes in descending order
.map{case (p,v) => f"\t$p%-12s:$v%9d"}
.foreach(println)
}
}
Essentially want the menu option handleOne to correctly invoke the function in statesVotes.
The text file being used can be found below:
Alabama (9),Democratic:849624,Republican:1441170,Libertarian:25176,Others:7312
Alaska (3),Democratic:153778,Republican:189951,Libertarian:8897,Others:6904
Arizona (11),Democratic:1672143,Republican:1661686,Libertarian:51465,Green:1557,Others:475
It seems to me that your code would benefit by adopting a clear and distinct separation/segregation of roles and responsibilities.
Let's get the preliminaries taken care of.
import scala.util.{Try, Success, Failure, Using}
case class State(name : String
,code : Int
,parties : Array[(String,Int)])
Now let's parse the input data.
This code has one job to do: load the data from the input file. It takes one parameter, the input filename, and returns either Success() with the accumulated data, or Failure() with the error exception.
def readFile(filename: String): Try[List[State]] = {
val dataRE = "([^(]+) \\((\\d+)\\),(.+)".r
val pVotes = "([^:]+):(\\d+)".r
Using(io.Source.fromFile(filename)) {
_.getLines()
.toList
.collect{ case dataRE(name, code, votes) =>
State(name.trim
,code.toInt
,votes.split(",")
.collect{case pVotes(p,v) => (p,v.toInt)})
}
}
}
Note that collect() will simply ignore file data the doesn't fit the expected format. If you were to use map() instead then bad input data would cause a Failure().
Now let's put all the output methods, and their descriptions, under one roof. This is most of what the user will see.
class Menu(states: List[State]) {
def apply(key: String): Boolean = {
val (_, op, continue) = lookup(key)
op()
continue
}
private val lookup: Map[String,(String,()=>Unit,Boolean)] =
Map("?" -> ("show this menu", menu _, true)
,"menu" -> ("show this menu", menu _, true)
,"all" -> ("display all voting data", all _, true)
,"st" -> ("vote totals by state", stVotes _, true)
,"x" -> ("exit", done _, false)
,"quit" -> ("exit", done _, false)
).withDefaultValue(("",unknown _, true))
private def done(): Unit = println("bye")
private def unknown(): Unit =
println("unknown selection ('?' for main menu)")
private def menu(): Unit =
lookup.keys.toVector.sorted
.map(k => s"$k\t: ${lookup(k)._1}")
.foreach(println)
private def all(): Unit =
states.sortBy(_.name) //alphabetical
.foreach{ st =>
println(st.name) //state name
st.parties
.sortBy(-_._2) //votes in decreasing order
.map{case (p,v) => f"\t$p%-12s:$v%9d"}
.foreach(println)
}
private def stVotes(): Unit =
states.map(st => (st.name, st.parties.map(_._2).sum))
.sortBy(-_._2) //votes in decreasing order
.map{case (state,total) => f"$state%-9s:$total%8d"}
.foreach(println)
}
Notice that only the apply() method is public. Everything else is private and under wraps.
To create a new data report you just add an entry in the lookup Map and add the new method to produce the output.
Now all we need is the code to tie the pieces together and to take user input.
def main(args: Array[String]): Unit =
args.headOption.map(readFile) match {
case None =>
println(s"usage: ${this.productPrefix} <data_file>")
case Some(Failure(exc)) =>
println(s"Error reading data file: $exc")
case Some(Success(stateData)) =>
val menu = new Menu(stateData)
menu("menu")
Iterator.continually(menu(io.StdIn.readLine(">> ").toLowerCase))
.dropWhile(identity)
.next()
}
Note that this.productPrefix is made available if the surrounding object is a case object.

Chain Scala Futures when processing a Seq of objects?

import scala.concurrent.duration.Duration
import scala.concurrent.duration.Duration._
import scala.concurrent.{Await, Future}
import scala.concurrent.Future._
import scala.concurrent.ExecutionContext.Implicits.global
object TestClass {
final case class Record(id: String)
final case class RecordDetail(id: String)
final case class UploadResult(result: String)
val ids: Seq[String] = Seq("a", "b", "c", "d")
def fetch(id: String): Future[Option[Record]] = Future {
Thread sleep 100
if (id != "b" && id != "d") {
Some(Record(id))
} else None
}
def fetchRecordDetail(record: Record): Future[RecordDetail] = Future {
Thread sleep 100
RecordDetail(record.id + "_detail")
}
def upload(recordDetail: RecordDetail): Future[UploadResult] = Future {
Thread sleep 100
UploadResult(recordDetail.id + "_uploaded")
}
def notifyUploaded(results: Seq[UploadResult]): Unit = println("notified " + results)
def main(args: Array[String]): Unit = {
//for each id from ids, call fetch method and if record exists call fetchRecordDetail
//and after that upload RecordDetail, collect all UploadResults into seq
//and call notifyUploaded with that seq and await result, you should see "notified ...." in console
// In the following line of code how do I pass result of fetch to fetchRecordDetail function
val result = Future.traverse(ids)(x => Future(fetch(x)))
// val result: Future[Unit] = ???
Await.ready(result, Duration.Inf)
}
}
My problem is that I don't know what code to put in the main to make it work as written in the comments. To sum up, I have an ids:Seq[String] and I want each id to go through asynchronous methods fetch, fetchRecordDetail, upload, and finally the whole Seq to come to notifyUploaded.
I think that the simplest way to do it is :
def main(args: Array[String]): Unit = {
//for each id from ids, call fetch method and if record exists call fetchRecordDetail
//and after that upload RecordDetail, collect all UploadResults into seq
//and call notifyUploaded with that seq and await result, you should see "notified ...." in console
def runWithOption[A, B](f: A => Future[B], oa: Option[A]): Future[Option[B]] = oa match {
case Some(a) => f(a).map(b => Some(b))
case None => Future.successful(None)
}
val ids: Seq[String] = Seq("a", "b", "c", "d")
val resultSeq: Seq[Future[Option[UploadResult]]] = ids.map(id => {
for (or: Option[Record] <- fetch(id);
ord: Option[RecordDetail] <- runWithOption(fetchRecordDetail, or);
our: Option[UploadResult] <- runWithOption(upload, ord)
) yield our
})
val filteredResult: Future[Seq[UploadResult]] = Future.sequence(resultSeq).map(s => s.collect({ case Some(ur) => ur }))
val result: Future[Seq[UploadResult]] = filteredResult.andThen({ case Success(s) => notifyUploaded(s) })
Await.ready(result, Duration.Inf)
}
The idea is that you first get a Seq[Future[_]] that you map through all the methods (here it is done using for-comprehension). Here is an important trick is to actually pass Seq[Future[Option[_]]]. Passing Option[_] through the whole chain via runWithOption helper method simplifies code a lot without a need to block until the very last stage.
Then you convert Seq[Future[_]] into a Future[Seq[_]] and filter out results for those ids that failed at the fetch stage. And finally you apply notifyUploaded.
P.S. Note that there is no error handling in this code whatsoever and it is not clear how you expect it to behave in case of errors at different stages.

scala io Exception handling

I am trying to read a comma separated file in scala and convert that into list of Json object. The code is working if all the records are valid. How do i catch the exception for the records which are not valid in my function below. If a record is not valid it should throw and exception and continue reading the file. but in my case once an invalid record comes the application stops.
def parseFile(file: String): List[JsObject] = {
val bufferedSource = Source.fromFile(file)
try {
bufferedSource.getLines().map(line => {
val cols = line.split(",").map(_.trim)
(Json.obj("Name" -> cols(0), "Media" -> cols(1), "Gender" -> cols(2), "Age" -> cols(3).toInt)) // Need to handle io exception here.
}).toList
}
finally {
bufferedSource.close()
}
}
I think you may benefit from using Option (http://danielwestheide.com/blog/2012/12/19/the-neophytes-guide-to-scala-part-5-the-option-type.html) and the Try object (http://danielwestheide.com/blog/2012/12/26/the-neophytes-guide-to-scala-part-6-error-handling-with-try.html)
What you are doing here is stopping all work when an error happens (ie you throw to outside the map) a better option is to isolate the failure and return some object that we can filer out. Below is a quick implementation I made
package csv
import play.api.libs.json.{JsObject, Json}
import scala.io.Source
import scala.util.Try
object CsvParser extends App {
//Because a bad row can either != 4 columns or the age can not be a int we want to return an option that will be ignored by our caller
def toTuple(array : Array[String]): Option[(String, String, String, Int)] = {
array match {
//if our array has exactly 4 columns
case Array(name, media, gender, age) => Try((name, media, gender, age.toInt)).toOption
// any other array size will be ignored
case _ => None
}
}
def toJson(line: String): Option[JsObject] = {
val cols = line.split(",").map(_.trim)
toTuple(cols) match {
case Some((name: String, media: String, gender: String, age: Int)) => Some(Json.obj("Name" -> name, "Media" -> media, "Gender" -> gender, "Age" -> age))
case _ => None
}
}
def parseFile(file: String): List[JsObject] = {
val bufferedSource = Source.fromFile(file)
try { bufferedSource.getLines().map(toJson).toList.flatten } finally { bufferedSource.close() }
}
parseFile("my/csv/file/path")
}
The above code will ignore any rows where there not exactly 4 columns. It will also contain the NumberFormatException from .toInt.
The idea is to isolate the failure and pass back some type that the caller can either work with when the row was parsed...or ignore when a failure happened.

cache using functional callbacks/ proxy pattern implementation scala

How to implement cache using functional programming
A few days ago I came across callbacks and proxy pattern implementation using scala.
This code should only apply inner function if the value is not in the map.
But every time map is reinitialized and values are gone (which seems obivous.
How to use same cache again and again between different function calls
class Aggregator{
def memoize(function: Function[Int, Int] ):Function[Int,Int] = {
val cache = HashMap[Int, Int]()
(t:Int) => {
if (!cache.contains(t)) {
println("Evaluating..."+t)
val r = function.apply(t);
cache.put(t,r)
r
}
else
{
cache.get(t).get;
}
}
}
def memoizedDoubler = memoize( (key:Int) => {
println("Evaluating...")
key*2
})
}
object Aggregator {
def main( args: Array[String] ) {
val agg = new Aggregator()
agg.memoizedDoubler(2)
agg.memoizedDoubler(2)// It should not evaluate again but does
agg.memoizedDoubler(3)
agg.memoizedDoubler(3)// It should not evaluate again but does
}
I see what you're trying to do here, the reason it's not working is that every time you call memoizedDoubler it's first calling memorize. You need to declare memoizedDoubler as a val instead of def if you want it to only call memoize once.
val memoizedDoubler = memoize( (key:Int) => {
println("Evaluating...")
key*2
})
This answer has a good explanation on the difference between def and val. https://stackoverflow.com/a/12856386/37309
Aren't you declaring a new Map per invocation ?
def memoize(function: Function[Int, Int] ):Function[Int,Int] = {
val cache = HashMap[Int, Int]()
rather than specifying one per instance of Aggregator ?
e.g.
class Aggregator{
private val cache = HashMap[Int, Int]()
def memoize(function: Function[Int, Int] ):Function[Int,Int] = {
To answer your question:
How to implement cache using functional programming
In functional programming there is no concept of mutable state. If you want to change something (like cache), you need to return updated cache instance along with the result and use it for the next call.
Here is modification of your code that follows that approach. function to calculate values and cache is incorporated into Aggregator. When memoize is called, it returns tuple, that contains calculation result (possibly taken from cache) and new Aggregator that should be used for the next call.
class Aggregator(function: Function[Int, Int], cache:Map[Int, Int] = Map.empty) {
def memoize:Int => (Int, Aggregator) = {
t:Int =>
cache.get(t).map {
res =>
(res, Aggregator.this)
}.getOrElse {
val res = function(t)
(res, new Aggregator(function, cache + (t -> res)))
}
}
}
object Aggregator {
def memoizedDoubler = new Aggregator((key:Int) => {
println("Evaluating..." + key)
key*2
})
def main(args: Array[String]) {
val (res, doubler1) = memoizedDoubler.memoize(2)
val (res1, doubler2) = doubler1.memoize(2)
val (res2, doubler3) = doubler2.memoize(3)
val (res3, doubler4) = doubler3.memoize(3)
}
}
This prints:
Evaluating...2
Evaluating...3

How do you write a json4s CustomSerializer that handles collections

I have a class that I am trying to deserialize using the json4s CustomSerializer functionality. I need to do this due to the inability of json4s to deserialize mutable collections.
This is the basic structure of the class I want to deserialize (don't worry about why the class is structured like this):
case class FeatureValue(timestamp:Double)
object FeatureValue{
implicit def ordering[F <: FeatureValue] = new Ordering[F] {
override def compare(a: F, b: F): Int = {
a.timestamp.compareTo(b.timestamp)
}
}
}
class Point {
val features = new HashMap[String, SortedSet[FeatureValue]]
def add(name:String, value:FeatureValue):Unit = {
val oldValue:SortedSet[FeatureValue] = features.getOrElseUpdate(name, SortedSet[FeatureValue]())
oldValue += value
}
}
Json4s serializes this just fine. A serialized instance might look like the following:
{"features":
{
"CODE0":[{"timestamp":4.8828914447482E8}],
"CODE1":[{"timestamp":4.8828914541333E8}],
"CODE2":[{"timestamp":4.8828915127325E8},{"timestamp":4.8828910097466E8}]
}
}
I've tried writing a custom deserializer, but I don't know how to deal with the list tails. In a normal matcher you can just call your own function recursively, but in this case the function is anonymous and being called through the json4s API. I cannot find any examples that deal with this and I can't figure it out.
Currently I can match only a single hash key, and a single FeatureValue instance in its value. Here is the CustomSerializer as it stands:
import org.json4s.{FieldSerializer, DefaultFormats, Extraction, CustomSerializer}
import org.json4s.JsonAST._
class PointSerializer extends CustomSerializer[Point](format => (
{
case JObject(JField("features", JObject(Nil)) :: Nil) => new Point
case JObject(List(("features", JObject(List(
(feature:String, JArray(List(JObject(List(("timestamp",JDouble(ts)))))))))
))) => {
val point = new Point
point.add(feature, FeatureValue(ts))
point
}
},
{
// don't need to customize this, it works fine
case x: Point => Extraction.decompose(x)(DefaultFormats + FieldSerializer[Point]())
}
))
If I try to change to using the :: separated list format, so far I have gotten compiler errors. Even if I didn't get compiler errors, I am not sure what I would do with them.
You can get the list of json features in your pattern match and then map over this list to get the Features and their codes.
class PointSerializer extends CustomSerializer[Point](format => (
{
case JObject(List(("features", JObject(featuresJson)))) =>
val features = featuresJson.flatMap {
case (code:String, JArray(timestamps)) =>
timestamps.map { case JObject(List(("timestamp",JDouble(ts)))) =>
code -> FeatureValue(ts)
}
}
val point = new Point
features.foreach((point.add _).tupled)
point
}, {
case x: Point => Extraction.decompose(x)(DefaultFormats + FieldSerializer[Point]())
}
))
Which deserializes your json as follows :
import org.json4s.native.Serialization.{read, write}
implicit val formats = Serialization.formats(NoTypeHints) + new PointSerializer
val json = """
{"features":
{
"CODE0":[{"timestamp":4.8828914447482E8}],
"CODE1":[{"timestamp":4.8828914541333E8}],
"CODE2":[{"timestamp":4.8828915127325E8},{"timestamp":4.8828910097466E8}]
}
}
"""
val point0 = read[Point]("""{"features": {}}""")
val point1 = read[Point](json)
point0.features // Map()
point1.features
// Map(
// CODE0 -> TreeSet(FeatureValue(4.8828914447482E8)),
// CODE2 -> TreeSet(FeatureValue(4.8828910097466E8), FeatureValue(4.8828915127325E8)),
// CODE1 -> TreeSet(FeatureValue(4.8828914541333E8))
// )