Reading a file path from a property and then reading the file in idiomatic Scala

I want to read the path of a file from configuration and then read the file in an idiomatic Scala way. This is the code I have so far:
val key: Option[String] = {
val publicKeyPath: Option[String] = conf.getString("bestnet.publicKeyFile")
publicKeyPath match {
case Some(path) => {
Future {
val source = fromFile(s"./$path")
val key: String = source.getLines.toIterable.drop(1).dropRight(1).mkString
source.close()
key
} onComplete {
case Success(key) => Success(key)
case Failure(t) => None
}
}
case None => None
}
}
However, this is not working: I'm getting the error "Expression of type Unit does not conform to Option[String]".
What am I getting wrong, and is my approach idiomatic Scala or should it be done in some other way?

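The error occurs because onComplete only registers a callback and returns Unit, so the Some branch never yields an Option[String]. If you want to return the contents as a String, there is no need to use a Future; the following would do: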
val key: Option[String] = {
val publicKeyPath: Option[String] = conf.getString("bestnet.publicKeyFile")
publicKeyPath match {
case Some(path) =>
val source = fromFile(s"./$path")
val key: String = source.getLines.toIterable.drop(1).dropRight(1).mkString
source.close()
Some(key)
case None =>
None
}
}
The pattern of transforming the value inside a Some(_) can be expressed more idiomatically with the higher-order function map, i.e.:
val key: Option[String] = {
val publicKeyPath = conf.getString("bestnet.publicKeyFile")
publicKeyPath.map(path => {
val source = fromFile(s"./$path")
val key = source.getLines.toIterable.drop(1).dropRight(1).mkString
source.close()
key
})
}
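A more idiomatic way to do resource management (i.e. closing the Source) is the "Loan Pattern". For example:
// Loan pattern: lend the resource to f and always close it afterwards.
// scala.io.Source implements java.io.Closeable, so the AutoCloseable bound covers it.
def using[R <: AutoCloseable, A](r: R)(f: R => A): A = try {
f(r)
} finally {
r.close()
}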
val key: Option[String] = {
val publicKeyPath = conf.getString("bestnet.publicKeyFile")
publicKeyPath.map(path =>
using(fromFile(s"./$path"))(source =>
source.getLines.toIterable.drop(1).dropRight(1).mkString
)
)
}
Scala is a flexible language, and it is not uncommon to define such an abstraction in user-land (whereas Java provides it as a language feature, try-with-resources). Since Scala 2.13 the standard library also ships a similar helper, scala.util.Using.
If you need non-blocking parallel code you should return a Future[String] instead of an Option[String]. This complicates the automatic resource management, since code is executed at a different time. Anyway, this should give you some pointers for improving your code.
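As a rough sketch of the non-blocking variant mentioned above, the file can be opened and closed entirely inside the Future, which keeps the loan pattern intact. The names AsyncKeyReader, readKey and readKeyAsync are made up for illustration, and the result type Future[Option[String]] (rather than Future[String]) is an assumption that keeps the "no path configured" case explicit.

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.io.Source

object AsyncKeyReader {
  // Loan pattern: the resource is always closed, even if `f` throws.
  def using[R <: AutoCloseable, A](resource: R)(f: R => A): A =
    try f(resource) finally resource.close()

  // Drops the first and last line (e.g. a header and footer) and joins the rest.
  // The lines are materialised with toList before the source is closed.
  def readKey(path: String): String =
    using(Source.fromFile(path))(_.getLines().toList.drop(1).dropRight(1).mkString)

  // Hypothetical async wrapper: the file is opened and closed inside the Future,
  // so resource handling never leaks across threads.
  def readKeyAsync(maybePath: Option[String])(
      implicit ec: ExecutionContext): Future[Option[String]] =
    maybePath match {
      case Some(path) => Future(Some(readKey(path)))
      case None       => Future.successful(None)
    }
}
```

A failed read (for example a missing file) surfaces as a failed Future, so the caller decides whether to recover or propagate it.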

Related

Use higher-order functions to make Scala code more concise

I'm new to Scala and trying to write some programs to get better at it. I wrote a flow (version 1) that is very Java-like and I'm trying to write it using higher order functions (version 2).
version 1:
val entry: Option[Int] = getEntry()
if (entry.isDefined) {
val cachedEntry = entry.get
if (cachedEntry.state.isActive) {
return cachedEntry
} else {
Cache.invalidateCachedEntry(cachedEntry)
}
}
Cache.createNewEntry()
version 2:
val entry: Option[Int] = getEntry()
entry.filter(_.state.isActive).orElse((() => {
Cache.invalidateCachedEntry _
Option(Cache.createNewEntry())
})()).get
I'm not sure if this is the correct approach or there is a better way of doing this?
Let's consider the following scenario:
case class Entry(state: AnyState)
case class AnyState(isActive: Boolean = true)
object Cache {
def invalidateCachedEntry(entry: Entry): Unit = println("cleaned")
}
def getEntry: Option[Entry] = Some(Entry(AnyState()))
val optEntry: Option[Entry] = getEntry
val result: Option[Entry] = optEntry match {
case Some(entry) if entry.state.isActive =>
entry // do something
println("did something")
Some(entry)
case Some(entry) =>
Cache.invalidateCachedEntry(entry)
None
case _ =>
println("Entry not found")
None
}
This is one possible scenario. In general you should return something, but sometimes you don't have enough information to produce a value. In such cases you can return an Option, and if you want to report an error you can use Either.
I prefer using match for clarity:
getEntry() match {
case Some(entry) if entry.state.isActive => entry
case opt => opt.foreach(Cache.invalidateCachedEntry); Cache.createNewEntry()
}
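If you prefer combinators over match, the same flow can be sketched with filter and getOrElse. This is only an illustration: Entry, State and the in-memory Cache below are hypothetical stand-ins for the types in the question.

```scala
object CacheLookup {
  final case class State(isActive: Boolean)
  final case class Entry(id: Int, state: State)

  // Hypothetical in-memory stand-in for the Cache object in the question.
  object Cache {
    var invalidated: List[Entry] = Nil
    def invalidateCachedEntry(entry: Entry): Unit = invalidated ::= entry
    def createNewEntry(): Entry = Entry(id = 0, state = State(isActive = true))
  }

  // Keep an active cached entry; otherwise invalidate whatever was cached
  // (if anything) and create a fresh entry.
  def resolve(cached: Option[Entry]): Entry =
    cached.filter(_.state.isActive).getOrElse {
      cached.foreach(Cache.invalidateCachedEntry)
      Cache.createNewEntry()
    }
}
```

Unlike version 2 in the question, this actually calls invalidateCachedEntry on the stale entry and never needs .get.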

Working with options in Scala (best practices)

I have a method that I wrote to enrich person data by performing an API call and adding the enriched data.
I have this case class:
case class Person(personData: PersonData, dataEnrichment: Option[DataEnrichment])
My method is supposed to return this case class, but I have a few filters first: if the person's height is not "1.8 m", or if no personId is found in the bio using the regex, I want to return a Person with dataEnrichment = None. My issue is that the person's height and personId are Options themselves, so it looks like this:
def enrichPersonObjWithApiCall(person: Person) = {
person.personData.height.map(_.equals("1.8 m")) match {
case Some(true) =>
val personId = person.personData.bio flatMap { comment =>
extractPersonIdIfExists(comment)
}
personId match {
case Some(perId) =>
apiCall(perId) map { apiRes =>
Person(
person.personData,
dataEnrichment = apiRes)
}
case _ =>
Future successful Person(
person.personData,
dataEnrichment = None)
}
case _ =>
Future successful Person(
person.personData,
dataEnrichment = None)
}
}
def extractPersonIdIfExists(personBio: String): Option[String] = {
val personIdRegex: Regex = """(?<=PersonId:)[^;]+""".r
personIdRegex.findFirstIn(personBio)
}
def apiCall(personId: String): Future[Option[DataEnrichment]] = {
???
}
case class DataEnrichment(res: Option[String])
case class PersonData(name: String, height: Option[String], bio: Option[String])
It doesn't seem to be a Scala best practice to perform it like that. Do you have a more elegant way to get to the same result?
Using for is a good way to process a chain of Option values:
def enrichPersonObjWithApiCall(person: Person): Future[Person] =
(
for {
height <- person.personData.height if height == "1.8 m"
comment <- person.personData.bio
perId <- extractPersonIdIfExists(comment)
} yield {
apiCall(perId).map(Person(person.personData, _))
}
).getOrElse(Future.successful(Person(person.personData, None)))
This is equivalent to a chain of map, flatMap and filter calls, but much easier to read.
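To make the equivalence concrete, here is a small self-contained sketch that writes the same Option chain both ways. It leaves out the Future and the API call; OptionChain, withFor and withCombinators are illustrative names, and extractId mirrors extractPersonIdIfExists from the question.

```scala
object OptionChain {
  // Same regex as in the question: the text after "PersonId:" up to the next ';'.
  def extractId(bio: String): Option[String] =
    """(?<=PersonId:)[^;]+""".r.findFirstIn(bio)

  // for comprehension with a guard
  def withFor(height: Option[String], bio: Option[String]): Option[String] =
    for {
      h  <- height if h == "1.8 m"
      b  <- bio
      id <- extractId(b)
    } yield id

  // Roughly what the compiler generates for the comprehension above:
  // the guard becomes withFilter, inner generators become flatMap,
  // and the final generator plus yield becomes map.
  def withCombinators(height: Option[String], bio: Option[String]): Option[String] =
    height
      .withFilter(h => h == "1.8 m")
      .flatMap(h => bio.flatMap(b => extractId(b).map(id => id)))
}
```

Both functions return None as soon as any step in the chain is empty or the guard fails.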
Here, I tried to make it more idiomatic and shorter:
def enrichPersonObjWithApiCall(person: Person) = {
person.personData.height.collect {
case h if h == "1.8 m" =>
val personId = person.personData.bio.flatMap(extractPersonIdIfExists)
personId.map(
apiCall(_)
.map(apiRes => person.copy(dataEnrichment = apiRes))
)
}.flatten.getOrElse(
Future.successful(person.copy(dataEnrichment = None))
)
}
Basically, the idea is to use appropriate monadic chains of map, flatMap, collect instead of pattern matching when appropriate.
Same idea as Aivean's answer, except that I would use map, flatMap and filter.
def enrichPersonObjWithApiCall(person: Person) = {
person.personData.height
.filter(_ == "1.8 m")
.flatMap{_=>
val personId = person.personData.bio
.flatMap(extractPersonIdIfExists)
personId.map(
apiCall(_)
.map(apiRes => person.copy(dataEnrichment = apiRes))
)
}.getOrElse(Future.successful(person.copy(dataEnrichment = None)))
}
It's more readable for me.

Cache using functional callbacks / proxy pattern implementation in Scala

How do I implement a cache using functional programming?
A few days ago I came across callbacks and a proxy pattern implementation using Scala.
This code should only apply the inner function if the value is not in the map.
But every time the map is reinitialized and the values are gone (which seems obvious).
How can I use the same cache again and again between different function calls?
class Aggregator{
def memoize(function: Function[Int, Int] ):Function[Int,Int] = {
val cache = HashMap[Int, Int]()
(t:Int) => {
if (!cache.contains(t)) {
println("Evaluating..."+t)
val r = function.apply(t);
cache.put(t,r)
r
}
else
{
cache.get(t).get;
}
}
}
def memoizedDoubler = memoize( (key:Int) => {
println("Evaluating...")
key*2
})
}
object Aggregator {
def main( args: Array[String] ) {
val agg = new Aggregator()
agg.memoizedDoubler(2)
agg.memoizedDoubler(2)// It should not evaluate again but does
agg.memoizedDoubler(3)
agg.memoizedDoubler(3)// It should not evaluate again but does
}
I see what you're trying to do here. The reason it's not working is that every time you call memoizedDoubler, it first calls memoize, which creates a fresh cache. You need to declare memoizedDoubler as a val instead of a def if you want memoize to be called only once.
val memoizedDoubler = memoize( (key:Int) => {
println("Evaluating...")
key*2
})
This answer has a good explanation on the difference between def and val. https://stackoverflow.com/a/12856386/37309
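The difference can be seen in a minimal sketch that counts how often the wrapped function really runs. MemoDemo, evaluations, doublerDef and doublerVal are names invented for this illustration; the memoize helper follows the same idea as the one in the question.

```scala
import scala.collection.mutable

object MemoDemo {
  // Counts how many times the underlying function is actually evaluated.
  var evaluations = 0

  // Each call to memoize creates its own private cache.
  def memoize(f: Int => Int): Int => Int = {
    val cache = mutable.HashMap.empty[Int, Int]
    key => cache.getOrElseUpdate(key, f(key))
  }

  private def double(key: Int): Int = { evaluations += 1; key * 2 }

  // def: memoize runs again on every access, so every call starts with an empty cache.
  def doublerDef: Int => Int = memoize(double)

  // val: memoize runs once, so all calls share one cache.
  val doublerVal: Int => Int = memoize(double)
}
```

Calling MemoDemo.doublerVal(2) twice evaluates the function once, while calling MemoDemo.doublerDef(2) twice evaluates it twice.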
Aren't you declaring a new Map per invocation?
def memoize(function: Function[Int, Int] ):Function[Int,Int] = {
val cache = HashMap[Int, Int]()
rather than specifying one per instance of Aggregator?
e.g.
class Aggregator{
private val cache = HashMap[Int, Int]()
def memoize(function: Function[Int, Int] ):Function[Int,Int] = {
To answer your question:
How to implement cache using functional programming
In pure functional programming you avoid mutable state. If you want to change something (like the cache), you need to return the updated cache instance along with the result and use it for the next call.
Here is a modification of your code that follows that approach. The function used to calculate values and the cache are both incorporated into Aggregator. When memoize is called, it returns a tuple containing the calculation result (possibly taken from the cache) and the Aggregator that should be used for the next call.
class Aggregator(function: Function[Int, Int], cache:Map[Int, Int] = Map.empty) {
def memoize:Int => (Int, Aggregator) = {
t:Int =>
cache.get(t).map {
res =>
(res, Aggregator.this)
}.getOrElse {
val res = function(t)
(res, new Aggregator(function, cache + (t -> res)))
}
}
}
object Aggregator {
def memoizedDoubler = new Aggregator((key:Int) => {
println("Evaluating..." + key)
key*2
})
def main(args: Array[String]) {
val (res, doubler1) = memoizedDoubler.memoize(2)
val (res1, doubler2) = doubler1.memoize(2)
val (res2, doubler3) = doubler2.memoize(3)
val (res3, doubler4) = doubler3.memoize(3)
}
}
This prints:
Evaluating...2
Evaluating...3

How do you write a json4s CustomSerializer that handles collections

I have a class that I am trying to deserialize using the json4s CustomSerializer functionality. I need to do this due to the inability of json4s to deserialize mutable collections.
This is the basic structure of the class I want to deserialize (don't worry about why the class is structured like this):
case class FeatureValue(timestamp:Double)
object FeatureValue{
implicit def ordering[F <: FeatureValue] = new Ordering[F] {
override def compare(a: F, b: F): Int = {
a.timestamp.compareTo(b.timestamp)
}
}
}
class Point {
val features = new HashMap[String, SortedSet[FeatureValue]]
def add(name:String, value:FeatureValue):Unit = {
val oldValue:SortedSet[FeatureValue] = features.getOrElseUpdate(name, SortedSet[FeatureValue]())
oldValue += value
}
}
Json4s serializes this just fine. A serialized instance might look like the following:
{"features":
{
"CODE0":[{"timestamp":4.8828914447482E8}],
"CODE1":[{"timestamp":4.8828914541333E8}],
"CODE2":[{"timestamp":4.8828915127325E8},{"timestamp":4.8828910097466E8}]
}
}
I've tried writing a custom deserializer, but I don't know how to deal with the list tails. In a normal matcher you can just call your own function recursively, but in this case the function is anonymous and being called through the json4s API. I cannot find any examples that deal with this and I can't figure it out.
Currently I can match only a single hash key, and a single FeatureValue instance in its value. Here is the CustomSerializer as it stands:
import org.json4s.{FieldSerializer, DefaultFormats, Extraction, CustomSerializer}
import org.json4s.JsonAST._
class PointSerializer extends CustomSerializer[Point](format => (
{
case JObject(JField("features", JObject(Nil)) :: Nil) => new Point
case JObject(List(("features", JObject(List(
(feature:String, JArray(List(JObject(List(("timestamp",JDouble(ts)))))))))
))) => {
val point = new Point
point.add(feature, FeatureValue(ts))
point
}
},
{
// don't need to customize this, it works fine
case x: Point => Extraction.decompose(x)(DefaultFormats + FieldSerializer[Point]())
}
))
If I try to change to using the :: separated list format, so far I have gotten compiler errors. Even if I didn't get compiler errors, I am not sure what I would do with them.
You can get the list of json features in your pattern match and then map over this list to get the Features and their codes.
class PointSerializer extends CustomSerializer[Point](format => (
{
case JObject(List(("features", JObject(featuresJson)))) =>
val features = featuresJson.flatMap {
case (code:String, JArray(timestamps)) =>
timestamps.map { case JObject(List(("timestamp",JDouble(ts)))) =>
code -> FeatureValue(ts)
}
}
val point = new Point
features.foreach((point.add _).tupled)
point
}, {
case x: Point => Extraction.decompose(x)(DefaultFormats + FieldSerializer[Point]())
}
))
This deserializes your JSON as follows:
import org.json4s.NoTypeHints
import org.json4s.native.Serialization
import org.json4s.native.Serialization.{read, write}
implicit val formats = Serialization.formats(NoTypeHints) + new PointSerializer
val json = """
{"features":
{
"CODE0":[{"timestamp":4.8828914447482E8}],
"CODE1":[{"timestamp":4.8828914541333E8}],
"CODE2":[{"timestamp":4.8828915127325E8},{"timestamp":4.8828910097466E8}]
}
}
"""
val point0 = read[Point]("""{"features": {}}""")
val point1 = read[Point](json)
point0.features // Map()
point1.features
// Map(
// CODE0 -> TreeSet(FeatureValue(4.8828914447482E8)),
// CODE2 -> TreeSet(FeatureValue(4.8828910097466E8), FeatureValue(4.8828915127325E8)),
// CODE1 -> TreeSet(FeatureValue(4.8828914541333E8))
// )

Scala pattern matching on generic Map

What's the best way to handle generics and erasure when doing pattern matching in Scala (on a Map in my case)? I am looking for a proper implementation without compiler warnings. I have a function that I want to return a Map[Int, Seq[String]] from. Currently the code looks like:
def teams: Map[Int, Seq[String]] = {
val dateam = new scala.collection.mutable.HashMap[Int, Seq[String]]
// data.attributes is Map[String, Object] returned from JSON parsing (jackson-module-scala)
val teamz = data.attributes.get("team_players")
if (teamz.isDefined) {
val x = teamz.get
try {
x match {
case m: mutable.Map[_, _] => {
m.foreach( kv => {
kv._1 match {
case teamId: String => {
kv._2 match {
case team: Seq[_] => {
val tid: Int = teamId.toInt
dateam.put(tid, team.map(s => s.toString))
}
}
}
}
})
}
}
} catch {
case e: Exception => {
logger.error("Unable to convert the team_players (%s) attribute.".format(x), e)
}
}
dateam
} else {
logger.warn("Missing team_players attribute in: %s".format(data.attributes))
}
dateam.toMap
}
Use a Scala library to handle it. There are some based on Jackson (Play's ScalaJson, for instance -- see this article on using it stand-alone), as well as libraries not based on Jackson (of which my preferred is Argonaut, though you could also go with Spray-Json).
These libraries, and others, solve this problem. Doing it by hand is awkward and prone to errors, so don't do it.
It could be reasonable to use a for comprehension (with some built-in pattern matching). We can also take into account that a Map can be traversed as a collection of tuples, in our case of type (String, Object). For this example we will ignore possible exceptions, so:
import scala.collection.mutable.HashMap
def convert(json: Map[String, Object]): HashMap[Int, Seq[String]] = {
val converted = for {
(id: String, description: Seq[Any]) <- json
} yield (id.toInt, description.map(_.toString))
HashMap[Int, Seq[String]](converted.toSeq: _*)
}
So the for comprehension takes into account only tuples of type (String, Seq[Any]), converts each String key to an Int and each Seq[Any] to a Seq[String], and finally collects the result into a mutable HashMap.
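A variation on the same idea, sketched with collect and a wildcard type pattern, is shown below. It assumes the parsed attributes arrive as a Map[String, Any]; TeamConversion and convert are illustrative names. Matching on Seq[_] instead of a concrete element type should avoid unchecked (erasure) warnings, and non-numeric keys are skipped rather than throwing.

```scala
import scala.util.Try

object TeamConversion {
  // Keeps only entries whose key parses as an Int and whose value is a Seq;
  // everything else is silently dropped.
  def convert(attributes: Map[String, Any]): Map[Int, Seq[String]] =
    attributes.collect {
      case (id, players: Seq[_]) if Try(id.toInt).isSuccess =>
        id.toInt -> players.map(_.toString)
    }
}
```

Returning an immutable Map matches the signature of teams in the question, so no final toMap conversion is needed.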