Scala: parsing an API parameter - scala

My API currently takes an optional parameter named gamedate. It is passed in as a string, which I later parse into a DateTime object using some utility code. The code looks like this:
val gdate:Option[String] = params.get("gamedate")
val res = gdate match {
case Some(s) => {
val date:Option[DateTime] = gdate map { MyDateTime.parseDate _ }
val dateOrDefault:DateTime = date.getOrElse((new DateTime).withTime(0, 0, 0, 0))
NBAScoreboard.findByDate(dateOrDefault)
}
case None => NBAScoreboard.getToday
}
This works just fine. Now I want to allow multiple game dates to be passed in as a comma-delimited list. Originally you could pass a parameter like this:
gamedate=20131211
now I want to allow that OR:
gamedate=20131211,20131212
That requires modifying the code above to split the comma-delimited string, parse each value into a DateTime, and change the findByDate interface to accept a Seq[DateTime] instead of a single DateTime. I tried something like this, but apparently it's not the way to go about it:
val res = gdates match {
case Some(s) => {
val dates:Option[Seq[DateTime]] = gdates map { _.split(",").distinct.map(MyDateTime.parseDate _ )}
val datesOrDefault:Seq[DateTime] = dates map { _.getOrElse((new DateTime).withTime(0, 0, 0, 0))}
NBAScoreboard.findByDates(datesOrDefault)
}
case None => NBAScoreboard.getToday
}
What's the best way to convert my first set of code to handle this use case? I'm probably fairly close in the second code example I provided, but I'm just not hitting it right.

You mixed up the containers. The map you call on dates unpacks the Option, so inside it getOrElse is applied to the sequence of dates rather than to the Option. Call getOrElse on the Option itself:
val res = gdates match {
case Some(s) =>
val dates = gdates.map(_.split(",").distinct.map(MyDateTime.parseDate _ ))
val datesOrDefault = dates.getOrElse(Array((new DateTime).withTime(0, 0, 0, 0)))
NBAScoreboard.findByDates(datesOrDefault)
case _ =>
NBAScoreboard.getToday
}
This should work.
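The same idea can be sketched without the outer match on the Option, since splitting and defaulting can be done in one place. This is only an illustration under assumptions: java.time.LocalDate stands in for the Joda DateTime used above, and parseDate and datesOrDefault are hypothetical names, not part of MyDateTime or NBAScoreboard.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

object GameDates {
  // Hypothetical stand-in for MyDateTime.parseDate; BASIC_ISO_DATE reads "yyyyMMdd".
  def parseDate(s: String): LocalDate =
    LocalDate.parse(s.trim, DateTimeFormatter.BASIC_ISO_DATE)

  // A present parameter is split, de-duplicated and parsed;
  // an absent parameter falls back to the supplied default date.
  def datesOrDefault(param: Option[String], default: LocalDate): Seq[LocalDate] =
    param match {
      case Some(s) => s.split(",").distinct.map(parseDate).toSeq
      case None    => Seq(default)
    }
}
```

Note that parseDate throws on malformed input here, just as a throwing parser would in the code above; wrapping it in Try is one way to handle that.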

Related

Handling and throwing Exceptions in Scala

I have the following implementation:
val dateFormats = Seq("dd/MM/yyyy", "dd.MM.yyyy")
implicit def dateTimeCSVConverter: CsvFieldReader[DateTime] = (s: String) => Try {
val elem = dateFormats.map {
format =>
try {
Some(DateTimeFormat.forPattern(format).parseDateTime(s))
} catch {
case _: IllegalArgumentException =>
None
}
}.collectFirst {
case e if e.isDefined => e.get
}
if (elem.isDefined)
elem.get
else
throw new IllegalArgumentException(s"Unable to parse DateTime $s")
}
So basically, I iterate over my Seq and try to parse the DateTime with each format. I then take the first one that succeeds; if none does, I throw the exception back.
I'm not completely satisfied with the code. Is there a simpler way to write it? I need the exception message passed on to the caller.
The one problem with your code is that it tries all patterns even if the date has already been parsed. You could use a lazy collection, like Stream, to solve this problem:
def dateTimeCSVConverter(s: String) = Stream("dd/MM/yyyy", "dd.MM.yyyy")
.map(format => Try(DateTimeFormat.forPattern(format).parseDateTime(s)))
.dropWhile(_.isFailure)
.headOption
Even better is the solution proposed by jwvh with find (you don't have to call headOption):
def dateTimeCSVConverter(s: String) = Stream("dd/MM/yyyy", "dd.MM.yyyy")
.map(format => Try(DateTimeFormat.forPattern(format).parseDateTime(s)))
.find(_.isSuccess)
It returns None if none of the patterns matched. If you want to throw an exception in that case, you can unwrap the Option with getOrElse (the result is then a Try[DateTime]):
...
.dropWhile(_.isFailure)
.headOption
.getOrElse(throw new IllegalArgumentException(s"Unable to parse DateTime $s"))
The important thing is that as soon as one pattern succeeds, it won't go any further but will return the parsed date right away.
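A self-contained sketch of this lazy first-success approach is shown here. It swaps in two things that are not in the answers above: java.time instead of Joda's DateTimeFormat, so it runs without extra dependencies, and an Iterator instead of Stream as the lazy collection. The object and method names are hypothetical. The failure case is returned as a Failure carrying the message, so the caller still sees why parsing failed.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import scala.util.{Failure, Try}

object FirstMatchingFormat {
  private val patterns = Seq("dd/MM/yyyy", "dd.MM.yyyy")

  // iterator.map is lazy, so find stops at the first pattern that parses;
  // later patterns are never tried once one succeeds.
  def parse(s: String): Try[LocalDate] =
    patterns.iterator
      .map(p => Try(LocalDate.parse(s, DateTimeFormatter.ofPattern(p))))
      .find(_.isSuccess)
      .getOrElse(Failure(new IllegalArgumentException(s"Unable to parse DateTime $s")))
}
```

Calling `.get` on the result rethrows the stored exception if a thrown error is preferred over a Try.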
This is a possible solution that iterates through all the options
val dateFormats = Seq("dd/MM/yyyy", "dd.MM.yyyy")
val dates = Vector("01/01/2019", "01.01.2019", "01-01-2019")
dates.foreach(s => {
val d: Option[Try[DateTime]] = dateFormats
.map(format => Try(DateTimeFormat.forPattern(format).parseDateTime(s)))
.filter(_.isSuccess)
.headOption
d match {
case Some(d) => println(d.toString)
case _ => throw new IllegalArgumentException("foo")
}
})
This is an alternative solution that returns the first successful conversion, if any
val dateFormats = Seq("dd/MM/yyyy", "dd.MM.yyyy")
val dates = Vector("01/01/2019", "01.01.2019", "01-01-2019")
dates.foreach(s => {
dateFormats.find(format => Try(DateTimeFormat.forPattern(format).parseDateTime(s)).isSuccess) match {
case Some(format) => println(DateTimeFormat.forPattern(format).parseDateTime(s))
case _ => throw new IllegalArgumentException("foo")
}
})
I made it sweet like this now! I like this a lot better! Use this if you want to collect all the successes and all the failures. Note that this is somewhat inefficient if you want to stop as soon as you find one success!
implicit def dateTimeCSVConverter: CsvFieldReader[DateTime] = (s: String) => Try {
val (successes, failures) = dateFormats.map {
case format => Try(DateTimeFormat.forPattern(format).parseDateTime(s))
}.partition(_.isSuccess)
if (successes.nonEmpty)
successes.head.get
else
failures.head.get
}

Pattern matching against URL with question mark in Scala

I am trying to extract a few values from a URL that contains a question mark.
However, the code below doesn't work. Could you please help me figure out what went wrong?
val LibraryPattern = ".*/library/([A-Za-z0-9\\-]+)?book=([A-Za-z0-9\\-]+)".r
val url = "https://bookscollection.com/library/mylib?book=abc"
Try(new URL(url)) match {
case Success(url) =>
println("my url:"+url)
url.getPath match {
case LibraryPattern(libId, bookId) =>
println(libId)
println(bookId)
case _ =>
}
}
As a few answers have already pointed out how to fix the code example, I want to suggest another solution. Parsing URLs with regexes may hurt the future readability, type safety and flexibility of your codebase.
I suggest using the scala-uri library or something similar.
With this library, URL parsing can be as simple as:
import io.lemonlabs.uri.Url
val url = Url.parse("https://bookscollection.com/library/mylib?book=abc")
val lastPathPart = url.path.parts.last
// println(lastPathPart)
// res: String = "mylib"
val bookParam: Option[String] = url.query.param("book")
// println(bookParam)
// res: Option[String] = Some("abc")
The URL object has already parsed the URL for you. getPath returns everything before the ?, use getQuery to obtain the part after the ?:
val LibraryPattern = ".*/library/([A-Za-z0-9\\-]+)".r
val BookPattern = "book=([A-Za-z0-9\\-]+)".r
val url = "https://bookscollection.com/library/mylib?book=abc"
Try(new URL(url)) match {
case Success(url) =>
url.getPath match {
case LibraryPattern(libId) =>
url.getQuery match {
case BookPattern(bookId) =>
println(libId)
println(bookId)
}
}
}
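The path/query split described above can also be written as a single match on both parts, which avoids the nested, non-exhaustive matches. This is a minimal sketch: the LibraryUrl object and its extract method are hypothetical names, and it returns None for anything that does not fit instead of throwing a MatchError.

```scala
import java.net.URL
import scala.util.Try

object LibraryUrl {
  private val LibraryPattern = ".*/library/([A-Za-z0-9\\-]+)".r
  private val BookPattern = "book=([A-Za-z0-9\\-]+)".r

  // getPath yields "/library/mylib", getQuery yields "book=abc" (or null if absent).
  def extract(raw: String): Option[(String, String)] =
    Try(new URL(raw)).toOption.flatMap { url =>
      (url.getPath, Option(url.getQuery).getOrElse("")) match {
        case (LibraryPattern(libId), BookPattern(bookId)) => Some((libId, bookId))
        case _                                            => None
      }
    }
}
```

Because a Regex used as an extractor must match the whole string, a query with additional parameters would not match BookPattern as written.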
? is a special character in Regex (it essentially makes the previous character/group optional). You'll need to escape it.
EDIT: url.getPath only returns /library/mylib, so you shouldn't use this if you want your Regex to match.
val LibraryPattern = ".*/library/([A-Za-z0-9\\-]+)\\?book=([A-Za-z0-9\\-]+)".r
val url = "https://bookscollection.com/library/mylib?book=abc"
Try(new URL(url)) match {
case Success(url) =>
println("my url:"+url)
url.toString match {
case LibraryPattern(libId, bookId) =>
println(libId)
println(bookId)
case _ =>
}
}

Scala - How to safely operate on a map element

I want to get an element from a mutable map and perform an operation on it.
For example, I want to change its name value (so that the map holds the element with the new value)
and return the updated element at the end.
To start, I wrote working code, but it is very Java-like:
var newAppKey: AppKey = null
val appKey = myMap(request.appKeyId)
if (appKey != null) {
newAppKey = appKey.copy(name = request.appKeyName)
myMap.put(appKey.name, newAppKey)
newAppKey
} else {
newAppKey = null
}
This code works, but it is very Java-like.
I thought about something like
val newAppKey = appIdToApp(request.appKeyId) match {
case: Some(appKey) => appKey.copy(name = request.appKeyName)
case: None => None{AppKey}
}
This neither compiles nor updates the myMap object with the new value.
How do I improve it to follow Scala idioms?
Simply:
val key = request.appKeyId
val newValueOpt = myMap.get(key).map(_.copy(name = request.appKeyName))
newValueOpt.foreach(myMap.update(key, _))
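For illustration, the three lines above can be wrapped into a small function that also returns the updated element, as the question asks. The AppKey fields and the rename helper are assumptions made for this sketch; the original AppKey class is not shown in the question.

```scala
import scala.collection.mutable

// Hypothetical shape of AppKey; only `name` is known from the question.
final case class AppKey(id: String, name: String)

object AppKeys {
  // Returns Some(updated) and stores it under the same key,
  // or None (leaving the map untouched) if the key is absent.
  def rename(store: mutable.Map[String, AppKey], id: String, newName: String): Option[AppKey] = {
    val updated = store.get(id).map(_.copy(name = newName))
    updated.foreach(store.update(id, _))
    updated
  }
}
```

Using get rather than apply matters here: apply throws NoSuchElementException for a missing key instead of returning null.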
There are a couple of mistakes in your code.
case: Some(appKey) => appKey.copy(name = request.appKeyName)
This syntax for case is incorrect. It should be
case Some(appKey) => appKey.copy(name = request.appKeyName)
Also, the return type of your expression is currently Any (the Scala equivalent of Object), because the success case returns a value of appKey's type whereas the failure case returns None, which is an Option. To make things consistent, your success case should return
Some(appKey.copy(name = request.appKeyName))
There are better ways to deal with Options than pattern matching, though; using map, the corrected code would be
val newAppKey = appIdToApp(request.appKeyId) map (appKey =>
appKey.copy(name = request.appKeyName))

Retrieve tuple from string

I have the following input string:
"0.3215,Some(0.5123)"
I would like to retrieve the tuple (0.3215,Some(0.5123)) of type (BigDecimal,Option[BigDecimal]).
Here is one of the things I have tried so far:
"\\d+\\.\\d+,Some\\(\\d+\\.\\d+".r findFirstIn iData match {
case None => Map[BigDecimal, Option[BigDecimal]]()
case Some(s) => {
val oO = s.split(",Some\\(")
BigDecimal.valueOf(oO(0).toDouble) -> Option[BigDecimal](BigDecimal.valueOf(lSTmp2(1).toDouble))
}
}
This uses a Map, which I then transform into a tuple.
When I try to build the tuple directly, I get an Equals or an Object.
I must be missing something here...
Your code has several issues, but the big one seems to be that the case None side of the match returns a Map but the Some(s) side returns a Tuple2. Map and Tuple2 unify to their lowest-common-supertype, Equals, which is what you're seeing.
I think this is what you're trying to achieve?
val Pattern = "(\\d+\\.\\d+),Some\\((\\d+\\.\\d+)\\)".r
val s = "0.3215,Some(0.5123)"
s match {
case Pattern(a,b) => Map(BigDecimal(a) -> Some(BigDecimal(b)))
case _ => Map[BigDecimal, Option[BigDecimal]]()
}
// Map[BigDecimal,Option[BigDecimal]] = Map(0.3215 -> Some(0.5123))
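If the goal is the tuple itself rather than a Map, one option is to make both branches return the same type by wrapping the result in an Option. This is a sketch under that assumption; TupleParser and parse are hypothetical names, and input such as "0.3215,None" is not handled because the question does not show it.

```scala
object TupleParser {
  private val Pattern = "(\\d+\\.\\d+),Some\\((\\d+\\.\\d+)\\)".r

  // Both branches are Option[(BigDecimal, Option[BigDecimal])],
  // so the result no longer widens to Equals or Object.
  def parse(s: String): Option[(BigDecimal, Option[BigDecimal])] = s match {
    case Pattern(a, b) => Some((BigDecimal(a), Some(BigDecimal(b))))
    case _             => None
  }
}
```

Constructing BigDecimal from the matched strings avoids the detour through toDouble used in the question.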

Pattern matching and RDDs

I have a very simple (n00b) question but I'm somehow stuck. I'm trying to read a set of files in Spark with wholeTextFiles and want to return an RDD[LogEntry], where LogEntry is just a case class. I want to end up with an RDD of valid entries and I need to use a regular expression to extract the parameters for my case class. When an entry is not valid I do not want the extractor logic to fail but simply write an entry in a log. For that I use LazyLogging.
object LogProcessors extends LazyLogging {
def extractLogs(sc: SparkContext, path: String, numPartitions: Int = 5): RDD[Option[CleaningLogEntry]] = {
val pattern = "<some pattern>".r
val logs = sc.wholeTextFiles(path, numPartitions)
val entries = logs.map(fileContent => {
val file = fileContent._1
val content = fileContent._2
content.split("\\r?\\n").map(line => line match {
case pattern(dt, ev, seq) => Some(LogEntry(<...>))
case _ => logger.error(s"Cannot parse $file: $line"); None
})
})
That gives me an RDD[Array[Option[LogEntry]]]. Is there a neat way to end up with an RDD of the LogEntrys? I'm somehow missing it.
I was thinking about using Try instead, but I'm not sure if that's any better.
Thoughts greatly appreciated.
To get rid of the Array, simply replace the map call with flatMap: flatMap treats a result of type Traversable[T] for each record as separate records of type T.
To get rid of the Option, collect only the successful ones: entries.collect { case Some(entry) => entry }.
Note that this collect(p: PartialFunction) overload (which performs something equivalent to a map and a filter combined) is very different from collect() (which sends all data to the driver).
Altogether, this would be something like:
def extractLogs(sc: SparkContext, path: String, numPartitions: Int = 5): RDD[LogEntry] = {
val pattern = "<some pattern>".r
val logs = sc.wholeTextFiles(path, numPartitions)
val entries = logs.flatMap(fileContent => {
val file = fileContent._1
val content = fileContent._2
content.split("\\r?\\n").map(line => line match {
case pattern(dt, ev, seq) => Some(LogEntry(<...>))
case _ => logger.error(s"Cannot parse $file: $line"); None
})
})
entries.collect { case Some(entry) => entry }
}
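The flatMap-then-collect shape can be tried without a Spark cluster. The sketch below uses a plain Seq of (file, content) pairs as a stand-in for the RDD returned by wholeTextFiles. The LogEntry fields and the "date|event|seq" line format are assumptions made up for this example, since the original pattern is elided, and logging is left out.

```scala
// Hypothetical shape of LogEntry; the real fields are not shown in the question.
final case class LogEntry(date: String, event: String, seq: Int)

object LogParsing {
  // Assumed line format "date|event|seq", e.g. "2020-01-01|start|1".
  private val Line = "(\\d{4}-\\d{2}-\\d{2})\\|(\\w+)\\|(\\d+)".r

  def extract(files: Seq[(String, String)]): Seq[LogEntry] =
    files
      // flatMap flattens the per-file sequences into one sequence of Options
      .flatMap { case (_, content) =>
        content.split("\\r?\\n").toSeq.map {
          case Line(d, ev, n) => Some(LogEntry(d, ev, n.toInt))
          case _              => None
        }
      }
      // collect with a partial function keeps and unwraps only the Some values
      .collect { case Some(entry) => entry }
}
```

On an actual RDD the two calls have the same names and the same roles, which is why the partial-function collect must not be confused with the no-argument collect().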