Sorting of ZonedDateTime in Scala

I have the following List of ZonedDateTime, which is read from XML fields in Scala:
var timestamps = List[ZonedDateTime]()
timestampNodes.foreach(node => timestamps = timestamps :+ ZonedDateTime.parse(node.text, DateTimeFormatter.ISO_OFFSET_DATE_TIME))
What is the best and fastest way to sort the timestamps List so that entries are sorted from oldest to newest?

.sortWith() should work.
import java.time.{ZonedDateTime => ZDT}
val sortedtimestamps: List[ZDT] =
  timestampNodes.map(node => ZDT.parse(node.text))
    .sortWith(_.isBefore(_))
Scala 2.11.12 tested via Scastie.
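A related option (a sketch, not from the original answers): since `Instant` implements `Comparable[Instant]`, Scala derives an `Ordering` for it automatically, so `sortBy(_.toInstant)` sorts oldest-first with no custom ordering. The sample timestamps below are hypothetical stand-ins for the parsed XML values.

```scala
import java.time.{Instant, ZonedDateTime}

// Hypothetical sample data standing in for the parsed XML timestamps.
val timestamps = List(
  ZonedDateTime.parse("2021-01-01T12:00:00+02:00"), // 10:00Z
  ZonedDateTime.parse("2021-01-01T09:00:00Z"),      // 09:00Z
  ZonedDateTime.parse("2021-01-01T08:00:00-05:00")  // 13:00Z
)

// Instant is Comparable[Instant], so Scala provides an implicit Ordering;
// sorting by instant orders entries oldest-first across time zones.
val sorted = timestamps.sortBy(_.toInstant)
```

This also sidesteps the fact that `ZonedDateTime` itself has no implicit `Ordering` in the standard library (it implements `Comparable[ChronoZonedDateTime[_]]`, not `Comparable[ZonedDateTime]`).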

Implement Ordering for ZonedDateTime and you can use List.sorted:
import java.time._
import scala.math.Ordering.Implicits._
implicit val zonedDateTimeOrdering: Ordering[ZonedDateTime] =
  _ compareTo _
val base = ZonedDateTime.of(
  LocalDate.of(2021, 1, 1),
  LocalTime.MIDNIGHT,
  ZoneOffset.UTC
)
List(
  base.plusHours(1),
  base.plusHours(4),
  base.plusHours(2)
).sorted // ==> a list with oldest first
Tested with Scala 2.13 but should work with Scala 2.12 and later.
Bonus
Add an import and you can compare ZonedDateTime variables with <, <=, >, etc.
import scala.math.Ordering.Implicits._
base <= base.plusMinutes(10) // ==> true

Related

Subtract Months from YYYYMM date in Scala

I am trying to subtract months from YYYYMM format.
import java.text.SimpleDateFormat
val date = 202012
val dt_format = new SimpleDateFormat("YYYYMM")
val formattedDate = dt_format.format(date)
new DateTime(formattedDate).minusMonths(3).toDate();
Expected output:
202012 - 3 months = 202009,
202012 - 14 months = 201910
But it did not work as expected. Please help!
Among the standard date/time types, YearMonth seems to be the most appropriate for the given use case.
import java.time.format.DateTimeFormatter
import java.time.YearMonth
val format = DateTimeFormatter.ofPattern("yyyyMM")
YearMonth.parse("197001", format).minusMonths(13) // 1968-12
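To get the result back in the original yyyyMM string form, the YearMonth approach can be wrapped in a small helper (a sketch; the name `shiftMonths` is made up here):

```scala
import java.time.YearMonth
import java.time.format.DateTimeFormatter

val yyyymm = DateTimeFormatter.ofPattern("yyyyMM")

// Parse a yyyyMM string, shift by the given number of months,
// and format back to yyyyMM.
def shiftMonths(s: String, months: Int): String =
  YearMonth.parse(s, yyyymm).plusMonths(months).format(yyyymm)

shiftMonths("202012", -3)  // "202009"
shiftMonths("202012", -14) // "201910"
```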
This solution uses the functionality in java.time, available since Java 8. I would have preferred a solution that did not require adjusting the input so that it could be (forcefully) parsed into a LocalDate (so that plusMonths could be used), but at least it works.
Probably a simple regex could get the job done. ;-)
import java.time.format.DateTimeFormatter
import java.time.LocalDate
val inFmt = DateTimeFormatter.ofPattern("yyyyMMdd")
val outFmt = DateTimeFormatter.ofPattern("yyyyMM")
def plusMonths(string: String, months: Int): String =
  LocalDate.parse(s"${string}01", inFmt).plusMonths(months).format(outFmt)
assert(plusMonths("202012", -3) == "202009")
assert(plusMonths("202012", -14) == "201910")
You can play around with this code here on Scastie.

Filtering in Scala

So suppose I have the following data (only the first few rows, this data covers an entire year) -
(2014-08-31T00:05:00.000+01:00, John)
(2014-08-31T00:11:00.000+01:00, Sarah)
(2014-08-31T00:12:00.000+01:00, George)
(2014-08-31T00:05:00.000+01:00, John)
(2014-09-01T00:05:00.000+01:00, Sarah)
(2014-09-01T00:05:00.000+01:00, George)
(2014-09-01T00:05:00.000+01:00, Jason)
I would like to filter the data so that I only see the names for a specific date (say, 2014-09-05). I've tried doing this using the filter function in Scala, but I keep receiving the following error:
error: value xxxx is not a member of (org.joda.time.DateTime, String)
Is there another way of doing this?
The filter method takes a function, called a predicate, that takes as parameter an element of your (I'm assuming) RDD, and returns a Boolean.
The returned RDD will keep only the rows for which the predicate evaluates to true.
In your case, it seems that what you want is something like
rdd.filter {
  case (date, _) => date.withTimeAtStartOfDay() == new DateTime("2014-09-05")
}
I presume from the tag that your question is in the context of Spark, not pure Scala. Given that, you could filter a dataframe on a date and get the associated name(s) like this:
import org.apache.spark.sql.functions._
import sparkSession.implicits._
Seq(
  ("2014-08-31T00:05:00.000+01:00", "John"),
  ("2014-08-31T00:11:00.000+01:00", "Sarah")
  ...
)
  .toDF("date", "name")
  .filter(to_date('date).equalTo(Date.valueOf("2014-09-05")))
  .select("name")
Note that the Date above is java.sql.Date.
Here's a function that takes a date, a list of datetime-name pairs, and returns a list of names for the date:
def getNames(d: String, l: List[(String, String)]): List[String] = {
  val date = """^([^T]*).*""".r
  val dateMap = l.map {
    case (x, y) => (x match { case date(z) => z }, y)
  }.groupBy(_._1).mapValues(_.map(_._2))
  dateMap.getOrElse(d, List[String]())
}
val list = List(
("2014-08-31T00:05:00.000+01:00", "John"),
("2014-08-31T00:11:00.000+01:00", "Sarah"),
("2014-08-31T00:12:00.000+01:00", "George"),
("2014-08-31T00:05:00.000+01:00", "John"),
("2014-09-01T00:05:00.000+01:00", "Sarah"),
("2014-09-01T00:05:00.000+01:00", "George"),
("2014-09-01T00:05:00.000+01:00", "Jason")
)
getNames("2014-09-01", list)
res1: List[String] = List(Sarah, George, Jason)
val dateTimeStringZero = "2014-08-12T00:05:00.000+01:00"
val dateTimeOne:DateTime = org.joda.time.format.ISODateTimeFormat.dateTime.withZoneUTC.parseDateTime(dateTimeStringZero)
import java.text.SimpleDateFormat
val df = new DateTime(new SimpleDateFormat("yyyy-MM-dd").parse("2014-08-12"))
println(dateTimeOne.getYear == df.getYear)
println(dateTimeOne.getMonthOfYear == df.getMonthOfYear)
...

Flattening nested java lists in Scala

I am working in Scala with Java libraries. One of these libraries returns a list of lists. I want to flatten the list.
Example:
import java.util
import scala.collection.JavaConverters._
var parentList: util.List[util.List[Int]] = null
parentList = new util.ArrayList[util.List[Int]]
parentList.asScala.flatten // error
I have used the asScala converter, but I still get an error.
You need to call .asScala on every inner list:
scala> parentList.asScala.map(_.asScala)
res0: scala.collection.mutable.Buffer[scala.collection.mutable.Buffer[Int]] = ArrayBuffer()
scala> parentList.asScala.map(_.asScala).flatten
res1: scala.collection.mutable.Buffer[Int] = ArrayBuffer()
Note that calling .map and then .flatten can be done in one step using .flatMap:
scala> parentList.asScala.flatMap(_.asScala)
res2: scala.collection.mutable.Buffer[Int] = ArrayBuffer()
You also need to convert the inner List[Int]:
parentList.asScala.flatMap(_.asScala)
Try it like this:
import scala.jdk.CollectionConverters._
parentList.asScala.flatMap(_.asScala)
This will do the trick.
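For reference, the same flattening works end to end with the non-deprecated scala.jdk.CollectionConverters (Scala 2.13+); a minimal self-contained sketch:

```scala
import java.util
import scala.jdk.CollectionConverters._

// Build a Java list of lists, as a library might return it.
val parentList: util.List[util.List[Int]] = new util.ArrayList[util.List[Int]]
parentList.add(util.Arrays.asList(1, 2))
parentList.add(util.Arrays.asList(3))

// Convert the outer list, then each inner list, and flatten in one pass.
val flat: Seq[Int] = parentList.asScala.toSeq.flatMap(_.asScala)
```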

How to match dates through fromJson(toJson(date)) with specs2

I am stuck on the following problem: I want to write a specs2 specification to assert that my to- and from-JSON transformations are symmetrical. However, I get an error on Joda DateTime dates.
'2012-04-17T00:04:00.000+02:00' is not equal to '2012-04-17T00:04:00.000+02:00'. Values have the same string representation but possibly different types like List[Int] and List[String] (TimeSpecs.scala:18)
Here is a minimalist spec demonstrating the problem:
import org.joda.time.DateTime
import org.specs2.mutable.Specification
class TimeSpecs extends Specification {
  "joda and specs2" should {
    "play nice" in {
      val date = DateTime.parse("2012-04-17T00:04:00+0200")
      val date2 = DateTime.parse("2012-04-17T00:04:00+0200")
      date === date2
    }
    "play nice through play json transform" in {
      import play.api.libs.json._
      import play.api.libs.json.Json._
      val date = DateTime.parse("2012-04-17T00:04:00+0200")
      val jsDate = toJson(date)
      val date2 = jsDate.as[DateTime]
      date === date2
    }
  }
}
How should I compare date and date2 in the second test? They are the same, but specs2 doesn't seem to see that :(
--- edit
"Manually" inspecting the type at runtime with date.getClass.getCanonicalName returns org.joda.time.DateTime, as expected.
import org.joda.time.DateTime
import org.specs2.mutable.Specification
class TimeSpecs extends Specification {
  "joda and specs2" should {
    "play nice" in {
      val date = DateTime.parse("2012-04-17T00:04:00+0200")
      val date2 = DateTime.parse("2012-04-17T00:04:00+0200")
      date === date2
    }
    "play nice through play json transform" in {
      import play.api.libs.json._
      import play.api.libs.json.Json._
      val date: DateTime = DateTime.parse("2012-04-17T00:04:00+0200")
      val jsDate = toJson(date)
      val date2: DateTime = jsDate.as[DateTime]
      println(date.getClass.getCanonicalName)  // prints org.joda.time.DateTime
      println(date2.getClass.getCanonicalName) // prints org.joda.time.DateTime
      date === date2
    }
  }
}
Using DateTime#isEqual does kind of work, but I lose the benefit of fluent matchers and the useful error messages they bring. Additionally, what I am actually trying to compare are case class instances which happen to contain dates, not the dates themselves.
Using
date should beEqualTo(date2)
yields the same error as ===
The problem is that Joda-Time defines a very strict equals, which takes the date's Chronology into account (DateTime#getChronology). The isEqual method proposed by Kim Stebel ignores the Chronology.
From there on, there are two possibilities. The first is defining custom reads and writes for Play, then using the same pattern to create the dates, as in the following example:
import org.joda.time.DateTime
import org.joda.time.format.DateTimeFormat
import org.specs2.mutable.Specification
class TimeSpecs extends Specification {
  val pattern = "yyyy-MM-dd'T'HH:mm:ssZZ"
  "joda and specs2" should {
    "play nice" in {
      val date = DateTime.parse("2012-04-17T00:04:00+0200", DateTimeFormat.forPattern(pattern))
      val date2 = DateTime.parse("2012-04-17T00:04:00+0200", DateTimeFormat.forPattern(pattern))
      date === date2
    }
    "play nice through play json transform" in {
      import play.api.libs.json.Json._
      // Play 2 custom write
      implicit def customJodaWrite = play.api.libs.json.Writes.jodaDateWrites(pattern)
      // Play 2 custom read
      implicit def customJodaRead = play.api.libs.json.Reads.jodaDateReads(pattern)
      // make sure you parse the initial date with the same pattern
      val date: DateTime = DateTime.parse("2012-04-17T00:04:00+0200", DateTimeFormat.forPattern(pattern))
      val jsDate = toJson(date)
      val date2: DateTime = jsDate.as[DateTime]
      println(date.getClass.getCanonicalName)
      println(date2.getClass.getCanonicalName)
      println(jsDate)
      date should beEqualTo(date2)
    }
  }
}
Play 2.1 defaults to parsing (and writing to JSON) the Unix timestamp in milliseconds, without timezone information. When parsing the Unix timestamp back, it interprets it in the local computer's timezone (in my case Europe/Paris), hence the need for a custom reader/writer.
Joda-Time uses a specific formatter when parse is called without a formatter argument, and it doesn't seem possible to recreate that formatter from a pattern string alone (I haven't found a way to activate the DateTimeFormatter#withOffsetParsed method through a pattern string).
Another possibility may be to define a custom specs2 matcher for jodatime which would use isEqual instead of equals.
Since I don't want the Unix epoch in my JSON anyway, I'll stick with the custom Play transformers.

What's the new way to iterate over a Java Map in Scala 2.8.0?

How does scala.collection.JavaConversions supersede the answers given in the Stack Overflow question Iterating over Java collections in Scala (it doesn't work because the "jcl" package is gone) and in Iterating over Map with Scala (it doesn't work for me in a complicated test, which I'll try to boil down and post here later)?
The latter is actually a Scala Map question, but I think I need to know both answers in order to iterate over a java.util.Map.
In 2.8, you import scala.collection.JavaConversions._ and use it as a Scala map. Here's an example (in 2.8.0.RC1):
scala> val jmap:java.util.Map[String,String] = new java.util.HashMap[String,String]
jmap: java.util.Map[String,String] = {}
scala> jmap.put("Hi","there")
res0: String = null
scala> jmap.put("So","long")
res1: String = null
scala> jmap.put("Never","mind")
res2: String = null
scala> import scala.collection.JavaConversions._
import scala.collection.JavaConversions._
scala> jmap.foreach(kv => println(kv._1 + " -> " + kv._2))
Hi -> there
Never -> mind
So -> long
scala> jmap.keys.map(_.toUpperCase).foreach(println)
HI
NEVER
SO
If you specifically want a Scala iterator, use jmap.iterator (after the conversions import).
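Note that JavaConversions (implicit conversions) was deprecated in later Scala versions and removed in Scala 3; a sketch of the modern equivalent (Scala 2.13+, not part of the original 2.8 answer) makes the conversion explicit with .asScala:

```scala
import scala.jdk.CollectionConverters._

val jmap = new java.util.HashMap[String, String]
jmap.put("Hi", "there")
jmap.put("So", "long")

// .asScala wraps the Java map as a mutable Scala Map view; no copying.
val smap = jmap.asScala
smap.foreach { case (k, v) => println(s"$k -> $v") }
```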