Date/Time formatting in Scala

I'm trying to assert the date and time displayed on the page.
The problem is that it returns "2017-03-11T09:00" instead of "2017-03-11 09:00:00", and I'm confused why, since the pattern is yyyy-MM-dd HH:mm:ss.
Any ideas?
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

def getDate: String = {
  // Grab the timestamp text from the page and strip the "Timestamp:" label
  val timeStamp = find(xpath("//*[@id=\"content\"]/article/div/div/table/tbody/tr[5]/td/div/p[4]")).get.underlying.getText
  val stripDate: Array[String] = timeStamp.split("Timestamp:\n")
  stripDate(1)
}

def datePattern(date: String): LocalDateTime = {
  val pattern: DateTimeFormatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
  LocalDateTime.parse(date, pattern)
}

def checkDatePattern() = datePattern(getDate).toString shouldBe getDate

The DateTimeFormatter only gets used for the parse operation; it doesn't influence the result of toString, which always produces ISO-8601 output. If you want to convert your LocalDateTime to a String in a certain format, you have to call
date.format(pattern)
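A minimal sketch of the difference, assuming the same pattern as in the question:
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

val pattern = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
val parsed = LocalDateTime.parse("2017-03-11 09:00:00", pattern)
// toString always uses ISO-8601 and omits the seconds field when it is zero
println(parsed.toString)        // 2017-03-11T09:00
// format applies the pattern explicitly
println(parsed.format(pattern)) // 2017-03-11 09:00:00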

I've managed to get the result I wanted by just deleting some parts of the code. As long as the date is displayed in the correct format, the test passes; if it's displayed in an incorrect format, it fails, which is good enough for me. Thanks for your input. CASE CLOSED
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

def getDate: String = {
  val timeStamp = find(xpath("//*[@id=\"content\"]/article/div/div/table/tbody/tr[5]/td/div/p[4]")).get.underlying.getText
  val stripDate: Array[String] = timeStamp.split("Timestamp:\n")
  stripDate(1)
}

def datePattern(date: String): LocalDateTime = {
  val pattern: DateTimeFormatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
  LocalDateTime.parse(date, pattern)
}

def checkDatePattern() = datePattern(getDate)
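For illustration (with hypothetical inputs), the parse-only check behaves like this: it returns normally for the expected pattern and throws otherwise, which is what makes the test pass or fail.
datePattern("2017-03-11 09:00:00") // parses fine, test passes
datePattern("2017-03-11T09:00")    // throws java.time.format.DateTimeParseException, test fails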

Related

How to write functions for CurrentYear and PreviousYear in scala?

I have this scenario where I am fetching the CurrentDate. In addition to that, I also want to fetch data for the CurrentYear and the PreviousYear.
import java.text.SimpleDateFormat
import java.util.Calendar

def getCurrentDate: String = {
  val cal = Calendar.getInstance()
  cal.add(Calendar.DATE, 0)
  new SimpleDateFormat("yyyyMMdd").format(cal.getTime())
}

// Currently just a copy of getCurrentDate; it still returns the full date
def getCurrentYear: String = {
  val cal = Calendar.getInstance()
  cal.add(Calendar.DATE, 0)
  new SimpleDateFormat("yyyyMMdd").format(cal.getTime())
}
I am still figuring out how to write this simple function.
First of all, you need to know that the current year depends on your timezone.
So you first need to choose the timezone for which you want to know the year, especially in edge scenarios like New Year's Eve.
That being said, you can do this:
import java.time._

val yourTZ: ZoneId = ... // choose your timezone here

def currentDate = Instant.now.atZone(yourTZ).toString.take(10) // the toString uses ISO formatting, yyyy-MM-dd...
def currentYear = currentDate.take(4)
def previousYear = (currentYear.toInt - 1).toString
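An equivalent sketch using java.time's typed API instead of string slicing; the zone here (Europe/Paris) is just an illustrative choice, not something from the question:
import java.time.{LocalDate, ZoneId}

val zone = ZoneId.of("Europe/Paris") // pick your timezone
def currentYear: Int = LocalDate.now(zone).getYear
def previousYear: Int = currentYear - 1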

Spark scala UDF in DataFrames is not working

I have defined a function to convert epoch time to CET, and I am using that function after wrapping it as a UDF in a Spark DataFrame. It is throwing an error and not allowing me to use it. Please find my code below.
The function used to convert epoch time (in nanoseconds) to CET:
import java.text.SimpleDateFormat
import java.util.{Calendar, Date, TimeZone}
import java.util.concurrent.TimeUnit

def convertNanoEpochToDateTime(
    d: Long,
    f: String = "dd/MM/yyyy HH:mm:ss.SSS",
    z: String = "CET",
    msPrecision: Int = 9
): String = {
  val sdf = new SimpleDateFormat(f)
  sdf.setTimeZone(TimeZone.getTimeZone(z))
  val date = new Date((d / Math.pow(10, 9).toLong) * 1000L)
  val stringTime = sdf.format(date)
  if (f.contains(".S")) {
    val lng = d.toString.length
    val milliSecondsStr = d.toString.substring(lng - 9, lng)
    stringTime.substring(0, stringTime.lastIndexOf(".") + 1) + milliSecondsStr.substring(0, msPrecision)
  }
  else stringTime
}

val epochToDateTime = udf(convertNanoEpochToDateTime _)
The Spark DataFrame below uses the UDF defined above to convert epoch time to CET:
val df2 = df1.select($"messageID",$"messageIndex",epochToDateTime($"messageTimestamp").as("messageTimestamp"))
I am getting the error shown below when I run the code.
Any idea how I am supposed to proceed in this scenario?
The Spark optimizer is telling you that your function is not a Function1, that is, not a function that accepts one parameter; yours has four input parameters. Although in plain Scala you are allowed to call that function with only one parameter because you have default values for the other three, Catalyst does not work this way, so you will need to change the definition of your function to something like:
def convertNanoEpochToDateTime(
    f: String = "dd/MM/yyyy HH:mm:ss.SSS"
)(z: String = "CET")(msPrecision: Int = 9)(d: Long): String
or
def convertNanoEpochToDateTime(f: String)(z: String)(msPrecision: Int)(d: Long): String
and put the default values in the udf creation:
val epochToDateTime = udf(
  convertNanoEpochToDateTime("dd/MM/yyyy HH:mm:ss.SSS")("CET")(9) _
)
Also, try to define the SimpleDateFormat as a transient value outside the function; note that SimpleDateFormat is not thread-safe.
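A minimal sketch of that suggestion, assuming one formatter per executor thread (the object and value names here are illustrative, not from the question):
import java.text.SimpleDateFormat
import java.util.TimeZone

object Formats extends Serializable {
  // One formatter per thread, created lazily on each executor;
  // SimpleDateFormat is not thread-safe, so threads must not share one instance
  @transient lazy val cetFormat: ThreadLocal[SimpleDateFormat] =
    ThreadLocal.withInitial { () =>
      val sdf = new SimpleDateFormat("dd/MM/yyyy HH:mm:ss.SSS")
      sdf.setTimeZone(TimeZone.getTimeZone("CET"))
      sdf
    }
}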
I found what the error was due to and resolved it. The problem is that when I wrap the Scala function as a UDF, it expects 4 parameters, but I was passing only one. So I removed 3 parameters from the function and moved those values inside it, since they are constants. Now, in the Spark DataFrame, I call the function with only 1 parameter and it works perfectly fine.
import java.text.SimpleDateFormat
import java.util.{Calendar, Date, TimeZone}
import java.util.concurrent.TimeUnit
import org.apache.spark.sql.functions.udf

def convertNanoEpochToDateTime(d: Long): String = {
  // Constants that used to be default parameters
  val f: String = "dd/MM/yyyy HH:mm:ss.SSS"
  val z: String = "CET"
  val msPrecision: Int = 9

  val sdf = new SimpleDateFormat(f)
  sdf.setTimeZone(TimeZone.getTimeZone(z))
  // Nanoseconds -> seconds -> milliseconds for java.util.Date
  val date = new Date((d / Math.pow(10, 9).toLong) * 1000L)
  val stringTime = sdf.format(date)
  if (f.contains(".S")) {
    // Replace the formatted millisecond part with the sub-second digits of the input
    val lng = d.toString.length
    val milliSecondsStr = d.toString.substring(lng - 9, lng)
    stringTime.substring(0, stringTime.lastIndexOf(".") + 1) + milliSecondsStr.substring(0, msPrecision)
  }
  else stringTime
}

val epochToDateTime = udf(convertNanoEpochToDateTime _)

import spark.implicits._
val df1 = List(1659962673251388155L, 1659962673251388155L, 1659962673251388155L, 1659962673251388155L).toDF("epochTime")
val df2 = df1.select(epochToDateTime($"epochTime"))
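As a side note, the root cause is visible in plain Scala: default arguments do not survive eta expansion, so a method with three defaulted parameters still expands to a Function4, never a Function1. A minimal sketch with illustrative names:
def f(d: Long, fmt: String = "dd/MM/yyyy", z: String = "CET", p: Int = 9): String =
  s"$d $fmt $z $p"

val g: (Long, String, String, Int) => String = f _ // compiles: the full arity is kept
// val h: Long => String = f _                     // does not compile: defaults are dropped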

Convert string to date in kotlin

Could you help me?
I get a string of the form 01092020. How could I convert it to a date showing just the month and year, in a format like 01/09?
You should use DateTimeFormatter as follows:
import java.time.LocalDate
import java.time.format.DateTimeFormatter

fun main(args: Array<String>) {
    val dateString = "01092020"
    val readingFormatter = DateTimeFormatter.ofPattern("ddMMyyyy")
    val date = LocalDate.parse(dateString, readingFormatter)
    val writingFormatter = DateTimeFormatter.ofPattern("MM/yy")
    val formattedDate = date.format(writingFormatter)
    print(formattedDate) // 09/20
}
You can also substring the string and break it into day, month, and year, e.g. for DDMMYYYY:
val dateStr = "01092020"
val dd = dateStr.substring(0..1)                     // "01"
val mm = dateStr.substring(2..3)                     // "09"
val yyyy = dateStr.substring(4 until dateStr.length) // "2020"
val ddYYYY = "$dd/$yyyy"                             // "01/2020"
val mmYYYY = "$mm/$yyyy"                             // "09/2020"

Spark - Convert all Timestamp columns to a certain date format

I have a use case where I need to read data from Hive tables (Parquet), convert the Timestamp columns to a certain format, and write the output as CSV.
For the date format part, I want to write a function that takes a StructField and returns either the original field name or date_format($"col_name", "dd-MMM-yyyy hh.mm.ss a") if the dataType is TimestampType. This is what I have come up with so far:
def main(args: Array[String]): Unit = {
  val hiveSchema = args(0)
  val hiveTable = args(1)
  val myDF = spark.table(s"${hiveSchema}.${hiveTable}")
  val colArray = myDF.schema.fields.map(getColumns)
  val colString = colArray.mkString(",")
  myDF.select(colString).write.format("csv").mode("overwrite").option("header", "true").save("/tmp/myDF")
}

def getColumns(structField: StructField): String = structField match {
  case structField if (structField.dataType.simpleString.equalsIgnoreCase("TimestampType")) => s"""date_format($$"${structField.name}", "dd-MMM-yy hh.mm.ss a")"""
  case _ => structField.name
}
But I get the following error at runtime
org.apache.spark.sql.AnalysisException: cannot resolve '`date_format($$"my_date_col", "dd-MMM-yy hh.mm.ss a")`' given input columns [mySchema.myTable.first_name, mySchema.myTable.my_date_col];
Is there a better way to do this?
Remove the double dollar sign and the quotes around the column name. Also, there is no need to mkString; just use selectExpr:
import org.apache.spark.sql.types.{StructField, TimestampType}

def main(args: Array[String]): Unit = {
  val hiveSchema = args(0)
  val hiveTable = args(1)
  val myDF = spark.table(s"${hiveSchema}.${hiveTable}")
  val colArray = myDF.schema.fields.map(getColumns)
  myDF.selectExpr(colArray: _*).write.format("csv").mode("overwrite").option("header", "true").save("/tmp/myDF")
}

def getColumns(structField: StructField): String = structField match {
  // Compare against the type itself; simpleString for a timestamp column is "timestamp"
  case structField if structField.dataType == TimestampType =>
    s"""date_format(${structField.name}, "dd-MMM-yy hh.mm.ss a") as ${structField.name}"""
  case _ => structField.name
}

Subtract days from a date in the form of a string

I have a date in the form of a string, like below:
"08/08/2017 11:43"
I was trying to subtract a day from the above date string, so the final output would be "07/08/2017 11:43".
I tried it with the code below:
import org.joda.time.format.DateTimeFormat

val x = "18/8/2017 11:43"
val formatter = DateTimeFormat.forPattern("dd/MM/yyyy HH:mm")
val dt = formatter.parseDateTime(x)
println(dt.minusDays(1))
But the output that I got is 2017-08-17T11:43:00.000-07:00.
Is there a better way of doing this?
You need the formatter for both parsing and formatting the output; printing a DateTime directly uses its ISO-8601 toString.
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

val x = "18/8/2017 11:43"
val formatter = DateTimeFormatter.ofPattern("d/M/yyyy HH:mm")
val dt = LocalDateTime.parse(x, formatter)
val res = dt.minusDays(1).format(formatter) // res: String = 17/8/2017 11:43
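For completeness, the same fix works in Joda-Time, the library the question used: format explicitly through the formatter instead of relying on toString.
import org.joda.time.format.DateTimeFormat

val fmt = DateTimeFormat.forPattern("dd/MM/yyyy HH:mm")
val res = fmt.print(fmt.parseDateTime("18/8/2017 11:43").minusDays(1)) // 17/08/2017 11:43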