I need to convert Tue Jul 07 2020 12:30:42 to a timestamp with Scala for Spark.
So the expected result will be: 2020-07-07 12:30:42
Any idea how to do this, please?
You can use the to_timestamp function.
spark.conf.set("spark.sql.legacy.timeParserPolicy", "LEGACY") <-- Spark 3.0 Only.
df.withColumn("date", to_timestamp('string, "E MMM dd yyyy HH:mm:ss"))
.show(false)
+------------------------+-------------------+
|string |date |
+------------------------+-------------------+
|Tue Jul 07 2020 12:30:42|2020-07-07 12:30:42|
+------------------------+-------------------+
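For completeness, a self-contained sketch of the same approach (a minimal example; the column name string and the sample value are taken from the snippet above):

import org.apache.spark.sql.functions.to_timestamp
import spark.implicits._

// Spark 3.x: fall back to the legacy parser so the "E" (day-of-week) token is accepted.
spark.conf.set("spark.sql.legacy.timeParserPolicy", "LEGACY")

val df = Seq("Tue Jul 07 2020 12:30:42").toDF("string")

df.withColumn("date", to_timestamp($"string", "E MMM dd yyyy HH:mm:ss"))
  .show(false)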
I'm using the intl package to parse the String.
import 'package:intl/intl.dart';
...
String date = "Wed Sep 07 11:11:19 GMT+05:30 2022";
DateFormat formatter = DateFormat("EEE MMM dd HH:mm:ss zXXX yyyy");
DateTime formattedDateTime = formatter.parse(date);
But I'm getting an Exception:
FormatException: Trying to read XXX from Wed Sep 07 11:11:19 GMT+05:30 2022 at position 24
I tested the date format with this tool.
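For comparison, here is a minimal Scala/java.time sketch (not the intl API, just the equivalent pattern dialect) that parses this exact string: GMT+05:30 is a localized zone offset, which java.time matches with ZZZZ, while XXX only matches a bare offset such as +05:30, which is consistent with the parse failing when it reaches the GMT token.

import java.time.ZonedDateTime
import java.time.format.DateTimeFormatter
import java.util.Locale

val input = "Wed Sep 07 11:11:19 GMT+05:30 2022"

// ZZZZ matches localized offsets such as "GMT+05:30"; XXX would only match "+05:30".
val fmt = DateTimeFormatter.ofPattern("EEE MMM dd HH:mm:ss ZZZZ yyyy", Locale.ENGLISH)

println(ZonedDateTime.parse(input, fmt)) // 2022-09-07T11:11:19+05:30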
I want to convert a string date column to a date or timestamp (YYYY-MM-DD). How can I do it in Scala Spark SQL?
Input:
D1
Apr 24 2022
Jul 08 2021
Jan 16 2022
Expected:
D2
2022-04-24
2021-07-08
2022-01-16
You can use to_date and format the input accordingly.
Make the month characters uppercase so the pattern recognizes them.
Refer to this page for the format patterns:
https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html
select 'Apr 24 2022' D1, to_date(upper('Apr 24 2022'),'MMM dd yyyy') D2
union
select 'Jul 08 2021' D1, to_date(upper('Jul 08 2021'),'MMM dd yyyy') D2
union
select 'Jan 16 2022' D1, to_date(upper('Jan 16 2022'),'MMM dd yyyy') D2
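If you are on the DataFrame API rather than SQL, a minimal equivalent sketch (assuming the column is named D1, as in the example above):

import org.apache.spark.sql.functions.{to_date, upper}
import spark.implicits._

val df = Seq("Apr 24 2022", "Jul 08 2021", "Jan 16 2022").toDF("D1")

// Same idea as the SQL version: uppercase the month, then parse with "MMM dd yyyy".
df.withColumn("D2", to_date(upper($"D1"), "MMM dd yyyy"))
  .show(false)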
I have a problem with the SinglePlanningCalendar.
I set fullDay to false and set a startHour (8) and an endHour (17).
But when I want to resize an appointment, the startDate is not correct.
handleAppointmentResize: function (oEvent) {
    var oAppointment = oEvent.getParameter("appointment"),
        oStartDate = oEvent.getParameter("startDate"),
        oEndDate = oEvent.getParameter("endDate"),
        sAppointmentTitle = oAppointment.getTitle();
    console.log(oStartDate);
    console.log(oEndDate);
}
For example:
I have an appointment:
Wed Apr 06 2022 09:00:00 to Wed Apr 06 2022 13:00:00
I resize it to 09:00 - 10:00.
The result is:
Mon Apr 04 2022 19:30:00 GMT+0200 (Central European Summer Time)
Wed Apr 06 2022 09:00:00 GMT+0200 (Central European Summer Time)
If I switch fullDay to true, the result is OK!
Thanks for the help.
Resolved by upgrading the SAPUI5 version.
My DataFrame, myDF, is like below:
DATE_TIME
Wed Sep 6 15:24:27 CDT 2017
Wed Sep 6 15:30:05 CDT 2017
Expected output in the format:
2017-09-06 15:24:27
2017-09-06 15:30:05
I need to convert the DATE_TIME timestamp to UTC.
I tried the below code in a Databricks notebook, but it's not working.
%scala
val df = Seq(("Wed Sep 6 15:24:27 CDT 2017")).toDF("times")
df.withColumn("times2",date_format(to_timestamp('times,"ddd MMM dd hh:mm:ss CDT yyyy"),"yyyy-MM-dd HH:mm:ss")).show(false)
+---------------------------+------+
|times                      |times2|
+---------------------------+------+
|Wed Sep 6 15:24:27 CDT 2017|null  |
+---------------------------+------+
I think we need to remove "Wed" from your string, then use the to_timestamp() function.
Example:
df.show(false)
/*
+---------------------------+
|times |
+---------------------------+
|Wed Sep 6 15:24:27 CDT 2017|
+---------------------------+
*/
df.withColumn("times2",expr("""to_timestamp(substring(times,5,length(times)),"MMM d HH:mm:ss z yyyy")""")).
show(false)
/*
+---------------------------+-------------------+
|times |times2 |
+---------------------------+-------------------+
|Wed Sep 6 15:24:27 CDT 2017|2017-09-06 15:24:27|
+---------------------------+-------------------+
*/
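If you also need the result rendered in UTC, as the question asks, one option (a sketch, assuming Spark 2.2+) is to set the session time zone before parsing, since to_timestamp converts the zoned input ("CDT") into the session zone:

// Render parsed timestamps in UTC instead of the cluster's default zone.
spark.conf.set("spark.sql.session.timeZone", "UTC")

df.withColumn("times_utc",
    expr("""to_timestamp(substring(times, 5, length(times)), "MMM d HH:mm:ss z yyyy")"""))
  .show(false)
// 2017-09-06 15:24:27 CDT comes out as 2017-09-06 20:24:27 UTC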
The following example:
import pyspark.sql.functions as F
df = sqlContext.createDataFrame([('Feb 4 1997 10:30:00',), ('Jan 14 2000 13:33:00',), ('Jan 13 2020 01:20:12',)], ['t'])
ts_format = "MMM dd YYYY HH:mm:ss"
df.select(df.t,
F.to_timestamp(df.t, ts_format),
F.date_format(F.current_timestamp(), ts_format))\
.show(truncate=False)
Outputs:
+--------------------+-----------------------------------------+------------------------------------------------------+
|t |to_timestamp(`t`, 'MMM dd YYYY HH:mm:ss')|date_format(current_timestamp(), MMM dd YYYY HH:mm:ss)|
+--------------------+-----------------------------------------+------------------------------------------------------+
|Feb 4 1997 10:30:00 |1996-12-29 10:30:00 |Jan 22 2020 14:38:28 |
|Jan 14 2000 13:33:00|1999-12-26 13:33:00 |Jan 22 2020 14:38:28 |
|Jan 22 2020 14:29:12|2019-12-29 14:29:12 |Jan 22 2020 14:38:28 |
+--------------------+-----------------------------------------+------------------------------------------------------+
Question:
The conversion from current_timestamp() to a string works with the given format. Why doesn't the other way (String to Timestamp) work?
Notes:
The pyspark 2.4.4 docs point to SimpleDateFormat patterns.
Changing the year's format to lowercase fixed the issue:
ts_format = "MMM dd yyyy HH:mm:ss"
In SimpleDateFormat patterns, uppercase Y is the week-based year rather than the calendar year, which is why the parsed dates above snap to week-year boundaries such as 1996-12-29.
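A minimal Scala sketch with plain java.text.SimpleDateFormat (the pattern dialect those docs reference) demonstrating the difference; the exact week-year result is locale-dependent:

import java.text.SimpleDateFormat

val input = "Feb 4 1997 10:30:00"

// 'y' is the calendar year: parses as expected.
println(new SimpleDateFormat("MMM dd yyyy HH:mm:ss").parse(input))
// => Tue Feb 04 10:30:00 ... 1997

// 'Y' is the week-based year: the date snaps to the start of week year 1997.
println(new SimpleDateFormat("MMM dd YYYY HH:mm:ss").parse(input))
// => e.g. Sun Dec 29 10:30:00 ... 1996 (matches the 1996-12-29 rows above)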