I have a timestamp like "1461819600". Then I execute this code in a distributed environment as val campaign_startdate_year: String = Utils.getYear(campaign_startdate_timestamp).toString
The problem is that I always get the same year, 1970. What might be the reason?
import com.github.nscala_time.time.Imports._
def getYear(timestamp: Any): Int = {
var dt = 2017
if (!timestamp.toString.isEmpty)
{
dt = new DateTime(timestamp.toString.toLong).getYear // toLong should be multiplied by 1000 to get millisecond value
}
dt
}
The same issue occurs when I want to get the day of the month. I get 17 instead of 28.
def getDay(timestamp: Any): Int = {
var dt = 1
if (!timestamp.toString.isEmpty)
{
dt = new DateTime(timestamp.toString.toLong).getDayOfYear
}
dt
}
The timestamp you have is a number of seconds since 01-01-1970, 00:00:00 UTC.
Java (and Scala) usually use timestamps that are a number of milliseconds since 01-01-1970, 00:00:00 UTC.
In other words, you need to multiply the number by 1000.
The timestamp that you have seems to be in seconds since the epoch (i.e. a Unix timestamp). Java time utilities expect the timestamp to be in milliseconds.
Just multiply that value by 1000 and you should get the expected results.
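To make the difference concrete, here is a minimal sketch in plain Java using java.time rather than the nscala-time wrapper from the question. The class and method names (EpochUnits, fromSeconds, fromMillis) are made up for illustration, and UTC is assumed as the zone. Note also that getDay in the question calls getDayOfYear, so even after multiplying by 1000 it would return the day of the year rather than the day of the month; getDayOfMonth is presumably what was intended.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical helper class, not part of the original question or answers.
final class EpochUnits {
    // Interprets the string as seconds since the epoch (a Unix timestamp). UTC is an assumption.
    static ZonedDateTime fromSeconds(String ts) {
        return Instant.ofEpochSecond(Long.parseLong(ts)).atZone(ZoneOffset.UTC);
    }

    // Interprets the same string as milliseconds, which is what the question's code does.
    static ZonedDateTime fromMillis(String ts) {
        return Instant.ofEpochMilli(Long.parseLong(ts)).atZone(ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        String ts = "1461819600";
        ZonedDateTime wrong = fromMillis(ts);
        ZonedDateTime right = fromSeconds(ts);
        System.out.println(wrong.getYear() + " day " + wrong.getDayOfMonth()); // 1970 day 17
        System.out.println(right.getYear() + " day " + right.getDayOfMonth()); // 2016 day 28
    }
}
```

Read as milliseconds, the value is only about 17 days after the epoch, which explains both the year 1970 and the day 17 reported in the question.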
You can rely either on the Spark SQL functions, which include some date utilities (get year/month/day, add days/months), or on the JodaTime library for more control over Date and DateTime, as in my answer here: How to replace in values in spark dataframes after recalculations?
Related
I have a Spark data frame with the column timestamp. I need to create event_hour in unix_timestamp format out of this column. The current issue is that the timestamp is in unix_timestamp format with a granularity of seconds, while I need a granularity of hours.
Current values for timestamp:
1653192037
1653192026
1653192025
1653192024
1653192023
1653192022
Expected values:
1653192000
1653195600
1653199200
1653202800
How can I achieve that using Spark functions?
I've already tried to convert it to timestamp and then format it but I got null as the result:
inputDf
.withColumn("event_hour", unix_timestamp(date_format($"timestamp".cast(TimestampType), "MM-dd-yyyy HH")))
A (not very explicit but) efficient way is to use the modulus operation with 3600 (since 3600 seconds = 1 hour):
timestamp_hour = timestamp_second - (timestamp_second % 3600)
This assumes you are manipulating data as numeric.
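As a rough illustration of that arithmetic outside Spark, the following Java sketch floors an epoch-seconds value to the start of its hour. The class and method names are hypothetical, and Math.floorMod is used instead of % only so that negative (pre-1970) values also round down.

```java
// Hypothetical helper illustrating the modulus approach; not taken from the answer.
final class HourFloor {
    private static final long SECONDS_PER_HOUR = 3600L;

    // Subtracts the seconds elapsed since the last full hour.
    static long floorToHour(long epochSeconds) {
        return epochSeconds - Math.floorMod(epochSeconds, SECONDS_PER_HOUR);
    }

    public static void main(String[] args) {
        System.out.println(floorToHour(1653192037L)); // 1653192000
    }
}
```

The same subtraction can be expressed on a numeric Spark column; this only works on whole-hour UTC boundaries, so zones with half-hour offsets would need a different approach.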
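You can use the DateUtils API from Apache Commons Lang:
import java.util.Calendar;
import org.apache.commons.lang3.time.DateUtils;
// Timestamp_column must be a java.util.Date (java.sql.Timestamp qualifies); getTime() returns milliseconds
Long epochTimestamp_hour = DateUtils.truncate(Timestamp_column, Calendar.HOUR).getTime();
Create a new column of type timestamp, then use that column to truncate the timestamp to epochTimestamp_hour.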
I have a column ORDER_DATE containing epoch timestamps as strings. How can I convert a value such as str = "1536309236032" (epoch time in milliseconds) to a string in the format 2018-09-07T14:03:56.032Z in Scala?
Currently I am using:
from_unixtime(input.col(ORDER_DATE), "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")
but this converts incorrectly to 50668-08-21 01:10:00.000. Here, it increases the year and increments 000 for milliseconds.
I don't want to divide by 1000, because we need the result to include milliseconds.
In the documentation, the definition of from_unixtime is as follows:
Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the given format.
It works in seconds and is therefore not compatible with milliseconds, which is why the result is wrong. To convert the epoch timestamp while keeping the millisecond information, you can use concat:
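import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{concat, from_unixtime, length, lit}
val spark = SparkSession.builder.getOrCreate()
import spark.implicits._
// Format the whole seconds, then append the last three digits (the milliseconds) and a literal "Z"
df.withColumn("time", concat(
from_unixtime($"ORDER_DATE"/1000, "yyyy-MM-dd'T'HH:mm:ss."),
$"ORDER_DATE".substr(length($"ORDER_DATE")-2, length($"ORDER_DATE")),
lit("Z")))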
This works because the last three digits of the epoch timestamp are the same as the milliseconds in the desired result.
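For comparison, here is a hedged sketch of the same conversion in plain Java with java.time, as it might be done outside Spark or inside a user-defined function. The class and method names are invented for this example. It formats in UTC, so the trailing Z is accurate; from_unixtime, per the documentation quoted above, renders in the current system time zone, so a literal Z appended to its output is only correct when that zone is UTC.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; not part of the original answers.
final class EpochMillisToIso {
    // Always prints three fractional digits and formats in UTC.
    private static final DateTimeFormatter ISO_MILLIS_UTC =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'").withZone(ZoneOffset.UTC);

    static String toIsoUtc(String epochMillis) {
        return ISO_MILLIS_UTC.format(Instant.ofEpochMilli(Long.parseLong(epochMillis)));
    }

    public static void main(String[] args) {
        System.out.println(toIsoUtc("1536309236032")); // 2018-09-07T08:33:56.032Z
    }
}
```

The 14:03:56 shown in the question corresponds to this UTC value shifted by +05:30.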
I got the idea from @Shaido and did something similar. Finally, this solved the issue for me:
input.withColumn("time",
concat(from_unixtime(input.col("ORDER_DATE")/1000, "yyyy-MM-dd'T'HH:mm:ss"),
typedLit("."), substring(input.col("ORDER_DATE"), 11, 3), typedLit("Z")))
I am creating an application for which I need to find data by month using JPA and java.time.LocalDate. Is it possible to retrieve data by month from MySQL?
Thanks in advance for help.
First find the start and end dates of the month, then use a JPA range query to find the data for the current month.
LocalDate start = LocalDate.ofEpochDay(System.currentTimeMillis() / (24 * 60 * 60 * 1000) ).withDayOfMonth(1);
LocalDate end = LocalDate.ofEpochDay(System.currentTimeMillis() / (24 * 60 * 60 * 1000) ).plusMonths(1).withDayOfMonth(1).minusDays(1);
In the repository (both bounds inclusive, so records dated on the first and last day of the month are not dropped):
List<Object> findByCreatedateGreaterThanEqualAndCreatedateLessThanEqual(LocalDate start,LocalDate end);
It's better to use the Between keyword; it makes things a lot shorter.
List<Object> findByCreatedateBetween(LocalDate start,LocalDate end);
Also, if you want to use LocalDate or LocalDateTime objects with Spring Data, you should use the converter class Jsr310JpaConverters; otherwise the values will be stored as BLOBs instead of dates (which is bad for the portability of the database). Please see this tutorial on how to implement the converter:
https://www.mkyong.com/spring-boot/spring-boot-spring-data-jpa-java-8-date-and-time-jsr310/
tl;dr
YearMonth.now( ZoneId.of( "Pacific/Auckland" ) ) // Get current month for particular time zone.
.atDay( 1 ) // Get the first date of that month.
.plusMonths( 1 ) // Get first of next month for Half-Open query.
Details
Assuming your column in MySQL is of DATE type…
LocalDate
The LocalDate class represents a date-only value without time-of-day and without time zone.
Time zone
A time zone is crucial in determining a date. For any given moment, the date varies around the globe by zone. For example, a few minutes after midnight in Paris France is a new day while still “yesterday” in Montréal Québec.
Specify a proper time zone name in the format of continent/region, such as America/Montreal, Africa/Casablanca, or Pacific/Auckland. Never use the 3-4 letter abbreviation such as EST or IST as they are not true time zones, not standardized, and not even unique(!).
ZoneId z = ZoneId.of( "America/Montreal" );
LocalDate today = LocalDate.now( z );
YearMonth
The YearMonth class represents an entire month. Getting the current month requires a time zone as discussed above. Around the beginning/ending of the month, the current moment could be “next” month in Auckland New Zealand while still “previous” month in Kolkata India.
YearMonth currentMonth = YearMonth.now( z ) ;
Get the first date of the month.
LocalDate start = currentMonth.atDay( 1 ) ;
Half-Open
Generally best to use the Half-Open [) approach to defining a span of time, where the beginning is inclusive while the ending is exclusive. So defining a month means starting with the first date of the month and running up to, but not including, the first date of the following month.
LocalDate stop = start.plusMonths( 1 ) ;
Query
Do not use the BETWEEN command in SQL as it is fully closed [], both beginning and ending being inclusive. Half-Open uses >= & < logic.
SELECT when FROM tbl
WHERE when >= start
AND when < stop
;
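Putting the pieces above together, a minimal self-contained sketch of computing the Half-Open bounds might look like this. MonthRange and halfOpen are hypothetical names chosen for the example; the resulting start and stop values would then be bound to the >= and < parameters of the query.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.time.ZoneId;

// Hypothetical helper; the names are illustrative only.
final class MonthRange {
    // Returns { start (inclusive), stop (exclusive) } for the given month.
    static LocalDate[] halfOpen(YearMonth month) {
        LocalDate start = month.atDay(1);
        LocalDate stop = month.plusMonths(1).atDay(1);
        return new LocalDate[] { start, stop };
    }

    public static void main(String[] args) {
        // The zone is an example; use the zone that matters for your data.
        YearMonth current = YearMonth.now(ZoneId.of("America/Montreal"));
        LocalDate[] range = halfOpen(current);
        System.out.println(range[0] + " <= when < " + range[1]);
    }
}
```

Because the stop date is exclusive, no special handling is needed for month length or for the December to January rollover.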
This is also useful:
@Query("from PogWorkTime p where p.codePto = :codePto and month(p.dateApply) = :month and year(p.dateApply) = :year")
Iterable<PtoExceptWorkTime> findByCodePtoAndDateApply_MonthAndDateApply_Year(@Param("codePto") String codePto, @Param("month") int month, @Param("year") int year);
I have a Unix timestamp as String and I would like to extract hour and minutes in order to convert this timestamp into total minutes.
val timestamp = "1469768809"
It would be straightforward if timestamp were not a String (i.e. using timestamp.get(Calendar.HOUR_OF_DAY)). However, I don't know how to deal with a String.
Looks like you have epoch time in seconds. Convert the string to a Long (an Int would overflow for timestamps after January 2038), then do the necessary calculation:
val minutes = "1469768809".toLong / 60
Unless what you actually want is a datetime, in which case you should look into some of the date/time libraries for Scala, e.g. nscala-time.
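If what is wanted is the hour and minute of the day combined into total minutes (rather than minutes since the epoch), one possible reading of the question can be sketched in Java as follows. The class and method names are hypothetical, and UTC is assumed; a different zone would give a different hour.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical helper; assumes the string holds epoch seconds and that UTC is the wanted zone.
final class MinutesOfDay {
    static int minutesOfDayUtc(String epochSeconds) {
        ZonedDateTime t = Instant.ofEpochSecond(Long.parseLong(epochSeconds)).atZone(ZoneOffset.UTC);
        // hour * 60 + minute gives the minutes elapsed since midnight
        return t.getHour() * 60 + t.getMinute();
    }

    public static void main(String[] args) {
        System.out.println(minutesOfDayUtc("1469768809")); // 306, i.e. 05:06 UTC
    }
}
```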
I'm trying to parse a string that contains a DateTime
def parseDateTime(str : String) : DateTime = {
//need to parse date time of this format
//2015-05-22T05:10:00.305308666Z
DateTime.parse(str,DateTimeFormat.forPattern(dateTimePattern))
}
def dateTimePattern = "yyyy-MM-dd'T'HH:mm:ss.SSSSSSSSSZ"
and here is my test case trying to parse the date time
"MarshallerUtil" must "parse a date time correctly from blockcypher" in {
val str = "2015-05-22T05:10:00.305308666Z"
val dateTime = parseDateTime(str)
dateTime.getYear must be (2015)
dateTime.getMonthOfYear must be (DateTimeConstants.MAY)
dateTime.getDayOfMonth must be (22)
dateTime.getHourOfDay must be (5)
dateTime.getMinuteOfHour must be (10)
}
and it fails to get the correct hour
[info] - must parse a date time correctly from blockcypher *** FAILED ***
[info] 0 was not equal to 5 (MarshallerUtilTest.scala:17)
What is wrong with my pattern?
That's because the string is parsed as UTC, and when you invoke getHourOfDay it returns the field in your local time zone. For example, the same program printed '10' here because my local time zone is '+05:30', and 05:10 plus 05:30 is 10:40. I hope this helps.
Update:
Z is a placeholder/matcher that is used in the date time pattern to match a timezone. A timezone has the form '+HH:mm' or '-HH:mm', for example '+05:30' means that the timezone is 5 hours and 30 mins ahead of the UTC time.
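The effect of the zone on the reported hour can be reproduced with a small sketch. It uses java.time instead of Joda-Time, purely as an illustration of the same idea; ZuluHour and hourIn are hypothetical names. In Joda-Time itself, calling withZoneUTC() on the formatter is one way to keep the parsed value in UTC.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical helper showing that the hour field depends on the zone used to read the instant.
final class ZuluHour {
    static int hourIn(String isoInstant, ZoneId zone) {
        return Instant.parse(isoInstant).atZone(zone).getHour();
    }

    public static void main(String[] args) {
        String str = "2015-05-22T05:10:00.305308666Z";
        System.out.println(hourIn(str, ZoneOffset.UTC));             // 5
        System.out.println(hourIn(str, ZoneId.of("Asia/Kolkata")));  // 10 (05:10 + 05:30 = 10:40)
    }
}
```

A test that asserts on hour fields is only stable if the zone is fixed explicitly rather than taken from the machine's default.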