Looking for Spark Scala (or Java) code to format a date string with space padding under specific conditions

I need some suggestions on the requirement below.
Every response helps a lot; thanks in advance.
I have a date with a timestamp stored as a String, for example: Jan  8 2019  4:44 AM
My requirement: if the day of the month is a single digit, I want it preceded by one space (ex: " 8"); if it has two digits, which covers days 10 to 31, I want no extra space (ex: "10"). The same applies to the hour in the timestamp.
To summarize: if the day is 1 to 9 and the hour is 1 to 9, I am looking for the string below:
Jan  8 2019  4:44 AM
If the day is 10 to 31 and the hour is 10 to 12, I am looking for the string below:
Jan 18 2019 12:44 AM
Right now I am creating the date format in the following way:
val sdf = new SimpleDateFormat("MMM d yyyy h:mm a")
But the above format satisfies only one condition, namely days 1 to 9.
My application is Spark with Scala, so I am looking for Spark Scala code or Java.
I appreciate your help.
Thanks.

java.time
Use p as a pad modifier in the format pattern string. In Java syntax (sorry):
DateTimeFormatter formatter = DateTimeFormatter.ofPattern(
"MMM ppd ppppu pph:mm a", Locale.ENGLISH);
System.out.println(LocalDateTime.of(2019, Month.JANUARY, 8, 4, 44)
.format(formatter));
System.out.println(LocalDateTime.of(2019, Month.JANUARY, 18, 0, 44)
.format(formatter));
Output:
Jan  8 2019  4:44 AM
Jan 18 2019 12:44 AM
And do yourself the favour: Forget everything about the SimpleDateFormat class. It is notoriously troublesome and fortunately long outdated. Use java.time, the modern Java date and time API.
Link: Oracle tutorial: Date Time explaining how to use java.time.
To quote the DateTimeFormatter class documentation:
Pad modifier: Modifies the pattern that immediately follows to be padded with spaces. The pad width is determined by the number of pattern letters. This is the same as calling DateTimeFormatterBuilder.padNext(int).
For example, 'ppH' outputs the hour-of-day padded on the left with spaces to a width of 2.
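The equivalence with DateTimeFormatterBuilder.padNext(int) mentioned in the quote can be sketched with the builder directly. This is only a minimal illustration that mirrors the pattern "MMM ppd ppppu pph:mm a" from above; the class name PadNextDemo and the helper format are hypothetical names chosen for the example.

```java
import java.time.LocalDateTime;
import java.time.Month;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Locale;

// Hypothetical demo class; not part of the original answer.
class PadNextDemo {
    // Builder version of the pattern "MMM ppd ppppu pph:mm a":
    // each padNext(n) pads the value appended right after it with spaces to width n.
    static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
            .appendPattern("MMM ")
            .padNext(2).appendValue(ChronoField.DAY_OF_MONTH)
            .appendLiteral(' ')
            .padNext(4).appendValue(ChronoField.YEAR)
            .appendLiteral(' ')
            .padNext(2).appendValue(ChronoField.CLOCK_HOUR_OF_AMPM)
            .appendPattern(":mm a")
            .toFormatter(Locale.ENGLISH);

    static String format(LocalDateTime dateTime) {
        return dateTime.format(FORMATTER);
    }

    public static void main(String[] args) {
        // Jan  8 2019  4:44 AM
        System.out.println(format(LocalDateTime.of(2019, Month.JANUARY, 8, 4, 44)));
        // Jan 18 2019 12:44 AM
        System.out.println(format(LocalDateTime.of(2019, Month.JANUARY, 18, 0, 44)));
    }
}
```

The pattern string is shorter, so the builder form is probably only worth it when the pad width has to be computed at run time.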

Related

DateTimeFormatter ofPattern not working for "L"

I have a LocalDateTime object and I would like to format this, to have printouts like:
Tue 23. Nov. Therefore, I used a DateTimeFormatter like:
val formatter: DateTimeFormatter = DateTimeFormatter.ofPattern("e dd. LLL")
But unfortunately I get Tue 23. 11. The month is a number, not letters!?
The correct format pattern string is E dd. MMM. Excuse my Java syntax.
private static final DateTimeFormatter DATE_FORMATTER
= DateTimeFormatter.ofPattern("E dd. MMM", Locale.ENGLISH);
Also remember to specify desired locale for your formatter.
Trying it out:
LocalDate date = LocalDate.of(2021, Month.NOVEMBER, 23);
String formatted = date.format(DATE_FORMATTER);
System.out.println(formatted);
Output is the desired:
Tue 23. Nov
Spelling out how my format pattern is different:
I am using upper case E for the abbreviation of the day of week. Lower case e should give you the number of the day of week like 2 for Tuesday. eee should work for the abbreviation too.
I am using MMM for the abbreviation of the month. LLL is for the stand-alone form. Some languages use a different form of the month depending on whether the day of month is present or not. A language may for example use the nominative for the month alone and the genitive with a day number, a bit like the difference between November and of November. Since you have the day included, you should not use pattern letter L here. Funnily, for some languages that have not got a stand-alone form (like English), Java gives you the number instead when you specify LLL.
Edit: you asked:
How would that look for "November" fully written out? "MMM" works for
"Dec."
The documentation that you linked to in another comment gives the answer:
Text: The text style is determined based on the number of pattern letters used. Less than 4 pattern letters will use the short form.
Exactly 4 pattern letters will use the full form. …
So use MMMM instead of MMM:
private static final DateTimeFormatter DATE_FORMATTER
= DateTimeFormatter.ofPattern("E dd. MMMM", Locale.ENGLISH);
Tue 23. November
Documentation link: DateTimeFormatter
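To see the difference between the pattern letters side by side, a small sketch like the following may help. The class name MonthStyleDemo and the helper format are hypothetical, and Russian is only my assumption of a language with distinct format and stand-alone month forms; what it prints depends on the locale data of the JDK in use, so no output is claimed for those two lines.

```java
import java.time.LocalDate;
import java.time.Month;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical demo class; not part of the original answer.
class MonthStyleDemo {
    static String format(LocalDate date, String pattern, Locale locale) {
        return date.format(DateTimeFormatter.ofPattern(pattern, locale));
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2021, Month.NOVEMBER, 23);

        // Fewer than four letters: short form. Exactly four letters: full form.
        System.out.println(format(date, "E dd. MMM", Locale.ENGLISH));  // Tue 23. Nov
        System.out.println(format(date, "E dd. MMMM", Locale.ENGLISH)); // Tue 23. November

        // Assumption: Russian chosen as an example locale; results vary with JDK locale data.
        Locale russian = Locale.forLanguageTag("ru");
        System.out.println(format(date, "d MMMM", russian)); // M: form used together with a day number
        System.out.println(format(date, "LLLL", russian));   // L: stand-alone form
    }
}
```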

conversion issue while putting transformer to read string having date values

I have two fields read as strings from a file:
November 14  2019 10:35:24 AM and November 14  2019
I want to convert these fields in DataStage to
11/14/2019  10:35:24AM and 20191114 respectively.
Please note: in the input there is one space between November and 14 and two spaces between 14 and 2019,
and in the output there are two spaces between 11/14/2019 and the time.
As the input is a string, and it seems you want a string as the result again, string manipulation functions in a Transformer stage are always an option.
Alternatively, you could also try the String_to_Timestamp and STRING_TO_DATE functions on the same page.
You will find valid format options here

How to change freemarker date value?

I am getting {item.pubDate} from XML and the value is:
Mon, 02 Mar 2015 14:35:47 +0000
so I did this:
<#assign starting_point = item.pubDate?index_of(",")>
<#assign date="${item.pubDate?substring(starting_point + 1)}" />
${date?datetime("dd MMM yyyy hh:mm:ss z")?date}<br>
and the result is: Mar 2, 2015.
My question is: can we change the value from Mar to March, and if so, what is the best way to do it? I could write if/elseif statements in FreeMarker and map each three-letter month to its full name, but that does not look good. Any advice or tips will be greatly appreciated. Thanks.
It doesn't matter, MMM will parse both Mar and March. The only important thing is to have at least 3 M-s, as http://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html says:
If the number of pattern letters is 3 or more, the month is interpreted as text; otherwise, it is interpreted as a number.
Yes, FreeMarker follows the same datetime formatting rules as Java. Use the ?string built-in for dates. You can do:
${date?datetime("dd MMM yyyy hh:mm:ss z")?string("MMMM dd, yyyy")}
Source: http://freemarker.org/docs/ref_builtins_date.html#ref_builtin_string_for_date
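Since the formatting rules are the same as Java's, both points can be checked in plain Java with SimpleDateFormat. This is only a sketch under my own assumptions: the parse pattern uses HH (24-hour clock) and Z (numeric offset) to match the sample value, the output is printed in UTC, and the names MonthTextDemo and reformat are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class; not taken from FreeMarker or the original answers.
class MonthTextDemo {
    static String reformat(String input) {
        // Three or more M-s: the month is read as text, abbreviated or full.
        // Assumption: HH and Z are used here to fit "14:35:47 +0000".
        SimpleDateFormat parser =
                new SimpleDateFormat("dd MMM yyyy HH:mm:ss Z", Locale.ENGLISH);
        // Four M-s when formatting: full month name.
        SimpleDateFormat printer = new SimpleDateFormat("MMMM dd, yyyy", Locale.ENGLISH);
        printer.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: print in UTC
        try {
            Date parsed = parser.parse(input);
            return printer.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + input, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(reformat("02 Mar 2015 14:35:47 +0000"));   // March 02, 2015
        System.out.println(reformat("02 March 2015 14:35:47 +0000")); // March 02, 2015
    }
}
```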

How do I parse "YYYY-MM-DD" with joda time

I'm trying to use joda-time to parse a date string of the form YYYY-MM-DD. I have test code like this:
DateTimeFormatter dateDecoder = DateTimeFormat.forPattern("YYYY-MM-DD");
DateTime dateTime = dateDecoder.parseDateTime("2005-07-30");
System.out.println(dateTime);
Which outputs:
2005-01-30T00:00:00.000Z
As you can see, the DateTime object produced is 30 Jan 2005, instead of 30 July 2005.
Appreciate any help. I just assumed this would work because it's one of the date formats listed here.
The confusion is with what the ISO format actually is. YYYY-MM-DD is not the ISO format, the actual resulting date is.
So 2005-07-30 is in ISO-8601 format, and the spec uses YYYY-MM-DD to describe the format. There is no connection between the use of YYYY-MM-DD as a pattern in the spec and any piece of code. The only constraint the spec places is that the result consists of a 4 digit year followed by a dash followed by a 2 digit month followed by a dash followed by a two digit day-of-month.
As such, the spec could have used $year4-$month2-$day2, which would equally well define the output format.
You will need to search and replace any input pattern to convert "Y" to "y" and "D" to "d".
I've also added some enhanced documentation of formatting.
Your answer is in the docs: http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html
The string format should be something like: "yyyy-MM-dd".
The date format described in the w3 document and JodaTime's DateTimeFormat are different.
More specifically, in DateTimeFormat, the pattern DD is for Day in year, so the value for DD of 30 is the 30th day in the year, ie. January 30th. As the formatter is reading your date String, it sets the month to 07. When it reads the day of year, it will overwrite that with 01 for January.
You need to use the pattern strings expected by DateTimeFormat, not the ones expected by the w3 date and time formats. In this case, that would be
DateTimeFormatter dateDecoder = DateTimeFormat.forPattern("yyyy-MM-dd");
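For comparison, here is the same experiment with the JDK's java.time instead of Joda-Time. This is not Joda-Time code but a swapped-in API that also uses lower-case d for day of month and upper-case D for day of year. The class name DayOfYearDemo and its helpers are hypothetical. As far as I can tell, java.time does not silently overwrite the month; it rejects the input because the parsed month and day of year disagree.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical demo class using java.time, not Joda-Time.
class DayOfYearDemo {
    static LocalDate parse(String text, String pattern) {
        return LocalDate.parse(text, DateTimeFormatter.ofPattern(pattern, Locale.ROOT));
    }

    // Returns true when the text cannot be parsed with the given pattern.
    static boolean isRejected(String text, String pattern) {
        try {
            parse(text, pattern);
            return false;
        } catch (DateTimeParseException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Lower-case dd is day of month: parses to 30 July 2005.
        System.out.println(parse("2005-07-30", "yyyy-MM-dd")); // 2005-07-30

        // Upper-case DD is day of year: day 30 is in January, which conflicts with month 07.
        System.out.println(isRejected("2005-07-30", "yyyy-MM-DD")); // true
    }
}
```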

Groovy date parse issue

The Date.parse() method of Groovy detects the day DD and the year yyyy correctly but is unable to detect the month given as MMM, as in:
println new Date().parse("DD-MMM-yyyy", '22-MAR-2011')
yields output as
Sat Jan 22 00:00:00 GMT+05:30 2011
Why is the month March, given as MAR, picked up as Jan? What can I do to make it detect the month in MMM format?
The problem is actually that you are using DD, which means day in year.
Correct way:
println new Date().parse("dd-MMM-yyyy", '22-MAR-2011')
Quick tip: when formatting dates, try the reverse operation and see what comes out:
println new Date().format("dd-MMM-yyyy")
Groovy uses SimpleDateFormat under the hood but that's not that important since most date libraries use the same format conventions.
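Because SimpleDateFormat is what runs under the hood, the effect of DD versus dd can be reproduced in plain Java. The following is a minimal sketch, assuming UTC for both parsing and printing so the result does not depend on the machine's time zone; the class name DayPatternDemo and the helper parseAndPrint are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class; plain Java rather than Groovy.
class DayPatternDemo {
    // Parses text with the given pattern and prints the result as yyyy-MM-dd.
    static String parseAndPrint(String pattern, String text) {
        TimeZone utc = TimeZone.getTimeZone("UTC"); // assumption: fixed zone for repeatable output
        SimpleDateFormat parser = new SimpleDateFormat(pattern, Locale.ENGLISH);
        parser.setTimeZone(utc);
        SimpleDateFormat printer = new SimpleDateFormat("yyyy-MM-dd", Locale.ENGLISH);
        printer.setTimeZone(utc);
        try {
            return printer.format(parser.parse(text));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + text, e);
        }
    }

    public static void main(String[] args) {
        // DD is day in year: day 22 of the year wins over the parsed month.
        System.out.println(parseAndPrint("DD-MMM-yyyy", "22-MAR-2011")); // 2011-01-22
        // dd is day in month: the month is kept.
        System.out.println(parseAndPrint("dd-MMM-yyyy", "22-MAR-2011")); // 2011-03-22
    }
}
```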