I am trying to parse a column's string values (e.g., D20200910.T000000) into a date format using pyspark. I tried the following, but the results came back null:
select(to_date('ingest_id','DYYYYMMDD.THHMMSS'))
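A likely cause of the nulls is the pattern itself: Spark's datetime pattern letters are case-sensitive (yyyy, MM, dd, HH, mm, ss), and literal characters such as the leading D and the .T separator have to be quoted. The sketch below checks the same parsing logic in plain Python with the standard library. The helper name parse_ingest_id is hypothetical, and the PySpark pattern in the comment is an assumption that has not been run against the original data.

```python
from datetime import date, datetime

def parse_ingest_id(value: str) -> date:
    """Parse strings shaped like 'D20200910.T000000' into a date."""
    # 'D' and '.T' are literal characters in the strptime format.
    return datetime.strptime(value, "D%Y%m%d.T%H%M%S").date()

# Assumed PySpark equivalent (literals quoted, case-sensitive letters):
#   F.to_date("ingest_id", "'D'yyyyMMdd'.T'HHmmss")
print(parse_ingest_id("D20200910.T000000"))  # 2020-09-10
```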
I am wondering how to convert the GDATU fields from the TCURR table to the normal date format yyyy-mm-dd using Pyspark.
I tried creating a new column using from_unixtime, but the result does not look right.
df = df.withColumn('GDATU_NEW', F.from_unixtime('GDATU', 'yyyy-mm-dd'))
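Two things probably go wrong here: from_unixtime expects seconds since the epoch, which GDATU is not, and in Spark patterns lowercase mm means minutes rather than months. SAP commonly stores GDATU as an inverted date (99999999 minus the yyyyMMdd value). The sketch below assumes that convention, which should be verified against the actual TCURR data; it is plain Python, and the helper name gdatu_to_iso is hypothetical.

```python
from datetime import datetime

def gdatu_to_iso(gdatu: str) -> str:
    """Convert an inverted-date GDATU value to 'yyyy-MM-dd'.

    Assumes GDATU = 99999999 - yyyyMMdd (to be verified on real data).
    """
    yyyymmdd = 99999999 - int(gdatu)
    # strptime also validates that the result is a real calendar date.
    return datetime.strptime(f"{yyyymmdd:08d}", "%Y%m%d").strftime("%Y-%m-%d")

print(gdatu_to_iso("79839898"))  # 2016-01-01
```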
I am trying to convert a string value (2022-07-24T07:04:27.5765591Z) into a datetime/timestamp so it can be inserted into a SQL table in datetime format without losing precision down to the milliseconds. The string is actually a datetime, and my source is a CSV file in ADLS. I tried the options below in a data flow.
Projection: changed the data type of the column to timestamp with the format yyyy-MM-dd'T'HH:mm:ss.SSS'Z', but I get NULL in the output.
Derived column: tried the expressions below, but they also return NULL:
toTimestamp(DataLakeModified_DateTime,'%Y-%m-%dT%H:%M:%s%z')
toTimestamp(DataLakeModified_DateTime,'yyyy-MM-ddTHH:mm:ss:fffffffK')
toTimestamp(DataLakeModified_DateTime,'yyyy-MM-dd HH:mm:ss.SSS')
I want the same value in the output:
2022-07-24T07:04:27.5765591Z (coming as string) to 2022-07-24T07:04:27.5765591Z (in datetime format which will be accepted by SQL database)
I tried to reproduce the issue and got the same result, i.e., null values for the yyyy-MM-dd'T'HH:mm:ss.SSS'Z' timestamp format. The problem is the format of the string in the source: ADF does not recognize the given string as a timestamp and therefore returns NULL.
If you try a different format, for example keeping only 3 digits before the Z, it is converted to a timestamp and does not return NULL.
This is what I tried: I kept one timestamp exactly as in your data and modified the other.
This returns NULL for the first value and a datetime for the second.
But the format you are looking for is still missing. With the existing source format, yyyy-MM-dd'T'HH:mm:ss works fine. This format also works in SQL tables; I have tried it.
Try using toString instead of toTimestamp and use it to build your desired timestamp:
toString(DataLakeModified_DateTime, 'yyyy-MM-dd HH:mm:ss:SS')
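The source value carries seven fractional digits, which is more than many parsers accept. One way to picture the suggestion of keeping fewer digits before the Z is to trim the fraction before parsing. The sketch below does this in plain Python, whose %f directive accepts at most six digits. The function name parse_trimmed and the default of millisecond precision are assumptions for illustration, not ADF behavior.

```python
import re
from datetime import datetime, timezone

def parse_trimmed(value: str, digits: int = 3) -> datetime:
    """Truncate fractional seconds to `digits` places (1 to 6), then parse as UTC."""
    # Keep only the first `digits` fractional digits; drop the rest.
    trimmed = re.sub(r"\.(\d+)Z$", lambda m: "." + m.group(1)[:digits] + "Z", value)
    parsed = datetime.strptime(trimmed, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)

ts = parse_trimmed("2022-07-24T07:04:27.5765591Z")
print(ts.isoformat(timespec="milliseconds"))  # 2022-07-24T07:04:27.576+00:00
```

Truncation silently discards the extra digits, so it only fits when millisecond precision is enough for the target column.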
I am trying to change the date format of one of my fields to dd/MM/yyyy, but every time I use the expression below I get the literal text dd/MM/yyyy as the cell value instead of a reformatted date.
The data source that I am using is Teradata.
Expression used: =Format(FormatDateTime(Fields!DATE.Value, DateFormat.ShortDate),"dd/MM/yyyy")
Can someone help me see where I am going wrong?
Any help would be appreciated.
The FormatDateTime function returns a string instead of a date. When the FORMAT function tries to format it, it doesn't find a date field so it returns the characters in the format.
If your field is a date, you should be able to format the field without conversion:
=Format(Fields!DATE.Value,"dd/MM/yyyy")
If it does need to be converted first, try using the CDATE function:
=Format(CDATE(Fields!DATE.Value),"dd/MM/yyyy")
I have a string containing a date in a Teradata table:
Var1=09OCT2017-EMRT
I need to extract the date from this string in 'mm/dd/yyyy' format.
I tried the following:
Cast(cast(substr(var1,1,9) as char(20)) as date format 'mm/dd/yyyy') as date
I get the error 'invalid date supplied for var1'.
I would appreciate your help.
You need to apply a format matching the input string:
To_Date(Substr(var1,1,9), 'ddmonyyyy')
returns a DATE.
If you want to cast it back to a string:
To_Char(To_Date(Substr(var1,1,9), 'ddmonyyyy'), 'mm/dd/yyyy')
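The same idea (parse with a format that matches the input, then render with the output format) can be sketched in plain Python for a quick check outside Teradata. The helper name extract_date is hypothetical, and %b month-name matching assumes an English locale.

```python
from datetime import datetime

def extract_date(var1: str) -> str:
    """Take the leading 9 characters (ddMONyyyy) and render as mm/dd/yyyy."""
    parsed = datetime.strptime(var1[:9], "%d%b%Y")  # input format
    return parsed.strftime("%m/%d/%Y")              # output format

print(extract_date("09OCT2017-EMRT"))  # 10/09/2017
```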
I'm very new to sql/hive. At first, I loaded a txt file into hive using:
drop table if exists Tran_data;
create table Tran_data(tran_time string,
resort string, settled double)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n';
Load data local inpath 'C:\Users\me\Documents\transaction_data.txt' into table Tran_Data;
The tran_time values in the txt file look like this: 10-APR-2014 15:01. After loading the Tran_data table, I tried to convert tran_time to a "standard" format so that I can join this table to another table using tran_time as the join key. The desired date format is 'yyyymmdd'. I searched online and found this: unix_timestamp(substr(tran_time,1,11),'dd-MMM-yyyy')
So essentially I'm doing this: unix_timestamp('10-APR-2014','dd-MMM-yyyy'). However, the output is NULL.
So my question is: how do I convert the date to a "standard" format, and then convert it further to 'yyyymmdd'?
from_unixtime(unix_timestamp('20150101' ,'yyyyMMdd'), 'yyyy-MM-dd')
My current Hive Version: Hive 0.12.0-cdh5.1.5
I converted datetime in first column to date in second column using the below hive date functions. Hope this helps!
select inp_dt, from_unixtime(unix_timestamp(substr(inp_dt,0,11),'dd-MMM-yyyy')) as todateformat from table;
inp_dt todateformat
12-Mar-2015 07:24:55 2015-03-12 00:00:00
The unix_timestamp function converts a date string in a given format to a Unix timestamp in seconds, but not for a format like dd-mm-yyyy.
You need to write your own custom UDF to convert a given date string to the format you need, because Hive currently has no predefined function for this. There is a to_date function to convert a timestamp to a date, but the unix_timestamp functions won't help with your problem.
select from_unixtime(unix_timestamp('01032018' ,'MMddyyyy'), 'yyyyMMdd');
input format: mmddyyyy
01032018
output after query: yyyymmdd
20180103
To help someone in the future:
The following expression should work, as it did in my case:
to_date(from_unixtime(UNIX_TIMESTAMP('10-APR-2014','dd-MMM-yyyy')))
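For reference, the unix_timestamp / from_unixtime round trip can be mimicked in plain Python. This is only an analogy under stated assumptions: UTC is used throughout (Hive uses its own time zone settings), the helper names to_epoch and from_epoch are hypothetical, and %b expects English month abbreviations.

```python
import calendar
from datetime import datetime, timezone

def to_epoch(text: str, fmt: str) -> int:
    """Rough analogue of unix_timestamp(text, pattern), assuming UTC."""
    return calendar.timegm(datetime.strptime(text, fmt).timetuple())

def from_epoch(seconds: int, fmt: str) -> str:
    """Rough analogue of from_unixtime(seconds, pattern), assuming UTC."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(fmt)

tran_time = "10-APR-2014 15:01"
# Keep the first 11 characters (the date part), as substr(tran_time,1,11) does.
epoch = to_epoch(tran_time[:11], "%d-%b-%Y")
print(from_epoch(epoch, "%Y%m%d"))  # 20140410
```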
unix_timestamp('2014-05-01','dd-mmm-yyyy') will work; for Hive, your input string should be in the format yyyy-mm-dd or yyyy-mm-dd hh:mm:ss.
Whereas you are trying with '01-MAY-2014', which Hive won't understand as a date string.