PySpark: converting unix time to date

I am using the following code to convert a column of unix time values into dates in pyspark:
transactions3=transactions2.withColumn('date', transactions2['time'].cast('date'))
The column transactions2['time'] contains the unix time values. However the column date which I create here has no values in it (date = None for all rows). Any idea why this would be?

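Use from_unixtime, for example expr("from_unixtime(timeval)"), where timeval stands for the column holding the unix time in seconds (time in the question). Casting a numeric column straight to date does not interpret it as seconds since the epoch, which is consistent with every row coming back as None.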
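As a rough illustration of what such a conversion does, here is a minimal sketch in plain Python (standard library only, not the Spark API). It assumes the values are unix times in seconds and interprets them in UTC; the helper name unix_to_date is hypothetical. Spark's from_unixtime uses the session time zone, so results close to midnight may differ from this sketch.

```python
from datetime import date, datetime, timezone


def unix_to_date(seconds):
    """Convert unix time in seconds to a calendar date in UTC.

    Hypothetical helper for illustration; None is passed through unchanged.
    """
    if seconds is None:
        return None
    return datetime.fromtimestamp(seconds, tz=timezone.utc).date()


print(unix_to_date(1574252593))  # 2019-11-20
print(unix_to_date(0))           # 1970-01-01
```

If the column actually holds milliseconds rather than seconds, the values would need to be divided by 1000 first; otherwise the resulting dates land far in the future.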

Related

Spark Scala - convert Timestamp with milliseconds to Timestamp without milliseconds

I have a column in Timestamp format that includes milliseconds.
I would like to reformat my timestamp column so that it does not include milliseconds. For example if my Timestamp column has values like 2019-11-20T12:23:13.324+0000, I would like my reformatted Timestamp column to have values of 2019-11-20T12:23:13
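Is there a straightforward way to perform this operation in spark-scala? I have found lots of posts on converting string to timestamp but not for changing the format of a timestamp.
You can try date_trunc with the unit "second". Note that trunc itself only truncates dates to coarse units such as year or month.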
See more examples: https://sparkbyexamples.com/spark/spark-date-functions-truncate-date-time/
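To make the idea of truncating to whole seconds concrete, here is a small sketch in plain Python (standard library only, not Spark or Scala). It parses the example value from the question and zeroes the sub-second part; the helper name drop_fraction is hypothetical.

```python
from datetime import datetime


def drop_fraction(ts):
    """Return the same timestamp with the sub-second part set to zero."""
    return ts.replace(microsecond=0)


# Example value from the question: milliseconds plus a +0000 offset.
ts = datetime.strptime("2019-11-20T12:23:13.324+0000", "%Y-%m-%dT%H:%M:%S.%f%z")
truncated = drop_fraction(ts)

print(truncated.strftime("%Y-%m-%dT%H:%M:%S"))  # 2019-11-20T12:23:13
```

The value stays a timestamp throughout; only the fractional seconds are discarded, which is different from formatting it as a string.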

How to generate current_timestamp() without timezone in Pyspark?

I am trying to get the current_timestamp in a column in my dataframe. I am using the code below for that.
df_new = df.withColumn('LOAD_DATE_TIME' , F.current_timestamp())
But this code generates load_date_time in the format below when exported to a CSV file.
2019-11-19T16:59:44.000+05:30
I don't want the timezone part and want the datetime in the format below.
2019-11-19 16:59:44

compare extracted date with today() in excel

Column 1 : I have this date-time format in one column = 2018-10-08T04:30:23Z
Column 3 : I extracted the date with the formula =LEFT(A11,10) and changed the column format to date.
Column 32 : today(). Just to make sure both date columns match
Now when I want to compare both dates
Column 4 : =IF(C11=D11,TRUE(),FALSE())
It does not work. What did I do wrong?
One option using formulas only would be to use Excel's DATE function, which takes three parameters:
=DATE(YEAR, MONTH, DAY)
Use the following formula to extract a date from your timestamp:
=DATE(LEFT(A1,4), MID(A1,6,2), MID(A1,9,2))
This assumes that the timestamp is in cell A1, with the format in your question. Now, comparing this date value against TODAY() should work, if the original timestamp were also from today.
Should be worth trying:
=1*LEFT(A1,10)=TODAY()
May depend upon your configuration. Without format conversion (the 1*) you are trying to compare text (all string functions return Text) with a Number.
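The text-versus-date mismatch described above is not specific to Excel. Here is a minimal sketch in plain Python (standard library only) showing that a text prefix never equals a date value until it is parsed; the helper name extract_date is hypothetical.

```python
from datetime import date


def extract_date(timestamp_text):
    """Parse the first 10 characters (YYYY-MM-DD) into a real date value."""
    return date.fromisoformat(timestamp_text[:10])


stamp = "2018-10-08T04:30:23Z"
target = date(2018, 10, 8)

text_matches = stamp[:10] == target            # text compared with a date
parsed_matches = extract_date(stamp) == target  # date compared with a date

print(text_matches)    # False
print(parsed_matches)  # True
```

Both Excel formulas above play the role of extract_date here: they turn the extracted text into a real date value before the comparison with TODAY().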

How to get Hours and Minute using Extract function as single result in Postgresql

I have a timestamp with timezone column in one of my tables. I need to extract both hours and minutes from the timestamp with timezone column using the extract function, but I am unable to.
I tried like this,
extract(hour_minute from immi_referral_user_tb.completed_time) >= '06:30'
but I am getting a syntax error.
I am using extract function in where clause
immi_referral_user_tb.completed_time = timestamp with timezone column
Is there any other way to accomplish this?
You can cast the column to a time data type and compare that to a time value:
immi_referral_user_tb.completed_time::time >= time '06:30'
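The same idea, comparing only the time-of-day part against a cutoff, can be sketched in plain Python (standard library only, not PostgreSQL). The helper name completed_at_or_after is hypothetical, and the sketch uses the timestamp's own wall-clock time, whereas PostgreSQL converts a timestamp with time zone to the session time zone before taking the time.

```python
from datetime import datetime, time


def completed_at_or_after(ts, cutoff=time(6, 30)):
    """True when the time-of-day part of ts is at or after the cutoff."""
    return ts.time() >= cutoff


print(completed_at_or_after(datetime(2020, 1, 1, 6, 30)))      # True
print(completed_at_or_after(datetime(2020, 1, 1, 6, 29, 59)))  # False
```

Comparing a single time value avoids extracting hours and minutes separately and combining them by hand.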

Date Format Conversion in Hive

I'm very new to sql/hive. At first, I loaded a txt file into hive using:
drop table if exists Tran_data;
create table Tran_data(tran_time string,
resort string, settled double)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n';
Load data local inpath 'C:\Users\me\Documents\transaction_data.txt' into table Tran_Data;
The variable tran_time in the txt file looks like this: 10-APR-2014 15:01. After loading this Tran_data table, I tried to convert tran_time to a "standard" format so that I can join this table to another table using tran_time as the join key. The desired date format is 'yyyymmdd'. I searched online resources and found this: unix_timestamp(substr(tran_time,1,11),'dd-MMM-yyyy')
So essentially, I'm doing this: unix_timestamp('10-APR-2014','dd-MMM-yyyy'). However, the output is "NULL".
So my question is: how to convert the date format to a "standard" format, and then further convert it to 'yyyymmdd' format?
from_unixtime(unix_timestamp('20150101' ,'yyyyMMdd'), 'yyyy-MM-dd')
My current Hive Version: Hive 0.12.0-cdh5.1.5
I converted datetime in first column to date in second column using the below hive date functions. Hope this helps!
select inp_dt, from_unixtime(unix_timestamp(substr(inp_dt,0,11),'dd-MMM-yyyy')) as todateformat from table;
inp_dt todateformat
12-Mar-2015 07:24:55 2015-03-12 00:00:00
The unix_timestamp function converts a date string in a given format to a unix timestamp in seconds, but not a format like dd-mm-yyyy.
You need to write your own custom UDF to convert a given string date to the format that you need, as Hive at present does not have a predefined function for this. There is a to_date function to convert a timestamp to a date, but the remaining unix_timestamp functions won't help with your problem.
select from_unixtime(unix_timestamp('01032018' ,'MMddyyyy'), 'yyyyMMdd');
input format: mmddyyyy
01032018
output after query: yyyymmdd
20180103
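The parse-then-reformat pattern used in these answers can be sketched in plain Python (standard library only, not Hive). The helper name reformat_date is hypothetical, the format codes are Python's rather than Hive's pattern letters, and the month-name parsing assumes an English or C locale.

```python
from datetime import datetime


def reformat_date(text, in_fmt, out_fmt):
    """Parse text with in_fmt, then render it with out_fmt."""
    return datetime.strptime(text, in_fmt).strftime(out_fmt)


# First 11 characters of the question's value, i.e. '10-APR-2014'.
from_question = reformat_date("10-APR-2014 15:01"[:11], "%d-%b-%Y", "%Y%m%d")

# Same input and output as the MMddyyyy example above.
from_answer = reformat_date("01032018", "%m%d%Y", "%Y%m%d")

print(from_question)  # 20140410
print(from_answer)    # 20180103
```

The input pattern has to describe the string as it actually appears; the output pattern is independent of it.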
To help someone in the future:
The following function should work as it worked in my case
to_date(from_unixtime(UNIX_TIMESTAMP('10-APR-2014','dd-MMM-yyyy')))
unix_timestamp('2014-05-01','dd-mmm-yyyy') will work; your input string should be in the format Hive expects, yyyy-mm-dd or yyyy-mm-dd hh:mm:ss.
Whereas you are trying with '01-MAY-2014', which Hive won't understand as a date string.