Is the Presto from_unixtime function correct?

I am converting a bigint to a timestamp; the value is 1494257400.
I want to use a Presto query, but Presto does not return the correct result for the from_unixtime() function.
Hive version:
select from_unixtime(1494257400) result : '2017-05-09 00:30:00'
Presto version:
select from_unixtime(1494257400) result : '2017-05-08 08:30:00'
Hive gives the correct result, but Presto does not. How can I solve this?

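Presto's from_unixtime and Hive's from_unixtime return the same instant rendered in different time zones: Hive's uses the local time zone of the system it runs on, while Presto's does not (it uses UTC or whatever time zone the Presto session is configured with).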
According to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF, from_unixtime:
Converts the number of seconds from unix epoch (1970-01-01 00:00:00
UTC) to a string representing the timestamp of that moment in the
current system time zone in the format of "1970-01-01 00:00:00".
Hive's output is arguably misleading, because an ISO-formatted string that is not in GMT+00 should carry its time zone offset, and this one does not.
With Hive, you can use to_utc_timestamp({any primitive type} ts, string timezone) to convert your timestamp to the proper time zone. See the manual linked above.
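To see that the two results in the question are the same instant, here is a minimal Python sketch that renders the epoch value at a few fixed UTC offsets. The `render` helper and the offsets +9 and -7 are assumptions for illustration: they are inferred from the arithmetic (the value is 15:30:00 UTC), not stated anywhere in the question.

```python
from datetime import datetime, timedelta, timezone

EPOCH_SECONDS = 1494257400  # the value from the question


def render(epoch_seconds: int, utc_offset_hours: int) -> str:
    """Format an epoch value as 'YYYY-MM-DD HH:MM:SS' at a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch_seconds, tz).strftime("%Y-%m-%d %H:%M:%S")


utc_text = render(EPOCH_SECONDS, 0)      # the instant in UTC
plus9_text = render(EPOCH_SECONDS, 9)    # assumed offset matching the Hive output
minus7_text = render(EPOCH_SECONDS, -7)  # assumed offset matching the Presto output

print(utc_text)
print(plus9_text)
print(minus7_text)
```

If those offsets are right, neither engine is wrong about the instant; each one prints it in the time zone it is configured to use.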

Related

Azure Data factory - data flow expression date and timestamp conversions

Using a derived column, I am adding 3 columns: 2 date columns and 1 timestamp column. For the date columns I pass a string as a parameter, e.g. 21-11-2021, and for the timestamp I use the currentTimestamp() function.
I wrote expressions in the derived column to convert them to the date and timestamp datatypes, in the formats the target table needs: dd-MM-yyyy and dd-MM-yyyy HH:mm:ss respectively.
For the date:
expression used: toDate($initialdate, 'dd-MM-yyyy')
Data preview output: 2021-01-21 -- not in the format I want
After the pipeline debug run, the value in the target DB (Azure SQL Database) column is:
2021-01-21T00:00:00 -- the table shows it like this, and I don't understand why
For the timestamp conversion:
Expression used:
toTimestamp(toString(currentTimestamp(), 'dd-MM-yyyy HH:mm:ss', 'Europe/Amsterdam'), 'dd-MM-yyyy HH:mm:ss')
Data preview output: 2021-11-17 19:37:04 -- not in the format I want
After the pipeline debug run, the value in the target DB (Azure SQL Database) column is:
2021-11-17T19:37:04:932 -- the table shows it like this, and I don't understand why
Question 1: I am NOT getting values in the format the target requires. The columns must be of the DATE and DATETIME2 datatypes respectively, so string conversions are not an option.
Question 2: After the debug run, why do the inserted table values look different from the data preview?
Kindly let me know if any of my expressions are wrong.
-- Apologies, I am not able to post pictures --
toDate() converts an input date string to a date, with yyyy-[M]M-[d]d as the default format. Accepted formats are: [ yyyy, yyyy-[M]M, yyyy-[M]M-[d]d, yyyy-[M]M-[d]dT* ].
The same goes for toTimestamp(): its default pattern is yyyy-[M]M-[d]d hh:mm:ss[.f...].
In Azure SQL Database, too, the default date and datetime2 formats are YYYY-MM-DD and YYYY-MM-DD HH:mm:ss.
If your column datatypes are string (varchar), however, you can change the output format of the date and datetime values in the Azure data flow mappings.
When loaded into Azure SQL Database, the values then appear in that output format.
Note: This format results only when the datatype is varchar.
If the datatype is date in the Azure SQL database, you can convert the values to the required format in a query, using a date conversion such as:
select id, col1, date1, convert(varchar(10),date1,105) as 'dd-MM-YYYY' from test1
Azure SQL Database always uses the UTC time zone. Use AT TIME ZONE to convert a value to another, non-UTC time zone.
select getdate() as a, getdate() AT TIME ZONE 'UTC' AT TIME ZONE 'Central Standard Time' as b
You can also refer to sys.time_zone_info view to check current UTC offset information.
select * from sys.time_zone_info
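The underlying point, that a date value has no display format of its own and a pattern such as dd-MM-yyyy only matters when converting to or from a string, can be illustrated outside Data Factory. This is a minimal Python sketch, not Data Flow expression syntax; `parse_ddmmyyyy` is a hypothetical helper name.

```python
from datetime import date, datetime


def parse_ddmmyyyy(text: str) -> date:
    """Parse a 'dd-MM-yyyy' string into a date value (which carries no format)."""
    return datetime.strptime(text, "%d-%m-%Y").date()


initial = parse_ddmmyyyy("21-11-2021")

# The default rendering of a date is ISO-like, whatever pattern was used to parse it.
default_text = initial.isoformat()

# A specific layout exists only once the date is turned back into a string.
target_text = initial.strftime("%d-%m-%Y")

print(default_text)
print(target_text)
```

Here `default_text` plays the role of what a preview tool or database client shows for a date column, and `target_text` corresponds to what a varchar column or a convert(varchar(10), ..., 105) call produces.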

Can the as400 timestamp (yyyy-MM-dd-HH.mm.ssssss) be used for querying irrespective of the device time format?

The date format of the machine is yyyy/mm/dd HH:mm:ss.
My input time format is (yyyy-MM-dd-HH.mm.ssssss), for example 1988-12-25-17.12.30.000000. Can this format be used to query logs from the HISTORY_LOG_INFO table function, irrespective of the date format set on the machine?
Example query: SELECT * FROM TABLE(HISTORY_LOG_INFO( START_TIME => '2021-02-22-09.35.16.508075'))WHERE MESSAGE_ID IS NOT NULL
Based on the documented default for end-time, that formatting looks correct:
start-time
A timestamp expression that indicates the starting timestamp to use when returning history log information.
If this parameter is omitted, the default of CURRENT DATE - 1 DAY is used.
end-time
A timestamp expression that indicates the ending timestamp to use when returning history log information.
If this parameter is omitted, the default of '9999-12-30-00.00.00.000000' is used.
https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_74/rzajq/rzajqudfhistoryloginfo.htm
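If the START_TIME string is built by a client program, a fixed pattern keeps it independent of the machine's locale settings. Here is a minimal Python sketch that produces and parses the timestamp layout used in the question; the constant and function names are hypothetical, and the pattern is inferred from the example values above rather than taken from IBM documentation.

```python
from datetime import datetime

# Layout inferred from '2021-02-22-09.35.16.508075' in the question.
IBM_I_FORMAT = "%Y-%m-%d-%H.%M.%S.%f"


def to_ibm_i_timestamp(moment: datetime) -> str:
    """Render a datetime as 'yyyy-MM-dd-HH.mm.ss.ffffff' (six fractional digits)."""
    return moment.strftime(IBM_I_FORMAT)


def from_ibm_i_timestamp(text: str) -> datetime:
    """Parse a string in the same layout back into a datetime."""
    return datetime.strptime(text, IBM_I_FORMAT)


start_time = to_ibm_i_timestamp(datetime(2021, 2, 22, 9, 35, 16, 508075))
round_trip = from_ibm_i_timestamp("1988-12-25-17.12.30.000000")

print(start_time)
print(round_trip)
```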

Timestamp converted to UTC while loading to parquet

I am loading data to parquet through spark.
dataFrame.write.parquet(path)
My data has a timestamp column. While writing to Parquet, Spark converts the timestamp to the UTC time zone and then stores it.
actual time ------- 2020-10-21 00:00:00.000
UTC time--------- 2020-10-21T05:30:00.000+05:30
I see that the Spark conf spark.sql.session.timeZone is set to UTC.
Is there any way to turn off this conversion?
I want to load the timestamp as is, without converting it to any other time zone. How do I do that?
See documentation here:
https://databricks.com/blog/2020/07/22/a-comprehensive-look-at-dates-and-timestamps-in-apache-spark-3-0.html
When writing timestamp values out to non-text data sources like Parquet, the values are just instants (like timestamp in UTC) that have no time zone information. If you write and read a timestamp value with a different session time zone, you may see different values of the hour, minute, and second fields, but they are the same concrete time instant.
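The "same concrete time instant" idea can be checked with the two values from the question. This is a minimal Python sketch, not Spark code; it assumes the "actual time" was meant as UTC and that the second value was rendered at a fixed +05:30 offset.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the reader's zone is a fixed +05:30 offset, as in the question's output.
PLUS_0530 = timezone(timedelta(hours=5, minutes=30))

# Assumption: the "actual time" in the question is a UTC wall-clock value.
stored = datetime(2020, 10, 21, 0, 0, 0, tzinfo=timezone.utc)

# Rendering the same instant in another zone changes the fields, not the instant.
shown = stored.astimezone(PLUS_0530)
shown_text = shown.isoformat(timespec="milliseconds")
same_instant = stored == shown

print(shown_text)
print(same_instant)
```

Under those assumptions nothing was lost in the write; the hour and minute fields differ only because the value is displayed in a different zone than the one it was written in.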

indexing timestamptz for specific timezone

My console is PST.
Database server and times stored are GMT.
I'm having to run queries like so:
SELECT x,y,z
FROM tbl_msg
WHERE (msg_datetime AT TIME ZONE 'BST') BETWEEN '2016-11-21'::date and '2016-11-22'::date;
Indexing 101 says that performing this operation on msg_datetime will now avoid the index and this is what I'm seeing.
So I need advice with an indexing solution for this.
Can I index this time zone expression, or alter the query so that it takes these times in BST and converts them to GMT?
Your msg_datetime column should be of type timestamp with time zone (or its shorter alias timestamptz), with a normal index.
Then, to get the data for these 2 days:
set timezone to 'Europe/London'; -- once, at connection start
SELECT x,y,z
FROM tbl_msg
WHERE
msg_datetime>='2016-11-21 00:00:00'
and
msg_datetime<'2016-11-23 00:00:00';
You should not use an ordinary timestamp, as it stores a literal date and time without any information about which time zone was meant. The timestamp with time zone type automatically converts values from your client's configured time zone to the internal representation (which is in UTC) and back. You can also write a timestamptz in a non-default time zone, for example '2016-11-23 00:00:00 Asia/Tokyo'.
You should also not use BST, because you would need to use GMT in winter and remember when to use which. Use 'Europe/London' or another "city" time zone, which is correct in both summer and winter.
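To make the BST pitfall concrete, here is a minimal Python sketch that computes the half-open UTC range covering the two local days from the query, once at UTC+0 (what London actually observes in November) and once at UTC+1 (what the abbreviation BST implies). The `utc_bounds` helper and the fixed offsets are assumptions for illustration; a named zone such as 'Europe/London' would pick the right offset for each date by itself.

```python
from datetime import date, datetime, time, timedelta, timezone


def utc_bounds(first_day: date, last_day: date, utc_offset_hours: int):
    """Return the half-open [start, end) range in UTC covering whole local days."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    start = datetime.combine(first_day, time.min, tzinfo=tz)
    end = datetime.combine(last_day + timedelta(days=1), time.min, tzinfo=tz)
    return start.astimezone(timezone.utc), end.astimezone(timezone.utc)


first, last = date(2016, 11, 21), date(2016, 11, 22)

gmt_start, gmt_end = utc_bounds(first, last, 0)  # London in winter
bst_start, bst_end = utc_bounds(first, last, 1)  # what 'BST' would imply

print(gmt_start, gmt_end)
print(bst_start, bst_end)
```

The two ranges are shifted by an hour, so a query that hard-codes BST in November would include an hour of rows from the previous day and miss the last hour of the final day.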

How to convert a postgres timestamp to a string with the timezone as numeric offset?

My query needs to format my timestamp without time zone columns into a format which in Java I can obtain like this:
SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZZZZZ")
For example:
2015-12-21T16:29:07.000-06:00 or 2016-04-18T10:10:09.000+0000
Now, I know the format wants a timezone, but it's OK to assume timezone = UTC.
What I tried:
select to_char(updated_at, 'yyyy-MM-dd''T''HH:mm:ss.SSSZZZZZZ') from teams
2016-04-07'T'05:04:47.47SZZZZZZ
select to_char(updated_at::TIMESTAMPTZ, 'yyyy-MM-dd''T''HH:mm:ss.SSSZZZZZZ') from teams
2016-04-07'T'05:04:47.47SZZZZZZ
I also tried Postgres' 'tz' and 'TZ' patterns, but it just seems unable to output a numeric time zone:
select to_char(updated_at::TIMESTAMPTZ, 'yyyy-MM-dd''T''HH:mm:ss.SS TZ') from teams
2016-04-07'T'05:04:47.47 BST
I just can't get the numeric representation of the timezone, is this intentional? :(
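One thing the attempts above show is that PostgreSQL's to_char uses its own template patterns (YYYY, MM, DD, HH24, MI, SS, MS) rather than SimpleDateFormat letters, which is why 'mm' printed the month and '.SS' printed the seconds a second time. For reference, the target string shape itself can be sketched in Python as follows. This is an illustration of the desired output, not a PostgreSQL solution; `to_iso_with_numeric_offset` is a hypothetical helper, and it assumes the naive value is in UTC, as the question allows.

```python
from datetime import datetime, timezone


def to_iso_with_numeric_offset(naive_utc: datetime) -> str:
    """Format a naive datetime, assumed to be UTC, as yyyy-MM-ddTHH:mm:ss.SSS+0000."""
    aware = naive_utc.replace(tzinfo=timezone.utc)
    millis = aware.microsecond // 1000  # truncate microseconds to milliseconds
    return (
        aware.strftime("%Y-%m-%dT%H:%M:%S")
        + f".{millis:03d}"
        + aware.strftime("%z")  # numeric offset, e.g. +0000
    )


formatted = to_iso_with_numeric_offset(datetime(2016, 4, 18, 10, 10, 9))
print(formatted)
```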