MongoDB adds 4 or 5 hours to a date field when inserting dates (YYYY-MM-DD) through DataStage. Is this normal behavior? - mongodb

Case #1: when I insert the date 2015-08-01, MongoDB shows it as 2015-08-01T04:00:00.000Z, where I expected 2015-08-01T00:00:00.000Z.
Case #2: when I insert the date 2016-01-01, MongoDB shows it as 2016-01-01T05:00:00.000Z, where I expected 2016-01-01T00:00:00.000Z.
Source data type used in DataStage: Date
Target data type specified in MongoDB: Date
Why does a 04:00:00 (or 05:00:00) time component appear even though the source has only YYYY-MM-DD, with no time component?
Can someone clarify this? Is this the usual behavior of MongoDB? Is there anything I can do to get T00:00:00.000Z at the target?
The requirement is not to convert the date into a string.
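A likely cause, though the question does not confirm it: MongoDB stores a Date as a UTC instant, so if the loading job interprets 2015-08-01 as midnight in a local zone that is 4 hours behind UTC in summer and 5 hours behind in winter (US Eastern time behaves this way), the stored values come out as 04:00Z and 05:00Z. The Python sketch below only illustrates that arithmetic. The fixed offsets and the function names are assumptions for illustration, and nothing here calls DataStage or MongoDB.

```python
from datetime import datetime, timedelta, timezone


def local_midnight_to_utc(date_text, utc_offset_hours):
    """Read YYYY-MM-DD as local midnight at a fixed UTC offset, then convert to UTC."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    local_midnight = datetime.strptime(date_text, "%Y-%m-%d").replace(tzinfo=local_zone)
    return local_midnight.astimezone(timezone.utc)


def utc_midnight(date_text):
    """Read YYYY-MM-DD directly as midnight UTC (what the question expects to see)."""
    return datetime.strptime(date_text, "%Y-%m-%d").replace(tzinfo=timezone.utc)


# Assumed offsets: -4 hours in August (daylight time), -5 hours in January.
summer = local_midnight_to_utc("2015-08-01", -4)
winter = local_midnight_to_utc("2016-01-01", -5)
wanted = utc_midnight("2015-08-01")

print(summer.isoformat())
print(winter.isoformat())
print(wanted.isoformat())
```

If this is what is happening, supplying the value as UTC midnight, or running the load with its time zone set to UTC, would store T00:00:00.000Z. Whether DataStage offers such a setting is not established by the question.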

Related

MongoDB - datetime type for mongosqld

I have a collection in MongoDB that has a Date field:
date: 2021-02-17T18:40:01.000+00:00
I want to expose this collection to a BI tool through the MongoDB BI Connector, mongosqld. I used mongodrdl to create the data model.
mongodrdl converts MongoDB's Date type to MySQL's timestamp type. When I read the date column in a BI application, the time is all zeros:
17/02/2021 00:00:00
This is a serious problem because I need the time. I tried editing the DRDL file generated by mongodrdl and setting the SqlType to datetime, but when I restart mongosqld I get the following error:
unable to create column "date" from drdl: unsupported SQL type: "datetime" on column "date"
How can I preserve the time for this date field so that it is properly exposed to BI tools?
date: 2021-02-17T18:40:01.000+00:00
This is a stringified (ISO 8601) representation of a timestamp. If this is how your timestamps are stored, they are of the wrong type (string, not timestamp) and hence produce zeroed SQL timestamps.
To fix this, store the timestamps as timestamps (BSON Date type).
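As a minimal sketch of the conversion this answer implies, assuming the values really are stored as ISO 8601 strings: in Python the string can be parsed into a timezone-aware datetime, which is the type PyMongo writes as a BSON Date. The function name is hypothetical and the step that writes the value back to MongoDB is left out.

```python
from datetime import datetime, timezone


def iso_string_to_datetime(text):
    """Parse an ISO 8601 string such as 2021-02-17T18:40:01.000+00:00."""
    return datetime.fromisoformat(text)


parsed = iso_string_to_datetime("2021-02-17T18:40:01.000+00:00")
print(type(parsed).__name__, parsed.isoformat())
```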

Copying timestamp format from avro to redshift

I am trying to copy an avro file to redshift using the COPY command. The file has a column that is of the type:
{'name': 'timestamp',
'type': ['null', {'logicalType': 'timestamp-millis', 'type': 'long'}]}],
Redshift variable type: "timestamp" timestamptz
When I run the following command, the copy fails:
COPY table_name
from 'fil_path.avro'
iam_role 'the_role'
FORMAT AS avro 'auto'
raw field value: 1581306474335
Invalid timestamp format or value [YYYY-MM-DD HH24:MI:SSOF]
However, if I add the following line, it works:
timeformat 'epochmillisecs'
I also tried storing my timestamp in microseconds, which I understood to be the base epoch resolution, but that fails as well, and I couldn't find an appropriate format name (epochmicrosecs didn't work).
My question is: why is this so?
Furthermore, another field is causing a problem: a date field that is apparently saved in the Avro file as a number of days (7305). It gives the following error:
Redshift variable type: "birthdate" date
avro: 'date_of_birth', 'type': ['null', {'type': 'int', 'logicalType': 'date'}]}
Invalid Date Format - length must be 10 or more
First, about the time format.
As the docs state:
COPY command attempts to implicitly convert the strings in the source data to the data type of the target column. If you need to specify a conversion that is different from the default behavior, or if the default conversion results in errors, you can manage data conversions by specifying the following parameters.
First solution:
Redshift does not recognize epoch time by default, so it cannot extract the year, month, day, and so on from it to build a TIMESTAMP. As the docs state:
If your source data is represented as epoch time, that is the number of seconds or milliseconds since January 1, 1970, 00:00:00 UTC, specify 'epochsecs' or 'epochmillisecs'.
These are the supported formats that Redshift can convert using automatic recognition.
A TIMESTAMP needs a date-time layout such as YYYYMMDD HHMISS (for example 19960108 040809) to be parsed correctly; that is what the error Invalid timestamp format or value [YYYY-MM-DD HH24:MI:SSOF] is saying. Epoch time is just a count of seconds or milliseconds since January 1, 1970, so Redshift cannot extract those fields from it unless told how.
Microseconds are not supported as a TIMEFORMAT parameter in Redshift.
Second solution:
You don't pass TIMEFORMAT to the COPY command; instead, load the epoch time into your staging tables as VARCHAR or TEXT.
Then, when inserting the epoch time from your staging tables into the schema tables, convert it like this (the cast is needed because the staging column is text, and the integer division drops the millisecond part): TIMESTAMP 'epoch' + CAST(epoch_time AS BIGINT)/1000 * INTERVAL '1 second' AS time
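To check what the conversion should produce, the same epoch arithmetic can be sketched in Python using the raw value from the error message (1581306474335). The function name is an assumption for illustration; this is not Redshift code.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_millis_to_utc(epoch_millis):
    """Add a millisecond count to the Unix epoch, mirroring TIMESTAMP 'epoch' + interval."""
    return UNIX_EPOCH + timedelta(milliseconds=epoch_millis)


converted = epoch_millis_to_utc(1581306474335)
print(converted.isoformat())
```

The value lands on 2020-02-10 at 03:47:54.335 UTC, which is a plausible event time and suggests the field really is in milliseconds.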
Second, about the date field.
As the docs state, the DATE data type is a calendar date (year, month, day). COPY therefore expects a string such as 2021-03-04, which is 10 characters long, and cannot load a bare number of days; that is what the error Invalid Date Format - length must be 10 or more tells us.
The solution for the date field:
You need a workaround: load the number of days into your staging tables as VARCHAR or TEXT.
When loading the schema tables from the staging tables, convert the number of days to a DATE by adding it to the epoch date. The Avro date logical type counts days since 1970-01-01, so something like: CAST(DATEADD(day, CAST(number_of_days AS INT), '1970-01-01') AS DATE)
As a result, the number of days becomes a valid DATE in your schema tables.
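The day-count conversion can be verified outside Redshift. This Python sketch assumes, per the Avro date logical type, that the integer is a number of days since 1970-01-01; the function name is hypothetical.

```python
from datetime import date, timedelta

UNIX_EPOCH_DATE = date(1970, 1, 1)


def avro_days_to_date(days_since_epoch):
    """Convert an Avro 'date' logical value (days since 1970-01-01) to a calendar date."""
    return UNIX_EPOCH_DATE + timedelta(days=days_since_epoch)


birthdate = avro_days_to_date(7305)
print(birthdate.isoformat())
```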

Hive datatype confusion

I have a large data set in which one field looks like Wed Sep 15 19:17:44 +0100 2010, and I need to insert that field into Hive.
I am having trouble choosing the data type. I tried both timestamp and date, but I get null values when loading from a CSV file.
The data type is a String as it is text. If you want to convert it, I would suggest a TIMESTAMP. However you will need to do this conversion yourself while loading the data or (even better) afterwards.
To convert to a timestamp, you can use the following syntax:
CAST(FROM_UNIXTIME(UNIX_TIMESTAMP(<date_column>,'FORMAT')) as TIMESTAMP)
Your format seems complex, though. My suggestion is to load it as a string and then run a simple query on the first record until you get the format working.
SELECT <date_column> as string_representation,
CAST(FROM_UNIXTIME(UNIX_TIMESTAMP(<date_column>,'FORMAT')) as TIMESTAMP) as timestamp_representation
FROM your_table
LIMIT 1
You can find more information on the format here: http://docs.oracle.com/javase/6/docs/api/java/text/SimpleDateFormat.html
My advice would be to concatenate some substrings first and try to convert only the day, month, and year part before you look at the time, time zone, and so on.
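For the sample value, a SimpleDateFormat pattern along the lines of 'EEE MMM dd HH:mm:ss Z yyyy' should line up with the fields, though it has not been run against Hive here. As a quick way to confirm the field order before loading, the same layout can be parsed in Python; the strptime pattern below is the Python equivalent, not Hive syntax, and it assumes English day and month names.

```python
from datetime import datetime, timedelta, timezone

# Python counterpart of the layout: weekday, month name, day, time, UTC offset, year.
TWITTER_STYLE_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_sample(text):
    """Parse a value such as 'Wed Sep 15 19:17:44 +0100 2010' into an aware datetime."""
    return datetime.strptime(text, TWITTER_STYLE_FORMAT)


parsed = parse_sample("Wed Sep 15 19:17:44 +0100 2010")
print(parsed.isoformat())
print(parsed.astimezone(timezone.utc).isoformat())
```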

Tableau is reading my dates wrong

Tableau is reading my dates wrong. I have two columns: Date, and a number for each day.
The date format is "yyyymmdd" (e.g. 20160617) and the per-day number is an integer. I am fetching this data directly from SQL Server, and my problem is that Tableau reads my dates wrong.
So I tried DATEPARSE() to convert my date.
My DATEPARSE function is: DATEPARSE("yyyymmdd","Date"). After applying it, I get NULL for my dates.
Can anyone help me understand why I get NULL for the dates? My query returns 30 days of data, broken down into a per-day count.
Sample after running the query on SQL
Date Per day number
20160617 215674
Tableau does not accept this date format, so I applied DateParse(), which I guess is not producing a valid date since my dates come back null. Ideally I would like to get the correct date so I can apply a trend line to my data.
Thanks in advance.
Cheers!
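You aren't using DateParse() correctly. The second parameter, which you have as "Date", should be the field you want parsed, not a string literal. So, for example, if you store 20160617 in a field called my_date_as_integer, your function should be DateParse("yyyyMMdd", STR([my_date_as_integer])). Note the uppercase MM for month (lowercase mm means minutes) and the STR() conversion, since DateParse expects a string rather than an integer.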
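The same two steps, converting the integer to text and then parsing it with a year-month-day pattern, can be checked outside Tableau. This Python sketch is only an illustration of the logic; the function name is hypothetical and %Y%m%d is Python's pattern syntax, not Tableau's.

```python
from datetime import date, datetime


def yyyymmdd_int_to_date(value):
    """Turn an integer like 20160617 into a calendar date by parsing its digits."""
    return datetime.strptime(str(value), "%Y%m%d").date()


sample = yyyymmdd_int_to_date(20160617)
print(sample.isoformat())
```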

Invalid date value in csv file

I export data in CSV format from a SQL Server database. It contains 5 columns, one of which holds date and time values. When I checked those values, I found that some are in the wrong format. I added a filter, but the filter is not applied to some of the data. I tried to format all the values the same way, but the formatting is not applied either. I have tried everything I can think of, but the issue is not fixed.
I have attached sample data below; please take a look.
7/12/2013 14:50
8/12/2013 20:14
9/12/2013 11:38
10/12/2013 15:31
13/12/2013 12:45:50
13/12/2013 14:35:42
13/12/2013 14:37:40
14/12/2013 17:00:10
18/12/2013 14:57:35
The values from 13/12/2013 12:45:50 onward do not change when I apply a date-time format.
The trouble is that your dates are in French format (dd/mm/yyyy). You can force them to datetime with the following lines:
[datetime]::ParseExact("7/12/2013 14:50", "d/MM/yyyy HH:mm", $null)
[datetime]::ParseExact("13/12/2013 12:45:50", "d/MM/yyyy HH:mm:ss", $null)
Be careful: in your case, some values include seconds, and some have a double space between the date and the time.
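The answer above uses PowerShell. As a sketch of the same idea in Python, assuming the column only ever contains the two layouts shown in the sample (with and without seconds), each value can be tried against both day-first patterns after collapsing repeated spaces. The function and constant names are hypothetical.

```python
from datetime import datetime

# Day-first layouts seen in the sample data, with and without seconds.
DAY_FIRST_FORMATS = ("%d/%m/%Y %H:%M:%S", "%d/%m/%Y %H:%M")


def parse_day_first(text):
    """Parse a dd/mm/yyyy date-time, tolerating extra spaces and optional seconds."""
    normalized = " ".join(text.split())  # collapse double spaces
    for fmt in DAY_FIRST_FORMATS:
        try:
            return datetime.strptime(normalized, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized date-time value: {text!r}")


without_seconds = parse_day_first("7/12/2013 14:50")
with_seconds = parse_day_first("13/12/2013  12:45:50")
print(without_seconds.isoformat(), with_seconds.isoformat())
```

Values with a day above 12, such as 13/12/2013, cannot be misread as month-first, which may explain why only those rows fail to reformat in a tool that assumes mm/dd/yyyy.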