I have a QueryDatabaseTableRecord processor that reads data from an Oracle table.
In my Oracle table, I have the following data:
id,name,bday
1,sachith,17-SEP-1990
2,nalaka,16-MAR-2020
When I run it and fetch the data, the date column comes out as a bigint (epoch milliseconds):
1,sachith,653523824000
2,nalaka,1584311083000
In the Record Writer (CSV) I set Date Format: yyyy-mm-dd.
But it is still not working. Do I have to use an intermediate UpdateRecord processor and update the date fields as described here?
Edit:
After doing some research, I added an UpdateRecord processor with
/my_date_column : ${field.value:format("yyyy-MM-dd HH:mm:ss.SSS")}
But this fails with negative epoch values.
Error: Could not implicitly convert input to Date -104697000000
How can I handle this?
This statement should work:
${field.value:toDate():format('yyyy-MMM-dd')}
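For the negative-epoch failure mentioned in the edit: once the value goes through toDate(), pre-1970 epoch milliseconds are valid dates. A minimal plain-Java sketch of the same conversion (outside NiFi; the class name and UTC time zone are my own assumptions) shows that negative epoch milliseconds format without error:
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class EpochToDate {
    public static void main(String[] args) {
        // Epoch-millisecond values as emitted by QueryDatabaseTableRecord,
        // including the negative (pre-1970) value from the error message.
        long[] epochMillis = {653523824000L, 1584311083000L, -104697000000L};

        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MMM-dd");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // assumed time zone

        for (long millis : epochMillis) {
            // java.util.Date accepts negative epoch values, so dates
            // before 1970-01-01 format just like any other date.
            System.out.println(millis + " -> " + fmt.format(new Date(millis)));
        }
    }
}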
I am trying to convert a string value (2022-07-24T07:04:27.5765591Z) into a datetime/timestamp so I can insert it into a SQL table as datetime without losing anything down to the milliseconds. The string I am providing is actually a datetime, and my source is an ADLS CSV. I tried the options below in a data flow.
Using Projection -> changed the data type for the specific column to timestamp with format type yyyy-MM-dd'T'HH:mm:ss.SSS'Z', however I am getting NULL in the output.
Derived column -> tried the expressions below but I am getting NULL values in the output:
toTimestamp(DataLakeModified_DateTime,'%Y-%m-%dT%H:%M:%s%z')
toTimestamp(DataLakeModified_DateTime,'yyyy-MM-ddTHH:mm:ss:fffffffK')
toTimestamp(DataLakeModified_DateTime,'yyyy-MM-dd HH:mm:ss.SSS')
I want the same value in the output:
2022-07-24T07:04:27.5765591Z (coming in as a string) to 2022-07-24T07:04:27.5765591Z (in a datetime format that the SQL database will accept).
I have tried to reproduce the issue and I get the same result, i.e., NULL values for the yyyy-MM-dd'T'HH:mm:ss.SSS'Z' timestamp format. The issue is with the string format you are providing in the source. ADF does not accept the given string as a timestamp and hence returns NULL.
But if you try a slightly different value, for example keeping only 3 digits before the Z in that last format, it converts into a timestamp and does not return NULL.
This is what I have tried: I kept one timestamp as per your given data and another with that modification. Refer to the image below.
This returns NULL for the first one and a datetime for the second.
But the format you are looking for is still missing. With the existing source data, yyyy-MM-dd'T'HH:mm:ss works fine. This format is also accepted by SQL tables. I have tried it and it works fine.
Try using toString instead of toTimestamp, and use it to create your desired timestamp:
toString(DataLakeModified_DateTime, 'yyyy-MM-dd HH:mm:ss:SS')
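A side note on why those seven fractional digits trip up the SSS pattern: a millisecond-only pattern cannot consume them, while a generic ISO 8601 parser keeps the full precision. A small java.time sketch (purely illustrative, not ADF internals; the class name is made up):
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class FractionalSeconds {
    public static void main(String[] args) {
        String value = "2022-07-24T07:04:27.5765591Z";

        // A millisecond-only pattern stops after three fractional digits,
        // so the literal 'Z' never matches and parsing fails -- the same
        // mismatch that shows up as NULL in the data flow.
        try {
            DateTimeFormatter millisOnly =
                    DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");
            LocalDateTime.parse(value, millisOnly);
        } catch (DateTimeParseException e) {
            System.out.println("SSS pattern failed: " + e.getMessage());
        }

        // A full ISO 8601 instant parser keeps all seven fractional digits.
        Instant instant = Instant.parse(value);
        System.out.println("Parsed instant: " + instant);
    }
}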
I have a collection in MongoDB that has a Date field:
date: 2021-02-17T18:40:01.000+00:00
I want to expose this collection to a BI tool through the MongoDB BI Connector, mongosqld. I used mongodrdl to create the data model.
mongodrdl converts MongoDB's Date type to MySQL's timestamp type. When I read the date column in a BI application, the time portion is all zeros:
17/02/2021 00:00:00
This is catastrophic because I need the time. I tried editing the DRDL generated by mongodrdl and setting the SqlType to datetime, but when I restart mongosqld I get the following error:
unable to create column "date" from drdl: unsupported SQL type: "datetime" on column "date"
How can I preserve the time so this date field is properly exposed to BI tools?
date: 2021-02-17T18:40:01.000+00:00
This is a stringified (ISO 8601) representation of a timestamp. If this is how your timestamps are stored, they are of the wrong type (string, not timestamp) and hence produce zeroed SQL timestamps.
To fix, store the timestamps as timestamps (BSON date type).
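If the documents are written from Java, a minimal sketch of the difference (using org.bson.Document from the MongoDB Java driver; the field value and class name are only illustrative):
import org.bson.Document;
import java.util.Date;

public class BsonDateVsString {
    public static void main(String[] args) {
        // Stored as a BSON date: mongosqld can expose it as a SQL timestamp
        // with the time portion intact.
        Document asDate = new Document("date", new Date());

        // Stored as a plain string: the BI Connector has no real timestamp
        // to work with, which is what leads to the zeroed times.
        Document asString = new Document("date", "2021-02-17T18:40:01.000+00:00");

        System.out.println(asDate.toJson());   // {"date": {"$date": ...}}
        System.out.println(asString.toJson()); // {"date": "2021-02-17T18:40:01.000+00:00"}
    }
}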
How do we convert a string to a date in Cloud Data Fusion?
I have a column with a value such as 20191120 (format yyyyMMdd) and I want to load it into a BigQuery table as a date. The table column's data type is also DATE.
What I have tried so far: I converted the string to a timestamp using "parse-as-simple-date", and I tried to convert it with format-date to "yyyy-MM-dd", but that step converts it to a string and the final load fails. I have even tried explicitly marking the column as date in the output schema, but it fails at runtime.
I also tried keeping it as a timestamp in the pipeline and loading it into the BigQuery DATE column.
I noticed that the error which came up was field dt_1 incompatible with avro integer. Is Data Fusion internally converting the extract into Avro before loading? Avro does not have a date data type, which could be causing the issue?
Adding answer for posterity:
You can try the following:
Go to the LocalDateTime column in Wrangler
Open the dropdown and click on "Custom Transform"
Type timestamp.toLocalDate() (timestamp being the column name)
After the last step it should be converted into a LocalDate type, which you can write to BigQuery. Hope this helps.
For this specific date format, the Wrangler Transform directive would be:
parse-as-simple-date date_field_dt yyyyMMdd
set-column date_field_dt date_field_dt.toLocalDate()
The second line is required if the destination is of type Date.
Skip empty values:
set-column date_field_dt empty(date_field_dt) ? date_field_dt : date_field_dt.toLocalDate()
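For comparison, here is roughly the same yyyyMMdd-to-date conversion in plain Java (java.time); this is only a sketch of the idea, not what Wrangler runs internally, and the class name is made up:
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class YyyymmddToDate {
    public static void main(String[] args) {
        String raw = "20191120"; // yyyyMMdd value, as in the question

        // Parse with the same pattern passed to parse-as-simple-date ...
        DateTimeFormatter pattern = DateTimeFormatter.ofPattern("yyyyMMdd");
        LocalDate date = LocalDate.parse(raw, pattern);

        // ... which yields an actual date value rather than a formatted string,
        // analogous to calling .toLocalDate() before writing to a DATE column.
        System.out.println(date); // 2019-11-20
    }
}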
References:
https://github.com/data-integrations/wrangler/blob/develop/wrangler-docs/directives/parse-as-simple-date.md
https://github.com/data-integrations/wrangler/blob/develop/wrangler-docs/directives/parse-as-date.md
You could try to parse your input data with Data Fusion using Wrangler.
In order to test it out, I have replicated a workflow where a Data Fusion pipeline is fed with data coming from BigQuery. This data is then parsed to the proper type and exported back to BigQuery. Note that the public dataset is "austin_311" and I have used the "311_request" table, as some of its columns are of TIMESTAMP type.
The steps I have done are the following:
I have queried a public dataset that contained TIMESTAMP data using:
select * from `bigquery-public-data.austin_311.311_request`
limit 1000;
I have uploaded it to Google Cloud Storage.
I have created a new Data Fusion batch pipeline following this.
I have used Wrangler to parse the CSV data to a custom 'Simple date' format yyyy-MM-dd HH:mm:ss.
I have exported the pipeline results to BigQuery.
This qwiklab has helped me through the steps.
Result:
Following the above procedure I have been able to export Data Fusion data to BigQuery and the DATE fields are exported as TIMESTAMP, as expected.
I have Excel data and I am trying to insert it into MongoDB using Talend Open Studio for Big Data. This is my job:
tFileInputExcel --> tMap --> tMongoDBOutput
In the Excel sheet I have a date column with values in this format, 7/13/2017 (MM/dd/yyyy), as string type, and I am trying to insert this column value in ISO format, ISODate("2017-07-13T00:00:00.000Z"), into MongoDB.
When I execute this job, I get an error.
When I change the parse format to TalendDate.parseDate("MM/dd/yyyy", row1.ClosingDate), I get a SimpleDateFormat error.
How can I resolve this issue?
You can do this simply, if your MongoDB column's schema type is date:
TalendDate.parseDate("MM/dd/yyyy",row3.newColumn)
That will automatically convert the date into the date model that your MongoDB column has.
You can change the date pattern in your Talend schema, for example to "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'".
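Under the hood, TalendDate.parseDate is essentially a SimpleDateFormat parse, so the round trip looks roughly like this plain-Java sketch (the UTC time zone and class name are my own assumptions):
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class ExcelDateToIso {
    public static void main(String[] args) throws ParseException {
        String closingDate = "7/13/2017"; // string value from the Excel column

        // Parse the Excel string, as TalendDate.parseDate("MM/dd/yyyy", ...) does.
        SimpleDateFormat in = new SimpleDateFormat("MM/dd/yyyy");
        in.setTimeZone(TimeZone.getTimeZone("UTC")); // assumed time zone
        Date parsed = in.parse(closingDate);

        // Format it the way the schema's date pattern would, matching the
        // ISODate representation expected in MongoDB.
        SimpleDateFormat iso = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");
        iso.setTimeZone(TimeZone.getTimeZone("UTC"));
        System.out.println(iso.format(parsed)); // 2017-07-13T00:00:00.000Z
    }
}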
This is a very common mistake: reading data without understanding the underlying data types.
I have blogged about this especially for Talend: https://www.tobiasmaasland.de/2017/07/20/using-date-in-talend-etl-jobs/
But let me explain a bit.
Sometimes Excel converts the data in a cell even if one might think the cell type is set to String; instead, it is set to Date. In that case no conversion is needed and the type needs to be Date in the input component.
If it is a String and an error occurs, the structure of the String is either not the same everywhere or some cells are empty (null). So you might have more luck with:
TalendDate.parseDate("MM/dd/yyyy", row1.ClosingDate == null ? "01/01/1970" : row1.ClosingDate)
I just assumed you might want to use a placeholder date instead of having null.
This heavily depends on the actual data type in the cells, whether every cell has the same data type, and whether all the data is formatted correctly.
To sum up one of the facts in my blog post: Don't use String for dates. Use Date for dates in Excel. It makes everything easier.
I am having some issues converting a regular timestamp for Cassandra with Apache NiFi.
My use case is the following:
I have a CSV file with a date in it that looks like this ('2015010109'), and I want to put it into Cassandra by converting this string ('2015010109') to a proper format: 2015-01-01 09:00, i.e. yyyy-MM-dd HH:mm (I don't exactly need the minutes, but I guess they are more useful for later usage).
So far I have this property in my UpdateAttribute processor for converting this string to a timestamp:
date : ${csvfiledate:toDate("yyyyMMddHH","GMT"):format("yyyy-MM-dd-HH")}
but then an error occurs in my PutCassandraQL processor: Unable to coerce '2015-01-01-09' to a formatted date (long).
I tried something along the lines of
date : ${csvfiledate:toDate("yyyyMMddHH","GMT"):format("yyyy-MM-dd HH-mmZ")} as well, but the same error occurs.
It seems like you need a specific timestamp format for Cassandra, as you can see here:
http://docs.datastax.com/en/archived/cql/3.0/cql/cql_reference/timestamp_type_r.html
But it isn't working so far; maybe you have some tips.
Thanks in advance.
You've got it backwards.
The toDate parameters describe how to parse the incoming date; the format function describes what the output should look like. So the expression should be:
${csvfiledate:toDate('yyyy-MM-dd-HH','GMT'):format('yyyyMMddHH')}
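In plain Java terms, the toDate pattern is what you hand to the parser and the format pattern is what you hand to the output formatter. A hedged sketch applying that principle to the raw CSV value from the question (class and variable names are illustrative, and the output pattern is just one form Cassandra's timestamp type accepts):
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class ParseThenFormat {
    public static void main(String[] args) throws Exception {
        String csvfiledate = "2015010109"; // raw CSV value from the question

        // toDate('yyyyMMddHH','GMT'): the pattern describes the INPUT string.
        SimpleDateFormat in = new SimpleDateFormat("yyyyMMddHH");
        in.setTimeZone(TimeZone.getTimeZone("GMT"));
        Date parsed = in.parse(csvfiledate);

        // format(...): the pattern describes how the OUTPUT should look,
        // here a form that Cassandra's timestamp type accepts.
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd HH:mmZ");
        out.setTimeZone(TimeZone.getTimeZone("GMT"));
        System.out.println(out.format(parsed)); // 2015-01-01 09:00+0000
    }
}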