Kafka Connect JDBC db.timezone config - PostgreSQL

I'm trying to wrap my head around how the db.timezone property works on both the source and sink connectors.
For the source the docs say:
Name of the JDBC timezone used in the connector when querying with time-based criteria. Defaults to UTC.
What does this actually mean? Is that supposed to be set to the timezone of my source database? I have a db that is set to the Eastern timezone. Do I need to set this to US/Eastern? If I don't, what will it do?
On the sink side the docs say:
Name of the JDBC timezone that should be used in the connector when inserting time-based values. Defaults to UTC.
Again, what exactly does this mean? Does it use that to convert all the timestamps in your payload to the timezone you give here?
My specific problem is that my source db uses the Eastern timezone, but my sink db is set to UTC and I can't change it. How should I define these properties?
Also, to add to this (I think it's slightly unrelated), I notice that on my sink side the timestamps don't keep all of their fractional digits. On both sides I have the timestamp columns set to timestamp(6), yet on the sink side the fractional seconds only ever have 3 significant digits and the remaining 3 are always 0. Why would this be?

Have a look at the source code:
https://github.com/confluentinc/kafka-connect-jdbc/blob/master/src/main/java/io/confluent/connect/jdbc/source/JdbcSourceConnectorConfig.java#L805
to get a feel for how the value you specify for the db.timezone configuration option is used by the kafka-connect-jdbc connector.
I'd assume that for your source connector you should use
db.timezone=US/Eastern
Name of the JDBC timezone used in the connector when querying with time-based criteria. Defaults to UTC.
What does this actually mean?
The db.timezone setting comes in handy when reading/writing data from databases which don't use the UTC timezone for storing date/time columns.
Since your sink database uses UTC, there's no additional timezone setting needed in your JDBC sink configuration; the default of db.timezone=UTC already matches.
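Putting that together, a minimal sketch of the two configurations might look like this (connection details, table and topic names are placeholders, not taken from the question):

# Source connector: the database stores date/time columns in US/Eastern local time
name=postgres-source
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
connection.url=jdbc:postgresql://source-host:5432/sourcedb
table.whitelist=my_table
mode=timestamp
timestamp.column.name=updated_at
db.timezone=US/Eastern
topic.prefix=pg-

# Sink connector: the target database already stores everything in UTC,
# so the default db.timezone=UTC can simply be left as-is
name=postgres-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
connection.url=jdbc:postgresql://sink-host:5432/sinkdb
topics=pg-my_table
auto.create=true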

Related

How to store date in IST format in MongoDb

Unable to store dates in IST format in MongoDB.
Currently dates are stored in UTC, and during every operation I have to convert them into IST using the timezone feature provided by Mongo.
You cannot do that. MongoDB will always save time in UTC.
MongoDB stores times in UTC by default, and converts any local time
representations into this form. Applications that must operate or
report on some unmodified local time value may store the time zone
alongside the UTC timestamp, and compute the original local time in
their application logic.
You can check the official docs for more info.
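As a rough sketch of that suggestion using the MongoDB Java driver (the database, collection and field names below are invented for illustration):

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.Date;

public class IstExample {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> events = client.getDatabase("demo").getCollection("events");

            // MongoDB persists the Date as UTC; keep the zone in a separate field.
            events.insertOne(new Document("ts", Date.from(Instant.now()))
                    .append("tz", "Asia/Kolkata"));

            // On read, rebuild the local (IST) representation in application code.
            Document doc = events.find().first();
            ZonedDateTime local = doc.getDate("ts").toInstant()
                    .atZone(ZoneId.of(doc.getString("tz")));
            System.out.println(local);
        }
    }
}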

Is there a way in Debezium to stop data serialization? Trying to get values from source as it is

I have seen many posts on StackOverflow where people are trying to capture data from a source RDBMS and are using Debezium for the same. I am working with SQL Server. However, since the DECIMAL and TIMESTAMP values are encoded by default, it becomes an overhead to decode those values back into their original form.
I was looking to avoid this extra decoding step, but to no avail. Can anyone please tell me how to import data via Debezium as it is, i.e. without serializing it?
I saw some youtube videos where DECIMAL values were extracted in its original form.
For example, 800.0 from SQL Server is obtained as 800.0 via Debezium and not as "ATiA" (encoded).
But I am not sure how to do this. Can anyone please help me with what configuration is required for this on Debezium? I am using Debezium Server for now, but can work with Debezium connectors as well if that's needed.
Any help is appreciated.
Thanks.
It may be a matter of representation of timestamp and decimal values as opposed to encoding.
For timestamps, try using different values of time.precision.mode and for decimals, use decimal.handling.mode.
For MySQL, the documentation is here.
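For instance, with the SQL Server connector the two properties could be set roughly like this (the connection settings are illustrative placeholders; if you run Debezium Server, the same keys are passed with a debezium.source. prefix):

connector.class=io.debezium.connector.sqlserver.SqlServerConnector
database.hostname=sqlserver-host
database.port=1433
database.user=debezium
database.password=***
# Emit DECIMAL/NUMERIC columns as doubles (e.g. 800.0) instead of the
# Base64-encoded bytes produced by the default precise mode;
# decimal.handling.mode=string is another option that avoids precision loss
decimal.handling.mode=double
# Use Kafka Connect's built-in Timestamp/Time/Date types for temporal columns
time.precision.mode=connect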

Spark Elasticsearch connector fails to parse date as date. They are parsed as Long

I am pushing HDFS Parquet data to Elasticsearch using the ES Spark connector.
I have two columns containing dates that I am unable to get parsed as dates; they keep being ingested as long. They have the following formats:
basic_ordinal_date: YYYYddd
epoch_millis: in milliseconds; e.g. 1555498747861
I tried the following:
Defining ingest pipelines
Defining mappings
Defining dynamic mappings
Defining index_templates with mapping
Converting my date columns to string so Elastic does some pattern matching
Depending on the method, either I get errors and my Spark job fails, or the documents are pushed but the columns in question are still stored as long.
How should I proceed to have my ordinal_date and epoch_millis columns parsed as dates?
Thank you in advance for your help
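For reference, the "defining mappings" attempt for the two formats mentioned above is usually written something like this (the index and field names here are invented; the mapping has to exist before the Spark job writes, and the field names must match the DataFrame columns):

PUT parquet-events
{
  "mappings": {
    "properties": {
      "ordinal_day": { "type": "date", "format": "basic_ordinal_date" },
      "event_time":  { "type": "date", "format": "epoch_millis" }
    }
  }
}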

Some date values off by a day in JDV

When querying a source model in a VDB, with the source database being Informix 11, the values for a date column are sometimes returned as the prior day. For example, the actual value in Informix is Oct 10, but the value shown when querying the JDV source model is Oct 9. Querying Informix directly returns the correct date. I'm using JDV 6.4.0 with JDK 1.8.0_162 (x64) on Windows 10.
Any ideas? Thanks in advance!
To elaborate on what Ramesh is saying, you need to check the client and server JVM timezones. JDV will attempt to keep date/time calendar fields consistent across db, server, and client. If the Teiid client is in a different timezone than the server, the client will automatically alter the UTC value for date/time values so that they match what the server would display, which is determined by the server timezone.
When a timestamp value is retrieved from the database we assume that it has already been adjusted by the driver to account for any timezone differences. If that is not the case there is a translator execution property called DatabaseTimeZone that will utilize the JDBC calendar based methods to adjust the retrieved date/time values.
A common issue is a mismatch of daylight saving time rules; usually it's best to have the JDV server in a standard (non-daylight-saving) timezone.
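If the driver is not making that adjustment, a rough sketch of setting DatabaseTimeZone as a translator override in a dynamic VDB might look like this (the VDB, model, source and JNDI names are placeholders, and the informix translator type is assumed):

<vdb name="MyVDB" version="1">
  <model name="Source">
    <source name="informixDS" translator-name="informix-tz"
            connection-jndi-name="java:/informixDS"/>
  </model>
  <!-- Override the base informix translator so retrieved date/time values
       are adjusted with JDBC's Calendar-based getters -->
  <translator name="informix-tz" type="informix">
    <property name="DatabaseTimeZone" value="America/New_York"/>
  </translator>
</vdb>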

Kafka JDBC connector not picking up new commits

I am currently using a Kafka JDBC connector to poll records from an Oracle db. The connector properties are set to use timestamp mode, and we have provided a simple select query in the properties (not using a where clause); based on my understanding this should work.
However, when the connector starts up, I can see the initial query does pull out all of the records it should and publishes them to the Kafka topic, but any new commits to the Oracle db are not picked up; the connector just sits polling without finding any new info, maintaining its offset.
No exceptions are being thrown in the connector, and no indication of a problem other than it is not picking up the new commits in the db.
One thing of note, which I have been unable to prove makes a difference, is that the fields in the Oracle db are all nullable. But I have tested changing that for the timestamp field, and it had no effect and the same behaviour continued. I have also tested in bulk mode and it works fine and does pick up new commits, though I cannot use bulk mode as we cannot duplicate the records for the system.
Does anyone have any idea why the connector is unable to pick up new commits for timestamp mode?
What does your properties file look like? You need to make sure to use an incrementing column or a timestamp column.
If you are using a timestamp column, is it getting updated on the commit?
Regarding nulls, you can tweak your query to coalesce the null column to a value. Alternatively, I think there is a setting to allow nullable columns.
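For reference, a timestamp-mode source configuration for nullable timestamp columns typically looks something like this (connection details, table, column and topic names are placeholders):

name=oracle-timestamp-source
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
connection.url=jdbc:oracle:thin:@//db-host:1521/ORCLPDB1
connection.user=connect
connection.password=***
mode=timestamp
timestamp.column.name=LAST_UPDATED
# Needed when the timestamp column is nullable; by default the connector
# validates that the timestamp/incrementing columns are NOT NULL
validate.non.null=false
# With a custom query the connector appends its own WHERE clause for the
# timestamp criteria, so the query itself must not contain one
query=SELECT * FROM MY_SCHEMA.MY_TABLE
topic.prefix=oracle-my-table
poll.interval.ms=5000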