Kafka: "Broker failed to validate record" after increasing partition - apache-kafka

I increased the partition count of an existing Kafka topic via Terraform. The partition count increased successfully; however, when I test the connection to the topic, I get a "Broker failed to validate record" error.
Testing method:
echo "test" | kcat -b ...
**sensitive content has been removed**
...
% Auto-selecting Producer mode (use -P or -C to override)
% Delivery failed for message: Broker: Broker failed to validate record
I searched online and came across something called broker-side schema validation: https://docs.confluent.io/cloud/current/sr/broker-side-schema-validation.html
Is there something I need to do after increasing the partition count, e.g. flush some cache?

You need to ask your Kafka cluster administrator if they have schema validation enabled, but increasing partitions shouldn't cause that error. (Schema validation is a feature of Confluent Server, not Apache Kafka.)
If someone changed the schema in the Schema Registry for your topic, or validation has suddenly been enabled, and you are sending a record with an "old" schema (or an otherwise incorrect one), then the broker would "fail to validate" the record.
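If you or your admin can run the standard tools against the cluster, a minimal sketch for checking whether validation is set on the topic (localhost:9092 and my-topic are placeholders for your broker and topic; any client security properties are omitted):
# describe the topic's configs and look for confluent.key/value.schema.validation
kafka-configs.sh --bootstrap-server localhost:9092 \
  --entity-type topics --entity-name my-topic --describe
# if validation is enabled and you are permitted to turn it off (a sketch, not a recommendation)
kafka-configs.sh --bootstrap-server localhost:9092 \
  --entity-type topics --entity-name my-topic \
  --alter --delete-config confluent.value.schema.validation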

Related

Is it possible to re-ingest (sink connector) data into the db if some messages got missed entering

Currently I have set up 2 separate connectors running the JDBC Sink Connector to ingest topics from the producer into the database. Sometimes I see errors in the logs, which cause produced messages to fail to be stored in the database.
The errors I constantly see are:
Caused by: org.apache.kafka.common.errors.SerializationException: Error retrieving Avro schema for id:11
Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Subject 'topic-io..models.avro.Topic' not found; error code: 404
This is true, because the TopicRecordName strategy is not supposed to point at this topic but at another topic I directed it to; it is just supposed to resolve to models.avro.Topic.
I was wondering, since this happens constantly, whether there is a way to re-ingest those produced records/messages into the database after they were produced. For example, if messages were produced between 12am and 1am and some errors in the logs caused them to fail to be consumed during that timeframe, could the configuration or offsets restore them by re-ingesting them into the database? The error is due to the schema registry lookup failing to resolve the correct schema. It failed because it read the incorrect worker file, since one of my worker files has value.converter.value.subject.name.strategy=io.confluent.kafka.serializers.subject.TopicRecordNameStrategy while the other connector does not use that subject name strategy.
Currently, I set consumer.auto.offset.reset=earliest to start reading messages.
Is there a way to get those records back, e.g. into a file, so I can restore them? I am deploying to production and data must be consumed into the database at all times without any errors.
Rather than mess with the consumer group offsets, which would eventually cause correctly processed data to be consumed again and duplicated, you could use the dead letter queue configurations to send error records to a new topic, which you'd need to monitor and consume before the topic's retention completely drops the events.
https://www.confluent.io/blog/kafka-connect-deep-dive-error-handling-dead-letter-queues/
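A minimal sketch of the relevant sink connector properties (the topic name dlq-jdbc-sink is just an example; the property names are the standard Kafka Connect error-handling settings):
# tolerate bad records instead of failing the task
errors.tolerance=all
# route failed records to a dead letter queue topic
errors.deadletterqueue.topic.name=dlq-jdbc-sink
errors.deadletterqueue.topic.replication.factor=3
# put the failure reason into record headers for easier triage
errors.deadletterqueue.context.headers.enable=true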
one of my worker file have a [different config]
This is why configuration management software is important. Don't modify one server in a distributed system without a process that updates them all. Ansible/Terraform are most common if you're not running the connectors in Kubernetes.

Running two instances of MirrorMaker 2.0 halts data replication for newer topics

We tried the scenarios below using MirrorMaker 2.0 and want to know if the output of the second scenario is expected.
Scenario 1) We ran a single MirrorMaker 2.0 instance using the properties and start command below.
clusters=a,b
tasks.max=10
a.bootstrap.servers=kf-test-cluster-a:9092
a.config.storage.replication.factor=1
a.offset.storage.replication.factor=1
a.security.protocol=PLAINTEXT
a.status.storage.replication.factor=1
b.bootstrap.servers=kf-test-cluster-b:9092
b.config.storage.replication.factor=1
b.offset.storage.replication.factor=1
b.security.protocol=PLAINTEXT
b.status.storage.replication.factor=1
a->b.checkpoints.topic.replication.factor=1
a->b.emit.checkpoints.enabled=true
a->b.emit.heartbeats.enabled=true
a->b.enabled=true
a->b.groups=group1|group2|group3
a->b.heartbeats.topic.replication.factor=1
a->b.offset-syncs.topic.replication.factor=1
a->b.refresh.groups.interval.seconds=30
a->b.refresh.topics.interval.seconds=10
a->b.replication.factor=2
a->b.sync.topic.acls.enabled=false
a->b.topics=.*
Start command: /usr/bin/connect-mirror-maker.sh connect-mirror-maker.properties &
Verification: Created a new topic "test" on the source cluster (a), produced data to the topic on the source cluster, and ran a consumer on the target cluster (b), topic "a.test", to verify data replication.
Observation: Worked fine as expected.
Scenario 2) Ran one more instance of MirrorMaker 2.0 using the same properties as mentioned above.
Start command: /usr/bin/connect-mirror-maker.sh connect-mirror-maker.properties &
Verification: Created one more topic, "test2", on the source cluster, produced data to the topic on the source cluster, and ran a consumer on the target cluster (b), topic "a.test2", to verify data replication.
Observation: MM2 was able to replicate the topic on the target cluster; a.test2 was present on target cluster b, but the consumer didn't get any records to consume.
In the newer MirrorMaker 2.0 instance's logs, the MirrorSourceConnector task had not restarted after topic replication, whereas it did restart after topic replication when a single instance was running.
NOTE: There were no error logs seen.
I observed the same behavior. Your messages are most likely replicated; you can verify this by checking your consumer group offsets. The problem is most likely that your offset lag is 0, meaning your consumer assumes all previous messages have been consumed. You can reset the offsets or read from the beginning.
Ideally, the checkpoint heartbeat should contain the latest offset, but I currently find this to be empty, even though starting with Kafka 2.7 checkpoint heartbeat replication should be automatic.
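For example, to rule out a consumer-side issue you could read the mirrored topic from the beginning, or reset the group's offsets; a sketch using the target cluster from the question (my-group is a placeholder for your consumer group):
# read everything currently in the mirrored topic
kafka-console-consumer.sh --bootstrap-server kf-test-cluster-b:9092 \
  --topic a.test2 --from-beginning
# or reset an existing consumer group back to the earliest offsets
kafka-consumer-groups.sh --bootstrap-server kf-test-cluster-b:9092 \
  --group my-group --topic a.test2 \
  --reset-offsets --to-earliest --execute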

kafka connect - jdbc sink sql exception

I am using the Confluent Community edition for a simple setup consisting of a REST client calling the Kafka REST Proxy and then pushing that data into an Oracle database using the provided JDBC Sink Connector.
I noticed that if there is an SQL exception, for instance if the actual data's length is greater than the defined column length, the task stops. If I restart it, the same thing happens: it tries to insert the erroneous entry and stops again. It does not insert the other entries.
Is there a way I can log the erroneous entry and let the task continue inserting the other data?
The Kafka Connect framework can only skip problematic records in sink connectors when the exception is thrown during:
- Conversion of keys or values (Converter::toConnectData(...))
- Transformation (Transformation::apply)
For that you can use the errors.tolerance property:
"errors.tolerance": "all"
There are some additional properties for printing details regarding errors: errors.log.enable, errors.log.include.messages.
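For example, a sketch of how these might look together in the sink connector's configuration:
errors.tolerance=all
# log each error and the failing record's metadata to the Connect worker log
errors.log.enable=true
errors.log.include.messages=true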
Original answer: Apache Kafka JDBC Connector - SerializationException: Unknown magic byte
If an exception is thrown while delivering messages, the Sink Task is killed.
If you need to handle communication errors (or other errors) with an external system, you have to add that support to your connector.
The JDBC Connector retries when an SQLException is thrown, but it doesn't skip any records.
The number of retries and the interval between them are managed by the following properties:
max.retries (default value: 10)
retry.backoff.ms (default value: 3000)
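For example (the values shown are just the defaults, for illustration):
# JDBC sink: retry a failed batch up to 10 times, waiting 3 seconds between attempts
max.retries=10
retry.backoff.ms=3000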
The sink cannot currently ignore bad records, but you can manually skip them, using the kafka-consumer-groups tool:
kafka-consumer-groups \
--bootstrap-server kafka:29092 \
--group connect-sink_postgres_foo_00 \
--reset-offsets \
--topic foo \
--to-offset 2 \
--execute
For more info see here.
Currently, there is no way to stop this from failing the sink connector, specifically.
However, there is another approach that might be worth looking into. You can apply a Single Message Transform (SMT) on the Connector, check the length of the incoming columns, then decide to either throw an exception, which would bubble up to the errors.tolerance configuration, or return null which will filter the record out entirely.
Since this is a Sink connector, the SMT would be applied before passing the record on to the connector, and therefore records that are skipped via the transform would never make it to the tasks to be sync'd into the database.
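A sketch of how such a transform might be wired into the sink connector's configuration; the class com.example.DropOversizedFields and its max.length parameter are hypothetical, and you would have to write and package that SMT yourself:
# hypothetical custom SMT that returns null (drops the record) when a column value is too long
transforms=dropOversized
transforms.dropOversized.type=com.example.DropOversizedFields
transforms.dropOversized.max.length=255
# if the transform throws instead of returning null, this setting decides whether the task fails
errors.tolerance=all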

Kafka ACLs cause topic replication to fail

Good morning,
A bit of background for you: we are currently putting together a POC to use Apache Kafka as a messaging queue for inbound log data for post-processing by Elastic Logstash. Currently I have 3 broker nodes configured to point to a single ZooKeeper node. I have a default replication factor of 3 and a minimum ISR of 2 to account for a single node failure (or availability zone, in this case). When creating a topic I set a partition count of 10 and a replication factor of 3 - Kafka duly goes and creates the topic - happy days! However, because I use SSL on my inbound interface (because it will be internet facing) I need to secure the topics to be writable by a certain principal as follows:
/opt/kafka-dq/bin/kafka-acls.sh --authorizer-properties zookeeper.connect=zookeeper-001:2181 --add --allow-principal User:USER01 --producer --topic 'USER01_openvpn'
When this happens, the ISR drops to a single node, and as I have a minimum ISR of 2 the partitions are taken offline, which causes Filebeat (client end) to start throwing the following errors:
kafka/client.go:242 Kafka publish failed with: circuit breaker is open
The following errors are also seen in the Kafka server logs:
[2018-11-16 09:59:12,736] ERROR [Controller id=3] Received error in LeaderAndIsr response LeaderAndIsrResponse(responses={USER01_openvpn-3=CLUSTER_AUTHORIZATION_FAILED, USER01_openvpn-2=CLUSTER_AUTHORIZATION_FAILED...
[2018-11-16 10:09:46,852] ERROR [Controller id=2 epoch=23] Controller 2
epoch 23 failed to change state for partition USER01_openvpn-4 from
OnlinePartition to OnlinePartition (state.change.logger)
kafka.common.StateChangeFailedException: Failed to elect leader for
partition USER01_openvpn-4 under strategy
PreferredReplicaPartitionLeaderElectionStrategy
I have attempted to remedy this by adding an ACL for the ANONYMOUS user to all topics, but this actually caused the cluster to break further. For further clarity, whilst I have SSL enabled on the inbound interface, my cluster's inter-broker comms are plaintext.
The documentation around ACLs for the cluster itself is somewhat "woolly" at best, so I wondered how best to approach this issue.
It looks like you are missing an ACL with ClusterAction on the Cluster resource for your brokers. This is required to allow them to exchange inter-broker messages.
As your brokers are using plaintext, you probably need to set this ACL on the ANONYMOUS principal.
If you're using only SSL (without SASL), you want to make sure you do SSL authentication, otherwise anybody could connect to your cluster and would get ClusterAction permissions allowing them to cause havoc.
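A sketch of adding that ACL, reusing the path and ZooKeeper address from the command in the question (verify the principal matches what your brokers actually authenticate as):
# allow the brokers (ANONYMOUS, since inter-broker traffic is plaintext) to perform cluster actions
/opt/kafka-dq/bin/kafka-acls.sh --authorizer-properties zookeeper.connect=zookeeper-001:2181 \
  --add --allow-principal User:ANONYMOUS \
  --operation ClusterAction --cluster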

Messages sent to Kafka REST-Proxy being rejected by "This server is not the leader for that topic-partition" error

We have been facing some trouble and a difference of understanding between the development team and the environment support team regarding the Kafka REST Proxy from the Confluent Platform.
First of all, we have an environment of 5 Kafka brokers, with 64 partitions and a replication factor of 3.
It happens that our calls to rest-proxy are all using the following structure for now:
curl -X POST \
http://somehost:8082/topics/test \
-H 'content-type: application/vnd.kafka.avro.v1+json' \
-d '{
"value_schema_id":1,
"records":[
{ "foo":"bar" }]}'
This kind of call works for 98.4% of the calls, and I noticed that when I make this call over 2k times we don't receive any OK response from partition 62 (exactly 1.6% of the partitions).
This error rate used to be 10.9%, when we had 7 partitions returning errors, right before the support team recycled the Schema Registry.
Now, when the call goes to the partition 62, we receive the following answer:
{
  "offsets": [
    {
      "partition": null,
      "offset": null,
      "error_code": 50003,
      "error": "This server is not the leader for that topic-partition."
    }
  ],
  "key_schema_id": null,
  "value_schema_id": 1
}
The error is the same when I try to send the messages to the specific partition adding "/partitions/62" to the URL.
Support says rest-proxy is not smart enough ("it's just a proxy", they say) to elect a valid partition and post it to the leader broker of that partition.
They said it randomly selects the partition and then randomly selects the broker to post to (which can lead it to post to replicas or even to brokers that don't have the partition).
They recommended that we change our calls to get topic metadata before posting the messages, then specify the partition and broker and handle the round-robin assignment on the application side, which doesn't make sense to me.
On the dev side, my understanding is that the REST Proxy uses the Apache Kafka client to post the messages to the brokers and is thus smart enough to post to the leader broker for the given partition; it also handles the round-robin within the Kafka client library when the partition is not specified.
It seems to me like an environment issue related to that partition and not to the calling app itself (as it works without problems in other environments with the same configuration).
To sum up, my questions are:
Am I correct when I say that rest-proxy is smart enough to handle the partition round-robin and posting to the leader?
Should the application be handling the logic in question 1? (I don't see the reason for using rest-proxy instead of kafka-client directly in this case)
Does it look like a problem in environment orchestration for you too?
Hope it all was clear for you to give me some help!
Thanks in advance!
I do not use the REST Proxy, but this error likely indicates that a NotLeaderForPartitionException happens during the calls. This error indicates that the leader of the partition has changed but the producer still uses stale metadata. This error happened to me when the replication between brokers failed due to an internal error in the Kafka server. This can be checked in the server logs.
In our case I checked the topic with ./kafka-topics.sh --describe --zookeeper zookeeper_ip:2181 --topic test and it showed that the replicas from one of the brokers were not in sync (ISR column). Restarting this broker helped; the replicas became synchronised and the error disappeared.
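If you want to spot that condition across all topics at once, a sketch using the same ZooKeeper address:
# list only the partitions whose ISR is smaller than their replica set
./kafka-topics.sh --describe --under-replicated-partitions --zookeeper zookeeper_ip:2181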