In Spring Cloud Stream 2021.0.4, we are facing an issue: routing to a destination fails when partitioning is enabled for RabbitMQ. The binder always throws "Partition key cannot be null". This does not occur when we roll back to 2021.0.3. Note that we have been using this feature in an existing app and it worked fine until now.
We are using the following expression:
spring.cloud.stream.bindings.output.producer.partition-key-expression: headers['amqp_correlationId']
We tried other variations such as headers['correlation_id'] and headers.correlation_id, but none of them work.
We are looking for a fix in 2021.0.4. The assertion is thrown from this binder method:
private Object extractKey(Message<?> message) {
    Object key = invokeKeyExtractor(message);
    if (key == null && this.producerProperties.getPartitionKeyExpression() != null) {
        key = this.producerProperties.getPartitionKeyExpression()
                .getValue(this.evaluationContext, message);
    }
    Assert.notNull(key, "Partition key cannot be null");
    return key;
}
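As a stopgap we are considering a custom extractor bean (a sketch on our side, not verified against 2021.0.4), referenced via spring.cloud.stream.bindings.output.producer.partition-key-extractor-name, so that the key handed to the binder is never null; the empty-string fallback here is our own assumption:

@Bean
public PartitionKeyExtractorStrategy correlationIdKeyExtractor() {
    // Read the same header the expression targets; fall back to a constant
    // so the binder's Assert.notNull never trips (fallback value is arbitrary).
    return message -> {
        Object key = message.getHeaders().get("amqp_correlationId");
        return key != null ? key : "";
    };
}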
I'm trying to run this simple example where data from a Kafka topic are filtered out: https://www.talend.com/blog/2018/08/07/developing-data-processing-job-using-apache-beam-streaming-pipeline/
I have a similar setup with a localhost broker using default settings, but I can't even read from the topic.
When running the application, it gets stuck in an infinite loop and nothing happens. I've tried giving a gibberish URL for my broker to see whether the pipeline is even able to reach it - it's not. The cluster is up and running and I'm able to add messages to the topic. Here is where I specify the broker and the topic:
pipeline
    .apply(
        KafkaIO.<Long, String>read()
            .withBootstrapServers("localhost:9092")
            .withTopic("BEAM_IN")
            .withKeyDeserializer(LongDeserializer.class)
            .withValueDeserializer(StringDeserializer.class)
    )
I don't see any errors and there is nothing written to the output topic.
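For context, here is the overall shape of the pipeline I am aiming for, modeled on the blog post; the BEAM_OUT topic name and the filter predicate are placeholders of mine, not the exact blog code:

pipeline
    .apply(KafkaIO.<Long, String>read()
        .withBootstrapServers("localhost:9092")
        .withTopic("BEAM_IN")
        .withKeyDeserializer(LongDeserializer.class)
        .withValueDeserializer(StringDeserializer.class)
        .withoutMetadata())                         // KV<Long, String>, Kafka metadata dropped
    .apply(Values.<String>create())                 // keep only the record values
    .apply(Filter.by((String v) -> !v.isEmpty()))   // placeholder predicate
    .apply(KafkaIO.<Void, String>write()
        .withBootstrapServers("localhost:9092")
        .withTopic("BEAM_OUT")
        .withValueSerializer(StringSerializer.class)
        .values());                                 // write values only, no keys
pipeline.run().waitUntilFinish();                   // blocks indefinitely for an unbounded source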
When debugging, I see it's stuck in this loop:
while (Instant.now().isBefore(completionTime)) {
    ExecutorServiceParallelExecutor.VisibleExecutorUpdate update =
            this.visibleUpdates.tryNext(Duration.millis(25L));
    if (update == null && ((State) this.pipelineState.get()).isTerminal()) {
        return (State) this.pipelineState.get();
    }
    if (update != null) {
        if (this.isTerminalStateUpdate(update)) {
            return (State) this.pipelineState.get();
        }
        if (update.thrown.isPresent()) {
            Throwable thrown = (Throwable) update.thrown.get();
            if (thrown instanceof Exception) {
                throw (Exception) thrown;
            }
            if (thrown instanceof Error) {
                throw (Error) thrown;
            }
            throw new Exception("Unknown Type of Throwable", thrown);
        }
    }
}
This is in the isKeyed(PValue pvalue) method within the ExecutorServiceParallelExecutor class.
What am I missing?
I'm writing a SinkConnector in Kafka Connect and hitting an issue. The connector has a configuration like this:
{
  "connector.class" : "a.b.ExampleFileSinkConnector",
  "tasks.max" : "1",
  "topics" : "mytopic",
  "maxFileSize" : "50"
}
I define the connector's config like this:
@Override
public ConfigDef config()
{
    ConfigDef result = new ConfigDef();
    result.define("maxFileSize", Type.STRING, "10", Importance.HIGH, "size of file");
    return result;
}
In the connector, I start the tasks like this:
@Override
public List<Map<String, String>> taskConfigs(int maxTasks) {
    List<Map<String, String>> result = new ArrayList<Map<String, String>>();
    for (int i = 0; i < maxTasks; i++) {
        Map<String, String> taskConfig = new HashMap<>();
        taskConfig.put("connectorName", connectorName);
        taskConfig.put("taskNumber", Integer.toString(i));
        taskConfig.put("maxFileSize", maxFileSize);
        result.add(taskConfig);
    }
    return result;
}
and all goes well.
However, when starting the Task (in taskConfigs()), if I add this:
taskConfig.put("epoch", "123");
it breaks the whole infrastructure: all connectors are stopped and restarted in an endless loop.
There is no exception or error whatsoever in the connect log file that can help.
The only way to make it work is to add "epoch" in the connector config, which I don't want to do since it is an internal parameter that the connector has to send to the task. It is not intended to be exposed to the connector's users.
Another point I noticed is that it is not possible to update the value of any connector config parameter, apart from setting it back to the default value. Changing a parameter and sending it to the task produces the same behavior.
I would really appreciate any help on this issue.
EDIT: here is the code of SinkTask::start()
@Override
public void start(Map<String, String> taskConfig) {
    try {
        connectorName = taskConfig.get("connectorName");
        log.info("{} -- Task.start()", connectorName);
        fileNamePattern = taskConfig.get("fileNamePattern");
        rootDir = taskConfig.get("rootDir");
        fileExtension = taskConfig.get("fileExtension");
        maxFileSize = SimpleFileSinkConnector.parseIntegerConfig(taskConfig.get("maxFileSize"));
        maxTimeMinutes = SimpleFileSinkConnector.parseIntegerConfig(taskConfig.get("maxTimeMinutes"));
        maxNumRecords = SimpleFileSinkConnector.parseIntegerConfig(taskConfig.get("maxNumRecords"));
        taskNumber = SimpleFileSinkConnector.parseIntegerConfig(taskConfig.get("taskNumber"));
        epochStart = SimpleFileSinkConnector.parseLongConfig(taskConfig.get("epochStart"));
        log.info("{} -- fileNamePattern: {}, rootDir: {}, fileExtension: {}, maxFileSize: {}, maxTimeMinutes: {}, maxNumRecords: {}, taskNumber: {}, epochStart : {}",
                connectorName, fileNamePattern, rootDir, fileExtension, maxFileSize, maxTimeMinutes, maxNumRecords, taskNumber, epochStart);
        if (taskNumber == 0) {
            checkTempFilesForPromotion();
        }
        computeInitialFilename();
        log.info("{} -- Task.start() END", connectorName);
    } catch (Exception e) {
        log.info("{} -- Task.start() EXCEPTION : {}", connectorName, e.getLocalizedMessage());
    }
}
We found the root cause of the issue. The Kafka Connect Framework is actually behaving as designed - the problem has to do with how we are trying to use the taskConfigs configuration framework.
The Problem
In our design, the FileSinkConnector sets an epoch in its start() lifecycle method, and this epoch is passed down to its tasks by way of the taskConfigs() lifecycle method. So each time the Connector's start() lifecycle method runs, different configuration is generated for the tasks - which is the problem.
Generating different configuration each time is a no-no. It turns out that the Connect Framework detects differences in configuration and will restart/rebalance upon detection - stopping and restarting the connector/task. That restart calls the stop() and start() methods of the connector ... which (of course) produces yet another configuration change (because of the new epoch), and the vicious cycle is on!
This was an interesting and unexpected issue ... due to a behavior in Connect that we had no appreciation for. This is the first time we tried to generate task configuration that was not a simple function of the connector configuration.
Note that this behavior in Connect is intentional and addresses real issues of dynamically-changing configuration - like a JDBC Sink Connector that spontaneously updates its configuration when it detects a new database table it wants to sink.
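For illustration, here is the shape of our corrected taskConfigs() (a sketch of the fix; the epoch is now derived by each task itself in start(), so this method is a pure function of the connector configuration):

@Override
public List<Map<String, String>> taskConfigs(int maxTasks) {
    // Deterministic: the same connector config in always yields the same
    // task configs out. No epoch, timestamps, or other volatile values,
    // so the framework sees no config change and does not rebalance.
    List<Map<String, String>> result = new ArrayList<>();
    for (int i = 0; i < maxTasks; i++) {
        Map<String, String> taskConfig = new HashMap<>();
        taskConfig.put("connectorName", connectorName);
        taskConfig.put("taskNumber", Integer.toString(i));
        taskConfig.put("maxFileSize", maxFileSize);
        result.add(taskConfig);
    }
    return result;
}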
Thanks to those who helped us!
I have created a Kafka operator using IBM SPL code. Now I have to replace the input of an existing IBM splmm application with this Kafka operator. Currently the code receives its input from a different location, and I would like to change the input to be the Kafka consumer. Please help me out with this issue. Thanks.
Sorry about that. So, I created a Kafka operator which receives an input message from the Python code and writes it to a text file.
composite Kafka_Read_OP {
    param
        expression $topic : "DNS_EXT";
    graph
        stream<rstring message> KafkaStream = KafkaConsumer()
        {
            param
                propertiesFile : getThisToolkitDir() + "/etc/consumer.properties";
                topic : $topic;
        }

        () as Sink = FileSink(KafkaStream)
        {
            param
                file : "/tmp/output.txt";
                format : txt;
                append : true;
                flush : 1u;
        } // End of FileSink.
}
And I have an IBM SPL code which receives an input from another SPL code; it looks like this:
public composite ExternalDNSImportSubscriptionWrapper(output ImportSubStream)
{
    param
        expression $subscription : getSubmissionTimeValue("externalFeeder::subscription",
            getCompileTimeValue("externalFeeder::subscription", "kind == \"\""));
        type $outputType;
    graph
        stream<$outputType> ImportSubStream = Import()
        {
            param
                applicationScope : "com.zel.streams.cti";
                subscription : kind == "EnrichmentData" && category == "ExternalDNSzel" && id == getChannel();
                //subscription : kind == "EnrichmentData" && category == "ExternalDNSzel" && id == 0;
        }
}
Now I need to replace the above code with the Kafka operator I created.
So do I have to send the output of the Kafka operator to the Import, or
just delete this whole Import code and create the Kafka operator here? If I do the latter, I need to change the Import in the other connecting codes as well.
I am not able to find any examples of sending output from a Kafka operator to an Import.
I am using Sitecore 8, and after I stopped the MongoDB service and changed the settings in the configs to stop using MongoDB for analytics, I keep getting this specific error:
ERROR Exception while handling event Sitecore.Eventing.Remote.PublishEndRemoteEvent
Exception: System.NullReferenceException
Message: Object reference not set to an instance of an object.
Source: Sitecore.SharedSource.PartialLanguageFallback
   at Sitecore.SharedSource.PartialLanguageFallback.Providers.FallbackLanguageProvider.ClearFallbackCaches(ItemUri itemUri, Database database)
   at Sitecore.SharedSource.PartialLanguageFallback.Providers.FallbackLanguageProvider.<>c__DisplayClass1.<InitializeEventHandlers>b__0(PublishEndRemoteEvent event)
   at Sitecore.Eventing.EventProvider.RaiseEvent(Object event, Type eventType, EventContext context)
To disable the Analytics database I've used the indications from here.
Does PublishEndRemoteEvent somehow use MongoDB? Do you know how I can fix this so that I won't get it anymore?
You need to generate a new assembly for Sitecore.SharedSource.PartialLanguageFallback with the following change:
Update the below file as follows:
Sitecore.SharedSource.PartialLanguageFallback/Providers/FallbackLanguageProvider.cs
private void ClearFallbackCaches(ItemUri itemUri, Database database)
{
    var cache = _fallbackValuesCaches[database.Name];
    var ignoreCache = _fallbackIgnoreCaches[database.Name];

    if (cache != null)
    {
        // Added a null check on itemUri
        if (itemUri == null)
            cache.Clear();
        else
            cache.RemoveKeysContaining(itemUri.ItemID.ToString());
    }

    if (ignoreCache != null)
    {
        // Added a null check on itemUri
        if (itemUri == null)
            ignoreCache.Clear();
        else
            ignoreCache.RemoveKeysContaining(itemUri.ItemID.ToString());
    }
}
https://blog.horizontalintegration.com/2014/05/03/sitecore-partial-language-fallback-cache-clearing-issue/
Our application is built on a MongoDB replica set.
I'd like to catch all exceptions thrown during the window when the replica set is in the process of automatic failover, so I can make the application retry or wait for the failover to complete and the failover won't affect users.
I found a document describing the behavior of the Java driver here: https://jira.mongodb.org/browse/DOCS-581
I wrote a test program to find all possible exceptions; they are all MongoException but with different messages:
MongoException.Network: "Read operation to server /10.11.0.121:27017 failed on database test"
MongoException: "can't find a master"
MongoException: "not talking to master and retries used up"
MongoException: "No replica set members available in [ here is replica set status ] for { "mode" : "primary"}"
Maybe more...
I'm confused and not sure whether it is safe to distinguish them by error message.
Also, I don't want to catch all MongoExceptions.
Any suggestions?
Thanks
I am now of the opinion that Mongo in Java is particularly weak in this regard. I don't think your strategy of interpreting the error codes scales well or will survive driver evolution. This is, of course, opinion.
The good news is that the Mongo driver provides a way to get the status of a ReplicaSet: http://api.mongodb.org/java/2.11.1/com/mongodb/ReplicaSetStatus.html. You can use it directly to figure out whether there is a master visible to your application. If that is all you want to know, http://api.mongodb.org/java/2.11.1/com/mongodb/Mongo.html#getReplicaSetStatus() is all you need. Grab it, check for a non-null master, and you are on your way.
ReplicaSetStatus rss = mongo.getReplicaSetStatus();
boolean driverInFailover = rss.getMaster() == null;
If what you really need is to figure out whether the ReplSet is dead, read-only, or read-write, this gets more difficult. Here is the code that kind-of works for me. I hate it.
@Override
public ReplSetStatus getReplSetStatus() {
    ReplSetStatus rss = ReplSetStatus.DOWN;
    MongoClient freshClient = null;
    try {
        if ( mongo != null ) {
            ReplicaSetStatus replicaSetStatus = mongo.getReplicaSetStatus();
            if ( replicaSetStatus != null ) {
                if ( replicaSetStatus.getMaster() != null ) {
                    rss = ReplSetStatus.ReadWrite;
                } else {
                    /*
                     * When mongo.getReplicaSetStatus().getMaster() returns null, it takes a
                     * fresh client to assert whether the ReplSet is read-only or completely
                     * down. I freaking hate this, but take it up with 10gen.
                     */
                    freshClient = new MongoClient( mongo.getAllAddress(), mongo.getMongoClientOptions() );
                    replicaSetStatus = freshClient.getReplicaSetStatus();
                    if ( replicaSetStatus != null ) {
                        rss = replicaSetStatus.getMaster() != null ? ReplSetStatus.ReadWrite : ReplSetStatus.ReadOnly;
                    } else {
                        log.warn( "freshClient.getReplicaSetStatus() is null" );
                    }
                }
            } else {
                log.warn( "mongo.getReplicaSetStatus() returned null" );
            }
        } else {
            throw new IllegalStateException( "mongo is null?!?" );
        }
    } catch ( Throwable t ) {
        log.error( "Ignore unexpected error", t );
    } finally {
        if ( freshClient != null ) {
            freshClient.close();
        }
    }
    log.debug( "getReplSetStatus(): {}", rss );
    return rss;
}
I hate it because it doesn't follow the Mongo Java Driver convention that your application needs only a single Mongo instance, through which you reach the rest of the Mongo data structures (DB, Collection, etc.). I have only been able to observe this working by new'ing up a second Mongo during the check, so that I can rely upon the ReplicaSetStatus null check to discriminate between "ReplSet-DOWN" and "read-only".
What is really needed in this driver is some way to ask direct questions of the Mongo to see if the ReplSet can be expected at this moment to support each of the WriteConcerns or ReadPreferences. Something like...
/**
 * @return true if the current state of the client can support readPreference, false otherwise
 */
boolean mongo.canDoRead( ReadPreference readPreference )

/**
 * @return true if the current state of the client can support writeConcern, false otherwise
 */
boolean mongo.canDoWrite( WriteConcern writeConcern )
This makes sense to me because it acknowledges that the ReplSet may have been healthy when the Mongo was created, but that conditions right now mean that read or write operations of a specific type may fail.
In any event, maybe http://api.mongodb.org/java/2.11.1/com/mongodb/ReplicaSetStatus.html gets you what you need.
When Mongo is failing over, there are no nodes in a PRIMARY state. You can just get the replica set status via the replSetGetStatus command and look for a master node. If you don't find one, you can assume that the cluster is in a failover transition state, and can retry as desired, checking the replica set status on each failed connection.
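For example, with the 2.x Java driver that check could look roughly like this (a sketch using an existing mongo client; treating any missing PRIMARY as "failover in progress" is my assumption here):

// Run replSetGetStatus against the admin database and scan for a PRIMARY member.
CommandResult status = mongo.getDB("admin").command("replSetGetStatus");
boolean hasPrimary = false;
for (Object m : (BasicDBList) status.get("members")) {
    DBObject member = (DBObject) m;
    if ("PRIMARY".equals(member.get("stateStr"))) {
        hasPrimary = true;
        break;
    }
}
// If hasPrimary is false, assume a failover transition: back off and retry.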
I don't know the Java driver implementation itself, but I'd catch all MongoExceptions, then filter them on a getCode() basis. If the error code does not apply to replica set failures, I'd rethrow the MongoException.
The problem is, to my knowledge there is no error-code reference in the documentation. Well, there is a stub here, but it is fairly incomplete. The only way is to read the code of the Java driver to find out what codes it uses…
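A rough sketch of that pattern (the codes in the set are purely illustrative for the reason just given; the real ones would have to be dug out of the driver source):

// Error codes treated as transient replica set failures -- illustrative only.
private static final Set<Integer> FAILOVER_CODES =
        new HashSet<Integer>(Arrays.asList(10054, 10058, 13435, 13436));

try {
    collection.insert(doc); // any read or write operation
} catch (MongoException e) {
    if (FAILOVER_CODES.contains(e.getCode())) {
        // likely a failover in progress: wait, then retry the operation
    } else {
        throw e; // unrelated error code: rethrow
    }
}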