How to register an app from a private repo in Spring Cloud Data Flow 2.6.1

I'm using SCDF 2.6.1 on OpenShift 3, and I'm getting an error while registering an app. The error log looks like this:
java.lang.NullPointerException: null
at org.springframework.cloud.dataflow.configuration.metadata.container.DefaultContainerImageMetadataResolver.getRegistryRequest(DefaultContainerImageMetadataResolver.java:162)
at org.springframework.cloud.dataflow.configuration.metadata.container.DefaultContainerImageMetadataResolver.getImageLabels(DefaultContainerImageMetadataResolver.java:110)
at org.springframework.cloud.dataflow.configuration.metadata.BootApplicationConfigurationMetadataResolver.resolvePortNamesFromContainerImage(BootApplicationConfigurationMetadataResolver.java:215)
at org.springframework.cloud.dataflow.configuration.metadata.BootApplicationConfigurationMetadataResolver.listPortNames(BootApplicationConfigurationMetadataResolver.java:163)
at org.springframework.cloud.dataflow.server.controller.AppRegistryController.getInfo(AppRegistryController.java:193)
at org.springframework.cloud.dataflow.server.controller.AppRegistryController.info(AppRegistryController.java:162)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
I checked the line of code at DefaultContainerImageMetadataResolver.java:162:
// Convert the image name into a well-formed ContainerImage
ContainerImage containerImage = this.containerImageParser.parse(imageName);
// Find a registry configuration that matches the image's registry host
RegistryConfiguration registryConf = this.registryConfigurationMap.get(containerImage.getRegistryHost());
// Retrieve a registry authorizer that supports the configured authorization type.
RegistryAuthorizer registryAuthorizer = this.registryAuthorizerMap.get(registryConf.getAuthorizationType());
I'm pretty sure the error occurs because registryConf is null as a result of
RegistryConfiguration registryConf = this.registryConfigurationMap.get(containerImage.getRegistryHost());
How do I put my private repo URI in registryConfigurationMap?
I have tried putting an imagePullSecret registered with the private repo in the deployment.yml, but I think it doesn't work, because in the startup log I still see:
2020-09-03 04:55:24.111 INFO 1 --- [ main] urationMetadataResolverAutoConfiguration :
Final Registry Configurations: {registry-1.docker.io=RegistryConfiguration{registryHost='registry-1.docker.io', user='null', secret='****', authorizationType=dockeroauth2, manifestMediaType='application/vnd.docker.distribution.manifest.v2+json', disableSslVerification='false',
extra={registryAuthUri=https://auth.docker.io/token?service=registry.docker.io&scope=repository:{repository}:pull&offline_token=1&client_id=shell }}}

The only place where the SCDF server downloads container image layers is when it looks up app metadata.
By default, this is configured to use the Docker Hub registry host (as this is where all the out-of-the-box applications are hosted).
If you want to override that, you can set the corresponding registry-configuration properties at server startup and proceed.
Keep in mind that this configuration is only needed to download the app metadata layer of the image, not to download the entire container image on the SCDF server side.
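For example, a private registry with basic authentication could be declared with properties along these lines (a sketch only: the registry host, credentials, and the configuration key "myregistry" are placeholders, and the exact property names should be checked against the container-registry section of your SCDF version's reference guide):
# hypothetical registry configuration named "myregistry"; host and credentials are placeholders
spring.cloud.dataflow.container.registry-configurations.myregistry.registry-host=my-registry.example.com:5000
spring.cloud.dataflow.container.registry-configurations.myregistry.authorization-type=basicauth
spring.cloud.dataflow.container.registry-configurations.myregistry.user=myuser
spring.cloud.dataflow.container.registry-configurations.myregistry.secret=mysecret
spring.cloud.dataflow.container.registry-configurations.myregistry.disable-ssl-verification=false
On OpenShift/Kubernetes these can also be supplied as environment variables in the server's deployment.yml via Spring Boot's relaxed binding, which is usually the more convenient form.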

Related

"SchemaRegistryException: Failed to get Kafka cluster ID" for LOCAL setup

I downloaded the .tz (I am on a Mac) for Confluent version 7.0.0 from the official Confluent site and was following the LOCAL (1 node) setup. Kafka/ZooKeeper are starting fine, but the Schema Registry keeps failing. (Note: I am behind a corporate VPN.)
The exception message in the SchemaRegistry logs is:
[2021-11-04 00:34:22,492] INFO Logging initialized #1403ms to org.eclipse.jetty.util.log.Slf4jLog (org.eclipse.jetty.util.log)
[2021-11-04 00:34:22,543] INFO Initial capacity 128, increased by 64, maximum capacity 2147483647. (io.confluent.rest.ApplicationServer)
[2021-11-04 00:34:22,614] INFO Adding listener: http://0.0.0.0:8081 (io.confluent.rest.ApplicationServer)
[2021-11-04 00:35:23,007] ERROR Error starting the schema registry (io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication)
io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryException: Failed to get Kafka cluster ID
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.kafkaClusterId(KafkaSchemaRegistry.java:1488)
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.<init>(KafkaSchemaRegistry.java:166)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.initSchemaRegistry(SchemaRegistryRestApplication.java:71)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.configureBaseApplication(SchemaRegistryRestApplication.java:90)
at io.confluent.rest.Application.configureHandler(Application.java:271)
at io.confluent.rest.ApplicationServer.doStart(ApplicationServer.java:245)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:44)
Caused by: java.util.concurrent.TimeoutException
at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928)
at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:180)
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.kafkaClusterId(KafkaSchemaRegistry.java:1486)
... 7 more
My schema-registry.properties file has the bootstrap URL set to
kafkastore.bootstrap.servers=PLAINTEXT://localhost:9092
I saw some posts saying it may be the Schema Registry being unable to connect to the Kafka cluster URL, potentially because of the localhost address. I am fairly new to Kafka and basically just need this local setup to run a Git repo that uses some topics/Kafka, so my questions are:
How can I fix this? (I am behind a corporate VPN, but I figured that shouldn't affect it.)
Do I even need the SchemaRegistry?
I ended up just going with the local Docker setup instead, and the only change I had to make to the Docker Compose YAML was the schema-registry port (I changed it to 8082 or 8084, I don't remember exactly, just an unused port not taken by another Confluent service listed in docker-compose.yaml), and my local setup is working fine now.
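For reference, the relevant docker-compose.yml fragment would look roughly like this (a sketch only, modeled on Confluent's cp-all-in-one compose file; the service names, image tag, and the remapped host port 8082 are assumptions):
schema-registry:
  image: confluentinc/cp-schema-registry:7.0.0
  depends_on:
    - broker
  ports:
    # remap the host port (8082 here) if 8081 clashes with something else on your machine
    - "8082:8081"
  environment:
    SCHEMA_REGISTRY_HOST_NAME: schema-registry
    SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: 'broker:29092'
    SCHEMA_REGISTRY_LISTENERS: http://0.0.0.0:8081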

How do I set a node definition from the Docker plugin?

I'm trying to register Docker containers as nodes with the following custom mapping:
hostname.selector=docker:IPAddress
node.name.selector=docker:Name
username.selector=root
osFamily.selector=Docker
ssh-authentication=password
ssh-password-storage-path=keys/${node.hostname}/${node.username}
node.ssh-authentication.selector=password
docker-shell.default=bash
I always get this error message:
Failed: AuthenticationFailure: Authentication failure connecting to node: "xxxxxx". Make sure your resource definitions and credentials are up to date.
Set the Docker node executor: go to Project Settings > Edit Configuration > Default Node Executor tab, select "docker-container-node-executor", and save.
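If you prefer configuration files, the same default can presumably be set in the project's project.properties (a sketch; the provider name must match what your version of the Docker plugin actually registers):
# default node executor for the project (provider name taken from the docker plugin)
service.NodeExecutor.default.provider=docker-container-node-executor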

Using the Beam Python SDK and PortableRunner to connect to Kafka with SSL

I have the code below for connecting to Kafka using the Python Beam SDK. I know that the ReadFromKafka transform is run in a Java SDK harness (Docker container), but I have not been able to figure out how to make ssl.truststore.location and ssl.keystore.location accessible inside the SDK harness' Docker environment. The job_endpoint argument points to java -jar beam-runners-flink-1.10-job-server-2.27.0.jar --flink-master localhost:8081.
pipeline_args.extend([
    '--job_name=paul_test',
    '--runner=PortableRunner',
    '--sdk_location=container',
    '--job_endpoint=localhost:8099',
    '--streaming',
    "--environment_type=DOCKER",
    f"--sdk_harness_container_image_overrides=.*java.*,{my_beam_sdk_docker_image}:{my_beam_docker_tag}",
])

with beam.Pipeline(options=PipelineOptions(pipeline_args)) as pipeline:
    kafka = pipeline | ReadFromKafka(
        consumer_config={
            "bootstrap.servers": "bootstrap-server:17032",
            "security.protocol": "SSL",
            "ssl.truststore.location": "/opt/keys/client.truststore.jks",  # how do I make this available to the Java SDK harness?
            "ssl.truststore.password": "password",
            "ssl.keystore.type": "PKCS12",
            "ssl.keystore.location": "/opt/keys/client.keystore.p12",  # how do I make this available to the Java SDK harness?
            "ssl.keystore.password": "password",
            "group.id": "group",
            "basic.auth.credentials.source": "USER_INFO",
            "schema.registry.basic.auth.user.info": "user:password",
        },
        topics=["topic"],
        max_num_records=2,
        # expansion_service="localhost:56938"
    )
    kafka | beam.Map(lambda x: print(x))
I tried specifying the image override option as --sdk_harness_container_image_overrides='.*java.*,beam_java_sdk:latest', where beam_java_sdk:latest is a Docker image I based on apache/beam_java11_sdk:2.27.0 that pulls the credentials in its entrypoint.sh. But Beam does not appear to use it; I see
INFO org.apache.beam.runners.fnexecution.environment.DockerEnvironmentFactory - Still waiting for startup of environment apache/beam_java11_sdk:2.27.0 for worker id 1-1
in the logs, which is soon inevitably followed by
Caused by: org.apache.kafka.common.KafkaException: org.apache.kafka.common.KafkaException: org.apache.kafka.common.KafkaException: Failed to load SSL keystore /opt/keys/client.keystore.p12 of type PKCS12
In conclusion, my question is this: in Apache Beam, is it possible to make files available inside the Java SDK harness Docker container from the Python Beam SDK? If so, how might it be done?
Many thanks.
Currently, there is no straightforward way to achieve this. There is ongoing discussion, and there are tracking issues to provide support for this kind of expansion service customization (see here, here, BEAM-12538 and BEAM-12539). That is the short answer.
The long answer is yes, you can do that. You would have to copy & paste ExpansionService.java into your codebase and build your custom expansion service, where you specify the default environment (DOCKER) and the default environment config (your image) here. You then have to run this expansion service manually and specify its address using the expansion_service parameter of ReadFromKafka.
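That last step would look roughly like this on the Python side (a sketch only: the port 8097 is an assumption and must match wherever you started your custom expansion service, and the consumer_config is abbreviated from the question's settings):
import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka
from apache_beam.options.pipeline_options import PipelineOptions

# pipeline_args as in the question; the custom expansion service (built from your
# modified ExpansionService.java, defaulting to DOCKER with your image) is assumed
# to have been started manually, e.g. on localhost:8097.
with beam.Pipeline(options=PipelineOptions(pipeline_args)) as pipeline:
    kafka = pipeline | ReadFromKafka(
        consumer_config={
            "bootstrap.servers": "bootstrap-server:17032",
            "security.protocol": "SSL",
            # ... same SSL/keystore settings as above ...
        },
        topics=["topic"],
        expansion_service="localhost:8097",  # address of the manually started expansion service
    )
    kafka | beam.Map(lambda x: print(x))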

Unable to Run a Composed Task in Spring Cloud Data Flow

I am running the latest version of the SCDF server on a Kubernetes cluster. Every time I try to run a composed task, it tries to fetch the application properties for the composed-task-runner application and fails to launch the composed task.
First of all, SCDF is trying to pull the properties (metadata) from the Spring Maven repo when I run the server on k8s. My server is behind a firewall and cannot connect to the Spring Maven repo. I already downloaded the composed-task-runner Docker image to my local repo and added the composed-task-runner application using the UI. Why does it still try to download metadata from the Spring Maven repo? How do I stop it?
Here is the log:
2020-11-21 15:49:07.591 INFO 1 --- [nio-8080-exec-4] o.s.c.d.s.k.DefaultContainerFactory : Using Docker entry point style: exec
2020-11-21 15:49:58.355 WARN 1 --- [nio-8080-exec-6] .s.c.d.s.s.i.TaskConfigurationProperties : org.springframework.cloud.dataflow.server.service.impl.TaskConfigurationProperties.logDeprecationWarning is deprecated. Please use org.springframework.cloud.dataflow.server.service.impl.ComposedTaskRunnerConfigurationProperties.logDeprecationWarning
2020-11-21 15:50:18.427 WARN 1 --- [nio-8080-exec-6] ApplicationConfigurationMetadataResolver : Failed to retrieve properties for resource org.springframework.cloud:spring-cloud-dataflow-composed-task-runner:jar:2.7.0-SNAPSHOT because of ConnectTimeoutException: Connect to repo.spring.io:443 timed out
2020-11-21 15:50:38.522 WARN 1 --- [nio-8080-exec-6] ApplicationConfigurationMetadataResolver : Failed to retrieve properties for resource org.springframework.cloud:spring-cloud-dataflow-composed-task-runner:jar:2.7.0-SNAPSHOT because of ConnectTimeoutException: Connect to repo.spring.io:443 timed out
2020-11-21 15:50:38.572 INFO 1 --- [nio-8080-exec-6] o.s.c.d.s.k.KubernetesTaskLauncher : Preparing to run a container from org.springframework.cloud:spring-cloud-dataflow-composed-task-runner:jar:2.7.0-SNAPSHOT. This may take some time if the image must be downloaded from a remote container registry.
2020-11-21 15:50:38.573 INFO 1 --- [nio-8080-exec-6] o.s.c.d.s.k.DefaultContainerFactory : Using Docker image: //org.springframework.cloud:spring-cloud-dataflow-composed-task-runner:jar:2.7.0-SNAPSHOT
It looks like the composed task runner Docker image can now be set using the environment variable:
- name: SPRING_CLOUD_DATAFLOW_TASK_COMPOSED_TASK_RUNNER_URI
  value: 'docker://springcloud/spring-cloud-dataflow-composed-task-runner:2.6.0'
We were on SCDF server 2.2.4 before this, and we had to manually add the composed task runner as an application using the dashboard UI.
Now, all I had to do was download this image, push it to my local repo, and use it here.
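In context, the variable goes into the env section of the SCDF server deployment, roughly like the fragment below (a sketch; the container name, image tag, and private-registry prefix are placeholders):
spec:
  containers:
    - name: scdf-server
      image: springcloud/spring-cloud-dataflow-server:2.7.0
      env:
        # point the composed task runner at an image reachable from the cluster
        - name: SPRING_CLOUD_DATAFLOW_TASK_COMPOSED_TASK_RUNNER_URI
          value: 'docker://my-registry.example.com/spring-cloud-dataflow-composed-task-runner:2.6.0'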

Spring Cloud Dataflow errorChannel not working

I'm attempting to create a custom exception handler for my Spring Cloud Dataflow stream to route some errors to be requeued and others to be DLQ'd.
To do this I'm utilizing the global Spring Integration "errorChannel" and routing based on exception type.
This is the code for the Spring Integration error router:
package com.acme.error.router;

import com.acme.exceptions.DlqException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.integration.annotation.MessageEndpoint;
import org.springframework.integration.annotation.Router;
import org.springframework.integration.transformer.MessageTransformationException;
import org.springframework.messaging.Message;

@MessageEndpoint
@EnableBinding({ ErrorMessageChannels.class })
public class ErrorMessageMappingRouter {

    private static final Logger LOGGER = LoggerFactory.getLogger(ErrorMessageMappingRouter.class);

    public static final String ERROR_CHANNEL = "errorChannel";

    @Router(inputChannel = ERROR_CHANNEL)
    public String onError(Message<Object> message) {
        LOGGER.debug("ERROR ROUTER - onError");
        if (message.getPayload() instanceof MessageTransformationException) {
            MessageTransformationException exception = (MessageTransformationException) message.getPayload();
            Message<?> failedMessage = exception.getFailedMessage();
            if (exceptionChainContainsDlq(exception)) {
                return ErrorMessageChannels.DLQ_QUEUE_NAME;
            }
            return ErrorMessageChannels.REQUEUE_CHANNEL;
        }
        return ErrorMessageChannels.DLQ_QUEUE_NAME;
    }
    ...
}
The error router is picked up by each of the stream apps through a package scan on each app's Spring Boot application class:
@ComponentScan(basePackages = { "com.acme.error.router" })
@SpringBootApplication
public class StreamApp {}
When this is deployed and run with the local Spring Cloud Dataflow server (version 1.5.0-RELEASE), and a DlqException is thrown, the message is successfully routed to the onError method in the errorRouter and then placed into the dlq topic.
However, when this is deployed as a docker container with SCDF Kubernetes server (also version 1.5.0-RELEASE), the onError method is never hit. (The log statement at the beginning of the router is never output)
In the startup logs for the stream apps, it looks like the bean is picked up correctly and registers as a listener for the errorChannel, but for some reason, when exceptions are thrown they do not get handled by the onError method in our router.
Startup Logs:
o.s.i.endpoint.EventDrivenConsumer : Adding {router:errorMessageMappingRouter.onError.router} as a subscriber to the 'errorChannel' channel
o.s.i.channel.PublishSubscribeChannel : Channel 'errorChannel' has 1 subscriber(s).
o.s.i.endpoint.EventDrivenConsumer : started errorMessageMappingRouter.onError.router
We are using all default settings for the Spring Cloud Stream and Kafka binder configurations:
spring.cloud:
  stream:
    binders:
      kafka:
        type: kafka
        environment.spring.cloud.stream.kafka.binder.brokers=brokerlist
        environment.spring.cloud.stream.kafka.binder.zkNodes=zklist
Edit: Added pod args from kubectl describe <pod>
Args:
--spring.cloud.stream.bindings.input.group=delivery-stream
--spring.cloud.stream.bindings.output.producer.requiredGroups=delivery-stream
--spring.cloud.stream.bindings.output.destination=delivery-stream.enricher
--spring.cloud.stream.binders.xdkafka.environment.spring.cloud.stream.kafka.binder.zkNodes=<zkNodes>
--spring.cloud.stream.binders.xdkafka.type=kafka
--spring.cloud.stream.binders.xdkafka.defaultCandidate=true
--spring.cloud.stream.binders.xdkafka.environment.spring.cloud.stream.kafka.binder.brokers=<brokers>
--spring.cloud.stream.bindings.input.destination=delivery-stream.config-enricher
One other idea we attempted was using the Spring Cloud Stream / Spring Integration error channel support to send errors to a broker topic, but since messages don't seem to be landing in the global Spring Integration errorChannel at all, that didn't work either.
Is there anything special we need to do in SCDF Kubernetes to enable the global Spring Integration errorChannel?
What am I missing here?
Update with the solution from the comments:
"After reviewing your configuration I am now pretty sure I know what the issue is. You have a multi-binder configuration scenario. Even if you only deal with a single binder instance, the existence of spring.cloud.stream.binders.... is what's going to make the framework treat it as multi-binder. Basically this is a bug - github.com/spring-cloud/spring-cloud-stream/issues/1384. As you can see it was fixed, but you need to upgrade to Elmhurst.SR2 or grab the latest snapshot (we're in RC2 and 2.1.0.RELEASE is in a few weeks anyway)." – Oleg Zhurakousky
This was indeed the problem with our setup. Instead of upgrading, we just eliminated our multi-binder usage for now, and the issue was resolved.
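Eliminating the multi-binder usage amounted to dropping the spring.cloud.stream.binders.* properties (the xdkafka binder in the pod args above) and configuring the single Kafka binder directly, roughly like this (a sketch; broker and ZooKeeper hosts are placeholders):
spring:
  cloud:
    stream:
      kafka:
        binder:
          brokers: <brokers>
          zkNodes: <zkNodes>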