To help debug an issue with a YARN application, I need to modify some of the system code on IAE to provide more debug output.
I have retrieved this jar file from the cluster to my local machine:
/usr/hdp/current/hadoop-client/hadoop-aws.jar
I've modified the bytecode to log more information when an exception is thrown in checkOpen():
public class S3AOutputStream extends OutputStream {
  ...
  void checkOpen() throws IOException {
    if (closed.get()) {
      // some log4j statements added to the bytecode here ...
      throw new IOException("Output Stream closed");
    }
  }
  ...
}
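At source level, the change is roughly equivalent to the sketch below (simplified; the logger name and message are only illustrative, since the real change was made directly in the bytecode):
private static final org.apache.log4j.Logger DEBUG_LOG =
    org.apache.log4j.Logger.getLogger(S3AOutputStream.class);

void checkOpen() throws IOException {
  if (closed.get()) {
    // extra debug output: record that the stream was already closed and capture the caller's stack
    DEBUG_LOG.error("checkOpen() called on a closed stream", new Exception("caller stack trace"));
    throw new IOException("Output Stream closed");
  }
}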
However, I'm unable to save the library with my changes back to the cluster because I don't have root access.
How can I deploy my modified jar files to the cluster? Assume that I need to install the libraries on the name node and compute nodes.
This is not currently possible with IBM Analytics Engine.
Please raise a support ticket describing your issue.
I have been working on creating a simple custom processor in Scala for Spring Cloud Data Flow and have been running into issues with sending data to and receiving data from the starter applications. I have been unable to see any messages propagating through the stream. The definition of the stream is time --trigger.time-unit=SECONDS | pass-through-log | log, where pass-through-log is my custom processor.
I am using Spring Cloud Data Flow 2.5.1 and Spring Boot 2.2.6.
Here is the code used for the processor - I am using the functional model.
import java.util.function.Function

import org.springframework.boot.SpringApplication
import org.springframework.boot.autoconfigure.SpringBootApplication
import org.springframework.context.annotation.Bean

@SpringBootApplication
class PassThroughLog {
  @Bean
  def passthroughlog(): Function[String, String] = {
    input: String => {
      println(s"Received input `$input`")
      input
    }
  }
}

object PassThroughLog {
  def main(args: Array[String]): Unit = SpringApplication.run(classOf[PassThroughLog], args: _*)
}
application.yml
spring:
  cloud:
    stream:
      function:
        bindings:
          passthroughlog-in-0: input
          passthroughlog-out-0: output
build.gradle.kts
// scala
implementation("org.scala-lang:scala-library:2.12.10")
// spring
implementation(platform("org.springframework.cloud:spring-cloud-dependencies:Hoxton.SR5"))
implementation(platform("org.springframework.cloud:spring-cloud-stream-dependencies:Horsham.SR5"))
implementation("org.springframework.boot:spring-boot-starter")
implementation("org.springframework.boot:spring-boot-starter-actuator")
implementation("org.springframework.cloud:spring-cloud-starter-function-web:3.0.7.RELEASE")
implementation("org.springframework.cloud:spring-cloud-starter-stream-kafka:3.0.5.RELEASE")
I have posted the entire project to GitHub if the code samples here are lacking. I also posted the logs there, as they are quite long.
When I bootstrap a local Kafka cluster and push arbitrary data to the input topic, I am able to see data flowing through the processor. However, when I deploy the application on Spring Cloud Data Flow, this is not the case. I am deploying the app via Docker in Kubernetes.
Additionally, when I deploy a stream with the definition time --trigger.time-unit=SECONDS | log, I see messages in the log sink. This has convinced me the problem lies with the custom processor.
Am I missing something simple like a dependency or extra configuration? Any help is greatly appreciated.
When using a Spring Cloud Stream 3.x version in SCDF, there are additional properties that you will have to set to let SCDF know which function bindings are configured as the input and output channels.
See: Functional Applications
Pay attention specifically to the following properties:
app.time-source.spring.cloud.stream.function.bindings.timeSupplier-out-0=output
app.log-sink.spring.cloud.stream.function.bindings.logConsumer-in-0=input
In your case, you will have to map passthroughlog-in-0 and passthroughlog-out-0 function bindings to input and output respectively.
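For your stream definition, that means supplying deployment properties along these lines when you deploy the stream (the app name pass-through-log is taken from your stream definition; adjust it if your registered app name differs):
app.pass-through-log.spring.cloud.stream.function.bindings.passthroughlog-in-0=input
app.pass-through-log.spring.cloud.stream.function.bindings.passthroughlog-out-0=output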
Turns out the problem was with my Dockerfile. For ease of configuration, I had a build argument to specify the jar file used in the ENTRYPOINT. To accomplish this, I used the shell version of ENTRYPOINT. Changing up my ENTRYPOINT to the exec version solved my issue.
The shell version of ENTRYPOINT does not play well with image arguments (docker run <image> <args>), and hence SCDF could not pass the appropriate arguments to the container.
Changing my Dockerfile from:
FROM openjdk:11.0.5-jdk-slim as build
ARG JAR
ENV JAR $JAR
ADD build/libs/$JAR .
ENTRYPOINT java -jar $JAR
to
FROM openjdk:11.0.5-jdk-slim as build
ARG JAR
ADD build/libs/$JAR program.jar
ENTRYPOINT ["java", "-jar", "program.jar"]
fixed the problem.
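With the exec form, the build argument is only needed at image build time, for example (the jar name is just a placeholder):
docker build --build-arg JAR=pass-through-log-0.0.1.jar -t pass-through-log .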
I have installed and tested Kafka Connect in distributed mode; it now works, connecting to the configured sink and reading from the configured source.
That being the case, I moved on to enhancing my installation. The one area I think needs immediate attention is the fact that the only available means of creating a connector is through REST calls, which means I need to send my information over the wire, unprotected.
In order to secure this, Kafka introduced the new ConfigProvider seen here.
This is helpful as it allows me to set properties on the server and then reference them in the REST call, like so:
{
  ...
  "property": "${file:/path/to/file:nameOfThePropertyInFile}"
  ...
}
This works really well; all it takes is adding the property file on the server and adding the following config to the distributed.properties file:
# multiple comma-separated provider types can be specified here
config.providers=file
config.providers.file.class=org.apache.kafka.common.config.provider.FileConfigProvider
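The referenced file is then just a plain properties file sitting on the server, for example (contents are illustrative):
# /path/to/file
nameOfThePropertyInFile=someSecretValue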
While this solution works, it really does not help to ease my concerns regarding security, as the information has gone from being sent over the wire to sitting in a file on the server, in plain text for everyone to see.
The Kafka team foresaw this issue and allowed clients to produce their own configuration providers by implementing the ConfigProvider interface.
I have created my own implementation and packaged it in a jar, giving it the suggested final name:
META-INF/services/org.apache.kafka.common.config.ConfigProvider
and added the following entries to the distributed properties file:
config.providers=cust
config.providers.cust.class=com.somename.configproviders.CustConfigProvider
However, I am getting an error from Connect stating that a class implementing ConfigProvider with the name:
com.somename.configproviders.CustConfigProvider
could not be found.
I am at a loss now, because the documentation on their site does not explain very well how to configure custom config providers.
Has anyone worked on a similar issue who could provide some insight into this? Any help would be appreciated.
I just went through this recently to set up a custom ConfigProvider. The official doc is ambiguous and confusing.
I have created my own implementation and packaged it in a jar, giving it the suggested final name:
META-INF/services/org.apache.kafka.common.config.ConfigProvider
You can name the final jar file whatever you like, but it needs to be packaged in jar format, with a .jar suffix.
Here is the complete step-by-step guide. Suppose your custom ConfigProvider's fully-qualified name is com.my.CustomConfigProvider.MyClass.
1. Create a file under the directory META-INF/services/ named org.apache.kafka.common.config.ConfigProvider. The file content is the fully-qualified class name:
com.my.CustomConfigProvider.MyClass
2. Include your source code and the above META-INF folder when generating the jar package. If you are using Maven, the file goes under src/main/resources so that it ends up in the jar at that path.
3. Put your final jar file, say custom-config-provider-1.0.jar, under the Kafka worker plugin folder. The default is /usr/share/java; this is the PLUGIN_PATH in the Kafka worker config file.
4. Upload all the dependency jars to PLUGIN_PATH as well. Use the META-INF/MANIFEST.MF file inside your jar to configure the Class-Path of the dependent jars that your code will use.
5. In the Kafka worker config file, create two additional properties:
CONNECT_CONFIG_PROVIDERS: 'mycustom', // alias name of your ConfigProvider
CONNECT_CONFIG_PROVIDERS_MYCUSTOM_CLASS: 'com.my.CustomConfigProvider.MyClass',
6. Restart the workers.
7. Update your connector config file by POSTing it to the Kafka Connect REST API with curl. In the connector config file, you can reference values inside the ConfigData returned from ConfigProvider.get(path, keys) with syntax like:
database.password=${mycustom:/path/pass/to/get/method:password}
Here ConfigData is a map that contains {password: 123}.
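If it helps, here is a rough sketch of what the provider class itself can look like, reusing the class name from your worker config (the properties-file lookup is just an assumption for illustration; substitute your own secret-retrieval logic):
package com.somename.configproviders;

import java.io.FileInputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.Set;

import org.apache.kafka.common.config.ConfigData;
import org.apache.kafka.common.config.ConfigException;
import org.apache.kafka.common.config.provider.ConfigProvider;

public class CustConfigProvider implements ConfigProvider {

    @Override
    public void configure(Map<String, ?> configs) {
        // pick up any config.providers.cust.param.* settings here if your provider needs them
    }

    @Override
    public ConfigData get(String path) {
        // return every key found at the given path
        return new ConfigData(load(path));
    }

    @Override
    public ConfigData get(String path, Set<String> keys) {
        // return only the requested keys
        Map<String, String> all = load(path);
        Map<String, String> data = new HashMap<>();
        for (String key : keys) {
            if (all.containsKey(key)) {
                data.put(key, all.get(key));
            }
        }
        return new ConfigData(data);
    }

    @Override
    public void close() {
        // nothing to release in this sketch
    }

    // Illustrative only: resolve the path argument to a local properties file.
    private Map<String, String> load(String path) {
        Properties props = new Properties();
        try (FileInputStream in = new FileInputStream(path)) {
            props.load(in);
        } catch (IOException e) {
            throw new ConfigException("Could not read " + path + ": " + e.getMessage());
        }
        Map<String, String> result = new HashMap<>();
        for (String name : props.stringPropertyNames()) {
            result.put(name, props.getProperty(name));
        }
        return result;
    }
}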
If you are still seeing a ClassNotFoundException, your classpath is probably not set up correctly.
Note:
• If you are using AWS ECS/EC2, you need to set the worker config by setting environment variables.
• The worker config file and the connector config file are different files.
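If you run Connect from a plain worker properties file instead of container environment variables, the equivalent entries use the same style you already have in your distributed config (matching the alias above):
config.providers=mycustom
config.providers.mycustom.class=com.my.CustomConfigProvider.MyClass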
I am working on creating a Camel Spring Boot application that implements an OPC-UA connection. Till now, I have been able to successfully run the examples obtained from the Eclipse Milo GitHub repository.
Now, my task is to create a Camel route that will connect to the OPC-UA server running on a different machine, read data from there, and store it in a JMS queue.
Till now, I have been able to run the BrowseNodeExample and ReadNodeExample, where I connect to a server simulator (Top Server V6). In the example code, when connecting to the server, the endpoint of the server is given as "opc.tcp://127.0.0.1:49384/SWToolbox.TOPServer.V6".
Now, in the Camel routing code, what should I write in the .from() part of .configure()? The code is as follows:
@Override
public void configure() throws Exception {
    from("opc.tcp://127.0.0.1:49384/SWToolbox.TOPServer.V6")
        .process(opcConnection)
        .split(body().tokenize(";"))
        .to(opcBean.getKarafQueue());
}
While searching for a solution I came across one option: milo-server:tcp://127.0.0.1:49384/SWToolbox.TOPServer.V6/nodeId=2&namespaceUri=http://examples.freeopcua.github.io. I tried that, but it didn't work. In both cases I get the error below:
ResolveEndpointFailedException: Failed to resolve endpoint: (endpoint
given) due to: No component found with scheme: milo-server (or
opc.tcp)
You might want to add the camel-opc component to your project.
I've found one on GitHub
and also the Milo version on Maven Central for the OPC-UA connection.
Hope that helps :-)
The ResolveEndpointFailedException is quite clear, Camel cannot find the component. That means that the auto-discovery failed to load the definition in the META-INF directory.
Have you checked that the camel-milo jar is contained in your fat-jar/war?
As a workaround you can add the component manually via
CamelContext context = new DefaultCamelContext();
context.addComponent("foo", new FooComponent(context));
http://camel.apache.org/how-do-i-add-a-component.html
or in your case
@Override
public void configure() throws Exception {
    getContext().addComponent("milo-server", new org.apache.camel.component.milo.server.MiloServerComponent());
    from("milo-server:tcp://127.0.0.1:49384/SWToolbox.TOPServer.V6/nodeId=2&namespaceUri=http://examples.freeopcua.github.io")
    ...
}
Furthermore, be aware that milo-server starts an OPC UA server. As I understood your question, you want to connect to an OPC UA server; therefore you need the milo-client component.
camel-milo client at GitHub
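A rough sketch of what the client side could look like is below (I have not verified the endpoint options against your server or the exact camel-milo URI syntax, so treat the node parameter as a placeholder and check the component documentation):
@Override
public void configure() throws Exception {
    // register the client component explicitly if auto-discovery does not pick it up
    getContext().addComponent("milo-client",
        new org.apache.camel.component.milo.client.MiloClientComponent());

    // subscribe to a node on the remote server; replace the node identifier with a real tag
    from("milo-client:opc.tcp://127.0.0.1:49384/SWToolbox.TOPServer.V6?node=RAW(nsu=http://examples.freeopcua.github.io;s=someTag)")
        .process(opcConnection)
        .split(body().tokenize(";"))
        .to(opcBean.getKarafQueue());
}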
This is related to the embedded Kafka server provided by Spring. On running my test case, the embedded Kafka instantiation (as shown below) fails.
public static KafkaEmbedded embeddedKafka = new KafkaEmbedded(1, true, KAFKA_TOPIC);
The instantiation fails with the following error:
1> C:\Users\r2dev\AppData\Local\Temp\kafka-1587343850239557903\version-2\log.1: The process cannot access the file because it is being used by another process
2> C:\Users\r2dev\AppData\Local\Temp\kafka-7315008084340411800.lock: The process cannot access the file because it is being used by another process.
I am using "1.2.0.RELEASE" spring kafka version and Java 8.
Any one faced this issue and were able to fix this issue. Please let me know
Thanks in advance.
I'm starting with Service Fabric. I have created a very simple console application that runs the following code:
class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine("Hello world!");
        File.AppendAllText("c:\\temp\\hello.txt", "Hello world!" + DateTime.Now.ToString() + "\r\n");
        Console.ReadLine();
    }
}
Then I create a guest executable project with Visual Studio and point it to the exe application. It gets installed in Service Fabric, and I can see that the file is created, but then Service Fabric throws an error:
Error event: SourceId='System.FM', Property='State'.
Partition is below target replica or instance count.
fabric:/Test3/Test3Service -1 1 5ef5a0eb-5621-4821-95cb-4c1920ab7f0c
(Showing 0 out of 0 replicas. Total available replicas: 0.)
Is this approach correct? Can I have exe applications hosted in Service Fabric or do I need to implement/inherit from something?
EDIT
When the application is deployed it enters a Warning state; soon afterwards it transitions to an Error state.
Yes, you can host a simple console application in Service Fabric as a guest executable; that should not be a problem.
The issue you are seeing is likely because the application is trying to write to a file in C:\temp, where your guest exe by default doesn't have permissions. Try removing that part of your sample code, or change it to write to hello.txt, which will end up in the same folder your guest exe is running in.
You should consider file storage on a Service Fabric node as temporary, however, and not rely on storing data there, as your service could be moved between nodes by Service Fabric as part of its cluster maintenance.
See this answer for some more details on file system access in SF: https://stackoverflow.com/a/37966158/1062217