Beam SDK harness still trying to launch Docker when I set environment_type to `PROCESS` - apache-kafka

According to the Beam harness documentation:
PROCESS: User code is executed by processes that are automatically started by the runner on each worker node.
args = [
    "--runner=portableRunner",
    "--streaming",
    "--sdk_worker_parallelism=2",
    "--environment_type=PROCESS",
    "--environment_config={\"command\": \"/opt/apache/beam/boot\"}",
]
consumer_config = {
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "AWS_MSK_IAM",
    "sasl.jaas.config": "software.amazon.msk.auth.iam.IAMLoginModule required;",
    "sasl.client.callback.handler.class": "software.amazon.msk.auth.iam.IAMClientCallbackHandler",
    "bootstrap.servers": bootstrap_servers,
}
with beam.Pipeline(options=PipelineOptions(args)) as p:
    data = p | "Reading messages from Kafka" >> ReadFromKafka(
        consumer_config=consumer_config,
        topics=topics,
        with_metadata=True
    )
    data | 'Writing to stdout' >> beam.Map(logging.info)
But when I run the code (deployed to k8s using flinkk8soperator), it complains:
Caused by: java.io.IOException: Cannot run program "docker": error=2, No such file or directory
Am I misunderstanding something? Thanks!

After a couple of rounds of digging, I finally made the cross-language transform work without using DinD or DooD. Here are the steps:
Ensure both the job manager and the task manager mount a shared volume for artifact staging. (This is required; otherwise the task manager will complain that it is unable to find the submitted jar.)
Ensure your Docker image can run both Java and Python Beam code; here's what I did:
# python SDK
COPY --from=apache/beam_python3.7_sdk:2.41.0 /opt/apache/beam/ /opt/apache/beam/
# java SDK
COPY --from=apache/beam_java8_sdk:2.41.0 /opt/apache/beam/ /opt/apache/beam_java/
In the job, you'll need to start the expansion service with extra args, for example for KafkaIO:
from apache_beam.io.kafka import ReadFromKafka, default_io_expansion_service

ReadFromKafka(
    consumer_config=consumer_config,
    topics=[topic],
    with_metadata=False,
    expansion_service=default_io_expansion_service(
        append_args=[
            '--defaultEnvironmentType=PROCESS',
            "--defaultEnvironmentConfig={\"command\":\"/opt/apache/beam_java/boot\"}",
        ]
    ),
)
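For reference, here is one minimal sketch of how the pieces fit together, assuming the image layout from the Dockerfile above; the job endpoint, broker address, and topic name are placeholders, not values from my actual deployment:

import logging

import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka, default_io_expansion_service
from apache_beam.options.pipeline_options import PipelineOptions

# Python harness runs as a process started from the Python SDK boot binary.
args = [
    "--runner=PortableRunner",
    "--job_endpoint=localhost:8099",  # placeholder; point at your Flink job service
    "--streaming",
    "--environment_type=PROCESS",
    '--environment_config={"command": "/opt/apache/beam/boot"}',
]

consumer_config = {"bootstrap.servers": "broker:9092"}  # placeholder; use your real MSK config
topics = ["my-topic"]  # placeholder

with beam.Pipeline(options=PipelineOptions(args)) as p:
    (
        p
        | "Read from Kafka" >> ReadFromKafka(
            consumer_config=consumer_config,
            topics=topics,
            # The expanded Java transform also runs under PROCESS, using the
            # Java SDK boot binary copied into the image by the Dockerfile above.
            expansion_service=default_io_expansion_service(
                append_args=[
                    "--defaultEnvironmentType=PROCESS",
                    '--defaultEnvironmentConfig={"command":"/opt/apache/beam_java/boot"}',
                ]
            ),
        )
        | "Log" >> beam.Map(logging.info)
    )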

Your portable execution relies on cross-language (xLang) support, which by default requires starting a Java SDK harness with Docker. Your cluster doesn't have Docker installed.

Related

Using the Beam Python SDK and PortableRunner to connect to Kafka with SSL

I have the code below for connecting to Kafka using the Python Beam SDK. I know that the ReadFromKafka transform is run in a Java SDK harness (Docker container), but I have not been able to figure out how to make ssl.truststore.location and ssl.keystore.location accessible inside the SDK harness's Docker environment. The job_endpoint argument is pointing to java -jar beam-runners-flink-1.10-job-server-2.27.0.jar --flink-master localhost:8081
pipeline_args.extend([
    '--job_name=paul_test',
    '--runner=PortableRunner',
    '--sdk_location=container',
    '--job_endpoint=localhost:8099',
    '--streaming',
    "--environment_type=DOCKER",
    f"--sdk_harness_container_image_overrides=.*java.*,{my_beam_sdk_docker_image}:{my_beam_docker_tag}",
])
with beam.Pipeline(options=PipelineOptions(pipeline_args)) as pipeline:
    kafka = pipeline | ReadFromKafka(
        consumer_config={
            "bootstrap.servers": "bootstrap-server:17032",
            "security.protocol": "SSL",
            "ssl.truststore.location": "/opt/keys/client.truststore.jks",  # how do I make this available to the Java SDK harness
            "ssl.truststore.password": "password",
            "ssl.keystore.type": "PKCS12",
            "ssl.keystore.location": "/opt/keys/client.keystore.p12",  # how do I make this available to the Java SDK harness
            "ssl.keystore.password": "password",
            "group.id": "group",
            "basic.auth.credentials.source": "USER_INFO",
            "schema.registry.basic.auth.user.info": "user:password"
        },
        topics=["topic"],
        max_num_records=2,
        # expansion_service="localhost:56938"
    )
    kafka | beam.Map(lambda x: print(x))
I tried specifying the image override option as --sdk_harness_container_image_overrides='.*java.*,beam_java_sdk:latest' - where beam_java_sdk:latest is a Docker image I based on apache/beam_java11_sdk:2.27.0 and that pulls the credentials in its entrypoint.sh. But Beam does not appear to use it; I see
INFO org.apache.beam.runners.fnexecution.environment.DockerEnvironmentFactory - Still waiting for startup of environment apache/beam_java11_sdk:2.27.0 for worker id 1-1
in the logs, which is soon inevitably followed by
Caused by: org.apache.kafka.common.KafkaException: org.apache.kafka.common.KafkaException: org.apache.kafka.common.KafkaException: Failed to load SSL keystore /opt/keys/client.keystore.p12 of type PKCS12
In conclusion, my question is this: in Apache Beam, is it possible to make files available inside the Java SDK harness Docker container from the Python Beam SDK? If so, how might it be done?
Many thanks.
Currently, there is no straightforward way to achieve this. There is ongoing discussion, and there are tracking issues to provide support for this kind of expansion service customization (see here, here, BEAM-12538 and BEAM-12539). That is the short answer.
The long answer is yes, you can do it. You would have to copy & paste ExpansionService.java into your codebase and build your custom expansion service, where you specify the default environment (DOCKER) and the default environment config (your image) here. You then have to run this expansion service manually and specify its address using the expansion_service parameter of ReadFromKafka.
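On the Python side, the wiring is then just the expansion_service parameter, which also accepts a host:port address. Here is a minimal sketch, assuming your custom expansion service is already built and listening on localhost:8097; the port, job endpoint, and config values are placeholders:

import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka
from apache_beam.options.pipeline_options import PipelineOptions

# Placeholder options; reuse the pipeline_args from the question.
options = PipelineOptions([
    "--runner=PortableRunner",
    "--job_endpoint=localhost:8099",  # placeholder
    "--streaming",
])

with beam.Pipeline(options=options) as pipeline:
    _ = pipeline | ReadFromKafka(
        consumer_config={"bootstrap.servers": "bootstrap-server:17032"},  # plus your SSL settings
        topics=["topic"],
        # Point at the manually started custom expansion service instead of
        # letting Beam auto-start the bundled default one.
        expansion_service="localhost:8097",  # placeholder address
    )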

Bazel Kubernetes Object Error: no objects passed to apply (Google Container Registry)

I have a k8s_object rule to apply a deployment to my Google Kubernetes Cluster. Here is my setup:
load("#io_bazel_rules_docker//nodejs:image.bzl", "nodejs_image")
nodejs_image(
name = "image",
data = [":lib", "//:package.json"],
entry_point = ":index.ts",
)
load("#io_bazel_rules_k8s//k8s:object.bzl", "k8s_object")
k8s_object(
name = "k8s_deployment",
template = ":gateway.deployment.yaml",
kind = "deployment",
cluster = "gke_cents-ideas_europe-west3-b_cents-ideas",
images = {
"gcr.io/cents-ideas/gateway:latest": ":image"
},
)
But when I run bazel run //services/gateway:k8s_deployment.apply, I get the following error
INFO: Analyzed target //services/gateway:k8s_deployment.apply (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //services/gateway:k8s_deployment.apply up-to-date:
bazel-bin/services/gateway/k8s_deployment.apply
INFO: Elapsed time: 0.113s, Critical Path: 0.00s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
INFO: Build completed successfully, 1 total action
$ /snap/bin/kubectl --kubeconfig= --cluster=gke_cents-ideas_europe-west3-b_cents-ideas --context= --user= apply -f -
2020/02/12 14:52:44 Unable to publish images: unable to publish image gcr.io/cents-ideas/gateway:latest
error: no objects passed to apply
error: no objects passed to apply
It doesn't push the new image to the Google Container Registry.
Strangely, this worked a few days ago. But I didn't change anything.
Here is the full code if you need to take a closer look: https://github.com/flolude/cents-ideas/blob/069c773ade88dfa8aff492f024a1ade1f8ed282e/services/gateway/BUILD
Update
I don't know if this has something to do with this issue but when I run
gcloud auth configure-docker
I get some warnings:
WARNING: `docker-credential-gcloud` not in system PATH.
gcloud's Docker credential helper can be configured but it will not work until this is corrected.
WARNING: Your config file at [/home/flolu/.docker/config.json] contains these credential helper entries:
{
  "credHelpers": {
    "asia.gcr.io": "gcloud",
    "staging-k8s.gcr.io": "gcloud",
    "us.gcr.io": "gcloud",
    "gcr.io": "gcloud",
    "marketplace.gcr.io": "gcloud",
    "eu.gcr.io": "gcloud"
  }
}
Adding credentials for all GCR repositories.
WARNING: A long list of credential helpers may cause delays running 'docker build'. We recommend passing the registry name to configure only the registry you are using.
gcloud credential helpers already registered correctly.
I had google-cloud-sdk installed via snap. What I did to make it work was to remove google-cloud-sdk via
snap remove google-cloud-sdk
and then follow those instructions to install it via
sudo apt install google-cloud-sdk
Now it works fine.

Cloud Foundry on Google Compute Engine can't create container

I am very new to Cloud Foundry. I have set up Cloud Foundry on the Google Compute Engine platform following these guides: source1 and source2.
Terraform was used for creating the needed infrastructure. All seemed fine: I didn't get any errors while deploying Cloud Foundry itself, and the bosh cck command reports that there are no problems. But when I tried to deploy my hello-world app, I got the following error message in the terminal after the cf push command:
Creating container
Failed to create container
FAILED
Error restarting application: StagingError.
After checking the log files I found the following messages:
{
  "timestamp": "1474637304.026303530",
  "source": "garden-linux",
  "message": "garden-linux.loop-mounter.mount-file.mounting",
  "log_level": 2,
  "data": {
    "destPath": "/var/vcap/data/garden/aufs_graph/aufs/diff/08829a3252c1d60729e3b5482b0fb109652c9ab5beff9724e4e4ae756a0bc3ce",
    "error": "exit status 32",
    "filePath": "/var/vcap/data/garden/aufs_graph/backing_stores/08829a3252c1d60729e3b5482b0fb109652c9ab5beff9724e4e4ae756a0bc3ce",
    "output": "mount: wrong fs type, bad option, bad superblock on /dev/loop0,\n missing codepage or helper program, or other error\n In some cases useful info is found in syslog - try\n dmesg | tail or so\n\n",
    "session": "2.276"
  }
}
{
  "timestamp": "1474637304.026949406",
  "source": "garden-linux",
  "message": "garden-linux.pool.acquire.provide-rootfs-failed",
  "log_level": 2,
  "data": {
    "error": "mounting file: mounting file: exit status 32",
    "handle": "ec6e7469-0ef0-48a8-bcd0-82f4a2ea173f-5de2e641d9284aeea209ca447ffffb6d",
    "session": "9.545"
  }
}
{
  "timestamp": "1474637304.027062416",
  "source": "garden-linux",
  "message": "garden-linux.garden-server.create.failed",
  "log_level": 2,
  "data": {
    "error": "mounting file: mounting file: exit status 32",
    "request": {
      "Handle": "ec6e7469-0ef0-48a8-bcd0-82f4a2ea173f-5de2e641d9284aeea209ca447ffffb6d",
      "GraceTime": 0,
      "RootFSPath": "/var/vcap/packages/rootfs_cflinuxfs2/rootfs",
      "BindMounts": [
        {
          "src_path": "/var/vcap/data/executor_cache/6942123d3462ad9d21a45729c3cae183-1474475979582384649-1.d",
          "dst_path": "/tmp/lifecycle"
        }
      ],
      "Network": "",
      "Privileged": true,
      "Limits": {
        "bandwidth_limits": {},
        "cpu_limits": {
          "limit_in_shares": 512
        },
        "disk_limits": {
          "inode_hard": 200000,
          "byte_hard": 6442450944,
          "scope": 1
        },
        "memory_limits": {
          "limit_in_bytes": 1073741824
        }
      }
    },
    "session": "11.44187"
  }
}
{
  "timestamp": "1474637304.034646988",
  "source": "garden-linux",
  "message": "garden-linux.garden-server.destroy.failed",
  "log_level": 2,
  "data": {
    "error": "unknown handle: ec6e7469-0ef0-48a8-bcd0-82f4a2ea173f-5de2e641d9284aeea209ca447ffffb6d",
    "handle": "ec6e7469-0ef0-48a8-bcd0-82f4a2ea173f-5de2e641d9284aeea209ca447ffffb6d",
    "session": "11.44188"
  }
}
And meanwhile, in dmesg | tail I got the following:
[161023.238082] aufs test_add:283:garden-linux[7681]: uid/gid/perm /var/vcap/data/garden/aufs_graph/aufs/diff/d350dcd30f6d6f8b37eabe06a3b73bcea0a87f9aff4edf15f12792269fc9f97c 4294967294/4294967294/0755, 0/0/0755
[161023.238109] aufs au_opts_verify:1597:garden-linux[7681]: dirperm1 breaks the protection by the permission bits on the lower branch
[161023.413392] device wtj3qdqhig0t-0 entered promiscuous mode
I'm not sure whether these issues are connected, or whether they are issues at all, but I post them here to be sure I didn't miss anything.
I don't know how to fix this problem, or where to look for a solution: in the Terraform scripts or in the BOSH manifest files? We have a microservice architecture with three Node.js services and one Ruby service, so deployment is a very important question for us.
Here is my application manifest.yml file:
---
applications:
- name: hello_cloud
  memory: 128M
  buildpack: https://github.com/cloudfoundry/nodejs-buildpack
  instances: 1
  random-route: true
  command: "node server.js"
My goal is to be able to deploy applications using Cloud Foundry. If you have any additional questions or I wrote something unclear, feel free to write me.
This issue is related to a conflict between Garden and the 4.4 Linux kernel. To use the example Cloud Foundry manifest, use the following stemcell:
bosh upload stemcell https://bosh.io/d/stemcells/bosh-google-kvm-ubuntu-trusty-go_agent?v=3262.19
bosh deploy
You may need to delete your cf deployment before re-deploying due to quota issues.

Failed to start Neo4j service

I am using Neo4j Enterprise version 3.0.3 for Windows. Following the Operations Manual 3.0, I installed the Neo4j service with bin\neo4j install-service. But I can't start it with bin\neo4j start. It said
Invoke-Neo4j : Failed to start service 'Neo4j Graph Database - neo4j (neo4j)'.
And I can't start the Neo4j service from the Windows Services panel either. Has anyone encountered this case before?
I had the same problem: I was using Neo4j Community 3.1.2 for Windows and installed the service with the neo4j.bat file without any problems. Then I wanted to start the service with neo4j.bat and got the same error as you.
I found a solution that worked for me. My Neo4j files were in a folder whose path contained spaces (C:\Program Files\Neo4j). I moved the folder one level up (C:\Neo4j).
After that I could start the service without problems.
Maybe this solution helps.
I am running Neo4j on Windows, and in my case the crux of the issue was an incompatibility between the installed Java version (32-bit) and the OS version (64-bit). The biggest clue that led me to this was the following set of lines in the neo4j-service.2018-08-03 log file:
[2018-08-03 14:55:42] [info] [ 1432] Starting service...
[2018-08-03 14:55:42] [error] [ 1432] %1 is not a valid Win32 application.
[2018-08-03 14:55:42] [error] [ 1432] Failed creating java C:\JavaNew\bin\server\jvm.dll
[2018-08-03 14:55:42] [error] [ 1432] %1 is not a valid Win32 application.
[2018-08-03 14:55:42] [error] [ 1432] ServiceStart returned 1
There are a fair number of potential issues, and I have made an attempt to compile all of them here:
Windows services cannot deal with service names in folders that have spaces, especially if there is another folder with the same name as the one with spaces.
For example - C:\Program Files... will have issues if C:\Program\Something... exists.
To work around this, I put Neo4j in the root folder C:\Neo4j.
Get-Java.ps1 (under the ..\bin\Neo4j-Management folder) looks for 'JAVA_HOME' in the environment variables (usually found in *nix environments). If it does not find it there, it keeps looking in the registry, and finally throws up its hands!
To deal with this, I simply set the JAVA_HOME environment variable. For good measure, I uninstalled Java and re-installed it in the root folder under C:\JavaNew.
In retrospect, this step is probably not part of the problem, and hence can be ignored. But I am leaving it here for completeness' sake.
Invoke-Neo4j.ps1 (also under the ..\bin\Neo4j-Management folder) has code that determines whether the OS is 32-bit or 64-bit. Based on this, it determines whether it should run prunsrv-i386.exe (32-bit) or prunsrv-amd64.exe (64-bit).
This has to match the Java version installed.
Upon running java -XshowSettings:all and inspecting the sun.arch.data.model value (32, in my case), I realized that my OS is 64-bit while the installed Java is 32-bit.
To deal with this, I put in code (very klugey!). I am sure there are much better ways to get to the same outcome, but this is what I used:
switch ( (Get-WMIObject -Class Win32_Processor | Select-Object -First 1).AddressWidth ) {
    32 { $PrunSrvName = 'prunsrv-i386.exe' }   # 4 bytes = 32-bit
    # 64 { $PrunSrvName = 'prunsrv-amd64.exe' } # 8 bytes = 64-bit; COMMENTED as a workaround!!!
    64 { $PrunSrvName = 'prunsrv-i386.exe' }   # 8 bytes = 64-bit OS, but force the 32-bit binary to match the 32-bit JVM
}
Now uninstall the Neo4j service, reinstall it, and start the service.
Hope this works for you.
neo4j console
Posting for the latest versions (> 4.x):
I had the same issue using neo4j start; neo4j console is the command I was looking for. It runs the server in the foreground, and the web-based browser interface then serves as an interactive tutorial.
I had the same problem: after Neo4j had worked for a few weeks, it stopped working (without any change that I made).
I set JAVA_HOME, uninstalled and reinstalled, and now it works.
neo4j-enterprise-3.3.4
I was also having a weird issue: there was no error, but the Neo4j service did not start.
[xx@ss1 bin]$ ./neo4j console
[xx@ss1 bin]$
The problem was with the permissions on the Java directory, and I tried
chmod -R 777 jdk_directory
and the problem got solved.

Error running hadoop application in Eclipse on Windows

I'm trying to set up an Eclipse environment for developing and debugging Hadoop. I'm following Tom White's Hadoop: The Definitive Guide, 3rd ed. What I would like to do is get the MaxTemperature app working locally on my Windows machine within Eclipse before moving it to my Hortonworks sandbox VM. The comment on page 158 about using the local job runner seems to be what I want. I don't want to set up a full Hadoop implementation on Windows. I'm hoping that with the right config params I can convince it to run as a Java application inside Eclipse.
Windows: 7
Eclipse: Luna
Hadoop: 2.4.0
JDK: 7
When I set the Run configuration for MaxTemperatureDriver (Source code on page 157) to
inputfile outputdir foo (deliberate bogus 3rd parameter)
I get the usage message so I know I'm running my program with those params.
If I remove the bogus third param I get
Exception in thread "main" java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1255)
at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1251)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapreduce.Job.connect(Job.java:1250)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1279)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
at mark.MaxTemperatureDriver.run(MaxTemperatureDriver.java:52)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at mark.MaxTemperatureDriver.main(MaxTemperatureDriver.java:56)
I've tried inserting -conf, but it seems to be ignored. There is no error message if I specify a nonexistent path.
I've tried inserting -fs file:/// -jt local, but it makes no difference.
I've tried inserting -D mapreduce.framework.name=local.
I've tried specifying the input and output with the file: format.
Note: I'm not asking how to configure Eclipse to connect to a remote Hadoop installation. I want the application to run within Eclipse.
Is this possible? Any ideas?
Additional info:
I turned on debugging. I saw:
582 [main] DEBUG org.apache.hadoop.mapreduce.Cluster - Trying ClientProtocolProvider : org.apache.hadoop.mapred.YarnClientProtocolProvider
583 [main] DEBUG org.apache.hadoop.mapreduce.Cluster - Cannot pick org.apache.hadoop.mapred.YarnClientProtocolProvider as the ClientProtocolProvider - returned null protocol
I'm wondering not why YarnClientProtocolProvider failed, but why it didn't try LocalClientProtocolProvider.
New info:
It seems that this is an issue with Hadoop 2.4.0. I recreated my environment with Hadoop 1.2.1, followed the instructions in
http://gerrymcnicol.com/index.php/2014/01/02/hadoop-and-cassandra-part-4-writing-your-first-mapreduce-job/
added the Windows hack from
http://bigdatanerd.wordpress.com/2013/11/14/mapreduce-running-mapreduce-in-windows-file-system-debug-mapreduce-in-eclipse
and it all started working.
The following blog will be useful:
Running mapreduce in Windows filesystem