Cloud Foundry on Google Compute Engine can't create container - deployment

I am very new to Cloud Foundry. I set up Cloud Foundry on the Google Compute Engine platform by following these guides: source1 and source2.
Terraform was used to create the needed infrastructure. Everything seemed fine: I didn't get any errors while deploying Cloud Foundry itself, and the bosh cck command reports that there are no problems. But when I tried to deploy my hello-world app, I got the following error in the terminal after running cf push:
Creating container
Failed to create container
FAILED
Error restarting application: StagingError.
After checking the log files, I found the following messages:
{
  "timestamp": "1474637304.026303530",
  "source": "garden-linux",
  "message": "garden-linux.loop-mounter.mount-file.mounting",
  "log_level": 2,
  "data": {
    "destPath": "/var/vcap/data/garden/aufs_graph/aufs/diff/08829a3252c1d60729e3b5482b0fb109652c9ab5beff9724e4e4ae756a0bc3ce",
    "error": "exit status 32",
    "filePath": "/var/vcap/data/garden/aufs_graph/backing_stores/08829a3252c1d60729e3b5482b0fb109652c9ab5beff9724e4e4ae756a0bc3ce",
    "output": "mount: wrong fs type, bad option, bad superblock on /dev/loop0,\n missing codepage or helper program, or other error\n In some cases useful info is found in syslog - try\n dmesg | tail or so\n\n",
    "session": "2.276"
  }
}
{
  "timestamp": "1474637304.026949406",
  "source": "garden-linux",
  "message": "garden-linux.pool.acquire.provide-rootfs-failed",
  "log_level": 2,
  "data": {
    "error": "mounting file: mounting file: exit status 32",
    "handle": "ec6e7469-0ef0-48a8-bcd0-82f4a2ea173f-5de2e641d9284aeea209ca447ffffb6d",
    "session": "9.545"
  }
}
{
  "timestamp": "1474637304.027062416",
  "source": "garden-linux",
  "message": "garden-linux.garden-server.create.failed",
  "log_level": 2,
  "data": {
    "error": "mounting file: mounting file: exit status 32",
    "request": {
      "Handle": "ec6e7469-0ef0-48a8-bcd0-82f4a2ea173f-5de2e641d9284aeea209ca447ffffb6d",
      "GraceTime": 0,
      "RootFSPath": "/var/vcap/packages/rootfs_cflinuxfs2/rootfs",
      "BindMounts": [
        {
          "src_path": "/var/vcap/data/executor_cache/6942123d3462ad9d21a45729c3cae183-1474475979582384649-1.d",
          "dst_path": "/tmp/lifecycle"
        }
      ],
      "Network": "",
      "Privileged": true,
      "Limits": {
        "bandwidth_limits": {},
        "cpu_limits": {
          "limit_in_shares": 512
        },
        "disk_limits": {
          "inode_hard": 200000,
          "byte_hard": 6442450944,
          "scope": 1
        },
        "memory_limits": {
          "limit_in_bytes": 1073741824
        }
      }
    },
    "session": "11.44187"
  }
}
{
  "timestamp": "1474637304.034646988",
  "source": "garden-linux",
  "message": "garden-linux.garden-server.destroy.failed",
  "log_level": 2,
  "data": {
    "error": "unknown handle: ec6e7469-0ef0-48a8-bcd0-82f4a2ea173f-5de2e641d9284aeea209ca447ffffb6d",
    "handle": "ec6e7469-0ef0-48a8-bcd0-82f4a2ea173f-5de2e641d9284aeea209ca447ffffb6d",
    "session": "11.44188"
  }
}
And meanwhile, dmesg | tail shows:
[161023.238082] aufs test_add:283:garden-linux[7681]: uid/gid/perm /var/vcap/data/garden/aufs_graph/aufs/diff/d350dcd30f6d6f8b37eabe06a3b73bcea0a87f9aff4edf15f12792269fc9f97c 4294967294/4294967294/0755, 0/0/0755
[161023.238109] aufs au_opts_verify:1597:garden-linux[7681]: dirperm1 breaks the protection by the permission bits on the lower branch
[161023.413392] device wtj3qdqhig0t-0 entered promiscuous mode
I'm not sure whether these issues are connected, or whether they are issues at all, but I'm posting them here to be sure I didn't miss anything.
I don't know how to fix this problem, or where to look for a solution: in the Terraform scripts or in the BOSH manifest files. We have a microservice architecture with three Node.js services and one Ruby service, so deployment is a very important question for us.
Here is my application's manifest.yml file:
---
applications:
- name: hello_cloud
  memory: 128M
  buildpack: https://github.com/cloudfoundry/nodejs-buildpack
  instances: 1
  random-route: true
  command: "node server.js"
My goal is to be able to deploy applications using Cloud Foundry. If you have any additional questions, or if I wrote something unclear, feel free to ask.

This issue is related to a conflict between Garden and the 4.4 Linux kernel. To use the example Cloud Foundry manifest, use the following stemcell:
bosh upload stemcell https://bosh.io/d/stemcells/bosh-google-kvm-ubuntu-trusty-go_agent?v=3262.19
bosh deploy
You may need to delete your cf deployment before re-deploying due to quota issues.

Related

Beam SDK harness still trying to launch docker when I set environment_type to be `PROCESS`

According to the Beam harness documentation:
PROCESS: User code is executed by processes that are automatically started by the runner on each worker node.
args = [
    "--runner=portableRunner",
    "--streaming",
    "--sdk_worker_parallelism=2",
    "--environment_type=PROCESS",
    "--environment_config={\"command\": \"/opt/apache/beam/boot\"}",
]
consumer_config = {
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "AWS_MSK_IAM",
    "sasl.jaas.config": "software.amazon.msk.auth.iam.IAMLoginModule required;",
    "sasl.client.callback.handler.class": "software.amazon.msk.auth.iam.IAMClientCallbackHandler",
    "bootstrap.servers": bootstrap_servers,
}
with beam.Pipeline(options=PipelineOptions(args)) as p:
    data = p | "Reading messages from Kafka" >> ReadFromKafka(
        consumer_config=consumer_config,
        topics=topics,
        with_metadata=True,
    )
    data | 'Writing to stdout' >> beam.Map(logging.info)
But when I run the code (deployed to k8s using flinkk8soperator), it complains:
Caused by: java.io.IOException: Cannot run program "docker": error=2, No such file or directory
Wondering if I misunderstand anything? Thanks!
After some digging, I finally made the cross-language transforms work without using DinD or DooD. Here are the steps:
Ensure both the job manager and the task manager mount a shared volume for artifact staging. (This is required; otherwise the task manager will complain that it is unable to find the submitted jar.)
Ensure your Docker image can run both Java and Python Beam code; here's what I did:
# python SDK
COPY --from=apache/beam_python3.7_sdk:2.41.0 /opt/apache/beam/ /opt/apache/beam/
# java SDK
COPY --from=apache/beam_java8_sdk:2.41.0 /opt/apache/beam/ /opt/apache/beam_java/
In the job, you'll need to start the expansion service with extra args; for example, for KafkaIO:
from apache_beam.io.kafka import ReadFromKafka, default_io_expansion_service

ReadFromKafka(
    consumer_config=consumer_config,
    topics=[topic],
    with_metadata=False,
    expansion_service=default_io_expansion_service(
        append_args=[
            '--defaultEnvironmentType=PROCESS',
            "--defaultEnvironmentConfig={\"command\":\"/opt/apache/beam_java/boot\"}",
        ]
    ),
)
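Putting the pieces together, here is a minimal end-to-end sketch under the image layout above; the broker address, topic name, and logging sink are placeholder assumptions, not values from the original setup:
import logging

import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka, default_io_expansion_service
from apache_beam.options.pipeline_options import PipelineOptions

bootstrap_servers = "broker-1:9092"  # placeholder broker address
topic = "my-topic"                   # placeholder topic name

args = [
    "--runner=PortableRunner",
    "--streaming",
    "--environment_type=PROCESS",
    # The Python harness runs as a local process on each task manager.
    '--environment_config={"command": "/opt/apache/beam/boot"}',
]

with beam.Pipeline(options=PipelineOptions(args)) as p:
    messages = p | "Read from Kafka" >> ReadFromKafka(
        consumer_config={"bootstrap.servers": bootstrap_servers},
        topics=[topic],
        # The Java harness for this cross-language transform also runs as
        # a process, using the Java SDK copied into the image above.
        expansion_service=default_io_expansion_service(
            append_args=[
                "--defaultEnvironmentType=PROCESS",
                '--defaultEnvironmentConfig={"command":"/opt/apache/beam_java/boot"}',
            ]
        ),
    )
    messages | "Log" >> beam.Map(logging.info)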
Your portable execution relies on xLang support, which requires starting a Java SDK harness with Docker, and your cluster doesn't have Docker installed.

Using the Beam Python SDK and PortableRunner to connect to Kafka with SSL

I have the code below for connecting to Kafka using the Python Beam SDK. I know that the ReadFromKafka transform is run in a Java SDK harness (a Docker container), but I have not been able to figure out how to make ssl.truststore.location and ssl.keystore.location accessible inside the SDK harness's Docker environment. The job_endpoint argument is pointing to java -jar beam-runners-flink-1.10-job-server-2.27.0.jar --flink-master localhost:8081.
pipeline_args.extend([
    '--job_name=paul_test',
    '--runner=PortableRunner',
    '--sdk_location=container',
    '--job_endpoint=localhost:8099',
    '--streaming',
    "--environment_type=DOCKER",
    f"--sdk_harness_container_image_overrides=.*java.*,{my_beam_sdk_docker_image}:{my_beam_docker_tag}",
])
with beam.Pipeline(options=PipelineOptions(pipeline_args)) as pipeline:
    kafka = pipeline | ReadFromKafka(
        consumer_config={
            "bootstrap.servers": "bootstrap-server:17032",
            "security.protocol": "SSL",
            "ssl.truststore.location": "/opt/keys/client.truststore.jks",  # how do I make this available to the Java SDK harness?
            "ssl.truststore.password": "password",
            "ssl.keystore.type": "PKCS12",
            "ssl.keystore.location": "/opt/keys/client.keystore.p12",  # how do I make this available to the Java SDK harness?
            "ssl.keystore.password": "password",
            "group.id": "group",
            "basic.auth.credentials.source": "USER_INFO",
            "schema.registry.basic.auth.user.info": "user:password"
        },
        topics=["topic"],
        max_num_records=2,
        # expansion_service="localhost:56938"
    )
    kafka | beam.Map(lambda x: print(x))
I tried specifying the image override option as --sdk_harness_container_image_overrides='.*java.*,beam_java_sdk:latest', where beam_java_sdk:latest is a Docker image I based on apache/beam_java11_sdk:2.27.0 that pulls the credentials in its entrypoint.sh. But Beam does not appear to use it; I see
INFO org.apache.beam.runners.fnexecution.environment.DockerEnvironmentFactory - Still waiting for startup of environment apache/beam_java11_sdk:2.27.0 for worker id 1-1
in the logs, which is soon inevitably followed by
Caused by: org.apache.kafka.common.KafkaException: org.apache.kafka.common.KafkaException: org.apache.kafka.common.KafkaException: Failed to load SSL keystore /opt/keys/client.keystore.p12 of type PKCS12
In conclusion, my question is this: in Apache Beam, is it possible to make files available inside the Java SDK harness's Docker container from the Python Beam SDK? If so, how might it be done?
Many thanks.
Currently, there is no straightforward way to achieve this. There is ongoing discussion and tracking issues to provide support for this kind of expansion service customization (see here, here, BEAM-12538 and BEAM-12539). That is the short answer.
The long answer is yes, you can do that. You would have to copy and paste ExpansionService.java into your codebase and build your custom expansion service, specifying the default environment (DOCKER) and the default environment config (your image). You then have to run this expansion service manually and specify its address using the expansion_service parameter of ReadFromKafka.
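For illustration, once such a customized expansion service is built and running (here assumed to listen on localhost:8097, an arbitrary port), the Python side only needs to point ReadFromKafka at it. This sketch reuses pipeline and consumer_config from the question's code:
from apache_beam.io.kafka import ReadFromKafka

# Assumes a custom ExpansionService (default environment DOCKER, default
# environment config set to your image) was started manually and is
# listening on this hypothetical address.
kafka = pipeline | ReadFromKafka(
    consumer_config=consumer_config,
    topics=["topic"],
    expansion_service="localhost:8097",
)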

Openshift 3.11 cloud integration fails with lookup RequestError: send request failed\\ncaused by: Post https://ec2.eu-west-.amazonaws.com

Following the docs: https://docs.openshift.com/container-platform/3.11/install_config/configuring_aws.html#aws-cluster-labeling
I am configuring the cloud integration after the cluster build.
When the cluster services are restarted on the masters, the lookup of AWS instances fails:
22 16:32:10.112895 75995 server.go:261] failed to run Kubelet: could not init cloud provider "aws": error finding instance i-0c5cbd50923f9c6d2: "error listing AWS instances: \"Request.service: main process exited, code=exited, status=255/n/a Error: send request failed\\ncaused by: Post https://ec2.eu-west-.amazonaws.com/: dial tcp: lookup ec2.eu-west-.amazonaws.com: no such host\""
On closer inspection, it seems to be due to an incorrect hostname:
https://ec2.eu-west-.amazonaws.com/ VS https://ec2.eu-west-2.amazonaws.com/
So I double-checked the config, which seems to be correct:
# cat /etc/origin/cloudprovider/aws.conf
[Global]
Zone = eu-west-2
Some googling turned up what seems to be a similar issue:
https://github.com/kubernetes-sigs/kubespray/issues/4345
Is there a way to work around this? Moving off 3.11 isn't an option right now.
Thanks.
It looks as though it needs to be the zone, rather than the region:
# cat /etc/origin/cloudprovider/aws.conf
[Global]
Zone = eu-west-2a

cf push to IBM Cloud failed : Unable to install node: improper constraint: >=4.1.0 <5.5.0

I pushed an app to IBM Cloud after a minor change (just some data, no code or dependency changes), and staging failed:
cat: /VERSION: No such file or directory
-----> IBM SDK for Node.js Buildpack v4.0.1-20190930-1425
Based on Cloud Foundry Node.js Buildpack 1.6.53
-----> Installing binaries
engines.node (package.json): >=4.1.0 <5.5.0
engines.npm (package.json): unspecified (use default)
**WARNING** Dangerous semver range (>) in engines.node. See: http://docs.cloudfoundry.org/buildpacks/node/node-tips.html
**ERROR** Unable to install node: improper constraint: >=4.1.0 <5.5.0
Failed to compile droplet: Failed to run all supply scripts: exit status 14
Exit status 223
Cell 155a85d3-8d60-425c-8e39-3a1183bfec2a stopping instance 5aad9d60-87d7-4153-b1ac-c3847c9a7a83
Cell 155a85d3-8d60-425c-8e39-3a1183bfec2a destroying container for instance 5aad9d60-87d7-4153-b1ac-c3847c9a7a83
Cell 155a85d3-8d60-425c-8e39-3a1183bfec2a successfully destroyed container for instance 5aad9d60-87d7-4153-b1ac-c3847c9a7a83
FAILED
Error restarting application: BuildpackCompileFailed
An older version of the app is already running on IBM Cloud (from May 2019, I think), so I wonder what changed so that it no longer works.
In IBM Cloud Foundry, the Node.js version must be specified like this:
"engines": {
  "node": "12.x"
}
or
"engines": {
  "node": "12.10.x"
}
You can also try removing this section from your package.json file completely:
"engines": {
  "node": "6.15.1",
  "npm": "3.10.10"
},
Here's a quick reference.

IBM BLUEMIX BLOCKCHAIN SDK-DEMO failing

I have been working with the HFC SDK for Node.js and it used to work, but since last night I have been having some problems.
When running helloblockchain.js, it only works a few times; most of the time I get this error when it tries to enroll a new user:
E0113 11:56:05.983919636 5288 handshake.c:128] Security handshake failed: {"created":"#1484304965.983872199","description":"Handshake read failed","file":"../src/core/lib/security/transport/handshake.c","file_line":237,"referenced_errors":[{"created":"#1484304965.983866102","description":"FD shutdown","file":"../src/core/lib/iomgr/ev_epoll_linux.c","file_line":948}]}
Error: Failed to register and enroll JohnDoe: Error
Other times, the enrollment works and the failure appears while deploying the chaincode:
Enrolled and registered JohnDoe successfully
Deploying chaincode ...
E0113 12:14:27.341527043 5455 handshake.c:128] Security handshake failed: {"created":"#1484306067.341430168","description":"Handshake read failed","file":"../src/core/lib/security/transport/handshake.c","file_line":237,"referenced_errors":[{"created":"#1484306067.341421859","description":"FD shutdown","file":"../src/core/lib/iomgr/ev_epoll_linux.c","file_line":948}]}
Failed to deploy chaincode: request={"fcn":"init","args":["a","100","b","200"],"chaincodePath":"chaincode","certificatePath":"/certs/peer/cert.pem"}, error={"error":{"code":14,"metadata":{"_internal_repr":{}}},"msg":"Error"}
Or:
Enrolled and registered JohnDoe successfully
Deploying chaincode ...
E0113 12:15:27.448867739 5483 handshake.c:128] Security handshake failed: {"created":"#1484306127.448692244","description":"Handshake read failed","file":"../src/core/lib/security/transport/handshake.c","file_line":237,"referenced_errors":[{"created":"#1484306127.448668047","description":"FD shutdown","file":"../src/core/lib/iomgr/ev_epoll_linux.c","file_line":948}]}
events.js:160
throw er; // Unhandled 'error' event
^
Error
at ClientDuplexStream._emitStatusIfDone (/usr/lib/node_modules/hfc/node_modules/grpc/src/node/src/client.js:189:19)
at ClientDuplexStream._readsDone (/usr/lib/node_modules/hfc/node_modules/grpc/src/node/src/client.js:158:8)
at readCallback (/usr/lib/node_modules/hfc/node_modules/grpc/src/node/src/client.js:217:12)
E0113 12:15:27.563487641 5483 handshake.c:128] Security handshake failed: {"created":"#1484306127.563437122","description":"Handshake read failed","file":"../src/core/lib/security/transport/handshake.c","file_line":237,"referenced_errors":[{"created":"#1484306127.563429661","description":"FD shutdown","file":"../src/core/lib/iomgr/ev_epoll_linux.c","file_line":948}]}
This code worked yesterday, so I don't know what could be happening.
Does anybody know how I can fix it?
Thanks,
Javier.
These types of intermittent issues are usually related to GRPC. An initial suggestion is to ensure that you are using at least GRPC version 1.0.0.
If you are using a Mac, then the maximum number of open file descriptors should be checked (using ulimit -n). Sometimes this is initially set to a low value such as 256, so increasing the value could help.
There are a couple of GRPC issues with similar symptoms.
https://github.com/grpc/grpc/issues/8732
https://github.com/grpc/grpc/issues/8839
https://github.com/grpc/grpc/issues/8382
There is a grpc.initial_reconnect_backoff_ms property that is mentioned in some of these issues. Increasing the value past the 1000 ms level might help reduce the frequency of issues. Below are instructions for how the helloblockchain.js file can be modified to set this property to a higher value.
Open the helloblockchain.js file in the Hyperledger Fabric Client example and find the enrollAndRegisterUsers function.
Add "grpc.initial_reconnect_backoff_ms": 5000 to the setMemberServicesUrl call:
chain.setMemberServicesUrl(ca_url, {
    pem: cert, "grpc.initial_reconnect_backoff_ms": 5000
});
Add "grpc.initial_reconnect_backoff_ms": 5000 to the addPeer call:
chain.addPeer("grpcs://" + peers[i].discovery_host + ":" + peers[i].discovery_port, {
    pem: cert, "grpc.initial_reconnect_backoff_ms": 5000
});
Note that setting the grpc.initial_reconnect_backoff_ms property may reduce the frequency of issues, but it will not necessarily eliminate all issues.
The connection to the eventhub that is made in the helloblockchain.js file can also be a factor. There is an earlier version of the Hyperledger Fabric Client that does not utilize the eventhub. This earlier version could be tried to determine if this makes a difference. After running git clone https://github.com/IBM-Blockchain/SDK-Demo.git, run git checkout b7d5195 to use this prior level. Before running node helloblockchain.js from a Node.js command window, the git status command can be used to check the code level that is being used.