Spinnaker "Create Application" menu doesn't load - kubernetes

I'm quite new to the Spinnaker and have to ask for some help I guess. Does anyone knows why it could be that I can't create any Application and just keep seeing this screen.
My installation is through Halyard 1.5.0 and Ubuntu 14.04.
We don't use any cloud provider but I did configure Docker and Kubernetes part
And here is the error I see in the /var/log/spinnaker/echo/echo.log:
2017-11-16 13:52:29.901 INFO 13877 --- [ofit-/pipelines] c.n.s.echo.services.Front50Service : java.net.SocketTimeoutException: timeout
at okio.Okio$3.newTimeoutException(Okio.java:207)
at okio.AsyncTimeout.exit(AsyncTimeout.java:261)
at okio.AsyncTimeout$2.read(AsyncTimeout.java:215)
at okio.RealBufferedSource.indexOf(RealBufferedSource.java:306)
at okio.RealBufferedSource.indexOf(RealBufferedSource.java:300)
at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:196)
at com.squareup.okhttp.internal.http.Http1xStream.readResponse(Http1xStream.java:186)
at com.squareup.okhttp.internal.http.Http1xStream.readResponseHeaders(Http1xStream.java:127)
at com.squareup.okhttp.internal.http.HttpEngine.readNetworkResponse(HttpEngine.java:739)
at com.squareup.okhttp.internal.http.HttpEngine.access$200(HttpEngine.java:87)
at com.squareup.okhttp.internal.http.HttpEngine$NetworkInterceptorChain.proceed(HttpEngine.java:724)
at com.squareup.okhttp.internal.http.HttpEngine.readResponse(HttpEngine.java:578)
at com.squareup.okhttp.Call.getResponse(Call.java:287)
at com.squareup.okhttp.Call$ApplicationInterceptorChain.proceed(Call.java:243)
at com.squareup.okhttp.Call.getResponseWithInterceptorChain(Call.java:205)
at com.squareup.okhttp.Call.execute(Call.java:80)
at retrofit.client.OkClient.execute(OkClient.java:53)
at retrofit.RestAdapter$RestHandler.invokeRequest(RestAdapter.java:326)
at retrofit.RestAdapter$RestHandler.access$100(RestAdapter.java:220)
at retrofit.RestAdapter$RestHandler$1.invoke(RestAdapter.java:265)
at retrofit.RxSupport$2.run(RxSupport.java:55)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at retrofit.Platform$Base$2$1.run(Platform.java:94)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketException: Socket closed
at java.net.SocketInputStream.read(SocketInputStream.java:204)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at okio.Okio$2.read(Okio.java:139)
at okio.AsyncTimeout$2.read(AsyncTimeout.java:211)
... 24 more
2017-11-16 13:52:29.901 INFO 13877 --- [ofit-/pipelines] c.n.s.echo.services.Front50Service : ---- END ERROR

#grizzthedj
thanks again for recommendations. It doesn't seem, however, solved the issue. I wonder if it has something to do with my Docker Registry or Kubernetes.
Here is what I have in my .hal/config:
dockerRegistry:
enabled: true
accounts:
- name: <hidden-name>
requiredGroupMembership: []
address: https://docker-registry.<hidden-name>.net/
cacheIntervalSeconds: 30
repositories:
- hellopod
- demoapp
primaryAccount: <hidden-name>
kubernetes:
enabled: true
accounts:
- name: <username>
requiredGroupMembership: []
dockerRegistries:
- accountName: <hidden-name>
namespaces: []
context: sre-os1-dev
namespaces:
- spinnaker
omitNamespaces: []
kubeconfigFile: /home/<username>/.kube/config

I suspect you may be using redis as the persistent storage type(I ran into the same issue).
If this is the case, persistent storage using redis doesn't seem to be working properly out-of-the-box, and it is not supported. I would try using an S3 target, if available.
More info here on support for redis
To configure S3 using Halyard, use the following commands:
echo <SECRET_ACCESS_KEY> | hal config storage s3 edit --access-key-id <ACCESS_KEY_ID> --endpoint <S3_ENDPOINT> --bucket <BUCKET_NAME> --root-folder spinnaker --secret-access-key
hal config storage edit --type s3
hal deploy apply

#grizzthedj,
Here is what I've found inside front50.log (I wiped out ID's of course for security reasons)
You may be right.
2017-11-20 12:40:29.151 INFO 682 --- [0.0-8080-exec-1] com.amazonaws.latency : ServiceName=[Amazon S3], AWSErrorCode=[NoSuchKey], StatusCode=[404], ServiceEndpoint=[https://s3-us-west-2.amazonaws.com], Exception=[com.amazonaws.services.s3.model.AmazonS3Exception: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: ...; S3 Extended Request ID: ...), S3 Extended Request ID: ...], RequestType=[GetObjectRequest], AWSRequestID=[...], HttpClientPoolPendingCount=0, RetryCapacityConsumed=0, HttpClientPoolAvailableCount=1, RequestCount=1, Exception=1, HttpClientPoolLeasedCount=0, ClientExecuteTime=[39.634], HttpClientSendRequestTime=[0.072], HttpRequestTime=[39.213], RequestSigningTime=[0.067], CredentialsRequestTime=[0.001, 0.0], HttpClientReceiveResponseTime=[39.059],

I had a similar issue on kubernetes/aws, when I opened up the chrome dev console I was getting lots of 404 errors trying to connect to localhost:8084, I had to reconfigure the deck and gate baseurls. This is what I did using halyard:
hal config security ui edit --override-base-url http://<deck-loadbalancer-dns-entry>:9000
hal config security api edit --override-base-url http://<gate-loadbalancer-dns-entry>:8084
i did hal deploy apply and when it came back I noticed the developer console was throwing cors errors so I had to do the following.
echo "host: 0.0.0.0" | tee \ ~/.hal/default/service-settings/gate.yml \ ~/.hal/default/service-settings/deck.yml
You may note the lack of TLS and cors config, this is a test system so make better choices in production :)

Related

Kafka Upgrade from 5.4.1 to 6.1.2 with Confluent Playbooks

Currently I'm trying to upgrade our cluster with Confluent Playbooks. I setup a local environment with Vagrant where I'm simulate our production environment.
I must be honest I'm experimenting with a quite unusually thing, because our current setup was not installed with the Confluent Playbooks. I know the documentation says I should use the Upgrade Playbooks if I used the Install Playbooks with the same hosts.yml.
Anyway I'm trying to find out if it would be possible to use the official Confluent Upgrade Playbooks. It would eventually save lot's of time for us, if I don't have to create my own Upgrade Playbooks.
The Zookeeper Upgrade was successful and after upgrading Zookeeper now I'm trying to upgrade the Brokers.
During the upgrade the Broker goes to "failed" status. If I'm trying to restart the Broker service I get the following error:
[2021-07-12 08:59:52,144] ERROR [KafkaServer id=1] Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
java.lang.NullPointerException
at java.util.Objects.requireNonNull(Objects.java:203)
at io.confluent.rbacapi.resources.base.AuditLogConfigResource.<init>(AuditLogConfigResource.java:78)
at io.confluent.rbacapi.resources.v1.V1AuditLogConfigResource.<init>(V1AuditLogConfigResource.java:47)
at io.confluent.rbacapi.app.RbacApiApplication.setupResources(RbacApiApplication.java:339)
at io.confluent.rest.Application.configureHandler(Application.java:258)
at io.confluent.rest.ApplicationServer.doStart(ApplicationServer.java:227)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
at io.confluent.http.server.KafkaHttpServerImpl.doStart(KafkaHttpServerImpl.java:105)
at java.lang.Thread.run(Thread.java:748)
[2021-07-12 08:59:52,145] INFO [KafkaServer id=1] shutting down (kafka.server.KafkaServer)
[2021-07-12 08:59:52,141] WARN KafkaHttpServer transitioned from STARTING to FAILED.: null. (io.confluent.http.server.KafkaHttpServerImpl)
java.lang.NullPointerException
at java.util.Objects.requireNonNull(Objects.java:203)
at io.confluent.rbacapi.resources.base.AuditLogConfigResource.<init>(AuditLogConfigResource.java:78)
at io.confluent.rbacapi.resources.v1.V1AuditLogConfigResource.<init>(V1AuditLogConfigResource.java:47)
at io.confluent.rbacapi.app.RbacApiApplication.setupResources(RbacApiApplication.java:339)
at io.confluent.rest.Application.configureHandler(Application.java:258)
at io.confluent.rest.ApplicationServer.doStart(ApplicationServer.java:227)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
at io.confluent.http.server.KafkaHttpServerImpl.doStart(KafkaHttpServerImpl.java:105)
at java.lang.Thread.run(Thread.java:748)
Since I see a reference to rbac api in the error logs above, I installed confluent-security package, but it didn't helped.
My hosts.yml in my cp-ansible Ansible root directory file looks like this:
---
all:
vars:
ansible_ssh_common_args: '-o StrictHostKeyChecking=no'
ansible_connection: ssh
ansible_user: vagrant
ansible_become: true
ansible_port: 22
ansible_ssh_private_key_file: ~/.ssh/id_rsa
sasl_protocol: scram
ssl_enabled: true
ssl_provided_keystore_and_truststore: true
ssl_keystore_filepath: "/vagrant/ssl/kafka.server.keystore.jks"
ssl_keystore_key_password: pass_key
ssl_keystore_store_password: pass_key
ssl_truststore_filepath: "/vagrant/ssl/kafka.server.truststore.jks"
ssl_truststore_password: pass_trust
rbac_enabled: false
mds_super_user: mds
mds_super_user_password: password
kafka_broker_ldap_user: mds
kafka_broker_ldap_password: password
schema_registry_ldap_user: mds
schema_registry_ldap_password: password
ksql_ldap_user: mds
ksql_ldap_password: password
control_center_ldap_user: mds
control_center_ldap_password: password
create_mds_certs: false
token_services_public_pem_file: /vagrant/ssl/mds.publickey.pem
token_services_private_pem_file: /vagrant/ssl/mds.tokenkeypair.pem
kafka_broker_cluster_name: broker-cluster
schema_registry_cluster_name: schema-registry-cluster
kafka_broker_principal: User:mds
confluent_server_enabled: true
kafka_broker_schema_validation_enabled: true
kafka_broker_custom_listeners:
broker:
name: SSL
port: 9093
ssl_enabled: true
ssl_mutual_auth_enabled: true
sasl_protocol: none
zookeeper:
hosts:
bro1:
bro2:
bro3:
kafka_broker:
vars:
kafka_broker_custom_properties:
ldap.java.naming.factory.initial: com.sun.jndi.ldap.LdapCtxFactory
ldap.com.sun.jndi.ldap.read.timeout: 3000
ldap.java.naming.provider.url: ldap://192.168.0.198:10389
ldap.user.search.base: ou=TECH,ou=SPEZ-USER,o=VISA
ldap.group.search.base: ou=TECH,ou=SPEZ-USER,o=VISA
ldap.user.name.attribute: cn
ldap.user.memberof.attribute.pattern: cn=(.*),ou=TECH,ou=SPEZ-USER,o=TEST
ldap.group.name.attribute: cn
ldap.group.member.attribute.pattern: cn=(.*),ou=TECH,ou=SPEZ-USER,o=TEST
ldap.user.object.class: person
hosts:
bro1:
bro2:
bro3:
schema_registry:
hosts:
reg1:
control_center:
hosts:
cc1:
Do you have any hints where I should look for an issue?

Unknown host when using localstack with Spring Cloud AWS 2.3

"ResourceLoader" with AWS S3 works fine with these properties:
cloud:
aws:
s3:
endpoint: s3.amazonaws.com <-- custom endpoint added in spring cloud aws 2.3
credentials:
accessKey: XXXXXX
secretKey: XXXXXX
region:
static: us-east-1
stack:
auto: false
However, when I bring up a localstack container locally and try to use it with these properties(as per this release doc: https://spring.io/blog/2021/03/17/spring-cloud-aws-2-3-is-now-available):
cloud:
aws:
s3:
endpoint: http://localhost:4566
credentials:
accessKey: test
secretKey: test
region:
static: us-east-1
stack:
auto: false
I get this exception:
17:12:12.130 [reactor-http-nio-2] ERROR org.springframework.boot.autoconfigure.web.reactive.error.AbstractErrorWebExceptionHandler - [23efd000-1] 500 Server Error for HTTP GET "/getresource/test"
com.amazonaws.SdkClientException: Unable to execute HTTP request: mybucket.localhost
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1207) ~[aws-java-sdk-core-1.11.951.jar:?]
Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
Error has been observed at the following site(s):
|_ checkpoint ⇢ org.springframework.boot.actuate.metrics.web.reactive.server.MetricsWebFilter [DefaultWebFilterChain]
|_ checkpoint ⇢ HTTP GET "/getresource/test" [ExceptionHandlingWebHandler]
Stack trace:
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1207) ~[aws-java-sdk-core-1.11.951.jar:?]
Caused by: java.net.UnknownHostException: mybucket.localhost
at java.net.InetAddress$CachedAddresses.get(InetAddress.java:797) ~[?:?]
I can view my localstack bucket files otherwise fine in an S3 browser.
Here is the docker compose config for my localstack:
version: '3.1'
services:
localstack:
image: localstack/localstack:latest
environment:
- AWS_DEFAULT_REGION=us-east-1
- AWS_ACCESS_KEY_ID=test
- AWS_SECRET_ACCESS_KEY=test
- EDGE_PORT=4566
- SERVICES=lambda,s3
ports:
- '4566-4583:4566-4583'
volumes:
- "${TEMPDIR:-/tmp/localstack}:/tmp/localstack"
- "/var/run/docker.sock:/var/run/docker.sock"
Here is how I am reading a text file:
public class ResourceTransferManager {
#Autowired
ResourceLoader resourceLoader;
public void resourceLoadingMethod() throws IOException {
Resource resource = resourceLoader.getResource("s3://mybucket/index.txt");
InputStream inputStream = resource.getInputStream();
System.out.println("File content: " + IOUtils.toString(inputStream, StandardCharsets.UTF_8));
}}
By default S3 client creates a path having bucket name as subdomain and this causes the issue.
there are couple of ways to address this issue :
In case of localstack , do not use the endpoint http://localhost:4566 , use the standard formate endpoint i.e : http://s3.localhost.localstack.cloud:4566 , this will actualy reachout to DNS and will resolve into localhost IP internally and thus this will work fine. (only caviate it , it resolve using public DNS thus it either needs internet connection or you will need to make host entries prefixing bucketname for example in host file put 127.0.0.1 <yourexpectedbucketName>.s3.localhost.localstack.cloud).
OR if you are using docker then instead of making host entries , you can also create network alias for your localstack container like : <yourexpectedbucketName>.s3.localhost.localstack.cloud
another better way is extension to first approach , but here instead of using aliases for each of your bucket (which may not always be feasible) , you can spin up local dns container and use wildcard dns config there. refer simplified sample at : https://gist.github.com/paraspatidar/c29e4adb172a5afc92852a57e621323d
( original reference : https://gist.github.com/NAR8789/92da076d0c35b434107fb4f4f198fd12)

Openshift 3.11 cloud integration fails with lookup RequestError: send request failed\\ncaused by: Post https://ec2.eu-west-.amazonaws.com

Following the docs: https://docs.openshift.com/container-platform/3.11/install_config/configuring_aws.html#aws-cluster-labeling
Configuring the cloud integration after the cluster build.
When the cluster services are restarted on the masters it fails looking up AWS instances:
22 16:32:10.112895 75995 server.go:261] failed to run Kubelet: could not init cloud provider "aws": error finding instance i-0c5cbd50923f9c6d2: "error listing AWS instances: \"Request.service: main process exited, code=exited, status=255/n/a Error: send request failed\\ncaused by: Post https://ec2.eu-west-.amazonaws.com/: dial tcp: lookup ec2.eu-west-.amazonaws.com: no such host\""
On closer inspection seems to be due to incorrect hostname:
https://ec2.eu-west-.amazonaws.com/ VS https://ec2.eu-west-2.amazonaws.com/
So I double checked the config, which seems to be correct:
# cat /etc/origin/cloudprovider/aws.conf
[Global]
Zone = eu-west-2
Had a google and it seems to be a similar issue to this:
https://github.com/kubernetes-sigs/kubespray/issues/4345
Is there a way to work around this? Moving off 3.11 isn't an option right now.
Thanks.
Looks as though it needs to be zone, rather than the region.
# cat /etc/origin/cloudprovider/aws.conf
[Global]
Zone = eu-west-2a

Zalenium Readiness probe failed: HTTP probe failed with statuscode: 502

I am trying to deploy the zalenium helm chart in my newly deployed aks Kuberbetes (1.9.6) cluster in Azure. But I don't get it to work. The pod is giving the log below:
[bram#xforce zalenium]$ kubectl logs -f zalenium-zalenium-hub-6bbd86ff78-m25t2 Kubernetes service account found. Copying files for Dashboard... cp: cannot create regular file '/home/seluser/videos/index.html': Permission denied cp: cannot create directory '/home/seluser/videos/css': Permission denied cp: cannot create directory '/home/seluser/videos/js': Permission denied Starting Nginx reverse proxy... Starting Selenium Hub... ..........08:49:14.052 [main] INFO o.o.grid.selenium.GridLauncherV3 - Selenium build info: version: '3.12.0', revision: 'unknown' 08:49:14.120 [main] INFO o.o.grid.selenium.GridLauncherV3 - Launching Selenium Grid hub on port 4445 ...08:49:15.125 [main] INFO d.z.e.z.c.k.KubernetesContainerClient - Initialising Kubernetes support ..08:49:15.650 [main] WARN d.z.e.z.c.k.KubernetesContainerClient - Error initialising Kubernetes support. io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Pod] with name: [zalenium-zalenium-hub-6bbd86ff78-m25t2] in namespace: [default] failed. at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:62) at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:71) at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:206) at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:162) at de.zalando.ep.zalenium.container.kubernetes.KubernetesContainerClient.(KubernetesContainerClient.java:87) at de.zalando.ep.zalenium.container.ContainerFactory.createKubernetesContainerClient(ContainerFactory.java:35) at de.zalando.ep.zalenium.container.ContainerFactory.getContainerClient(ContainerFactory.java:22) at de.zalando.ep.zalenium.proxy.DockeredSeleniumStarter.(DockeredSeleniumStarter.java:59) at de.zalando.ep.zalenium.registry.ZaleniumRegistry.(ZaleniumRegistry.java:74) at de.zalando.ep.zalenium.registry.ZaleniumRegistry.(ZaleniumRegistry.java:62) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at java.lang.Class.newInstance(Class.java:442) at org.openqa.grid.web.Hub.(Hub.java:93) at org.openqa.grid.selenium.GridLauncherV3$2.launch(GridLauncherV3.java:291) at org.openqa.grid.selenium.GridLauncherV3.launch(GridLauncherV3.java:122) at org.openqa.grid.selenium.GridLauncherV3.main(GridLauncherV3.java:82) Caused by: javax.net.ssl.SSLPeerUnverifiedException: Hostname kubernetes.default.svc not verified: certificate: sha256/OyzkRILuc6LAX4YnMAIGrRKLmVnDgLRvCasxGXDhSoc= DN: CN=client, O=system:masters subjectAltNames: [10.0.0.1] at okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:308) at okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:268) at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:160) at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:256) at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:134) at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:113) at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:125) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:56) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:107) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:200) at okhttp3.RealCall.execute(RealCall.java:77) at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:379) at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:344) at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:313) at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:296) at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:770) at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:195) ... 16 common frames omitted 08:49:15.651 [main] INFO d.z.e.z.c.k.KubernetesContainerClient - About to clean up any left over selenium pods created by Zalenium Usage: [options] Options: --debug, -debug : enables LogLevel.FINE. Default: false --version, -version Displays the version and exits. Default: false -browserTimeout in seconds : number of seconds a browser session is allowed to hang while a WebDriver command is running (example: driver.get(url)). If the timeout is reached while a WebDriver command is still processing, the session will quit. Minimum value is 60. An unspecified, zero, or negative value means wait indefinitely. -matcher, -capabilityMatcher class name : a class implementing the CapabilityMatcher interface. Specifies the logic the hub will follow to define whether a request can be assigned to a node. For example, if you want to have the matching process use regular expressions instead of exact match when specifying browser version. ALL nodes of a grid ecosystem would then use the same capabilityMatcher, as defined here. -cleanUpCycle in ms : specifies how often the hub will poll running proxies for timed-out (i.e. hung) threads. Must also specify "timeout" option -custom : comma separated key=value pairs for custom grid extensions. NOT RECOMMENDED -- may be deprecated in a future revision. Example: -custom myParamA=Value1,myParamB=Value2 -host IP or hostname : usually determined automatically. Most commonly useful in exotic network configurations (e.g. network with VPN) Default: 0.0.0.0 -hubConfig filename: a JSON file (following grid2 format), which defines the hub properties -jettyThreads, -jettyMaxThreads : max number of threads for Jetty. An unspecified, zero, or negative value means the Jetty default value (200) will be used. -log filename : the filename to use for logging. If omitted, will log to STDOUT -maxSession max number of tests that can run at the same time on the node, irrespective of the browser used -newSessionWaitTimeout in ms : The time after which a new test waiting for a node to become available will time out. When that happens, the test will throw an exception before attempting to start a browser. An unspecified, zero, or negative value means wait indefinitely. Default: 600000 -port : the port number the server will use. Default: 4445 -prioritizer class name : a class implementing the Prioritizer interface. Specify a custom Prioritizer if you want to sort the order in which new session requests are processed when there is a queue. Default to null ( no priority = FIFO ) -registry class name : a class implementing the GridRegistry interface. Specifies the registry the hub will use. Default: de.zalando.ep.zalenium.registry.ZaleniumRegistry -role options are [hub], [node], or [standalone]. Default: hub -servlet, -servlets : list of extra servlets the grid (hub or node) will make available. Specify multiple on the command line: -servlet tld.company.ServletA -servlet tld.company.ServletB. The servlet must exist in the path: /grid/admin/ServletA /grid/admin/ServletB -timeout, -sessionTimeout in seconds : Specifies the timeout before the server automatically kills a session that hasn't had any activity in the last X seconds. The test slot will then be released for another test to use. This is typically used to take care of client crashes. For grid hub/node roles, cleanUpCycle must also be set. -throwOnCapabilityNotPresent true or false : If true, the hub will reject all test requests if no compatible proxy is currently registered. If set to false, the request will queue until a node supporting the capability is registered with the grid. -withoutServlet, -withoutServlets : list of default (hub or node) servlets to disable. Advanced use cases only. Not all default servlets can be disabled. Specify multiple on the command line: -withoutServlet tld.company.ServletA -withoutServlet tld.company.ServletB org.openqa.grid.common.exception.GridConfigurationException: Error creating class with de.zalando.ep.zalenium.registry.ZaleniumRegistry : null at org.openqa.grid.web.Hub.(Hub.java:97) at org.openqa.grid.selenium.GridLauncherV3$2.launch(GridLauncherV3.java:291) at org.openqa.grid.selenium.GridLauncherV3.launch(GridLauncherV3.java:122) at org.openqa.grid.selenium.GridLauncherV3.main(GridLauncherV3.java:82) Caused by: java.lang.ExceptionInInitializerError at de.zalando.ep.zalenium.registry.ZaleniumRegistry.(ZaleniumRegistry.java:74) at de.zalando.ep.zalenium.registry.ZaleniumRegistry.(ZaleniumRegistry.java:62) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at java.lang.Class.newInstance(Class.java:442) at org.openqa.grid.web.Hub.(Hub.java:93) ... 3 more Caused by: java.lang.NullPointerException at java.util.TreeMap.putAll(TreeMap.java:313) at io.fabric8.kubernetes.client.dsl.base.BaseOperation.withLabels(BaseOperation.java:411) at io.fabric8.kubernetes.client.dsl.base.BaseOperation.withLabels(BaseOperation.java:48) at de.zalando.ep.zalenium.container.kubernetes.KubernetesContainerClient.deleteSeleniumPods(KubernetesContainerClient.java:393) at de.zalando.ep.zalenium.container.kubernetes.KubernetesContainerClient.initialiseContainerEnvironment(KubernetesContainerClient.java:339) at de.zalando.ep.zalenium.container.ContainerFactory.createKubernetesContainerClient(ContainerFactory.java:38) at de.zalando.ep.zalenium.container.ContainerFactory.getContainerClient(ContainerFactory.java:22) at de.zalando.ep.zalenium.proxy.DockeredSeleniumStarter.(DockeredSeleniumStarter.java:59) ... 11 more ...........................................................................................................................................................................................GridLauncher failed to start after 1 minute, failing... % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 182 100 182 0 0 36103 0 --:--:-- --:--:-- --:--:-- 45500
A describe pod gives:
Warning Unhealthy 4m (x12 over 6m) kubelet, aks-agentpool-93668098-0 Readiness probe failed: HTTP probe failed with statuscode: 502
Zalenium Image Version(s):
dosel/zalenium:3
If using Kubernetes, specify your environment, and if relevant your manifests:
I use the templates as is from https://github.com/zalando/zalenium/tree/master/docs/k8s/helm
I guess it has to do something with rbac because of this part
"Error initialising Kubernetes support. io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Pod] with name: [zalenium-zalenium-hub-6bbd86ff78-m25t2] in namespace: [default] failed. at "
I created a clusterrole and clusterrolebinding for the service account zalenium-zalenium that is automatically created by the Helm chart.
kubectl create clusterrole zalenium --verb=get,list,watch,update,delete,create,patch --resource=pods,deployments,secrets
kubectl create clusterrolebinding zalenium --clusterrole=zalnium --serviceaccount=zalenium-zalenium --namespace=default
Issue had to do with Azure's AKS and Kubernetes. It has been fixed.
See github issue 399
If the typo, mentioned by Ignacio, is not the case why don't you just do
kubectl create clusterrolebinding zalenium --clusterrole=cluster-admin serviceaccount=zalenium-zalenium
Note: no need to specify a namespace if creating cluster role binding, as it is cluster wide (role binding goes with namespaces).

Cloud foundry on Google Compute engine can't create container

I am very new with Cloud foundry. I have added cloud foundry for google compute engine platform by this guides source1 and source2.
Terraform was used for creating needed infrastructure. It seemed all was fine I didn't get any errors during deployment cloud foundry itself and bosh cck command returns that there are no any problems. But when I tried to deploy my hello world app, I got next error message in terminal after cf push command:
Creating container
Failed to create container
FAILED
Error restarting application: StagingError.
After checking log files I found next message:
{
"timestamp":"1474637304.026303530",
"source":"garden-linux",
"message":"garden-linux.loop-mounter.mount-file.mounting",
"log_level":2,
"data":{
"destPath":"/var/vcap/data/garden/aufs_graph/aufs/diff/08829a3252c1d60729e3b5482b0fb109652c9ab5beff9724e4e4ae756a0bc3ce",
"error":"exit status 32",
"filePath":"/var/vcap/data/garden/aufs_graph/backing_stores/08829a3252c1d60729e3b5482b0fb109652c9ab5beff9724e4e4ae756a0bc3ce",
"output":"mount: wrong fs type, bad option, bad superblock on /dev/loop0,\n missing codepage or helper program, or other error\n In some cases useful info is found in syslog - try\n dmesg | tail or so\n\n",
"session":"2.276"
}
}{
"timestamp":"1474637304.026949406",
"source":"garden-linux",
"message":"garden-linux.pool.acquire.provide-rootfs-failed",
"log_level":2,
"data":{
"error":"mounting file: mounting file: exit status 32",
"handle":"ec6e7469-0ef0-48a8-bcd0-82f4a2ea173f-5de2e641d9284aeea209ca447ffffb6d",
"session":"9.545"
}
}
{
"timestamp":"1474637304.027062416",
"source":"garden-linux",
"message":"garden-linux.garden-server.create.failed",
"log_level":2,
"data":{
"error":"mounting file: mounting file: exit status 32",
"request":{
"Handle":"ec6e7469-0ef0-48a8-bcd0-82f4a2ea173f-5de2e641d9284aeea209ca447ffffb6d",
"GraceTime":0,
"RootFSPath":"/var/vcap/packages/rootfs_cflinuxfs2/rootfs",
"BindMounts":[
{
"src_path":"/var/vcap/data/executor_cache/6942123d3462ad9d21a45729c3cae183-1474475979582384649-1.d",
"dst_path":"/tmp/lifecycle"
}
],
"Network":"",
"Privileged":true,
"Limits":{
"bandwidth_limits":{
},
"cpu_limits":{
"limit_in_shares":512
},
"disk_limits":{
"inode_hard":200000,
"byte_hard":6442450944,
"scope":1
},
"memory_limits":{
"limit_in_bytes":1073741824
}
}
},
"session":"11.44187"
}
}{
"timestamp":"1474637304.034646988",
"source":"garden-linux",
"message":"garden-linux.garden-server.destroy.failed",
"log_level":2,
"data":{
"error":"unknown handle: ec6e7469-0ef0-48a8-bcd0-82f4a2ea173f-5de2e641d9284aeea209ca447ffffb6d",
"handle":"ec6e7469-0ef0-48a8-bcd0-82f4a2ea173f-5de2e641d9284aeea209ca447ffffb6d",
"session":"11.44188"
}
}
And meantime in dmesg | tail I got next:
[161023.238082] aufs test_add:283:garden-linux[7681]: uid/gid/perm
/var/vcap/data/garden/aufs_graph/aufs/diff/d350dcd30f6d6f8b37eabe06a3b73bcea0a87f9aff4edf15f12792269fc9f97c
4294967294/4294967294/0755, 0/0/0755 [161023.238109] aufs
au_opts_verify:1597:garden-linux[7681]: dirperm1 breaks the protection
by the permission bits on the lower branch [161023.413392] device
wtj3qdqhig0t-0 entered promiscuous mode
I'm not sure that this issues connected or that it is issue at all, but I post them here in order to be sure, that I didn't miss anything.
I don't know how to fix this problem and where, should I look solution for terraform scripts or for bosh manifest files. We have micro service architecture with three nodes on node js and one on ruby, so deployment is very important question for us.
here is my application manifest.yml file:
---
applications:
- name: hello_cloud
memory: 128M
buildpack: https://github.com/cloudfoundry/nodejs-buildpack
instances: 1
random-route: true
command: "node server.js"
My goal is to be able deploy applications using cloud foundry. If you have any additional questions or I wrote something unclear feel free to write me.
This issue is related a conflict between garden and the 4.4 Linux kernel. To use the example cloudfoundry manfest, use the follow stemcell:
bosh upload stemcell https://bosh.io/d/stemcells/bosh-google-kvm-ubuntu-trusty-go_agent?v=3262.19
bosh deploy
You may need to delete your cf deployment before re-deploying due to quota issues.