Dataproc returning StatusRuntimeException: cluster not found - Scala

I am sending a request to an API that uses Google Cloud Dataproc for reindexing, and it responds with:
io.grpc.StatusRuntimeException: NOT_FOUND: Not found: Cluster projects/go-dev-central/regions/us-central1/clusters/cluster-156c
I'm pretty new to Google Cloud and don't know where I should be looking. Could it be a region/zone issue? The full stack trace:
Suppressed: com.google.api.gax.rpc.AsyncTaskException: Asynchronous task failed
at com.google.api.gax.rpc.ApiExceptions.callAndTranslateApiException(ApiExceptions.java:57)
at com.google.api.gax.rpc.UnaryCallable.call(UnaryCallable.java:112)
at com.google.cloud.dataproc.v1.JobControllerClient.submitJob(JobControllerClient.java:210)
at com.google.cloud.dataproc.v1.JobControllerClient.submitJob(JobControllerClient.java:183)
at com.carecloud.edison.commons.providers.gcp.GoogleDataProcProvider.$anonfun$submitDataProcJob$1(GoogleDataProcProvider.scala:59)
at com.carecloud.edison.commons.providers.gcp.GoogleDataProcProvider.withJobControllerClientSync(GoogleDataProcProvider.scala:39)
at com.carecloud.edison.commons.providers.gcp.GoogleDataProcProvider.$anonfun$withJobControllerClient$1(GoogleDataProcProvider.scala:27)
at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:658)
at scala.util.Success.$anonfun$map$1(Try.scala:255)
at scala.util.Success.map(Try.scala:213)
Caused by: io.grpc.StatusRuntimeException: NOT_FOUND: Not found: Cluster projects/go-dev-central/regions/us-central1/clusters/cluster-156c
at io.grpc.Status.asRuntimeException(Status.java:533)
at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:490)
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:700)
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:399)
2020-08-31 04:33:49,568 [ERROR] c.ReindexController - General Service Error: io.grpc.StatusRuntimeException: NOT_FOUND: Not found: Cluster projects/go-dev-central/regions/us-central1/clusters/cluster-156c

The most likely explanation is that you created the cluster in the separate "global" multiregion (even if you placed it in a GCE zone within us-central1), while your code is configured to use the "us-central1" regional Dataproc endpoint.
See https://cloud.google.com/dataproc/docs/concepts/regional-endpoints for more details on the difference. At a high level, "global" is an independent Dataproc universe, just like each of the regional universes such as "us-central1" and "europe-west1"; they are all isolated from each other.
You can see which one your cluster lives in by looking at the "Clusters" list page in the Cloud Console, where there should be a column indicating the Dataproc region being used.
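You can also check from the command line; both of these are standard gcloud invocations:
gcloud dataproc clusters list --region=global
gcloud dataproc clusters list --region=us-central1
And a minimal Java sketch of pointing the job client at the regional endpoint (the endpoint string follows the documented <region>-dataproc.googleapis.com:443 pattern; the same calls apply from Scala):
import com.google.cloud.dataproc.v1.JobControllerClient;
import com.google.cloud.dataproc.v1.JobControllerSettings;

// The endpoint must match the Dataproc region the cluster was created in;
// the client's default endpoint targets the "global" region.
JobControllerSettings settings = JobControllerSettings.newBuilder()
        .setEndpoint("us-central1-dataproc.googleapis.com:443")
        .build();
JobControllerClient client = JobControllerClient.create(settings);
// submitJob(projectId, region, job) must then also be passed "us-central1" as the region.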

Related

Jenkins - Error ‘Jenkins’ doesn’t have label ‘jenkins-eks-pod’

After running a pipeline job in Jenkins that runs in my k8s cluster,
I am getting this error:
‘Jenkins’ doesn’t have label ‘jenkins-eks-pod’.
What am I missing in my configuration?
Pod logs in k8s:
2023-02-20 14:37:03.379+0000 [id=1646] WARNING o.c.j.p.k.KubernetesLauncher#launch: Error in provisioning; agent=KubernetesSlave name: jenkins-eks-agent-h4z6t, template=PodTemplate{id='05395ad55cc56972ee3e4c69c2731189bc03a75c0b51e637dc7f868fa85d07e8', name='jenkins-eks-agent', namespace='default', slaveConnectTimeout=100, label='jenkins-non-prod-eks-global-slave', serviceAccount='default', nodeUsageMode=NORMAL, podRetention='Never', containers=[ContainerTemplate{name='jnlp', image='805787217936.dkr.ecr.us-west-2.amazonaws.com/aura-jenkins-slave:ecs-global-node_master_57', alwaysPullImage=true, workingDir='/home/jenkins/agent', command='', args='', ttyEnabled=true, resourceRequestCpu='512m', resourceRequestMemory='512Mi', resourceRequestEphemeralStorage='', resourceLimitCpu='512m', resourceLimitMemory='512Mi', resourceLimitEphemeralStorage='', envVars=[KeyValueEnvVar [getValue()=http://jenkins-non-prod.default.svc.cluster.local:8080/, getKey()=JENKINS_URL]], livenessProbe=ContainerLivenessProbe{execArgs='', timeoutSeconds=0, initialDelaySeconds=0, failureThreshold=0, periodSeconds=0, successThreshold=0}}]}
java.lang.IllegalStateException: Containers are terminated with exit codes: {jnlp=0}
at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.checkTerminatedContainers(KubernetesLauncher.java:275)
at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:225)
at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:298)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:48)
at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:82)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
2023-02-20 14:37:03.380+0000 [id=1646] INFO o.c.j.p.k.KubernetesSlave#_terminate: Terminating Kubernetes instance for agent jenkins-eks-agent-h4z6t
2023-02-20 14:37:03.380+0000 [id=1646] SEVERE o.c.j.p.k.KubernetesSlave#_terminate: Computer for agent is null: jenkins-eks-agent-h4z6t
This error might be due to the label ‘jenkins-eks-pod’ not having been created on the Jenkins server.
To create a label on the Jenkins server:
go to Manage Jenkins > Manage Nodes and Clouds > labels, and then enter
the label name.
After creating this label, try running the job again and check whether it works.
Refer to this blog by Bibin Wilson.
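For reference, the label a job requests must match a label configured on a node or on the Kubernetes cloud's pod template; note that the pod template in your logs carries label='jenkins-non-prod-eks-global-slave', not 'jenkins-eks-pod'. A minimal declarative Jenkinsfile sketch using the label from the error (assuming that label is actually defined in your configuration):
pipeline {
    // Must match a label defined on a node or pod template, otherwise the
    // build waits and reports "'Jenkins' doesn't have label ...".
    agent { label 'jenkins-eks-pod' }
    stages {
        stage('build') {
            steps {
                echo 'running on an agent with label jenkins-eks-pod'
            }
        }
    }
}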

localstack v0.11.5 and kcl v1.13.3

I am using kcl v1.13.3 with the latest localstack v0.11.5
The kcl client now uses edge service port 4566.
Are the kcl and localstack versions compatible?
I keep getting the following error:
com.amazonaws.SdkClientException: Unable to execute HTTP request: The target server failed to respond
Caused by: org.apache.http.NoHttpResponseException: The target server failed to respond
com.amazonaws.SdkClientException: Unable to execute HTTP request: The target server failed to respond
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException (AmazonHttpClient.java:1163)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper (AmazonHttpClient.java:1109)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute (AmazonHttpClient.java:758)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer (AmazonHttpClient.java:732)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute (AmazonHttpClient.java:714)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500 (AmazonHttpClient.java:674)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute (AmazonHttpClient.java:656)
at com.amazonaws.http.AmazonHttpClient.execute (AmazonHttpClient.java:520)
at com.amazonaws.services.kinesis.AmazonKinesisClient.doInvoke (AmazonKinesisClient.java:2782)
Caused by: org.apache.http.NoHttpResponseException: The target server failed to respond
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead (DefaultHttpResponseParser.java:141)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead (DefaultHttpResponseParser.java:56)
at org.apache.http.impl.io.AbstractMessageParser.parse (AbstractMessageParser.java:259)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader (DefaultBHttpClientConnection.java:163)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader (CPoolProxy.java:165)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse (HttpRequestExecutor.java:273)
at com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse (SdkHttpRequestExecutor.java:82)
at org.apache.http.protocol.HttpRequestExecutor.execute (HttpRequestExecutor.java:125)
at org.apache.http.impl.execchain.MainClientExec.execute (MainClientExec.java:272)
at org.apache.http.impl.execchain.ProtocolExec.execute (ProtocolExec.java:185)
at org.apache.http.impl.client.InternalHttpClient.doExecute (InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute (CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute (CloseableHttpClient.java:56)
at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute (SdkHttpClient.java:72)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest (AmazonHttpClient.java:1285)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper (AmazonHttpClient.java:1101)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute (AmazonHttpClient.java:758)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer (AmazonHttpClient.java:732)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute (AmazonHttpClient.java:714)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500 (AmazonHttpClient.java:674)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute (AmazonHttpClient.java:656)
at com.amazonaws.http.AmazonHttpClient.execute (AmazonHttpClient.java:520)
Did you already confirm that the localstack endpoint is functional on that port? For example:
aws kinesis list-streams --endpoint-url http://localhost:4566
(If you don't have the AWS CLI installed and don't want to install it, there's always the option of using Docker, as sketched below.)
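For instance, a one-liner with the amazon/aws-cli image (the dummy credentials are assumptions localstack does not validate, and --network host assumes a Linux Docker host):
docker run --rm --network host \
    -e AWS_ACCESS_KEY_ID=test -e AWS_SECRET_ACCESS_KEY=test -e AWS_DEFAULT_REGION=eu-west-1 \
    amazon/aws-cli kinesis list-streams --endpoint-url http://localhost:4566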
Moreover, it might be helpful for you to share how you are bootstrapping the AWS client. It should be something along the lines of:
AwsClientBuilder.EndpointConfiguration endpointConfig =
        new AwsClientBuilder.EndpointConfiguration("http://localhost:4566", Regions.EU_WEST_1.getName());
return AmazonDynamoDBClientBuilder.standard()
        .withEndpointConfiguration(endpointConfig)
        .build();
Note that if you are running your kcl app inside another docker container, then you might want to change from "http://localhost:4566" to "http://localstack:4566".
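If the endpoint checks out, also make sure the KCL worker itself is pointed at localstack. A minimal sketch for KCL 1.x (builder method names as found in amazon-kinesis-client 1.13.x; the application, stream, and worker names are placeholders):
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisClientLibConfiguration;

KinesisClientLibConfiguration kclConfig =
        new KinesisClientLibConfiguration("my-kcl-app", "my-stream",
                new DefaultAWSCredentialsProviderChain(), "worker-1")
            .withRegionName(Regions.EU_WEST_1.getName())
            // Point both Kinesis and the DynamoDB lease/checkpoint table at the edge port.
            .withKinesisEndpoint("http://localhost:4566")
            .withDynamoDBEndpoint("http://localhost:4566");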

Openshift 3.11 cloud integration fails with lookup RequestError: send request failed\\ncaused by: Post https://ec2.eu-west-.amazonaws.com

Following the docs: https://docs.openshift.com/container-platform/3.11/install_config/configuring_aws.html#aws-cluster-labeling
I am configuring the cloud integration after the cluster build.
When the cluster services are restarted on the masters, they fail while looking up AWS instances:
22 16:32:10.112895 75995 server.go:261] failed to run Kubelet: could not init cloud provider "aws": error finding instance i-0c5cbd50923f9c6d2: "error listing AWS instances: \"Request.service: main process exited, code=exited, status=255/n/a Error: send request failed\\ncaused by: Post https://ec2.eu-west-.amazonaws.com/: dial tcp: lookup ec2.eu-west-.amazonaws.com: no such host\""
On closer inspection, it seems to be due to an incorrect hostname:
https://ec2.eu-west-.amazonaws.com/ VS https://ec2.eu-west-2.amazonaws.com/
So I double-checked the config, which seems to be correct:
# cat /etc/origin/cloudprovider/aws.conf
[Global]
Zone = eu-west-2
A quick search suggests it's a similar issue to this:
https://github.com/kubernetes-sigs/kubespray/issues/4345
Is there a way to work around this? Moving off 3.11 isn't an option right now.
Thanks.
It looks as though it needs to be the zone, rather than the region: the in-tree AWS cloud provider derives the region by trimming the trailing zone letter, so a Zone of eu-west-2 yields the malformed region eu-west- and the bogus hostname ec2.eu-west-.amazonaws.com.
# cat /etc/origin/cloudprovider/aws.conf
[Global]
Zone = eu-west-2a

Google Dataflow Pipeline creation fails with 400: Bad Request / invalid grant

I have been building and creating templates for Google Dataflow for over a year now. I never had a problem creating templates and uploading them to GCS with the options.setTemplateLocation(templatePath); call. Since today, when creating the pipeline with Pipeline.create(options); and running the Java program in Eclipse, I get the following exception:
Exception in thread "main" java.lang.RuntimeException: Failed to construct instance from factory method DataflowRunner#fromOptions(interface org.apache.beam.sdk.options.PipelineOptions)
at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:233)
at org.apache.beam.sdk.util.InstanceBuilder.build(InstanceBuilder.java:162)
at org.apache.beam.sdk.PipelineRunner.fromOptions(PipelineRunner.java:52)
at org.apache.beam.sdk.Pipeline.create(Pipeline.java:142)
at mypackage.PipelineCreation.getTemplatePipeline(PipelineCreation.java:34)
at myotherpackage.Main.main(Main.java:51)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:222)
... 5 more
Caused by: java.lang.RuntimeException: Unable to verify that GCS bucket gs://my-projects-staging-bucket exists.
at org.apache.beam.sdk.extensions.gcp.storage.GcsPathValidator.verifyPathIsAccessible(GcsPathValidator.java:92)
at org.apache.beam.sdk.extensions.gcp.storage.GcsPathValidator.validateOutputFilePrefixSupported(GcsPathValidator.java:61)
at org.apache.beam.runners.dataflow.DataflowRunner.fromOptions(DataflowRunner.java:228)
... 10 more
Caused by: com.google.api.client.http.HttpResponseException: 400 Bad Request
{
"error" : "invalid_grant",
"error_description" : "Bad Request"
}
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1070)
at com.google.auth.oauth2.UserCredentials.refreshAccessToken(UserCredentials.java:207)
at com.google.auth.oauth2.OAuth2Credentials.refresh(OAuth2Credentials.java:149)
at com.google.auth.oauth2.OAuth2Credentials.getRequestMetadata(OAuth2Credentials.java:135)
at com.google.auth.http.HttpCredentialsAdapter.initialize(HttpCredentialsAdapter.java:96)
at com.google.cloud.hadoop.util.ChainingHttpRequestInitializer.initialize(ChainingHttpRequestInitializer.java:52)
at com.google.api.client.http.HttpRequestFactory.buildRequest(HttpRequestFactory.java:93)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.buildHttpRequest(AbstractGoogleClientRequest.java:300)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
at com.google.cloud.hadoop.util.ResilientOperation$AbstractGoogleClientRequestExecutor.call(ResilientOperation.java:166)
at com.google.cloud.hadoop.util.ResilientOperation.retry(ResilientOperation.java:66)
at org.apache.beam.sdk.util.GcsUtil.getBucket(GcsUtil.java:505)
at org.apache.beam.sdk.util.GcsUtil.bucketAccessible(GcsUtil.java:492)
at org.apache.beam.sdk.util.GcsUtil.bucketAccessible(GcsUtil.java:457)
at org.apache.beam.sdk.extensions.gcp.storage.GcsPathValidator.verifyPathIsAccessible(GcsPathValidator.java:88)
... 12 more
I was logged into gcloud with another account today, but I logged in again with gcloud auth login using the account associated with the project as "Owner".
I also restarted Eclipse, but the same error keeps occurring. When trying to run the pipeline locally, I get a different error, but also with the "invalid_grant" / "Bad Request" content. Restarting the laptop had no effect either.
My pom defines google-cloud-dataflow-java-sdk-all version 2.2.0; upgrading to 2.5.0 had no effect.
I am able to copy data to the bucket with gsutil from the command line, but when running the Java program from the command line with mvn compile exec:java -Dexec.mainClass=mypackage.Main I still get the same errors.
My function to create a template pipeline looks like the following:
public static Pipeline getTemplatePipeline(String jobName, String templatePath) {
    DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
    options.setProject("my-project-id");
    options.setRunner(DataflowRunner.class);
    options.setStagingLocation("gs://my-projects-staging-bucket/binaries");
    options.setTempLocation("gs://my-projects-staging-bucket/binaries/tmp");
    options.setGcpTempLocation("gs://my-projects-staging-bucket/binaries/tmp");
    options.setZone("europe-west3-a");
    options.setWorkerMachineType("n1-standard-2");
    options.setJobName(jobName);
    options.setMaxNumWorkers(2);
    options.setDiskSizeGb(40);
    options.setTemplateLocation(templatePath);
    return Pipeline.create(options);
}
Any help is highly appreciated.
You don't have to use a service account; you can still use gcloud. Use the following command and log in with your account:
gcloud auth application-default login
I found the solution in the quickstart docs.
It seems the gcloud user auth is no longer used and you have to use a service account. So, as in the docs, I created a service account with the "project/owner" role and downloaded its JSON key file to $path.
Then on my Mac I ran export GOOGLE_APPLICATION_CREDENTIALS="$path" and, within the same session, used the command mentioned in the question to compile and execute the Java program.
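For reference, a sketch of that service-account flow with gcloud (the account name dataflow-runner and the key path are hypothetical; substitute your own project ID, and note that roles/owner is broader than most setups need):
gcloud iam service-accounts create dataflow-runner
gcloud projects add-iam-policy-binding my-project-id \
    --member="serviceAccount:dataflow-runner@my-project-id.iam.gserviceaccount.com" \
    --role="roles/owner"
gcloud iam service-accounts keys create key.json \
    --iam-account=dataflow-runner@my-project-id.iam.gserviceaccount.com
export GOOGLE_APPLICATION_CREDENTIALS="$PWD/key.json"
mvn compile exec:java -Dexec.mainClass=mypackage.Main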

Zalenium Readiness probe failed: HTTP probe failed with statuscode: 502

I am trying to deploy the zalenium helm chart in my newly deployed AKS Kubernetes (1.9.6) cluster in Azure, but I can't get it to work. The pod log is below:
[bram@xforce zalenium]$ kubectl logs -f zalenium-zalenium-hub-6bbd86ff78-m25t2
Kubernetes service account found.
Copying files for Dashboard...
cp: cannot create regular file '/home/seluser/videos/index.html': Permission denied
cp: cannot create directory '/home/seluser/videos/css': Permission denied
cp: cannot create directory '/home/seluser/videos/js': Permission denied
Starting Nginx reverse proxy...
Starting Selenium Hub...
..........08:49:14.052 [main] INFO o.o.grid.selenium.GridLauncherV3 - Selenium build info: version: '3.12.0', revision: 'unknown'
08:49:14.120 [main] INFO o.o.grid.selenium.GridLauncherV3 - Launching Selenium Grid hub on port 4445
...08:49:15.125 [main] INFO d.z.e.z.c.k.KubernetesContainerClient - Initialising Kubernetes support
..08:49:15.650 [main] WARN d.z.e.z.c.k.KubernetesContainerClient - Error initialising Kubernetes support.
io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Pod] with name: [zalenium-zalenium-hub-6bbd86ff78-m25t2] in namespace: [default] failed.
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:62)
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:71)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:206)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:162)
    at de.zalando.ep.zalenium.container.kubernetes.KubernetesContainerClient.<init>(KubernetesContainerClient.java:87)
    at de.zalando.ep.zalenium.container.ContainerFactory.createKubernetesContainerClient(ContainerFactory.java:35)
    at de.zalando.ep.zalenium.container.ContainerFactory.getContainerClient(ContainerFactory.java:22)
    at de.zalando.ep.zalenium.proxy.DockeredSeleniumStarter.<clinit>(DockeredSeleniumStarter.java:59)
    at de.zalando.ep.zalenium.registry.ZaleniumRegistry.<init>(ZaleniumRegistry.java:74)
    at de.zalando.ep.zalenium.registry.ZaleniumRegistry.<init>(ZaleniumRegistry.java:62)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.lang.Class.newInstance(Class.java:442)
    at org.openqa.grid.web.Hub.<init>(Hub.java:93)
    at org.openqa.grid.selenium.GridLauncherV3$2.launch(GridLauncherV3.java:291)
    at org.openqa.grid.selenium.GridLauncherV3.launch(GridLauncherV3.java:122)
    at org.openqa.grid.selenium.GridLauncherV3.main(GridLauncherV3.java:82)
Caused by: javax.net.ssl.SSLPeerUnverifiedException: Hostname kubernetes.default.svc not verified:
    certificate: sha256/OyzkRILuc6LAX4YnMAIGrRKLmVnDgLRvCasxGXDhSoc=
    DN: CN=client, O=system:masters
    subjectAltNames: [10.0.0.1]
    at okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:308)
    at okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:268)
    at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:160)
    at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:256)
    at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:134)
    at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:113)
    at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:125)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:56)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:107)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:200)
    at okhttp3.RealCall.execute(RealCall.java:77)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:379)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:344)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:313)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:296)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:770)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:195)
    ... 16 common frames omitted
08:49:15.651 [main] INFO d.z.e.z.c.k.KubernetesContainerClient - About to clean up any left over selenium pods created by Zalenium
Usage: [options]
Options:
--debug, -debug : enables LogLevel.FINE. Default: false
--version, -version Displays the version and exits. Default: false
-browserTimeout in seconds : number of seconds a browser session is allowed to hang while a WebDriver command is running (example: driver.get(url)). If the timeout is reached while a WebDriver command is still processing, the session will quit. Minimum value is 60. An unspecified, zero, or negative value means wait indefinitely.
-matcher, -capabilityMatcher class name : a class implementing the CapabilityMatcher interface. Specifies the logic the hub will follow to define whether a request can be assigned to a node. For example, if you want to have the matching process use regular expressions instead of exact match when specifying browser version. ALL nodes of a grid ecosystem would then use the same capabilityMatcher, as defined here.
-cleanUpCycle in ms : specifies how often the hub will poll running proxies for timed-out (i.e. hung) threads. Must also specify "timeout" option
-custom : comma separated key=value pairs for custom grid extensions. NOT RECOMMENDED -- may be deprecated in a future revision. Example: -custom myParamA=Value1,myParamB=Value2
-host IP or hostname : usually determined automatically. Most commonly useful in exotic network configurations (e.g. network with VPN) Default: 0.0.0.0
-hubConfig filename: a JSON file (following grid2 format), which defines the hub properties
-jettyThreads, -jettyMaxThreads : max number of threads for Jetty. An unspecified, zero, or negative value means the Jetty default value (200) will be used.
-log filename : the filename to use for logging. If omitted, will log to STDOUT
-maxSession max number of tests that can run at the same time on the node, irrespective of the browser used
-newSessionWaitTimeout in ms : The time after which a new test waiting for a node to become available will time out. When that happens, the test will throw an exception before attempting to start a browser. An unspecified, zero, or negative value means wait indefinitely. Default: 600000
-port : the port number the server will use. Default: 4445
-prioritizer class name : a class implementing the Prioritizer interface. Specify a custom Prioritizer if you want to sort the order in which new session requests are processed when there is a queue. Default to null ( no priority = FIFO )
-registry class name : a class implementing the GridRegistry interface. Specifies the registry the hub will use. Default: de.zalando.ep.zalenium.registry.ZaleniumRegistry
-role options are [hub], [node], or [standalone]. Default: hub
-servlet, -servlets : list of extra servlets the grid (hub or node) will make available. Specify multiple on the command line: -servlet tld.company.ServletA -servlet tld.company.ServletB. The servlet must exist in the path: /grid/admin/ServletA /grid/admin/ServletB
-timeout, -sessionTimeout in seconds : Specifies the timeout before the server automatically kills a session that hasn't had any activity in the last X seconds. The test slot will then be released for another test to use. This is typically used to take care of client crashes. For grid hub/node roles, cleanUpCycle must also be set.
-throwOnCapabilityNotPresent true or false : If true, the hub will reject all test requests if no compatible proxy is currently registered. If set to false, the request will queue until a node supporting the capability is registered with the grid.
-withoutServlet, -withoutServlets : list of default (hub or node) servlets to disable. Advanced use cases only. Not all default servlets can be disabled. Specify multiple on the command line: -withoutServlet tld.company.ServletA -withoutServlet tld.company.ServletB
org.openqa.grid.common.exception.GridConfigurationException: Error creating class with de.zalando.ep.zalenium.registry.ZaleniumRegistry : null
    at org.openqa.grid.web.Hub.<init>(Hub.java:97)
    at org.openqa.grid.selenium.GridLauncherV3$2.launch(GridLauncherV3.java:291)
    at org.openqa.grid.selenium.GridLauncherV3.launch(GridLauncherV3.java:122)
    at org.openqa.grid.selenium.GridLauncherV3.main(GridLauncherV3.java:82)
Caused by: java.lang.ExceptionInInitializerError
    at de.zalando.ep.zalenium.registry.ZaleniumRegistry.<init>(ZaleniumRegistry.java:74)
    at de.zalando.ep.zalenium.registry.ZaleniumRegistry.<init>(ZaleniumRegistry.java:62)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.lang.Class.newInstance(Class.java:442)
    at org.openqa.grid.web.Hub.<init>(Hub.java:93)
    ... 3 more
Caused by: java.lang.NullPointerException
    at java.util.TreeMap.putAll(TreeMap.java:313)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.withLabels(BaseOperation.java:411)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.withLabels(BaseOperation.java:48)
    at de.zalando.ep.zalenium.container.kubernetes.KubernetesContainerClient.deleteSeleniumPods(KubernetesContainerClient.java:393)
    at de.zalando.ep.zalenium.container.kubernetes.KubernetesContainerClient.initialiseContainerEnvironment(KubernetesContainerClient.java:339)
    at de.zalando.ep.zalenium.container.ContainerFactory.createKubernetesContainerClient(ContainerFactory.java:38)
    at de.zalando.ep.zalenium.container.ContainerFactory.getContainerClient(ContainerFactory.java:22)
    at de.zalando.ep.zalenium.proxy.DockeredSeleniumStarter.<clinit>(DockeredSeleniumStarter.java:59)
    ... 11 more
...........................................................................................................................................................................................GridLauncher failed to start after 1 minute, failing...
% Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 182 100 182 0 0 36103 0 --:--:-- --:--:-- --:--:-- 45500
A describe pod gives:
Warning Unhealthy 4m (x12 over 6m) kubelet, aks-agentpool-93668098-0 Readiness probe failed: HTTP probe failed with statuscode: 502
Zalenium Image Version(s):
dosel/zalenium:3
If using Kubernetes, specify your environment, and if relevant your manifests:
I use the templates as is from https://github.com/zalando/zalenium/tree/master/docs/k8s/helm
I guess it has something to do with RBAC, because of this part:
"Error initialising Kubernetes support. io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Pod] with name: [zalenium-zalenium-hub-6bbd86ff78-m25t2] in namespace: [default] failed. at "
I created a clusterrole and clusterrolebinding for the service account zalenium-zalenium that is automatically created by the Helm chart.
kubectl create clusterrole zalenium --verb=get,list,watch,update,delete,create,patch --resource=pods,deployments,secrets
kubectl create clusterrolebinding zalenium --clusterrole=zalnium --serviceaccount=zalenium-zalenium --namespace=default
The issue had to do with Azure's AKS and Kubernetes; it has since been fixed.
See GitHub issue 399.
If the typo mentioned by Ignacio is not the cause, why don't you just do
kubectl create clusterrolebinding zalenium --clusterrole=cluster-admin --serviceaccount=default:zalenium-zalenium
Note: there is no need to specify a namespace when creating a cluster role binding, as it is cluster-wide (only role bindings are namespaced); the service account, however, must be qualified as namespace:name.
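For reference, the same binding expressed as a manifest (a minimal sketch; the service account name and namespace are the Helm chart defaults mentioned above):
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: zalenium
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: zalenium-zalenium
  namespace: default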