I'm trying to define a Vertica-Kafka scheduler. I ran the first few commands successfully, but failed on the following command:
$ /opt/vertica/packages/kafka/bin/vkconfig source --create --cluster kafka_nms_cluster --source test --partitions 1 --conf /home/vertica/vkconfig/vkconfig.conf
The error I got
Exception in thread "main" com.vertica.solutions.kafka.exception.ConfigurationException: ERROR: [[Vertica][VJDBC](5861) ERROR: Error calling processPartition() in User Function KafkaListTopics at [/data/qb_workspaces/jenkins2/ReleaseBuilds/Grader/REL-9_2_1-x_grader/build/udx/supported/kafka/KafkaUtil.cpp:163], error code: 0, message: Error getting metadata: [Local: Broker transport failure]]
at com.vertica.solutions.kafka.model.StreamSource.validateConfiguration(StreamSource.java:248)
at com.vertica.solutions.kafka.model.StreamSource.setFromMapAndValidate(StreamSource.java:194)
at com.vertica.solutions.kafka.model.StreamModel.<init>(StreamModel.java:93)
at com.vertica.solutions.kafka.model.StreamSource.<init>(StreamSource.java:44)
at com.vertica.solutions.kafka.cli.SourceCLI.getNewModel(SourceCLI.java:62)
at com.vertica.solutions.kafka.cli.SourceCLI.getNewModel(SourceCLI.java:13)
at com.vertica.solutions.kafka.cli.CLI.run(CLI.java:59)
at com.vertica.solutions.kafka.cli.CLI._main(CLI.java:141)
at com.vertica.solutions.kafka.cli.SourceCLI.main(SourceCLI.java:29)
Caused by: java.sql.SQLNonTransientException: [Vertica][VJDBC](5861) ERROR: Error calling processPartition() in User Function KafkaListTopics at [/data/qb_workspaces/jenkins2/ReleaseBuilds/Grader/REL-9_2_1-x_grader/build/udx/supported/kafka/KafkaUtil.cpp:163], error code: 0, message: Error getting metadata: [Local: Broker transport failure]
at com.vertica.util.ServerErrorData.buildException(Unknown Source)
at com.vertica.dataengine.VResultSet.fetchChunk(Unknown Source)
at com.vertica.dataengine.VResultSet.initialize(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.readExecuteResponse(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.handleExecuteResponse(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.execute(Unknown Source)
at com.vertica.jdbc.common.SPreparedStatement.executeWithParams(Unknown Source)
at com.vertica.jdbc.common.SPreparedStatement.executeQuery(Unknown Source)
at com.vertica.solutions.kafka.model.StreamSource.validateConfiguration(StreamSource.java:227)
... 8 more
Caused by: com.vertica.support.exceptions.NonTransientException: [Vertica][VJDBC](5861) ERROR: Error calling processPartition() in User Function KafkaListTopics at [/data/qb_workspaces/jenkins2/ReleaseBuilds/Grader/REL-9_2_1-x_grader/build/udx/supported/kafka/KafkaUtil.cpp:163], error code: 0, message: Error getting metadata: [Local: Broker transport failure]
... 17 more
However, when I try to run KafkaListTopics using vsql, the resultset shows the test topic with 1 partition.
[root#dal_server1 ~]# /opt/vertica/bin/vsql -U vertica -c "SELECT KafkaListTopics(USING PARAMETERS brokers='10.22.2.38:9092') OVER ();"
topic | num_partitions
--------------------+----------------
__consumer_offsets | 50
test | 1
TutorialTopic | 1
(3 rows)
What might be causing this error?
Thanks
Avi
The issue may actually be with the cluster you created before you attempted to create the source. I have had this same issue when testing the Vertica/Kafka integration where the test Kafka cluster does not have a DNS entry, but a DNS name is stored in the stream_clsuters table.
Query the <scheduler_config_schema>.stream_clusters table. If a DNS name is stored instead of just a plain IP address, then you can do two things.
Do a manual update on the stream_clusters table to change it to <ip_address>:<port> if there is only one Kafka node, or<ip_address1>:<port>,...,<ip_addressN>:<port> if there are multiple.
Or, add the domain name to your /etc/hosts on all Vertica nodes
For example, in the stream_clusters table, you see domain_name_1:9092, run this UPDATE statement:
UPDATE <scheduler_config_schema>.stream_clusters
SET hosts = '10.22.2.38:9092'
WHERE id = <some_id>
Normally, I would advice NOT doing any kind of manual DML on these scheduler configuration tables, but I have done this specific update before and it is safe (especially in testing).
Of course in a true production environment the Kafka clusters should have DNS entries in your network, and you will not have to worry about this error, but for testing with VMs or Docker containers, I've come across this several times and the suggestions above have done the trick.
Related
After running a pipeline job in Jenkins that runs in my k8s cluster
I am getting this error -
‘Jenkins’ doesn’t have label ‘jenkins-eks-pod’.
What am I missing in my configuration?
Pod Logs in k8s-
2023-02-20 14:37:03.379+0000 [id=1646] WARNING o.c.j.p.k.KubernetesLauncher#launch: Error in provisioning; agent=KubernetesSlave name: jenkins-eks-agent-h4z6t, template=PodTemplate{id='05395ad55cc56972ee3e4c69c2731189bc03a75c0b51e637dc7f868fa85d07e8', name='jenkins-eks-agent', namespace='default', slaveConnectTimeout=100, label='jenkins-non-prod-eks-global-slave', serviceAccount='default', nodeUsageMode=NORMAL, podRetention='Never', containers=[ContainerTemplate{name='jnlp', image='805787217936.dkr.ecr.us-west-2.amazonaws.com/aura-jenkins-slave:ecs-global-node_master_57', alwaysPullImage=true, workingDir='/home/jenkins/agent', command='', args='', ttyEnabled=true, resourceRequestCpu='512m', resourceRequestMemory='512Mi', resourceRequestEphemeralStorage='', resourceLimitCpu='512m', resourceLimitMemory='512Mi', resourceLimitEphemeralStorage='', envVars=[KeyValueEnvVar [getValue()=http://jenkins-non-prod.default.svc.cluster.local:8080/, getKey()=JENKINS_URL]], livenessProbe=ContainerLivenessProbe{execArgs='', timeoutSeconds=0, initialDelaySeconds=0, failureThreshold=0, periodSeconds=0, successThreshold=0}}]}
java.lang.IllegalStateException: Containers are terminated with exit codes: {jnlp=0}
at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.checkTerminatedContainers(KubernetesLauncher.java:275)
at org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher.launch(KubernetesLauncher.java:225)
at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:298)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:48)
at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:82)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
2023-02-20 14:37:03.380+0000 [id=1646] INFO o.c.j.p.k.KubernetesSlave#_terminate: Terminating Kubernetes instance for agent jenkins-eks-agent-h4z6t
2023-02-20 14:37:03.380+0000 [id=1646] SEVERE o.c.j.p.k.KubernetesSlave#_terminate: Computer for agent is null: jenkins-eks-agent-h4z6t
This error might be due to not creating the label ‘jenkins-eks-pod’ on the jenkins server.
To create a label on the jenkins server :
go to manage jenkins > Manage nodes and Clouds > labels and then enter
the label name.
Post creating this label try to run the job and check if it works.
Refer to this Blog by Bibin Wilson.
If I start the service using docker,It should look like this:
docker run -e PARAMS="--spring.datasource.url=jdbc:mysql://mysql-service.example.com/xxl-job?Unicode=true&characterEncoding=UTF-8 --spring.datasource.username=root --spring.datasource.password=<mysql-password>" -p 8180:8080 -v /tmp:/data/applogs --name xxl-job-admin -d xuxueli/xxl-job-admin:2.0.2
now I am running it in kubernetes(v1.15.2) cluster,how to pass the parameter into pod's container? I am trying to pass parameter like this:
"name": "xxl-job-service",
"image": "xuxueli/xxl-job-admin:2.0.2",
"args": [
"--spring.datasource.url=jdbc:mysql://mysql-service.ttt208.com/xxl-job?Unicode=true&characterEncoding=UTF-8 --spring.datasource.username=root --spring.datasource.password=<mysql-password>"
],
but it seem do not work,it throw :
19:19:55.563 logback [xxl-job, admin JobFailMonitorHelper] ERROR c.x.j.a.c.t.JobFailMonitorHelper - >>>>>>>>>>> xxl-job, job fail monitor thread error:{}
org.mybatis.spring.MyBatisSystemException: nested exception is org.apache.ibatis.exceptions.PersistenceException:
### Error querying database. Cause: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
### The error may exist in class path resource [mybatis-mapper/XxlJobLogMapper.xml]
### The error may involve com.xxl.job.admin.dao.XxlJobLogDao.findFailJobLogIds
### The error occurred while executing a query
### Cause: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at org.mybatis.spring.MyBatisExceptionTranslator.translateExceptionIfPossible(MyBatisExceptionTranslator.java:77)
at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:446)
at com.sun.proxy.$Proxy57.selectList(Unknown Source)
at org.mybatis.spring.SqlSessionTemplate.selectList(SqlSessionTemplate.java:230)
at org.apache.ibatis.binding.MapperMethod.executeForMany(MapperMethod.java:139)
at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:76)
at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:59)
at com.sun.proxy.$Proxy61.findFailJobLogIds(Unknown Source)
at com.xxl.job.admin.core.thread.JobFailMonitorHelper$1.run(JobFailMonitorHelper.java:49)
at java.lang.Thread.run(Thread.java:748)
what should I do to run this service success? I am sure the database username and password correct.
Dockers -e or --env sets an environment variable.
The equivalent in a Kubernetes pod spec is the containers env field
env:
- name: PARAMS
value: ' --spring.datasource.url=jdbc:mysql://mysql-service.example.com/xxl-job?Unicode=true&characterEncoding=UTF-8 --spring.datasource.username=root --spring.datasource.password=<mysql-password>'
I am trying to deploy the zalenium helm chart in my newly deployed aks Kuberbetes (1.9.6) cluster in Azure. But I don't get it to work. The pod is giving the log below:
[bram#xforce zalenium]$ kubectl logs -f zalenium-zalenium-hub-6bbd86ff78-m25t2 Kubernetes service account found. Copying files for Dashboard... cp: cannot create regular file '/home/seluser/videos/index.html': Permission denied cp: cannot create directory '/home/seluser/videos/css': Permission denied cp: cannot create directory '/home/seluser/videos/js': Permission denied Starting Nginx reverse proxy... Starting Selenium Hub... ..........08:49:14.052 [main] INFO o.o.grid.selenium.GridLauncherV3 - Selenium build info: version: '3.12.0', revision: 'unknown' 08:49:14.120 [main] INFO o.o.grid.selenium.GridLauncherV3 - Launching Selenium Grid hub on port 4445 ...08:49:15.125 [main] INFO d.z.e.z.c.k.KubernetesContainerClient - Initialising Kubernetes support ..08:49:15.650 [main] WARN d.z.e.z.c.k.KubernetesContainerClient - Error initialising Kubernetes support. io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Pod] with name: [zalenium-zalenium-hub-6bbd86ff78-m25t2] in namespace: [default] failed. at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:62) at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:71) at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:206) at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:162) at de.zalando.ep.zalenium.container.kubernetes.KubernetesContainerClient.(KubernetesContainerClient.java:87) at de.zalando.ep.zalenium.container.ContainerFactory.createKubernetesContainerClient(ContainerFactory.java:35) at de.zalando.ep.zalenium.container.ContainerFactory.getContainerClient(ContainerFactory.java:22) at de.zalando.ep.zalenium.proxy.DockeredSeleniumStarter.(DockeredSeleniumStarter.java:59) at de.zalando.ep.zalenium.registry.ZaleniumRegistry.(ZaleniumRegistry.java:74) at de.zalando.ep.zalenium.registry.ZaleniumRegistry.(ZaleniumRegistry.java:62) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at java.lang.Class.newInstance(Class.java:442) at org.openqa.grid.web.Hub.(Hub.java:93) at org.openqa.grid.selenium.GridLauncherV3$2.launch(GridLauncherV3.java:291) at org.openqa.grid.selenium.GridLauncherV3.launch(GridLauncherV3.java:122) at org.openqa.grid.selenium.GridLauncherV3.main(GridLauncherV3.java:82) Caused by: javax.net.ssl.SSLPeerUnverifiedException: Hostname kubernetes.default.svc not verified: certificate: sha256/OyzkRILuc6LAX4YnMAIGrRKLmVnDgLRvCasxGXDhSoc= DN: CN=client, O=system:masters subjectAltNames: [10.0.0.1] at okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:308) at okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:268) at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:160) at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:256) at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:134) at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:113) at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:125) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:56) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:107) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:200) at okhttp3.RealCall.execute(RealCall.java:77) at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:379) at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:344) at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:313) at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:296) at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:770) at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:195) ... 16 common frames omitted 08:49:15.651 [main] INFO d.z.e.z.c.k.KubernetesContainerClient - About to clean up any left over selenium pods created by Zalenium Usage: [options] Options: --debug, -debug : enables LogLevel.FINE. Default: false --version, -version Displays the version and exits. Default: false -browserTimeout in seconds : number of seconds a browser session is allowed to hang while a WebDriver command is running (example: driver.get(url)). If the timeout is reached while a WebDriver command is still processing, the session will quit. Minimum value is 60. An unspecified, zero, or negative value means wait indefinitely. -matcher, -capabilityMatcher class name : a class implementing the CapabilityMatcher interface. Specifies the logic the hub will follow to define whether a request can be assigned to a node. For example, if you want to have the matching process use regular expressions instead of exact match when specifying browser version. ALL nodes of a grid ecosystem would then use the same capabilityMatcher, as defined here. -cleanUpCycle in ms : specifies how often the hub will poll running proxies for timed-out (i.e. hung) threads. Must also specify "timeout" option -custom : comma separated key=value pairs for custom grid extensions. NOT RECOMMENDED -- may be deprecated in a future revision. Example: -custom myParamA=Value1,myParamB=Value2 -host IP or hostname : usually determined automatically. Most commonly useful in exotic network configurations (e.g. network with VPN) Default: 0.0.0.0 -hubConfig filename: a JSON file (following grid2 format), which defines the hub properties -jettyThreads, -jettyMaxThreads : max number of threads for Jetty. An unspecified, zero, or negative value means the Jetty default value (200) will be used. -log filename : the filename to use for logging. If omitted, will log to STDOUT -maxSession max number of tests that can run at the same time on the node, irrespective of the browser used -newSessionWaitTimeout in ms : The time after which a new test waiting for a node to become available will time out. When that happens, the test will throw an exception before attempting to start a browser. An unspecified, zero, or negative value means wait indefinitely. Default: 600000 -port : the port number the server will use. Default: 4445 -prioritizer class name : a class implementing the Prioritizer interface. Specify a custom Prioritizer if you want to sort the order in which new session requests are processed when there is a queue. Default to null ( no priority = FIFO ) -registry class name : a class implementing the GridRegistry interface. Specifies the registry the hub will use. Default: de.zalando.ep.zalenium.registry.ZaleniumRegistry -role options are [hub], [node], or [standalone]. Default: hub -servlet, -servlets : list of extra servlets the grid (hub or node) will make available. Specify multiple on the command line: -servlet tld.company.ServletA -servlet tld.company.ServletB. The servlet must exist in the path: /grid/admin/ServletA /grid/admin/ServletB -timeout, -sessionTimeout in seconds : Specifies the timeout before the server automatically kills a session that hasn't had any activity in the last X seconds. The test slot will then be released for another test to use. This is typically used to take care of client crashes. For grid hub/node roles, cleanUpCycle must also be set. -throwOnCapabilityNotPresent true or false : If true, the hub will reject all test requests if no compatible proxy is currently registered. If set to false, the request will queue until a node supporting the capability is registered with the grid. -withoutServlet, -withoutServlets : list of default (hub or node) servlets to disable. Advanced use cases only. Not all default servlets can be disabled. Specify multiple on the command line: -withoutServlet tld.company.ServletA -withoutServlet tld.company.ServletB org.openqa.grid.common.exception.GridConfigurationException: Error creating class with de.zalando.ep.zalenium.registry.ZaleniumRegistry : null at org.openqa.grid.web.Hub.(Hub.java:97) at org.openqa.grid.selenium.GridLauncherV3$2.launch(GridLauncherV3.java:291) at org.openqa.grid.selenium.GridLauncherV3.launch(GridLauncherV3.java:122) at org.openqa.grid.selenium.GridLauncherV3.main(GridLauncherV3.java:82) Caused by: java.lang.ExceptionInInitializerError at de.zalando.ep.zalenium.registry.ZaleniumRegistry.(ZaleniumRegistry.java:74) at de.zalando.ep.zalenium.registry.ZaleniumRegistry.(ZaleniumRegistry.java:62) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at java.lang.Class.newInstance(Class.java:442) at org.openqa.grid.web.Hub.(Hub.java:93) ... 3 more Caused by: java.lang.NullPointerException at java.util.TreeMap.putAll(TreeMap.java:313) at io.fabric8.kubernetes.client.dsl.base.BaseOperation.withLabels(BaseOperation.java:411) at io.fabric8.kubernetes.client.dsl.base.BaseOperation.withLabels(BaseOperation.java:48) at de.zalando.ep.zalenium.container.kubernetes.KubernetesContainerClient.deleteSeleniumPods(KubernetesContainerClient.java:393) at de.zalando.ep.zalenium.container.kubernetes.KubernetesContainerClient.initialiseContainerEnvironment(KubernetesContainerClient.java:339) at de.zalando.ep.zalenium.container.ContainerFactory.createKubernetesContainerClient(ContainerFactory.java:38) at de.zalando.ep.zalenium.container.ContainerFactory.getContainerClient(ContainerFactory.java:22) at de.zalando.ep.zalenium.proxy.DockeredSeleniumStarter.(DockeredSeleniumStarter.java:59) ... 11 more ...........................................................................................................................................................................................GridLauncher failed to start after 1 minute, failing... % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 182 100 182 0 0 36103 0 --:--:-- --:--:-- --:--:-- 45500
A describe pod gives:
Warning Unhealthy 4m (x12 over 6m) kubelet, aks-agentpool-93668098-0 Readiness probe failed: HTTP probe failed with statuscode: 502
Zalenium Image Version(s):
dosel/zalenium:3
If using Kubernetes, specify your environment, and if relevant your manifests:
I use the templates as is from https://github.com/zalando/zalenium/tree/master/docs/k8s/helm
I guess it has to do something with rbac because of this part
"Error initialising Kubernetes support. io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Pod] with name: [zalenium-zalenium-hub-6bbd86ff78-m25t2] in namespace: [default] failed. at "
I created a clusterrole and clusterrolebinding for the service account zalenium-zalenium that is automatically created by the Helm chart.
kubectl create clusterrole zalenium --verb=get,list,watch,update,delete,create,patch --resource=pods,deployments,secrets
kubectl create clusterrolebinding zalenium --clusterrole=zalnium --serviceaccount=zalenium-zalenium --namespace=default
Issue had to do with Azure's AKS and Kubernetes. It has been fixed.
See github issue 399
If the typo, mentioned by Ignacio, is not the case why don't you just do
kubectl create clusterrolebinding zalenium --clusterrole=cluster-admin serviceaccount=zalenium-zalenium
Note: no need to specify a namespace if creating cluster role binding, as it is cluster wide (role binding goes with namespaces).
I have OrienDB 2.1.4 cluster of 3 nodes with basic configuration. The only change in hazelcast.xml I made is to replace multicast by implicit tcp-ip hosts list.
After heavy request to DB (select without joins, about 300k rows in result set), OrientDB stops response to network connection attempts from application (OrientDB Studio is still working), the follwoing exceptions continuously appear in logs:
on master node
2016-02-24 10:02:17:647 INFO [10.10.10.124]:2434 [zertodb] [3.3.5] Remaining migration tasks in queue => 1 [InternalPartitionService][10.10.10.124]:2434 [zertodb] [3.3.5] Received data format is invalid. (An old version of Hazelcast may be running here.)
com.hazelcast.nio.serialization.HazelcastSerializationException: java.io.UTFDataFormatException: Length check failed, maybe broken bytestream or wrong stream position
at com.hazelcast.nio.serialization.SerializationServiceImpl.handleException(SerializationServiceImpl.java:354)
at com.hazelcast.nio.serialization.SerializationServiceImpl.readObject(SerializationServiceImpl.java:341)
at com.hazelcast.nio.serialization.ByteArrayObjectDataInput.readObject(ByteArrayObjectDataInput.java:454)
at com.hazelcast.cluster.MulticastService.receive(MulticastService.java:155)
at com.hazelcast.cluster.MulticastService.run(MulticastService.java:113)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.UTFDataFormatException: Length check failed, maybe broken bytestream or wrong stream position
at com.hazelcast.nio.UTFEncoderDecoder.readUTF0(UTFEncoderDecoder.java:505)
at com.hazelcast.nio.UTFEncoderDecoder.readUTF(UTFEncoderDecoder.java:77)
at com.hazelcast.nio.serialization.ByteArrayObjectDataInput.readUTF(ByteArrayObjectDataInput.java:450)
at com.hazelcast.cluster.ConfigCheck.readData(ConfigCheck.java:219)
at com.hazelcast.cluster.JoinMessage.readData(JoinMessage.java:80)
at com.hazelcast.cluster.JoinRequest.readData(JoinRequest.java:64)
at com.hazelcast.nio.serialization.DataSerializer.read(DataSerializer.java:111)
at com.hazelcast.nio.serialization.DataSerializer.read(DataSerializer.java:39)
at com.hazelcast.nio.serialization.StreamSerializerAdapter.read(StreamSerializerAdapter.java:44)
at com.hazelcast.nio.serialization.SerializationServiceImpl.readObject(SerializationServiceImpl.java:335)
... 4 more
on other nodes:
[10.10.10.194]:2434 [zertodb] [3.3.5] Received data format is invalid. (An old version of Hazelcast may be running here.)
com.hazelcast.nio.serialization.HazelcastSerializationException: java.io.StreamCorruptedException: invalid type code: 00
at com.hazelcast.nio.serialization.SerializationServiceImpl.handleException(SerializationServiceImpl.java:354)
at com.hazelcast.nio.serialization.SerializationServiceImpl.readObject(SerializationServiceImpl.java:341)
at com.hazelcast.nio.serialization.ByteArrayObjectDataInput.readObject(ByteArrayObjectDataInput.java:454)
at com.hazelcast.cluster.ConfigCheck.readData(ConfigCheck.java:215)
at com.hazelcast.cluster.JoinMessage.readData(JoinMessage.java:80)
at com.hazelcast.cluster.JoinRequest.readData(JoinRequest.java:64)
at com.hazelcast.nio.serialization.DataSerializer.read(DataSerializer.java:111)
at com.hazelcast.nio.serialization.DataSerializer.read(DataSerializer.java:39)
at com.hazelcast.nio.serialization.StreamSerializerAdapter.read(StreamSerializerAdapter.java:44)
at com.hazelcast.nio.serialization.SerializationServiceImpl.readObject(SerializationServiceImpl.java:335)
at com.hazelcast.nio.serialization.ByteArrayObjectDataInput.readObject(ByteArrayObjectDataInput.java:454)
at com.hazelcast.cluster.MulticastService.receive(MulticastService.java:155)
at com.hazelcast.cluster.MulticastService.run(MulticastService.java:113)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.StreamCorruptedException: invalid type code: 00
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1379)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at com.hazelcast.nio.serialization.DefaultSerializers$ObjectSerializer.read(DefaultSerializers.java:196)
at com.hazelcast.nio.serialization.StreamSerializerAdapter.read(StreamSerializerAdapter.java:44)
at com.hazelcast.nio.serialization.SerializationServiceImpl.readObject(SerializationServiceImpl.java:335)
... 12 more
The same query with smaller result set works fine.
I have found this issue https://github.com/hazelcast/hazelcast/issues/4327 about your problem with hazelcast.
Hope it helps.
We are trying to run Hive queries on HDP 2.1 using GCS Connector, it was working fine until yesterday but since today morning our jobs are randomly started failing. When we restart them manually they just work fine. I suspect it's something to do with number of parallel Hive jobs running at a given point of time.
Below is the error message:
vertexId=vertex_1407434664593_37527_2_00, diagnostics=[Vertex Input: audience_history initializer failed., java.lang.ClassNotFoundException: Class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem not found]
DAG failed due to vertex failure. failedVertices:1 killedVertices:0
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask
Any help will be highly appreciated.
Thanks!