Connection timeout on cluster.openBucket call with Couchbase / Kubernetes - kubernetes

I have deployed a 4 node Couchbase cluster using Google GKE.
The master node exposes ports 8091 and 8093 to the load balancer.
When I connect to the load balancer's external IP from a Java app to insert data, I get a timeout error with this stack trace:
Apr 04, 2017 3:32:15 PM com.couchbase.client.core.endpoint.AbstractEndpoint$2 operationComplete
WARNING: [null][ViewEndpoint]: Socket connect took longer than specified timeout.
Apr 04, 2017 3:32:15 PM com.couchbase.client.core.endpoint.AbstractEndpoint$2 operationComplete
WARNING: [null][KeyValueEndpoint]: Socket connect took longer than specified timeout.
Apr 04, 2017 3:32:15 PM com.couchbase.client.deps.io.netty.util.concurrent.DefaultPromise notifyListener0
WARNING: An exception was thrown by com.couchbase.client.core.endpoint.AbstractEndpoint$2.operationComplete()
rx.exceptions.OnErrorNotImplementedException: connection timed out: /10.4.0.3:8093
at rx.Observable$26.onError(Observable.java:7955)
at rx.observers.SafeSubscriber._onError(SafeSubscriber.java:159)
at rx.observers.SafeSubscriber.onError(SafeSubscriber.java:120)
at rx.internal.operators.OperatorMap$1.onError(OperatorMap.java:48)
What's puzzling is that the stack trace shows 10.4.0.3:8093, which is actually the IP of the Docker container.
Appreciate all suggestions.

Have you checked the firewall rules for the master node and the workers? You need to allow ingress for the ports you have set up.
See this answer
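One way to act on this suggestion is to test, from the machine running the Java app, whether each address and port actually accepts TCP connections. The following is a minimal Python sketch, not part of the original answer. The function names are made up for illustration. Ports 8091 and 8093 come from the question; 8092 (views) and 11210 (key-value) are assumed Couchbase defaults, suggested by the ViewEndpoint and KeyValueEndpoint warnings in the log.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable networks.
        return False


def report(host, ports, timeout=3.0):
    """Map each port to whether it accepted a TCP connection."""
    return {port: can_connect(host, port, timeout) for port in ports}
```

Calling `report(lb_ip, [8091, 8092, 8093, 11210])` with the load balancer's external IP shows which ports the firewall and load balancer let through. Since the stack trace shows the client dialing 10.4.0.3 rather than the load balancer, it may also be worth running the same check against that address from the client machine.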

Related

Issues in Cluster Environment on weblogic 12.2.1.2.0 (liferay DXP running)

We have a 2-node cluster in WebLogic running Liferay DXP. When we try to start Node-2, we get the error below.
Any thoughts on this and what the best fix could be? This is impacting our production, so any help is appreciated.
Jun 18, 2018 1:26:21 AM org.jgroups.protocols.UDP setBufferSize
WARNING: JGRP000015: the receive buffer of socket MulticastSocket was
set to 500KB, but the OS only allocated 212.99KB. This might lead to
performance problems. Please set your max receive buffer in the OS
correctly (e.g. net.core.rmem_max on Linux)
1:26:21,354 AM EDT>
and we are getting in our logs:
01:26:21,484 ERROR [Start Level: Equinox Container:
123465678-1234-1234-1234-f0a48d1c8347][ClusterExecutorImpl:402] Unable
to get cluster node with address Servername-32883
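The JGroups warning above names the kernel setting that caps the socket receive buffer. As a rough sketch (not a confirmed fix for the ClusterExecutorImpl error, since the warning itself only mentions performance), the Python snippet below reads that limit on Linux and reports how far it falls short of a requested size. The function name is hypothetical, and treating "500KB" as 500,000 bytes is an assumption.

```python
from pathlib import Path


def receive_buffer_shortfall(requested_bytes,
                             rmem_max_path="/proc/sys/net/core/rmem_max"):
    """Return how many bytes the OS limit falls short of the request (0 if none)."""
    # The file holds a single integer: the maximum receive buffer in bytes.
    os_limit = int(Path(rmem_max_path).read_text().strip())
    return max(0, requested_bytes - os_limit)
```

A non-zero result for `receive_buffer_shortfall(500_000)` would suggest raising `net.core.rmem_max`, as the warning recommends.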

Failed to create a cluster with command kubeadm init

I am trying to install Kubernetes on Debian 9.3. I followed the instructions in this document: https://kubernetes.io/docs/setup/independent/install-kubeadm/, but creating the cluster failed with a timeout error. The commands I used are as follows:
export HTTP_PROXY=http://192.168.56.1:1080 # this is my internet proxy
export HTTPS_PROXY=http://192.168.56.1:1080
export NO_PROXY=127.0.0.1,192.168.56.*,10.244.*,10.96.*
kubeadm init --apiserver-advertise-address=192.168.56.101 --pod-network-cidr=10.244.0.0/16
The last command hung for 1 hour and then failed with a timeout. Using docker ps, I found that several containers were running: kube-controller-manager-amd64, etcd-amd64, kube-apiserver-amd64, kube-scheduler-amd64, and 4 instances of pause-amd64.
The error messages are as follows:
duler-debvm01_kube-system(660259102d57385a8043d025ac189c87)": Get https://192.168.56.101:6443/api/v1/namespaces/kube-system/pods/kube-scheduler-debvm01: net/http: TLS handshake timeout
Apr 06 21:44:49 DebVM01 kubelet[10665]: E0406 21:44:49.923017 10665 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:474: Failed to list *v1.Node: Get https://192.168.56.101:6443/api/v1/nodes?fieldSelector=metadata.name%3Ddebvm01&limit=500&resourceVersion=0: net/http: TLS handshake timeout
Apr 06 21:44:49 DebVM01 kubelet[10665]: E0406 21:44:49.924966 10665 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:465: Failed to list *v1.Service: Get https://192.168.56.101:6443/api/v1/services?limit=500&resourceVersion=0: net/http: TLS handshake timeout
Apr 06 21:44:49 DebVM01 kubelet[10665]: E0406 21:44:49.925892 10665 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get xxx/api/v1/pods?fieldSelector=spec.nodeName%3Ddebvm01&limit=500&resourceVersion=0: net/http: TLS handshake timeout
Apr 06 21:44:50 DebVM01 kubelet[10665]: E0406 21:44:50.029333 10665 eviction_manager.go:238] eviction manager: unexpected err: failed to get node info: node "debvm01" not found
Apr 06 21:44:50 DebVM01 kubelet[10665]: E0406 21:44:50.379543 10665 kubelet_node_status.go:106] Unable to register node "debvm01" with API server: Post xxx: net/http: TLS handshake timeout
Apr 06 21:44:52 DebVM01 kubelet[10665]: E0406 21:44:52.575452 10665 event.go:209] Unable to write event: 'Post xxxx: net/http: TLS handshake timeout' (may retry after sleeping)
Apr 06 21:44:57 DebVM01 kubelet[10665]: I0406 21:44:57.380498 10665 kubelet_node_status.go:273] Setting node annotation to enable volume controller attach/detach
Apr 06 21:44:57 DebVM01 kubelet[10665]: I0406 21:44:57.430059 10665 kubelet_node_status.go:82] Attempting to register node debvm01
Apr 06 21:45:00 DebVM01 kubelet[10665]: E0406 21:45:00.030635 10665 eviction_manager.go:238] eviction manager: unexpected err: failed to get node info: node "debvm01" not found
Apr 06 21:45:01 DebVM01 kubelet[10665]: I0406 21:45:01.484580 10665 kubelet_node_status.go:85] Successfully registered node debvm01
The error messages above have been trimmed; I removed many repeated lines like the following:
Apr 06 22:46:20 DebVM01 kubelet[10665]: E0406 22:46:20.773690 10665 kubelet.go:2104] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Apr 06 22:46:25 DebVM01 kubelet[10665]: W0406 22:46:25.779141 10665 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Kubernetes v1.9.3
Could anyone help me?
kubeadm init --apiserver-advertise-address=192.168.56.101
--pod-network-cidr=10.244.0.0/16
From kubeadm documentation:
--apiserver-advertise-address ip-address The IP address the API Server will advertise it's listening on. Specify '0.0.0.0' to use the
address of the default network interface.
Unless otherwise specified, kubeadm uses the default gateway’s network
interface to advertise the master’s IP. If you want to use a different
network interface, specify --apiserver-advertise-address=ip-address
From kubernetes api-server documentation:
--advertise-address ip-address The IP address on which to advertise the apiserver to members of the cluster. This address must
be reachable by the rest of the cluster. If blank, the --bind-address
will be used. If --bind-address is unspecified, the host's default
interface will be used.
I've done a couple of experiments which confirm that it is necessary for ip-address to be configured (or added as a secondary IP) to one of the master's instance interfaces.
Just double check if that interface is up.
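Whether the advertise address is really configured on one of the master's interfaces can be approximated by trying to bind a socket to it. This is a minimal Python sketch under that assumption, with a hypothetical function name; it is not how kubeadm itself validates the address.

```python
import socket


def is_assigned_locally(ip_address):
    """Return True if a TCP socket can bind to ip_address on this host."""
    family = socket.AF_INET6 if ":" in ip_address else socket.AF_INET
    with socket.socket(family, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((ip_address, 0))  # port 0 lets the OS pick a free port
            return True
        except OSError:
            # Typically "Cannot assign requested address" for non-local IPs.
            return False
```

On the master, `is_assigned_locally("192.168.56.101")` should return True if the interface carrying that address is up. Note that on Linux hosts with `net.ipv4.ip_nonlocal_bind` enabled, binding can succeed even for addresses that are not configured, so the result is only a hint.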
The last error message,
network plugin is not ready: cni config uninitialized
means that kubernetes networking subsystem is absent or broken. Try to install/reinstall it with
kubectl apply -f https://docs.projectcalico.org/v3.0/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml
This part is described in section "(3/4) Installing a pod network" of the document you mentioned.
If you are stuck, try to reinstall your cluster following this manual.
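The "No networks found in /etc/cni/net.d" warning from the question can be checked directly by listing that directory before and after installing the network add-on. The Python sketch below does this; the directory path comes from the log line, while the function name and the list of accepted file extensions are assumptions about what the CNI loader picks up.

```python
from pathlib import Path

# Assumed set of extensions treated as CNI configuration files.
CNI_EXTENSIONS = {".conf", ".conflist", ".json"}


def find_cni_configs(conf_dir="/etc/cni/net.d"):
    """Return the sorted names of CNI config files in conf_dir ([] if none)."""
    directory = Path(conf_dir)
    if not directory.is_dir():
        return []
    return sorted(
        entry.name
        for entry in directory.iterdir()
        if entry.is_file() and entry.suffix in CNI_EXTENSIONS
    )
```

An empty list after applying the Calico manifest would suggest that the network pods have not started on the node yet.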

Zookeeper and Solr - unable to start the cluster

I've inherited a platform which runs Zookeeper and Solr. The main problem at first was that zoo.cfg and Zookeeper had an old setup, so DNS name resolution wasn't working.
I've already fixed that. I also fixed the issue of the myid value in /var/lib/zookeeper thanks to this thread (someone left a string value there...):
Zookeeper - three nodes and nothing but errors
Now, the output of logs is different:
2017-08-28 14:24:19,368 [myid:3] - WARN [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager#400] - Cannot open channel to 2 at election address zookeeper-test-2/10.240.102.89:3888
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:381)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:426)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:843)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:822)
2017-08-28 14:24:19,368 [myid:3] - INFO [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumPeer$QuorumServer#149] - Resolved hostname: zookeeper-test-2 to address: zookeeper-test-2/10.240.102.89
2017-08-28 14:24:19,368 [myid:3] - INFO [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection#852] - Notification time out: 400
and if I run this script from an EC2 instance inside the VPC, I can see it is almost working:
+
+ SOLR
+ Online Nodes: 2
+ solrcloud-test-2 [ Connection to 8983 succeeded ]
+ solrcloud-test-1 [ Connection to 8983 succeeded ]
-------------
+ ELB not responding properly. HTTP response: 503
+
+ ZOOKEEPER
+ Online Instances: 3
+ zookeeper-test-3 [ Connection to 2181 succeeded ]
+ zookeeper-test-2 [ Connection to 2181 succeeded ]
+ zookeeper-test-1 [ Connection to 2181 succeeded ]
+ Minimal Configured: 3
+ Cluster Status: UP
As you can see, even though the Solr EC2 instances seem to be online, the ELB doesn't detect the Solr path.
I checked the parameters Tomcat uses to bring up Solr and again found a misconfiguration related to the hostnames, so I fixed it and restarted Tomcat. However, my ELB health check still does not detect the URL.
This check is set up as
HTTP:8983/solr/
If I run netstat, I can see the port is up and listening:
[root@solrcloud-test-2 ~]# netstat -lnp |grep 8983
tcp 0 0 :::8983 :::* LISTEN 3338/java
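A listening port does not by itself mean the health check path returns a success status. A quick way to see what the ELB sees is to request the same path and look at the status code. The Python sketch below is an illustration with hypothetical function names. It assumes the check only passes on a 200 response, and it follows redirects, which a load balancer health check does not, so a redirect may look healthier here than it does to the ELB.

```python
import urllib.error
import urllib.request

# Bypass any proxy settings so the request goes straight to the instance.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def http_status(url, timeout=5.0):
    """Return the HTTP status code for a GET on url, or None if no response."""
    try:
        with _OPENER.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, ...


def looks_healthy(url, timeout=5.0):
    """Return True only for a 200 response (assumed health check criterion)."""
    return http_status(url, timeout) == 200
```

Running `http_status("http://solrcloud-test-2:8983/solr/")` from inside the VPC would show whether Tomcat answers with 200 or with an error such as 404, which is what happens when the port is open but the web application failed to start.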
Another thing I noticed is that even though I restarted Tomcat, the logs of serviced are stuck in the past, on the 25th of August!
ERROR - 2017-08-25 17:10:25.369; org.apache.solr.common.SolrException; null:org.apache.solr.common.SolrException: java.net.UnknownHostException: zookeeper-1: unknown error
at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:139)
at org.apache.solr.cloud.ZkController.<init>(ZkController.java:207)
at org.apache.solr.core.ZkContainer.initZooKeeper(ZkContainer.java:152)
at org.apache.solr.core.ZkContainer.initZooKeeper(ZkContainer.java:67)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:216)
at org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:189)
at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:136)
at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:279)
at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:260)
at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:105)
at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4939)
at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5633)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:147)
at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:899)
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:875)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:652)
at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:1092)
at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1984)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.UnknownHostException: zookeeper-1: unknown error
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61)
at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445)
at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380)
at org.apache.solr.common.cloud.SolrZooKeeper.<init>(SolrZooKeeper.java:41)
at org.apache.solr.common.cloud.DefaultConnectionStrategy.connect(DefaultConnectionStrategy.java:37)
at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:114)
... 22 more
INFO - 2017-08-25 17:10:25.370; org.apache.solr.servlet.SolrDispatchFilter; SolrDispatchFilter.init() done
ERROR - 2017-08-25 17:10:25.525; org.apache.solr.core.CoreContainer; CoreContainer was not shutdown prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!! instance=650576553
On the other hand, catalina.out shows:
INFO: Deploying web application archive /var/lib/tomcat7/webapps/solr.war
Aug 29, 2017 8:39:26 AM org.apache.catalina.startup.TldConfig execute
INFO: At least one JAR was scanned for TLDs yet contained no TLDs. Enable debug logging for this logger for a complete list of JARs that were scanned but no TLDs were found in them. Skipping unneeded JARs during scanning can improve startup time and JSP compilation time.
Aug 29, 2017 8:39:26 AM org.apache.catalina.core.StandardContext startInternal
SEVERE: One or more Filters failed to start. Full details will be found in the appropriate container log file
Aug 29, 2017 8:39:26 AM org.apache.catalina.core.StandardContext startInternal
SEVERE: Context [/solr] startup failed due to previous errors
Aug 29, 2017 8:39:26 AM org.apache.catalina.startup.HostConfig deployWAR
INFO: Deployment of web application archive /var/lib/tomcat7/webapps/solr.war has finished in 3,447 ms
Aug 29, 2017 8:39:26 AM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler ["http-bio-8983"]
Aug 29, 2017 8:39:26 AM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler ["http-bio-8443"]
Aug 29, 2017 8:39:26 AM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler ["ajp-bio-8009"]
Aug 29, 2017 8:39:26 AM org.apache.catalina.startup.Catalina start
INFO: Server startup in 3579 ms
So maybe my problem here is that I don't know exactly how to restart the Solr component? This is really my first time working with this software, so apologies if my questions are totally noob :)
UPDATE:
I found new info in the logs. Now I think I'm closer to the root issue. However, I don't understand the problem. Are some files missing?
Aug 29, 2017 9:12:11 AM org.apache.catalina.core.StandardContext filterStart
SEVERE: Exception starting filter SolrRequestFilter
java.lang.NoClassDefFoundError: Failed to initialize Apache Solr: Could not find necessary SLF4j logging jars. If using Jetty, the SLF4j logging jars need to go in the jetty lib/ext directory. For other containers, the corresponding directory should be used. For more information, see: http://wiki.apache.org/solr/SolrLogging
Thanks
Wow, apologies, people. Maybe I posted the question too quickly. The setup was completely misconfigured. To sum up: the Zookeeper cluster was only partially running; the service process was up, but it was not actually providing service, so Solr was unable to connect to it. Once I brought Zookeeper back to a good state, Solr could connect to it.
Thanks anyway.
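The gap between "port 2181 accepts connections" and "the ensemble is serving requests" can be made visible with ZooKeeper's four-letter commands. The sketch below is a minimal Python illustration with hypothetical function names. It sends `srvr` and looks for the `Mode:` line, which, as far as I know, a node only reports once it is part of a working quorum. On newer ZooKeeper versions the command may need to be allowed through `4lw.commands.whitelist`.

```python
import socket


def four_letter_word(host, command, port=2181, timeout=3.0):
    """Send a ZooKeeper four-letter command and return the decoded reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def server_mode(host, port=2181, timeout=3.0):
    """Return the 'Mode:' value from 'srvr' (e.g. leader, follower), or None."""
    try:
        reply = four_letter_word(host, "srvr", port, timeout)
    except OSError:
        return None
    for line in reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None
```

If `server_mode("zookeeper-test-2")` returns None while a plain TCP connection to 2181 succeeds, the node is up but probably not serving, which matches the situation described above.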

Internet of Things Platform Starter: Failed to create container

I created an Internet of Things Platform Starter service. As it did not start automatically, I started it manually:
STG/0 Failed to create container Apr 29, 2017 9:11:24 PM
API/1 Failed to stage application: staging failed Apr 29, 2017 9:11:24 PM
LGR/nullproxy: error connecting to 169.46.101.209:8081: dial tcp 169.46.101.209:8081: i/o timeout Apr 29, 2017 9:26:12 PM
Not sure why a container is required.
Node-RED is running now. Looks like it was a glitch.

Can't get my webapp accessible using eclipse, apache2 and tomcat7 (Ubuntu)

I'm quite a rookie about servlets, but I should deploy an Eclipse web project running on a Tomcat server (only localhost).
The whole process worked fine on Windows but recently I had to move to Ubuntu 12.04 and I have this problem when I want to access the app:
If I start apache2 and tomcat7 first, the output of
sudo netstat -lpn |grep :80
looks like this:
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 12231/apache2
tcp6 0 0 127.0.0.1:8005 :::* LISTEN 12848/java
tcp6 0 0 :::8080 :::* LISTEN 12848/java
then I try to start the server in eclipse and face this error:
Several ports (8005, 8080) required by cdrserver are already in use. The server may already be running in another process, or a system process may be using the port. To start this server you will need to stop the other process or change the port number(s).
Alright, let's kill these processes (although it seems that Tomcat uses them, since when I stop tomcat, the 2 tcp6 processes disappear).
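Before starting the server from Eclipse, the same conflict can be spotted by probing the ports it wants. This is a small Python sketch with a hypothetical function name; it only checks whether something accepts connections on the port, not which process owns it (netstat shows that).

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

With the system tomcat7 service running, `port_in_use(8005)` and `port_in_use(8080)` would both be expected to return True, which is exactly the condition Eclipse complains about.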
Now I'm able to start the Eclipse server, without a single warning:
Nov 27, 2013 10:59:24 AM org.apache.coyote.AbstractProtocol init
INFO: Initializing ProtocolHandler ["http-bio-8080"]
Nov 27, 2013 10:59:24 AM org.apache.catalina.startup.Catalina load
INFO: Initialization processed in 869 ms
Nov 27, 2013 10:59:24 AM org.apache.catalina.core.StandardService startInternal
INFO: Starting service Catalina
Nov 27, 2013 10:59:24 AM org.apache.catalina.core.StandardEngine startInternal
INFO: Starting Servlet Engine: Apache Tomcat/7.0.26
Nov 27, 2013 10:59:24 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory /home/aron/workspace/Text_manipulator
Nov 27, 2013 10:59:26 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory /home/aron/workspace/.metadata
Nov 27, 2013 10:59:26 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory /home/aron/workspace/Servers
Nov 27, 2013 10:59:26 AM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler ["http-bio-8080"]
Nov 27, 2013 10:59:26 AM org.apache.catalina.startup.Catalina start
INFO: Server startup in 2101 ms
Now, if I type the usual (like on Windows) URL to the browser: localhost/cdr I get this:
Not Found
The requested URL /cdr was not found on this server.
Apache/2.2.22 (Ubuntu) Server at localhost Port 80
Same happens with localhost/manager (I got this tip lately).
Moreover, using localhost:8080/cdr results in a totally blank page.
Here are my Eclipse server settings: http://i.imgur.com/lV6FwTm.png
I also checked the web.xml file in the project, it has the following servlet classes and related mappings:
Faces Servlet
Trinidad Resource Servlet
Resources Servlet
Spring MVC Dispatcher Servlet
Am I missing something obvious?
The requested URL /cdr was not found on this server.
Apache/2.2.22 (Ubuntu) Server at localhost Port 80
First, your URL should be something like localhost:8080, not localhost, since the latter defaults to localhost:80. You have apache2 running on port 80 on your system, which is why you get the 404 message from the Apache server.
Second, I am concerned about the resources being deployed, as shown by your logs:
Nov 27, 2013 10:59:24 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory /home/aron/workspace/Text_manipulator
Nov 27, 2013 10:59:26 AM org.apache.catalina.startup.HostConfig deployDirectory
**INFO: Deploying web application directory /home/aron/workspace/.metadata** --> ?
Nov 27, 2013 10:59:26 AM org.apache.catalina.startup.HostConfig deployDirectory
**INFO: Deploying web application directory /home/aron/workspace/Servers** --> ?
Nov 27, 2013 10:59:26 AM org.apache.coyote.AbstractProtocol start
What is .metadata? Isn't this one of the hidden folders Eclipse creates to manage the workspace? This shouldn't be deployed.
Also, what is Servers? This looks like the server project from Eclipse.
I don't see any resource named cdr being deployed in the logs. So first you need to verify that this particular resource is deployed at all. Second, I would advise you to do some reading on how Tomcat works here.
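The first point, that a URL without an explicit port goes to port 80, can be illustrated with a few lines of Python. This is only a sketch of the default-port rule, with a hypothetical function name; it does not contact any server.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the TCP port a browser would use for url."""
    # Browsers assume http:// when no scheme is typed, so do the same here.
    parts = urlsplit(url if "://" in url else "http://" + url)
    return parts.port or DEFAULT_PORTS[parts.scheme]
```

So `effective_port("localhost/cdr")` gives 80, where apache2 is listening, while `effective_port("localhost:8080/cdr")` gives 8080, where Tomcat is listening.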