ThingsBoard on ECS - docker-compose

I'm trying to deploy ThingsBoard on ECS using the Docker Compose ECS integration: I set up an external database and wrote the following Docker Compose file:
version: '3.8'
services:
  thingsboard:
    container_name: thingsboard
    image: thingsboard/tb-postgres
    restart: always
    environment:
      - SPRING_DATASOURCE_URL=jdbc:postgresql://<HOST>:<PORT>/thingsboard?sslmode=require
      - SPRING_DATASOURCE_USERNAME=<USERNAME>
      - SPRING_DATASOURCE_PASSWORD=<PASSWORD>
    ports:
      - '9090:9090'
      - '1883:1883'
      - '5683:5683/udp'
Then I launch the stack on ECS using docker compose up.
The cluster is correctly created and I can see from CloudWatch logs that the ThingsBoard container starts correctly (even if it is very slow).
After a while, though, ECS deregisters the tasks due to health check failures.
The ECS Event logs say:
service thingsboard-ThingsboardService-XXX (port 9090) is unhealthy in target-group thing-Thing-XXX due to (reason Health checks failed).
service thingsboard-ThingsboardService-XXX (port 1883) is unhealthy in target-group thing-Thing-XXX due to (reason Health checks failed).
service thingsboard-ThingsboardService-XXX (port 5683) is unhealthy in target-group thing-Thing-XXX due to (reason Health checks failed).
By modifying the health check configuration, I am at least able to log into ThingsBoard and check that everything works correctly. After a few minutes, though, the health check failure repeats and the tasks are stopped.
Why is that?

Not all of those ports speak HTTP: 9090 is ThingsBoard's web UI port (the one you log in through), but 1883 is MQTT and 5683 is CoAP. Most likely, the generated load-balancer health checks are not expecting those protocols. I would recommend disabling the checks on the non-HTTP ports and health-checking the HTTP(S) port only.
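If you stay with the Docker Compose ECS integration, one way to apply that advice is an x-aws-cloudformation overlay that patches the generated target group so the load balancer checks an HTTP endpoint instead of the raw port. This is only a sketch: the logical resource name below is an assumption (the CLI generates it; inspect the CloudFormation template it produces, e.g. with docker compose convert, to find the real one), and /login is just one ThingsBoard path that should return 200.
x-aws-cloudformation:
  Resources:
    Thingsboard9090TargetGroup:     # assumed logical name - check the generated template
      Properties:
        HealthCheckProtocol: HTTP
        HealthCheckPath: /login     # assumed path; any endpoint returning 200 works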

Related

Liveness-Probe of one pod via another

On my Kubernetes setup, I have 2 pods - A (via a Deployment) and B (via a DaemonSet).
Pod B depends on Pod A being fully started. I would now like to set an HTTP liveness probe in Pod B, to restart Pod B if the health check against Pod A fails. Restarting works fine if I put the external IP of Pod A's service in the host field. The issue is in resolving the service's DNS name in the host field.
It works if I set it like this:
livenessProbe:
  httpGet:
    host: <POD_A_SERVICE_EXTERNAL_IP_HERE>
    path: /health
    port: 8000
It fails if I set it like this:
livenessProbe:
  httpGet:
    host: auth
    path: /health
    port: 8000
It fails with the following error message:
Liveness probe failed: Get http://auth:8000/health: dial tcp: lookup auth on 8.8.8.8:53: no such host
ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
Is the following line on the above page true for HTTP Probes as well?
"you can not use a service name in the host parameter since the kubelet is unable to resolve it."
Correct 👍, DNS doesn't work for liveness probes: the kubelet runs the probe from the node's network namespace, which basically cannot resolve any in-cluster DNS names.
You can consider putting both of your services in a single pod as sidecars. This way they share the same network namespace, and if one container fails then the whole pod is restarted.
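A minimal sketch of that sidecar idea, with placeholder image names: because both containers share the pod's network namespace, container B's probe can target port 8000 without a host field and it will hit container A's /health endpoint.
apiVersion: v1
kind: Pod
metadata:
  name: a-and-b
spec:
  containers:
    - name: a
      image: service-a:latest        # placeholder image
      ports:
        - containerPort: 8000
    - name: b
      image: service-b:latest        # placeholder image
      livenessProbe:
        httpGet:
          path: /health
          port: 8000                 # no host: defaults to the pod IP, i.e. container a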
Another option is to create an operator 🔧 for your pods/application and basically have it check the liveness through the in-cluster DNS for both pods separately and restart the pods through the Kubernetes API.
You can also create your own script in a pod that calls curl to check for a 200 OK and kubectl to restart your pod if you get something else (a rough sketch follows below).
Note that for the two options above you need to make sure that CoreDNS is stable and solid, otherwise failing health checks could cause downtime for your services.
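A rough sketch of the script option, assuming a batch/v1 CronJob (Kubernetes 1.21+), a service account with permission to delete pods, an image that contains both curl and kubectl (the image name is a placeholder), and the auth service and an app=b label taken from the question:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: auth-watchdog
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: auth-watchdog        # needs RBAC to delete pods
          restartPolicy: Never
          containers:
            - name: check
              image: curl-and-kubectl:latest       # placeholder: any image with curl and kubectl
              command:
                - sh
                - -c
                - |
                  # probe Pod A's service and restart Pod B if it is not healthy
                  code=$(curl -s -o /dev/null -w '%{http_code}' http://auth:8000/health)
                  if [ "$code" != "200" ]; then
                    kubectl delete pod -l app=b    # hypothetical label selecting Pod B
                  fi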
✌️☮️

Kubernetes Pod serial communication Problem

When a serial port is mapped into a Docker container on a Linux host, this is done with the '--device' flag;
e.g. docker run -dit --device=/dev/ttyUSB0 --name SerialTest <docker image>
We would like to know how Pods can be mapped to serial ports in Kubernetes. The figure below shows the Pod configuration for the application to be deployed in Rancher 2.x.
(https://i.imgur.com/RHhlD4S.png)
In node scheduling, we have configured the pods to be distributed to specific nodes with serial ports. Also, it is of course not possible to map the serial port with a volume mount. So I would like to raise a question, because I couldn't find anything related to Docker's '--device' flag in my Rancher 2.x configuration.
(https://imgur.com/wRe7Eds.png) "Application configuration in Rancher 2.x"
(https://imgur.com/Lwil7cz.png) "Serial port device connected to the HOST PC"
(https://imgur.com/oWeW0LZ.png) "Volume Mount Status of Containers in Deployed Pods"
(https://imgur.com/GKahqY0.png) "Error log when running a .NET application that uses a serial port"
Based on the goal of the first diagram: the Kubernetes abstractions covering communication between the pod and the outside world (that is, outside of the node) are meant to handle at least layer 2 communication (veth, as in inter-node/pod communication).
It is not detailed why it is not possible to map the device as a volume in the pod, so I'm wondering if you have tried using privileged containers as in this reference:
containers:
  - name: acm
    securityContext:
      privileged: true
    volumeMounts:
      - mountPath: /dev/ttyACM0
        name: ttyacm
volumes:
  - name: ttyacm
    hostPath:
      path: /dev/ttyACM0
It is possible for Rancher to start containers in privileged mode.
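Putting that together with the node scheduling mentioned in the question, a complete Pod sketch might look like the following; the serialport=true node label is an assumption, and /dev/ttyUSB0 is taken from the question's docker run example.
apiVersion: v1
kind: Pod
metadata:
  name: serialtest
spec:
  nodeSelector:
    serialport: "true"          # assumed label on nodes with the USB device attached
  containers:
    - name: serialtest
      image: <docker image>
      securityContext:
        privileged: true        # required so the container can open the host device
      volumeMounts:
        - mountPath: /dev/ttyUSB0
          name: ttyusb
  volumes:
    - name: ttyusb
      hostPath:
        path: /dev/ttyUSB0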

Readiness-Probe another Service on boot-up of Pod

On my Kubernetes setup, I have 2 services - A and B.
Service B depends on Service A being fully started.
I would now like to set a TCP readiness probe in the Pods of Service B, so they test whether any Pod of Service A is fully operating.
The readinessProbe section of the deployment of Service B looks like:
readinessProbe:
  tcpSocket:
    host: serviceA.mynamespace.svc.cluster.local
    port: 1101 # same port as Service A's readiness check
I can apply these changes, but the Readiness Probe fails with:
Readiness probe failed: dial tcp: lookup serviceB.mynamespace.svc.cluster.local: no such host
I use the same hostname in other places (e.g. I pass it as an ENV to the container) and it works and gets resolved.
Does anyone have an idea how to get the readiness probe working against another service, or how to do some other kind of dependency checking between services?
Thanks :)
Because readiness and liveness probes are fully managed by the kubelet node agent, and the kubelet inherits its DNS configuration from the particular node, you are not able to resolve the cluster's internal DNS records:
For a probe, the kubelet makes the probe connection at the node, not in the pod, which means that you can not use a service name in the host parameter since the kubelet is unable to resolve it.
You can consider a scenario where your source Pod A uses the node's IP address by setting the hostNetwork: true parameter; the kubelet can then reach it and the readiness probe from within Pod B succeeds, as described in the official k8s documentation:
tcpSocket:
  host: Node Hostname or IP address where Pod A is residing
  port: 1101
However, I've found a Stack Overflow thread with a more elegant solution that achieves the same result through init containers.
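For reference, a minimal sketch of the init-container approach, assuming a busybox image (whose nc is assumed to support -z) and reusing the serviceA hostname and port from the question; the main container only starts once the TCP check succeeds:
spec:
  initContainers:
    - name: wait-for-service-a
      image: busybox:1.36
      command:
        - sh
        - -c
        - until nc -z serviceA.mynamespace.svc.cluster.local 1101; do echo waiting; sleep 2; done
  containers:
    - name: service-b
      image: service-b:latest   # placeholder image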
In addition to Nick_Kh's answer, another workaround is to use an exec probe, with a command that is executed in the container.
To perform a probe, the kubelet executes the command cat /tmp/healthy in the target container. If the command succeeds, it returns 0, and the kubelet considers the container to be alive and healthy.
An example:
readinessProbe:
  exec:
    command:
      - sh
      - -c
      - wget -T2 -O- http://service

Istio on minikube - envoy missing listener for inbound application port: 9095

I'm following this Istio tutorial (part 3). After creating the minikube local registry, I need to run the following command:
kubectl run hellodemo --image=hellodemo:v1 --port=9095 --image-pull-policy=IfNotPresent
This should run the image and the istio-proxy sidecar in the Pod.
When I run kubectl get pods, I get:
NAME                         READY   STATUS    RESTARTS   AGE
hellodemo-6d49fc6c51-adsa1   1/2     Running   0          1h
When I run kubectl logs hellodemo-6d49fc6c51-adsa1 istio-proxy:
* failed checking application ports. listeners="0.0.0.0:15090","10.110.201.202:16686","10.96.0.1:443","10.104.103.28:15443","10.104.103.28:15031","10.101.128.212:14268","10.104.103.28:15030","10.111.177.172:443","10.104.103.28:443","10.109.4.23:80","10.111.177.172:15443","10.104.103.28:15020","10.104.103.28:15032","10.105.175.151:15011","10.101.128.212:14267","10.96.0.10:53","10.104.103.28:31400","10.104.103.28:15029","10.98.84.0:443","10.99.194.141:443","10.99.175.237:42422","0.0.0.0:9411","0.0.0.0:3000","0.0.0.0:15010","0.0.0.0:15004","0.0.0.0:8060","0.0.0.0:9901","0.0.0.0:20001","0.0.0.0:8080","0.0.0.0:9091","0.0.0.0:80","0.0.0.0:15014","0.0.0.0:9090","172.17.0.6:15020","0.0.0.0:15001"
* envoy missing listener for inbound application port: 9095
2019-05-02T16:24:28.709972Z info Envoy proxy is NOT ready: 2 errors occurred:
* failed checking application ports. listeners="0.0.0.0:15090","10.110.201.202:16686","10.96.0.1:443","10.104.103.28:15443","10.104.103.28:15031","10.101.128.212:14268","10.104.103.28:15030","10.111.177.172:443","10.104.103.28:443","10.109.4.23:80","10.111.177.172:15443","10.104.103.28:15020","10.104.103.28:15032","10.105.175.151:15011","10.101.128.212:14267","10.96.0.10:53","10.104.103.28:31400","10.104.103.28:15029","10.98.84.0:443","10.99.194.141:443","10.99.175.237:42422","0.0.0.0:9411","0.0.0.0:3000","0.0.0.0:15010","0.0.0.0:15004","0.0.0.0:8060","0.0.0.0:9901","0.0.0.0:20001","0.0.0.0:8080","0.0.0.0:9091","0.0.0.0:80","0.0.0.0:15014","0.0.0.0:9090","172.17.0.6:15020","0.0.0.0:15001"
* envoy missing listener for inbound application port: 9095
2019-05-02T16:24:30.729987Z info Envoy proxy is NOT ready: 2 errors occurred:
* failed checking application ports. listeners="0.0.0.0:15090","10.110.201.202:16686","10.96.0.1:443","10.104.103.28:15443","10.104.103.28:15031","10.101.128.212:14268","10.104.103.28:15030","10.111.177.172:443","10.104.103.28:443","10.109.4.23:80","10.111.177.172:15443","10.104.103.28:15020","10.104.103.28:15032","10.105.175.151:15011","10.101.128.212:14267","10.96.0.10:53","10.104.103.28:31400","10.104.103.28:15029","10.98.84.0:443","10.99.194.141:443","10.99.175.237:42422","0.0.0.0:9411","0.0.0.0:3000","0.0.0.0:15010","0.0.0.0:15004","0.0.0.0:8060","0.0.0.0:9901","0.0.0.0:20001","0.0.0.0:8080","0.0.0.0:9091","0.0.0.0:80","0.0.0.0:15014","0.0.0.0:9090","172.17.0.6:15020","0.0.0.0:15001"
* envoy missing listener for inbound application port: 9095
Do you know what the problem is that prevents the istio-proxy container from coming up?
I use istio-1.1.4 on minikube.
I was also having the same problem. I followed the documentation, which said to enable SDS in the gateway. However, I enabled it both in the gateway and at the global scope, and this caused the error above.
I removed the following code from my values.yml file and everything worked:
global:
  sds:
    enabled: true
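If the tutorial only needs SDS at the ingress gateway, the corresponding block to keep (based on the Istio 1.1 Helm chart layout, shown here as a sketch rather than a verified config) would look roughly like:
gateways:
  istio-ingressgateway:
    sds:
      enabled: true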

Kubernetes - Can't connect to a service IP from the service's pod

I'm trying to create 3 instances of Kafka and deploy them to a local Kubernetes setup. Because each instance needs some specific configuration, I'm creating one RC and one Service for each - eagerly waiting for #18016 ;)
However, I'm having problems because Kafka can't establish a network connection to itself when it uses the service IP (a Kafka broker tries to do this when it is exchanging replication messages with other brokers). For example, let's say I have two worker hosts (172.17.8.201 and 172.17.8.202) and my pods are scheduled like this:
Host 1 (172.17.8.201)
kafka1 pod (10.2.16.1)
Host 2 (172.17.8.202)
kafka2 pod (10.2.68.1)
kafka3 pod (10.2.68.2)
In addition, let's say I have the following service IPs:
kafka1 cluster IP: 11.1.2.96
kafka2 cluster IP: 11.1.2.120
kafka3 cluster IP: 11.1.2.123
The problem happens when the kafka1 pod (container) tries to send a message (to itself) using the kafka1 cluster IP (11.1.2.96). For some reason, the connection cannot be established and the message is not sent.
Some more information: if I manually connect to the kafka1 pod, I can correctly telnet to the kafka2 and kafka3 pods using their respective cluster IPs (11.1.2.120 / 11.1.2.123). Also, from the kafka2 pod, I can connect to both the kafka1 and kafka3 pods using 11.1.2.96 and 11.1.2.123. Finally, I can connect to all pods (from all pods) if I use the pod IPs.
It is important to emphasize that I shouldn't tell the kafka brokers to use the pod IPs instead of the cluster IPs for replication. As it is right now, Kafka uses whatever IP you configure to be "advertised" for replication - which is the IP that your clients use to connect to the brokers. Even if I could, I believe this problem may appear with other software as well.
This problem seems to happen only with the combination I am using, because the exact same files work correctly in GCE. Right now, I'm running:
Kubernetes 1.1.2
coreos 928.0.0
network setup with flannel
everything on Vagrant + VirtualBox
After some debugging, I'm not sure if the problem is in the workers' iptables rules, in kube-proxy, or in flannel.
PS: I originally posted this question as an issue on their GitHub, but I was redirected here by the Kubernetes team. I reworded the text a bit because it sounded like a "support request", but I actually believe it is some sort of bug. Anyway, sorry about that, Kubernetes team!
Edit: This problem has been confirmed as a bug https://github.com/kubernetes/kubernetes/issues/20391
For what you want to do, you should be using a Headless Service:
http://kubernetes.io/v1.0/docs/user-guide/services.html#headless-services
This means setting
clusterIP: None
in your Service.
That means there won't be a virtual IP associated with the service; instead, DNS returns all the IPs of the Pods selected by the selector.
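A sketch of what that looks like for one of the brokers, assuming an app: kafka1 pod label and the standard Kafka broker port (both are assumptions, since the question doesn't show the Service manifests):
apiVersion: v1
kind: Service
metadata:
  name: kafka1
spec:
  clusterIP: None        # headless: DNS returns the pod IPs directly, no virtual IP
  selector:
    app: kafka1          # assumed label on the kafka1 pod
  ports:
    - port: 9092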
Update: The bug is fixed in v1.2.4
You can try container lifecycle hooks.
containers:
  - name: kafka
    image: kafka   # placeholder image name
    lifecycle:
      postStart:
        exec:
          command:
            - "some.sh" # a shell script that gets this pod's IP and notifies the other Kafka members: "add me into your cluster"
      preStop:
        exec:
          command:
            - "some.sh" # a shell script that gets the other Kafka pods' IPs and notifies the other Kafka members: "delete me from your cluster"
I have a similar problem running 3 MongoDB pods as a cluster, but the pods cannot access themselves through their services' IPs.
In addition, has the bug been fixed?