GKE config connector issue - Post i/o timeout - kubernetes

I am running into the error below when creating a compute IP.
Config connector is already enabled, and it is a private cluster hosted on a shared network.
Version 1.17.15-gke.800
$ kubectl apply -f webapp-compute-ip.yaml
Error from server (InternalError): error when creating "webapp-compute-ip.yaml": Internal error occurred: failed calling webhook "annotation-defaulter.cnrm.cloud.google.com": Post https://cnrm-validating-webhook.cnrm-system.svc:443/annotation-defaulter?timeout=30s: dial tcp 192.168.66.130:9443: i/o timeout
$ cat webapp-compute-ip.yaml
apiVersion: compute.cnrm.cloud.google.com/v1beta1
kind: ComputeAddress
metadata:
  name: webapp-ip-test
  namespace: sandbox
  labels:
    app: webapp
    environment: test
  annotations:
    cnrm.cloud.google.com/project-id: "cluster-name"
spec:
  location: global

This problem was due to a Config Connector version issue: there was a change in the webhook default port, from 443 to 9443.
The Config Connector version depends on the GKE version and I did not have any control over it; moreover, there is no public documentation on which Config Connector version ships with which GKE version. There is an existing request for that here.
The solution for me was to add port 9443 to the firewall rule that allows the control plane to reach the nodes.
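For reference, a minimal sketch of that firewall change with gcloud; the rule name, network, master CIDR, and node tag below are placeholders to replace with the cluster's actual values:
# Allow the GKE control plane to reach the Config Connector webhook port on the nodes.
gcloud compute firewall-rules create allow-master-to-cnrm-webhook \
  --network=shared-vpc-network \
  --direction=INGRESS \
  --allow=tcp:9443 \
  --source-ranges=172.16.0.0/28 \
  --target-tags=gke-cluster-node-tag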

Related

HTTPRoute set a timeout

I am trying to set up a multi-cluster architecture. I have a Spring Boot API that I want to run on a second cluster (for isolation purposes). I have set that up using the gateway.networking.k8s.io API. I am using a Gateway that has an SSL certificate and matches an IP address that's registered to my domain in the DNS registry. I am then setting up an HTTPRoute for each service that I am running on the second cluster. That works fine and I can communicate between our clusters and everything works as intended but there is a problem:
There is a timeout of 30s by default and I cannot change it. I want to increase it, as the application in the second cluster is a WebSocket server and I obviously would like our WebSocket connections to stay open for more than 30s at a time. I can see that the backend service created from our HTTPRoute has a timeout specified as 30s, and I found a command to increase it:
gcloud compute backend-services update gkemcg1-namespace-store-west-1-8080-o1v5o5p1285j --timeout=86400
When I run that command the timeout is increased and the WebSocket connection is kept alive, but after a few minutes the change gets overridden (I suspect because the backend service is managed from the YAML). This is the YAML for the HTTPRoute behind that backend service:
kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
  name: public-store-route
  namespace: namespace
  labels:
    gateway: external-http
spec:
  hostnames:
  - "my-website.example.org"
  parentRefs:
  - name: external-http
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /west
    backendRefs:
    - group: net.gke.io
      kind: ServiceImport
      name: store-west-1
      port: 8080
I have tried adding timeout, timeoutSec, and timeoutSeconds at every level, with no success. I always get the following error:
error: error validating "public-store-route.yaml": error validating data: ValidationError(HTTPRoute.spec.rules[0].backendRefs[0]): unknown field "timeout" in io.k8s.networking.gateway.v1beta1.HTTPRoute.spec.rules.backendRefs; if you choose to ignore these errors, turn validation off with --validate=false
Surely there must be a way to configure this. But I wasn't able to find anything in the documentation referring to a timeout. Am I missing something here?
How do I configure the timeout?
Edit:
I have found this resource: https://cloud.google.com/kubernetes-engine/docs/how-to/configure-gateway-resources
I have been trying to set up an LBPolicy and attach it to the Gateway, HTTPRoute, Service, or ServiceImport, but nothing has made a difference. Am I doing something wrong, or is this not working the way it is supposed to? This is my YAML:
kind: LBPolicy
apiVersion: networking.gke.io/v1
metadata:
  name: store-timeout-policy
  namespace: sandstone-test
spec:
  default:
    timeoutSec: 50
  targetRef:
    name: public-store-route
    group: gateway.networking.k8s.io
    kind: HTTPRoute
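Since the Edit mentions also trying the ServiceImport as a target, here is that permutation spelled out with the names from the question, purely as an illustration and not a confirmed fix:
kind: LBPolicy
apiVersion: networking.gke.io/v1
metadata:
  name: store-timeout-policy
  namespace: sandstone-test
spec:
  default:
    timeoutSec: 86400   # long enough for the WebSocket use case
  targetRef:
    group: net.gke.io
    kind: ServiceImport
    name: store-west-1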

Webhook failing in rabbitmq

I have installed a RabbitMQ cluster using the RabbitMQ cluster operator. I have also added the RabbitMQ topology operator. I am trying to create queues with the topology operator using the following YAML file:
apiVersion: rabbitmq.com/v1beta1
kind: Queue
metadata:
  name: software-results
  namespace: rabbitmq-system
spec:
  name: software-results # name of the queue
  type: quorum # without providing a queue type, rabbitmq creates a classic queue
  autoDelete: false
  durable: true # setting 'durable' to false means this queue won't survive a server restart
  rabbitmqClusterReference:
    name: client-queues
I am getting this error:
Error from server (InternalError): error when creating "singleQueue.yml": Internal error occurred: failed calling webhook "vqueue.kb.io": failed to call webhook: Post "https://webhook-service.rabbitmq-system.svc:443/validate-rabbitmq-com-v1beta1-queue?timeout=10s": dial tcp 10.97.65.156:443: connect: connection refused
I tried to search for this but didn't find much. Can anyone help me understand what exactly is going wrong here?
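Since "connection refused" means nothing is listening behind the webhook Service named in the error, a reasonable first check is whether the topology operator Pod is running and whether webhook-service has endpoints in rabbitmq-system; the deployment name below is the operator's usual default and may differ in your installation:
kubectl get pods -n rabbitmq-system                        # is the topology operator Pod Running?
kubectl get endpoints webhook-service -n rabbitmq-system   # does the webhook Service have a backing Pod?
kubectl logs -n rabbitmq-system deploy/messaging-topology-operator   # assumed deployment name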

Kubernetes access Service in other namespace via http request

I've got an InfluxDB database service in the default namespace.
Its service is called influxdb and works fine with Chronograf to visualize the data.
Now I'd like to connect another deployment, from the namespace test, to this service. It's a Python application. The normal Python InfluxDB library uses Requests to connect to the db.
Architecture Overview
Istio is also installed.
Namespace: default
Influxdb Deployment
Influxdb Service
Chronograf Deployment (visualise influxdb)
Chronograf Service to Ingress (for external web access)
Namespace: test
Python App which should connect to influxdb for processing etc.
Influxdb Service (which points to influxdb.default.svc.cluster.local)
Therefore I created a Service in the namespace test which points to the influxdb Service in the default namespace.
apiVersion: v1
kind: Service
metadata:
  name: influxdb
  labels:
    app: pythonapp
  namespace: test
spec:
  type: ExternalName
  externalName: influxdb.default.svc.cluster.local
  ports:
  - port: 8086
    name: http
  - port: 8088
    name: http-flux
And now I deployed the Python app, which points to the influxdb Service, but it keeps getting an HTTP connection error.
2020-07-03 13:02:05 - db.meterdb [meterdb.__init__:57] - ERROR - Oops, something wen't wrong during init of db. message: HTTPConnectionPool(host='influxdb', port=8086): Max retries exceeded with url: /query?q=SHOW+DATABASES (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f6863ed6310>: Failed to establish a new connection: [Errno 111] Connection refused'))
2020-07-03 13:02:05 - db.meterdb [meterdb.check_connection:113] - ERROR - can't reach db server... message: HTTPConnectionPool(host='influxdb', port=8086): Max retries exceeded with url: /ping (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f6863e57650>: Failed to establish a new connection: [Errno 111] Connection refused'))
When I visualise the traffic with Kiali, I see that the Python app tries to connect to the influxdb service, but the HTTP traffic shows up as going to an unknown service.
I don't know how to get it to work with the created influxdb Service.
Connection settings for python influxdb Client library. Link to python influxdb lib
host=influxdb
port=8086
Traffic from Kiali
How can I route the traffic to the right service?
It seems to me that it routes the traffic to an unknown service because it's HTTP and not TCP.
You don't need this extra Service at all:
kind: Service
metadata:
  name: influxdb
  labels:
    app: pythonapp
  namespace: test
Just access the service directly in your Python request (Requests needs the scheme):
requests.get('http://influxdb.default.svc.cluster.local:8086')
And this can be made more configurable:
# Kubernetes deployment: pass the DB URL to the app via an env var
containers:
- name: pythonapp
  env:
  - name: DB_URL
    value: http://influxdb.default.svc.cluster.local:8086
# python
import os
import requests

DB = os.environ['DB_URL']
requests.get(DB)
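A quick way to verify the cross-namespace DNS path from inside the cluster, assuming the deployment is called pythonapp as in the snippet above (InfluxDB 1.x answers /ping with HTTP 204 when reachable):
kubectl -n test exec deploy/pythonapp -- \
  python -c "import requests; print(requests.get('http://influxdb.default.svc.cluster.local:8086/ping').status_code)"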

Access SQL Server database from Kubernetes Pod

My Spring Boot application, deployed to a Kubernetes Pod, is trying to connect to an external SQL Server database, but every time it fails with this error:
Failed to initialize pool: The TCP/IP connection to the host <>, port 1443 has failed.
Error: "Connection timed out: no further information.
Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.
I have exec'd into the Pod and can successfully ping the DB server without any issues.
Below are the solutions I have tried:
Created a Service and Endpoints object with the DB IP, provided that in the configuration file, and tried to bring up the application in the Pod
Tried using the internal IP from the Endpoints object instead of the DB IP in the configuration, to see whether the internal IP resolves to the DB IP
Both cases gave the same result. Below is the YAML I am using to create the Service and Endpoints.
---
apiVersion: v1
kind: Service
metadata:
  name: mssql
  namespace: cattle
spec:
  type: ClusterIP
  ports:
  - port: 1433
---
apiVersion: v1
kind: Endpoints
metadata:
  name: mssql
  namespace: cattle
subsets:
- addresses:
  - ip: <<DB IP>>
  ports:
  - port: 1433
Please let me know if I am wrong or missing something in this setup.
Additional information about the K8s setup:
It is a clustered master with an external etcd cluster topology
The OS on the nodes is CentOS
I am able to ping the server from all nodes and from the pods that are created
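One quick check worth doing here, since ping only exercises ICMP and says nothing about TCP port 1433: test the TCP path from inside the Pod. The pod name and DB IP below are placeholders.
# Succeeds only if TCP 1433 on the DB server is reachable from the Pod.
kubectl exec -it <app-pod-name> -- bash -c 'timeout 5 bash -c "</dev/tcp/<DB IP>/1433" && echo reachable || echo blocked'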
For this scenario an ExternalName Service is very useful. It redirects traffic to that address without you having to define an Endpoints object.
kind: "Service"
apiVersion: "v1"
metadata:
namespace: "your-namespace"
name: "ftp"
spec:
type: ExternalName
externalName: your-ip
The issue was resolved by updating the deployment YAML with the IP address. Since all the servers were in the same subnet, I did not need to create a Service or Endpoints object to access the DB. Thank you for all the inputs on the post.
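For readers wondering what "updating the deployment YAML with the IP address" can look like for a Spring Boot app, here is a minimal sketch; the container name, example IP, and database name are placeholders, not values from the original post:
# Excerpt of the application Deployment: pass the SQL Server address to Spring Boot via an env var.
containers:
- name: springboot-app
  env:
  - name: SPRING_DATASOURCE_URL   # Spring Boot maps this to spring.datasource.url
    value: jdbc:sqlserver://10.10.0.25:1433;databaseName=mydb;encrypt=false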

Kubernetes Jenkins plugin - Jenkins doesn’t have label mypod

I'm trying to do Jenkins CI/CD on Kubernetes with dynamic slaves. My Jenkins version is the official image 2.60.2, and the kubernetes-plugin version is 1.0. After adding a cloud with Kubernetes, the slave can't come up. It shows:
pending—Jenkins doesn’t have label mypod
I referred to
Kubernetes Jenkins plugin - slaves always offline
to configure the Jenkins system. I found the issue described as a defect, but I don't know whether that fix has made it into the latest Jenkins images. Here is the link: https://github.com/jenkinsci/kubernetes-plugin/pull/127
Next error:
Jenkins doesn’t have label mypod
Could this be because of 400d1ed? KubernetesDeclarativeAgentScript.groovy probably needs to get an update then.
Does anyone know how to fix this issue?
The keyword is (as always): look at the logs! You should see your errors when issuing
kubectl logs $JENKINS_POD_NAME
Also, you can try the command below. It will list your failed slaves; look at the logs for those too:
kubectl get pods -a
Your issue is related to JNLP communication, slave -> master.
My Jenkins is running in a container and I had to expose the JNLP port on the cluster node (NodePort).
apiVersion: v1
kind: Service
metadata:
  name: jenkins
  labels:
    app: jenkins
spec:
  ports:
  - name: jnlp
    port: 40294
    targetPort: 40294
  - name: http
    port: 80
    targetPort: 8080
  selector:
    app: jenkins
    tier: jenkins
  type: NodePort
Also, in Jenkins security, look for JNLP and enable ALL protocols.
I am still playing with fixed vs. random ports. I'm not sure how to expose a random port from a k8s Service; port ranges are not supported in k8s.
But I am able to fire off a slave and do some work!
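A minimal sketch of one way to keep the port fixed, assuming the official Jenkins image (which honors the JENKINS_SLAVE_AGENT_PORT environment variable; the same value can be set under Configure Global Security as the fixed TCP port for JNLP agents). The Deployment excerpt below is illustrative and should be adapted to your manifest:
# Excerpt of the Jenkins master Deployment: pin the JNLP agent port so it
# always matches the fixed port 40294 exposed by the Service above.
containers:
- name: jenkins
  image: jenkins:2.60.2
  env:
  - name: JENKINS_SLAVE_AGENT_PORT
    value: "40294"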