Kubernetes MLflow Service Pod Connection - kubernetes

I have deployed a build of mlflow to a pod in my kubernetes cluster. I'm able to port forward to the mlflow ui, and now I'm attempting to test it. To do this, I am running the following test on a jupyter notebook that is running on another pod in the same cluster.
import mlflow
print("Setting Tracking Server")
tracking_uri = "http://mlflow-tracking-server.default.svc.cluster.local:5000"
mlflow.set_tracking_uri(tracking_uri)
print("Logging Artifact")
mlflow.log_artifact('/home/test/mlflow-example-artifact.png')
print("DONE")
When I run this though, I get
ConnectionError: HTTPConnectionPool(host='mlflow-tracking-server.default.svc.cluster.local', port=5000): Max retries exceeded with url: /api/2.0/mlflow/runs/get? (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object>: Failed to establish a new connection: [Errno 111] Connection refused'))
The way I have deployed the mlflow pod is shown below in the yaml and docker:
Yaml:
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: mlflow-tracking-server
namespace: default
spec:
selector:
matchLabels:
app: mlflow-tracking-server
replicas: 1
template:
metadata:
labels:
app: mlflow-tracking-server
spec:
containers:
- name: mlflow-tracking-server
image: <ECR_IMAGE>
ports:
- containerPort: 5000
env:
- name: AWS_MLFLOW_BUCKET
value: <S3_BUCKET>
- name: AWS_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
name: aws-secret
key: AWS_ACCESS_KEY_ID
- name: AWS_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
name: aws-secret
key: AWS_SECRET_ACCESS_KEY
---
apiVersion: v1
kind: Service
metadata:
name: mlflow-tracking-server
namespace: default
labels:
app: mlflow-tracking-server
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
externalTrafficPolicy: Local
type: LoadBalancer
selector:
app: mlflow-tracking-server
ports:
- name: http
port: 5000
targetPort: http
While the dockerfile calls a script that executes the mlflow server command: mlflow server --default-artifact-root ${AWS_MLFLOW_BUCKET} --host 0.0.0.0 --port 5000, I cannot connect to the service I have created using that mlflow pod.
I have tried using the tracking uri http://mlflow-tracking-server.default.svc.cluster.local:5000, I've tried using the service EXTERNAL-IP:5000, but everything I tried cannot connect and log using the service. Is there anything that I have missed in deploying my mlflow server pod to my kubernetes cluster?

Your mlflow-tracking-server service should have ClusterIP type, not LoadBalancer.
Both pods are inside the same Kubernetes cluster, therefore, there is no reason to use LoadBalancer Service type.
For some parts of your application (for example, frontends) you may want to expose a Service onto an external IP address, that’s outside of your cluster.
Kubernetes ServiceTypes allow you to specify what kind of Service you want. The default is ClusterIP.
Type values and their behaviors are:
ClusterIP: Exposes the Service on a cluster-internal IP. Choosing this
value makes the Service only reachable from within the cluster. This
is the default ServiceType.
NodePort: Exposes the Service on each Node’s IP at a static port (the NodePort). A > ClusterIP Service, to which the NodePort Service routes, is automatically created. You’ll > be able to contact the NodePort Service, from outside the cluster, by
requesting :.
LoadBalancer: Exposes the Service
externally using a cloud provider’s load balancer. NodePort and
ClusterIP Services, to which the external load balancer routes, are
automatically created.
ExternalName: Maps the Service to the contents
of the externalName field (e.g. foo.bar.example.com), by returning a
CNAME record with its value. No proxying of any kind is set up.
kubernetes.io

So to oversimplify this, you have no ways to access the mlflow uri from jupyterhub pod. What I would do here is check the proxies for the jupyterhub pod. If you dont have .svc in NO_PROXY you have to add it. A detailed reason is that you are accessing the Internal .svc mlflow url as if it is on open internet. But actually your mlflow uri is only accessible inside the cluster. If adding .svc doesnt work for no proxy doesnt work we can take a deeper look at that. The ways to check the proxies is by taking ‘ kubectl get po $JHPODNAME -n $ JHNamespace -o yaml’

Related

azure AKS internal load balancer not responding requests

I have an AKS cluster, as well as a separate VM. AKS cluster and the VM are in the same VNET (as well as subnet).
I deployed a echo server with the following yaml, I'm able to directly curl the pod with vnet ip from the VM. But when trying that with load balancer, nothing returns. Really not sure what I'm missing. Any help is appreciated.
apiVersion: v1
kind: Service
metadata:
name: echo-server
annotations:
service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
type: LoadBalancer
ports:
- port: 80
protocol: TCP
targetPort: 8080
selector:
app: echo-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: echo-deployment
spec:
replicas: 1
selector:
matchLabels:
app: echo-server
template:
metadata:
labels:
app: echo-server
spec:
containers:
- name: echo-server
image: ealen/echo-server
ports:
- name: http
containerPort: 8080
The following pictures demonstrate the situation
I'm expecting that when curl the vnet ip from load balancer, to receive the same response as I did directly curling the pod ip
Can you check your internal-loadbalancer health probe.
"For Kubernetes 1.24+ the services of type LoadBalancer with appProtocol HTTP/HTTPS will switch to use HTTP/HTTPS as health probe protocol (while before v1.24.0 it uses TCP). And / will be used as the default health probe request path. If your service doesn’t respond 200 for /, please ensure you're setting the service annotation service.beta.kubernetes.io/port_{port}_health-probe_request-path or service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path (applies to all ports) with the correct request path to avoid service breakage."
(ref: https://github.com/Azure/AKS/releases/tag/2022-09-11)
If you are using nginx-ingress controller, try adding the same as mentioned in doc:
(https://learn.microsoft.com/en-us/azure/aks/ingress-basic?tabs=azure-cli#basic-configuration)
helm upgrade ingress-nginx ingress-nginx/ingress-nginx \
--reuse-values \
--namespace <NAMESPACE> \
--set controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-health-probe-request-path"=/healthz
Have you checked whether the pod's IP is correctly mapped as an endpoint to the service? You can check it using,
k describe svc echo-server -n test | grep Endpoints
If not please check label and selectors with your actual deployment (rather the resources put in the description).
If it is correctly mapped, are you sure that the VM you are using (_#tester) is under the correct subnet which should include the iLB IP;10.240.0.226 as well?
Found the solution, the only thing I need to do is to add the following to the Service declaration:
externalTrafficPolicy: 'Local'
Full yaml as below
apiVersion: v1
kind: Service
metadata:
name: echo-server
annotations:
service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
type: LoadBalancer
externalTrafficPolicy: 'Local'
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: echo-server
previously it was set to 'Cluster'.
Just got off with azure support, seems like a specific bug on this (it happens with newer version of the AKS), posting the related link here: https://github.com/kubernetes/ingress-nginx/issues/8501

How do I access Kubernetes pods through a single IP?

I have a set of pods running based on the following fleet:
apiVersion: "agones.dev/v1"
kind: Fleet
metadata:
name: bungee
spec:
replicas: 2
template:
metadata:
labels:
run: bungee
spec:
ports:
- name: default
containerPort: 25565
protocol: TCP
template:
spec:
containers:
- name: bungee
image: a/b:test
I can access these pods outside the cluster with <node-IP>:<port> where the port is random per pod given by Agones.
My goal is to be able to connect to these pods through a single IP, meaning I have to add some sort of load balancer. I tried using this service of type LoadBalancer, but I can't connect to any of the pods with it.
apiVersion: v1
kind: Service
metadata:
name: bungee-svc
spec:
type: LoadBalancer
loadBalancerIP: XXX.XX.XX.XXX
ports:
- port: 25565
protocol: TCP
selector:
run: bungee
externalTrafficPolicy: Local
Is a service like this the wrong approach here, and if so what should I use instead? If it is correct, why is it not working?
Edit: External IP field says pending while checking the service status. I am running Kubernetes on bare-metal.
Edit 2: Attempting to use NodePort as suggested, I see the service has not been given an external IP address. Trying to connect to <node-IP>:<nodePort> does not work. Could it be a problem related to the selector?
LoadBalancer Services could have worked, in clusters that are integrating with the API of the cloud provider hosting your Kubernetes nodes (cloud-controller-manager component). Since this is not your case, you're looking for a NodePort Service.
Something like:
apiVersion: v1
kind: Service
metadata:
name: bungee-svc
spec:
type: NodePort
ports:
- port: 25565
protocol: TCP
selector:
run: bungee
Having created that service, you can check its description - or yaml/json representation:
# kubectl describe svc xxx
Type: NodePort
IP: 10.233.24.89 <- ip within SDN
Port: tcp-8080 8080/TCP <- ports within SDN
TargetPort: 8080/TCP <- port on your container
NodePort: tcp-8080 31655/TCP <- port exposed on your nodes
Endpoints: 10.233.108.232:8080 <- pod:port ...
Session Affinity: None
Now, I know the port 31655 was allocated to my NodePort Service -- ports are unique on your cluster, they are picked within a range, depends on your cluster configuration.
I can connect to my service, accessing any Kubernetes node IP, on the port that was allocated to my NodePort service.
curl http://k8s-worker1.example.com:31655/
As a sidenote: a LoadBalancer Service extends a NodePort Service. While the externalIP won't ever show up, note that your Service was already allocated with its own port, as any NodePort Service - which is meant to receive traffic from whichever LoadBalancer would have been configured on behalf of your cluster, onto the cloud infrastructure it is integrated with.
And ... I have to say I'm not familiar with Agones. When you say "I can access these pods outside the cluster with <node-IP>:<port> where the port is random per pod given by Agones". Are you sure ports are allocated on a per-pod basis, and bound to a given node? Or could it be they're already using a NodePort Service. Give it another look: have you tried connecting that port on other nodes of your cluster?

Kubernetes LoadBalancer Service not loadbalancing requests

I have a simple microservice setup running in a minikube cluster. It is inspired by this example.
My setup includes a simple router microservice that contains a golang webserver. What I want to test now is the loadbalancing when there is more then one pod. But there seems to be no load-balancing whatsoever.
The kubernetes file for the microservices looks like this:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: router
labels:
app: router
tier: router
spec:
replicas: 2
strategy: {}
template:
metadata:
labels:
app: router
tier: router
spec:
containers:
- image: {myregistry}/router
name: router
resources: {}
ports:
- name: target-port
containerPort: 8082
env:
- name: PORT
value: "8082"
status: {}
---
apiVersion: v1
kind: Service
metadata:
name: router
labels:
app: router
tier: router
spec:
type: LoadBalancer
selector:
app: router
tier: router
ports:
- port: 8082
name: http
targetPort: target-port
The skaffold config looks like this:
apiVersion: skaffold/v1beta2
kind: Config
build:
artifacts:
- image: {myregistry}/router
context: src/router/bin
tagPolicy:
gitCommit: {}
local:
push: false
deploy:
kubectl:
manifests:
- ./kubernetes/**.yaml
Kubernetes correctly deploys two pods. The output of kubectl get pods looks like this:
NAME READY STATUS RESTARTS AGE
router-7f75f6f9df-c8mgp 1/1 Running 0 14m
router-7f75f6f9df-k248m 1/1 Running 0 14m
From the skaffold dev log output I can see that every request is routed to the router-7f75f6f9df-c8mgp pod. Even with different browsers all requests end up at the exact same pod.
When I delete this pod there is even a slight downtime of the router microservice even though there is another pod running.
What could be the problem of this behavior?
minikube doesn't 'properly' support the LoadBalancer service type. It used to be commonplace to just use the NodePort or externalIP service type instead, however the official hello-minikube sample now states:
On cloud providers that support load balancers, an external IP address
would be provisioned to access the Service. On Minikube, the
LoadBalancer type makes the Service accessible through the minikube
service command
So effectively you should be able to use your minikube LoadBalancer service with: minikube service router
However there is a neat solution that was developed for bare-metal kubernetes clusters called metallb that may be able to help you test this in a better way on minikube.
You can install and configure it on minikube. E.g.
kubectl apply -f https://raw.githubusercontent.com/google/metallb/v0.8.1/manifests/metallb.yaml
Here are some blog posts where others have explained the setup and use of metallb with minikube for LoadBalancer support:
Blog Post 1
Blog Post 2
Here are the official docs.
Hope that helps!

Kubernetes Load balancer server connection refused:Default 80 port is working

After deploying a spring microservice ,Load balancer in Kubernetes is not connecting to the mentioned port in Google Cloud Platform.
Is there any firewall settings we need to change to connect to the deployed service ?
https://serverfault.com/questions/912734/kubernetes-connection-refused-during-deployment
Most likely this is an issue with your Kubernetes Service and/or Deployment. GKE will automatically provision the firewall rules required for the ports mapped to the Service resource.
Ensure that you have exposed port 80 on your Service and also mapped it to a valid port on your Deployment's Pods
Here is an example of using a Deployment and Service to expose an nginx pod:
deployment.yaml:
apiVersion: apps/v1 # API Version of this Object
kind: Deployment # This Object Type
metadata: # Allows you to specify custom metadata
name: nginx # Specifies the name of this object
spec: # The official specification matching object type schema
selector: # Label selector for pods
matchLabels: # Must match these label(s)
app: nginx # Custom label with value
template: # Template describes the pods that are created
metadata: # Standard objects metadata
labels: # Labels used to group/categorize objects
app: nginx # The name of this template
spec: # Specification of the desired behaviour of this pod
containers: # List of containers belonging to this pod (cannot be changed/updated)
- name: nginx # Name of this container
image: nginx # Docker image used for this container
ports: # Port mapping(s)
- containerPort: 80 # Number of port to expose on this pods ip
service.yaml:
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
type: LoadBalancer
selector:
app: nginx
ports:
- name: http
port: 80
targetPort: 80
To see what ip address (and ports) are being mapped you can run:
kubectl get services and kubectl describe pod <your pod name>`
If you are still having problems please provide the outputs of the two kubectl commands above.
Good luck!

Kubernetes to find Pod IP from another Pod

I have the following pods hello-abc and hello-def.
And I want to send data from hello-abc to hello-def.
How would pod hello-abc know the IP address of hello-def?
And I want to do this programmatically.
What's the easiest way for hello-abc to find where hello-def?
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: hello-abc-deployment
spec:
replicas: 1
template:
metadata:
labels:
app: hello-abc
spec:
containers:
- name: hello-abc
image: hello-abc:v0.0.1
imagePullPolicy: Always
args: ["/hello-abc"]
ports:
- containerPort: 5000
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: hello-def-deployment
spec:
replicas: 1
template:
metadata:
labels:
app: hello-def
spec:
containers:
- name: hello-def
image: hello-def:v0.0.1
imagePullPolicy: Always
args: ["/hello-def"]
ports:
- containerPort: 5001
---
apiVersion: v1
kind: Service
metadata:
name: hello-abc-service
spec:
ports:
- port: 80
targetPort: 5000
protocol: TCP
selector:
app: hello-abc
type: NodePort
---
apiVersion: v1
kind: Service
metadata:
name: hello-def-service
spec:
ports:
- port: 80
targetPort: 5001
protocol: TCP
selector:
app: hello-def
type: NodePort
Preface
Since you have defined a service that routes to each deployment, if you have deployed both services and deployments into the same namespace, you can in many modern kubernetes clusters take advantage of kube-dns and simply refer to the service by name.
Unfortunately if kube-dns is not configured in your cluster (although it is unlikely) you cannot refer to it by name.
You can read more about DNS records for services here
In addition Kubernetes features "Service Discovery" Which exposes the ports and ips of your services into any container which is deployed into the same namespace.
Solution
This means, to reach hello-def you can do so like this
curl http://hello-def-service:${HELLO_DEF_SERVICE_PORT}
based on Service Discovery https://kubernetes.io/docs/concepts/services-networking/service/#environment-variables
Caveat: Its very possible that if the Service port changes, only pods that are created after the change in the same namespace will receive the new environment variables.
External Access
In addition, you can also reach this your service externally since you are using the NodePort feature, as long as your NodePort range is accessible from outside.
This would require you to access your service by node-ip:nodePort
You can find out the NodePort which was randomly assigned to your service with kubectl describe svc/hello-def-service
Ingress
To reach your service from outside you should implement an ingress service such as nginx-ingress
https://github.com/helm/charts/tree/master/stable/nginx-ingress
https://github.com/kubernetes/ingress-nginx
Sidecar
If your 2 services are tightly coupled, you can include both in the same pod using the Kubernetes Sidecar feature. In this case, both containers in the pod would share the same virtual network adapter and accessible via localhost:$port
https://kubernetes.io/docs/concepts/workloads/pods/pod/#uses-of-pods
Service Discovery
When a Pod is run on a Node, the kubelet adds a set of environment
variables for each active Service. It supports both Docker links
compatible variables (see makeLinkVariables) and simpler
{SVCNAME}_SERVICE_HOST and {SVCNAME}_SERVICE_PORT variables, where the
Service name is upper-cased and dashes are converted to underscores.
Read more about service discovery here:
https://kubernetes.io/docs/concepts/services-networking/service/#environment-variables
You should be able to reach hello-def-service from pods in hello-abc via DNS as specified here: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#services
However, kube-dns or CoreDNS has to be configured/installed in your k8s cluster before DNS records can be utilized in your cluster.
Specifically, you should be reach hello-def-service via the DNS record http://hello-def-service for the service running in the same namespace as hello-abc-service
And you should be able to reach hello-def-service running in another namespace ohter_namespace via the DNS record hello-def-service.other_namespace.svc.cluster.local.
If, for some reason, you do not have DNS add-ons installed in your cluster, you still can find the virtual IP of the hello-def-service via environment variables in hello-abc pods. As is documented here.