I am working with the AKS service. I have a TensorFlow Serving image in Azure Container Registry. When I deploy my service, the public service endpoint is not accessible, nor is it pingable.
My image exposes port 8501, so I am using that as the target port in my YAML.
Here is the YAML file I am using for this deployment.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-model-gpu
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: my-model-gpu
    spec:
      containers:
      - name: my-model-gpu
        image: dsdemocr.azurecr.io/work-place-safety-gpu
        ports:
        - containerPort: 8501
        resources:
          limits:
            nvidia.com/gpu: 1
      imagePullSecrets:
      - name: registrykey
---
apiVersion: v1
kind: Service
metadata:
  name: my-model-gpu
spec:
  type: LoadBalancer
  ports:
  - port: 8501
    protocol: "TCP"
    targetPort: 8501
  selector:
    app: my-model-gpu
Below is the output of kubectl describe svc my-model-gpu:
Name: my-model-gpu
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"my-model-gpu","namespace":"default"},"spec":{"ports":[{"port":850...
Selector: app=my-model-gpu
Type: LoadBalancer
IP: 10.0.244.106
LoadBalancer Ingress: 52.183.17.101
Port: <unset> 8501/TCP
TargetPort: 8501/TCP
NodePort: <unset> 31546/TCP
Endpoints: 10.244.0.22:8501
Session Affinity: None
External Traffic Policy: Cluster
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal EnsuringLoadBalancer 10m service-controller Ensuring load balancer
Normal EnsuredLoadBalancer 9m8s service-controller Ensured load balancer
It looks like I am making some mistake with the port mapping. Any help is much appreciated.
From the information you provided, there is no problem with the service itself or with its LoadBalancer type. I think the possible causes are all on the application side; I list them below:
The port you are exposing is not the right one, so make sure which port your application actually listens on (see the probe sketch after this list).
You want to use a GPU in AKS, so you need to choose a GPU-enabled VM size when you create the cluster or node pool. Getting this wrong can also leave your application out of the Running state.
Something else may be wrong with the application itself that keeps it from running well, so also check the application state.
As for the imagePullSecret, maybe the service principal does not have enough permission to pull the image. This is less likely, but I list it here as well.
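For the first and third points, a readiness probe can surface both issues: the Pod only joins the Service endpoints once the container actually answers on 8501. This is only a rough sketch; the model name in the path is a placeholder, since the TensorFlow Serving status URL depends on the name your model is served under:
containers:
- name: my-model-gpu
  image: dsdemocr.azurecr.io/work-place-safety-gpu
  ports:
  - containerPort: 8501
  readinessProbe:
    httpGet:
      path: /v1/models/my_model   # placeholder; use the name your model is served under
      port: 8501
    initialDelaySeconds: 15
    periodSeconds: 10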
Hope this helps you.
Please follow the advice provided by the community:
Verify that your application is working properly.
Does the Service have any Endpoints?
Researching this topic, I would suggest taking a look at this Azure-specific information:
Use a static public IP address with the Azure Kubernetes Service (AKS) load balancer
Preview - Use a Standard SKU load balancer in Azure Kubernetes Service (AKS)
If the static IP address defined in the loadBalancerIP property of the Kubernetes service manifest does not exist, or has not been created in the node resource group and no additional delegations configured, the load balancer service creation fails.
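As a hedged sketch of that approach for the Service in this question, assuming you have already created a static public IP in the cluster's node resource group (or configured the delegations mentioned above):
apiVersion: v1
kind: Service
metadata:
  name: my-model-gpu
spec:
  type: LoadBalancer
  loadBalancerIP: "YOUR.STATIC.IP"   # must already exist in the node resource group, or delegations must be configured
  ports:
  - port: 8501
    targetPort: 8501
  selector:
    app: my-model-gpu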
There is a very similar case on GitHub.
If you are using Advanced networking, it creates the vNet in the same resource group as the AKS service by default.
Note:
Currently only the Basic IP SKU is supported. Work is in progress to support the Standard IP resource SKU. For more information, see IP address types and allocation methods in Azure.
Additional resources:
Azure LoadBalancer related:
Azure - Service Type LoadBalancer
Azure - Internal load balancer
Hope this helps.
The container I was trying to access had no port open on 8501; once I fixed that, it worked well.
I think you need to access the application at 52.183.17.101:8501,
because you did not define any routing of traffic to port 80 of the load balancer.
As written, the Service creates a load balancer listening on 8501.
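If you would rather reach the model on the default HTTP port instead, a minimal sketch (reusing the names from the question) maps port 80 on the load balancer to the container's 8501:
apiVersion: v1
kind: Service
metadata:
  name: my-model-gpu
spec:
  type: LoadBalancer
  ports:
  - port: 80          # port the Azure load balancer listens on
    targetPort: 8501  # port TensorFlow Serving listens on inside the container
    protocol: TCP
  selector:
    app: my-model-gpu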
I am trying to set up access to an external MySQL database that runs as a MariaDB PXC three-node cluster.
Let's say my external database nodes have these IP addresses:
172.16.10.100
172.16.10.101
172.16.10.102
If one of those nodes goes down, I want Kubernetes to automatically route traffic only to the two available nodes.
If I create a simple Service and Endpoints in Kubernetes (shown below), will it automatically fail over?
#
# Service
#
kind: Service
apiVersion: v1
metadata:
  name: mariadb-service
spec:
  clusterIP: None
  sessionAffinity: None
  ports:
  - port: 3306
    protocol: TCP
    targetPort: 3306
---
#
# Endpoints
#
kind: Endpoints
apiVersion: v1
metadata:
  name: mariadb-service
subsets:
- addresses:
  - ip: 172.16.10.100
  - ip: 172.16.10.101
  - ip: 172.16.10.102
  ports:
  - port: 3306
    protocol: TCP
Please note this:
Kubernetes does not offer an implementation of network load balancers (Services of type LoadBalancer) for bare-metal clusters. The implementations of network load balancers that Kubernetes does ship with are all glue code that calls out to various IaaS platforms (GCP, AWS, Azure…). If you’re not running on a supported IaaS platform (GCP, AWS, Azure…), LoadBalancers will remain in the “pending” state indefinitely when created.
I found an option for having a LoadBalancer with an on-premises Kubernetes cluster: MetalLB.
MetalLB aims to redress this imbalance by offering a network load balancer implementation that integrates with standard network equipment, so that external services on bare-metal clusters also “just work” as much as possible.
See the requirements and the installation options you prefer.
MetalLB respects the spec.loadBalancerIP parameter, so if you want your service to be set up with a specific address, you can request it by setting that parameter. MetalLB also supports requesting a specific address pool, if you want a certain kind of address but don’t care which one exactly. To request assignment from a specific pool, add the metallb.universe.tf/address-pool annotation to your service, with the name of the address pool as the annotation value. For example:
apiVersion: v1
kind: Service
metadata:
  name: nginx
  annotations:
    metallb.universe.tf/address-pool: production-public-ips
spec:
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: nginx
  type: LoadBalancer
And for the address-pool annotation you can define the pool in MetalLB's configuration along these lines (MetalLB pools use layer2 or bgp as the protocol and plain address ranges; they do not carry port definitions):
address-pools:
- name: production-public-ips
  protocol: layer2   # or bgp, depending on your MetalLB setup
  addresses:
  - 172.16.10.100/32
  - 172.16.10.101/32
  - 172.16.10.102/32
Find full example usage here.
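For reference, that fragment normally sits inside MetalLB's configuration ConfigMap (the legacy, pre-v0.13 format); a minimal sketch assuming layer 2 mode and the default metallb-system namespace:
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: production-public-ips
      protocol: layer2
      addresses:
      - 172.16.10.100/32
      - 172.16.10.101/32
      - 172.16.10.102/32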
However, there is no health check implemented, and that is the main thing you need.
There is a good example issue on GitHub, and this thread explains why health checks are not implemented for Kubernetes CRDs.
In terms of core Kubernetes concepts, this use case is not doable out of the box; you would have to look for some custom endpoints controller:
readinessProbe: Indicates whether the container is ready to respond to requests. If the readiness probe fails, the endpoints controller removes the Pod's IP address from the endpoints of all Services that match the Pod. The default state of readiness before the initial delay is Failure. If a Container does not provide a readiness probe, the default state is Success.
Please find more information in official documentation
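For context, this is roughly what the quoted mechanism looks like for Pods the cluster manages itself (it does not apply to the manually created Endpoints above); the image and probe timings here are illustrative:
containers:
- name: mariadb
  image: mariadb:10.5          # illustrative image and tag
  ports:
  - containerPort: 3306
  readinessProbe:
    tcpSocket:
      port: 3306               # Pod is removed from Service endpoints while this check fails
    initialDelaySeconds: 10
    periodSeconds: 5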
I am using HAProxy as the ingress controller in my GKE clusters, and I am exposing the HAProxy service as an internal LoadBalancer service.
Recently I experienced an issue where the HAProxy service changed its EXTERNAL-IP and traffic stopped routing to HAProxy. This happened multiple times on different days (it has now stopped). I had to manually add the new external IP to the frontend of that load balancer to allow traffic to HAProxy.
There were two HAProxy pods, both had been running for days, and there was nothing in their logs. I assume it was something related to the Service or the GCP load balancer, not HAProxy itself.
I am afraid that I don't have any logs related to that.
I still don't know what caused the service IP to change. There were no recent changes, the cluster and all services had been running properly for many days, and then this suddenly occurred.
Has anyone faced a similar issue before? What can I do to avoid such an issue in the future?
What could have caused the IP to change?
This is how my service is configured:
---
apiVersion: v1
kind: Service
metadata:
  labels:
    run: haproxy-ingress
  name: haproxy-ingress
  namespace: haproxy-controller
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
    networking.gke.io/internal-load-balancer-allow-global-access: "true"
    cloud.google.com/network-tier: "Premium"
spec:
  selector:
    run: haproxy-ingress
  type: LoadBalancer
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  - name: https
    port: 443
    protocol: TCP
    targetPort: 443
  - name: stat
    port: 1024
    protocol: TCP
    targetPort: 1024
Found some logs:
Warning SyncLoadBalancerFailed 30m (x3570 over 13d) service-controller Error syncing load balancer: failed to ensure load balancer: googleapi: Error 409: IP_IN_USE_BY_ANOTHER_RESOURCE - IP '10.17.129.17' is already being used by another resource.
Normal EnsuringLoadBalancer 3m33s (x3576 over 13d) service-controller Ensuring load balancer
The short answer is: external IPs for the service are ephemeral.
Because the HAProxy controller pods get recreated, the HAProxy service ends up being created with a new ephemeral IP.
To avoid this issue, I would recommend using a static IP that you can reference in the loadBalancerIP field.
This can be done with the following steps:
Reserve a static IP. (link)
Use this IP to create the service. (link)
Example YAML:
apiVersion: v1
kind: Service
metadata:
  name: helloweb
  labels:
    app: hello
spec:
  selector:
    app: hello
    tier: web
  ports:
  - port: 80
    targetPort: 8080
  type: LoadBalancer
  loadBalancerIP: "YOUR.IP.ADDRESS.HERE"
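For the internal load balancer from the question, the same idea should combine loadBalancerIP with the original annotations; this is only a sketch, and the address must be a reserved internal IP in the cluster's subnet:
apiVersion: v1
kind: Service
metadata:
  name: haproxy-ingress
  namespace: haproxy-controller
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
    networking.gke.io/internal-load-balancer-allow-global-access: "true"
spec:
  selector:
    run: haproxy-ingress
  type: LoadBalancer
  loadBalancerIP: "YOUR.RESERVED.INTERNAL.IP"   # reserved internal address in the cluster's subnet
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80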
Unfortunately, without logs it's hard to say anything for sure. You should check the audit logs that GKE ships to Cloud Logging, as they might give you some idea of what happened. One possibility is that GCP "oops"'d the GLB and GKE recreated it, thus giving it a new IP. I've never heard of that happening with LBs, though (it happens fairly often with nodes, but not LBs). A more common case would be that you ran some kubectl command that inadvertently removed the Service object and it was then recreated by some management layer you have set up (Argo, Flux, Helm Operator, whatever); delete plus recreate again means a new LB with a new IP. The latter case should be visible in the audit logs, so check those for sure.
I'm trying to use a load balancer to expose a service I have running in an EKS pod. My service is defined in a YAML like this:
kind: Service
apiVersion: v1
metadata:
  name: mlflow-server
  namespace: default
  labels:
    app: mlflow-server
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
  externalTrafficPolicy: Local
  type: LoadBalancer
  selector:
    app: mlflow-server
  ports:
  - name: http
    port: 88
    targetPort: http
  - name: https
    port: 443
    targetPort: https
This defines a service for a pod that runs the MLflow server. When I apply this and access the external IP generated for the service, I get a "This site can't be reached" error. Is there something I'm missing in exposing my service as a load-balanced service to reach the MLflow UI?
For a basic LoadBalancer-type service you do not need the annotation service.beta.kubernetes.io/aws-load-balancer-type: nlb; that annotation creates a Network Load Balancer. If you do need an NLB, the following problems are possible:
The NLB takes a few minutes to come up after you apply the manifest. If you check it immediately after deploying, it will not yet accept traffic. Check whether the intended Network Load Balancer is up in the AWS EC2 console, under the Load Balancers tab.
The second, more likely problem is that the NLB can only be attached to certain instance types. To check that, go through the following link:
https://docs.aws.amazon.com/elasticloadbalancing/latest/network/target-group-register-targets.html#register-deregister-targets
So if you do not actually need a Network Load Balancer, remove the annotation (the NLB also costs more), as sketched below. If it is a hard requirement, check the second point to make sure the AWS instance types you are using are compatible with the Network Load Balancer.
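If the NLB is not required, a sketch of the same Service without the annotation (so it falls back to the default classic ELB) could look like this, keeping the port numbers from the question:
kind: Service
apiVersion: v1
metadata:
  name: mlflow-server
  namespace: default
  labels:
    app: mlflow-server
spec:
  type: LoadBalancer
  selector:
    app: mlflow-server
  ports:
  - name: http
    port: 88
    targetPort: http   # targetPort by name requires a containerPort named "http" in the pod spec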
I have a very specific case where my Pod needs to access another LoadBalancer service via its external IP.
Is there any way to assign a LoadBalancer's external IP as an environment variable in the Deployment YAML?
Thank you in advance!
I don't think this is directly possible in any of the standard templating tools. Part of the problem is that creating a cloud-hosted load balancer is an asynchronous operation, so that external-IP value won't be available until some time after kubectl apply (or the equivalent helm install) has finished.
If you can create the Service in advance then you can hard-code its external IP address or host name into other configuration, but this is intrinsically two steps. (If you're bought into Kubernetes operators, this should be possible with custom code: watch the Service, and once it gets its external address, create a corresponding ConfigMap that holds the address.)
Depending on your specific use case it may also work to just target the LoadBalancer Service within your cluster, the same as any other Service. This won't go out through the cloud provider's load-balancer tier, but it should be indistinguishable otherwise.
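A hedged sketch of that last option: rather than passing an external IP, the Deployment simply receives the Service's in-cluster DNS name as an environment variable (all names here are illustrative):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                       # illustrative names throughout
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:latest
        env:
        - name: UPSTREAM_URL         # whatever variable your application expects
          value: "http://other-service.default.svc.cluster.local"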
I found a way to do it, but @David Maze was perfectly right: there is no direct way to do it.
So my solution is to add DNS records with public and private zones:
apiVersion: v1
kind: Service
metadata:
  name: nginx-lb
  labels:
    app.kubernetes.io/name: nginx-lb
  annotations:
    external-dns.alpha.kubernetes.io/hostname: mycoolservice.{{ .Values.dns_external_zone }}.
    external-dns.alpha.kubernetes.io/zone-type: public,private
    external-dns.alpha.kubernetes.io/ttl: "1"
spec:
  type: LoadBalancer
  ports:
  - name: https
    port: 443
    targetPort: https
  - name: http
    port: 80
    targetPort: http
  selector:
    app.kubernetes.io/name: nginx
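To connect this back to the original question, the Deployment can then reference the stable DNS name from the annotation instead of an ephemeral external IP; a small, illustrative fragment for the container spec:
env:
- name: EXTERNAL_LB_HOST                                 # illustrative variable name
  value: "mycoolservice.{{ .Values.dns_external_zone }}" # hostname managed by external-dns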
Kubernetes version:
v1.10.3
Docker version:
17.03.2-ce
Operating system and kernel:
Centos 7
Steps to Reproduce:
https://kubernetes.io/docs/tasks/access-application-cluster/service-access-application-cluster/
Results:
[root@rd07 rd]# kubectl describe services example-service
Name: example-service
Namespace: default
Labels: run=load-balancer-example
Annotations:
Selector: run=load-balancer-example
Type: NodePort
IP: 10.108.214.162
Port: 9090/TCP
TargetPort: 9090/TCP
NodePort: 31105/TCP
Endpoints: 192.168.1.23:9090,192.168.1.24:9090
Session Affinity: None
External Traffic Policy: Cluster
Events:
Expected:
Expect to be able to curl the cluster IP defined in the Kubernetes service.
I'm not exactly sure which IP is the so-called "public-node-ip", so I tried every related IP address; only when using the master IP as the "public-node-ip" does it show "No route to host".
I used netstat to check whether the port is being listened on.
I tried https://github.com/rancher/rancher/issues/6139 to flush my iptables, but it did not help at all.
I also went through https://kubernetes.io/docs/tasks/debug-application-cluster/debug-service/; nslookup hostnames.default is not working.
The services seem to be configured fine, but they still cannot be accessed.
I'm using Calico, and Flannel was also tried.
I have tried many tutorials for applying services, and none of them can be accessed.
I'm new to Kubernetes; please help if anyone can.
If you are on a public cloud, you will not see the public IP address in the output of the ip a command. Even so, the port will be exposed on 0.0.0.0:31105.
Here is a sample file you can check your configuration against:
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: app-name
  name: bss
  namespace: default
spec:
  externalIPs:
  - 172.16.2.2
  - 172.16.2.3
  - 172.16.2.4
  externalTrafficPolicy: Cluster
  ports:
  - port: 9090
    protocol: TCP
    targetPort: 9090
  selector:
    k8s-app: bss
  sessionAffinity: ClientIP
  type: LoadBalancer
status:
  loadBalancer: {}
Just replace <private-ip> under externalIPs: and curl your public IP with your node port.
If you are deploying the application on a cloud, also verify in the cloud security groups/firewall configuration that the port is open.
Hope this may help.
Thank you!
My k8s cluster is 1 master and 1 node.
The service pod is running on the node.
So when I use http://nodeip:31105, it shows "Hello Kubernetes!".
But http://masterip:31105 is still not working; is that supposed to work?
I checked the listening ports, and 31105 is being listened on on the master as well.