I have two gRPC services that communicate with each other through an Istio service mesh, with Envoy sidecar proxies for all the services.
We have an issue where, during scale-up of the server pods under high load, the client throws gRPC UNAVAILABLE (mostly), DEADLINE_EXCEEDED, and CANCELLED errors for a while as soon as a new pod becomes ready.
I don't see any CPU throttling in the server pods at all.
What else could be the issue and how can I investigate this?
Without the concrete status message, it's hard to say what is causing the errors mentioned above.
To reduce UNAVAILABLE, one option is to ask the RPC to wait-for-ready: https://github.com/grpc/grpc/blob/master/doc/wait-for-ready.md. This feature is available in all major gRPC languages (some rename it to fail-fast=false).
DEADLINE_EXCEEDED is caused by the timeout set by your application or by your Envoy config; you should be able to tune it.
CANCELLED could mean: 1. the server is entering a graceful shutdown state; 2. the server is overloaded and rejecting new connections.
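For illustration, here is roughly what the wait-for-ready and deadline knobs look like in Go. This is a minimal sketch, not your setup: the address is a placeholder, plaintext is assumed because the Envoy sidecars handle mTLS, and the standard gRPC health-check service stands in for your own generated stubs.

```go
package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	healthpb "google.golang.org/grpc/health/grpc_health_v1"
)

func main() {
	// Address is a placeholder for your server's in-cluster Service name.
	conn, err := grpc.Dial("my-server.default.svc.cluster.local:50051",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("dial: %v", err)
	}
	defer conn.Close()

	// Per-call deadline: RPCs that exceed it fail with DEADLINE_EXCEEDED,
	// so tune it to your real latency budget.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// WaitForReady queues the RPC until the channel is ready instead of
	// failing fast with UNAVAILABLE while endpoints churn during scale-up.
	client := healthpb.NewHealthClient(conn)
	resp, err := client.Check(ctx, &healthpb.HealthCheckRequest{}, grpc.WaitForReady(true))
	if err != nil {
		log.Fatalf("rpc failed: %v", err)
	}
	log.Printf("server status: %v", resp.GetStatus())
}
```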
I have two APIs that are publicly exposed, say xyz.com/apiA and xyz.com/apiB.
Both APIs run Node and are Dockerized services running as individual pods in the same namespace of a Kubernetes cluster.
Now, apiA calls apiB internally as part of its logic: it makes a POST call to apiB with a somewhat large payload in the request body. This POST request times out whenever the body payload exceeds about 30 KB.
We have checked the server logs, and that POST request never appears there.
The error shows a connection timeout to 20.xx.xx.xx, which is the public IP address of xyz.com.
I'm new to Kubernetes and would appreciate your help.
So far I have tried this, but it didn't help.
Please let me know if more information is needed.
Edit: kubectl client and server version is 1.22.0
To update the kind folks who took the time to understand the problem and suggest solutions: the issue was due to bad routing. Internal APIs (apiB in the example above) should not be called through the full public domain xyz.com/apiB; instead they can be referenced directly through the Kubernetes Service name, as in
http://service_name.namespace.svc.cluster.local/apiB.
This ensures internal calls are resolved by Kubernetes DNS and don't have to go through the load balancer and nginx, which improves response time heavily.
Every call made to apiA was creating a domino effect, spawning hundreds of calls to apiB and overloading the server, which caused it to fail after only a few thousand requests.
Lesson learned: Route all internal calls using cluster's internal network.
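For anyone landing here later, a minimal sketch of what the internal call looks like (Go here purely for illustration; the service name and namespace are placeholders for your own):

```go
package main

import (
	"bytes"
	"log"
	"net/http"
)

func main() {
	// Service name and namespace are placeholders; this URL resolves via
	// the cluster's internal DNS, so the request never leaves the cluster
	// or touches the public load balancer.
	url := "http://apib-service.default.svc.cluster.local/apiB"

	body := bytes.NewBufferString(`{"data":"large payload here"}`)
	resp, err := http.Post(url, "application/json", body)
	if err != nil {
		log.Fatalf("internal call failed: %v", err)
	}
	defer resp.Body.Close()
	log.Println("apiB responded with", resp.Status)
}
```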
I am new to distributed systems and ran into this problem when I needed to deploy a gRPC service to Kubernetes (GKE). As far as I know, when a client initiates an RPC, it creates a long-lived HTTP/2 connection, and further calls are multiplexed over it. I'd like to send/push notifications or similar messages to the client through this connection. If I deploy multiple pods, the connections are spread across them, and I'm not sure of the best way to locate the instance where the client's channel is registered. A possible solution: as soon as a user initiates a connection, keep a reference of clientId and pod IP (or some identifier) in a centralized service, and have other pods look up the pod and forward the message to it. Is something like this advisable, or is there an existing solution for it? I am unfamiliar with this space, and any suggestion is highly appreciated.
Edit: (response to #mebius99)
While looking at deployment options I stumbled upon GKE, and other cloud deployment options were limited because of my use of gRPC/HTTP2. Thanks for mentioning service discovery; that or a service mesh might be an option. With gRPC, a client maintains a long-lived connection to a single pod. So I want every pod to be able to query, based on a unique clientId (clients can make an initial register RPC call), which pod the client is connected to, so it can use that connection, and also a way for pods to forward messages between them. So, something like: when I get a registration call from a client, I update the central registry with the clientId and pod IP; then any pod can look it up and forward the message to that pod, which in turn forwards it to the client over the existing streaming connection. You're guiding me in the right direction; please let me know whether the above is possible in a container environment.
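To make the idea concrete, here is a rough sketch of the registry I have in mind (Go purely for illustration; the in-memory map and all names are placeholders for a real centralized store such as Redis or etcd):

```go
package main

import (
	"fmt"
	"sync"
)

// Registry maps a clientId to the address of the pod holding that client's
// streaming connection. The in-memory map stands in for the centralized
// service described above.
type Registry struct {
	mu    sync.RWMutex
	conns map[string]string // clientId -> pod address
}

func NewRegistry() *Registry {
	return &Registry{conns: make(map[string]string)}
}

// Register is called by whichever pod receives the client's initial
// "register" RPC.
func (r *Registry) Register(clientID, podAddr string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.conns[clientID] = podAddr
}

// Lookup is called by any pod that needs to push a message to a client.
func (r *Registry) Lookup(clientID string) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	addr, ok := r.conns[clientID]
	return addr, ok
}

func main() {
	reg := NewRegistry()

	// Pod 10.0.1.5 accepted client-42's streaming connection and registers it.
	reg.Register("client-42", "10.0.1.5:50051")

	// Later, any other pod can find where to forward a push message; it
	// would then make a gRPC call to that address, and the owning pod
	// writes the message to the client's stream.
	if addr, ok := reg.Lookup("client-42"); ok {
		fmt.Println("forward message for client-42 to pod at", addr)
	}
}
```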
thank you.
Another idea: you can use an Envoy proxy.
If you are using GKE, these posts are helpful:
https://cloud.google.com/solutions/exposing-grpc-services-on-gke-using-envoy-proxy
https://github.com/GoogleCloudPlatform/grpc-gke-nlb-tutorial
I'd suggest starting from the Kubernetes Service concept and Service discovery. External HTTP(S) Load Balancing should fit your needs.
In case you need something more sophisticated, Envoy proxy + Network Load Balancing could be a solution, as mentioned here.
It sounds like you want to implement some kind of Pub-Sub system.
You should first do a back-of-the-envelope calculation of the scale: how many clients, and how many messages per second.
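For instance (numbers invented for illustration): 10,000 connected clients each receiving one message per second is 10,000 messages per second fanned out across your pods, while a few hundred clients at one message per minute is trivial; those two workloads call for very different designs.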
Then you can choose whether to implement yourself or pick an off-the-shelf system, such as https://doc.akka.io/docs/alpakka/current/google-cloud-pub-sub-grpc.html
I just want to add more explanations to the existing answers here.
Since HTTP/2 multiplexes requests (multiple requests can be active on the same connection at any point in time), a gRPC client typically opens one long-lived connection, and Kubernetes' default kube-proxy load balancing operates per connection, so all of that client's requests get pinned to a single pod. Hence, we need to shift from connection-based balancing to request-based balancing, for example by configuring a service mesh. The Envoy proxy mentioned here is one example.
I'd recommend reading this good article from the Kubernetes blog: https://kubernetes.io/blog/2018/11/07/grpc-load-balancing-on-kubernetes-without-tears.
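If you'd rather not run a mesh, client-side balancing is another option that article discusses. A minimal Go sketch, assuming a headless Service (clusterIP: None) so that DNS returns the individual pod IPs (the service name and port are placeholders):

```go
package main

import (
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	// The dns:/// scheme makes the gRPC resolver fetch all A records; with
	// a headless Service those records are the pod IPs, so round_robin
	// spreads requests across pods instead of pinning one connection.
	conn, err := grpc.Dial(
		"dns:///my-grpc-service.default.svc.cluster.local:50051",
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithDefaultServiceConfig(`{"loadBalancingConfig": [{"round_robin": {}}]}`),
	)
	if err != nil {
		log.Fatalf("dial: %v", err)
	}
	defer conn.Close()
	// ... create your stubs from conn as usual.
}
```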
Currently I have a Kubernetes cluster that analyzes video feeds and sends particular results based on the video. I want to send an HTTP request from my Kubernetes pod from time to time when the requested video needs to be retrieved over the internet. However, all of these requests fail. When I issue a curl command in the shell of the container, I receive an error message saying
could not resolve host
I have looked into a few answers, and many of them involve exposing a port from the container running a server to the Kubernetes node and making it publicly available through an external IP, which I have already implemented.
I am still very much a novice in this area, so any guidance is appreciated.
Answered in comments, kube-dns was unavailable due to resource constraints.
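For anyone else hitting "could not resolve host" inside a pod: checking the health of the cluster DNS pods is a good first step, e.g. kubectl get pods -n kube-system -l k8s-app=kube-dns to see whether they are pending or restarting, and kubectl describe pod on any unhealthy one to look for evictions or resource-limit trouble (the label shown is the conventional one for kube-dns/CoreDNS; adjust for your cluster).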
I was trying a fortio server/client application on Istio. I used istioctl to inject the Istio sidecar, and my server pod came up fine. But the client pod was giving a connection refused error because the proxy sidecar was not yet ready to handle the client's connection request. Please help me address this issue. For reference, I'm attaching my YAML files.
This is by design and there is no way around it.
The part responsible for configuring the iptables rules that capture the traffic runs as an init container, which ensures that the required rules are in place before any of the normal pod containers start. If you use Istio for all the traffic, then until its container is ready, no network traffic will flow in or out of the pod.
You should make sure your application handles this correctly. Apps should be able to withstand unavailability of their dependencies for a time, both on startup and during operation. In the worst case you can introduce your own handling, e.g. a custom entrypoint that waits for connectivity to be up.
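As an illustration of that last point, a minimal Go sketch of a startup retry loop (the address, port, and time budget are placeholders; the same pattern applies in any language):

```go
package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

// dialWithRetry keeps trying to reach a dependency until it succeeds or the
// overall budget runs out, so a briefly-absent sidecar doesn't kill the app.
func dialWithRetry(addr string, budget time.Duration) (*grpc.ClientConn, error) {
	deadline := time.Now().Add(budget)
	for {
		ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
		conn, err := grpc.DialContext(ctx, addr,
			grpc.WithTransportCredentials(insecure.NewCredentials()),
			grpc.WithBlock()) // block until the connection is actually up
		cancel()
		if err == nil {
			return conn, nil
		}
		if time.Now().After(deadline) {
			return nil, err
		}
		log.Printf("dependency %s not reachable yet (%v), retrying...", addr, err)
		time.Sleep(time.Second)
	}
}

func main() {
	// Placeholder address for the fortio server's Service.
	conn, err := dialWithRetry("fortio-server.default.svc.cluster.local:8079", 60*time.Second)
	if err != nil {
		log.Fatalf("giving up: %v", err)
	}
	defer conn.Close()
	log.Println("connected; proceeding with startup")
}
```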