GKE streaming large file download fails with partial response - kubernetes

I have an app hosted on GKE which, among many tasks, serve's a zip file to clients. These zip files are constructed on the fly through many individual files on google cloud storage.
The issue that I'm facing is that when these zip's get particularly large, the connection fails randomly part way through (anywhere between 1.4GB to 2.5GB). There doesn't seem to be any pattern with timing either - it could happen between 2-8 minutes.
AFAIK, the connection is disconnecting somewhere between the load balancer and my app. Is GKE ingress (load balancer) known to close long/large connections?
GKE setup:
HTTP(S) load balancer ingress
NodePort backend service
Deployment (my app)
More details/debugging steps:
I can't reproduce it locally (without kubernetes).
The load balancer logs statusDetails: "backend_connection_closed_after_partial_response_sent" while the response has a 200 status code. A google of this gave nothing helpful.
Directly accessing the pod and downloading using k8s port-forward worked successfully
My app logs that the request was cancelled (by the requester)
I can verify none of the files are corrupt (can download all directly from storage)

I believe your "backend_connection_closed_after_partial_response_sent" issue is caused by websocket connection being killed by the back-end prematurily. You can see the documentation on websocket proxying in nginx - it explains the nature of this process. In short - by default WebSocket connection is killed after 10 minutes.
Why it works when you download the file directly from the pod ? Because you're bypassing the load-balancer and the websocket connection is kept alive properly. When you proxy websocket then things start to happen because WebSocket relies on hop-by-hop headers which are not proxied.
Similar case was discussed here. It was resolved by sending ping frames from the back-end to the client.
In my opinion your best shot is to do the same. I've found many cases with similar issues when websocket was proxied and most of them suggest to use pings because it will reset the connection timer and will keep it alive.
Here's more about pinging the client using WebSocket and timeouts
I work for Google and this is as far as I can help you - if this doesn't resolve your issue you have to contact GCP support.

Related

grpc client shows Unavailable errors during server scale up

I have two grpc services that communicate to each other using Istio service mesh and have envoy proxies for all the services.
We have an issue where during the scaling up of server pods due to high load, the client throws a few grpc UNAVAILABLE(mostly)/DEADLINE_EXCEEDED/CANCELLED errors for a while as soon as the new pod is ready.
I don't see any CPU throttling in server pods at all.
What else could be the issue and how can I investigate this?
Without the concrete status message, it's hard to say what could be the cause of the errors mentioned above.
To reduce UNAVAILABLE, one way to ask the RPC to wait-for-ready: https://github.com/grpc/grpc/blob/master/doc/wait-for-ready.md. This feature is available in all major gRPC languages (some may rename it to fail-fast=false).
DEADLINE_EXCEEDED is caused by the timeout set by your application or Envoy config, you should be able to tune it.
CANCELLED could mean: 1. the server is entering a graceful shutdown state; 2. the server is overloaded and rejecting new connections.

Internal API call from one pod to another pod in same Kubernetes cluster times out

I have two API that are publicly exposed, let's say xyz.com/apiA and xyz.com/apiB.
Both these API are running node and are dockerized services running as individual pods in the same namespace of a Kubernetes cluster.
Now, apiA calls apiB internally as part of its code logic. The apiA service makes a POST call to apiB and sends with it a somewhat large payload in its body parameter. This POST request times out if the payload in its body is more than 30kb.
We have checked the server logs and that POST request is not seen.
The error prompt shows connection timeout to 20.xx.xx.xx which is the public ip address of xyz.com
I'm new to Kubernetes and would appreciate your help.
So far have tried this, but it didn't help.
Please let me know if more information is needed.
Edit: kubectl client and server version is 1.22.0
To update the kind folks who took time to understand the problem and suggest solutions - The issue was due to bad routing. Internal APIs (apiB in the example above) should not be called using full domain name of xyz.com/apiB, rather they can be directly referenced using pod name as
http://pod_name.namespace.svc.local/apiB.
This will ensure internal calls are routed through Kubernetes DNS and don't have to go through load balancer and nginx, thus improving response time heavily.
Every call made to apiA was creating a domino effect by populating hundreds of calls to apiB and overloading the server, which caused it to fail only after a few thousand requests.
Lesson learned: Route all internal calls using cluster's internal network.

how sockets or communication channels are maintained in distibuted system

I am new to distributed systems, and came to this problem once needed to deploy a gRPC service to kubernetes (GKE). As far as I know, when a client initiate an rpc, it creates a long lasting http2 connection and further calls are multiplexed on it. I like to send/push notifications or similar messages to the client through this connection. If I deploy to multiple pod, then the connections are spread across them, and not sure what is the best way to locate the instance where the channel is registered to the client. A possible solution could be, as soon as user initiate a connection, keep a reference of clientId and pod ip (or some identification) in a centralized service and other pods lookup the pod and forward the message to it. Is something like is advisable or is there an existing solution for this? I am unfamiliar with this space and any suggestion is highly appreciated.
Edit: (response to #mebius99)
While looking at deploying option, I stumbled upon GKE, and other cloud deployment options were limited because of my use of gRPC/http2. Thanks for mentioning service discovery , and that or service mesh might be an option. With gRPC, client maintains a long lived connection to a single pod. So, I want every pod to be able to query, based on unique clientId (clients can do an initial register rpc call), which pod is it connected, so can make use of this connection and also a way pods to forward the message between them. So, something like when I get a registration call from client, I update the central registry about the client and pod ip, then look it up from any pod and forward package to it so it further forward to client through the existing streaming connection. You guiding me to the right direction, please let me know above is possible in container environment.
thank you.
Another idea, You can use Envoy proxy.
If you are using GKE, these posts are helpful.
https://cloud.google.com/solutions/exposing-grpc-services-on-gke-using-envoy-proxy
https://github.com/GoogleCloudPlatform/grpc-gke-nlb-tutorial
I'd suggest to start from the Kubernetes Service concept and Service discovery. The External HTTP(S) Load Balancing should fit your needs.
In case you need something more sophisticated, Envoy proxy + Network Load Balancing could be a solution, as is mentioned here.
It sounds like you want to implement some kind of Pub-Sub system.
You must do some back-of-envelop calculation of the scale, such as how many clients, how many messages per second first.
Then you can choose whether to implement yourself or pick an off-the-shelf system, such as https://doc.akka.io/docs/alpakka/current/google-cloud-pub-sub-grpc.html
I just want to add more explanations to the existing answers here.
Since requests in HTTP/2 is multiplexed (multiple requests can be active on the same connection at any point in time), requests will be just pinned to a single Kubernetes pod. Hence, we need to configure a service mesh to shift from connection-based balancing to request-based balancing. Envoy Proxy mentioned here is one example.
I'd recommend everyone to read this good article from Kubernetes blog https://kubernetes.io/blog/2018/11/07/grpc-load-balancing-on-kubernetes-without-tears.

Kubernetes send HTTP request from Pods

Currently I have an Kubernetes cluster which is to analyze Video Feeds and send particular results based on the video. I wish to send an HTTP request from my Kubernetes pod from from time to time if the requested Video needs to be retrieved over the internet. However all of these requests seem to Fail. When I issued a CURL command in the shell of the container I receive an error message saying
could not resolve host
I have looked into a few answers and many of them involve exposing a port in the container running a server to the Kubernetes node and making it publicly available through a external IP which I have already Implemented.
I am still very much a Novice in this area. So any guidance is appreciated
Answered in comments, kube-dns was unavailable due to resource constraints.

Application container unable to access network before sidecar ready

I was trying fortio server/client application on istio. I used istoctl for injecting istio dependency and my serer pod was came up fine. But client pod was giving connection refused error due to proxy sidecar is not yet ready to handle connection request of client. Please help me addressing this issue. For reference attaching my yaml files.
This is by design and there is no way around it.
The part responsible for configuration of the iptables for capturing the traffic is run as an init container, which ensures that the required rules are in place before any of the normal pod containers start up. If you use istio for all the traffic, then until it's container is ready, no network traffic will reach in/out of the container.
You should make sure your application handles this right. Apps should be able to withstand unavailability of it's dependencies for a time, both on startup and during operation. In worst case you can introduce your own handling in form of ie. custom entrypoint that awaits for communication to be up.