Kubernetes - How can I route pod egress traffic based on ingress

I have an application that is currently deployed multiple times to a Kubernetes cluster because it needs to call different sources based on the URLs it is called with. You could call them lanes. These deployments are well automated by a CI/CD pipeline.
The problem is, most of these lanes are not frequently used, yet they still need to be available. And we have a lot of applications following this pattern. I'd love to be able to deploy fewer pods that can handle the traffic of all lanes and still call the appropriate dependencies.
I know we could 'fix' this problem by incorporating switching logic within the application and passing the lane in a header or something, but that seems like a can of worms and could be problematic in production, where that logic isn't needed.
I know you can have a single deployment with multiple ingresses.
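For reference, this is roughly what I mean by a single deployment with multiple ingresses (a minimal sketch with hypothetical hosts, service name, and port):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-lane-a          # one ingress per lane (hypothetical names)
spec:
  rules:
  - host: lane-a.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app       # both ingresses point at the same Service/Deployment
            port:
              number: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-lane-b
spec:
  rules:
  - host: lane-b.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app
            port:
              number: 8080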
Is it possible to use the Ingress API to accomplish something like this in my kube.yml, where I could choose or rewrite outgoing URLs based on which ingress was called?
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/
Is there something in the nginx API that I could use in my kube.yml to accomplish this?
https://docs.nginx.com/nginx-ingress-controller/
Thanks for the help!

Related

Controlling the user experience when doing canary or A/B deployments with Istio

I have an application with multiple services called from a primary application service. I understand the basics of doing canary and A/B deployments, however all the examples I see show round-robin behaviour where each request switches between versions.
What I'd prefer is that once a given user/session is associated with a certain version it stays that way to avoid giving a confusing experience to the user.
How can this be achieved with Kubernetes or Istio/Envoy?
You can do this with Istio using Request Routing (route based on user identity), but I don't know how mature the feature is. It may also be possible to route based on cookies or header values.
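A minimal sketch of what that can look like, borrowing the reviews service and the v1/v2 subsets (defined by a DestinationRule) from the Istio request-routing example; the header name and user are placeholders:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - match:
    - headers:
        end-user:            # a cookie can be matched the same way via the cookie header
          exact: jason
    route:
    - destination:
        host: reviews
        subset: v2           # this user is pinned to v2
  - route:
    - destination:
        host: reviews
        subset: v1           # everyone else stays on v1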
We've been grappling with this because we want to deploy test microservices into production and expose them only if the first request contains a "dark release" header.
As mentioned by Jonas, cookies and header values can in theory be used to achieve what you're looking for. It's very easy to achieve if the service you are canarying is on the edge and the user accesses it directly.
The problem is, you mention you have multiple services. If you have a chain where the user accesses edge service A, which then calls service B, service C, etc., the headers or cookies will not be propagated from one service to another.
This is the same problem that we hit when trying to do distributed tracing. The Istio documentation currently has this FAQ:
https://istio.io/faq/distributed-tracing/#istio-copy-headers
The long and short of that is that you will have to do header propagation manually. Luckily most of my microservices are built on Spring Boot and I can achieve header propagation with a simple 5-line class that intercepts all outgoing calls. But it is nonetheless invasive and has to be done everywhere. The antithesis of a service mesh.
It's possible there is a clever way around this, but it's hard to infer from the docs what is possible and what isn't. I've seen a few GitHub issues raised by Istio developers to address this, but every one I've seen has gone stale after initial enthusiasm.

How to direct a request to the pod with the proper resource in Kubernetes

I'm new to Kubernetes and want to know the best approach to this problem.
I have a varying set of large models (~5GB) that need to be loaded into memory for my application to run. The app handles a request that specifies which model it needs, but the actual task is the same. I don't want to load all of the models with a single pod for cost reasons, and so models can be added/removed more easily. Can I have a single service with pods that each have a different subset of the resources loaded (1 or 2 each), with a request being directed to the pod that has the model it needs? Or do I need to make a service for each model, and then a gateway service in front of all of them?
I'm thinking it might be possible with process namespaces, but I'm not sure how customizable a service master can be in terms of parsing a request parameter and sending it to the right namespace.
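To make the second option concrete, the "service for each model plus a gateway in front" idea could look roughly like this (hypothetical names, just a sketch): an Ingress that routes by path to one Service/Deployment per model.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: model-gateway
spec:
  rules:
  - http:
      paths:
      - path: /model-a          # requests for model A go to the pods that have it loaded
        pathType: Prefix
        backend:
          service:
            name: model-a-svc   # hypothetical per-model Service
            port:
              number: 8080
      - path: /model-b
        pathType: Prefix
        backend:
          service:
            name: model-b-svc
            port:
              number: 8080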

How to notify a pod in a StatefulSet about other pods in Kubernetes

I was reading the tutorial on deploying a Cassandra ring and ZooKeeper with StatefulSets. What I don't understand is: if I decide to add another replica to the StatefulSet, how do I notify the other pods that there is another one? What are the best practices for this? I want one pod to be able to redirect a request to another pod in my custom application in case the request doesn't belong to it (i.e. it doesn't have the data).
Well, it seems like you want to run a clustered application inside Kubernetes. That is not something Kubernetes is directly responsible for. Cluster coordination for a given solution has to be handled within that solution, so the answer to a "how to" question cannot be generic.
Most of the software out there has some kind of coordination, discovery, and registration mechanism, be it preconfigured members, an external discovery catalog/database, or some network broadcasting.
A StatefulSet helps a lot here by retaining a stable network identity per pod (under its Service) and stable storage, so you can, for example, always point new replicas to register with the first replica (or preferably one of the first two, because what if no. 1 is the one that restarted?). But as I wrote above, this depends very much on the capabilities of the solution you want to deploy.
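A minimal sketch of the mechanics (hypothetical names): a headless Service plus a StatefulSet gives each replica a stable DNS name such as myapp-0.myapp, which new replicas can use to find and register with the first ones.

apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  clusterIP: None              # headless: each pod gets a stable DNS entry
  selector:
    app: myapp
  ports:
  - port: 7000
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: myapp
spec:
  serviceName: myapp
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: myapp:1.0                        # hypothetical image
        ports:
        - containerPort: 7000
        env:
        - name: SEED_NODES                      # hypothetical; how your app learns about peers
          value: "myapp-0.myapp,myapp-1.myapp"  # the first two replicas' stable DNS names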

Kubernetes: single Pod with many containers, or many Pods with a single container

I have a rather theoretical question which I can't answer with the resources found online. The question is: what's the rule for deciding how to compose containers in a Pod? Let me explain with an example.
I've these microservices:
Authentication
Authorization
Serving content
(plus) OpenResty to forward the calls from one to the other and orchestrate the flow. (Is there a way to do this natively in Kubernetes? It seems to have services based on nginx+lua, but I'm not sure how it works.)
For the sake of the example I leave out databases and the like; I assume they are external and not managed by Kubernetes.
Now, what's the correct way here, LEFT or RIGHT of the image?
LEFT: this seems easier to get working; everything talks over "localhost". The downside is that it loses some of the benefit of microservices. For example, if auth becomes slow and needs more instances, I have to duplicate the whole pod and not just that service.
RIGHT seems a bit more complex; it needs Services to expose each pod to the other pods. Yet here I could scale auth as needed without duplicating the other containers. On the other hand, I'll have a lot of pods, since each pod is basically a single container.
It is generally recommended to keep different services in different pods, or better, different Deployments that scale independently. The reasons are the usual benefits of a microservices architecture:
Looser coupling, allowing the different services to be developed independently in their own languages/technologies,
to be deployed and updated independently, and
also to scale independently (see the sketch after this list).
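As a rough sketch (hypothetical names and image tags), each service gets its own Deployment and Service and can be given its own replica count:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: auth
spec:
  replicas: 1                  # scale auth independently of the content service
  selector:
    matchLabels:
      app: auth
  template:
    metadata:
      labels:
        app: auth
    spec:
      containers:
      - name: auth
        image: auth:1.0        # hypothetical image
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: auth
spec:
  selector:
    app: auth
  ports:
  - port: 80
    targetPort: 8080

The authorization and content services would get analogous Deployment/Service pairs, each scaled on its own.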
The exception is what is considered a "helper application" that assists a "primary application". Examples given in the k8s docs are data pullers, data pushers, and proxies. In those cases a shared file system or exchange via the loopback network interface can help with performance-critical use cases. For example, a data puller can be a sidecar container next to an nginx container, pulling the website to serve from a Git repository.
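A sketch of that pattern (the repository URL and puller image are placeholders; a dedicated tool like git-sync would also work): a puller sidecar keeps a Git repository's content in a shared volume that the nginx container serves.

apiVersion: v1
kind: Pod
metadata:
  name: web-with-puller
spec:
  volumes:
  - name: site
    emptyDir: {}               # shared between the two containers
  containers:
  - name: nginx                # primary application: serves the site
    image: nginx
    volumeMounts:
    - name: site
      mountPath: /usr/share/nginx/html
  - name: content-puller       # helper: periodically pulls the site from Git
    image: alpine/git          # placeholder image choice
    command: ["sh", "-c"]
    args:
    - |
      while true; do
        rm -rf /tmp/site
        git clone --depth 1 https://example.com/site.git /tmp/site \
          && cp -r /tmp/site/. /site/
        sleep 60
      done
    volumeMounts:
    - name: site
      mountPath: /site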
Right image, each in its own pod. Multiple containers in a pod should really only be used when they are tightly coupled or needed to support the main container, such as a data loader.
With separate pods, each service can be updated and deployed independently. It also allows for more efficient scaling: in the future you may need 2 or 3 content pods but still only one authorization pod. If they are all together, you have to scale them all, since you don't have a choice when they share the same pod.
The right image is the better option: easier management, upgrades, and scaling.
You should choose the right side of the structure. The left-side model is tightly coupled, which makes it hard to scale an individual module according to the actual needs of the business.

How to monitor (micro)services?

I have a set of services. Every service contains some components.
Some of them are stateless, some of them are stateful, some are synchronous, some are asynchronous.
I have used different approaches to monitoring and alerting:
log-based alerting and metrics gathering, New Relic, and our own homegrown solution.
Basically, at the moment I am looking for a way to generalize and aggregate important metrics for all services in a single place. One of the things I want is to monitor products rather than separate services.
As an end result I picture a single dashboard with a small number of widgets, where looking at those widgets I could say for sure whether the services are usable to the end customer.
Perhaps someone can recommend an approach/methodology, or point me to some best practices.
I like what you're trying to achieve! A service is not production-ready unless it's thoroughly monitored.
I believe what you're describing falls under the topics of health checking and metrics.
... I would be able to say for sure, if services are usable to end-customer.
That, however, will require a little of both ;-) To ensure you're currently fulfilling your SLA, you have to make sure that your services are all a) running and b) performing as expected. For both problems I suggest looking at the StatsD toolchain. Initially developed by Etsy, it has become a de facto standard for gathering metrics.
To ensure all our services are running, we rely on Kubernetes. It takes our description of what should run, be reachable from outside, etc., and hosts that on our infrastructure. It also makes sure that, should things die, they will be restarted. It helps with things like auto-scaling as well! Awesome tooling, and kudos to Google!
The way it ensures that is with health checks. There are multiple ways to verify that a service instance booted by Kubernetes is alive and kicking (namely HTTP calls and CLI scripts, but this is modular should you need anything else!). If Kubernetes detects unhealthy instances it will immediately phase them out and start new ones instead.
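For example, a sketch of the two probe styles mentioned, with made-up endpoint, port, and script names:

apiVersion: v1
kind: Pod
metadata:
  name: my-service
spec:
  containers:
  - name: my-service
    image: my-service:1.0       # hypothetical image
    livenessProbe:              # HTTP call: restart the container if this fails
      httpGet:
        path: /healthz          # hypothetical health endpoint
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5
    readinessProbe:             # CLI script: take the instance out of rotation if this fails
      exec:
        command: ["sh", "-c", "/app/check-ready.sh"]   # hypothetical script
      periodSeconds: 10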
Now, to make sure all your services perform as expected, you'll need to gather some metrics. For all of our services (and all individual endpoints), we gather a few metrics via StatsD, like:
Requests/sec
number of errors returned (404, etc...)
Response times (Average, Median, Percentiles depending on the services SLA)
Payload size (Average)
sometimes the number of concurrent requests per endpoint, the number of instances currently running
general metrics like the host's current CPU and memory usage, and uptime.
We gather a lot more metrics, but that's about the bottom line. Since StatsD has become more of a "protocol specification" than a concrete product, there is a myriad of collectors, frontends, and backends to choose from. They help you visualize your system's state, and many of them feature alerts if some metric or combination of metrics goes beyond its threshold.
Let me know if this was helpful!
There are at least three types of things you will need to monitor: the host where the service is deployed, the component itself, and the SLAs. Some of these depend on the software stack you're using as well as on the architecture.
With that said, you could for example use Nagios to monitor the hardware where the services are deployed, and Splunk for the services' metrics/SLAs as well as for any errors that might occur. You can also use SNMP packages in case something goes wrong, if you have a more sophisticated support structure; these would be your triggers. Without knowing how your infrastructure/services are set up, it is hard to go into deeper detail.