CloudSQL Proxy on GKE : Service vs Sidecar - kubernetes

Does anyone know the pros and cons for installing the CloudSQL-Proxy (that allows us to connect securely to CloudSQL) on a Kubernetes cluster as a service as opposed to making it a sidecar against the application container?
I know that it is mostly used as a sidecar. I have used it as both (in non-production environments), but I never understood why sidecar is more preferable to service. Can someone enlighten me please?

The sidecar pattern is preferred because it is the easiest and more secure option. Traffic to the Cloud SQL Auth proxy is not encrypted or authenticated, and relies on the user to restrict access to the proxy (typically be running local host).
When you run the Cloud SQL proxy, you are essentially saying "I am user X and I'm authorized to connect to the database". When you run it as a service, anyone that connects to that database is connecting authorized as "user X".
You can see this warning in the Cloud SQL proxy example running as a service in k8s, or watch this video on Connecting to Cloud SQL from Kubernetes which explains the reason as well.

The Cloud SQL Auth proxy is the recommended way to connect to Cloud SQL, even when using private IP. This is because the Cloud SQL Auth proxy provides strong encryption and authentication using IAM, which can help keep your database secure.
When you connect using the Cloud SQL Auth proxy, the Cloud SQL Auth proxy is added to your pod using the sidecar container pattern. The Cloud SQL Auth proxy container is in the same pod as your application, which enables the application to connect to the Cloud SQL Auth proxy using localhost, increasing security and performance.
As sidecar is a container that runs on the same Pod as the application container, because it shares the same volume and network as the main container, it can “help” or enhance how the application operates. In Kubernetes, a pod is a group of one or more containers with shared storage and network. A sidecar is a utility container in a pod that’s loosely coupled to the main application container.
Sidecar Pros: Scales indefinitely as you increase the number of pods. Can be injected automatically. Already used by serviceMeshes.
Sidecar Cons: A bit difficult to adopt, as developers can't just deploy their app, but deploy a whole stack in a deployment. It consumes much more resources and it is harder to secure because every Pod must deploy the log aggregator to push the logs to the database or queue.
Refer to the documentation for more information.

Related

Flink native kubernetes deployment

I have some limitations with the rights required by Flink native deployment.
The prerequisites say
KubeConfig, which has access to list, create, delete pods and **services**, configurable
Specifically, my issue is I cannot have a service account with the rights to create/remove services. create/remove pods is not an issue. but services by policy only can be created within an internal tool.
could it be any workaround for this?
Flink creates two service in native Kubernetes integration.
Internal service, which is used for internal communication between JobManager and TaskManager. It is only created when the HA is not enabled. Since the HA service will be used for the leader retrieval when HA enabled.
Rest service, which is used for accessing the webUI or rest endpoint. If you have other ways to expose the rest endpoint, or you are using the application mode, then it is also optional. However, it is always be created currently. I think you need to change some codes to work around.

Usage of custom resource definition (CRD) vs Service Catalog in k8s

I recently started to explore k8s extensions and got introduced to two concepts:
CRD.
Service catalogs.
They look pretty similar to me. The only difference to my understanding is, CRDs are deployed inside same cluster to be consumed; whereas, catalogs are deployed to be exposed outside the cluster for example as database service (client can order cluster of mysql which will be accessible from his cluster).
My query here is:
Is my understanding correct? if yes, can there be any other scenario where I would like to create catalog and not CRD.
Yes, your understanding is correct. Taken from official documentation:
Example use case
An application developer wants to use message queuing as part of their application running in a Kubernetes cluster.
However, they do not want to deal with the overhead of setting such a
service up and administering it themselves. Fortunately, there is a
cloud provider that offers message queuing as a managed service
through its service broker.
A cluster operator can setup Service Catalog and use it to communicate
with the cloud provider’s service broker to provision an instance of
the message queuing service and make it available to the application
within the Kubernetes cluster. The application developer therefore
does not need to be concerned with the implementation details or
management of the message queue. The application can simply use it as
a service.
With CRD you are responsible for provisioning resources, running backend logic and so on.
More info can be found in this KubeCon 2018 presentation.

Whitelist traffic to mysql from a kubernetes service

I have a Cloud MySQL instance which allows traffic only from whitelisted IPs. How do I determine which IP I need to add to the ruleset to allow traffic from my Kubernetes service?
The best solution is to use the Cloud SQL Proxy in a sidecar pattern. This adds an additional container into the pod with your application that allows for traffic to be passed to Cloud SQL.
You can find instructions for setting it up here. (It says it's for GKE, but the principles are the same)
If you prefer something a little more hands on, this codelab will walk you through taking an app from local to on a Kubernetes Cluster.
I am using Google Cloud Platform, so my solution was to add the Google Compute Engine VM instance External IP to the whitelist.

Disadvantages of using eureka for Service Discovery with kubernetes

Context
I am deploying a set of services that are containerised using Docker into AWS. No matter which deployment solution is chosen (e.g. raw EC2/ECS/Elastic Beanstalk/Fargate) we will face the issue of "service discovery".
To name just a few of the options for service discovery that I've considered:
AWS Route 53 Service Registry
Kubernetes
Hashicorp Consul
Spring Cloud Netflix Eureka
Specifics Of My Stack
I am developing Java Spring Boot applications using Spring Cloud with the target deployment environment being AWS.
Given that my stack is Spring based, spring cloud eureka made sense to me while developing locally. It was easy to set up a single node, integrates well with the stack and ecosystem of choice and required very little set up.
Locally, we are using docker compose (not swarm) to deploy services - one of the containers deployed is a single node Eureka service discovery server.
However, when we progress outside of local development and into staging or production environment we are considering options like Kubernetes.
My Own Assessment Of Pros/Cons
AWS Route 53 Service Registry
Requires us to couple code specifically to AWS services. Not a problem per se, we are quite tied in anyway on other parts of the stack (SNS/SQS).
Makes running the stack locally slightly more difficult as it relies on Route 53, I suppose we could open up a certain hosted zone for local development.
AWS native, no managing service registries or extra "moving parts".
Spring Cloud Eureka
Downside is that thus requires us to deploy and manage a high availability service registry cluster and requires more resources. Another "moving part" to manage.
Advantages are that it fits into our stack well (spring ecosystem, spring boot, spring cloud, feign and zuul work well with this). Also can be run locally trivially.
I presume we need to configure the networks and registry zone to ensure that that clients publish their host address rather and docker container internal IP address. e.g. if service A is on host A and wants to talk to service B on host B, service B needs to advertise its EC2 address rather than some internal docker IP.
Questions
If we use Kubernetes for orchestration, are there any disadvantages to using something like Spring Cloud Eureka over the built in service discovery options described here https://kubernetes.io/docs/concepts/services-networking/service/#discovering-services
Given Kube provides this, it seems suboptimal to then use eureka deployed using kube to perform discovery. I presume kube can make some optimisations that impact avaialbility and stability that might nit be possible using eureka. e.g kube would know when deploying a new service - eureka will have to rely on heartbeats/health checks and depending on how that is configured (e.g. frequency) this could result in stale records whereas i presume kube might not suffer from this for planned service shutdown/restarts. I guess it still does for unplanned failures such as a host failure or network partition.
Does anyone have any advice on this, do people use services like Kubernetes but use other mechanisms for service discovery rather than those provided by kube. Is there a good reason to do one or the other?
Possible Challenges I Anticipate
We could replace eureka, but relying on Kube to perform discovery will mean that we need to run kube locally to deploy whereas currently we have a simple tiny docker-compose file. Also, I'll have to look at how easy it'll be to ensure that ribbon, zuul and feign play nicely with this.
Currently we have ribbon configured with a eureka client so that service A can server to service B just as "service-b" for example and have ribbon resolve a healthy host via a eureka client. I guess we can configure ribbon to not use eureka and use an external Kube service name which will be resolved by Kube DNS at runtime...
Final Note
Thanks in advance for any contribution or advice. I know this might elicit a primarily opinion focused response. But I am hoping someone can provide objective guidance on when one solution might be preferable to another.
Service discovery is something you get out-of-the-box with Kubernetes. So having another external service in your platform will be another application to maintain, deploy and can be a point of failure. So I would stick with the the service discovery provided by Kubernetes.

Should my database server be a Pod on the same Service

In terms of providing a url (to a postgres database) for my web server. Should the postgres database be behind it's own Service or is it okay for it to be a Pod on the same Service as the web server?
Can I configure a Pod to have a FQDN that doesn't change?
Its absolutely fine and I would say recommended to keep the database behind its own service in k8s.
The database would need to be backed by a persistent volume as well.
You can reference the service in other webserver/application pods.
As long as you expose the service properly, FQDN should work.
"This is one of the simpler methods, you could evolve based on your network design"