GKE with Hashicorp Vault - Possible to use Google Cloud Run? - kubernetes

I'm looking into deploying a cluster on Google Kubernetes Engine in the near future. I've also been looking into using Vault by Hashicorp in order to manage the secrets that my cluster has access to. Specifically, I'd like to make use of dynamic secrets for greater security.
However, all of the documentation and YouTube videos that cover this type of setup always mention that a set of nodes strictly dedicated to Vault should operate as their own separate cluster - thus requiring more VMs.
I am curious if a serverless approach is possible here. Namely, using Google Cloud Run to create Vault containers on the fly.
This video (should start at the right time) mentions that Vault can be run as a Deployment, so I don't see there being an issue with state. And since Google mentions that each Cloud Run service gets its own stable HTTPS endpoint, I believe that I can simply pass this endpoint to my configuration and all of the pods will be able to find the service, even if new instances are created. However, I'm new to using Kubernetes, so I'm not sure if I'm entirely correct here.
Can anyone with more experience using Kubernetes and/or Vault point out any potential drawbacks with this approach? Thank you.

In beta for three weeks now, and not yet officially announced (it should be in a couple of days), you can have a look at Secret Manager. It's a serverless secret manager with, I think, all the basic requirements that you need.
The main reason it hasn't been announced yet is that the client libraries in several languages aren't yet released/finished.
The awesome guy on your video link, Seth Vargo, has been involved in this project.
He has also released Berglas. It's written in Go, uses KMS to encrypt the secrets, and uses Google Cloud Storage to store them. I also recommend it.
I built a Python library to make it easy to use Berglas secrets in Python.
Hope that this secret management tool meets your expectations. In any case, it's serverless and quite cheap!
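For the programmatic route, here is a minimal sketch of reading a secret with the Secret Manager Python client. The project and secret ids are placeholders, and the client import is kept inside the fetch function since, as noted above, the client libraries were still being finalized:

```python
# Sketch: reading a secret version from Google Cloud Secret Manager.
# "my-project" / "db-password" below are illustrative placeholders.

def secret_version_name(project_id: str, secret_id: str, version: str = "latest") -> str:
    """Build the canonical resource name that Secret Manager expects."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version}"

def access_secret(project_id: str, secret_id: str, version: str = "latest") -> str:
    """Fetch and decode a secret payload.

    Requires the google-cloud-secret-manager package and credentials
    holding roles/secretmanager.secretAccessor on the secret.
    """
    from google.cloud import secretmanager  # lazy import: needs the client library
    client = secretmanager.SecretManagerServiceClient()
    response = client.access_secret_version(
        request={"name": secret_version_name(project_id, secret_id, version)}
    )
    return response.payload.data.decode("utf-8")
```

In a Cloud Run service you would typically call `access_secret` once at startup and keep the value in memory for the life of the instance.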

Related

Automated container-based solution (kubernetes) deployments for customers

Hello fellow developers.
I have the following problem:
I have a solution that relies on a few services, each having a container. The way I provide the solution at the moment is to create a manual deployment for each client, each time, using cloud services. The requirements include scalability, fault tolerance, cost tracking, and access management.
However, instead of creating a deployment manually every time for each client, I would like them to go through another service or app so the new requirements are:
Dashboard to issue a deployment.
Dashboard gives overview of health of services.
Dashboard tracks costs and issues API keys automatically.
My first thought is using kubernetes and creating deployments issued by a user through the dashboard.
I have little experience with Kubernetes, so I would appreciate anyone pointing me in a direction rather than necessarily explaining everything.
Thanks all
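One way to sketch the "deployment issued from a dashboard" idea is to have the dashboard backend build a per-client Deployment manifest and submit it to the Kubernetes API (e.g. with the official Python client). The names and labels below are purely illustrative; the per-client label is what would let you track costs and scope access later:

```python
# Hypothetical sketch: a dashboard backend building a per-client
# Deployment manifest. Names, labels, and images are placeholders.

def client_deployment(client_id: str, image: str, replicas: int = 2) -> dict:
    """Return a Deployment manifest isolating one client's workload.

    The "client" label tags every pod with the client id, so cost
    tracking and access policies can key off it.
    """
    labels = {"app": "solution", "client": client_id}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": f"solution-{client_id}", "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": "app", "image": image}]},
            },
        },
    }
```

The dashboard would hand this dict to something like the Python client's `create_namespaced_deployment`, or serialize it to YAML for `kubectl apply`.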

How can I migrate a live production kubernetes cluster to another cluster while minimizing downtime?

I would like to migrate an application from one GKE cluster to another, and I'm wondering how to accomplish this while avoiding any downtime for this process.
The application is an HTTP web backend.
The way I'd usually handle this in a non-GCP/K8s context is to have a load balancer in front of the application, set up a new web backend, and then just update the appropriate IP address in the load balancer to point from the old IP to the new IP. This would have essentially zero downtime while also allowing for a seamless rollback if anything goes wrong.
I don't see why this shouldn't work in this context as well; however, I'm not 100% sure. And if there is a more robust or alternative (GCP/GKE-friendly) way to do this, I'd like to investigate that.
So to summarize my question: does GCP/GKE support this type of migration functionality? If not, are there any implications I need to be aware of with my usual load balancer approach mentioned above?
The reason for migrating is that the current k8s cluster is running quite an old version (1.18), and if I do a GKE version upgrade to something more recent like 1.22, I suspect a lot of incompatibilities as well as risk.
I see 2 approaches:
In the new cluster get a new IP address and update the DNS record to point to the new load balancer
See if you can switch to Multi-cluster gateways, however that would probably require you to use approach 1 to switch to multi-cluster gateways as well: https://cloud.google.com/kubernetes-engine/docs/how-to/deploying-multi-cluster-gateways
A few pain points you'll run into:
As someone used to DIY Kubernetes, I hate GKE's managed Ingress certs, because they make it very hard to pre-provision HTTPS certs on the new cluster in advance. (GKE's de facto method of provisioning HTTPS certs is to update DNS to point to the LB, and then wait 10-60 minutes. That means if you cut over to a new cluster, the new cluster's HTTPS cert, supplied by a ManagedCertificate custom resource, won't be ready in advance.)
It is possible to pre-provision HTTPS certs using an ACME DNS-01 challenge on GCP, but it's poorly documented and an awful user experience: there's no GUI, and the CLI API is terrible.
You can do it using gcloud services enable certificatemanager.googleapis.com, but I'd highly recommend against that Certificate Manager service, which went GA in June 2022. The UX is painful.
GKE's official docs are pretty bad when it comes to this scenario
You basically want to do 2 things:
Follow this how to guide for a zero downtime HTTPS cutover from cluster1 to cluster2 by leveraging Lets Encrypt Free Cert
https://gist.github.com/neoakris/4aafeac7628995da8dd423f1702c975b
(I know link-only answers are bad, but it's GitHub (great uptime), and it's way too long and nuanced to post here.)
Use Velero to migrate workloads from cluster1 to cluster2 (it can migrate CRDs, CRs, generic YAML objects, and PVs/PVCs). One thing of note is that Velero works best when you're migrating between clusters of the same version; if you go from a really old version to a really new version, you could run into issues where Kubernetes YAML APIs were removed in the new version. Going from an old version to a new version can be done, but it's best left to an experienced hand. For happy-path results, migrating between clusters of the same version is best.
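Whichever cutover approach you pick, it's worth confirming that the new cluster's load balancer actually answers correctly before you flip DNS. Here is a small generic poller as a sketch; the actual HTTP(S) check against the new LB (with the right Host header) is left as whatever callable fits your backend:

```python
import time

def wait_until_healthy(check, attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll `check` until it returns True, or give up.

    `check` is any zero-argument callable that returns True once the new
    cluster's endpoint serves the expected response (e.g. an HTTPS GET
    against the new LB IP with the production Host header).
    """
    for _ in range(attempts):
        if check():
            return True
        time.sleep(delay)
    return False
```

Only once this passes (including the cert being served) would you update the DNS record in approach 1, keeping the old cluster running until TTLs have expired so rollback stays a one-line DNS change.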

How to set up a GKE cluster whose pods communicate with Cloud SQL, with the Cloud SQL password stored in Google Cloud Secret Manager

I am trying to set up Google Kubernetes Engine, and its pods have to communicate with a Cloud SQL database. The Cloud SQL database credentials are stored in Google Cloud Secret Manager. How will pods fetch the credentials from Secret Manager, and if the Secret Manager credentials are updated, how will the pod get the new secret?
How do I set up the above? Can someone please help with this?
Thanks,
Anand
You can make your deployed application get the secret (password) programmatically from Google Cloud Secret Manager. You can find an example in many languages at the following link: https://cloud.google.com/secret-manager/docs/samples/secretmanager-access-secret-version
But first, make sure that your GKE setup, more specifically your application, is able to authenticate to Google Cloud Secret Manager. The following links can help you choose the appropriate approach:
https://cloud.google.com/kubernetes-engine/docs/tutorials/authenticating-to-cloud-platform
https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity
You can find information regarding that particular solution in this doc.
There are also good examples on medium here and here.
To answer your question regarding updating the secrets:
Usually secrets are pulled when the container is created, but if you expect the credentials to change often (or the pods to stick around for a very long time), you can adjust the code to re-read the secret on every execution.
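Re-fetching on literally every execution can get expensive, so one common middle ground is a small TTL cache around whatever fetch call you use (the Secret Manager lookup, for example). This is an illustrative sketch, not a feature of the official client:

```python
import time

class CachedSecret:
    """Re-fetch a secret after `ttl` seconds, so long-lived pods pick up
    rotated credentials without restarting.

    `fetch` is any zero-argument callable returning the current secret
    value, e.g. a Secret Manager access_secret_version lookup.
    """
    def __init__(self, fetch, ttl: float = 300.0):
        self._fetch = fetch
        self._ttl = ttl
        self._value = None
        self._expires = 0.0  # force a fetch on first use

    def get(self):
        now = time.monotonic()
        if now >= self._expires:
            self._value = self._fetch()
            self._expires = now + self._ttl
        return self._value
```

With this, a rotated database password takes effect within the TTL; the application still needs to handle one failed connection attempt gracefully (retry with a freshly fetched value) during the rotation window.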

Is testing on OpenShift Container Platform (OCP) equivalent to testing on Openshift Origin from a kubernetes standpoint?

This applies to applications which are programmed to use the Kubernetes API.
Should we assume that OpenShift Container Platform, from a Kubernetes standpoint, matches all the standards that OpenShift Origin (and Kubernetes) does?
Background
Compatibility testing for cloud-native apps that are shipped can involve a large matrix. It seems that, as a baseline, if OCP is meant to be a pure Kubernetes distribution with add-ons, then testing against it is trivial, so long as you are only using core Kubernetes features.
Alternatively, if shipping an app with support on OCP means you must test on OCP, that would imply that either (1) the app uses OCP functionality, or (2) the app uses kube functionality which may not work correctly in OCP, which should be considered a bug.
In practice you should be able to regard OpenShift Container Platform (OCP) as being the same as OKD (previously known as Origin). This is because it is effectively the same software and setup.
In comparing both of these to plain Kubernetes there are a few things you need to keep in mind.
The OpenShift distribution of Kubernetes is set up as a multi-tenant system, with a clear distinction between normal users and administrators. This means RBAC is set up so that a normal user is restricted in what they can do out of the box. A normal user cannot, for example, create arbitrary resources which affect the whole cluster. They also cannot run images that only work when run as root, as they run under a default service account which doesn't have such rights. That default service account also has no access to the REST API. A normal user has no privileges to enable the ability to run such images as root. A user who is a project admin could allow an application to use the REST API, but what it can do via the REST API will be restricted to the project/namespace it runs in.
So if you develop an application on Kubernetes where you have an expectation that you have full admin access, and can create any resources you want, or assume there is no RBAC/SCC in place that will restrict what you can do, you can have issues getting it running.
This doesn't mean you can't get it working, it just means that you need to take extra steps so you or your application is granted extra privileges to do what it needs.
This is the main area where people have issues and it is because OpenShift is setup to be more secure out of the box to suit a multi-tenant environment for many users, or even to separate different applications so that they cannot interfere with each other.
The next thing worth mentioning is Ingress. When Kubernetes first came out, it had no concept of Ingress. To fill that hole, OpenShift implemented the concept of Routes. Ingress only came much later, and was based in part on what was done in OpenShift with Routes. That said, there are things you can do with Routes which I believe you still can't do with Ingress.
Anyway, obviously, if you use Routes, that only works on OpenShift, as a plain Kubernetes cluster only has Ingress. If you use Ingress, you need to be using OpenShift 3.10 or later. In 3.10 there is an automatic mapping of Ingress to Route objects, so Ingress should work even though OpenShift actually implements Ingress under the covers using Routes and its haproxy router setup.
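If an app has to run on both, one portable pattern is to discover at runtime whether the cluster serves the route.openshift.io API group (via the discovery API, or `kubectl api-versions` on the command line) and fall back to Ingress otherwise. A minimal sketch of just that decision, with the discovery call itself left to whichever client you use:

```python
def choose_exposure(api_groups) -> str:
    """Pick the exposure resource kind based on which API groups the
    cluster serves; OpenShift clusters expose route.openshift.io,
    while plain Kubernetes clusters do not.
    """
    groups = set(api_groups)
    if "route.openshift.io" in groups:
        return "Route"
    return "Ingress"
```

In practice you'd feed this the group names returned by the cluster's discovery endpoint, and then create either a Route or an Ingress object accordingly.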
There are obviously other differences as well. OpenShift has DeploymentConfig because Kubernetes originally had no Deployment. Again, there are things you can do with DeploymentConfig that you can't do with Deployment, but the Deployment object from Kubernetes is supported. One difference with DeploymentConfig is how it works with ImageStream objects in OpenShift, which don't exist in Kubernetes. If you stick with Deployment/StatefulSet/DaemonSet and avoid the OpenShift objects that were created before Kubernetes had such features, you should be fine.
Do note, though, that OpenShift takes a conservative approach to some resource types, so they may not be enabled by default. This applies to things that are still regarded as alpha, or are otherwise in very early development and subject to change. You should avoid things which are still in development even when using plain Kubernetes.
That all said, for the core Kubernetes bits, OpenShift is verified for conformance against CNCF tests for Kubernetes. So use what is covered by that and you should be okay.
https://www.cncf.io/certification/software-conformance/

Google Container Engine Subnets

I'm trying to isolate services from one another.
Suppose ops-human has a bunch of mysql stores running on Google Container Engine, and dev-human has a bunch of node apps running on the same cluster. I do NOT want dev-human to be able to access any of ops-human's mysql instances in any way.
Simplest solution: put both of these in separate subnets. How do I do such a thing? I'm open to other implementations as well.
The Kubernetes Network SIG has been working on the network isolation issue for a while, and there is an experimental API in Kubernetes 1.2 to support this. The basic idea is to provide network policy on a per-namespace basis. A third-party network controller can then react to changes to resources and enforce the policy. See the last blog post on this topic for the details.
EDIT: This answer is more about the open-source kubernetes, not for GKE specifically.
The NetworkPolicy resource is now available in GKE in alpha clusters (see latest blog post).
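For reference, the API described above later graduated to networking.k8s.io/v1. Here is a sketch of the namespace-isolation idea expressed as a manifest built in Python, using that stable shape (which postdates this answer); the namespace and label names are illustrative:

```python
def team_isolation_policy(namespace: str, team: str) -> dict:
    """NetworkPolicy manifest admitting ingress to all pods in this
    namespace only from pods labelled with the same team; with a
    policy-enforcing network plugin, all other ingress is denied.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"allow-{team}-only", "namespace": namespace},
        "spec": {
            "podSelector": {},  # empty selector: applies to every pod in the namespace
            "ingress": [
                {"from": [{"podSelector": {"matchLabels": {"team": team}}}]}
            ],
        },
    }
```

Applied to the ops namespace with team=ops, this would block the dev apps from reaching the mysql pods, provided the cluster's network plugin enforces NetworkPolicy.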