I want to understand the role of OpenStack when Kubernetes is deployed on top of it. Will the user be able to access the underlying OpenStack layer in this case? (I mean, can the user create instances, networks, and access any other OpenStack resource?) Or will the user only be provided with Kubernetes offerings? Any link or answer would help.
I can't seem to find this functionality mentioned in any guide.
OpenStack's role in the k8s world is to provide k8s with instances and storage to do its job, just like GCE and Azure.
Kubernetes tries to abstract underlying cloud infrastructure so applications can be ported from one cloud provider to another transparently.
k8s achieves this by defining abstractions like PersistentVolumes and PersistentVolumeClaims, allowing a pod to define a requirement for storage without needing to state that it requires a Cinder volume directly.
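For illustration, here is a minimal sketch of such a claim (the name is hypothetical); the same manifest works whether the cluster satisfies it with a Cinder volume, a GCE Persistent Disk, or an EBS volume:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data            # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi         # only size and access mode are stated, not the backing volume type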
There should be no need to access OpenStack directly from your Kubernetes-based app, unless your app actually needs to manage an OpenStack cluster, in which case you can provide your OpenStack credentials to your app and access the OpenStack API.
Related
Currently, my Kubernetes cluster is provisioned via GKE.
I use GCE Persistent Disks to persist my data.
In GCE, persistent storage is provided via GCE Persistent Disks. Kubernetes supports adding them to Pods, PersistentVolumes, or StorageClasses via the gcePersistentDisk volume/provisioner type.
What if I would like to transfer my cluster from Google to, let's say, Azure or AWS?
Then I would have to change the value of the volume type to azureFile or awsElasticBlockStore, respectively, in every occurrence in the manifest files.
I hoped CSI drivers would solve that problem; unfortunately, they also use a different volume type for each cloud provider, for example pd.csi.storage.gke.io for GCP or disk.csi.azure.com for Azure.
Is there any convenient way to make Kubernetes volumes cloud agnostic, so that I wouldn't have to make any changes in the manifest files before a K8s cluster migration?
You cannot have cloud-agnostic storage by using the CSI drivers or the native volume claims in Kubernetes. That's because these APIs are the upstream way of provisioning storage; each cloud provider has to integrate with them and translate them into its cloud-specific API (Persistent Disk for Google, EBS for AWS, and so on).
The exception is self-managed storage that you can access via an NFS driver or a specific driver for one of the tools mentioned above. Even then, the self-managed storage solution is going to be backed by cloud-provider-specific volumes, so you are just shifting the issue to a different place.
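What you can do is confine the cloud-specific part to a single StorageClass and have every claim reference it by name, so only that one object changes per cloud. A minimal sketch with hypothetical names; the provisioner shown assumes GKE's PD CSI driver:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-rwo-sc              # hypothetical name, kept identical on every cloud
provisioner: pd.csi.storage.gke.io   # swap for disk.csi.azure.com or ebs.csi.aws.com elsewhere
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                     # hypothetical name; unchanged across migrations
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard-rwo-sc
  resources:
    requests:
      storage: 20Gi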
I'm using juicefs-csi in GKE. I use PostgreSQL as the meta store and GCS as the object storage. The corresponding settings are as follows:
node:
  # ...
storageClasses:
- name: juicefs-sc
  enabled: true
  reclaimPolicy: Retain
  backend:
    name: juicefs
    metaurl: postgres://user:password@my-ec2-where-postgre-installed.ap-southeast-1.compute.amazonaws.com:5432/the-database?sslmode=disable
    storage: gs
    bucket: gs://my-bucket
# ...
According to this documentation, I don't have to specify an access key/secret (as I would for S3).
But unfortunately, whenever I try to write anything to the mounted volume (with the juicefs-sc storage class), I always get this error:
AccessDeniedException: 403 Caller does not have storage.objects.create access to the Google Cloud Storage object.
I believe it is related to an IAM role.
My question is, how could I know which IAM user/service account is used by juicefs to access GCS, so that I can assign a sufficient role to it?
Thanks in advance.
EDIT
Step by step:
Download the juicefs-csi Helm chart
Add the values as described in the question and apply them
Create a pod that mounts a PV with the juicefs-sc storage class
Try to read/write a file at the mount point
OK, I misunderstood you at the beginning.
When you are creating a GKE cluster, you can specify which GCP service account will be used by its nodes.
By default it's the Compute Engine default service account (71025XXXXXX-compute@developer.gserviceaccount.com), which lacks a few Cloud product permissions (for Cloud Storage it only has Read Only). This is even indicated by the error message.
If you want to check which service account was set by default on a VM, you can do this via
Compute Engine > VM Instances > choose one of the VMs from this cluster > in the details, find API and identity management
So you have 3 options to solve this issue:
1. During cluster creation
In Node Pools > Security, you have Access scopes, where you can grant additional permissions.
Allow full access to all Cloud APIs grants access to all listed Cloud APIs.
Set access for each API lets you set the access level per API.
In your case you could just use Set access for each API and change Storage to Full.
2. Set permissions with a Service Account
You would need to create a new service account and give it the proper permissions for Compute Engine and Storage. More details about how to create a service account can be found in Creating and managing service accounts.
3. Use Workload Identity
You can enable Workload Identity on your Google Kubernetes Engine (GKE) clusters. Workload Identity allows workloads in your GKE clusters to impersonate Identity and Access Management (IAM) service accounts to access Google Cloud services.
For more details you should check Using Workload Identity.
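As a rough sketch of the Kubernetes side of Workload Identity (all names here are hypothetical), the in-cluster ServiceAccount is annotated with the IAM service account it should impersonate, and the pods that need GCS access run as that ServiceAccount:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: juicefs-gcs-ksa              # hypothetical Kubernetes service account
  namespace: default
  annotations:
    # hypothetical IAM service account; it needs Storage permissions and a
    # roles/iam.workloadIdentityUser binding for this KSA on the GCP side
    iam.gke.io/gcp-service-account: juicefs-gcs@my-project.iam.gserviceaccount.com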
Useful links
Configuring Velero - Velero is backup and restore software; however, steps 2 and 3 are described there. You would just need to adjust the commands/permissions to your scenario.
Authenticating to Google Cloud with service accounts
I developed a web application for our students and I would now like to run it in a Kubernetes container environment. Every user (who could be seen as a tenant) gets their own application environment (a 1:1 relation).
The application environment consists of 2 pods (1x web server, 1x database), defined by a Deployment and a Service.
I am using Kubernetes v1.17.2, and I would like to use dynamic PersistentVolumeClaims together with the possibility to keep a specific user's (tenant's) data between the deletion and re-creation of a pod (e.g. when updating to a new application version or after a hardware reboot).
I thought about using an environment variable at pod creation (e.g. user-1, user-2, user-x, ...) and using this information to allow reuse of a dynamically created PersistentVolume.
Is there any best practice or concept for how this can be achieved?
best regards
shane
The outcome that you wish to achieve will be strongly connected to the solution that you are currently using.
It will differ between Kubernetes instances that are provisioned in the cloud (for example GKE) and Kubernetes instances on premises (for example kubeadm or kubespray).
Regarding the possibility to retain user data, please refer to the official documentation: Kubernetes.io: Persistent volumes reclaiming. It shows a way to retain the data backing a PVC.
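For example, a dynamically provisioning StorageClass can be created with reclaimPolicy: Retain, so the PersistentVolume (and the user's data) is kept even after its claim is deleted. A minimal sketch; the provisioner here is an assumption and has to match your environment:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: retained-storage             # hypothetical name
provisioner: pd.csi.storage.gke.io   # assumption: GKE; use your own provisioner on premises
reclaimPolicy: Retain                # dynamically created PVs survive deletion of their PVC
volumeBindingMode: WaitForFirstConsumer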
Be aware that the local static provisioner does not support dynamic provisioning.
The local volume static provisioner manages the PersistentVolume lifecycle for pre-allocated disks by detecting and creating PVs for each local disk on the host, and cleaning up the disks when released. It does not support dynamic provisioning.
Github.com: Storage local static provisioner
In contrast, VMware vSphere supports dynamic provisioning. If you are using this solution, please refer to its documentation.
Your question lacks a specific explanation of the users in your environment. Are they inside your application or outside of it? Does the application authenticate the users? One solution would be to create users inside Kubernetes via service accounts and limit their view to a namespace created specifically for them.
For service account creation please refer to: Kubernetes.io: Configure service account.
Additionally, you could also look at StatefulSets.
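A StatefulSet with volumeClaimTemplates is a natural fit here: each replica gets its own, stably named PVC that is not deleted when the pod is recreated, so per-tenant data survives updates and reboots. A minimal sketch with hypothetical names and image:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: tenant-env                   # hypothetical name, e.g. one StatefulSet per tenant
spec:
  serviceName: tenant-env
  replicas: 1
  selector:
    matchLabels:
      app: tenant-env
  template:
    metadata:
      labels:
        app: tenant-env
    spec:
      containers:
      - name: database
        image: mariadb:10.4          # hypothetical image; any stateful container works the same way
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:              # one PVC per replica, reattached when the pod is recreated
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 5Gi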
We are working on provisioning our service using Kubernetes, and the service needs to register/unregister some data for scaling purposes. Let's say the service handles long-held transactions: when it starts or scales out, it needs to store the starting and ending transaction IDs somewhere. When it scales out further, it needs to find the next transaction ID and save it together with the ending transaction ID that is covered. When it scales in, it needs to delete the transaction IDs, and so on.
etcd seems to make the cut, as it is used by Kubernetes to store deployment data; it is not only close to Kubernetes, it is actually inside it and maintained by it. We'd therefore like to find out whether it is open for our use. I'd like to ask the question for EKS, AKS, and self-installed clusters. Any advice welcome. Thanks.
Do not use the kubernetes etcd directly for an application.
Access to read/write data in the Kubernetes etcd store is effectively root access to every node in your cluster. Even if you are well versed in etcd v3's role-based security model, avoid sharing that specific etcd instance so you don't increase your cluster's attack surface.
For EKS and GKE, the etcd cluster is hidden behind the managed cluster service so you can't break things. I would assume AKS takes a similar approach, unless they expose to you the instances that run the management nodes.
If the data is small and not heavily updated, you might be able to reuse the Kubernetes etcd store via the Kubernetes API. Create a ConfigMap or a custom resource definition for your data and edit it via the easily securable and namespaced functionality of the Kubernetes API.
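As a sketch of that approach (all names and keys here are hypothetical), the transaction ranges could live in a ConfigMap that the service reads and patches through the Kubernetes API, with ordinary RBAC restricting who may touch it:
apiVersion: v1
kind: ConfigMap
metadata:
  name: txn-ranges                   # hypothetical name
  namespace: my-service              # hypothetical namespace
data:
  replica-0: "start=1000,end=1999"   # illustrative values only
  replica-1: "start=2000,end=2999"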
For most application uses, run your own etcd cluster (or whatever service) to keep Kubernetes free to do its workload scheduling. The CoreOS etcd operator will let you define and create new etcd clusters easily.
I was doing some research but could not really find an answer in the K8s documentation. Is it possible to arrange that certain pods in a Kubernetes cluster have access to certain resources outside of the cluster without giving those permissions to the whole cluster?
For example: a pod accesses data in Google Cloud Storage. To avoid hard-coding credentials, I want it to access the storage via RBAC/IAM, but on the other hand I do not want another pod in the cluster to be able to access the same storage.
This is necessary because users interact with those pods and the data in the storage has privacy restrictions.
The only way I see so far is to create a service account for that resource and pass the credentials of the service account to the pod. So far I am not really satisfied with this solution, as passing around credentials seems to be insecure to me.
Unfortunately, there is only one way to do this, and you wrote that it looks insecure to you. There is an example in the documentation that uses this approach: the service account's credential is stored in a Secret, and the pod then consumes it from that Secret.
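A minimal sketch of that pattern, assuming a Secret named gcs-key already holds the service account's key.json (all names here are hypothetical); only pods that mount this Secret can reach the bucket, so other pods in the cluster are not granted access:
apiVersion: v1
kind: Pod
metadata:
  name: storage-reader               # hypothetical name
spec:
  containers:
  - name: app
    image: gcr.io/my-project/my-app:latest   # hypothetical image
    env:
    - name: GOOGLE_APPLICATION_CREDENTIALS   # standard variable picked up by Google client libraries
      value: /var/secrets/google/key.json
    volumeMounts:
    - name: gcp-key
      mountPath: /var/secrets/google
      readOnly: true
  volumes:
  - name: gcp-key
    secret:
      secretName: gcs-key            # hypothetical Secret containing key.json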