We have been using Kubernetes Secrets to date.
Now we have ConfigMaps as well.
What is the preferred way forward: Secrets or ConfigMaps?
P.S. After a few iterations we have stabilised at the following rule:
ConfigMaps are per solution domain (they can be shared across microservices within the domain, but ultimately they are single-purpose config entries)
Secrets are shared across solution domains and usually represent third-party systems or databases
I'm the author of both of these features. The idea is that you should:
Use Secrets for things which are actually secret, like API keys, credentials, etc.
Use ConfigMaps for non-secret configuration data
In the future, there will likely be some differentiators for Secrets, like rotation or support for backing the Secret API with HSMs, etc. In general, we like intent-based APIs, and the intent is definitely different for secret data vs. plain old configs.
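As a hedged sketch of that split (the resource names and values below are made up for illustration, not taken from the question), one might create one of each from the command line:
# illustrative names and values only
kubectl create secret generic payments-api-credentials --from-literal=api-key=REDACTED
kubectl create configmap app-settings --from-literal=theme=dark --from-literal=base-url=https://example.com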
One notable difference in the implementation shows up with kubectl apply -f (a quick sketch follows below):
ConfigMaps are reported as "unchanged" if the data hasn't changed.
Secrets are always reported as "configured", even if the file hasn't changed.
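For example (the file and object names here are hypothetical; the outputs noted in the comments simply restate the behaviour described above):
kubectl apply -f my-config.yaml   # reports: configmap/my-config unchanged
kubectl apply -f my-secret.yaml   # reports: secret/my-secret configured, even when nothing changed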
Both ConfigMaps and Secrets store data as key-value pairs. The major difference is that Secrets store their data base64-encoded (note that base64 is an encoding, not encryption), whereas ConfigMaps store data in plain text.
If you have critical data such as keys, passwords, service account credentials, or DB connection strings, you should always go for Secrets rather than ConfigMaps.
And if you want to do some application configuration using environment variables which you don't want to keep secret/hidden, like the app theme, the base platform URL, etc., then you can go for ConfigMaps; a minimal manifest for each is sketched below.
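As a minimal sketch of that difference (names and values are illustrative only): a ConfigMap carries plain-text values under data, while a Secret carries base64-encoded values (a Secret also accepts stringData for plain-text input, which the API server encodes for you).
# illustrative manifests; names and values are made up
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-settings
data:
  theme: dark
  base-url: https://example.com
---
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
data:
  password: czNjcjN0   # base64 of "s3cr3t"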
Related
I'm not finding any answers to this on Google but I may just not know what terms to search for.
In a CRD, is there a way to define a field in the spec that is a secret (and therefore shouldn't be stored in plain text)? For example, if the custom resource needs to have an API token included in it, how do you define that in the CRD?
One thought I had was to have the user create a Secret outside of the CRD and then provide the Secret's name in a custom resource field, so the operator can query it from the K8s API on demand when needed (with the associated RBAC configured so the operator has read access to the Secret). The field in the CRD would then just be a normal string holding the name of the target Secret.
But is there a better way? Any existing best practices around this?
You do indeed just store the value in an actual Secret and reference it. You'll find the same pattern all over k8s. Then in your controller code you get your custom object, find the ref, get that secret, and then you have your data.
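A hedged sketch of that pattern (the group, kind, and field names below are invented for illustration): the custom resource carries only the name of the Secret, and the controller resolves it at reconcile time.
# hypothetical custom resource; example.com/ExternalService and apiTokenSecretRef are made-up names
apiVersion: example.com/v1
kind: ExternalService
metadata:
  name: my-service
spec:
  endpoint: https://api.example.com
  apiTokenSecretRef:
    name: my-service-token   # a Secret in the same namespace
    key: token               # the key inside that Secret's data
---
apiVersion: v1
kind: Secret
metadata:
  name: my-service-token
type: Opaque
stringData:
  token: REDACTED
This mirrors how core objects reference Secrets (e.g. a Pod's env valueFrom.secretKeyRef), so users already know the shape.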
I would like to use the Node SDK to implement a backup and restore mechanism between two instances of Cloud Object Storage. I have added a service ID to the instances and granted that service ID permissions to access the buckets in the instance I want to write to. The buckets will be in different regions. I have tried a variety of endpoints, both legacy and non-legacy, private and public, but I usually get Access Denied.
Is what I am trying to do possible with the SDK? If so, can someone point me in the right direction?
var config = {
  "apiKeyId": "xxxxxxxxxxxxxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxx",
  "endpoint": "s3.eu-gb.objectstorage.softlayer.net",
  "iam_apikey_description": "Auto generated apikey during resource-key operation for Instance - crn:v1:bluemix:public:cloud-object-storage:global:a/xxxxxxxxxxx:xxxxxxxxxxx::",
  "iam_apikey_name": "auto-generated-apikey-xxxxxxxxxxxxxxxxxxxxxx",
  "iam_role_crn": "crn:v1:bluemix:public:iam::::serviceRole:Writer",
  "iam_serviceid_crn": "crn:v1:bluemix:public:iam-identity::a/0xxxxxxxxxxxxxxxxxxxx::serviceid:ServiceIdxxxxxxxxxxxxxxxxxxxxxx",
  "serviceInstanceId": "crn:v1:bluemix:public:cloud-object-storage:global:a/xxxxxxxxxxxxxxxxxxx:xxxxxxxxxxxxxxxxxxxxxxxxxx::",
  "ibmAuthEndpoint": "iam.cloud.ibm.com/oidc/token"
};
This should work as long as you can grant the requesting user access to read the source of the put-copy, and as long as you are not using Key Protect based keys.
So the breakdown here is a bit confusing due to some unintuitive terminology.
A service instance is a collection of buckets. The primary reason for having multiple instances of COS is to have more granularity in your billing, as you'll get a separate line item for each instance. The term is a bit misleading, however, because COS is a true multi-tenant system - you aren't actually provisioning an instance of COS, you're provisioning a sort of sub-account within the existing system.
A bucket is used to segment your data into different storage locations or storage classes. Other behavior, like CORS, archiving, or retention, acts at the bucket level as well. You don't want to segment something that you expect to scale (like customer data) across separate buckets, as there's a limit of ~1k buckets in an instance. IBM Cloud IAM treats buckets as 'resources', and they are subject to IAM policies.
Instead, data that doesn't need to be segregated by location or class, and that you expect to be subject to the same CORS, lifecycle, retention, or IAM policies, can be separated by prefix. This means a bunch of similar objects share a path, like foo/bar and foo/bas, which share the prefix foo/. This helps with listing and organization but doesn't provide granular access control or any other sort of policy-esque functionality.
Now, to your question, the answer is both yes and no. If the buckets are in the same instance, there's no problem. Bucket names are unique, so as long as there isn't any secondary managed encryption (e.g. Key Protect) there's no problem copying across buckets, even if they span regions. Keep in mind, however, that large objects will take time to copy, and COS's strong consistency might lead to situations where the operation doesn't return a response until it has completed. Copying across instances is not currently supported.
Using kubectl get with -o yaml on a resource, I see that every resource is versioned:
kind: ConfigMap
metadata:
  creationTimestamp: 2018-10-16T21:44:10Z
  name: my-config
  namespace: default
  resourceVersion: "163"
I wonder what the significance of this versioning is, and for what purposes it is used (use cases)?
A more detailed explanation that helped me understand exactly how this works:
All the objects you’ve created throughout this book—Pods,
ReplicationControllers, Services, Secrets and so on—need to be
stored somewhere in a persistent manner so their manifests survive API
server restarts and failures. For this, Kubernetes uses etcd, which
is a fast, distributed, and consistent key-value store. The only
component that talks to etcd directly is the Kubernetes API server.
All other components read and write data to etcd indirectly through
the API server.
This brings a few benefits, among them a more robust optimistic
locking system as well as validation; and, by abstracting away the
actual storage mechanism from all the other components, it’s much
simpler to replace it in the future. It’s worth emphasizing that etcd
is the only place Kubernetes stores cluster state and metadata.
Optimistic concurrency control (sometimes referred to as optimistic
locking) is a method where instead of locking a piece of data and
preventing it from being read or updated while the lock is in place,
the piece of data includes a version number. Every time the data is
updated, the version number increases. When updating the data, the
version number is checked to see if it has increased between the time
the client read the data and the time it submits the update. If this
happens, the update is rejected and the client must re-read the new
data and try to update it again. The result is that when two clients
try to update the same data entry, only the first one succeeds.
Marko Luksa, "Kubernetes in Action"
So, all the Kubernetes resources include a metadata.resourceVersion field, which clients need to pass back to the API server when updating an object. If the version doesn't match the one stored in etcd, the API server rejects the update.
The main purpose of the resourceVersion on individual resources is optimistic locking. You can fetch a resource, make a change, and submit it as an update, and the server will reject the update with a conflict error if another client has updated it in the meantime (their update would have bumped the resourceVersion, and the value you submit tells the server which version you think you are updating).
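A small sketch of what that looks like from the command line (the object name is hypothetical):
kubectl get configmap my-config -o yaml > cm.yaml   # the saved manifest includes its current resourceVersion
# ...another client updates my-config in the meantime, bumping the resourceVersion stored in etcd...
kubectl replace -f cm.yaml   # rejected with a Conflict error, because the resourceVersion in cm.yaml is now stale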
While I can create custom objects just fine, I am wondering how one is supposed to handle large payloads (Gigabytes) for an object.
CRs are mostly used in order to interface with garbage collection/reference counting in Kubernetes.
Adding the payload via YAML does not work, though (out of memory for large payloads):
apiVersion: "data.foo.bar/v1"
kind: Dump
metadata:
name: my-data
ownerReferences:
- apiVersion: apps/v1
kind: Deploy
name: my-deploy
uid: d9607a69-f88f-11e7-a518-42010a800195
spec:
payload: dfewfawfjr345434hdg4rh4ut34gfgr_and_so_on_...
One could perhaps add the payload to a PV and just reference that path in the CR.
Then I have the problem that it seems like I cannot clean up the payload file should the CR get finalized (I could not find any info about custom finalizers).
I have no clear idea how to integrate such a concept into Kubernetes lifetimes.
In general the limit on size for any Kube API object is ~1 MB due to etcd restrictions, but putting more than 20-30 KB in an object is a bad idea and will be expensive to access (and garbage collection will be expensive as well).
I would recommend storing the data in an object storage bucket and using an RBAC proxy like https://github.com/brancz/kube-rbac-proxy to gate access to the bucket contents (use a URL to the proxy as a reference from your object). That gives you all the benefits of tracking the data in the API, but keeps the object size small. If you want a more complex integration you could implement an aggregated API and reuse the core Kubernetes libraries to handle your API, storing the data in the object store.
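A hedged sketch of what that could look like (the field name and URL below are placeholders, not a real API): the custom object stays tiny and only points at where the payload lives.
apiVersion: "data.foo.bar/v1"
kind: Dump
metadata:
  name: my-data
spec:
  payloadURL: https://kube-rbac-proxy.example.svc/dumps/my-data   # hypothetical proxied object-store location instead of inline data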
We still went with using the custom object (CO). Alongside it, we created a Kubernetes controller which handles the payload's lifetime in the PV. For us this works fine, since the controller can be the single writer to the PV, while the actual services only need read access to it.
Combined with an ownerReference, this makes for a good integration into the Kubernetes object lifecycle.
The Kubernetes dashboard allows one to see Secrets in plain text (not base64-encoded) and make an easy change to any key-value pair within a Secret. I cannot find a way to easily make a similar change on the command line.
My best attempt has been to write a script which uses kubectl get secret to pull all of the data in JSON format, grab each key-value pair, base64-decode the values, update the one I actually want, then feed them all back into kubectl apply. After running into multiple issues I figured there is probably a kubectl option that I'm overlooking which will allow me to update just one key-value pair in a given Secret.
How can I do this?
Usually you would have your Secret manifest stashed somewhere secure.
I don't usually change Secrets using the dashboard, but instead do a kubectl apply -f mysecret.yaml. The mysecret.yaml file keeps the latest and greatest values. No in-place editing. This way you get consistency across deployments.
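A hedged sketch of that workflow, plus a one-off in-place alternative for the original single-key question (the file, Secret, and key names are hypothetical):
# preferred: keep mysecret.yaml as the source of truth and re-apply it
echo -n 'new-value' | base64      # produce the encoded value to put under .data in mysecret.yaml
kubectl apply -f mysecret.yaml
# one-off alternative: patch a single key in place
kubectl patch secret my-secret -p '{"data":{"password":"'"$(echo -n 'new-value' | base64)"'"}}'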