Max size of environment variables in Kubernetes

What is the maximum size allowed for an environment variable (pod -> container -> env) in Kubernetes, assuming base Ubuntu containers? I am unable to find the relevant documentation. The question might seem stupid, but I do need the info to make my design robust.

So at a bare minimum there is a 1,048,576-byte limit imposed:
The ConfigMap "too-big" is invalid: []: Too long: must have at most 1048576 characters
which I generated as:
cat > too-big.yml <<FOO
apiVersion: v1
kind: ConfigMap
metadata:
  name: too-big
data:
  kaboom.txt: |
    $(python -c 'print("x" * 1024 * 1024)')
FOO
And when I try that same stunt with a Pod, I'm met with a very similar outcome:
containers:
  - image: ubuntu:18.10
    env:
      - name: TOO_BIG
        value: |
          $(python -c the same print)
standard_init_linux.go:178: exec user process caused "argument list too long"
So I would guess it's somewhere in between those two numbers: 0 and 1048576
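For context on where that error comes from (a sketch, not an official Kubernetes limit): "argument list too long" is the Linux kernel's execve() rejecting the environment, and you can poke at the relevant limits yourself. The figures below are common Linux defaults, not guarantees:
# Total budget for argv + environment strings (often 2 MiB with the default stack limit):
kubectl run argmax --rm -it --restart=Never --image=ubuntu:18.10 -- getconf ARG_MAX

# A single environment string is capped much lower (MAX_ARG_STRLEN, commonly
# 32 pages = 131072 bytes), which is what a 1 MiB value trips over.
# Reproducible in any Linux shell:
BIG="$(head -c 1048576 /dev/zero | tr '\0' x)" /bin/true
# -> bash: /bin/true: Argument list too long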
That said, as the practically duplicate question answered, you are very, very likely solving the wrong problem. The very fact that you have to come to a community site to ask such a question means you are bringing risk to your project that it will work one way on Linux, another way in Docker, another way on Kubernetes, and a different way on macOS.

Related

helmfile best practices with multiple customers

We would like some recommendations, since we want to integrate helmfile into our deployment process.
Our infrastructure has the following details:
we have many customers
all customers have the same installed services (each customer gets its own services, no sharing between customers)
credentials are different for each customer
we prefer a separate deployment process (we don't want to upgrade all customers at the same time)
all customer-config data is separated into separate config files, like:
config/customer1.yaml
config/customer2.yaml
config/customer3.yaml
So I'm wondering if we should use "Environment" with the customer name to upgrade it, or would you recommend another variable?
And do you think it's better to create multiple helmfiles for this process, or just one?
Thank you!
do you think it's better to create multiple helmfiles for this process, or just one?
Using one helmfile for multiple environments is quite practical and saves you writing multiple helmfiles.
we should use "Environment" with the customer name?
For a similar setup (deploying to multiple environments with different values and configurations), I have this in my Helmfile:
- name: my-app
  namespace: "{{ .Namespace }}"
  chart: k4r-distance-matrix-api
  values:
    - my-app/values.yaml                          ## common values, if any exist
    - my-app/values.{{ .Environment.Name }}.yaml  ## environment-specific values
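For the multi-customer case in the question, a minimal sketch of the matching environments block could look like the following (the environment names and the config/customerN.yaml paths come from the question; whether credentials live in these files or in encrypted secrets files is a separate decision):
environments:
  customer1:
    values:
      - config/customer1.yaml
  customer2:
    values:
      - config/customer2.yaml
  customer3:
    values:
      - config/customer3.yaml
With that in place, helmfile -e customer2 sync touches only customer2, which matches the requirement of not upgrading all customers at the same time.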
In the deploy step in my CI I have:
.deploy:
  stage: deploy
  variables:
    ENVIRONMENT: ""
    CONTEXT: ""
    NAMESPACE: ""
  before_script:
    - kubectl config use-context $CONTEXT
  script:
    - helmfile -e "$ENVIRONMENT" --namespace "$NAMESPACE" sync

How to reference other environment variables in a Helm values file?

I have a Helm values file with content like the following:
envs:
  - name: PACT_BROKER_DATABASE_NAME
    value: $POSTGRES_DBNAME
I want to reference another variable, $POSTGRES_DBNAME, and feed it into PACT_BROKER_DATABASE_NAME. The current value does not work. How do I feed the value of one variable into another?
I was looking for a way to "reference another variable" in the Helm values section; Google landed me here, so I'm posting this as an answer in case it helps someone else.
I was looking for a way to set the heap allocation based on the pod's memory limit.
env:
  - name: MEMORY_LIMIT
    valueFrom:
      resourceFieldRef:
        resource: limits.memory
  - name: NODE_OPTIONS
    value: "--max-old-space-size=$(MEMORY_LIMIT)"
The key is the $(MEMORY_LIMIT) syntax: Kubernetes expands $(VAR_NAME) references to variables defined earlier in the same container's env list. So $POSTGRES_DBNAME needs to be defined in the same pod/container env and referenced as $(POSTGRES_DBNAME).
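Applied to the question's case, a minimal sketch (assuming the chart passes envs: straight through to the container's env, and using a placeholder value for the database name):
env:
  - name: POSTGRES_DBNAME            # must be defined before it is referenced
    value: pact_broker               # placeholder value
  - name: PACT_BROKER_DATABASE_NAME
    value: "$(POSTGRES_DBNAME)"      # Kubernetes expands $(VAR) at container start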

How can I interact with different Kubernetes clusters in different terminal sessions without having to switch contexts all the time?

I am testing role differences right now, so I have a context set up for each role.
In terminal session Admin, I want to be able to use the Admin context so I can update the rules as needed.
In terminal session User, I want to be able to test that role via its context.
(Note: I am on EKS so roles map to IAM roles)
Well, I am an idiot.
Natively, there is no answer in the --help output for kubectl; however, there is output for this in the man page.
All one has to do is throw the --context flag into their command.
However, the below-mentioned kubectx tool is what I use day to day now.
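For example (the context names admin and user are assumptions; substitute whatever kubectl config get-contexts lists):
kubectl config get-contexts            # list the contexts in your kubeconfig
kubectl --context admin get pods -A    # act with the Admin role
kubectl --context user get pods        # act with the User role, no switching needed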
Here are some tips for managing multiple kubectl contexts:
Use asdf to manage multiple kubectl versions
Set the KUBECONFIG env var to change between multiple kubeconfig files
Use kube-ps1 to keep track of your current context/namespace
Use kubectx and kubens to change fast between clusters/namespaces
Use aliases to combine them all together
Take a look at this article; it explains how to accomplish this: Using different kubectl versions with multiple Kubernetes clusters. (Disclaimer: I wrote the mentioned article.)
I also recommend these reads: Mastering the KUBECONFIG file and Configure Access to Multiple Clusters.
Now there is kubie (https://github.com/sbstp/kubie); it does it all.
You can create a copy of your kubeconfig file, which is located at ~/.kube/config, and in two different shells point to the two different config files: export KUBECONFIG=/path/to/kubeconfig1 in the first and export KUBECONFIG=/path/to/kubeconfig2 in the second. You can then edit those files to have two different contexts selected.
To easily select contexts/switch between them, you can use kubectx, as suggested by Blokje5.
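A minimal sketch of that setup (file paths and context names are illustrative):
cp ~/.kube/config ~/.kube/config-admin
cp ~/.kube/config ~/.kube/config-user

# Terminal 1: pinned to the Admin context
export KUBECONFIG=~/.kube/config-admin
kubectl config use-context admin

# Terminal 2: pinned to the User context
export KUBECONFIG=~/.kube/config-user
kubectl config use-context user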
I always like kubectx as a way to quickly switch contexts. If you correctly set up your contexts with the aws-iam-authenticator, like so:
users:
  - name: kubernetes-admin
    user:
      exec:
        apiVersion: client.authentication.k8s.io/v1alpha1
        command: aws-iam-authenticator
        args:
          - "token"
          - "-i"
          - "<cluster_id>"
          - "-r"
          - "<admin_role_arn>"
  - name: kubernetes-user
    user:
      exec:
        apiVersion: client.authentication.k8s.io/v1alpha1
        command: aws-iam-authenticator
        args:
          - "token"
          - "-i"
          - "<cluster_id>"
          - "-r"
          - "<user_role_arn>"
This should allow you to easily switch contexts. (Note: This assumes an assume-role type situation. You can also pass AWS_PROFILE to the aws-iam-authenticator instead.)
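For completeness, a sketch of the matching contexts section (the cluster and context names are assumptions), so that kubectx admin / kubectx user, or plain kubectl config use-context, switches between the two roles:
contexts:
  - name: admin
    context:
      cluster: my-eks-cluster
      user: kubernetes-admin
  - name: user
    context:
      cluster: my-eks-cluster
      user: kubernetes-user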

Security: YAML bomb: user can restart kube-api by sending a ConfigMap

Create yaml-bomb.yaml file:
apiVersion: v1
data:
  a: &a ["web","web","web","web","web","web","web","web","web"]
  b: &b [*a,*a,*a,*a,*a,*a,*a,*a,*a]
  c: &c [*b,*b,*b,*b,*b,*b,*b,*b,*b]
  d: &d [*c,*c,*c,*c,*c,*c,*c,*c,*c]
  e: &e [*d,*d,*d,*d,*d,*d,*d,*d,*d]
  f: &f [*e,*e,*e,*e,*e,*e,*e,*e,*e]
  g: &g [*f,*f,*f,*f,*f,*f,*f,*f,*f]
  h: &h [*g,*g,*g,*g,*g,*g,*g,*g,*g]
  i: &i [*h,*h,*h,*h,*h,*h,*h,*h,*h]
kind: ConfigMap
metadata:
  name: yaml-bomb
  namespace: default
Send the ConfigMap creation request to the Kubernetes API with kubectl apply -f yaml-bomb.yaml.
kube-apiserver CPU/memory usage spikes very high, and the process may even get restarted.
How do we prevent such yaml-bomb?
This is a billion laughs attack and can only be fixed in the YAML processor.
Note that Wikipedia is wrong here when it says:
A "Billion laughs" attack should exist for any file format that can contain references, for example this YAML bomb:
The problem is not that the file format contains references; it is the processor expanding them. This is against the spirit of the YAML spec which says that anchors are used for nodes that are actually referred to from multiple places. In the loaded data, anchors & aliases should become multiple references to the same object instead of the alias being expanded to a copy of the anchored node.
As an example, compare the behavior of the online PyYAML parser and the online NimYAML parser (full disclosure: my work) when you paste your code snippet. PyYAML won't respond because of the memory load from expanding aliases, while NimYAML doesn't expand the aliases and therefore responds quickly.
It's astonishing that Kubernetes suffers from this problem; I would have assumed that, since it's written in Go, they would be able to handle references properly. You'll have to file a bug with them to get this fixed.
There are a couple of possible mitigations I can think of, although as @flyx says the real fix here would be in the YAML parsing library used by Kubernetes.
Interestingly running this on a Kubernetes cluster on my local machine showed the CPU spike to be client-side (it's the kubectl process churning CPU) rather than server side.
If the issue was server side, then possible mitigations would be to use RBAC to minimize access to ConfigMap creation, and potentially to use an admission controller like OPA to review manifests before they are applied to the cluster.
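As a sketch of the RBAC part of that mitigation (the role, namespace, and group names are assumptions), a namespaced Role that allows reading ConfigMaps but deliberately omits create/update, bound to the untrusted group:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: configmap-reader
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "list", "watch"]   # no create/update/patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: configmap-reader-binding
  namespace: default
subjects:
  - kind: Group
    name: untrusted-users             # assumption: the group to restrict
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: configmap-reader
  apiGroup: rbac.authorization.k8s.io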
This should probably be raised with the Kubernetes security vulnerability response team so that a proper fix can be implemented.
EDIT - I think where the problem manifests might be down to the cluster version used. Server-side apply graduated to beta (should be enabled by default) in 1.16, so on a 1.16 cluster perhaps this would hit server side instead of client side.
EDIT - Just set up a 1.16 cluster; it still shows the CPU usage as client-side in kubectl...
EDIT - I've filed an issue for this here, and also confirmed that the DoS can be achieved server-side by using curl instead of kubectl.
Final EDIT - This got assigned a CVE (CVE-2019-11253) and is being fixed in Kubernetes 1.13+. The fix has also been applied to the underlying YAML parsing lib here, so any other Go programs should be OK as long as they're using an up-to-date version.
There was a TrustCom19 paper studying vulnerabilities in YAML parsers for different languages; it found that most parsers have some issues, so this is common, and there are several recent CVEs in this space (details in the paper: Laughter in the Wild: A Study into DoS Vulnerabilities in YAML Libraries, TrustCom19).
Preprint: https://www.researchgate.net/publication/333505459_Laughter_in_the_Wild_A_Study_into_DoS_Vulnerabilities_in_YAML_Libraries

How to pass a Kubernetes pod's instance ID into the pod upon startup?

So I'm researching how to use Kubernetes for my case. I installed it and played a bit.
The question is: when the replication controller starts a couple of replicas, they have something like an ID in their name:
How unique is this ID? Is it unique for the lifetime of the Kubernetes cluster? Is it unique across different Kubernetes runs (i.e. if I restart Kubernetes)?
How do I pass this ID to the app in the container? Can I specify some kind of template in the YAML so that, for example, the ID is assigned to an environment variable or something similar?
Alternatively, is there a way for the app in the container to ask for this ID?
More explanation of the use case: I have an application that writes session files inside a directory. I want to guarantee uniqueness of the session IDs in the system. This means that if one app instance is running on VM1 and another instance on VM2, I want to prepend some kind of identifier to the IDs, like app-1-dajk4l and app-2-dajk4l, where app is the name of the app and 1, 2 is the instance identifier, which should come from the replication controller because it is dynamic and cannot be configured manually. dajk4l is some identifier like the current timestamp or similar.
Thanks.
The ID is guaranteed to be unique at any single point in time, since Kubernetes doesn't allow two pods in the same namespace to have the same name. There aren't any longer-term guarantees though, since they're just generated as a random string of 5 alphanumeric characters. However, given that there are more than 60 million such random strings, conflicts across time are also unlikely in most environments.
Yes, you can pull in the pod's namespace and name as environment variables using what's called the "Downward API", adding a field on the container like
env:
  - name: MY_POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
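A small extension of that sketch (the variable names are illustrative, and it assumes MY_POD_NAME from the snippet above sits earlier in the same env list): the namespace can be pulled in the same way, and the two can be combined into the session-ID prefix the question asks about:
env:
  - name: MY_POD_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
  - name: SESSION_ID_PREFIX
    value: "$(MY_POD_NAMESPACE)-$(MY_POD_NAME)"   # expanded by Kubernetes at container start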