Is there a way to have a service dynamically deployed in all namespaces of a Kubernetes cluster?
Right now, the glusterFS endpoint (which is namespace dependent) is being deleted by k8s when its port is no longer in use.
Ex:
{
  "kind": "Endpoints",
  "apiVersion": "v1",
  "metadata": {
    "name": "glusterfs"
  },
  "subsets": [
    {
      "addresses": [
        {
          "ip": "172.0.0.1"
        }
      ],
      "ports": [
        {
          "port": 1
        }
      ]
    }
  ]
}
So I made a Service that keeps port 1 in use at all times, so I don't end up with a missing/deleted endpoint in any namespace.
apiVersion: v1
kind: Service
metadata:
  name: glusterfs
spec:
  ports:
  - port: 1
It would be interesting to have the above service deployed dynamically every time someone creates a new namespace.
A DaemonSet is used to deploy exactly one replica per node.
Coming to your question: why do you need to create the same service across namespaces?
It is not supported out of the box, but you can create a custom script to achieve it.
Kubernetes doesn't have any replication of services, pods, deployments, secrets, etc. across namespaces out of the box.
Introducing...The Kubernetes Controller/Operator Pattern.
Deploy a controller pod that has a read/list permissions on the namespaces resource. This controller will "watch" the namespaces and deploy whatever resources you want when they show up or change.
To get started building your own operator or controller please look at kubebuilder. https://book.kubebuilder.io/
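If a full controller feels like overkill, the same idea can be approximated with a plain reconciliation loop running in a pod. A minimal sketch, assuming kubectl is available in the image, the Service manifest is mounted at /config/glusterfs-service.yaml, and the pod's ServiceAccount can list namespaces and create/patch Services:
#!/bin/sh
# Naive reconciliation loop: periodically (re)apply the Service in every namespace.
while true; do
  for ns in $(kubectl get namespaces -o jsonpath='{.items[*].metadata.name}'); do
    kubectl apply -n "$ns" -f /config/glusterfs-service.yaml
  done
  sleep 30
done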
First of all, for various reasons I'm using an unsupported and obsolete version of Kubernetes (1.12), and I can't upgrade.
I'm trying to configure the scheduler to avoid running pods on certain nodes by changing the node score when the scheduler tries to find the best available node, and I would like to do that at the scheduler level rather than by using nodeAffinity at the deployment, replicaset, or pod level (so that all pods are affected by this change).
After reading the k8s docs here: https://kubernetes.io/docs/reference/scheduling/config/#scheduling-plugins and checking that some options were already present in 1.12, I'm trying to use the NodePreferAvoidPods plugin.
In the documentation the plugin specifies:
Scores nodes according to the node annotation scheduler.alpha.kubernetes.io/preferAvoidPods
Which, if I understand correctly, should do the job.
So, I've updated the static manifest for kube-scheduler.yaml to use the following config:
apiVersion: kubescheduler.config.k8s.io/v1alpha1
kind: KubeSchedulerConfiguration
profiles:
- plugins:
    score:
      enabled:
      - name: NodePreferAvoidPods
        weight: 100
clientConnection:
  kubeconfig: /etc/kubernetes/scheduler.conf
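For reference, this is roughly how such a configuration file gets wired into the scheduler's static pod manifest (the file path and volume name below are illustrative placeholders, trimmed down):
spec:
  containers:
  - name: kube-scheduler
    command:
    - kube-scheduler
    - --config=/etc/kubernetes/scheduler-config.yaml
    volumeMounts:
    - name: scheduler-config
      mountPath: /etc/kubernetes/scheduler-config.yaml
      readOnly: true
  volumes:
  - name: scheduler-config
    hostPath:
      path: /etc/kubernetes/scheduler-config.yaml
      type: File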
But adding the following annotation to the node doesn't seem to work:
scheduler.alpha.kubernetes.io/preferAvoidPods
For testing, I made a basic nginx deployment with a replica count equal to the number of worker nodes (4).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 4
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.21
        ports:
        - containerPort: 80
Then I checked where the pods were created with kubectl get pods -o wide.
So, I believe some options are required for this annotation to work.
I've tried to set the annotation to "true" and "1", but k8s refuses my change, and I can't figure out what the valid options for this annotation are or find any documentation about them.
I've checked the git release for 1.12: this plugin was already present (at least there are some lines of code for it), and I don't think its behavior or settings have changed much since.
Thanks.
So, from the Kubernetes source code, here is a valid value for this annotation:
{
  "preferAvoidPods": [
    {
      "podSignature": {
        "podController": {
          "apiVersion": "v1",
          "kind": "ReplicationController",
          "name": "foo",
          "uid": "abcdef123456",
          "controller": true
        }
      },
      "reason": "some reason",
      "message": "some message"
    }
  ]
}
But there are no details on how to predict the uid, and no answer was given when someone else asked about it on GitHub years ago: https://github.com/kubernetes/kubernetes/issues/41630
For my initial question, which was to avoid scheduling pods on a node, I found another method: using the well-known taint node.kubernetes.io/unschedulable with the value PreferNoSchedule.
Tainting a node with this command does the job, and the taint seems persistent across cordon/uncordon (a cordon will set it to NoSchedule and an uncordon will set it back to PreferNoSchedule):
kubectl taint node NODE_NAME node.kubernetes.io/unschedulable=:PreferNoSchedule
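To undo it later, the standard kubectl syntax for removing a taint (a trailing dash) should work:
kubectl taint node NODE_NAME node.kubernetes.io/unschedulable:PreferNoSchedule-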
I have a Kubernetes deployment (apache flume, to be exact) which needs to store persistent data. It has a PVC set up and bound to a path, which works without problems.
When I simply increase the scale of the deployment through the Kubernetes dashboard, it gives me an error saying multiple pods are trying to attach the same persistent volume. My deployment description is something like this (I tried to remove the irrelevant parts):
{
  "kind": "Deployment",
  "apiVersion": "extensions/v1beta1",
  "metadata": {
    "name": "myapp-deployment",
    "labels": {
      "app": "myapp",
      "name": "myapp-master"
    }
  },
  "spec": {
    "replicas": 1,
    "selector": {
      "matchLabels": {
        "app": "myapp",
        "name": "myapp-master"
      }
    },
    "template": {
      "spec": {
        "volumes": [
          {
            "name": "myapp-data",
            "persistentVolumeClaim": {
              "claimName": "myapp-pvc"
            }
          }
        ],
        "containers": [
          {
            "name": "myapp",
            "resources": {},
            "volumeMounts": [
              {
                "name": "myapp-data",
                "mountPath": "/data"
              }
            ]
          }
        ]
      }
    },...
Each pod should get its own persistent space (but with the same path name), so one doesn't mess with the others'. I tried to add a new volume to the volumes array above, and a volume mount to the volumeMounts array, but it didn't work (I guess that meant "bind two volumes to a single container").
What should I change to have 2 pods with separate persistent volumes? What should I change to have N number of pods and N number of PVC's so I can freely scale the deployment up and down?
Note: I saw a similar question here which explains N number of pods cannot be done using deployments. Is it possible to do what I want with only 2 pods?
You should use a StatefulSet for that. This is for pods with persistent data that should survive a pod restart. Replicas have a certain order and are named in that way (my-app-0, my-app-1, ...). They are stopped and restarted in this order and will mount the same volume after a restart/update.
With a StatefulSet you can use volumeClaimTemplates to dynamically create a new PersistentVolume along with each new pod. So every time a pod is created, a volume gets provisioned by your storage class.
From docs:
The volumeClaimTemplates will provide stable storage using PersistentVolumes provisioned by a PersistentVolume Provisioner
volumeClaimTemplates:
- metadata:
    name: www
  spec:
    accessModes: [ "ReadWriteOnce" ]
    storageClassName: "my-storage-class"
    resources:
      requests:
        storage: 1Gi
See docs for more details:
https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#components
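Pulled together, a minimal StatefulSet using such a claim template might look roughly like this (the image, labels, and storage class below are placeholders, not taken from the question):
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: myapp
spec:
  serviceName: myapp          # headless Service that must exist separately
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: nginx:1.21     # placeholder image
        volumeMounts:
        - name: www           # matches the claim template name below
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "my-storage-class"
      resources:
        requests:
          storage: 1Gi
Each replica then gets its own claim (www-myapp-0, www-myapp-1, ...) mounted at the same path.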
Each pod should get its own persistent space (but with same pathname), so one doesn't mess with the others'.
For this reason, use a StatefulSet instead. Most things will work the same way, except that each Pod will get its own unique Persistent Volume.
With Okteto Cloud, in order to let different pods/deployments access a shared PersistentVolumeClaim, I tried setting the PersistentVolumeClaim's accessModes to "ReadWriteMany":
{
  "kind": "PersistentVolumeClaim",
  "apiVersion": "v1",
  "metadata": {
    "name": "pv-claim-cpdownloads"
  },
  "spec": {
    "accessModes": [
      "ReadWriteMany"
    ],
    "resources": {
      "requests": {
        "storage": "10Gi"
      }
    }
  }
}
Applying my deployment with kubectl succeeds, but the deployment itself times out on the okteto web UI, with the error:
pod has unbound immediate PersistentVolumeClaims (repeated 55 times)
Now, the same PersistentVolumeClaim with accessModes set to "ReadWriteOnce" deploys just fine.
Is the accessMode "ReadWriteMany" disallowed on Okteto Cloud?
If it is, how could I get several pods/deployments to access the same volume data?
To be precise, in my case I think I technically only need one pod to write to the volume and the other one to read from it.
My use case is to have one container save files to a folder, and another container watches changes and loads files from that same folder.
Okteto Cloud only supports the "ReadWriteOnce" access mode.
If you share the volume between pods/deployments, they will all go to the same node, which is equivalent to having a single reader/writer. But it is not a recommended practice.
What is your use case? why do you need to share volumes?
I have 3 different Kubernetes Secrets and I want to mount each one into its own Pod managed by a StatefulSet with 3 replicas.
Is it possible to configure the StatefulSet such that each Secret is mounted into its own Pod?
Not really. A StatefulSet (and any workload controller, for that matter) allows only a single pod template (which may have multiple containers). The issue with this is that a StatefulSet is designed to scale to N replicas, while there is no built-in way to have N matching secrets. It would have to be a "SecretStatefulSet": a different controller.
Some solutions:
You could define a single Kubernetes secret that contains all your required secrets for all of your pods. The downside is that you will have to share the secret between the pods. For example:
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
type: Opaque
data:
  pod1: xxx
  pod2: xxx
  pod3: xxx
  ...
  podN: xxx
Use something like HashiCorp's Vault and store your secrets remotely with keys such as pod1, pod2, pod3, ..., podN. You can also use an HSM. This seems to be the more solid solution IMO, but it might take longer to implement.
In all cases, you will have to make sure that the number of secrets matches your number of pods in your StatefulSet.
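As an illustration of the first option, each pod could pick its own entry out of the shared secret at startup by using its pod name; this is only a sketch, and the key names, mount path, and placeholder command are assumptions:
# Fragment of a StatefulSet pod template; illustrative only.
spec:
  containers:
  - name: app
    image: busybox
    env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name       # e.g. myapp-0, myapp-1, ...
    command: ["sh", "-c"]
    args:
    - |
      ORD="${POD_NAME##*-}"              # pod ordinal
      export MY_SECRET="$(cat /etc/shared-secret/pod$((ORD + 1)))"
      exec sleep 3600                    # placeholder for the real app
    volumeMounts:
    - name: shared-secret
      mountPath: /etc/shared-secret
      readOnly: true
  volumes:
  - name: shared-secret
    secret:
      secretName: mysecret               # the single shared Secret shown above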
This is exactly what you're looking for I guess. https://github.com/spoditor/spoditor
Essentially, it uses a custom annotation on the PodSpec template, like:
annotations:
  spoditor.io/mount-volume: |
    {
      "volumes": [
        {
          "name": "my-volume",
          "secret": {
            "secretName": "my-secret"
          }
        }
      ],
      "containers": [
        {
          "name": "nginx",
          "volumeMounts": [
            {
              "name": "my-volume",
              "mountPath": "/etc/secrets/my-volume"
            }
          ]
        }
      ]
    }
Now, the nginx container in each Pod of the StatefulSet will try to mount its own dedicated secret, following the pattern my-secret-{pod ordinal}.
You will just need to make sure my-secret-0, my-secret-1, and so on exist in the same namespace as the StatefulSet.
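For illustration, those per-pod secrets could be created with kubectl like so (the password key and values are just placeholders):
kubectl create secret generic my-secret-0 --from-literal=password=changeme-0
kubectl create secret generic my-secret-1 --from-literal=password=changeme-1
kubectl create secret generic my-secret-2 --from-literal=password=changeme-2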
There are more advanced usages of the annotation in the project's documentation.
I will explain my problem in a very specific way, but I think it is better to be specific than to explain it abstractly...
Say there is a MongoDB replica set outside of a Kubernetes cluster but within the network. The IP addresses of all members of the replica set are resolved via /etc/hosts on the app servers and db servers.
In an experiment/transition phase, I need to access those MongoDB servers from Kubernetes pods.
However, Kubernetes doesn't seem to allow adding custom entries to /etc/hosts in pods/containers.
The MongoDB replica sets are already working with a large data set, so creating a new replica set in the cluster is not an option.
Because I use GKE, changing any of the resources in the kube-dns namespace should be avoided, I suppose. Configuring or replacing kube-dns to suit my needs would be the last thing to try.
Is there a way to resolve ip address of custom hostnames in a Kubernetes cluster?
It is just an idea, but if kube2sky could read some entries from a ConfigMap and use them as DNS records, that would be great.
e.g. repl1.mongo.local: 192.168.10.100.
EDIT: I referenced this question from https://github.com/kubernetes/kubernetes/issues/12337
There are 2 possible solutions for this problem now:
Pod-wise (adding the changes to every pod that needs to resolve these domains)
Cluster-wise (adding the changes to a central place which all pods have access to, which in our case is the DNS)
Let's begin with the pod-wise solution:
As of Kubernetes 1.7, it's now possible to add entries to a Pod's /etc/hosts directly using .spec.hostAliases.
For example: to resolve foo.local, bar.local to 127.0.0.1 and foo.remote,
bar.remote to 10.1.2.3, you can configure HostAliases for a Pod under
.spec.hostAliases:
apiVersion: v1
kind: Pod
metadata:
  name: hostaliases-pod
spec:
  restartPolicy: Never
  hostAliases:
  - ip: "127.0.0.1"
    hostnames:
    - "foo.local"
    - "bar.local"
  - ip: "10.1.2.3"
    hostnames:
    - "foo.remote"
    - "bar.remote"
  containers:
  - name: cat-hosts
    image: busybox
    command:
    - cat
    args:
    - "/etc/hosts"
The Cluster-wise solution:
As of Kubernetes v1.12, CoreDNS is the recommended DNS Server, replacing kube-dns. If your cluster originally used kube-dns, you may still have kube-dns deployed rather than CoreDNS. I'm going to assume that you're using CoreDNS as your K8S DNS.
In CoreDNS it's possible to add arbitrary entries inside the cluster domain, and that way all pods will resolve these entries directly from the DNS without the need to change each and every /etc/hosts file in every pod.
First:
Let's change the coredns ConfigMap and add the required changes:
kubectl edit cm coredns -n kube-system
apiVersion: v1
kind: ConfigMap
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        hosts /etc/coredns/customdomains.db example.org {
            fallthrough
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . "/etc/resolv.conf"
        cache 30
        loop
        reload
        loadbalance
    }
  customdomains.db: |
    10.10.1.1 mongo-en-1.example.org
    10.10.1.2 mongo-en-2.example.org
    10.10.1.3 mongo-en-3.example.org
    10.10.1.4 mongo-en-4.example.org
Basically we added two things:
The hosts plugin before the kubernetes plugin, using the fallthrough option of the hosts plugin to satisfy our case.
To shed some more light on the fallthrough option: any given backend is usually the final word for its zone - it either returns a result, or it returns NXDOMAIN for the query. However, occasionally this is not the desired behavior, so some of the plugins support a fallthrough option.
When fallthrough is enabled, instead of returning NXDOMAIN when a record is not found, the plugin will pass the request down the chain. A backend further down the chain then has the opportunity to handle the request, and that backend in our case is kubernetes.
We added a new file to the ConfigMap (customdomains.db) and added our custom domains (mongo-en-*.example.org) in there.
The last thing is to remember to add the customdomains.db file to the config-volume of the CoreDNS pod template:
kubectl edit -n kube-system deployment coredns
volumes:
- name: config-volume
  configMap:
    name: coredns
    items:
    - key: Corefile
      path: Corefile
    - key: customdomains.db
      path: customdomains.db
and finally, to make Kubernetes reload CoreDNS (restarting each running pod):
$ kubectl rollout restart -n kube-system deployment/coredns
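After the restart, resolution of the new names can be checked from a throwaway pod, for example:
kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- nslookup mongo-en-1.example.org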
@OxMH's answer is fantastic, and can be simplified for brevity. CoreDNS allows you to specify hosts directly in the hosts plugin (https://coredns.io/plugins/hosts/#examples).
The ConfigMap can therefore be edited like so:
$ kubectl edit cm coredns -n kube-system
apiVersion: v1
kind: ConfigMap
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        hosts {
            10.10.1.1 mongo-en-1.example.org
            10.10.1.2 mongo-en-2.example.org
            10.10.1.3 mongo-en-3.example.org
            10.10.1.4 mongo-en-4.example.org
            fallthrough
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . "/etc/resolv.conf"
        cache 30
        loop
        reload
        loadbalance
    }
You will still need to restart coredns so it rereads the config:
$ kubectl rollout restart -n kube-system deployment/coredns
Inlining the contents of the hosts file removes the need to map it in from the ConfigMap. Both approaches achieve the same outcome; it is up to personal preference where you want to define the hosts.
A Service of type ExternalName can be used to access hosts or IPs outside of Kubernetes.
The following worked for me.
{
  "kind": "Service",
  "apiVersion": "v1",
  "metadata": {
    "name": "tiny-server-5",
    "namespace": "default"
  },
  "spec": {
    "type": "ExternalName",
    "externalName": "192.168.1.15",
    "ports": [{ "port": 80 }]
  }
}
For the record, an alternate solution for those not checking the referenced github issue.
You can define an "external" Service in Kubernetes by not specifying any selector or ClusterIP. You also have to define a corresponding Endpoints object pointing to your external IP.
From the Kubernetes documentation:
{
  "kind": "Service",
  "apiVersion": "v1",
  "metadata": {
    "name": "my-service"
  },
  "spec": {
    "ports": [
      {
        "protocol": "TCP",
        "port": 80,
        "targetPort": 9376
      }
    ]
  }
}
{
  "kind": "Endpoints",
  "apiVersion": "v1",
  "metadata": {
    "name": "my-service"
  },
  "subsets": [
    {
      "addresses": [
        { "ip": "1.2.3.4" }
      ],
      "ports": [
        { "port": 9376 }
      ]
    }
  ]
}
With this, you can point your app inside the containers to my-service:9376 and the traffic should be forwarded to 1.2.3.4:9376
Limitations:
The DNS name used needs to be only letters, numbers or dashes. You can't use multi-level names (something.like.this). This means you probably have to modify your app to point just to your-service, and not yourservice.domain.tld.
You can only point to a specific IP, not a DNS name. For that, you can define a kind of DNS alias with an ExternalName type Service, as sketched below.
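A minimal ExternalName sketch of such an alias (the Service name and external hostname are placeholders, not from the original setup):
apiVersion: v1
kind: Service
metadata:
  name: mongo-alias
spec:
  type: ExternalName
  externalName: mongo.example.com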
UPDATE 2017-07-03: Kubernetes 1.7 now supports adding entries to a Pod's /etc/hosts with HostAliases.
The solution is not about kube-dns, but /etc/hosts.
Anyway, the following trick seems to work so far...
EDIT: Changing /etc/hosts may have a race condition with the Kubernetes system. Let it retry.
1) Create a ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: db-hosts
data:
  hosts: |
    10.0.0.1 db1
    10.0.0.2 db2
2) Add a script named ensure_hosts.sh.
#!/bin/sh
# Re-append the custom hosts whenever they go missing
# (the pod's /etc/hosts can be rewritten by the kubelet).
while true
do
  grep db1 /etc/hosts > /dev/null || cat /mnt/hosts.append/hosts >> /etc/hosts
  sleep 5
done
Don't forget chmod a+x ensure_hosts.sh.
3) Add a wrapper script start.sh to your image
#!/bin/sh
# Keep the hosts entries in place in the background, then start the real application.
$(dirname "$(realpath "$0")")/ensure_hosts.sh &
exec your-app args...
Don't forget chmod a+x start.sh
4) Use the configmap as a volume and run start.sh
apiVersion: extensions/v1beta1
kind: Deployment
...
spec:
  template:
    ...
    spec:
      volumes:
      - name: hosts-volume
        configMap:
          name: db-hosts
      ...
      containers:
      - command:
        - ./start.sh
        ...
        volumeMounts:
        - name: hosts-volume
          mountPath: /mnt/hosts.append
        ...
Using a ConfigMap seems a better way to set DNS, but it's a little bit heavy when you just need to add a few records (in my opinion). So I add records to /etc/hosts with a shell script executed by the docker CMD.
for example:
Dockerfile
...(ignore)
COPY run.sh /tmp/run.sh
CMD bash /tmp/run.sh
run.sh
#!/bin/bash
# /etc/hosts entries are "IP hostname", so the IP comes first.
echo 192.168.10.100 repl1.mongo.local >> /etc/hosts
# some other commands...
Note: if you run MORE THAN ONE container in a pod, you have to add the script to each container, because Kubernetes starts containers in a random order and /etc/hosts may be overridden by another container (one that starts later).