Unable to make elasticsearch as persistent volume in kubernetes cluster - kubernetes

I want to setup Elasticsearch on Kubernetes Cluster using Helm. I can setup Elasticsearch on Kubernetes Cluster without persistence. I am using below helm chart.
helm install --name elasticsearch incubator/elasticsearch \
--set master.persistence.enabled=false \
--set data.persistence.enabled=false \
--set image.tag=6.4.2 \
--namespace logging
However, i am not able to use it with Persistence. Moreover i am confused as i am using neither cloud based storage(aws,gce) nor nfs. I am using Local VM storage.
I added disk in my VM environment formated it under ext4. And now i am trying to use it as a persistent disk for my elasticsearch Deployment.
I tried lots of ways, not working much.
For any data if you need i would be helpful to provide.
But kindly get a solution which will work.
I just need help..

I don't believe this chart will support local storage.
Looking at the volumeClaimTemplate such as on the master-statefulset.yaml shows that it's missing key parameters for a local volume setup (such as path, nodeAffinity, volumeBindingMode) described here. If you are using a cloud deployment, just use a cloud volume claim. If you have deployed a cluster on a on-prem or just onto your computer, then you should fork the chart and adjust the volume claims to meet the requirements for local storage.
Either way on your future posts you should include relevant logs. With kubernetes errors it's helpful to see from all parts of the stack such as: kubernetes control plane logs, object events (like the output from describing the volume claim), helm logs, elasticsearch pod logs failing to discover a volume, etc etc.

Related

How to backup PVC regularly

What can be done to backup kubernetes PVC regularly for GCP and AWS?
GCP has VolumeSnapshot but I'm not sure how to schedule it, like every hour or every day.
I also tried Gemini/fairwinds but I get the following error when for GCP. I installed the charts as mentioned in README.MD and I can't find anyone else encountering the same error.
error: unable to recognize "backup-test.yml": no matches for kind "SnapshotGroup" in version "gemini.fairwinds.com/v1beta1"
You can implement Velero, which gives you tools to back up and restore your Kubernetes cluster resources and persistent volumes.
Unfortunately, Velero only allows you to backup & restore PV, not PVCs.
Velero’s restic integration backs up data from volumes by accessing the node’s filesystem, on which the pod is running. For this reason, restic integration can only backup volumes that are mounted by a pod and not directly from the PVC.
Might wanna look into stash.run
Agree with #hdhruna - Velero is really the most popular tool for doing that task.
However, you can also try miracle2k/k8s-snapshots
Automatic Volume Snapshots on Kubernetes
How is it useful? Simply add
an annotation to your PersistentVolume or PersistentVolumeClaim
resources, and let this tool create and expire snapshots according to
your specifications.
Supported Environments:
Google Compute Engine disks,
AWS EBS disks.
I evaluated multiple solutions including k8s CSI VolumeSnapshots, https://stash.run/, https://github.com/miracle2k/k8s-snapshots and CGP disks snapshots.
The best one in my opinion, is using k8s native implementation of snapshots via CSI driver, that is if you have a cluster version > = 1.17. This allows snapshoting volumes while in use, doesn't require having a read many or write many volume like stash.
I chose gemini by fairwinds also to automate backup creation and deletion and restoration and it works like a charm.
I believe your problem is caused by that missing CRD from gemini in your cluster. Verify that the CRD is installed correctly and also that the version installed is indeed the version you are trying to use.
My installation went flawlessly using their install guide with Helm.

How to get Kubernetes cluster name from K8s API using client-go

How to get Kubernetes cluster name from K8s API mentions that
curl http://metadata/computeMetadata/v1/instance/attributes/cluster-name -H "Metadata-Flavor: Google"
(from within the cluster), or
kubectl run curl --rm --restart=Never -it --image=appropriate/curl -- -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/attributes/cluster-name
(from outside the cluster), can be used to retrieve the cluster name. That works.
Is there a way to perform the same programmatically using the k8s client-go library? Maybe using the RESTClient()? I've tried but kept getting the server could not find the requested resource.
UPDATE
What I'm trying to do is to get the cluster-name from an app that runs either in a local computer or within a k8s cluster. the k8s client-go allows to initialise the clientset via in cluster or out of cluster authentication.
With the two commands mentioned at the top that is achievable. I was wondering if there was a way from the client-go library to achieve the same, instead of having to do kubectl or curl depending on where the service is run from.
The data that you're looking for (name of the cluster) is available at GCP level. The name itself is a resource within GKE, not Kubernetes. This means that this specific information is not available using the client-go.
So in order to get this data, you can use the Google Cloud Client Libraries for Go, designed to interact with GCP.
As a starting point, you can consult this document.
First you have to download the container package:
➜ go get google.golang.org/api/container/v1
Before you will launch you code you will have authenticate to fetch the data:
Google has a very good document how to achieve that.
Basically you have generate a ServiceAccount key and pass it in GOOGLE_APPLICATION_CREDENTIALS environment:
➜ export GOOGLE_APPLICATION_CREDENTIALS=sakey.json
Regarding the information that you want, you can fetch the cluster information (including name) following this example.
Once you do do this you can launch your application like this:
➜ go run main.go -project <google_project_name> -zone us-central1-a
And the result would be information about your cluster:
Cluster "tom" (RUNNING) master_version: v1.14.10-gke.17 -> Pool "default-pool" (RUNNING) machineType=n1-standard-2 node_version=v1.14.10-gke.17 autoscaling=false%
Also it is worth mentioning that if you run this command:
curl http://metadata/computeMetadata/v1/instance/attributes/cluster-name -H "Metadata-Flavor: Google"
You are also interacting with the GCP APIs and can go unauthenticated as long as it's run within a GCE machine/GKE cluster. This provided automatic authentication.
You can read more about it under google`s Storing and retrieving instance metadata document.
Finally, one great advantage of doing this with the Cloud Client Libraries, is that it can be launched externally (as long as it's authenticated) or internally within pods in a deployment.
Let me know if it helps.
If you're running inside GKE, you can get the cluster name through the instance attributes: https://pkg.go.dev/cloud.google.com/go/compute/metadata#InstanceAttributeValue
More specifically, the following should give you the cluster name:
metadata.InstanceAttributeValue("cluster-name")
The example shared by Thomas lists all the clusters in your project, which may not be very helpful if you just want to query the name of the GKE cluster hosting your pod.

kubernetes: deploying kong helm chart

I deploy kong via helm on my kubernetes cluster but I can't configure it as I want.
helm install stable/kong -f values.yaml
value.yaml:
{
"persistence.size":"1Gi",
"persistence.storageClass":"my-kong-storage"
}
Unfortunately, the created persistenceVolumeClaim stays at 8G instead of 1Gi. Even adding "persistence.enabled":false has no effect on deployment. So I think my all my configuration is bad.
What should be a good configuration file?
I am using kubernetes rancher deployment on bare metal servers.
I use Local Persistent Volumes. (working well with mongo-replicaset deployment)
What you are trying to do is to configure a dependency chart (a.k.a subchart ) which is a little different from a main chart when it comes to writing values.yaml. Here is how you can do it:
As postgresql is a dependency chart for kong so you have to use the name of the dependency chart as a key then the rest of the options you need to modify in the following form:
The content of values.yaml does not need to be surrounded with curly braces. so you need to remove it from the code you posted in the question.
<dependcy-chart-name>:
<configuration-key-name>: <configuration-value>
For Rancher you have to write it as the following:
#values.yaml for rancher
postgresql.persistence.storageClass: "my-kong-storage"
postgresql.persistence.size: "1Gi"
Unlike if you are using helm itself with vanilla kubernetes - at least - you can write the values.yml as below:
#values.yaml for helm
postgresql:
persistence:
storageClass: "my-kong-storage"
size: "1Gi"
More about Dealing with SubChart values
More about Postgresql chart configuration
Please tell us which cluster setup you are using. A cloud managed service? Custom setup kubernetes?
The problem you are facing is that there is a "minimum size" of storage to be provisioned. For example in IBM Cloud it is 20 GB.
So even if 2GB are requested in the PVC , you will end up with a 20GB PV.
Please check the documentation of your NFS Provisioner / Storage Class

What is the difference between the core os projects kube-prometheus and prometheus operator?

The github repo of Prometheus Operator https://github.com/coreos/prometheus-operator/ project says that
The Prometheus Operator makes the Prometheus configuration Kubernetes native and manages and operates Prometheus and Alertmanager clusters. It is a piece of the puzzle regarding full end-to-end monitoring.
kube-prometheus combines the Prometheus Operator with a collection of manifests to help getting started with monitoring Kubernetes itself and applications running on top of it.
Can someone elaborate this?
I've always had this exact same question/repeatedly bumped into both, but tbh reading the above answer didn't clarify it for me/I needed a short explanation. I found this github issue that just made it crystal clear to me.
https://github.com/coreos/prometheus-operator/issues/2619
Quoting nicgirault of GitHub:
At last I realized that prometheus-operator chart was packaging
kube-prometheus stack but it took me around 10 hours playing around to
realize this.
**Here's my summarized explanation:
"kube-prometheus" and "Prometheus Operator Helm Chart" both do the same thing:
Basically the Ingress/Ingress Controller Concept, applied to Metrics/Prometheus Operator.
Both are a means of easily configuring, installing, and managing a huge distributed application (Kubernetes Prometheus Stack) on Kubernetes:**
What is the Entire Kube Prometheus Stack you ask? Prometheus, Grafana, AlertManager, CRDs (Custom Resource Definitions), Prometheus Operator(software bot app), IaC Alert Rules, IaC Grafana Dashboards, IaC ServiceMonitor CRDs (which auto-generate Prometheus Metric Collection Configuration and auto hot import it into Prometheus Server)
(Also when I say easily configuring I mean 1,000-10,000++ lines of easy for humans to understand config that generates and auto manage 10,000-100,000 lines of machine config + stuff with sensible defaults + monitoring configuration self-service, distributed configuration sharding with an operator/controller to combine config + generate verbose boilerplate machine-readable config from nice human-readable config.
If they achieve the same end goal, you might ask what's the difference between them?
https://github.com/coreos/kube-prometheus
https://github.com/helm/charts/tree/master/stable/prometheus-operator
Basically, CoreOS's kube-prometheus deploys the Prometheus Stack using Ksonnet.
Prometheus Operator Helm Chart wraps kube-prometheus / achieves the same end result but with Helm.
So which one to use?
Doesn't matter + they achieve the same end result + shouldn't be crazy difficult to start with 1 and switch to the other.
Helm tends to be faster to learn/develop basic mastery of.
Ksonnet is harder to learn/develop basic mastery of, but:
it's more idempotent (better for CICD automation) (but it's only a difference of 99% idempotent vs 99.99% idempotent.)
has built-in templating which means that if you have multiple clusters you need to manage / that you want to always keep consistent with each other. Then you can leverage ksonnet's templating to manage multiple instances of the Kube Prometheus Stack (for multiple envs) using a DRY code base with lots of code reuse. (If you only have a few envs and Prometheus doesn't need to change often it's not completely unreasonable to keep 4 helm values files in sync by hand. I've also seen Jinja2 templating used to template out helm values files, but if you're going to bother with that you may as well just consider ksonnet.)
Kubernetes operator are kubernetes specific application(pods) that configure, manage and optimize other Kubernetes deployments automatically. They are implemented as a custom controller.
According to official coreOS website:
Operators were introduced by CoreOS as a class of software that operates other software, putting operational knowledge collected by humans into software.
The prometheus operator provides the easy way to deploy configure and monitor your prometheus instances on kubernetes cluster. To do so, prometheus operator introduces three types of custom resource definition(CRD) in kubernetes.
Prometheus
Alertmanager
ServiceMonitor
Now, with the help of above CRD's, you can directly create a prometheus instance by providing kind: Prometheus and the prometheus instance is ready to serve, likewise you can do for AlertManager. Without this you would have to setup the deployment for prometheus with its image, configuration and many more things.
The Prometheus Operator serves to make running Prometheus on top of Kubernetes as easy as possible, while preserving Kubernetes-native configuration options.
Now, kube-prometheus implemented the prometheus operator and provides you minimum yaml files to create your basic setup of prometheus, alertmanager and grafana by running a single command.
git clone https://github.com/coreos/prometheus-operator.git
kubectl apply -f prometheus-operator/contrib/kube-prometheus/manifests/
By running above command in kube-prometheus directory, you will get a monitoring namespace which will have an instance of alertmanager, prometheus and grafana for UI. This is enough setup for most of the basic implementation and if you need any more specifics according to your application, you can add more yamls of exporter you need.
Kube-prometheus is more of a contribution to prometheus-operator project, which implements the prometheus operator functionality very well and provide you a complete monitoring setup for your kubernetes cluster. You can start with kube-prometheus and extend the functionality of your monitoring setup according to your application from there.
You can learn more about prometheus-operator here
As of today, 28-09-2020, this is the way to install Prometheus in a Kubernetes cluster
https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack#kube-prometheus-stack
According to official documentation, kube-prometheus-stack is a rename of prometheus-operator.
As I understood, kube-prometheus-stack also has preinstalled grafana dashboards and prometheus rules.
Note: This chart was formerly named prometheus-operator chart, now
renamed to more clearly reflect that it installs the kube-prometheus
project stack, within which Prometheus Operator is only one component.
Taken from https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack
Architecturally the container runs docker
The default container logs are managed by Docker, and the default log driver uses JSON-file
log-driver": "json-file
https://docs.docker.com/config/containers/logging/configure/
If the default jSON-file is used to manage container logs, log rotation is not performed by default. Therefore, the default JSON-file log driver the log files stored by the log driver can result in a large amount of disk space for containers that generate a large amount of output, which can cause disk space to run out.
In this case, save the log to ES, store it separately, and periodically delete the index using curator kubernetes
And run a scheduled task in K8S to delete the index periodically
Another solution for disk space is to periodically delete old logs from jSON-files
Typically we set the size and number of logs
This will set up a maximum of 10 log files, each with a maximum size of 20 Mb. Therefore, the container has a maximum of 200 Mb of logs
"log-driver": "json-file", "log-opts": { "max-size": "20m", "max-file": "10" },
Note: In general, the default Docker log is placed
/var/lib/docker/containers/
But in the same case kubernetes also saves logs and creates a directory structure to help you find pods-based logs, so you can find container logs for each Pod running on a node
/var/log/pods/<namespace>_<pod_name>_<pod_id>/<container_name>/
When removing pod, / var/lib/container under the docker/containers/log and k8s created under/var/log/pods/pod log will be deleted
For example, if the POD is restarted during production, the pod log will be deleted whether it is on the original node or jumped to another node
Therefore, this log needs to be saved in ES for centralized management. Many R&D projects will check the log for troubleshooting in most cases

How to update kubernetes cluster

I am working with Kube-Aws by coreos to generate a cloud formation script and deploy it as part of my stack,
I would like to upgrade my kubernetes cluster to a newer version.
I don't mind creating a new cluster, but what I do mind is recreating all the deployments/services etc...
Is there any way to take the configuration and replace/transfer them to the new cluster? maybe copy the entire etcd data? will that help?
Use kubectl get --export=true on all the resources that you want to move into a new cluster and then restore them that way.
kubectl get <pods,services,deployments,whatever> --export=true --all-namespaces=true