I've deployed the ELK stack to a Kubernetes cluster.
I'm also able to create the lifecycle policies in the Kibana UI. My issue is that every time I want to bring ELK to another cluster, or simply need to clear the Persistent Volume and redeploy, all of my data is gone. The logs don't matter (indeed, I want the logs to be empty in the new environment); I just want to have my policies available in the new env.
My idea is that every time I deploy the ELK stack, I will bootstrap it with the policies somehow.
First approach: call the APIs to create the policies, index templates, etc. once the deployment is ready. This would mean checking some readiness probe of Elasticsearch/Kibana and then calling the right commands => the peasant's way.
Second approach: find some way to extract the saved policies from the Elasticsearch "database", then copy them to the new Elasticsearch deployment.
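The second approach can actually be done through the ILM API itself rather than by copying Elasticsearch's data files: GET _ilm/policy returns every policy, and PUT _ilm/policy/&lt;name&gt; recreates one. Below is a minimal Python sketch; the old-es/new-es hostnames are placeholders, and the _meta.managed check for skipping Elasticsearch's built-in policies is an assumption about how newer versions mark them:

```python
import json
import urllib.request

def portable_policy(exported_entry):
    """Strip read-only metadata (version, modified_date, in_use_by) from an
    exported ILM policy entry, keeping only the 'policy' body that the
    PUT _ilm/policy/<name> API accepts."""
    return {"policy": exported_entry["policy"]}

def migrate_ilm_policies(src="http://old-es:9200", dst="http://new-es:9200"):
    # Fetch all lifecycle policies from the source cluster.
    with urllib.request.urlopen(f"{src}/_ilm/policy") as resp:
        policies = json.load(resp)
    for name, entry in policies.items():
        # Skip policies Elasticsearch manages itself (assumed _meta.managed flag).
        if entry.get("policy", {}).get("_meta", {}).get("managed"):
            continue
        body = json.dumps(portable_policy(entry)).encode()
        req = urllib.request.Request(
            f"{dst}/_ilm/policy/{name}", data=body, method="PUT",
            headers={"Content-Type": "application/json"})
        urllib.request.urlopen(req)
```

The same pattern works for index templates via the _index_template endpoints, and it doubles as your first approach if you point src at a file of policies checked into Git.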
Does anyone have experience with this?
Related
I currently have a CronJob that schedules a job to run periodically in a pattern. I want to export the logs of each pod run to a file at the path temp/logs/FILENAME,
with FILENAME being the timestamp at which the run was created. How am I going to do that? Hopefully someone can provide a solution. If you need to add a script, please use Python or a shell command. Thank you.
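One hedged sketch in Python: the pod name and namespace below are placeholders, and it assumes kubectl is available wherever the script runs (for example in a sidecar or a wrapper script around the job):

```python
import subprocess
import time
from pathlib import Path

def log_path(created_ts, base="temp/logs"):
    """Build the target path temp/logs/FILENAME, where FILENAME is the run's
    creation timestamp formatted as e.g. 2024-05-01T12-30-00."""
    stamp = time.strftime("%Y-%m-%dT%H-%M-%S", time.gmtime(created_ts))
    return Path(base) / f"{stamp}.log"

def export_pod_logs(pod_name, created_ts, namespace="default"):
    # 'kubectl logs' prints the pod's log to stdout; we capture and write it.
    path = log_path(created_ts)
    path.parent.mkdir(parents=True, exist_ok=True)
    logs = subprocess.run(
        ["kubectl", "logs", pod_name, "-n", namespace],
        capture_output=True, text=True, check=True).stdout
    path.write_text(logs)
    return path
```

You could get the creation timestamp of each run from the Job's metadata (kubectl get job -o json, .metadata.creationTimestamp) and pass it in as created_ts.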
According to Kubernetes Logging Architecture:
In a cluster, logs should have a separate storage and lifecycle
independent of nodes, pods, or containers. This concept is called
cluster-level logging.
Cluster-level logging architectures require a separate backend to
store, analyze, and query logs. Kubernetes does not provide a native
storage solution for log data. Instead, there are many logging
solutions that integrate with Kubernetes.
Which brings us to Cluster-level logging architectures:
While Kubernetes does not provide a native solution for cluster-level
logging, there are several common approaches you can consider. Here
are some options:
Use a node-level logging agent that runs on every node.
Include a dedicated sidecar container for logging in an application pod.
Push logs directly to a backend from within an application.
Kubernetes does not provide log aggregation of its own. Therefore, you need a local agent to gather the data and send it to the central log management. See some options below:
Fluentd
ELK Stack
You can find all the logs that Pods generate at /var/log/containers/*.log
on each Kubernetes node. You can work with them manually using simple scripts if you prefer, but keep in mind that Pods can run on any node (if not restricted), and nodes may come and go.
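Working with those node-level files can be as simple as a small script. A sketch in Python, assuming the usual &lt;pod&gt;_&lt;namespace&gt;_&lt;container&gt;-&lt;id&gt;.log naming under /var/log/containers (run it on the node itself, or from a DaemonSet with that host path mounted):

```python
import glob
import os

def list_container_logs(app_name, log_dir="/var/log/containers"):
    """Return the node-level log files whose names start with the given app
    name; kubelet names them <pod>_<namespace>_<container>-<id>.log."""
    pattern = os.path.join(log_dir, f"{app_name}*.log")
    return sorted(glob.glob(pattern))

def tail(path, lines=10):
    # Naive tail; fine for the small, rotated files kept per container.
    with open(path) as f:
        return f.readlines()[-lines:]
```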
Consider sending your logs to an external system like Elasticsearch or Grafana Loki and managing them there.
I am currently working on a project that uses the Prometheus Server (Not the Prometheus Operator).
We're looking to introduce a way of modifying the PrometheusRules without having to redeploy it.
I'm completely new to containers and Kubernetes and a little in over my head, so I'm hoping someone can let me know if I'm wasting my time trying to make this work.
What I have thought of doing so far is:
Store the PrometheusRules in a configmap.
Apply the configmap of rules to the Prometheus Server configuration.
Create a sidecar to the Prometheus Server that can modify this configMap.
The sidecar will have an API exposed so users will have CRUD functionality for the rules.
When successful modifying a rule, the sidecar will trigger the reload endpoint on the Prometheus Server that causes it to reload its configuration file without having to restart the container.
Thanks
Your initial use case seems valid, though I would say there are better ways of achieving this.
For points 1 and 2, I would suggest you use the Prometheus Helm chart for ease of use and better config management and deployment. This keeps the Prometheus configuration as one single unit rather than maintaining the rule files separately.
For points 3 and 4: making direct, untracked changes to the live configuration does not seem safe or secure. Using the Helm chart mentioned above, I would suggest you make the changes before deploying to the cluster (use a VCS like Git to track changes).
Best case scenario: also set up CI/CD pipelines to deploy changes instantly.
Use the reload API, as mentioned, to reload the newly released config.
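As a sketch, triggering that reload is a single HTTP call. Note that Prometheus only serves /-/reload when started with the --web.enable-lifecycle flag, and the prometheus-server hostname below is a placeholder for your Service name:

```python
import urllib.request

def reload_url(host="prometheus-server", port=9090):
    """The lifecycle endpoint; only active when Prometheus runs with
    --web.enable-lifecycle."""
    return f"http://{host}:{port}/-/reload"

def trigger_reload(host="prometheus-server", port=9090):
    # Prometheus expects a POST (or PUT) to /-/reload; a 200 response
    # means the configuration file was re-read successfully.
    req = urllib.request.Request(reload_url(host, port), method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.status == 200
```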
Explore more about Helm
I can't find the answer to a pretty easy question: where can I configure Cassandra (normally via cassandra.yaml) when it's deployed on a cluster with Kubernetes using Google Kubernetes Engine?
So I'm completely new to distributed databases, Kubernetes etc. and I'm setting up a cassandra cluster (4VMs, 1 pod each) using the GKE for a university course right now.
I used the official example on how to deploy Cassandra on Kubernetes that can be found on the Kubernetes homepage (https://kubernetes.io/docs/tutorials/stateful-application/cassandra/) with a StatefulSet, persistent volume claims, central load balancer, etc. Everything seems to work fine and I can connect to the DB via my Java application (using the DataStax Java/Cassandra driver) and via Google Cloud Shell + CQLSH on one of the pods directly. I created a keyspace and some tables and started filling them with data (~100 million entries planned), but as soon as the DB reaches some size, expensive queries result in a timeout exception (via DataStax and via CQL), just as expected. Speed isn't necessary for these queries right now; it's just for testing.
Normally I would start with trying to increase the timeouts in the cassandra.yaml, but I'm unable to locate it on the VMs and have no clue where to configure Cassandra at all. Can someone tell me if these configuration files even exist on the VMs when deploying with GKE and where to find them? Or do I have to configure those Cassandra details via Kubectl/CQL/StatefulSet or somewhere else?
I think the fastest way to configure Cassandra in Kubernetes Engine is to use the Cassandra deployment from the marketplace; there you can configure your cluster, and you can follow this guide, which is also linked there, to configure it correctly.
======
The timeout config appears to be a setting that must be modified inside the container (in Cassandra's own configuration).
You can use the command kubectl exec -it POD_NAME -- bash to open a shell in the Cassandra container; that will let you get into the container's configuration, look up the setting, and change it to what you require.
Once you have the configuration you need, you will want to automate it to avoid manual intervention every time one of your pods is recreated (the configuration will not survive a container recreation). The following options are only suggestions:
Build your own Cassandra image from your own Dockerfile, changing the value of the configuration you require there; the image you are using right now is a public image, and the container will always start with the config baked into the pulled image.
Edit the YAML of the StatefulSet where Cassandra is running and add an initContainer, which will let you change the configuration of the running container (Cassandra); this applies the change automatically with a script every time your pods start.
Choose the option that best fits your needs.
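A third, related option is to ship the file in a ConfigMap and mount it over the container's cassandra.yaml. A hypothetical sketch (the /etc/cassandra path and the timeout values below are assumptions; check the config path your image actually uses, and note the mounted file must be a complete cassandra.yaml, not just the changed keys):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cassandra-config
data:
  cassandra.yaml: |
    # ... the full cassandra.yaml, with raised timeouts, e.g.:
    read_request_timeout_in_ms: 20000
    write_request_timeout_in_ms: 10000
---
# Excerpt of the StatefulSet pod template:
spec:
  containers:
    - name: cassandra
      volumeMounts:
        - name: config
          mountPath: /etc/cassandra/cassandra.yaml
          subPath: cassandra.yaml
  volumes:
    - name: config
      configMap:
        name: cassandra-config
```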
Here's my scenario,
I want to launch a Job in Kubernetes: the first container that runs will look through a list of custom resources and launch each of the containers defined in that resource to completion. I don't know what's in the list ahead of time; I only know when the job is kicked off.
Is this possible? Can someone point me to something that shows how to do it?
You can use the Kubernetes client libraries to create any Kubernetes resource from inside your code (given that it has the correct service account, of course, if RBAC is configured in your cluster).
If you want to run a container to completion, a Kubernetes Job would be the best fit.
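As a minimal sketch of that, the controlling container can build a batch/v1 Job manifest as a plain dict and submit it with the official client for its language (for Python, kubernetes.client.BatchV1Api().create_namespaced_job) or by piping it to kubectl; the names and image below are placeholders:

```python
def build_job_manifest(name, image, command, namespace="default"):
    """Build a minimal batch/v1 Job manifest as a plain dict.
    restartPolicy must be Never or OnFailure for Jobs; backoffLimit
    caps how many times a failed pod is retried."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "backoffLimit": 3,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": name, "image": image, "command": command}
                    ],
                }
            },
        },
    }
```

In your scenario, the first container would iterate over the containers listed in the custom resource and emit one such Job per entry, then watch each Job's status for completion.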
It is possible to manage jobs programmatically using the kubernetes client-go project.
Here are some examples.
To create a job that runs to completion, refer to:
Job APIs
JobInterface
Batch client APIs
Custom resource definitions can be managed using the kubernetes apiextensions-apiserver project.
To manage custom resource definitions, refer to:
CRD APIs
CRD API tests
To create custom resources, refer to:
This link has steps to access the Kubernetes API from inside a pod
Example
We started using Kubernetes a while ago, and we have now deployed a fair number of services. It's becoming more and more difficult to know exactly what is deployed. I suppose many people are facing the same issue, so is there already a solution for it?
I'm talking about a solution that, when connected to Kubernetes (via kubectl, for example), can generate a kind of map of the cluster.
In order to display one or more resources, you can use the kubectl get command.
To show the details of a specific resource or group of resources, you can use the kubectl describe command.
Please check the links I provided for more details and examples.
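If you want a quick scripted overview on top of those commands, one hedged sketch is to parse kubectl's JSON output and count resources by kind (note that 'kubectl get all' only covers the common workload kinds, so this is an approximation of a full cluster map):

```python
import json
import subprocess
from collections import Counter

def summarize(items):
    """Count resources by kind from a parsed 'kubectl get ... -o json' list."""
    return Counter(item["kind"] for item in items)

def cluster_overview():
    # -A spans all namespaces; -o json gives a List whose .items we can count.
    out = subprocess.run(
        ["kubectl", "get", "all", "-A", "-o", "json"],
        capture_output=True, text=True, check=True).stdout
    return summarize(json.loads(out)["items"])
```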
You may also want to use Web UI (Dashboard)
Dashboard is a web-based Kubernetes user interface. You can use
Dashboard to deploy containerized applications to a Kubernetes
cluster, troubleshoot your containerized application, and manage the
cluster resources. You can use Dashboard to get an overview of
applications running on your cluster, as well as for creating or
modifying individual Kubernetes resources (such as Deployments, Jobs,
DaemonSets, etc). For example, you can scale a Deployment, initiate a
rolling update, restart a pod or deploy new applications using a
deploy wizard.
Let me know if that helped.