Export logs of a Kubernetes CronJob to a path after each run

I currently have a CronJob that schedules a Job at a set interval. I want to export the logs of each pod run to a file at the path temp/logs/FILENAME,
where FILENAME is the timestamp at which the run was created. How can I do that? If a script is needed, please use Python or a shell command. Thank you.

According to Kubernetes Logging Architecture:
In a cluster, logs should have a separate storage and lifecycle
independent of nodes, pods, or containers. This concept is called
cluster-level logging.
Cluster-level logging architectures require a separate backend to
store, analyze, and query logs. Kubernetes does not provide a native
storage solution for log data. Instead, there are many logging
solutions that integrate with Kubernetes.
Which brings us to Cluster-level logging architectures:
While Kubernetes does not provide a native solution for cluster-level
logging, there are several common approaches you can consider. Here
are some options:
Use a node-level logging agent that runs on every node.
Include a dedicated sidecar container for logging in an application pod.
Push logs directly to a backend from within an application.
Kubernetes does not provide log aggregation of its own, so you need a local agent to gather the data and send it to a central log management system. Some options:
Fluentd
ELK Stack
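If a full logging backend is more than you need for the original question, a small wrapper around kubectl can already produce the per-run files. Below is a minimal shell sketch, not a definitive solution: it assumes the CronJob's pods live in the default namespace, that they carry the job-name label that Jobs add to their pods, and that temp/logs is relative to where the script runs.

#!/bin/sh
# Sketch: dump the logs of the most recent pod created by a Job into
# temp/logs/<timestamp>.log. Namespace, selector and paths are assumptions.
NAMESPACE="default"
OUTDIR="temp/logs"
mkdir -p "${OUTDIR}"

# Most recently started pod that carries a job-name label (added by Jobs).
POD=$(kubectl -n "${NAMESPACE}" get pods -l job-name \
        --sort-by=.status.startTime -o name | tail -n 1)

# Use the pod's creation timestamp as the file name for this run.
TS=$(kubectl -n "${NAMESPACE}" get "${POD}" \
        -o jsonpath='{.metadata.creationTimestamp}')

kubectl -n "${NAMESPACE}" logs "${POD}" > "${OUTDIR}/${TS}.log"

You could run this from an outer cron entry shortly after each CronJob run, or narrow the selector to the specific Job names your CronJob produces.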

You can find all logs that Pods are generating at /var/log/containers/*.log
on each Kubernetes node. You could work with them manually using simple scripts if you prefer, but keep in mind that Pods can run on any node (if not restricted), and nodes may come and go.
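For example, a hedged sketch of such a manual script, run directly on a node; the pod name prefix and destination directory are assumptions:

#!/bin/sh
# Sketch to run on a node: copy the current log files of a CronJob's pods from
# /var/log/containers into temp/logs, stamping each copy with the current time.
POD_PATTERN="my-cronjob"        # assumed prefix of the pods created by the CronJob
DEST="temp/logs"
mkdir -p "${DEST}"

for f in /var/log/containers/${POD_PATTERN}*.log; do
  [ -e "$f" ] || continue       # skip if nothing matched
  cp "$f" "${DEST}/$(date +%Y-%m-%dT%H%M%S)-$(basename "$f")"
done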
Consider sending your logs to an external system like Elasticsearch or Grafana Loki and managing them there.

Related

K8s limit for pod logs

When I try to retrieve logs from my pods, I notice that K8s does not print all of them; I know this because the logs about microservice initialization are missing from the beginning of the output.
Considering that my pods print a lot of logs over a long observation period, does anyone know whether K8s has a limit on how many logs it shows?
I also tried setting the --since parameter in the kubectl logs command to get all logs in a specific time range, but it seems to have no effect.
Thanks.
The container runtime engine typically manages container (pod) logs. Do check the settings on the runtime engine in use.
There seems to have been an issue with logging earlier; attaching the link for the same: https://github.com/kubernetes/kubernetes/pull/78071
There are already some answers; I'll add more details and sources.
The short answer: there is no limit other than free disk space. By default, Kubernetes is not responsible for log rotation:
An important consideration in node-level logging is implementing log
rotation, so that logs don't consume all available storage on the
node. Kubernetes is not responsible for rotating logs, but rather a
deployment tool should set up a solution to address that. For example,
in Kubernetes clusters, deployed by the kube-up.sh script, there is a
logrotate tool configured to run each hour. You can also set up a
container runtime to rotate an application's logs automatically.
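As a rough illustration of that last point, the kubelet's configuration exposes log-rotation settings for container logs; a hedged check, assuming the common config path /var/lib/kubelet/config.yaml:

# Inspect the current container log rotation settings on a node (path is an assumption).
grep -iE 'containerLogMaxSize|containerLogMaxFiles' /var/lib/kubelet/config.yaml

# Typical values look like:
#   containerLogMaxSize: 10Mi    # rotate each container log file after 10 MiB
#   containerLogMaxFiles: 5      # keep at most 5 files per container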
As William stated, Kubernetes itself doesn't provide log aggregation and relies on the container runtime by default.
When a container running on Kubernetes writes its logs to stdout or
stderr streams, they are picked up by the kubelet service running on
that node, and are delegated to the container engine for handling
based on the logging driver configured in Kubernetes.
In most cases, Docker container logs will end up in the
/var/log/containers directory on your host. Docker supports multiple
logging drivers but, unfortunately, Kubernetes API does not support
driver configuration.
Once a container terminates or restarts, kubelet keeps its logs on the
node. To prevent these files from consuming all of the host’s storage,
a log rotation mechanism should be set on the node.
Kubernetes doesn’t provide built-in log rotation, but this
functionality is available in many tools, such as Docker’s log-opt, or
standard file shippers or even a simple custom cron job. When a
container is evicted from the node, so are its corresponding log files
That means you can try to find the full logs in /var/log/containers and /var/log/pods. This part from the official documentation is more precise:
By default, if a container restarts, the kubelet keeps one terminated
container with its logs. If a pod is evicted from the node, all
corresponding containers are also evicted, along with their logs.
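A hedged sketch of how to check what is actually retained; the pod and namespace names are placeholders, and the ls commands must run on the node where the pod ran:

# Logs of the current container and of the one retained terminated container.
kubectl -n my-namespace logs my-pod
kubectl -n my-namespace logs my-pod --previous

# On the node itself, the raw (possibly rotated) files live under /var/log/pods
# and are symlinked from /var/log/containers.
ls -lh /var/log/pods/my-namespace_my-pod_*/
ls -l /var/log/containers/ | grep my-pod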
For good visibility and accessibility of logs, you may consider a dedicated solution for log storage, e.g. a node-level logging agent or streaming to a sidecar container.
Please see the following articles and the official Kubernetes documentation for concepts and examples:
Kubernetes logging architecture
Practical guide to kubernetes

Kubernetes Job to create a volume snapshot

I have a job, which I want to run regularly in Kubernetes 1.19.3 (DigitalOcean).
For this job, I need to take a snapshot of a PVC and do stuff to it. I know how to run a job and mount a volume into the pod it runs, but I'm having a hard time figuring out how to take that snapshot at the beginning of the job.
Is there any way to do it?
The tool of choice to take PV snapshots in K8s is VolumeSnapshots.
The trouble with them is that they don't (yet) come with functionality for periodic triggering, so you would have to create them from a K8s CronJob. However, doing so is not terribly straightforward, since your CronJob pod would need to have a K8s client installed and require access to the K8s API server with RBAC.
There are a couple of options to get there, ranging from writing your own image from scratch to using open-source solutions based on the clients from this project: k8s client libraries.
Seeing that applying K8s manifests dynamically is somewhat poorly supported, I actually started an open-source project myself that you could use for this purpose: K8sCrud.
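For reference, a minimal sketch of what the command inside such a CronJob pod could look like, assuming kubectl is present in the image, RBAC allows creating VolumeSnapshots, and the PVC and snapshot class names are placeholders (your DigitalOcean class name may differ):

#!/bin/sh
# Sketch: create a timestamped VolumeSnapshot of a PVC from inside a CronJob pod.
PVC_NAME="my-data-pvc"                        # assumed PVC name
SNAP_NAME="${PVC_NAME}-$(date +%Y%m%d%H%M%S)"

kubectl apply -f - <<EOF
apiVersion: snapshot.storage.k8s.io/v1beta1   # v1 on newer clusters
kind: VolumeSnapshot
metadata:
  name: ${SNAP_NAME}
spec:
  volumeSnapshotClassName: do-block-storage   # assumed snapshot class
  source:
    persistentVolumeClaimName: ${PVC_NAME}
EOF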

Kubernetes Cluster - How to automatically generate documentation/Architecture of services

We started using Kubernetes a while ago, and we have now deployed a fair number of services. It's becoming more and more difficult to know exactly what is deployed. I suppose many people are facing the same issue, so is there already a solution to handle it?
I'm talking about a solution that, when connected to Kubernetes (via kubectl for example), can generate a kind of map of the cluster.
In order to display one or many resources, you can use the kubectl get command.
To show details of a specific resource or group of resources, you can use the kubectl describe command.
Please check the links I provided for more details and examples.
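For example (resource and namespace names are placeholders):

# Everything deployed, across all namespaces (common resource types only).
kubectl get all --all-namespaces

# A wider inventory, including config and networking objects.
kubectl get deployments,statefulsets,daemonsets,services,ingresses,configmaps --all-namespaces

# Details of a single resource.
kubectl describe deployment my-deployment -n my-namespace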
You may also want to use Web UI (Dashboard)
Dashboard is a web-based Kubernetes user interface. You can use
Dashboard to deploy containerized applications to a Kubernetes
cluster, troubleshoot your containerized application, and manage the
cluster resources. You can use Dashboard to get an overview of
applications running on your cluster, as well as for creating or
modifying individual Kubernetes resources (such as Deployments, Jobs,
DaemonSets, etc). For example, you can scale a Deployment, initiate a
rolling update, restart a pod or deploy new applications using a
deploy wizard.
Let me know if that helped.

Live monitoring of container, nodes and cluster

We are using a k8s cluster for one of our applications. The cluster is owned by another team and we don't have full control over it. We are trying to get metrics about resource utilization (CPU and memory), details about running containers/pods/nodes, etc. We also need to find out how many containers are running in parallel. The problem is that they have exposed cluster monitoring via Prometheus, but with Prometheus we are not getting live data, and it has no information about running containers.
My question is: which API, available in a k8s cluster by default, can give us all of this? We don't want to read data from another client like Prometheus or anything else; we want to read metrics directly from the cluster so that the data is not stale. Any suggestions?
As you mentioned, you will need metrics-server (or Heapster) to get that information.
You can confirm whether your metrics server is running with kubectl top nodes/pods, or just by checking whether there is a heapster or metrics-server pod present in the kube-system namespace.
The provided command will also show you the information you are looking for. I won't go into details, as here you can find a lot of clues and ways of looking at cluster resource usage. You should probably take a look at cAdvisor too, which should already be present in the cluster; it exposes a web UI that exports live information about all the containers on the machine.
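A short sketch of the calls involved, assuming metrics-server is installed:

# Human-readable usage via metrics-server.
kubectl top nodes
kubectl top pods --all-namespaces

# The same data straight from the Metrics API, without any extra client.
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/default/pods

# Rough count of containers in currently running pods, via the core API.
kubectl get pods --all-namespaces --field-selector=status.phase=Running \
  -o jsonpath='{.items[*].spec.containers[*].name}' | wc -w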
Other than that, there are probably commercial ways of achieving what you are looking for, for example SignalFx and other similar projects, but this will probably require the cluster administrator's involvement.

Application monitoring in Azure Kubernetes cluster using new relic

Requirement: New Relic monitoring for an application running in pods as part of a Kubernetes cluster.
I have installed kube-state-metrics on my cluster and am able to see the Kubernetes dashboard using New Relic Insights.
I also need to configure application monitoring for the same application, following https://blog.newrelic.com/2017/11/27/monitoring-application-performance-in-kubernetes/.
I have some questions:
Can this be achieved using kube-state-metrics?
Do I need a separate YAML file for each pod containing the license key?
Do I need to make changes in my application as well, or will adding the information in the spec work?
Do I need to install the Java agent in every pod? If yes, will it eat resources?
Somehow, installing application monitoring is becoming complex. Please explain exactly what the installation requires.
You didn't mention your stack; you should follow the instructions on their site for your language. Typically you just pull in their agent library and configure credentials to get started. You should have no reason to tell your pods apart, so the agent credentials can be the same for all pods.
Installing agents at the infrastructure level gives you infrastructure data, so you'll get alerts if you're running out of memory/space/CPU and the like. An infrastructure agent cannot possibly know about application data. If you want application performance data (APM), you need to install the agent at the application level too; then you'll get data such as HTTP request rates, error rates and response times if it's a web server. You can also annotate the current transaction with application-specific data. They have a bunch of client agents; see if there's one for your stack. For example, all you need for a Node.js service is require('newrelic') at the top of your app plus configuration.
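As a hedged sketch of how the credentials could be wired in on the Kubernetes side rather than per pod (the deployment and secret names are placeholders; NEW_RELIC_LICENSE_KEY and NEW_RELIC_APP_NAME are the environment variables the New Relic language agents conventionally read):

# Store the license key once as a Secret (placeholder values).
kubectl create secret generic newrelic-license \
  --from-literal=NEW_RELIC_LICENSE_KEY=YOUR_LICENSE_KEY

# Expose it to an existing Deployment's containers.
kubectl set env deployment/my-app --from=secret/newrelic-license
kubectl set env deployment/my-app NEW_RELIC_APP_NAME=my-app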