Kubernetes Job to create a volume snapshot - kubernetes

I have a job, which I want to run regularly in Kubernetes 1.19.3 (DigitalOcean).
For this job, I need to take a snapshot of a PVC and do stuff to it. I know how can I run a job and mount a volume to the pod it runs, but I have a hard time finding out how to take that snapshot at the beginning of this job.
Is there any way to do it?

The tool of choice to take PV snapshots in K8s is VolumeSnapshots.
The trouble with them is that they don't come yet) with functionality for periodic triggering. So, you would have to create them from a K8s CronJob. However, doing so is not terribly straight forward, since your CronJob Pod would need to have a K8s client installed and require access to the K8s API Server with RBAC.
There are a couple of options to get there, reaching from writing your own image from scratch to using open-source solutions based on the clients from this project k8s client libraries.
Seeing that dynamic K8s manifest applying is somewhat badly supported by K8s, I actually started an open source project myself, that you could use for this purpose: K8sCrud.

Related

reversing the order of pod recreation in Kubernetes Vertical Pod Autoscaler

I am trying to use VPA for autoscaling my deployed services. Due to limitation in resources in my cluster I set the min_replica option to 1. The workflow of VPA that have seen so far is that it first deletes the existing pod and then re-create the pod. This approach will cause a downtime to my services. What I want is that the VPA first create the new pod and then deletes the old pod, completely similar to the rolling updates for deployments. Is there an option or hack to reverse the flow to the desired order in my case?
This can be achieved by using python script or by using an IAC pipeline, you can get the metrics of the kubernetes cluster and whenever these metrics exceed a certain threshold, trigger this python code for creating new pod with the required resources and shutdown the old pod. Follow this github link for more info on python plugin for kubernetes.
Ansible can also be used for performing this operation. This can be achieved by triggering your ansible playbook whenever the threshold breaches a certain limit and you can specify the new sizes of the pods that need to be created. Follow this official ansible document for more information. However both these procedures involve manual analysis for selecting the desired pod size for scaling. So if you don’t want to use vertical scaling you can go for horizontal scaling.
Note: The information is gathered from official Ansible and github pages and the urls are referred to in the post.

Export logs of Kubernetes cronjob to a path after each run

I currently have a Cronjob that has a job that schedule at some period of time and run in a pattern. I want to export the logs of each pod runs to a file in the path as temp/logs/FILENAME
with the FILENAME to be the timestamp of the run being created. How am I going to do that? Hopefully to provide a solution. If you would need to add a script, then please use python or shell command. Thank you.
According to Kubernetes Logging Architecture:
In a cluster, logs should have a separate storage and lifecycle
independent of nodes, pods, or containers. This concept is called
cluster-level logging.
Cluster-level logging architectures require a separate backend to
store, analyze, and query logs. Kubernetes does not provide a native
storage solution for log data. Instead, there are many logging
solutions that integrate with Kubernetes.
Which brings us to Cluster-level logging architectures:
While Kubernetes does not provide a native solution for cluster-level
logging, there are several common approaches you can consider. Here
are some options:
Use a node-level logging agent that runs on every node.
Include a dedicated sidecar container for logging in an application pod.
Push logs directly to a backend from within an application.
Kubernetes does not provide log aggregation of its own. Therefore, you need a local agent to gather the data and send it to the central log management. See some options below:
Fluentd
ELK Stack
You can find all logs that PODs are generating at /var/log/containers/*.log
on each Kubernetes node. You could work with them manually if you prefer, using simple scripts, but you will have to keep in mind that PODs can run on any node (if not restricted), and nodes may come and go.
Consider sending your logs to an external system like ElasticSearch or Grafana Loki and manage them there.

Where is KOPS located/running from?

I am new to Docker and Kubernetes, though I have mostly figured out how it all works at this point.
I inherited an app that uses both, as well as KOPS.
One of the last things I am having trouble with is the KOPS setup. I know for absolute certain that Kubernetes is setup via KOPS. There's two KOPS state stores on an S3 bucket (corresponding to a dev and prod cluster respectively)
However while I can find the server that kubectl/kubernetes is running on, absolutely none of the servers I have access to seem to have a kops command.
Am I misunderstanding how KOPS works? Does it not do some sort of dynamic monitoring (would that just be done by ReplicaSet by itself?), but rather just sets a cluster running and it's done?
I can include my cluster.spec or config files, if they're helpful to anyone, but I can't really see how they're super relevant to this question.
I guess I'm just confused - as far as I can tell from my perspective, it looks like KOPS is run once, sets up a cluster, and is done. But then whenever one of my node or master servers goes down, it is self-healing. I would expect that of the node servers, but not the master servers.
This is all on AWS.
Sorry if this is a dumb question, I am just having trouble conceptually understanding what is going on here.
kops is a command line tool, you run it from your own machine (or a jumpbox) and it creates clusters for you, it’s not a long-running server itself. It’s like Terraform if you’re familiar with that, but tailored specifically to spinning up Kubernetes clusters.
kops creates nodes on AWS via autoscaling groups. It’s this construct (which is an AWS thing) that ensures your nodes come back to the desired number.
kops is used for managing Kubernetes clusters themselves, like creating them, scaling, updating, deleting. kubectl is used for managing container workloads that run on Kubernetes. You can create, scale, update, and delete your replica sets with that. How you run workloads on Kubernetes should have nothing to do with how/what tool you (or some cluster admin) use to manage the Kubernetes cluster itself. That is, unless you’re trying to change the “system components” of Kubernetes, like the Kubernetes API or kubedns, which are cluster-admin-level concerns but happen to run on top of Kuberentes as container workloads.
As for how pods get spun up when nodes go down, that’s what Kubernetes as a container orchestrator strives to do. You declare the desired state you want, and the Kubernetes system makes it so. If things crash or fail or disappear, Kubernetes aims to reconcile this difference between actual state and desired state, and schedules desired container workloads to run on available nodes to bring the actual state of the world back in line with your desired state. At a lower level, AWS does similar things — it creates VMs and keeps them running. If Amazon needs to take down a host for maintenance it will figure out how to run your VM (and attach volumes, etc.) elsewhere automatically.

Clusters and nodes formation in Kubernetes

I am trying to deploy my Docker images using Kubernetes orchestration tools.When I am reading about Kubernetes, I am seeing documentation and many YouTube video tutorial of working with Kubernetes. In there I only found that creation of pods, services and creation of that .yml files. Here I have doubts and I am adding below section,
When I am using Kubernetes, how I can create clusters and nodes ?
Can I deploy my current docker-compose build image directly using pods only? Why I need to create services yml file?
I new to containerizing, Docker and Kubernetes world.
My favorite way to create clusters is kubespray because I find ansible very easy to read and troubleshoot, unlike more monolithic "run this binary" mechanisms for creating clusters. The kubespray repo has a vagrant configuration file, so you can even try out a full cluster on your local machine, to see what it will do "for real"
But with the popularity of kubernetes, I'd bet if you ask 5 people you'll get 10 answers to that question, so ultimately pick the one you find easiest to reason about, because almost without fail you will need to debug those mechanisms when something inevitably goes wrong
The short version, as Hitesh said, is "yes," but the long version is that one will need to be careful because local docker containers and kubernetes clusters are trying to solve different problems, and (as a general rule) one could not easily swap one in place of the other.
As for the second part of your question, a Service in kubernetes is designed to decouple the current provider of some networked functionality from the long-lived "promise" that such functionality will exist and work. That's because in kubernetes, the Pods (and Nodes, for that matter) are disposable and subject to termination at almost any time. It would be severely problematic if the consumer of a networked service needed to constantly update its IP address/ports/etc to account for the coming-and-going of Pods. This is actually the exact same problem that AWS's Elastic Load Balancers are trying to solve, and kubernetes will cheerfully provision an ELB to represent a Service if you indicate that is what you would like (and similar behavior for other cloud providers)
If you are not yet comfortable with containers and docker as concepts, then I would strongly recommend starting with those topics, and moving on to understanding how kubernetes interacts with those two things after you have a solid foundation. Else, a lot of the terminology -- and even the problems kubernetes is trying to solve -- may continue to seem opaque

Kubernetes job that consists of two pods (that must run on different nodes and communicate with each other)

I am trying to create a Kubernetes job that consists of two pods that have to be scheduled on separate nodes in our Hybrid cluster. Our requirement is that one of the pods runs on a Windows Server node and the other pod is running on a Linux node (thus we cannot just run two Docker containers from the same pod, which I know is possible, but would not work in our scenario). The Linux pod (which you can imagine as a client) will communicate over the network with the Windows pod (which you can imagine as a stateful server) exchanging data while the job runs. When the Linux pod terminates, we want to also terminate the Windows pod. However, if one of the pods fail, then we want to fail both pods (as they are designed to be a single job)
Our current design is to write a K8S service that handles the communication between the pods, and then apply the service and the two pods to the cluster to "emulate" a job. However, this is not ideal since the two pods are not tightly coupled as a single job and adds quite a bit of overhead to manually manage this setup (e.g. when failures or the job, we probably need to manually kill the service and deployment of the Windows pod). Plus we would need to deploy a new service for each "job", as we require the Linux pod to always communicate with the same Windows pod for the duration of the job due to underlying state (thus cannot use a single service for all Windows pods).
Any thoughts on how this could be best achieved on Kubernetes would be much appreciated! Hopefully this scenario is supported natively, and I would not need to resort in this kind of pod-service-pod setup that I described above.
Many thanks
I am trying to distinguish your distaste for creating and wiring the Pods from your distaste at having to do so manually. Because, in theory, a Job that creates Pods is very similar to what you are describing, and would be able to have almost infinite customization for those kinds of rules. With a custom controller like that, one need not create a Service for the client(s) to speak to their server, as the Job could create the server Pod first, obtain its Pod-specific-IP, and feed that to the subsequently created client Pods.
I would expect one could create a Job controller using only bash and either curl or kubectl: generate the json or yaml that describes the situation you wish to have, feed it to the kubernetes API (since the Job would have a service account - just like any other in-cluster container), and use normal traps to cleanup after itself. Without more of the specific edge cases loaded in my head it's hard to say if that's a good idea or not, but I believe it's possible.