I am using AWS EKS to run a Kubernetes cluster.
On it, I use AWS EFS as persistent storage for application logs. I have many applications running, and I create a PVC for each one; I need persistent storage for the application logs only. Now I need these logs in Elasticsearch as well, so I use Filebeat to ship them there. That is my architecture: EFS-backed PVCs for log storage, with Filebeat forwarding the logs to Elasticsearch.
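For concreteness, the Filebeat side looks roughly like this (the paths, names, and Elasticsearch host are placeholders, not my real values):

```
# Sketch of the Filebeat config, mounted into the Filebeat pod as a ConfigMap.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
data:
  filebeat.yml: |
    filebeat.inputs:
      - type: log
        paths:
          - /mnt/efs/app-logs/*.log   # the EFS-backed PVC mount
    output.elasticsearch:
      hosts: ["elasticsearch:9200"]
EOF
```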
I just want to get feedback on this architecture. Is this the correct way to do it? What are the possible drawbacks? How are you sending application logs to ELK in Kubernetes?
I am looking to implement a Redis cache for a number of web applications in Kubernetes, but I am not sure exactly how to architect the Redis part.
I was thinking that if I have 5 replicas of my application, they could all use a single Redis cache running in a separate pod, since I wanted to avoid a sidecar container in every application pod. So each application would get its own Redis Deployment in Kubernetes, and the application would connect to it (through a Service, I guess).
Does this sound like a suitable plan?
How does the application talk to the Redis pod? Do I need to expose it via a Service?
I've seen suggestions that you should co-locate your Redis cache and application on the same node. Is this a real concern, and is there a way to do it?
With Helm you can easily install Redis on your cluster using the Bitnami chart.
Alternatively, I prefer to install a Redis operator and let it do the magic.
Either way, you can install one or more Redis instances on your Kubernetes cluster, and they will be accessible through a Kubernetes Service at something like my-redis-service.cool-namespace.svc.cluster.local:6379.
There is no need to co-locate Redis on the same node as the application; that is exactly the kind of placement work Kubernetes does for you.
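As a sketch of the Helm route (the release and namespace names are illustrative; the Bitnami chart typically names the main Service after the release):

```
# Install a standalone Redis with the Bitnami chart (names are examples)
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-redis bitnami/redis --namespace cool-namespace --create-namespace

# The chart prints the in-cluster address, typically:
#   my-redis-master.cool-namespace.svc.cluster.local:6379
kubectl get svc -n cool-namespace
```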
I currently have a CronJob that schedules a Job periodically and runs on a pattern. I want to export the logs of each pod run to a file at the path temp/logs/FILENAME,
with FILENAME being the timestamp at which the run was created. How would I do that? Hopefully someone can provide a solution; if a script is needed, please use Python or shell commands. Thank you.
According to the Kubernetes Logging Architecture documentation:
In a cluster, logs should have a separate storage and lifecycle independent of nodes, pods, or containers. This concept is called cluster-level logging.
Cluster-level logging architectures require a separate backend to store, analyze, and query logs. Kubernetes does not provide a native storage solution for log data. Instead, there are many logging solutions that integrate with Kubernetes.
Which brings us to Cluster-level logging architectures:
While Kubernetes does not provide a native solution for cluster-level logging, there are several common approaches you can consider. Here are some options:
Use a node-level logging agent that runs on every node.
Include a dedicated sidecar container for logging in an application pod (see the sketch after this list).
Push logs directly to a backend from within an application.
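To make the sidecar option concrete, here is a minimal sketch (all names, images, and paths are illustrative): the app writes to a shared emptyDir volume, and a second container streams the file to its own stdout, where node-level agents and kubectl logs can pick it up.

```
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecar
spec:
  volumes:
    - name: logs
      emptyDir: {}          # shared between the app and the sidecar
  containers:
    - name: app
      image: busybox
      command: ['sh', '-c', 'while true; do date >> /var/log/app/app.log; sleep 5; done']
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
    - name: log-streamer    # sidecar: re-emits the file on stdout
      image: busybox
      command: ['sh', '-c', 'tail -n+1 -F /var/log/app/app.log']
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
EOF
```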
Kubernetes does not provide log aggregation of its own, so you need a local agent to gather the data and send it to a central log management system. See some options below:
Fluentd
ELK Stack
You can find all the logs that pods generate under /var/log/containers/*.log on each Kubernetes node. You could work with them manually if you prefer, using simple scripts, but keep in mind that pods can run on any node (if not restricted), and nodes may come and go.
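For the CronJob question above, a simple script along these lines could do it. This is a sketch: it assumes the Jobs in the current namespace all come from that CronJob, and that kubectl logs can read logs straight from a Job reference.

```
#!/bin/sh
# Dump the most recent Job run's logs to temp/logs/<creation-timestamp>.log
mkdir -p temp/logs
# Most recently created Job (prints e.g. "job.batch/my-cronjob-27981234")
JOB=$(kubectl get jobs --sort-by=.metadata.creationTimestamp -o name | tail -n 1)
# Use the Job's creation timestamp as the file name
TS=$(kubectl get "$JOB" -o jsonpath='{.metadata.creationTimestamp}')
kubectl logs "$JOB" > "temp/logs/${TS}.log"
```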
Consider sending your logs to an external system like Elasticsearch or Grafana Loki and managing them there.
I am new to Kubernetes, and for our application I need to find the number of active connections of a pod. I analyzed a bit and learned that we can get such data using Kubernetes addons like Istio, but also that such addons can cost memory, since they inject a kind of sidecar container into each pod to track this data.
Our application will eventually be deployed on IBM Cloud. I am not aware at the moment of any efficient addons specific to that cloud provider; I am still checking on it.
I could not get much from the Kubernetes API. I would like to know if there is any other way to get the active connection count of a pod.
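One sidecar-free approach, sketched below, is to exec into the pod and count TCP sockets in the ESTABLISHED state (state code 01 in /proc/net/tcp). It assumes the container image ships a shell; "my-pod" and "my-container" are placeholders.

```
# Count ESTABLISHED TCP connections inside a pod (state field $4 == "01").
# Add /proc/net/tcp6 to the awk file list if you also need IPv6 sockets.
kubectl exec my-pod -c my-container -- \
  sh -c 'awk "\$4 == \"01\"" /proc/net/tcp | wc -l'
```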
We started using Kubernetes some time ago, and we have now deployed a fair number of services. It is becoming more and more difficult to know exactly what is deployed. I suppose many people face the same issue, so is there already a solution for it?
I'm talking about a solution that, when connected to Kubernetes (via kubectl, for example), can generate a kind of map of the cluster.
To display one or many resources, use the kubectl get command.
To show the details of a specific resource or group of resources, use the kubectl describe command.
Please check the kubectl reference documentation for more details and examples.
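For example (the namespace and resource names below are placeholders):

```
# Broad inventory of common resources across all namespaces
kubectl get all --all-namespaces

# One resource type in one namespace
kubectl get deployments -n my-namespace

# Full details plus recent events for a single resource
kubectl describe deployment my-deployment -n my-namespace
```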
You may also want to use the Web UI (Dashboard):
Dashboard is a web-based Kubernetes user interface. You can use Dashboard to deploy containerized applications to a Kubernetes cluster, troubleshoot your containerized application, and manage the cluster resources. You can use Dashboard to get an overview of applications running on your cluster, as well as for creating or modifying individual Kubernetes resources (such as Deployments, Jobs, DaemonSets, etc). For example, you can scale a Deployment, initiate a rolling update, restart a pod or deploy new applications using a deploy wizard.
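Deploying it usually comes down to applying the project's manifest and proxying to it. The URL below pins release v2.7.0; check the project for the current one.

```
# Install the Dashboard (version pinned; newer releases may exist)
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml

# Proxy the cluster API locally, then open:
# http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
kubectl proxy
```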
Let me know if that helped.
I would like to know if it is possible for multiple pods in the same Kubernetes cluster to access a database that is configured using persistent volumes on a Google Cloud persistent disk.
Currently I am building a microservices-architecture web app which has 3 Node APIs in different pods, all accessing the same database. How do I achieve this with Kubernetes?
Kindly let me know if my architecture is right as well.
You can certainly connect multiple Node-based app pods to the same database. It is sometimes said that microservices shouldn't share a database, but this depends on what your apps are doing, the project's history, and the extent to which you want the parts to be worked on separately.
There are questions you have to answer about running databases at scale, such as your future load and whether you want to use relational databases if you're going to span availability zones. And there are some questions specific to Kubernetes, especially around how you associate DB pods with their data; see https://stackoverflow.com/a/53980021/9705485. Another popular option is to use a managed DB service from a cloud provider. If you do run the DB in k8s, then I'd suggest looking for a Helm chart or at an operator, such as the KubeDB operator, to avoid crafting the Kubernetes descriptors yourself and to get more guidance on running and setting up the DB.
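One detail worth spelling out for the multi-pod question: a GCE persistent disk volume is normally ReadWriteOnce, so a single database pod mounts the disk, and every API pod reaches the database over the network through a Service. A minimal sketch (all names and the port are illustrative):

```
# One DB pod mounts the disk; app pods connect via this Service's DNS name.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: my-db
spec:
  selector:
    app: my-db        # must match the label on the database pod
  ports:
    - port: 5432      # e.g. PostgreSQL
EOF

# Each Node API then connects to: my-db.default.svc.cluster.local:5432
```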
If it's a new project and you've not used k8s before, then you'll also have to decide where to host your code, your Docker images, and your deployment descriptors, and how to set up your CI pipelines. If you don't have answers to these questions already, I'd suggest looking at Jenkins X: it provides out-of-the-box defaults for a whole cluster and CI setup, plus a template (a 'build pack') for building Node apps and deploying them to staging and prod environments through a pipeline.