In GCP Kubernetes I have 2 clusters in different regions. In both I have deployed the Elasticsearch and Kibana operator, and logs are pushed by Filebeat, which runs in a container alongside the application container in the same pod.
I plan to deploy the Elasticsearch and Kibana operator in every cluster, so I am looking into whether it is feasible to have a centralized Kibana without an Elasticsearch of its own, as I don't want to spend money on storage for a centralized Elasticsearch that holds the logs of all the other regional clusters.
Expectation: I will have a centralized Kibana in which I configure the other Kibanas' IPs and passwords, and my queries should go through each cluster's Kibana, which fetches the data and returns it to the central Kibana.
Is this possible? Any alternative suggestions, please.
Kibana needs Elasticsearch for storing its configuration data, so you can add one small Elasticsearch node alongside the centralized Kibana.
Then connect that node to the external Elasticsearch servers as remote clusters: https://www.elastic.co/guide/en/kibana/current/working-remote-clusters.html
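As a rough sketch of what connecting to the external Elasticsearch servers could look like: the small central node registers each regional cluster as a remote through the cluster settings API, and Kibana index patterns then use the alias:index form. The aliases, hostnames and the filebeat-* pattern below are made-up placeholders, and the snippet only builds the request body; it does not talk to any cluster.

```python
import json


def remote_cluster_settings(remotes):
    """Build the body for `PUT _cluster/settings` that registers remote clusters.

    `remotes` maps a cluster alias to a list of "host:port" seed addresses
    (the transport port of the remote cluster, 9300 by default).
    """
    return {
        "persistent": {
            "cluster": {
                "remote": {
                    alias: {"seeds": list(seeds)}
                    for alias, seeds in remotes.items()
                }
            }
        }
    }


def cross_cluster_pattern(aliases, index_pattern):
    """Return an index pattern that spans the given remote clusters."""
    return ",".join(f"{alias}:{index_pattern}" for alias in aliases)


# Hypothetical aliases and addresses, for illustration only.
remotes = {
    "us_east": ["es-us-east.example.internal:9300"],
    "europe_west": ["es-europe-west.example.internal:9300"],
}

print(json.dumps(remote_cluster_settings(remotes), indent=2))
print(cross_cluster_pattern(remotes, "filebeat-*"))
```

With this layout the logs stay in each regional Elasticsearch; the central node stores only Kibana's own saved objects and forwards searches to the remotes.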
Related
I have Grafana running inside a Kubernetes cluster and I want to push logs from outside Kubernetes (apps not running in K8s, databases, etc.) into the cluster so I can view them in Grafana. What's the best way of doing this?
So Grafana is a GUI for reporting on data stored in other databases. It sounds like you are capturing metrics from the cluster and this data is stored in another database. If you are running Prometheus this is the database for Grafana's time-series data. You also may end up running long-term storage systems like Thanos in the future for that data to keep it over time depending on the volume of data.
Back to logging... Similarly, to use Grafana for logs you'll need to implement some kind of logging database. The most popular is the formerly open-source ELK (Elasticsearch, Logstash, Kibana) stack. You can now use OpenSearch, which is an open-source fork of Elasticsearch and Kibana. Many K8s distributions come with Fluentd, which replaces Logstash for sending data. You can also install Fluentd or Fluent Bit on any host to send data to this stack. You'll find that Grafana is not the best for log analysis, so most people use Kibana (or OpenSearch Dashboards). However, you can use Grafana as well; it's just painful IMO.
Another option if you don't want to run ELK is Grafana Loki, which is another open-source database for logging. It's a lot simpler, but also more limited in how you can query the logs, because it indexes only labels rather than the log text. It works nicely with Grafana, but once again this is not a full-text indexing technology, so it will be a bit limited.
Hope this is helpful, let me know if you have questions!
I would like to know if it is possible for multiple pods in the same Kubernetes cluster to access a database which is configured using persistent volumes on a Google Cloud persistent disk.
Currently I am building a microservices architecture web app which has 3 Node APIs in different pods, all accessing the same database. How do I achieve this with Kubernetes?
Kindly let me know if my architecture is right as well.
You can certainly connect multiple node-based app pods to the same database. It is sometimes said that microservices shouldn't share a database but this depends on what your apps are doing, the project history and the extent to which you want the parts to be worked on separately.
There are questions you have to answer about running databases at scale, such as your future load and whether you want to use relational databases if you're going to try to span availability zones. And there are some specific to Kubernetes, especially around how you associate DB Pods to data. See https://stackoverflow.com/a/53980021/9705485. Another popular option is to use a managed DB service from a cloud provider. If you do run the DB in k8s then I'd suggest looking for a Helm chart or looking at an operator, such as the KubeDB operator, to avoid crafting the Kubernetes descriptors yourself and to get more guidance on running the DB and setting it up.
If it's a new project and you've not used k8s before then you'll also have to decide where to host your code, your Docker images and your deployment descriptors, and how to set up your CI pipelines. If you've not got answers to these questions already then I'd suggest looking at Jenkins X, as it will provide you with out-of-the-box defaults for a whole cluster and CI setup, and a template ('build pack') for building Node apps and deploying them to staging and prod environments through a pipeline.
I have configured a working EFK (Elasticsearch, Fluentd, Kibana) stack in one of my Kubernetes clusters built in GCP. I have two more clusters and have installed the same EFK stack in those too. Now, if I want to monitor the logs of each cluster environment, I need to check all three Kibana consoles. Is it possible to centralize the EFK stacks built in the three clusters, so that I can see the pod logs from all my clusters in a single Kibana console? Any help or suggestion will be appreciated.
In fact, Kibana only displays and lets you sort/manage data that exists in Elasticsearch. Let's say you have 3 k8s clusters. Consequently, you have 3 Fluentd DaemonSets. All you need to do is configure all Fluentd deployments to send data to one and the same Elasticsearch endpoint, the one to which Kibana is connected.
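To make the idea concrete: if each cluster's shipper adds a field naming its cluster before sending, a single Kibana can filter on that field. The sketch below only shows the shape of such a request to the Elasticsearch _bulk API; it is not what Fluentd itself runs, and the field name `cluster`, the index name and the sample record are assumptions.

```python
import json


def bulk_body(records, cluster_name, index="k8s-logs"):
    """Build an NDJSON body for the Elasticsearch `_bulk` API.

    Each record is copied and tagged with a (hypothetical) `cluster`
    field so logs from different clusters stay distinguishable.
    """
    lines = []
    for record in records:
        # Action line, followed by the document to index.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps({**record, "cluster": cluster_name}))
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


sample = [{"pod": "web-1", "message": "started"}]
print(bulk_body(sample, "gke-cluster-a"), end="")
```

All three clusters would send bodies like this to the same endpoint, differing only in the cluster name they attach.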
We have some Kubernetes clusters that have been deployed using kops in AWS.
We really like using the upstream/official images.
We have been wondering whether there is a good way to monitor the systems without installing software directly on the hosts. Are there Docker containers that can extract the information from the host? I think that we are mainly concerned with:
Disk space (this seems to be passed through to Docker via df)
Host CPU utilization
Host memory utilization
Is this host/node level information already available through heapster?
Not really a question about kops, but a question about operating Kubernetes. kops stops at the point of having a functional k8s cluster. You have networking, DNS, and nodes have joined the cluster. From there your world is your oyster.
There are many different options for monitoring with k8s. If you are a small team I usually recommend offloading monitoring and logging to a provider.
If you are a larger team or have more specific needs then you can look at such options as Prometheus and others. Poke around in the https://github.com/kubernetes/charts repository, as I know there is a Prometheus chart there.
As with any deployment of any form of infrastructure you are going to need Logging, Monitoring, and Metrics. Also, do not forget to monitor the monitoring ;)
I am using https://prometheus.io/, which goes naturally with Kubernetes.
The Kubernetes API already exposes a bunch of metrics in Prometheus format,
https://github.com/kubernetes/ingress-nginx also exposes Prometheus metrics (enable-vts-status: "true"), and you can also install https://github.com/prometheus/node_exporter as a DaemonSet to monitor CPU, disk, etc.
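For a feel of what node_exporter exposes, here is a minimal sketch that parses a few lines of the Prometheus text format and derives disk usage per mount point. In practice Prometheus scrapes these metrics and you would compute this with a query instead; the sample values are made up, and the parser handles only simple sample lines.

```python
import re

# One sample line of the text format: name{labels} value
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Yield (name, labels, value) for each sample line; skip comments."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match:
            labels = dict(LABEL_RE.findall(match.group("labels") or ""))
            yield match.group("name"), labels, float(match.group("value"))


def disk_used_fraction(text):
    """Return {mountpoint: used fraction} from node_exporter filesystem metrics."""
    size, avail = {}, {}
    for name, labels, value in parse_samples(text):
        if name == "node_filesystem_size_bytes":
            size[labels["mountpoint"]] = value
        elif name == "node_filesystem_avail_bytes":
            avail[labels["mountpoint"]] = value
    return {m: 1 - avail[m] / size[m] for m in size if m in avail and size[m] > 0}


# Made-up sample in the shape node_exporter serves on /metrics.
SAMPLE = """\
# TYPE node_filesystem_size_bytes gauge
node_filesystem_size_bytes{device="/dev/sda1",fstype="ext4",mountpoint="/"} 100
node_filesystem_avail_bytes{device="/dev/sda1",fstype="ext4",mountpoint="/"} 25
"""

print(disk_used_fraction(SAMPLE))
```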
I run one Prometheus inside the cluster to monitor internal metrics and one outside the cluster to monitor LBs and URLs.
Both send alerts to the same https://github.com/prometheus/alertmanager, which MUST be outside the cluster.
It took me about a week to configure everything properly.
It was worth it.
I am trying to figure out how to best collect metrics from a set of spring boot based services running within a Kubernetes cluster. Looking at the various docs, it seems that the choice for internal monitoring is between Actuator or Spectator with metrics being pushed to an external collection store such as Redis or StatsD or pulled, in the case of Prometheus.
Since the number of instances of a given service is going to vary, I don't see how Prometheus can be configured to poll those running services, since it will lack knowledge of them. I am also building around a Eureka service registry, so I am not sure if that is polled first in this configuration.
Any real world insight into this kind of approach would be welcome.
You should use the Prometheus Java client (https://www.robustperception.io/instrumenting-java-with-prometheus/) for instrumenting. Approaches like Redis and StatsD are to be avoided, as they mean hitting the network on every single event, which greatly limits what you can monitor.
Use file_sd service discovery in Prometheus to provide it with a list of targets from Eureka (https://www.robustperception.io/using-json-file-service-discovery-with-prometheus/), though if you're using Kubernetes, as your tag hints, Prometheus has a direct integration there.
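A minimal sketch of the file_sd approach, assuming you already have a list of running instances from the registry: a small script rewrites a JSON file of target groups, and Prometheus picks up the changes. The input keys `app`, `host` and `port` and the `service` label are hypothetical simplifications; fetching the instances from Eureka is left out.

```python
import json


def to_file_sd(instances):
    """Group instances by service into Prometheus file_sd target groups.

    `instances` is a list of dicts with the hypothetical keys
    "app", "host" and "port" (a simplified view of registry entries).
    """
    groups = {}
    for inst in instances:
        groups.setdefault(inst["app"], []).append(f'{inst["host"]}:{inst["port"]}')
    return [
        {"targets": sorted(targets), "labels": {"service": app}}
        for app, targets in sorted(groups.items())
    ]


# Made-up instances, for illustration only.
instances = [
    {"app": "orders", "host": "10.0.0.5", "port": 8080},
    {"app": "orders", "host": "10.0.0.6", "port": 8080},
    {"app": "billing", "host": "10.0.0.7", "port": 8081},
]

# The output would be written to the file named under `file_sd_configs`.
print(json.dumps(to_file_sd(instances), indent=2))
```

Running such a script on a schedule keeps the target list in step with instances that come and go, which addresses the concern about a varying number of instances.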