Prometheus endpoint error when checking AlertManager - kubernetes

I installed Prometheus (following this link: https://devopscube.com/setup-prometheus-monitoring-on-kubernetes/)
But when checking the status of Targets, it shows "Down" for the AlertManager service; every other endpoint is up, please see the attached file
Then I checked Service Discovery; the discovered labels show:
"address="192.168.180.254:9093"
__meta_kubernetes_endpoint_address_target_kind="Pod"
__meta_kubernetes_endpoint_address_target_name="alertmanager-6c666985cc-54rjm"
__meta_kubernetes_endpoint_node_name="worker-node1"
__meta_kubernetes_endpoint_port_protocol="TCP"
__meta_kubernetes_endpoint_ready="true"
__meta_kubernetes_endpoints_name="alertmanager"
__meta_kubernetes_namespace="monitoring"
__meta_kubernetes_pod_annotation_cni_projectcalico_org_podIP="192.168.180.254/32"
__meta_kubernetes_pod_annotationpresent_cni_projectcalico_org_podIP="true"
__meta_kubernetes_pod_container_name="alertmanager"
__meta_kubernetes_pod_container_port_name="alertmanager"
__meta_kubernetes_pod_container_port_number="9093"
But the Target Labels show a different port (8080), and I don't know why:
instance="192.168.180.254:8080"
job="kubernetes-service-endpoints"
kubernetes_name="alertmanager"
kubernetes_namespace="monitoring"

First, if you want to install Prometheus and Grafana without a headache, you need to do it through Helm.
First install Helm, and then run:
helm install installationWhatEverName stable/prometheus-operator
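If it helps, a rough sketch of that flow with Helm 3 (the release name my-prom is just a placeholder, and the stable charts repo has since been deprecated in favour of the prometheus-community kube-prometheus-stack chart):
helm repo add stable https://charts.helm.sh/stable
helm repo update
helm install my-prom stable/prometheus-operator --namespace monitoring --create-namespace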

I've reproduced your issue on GCE.
If you are using version 1.16+, you probably had to change the apiVersion, since the tutorial has the Deployment in extensions/v1beta1. From K8s 1.16 onward you need apiVersion: apps/v1; otherwise you will get an error like:
error: unable to recognize "STDIN": no matches for kind "Deployment" in version "extensions/v1beta1"
The second thing: in 1.16+ you need to specify a selector. If you don't, you will receive another error:
`error: error validating "STDIN": error validating data: ValidationError(Deployment.spec): missing required field "selector" in io.k8s.api.apps.v1.DeploymentSpec; if you choose to ignore these errors, turn validation off with --validate=false`
It would look like:
...
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-server
  template:
    metadata:
      labels:
        app: prometheus-server
    spec:
      containers:
      ...
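For completeness, the top of the fixed manifest would start roughly like this (the name and namespace follow the tutorial's example and may differ in your setup):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-deployment
  namespace: monitoring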
Regarding port 8080, please check this article with an example.
Port: Port is the port number which makes a service visible to
other services running within the same K8s cluster. In other words,
in case a service wants to invoke another service running within the
same Kubernetes cluster, it will be able to do so using port specified
against “port” in the service spec file.
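In your case, the instance label ending in :8080 suggests Prometheus is scraping the Service port (or a prometheus.io/port annotation) instead of the container port 9093. A hypothetical Service illustrating the difference (the names, numbers and annotations are illustrative, not your actual manifest):
apiVersion: v1
kind: Service
metadata:
  name: alertmanager
  namespace: monitoring
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9093"    # make the scrape port match the container port
spec:
  selector:
    app: alertmanager
  ports:
    - port: 8080        # port other services in the cluster talk to
      targetPort: 9093  # port the alertmanager container actually listens on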
It worked in my environment on GCE. Did you configure a firewall for your endpoints?
In addition, in Helm 3 some hooks were deprecated. You can find this information here.
If you still have the issue, please provide your YAMLs with the changes applied for version 1.16+.

Related

Kubectl error upon applying agones fleet: ensure CRDs are installed first

I am using minikube (docker driver) with kubectl to test an agones fleet deployment. Upon running kubectl apply -f lobby-fleet.yml (and when I try to apply any other agones yaml file) I receive the following error:
error: resource mapping not found for name: "lobby" namespace: "" from "lobby-fleet.yml": no matches for kind "Fleet" in version "agones.dev/v1"
ensure CRDs are installed first
lobby-fleet.yml:
apiVersion: "agones.dev/v1"
kind: Fleet
metadata:
name: lobby
spec:
replicas: 2
scheduling: Packed
template:
metadata:
labels:
mode: lobby
spec:
ports:
- name: default
portPolicy: Dynamic
containerPort: 7600
container: lobby
template:
spec:
containers:
- name: lobby
image: gcr.io/agones-images/simple-game-server:0.12 # Modify to correct image
I am running this on WSL2, but receive the same error when using the Windows installation of kubectl (through choco). I have minikube installed and running for Ubuntu in WSL2 using Docker.
I am still new to using k8s, so apologies if the answer to this question is obvious; I just couldn't find it elsewhere.
Thanks in advance!
In order to create a resource of kind Fleet, you first have to apply the Custom Resource Definition (CRD) that defines what a Fleet is.
I've looked into the YAML installation instructions of Agones, and the manifest contains the CRDs. You can find them by searching for kind: CustomResourceDefinition.
I recommend you first try to install according to the instructions in the docs.
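As a rough sketch, installing the full Agones manifest (which contains the Fleet CRD) looks like this; <version> is a placeholder, so take the exact URL from the Agones install docs:
kubectl create namespace agones-system
kubectl apply --server-side -f https://raw.githubusercontent.com/googleforgames/agones/release-<version>/install/yaml/install.yaml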

Failure in creating service when using "topologySpreadConstraints" in Deployment definition

We are running an application inside a cluster we created on GKE. We have created the required YAMLs (consisting of the Service and Deployment definitions). We recently decided to use Pod Topology Spread Constraints, and for that I added the following piece to my Deployment YAML file under the spec section:
spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: node
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: foo-app
This change works as expected when I run the service inside a minikube cluster, but the same change does not work inside the GKE cluster. It throws an error:
Error: UPGRADE FAILED: error validating "": error validating data: ValidationError(Deployment.spec.template.spec): unknown field "topologySpreadConstraints" in io.k8s.api.core.v1.PodSpec
I searched a lot but could not find a satisfactory answer. Has anybody faced this problem? Please help me understand the problem and its resolution.
Thanks in Advance.
I assume you are running 1.17.7-gke.17 on your GKE cluster. Unfortunately, this is the latest version you can upgrade to through the rapid channel at the time of this post.
topologySpreadConstraints is available in Kubernetes v1.18 (FEATURE STATE: beta), so a 1.17 API server rejects the field.
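A quick way to confirm the mismatch from your side (the output shown is illustrative):
kubectl version --short
# Client Version: v1.18.x
# Server Version: v1.17.7-gke.17   <- this API server does not accept topologySpreadConstraints yet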

Can't see my service traces in Jaeger UI (local node, Docker for Mac)

I'm trying to play around with Jaeger and OpenTracing on my local k8s node (Docker for Mac) and I'm having some trouble seeing traces in the UI.
I'm using the Jaeger Operator and deployment annotations to inject the Jaeger sidecar. The Jaeger CR is configured to constantly sample every request.
Up until this point everything seems to be fine, but when I send some HTTP traffic to my pods (through nginx-ingress) I can see it coming in, yet I can't find any traces in the Jaeger UI.
From reading the documentation, these steps should have implicitly collected and sent the traces.
Am I missing something?
You need to enable OpenTracing in the nginx ingress controller.
To enable the instrumentation we must enable OpenTracing in the configuration ConfigMap:
data:
  enable-opentracing: "true"
To enable or disable instrumentation for a single Ingress, use the enable-opentracing annotation:
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/enable-opentracing: "true"
You must also set the host to use when uploading traces:
jaeger-collector-host: jaeger-agent.default.svc.cluster.local
https://kubernetes.github.io/ingress-nginx/user-guide/third-party-addons/opentracing/
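Putting those pieces together, a minimal sketch of the controller ConfigMap (the ConfigMap name and namespace assume a standard ingress-nginx install and may differ in yours):
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  enable-opentracing: "true"
  jaeger-collector-host: jaeger-agent.default.svc.cluster.local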

OpenShift - how can pods resolve each other names

I'm trying to run Cloudera Manager and Cloudera agents on OpenShift; in order to run the installation I need to get all the pods communicating with each other.
Manually, I modified /etc/hosts on the manager and added all the agents, and on the agents I added the manager and all the other agents.
Now I want to automate this. Let's suppose I add a new agent: I want it to resolve the manager and the other hosts (I can get part of this done by passing the manager name as an env variable and appending it to /etc/hosts with a shell script; not the ideal way, but still a solution). The second part would be more difficult: getting the manager to resolve every new agent, and getting every agent to resolve every other agent behind the same service.
I was wondering: is there a way for every pod in the cluster to resolve the other pods' names?
I have two services: cloudera-manager with one pod, and another service, cloudera-agent, with (let's say) 3 agents.
Do you have any idea?
Thank you.
Not sure, but it looks like you could benefit from StatefulSets.
There are other ways to get the other pods' IPs (like using a headless Service or querying the API server directly), but StatefulSets provide:
Stable, unique network identifiers
Stable, persistent storage
Lots of other functionality that facilitates the deployment of a special kind of clusters, like distributed databases. Not sure my term 'distributed' here is correct, but it helps me remember what they are for :). A rough sketch of the idea follows.
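For example, a rough sketch of a headless Service plus StatefulSet (the names and image are placeholders); each pod then gets a stable DNS name such as cloudera-agent-0.cloudera-agent.my-ns.svc.cluster.local:
apiVersion: v1
kind: Service
metadata:
  name: cloudera-agent
  namespace: my-ns
spec:
  clusterIP: None           # headless: DNS resolves to the pod addresses
  selector:
    app: cloudera-agent
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cloudera-agent
  namespace: my-ns
spec:
  serviceName: cloudera-agent    # ties the pods to the headless Service for per-pod DNS
  replicas: 3
  selector:
    matchLabels:
      app: cloudera-agent
  template:
    metadata:
      labels:
        app: cloudera-agent
    spec:
      containers:
        - name: agent
          image: my-registry/cloudera-agent:latest   # placeholder image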
If you want to get all Pods running under a certain Service, make sure to use a headless Service (i.e. set clusterIP: None). Then you can query your local DNS server for the Service and will receive A records for all Pods assigned to it:
---
apiVersion: v1
kind: Service
metadata:
  name: my-sv
  namespace: my-ns
  labels:
    app: my-app
spec:
  clusterIP: None
  selector:
    app: my-app
Then start your Pods (make sure to give them the app: label so the selector matches) and query your DNS server from any of them:
kubectl exec -ti my-pod --namespace=my-ns -- /bin/bash
$ nslookup my-sv.my-ns.svc.cluster.local
Server: 10.255.3.10
Address: 10.255.3.10#53
Name: my-sv.my-ns.svc.cluster.local
Address: 10.254.24.11
Name: my-sv.my-ns.svc.cluster.local
Address: 10.254.5.73
Name: my-sv.my-ns.svc.cluster.local
Address: 10.254.87.6

Why don't I have a default serviceAccount on kubernetes?

I'm trying to get Kubernetes running on some local machines running CoreOS. I'm loosely following this guide. Everything seems to be up and running, and I'm able to connect to the api via kubectl. However, when I try to create a pod, I get this error:
Pod "redis-master" is forbidden: Missing service account default/default: <nil>
Doing kubectl get serviceAccounts confirms that I don't have any ServiceAccounts:
NAME SECRETS
According to the documentation, each namespace should have a default ServiceAccount. Running kubectl get namespace confirms that I have the default namespace:
NAME LABELS STATUS
default <none> Active
I'm brand new to Kubernetes and CoreOS, so I'm sure there's something I'm overlooking, but I can't for the life of me figure out what's going on. I'd appreciate any pointers.
UPDATE
It appears the kube-controller-manager isn't running. When I try to run it, I get this message:
I1104 21:09:49.262780 26292 plugins.go:69] No cloud provider specified.
I1104 21:09:49.262935 26292 nodecontroller.go:114] Sending events to api server.
E1104 21:09:49.263089 26292 controllermanager.go:217] Failed to start service controller: ServiceController should not be run without a cloudprovider.
W1104 21:09:49.629084 26292 request.go:302] field selector: v1 - secrets - type - kubernetes.io/service-account-token: need to check if this is versioned correctly.
W1104 21:09:49.629322 26292 request.go:302] field selector: v1 - serviceAccounts - metadata.name - default: need to check if this is versioned correctly.
W1104 21:09:49.636082 26292 request.go:302] field selector: v1 - serviceAccounts - metadata.name - default: need to check if this is versioned correctly.
W1104 21:09:49.638712 26292 request.go:302] field selector: v1 - secrets - type - kubernetes.io/service-account-token: need to check if this is versioned correctly.
Since I'm running this locally, I don't have a cloud provider. I tried to define --cloud-provider="" but it still complains with the same error.
The default service account for each namespace is created by the service account controller, which is a loop that is part of the kube-controller-manager binary. So verify that binary is running, check its logs for anything that suggests it can't create a service account, and make sure you set "--service-account-private-key-file=somefile" to a file that contains a valid PEM key.
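For reference, a rough sketch of the matching flags (the paths are placeholders; the controller manager's private key must correspond to the public key the API server verifies tokens with):
# kube-controller-manager: signs service-account tokens
--service-account-private-key-file=/var/lib/kubernetes/sa.key
# kube-apiserver: verifies those tokens with the matching public key
--service-account-key-file=/var/lib/kubernetes/sa.pub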
Alternatively, if you want to make some progress without service accounts, and come back to that later, you can disable the admission controller that is blocking your pods by removing the "ServiceAccount" option from your api-server's --admission-controllers flag. But you will probably want to come back and fix that later.
This worked for me
--disable-admission-plugins=ServiceAccount