How to join 2 sets of Prometheus metrics? - kubernetes

AKS = 1.17.9
Prometheus = 2.16.0
kube-state-metrics = 1.8.0
My use case: I want to alert when one of my persistent volumes is not in a "Bound" phase, and only when it falls within a predefined set of namespaces.
This got me to my first attempt at joining Prometheus metrics - so, please bear with me : )
I opted to use the following to obtain the pv phase:
kube_persistentvolume_status_phase{phase="Bound",job="kube-state-metrics"}
Renders:
kube_persistentvolume_status_phase{instance="10.147.5.110:8080",job="kube-state-metrics",persistentvolume="pvc-33197ae6-d42a-777e-b8ca-efbd66a8750d",phase="Bound"} 1
kube_persistentvolume_status_phase{instance="10.147.5.110:8080",job="kube-state-metrics",persistentvolume="pvc-165d5006-erd4-481e-8acc-eed4a04a3bce",phase="Bound"} 1
This worked well, except for the fact that it does not include the namespace.
So I managed to determine the persistentvolumeclaim namespaces with this:
kube_persistentvolumeclaim_info{namespace=~"monitoring|vault"}
Renders:
kube_persistentvolumeclaim_info{instance="10.147.5.110:8080",job="kube-state-metrics",namespace="vault",persistentvolumeclaim="vault-file",storageclass="default",volumename="pvc-33197ae6-d42a-777e-b8ca-efbd66a8750d"} 1
kube_persistentvolumeclaim_info{instance="10.147.5.110:8080",job="kube-state-metrics",namespace="monitoring",persistentvolumeclaim="prometheus-prometheus-db-prometheus-prometheus-0",storageclass="default",volumename="pvc-165d5006-erd4-481e-8acc-eed4a04a3bce"} 1
So my idea was to join these sets on the following labels, whose values match:
persistentvolume (on kube_persistentvolume_status_phase)
matched against
volumename (on kube_persistentvolumeclaim_info)
BUT, if I understood it correctly, you can only join two metric sets on labels that match exactly (both the label names and their values). I hence opted for the "instance" and "job" labels, as these were common to both sides and matching.
kube_persistentvolume_status_phase{phase!="Bound",job="kube-state-metrics"}  * on(instance,job) group_left(namespace) kube_persistentvolumeclaim_info{namespace=~"monitoring|vault"}
Renders:
Error executing query: found duplicate series for the match group {instance="10.147.5.110:8080" , job="kube-state-metrics"} on the right hand-side of the operation: [{__name__="kube_persistentvolumeclaim_info", instance="10.147.5.110:8080", job="kube-state-metrics", namespace="monitoring", persistentvolumeclaim="alertmanager-prometheusam-db-alertmanager-prometheusam-0", storageclass="default", volumename="pvc-b8406fb8-3262-7777-8da8-151815e05d75"}, {__name__="kube_persistentvolumeclaim_info", instance="10.147.5.110:8080", job="kube-state-metrics", namespace="vault", persistentvolumeclaim="vault-file", storageclass="default", volumename="pvc-33197ae6-d42a-777e-b8ca-efbd66a8750d"}];many-to-many matching not allowed: matching labels must be unique on one side
So, in all fairness, the error does communicate the problem well. I then attempted to solve it with the "ignoring" option, trying to keep only the matching labels and values (instance and job) and exclude/ignore the non-matching ones on both sides. This did not work either and resulted in a parsing error, which in turn nudged me to take a step back and reassess what I am doing.
I am just a bit concerned that I am perhaps barking up the wrong tree here.
My question is: is this at all possible, and if so, how? Or is there perhaps another, more prudent way to achieve this?
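For context, if the two differently named labels could first be aligned, the join I am after would look roughly like this (an untested sketch that uses label_replace to copy volumename into a persistentvolume label on the right-hand side; I do not know yet whether this is the right approach):
kube_persistentvolume_status_phase{phase!="Bound",job="kube-state-metrics"} * on(persistentvolume) group_left(namespace) label_replace(kube_persistentvolumeclaim_info{namespace=~"monitoring|vault"}, "persistentvolume", "$1", "volumename", "(.+)")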
Thanks in advance!

Related

How to use binary operators in LOKI? Grafana

I'm creating a panel to show the error count in logs for canary instances. First, I need to find whether an instance is a canary or not; if it is, I have to show the error log count for that instance.
To identify the canary instance: I have a stack label, so if a stack contains only one instance, that instance is a canary.
The expression should check the instance count of each stack; if a stack has one instance, it needs to search for the keyword in that stack's logs.
How do I achieve this? I am looking for an expression something like the one below.
sum(count_over_time({component="stack-blue.*" ,cloud=~"${cloud}" ,environment=~"${environment}" ,location=~"${location}" } |= "Unable to record" [$__interval]))
and
(count(count by(hostname)(count_over_time({component="stack-blue.*",cloud=~"${cloud}" ,environment=~"${environment}" ,location=~"${location}" } [$__interval]))) == 1)
You can combine two separate queries A and B with a Math expression $A && $B.
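For example, with the error-count query above saved as A and the single-instance check saved as B, a Math expression of roughly
$A && $B
evaluates to true only when both sides do; if you need the actual error count rather than a boolean, something like $A * ($B > 0) may work instead. Treat both as sketches, since the exact behaviour depends on your Grafana version and on how the queries are named in the panel.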
Note that you can decide whether a query or expression is displayed in the panel by clicking on the eye symbol.

Terraform EKS specify node-role.kubernetes.io label on node group

In the Terraform aws_eks_node_group resource I can't set:
labels = {
  "node-role.kubernetes.io/others" = "other"
}
as AWS complains that label keys must not contain kubernetes.io:
Error: error creating EKS Node Group (my-cluster:others): InvalidParameterException: Label cannot contain reserved labels kubernetes.io/
{
ClusterName: "my-cluster",
Message_: "Label cannot contain reserved labels kubernetes.io/",
NodegroupName: "others"
}
Also, the EC2 instances spawned have no Name, and I have no clue how to specify a Name for my instances based on their node group.
Any idea how to achieve this?
As per the documentation, you can't use certain reserved labels. Regarding labels:
The kubernetes.io/ and k8s.io/ prefixes are reserved for Kubernetes core components. Valid label values must be 63 characters or less and must be empty or begin and end with an alphanumeric character ([a-z0-9A-Z]) with dashes (-), underscores (_), dots (.), and alphanumerics between.
Regarding your specific label: there are many issues since k8s 1.15 or 1.16 where a change in core Kubernetes no longer allows that label; see one detailed issue.
As for naming the EC2 instances created by an EKS Node Group: currently there is no way to pass a "Name" tag. This question is a duplicate of this one, where you can also find how to name your instances for the time being.
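For illustration, a label key that avoids the reserved prefixes is accepted; a minimal sketch (the key name here is arbitrary):
labels = {
  # any non-reserved key works; "role" is just an example
  "role" = "others"
}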
node-role.kubernetes.io and kubernetes.io - these are DIFFERENT prefixes

How to use OSRM's match service

As stated in the header: how can I use the match call?
I tried
http://router.project-osrm.org/match/v1/driving/8.610048,46.99917;8.530232,47.051?overview=full&radiuses=49;49
I am not sure whether the list of radiuses is given correctly.
I can't get it to work. I also tried [49;49] or {49;49}. The command works with route:
http://router.project-osrm.org/route/v1/driving/8.610048,46.99917;8.530232,47.051?overview=full
For background see here.
Edit: If you look at the example here, it seems the timestamps are not needed: /match/v1/{profile}/{coordinates}?steps={true|false}&geometries={polyline|polyline6|geojson}&overview={simplified|full|false}&annotations={true|false}
From the docs:
Large jumps in the timestamps (> 60s) or improbable transitions lead to trace splits if a complete matching could not be found.
I think that's the problem with your request. The two given points are more than 60s apart and OSRM cannot match them successfully. The radiuses are specified correctly.
The following query works for me:
http://router.project-osrm.org/match/v1/driving/8.610048,46.99917;8.620048,46.99917?overview=full&radiuses=49;49
This returns:
{"tracepoints":[{"location":[8.610971,46.998963],"name":"Alte Kantonstrasse","hint":"GKUFgJEhBwAAAAAAHQAAAAAAAAC5AAAAAAAAAB0AAAAAAAAAuQAAAPsCAACbZIMAsyXNAgBhgwCCJs0CAAAPABki8hY=","matchings_index":0,"waypoint_index":0,"alternatives_count":0},{"location":[8.620295,46.999681],"name":"Schönenbuchstrasse","hint":"nIEFAJ7IFIA3AAAAZAAAAAAAAADYAAAANwAAAGQAAAAAAAAA2AAAAPsCAAAHiYMAgSjNAhCIgwCCJs0CAAAPABki8hY=","matchings_index":0,"waypoint_index":1,"alternatives_count":5}],"matchings":[{"distance":922.3,"duration":114.1,"weight":114.1,"weight_name":"routability","geometry":"onz}Gqyps#Wg#S_#aCaFMUYo#c#w#OKOCWmAWs#aBiDsAsCMYH[HY\\_#h#ObBW^w#BQAUKu#ASF[ZaABOFYpAyIf#mD","confidence":0.000982,"legs":[{"distance":922.3,"duration":114.1,"weight":114.1,"summary":"","steps":[]}]}],"code":"Ok"}
So the two given input points 8.610048,46.99917 and 8.620048,46.99917 are matched to 8.610971,46.998963 and 8.620295,46.999681.
So as far as I can see, if you want to implement something like that, you need to give OSRM more input points along the way that are less than 60s apart.
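For example (an untested sketch; the intermediate coordinate is made up and may not lie near a road), adding a point between the two and supplying one radius per coordinate would look like:
http://router.project-osrm.org/match/v1/driving/8.610048,46.99917;8.57,47.02;8.530232,47.051?overview=full&radiuses=49;49;49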
See also here for an explanation about the differences between route and match service.

Stata: append two datasets, retain value labels

I'm using Stata 14 and I'm trying to append two survey datasets that have ~200 variables with the same names but different values and value labels. I would like to do the appending so that value labels are retained from the dataset 'on disk'.
Here is an example describing my problem:
Variable in dataset 1 (master):
value - label
1 - yes
2 - no
Same variable in dataset 2 (appended to master):
value - label
1 - yes absolutely
2 - no definitely not
3 - maybe
4 - don't know
Result with append using "dataset 2.dta":
value - label
1 - yes
2 - no
3 - 3
4 - 4
Desired result:
value - label
1 - yes
2 - no
3 - maybe
4 - don't know
Is there any way to do this directly using append? If not, any suggestions on doing the task efficiently are most welcome.
You want to make value labels consistent, which is sensible, fine and easy to do.
When you have appended all the datasets, you then overwrite any value label assignment with a quick
label define whatever 1 yes 2 no 3 maybe 4 "don't know"
label val myvar whatever
with a , modify on the first if a set of value labels with that name already exists.
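For example, if the label set attached to the variable is already named whatever, the overwrite would be (a sketch using the values from your example):
label define whatever 1 yes 2 no 3 maybe 4 "don't know", modify
label val myvar whatever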
It's a task to do late: it doesn't need to be fixed before or during the append, and it is most easily done afterwards.
Naturally, this is tedious for several variables, but it's not difficult to understand. Furthermore, even if append were able to take instructions on which labels to use, you would still have to spell that out. In your example, the value labels you want are not actually in use in any of the datasets. So, there will be some inevitable pain. There is a mess to sort out and what the fix is can't be fully automated because it depends on your ideas of what labels are best.
In short, the answer is
NOPE
so you have to be smart. Try the trick from http://www.stata.com/support/faqs/data-management/keeping-same-variable-with-collapse/ where you take a local copy of the labels and attach them to the full dataset afterwards.

Create ordinal array with multiple groups

I need to categorize a dataset according to different age groups, and the categorization depends on whether Sex is Male or Female. I first subset the data by gender and then use the ordinal function (the dataset is from a MATLAB example). The following code errors on the last line when I try to vertically concatenate the subsets:
load hospital;
subset_m=hospital(hospital.Sex=='Male',:);
subset_f=hospital(hospital.Sex=='Female',:);
edges_f=[0 20 max(subset_f.Age)];
edges_m=[0 30 max(subset_m.Age)];
labels_m = {'0-19','20+'};
labels_f = {'0-29','30+'};
subset_m.AgeGroup = ordinal(subset_m.Age,labels_m,[],edges_m);
subset_f.AgeGroup = ordinal(subset_f.Age,labels_f,[],edges_f);
vertcat(subset_m,subset_f);
Error using dataset/vertcat (line 76)
Could not concatenate the dataset variable 'AgeGroup' using VERTCAT.
Caused by:
Error using ordinal/vertcat (line 36)
Ordinal levels and their ordering must be identical.
Edit
It seems that a vital part was missing from the question; here is the answer to the corrected question. You need to use join rather than vertcat, for example:
joinFull = join(subset_f,subset_m,'LeftKeys','LastName','RightKeys','LastName','type','rightouter','mergekeys',true)
Solution to the original problem
It seems like you are actually trying to work with the wrong variable. If I change all instances of hospitalCopy to hospital, then everything works fine for me.
Perhaps you copied hospital and edited it, thus losing the validity of the input.
If you really need to have hospitalCopy, make sure to assign to it directly after load hospital.
If this does not help, try using clear all before running the code and make sure there is no file called 'hospital' in your current directory.