Safely Evaluating Input of Multiple Types - OPA Gatekeeper/Rego - kubernetes

I'm trying to deploy a ConstraintTemplate to my Kubernetes cluster to enforce that PodDisruptionBudgets contain a maxUnavailable percentage at or above a given minimum, and to deny integer values.
However, I'm unsure how to safely evaluate maxUnavailable since it can be an integer or a string. Here is the constraint template I am using:
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: pdbrequiredtolerance
spec:
  crd:
    spec:
      names:
        kind: PdbRequiredTolerance
      validation:
        # Schema for the `parameters` field
        openAPIV3Schema:
          properties:
            minAllowed:
              type: integer
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package pdbrequiredtolerance

        # Check that maxUnavailable exists
        violation[{"msg": msg}] {
          not input.review.object.spec.maxUnavailable
          msg := "You must use maxUnavailable on your PDB"
        }

        # Check that maxUnavailable is a string
        violation[{"msg": msg}] {
          not is_string(input.review.object.spec.maxUnavailable)
          msg := "maxUnavailable must be a string"
        }

        # Check that maxUnavailable is a percentage
        violation[{"msg": msg}] {
          not endswith(input.review.object.spec.maxUnavailable, "%")
          msg := "maxUnavailable must be a string ending with %"
        }

        # Check that maxUnavailable is in the acceptable range
        violation[{"msg": msg}] {
          percentage := split(input.review.object.spec.maxUnavailable, "%")
          to_number(percentage[0]) < input.parameters.minAllowed
          msg := sprintf("You must have maxUnavailable of %v percent or higher", [input.parameters.minAllowed])
        }
When I apply a PDB with a percentage that is too low, I receive the expected error:
Error from server ([pdb-must-have-max-unavailable] You must have maxUnavailable of 30 percent or higher)
However, when I use a PDB with an integer value:
Error from server (admission.k8s.gatekeeper.sh: __modset_templates["admission.k8s.gatekeeper.sh"]["PdbRequiredTolerance"]_idx_0:14: eval_type_error: endswith: operand 1 must be string but got number)
This is because endswith is being handed a number where it expects a string. Is there any way around this in Gatekeeper? Both PDBs I specified are valid Kubernetes manifests. I do not wish to return this confusing error to our end users, and would rather tell them clearly that they cannot use integers.

I believe this was solved elsewhere, but for posterity: one solution is to convert the value of unknown type to a known type (like a string) before doing any comparison or string operation.
maxUnavailable := sprintf("%v", [input.review.object.spec.maxUnavailable])
maxUnavailable can now safely be dealt with as a string regardless of the original type.
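For example, the percentage check could be rewritten along these lines (a sketch, untested against a live cluster, reusing the rule shape from the question):

violation[{"msg": msg}] {
  # normalize to a string so endswith never hits a type error
  maxUnavailable := sprintf("%v", [input.review.object.spec.maxUnavailable])
  not endswith(maxUnavailable, "%")
  msg := "maxUnavailable must be a string ending with %"
}

The same normalization applies to the range-check rule that splits on %. The is_string rule still rejects integer values with a clear message; normalizing just keeps the later rules from erroring out before that message can be reported.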

Related

How to replace some of the values in a Helm template definition?

I have a value someField in my values.yaml file that looks like this:
someField:
  field1: 1
  field2: someValue
  field3: someObject
  ...
I now want to add this to my template, but change some of the fields, add some others, and maybe remove some too.
someField:
  field1: 2
  field3: someObject
  ...
  field100: someNewValue
I could write out all the fields one by one in my template and then do what I want, but there are many fields available and I would like to avoid listing them all.
I can use toYaml, but that writes out the original value in full, and I do not know how to augment it "on the fly".
Extended example:
I have another helm chart installed that is called KEDA, and it defines a CRD:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {scaled-object-name}
spec:
  scaleTargetRef:
    apiVersion: {api-version-of-target-resource} # Optional. Default: apps/v1
    kind: {kind-of-target-resource}              # Optional. Default: Deployment
    name: {name-of-target-resource}              # Mandatory. Must be in the same namespace as the ScaledObject
    envSourceContainerName: {container-name}     # Optional. Default: .spec.template.spec.containers[0]
  pollingInterval: 30   # Optional. Default: 30 seconds
  cooldownPeriod: 300   # Optional. Default: 300 seconds
  idleReplicaCount: 0   # Optional. Default: ignored, must be less than minReplicaCount
  minReplicaCount: 1    # Optional. Default: 0
  maxReplicaCount: 100  # Optional. Default: 100
  fallback:             # Optional. Section to specify fallback options
    failureThreshold: 3 # Mandatory if fallback section is included
    replicas: 6         # Mandatory if fallback section is included
  advanced:             # Optional. Section to specify advanced options
    restoreToOriginalReplicaCount: true/false # Optional. Default: false
    horizontalPodAutoscalerConfig:            # Optional. Section to specify HPA related options
      name: {name-of-hpa-resource}            # Optional. Default: keda-hpa-{scaled-object-name}
      behavior:                               # Optional. Use to modify HPA's scaling behavior
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
            - type: Percent
              value: 100
              periodSeconds: 15
  triggers:
    - type: service-bus
      authenticationRef: name-of-auth
I have my own Helm chart where I want to generate this CRD from a definition, but change some values.
So a user would provide spec, and I would augment it, e.g. by adding authenticationRef to every element of the triggers array.
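A sketch of one possible approach, using Helm's built-in dict functions from Sprig (deepCopy, mergeOverwrite, omit); someField and the override values here are illustrative, not from the question:

{{- /* copy so we don't mutate .Values, then override/add/remove fields */ -}}
{{- $spec := deepCopy .Values.someField -}}
{{- $spec = mergeOverwrite $spec (dict "field1" 2 "field100" "someNewValue") -}}
{{- $spec = omit $spec "field2" -}}
someField:
{{ toYaml $spec | indent 2 }}

For per-element changes such as adding authenticationRef to every trigger, the analogous sketch would range over the list and call set on each element.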

Helm YAML - array values that sometimes require spaces/tabs but sometimes not?

I am confused about when to use spaces and when not to when it comes to arrays in configs.
I think for arrays of single values you need to use spaces, i.e.:
values:
  - "hello"
  - "bye"
  - "yes"
However this is wrong:
spec:
  scaleTargetRef:
    name: sb-testing
  minReplicaCount: 3
  triggers:
    - type: azure-servicebus
    metadata:
      direction: in
When the values are a map, the helm interpreter complains when I add spaces:
error: error parsing deploy.yaml: error converting YAML to JSON: yaml: line 12: did not find expected '-' indicator
It doesn't complain when I don't:
spec:
  scaleTargetRef:
    name: sb-testing
  minReplicaCount: 3
  triggers:
  - type: azure-servicebus
    metadata:
      direction: in
I can't seem to find any rules about this.
An array of objects in YAML can start with or without indentation. Both are valid YAML syntax:
values:
  - "hello"
  - "bye"
  - "yes"
values:
- "hello"
- "bye"
- "yes"
Just make sure that keys of the same block are in the same column.
Sample:
spec:
  scaleTargetRef:
    name: sb-testing
  minReplicaCount: 3
  triggers:
    - type: azure-servicebus
      metadata: # "metadata" and "type" in the same column
        direction: in
or
spec:
  scaleTargetRef:
    name: sb-testing
  minReplicaCount: 3
  triggers:
  - type: azure-servicebus
    metadata:
      direction: in
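For contrast, the failing example from the question breaks this rule: metadata lines up with the - indicator instead of with type, so at that column the parser expects another sequence item, which is what produces the "did not find expected '-' indicator" error:

triggers:
  - type: azure-servicebus
  metadata: # aligned with "-" instead of "type": parse error
    direction: in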

Find nested key-value pair in yaml

I'm trying to use yq to find if a key-value pair exists in a yaml.
Here's an example yaml:
houseThings:
  - houseThing:
      thingType: chair
  - houseThing:
      thingType: table
  - houseThing:
      thingType: door
I just want an expression that evaluates to true (or any value, or exits with zero status) if the key-value pair of thingType: door exists in the yaml above.
The best I can do so far is to check whether the value exists by recursively walking all nodes and checking their values:
yq eval '.. | select(. == "door")' my_file.yaml
which returns door. But I also want to make sure thingType is its key.
You could use the select statement under houseThing as
yq e '.houseThings[].houseThing | select(.thingType == "door")' yaml
or do a recursive search for it:
yq e '.. | select(has("thingType")) | select(.thingType == "door")' yaml
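To get an exit status out of this, as the question asks, yq v4 has an -e/--exit-status flag that makes the command fail when the expression yields no match (check yq --help, since flag availability depends on your version):

# exits 0 when a match is found, non-zero otherwise
yq e -e '.houseThings[].houseThing | select(.thingType == "door")' my_file.yaml > /dev/null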

how to explicitly write two references in ruamel.yaml

I have multiple references, and when I write them to a YAML file using ruamel.yaml from Python I get:
<<: [*name-name, *help-name]
but instead I would prefer to have
<<: *name-name
<<: *help-name
Is there an option to achieve this while writing to the file?
UPDATE
descriptions:
  - &description-one-ref
    description: >
helptexts:
  - &help-one
    help_text: |
questions:
  - &question-one
    title: "title test"
    reference: "question-one-ref"
    field: "ChoiceField"
    choices:
      - "Yes"
      - "No"
    required: true
    <<: *description-one-ref
    <<: *help-one
    riskvalue_max: 10
    calculations:
      - conditions:
          - comparator: "equal"
            value: "Yes"
        actions:
          - riskvalue: 0
      - conditions:
          - comparator: "equal"
            value: "No"
        actions:
          - riskvalue: 10
Currently I'm reading such a file, modifying specific values within Python, and then writing it back. When writing, the references come out as a list instead of being written out as above.
The workflow is: I read the doc via
import ruamel.yaml

yaml = ruamel.yaml.YAML()
with open('test.yaml') as f:
    data = yaml.load(f)

for k in data.keys():
    if k == 'questions':
        q = data.get(k)
        for i in range(0, len(q)):
            q[i]['title'] = "my new title"

with open('new_file.yaml', 'w') as g:
    yaml.dump(data, g)
No, there is no such option, as it would lead to an invalid YAML file.
The << is a mapping key whose value is interpreted specially, assuming the parser implements the language-independent merge key specification. And a mapping key must be unique according to the YAML specification:
The content of a mapping node is an unordered set of key: value node
pairs, with the restriction that each of the keys is unique.
That ruamel.yaml (< 0.15.75) doesn't throw an error on such a duplicate key is a bug. On duplicate normal keys, ruamel.yaml does throw an error. The bug is inherited from PyYAML (which is not specification-conformant and does not throw an error even on duplicate normal keys).
However, with a little pre- and post-processing, what you want to do can easily be achieved. The trick is to make the YAML valid before parsing by making the offending duplicate << keys unique (but recognisable), and then, when writing the YAML back to file, substituting these unique keys with <<: * again. In the following, the first occurrence of <<: * is replaced by [<<, 0]:, the second by [<<, 1]:, etc. The * needs to be part of the substitution, as there are no anchors in the document for those aliases.
import sys
import subprocess
import ruamel.yaml

yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
yaml.indent(sequence=4, offset=2)


class DoubleMergeKeyEnabler(object):
    def __init__(self):
        self.pat = '<<: *'  # could be at the root level mapping, so no leading space
        self.r_pat = '[<<, {}]: '  # probably not using sequences as keys
        self.pat_nr = -1

    def convert(self, doc):
        while self.pat in doc:
            self.pat_nr += 1
            doc = doc.replace(self.pat, self.r_pat.format(self.pat_nr), 1)
        return doc

    def revert(self, doc):
        while self.pat_nr >= 0:
            doc = doc.replace(self.r_pat.format(self.pat_nr), self.pat, 1)
            self.pat_nr -= 1
        return doc


dmke = DoubleMergeKeyEnabler()

with open('test.yaml') as fp:
    # we don't do this line by line, that would not work well on flow style mappings
    orgdoc = fp.read()
doc = dmke.convert(orgdoc)

data = yaml.load(doc)
data['questions'][0].anchor.always_dump = True
#######################################
# >>>> do your thing on data here <<< #
#######################################

with open('output.yaml', 'w') as fp:
    yaml.dump(data, fp, transform=dmke.revert)

res = subprocess.check_output(['diff', '-u', 'test.yaml', 'output.yaml']).decode('utf-8')
print('diff says:', res)
which gives:
diff says:
which means the files are the same on round-trip (as long as you don't
change anything before dumping).
Setting preserve_quotes and calling indent() on the YAML instance are necessary to preserve your superfluous quotes and to keep the indentation, respectively.
Since the anchor question-one has no alias, you need to enable dumping it explicitly by setting always_dump on that attribute to True. If necessary, you can recursively walk over data and set anchor.always_dump = True wherever .anchor.value is not None.
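A sketch of such a walk (illustrative, not from the original answer; it relies on the anchor attribute that ruamel.yaml's round-trip container types expose):

def force_anchors(node):
    # mark every anchor found in the loaded data for dumping
    anchor = getattr(node, 'anchor', None)
    if anchor is not None and anchor.value is not None:
        anchor.always_dump = True
    if isinstance(node, dict):
        for v in node.values():
            force_anchors(v)
    elif isinstance(node, list):
        for v in node:
            force_anchors(v)

force_anchors(data)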

ruamel parser error on reading special characters

I am using ruamel.yaml (0.15.37) and have a data structure like:
- !Message
  Name: my message
  Messages:
    - !Message
      name: InputMsg1
    - !Variable
      Name: control_word
      Length: 8
      Type: Signed
      Unit: % # ruamel parser error
If I read the YAML file I get the error:
File "_ruamel_yaml.pyx", line 904, in _ruamel_yaml.CParser._parse_next_event (ext/_ruamel_yaml.c:12818)
ruamel.yaml.scanner.ScannerError: while scanning for the next token
found character that cannot start any token
If the value starts with any other character, no error is generated:
- !Message
  Name: my message
  Messages:
    - !Message
      name: InputMsg1
    - !Variable
      Name: control_word
      Length: 8
      Type: Signed
      Unit: a % # no parser error
I also tried &#37;.
The percent sign is an indicator character and those cannot start plain scalars. So you will have to quote the percent sign:
Unit: "%"
or
Unit: '%'
(You can probably also make it a literal block scalar:
Unit: |
  %
or a folded scalar, but I don't think that is more readable.)
Since & is an indicator character as well, that will throw the same error; you seem to (mistakenly) assume you can use HTML escapes in YAML (you can't).