In Grafana I am getting a "400 Bad Request Client sent an HTTP request to an HTTPS server" when trying to update datasource configmaps

In Grafana I am getting a "400 Bad Request Client sent an HTTP request to an HTTPS server" when trying to update datasource configmaps - kubernetes

In Grafana I notice that when I deploy a configmap that should add a datasource it makes no change and does not add the new datasource - note that the configmap is in the cluster and in the correct namespace.
If I make a change to the configmap I get the following error if I look at the logs for the grafana-sc-datasources container:
POST request sent to http://localhost:3000/api/admin/provisioning/datasources/reload. Response: 400 Bad Request Client sent an HTTP request to an HTTPS server.
I assume I do not see any changes because it can not make the post request.
I played around a bit and at one point I did see changes being made/updated in the datasources:
I changed the protocol to http under grafana: / server: / protocol: and I was NOT able to open the grafana website but I did notice that if I did make a change to a datasource configmap in the cluster then I would see a successful 200 message in logs of the grafana-sc-datasources container : POST request sent to http://localhost:3000/api/admin/provisioning/datasources/reload. Response: 200 OK {"message":"Datasources config reloaded"}.
So I assume just need to know how to get Grafana to send the POST request as https instead of http.
Can someone point me to what might be wrong and how to fix it?
Note that I am pretty new to K8s, grafana and helmcharts.
Here is a configmap that I am trying to get to work:
apiVersion: v1
kind: ConfigMap
metadata:
name: jaeger-${NACKLE_ENV}-grafana-datasource
labels:
grafana_datasource: '1'
data:
jaeger-datasource.yaml: |-
apiVersion: 1
datasources:
- name: Jaeger-${NACKLE_ENV}
type: jaeger
access: browser
url: http://jaeger-${NACKLE_ENV}-query.${NACKLE_ENV}.svc.cluster.local:16690
version: 1
basicAuth: false
Here is the current Grafana values file:
# use 1 replica when using a StatefulSet
# If we need more than 1 replica, then we'll have to:
# - remove the `persistence` section below
# - use an external database for all replicas to connect to (refer to Grafana Helm chart docs)
replicas: 1
image:
pullSecrets:
- docker-hub
affinity:
nodeAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: eks.amazonaws.com/capacityType
operator: In
values:
- ON_DEMAND
persistence:
enabled: true
type: statefulset
storageClassName: biw-durable-gp2
podDisruptionBudget:
maxUnavailable: 1
admin:
existingSecret: grafana
sidecar:
datasources:
enabled: true
label: grafana_datasource
dashboards:
enabled: true
label: grafana_dashboard
labelValue: 1
dashboardProviders:
dashboardproviders.yaml:
apiVersion: 1
providers:
- name: 'default'
orgId: 1
folder: ''
type: file
disableDeletion: false
editable: true
options:
path: /var/lib/grafana/dashboards/default
dashboards:
default:
node-exporter:
gnetId: 1860
revision: 23
datasource: Prometheus
core-dns:
gnetId: 12539
revision: 5
datasource: Prometheus
fluentd:
gnetId: 7752
revision: 6
datasource: Prometheus
ingress:
apiVersion: networking.k8s.io/v1
enabled: true
annotations:
kubernetes.io/ingress.class: alb
alb.ingress.kubernetes.io/scheme: internet-facing
alb.ingress.kubernetes.io/healthcheck-port: traffic-port
alb.ingress.kubernetes.io/healthcheck-path: '/api/health'
alb.ingress.kubernetes.io/healthcheck-protocol: HTTPS
alb.ingress.kubernetes.io/backend-protocol: HTTPS
# Redirect to HTTPS at the ALB
alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS":443}]'
alb.ingress.kubernetes.io/actions.ssl-redirect: '{"Type": "redirect", "RedirectConfig": { "Protocol": "HTTPS", "Port": "443", "StatusCode": "HTTP_301"}}'
spec:
rules:
- http:
paths:
- path: /*
pathType: ImplementationSpecific
backend:
service:
name: ssl-redirect
port:
name: use-annotation
defaultBackend:
service:
name: grafana
port:
number: 80
livenessProbe: { "httpGet": { "path": "/api/health", "port": 3000, "scheme": "HTTPS" }, "initialDelaySeconds": 60, "timeoutSeconds": 30, "failureThreshold": 10 }
readinessProbe: { "httpGet": { "path": "/api/health", "port": 3000, "scheme": "HTTPS" } }
service:
type: NodePort
name: grafana
rolePrefix: app-role
env: eks-test
serviceAccount:
name: grafana
annotations:
eks.amazonaws.com/role-arn: ""
pod:
spec:
serviceAccountName: grafana
grafana.ini:
server:
# don't use enforce_domain - it causes an infinite redirect in our setup
# enforce_domain: true
enable_gzip: true
# NOTE - if I set the protocol to http I do see it make changes to datasources but I can not see the website
protocol: https
cert_file: /biw-cert/domain.crt
cert_key: /biw-cert/domain.key
users:
auto_assign_org_role: Editor
# https://grafana.com/docs/grafana/v6.5/auth/gitlab/
auth.gitlab:
enabled: true
allow_sign_up: true
org_role: Editor
scopes: read_api
auth_url: https://gitlab.biw-services.com/oauth/authorize
token_url: https://gitlab.biw-services.com/oauth/token
api_url: https://gitlab.biw-services.com/api/v4
allowed_groups: nackle-teams/devops
securityContext:
fsGroup: 472
runAsUser: 472
runAsGroup: 472
extraConfigmapMounts:
- name: "cert-configmap"
mountPath: "/biw-cert"
subPath: ""
configMap: biw-grafana-cert
readOnly: true

Related

Example Nginx plus ingress for sticky sessions during canary deployment

I’m deploying 2 services to kubernetes pods which simply echo a version number; echo-v1 & echo-v2
Where echo-v2 is considered the canary deployment, I can demonstrate sticky sessions as canary weight is reconfigured from 0 to 100 using canary & canary-weight annotations.
2 ingresses are used:
The first routes to echo-v1 with a session cookie annotation.
The second routes to echo-v2 with canary true,canary weight and session cookie annotations.
The second ingress I can apply without impacting those sessions started on the first ingress and new sessions follow the canary weighting as expected.
However I’ve since learned that those annotations are for nginx community and won’t work with nginx plus.
How can I achieve the same using ingress(es) with nginx plus?

This is the ingress configuration that works for me using Nginx community vs Nginx plus.
Nginx community:
(coffee-v1 service)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
annotations:
kubernetes.io/ingress.class: nginx
ingress.kubernetes.io/rewrite-target: /
nginx.ingress.kubernetes.io/affinity: "cookie"
name: ingress-coffee
spec:
rules:
- http:
paths:
- path: /coffee
pathType: Exact
backend:
service:
name: coffee-v1
port:
number: 80
(coffee-v2 'canary' service)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
annotations:
kubernetes.io/ingress.class: nginx
ingress.kubernetes.io/rewrite-target: /
nginx.ingress.kubernetes.io/affinity: "cookie"
nginx.ingress.kubernetes.io/canary: "true"
nginx.ingress.kubernetes.io/canary-weight: "100"
name: ingress-coffee-canary
spec:
rules:
- http:
paths:
- path: /coffee
pathType: Exact
backend:
service:
name: coffee-v2
port:
number: 80
Nginx plus:
(coffee-v1 & coffee-v2 as type 'virtualserver' not 'ingress')
apiVersion: k8s.nginx.org/v1
kind: VirtualServer
metadata:
name: cafe
spec:
host: cloudbees-training.group.net
tls:
secret: cloudbees-trn.aks.group.net-tls
upstreams:
- name: coffee-v1
service: coffee-v1-svc
port: 80
sessionCookie:
enable: true
name: srv_id_v1
path: /coffee
expires: 2h
- name: coffee-v2
service: coffee-v2-svc
port: 80
sessionCookie:
enable: true
name: srv_id_v2
path: /coffee
expires: 2h
routes:
- path: /coffee
matches:
- conditions:
- cookie: srv_id_v1
value: ~*
action:
pass: coffee-v1
- conditions:
- cookie: srv_id_v2
value: ~*
action:
pass: coffee-v2
# 3 options to handle new session below:
#
# 1) All new sessions to v1:
# action:
# pass: coffee-v1
#
# 2) All new sessions to v2:
# action:
# pass: coffee-v2
#
# 3) Split new sessions by weight
# Note: 0,100 / 100,0 weightings causes sessions
# to drop for the 0 weighted service:
# splits:
# - weight: 50
# action:
# pass: coffee-v1
# - weight: 50
# action:
# pass: coffee-v2

Didn't creating kiali ingress resource after kiali deployment

I deployed the kiali-operator helm chart in the kiali-operator namespace and istio in the istio-system namespace. Now I am trying to deploy the kiali workload in the istio-system.
But somehow ingress rule did not create. I attached the kiali deployment YAML file for reference.
apiVersion: kiali.io/v1alpha1
kind: Kiali
metadata:
name: kiali
namespace: istio-system
spec:
istio_labels:
app_label_name: "app.kubernetes.io/name"
installation_tag: "kiali"
istio_namespace: "istio-system"
version: "default"
auth:
strategy: token
custom_dashboards:
- name: "envoy"
deployment:
accessible_namespaces: ["**"]
ingress:
# default: additional_labels is empty
# additional_labels:
# ingressAdditionalLabel: "ingressAdditionalLabelValue"
class_name: "nginx"
default: enabled is undefined
enabled: true
# default: override_yaml is undefined
override_yaml:
metadata:
annotations:
kubernetes.io/ingress.class: "nginx"
nginx.ingress.kubernetes.io/secure-backends: "true"
nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
rules:
- http:
paths:
- path: "/kiali"
pathType: Prefix
backend:
service:
name: "kiali"
port:
number: 20001
instance_name: "kiali"
external_services:
custom_dashboards:
enabled: true
istio:
component_status:
components:
- app_label: "istiod"
is_core: true
is_proxy: false
- app_label: "istio-ingressgateway"
is_core: true
is_proxy: true
# default: namespace is undefined
namespace: istio-system
- app_label: "istio-egressgateway"
is_core: false
is_proxy: true
# default: namespace is undefined
namespace: istio-system
enabled: true
config_map_name: "istio"
envoy_admin_local_port: 15000
# default: istio_canary_revision is undefined
istio_canary_revision:
current: "1-9-9"
upgrade: "1-10-2"
istio_identity_domain: "svc.cluster.local"
istio_injection_annotation: "sidecar.istio.io/inject"
istio_sidecar_annotation: "sidecar.istio.io/status"
istio_sidecar_injector_config_map_name: "istio-sidecar-injector"
istiod_deployment_name: "istiod"
istiod_pod_monitoring_port: 15014
root_namespace: ""
url_service_version: ""
prometheus:
# Prometheus service name is "metrics" and is in the "telemetry" namespace
url: "<prome_url>"
grafana:
auth:
ca_file: ""
insecure_skip_verify: false
password: "password"
token: ""
type: "basic"
use_kiali_token: false
username: "user"
enabled: true
# Grafana service name is "grafana" and is in the "telemetry" namespace.
in_cluster_url: '<grafana_url>'
url: '<grafana_url>'
tracing:
enabled: true
in_cluster_url: '<jaeger-url>'
use_grpc: true
Advance thanks for any help!!

TLS error Nginx-ingress-controller fails to start after AKS upgrade to v1.23.5 from 1.21- traefik still tries to get from *v1beta1.Ingress

We deploy service with helm. The ingress template looks like that:
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: ui-app-ingress
{{- with .Values.ingress.annotations}}
annotations:
{{- toYaml . | nindent 4}}
{{- end}}
spec:
rules:
- host: {{ .Values.ingress.hostname }}
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: {{ include "ui-app-chart.fullname" . }}
port:
number: 80
tls:
- hosts:
- {{ .Values.ingress.hostname }}
secretName: {{ .Values.ingress.certname }}
as you can see, we already use networking.k8s.io/v1 but if i watch the treafik logs, i find this error:
1 reflector.go:138] pkg/mod/k8s.io/client-go#v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource (get ingresses.extensions)
what results in tls cert error:
time="2022-06-07T15:40:35Z" level=debug msg="Serving default certificate for request: \"example.de\""
time="2022-06-07T15:40:35Z" level=debug msg="http: TLS handshake error from 10.1.0.4:57484: remote error: tls: unknown certificate"
time="2022-06-07T15:40:35Z" level=debug msg="Serving default certificate for request: \"example.de\""
time="2022-06-07T15:53:06Z" level=debug msg="Serving default certificate for request: \"\""
time="2022-06-07T16:03:31Z" level=debug msg="Serving default certificate for request: \"<ip-adress>\""
time="2022-06-07T16:03:32Z" level=debug msg="Serving default certificate for request: \"<ip-adress>\""
PS C:\WINDOWS\system32>
i already found out that networking.k8s.io/v1beta1 is not longer served, but networking.k8s.io/v1 was defined in the template all the time as ApiVersion.
Why does it still try to get from v1beta1? And how can i fix this?
We use this TLSOptions:
apiVersion: traefik.containo.us/v1alpha1
kind: TLSOption
metadata:
name: default
namespace: default
spec:
minVersion: VersionTLS12
maxVersion: VersionTLS13
cipherSuites:
- TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
we use helm-treafik rolled out with terraform:
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "2"
meta.helm.sh/release-name: traefik
meta.helm.sh/release-namespace: traefik
creationTimestamp: "2021-06-12T10:06:11Z"
generation: 2
labels:
app.kubernetes.io/instance: traefik
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: traefik
helm.sh/chart: traefik-9.19.1
name: traefik
namespace: traefik
resourceVersion: "86094434"
uid: 903a6f54-7698-4290-bc59-d234a191965c
spec:
progressDeadlineSeconds: 600
replicas: 3
revisionHistoryLimit: 10
selector:
matchLabels:
app.kubernetes.io/instance: traefik
app.kubernetes.io/name: traefik
strategy:
rollingUpdate:
maxSurge: 1
maxUnavailable: 1
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app.kubernetes.io/instance: traefik
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: traefik
helm.sh/chart: traefik-9.19.1
spec:
containers:
- args:
- --global.checknewversion
- --global.sendanonymoususage
- --entryPoints.traefik.address=:9000/tcp
- --entryPoints.web.address=:8000/tcp
- --entryPoints.websecure.address=:8443/tcp
- --api.dashboard=true
- --ping=true
- --providers.kubernetescrd
- --providers.kubernetesingress
- --providers.file.filename=/etc/traefik/traefik.yml
- --accesslog=true
- --accesslog.format=json
- --log.level=DEBUG
- --entrypoints.websecure.http.tls
- --entrypoints.web.http.redirections.entrypoint.to=websecure
- --entrypoints.web.http.redirections.entrypoint.scheme=https
- --entrypoints.web.http.redirections.entrypoint.permanent=true
- --entrypoints.web.http.redirections.entrypoint.to=:443
image: traefik:2.4.8
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
httpGet:
path: /ping
port: 9000
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 2
name: traefik
ports:
- containerPort: 9000
name: traefik
protocol: TCP
- containerPort: 8000
name: web
protocol: TCP
- containerPort: 8443
name: websecure
protocol: TCP
readinessProbe:
failureThreshold: 1
httpGet:
path: /ping
port: 9000
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 2
resources: {}
securityContext:
capabilities:
add:
- NET_BIND_SERVICE
drop:
- ALL
readOnlyRootFilesystem: true
runAsGroup: 0
runAsNonRoot: false
runAsUser: 0
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /data
name: data
- mountPath: /tmp
name: tmp
- mountPath: /etc/traefik
name: traefik-cm
readOnly: true
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
fsGroup: 65532
serviceAccount: traefik
serviceAccountName: traefik
terminationGracePeriodSeconds: 60
tolerations:
- effect: NoSchedule
key: env
operator: Equal
value: conhub
volumes:
- emptyDir: {}
name: data
- emptyDir: {}
name: tmp
- configMap:
defaultMode: 420
name: traefik-cm
name: traefik-cm
status:
availableReplicas: 3
conditions:
- lastTransitionTime: "2022-06-07T09:19:58Z"
lastUpdateTime: "2022-06-07T09:19:58Z"
message: Deployment has minimum availability.
reason: MinimumReplicasAvailable
status: "True"
type: Available
- lastTransitionTime: "2021-06-12T10:06:11Z"
lastUpdateTime: "2022-06-07T16:39:01Z"
message: ReplicaSet "traefik-84c6f5f98b" has successfully progressed.
reason: NewReplicaSetAvailable
status: "True"
type: Progressing
observedGeneration: 2
readyReplicas: 3
replicas: 3
updatedReplicas: 3
resource "helm_release" "traefik" {
name = "traefik"
namespace = "traefik"
create_namespace = true
repository = "https://helm.traefik.io/traefik"
chart = "traefik"
set {
name = "service.spec.loadBalancerIP"
value = azurerm_public_ip.pub_ip.ip_address
}
set {
name = "service.annotations.service\\.beta\\.kubernetes\\.io/azure-load-balancer-resource-group"
value = var.resource_group_aks
}
set {
name = "additionalArguments"
value = "{--accesslog=true,--accesslog.format=json,--log.level=DEBUG,--entrypoints.websecure.http.tls,--entrypoints.web.http.redirections.entrypoint.to=websecure,--entrypoints.web.http.redirections.entrypoint.scheme=https,--entrypoints.web.http.redirections.entrypoint.permanent=true,--entrypoints.web.http.redirections.entrypoint.to=:443}"
}
set {
name = "deployment.replicas"
value = 3
}
timeout = 600
depends_on = [
azurerm_kubernetes_cluster.aks
]
}

I found out that the problem was the version of the traefik image:
i quick fixed it by setting the latest image:
kubectl set image deployment/traefik traefik=traefik:2.7.0 -n traefik

Not connecting GRPC requist using aws nlb with ACM cert

I have k8s cluster for using gRPC service with envoy proxy, all gRPC and web request collect Envoy and passed into backend , Envoy SVC started with nlb, and nlb attached with ACM certificate. without nlb certificate request passed into backend correctly and getting response, but need to use nlb-url:443, again if I'm attaching ACM cert into nlb,and its not at all getting response. why?
Or again I need to use another ingress for treat ssl and route?
envoy-svc.yaml
apiVersion: v1
kind: Service
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:eu-west-2:12345676789:certificate/ss304s07-3ss2-4s73-8744-bs2sss123460
service.beta.kubernetes.io/aws-load-balancer-type: nlb
creationTimestamp: "2021-06-11T02:50:24Z"
finalizers:
- service.kubernetes.io/load-balancer-cleanup
name: envoy-service
namespace: default
spec:
externalTrafficPolicy: Cluster
ports:
- nodePort: 31156
port: 443
protocol: TCP
targetPort: 80
selector:
name: envoy
sessionAffinity: None
type: LoadBalancer
envoy-conf
admin:
access_log_path: /dev/stdout
address:
socket_address: { address: 0.0.0.0, port_value: 8801 }
static_resources:
# https://www.envoyproxy.io/docs/envoy/v1.15.0/api-v3/config/listener/v3/listener.proto#config-listener-v3-listener
listeners:
- name: listener_0
address:
socket_address:
address: 0.0.0.0
port_value: 80
filter_chains:
- filters:
- name: envoy.filters.network.http_connection_manager
typed_config:
# https://www.envoyproxy.io/docs/envoy/v1.15.0/api-v3/extensions/filters/network/http_connection_manager/v3/http_connection_manager.proto#extensions-filters-network-http-connection-manager-v3-httpconnectionmanager
"#type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
stat_prefix: ingress_http
http2_protocol_options: {}
access_log:
# https://www.envoyproxy.io/docs/envoy/v1.15.0/api-v3/extensions/access_loggers/file/v3/file.proto
#
# You can also configure this extension with the qualified
# name envoy.access_loggers.http_grpc
# https://www.envoyproxy.io/docs/envoy/v1.15.0/api-v3/extensions/access_loggers/grpc/v3/als.proto
- name: envoy.access_loggers.file
typed_config:
# https://www.envoyproxy.io/docs/envoy/v1.15.0/api-v3/extensions/access_loggers/file/v3/file.proto#extensions-access-loggers-file-v3-fileaccesslog
"#type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
# Console output
path: /dev/stdout
route_config:
name: local_route
virtual_hosts:
- name: local_service
domains:
- "*"
routes:
- match:
prefix: /
grpc:
route:
cluster: greeter_service
cors:
allow_origin_string_match:
- prefix: "*"
allow_methods: GET, PUT, DELETE, POST, OPTIONS
# custom-header-1 is just an example. the grpc-web
# repository was missing grpc-status-details-bin header
# which used in a richer error model.
# https://grpc.io/docs/guides/error/#richer-error-model
allow_headers: accept-language,accept-encoding,user-agent,referer,sec-fetch-mode,origin,access-control-request-headers,access-control-request-method,accept,cache-control,pragma,connection,host,name,x-grpc-web,x-user-agent,grpc-timeout,content-type
expose_headers: grpc-status-details-bin,grpc-status,grpc-message,authorization
max_age: "1728000"
http_filters:
- name: envoy.filters.http.grpc_web
# This line is optional, but adds clarity to the configuration.
typed_config:
# https://www.envoyproxy.io/docs/envoy/v1.15.0/api-v3/extensions/filters/http/grpc_web/v3/grpc_web.proto
"#type": type.googleapis.com/envoy.extensions.filters.http.grpc_web.v3.GrpcWeb
- name: envoy.filters.http.cors
typed_config:
# https://www.envoyproxy.io/docs/envoy/v1.15.0/api-v3/extensions/filters/http/cors/v3/cors.proto
"#type": type.googleapis.com/envoy.extensions.filters.http.cors.v3.Cors
- name: envoy.filters.http.grpc_json_transcoder
typed_config:
"#type": type.googleapis.com/envoy.extensions.filters.http.grpc_json_transcoder.v3.GrpcJsonTranscoder
proto_descriptor: "/etc/envoy-sync/sync.pb"
ignore_unknown_query_parameters: true
services:
- "com.tk.system.sync.Synchronizer"
print_options:
add_whitespace: true
always_print_primitive_fields: true
always_print_enums_as_ints: true
preserve_proto_field_names: true
- name: envoy.filters.http.router
typed_config:
# https://www.envoyproxy.io/docs/envoy/v1.15.0/api-v3/extensions/filters/http/router/v3/router.proto
"#type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
clusters:
# https://www.envoyproxy.io/docs/envoy/v1.15.0/api-v3/config/cluster/v3/cluster.proto#config-cluster-v3-cluster
- name: greeter_service
type: LOGICAL_DNS
connect_timeout: 0.25s
lb_policy: round_robin
load_assignment:
cluster_name: greeter_service
endpoints:
- lb_endpoints:
- endpoint:
address:
socket_address:
address: micro-deployment
port_value: 8081
http2_protocol_options: {} # Force HTTP/2

OPA (running as host level separate service ) policies are not getting enforced via envoy while calling the API

My scenario - one example-app service exposed through Nodeport in K8s cluster .
The service is having one envoy side car . OPA is running as separate service on a same node.
My app-deployment specs -
apiVersion: apps/v1
kind: Deployment
metadata:
name: example-app
labels:
app: example-app
spec:
replicas: 1
selector:
matchLabels:
app: example-app
template:
metadata:
labels:
app: example-app
spec:
initContainers:
- name: proxy-init
image: openpolicyagent/proxy_init:v5
# Configure the iptables bootstrap script to redirect traffic to the
# Envoy proxy on port 8000, specify that Envoy will be running as user
# 1111, and that we want to exclude port 8282 from the proxy for the
# OPA health checks. These values must match up with the configuration
# defined below for the "envoy" and "opa" containers.
args: ["-p", "8000", "-u", "1111"]
securityContext:
capabilities:
add:
- NET_ADMIN
runAsNonRoot: false
runAsUser: 0
containers:
- name: app
image: openpolicyagent/demo-test-server:v1
ports:
- containerPort: 8080
- name: envoy
image: envoyproxy/envoy:v1.14.4
securityContext:
runAsUser: 1111
volumeMounts:
- readOnly: true
mountPath: /config
name: proxy-config
args:
- "envoy"
- "--config-path"
- "/config/envoy.yaml"
volumes:
- name: proxy-config
configMap:
name: proxy-config
My Envoy specs(loaded through config maps) -
static_resources:
listeners:
- address:
socket_address:
address: 0.0.0.0
port_value: 8000
filter_chains:
- filters:
- name: envoy.filters.network.http_connection_manager
typed_config:
"#type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
codec_type: auto
stat_prefix: ingress_http
route_config:
name: local_route
virtual_hosts:
- name: backend
domains:
- "*"
routes:
- match:
prefix: "/"
route:
cluster: service
http_filters:
- name: envoy.filters.http.ext_authz
typed_config:
"#type": type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthz
grpc_service:
envoy_grpc:
cluster_name: opa
timeout: 0.250s
- name: envoy.filters.http.router
typed_config: {}
clusters:
- name: service
connect_timeout: 0.25s
type: strict_dns
lb_policy: round_robin
load_assignment:
cluster_name: service
endpoints:
- lb_endpoints:
- endpoint:
address:
socket_address:
address: 127.0.0.1
port_value: 8080
- name: opa
connect_timeout: 0.250s
type: STRICT_DNS
lb_policy: ROUND_ROBIN
http2_protocol_options: {}
load_assignment:
cluster_name: opa
endpoints:
- lb_endpoints:
- endpoint:
address:
socket_address:
address: opa
port_value: 9191
admin:
access_log_path: "/dev/null"
address:
socket_address:
address: 0.0.0.0
port_value: 8001
OPA deployment specs (exposed through nodeport as a service):-
apiVersion: apps/v1
kind: Deployment
metadata:
name: opa
labels:
app: opa
spec:
replicas: 1
selector:
matchLabels:
app: opa
template:
metadata:
labels:
app: opa
name: opa
spec:
containers:
- name: opa
image: openpolicyagent/opa:latest-envoy
ports:
- name: http
containerPort: 8181
securityContext:
runAsUser: 1111
args:
- "run"
- "--server"
- "--log-level=info"
- "--log-format=json-pretty"
- "--set=decision_logs.console=true"
- "--set=plugins.envoy_ext_authz_grpc.addr=:9191"
- "--set=plugins.envoy_ext_authz_grpc.query=data.envoy.authz.allow"
- "--ignore=.*"
- "/policy/policy.rego"
volumeMounts:
- readOnly: true
mountPath: /policy
name: opa-policy
volumes:
- name: opa-policy
configMap:
name: opa-policy
Sample policy.rego (loaded through configmap)-
package envoy.authz
import input.attributes.request.http as http_request
default allow = false
token = {"valid": valid, "payload": payload} {
[_, encoded] := split(http_request.headers.authorization, " ")
[valid, _, payload] := io.jwt.decode_verify(encoded, {"secret": "secret"})
}
allow {
is_token_valid
action_allowed
}
is_token_valid {
token.valid
now := time.now_ns() / 1000000000
token.payload.nbf <= now
now < token.payload.exp
}
action_allowed {
http_request.method == "GET"
token.payload.role == "guest"
glob.match("/people*", [], http_request.path)
}
action_allowed {
http_request.method == "GET"
token.payload.role == "admin"
glob.match("/people*", [], http_request.path)
}
action_allowed {
http_request.method == "POST"
token.payload.role == "admin"
glob.match("/people", [], http_request.path)
lower(input.parsed_body.firstname) != base64url.decode(token.payload.sub)
}
While calling REST API getting below response -
curl -i -H "Authorization: Bearer $ALICE_TOKEN" http://$SERVICE_URL/people
HTTP/1.1 403 Forbidden
date: Sat, 13 Feb 2021 08:50:29 GMT
server: envoy
content-length: 0
OPA decision logs -
{
"addrs": [
":8181"
],
"diagnostic-addrs": [],
"level": "info",
"msg": "Initializing server.",
"time": "2021-02-13T08:48:27Z"
}
{
"level": "info",
"msg": "Starting decision logger.",
"plugin": "decision_logs",
"time": "2021-02-13T08:48:27Z"
}
{
"addr": ":9191",
"dry-run": false,
"enable-reflection": false,
"level": "info",
"msg": "Starting gRPC server.",
"path": "",
"query": "data.envoy.authz.allow",
"time": "2021-02-13T08:48:27Z"
}
What I am doing wrong here ? there is nothing in decision logs for my rest call to the API . OPA policy should invoke through envoy filter which is not happening.
Please help .