Grafana alert provisioning issue - grafana

I'd like to be able to provision alerts, and try to follow this instructions, but no luck!
OS Grafana version: 9.2.0 Running in Kubernetes
Steps that I take:
Created a new alert rule from UI.
Extract alert rule from API:
curl -k https://<my-grafana-url>/api/v1/provisioning/alert-rules/-4pMuQFVk -u admin:<my-admin-password>
It returns the following:
---
id: 18
uid: "-4pMuQFVk"
orgID: 1
folderUID: 3i72aQKVk
ruleGroup: cpu_alert_group
title: my_cpu_alert
condition: B
data:
- refId: A
queryType: ''
relativeTimeRange:
from: 600
to: 0
datasourceUid: _SaubQF4k
model:
editorMode: code
expr: system_cpu_usage
hide: false
intervalMs: 1000
legendFormat: __auto
maxDataPoints: 43200
range: true
refId: A
- refId: B
queryType: ''
relativeTimeRange:
from: 0
to: 0
datasourceUid: "-100"
model:
conditions:
- evaluator:
params:
- 3
type: gt
operator:
type: and
query:
params:
- A
reducer:
params: []
type: last
type: query
datasource:
type: __expr__
uid: "-100"
expression: A
hide: false
intervalMs: 1000
maxDataPoints: 43200
refId: B
type: classic_conditions
updated: '2022-12-07T20:01:47Z'
noDataState: NoData
execErrState: Error
for: 5m
Deleted the alert rule from UI.
Made a configmap from above alert-rule, as such:
apiVersion: v1
kind: ConfigMap
metadata:
name: grafana-alerting
data:
alerting.yaml: |-
apiVersion: 1
groups:
- id: 18
uid: "-4pMuQFVk"
orgID: 1
folderUID: 3i72aQKVk
ruleGroup: cpu_alert_group
title: my_cpu_alert
condition: B
data:
- refId: A
queryType: ''
relativeTimeRange:
from: 600
to: 0
datasourceUid: _SaubQF4k
model:
editorMode: code
expr: system_cpu_usage
hide: false
intervalMs: 1000
legendFormat: __auto
maxDataPoints: 43200
range: true
refId: A
- refId: B
queryType: ''
relativeTimeRange:
from: 0
to: 0
datasourceUid: "-100"
model:
conditions:
- evaluator:
params:
- 3
type: gt
operator:
type: and
query:
params:
- A
reducer:
params: []
type: last
type: query
datasource:
type: __expr__
uid: "-100"
expression: A
hide: false
intervalMs: 1000
maxDataPoints: 43200
refId: B
type: classic_conditions
updated: '2022-12-07T20:01:47Z'
noDataState: NoData
execErrState: Error
for: 5m
I mount above configmap in grafana container (in /etc/grafna/provisioning/alerting).
The full manifest of the deployment is as follow:
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "2"
meta.helm.sh/release-name: grafana
meta.helm.sh/release-namespace: monitoring
creationTimestamp: "2022-12-08T18:31:30Z"
generation: 4
labels:
app.kubernetes.io/instance: grafana
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: grafana
app.kubernetes.io/version: 9.3.0
helm.sh/chart: grafana-6.46.1
name: grafana
namespace: monitoring
resourceVersion: "648617"
uid: dc06b802-5281-4f31-a2b2-fef3cf53a70b
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app.kubernetes.io/instance: grafana
app.kubernetes.io/name: grafana
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
annotations:
checksum/config: 98cac51656714db48a116d3109994ee48c401b138bc8459540e1a497f994d197
checksum/dashboards-json-config: 01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b
checksum/sc-dashboard-provider-config: 01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b
checksum/secret: 203f0e4d883461bdd41fe68515fc47f679722dc2fdda49b584209d1d288a5f07
creationTimestamp: null
labels:
app.kubernetes.io/instance: grafana
app.kubernetes.io/name: grafana
spec:
automountServiceAccountToken: true
containers:
- env:
- name: GF_SECURITY_ADMIN_USER
valueFrom:
secretKeyRef:
key: admin-user
name: grafana
- name: GF_SECURITY_ADMIN_PASSWORD
valueFrom:
secretKeyRef:
key: admin-password
name: grafana
- name: GF_PATHS_DATA
value: /var/lib/grafana/
- name: GF_PATHS_LOGS
value: /var/log/grafana
- name: GF_PATHS_PLUGINS
value: /var/lib/grafana/plugins
- name: GF_PATHS_PROVISIONING
value: /etc/grafana/provisioning
image: grafana/grafana:9.3.0
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 10
httpGet:
path: /api/health
port: 3000
scheme: HTTP
initialDelaySeconds: 60
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 30
name: grafana
ports:
- containerPort: 3000
name: grafana
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /api/health
port: 3000
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /etc/grafana/grafana.ini
name: config
subPath: grafana.ini
- mountPath: /etc/grafana/provisioning/alerting
name: grafana-alerting
- mountPath: /var/lib/grafana
name: storage
dnsPolicy: ClusterFirst
enableServiceLinks: true
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
fsGroup: 472
runAsGroup: 472
runAsUser: 472
serviceAccount: grafana
serviceAccountName: grafana
terminationGracePeriodSeconds: 30
volumes:
- configMap:
defaultMode: 420
name: grafana-alerting
name: grafana-alerting
- configMap:
defaultMode: 420
name: grafana
name: config
- emptyDir: {}
name: storage
However, grafana fail to start with this error:
Failed to start grafana. error: failure to map file alerting.yaml: failure parsing rules: rule group has no name set
I fixed above error by adding Group Name, but similar errors about other missing elements kept showing up again and again (to the point that I stopped chasing, as I couldn't figure out what exactly the correct schema is). Digging in, it looks like the format/schema returned from API in step 2, is different than the schema that's pointed out in the documentation.
Why is the schema of the alert-rule returned from API different? Am I supposed to convert it, and if so how? Otherwise, what am I doing wrong?
Edit: Replaced Statefulset with deployment, since I was able to reproduce this in a normal/minimal Grafana deployment too.

Related

configuring keycloak with external postgres database

How do we configure keycloak to use the external postgres (AWS RDS)?
We deployed it in kubernetes using quarkus distro and update dthe DB env variables in our deployment.yaml , however it is still taking the local h2 data base and not the postgres.
For better understanding providing the deployment.yaml file we are using:
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "5"
kubectl.kubernetes.io/last-applied-configuration: |
creationTimestamp: "2022-06-21T16:47:29Z"
generation: 5
labels:
app: keycloak
name: keycloak
namespace: kc***
resourceVersion: "29233550"
uid: 3634683e-657c-4278-9002-82a3ce64b968
spec:
progressDeadlineSeconds: 600
replicas: 3
revisionHistoryLimit: 10
selector:
matchLabels:
app: keycloak
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app: keycloak
spec:
containers:
- args:
- start
- --hostname=kc-test.k8.com
- --https-certificate-file=/opt/pem/cert-pem/cert.pem
- --https-certificate-key-file=/opt/pem/key-pem/key.pem
- --log-level=DEBUG
env:
- name: KEYCLOAK_ADMIN
value: ****
- name: KEYCLOAK_ADMIN_PASSWORD
value: *****
- name: PROXY_ADDRESS_FORWARDING
value: "true"
- name: DB_ADDR
value: jdbc:postgresql://database.c**7irl*****.us-east-1.rds.amazonaws.com/database
- name: DB_DATABASE
value: ****
- name: DB_USER
value: postgres
- name: DB_SCHEMA
value: public
- name: DB_VENDOR
value: POSTGRES
- name: JGROUPS_DISCOVERY_PROTOCOL
value: dns.DNS_PING
- name: JGROUPS_DISCOVERY_PROPERTIES
value: dns_query=keycloak
- name: CACHE_OWNERS_COUNT
value: "2"
- name: CACHE_OWNERS_AUTH_SESSIONS_COUNT
value: "2"
image: quay.io/keycloak/keycloak:17.0.0
imagePullPolicy: IfNotPresent
name: keycloak
ports:
- containerPort: 7600
name: jgroups
protocol: TCP
- containerPort: 8080
name: http
protocol: TCP
- containerPort: 8443
name: https
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /realms/master
port: 8443
scheme: HTTPS
initialDelaySeconds: 30
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 30
resources: {}
securityContext:
runAsUser: 0
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /opt/pem/key-pem
name: key-pem
- mountPath: /opt/pem/cert-pem
name: cert-pem
- mountPath: /opt/keycloak/data
name: keydata
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
volumes:
- configMap:
defaultMode: 420
name: key-pem
name: key-pem
- configMap:
defaultMode: 420
name: cert-pem
name: cert-pem
- emptyDir: {}
name: keydata
status:
availableReplicas: 3
conditions:
- lastTransitionTime: "2022-06-21T18:02:32Z"
lastUpdateTime: "2022-06-21T18:02:32Z"
message: Deployment has minimum availability.
reason: MinimumReplicasAvailable
status: "True"
type: Available
- lastTransitionTime: "2022-06-21T18:01:53Z"
lastUpdateTime: "2022-06-21T18:16:41Z"
message: ReplicaSet "keycloak-5c84476694" has successfully progressed.
reason: NewReplicaSetAvailable
status: "True"
type: Progressing
observedGeneration: 5
readyReplicas: 3
replicas: 3
updatedReplicas: 3
Is your external DB also in same namespace?
if yes you can use below way.
<external postgres (AWS RDS)>secret-name in k8s secret contains all the below details.
Using this method it will dynamically fetch details from secret.
env:
- name: DB_DATABASE
valueFrom:
secretKeyRef:
name: database-secret-name
key: dbname
- name: DB_ADDR
valueFrom:
secretKeyRef:
name: database-secret-name
key: host
- name: DB_PORT
valueFrom:
secretKeyRef:
name: database-secret-name
key: port
- name: DB_PASSWORD
valueFrom:
secretKeyRef:
name: database-secret-name
key: password
- name: DB_USER
valueFrom:
secretKeyRef:
name: database-secret-name
key: user
if your external postgres in different namespace, copy you database secret to keycloak namespace and give it a try.
DB_ADDR is env variable for Keycloak versions 16-. Use doc for you Keycloak version https://www.keycloak.org/server/all-config
Keycloak 17+ has KC_DB_URL:
db-url
The full database JDBC URL.
If not provided, a default URL is set based on the selected database vendor. For instance, if using 'postgres', the default JDBC URL would be 'jdbc:postgresql://localhost/keycloak'.
CLI: --db-url
Env: KC_DB_URL
Of course configure also other env variables for your Keycloak version properly.

TLS error Nginx-ingress-controller fails to start after AKS upgrade to v1.23.5 from 1.21- traefik still tries to get from *v1beta1.Ingress

We deploy service with helm. The ingress template looks like that:
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: ui-app-ingress
{{- with .Values.ingress.annotations}}
annotations:
{{- toYaml . | nindent 4}}
{{- end}}
spec:
rules:
- host: {{ .Values.ingress.hostname }}
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: {{ include "ui-app-chart.fullname" . }}
port:
number: 80
tls:
- hosts:
- {{ .Values.ingress.hostname }}
secretName: {{ .Values.ingress.certname }}
as you can see, we already use networking.k8s.io/v1 but if i watch the treafik logs, i find this error:
1 reflector.go:138] pkg/mod/k8s.io/client-go#v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource (get ingresses.extensions)
what results in tls cert error:
time="2022-06-07T15:40:35Z" level=debug msg="Serving default certificate for request: \"example.de\""
time="2022-06-07T15:40:35Z" level=debug msg="http: TLS handshake error from 10.1.0.4:57484: remote error: tls: unknown certificate"
time="2022-06-07T15:40:35Z" level=debug msg="Serving default certificate for request: \"example.de\""
time="2022-06-07T15:53:06Z" level=debug msg="Serving default certificate for request: \"\""
time="2022-06-07T16:03:31Z" level=debug msg="Serving default certificate for request: \"<ip-adress>\""
time="2022-06-07T16:03:32Z" level=debug msg="Serving default certificate for request: \"<ip-adress>\""
PS C:\WINDOWS\system32>
i already found out that networking.k8s.io/v1beta1 is not longer served, but networking.k8s.io/v1 was defined in the template all the time as ApiVersion.
Why does it still try to get from v1beta1? And how can i fix this?
We use this TLSOptions:
apiVersion: traefik.containo.us/v1alpha1
kind: TLSOption
metadata:
name: default
namespace: default
spec:
minVersion: VersionTLS12
maxVersion: VersionTLS13
cipherSuites:
- TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
we use helm-treafik rolled out with terraform:
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "2"
meta.helm.sh/release-name: traefik
meta.helm.sh/release-namespace: traefik
creationTimestamp: "2021-06-12T10:06:11Z"
generation: 2
labels:
app.kubernetes.io/instance: traefik
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: traefik
helm.sh/chart: traefik-9.19.1
name: traefik
namespace: traefik
resourceVersion: "86094434"
uid: 903a6f54-7698-4290-bc59-d234a191965c
spec:
progressDeadlineSeconds: 600
replicas: 3
revisionHistoryLimit: 10
selector:
matchLabels:
app.kubernetes.io/instance: traefik
app.kubernetes.io/name: traefik
strategy:
rollingUpdate:
maxSurge: 1
maxUnavailable: 1
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app.kubernetes.io/instance: traefik
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: traefik
helm.sh/chart: traefik-9.19.1
spec:
containers:
- args:
- --global.checknewversion
- --global.sendanonymoususage
- --entryPoints.traefik.address=:9000/tcp
- --entryPoints.web.address=:8000/tcp
- --entryPoints.websecure.address=:8443/tcp
- --api.dashboard=true
- --ping=true
- --providers.kubernetescrd
- --providers.kubernetesingress
- --providers.file.filename=/etc/traefik/traefik.yml
- --accesslog=true
- --accesslog.format=json
- --log.level=DEBUG
- --entrypoints.websecure.http.tls
- --entrypoints.web.http.redirections.entrypoint.to=websecure
- --entrypoints.web.http.redirections.entrypoint.scheme=https
- --entrypoints.web.http.redirections.entrypoint.permanent=true
- --entrypoints.web.http.redirections.entrypoint.to=:443
image: traefik:2.4.8
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
httpGet:
path: /ping
port: 9000
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 2
name: traefik
ports:
- containerPort: 9000
name: traefik
protocol: TCP
- containerPort: 8000
name: web
protocol: TCP
- containerPort: 8443
name: websecure
protocol: TCP
readinessProbe:
failureThreshold: 1
httpGet:
path: /ping
port: 9000
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 2
resources: {}
securityContext:
capabilities:
add:
- NET_BIND_SERVICE
drop:
- ALL
readOnlyRootFilesystem: true
runAsGroup: 0
runAsNonRoot: false
runAsUser: 0
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /data
name: data
- mountPath: /tmp
name: tmp
- mountPath: /etc/traefik
name: traefik-cm
readOnly: true
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
fsGroup: 65532
serviceAccount: traefik
serviceAccountName: traefik
terminationGracePeriodSeconds: 60
tolerations:
- effect: NoSchedule
key: env
operator: Equal
value: conhub
volumes:
- emptyDir: {}
name: data
- emptyDir: {}
name: tmp
- configMap:
defaultMode: 420
name: traefik-cm
name: traefik-cm
status:
availableReplicas: 3
conditions:
- lastTransitionTime: "2022-06-07T09:19:58Z"
lastUpdateTime: "2022-06-07T09:19:58Z"
message: Deployment has minimum availability.
reason: MinimumReplicasAvailable
status: "True"
type: Available
- lastTransitionTime: "2021-06-12T10:06:11Z"
lastUpdateTime: "2022-06-07T16:39:01Z"
message: ReplicaSet "traefik-84c6f5f98b" has successfully progressed.
reason: NewReplicaSetAvailable
status: "True"
type: Progressing
observedGeneration: 2
readyReplicas: 3
replicas: 3
updatedReplicas: 3
resource "helm_release" "traefik" {
name = "traefik"
namespace = "traefik"
create_namespace = true
repository = "https://helm.traefik.io/traefik"
chart = "traefik"
set {
name = "service.spec.loadBalancerIP"
value = azurerm_public_ip.pub_ip.ip_address
}
set {
name = "service.annotations.service\\.beta\\.kubernetes\\.io/azure-load-balancer-resource-group"
value = var.resource_group_aks
}
set {
name = "additionalArguments"
value = "{--accesslog=true,--accesslog.format=json,--log.level=DEBUG,--entrypoints.websecure.http.tls,--entrypoints.web.http.redirections.entrypoint.to=websecure,--entrypoints.web.http.redirections.entrypoint.scheme=https,--entrypoints.web.http.redirections.entrypoint.permanent=true,--entrypoints.web.http.redirections.entrypoint.to=:443}"
}
set {
name = "deployment.replicas"
value = 3
}
timeout = 600
depends_on = [
azurerm_kubernetes_cluster.aks
]
}
I found out that the problem was the version of the traefik image:
i quick fixed it by setting the latest image:
kubectl set image deployment/traefik traefik=traefik:2.7.0 -n traefik

cp: cannot stat '/opt/flink/opt/flink-metrics-prometheus-*.jar': No such file or directory in apache flink

I am upgrade apache flink 1.10 to apache flink 1.11 in kubernetes, but the jobmanager kubernetes pod log shows:
cp: cannot stat '/opt/flink/opt/flink-metrics-prometheus-*.jar': No such file or directory
this is my jobmanager pod yaml:
kind: Deployment
apiVersion: apps/v1
metadata:
name: report-flink-jobmanager
namespace: middleware
selfLink: /apis/apps/v1/namespaces/middleware/deployments/report-flink-jobmanager
uid: b7bd8f0d-cddb-44e7-8bbe-b96e68dbfbcd
resourceVersion: '13655071'
generation: 44
creationTimestamp: '2020-06-08T02:11:33Z'
labels:
app.kubernetes.io/instance: report-flink
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: flink
app.kubernetes.io/version: 1.10.0
component: jobmanager
helm.sh/chart: flink-0.1.15
annotations:
deployment.kubernetes.io/revision: '6'
meta.helm.sh/release-name: report-flink
meta.helm.sh/release-namespace: middleware
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/instance: report-flink
app.kubernetes.io/name: flink
component: jobmanager
template:
metadata:
creationTimestamp: null
labels:
app.kubernetes.io/instance: report-flink
app.kubernetes.io/name: flink
component: jobmanager
spec:
volumes:
- name: flink-config-volume
configMap:
name: report-flink-config
items:
- key: flink-conf.yaml
path: flink-conf.yaml.tpl
- key: log4j.properties
path: log4j.properties
- key: security.properties
path: security.properties
defaultMode: 420
- name: flink-pro-persistent-storage
persistentVolumeClaim:
claimName: flink-pv-claim
containers:
- name: jobmanager
image: 'flink:1.11'
command:
- /bin/bash
- '-c'
- >-
cp /opt/flink/opt/flink-metrics-prometheus-*.jar
/opt/flink/opt/flink-s3-fs-presto-*.jar /opt/flink/lib/ && wget
https://repo1.maven.org/maven2/com/github/oshi/oshi-core/3.4.0/oshi-core-3.4.0.jar
-O /opt/flink/lib/oshi-core-3.4.0.jar && wget
https://repo1.maven.org/maven2/net/java/dev/jna/jna/5.4.0/jna-5.4.0.jar
-O /opt/flink/lib/jna-5.4.0.jar && wget
https://repo1.maven.org/maven2/net/java/dev/jna/jna-platform/5.4.0/jna-platform-5.4.0.jar
-O /opt/flink/lib/jna-platform-5.4.0.jar && cp
$FLINK_HOME/conf/flink-conf.yaml.tpl
$FLINK_HOME/conf/flink-conf.yaml && $FLINK_HOME/bin/jobmanager.sh
start; while :; do if [[ -f $(find log -name '*jobmanager*.log'
-print -quit) ]]; then tail -f -n +1 log/*jobmanager*.log; fi;
done
workingDir: /opt/flink
ports:
- name: blob
containerPort: 6124
protocol: TCP
- name: rpc
containerPort: 6123
protocol: TCP
- name: ui
containerPort: 8081
protocol: TCP
- name: metrics
containerPort: 9999
protocol: TCP
env:
- name: JVM_ARGS
value: '-Djava.security.properties=/opt/flink/conf/security.properties'
- name: FLINK_POD_IP
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.podIP
- name: APOLLO_META
valueFrom:
configMapKeyRef:
name: pro-config
key: apollo.meta
- name: ENV
valueFrom:
configMapKeyRef:
name: pro-config
key: env
resources: {}
volumeMounts:
- name: flink-config-volume
mountPath: /opt/flink/conf/flink-conf.yaml.tpl
subPath: flink-conf.yaml.tpl
- name: flink-config-volume
mountPath: /opt/flink/conf/log4j.properties
subPath: log4j.properties
- name: flink-config-volume
mountPath: /opt/flink/conf/security.properties
subPath: security.properties
- name: flink-pro-persistent-storage
mountPath: /opt/flink/data/
livenessProbe:
tcpSocket:
port: 6124
initialDelaySeconds: 10
timeoutSeconds: 1
periodSeconds: 15
successThreshold: 1
failureThreshold: 3
readinessProbe:
tcpSocket:
port: 6123
initialDelaySeconds: 20
timeoutSeconds: 1
periodSeconds: 10
successThreshold: 1
failureThreshold: 3
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
imagePullPolicy: IfNotPresent
restartPolicy: Always
terminationGracePeriodSeconds: 30
dnsPolicy: ClusterFirst
serviceAccountName: jobmanager
serviceAccount: jobmanager
securityContext: {}
schedulerName: default-scheduler
strategy:
type: Recreate
revisionHistoryLimit: 10
progressDeadlineSeconds: 600
status:
observedGeneration: 44
replicas: 1
updatedReplicas: 1
unavailableReplicas: 1
conditions:
- type: Available
status: 'False'
lastUpdateTime: '2020-08-19T06:26:56Z'
lastTransitionTime: '2020-08-19T06:26:56Z'
reason: MinimumReplicasUnavailable
message: Deployment does not have minimum availability.
- type: Progressing
status: 'False'
lastUpdateTime: '2020-08-19T06:42:56Z'
lastTransitionTime: '2020-08-19T06:42:56Z'
reason: ProgressDeadlineExceeded
message: >-
ReplicaSet "report-flink-jobmanager-7b8b9bd6bb" has timed out
progressing.
should I remove the not exists jar file? how to fix this?

Next.js Environment variables work locally but not when hosted on Kubernetes

Have a Next.js project.
This is my next.config.js file, which I followed through with on this guide: https://dev.to/tesh254/environment-variables-from-env-file-in-nextjs-570b
module.exports = withCSS(withSass({
webpack: (config) => {
config.plugins = config.plugins || []
config.module.rules.push({
test: /\.svg$/,
use: ['#svgr/webpack', {
loader: 'url-loader',
options: {
limit: 100000,
name: '[name].[ext]'
}}],
});
config.plugins = [
...config.plugins,
// Read the .env file
new Dotenv({
path: path.join(__dirname, '.env'),
systemvars: true
})
]
const env = Object.keys(process.env).reduce((acc, curr) => {
acc[`process.env.${curr}`] = JSON.stringify(process.env[curr]);
return acc;
}, {});
// Fixes npm packages that depend on `fs` module
config.node = {
fs: 'empty'
}
/** Allows you to create global constants which can be configured
* at compile time, which in our case is our environment variables
*/
config.plugins.push(new webpack.DefinePlugin(env));
return config
}
}),
);
I have a .env file which holds the values I need. It works when run on localhost.
In my Kubernetes environment, within the deploy file which I can modify, I have the same environment variables set up. But when I try and identify them they come off as undefined, so my application cannot run.
I refer to it like:
process.env.SOME_VARIABLE
which works locally.
Does anyone have experience making environment variables function on Next.js when deployed? Not as simple as it is for a backend service. :(
EDIT:
This is what the environment variable section looks like.
EDIT 2:
Full deploy file, edited to remove some details
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "38"
creationTimestamp: xx
generation: 40
labels:
app: appname
name: appname
namespace: development
resourceVersion: xx
selfLink: /apis/extensions/v1beta1/namespaces/development/deployments/appname
uid: xxx
spec:
progressDeadlineSeconds: xx
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app: appname
tier: sometier
strategy:
rollingUpdate:
maxSurge: 1
maxUnavailable: 1
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app: appname
tier: sometier
spec:
containers:
- env:
- name: NODE_ENV
value: development
- name: PORT
value: "3000"
- name: SOME_VAR
value: xxx
- name: SOME_VAR
value: xxxx
image: someimage
imagePullPolicy: Always
name: appname
readinessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 3000
scheme: HTTP
initialDelaySeconds: 5
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
requests:
cpu: 100m
memory: 100Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
status:
availableReplicas: 1
conditions:
- lastTransitionTime: xxx
lastUpdateTime: xxxx
message: Deployment has minimum availability.
reason: MinimumReplicasAvailable
status: "True"
type: Available
observedGeneration: 40
readyReplicas: 1
replicas: 1
updatedReplicas: 1
.env Works in docker or docker-compose, they do not work in Kubernetes, if you want to add them you can by configmaps objects or add directly to each deployment an example (from documentation):
apiVersion: v1
kind: Pod
metadata:
name: envar-demo
labels:
purpose: demonstrate-envars
spec:
containers:
- name: envar-demo-container
image: gcr.io/google-samples/node-hello:1.0
env:
- name: DEMO_GREETING
value: "Hello from the environment"
- name: DEMO_FAREWELL
value: "Such a sweet sorrow
Also, the best and standard way is to use config maps, for example:
containers:
- env:
- name: DB_DEFAULT_DATABASE
valueFrom:
configMapKeyRef:
key: DB_DEFAULT_DATABASE
name: darwined-env
And the config map:
apiVersion: v1
data:
DB_DEFAULT_DATABASE: darwined_darwin_dev_1
kind: ConfigMap
metadata:
creationTimestamp: null
labels:
io.kompose.service: darwin-env
name: darwined-env
Hope this helps.

Sometimes missing desiredContainers

My Zalenium is deployed in Kubernetes. I have set option desiredContainers = 2 and it's working. But sometimes desired containers are not available. Tests are working properly, even when desired containers not available. After "restart" containers appears, but I have no idea why they sometimes dissapears. Does anyone have idea what's going on?
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
name: zalenium
namespace: zalenium-omdc
selfLink: /apis/extensions/v1beta1/namespaces/zalenium-omdc/deployments/zalenium
uid: cbafe254-3e28-4889-a09e-ccfa500ff628
resourceVersion: '25201258'
generation: 24
creationTimestamp: '2019-09-17T13:24:52Z'
labels:
app: zalenium
instance: zalenium
annotations:
deployment.kubernetes.io/revision: '24'
kubectl.kubernetes.io/last-applied-configuration: >
{"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"app":"zalenium","instance":"zalenium"},"name":"zalenium","namespace":"zalenium-omdc"},"spec":{"replicas":1,"selector":{"matchLabels":{"instance":"zalenium"}},"template":{"metadata":{"labels":{"app":"zalenium","instance":"zalenium"}},"spec":{"containers":[{"args":["start"],"env":[{"name":"ZALENIUM_KUBERNETES_CPU_REQUEST","value":"250m"},{"name":"ZALENIUM_KUBERNETES_CPU_LIMIT","value":"1000m"},{"name":"ZALENIUM_KUBERNETES_MEMORY_REQUEST","value":"500Mi"},{"name":"ZALENIUM_KUBERNETES_MEMORY_LIMIT","value":"2Gi"},{"name":"DESIRED_CONTAINERS","value":"2"},{"name":"MAX_DOCKER_SELENIUM_CONTAINERS","value":"16"},{"name":"SELENIUM_IMAGE_NAME","value":"elgalu/selenium"},{"name":"VIDEO_RECORDING_ENABLED","value":"true"},{"name":"SCREEN_WIDTH","value":"1440"},{"name":"SCREEN_HEIGHT","value":"900"},{"name":"MAX_TEST_SESSIONS","value":"1"},{"name":"NEW_SESSION_WAIT_TIMEOUT","value":"1800000"},{"name":"DEBUG_ENABLED","value":"false"},{"name":"SEND_ANONYMOUS_USAGE_INFO","value":"true"},{"name":"TZ","value":"UTC"},{"name":"KEEP_ONLY_FAILED_TESTS","value":"false"},{"name":"RETENTION_PERIOD","value":"3"}],"image":"dosel/zalenium:3","imagePullPolicy":"IfNotPresent","livenessProbe":{"httpGet":{"path":"/status","port":4444},"initialDelaySeconds":90,"periodSeconds":5,"timeoutSeconds":1},"name":"zalenium","ports":[{"containerPort":4444,"protocol":"TCP"}],"readinessProbe":{"httpGet":{"path":"/status","port":4444},"timeoutSeconds":1},"resources":{"requests":{"cpu":"500m","memory":"500Mi"}},"volumeMounts":[{"mountPath":"/home/seluser/videos","name":"zalenium-videos"},{"mountPath":"/tmp/mounted","name":"zalenium-data"}]}],"serviceAccountName":"zalenium","volumes":[{"emptyDir":{},"name":"zalenium-videos"},{"emptyDir":{},"name":"zalenium-data"}]}}}}
spec:
replicas: 1
selector:
matchLabels:
instance: zalenium
template:
metadata:
creationTimestamp: null
labels:
app: zalenium
instance: zalenium
spec:
volumes:
- name: zalenium-videos
emptyDir: {}
- name: zalenium-data
emptyDir: {}
containers:
- name: zalenium
image: 'dosel/zalenium:3'
args:
- start
ports:
- containerPort: 4444
protocol: TCP
env:
- name: ZALENIUM_KUBERNETES_CPU_REQUEST
value: 250m
- name: ZALENIUM_KUBERNETES_CPU_LIMIT
value: 1000m
- name: ZALENIUM_KUBERNETES_MEMORY_REQUEST
value: 500Mi
- name: ZALENIUM_KUBERNETES_MEMORY_LIMIT
value: 2Gi
- name: DESIRED_CONTAINERS
value: '2'
- name: MAX_DOCKER_SELENIUM_CONTAINERS
value: '16'
- name: SELENIUM_IMAGE_NAME
value: elgalu/selenium
- name: VIDEO_RECORDING_ENABLED
value: 'false'
- name: SCREEN_WIDTH
value: '1920'
- name: SCREEN_HEIGHT
value: '1080'
- name: MAX_TEST_SESSIONS
value: '1'
- name: NEW_SESSION_WAIT_TIMEOUT
value: '7200000'
- name: DEBUG_ENABLED
value: 'false'
- name: SEND_ANONYMOUS_USAGE_INFO
value: 'true'
- name: TZ
value: UTC
- name: KEEP_ONLY_FAILED_TESTS
value: 'false'
- name: RETENTION_PERIOD
value: '3'
- name: SEL_BROWSER_TIMEOUT_SECS
value: '7200'
- name: BROWSER_STACK_WAIT_TIMEOUT
value: 120m
resources:
limits:
memory: 1Gi
requests:
cpu: 500m
memory: 500Mi
volumeMounts:
- name: zalenium-videos
mountPath: /home/seluser/videos
- name: zalenium-data
mountPath: /tmp/mounted
livenessProbe:
httpGet:
path: /status
port: 4444
scheme: HTTP
initialDelaySeconds: 90
timeoutSeconds: 1
periodSeconds: 5
successThreshold: 1
failureThreshold: 3
readinessProbe:
httpGet:
path: /status
port: 4444
scheme: HTTP
timeoutSeconds: 1
periodSeconds: 10
successThreshold: 1
failureThreshold: 3
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
imagePullPolicy: IfNotPresent
restartPolicy: Always
terminationGracePeriodSeconds: 30
dnsPolicy: ClusterFirst
nodeSelector:
dedicated: omdc
serviceAccountName: zalenium
serviceAccount: zalenium
securityContext: {}
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: dedicated
operator: In
values:
- omdc
schedulerName: default-scheduler
tolerations:
- key: dedicated
operator: Equal
value: omdc
effect: NoSchedule
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 25%
maxSurge: 25%
revisionHistoryLimit: 10
progressDeadlineSeconds: 600
status:
observedGeneration: 24
replicas: 1
updatedReplicas: 1
readyReplicas: 1
availableReplicas: 1
conditions:
- type: Available
status: 'True'
lastUpdateTime: '2019-10-22T06:57:52Z'
lastTransitionTime: '2019-10-22T06:57:52Z'
reason: MinimumReplicasAvailable
message: Deployment has minimum availability.
- type: Progressing
status: 'True'
lastUpdateTime: '2019-10-31T09:14:01Z'
lastTransitionTime: '2019-09-17T13:24:52Z'
reason: NewReplicaSetAvailable
message: ReplicaSet "zalenium-6df85c7f49" has successfully progressed.