How to separate debezium metrics by source table? - debezium

I use debezium postgres connector (docker) with JMX. Everything works fine. The Dockerfile from which the image is built is this:
ARG DEBEZIUM_VERSION
FROM debezium/connect:${DEBEZIUM_VERSION}
ARG JMX_AGENT_VERSION
RUN mkdir /kafka/etc && cd /kafka/etc &&\
curl -so jmx_prometheus_javaagent.jar \
https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/$JMX_AGENT_VERSION/jmx_prometheus_javaagent-$JMX_AGENT_VERSION.jar
COPY config.yml /kafka/etc/config.yml
config.yml
startDelaySeconds: 0
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false
rules:
- pattern : "kafka.connect<type=connect-worker-metrics>([^:]+):"
name: "kafka_connect_worker_metrics_$1"
- pattern : "kafka.connect<type=connect-metrics, client-id=([^:]+)><>([^:]+)"
name: "kafka_connect_metrics_$2"
labels:
client: "$1"
- pattern: "debezium.([^:]+)<type=connector-metrics, context=([^,]+), server=([^,]+), key=([^>]+)><>RowsScanned"
name: "debezium_metrics_RowsScanned"
labels:
plugin: "$1"
name: "$3"
context: "$2"
table: "$4"
- pattern: "debezium.([^:]+)<type=connector-metrics, context=([^,]+), server=([^>]+)>([^:]+)"
name: "debezium_metrics_$4"
labels:
plugin: "$1"
name: "$3"
context: "$2"
Both of these are taken directly from an official example
The issue if that only RowsScanned metric has "table" label, all others do not. Other mbeans do not have a key set, so there is nowhere to extract the table name from.
I want metrics like TotalNumberOfCreateEventsSeen to have this label as well. Is there a way do it?

Related

Worker unable to connect to the master and invalid args in webport for Locust

I am trying to set up a load test for an endpoint. This is what I have followed so far:
Dockerfile
FROM python:3.8
# Add the external tasks directory into /tasks
WORKDIR /src
ADD requirements.txt .
RUN pip install --no-cache-dir --upgrade locust==2.10.1
ADD run.sh .
ADD load_test.py .
ADD load_test.conf .
# Expose the required Locust ports
EXPOSE 5557 5558 8089
# Set script to be executable
RUN chmod 755 run.sh
# Start Locust using LOCUS_OPTS environment variable
CMD ["bash", "run.sh"]
# Modified from:
# https://github.com/scrollocks/locust-loadtesting/blob/master/locust/docker/Dockerfile
run.sh
#!/bin/bash
LOCUST="locust"
LOCUS_OPTS="--config=load_test.conf"
LOCUST_MODE=${LOCUST_MODE:-standalone}
if [[ "$LOCUST_MODE" = "master" ]]; then
LOCUS_OPTS="$LOCUS_OPTS --master"
elif [[ "$LOCUST_MODE" = "worker" ]]; then
LOCUS_OPTS="$LOCUS_OPTS --worker --master-host=$LOCUST_MASTER_HOST"
fi
echo "${LOCUST} ${LOCUS_OPTS}"
$LOCUST $LOCUS_OPTS
# Copied from
# https://github.com/scrollocks/locust-loadtesting/blob/master/locust/docker/locust/run.sh
This is how I have written the load test locust script:
import json
from locust import HttpUser, constant, task
class CategorizationUser(HttpUser):
wait_time = constant(1)
#task
def predict(self):
payload = json.dumps(
{
"text": "Face and Body Paint washable Rubies Halloween item 91#385"
}
)
_ = self.client.post("/predict", data=payload)
I am invoking that with a configuration:
locustfile = load_test.py
headless = false
users = 1000
spawn-rate = 1
run-time = 5m
host = IP
html = locust_report.html
So, after building and pushing the Docker image and creating a k8s cluster on GKE, I am deploying it. This is how the deployment.yaml looks like:
# Copied from
# https://github.com/scrollocks/locust-loadtesting/blob/master/locust/kubernetes/templates/deployment.yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
namespace: default
name: locust-master-deployment
labels:
name: locust
role: master
spec:
replicas: 1
selector:
matchLabels:
name: locust
role: master
template:
metadata:
labels:
name: locust
role: master
spec:
containers:
- name: locust
image: gcr.io/PROJECT_ID/IMAGE_URI
imagePullPolicy: Always
env:
- name: LOCUST_MODE
value: master
- name: LOCUST_LOG_LEVEL
value: DEBUG
ports:
- name: loc-master-web
containerPort: 8089
protocol: TCP
- name: loc-master-p1
containerPort: 5557
protocol: TCP
- name: loc-master-p2
containerPort: 5558
protocol: TCP
---
kind: Deployment
apiVersion: apps/v1
metadata:
namespace: default
name: locust-worker-deployment
labels:
name: locust
role: worker
spec:
replicas: 2
selector:
matchLabels:
name: locust
role: worker
template:
metadata:
labels:
name: locust
role: worker
spec:
containers:
- name: locust
image: gcr.io/PROJECT_ID/IMAGE_URI
imagePullPolicy: Always
env:
- name: LOCUST_MODE
value: worker
- name: LOCUST_MASTER
value: locust-master-service
- name: LOCUST_LOG_LEVEL
value: DEBUG
After deployment, I am exposing the required ports like so:
kubectl expose pod locust-master-deployment-f9d4c4f59-8q6wk \
--type NodePort \
--port 5558 \
--target-port 5558 \
--name locust-5558
kubectl expose pod locust-master-deployment-f9d4c4f59-8q6wk \
--type NodePort \
--port 5557 \
--target-port 5557 \
--name locust-5557
kubectl expose pod locust-master-deployment-f9d4c4f59-8q6wk \
--type LoadBalancer \
--port 80 \
--target-port 8089 \
--name locust-web
The cluster and the nodes provision successfully. But the moment I hit the IP of locust-web, I am getting:
Any suggestions on how to resolve the bug?
Since you are exposing your pods and you are trying to access to them outside the cluster (your web application), you have to port-forward them or add an Ingress in order to access to your locust pods.
My first approach will be trying to ping or send requests to your locust pods with a simple port-forward.
More infos about the port forwarding here.
Probably the environment variables set by k8s is colliding with locust’s (LOCUST_WEB_PORT specifically). Change your setup so that no containers are named ”locust”.
See https://github.com/locustio/locust/issues/1226 for a similar issue.

Error parsing yaml to json: did not find expected key - kubernetes

I am trying to create a k8s job with the below yaml,
apiVersion: batch/v1
kind: Job
metadata:
name: mysql-with-ttl
spec:
ttlSecondsAfterFinished: 100
template:
spec:
containers:
- name: mysql
image: mysql
env:
- name: MYSQL_ROOT_PASSWORD
value: "test1234"
command: ["/bin/sh","-c"]
args: ["mysql --defaults-extra-file=/home/mysql-config/mysql-config --host yyy.mysql.database.azure.com -e"create database test; show databases" && mysql --defaults-extra-file=/home/mysql-config/mysql-config --host yyy.mysql.database.azure.com test < /home/schema/test-schema.sql"]
volumeMounts:
- name: mysql-config-vol
mountPath: /home/mysql-config
- name: schema-config-vol
mountPath: /home/schema
volumes:
- name: mysql-config-vol
configMap:
name: mysql-config
- name: schema-config-vol
configMap:
name: test-schema
restartPolicy: Never
Some issue with the args given above, so I am getting the below error:
error: error parsing k8s-job.yaml: error converting YAML to JSON: yaml: line 15: did not find expected ',' or ']'
I have to pass the commands in args to 1) login to mysql server 2) create database called "test" 3) import the sql schema to the created database in mysql. But, there's an error with the syntax and I am unable to figure out where exactly the issue is.
Can anyone please help me to fix this? Thanks in Advance!
Figured out the way, the following args is working. Please refer if needed,
args: ["mysql --defaults-extra-file=/home/mysql-config/mysql-config --host yyy.mysql.database.azure.com -e 'create database obortech_qa; ' && mysql --defaults-extra-file=/home/mysql-config/mysql-config --host yyy.mysql.database.azure.com obortech_qa < /home/schema/test-schema.sql"]

Tekton: yq Task gives safelyRenameFile [ERRO] Failed copying from /tmp/temp & [ERRO] open /workspace/source permission denied error

We have a Tekton pipeline and want to replace the image tags contents of our deployment.yml:
apiVersion: apps/v1
kind: Deployment
metadata:
name: microservice-api-spring-boot
spec:
replicas: 3
revisionHistoryLimit: 3
selector:
matchLabels:
app: microservice-api-spring-boot
template:
metadata:
labels:
app: microservice-api-spring-boot
spec:
containers:
- image: registry.gitlab.com/jonashackt/microservice-api-spring-boot#sha256:5d8a03755d3c45a3d79d32ab22987ef571a65517d0edbcb8e828a4e6952f9bcd
name: microservice-api-spring-boot
ports:
- containerPort: 8098
imagePullSecrets:
- name: gitlab-container-registry
Our Tekton pipeline uses the yq Task from Tekton Hub to replace the .spec.template.spec.containers[0].image with the "$(params.IMAGE):$(params.SOURCE_REVISION)" name like this:
- name: substitute-config-image-name
taskRef:
name: yq
runAfter:
- fetch-config-repository
workspaces:
- name: source
workspace: config-workspace
params:
- name: files
value:
- "./deployment/deployment.yml"
- name: expression
value: .spec.template.spec.containers[0].image = \"$(params.IMAGE)\":\"$(params.SOURCE_REVISION)\"
Sadly the yq Task doesn't seem to work, it produces a green
Step completed successfully, but shows the following errors:
16:50:43 safelyRenameFile [ERRO] Failed copying from /tmp/temp3555913516 to /workspace/source/deployment/deployment.yml
16:50:43 safelyRenameFile [ERRO] open /workspace/source/deployment/deployment.yml: permission denied
Here's also a screenshot from our Tekton Dashboard:
Any idea on how to solve the error?
The problem seems to be related to the way how the Dockerfile of https://github.com/mikefarah/yq now handles file permissions (for example this fix among others). The 0.3 version of the Tekton yq Task uses the image https://hub.docker.com/layers/mikefarah/yq/4.16.2/images/sha256-c6ef1bc27dd9cee57fa635d9306ce43ca6805edcdab41b047905f7835c174005 which produces the error.
One work-around to the problem could be the usage of the yq Task version 0.2 which you can apply via:
kubectl apply -f https://raw.githubusercontent.com/tektoncd/catalog/main/task/yq/0.2/yq.yaml
This one uses the older docker.io/mikefarah/yq:4#sha256:34f1d11ad51dc4639fc6d8dd5ade019fe57cf6084bb6a99a2f11ea522906033b and works without the error.
Alternatively you can simply create your own yq based Task that won't have the problem like this:
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
name: replace-image-name-with-yq
spec:
workspaces:
- name: source
description: A workspace that contains the file which need to be dumped.
params:
- name: IMAGE_NAME
description: The image name to substitute
- name: FILE_PATH
description: The file path relative to the workspace dir.
- name: YQ_VERSION
description: Version of https://github.com/mikefarah/yq
default: v4.2.0
steps:
- name: substitute-with-yq
image: alpine
workingDir: $(workspaces.source.path)
command:
- /bin/sh
args:
- '-c'
- |
set -ex
echo "--- Download yq & add to path"
wget https://github.com/mikefarah/yq/releases/download/$(params.YQ_VERSION)/yq_linux_amd64 -O /usr/bin/yq &&\
chmod +x /usr/bin/yq
echo "--- Run yq expression"
yq e ".spec.template.spec.containers[0].image = \"$(params.IMAGE_NAME)\"" -i $(params.FILE_PATH)
echo "--- Show file with replacement"
cat $(params.FILE_PATH)
resources: {}
This custom Task simple uses the alpine image as base and installs yq using the Plain binary wget download. Also it uses yq exactly as you would do on the command line locally, which makes development of your expression so much easier!
As a bonus it outputs the file contents so you can check the replacement results right in the Tekton pipeline!
You need to apply it with
kubectl apply -f tekton-ci-config/replace-image-name-with-yq.yml
And should now be able to use it like this:
- name: replace-config-image-name
taskRef:
name: replace-image-name-with-yq
runAfter:
- dump-contents
workspaces:
- name: source
workspace: config-workspace
params:
- name: IMAGE_NAME
value: "$(params.IMAGE):$(params.SOURCE_REVISION)"
- name: FILE_PATH
value: "./deployment/deployment.yml"
Inside the Tekton dashboard it will look somehow like this and output the processed file:

Is it possible to add dependent environment variables in Google Cloud Run?

I would like to specify dependent environment variables on a Cloud Run service.
If the environment variables have been defined in a .env file it would look like this
DATABASE_NAME=my-database
DATABASE_USER=root
DATABASE_PASSWORD=P4SSw0rd!
DATABASE_PORT=5432
DATABASE_HOST="/socket/my-database-socket"
DATABASE_URL="user=${DATABASE_USER} password=${DATABASE_PASSWORD} dbname=${DATABASE_NAME} host=${DATABASE_HOST}"
In this example, DATABASE_URL depends on every other environment variables.
To deploy the service I run the following command:
gcloud run deploy my-service \
--image gcr.io/my-project/my-image:latest \
--region europe-west1 \
--port 80 \
--platform managed \
--allow-unauthenticated \
--set-env-vars 'DATABASE_NAME=my-database' \
--set-env-vars 'DATABASE_USER=root' \
--set-env-vars 'DATABASE_PASSWORD=P4SSw0rd!' \
--set-env-vars 'DATABASE_PORT=5432' \
--set-env-vars 'DATABASE_HOST="/socket/my-database-socket"' \
--set-env-vars 'DATABASE_URL="user=$(DATABASE_USER) password=$(DATABASE_PASSWORD) dbname=$(DATABASE_NAME) host=$(DATABASE_HOST)"'
Here is the created YAML definition of the service (some values are omitted)
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
name: my-service
spec:
template:
metadata:
name: ...
spec:
containerConcurrency: 80
timeoutSeconds: 300
containers:
- image: ...
ports:
- name: http1
containerPort: 80
env:
- name: DATABASE_NAME
value: my-database
- name: DATABASE_USER
value: root
- name: DATABASE_PASSWORD
value: P4SSw0rd!
- name: DATABASE_HOST
value: /socket/my-database-socket
- name: DATABASE_URL
value: user=$(DATABASE_USER) password=$(DATABASE_PASSWORD) dbname=$(DATABASE_NAME) host=$(DATABASE_HOST)
The problem is that when the service is running, the env vars in DATABASE_URL seem not interpolated.
I read that Kubernetes supports dependent env vars but I can't figure out how to make this run in Cloud Run.
I am wondering if it is supported in Cloud Run in the end.
It's likely this may work in Knative open source (which uses Kubernetes to execute pods) but not on Google Cloud Run (fully hosted), which runs on a proprietary execution engine.

Combining multiple Local-SSD on a node in Kubernetes (GKE)

The data required by my container is too large to fit on one local SSD. I also need to access the SSD's as one filesystem from my container. So I would need to attach multiple ones. How do I combine them (single partition, RAID0, etc) and make them accessible as one volume mount in my container?
This link shares how to mount an SSD https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/local-ssd to a mount path. I am not sure how you would merge multiple.
edit
The question asks how one would "combine" multiple SSD devices, individually mounted, on a single node in GKE.
WARNING
this is experimental and not intended for production use without
knowing what you are doing and only tested on gke version 1.16.x.
The approach includes a daemonset using a configmap to use nsenter (with wait tricks) for host namespace and privileged access so you can manage the devices. Specifically for GKE Local SSDs, we can unmount those devices and then raid0 them. InitContainer for the dirty work as this type of task seems most apparent for something you'd need to mark complete, and to then kill privileged container access (or even the Pod). Here is how it is done.
The example assumes 16 SSDs, however, you'll want to adjust the hardcoded values as necessary. Also, ensure your OS image reqs, I use Ubuntu. Also make sure the version of GKE you use starts local-ssd's at sd[b]
ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
name: local-ssds-setup
namespace: search
data:
setup.sh: |
#!/bin/bash
# returns exit codes: 0 = found, 1 = not found
isMounted() { findmnt -rno SOURCE,TARGET "$1" >/dev/null;} #path or device
# existing disks & mounts
SSDS=(/dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq)
# install mdadm utility
apt-get -y update && apt-get -y install mdadm --no-install-recommends
apt-get autoremove
# OPTIONAL: determine what to do with existing, I wipe it here
if [ -b "/dev/md0" ]
then
echo "raid array already created"
if isMounted "/dev/md0"; then
echo "already mounted - unmounting"
umount /dev/md0 &> /dev/null || echo "soft error - assumed device was mounted"
fi
mdadm --stop /dev/md0
mdadm --zero-superblock "${SSDS[#]}"
fi
# unmount disks from host filesystem
for i in {0..15}
do
umount "${SSDS[i]}" &> /dev/null || echo "${SSDS[i]} already unmounted"
done
if isMounted "/dev/sdb";
then
echo ""
echo "unmount failure - prevent raid0" 1>&2
exit 1
fi
# raid0 array
yes | mdadm --create /dev/md0 --force --level=0 --raid-devices=16 "${SSDS[#]}"
echo "raid array created"
# format
mkfs.ext4 -F /dev/md0
# mount, change /mnt/ssd-array to whatever
mkdir -p /mnt/ssd-array
mount /dev/md0 /mnt/ssd-array
chmod a+w /mnt/ssd-array
wait.sh: |
#!/bin/bash
while sudo fuser /var/{lib/{dpkg,apt/lists},cache/apt/archives}/lock >/dev/null 2>&1; do sleep 1; done
DeamonSet pod spec
spec:
hostPID: true
nodeSelector:
cloud.google.com/gke-local-ssd: "true"
volumes:
- name: setup-script
configMap:
name: local-ssds-setup
- name: host-mount
hostPath:
path: /tmp/setup
initContainers:
- name: local-ssds-init
image: marketplace.gcr.io/google/ubuntu1804
securityContext:
privileged: true
volumeMounts:
- name: setup-script
mountPath: /tmp
- name: host-mount
mountPath: /host
command:
- /bin/bash
- -c
- |
set -e
set -x
# Copy setup script to the host
cp /tmp/setup.sh /host
# Copy wait script to the host
cp /tmp/wait.sh /host
# Wait for updates to complete
/usr/bin/nsenter -m/proc/1/ns/mnt -- chmod u+x /tmp/setup/wait.sh
# Give execute priv to script
/usr/bin/nsenter -m/proc/1/ns/mnt -- chmod u+x /tmp/setup/setup.sh
# Wait for Node updates to complete
/usr/bin/nsenter -m/proc/1/ns/mnt /tmp/setup/wait.sh
# If the /tmp folder is mounted on the host then it can run the script
/usr/bin/nsenter -m/proc/1/ns/mnt /tmp/setup/setup.sh
containers:
- image: "gcr.io/google-containers/pause:2.0"
name: pause
For high performance use cases, use the Ephemeral storage on local SSDs GKE feature. All local SSDs will be configures as a (striped) raid0 array and mounted into the pod.
Quick summary:
Create the node pool or cluster with the option: --ephemeral-storage local-ssd-count=X
Schedule to nodes with cloud.google.com/gke-ephemeral-storage-local-ssd.
Add an emptyDir volume.
Mount it with volumeMounts.
Here's how I used it with a DaemonSet:
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: myapp
labels:
app: myapp
spec:
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
spec:
nodeSelector:
cloud.google.com/gke-ephemeral-storage-local-ssd: "true"
volumes:
- name: localssd
emptyDir: {}
containers:
- name: myapp
image: <IMAGE>
volumeMounts:
- mountPath: /scratch
name: localssd
You can use DaemonSet yaml file to deploy the pod will run on startup, assuming already created a cluster with 2 local-SSD (this pod will be in charge of creating the Raid0 disk)
kind: DaemonSet
apiVersion: extensions/v1beta1
metadata:
name: ssd-startup-script
labels:
app: ssd-startup-script
spec:
template:
metadata:
labels:
app: ssd-startup-script
spec:
hostPID: true
containers:
- name: ssd-startup-script
image: gcr.io/google-containers/startup-script:v1
imagePullPolicy: Always
securityContext:
privileged: true
env:
- name: STARTUP_SCRIPT
value: |
#!/bin/bash
sudo curl -s https://get.docker.com/ | sh
echo Done
The pod that will have access to the disk array in the above example is “/mnt/disks/ssd-array”
apiVersion: v1
kind: Pod
metadata:
name: test-pod
spec:
containers:
- name: test-container
image: ubuntu
volumeMounts:
- mountPath: /mnt/disks/ssd-array
name: ssd-array
args:
- sleep
- "1000"
nodeSelector:
cloud.google.com/gke-local-ssd: "true"
tolerations:
- key: "local-ssd"
operator: "Exists"
effect: "NoSchedule"
volumes:
- name: ssd-array
hostPath:
path: /mnt/disks/ssd-array
After deploying the test-pod, SSH to the pod from your cloud-shell or any instance.
Then run :
kubectl exec -it test-pod -- /bin/bash
After that you should be able to see the created file in the ssd-array disk.
cat test-file.txt