Custom metric for Vert.x

I am using Vert.x Micrometer metrics (with the Prometheus option) and I would like to ask whether it's possible to add a custom metric.
I would like to keep track of the number of users with a specific property, so I need to call a service (via the event bus) to get this number. Finally, I want to include this number in the metrics collected by Prometheus.
Thank you
Michael
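
For reference, a minimal sketch of one way to do this, assuming Vert.x 4 with the Micrometer metrics module enabled and the Prometheus backend; the event-bus address "user.stats", the metric name and the 30-second refresh period are made-up examples, not anything prescribed by Vert.x:

import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import io.vertx.core.AbstractVerticle;
import io.vertx.micrometer.backends.BackendRegistries;
import java.util.concurrent.atomic.AtomicLong;

public class UserCountMetricVerticle extends AbstractVerticle {

    private final AtomicLong userCount = new AtomicLong();

    @Override
    public void start() {
        // The registry Vert.x uses when Micrometer metrics are enabled.
        MeterRegistry registry = BackendRegistries.getDefaultNow();

        // The gauge reads the AtomicLong every time Prometheus scrapes /metrics.
        Gauge.builder("users.with.property", userCount, AtomicLong::doubleValue)
             .description("Number of users with the specific property")
             .register(registry);

        // "user.stats" is a hypothetical event-bus address of the service that knows the count;
        // the cached value is refreshed periodically instead of on every scrape.
        vertx.setPeriodic(30_000, id ->
            vertx.eventBus().<Long>request("user.stats", null)
                 .onSuccess(msg -> userCount.set(msg.body())));
    }
}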

Related

Limiting the number of times an endpoint of a Kubernetes pod can be accessed?

I have a machine learning model inside a Docker image. I pushed the image to Google Container Registry and then deployed it inside a Kubernetes pod. There is a FastAPI application that runs on port 8000, and this FastAPI endpoint is public (call it mymodel:8000).
The structure of the FastAPI app is:

@app.get("/homepage")
async def get_homepage():
    ...

@app.get("/model")
async def get_modelpage():
    ...

@app.post("/model")
async def get_results(query: str = Form(...)):
    ...
Users can submit queries and get results from the machine learning model running inside the container. I want to limit the number of queries that can be made by all the users combined. So if the query limit is 100, all the users combined can make only 100 queries in total.
I thought of a way to do this:
Keep a database that stores the number of times the GET and POST methods have been called. As soon as the total number of POST calls crosses the limit, stop accepting any more queries.
Is there an alternative way of doing this using Kubernetes limits? For example, could I define a limit_api_calls such that the total number of times mymodel:8000 is accessed is at most equal to limit_api_calls?
I looked at the documentation and could only find settings for CPU, memory and rate limits.
There are several approaches that could satisfy your needs.
Custom implementation: As you mentioned, keep the number of API calls received in a persistence layer and deny requests once the limit has been reached (a minimal sketch of this counting pattern follows the list below).
Use a service mesh: Istio (for instance) will let you limit the number of requests received and act as a circuit breaker.
Use an external API manager: Apigee will also let you limit and even charge your users; however, if it is only for internal use (not pay-per-use), I definitely wouldn't recommend it.
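
To illustrate the first option only, here is a minimal sketch of the counting pattern (shown in Java purely to illustrate the idea; the app in the question is FastAPI, and with more than one replica the counter would have to live in a shared persistence layer such as a database or Redis rather than in process memory):

import java.util.concurrent.atomic.AtomicLong;

// Global quota shared by all users: allow at most `limit` queries in total.
public class QueryQuota {

    private final long limit;
    private final AtomicLong used = new AtomicLong();

    public QueryQuota(long limit) {
        this.limit = limit;
    }

    // Call this at the start of every POST /model handler; when it returns
    // false, reject the request (e.g. with HTTP 429) instead of running the model.
    public boolean tryAcquire() {
        return used.incrementAndGet() <= limit;
    }
}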
The tricky part is what you want to happen after the limit has been reached; if it is just a pod, you may exit the application to finish and clear it.
Otherwise, if you have a deployment with its ReplicaSet and several resources associated with it (like ConfigMaps), you probably want to use some kind of asynchronous alert or polling check to clean up everything related to your deployment. You may want to have a deeper look at orchestrators like Airflow (Cloud Composer) and use tools such as Helm to keep deployments manageable.

Kogito - wait until data from multiple endpoints is received

I am using Kogito with Quarkus. I have set up a DRL rule and am using a BPMN configuration. As can be seen below, currently one endpoint is exposed that starts the process. All the needed data is received from the initial request; it is then evaluated and the process goes on.
I would like to extend the workflow to have two separate endpoints. One to provide the age of the person and another to provide the name. The process must wait until all needed data is gathered before it proceeds with evaluation.
Has anybody come across a similar solution?
Technically you could use a signal or message to add more data into a process instance before you execute the rules over the entire data; see https://docs.kogito.kie.org/latest/html_single/#ref-bpmn-intermediate-events_kogito-developing-process-services.
In order to do that you need some sort of correlation between these events; otherwise, how do you know that the event for name 1 should be matched to the event for age 1? If you can keep the process instance id, then the second event can either hit a REST endpoint for that specific process instance or send it a message via a message broker.
You can also have your own custom logic to aggregate the events and only fire a new process instance once your criteria for complete data are met. There are also plans in Kogito to extend how correlation is done, allowing, for instance, variables of the process to be used as the identifier. For example, if you have person.id as the correlation, events for name and age with the same id would signal the same process instance. Hope this info helps.
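
As a rough sketch of the signal approach (Java/Quarkus with the Kogito process API; the process id "persons", the generated model class PersonsModel, the signal names and the REST paths are all placeholders, and the javax imports would be jakarta on newer Quarkus versions):

import javax.inject.Inject;
import javax.inject.Named;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import org.kie.kogito.process.Process;
import org.kie.kogito.process.impl.Sig;

@Path("/persons-data")
public class PersonDataResource {

    @Inject
    @Named("persons")               // the id of the BPMN process
    Process<PersonsModel> process;  // PersonsModel is the model class Kogito generates for that process

    @POST
    @Path("/{pid}/name")
    public void provideName(@PathParam("pid") String pid, String name) {
        signal(pid, "nameReceived", name);  // must match an intermediate signal event in the BPMN
    }

    @POST
    @Path("/{pid}/age")
    public void provideAge(@PathParam("pid") String pid, Integer age) {
        signal(pid, "ageReceived", age);
    }

    // Looks up the running instance by id and delivers the data as a signal payload,
    // so the process only moves past its catching events once both pieces have arrived.
    private void signal(String pid, String signalName, Object payload) {
        process.instances().findById(pid)
               .ifPresent(pi -> pi.send(Sig.of(signalName, payload)));
    }
}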

track a lot of batch jobs (start and end time)

We have a lot of jobs (for example batch jobs) that are executed each day, so we'd like to have an overview of all jobs:
track the start time and end time of each job (and thus the complete runtime).
All of this information should be available in a visualisation.
Is InfluxDB with Grafana a good solution for this, or do you recommend another app?
I think InfluxDB and Grafana are really a good starting point to collect data from your services.
You'll also need to integrate some type of metrics library and an exporter in your code.
In Java you could use Micrometer (https://micrometer.io/) and Prometheus.
Here you can find more information about them: https://micrometer.io/docs/registry/prometheus
After integrating metrics in your code, you simply need to configure Grafana to use data from InfluxDB and configure InfluxDB to scrape your metrics endpoint.
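
For example, a minimal Micrometer sketch in Java (the registry type, the metric name "batch.job.duration" and the job tag are made-up choices; depending on your setup you might instead use an InfluxMeterRegistry or let InfluxDB/Telegraf collect the Prometheus endpoint):

import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import io.micrometer.prometheus.PrometheusConfig;
import io.micrometer.prometheus.PrometheusMeterRegistry;

public class BatchJobMetrics {

    public static void main(String[] args) {
        // Prometheus-format registry; swap in another Micrometer registry if preferred.
        MeterRegistry registry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);

        // One timer per job name; it records how long each run of the job took.
        Timer jobTimer = Timer.builder("batch.job.duration")
                .tag("job", "nightly-export")   // "nightly-export" is just an example job name
                .description("Runtime of the batch job")
                .register(registry);

        Timer.Sample sample = Timer.start(registry); // job starts
        runJob();                                    // the actual batch work
        sample.stop(jobTimer);                       // job ends; the duration is recorded
    }

    private static void runJob() {
        // placeholder for the real job
    }
}

Grafana can then graph the timer's count, max and total time per job from whichever backend stores the data.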

add record level custom latency metric in kafka streams

I am trying to add a specific metric to my Kafka Streams application that will measure latency and report it to JMX.
I'm using the Streams DSL in Scala, so using the Processor API for metrics (which I know is possible) will not work for me.
The basic things I would like to understand are:
how to extract specific record properties (i.e. headers) to use as part of the metric calculation
how to add the new metric to the metrics reported to JMX
Thanks!
You will need to fall back to the Processor API to access record metadata like headers and to register custom metrics.
Note though that you can mix and match the DSL and the Processor API, so it's not necessary to move off the DSL. Instead, you can plug in custom Processors or Transformers via KStream.process() or KStream.transform() (note that there are multiple "siblings" of transform() that you might want to use instead).
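
As a rough illustration (written in Java, but the same calls can be made from Scala), a Transformer plugged in via KStream.transform() can read the record headers from the ProcessorContext and record a custom latency sensor that then shows up over JMX; the header name "send-time" and the metric/group names are made up for the example:

import java.nio.charset.StandardCharsets;
import java.util.Collections;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.common.header.Header;
import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.common.metrics.stats.Avg;
import org.apache.kafka.common.metrics.stats.Max;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;

public class LatencyTransformer implements Transformer<String, String, KeyValue<String, String>> {

    private ProcessorContext context;
    private Sensor latencySensor;

    @Override
    public void init(final ProcessorContext context) {
        this.context = context;
        // Register a custom sensor; its metrics appear in JMX alongside the other Streams metrics.
        latencySensor = context.metrics().addSensor("record-e2e-latency", Sensor.RecordingLevel.INFO);
        latencySensor.add(new MetricName("record-e2e-latency-avg", "custom-metrics",
                "average end-to-end latency in ms", Collections.emptyMap()), new Avg());
        latencySensor.add(new MetricName("record-e2e-latency-max", "custom-metrics",
                "maximum end-to-end latency in ms", Collections.emptyMap()), new Max());
    }

    @Override
    public KeyValue<String, String> transform(final String key, final String value) {
        // "send-time" is a hypothetical header holding the epoch millis at which the producer sent the record.
        final Header sendTime = context.headers().lastHeader("send-time");
        if (sendTime != null) {
            final long sentAt = Long.parseLong(new String(sendTime.value(), StandardCharsets.UTF_8));
            latencySensor.record(System.currentTimeMillis() - sentAt);
        }
        return KeyValue.pair(key, value); // pass the record through unchanged
    }

    @Override
    public void close() { }
}

It would be wired into the topology with something like stream.transform(LatencyTransformer::new).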

Kafka Streams - accessing data from the metrics registry

I'm having a difficult time finding documentation on how to access the data within the Kafka Streams metric registry, and I think I may be trying to fit a square peg in a round hole. I was hoping to get some advice on the following:
Goal
Collect metrics being recorded in the Kafka Streams metrics registry and send these values to an arbitrary end point
Workflow
This is what I think needs to be done, and I've completed all of the steps except the last (I'm having trouble with that one because the metrics registry is private). But I may be going about this the wrong way:
Define a class that implements the MetricsReporter interface. Build a list of the metrics that Kafka creates in the metricChange method (e.g. whenever this method is called, update a hashmap with the currently registered metrics).
Specify this class in the metric.reporters configuration property
Set up a process that polls the Kafka Streams metric registry for the current data, and ship the values to an arbitrary end point
Anyways, the last step doesn't appear to be possible in Kafka 0.10.0.1 since the metrics registry isn't exposed. Could someone please let me know if this is the correct workflow (it sounds like it's not...), or whether I am misunderstanding the process for extracting the Kafka Streams metrics?
Although the metrics registry is not exposed, you can still get the value of a given KafkaMetric via its KafkaMetric.value() / KafkaMetric.value(timestamp) methods. For example, as you observed in the JmxReporter, it keeps the list of KafkaMetrics from init() and the metricChange/metricRemoval methods, and then in its MBean implementation, when getAttribute is called, it calls the corresponding KafkaMetric.value() function. For your customized reporter you can apply a similar pattern: for example, periodically poll all of the kept KafkaMetrics' values and then pipe the results to your end point.
The MetricsReporter interface in org.apache.kafka.common.metrics already enables you to manage all Kafka Streams metrics in the reporter, so the Kafka internal registry is not needed.
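
A rough sketch of that pattern (the class name, the 30-second polling period and the System.out "shipping" are all placeholders; metricValue() is the non-deprecated replacement for value() in newer clients):

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.common.metrics.KafkaMetric;
import org.apache.kafka.common.metrics.MetricsReporter;

public class ForwardingMetricsReporter implements MetricsReporter {

    private final Map<MetricName, KafkaMetric> metrics = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    @Override
    public void init(final List<KafkaMetric> initial) {
        initial.forEach(m -> metrics.put(m.metricName(), m));
        // Periodically read the current values and push them to the end point of your choice.
        scheduler.scheduleAtFixedRate(this::ship, 30, 30, TimeUnit.SECONDS);
    }

    @Override
    public void metricChange(final KafkaMetric metric) {
        metrics.put(metric.metricName(), metric);
    }

    @Override
    public void metricRemoval(final KafkaMetric metric) {
        metrics.remove(metric.metricName());
    }

    private void ship() {
        metrics.forEach((name, metric) ->
            System.out.println(name.group() + "/" + name.name() + " = " + metric.metricValue()));
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    @Override
    public void configure(final Map<String, ?> configs) { }
}

The reporter is activated by adding its fully qualified class name to the metric.reporters property of the Streams configuration.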