Adding a custom tag based on topicName (wildcard) using JmxTrans to send Kafka JMX metrics to InfluxDB - apache-kafka

Basically, what I wanted to achieve was to get the MessagesInPerSec metric for all topics in Kafka and to add a custom topicName tag in InfluxDB, so I can query by topic rather than by the whole 'objDomain' definition. Below is my JmxTrans configuration. (Note: I'm using a wildcard for the topic so that the MessagesInPerSec JMX attribute is fetched for all topics.)
{
  "servers": [
    {
      "port": "9581",
      "host": "192.168.43.78",
      "alias": "kafka-metric",
      "queries": [
        {
          "outputWriters": [
            {
              "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
              "url": "http://192.168.43.78:8086/",
              "database": "kafka",
              "username": "admin",
              "password": "root"
            }
          ],
          "obj": "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic=*",
          "attr": [
            "Count",
            "MeanRate",
            "OneMinuteRate",
            "FiveMinuteRate",
            "FifteenMinuteRate"
          ],
          "resultAlias": "newTopic"
        }
      ],
      "numQueryThreads": 2
    }
  ]
}
which yields a result in InfluxDB as follows:
[name=newTopic, time=1589425526087, tags={attributeName=FifteenMinuteRate,
className=com.yammer.metrics.reporting.JmxReporter$Meter, objDomain=kafka.server,
typeName=type=BrokerTopicMetrics,name=MessagesInPerSec,topic=backblaze_smart},
precision=MILLISECONDS, fields={FifteenMinuteRate=1362.9446063537794, _jmx_port=9581
}]
This creates tags from the whole objDomain/typeName specified in the config, but I wanted topic as a separate tag, something like the following:
[name=newTopic, time=1589425526087, tags={attributeName=FifteenMinuteRate,
className=com.yammer.metrics.reporting.JmxReporter$Meter, objDomain=kafka.server,
topic=backblaze_smart,
typeName=type=BrokerTopicMetrics,name=MessagesInPerSec,topic=backblaze_smart},
precision=MILLISECONDS, fields={FifteenMinuteRate=1362.9446063537794, _jmx_port=9581
}]
I was not able to find any adequate documentation on how to use the wildcard topic value as a separate tag with jmxtrans when writing to InfluxDB.

You just need to add the following additional properties to the InfluxDB output writer. Just make sure you are using the latest jmxtrans release. The docs are here: https://github.com/jmxtrans/jmxtrans/wiki/InfluxDBWriter
"typeNames": ["topic"],
"typeNamesAsTags": "true"
Here is your config with the above modifications:
{
  "servers": [
    {
      "port": "9581",
      "host": "192.168.43.78",
      "alias": "kafka-metric",
      "queries": [
        {
          "outputWriters": [
            {
              "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
              "url": "http://192.168.43.78:8086/",
              "database": "kafka",
              "username": "admin",
              "password": "root",
              "typeNames": ["topic"],
              "typeNamesAsTags": "true"
            }
          ],
          "obj": "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic=*",
          "attr": [
            "Count",
            "MeanRate",
            "OneMinuteRate",
            "FiveMinuteRate",
            "FifteenMinuteRate"
          ],
          "resultAlias": "newTopic"
        }
      ],
      "numQueryThreads": 2
    }
  ]
}
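With topic emitted as its own tag, you can then filter per topic directly in InfluxDB. As a quick sketch (assuming the measurement name newTopic from resultAlias, and a topic named backblaze_smart as in the output above):
SELECT "OneMinuteRate" FROM "newTopic" WHERE "topic" = 'backblaze_smart'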

Related

Getting started with KrakenD

I need some beginner help with KrakenD. I am running it on Ubuntu. The config is provided below.
I am able to reach the /healthz API without any problem.
My challenge is that the /hello path returns error 500. I want this path to redirect to a Quarkus app that runs at http://getting-started36-getting-going.apps.bamboutos.hostname.us/.
Why is this not working? If I modify the /hello backend and use a fake host, I get the exact same result. This suggests that KrakenD is not even trying to connect to the backend.
In the logs, KrakenD says:
Error #01: invalid character 'H' looking for beginning of value
kraken.json:
{
  "version": 2,
  "port": 9080,
  "extra_config": {
    "github_com/devopsfaith/krakend-gologging": {
      "level": "DEBUG",
      "prefix": "[KRAKEND]",
      "syslog": false,
      "stdout": true,
      "format": "default"
    }
  },
  "timeout": "3000ms",
  "cache_ttl": "300s",
  "output_encoding": "json",
  "name": "KrakenD API Gateway Service",
  "endpoints": [
    {
      "endpoint": "/healthz",
      "extra_config": {
        "github.com/devopsfaith/krakend/proxy": {
          "static": {
            "data": { "status": "OK" },
            "strategy": "always"
          }
        }
      },
      "backend": [
        {
          "url_pattern": "/",
          "host": ["http://fake-backend"]
        }
      ]
    },
    {
      "endpoint": "/hello",
      "extra_config": {},
      "backend": [
        {
          "url_pattern": "/hello",
          "method": "GET",
          "host": [
            "http://getting-started36-getting-going.apps.bamboutos.hostname.us/"
          ]
        }
      ]
    }
  ]
}
What am I missing?
add "encoding": "string" to the backend section.
"backend": [
{
"url_pattern": "/hello",
"method": "GET",
"encoding": "string" ,
"host": [
"http://getting-started36-getting-going.apps.bamboutos.hostname.us/"
]
}
]
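After restarting KrakenD, you can verify the endpoint with a plain curl call (assuming the gateway is reachable locally on port 9080, as configured above):
curl -i http://localhost:9080/hello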

Kafka JDBC Source connector Insert or Update

I configured a Kafka JDBC Source connector in order to push to a Kafka topic the records changed (inserted or updated) in a PostgreSQL database.
I use "timestamp+incrementing" mode. It seems to work fine.
I didn't configure the JDBC Sink connector because I'm using a Kafka Consumer that listens on the topic.
The message on the topic is JSON. This is an example:
{
  "schema": {
    "type": "struct",
    "fields": [
      {
        "type": "int64",
        "optional": false,
        "field": "id"
      },
      {
        "type": "int64",
        "optional": true,
        "name": "org.apache.kafka.connect.data.Timestamp",
        "version": 1,
        "field": "entity_create_date"
      },
      {
        "type": "int64",
        "optional": true,
        "name": "org.apache.kafka.connect.data.Timestamp",
        "version": 1,
        "field": "entity_modify_date"
      },
      {
        "type": "int32",
        "optional": true,
        "field": "entity_version"
      },
      {
        "type": "string",
        "optional": true,
        "field": "firstname"
      },
      {
        "type": "string",
        "optional": true,
        "field": "lastname"
      }
    ],
    "optional": false,
    "name": "author"
  },
  "payload": {
    "id": 1,
    "entity_create_date": 1600287236682,
    "entity_modify_date": 1600287236682,
    "entity_version": 1,
    "firstname": "George",
    "lastname": "Orwell"
  }
}
As you can see, there is no info about whether this change was captured by the Source connector because of an insert or an update.
I need this information. How can I solve this?
You can't get that information using the JDBC Source connector, unless you do something bespoke in the source schema and triggers.
This is one of the reasons why log-based CDC is generally a better way to get events from the source database. Other reasons include:
capturing deletes
capturing the type of operation
capturing all events, not just what's there at the time when the connector polls.
For more details on the nuances of this see this blog or a talk based on the same.
Using a CDC-based approach as suggested by @Robin Moffatt may be the proper way to handle your requirement. Check out https://debezium.io/
However, looking at your table data you could use "entity_create_date" and "entity_modify_date" in your consumer to determine whether the message is an insert or an update. If "entity_create_date" = "entity_modify_date" then it's an insert, else it's an update.
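As a minimal sketch of that check (assuming the kafka-python client, a broker on localhost:9092, and a topic named authors; these names are placeholders, not taken from the question):
from kafka import KafkaConsumer
import json

# Placeholder topic and broker address; adjust to your setup.
consumer = KafkaConsumer(
    "authors",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for message in consumer:
    payload = message.value["payload"]
    # Equal timestamps: the row was never modified after creation -> insert.
    if payload["entity_create_date"] == payload["entity_modify_date"]:
        print("insert", payload["id"])
    else:
        print("update", payload["id"])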

Write correct JSON about PlanDefinition and ActivityDefinition on FHIR

I want to write JSON to create a PlanDefinition resource with some ActivityDefinition resources inside it, and persist these resources on an FHIR R4 server.
My sandbox server is HAPI FHIR.
Two questions:
First: how can I write it?
Second: when I write the correct JSON, will the result be the creation of one PlanDefinition resource and some ActivityDefinition resources, or will only one PlanDefinition resource be created with this information inside it?
This is my JSON to create a simple PlanDefinition, but I don't know how to add an ActivityDefinition inside it:
{
  "resourceType": "PlanDefinition",
  "id": "999999",
  "meta": {
    "versionId": "1",
    "lastUpdated": "2020-04-16T11:10:45.868+00:00",
    "source": "#YS2h8QIqvGKHDy4x"
  },
  "url": "www.myserver.it",
  "identifier": [ {
    "system": "www.myserver.it",
    "value": "jtr-pd1"
  } ],
  "version": "versione 1",
  "status": "active",
  "action": [ {
    "title": "A",
    "definitionCanonical": "#Process_Alex1"
  }, {
    "title": "B",
    "definitionCanonical": "#Process_Alex2"
  }, {
    "title": "C",
    "definitionCanonical": "ActivityDefinition"
  } ]
}
Typically in FHIR we don't contain resources inside each other. References instead point to other independently maintained resource instances. For example, multiple PlanDefinitions might point to the same ActivityDefinition because that one activity is a 'step' in multiple protocols/order sets.
If you have a situation where an activity definition is tied to a single PlanDefinition and can't exist independent of that PlanDefinition (e.g. if the PlanDefinition were deleted, the ActivityDefinition would go too; no other PlanDefinition can point to the Activity, any update to the activity would be considered an update to the plan, etc.), you can send the ActivityDefinition as a 'contained' resource. Your instance would look like this:
{
  "resourceType": "PlanDefinition",
  "id": "999999",
  "meta": {
    "versionId": "1",
    "lastUpdated": "2020-04-16T11:10:45.868+00:00",
    "source": "#YS2h8QIqvGKHDy4x"
  },
  "contained": [ {
    "resourceType": "ActivityDefinition",
    "id": "Process_Alex1",
    ...
  }, {
    "resourceType": "ActivityDefinition",
    "id": "Process_Alex2",
    ...
  } ],
  "url": "www.myserver.it",
  "identifier": [ {
    "system": "www.myserver.it",
    "value": "jtr-pd1"
  } ],
  "version": "versione 1",
  "status": "active",
  "action": [ {
    "title": "A",
    "definitionCanonical": "#Process_Alex1"
  }, {
    "title": "B",
    "definitionCanonical": "#Process_Alex2"
  }, {
    "title": "C",
    "definitionCanonical": "http://somewhere.org/ActivityDefinition/foo"
  } ]
}
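(Note that the definitionCanonical values starting with # are local references that resolve to the id of a resource in the contained array, while the absolute URL in the last action points to an ActivityDefinition maintained independently of the plan.)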

Only data from node 1 visible in a 2 node OrientDB cluster

I created a 2-node OrientDB cluster by following the steps below. But when running distributed, only the data present in one of the nodes is accessible. Please can you help me debug this issue? The OrientDB version is 2.2.6.
Steps involved:
Utilized plocal mode in the ETL tool and stored part of the data in node 1 and the other part in node 2. The data stored actually belongs to just one vertex class. (On checking the data from the console, the data has been ingested properly.)
Then I executed both nodes in distributed mode, but the data from only one machine is accessible.
The default-distributed-db-config.json file is specified below :
{
  "autoDeploy": true,
  "readQuorum": 1,
  "writeQuorum": 1,
  "executionMode": "undefined",
  "readYourWrites": true,
  "servers": {
    "*": "master"
  },
  "clusters": {
    "internal": {
    },
    "address": {
      "servers": [ "orientmaster" ]
    },
    "address_1": {
      "servers": [ "orientslave1" ]
    },
    "*": {
      "servers": [ "<NEW_NODE>" ]
    }
  }
}
There are two clusters created for the vertex named address, namely address and address_1. The data on machine orientslave1 is stored via the ETL tool into cluster address_1; similarly, the data on machine orientmaster is stored into cluster address. (I've ensured that both cluster ids are different at the time of creation.)
However, when these two machines are connected together in distributed mode, only the data in cluster address_1 is visible.
The ETL json is attached below :
{
  "source": { "file": { "path": "/home/ubuntu/labvolume1/DataStorage/geo1_5lacs.csv" } },
  "extractor": { "csv": { "columnsOnFirstLine": false, "columns": ["place:string"] } },
  "transformers": [
    { "vertex": { "class": "ADDRESS", "skipDuplicates": true } }
  ],
  "loader": {
    "orientdb": {
      "dbURL": "plocal:/home/ubuntu/labvolume1/orientdb/databases/ETL_Test1",
      "dbType": "graph",
      "dbUser": "admin",
      "dbPassword": "admin",
      "dbAutoCreate": true,
      "wal": false,
      "tx": false,
      "classes": [
        { "name": "ADDRESS", "extends": "V", "clusters": 1 }
      ],
      "indexes": [
        { "class": "ADDRESS", "fields": ["place:string"], "type": "UNIQUE" }
      ]
    }
  }
}
Please let me know if there is anything I'm doing wrong.

Orion - Cygnus integration

We are trying to integrate Orion, Cygnus and CKAN together.
I have followed these steps in order to make this happen:
Installed and configured Cygnus with the FIWARE CKAN info (Cygnus up and running)
Logged in to CKAN, got the API key and configured it in the Cygnus settings
Orion steps:
queryUpdate = APPEND data
{
  "contextElements": [{
    "type": "Room",
    "isPattern": "false",
    "id": "26JanRoom",
    "attributes": [{
      "name": "temperature",
      "type": "float",
      "value": "888"
    }]
  }],
  "updateAction": "APPEND"
}
subscribeContext = subscribe with the entity id created above (our Cygnus host is given as the reference: "reference": "CYGNUS HOST")
{
  "entities": [{
    "type": "Room",
    "isPattern": "false",
    "id": "26JanRoom"
  }],
  "attributes": ["temperature"],
  "reference": "CYGNUS HOST",
  "duration": "P1M",
  "notifyConditions": [{
    "type": "ONCHANGE",
    "condValues": ["temperature"]
  }],
  "throttling": "PT5S"
}
queryUpdate = UPDATE data
{
  "contextElements": [{
    "type": "Room",
    "isPattern": "false",
    "id": "26JanRoom",
    "attributes": [{
      "name": "temperature",
      "type": "float",
      "value": "111"
    }]
  }],
  "updateAction": "UPDATE"
}
What we expect is to receive some notifications on the Cygnus side, but nothing is sent from Orion (orion.lab.fi-ware.org:1026/).
Could you please help us with this topic?
You are using
"condValues": ["pressure"]
which means that every time the attribute named pressure changes, Orion will trigger a notification. However, your update is modifying temperature.
Please have a look at the subscribeContext operation section in the Orion documentation.
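In other words, the notifyConditions block must list the attribute that is actually being updated. A corrected sketch, matching the entity and attribute shown in the question:
"notifyConditions": [{
  "type": "ONCHANGE",
  "condValues": ["temperature"]
}]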