Grafana Dashboard for Gatling-InfluxDB Setup

I am trying to import a ready-made Grafana dashboard for my setup (Gatling/InfluxDB), but the ones I tried do not work, in particular the simulation parameter in the dashboard. If anyone is using the same setup, could you please share your JSON file?
Below is my config for Gatling and InfluxDB.
gatling.conf:
data {
writers = [console,file,graphite] # The list of DataWriters to which Gatling write simulation data (currently supported : console, file, graphite)
console {
light = false # When set to true, displays a light version without detailed request stats
writePeriod = 5 # Write interval, in seconds
}
file {
bufferSize = 8192 # FileDataWriter's internal data buffer size, in bytes
}
leak {
noActivityTimeout = 30 # Period, in seconds, for which Gatling may have no activity before considering a leak may be happening
}
graphite {
light = false # only send the all* stats
host = "localhost" # The host where the Carbon server is located
port = 2003 # The port to which the Carbon server listens to (2003 is default for plaintext, 2004 is default for pickle)
protocol = "tcp" # The protocol used to send data to Carbon (currently supported : "tcp", "udp")
rootPathPrefix = "gatling" # The common prefix of all metrics sent to Graphite
bufferSize = 8192 # Internal data buffer size, in bytes
writePeriod = 1 # Write period, in seconds
}
and influxdb.conf contains the following parameters:
[[graphite]]
# Determines whether the graphite endpoint is enabled.
enabled = true
database = "gatling"
# retention-policy = ""
bind-address = ":2003"
protocol = "tcp"
consistency-level = "one"
# These next lines control how batching works. You should have this enabled
# otherwise you could get dropped metrics or poor performance. Batching
# will buffer points in memory if you have many coming in.
# Flush if this many points get buffered
batch-size = 5000
# number of batches that may be pending in memory
# batch-pending = 10
# Flush at least this often even if we haven't hit buffer limit
# batch-timeout = "1s"
# UDP Read buffer size, 0 means OS default. UDP listener will fail if set above OS max.
# udp-read-buffer = 0
### This string joins multiple matching 'measurement' values providing more control over the final measurement name.
separator = "."

You can find free dashboards via Grafana search: https://grafana.com/grafana/dashboards/?search=gatling

Related

Gatling not sending metrics to InfluxDB using graphite protocol

I followed the BlazeMeter article to monitor Gatling tests with Grafana and InfluxDB, but no data is sent to InfluxDB and no database named "gatlingdb" is created. InfluxDB is up and listening on port 2003. This is the log from InfluxDB:
2022-01-07T13:57:53.019217Z info Starting graphite service {"log_id": "0YuD8znW000", "service": "graphite", "addr": ":2003", "batch_size": 5000, "batch_timeout": "1s"}
And I set gatling.conf fields to these:
data {
writers = [console,file,graphite] # The list of DataWriters to which Gatling write simulation data (currently supported : console, file, graphite)
console {
light = false # When set to true, displays a light version without detailed request stats
writePeriod = 5 # Write interval, in seconds
}
file {
bufferSize = 8192 # FileDataWriter's internal data buffer size, in bytes
}
leak {
noActivityTimeout = 30 # Period, in seconds, for which Gatling may have no activity before considering a leak may be happening
}
graphite {
light = false # only send the all* stats
host = "localhost" # The host where the Carbon server is located
port = 2003 # The port to which the Carbon server listens to (2003 is default for plaintext, 2004 is default for pickle)
protocol = "tcp" # The protocol used to send data to Carbon (currently supported : "tcp", "udp")
rootPathPrefix = "gatling" # The common prefix of all metrics sent to Graphite
bufferSize = 8192 # Internal data buffer size, in bytes
writePeriod = 1 # Write period, in seconds
}
and influxdb.conf contains the following parameters:
[[graphite]]
# Determines whether the graphite endpoint is enabled.
enabled = true
database = "gatlingdb"
# retention-policy = ""
bind-address = ":2003"
protocol = "tcp"
consistency-level = "one"
# These next lines control how batching works. You should have this enabled
# otherwise you could get dropped metrics or poor performance. Batching
# will buffer points in memory if you have many coming in.
# Flush if this many points get buffered
batch-size = 5000
# number of batches that may be pending in memory
# batch-pending = 10
# Flush at least this often even if we haven't hit buffer limit
# batch-timeout = "1s"
# UDP Read buffer size, 0 means OS default. UDP listener will fail if set above OS max.
# udp-read-buffer = 0
### This string joins multiple matching 'measurement' values providing more control over the final measurement name.
separator = "."
### Default tags that will be added to all metrics. These can be overridden at the template level
### or by tags extracted from metric
# tags = ["region=us-east", "zone=1c"]
### Each template line requires a template pattern. It can have an optional
### filter before the template and separated by spaces. It can also have optional extra
### tags following the template. Multiple tags should be separated by commas and no spaces
### similar to the line protocol format. There can be only one default template.
templates = [
"gatling.*.*.*.count measurement.simulation.request.status.field",
"gatling.*.*.*.min measurement.simulation.request.status.field",
"gatling.*.*.*.max measurement.simulation.request.status.field",
"gatling.*.*.*.percentiles95 measurement.simulation.request.status.field",
"gatling.*.*.*.percentiles99 measurement.simulation.request.status.field"
]
Now I am running a test through Gatling, but after the test completes successfully no database named gatlingdb is created in InfluxDB.
I am not sure what else I need to add.
You need to create the database manually:
> influx
> CREATE DATABASE gatlingdb

Telegraf connection to Mosquitto using TLS

In my system (with a Raspberry Pi) I have some sensors that publish data to Mosquitto. I'm using Telegraf to transfer the data to an InfluxDB database and Grafana to display it.
During tests without a TLS connection (in Mosquitto) everything worked correctly, but once I activated TLS I started having a problem with Telegraf.
The sensors send their data to the broker using client.key, client.crt and ca.crt.
In the broker I can see the data from the sensors, so I think the problem is not there.
In Telegraf (which I suppose works as a client) I tried to configure the TLS connection.
The telegraf.service status is active and running. In the journal I don't see any connection errors, but I can't see any data from the broker.
In telegraf.conf I set the certificates as shown below. Instead of .pem files I used the files that I use for the sensors and the other clients connected to the system; the extension is different and I don't know whether the problem is there.
Here is the Telegraf configuration (mqtt_consumer):
# # Read metrics from MQTT topic(s)
[[inputs.mqtt_consumer]]
# ## Broker URLs for the MQTT server or cluster. To connect to multiple
# ## clusters or standalone servers, use a seperate plugin instance.
# ## example: servers = ["tcp://localhost:1883"]
# ## servers = ["ssl://localhost:1883"]
# ## servers = ["ws://localhost:1883"]
servers = ["tcp://192.168.1.58:8883"]
#
# ## Topics that will be subscribed to.
topics = [
"sensors/#"
]
#
# ## The message topic will be stored in a tag specified by this value. If set
# ## to the empty string no topic tag will be created.
# # topic_tag = "topic"
#
# ## QoS policy for messages
# ## 0 = at most once
# ## 1 = at least once
# ## 2 = exactly once
# ##
# ## When using a QoS of 1 or 2, you should enable persistent_session to allow
# ## resuming unacknowledged messages.
# # qos = 0
#
# ## Connection timeout for initial connection in seconds
# # connection_timeout = "30s"
#
# ## Maximum messages to read from the broker that have not been written by an
# ## output. For best throughput set based on the number of metrics within
# ## each message and the size of the output's metric_batch_size.
# ##
# ## For example, if each message from the queue contains 10 metrics and the
# ## output metric_batch_size is 1000, setting this to 100 will ensure that a
# ## full batch is collected and the write is triggered immediately without
# ## waiting until the next flush_interval.
# # max_undelivered_messages = 1000
#
# ## Persistent session disables clearing of the client session on connection.
# ## In order for this option to work you must also set client_id to identify
# ## the client. To receive messages that arrived while the client is offline,
# ## also set the qos option to 1 or 2 and don't forget to also set the QoS when
# ## publishing.
# # persistent_session = false
#
# ## If unset, a random client ID will be generated.
client_id = ""
#
# ## Username and password to connect MQTT server.
#username = ""
#password = ""
#
# ## Optional TLS Config
tls_ca = "/etc/telegraf/ca.crt"
tls_cert = "/etc/telegraf/client.crt"
tls_key = "/etc/telegraf/client.key"
# ## Use TLS but skip chain & host verification
# insecure_skip_verify = false
#
# ## Data format to consume.
# ## Each data format has its own unique set of configuration options, read
# ## more about them here:
# ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "influx"
How can I check the connection to the broker in Telegraf? Is the configuration correct, or should I use only .pem files?
Your MQTT URL starts with tcp://, but it should start with ssl:// for an MQTT over SSL connection.
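To check the broker connection independently of Telegraf, one option is a small probe script. The following is a minimal sketch, not part of Telegraf: the function names uses_ssl_scheme and tls_probe are hypothetical, and the certificate paths are assumed to be the ones from the configuration above. It first checks the URL scheme and can then attempt a TLS handshake with the same CA, client certificate and key.

```python
import socket
import ssl
from urllib.parse import urlparse


def uses_ssl_scheme(server_url):
    """Return True if the server URL uses the ssl:// scheme (hypothetical helper)."""
    return urlparse(server_url).scheme.lower() == "ssl"


def tls_probe(host, port, ca_file, cert_file, key_file, timeout=5.0):
    """Attempt a TLS handshake with the broker and return the negotiated TLS version.

    Raises ssl.SSLError or OSError if the handshake or the TCP connection fails.
    Note: the default context verifies the host name, so the broker certificate
    must be valid for the host (or IP address) passed here.
    """
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()
```

For example, uses_ssl_scheme("tcp://192.168.1.58:8883") is False, while tls_probe("192.168.1.58", 8883, "/etc/telegraf/ca.crt", "/etc/telegraf/client.crt", "/etc/telegraf/client.key") would either return a version string or raise an error that shows where the handshake fails.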

How To Push Gatling Perf Results To EC2 Grafana/InfluxDB instance

I have spun up a t2.micro Ubuntu 18.04 EC2 instance and manually installed Grafana and InfluxDB on it.
Both Grafana and InfluxDB installed successfully with no errors. What I now expect is that when I run Gatling tests on my local Windows machine, the results are pushed live to InfluxDB and eventually to Grafana.
Here is an extract of my gatling.conf settings:
data {
writers = [console, file, graphite] # The list of DataWriters to which Gatling write simulation data (currently supported : console, file, graphite, jdbc)
console {
#light = false # When set to true, displays a light version without detailed request stats
#writePeriod = 5 # Write interval, in seconds
}
graphite {
light = false # only send the all* stats
host = "http://ec2-54-67-97-86.us-west-1.compute.amazonaws.com" # The host where the Carbon server is located
port = 2003 # The port to which the Carbon server listens to (2003 is default for plaintext, 2004 is default for pickle)
protocol = "tcp" # The protocol used to send data to Carbon (currently supported : "tcp", "udp")
rootPathPrefix = "gatling" # The common prefix of all metrics sent to Graphite
bufferSize = 8192 # GraphiteDataWriter's internal data buffer size, in bytes
writeInterval = 1 # GraphiteDataWriter's write interval, in seconds
}
The problem is that I see no data in the InfluxDB instance when I run my Gatling tests locally:
ubuntu@ip-172-31-9-16:~$ influx -host ec2-54-67-97-86.us-west-1.compute.amazonaws.com
Connected to http://ec2-54-67-97-86.us-west-1.compute.amazonaws.com:8086 version 1.7.7
InfluxDB shell version: 1.7.7
> show databases
name: databases
name
----
_internal
gatling
graphite
> use graphite
Using database graphite
> show series
key
---
X-Grafana-Org-Id:
Can someone help me debug why no data is being received by InfluxDB?
I suggest you check your Graphite listener in InfluxDB.
To do that, open your influxdb.conf and find the [[graphite]] block.
With the default settings it should look like this:
[[graphite]]
# Determines whether the graphite endpoint is enabled.
enabled = true
database = "gatlingdb"
retention-policy = ""
bind-address = ":2003"
protocol = "tcp"
consistency-level = "one"
templates = [
"gatling.*.*.*.* measurement.simulation.request.status.field",
"gatling.*.users.*.* measurement.simulation.measurement.request.field"
]
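Each template line is a filter followed by a template that names the dot-separated parts of the Graphite metric path. The following is a simplified sketch of that mapping. It handles only the plain keywords used in the two templates above (measurement, field, and tag names) and is not InfluxDB's actual parser. The function name apply_template and the metric names in the usage note are hypothetical.

```python
def apply_template(metric, template, separator="."):
    """Map a dot-separated Graphite metric path onto (measurement, tags, field).

    Simplified sketch: every template part is either 'measurement', 'field',
    or a tag key, and the template must have as many parts as the metric.
    """
    parts = metric.split(".")
    keys = template.split(".")
    if len(parts) != len(keys):
        raise ValueError("metric and template must have the same number of parts")

    measurement_parts = []
    tags = {}
    field = None
    for key, part in zip(keys, parts):
        if key == "measurement":
            # Several 'measurement' parts are joined with the separator.
            measurement_parts.append(part)
        elif key == "field":
            field = part
        else:
            # Any other keyword is treated as a tag key.
            tags[key] = part
    return separator.join(measurement_parts), tags, field
```

Under these assumptions, apply_template("gatling.basicsimulation.request_1.ok.count", "measurement.simulation.request.status.field") yields the measurement gatling, the tags simulation, request and status, and the field count.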
More info here: https://gatling.io/docs/current/realtime_monitoring/#influxdb
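To check the Graphite listener from the machine that runs Gatling, one option is to send a single test metric by hand. This is a minimal sketch using the Graphite plaintext protocol (one "path value timestamp" line per metric over TCP). The function names and the metric path are hypothetical, and the host passed to send_metric is assumed to be a bare host name or IP address.

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Format one metric as a Graphite plaintext line: path, value, Unix timestamp."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(host, port, line, timeout=5.0):
    """Open a TCP connection to the Graphite listener and send one metric line.

    Raises OSError (for example a refused or timed-out connection) if the
    listener cannot be reached from this machine.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))
```

For example, send_metric("localhost", 2003, graphite_line("gatling.testsimulation.request_1.ok.count", 1)) followed by SHOW SERIES in the influx shell should show whether the point arrived. A connection error here would point to the listener or to network access rather than to Gatling.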

C10k Tsung Gatling and PlayWS

I'm new to load testing, but I googled a lot and configured a test system on Amazon.
The system consists of a WebSocket server on the Play framework and some load-testing machines. I tried two load-testing tools: Tsung and Gatling.
My testing scenario: I create more than 10k users; each of them connects to the server and starts sending one message per second. I tuned Linux to handle more than 100k connections. I played around with JAVA_OPTS for Gatling (added more memory and ParallelGC). I played around with Akka on the server side to handle 100-300 dispatcher threads. I ordered a 36 vCPU machine with 60 GB RAM and a 10 Gbit channel for the server.
But the result was the same: Tsung and Gatling each send about 10k messages per second from one machine (I sent only text messages < 160 bytes).
Can someone explain why I can't get past 10k concurrent users (1 message per second each)? What am I doing wrong?
actor {
default-dispatcher = {
fork-join-executor {
throughput = 1000
parallelism-factor = 36.0
parallelism-max = 154
}
}
}
JAVA_OPTS="-Xmx3800m -Xms3800m -Xmn2g -XX:+UseParallelGC -XX:ParallelGCThreads=20"
Linux configs
ulimit -n 999999
sudo vim /etc/sysctl.conf
# General gigabit tuning
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 4096 16777216
net.ipv4.tcp_wmem = 4096 4096 16777216
#
# # This gives the kernel more memory for TCP
# # which you need with many (100k+) open socket connections
net.ipv4.tcp_mem = 4096 65536 16777216
#
# # Backlog
net.core.netdev_max_backlog = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_syncookies = 1
# Controls IP packet forwarding
net.ipv4.ip_forward = 0
# Controls source route verification
net.ipv4.conf.default.rp_filter = 1
# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0
# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0
# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1
# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 0
# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
# Controls the default maximum size of a message queue
kernel.msgmnb = 65535
# Controls the maximum size of a message, in bytes
kernel.msgmax = 65535
# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736
# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_synack_retries = 5
net.ipv4.tcp_orphan_retries = 5
fs.file-max = 999999
net.ipv4.tcp_max_orphans = 819200
net.core.somaxconn = 65535
net.ipv4.tcp_congestion_control = cubic
I tested with 5 client machines and found that the server side on the Play framework can handle 50k messages per second from 50k users with this configuration. The problem is that one client machine can't send more than 10k messages per second from 10k users. Does anyone know of other load-testing tools that can send more than 10k messages per second from 10k users?

Fiware - Cygnus to Cosmos, I can not upload data to HDFS

I've got an Orion instance with Cygnus. Subscriptions and notifications work fine, but I cannot send files to cosmos.lab.fi-ware.org from my instance.
[ERROR - es.tid.fiware.orionconnectors.cosmosinjector.OrionHDFSSink.start(OrionHDFSSink.java:108)] Connection to http://130.206.80.46:14000 refused
My cygnus.conf:
# APACHE_FLUME_HOME/conf/cygnus.conf
# The next three fields set the sources, sinks and channels used by Cygnus. You could use different names than the
# ones suggested below, but in that case make sure you keep coherence in properties names along the configuration file.
# Regarding sinks, you can use multiple ones at the same time; the only requirement is to provide a channel for each
# one of them (this example shows how to configure 3 sinks at the same time).
cygnusagent.sources = http-source
cygnusagent.sinks = hdfs-sink
cygnusagent.channels = hdfs-channel
#=============================================
# source configuration
# channel name where to write the notification events
cygnusagent.sources.http-source.channels = hdfs-channel
# source class, must not be changed
cygnusagent.sources.http-source.type = org.apache.flume.source.http.HTTPSource
# listening port the Flume source will use for receiving incoming notifications
cygnusagent.sources.http-source.port = 5050
# Flume handler that will parse the notifications, must not be changed
cygnusagent.sources.http-source.handler = es.tid.fiware.fiwareconnectors.cygnus.handlers.OrionRestHandler
# URL target
cygnusagent.sources.http-source.handler.notification_target = /notify
# Default organization (organization semantic depend on the persistence sink)
cygnusagent.sources.http-source.handler.default_organization = org42
# ============================================
# OrionHDFSSink configuration
# channel name from where to read notification events
cygnusagent.sinks.hdfs-sink.channel = hdfs-channel
# sink class, must not be changed
cygnusagent.sinks.hdfs-sink.type = es.tid.fiware.fiwareconnectors.cygnus.sinks.OrionHDFSSink
# The FQDN/IP address of the Cosmos deployment where the notification events will be persisted
cygnusagent.sinks.hdfs-sink.cosmos_host = 130.206.80.46
# port of the Cosmos service listening for persistence operations; 14000 for httpfs, 50070 for webhdfs and free choice for infinity
cygnusagent.sinks.hdfs-sink.cosmos_port = 14000
# default username allowed to write in HDFS
cygnusagent.sinks.hdfs-sink.cosmos_default_username = myUsername
# default password for the default username
cygnusagent.sinks.hdfs-sink.cosmos_default_password = **********
# HDFS backend type (webhdfs, httpfs or infinity)
cygnusagent.sinks.hdfs-sink.hdfs_api = httpfs
# how the attributes are stored, either per row either per column (row, column)
cygnusagent.sinks.hdfs-sink.attr_persistence = column
# prefix for the database and table names, empty if no prefix is desired
cygnusagent.sinks.hdfs-sink.naming_prefix =
# Hive port for Hive external table provisioning
cygnusagent.sinks.hdfs-sink.hive_port = 10000
#=============================================
# hdfs-channel configuration
# channel type (must not be changed)
cygnusagent.channels.hdfs-channel.type = memory
# capacity of the channel
cygnusagent.channels.hdfs-channel.capacity = 1000
# amount of bytes that can be sent per transaction
cygnusagent.channels.hdfs-channel.transactionCapacity = 100
Error log :
2015-02-04 22:52:28,627 (lifecycleSupervisor-1-1)
[INFO - es.tid.fiware.orionconnectors.cosmosinjector.hdfs.HttpFSBackend.createDir(HttpFSBackend.java:68)]
HttpFS operation: PUT 130.206.80.46:14000/webhdfs/v1/user/maxime.mularz/4planet/?op=mkdirs&user.name=maxime.mularz HTTP/1.1
2015-02-04 22:53:31,690 (lifecycleSupervisor-1-1)
[ERROR - es.tid.fiware.orionconnectors.cosmosinjector.OrionHDFSSink.start(OrionHDFSSink.java:108)]
Connection to http://130.206.80.46:14000 refused
2015-02-04 22:56:02,182 (SinkRunner-PollingRunner-DefaultSinkProcessor)
[INFO - es.tid.fiware.orionconnectors.cosmosinjector.OrionHDFSSink.persist(OrionHDFSSink.java:212)]
Persisting data. File: Room1-Room-temperature-float.txt, Data: 2015-02-04T22:56:02.182|1423086962|Room1|Room|temperature|float|90)
2015-02-04 22:56:02,183 (SinkRunner-PollingRunner-DefaultSinkProcessor)
[INFO - es.tid.fiware.orionconnectors.cosmosinjector.hdfs.HttpFSBackend.exists(HttpFSBackend.java:158)]
HttpFS operation: GET 130.206.80.46:14000/webhdfs/v1/user/maxime.mularz/4planet/Room1-Room-temperature-float.txt?op=getfilestatus&user.name=maxime.mularz HTTP/1.1
Thanks in advance.
Fixed once access from the Lannion node to the Spanish node was enabled.