Set up Prometheus with Docker Compose

I am new to Prometheus and Docker Compose. My project has a docker-compose.yml file that sets up Prometheus and Grafana:
/prometheus-grafana/prometheus/docker-compose.yml
version: '3'
services:
  prometheus:
    image: prom/prometheus:v2.21.0
    ports:
      - 9000:9090
    volumes:
      - ./prometheus:/etc/prometheus
      - prometheus-data:/prometheus
    command: --web.enable-lifecycle --config.file=/etc/prometheus/prometheus.yml
  grafana:
    image: grafana/grafana:$GRAFANA_VERSION
    environment:
      GF_SECURITY_ADMIN_USER: $GRAFANA_ADMIN_USER
      GF_SECURITY_ADMIN_PASSWORD: $GRAFANA_ADMIN_PASSWORD
    ports:
      - 3000:3000
    volumes:
      - grafana-storage:/var/lib/grafana
    depends_on:
      - prometheus
    networks:
      - internal
networks:
  internal:
volumes:
  prometheus-data:
  grafana-storage:
I just added more configuration to /prometheus-grafana/prometheus/prometheus/prometheus.yml; the part I added is the last 3 lines:
global:
  scrape_interval: 30s
  scrape_timeout: 10s
rule_files:
  - alert.yml
scrape_configs:
  - job_name: services
    metrics_path: /metrics
    static_configs:
      - targets:
          - 'prometheus:9090'
          - 'idonotexists:564'
  - job_name: myapp
    scrape_interval: 10s
    static_configs:
      - targets:
          - localhost:2112
I started Prometheus by running docker-compose up -d and opened http://localhost:9000/graph, but I don't see the new config I added to prometheus.yml. The docker-compose.yml file contains this line:
command: --web.enable-lifecycle --config.file=/etc/prometheus/prometheus.yml
Am I supposed to change this so that it refers to the path of my /prometheus-grafana/prometheus/prometheus/prometheus.yml file instead of /etc/prometheus/prometheus.yml? In reality, the file /etc/prometheus/prometheus.yml doesn't exist.
Thank you in advance.

If Prometheus is started with --config.file=/etc/prometheus/prometheus.yml, it expects its config file at exactly that path inside the container.
Since you already have the following mount points configured:
volumes:
  - ./prometheus:/etc/prometheus
  - prometheus-data:/prometheus
you just need to put your config file into ./prometheus (relative to docker-compose.yml) and name it prometheus.yml.
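As a minimal sketch of how the paths line up (the service definition follows the question; the comments are explanatory and not part of the original file):

```yaml
# docker-compose.yml, located in /prometheus-grafana/prometheus/
services:
  prometheus:
    image: prom/prometheus:v2.21.0
    ports:
      - 9000:9090
    volumes:
      # The host directory ./prometheus (next to this file) is mounted
      # at /etc/prometheus, so ./prometheus/prometheus.yml on the host
      # is /etc/prometheus/prometheus.yml inside the container.
      - ./prometheus:/etc/prometheus
    command: --web.enable-lifecycle --config.file=/etc/prometheus/prometheus.yml
```

Prometheus reads the file at startup, so an edit only becomes visible after a restart or a reload. Because --web.enable-lifecycle is set, sending an HTTP POST to http://localhost:9000/-/reload should trigger a reload without restarting the container.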

Related

How to add the Kafka Exporter as a data source to Grafana?

I'm trying a simplified example of using Kafka and the Kafka Exporter with Grafana. I have a docker-compose.yml similar to the following:
version: '2'
networks:
  app-tier:
    driver: bridge
services:
  zookeeper:
    image: 'bitnami/zookeeper:latest'
    environment:
      - 'ALLOW_ANONYMOUS_LOGIN=yes'
    networks:
      - app-tier
  kafka:
    image: 'bitnami/kafka:latest'
    environment:
      - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
      - ALLOW_PLAINTEXT_LISTENER=yes
    networks:
      - app-tier
  kafka-exporter:
    build: kafka-exporter
    ports:
      - "9308:9308"
    networks:
      - app-tier
    entrypoint: ["run.sh"]
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    networks:
      - app-tier
where run.sh is a wrapper script to keep retrying to run the exporter while Kafka is starting, see https://github.com/khpeek/kafka-exporter-example. The problem is that when I log into Grafana and try to add a Prometheus data source with URL http://kafka-exporter:9308/metrics, I get an error:
Error reading Prometheus: bad_response: readObjectStart: expect { or n, but found <, error found in #1 byte of ...|<html> |..., bigger context ...|<html> <head><title>Kafka Exporter</title>|...
It seems that Grafana ignores the /metrics path and tries to read data from http://kafka-exporter:9308 directly, which returns the HTML page quoted in the error message.
By contrast, the /metrics endpoint contains the actual metrics:
> curl localhost:9308/metrics
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0
go_gc_duration_seconds{quantile="0.5"} 0
go_gc_duration_seconds{quantile="0.75"} 0
go_gc_duration_seconds{quantile="1"} 0
go_gc_duration_seconds_sum 0
go_gc_duration_seconds_count 0
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 20
...
Why am I getting this error? Shouldn't Grafana pick up the path?
To show the implementation of @usuario's answer, the docker-compose.yml needs a Prometheus service:
version: '2'
networks:
  app-tier:
    driver: bridge
services:
  zookeeper:
    image: 'bitnami/zookeeper:latest'
    environment:
      - 'ALLOW_ANONYMOUS_LOGIN=yes'
    networks:
      - app-tier
  kafka:
    image: 'bitnami/kafka:latest'
    environment:
      - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
      - ALLOW_PLAINTEXT_LISTENER=yes
    networks:
      - app-tier
  kafka-exporter:
    build: kafka-exporter
    ports:
      - "9308:9308"
    networks:
      - app-tier
    entrypoint: ["run.sh"]
  cli:
    image: 'bitnami/kafka:latest'
    environment:
      - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
      - ALLOW_PLAINTEXT_LISTENER=yes
    networks:
      - app-tier
  prometheus:
    image: bitnami/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - "./prometheus/prometheus.yml:/opt/bitnami/prometheus/conf/prometheus.yml"
    networks:
      - app-tier
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    networks:
      - app-tier
where the prometheus/prometheus.yml configuration file is statically configured to scrape the Kafka Exporter:
global:
  scrape_interval: 10s
  scrape_timeout: 10s
  evaluation_interval: 1m
scrape_configs:
  - job_name: kafka-exporter
    metrics_path: /metrics
    honor_labels: false
    honor_timestamps: true
    sample_limit: 0
    static_configs:
      - targets: ['kafka-exporter:9308']
Now the Prometheus data source can be added, and metrics from the Kafka Exporter such as kafka_consumergroup_lag can be viewed.
You cannot connect a Grafana Prometheus data source to an exporter directly. You need to set up a Prometheus server, have that server scrape the Kafka Exporter metrics, and then connect Grafana to the Prometheus server.
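Instead of adding the data source by hand in the UI, Grafana can also load it from a provisioning file. The following is a minimal sketch, assuming the Prometheus service is named prometheus and listens on port 9090 as in the compose file above; the file path and the data source name are hypothetical and would need a matching volume mount on the grafana service:

```yaml
# grafana/provisioning/datasources/prometheus.yml (hypothetical path),
# mounted into the container under /etc/grafana/provisioning/datasources/
apiVersion: 1
datasources:
  - name: Prometheus          # display name, chosen for illustration
    type: prometheus
    access: proxy             # Grafana's backend talks to Prometheus
    # Point at the Prometheus server, not at the exporter's /metrics page.
    url: http://prometheus:9090
    isDefault: true
```

The important part is the url: it refers to the Prometheus server's base URL, and Prometheus in turn scrapes kafka-exporter:9308/metrics.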

How to configure fluent-bit, Fluentd, Loki and Grafana using docker-compose?

I am trying to run Fluent Bit in Docker and view logs in Grafana using Loki, but I can't see any labels in Grafana. The Loki data source reports that it is working and has found labels.
I need to figure out how to get Docker logs from fluent-bit -> loki -> grafana. Any logs will do.
Here is my docker-compose.yaml:
version: "3.3"
networks:
  loki:
    external: true
services:
  fluent-bit:
    image: grafana/fluent-bit-plugin-loki:latest
    container_name: fluent-bit
    environment:
      LOKI_URL: http://loki:3100/loki/api/v1/push
    networks:
      - loki
    volumes:
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
    logging:
      options:
        tag: infra.monitoring
Here is my config file.
[INPUT]
    Name   forward
    Listen 0.0.0.0
    Port   24224
[Output]
    Name       loki
    Match      *
    Url        ${LOKI_URL}
    RemoveKeys source
    Labels     {job="fluent-bit"}
    LabelKeys  container_name
    BatchWait  1
    BatchSize  1001024
    LineFormat json
    LogLevel   info
Here are my Grafana and Loki setups
grafana:
  image: grafana/grafana
  depends_on:
    - prometheus
  container_name: grafana
  volumes:
    - grafana_data:/var/lib/grafana:rw
    - ./grafana/provisioning:/etc/grafana/provisioning
  environment:
    - GF_SECURITY_ADMIN_USER=admin
    - GF_SECURITY_ADMIN_PASSWORD=admin
    - GF_USERS_ALLOW_SIGN_UP=false
    - GF_INSTALL_PLUGINS=grafana-piechart-panel
    - GF_RENDERING_SERVER_URL=http://renderer:8081/render
    - GF_RENDERING_CALLBACK_URL=http://grafana:3000/
    - GF_LOG_FILTERS=rendering:debug
  restart: unless-stopped
  networks:
    - traefik
    - loki
  labels:
    - "traefik.enable=true"
    - "traefik.http.routers.grafana.rule=Host(`grafana-int.mydomain.com`)"
    - "traefik.http.services.grafana.loadbalancer.server.port=3000"
    - "traefik.docker.network=traefik"
loki:
  image: grafana/loki:latest
  container_name: loki
  expose:
    - "3100"
  networks:
    - loki
renderer:
  image: grafana/grafana-image-renderer:2.0.0
  container_name: grafana-image-renderer
  expose:
    - "8081"
  environment:
    ENABLE_METRICS: "true"
  networks:
    - loki
I have tried using the following config as described in the docs linked in a comment below but still no labels.
[SERVICE]
    Flush        1
    Log_Level    info
    Parsers_File parsers.conf
[INPUT]
    Name              syslog
    Path              /tmp/in_syslog
    Buffer_Chunk_Size 32000
    Buffer_Max_Size   64000
[OUTPUT]
    Name       loki
    Match      *
    Url        ${LOKI_URL}
    RemoveKeys source
    Labels     {job="fluent-bit"}
    LabelKeys  container_name
    BatchWait  1
    BatchSize  1001024
    LineFormat json
    LogLevel   info
I tried this config but still no labels.
[INPUT]
    @type          tail
    format         json
    read_from_head true
    path           /var/log/syslog
    pos_file       /tmp/container-logs.pos
[OUTPUT]
    Name       loki
    Match      *
    Url        ${LOKI_URL}
    RemoveKeys source
    LabelKeys  container_name
    BatchWait  1
    BatchSize  1001024
    LineFormat json
    LogLevel   info
After playing around with this for a while, I figured the best way was to collect the logs in fluent-bit, forward them to Fluentd, output them to Loki, and read them in Grafana.
Here is a config which will work locally.
docker-compose.yaml for Fluentd and Loki.
version: "3.8"
networks:
  appnet:
    external: true
volumes:
  host_logs:
services:
  fluentd:
    image: grafana/fluent-plugin-loki:master
    command:
      - "fluentd"
      - "-v"
      - "-p"
      - "/fluentd/plugins"
    environment:
      LOKI_URL: http://loki:3100
      LOKI_USERNAME:
      LOKI_PASSWORD:
    container_name: "fluentd"
    restart: always
    ports:
      - '24224:24224'
    networks:
      - appnet
    volumes:
      - host_logs:/var/log
      # Needed for journald log ingestion:
      - /etc/machine-id:/etc/machine-id
      - /dev/log:/dev/log
      - /var/run/systemd/journal/:/var/run/systemd/journal/
      - type: bind
        source: ./config/fluent.conf
        target: /fluentd/etc/fluent.conf
      - type: bind
        source: /var/lib/docker/containers
        target: /fluentd/log/containers
    logging:
      options:
        tag: docker.monitoring
  loki:
    image: grafana/loki:master
    container_name: "loki"
    restart: always
    networks:
      - appnet
    ports:
      - 3100
    volumes:
      - type: bind
        source: ./config/loki.conf
        target: /loki/etc/loki.conf
    depends_on:
      - fluentd
fluent.conf
<source>
  @type forward
  bind 0.0.0.0
  port 24224
</source>
<match **>
  @type loki
  url "http://loki:3100"
  flush_interval 1s
  flush_at_shutdown true
  buffer_chunk_limit 1m
  extra_labels {"job":"localhost_logs", "host":"localhost", "agent":"fluentd"}
  <label>
    fluentd_worker
  </label>
</match>
loki.conf
auth_enabled: false
server:
  http_listen_port: 3100
ingester:
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    final_sleep: 0s
  chunk_idle_period: 5m
  chunk_retain_period: 30s
schema_config:
  configs:
    - from: 2020-10-16
      store: boltdb
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 168h
storage_config:
  boltdb:
    directory: /tmp/loki/index
  filesystem:
    directory: /tmp/loki/chunks
limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h
docker-compose.yaml for fluent-bit
version: "3.8"
networks:
  appnet:
    external: true
services:
  fluent-bit:
    image: fluent/fluent-bit:latest
    container_name: "fluent-bit"
    restart: always
    ports:
      - '2020:2020'
    networks:
      - appnet
    volumes:
      - type: bind
        source: ./config/fluent-bit.conf
        target: /fluent-bit/etc/fluent-bit.conf
        read_only: true
      - type: bind
        source: ./config/parsers.conf
        target: /fluent-bit/etc/parsers.conf
        read_only: true
      - type: bind
        source: /var/log/
        target: /var/log/
      - type: bind
        source: /var/lib/docker/containers
        target: /fluent-bit/log/containers
fluent-bit.conf
[SERVICE]
    Flush        2
    Log_Level    info
    Parsers_File parsers.conf
[INPUT]
    Name   tail
    Path   /fluent-bit/log/containers/*/*-json.log
    Tag    docker.logs
    Parser docker
[OUTPUT]
    Name  forward
    Match *
    Host  fluentd
parsers.conf
[PARSER]
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On
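To read these logs in Grafana, Loki still has to be registered as a data source. A minimal sketch of a Grafana provisioning file follows, assuming Grafana runs on the same appnet network and mounts a provisioning directory; the file path and the data source name are hypothetical and not part of the original answer:

```yaml
# grafana/provisioning/datasources/loki.yml (hypothetical path),
# mounted into the container under /etc/grafana/provisioning/datasources/
apiVersion: 1
datasources:
  - name: Loki                # display name, chosen for illustration
    type: loki
    access: proxy
    # Service name and port taken from the compose file and loki.conf above.
    url: http://loki:3100
```

In Grafana's Explore view, a query such as {job="localhost_logs"} should then match the extra_labels set in fluent.conf.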

traefik v2.2 help using only docker-compose router service entrypoint

I have started learning Docker and Traefik for use at home.
Aim: put everything together in docker-compose.yml and .env files, understand the basics, and comment accordingly.
I want to reach the dashboard at traefik.test.local/dashboard rather than test.local:8080; similarly, the API should be accessible at traefik.test.local/api, so that I don't have to think about port numbers.
I added these lines to /etc/hosts:
127.0.0.1 test.local
127.0.0.1 traefik.test.local
docker-compose.yml
version: "3.7"
services:
  traefik:
    # The official v2 Traefik docker image
    image: traefik:v2.2
    # Lets name the container
    container_name: traefik
    command:
      # Enables the web UI
      - "--api.insecure=true"
      # Tells Traefik to listen to docker
      - "--providers.docker"
    ports:
      # The HTTP port
      - "80:80"
      # The Web UI (enabled by --api.insecure=true)
      - "8080:8080"
    volumes:
      # So that Traefik can listen to the Docker events
      - /var/run/docker.sock:/var/run/docker.sock
    #labels:
      #- "traefik.http.routers.router.rule=Host(`traefik.test.local/dashboard`)"
      #- "traefik.http.routers.router.rule=Host(`traefik.test.local/api`)"
    restart:
      always
I am not able to understand how to connect routers to services. Also, correct me if I am wrong anywhere. Thank you.
PS: OS: kde-neon
You can achieve this with the following definition. You need to add labels for both the routers and the services, not only the router:
proxy:
  image: traefik:v2.1
  command:
    - '--providers.docker=true'
    - '--entryPoints.web.address=:80'
    - '--entryPoints.metrics.address=:8082'
    - '--providers.providersThrottleDuration=2s'
    - '--providers.docker.watch=true'
    - '--providers.docker.swarmMode=true'
    - '--providers.docker.swarmModeRefreshSeconds=15s'
    - '--providers.docker.exposedbydefault=false'
    - '--providers.docker.defaultRule=Host("traefik.lvh.me")'
    - '--accessLog.bufferingSize=0'
    - '--api=true'
    - '--api.dashboard=true'
    - '--api.insecure=true'
    - '--ping.entryPoint=web'
  volumes:
    - '/var/run/docker.sock:/var/run/docker.sock:ro'
  ports:
    - '80:80'
    - '8080:8080'
  restart:
    always
  deploy:
    labels:
      - traefik.enable=true
      - traefik.docker.network=monitoring
      - traefik.http.services.traefik-dashboard.loadbalancer.server.port=8080
      - traefik.http.routers.traefik-dashboard.rule=Host(`dashboard.traefik.lvh.me`)
      - traefik.http.routers.traefik-dashboard.service=traefik-dashboard
      - traefik.http.routers.traefik-dashboard.entrypoints=web
      - traefik.http.services.traefik-api.loadbalancer.server.port=80
      - traefik.http.routers.traefik-api.rule=Host(`api.traefik.lvh.me`)
      - traefik.http.routers.traefik-api.service=traefik-api
      - traefik.http.routers.traefik-api.entrypoints=web
  logging:
    driver: json-file
    options:
      'max-size': '10m'
      'max-file': '5'
Also, if you use the lvh.me domain, you do not need to edit /etc/hosts, because lvh.me and its subdomains resolve to 127.0.0.1.
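The definition above targets swarm mode, where labels go under deploy. For a plain docker-compose setup like the one in the question, the labels sit directly on the container. The following is a minimal sketch under that assumption; the router name dashboard is hypothetical, and api@internal is Traefik's built-in service for the API and dashboard:

```yaml
version: "3.7"
services:
  traefik:
    image: traefik:v2.2
    command:
      # Enable the API (the dashboard is on by default once the API is enabled)
      - "--api=true"
      - "--providers.docker=true"
      - "--entrypoints.web.address=:80"
    ports:
      - "80:80"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    labels:
      # Host() matches only the host name; paths are matched with PathPrefix()
      - "traefik.http.routers.dashboard.rule=Host(`traefik.test.local`) && (PathPrefix(`/api`) || PathPrefix(`/dashboard`))"
      - "traefik.http.routers.dashboard.entrypoints=web"
      # Route to the internal API/dashboard service instead of a container port
      - "traefik.http.routers.dashboard.service=api@internal"
```

With this in place the dashboard should be reachable at http://traefik.test.local/dashboard/ (note the trailing slash) without exposing port 8080. In a real deployment the router would normally also get an authentication middleware.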

Traefik v2.1.4 - How to create a static route and redirect to a specific host and port

I'm a beginner with Traefik v2.1.4. I'm running it in a Docker container. I'm trying to set up a static route. I found some examples using the toml configuration file.
[providers]
[providers.file]
[http]
[http.routers]
[http.routers.netdata]
rule = "Host(`netdata.my-domain.com`)"
service = "netdata"
entrypoint=["http"]
[http.services]
[http.services.netdata.loadbalancer]
[[http.services.netdata.loadbalancer.servers]]
url = "https://192.168.0.2:19999"
Following this example I would like to convert it to docker labels of my docker-compose.
My docker-compose file:
version: "3.7"
services:
  traefik:
    image: traefik:v2.1.4
    container_name: traefik
    restart: always
    command:
      - "--log.level=DEBUG"
      - "--api.insecure=false"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.letsresolver.acme.tlschallenge=true"
      - "--certificatesresolvers.letsresolver.acme.email=my-email@domain.com"
      - "--certificatesresolvers.letsresolver.acme.storage=/letsencrypt/acme.json"
    labels:
      - "traefik.enable=true"
      # middleware redirect
      - "traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https"
      # global redirect to https
      - "traefik.http.routers.redirs.rule=hostregexp(`{host:.+}`)"
      - "traefik.http.routers.redirs.entrypoints=web"
      - "traefik.http.routers.redirs.middlewares=redirect-to-https"
      # dashboard
      - "traefik.http.routers.traefik.rule=Host(`traefik.my-domain.com`)"
      - "traefik.http.routers.traefik.service=api@internal"
      - "traefik.http.routers.traefik.middlewares=admin"
      - "traefik.http.routers.traefik.tls.certresolver=letsresolver"
      - "traefik.http.routers.traefik.entrypoints=websecure"
      - "traefik.http.middlewares.admin.basicauth.users=user:hash-passwordXXX"
    ports:
      - "80:80"
      - "443:443"
      - "8080:8080"
    volumes:
      - "./letsencrypt:/letsencrypt"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
networks:
  default:
    external:
      name: network
It is possible to use 2 providers together: file and docker.
Your docker-compose.yml:
services:
  traefik:
    image: traefik:2.2.1
    command: traefik --configFile=/etc/traefik/traefik.yml
    ports:
      - "80:80"
      - "8080:8080"
    volumes:
      - ./traefik.yml:/etc/traefik/traefik.yml
      - ./routes.yml:/etc/traefik/routes.yml
      - /var/run/docker.sock:/var/run/docker.sock
  # your services go here ...
Your traefik.yml:
api:
  dashboard: true
  insecure: true
entryPoints:
  web:
    address: :80
providers:
  docker: {}
  file:
    filename: /etc/traefik/routes.yml
    watch: true
Your routes.yml:
http:
  routers:
    hello:
      rule: PathPrefix(`/hello`)
      service: hello@docker
    world:
      rule: PathPrefix(`/world`)
      service: world@docker
These are only examples; don't use them directly in a production environment, of course.
There is no Docker label for specifying a url (see https://docs.traefik.io/v2.1/routing/providers/docker/#routers). I tried to use url instead of port, but it does not work.
So I suggest using the file provider (https://docs.traefik.io/v2.1/providers/file/).
Suggestion for implementation:
Update your config with:
services:
  ...
  traefik:
    ...
    command:
      ...
      - "--providers.file.directory=/path/to/dynamic/conf"
    configs:
      - source: redirect.toml
        target: /path/to/dynamic/conf/redirect.toml
  ...
...
configs:
  redirect.toml:
    file: redirect.toml
and create redirect.toml with your redirection (as in your example).
Of course, you can also bind-mount the config into the container, or create your own Traefik image containing the config, or ...
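The file provider also accepts YAML, so the redirection can be written in the same format as the compose file. This is a sketch of the question's TOML example translated to YAML; the file name redirect.yml is hypothetical, and the entry point is assumed to be the web entry point defined in the question's compose file rather than the http name used in the TOML example:

```yaml
# redirect.yml (hypothetical name), placed in the directory given by
# --providers.file.directory
http:
  routers:
    netdata:
      rule: "Host(`netdata.my-domain.com`)"
      service: netdata
      entryPoints:
        - web               # assumption: must match an entry point defined in the static config
  services:
    netdata:
      loadBalancer:
        servers:
          # Static target outside Docker, as in the question's example
          - url: "https://192.168.0.2:19999"
```

If both the docker and file providers are enabled, a router defined here can be referenced from other providers as netdata@file.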
In case you want to work with labels, you can start a service that forwards the traffic with socat:
services:
  ...
  netdata:
    image: alpine/socat
    command: tcp-listen:80,fork,reuseaddr tcp-connect:192.168.0.2:19999
    deploy:
      labels:
        traefik.enable: "true"
        traefik.http.routers.netdata.rule: Host(`netdata.my-domain.com`)
        traefik.http.services.netdata_srv.loadbalancer.server.port: 80
        # hm, and probably tell to forward as https, ...

How to configure prometheus with alertmanager?

docker-compose.yml:
This is the docker-compose file that runs the prometheus, node-exporter and alertmanager services. All the services are running fine. Even the health status in the Targets menu of Prometheus shows OK.
version: '2'
services:
  prometheus:
    image: prom/prometheus
    privileged: true
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./alertmanger/alert.rules:/alert.rules
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    ports:
      - '9090:9090'
  node-exporter:
    image: prom/node-exporter
    ports:
      - '9100:9100'
  alertmanager:
    image: prom/alertmanager
    privileged: true
    volumes:
      - ./alertmanager/alertmanager.yml:/alertmanager.yml
    command:
      - '--config.file=/alertmanager.yml'
    ports:
      - '9093:9093'
prometheus.yml
This is the Prometheus config file with the scrape targets and the Alertmanager target set. The Alertmanager target URL is working fine.
global:
  scrape_interval: 5s
  external_labels:
    monitor: 'my-monitor'
# this is where I have simple alert rules
rule_files:
  - ./alertmanager/alert.rules
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['some-ip:9093']
alert.rules:
Just a simple alert rule to show an alert when a service is down:
ALERT service_down
  IF up == 0
alertmanager.yml
This is to send the message on slack when alerting occurs.
global:
  slack_api_url: 'https://api.slack.com/apps/A90S3Q753'
route:
  receiver: 'slack'
receivers:
  - name: 'slack'
    slack_configs:
      - send_resolved: true
        username: 'tara gurung'
        channel: '#general'
        api_url: 'https://hooks.slack.com/services/T52GRFN3F/B90NMV1U2/QKj1pZu3ZVY0QONyI5sfsdf'
Problem:
All the containers are working fine, but I am not able to figure out the exact problem. What am I missing? Checking the alerts in Prometheus shows:
Alerts
No alerting rules defined
Your ./alertmanager/alert.rules file is not included in your docker config, so it is not available in the container. You need to add it to the prometheus service:
prometheus:
  image: prom/prometheus
  privileged: true
  volumes:
    - ./prometheus.yml:/etc/prometheus/prometheus.yml
    - ./alertmanager/alert.rules:/alertmanager/alert.rules
  command:
    - '--config.file=/etc/prometheus/prometheus.yml'
  ports:
    - '9090:9090'
You should probably also give an absolute path inside prometheus.yml:
rule_files:
  - "/alertmanager/alert.rules"
You also need to make sure your alerting rules are valid. Please see the Prometheus docs for details and examples. Your alert.rules file should look something like this:
groups:
  - name: example
    rules:
      # Alert for any instance that is unreachable for >5 minutes.
      - alert: InstanceDown
        expr: up == 0
        for: 5m
Once you have multiple files, it may be better to add the entire directory as a volume rather than individual files.
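Putting the pieces together, the relevant parts of prometheus.yml could look like the sketch below. It assumes the rule file is mounted at /alertmanager/alert.rules as shown above, and that Prometheus and Alertmanager share the default Compose network so the service name alertmanager resolves; the question used a placeholder address instead, so that target is an assumption:

```yaml
# prometheus.yml (sketch)
global:
  scrape_interval: 5s
rule_files:
  # Absolute path as seen inside the Prometheus container
  - /alertmanager/alert.rules
scrape_configs:
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']
alerting:
  alertmanagers:
    - static_configs:
        # Assumption: the Compose service name stands in for 'some-ip'
        - targets: ['alertmanager:9093']
```

After restarting Prometheus, the rule should appear on the Alerts page instead of "No alerting rules defined".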
If you need answers to this question see the explanation on this link
How to make alert rules visible on Prometheus User Interface?
The rule_files entry inside prometheus.yml should look like this:
rule_files:
  - "/etc/prometheus/alert.rules.yml"
You need to stop the alertmanager and prometheus containers and run this:
docker run -d --name prometheus_ops -p 9191:9090 -v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml -v $(pwd)/alert.rules.yml:/etc/prometheus/alert.rules.yml prom/prometheus
To verify that the rule file is visible, open a shell in the Prometheus container using its container ID and go to /etc/prometheus:
docker exec -it fa99f733f69b sh