How to configure prometheus with alertmanager? - docker-compose

docker-compose.yml:
This is the docker-compose to run the prometheus, node-exporter and alert-manager service. All the services are running great. Even the health status in target menu of prometheus shows ok.
version: '2'
services:
prometheus:
image: prom/prometheus
privileged: true
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
- ./alertmanger/alert.rules:/alert.rules
command:
- '--config.file=/etc/prometheus/prometheus.yml'
ports:
- '9090:9090'
node-exporter:
image: prom/node-exporter
ports:
- '9100:9100'
alertmanager:
image: prom/alertmanager
privileged: true
volumes:
- ./alertmanager/alertmanager.yml:/alertmanager.yml
command:
- '--config.file=/alertmanager.yml'
ports:
- '9093:9093'
prometheus.yml
This is the prometheus config file with targets and alerts target sets. The alertmanager target url is working fine.
global:
scrape_interval: 5s
external_labels:
monitor: 'my-monitor'
# this is where I have simple alert rules
rule_files:
- ./alertmanager/alert.rules
scrape_configs:
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090']
- job_name: 'node-exporter'
static_configs:
- targets: ['node-exporter:9100']
alerting:
alertmanagers:
- static_configs:
- targets: ['some-ip:9093']
alert.rules:
Just a simple alert rules to show alert when service is down
ALERT service_down
IF up == 0
alertmanager.yml
This is to send the message on slack when alerting occurs.
global:
slack_api_url: 'https://api.slack.com/apps/A90S3Q753'
route:
receiver: 'slack'
receivers:
- name: 'slack'
slack_configs:
- send_resolved: true
username: 'tara gurung'
channel: '#general'
api_url: 'https://hooks.slack.com/services/T52GRFN3F/B90NMV1U2/QKj1pZu3ZVY0QONyI5sfsdf'
Problems:
All the containers are working fine I am not able to figure out the exact problem.What am I really missing. Checking the alerts in prometheus shows.
Alerts
No alerting rules defined

Your ./alertmanager/alert.rules file is not included in your docker config, so it is not available in the container. You need to add it to the prometheus service:
prometheus:
image: prom/prometheus
privileged: true
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
- ./alertmanager/alert.rules:/alertmanager/alert.rules
command:
- '--config.file=/etc/prometheus/prometheus.yml'
ports:
- '9090:9090'
And probably give an absolute path inside prometheus.yml:
rule_files:
- "/alertmanager/alert.rules"
You also need to make sure you alerting rules are valid. Please see the prometheus docs for details and examples. You alert.rules file should look something like this:
groups:
- name: example
rules:
# Alert for any instance that is unreachable for >5 minutes.
- alert: InstanceDown
expr: up == 0
for: 5m
Once you have multiple files, it may be better to add the entire directory as a volume rather than individual files.

If you need answers to this question see the explanation on this link
How to make alert rules visible on Prometheus User Interface?
Your alert rules inside the prometheus.yml should look like this
rule_files:
- "/etc/prometheus/alert.rules.yml"
You need to stop the alertmanager and prometheus containers and run this
docker run -d --name prometheus_ops -p 9191:9090 -v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml -v $(pwd)/alert.rules.yml:/etc/prometheus/alert.rules.yml prom/prometheus
Verify if you can see the alert.rule config path : Prometheus container ID and go to cd /etc/prometheus
docker exec -it fa99f733f69b sh

Related

Unable to see services with traefik

I'm a beginner and Im a bit confused about how traefik works...
I want to use the app freqtrade (trading bot) as a docker service and replicate it with different type of configuration, if you have 5min you can go check this guy I want to do the same thing...
But I don't understant why I can't see my app running with traefik :
What I did :
Configure my domain to my server like that :
server config
And on this machine I create a docker swarm and the treafik service with this tutorial and then, my docker compose file look like that :
```
version: '3.3'
services:
traefik:
# Use the latest v2.2.x Traefik image available
image: traefik:v2.2
ports:
# Listen on port 80, default for HTTP, necessary to redirect to HTTPS
- 80:80
# Listen on port 443, default for HTTPS
- 443:443
networks:
- traefik-public
deploy:
placement:
constraints:
# Make the traefik service run only on the node with this label
# as the node with it has the volume for the certificates
- node.labels.traefik-public.traefik-public-certificates == true
labels:
# Enable Traefik for this service, to make it available in the public network
- traefik.enable=true
# Use the traefik-public network (declared below)
- traefik.docker.network=traefik-public
# Use the custom label "traefik.constraint-label=traefik-public"
# This public Traefik will only use services with this label
# That way you can add other internal Traefik instances per stack if needed
- traefik.constraint-label=traefik-public
# admin-auth middleware with HTTP Basic auth
# Using the environment variables USERNAME and HASHED_PASSWORD
- traefik.http.middlewares.admin-auth.basicauth.users=${USERNAME?Variable not set}:${HASHED_PASSWORD?Variable not set}
# https-redirect middleware to redirect HTTP to HTTPS
# It can be re-used by other stacks in other Docker Compose files
- traefik.http.middlewares.https-redirect.redirectscheme.scheme=https
- traefik.http.middlewares.https-redirect.redirectscheme.permanent=true
# traefik-http set up only to use the middleware to redirect to https
# Uses the environment variable DOMAIN
- traefik.http.routers.traefik-public-http.rule=Host(`${DOMAIN?Variable not set}`)
- traefik.http.routers.traefik-public-http.entrypoints=http
- traefik.http.routers.traefik-public-http.middlewares=https-redirect
# traefik-https the actual router using HTTPS
# Uses the environment variable DOMAIN
- traefik.http.routers.traefik-public-https.rule=Host(`${DOMAIN?Variable not set}`)
- traefik.http.routers.traefik-public-https.entrypoints=https
- traefik.http.routers.traefik-public-https.tls=true
# Use the special Traefik service api#internal with the web UI/Dashboard
- traefik.http.routers.traefik-public-https.service=api#internal
# Use the "le" (Let's Encrypt) resolver created below
- traefik.http.routers.traefik-public-https.tls.certresolver=le
# Enable HTTP Basic auth, using the middleware created above
- traefik.http.routers.traefik-public-https.middlewares=admin-auth
# Define the port inside of the Docker service to use
- traefik.http.services.traefik-public.loadbalancer.server.port=8080
volumes:
# Add Docker as a mounted volume, so that Traefik can read the labels of other services
- /var/run/docker.sock:/var/run/docker.sock:ro
# Mount the volume to store the certificates
- traefik-public-certificates:/certificates
command:
# Enable Docker in Traefik, so that it reads labels from Docker services
- --providers.docker
# Add a constraint to only use services with the label "traefik.constraint-label=traefik-public"
- --providers.docker.constraints=Label(`traefik.constraint-label`, `traefik-public`)
# Do not expose all Docker services, only the ones explicitly exposed
- --providers.docker.exposedbydefault=false
# Enable Docker Swarm mode
- --providers.docker.swarmmode
# Create an entrypoint "http" listening on port 80
- --entrypoints.http.address=:80
# Create an entrypoint "https" listening on port 443
- --entrypoints.https.address=:443
# Create the certificate resolver "le" for Let's Encrypt, uses the environment variable EMAIL
- --certificatesresolvers.le.acme.email=${EMAIL?Variable not set}
# Store the Let's Encrypt certificates in the mounted volume
- --certificatesresolvers.le.acme.storage=/certificates/acme.json
# Use the TLS Challenge for Let's Encrypt
- --certificatesresolvers.le.acme.tlschallenge=true
# Enable the access log, with HTTP requests
- --accesslog
# Enable the Traefik log, for configurations and errors
- --log
# Enable the Dashboard and API
- --api
volumes:
# Create a volume to store the certificates, there is a constraint to make sure
# Traefik is always deployed to the same Docker node with the same volume containing
# the HTTPS certificates
traefik-public-certificates:
networks:
traefik-public:
driver: overlay
attachable: true
```
And deploy it :
docker stack deploy -c traefik.yml traefik
After that traefik works fine. Why I can't see the port 8080 in my entrypoint ? is it important for others services ?
Entrypoint traefik
I try to disable the firewall in configuration of the server and also do ufw allow 8080 but nothing change...
I create my a application like I create traefik service with this docker-compose file :
---
version: '3'
networks:
traefik_traefik-public:
external: true
services:
freqtrade:
image: freqtradeorg/freqtrade:stable
# image: freqtradeorg/freqtrade:develop
# Use plotting image
# image: freqtradeorg/freqtrade:develop_plot
# Build step - only needed when additional dependencies are needed
# build:
# context: .
# dockerfile: "./docker/Dockerfile.custom"
restart: unless-stopped
container_name: freqtrade
volumes:
- "./user_data:/freqtrade/user_data"
# Expose api on port 8080 (localhost only)
# Please read the https://www.freqtrade.io/en/stable/rest-api/ documentation
# before enabling this.
networks:
- traefik_traefik-public
deploy:
mode: replicated
replicas: 1
placement:
constraints:
- node.role == manager
restart_policy:
condition: on-failure
delay: 5s
command: >
trade
--logfile /freqtrade/user_data/logs/freqtrade.log
--db-url sqlite:////freqtrade/user_data/tradesv3.sqlite
--config /freqtrade/user_data/config.json
--strategy SampleStrategy
labels:
- traefik.http.routers.bot001.tls=true'
- traefik.http.routers.bot001.rule=Host(`bot001.bots.lordgoliath.com`)'
- traefik.http.services.bot001.loadbalancer.server.port=8080'
and this is a part of the configuation file of the bot (to access to the UI)
"api_server": {
"enabled": true,
"enable_openapi": true,
"listen_ip_address": "0.0.0.0",
"listen_port": 8080,
"verbosity": "info",
"jwt_secret_key": "somethingrandom",
"CORS_origins": ["https://bots.lordgoliath.com"],
"username": "api",
"password": "api"
},
then :
docker stack deploy -c docker-compose.yml freqtrade
So I have that :
goliath#localhost:~/freqtrade_test/user_data$ docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
nkvpjjztjibg freqtrade_freqtrade replicated 1/1 freqtradeorg/freqtrade:stable
6qryu28ute9i traefik_traefik replicated 1/1 traefik:v2.2 *:80->80/tcp, *:443->443/tcp
I see the bot running with the command docker service logs freqtrade_freqtrade but
when I try to go on my domain to see it have only the Traefik dashboard and can't see anything else running.
traefik http
traefik https
how I can see my app freqtrade running ? how can I access to the bot UI via my domain ?
Thanks !
Sorry for my bad English I hope this is clear enough to understand my problem
UPDATE
docker service inspect --pretty freqtrade_freqtrade
ID: o6bpaso69i9n6etybtj09xsqi
Name: ft1_freqtrade
Labels:
com.docker.stack.image=freqtradeorg/freqtrade:stable
com.docker.stack.namespace=ft1
Service Mode: Replicated
Replicas: 1
Placement:
Constraints: [node.role == manager]
UpdateConfig:
Parallelism: 1
On failure: pause
Monitoring Period: 5s
Max failure ratio: 0
Update order: stop-first
RollbackConfig:
Parallelism: 1
On failure: pause
Monitoring Period: 5s
Max failure ratio: 0
Rollback order: stop-first
ContainerSpec:
Image: freqtradeorg/freqtrade:stable#sha256:3b2f2acb5b9cfedaa7b07cf56af01d1a750bce4c3054bdbaf40ac27935c984eb
Args: trade --logfile /freqtrade/user_data/logs/freqtrade.log --db-url sqlite:////freqtrade/user_data/tradesv3.sqlite --config /freqtrade/user_data/config.json --strategy SampleStrategy
Mounts:
Target: /freqtrade/user_data
Source: /home/goliath/freqtrade_test/user_data
ReadOnly: false
Type: bind
Resources:
Networks: traefik_traefik-public
Endpoint Mode: vip
UPDATE NEW docker-compose.yml
---
version: '3'
networks:
traefik_traefik-public:
external: true
services:
freqtrade:
image: freqtradeorg/freqtrade:stable
# image: freqtradeorg/freqtrade:develop
# Use plotting image
# image: freqtradeorg/freqtrade:develop_plot
# Build step - only needed when additional dependencies are needed
# build:
# context: .
# dockerfile: "./docker/Dockerfile.custom"
restart: unless-stopped
container_name: freqtrade
volumes:
- "./user_data:/freqtrade/user_data"
# Expose api on port 8080 (localhost only)
# Please read the https://www.freqtrade.io/en/stable/rest-api/ documentation
# before enabling this.
networks:
- traefik_traefik-public
deploy:
mode: replicated
replicas: 1
placement:
constraints:
- node.role == manager
restart_policy:
condition: on-failure
delay: 5s
labels:
- 'traefik.enabled=true'
- 'traefik.http.routers.bot001.tls=true'
- 'traefik.http.routers.bot001.rule=Host(`bot001.bots.lordgoliath.com`)'
- 'traefik.http.services.bot001.loadbalancer.server.port=8080'
command: >
trade
--logfile /freqtrade/user_data/logs/freqtrade.log
--db-url sqlite:////freqtrade/user_data/tradesv3.sqlite
--config /freqtrade/user_data/config.json
--strategy SampleStrategy
UPDATE docker network ls
goliath#localhost:~/freqtrade_test$ docker network ls
NETWORK ID NAME DRIVER SCOPE
003e00401b5d bridge bridge local
9f3d9a222928 docker_gwbridge bridge local
09a33afad0c9 host host local
r4u268yenm5u ingress overlay swarm
bed40e4a5c62 none null local
qo9w45gitke5 traefik_traefik-public overlay swarm
This is the minimal config you need to integrate in order to see the traefik dashboard on localhost:8080
version: "3.9"
services:
traefik:
image: traefik:latest
command: |
--api.insecure=true
ports:
- 8080:8080
Then, your minimal configuration to get traefik to route example.com to itself:
version: "3.9"
networks:
public:
attachable: true
name: traefik
services:
traefik:
image: traefik:latest
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
command: |
--api.insecure=true
--providers.docker.exposedbydefault=false
--providers.docker.swarmmode
--providers.docker.network=traefik
ports:
- 80:80
networks:
- public
deploy:
labels:
traefik.enable: "true"
traefik.http.routers.traefik.rule: Host(`example.com`)
traefik.http.services.traefik.loadbalancer.server.port: 8080
Now, minimal https support - using Traefik self signed certs to start with. Note that we configure tls on the https entrypoint, which means traefik implicitly creates http and https variants for each router.
version: "3.9"
networks:
public:
attachable: true
name: traefik
services:
traefik:
image: traefik:latest
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
command: |
--api.insecure=true
--providers.docker.exposedbydefault=false
--providers.docker.swarmmode
--providers.docker.network=traefik
--entrypoints.http.address=:80
--entrypoints.https.address=:443
--entrypoints.https.http.tls=true
deploy:
placement:
constraints:
- node.role == manager
ports:
# - 8080:8080
- 80:80
- 443:443
networks:
- public
deploy:
labels:
traefik.enable: "true"
traefik.http.routers.traefik.rule: Host(`example.com`)
traefik.http.services.traefik.loadbalancer.server.port: 8080
At this point, gluing in your le config should be simple.
Your freqtrade stack compose would need to be this. If this is a single node swarm, just omit the placement constraints, but when the swarm is large enough to have workers, then tasks that don't need to be on managers should explicitly be kept on workers.
Traefik needs to talk to the swarm api over the docker socket, which is on manager nodes only, which is why it must be node.role==manager.
version: "3.9"
networks:
traefik:
external: true
services:
freqtrade:
image: freqtradeorg/freqtrade:stable
command: ...
volumes: ...
networks:
- traefik
deploy:
placement:
constraints:
- node.role == worker
restart_policy:
max_attempts: 5
labels:
traefik.enabled: "true"
traefik.http.routers.bot001.rule: Host(`bot001.bots.lordgoliath.com`)
traefik.http.services.bot001.loadbalancer.server.port: 8080

How to add the Kafka Exporter as a data source to Grafana?

I'm trying a simplified example of using Kafka and the Kafka Exporter with Grafana. I have a docker-compose.yml similar to the following:
version: '2'
networks:
app-tier:
driver: bridge
services:
zookeeper:
image: 'bitnami/zookeeper:latest'
environment:
- 'ALLOW_ANONYMOUS_LOGIN=yes'
networks:
- app-tier
kafka:
image: 'bitnami/kafka:latest'
environment:
- KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
- ALLOW_PLAINTEXT_LISTENER=yes
networks:
- app-tier
kafka-exporter:
build: kafka-exporter
ports:
- "9308:9308"
networks:
- app-tier
entrypoint: ["run.sh"]
grafana:
image: grafana/grafana
ports:
- "3000:3000"
networks:
- app-tier
where run.sh is a wrapper script to keep retrying to run the exporter while Kafka is starting, see https://github.com/khpeek/kafka-exporter-example. The problem is that when I log into Grafana and try to add a Prometheus data source with URL http://kafka-exporter:9308/metrics, I get an error:
Error reading Prometheus: bad_response: readObjectStart: expect { or n, but found <, error found in #1 byte of ...|<html> |..., bigger context ...|<html> <head><title>Kafka Exporter</title>|...
Here is how it looks in the UI:
It seems like Grafana ignores the /metrics path and tries to scrape data from http://kafka-exporter:9308 directly, which indeed looks like the error message describes:
By contrast, the /metrics endpoint contains the actual metrics:
> curl localhost:9308/metrics
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0
go_gc_duration_seconds{quantile="0.5"} 0
go_gc_duration_seconds{quantile="0.75"} 0
go_gc_duration_seconds{quantile="1"} 0
go_gc_duration_seconds_sum 0
go_gc_duration_seconds_count 0
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 20
...
Why am I getting this error? Shouldn't Grafana pick up the path?
To show the implementation of #usuario's answer, the docker-compose.yml needs a Prometheus service,
version: '2'
networks:
app-tier:
driver: bridge
services:
zookeeper:
image: 'bitnami/zookeeper:latest'
environment:
- 'ALLOW_ANONYMOUS_LOGIN=yes'
networks:
- app-tier
kafka:
image: 'bitnami/kafka:latest'
environment:
- KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
- ALLOW_PLAINTEXT_LISTENER=yes
networks:
- app-tier
kafka-exporter:
build: kafka-exporter
ports:
- "9308:9308"
networks:
- app-tier
entrypoint: ["run.sh"]
cli:
image: 'bitnami/kafka:latest'
environment:
- KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
- ALLOW_PLAINTEXT_LISTENER=yes
networks:
- app-tier
prometheus:
image: bitnami/prometheus:latest
ports:
- "9090:9090"
volumes:
- "./prometheus/prometheus.yml:/opt/bitnami/prometheus/conf/prometheus.yml"
networks:
- app-tier
grafana:
image: grafana/grafana
ports:
- "3000:3000"
networks:
- app-tier
where the prometheus/prometheus.yml configuration file is statically configured to scrape the Kafka Exporter:
global:
scrape_interval: 10s
scrape_timeout: 10s
evaluation_interval: 1m
scrape_configs:
- job_name: kafka-exporter
metrics_path: /metrics
honor_labels: false
honor_timestamps: true
sample_limit: 0
static_configs:
- targets: ['kafka-exporter:9308']
Now the Prometheus data source can be added and metrics from the Kafka Exporter such as kafka_consumergroup_lag can be viewed:
You cannot connect Grafana Prometheus datasource to an exporter directly. You need to set up a Prometheus server, scrape from that server the kafka exporter metrics and finally connect Grafana to the Prometheus server.

Set up Prometheus with Docker Compose

I am new to Prometheus and Docker Compose. I have a project structure with docker-compose.yml file that setup Prometheus and Grafana:
/prometheus-grafana/prometheus/docker-compose.yml
version: '3'
services:
prometheus:
image: prom/prometheus:v2.21.0
ports:
- 9000:9090
volumes:
- ./prometheus:/etc/prometheus
- prometheus-data:/prometheus
command: --web.enable-lifecycle --config.file=/etc/prometheus/prometheus.yml
grafana:
image: grafana/grafana:$GRAFANA_VERSION
environment:
GF_SECURITY_ADMIN_USER: $GRAFANA_ADMIN_USER
GF_SECURITY_ADMIN_PASSWORD: $GRAFANA_ADMIN_PASSWORD
ports:
- 3000:3000
volumes:
- grafana-storage:/var/lib/grafana
depends_on:
- prometheus
networks:
- internal
networks:
internal:
volumes:
prometheus-data:
grafana-storage:
I just added more config in /prometheus-grafana/prometheus/prometheus/prometheus.yml like this, the part that I added is the last 3 lines:
global:
scrape_interval: 30s
scrape_timeout: 10s
rule_files:
- alert.yml
scrape_configs:
- job_name: services
metrics_path: /metrics
static_configs:
- targets:
- 'prometheus:9090'
- 'idonotexists:564'
- job_name: myapp
scrape_interval: 10s
static_configs:
- targets:
- localhost:2112
I started Prometheus by running: docker-compose up -d, I got to Prometheus on http://localhost:9000/graph but I don't see the new config I added to prometheus.yml, in docker-compose.yml file, there is a line:
command: --web.enable-lifecycle --config.file=/etc/prometheus/prometheus.yml
Am I supposed to change this too refers to the path of my another /prometheus-grafana/prometheus/prometheus/prometheus.yml in my project instead of from /etc/prometheus/prometheus.yml as in reality, this file doesn't exist /etc/prometheus/prometheus.yml
Thank you in advance.
If you have prometheus configured with --config.file=/etc/prometheus/prometheus.yml it expects its config file to be at exact this position.
Since you have the the following mount points already configured:
volumes:
- ./prometheus:/etc/prometheus
- prometheus-data:/prometheus
you just need to put your config file into ./prometheus and name it prometheus.yml

traefik v2.2 help using only docker-compose router service entrypoint

Started learning about docker, traefik for playing in home.
Aim: Put everything all together in docker-compose.yml and .env files, understand basics, comment accordingly.
Want to get dashboard from traefik.test.local/dashboard rather test.local:8080, similarly api should be accessed from traefik.test.local/api. So that don't have to think about port numbers.
added lines to /etc/hosts
127.0.0.1 test.local
127.0.0.1 traefik.test.local
docker-compose.yml
version: "3.7"
services:
traefik:
# The official v2 Traefik docker image
image: traefik:v2.2
# Lets name the container
container_name: traefik
command:
# Enables the web UI
- "--api.insecure=true"
# Tells Traefik to listen to docker
- "--providers.docker"
ports:
# The HTTP port
- "80:80"
# The Web UI (enabled by --api.insecure=true)
- "8080:8080"
volumes:
# So that Traefik can listen to the Docker events
- /var/run/docker.sock:/var/run/docker.sock
#labels:
#- "traefik.http.routers.router.rule=Host(`traefik.test.local/dashboard`)"
#- "traefik.http.routers.router.rule=Host(`traefik.test.local/api`)"
restart:
always
Not able to understand how to connect from router to services. Also correct me if I am wrong anywhere. Thank you.
PS: OS: kde-neon
you can achieve this using the following definition, you need to add labels for the routers and service and not only the router
proxy:
image: traefik:v2.1
command:
- '--providers.docker=true'
- '--entryPoints.web.address=:80'
- '--entryPoints.metrics.address=:8082'
- '--providers.providersThrottleDuration=2s'
- '--providers.docker.watch=true'
- '--providers.docker.swarmMode=true'
- '--providers.docker.swarmModeRefreshSeconds=15s'
- '--providers.docker.exposedbydefault=false'
- '--providers.docker.defaultRule=Host("traefik.lvh.me")'
- '--accessLog.bufferingSize=0'
- '--api=true'
- '--api.dashboard=true'
- '--api.insecure=true'
- '--ping.entryPoint=web'
volumes:
- '/var/run/docker.sock:/var/run/docker.sock:ro'
ports:
- '80:80'
- '8080:8080'
restart:
always
deploy:
labels:
- traefik.enable=true
- traefik.docker.network=monitoring
- traefik.http.services.traefik-dashboard.loadbalancer.server.port=8080
- traefik.http.routers.traefik-dashboard.rule=Host(`dashboard.traefik.lvh.me`)
- traefik.http.routers.traefik-dashboard.service=traefik-dashboard
- traefik.http.routers.traefik-dashboard.entrypoints=web
- traefik.http.services.traefik-api.loadbalancer.server.port=80
- traefik.http.routers.traefik-api.rule=Host(`api.traefik.lvh.me`)
- traefik.http.routers.traefik-api.service=traefik-api
- traefik.http.routers.traefik-api.entrypoints=web
logging:
driver: json-file
options:
'max-size': '10m'
'max-file': '5'
also if you use lvh.me domain you not need to edit /etc/hosts

Alert message not showing up in slack using prometheus and alertmanager

I am trying to get the alert found by Prometheus to be notified in slack using alertmanager.
This is the alert.rules file and is working fine
groups:
- name: Instances
rules:
# Alert for any instance that is unreachable for >5 minutes.
- alert: InstanceDown
expr: up == 0
for: 5m
labels:
severity: page
# Prometheus templates apply here in the annotation and label fields of the alert.
annotations:
description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'
summary: 'Instance {{ $labels.instance }} down'
It is showing the one instance down successfully.
But What is the problem in my alertmanager.yml that it is not sending the notifications to slack. I have also setup the slack webhook successfully and have even tested if the hook is working fine while created a hook with the service that is provided by the slack
alertmanager.yml
groups:
- name: Instances
rules:
# Alert for any instance that is unreachable for >5 minutes.
- alert: InstanceDown
expr: up == 0
for: 5m
labels:
severity: page
# Prometheus templates apply here in the annotation and label fields of the alert.
annotations:
description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'
summary: 'Instance {{ $labels.instance }} down'
[tgurung#ip131 prometheus_graphana_myversion]$ cat alertmanager/alertmanager.yml
route:
receiver: 'slack-notifications'
#group_by: [alertname, datacenter, app]
receivers:
- name: 'slack-notifications'
slack_configs:
- api_url: https://hooks.slack.com/services/T52GRFN3F/B93KTCUHH/JC
channel: #general
send_resolved: true
# Alertmanager templates apply here.
text: "<!channel> \nsummary: {{ .CommonAnnotations.summary }}\ndescription: {{ .CommonAnnotations.description }}"
When running docker-compose up I get following
prometheus_1 | level=error ts=2018-02-06T09:36:35.580565429Z caller=notifier.go:454 component=notifier alertmanager=http://x.x.x.x:9093/api/v1/alerts count=0 msg="Error sending alert" err="Post http://x.X.x.x:9093/api/v1/alerts: dial tcp x.x.x.x:9093: getsockopt: no route to host"
Solving above error:
To solve the above routing issues I ran the alertmanager in completely new instance and than that error is overcome
Going to the API links in the error message I can see this
{"status":"success","data":[]}
And this is of alert_manager and looks working great.
alertmanager_1 | level=info ts=2018-02-06T09:36:37.66654544Z caller=main.go:141 msg="Starting Alertmanager" version="(version=0.13.0, branch=HEAD, revision=fb713f6d8239b57c646cae30f78e8b4b8861a1aa)"
alertmanager_1 | level=info ts=2018-02-06T09:36:37.66661402Z caller=main.go:142 build_context="(go=go1.9.2, user=root#d83981af1d3d, date=20180112-10:32:46)"
alertmanager_1 | level=info ts=2018-02-06T09:36:37.668103448Z caller=main.go:279 msg="Loading configuration file" file=/alertmanager/alertmanager.yml
alertmanager_1 | level=info ts=2018-02-06T09:36:37.673288146Z caller=main.go:354 msg=Listening address=:9093
Here is the prometheus.yml config file
global:
scrape_interval: 5s
external_labels:
monitor: 'my-monitor'
#alerting rules file
rule_files:
- '/alertmanager/alert.rules'
scrape_configs:
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090']
- job_name: 'node-exporter'
static_configs:
- targets: ['node-exporter:9100']
alerting:
alertmanagers:
- static_configs:
- targets: ["54.36.X.X:9093"] #this is the alertmanager service url
Here is my docker-compose.yml
version: '2'
volumes:
grafana_data: {}
services:
prometheus:
image: prom/prometheus
privileged: true
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
- ./alertmanager/alert.rules:/alertmanager/alert.rules
- ./alertmanager/alertmanager.yml:/alertmanager/alertmanager.yml
command:
- '--config.file=/etc/prometheus/prometheus.yml'
ports:
- '9090:9090'
links:
- "alertmanager"
node-exporter:
image: prom/node-exporter
ports:
- '9100:9100'
alertmanager:
image: prom/alertmanager
privileged: true
volumes:
- ./alertmanager/alertmanager.yml:/alertmanager/alertmanager.yml
command:
- '--config.file=/alertmanager/alertmanager.yml'
ports:
- '9093:9093'
The alertmanager status link does not show the config being passed from volume in docker-composer. It's showing default configs
As some people have already pointed out, alertmanager configuration only includes how to send alerts and NOT to create alerts. That's prometheus' job.
Take a look at this repo, which sets prometheus with alertmanager very simply
https://github.com/stefanprodan/dockprom