A Jobber Docker container (running periodic tasks) outputs on stdout, which is captured by Filebeat (with Docker containers autodiscovery flag on) and then sent to Logstash (within an ELK stack) or to Elasticsearch directly.
Now on Kibana, the document looks as such:
#timestamp Jan 20, 2020 # 20:15:07.752
...
agent.type filebeat
container.image.name jobber_jobber
...
message {
"job": {
"command":"curl http://my.service/run","name":"myperiodictask",
"status":"Good",
"time":"0 */5 * * * *"
},
"startTime":1579540500,
"stdout":"{\"startDate\":\"2020-01-20T16:35:00.000Z\",\"endDate\":\"2020-01-20T17:00:00.000Z\",\"zipped\":true,\"size\":3397}",
"succeeded":true,
"user":"jobberuser",
"version":"1.4"
}
...
Note: above 'message' field is a simple string reflecting a json object; above displayed format is for clearer readability.
My goal is to be able to request Elastic on the message fields, so I can filter by Jobber tasks for instance.
How can I make that happen ?
I know Filebeat uses plugins and the container tags to apply this or that filter: are there any for Jobber? If not, how to do this?
Even better would be to be able to exploit the fields of the Jobber task result (under the 'stdout' field)! Could you please direct me to ways to implement that?
Filebeat provides processors to handle such tasks.
Below's a configuration to handle the needs "Decode the json in the 'message' field", "Decode the json in the 'stdout' within" (both using the decode_json_fields processor), and other Jobber-related needs.
Note that given example filter the events going through Filebeat by a 'custom-tag' label given to the Docker container hosting the Jobber process. The docker.container.labels.custom-tag: jobber condition should be replaced according to your usecase.
filebeat.yml:
processors:
# === Jobber events processing ===
- if:
equals:
docker.container.labels.custom-tag: jobber
then:
# Drop Jobber events which are not job results
- drop_event:
when:
not:
regexp:
message: "{.*"
# Json-decode event's message part
- decode_json_fields:
when:
regexp:
message: "{.*"
fields: ["message"]
target: "jobbertask"
# Json-decode message's stdout part
- decode_json_fields:
when:
has_fields: ["jobbertask.stdout"]
fields: ["jobbertask.stdout"]
target: "jobbertask.result"
# Drop event's decoded fields
- drop_fields:
fields: ["message"]
- drop_fields:
when:
has_fields: ["jobbertask.stdout"]
fields: ["jobbertask.stdout"]
The decoded fields are placed in the "jobbertask" field. This is to avoid index-mapping collision on the root fields. Feel free to replace "jobbertask" by any other field name, keeping care of mapping collisions.
In my case, this works whether Filebeat addresses the events to Logstash or to Elasticsearch directly.
Related
I m trying to remove some fields, I use filebeat 7.14 on Kubernetes
I tried as described in the doc
processors:
- drop_fields:
when:
contains
fields: ["host.os.name", "host.os.codename", "host.os.family"]
ignore_missing: false
container failed "ERROR instance/beat.go:989
Exiting: Failed to start crawler:
starting input failed: Error while initializing input:
missing or invalid condition
failed to initialize condition"
ignore_missing still messing
- drop_fields:
fields: ["host.os.name", "host.os.codename", "host.os.family"]
fields are still present
you don't seem to have a condition set under the when. take a look at https://www.elastic.co/guide/en/beats/filebeat/7.14/defining-processors.html#conditions and make sure you've got something for it to match
am using filebeat to forward incoming logs from haproxy to Kafka topic but after forwarding filebeat is adding so much metadata to the kafka message which consumes more memory which I want to avoid.
Example of message sinked to kafka from filebeat where it is adding metadata, host and lot of other things:
{
"#timestamp": "2017-03-27T08:14:09.508Z",
"beat": {
"hostname": "stage-kube03",
"name": "stage-kube03",
"version": "5.2.1"
},
"input_type": "log",
"message": {
"message": {
"activityType": null
},
"offset": 3783008,
"source": "/var/log/audit.log",
"type": "log"
}
How do I control/reduce the additional metadata filebeat adds to kafka message along with the log line payload? below is my filebeat.yml file
###################### Filebeat Configuration Example #########################
# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html
# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.
#=========================== Filebeat inputs =============================
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: log
# Change to true to enable this input configuration.
enabled: true
# Paths that should be crawled and fetched. Glob based paths.
paths:
- /var/log/haproxy.log
#exclude_files: [".gz$"]
#fields:
# codec: plain
# token: USER_TOKEN
# type: haproxy_log
#fields_under_root: true
#- c:\programdata\elasticsearch\logs\*
processors:
- drop_event:
# fields: ["prospector","event","dataset"]
# Exclude lines. A list of regular expressions to match. It drops the lines that are
# matching any regular expression from the list.
#exclude_lines: ['^DBG']
exclude_lines: ['^source']
# Include lines. A list of regular expressions to match. It exports the lines that are
# matching any regular expression from the list.
#include_lines: ['^ERR', '^WARN']
# Exclude files. A list of regular expressions to match. Filebeat drops the files that
# are matching any regular expression from the list. By default, no files are dropped.
#exclude_files: ['.gz$']
# Optional additional fields. These fields can be freely picked
# to add additional information to the crawled log files for filtering
#fields:
# level: debug
# review: 1
### Multiline options
# Multiline can be used for log messages spanning multiple lines. This is common
# for Java Stack Traces or C-Line Continuation
# The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
#multiline.pattern: ^\[
# Defines if the pattern set under pattern should be negated or not. Default is false.
#multiline.negate: false
# Match can be set to "after" or "before". It is used to define if lines should be append to a pattern
# that was (not) matched before or after or as long as a pattern is not matched based on negate.
# Note: After is the equivalent to previous and before is the equivalent to to next in Logstash
#multiline.match: after
#============================= Filebeat modules ===============================
filebeat.config.modules:
# Glob pattern for configuration loading
path: ${path.config}/modules.d/*.yml
# Set to true to enable config reloading
reload.enabled: false
# Period on which files under path should be checked for changes
#reload.period: 10s
#==================== Elasticsearch template setting ==========================
setup.template.settings:
index.number_of_shards: 3
#index.codec: best_compression
#_source.enabled: false
#================================ General =====================================
# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:
# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]
# Optional fields that you can specify to add additional information to the
# output.
#fields:
# env: staging
#============================== Dashboards =====================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here, or by using the `-setup` CLI flag or the `setup` command.
#setup.dashboards.enabled: false
# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:
#============================== Kibana =====================================
# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:
# Kibana Host
# Scheme and port can be left out and will be set to the default (http and 5601)
# In case you specify and additional path, the scheme is required: http://localhost:5601/path
# IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
#host: "localhost:5601"
# Kibana Space ID
# ID of the Kibana Space into which the dashboards should be loaded. By default,
# the Default Space will be used.
#space.id:
#============================= Elastic Cloud ==================================
# These settings simplify using filebeat with the Elastic Cloud (https://cloud.elastic.co/).
# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:
# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:
#================================ Outputs =====================================
# Configure what output to use when sending the data collected by the beat.
#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
# Array of hosts to connect to.
# hosts: ["localhost:9200"]
# Enabled ilm (beta) to use index lifecycle management instead daily indices.
#ilm.enabled: false
# Optional protocol and basic auth credentials.
#protocol: "https"
#username: "elastic"
#password: "changeme"
#----------------------------- Logstash output --------------------------------
#output.logstash:
# The Logstash hosts
#hosts: ["localhost:5044"]
# Optional SSL. By default is off.
# List of root certificates for HTTPS server verifications
#ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
# Certificate for SSL client authentication
#ssl.certificate: "/etc/pki/client/cert.pem"
# Client Certificate Key
#ssl.key: "/etc/pki/client/cert.key"
#================================ Processors =====================================
# Configure processors to enhance or manipulate events generated by the beat.
processors:
- add_host_metadata: ~
- add_cloud_metadata: ~
#================================ Logging =====================================
# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug
# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]
#============================== Xpack Monitoring ===============================
# filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster. This requires xpack monitoring to be enabled in Elasticsearch. The
# reporting is disabled by default.
# Set to true to enable the monitoring reporter.
#xpack.monitoring.enabled: false
# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well. Any setting that is not set is
# automatically inherited from the Elasticsearch output configuration, so if you
# have the Elasticsearch output configured, you can simply uncomment the
# following line.
#xpack.monitoring.elasticsearch:
output.kafka:
hosts: ["10.12.0.90:9092"]
topic: "data-meter-topic"
codec.json:
pretty: true
You need to remove the additional add_host_metadata and add_cloud_metadata metadata you're adding explicitly and remove the remainder of the fields with the drop_field processor:
I've tested your configuration and changed the following:
filebeat.inputs:
- type: log
enabled: true
paths:
- /var/log/*.log
output.console:
pretty: true
processors:
- drop_fields:
fields: ["agent", "log", "input", "host", "ecs" ]
#- add_host_metadata: ~
#- add_cloud_metadata: ~
The result:
{
"#timestamp": "2020-11-27T15:55:17.098Z",
"#metadata": {
"beat": "filebeat",
"type": "_doc",
"version": "7.10.0"
},
"message": "2020-11-27 00:29:58 status installed libc-bin:amd64 2.28-10"
}
According to the documentation, you can't remove some of the metadata, namely the #timestamp and type (which should include the #metadata field).
The drop_fields processor specifies which fields to drop if a certain
condition is fulfilled. The condition is optional. If it’s missing,
the specified fields are always dropped. The #timestamp and type
fields cannot be dropped, even if they show up in the drop_fields
list.
EDIT:
Since you appear to be running filebeat 5.2.1, I've tried the following configuration with even better success than filebeat 7.x:
filebeat.prospectors:
- input_type: log
paths:
- /var/log/*.log
output.console:
pretty: true
processors:
- drop_fields:
fields: ["log_type", "input_type", "offset", "beat", "source"]
Result:
{
"#timestamp": "2020-11-30T09:51:40.404Z",
"message": "2020-11-27 00:29:58 status half-configured vim:amd64 2:8.1.0875-5",
"type": "log"
}
EDIT2:
Conversely, because you've posted a filebeat 6.8.0 version output, I've also tested with this very same version:
filebeat.inputs:
- type: log
enabled: true
paths:
- /var/log/*.log
output.console:
pretty: true
processors:
- drop_fields:
fields: ["beat", "source", "prospector", "offset", "host", "log", "input", "event", "fileset" ]
#- add_host_metadata: ~
#- add_cloud_metadata: ~
Output:
{
"#timestamp": "2020-11-30T10:08:26.176Z",
"#metadata": {
"beat": "filebeat",
"type": "doc",
"version": "6.8.0"
},
"message": "2020-11-27 00:29:58 status unpacked vim:amd64 2:8.1.0875-5"
}
I have added output.mongodb in filebeat.yml file but it is showing error "Exiting: error initializing publisher: output type mongodb undefined"
Does anyone here has any different fail safe approach towards this requirement where I want to redirect filebeat output directly to mongo database?
Filbeat.yml file
###################### Filebeat Configuration Example #########################
# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html
# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.
#=========================== Filebeat inputs =============================
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: log
# Change to true to enable this input configuration.
enabled: true
# Paths that should be crawled and fetched. Glob based paths.
paths:
- /var/log/test.log
#- c:\programdata\elasticsearch\logs\*
# Exclude lines. A list of regular expressions to match. It drops the lines that are
# matching any regular expression from the list.
#exclude_lines: ['^DBG']
# Include lines. A list of regular expressions to match. It exports the lines that are
# matching any regular expression from the list.
#include_lines: ['^ERR', '^WARN']
# Exclude files. A list of regular expressions to match. Filebeat drops the files that
# are matching any regular expression from the list. By default, no files are dropped.
#exclude_files: ['.gz$']
# Optional additional fields. These fields can be freely picked
# to add additional information to the crawled log files for filtering
#fields:
# level: debug
# review: 1
### Multiline options
# Multiline can be used for log messages spanning multiple lines. This is common
# for Java Stack Traces or C-Line Continuation
# The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
#multiline.pattern: ^\[
# Defines if the pattern set under pattern should be negated or not. Default is false.
#multiline.negate: false
# Match can be set to "after" or "before". It is used to define if lines should be append to a pattern
# that was (not) matched before or after or as long as a pattern is not matched based on negate.
# Note: After is the equivalent to previous and before is the equivalent to to next in Logstash
#multiline.match: after
#============================= Filebeat modules ===============================
filebeat.config.modules:
# Glob pattern for configuration loading
path: ${path.config}/modules.d/*.yml
# Set to true to enable config reloading
reload.enabled: false
# Period on which files under path should be checked for changes
reload.period: 5s
#==================== Elasticsearch template setting ==========================
setup.template.settings:
index.number_of_shards: 2
#index.codec: best_compression
#_source.enabled: false
#================================ General =====================================
# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:
# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]
# Optional fields that you can specify to add additional information to the
# output.
#fields:
# env: staging
#============================== Dashboards =====================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false
# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:
#============================== Kibana =====================================
# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:
# Kibana Host
# Scheme and port can be left out and will be set to the default (http and 5601)
# In case you specify and additional path, the scheme is required: http://localhost:5601/path
# IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
host: "10.27.3.235:5601"
# Kibana Space ID
# ID of the Kibana Space into which the dashboards should be loaded. By default,
# the Default Space will be used.
#space.id:
#============================= Elastic Cloud ==================================
# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).
# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:
# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:
#================================ Outputs =====================================
# Configure what output to use when sending the data collected by the beat.
#-------------------------- Elasticsearch output ------------------------------
# output.elasticsearch:
# # Array of hosts to connect to.
# # hosts: ["10.27.3.235:9200"]
# hosts: ["http://10.27.3.235:9200"]
# index: "filebeatSYS-%{[agent.version]}-%{+yyyy.MM.dd}"
# setup.template:
# name: 'api-access'
# pattern: 'api-access-*'
# enabled: false
#
# # Optional protocol and basic auth credentials.
# #protocol: "https"
# #username: "elastic"
# #password: "changeme"
# #index: "filebeat-%{+yyyy.MM.dd}"
#-------------------------- MongoDB output ------------------------------
output.mongodb:
enabled: true
# URL format, according to mgo.v2 doc : [mongodb://][user:pass#]host1[:port1][,host2[:port2],...][/database][?options]
# More info : https://godoc.org/gopkg.in/mgo.v2#Dial
hosts: ["mongodb://<my-db-url-inserted-here>:27017"]
# The mongodb database to push to
db: "<my-db-here>"
# The database collection to push to
# Could be configured like key/keys of the Redis output : https://www.elastic.co/guide/en/beats/filebeat/current/redis-output.html#_key_2
collection: "filebeat"
# https://www.elastic.co/guide/en/beats/filebeat/current/redis-output.html#_loadbalance
loadbalance: true
# https://www.elastic.co/guide/en/beats/filebeat/current/redis-output.html#_timeout_4
timeout: 5s
# https://www.elastic.co/guide/en/beats/filebeat/current/redis-output.html#_max_retries_4
max_retries: 5
# https://www.elastic.co/guide/en/beats/filebeat/current/redis-output.html#_bulk_max_size_4
bulk_max_size: 2048
#----------------------------- Logstash output --------------------------------
#output.logstash:
# The Logstash hosts
#hosts: ["localhost:5044"]
# Optional SSL. By default is off.
# List of root certificates for HTTPS server verifications
#ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
# Certificate for SSL client authentication
#ssl.certificate: "/etc/pki/client/cert.pem"
# Client Certificate Key
#ssl.key: "/etc/pki/client/cert.key"
#================================ Processors =====================================
# Configure processors to enhance or manipulate events generated by the beat.
processors:
- add_host_metadata: ~
- add_cloud_metadata: ~
- add_docker_metadata: ~
- add_kubernetes_metadata: ~
#================================ Logging =====================================
# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug
# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]
#============================== X-Pack Monitoring ===============================
# filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster. This requires xpack monitoring to be enabled in Elasticsearch. The
# reporting is disabled by default.
# Set to true to enable the monitoring reporter.
#monitoring.enabled: false
# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Filebeat instance will appear in the Stack Monitoring UI. If output.elasticsearch
# is enabled, the UUID is derived from the Elasticsearch cluster referenced by output.elasticsearch.
#monitoring.cluster_uuid:
# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
#monitoring.elasticsearch:
#================================= Migration ==================================
# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true
You get the error
Exiting: error initializing publisher: output type mongodb undefined
because Filebeat does not support this kind of output. Take a look at the Output Configuration doc of Filebeat. There is no output for MongoDB mentioned. Filebeat supports only the following outputs:
Elasticsearch
Logstash
Kafka
Redis
File
Console
Elastic Cloud
By defining
output.mongodb:
Filebeat crashes because 'mongodb' is an unknown/undefined configuration-field in the output-element.
Does anyone here has any different fail safe approach towards this requirement where I want to redirect filebeat output directly to mongo database?
Logstash has a dedicated MongoDB output plugin. So you could send the data from Filebeat to Logstash which sends it to your MongoDB (this approach is not direct but a valid workaround).
Below is how im trying to add a custom fiels name in my filebeat 7.2.0
filebeat.inputs:
- type: log
enabled: true
paths:
- D:\Oasis\Logs\Admin_Log\*
- D:\Oasis\Logs\ERA_Log\*
- D:\OasisServices\Logs\*
processors:
- add_fields:
fields:
application: oasis
and with this, im expecting a new field called application whose data entries will be 'oasis'.
But i dont get any.
I also tried
fields:
application: oasis/'oasis'
Help me with this.
If you want to add a customized field for every log, you should put the "fields" configuration in the same level of type. Try the following:
- type: log
enabled: true
paths:
- D:\Oasis\Logs\Admin_Log\*
- D:\Oasis\Logs\ERA_Log\*
- D:\OasisServices\Logs\*
fields.application: oasis
There are two ways to add custom fields on filebeat, using the fields option and using the add_fields processor.
To add fields using the fields option, your configuration needs to be something like the one below.
filebeat.inputs:
- type: log
paths:
- 'D:/path/to/your/files/*'
fields:
custom_field: 'custom field value'
fields_under_root: true
To add fields using the add_fields processor, you can try the following configuration.
filebeat.inputs:
- type: log
paths:
- 'D:/path/to/your/files/*'
processors:
- add_fields:
target: ''
fields:
custom_field: 'custom field value'
Both configurations will create a field named custom_field with the value custom field value in the root of your document.
The fields option can be used per input and the add_fields processor is applied to all the data exported by the filebeat instance.
Just remember to pay attention to the indentation of your configuration, if it is wrong filebeat won't work correctly or even start.
Say I want to run a task only when a specific tag is NOT in the list of tags supplied on the command line, even if other tags are specified. Of these, only the last one will work as I expect in all situations:
- hosts: all
tasks:
- debug:
msg: 'not TAG (won't work if other tags specified)'
tags: not TAG
- debug:
msg: 'always, but not if TAG specified (doesn't work; always runs)'
tags: always,not TAG
- debug:
msg: 'ALWAYS, but not if TAG in ansible_run_tags'
when: "'TAG' not in ansible_run_tags"
tags: always
Try it with different CLI options and you'll hopefully see why I find this a bit perplexing:
ansible-playbook tags-test.yml -l HOST
ansible-playbook tags-test.yml -l HOST -t TAG
ansible-playbook tags-test.yml -l HOST -t OTHERTAG
Questions: (a) is that expected behavior? and (b) is there a better way or some logic I'm missing?
I'm surprised I had to dig into the (undocumented, AFAICT) variable ansible_run_tags.
Amendment: It was suggested that I post my actual use case. I'm using ansible to drive system updates on Debian family systems. I'm trying to notify at the end if a reboot is required unless the tag reboot was supplied, in which case cause a reboot (and wait for system to come back up). Here is the relevant snippet:
- name: check and perhaps reboot
block:
- name: Check if a reboot is required
stat:
path: /var/run/reboot-required
get_md5: no
register: reboot
tags: always,reboot
- name: Alert if a reboot is required
fail:
msg: "NOTE: a reboot required to finish uppdates."
when:
- ('reboot' not in ansible_run_tags)
- reboot.stat.exists
tags: always
- name: Reboot the server
reboot:
msg: rebooting after Ansible applied system updates
when: reboot.stat.exists or ('force-reboot' in ansible_run_tags)
tags: never,reboot,force-reboot
I think my original question(s) still have merit, but I'm also willing to accept alternative methods of accomplishing this same functionality.
For completeness, and since only #paul-sweeney has offered any alternative solution, I'll answer my own question with my current best solution and let people pick / up-vote their favorite:
---
- name: run only if 'TAG' not specified
debug:
msg: 'ALWAYS, but not if TAG in ansible_run_tags'
when: "'TAG' not in ansible_run_tags"
tags: always
I know it's an old(ish) question, but I had a similar requirement.
It's probably something best implemented another way ... but ... sometimes it can be useful.
I'd achieve it by setting a fact if the tag IS specified, then outputting the message only if the fact is not set, something like:
---
- name: "test task runs only if tag missing"
hosts: all
tasks:
- name: "suppress message if tag given"
set_fact: suppress_message=yes
tags: reboot,never
- name: "message"
debug:
msg: "You didn't say 'reboot'"
when: suppress_message is not defined
I think that we have states for controlling (example: started, restarted, stopped), states for installing (present,absent) and components (webserver, db,...).
Ansible is lacking a good separation of those 3 dimensions and mixing those 3 dimensions in a single tag system is leading to confusion.
For example, if you have a 'webserver' and a 'DB' tag, you want to 'restart' the DB and not the webserver using a 'restart' tag.
But it won't work if the 'restart' tasks of the DB and the webserver are in the same tasks file with the same 'restart' tag as the 'restart' tag will start both the DB and the webserver...
So you will have probably to separate webserver and DB tasks in 2 separate files and use the tag at the level of the include.
Using tags means that you have a tree of options, not a matrix of options.
I like the tag concept but the fact that it is not possible to use it in conditional expressions is making it less appealing.
What I recommend is to declare tags in a role but map them into variables as a first task. So the 'restart' and 'db' tags would become boolean variables in my role and use when: instead of tags:
ansible-playbook has a skip-tags option. The example from the docs is
ansible-playbook example.yml --skip-tags "packages"
https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_tags.html