Splunk Kafka Add-on doesn't read Chef-managed configuration files

We are using Chef to manage our infrastructure, and I'm running into an issue where the Splunk TA (Add-on for Kafka) simply refuses to acknowledge the kafka_credentials.conf file I've dropped into the local directory of the add-on. If I use the Web UI, it generates an entry properly and it shows up in the add-on configuration.
[root@ip-10-14-1-42 local]# ls
app.conf inputs.conf kafka.conf kafka_credentials.conf
[root@ip-10-14-1-42 local]# grep -nr "" *.conf
app.conf:1:# MANAGED BY CHEF. PLEASE DO NOT MODIFY!
app.conf:2:[install]
app.conf:3:is_configured = 1
inputs.conf:1:# MANAGED BY CHEF. PLEASE DO NOT MODIFY!
inputs.conf:2:[kafka_mod]
inputs.conf:3:interval = 60
inputs.conf:4:start_by_shell = false
inputs.conf:5:
inputs.conf:6:[kafka_mod://my_app]
inputs.conf:7:kafka_cluster = default
inputs.conf:8:kafka_topic = log-my_app
inputs.conf:9:kafka_topic_group = my_app
inputs.conf:10:kafka_partition_offset = earliest
inputs.conf:11:index = main
kafka.conf:1:# MANAGED BY CHEF. PLEASE DO NOT MODIFY!
kafka.conf:2:[global_settings]
kafka.conf:3:log_level = INFO
kafka.conf:4:index = main
kafka.conf:5:use_kv_store = 0
kafka.conf:6:use_multiprocess_consumer = 1
kafka.conf:7:fetch_message_max_bytes = 1048576
kafka_credentials.conf:1:# MANAGED BY CHEF. PLEASE DO NOT MODIFY!
kafka_credentials.conf:2:[default]
kafka_credentials.conf:3:kafka_brokers = 10.14.2.164:9092,10.14.2.194:9092
kafka_credentials.conf:4:kafka_partition_offset = earliest
kafka_credentials.conf:5:index = main
Upon restarting Splunk, the add-on is installed and the input is even created under the Inputs section, but the cluster itself is "not available". Examining the logs, I see this:
2017-08-09 01:40:25,442 INFO pid=29212 tid=MainThread file=kafka_mod.py:main:168 | Start Kafka
2017-08-09 01:40:30,508 INFO pid=29212 tid=MainThread file=kafka_config.py:_get_kafka_clusters:228 | Clusters: {}
2017-08-09 01:40:30,509 INFO pid=29212 tid=MainThread file=kafka_config.py:__init__:188 | No Kafka cluster are configured
It seems like this plugin only respects clusters created through the Web UI. That is not going to work, as we want to be able to fully configure this through Chef. Short of hacking the REST API, or fudging around with the .py files in the add-on directory and forcing a dictionary in, what are my options?
Wondering if anyone has encountered this before.

If I had to guess, it is silently rejecting the files because # is not traditionally used for comments in INI files. Try a ; instead.
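For example, the Chef-rendered kafka_credentials.conf would then look like this (same stanza as above, only the comment character changed):
; MANAGED BY CHEF. PLEASE DO NOT MODIFY!
[default]
kafka_brokers = 10.14.2.164:9092,10.14.2.194:9092
kafka_partition_offset = earliest
index = main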

TypeError when using the Slurm queue to submit pyiron jobs

I'm facing some issues while running pyiron jobs on my HPC via the pysqa adapter. I had accidentally erased the main pyiron directory containing the pyiron, projects and resources folders, and copied all three back from another cluster. The only thing that I think will cause a problem is the sqlite.db file in the resources folder. Previously, I had no issues running interactive VASP jobs through the adapter, so I'm guessing something happened after the deletion incident.
The pyiron version I'm using is: 0.2.17
Here is a minimal example using an interactive VASP job that I have tried:
from pyiron import Project
from pysqa import QueueAdapter

pr = Project('Al-test')
structure = pr.create_structure('Al', 'fcc', 4.05)
pr.remove_jobs(recursive=True)

# queue configuration used by the pysqa adapter
sqa = QueueAdapter(directory='~/pyiron/resources/queues/')
sqa.queue_view
pr.job_table()

# interactive VASP job submitted to the slurm queue
job = pr.create_job(pr.job_type.Vasp, 'job_int')
job.structure = structure
job.server.run_mode.interactive = True
job.executable.executable_path = '~/pyiron/resources/vasp/bin/run_vasp_5.4.4_std_mpi.sh'
job.input.incar['NCORE'] = 4
job.server.queue = 'slurm'
job.server.cores = 16
job.server.view_queues()
sqa.get_queue_status()
job.run(run_again=True)
End of the error log:
~/pyiron/pyiron/pyiron/base/server/generic.py in queue_id(self, qid)
208 qid (int): queue ID
209 """
--> 210 self._queue_id = int(qid)
211
212 @property
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
Some inputs/feedback on this would be greatly appreciated.
Thanks!
We updated the queuing system interface in pyiron 0.3.X; you can read more about this here:
https://pyiron.org/news/releases/2020/09/06/pyiron-0-3-X-HPC-release.html
For pyiron 0.3.X we have a detailed installation guide available on readthedocs.org:
https://pyiron.readthedocs.io/en/latest/source/installation.html#remote-hpc-cluster
So I highly recommend updating to pyiron 0.3.13.
Apart from this, the error message basically says that the submission was not successful. If you navigate to the job's working directory (job.working_directory) you should find a run_queue.sh script; this is the script pyiron uses to submit the job to the queuing system. You can try to submit it manually using sbatch run_queue.sh, which should print the queue ID if successful and otherwise the error message from your queuing system.
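A minimal sketch of that manual check (the working directory below is hypothetical; use whatever job.working_directory prints for your job):
# print(job.working_directory) in the Python session gives the folder containing run_queue.sh
cd ~/pyiron/projects/Al-test/job_int_hdf5/job_int   # hypothetical path, replace with the printed one
sbatch run_queue.sh   # on success Slurm prints 'Submitted batch job <id>', otherwise its error message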

What is the correct configuration of mod_ping on ejabberd 18.12.1?

I am using ejabberd server version 18.12.1 with stream management enabled. When a user disconnects from the internet, their presence remains online, so I decided to use mod_ping to kill the connection after a timeout.
I used the following config in the ejabberd.yml file:
mod_ping:
  send_pings: true
  ping_ack_timeout: 32
  timeout_action: kill
considering the default value of ping_interval: 60.
Ping does not seem to be working with this configuration. Am I missing any other configuration? Should the client enable something to make this work? Is there any ping log that I can check?
Note: on the modules page of the ejabberd web admin, the value of ping_ack_timeout for mod_ping seems to be different from the one in the ejabberd.yml file. Why is that?
[{ping_interval,60},
{ping_ack_timeout,32000},
{send_pings,true},
{timeout_action,kill}]
Note: on the modules page of the ejabberd web admin, the value of ping_ack_timeout for mod_ping seems to be different from the one in the ejabberd.yml file. Why is that?
That is expected: you set the human-configurable option in seconds, and later the internal time value is expressed in milliseconds (the time unit used by Erlang).
Am I missing any other configuration? Should the client enable something to make this work? Is there any ping log that I can check?
That should be enough. Try with other clients, just to check whether that makes any difference. I've installed ejabberd 18.12 and configured it like this:
loglevel: 5
...
mod_ping:
  send_pings: true
  ping_interval: 10
  ping_ack_timeout: 15
  timeout_action: kill
Then I start ejabberd and log in with the Tkaber client (but I think any client is good for testing ping). Every ten seconds, the client receives this query:
<iq to='user1@localhost/tka1'
from='user1@localhost'
type='get'
id='rr-1552642185584-13814872912241253802-5xOvCCobbU2TCC/RT4GaqD6M8bo=-55238004'>
<ping xmlns='urn:xmpp:ping'/>
</iq>
And at the same time, the ejabberd log file shows several messages, starting with this one:
10:29:30.585 [debug] route:
#iq{id = <<"rr-1552642185584-13814872912241253802-5xOvCCobbU2TCC/RT4GaqD6M8bo=-55238004">>,
type = get,lang = <<>>,
from = #jid{user = <<"user1">>,server = <<"localhost">>,resource = <<>>,
luser = <<"user1">>,lserver = <<"localhost">>,
lresource = <<>>},
to = #jid{user = <<"user1">>,server = <<"localhost">>,
resource = <<"tka1">>,luser = <<"user1">>,
lserver = <<"localhost">>,lresource = <<"tka1">>},
sub_els = [#ping{}],
meta = #{}}
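As for whether the client needs to enable anything: it just has to answer each ping with an empty IQ result carrying the same id, roughly like the sketch below; if no answer arrives within ping_ack_timeout, the timeout_action is applied.
<iq to='user1@localhost'
type='result'
id='rr-1552642185584-13814872912241253802-5xOvCCobbU2TCC/RT4GaqD6M8bo=-55238004'/>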

rsyslog 5.8 imfile outside /var/log not picking up log files

I would like to pick up logs of different types from various locations other than /var/log and send them to a central location.
Using RHEL 6.6 and rsyslog 5.8, the configuration works fine when using a path within /var/log. If I use another path, like /opt/appname/log/file.log, the rsyslog client does not pick up the log. I do not see any error or message when running rsyslogd in debug mode.
Example:
Client:
...
$InputFileName /opt/appname/test.log
$InputFileTag APPNAME1
$InputFileStateFile stat-APPNAME1
$InputFileSeverity info
$InputFilePersistStateInterval 200
$InputFileFacility local3 # also tried with other local facilities
$InputRunFileMonitor
...
Server:
...
$template HostAudit, "/opt/logs/%HOSTNAME%/test.log" # tried different paths
$template auditFormat, "%msg%\n"
local3.* ?HostAudit;auditFormat
...
Any recommendations? I appreciate your help!
Bill
I would first try these:
Verify that the state file names are unique
Verify that every $InputFileName points to an existing regular file
Remove some of the files that you want to be monitored from the configuration. It could be that there is a problem with only one of the monitored files. That would make rsyslog ignore the rest of the files.
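If you want to see which state files imfile has actually written, they live in rsyslog's work directory; on RHEL that is usually /var/lib/rsyslog (an assumption, check your $WorkDirectory setting):
ls /var/lib/rsyslog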
I had this with "$InputFileStateFile tomcat-log" for each of the individual Tomcat logs. Each of the state file names needs to be unique. For me it worked by changing it to instances of:
"$InputFileStateFile tomcat-manager"
"$InputFileStateFile tomcat-localhost"
etc...
Another option is to just add numbers to the end of the state file name.
"$InputFileStateFile tomcat-log1"
"$InputFileStateFile tomcat-log2"

Icinga2 check_load plugin thresholds

I've recently installed Icinga2 on a bunch of Ubuntu LXC containers. I have a master node where you can log into icingaweb to check status.
However, the load thresholds seem low and I cannot see how or even where you can adjust the parameters. May I ask someone to point me in the right direction? Is this done on the master or the remote nodes? What's the file and where does it sit in the file structure?
I installed Icinga2 on an Ubuntu 16.04 server from the Icinga2 PPA.
Create a service definition for load on the master:
apply Service "load" {
  import "generic-service"
  check_command = "load"
  vars.load_wload1 = 5
  vars.load_wload5 = 4
  vars.load_wload15 = 3
  vars.load_cload1 = 10
  vars.load_cload5 = 6
  vars.load_cload15 = 4
  command_endpoint = host.address
  assign where host.name == "monitored client"
}
More info here
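Those vars end up as the warning and critical triples passed to the check_load plugin, so you can verify the thresholds by running it manually on a node (the plugin path below is the usual Ubuntu location; adjust it if yours differs):
/usr/lib/nagios/plugins/check_load -w 5,4,3 -c 10,6,4
# prints OK, WARNING or CRITICAL together with the current 1/5/15 minute load averages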

Creating PostgreSQL DataSource via pax-jdbc config file on karaf 4

On my Karaf 4.0.8 I've installed the feature pax-jdbc-postgresql. The DataSourceFactory for PostgreSQL is installed:
[org.osgi.service.jdbc.DataSourceFactory]
osgi.jdbc.driver.class org.postgresql.Driver
osgi.jdbc.driver.name PostgreSQL JDBC Driver
osgi.jdbc.driver.version PostgreSQL 9.4 JDBC4.1 (build 1203)
service.bundleid 204
service.scope singleton
Using Bundles com.eclipsesource.jaxrs.publisher (184)
I've created the file etc/org.ops4j.datasource-psql-sandbox.cfg:
osgi.jdbc.driver.class=org.postgresql.Driver
osgi.jdbc.driver.name=PostgreSQL
url=jdbc:postgresql://localhost:5432/sandbox
dataSourceName=psql-sandbox
user=sandbox
password=sandbox
After that, I see the confirmation in karaf.log that the file was processed:
2017-02-10 14:54:17,468 | INFO | 41-88b277ae0921) | DataSourceRegistration | 154 - org.ops4j.pax.jdbc.config - 0.9.0 | Detected config for DataSource psql-sandbox. Tracking DSF with filter (&(objectClass=org.osgi.service.jdbc.DataSourceFactory)(osgi.jdbc.driver.class=org.postgresql.Driver)(osgi.jdbc.driver.name=PostgreSQL))
However, I see no new DataSource in the services list in the console. What went wrong? I see no exceptions in the log ...
The log message tells you that the config was processed and that it is now searching for a suitable DataSourceFactory OSGi service.
The problem in your case is that it does not find such a service. So, to debug this, you should list all DataSourceFactory services and check their properties:
service:list DataSourceFactory
In my case it shows this:
[org.osgi.service.jdbc.DataSourceFactory]
-----------------------------------------
osgi.jdbc.driver.class = org.postgresql.Driver
osgi.jdbc.driver.name = PostgreSQL JDBC Driver
...
As you can see, it does not match the filter from the log: the config sets osgi.jdbc.driver.name to PostgreSQL while the service advertises PostgreSQL JDBC Driver. Generally you should only provide either osgi.jdbc.driver.class or osgi.jdbc.driver.name, not both. If you remove the osgi.jdbc.driver.name line, the config will work.
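For example, the corrected etc/org.ops4j.datasource-psql-sandbox.cfg from the question would then simply be:
osgi.jdbc.driver.class=org.postgresql.Driver
url=jdbc:postgresql://localhost:5432/sandbox
dataSourceName=psql-sandbox
user=sandbox
password=sandbox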
There is no error message because the system cannot know whether the error is transient or not. Basically, as soon as you install a matching OSGi service, the DataSource will be created.