Add a comment/description to supervisord process - supervisord

Is it possible to add a comment/description to a supervisor process?
Right now, when I run the status action, I get:
supervisor> status api_stock
api_stock RUNNING pid 7875, uptime 0:10:09
And I would like something like:
api_stock RUNNING pid 7875, uptime 0:10:09 desc: This app is maintained by dev team 2

Related

VS Code Remote-SSH: server status check failed - waiting and retrying

Several times I have been unable to connect to the remote host because of "server status check failed - waiting and retrying".
However, when I delete the "data" directory and the files ending in '.log', '.pid' or '.token' under the ".vscode-server" directory on the remote server, the problem goes away.[1]
[1]: https://i.stack.imgur.com/pwEwf.png
On the remote server, check whether vscode-server daemon processes left over from the last connection are still running. If they are, kill them all and retry:
$ ps aux | grep vscode-server
$ kill -2 pid
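The two suggestions above (stop stale server processes, then clear the state files) can be combined into a small script. This is only a sketch: the function names are made up for illustration, the default directory assumes a standard ~/.vscode-server install, and pkill -f matches on the full command line, so check the ps output first on a shared machine.

```shell
# Hypothetical helpers, not part of VS Code.

# Send SIGINT (the same signal as `kill -2`) to the current user's
# vscode-server processes. Does nothing if none are running.
stop_vscode_server() {
    pkill -INT -u "$(id -u)" -f vscode-server || true
}

# Remove the "data" directory and the *.log, *.pid and *.token files that
# sit directly under the server directory (assumed default: ~/.vscode-server).
clean_vscode_server_state() {
    dir="${1:-$HOME/.vscode-server}"
    [ -d "$dir" ] || return 0
    rm -rf "$dir/data"
    find "$dir" -maxdepth 1 -type f \
        \( -name '*.log' -o -name '*.pid' -o -name '*.token' \) \
        -exec rm -f {} +
}
```

On the remote host, run stop_vscode_server, then clean_vscode_server_state, and reconnect from VS Code.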
I tried rebooting the remote machine and it worked.

PlayFramework Hangs After Days

The server runs successfully at first, but after a few days it hangs with no error logs. From then on, no request gets a response.
This is the start command with its options:
sudo /opt/dev -Dhttps.port=443 -Dhttp.port=9000 -J-Xms3277m -J-Xmx3277m -J-XX:ParallelGCThreads=2 -J-Xmn2574M -J-XX:+UseConcMarkSweepGC -J-XX:+CMSClassUnloadingEnabled -J-server &
/opt/dev is the script file generated from activator stage
===========server info==========
linux: Ubuntu 14.04.5 LTS
ram: 4G
openjdk version "1.8.0_141"
===========process info========
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME COMMAND
15037 root 20 0 5978800 2.280g 31216 S 0.0 58.3 63:33.82 java
===========port info ===================
tcp6 :::9000 :::* LISTEN 15037/java
tcp6 :::443 :::* LISTEN 15037/java
===========other info==========
play version 2.3.2
scala version 2.11.1
akka setting
akka.jvm-exit-on-fatal-error = false
play.akka.jvm-exit-on-fatal-error = false
akka.default-dispatcher.fork-join-executor.pool-size-max =64
akka.actor.debug.receive = on
===========================================
These steps could help identify the problem, or at least serve as first steps in that direction.
Start by adding -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/where/to/put/hprof. Judging by your start script parameters, I think you need to write -J-XX instead of -XX. This will create a heap dump in case of an OOM.
Add logging to your endpoints (at the start and at the end) so you can check whether Play receives the request at all.
While Play is unresponsive, check the number of open file descriptors and compare it with your limits. To check, find the PID of your Java process and run sudo ls -al /proc/7333/fd/|wc -l. To see your limits, use ulimit -a.
It would also be good to keep an eye on the Akka queues, in case you use the same dispatcher for frontend requests and for back-office processing (the dispatcher could be filled with long-running background tasks).
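The file-descriptor check above can be wrapped in a small function. This is a minimal sketch, assuming Linux with /proc mounted; fd_usage is a made-up name. It reads the soft limit from /proc/<pid>/limits, which reflects the running process, whereas ulimit -a reports the limits of the shell you type it in.

```shell
# Hypothetical helper: print the open-descriptor count and the soft
# "Max open files" limit for a PID. Relies on the Linux /proc filesystem;
# run it as root or as the owner of the process.
fd_usage() {
    pid="$1"
    # One directory entry per open descriptor.
    open=$(ls "/proc/$pid/fd" | wc -l | tr -d ' ')
    # Fourth field of the "Max open files" row is the soft limit.
    limit=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
    echo "pid=$pid open_fds=$open soft_limit=$limit"
}
```

For the process in the question this would be called as fd_usage 15037 (under sudo, since the server was started as root). A count close to the soft limit would point at a descriptor leak.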
I would do all the diagnostic steps that Evgeny suggested, plus:
Change "akka.jvm-exit-on-fatal-error" and "play.akka.jvm-exit-on-fatal-error" to true. The current setting may be masking your problem.
Take a stack dump of the running process when it is in this state and use that to identify the problem or post it here. See How to get a complete stack trace of a running java program that is taking 100% cpu?
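Taking a few stack dumps several seconds apart makes it easier to see which threads stay stuck. The following is a rough sketch, assuming a JDK with jstack on the PATH and that it runs as the same user as the Java process (root here, because the server was started with sudo). take_thread_dumps and the JSTACK override are made-up names for illustration.

```shell
# Hypothetical helper: write COUNT thread dumps for PID into OUTDIR,
# sleeping INTERVAL seconds between them.
# JSTACK can be set to another dump command; the default is the JDK's jstack.
take_thread_dumps() {
    pid="$1"
    count="${2:-3}"
    interval="${3:-5}"
    outdir="${4:-.}"
    i=1
    while [ "$i" -le "$count" ]; do
        "${JSTACK:-jstack}" "$pid" > "$outdir/threads.$pid.$i.txt" || return 1
        # No need to wait after the last dump.
        [ "$i" -lt "$count" ] && sleep "$interval"
        i=$((i + 1))
    done
}
```

For example, take_thread_dumps 15037 3 5 /tmp would leave threads.15037.1.txt to threads.15037.3.txt in /tmp. Threads that show the same blocked or waiting frames in every file are the ones to look at first.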

JBoss EAP 6.2 running as Windows Service with prunsrv - Can't stop service

I'm trying to get JBoss EAP 6.2 running as a Windows service on Windows Server (64-bit). I've defined the service with prunsrv as follows:
prunsrv install "JBoss EAP" --DisplayName="JBoss EAP" --LogLevel=DEBUG --LogPath=d:\Java\jboss-eap-6.2\domain\log\ --LogPrefix=service --StdOutput=auto --StdError=auto --StopTimeout=10 --StartMode=exe --StartImage=cmd.exe --StartPath=d:\Java\jboss-eap-6.2\bin ++StartParams="/c;domain.bat" --StopMode=exe --StopImage=cmd.exe --StopPath=d:\Java\jboss-eap-6.2\bin ++StopParams="/c;jboss-cli.bat;--controller=192.168.50.3:8888;--connect;command=/host=master:shutdown"
and I changed the logon user for the service so that it runs under an account that is part of the Administrators group.
I'm able to start the service just fine but I can't get it to stop properly. By "properly" I mean when I stop the service, I get the Service Control dialog that says:
"Windows is attempting to stop the following service on the local computer"
Watching the JBoss log file, I can see that JBoss shuts down properly, and with Task Manager I can see that all the Java.exe processes disappear. However, the Service Control dialog does not close 'normally'. It eventually times out and I get another dialog with "Error 1053: The service did not respond to the start or control request in a timely manner", and Task Manager shows prunsrv.exe still running. The service is then stuck in a "Stopping" state and I have to use TaskKill to kill the task and reset the service to a state where I can restart it.
When I look in the service.*.log file, I see the following:
[2015-09-11 11:42:55] [debug] ( prunsrv.c:844 ) [25200] reportServiceStatusE: 4, 0, 0, 0
[2015-09-11 11:42:57] [debug] ( prunsrv.c:844 ) [25200] reportServiceStatusE: 4, 0, 0, 0
[2015-09-11 11:42:57] [debug] ( prunsrv.c:844 ) [25200] reportServiceStatusE: 3, 0, 3000, 0
[2015-09-11 11:42:57] [info] ( prunsrv.c:943 ) [10984] Stopping service...
[2015-09-11 11:42:57] [debug] ( prunsrv.c:1057) [10984] Waiting for stop worker to finish...
I have been trying for a couple of days to figure out what the problem is but I've hit a wall and I'm out of ideas. So far, LogLevel=DEBUG hasn't shed any light on the problem so I'm looking for ideas that might help me debug this. Can anyone give me a suggestion?
Thanks
As it turned out, the problem was with the startup command I was using. The startup command needed a "set NOPAUSE=Y" at the front of it. E.g.
++StartParams="/c;set;NOPAUSE=Y;&&;domain.bat"
Once I added that, it did the trick.

Solaris svcs command shows wrong status

I have freshly installed an application on Solaris 5.10. When I check with ps -ef | grep hyperic | grep agent, the processes are up and running. When I check the status with the svcs hyperic-agent command, the output shows that the agent is in the maintenance state. The application is working fine and I don't have any issues with it. Please help.
There are several reasons that can lead to that behavior:
The starter (the start/exec property of the service) returned a status other than SMF_EXIT_OK (zero). Then you may check the logs:
# svcs -x ssh
...
See: /var/svc/log/network-ssh:default.log
If you check the logs, you may see messages like the following, which mean that the starter script failed or is written incorrectly:
[ Aug 11 18:40:30 Method "start" exited with status 96 ]
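To read such log lines more quickly, the exit status can be mapped back to the SMF_EXIT_* names. This is a sketch with made-up function names. The numeric values are the constants defined in /lib/svc/share/smf_include.sh, and the script assumes a POSIX shell (on Solaris 10 that means ksh or /usr/xpg4/bin/sh rather than the legacy /bin/sh).

```shell
# Hypothetical helper: translate a method exit status into its SMF_EXIT_* name.
smf_exit_name() {
    case "$1" in
        0)   echo SMF_EXIT_OK ;;
        95)  echo SMF_EXIT_ERR_FATAL ;;
        96)  echo SMF_EXIT_ERR_CONFIG ;;
        97)  echo SMF_EXIT_MON_DEGRADE ;;
        98)  echo SMF_EXIT_MON_OFFLINE ;;
        99)  echo SMF_EXIT_ERR_NOSMF ;;
        100) echo SMF_EXIT_ERR_PERM ;;
        *)   echo "other" ;;
    esac
}

# Read an SMF service log on stdin and print "method status name" for every
# 'Method "..." exited with status N' line. Other lines are ignored.
decode_smf_log() {
    sed -n 's/.*Method "\([^"]*\)" exited with status \([0-9][0-9]*\).*/\1 \2/p' |
    while read -r method status; do
        echo "$method $status $(smf_exit_name "$status")"
    done
}
```

Usage would look like decode_smf_log < /var/svc/log/network-ssh:default.log. For the sample line above it reports status 96 as SMF_EXIT_ERR_CONFIG.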
Another reason for such behavior is that the service faults while it is running (i.e. one of its processes dumps core or receives a kill signal, or all of its processes exit), as described here: https://blogs.oracle.com/lianep/entry/smf_5_fault_retry_models
The mechanism that actually provides SMF with its monitoring facilities is system contracts. You can determine the contract ID of an online service with svcs -v (field CTID):
# svcs -vp svc:/network/smtp:sendmail
STATE          NSTATE        STIME    CTID   FMRI
online         -             Apr_14       68 svc:/network/smtp:sendmail
               Apr_14       1679 sendmail
               Apr_14       1681 sendmail
Then watch events with ctwatch:
# ctwatch 68
CTID EVID CRIT ACK CTTYPE SUMMARY
68 28 crit no process contract empty
Then there are two options to handle that:
There is a real problem with the service, so it eventually faults. Then debug the application.
It is normal behavior for the service, so you should edit and re-import your service manifest to make SMF less paranoid, i.e. configure the ignore_error and duration properties.

Custom Munin plugin won't report

I've built my first Munin plugin to give us the size of our Redis queue, but it won't report for some reason. Every other plugin on the node, including other Redis-centric plugins, works fine.
Here's the plugin code:
#!/bin/sh
case $1 in
config)
cat <<'EOM'
multigraph redis_queue_size
graph_title Redis Queue Size
graph_info The size of Redis queue
graph_category redis
graph_vlabel Messages
redisqueue.label redisqueue
redisqueue.type GAUGE
redisqueue.min 0
EOM
exit 0;;
esac
queuelength=`redis-cli llen mykeyname`
printf "redisqueue.value "
echo $queuelength
The plugin is in /usr/share/munin/plugins/redis_queue_
The plugin is symlinked to /etc/munin/plugins/redis_queue_
I made sure to restart the service
$ sudo service munin-node force-reload
If I run sudo munin-run redis_queue_ I get the correct output:
redisqueue.value 1567595
If I run munin-node-configure I get the following:
redis_queue_ | yes |
If I connect to the instance from the master using telnet to fetch the plugin, I get:
$ telnet 10.101.21.56 4949
Trying 10.101.21.56...
Connected to 10.101.21.56.
Escape character is '^]'.
# munin node at redis01.example.com
fetch redis_queue_
redisqueue.value 1035336
The master shows an empty graph for it, and the "last updated" time isn't increasing. I initially had the plugin configured a little differently (it wasn't producing good output), so all the values are -nan. Once I fixed the output, I expected the plugin to start working, but all efforts have failed.
Everything looks right, yet there are still no values in the graph.
Edit: Munin v1.4.6