Maximum number of supervisord managed processes? - supervisord

We have an issue with supervisord being unable to handle more than 200 processes at once - adding any more causes it to die on startup.
Anyone have any experience with supervisord and a large number of managed processes?

Figured it out - it's a bug in supervisord that doesn't allow for more than 1024 file descriptors.
https://github.com/Supervisor/supervisor/issues/26
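For anyone hitting the same wall: besides the OS limit (ulimit -n), supervisord has its own minfds setting, which defaults to 1024. A minimal sketch of raising it in supervisord.conf (65535 is just an illustrative value, and very old releases that still rely on select() may stay capped at 1024 regardless):

[supervisord]
; supervisord refuses to start unless it can obtain at least this many
; file descriptors, and will try to raise its own soft limit to this value
minfds=65535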

Related

Artemis: AMQ222210: Storage usage is beyond max-disk-usage. System will start blocking producers

I'm sending a message from Application A to Artemis but I'm getting this error from Application A:
AMQ212054: Destination address=my-service is blocked. If the system is configured to block make sure you consume messages on this configuration.
Looking at the logs of artemis starting up this is what I see which I believe is the cause:
AMQ222210: Storage usage is beyond max-disk-usage. System will start blocking producers
I've looked at the documentation and found nothing that could help. I've also logged into the running container and changed 'max-disk-usage' to 100, as my Google research suggested, but so far nothing has helped.
I'm running artemis using the following command:
docker run -it --rm -e ARTEMIS_USERNAME=artemis -e ARTEMIS_PASSWORD=artemis -p 8161:8161 -p 61616:61616 vromero/activemq-artemis
Any help is appreciated. Thank you.
You are receiving this message because your computer's disk is over 90% full, and Artemis blocks producers once that happens. To solve the problem you can either:
Free up disk space on your computer so that usage is below 90%.
Increase how full your disk is allowed to get before Artemis blocks producers. To do this you need to modify the broker configuration file, which is located at:
path-to-broker\artemis\etc\broker.xml
In this file there is a tag named max-disk-usage, which is set to 90 by default. Simply increase it to 100 (or whatever value you feel comfortable with).
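For illustration, a sketch of the relevant fragment (max-disk-usage sits directly under the <core> element of broker.xml; 90 is the shipped default and 100 effectively disables the check):

<core xmlns="urn:activemq:core">
   <!-- ... other core settings ... -->
   <!-- block producers once the disk holding the journal is this % full -->
   <max-disk-usage>100</max-disk-usage>
</core>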
Note that the reason Artemis configures your broker to start blocking producers once your computer's disk usage reaches 90% is to prevent a message backlog from potentially using up all of your disk space.
I downloaded a different version and the issue hasn't occurred since.

Nginx with varnish error: failed (24: Too many open files)

I am running Varnish with nginx as a proxy on Ubuntu, and every few days I get a (24: Too many open files) error.
Restarting nginx solves the problem.
After researching about this error I found that the common solution is to increase worker_rlimit_nofile in nginx.conf.
I feel like this is not a real solution, since whatever limit I set might be reached as well.
Why does nginx keep these files (I believe they are sockets) open, and what would be a solution to my situation?
UPDATE:
I just noticed there are hundreds of varnish sockets open when I run lsof. I believe my issue is that these sockets don't get closed.
It's good practice to raise the default maximum number of open files when the server is a web server; the same goes for the number of ephemeral ports.
I think the default open-file limit is 1024, which is far too small for Varnish.
I am setting it to 131072
ulimit -n 131072
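To make that survive reboots and apply to the nginx workers as well, something along these lines is the usual approach (the user name, paths and values here are illustrative; on systemd machines LimitNOFILE in the service unit is the equivalent knob):

# /etc/security/limits.conf - raise the per-user open-file limit
www-data  soft  nofile  131072
www-data  hard  nofile  131072

# nginx.conf, main context - let the worker processes use the higher limit
worker_rlimit_nofile 131072;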

supervisord autorestart max tries?

http://supervisord.org/configuration.html#program-x-section-values says you can use autorestart=true to restart on exit, but doesn't say how to set a maximum number of restarts (within startsecs) before giving up. Is there a way to do this? Note: I'm not talking about the first startup, but about the event that a program crashes after, say, running fine for 10 days.
According to the docs, autorestart doesn't care about startretries:
autorestart controls whether supervisord will autorestart a program if
it exits after it has successfully started up (the process is in the
RUNNING state).
supervisord has a different restart mechanism for when the process is
starting up (the process is in the STARTING state). Retries during
process startup are controlled by startsecs and startretries.
You should use startretries as well; an example program configuration:
[program:consumer_example]
command=command example
process_name=%(program_name)s_%(process_num)02d
numprocs=1
autostart=true
autorestart=true
startretries=10
user=USERNAME
As you can see, I used startretries=10; if you don't set it in the program section, it uses the default value (3).
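If it helps, the two startup knobs mentioned in the quoted docs can be set together in the same [program:x] section; a minimal sketch with illustrative values:

startsecs=10       ; process must stay up 10 seconds for a start to count as successful
startretries=10    ; give up (FATAL) after this many failed start attempts, default is 3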
I think what you need is to use the startretries parameter:
http://supervisord.org/configuration.html?highlight=startretries#program-x-section-example
best regards

bind9 (named) does not start in multi-threaded mode

From the bind9 man page, I understand that the named process starts one worker thread per CPU if it is able to determine the number of CPUs. If it can't determine the number, a single worker thread is started.
My question is: how does it calculate the number of CPUs? I presume that by CPU it means cores. The Linux machine I work on is customized, runs kernel 2.6.34, and does not have the lscpu or nproc utilities. named starts a single thread even if I pass the -n 4 option. Is there any other way to force named to start multiple threads?
Thanks in advance.
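As an aside (not a fix for named itself), on a box without lscpu or nproc the CPU count the kernel exposes can still be read directly:

grep -c ^processor /proc/cpuinfo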

Too many open mongoDB connections when using Celery

I'm using Celery to download feeds and resize images. The feeds and image paths are then stored in MongoDB using MongoEngine. When I check current connections (db.serverStatus()["connections"]) after running the tasks, I have between 50 and 80 "current" connections, which remain open until I shut down celeryd. Has anyone experienced this issue and/or do you know what I can do to solve it?
Thanks,
Kenzic
This just means that there are between 50 and 80 connections open to the MongoDB server, which isn't cause for concern. PyMongo (and therefore MongoEngine) maintains an internal pool of connections (that is, sockets) to mongod, so even when nothing is happening (no active queries, commands, etc.), the connections stay open for the next time they are needed. By default, PyMongo attempts to retain no more than 10 open connections per Connection object.
Are you experiencing any specific problems due to the number of open connections?
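If you do want to cap or inspect the pool yourself, something along these lines works with recent PyMongo (older releases exposed the same idea as max_pool_size on the Connection object, so treat the exact keyword as version-dependent):

from pymongo import MongoClient

# cap the pool at 10 sockets per client, matching the default mentioned above
client = MongoClient("mongodb://localhost:27017", maxPoolSize=10)

# same check as in the question, via the serverStatus command
print(client.admin.command("serverStatus")["connections"])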