Setting limits to resource usage with Supervisord - celery

I'm using Supervisor to control a Celery worker that takes an album containing one or more photos and generates a video from them using avconv. Unfortunately, for bigger albums the process is using too many resources and being shut down.
Is there a parameter I can set in the Supervisor config file I can set to renice the process in order to limit the resources it uses and prevent it being killed off? I haven't been able to find any in the documentation, but this seems like an obvious need. Alternatively, does Celery allow something similar so I can set it there?

Superlance plugin can be used to limit memory usage of any supervisord managed process.
Supervisord can natively manage process priority, set it's umask, etc.. See docs for details.

Related

Total network waiting time for all http requests

Not sure whether anybody has similar query here, assume we do a performance tracing on a single page application completely loaded on a url, how to get the total waiting time on networking time for all resources?
I've never heard about an existing tool that would provide the exact metric you're looking for. But you could write a mini script that uses the Resource Timing API. It is easy to list all network requests and sum up waiting times. More info here.
Then, if you need to automate measures, you can use Puppeteer to run your script on a headless Chrome.

How to make uchiwa dashboard url be able to adjust threshold?

me again..
I had done all the sensu-uchiwa-graphite set up. And i get a new request,:(. Rather than go to change the threshold in check.json file on sensu server..any plugin at the UCHIWA that this adjustment will be shown in Uchiwa dashboard? I asked because in case that my application teams wanna change it by themselves without accessing to server.
I think sensu-admin in enterprise is available but we need to pay big money per year ;(...
Thanks in advance to help.
Sumana W.
This is fairly doable if you use a configuration management system like Chef/Ansible/Puppet - especially if you run standalone checks on the sensu-client.
This allows the clients to define their own thresholds, rather than changing the sensu servers themselves.
See https://sensuapp.org/docs/latest/reference/checks.html#standalone-checks
In this case, the definitions for the checks are sitting on the client servers and they have the choice of their thresholds or configurations. The client itself manages how often to run the check and sends the output back to the server, rather than the server requesting the checks. This helps quite a bit as far as scaling or multitenancy.
The other way to accomplish this, if you are tied to serverside checks, would be to use client attributes (https://sensuapp.org/docs/0.25/reference/checks.html#check-token-substitution)
For example, you can have a cpu check that says something like check-cpu.sh -w :::cpu_warn::: -c :::cpu_critical::: and these come from a cpu_warn and cpu_critical value from the client.json on the client server.
Source: We use sensu extensively in an enterprise environment across thousands of hosts and have been working through these same issues.

Bluemix Auto Scaling API

Is there a way for me to programmatically get notified when Bluemix auto scaling has scaled up or down?
I'm reading streaming data from a queue and would like to make sure the number of instances that I have are balanced and data is partitioned correctly
At present this kind of notification service is not available, only you can do is query the instance scaling history in Web UI. I think this requirement is interesting and should be considered to provide to developer in the future.
This kind of alert isn't available yet but you can write a simple script monitoring output of
cf app (appname)
It returns the number of instances running and the state of each one, with the right combination of awk and grep (or a perl script for example) you could have your own alerter while waiting for this of functionality

mqsvc.exe pegs cpu at full usage when deploying nservicebus to production

When I deployed my site that uses nservice to a new production box, it was unusably slow...
After some debugging I discovered that mqsvc.exe was taking up 50% of the CPU usage and the other 50% was being taken up by w3wp.exe
I found this post here:
http://geekswithblogs.net/michaelstephenson/archive/2010/05/07/139717.aspx
which recommended the following:
Make sure you set the windows service for NserviceBus Generic Host to the right credentials
Make sure you have the queue set with the right permissions
Make sure you turn on the right logging configuration in NServiceBus
So I figured the issue was something related to permissions, but even after trying to set the permissions correctly (I thought) I still wasn't able to resolve the issue.
If you allow NServiceBus to create its own queues, then it will create them with the correct permissions it needs.
The problem comes in when you set up a web application, and then the queues are created, and then the identity the application runs under changes. Then you get exactly this problem. NServiceBus tries to check the queue for a message, it does not have access to do so, so it immediately retries over and over, and you spike the processor.
The fix: Delete the queue. Restart the web application. NServiceBus takes over.
Edit: As noted in the comments, NServiceBus 3.x doesn't invoke the installers by default, which means queues are not automatically created in production unless you ask it to. See the documentation page on Installers for more detail.
For a web application (or any other situation where you're not using NServiceBus.Host) you can invoke the installers as part of the fluent config. There is a full example in the NServiceBus download, but here is a link to the relevant file on GitHub.
The issue did end up being that the website needed to be granted explicit permissions to the queues.
I found a number of resources online telling me this, but I still had to spend a good amount of time monkeying around with exactly WHICH account needed access... turned out that since my application pools were set to run as ApplicationPoolIdentity, I need to grant the account permissions by adding the following account to the nservicebus queue:
IIS AppPool\{APP POOL NAME}
I granted full access rights, though I'm sure you could refine that a bit if you needed to.
Hopefully, this will help anyone who runs into the same issues.
(This is my first attempt at the "Answer your own question" mechanism so please let me know if I am doing something wrong..)

celery task clean-up with DB backend

I'm trying to understand how and when tasks are cleaned up in celery. From looking at the task docs I see that:
Old results will be cleaned automatically, based on the
CELERY_TASK_RESULT_EXPIRES setting. By default this is set to expire
after 1 day: if you have a very busy cluster you should lower this
value.
But this quote is from the RabbitMQ Result Backend section and I do not see any similar text in the Database Backend section. So my question is: is there a backend agnostic approach I can take for old task clean-up with celery and if not is there a DB Backend specific approach I should take? Incase it makes any difference I'm using django-celery. Thanks.
If you click on the link to the setting doc for CELERY_TASK_RESULT_EXPIRES:
http://docs.celeryproject.org/en/latest/userguide/configuration.html#result-expires
It does say that database supports this, but then you need to run celery beat (there's a default periodic task, called every day, to remove expired results).
The backend docs in the task should probably mention this as well, maybe there should be a dedicated guide for backends too. If you want to lobby for this, then please open up an issue at https://github.com/celery/celery/issues