Unregistered task type / import errors in Celery

I'm having headaches getting Celery to work with my folder structure. Note that I am using virtualenv, but that shouldn't matter.
cive/
    celery_app.py
    __init__.py
    venv/
    framework/
        tasks.py
        __init__.py
        civeAPI/    # files tasks.py needs
cive is my root project folder.
celery_app.py:
from __future__ import absolute_import

from celery import Celery

app = Celery('cive',
             broker='amqp://',
             backend='amqp://',
             include=['cive.framework.tasks'])

# Optional configuration, see the application user guide.
app.conf.update(
    CELERY_TASK_RESULT_EXPIRES=3600,
)

if __name__ == '__main__':
    app.start()
tasks.py (simplified)
from __future__ import absolute_import
import sys
# import other things
# append syspaths
from cive.celery_app import app

@app.task(ignore_result=False)
def start(X):
    pass  # do things

def output(X):
    pass  # output files

def main(args):
    for d in Ds:
        m = []
        m.append(start.delay(X))
        output([n.get() for n in m])

if __name__ == '__main__':
    sys.exit(main(sys.argv[1:]))
I then start workers (from outside the root cive directory) via:
celery -A cive worker --app=cive.celery_app:app -l info
which seems to work fine, loading the workers and showing
[tasks]
. cive.framework.tasks.start_sessions
But when I try to run my tasks.py via another terminal:
python tasks.py
I get the error:
Traceback (most recent call last):
  File "tasks.py", line 29, in <module>
    from cive.celery_app import app
ImportError: No module named cive.celery_app
If I change the import to:
from celery_app import app  # without the cive. prefix
the script starts, but Celery returns the error:
Received unregistered task of type 'cive.start_sessions'
I think there's something wrong with my imports or config but I can't say what.

So this was a Python packaging problem, not really a Celery issue. I found the solution by looking at How to fix "Attempted relative import in non-package" even with __init__.py.
I'd never even thought about this before, but I wasn't running Python in package mode. The solution is to cd out of your root project directory, then run the module as part of a package (note there is no .py after tasks):
python -m cive.framework.tasks
Now when I run the celery task everything works.
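A note on the "Received unregistered task" error above: Celery derives a task's name from the module path under which the task module was imported, so the same tasks.py imported as cive.framework.tasks by the worker but under a different path by the script yields mismatched names. To see why -m fixes the import side, a minimal sketch (show_path.py is a hypothetical helper dropped next to tasks.py): running a file directly puts the file's own directory first on sys.path, while -m puts the current working directory there, so the top-level cive package resolves:
import os
import sys

# python cive/framework/show_path.py   -> sys.path[0] is .../cive/framework
# python -m cive.framework.show_path   -> sys.path[0] is the cwd (cive's parent)
print("sys.path[0] =", sys.path[0])
print("cwd         =", os.getcwd())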

Related

pytest works from commandline, running testfile causes ModuleNotFound error

Project structure
Project/
--libs/
----__init__.py
----handler.py
--tests/
----handler_test.py
----__init__.py
Running pytest from the command line works and the code executes.
However, if I run handler_test.py directly, I receive an error:
Traceback (most recent call last):
  File ".../tests/handler_test.py", line 5, in <module>
    from libs.handler import ConfigurationHandler
ModuleNotFoundError: No module named 'libs'
handler_test.py
import pytest
from libs.handler import ConfigurationHandler
#some test cases
I added pytest as a pre-commit hook, which executes all tests. However, running the whole pytest suite just to check a single test file is overkill; being able to execute individual test files directly would be great.
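One way to get that, reusing the package-mode trick from the Celery answer above (a sketch; the main guard is a hypothetical addition, not part of the original file): let the test file invoke pytest on itself, then run it as a module from the Project root so libs resolves:
# appended to handler_test.py
if __name__ == "__main__":
    import sys
    import pytest
    sys.exit(pytest.main([__file__]))
Then python -m tests.handler_test runs just that file; python -m pytest tests/handler_test.py works too, since the -m form also puts the current directory on sys.path.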

Import from the root of the repository when running a jupyter notebook

I have a repository with the following setup:
│
├───foo_lib
│       bar.py
│
└───notebooks
        my_notebook.ipynb
So basically I have some common Python code in foo_lib and some notebooks in notebooks.
In my_notebook I want to use the code from foo_lib. So I do:
from foo_lib import bar
But that doesn't work because the root of the repo isn't in my python path when the notebook is executed.
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-1-e2c421feccf4> in <module>
----> 1 from foo_lib import bar
ModuleNotFoundError: No module named 'foo_lib'
The hack I've been using is to put %cd .. in the first cell. Then the working directory is the root of the repo and I can import fine. But it's not idempotent, so if I run the cell more than once, imports break again.
I found an idempotent solution. I can use globals()["_dh"][0] which points to the directory containing the notebook, when running in jupyter:
import os
os.chdir(os.path.join(globals()["_dh"][0], ".."))
Unfortunately, this doesn't work when I run my notebook programmatically using nbconvert:
import json

import nbconvert
import nbformat

def run_notebook():
    ep = nbconvert.preprocessors.ExecutePreprocessor()
    with open("notebooks/my_notebook.ipynb") as fp:
        nb = nbformat.read(fp, as_version=4)
    nb, resources = ep.preprocess(nb)
    print(json.dumps(nb, indent=2))

if __name__ == "__main__":
    run_notebook()
When I run this script from the root of the repository, globals()["_dh"][0] points to the root of the repository instead of the notebooks directory, so the chdir lands one level too high.
So I'm looking for a solution to this import problem that:
is idempotent
works when executing from the browser/jupyter
works when executing using nbconvert
is short: I would have to copy-paste the code into every notebook (since I can't do imports before that code runs).
Is there a better way to do this?
I've figured out that the local repository code can be added to site-packages by running:
pip install -e .
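For pip install -e . to work, the repository needs a package definition at its root; a minimal sketch assuming setuptools (the name and version values are illustrative):
# setup.py at the repository root
from setuptools import setup, find_packages

setup(
    name="foo_lib",
    version="0.1",
    packages=find_packages(),  # picks up foo_lib/ (it needs an __init__.py)
)
After the editable install, from foo_lib import bar works in every notebook, in the browser and under nbconvert alike, with no path code to paste, and it is idempotent.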

Struggling with Tornado - Celery integration

I am in the process of building a Tornado web server which needs to perform some blocking, time-consuming tasks, such as zipping video files. Ideally, I would like to hand these tasks off to a subprocess utility like Celery and inform the client accordingly. So I took this code from GitHub and modified it to get a better understanding. I am using Ubuntu 12.04 LTS, and here's what I have done so far:
Installed Tornado and working great on its own.
Installed RabbitMQ (apt-get install rabbitmq-server).
Installed Tornado Celery (downloaded and unzipped the tar file).
Executed python -m pip install tornado-celery
Borrowed two example Python files from the above GitHub repo and modified them.
My Tasks snippet code (myTornadoTasks.py) is:
import os
import time
from datetime import datetime

from celery import Celery
from celery import task

celery = Celery("myTornadoTasks", broker="amqp://")
celery.conf.CELERY_RESULT_BACKEND = os.environ.get('CELERY_RESULT_BACKEND', 'amqp')

@celery.task
def add(x, y):
    return int(x) + int(y)

# ..... some more code

@celery.task
def error(msg):
    raise Exception(msg)

if __name__ == "__main__":
    celery.start()
I start the tasks using celery worker -A myTornadoTasks &. Here everything works fine and I see 24 threads running.
My Tornado Celery snippet code (myTornadoCelery.py) is:
from tornado import gen
from tornado import ioloop
from tornado.web import asynchronous, RequestHandler, Application

import myTornadoTasks
import tcelery

tcelery.setup_nonblocking_producer()

class AsyncHandler(RequestHandler):
    @asynchronous
    def get(self):
        myTornadoTasks.sleep.apply_async(args=[3], callback=self.on_result)

    def on_result(self, response):
        self.write(str(response.result))
        self.finish()

# ... some more code

application = Application([
    (r"/async-sleep", AsyncHandler),
    (r"/gen-async-sleep", GenAsyncHandler),
    (r"/gen-async-sleep-add", GenMultipleAsyncHandler),
])

if __name__ == "__main__":
    application.listen(8887)
    ioloop.IOLoop.instance().start()
I start Tornado using python -m tcelery --app=myTornadoCelery --address=0.0.0.0
I then get the error: AttributeError: 'module' object has no attribute 'celery'
Questions:
What does the above error mean? What am I missing?
Did I miss any steps here? Where is RabbitMQ invoked?
How do I increase/decrease the number of Celery workers from 24?
Thanks.
Updated with Error:
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/tcelery/__main__.py", line 62, in <module>
    main()
  File "/usr/local/lib/python2.7/dist-packages/tcelery/__main__.py", line 56, in main
    cmd.execute_from_commandline()
  File "/usr/local/lib/python2.7/dist-packages/celery/bin/base.py", line 309, in execute_from_commandline
    argv = self.setup_app_from_commandline(argv)
  File "/usr/local/lib/python2.7/dist-packages/celery/bin/base.py", line 469, in setup_app_from_commandline
    self.app = self.find_app(app)
  File "/usr/local/lib/python2.7/dist-packages/celery/bin/base.py", line 489, in find_app
    return find_app(app, symbol_by_name=self.symbol_by_name)
  File "/usr/local/lib/python2.7/dist-packages/celery/app/utils.py", line 240, in find_app
    found = sym.celery
AttributeError: 'module' object has no attribute 'celery'
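The question is left unanswered above, but the last frame (found = sym.celery) suggests a reading: tcelery resolves --app by looking for an attribute named celery on the named module, and myTornadoCelery never defines a Celery instance; the app object lives in myTornadoTasks. A hedged sketch of two ways to line these up (an assumption, not a confirmed fix):
# point tcelery at the module that actually defines the Celery app
python -m tcelery --app=myTornadoTasks --address=0.0.0.0

# or re-export the app from myTornadoCelery.py so the original command works:
#     from myTornadoTasks import celery
As for question 3, the worker pool size defaults to the number of CPU cores; it can be set explicitly, e.g. celery worker -A myTornadoTasks --concurrency=8.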

Fail to import IPython parallel in Jupyter

I recently updated IPython to 4.0.0 and installed Jupyter 4.0.6.
I wanted to use IPython parallel, and after starting the engines in the notebook, I imported:
from IPython import parallel
And it fails:
~/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/IPython/utils/traitlets.py:5: UserWarning: IPython.utils.traitlets has moved to a top-level traitlets package.
warn("IPython.utils.traitlets has moved to a top-level traitlets package.")
~/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/IPython/utils/pickleutil.py:3: UserWarning: IPython.utils.pickleutil has moved to ipykernel.pickleutil
warn("IPython.utils.pickleutil has moved to ipykernel.pickleutil")
~/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/IPython/utils/jsonutil.py:3: UserWarning: IPython.utils.jsonutil has moved to jupyter_client.jsonutil
warn("IPython.utils.jsonutil has moved to jupyter_client.jsonutil")
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-1-5652e9e33a4d> in <module>()
----> 1 from IPython import parallel
~/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/IPython/parallel/__init__.py in <module>()
31
32 from .client.asyncresult import *
---> 33 from .client.client import Client
34 from .client.remotefunction import *
35 from .client.view import *
~/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/IPython/parallel/client/client.py in <module>()
38 from IPython.utils.capture import RichOutput
39 from IPython.utils.coloransi import TermColors
---> 40 from IPython.utils.jsonutil import rekey, extract_dates, parse_date
41 from IPython.utils.localinterfaces import localhost, is_local_ip
42 from IPython.utils.path import get_ipython_dir
ImportError: cannot import name rekey
So I tried:
pip install rekey
But no distribution was found.
Note that it fails the same way in the notebook, whether opened with ipython notebook or jupyter notebook, and in the console.
Also note that there is a warning:
UserWarning: IPython.utils.jsonutil has moved to jupyter_client.jsonutil
But rekey does not exist in the module jupyter_client.jsonutil
Question: how can I get IPython parallel to work within Jupyter?
What am I missing?
I think I found the problem (at least it works now):
First, I had to import ipyparallel instead of IPython.parallel.
See here: http://jupyter.readthedocs.org/en/latest/migrating.html#imports
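For reference, a minimal sketch of the post-migration API (assuming a cluster is already running, e.g. via ipcluster start -n 4):
import ipyparallel as ipp

rc = ipp.Client()    # connect to the running controller
view = rc[:]         # a DirectView over all engines
print(view.map_sync(lambda x: x * 2, range(8)))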
EDIT: I got the OSError below, but the fix was apparently unnecessary, and it works without it. I still don't get why I had this error, though.
Then I had another error when starting the client:
OSError: Connection file '~/.ipython/profile_default/security/ipcontroller-client.json' not found.
You have attempted to connect to an IPython Cluster but no Controller could be found.
Please double-check your configuration and ensure that a cluster is running.
So I just copied the directory ~/.ipython/profile_default to ~/.jupyter/profile_default, and it works!

pyql is not accessible after installing it from source on Ubuntu 14.04 with Python 2.7

After installing pyql from the package source (according to its own wizard, described in this file) on Ubuntu 14.04 into the /usr/local/lib/python2.7/dist-packages/ folder, all tests finished successfully. That folder contains all my other installed packages, which are accessible from Python with a plain import.
But Python does not see this specific installation, and I cannot import anything from it.
Do you have any idea what I need to define in addition?
Thanks,
Yigal B.
Here are the results of my sys.path:
import sys
print(sys.path)
['/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu',
'/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/local/lib/python2.7/dist-packages',
'/usr/lib/python2.7/dist-packages',
'/usr/lib/python2.7/dist-packages/PILcompat',
'/usr/lib/python2.7/dist-packages/gst-0.10',
'/usr/lib/python2.7/dist-packages/gtk-2.0',
'/usr/lib/pymodules/python2.7',
'/usr/lib/python2.7/dist-packages/wx-2.8-gtk2-unicode']
python -c 'from quantlib.settings import __quantlib_version__; print __quantlib_version__'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named quantlib.settings
Can you post the content of your sys.path? And the output of:
python -c 'from quantlib.settings import __quantlib_version__; print __quantlib_version__'
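The thread ends without a resolution, but one way to narrow it down (a suggestion, not a confirmed fix) is to ask Python 2.7 where, if anywhere, it can find the package:
import imp

try:
    # raises ImportError if no entry on sys.path contains a quantlib package
    print(imp.find_module('quantlib'))
except ImportError as e:
    print(e)
If find_module fails while the files are under /usr/local/lib/python2.7/dist-packages/, a missing __init__.py in the installed quantlib directory would be one plausible explanation.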