Getting a ModuleNotFoundError while running a Celery app in a Poetry environment - celery

So I have a simple Celery app (app.py) which looks something like this:

from celery import Celery
from functools import lru_cache

@lru_cache
def get_celery() -> Celery:
    celery_app = Celery(
        'worker',
        broker='redis://XXXXXXXXX/4',
        backend='redis://XXXXXXXX/5',
    )
    celery_app.autodiscover_tasks(
        ['api.v1.worker.worker'],
        force=True
    )
    return celery_app
Now when I run this app.py file through the VSCode terminal I get an error like this:
ModuleNotFoundError: No module named 'api'
My directory looks something like this:
app
|__api
| |__v1
| |__endpoints
| |__utils
| |__worker
| |__worker.py
|__celery_app
|__app.py
The api and celery_app folders are on the same level. Can anyone please help me understand and fix this issue?
I'm running this entire project in a Poetry shell, and my pyproject.toml file looks something like this:
[tool.poetry.dependencies]
python = "3.10.9" requests = "^2.28.2" numpy = "^1.24.1" pandas = "^1.5.3" fastapi = "^0.89.1" datetime = "^5.0" mimir = "^0.6.3" jsonify = "^0.5" uvicorn = {extras = ["standard"], version = "^0.20.0"} appdirs = "^1.4.4" black = "^22.12.0" click = "^8.1.3" pydantic = "^1.10.4" pymongo = "^4.3.3" regex = "^2022.10.31" motor = "^3.1.1" json-utils = "^0.2" celery = "^5.2.7" redis = "^4.4.2"
To try resolving this issue, I commented out this part of the code:

celery_app.autodiscover_tasks(
    ['api.v1.worker.worker'],
    force=True
)

and the Celery app worked fine. But I want the code to work with the api.v1.worker.worker autodiscovery in place.
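For what it's worth, the usual cause of this error is that running python app.py from inside celery_app/ puts celery_app/ itself on sys.path rather than the project root, so the api package is never importable. A minimal sketch of one workaround, assuming app.py sits exactly one level below the project root as in the tree above:

# celery_app/app.py (top of file) - sketch, not the only possible fix:
# put the project root on sys.path before autodiscover_tasks
# tries to import 'api.v1.worker.worker'.
import sys
from pathlib import Path

PROJECT_ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(PROJECT_ROOT))  # makes 'api' importable

Alternatively, running the file as a module from the project root (python -m celery_app.app) should have the same effect without touching sys.path.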

Related

locust unrecognized arguments when running as lib

The following code is from the Locust examples - use_as_lib.
import gevent
from locust import HttpUser, task
from locust.env import Environment
from locust.stats import stats_printer, stats_history
from locust.log import setup_logging

setup_logging("INFO", None)

class MyUser(HttpUser):
    host = "https://docs.locust.io"

    @task
    def t(self):
        self.client.get("/")

env = Environment(user_classes=[MyUser])
runner = env.create_local_runner()
web_ui = env.create_web_ui("127.0.0.1", 8089)
env.events.init.fire(environment=env, runner=runner, web_ui=web_ui)
gevent.spawn(stats_printer(env.stats))
gevent.spawn(stats_history, env.runner)
runner.start(1, spawn_rate=10)
gevent.spawn_later(60, lambda: runner.quit())
runner.greenlet.join()
web_ui.stop()
If I run it with python use_as_lib.py, everything works fine. But if I run it with python use_as_lib.py -c argument01 -b argument02, it will fail with:
use_as_lib.py: error: unrecognized arguments: -c -b argument02
In my case, the snippet above is part of a big program, which has its own command line arguments.
I checked a bit; it seems argument_parser.ui_extra_args_dict() here, invoked by env.create_web_ui("127.0.0.1", 8089), parses all the arguments, which causes this issue.
Any ideas on how to fix it? Thanks!
You can pass a parsed set of parameters when you create the Environment, slightly less hacky than your suggestion. Something like this:

import locust.argument_parser

parser = locust.argument_parser.get_parser()
# parse_args expects a list of strings, not one big string
parsed_options = parser.parse_args("-f yourlocustfile.py --headless <other params>".split())
env = Environment(user_classes=[MyUser], parsed_options=parsed_options)

Would that work?
Here's the workaround I used. So far it works, but it's more of a hack:

# ...
tmp = sys.argv            # remember the real command line
sys.argv = [sys.argv[0]]  # hide our own arguments from locust's parser
env.create_web_ui("127.0.0.1", 8089)
sys.argv = tmp            # restore the original arguments
# ...

Pytest hangs on testing database

I'm using pytest to test a connection to a database, and the test hangs when I run it:
def test_db():
    db.create_all()
    new_comment = Comments(comment='python rocks')
    db.session.add(new_comment)
    db.session.commit()
    entry = Comments.query.all()
    assert len(entry) == 1
    db.drop_all()
The table is created successfully, but I can't run select * from Comments; either, as it hangs too. I have to kill both windows.
How can I fix this?
This code did the trick:

from sqlalchemy.engine.reflection import Inspector
from sqlalchemy import create_engine

engine = create_engine('mysql://user:password@localhost/table')
inspector = Inspector.from_engine(engine)

def test_db():
    db.create_all()
    assert inspector.get_table_names()[0] == 'comments'
    db.drop_all()
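One more thing that may be worth checking if the hang persists: a session left open by the test itself can hold a lock on the table, which would also block the select * from the other window. A sketch of the original test with the session explicitly closed before dropping tables, assuming Flask-SQLAlchemy's db object:

def test_db():
    db.create_all()
    new_comment = Comments(comment='python rocks')
    db.session.add(new_comment)
    db.session.commit()
    assert len(Comments.query.all()) == 1
    db.session.close()  # release the connection and any locks it holds
    db.drop_all()       # drop_all no longer waits on the open session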

"Dag Seems to be missing" error in a Cloud Composer Airflow Dynamic DAG

I have a dynamic Airflow DAG in Google Cloud Composer that gets created, is listed in the web server, and runs (backfill) without error.
However, there are issues:
When clicking on the DAG in the web UI, it says "DAG seems to be missing"
Can't see the Graph view/Tree view, as it shows the error above
Can't manually trigger the DAG, as it shows the error above
I've been trying to fix this for a couple of days... any hint will be helpful. Thank you!
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import PythonOperator
from airflow.contrib.operators.gcs_to_bq import GoogleCloudStorageToBigQueryOperator
from google.cloud import storage
from airflow.models import Variable
import json

args = {
    'owner': 'xxx',
    'start_date': '2020-11-5',
    'provide_context': True
}

dag = DAG(
    dag_id='dynamic',
    default_args=args
)

def return_bucket_files(bucket_name='xxxxx', **kwargs):
    client = storage.Client()
    bucket = client.get_bucket(bucket_name)
    blobs = bucket.list_blobs()
    file_list = [blob.name for blob in blobs]
    return file_list

def dynamic_gcs_to_gbq_etl(file, **kwargs):
    mapping = json.loads(Variable.get("xxxxx"))
    database = mapping[0][file]
    table = mapping[1][file]
    task = GoogleCloudStorageToBigQueryOperator(
        task_id=f'gcs_load_{file}_to_gbq',
        bucket='xxxxxxx',
        source_objects=[f'{file}'],
        destination_project_dataset_table=f'xxx.{database}.{table}',
        write_disposition="WRITE_TRUNCATE",
        autodetect=True,
        skip_leading_rows=1,
        source_format='CSV',
        dag=dag)
    return task

start_task = DummyOperator(
    task_id='start',
    dag=dag
)

end_task = DummyOperator(
    task_id='end',
    dag=dag)

push_bucket_files = PythonOperator(
    task_id="return_bucket_files",
    provide_context=True,
    python_callable=return_bucket_files,
    dag=dag)

for file in return_bucket_files():
    gcs_load_task = dynamic_gcs_to_gbq_etl(file)
    start_task >> push_bucket_files >> gcs_load_task >> end_task
This issue means that the web server is failing to fill in its DAG bag on its side; the problem is most likely not with your DAG specifically.
My suggestion would be to try restarting the web server right now (via the installation of some dummy package).
Similar issues have been reported in this post as well as here.
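Beyond a restart, note that this DAG file calls return_bucket_files() at module level (in the for loop), so every parse of the file, including the one the Composer web server performs, needs working GCS access; if the web server lacks it, the file fails to parse there and the "DAG seems to be missing" error appears. A hedged sketch of guarding that call, under the assumption that an empty task list on the web server side is acceptable:

def list_bucket_files_or_empty(bucket_name='xxxxx'):
    # Fall back to no files where GCS is unreachable (e.g. on the
    # web server), so the DAG file still parses everywhere.
    try:
        return return_bucket_files(bucket_name)
    except Exception:
        return []

for file in list_bucket_files_or_empty():
    gcs_load_task = dynamic_gcs_to_gbq_etl(file)
    start_task >> push_bucket_files >> gcs_load_task >> end_task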

How to use multiple pytest conftest files in one test run with a duplicated parser.addoption?

I have a pytest testing project running selenium tests that has a structure like:
ProjRoot
|
|_ pytest.ini
|
|_ TestFolderA
|    |_ test_folderA_tests1.py
|    |_ test_folderA_tests2.py
|
|_ TestFolderB
|    |_ test_folderB_test1.py
|    |_ test_folderB_test2.py
|
|_ TestHelperModules
|    |_ VariousTestHelperModules
|
|_ DriversAndTools
     (contains chromedriver.exe, firefox profile folder etc.)
I have a conftest.py file, which I currently keep in ProjRoot, that I use for setup and teardown, establishing the browser session for each test that is run. It runs each test twice: once for Chrome and once for Firefox. In my tests I just utilise the resulting driver fixture. The conftest file is as below:
# conftest.py
import pytest
import os
import rootdir_ref
from selenium.webdriver.common.keys import Keys
import time
from webdriverwrapper.pytest import *
from webdriverwrapper import Chrome
from webdriverwrapper import DesiredCapabilities
from webdriverwrapper import Firefox
from webdriverwrapper import FirefoxProfile

# When running tests from the command line we should be able to pass
# --url=www..... for a different website; check what order these
# definitions need to be in.
def pytest_addoption(parser):
    parser.addoption('--url', default='https://test1.testsite.com.au')

@pytest.fixture(scope='function')
def url(request):
    return request.config.option.url

browsers = {
    'firefox': Firefox,
    'chrome': Chrome,
}

@pytest.fixture(scope='function', params=browsers.keys())
def browser(request):
    if request.param == 'firefox':
        firefox_capabilities = DesiredCapabilities.FIREFOX
        firefox_capabilities['marionette'] = True
        firefox_capabilities['handleAlerts'] = True
        theRootDir = os.path.dirname(rootdir_ref.__file__)
        ffProfilePath = os.path.join(theRootDir, 'DriversAndTools', 'FirefoxSeleniumProfile')
        geckoDriverPath = os.path.join(theRootDir, 'DriversAndTools', 'geckodriver.exe')
        profile = FirefoxProfile(profile_directory=ffProfilePath)
        print(ffProfilePath)
        print(geckoDriverPath)
        b = browsers[request.param](firefox_profile=profile, capabilities=firefox_capabilities, executable_path=geckoDriverPath)
    elif request.param == 'chrome':
        desired_cap = DesiredCapabilities.CHROME
        desired_cap['chromeOptions'] = {}
        desired_cap['chromeOptions']['args'] = ['--disable-plugins', '--disable-extensions']
        theRootDir = os.path.dirname(rootdir_ref.__file__)
        chromeDriverPath = os.path.join(theRootDir, 'DriversAndTools', 'chromedriver.exe')
        b = browsers[request.param](chromeDriverPath)
    else:
        b = browsers[request.param]()
    request.addfinalizer(lambda *args: b.quit())
    return b

@pytest.fixture(scope='function')
def driver(browser, url):
    driver = browser
    driver.maximize_window()
    driver.get(url)
    return driver
What I'd like to do is have a conftest file in each test folder instead of the ProjRoot. But if I take this existing conftest file, put it in each test folder, and then run pytest from the project root using

python -m pytest

letting pytest pick up the test directories from pytest.ini (expecting the test folders to run with their respectively contained conftest files), I have issues with the parser.addoption --url having already been added. The end of the error message is:
ClientScripts\conftest.py:19: in pytest_addoption
parser.addoption('--url', default='https://test1.coreplus.com.au/coreplus01')
..\..\..\VirtEnv\VirtEnv\lib\site-packages\_pytest\config.py:521: in addoption
self._anonymous.addoption(*opts, **attrs)
..\..\..\VirtEnv\VirtEnv\lib\site-packages\_pytest\config.py:746: in addoption
raise ValueError("option names %s already added" % conflict)
E ValueError: option names {'--url'} already added
The purpose of the --url addoption is so I can override the defaults in the conftest files from the command line if I want to point them all at a different URL at the same time, but otherwise let them default to the different URLs specified in their conftest files.
I had a similar issue.
The error was gone after removing all cached files and the venv.
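For completeness: pytest refuses to register the same option twice in one run, so another way out is to declare --url exactly once in a root-level conftest.py and let each folder's conftest override only the default. A sketch, assuming the per-folder conftests differ only in their default URL:

# ProjRoot/conftest.py - the option is registered once, here only
def pytest_addoption(parser):
    parser.addoption('--url', default=None)

# TestFolderA/conftest.py - folder-specific default; the command line wins
import pytest

@pytest.fixture(scope='function')
def url(request):
    # getoption returns None unless --url was passed on the command line
    return request.config.getoption('--url') or 'https://test1.testsite.com.au'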

Flask blueprint uses a Celery task and gets a circular import

I have an application with Blueprints and Celery.
The code is here:
config.py

import os
from celery.schedules import crontab

basedir = os.path.abspath(os.path.dirname(__file__))

class Config:
    SECRET_KEY = os.environ.get('SECRET_KEY') or ''
    SQLALCHEMY_COMMIT_ON_TEARDOWN = True
    RECORDS_PER_PAGE = 40
    SQLALCHEMY_DATABASE_URI = ''
    CELERY_BROKER_URL = ''
    CELERY_RESULT_BACKEND = ''
    CELERY_RESULT_DBURI = ''
    CELERY_TIMEZONE = 'Europe/Kiev'
    CELERY_ENABLE_UTC = False
    CELERYBEAT_SCHEDULE = {}

    @staticmethod
    def init_app(app):
        pass

class DevelopmentConfig(Config):
    DEBUG = True
    WTF_CSRF_ENABLED = True
    APP_HOME = ''
    SQLALCHEMY_DATABASE_URI = 'mysql+mysqldb://...'
    CELERY_BROKER_URL = 'sqla+mysql://...'
    CELERY_RESULT_BACKEND = "database"
    CELERY_RESULT_DBURI = 'mysql://...'
    CELERY_TIMEZONE = 'Europe/Kiev'
    CELERY_ENABLE_UTC = False
    CELERYBEAT_SCHEDULE = {
        'send-email-every-morning': {
            'task': 'app.workers.tasks.send_email_task',
            'schedule': crontab(hour=6, minute=15),
        },
    }

class TestConfig(Config):
    DEBUG = True
    WTF_CSRF_ENABLED = False
    TESTING = True
    SQLALCHEMY_DATABASE_URI = 'mysql+mysqldb://...'

class ProdConfig(Config):
    DEBUG = False
    WTF_CSRF_ENABLED = True
    SQLALCHEMY_DATABASE_URI = 'mysql+mysqldb://...'
    CELERY_BROKER_URL = 'sqla+mysql://...celery'
    CELERY_RESULT_BACKEND = "database"
    CELERY_RESULT_DBURI = 'mysql://.../celery'
    CELERY_TIMEZONE = 'Europe/Kiev'
    CELERY_ENABLE_UTC = False
    CELERYBEAT_SCHEDULE = {
        'send-email-every-morning': {
            'task': 'app.workers.tasks.send_email_task',
            'schedule': crontab(hour=6, minute=15),
        },
    }

config = {
    'development': DevelopmentConfig,
    'default': ProdConfig,
    'production': ProdConfig,
    'testing': TestConfig,
}

class AppConf:
    """
    Class to store the current config even out of context
    """
    def __init__(self):
        self.app = None
        self.config = {}

    def init_app(self, app):
        if hasattr(app, 'config'):
            self.app = app
            self.config = app.config.copy()
        else:
            raise TypeError
__init__.py:

import os
from flask import Flask
from celery import Celery
from config import config, AppConf

app_conf = AppConf()  # module-level instance; tasks.py imports this

def create_app(config_name):
    app = Flask(__name__)
    app.config.from_object(config[config_name])
    config[config_name].init_app(app)
    app_conf.init_app(app)

    # Connect to Staging view
    from staging.views import staging as staging_blueprint
    app.register_blueprint(staging_blueprint)
    return app

def make_celery(app=None):
    app = app or create_app(os.getenv('FLASK_CONFIG') or 'default')
    celery = Celery(__name__, broker=app.config['CELERY_BROKER_URL'])
    celery.conf.update(app.config)
    TaskBase = celery.Task

    class ContextTask(TaskBase):
        abstract = True

        def __call__(self, *args, **kwargs):
            with app.app_context():
                return TaskBase.__call__(self, *args, **kwargs)

    celery.Task = ContextTask
    return celery
tasks.py:

from app import make_celery, app_conf

cel = make_celery(app_conf.app)

@cel.task
def send_realm_to_fabricdb(realm, form):
    # some actions...
    pass
And here is the problem:
The blueprint "staging" uses the task send_realm_to_fabricdb, so it does from tasks import send_realm_to_fabricdb.
When I just run the application, everything goes OK.
BUT, when I try to run Celery with celery -A app.tasks worker -l info --beat, it reaches cel = make_celery(app_conf.app) in tasks.py with app=None and tries to create the application again, registering the blueprint... so I've got a circular import here.
Could you tell me how to break this cycle?
Thanks in advance.
I don't have the code to try this out, but I think things would work better if you moved the creation of the Celery instance out of tasks.py and into the create_app function, so that it happens at the same time the app instance is created.
The argument you give the Celery worker in the -A option does not need to have the tasks; Celery just needs the celery object. So, for example, you could create a separate starter script, say celery_worker.py, that calls create_app to create app and cel, and then give it to the worker as -A celery_worker.cel, without involving the blueprints at all.
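A minimal sketch of that starter script, with names assumed from the code above:

# celery_worker.py - exists only so the worker has a clean import target
# that doesn't pull in tasks.py or the blueprints first.
import os
from app import create_app, make_celery

app = create_app(os.getenv('FLASK_CONFIG') or 'default')
cel = make_celery(app)

The worker would then be started with celery -A celery_worker.cel worker -l info --beat.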
Hope this helps.
What I did to solve this error was to create two Flask instances, one for the web app and another for the initial Celery instance.
Like @Miguel said, I have:
celery_app.py for the celery instance
manager.py for the Flask instance
In these two files, each module has its own Flask instance.
So I can use celery.task in my views, and I can start the celery worker separately.
Thanks Bob Jordan, you can find the answer at https://stackoverflow.com/a/50665633/2794539.
Key points:
1. make_celery does two things at the same time, creating the celery app and running celery with the Flask context, so you can create two separate functions to do make_celery's job (see the sketch below)
2. the celery app must be initialized before the blueprints are registered
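A sketch of what that two-step split might look like; the function name init_celery is an assumption, not taken from the linked answer:

from celery import Celery

# Step 1: create the celery object at import time, before any blueprint
# is registered; no Flask app is needed yet.
celery = Celery(__name__)

# Step 2: bind it to the app later, inside the app factory.
def init_celery(celery, app):
    celery.conf.update(app.config)

    class ContextTask(celery.Task):
        def __call__(self, *args, **kwargs):
            with app.app_context():
                return super().__call__(*args, **kwargs)

    celery.Task = ContextTask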
Having the same problem, I ended up solving it very easily using shared_task (docs), keeping a single app.py file and not having to instantiate the Flask app multiple times.
The original situation that led to the circular import:

# src.app ALSO imports the blueprints, which import this file,
# which causes the circular import.
from src.app import celery

@celery.task(bind=True)
def celery_test(self):
    sleep(5)
    logger.info("Task processed by Celery.")
The current code, which works fine and avoids the circular import:

# from src.app import celery  <- not needed anymore!
from celery import shared_task

@shared_task(bind=True)
def celery_test(self):
    sleep(5)
    logger.info("Task processed by Celery.")
Please mind that I'm pretty new to Celery, so I might be overlooking important stuff; it would be great if someone more experienced could give their opinion.