Why is asyncio's Queue behaving so weirdly here? Even though I put an item into it, it still shows as empty.
In [1]: from multiprocessing import Queue
In [2]: q = Queue()
In [3]: q.empty()
Out[3]: True
In [4]: q.put(100)
In [5]: q.empty()
Out[5]: False
In [6]: from asyncio import Queue
In [7]: q = Queue()
In [8]: q.empty()
Out[8]: True
In [9]: q.put(100)
Out[9]: <generator object Queue.put at 0x7f97849bafc0>
In [10]: q.empty()
Out[10]: True
Because you didn't put anything:
q.put(100)
put here is not a plain function, it's a coroutine. You have to await it to put the item into the queue.
For example:
import asyncio
from asyncio import Queue


async def main():
    q = Queue()
    print(q.empty())  # True
    await q.put(100)
    print(q.empty())  # False


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
        loop.close()
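On Python 3.7+ the loop management at the bottom can be replaced by asyncio.run, which creates the loop, runs main(), shuts down async generators and closes the loop for you:
if __name__ == '__main__':
    asyncio.run(main())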
As Mikhail Gerasimov's answer says, q.put(100) is a coroutine. To explain in more detail:
Calling a coroutine does not start its code running – the coroutine
object returned by the call doesn’t do anything until you schedule its
execution. There are two basic ways to start it running: call await
coroutine or yield from coroutine from another coroutine (assuming the
other coroutine is already running!), or schedule its execution using
the ensure_future() function or the AbstractEventLoop.create_task()
method.
Coroutines (and tasks) can only run when the event loop is running.
This is from the Python coroutines documentation.
In Mikhail Gerasimov's example, another coroutine, async def main(), awaits the coroutine q.put(100), and the event loop is running via loop.run_until_complete(main()), exactly as the description above requires.
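To illustrate the second option, the put can also be scheduled as a task from inside a running coroutine and awaited later; a minimal sketch:
import asyncio
from asyncio import Queue


async def main():
    q = Queue()
    # schedule the coroutine on the running loop instead of awaiting it in place
    task = asyncio.ensure_future(q.put(100))  # or asyncio.create_task(q.put(100))
    await task                                # the put has completed once the task is awaited
    print(q.empty())  # False


asyncio.run(main())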
Related
I receive the error RuntimeError: Event loop is closed each time I try to make more than one async call inside my test. I have already tried all the suggestions on Stack Overflow for rewriting the event_loop fixture, but nothing works. I wonder what I'm missing.
Run test command: python -m pytest tests/ --asyncio-mode=auto
requirements.txt
pytest==7.1.2
pytest-asyncio==0.18.3
pytest-html==3.1.1
pytest-metadata==2.0.1
test.py
async def test_user(test_client_fast_api):
    assert 200 == 200
    request_first = test_client_fast_api.post(  # works fine
        "/first_route",
    )
    request_second = test_client_fast_api.post(  # receives RuntimeError: Event loop is closed
        "/second_route",
    )
conftest.py
import asyncio

import pytest


@pytest.fixture()
def event_loop():
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        loop = asyncio.new_event_loop()
    yield loop
    loop.close()
It took me all afternoon to solve this problem. I also tried to build on other people's code; here is mine.
Add a file conftest.py to the directory where the test script is placed, and write the following code in it.
import pytest
from main import app
from httpx import AsyncClient


@pytest.fixture(scope="session")
def anyio_backend():
    return "asyncio"


@pytest.fixture(scope="session")
async def client():
    async with AsyncClient(app=app, base_url="http://test") as client:
        print("Client is ready")
        yield client
And then write a test script test_xxx.py.
import pytest
from httpx import AsyncClient


@pytest.mark.anyio
async def test_run_not_exists_schedule(client: AsyncClient):
    response = await client.get("/schedule/list")
    assert response.status_code == 200
    schedules = response.json()["data"]["schedules"]
    schedules_exists = [i["id"] for i in schedules]
    not_exists_id = max(schedules_exists) + 1
    request_body = {"id": not_exists_id}
    response = await client.put("/schedule/run_cycle", data=request_body)
    assert response.status_code != 200


@pytest.mark.anyio
async def test_run_adfasdfw(client: AsyncClient):
    response = await client.get("/schedule/list")
    assert response.status_code == 200
    schedules = response.json()["data"]["schedules"]
    schedules_exists = [i["id"] for i in schedules]
    not_exists_id = max(schedules_exists) + 1
    request_body = {"id": not_exists_id}
    response = await client.put("/schedule/run_cycle", data=request_body)
    assert response.status_code != 200
This is the real test code from my own project; you can adapt it to your own. Finally, run python -m pytest in the project's terminal. If all goes well, it should be OK. This may involve libraries that need to be installed:
pytest
httpx
Yeah, wow, I had a similar afternoon to your experience, @Bai Jinge.
This is the event loop fixture and TestClient pattern that worked for me:
import pytest
from asyncio import get_event_loop
from async_asgi_testclient import TestClient
from starlette.status import HTTP_200_OK

from main import app  # the application under test (assumed, as in the answer above)


@pytest.fixture(scope="module")
def event_loop():
    loop = get_event_loop()
    yield loop


@pytest.mark.asyncio
async def test_example_test_case():
    async with TestClient(app) as async_client:
        response = await async_client.get(
            "/api/v1/example",
            query_string={"example": "param"},
        )
        assert response.status_code == HTTP_200_OK
Ref to relevant GitHub issue: https://github.com/tiangolo/fastapi/issues/2006#issuecomment-689611040
Please note - I could NOT figure out how to use class-based tests. Neither unittest.TestCase nor asynctest.case.TestCase would work for me. The pytest-asyncio docs (here) state that:
Test classes subclassing the standard unittest library are not supported, users are recommended to use unittest.IsolatedAsyncioTestCase or an async framework such as asynctest.
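For reference, a minimal sketch of the unittest.IsolatedAsyncioTestCase route mentioned in that quote, reusing the httpx AsyncClient and the from main import app import from the answers above (both assumed):
import unittest

from httpx import AsyncClient

from main import app  # assumed: the FastAPI app under test, as in the answers above


class ExampleAsyncTests(unittest.IsolatedAsyncioTestCase):
    async def test_example(self):
        # IsolatedAsyncioTestCase gives each test its own event loop
        async with AsyncClient(app=app, base_url="http://test") as client:
            response = await client.get("/api/v1/example")
            self.assertEqual(response.status_code, 200)


if __name__ == "__main__":
    unittest.main()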
I want to launch 10 OS subprocesses with asyncio. I can do that with gather, for example, and then find out the status of each task at the end of the event loop. But I have to wait for the whole thing to finish, even though the tasks run concurrently.
Is there a way to know that subprocess 1 has already finished, and to react to that event, even before the other 9 tasks have completed?
I am working with Python > 3.7 (3.8.6 and 3.9.1).
Maybe my question should be: once the event loop is running, is there a way to find out the status of the tasks it is running?
Or is it expected that the task itself does any follow-up work after its await statement completes, but before returning and leaving the event loop?
I'll try that approach. In the meantime, this is the code I am using for my basic testing:
Example of what I want:
import asyncio
import time


async def osrunner(cmd):
    proc = await asyncio.create_subprocess_shell(
        cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE)

    stdout, stderr = await proc.communicate()

    if stdout:
        print(f'[stdout]\n{stdout.decode()}')
    if stderr:
        print(f'[stderr]\n{stderr.decode()}')

    return True


async def main():
    cmd00 = 'sleep 35'
    cmd01 = 'sleep 15'
    cmd02 = 'sleep 25'
    cmd03 = 'sleep 5'

    task0 = asyncio.create_task(osrunner(cmd00))
    task1 = asyncio.create_task(osrunner(cmd01))
    task2 = asyncio.create_task(osrunner(cmd02))
    task3 = asyncio.create_task(osrunner(cmd03))

    await task0
    await task1
    await task2
    await task3


print(f"started main at {time.strftime('%X')}")
asyncio.run(main())  # <------------------ I want to poll the status of the tasks and do something while the others are still unfinished
print(f"finished main at {time.strftime('%X')}")
I'm a Python beginner, and I'm trying to write some data analysis programs. The program looks like the one below:
import asyncio
import time


class Test:
    def __init__(self, task):
        self.task = task
        time.sleep(5)  # here's some other jobs...
        print(f'{self.task = }')


async def main():
    result = []
    tasks = ['task1', 'task2', 'task3', 'task4', 'task5', 'task6', 'task7', 'task8', 'task9']
    print(f"started at {time.strftime('%X')}")
    # I have a program structure like this, can I use async?
    # how to start init tasks at almost the same time?
    for task in tasks:
        result.append(Test(task))
    print(f"finished at {time.strftime('%X')}")


asyncio.run(main())
I've tried another way, using multiprocessing, and it works; the code is below:
...
def main():
    result = []
    tasks = ['task1', 'task2', 'task3', 'task4', 'task5', 'task6', 'task7', 'task8', 'task9']
    print(f"started at {time.strftime('%X')}")
    # I have a program structure like this, can I use async?
    # how to start init tasks at the same time?
    p = Pool()
    result = p.map(operation, [(task,) for task in tasks])
    print(f"finished at {time.strftime('%X')}")
...
But I still want to learn a more 'modern' way to do this. I've found a module named 'ray'; it's new.
But could async do this? I'm still wondering...
If someone can give me some advice, thanks a lot.
Your example code won't necessarily benefit from async IO, because __init__ is not "awaitable". You might be able to benefit from async if your code were structured differently and had an appropriate bottleneck. For example, if we had:
class Task:
    def __init__(self):
        <some not io bound stuff>
        <some io bound task>
We could re-structure this to:
class Task:
    def __init__(self):
        <some not io bound stuff>

    async def prime(self):
        await <some io bound task>
Then in your main loop you can initialise the tasks as you're doing, then run the slow prime step in your event loop.
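For illustration, here is a minimal sketch of that pattern; the io-bound step is simulated with asyncio.sleep, since the real bottleneck isn't shown:
import asyncio


class Task:
    def __init__(self, name):
        self.name = name  # the not-io-bound setup stays synchronous

    async def prime(self):
        await asyncio.sleep(1)  # stand-in for <some io bound task>
        print(f'{self.name} primed')


async def main():
    tasks = [Task(f'task{i}') for i in range(1, 10)]
    # run all the slow prime steps concurrently in the event loop
    await asyncio.gather(*(t.prime() for t in tasks))


asyncio.run(main())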
My advice here though would be to resist doing this unless you know you definitely have a problem. Coroutines can be quite fiddly, so you should only do this if you need to do it!
When I want to randomly shuffle a list in Python, I do:
from random import shuffle
shuffle(mylist)
How would I do the equivalent to an instance of asyncio.Queue? Do I have to convert the queue to a list, shuffle the list, and then put them back on the Queue? Or is there a way to do it directly?
As you can see in the Queue source code, items in a Queue are actually stored in the _queue attribute. This can be used to extend Queue through inheritance:
import asyncio
from random import shuffle


class MyQueue(asyncio.Queue):
    def shuffle(self):
        shuffle(self._queue)


async def main():
    queue = MyQueue()

    await queue.put(1)
    await queue.put(2)
    await queue.put(3)

    queue.shuffle()

    while not queue.empty():
        item = await queue.get()
        print(item)


if __name__ == '__main__':
    asyncio.run(main())
If you want to shuffle existing Queue instance, you can do it directly:
queue = asyncio.Queue()
shuffle(queue._queue)
It's usually not a good solution, for obvious reasons, but on the other hand the probability that Queue's implementation will change in the future in a way that makes this a problem seems relatively low (to me, at least).
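If you would rather not touch the private attribute at all, the drain-shuffle-refill approach mentioned in the question also works; here is a small sketch, assuming nothing else is consuming the queue while it runs:
import asyncio
from random import shuffle


def shuffle_queue(queue: asyncio.Queue) -> None:
    # drain the queue into a list, shuffle it, then put everything back
    items = []
    while not queue.empty():
        items.append(queue.get_nowait())
        queue.task_done()  # keep join() bookkeeping consistent with the re-puts below
    shuffle(items)
    for item in items:
        queue.put_nowait(item)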
I need the following workflow for my Celery tasks:
When taskA finishes with success, I want to execute taskB.
I know there is the task_success signal, but it returns only the task's result, and I need access to the arguments the previous task was called with. So I decided on code like this:
@app.task
def taskA(arg):
    # not cool, but... https://github.com/celery/celery/issues/3797
    from shopify.tasks import taskA
    taskA(arg)


@task_postrun.connect
def fetch_taskA_success_handler(sender=None, **kwargs):
    from gcp.tasks import taskB
    if kwargs.get('state') == 'SUCCESS':
        taskB.apply_async((kwargs.get('args')[0], ))
The problem is that taskB seems to be executed many, many times in an endless loop, instead of only once.
This way it works correctly:
@app.task
def taskA(arg):
    # not cool, but... https://github.com/celery/celery/issues/3797
    # otherwise it won't get added to the periodic tasks
    from shopify.tasks import taskA
    return taskA(arg)


@task_postrun.connect
def taskA_success_handler(sender=None, state=None, **kwargs):
    resource_name = kwargs.get('kwargs', {}).get('resource_name')
    if resource_name and state == 'SUCCESS':
        if sender.name == 'shopify.tasks.taskA':
            from gcp.tasks import taskB
            taskB.apply_async(kwargs={
                'resource_name': resource_name
            })
just for reference:
celery==4.1.0
Django==2.0
django-celery-beat==1.1.0
django-celery-results==1.0.1
flower==0.9.2
amqp==2.2.2
Python 3.6
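As a side note (a sketch only, not verified against this exact project): if taskA returns the value that taskB needs, Celery's built-in chaining can replace the signal handler entirely:
from celery import Celery, chain

app = Celery('example')  # stand-in for your project's Celery app


@app.task
def taskA(resource_name):
    ...                   # do the real work
    return resource_name  # return whatever taskB needs


@app.task
def taskB(resource_name):
    ...


# taskB receives taskA's return value as its first positional argument:
chain(taskA.s('some-resource'), taskB.s()).apply_async()
# or, equivalently:
taskA.apply_async(('some-resource',), link=taskB.s())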