There are some discrepancies in the way the Celery canvas works in async and eager mode. I've noticed that when a dynamic task replaces itself with a group followed by a chain, the results are not passed along to the next task in the chain.
Well, that sounds complicated, so let me show an example:
Given the following task:
@shared_task(bind=True)
def grouped(self, val):
    task = (
        group(asum.s(val, n) for n in range(val)) | asum.s(val)
    )
    raise self.replace(task)
when it's grouped in another canvas like this:
@shared_task(bind=True)
def flow(self, val):
    workflow = (asum.s(1, val) |
                asum.s(2) |
                grouped.s() |
                amul.s(3))
    return self.replace(workflow)
the task amul will not receive the results from grouped when in eager mode.
To really illustrate the issue, I've created a sample project on GitHub where you can dive into the problem and help me out with some quick solutions and, possibly, some PRs to the Celery project.
https://github.com/gutomaia/celery_equation
---- edited ----
In the project, I show the different behavior of the two ways of using Celery. In async mode, those tasks work as expected.
>>> from equation.main import *
>>> from equation.tasks import *
>>> flow.delay(1).get()
78
>>> flow.delay(2).get()
120
>>> flow.delay(100).get()
47895
I was struggling with this situation in a test case. For future readers, at least as of celery 4.4.0, the following idiom will work in all contexts, including synchronous, in-process execution:
return self.replace(...)
Using raise or simply letting the function end right after Task.replace will only work in asynchronous mode. The relevant code is right at the end of Task.replace:
if self.request.is_eager:
    return sig.apply().get()
else:
    sig.delay()
    raise Ignore('Replaced by new task')
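The effect of that branch on the three ways of calling replace can be seen in a toy model. This is only a sketch: the Ignore class and the replace function below are hypothetical stand-ins written for this example, not Celery's own code, and the lambda returning 42 stands in for the replacement signature.

```python
class Ignore(Exception):
    """Toy stand-in for celery.exceptions.Ignore (not the real class)."""


def replace(sig, is_eager):
    """Toy model of the branch at the end of Task.replace quoted above."""
    if is_eager:
        return sig()  # eager: run the replacement inline and hand back its result
    raise Ignore('Replaced by new task')  # worker: the replacement was queued


def with_return(is_eager):
    return replace(lambda: 42, is_eager)


def with_raise(is_eager):
    raise replace(lambda: 42, is_eager)  # eager mode ends up doing `raise 42`


def without_return(is_eager):
    replace(lambda: 42, is_eager)  # the eager result is silently dropped


def outcome(task, is_eager):
    """Return the task's result, or the name of the exception it raised."""
    try:
        return task(is_eager)
    except Exception as exc:
        return type(exc).__name__


for task in (with_return, with_raise, without_return):
    print(task.__name__, outcome(task, True), outcome(task, False))
```

In this model all three variants end with Ignore in worker mode, but only the return variant hands the value 42 back in eager mode; the raise variant fails with a TypeError and the bare call yields None.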
Sadly, eager mode will never be the same as running an actual worker. There are too many intricacies in running an actual worker for eager mode to behave in exactly the same way.
I agree that things like this should fall into special cases when using eager mode but some discrepancy is expected.
Please submit a PR if you know how to fix this issue and we can review the fix there. Thank you!
grouped() is not returning anything, so how do you expect amul to get the result??
I am pretty new to Python, and I am coding a discord bot using discord.py rewrite, python 3.7. I typed up a command, but the bot seems to completely ignore it, and it doesn't give me any errors.
@client.command(pass_context = True)
async def poll(ctx, polltopic, *pollquestions):
    print("Poll command activated.")
    reactions = [":one:", ":two:", ":three:", ":four:",":five:",":six:", ":seven:", ":eight:", ":nine:", ":ten:"]
    number = 0
    await ctx.send("**POLL:** "+str(polltopic))
    for x in pollquestions:
        await ctx.send(str(reactions[number])+" "+str(x))
        number = number+1
The print function used for debugging shows nothing in the output. As other websites advised me to do, I put:
await client.process_commands(message)
at the end of the on_message function. It is still completely ignoring the command, and not giving me any errors.
The problem is probably staring me right in the face, but I don't see it.
I found the error: it had nothing to do with the command syntax itself.
I had a return statement exiting the on_message function early, before it reached await client.process_commands(message), so the bot was ignoring the commands.
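The control flow behind this mistake can be reproduced without Discord at all. In the sketch below, FakeBot is a hypothetical stand-in for the discord.py client that only records which messages reach process_commands, and the "hello" check is an invented example of custom on_message logic.

```python
import asyncio


class FakeBot:
    """Hypothetical stand-in for the discord.py client; it only records commands."""

    def __init__(self):
        self.dispatched = []

    async def process_commands(self, message):
        if message.startswith("!"):
            self.dispatched.append(message)


async def on_message_buggy(client, message):
    if "hello" not in message:
        return  # early exit: process_commands below is never reached
    await client.process_commands(message)


async def on_message_fixed(client, message):
    if "hello" in message:
        pass  # custom handling goes here, without returning
    await client.process_commands(message)


buggy_bot, fixed_bot = FakeBot(), FakeBot()
asyncio.run(on_message_buggy(buggy_bot, "!poll lunch pizza sushi"))
asyncio.run(on_message_fixed(fixed_bot, "!poll lunch pizza sushi"))
print(buggy_bot.dispatched, fixed_bot.dispatched)
```

The buggy handler drops the command silently, which matches the symptom of no output and no error; the fixed handler lets every message fall through to process_commands.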
Currently I'm using Eclipse with the Nokia RED plugin, which allows me to write Robot Framework test suites. It runs on Python 3.6 with Selenium.
My project is called "Automation" and Test suites are in .robot files.
Test suites have test cases which are called "Keywords".
*** Test Cases ***
Create New Vehicle
    Create new vehicle with next ${registrationno} and ${description}
    Navigate to data section
Those "Keywords" are imported from python library and look like:
@keyword("Create new vehicle with next ${registrationno} and ${description}")
def create_new_vehicle_Simple(self, registrationno, description):
    headerPage = HeaderPage(TestCaseKeywords.driver)
    sideBarPage = headerPage.selectDaten()
    basicVehicleCreation = sideBarPage.createNewVehicle()
    basicVehicleCreation.setKennzeichen(registrationno)
    basicVehicleCreation.setBeschreibung(description)
    TestCaseKeywords.carnumber = basicVehicleCreation.save()
The problem is that when I run the test cases, the log only shows the result of the whole Python function: passed or failed. I can't see at which step it failed, whether at the first or the second statement of the function.
Is there any plugin or other solution that lets me see exactly which Python call passed or failed? (Of course, a workaround is to use a separate keyword in the test case for every function, but that is not what I'd prefer.)
If you need to "step into" a Python-defined keyword, you need to use a Python debugger together with RED.
This can be done with any Python debugger; if you would like to have everything in one application, PyDev can be used with RED.
Follow the help document below, and if you run into any problems, leave a comment here.
RED Debug with PyDev
If you are wanting to know which statement in the python-based keyword failed, you simply need to have it throw an appropriate error. Robot won't do this for you, however. From a reporting standpoint, a python based keyword is a black box. You will have to explicitly add logging messages, and return useful errors.
For example, the call to sideBarPage.createNewVehicle() should throw an exception such as "unable to create new vehicle". Likewise, the call to basicVehicleCreation.setKennzeichen(registrationno) should raise an error like "failed to register the vehicle".
If you don't have control over those methods, you can do the error handling from within your keyword:
@keyword("Create new vehicle with next ${registrationno} and ${description}")
def create_new_vehicle_Simple(self, registrationno, description):
    headerPage = HeaderPage(TestCaseKeywords.driver)
    sideBarPage = headerPage.selectDaten()
    try:
        basicVehicleCreation = sideBarPage.createNewVehicle()
    except Exception as e:
        # chain the original error so the root cause stays in the traceback
        raise Exception("unable to create new vehicle") from e
    try:
        basicVehicleCreation.setKennzeichen(registrationno)
    except Exception as e:
        raise Exception("unable to register new vehicle") from e
    ...
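If a keyword has many steps, the repeated try/except blocks can be folded into a small helper. The following is a minimal sketch, not part of Robot Framework or RED: the step context manager and the demo function are hypothetical names, and it uses Python's standard logging module on the assumption that your Robot setup captures those messages in its log.

```python
import logging
from contextlib import contextmanager

log = logging.getLogger(__name__)


@contextmanager
def step(description):
    """Log a named step and re-raise any failure with the step name attached."""
    log.info("START %s", description)
    try:
        yield
    except Exception as exc:
        raise AssertionError(f"Step failed: {description} ({exc!r})") from exc
    log.info("PASS %s", description)


def demo():
    """Run two steps; the second one fails on purpose."""
    try:
        with step("open the vehicle form"):
            pass
        with step("set the registration number"):
            raise ValueError("field not found")
    except AssertionError as err:
        return str(err)


message = demo()
print(message)
```

Inside a keyword, each page-object call would go in its own with step(...) block, so the error reported for the keyword names the step that failed.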
I'm trying a simple sequence of tests on an API:
Create a user resource with a POST
Request the user resource with a GET
Delete the user resource with a DELETE
I've a single frisby test spec file mytest_spec.js. I've broken the test into 3 discrete steps, each with their own toss() like:
f1 = frisby.create("Create");
f1.post(post_url, {user_id: 1});
f1.expectStatus(201);
f1.toss();
// stuff...
f2 = frisby.create("Get");
f2.get(get_url);
f2.expectStatus(200);
f2.toss();
//Stuff...
f3 = frisby.create("delete");
f3.delete(delete_url);
f3.expectStatus(200);
f3.toss();
Pretty basic stuff, right. However, there is no guarantee they'll execute in order as far as I can tell as they're asynchronous, so I might get a 404 on test 2 or 3 if the user doesn't exist by the time they run.
Does anyone know the correct way to create sequential tests in Frisby?
As you correctly pointed out, Frisby.js is asynchronous. There are several approaches to force it to run more synchronously. The easiest, though not the cleanest, is to use .after(() => ...) and nest the next request inside the callback; you can find more about after() in the Frisby.js docs.
There is a suitable method in sbt.Extracted to add a TaskKey to the current state. Assume I have inState: State:
val key1 = TaskKey[String]("key1")
Project.extract(inState).append(Seq(key1 := "key1 value"), inState)
I ran into strange behavior when I do it twice. The following example throws an exception:
val key1 = TaskKey[String]("key1")
val key2 = TaskKey[String]("key2")
val st1: State = Project.extract(inState).append(Seq(key1 := "key1 value"), inState)
val st2: State = Project.extract(st1).append(Seq(key2 := "key2 value"), st1)
Project.extract(st2).runTask(key1, st2)
leads to:
java.lang.RuntimeException: */*:key1 is undefined.
The question is - why does it work like this? Is it possible to add several TaskKeys while executing the particular task by several calls to sbt.Extracted.append?
The example sbt project is sbt.Extracted append-example, to reproduce the issue just run sbt fooCmd
Josh Suereth posted the answer to the sbt-dev mailing list. Quote:
The append function is pretty dirty/low-level. This is probably a bug in its implementation (or the lack of documentation), but it blows away any other appended setting when used.
What you want to do, (I think) is append into the current "Session" so things will stick around and the user can remove what you've done via "session clear" command.
Additionally, the settings you're passing are in "raw" or "fully qualified" form. If you'd like the setting you write to work exactly the same as it would from a build.sbt file, you need to transform it first, so the Scopes match the current project, etc.
We provide a utility in sbt-server that makes it a bit easier to append settings into the current session:
https://github.com/sbt/sbt-remote-control/blob/master/server/src/main/scala/sbt/server/SettingUtil.scala#L11-L29
I have tested the proposed solution and that works like a charm.
I'm trying to better understand common strategies regarding results and errors in Celery.
I see that results have statuses/states and store results if requested -- when would I use this data? Should error handling and data storage be contained within the task?
Here is a sample scenario, in case it helps better understand my objective:
I have a geocoding task that geocodes user addresses. Whether the task fails or succeeds, I'd like to update a field in the database letting the user know (error handling). On success, I'd like the geocoded data to be inserted into the database (data storage).
What approach should I take?
Let me preface this by saying that I'm still getting a feel for Celery myself. That being said, I have some general inclinations about how I'd go about tackling this, and since no one else has responded, I'll give it a shot.
Based on what you've written, a relatively simple (though I suspect non-optimized) solution is to follow the broad contours of the blog comment spam task example from the documentation.
app.models.py
class Address(models.Model):
    GEOCODE_STATUS_CHOICES = (
        ('pr', 'pre-check'),
        ('su', 'success'),
        ('fl', 'failed'),
    )
    address = models.TextField()
    ...
    geocode = models.TextField()
    geocode_status = models.CharField(max_length=2,
                                      choices=GEOCODE_STATUS_CHOICES,
                                      default='pr')

class AppUser(models.Model):
    name = models.CharField(max_length=100)
    ...
    address = models.ForeignKey(Address)
app.tasks.py
from celery import task
from app.models import Address, AppUser
from some_module import geocode_function  # assuming this returns a string

@task()
def get_geocode(appuser_pk):
    user = AppUser.objects.get(pk=appuser_pk)
    address = user.address
    try:
        result = geocode_function(address.address)
        address.geocode = result
        address.geocode_status = 'su'  # mark the address object as successful
        address.save()
        # Returning a value is optional -- your task doesn't have to return
        # anything. On the other hand, you could also choose to decouple the
        # geocode function from the database update for the object instance.
        # Also, if you're thinking about chaining tasks together, consider
        # whether it's advantageous to pass a parameter as an input or
        # partial input into the child task.
        return address.geocode
    except Exception:
        address.geocode_status = 'fl'  # mark the address object as failed
        address.save()
        # do something_else()
        raise  # re-raise the error, in case you want to trigger retries, etc.
app.views.py
from app.tasks import *
from app.models import *
from django.shortcuts import get_object_or_404

def geocode_for_address(request, app_user_pk):
    app_user = get_object_or_404(AppUser, pk=app_user_pk)
    # ... etc. Somewhere in here, call your tasks with the appropriate args/kwargs.
I believe this meets the minimal requirements you've outlined above. I've intentionally left the view undeveloped since I don't have a sense of how exactly you want to trigger it. It sounds like you also may want some sort of user notification when their address can't be geocoded ("I'd like to update a field in a database letting a user know"). Without knowing more about the specifics of this requirement, it sounds like something that might be best accomplished in your HTML templates (if instance.attribute value is X, display q in template) or by using Django signals (set up a signal for when a user.address.geocode_status switches to failure and respond to it, say, by emailing the user to let them know).
In the comments to the code above, I mentioned the possibility of decoupling and chaining the component parts of the get_geocode task. You could also think about decoupling the exception handling from the get_geocode task by writing a custom error handler task and using the link_error parameter (for instance, add.apply_async((2, 2), link_error=error_handler.s()), where error_handler has been defined as a task in app.tasks.py). Also, whether you choose to handle errors via the main task (get_geocode) or via a linked error handler, I would think that you would want to get much more specific about how to handle different sorts of errors (e.g., handle connection errors differently from incorrectly formatted address data).
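To make the last point concrete, here is a minimal sketch of treating error types differently, written in plain Python so it runs without Celery or Django. The function geocode_with_status and the three fake geocoders are hypothetical, and the mapping of ConnectionError/TimeoutError to "retry" and ValueError to "bad address" is an assumption; a real geocoding client may raise different exceptions.

```python
def geocode_with_status(geocode_function, raw_address):
    """Return (status, result), reusing the 'su'/'fl' codes from the model above."""
    try:
        return 'su', geocode_function(raw_address)
    except (ConnectionError, TimeoutError):
        raise  # transient failure: let the caller (e.g. a task retry) try again
    except ValueError:
        return 'fl', None  # malformed address: retrying will not help


def working_geocoder(raw_address):
    return "40.7128,-74.0060"  # made-up coordinates for the demo


def strict_geocoder(raw_address):
    raise ValueError("address is not parseable")


def offline_geocoder(raw_address):
    raise ConnectionError("geocoding service unreachable")


ok = geocode_with_status(working_geocoder, "1 Main St")
bad = geocode_with_status(strict_geocoder, "???")
try:
    geocode_with_status(offline_geocoder, "1 Main St")
    retried = False
except ConnectionError:
    retried = True  # in a task, this is where a retry could be triggered

print(ok, bad, retried)
```

With this split, only the permanent failure is written to geocode_status as 'fl', while transient failures stay as exceptions so that whatever retry mechanism you use can handle them.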
I suspect there are better approaches, and I'm just beginning to understand how inventive you can get by chaining tasks, using groups and chords, etc. Hope this helps at least get you thinking about some of the possibilities. I'll leave it to others to recommend best practices.