Is it safe to use Celery task IDs in HTTP requests? - celery

I am starting to use Celery in a Flask-based web application to run async tasks on the server side.
Several resources get an '/action' sub-resource to which the user/client can send a POST including a JSON-body specifying an action, for example:
curl -H "Content-Type: application/json" -X POST \
-d '{"doPostprocessing": { "update": true}}}' \
"http://localhost:5000/api/results/123/action"
They get a 202 ACCEPTED response with a header
Location: http://localhost:5000/api/results/123/action/8c742418-4ade-474f-8c54-55deed09b9e5
they can poll to get the final result (or get another 202 ACCEPTED if the task is still running).
The ID I am returning for the action is the celery.result.AsyncResult.id.
Is this a safe thing to do? What kind of problems do I create when passing Celery task ids directly to the public?
If not, is there a recommended way to it? Preferably one which avoids having to track the state of the tasks explicitly.

You will be fine using the task ID. Celery uses Kombu's uuid function, which in turn uses uuid4 by default. uuid4 is randomly generated, rather than based off mac address (which uuid1 is), so will be 'random enough'.
The only other way would be to have an API endpoint that returns the status of all running tasks for the user. i.e. remove any task ID. But you will then remove the ability to query an individual task. Other options will effectively mask the task ID behind a different random number, so you'll have the same brute force problem.
I'd recommend having a look through the security Stack Exchange for UUID questions (https://security.stackexchange.com/search?q=uuid). Some of these will no doubt be equivalent to what you're looking for.

Related

Which HTTP Verb should I use to claim and lock an item in a job queue?

I plan on using an HTTP REST interface to connect to a Job Control service.
One key operation is to request a computational Job.
The caller does not know the ID of the Job; that is what it will be told.
The job will be marked in the database as locked by the service.
The data needed for processing of the job will be returned to the caller.
Later on, when the caller is done processing the job, it will send the results back via another REST call.
Now it knows the ID of the record to be updated.
The second REST call will update the Job record with the results.
and change the Job's status and release the lock.
Only the Success/Fail status needs to be returned.
I am leaning towards using PUT for each operation because no new record is being created; it is being updated in both cases.
Is this proper? Can the first PUT return a large JSON payload with the Job data or does it just return an HTTP status? Should I use a POST instead, even though I am not creating a record, just updating it?
I would have used a GET for the first operation, but a GET is not supposed to change any objects on the service, and I am locking it, which is a change. Is locking a record acceptable in a GET request?
Which HTTP Verb should I use to claim and lock an item in a job queue?
Key idea: a REST API is a facade - your application/service pretends to be an HTTP compliant document store. All of the interesting things that happen are side effects triggered by modifying documents. See Jim Webber, 2011.
With that in mind...
POST is fine. It's okay to use POST.
PUT/PATCH are a good for remote authoring; the client fetches your representation of a resource, makes edits to his local copy, and sends you a copy of the representation (PUT) or a patch document describing the changes (PATCH). The server can then apply those edits to its copy, or not.
So for your specific example, I would expect the client to GET a representation of your resource, change the information in that representation from unlocked to locked, and then to PUT the changed representation back to your server. You server would be expected to update your copy of the representation to match.
It may remind you of a declarative style - the client tells the server what the representation should look like, and it's up to the server to figure out how to do that.
Included for Completeness, NOT Recommened:
The HTTP method registry also includes a method LOCK, with a corresponding UNLOCK. The semantics for these method tokens are defined by the WebDAV specification. If your meaning of LOCK matches that of WebDAV, then using that might be an answer. Note that the specification includes comments like
Any resource that supports the LOCK method MUST, at minimum, support the XML request and response formats defined herein.
Unless you are already in a space where people are expecting to be able to use general-purpose WebDAV clients to interact with your API, that's probably not a good fit.
The HTTP method registry is extendable. So you could define the semantics of your own method token, then push to have it adopted as a standard.

How can I set my local waves network successfully?

I have 2 questions.
I did execute local waves network.
I want to set 2 miner nodes.
First booted node did woking well and mining blocks.
Second booted node did woking well but just syncing blocks.
Second node didn't mining blocks.
Second node also did set "miner.enable=yes" and have 1000WAVES.
Is there anything else that needs to be set for this node to be minor? Or does this node need time to participate in the mining schedule?
I want to get miner info by using REST API.
My local node's config did set like followings.
api-key-hash = "H6nsiifwYKYEx6YzYD7woP1XCn72RVvx6tC1zjjLXqsu"
And I did call API like this.
curl -X GET http://127.0.0.1:6869/debug/minerInfo -H "Content-Type:application/json" -H "api_key: H6nsiifwYKYEx6YzYD7woP1XCn72RVvx6tC1zjjLXqsu"
But I got error message like this.
{"error":2,"message":"Provided API key is not correct"}
I did call same API in "https://nodes-testnet.wavesnodes.com/api-docs/index.html#/debug/minerInfo_1"
But I got same error message.
How can I call this API successpully?
That should be enough, but if your first node has 99.9999 million Waves and the second one - 1000, the first one will generate 99.9999% of blocks, so maybe it is not the proper time to generate a block for the second node.
You should add header X-Api-Key with an actual API key, not with the hash of it. For example, you had "myawesomekey" and got a hash from it (H6nsiifwYKYEx6YzYD7woP1XCn72RVvx6tC1zjjLXqsu), then you send a header X-Api-Key=myawesomekey

Finding all the users in Jira using the REST API

I'm trying to list all the users in Jira using the REST API, I'm currently using the search user feature using GET : https://docs.atlassian.com/jira/REST/server/#api/2/user-findUsers
The thing is it says that the result will by default display the 50 first result and that we can expand that result up to 1000. Compared to other features available in the REST API, the pagination here is not specified.
An example is the group member feature : https://docs.atlassian.com/jira/REST/server/#api/2/group-getUsersFromGroup
Thus I did a test and with my test Jira filled with 2 members, tried to get only one result and see if there was some sort of indication referring to the rest of my result.
The response provided will only give the results and no ways to get to know if there was more thatn 1000 (or 1 in my example), it's maybe logical but in the case of an organization with more than 1000 members, listing all the users doing this : http://jira/rest/api/2/user/search?username=.&maxResults=1000&includeInactive=true will only give at most 1000 results.
I'm getting all the users no matter what their name are using . as the matching character.
Thanks for your help!
What you can do, is to calculate manually the number of users.
Let's say you have 98 users in your system.
First search will give you 50 users. Now you have an array and you can get the length of that array which is 50.
Since you do not know if there are 50 or 51 users, you execute another search with the parameter &startAt=50.
This time the array length is 48 instead of 50 and you know that you've reached all the users in the system.
From speaking to Atlassian support, it seems like the user/search endpoint has a bug where it will only ever return the first 1,000 results at most.
One possible other way to get all of the users in your JIRA instance is to use the Crowd API's /rest/usermanagement/1/search endpoint:
curl -X GET \
'https://jira.url/rest/usermanagement/1/search?entity-type=user&start-index=0&max-results=1000&expand=user' \
-H 'Accept: application/json' -u username:password
You'll need to create a new JIRA User Server entry to create Crowd credentials (the username:password parameter above) for your application to use in its REST API calls:
Go to User Management.
Select JIRA User Server.
Add an application.
Enter the application name and password that the application will use when accessing your JIRA server application.
Enter the IP address, addresses, or IP CIDR block of the application, and click Save.

Pass complex object to delete method of REST Service

I have REST service that manage the resource EASYPAY.. In this moment this service exposes 3 different methods:
Get a EasyPay request (GET);
Insert a Easypay request (POST);
Update a Easypay request (PUT).
Whe I inserto or update a request I must insert also a row on a trace table on my database.
Now I have to delete a Easypay request and I must add also a row on the trace table.
I wanted to use the DELETE HTTP verb, but I saw that with delete I cannot pass a complex object but just the ID of the request to delete.
I cannot use the PUT HTTP verb because I have already used it, and in any case it would not be conceptually correct...
I do not want to do more that one call from client to server (one for delete the request, the other one to add a row in the trace table)..
So I do not know how to solve the problem.
EDIT
I try to explain better... I have a web site that is deployed on two different server. One for front-end and one for Back-end. Back-end expose some REST services just for front-end and it has no access to internet (just to intranet).
The customer that access the web site can do a payment via a system called XPAY and it works really similar to paypal (XPAY is just another virtual POS).
So when the customer try to do a payment, I save some information on the database + I trace the attempt of the payment, then he is redirected to XPAY. There, he can do the payment. At the endy XPAY return to the web-site (the front end) communicating us the result of the payment.
The result is in the URL of payment, so i must take all the information in the url and send them to the back-end.
According to the result, I must update (if result is ok) or delete (if result is ko) the information I saved before and write a row on the trace table.
What do you suggest?
Thank you
There are actually a couple of ways to solve your problem. First, REST is just an architectural style and not a protocol. Therefore REST does not dictate how an URI has to be made up or what parameters you pass. It only requires a unique resource identifier and probably that it should be self-descriptive, meaning that a client can take further actions based on the returned content (HATEOAS, included links even to itself and proper content type specification).
DELETE
As you want to keep a trace of the deleted resource in some other table, you can either pass data within the URI itself maybe as query parameter (even JSON can be encoded in order to be passed as query parameter) or use custom HTTP headers to pass (meta-)information to the backend.
Sending a complex object (it does not matter if it is XML or JSON) as query parameter may cause certain issues though as some HTTP frameworks limit the maximum URI size to roughly 2000 characters. So if the invoked URI exceeds this limit, the backend may have trouble to fulfill the request.
Although the hypertext transfer protocol does not define a maximum number (or size) of headers certain implementations may raise an error if the request is to large though.
POST
You of course also have the possibility to send a new temporary resource to the backend which may be used to remove the pending payment request and add a new entry to the trace table.
According to the spec:
The action performed by the POST method might not result in a resource that can be identified by a URI. In this case, either 200 (OK) or 204 (No Content) is the appropriate response status, depending on whether or not the response includes an entity that describes the result.
This makes POST request feasible for short living temporary resources which trigger some processing on the server side. This is useful if you want to design some queue-like or listener system where you place an action for execution into the system. As a POST request may contain a body, you can therefore send the POS response within the body of the POST request. This action request can then be used to remove the pending POS request and add a new entry to the trace table.
PATCH
Patch is a way a client can instruct a server to transform one or more resources from state 1 to state 2. Therefore, the client is responsible for breaking down the neccessary actions the server has to take to transform the resources to their desired state while the server tries to execute them. A client always works on a known state (which it gathered at some previous time). This allows the client to modify the current state to a desired state and therefore know the needed steps for the transition. As of its atomic requirements either all instruction succeed or none of them.
A JSON Patch for your scenario may look like this:
PATCH /path/to/resource HTTP/1.1
Host: backend.server.org
Content-lengt: 137
Content-Type: application/json-patch+json
If-Match: "abc123"
[
{ "op": "remove", "path": "/easyPayRequest/12345" }
{ "op": "add", "path": "/trace/12345", "value": [ "answer": "POSAnswerHere" ] }
]
where 12345 is the ID of the actual easypay request and POSAnswerHere should be replaced with the actual response of the POS service or what the backend furthermore expects to write as a trace.
The If-Match header in the example just gurantees that the patch request is executed on the latest known state. If in the meantime a further process changed the state (which also generates a new If-Match value) the request will fail with a 412 Precondition Failed failure.
Discussion
While DELETE may initially be the first choice, it is by far not the best solution in your situation in my opinion as this request is not really idempotent. The actual POS entity deletion is idempotent but the add of the trace is not as multiple sends of the same request will add an entry for each request (-> side-effect). This however contradicts the idempotency requirements of the DELETE operation to some degree.
POST on the otherhand is an all-purpose operation that does not guarantee idempotency (as PATCH does not gurantee either). While it is mainly used to create new resources on the server side, only the server (or the creators of that server-application) know what it actually does with the request (though this may be extended to all operations). As there are no transactional restrictions adding the trace might succeed while the deletion of the pending request entity might fail or vice versa. While this may be handled by the dev-team, the actual operation does not give any gurantees on that issue. This may be a major concern if the server is not in your own hands and thus can not be modified or checked easily.
The PATCH request, which is still in RFC, may contain therefore a bit more semantic then the POST request. It also specifies the ability to modify more then one resource per request explicitely and insist on atomicity which also requires a transaction-like handling. JSON Patch is quite intuitive and conveys a more semantics then just adding the POS response to a POST entity body.
In my opinion PATCH should therefore be prefered over POSTor DELETE.

RESTful way to create multiple items in one request

I am working on a small client server program to collect orders. I want to do this in a "REST(ful) way".
What I want to do is:
Collect all orderlines (product and quantity) and send the complete order to the server
At the moment I see two options to do this:
Send each orderline to the server: POST qty and product_id
I actually don't want to do this because I want to limit the number of requests to the server so option 2:
Collect all the orderlines and send them to the server at once.
How should I implement option 2? a couple of ideas I have is:
Wrap all orderlines in a JSON object and send this to the server or use an array to post the orderlines.
Is it a good idea or good practice to implement option 2, and if so how should I do it.
What is good practice?
I believe that another correct way to approach this would be to create another resource that represents your collection of resources.
Example, imagine that we have an endpoint like /api/sheep/{id} and we can POST to /api/sheep to create a sheep resource.
Now, if we want to support bulk creation, we should consider a new flock resource at /api/flock (or /api/<your-resource>-collection if you lack a better meaningful name). Remember that resources don't need to map to your database or app models. This is a common misconception.
Resources are a higher level representation, unrelated with your data. Operating on a resource can have significant side effects, like firing an alert to a user, updating other related data, initiating a long lived process, etc. For example, we could map a file system or even the unix ps command as a REST API.
I think it is safe to assume that operating a resource may also mean to create several other entities as a side effect.
Although bulk operations (e.g. batch create) are essential in many systems, they are not formally addressed by the RESTful architecture style.
I found that POSTing a collection as you suggested basically works, but problems arise when you need to report failures in response to such a request. Such problems are worse when multiple failures occur for different causes or when the server doesn't support transactions.
My suggestion to you is that if there is no performance problem, for example when the service provider is on the LAN (not WAN) or the data is relatively small, it's worth it to send 100 POST requests to the server. Keep it simple, start with separate requests and if you have a performance problem try to optimize.
Facebook explains how to do this: https://developers.facebook.com/docs/graph-api/making-multiple-requests
Simple batched requests
The batch API takes in an array of logical HTTP requests represented
as JSON arrays - each request has a method (corresponding to HTTP
method GET/PUT/POST/DELETE etc.), a relative_url (the portion of the
URL after graph.facebook.com), optional headers array (corresponding
to HTTP headers) and an optional body (for POST and PUT requests). The
Batch API returns an array of logical HTTP responses represented as
JSON arrays - each response has a status code, an optional headers
array and an optional body (which is a JSON encoded string).
Your idea seems valid to me. The implementation is a matter of your preference. You can use JSON or just parameters for this ("order_lines[]" array) and do
POST /orders
Since you are going to create more resources at once in a single action (order and its lines) it's vital to validate each and every of them and save them only if all of them pass validation, ie. you should do it in a transaction.
I've actually been wrestling with this lately, and here's what I'm working towards.
If a POST that adds multiple resources succeeds, return a 200 OK (I was considering a 201, but the user ultimately doesn't land on a resource that was created) along with a page that displays all resources that were added, either in read-only or editable fashion. For instance, a user is able to select and POST multiple images to a gallery using a form comprising only a single file input. If the POST request succeeds in its entirety the user is presented with a set of forms for each image resource representation created that allows them to specify more details about each (name, description, etc).
In the event that one or more resources fails to be created, the POST handler aborts all processing and appends each individual error message to an array. Then, a 419 Conflict is returned and the user is routed to a 419 Conflict error page that presents the contents of the error array, as well as a way back to the form that was submitted.
I guess it's better to send separate requests within single connection. Of course, your web-server should support it
You won't want to send the HTTP headers for 100 orderlines. You neither want to generate any more requests than necessary.
Send the whole order in one JSON object to the server, to: server/order or server/order/new.
Return something that points to: server/order/order_id
Also consider using CREATE PUT instead of POST