GitHub API: is it possible to bring down the number of requests?

Playing around with the GitHub API, I see that there is a limit of 5,000 requests an hour. Is there a way to increase this? (Not the main question.)
When I hit the /users/:username/events endpoint, I get back an array of events. The ones that are PushEvent have an array called commits. Each commit has its own endpoint that I can hit to pull more data.
This racks up requests super quickly and I was wondering if there was a better way to do this.
Thanks!

This limit is in place to prevent a single user from using too many resources, so it's not likely that it will be raised. If you want to make one request for multiple events, you can do that with the v4 GraphQL API. Note that the API limits are different for the v4 API and are scored based on points, but it's possible that you may be able to structure your query so as to be more efficient and allow you more data to be fetched.
This answer explains how you can write a GraphQL query to inquire about multiple objects with one request.
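For example, a single GraphQL query can pull back the recent commits of a repository that would otherwise take one REST request per commit. Here is a rough sketch in Python (the repository, the selected fields and the token handling are placeholders, not a recommendation):

import os
import requests

# One GraphQL query fetching the last 20 commits on the default branch,
# instead of one REST request per commit.
query = """
query {
  repository(owner: "octocat", name: "Hello-World") {
    defaultBranchRef {
      target {
        ... on Commit {
          history(first: 20) {
            nodes { messageHeadline committedDate author { name } }
          }
        }
      }
    }
  }
}
"""

response = requests.post(
    "https://api.github.com/graphql",
    json={"query": query},
    headers={"Authorization": f"bearer {os.environ['GITHUB_TOKEN']}"},
)
print(response.json())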
There may be alternative ways to get the information you want. For example, if you want to know about push events as they happen instead of after the fact, you may be able to set up a webhook, which isn't rate limited.
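For instance, a minimal receiver for push webhooks could look like this sketch (Flask and the /webhook route are my own assumptions; signature verification via X-Hub-Signature-256 is omitted for brevity):

from flask import Flask, request

app = Flask(__name__)

@app.post("/webhook")
def handle_push():
    # GitHub identifies the event type in the X-GitHub-Event header
    if request.headers.get("X-GitHub-Event") == "push":
        payload = request.get_json()
        for commit in payload.get("commits", []):
            print(commit["id"], commit["message"])  # each pushed commit
    return "", 204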

Related

How to design DELETE REST API that requires lots of data?

I want to implement a DELETE REST API. But I need the option to provide a list of IDs to be deleted. This list could be arbitrarily long and may not fit within a URL.
I know POST supports a request body, but whether DELETE should accept one seems debatable. I wonder how others are handling this case.
How would an API be designed to handle this case?
This is unfortunately one of the biggest limitations in REST, but there are ways around it.
In this case I would abstract out a new entity, DeletionRequest, and have that get POSTed or PUT with the appropriate IDs. Since it is a new entity, it would have its own REST endpoints.
A nice side effect of this is that the endpoints and entity can be expanded to support async requests. If you want to delete a ton of data, you don't want to rely on it happening in a single request, as things like timeouts can get in the way. With a DeletionRequest the user gets an ID for the deletion request from the initial POST, and can then check its status with a GET request. Behind the scenes you can use an async system (Celery, Sidekiq, etc.) to actually delete things and update the status of the DeletionRequest.
You don't have to take it that far to start, of course, but this would allow you to expand the application in that direction without having to change your API.
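To make that concrete, here is a minimal sketch of such a DeletionRequest endpoint pair (Flask purely for illustration; the route names, payload shape and in-memory store are assumptions):

import uuid
from flask import Flask, jsonify, request

app = Flask(__name__)
deletion_requests = {}  # in-memory store; use a real database in practice

@app.post("/deletion-requests")
def create_deletion_request():
    ids = request.get_json().get("ids", [])
    request_id = str(uuid.uuid4())
    deletion_requests[request_id] = {"ids": ids, "status": "pending"}
    # hand off to the async worker (Celery, Sidekiq, ...) here
    return jsonify({"id": request_id, "status": "pending"}), 202

@app.get("/deletion-requests/<request_id>")
def get_deletion_request(request_id):
    dr = deletion_requests.get(request_id)
    if dr is None:
        return jsonify({"error": "not found"}), 404
    return jsonify({"id": request_id, **dr})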
The URI is the resource identifier, so in my opinion the DELETE should not contain a body, even if your client and server allow it. Either you send your data in the URI or you send it prior to the DELETE.
I see 3 options here, but maybe there are others:
1. Do what Robert says and POST a transaction resource instead, like DeletionRequest.
2. Group the resources you want to delete and DELETE the entire group.
3. Do a massive hack and PATCH the collection of resources you want to delete from.

REST api - can I call multiple APIs within another API? What happens if one of the internal API calls fail?

So I have a question regarding REST API design:
I have multiple REST APIs, each with specific behaviour. One updates a vehicle's status (e.g. available, on-hire, maintenance, retired) and validates that the vehicle has no current hire bookings before changing the status to, say, retired. Another logs any instances of damage once a vehicle has been returned from hire; this just records general comments on the vehicle's condition. And another retires a vehicle from use (e.g. at end of life), which currently duplicates the logic and actions of the other two APIs inside the RETIRE API.
However, I want to avoid this duplication of code/logic by having the RETIRE API call the individual APIs as part of handling its own request. That way, when I need to change the logic of the other APIs, I don't have to duplicate the change in the RETIRE API.
Within the current API design there is error handling: if a specific action causes an error, the API rolls back the transaction and returns an error to the user explaining why it failed; otherwise it commits the changed data. This works great.
However, if I call each API from within the RETIRE API, how can I handle the errors? Say the RETIRE API first calls the DAMAGE API to record any damage, and that succeeds and commits its data, but then the VEHICLE STATUS API fails and returns the relevant error to the user. This is where the problem is: the DAMAGE API has already run and succeeded, so its data is already saved. If the user tries again and this time everything succeeds, I will have duplicate data in the damage records.
So how can I commit the data for all of the APIs only once they have all returned successfully? Or is it better to keep the 3 APIs independent of each other, but extract shared functions and call those functions from each API, so that if I change the logic of a given action they all follow suit?
Sorry for the story but I just wanted to help get my problem across to you all.
Thanks in advance
Your description indicates that you want the RETIRE API to use some of the logic in the DAMAGE API, but not all of it... So factor out the stuff you want to reuse into an internal method that can be called by both APIs.
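For example, a rough sketch of that factoring-out, where the "database" is just a dict standing in for your transaction boundary and all names are hypothetical:

import copy

vehicles = {1: {"status": "available", "damage": []}}

def record_damage(vehicle, comments):
    # shared logic used by both the DAMAGE and RETIRE handlers
    vehicle["damage"].append(comments)

def set_vehicle_status(vehicle, status):
    # shared validation + status change used by VEHICLE STATUS and RETIRE
    if status == "retired" and vehicle["status"] == "on-hire":
        raise ValueError("vehicle still has an active hire booking")
    vehicle["status"] = status

def retire_vehicle(vehicle_id, damage_comments):
    # single "transaction": work on a copy, only publish it if every step succeeds
    working = copy.deepcopy(vehicles[vehicle_id])
    record_damage(working, damage_comments)
    set_vehicle_status(working, "retired")  # if this raises, nothing is saved
    vehicles[vehicle_id] = working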

How to get list of contributors of particular organisation in GitHub API

Is there any way to get all contributors from an organisation on GitHub, using the GitHub API or any external service?
I am trying to get all contributors from the angular organisation using the GitHub API.
I've found only one solution:
Get all repos from the angular organisation using this request:
GET https://api.github.com/orgs/angular/repos
For each repo, get all its contributors by this request:
GET https://api.github.com/repos/angular/:repo/contributors
Merge all derived data to one array.
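Roughly, that translates into something like this (no authentication and no pagination shown, which is part of why it is so slow):

import requests

API = "https://api.github.com"

repos = requests.get(f"{API}/orgs/angular/repos", params={"per_page": 100}).json()

contributors = []
for repo in repos:
    resp = requests.get(f"{API}/repos/angular/{repo['name']}/contributors")
    contributors.extend(resp.json())  # merge everything into one array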
It seems to work, but I think this solution is very cumbersome. I'm sending around 300 requests this way, and they take around 20 seconds to process (the app is frozen until all the requests have finished).
Questions:
Are there any alternatives to this approach?
Is it OK for a registered GitHub app to make this many requests? I should mention that these 300 requests are sent each time the application starts.
Are there any alternatives to this approach?
No, not really -- I can't think of a better approach for this.
Is it OK for a registered GitHub app to make this many requests? I should mention that these 300 requests are sent each time the application starts.
You should be fine as long as you respect the primary and secondary GitHub API rate limits.
https://developer.github.com/v3/#rate-limiting
https://developer.github.com/guides/best-practices-for-integrators/#dealing-with-abuse-rate-limits
The primary limits allow you to make 5,000 authenticated requests per hour per user. The secondary limits will be triggered if you start making lots of concurrent requests (e.g. hundreds of requests per second for more than a few seconds). So you should be fine if you need to make 300 requests; just make sure you dial down the concurrency.
It would be even better if the application cached some of this information so that it can make conditional requests:
https://developer.github.com/v3/#conditional-requests
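For instance, a conditional request could look like this sketch (the in-memory ETag cache is purely illustrative); a 304 Not Modified response means the cached copy is still valid and the request does not count against the rate limit:

import requests

etag_cache = {}  # url -> (etag, cached_json)

def cached_get(url, token):
    headers = {"Authorization": f"token {token}"}
    if url in etag_cache:
        headers["If-None-Match"] = etag_cache[url][0]
    resp = requests.get(url, headers=headers)
    if resp.status_code == 304:
        return etag_cache[url][1]  # nothing changed, serve the cached copy
    etag_cache[url] = (resp.headers.get("ETag"), resp.json())
    return etag_cache[url][1]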

Delete multiple records using REST

What is the REST-ful way of deleting multiple items?
My use case is that I have a Backbone Collection wherein I need to be able to delete multiple items at once. The options seem to be:
1. Send a DELETE request for every single record (which seems like a bad idea if there are potentially dozens of items);
2. Send a DELETE where the IDs to delete are strung together in the URL (i.e., "/records/1;2;3");
3. In a non-REST way, send a custom JSON object containing the IDs marked for deletion.
All options are less than ideal.
This seems like a gray area of the REST convention.
Option 1 is a viable RESTful choice, but obviously has the limitations you have described.
Option 2: don't do this. It would be construed by intermediaries as meaning "DELETE the (single) resource at /records/1;2;3", so a 2xx response to it may cause them to purge their cache of /records/1;2;3; not purge /records/1, /records/2 or /records/3; proxy a 410 response for /records/1;2;3; or do other things that don't make sense from your point of view.
Option 3 is the best choice, and it can be done RESTfully. If you are creating an API and you want to allow mass changes to resources, you can use REST to do it, but exactly how is not immediately obvious to many. One method is to create a 'change request' resource (e.g. by POSTing a body such as records=[1,2,3] to /delete-requests) and poll the created resource (specified by the Location header of the response) to find out if your request has been accepted, rejected, is in progress or has completed. This is useful for long-running operations. Another way is to send a PATCH request to the list resource, /records, the body of which contains a list of resources and actions to perform on those resources (in whatever format you want to support). This is useful for quick operations where the response code for the request can indicate the outcome of the operation.
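As a client-side sketch of that first method (the /delete-requests path, the payload and the status values are assumptions, not a real API):

import time
import requests

BASE = "https://api.example.com"

resp = requests.post(f"{BASE}/delete-requests", json={"records": [1, 2, 3]})
status_url = resp.headers["Location"]  # where to poll for progress

while True:
    status = requests.get(status_url).json()
    if status["state"] in ("completed", "rejected"):
        break
    time.sleep(1)  # poll until the deletion request settles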
Everything can be achieved whilst keeping within the constraints of REST, and usually the answer is to make the "problem" into a resource, and give it a URL.
So, batch operations, such as delete here, or POSTing multiple items to a list, or making the same edit to a swathe of resources, can all be handled by creating a "batch operations" list and POSTing your new operation to it.
Don't forget, REST isn't the only way to solve any problem. “REST” is just an architectural style and you don't have to adhere to it (but you lose certain benefits of the internet if you don't). I suggest you look down this list of HTTP API architectures and pick the one that suits you. Just make yourself aware of what you lose out on if you choose another architecture, and make an informed decision based on your use case.
There are some bad answers to this question on "Patterns for handling batch operations in REST web services?" which have far too many upvotes, but they ought to be read too.
If GET /records?filteringCriteria returns array of all records matching the criteria, then DELETE /records?filteringCriteria could delete all such records.
In this case the answer to your question would be DELETE /records?id=1&id=2&id=3.
I think Mozilla Storage Service SyncStorage API v1.5 is a good way to delete multiple records using REST.
Deletes an entire collection.
DELETE https://<endpoint-url>/storage/<collection>
Deletes multiple BSOs from a collection with a single request.
DELETE https://<endpoint-url>/storage/<collection>?ids=<ids>
ids: deletes the BSOs from the collection whose ids are in the provided comma-separated list. A maximum of 100 ids may be provided.
Deletes the BSO at the given location.
DELETE https://<endpoint-url>/storage/<collection>/<id>
http://moz-services-docs.readthedocs.io/en/latest/storage/apis-1.5.html#api-instructions
This seems like a gray area of the REST convention.
Yes, so far I have only come across one REST API design guide that mentions batch operations (such as a batch delete): the Google API design guide.
This guide mentions the creation of "custom" methods that can be associated with a resource by using a colon, e.g. https://service.name/v1/some/resource/name:customVerb. It also explicitly mentions batch operations as a use case:
A custom method can be associated with a resource, a collection, or a service. It may take an arbitrary request and return an arbitrary response, and also supports streaming request and response. [...] Custom methods should use HTTP POST verb since it has the most flexible semantics [...] For performance critical methods, it may be useful to provide custom batch methods to reduce per-request overhead.
So you could do the following according to google's api guide:
POST /api/path/to/your/collection:batchDelete
...to delete a bunch of items of your collection resource.
I've allowed for a wholesale replacement of a collection, e.g. PUT ~/people/123/shoes where the body is the entire collection representation.
This works for small child collections of items where the client wants to review the items, prune out some, add in some others, and then update the server. They could PUT an empty collection to delete all.
This would mean GET ~/people/123/shoes/9 would still remain in cache even though a PUT deleted it, but that's just a caching issue, and it would be a problem anyway if some other person deleted the shoe.
My data/systems APIs always use ETags as opposed to expiry times so the server is hit on each request, and I require correct version/concurrency headers to mutate the data. For APIs that are read-only and view/report aligned, I do use expiry times to reduce hits on origin, e.g. a leaderboard can be good for 10 mins.
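For example, an ETag-based update of the shoes collection might look like this sketch (URLs and payload shapes are illustrative only):

import requests

url = "https://api.example.com/people/123/shoes"

resp = requests.get(url)
etag = resp.headers["ETag"]
shoes = resp.json()

shoes = [s for s in shoes if s["id"] != 9]  # prune the collection client-side

update = requests.put(url, json=shoes, headers={"If-Match": etag})
if update.status_code == 412:
    print("collection changed since we read it; re-fetch and retry")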
For much larger collections, like ~/people, I tend not to need multiple delete, the use-case tends not to naturally arise and so single DELETE works fine.
In future, and from experience building REST APIs and hitting the same issues and requirements (such as auditing), I'd be inclined to use only GET and POST verbs and design around events, e.g. POST a change-of-address event, though I suspect that'll come with its own set of problems :)
I'd also allow front-end devs to build their own APIs that consume stricter back-end APIs, since there are often practical, valid client-side reasons why they don't like strict "Fielding zealot" REST API designs, and for productivity and cache-layering reasons.
You can POST a deleted resource :). The URL will be
POST /deleted-records
and the body will be
{"ids": [1, 2, 3]}

How do I create a stack in a REST API?

I am working on a distributed execution server. I have decided to use a REST API based on HTTP on the server. The clients will connect to the server and GET the next task to be accomplished. Obviously I need to "update" the task that is retrieved to ensure that it is only processed once. A GET is not supposed to have any side effects (like changing the state of the resource retrieved). I could use a POST (to update the resource), but I also need to retrieve it. I am thinking that I could have a URL where a POST marks the task as "claimed" and a GET then retrieves it, but then I have a side effect on GET again. Is this just not going to work in REST? I am OK with having a "function" resource to do this, but I don't want to give up the paradigm without a little research.
Pat O
If nothing else fits, you're supposed to use a POST request. Nothing prevents you from returning the resource on a POST request. But it becomes apparent that something (in this case) will happen to that resource, which wouldn't be the case when using a GET request.
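A minimal sketch of that idea, where a POST both claims the next task and returns it in the response body (Flask and the /tasks/claim route are my own assumptions):

from collections import deque
from flask import Flask, jsonify

app = Flask(__name__)
pending_tasks = deque([{"id": 1, "cmd": "build"}, {"id": 2, "cmd": "test"}])

@app.post("/tasks/claim")
def claim_next_task():
    if not pending_tasks:
        return jsonify({"error": "no tasks available"}), 404
    task = pending_tasks.popleft()  # claim: remove from the pending queue
    task["status"] = "claimed"
    return jsonify(task), 200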
REST is really just a concept, and you can implement it however you want. There is no one 'right way', as everyone's use case is different. (Yes, I understand that there is a defined spec out there, but you can still do it however you want.) In this situation, if your GET needs to have a side effect, it will have a side effect. Just make sure to properly document what you did (and potentially why you did it).
However, it sounds like you're just trying to create a queue with multiple subscribers, and if the subscribers are automated (such as scripts or other machines), you may want to look at using an actual queue (http://www.rabbitmq.com/getstarted.html).
If you are using this to power a web UI or something where actual people process this, you could also use a queue, with your GET request simply pulling the next item from the queue.
Note that when using most of the messaging systems you will not be able to guarantee the order in which the messages are pulled from the queue, so if the order is necessary you may not be able to use this approach.