I have a PHP application that uses the GitHub API to check whether the user's local install is up to date with the latest version on GitHub.
By default you can only query the API 60 times per hour, but I can increase this if I authenticate first. Apparently you can also increase the rate limit without authenticating, which is what I want.
The example that I found says I only have to call this URL through cURL:
https://api.github.com/users/whatever?client_id=xxxxxxxxxxxxxx&client_secret=yyyyyyyyyyyyyyyyyyyyy
Not sure if this will work, and I have no idea what they mean by whatever in the URL (presumably a placeholder for a GitHub username).
In any case, this should do the trick for increasing the rate limit. But they also say the following:
This method should only be used for server-to-server calls. You should
never share your client secret with anyone or include it in
client-side browser code.
Since my application is an "open source" PHP application, that basically means I'm going to share my secret key with everyone who uses my application...
Is there any other way to increase the rate limit, without worrying that I'm sharing sensitive data with others?
You have to contact GitHub support to have your rate limit increased, contrary to what spuder claims. They'll raise it for you.
As for not sharing your client_id or client_secret, spuder has the right answer there. Use an environment variable on your production system, read it at startup, and set the secret/id pair from there. Alternatively, use a configuration file listed in .gitignore so you never accidentally commit it.
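For example, here's a minimal sketch of the environment-variable approach (shown in Python for brevity; PHP's getenv() gives you the same thing — the variable names and the requests library are my assumptions, not part of the question):

```python
import os

import requests  # assumed HTTP client; any client works the same way

# Read the OAuth app credentials from the environment instead of
# hard-coding them in the public source tree. The names
# GITHUB_CLIENT_ID / GITHUB_CLIENT_SECRET are placeholders.
client_id = os.environ["GITHUB_CLIENT_ID"]
client_secret = os.environ["GITHUB_CLIENT_SECRET"]

# Same URL pattern as in the question, with the credentials filled in
# at runtime rather than committed to the repository.
resp = requests.get(
    "https://api.github.com/users/whatever",
    params={"client_id": client_id, "client_secret": client_secret},
)
print(resp.headers.get("X-RateLimit-Remaining"))  # quota left this hour
```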
Why not set the client secret as a variable, and purposely put a fake client secret in your code? That will prevent anyone who pulls your code from accidentally or maliciously using your key. To actually use the program, the end user will have to change the key variable.
The number of API requests per hour cannot be increased {Correction: unless you contact GitHub}.
So I have a platform that works like this: users can create accounts by logging in with Google (I use Auth0), and then they can create "Projects" which contain lots of other stuff unimportant to my current problem (like todo lists, the ability to upload files, etc.; they can also edit a project by changing some of its attributes, like name, description, theme and so on). There is a home page where everyone can see each other's projects and access them (but not upload files or change the tasks in the todo lists; that is possible only for the person who owns the project).
By using a tool like Burp, people can see the requests made from the frontend to the backend, for example when accessing one of the projects, and modify them on the fly.
This is what it looks like inside Burp when they access one of the projects: a GET request to /projects/idOfTheProject. They can replace the GET with DELETE, for example, and will successfully delete the project; they can also see what is sent to the backend when a project is edited (name changed, description, thumbnail picture, etc.) and change anything they want about it.
How should I prevent this?
What I've looked at so far:
a. JWT - probably the best fit for my situation, but it requires the most work (my platform is almost finished with no such security measure implemented yet, so I may need to rewrite a lot of things in both the backend and the frontend)
b. Sending the id of the user who initiated the action to the backend and verifying that they have the necessary privileges - the worst solution, since users can access each other's profiles and see the id, then just change that field in the request's JSON
c. Having a sort of token for each user and sending that instead of the user's id - that way somebody can't get your token just by looking at the traffic between frontend and backend (unless they're using YOUR account). That token could maybe come from Auth0 when the account is created, if they provide something like that; or I could just create it myself and store it alongside the other user variables. You would still see the requests in plain text, but even if you modified something you would still have to guess the owner's token, which is practically impossible.
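Roughly what I have in mind for option c, if I roll it myself (just a sketch, all names made up):

```python
import secrets

USER_TOKENS = {}  # token -> user id; would live in the database

def issue_token(user_id):
    # ~256 bits of randomness: guessing another user's token is infeasible
    token = secrets.token_urlsafe(32)
    USER_TOKENS[token] = user_id
    return token

def user_for_token(token):
    # None for unknown/forged tokens; the endpoint would reject with 401
    return USER_TOKENS.get(token)
```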
For the frontend I use NextJS and for the backend Flask.
Thank you in advance!
The TL;DR is that you don’t. A determined user will always be able to see what requests are being sent out by the code running on their computer and over their network. What you are describing when asking how to prevent people from “sniffing” these requests is security through obscurity, which isn’t actually secure at all.
What you should do instead is have an authorization system on your backend which will check if the current user can perform a given action on a given resource. For example, verifying that a user is an administrator before allowing them to delete a blog post, or making sure that the current user is on the same account as another user before allowing the current user to see details about the other user.
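Here's a minimal sketch of what that looks like in Flask (the OP's backend). The in-memory data and the way the current user id is obtained are placeholders; in a real app the user comes from validating the Auth0 token on every request, never from a field the client sends:

```python
from flask import Flask, abort, g

app = Flask(__name__)

# In-memory stand-in for the real database; keys are project ids.
PROJECTS = {1: {"name": "demo", "owner_id": "auth0|alice"}}

def current_user_id():
    # Placeholder: in production this comes from validating the Auth0
    # access token on each request (e.g. in a before_request hook).
    return getattr(g, "user_id", None)

@app.route("/projects/<int:project_id>", methods=["DELETE"])
def delete_project(project_id):
    project = PROJECTS.get(project_id)
    if project is None:
        abort(404)
    # The ownership check runs on the server, so editing the request in
    # Burp (changing GET to DELETE, swapping ids, etc.) cannot bypass it.
    if project["owner_id"] != current_user_id():
        abort(403)  # authenticated, but not the owner
    del PROJECTS[project_id]
    return "", 204
```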
For my application/crawlers I collect lots of data, which leads to exceeding the rate limit very often. I crawl news pages and similar stuff, so the token doesn't need any permissions like posting.
When using the Graph Explorer you can create a User Access Token (it lasts 1-2 hours before expiring). You can create as many as you want, so I thought it might be possible to abuse this to get around the rate limit. I tested it and it somehow worked: I did about 6000 API calls with 2 tokens in under an hour.
Questions:
Has someone else tried this already? If so, did Facebook notice and shut down the account?
Is it possible to request a new User Token from the Graph Explorer via code, or by something else like a virtual machine running a mouse macro to generate new tokens every ~30 minutes?
Yes. Yes :) It can go as far as banning the account or the IP from which the requests are made.
Access tokens can be obtained via code, and you can create several and do some load balancing between them, combined with different proxies to route your requests through.
HOWEVER, I recommend you use Facebook's intended mechanisms and respect their policies.
Background:
I have a single page application that pulls data from a REST API. The API is designed such that the only URL the client needs to know is the API root, i.e. https://example.com/api, which provides the URLs for the other resources so that the client doesn't need any knowledge of how they are constructed.
API Design
The API has three main classes of data:
Module: Top level container
Category: A sub-container in a specific module
Resource: An item in a category
SPA Design
The app consuming the API has views for listing modules, viewing a particular module's details, and viewing a particular resource. The app keeps all loaded data in a store, which persists until the page is closed or refreshed.
The Problem:
My question is: if the user has navigated to a resource's detail view (example.com/resources/1/) and then refreshes the page, how do I load that particular resource without knowing its API URL?
Potential Solutions:
Hardcode URLs
Hardcoding the URLs would be fairly straightforward since I control both the API and the client, but I would really prefer to stick to a self describing API where the client doesn't need to know about the URLs.
Recursive Fetch
I could fetch the data recursively. For example, if the user requests a Resource with a particular ID, I could perform the following steps.
Fetch all the modules.
For each module, fetch its categories.
Find the category that contains the requested resource and fetch the requested resource's details.
My concern with this is that I would be making a lot of unnecessary requests. If we have 100 modules but the user is only ever going to view one of them, we still make 100 requests to get the categories in each module.
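To make the cost concrete, here's roughly what that traversal looks like (the link structure — the root listing module URLs, modules listing category URLs, and so on — is my assumption about the self-describing API):

```python
import requests

API_ROOT = "https://example.com/api"  # the only URL the client knows

def find_resource(resource_id):
    # Walk root -> modules -> categories looking for a single resource.
    # Every module's categories get fetched even though at most one
    # chain actually contains the resource we want.
    root = requests.get(API_ROOT).json()
    for module_url in root["modules"]:
        module = requests.get(module_url).json()
        for category_url in module["categories"]:
            category = requests.get(category_url).json()
            for resource_url in category["resources"]:
                if resource_url.rstrip("/").endswith(f"/{resource_id}"):
                    return requests.get(resource_url).json()
    return None  # not found anywhere
```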
Descriptive URLs
If I nested URLs like example.com/modules/123/categories/456/resources/789/, then I could do 3 simple lookups, since I could avoid searching through the received data. The issue with this approach is that the URLs quickly become unwieldy, especially if I also wanted to include a slug for each resource. However, since this approach lets me avoid hardcoding URLs and making unnecessary network requests, it is currently my preferred option.
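For illustration, a refresh on a nested SPA path could then be handled with three direct fetches (the exact URL shapes are assumptions; ideally each level's URL would still come from a link in its parent):

```python
import re

import requests

API_ROOT = "https://example.com/api"

def load_from_path(path):
    # e.g. path == "/modules/123/categories/456/resources/789/"
    m = re.fullmatch(r"/modules/(\d+)/categories/(\d+)/resources/(\d+)/?", path)
    if m is None:
        raise ValueError("unrecognised path")
    module_id, category_id, resource_id = m.groups()
    # Three targeted requests instead of a full traversal.
    module = requests.get(f"{API_ROOT}/modules/{module_id}/").json()
    category = requests.get(
        f"{API_ROOT}/modules/{module_id}/categories/{category_id}/"
    ).json()
    resource = requests.get(
        f"{API_ROOT}/modules/{module_id}/categories/{category_id}"
        f"/resources/{resource_id}/"
    ).json()
    return module, category, resource
```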
Notes:
I control both the client application and the API, so I can make changes in either place.
I am open to redesigning the API if necessary
Any ideas for how to address this issue would be greatly appreciated.
Expanding on my comment in an answer.
I think this is a very common problem and one I've struggled with myself. I don't think Nicholas Shanks's answer truly solves it.
This section in particular I take some issue with:
The user reloading example.com/resources/1/ is simply re-affirming the current application state, and the client does not need to do any API traversal to get back here.
Your client application should know the current URL, but that URL is saved on the client machine (in RAM, or disk cache, or a history file, etc.)
The implication I take from this is that URLs in your application are only valid for the lifetime of the history file or disk cache, and cannot be shared with other users.
If that is good enough for your use case, then this is probably the simplest option, but I feel that there are a lot of cases where it is not true. The most obvious one is indeed the ability to share URLs from the frontend application.
To solve this, I would sum the issue up as:
You need to be able to statelessly map a URL from the frontend to an API URL.
The simplest, but incorrect, way might be to map an API URL such as:
http://api.example.org/resources/1
directly to a URL such as:
http://frontend.example.org/resources/1
The issue I have with this is that there's an implication that /resources/1 is taken from the frontend URL and just appended to the API URL. This is not something we're supposed to do, because it means we can't really evolve this API. If the server decides to link to a different server, for example, the URLs break.
Another option is that you generate URIs such as:
http://frontend.example.org/http://api.example.org/resources/1
http://frontend.example.org/?uri=http://api.example.org/resources/1
I personally don't think this is too crazy. It does mean that the frontend needs to be able to load that URI and figure out which 'view' to load for the backend URI.
A third possibility is to add another API that can:
Generate short strings that the frontend can use as unique ids (http://frontend.example.org/[short-string])
This API would return some document to the frontend that tells it which view to load and what the (last known) API URI was.
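A rough sketch of what such a lookup service could look like (everything here — route names, storage, payload shape — is made up for illustration):

```python
import secrets

from flask import Flask, abort, jsonify, request

app = Flask(__name__)

# short string -> {"view": ..., "api_uri": ...}; a real deployment would
# persist this in a database so the short URLs survive restarts.
SHORT_IDS = {}

@app.route("/short-ids", methods=["POST"])
def mint():
    # e.g. {"view": "resource", "api_uri": "http://api.example.org/resources/1"}
    doc = request.get_json()
    short = secrets.token_urlsafe(6)
    SHORT_IDS[short] = doc
    return jsonify({"short": short}), 201

@app.route("/short-ids/<short>")
def resolve(short):
    entry = SHORT_IDS.get(short)
    if entry is None:
        abort(404)
    # The frontend loads http://frontend.example.org/[short-string], calls
    # this endpoint, then renders entry["view"] against entry["api_uri"].
    return jsonify(entry)
```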
None of these ideas sound super great to me. I want a better solution to this problem, but these are things I came up with as I was contemplating this.
Super curious if there are better ideas out there!
The current URL that the user is viewing, and the steps it took to get to the current place, are both application state (in the HATEOAS sense).
The user reloading example.com/resources/1/ is simply re-affirming the current application state, and the client does not need to do any API traversal to get back here.
Your client application should know the current URL, but that URL is saved on the client machine (in RAM, or disk cache, or a history file, etc.)
The starting point of the API is (well, can be) compiled into your client. Compiled-in URLs are what couple the client to the server, not URLs that the user has visited during use of the client, including the current URL.
Your question, "For example, if the user requests a Resource with a particular ID", indicates that you have not grasped the decoupling that HATEOAS provides.
The user NEVER asks for a resource with such-and-such an ID. The user can click a link to get a query form, and then the server provides a form that generates requests to /collection/{id}. (In HTML, this is only possible for query strings, not path components, but other hypermedia formats don't have this limitation).
When the user submits the form with the ID number in the field, the client can build the request URL from the data supplied by the server+user.
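As a sketch, assuming the server sends something form-like (the exact format below is made up; real hypermedia formats such as HAL-Forms or Siren define their own):

```python
# Sent by the server; the client never hardcodes the /resources/{id} shape.
form = {
    "target": "https://example.com/api/resources/{id}",
    "fields": ["id"],
}

def build_request_url(form, user_input):
    # Fill each server-declared field with the user-supplied value.
    url = form["target"]
    for field in form["fields"]:
        url = url.replace("{" + field + "}", str(user_input[field]))
    return url

print(build_request_url(form, {"id": 789}))
# -> https://example.com/api/resources/789
```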
Using access logging, as detailed here: https://cloud.google.com/storage/docs/access-logs, we can download and analyze usage patterns for our data. However, looking at the actual data, I noticed that there is nothing relating an operation to the user (or service account) that performed it. The closest-seeming attribute in the usage logs, as described here: https://cloud.google.com/storage/docs/access-logs#format, is cs_user_agent. However, that attribute describes the tool performing the access (e.g., gsutil or gcloud) rather than the user.
Is it possible to obtain information that relates the activity to a user/service-account? Perhaps using the s_request_id attribute?
Is there a technical reason this is missing? Or is it intended as a privacy-preserving mechanism?
You can read about the access log schema in our docs. cs_user_agent tells you what kind of program made the request, while s_request_id is a unique id for the request. The closest thing to what you need would be c_ip, the IP address of the machine making the request; you could possibly tie those to users. If that doesn't suffice, you could have your clients make requests only through some code you yourself have written, which logs - in parallel with sending the request - the URL, method, any desired headers/metadata, and the user/account making the request.
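Something like this sketch, for example (the function name, logger setup, and example values are all yours to choose; this just shows logging the acting user alongside each request):

```python
import logging

import requests

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("gcs-access")

def gcs_request(method, url, user, headers=None, **kwargs):
    # Log who is acting, in parallel with sending the request; the bucket
    # access logs alone won't record the user/service account.
    log.info("user=%s method=%s url=%s headers=%s", user, method, url, headers)
    return requests.request(method, url, headers=headers, **kwargs)

# Usage (illustrative bucket/object names and token):
# gcs_request(
#     "GET",
#     "https://storage.googleapis.com/my-bucket/my-object",
#     user="alice@example.com",
#     headers={"Authorization": "Bearer <access-token>"},
# )
```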
I have tried raising this concern with Facebook Support/Bugs, but they said I should post implementation issues here. From what I have read everywhere, it seems to be quite an open issue; I am not sure if it will be solved or not.
So, here is what we are doing: we have two clients, Android and iOS.
The Android/iOS apps allow users to log in and generate a token based on the permission set we have, and we pass this token to the server for fetching further data as and when the client requires it. As our userbase increases, we are hitting "Application request limit reached" quite often.
We fetch photos of users and their friends using FQL. When fetching photos for around 8-10 different users in parallel, we sometimes reach the application request limit; it is quite random and we are not aware of the actual scenario in which it breaks or why. According to Facebook the limit is 1M calls per day, but we are only making around 80K-100K API calls a day (stretching a bit further as users increase), i.e. roughly 200 calls per user or fewer. We tried batch calls as well and still hit the application request limit.
If any of you could help us understand the complete concept of the API limit and how to handle it, we would really appreciate it. We want to understand how the API limit is decided and over what interval its rate is calculated, so that we can configure our side accordingly.
Earlier in the day, we ran into a strange API call issue. Our server started failing on API calls for user tokens we hold. We (on our systems, other than the server) tried fetching the data for those tokens (simple calls - /me or /me/home) and it worked fine for us but not for the server. We then set up another server and redirected the requests to it, and the new server worked well for the same set of users. We are not sure what went wrong in this case or how it breaks. Please help.
Many Thanks,
Reno Jones
Did you look at the Insights -> Developer section of developer.facebook.com for your app?
This will show you a breakdown per API call, including warnings and the calls that are currently being throttled and why.
Also, are you sure you're using User token authorization and not just your App token?
Beyond that, we use the information from Insights to find API calls to cache on our side rather than hitting Facebook every time. You will likely have to do something similar if you're not already. Facebook has limits for calling too often as well as for requesting too much data; for the latter, we had to reduce how much historical data we requested.
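A minimal sketch of the caching idea (the TTL and the URL-keyed cache are assumptions you'd tune per endpoint):

```python
import time

import requests

_cache = {}  # url -> (expiry timestamp, parsed response)

def cached_graph_get(url, ttl_seconds=300):
    # Serve recent responses from memory so repeated views of the same
    # photos don't count against the application request limit.
    now = time.time()
    hit = _cache.get(url)
    if hit is not None and hit[0] > now:
        return hit[1]  # still fresh: no Facebook call at all
    data = requests.get(url).json()
    _cache[url] = (now + ttl_seconds, data)
    return data
```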