Limit Number of Posts coming from /feed Facebook Graph API

When I use /{page_id}/feed?access_token=xxxx, this gives me all the posts on the page, both by users and by the page. I want to limit and control which posts are returned, with constraints like:
Timestamp (that is, to get posts after a particular timestamp)
Post id (to get posts after a particular post)
Fetching all the posts from the feed is unnecessary and inefficient. Is there any way to accomplish this?

You can use
GET /{page_id}/feed?limit={nr_of_posts_to_return}&since={timestamp}
to be able to limit the number of results and specify the starting timestamp. Have a look at the reference here:
https://developers.facebook.com/docs/graph-api/using-graph-api/v2.0#paging
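As a minimal sketch of the request above, the URL can be assembled in Python. The helper name feed_url and the sample values are made up for illustration; only the limit and since parameters come from the answer.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"

def feed_url(page_id, access_token, limit=None, since=None):
    """Build a /{page_id}/feed URL with optional limit and since parameters."""
    params = {"access_token": access_token}
    if limit is not None:
        params["limit"] = limit      # number of posts to return
    if since is not None:
        params["since"] = since      # Unix timestamp: only posts after this time
    return f"{GRAPH_BASE}/{page_id}/feed?{urlencode(params)}"

# Placeholder page ID, token and timestamp.
print(feed_url("293088074081904", "xxxx", limit=10, since=1400000000))
```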
For your second use case you'd need the Batch API, in my opinion, because a single Graph API request can't filter on a specific post. Instead, use the Batch API to split this into two queries, as described here:
https://developers.facebook.com/docs/graph-api/making-multiple-requests/#operations
The request would then look like this:
curl \
-F 'access_token={your_access_token}' \
-F 'batch=[{ "method":"GET","name":"get-post","relative_url":"{your_post_id}?fields=created_time"},{"method":"GET","relative_url":"{your_page_id}/feed?since={result=get-post:$.created_time}&limit={nr_of_posts_to_return}"}]' \
https://graph.facebook.com/
In Graph Explorer, you have to change the HTTP method to POST, then add a new field called batch. Leave the URL blank. Paste this as the batch value:
[{ "method":"GET","name":"get-post","relative_url":"293088074081904_400071946716849?fields=created_time"},{"method":"GET","relative_url":"293088074081904/feed?since={result=get-post:$.created_time}&limit=1"}]
This works at least for me.
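If you build the batch value in code rather than by hand, a sketch like the following produces the same JSON. The function name build_batch is hypothetical; the two operations and the {result=get-post:$.created_time} reference are taken from the request above. The resulting string would be sent as the batch form field, together with access_token, in a POST to https://graph.facebook.com/.

```python
import json

def build_batch(post_id, page_id, limit):
    """Return the JSON string for the 'batch' form field.

    The first operation looks up the post's created_time; the second
    uses that value as the 'since' parameter of the feed request.
    """
    batch = [
        {
            "method": "GET",
            "name": "get-post",
            "relative_url": f"{post_id}?fields=created_time",
        },
        {
            "method": "GET",
            # Doubled braces emit the literal {result=...} reference.
            "relative_url": (
                f"{page_id}/feed"
                f"?since={{result=get-post:$.created_time}}&limit={limit}"
            ),
        },
    ]
    return json.dumps(batch)

print(build_batch("293088074081904_400071946716849", "293088074081904", 1))
```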

For others looking for a solution: it appears that 'since' is ignored at the 'comment' and 'reply' levels, which means this is not a solution for me.
The query Tobi provided returns all the posts after the first 'since', but also every comment and reply in those posts, regardless of what you set their 'since' to.
Further to this, if you wish to search for new comments regardless of the age of the post, this fails as well. For example, if you remove the first 'since', change to limit=1000, and request only comments as a field using 'since', it returns the last 1000 posts and all comments for all of those 1000.
That said, thank you Tobi for your time and for showing me how to get everything I need in a single function call. I may experiment with parsing the complete record set every time (maybe too much traffic, though).
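Parsing the complete record set could look roughly like the sketch below, which filters comments on the client instead of relying on 'since'. It assumes each post is a dict with an "id" and an optional "comments": {"data": [...]} entry, and that created_time looks like 2014-05-20T10:00:00+0000; check both against the responses you actually receive.

```python
from datetime import datetime, timezone

def parse_time(value):
    """Parse a timestamp such as '2014-05-20T10:00:00+0000' (assumed format)."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S%z")

def comments_since(posts, since):
    """Yield (post_id, comment) for every comment created after `since`."""
    for post in posts:
        # Posts without comments simply contribute nothing.
        for comment in post.get("comments", {}).get("data", []):
            if parse_time(comment["created_time"]) > since:
                yield post["id"], comment

# Made-up sample data in the assumed shape.
sample_posts = [
    {"id": "p1", "comments": {"data": [
        {"id": "c1", "created_time": "2014-05-01T00:00:00+0000"},
        {"id": "c2", "created_time": "2014-06-01T00:00:00+0000"},
    ]}},
    {"id": "p2"},
]
cutoff = datetime(2014, 5, 15, tzinfo=timezone.utc)
for post_id, comment in comments_since(sample_posts, cutoff):
    print(post_id, comment["id"])
```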

Related

Determine the proper REST API method

I have the following functionalities in my API:
Getting a user by their name
Getting a user by their ID
Getting a user, or if it doesn't exist create one
Getting multiple users by their ID
Currently I'm handling the two first functionalities with a GET request, and the third with a POST request. I could use a GET request for getting multiple users, but sending potentially hundreds of IDs through a query parameter seems like the wrong approach, as I would get a very long URL. I could also use a POST request to send the long list of IDs through its body, but I doubt a POST request is meant for this purpose.
What method would be appropriate to use for the last one?
I see 2 possible ways here:
Get all users and do your filtering post response.
Get a range of IDs, which reduces to only 2 parameters: the lower and upper limits of the interval (if this satisfies your needs).
Behaving the way you described while also avoiding long URLs will not work.

How can I list all users of the same location using the Github API?

This is my first time using the GitHub API, so sorry if this is a stupid question. I did a short search for location:Germany and got 39,063 users. I want to create a list of all 39,063 usernames and tried this command:
curl -i https://api.github.com/search/users?q=location%3AGermany | grep login
However, this returns only 30 hits. Could anyone give me some advice, or point me to the right resources?
You will have to make additional requests for other pages:
Pagination
Requests that return multiple items will be paginated to 30 items by default. You can specify further pages with the ?page parameter. For some resources, you can also set a custom page size up to 100 with the ?per_page parameter. Note that for technical reasons not all endpoints respect the ?per_page parameter, see events for example.
$ curl 'https://api.github.com/user/repos?page=2&per_page=100'
Note that page numbering is 1-based and that omitting the ?page parameter will return the first page.
For more information on pagination, check out our guide on Traversing with Pagination.
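Applied to the search in the question, requesting successive pages might be sketched as follows. The function names, the max_pages cap and the injectable fetch argument are illustrative choices, and the code assumes the search response carries its results in an "items" list whose entries have a "login" field.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def fetch_json(url):
    """Fetch a URL and decode the JSON body (performs a real network call)."""
    with urlopen(url) as resp:
        return json.load(resp)

def search_logins(query, per_page=100, max_pages=10, fetch=fetch_json):
    """Collect user logins page by page until a page comes back empty."""
    logins = []
    for page in range(1, max_pages + 1):  # page numbering is 1-based
        params = urlencode({"q": query, "page": page, "per_page": per_page})
        result = fetch(f"https://api.github.com/search/users?{params}")
        items = result.get("items", [])
        if not items:
            break
        logins.extend(item["login"] for item in items)
    return logins
```

Calling search_logins("location:Germany") would issue the real requests; unauthenticated requests are rate limited, so a long crawl may need pauses between pages.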

HasMany RESTful (or anti-RESTful) Design?

So I've been reading a lot on RESTful design, specifically dealing with resources.
Taking the canonical example of Users, Posts, and Comments, with relationships as:
Users ---(hasMany)---> Post ---(hasMany)---> Comment
One may initially think to expose something like:
GET /users GET /posts GET /comments
POST /users POST /posts POST /comments
GET /users/id GET /posts/id GET /comments/id
PUT /users/id PUT /posts/id PUT /comments/id
DELETE /users/id DELETE /posts/id DELETE /comments/id
But then, say I want all Comments of a certain Post made by a particular User. I'd need to do something like:
GET /users/id
> someUser
> var postIds = someUser.posts()
GET /posts?id=<postIds[0]>&id=<postIds[1]>&...
> somePosts
> **application user inspects posts to see which one they care about**
> var postOfInterest = somePosts[x];
> var postId = postOfInterest.id;
GET /comments?id=postId
> someComments (finally)
Suppose, though, that I only care about a Post or Comment in the context of its owner. Suppose a different resource structuring which may (or may not?) be more natural:
GET /users
POST /users
GET /users/id
PUT /users/id
DELETE /users/id
GET /users/id/posts
POST /users/id/posts
GET /users/id/posts/id
PUT /users/id/posts/id
DELETE /users/id/posts/id
GET /users/id/posts/id/comments
POST /users/id/posts/id/comments
GET /users/id/posts/id/comments/id
PUT /users/id/posts/id/comments/id
DELETE /users/id/posts/id/comments/id
Which to me, is probably a better representation of what the resources are. Then all I need is:
GET /users/id/posts
> somePosts
> **application user inspects posts to see which one they care about**
> var postOfInterest = somePosts[x];
> var postId = postOfInterest.id;
GET /users/id/posts/postId/comments
> someComments
This just seems more like navigating a file system than the previous method, but I don't know if it's RESTful at all (perhaps this is what REST was trying to get rid of), because in order to access a Comments resource I need to know which User and which Post it belongs to. But the former requires 3 requests, while the latter requires just 2.
Thoughts?
Quite a bit of what counts as good REST is opinion, but I would say your second approach is generally more "RESTful".
Basically, you do want hierarchy in a REST API, with filesystem-like navigation instead of query parameters. This is especially so if you follow a HATEOAS-style API, since someone can then navigate your API by following links.
In your second example it's important to have both GET /users/id and GET /users/id/posts, so that a request for the user's info doesn't also include all of its posts (or their IDs); the second request returns the posts. Users often have thousands of posts in a forum.
The disadvantage is that the API user always has to know the author of the post whose comments it wants. They'd essentially make a "give me that user and give me their posts" request to your server, which means your server queries for that user and then selects their posts. Instead, it's much more convenient for both your user and your server to have separate requests: "give me that user", "give me that post" and "give me that comment". This means you have to store users, posts and comments separately, and for each post/comment store the ID of its author so that you can select posts/comments by author ("give me posts by this user", or simply "give me this post").
I would personally go with this variant of requests
GET user
GET post
GET comment
...
For every request I'd implement a where clause which will give the user of my api more options to make a specific selection. For example GET posts where userId='myID'. It can be implemented with url query parameters like http://myapi.mydomain.com/post?userId=user1 or inside the header. It will return a list of posts for that user. You can also have where clause for the post's ID - http://myapi.mydomain.com/post?id=123 which will return only this post. Note that for the first case - when you fetch a list of posts - you can only return some kind of short info for the posts (like id, author, summary...) and require an additional request to post?id=id for the full post content.
Having this implementation would give you at least two advantages:
the user of the api needs to know only one id to get some info - postID to get a post's content/comments, userId to get all posts/comments for that user
the selection is done on the server, so less data is transferred over the network, meaning faster responses (and potentially lower costs for end users if they are on a mobile plan or something)
In my opinion this implementation gives you loosely coupled objects (user, post, comment) and more flexible queries.
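To illustrate the where-clause idea, here is a minimal in-memory sketch. The select helper and the sample records are hypothetical; a real server would translate the query parameters into a database query instead.

```python
def select(records, **where):
    """Return the records whose fields equal every given where-condition."""
    return [
        record for record in records
        if all(record.get(field) == value for field, value in where.items())
    ]

# Made-up sample data: each post stores the ID of its author.
posts = [
    {"id": 123, "userId": "user1", "summary": "First post"},
    {"id": 124, "userId": "user2", "summary": "Hello"},
    {"id": 125, "userId": "user1", "summary": "Second post"},
]

# Roughly what /post?userId=user1 and /post?id=123 would select.
print(select(posts, userId="user1"))
print(select(posts, id=123))
```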

Posting IDs to REST API

I am designing a REST API for inserting a record to the "solutions" table. A "solution" has a solverID, problemID. I have two different designs in mind:
POST /solutions
and passing the solverID and problemID in JSON with the content of the solution. Or putting the solverID and problemID in the URI:
POST /users/:solver_id/problems/:problem_id/solutions
Which design is better?
It's a good practice to define your resources in a consistent hierarchy, so that they are easily understandable and predictable.
Let's say this is the URL to retrieve a problem:
GET /users/{solverId}/problems/{problemId}
It clearly conveys that the problem belongs to the {solverId}.
The following URL clearly shows that we are retrieving all solutions for a problem solved by {solverId}:
GET /users/{solverId}/problems/{problemId}/solutions
To create a new solution for the {problemId}, you would do a post on
POST /users/{solverId}/problems/{problemId}/solutions
To retrieve a particular solution you would do a get on
GET /users/{solverId}/problems/{problemId}/solutions/{solutionId}
When to use IDs in the path vs. the query?
If an ID is definitely required to identify a resource, use it in the path. In the above scenario, since all three IDs are required to uniquely identify a solution, all of them should be in the path.
Let's say you want to retrieve a solution that was given in a particular date range, you would use the following
GET /users/{solverId}/problems/{problemId}/solutions?startDate={}&endDate={}
Here startDate and endDate cannot uniquely identify a resource, they are just parameters that are being used to filter the results.
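The path-versus-query rule can be sketched with a small URL builder: identifying IDs go into the path and optional filters into the query string. The function name solutions_url is made up for this example.

```python
from urllib.parse import urlencode

def solutions_url(solver_id, problem_id, solution_id=None, **filters):
    """Put identifying IDs in the path and optional filters in the query."""
    path = f"/users/{solver_id}/problems/{problem_id}/solutions"
    if solution_id is not None:
        path += f"/{solution_id}"        # identifies one specific solution
    if filters:
        path += f"?{urlencode(filters)}"  # filters only narrow the result set
    return path

print(solutions_url(555, 987))
print(solutions_url(555, 987, solution_id=42))
print(solutions_url(555, 987, startDate="2020-01-01", endDate="2020-01-31"))
```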
Go with the first one. I would keep your URLs as clean and simple as you can. Here are some other examples off the top of my head. I'm not sure of your entire structure.
POST /solutions
GET /solutions?solverid=123 //query solutions by user
GET /users/555/problems // problems for a given user
GET /users/555/solutions // solutions for a given user
GET /problems/987/solutions // solutions for a given problem
I came up with a scheme: include the user ID in the route only when authentication is not needed for the route; otherwise, the user ID can be figured out from the authentication information, and the above route becomes:
POST /problems/:problem_id/solutions

The limit of Facebook's graph api "limit" parameter

I'm fetching a large amount of comments from a public page using Facebook's Graph API.
By default Facebook returns 25 comments per response and uses paging. This requires multiple requests, which is unnecessary since I know ahead of time that there will be a lot of comments.
I read about the "limit" parameter that you can pass to ask for a certain amount of items per response.
I was wondering, what is the limit of that parameter? I'm assuming I can't pass &limit=10000.
There's a different way for fetching comments:
https://graph.facebook.com/<PAGE_ID>_<POST_ID>/comments?limit=500
The maximum value for the limit parameter is 500.
Yes, with the limit parameter you can specify how many items of a resource you want in one call. The default limit is 25.
For example, if you want 100 comments in one call for a post with ID POST_ID, you can query like this:
https://graph.facebook.com/POST_ID?fields=comments.limit(100)
I think they have changed this. For /feed? I only get 200-225 posts back but for comments I get as many as 2000 back
Old question, but this is in the current Facebook documentation in case anyone finds this question via search (emphasis mine):
Some edges may also have a maximum on the limit value for performance reasons. In all cases, the API returns the correct pagination links.
In other words, even if you specify a limit above what's allowed by the endpoint, the "paging.previous" and "paging.next" elements will always provide the correct URL to resume where it left off.
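Following those links until they run out might be sketched like this. It assumes each response is a JSON object with the items under "data" and the next-page URL under "paging" and then "next"; the function names and the injectable fetch argument are illustrative.

```python
import json
from urllib.request import urlopen

def fetch_json(url):
    """Fetch a URL and decode the JSON body (performs a real network call)."""
    with urlopen(url) as resp:
        return json.load(resp)

def iter_items(first_url, fetch=fetch_json):
    """Yield every item, following the 'next' link until none is returned."""
    url = first_url
    while url:
        page = fetch(url)
        yield from page.get("data", [])
        # The server decides the real page size; we just follow its link.
        url = page.get("paging", {}).get("next")
```

With this approach the requested limit only affects how many round trips are needed, not whether all items are eventually seen.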
I would recommend you to use FQL instead.
FQL provide a more flexible approach where you can combine data types (posts, users, pages, etc..) as you please. You can also query for comments belonging to a list of stories instead of just one limiting your number of requests even more.
There are a couple of drawbacks though:
1. There is a limit of 5000 comments. Here you would use a query looking something like: "SELECT id, ...... FROM comments, ... WHERE parent_id in (1,2,3....) ORDER BY time LIMIT 0, 5000". Even if you split this up into several queries with "LIMIT 0, 1000", "LIMIT 1000, 1000", "LIMIT 2000, 1000", etc., you would never get anything over 5000 comments ("LIMIT 5000, 1000" would return empty).
2. Every real request made on Facebook's servers counts as one request. If you send something that is actually a combination of requests, it will be counted as multiple requests.
3. Facebook does not like overly heavy requests. You can end up getting blocked for shorter time periods (minutes to hours, not days). If this happens, act on it.