I want to get the full history of my wall. But I seem to hit a limit somewhere back in June.
I do multiple calls like this:
SELECT created_time,message FROM stream WHERE source_id=MY_USER_ID LIMIT 50
SELECT created_time,message FROM stream WHERE source_id=MY_USER_ID LIMIT 51,100
and so on...
But I always end up on the same last (first) post on my wall.
Through facebook.com I can go back much further, so Facebook obviously has the data.
Why am I not getting older posts?
Is there another way to scrape my history?
From http://developers.facebook.com/docs/reference/fql/stream :
The stream table is limited to the last 30 days or 50 posts, whichever is greater
I am experiencing the same thing. I don't understand it at all, but it appears that the offset cannot be greater than the limit * 1.5
Theoretically, this means that always increasing the limit to match the offset would fix it, but I haven't been able to verify this (I'm not sure whether the problems I'm seeing are other bugs in my code or if there are other limitations I don't understand about getting the stream).
Can anyone explain what I'm seeing and whatever I'm missing?
You can reproduce my results by going to the FQL Test Console:
http://developers.facebook.com/docs/reference/rest/fql.query
pasting in this query:
SELECT post_id, created_time, message, likes, comments, attachment, permalink, source_id, actor_id
FROM stream
WHERE filter_key IN
(
SELECT filter_key
FROM stream_filter
WHERE uid=me() AND type='newsfeed'
)
AND is_hidden = 0 limit 100 offset 150
When you click "Test Method" you will see one of the two results I am getting:
The results come back: [{post_id:"926... (which I expected)
It returns empty [] (which I didn't expect)
You will likely need to experiment by changing the "offset" value until you find the exact place where it breaks. Just now I found it breaks for me at 155 and 156.
Try changing both the limit and the offset and you'll see that the empty results don't occur at a particular location in the stream. Here are some examples of results I've seen:
"...limit 50 offset 100" breaks, returning empty []
"...limit 100 offset 50" works, returning expected results
"...limit 50 offset 74" works
"...limit 50 offset 75" breaks
"...limit 20 offset 29" works
"...limit 20 offset 30" breaks
Besides seeing the limit=offset*1.5 relationship, I really don't understand what is going on here.
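For what it's worth, the data points above roughly fit the rule "the query starts returning [] once the offset reaches about 1.5 times the limit". A tiny pure-logic sketch of that empirical guess (the 1.5 factor is only inferred from the observations above, and it doesn't fit every one of them, e.g. limit 100 breaking at offset 155 rather than 150, so treat it strictly as a rule of thumb):

```python
def likely_breaks(limit, offset, ratio=1.5):
    """Empirical guess from the observations above: a stream query
    starts returning an empty result once offset >= limit * ratio.
    Not an official rule; just a fit to the data points seen."""
    return offset >= limit * ratio
```

If the rule holds, always growing the limit so that it stays at least offset / 1.5 should keep queries returning data, which matches the theory above.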
Skip FQL and go straight to the Graph API. I tried FQL and it was buggy when it came to limits and getting specific date ranges. Here's the Graph URL; put in your own page's facebook_id and access_token:
https://graph.facebook.com/FACEBOOK_ID/posts?access_token=ACCESS_TOKEN
Then if you want to get your history set your date range using since, until and limit:
https://graph.facebook.com/FACEBOOK_ID/posts?access_token=ACCESS_TOKEN&since=START_DATE&until=END_DATE&limit=1000
Those start and end dates are in unix time, and I used limit because if I didn't it would only give me 25 at a time. Finally if you want insights for your posts, you'll have to go to each individual post and grab the insights for that post:
https://graph.facebook.com/POST_ID/insights?access_token=ACCESS_TOKEN
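The above can be sketched in Python using only the standard library. The endpoint and parameter names (`since`, `until`, `limit`) are taken from the URLs above; the function names are mine, and `fetch_posts` needs a valid access token to actually return anything:

```python
import json
import urllib.parse
import urllib.request

GRAPH = "https://graph.facebook.com"

def posts_url(facebook_id, access_token, since, until, limit=1000):
    """Build the /posts URL with a unix-time range, as described above.
    limit is raised from the default because the API otherwise
    returns only 25 posts per call."""
    params = urllib.parse.urlencode({
        "access_token": access_token,
        "since": since,   # unix timestamp, start of range
        "until": until,   # unix timestamp, end of range
        "limit": limit,
    })
    return "%s/%s/posts?%s" % (GRAPH, facebook_id, params)

def fetch_posts(facebook_id, access_token, since, until):
    """Fetch one page of posts (network call; needs a real token)."""
    url = posts_url(facebook_id, access_token, since, until)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["data"]
```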
I don't know why, but when I use filter_key = 'others' the LIMIT xx works.
Here is my fql query
SELECT message, attachment, message_tags FROM stream WHERE type = 'xx' AND source_id = xxxx AND is_hidden = 0 AND filter_key = 'others' LIMIT 5
and now I get exactly 5 posts... when I use LIMIT 7 I get 7, and so on.
As @Subcreation said, something is wacky with FQL on stream with LIMIT and OFFSET, and higher LIMIT/OFFSET ratios seem to work better.
I have filed a bug with Facebook at http://developers.facebook.com/bugs/303076713093995. I suggest you subscribe to it and indicate that you can reproduce it, to get it bumped up in priority.
In the bug I describe how a simple stream FQL query returns very inconsistent response counts depending on its LIMIT/OFFSET. For example:
433 - LIMIT 500 OFFSET 0
333 - LIMIT 500 OFFSET 100
100 - LIMIT 100 OFFSET 0
0 - LIMIT 100 OFFSET 100
113 - LIMIT 200 OFFSET 100
193 - LIMIT 200 OFFSET 20
You get a maximum of 1000 likes when using LIMIT:
FQL: SELECT user_id FROM like WHERE object_id=10151751324059927 LIMIT 20000000
You could specify created_time in your Facebook query.
The created_time field is a unix timestamp. You can convert dates with a converter such as http://www.onlineconversion.com/unix_time.htm, or programmatically, depending on your language.
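Instead of an online converter, most languages can do the conversion directly; for example, in Python:

```python
from datetime import datetime, timezone

def to_unix(year, month, day):
    """Midnight UTC on the given date as a unix timestamp."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())

to_unix(2012, 9, 20)  # 1348099200
to_unix(2013, 9, 20)  # 1379635200
```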
Template based on your request
SELECT created_time,message FROM stream WHERE source_id=MY_USER_ID AND created_time>BEGIN_OF_RANGE AND created_time<END_OF_RANGE LIMIT 50
And a specific example, from 20.09.2012 to 20.09.2013:
SELECT created_time,message FROM stream WHERE source_id=MY_USER_ID AND created_time>1348099200 AND created_time<1379635200 LIMIT 50
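Paging purely on created_time, rather than OFFSET, also sidesteps the limit/offset weirdness described above. A minimal sketch; `run_fql` is a hypothetical stand-in for whatever function you use to execute an FQL query and return a list of row dicts:

```python
def fetch_wall_history(run_fql, source_id, page_size=50):
    """Walk backwards through the wall by repeatedly lowering the
    created_time upper bound, instead of using OFFSET."""
    posts = []
    upper = 2 ** 31  # start far in the future
    while True:
        rows = run_fql(
            "SELECT created_time, message FROM stream "
            "WHERE source_id=%s AND created_time<%d "
            "ORDER BY created_time DESC LIMIT %d"
            % (source_id, upper, page_size))
        if not rows:
            break
        posts.extend(rows)
        # Strict < means posts sharing the oldest timestamp on a page
        # boundary can be skipped; acceptable for a sketch.
        upper = min(row["created_time"] for row in rows)
    return posts
```

Each iteration the upper bound strictly decreases, so the loop terminates once no older posts are returned.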
I have a similar issue trying to download older posts from a public page: I add a filter 'AND created_time < t', setting t on each query to the minimum created_time I've received so far. The weird thing is that for some values of t this returns an empty set, but if I manually set t back by one or two hours, I start getting results again. I tried to debug this in the explorer and got to a point where a certain t would get me 0 results while t-1 would get results, and repeating gave the same behavior.
I think this may be a bug, because obviously if created_time < t-1 gives results, then created_time < t should too.
If it were a question of rate limits or access rights, I should get an error; instead I get an empty set, and only for some values of t.
My suggestion for you is to filter on created_time, and change it manually when you stop getting results.
Try it with a comma:
SELECT post_id, created_time, message, likes, comments, attachment, permalink, source_id, actor_id FROM stream WHERE filter_key IN (SELECT filter_key FROM stream_filter WHERE uid=me() AND type='newsfeed') AND is_hidden = 0 limit 11,5
Related
I'm having some trouble retrieving a sequence of posts using FQL when sorting by *created_time*. It seems that when I try to retrieve posts ordered by *created_time*, FQL first retrieves the first 50 posts (LIMIT 50) ordered by *updated_time* and then applies ORDER BY to those first 50 posts.
The *updated_time* datetime gets updated whenever someone comments on that post, so if someone comments on a post that's 1 year old, the last post from my first query will be that post and my next sequence will start from that point in time (older than 1 year).
This causes me a problem because to get the next 50 posts in the sequence, I have to use the *created_time* datetime from the last post from the first query.
Any ideas on how to do this the correct way?
SELECT post_id,actor_id,message,created_time,updated_time FROM stream WHERE source_id=xxx AND created_time < xxxxxxxx ORDER BY created_time DESC LIMIT 50
Example:
There is a post with timestamps: 'updated_time': 1372837741 and 'created_time': 1372081023
With the following query, there are no results although the post's created_time is inside the specified created_time range.
SELECT post_id,actor_id,message,permalink,created_time,updated_time FROM stream WHERE source_id=xxxxxxxxx
AND created_time < 1372081033
AND created_time > 1372081013
But if I change the query's range to include the post's updated_time, the post above is returned.
SELECT post_id,actor_id,message,permalink,created_time,updated_time FROM stream WHERE source_id=xxxxxxxxxxx
AND created_time < 1372837742 AND created_time > 1372081013
This is probably a bug.
Thanks in advance!
You can try this using the Graph API Explorer:
SELECT post_id, id, fromid, time, text, user_likes, likes FROM comment WHERE post_id ='126757470715601_530905090300835' AND time < '1366318653' ORDER BY time DESC LIMIT 30
This will return an empty result set. BUT if I remove DESC from the query, it will return 30 results.
SELECT post_id, id, fromid, time, text, user_likes, likes FROM comment WHERE post_id ='126757470715601_530905090300835' AND time < '1366318653' ORDER BY time LIMIT 30
So adding DESC to the ORDER BY somehow changed the way LIMIT behaves. Can anyone shed some light on this?
Update:
The time is within range; sorry for my earlier miscalculation. What you can do is increase the limit to a much larger value, such as LIMIT 150, because there may be many comments with is_privacy='0' among the first 30 items.
Is there a limit to the maximum number of results (considering selecting only a field from a table - ex: uid from users) one can get with a single FQL query?
Ex: select uid from users where the condition produces a 1M-sized result set -> how many of those 1M would be returned to the caller?
According to a blog post by Facebook on this issue, the limit stands at 5000 results before the visibility check kicks in, reducing the result set even further.
For some time I've been getting inconsistent behavior when using multiple source_ids in an FQL query against the stream table. The query is e.g.:
SELECT source_id, post_id, created_time, message, permalink, type, attachment
FROM stream
WHERE (created_time >= 1338444667) and
((source_id = 74133697733 and actor_id = 74133697733) or
(source_id = 259126564951 and actor_id = 259126564951))
It seems that the timespan from which posts are returned is quite limited. But what's the cutoff value? Sometimes I'm not getting posts from 30 minutes ago, sometimes I'm getting multiple posts from between 2 and 3 hours ago, only to stop getting all of them on the next query. Is there a rule?
There's no mention of special multi-source_id queries treatment in https://developers.facebook.com/docs/reference/fql/stream/.
I need to monitor the number of Facebook group members and display it on my website. I know that it is possible to get user IDs using their API, but the results are limited to 500 (if the total number of members is 500+).
What would be the easiest way to get total number of members that signed up to a Facebook Group that I'd set up? Is this at all possible?
If you write an HTTP bot, it shouldn't be very hard to scrape, given that real-time performance is not key.
You can do it with a FQL query like this:
SELECT uid FROM group_member WHERE gid = <group_id> limit 500
SELECT uid FROM group_member WHERE gid = <group_id> limit 500 offset 500
SELECT uid FROM group_member WHERE gid = <group_id> limit 500 offset 1000
...
Get the number of members
Do it inside a loop (until you get 0 results) and you'll get the total number of group members
perPage = 500
memberCount = 0
for count in range(100):
    # gid must be the group id (not a user id); fql() is your FQL wrapper
    res = fql('SELECT uid FROM group_member WHERE gid = %s LIMIT %d OFFSET %d' % (groupId, perPage, perPage * count))
    if len(res) == 0:
        break
    memberCount += len(res)
Get the members detail
You can even join with the user FQL table to have all the user detail:
SELECT uid, name, pic_square FROM user WHERE uid IN (
SELECT uid FROM group_member WHERE gid = <group_id> limit 500 offset %d )
According to the documentation for Groups.getMembers it isn't possible to get > 500 group members with an API call. Worse, you seem to only be able to get 500 random members.
You may want to consider using Facebook Connect with your site instead. I'm no expert on Connect, but I believe you won't have this problem using it, since you are actually writing Facebook-specific code; it seems there would be no purpose in limiting results there. That'd be the direction I'd look, at least.
Good luck.