When I run the following FQL query in two Graph API explorer windows at the same time, I get two different result sets:
SELECT
post_id,
actor_id,
target_id,
created_time,
type,
permalink,
message,
description,
attachment
FROM stream
WHERE filter_key IN (SELECT filter_key
FROM stream_filter
WHERE uid=me() AND type='newsfeed')
ORDER BY created_time ASC
LIMIT 50
Here are the differences:
The 1st result set has a few extra posts at the start, whereas the 2nd has a few extra posts at the end (I use ORDER BY created_time ASC in the FQL).
One of the posts in between has an empty permalink in the 1st result set, whereas the permalink is present in the 2nd result set.
A few posts are missing from the 1st result set but present in the 2nd, and vice versa.
Is this because of load balancing within the Facebook server farm?
How can one make sure they are getting all posts which can be seen on the user's newsfeed via the API?
Related
I'm trying to query Facebook for the last 50 photos posted by my friends.
Sounds easy enough, right? So far I've found two different approaches using FQL, and neither works.
Photo Method [Explorer]
SELECT pid, owner, src_big, caption, created
FROM photo WHERE owner IN (
SELECT uid1 FROM friend WHERE uid2=me()
) ORDER BY created DESC LIMIT 50
The problem with this is that it's doing a depth-first search when I really want a breadth first search. It's going through my friends list, finding the first friend with a bunch of photos, and then giving me back her most recent photos, ignoring more recent photos from friends further down on my friends list.
Stream Method [Explorer]
SELECT pid, owner, src_big, caption, created
FROM photo where pid in (
SELECT attachment.media.photo.pid
FROM stream
WHERE filter_key IN (
SELECT filter_key FROM stream_filter WHERE uid=me()
) AND type = 247
ORDER BY created_time DESC
LIMIT 50
)
This is much closer to what I want, except that it's giving me photos from pages, too, and I only want photos from friends. As far as I can tell, there is no way to filter uid by type (e.g. friend, page, etc.).
What am I missing? Is there a third way? Is this impossible?
I tried your query and got a timeout error (at the time of writing).
Your main filter should be filter_key with app_2305272732, so that only photos are returned in the first place, instead of filtering by type=247 afterward. That way you get more results:
SELECT pid, owner, src_big, caption, created FROM photo where pid IN(SELECT attachment.media.photo.pid FROM stream WHERE filter_key='app_2305272732' AND actor_id IN(SELECT uid1 FROM friend WHERE uid2=me()) AND created_time<=now() ORDER BY created_time DESC LIMIT 200)
As you can see, actor_id only includes friends. The response can be faster (~2 seconds for 2000 friends) if you put the friend IDs into actor_id directly, so you may want to cache the friend IDs (depending on your app flow).
actor_id IN(FRIEND_ID1,FRIEND_ID2,FRIEND_ID3...)
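A minimal Python sketch of the caching idea above: it renders the IN(...) clause from a locally cached list of friend IDs. The helper name friend_clause and the sample IDs are hypothetical, not part of any Facebook SDK.

```python
def friend_clause(friend_ids, column="actor_id"):
    """Render e.g. actor_id IN(1,2,3) from cached numeric friend IDs."""
    # int() rejects anything non-numeric, so no stray text ends up in the FQL.
    ids = ",".join(str(int(fid)) for fid in friend_ids)
    if not ids:
        raise ValueError("need at least one friend id")
    return f"{column} IN({ids})"


cached_friends = [1001, 1002, 1003]  # placeholder IDs, not real accounts
print(friend_clause(cached_friends))
```

The same helper can render a source_id clause by passing column="source_id".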
Also, there's no guarantee of getting 50 photos at once, but you can navigate to the next page by created_time (taken from the last photo's created field). Even though the created time in the photo table may differ slightly (~1 second) from the one in the stream table, that should be acceptable.
created_time<CREATED
Finally, you should increase the limit so you get more results at once. I found 200 acceptable, and I'm able to get more than 20 photos in one page.
LIMIT 200
Update:
You should use source_id instead of actor_id to get many more results. So the correct query is:
SELECT pid, owner, src_big, caption, created FROM photo where pid IN(SELECT attachment.media.photo.pid FROM stream WHERE filter_key='app_2305272732' AND source_id IN(SELECT uid1 FROM friend WHERE uid2=me()) AND created_time<=now() ORDER BY created_time DESC LIMIT 200)
The LIMIT can be decreased if you use this query, say to LIMIT 50, so that a single query is faster and avoids a timeout failure (if the photo data is too large and heavy); that is up to you.
I have to remind you that you can't simply use created from the photo table to query the next page of the stream table, because a feed post may contain many photos, and the created times of those photos can differ a lot (the difference can be hours!). So you should consider a multiquery that also retrieves created_time from the stream table if you want to fetch the next page, for example:
{"query1":"SELECT attachment.media.photo.pid, created_time FROM stream WHERE filter_key='app_2305272732' AND source_id IN(SELECT uid1 FROM friend WHERE uid2=me()) AND created_time<=now() ORDER BY created_time DESC LIMIT 200", "query2":"SELECT pid, owner, src_big, caption, created FROM photo where pid IN(SELECT attachment.media.photo.pid FROM #query1)"}
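If you assemble the multiquery in code, letting a JSON library do the quoting avoids hand-escaping mistakes. A sketch in Python, assuming the two queries above are used unchanged; the function name photo_multiquery is hypothetical.

```python
import json

FRIENDS = "SELECT uid1 FROM friend WHERE uid2=me()"


def photo_multiquery(limit=200):
    """Return the JSON body for the two-step stream -> photo multiquery."""
    query1 = (
        "SELECT attachment.media.photo.pid, created_time FROM stream "
        "WHERE filter_key='app_2305272732' "
        f"AND source_id IN({FRIENDS}) "
        "AND created_time<=now() "
        f"ORDER BY created_time DESC LIMIT {int(limit)}"
    )
    # query2 refers to the first result set by name via #query1.
    query2 = (
        "SELECT pid, owner, src_big, caption, created FROM photo "
        "WHERE pid IN(SELECT attachment.media.photo.pid FROM #query1)"
    )
    return json.dumps({"query1": query1, "query2": query2})


print(photo_multiquery(50))
```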
Also, please note that ORDER BY created_time DESC sorts the feed posts, not the photos, so it's normal to see the photos' created times out of order.
You may want to compare each photo's created time with the next page's created_time, and show only the photos that are newer than the next page's created_time. For example, if the next page's created_time is 5:00 PM and you have photo A with a created time of 2:00 PM (from the first page), hold photo A back until the next page's created_time is older than 2:00 PM; then you can display photo A to the user. Of course, you have to re-sort after you insert photo A into the current page's photos.
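The hold-back idea can be sketched in Python roughly as follows. This is only one reading of the description above: photos are (pid, created) tuples, next_cutoff is the created_time at which the next stream page starts, and the name release_ready is made up for illustration.

```python
def release_ready(held, page_photos, next_cutoff):
    """Split photos into (show_now, still_held).

    held, page_photos: lists of (pid, created) tuples.
    next_cutoff: created_time at which the next stream page starts.
    """
    pool = held + page_photos
    # Photos at or after the cutoff cannot be outranked by a later page,
    # so they are safe to display, newest first.
    show_now = sorted(
        (p for p in pool if p[1] >= next_cutoff),
        key=lambda p: p[1],
        reverse=True,
    )
    # Older photos wait until a later page's cutoff passes them.
    still_held = [p for p in pool if p[1] < next_cutoff]
    return show_now, still_held


# Hours stand in for timestamps: A was created at 14:00, B at 18:00.
show, held = release_ready([], [("A", 14), ("B", 18)], next_cutoff=17)
print(show, held)
```

On the next page the caller passes held back in, so photo A is released once the cutoff drops below 14.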
I'm having trouble retrieving a sequence of posts using FQL when sorting by *created_time*. It seems that when I try to retrieve posts ordered by *created_time*, FQL first retrieves the first 50 posts (LIMIT 50) ordered by *updated_time* and then applies ORDER BY to those 50 posts.
The *updated_time* value changes whenever someone comments on the post, so if someone comments on a post that's 1 year old, that post will be the last one in my first query's results, and my next page will start from that point in time (more than 1 year back).
This is a problem because to get the next 50 posts in the sequence, I have to use the *created_time* of the last post from the first query.
Any ideas on how to do this the correct way?
SELECT post_id,actor_id,message,created_time,updated_time FROM stream WHERE source_id=xxx AND created_time < xxxxxxxx ORDER BY created_time DESC LIMIT 50
Example:
There is a post with timestamps: 'updated_time': 1372837741 and 'created_time': 1372081023
With the following query, there are no results although the post's created_time is inside the specified created_time range.
SELECT post_id,actor_id,message,permalink,created_time,updated_time FROM stream WHERE source_id=xxxxxxxxx
AND created_time < 1372081033
AND created_time > 1372081013
But if I change the query's range to include the post's updated_time, the post above is returned.
SELECT post_id,actor_id,message,permalink,created_time,updated_time FROM stream WHERE source_id=xxxxxxxxxxx
AND created_time < 1372837742 AND created_time > 1372081013
This is probably a bug.
Thanks in advance!
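One possible workaround, based only on the behavior observed above and not on any documentation: widen the upper bound in the query so it covers the post's updated_time, then filter by created_time on the client. A Python sketch, where the function name and the second sample row are hypothetical:

```python
def within_created_range(posts, lower, upper):
    """Keep posts whose created_time lies strictly between lower and upper."""
    return [p for p in posts if lower < p["created_time"] < upper]


# Rows as they might come back from the wider query. The first row uses the
# timestamps from the example; the second row is invented for illustration.
rows = [
    {"post_id": "example", "created_time": 1372081023, "updated_time": 1372837741},
    {"post_id": "newer", "created_time": 1372500000, "updated_time": 1372500000},
]
print(within_created_range(rows, 1372081013, 1372081033))
```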
I am trying to retrieve the object_id of all pictures that my friends have on Facebook.
This is the method I use that I believe should work fine:
https://api.facebook.com/method/fql.query?access_token=[YOURTOKEN]&query=SELECT object_id FROM photo WHERE aid IN (SELECT aid FROM album WHERE owner IN (SELECT uid2 FROM friend WHERE uid1=me())) ORDER BY created DESC
My problem is that I only retrieve 5108 object_ids, which is nowhere close to the total number of pictures my friends have.
Is there a restriction from Facebook? Any suggestions appreciated.
You can add LIMIT and OFFSET to the end of your query. So to get the first 1000 photos, you would use LIMIT 1000 OFFSET 0, then for the next group LIMIT 1000 OFFSET 1000, and so on.
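A small Python sketch of generating the paging clauses, assuming zero-based offsets so that consecutive pages neither overlap nor skip a row. The helper name page_clauses is hypothetical.

```python
def page_clauses(page_size, pages):
    """Yield 'LIMIT n OFFSET m' clauses for consecutive, non-overlapping pages."""
    for page in range(pages):
        # Page k starts right after the k * page_size rows already fetched.
        yield f"LIMIT {page_size} OFFSET {page * page_size}"


for clause in page_clauses(1000, 3):
    print(clause)
```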
You are also using a legacy endpoint. You should be using the newer one:
https://graph.facebook.com/fql?q=[QUERY]&access_token=[TOKEN]
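Because the query contains spaces and parentheses, it should be URL-encoded before it goes into the q parameter. A minimal Python sketch using the standard library; the function name fql_url is made up, and the token value is a placeholder.

```python
from urllib.parse import urlencode


def fql_url(query, token):
    """Build the graph.facebook.com/fql URL with an encoded query string."""
    params = urlencode({"q": query, "access_token": token})
    return "https://graph.facebook.com/fql?" + params


print(fql_url("SELECT object_id FROM photo WHERE owner=me()", "TOKEN"))
```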
For some time now I've been getting inconsistent behavior when using multiple source_ids in an FQL query to the stream table. The query is e.g.:
SELECT source_id, post_id, created_time, message, permalink, type, attachment
FROM stream
WHERE (created_time >= 1338444667) and
((source_id = 74133697733 and actor_id = 74133697733) or
(source_id = 259126564951 and actor_id = 259126564951))
It seems that the timespan from which posts are returned is quite limited. But what's the cutoff value? Sometimes I'm not getting posts from 30 minutes ago, sometimes I'm getting multiple posts from between 2 and 3 hours ago, only to stop getting all of them on the next query. Is there a rule?
There's no mention of special treatment for multi-source_id queries in https://developers.facebook.com/docs/reference/fql/stream/.
I want to get the full history of my wall. But I seem to hit a limit somewhere back in June.
I do multiple calls like this:
SELECT created_time,message FROM stream WHERE source_id=MY_USER_ID LIMIT 50
SELECT created_time,message FROM stream WHERE source_id=MY_USER_ID LIMIT 51,100
and so on...
But I always end up on the same last (first) post on my wall.
Through facebook.com I can go back much further, so Facebook obviously has the data.
Why am I not getting older posts?
Is there another way to scrape my history?
From http://developers.facebook.com/docs/reference/fql/stream :
The stream table is limited to the last 30 days or 50 posts, whichever is greater
I am experiencing the same thing. I don't understand it at all, but it appears that the offset cannot be greater than the limit * 1.5
Theoretically, this means that always increasing the limit to match the offset would fix it, but I haven't been able to verify this (I'm not sure whether the problems I'm seeing are other bugs in my code or if there are other limitations I don't understand about getting the stream).
Can anyone explain what I'm seeing and whatever I'm missing?
You can reproduce my results by going to the FQL Test Console:
http://developers.facebook.com/docs/reference/rest/fql.query
pasting in this query:
SELECT post_id, created_time, message, likes, comments, attachment, permalink, source_id, actor_id
FROM stream
WHERE filter_key IN
(
SELECT filter_key
FROM stream_filter
WHERE uid=me() AND type='newsfeed'
)
AND is_hidden = 0 limit 100 offset 150
When you click "Test Method" you will see one of the 2 results I am getting:
The results come back: [{post_id:"926... (which I expected)
It returns empty [] (which I didn't expect)
You will likely need to experiment by changing the "offset" value until you find the exact place where it breaks. Just now I found it breaks for me at 155 and 156.
Try changing both the limit and the offset and you'll see that the empty results don't occur at a particular location in the stream. Here are some examples of results I've seen:
"...limit 50 offset 100" breaks, returning empty []
"...limit 100 offset 50" works, returning expected results
"...limit 50 offset 74" works
"...limit 50 offset 75" breaks
"...limit 20 offset 29" works
"...limit 20 offset 30" breaks
Besides seeing that the break happens around offset = limit * 1.5, I really don't understand what is going on here.
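To make the pattern concrete, here is a short Python check of the six limit/offset pairs listed above against the guess "results come back while offset < 1.5 * limit". This is only an empirical summary of those observations, not a documented rule; the function name is hypothetical.

```python
# (limit, offset, returned_results) for the six pairs reported above
OBSERVED = [
    (50, 100, False),
    (100, 50, True),
    (50, 74, True),
    (50, 75, False),
    (20, 29, True),
    (20, 30, False),
]


def expected_to_work(limit, offset):
    """Empirical guess only: results appear while offset < 1.5 * limit."""
    return offset < 1.5 * limit


mismatches = [(l, o) for l, o, ok in OBSERVED if expected_to_work(l, o) != ok]
print(mismatches)  # prints [] for the six pairs above
```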
Skip the FQL and go straight to graph. I tried FQL and it was buggy when it came to limits and getting specified date ranges. Here's the graph address. Put in your own page facebook_id and access_token:
https://graph.facebook.com/FACEBOOK_ID/posts?access_token=ACCESS_TOKEN
Then if you want to get your history set your date range using since, until and limit:
https://graph.facebook.com/FACEBOOK_ID/posts?access_token=ACCESS_TOKEN&since=START_DATE&until=END_DATE&limit=1000
Those start and end dates are in unix time, and I used limit because if I didn't it would only give me 25 at a time. Finally if you want insights for your posts, you'll have to go to each individual post and grab the insights for that post:
https://graph.facebook.com/POST_ID/insights?access_token=ACCESS_TOKEN
I don't know why, but when I use filter_key = 'others', LIMIT xx works.
Here is my fql query
SELECT message, attachment, message_tags FROM stream WHERE type = 'xx' AND source_id = xxxx AND is_hidden = 0 AND filter_key = 'others' LIMIT 5
and now I get exactly 5 posts... when I use LIMIT 7 I get 7, and so on.
As @Subcreation said, something is wack with FQL on stream with LIMIT and OFFSET, and higher LIMIT/OFFSET ratios seem to work better.
I have filed an issue about it with Facebook at http://developers.facebook.com/bugs/303076713093995. I suggest you subscribe to it and indicate that you can reproduce it, to get it bumped up in priority.
In the bug I describe how a simple stream FQL returns very inconsistent response counts based on its LIMIT/OFFSET. For example:
433 - LIMIT 500 OFFSET 0
333 - LIMIT 500 OFFSET 100
100 - LIMIT 100 OFFSET 0
0 - LIMIT 100 OFFSET 100
113 - LIMIT 200 OFFSET 100
193 - LIMIT 200 OFFSET 20
You get a maximum of 1000 likes when using LIMIT
FQL: SELECT user_id FROM like WHERE object_id=10151751324059927 LIMIT 20000000
You could specify created_time in your Facebook query.
The created_time field is a unix timestamp. You can convert dates with a converter such as http://www.onlineconversion.com/unix_time.htm, or do it programmatically, depending on your language.
Template based on your request
SELECT created_time,message FROM stream WHERE source_id=MY_USER_ID and created_time>BEGIN_OF_RANGE and created_time<END_OF_RANGE LIMIT 50
And a specific example, from 20.09.2012 to 20.09.2013:
SELECT created_time,message FROM stream WHERE source_id=MY_USER_ID and created_time>1348099200 and created_time<1379635200 LIMIT 50
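The conversion can also be done in code. A Python sketch that turns the two dates into unix timestamps (midnight UTC is an assumption) and builds the range query, with > on the lower bound and < on the upper bound. The function names are hypothetical.

```python
from datetime import datetime, timezone


def unix_utc(year, month, day):
    """Midnight UTC of the given date as a unix timestamp."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def range_query(source_id, begin, end, limit=50):
    """Build the stream query for begin < created_time < end."""
    if begin >= end:
        raise ValueError("begin must be earlier than end")
    return (
        "SELECT created_time,message FROM stream "
        f"WHERE source_id={source_id} "
        f"AND created_time>{begin} AND created_time<{end} "
        f"LIMIT {limit}"
    )


print(range_query("MY_USER_ID", unix_utc(2012, 9, 20), unix_utc(2013, 9, 20)))
```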
I have a similar issue trying to download older posts from a public page, adding a filter ' AND created_time < t', and setting t for each query to the minimum created_time I got so far. The weird thing is that for some values of t this returns an empty set, but if I manually set t back by one or two hours, then I start getting results again. I tried to debug this using the explorer and got to a point where a certain t would get me 0 results, and t-1 would get results, and repeating would give me the same behavior.
I think this may be a bug, because obviously if created_time < t-1 gives me results, then created_time < t should too.
If it was a question of rate limits or access rights, then I should get an error, instead I get an empty set and only for some values of t.
My suggestion for you is to filter on created_time, and change it manually when you stop getting results.
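The manual step can be automated along these lines. This Python sketch assumes a caller-supplied fetch(t) that runs the created_time < t query and returns posts as dicts; fake_fetch below is a stand-in with an artificial dead zone, and all names and numbers are made up for illustration.

```python
def walk_back(fetch, start, step=3600, max_empty=24):
    """Collect posts older than start, stepping the cursor back past empty gaps.

    fetch(t) must return only posts with created_time < t, otherwise the
    cursor would not move and the loop would not terminate.
    """
    cursor, empty, seen = start, 0, []
    while empty < max_empty:
        posts = fetch(cursor)
        if posts:
            seen.extend(posts)
            cursor = min(p["created_time"] for p in posts)
            empty = 0
        else:
            cursor -= step  # the manual "set t back" step, automated
            empty += 1
    return seen


# Stand-in for the real call: two posts per request, and nothing at all
# for 60 < t <= 90, mimicking the unexplained empty sets.
POSTS = [{"created_time": t} for t in (100, 90, 50, 40)]


def fake_fetch(t):
    if 60 < t <= 90:
        return []
    older = [p for p in POSTS if p["created_time"] < t]
    return sorted(older, key=lambda p: p["created_time"], reverse=True)[:2]


found = walk_back(fake_fetch, start=101, step=10, max_empty=5)
print([p["created_time"] for p in found])
```

The max_empty cap is what ends the walk, so it should be large enough to cross the widest gap you expect.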
Try it with a comma:
SELECT post_id, created_time, message, likes, comments, attachment, permalink, source_id, actor_id FROM stream WHERE filter_key IN (SELECT filter_key FROM stream_filter WHERE uid=me() AND type='newsfeed') AND is_hidden = 0 limit 11,5