How to pull Facebook comments and feeds to hadoop? - facebook

I am working on a case study where I have to get comments, activities, etc. from my Facebook page and aggregate them in Hadoop for textual analysis using MapReduce.
What is the right way to pull Facebook fan page feeds? Is there an API for it that can be consumed by Hadoop components like Flume or Scribe?

Yes. The JSON API is called the Graph API. I recommend you start with the tutorial.
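As a rough illustration of what pulling a page feed might look like, here is a minimal Python sketch. It only builds the request URL and reshapes a response body into one JSON record per line, which is a convenient shape for landing in HDFS. The base URL without a version, the `limit` parameter, and the helper names (`feed_url`, `to_json_lines`, `next_page`) are assumptions for illustration, not something the answer above specifies. The HTTP fetch and the hand-off to Flume or HDFS are left out.

```python
import json
from urllib.parse import urlencode

# Assumed base URL; no API version is pinned in this sketch.
GRAPH_BASE = "https://graph.facebook.com"


def feed_url(page_id, access_token, limit=25):
    """Build the URL for the first page of a page's feed (hypothetical helper)."""
    query = urlencode({"access_token": access_token, "limit": limit})
    return f"{GRAPH_BASE}/{page_id}/feed?{query}"


def to_json_lines(response_body):
    """Turn one response body into a list of single-line JSON records."""
    payload = json.loads(response_body)
    return [json.dumps(item, sort_keys=True) for item in payload.get("data", [])]


def next_page(response_body):
    """Return the URL of the next page, or None when the response has none."""
    return json.loads(response_body).get("paging", {}).get("next")
```

A fetch loop would call `feed_url` once, then keep following `next_page` until it returns None, appending the output of `to_json_lines` to a file that is later copied into HDFS.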

Related

Streaming facebook data into Hadoop HDFS

I want to analyse social media data with Hadoop. I used Flume to stream Twitter data into Hadoop using a custom source, but for Facebook I didn't find anything for streaming, though I can use the API to download the Facebook data.
Is it possible using any tool other than Flume? Can anyone suggest something?
There are no public streaming endpoints in the Graph API, unlike with Twitter. The only thing I imagine you could use is Realtime Updates, but for that you'd need either a Facebook app or a Facebook Page that you could subscribe to.
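For context, a Realtime Updates callback has to answer a verification request before it receives any updates. The sketch below shows only that handshake as a pure function, assuming the request arrives with `hub.mode`, `hub.verify_token`, and `hub.challenge` query parameters. The function name and the idea of passing the parameters as a plain dict are illustrative choices, and the web server around it is omitted.

```python
import hmac


def verify_subscription(params, expected_token):
    """Return the challenge to echo back, or None if the request is rejected.

    `params` is a dict of query-string parameters from the verification GET.
    `expected_token` is the verify token chosen when the subscription was set up.
    """
    if params.get("hub.mode") != "subscribe":
        return None
    supplied = params.get("hub.verify_token", "")
    # Constant-time comparison on bytes, so non-ASCII tokens do not raise.
    if not hmac.compare_digest(supplied.encode("utf-8"), expected_token.encode("utf-8")):
        return None
    return params.get("hub.challenge")
```

A handler would respond with the returned challenge as the body when it is not None, and with an error status otherwise.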

Embed Facebook Stream Filtered By Hashtags

I want to embed a stream of Facebook posts on a web page, made up of posts from all around Facebook that contain a specific hashtag, but I'm not quite sure how to go about it. I had to do the same with Twitter and Instagram, and those were fairly easy to accomplish. I'm looking for the best option right now, not so much the specifics. I've seen the Graph API for Facebook suggested a couple of times, but those answers always seem to be from a year ago, so I'm not sure whether it's out of date or whether there is a better option by now. Any recommendations on ways to go about it would be greatly appreciated.
See my answer here on how you can use the Search API for hashtags:
Need help on employing Graph Search parameters for hashtag query on facebook
Basically, you can call
https://graph.facebook.com/search?q=%23selfie&type=post&access_token={user_access_token}
Be aware that you have to use v1.0 of the Graph API, because in v2.0 searching for public posts will no longer be possible (https://developers.facebook.com/docs/apps/upgrading#upgrading_v2_0_graph_api):
Public post search is no longer available.
(/search?type=post&q=foobar)
Graph API v1.0 will only be available until 30th of April, 2015.
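To make the URL encoding in the call above explicit, here is a small Python sketch that builds the same search URL from a hashtag. The helper name `hashtag_search_url` is hypothetical. It only constructs the URL, and the v1.0 limitation described above still applies.

```python
from urllib.parse import urlencode


def hashtag_search_url(tag, access_token):
    """Build the public post search URL for a hashtag (hypothetical helper)."""
    # Accept "selfie" or "#selfie"; urlencode turns the leading '#' into %23.
    query = urlencode({
        "q": "#" + tag.lstrip("#"),
        "type": "post",
        "access_token": access_token,
    })
    return "https://graph.facebook.com/search?" + query
```

Calling `hashtag_search_url("selfie", token)` yields the same URL as the one quoted above, with the token substituted.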

News Feed via Graph API 'outdated'

On the Graph API documentation, the following is in brackets next to the News Feed endpoint:
this is an outdated view, does not reflect the News Feed on facebook.com
This is a fairly critical method in any app using the Graph API, so what are we supposed to use? Is there a way to obtain a more accurate version of the News Feed with a different API?
I've noticed some differences between what is shown on the website and what is shown through the API, but I assumed most of it was down to individual user permissions. Either way, this issue is non-trivial and is starting to make me regret choosing the Graph API over, say, FQL.
You can use FQL to fetch the news feed. See the documentation under https://developers.facebook.com/docs/reference/fql/stream/
But it seems that the results don't differ from the results via the Graph API.

list all facebook friend requests

Is there a way to list all friendship requests using the Facebook Graph API? I didn't find anything in the docs, but given my experience with the quality of these docs so far, I wouldn't be surprised if there was a way to do it.
I think it's not possible using only the Graph API. You will need to use an SDK and then make an FQL query.
Here you go:
http://developers.facebook.com/docs/reference/fql/friend_request/
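As a hedged sketch of what such an FQL query could look like, the Python below builds a request URL for the `friend_request` table. The `/fql?q=` endpoint, the `uid_from` and `uid_to` column names, and the `me()` function are assumptions based on the FQL reference linked above, so check them there before relying on this. The helper name is hypothetical and the sketch does not send the request.

```python
from urllib.parse import urlencode

# Assumed query: senders of pending friend requests addressed to the current user.
FRIEND_REQUEST_QUERY = "SELECT uid_from FROM friend_request WHERE uid_to = me()"


def fql_url(query, access_token):
    """Build a URL that runs an FQL query (endpoint path is an assumption)."""
    return "https://graph.facebook.com/fql?" + urlencode({
        "q": query,
        "access_token": access_token,
    })
```

`fql_url(FRIEND_REQUEST_QUERY, token)` gives the URL to request with a user access token that carries the permission the table requires.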

How do I publish an activity on facebook using the Graph API

The old Facebook Legacy REST API had a function dashboard.publishActivity, however the new Graph API only allows messages to be posted on /me/feed.
Is there a way to send activities using the Graph API?
As mentioned in http://developers.facebook.com/blog/post/552/ Facebook "have removed the section which displayed the News that developers published via the Dashboard APIs in the Games Dashboard". Therefore the dashboard.publishActivity function no longer provides any useful functionality. My recommendation is to switch to either using stream posts or to Requests 2.0, as these will provide the same sort of distribution that you're looking for.