How to scrape Facebook posts from a certain location? - facebook

We have spent several days looking into the FB Graph API and third-party tools for scraping FB data, but can't figure out whether it is even possible to scrape what we are looking for, or whether it falls within FB policies (we are really not looking forward to a lawsuit with FB).
We need statistics on how often a specific question (read: a problem that we will try to solve) is posted on Facebook. We need to get all FB posts filtered by three criteria:
Location - country or city of user that posted the post
Time - Some reasonable period of time, for example a full month, week or day
Keyword - keyword that can be associated with questions that we are looking for
We would then take this data set and go over it manually to distinguish what is relevant to us and what is not. Maybe we would use a language processing engine like wit.ai or api.ai, with the data set teaching an app to recognize which posts are relevant and which are not. But that's on us, later.
So the question: Is it possible (technically and also from an FB policy point of view), and what would be the steps, to get FB posts filtered by the three criteria stated above?

Related

Facebook PHP SDK limitations

I am developing a page on my site that pulls in analytics data from Facebook posts and pages, and have begun by trying out the PHP SDK for Facebook.
The documentation seems to have improved in the last few months since the last time I used it, but as there is so much going on, I find it difficult to find the exact answer to my questions sometimes.
I want to get the analytics data for a number of posts and pages which are not necessarily related to each other, and I also want to do this without logging in via Facebook. Is there some sort of simple API key I can use that skips the login stage? I want a number of different people to go to the site and see info from a number of different posts and pages.
Hope this makes sense.
You'll be able to get limited data from public pages and posts without logging in. This data includes the number of likes, public page information, comments and likes on posts, etc.
For example, if I look at the following Page https://graph.facebook.com/thetimberyard, I can see various stats like:
checkins
talking_about_count
were_here_count
likes
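As a minimal sketch of reading those fields, assuming the endpoint returns a JSON object whose keys match the names listed above. The helper names `page_url` and `pick_stats` are made up for illustration, and the sample payload is invented rather than real data for that Page. The network call itself is left out, since whether an unauthenticated request succeeds may depend on the API version.

```python
import json

GRAPH_ROOT = "https://graph.facebook.com"  # base URL taken from the example above
STAT_FIELDS = ("checkins", "talking_about_count", "were_here_count", "likes")


def page_url(page_name):
    """Build the public Graph URL for a Page name (hypothetical helper)."""
    return f"{GRAPH_ROOT}/{page_name}"


def pick_stats(raw_json):
    """Return only the stat fields that are present in a JSON response body."""
    data = json.loads(raw_json)
    return {name: data[name] for name in STAT_FIELDS if name in data}


# Invented sample payload, only to show the response shape being assumed.
sample = '{"name": "Example Page", "likes": 120, "checkins": 4, "talking_about_count": 9}'
print(page_url("thetimberyard"))
print(pick_stats(sample))
```

Fields missing from the response are simply skipped, so the same helper would work for Pages that do not expose every stat.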
However, you can get better insights if you ask the user to log in with the read_insights permission. This permission will give you access to full insights for pages you own. Extra insights include (but are not limited to):
Daily New Likes
Daily Unlikes
Daily Non/Logged-in Page Views
Weekly Non/Logged-in Page Views
Daily Internal Referrers
Daily External Referrers
Daily/Weekly/Monthly People Talking About This
Daily/Weekly/Monthly Reach Demographics, etc.

What are the meaning of some of the "deeper" insights metrics? Why is my Daily Organic Reach less than my Daily Reach of Page Posts?

I use FB Insights data every day when running analysis for my company. However, I have found some inconsistencies in the data and don't know whether they are caused by a misunderstanding of the meaning of the "deeper" metrics. I have searched everywhere and am hoping that someone can help me.
Key Metrics tab:
Why is my Daily Organic Reach less than my Daily Reach of Page Posts? Where does the difference in counts come from? What is not included in Daily Organic Reach that is included in Daily Reach of Page Posts?
Can you reach the same person Organically and Virally? Why does Organic Reach + Viral Reach equal to more than Total Reach (and Paid Reach is 0)?
‘Daily Likes Sources’ tab:
What are the full definitions of each of the sources: profile_connect, mobile, api, recommended_pages, page_suggestions, timeline, external_connect, page_profile, hovercard, search, ticker, like_story
Are mobile likes independent of the others?
Why would the Daily New Likes column in the Key Metrics tab not equal the sum of all the columns for the same day in the Daily Likes Sources tab?
‘Daily Viral Reach by Story Type’ tab:
What are the full definitions of each of the story types: fan, page post, user post, mention?
If we normally get 1-5 viral uniques from user post, and then one day get 1.5k, what is the likely source of this?
‘Daily Page Consumers by Consumption Type’ tab:
What are the full definitions of the consumption types: other click, link click
Are photo views and video views included in other clicks?
Well, I have only recently started using Facebook Insights, and this is what I can share with you.
Q. Why is my Daily Organic Reach less than my Daily Reach of Page Posts?
A. As far as I understand, these are two very different factors. Your organic reach is the number of people that find you through searches (Facebook, Google, Yahoo, Bing ...), while Daily Reach of Page Posts is the number of people visiting your page due to internal efforts, such as posting status updates.
Q. Can you reach the same person Organically and Virally?
A. Organic was explained previously, while Viral is an effect of other people sharing your posts, be it images, statuses or videos. In general, if you have interesting updates, your viral reach will be higher than your organic reach. So basically yes: this just means that you are reaching a target audience using two different methods of marketing. (If you are achieving this, then keep it up; the more platforms people see you on, the more comfortable they become with your brand.)
The rest of your paragraph is very difficult for me to follow; please could you rewrite it in an easier point-per-question format.
1.1 I have explained why the organic reach is less. If you want to increase your organic reach, you need to do some SEO (Search Engine Optimization); I really hope you know what it means.
I'm not quite sure what count delta is, but I am assuming that if it refers to organic vs posts then it would be the difference between them. I am not 100% sure of what factors are included in each type of reach (sorry, but I suggest you do some more research on that).
1.2 To reiterate, these are two different forms of online marketing, so simply yes, you can reach the same person by doing both forms of marketing. Paid reach is affected by many things. First, it depends on whether you are doing PPC (pay per click) or CPM (pay per impression). Here you will need to run multiple campaigns and set your CTR targets, then compare the results and choose which is better. You need to keep repeating this to continuously optimize your spending per person entering your page.
2.1 profile_connect - A like through a friend
mobile - A like from the mobile site (m.facebook.com)
api - This would be if you have the api on an external website of yours
recommended_pages - Liked by someone because a friend posted it to them either in chat or in a PM
page_suggestions - I think this is if another page has liked you and refers people to you, but not 100% sure
timeline - No clue, this is new to FB and I haven't had a chance to look at it yet ;)
external_connect - This would be from an external website, similar to the api, but rather just a url link as opposed to the api (can be found on forums...)
page_profile - ??
hovercard - ??
search - Facebook search
ticker - ??
like_story - Someone that came to your page because they saw a post of yours (true Viral)
2.2 I highly doubt that mobile likes are independent of the others; I just don't think that Facebook has set up proper tracking for their mobile site
2.3 I think that people removing their likes may cause this, but I have asked myself this question a number of times :/
3.1 OK, well, you need to understand that viral reach is when a friend of a fan sees the post.
fan - A friend of a fan acted on a post that they saw (e.g. John likes the top-racers fan page)
page post - you share your page, it ends up on a fan's wall, and one of their friends acts on it
user post - you post to your page ...
mention - A fan mentions you in one of their posts (the strongest viral)
3.2 Well this is a tough one because I don't know what you did, but I would suggest you list everything you do and when this happens have a look to see what you did different (sometimes viral really is viral)
4. I have never looked into these, sorry.
I hope this information helps you, but I really would suggest that you try and do some more research on the topic. Remember that Google tactics aren't necessarily going to work in Facebook, especially with paid advertising.
Good luck ;)
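On the question of why Organic Reach + Viral Reach can add up to more than Total Reach: here is a small illustration, assuming each reach metric counts unique people (that is my reading, not something confirmed in this thread). Under that assumption, anyone reached both ways is counted once in each column but only once in the total. The user ids are invented.

```python
# Invented user ids; each set holds the unique people reached one way.
organic = {"ann", "bob", "cara"}
viral = {"bob", "cara", "dev", "eli"}

total = organic | viral      # unique people reached by either route
overlap = organic & viral    # people reached both organically and virally

print(len(organic) + len(viral))  # 7: the two columns added together
print(len(total))                 # 5: unique people overall
print(len(overlap))               # 2: the gap between the two numbers
```

If the assumption holds, the gap between the summed columns and Total Reach would be the number of people reached both ways.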

Facebook Insight data not appearing for website

I have recently been looking into adding Facebook Insights into one of our client's websites (www.mcvuk.com). I've created an app to associate with this and added the necessary Facebook meta tags to my site which reference the app id.
I was, until today, having issues adding the app domain information to https://developers.facebook.com/apps but have added this information in today.
My question is how long does it take before you will start to see results filter through for the site and is there any way of checking that everything has been set up correctly?
It might not be a matter of time, it might be a matter of how many 'likes' the app or page has. At least for pages, it tells you "Once 30 people like your Page, you'll get access to insights about your activity."
That's an interesting point.
It all depends on which metrics (results) you're after and how much traffic your app gets.
Additionally, you might want to look at facebook documentation for the metrics (results) you're looking for -- some of them are available monthly or weekly, others are a lifetime aggregate, and some are daily.
The easiest way to test would be to ask some of your friends to do whatever it is you want to test (comment on a post, link to a page, etc.).
I hope that answers some of your questions.

Getting all likes on my domain (facebook)

I'm trying to get statistics for likes on my domain. I would like to get all likes (if possible with user ids) for all pages on my domain (which has tens of thousands of pages)
What does domain_like_adds actually return?
SELECT metric, value FROM insights
WHERE object_id=[domain-id] AND
metric='domain_like_adds' AND
end_time=end_time_date('2011-01-03')
AND period=period('month')
Returns blank, does anyone know what data domain_like_adds returns?
Regards,
Niklas
I don't think there's any way you're going to get user IDs as that is a major privacy invasion, but I believe domain_like_adds indicates how many NEW likes your domain got in the given time period, as opposed to the cumulative likes your domain has earned until that point. It doesn't appear there's a viable way to determine the # of likes of all objects in your domain for all time without tracking it from the beginning and/or going back and summing up historical data.
You can make a sitemap.xml of your site and crawl the urls against the Facebook Graph API. I actually made a Ruby script to do this: http://bobbelderbos.com/2012/01/ruby-script-facebook-like-stats-blog/. I don't think you can get the users that 'liked' your pages, but this script might be useful to find out what URLs are most popular.
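A rough Python sketch of the sitemap idea (not the linked Ruby script): it reads the `<loc>` entries from a sitemap and builds one Graph request URL per page. The `?id=<url>` query form and the helper names are assumptions for illustration, and the actual HTTP requests are left out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
GRAPH_ROOT = "https://graph.facebook.com/"  # assumed lookup endpoint


def sitemap_urls(sitemap_xml):
    """Extract every <loc> URL from a sitemap.xml document string."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]


def graph_lookup_url(page_url):
    """Build the Graph request for one page URL (the `id` parameter is an assumption)."""
    return GRAPH_ROOT + "?" + urlencode({"id": page_url})


# Tiny invented sitemap, standing in for the real sitemap.xml of a site.
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/</loc></url>
  <url><loc>http://example.com/post-1</loc></url>
</urlset>"""

for url in sitemap_urls(sample_sitemap):
    print(graph_lookup_url(url))
```

With tens of thousands of pages, the requests would need to be spread out over time to stay within whatever rate limit applies.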

Getting all facebook page wall posts and comments?

I am developing a social media monitoring application. Currently, we are entering Facebook page ids into the application to collect data from possible customers' Facebook walls (so we have a realistic sample for the customer for direct promotion).
These page ids are used to collect wall postings and comments and to compute statistics (e.g. to show most used words), and are presented to the user in a special view. Requirements are to collect all postings and comments without exception in near-live time. We currently have about 130 page ids in the system, with more to come.
Right now, I am using the Graph API for this, with several disadvantages:
FB API access is restricted to 600 requests per 10 minutes. To get a near-live view, I need to access the API at least every two hours. As we are using API requests in other parts of the program too, it is obvious that the limit is hit sooner or later (actually, this already happens)
The responses are mostly redundant: to receive current comments, I have to request the wall postings (comments are enclosed in postings) with the URL http://graph.facebook.com/NAME/feed...
The probability for hitting the limits is dependent on the number of postings on the several walls
I cannot get all comments with this method (e.g. comments on postings some time ago)
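One way to at least avoid hitting the limit blindly is to track request timestamps on the client side. Below is a minimal sliding-window sketch built around the 600 requests per 10 minutes figure quoted above. The class name and its methods are hypothetical, and it only throttles your own calls; it does not know how the server actually counts.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `window_seconds` span."""

    def __init__(self, max_calls=600, window_seconds=600, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock      # injectable so the logic can be tested
        self.calls = deque()    # timestamps of calls still inside the window

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window_seconds:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.window_seconds - (now - self.calls[0])

    def record(self):
        """Note that a request was just sent."""
        self.calls.append(self.clock())


# Demo with a fake clock and a tiny budget: 2 calls per 10 seconds.
fake_now = [0.0]
limiter = SlidingWindowLimiter(max_calls=2, window_seconds=10, clock=lambda: fake_now[0])
limiter.record()
fake_now[0] = 3.0
limiter.record()
print(limiter.wait_time())  # budget used up; wait until the first call ages out
fake_now[0] = 10.0
print(limiter.wait_time())  # the first call is now outside the window
```

In real use, every part of the program that calls the API would have to share one limiter instance, since they all draw on the same budget.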
I am currently trying out how to switch to (or to complement Graph API usage) using FQL by querying the stream and the comment tables but this also has limitations:
I cannot restrict my query to a specific timespan, leading to redundancy again
The max number of posts I am getting for each one of my 130 page ids is 61 - (why 61?)
I need an unpredictable number of additional requests because I need to get special objects like videos and links in separate requests.
My question now is - if anyone is doing similar things: How did you solve these problems? How do you get a pseudo-live-stream of a larger number (up to, say 1,000) of walls?
Letting the customer grant extra permissions to us is currently not an option.
You will probably have to meet with Facebook and work out a contractual deal for greater access to their data. I would bet that the answer will be no, no and no, seeing as it appears you are trying to monetize their data and, furthermore, to do so without the explicit permission of the users. But hey, give it a shot.
I have a similar task. By default FB returns only the last ~50 posts or all posts from the last 30 days (whichever is smaller); in FQL you should use a created_time filter to receive more results. My current problem is that via FQL I receive no more than ~500 posts from any FB page wall, even when LIMIT is increased:
'select post_id from stream where source_id = 40796308305 and created_time <'.time().' LIMIT 1000000 ;'
This FQL request to the CocaCola FB Page currently returns only ~300 posts (less than 2 days of posts).
If you find a better solution, please advise :)
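One thing that might be worth trying is splitting the created_time range into smaller windows and sending one query per window, so that each request asks for fewer posts. The sketch below only builds the query strings. The function name and the window size are made up, and whether each window actually comes back complete is not something this thread establishes.

```python
def windowed_queries(source_id, newest, oldest, step_seconds):
    """Yield FQL strings covering [oldest, newest) in windows, newest first."""
    upper = newest
    while upper > oldest:
        lower = max(oldest, upper - step_seconds)
        yield (
            "SELECT post_id FROM stream "
            f"WHERE source_id = {source_id} "
            f"AND created_time >= {lower} AND created_time < {upper}"
        )
        upper = lower


DAY = 24 * 60 * 60
END = 1325635200  # arbitrary Unix timestamp, used only for the demo
# Three one-day windows for the page id used in the query above.
for query in windowed_queries(40796308305, newest=END, oldest=END - 3 * DAY, step_seconds=DAY):
    print(query)
```

Because each window is half-open (`>=` on the lower bound, `<` on the upper), a post that falls exactly on a boundary is requested only once.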