Scraping facebook

I need to quickly get the names of about 1000 users for whom I currently have only the Facebook ID and access token. I'm not comfortable with the FB API yet, so I was considering just writing a scraper to retrieve the name from each user's FB page (since I have the users' IDs).
Is this allowed? I assume it's not "best practice", but how severe is it? Will it get me banned, for instance? The data will only be used to complete our user database, so no advertising.
Alternatively: can anyone point me to a good (and up-to-date) guide on how to get user info using the FB API (keep in mind that I have the ID and the access token of every user)?

No, scraping is not allowed and you MUST use the Graph API: https://www.facebook.com/apps/site_scraping_tos_terms.php
/me?fields=name&access_token=[user-access-token] returns the name of a User. You may run into API limits, but if it's a one-time thing that should not really matter. If you do hit a limit, just wait a bit and fetch the next batch.
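
As a rough illustration of the "wait a bit and get the next batch" approach, here is a minimal Python sketch. The function names, the unversioned https://graph.facebook.com/me endpoint, the 60-second pause and the choice to treat every HTTP error as a possible rate limit are all assumptions made for the example, not something the Graph API documentation prescribes.

```python
import json
import time
import urllib.error
import urllib.parse
import urllib.request

# Assumption: unversioned Graph API endpoint; a real app would likely pin a version.
GRAPH_ME_URL = "https://graph.facebook.com/me"


def build_name_url(access_token):
    """Build the /me?fields=name&access_token=... URL for one user token."""
    query = urllib.parse.urlencode({"fields": "name", "access_token": access_token})
    return f"{GRAPH_ME_URL}?{query}"


def http_get_json(url):
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def fetch_names(tokens_by_id, get_json=http_get_json, pause=60.0,
                max_retries=3, sleep=time.sleep):
    """Return {user_id: name or None} for a {user_id: access_token} mapping.

    Simplification: every HTTPError is treated as "maybe rate limited", so we
    pause and retry. An expired token fails the same way and ends up as None.
    """
    names = {}
    for user_id, token in tokens_by_id.items():
        names[user_id] = None
        for _ in range(max_retries):
            try:
                names[user_id] = get_json(build_name_url(token)).get("name")
                break
            except urllib.error.HTTPError:
                sleep(pause)  # wait a bit, then try this user again
    return names
```

Calling fetch_names({"1234": "<token>"}) would return a dict keyed by your own user IDs, which you can then write back into your database.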

Related

How does Facebook API Rate Limit work for App Access Token?

I'm working on a project where an app displays nearby events based on the user's personal preferences. We plan on getting the events from the Facebook Graph API using this approach. Due to Facebook's API changes, it is now much more complicated to search for events in a particular city. It therefore requires many more API calls than before, and I'm worried about the FB rate limit.
We want to get the information about events by calling the Graph API with our app access token from our server and then store the data temporarily in our own database. So every time a user searches for events in our app, the client gets the information from our database. Moreover, the user can (but doesn't have to) log in with their Facebook account to give us more information about them. We want to use the user's access token to call the API in order to get the user's likes.
I've read the FB documentation about the rate limits and some posts here on the site. Apparently FB calculates the number of calls based on the active users (200 calls per user every hour). It says that
"These limits apply to calls made using any access token other than a page access token"
ergo they also apply to the app access token. Additionally in the FB policy it says something about 100M calls per day.
So my questions are:
How does the rate limit work on a per user basis if I am using my App Access Token?
To what token does the "100M" number belong? Is it an overall number for all tokens used by the app?
A similar question was posted here some time ago but didn't receive any answers. I hope someone has new information since then. An answer to these questions is crucial to our project, so bear with me if you've read that question before.
Thanks in advance!

Please check this
Facebook Rate Limits
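
To make the per-user figure from the question concrete, here is a small back-of-the-envelope sketch in Python. It takes the "200 calls per user every hour" number quoted above at face value and assumes the budget simply scales with the number of active users. How Facebook actually counts active users for calls made with an app access token is exactly what the question asks, so treat both functions as hypothetical helpers, not as a description of the real limiter.

```python
# Figure quoted in the question; not verified against current documentation.
CALLS_PER_USER_PER_HOUR = 200


def hourly_call_budget(active_users, per_user=CALLS_PER_USER_PER_HOUR):
    """Total calls per hour if the limit scales linearly with active users."""
    return active_users * per_user


def min_seconds_between_calls(active_users, per_user=CALLS_PER_USER_PER_HOUR):
    """Smallest average gap between calls that stays inside the hourly budget."""
    budget = hourly_call_budget(active_users, per_user)
    if budget <= 0:
        raise ValueError("no call budget without active users")
    return 3600 / budget
```

Under these assumptions, 50 active users would give a budget of 10,000 calls per hour, i.e. one call roughly every 0.36 seconds on average from the server-side crawler.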

Facebook Page Crawler / Access tokens

I know very well that this topic has been discussed intensively (I've been reading about it all day).
Still, how probable is it that FB might allow me to create a frontend crawler for a non-commercial, non-public research university project?
My crawler should repeatedly lookup a very limited number of specific public fan pages and collect anonymized data like number of fans, status updates and their number of likes and number of comments each.
What I would like to show is which kinds of topics on media pages are "liked" and discussed most, and how that develops over time. I know about FB's restrictive TOS. Thanks for your opinion on that.
The second question concerns technological approach / authorization: Reading a fan page's number of fans, status updates and their number of likes each - could I even use the API/OpenGraph for such a crawler? I think for reading page walls, you need an access token at any cost, and realizing an automatic "crawler" via an application therefore is not possible I guess (as apps only react to users' actions and cannot act like cron jobs for example)?
As you see, I am pretty new to FB development and logic. Thanks so much for your expertise.

If you mainly target public pages then you should be ok.
You need to have a Facebook app, and then you can authenticate as the app from your program.
You will get an app token, which you should be able to use to crawl public page data.
If you check the documentation for the Page object, you'll see in the tables (fields and connections) that most entries in the Permissions column are either "No access token or user access_token" or "any valid access_token or user access_token". If you have the app token, you're good.
Also, and I think this is something you'll be interested in, the Page object has the "talking_about_count" field.
So, yes you can do it, at least most of it.
As for the TOS, since all of this is perfectly ok and straight from their official documentation, there's no problem.
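
For what it's worth, here is a minimal Python sketch of the idea above: authenticate as the app and request a couple of Page fields. As far as I know, an app access token can be written as the app ID and app secret joined by a pipe character, but check the current access token documentation before relying on that. The helper names, the unversioned endpoint and the default field list are assumptions for illustration, and which fields are readable with an app token depends on the API version.

```python
import urllib.parse

GRAPH_BASE = "https://graph.facebook.com"  # assumption: unversioned endpoint


def app_access_token(app_id, app_secret):
    """Join app ID and secret with a pipe (the form assumed in the lead-in)."""
    return f"{app_id}|{app_secret}"


def page_url(page_id, token, fields=("name", "talking_about_count")):
    """Build the request URL for a public Page with the given fields."""
    query = urllib.parse.urlencode(
        {"fields": ",".join(fields), "access_token": token}
    )
    return f"{GRAPH_BASE}/{urllib.parse.quote(page_id)}?{query}"
```

A cron job on your own server could then fetch page_url("<page-id>", app_access_token("<app-id>", "<app-secret>")) at whatever interval you need, with no user action involved. Keep the app secret on the server only.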

Facebook friend's friend count using graph api?

Is it OK for a web service that uses F-connect to store the friend count (NOT the friends list) from Facebook, or does this count as a breach of privacy?
The Facebook Graph API does not allow auth_tokens to be used to get such details about friends who are not registered with the web service.
I have seen people recommending storing the friend count and showing it to others.

I think I can help with this. I did some research on this topic.
Their policy states: "You may cache data you receive through use of the Facebook API in order to improve your application’s user experience, but you should try to keep the data up to date. This permission does not give you any rights to such data." The answer to the FAQ 'How long can I store data' is as follows: "We have deprecated our 24 hour data storage policy. You may now indefinitely cache data to improve your application's user experience."
For the record, I am not an attorney, so I'm looking at their statements from an IT perspective. As I read them (they basically say you may store anything received through the API), I would say it is OK to save the friend count.

Is it ok for a webservice which uses F-connect to store the friend count
We don't give out legal advice; consult an attorney. As always, free legal advice is worthless, so even if we did tell you whether it's legal, it wouldn't be trustworthy information.
I have seen people recommending storing the friend count and showing it to others.
Facebook says this is ONLY ok if it serves a purpose for your application. Again, consult an attorney before you make any hard decisions.

Facebook Graph API FQL Query Limit

Is there an official limit (or at least a guaranteed rate) for Graph API calls?
I am getting valid access_tokens for users and use them both on web server and client side scripts. Both calls use FQL queries, which are like below:
SELECT+page_id+FROM+page_fan+WHERE+uid=me()+and+page_id=...&access_token=...
SELECT+post_id+FROM+stream+WHERE+(privacy.value='EVERYONE'+OR+privacy.value='ALL_FRIENDS')+AND+attachment.description='...'+AND+attachment.name='...'+AND+actor_id=me()+AND+source_id=me()+AND+is_hidden=0&access_token=...
I plan to query once every minute for each access_token; some calls will be made from client IPs, some from the web server's IP. So what exactly do I need to watch out for?
And one additional question :) about the "me()" in those queries: does it matter whether I make the calls from the client or the server? E.g. if the client user logs in to FB as someone else outside my web page, does me() refer to the new login or to the login for which the access_token was generated?

When Facebook had app boxes and profile pages, My Countdown app updated the profile once per hour. At one point it had 400K users, thus was making 9.6 million (400K x 24) calls to Facebook per day.
I'm not sure if there is a limit, but the subscribe feature is supposed to eliminate the need to hit their API so often. It sounds like you are trying to check whether anything has changed. The subscribe API call essentially tells Facebook to let YOU know when something changes.
Really, your issue is going to be network bandwidth and CPU, not Facebook limits.

The me() refers to the user/page ID encoded in the access token. Lint the token at https://developers.facebook.com/tools/debug and see what id it is for.

Outside access to Facebook events

Let's say I own/control a Facebook page where events are posted. I'd like to display these events on another website (In my case, a WordPress blog, but that's not the important part) on an "Upcoming events" page.
What I'm unsure about is: Is the Facebook API usable "externally" like this? I've downloaded the PHP library and have a demo app running that works from within Facebook (i.e. emitting FBML that facebook.com interprets and displays to the logged-in user), but in my case I want a third party (my web server) to query Facebook every so often, rather than the site visitors directly requesting data (HTML/JSON/etc.) from Facebook itself.
Is this sort of thing possible with the Facebook API? How will my web server authenticate itself? What information do I have to store?
Note: I'm looking for information more at a "sequence diagram" conceptual level, not just asking for code. That part I can figure out myself. ;) Unfortunately, Google and the FB developer wiki have not been entirely forthcoming. What do I need to know so I can start coding?

This is a basic overview of how I've done it for a few of my clients who wanted similar functionality:
Create a pretty basic app that prompts for Extended permissions, specifically "offline_access" and whatever else you need
Store the resulting Session Key in your database with the UID
Create a secure, authenticated webservice for your app which allows you to get the info you need for a UID that you supply, using the session that you've stored in your database
On the website make requests to your app's webservice, being sure to cache the results for a certain period of time and only make a new request to your webservice once the cache has expired (I use 5-10 minutes for most of mine)
So basically your Facebook app acts sort of like a proxy between the website and the user, doing all of the authenticating and requesting using legitimate means.
I've used a webservice because I only wanted to maintain one Facebook app for multiple clients' needs. It works like this (in a not-very-awesome ASCII art diagram):
Facebook User 1 \                  / Client Website 1
Facebook User 2 --- Facebook App --- Client Website 2
Facebook User 3 /                  \ Client Website 3
Note: I've only done this for users, not pages, so your mileage may vary.
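
The caching step in the list above (only call the webservice again once the cache has expired) could look roughly like this in Python. This is a sketch under assumptions: TTLCache, the fetch callable and the 300-second default are hypothetical names and values chosen to match the "5-10 minutes" mentioned above, and the store is a plain in-memory dict rather than whatever the client sites actually use.

```python
import time


class TTLCache:
    """Cache fetch(key) results and refetch only after ttl_seconds have passed."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch      # callable that really hits the webservice
        self._ttl = ttl_seconds  # 300 s = 5 minutes, the low end of the range above
        self._clock = clock      # injectable so the expiry logic can be tested
        self._store = {}         # key -> (time stored, value)

    def get(self, key):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # still fresh: no request goes out
        value = self._fetch(key)
        self._store[key] = (now, value)
        return value
```

On the client website you would wrap the call to your app's webservice, e.g. events = TTLCache(fetch_events_for_uid).get(uid), where fetch_events_for_uid is whatever function performs the authenticated request.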

You can call Events.get with the Facebook API and supply the page/profile ID you'd like to get the events for. Depending on how your page is set up, you may have to authenticate; simply use your own Facebook account, since you should have access to all the events. Also, make sure you do plenty of caching so you're not hitting Facebook on every page load.

AFAIK, other than user info, you can't fetch any other data from Facebook.
But you can try it another way: say, create an app that stores events and other relevant information on a web server, and then your other website can easily access that info.
