I want to crawl twitter and facebook - facebook

I am building a crawler.
It needs to support the web, Facebook, and Twitter.
My mentor says it should fetch posts through the Facebook and Twitter APIs, but I don't know how.
I am using Solr as the search engine and plan to use Nutch for web crawling.
As far as I can see, Nutch does not support those APIs.
Could you recommend another crawler, a way to get posts with Nutch, or any other approach?
I would appreciate it very much!

What exactly do you want to crawl on Facebook/Twitter?
Only specific search engine bots are allowed to crawl Facebook.
See https://facebook.com/robots.txt
At the bottom, all bots except the ones listed are disallowed.
So to fetch data from Facebook (if that's what you need), use the API:
https://developers.facebook.com/
On Twitter, you can crawl a few URLs:
Allow: /?lang=
Allow: /hashtag/*?src=
Allow: /search?q=%23
Again, a better approach is to use the API if your aim is to fetch data:
https://dev.twitter.com/
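To check what a robots.txt file permits before crawling, something like the following Python sketch could help. It uses the standard library's `urllib.robotparser`. The robots.txt text here is a simplified, made-up sample in the "only listed bots are allowed" style, not a copy of Facebook's real file, and `MyCrawler` and the page URL are placeholder names.

```python
from urllib.robotparser import RobotFileParser

# Simplified, hypothetical robots.txt in the "allow listed bots only" style.
# It is NOT a copy of Facebook's real file.
SAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://www.facebook.com/some-page"  # hypothetical URL
    for agent in ("Googlebot", "MyCrawler"):
        print(agent, is_allowed(SAMPLE_ROBOTS_TXT, agent, page))
```

With this sample, the listed bot is allowed and the unlisted crawler is not, which is why the API is the safer route for a custom crawler.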

Related

About Google Plus API and Zend GData

I am currently building something like a Facebook activity feed. I want to support the Google Plus platform too, so I searched for a Google Plus API library in PHP and found two options: the Google Plus API and Zend GData.
Here is the odd part. The Google Plus API PHP library and Zend GData are two different libraries. The Google Plus API lets me post a comment as an activity, and Zend GData lets me post an image. In the Facebook activity feature, the user can post either a comment only, or both a comment and an image.
So I am thinking: if the user posts only a comment, I call the Google Plus API; if the user posts a comment and an image, I call Zend GData. That would make the code very confusing. Can I post an image and a comment as an activity with the Google Plus API? Or can I post only a comment using Zend GData?
I think you are mistaken. A Google+ API doesn't really exist yet (it currently supports only limited GET activity in Developer Preview), and the Zend Gdata component predates Google+ by many years.
I'm sure you can POST images to Picasa and comments to other Google services that may be reflected on Google+, but any library that actually POSTs to Google+ is likely a hack.
So to answer the question:
Can I post the image and comment as activity in google plus API?
The answer would be, NO.

Having issues sharing my website's posts to Facebook's Timeline

We want users to be able to post to their Facebook if they want to. We have it kind of working with the older Facebook profile but not with the new Timeline.
We want it to be like Tumblr where you can post/share to your Facebook account as much as you want.
Is there a limit on the number of posts per user per day, or across the entire API in general?
We are using FB connect already of course!
The documentation for Facebook's Open Graph is probably the best place to start. Take a look at their best practices page, as well:
If your application is making too many calls, the API server might rate limit you automatically, returning an "API_EC_TOO_MANY_CALLS" error.
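One common way to cope with this kind of rate limiting is to retry with exponential backoff. The Python sketch below is only an illustration: `TooManyCallsError` is a hypothetical exception that your own API wrapper would raise when it sees a rate-limit error, and the delays are arbitrary starting values, not limits documented by Facebook.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TooManyCallsError(Exception):
    """Hypothetical exception raised by your own API wrapper on a rate-limit error."""


def call_with_backoff(
    api_call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call api_call, retrying with exponential backoff when rate limited."""
    for attempt in range(max_attempts):
        try:
            return api_call()
        except TooManyCallsError:
            if attempt == max_attempts - 1:
                raise  # out of retries; let the caller decide what to do
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without actually waiting.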

How can I like a post/comment in facebook fan page from my custom website

I am importing the posts and comments from my FB fan page into my custom website using the Graph API. In the response array I get two types of action URLs, for "comments" and "likes".
See below:
http://www.facebook.com/149263441795729/posts/240758399312899
Using this link in the following code:
<fb:like href="http://www.facebook.com/149263441795729/posts/240758399312899" width="450" height="80"/>
I get the following error:
The page at http://www.facebook.com/149263441795729/posts/240758399312899 could not be reached.
How can I like these posts or comments from my website? Is there any solution for that?
I think--I'm no expert here--that redirects such as this are controlled by Facebook, with a cross-site scripting policy file on their servers that says whether or not they will allow redirects, and to whom. On my website, for example, I allow anybody to cross-link, since I'm just a little guy, but I bet Facebook only allows it with preferred partners like various corporations; see the story below. That would be my best guess.
Paul
http://thenextweb.com/facebook/2011/09/15/facebook-may-be-adding-cross-linking-to-foursquare-yelp-gowalla-and-more-on-pages/
Facebook may be adding cross-linking to Foursquare, Yelp, Gowalla and more on Pages
Facebook appears to have added cross-linking between Pages and other location-based sites like Foursquare, Yelp, Gowalla and SCVNGR to its Pages, reports Scribbal. Tech evangelist Robert Scoble posted a notice on his Google+ profile earlier today that indicated a new partnership between Foursquare and Facebook as the Page for a place was shown to direct viewers to the comparable location on Foursquare.
The links appear off to the left underneath the ‘Like’ count and checkin count on a location’s page. The only question that remains is whether the users are activating these connections themselves or if it is something that is done automatically. This could be Facebook’s plan to integrate itself with other location sites now that it has distributed the Facebook Places features throughout its framework.
Facebook has been on a tear lately, adding the Subscribe button and smart Friends list features just this week, as well as Facebook integration into the new Skype for Mac. It is clearly making an effort to maintain its lead over Google+ as the preeminent social network and it doesn’t want lack of features to be a reason for anyone to quit it.
We have reached out to Facebook about this new feature and will update this post when we hear back.

php - facebook quick login

I am trying to price up a quote, and one item on it is a Facebook 'quick' login for the members area of the site in question. I think it is like OpenID. Are there any tutorials that anyone knows of to accomplish this?
I guess there should be a fair number of tutorials out there:
How to Authenticate Users With Facebook Connect
The Facebook PHP-SDK has a nice example that could be used (with some work) to achieve what you want.
I've written a tutorial about the Registration Plugin (one of the flows) if you are interested.
There are many tutorials that will help with Login with Facebook, and with logging in through multiple sites.
You can use the PHP OAuth API class ("Authorize and access APIs using OAuth"), written by Manuel Lemos. It provides built-in support for the OAuth servers of Bitbucket, Box.net, Disqus, Dropbox, Eventful, Facebook, Fitbit, Flickr, Foursquare, github, Google, Instagram, LinkedIn, Microsoft, RightSignature, Salesforce, Scoop.it, StockTwits, SurveyMonkey, Tumblr, Twitter, XING and Yahoo.
Also:
1. http://www.9lessons.info/2011/02/login-with-facebook-and-twitter.html
2. http://www.a2zwebhelp.com/login-with-facebook
3. http://runnable.com/UfwdES1fQz9uAAAh/simple-facebook-connect-php-example

How much information can I access with facebook connect?

My google skills have failed.
I want to know how much user information I can access when users connect to my website through Facebook Connect. Comments, shared links, uploaded photos, data from other Facebook applications used by the user?
Have a look at the Graph API, or the old REST API, for an idea of what you can access via the Facebook APIs.
There's a good tutorial here: Facebook Connect Tutorial. I believe this article uses the REST API and the JavaScript SDK for single sign-on.
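To get a feel for what a Graph API request looks like, here is a minimal Python sketch that only builds the request URL. The `graph_url` helper and the `TOKEN` placeholder are made up for illustration. Which fields and connections you can actually read depends on the permissions the user granted and on the API version, so check the Graph API documentation before relying on any of them.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"


def graph_url(node: str, access_token: str,
              fields: Optional[Sequence[str]] = None) -> str:
    """Build a Graph API URL for a node such as "me" (hypothetical helper)."""
    params = {"access_token": access_token}
    if fields:
        # The Graph API takes requested fields as one comma-separated value.
        params["fields"] = ",".join(fields)
    return f"{GRAPH_BASE}/{node}?{urlencode(params)}"


if __name__ == "__main__":
    # "TOKEN" is a placeholder; a real token comes from the login flow.
    print(graph_url("me", "TOKEN", fields=["id", "name"]))
```

Fetching the resulting URL with any HTTP client should return JSON for the requested node, assuming the token is valid.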