Facebook debugger: Clear whole site cache

I am aware that Facebook caches the Like data for specific pages on your site once they're visited for the first time, and that entering the URL into the debugger page clears the cache. However, we've now improved our Facebook descriptions, images, etc. and we need to flush the cache for the entire site (about 300 pages).
Is there a simple way to do this, or if we need to write a routine to correct them one by one, what would be the best way to achieve this?

"Is there a simple way to do this,"
Not as simple as a button that clears the cache for a whole domain, no.
"or if we need to write a routine to correct them one by one, what would be the best way to achieve this?"
You can get an Open Graph URL re-scraped by making a POST request to:
https://graph.facebook.com/?id=<URL>&scrape=true&access_token=<app_access_token>
So you’ll have to do that in a loop for your 300 objects. But don’t do it too fast, otherwise you might hit your app rate limit. Try to leave a few seconds between the requests; according to a recent discussion in the FB developers group, that should work fine. (And don’t forget to URL-encode the <URL> value properly before inserting it into the API request URL.)
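For anyone who prefers Python, here is a minimal sketch of the loop described above. The endpoint and the id, scrape and access_token parameters come from the answer; the function names, the 5-second default delay and the token placeholder are assumptions made for illustration, and I have not tested it against the current Graph API.

```python
import time
import urllib.parse
import urllib.request

GRAPH_ENDPOINT = "https://graph.facebook.com/"  # endpoint quoted in the answer above


def build_scrape_request(page_url, access_token):
    """Build the POST request that asks Facebook to re-scrape one URL."""
    # urlencode() percent-encodes the page URL so its own ?, & and # survive.
    query = urllib.parse.urlencode(
        {"id": page_url, "scrape": "true", "access_token": access_token}
    )
    return urllib.request.Request(GRAPH_ENDPOINT + "?" + query, data=b"", method="POST")


def rescrape_all(page_urls, access_token, delay_seconds=5.0):
    """Re-scrape each URL in turn, pausing between calls (delay is an assumption)."""
    responses = []
    for page_url in page_urls:
        request = build_scrape_request(page_url, access_token)
        with urllib.request.urlopen(request) as response:
            responses.append(response.read().decode("utf-8"))
        time.sleep(delay_seconds)  # leave a few seconds between requests
    return responses
```

You would call rescrape_all(list_of_urls, "YOUR_APP_ACCESS_TOKEN") and keep the returned responses for debugging.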

The simple solution in WordPress: go to the permalink settings and switch to a custom permalink structure. In my case I just added an underscore, like this:
/_%postname%/
Facebook then has no info on the (now) new URLs, so it scrapes them all fresh.
I was looking for this same answer and all the answers were super complicated for me as a non-coder.
It turned out there is a very simple answer and I came up with it all by myself :) .
I have a WordPress website where, using a variety of plugins, I bulk uploaded over 4,000 images, which created 4,000 posts.
The problem was that I uploaded them and set up the Facebook share plugins before sorting out the og: meta tag issue, so all 4,000 posts were scraped by FB with no og: meta tags, and adding the tags afterwards made no difference. The FB debugger was not practical, as I had over 4k posts.
I must admit I'm a bit excited. For many years I have got helpful answers from Google searches sending me to this forum. Often the suggestions I found were well over my head, as I'm not a coder, I'm a "copy paster".
I'm so happy to be able to give back to this great forum and help someone else out :)

Well, I had the same scenario and used this hack, and it works. But as @Cbroe mentioned in his answer, the API call is rate limited, so you should take care with it; in my case I only had 100 URLs to re-scrape.
So here is the solution:
$xml = file_get_contents('http://example.com/post-sitemap.xml'); // My WordPress site exposes a sitemap
$xml = simplexml_load_string($xml); // Parse it as XML
$applicationAccessToken = 'YourToken'; // Application access token; you can get it from https://developers.facebook.com/tools/explorer/
$urls = [];
foreach ($xml->url as $url) {
    $urls[] = (string) $url->loc; // Collect the URLs from the sitemap as plain strings
}
$file = fopen("response.data", "a+"); // Log the API responses to a file so we can debug later
$context = stream_context_create(['http' => ['method' => 'POST']]); // The scrape call has to be a POST
foreach ($urls as $url) {
    echo "Sending URL for scrape: $url\n";
    $data = file_get_contents('https://graph.facebook.com/?id=' . urlencode($url) . '&scrape=true&access_token=' . $applicationAccessToken, false, $context);
    fwrite($file, $data . "\n"); // Append the response to the log file
    sleep(5); // Sleep for 5 seconds to stay under the rate limit
}
fclose($file); // Close the file once all the URLs have been submitted
echo "Bingo, it's completed!";


Facebook, post to wall, using me/photos, in 2.1 Facebook?

Considering a Unity project from ~3 years ago, and using Facebook graph I'm pretty sure it was 1.0,
You could post to a user's wall like this:
private byte[] imageAsBytes;
Texture2D im = ... // your image
imageAsBytes = im.EncodeToPNG();
Dictionary<string, object> dct = new Dictionary<string, object>
{
    { "message", "Marketing message here" },
    { "picture", imageAsBytes }
};
Facebook.instance.graphRequest(
    "me/photos", HTTPVerb.POST, dct, completionHandler );
As has been made known for many months now, there is a change to this coming.
With Facebook 2.1 being required as of this Aug 8, I'm rather confused about whether this still works in 2.1.
In short, how do you post an image to the user's wall in 2.1?
Note - here's where to find the important resource CBRoe mentions below...
Note that the only problem with the alternative, FB.FeedShare() is that, as far as I understand, you can not actually post an image (sure, you can link to an image at a URL).
This isn’t deprecated in any way. But since API v2.0 you need to get the necessary permission reviewed and approved by Facebook, before you can ask normal users for it.
And yes, this is a rather major change, but that's why it was announced way ahead of time, via a lot of channels. We all know how fast the IT world moves and changes, so I think you cannot put the blame on Facebook here. If you were "out of the game" (this particular one) for over three years, you just have to go and find the resources that a) list what's changed, and b) describe the current state of things. And the developer section does both. The changelog has already been mentioned, and for example the need to get permissions reviewed now is also mentioned on the starting page for Facebook login, right at the top under Essential Guides.
Plus, Facebook actively informs you about changes - if you let them. Go to https://developers.facebook.com/settings/developer/contact/ where you'll find several options to get informed about specific stuff via e-mail.
You can check the changelog to see what changed: https://developers.facebook.com/docs/apps/changelog. There are no changes for /me/photos as far as I can tell.
It is possible to use image data or a URL.
See https://developers.facebook.com/docs/graph-api/reference/user/photos#Creating for more info

Is it possible to include # in redirect_uri for Facebook's dialog/feed

I have a jquery mobile site that I want to share via facebook's dialog/feed system.
jQuery mobile uses #'s for their internal navigation system, so if I want to share a jqm url for page_3 of my jqm site, I would use something like: http://www.my_jqm_site.com/#page_3.
But that # is causing grief for facebook's dialog/feed:
https://www.facebook.com/dialog/feed
?app_id = ...
&link = http://apps.facebook.com/celjska_puzzle/#page_3
&redirect_uri = http://apps.facebook.com/celjska_puzzle/#page_3
&picture = ...
&name = .
&caption = ...
&description = ...
So is there any way to do it?
I have tried it both with and without encoding.
Currently I suppose I will use a ? and then get the page to make some alterations via javascript during loading, but I really hate the thought of doing it this way.
I'm working on a problem like this right now. The URL to go in redirect_uri and link:
.../xfile.jsp?item=/contests/bhg_homeimprovement/bhg_splashsweeps_win2500_homeimprovement&temp=yes&hid=#HashID#&esrc=nwbhgsweeps072514a
The FB dialog gets an error as is. Encoding the full URL fixes that and does output hashes around "HashID", but the equal sign before it is removed. Adding a 2nd equal sign there will output 2 equal signs, but having just one will output none.
This isn't a complete solution but it does look like hash marks are possible.
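To make the "encoding the full URL" step concrete, here is a small Python sketch. It only shows how percent-encoding turns # into %23 (and = into %3D) so that they travel as part of the parameter value rather than being read as the dialog URL's own fragment; whether the dialog then accepts the value is not something this confirms. The function name and the sample app_id are made up for illustration; the parameter names and URLs are the ones from the question.

```python
import urllib.parse

DIALOG_URL = "https://www.facebook.com/dialog/feed"  # from the question above


def build_feed_dialog_url(app_id, link, redirect_uri):
    """Return the dialog URL with every parameter value percent-encoded."""
    params = {"app_id": app_id, "link": link, "redirect_uri": redirect_uri}
    # quote_via=quote with safe="" encodes reserved characters such as # and =.
    query = urllib.parse.urlencode(params, quote_via=urllib.parse.quote, safe="")
    return DIALOG_URL + "?" + query


if __name__ == "__main__":
    page = "http://www.my_jqm_site.com/#page_3"
    print(build_feed_dialog_url("123", page, page))  # "123" is a placeholder app_id
```

On the receiving side, a standard query-string parser decodes %23 back to #, so the original URL is recoverable.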

Google Analytics - Tracking pages using history token

I would like to know if Google Analytics automatically keeps track of the pages that have their state retained using the ajax history token ('#'), developed for example with GWT.
My app has a single html page and different modules (pages) have the same URL, except that part that comes after # (ex. www.mysite.com?test=true#page=Contacts/id=1).
Also, if this behavior is not the default, is there a way to set up Google Analytics to have this functionality?
EDIT:
I found this article which explains how #hashtag can be tracked:
http://www.searchenginepeople.com/blog/how-to-track-clicks-on-anchors-in-google-analytics.html
But if I use this solution, will the page access be recorded when a user clicks an anchor with href='#hashtag', or only when a page is accessed directly with that hashtag (in which case I should register a function that calls trackPageview when the history changes)?
Google tracks the # just fine. You just need to make sure it actually receives the # as-is (in our case the # got URL-encoded to %23 and we had to use a search-and-replace filter to restore it).
The most elegant way would probably be to look in the GA admin at the instructions for the advanced filter: there is a nice example of how to rewrite obscure URLs into something human-readable in the reports, which could easily be adapted for your needs.
I added the following lines to the initial analytics script:
_gaq.push(['_trackPageview', location.pathname + location.search + location.hash]);
and
window.onhashchange = function () {
    _gaq.push(['_trackPageview', location.pathname + location.search + location.hash]);
};
which tracks the history change.

How to recognize Facebook User-Agent

When sharing one of my pages on FB, I want to display something different. Problem is, I prefer not to use the og: elements, but to recognize FB user-agent.
What is it? I can't find it.
For a list of user-agent strings, look up here. The most used, as of September 2015, are facebookexternalhit/* and Facebot. As you haven't stated which language you're trying to recognize the user-agent in, I can't be more specific. If you do want to recognize the Facebook bot in PHP, use
if (
    strpos($_SERVER["HTTP_USER_AGENT"], "facebookexternalhit/") !== false ||
    strpos($_SERVER["HTTP_USER_AGENT"], "Facebot") !== false
) {
    // it is probably Facebook's bot
}
else {
    // that is not Facebook
}
UPDATE: Facebook has added Facebot to the list of their possible user-agent strings, so I've updated my code to reflect the change. The code is also now more robust against possible future changes.
"Facebook's user-agent string is facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)..."
Hi
Small, yet important, correction -> Facebook external hit uses 2 different user agents:
facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
Setting your filter to 1.1 only may cause filtering issues with the 1.0 version.
For more information about the Facebook bot (and other bots) please refer to Botopedia.org, a community-sourced bot directory, powered by Incapsula.
Besides user-agent data, the directory also offers an IP verification option, allowing you to cross-verify an IP/User-Agent, thus helping to prevent impersonation attempts.
Here are the Facebook crawler's user agents:
FacebookExternalHit/1.1
FacebookExternalHit/1.0
or
facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
Note that the version numbers might change. So use a regular expression to find the crawler name and then display your content.
Update:
You can use this code in PHP to check for the Facebook user agent:
$agent = $_SERVER['HTTP_USER_AGENT'];
if (preg_match('/^FacebookExternalHit\//i', $agent)) {
    print "Facebook User-Agent";
    // process here for Facebook
}
Here is ASP.NET code. You can use this function to check if the userAgent is Facebook's useragent.
public static bool IsFacebook(string userAgent)
{
    userAgent = userAgent.ToLower();
    return userAgent.Contains("facebookexternalhit");
}
Note:
Why would you need to do that? When you share a link to your site on Facebook, facebook crawls it and parses it to get some data to display the thumbnail, title and some content from your page, but it would link back to your site.
Also, I think this would amount to cloaking the site, i.e. displaying different data to users and to crawlers. Cloaking is not considered good practice, and many search engines and sites take note of it.
Update: Facebook also added a new useragent as of May 28th, 2014
Facebot
You can read more about the facebook crawler on https://developers.facebook.com/docs/sharing/webmasters/crawler
Please note that sometimes the agent is visionutils/0.2. You should check for it too.
Facebook User-Agents are:
FacebookExternalHit/1.1
FacebookExternalHit/1.0
facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
facebookexternalhit/1.0 (+https://www.facebook.com/externalhit_uatext.php)
facebookexternalhit/1.1 (+https://www.facebook.com/externalhit_uatext.php)
I'm using the code below to detect FB User-Agent in PHP and it works as intended:
$agent = $_SERVER['HTTP_USER_AGENT'];
if (stristr($agent, 'FacebookExternalHit')) {
    // Facebook User-Agent
} else {
    // Other User-Agent
}
The short solution is to check for the pattern, so that the Facebook-specific markup is not sent to every user each time:
<?php
# Facebook optimized stuff
if (strstr($_SERVER['HTTP_USER_AGENT'], 'facebookexternalhit')) {
    $buffer .= '<link rel="image_src" href="images/site_thumbnail.png" />';
}
?>
Given that the user agent may change on FB's side, it is probably safer to use a regex like this:
<?php
if (preg_match("/facebook|facebot/i", $_SERVER['HTTP_USER_AGENT'])) {
    do_something();
}
?>
You can find more information about Facebook crawler on their doc: https://developers.facebook.com/docs/sharing/webmasters/crawler
And if you want to block facebook bot from accessing your website (assuming you're using Apache) add this to your .htaccess file:
<Limit GET POST>
    BrowserMatchNoCase "Feedfetcher-Google" feedfetcher
    BrowserMatchNoCase "facebookexternalhit" facebook
    order deny,allow
    deny from env=feedfetcher
    deny from env=facebook
</Limit>
It also blocks google's feedfetcher that also can be used for cheap DDoSing.
Firstly, you should not use in_array, as you would need the full user agent and not just a subset, so it will quickly break when things change (i.e. a version 1.2 from Facebook will not match if you follow the currently preferred answer). It is also slower to iterate through an array than to use a regex pattern.
As you will no doubt want to look for more bots later, I've given the example below with 2 bot names separated in the pattern by the pipe | symbol. The /i at the end makes it case insensitive.
Also, you should not use $_SERVER['HTTP_USER_AGENT'] directly; filter it first in case someone has put something nasty in there.
$pattern = '/(FacebookExternalHit|GoogleBot)/i';
$agent = filter_input(INPUT_SERVER, 'HTTP_USER_AGENT', FILTER_SANITIZE_ENCODED);
if (preg_match($pattern, $agent)) {
    echo "found one of the patterns";
}
A bit safer and faster code.
You already have the answer for Facebook above, but one way to get any user agent is to place a script on your site that will mail you when there is a visit to it. For example, create this file on your domain at, say, https://example.com/user-agent.php :
<?php
mail('you@youremail.com', 'User Agent', $_SERVER['HTTP_USER_AGENT']);
Then, visit Facebook, and type the link to the script there, and hit space bar. You don't actually have to share anything, just typing the link in and a space will cause Facebook to fetch a preview. You should then get an email with Facebook's user agent.
Another generic approach in PHP
$agent = $_SERVER['HTTP_USER_AGENT'];
$agent = trim($agent);
$agent = strtolower($agent);
if (
    strpos($agent, 'facebookexternalhit/1.1') === 0
    || strpos($agent, 'facebookexternalhit/1.0') === 0
) {
    // probably facebook
} else {
    // probably not facebook
}
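For completeness, here is roughly the same check in Python, pulling together the points made in the answers in this thread: match the crawler name rather than a specific version, ignore case, and include Facebot. The function and constant names are my own, and it assumes you already have the User-Agent header as a string (or None) from whatever framework you use. Remember that a user agent can be spoofed, so treat a match as "probably Facebook".

```python
import re

# Crawler names mentioned in the answers above; matched case-insensitively
# and without the version number, so a future 1.2 would still match.
FACEBOOK_UA_PATTERN = re.compile(r"facebookexternalhit|facebot", re.IGNORECASE)


def is_facebook_crawler(user_agent):
    """Return True if the User-Agent header looks like a Facebook crawler."""
    if not user_agent:
        return False  # header missing or empty
    return FACEBOOK_UA_PATTERN.search(user_agent) is not None
```

If you also want to catch other bots later, add more names to the pattern separated by the pipe symbol.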

Codeigniter Facebook app POST method AND query_string

I have a toy Facebook app I'm playing with so I can understand how it all works. It's fine if you go to the app like this: http://apps.facebook.com/pushup-challenge/ (and connect it). But if you then go to it from your Facebook page, FB uses the URL http://apps.facebook.com/pushup-challenge/?ref=bookmarks.
In my log file, I see that FB is POSTing the data and including the /?ref=bookmarks in its call to my CodeIgniter system. This is causing it to either say "invalid URI parameters" or give me a 404, depending on whether I've edited the system/core/URI.php file to add rawurlencode() to a particular call.
I've tried using mod_rewrite to get rid of the query_string, too, but since it's POSTing, it doesn't appear to be working (though I'm not exactly sure why).
Has anyone else run into this? How did you fix it?
Thanks in advance,
Hans
Try $config['uri_protocol'] = 'PATH_INFO'; and set $config['enable_query_strings'] = TRUE;
or
set
$config['permitted_uri_chars'] = 'a-z 0-9~%.:_\-?=';
in config.php.
Because it isn't calling your file by name (just ?ref=bookmarks) the server runs thru the standard default files: index.htm, index.html, index.asp. Because you need to accept a POST, you need a server that allows POSTs to htm & html if you choose to use those. Index.asp will accept POSTs on most servers, and that works for me.
SOLUTION: Add a file (index.asp), that calls the real app that you named in the App settings.