Facebook Scraper seeing only parts of header - facebook

I couldn't see an obvious reason for the differences, but I am pretty new to coding for Facebook friendliness.
I've got a page on my site that shows Flickr albums:
http://jpgme.co.uk/sports/index.php?type=sets
At the moment the headers source looks like this:
<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Transitional//EN' 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd'>
<html xmlns:fb="http://ogp.me/ns/fb#" xmlns='http://www.w3.org/1999/xhtml'>
<head>
<meta http-equiv='Content-Type' content='text/html; charset=utf-8' />
<title>Blue Line Ice Hockey Photography - Professional Ice Hockey Photography in Greater Manchester & Beyond</title>
<link href='http://jpgme.co.uk/sports/themes/blackstripe/css/blackstripe.css' rel='stylesheet' type='text/css' />
<link rel='alternate' type='application/rss+xml' title='Blue Line Ice Hockey Photography - Professional Ice Hockey Photography in Greater Manchester & Beyond' href='http://jpgme.co.uk/sports/index.php?type=rss' />
<meta property="fb:admins" content="61401353" />
<meta property="og:title" content="Blue Line Ice Hockey Photography" />
<meta property="og:type" content="website" />
<meta property="og:url" content="http://jpgme.co.uk/sports/"/>
<meta property="og:image" content="http://jpgme.co.uk/images/fb.jpg"/>
</head>
But all the Facebook scraper can see is:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head>
<body>
<div id="leftcontent"></div>

Try scraping after the document has loaded.
A possible reason is that you are trying to scrape before the whole page has loaded.
Moreover, Facebook uses AJAX to add and remove the div as and when required, which could also be the cause.
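To check which tags are present in the raw markup, before any script has run, one option is to parse the HTML for Open Graph meta tags. The following is a minimal sketch using only Python's standard library. The names OGParser and extract_og_tags and the sample markup are illustrative assumptions, and this is not how Facebook's scraper is actually implemented.

```python
from html.parser import HTMLParser


class OGParser(HTMLParser):
    """Collect <meta property="og:..."> and "fb:..." tags from raw HTML."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        if prop.startswith(("og:", "fb:")):
            self.tags[prop] = attributes.get("content")


def extract_og_tags(html):
    """Return a dict of Open Graph properties found in the given markup."""
    parser = OGParser()
    parser.feed(html)
    return parser.tags


# Hypothetical sample, modelled on the header in the question
sample = """<html><head>
<meta property="og:title" content="Blue Line Ice Hockey Photography" />
<meta property="og:type" content="website" />
</head><body></body></html>"""

tags = extract_og_tags(sample)
print(tags)
```

Running the same function on the markup the scraper reports (an empty head and a bare div) yields an empty dict, which would suggest the server sent different content to the scraper than to the browser.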

Related

Facebook Open Graph Tags not working correctly in a simple Html

I'm trying to use Facebook Open Graph meta tags for my site. I want my links to be shown correctly when I share them on Facebook, but they aren't. When I test the link in the Facebook Debugger, it always shows "Error parsing input URL, no data was cached, or no data was scraped." I searched a lot, read Facebook's documentation for good examples, and followed them, but with no success. Here is my page code:
<!DOCTYPE html>
<html lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb#">
<meta charset="utf-8" />
<title>تست</title>
<link rel="canonical" href="www.kaladaran.com">
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
<meta name="keywords" content="Product">
<meta name="description" content="Product">
<!--FACEBOOK-->
<meta property="og:type" content="website">
<meta property="og:title" content="Test">
<meta property="og:site_name" content="Kaladaran">
<meta property="og:url" content="www.kaladaran.com/test.html">
<meta property="og:description" content="Descritption">
<meta property="fb:app_id" content="My App Id">
<meta property="og:image" content="http://kaladaran.com/Data/1598/6ccec951-f124-43f6-abb0-a59e3d94bd67.jpg">
<meta property="og:image:width" content="250px">
<meta property="og:image:height" content="169px">
<meta property="og:locale" content="fa_IR"><meta property="article:author" content="https://www.facebook.com/KDAdsCo" />
<meta property="article:publisher" content="https://www.facebook.com/KDAdsCo" />
</head>
<body>
<div>
<h1>
Facebook Open Graph Tag Test
</h1>
<img src="http://kaladaran.com/Data/1598/6ccec951-f124-43f6-abb0- a59e3d94bd67.jpg" alt="Panasonic" />
</div>
</body>
</html>
I can't understand what is wrong with my code. Any help is appreciated in advance.
The line:
<link rel="canonical" href="www.kaladaran.com"> is causing the error. A canonical link is meant to point to the current page, not the home page.
Also note that the URL inside a <link> is treated as relative if you leave out the http://, which is why you are getting the error.
Change it to:
<link rel="canonical" href="http://www.kaladaran.com/test.html">
<meta property="og:url" content="http://www.kaladaran.com/test.html">
Also, remember to enter exactly the same URL in the Facebook scraper.
When Facebook finds a canonical link that points to a different URL, it treats it like a redirect.
From Facebook:
The following will be treated as a redirect by the crawler:
A HTTP redirect
A <link rel="canonical" href=".." /> tag
A <meta property="og:url" content=".." /> tag
https://en.wikipedia.org/wiki/Canonical_link_element
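The point about scheme-less URLs being relative can be checked with standard URL resolution. This is a small sketch using Python's urllib.parse. The helper is_absolute is a hypothetical name introduced for illustration, and Facebook's own resolution logic is not documented here.

```python
from urllib.parse import urljoin, urlparse

page = "http://www.kaladaran.com/test.html"

# Without a scheme, "www.kaladaran.com" is resolved as a relative path
# against the current page, not as a host name.
resolved = urljoin(page, "www.kaladaran.com")
print(resolved)


def is_absolute(url):
    """True if the URL carries both a scheme and a host."""
    parts = urlparse(url)
    return bool(parts.scheme and parts.netloc)


print(is_absolute("www.kaladaran.com/test.html"))
print(is_absolute("http://www.kaladaran.com/test.html"))
```

A check like is_absolute could be applied to the canonical link and the og:url value before submitting the page to the debugger.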

Facebook swf embed works only for selected domains

For about a week or two, the Facebook SWF embed feature has not been working for my website. I realized that it stopped working for a few websites but was still working for soundcloud.com. After doing some research I was able to pinpoint the issue to a single Open Graph tag.
A website containing the following does seem to work:
<!DOCTYPE html>
<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml"><head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta charset="utf-8">
<meta content="music.song" property="og:type">
<meta content="http://venc.pl/test.html" property="og:url">
<meta content="Asd - Keygen Music" property="og:title">
<meta content="http://i1.sndcdn.com/artworks-000026602093-dp518o-t500x500.jpg?16b9957" property="og:image">
<meta content="Listen to Asd / Asd - Keygen Music | Explore the largest community of artists, bands, podcasters and creators of music & audio." property="og:description">
<meta content="SoundCloud" property="og:site_name">
<meta content="video" name="medium">
<meta content="98" property="og:video:height">
<meta content="460" property="og:video:width">
<meta content="application/x-shockwave-flash" property="og:video:type">
<meta content="http://player.soundcloud.com/player2.swf" property="og:video">
</head></html>
But the following does not:
<!DOCTYPE html>
<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml"><head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta charset="utf-8">
<meta content="music.song" property="og:type">
<meta content="http://venc.pl/test2.html" property="og:url">
<meta content="Asd - Keygen Music" property="og:title">
<meta content="http://i1.sndcdn.com/artworks-000026602093-dp518o-t500x500.jpg?16b9957" property="og:image">
<meta content="Listen to Asd / Asd - Keygen Music | Explore the largest community of artists, bands, podcasters and creators of music & audio." property="og:description">
<meta content="SoundCloud" property="og:site_name">
<meta content="video" name="medium">
<meta content="98" property="og:video:height">
<meta content="460" property="og:video:width">
<meta content="application/x-shockwave-flash" property="og:video:type">
<meta content="http://venc.pl/player.swf" property="og:video">
</head></html>
So it basically boils down to the domain used in the og:video tag.
There used to be a whitelist of websites allowed to embed SWF on Facebook a few years ago. The idea was dropped, but I think Facebook has just gone back to it.
How can I get in touch with Facebook to resolve this issue? If whitelisting is needed, how do I ask to be whitelisted?
I believe the answer is that the SWF file needs to be served from an https link (see "how to share a video from my website on facebook like youtube").

Multilingual Open Graph objects?

For a Facebook game, I want to have Open Graph picture objects with titles in multiple languages (specifically, English and German). I did everything as described in Facebook's Open Graph internationalization document, but somehow the objects (and actions) are always shown with English titles in the news feed and activities, and the app is definitely localized in its configuration.
Here's the URL of one of the objects: http://apps.facebook.com/spot-it/opengraph/picture/pictures.1A24.html
If I get it through Facebook's object debugger using the fb_locale parameter set to German, I see:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb# spot-it: http://ogp.me/ns/fb/spot-it#">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta property="og:locale:alternate" content="de_DE">
<meta property="og:locale:alternate" content="en_US">
<meta property="og:locale" content="de_DE">
<meta property="fb:app_id" content="419035224820013">
<meta property="og:type" content="spot-it:picture">
<meta property="og:url" content="http://apps.facebook.com/spot-it/opengraph/picture/pictures.1A24.html">
<meta property="og:title" content="Beißerchen">
<meta property="og:description" content="Findest du die Fehler in 'Beißerchen'?">
<meta property="og:image" content="http://d2tv32y5kdvj8c.cloudfront.net/assets/pictures/1A24_potd.jpg">
</head>
<body>
<script type="text/javascript">
self.location.href = "";
</script>
</body>
</html>
So why doesn't Facebook use the German version when displaying actions involving that object to German users? Am I doing something wrong, or does internationalization for open graph simply not work?
In the app settings at developers.facebook.com > your app > "Localize", you can add support for additional languages.
You can set:
display name
tagline
description
detailed description
explanation for permissions
and all images such as the logo, icon, web banner and cover image
I hope this is what you're looking for ;)

cannot get opengraph to scrape url

I know this is a pretty common problem, but none of the solutions I have tried (and that's a lot) have worked.
I am trying to scrape http://residencyradio.com/ using https://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Fresidencyradio.com%2F
The site itself is getting a complete overhaul, which will be revealed next week, and I want the relevant images, title and info to appear if someone links to the site. Instead, at the moment these properties are shown as they were the very first time the site was cached on FB (nearly a year ago).
As far as I can see, I have included all the relevant meta tags as they should be. I even tried implementing a Like button on the site, but to no avail. I have followed what is set out at http://ogp.me/ and can't see anything wrong.
Here is a snippet of the page from !DOCTYPE to </head>:
<!DOCTYPE html>
<html lang="en" prefix="og: http://ogp.me/ns#">
<head>
<meta charset=utf-8>
<title>The Residency</title>
<!--[if IE]>
<script src="http://html5shiv.googlecode.com/svn/trunk/html5.js">
</script>
<![endif]-->
<link href="css/reset.css" rel="stylesheet" type="text/css" />
<link href="css/stylesheet.css" rel="stylesheet" type="text/css" />
<link rel="icon" type="image/png" href="images/favicon.png" />
<!--Meta Data-->
<meta name="keywords" content="The Residency, M. Budden, Neal McClelland, Michael Budden, Radio, Residency Radio,
Residency, Global, House, Electro, Progressive, Tech, Techno, DnB, Drum and Base, Dubstep, iTunes, Belfast,
Northern Ireland, UK" />
<meta name="description" content="Brought to you by Neal McClelland and M. Budden, The Residency is a weekly global underground dance show" />
<meta property="og:title" content="The Residency A global underground dance show" />
<meta property="og:type" content="musician" />
<meta property="og:url" content="https://www.facebook.com/theresidency" />
<meta property="og:image" content="https://www.residencyradio.com/images/Residency_logo_circle.png" />
<meta property="og:site_name" content="The Residency" />
<meta property="fb:admins" content="1324839758" />
Any help would be greatly appreciated, as I've been scratching my head for a few days trying to figure it out!
Thanks in advance!
This is a guess, but your HTML is not valid, and maybe because of that the Facebook scraper fails to parse it and extract the data.
I haven't gone through all of it, but you don't seem to close all tags.
For example, the description and keywords meta tags don't end with "/>" or ">".
Edit
Screen capture of what the debugger shows when I load your html from my server:

Facebook debugger : Response 206

url: http://www.pagepilot.co.uk/pp_cftest/
When posted into the Facebook debugger: https://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Fwww.pagepilot.co.uk%2Fpp_cftest%2F
It keeps returning a 206 Partial Content response instead of a 200 OK.
The code being returned is:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:og="http://ogp.me/ns#" xmlns:fb="https://www.facebook.com/2008/fbml">
<head>
<title>pp_cftest</title>
<meta property="og:title" content="pp_cftest" />
<meta property="og:type" content="company" />
<meta property="og:url" content="http://www.pagepilot.co.uk/pp_cftest/" />
<meta property="og:site_name" content="PagePilot" />
<meta property="og:description" content="description text" />
<meta property="fb:app_id" content="242396009188876" />
<meta property="og:image" content="http://www.pagepilot.co.uk/views/pp_w_en/assets/images/ladbrokes.gif" />
</head>
<body>
<p>test content</p>
</body>
</html>
There doesn't seem to be anything unusual or missing in the code. I just don't understand it.
The debugger only requests the first 40KB of your page, so the 206 is expected (well, it's expected if only part of the document was returned, but I guess some servers return it for any request with a Range: header).
It shouldn't affect your ability to have the tags read correctly and the metadata populated when sharing a link on Facebook.
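The behaviour described here can be illustrated with a toy model of a range-aware server. This is a sketch only: the function serve is hypothetical, it handles just the simple "bytes=start-end" form, and the exact Range value the debugger sends (shown here as bytes=0-40959 for 40KB) is an assumption.

```python
def serve(body, range_header=None):
    """Return (status, payload) roughly as a range-aware server might."""
    if range_header is None:
        return 200, body
    # Only the simple "bytes=start-end" form is handled in this sketch
    _, _, spec = range_header.partition("=")
    start_text, _, end_text = spec.partition("-")
    start = int(start_text)
    last = len(body) - 1
    end = min(int(end_text), last) if end_text else last
    # Some servers answer 206 for any ranged request, even when the
    # requested range covers the whole document
    return 206, body[start:end + 1]


page = b"<html><head><title>pp_cftest</title></head><body><p>test content</p></body></html>"

# Assumed header value for a "first 40KB" request
status, payload = serve(page, "bytes=0-40959")
print(status, payload == page)
```

In this model the status is 206 even though the full page is returned, which matches the answer's point that the status code alone does not mean any tags were cut off.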