Is my CDN or W3 Total Cache blocking the Facebook crawler?

Every new page I've added since yesterday (6 of them) has been unable to pull data from the Facebook crawler. The debugger gives me response code 200 and:
Errors that must be fixed: The 'og:type' property is required, but not present.
Errors that should be fixed: Inferred Property: The 'og:url' property should be explicitly provided, even if a value can be inferred from other tags.
Both of these are actually on the page.
Crawler tells me it's seeing this:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="robots" content="noindex,nofollow">
<script>
loooong token(?) information
</script>
</head>
<body>
<iframe style="display:none;visibility:hidden;" src="//content.incapsula.com/jsTest.html" id="gaIframe"></iframe>
</body>
</html>
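That markup looks like Incapsula's JavaScript challenge page rather than my actual content. To reproduce what the crawler sees from the command line (a sketch; the URL is a placeholder for one of the affected pages):
curl -A "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)" "http://example.com/new-page/"
If that command returns the Incapsula iframe page while a normal browser user agent gets the real article, the CDN is challenging the crawler.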
I tried disabling/re-enabling the plugins I've updated recently, but that doesn't seem to do anything.
Ideas?

Related

open graph protocols not working

I have been trying to implement Open Graph protocols on my pages and keep getting errors through the Facebook debugger. At first I was getting
Could Not Follow Redirect Path - Circular redirect path detected
and research suggested this could be fixed by changing
<meta property = "og:url" content="http://www.mandyevansartist.com/newsletterone/index.html" />
into
<meta property = "og:url" content="http://www.mandyevansartist.com/newsletterone/index.html/" />
(adding a trailing slash at the end)
but this has created even more errors. Depending on whether I put the trailing slash on the end of the address when I plug it into the debugger, it will not scrape any information at all. At this point I am getting
'URL returned a bad HTTP response code'
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta property = "og:title" content="Noahs Ark|Mandy Evans Artist|Newsletter One" />
<meta property = "og:description" content="Noahs Ark is the limited edition print showcased by unique artist Mandy Evans in this first website newsletter" />
<meta property = "og:image" content="http://www.mandyevansartist.com/newsletterone/images/sendinglove1.jpg" />
<meta property = "og:url" content="http://www.mandyevansartist.com/newsletterone/index.html/" />
<meta property = "og:type" content="website"/>
Could someone point out what I am missing?
It seems that your server is returning a 301 code. You can see that with the following:
curl http://www.mandyevansartist.com/newsletterone/index.html
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head>
<body>
<h1>Moved Permanently</h1>
<p>The document has moved here.</p>
</body></html>
Note that you redirect from ...newsletterone... to ...newsetterone.... But the og:url you provide points to newsletterone again, which gets the crawler stuck in a loop.
This is the part that you need to fix.
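A quick way to see the whole redirect chain is curl with -I (headers only) and -L (follow redirects):
curl -sIL http://www.mandyevansartist.com/newsletterone/index.html
Each response block shows a status line and, for the 301, a Location header; og:url should point at the final URL in that chain, not at one of the redirecting ones.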

How to redirect to the source URL from a form (when the form is submitted or closed)

On our main intranet page (homepage), http://indi.cdc.com, I created a link to an InfoPath form that lives on the site collection http://indi.cdc.com/salesteam, so the link looks like the ones below. When users click Submit or Cancel they are not redirected back to the homepage (http://indi.cdc.com); they just see "The form has been closed." Please suggest.
I tried the following, and neither does the redirection:
http://indi.cdc.com/salesteam/Lists/RequestsList/Issue/newifs.aspx?Source=http://indi.cdc.com/Pages/Home.aspx
http://indi.cdc.com/salesteam/Lists/RequestsList/Issue/newifs.aspx?Source=http://indi.cdc.com/Pages/Home.aspx?target=http://indi.cdc.com/salesteam
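One thing worth checking (an assumption, not something confirmed in this thread): SharePoint generally expects the value of the Source query parameter to be URL-encoded, e.g.:
http://indi.cdc.com/salesteam/Lists/RequestsList/Issue/newifs.aspx?Source=http%3A%2F%2Findi.cdc.com%2FPages%2FHome.aspx
Left unencoded, the embedded :// and any second ? can be misparsed and the redirect silently dropped.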
I also tried a redirect-page approach:
Created a redirect page (http://indi.cdc.com/salesteam/SharedDocuments/Redirecting.html).
Created a view (redirect.aspx) on the target list that hosts the InfoPath form, edited the page, dropped in a CEWP, and referenced the Redirecting.html file from it (http://indi.cdc.com/salesteam/spteam/Lists/RequestsList/redirect.aspx).
Recreated the link on the homepage:
http://indi.cdc.com/salesteam/spteam/Lists/RequestsList/Issue/newifs.aspx?Source=http://indi.cdc.com/salesteam/Lists/RequestsList/redirecting.aspx
Here is the HTML code for the redirecting.html page:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:mso="urn:schemas-microsoft-com:office:office" xmlns:msdt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Refresh"
content="0; URL=http://indi.cdc.com">
<title></title>
</head>
<body>
</body>
</html>
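One caveat (an assumption, not confirmed in the thread): when a CEWP injects this file into an existing page, the meta Refresh ends up in the middle of the body, where it may be ignored. A script-based redirect is a common fallback:
<script type="text/javascript">
// Redirect back to the intranet homepage as soon as this snippet runs.
window.location.href = "http://indi.cdc.com";
</script>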

w3c validator shows error for facebook open graph

I get 2 errors while validating my site in the W3C validator:
Line 7, Column 47: Attribute xmlns:og not allowed here.
xmlns:fb="http://www.facebook.com/2008/fbml" >
Line 7, Column 47: Attribute with the local name xmlns:fb is not serializable as XML 1.0.
xmlns:fb="http://www.facebook.com/2008/fbml" >
I guess it is related to Facebook Open Graph. I'm running my site on WordPress and using the All in One SEO Pack with Social features enabled. When the Social feature is disabled, my site validates perfectly with no errors. Is there a fix for this problem?
This is how it looks on the site:
<!DOCTYPE html>
<!--// OPEN HTML //-->
<html lang="en-US"
xmlns="http://www.w3.org/1999/xhtml"
xmlns:og="http://ogp.me/ns#"
xmlns:fb="http://www.facebook.com/2008/fbml" >
<!--// OPEN HEAD //-->
<head>
<!--// SITE TITLE //-->
<title>Aton usluge | Licencirana agencija za kreditno posredovanje</title>
<!--// SITE META //-->
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, user-scalable=no, initial-scale=1, maximum-scale=1">
and so on..
The xmlns attribute is deprecated in HTML+RDFa 1.1.
You should use the prefix attribute instead:
<html
prefix="og: http://ogp.me/ns#
fb: http://www.facebook.com/2008/fbml">
or, if you want to keep the xmlns for XHTML5:
<html
xmlns="http://www.w3.org/1999/xhtml"
prefix="og: http://ogp.me/ns#
fb: http://www.facebook.com/2008/fbml">
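Applied to the markup from the question, the opening of the document would then look roughly like this (keeping the site's HTML5 doctype and lang attribute; the og:/fb: meta properties in the head stay unchanged):
<!DOCTYPE html>
<html lang="en-US"
prefix="og: http://ogp.me/ns#
fb: http://www.facebook.com/2008/fbml">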

Facebook debugger is seeing code that isn't there

When the Facebook debugger scrapes http://www.daisyworld.co.za it says 'Can't Download: Could not retrieve data from URL.' When I click 'See exactly what our scraper sees for your URL', this is what I get:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head>
<body><p>ÿþ</p></body>
</html>
But what is actually there is:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<META HTTP-EQUIV="content-language" CONTENT="En">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">
None of the other pages in the domain where I implemented a Like button have any problems; they work just fine, and I basically used the same pieces of FB code for all of them, with just the particulars changed for each page. I cannot figure out what the problem is, except that it seems the debugger is looking at a cached file, but surely that isn't supposed to happen?
Maria-Helena
I just hit this issue as well and discovered that Facebook's scraper was coming in as an inbound JSON request. Since that particular route was set up to handle both JSON and HTML responses, FB was getting a big gnarly JSON blob instead of the actual web page. Not sure if this solves your exact problem, but hopefully it sparks some fresh ideas!
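If you want to rule that failure mode out, checking the Content-Type the server sends to the scraper's user agent is enough (a sketch using the URL from the question):
curl -sI -A "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)" "http://www.daisyworld.co.za/" | grep -i "content-type"
Anything other than text/html there means the route is content-negotiating the scraper into the wrong format.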
Try saving the file with a different encoding; going from "Unicode" (UTF-16) to UTF-8 did it for me.
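For context: ÿþ is the UTF-16 little-endian byte order mark (bytes 0xFF 0xFE) read as single characters, which is why the scraper chokes. A quick check on a Unix shell (assuming the page is index.html):
xxd index.html | head -n 1
If the hex dump starts with fffe, the file is saved as UTF-16 and needs re-saving as UTF-8.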

Can't solve Facebook Open Graph Meta tags not being scraped for my Wordpress site

This is my first time posting a question on this site, but certainly not the first time finding answers on it.
I have used Stack Overflow as a resource to fix several issues I've faced with my new blog; that is, until last night, when I ran into this issue that I just can't fix.
When I try to share the home page of my blog, I don't get the proper image specified in the og:image tag. When I check my site via the FB debugger, it shows me this:
https://developers.facebook.com/tools/debug/og/object?q=ivanfuentes.com
Curiously enough, I do not find any issues when I check a page or a post:
https://developers.facebook.com/tools/debug/og/object?q=ivanfuentes.com%2Fvideos%2F
https://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Fivanfuentes.com%2Fthe-popularity-contest%2F
So, I know it's an issue generated on the home page only, but over the last 18 hours I have been unable to find it.
I have OG meta tags specified dynamically via a WordPress plugin; currently it's "Facebook AWD", but I've had several other Facebook sharing, all-in-one, and OG plugins, and they all give me the same results in the debugger, which makes me think I messed up somewhere else. I have no embarrassment in admitting I'm quite a newbie, so it's highly likely I messed up while trying to modify some code, probably when I added a few lines to make the site IE compliant.
I hope I gave enough information and someone gets to help me, as this is not only about the proper image being displayed on a Facebook link, but about me likely having a mess in my code, and that could (WILL) mean trouble once I make any mods/updates to my site in the future.
Thanks for the time!
Your HTML is a complete mess, and that's why the debugger is complaining.
Visiting your page and looking at the code I can see this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<div id="fb-root"></div>
<script>
...
</script>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml" xmlns:og="http://ogp.me/ns#" xmlns:fb="https://www.facebook.com/2008/fbml">
<head profile="http://gmpg.org/xfn/11">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<div id="fb-root"></div>
<script>
...
</script>
<title>Ivan Fuentes Hagar</title>
Two problems there:
The SDK code is inserted twice.
In both cases there's a div placed before the body.
In the debugger result for this page, clicking the bottom link (Scraped URL: See exactly what our scraper sees for your URL) also shows broken HTML, but in another variation:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><body>
<div id="fb-root"></div>
<script>
...
</script><meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<script>
...
</script><title>Ivan Fuentes Hagar</title>
Four problems here:
The body definition is right after the html.
There's no head definition.
All of the tags that are supposed to live inside the head are inside the body.
The SDK script is inserted twice.
In both cases I found 3 occurrences of the <div id="fb-root"></div>.
As you can see, you have some fixin' up to do with the HTML output of your WordPress.
I'm not sure why the output is different for the debugger. I thought it might be due to the user agent string, but trying curl --user-agent "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)" "http://ivanfuentes.com/" returns exactly the same result as the browser.
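For reference, a minimal sketch of how the top of the rendered page should come out, based on the fragments above: one doctype, one html element, every meta tag inside head, and fb-root plus the SDK snippet exactly once at the start of body:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:og="http://ogp.me/ns#" xmlns:fb="https://www.facebook.com/2008/fbml">
<head profile="http://gmpg.org/xfn/11">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Ivan Fuentes Hagar</title>
<!-- og:* meta tags belong here, inside head -->
</head>
<body>
<div id="fb-root"></div>
<script>
...
</script>
<!-- rest of the page -->
</body>
</html>
In WordPress terms that usually means checking header.php and whichever plugins hook into wp_head, so the SDK snippet is echoed exactly once.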