Facebook Connect Won't Validate

I'm trying to get my Facebook Connect code to validate, but it won't. I think the problem is that their xmlns page isn't loading. I have the code:
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:fb="http://www.facebook.com/2008/fbml" xml:lang="en" lang="en">
But http://www.facebook.com/2008/fbml isn't found. Does anyone have a copy of what it should be? Is there a different URL I should use?

An XML namespace doesn't need to resolve to an actual page; namespaces are just identifiers used to qualify elements and attributes. If you're interested (this is unrelated to your issue), there is more about XML namespaces here
Facebook seems to have a pretty straightforward setup page here: FB Connect, and it's laid out step by step. Have you checked it out?

As already mentioned, the namespace URL is a red herring. It's the DTD (as specified in the DOCTYPE) that the document is validated against.
If you really want to validate pages that use XFBML, you will need to validate against a custom DTD. And, as far as I'm aware, Facebook doesn't publish a DTD for XFBML, so you'll have to write one yourself (probably covering only the elements and attributes you actually use).
It's not actually as tricky as it sounds; here's an A List Apart article on how to validate against a custom DTD.
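To sketch the idea (this fragment is illustrative only - Facebook publishes no such DTD, and the file name is made up): declare the XFBML elements you use, splice them into XHTML's content model, then pull in the standard DTD. In DTDs, the first declaration of a parameter entity wins, so the override must come before the include:

<!-- xhtml-plus-xfbml.dtd (hypothetical): XHTML 1.0 Transitional plus fb:like -->
<!ELEMENT fb:like EMPTY>
<!ATTLIST fb:like href CDATA #IMPLIED>
<!-- Override %special; so fb:like is allowed inline (list abbreviated;
     copy the full entity from the XHTML DTD included below) -->
<!ENTITY % special "br | span | bdo | object | img | map | fb:like">
<!ENTITY % xhtml SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
%xhtml;

Your page then points at it with <!DOCTYPE html SYSTEM "xhtml-plus-xfbml.dtd">.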
Note also that messing around with your DOCTYPE declaration may knock (older) browsers into quirks mode.
So you can get it to validate; it's just up to you whether it's worth the hassle.

Facebook XHTML does not validate.

XFBML - why did they have to be different? Nobody will care enough to petition them on this until a couple of years from now, and then the W3C will rewrite its standard to include XFBML, or Facebook will be forced to rewrite in a more compatible format. Why not use IDs or rel attributes to make their script work? That would allow a 100% valid format that would play nicely with all the browsers and CMSes out there. Maybe they were worried about blog pages without root access enabling FB-based scams?

Related

Site not valid - but it is

So, I'm building a website called "dagbok.nu", which is Swedish for "diary now" :)
Anyway, when I create the Facebook application, it claims that both the site URL and the app domain are invalid. For the site URL, I used "http://dagbok.nu", and for the site domain, I used "dagbok.nu". Please don't reply (as I've seen others do on similar issues) that I should type the site URL with the scheme and the domain without - that's exactly what I'm doing.
Right, so according to another question here, one could troubleshoot this functionality using FB's own URL scraper, so I did just that:
http://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Fdagbok.nu
And the reply: Error Parsing URL: Error parsing input URL, no data was scraped
Right, so now I can assume that the reason for it being considered invalid is because of FB not being able to scrape the URL. But why?
According to this question, one of the reasons seems to be that FB has deemed the URL insecure or "spammy". I acquired this domain from a previous owner, so that wasn't out of the question. But when doing the same thing as Matthew in that post - i.e. trying to post the domain "http://dagbok.nu" to my timeline - I didn't get any information. The status box expanded as if to include a thumbnail and information about the link, but it only contained a "(No title)" text and nothing more.
So now I don't know what to do. I've checked the DIG and NS records from multiple servers around the web, and all of them resolve it correctly, and I've had friends double-check the URL from the States as well. I can't understand what's wrong, and I have no idea how to ask someone at FB to resolve this. Does anyone here have good advice? Thanks in advance! :)
EDIT
When changing the domain to another domain that points to the exact same web server and document_root, it works! So this is definitely a problem with the domain "dagbok.nu" and not with the code on that page.
EDIT
When using the debug function above, I see no activity in the server log whatsoever. Facebook doesn't even contact the server. When using the alternate URL - the one from the last edit - it pops up in the logs as it should.
EDIT
I filed a bug report with Facebook, and their first response was that they were going to follow up. Now, a month later, I got an email that said "We are prioritizing bugs based on impact to the developer community. As this bug report has not received much attention from other developers, we are closing it so as to better focus on the top issues", and then they told me to come here to Stack Overflow to try to solve my issue - but the issue is WITH THEM, and of course no one else has reported that my site doesn't work; it affects only me, and I haven't even launched it yet because of this bug!
EDIT
I wanted to file a new bug report, but I can't even do that now, since they are blocking bug reports with this URL as well!
I had to edit the URL - here is the new bug report
When Facebook tries to scrape your site for information, it sends a request to your server with a specific user agent called "facebookexternalhit"...
Facebook needs to scrape your page to know how to display it around the site. Facebook scrapes your page every 24 hours to ensure the properties are up to date. The page is also scraped when an admin for the Open Graph page clicks the Like button and when the URL is entered into the Facebook URL Linter. Facebook observes cache headers on your URLs - it will look at "Expires" and "Cache-Control" in order of preference. However, even if you specify a longer time, Facebook will scrape your page every 24 hours.
The user agent of the scraper is: "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
Make sure it is not blocked by your server firewall.
Look in your server log to see whether it even tried to access your site.
If you think this is a firewall issue, look at this link.
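One quick way to check the second point (a sketch, assuming a PHP front controller; the log destination is whatever your setup uses):

<?php
// Log any hit from Facebook's scraper so you can confirm it gets
// past the firewall (user agent string quoted in this answer).
$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
if (strpos($ua, 'facebookexternalhit') !== false) {
    error_log('Facebook scraper hit: ' . $_SERVER['REQUEST_URI']);
}
?>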
Your problem appears to be with your character encoding string. Your Apache server is currently sending the unsupported value latin1, while your meta content-type declares iso-8859-1. See the W3C validator.
From what I've seen, the Facebook parser will stop immediately if it encounters either an unrecognized character encoding string or a mismatch in character encoding strings between your header and meta tags.
The problem could be originating from either your httpd.conf or php.ini files. Change these to match your meta and restart Apache. Since the problem seems to be domain-specific, I'd check httpd.conf first.
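For example, the relevant settings might look like this (a sketch - align them with whichever charset you actually intend to serve):

# httpd.conf - make Apache advertise the same charset the meta tag declares
AddDefaultCharset ISO-8859-1

; php.ini - keep PHP's default in sync if PHP generates the pages
default_charset = "ISO-8859-1"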
Could your domain be blacklisted? Could you try messaging your URL to someone and see if Facebook gives you a "This message contains blocked content..." error?
If you don't provide certain minimum Facebook markup on your page, it will respond with "Error Parsing URL: Error parsing input URL, no data was scraped." I only looked at the homepage, but it appears that dagbok.nu contains no Facebook markup. I'm not sure what must be present at minimum, but in my implementation, I assume the fb:app_id meta tag and the JavaScript SDK script must be there. You may want to take a look at http://developers.facebook.com/docs/guides/web/#plugins, particularly the Authentication section.
I discovered your question because I had this same error today for an unknown reason. I found that it was caused because the content of my og:image meta tag used an incorrect URL to the image I was trying to use. So as you add Facebook markup to your page, make sure your values are correct or you may continue to receive this message.
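As a rough sketch, the minimal markup described above might look like this (the app ID and image URL are placeholders, not values from the question):

<head>
  <meta property="fb:app_id" content="YOUR_APP_ID" />
  <meta property="og:title" content="Dagbok" />
  <meta property="og:url" content="http://dagbok.nu/" />
  <meta property="og:image" content="http://dagbok.nu/logo.png" />
</head>
<body>
  <div id="fb-root"></div>
  <script src="http://connect.facebook.net/en_US/all.js"></script>
</body>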
This doesn't seem to be a Facebook problem if you take a look at what I've discovered.
Testing with the W3C Online Validation Tool gives one of two results.
Tested using: dagbok.nu, but note that http://dagbok.nu shows no difference in test results. Remove the last forward slash between tests.
Test: 1
Results: 72 errors, 0 warnings
Note: shown here is a fragment of the source of the Frameset DOCTYPE webpage.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
<NOSCRIPT><IMG SRC="http://svs.bystorm.se/rv?java=off"></NOSCRIPT><SCRIPT SRC="http://svs.bystorm.se/rvj"></SCRIPT>
<HTML STYLE="height:100%;">
<HEAD>
<META HTTP-EQUIV="content-type" CONTENT="text/html;charset=iso-8859-1">
Test: 2
Results: 4 errors, 1 warning
Note: shown here is a fragment of the source of the Transitional DOCTYPE webpage.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html >
<head>
<title>Dagbok: Framsida</title>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
<meta name="author" content="Jonas Eklundh Communication (http://jonas.eklundh.com)">
<meta name="author-email" content="jonas#eklundh.com">
<meta name="copyright" content="Jonas Eklundh Communication #2012">
<meta name="keywords" content="Atlas,Innehållssystem,Jonas Eklundh">
<meta name="description" content="">
<meta name="creation-time" content="0,079s">
<meta name="kort" content="DGB">
Repeated tests loop through these results when run a couple of seconds apart, indicating a page redirect is occurring.
Security warnings are seen in Firefox and Chrome when visiting your site via these secure URLs:
https://dagbok.nu
https://www.dagbok.nu
The browser indicates the site should not be trusted because it's impersonating another site, using an invalid security certificate from *.loopiasecure.com.
Recommendation: check your .htaccess file, CMS settings, page redirection, and security settings. Use the source fragments above to work out which files are actually being served and discover what's set incorrectly.
Once that's done, I think Facebook will be happy to then debug your webpage and provide additional recommendations.
Had the same problem, and I discovered it was an incorrect IPv6 address in the AAAA record for my domain. The IPv4 record was correct, so the site worked in a browser, but FB obviously checks the IPv6 records!
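If you want to verify this yourself (assuming the dig utility is available), compare the two record types directly:

dig +short A dagbok.nu      # IPv4 - the one the browser used
dig +short AAAA dagbok.nu   # IPv6 - make sure this really points at your server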
This issue may also happen when Cloudflare is used. This is because Cloudflare shields the page from Facebook's scraper, which is then unable to collect the data, which in turn makes Facebook think the page is invalid.
My fix was:
Turn off Cloudflare for the page.
Scrape the page using Facebook's Dev Tools: https://developers.facebook.com/tools/debug/og/object
Click the "Fetch new scrape information" button and let it run.
Re-enable Cloudflare protection for the page.
You should then be able to add the page where you needed it.

What does fb: namespace stand for?

It might be a bug in the documentation or just me getting it wrong, but in any case I find it confusing to see several different definitions of the same fb: namespace:
xmlns:fb="http://www.facebook.com/2008/fbml"
Given as example here.
xmlns:fb="http://ogp.me/ns/fb#"
Shows up in the generated XFBML code here.
So which one should the developers use?
The fb namespace is like your application name. For example, if your application name is testapp, your canvas URL is going to be
apps.facebook.com/testapp/
Hope that helps.
Edit: there is a "namespace" field in the developers screen; my info above is about that field.
Just asked a Facebook engineer (at the mobile hackathon today).
The advice was to use the second one:
xmlns:fb="http://ogp.me/ns/fb#", which shows up in the generated XFBML code here: http://developers.facebook.com/docs/reference/plugins/like/
The first one is for FBML, which is deprecated.
(If I am wrong, this info came directly from a Facebook engineer.)
I believe you should now be using
<html xmlns:fb="http://ogp.me/ns/fb#">
This is what is used in the sample code on the XFBML tab of their beta plugins, so I presume it is the latest. I had never seen it used prior to your post, which implies it is newer than the 2008 Facebook namespace. See this:
xmlns:og="http://ogp.me/ns#"
xmlns:fb="http://www.facebook.com/2008/fbml">
Both are related to the Open Graph protocol.
My understanding was that the xmlns attribute allows for adding namespace information for tags and for attributes.

Get Google to index links from JavaScript-generated content

On my site I have a directory of things which is generated through jQuery AJAX calls, which subsequently create the HTML.
To my knowledge, Google and other bots aren't aware of DOM changes made after the page load, and won't index the directory.
What I'd like to achieve, is to serve the search bots a dedicated page which only contains the links to the things.
Would adding a noscript tag to the directory page be a solution? (In the noscript section, I would link to a page which merely serves the links to the things.)
I've looked at both the robots.txt and the meta tag, but neither seem to do what I want.
It looks like you stumbled on the answer to this yourself, but I'll post the answer to this question anyway for posterity:
Implement Google's AJAX crawling specification. If links to your page contain #! (a URL fragment starting with an exclamation point), Googlebot will send everything after the ! to the server in the special query string parameter _escaped_fragment_.
You then look for the _escaped_fragment_ parameter in your server code, and if present, return static HTML.
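A minimal sketch of that server-side check (assuming PHP; render_static_snapshot is a hypothetical helper standing in for whatever produces your static HTML):

<?php
// Googlebot rewrites /#!/widgets to /?_escaped_fragment_=/widgets,
// so the presence of this parameter identifies a crawler request.
if (isset($_GET['_escaped_fragment_'])) {
    $fragment = $_GET['_escaped_fragment_'];
    // Return the static HTML that JavaScript would normally build,
    // e.g. a plain page of links to the things in the directory.
    echo render_static_snapshot($fragment); // hypothetical helper
    exit;
}
?>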
(I went into a little more detail in this answer.)

How can I pull my BlogSpot page into a page on my web site

I have a blog on BlogSpot.com, and I have a domain based on my own name. I want to have a URL on my site (like http://www.mydomain.com/blog) that will then pull in the content from my blog page, but I want the URL in the address bar to stay on http://www.mydomain.com/blog, so that it does not look like you left my site.
(I have a Windows hosting account on 1and1.com)
I did Google this question, and I found a few things, like:
1: Adding a meta tag to "refresh". I tried this, but it changes the address bar.
<meta http-equiv="refresh" content="0; URL=http://myblog.blogspot.com" />
2: I also learned about the HTML iframe, but it has height and scrollbar issues.
3: Then, I found this partial code snippet, but I don't know what to do with it, or if it will even work against the BlogSpot server, or on my server:
<%
' Classic ASP: fetch the blog's HTML server-side and echo it to the visitor
Set objHTTP = Server.CreateObject("Microsoft.XMLHTTP")
objHTTP.Open "GET", "http://myblog.blogspot.com", False
objHTTP.Send
Response.Write objHTTP.ResponseText
%>
I am a client app guy, so this web stuff is all new to me.
Any help will be greatly appreciated.
The third option will probably work for the initial page load, but any links on the page will then direct the user to the BlogSpot page and change the URL. It simply fetches the page from BlogSpot and sends it to the user without any changes.
For me, the changing URL is not a big deal, as long as it's easy for the user to get from one to the other; have prominent links on either page that tell the user where they lead. Most people don't care about the URL, they just care about the content.
Using an IFrame is probably your best bet. Many Facebook applications are in IFrames and still integrate very well.
I think using a regular frame or an iframe is probably the easiest solution. What kind of scrollbar issues did you encounter? You can set custom values for some of these attributes; just check out the documentation here:
http://www.w3schools.com/TAGS/tag_iframe.asp
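For instance, a minimal embed might look like this (the height is a guess you would tune to your blog; scrolling="no" suppresses the scrollbars mentioned above):

<iframe src="http://myblog.blogspot.com" width="100%" height="1500"
        frameborder="0" scrolling="no"></iframe>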
If you didn't want to use frames, you could actually proxy the entire page using a server side application like Squid. However, this is more difficult to setup, requires the ability to install software and configure firewall/iptable settings on your host, and must be configured properly to prevent malicious abuse.
-Mark
Here are some options you can try:
If you have PHP installed:
<?php
// Requires allow_url_fopen to be enabled in php.ini
echo file_get_contents('http://myblog.blogspot.com'); // or you can use fopen()
?>
Or Server Side Includes installed:
<!--#include virtual="http://myblog.blogspot.com" -->
(Note that Apache's mod_include won't fetch arbitrary remote URLs with include virtual, so depending on your server this may need to point at a local proxy path instead.)
You can also pull blog content from Blogspot using the Blogger Data API.
The advantage of this is that you can reformat and reorganize the content to match the style of your website. The disadvantage is that it's more work than an iframe, and you probably won't match the full functionality of Blogspot.
I'm playing with this now to see whether I can use Blogspot as a type of CMS for a club news system.
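As a rough sketch of that route (assuming PHP with SimpleXML; Blogger serves each blog's posts as an Atom feed at /feeds/posts/default):

<?php
// Pull the blog's public Atom feed and render post titles in our own markup
$atom = 'http://www.w3.org/2005/Atom';
$feed = simplexml_load_file('http://myblog.blogspot.com/feeds/posts/default');
foreach ($feed->children($atom)->entry as $entry) {
    $e = $entry->children($atom);
    echo '<h2>' . htmlspecialchars((string) $e->title) . '</h2>';
}
?>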

What's the proper way to add Facebook Connect and their xmlns to (X)HTML5?

I'm gleaning from this question (Facebook Connect Won't Validate) that using Facebook Connect and other Facebook 'social widgets' just doesn't result in a 'valid' document.
Concerning (X)HTML5, however, what would be the (most) appropriate doctype/header if I want to include Facebook Connect content on my pages, and be as close to 'valid' (X)HTML5 as possible?
There is no proper way.
If you're using XHTML5, then you don't need to include a DOCTYPE. You can include <!DOCTYPE html> if you like, but it's not necessary. You simply need to specify the XHTML namespace, and the namespace for Facebook Connect.
I'm not familiar with Facebook Connect at all, but you will need to make sure the document is processed as XML, which means serving it with an XML MIME type like application/xhtml+xml. (The text/html serialisation does not support namespaces).
The HTML5 validator will not be able to check the conformance of the facebook markup, but it should still be able to check the conformance of the XHTML. Elements from other namespaces are allowed to be included, even though the HTML5 validator does not support them.
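Putting that together, a minimal sketch of such a document (served with Content-Type: application/xhtml+xml; the fb:like element is just an example widget):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:fb="http://ogp.me/ns/fb#">
<head><title>XHTML5 with Facebook markup</title></head>
<body>
  <fb:like href="http://example.com/"></fb:like>
</body>
</html>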
I found these links helpful. Maybe you'll find them useful too.
http://earthpeople.se/labs/2010/09/html5-validation-with-facebook-opengraph/
http://fbml5.blogspot.com/