Site not valid - but it is - facebook

So, I'm building a website called "dagbok.nu", which is swedish for "diary now" :)
Anyway, when creating the Facebook application, it claims that the site URL is invalid as well as the app domain. For site url, I used "http://dagbok.nu" and for site domain, I used "dagbok.nu". Please don't reply (as I've seen others do on similar issues) that I should type the site url with the scheme and the domain without - that's exactly what I'm doing.
Right, so according to another question here, one can troubleshoot this functionality using FB's own URL scraper, so I did just that:
http://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Fdagbok.nu
And the reply: Error Parsing URL: Error parsing input URL, no data was scraped
Right, so now I can assume that the reason for it being considered invalid is because of FB not being able to scrape the URL. But why?
According to this question, one of the reasons seems to be that FB has deemed the URL insecure or "spammy". I've acquired this domain from a previous owner so this wasn't all that impossible. But when doing the same thing as Matthew in that post - i.e. trying to post in my timeline using the domain "http://dagbok.nu", I didn't get any information. The status box expanded as if to include a thumbnail and information about the link, but it only contained a "(No title)" text and nothing more.
So now I don't know what to do. I've checked the DNS and NS records (using dig) from multiple servers around the web, and they all seem to resolve the domain correctly, and I've had friends double-check the URL from the States as well. I can't understand what's wrong, and I have no idea how to ask someone at FB to resolve this. Does anyone here have any good advice? Thanks in advance! :)
EDIT
When changing the domain to another domain that points to the exact same web server and document_root, it works! So this is definitely a problem with the domain "dagbok.nu" and not with the code on that page.
EDIT
When using the debug function above, I see no activity in the server log whatsoever. Facebook doesn't even contact the server. When using the alternate URL (the one from the last edit), it shows up in the logs as it should.
EDIT
I filed a bug report with Facebook, and their first response was that they were going to follow up. Now, a month later, I got an email that said "We are prioritizing bugs based on impact to the developer community. As this bug report has not received much attention from other developers, we are closing it so as to better focus on the top issues", and then they told me to come here to Stack Overflow to try to solve my issue. But the issue is WITH THEM, and of course no one else has reported that my site doesn't work: it affects only me, and I haven't opened the site yet because of this bug!
EDIT
I wanted to file a new bug report, but I can't even do that now, since they are blocking bug reports containing this URL as well!
I had to edit the URL - here is the new bug report

When Facebook tries to scrape your site for information, it sends a request to your server with a specific user agent called "facebookexternalhit"...
Facebook needs to scrape your page to know how to display it around
the site.
Facebook scrapes your page every 24 hours to ensure the properties are
up to date. The page is also scraped when an admin for the Open Graph
page clicks the Like button and when the URL is entered into the
Facebook URL Linter. Facebook observes cache headers on your URLs -
it will look at "Expires" and "Cache-Control" in order of preference.
However, even if you specify a longer time, Facebook will scrape your
page every 24 hours.
The user agent of the scraper is: "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
Make sure it is not blocked by your server firewall
Look in your server log if it even tried to access your site
If you think this is a firewall issue look at this link
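To check the server log for the scraper, something along these lines may help. It is only a sketch: the sample lines are made up (in Apache "combined" format), and the function name `find_scraper_hits` is my own. It simply looks for the user agent token in each line.

```python
SCRAPER_TOKEN = "facebookexternalhit"


def find_scraper_hits(log_lines):
    """Return the log lines that mention the Facebook scraper user agent."""
    return [line.rstrip("\n") for line in log_lines
            if SCRAPER_TOKEN in line.lower()]


# Hypothetical sample lines; in practice, pass an open access log file instead.
sample_log = [
    '203.0.113.7 - - [01/Jan/2012:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '198.51.100.9 - - [01/Jan/2012:10:00:05 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"',
]
hits = find_scraper_hits(sample_log)
print(len(hits), "request(s) from the scraper")
```

If this finds nothing right after you run the debugger, the request most likely never reached the web server (DNS, firewall, or a block on Facebook's side).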

Your problem appears to be with your character encoding string. Your Apache server is currently sending the unsupported string latin1. You've defined your meta:content-type as iso-8859-1. See the w3c validator
From what I've seen, the Facebook parser will stop immediately if it encounters either an unrecognized character encoding string or a mismatch in character encoding strings between your header and meta tags.
The problem could be originating from either your httpd.conf or php.ini files. Change these to match your meta and restart Apache. Since the problem seems to be domain-specific, I'd check httpd.conf first.
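A quick way to compare the two declarations yourself is sketched below. The helper names are mine, and the comparison is deliberately on the literal strings (so "latin1" and "iso-8859-1" count as a mismatch), on the assumption that the parser is as strict as described above.

```python
import re


def header_charset(content_type):
    """Extract the charset parameter from a Content-Type header value."""
    match = re.search(r'charset\s*=\s*"?([^\s;"]+)', content_type, re.IGNORECASE)
    return match.group(1).lower() if match else None


def meta_charset(html):
    """Extract the charset declared in a <meta> tag, if there is one."""
    match = re.search(r'<meta[^>]*charset\s*=\s*["\']?([^\s"\';>]+)',
                      html, re.IGNORECASE)
    return match.group(1).lower() if match else None


def charsets_agree(content_type, html):
    """True only if both declarations exist and use the same literal string."""
    from_header = header_charset(content_type)
    return from_header is not None and from_header == meta_charset(html)


page = '<META HTTP-EQUIV="content-type" CONTENT="text/html;charset=iso-8859-1">'
print(charsets_agree("text/html; charset=latin1", page))
```

Feed it the Content-Type header your server actually sends and the page source it returns.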

Could your domain be blacklisted? Could you try messaging your url to someone, and see if Facebook gives you a "This message contains blocked content..." error?

If you don't provide certain minimum Facebook markup on your page, it will respond with "Error Parsing URL: Error parsing input URL, no data was scraped." I only looked at the homepage, but it appears that dagbok.nu contains no Facebook markup. I'm not sure what things must be present at minimum, but in my implementation, I assume the fb:app_id meta tag and the JavaScript SDK script must be there. You may want to take a look at http://developers.facebook.com/docs/guides/web/#plugins , particularly the Authentication section.
I discovered your question because I had this same error today for an unknown reason. I found that it was caused because the content of my og:image meta tag used an incorrect URL to the image I was trying to use. So as you add Facebook markup to your page, make sure your values are correct or you may continue to receive this message.

This doesn't seem to be a Facebook problem if you take a look at what I've discovered.
Testing the site with the W3C online validation tool gives one of two results.
Tested using dagbok.nu; http://dagbok.nu gives the same test results. Remove the trailing forward slash between tests.
Test: 1
Results: 72 Errors 0 Warning
Note: Shown here is a fragment of the source Frameset DOCTYPE webpage.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
<NOSCRIPT><IMG SRC="http://svs.bystorm.se/rv?java=off"></NOSCRIPT><SCRIPT SRC="http://svs.bystorm.se/rvj"></SCRIPT>
<HTML STYLE="height:100%;">
<HEAD>
<META HTTP-EQUIV="content-type" CONTENT="text/html;charset=iso-8859-1">
Test: 2
Results: 4 Errors 1 Warning
Note: Shown here is a fragment of the source Transitional DOCTYPE webpage.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html >
<head>
<title>Dagbok: Framsida</title>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
<meta name="author" content="Jonas Eklundh Communication (http://jonas.eklundh.com)">
<meta name="author-email" content="jonas#eklundh.com">
<meta name="copyright" content="Jonas Eklundh Communication #2012">
<meta name="keywords" content="Atlas,Innehållssystem,Jonas Eklundh">
<meta name="description" content="">
<meta name="creation-time" content="0,079s">
<meta name="kort" content="DGB">
Repeated tests alternate between these two results when run a couple of seconds apart, which indicates that a page redirect is occurring.
Security warnings are shown in Firefox and Chrome when visiting your site using these secure URLs:
https://dagbok.nu
https://www.dagbok.nu
The browser indicates the site should not be trusted because it is impersonating another site by using an invalid security certificate issued for *.loopiasecure.com
Recommendation: Check your .htaccess file, CMS Settings, page redirection, and security settings. Use the above source webpages to realize those file-locations / file-names that are being served to discover what's set incorrectly.
Once that's done, I think Facebook will be happy to then debug your webpage and provide additional recommendations.

I had the same problem and discovered it was caused by an incorrect IPv6 address in the AAAA records for my domain. The IPv4 record was correct, so the site worked in a browser, but FB obviously checks the IPv6 records!
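To see what your domain resolves to for both address families, a small sketch like this can help. It uses the system resolver through `socket.getaddrinfo`, so the result depends on the machine you run it on; the function name and the injectable `resolver` parameter are my own additions.

```python
import socket


def addresses_by_family(host, resolver=socket.getaddrinfo):
    """Group the addresses a host resolves to into IPv4 (A) and IPv6 (AAAA)."""
    result = {"A": set(), "AAAA": set()}
    for family, label in ((socket.AF_INET, "A"), (socket.AF_INET6, "AAAA")):
        try:
            infos = resolver(host, 80, family, socket.SOCK_STREAM)
        except socket.gaierror:
            continue  # no address of this family, or the lookup failed
        for info in infos:
            # info is (family, type, proto, canonname, sockaddr)
            result[label].add(info[4][0])
    return result
```

Call `addresses_by_family("yourdomain.example")` and compare the AAAA entry with the IPv6 address your server really listens on. If the AAAA address is wrong or unreachable, IPv4-only browsers will not notice, but a client that prefers IPv6 will fail.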

This issue may also happen when Cloudflare is used. This is because Cloudflare protects the page from Facebook, which is then unable to collect the data, which in turn makes Facebook think the page is invalid.
My fix was:
Turn off Cloudflare for the page.
Scrape the page using Facebook's Dev Tools: https://developers.facebook.com/tools/debug/og/object
Click the "Fetch new scrape information" button and let it run.
Re-enable Cloudflare protection for the page.
You should then be able to add the page wherever you needed it.

Related

Facebook Object Debugger returns 404 not found when trying to scrape

I have a simple Tumblr website blog, upon which I post content.
However since I changed my DNS, the Facebook Object debugger sees really old data for my root url: http://www.kofferbaque.nl/ and for every post (for instance: http://kofferbaque.nl/post/96638253942/moodoid-le-monde-moo) it shows a 404 not found, which is bullshit because the actual content is there.
The full error message: Error parsing input URL, no data was cached, or no data was scraped.
I have tried the following things to fix it:
clear browser cache / cookies / history
using ?fbrefresh=1 after the URL (didn't work)
I've added a FB app_id to the page (made sure the app was in production - added the correct namespaces etc. - also didn't change anything)
Checked out other questions regarding this subject
Rechecked all my meta tags a dozen times
What other options are there to fix this issue?
If you need more info please ask in the comments.
2014-09-08 - Update
When throwing my url into the static debugger https://developers.facebook.com/tools/debug/og/echo?q=http://www.kofferbaque.nl/. The 'net' tab from firebug gives the following response:
<meta http-equiv="refresh" content="0; URL=/tools/debug/og/echo?q=http%3A%2F%2Fwww.kofferbaque.nl%2F&_fb_noscript=1" /><meta http-equiv="X-Frame-Options" content="DENY" />
2014-09-11 - Update
removed duplicate <!DOCTYPE html> declaration
cleaned up the <html> start tag (aka - removed IE support temporarily)
I've placed a test blog post to see if it would work; it didn't. Somehow my root url started 'magically' updating itself. Or let's say, it removed the old data - probably due to the fact that I removed the old app it was still referring to. However, it still doesn't see the 'newer' tags correctly.
Still no success
2014-09-12 - Update
Done:
moving <meta> tags to the top of the <head> element
removed fb:app_id from page + the body script, for it has no purpose.
This apparently doesn't change anything. It also appears that Tumblr injects lots of script tags at the start of the head element. Maybe that is the reason the Facebook scraper doesn't 'see' the meta tags.
The frustrating bit is that through some other og tag scanner: http://iframely.com/debug?uri=http%3A%2F%2Fkofferbaque.nl%2F, it shows all the correct info.
First, the HTML is not valid. You got the doctype two times (at least on the post page), and there is content before the html tag (script tag and IE conditionals).
This may be the problem, but make sure you put the og-tags together at the beginning of the head section - The debugger only reads part of the page afaik, so make sure the og-tags are in that part. Put all the other og-tags right after "og:site_name".
Btw: ?fbrefresh=1 is not really necessary, you can use ANY parameter - just to create a different url. But the debugger offers a button to refresh the scraping, so it's useless anyway.

URL Blocked on Facebook [duplicate]

I am trying to share a link to my site on Facebook. The page displays correctly in my browser, but when I share it via the API or front end it does not show up. When I put my URL into the Graph API debugger it gives me an error "Error Parsing URL: Error parsing input URL, no data was scraped."
What could be wrong?
Hopefully, this is an exhaustive list of things to check when your site won't scrape:
1) Is your site on a spam blacklist?
This is rare, but Facebook and most other tools won't parse your site at all if it shows up on a spammer blacklist.
I use https://admin.uribl.com/ as a checker. If your site is listed, you need to find and clean the malware on your site, then follow the instructions from the blacklist owner(s) to remove your site. If the problem is that you've got a host who is a known spammer, you'll need to change hosts. It's going to take a few days for this to work its way through the system before any site will scrape your site again.
2) Is your (X)HTML valid?
Facebook's parser is very strict. If the headers sent by your web server or your HTML isn't valid, Facebook will not parse your site. To test this in detail, use the Markup Validator from the W3C. You have to resolve all of the errors before Facebook will parse your page.
Some of the most common errors I have seen are:
Invalid string sent in the headers
Mismatch between the character-encoding sent in the header and the <meta charset> tag in the document.
Invalid or incorrect <!DOCTYPE>
Whitespace before the first tag in the document
Malformed HTML tags, especially in the <head>
Tags closed with > instead of /> in XHTML documents
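For the DOCTYPE-related items in this list, a rough pre-check before running the full W3C validator might look like the sketch below. The function name and the messages are mine, and it is no substitute for the validator. Note that it also flags an XML declaration before the DOCTYPE, which is legitimate in XHTML.

```python
import re


def doctype_problems(html):
    """Return a list of simple problems found around the DOCTYPE declaration."""
    problems = []
    positions = [m.start() for m in re.finditer(r"<!DOCTYPE", html, re.IGNORECASE)]
    if not positions:
        return ["no DOCTYPE declaration"]
    if len(positions) > 1:
        problems.append("DOCTYPE declared %d times" % len(positions))
    leading = html[:positions[0]]
    if leading and not leading.strip():
        problems.append("whitespace before DOCTYPE")
    elif leading.strip():
        problems.append("content before DOCTYPE")
    return problems


print(doctype_problems("\n<!DOCTYPE html>\n<!DOCTYPE html><html></html>"))
```

An empty list only means these particular checks passed, not that the page is valid.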
3) Are you redirecting your visitors with JavaScript?
The Facebook parser does not execute JavaScript. If you want to redirect a visitor to custom content, you need to do this with a server-side script.
4) Is your server refusing connections to non-browsers?
This is harder to diagnose, but some servers are set to return a 500:Server Error or 403:Forbidden for any non-browser visitor.
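One way to narrow this down is to request the page yourself while pretending to be the scraper and look at the status code. This is a sketch only: the user agent string is the one Facebook has documented for its scraper, the function names are mine, and a server may still treat Facebook's IP ranges differently from yours.

```python
import urllib.error
import urllib.request

SCRAPER_UA = "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"


def build_scraper_request(url):
    """Build a GET request that carries the Facebook scraper user agent."""
    return urllib.request.Request(url, headers={"User-Agent": SCRAPER_UA})


def status_for_scraper(url, timeout=10):
    """Return the HTTP status the server gives to the scraper user agent."""
    try:
        with urllib.request.urlopen(build_scraper_request(url), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 403 or 500 for non-browser visitors
```

If `status_for_scraper("http://yoursite.example/")` returns 403 or 500 while a normal browser gets 200, the server (or something in front of it) is filtering on the user agent.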
5) Does the Facebook site tell you your link is blocked?
Log into Facebook and attempt to share a link on your timeline. If your site appears in the Facebook internal blacklist, you will get a message telling you the site is blocked. On this dialog, there is a form where you can mark this as a false positive and request a review of your site.
If you end up on this list, Facebook users are blocking your postings or marking them as spam. That probably originates in your content. What you think is SEO is probably spamdexing or the content you are sharing is offensive or polarizing to some users, or you're just sharing the same stuff over and over again.
Once you have fixed the error, visit the Facebook Debugger again. A manual visit to the debugger clears Facebook's cache for that URL. Give things a few minutes for Facebook to push the updates to all servers, and then try again.
If your site is on the blacklist (you can check here: https://add5000.com/all-tips/entry/facebook-checker-if-link-blocked ),
you can try using a URL redirector with Open Graph and referral support:
https://add5000.com/all-tips/entry/how-to-post-a-blocked-link-on-facebook
If your link is not blocked and Facebook still doesn't accept it,
search your page for links to other sites that are blocked; such references should be checked.

Facebook Debugger lint tool gets HTTP 206 - doesn't detect Open Graph meta tags (others tools do)

I believe my site has the correct markup for Facebook & Open Graph meta tags. But checking Facebook's linter shows that none of the tags are being detected. You can see for yourself here:
http://developers.facebook.com/tools/debug/og/object?q=goodloesolitaire.com
When I use a different site, the tags are found:
http://www.opengraph.in/?url=goodloesolitaire.com&format=html
I went through the similar questions and none of those check out. Any ideas on why Facebook's debugger might see nothing?
Facebook is seeing HTTP code 206 "Partial Content" instead of normal 200 "OK".
206 "Partial Content": This message might occur if a client has a
partial copy of content in its cache and requests an update of missing
content. This message indicates that the partial request succeeded.
I found one old forum post about it: http://forum.developers.facebook.net/viewtopic.php?id=68440
It looks like it might be a server configuration issue to do with caching. Do you run Varnish or anything like that on your server? Check in to that.
Another thing to try might be to move your charset meta tag below your Open Graph tags, so Facebook knows the right encoding to parse them with. Also, using this type tag might work better:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Finally, make sure you don't have anything blocking the Facebook scraper user agent. As mentioned in their documentation:
Our bot functions with the User Agent "facebookexternalhit/*". Make
sure you're not blocking that user agent. Also, make sure Facebook's
servers can reach your server.
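If you are using Varnish, put the following check inside your sub vcl_recv:
sub vcl_recv
{
    # Bypass the cache for the Facebook scraper and pass its requests straight to the backend
    if (req.http.user-agent ~ "facebookexternalhit")
    {
        return(pipe);
    }
}
It worked very well.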
We use Varnish so this did the trick for us:
if (req.http.user-agent ~ "facebookexternalhit")
{
return(pipe);
}
https://www.varnish-cache.org/lists/pipermail/varnish-misc/2011-February/020060.html

Facebook Post Link Image

When someone posts a link on Facebook, a script usually scans that link for any images, and displays a quick thumbnail next to the post. For certain URLs though (including mine), FB doesn't seem to pick up anything, despite there being a number of images on that page.
I read up that FB prefers the "image_src" rel tag for the image the user wishes to specify, but this does not generate that thumbnail either for my site.
My url goes directly to the DNS, and is not forwarded, so I don't imagine that could be the problem either.
Does anyone have an idea as to why FB can't generate any thumbnails from my site?
The easiest way is just a link tag:
<link rel="image_src" href="http://stackoverflow.com/images/logo.gif" />
But there are some other things you can add to your site to make it more Social media friendly:
Open Graph Tags
Open Graph tags are tags that you add to the <head> of your website to describe the entity your page represents, whether it is a band, restaurant, blog, or something else.
An Open Graph tag looks like this:
<meta property="og:tag name" content="tag value"/>
If you use Open Graph tags, the following six are required:
og:title - The title of the entity.
og:type - The type of entity. You must select a type from the list of Open Graph types.
og:image - The URL to an image that represents the entity. Images must be at least 50 pixels by 50 pixels. Square images work best, but you are allowed to use images up to three times as wide as they are tall.
og:url - The canonical, permanent URL of the page representing the entity. When you use Open Graph tags, the Like button posts a link to the og:url instead of the URL in the Like button code.
og:site_name - A human-readable name for your site, e.g., "IMDb".
fb:admins or fb:app_id - A comma-separated list of either the Facebook IDs of page administrators or a Facebook Platform application ID. At a minimum, include only your own Facebook ID.
More information on Open Graph tags and details on Administering your page can be found on the Open Graph protocol documentation.
http://developers.facebook.com/docs/reference/plugins/like
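If you want to check a page against the list of required tags quoted above, a small sketch using only the standard library could look like this. The names `REQUIRED` and `missing_og_tags` are mine, and it only checks that the tags are present and non-empty, not that their values are correct.

```python
from html.parser import HTMLParser

REQUIRED = ("og:title", "og:type", "og:image", "og:url", "og:site_name")


class _MetaCollector(HTMLParser):
    """Collect property/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop:
            self.properties[prop] = attrs.get("content") or ""


def missing_og_tags(html):
    """Return the required Open Graph tags that are absent or empty."""
    collector = _MetaCollector()
    collector.feed(html)
    found = collector.properties
    missing = [name for name in REQUIRED if not found.get(name)]
    if not (found.get("fb:admins") or found.get("fb:app_id")):
        missing.append("fb:admins or fb:app_id")
    return missing


page = ('<head><meta property="og:title" content="T"/>'
        '<meta property="og:image" content="http://example.com/i.png"/></head>')
print(missing_og_tags(page))
```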
I know this question is old, but I recently dealt with the exact same problem and went round and round on it for a couple weeks. Multiple searches on Google turned up a lot of useful information, but most of it was focused on Open Graph tags, which I wasn't interested in using. Turns out my site had multiple issues, but here are some of the basics.
As EightyEight said, make sure your HTML is valid - and the same goes for your javascript and server-side code (PHP, ASP, etc.). I had a small PHP error in a piece of code that was executing as a separate call to the server from the main page. Due to a number of bizarre coincidences, that code was generating a 500 error - but ONLY for IE6 and strict parsing engines like the W3C validator and the Facebook page crawler. The problem didn't appear in modern browsers (Chrome 4, FF 3.5, IE 8, etc) so I didn't see it right away, but older/stricter clients were showing the 500 every time and that was the main reason FB wasn't crawling our page (when everything else seemed to be correct).
Regarding Randy's response, he's correct that Facebook will keep an old cached copy of your page long after you've updated it. FB claims it's only held for 24 hours, but I experienced much longer times than that. FORTUNATELY, FB has released their "URL Linter" tool that will show you a preview of how your page will appear when being shared on FB, and it will force FB to instantly update its cache of your page. This was a lifesaving tool. You can find it at http://developers.facebook.com/tools/lint/
Regarding the URL Linter tool, be aware that each variation of a URL is cached separately on Facebook, so "www.example.com" is not the same as "example.com". Also, unique capitalization is stored as well, so "ExampleOne.com" is not the same as "exampleone.com". (This led to a lot of confusion between my client and myself when it appeared to me that the cache had been updated just fine and the client claimed they weren't seeing the updates. Turns out I was looking at exampleone.com and had used Linter to update the cache, but they were looking at exampleOne.com which I hadn't submitted to Linter. As a result, I ended up submitting quite a few variations of the URL to Linter just to cover the bases.)
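Since each spelling of the URL is cached separately, it can help to generate the list of variants up front and submit each one to Linter. A minimal sketch follows; the function name is mine, and the only variations it covers are the two mentioned above (with or without "www", plus any alternate capitalizations you pass in).

```python
def url_variants(bare_host, spellings=()):
    """List http URLs for a host with/without 'www', for each given spelling."""
    variants = []
    for name in [bare_host, *spellings]:
        for host in (name, "www." + name):
            url = "http://%s/" % host
            if url not in variants:
                variants.append(url)
    return variants


for url in url_variants("exampleone.com", ["ExampleOne.com"]):
    print(url)
```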
WyrdNEXUS's advice to use the image_src link tag is spot-on. This allows you to be sure that FB is scraping the best possible image for your page. There are some varying guidelines out there about what specs the image file should have, but I've successfully used a 128px square image and have seen a 130x97 image make it through as well. Here is Facebook's official documentation from http://developers.facebook.com/docs/reference/plugins/like/:
Images must be at least 50 pixels by 50 pixels. Square images work best, but you are allowed to use images up to three times as wide as they are tall.
Obviously, FB will resize a large image for you, but you'll almost always get better results if you resize it yourself beforehand.
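The quoted size rule is easy to check before uploading an image. This sketch encodes just the two stated constraints (at least 50 by 50 pixels, at most three times as wide as tall); the function name is mine, and the documentation quoted above says nothing about very tall images, so those are not restricted here.

```python
def image_size_ok(width, height, min_side=50, max_ratio=3):
    """Check an image size against the quoted Facebook guidelines."""
    if width < min_side or height < min_side:
        return False
    return width <= max_ratio * height


print(image_size_ok(128, 128))  # the square image mentioned above
print(image_size_ok(200, 50))   # four times as wide as tall
```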
Regarding Mike Cooper's link to the eHow article, avoid using step #1 in that article. It was valid advice when the article was written and when Mike posted the link, but it's now better to use the URL Linter tool for previewing how your page will appear when being shared. By using Linter, you won't cause FB to cache a (potentially) bad copy of the page before you get a chance to tweak it.
Use the Facebook linter available here: http://developers.facebook.com/tools/lint/
This will check your link and re-fetch any images. It also clears any old cache.
Or try this - https://developers.facebook.com/tools/debug
To change the title, description and image, you need to add some meta tags inside the head tag.
STEP 1 :
Add the meta tags inside the head tag
<html>
<head>
<meta property="og:url" content="http://www.test.com/" />
<meta property="og:image" content="http://www.test.com/img/fb-logo.png" />
<meta property="og:title" content="Prepaid Phone Cards, low rates for International calls with Lucky Prepay" />
<meta property="og:description" content="Cheap prepaid Phone Cards. Low rates for international calls anywhere in the world." />
STEP 2 :
Open the link below:
https://developers.facebook.com/tools/debug
Enter the URL of the page where you added the tags (e.g. http://www.test.com/) in the text box and click the DEBUG button.
It's done.
You can verify the result here: https://www.facebook.com/sharer/sharer.php?u=http://www.test.com/
In the URL above, u is your website link.
ENJOY !!!!
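If your page URL contains query strings or other special characters, it is safer to percent-encode it before putting it in the u parameter. A small sketch (the function name is mine; the sharer address is the one given above):

```python
from urllib.parse import urlencode


def sharer_url(page_url):
    """Build the Facebook sharer link for a page, percent-encoding the URL."""
    return "https://www.facebook.com/sharer/sharer.php?" + urlencode({"u": page_url})


print(sharer_url("http://www.test.com/"))
```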
try this: http://www.ehow.com/how_4938148_thumbnail-show-up-facebook-share.html
Is the site's HTML valid? Run it through w3c validation service.
Actually, if you've already tried linking that page on Facebook BEFORE adding the "image_src" link, Facebook will keep using the old cached copy and not even see your changes. Try modifying the URL by removing or adding the 'www', or duplicate your page to test it.
I've noticed that Facebook does not take thumbnails from websites if they start with https, is that maybe your case?
I had the same problem and figured out that my closing head tag was in the wrong place.
Old question but recently I seemed to be running into same issue with thumbnail images from my link not showing in status updates on Facebook. I post for many clients and this is relatively new.
FB doesn't seem to like long URLs anymore. If you use a URL shortener such as goo.gl or bitly.com, the thumbnail from your link/post will appear in your FB update.
Try using something like this:
<link rel="image_src" href="http://yoursite.com/graphics/yourimage.jpg" />
Seems to work just fine on Firefox as long as you use a full path to your image.
The trouble is that it gets vertically offset downward for some reason. The image is 200 x 200, as recommended somewhere I read.
If you use an SEO plugin, check its settings first. Find the Noindex setting, and if Noindex is enabled for media, disable it.