URL Blocked on Facebook [duplicate] - facebook

I am trying to share a link to my site on Facebook. The page displays correctly in my browser, but when I share it via the API or front end it does not show up. When I put my URL into the Graph API debugger it gives me an error "Error Parsing URL: Error parsing input URL, no data was scraped."
What could be wrong?

Hopefully, this is an exhaustive list of things to check when your site won't scrape:
1) Is your site on a spam blacklist?
This is rare, but Facebook and most other tools won't parse your site at all if it shows up on a spammer blacklist.
I use https://admin.uribl.com/ as a checker. If your site is listed, you need to find and clean the malware on your site, then follow the instructions from the blacklist owner(s) to remove your site. If the problem is that you've got a host who is a known spammer, you'll need to change hosts. It's going to take a few days for this to work its way through the system before any site will scrape your site again.
2) Is your (X)HTML valid?
Facebook's parser is very strict. If the headers sent by your web server or your HTML isn't valid, Facebook will not parse your site. To test this in detail, use the Markup Validator from the W3C. You have to resolve all of the errors before Facebook will parse your page.
Some of the most common errors I have seen are:
Invalid string sent in the headers
Mismatch between the character-encoding sent in the header and the <meta charset> tag in the document.
Invalid or incorrect <!DOCTYPE>
Whitespace before the first tag of the document
Malformed HTML tags, especially in the <head>
Tags closed with > instead of /> in XHTML documents
3) Are you redirecting your visitors with JavaScript?
The Facebook parser does not execute JavaScript. If you want to redirect a visitor to custom content, you need to do this with a server-side script.
4) Is your server refusing connections to non-browsers?
This is harder to diagnose, but some servers are set to return a 500:Server Error or 403:Forbidden for any non-browser visitor.
5) Does the Facebook site tell you your link is blocked?
Log into Facebook and attempt to share a link on your timeline. If your site appears in the Facebook internal blacklist, you will get a message telling you the site is blocked. On this dialog, there is a form where you can mark this as a false positive and request a review of your site.
If you end up on this list, Facebook users are blocking your postings or marking them as spam. That probably originates in your content. What you think is SEO is probably spamdexing or the content you are sharing is offensive or polarizing to some users, or you're just sharing the same stuff over and over again.
Once you have fixed the error, visit the Facebook Debugger again. A manual visit to the debugger clears Facebook's cache for that URL. Give things a few minutes for Facebook to push the updates to all servers, and then try again.
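For check 4 in the list above, one way to see how your server treats the crawler is to request the page with a crawler-like User-Agent and look at the status code. This is only a sketch: the User-Agent string is the one Facebook documents for its crawler, the function names are made up for illustration, and a server may also filter on things other than the User-Agent (IP ranges, for example), so a 200 here is not proof that Facebook can reach you.

```python
import urllib.error
import urllib.request

# User-Agent string Facebook documents for its link crawler.
FACEBOOK_UA = "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"


def build_crawler_request(url):
    """Build a GET request that identifies itself like the Facebook crawler."""
    return urllib.request.Request(url, headers={"User-Agent": FACEBOOK_UA})


def status_for_crawler(url, timeout=10):
    """Return the HTTP status code the server sends for a crawler-like request."""
    try:
        with urllib.request.urlopen(build_crawler_request(url), timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 403, 500 and similar responses are raised as HTTPError; report the code.
        return error.code
```

Calling `status_for_crawler("http://www.example.com/")` with your own URL and comparing the result with what a normal browser gets should show whether the server treats the two differently.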

If your site is on a blacklist (you can check here: https://add5000.com/all-tips/entry/facebook-checker-if-link-blocked ),
you can try using a URL redirector with Open Graph and referrals support:
https://add5000.com/all-tips/entry/how-to-post-a-blocked-link-on-facebook
If your link is not blocked but Facebook still doesn't accept it,
search your page for links to other sites that are blocked; such references should be checked.

Related

Error parsing input URL, no data was scraped. only with new pages on my site

The problem I have is that I own a website where other people can post content, which creates new pages on my domain. Starting today, all newly created post pages are malfunctioning: sharing does not load the thumbnail picture, the title and so on. The weird thing is that all the posts (pages) created before today still work fine.
What caused an error to occur out of nowhere?
I also cannot debug any of the URLs of my website, as they all give the same error: Error parsing input URL, no data was scraped
The website I'm having problems with is here: http://www.vabameedia.ee/vm/184/h%C3%A4da-ei-anna-h%C3%A4beneda.html
This is one of the pages where it reports no error on the page, but Facebook still can't reach it: http://www.vabameedia.ee/vm/178/craig-parks-%C3%BChek%C3%A4eline-krossisoitja.html
For people experiencing the same problem but for different causes: I discovered a few interesting things about how Facebook "scrapes" pages by checking the server logs while running some trials.
First of all: if nobody has ever tried to share a page on FB, FB has never tried to scrape it, and it will not do so if you only put the URL in the Debug tool.
That is the first reason you get the error: it just states that FB has no information on the page, so you must "force" it to scrape the page.
The first time you try to share a page, FB scrapes it (it asks your server for the first 40k of the page and analyses the Open Graph tags).
What can happen is that you do not see the image: Facebook Share Dialog does not display thumbnails on first load
The reason is that, behind the scenes, FB is still scraping your page and caching the image. The next time you share, the image is there too.
How to solve it? Pre-caching: https://developers.facebook.com/docs/sharing/best-practices#precaching
or simply add the image dimensions to your Open Graph tags:
<meta property="og:image:width" content="450"/>
<meta property="og:image:height" content="298"/>
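The pre-caching route can also be scripted. Below is a minimal sketch that only builds the request, assuming the `scrape=true` POST to the Graph API root that Facebook's sharing documentation describes. That an access token is needed, and the `"TOKEN"` placeholder itself, are my assumptions, so check the current docs before relying on it.

```python
import urllib.request
from urllib.parse import urlencode

GRAPH_ENDPOINT = "https://graph.facebook.com/"


def build_rescrape_request(page_url, access_token):
    """Build (but do not send) a POST asking Facebook to scrape page_url."""
    body = urlencode({
        "id": page_url,                # the page you want scraped
        "scrape": "true",              # ask for a fresh scrape
        "access_token": access_token,  # placeholder; supply your own token
    }).encode("ascii")
    return urllib.request.Request(GRAPH_ENDPOINT, data=body, method="POST")
```

Sending it would be `urllib.request.urlopen(build_rescrape_request(url, token))`. I left that call out so the sketch has no side effects.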
I was pulling my hair out trying to fix this issue. Hours and hours of troubleshooting to no avail. After speaking with one of our programmers about a topic unrelated I thought of something to try as a long shot.
Much to my surprise, it worked!!!
This is the reason behind the problem and my solution for it:
When you draft a post in WordPress it generates a link based on your article's title (unless you manually change it). The title of my article included special characters, however the auto-generated link didn't display these special characters, only hyphens to replace the spaces. Should be fine right? Wrong! Somewhere embedded in metadata and code in the WordPress platform are those special characters and they mess up the way Facebook pulls info from the article being linked to. This is a problem because certain special characters invalidate hyperlinks.
For example:
Article Title: R[eloaded]
Auto-generated hyperlink DISPLAYED in WordPress "Permalink" field: http://www.example.com/reloaded
Actual WordPress Auto-generated hyperlink: http://www.example.com/r[eloaded]
Those brackets will invalidate the link and Facebook will be unable to pull any information (ie pictures) from it.
Solution:
(1) Simply, manually change the WordPress hyperlink address to something that doesn't include any special characters (this will not change the title of your article).
(2) Click "Update" to change the post to include the new hyperlink.
(3) Click "Purge from Cache" in the WordPress window
(4) Refresh your Facebook browser window
(5) Paste the new hyperlink for your article
(6) Enjoy your Facebook post with a preview image and information
Sidenote: Don't pull your hair out over Facebook, it's not worth it. =)
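If you want to spot permalinks like the bracketed one above before sharing them, a rough check is to look for characters that RFC 3986 does not allow unencoded in a URL path. This is just a sketch with a made-up helper name. It says nothing about what Facebook's parser actually accepts, only which characters are suspicious.

```python
import re
from urllib.parse import urlsplit

# Characters RFC 3986 allows in a path: unreserved, sub-delims, ":", "@", "/",
# plus well-formed percent-escapes such as %C3.
_ALLOWED_IN_PATH = re.compile(r"[A-Za-z0-9\-._~!$&'()*+,;=:@/]|%[0-9A-Fa-f]{2}")


def find_unsafe_path_chars(url):
    """Return the sorted list of path characters that should have been encoded."""
    path = urlsplit(url).path
    leftover = _ALLOWED_IN_PATH.sub("", path)
    return sorted(set(leftover))
```

For the example above, `find_unsafe_path_chars("http://www.example.com/r[eloaded]")` flags the two brackets, while the cleaned-up `http://www.example.com/reloaded` comes back with an empty list.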
If you're using WordPress, edit the post in question to change the permalink (just alter it slightly), then update the post. Using the new permalink in the Facebook OG debugger should now work.
It's a weird fix, but I think it takes care of a problem caused by special characters being used in the title of a post, which is then used to make the permalink.
It was all down to a DNS issue. I had the same problem and resolved it by updating the domain's name servers to the actual name servers.
In my case the domain was pointed to ns1.websterz.net and ns2.websterz.net, and on that server I had a DNS redirect to my other server (where the website is hosted). I just updated the domain's name servers to those of the server where my website is actually hosted. This was an account migration, and I had forgotten to update the name servers for the new server.
Everything works fine now.

Domain blocked and no data scraped

I recently purchased the domain www.iacro.dk from UnoEuro and installed WordPress planning to integrate blogging with Facebook. However, I cannot even get to share a link to the domain.
When I try to share any link on my timeline, it gives the error "The content you're trying to share includes a link that's been blocked for being spammy or unsafe: iacro.dk". Searching, I came across Sucuri SiteCheck which showed that McAfee TrustedSource had marked the site as having malicious content. Strange considering that I just bought it, it contains nothing but WordPress and I can't find any previous history of ownership. But I got McAfee to reclassify it and it now shows up green at SiteCheck. However, now a few days later, Facebook still blocks it. Clicking the "let us know" link in the FB block dialog got me to a "Blocked from Adding Content" form that I submitted, but this just triggered a confirmation mail stating that individual issues are not processed.
I then noticed the same behavior as here and here: When I type in any iacro.dk link on my Timeline it generates a blank preview with "(No Title)". It doesn't matter if it's the front page, a htm document or even an image - nothing is returned. So I tried the debugger which returns the very generic "Error Parsing URL: Error parsing input URL, no data was scraped.". Searching on this site, a lot of people suggest that missing "og:" tags might cause no scraping. I installed a WP plugin for that and verified tag generation, but nothing changed. And since FB can't even scrape plain htm / jpg from the domain, I assume tags can be ruled out.
Here someone suggests 301 Redirects being a problem, but I haven't set up redirection - I don't even have a .htaccess file.
So, my questions are: Is this all because of the domain being marked as "spammy"? If so, how can I get the FB ban lifted? However, I have seen examples of other "spammy" sites where the preview is being generated just fine, e.g. http://dagbok.nu described in this question. So if the blacklist is not the only problem, what else is wrong?
This is driving me nuts so thanks a lot in advance!
I don't know the details, but this is a problem Facebook has with websites hosted on shared servers, i.e. where the server hosting your site also hosts a number of other sites.

URLs redirect to spyware site

We are developing an app that makes posts on behalf of our users to Facebook. Within those posts, we want to put links to external (non-Facebook) websites.
Looking at the links in the status bar of the browser (usually Chrome), the correct URL is displayed. However, Facebook seems to wrap the actually-clicked link into some extra bells-and-whistles. Usually, this works correctly.
Sometimes, however, this URL wrapping ends up sending the click to a URL like:
http: //spywaresite.info/0/go.php?sid=2
(space added to make it non-browsable!), which triggers Chrome's severe warning message.
This happens very occasionally on Chrome, but very much more often in the iOS browser on the iPhone.
Does anyone have any pointers as to how to deal with this?
EDIT
For example, the URLs we put in the link is
http://www.example.com/some/full/path/somewhere
but the URL that actually gets clicked is:
http://platform.ak.fbcdn.net/www/app_full_proxy.php?app=374274329267054&v=1&size=z&cksum=fc1c17ed464a92bc53caae79e5413481&src=http%3A%2F%2Fwww.example.com%2Fsome%2Ffull%2Fpath%2Fsomewhere
There seems to be some JavaScript goodness in the page that unscrambles that and usually redirects correctly.
EDIT2
The links above are put on the image and the blue text to the right of the image in the screenshot below.
Mousing over the links (or the image) in the browser shows the correct link. Right-clicking on the link and selecting "Copy Link Address" gets the fbcdn.net link above (or one like it). Actually clicking on the link seems to set off some JavaScript processing of the fbcdn.net link into the right one... but sometimes that processing fails.
I'm not 100% sure what you're asking here, but I'll tell you what I know. Are you referring to this screen on Facebook?
(or rather, the variation of that screen which doesn't allow clickthrough?)
If you manually send a user to facebook.com/l.php?u=something they'll always see that message; it's a measure to prevent an open redirector.
If your users are submitting such links, including the l.php link, you'll need to extract the destination URL (in the 'u' parameter).
If you're seeing the l.php URLs come back from the API, this is probably a bug.
If links clicked on facebook.com end up on that screen, it's because Facebook has detected the link as suspicious (e.g. URL redirector sites; the screen will allow clickthrough but warn the user first) or malicious/spammy (will not allow clickthrough).
In your app you won't be able to post links of the latter kind (an error will come back saying the URL is blocked), and the former may sometimes throw a captcha (if you're using the Feed dialog, this should be transparent to the app code: the user will enter the captcha and the dialog will return as normal).
If this isn't exactly what you were asking about, please clarify and I'll update my answer.
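Pulling the destination out of the 'u' parameter, as mentioned above, could look roughly like this. The function name is mine, and the sketch assumes the link really has the `l.php?u=...` shape. Anything else is returned unchanged.

```python
from urllib.parse import parse_qs, urlsplit


def extract_destination(link):
    """Return the URL wrapped in an l.php link, or the link itself otherwise."""
    parts = urlsplit(link)
    if not parts.path.endswith("/l.php"):
        return link
    # parse_qs percent-decodes the values for us.
    values = parse_qs(parts.query).get("u")
    return values[0] if values else link
```

You would run user-submitted links through this before posting them, so the post carries the real destination and not the redirector URL.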
Rather than add to the question, I thought I'd put more details here.
It looks like the Facebook mention in the original title was mis-directed, so I've removed it.
We still haven't got to the bottom of the issue.
However, we used both Wireshark and Fiddler to look at the HTTP traffic between the Chrome browser (on the PC) and Facebook. Both showed that Facebook was returning the correct URL refresh.
Here's what Wireshark showed:
What we saw on Fiddler was that our server is issuing a redirect to the spywaresite.info site:
We are working with our ISP to figure out what is happening here.

Site not valid - but it is

So, I'm building a website called "dagbok.nu", which is swedish for "diary now" :)
Anyway, when creating the Facebook application, it claims that the site URL is invalid as well as the app domain. For site url, I used "http://dagbok.nu" and for site domain, I used "dagbok.nu". Please don't reply (as I've seen others do on similar issues) that I should type the site url with the scheme and the domain without - that's exactly what I'm doing.
Right, so according to another question here, one could trouble shoot this functionality using FB's own URL scraper, so I did just that:
http://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Fdagbok.nu
And the reply: Error Parsing URL: Error parsing input URL, no data was scraped
Right, so now I can assume that the reason for it being considered invalid is because of FB not being able to scrape the URL. But why?
According to this question, one of the reasons seems to be that FB has deemed the URL insecure or "spammy". I've acquired this domain from a previous owner so this wasn't all that impossible. But when doing the same thing as Matthew in that post - i.e. trying to post in my timeline using the domain "http://dagbok.nu", I didn't get any information. The status box expanded as if to include a thumbnail and information about the link, but it only contained a "(No title)" text and nothing more.
So now I don't know what to do. I've tried to check the DIG and NS records from multiple servers around the web, and everyone seems to resolve it correctly, and I've had friends double check the URL from the states as well. I can't understand what's wrong and I have no idea how to ask someone at FB how to resolve this. Does anyone here have a good advice for this? Thanks in advance! :)
EDIT
When changing the domain to another domain that points to the exact same web server and document_root, it works! So this is definitely a problem with the domain "dagbok.nu" and not with the code on that page.
EDIT
When using the debug function above - I see no activity in the server log what so ever. Facebook doesn't even contact the server. When using the alternate url - the one from the last edit, it pops up in the logs as it should.
EDIT
I filed a bug report with Facebook, and their first response was that they were going to follow up. Now, a month later, I got an email that said "We are prioritizing bugs based on impact to the developer community. As this bug report has not received much attention from other developers, we are closing it so as to better focus on the top issues", and then they told me to go here to Stack Overflow to try to solve my issue. But the issue is WITH THEM, and of course no one else has reported that my site doesn't work: it affects only me, and I haven't opened the site yet because of this bug!
EDIT
I wanted to file a new bug report, but I can't even do that now, since they are blocking bug reports containing this URL as well!
I had to edit the URL - here is the new bug report
When Facebook tries to scrape your site for information, it sends a request to your server with a specific user agent called "facebookexternalhit"...
Facebook needs to scrape your page to know how to display it around the site.
Facebook scrapes your page every 24 hours to ensure the properties are up to date. The page is also scraped when an admin for the Open Graph page clicks the Like button and when the URL is entered into the Facebook URL Linter. Facebook observes cache headers on your URLs - it will look at "Expires" and "Cache-Control" in order of preference. However, even if you specify a longer time, Facebook will scrape your page every 24 hours.
The user agent of the scraper is: "facebookexternalhit/1.1(+http://www.facebook.com/externalhit_uatext.php)"
Make sure it is not blocked by your server firewall
Look in your server log if it even tried to access your site
If you think this is a firewall issue look at this link
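To check the log as suggested above, something as simple as the following would do. It is a sketch that filters log lines for the crawler's user-agent marker. The helper name is made up, and it assumes your log format records the User-Agent at all.

```python
def crawler_hits(log_lines, marker="facebookexternalhit"):
    """Return the log lines whose text mentions the crawler's user agent."""
    needle = marker.lower()
    return [line.rstrip("\n") for line in log_lines if needle in line.lower()]
```

Typical use would be `with open(path_to_access_log) as log: print(crawler_hits(log))`, where `path_to_access_log` is wherever your server writes its access log. An empty result after running the debugger suggests the request never reached the web server.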
Your problem appears to be with your character encoding string. Your Apache server is currently sending the unsupported string latin1. You've defined your meta:content-type as iso-8859-1. See the w3c validator
From what I've seen, the Facebook parser will stop immediately if it encounters either an unrecognized character encoding string or a mismatch in character encoding strings between your header and meta tags.
The problem could be originating from either your httpd.conf or php.ini files. Change these to match your meta and restart Apache. Since the problem seems to be domain-specific, I'd check httpd.conf first.
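A quick way to compare the two declarations is to pull the charset out of the Content-Type header and out of the meta tag and see whether the strings agree. The sketch below does a plain string comparison with regexes (helper names are mine). It is not a real HTML parser and it does not know which encoding names Facebook accepts.

```python
import re


def charset_from_header(content_type):
    """Extract the charset from a Content-Type header value, lowercased."""
    match = re.search(r"charset\s*=\s*\"?([\w.:-]+)", content_type, re.IGNORECASE)
    return match.group(1).lower() if match else None


def charset_from_meta(html_text):
    """Extract the charset from a <meta charset> or <meta http-equiv> tag."""
    match = re.search(r"<meta[^>]*charset\s*=\s*[\"']?([\w.:-]+)", html_text, re.IGNORECASE)
    return match.group(1).lower() if match else None


def charsets_match(content_type, html_text):
    """True only if both declarations are present and name the same string."""
    header = charset_from_header(content_type)
    return header is not None and header == charset_from_meta(html_text)
```

With a header of `text/html; charset=latin1` and a meta tag declaring `iso-8859-1`, `charsets_match` returns False, which is the mismatch described above. The literal names are compared on purpose, even though many tools treat the two as aliases.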
Could your domain be blacklisted? Could you try messaging your url to someone, and see if Facebook gives you a "This message contains blocked content..." error?
For example:
If you don't provide certain minimum Facebook markup on your page, it will respond with "Error Parsing URL: Error parsing input URL, no data was scraped." I only looked at the homepage, but it appears that dagbok.nu contains no Facebook markup. I'm not sure what things must be present at minimum, but in my implementation, I assume the fb:app_id meta tag and the JavaScript SDK script must be there. You may want to take a look at http://developers.facebook.com/docs/guides/web/#plugins , particularly the Authentication section.
I discovered your question because I had this same error today for an unknown reason. I found that it was caused because the content of my og:image meta tag used an incorrect URL to the image I was trying to use. So as you add Facebook markup to your page, make sure your values are correct or you may continue to receive this message.
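To double-check which Open Graph values a page actually exposes (and catch a wrong og:image URL like mine), you can list the tags with Python's standard-library HTML parser. This is a minimal sketch with made-up class and function names. It only collects `og:` and `fb:` meta properties and keeps the first value seen for each.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..."> and <meta property="fb:..."> values."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        content = attr_map.get("content")
        if prop.startswith(("og:", "fb:")) and content is not None:
            # Keep the first occurrence of each property.
            self.properties.setdefault(prop, content)


def extract_open_graph(html_text):
    """Return a dict of the Open Graph / fb meta properties found in html_text."""
    parser = OpenGraphParser()
    parser.feed(html_text)
    return parser.properties
```

Feed it the page source and then open the `og:image` value in a browser to confirm the URL really points at an image.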
This doesn't seem to be a Facebook problem if you take a look at what I've discovered.
Testing it with the W3C online validation tool gives one of two results.
Tested using: dagbok.nu but note http://dagbok.nu has no difference in test results. Remove the last forward slash in between tests.
Test: 1
Results: 72 Errors 0 Warning
Note: Shown here is a fragment of the source Frameset DOCTYPE webpage.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
<NOSCRIPT><IMG SRC="http://svs.bystorm.se/rv?java=off"></NOSCRIPT><SCRIPT SRC="http://svs.bystorm.se/rvj"></SCRIPT>
<HTML STYLE="height:100%;">
<HEAD>
<META HTTP-EQUIV="content-type" CONTENT="text/html;charset=iso-8859-1">
Test: 2
Results: 4 Errors 1 Warning
Note: Shown here is a fragment of the source Transitional DOCTYPE webpage.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html >
<head>
<title>Dagbok: Framsida</title>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
<meta name="author" content="Jonas Eklundh Communication (http://jonas.eklundh.com)">
<meta name="author-email" content="jonas#eklundh.com">
<meta name="copyright" content="Jonas Eklundh Communication #2012">
<meta name="keywords" content="Atlas,Innehållssystem,Jonas Eklundh">
<meta name="description" content="">
<meta name="creation-time" content="0,079s">
<meta name="kort" content="DGB">
Repeated tests loop these results when done a couple seconds apart indicating a page-redirect is occurring.
Security warnings are seen in Firefox and Chrome when visiting your site using these secure URLs:
https://dagbok.nu
https://www.dagbok.nu
The browser indicates the site should not be trusted because it's impersonating another site using an invalid security certificate from *.loopiasecure.com
Recommendation: Check your .htaccess file, CMS Settings, page redirection, and security settings. Use the above source webpages to realize those file-locations / file-names that are being served to discover what's set incorrectly.
Once that's done, I think Facebook will be happy to then debug your webpage and provide additional recommendations.
I had the same problem and discovered it was an incorrect IPv6 address in the AAAA records for my domain. The IPv4 record was correct, so the site worked in a browser, but FB obviously checks the IPv6 records!
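To see what your domain resolves to over both protocols, a small sketch like this lists the IPv4 and IPv6 addresses side by side so you can compare them with what your host says they should be. The function names are mine, and the result depends on the resolver of the machine you run it on.

```python
import socket


def split_by_family(addrinfo):
    """Split getaddrinfo() results into sorted IPv4 and IPv6 address lists."""
    ipv4, ipv6 = set(), set()
    for family, _type, _proto, _canonname, sockaddr in addrinfo:
        if family == socket.AF_INET:
            ipv4.add(sockaddr[0])
        elif family == socket.AF_INET6:
            ipv6.add(sockaddr[0])
    return sorted(ipv4), sorted(ipv6)


def resolve(host, port=80):
    """Look up host and return (ipv4_addresses, ipv6_addresses)."""
    return split_by_family(socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP))
```

If `resolve("your-domain.example")` returns an IPv6 address that does not belong to your server, the AAAA record is the thing to fix.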
This issue may also happen when Cloudflare is used. This is because Cloudflare protects the page from Facebook, which is then unable to collect the data, which in turn makes Facebook think the page is invalid.
My fix was:
Turn off Cloudflare for the page.
Scrape the page using Facebook's Dev Tools: https://developers.facebook.com/tools/debug/og/object
Click the "Fetch new scrape information" button and let it run.
Re-enable Cloudflare protection for the page.
You should then be able to go on and add the page wherever you needed it.