Domain blocked and no data scraped - facebook

I recently purchased the domain www.iacro.dk from UnoEuro and installed WordPress planning to integrate blogging with Facebook. However, I cannot even get to share a link to the domain.
When I try to share any link on my timeline, it gives the error "The content you're trying to share includes a link that's been blocked for being spammy or unsafe: iacro.dk". Searching, I came across Sucuri SiteCheck which showed that McAfee TrustedSource had marked the site as having malicious content. Strange considering that I just bought it, it contains nothing but WordPress and I can't find any previous history of ownership. But I got McAfee to reclassify it and it now shows up green at SiteCheck. However, now a few days later, Facebook still blocks it. Clicking the "let us know" link in the FB block dialog got me to a "Blocked from Adding Content" form that I submitted, but this just triggered a confirmation mail stating that individual issues are not processed.
I then noticed the same behavior as here and here: when I type any iacro.dk link on my Timeline, it generates a blank preview with "(No Title)". It doesn't matter if it's the front page, an .htm document or even an image: nothing is returned. So I tried the debugger, which returns the very generic "Error Parsing URL: Error parsing input URL, no data was scraped.". Searching on this site, a lot of people suggest that missing "og:" tags might prevent scraping. I installed a WP plugin for that and verified tag generation, but nothing changed. And since FB can't even scrape plain .htm / .jpg files from the domain, I assume tags can be ruled out.
Here someone suggests 301 Redirects being a problem, but I haven't set up redirection - I don't even have a .htaccess file.
So, my questions are: Is this all because of the domain being marked as "spammy"? If so, how can I get the FB ban lifted? However, I have seen examples of other "spammy" sites where the preview is being generated just fine, e.g. http://dagbok.nu described in this question. So if the blacklist is not the only problem, what else is wrong?
This is driving me nuts so thanks a lot in advance!

I don't know the details, but this is a problem that Facebook has with websites hosted on shared servers, i.e. where the server hosting your website also hosts a number of other websites.

Related

Facebook OG Tags cache my old server

I'm facing a big problem with the Facebook debugger. I've read tons of topics about the Facebook cache and so on, but none describe my situation.
I recently changed servers: the new one runs perfectly, while the old one is shut down.
The problem is that Facebook doesn't see the change and keeps scanning the old server. I know this because the title it shows is a 404, and when I click on "See exactly what our scraper sees for your URL" it returns "Document returned no data".
The problem affects every single page, but if you want to test one, for example:
http://sayitwithkittens.io/cat/40
What the Facebook debugger sees: https://developers.facebook.com/tools/debug/og/echo?q=http%3A%2F%2Fsayitwithkittens.io%2Fcat%2F40
I would like to upload a screenshot of the parsing result, but I don't have the necessary reputation yet.
Thank you for helping me :)
Resolved by Igy :
Did you update both the IPv4 and IPv6 records for your domain? When I check from my laptop I get an IPv6 address, which returns a 404 for that URL.
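A quick way to compare the two record types is to list every address the local resolver returns, grouped by family. This is only a minimal sketch using Python's standard `socket` module; the function names `group_by_family` and `resolve` are made up for illustration, and the result depends on whichever resolver the machine uses.

```python
import socket


def group_by_family(addrinfo):
    """Split getaddrinfo() results into IPv4 (A) and IPv6 (AAAA) address lists."""
    labels = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    grouped = {"IPv4": [], "IPv6": []}
    for family, _type, _proto, _canonname, sockaddr in addrinfo:
        label = labels.get(family)
        address = sockaddr[0]
        # Skip other address families and duplicate entries.
        if label and address not in grouped[label]:
            grouped[label].append(address)
    return grouped


def resolve(host, port=80):
    """Look up every address the local resolver returns for host."""
    results = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    return group_by_family(results)
```

Calling `resolve("sayitwithkittens.io")` would then show whether an IPv6 address is still published alongside the IPv4 one. If the IPv6 address belongs to the old server, a scraper that prefers IPv6 would keep reaching it.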

Error parsing input URL, no data was scraped. only with new pages on my site

The problem I have is that I own a website where other people can post content, creating new pages on my domain. Starting today, all newly created post pages are malfunctioning: sharing does not load the thumbnail picture, the title, and so on. The weird thing is that all the posts (pages) created before today still work fine.
What caused an error to occur out of nowhere?
I also cannot debug any of my website's URLs; they all give the same error: Error parsing input URL, no data was scraped
The page I'm having problems with is here: http://www.vabameedia.ee/vm/184/h%C3%A4da-ei-anna-h%C3%A4beneda.html
This is one of the pages where it says there is no error on the page, but Facebook still can't reach it: http://www.vabameedia.ee/vm/178/craig-parks-%C3%BChek%C3%A4eline-krossisoitja.html
For people experiencing the same problem but from different causes: I discovered a few interesting things about how Facebook "scrapes" pages by checking the server logs while running some trials.
First of all: if you have never tried to share a page on FB, FB has never tried to scrape it, and it will not try to do so if you only put the URL in the Debug tool.
That's the first reason why you get the error: it just states that FB has no information on the page, and you must "force" it to scrape the page.
The first time you try to share a page, FB scrapes it (it asks your server for the first 40k of the page and analyses the Open Graph tags).
What can happen is that you do not see the image: Facebook Share Dialog does not display thumbnails on first load
The reason is that FB is still scraping your page and caching the image behind the scenes. The next time, in fact, the image is there as well.
How to solve it? Pre-caching: https://developers.facebook.com/docs/sharing/best-practices#precaching
or simply add
<meta property="og:image:width" content="450"/>
<meta property="og:image:height" content="298"/>
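To check which Open Graph tags a page actually exposes, the `<meta property="og:...">` pairs can be pulled out of the HTML locally. The following is a rough sketch with Python's standard `html.parser`; the class and function names are hypothetical, and it does not claim to reproduce how Facebook's scraper parses a page.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags are routed here as well.
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        if prop.startswith("og:"):
            self.tags[prop] = attributes.get("content")


def extract_og_tags(html):
    """Return a dict mapping each og: property found in html to its content."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags
```

Feeding it the page source shows at a glance whether `og:image:width` and `og:image:height` are present next to `og:image`.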
I was pulling my hair out trying to fix this issue. Hours and hours of troubleshooting to no avail. After speaking with one of our programmers about a topic unrelated I thought of something to try as a long shot.
Much to my surprise, it worked!!!
This is the reason behind the problem and my solution for it:
When you draft a post in WordPress it generates a link based on your article's title (unless you manually change it). The title of my article included special characters, however the auto-generated link didn't display these special characters, only hyphens to replace the spaces. Should be fine right? Wrong! Somewhere embedded in metadata and code in the WordPress platform are those special characters and they mess up the way Facebook pulls info from the article being linked to. This is a problem because certain special characters invalidate hyperlinks.
For example:
Article Title: R[eloaded]
Auto-generated hyperlink DISPLAYED in WordPress "Permalink" field: http://www.example.com/reloaded
Actual WordPress Auto-generated hyperlink: http://www.example.com/r[eloaded]
Those brackets will invalidate the link and Facebook will be unable to pull any information (ie pictures) from it.
Solution:
(1) Simply, manually change the WordPress hyperlink address to something that doesn't include any special characters (this will not change the title of your article).
(2) Click "Update" to change the post to include the new hyperlink.
(3) Click "Purge from Cache" in the WordPress window
(4) Refresh your Facebook browser window
(5) Paste the new hyperlink for your article
(6) Enjoy your Facebook post with a preview image and information
Sidenote: Don't pull your hair out over Facebook, it's not worth it. =)
If you're using Wordpress, edit the post in question to change the permalink (just alter it slightly), then update the post. Using the new permalink in the Facebook OG debugger should now work.
It's a weird fix, but I think it takes care of a problem caused by special characters being used in the title of a post, which is then used to make the permalink.
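Before changing a permalink by hand, it can help to see which characters in the URL path would need percent-encoding. This is a small sketch with Python's `urllib.parse`; the helper names and the deliberately conservative "safe" character set are assumptions made for illustration, not WordPress or Facebook rules.

```python
from urllib.parse import quote, urlsplit

# Conservative set of characters allowed to appear literally in a path:
# RFC 3986 unreserved characters, plus "/" as separator and "%" for escapes.
SAFE_PATH_CHARS = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789-._~/%"
)


def unsafe_path_chars(url):
    """Return the sorted list of path characters outside the conservative safe set."""
    path = urlsplit(url).path
    return sorted({ch for ch in path if ch not in SAFE_PATH_CHARS})


def encoded_path(url):
    """Return the path with everything except "/" and "%" percent-encoded as needed."""
    return quote(urlsplit(url).path, safe="/%")
```

For the example above, `unsafe_path_chars("http://www.example.com/r[eloaded]")` reports the two brackets, while the displayed permalink `http://www.example.com/reloaded` yields an empty list.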
It's all about a DNS issue. I was having the same problem and resolved it by updating the domain's name servers to the actual name servers.
In my case my domain was pointed to ns1.websterz.net and ns2.websterz.net, and on that server I had a DNS redirect to my other server (where the website is hosted). I just updated the domain's name servers to the actual name servers of the server where my website is hosted. This was an account migration, and I had forgotten to update the name servers for the new server.
Everything works fine now.

URL Blocked on Facebook [duplicate]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 3 years ago.
I am trying to share a link to my site on Facebook. The page displays correctly in my browser, but when I share it via the API or the front end it does not show up. When I put my URL into the Graph API debugger it gives me the error "Error Parsing URL: Error parsing input URL, no data was scraped."
What could be wrong?
Hopefully, this is an exhaustive list of things to check when your site won't scrape:
1) Is your site on a spam blacklist?
This is rare, but Facebook and most other tools won't parse your site at all if it shows up on a spammer blacklist.
I use https://admin.uribl.com/ as a checker. If your site is listed, you need to find and clean the malware on your site, then follow the instructions from the blacklist owner(s) to remove your site. If the problem is that you've got a host who is a known spammer, you'll need to change hosts. It's going to take a few days for this to work its way through the system before any site will scrape your site again.
2) Is your (X)HTML valid?
Facebook's parser is very strict. If the headers sent by your web server or your HTML isn't valid, Facebook will not parse your site. To test this in detail, use the Markup Validator from the W3C. You have to resolve all of the errors before Facebook will parse your page.
Some of the most common errors I have seen are:
Invalid string sent in the headers
Mismatch between the character-encoding sent in the header and the <meta charset> tag in the document.
Invalid or incorrect <!DOCTYPE>
Whitespace before the first tag of the document
Malformed HTML tags, especially in the <head>
Tags closed with > instead of /> in XHTML documents
3) Are you redirecting your visitors with JavaScript?
The Facebook parser does not execute JavaScript. If you want to redirect a visitor to custom content, you need to do this with a server-side script.
4) Is your server refusing connections to non-browsers?
This is harder to diagnose, but some servers are set to return a 500:Server Error or 403:Forbidden for any non-browser visitor.
5) Does the Facebook site tell you your link is blocked?
Log into Facebook and attempt to share a link on your timeline. If your site appears in the Facebook internal blacklist, you will get a message telling you the site is blocked. On this dialog, there is a form where you can mark this as a false positive and request a review of your site.
If you end up on this list, Facebook users are blocking your postings or marking them as spam. That probably originates in your content. What you think is SEO is probably spamdexing or the content you are sharing is offensive or polarizing to some users, or you're just sharing the same stuff over and over again.
Once you have fixed the error, visit the Facebook Debugger again. A manual visit to the debugger clears Facebook's cache for that URL. Give things a few minutes for Facebook to push the updates to all servers, and then try again.
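Two of the checks above (a charset mismatch between the header and the `<meta>` tag, and a server that refuses non-browser clients) can be approximated from a script. The sketch below uses only Python's standard library; the function names are hypothetical, and the user-agent string is the one Facebook has documented for its crawler, which should be treated as an assumption here. A server may also block by IP address, which this cannot detect.

```python
import re
import urllib.error
import urllib.request

# Assumed crawler user-agent; verify against Facebook's current documentation.
CRAWLER_UA = "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"


def declared_charsets(content_type_header, body):
    """Return (charset from the HTTP header, charset from the <meta> tag)."""
    header_match = re.search(r"charset=\"?([\w-]+)", content_type_header or "", re.I)
    meta_match = re.search(rb"<meta[^>]+charset=[\"']?([\w-]+)", body, re.I)
    header_charset = header_match.group(1).lower() if header_match else None
    meta_charset = meta_match.group(1).decode("ascii").lower() if meta_match else None
    return header_charset, meta_charset


def fetch_as_crawler(url, timeout=10):
    """Request url with the crawler user-agent; return (status, charsets)."""
    request = urllib.request.Request(url, headers={"User-Agent": CRAWLER_UA})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # The start of the page is enough to find the <meta> tags.
            body = response.read(40_000)
            content_type = response.headers.get("Content-Type")
            return response.status, declared_charsets(content_type, body)
    except urllib.error.HTTPError as error:
        # A 403 or 500 here, but a normal page in a browser, points at item 4.
        return error.code, (None, None)
```

If `fetch_as_crawler` returns two different charsets, or a status other than 200, that is a reasonable place to start looking before running the page through the W3C validator.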
If your site is on the blacklist (you can check here: https://add5000.com/all-tips/entry/facebook-checker-if-link-blocked ),
you can try using a URL redirector with Open Graph and referral support:
https://add5000.com/all-tips/entry/how-to-post-a-blocked-link-on-facebook
If your link does not appear to be blocked but Facebook still doesn't accept it,
check your page for links to other sites that are blocked.

Why does Object debugger say my URL is a facebook URL and isn't "scrapable"

In trying to create an "object" page for my first facebook app, I've run into some difficulty. I followed Facebook's Open Graph Tutorial nearly exactly.
After creating an "object" html page with the appropriate <meta property="og:... tags I tried running the URL through the Debugger Tool as suggested in the tutorial but I'm given the following error:
"Facebook URLs aren't scrapable by this Debugger. Try your own."
This page is in the same directory on my company's linux box as the canvas page, and is certainly not a "Facebook URL". If it matters, I'm using an IP instead of a domain name: xx.x.x.xxx/app/obj.html
...
I continued the tutorial anyway, but ultimately it does not seem to want to post a new action/object (is this even right?). I did however manage to get something to work, as in the app timeline view I apparently actioned one of those objects a couple hours ago. I assume this happened when I was pasting curl POST commands into the terminal.
I'm pretty new to the whole open graph, and facebook APIs, etc., so I'm probably operating under false assumptions of some sort, and I've been all over trying different things, but this error seems pretty bizarre to me and I can't seem to resolve it.
UPDATE
I just took the object page and put it on my own personal shared hosting acct. The debugger worked (inexplicably) fine on it, but I couldn't go too far since it's a different domain than the one authorized by my app.
Make sure og:url inside your html page does not point to facebook.
Also, make sure to look at the Open Graph protocol page to check that you formatted the og tags correctly.
Also, make sure the page is accessible to everyone, not just yourself.
Without knowing the URL it's hard to be sure, but most likely your URL either includes an og:url tag pointing to a facebook.com address, or returns an HTTP 301/302 redirect to Facebook.
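Both suspects can be tested the same way: take the value of the page's og:url tag, or the Location header of a redirect response, and check whether its host is facebook.com. A minimal sketch in Python follows; `is_facebook_url` is a hypothetical helper name, and the host test is a guess at what the debugger objects to, not its documented behaviour.

```python
from urllib.parse import urlsplit


def is_facebook_url(url):
    """True when the URL's host is facebook.com or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return host == "facebook.com" or host.endswith(".facebook.com")
```

A value without a scheme, such as a bare IP followed by a path, has no parsed host and is reported as not being a Facebook URL.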

Site URL blocked as "Spammy", but also no data was scraped in debug

Made a website for a client of mine who owns a small business. About three months ago, her site URL was blocked by Facebook for being "Spammy". We launched a pretty impressive "Go Here And Report It As Safe" campaign, but alas, it's not unblocked.
We made a new domain that mirrored the blocked one. This worked for about an hour. Then lo and behold! It got blocked too.
I was very curious, so I decided to try out the "Object Debugger". When I did, I got this message:
"Error Parsing URL: Error parsing input URL, no data was scraped."
I tried it again a few hours later, and it scraped just fine! Not only did it scrape and show up in debug perfectly, but it also didn't ping as blocked when I posted to my wall! It was amazing.
Sadly, I made an edit to the header file (just took away a meta tag), and now it won't scrape again. And it's blocked again.
The URL in question is enchantedcareers.com.
I feel like maybe the site isn't being blocked as spammy, but rather, there may be some kind of coding problem? Anyone else had an OG bug ping a URL as blocked upon link shim?
EDIT : Again, it let me post the link, with full preview and everything. I posted it, and about one minute later, the post was removed, and it was back to being "spammy"
The URL Debugger only scrapes my URL sporadically (with no page edits being made whatsoever).
I can't find a pattern.
no data was scraped: 9:06
successful scrape: 9:34
no data was scraped: 9:44
successful scrape: 10:04
no data was scraped: 10:08
Edit #2 : This is just completely crazy. Our new domain, enchantedcareers.net, which is nothing but my host's default quickstart.html page, is also blocked from being posted. When I try to post the .net domain, it gives me both the .net and .com domain as being blocked.
THE .COM DOMAIN ISN'T EVEN TIED TO THE .NET NAME. This domain is straight out of the box. Why is it bringing that domain up when I try to post a new one?
I'm just so confused.
It won't scrape the .net name, either.
Could this be a server thing...?
Your URI comes up clean on a URIBL check, and you're on Dream Host, which is typically reliable, so I wouldn't expect to see your IP address show up in a DNSBL check. I don't see any glaring errors in your page code of the kind that typically cause the Facebook parser to choke.
There is one "suspicious" script on your page according to this report. Try removing this, clear your cache and see if Facebook will parse your URL.