Why does Object debugger say my URL is a facebook URL and isn't "scrapable" - facebook

In trying to create an "object" page for my first facebook app, I've run into some difficulty. I followed Facebook's Open Graph Tutorial nearly exactly.
After creating an "object" html page with the appropriate <meta property="og:... tags I tried running the URL through the Debugger Tool as suggested in the tutorial but I'm given the following error:
"Facebook URLs aren't scrapable by this Debugger. Try your own."
This page is in the same directory on my company's linux box as the canvas page, and is certainly not a "Facebook URL". If it matters, I'm using an IP instead of a domain name: xx.x.x.xxx/app/obj.html
...
I continued the tutorial anyway, but ultimately it does not seem to want to post a new action/object (is this even right?). I did however manage to get something to work, as in the app timeline view I apparently actioned one of those objects a couple hours ago. I assume this happened when I was pasting curl POST commands into the terminal.
I'm pretty new to the whole open graph, and facebook APIs, etc., so I'm probably operating under false assumptions of some sort, and I've been all over trying different things, but this error seems pretty bizarre to me and I can't seem to resolve it.
UPDATE
I just took the object page and put it on my own personal shared hosting acct. The debugger worked (inexplicably) fine on it, but I couldn't go too far since it's a different domain than the one authorized by my app.

Make sure og:url inside your html page does not point to facebook.
Also, make sure to look at the open graph protocol page (to see you formatted the og tags correctly.
Also, make sure the page is accessible to everyone, not just yourself.

Without knowing the URL it's hard to be sure, but it's most likely that your URL is either including a og:url tag pointing to a facebook.com address, or a HTTP 301/302 redirect to Facebook instead

Related

Facebook showing page not found when sharing link

I'm sharing content from a website and every time I paste the link into Facebook it says 'page not found'.
Sometimes it works when I manually add the 'www.' in front of the URL in the address bar.
EXAMPLE
Shows page not found:
http://roundreviews.co.uk/reviews/speakers/native-union-monocle-speaker/
Works when you manually place www. in front:
www.roundreviews.co.uk/reviews/speakers/native-union-monocle-speaker/
I honestly have no I idea why it's doing this, any thoughts on how it can be fixed on the web side?
Also...
I have tried with the link below with both the www. and without yet it doesn't work with either of them, this is all very strange. This is the only link I have tried and it doesn't work with both:
www.roundreviews.co.uk/microphones/spark-digital-microphone/
Any help is much appreciated, thanks.
For me what it worked was to access the Facebook Debugger, as Goose said.
I saw that the scrape was about 12 hours ago, looks like it fetches the first time and saves it as caché or whatsoever...
What it worked for me is to debug the url, then click "fetch new scrape information" after the previous information has been shown.
Hope it works!
For those running across this today, you might find that you also need to verify your domain and link it to your page.
To do this you need to
Set up a Facebook Business Account
Add your page to the business account
Verify your domain (using DNS TXT or adding a page facebook gives you)
Under domains, connect your page as an asset of that domain

Error parsing input URL, no data was scraped. only with new pages on my site

The problem i have is that i own a website where other people can post stuff ,creating new pages on my domain, but the problem that occured today is that all the new post pages created today are malfunctioning , sharing is not loading thumbnail picture and title and so on, but the weird this is that all the posts(new pages) created before today are all working fine
What caused an error to occur out of nowhere?
I also cannot debug any of the URL's of my website as the same error: Error parsing input URL, no data was scraped
The website im having problems with is here http://www.vabameedia.ee/vm/184/h%C3%A4da-ei-anna-h%C3%A4beneda.html
This is one of the sites where it says no error on page but facebook still cant reach it. http://www.vabameedia.ee/vm/178/craig-parks-%C3%BChek%C3%A4eline-krossisoitja.html
For people experiencing the same problem but for different causes, I discovered a few interesting things about how Facebook "scrapes" pages, checking the logs of the server while doing some trials.
First of all: if you never tried to share a page with FB, FB never tried to scrape it, and it will not try to do so if you only put the url in the Debug tool.
That's the first reason because you get the error: it just states that FB has no information on the page, you must "force" it to scrape the page.
The first time you try to share a page, FB scrapes it (asks your server the first 40k of the page and analyse the opengraph tags).
What can happen is that you do not see the image: Facebook Share Dialog does not display thumbnails one first load
The reason is that FB behind the scenes is still scraping your page and caching the image. The next time, in fact, you have also the image.
How to solve it? Pre caching: https://developers.facebook.com/docs/sharing/best-practices#precaching
or simply add
<meta property="og:image:width" content="450"/>
<meta property="og:image:height" content="298"/>
I was pulling my hair out trying to fix this issue. Hours and hours of troubleshooting to no avail. After speaking with one of our programmers about a topic unrelated I thought of something to try as a long shot.
Much to my surprise, it worked!!!
This is the reason behind the problem and my solution for it:
When you draft a post in WordPress it generates a link based on your article's title (unless you manually change it). The title of my article included special characters, however the auto-generated link didn't display these special characters, only hyphens to replace the spaces. Should be fine right? Wrong! Somewhere embedded in metadata and code in the WordPress platform are those special characters and they mess up the way Facebook pulls info from the article being linked to. This is a problem because certain special characters invalidate hyperlinks.
For example:
Article Title: R[eloaded]
Auto-generated hyperlink DISPLAYED in WordPress "Permalink" field: http://www.example.com/reloaded
Actual WordPress Auto-generated hyperlink: http://www.example.com/r[eloaded]
Those brackets will invalidate the link and Facebook will be unable to pull any information (ie pictures) from it.
Solution:
(1) Simply, manually change the WordPress hyperlink address to something that doesn't include any special characters (this will not change the title of your article).
(2) Click "Update" to change the post to include the new hyperlink.
(3) Click "Purge from Cache" in the WordPress window
(4) Refresh your Facebook browser window
(5) Paste the new hyperlink for your article
(6) Enjoy your Facebook post with a preview image and information
Sidenote: Don't pull your hair out over Facebook, it's not worth it. =)
If you're using Wordpress, edit the post in question to change the permalink (just alter it slightly), then update the post. Using the new permalink in the Facebook OG debugger should now work.
It's a weird fix, but I think it takes care of a problem caused by special characters being used in the title of a post, which is then used to make the permalink.
Its all about DNS issue, was having same issue and resolved it by updating domain name servers to actual name servers.
In my case my domain was pointed to ns1.websterz.net and ns2.websterz.net and on this server i had DNS redirect to my other server (where web site is hosted). I Just updated name servers of the domain to actual name servers where my web site is hosted on. This was account migration case i forgot to update name servers as of new server.
Everything works fine now.

Domain blocked and no data scraped

I recently purchased the domain www.iacro.dk from UnoEuro and installed WordPress planning to integrate blogging with Facebook. However, I cannot even get to share a link to the domain.
When I try to share any link on my timeline, it gives the error "The content you're trying to share includes a link that's been blocked for being spammy or unsafe: iacro.dk". Searching, I came across Sucuri SiteCheck which showed that McAfee TrustedSource had marked the site as having malicious content. Strange considering that I just bought it, it contains nothing but WordPress and I can't find any previous history of ownership. But I got McAfee to reclassify it and it now shows up green at SiteCheck. However, now a few days later, Facebook still blocks it. Clicking the "let us know" link in the FB block dialog got me to a "Blocked from Adding Content" form that I submitted, but this just triggered a confirmation mail stating that individual issues are not processed.
I then noticed the same behavior as here and here: When I type in any iacro.dk link on my Timeline it generates a blank preview with "(No Title)". It doesn't matter if it's the front page, a htm document or even an image - nothing is returned. So I tried the debugger which returns the very generic "Error Parsing URL: Error parsing input URL, no data was scraped.". Searching on this site, a lot of people suggest that missing "og:" tags might cause no scraping. I installed a WP plugin for that and verified tag generation, but nothing changed. And since FB can't even scrape plain htm / jpg from the domain, I assume tags can be ruled out.
Here someone suggests 301 Redirects being a problem, but I haven't set up redirection - I don't even have a .htaccess file.
So, my questions are: Is this all because of the domain being marked as "spammy"? If so, how can I get the FB ban lifted? However, I have seen examples of other "spammy" sites where the preview is being generated just fine, e.g. http://dagbok.nu described in this question. So if the blacklist is not the only problem, what else is wrong?
This is driving me nuts so thanks a lot in advance!
I don't know the details, but it is a problem that facebook has with web sites hosted on shared servers, i.e. the server hosting your web site also hosts a number of other web sites.

URLs redirect to spyware site

We are developing an app that makes posts on behalf of our users to Facebook. Within those posts, we want to put links to external (non-Facebook) websites.
Looking at the links in the status bar of the browser (usually Chrome), the correct URL is displayed. However, Facebook seems to wrap the actually-clicked link into some extra bells-and-whistles. Usually, this works correctly.
Sometimes, however, this URL wrapping ends up sending the click to a URL like:
http: //spywaresite.info/0/go.php?sid=2
(added space to make it non-browsable!) which generates Chromes severe warning message:
This happens very occasionally on Chrome, but very much more often in the iOS browser on the iPhone.
Does anyone have any pointers as to how to deal with this?
EDIT
For example, the URLs we put in the link is
http://www.example.com/some/full/path/somewhere
but the URL that actually gets clicked is:
http://platform.ak.fbcdn.net/www/app_full_proxy.php?app=374274329267054&v=1&size=z&cksum=fc1c17ed464a92bc53caae79e5413481&src=http%3A%2F%2Fwww.example.com%2Fsome%2Ffull%2Fpath%2Fsomewhere
There seems to be some JavaScript goodness in the page that unscrambles that and usually redirects correctly.
EDIT2
The links above are put on the image and the blue text to the right of the image in the screenshot below.
Mousing over the links (or the image) in the browser shows the correct link. Right-clicking on the link and selecting "Copy Link Address" gets the fbcdn.net link above (or one like it). Actually clicking on the link seems to set off some JavaScript processing of the fbcdn.net link into the right one... but sometimes that processing fails.
I'm not 100% sure what you're asking here, but i'll tell you what I know:- are you referring to this screen on Facebook?
(or rather, the variation of that screen which doesn't allow clickthrough?)
If you manually send a user to facebook.com/l.php?u=something they'll always see that message - it's a measure to prevent an open redirector
if your users are submitting such links, including the l.php link, you'll need to extract the destination URL (in the 'u' parameter)
If you're seeing the l.php URLs come back from the API this is probably a bug.
If links clicked on facebook.com end up on the screen it's because facebook have detected the link as suspicious (e.g. for URL redirector sites - the screen will allow clickthrough but warn the user first) or malicious/spammy (will not allow clickthrough)
In your app you won't be able to post links to the latter (an error will come back saying the URL is blocked), and the former may throw a captcha sometimes (if you're using the Feed dialog, this should be transparent to the app code, the user will enter the captcha and the dialog will return as normal)
If this isn't exactly what you were asking about please clarify and i'll update my answer
Rather than add to the question, I thought I'd put more details here.
It looks like the Facebook mention in the original title was mis-directed, so I've removed it.
We still haven't got to the bottom of the issue.
However, we used both Wireshark and Fiddler to look at the HTTP traffic between the Chrome browser (on the PC) and Facebook. Both showed that Facebook was returning the correct URL refresh.
Here's what Wireshark showed:
What we saw on Fiddler was that our server is issuing a redirect to the spywaresite.info site:
We are working with our ISP to figure out what is happening here.

Wordpress og:image shows up blank

I've been at this for almost 3 days straight and now I can't even think clearly anymore.
All I'm trying to do is to get my featured image thumbnail to appear when I paste the link in Facebook.
I'm using the Wordpress Facebook Open Graph protocol plugin which generates all the correct og meta properties.
My thumbnail images are 240x200px which respects the minimum requirements and also respects the 3:1 ratio
I've made sure there's no trailing slash at the end of my post URLs
When I use the Facebook Object Debugger, the only warning is in regards to my locale, but that shouldn't affect it.
Facebook appears to be pulling the right image, at least the URL is correct, but the image appears as a blank square
I've gone through pretty much every thread I could find in forums, but all the information available is about using the correct og tags, which I believe I'm already doing.
Thank you very very much for any help, I'm desperate!! :)
You can troubleshoot the OpenGraph meta tags with the Debugger https://developers.facebook.com/tools/debug - this can at least show if you're using the meta tags properly and if Facebook can 'read' the image.
I finally figured out that the root of my issue was the fact that I was using an addon domain (which is really a subdomain being redirected to the top level domain) and I read on eHow (of all places :) ) that Facebook has trouble pulling data from redirected domains.
Not sure if there was another way around it, but I simply ended up creating a seperate hosting account and everything is loading properly now.
one problem youre going to run into testing is that often the first time your page or post gets liked, fb keeps whatever img it finds in your meta tags or by searching your page. so, you'll keep changing your img meta tag and still it wont show the right pic. it's very anoying. One way to get around it is to change the slug of your post. now, it has a different url and to fb, it's a different page. The downside is you lose all the likes that go with your orig url. Not a problem with a new site.
I ended here googling another problem. Maybe this might help someone:
Please bear in mind that the facebook scraper works asynchronously and will need some time (during my tests around 10 minutes) to be able to display an image after seeing it for the first time.
For more information, here's a more thorough answer on a similar problem.
Indeed, as Andy Wibbels points out the FB debugger is a really handy tool.
I faced a similar issue with a server's og:image tag pointing to a secure subdomain which actually mirrors a CDN server,
<meta property="og:image" content="https://subdomain.pathToImage.jpg" />
<meta property="og:image_secure" content="https://subdomain.pathToImage.jpg" />
The FB debugging tool allows you to see the errors that FB encounters when trying to pull the image.
In my case the subdomain was not registered under the SSL certificate used by the HTTPS protocol. Hence FB was getting the following error,
Curl Error : SSL_CACERT SSL certificate problem: unable to get local issuer certificate