Warning: "my page" is unreachable - facebook

OK, so I am developing a new site, and it is very dependent on Facebook.
I have looked everywhere and done everything I should, but I keep getting this message in my FB comment area: Warning: http://www.videozoo.dk/?videos=klo-aben is unreachable
My header looks like this, as it should:
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:fb="http://www.facebook.com/2008/fbml" xmlns:og="http://ogp.me/ns#">
<head profile="http://gmpg.org/xfn/11">
<meta property="fb:admins" content="my fb id nr"/>
<meta property="fb:app_id" content="195385377211689">
<meta property="og:title" content="Videozoo.dk - Dyre video" />
<meta property="og:type" content="Video" />
<meta property="og:url" content="www.videozoo.dk/?videos=klo-aben" />
<meta property="og:site_name" content="Dyre videoer for alle!"/>
<meta property="og:description"
content="Endnu en dyre video på VideoZoo.dk"/>
My fb comment code looks like this:
<div class="fb-comments" data-href="www.videozoo.dk/?videos=klo-aben" data-num-posts="10" data-width="640" data-colorscheme="dark"></div>
My app ID for this comment box was created 4 hours ago, so it should have been updated on the servers by now.
The information in the app matches what is stated above:
Application ID/API Key:
195385377211689
Site-URL:
http://www.videozoo.dk/
Domain:
videozoo.dk
BUT it is still not working! Is it because my site is new, or maybe because the app has not been updated on the servers yet?
Please have a closer look and maybe test at this link: www.videozoo.dk/?videos=klo-aben
All ideas are welcome!

You left the protocol out of your value for og:url. Include it there (http://www.videozoo.dk/?videos=klo-aben) and it may solve the issue: the scraper follows HTTP or og:url redirects, and a URL without a protocol is likely detected as invalid.
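As a minimal sketch of that check, the snippet below tests whether a candidate og:url value is an absolute http(s) URL. The helper name `is_absolute_http_url` is hypothetical, and this only illustrates the missing-protocol problem; it is not how Facebook's scraper validates URLs.

```python
from urllib.parse import urlparse


def is_absolute_http_url(url: str) -> bool:
    """Return True only when url has an http(s) scheme and a host part."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# The value from the question has no scheme, so urlparse sees only a path.
print(is_absolute_http_url("www.videozoo.dk/?videos=klo-aben"))         # False
print(is_absolute_http_url("http://www.videozoo.dk/?videos=klo-aben"))  # True
```

The same check applies to the data-href attribute of the comments box, which also lacks the protocol in the question.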
Edit: I figured this out, and it's a bit strange.
When I manually scrape that page it seems to work fine, but when I run it through the URL Debugger it fails due to an HTTP 403 response from your side.
I've seen this before with other servers that can't handle some part of Facebook's request. In this case it seems to be because your server rejects the request if an HTTP 'Range' header is sent.
Facebook's crawler only requests the first 40KB of the document when scraping, as the meta tags should be in the <head></head> section.
My test was:
$ curl -I -H 'Range: bytes=0-40960' 'http://www.videozoo.dk/?videos=klo-aben'
HTTP/1.1 403 Forbidden
Date: Wed, 30 Nov 2011 14:17:54 GMT
Server: Apache/2.2.6 mod_auth_kerb/5.3 PHP/5.2.17 mod_fcgid/2.3.5
Accept-Ranges: bytes
Connection: close
Content-Type: text/html
$ curl -I 'http://www.videozoo.dk/?videos=klo-aben'
HTTP/1.1 200 OK
Date: Wed, 30 Nov 2011 14:18:02 GMT
Server: Apache/2.2.6 mod_auth_kerb/5.3 PHP/5.2.17 mod_fcgid/2.3.5
X-Powered-By: PHP/5.2.17
Connection: close
Content-Type: text/html
I'm not sure if this is something in your code, your server config, an intermediate proxy, etc., but it's very likely the cause of your problem.
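The two curl calls above can be repeated from a script. The following is a rough sketch using only Python's standard library. The function names are hypothetical, and the 40960-byte range simply mirrors the curl test rather than any documented crawler setting.

```python
import urllib.error
import urllib.request
from typing import Optional


def build_request(url: str, range_bytes: Optional[int] = None) -> urllib.request.Request:
    """Build a HEAD request, optionally with a Range header like the curl test."""
    headers = {}
    if range_bytes is not None:
        headers["Range"] = f"bytes=0-{range_bytes}"
    return urllib.request.Request(url, headers=headers, method="HEAD")


def status_of(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status, including 4xx/5xx codes."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def range_breaks_page(url: str, range_bytes: int = 40960) -> bool:
    """True when the plain request succeeds but the ranged one is rejected."""
    plain = status_of(build_request(url))
    ranged = status_of(build_request(url, range_bytes))
    return plain < 400 <= ranged
```

Calling `range_breaks_page("http://www.videozoo.dk/?videos=klo-aben")` would return True for a server that behaves like the one in the curl output (200 without the header, 403 with it).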

Using an HTTP proxy server to stream IPTV sources

I don't know if this is the right place to ask this question but I'm hoping to find directions here.
I have a smart TV and I like to watch TV from my country with the SSIPTV app. I found an Android app that streams local channels, so I inspected its requests with Android Studio to find the streaming links. Some of them are free, but others are served through CloudFront. The problem is that I can't add a header that CloudFront needs in order to authorize the request.
For example: when I try to make a request without the "User-Agent" header, the response is this:
Status Code: 403 Forbidden
Connection: keep-alive
Content-Length: 560
Content-Type: text/html
Date: Tue, 01 Jan 2019 20:57:50 GMT
Server: CloudFront
Via: 1.1 f7e7b00c5c66a4e43041ba24c378d07a.cloudfront.net (CloudFront)
X-Amz-Cf-Id: uZQAVTrQzHsQe2vGyHxY1OYfjHCL-Nz7gCTG-koHcgr1A5HG7fGGOg==
X-Cache: Error from cloudfront
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<TITLE>ERROR: The request could not be satisfied</TITLE>
</HEAD>
<BODY>
<H1>403 ERROR</H1>
<H2>The request could not be satisfied.</H2>
<HR noshade size="1px">
Request blocked.
<BR clear="all">
<HR noshade size="1px">
<PRE>
Generated by cloudfront (CloudFront)
Request ID: TZztsUjltHpEhx54wplzupvLmZwjCRPtAvTcbdJ8DL16b1k9-_XwZw==
</PRE>
<ADDRESS></ADDRESS>
</BODY>
</HTML>
But if I set the "User-Agent" header to the value "iPhone", this is the response:
Status Code: 200 OK
Accept-Ranges: bytes
Access-Control-Allow-Credentials: true
Access-Control-Allow-Headers: Content-Type, User-Agent, If-Modified-Since, Cache-Control, Range
Access-Control-Allow-Methods: OPTIONS, GET, POST, HEAD
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: Date, Server, Content-Type, Content-Length
Cache-Control: max-age=1
Connection: keep-alive
Content-Length: 366
Content-Type: application/vnd.apple.mpegurl
Date: Tue, 01 Jan 2019 20:51:32 GMT
Server: WowzaStreamingEngine/4.7.6
Via: 1.1 880eb84cefca849ee159a7c4d89c31ea.cloudfront.net (CloudFront)
X-Amz-Cf-Id: pogc8_OBsN2-QeGj_1q8K_vyxrQH-G8a2JmWqSkVt9x57NlbKfDSdQ==
X-Cache: Hit from cloudfront
So, is there a way I could set up a proxy that adds the header to the request and then serves the content to my TV app?

If you can configure your TV app to use an HTTP proxy, then this is straightforward. For example, in Squid this is documented here: http://www.squid-cache.org/Doc/config/request_header_add/
request_header_add User-Agent "iPhone"

Facebook debugger scrapes default Apache page instead of mine

I made a site: http://pravo-trans.eu/
All the needed og meta tags are there, but when I try to share a link on any social network, nothing happens. I thought it might be the cache, but when I used the Facebook debugger, it said:
The 'og:type' property is required, but not present.
And that's not true, because I wrote this in <head>:
<meta property="og:title" content="Проект правовой помощи людям" />
<meta property="og:type" content="website" />
<meta property="og:image" content="/transgender-project.jpg" />
<meta property="og:description" content="Бесплатные юридические консультации и представительство по делам о смене документов (внесение изменений в записи о рождении, паспорта, трудовые книжки, документы об образовании и другие документы)" />
<meta property="og:url" content="http://pravo-trans.eu/" />
<meta property="og:locale" content="ru_RU" />
<link rel="canonical" href="http://pravo-trans.eu/" />
And the strangest thing happened when I clicked on "See exactly what our scraper sees for your URL". There I saw that the debugger had parsed the default Apache page instead of mine! https://developers.facebook.com/tools/debug/og/echo?q=http%3A%2F%2Fpravo-trans.eu%2F
How can that be, and how can I fix it?

After several hours of trying to debug this issue and playing with DNS settings/servers, I have a solution that works for me.
I noticed that requests from Facebook were coming from an IPv6 server, but my Apache VirtualHost declarations did not include the IPv6 address. To debug, I changed the following line in my Apache .conf file:
<VirtualHost IPv4:80>
to:
<VirtualHost IPv4:80 [IPv6]:80>
...and immediately upon restarting Apache, Facebook was able to successfully scrape my site. (Replace IPv4/IPv6 above with your actual addresses of course.)
If by chance you are using Parallels Plesk, as I am, then this is not a permanent solution, because Plesk will rewrite the configuration files. Instead, go into the Plesk panel and make sure that your server's IPv6 address is assigned to the Subscription that owns the domain in question. In my case, only the IPv4 address was assigned to the subscription.
The setting can be found under "Change Hosting Settings" for each particular Subscription.
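One way to spot this situation before touching the Apache config is to check which IP versions the domain resolves to. If it publishes an IPv6 address while the VirtualHost only lists IPv4, IPv6 clients would presumably land on the default site. The sketch below uses Python's standard `socket` module; the function names are hypothetical.

```python
import socket


def address_families(addrinfo):
    """Map getaddrinfo() results to the set of IP versions they contain."""
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    return {names[family] for family, *_ in addrinfo if family in names}


def published_families(host: str, port: int = 80):
    """Resolve host and report which IP versions it can be reached over."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return address_families(infos)
```

If `published_families("pravo-trans.eu")` includes "IPv6", the VirtualHost declaration needs the IPv6 address as well, as described above.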

Object Debugger 404 error

I have checked similar questions asked, but none seem to match the circumstances of this one.
This page is returning a 404 error in Facebook's Object Debugger tool. Other pages on the site work okay, so it shouldn't be any missing meta tags.
Some of the page content is hidden, but only some; the majority of the page content is available, so surely this shouldn't be causing the issue. If it does, then that would have to be regarded as a bug, no?
Does anyone have any idea what the issue might be and/or how to fix it?

The error message is accurate in that your URL returns an error response when the Facebook crawler attempts to get the metadata (a 403 in my test, which the debugger reports as a 404).
You'll need to check your server settings or the code that renders that URL to see why it's doing so. Here's the output when I made the same request Facebook makes from my own laptop:
$ curl -A "facebookexternalhit/1.1" -i 'http://austparents.edu.au/webinars/parent-webinar-on-the-australian-curriculum-with-rob-randall-ceo-acara/'
HTTP/1.1 403 Forbidden
Date: Tue, 23 Sep 2014 00:03:36 GMT
Server: Apache/2.2.14 (Ubuntu)
Vary: Accept-Encoding
Content-Length: 366
Content-Type: text/html; charset=iso-8859-1
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>403 Forbidden</title>
</head><body>
<h1>Forbidden</h1>
<p>You don't have permission to access /webinars/parent-webinar-on-the-australian-curriculum-with-rob-randall-ceo-acara/
on this server.</p>
<hr>
<address>Apache/2.2.14 (Ubuntu) Server at austparents.edu.au Port 80</address>
</body></html>
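To confirm that the server treats the crawler differently from a browser, the same URL can be requested with both User-Agent values and the status codes compared. This is a minimal sketch with Python's standard library. The function names and the "Mozilla/5.0" browser string are assumptions for illustration; only the crawler string comes from the curl command above.

```python
import urllib.error
import urllib.request

CRAWLER_UA = "facebookexternalhit/1.1"  # same value as curl -A above
BROWSER_UA = "Mozilla/5.0"              # assumed stand-in for a browser


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build a GET request that identifies itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def status_of(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status, including 4xx/5xx codes."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def statuses_by_user_agent(url: str) -> dict:
    """Return the status code the server gives to each User-Agent."""
    return {
        agent: status_of(build_request(url, agent))
        for agent in (CRAWLER_UA, BROWSER_UA)
    }
```

If the two status codes differ, the block is most likely a User-Agent rule in the server config or in a security plugin rather than a problem with the page's meta tags.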

Google share and Facebook sharer not pulling through information

I am running an MVC application on IIS.
When sharing a URL on either Facebook (https://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Ftn.hollaroo.com%2Fcontent%2Faux%2Fhollaroo%2Findex.html) or Google+ (https://plus.google.com/share?url=http://tn.hollaroo.com/content/aux/hollaroo/index.html), it works with the static index.html.
When I try to do the same thing with tn.hollaroo.com/terms, no metadata (title, description, image) is pulled through. index.html is a "view source + save as HTML" copy of /terms, so I doubt that the error is in the HTML itself.
The header section is as follows:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:og="http://ogp.me/ns#"
xmlns:fb="http://www.facebook.com/2008/fbml"
itemscope itemtype="http://schema.org/Article">
<head runat="server">
<meta charset="utf-8" />
<title></title>
<meta itemprop="name" content="Hollaroo Trusted Network">
<meta property="og:description" content="The Trusted Network ...">
<meta name="description" content="The Trusted Network ..." />
<meta property="og:title" content="Hollaroo Trusted Network" />
<meta property="og:type" content="website" />
<meta property="og:image" content="http://tn.hollaroo.com/content/aux/hollaroo/images/posting.jpg" />
<meta property="og:site_name" content="Hollaroo - Private Social Recruitment Networks" />
It is all there.
I have run curl, and the main difference I spot is that the /terms response tries to set a cookie:
~# curl -I http://tn.hollaroo.com/terms
HTTP/1.1 200 OK
Cache-Control: private
Content-Length: 26085
Content-Type: text/html; charset=utf-8
Set-Cookie: ASP.NET_SessionId=wscxgkryniqa0qd3dmukjpxe; path=/; HttpOnly
Date: Fri, 07 Mar 2014 16:03:09 GMT
~# curl -I http://tn.hollaroo.com/content/aux/hollaroo/index.html
HTTP/1.1 200 OK
Cache-Control: public
Content-Length: 26400
Content-Type: text/html
Last-Modified: Fri, 07 Mar 2014 15:44:21 GMT
Accept-Ranges: bytes
ETag: "e076f21b1c3acf1:0"
Date: Fri, 07 Mar 2014 16:03:32 GMT
The /terms url does not require login.
According to my IIS log AND my own log in the app, I do get hits from Facebook and I do return data:
IIS LOG:
2014-03-07 15:05:44.590 /terms - "D:\WEBS\Edge\terms" 200 "DEMO1" - 0 0 225
2014-03-07 15:05:54.605 /terms fb_locale=en_GB "D:\WEBS\Edge\terms" 200 "DEMO1" - 0 0 267
APP LOG:
Url | UserId | IPAddress | Browser | At
/terms | NULL | 173.252.100.117 | facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php) | 2014-03-07 15:05:44.263
/terms?fb_locale=en_GB | NULL | 173.252.100.113 | facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php) | 2014-03-07 15:05:54.387
I am not disallowing web crawlers or blocking fb's IP.
Thank you very much for your help!

Facebook gives me a "URL requested a HTTP redirect, but it could not be followed." but the browser has no issue following it, how can I fix that?

My URL is http://citynomads.com, which redirects to http://www.citynomads.com, and the browser follows the redirect perfectly well, but Facebook's linter is telling me it can't follow the link. I can't see why it's having an issue with this:
$ curl -i citynomads.com
HTTP/1.1 302 Moved Temporarily
Server: nginx/1.0.8
Date: Sat, 28 Apr 2012 16:12:38 GMT
Content-Type: text/html
Content-Length: 160
Connection: keep-alive
Location: http://www.citynomads.com/
<html>
<head><title>302 Found</title></head>
<body bgcolor="white">
<center><h1>302 Found</h1></center>
<hr><center>nginx/1.0.8</center>
</body>
</html>
Linter: https://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Fcitynomads.com
How can I rectify that?

I think this is a FB linter bug. Your linter link now works (it has some other bizarre warnings, but not a failed redirect...)
This happened to me too, but posting the same URL to Facebook does generate a proper preview, so I'm guessing here...
Wouldn't be their first bug, nor the last...
HTH