Google unable to crawl my website - google-search-console

Recently a crawl error appeared in my Google Search Console:
Google couldn't crawl your site because we were unable to access your
site's robots.txt file
My robots.txt content:
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: http://www.name.com/sitemap.xml
When I try Fetch as Google, it says:
Temporarily unreachable

A common cause for this error is that the hosting company is blocking Google's IP addresses. You can contact your hosting company and find out whether this is the case.
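One way to narrow it down (www.name.com is the placeholder domain from the question) is to request the robots.txt file yourself, once normally and once with a Googlebot user agent:

curl -I http://www.name.com/robots.txt
curl -I -A "Googlebot/2.1 (+http://www.google.com/bot.html)" http://www.name.com/robots.txt

If both requests return 200 but Fetch as Google still reports the file as unreachable, the block is most likely at the IP level, which only your host can confirm; curl from your own machine cannot reproduce a block on Google's IP ranges.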

Related

Make 'http' request from Github-pages 'https' hosted site

I've hosted my web app on GitHub Pages, so the website is served over HTTPS. But now I want it to make an HTTP request to an external site. (I don't have a custom domain, so I can't change the hosted site to HTTP.)
I'm getting the mixed-content error:
Mixed Content: The page at 'https://username.github.io/MyHostedSite/' was loaded over HTTPS, but requested an insecure XMLHttpRequest endpoint 'http://someHttpApi'. This request has been blocked; the content must be served over HTTPS.
Is there a way to proxy this so I can make an HTTP request from an HTTPS page?
Get a custom domain, and do DNS with Cloudflare (free)
You can turn HTTPS on at Cloudflare (in page rules); then you don't have to worry about GitHub's HTTP/HTTPS settings and mixed content errors.
There are good instructions for setting up a custom domain with GitHub Pages; see Custom domain for GitHub project pages.
You just have to decide if you want to serve your site at the apex domain, or with the www subdomain:
http://example.com
vs
http://www.example.com
Page rules at Cloudflare let you match a URL pattern and force HTTPS on it.
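A rough sketch of the pieces involved, assuming the repository from the question (username.github.io/MyHostedSite) and example.com as a placeholder domain:

CNAME file in the repository root (one line, the host GitHub Pages should answer for):
www.example.com

DNS records at Cloudflare:
CNAME   www   username.github.io
A       @     (the GitHub Pages IP addresses listed in the Pages documentation)

Cloudflare page rule:
http://*example.com/*   Always Use HTTPS

Whether the site is ultimately served over HTTP or HTTPS is then controlled from Cloudflare rather than by GitHub Pages.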

Google cloud platform - Set up a 301 redirect from www

When I was on AWS, setting up a 301 redirect for my website from www.example.com to example.com simply involved creating an S3 bucket as a static website and setting it up to redirect all traffic to example.com.
I don't see this option on Google Cloud Storage, and I can't think of any way to do this with the HTTP load balancer.
Is the only way to do it to patch my backend so it notices addresses that start with www, strips the www, and redirects?
Google has a way of using buckets as backends for the HTTP load balancer.
It is still in alpha, but you can read about it and ask to try it here. Use it with an HTML file that redirects, as suggested here, and my guess is it should work.
Alternatively, I use Cloudflare's free service, which allows for three free redirects, saving you the trouble of configuring redirects in your backend. This can be done with some other CDN services as well, though I don't know which.
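The redirect file mentioned above could look roughly like this (example.com is a placeholder, and note that a meta refresh is a client-side redirect rather than a true 301, so search engines may treat it differently):

<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <!-- client-side redirect to the naked domain -->
    <meta http-equiv="refresh" content="0; url=http://example.com/">
    <link rel="canonical" href="http://example.com/">
  </head>
  <body>
    <a href="http://example.com/">Continue to example.com</a>
  </body>
</html>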

301 redirect for Github Pages and CloudFlare SSL

I am using GitHub Pages to host my domain. The pages are hosted at username.github.io. As per the GitHub Pages documentation, I have put a CNAME file in the root directory pointing to example.com.
In my GoDaddy DNS manager I have added a CNAME record pointing www to username.github.io.
Later I switched to CloudFlare to use its free Universal SSL with my GitHub Pages custom domain.
Currently the CloudFlare DNS Manager includes these two items:
A example.com 192.30.252.153
CNAME www username.github.io
Since I have enabled SSL in CloudFlare and want to redirect HTTP addresses (naked or otherwise) to HTTPS, I have set up a Page Rule for http://*example.com/* with Always Use HTTPS turned on.
Now all variants of the address get redirected to https://example.com (this is my end requirement).
However the 301 redirection from http://www.example.com to https://example.com is happening this way:
http://www.example.com to
https://www.example.com/ to
http://example.com/ to
https://example.com/
This chain of redirects will slow down page loading when a user types www.example.com, and (possibly?) multiple redirects will hurt page ranking in search engines.
So isn't it better to have a single direct 301 instead of several? Or is a chain of redirects what webmasters normally accept in a situation like this?
If not, please guide me on enabling a 301 redirect from http://www.example.com directly to https://example.com/ without any intermediate hops.
You can set Page Rules in CloudFlare and change their order to get the effect you want.
If this is still problematic, you can also enable HSTS, which makes the browser go straight to the HTTPS version after the first visit to the site. This also makes the site more secure by preventing anyone from man-in-the-middling your connections.
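A sketch of a rule order that avoids the chain, assuming Cloudflare's Forwarding URL action and its $-placeholders for wildcards (example.com stands in for the real domain); rules are checked top to bottom and only the first match applies:

1. *www.example.com/*    Forwarding URL (301) to https://example.com/$2
2. http://example.com/*  Always Use HTTPS

With that order, http://www.example.com/anything goes to https://example.com/anything in a single 301 instead of bouncing through the www HTTPS and naked HTTP variants.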

Facebook gives 403 error for my website - Updated Information

I have a website, http://bowarrow.de, where I added Facebook OG tags. No matter what I try and what I change, I always get a 403 error in the debugger.
Yet it can somehow access my site. I read every question about this, and in the last question I asked about it no one could really help me. So I decided to ask on Facebook and got the following:
In this case, your site is definitely returning a 403 error to at
least some of the requests from the debugger. This is something
happening in your code or hosting infrastructure
$ curl -A "facebookexternalhit/1.1" -i 'http://bowarrow.de/'
HTTP/1.1 403 Forbidden
Date: Mon, 03 Jun 2013 16:03:55 GMT
Server: Apache
Content-Length: 2940
Content-Type: text/html
Host Europe GmbH – bowarrow.de [...]
I tried it myself and can confirm that I can't get any access with that Facebook user agent. I asked HostGator several times whether there is a server problem on their side, and they denied it. So maybe it has something to do with Host Europe, where my domain is registered?
I linked the domain to my hosting through A records, because Host Europe doesn't support nameserver changes.
Any ideas, help?
Okay, I've probably found what caused it. The reason was that I use my domain from Host Europe with HostGator. Because Host Europe doesn't allow nameserver changes, I had to change the A records.
Unfortunately there were some AAAA (IPv6) records that I didn't change, because HostGator doesn't support IPv6 on my hosting plan.
Facebook was crawling these IPv6 addresses, and the debugger got a 403 because there was no IPv6 server it could reach.
Yesterday I deleted them and it started working almost immediately. See here: https://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Fwww.bowarrow.de%2F
Unfortunately it only works for the URL with www; without it I still get a 403.
See here: https://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Fbowarrow.de%2F
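For anyone hitting the same thing, a quick way to spot leftover AAAA records (assuming the dig tool is available; bowarrow.de is the domain from the question) is:

dig +short AAAA bowarrow.de
dig +short AAAA www.bowarrow.de
dig +short A bowarrow.de

If an AAAA query returns an address that does not point at a working IPv6 web server, clients that prefer IPv6, including Facebook's crawler, can land on the wrong machine and get responses like this 403.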
For anyone using the 10Web Social Post Feed WordPress plugin, follow these steps:
Go to Facebook Feed WD > Options page and press Uninstall.
Uninstall the plugin by following the steps in this video.
Navigate to the Plugins page and delete Facebook Feed WD.
Reinstall and activate it.
Re-authenticate your Facebook account and recreate your feeds.

Disable googlebot fetching www

I have a www redirect in .htaccess.
So www.example.com gets a 301 redirect to example.com.
But Google still tries to fetch www.example.com as well.
Can I stop Googlebot from fetching www.example.com?
E.g. via Webmaster Tools or robots.txt?
In Google Webmaster Tools you can set your preferred hostname – with or without www, but this comes with no guarantee from Google. As you have 301s from www to non-www, Googlebot will probably respect your wish.
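For reference, the kind of .htaccess rule the question describes might look like this (example.com is a placeholder; assumes mod_rewrite is available):

RewriteEngine On
# send any www request to the bare domain with a 301, keeping the path
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]

As long as that 301 stays in place, Googlebot will still recheck the www URLs from time to time, but it should index only the non-www versions.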