Google Webmaster Tools claims my robots.txt is blocking almost all of my site - google-search-console

I have submitted a sitemap with many thousands of URLs, but Webmaster Tools claims that 9800 of my URLs are blocked by my robots.txt file.
What am I supposed to do to convince it that nothing is being blocked?

Sometimes this just means that the robots.txt file couldn't be reached (it returned a 5xx server error, or Googlebot simply got no response). In those cases, Google treats any URL it attempts to crawl as being disallowed by robots.txt. You can see that in the Crawl Errors section in Webmaster Tools (under Site Errors at the top).
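If you want to check this from your own side, here is a minimal sketch using Python's standard library (the URL is a placeholder for your own site): it prints the HTTP status of your robots.txt, and a 5xx or no response at all is exactly the situation described above.

```python
# Minimal sketch: verify that robots.txt is reachable and returns HTTP 200.
# "https://example.com/robots.txt" is a placeholder for your own site.
import urllib.request
import urllib.error

url = "https://example.com/robots.txt"
try:
    with urllib.request.urlopen(url, timeout=10) as resp:
        print(resp.status, resp.headers.get("Content-Type"))
        print(resp.read(500).decode("utf-8", errors="replace"))
except urllib.error.HTTPError as e:
    # A 5xx here is exactly the case where Google treats every URL as disallowed.
    print("HTTP error:", e.code)
except urllib.error.URLError as e:
    # No response at all (DNS failure, timeout, refused connection).
    print("Could not reach robots.txt:", e.reason)
```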

Related

Prevent hotlinking but allow Facebook - MaxCDN

I've recently enabled hotlink protection on MaxCDN, using what is known as Referer Access Control whitelisting.
I've whitelisted my own domains and my CDN domain, and it's working very nicely. However, when I try to share an image on social media, the og:image is not picked up correctly.
Using the Facebook Debugger, I can see that an error is being thrown for the og:image:
"Provided og:image URL, https://cdn.collectiveray.com/images/webdesign/web_design_blogs.jpg could not be processed as an image because it has an invalid content type."
I believe the problem is that the Facebook crawler is not in the whitelist. I've allowed facebook.com, *.facebook.com, fbcdn.com, *.fbcdn.com, fbcdn.net, and *.fbcdn.net, yet I am still unable to resolve the above error.
Does anybody know exactly which domains to whitelist so that social networks, Facebook and others, can access the images directly via their CDN URL?
TIA
David
I don’t think the Facebook scraper sends any referrer.
But you can identify it by the User-Agent it sends; see https://developers.facebook.com/docs/sharing/webmasters/crawler
Details on how to set this up for MaxCDN are here: https://www.maxcdn.com/one/tutorial/blank-referers-social-networks/
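Before changing the CDN configuration, it can help to reproduce what the scraper sees. A quick sketch along those lines (Python standard library; the image URL is the one from the question and the User-Agent string is the one Facebook documents for its crawler): request the image with that User-Agent and no Referer, and check the Content-Type that comes back.

```python
# Sketch: mimic the Facebook scraper (its documented User-Agent, no Referer)
# and inspect the response from the CDN.
import urllib.request
import urllib.error

url = "https://cdn.collectiveray.com/images/webdesign/web_design_blogs.jpg"
ua = "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
req = urllib.request.Request(url, headers={"User-Agent": ua})
try:
    with urllib.request.urlopen(req, timeout=10) as resp:
        # If hotlink protection blocks the scraper, Content-Type is often
        # text/html (an error page) instead of image/jpeg, which matches
        # the "invalid content type" message in the debugger.
        print(resp.status, resp.headers.get("Content-Type"))
except urllib.error.HTTPError as e:
    print("Blocked:", e.code, e.headers.get("Content-Type"))
```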

How to check whether my domain is blacklisted on Facebook?

I have a Facebook page for my blog, but whenever I schedule posts with links to my blog articles, Facebook automatically deletes the scheduled post, and sometimes it is simply never published. Could the reason be that Facebook has declared my domain name spam? How can I check?
Use the Facebook sharing debugger.
If you input a URL that redirects to the blocked URL, it will initially say "This webpage contains a blocked URL." If you input the blocked URL directly, it will say "We can't review this website because the content doesn't meet our Community Standards. If you think this is a mistake, please let us know." You also get this second message if you input a URL that redirects to the blocked URL a second time.
https://developers.facebook.com/tools/debug/sharing/
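Since the message differs depending on whether you test the blocked URL itself or something that redirects to it, it can help to know where your URL actually ends up. A small sketch (assuming the third-party requests package; the URL is a placeholder) that prints the redirect chain:

```python
# Sketch: print the redirect chain of a URL before pasting it into the debugger,
# so you know whether you are testing the original URL or its final destination.
import requests

resp = requests.get("https://example.com/my-post", allow_redirects=True, timeout=10)
for hop in resp.history:
    print(hop.status_code, hop.url)
print("final:", resp.status_code, resp.url)
```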
It is often hard to tell the reason for this, but you can get out of it by following the proper steps.
First, check whether your domain is actually banned or just penalized. It may also be neither, and the problem may simply be with the site itself.
You can check this with an online tool such as isitban.com by entering your domain or website URL.
If the domain is banned, review your website content and remove anything that violates Facebook's Community Standards.
Once you are done cleaning up the content, submit your website to Facebook again for review.

Facebook page URL blocked by Facebook

I'm getting this error message when I try to share my page in a post or in a comment:
the content you're trying to share includes a link that our security systems detected to be unsafe
Use the Facebook URL Debugger to find out what on your page is being detected as unsafe. Usually you can find some helpful warnings there. If there is nothing suspicious, you've probably hit a false positive, and you have two options:
Report a bug to the Facebook team
Use dynamic URLs that are unique for each post (see the sketch below)
This may also happen when the URL has received too many user reports (visible in the dashboard); in that case you should use dynamic URLs only.
And, of course, be good ;) Do not spam users, throttle automatic posts if you have any, and make sure your page content is appropriate for most age groups.
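For the dynamic-URLs option, a minimal sketch of one way to do it (Python; the base URL and the "s" parameter name are just examples): append a random token as a query parameter so that each post links to a technically unique URL for the same page.

```python
# Sketch: make each shared URL unique by appending a random token.
import uuid
from urllib.parse import urlencode, urlparse, urlunparse, parse_qsl

def dynamic_share_url(base_url: str) -> str:
    parts = urlparse(base_url)
    query = dict(parse_qsl(parts.query))
    query["s"] = uuid.uuid4().hex[:8]  # unique token per post
    return urlunparse(parts._replace(query=urlencode(query)))

print(dynamic_share_url("https://example.com/article"))
# e.g. https://example.com/article?s=3f9c1a7b (different on every call)
```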

Redirect Google bots, but show a 'Moved' page to visitors?

When moving a site to a new domain name, how can I show a 'This page has moved.' message to visitors rather than redirecting them automatically, while at the same time telling Google and other bots that the site has moved, to maintain SEO?
Or what is the best alternative? User agent cloaking isn't what I'm looking for.
What about the canonical meta tag? It seems like each page would need its own, and the content on those pages would need to be nearly the same, but I guess you could have a box pop up telling the user "we have moved" or something. Is this a viable alternative, or are there any complications?
Thanks.
You can provide different content depending on the user agent specified in the HTTP request header. You can find an overview of Googlebot user agents here: http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=1061943. What language / framework are you using?
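If it happened to be a Python app, a minimal sketch of that user-agent switch (assuming Flask and a hypothetical new domain) could look like the following; the same idea translates to any framework that exposes the request headers.

```python
# Minimal sketch (assuming Flask and a hypothetical new domain): send crawlers a
# permanent redirect, and show human visitors a "we have moved" page instead.
from flask import Flask, request, redirect

app = Flask(__name__)
NEW_DOMAIN = "https://new.example.com"          # hypothetical new site
BOT_TOKENS = ("googlebot", "bingbot", "slurp")  # simple User-Agent substrings

@app.route("/", defaults={"path": ""})
@app.route("/<path:path>")
def moved(path):
    ua = request.headers.get("User-Agent", "").lower()
    target = f"{NEW_DOMAIN}/{path}"
    if any(token in ua for token in BOT_TOKENS):
        return redirect(target, code=301)       # tell bots the move is permanent
    # Human visitors see a notice with a link instead of an automatic redirect.
    return (
        f"<h1>This page has moved.</h1>"
        f'<p>You can now find it at <a href="{target}">{target}</a>.</p>',
        200,
    )
```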

How to stop users accessing the robots.txt file on my website?

I need to stop users from accessing the robots.txt file on my website. I am worried that if I add a 301 redirect for robots.txt in .htaccess, Google may discard the robots.txt, so please advise me on this.
You can't stop users from accessing your robots.txt directly; it has to stay publicly readable for crawlers to obey it. You can use .htaccess to block visitors from a specific region or with a specific browser version, etc., but that won't hide the file from anyone determined to read it.