I need to stop users from accessing the robots.txt file on my website. I'm concerned that if I add a 301 redirect for robots.txt in .htaccess, Google may discard the robots.txt. Please advise me on this.
You can't stop users from accessing your robots.txt directly, but you can use .htaccess to block users from a specific region, with a specific browser version, and so on.
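The closest you can get is something along these lines: a rough .htaccess sketch that refuses robots.txt unless the request identifies itself as one of the crawlers you name. The bot names are only examples, and the User-Agent header is trivially spoofed, so treat this as an obstacle rather than real protection:

RewriteEngine On
# Deny robots.txt to anything that does not identify itself as one of
# the listed crawlers. User agents can be faked, so this only deters
# casual visitors.
RewriteCond %{HTTP_USER_AGENT} !(Googlebot|Bingbot) [NC]
RewriteRule ^robots\.txt$ - [F]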
I have a Facebook page for my blog, but whenever I try to schedule posts with my blog article links on the Facebook page, Facebook automatically deletes my scheduled post, and sometimes it simply doesn't get published. Could the reason be that my domain name has been flagged as spam by Facebook? How can I check this?
Use the Facebook sharing debugger.
If you input a URL that redirects to the blocked URL, it will initially say "This webpage contains a blocked URL." If you input the blocked URL directly, it will say "We can't review this website because the content doesn't meet our Community Standards. If you think this is a mistake, please let us know." You also get this message if you put in a URL that redirects to the blocked URL a second time.
https://developers.facebook.com/tools/debug/sharing/
It is often hard to tell the reason for this, but you can definitely get out of it by following the right steps.
First, check whether your domain is actually banned or just penalized, or whether it is neither and something else on your site is the problem.
You can check this with an online tool such as isitban.com by entering your domain or website URL.
If it turns out to be banned, review your website content and remove anything that violates Facebook's Community Standards.
Once you are done cleaning up the content, submit your website to Facebook again for review.
I am building a site on which I have disabled hotlinking of images. But after I added Facebook's "Like" button to my pages, I realized that I want to allow hotlinking for Facebook: if a user likes a page on my site, Facebook should be able to show a related thumbnail of the page on the user's profile. So I added an exclusion rule in IIS like
if {HTTP_REFERER} matches pattern ^(https?://)?(\w+.)*facebook.(com|net)(/.*)*$ , allow.
Alas, it didn't work for me.
After that I googled for an answer. A forum post suggested using "tfbnw" instead of facebook, so I added that exclusion, too:
^(https?://)?(\w+.)*tfbnw.(com|net)(/.*)*$
But as you might expect, still no chance.
So I don't know which URL Facebook uses to request images when a user clicks the Like button. I would appreciate any help in uncovering this mystery, so that I can allow that URL on my site.
Note: If I disable hotlinking protection, everything works fine. So we know that my problem is just the hotlinking protection.
Can you try whitelisting by IP address? All of FB's crawlers should come from one of the IP addresses returned by
whois -h whois.radb.net '!gAS32934'
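The question is about IIS, but for comparison, here is a rough Apache/.htaccess sketch of skipping the referer check for those crawler addresses. The 203.0.113.x prefix is only a placeholder and example.com stands in for your own domain, so substitute the real ranges from the whois output:

RewriteEngine On
# Let requests from the crawler's IP ranges fetch images regardless of
# referer. REMOTE_ADDR is matched as a regex here, not as CIDR, so each
# range has to be written out as a prefix.
RewriteCond %{REMOTE_ADDR} ^203\.0\.113\.
RewriteRule \.(jpe?g|png|gif)$ - [L]
# Normal hotlink protection for everyone else.
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
RewriteRule \.(jpe?g|png|gif)$ - [F]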
Try allowing the domain fbcdn.net:
^(https?://)?(\w+.)*fbcdn.(com|net)(/.*)*$
This is Facebook's content delivery network.
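Again for comparison with the IIS rule, an Apache/.htaccess version of a referer whitelist covering the domains mentioned in this thread might look roughly like the following. It also lets blank referers through, since hotlink filters that require a referer tend to block more than intended; example.com is a placeholder for your own domain:

RewriteEngine On
# Block image hotlinking unless the referer is empty, our own site,
# or one of the Facebook-related domains discussed above.
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
RewriteCond %{HTTP_REFERER} !^https?://([^/]+\.)?(facebook\.com|tfbnw\.net|fbcdn\.net)/ [NC]
RewriteRule \.(jpe?g|png|gif)$ - [F]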
When moving a site to a new domain name, how can I show a 'This page has moved.' message to visitors and not automatically redirect, but at the same time tell Google and other bots that the site has moved, to maintain SEO?
Or what is the best alternative? User agent cloaking isn't what I'm looking for.
What about the canonical meta tag? It seems like each page would need its own, and the content on those pages would need to be nearly the same, but I guess you could have a box pop up saying "we have moved" to the user or something. Is this a viable alternative, or are there any complications?
Thanks.
You can provide different content depending on the user agent specified in the HTTP request header. You can find an overview of Googlebot user agents here: http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=1061943. What language / framework are you using?
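One way to express that idea at the web-server level (keeping in mind that treating bots differently from human visitors is essentially cloaking and carries some risk) is a rough .htaccess sketch on the old domain like the one below; newexample.com is a placeholder for the new domain and the bot names are only examples:

RewriteEngine On
# 301-redirect recognised crawlers to the new domain; everyone else
# keeps seeing the old page with its "we have moved" notice.
RewriteCond %{HTTP_USER_AGENT} (Googlebot|Bingbot|Slurp) [NC]
RewriteRule ^(.*)$ https://newexample.com/$1 [R=301,L]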
Almost all of my website's pages are restricted to users; anyone who is not a registered user and not logged in cannot access the pages at all.
Will search engine spiders still be able to list, crawl, and index these user pages, given that they won't be logging in?
Is there a better way, with this setup, to get better indexing?
If you have content that is not user-specific and you want search engines to crawl it, you should make it accessible without a login.
A spider doesn't log in, so no, it can't see protected pages. You should also not try to make those pages visible to spiders while hiding them from visitors who are not logged in; search engines will penalize your site for that, or even stop indexing it.
You could however show an excerpt of the page to users who aren't logged in, like the first X words. Spiders crawling the page will be able to read that text.
I would rather it just display nothing, or another website. I don't want it to display any website that I am affiliated with.
Use a URL shortener, such as bit.ly, to create the link. That way you're once removed; there may still be a trace, but it won't be as obvious in the referrer.
Use a short URL in your web link, such as one from TinyURL.
Users see the tiny URL instead of the real URL, so they can't see the website that you are affiliated with.
If you send visitors through a redirect on an HTTPS page to a plain HTTP URL, no referrer will be sent; browsers drop the Referer header when going from a secure page to a non-secure one to avoid leaking information. This will save you from relying on any third parties.
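Along the same lines, if you control the page that contains the link and only care about modern browsers, a Referrer-Policy response header can suppress the referrer without any redirect or shortener. A minimal .htaccess sketch, assuming mod_headers is enabled:

# Ask browsers not to send a Referer header for navigations away from
# pages served with this header (honoured by modern browsers).
Header set Referrer-Policy "no-referrer"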