I have dev.example.com and www.example.com hosted on different subdomains. I want crawlers to drop all records of the dev subdomain but keep them on www. I am using git to store the code for both, so ideally I'd like both sites to use the same robots.txt file.
Is it possible to use one robots.txt file and have it exclude crawlers from the dev subdomain?
You could use Apache rewrite logic to serve a different robots.txt on the development domain:
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{HTTP_HOST} ^dev\.example\.com$
RewriteRule ^robots\.txt$ robots-dev.txt
</IfModule>
And then create a separate robots-dev.txt:
User-agent: *
Disallow: /
Sorry, this is most likely not possible. The general rule is that each subdomain is treated separately, so each one needs its own robots.txt file.
Subdomains are often implemented as subfolders, with URL rewriting doing the mapping, and in that setup you may want to share a single robots.txt file across subdomains. Here's a good discussion of how to do this: http://www.webmasterworld.com/apache/4253501.htm.
However, in your case you want different behavior for each subdomain which is going to require separate files.
Keep in mind that if you block Google from crawling the pages under the subdomain, they won't (usually) drop out of the Google index immediately. It merely stops Google from re-crawling those pages.
If the dev subdomain isn't launched yet, make sure it has its own robots.txt disallowing everything.
However, if the dev subdomain already has pages indexed, you need to use the robots noindex meta tag first (Google has to crawl the pages to read it). Once the pages have dropped out of the Google index, set up the robots.txt file for the dev subdomain. A Google Webmaster Tools account helps to track this.
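Since both sites share one repository, one way to send the noindex signal without editing every page is the X-Robots-Tag response header, which Google treats like the robots meta tag. The following is only a sketch for a shared .htaccess, assuming mod_setenvif and mod_headers are enabled. The variable name is_dev_host is made up for illustration, and dev.example.com is the host from the question:

```apache
# Mark requests whose Host header is the dev subdomain (is_dev_host is a hypothetical name)
SetEnvIf Host "^dev\.example\.com$" is_dev_host
# Send the noindex header only when that variable is set, so www is unaffected
Header set X-Robots-Tag "noindex, nofollow" env=is_dev_host
```

Crawlers can only see this header while robots.txt still lets them fetch the pages, which is why the Disallow rule should come afterwards.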
I want Google to drop all of the records of the dev subdomain but keep the www.
If the dev site has already been indexed, return a 404 or 410 error to crawlers to delist the content.
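On an Apache setup, a 410 could be returned with mod_rewrite. This is a minimal sketch, assuming mod_rewrite is enabled and that matching on the User-Agent header is an acceptable way to recognise crawlers. The list of bot names is illustrative, not complete:

```apache
RewriteEngine On
# Only act on the dev host ...
RewriteCond %{HTTP_HOST} ^dev\.example\.com$ [NC]
# ... and only for requests that identify themselves as a known crawler (illustrative list)
RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot) [NC]
# [G] answers with 410 Gone and leaves the URL untouched
RewriteRule ^ - [G]
```

Dropping the User-Agent condition would return 410 to every visitor of the dev host, developers included.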
Is it possible to have one robots.txt file that excludes a subdomain?
If your code is completely static, what you're looking for is the non-standard Host directive:
User-agent: *
Host: www.example.com
But if you can use a templating language, it's possible to keep everything in a single file:
User-agent: *
# If the ENVIRONMENT variable is not "production", robots are disallowed from the whole site.
{{ if eq (getenv "ENVIRONMENT") "production" }}
Disallow: /admin/
Disallow:
{{ else }}
Disallow: /
{{ end }}
Let's assume I have a website abc.com with sub-directories x, y and z, so the website URLs look like:
abc.com/x/
abc.com/y/
abc.com/z/
Now let's assume I have a sub-domain for each sub-directory, like below:
x.abc.com
y.abc.com
z.abc.com
Now I want to redirect requests like abc.com/x/ to x.abc.com. I have a lot of directories, so I cannot add a rule for each one.
You can use dynamically configured mass virtual hosting (see the mod_vhost_alias module). You need a wildcard entry in your DNS server for name resolution. For www.x.abc.com with content in /home/x/www:
RewriteEngine on
RewriteMap lowercase int:tolower
RewriteCond %{lowercase:%{HTTP_HOST}} ^www\.([^.]+)\.abc\.com$
RewriteRule ^(.*) /home/%1/www$1
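The answer names mod_vhost_alias, while the rules above use mod_rewrite. For comparison, here is a sketch of the mod_vhost_alias form, assuming the same /home/x/www layout and hostnames of the form www.x.abc.com:

```apache
# Take the server name from the Host header sent by the client
UseCanonicalName Off
# %2 is the second dot-separated part of the hostname, e.g. "x" in www.x.abc.com
VirtualDocumentRoot "/home/%2/www"
```

Both variants belong in the server or virtual host configuration, since RewriteMap and VirtualDocumentRoot are not allowed in .htaccess.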
My DNS is managed by NO-IP, and I self-host my websites on a QNAP NAS, with virtual hosts defined in the QNAP software.
My sites are www.site1.com, www.site2.com and a new www.site3.com, which will work with subdomains created by wildcard from user input in Joomla (e.g. user1.site3.com would map to site3.com/index.php/users/user1). All of them are located in (root) web_folder/site1/...site2/...site3/. The virtual hosts for the other sites work correctly.
Before doing that, however, I am stuck on a QNAP issue: I can't define my wildcard DNS entry *.site3.com in the virtual hosts. I want to do this because when I enter anything.site3.com in the browser, I get the index.html file located in the root folder of my published sites (web/index.html).
What should I try so that the subdomains bypass the root index and go directly to the website folder, where I could place my .htaccess containing the rewrite rules?
I tried with a .htaccess but couldn't work out what to write to make it work.
I also tried defining the wildcard in the QNAP virtual host, but it is not accepted.
What am I getting wrong?
I would like to avoid having one .htaccess in the web folder that redirects every domain/subdomain to its respective subfolder, and then another .htaccess inside my site3 folder that does the rest of the rewrites (subdomain wildcard and user-based).
I'll partially answer my own question. This is what I have managed so far:
RewriteEngine On
Options +FollowSymlinks
RewriteCond %{HTTP_HOST} !^www\.site3\.com
RewriteCond %{HTTP_HOST} !^static\.site3\.com
RewriteCond %{HTTP_HOST} ^(www\.)?(.+)\.site3\.com$ [NC]
RewriteRule ^(.*)$ http://site3.com/index.php/user/%2 [L]
The thing is, I would still like to see user.site3.com in the address bar instead of the full link.
As an update: I am running WordPress multisite. A request for www.site3.com arrives at the QNAP (web or public_html folder), then the NAS virtual host manager redirects it to its actual subfolder (site3).
In this situation, when I enter:
- anything.site3.com, it reads my top-level (web or public_html) index and I can't get to the WordPress subpages (my anything.site3.com page was created from the user end).
- site3.com, it goes to my site3 folder as it should, managed by the QNAP virtual host (I don't have to write any additional .htaccess).
So I need to somehow let my subdomains get past the QNAP redirection and reach my site3 subfolder, and then let the WordPress .htaccess do its job. This has to be done in a way that doesn't cause a redirect loop.
Sorry if my explanation is a bit messy.
I have several small projects I want to host on a single virtual host using Zend Framework. The structure is:
apps/
    testapp1/
        application/
        index.php
    testapp2/
        application/
        index.php
The document root in the virtual host looks like: DocumentRoot /var/www/apps/
I need a universal mod_rewrite rule that will transform URLs like /apps/testapp1/xxx into /apps/testapp1/index.php.
Any help would be greatly appreciated, as I have been trying to resolve this for several hours. Thank you.
If the DocumentRoot for your website is /var/www/apps/, then I guess the URLs would be http://www.example.com/testapp1/ajax instead of http://www.example.com/apps/testapp1/ajax. If so, you need to remove the apps/ part from these rules.
These rules need to be placed in the .htaccess file in the website root folder.
If you already have some rules there, the new rules need to be placed in the correct position, as the order of rules matters.
You may need to add a RewriteEngine On line there if you do not have one yet.
This rule will rewrite /apps/testapp1/ajax into /apps/testapp1/index.php, no other URLs will be affected:
RewriteRule ^apps/testapp1/ajax$ /apps/testapp1/index.php [NC,QSA,L]
This rule will rewrite ALL URLs that end with /ajax into /index.php (e.g. /apps/testapp1/ajax => /apps/testapp1/index.php as well as /apps/testapp2/ajax => /apps/testapp2/index.php):
RewriteRule ^(.+)/ajax$ /$1/index.php [NC,QSA,L]
UPDATE:
This "universal" rule will rewrite /apps/testapp1/xxx into /apps/testapp1/index.php as well as /apps/testapp2/something => /apps/testapp2/index.php:
RewriteRule ^(.+)/([^/\.]+)$ /$1/index.php [NC,QSA,L]
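A common refinement is to leave requests for existing files and directories alone so that static assets are still served. It is sketched here under the assumption that the URLs do not contain the apps/ prefix (as discussed at the start of this answer) and that each application's index.php sits directly in its own folder:

```apache
RewriteEngine On
# Skip anything that exists on disk (images, CSS, index.php itself)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# /testapp1/anything -> /testapp1/index.php, for any application folder
RewriteRule ^([^/]+)/ /$1/index.php [QSA,L]
```

Because index.php exists as a file, the rewritten request fails the first condition and is not rewritten again.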
I have a domain... let's say www.myOldDomain.com. It is currently running a site about local services near you.
I got a new domain (www.myNewDomain.com), which is also going to run a site about local services near you (albeit in a slightly different way). In fact, I want to totally replace www.myOldDomain.com with www.myNewDomain.com, on the same host and using the same hosting account (I don't want to pay for 2 accounts).
I want ANY traffic to www.myOldDomain.com/* to be redirected to www.myNewDomain.com/WelcomeOldDomainers
How would I go about doing that? Would I do it programmatically, or at DNS?
Update: This is on a Windows host.
Linux Hosting
The best way, if you are on Linux hosting, is to use a .htaccess file at the root of your website directory:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^www\.myOldDomain\.com$ [NC]
RewriteRule ^(.*)$ http://www.myNewDomain.com/$1 [R=301,L]
Beware, though: this assumes that the path ($1) on your old domain is the same as the path on your new domain. From reading your post, I'm not sure this is the case.
.htaccess allows you to configure that in a very flexible way, though.
Search Google for "URL Rewriting" and "htaccess redirection".
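If every old URL should land on the single welcome page from the question rather than on a matching path, the substitution can be a fixed URL. A sketch, assuming mod_rewrite is available and that the old domain should match with or without www:

```apache
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www\.)?myOldDomain\.com$ [NC]
# Fixed target; the trailing "?" drops any query string from the old URL
RewriteRule ^ http://www.myNewDomain.com/WelcomeOldDomainers? [R=301,L]
```

The host condition keeps the rule from firing on the new domain, which matters when both domains point at the same hosting account.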
Windows Hosting
You can use the web.config file with IIS 7 or later:
http://learn.iis.net/page.aspx/557/translate-htaccess-content-to-iis-webconfig/
One more warning: be careful not to let the SEO value of your old pages go to waste. Map old pages to new pages as best you can, or you'll lose all the SEO juice you have built up for those old pages.
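The best option is to send an HTTP/1.1 301 Moved Permanently response.
In ASP.NET 4 or later you could call Response.RedirectPermanent("http://www.myNewDomain.com/WelcomeOldDomainers") in Global.asax. A plain Response.Redirect sends a 302 instead, and without the http:// scheme the target is treated as a relative URL.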
For some weird reason, my CMS logs me out if the address bar does not have www before the website name. For example, when I enter xyz.com, it takes me to the website but doesn't show me as logged in, and if I type www.xyz.com it finds the cookie and shows me as logged in.
What I want is that when a user types xyz.com, they go directly (transparently to the user) to www.xyz.com; that is, I want to add www before xyz.com. I tried adding a .htaccess file in the directory where index.php is located, and this is the code in the .htaccess file:
DirectoryIndex index.php
Redirect xyz.com www.xyz.com/index.php
The .htaccess file disappears when I transfer it over FTP with FileZilla.
If you are willing to modify your index.php, you could add the following logic to the top of the file:
/* This is a temporary redirection from mysite.com to www.mysite.com */
if ($_SERVER['SERVER_NAME'] == 'mysite.com')
{
    $redirect = $_SERVER['REQUEST_URI'];
    header('Location: http://www.mysite.com' . $redirect);
    exit; // stop here so the rest of the page is not rendered after the redirect header
}
Try this in the .htaccess file:
Options +FollowSymlinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^xyz\.com$ [NC]
RewriteRule ^/?$ "http://www.xyz.com" [R=301,L]
However, your problem sounds cookie-related. The CMS is probably using a cookie to check the login status, but the cookie domain parameter is 'www.xyz.com' instead of '.xyz.com'.
--- edit ---
I improved the final line of the code a bit (it is tested and working), but as tcp said, mod_rewrite must be enabled. If you can't enable it, try the code that Lobsterm posted, and if you can't do that either, you could try changing the cookie domain parameter from 'www.xyz.com' to '.xyz.com'.
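Note that a pattern of ^/?$ matches only the site root. Here is a variant that keeps the requested path when adding www, as a sketch for a .htaccess file in the document root with mod_rewrite enabled:

```apache
RewriteEngine On
RewriteCond %{HTTP_HOST} ^xyz\.com$ [NC]
# $1 is the requested path without the leading slash (per-directory context)
RewriteRule ^(.*)$ http://www.xyz.com/$1 [R=301,L]
```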
If you want to use rewrites, make sure mod_rewrite is loaded in your Apache conf file and check that the AllowOverride parameter is set either to All or to just the directives you want to allow in .htaccess.
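As a rough sketch of what that looks like in the main server configuration (the module path and the directory /var/www/html are placeholders that vary between installations):

```apache
# Load mod_rewrite (many distributions do this through their own module tooling)
LoadModule rewrite_module modules/mod_rewrite.so

<Directory "/var/www/html">
    # FileInfo permits the mod_rewrite directives in .htaccess;
    # Options permits "Options +FollowSymlinks"
    AllowOverride FileInfo Options
</Directory>
```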
Also, as aletzo said, you probably want your cookie to cover your whole domain, so change the cookie domain from www.example.com to example.com.
Then it won't matter whether users access the site with a www prefix or through a subdomain.
EDIT: Glad you found the answer you were looking for, but if you need to make FileZilla show you .htaccess in the future, use Server -> Force showing hidden files.