Can I exclude folders from a 301 redirect so that it only affects part of the site?

I have a question about applying redirects while excluding some folders. Here is my case:
My domain example.com is published in several languages:
example.com/es for content in Spanish
example.com/it for content in Italian
example.com/en for content in English
On my old website, the English content lived under "/" (the root of the project), but it has now been migrated to example.com/en, leaving "/" without any content.
I want to apply a rule so that all the English content indexed by Google under example.com/ is redirected to example.com/en, without affecting the content under the other languages:
example.com/content1 redirected to example.com/en/content1
while keeping these unaltered:
example.com/es/content1es
example.com/it/content1it
...
That is, as indicated at the beginning, excluding the language folders and their child content. Is this possible?
Thank you very much
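A minimal .htaccess sketch of such a rule, assuming an Apache server with mod_rewrite (the question does not say which server is in use). The language folders and everything below them are excluded before the catch-all redirect fires:

RewriteEngine On
# Leave the language folders (and anything below them) untouched
RewriteCond %{REQUEST_URI} !^/(es|it|en)(/|$)
# Permanently redirect everything else into /en/
RewriteRule ^(.*)$ /en/$1 [R=301,L]

On nginx or another server the same exclusion logic applies; only the syntax differs.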

Related

Using 301s for only some content on a site that will eventually delete those pages?

My company is migrating content from one section "A" of its gigantic website to a new website "B" which is focused only on this type of content. Eventually, they want to delete this old section "A" completely but the gigantic website will otherwise remain.
Is it worth using 301 redirects to help users get to the new site B, rather than pulling the rug out one day, and to retain some of the PageRank?
What about when that section A on the old site is completely deleted? Will the 301 rank stay with the new site?
Yes, it's good practice to create 301 redirects. It's good for both users and search engines. It also shows that you're not duplicating content, and search engines really don't like websites that "copy-paste" content from others.
As soon as search engines detect the 301, the ranking will carry over and they won't consider it duplicated content.
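As an illustration only, such a mapping could look like this in Apache .htaccess, assuming mod_rewrite and made-up names (/section-a/ for the old section, newsite.example for site B; neither is given in the question):

RewriteEngine On
# Permanently map every URL under the old section to the new site,
# preserving the rest of the path
RewriteRule ^section-a/(.*)$ https://newsite.example/$1 [R=301,L]

Keeping the redirects in place even after section A's pages are deleted is what lets the old URLs keep passing their value to site B.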

What happens when GET robots.txt returns an unrelated html file?

I have a web server capable of serving the assets of various web apps. When a requested asset doesn't exist, it sends back index.html. In other words:
GET /img/exists.png -> exists.png
GET /img/inexistent.png -> index.html
This also means that:
GET /robots.txt -> index.html
How will Google (and other) crawlers handle this? Will they detect that robots.txt is invalid and ignore it (the same as returning 404)? Or will they penalize my ranking for serving an invalid robots.txt? Is this acceptable, or should I make a point of returning 404 when the app I'm serving has no robots.txt?
Every robots.txt handler that I know of deals with invalid lines by simply discarding them. So an HTML file (which presumably does not contain any valid robots.txt directives) would effectively be treated as if it were a blank file. This is not really part of any official standard, though. The (semi-)official standard assumes that any robots.txt file will contain robots.txt directives; behavior for a robots.txt file that contains HTML is undefined.
If you care about crawlers, your bigger problem is not that you serve an invalid robots.txt file; it's that you have no mechanism to tell crawlers (or anyone else) when a resource does not exist. From the crawler's point of view, your site will contain some normal pages plus an infinite number of exact copies of the home page. I strongly encourage you to find a way to change your setup so that resources that don't exist return status 404.
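One way to keep a single-page-app fallback while still returning real 404s, sketched for Apache with mod_rewrite (the question does not name the server, so treat this as an illustration of the idea rather than a drop-in fix):

RewriteEngine On
# Serve files and directories that actually exist
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]
# Anything that looks like a file (has an extension) but is missing: real 404
RewriteRule \.[A-Za-z0-9]+$ - [R=404,L]
# Extensionless paths are treated as app routes and get the app shell
RewriteRule ^ index.html [L]

This keeps deep links into the app working while missing assets such as /img/inexistent.png and /robots.txt return 404.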

N2CMS - Multilanguage site and SEO friendly URLs

I have a N2 website with 2 languages: English and Serbian.
I want English content to point to mydomain.com/en
and Serbian content to point to mydomain.com/sr
How can I do that?
Today, the URL for the English start page is mydomain.com,
and for the Serbian start page: mydomain.com/Start/Index?page=127
None of the pages in my second language have SEO-friendly URLs.
Any help would be greatly appreciated!
You need to organize the structure of your site in the following way:
Root
  Language Intersection
    Start Page EN
    Start Page RS
Then, in web.config or the config file in App_Data, set the start page to the ID of the Language Intersection item. A common cause of non-friendly URLs is that only pages below the site's start page (as defined in web.config) get friendly URLs; hence, you must move the start page above each of the language start pages.
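For illustration, the relevant fragment might look like this, assuming N2's host configuration element and a made-up item ID of 2 for the Language Intersection (look up the real ID in the management console):

<!-- web.config: point N2 at the Language Intersection item -->
<n2>
  <host rootID="1" startPageID="2" />
</n2>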

How do I redirect a multi language site to a default language page?

I have this:
http://example.com/EN/index.php - english version
http://example.com/PT/index.php - portuguese version
What I want: http://example.com/ to go to the Portuguese page by default.
I've also been thinking of having the Portuguese pages in the root and English in the EN/ directory. Is this better for SEO?
I am not sure if I understand the problem, but you can create an index.php in your root that sends a redirect header to forward the user to PT/index.php:
header("Location: /PT/index.php"); // 302 by default; pass header(..., true, 301) for a permanent redirect
exit; // stop the script after sending the redirect header
Later you can add an if/else check on the browser's language, etc.; see the sketch below.
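A minimal sketch of such a browser-language check, assuming the two directories EN/ and PT/ from the question and the Accept-Language request header:

<?php
// Send English browsers to EN/, everyone else (including browsers
// without an Accept-Language header) to the Portuguese default.
$accept = isset($_SERVER['HTTP_ACCEPT_LANGUAGE']) ? $_SERVER['HTTP_ACCEPT_LANGUAGE'] : '';
$target = (stripos($accept, 'en') === 0) ? '/EN/index.php' : '/PT/index.php';
header('Location: ' . $target);
exit;

This only looks at the first language in the header; a production version would parse the full q-weighted list.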

robots.txt: Disallow bots to access a given "url depth"

I have links with this structure:
http://www.example.com/tags/blah
http://www.example.com/tags/blubb
http://www.example.com/tags/blah/blubb (for all items that match BOTH tags)
I want Google & Co. to crawl all links that have ONE tag in the URL, but NOT the URLs that have two or more tags.
Currently I use the HTML meta tag robots with "noindex, nofollow" to solve the problem.
Is there a robots.txt solution (one that works at least for some search bots), or do I need to stick with "noindex, nofollow" and live with the additional traffic?
The original robots.txt standard is pretty narrow (no wildcards, a single file at the top level, etc.), so there is no solution that works for every bot. However, several major crawlers, including Googlebot and Bingbot, support a * wildcard extension in Disallow paths, which is enough for your case.
What about disallowing them based on user-agent in your server?
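For bots that honor the wildcard extension, a robots.txt along these lines blocks the two-tag URLs while leaving single-tag URLs crawlable:

User-agent: *
# /tags/blah and /tags/blubb stay crawlable;
# /tags/blah/blubb (two or more tags) is blocked
Disallow: /tags/*/

Bots that don't understand the wildcard will read that Disallow line as a literal path that matches nothing and keep crawling, so the "noindex, nofollow" meta tag remains the fallback for them.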