I had to change a website's domain, moving from www.mydomain.it to www.mydomain.eu. I don't currently have full control over the .it domain, so when it was time to switch I asked the hosting provider to set up a 301 redirect from the .it to the .eu. As a result, every page of the .it website now redirects to the .eu homepage (no 404 errors), because the previous website used query strings as its page URLs.
This is an example link from the previous website:
http://www.mydomain.it/index.php?page=lkr_pg_chisiamo
So what I started getting after the redirect was:
https://www.mydomain.eu/index.php?page=lkr_pg_chisiamo
which gave me back the homepage content as a result and not a 404 error.
The old website had at least 10k links like that one, and each of them now behaves like the link above: I get the homepage content for every one. At first I thought this was a good thing, because I wasn't getting 404 errors, but after digging around on the web I found out it is not good practice, because all of those links might be treated as soft 404 errors.
Before I made the domain change, I created 301 redirects for the most important pages of the website, like this:
RewriteCond %{QUERY_STRING} ^page=lkr_pg_chisiamo$
RewriteRule ^(.*)$ https://www.mydomain.eu/chi-siamo/? [R=301,L]
RewriteCond %{QUERY_STRING} ^page=lkr_pg_contattaci&form_key=25-8124355$
RewriteRule ^(.*)$ https://www.mydomain.eu/contatti/? [R=301,L]
And so on..
I didn't do that for all 10k pages, only the most important ones, so the other links still return the homepage content.
After I did this, I told Google I had changed the domain through the Google Search Console.
After a few weeks I started seeing some results on Google, but after one month I'm still not happy with them; I think I lost rank. I know it could take a while longer for everything to settle, and that I should expect to lose about 3% of my "domain juice" after a domain change, but I am wondering whether I did everything the right way so as not to lose rank.
My concern now is all the links that I wasn't able to redirect by hand, which have been redirected automatically and now return the homepage content. Should I be worried about them?
How should I manage them?
Should I redirect them to another page which is not the homepage?
If yes, is there a way to redirect all those links (just those) even though I have all the other redirects in my .htaccess file?
Was there a better way to redirect all the 10k links of that type? How would I be able to do that?
You asked this 6 months ago, but I hope I can help you.
Add this to your .htaccess:
Options +FollowSymLinks
RewriteEngine On
RewriteBase /
# Match the old host with or without the www. prefix
RewriteCond %{HTTP_HOST} ^(www\.)?mydomain\.it$ [NC]
RewriteRule ^(.*)$ https://www.mydomain.eu/ [R=301,L]
This should redirect all links to your new domain.
About Google Search Console, take a look here: https://support.google.com/webmasters/answer/93633?hl=en
About changing domain, look at: https://support.google.com/webmasters/answer/6033049?hl=en
Hope this helps!
Brhaka
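On the question of what to do with the old links that have no hand-written redirect: one option, offered here only as a sketch, is to let them return 410 Gone rather than the homepage, so they are not treated as soft 404s. The rule below would go in the .eu site's .htaccess after all of the specific RewriteCond/RewriteRule pairs, so it only sees requests those pairs did not catch. It assumes every old page name starts with lkr_pg_, as in the two examples in the question; adjust the pattern if that is not the case.

```apache
# Assumption: all legacy page IDs share the "lkr_pg_" prefix seen in the question.
# Place this AFTER the specific 301 rules; those redirect and stop with [L],
# so only old URLs without a mapping reach this point.
RewriteCond %{QUERY_STRING} (^|&)page=lkr_pg_ [NC]
RewriteRule ^(index\.php)?$ - [G]
```

Swapping [G] for [R=404,L] would answer with 404 Not Found instead of 410 Gone.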
Issue with Facebook not recognizing images due to use of gzip on our servers.
First off, our websites need to use gzip, so turning gzip off isn't an applicable answer. Our servers use gzip by default, which is a good thing, so we need to keep that in place.
I understand that gzipping images might have negligible impact but we are using it nonetheless.
What I'm looking to do (hopefully) is turn off gzip if the website is visited by a Facebook bot and leave gzip enabled otherwise. So when the detected user agent is any of the following...
facebookexternalhit/1.0
facebookexternalhit/1.1
Facebot
...we disable gzip (i.e. SetEnv no-gzip 1, I assume).
We want to do this within each site's .htaccess file.
Is there a way to do this in an .htaccess file? If so, can anyone supply an .htaccess sample?
Appreciate your help.
You should not be gzipping images anyway.
http://gtmetrix.com/enable-gzip-compression.html
Gzip compression won't help for images, PDFs and other binary formats, which are already compressed.
Here is a good sample of mime types that work well with gzip:
application/atom+xml
application/javascript
application/json
application/rss+xml
application/vnd.ms-fontobject
application/x-font-ttf
application/x-web-app-manifest+json
application/xhtml+xml
application/xml
font/opentype
image/svg+xml
image/x-icon
text/css
text/plain
text/x-component
https://github.com/h5bp/server-configs-nginx/blob/3db5d61f81d7229d12b89e0355629249a49ee4ac/nginx.conf#L93
Also see: https://superuser.com/a/139273
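For the .htaccess sample the question asks for, a minimal sketch follows. It assumes compression is done by Apache's mod_deflate (which honours the no-gzip environment variable) and that mod_setenvif is loaded and allowed in .htaccess; if gzip is applied by a proxy or CDN in front of Apache, this will have no effect.

```apache
<IfModule mod_setenvif.c>
    # Case-insensitive match on the User-Agent header; covers
    # facebookexternalhit/1.0, facebookexternalhit/1.1 and Facebot.
    SetEnvIfNoCase User-Agent "facebookexternalhit|Facebot" no-gzip
</IfModule>
```

All other visitors keep receiving compressed responses, since the variable is only set when the User-Agent matches.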
I have the following issue related to URL rewriting. I am sure there must be some good solution to this.
I converted this URL
domainname.com/index.php?page=product&pid=5&proTitle=Samsung Galaxy
After rewrite it looks like this
domainname.com/products/5/Samsung-Galaxy.html
The .htaccess code looks like this.
RewriteRule ^products/(.*)/(.*).html$ index.php?page=product&pid=$1&proTitle=$2 [nc]
The rewrite works fine. However, if I try to access the old URL, i.e. domainname.com/index.php?page=product&pid=5&proTitle=Samsung Galaxy, the page is still accessible and, on top of that, is being crawled by Google and other search engines. If someone tries to access this URL, I want them to be sent to Page Not Found, and the URL should not be picked up by any crawlers.
I am sure there must be a smart way of doing this. Awaiting some valuable suggestions.
Add the following to your .htaccess. It makes the old URL "domainname.com/index.php?page=product&pid=5&proTitle=Samsung Galaxy" return 403 Forbidden.
First, set the pages that are shown for 404 and 403 errors:
ErrorDocument 404 /errormessages/404.php
ErrorDocument 403 /errormessages/403.php
You can change the paths /errormessages/404.php and /errormessages/403.php to whatever you want.
Then add this rule:
<IfModule mod_rewrite.c>
RewriteEngine On
# THE_REQUEST is the original request line, so only direct hits on the
# old query-string URL match, not the internal rewrite from /products/...
RewriteCond %{THE_REQUEST} \?page=product&pid=[^&\s]*&proTitle= [NC]
# Limited to index.php so the 403 error document itself is not blocked
RewriteRule ^index\.php$ - [F,L]
</IfModule>
I don't think you should really make the page inaccessible since people might have bookmarked it, and you're just making it harder to get back to where they were. If your main problem is Google still referencing it, then use rel="canonical" and specify the URL you'd rather have Google use. https://support.google.com/webmasters/answer/139394?hl=en
I am looking to perform a sitewide 301 redirect. The original site is over 15 years old! I understand the concept of making the .htaccess file with the code:
redirect 301 "/old/old.htm" http://www.you.com/new.html
However, will this redirect every page of the old site, or just an individual page? How do I redirect the entire site?
I have a rewrite in .htaccess (Apache mod_rewrite enabled). All pages from the old site,
http://www.old.com and
http://www.old.com/site/index.php?...
redirect to the new site,
http://www.new.com or
http://www.new.com/website/index.php?... (notice that /site/ and /website/ are different names).
Pages requested from the old site at
https://www.old.com (notice the s in https://) get redirected fine, but pages from
https://www.old.com/site/index.php?... do not; they get a 404 error.
The old site is not secure anymore, so neither
https://www.old.com nor
https://www.old.com/site/index.php?... really exists anymore. Still,
https://www.old.com gets redirected, while the URLs with
/site/index.php?... added are not redirected and return a 404 error instead.
Be careful with a 301: a 301 redirect is meant for content that has moved,
e.g. content about making a cake that was at /makeacake.html and is now at /cakes/making-a-cake.html.
What I would recommend is to find the pages that the majority of your users arrive at and redirect those to the relevant new pages or sections. Then just delete the rest and add a custom 404 error page that tells visitors the old content has been moved.
You can also use Google Webmaster Tools to remove pages from their index.
Assuming the old pages don't exist any more (would throw 404-errors), you can do the following: You redirect all the pages that don't exist anymore to the start page. (As specified in the comments below.)
This is the updated .htaccess code you can use to make that happen. The first RewriteCond checks that the requested path is not an existing file, the second that it is not an existing directory. If both hold, you get redirected to the start page, or any other page for that matter.
http://www.example.com/i/am/an/old/page.html and http://www.example.com/i/am/a/different/old/page.html will both redirect to http://www.example.com/
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . / [R=301,L]
The best way to redirect the entire site is by doing domain forwarding through your web server (or web host... most have the option in their control panel).
Domain forwarding is much more efficient than sending 301 redirects back to the client.
Am I right in thinking that your site is on the same domain name but you've changed it structurally?
So, you have a load of old page URLs that have now changed to new URLs (but on the same domain).
For example, you may have had:
www.yourdomain.com/about-us/history.htm
that has now become
www.yourdomain.com/our-history.htm
If that is the case you will more than likely need to set up many 301 redirect rules. It doesn't necessarily mean that you have to set up one rule for every single page change as you can use RegEx to catch pattern changes in the URL structure. As a scale example, I recently set up a htaccess file of 301 redirects for a site with just under 600 changed URLs. There were 70-something 301 Redirect rules in the end.
It's not necessarily a small job but it is doable. Worth it to retain your SEO rankings.
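To answer the original question directly: a Redirect line matches every URL that begins with the given path, not just one page, and appends whatever follows the matched prefix to the target. Below is a sketch for moving a whole site to a new domain. It assumes the old and new sites are on different hostnames (http://www.you.com is the placeholder from the question) and that /old/old.htm is a page you want mapped by hand.

```apache
# Specific pages first: when several Redirects share a context, the first match wins.
Redirect 301 /old/old.htm http://www.you.com/new.html
# Everything else keeps its path: /a/b.htm -> http://www.you.com/a/b.htm
Redirect 301 / http://www.you.com/
```

If the new pages live on the same hostname as the old ones, the catch-all line would loop and must be left out; only the page-by-page or pattern-based rules apply in that case.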
I don't have a favicon.ico, but my browser always makes a request for it.
Is it possible to prevent the browser from making a request for the favicon from my site? Maybe some META-TAG in the HTML header?
I will first say that having a favicon in a Web page is a good thing (normally).
However, it is not always desired, and sometimes developers need a way to avoid the extra payload. For example, an IFRAME would request a favicon without showing it.
Worse yet, in Chrome and Android an IFRAME will generate 3 requests for favicons:
"GET /favicon.ico HTTP/1.1" 404 183
"GET /apple-touch-icon-precomposed.png HTTP/1.1" 404 197
"GET /apple-touch-icon.png HTTP/1.1" 404 189
The following uses data URI and can be used to avoid fake favicon requests:
<link rel="shortcut icon" href="data:image/x-icon;," type="image/x-icon">
For references see here:
https://github.com/h5bp/html5-boilerplate/issues/1103
https://twitter.com/diegoperini/status/4882543836930048
UPDATE 1:
From the comments (jpic), it looks like Firefox >= 25 doesn't like the above syntax anymore. I tested on Firefox 27 and it doesn't work there, while it still works on Webkit/Chrome.
So here is the new one that should cover all recent browsers. I tested Safari, Chrome and Firefox:
<link rel="icon" href="data:;base64,=">
I left out the "shortcut" name from the "rel" attribute value since that's only for older IE, and versions of IE < 8 don't like data URIs either. Not tested on IE8.
UPDATE 2:
If you need your document to validate against HTML5 use this instead:
<link rel="icon" href="data:;base64,iVBORw0KGgo=">
Just add the following line to the <head> section of your HTML file:
<link rel="icon" href="data:,">
Features of this solution:
100% valid HTML5
very short
does not incur any quirks from IE 8 and older
does not make the browser interpret the current HTML code as favicon (which would be the case with href="#")
You can use the following HTML in your <head> element:
<link rel="shortcut icon" href="#" />
I tested this on a forced full refresh, and no favicon requests were seen in Fiddler. (tested against IE8 in compat mode as IE7 standards, and FF 3.6)
Note: this may download the html file twice, so while it works in hiding the error, it comes with a cost.
You can't. All you can do is to make that image as small as possible and set some cache invalidation headers (Expires, Cache-Control) far in the future. Here's what Yahoo! has to say about favicon.ico requests.
If you use nginx:
# skip favicon.ico
#
location = /favicon.ico {
    access_log off;
    return 204;
}
Put this into your HTML head:
<link rel="icon" href="">
This is a bit larger than the other answers, but does contain an actually valid PNG image (1x1 pixel white).
The easiest way to block these temporarily for testing purposes is to open the developer tools in Chrome, either by right-clicking anywhere on the page and clicking Inspect or by pressing Ctrl+Shift+J. Go to the Network tab and reload the page, which will send all the requests your page is supposed to make, including that annoying favicon.ico. You can now simply right-click the favicon.ico request and click "Block request URL".
All of the above answers are for devs who control the app source code. If you are a sysadmin who is figuring out load-balancer or proxy configuration and is annoyed by these favicon.ico shenanigans, this simple trick does a better job. This answer is for Chrome, but I think there should be a similar option you can find in Firefox/Opera/Tor/any other browser :)
You can use .htaccess or server directives to deny access to favicon.ico, but the server will send an access denied reply to the browser and this still slows page access.
You can stop the browser requesting favicon.ico when a user returns to your site, by getting it to stay in the browser cache.
First, provide a small favicon.ico image, could be blank, but as small as possible. I made a black and white one under 200 bytes. Then, using .htaccess or server directives, set the file Expires header a month or two in the future. When the same user comes back to your site it will be loaded from the browser cache and no request will go to your site. No more 404's in the server logs too.
If you have control over a complete Apache server or maybe a virtual server you can do this:-
If the server document root is say /var/www/html then add this to /etc/httpd/conf/httpd.conf:-
Alias /favicon.ico "/var/www/html/favicon.ico"
<Directory "/var/www/html">
<Files favicon.ico>
ExpiresActive On
ExpiresDefault "access plus 1 month"
</Files>
</Directory>
Then a single favicon.ico will work for all the virtual hosted sites since you are aliasing it. It will be drawn from the browser cache for a month after the users visit.
For .htaccess this is reported to work (not checked by me):-
AddType image/x-icon .ico
ExpiresActive On
ExpiresByType image/x-icon "access plus 1 month"
A very simple solution is to put the code below in your .htaccess. I had the same issue and it solved my problem.
<IfModule mod_alias.c>
RedirectMatch 403 favicon.ico
</IfModule>
Reference: http://perishablepress.com/block-favicon-url-404-requests/
Elaborating on previous answers, this might be the shortest solution from the HTML file itself:
<link rel="shortcut icon" href="data:" />
Tested working, no error messages or failed requests on Chrome Version 94.0.4606.81
Just make it simple with :
<link rel="shortcut icon" href="#" type="image/x-icon">
It displays nothing!!!!
In Node.js,
res.writeHead(200, {'Content-Type': 'text/plain', 'Link': 'rel="shortcut icon" href="#"'} );
Personally I used this in my HTML head tag:
<link rel="shortcut icon" href="#" />
I needed to prevent the request AND still have an icon displayed, e.g. in Chrome.
Quick code to try in <head>:
<link rel="icon" type="image/png" sizes="16x16" href="data:image/png;base64,
iVBORw0KGgoAAAANSUhEUgAAABAAAAAQBAMAAADt3eJSAAAAMFBMVEU0OkArMjhobHEoPUPFEBIu
O0L+AAC2FBZ2JyuNICOfGx7xAwTjCAlCNTvVDA1aLzQ3COjMAAAAVUlEQVQI12NgwAaCDSA0888G
CItjn0szWGBJTVoGSCjWs8TleQCQYV95evdxkFT8Kpe0PLDi5WfKd4LUsN5zS1sKFolt8bwAZrCa
GqNYJAgFDEpQAAAzmxafI4vZWwAAAABJRU5ErkJggg==" />
In our experience, with Apache falling over on request of favicon.ico, we commented out extra headers in the .htaccess file.
For example we had
Header set X-XSS-Protection "1; mode=block"
... but we had forgotten to sudo a2enmod headers beforehand. Commenting out extra headers being sent resolved our favicon.ico issue.
We also had several virtual hosts set up for development, and only failed out with 500 Internal Server Error when using http://localhost and fetching /favicon.ico. If you run "curl -v http://localhost/favicon.ico" and get a warning about the host name not being in the resolver cache or something to that effect, you might experience problems.
It could be as simple as not fetching (we tried that and it didn't work, because our root cause was different) or look around for directives in apache2.conf or .htaccess which might be causing strange 500 Internal Server Error messages.
We found it failed so quickly there was nothing useful in Apache's error logs whatsoever and spent an entire morning changing small things here and there until we resolved the problem of setting extra headers when we had forgotten to have mod_headers loaded!
Sometimes this error appears when the HTML contains commented-out code and the browser still tries to look something up. In my case I had commented-out code for a web form in Flask and I was getting this.
After spending 2 hours I fixed it in the following ways:
1) I created a new Python environment, and it then threw an error on the commented-out HTML line. Before this, the only error I saw was "GET /favicon.ico HTTP/1.1" 404.
2) Sometimes I also saw this error when I had duplicate code, such as a Python file existing with the same name. Try removing those too.
If you are not using HTML and it's auto-generated by Flask or some other framework, you can always add a dummy route in the app that just returns dummy text to fix this issue.
Or
.
.
.
you can just add the favicon :)
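E.g. for a Python Flask application:
from flask import Flask
app = Flask(__name__)  # or reuse your existing app object

@app.route('/favicon.ico')
def favicon():
    # Dummy response so the favicon request no longer returns 404
    return 'dummy', 200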
I solved this problem by using the Content-Security-Policy HTTP response header. With it, it is possible to stop the browser from making further requests for certain resource types, such as images (other types are also possible). I added the following header to the response:
Content-Security-Policy: img-src 'none'
The problem is it will block ALL image queries. If your HTML has any image, they won't be loaded. In my case it was very likely a bug in Firefox because the browser was requesting the favicon.ico for a response whose Content-type is text/xml!
It also depends on the browser implementing this feature, since it is enforced on the client side.
Check https://content-security-policy.com for a complete guide on CSP.
Cheers!
You could use
<link rel="shortcut icon" href="http://localhost/" />
That way it won't actually be requested from the server.