Need to create 410 status code in Wordpress site for all pages in /forums/ folder - http-status-code-410

A friend asked me to fix a WordPress website whose bbPress forum had been overrun by spammers. The WordPress site itself was okay, but the forum had accumulated over 1,500 URLs of spam entries in about 3 months. The site is hosted by GoDaddy, who instructed the person managing it to delete all the forum content. They did, and now Webmaster Tools is showing 1,500 404 errors. I have since deleted and reinstalled the plugins and locked down WordPress so that it does not allow subscribers or comments. Spam visits have diminished tremendously, but the damage is still there.
I have read extensively about returning a 410 Gone status to search engines for those now-deleted spam posts, but nowhere can I find sample code and instructions on where to insert it. I have FTP access and know how to open .htaccess in a text editor. I also know where to find the PHP files and how to get access to them, but I only know enough about PHP to get myself into a lot of trouble.
I found a site that offers a sample of .htaccess code, so I have attempted to plug in my data and am just looking for a 'spot check' from someone who knows the code.
I am keeping everything else in the domain as is, but I need the subfolder http://www.mydomain.com/forums/ and everything in it to return a 410, meaning that all content inside that /forums/ folder is permanently gone and search engines should no longer look for it.
Here is the sample .htaccess file code I found:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com
RewriteRule (.*) http://www.ourdomain.com/$1 [R=301,L]
#
# Folder that exists no more
RewriteRule ^forums/discontinued\.html$ - [G]
#
Redirect 301 /folder/index.html http://www.ourdomain.com/forums/
#
# File that exists no more
Redirect gone /filea.html
Is this the proper code to place inside the .htaccess file on my WordPress site so that search engines stop looking for the now-deleted /forums/ folder and all content that was ever posted in it?
I appreciate all help on this.
Also, where can I find an online tutorial on how to write .htaccess code, for future reference?
Thank you very much for the help. I am grateful.

All you need to do is something like this:
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/forums(/|$)
RewriteRule ^ - [R=410,L]
That will catch any URL whose path starts with /forums and answer it with a 410 Gone status. A 410 is not a redirect, so no target URL is needed; the - means "no substitution".
Also note, if you are modifying an existing .htaccess, it likely already has the RewriteEngine On line somewhere near the top (usually the first non-comment line), so you don't need to have it twice. Put the new rule near the top, above any other RewriteRules. That includes the default lines from WordPress: their catch-all rule sends every request for a file that doesn't exist to index.php and stops processing with [L], so a rule placed below them would never be reached for the deleted forum URLs.

This could also be done with the following single rule (in .htaccess the pattern is matched against the path without its leading slash):
RewriteRule ^forums(/.*)?$ - [G,NC]
This will 410 all the following URLs:
http://www.example.com/forums
http://www.example.com/forums/
http://www.example.com/forums/my-forum-post-1
http://www.example.com/forums/my-forum-post-2/
The [G] flag returns the 410 Gone status, the [NC] flag makes the match case-insensitive, and [L] isn't needed since it's implied by the [G] flag.
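For placement, here is a minimal sketch of how the finished .htaccess might look. It assumes the site still carries the stock WordPress rewrite block, reproduced here as a default install usually writes it; yours may differ. The 410 rule sits above that block because the WordPress catch-all rule ends with [L] and would otherwise hand the deleted forum URLs to index.php first.

```apache
# Answer 410 Gone for /forums and everything beneath it.
# In per-directory (.htaccess) context the leading slash is stripped
# before matching, so the pattern starts with "forums", not "/forums".
RewriteEngine On
RewriteRule ^forums(/.*)?$ - [G,NC]

# BEGIN WordPress
# (assumed stock block; keep whatever your install generated)
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```

One way to check the result is to request a deleted forum URL with curl -I http://www.example.com/forums/my-forum-post-1 and look for 410 Gone in the first line of the response.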

Related

How to redirect all pages in a subfolder to a single page

I recently changed my site's directory structure and also started using PHP to generate file names. All of my previous files have been renamed so that they can be easily accessed via MySQL/PHP. The site works fine. The problem is that existing links on external sites point to the old directory/file, e.g.,
<site>/library_wheels/wheel-1-name.html
<site>/library_wheels/wheel-2-name.html
<site>/library_wheels/wheel-3-name.html
I want any request to any html file in the old library_wheels directory (which no longer exists) to go to a general index page:
<site>/library_wheels.php.
There the visitor can look up the right link.
Or should I put nearly a thousand 301 redirects in the .htaccess file?
Thanks
Mar 22: tried @Marc B's answer; doesn't work.
June 14, 2015: tisantisan's answer worked, but I had to specify the full URL of the destination page:
RewriteRule ^library_wheels/ http://www.bps.lk/library_wheels.php [R=301,L,NC]
Or else @Marc B's answer did work and I must have typed it in wrong!
This works for me:
Options +FollowSymLinks
RewriteEngine On
RewriteRule ^library_wheels/ library_wheels.php [R=301,L,NC]
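If only the old .html pages should be redirected, the pattern can be narrowed. This is a sketch, assuming the .htaccess file sits in the document root and library_wheels.php is served from the root as well. The leading slash on the target makes it a URL path, which may avoid the need to spell out the full URL.

```apache
RewriteEngine On
# Any .html request under the old directory goes to the lookup page.
RewriteRule ^library_wheels/.+\.html$ /library_wheels.php [R=301,L,NC]
```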

Redirect from directory on subdomain to root

Google has indexed subdomain.thedomain.com/directory
I want that to redirect to thedomain.com
What goes in the .htaccess file? I've found all other combos but this
Please help!
Thanks in advance, much appreciated
You did not find the correct documentation. Read this:
http://httpd.apache.org/docs/2.4/en/rewrite/remapping.html#movehomedirs
Update:
As a start, write this into directory/.htaccess:
RewriteEngine on
RewriteRule .* http://thedomain.com/ [R,L,NE]
The .* means anything, and because the rule comes from a .htaccess file, this effectively means anything in this directory.
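In total, this means: redirect anything in that directory to the homepage. Of the flags in brackets, R makes it an external redirect (a 302 unless a status is given), L stops processing further rules, and NE prevents special characters in the target from being escaped.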
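For a permanent move, the same idea can be written with an explicit 301 and placed in the subdomain's root .htaccess instead. This is a sketch that uses the host and directory names from the question as placeholders.

```apache
RewriteEngine On
# Only act on requests that arrive on the subdomain.
RewriteCond %{HTTP_HOST} ^subdomain\.thedomain\.com$ [NC]
# /directory and anything below it goes to the main site's homepage.
RewriteRule ^directory(/.*)?$ http://thedomain.com/ [R=301,L]
```

A 301 tells search engines the move is permanent, which is usually what is wanted when the goal is to get the old URL out of the index.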

how can I override robots in a sub folder

I have a sub-domain for testing purposes. I have set robots.txt to disallow this folder.
Some of the results are still showing for some reason. I thought it may be because I hadn't set up the robots.txt originally and Google hadn't removed some of them yet.
Now I'm worried that the robots.txt files within the individual Joomla sites in this folder are causing Google to keep indexing them. Ideally I would like to stop that from happening, because I don't want to have to remember to turn robots.txt back to follow when they go live (just in case).
Is there a way to override these explicitly with a robots.txt in a folder above this folder?
As far as a crawler is concerned, robots.txt exists only in the site's root directory. There is no concept of a hierarchy of robots.txt files.
So if you have http://example.com and http://foo.example.com, then you would need two different robots.txt files: one for example.com and one for foo.example.com. When Googlebot reads the robots.txt file for foo.example.com, it does not take into account the robots.txt for example.com.
When Googlebot is crawling example.com, it will not under any circumstances interpret the robots.txt file for foo.example.com. And when it's crawling foo.example.com, it will not interpret the robots.txt for example.com.
Does that answer your question?
More info
When Googlebot crawls foo.com, it will read foo.com/robots.txt and use the rules in that file. It will not read and follow the rules in foo.com/portfolio/robots.txt or foo.com/portfolio/mydummysite.com/robots.txt. See the first two sentences of my original answer.
I don't fully understand what you're trying to prevent, probably because I don't fully understand your site hierarchy. But you can't change a crawler's behavior on mydummysite.com by changing the robots.txt file at foo.com/robots.txt or foo.com/portfolio/robots.txt.

Multiple 301 redirects in one line

This is probably easy for people who deal with these regularly, but I'm not sure what kind of code I will need to use to achieve what I want to. I know how to redirect individual URLs to other URLs, but when it comes to redirecting multiple at once I can't do it.
Basically I set up my site structure kinda bad when I built my website. I have a bunch of URLs named:
crafting-alchemist-level-1-10.php
all in the root directory, where alchemist-level-1-10 is the page name and crafting is the site section. I have about 50 of these URLs and I would like to put them all in a /crafting directory with the crafting- cut off the file names.
I could do this individually but there must be a way to do all with a single line. Is there?
These URL redirects need to be compatible with any parameters after the .php too.
Use mod_rewrite in your .htaccess
RewriteEngine On
RewriteRule ^(.*)/(.*)/(.*)$ $1-$2-$3.php
For more information (you will need to customize it a bit):
http://httpd.apache.org/docs/current/rewrite/intro.html#regex
EDIT
This will rewrite one/two/three to one-two-three.php.
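For the direction asked about in the question (old root-level crafting-*.php URLs moving into /crafting/), an external redirect could look like the sketch below. It assumes the .htaccess is in the document root and that the new files keep the rest of the name unchanged. Because the target contains no query string, mod_rewrite passes the original one through, so parameters after .php survive.

```apache
RewriteEngine On
# crafting-alchemist-level-1-10.php -> /crafting/alchemist-level-1-10.php
RewriteRule ^crafting-(.+)\.php$ /crafting/$1.php [R=301,L]
```

The new paths start with "crafting/" rather than "crafting-", so the rule does not match its own target and cannot loop.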

Do I need to have a favicon on my site? How do I get rid of the errors I see in my apache log?

I keep seeing favicon warnings in my apache log. How do I get rid of those? Do I have to have a favicon for my site?
/favicon.ico is one of the artifacts of the Browser Dark Ages (circa 2000). There is no way to prevent the browser requests. Creating a 0-byte file named favicon.ico ends the flow of 404 errors (as the file exists), but browsers will then show no favicon for your site.
Johan Petersson provides a good answer to preventing file not found errors without using a favicon at http://www.trilithium.com/johan/2005/02/no-favicon/
Placing the following code in the Virtual Host section of httpd.conf (or wherever you define your site environment), should stop the errors appearing in the Apache error log:
# Don't bother looking for favicon.ico
Redirect 404 /favicon.ico
# Don't bother sending the custom error page for favicon.ico
<Location /favicon.ico>
ErrorDocument 404 "No favicon"
</Location>
Alternatively, you can create a blank file and name it favicon.ico, placing it in the root directory of the site.
You don't need to, no, but some browsers will request /favicon.ico automatically, so the errors are pretty much unavoidable.
You don't really need it, but as others have said, some browsers will ask for it even if it's not specified in <link rel="shortcut icon" />.
I'm not an expert, but I played with mod_rewrite a bit, and here's what you can do:
# turn on the mod_rewrite module
RewriteEngine On
# if requested file is not an existing file
# (in .htaccess, REQUEST_FILENAME is already the full filesystem path)
RewriteCond %{REQUEST_FILENAME} !-f
# and its name is favicon.ico, send an empty 410 GONE response to the browser
RewriteRule .*favicon\.ico$ - [G]
I only tried this on my localhost: first request resulted in 410, but for all following ones, browser does not ask for that file, because it remembers it's gone.
I'm not sure this is how you're supposed to use 410 GONE status, nor that it will work 100%.
Web browsers use this to display the image you see in your favorites as well as the icon on your tab. For example, when you go to stackoverflow, the icon you see in the tab is automatically fetched by my browser (Chrome) from the URL https://stackoverflow.com/favicon.ico. It's pretty standard, so if you don't want it in your log, you should put some icon in the httpdocs directory and name it favicon.ico.
Looking at your logs, you will probably see such 404 errors:
favicon.ico: Internet Explorer, Chrome...
apple-touch-icon.png: iOS devices, Android, maybe some other devices
apple-touch-icon-precomposed.png: iOS devices, Android, maybe some other devices
apple-touch-icon-76x76.png: iOS, maybe some other devices
apple-touch-icon-120x120.png: iOS, maybe some other devices
apple-touch-icon-152x152.png: iOS, maybe some other devices
If you absolutely don't want to add a favicon to your site, you can apply one of the solutions described in the other answers:
mod_rewrite
Force 404
Empty picture
However, favicons are so common nowadays that you probably want to add one to your site. This favicon generator creates all these files at once. Full disclosure: I am the creator of this site.
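For anyone taking the "Force 404" route anyway, the same idea can cover the touch icon names listed above. This is a sketch for the virtual host or .htaccess, assuming mod_alias is enabled; the pattern is an illustration and may need adjusting to the names that actually appear in your logs.

```apache
# Answer 404 straight away for the icon files browsers and devices probe for.
RedirectMatch 404 ^/(favicon\.ico|apple-touch-icon[^/]*\.png)$
```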
You don't really need it.
However it is used on your site (the warnings).
Check the source of your website to see if it contains:
<link rel="shortcut icon" href="favicon.ico">
In the head section of the page.
Remove that line or add the favicon to prevent errors in your log.