How to secure confidential directories from bots? - robots.txt

What is the best way to secure confidential files and directories from bots and crawlers (such as Googlebot)?
Sample directory structure:
/folder/
/public_html/
/includes/ - // Private
/db/config.php - //Private
index.php - // Public
robots.txt - // Public
I know I can add these files and directories to robots.txt and disallow them,
but only some bots obey the rules. Users can also read the robots.txt file and see the location of the confidential files.
Another option is to put these folders and files outside the public_html directory.
So what, in your opinion, is the best way to secure them?

You can't use robots.txt to hide a directory: it is only a request that well-behaved crawlers honor, and it doesn't even prevent a URL from being indexed by Google if other pages link to it.
If you are using Apache, set up a .htaccess file with rules that block or redirect requests for the private paths and return a 404 error page or a 403 access denied.
See this for example http://corz.org/server/tricks/htaccess.php
The other option is to create a .htaccess file in each private folder and add the following line to it (this is Apache 2.2 syntax; on Apache 2.4 the equivalent is "Require all denied"):
deny from all
Hope that helps, 👍

Declare which user agents your exclusions apply to:
User-agent: *
This applies to all bots, or at least to those that honor robots.txt.
Then exclude your paths:
Disallow: /something/
Disallow: /something_else/
Hope this helps.
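
To see how a rule-abiding crawler reads those lines, here is a minimal Python sketch using the standard library's urllib.robotparser. The rules string mirrors the example above; the "ExampleBot" user agent and the tested paths are made up for illustration. It only shows what a polite bot would do. Nothing here stops a client that ignores robots.txt, which is why server-side access rules are still needed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, mirroring the Disallow lines above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /something/
Disallow: /something_else/
"""


def allowed(path, user_agent="ExampleBot"):
    """Return True if a crawler that honors robots.txt may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # The paths below are illustrative only.
    for path in ("/index.php", "/something/secret.php"):
        print(path, allowed(path))
```

The check is purely advisory: it is the crawler, not the server, that decides whether to run it.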

Related

How long does it take for an Amazon S3 object redirect to take effect?

I have a very simple, static web site in an Amazon S3 bucket.
Everything works, or used to work, except that one redirect no longer works.
To be more specific:
I have one object, /A.zip, say, in the bucket.
It used to redirect to /A-v1.1.zip, and that worked.
I have now changed the redirect for A.zip to point to /A-v1.2.zip,
and I have deleted the object A-v1.1.zip.
But A.zip still seems to point to A-v1.1.zip!
Some browsers still load the old A-v1.1.zip when I request A.zip (probably from a cache), and others give me a 404 with the error details "Message: The specified key does not exist. Key: A-v1.1.zip".
I am not using CloudFront.
Also, Cache-Control is set to max-age=120.
Did I make a mistake?
Or is there anything I need to do to make the changed redirect take effect?
(I have already checked the object metadata in S3; it is correct.)
Thanks a lot in advance.

One link on a non-Jekyll GitHub Pages site is not working. Is there a cache or something I need to delete (non-local)?

So I need to put a specific file with specific text at .well-known/acme-challenge/filename, where .well-known is in the publishing directory/source, so that I can obtain my SSL/TLS certificate from Let's Encrypt. However, once I create it, ALL links work except username.github.io/.well-known/acme-challenge/filename. If I rename the .well-known or acme-challenge parts, the links work. I don't know why this is happening. Is there a cache somewhere that I need to delete? I've also tried the links in private browser windows with no cache and no cookies, and they still don't work.

How are Facebook-like websites able to load a profile instead of a directory when a request like facebook.com/profile/username is received?

When facebook.com/profile/{username} is requested, how is the server able to load a page with data for that user instead of looking for a directory named {username} and possibly showing a 404 error?
It's typically achieved using a pattern called "front controller", where all requests are handled by the same file (let's say index.php, speaking specifically about PHP). So all URLs look like this:
facebook.com/index.php/profile/abc
facebook.com/index.php/account
That file serves as the bootstrap for the application: it reads the extra parameters (anything after index.php) and dispatches the request to the appropriate handler/controller.
There are then multiple ways to get rid of that ugly index.php, depending on how you configure your web server (there are loads of questions here on that subject; "htaccess remove index.php from url" is one example).
Read more about it here: https://en.m.wikipedia.org/wiki/Front_controller
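
The dispatch step can be sketched in a few lines. This is a minimal Python illustration of the front controller idea, not how Facebook or any particular framework does it: the route patterns and the handlers (show_profile, show_account) are hypothetical names invented for the example. A single entry point receives every path, matches it against a route table, and calls the matching handler with the captured parameters.

```python
import re


def show_profile(username):
    # Hypothetical handler: a real one would load the user's data.
    return f"profile page for {username}"


def show_account():
    # Hypothetical handler for the account page.
    return "account page"


# Route table: each pattern maps a URL path to a handler.
# Named groups become keyword arguments of the handler.
ROUTES = [
    (re.compile(r"^/profile/(?P<username>[^/]+)/?$"), show_profile),
    (re.compile(r"^/account/?$"), show_account),
]


def dispatch(path):
    """Single entry point: return (status, body) for a request path."""
    for pattern, handler in ROUTES:
        match = pattern.match(path)
        if match:
            return 200, handler(**match.groupdict())
    # No route matched, so the application itself decides to send a 404.
    return 404, "not found"


if __name__ == "__main__":
    print(dispatch("/profile/abc"))
    print(dispatch("/no/such/page"))
```

No directory named after the user ever has to exist: the username is just a parameter extracted from the path.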

URL links in Zend layout automatically prefixed with ~username on server

I'm using ZF for my project and my server directory structure is:
/ROOT
__/APPLICATION
__/Zend library
__/public_html (I put all the contents of the public folder created by ZF here)
__/docs
__library
I have a single .htaccess file, which I put in the public_html folder. There are two issues I need help with.
First,
the URLs I create with $this->url(array('controller'=>'home', 'action'=>'index'),null,true), for example, come out as <a href='/~wethemen/home'>...</a>, where 'wethemen' is my username on the hosting account (I checked this in the page source). That may also be why the requested controller and action are not rendered.
Second,
only the layout is rendered, not the action. My default controller is 'home', and I get this error when I try to access the site:
script:''home'/index.phtml' not found in path (/home1/wethemen/application/views/scripts/).
This is the first time I'm deploying a ZF project to a server. Any help will be greatly appreciated. I can post the contents of index.php and bootstrap.php if needed.
Just put
$controller = Zend_Controller_Front::getInstance();
$controller->setBaseUrl('/your/base/url');
where you find the text "->setRouter", or anywhere in the bootstrap before dispatch() is called.
Edit: if this does not work and you think it is an HTTP server rewrite issue, add the following to your .htaccess file:
RewriteEngine on
RewriteBase /
if your application is accessed at http://yourdomain.com/
or
RewriteEngine on
RewriteBase /your/base/url/
if the application is accessed as http://yourdomain.com/your/base/url/

Redirect to same named page in directory structure before path changes

Say I have a number of HTML pages on my website. I sometimes change the directory structure, so when somebody tries to access a given URL, that URL may no longer exist. The file names don't change, but the paths do.
As far as I know, the server sends the user to a 404 page, which can be customized. Is it possible to customize the page like this?
The user tries oneweb.com/oldpath/page.html, which does not exist.
A customized 404 page is shown.
The 404 page runs a script. IS THIS POSSIBLE?
The script is given the name of the file. WHERE IS THAT NAME STORED?
The script searches the entire directory structure to find page.html. HOW DO I ACCESS THE STRUCTURE?
The file is found and the new URL is stored: oneweb.com/newpath/page.html
A link appears showing the new URL.
Maybe this process is relatively common and I can find some related code or a tutorial?
Are you using Apache? Linux?
If so, add a 404 handler to your .htaccess file:
ErrorDocument 404 /404.php
Then use 404.php to parse the URL. This simple example grabs everything after the last / in the request path, so http://example.com/foo/bar/page.html would put page.html in $url:
$url = basename((string) parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH));
Then use one of the comment example functions in http://php.net/manual/en/function.readdir.php to search your directory and find the file.
Then issue a 301 redirect to the new location:
header('HTTP/1.1 301 Moved Permanently');
header('Location: http://example.com/' . $file_path);
exit;
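
The search step in the middle is the part the answer leaves to the reader. As a rough sketch of the logic only, written in Python rather than PHP, the function below takes the requested URI and a document root, extracts the file name, walks the tree, and returns the new URL path of the first match. The name find_moved_page and its parameters are invented for this example, and it assumes file names are unique across the site; if two directories contain the same file name, whichever is found first wins.

```python
import os
from urllib.parse import urlparse


def find_moved_page(request_uri, doc_root):
    """Return the new URL path for the requested file name, or None.

    `request_uri` is the path the client asked for (query string allowed);
    `doc_root` is the directory served as the site root. Both names are
    illustrative and not taken from the original answer.
    """
    # Keep only the last path segment, ignoring any query string.
    filename = os.path.basename(urlparse(request_uri).path)
    if not filename:
        return None
    # Walk the whole tree and stop at the first file with that name.
    for dirpath, _dirnames, filenames in os.walk(doc_root):
        if filename in filenames:
            found = os.path.join(dirpath, filename)
            relative = os.path.relpath(found, doc_root)
            # Convert the filesystem path back into a URL path.
            return "/" + relative.replace(os.sep, "/")
    return None
```

A 404 handler would call this with the requested URI and send the 301 redirect only when the result is not None, falling back to the normal 404 page otherwise. Walking the whole tree on every miss can be slow on a large site, so caching the results may be worth considering.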