Unlist a subdomain or directory according to robotstxt.org - robots.txt

According to robotstxt.org:
The first answer is a workaround: You could put all the files you
don't want robots to visit in a separate sub directory, make that
directory un-listable on the web (by configuring your server)
How do I configure my server to have an unlisted directory or subdomain?

It depends on the server and its configuration.
As it may be a privacy/security issue to list the content of a folder, most servers will probably not do it by default. Some servers might display folder content only if there is no index.html file.
For Apache, see mod_autoindex.
You can easily test whether your server lists directory content:
Create a folder named test.
Add a dummy.txt file to it.
Visit the URL of this folder, e.g. http://example.com/test/
If you get an error message, your server doesn’t list content. If you see a link list containing dummy.txt, your server does list content.
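The manual check above can also be scripted. Below is a minimal sketch, assuming a listable directory is answered with a 200 status and an HTML page that links to the dummy file. The function name `lists_content` and both sample bodies are hypothetical, and the heuristic itself is an assumption rather than any standard.

```python
import re

def lists_content(status: int, body: str, filename: str = "dummy.txt") -> bool:
    """Heuristic: a listing is a 200 response whose HTML links to the file."""
    if status != 200:
        return False
    # Look for an href attribute that points at the dummy file.
    pattern = r'href\s*=\s*["\']?' + re.escape(filename)
    return re.search(pattern, body, re.IGNORECASE) is not None

# Hypothetical response bodies, standing in for what a server might return.
listing = '<html><body><h1>Index of /test</h1><a href="dummy.txt">dummy.txt</a></body></html>'
forbidden = "<html><body><h1>403 Forbidden</h1></body></html>"

print(lists_content(200, listing))
print(lists_content(403, forbidden))
```

To run this against a live server, the status and body could come from `urllib.request.urlopen`; keep in mind that it raises `urllib.error.HTTPError` for 4xx responses, so the error case has to be caught separately.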

Related

Deploying a Static website on Server

I am trying to build a static website on a GoDaddy server. I created a folder, say Manage, inside public_html, and it contains an index.php. When I try to open this page in a browser with the URL "www.mysite.com/Manage/index.php", it shows a "File not found" 404 error. What mistake might I be making?
There could be a few reasons. If you are using an RHEL-based distribution for your server, you need to edit the master Apache configuration file /etc/httpd/conf.d/userdir.conf and change two things:
Change UserDir disabled to UserDir disabled root
If the line #UserDir public_html is commented out, change it to UserDir public_html
This tells Apache that the directory containing each user's html files is a subdirectory of their home directory called public_html.
You may also need to change the permissions of your user directory and your public_html directory to allow Apache to read and execute inside them. To do this, run the following commands:
sudo chmod o+x /home/myusername
sudo chmod o+rx /home/myusername/public_html
Restart Apache and see if it works.
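For readers who want to see what the two chmod commands actually change, here is a minimal sketch in Python. It assumes a POSIX system and works on a throwaway temporary directory instead of a real /home/myusername; the helper name `add_mode_bits` is hypothetical.

```python
import os
import stat
import tempfile

def add_mode_bits(path: str, bits: int) -> int:
    """Add permission bits to path, keeping the existing ones; return the new mode."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, current | bits)
    return stat.S_IMODE(os.stat(path).st_mode)

# Throwaway stand-ins for the home directory and its public_html subdirectory.
home = tempfile.mkdtemp()
public_html = os.path.join(home, "public_html")
os.mkdir(public_html)

home_mode = add_mode_bits(home, stat.S_IXOTH)                        # like chmod o+x
html_mode = add_mode_bits(public_html, stat.S_IROTH | stat.S_IXOTH)  # like chmod o+rx
print(oct(home_mode), oct(html_mode))
```

The "other" execute bit on the home directory lets the Apache user traverse it, and read plus execute on public_html lets it list and enter that directory. Existing owner and group bits are left alone.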
The source for this knowledge is not my brain. It comes from the wonderful course at Washington University in St. Louis CSE 330 Rapid prototype development.
First, look at the error log. If you use nginx, find it in /var/log/nginx; if you use httpd, in /var/log/httpd.
And what do you mean by "static"? The PHP preprocessor generates HTML from *.php files, so your index.php is not static.
For this case you need to set up a LAMP stack.

Changes in conf/server.xml does not seem to have any effect during runtime

Here's what I know:
When uploading files given by users, we should put them in a folder
outside the deployment folder. Let me call it D:\uploads.
We should (somehow) add that folder (D:\uploads) as a web app context.
Here's what I did:
I upload my files to the folder D:\uploads.
I tried adding the web app context as it's mentioned here, by adding the following line to TOMCAT_DIR/conf/server.xml:
<Context docBase="D:\uploads" path="/uploads"/>
But that doesn't have any effect. When I request http://localhost:8080/uploads/file.png or http://localhost:8080/uploads I get an HTTP Status 404 error.
So what I want to know:
What did I do wrong? How can I add my upload folder to Tomcat?
Is there a better approach when it comes to uploading files?
Because I'm wondering what should I change if I want to deploy my
application to another server where there's no D:\uploads.
Change the docBase attribute. Use D:/uploads (with slash) instead of D:\uploads (with backslash).
When dealing with files in Java, you can safely use / (slash, not backslash) on all platforms.
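As an illustration of the backslash-to-slash conversion, here is a minimal sketch in Python (not Java), using the standard `pathlib` module. The helper name `to_forward_slashes` is hypothetical; the Context line is the one from the question.

```python
from pathlib import PureWindowsPath

def to_forward_slashes(windows_path: str) -> str:
    """Rewrite a Windows path with forward slashes, keeping the drive letter."""
    return PureWindowsPath(windows_path).as_posix()

doc_base = to_forward_slashes(r"D:\uploads")
# Build the Context element from the question with the converted docBase.
context = f'<Context docBase="{doc_base}" path="/uploads"/>'
print(context)
```

`PureWindowsPath` only manipulates the string, so the sketch behaves the same on any operating system and does not require D:\uploads to exist.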
Regarding the differences you mentioned in the comments when starting the Tomcat from the IDE and from bin/startup.bat: It's very likely when you start the Tomcat from the IDE, it is not using the same context.xml your Tomcat is using. Just review the Tomcat settings in the IDE.
How to store uploaded files is a common topic on Stack Overflow. Just look around and you'll be surprised at how popular this topic is.
If you aren't happy storing your files in D:/uploads, or if other servers will need to access the files, you could consider storing them in some location on your network. Depending on your requirements, you can have one dedicated server to store your files or just share the folder that contains the files on your current server. The right decision will always depend on your requirements.

Restricting access to particular URL for sesame server deployed on JBoss (WildFly 8.2)

I have a sesame server running deployed in a WildFly 8.2.0 final container.
How can I restrict access to some particular URLs?
I know I have to edit some XML files (deployment descriptor and some other files) but I don't know which files and where to find them.
I figured it out myself.
Step 1:
Open the openrdf-sesame.war with Total Commander or any file archiver. Go to the WEB-INF folder and open the web.xml file.
Edit the web.xml file by adding constraints, roles and the login-config tag as in this example : http://www.rivuli-development.com/further-reading/sesame-cookbook/basic-security-with-http-authentication/
Save the edited file within the archive and redeploy the openrdf-sesame.war file containing the modified web.xml file.
Step 2:
Go to the WildFly folder and enter the bin directory and run the add-user.bat file.
Choose b) Application User and hit Enter.
Enter a username and a password for the new user.
When you are asked "What groups do you want this user to belong to?", type in one of the roles you have created in the web.xml file and hit Enter.
When asked “is this new user going to be used for one AS process to connect to another AS process?” type “yes” and hit Enter.
And that's all.
Your particular URLs are now restricted.

Where is the web server root directory in WAMP?

Also, is the web server root directory the place where you put your site files and later access them with localhost/file_name in the browser?
If you installed WAMP to c:\wamp then I believe your webserver root directory would be c:\wamp\www, however this might vary depending on version.
Yes, this is where you would put your site files to access them through a browser.
In WAMP the files are served by the Apache component (the A in WAMP).
In Apache, by default the files served are located in the subdirectory htdocs of the installation directory. But this can be changed, and is actually changed when WAMP installs Apache.
The location from where the files are served is named the DocumentRoot, and is defined using a variable in Apache configuration file. The default value is the subdirectory htdocs relative to what is named the ServerRoot directory.
By default the ServerRoot is the installation directory of Apache. However, this can also be redefined in the configuration file, or with the -d option of the httpd command that launches Apache. The value in the configuration file overrides the -d option.
The configuration file is by default conf/httpd.conf relative to ServerRoot. But this can be changed using the -f option of command httpd.
When WAMP installs itself, it modifies the default configuration file to set DocumentRoot c:/wamp/www/. The files to be served need to be located there and not in the default htdocs directory.
You may change this location set by WAMP, either by modifying DocumentRoot in the default configuration file, or by using one of the two command line options -f or -d, which point explicitly or implicitly to a new configuration file that may hold a different value for DocumentRoot (in that case the new file needs to contain this definition, but also the rest of the configuration found in the default configuration file).
Everything suggested by user "mins" is correct, and excellent information.
WAMP 2.5 provides a default Server Configuration display when you enter localhost into your browser. This maps to c:\wamp\www, as described in previous posts. Creating subdirectories under www will cause Projects to appear on this display. A click and you're in your project.
I have various projects under different directory structures, sometimes on shared drives which makes this centralized location of files inconvenient. Luckily, there is a second feature of WAMP 2.5, an Alias, which makes specifying the location of one (or more) disparate web directories quite easy. No editing of configuration files. Using the WAMP menu, choose Apache > Alias directories > Add an Alias.
WAMP has evolved nicely to provide support for a variety of developer preferences.
If you use Bitnami installer for wampstack, go to:
c:/Bitnami/wampstack-5.6.24-0/apache/conf (of course your version number may be different)
Open the file:
httpd.conf in a text editor like Visual Studio Code or Notepad++
Do a search for "DocumentRoot".
You will be able to change the directory in this file.
To check what your root directory is, open Apache's httpd.conf file and search for "DocumentRoot". The location following it is your root directory.
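The search for "DocumentRoot" can be sketched as a small script. This is a minimal sketch under stated assumptions: it only looks at the first uncommented DocumentRoot line and ignores virtual hosts, included files, and variable substitution, all of which a real configuration may use. The function name `find_document_root` and the ServerRoot path in the sample text are hypothetical.

```python
import re
from typing import Optional

def find_document_root(conf_text: str) -> Optional[str]:
    """Return the value of the first uncommented DocumentRoot line, or None."""
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip commented-out directives
        match = re.match(r'DocumentRoot\s+"?([^"]+)"?\s*$', stripped, re.IGNORECASE)
        if match:
            return match.group(1).strip()
    return None

# Hypothetical excerpt standing in for the contents of httpd.conf.
sample_conf = '''
ServerRoot "c:/wamp/bin/apache"
#DocumentRoot "c:/old/htdocs"
DocumentRoot "c:/wamp/www/"
'''
print(find_document_root(sample_conf))
```

With a real installation, the text would be read from the httpd.conf file mentioned above rather than from a string.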
This is the path to the web root directory: c:\wamp\www
You can create different projects by adding different folders to this directory and call them like:
localhost/project1 from the browser
This will run the index.html or index.php inside project1.

Problems with setting the path for Zend framework, needed for Youtube API

I copied & pasted this text here. The editor seems to format some parts randomly. ;)
I downloaded ZendGdata 1.9.6, extracted it & uploaded it to my site's
root folder ..., which I need for use with Youtube API to get videos onto my site.
I must say I’m new to all this, and so I would appreciate taking this into account.
The library folder is at /ZendGdata/library.
The problem I'm having is with step 3 when I follow the instructions
(http://code.google.com/intl/de-DE/apis/gdata/articles/php_client_lib.html#gdata-installation)
for setting it up for that purpose.
Download the Google Data Client Library files.
Decompress the downloaded files. Four sub-directories should be
created:
demos — Sample applications
documentation — Documentation for the client library files
library — The actual client library source files.
tests — Unit-test files for automated testing.
Add the location of the library folder to your PHP path (see the next section)
One of the suggested locations to add the path, apart from the .htaccess file is in php.ini.
My site is on shared hosting. I have no access to the main php.ini file, but I’m allowed to create one if I need one. For Drupal CMS, for some functions, it suffices to place one in the root folder.
I added this line:
include_path=".:/usr/lib/php:/usr/local/lib/php:/home/habaris6/public_html/site.root.folder/ZendGdata/library";
However, when I go to mysite.com/ZendGdata/demos/Zend/Gdata/InstallationChecker.php to test the setup, as mentioned in the
documentation on Youtube, I get the error:
PHP Extension ErrorsTested No errors found
Zend Framework Installation Errors: Tested 0
Exception thrown trying to access Zend/Loader.php using 'use_include_path' = true.
Make sure you include Zend Framework in your include_path which currently
contains: .:/usr/lib/php:/usr/local/lib/php
SSL Capabilities Errors: Not tested
YouTube API Connectivity Errors: Not tested
So my question is: Is that the correct way to “Add the location of the library folder to your PHP path” ?
I’m a bit mixed up.
Someone was saying the php.ini file is only active in the folder where it is located. If that is the case, which of the ZendGdata folders should have it?
As I said, my purpose is to have the Zend Framework properly set up to allow using the Youtube API, something I also have yet to learn to do.
In the Youtube API Google group, I was referred here. The documentation that comes with the downloaded file and at zend.com presupposes that one knows much more than beginners like me do.
Another person suggested I try placing this
$clientLibraryPath = '/home/habaris6/public_html/site.root.folder/ZendGdata/library';
$oldPath = set_include_path(get_include_path() . PATH_SEPARATOR . $clientLibraryPath);
in mysite.com/ZendGdata/demos/Zend/Gdata/InstallationChecker.php
Whereas everything I had tried before failed except for the first test, when I placed the above snippet in the installation checker, I got positive results for every test:
Ran PHP Installation Checker on 2009-12-09T21:16:08+00:00
PHP Extension Errors: Tested: No errors found
Zend Framework Installation Errors: Tested: No errors found
SSL Capabilities Errors: Tested: No errors found
YouTube API Connectivity Errors: Tested: No errors found
Does it mean if I place that snippet in install checker, all scripts needing the library can access it?
If not, please let me know what exactly to place in the self-made php.ini & in which folder(s) it should be.
Should that not work, and I were to use .htaccess files, what exactly, based on the folders mentioned above should be the content & exactly which folders should they be in? I read that the .htaccess files should be placed in each folder. Does it really mean I should place one in each of the ZendGdata folders?
I would be grateful for any guidance enabling me to finally start, after failing to get sufficient responses elsewhere.
Thanks in advance.
It's not necessary to put all the ZendGdata code under your website document root. In fact, as a rule I don't put PHP class libraries in a location that can be accessed directly by web requests, because if there's any way to do mischief by invoking the class files directly, then anyone can do it.
Instead, put libraries outside your document root and then reference them from scripts that are run directly. For example, you could create a directory phplib as a sister to your public_html directory. Then upload the ZendGdata bundle under that phplib directory.
You can set your PHP include path in a .htaccess file. You don't need to create a .htaccess file in every directory, because the directives in any .htaccess file apply to all files and directories under the directory where the .htaccess resides. See http://httpd.apache.org/docs/2.2/howto/htaccess.html for more information.
So I would recommend creating a .htaccess file at /home/habaris6/public_html/site.root.folder containing the following directives:
<IfModule mod_php5.c>
php_value include_path ".:/usr/local/lib/php:/home/habaris6/phplib/ZendGdata/library"
</IfModule>
See http://php.net/manual/en/configuration.changes.php for more info on this.
Note that this assumes your webhosting company allows you to use .htaccess files, and that they allow you to use the php_value directive in .htaccess files. Enabling these options is an Apache configuration and they could have their own policies against that for reasons of performance or security. You should contact them for this answer; no one on the internet can answer questions about your hosting provider's policies.
If you choose to use the set_include_path() PHP function to append a directory to your runtime include path, you need to do this in each file that serves as a landing point for a web request. That is, if you permit a request to be made directly to foo.php then you need to add the code to foo.php. Any files or classes subsequently included by foo.php use the include path you defined.
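The string handling behind appending a directory to the include path can be modeled in a few lines. This is a Python sketch of the logic only, not PHP and not a replacement for set_include_path(); the function name `append_include_path` is hypothetical, the ":" separator assumes a Unix-like host, and the paths are the ones used earlier in this thread.

```python
def append_include_path(current: str, new_dir: str, separator: str = ":") -> str:
    """Append new_dir to a separator-delimited path list unless already present."""
    entries = [entry for entry in current.split(separator) if entry]
    if new_dir not in entries:
        entries.append(new_dir)
    return separator.join(entries)

current = ".:/usr/lib/php:/usr/local/lib/php"
library = "/home/habaris6/phplib/ZendGdata/library"

updated = append_include_path(current, library)
print(updated)
# Calling it again with the same directory leaves the path unchanged.
print(append_include_path(updated, library) == updated)
```

The duplicate check is a design choice of this sketch: it keeps the path from growing if the same bootstrap code runs more than once in a request.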
Note also that whatever method you use to define the include path, it has to take effect before your script tries to load any PHP class files via the include path. The .htaccess method should accomplish this, and if you use the code method you just have to put the code high enough in your PHP script.
I don't use the method of creating a custom php.ini file under each directory within your site document tree. That's a new feature of PHP 5.3.0, not supported by earlier versions of PHP. If you're using Apache you should just use .htaccess for the same effect.