I would like to prevent robots from indexing my website. I created a robots.txt file with this content:
User-agent: *
Disallow: /
Now I need to place this file in the root folder of my website. I tried with my FTP software, but the upload failed. As you can see in the picture below, I tried at the very top level ( / ). Is that correct? Or do I have to upload the file to the /httpdocs folder?
Thanks.
UPDATE
Here is the content of the httpdocs folder
robots.txt should go in the root of your website. From your screenshots, it looks like you should put it in httpdocs.
To check, copy it there and navigate to yoursite.com/robots.txt. If you get a 404, it's in the wrong place. If you see your robots.txt file, you're good to go!
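If you'd rather script that check than open a browser, here is a minimal sketch (TypeScript, Node 18+, run with e.g. npx tsx; yoursite.com is a placeholder for your own domain):

    // check-robots.ts — verify that robots.txt is served where crawlers expect it
    const res = await fetch('https://yoursite.com/robots.txt');
    if (res.ok) {
        console.log(await res.text()); // found: print the rules being served
    } else {
        console.log(`HTTP ${res.status} — the file is not at the site root`);
    }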
httpdocs is the root of your website.
Related
The official SvelteKit docs on SEO mention that a sitemap can be created dynamically using an endpoint. I could not find any other documentation related to the robots.txt file, which can be used to point web crawlers at the sitemap for SEO purposes.
I looked on other forums as well but could not find a solution. I created my robots.txt and placed it at the root of my project ( / ) and in /src as well. When I request the file at nazar-design.com/robots.txt, I am served a 404 error.
Any idea how to fix this?
You can place files in the directory named in your kit.files.assets configuration (which is the /static folder by default) to be served to users as-is.
In your case, placing the file at /static/robots.txt would yield the desired nazar-design.com/robots.txt URL.
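Alternatively, since the docs mention dynamic endpoints, you could serve robots.txt from an endpoint instead of a static file. A minimal sketch (the Allow rule and the Sitemap line are assumptions; adjust them to your needs):

    // src/routes/robots.txt/+server.ts
    import type { RequestHandler } from './$types';

    export const GET: RequestHandler = () => {
        const body = [
            'User-agent: *',
            'Allow: /',
            // hypothetical sitemap reference; drop it if you have no sitemap
            'Sitemap: https://nazar-design.com/sitemap.xml'
        ].join('\n');

        return new Response(body, {
            headers: { 'Content-Type': 'text/plain' }
        });
    };

Both approaches yield the same nazar-design.com/robots.txt URL; the static file is simpler if the rules never change.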
I have uploaded a file to /var/www/html, which is the document root on my server, but when I go to a browser and type ip_to_server/filename I get a 404 Not Found error page. Why?
folder "www" is the root folder by default. Have you changed it?
If you place your file directly in "www" (www/yourfile) and go to "ip_to_server/filename" it should work.
If you have a folder "html" in "www" and place your file under "html" then you'll need to go to "ip_to_server/html/filename" to access it
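Which directory is actually served is decided by the web server's configuration, not by convention. On a stock Apache install (Debian/Ubuntu layout; your distribution may differ), the default vhost looks roughly like this:

    # /etc/apache2/sites-available/000-default.conf (typical default, paths assumed)
    <VirtualHost *:80>
        DocumentRoot /var/www/html
    </VirtualHost>

If DocumentRoot is /var/www/html, then /var/www/html/filename is served at ip_to_server/filename; if it is /var/www (the "www" folder described above), the html/ prefix becomes part of the URL.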
Let's say I have a test folder (test.domain.com) that I don't want search engines to crawl. Do I need a robots.txt in the test folder, or can I just place a robots.txt in the root and disallow the test folder?
Each subdomain is generally treated as a separate site and requires its own robots.txt file.
When a crawler fetches test.domain.com/robots.txt, that is the robots.txt file it will see. It will not see any other robots.txt file.
If your test folder is configured as a virtual host, you need a robots.txt in your test folder as well. (This is the most common setup.)
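In that case, the file served at test.domain.com/robots.txt (i.e. placed in the test folder, assuming that folder is the subdomain's document root) would simply be:

    User-agent: *
    Disallow: /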
But if you route your subdomain's traffic via an .htaccess file, you could modify it to always serve the robots.txt from the root of your main domain (see the sketch after the URLs below).
Anyway, in my experience it's better to be safe than sorry: put a robots.txt file (especially one denying access) in every domain you need to protect, and double-check that you're getting the right file when accessing:
http://yourrootdomain.com/robots.txt
http://subdomain.yourrootdomain.com/robots.txt
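For the .htaccess route mentioned above, one possible sketch (mod_rewrite required; the domain names are placeholders, and note that this issues a redirect, which major crawlers follow for robots.txt):

    # .htaccess in the subdomain's document root (sketch, not tested)
    RewriteEngine On
    # send any request for robots.txt to the main domain's copy
    RewriteRule ^robots\.txt$ http://yourrootdomain.com/robots.txt [R=301,L]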
I'm using Pico CMS, a small Markdown-based project (http://pico.dev7studios.com/). It is installed and running well. However, I am trying to password-protect a folder with an .htaccess file, and the CMS bypasses it and shows the file I request in the browser.
The odd thing is that the URL for the file does not contain the "content" folder, which is where all the files/pages are stored. All the other folders do appear in the URL. This is the only explanation I can find for what's happening.
If I manually enter the URL to that same password-protected folder, with the "content" folder included in its path, then the .htaccess auth prompt does appear. This proves the .htaccess file is being read, just not when the CMS accesses the folder. Can anyone explain why, and how to force the folder to be protected when I request any page from the browser?
When you open a Pico site, your request is rewritten to the index.php file (via mod_rewrite). That's why the "content" folder does not show up in the URL.
That's also why you are not asked for a password: index.php does not have to pass the .htaccess auth to get to the *.md files. The rewrite works roughly like the sketch below.
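Pico's .htaccess contains a front-controller rule along these lines (a simplified sketch; the exact rules in your install may differ):

    RewriteEngine On
    # any request that is not an existing file is handed to index.php
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule ^(.*)$ index.php [L]

Your browser therefore never requests content/... over HTTP; index.php opens the .md files directly from disk, and per-directory auth applies only to HTTP requests, not to PHP's filesystem access.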
Read this for a bigger picture:
https://stackoverflow.com/a/10923542/3294973
This plugin may be interesting to you:
https://github.com/jbleuzen/Pico-Private
Unfortunately, it can't protect only part of the website at this point.
Protecting single pages is now possible. (Check my GitHub Fork)
My directory tree looks something like this:
public_html
    example.com (containing symfony files)
        lib
        web
        app
        etc...
    example.net
    example1.com
    example1.net
When I access my site via example.com, it just shows the directory listing of the example.com folder instead of routing to my actual homepage. How do I fix this?
You should place all project files (except the web directory) outside the document root, in the directory directly above public_html.
Then either create a symlink from public_html to your web directory, or copy the files from web into public_html and, in your ProjectConfiguration, call $this->setWebDir(realpath('../public_html'));
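Concretely, assuming example.com's document root is public_html/example.com, the layout could become something like this (paths are assumptions based on your tree):

    /home/user/symfony-project/      <- project root: app/, lib/, config/, web/, ...
    /home/user/public_html/example.com -> /home/user/symfony-project/web    (symlink)

The symlink can be created with something like ln -s ~/symfony-project/web ~/public_html/example.com. With the symlink approach, the setWebDir() call is unnecessary; it is only needed if you copy the contents of web/ into the public folder instead.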