TYPO3 v9.5.0 - Error message: Requested page does not exist /robots.txt

I run TYPO3 9.5.0 LTS with the Bootstrap Package theme. Everything seems to work, but I frequently get error messages like this:
Core: Exception handler (WEB): Uncaught TYPO3 Exception: #1518472189: The requested page does not exist | TYPO3\CMS\Core\Error\Http\PageNotFoundException thrown in file /is/www/typo3_src-9.5.0/typo3/sysext/frontend/Classes/Controller/ErrorController.php in line 82. Requested URL: domain/robots.txt
What causes this, and how can I prevent it? Or how do I create a robots.txt in v9.5?

In TYPO3 9.5 you can add a robots.txt in your Sites module.
Sites -> Choose your site -> Static Routes -> Create new.
Static Route Name: select "robots.txt"
Route Type: select "Static Text"
Static Text: Select "robots.txt Example Content"
Save. The error should no longer appear.

This will work for all TYPO3 versions. For TYPO3 V9.x use the solution by Thomas Löffler.
Your server configuration (Apache? .htaccess?) hands any request for a resource that is not a file, a directory, or a symbolic link over to index.php, which is TYPO3.
In your case, there is no robots.txt file. So TYPO3 tries to handle the request, but it has no resource with that name either. This produces a 404 error in TYPO3.
To prevent this, just create the robots.txt file on your web server in the DOCUMENT_ROOT folder.
So what is a robots.txt file anyway?
It is a way to tell search engines how to behave on your server. It contains recommendations for the search engines' crawlers about which paths they should not crawl (such as the typo3_src folder). Crawlers request it automatically and regularly.
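As a minimal sketch of the advice above, the following Python snippet writes a small robots.txt into a document root and then uses the standard-library parser to check how a crawler would read it. The temporary directory standing in for DOCUMENT_ROOT, the rule content, and the example.com URLs are assumptions made for illustration; adapt the rules to your own installation.

```python
import tempfile
from pathlib import Path
from urllib.robotparser import RobotFileParser

# Illustrative rules only: ask all crawlers to skip the TYPO3 source folder.
ROBOTS_TXT = "User-agent: *\nDisallow: /typo3_src/\n"


def write_robots_txt(document_root: Path) -> Path:
    """Create robots.txt in the given document root and return its path."""
    target = document_root / "robots.txt"
    target.write_text(ROBOTS_TXT, encoding="utf-8")
    return target


def is_allowed(robots_file: Path, url: str, user_agent: str = "*") -> bool:
    """Return True if the rules in robots_file allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_file.read_text(encoding="utf-8").splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A temporary directory stands in for the real DOCUMENT_ROOT here.
    with tempfile.TemporaryDirectory() as tmp:
        robots = write_robots_txt(Path(tmp))
        print(is_allowed(robots, "https://example.com/"))
        print(is_allowed(robots, "https://example.com/typo3_src/index.php"))
```

Once a real file with that name exists in the document root, the web server serves it directly and the request never reaches TYPO3.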

Related

Cannot save template constants in typo3

I updated the constants of my template in the web editor of typo3. Each time I click on Save or Close+Save I get a pop-up from my browser to download a file. The content is like this:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Unavailable</title>
</head><body>
<h1>Service Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.</p>
</body></html>
The minimal example to get this is:
page.theme {
socialmedia.channels {
facebook.url = https://www.facebook.com/typo3/
}
}
It seems that TYPO3 has a problem with the dots in the URL. If I remove all of them or escape them with a backslash "\", everything works. (But the backslash remains in the URL and therefore produces invalid URLs.)
Some months ago everything worked fine. Some other templates in the same installation also have URLs in their configuration, and they are working (the page is rendered normally). If I try to save them now without any changes, I get the same error.
That is the system I use:
Typo3-Version: 9.5.20
Webserver: Apache/2.4.43 (Unix)
PHP-Version: 7.3.21
Database: MySQL 5.6.42
Applicationcontext: Production
OS: SunOS SunOS localhost 5.10 Generic_150401-49 i86pc
Bootstrap Package: 11.0.2

TYPO3 9.5.3 - Log errors showing multiple attempts to access non-existing routes

In a TYPO3 9.5.3 demo installation I see multiple errors in the log that look like this:
Core: Exception handler (WEB): Uncaught TYPO3 Exception: #1518472189: The requested page does not exist
... and attempts to access sites (which don't exist) like this:
typo3_src-9.5.3:
Requested URL: http://demo.domain/ultxswkov.html
typo3_src-9.5.1:
Requested URL: http://demo.domain/hpwymspohv.html
Requested URL: http://demo.domain/txlkcgnaet.html
Requested URL: http://demo.domain/contact.php
Requested URL: http://demo.domain/kontakt.html
Requested URL: http://demo.domain/kontakt.htm
Requested URL: http://demo.domain/kontakt
Requested URL: http://demo.domain/contact-us.html
Requested URL: http://demo.domain/contacts.htm
Requested URL: http://demo.domain/contacts.html
...
In all my v8 installations I never had such log errors. I assume somebody is trying to access those pages? (I don't have an SSL certificate for this specific domain yet.) What's the best practice now?
It seems that 404 errors were not logged to sys_log in TYPO3 8.x, at least with the default configuration. You can check the Apache logs to see what happened in the past (with TYPO3 8.x); the 404 responses are recorded in the access log. You should see many similar 404 errors there.
Every website on the internet gets visits from evil bots, so this is nothing specific to TYPO3 9.x.
The question "no additional security precaution needed?" is hard to answer. As long as your installation is secure, there is no problem.
Security Guidelines: https://docs.typo3.org/typo3cms/CoreApiReference/Security/Index.html
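To get a feeling for how many of these probing requests arrive, the 404 responses in the web server's access log can be counted per path. The sketch below assumes the log uses the Common or Combined Log Format; the sample entries, IP addresses, and timestamps are made up for illustration and are not taken from a real log.

```python
import re
from collections import Counter

# Matches the request line and status code of a Common/Combined Log Format entry.
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3})\s'
)


def count_not_found(lines):
    """Count how often each requested path was answered with HTTP 404."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match and match.group("status") == "404":
            hits[match.group("path")] += 1
    return hits


if __name__ == "__main__":
    # Made-up sample entries; in practice iterate over the real log file instead.
    sample = [
        '203.0.113.5 - - [10/Jan/2019:06:25:14 +0100] "GET /contact.php HTTP/1.1" 404 512',
        '203.0.113.5 - - [10/Jan/2019:06:25:15 +0100] "GET /kontakt.html HTTP/1.1" 404 512',
        '198.51.100.7 - - [10/Jan/2019:06:26:01 +0100] "GET / HTTP/1.1" 200 10240',
        '203.0.113.5 - - [10/Jan/2019:06:25:16 +0100] "GET /contact.php HTTP/1.1" 404 512',
    ]
    for path, count in count_not_found(sample).most_common():
        print(count, path)
```

A long tail of random or guessed paths, as in the question, is typical of automated scanners rather than of broken links on your own site.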

NotFound page in Orchard CMS modules

When a URL starts with a module area name, Orchard's NotFound page does not work.
Instead, it shows the standard 404 "Server Error in '/' Application."
e.g.
http://localhost:30320/users/test/
http://localhost:30320/orchard.medialibrary/test/
http://www.orchardproject.net/test12345 - NotFound.cshtml
http://www.orchardproject.net/users/test12345 - standard ASP.NET 404
When debugging, it throws an exception (HttpException: "The controller for path '***' was not found or does not implement IController").
How can I solve this problem?

How to make browser stop caching GWT nocache.js

I'm developing a web app using GWT and am seeing a crazy problem with caching of the app.nocache.js file in the browser even though the web server sent a new copy of the file!
I am using Eclipse to compile the app, which works in dev mode. To test production mode, I have a virtual machine (Oracle VirtualBox) with a Ubuntu guest OS running on my host machine (Windows 7). I'm running lighttpd web server in the VM. The VM is sharing my project's war directory, and the web server is serving this dir.
I'm using Chrome as the browser, but the same thing happens in Firefox.
Here's the scenario:
1) The web page for the app is blank. According to Chrome's "Inspect Element" tool, that is because it is trying to fetch 6E89D5C912DD8F3F806083C8AA626B83.cache.html, which doesn't exist (404 Not Found).
2) I check the war directory, and sure enough, that file doesn't exist.
3) The app.nocache.js on the browser WAS RELOADED from the web server (200 OK), because the file on the server was newer than the browser cache. I verified that file size and timestamp for the new file returned by the server were correct. (This is info Chrome reports about the server's HTTP response)
4) However, if I open the app.nocache.js on the browser, the JavaScript is referring to 6E89D5C912DD8F3F806083C8AA626B83.cache.html!!! That is, even though the web server sent a new app.nocache.js, the browser seems to have ignored that and kept using its cached copy!
5) Go to Google -> GWT Compile in Eclipse. Recompile the whole thing.
6) Verify in the war directory that the app.nocache.js was overwritten and has a new timestamp.
7) Reload the page from Chrome and verify once again that the server sent a 200 OK response to the app.nocache.js.
8) The browser once again tries to load 6E89D5C912DD8F3F806083C8AA626B83.cache.html and fails. The browser is still using the old cached copy of app.nocache.js.
9) Made absolutely certain in the war directory that nothing is referring to 6E89D5C912DD8F3F806083C8AA626B83.cache.html (via find and grep).
What is going wrong? Why is the browser caching this nocache.js file even when the server is sending it a new copy?
Here is a copy of the HTTP request/response headers when clicking reload in the browser. In this trace, the server content hasn't been recompiled since the last GET (but note that the cached version of nocache.js is still wrong!):
Request URL:http://192.168.2.4/xbts_ui/xbts_ui.nocache.js
Request Method:GET
Status Code:304 Not Modified
Request Headers
Accept:*/*
Accept-Charset:ISO-8859-1,utf-8;q=0.7,*;q=0.3
Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-US,en;q=0.8
Cache-Control:max-age=0
Connection:keep-alive
Host:192.168.2.4
If-Modified-Since:Thu, 25 Oct 2012 17:55:26 GMT
If-None-Match:"2881105249"
Referer:http://192.168.2.4/XBTS_ui.html
User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.4 (KHTML, like Gecko) Chrome/22.0.1229.94 Safari/537.4
Response Headers
Accept-Ranges:bytes
Content-Type:text/javascript
Date:Thu, 25 Oct 2012 20:27:55 GMT
ETag:"2881105249"
Last-Modified:Thu, 25 Oct 2012 17:55:26 GMT
Server:lighttpd/1.4.31
The best way to avoid browser caching is to set the expiration time to now and add the max-age=0 and must-revalidate directives.
This is the configuration I use with Apache httpd:
ExpiresActive on
<LocationMatch "nocache">
    ExpiresDefault "now"
    Header set Cache-Control "public, max-age=0, must-revalidate"
</LocationMatch>
<LocationMatch "\.cache\.">
    ExpiresDefault "now plus 1 year"
</LocationMatch>
Your configuration for lighttpd should be:
server.modules = (
    "mod_expire",
    "mod_setenv",
)
...
$HTTP["url"] =~ "\.nocache\." {
    setenv.add-response-header = ( "Cache-Control" => "public, max-age=0, must-revalidate" )
    expire.url = ( "" => "access plus 0 days" )
}
$HTTP["url"] =~ "\.cache\." {
    expire.url = ( "" => "access plus 1 years" )
}
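The rule both configurations encode can be written down as a small function: files matching ".nocache." must be revalidated on every load, while files matching ".cache." may be cached for a long time. The Python sketch below is only an illustration of that decision, not part of either server; the function name is hypothetical, and it expresses the one-year lifetime as a max-age value instead of the Expires header used above.

```python
import re
from typing import Optional

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def cache_control_for(path: str) -> Optional[str]:
    """Pick a Cache-Control value from the GWT file naming convention."""
    if re.search(r"\.nocache\.", path):
        # Bootstrap script: the browser must revalidate it on every load.
        return "public, max-age=0, must-revalidate"
    if re.search(r"\.cache\.", path):
        # Compiled permutations get a new name when their content changes,
        # so a long lifetime is safe for them.
        return f"public, max-age={ONE_YEAR_SECONDS}"
    # Anything else: leave the decision to the server defaults.
    return None


if __name__ == "__main__":
    for name in ("/xbts_ui/xbts_ui.nocache.js",
                 "/xbts_ui/6E89D5C912DD8F3F806083C8AA626B83.cache.html",
                 "/XBTS_ui.html"):
        print(name, "->", cache_control_for(name))
```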
We had a similar issue. We found out that the timestamp of the nocache.js was not updated by the GWT compile, so we had to touch the file during the build. We then also applied the fix from @Manolo Carrasco Moñino. I wrote a blog post about this issue: http://programtalk.com/java/gwt-nocachejs-cached-by-browser/
We are using version 2.7 of GWT as the comment also points out.
There are two straightforward solutions (the second is a modified version of the first).
1) Rename the *.html file that references *.nocache.js to *.jsp, e.g. MyProject.html to MyProject.jsp.
Now find the *.nocache.js script tag in the renamed file:
<script language="javascript" src="MyProject/MyProject.nocache.js"></script>
Add a dynamic value as a parameter for the JS file; this makes sure the actual contents are returned from the server every time. For example:
<script language="javascript" src="MyProject/MyProject.nocache.js?dummyParam=<%= "" + new java.util.Date().getTime() %>"></script>
Explanation: dummyParam has no use of its own, but it gets us the intended result, i.e. the server returns a 200 instead of a 304.
Note: If you use this technique, make sure that you point to the right JSP file when loading your application (before this change you were loading your app using the HTML file).
2) If you don't want to use the JSP solution and want to stick with your HTML file, you will need JavaScript to add the unique parameter value dynamically on the client side when loading the nocache file. That should not be a big deal given the solution above.
I have used the first technique successfully; I hope this helps.
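The idea behind both variants is the same: append a query parameter whose value changes on every page load, so the browser treats the script URL as new. As a language-neutral illustration, here is that URL manipulation sketched in Python; the function name is hypothetical, and the parameter name dummyParam is simply carried over from the JSP example above.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_cache_buster(url, stamp=None):
    """Return url with an extra dummyParam query parameter.

    If no stamp is given, the current time in milliseconds is used,
    which mirrors new java.util.Date().getTime() in the JSP version.
    """
    if stamp is None:
        stamp = int(time.time() * 1000)
    parts = urlsplit(url)
    # Keep any query parameters that are already present.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("dummyParam", str(stamp)))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(with_cache_buster("MyProject/MyProject.nocache.js"))
```

Keep in mind that this forces a full download of the script on every page load, whereas the header-based approach still allows cheap 304 revalidation.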
Quoting the question: "The app.nocache.js on the browser WAS RELOADED from the web server (200 OK), because the file on the server was newer than the browser cache. I verified that file size and timestamp for the new file returned by the server were correct. (This is info Chrome reports about the server's HTTP response)"
I wouldn't rely on this. I've seen a bit of strange behaviour in Chrome's dev tools with the network tab in combination with caching (at least, it's not 100% transparent for me). In case of doubt, I usually still consult Firebug.
So Chrome is probably still using the old version. It may have decided long ago that it will never have to reload the resource again. Clearing the cache should resolve this. Then make sure to set the correct caching headers before reloading the page, see e.g. Ideal HTTP cache control headers for different types of resources.
Open the page in incognito mode to rule out cache issues and unblock yourself.
You then need to configure the cache lifetime as mentioned in the other answers.
After unsuccessfully preventing caching via Apache I created a bash script that root runs every minute in a cron job on my Linux Tomcat server.
#!/bin/bash
#
# Touches GWT nocache.js files in the Tomcat web app directory to prevent caching.
# Execute this script every minute in a root cron job.
#
cd /var/lib/tomcat7/webapps || exit 1
find . -name '*nocache.js' | while IFS= read -r file; do
    logger "Touching file '$file'"
    touch "$file"
done

Downloading file by WebClient Exception

I have a problem downloading particular file types with WebClient. There are no problems with the usual types (mp3, doc, and others), but when I rename the file extension to config, it returns:
InnerException = {System.Net.WebException: The remote server returned an error: NotFound. ---> System.Net.WebException: The remote server returned an error: NotFound.
at System.Net.Browser.BrowserHttpWebRequest.InternalEndGetResponse(IAsyncResult asyncResult)
When I try to access this file in the browser (http://localhost:3182/Silverlight.config), which is an ordinary XML file, the server returns the following error page:
Server Error in '/' Application.
This type of page is not served.
Description: The type of page you have requested is not served because it has been explicitly forbidden. The extension '.config' may be incorrect. Please review the URL below and make sure that it is spelled correctly.
Requested URL: /Silverlight.config
So I suppose this happens because of some server configuration that blocks files of this type.
The downloading code is simple:
WebClient webClient = new WebClient();
webClient.OpenReadCompleted += new OpenReadCompletedEventHandler(webClient_OpenReadCompleted);
webClient.OpenReadAsync(new Uri("../Silverlight.config", UriKind.RelativeOrAbsolute));
The completed event handler is omitted for simplicity.
I'm not sure this is possible.
The .config extension is handled by the ASP.NET engine, for security reasons (sensitive data like connection strings need to be kept safe and hidden from unauthorized viewers).
This means that visitors cannot view your web.config file's content by simply entering "www.example.com/web.config" into their browser's address bar.
EDIT: actually you can, but I don't recommend it. If you really need to do it, you have to remove the mapping between the .config extension and the ASP.NET ISAPI filter in IIS.