How to prevent foreign GET parameters in TYPO3's canonical tag? - typo3

If an uncached page is called in the frontend with a GET parameter that is not foreseen and has been appended to the URL from a link of an external source, like a tracking parameter or something worse e.g. …
https://www.example.com/?note=any-value
… then this foreign parameter is passed on in the automatically generated canonical tag, created by TYPO3's core extension ext:seo. It looks like this:
<link rel="canonical" href="https://www.example.com/?note=any-value&cHash=f2c206f6f14a424fdbf82f683e8bf383"/>
In addition, the page is saved in the cache with this parameter. This means that subsequent visitors will also receive this incorrect canonical tag, even if they call up the page https://www.example.com/ without the parameter.
Is this a bug (tested on TYPO3 10.4.15) or can it be disabled for all unknown parameters by configuration?
If you know the parameter, you can exclude it in the global configuration …
[FE][cacheHash][excludedParameters] = L,pk_campaign,pk_kwd,utm_source,utm_medium,…
… or via ext_localconf.php in the sitepackage:
$GLOBALS['TYPO3_CONF_VARS']['FE']['cacheHash']['excludedParameters'][] = 'tlbid';
I am only concerned with parameters that were not expected. It might make sense to turn the concept around and basically exclude all parameters except for a few self-defined allowed parameters, but I don't know if that is possible so far.
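To make the "turn the concept around" idea concrete, here is a minimal sketch of an allow-list filter. It is written in plain JavaScript purely for illustration and is not TYPO3 code or configuration; the function name canonicalUrl and the allowed parameter name page are assumptions made up for the example.

```javascript
// Conceptual sketch only: this is not TYPO3 code or configuration.
// ALLOWED_PARAMS is a hypothetical allow-list chosen for illustration.
const ALLOWED_PARAMS = new Set(['page']);

// Return the URL with every query parameter removed that is not allow-listed.
function canonicalUrl(requestUrl, allowed = ALLOWED_PARAMS) {
  const url = new URL(requestUrl);
  // Copy the keys first, because deleting while iterating can skip entries.
  for (const key of [...url.searchParams.keys()]) {
    if (!allowed.has(key)) {
      url.searchParams.delete(key);
    }
  }
  url.hash = '';
  return url.toString();
}

console.log(canonicalUrl('https://www.example.com/?note=any-value'));
```

With such a filter, an unexpected parameter like note never reaches the canonical URL, whereas page would be kept.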

Got it. Actually, TYPO3 handles these already for other common tracking and additional params, like L, utm_campaign, fbclid etc. The whole list of excluded params can be found in the source code.
To add your own, create or edit the typo3conf/AdditionalConfiguration.php file, for example:
<?php
$GLOBALS['TYPO3_CONF_VARS']['FE']['cacheHash']['excludedParameters'][] = 'note';
$GLOBALS['TYPO3_CONF_VARS']['FE']['cacheHash']['excludedParameters'][] = 'foo';
$GLOBALS['TYPO3_CONF_VARS']['FE']['cacheHash']['excludedParameters'][] = 'bar';
or
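<?php
// Append several parameters at once to the existing exclusion list.
// (No trailing comma after the last argument: that needs PHP 7.3+.)
$GLOBALS['TYPO3_CONF_VARS']['FE']['cacheHash']['excludedParameters'] = array_merge(
    $GLOBALS['TYPO3_CONF_VARS']['FE']['cacheHash']['excludedParameters'],
    ['note', 'foo', 'bar']
);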
Don't forget to clear the caches afterwards :D (that should be TYPO3's slogan)

It's a bug. The extension urlguard2 solves this issue.

This does not work for me in TYPO3 v11.5.16.
LocalConfig:
[FE][cacheHash][excludedParameters] = L,tx_solr,sword_list,utm_source,utm_medi…
Browser URL:
https://www.example.org/testfaelle/test?sword_list%5B0%5D=testf%C3%A4lle&no_cache=1
The HTML Frontend canonical is:
<link rel="canonical" href="https://www.example.org/testfaelle/test?sword_list%5B0%5D=testf%C3%A4lle&cHash=e81add4ca148ad10189b9cbfa4d57100">
Debugging:
If I edit the file "/typo3/sysext/frontend/Classes/Utility/CanonicalizationUtility.php" and add the parameter directly with $paramsToExclude[] = 'sword_list'; it works:
<link rel="canonical" href="https://www.example.org/testfaelle/test">


Dynamically inject content into JSDoc @example

I'm using JSDoc to document my JavaScript API.
I have an @example where I exhibit a minimized loader script (similar to the Google Analytics script). The loader script loads additional JavaScript from https://<server>/myProduct/lib/script.js.
My JSDoc documentation is bundled with myProduct, so there are always /myProduct/lib/script.js and /myProduct/docs/ side by side. However, myProduct can be hosted by my customers anywhere, so I don't know what the <server> is.
I would like to be able to use document.location.href to detect the current browser URL and display a working loader script in my @example, so that customers can simply copy and paste a working script from the documentation without having to manually edit the <server> part.
My question is: Does JSDoc offer any means to dynamically inject content into @example?
I could just manually edit the JSDoc output and include some custom JavaScript which replaces <server> with the actual current server at run time. However, this would be tedious to do every time my documentation updates.
No, JsDoc doesn't provide for injecting external content, and while it is certainly possible to create a JsDoc plugin to include external content during document generation, I don't think this would get you where you need to be.
I think the simplest answer would be to post-process the output of JsDoc to add a simple script, like
<script>
// Retrieve the host name (and port, if any) from the current URL.
const getServiceHostName = () => document.location.host;
// Once the page has loaded, put the host name into span#service-host.
window.addEventListener("load", function () {
  document.getElementById("service-host").textContent = getServiceHostName();
});
</script>
to your html files just above the </body> tag.
Then, in the example content, insert a <span id="service-host"></span> where you'd like the service hostname to appear.
Another approach might be to take advantage of the fact that JsDoc passes the content attached to the @description tag directly into the html output.
So this appears to do what is expected:
/**
 * Hostname: "<span id="service-host">hostname</span>"
 *
 * <script src="https://code.jquery.com/jquery-3.6.0.min.js"
 *   integrity="sha256-/xUj+3OJU5yExlq6GSYGSHk7tPXikynS7ogEvDej/m4="
 *   crossorigin="anonymous"></script>
 *
 * <script>
 *   $(document).ready(function() {
 *     $("#service-host").html(document.location.href);
 *   });
 * </script>
 *
 * @name How-To-Monkey-Patch-JsDoc
 *
 */
Note that the content of @example is wrapped in <pre><code>...</code></pre> in the html output, which will definitely cause you problems. It should still be easy to post-process the html inline using the code above: add some recognizable token (perhaps [[HOSTNAME]]) to the example content and, on document load, replace it with the desired value.
Please note also that this approach has the possibility of opening your docs to security issues, which is why I'd use the first solution above.
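The [[HOSTNAME]] token idea can be sketched roughly as follows. This is only a sketch under assumptions: the helper name fillHostToken is made up, and the selector pre code assumes the examples end up in <pre><code> elements as described above.

```javascript
// Hypothetical token; any string unlikely to occur in real example code works.
const HOST_TOKEN = '[[HOSTNAME]]';

// Pure helper: replace every occurrence of the token with the given host.
function fillHostToken(text, host) {
  return text.split(HOST_TOKEN).join(host);
}

// Browser-only wiring; skipped when no DOM is available (e.g. in Node).
if (typeof document !== 'undefined') {
  document.addEventListener('DOMContentLoaded', () => {
    // Assumes the examples are rendered as <pre><code> elements.
    document.querySelectorAll('pre code').forEach((block) => {
      // textContent keeps the inserted host from being parsed as HTML,
      // but it also flattens any highlighting markup inside the block.
      block.textContent = fillHostToken(block.textContent, document.location.host);
    });
  });
}
```

In the @example you would then write https://[[HOSTNAME]]/myProduct/lib/script.js and let the script fill in the host when the page is viewed.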

XML Sitemap with typoscript makes wrong URLs

Following http://www.typo3-probleme.de/2018/07/11/typo3-sitemap-mit-typoscript-erstellen-2285/ I let TYPO3 v8.7.24 generate the sitemap.xml file. So far it works, but the URLs in the file are not correct. Every URL ends with "?type=500001"; for example, a URL looks like "https://www.domain.ch/angebot/online-marketing/?type=500001". As a side note, the extension RealURL is also in use.
My question is: how can I remove the "?type=500001" part? Is the cause TypoScript or the RealURL extension? How can I analyse it?
Any hint is welcome. Thanks in advance for your help.
The cause is the link generation inside TYPO3. That is configured by TypoScript, so you could see TypoScript as the culprit.
If you want to know whether RealURL (or any other extension) is the culprit, disable the extension in question. If the error is gone, there is reason to suspect that extension.
When TYPO3 generates links, it carries some parameters along to stay in the current context. Which parameters are carried along is a matter of configuration (so it is rooted in TypoScript).
Have a look (in the TypoScript Object Browser, TSOB) at config.linkVars in general (it is implicitly copied to every page object) or at the linkVars of your page object, page.config.linkVars (in your case: xml_sitemap.config.linkVars).
There is a note in the manual:
Do not include the type parameter in the linkVars list, as this can result in unexpected behavior.
Another option would be to explicitly set &type=0 on every link. But don't forget to set config.uniqueLinkVars = 1 (or xml_sitemap.config.uniqueLinkVars = 1).

Submit form to rewritten URLs?

I am trying to create nice URLs for my Magento search form, to make:
http://domain.com/catalogsearch/result/?q=KEYWORD
look like this:
http://domain.com/search/KEYWORD
I have written this in my .htaccess file:
RewriteRule ^search/([^/]+)/?$ /catalogsearch/result/?q=$1 [QSA,P,NC]
Which works nicely, when I type in http://domain.com/search/KEYWORD it displays the results as it should.
BUT...
I can't work out how to get my search form to go to the nice format URL; it still goes to the original.
My search form is currently like this:
<form id="search_form" action="http://domain.com/catalogsearch/result/" method="get">
<input id="search" type="search" name="q" value="KEYWORD" maxlength="128">
<button type="submit">search</button>
</form>
Any point in the right direction much appreciated.
There are a couple of things going on here, so let me try to explain the best I can.
First and foremost, your main issue is the generation of this new "pretty" search URL. When you use a <form> with method="GET", each input (i.e. <input name="q">) will get appended to the form's action as a query parameter (you'll get /search?q=foo instead of /search/foo).
In order to fix this, you need to do two things:
1. Change your form tag to look like this:
<form id="search_form" action="<?php echo Mage::getUrl('search'); ?>" method="GET">
This will ensure that the form is submitted to /search instead of /catalogsearch/result. (You'll still get a ?q=foo, though, and that will be resolved in #2.)
2. Add a bit of JavaScript which hijacks the form submission and forms the desired URL:
var form = document.getElementById('search_form'),
    input = document.getElementById('search');
form.onsubmit = function() {
    // navigate to the desired page; encode the keyword so that spaces,
    // slashes and other special characters survive in the URL path
    window.location = form.action + encodeURIComponent(input.value);
    // don't actually submit the form
    return false;
};
That'll get you up and running, but there are still some other issues which you should resolve.
Using RewriteRule based rewrites with Magento does not work well. I haven't quite figured out the technical reason for this, but I've had the same trouble that you're having. The reason that your rewrite works with the P flag is because the P flag turns the rewrite into a proxy request. This means that your web server will make another request to itself with the new URL, which avoids the typical RewriteRule trouble you'd run into.
So, how do you utilize a custom pretty URL without using RewriteRule? You use Magento's internal rewrite logic! Magento offers regex-based rewrite logic similar to RewriteRule through its configuration XML:
<config>
<global>
<rewrite>
<some_unique_identifier>
<from><![CDATA[#/search/(.*)/?$#]]></from>
<to><![CDATA[/catalogsearch/result/index/q/$1/]]></to>
<complete />
</some_unique_identifier>
</rewrite>
</global>
</config>
By putting that configuration in one of your modules, Magento will internally rewrite requests of the form /search/foo to /catalogsearch/result/index/q/foo/. Note that you have to use Magento's custom parameter structure (name-value pairs separated by /), as it will not parse query string parameters after it performs this internal rewrite. Also note that you have to specify the full module-controller-action trio (/catalogsearch/result/index/) because otherwise q would be interpreted as an action name, not a parameter name.
This is much better than using a proxy request because it doesn't issue a secondary request, and the rewrite happens in Magento's core route handling logic.
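To see what the <from>/<to> pair does to a request path, the same pattern can be tried out in isolation. The snippet below is only a JavaScript stand-in for illustration (Magento applies the pattern in PHP), and the function name rewritePath is made up for the example.

```javascript
// JavaScript stand-in for the <from>/<to> pair above; Magento itself applies
// the pattern in PHP, so this only illustrates the regex, not Magento's code.
const from = /\/search\/(.*)\/?$/;
const to = '/catalogsearch/result/index/q/$1/';

// Apply the rewrite to a request path; non-matching paths are returned as-is.
function rewritePath(path) {
  return path.replace(from, to);
}

console.log(rewritePath('/search/foo')); // /catalogsearch/result/index/q/foo/
```

One detail worth knowing: because (.*) is greedy, a request for /search/foo/ captures the trailing slash as well, so the rewritten path ends in a double slash. Using ([^/]+) as the capture group would be one way to avoid that.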
This should be enough to get you completely up and running on the right path. However, if you're interested, you could take this one step further.
By using the above techniques, you'll end up with three URLs for your searches: /search/foo, /catalogsearch/result/?q=foo, and /catalogsearch/result/index/q/foo/. This means that you essentially have three pages for each search query, all with the same content. This is not great for SEO purposes. In order to combat this drawback, you can create a 301 permanent redirect from the latter two URLs to your pretty URL, or you can use a <link rel="canonical"> tag to tell search engines that your pretty URL is the main one.
Anyways, I hope that all of this helps and puts you on the right track!

How do I protect dynamically fed style sheets in Zend Framework 1 from SQL injection?

I am working on a project in Zend Framework 1.12. I want to build a facility that will enable members to dynamically load a VCSS stylesheet of their choice, thus enabling them to format the page in the colours of their choice. The path to their stylesheet is passed via the URL.
For example, the URL could look like this: samplewebsite/?s=rootfolder/stylesheet
We collect it with: $this->view->stylesheet = $this->_request->getParam('s', '');
getParam() returns the destination of their style sheet, i.e. rootfolder/stylesheet.css
I then output the value on the index page:
<link href="<?= $this->stylesheet ?>" media="screen" rel="stylesheet" type="text/css" >
My question is this: I want to protect the value returned by getParam() from JavaScript/SQL injection, bad code, etc. How do I protect it? Should I use strip_tags(), or is there a better way to protect it?
I think I worked out how to do it: I simply used strip_tags().
$this->view->stylesheet = $stylesheet = strip_tags($this->_request->getParam('s', ''));
I also tried to use Zend_Filter_Alpha:
$alpha = new Zend_Filter_Alpha();
$this->view->stylesheet = $alpha->filter($this->_request->getParam('s'));
But I found that it also removed all the separators, e.g. Samplesite/Rootfolder/ became SamplesiteRootfolder.
If anyone has a better solution, I would be keen to hear it (I would have preferred to use a filter class). Otherwise, I think my question is pretty much answered.
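One stricter alternative is to accept only values that match an allow-list pattern and fall back to a default otherwise, since strip_tags() only removes tags and lets quotes and URL schemes such as javascript: pass through. The following is a minimal sketch of that idea in JavaScript for illustration only; it would need to be ported to PHP for the controller. The pattern, the function name safeStylesheetHref, the fallback path and the assumption that stylesheets live below the site root are all hypothetical.

```javascript
// Hypothetical rule: letters, digits, "_" and "-" in each path segment,
// segments separated by single slashes, no leading slash, no dots.
const SAFE_STYLESHEET_PATH = /^[A-Za-z0-9_-]+(?:\/[A-Za-z0-9_-]+)*$/;

// Build the href for the <link> tag; anything unexpected gets the fallback.
// Assumes stylesheets live below the site root and end in ".css".
function safeStylesheetHref(param, fallback = 'default/stylesheet') {
  const path = SAFE_STYLESHEET_PATH.test(param) ? param : fallback;
  return '/' + path + '.css';
}

console.log(safeStylesheetHref('rootfolder/stylesheet'));
```

Because dots, colons and quotes are not in the allowed character set, values such as ../secret, javascript:alert(1) or "><script> never reach the page.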

How do I access a Post Slug in a Tumblr theme?

I want to write a canonical tag into my Tumblr theme, and I need the slug for the (full) URL. How can I access the post's slug within the template? I only have access to the PostID. My current code looks like this:
<link rel="canonical" href="http://domain.com/blog/{block:PostTitle}post/{PostID}{/block:PostTitle}" />
What I want to have is something like this:
<link rel="canonical" href="http://domain.com/blog/{block:PostTitle}post/{PostID}/{PostSlug}{/block:PostTitle}" />
I tried the following tags (which obviously did not work...):
{slug}
{PostSlug}
{Postslug}
What amuses me is that the API returns a slug key for every post; try:
http://(YOU).tumblr.com/api/read?debug=1
Thanks for any hints and suggestions.
Edit: I already scanned http://www.tumblr.com/docs/en/custom_themes for hints - but found nothing useful.
The post slug is not available as a token in Tumblr’s theme DSL. I’m not sure if this is an intentional omission, as post slugs are optional on Tumblr (you can manually set one, but if you don’t your post just goes by its numeric ID). However, you can parse it out of the link inserted by the {Permalink} token, i.e. include it in some hidden element in your template along the lines of
<span class="permalink-url">{Permalink}</span>
(hide the span if you will), then retrieve and parse it with JavaScript:
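var plTags = document.querySelectorAll('.permalink-url');
for (var i = 0; i < plTags.length; i++) {
    // take everything after the last slash of the permalink text
    // (this yields the numeric ID when the post has no slug)
    var postSlug = plTags[i].textContent.replace(/.+\//, '');
    // do whatever you want with the slug
}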
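If you want to tell a missing slug apart from the numeric ID, the parsing step could be sketched like this. It assumes permalinks of the form https://<blog>/post/<id>/<slug> with an optional slug; the function name slugFromPermalink is made up for the example.

```javascript
// Assumes permalinks of the form https://<blog>/post/<id>/<slug>,
// where the <slug> segment may be missing.
function slugFromPermalink(permalink) {
  // Split the path into its non-empty segments.
  const segments = new URL(permalink).pathname.split('/').filter(Boolean);
  const postIndex = segments.indexOf('post');
  // segments[postIndex + 1] is the numeric ID, the one after it the slug.
  return postIndex === -1 ? '' : segments[postIndex + 2] || '';
}

console.log(slugFromPermalink('http://example.tumblr.com/post/12345/my-first-post'));
```

The function returns an empty string when the post has no slug, so the canonical tag can fall back to the ID-only URL in that case.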