I have a slight issue with AEM 6.0 SP1 and the search component. When searching for a French word like "Français", the "ç" gets messed up.
The query string looks like ?q=Français
On the JSP side, request.getCharacterEncoding() returns ISO-8859-1 instead of the UTF-8 we need.
I know that under Tomcat you can change the URIEncoding at the connector level.
But for an AEM/CQ instance running directly by itself, there is no such option.
Has anyone figured this out?
For 5.6.1: The default encoding can be set in the configuration of the Apache Sling Main Servlet. In the Configuration Manager
(<domain>:<port>/system/console/configMgr), look for Apache Sling Main Servlet and configure the Default Parameter Encoding property.
For 6.0 (credits: Francois Cournoyer): the configuration has moved to Apache Sling Request Parameter Handling.
Configure the Temporary File Location to point to an absolute path if you hit errors while saving the configuration.
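If you'd rather ship this setting as an OSGi configuration than click through the console, the property behind that field is sling.default.parameter.encoding; a hedged example, assuming the documented PID org.apache.sling.engine.parameters (verify the PID against your version):

# File: org.apache.sling.engine.parameters.config
sling.default.parameter.encoding="UTF-8"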
For forms within CQ, always include a hidden field with the charset set to UTF-8 (or the charset of your HTML):
<input type="hidden" name="_charset_" value="UTF-8"/>
This will ensure proper encoding when the servlet processes the POST data.
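Outside of Sling's parameter handling, the usual plain-servlet equivalent is an encoding filter that runs before anything reads the request parameters; a minimal sketch (the class name is illustrative):

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

// Forces UTF-8 before the container parses the POST body. This must
// run before any call to getParameter(), otherwise the container's
// default encoding has already been applied.
public class ForceUtf8Filter implements Filter {
    public void init(FilterConfig config) {}

    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        if (req.getCharacterEncoding() == null) {
            req.setCharacterEncoding("UTF-8");
        }
        chain.doFilter(req, res);
    }

    public void destroy() {}
}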
You also need to set the default charset in the response header.
Here is the detailed process:
http://localhost:4502/system/console/configMgr
--> Apache Sling Main Servlet
--> Additional response headers: add the entry below
Content-Type=text/html;charset=utf-8
In an HTTP request, I am adding a token as a query param.
It seems that Mule is encoding the value.
- If I add the parameter as-is, Mule encodes it incorrectly.
- If I add an already encoded parameter, Mule double-encodes it, so it is no longer usable.
So the question is: is there a way or a workaround to prevent Mule from encoding the URL query param?
Example of the parameter: {AES}ZEoksxIg484magPtWwNUUQ==;iT0kI2HsqGkh%2Bdc2baW2B4dNR2vouKkWQsDTdbMP8us=
My colleague found a workaround for this, so I'm sharing it here.
Apparently, you can set a variable before the HTTP request and put the manually encoded value in it; let's call it ourTokenVariable. In my example above, that would be %7BAES%7DZEoksxIg484magPtWwNUUQ%3D%3D%3BiT0kI2HsqGkh%252Bdc2baW2B4dNR2vouKkWQsDTdbMP8us%3D
After that, you can use this newly created variable directly in the URL path, for example: /example/someapi?someToken=#[flowVars.ourTokenVariable]
This way you don't need uri-param or query-param anymore (where Mule double-encodes the value). The value will be taken as-is.
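If you need to produce that pre-encoded value programmatically, plain URL encoding is enough; a minimal Java sketch using the token from the question:

import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class EncodeToken {
    public static void main(String[] args) {
        String token = "{AES}ZEoksxIg484magPtWwNUUQ==;iT0kI2HsqGkh%2Bdc2baW2B4dNR2vouKkWQsDTdbMP8us=";
        // Percent-encodes {, }, =, ; and the literal % signs, producing
        // exactly the value to store in ourTokenVariable.
        String encoded = URLEncoder.encode(token, StandardCharsets.UTF_8);
        System.out.println(encoded);
        // -> %7BAES%7DZEoksxIg484magPtWwNUUQ%3D%3D%3BiT0kI2HsqGkh%252Bdc2baW2B4dNR2vouKkWQsDTdbMP8us%3D
    }
}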
I created my own “404 Page not found” error page on a TYPO3 website and implemented it via the /typo3conf/LocalConfiguration.php as follows, using the page’s Speaking URL path:
return [
...
'FE' => [
...
'pageNotFound_handling' => '/page-not-found/',
]
]
Now when I call a non-existing page, the error page gets displayed, but there is a 4-digit hexadecimal number BEFORE the HTML source code and a "0" AFTER it. Example (the number at the beginning changes on most reloads):
37b3
<!DOCTYPE html>
...
</html>
0
When calling the error page URL directly, the page is returned correctly, without those numbers.
Having the RealURL extension activated or deactivated does not make a difference.
Thanks a lot in advance!
I added the full description from the install tool and I guess we might find the solution there.
How TYPO3 should handle requests for non-existing/accessible pages.
empty (default)
The next visible page upwards in the page tree is shown.
'true' or '1'
An error message is shown.
String
Static HTML file to show (reads content and outputs with correct headers), e.g. notfound.html or http://www.example.org/errors/notfound.html.
Prefix "REDIRECT:"
If prefixed with "REDIRECT:" it will redirect to the URL/script after the prefix.
Prefix "READFILE:"
If prefixed with "READFILE" then it will expect the remaining string to be a HTML file which will be read and outputted directly after having the marker "###CURRENT_URL###" substituted with REQUEST_URI and ###REASON### with reason text, for example: READFILE:fileadmin/notfound.html.
Prefix "USER_FUNCTION:"
If prefixed with "USER_FUNCTION:" a user function is called, e.g. USER_FUNCTION:fileadmin/class.user_notfound.php:user_notFound->pageNotFound where the file must contain a class user_notFound with a method pageNotFound() inside with two parameters $param and $ref.
What you configured:
You're passing a string, so TYPO3 expects to find a static file - which you don't have, because what you passed is really a URL path.
For what you're trying to achieve, I'd go with REDIRECT:/page-not-found/.
Thanks for pointing this one out, btw; I will remove the plain-string configuration from the core, since it does not make sense to have more people trip into this pitfall.
In short: change the following line in the FE section of your LocalConfiguration.php:
'pageNotFound_handling' => '/your404page.html',
to
'pageNotFound_handling' => 'REDIRECT:/your404page.html',
Cause
The actual cause is a combination of chunked Transfer-Encoding and TYPO3 not being able to decode it in some cases. In your case the page-not-found handler eventually uses GeneralUtility::getUrl() to retrieve the error page.
If you have [SYS][curlUse] enabled, it will use cURL to retrieve the page and there is no problem.
If you don't have [SYS][curlUse] enabled, it will open a socket, read the headers, and then read the rest of the body. If the web server uses chunked Transfer-Encoding, the body contains blocks of data, and each block starts with a line giving its length in hexadecimal. The content ends with an empty block (with, of course, a length line of "0").
cURL apparently knows how to decode chunked data.
getUrl() itself does not know how to handle chunked data and uses the content as-is as the page content.
In TYPO3 8 LTS the Guzzle library is used to handle HTTP requests. In the Guzzle code I can't find anything about handling chunked data. Guzzle checks whether the cURL PHP extension is present and uses it as the preferred transport. In most installations cURL is present, and since it decodes chunked data automagically, no problem is visible. I have to test Guzzle with PHP that has cURL disabled to see whether the issue is also present in v8/master.
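To make the chunk format concrete: a body like the one in the question (37b3, payload, 0) is decoded roughly as in this illustrative Java sketch (not TYPO3's actual code):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class ChunkedDecoder {
    // Strips HTTP chunked transfer encoding from a raw body stream.
    static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            // Each chunk starts with its length in hex, e.g. "37b3".
            int size = Integer.parseInt(readLine(in).trim(), 16);
            if (size == 0) break; // the final "0" chunk terminates the body
            body.write(in.readNBytes(size)); // readNBytes: Java 11+
            readLine(in); // consume the CRLF that follows the payload
        }
        return body.toByteArray();
    }

    static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int c = in.read(); c != -1 && c != '\n'; c = in.read()) {
            if (c != '\r') sb.append((char) c);
        }
        return sb.toString();
    }
}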
Workaround/solution
If the PHP extension cURL is enabled in your installation, you can simply enable [SYS][curlUse] in the Install Tool. The numbers around the 404 page content will disappear.
I was testing some code that involved parsing XML. For a simple test, I requested / of my localhost, and the response was my Apache2 default page.
So far, so good.
The response is XHTML and therefore XML, so I used it for my parsing (~11k in size).
XML::LibXML->load_xml(string => $response);
It takes about 16s until it finishes, with no errors.
If I give it another XML file of double the size, it takes no time at all.
So... why?
Apache/2.4.10
Debian/8.6
XML::LibXML/2.0128
EDIT
I need to mention that I removed the non-XML HTTP headers, so the string starts with
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
and ends with
</html>
EDIT
Link: http://s000.tinyupload.com/index.php?file_id=88759644475809123183
One possibility is that each time you parse the document, the parser downloads the DTD from W3C. You could confirm this using strace or similar tools, depending on your platform.
The DTD contains (among other things) the named entity definitions, which map, for example, the string &nbsp; to the character U+00A0. So in order to parse HTML documents the parser does need the DTD, but fetching it via HTTP each time is obviously not a good idea.
One approach is to install a copy of the DTD locally and use that. On Debian/Ubuntu systems you can just install the w3c-dtd-xhtml package which also sets up the appropriate XML catalog entries to allow libxml to find it.
Another approach is to use XML::LibXML->load_html instead of XML::LibXML->load_xml. In HTML parsing mode the parser is more forgiving of markup errors and, I think, also always uses a local copy of the DTD.
The parser also provides options that allow you to specify your own handler routine for retrieving referenced URIs.
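As an illustration of that "resolve the DTD locally" hook in another stack, here is a hedged Java (JAXP) sketch; the local DTD path is an assumption based on where Debian's w3c-dtd-xhtml package installs its files:

import java.io.FileInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class LocalDtdParse {
    public static void main(String[] args) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Intercept the DTD lookup and serve a local copy instead of
        // fetching it from w3.org on every parse.
        builder.setEntityResolver((publicId, systemId) -> {
            if (systemId != null && systemId.endsWith("xhtml1-transitional.dtd")) {
                // Assumed local path; adjust to wherever the DTD lives.
                return new InputSource(new FileInputStream(
                        "/usr/share/xml/xhtml/schema/dtd/1.0/xhtml1-transitional.dtd"));
            }
            return null; // anything else: default resolution
        });
        System.out.println(builder.parse("page.xhtml")
                .getDocumentElement().getNodeName());
    }
}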
I created my own persistence layer for SQL Server, and the CRUD works fine,
BUT I'm having some trouble with the encoding, I think.
I receive XML text like the following from the XForms when I'm about to save something:
?xml version="1.0" encoding="UTF-8"?xhtml:html xmlns:xhtml="http://www.w3 ...............
metadata
application-name w4/application-name
form-name usuario/form-name
title xml:lang="en"Cadastro/title
description xml:lang="en"Usuário/description ---------PROBLEM!!!
metadata
xforms:instance....................
Any ideas how to solve this?
In general, when you are decoding the XML, you need to make sure you deal with the character encoding properly. How exactly to do that depends on the programming language or framework you are using, but you should:
- if possible, use an XML parser and just feed it the bytes (the parser will take care of handling the encoding by itself)
- never assume a default or platform encoding when converting bytes to characters (Java in particular has a number of APIs which, for very wrong reasons, use a platform-dependent default encoding)
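A short Java sketch of both rules (illustrative; xmlBytes stands for the raw bytes of the submitted document):

import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class EncodingSafeParse {
    // Good: hand the parser the raw bytes; it honors the encoding
    // declared in <?xml version="1.0" encoding="UTF-8"?> by itself.
    static Document parse(byte[] xmlBytes) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));
    }

    // Bad: new String(xmlBytes) would use the platform default charset
    // and can turn "Usuário" into mojibake; always name the charset.
    static String asText(byte[] xmlBytes) {
        return new String(xmlBytes, StandardCharsets.UTF_8);
    }
}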
My Spring 3 project is configured with three filters: encoding, spring-security, and urlRewrite. I have done everything needed regarding encoding according to this document: http://wiki.apache.org/tomcat/FAQ/CharacterEncoding. However, I can't get the string encoding right, so I have to convert the encoding manually. According to the document, a UTF-8 filter is the only thing needed to solve this problem.
I tested the filter. If I remove it or place it second in the filter order, the request encoding is null and the response encoding is ISO-8859-1 in a controller, while the request encoding is null and the response encoding is UTF-8 in a JSP file. And removing the filter has no impact at all on the form data from the POST method.
I've run out of ideas. Am I missing anything?
Never mind. I have found the source of this problem: Spring Tool Suite (STS). The code works fine outside of STS.