JBoss valve encoding problem while URL rewriting

I have an application built with EJB3, JSF and Maven, which runs on JBoss 4.2.2GA.
For two days I have been unable to convert non-English characters that are added to the URL at runtime. For instance, there is a search text box and a button. When a user enters a word containing non-English characters and presses the button, the word is added to the URL with bad characters such as %56 or &347.
Is there any way to achieve what I am trying to do here? Also, can this problem be solved through JBoss configuration rather than on the application side (filters, context.xml, etc.)?
Any help would be appreciated.
Thanks a lot,
Baris
--
EDIT: I solved this issue by using URLEncoder. When passing the variable to the action method, I use URLEncoder to encode it in the right charset.
Example:
Take the parameter from the URL:
String someString = ServletActionContext.getRequest().getParameter("someStringFromURL");
Encode the string:
String encoded = URLEncoder.encode(someString, "ISO-8859-9");
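To see why the charset argument matters, here is a minimal, self-contained sketch. The class name UrlEncodeCharsetDemo and the helper encode are made up for illustration; only URLEncoder.encode(String, String) comes from the JDK. It shows the same Turkish character producing different percent-escapes depending on the charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class, not part of the original application.
final class UrlEncodeCharsetDemo {

    // Percent-encodes a value using the named charset. The checked
    // exception is wrapped so callers do not have to declare it.
    static String encode(String value, String charsetName) {
        try {
            return URLEncoder.encode(value, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // U+015F (Turkish small s with cedilla), written as an escape so the
        // result does not depend on the source file encoding.
        String term = "\u015F";
        System.out.println(encode(term, "UTF-8"));      // %C5%9F (two bytes)
        System.out.println(encode(term, "ISO-8859-9")); // %FE (one byte)
    }
}
```

Whichever charset is chosen, the side that decodes the URL has to use the same one, otherwise the bytes are misread.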

Find the appropriate Connector element in your Tomcat server.xml (deploy/jboss-web.deployer/server.xml for recent versions) and add the attribute URIEncoding with a value of UTF-8, i.e. URIEncoding="UTF-8".
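As a rough illustration of what this setting changes: when URIEncoding is not set, older Tomcat versions decode the request URI as ISO-8859-1, while browsers typically send UTF-8 bytes. The sketch below imitates both decodings with java.net.URLDecoder. The class and helper names are hypothetical, and this is not how Tomcat decodes internally, only the same byte-to-character mismatch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; it only mimics the effect of the connector setting.
final class UriDecodingDemo {

    // Decodes a percent-encoded value with the named charset, wrapping the
    // checked exception for convenience.
    static String decode(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String raw = "%C5%9F"; // UTF-8 bytes of U+015F, as sent in the URL

        String asUtf8 = decode(raw, "UTF-8");        // one character, U+015F
        String asLatin1 = decode(raw, "ISO-8859-1"); // two characters, U+00C5 U+009F

        System.out.println(asUtf8.equals("\u015F"));         // true
        System.out.println(asLatin1.equals("\u00C5\u009F")); // true
        System.out.println(asLatin1.length());               // 2
    }
}
```

With the connector decoding as UTF-8, request.getParameter should return the character the user typed, so no re-encoding in application code is needed for GET parameters.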

Related

OWASP Decoder issue in JSP

I am using the OWASP encoder to encode my strings, but the strings are not decoded in JSP pages; they are displayed in their encoded form. For example, my original string is l&t.com. After encoding, the string is "l&amp;t.com", and it should be decoded again in the JSP, which is not happening. Can anyone please suggest a fix? I am also using the UTF-8 meta tag in the JSP.
Any help is much appreciated. Thanks.
I found the solution for this. We need to add escapeXml="false" for the respective value, and then this is resolved.
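The symptom looks like double escaping: the value is encoded once in Java and then escaped a second time when the JSP writes it out (for example by a c:out tag, whose escapeXml attribute defaults to true). A minimal sketch of the effect follows; escapeHtml is a hand-written stand-in and not the OWASP Encoder API.

```java
// Hypothetical demo class illustrating double escaping.
final class DoubleEscapeDemo {

    // Minimal illustrative escaper (assumption: covers only &, <, > and ").
    static String escapeHtml(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String once = escapeHtml("l&t.com"); // what the encoder produces
        String twice = escapeHtml(once);     // what a second escaping pass produces
        System.out.println(once);  // l&amp;t.com      (a browser shows l&t.com)
        System.out.println(twice); // l&amp;amp;t.com  (a browser shows l&amp;t.com)
    }
}
```

Setting escapeXml to false removes the second pass, so the value must already be encoded by the time it reaches the page.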

ActionMailerNext double dot or double full stop

I am using ActionMailerNext Standalone v3.2 in a console application. I populate the model from the database and send the email using an email template with IEmailResult. But it replaces a dot with a double dot at random places. If the email contains a URL to an image, say image.png, it appears as image..png. It sometimes happens with full stops at the end of a sentence as well. Has anyone come across something like this before, or is this something else?
I had to specify the MessageEncoding property in ActionMailerNext.Standalone.RazorMailerBase.MailAttributes, which solved the issue.

characters allowed on CQ5 file name

When uploading a file (an image to the DAM folder, for example) to CQ5 using CRXDE Lite or another UI, the system gives an error message if the name of the file being uploaded has invalid characters.
I just found out that [ and ] are not allowed as part of file names.
But when uploading a file through a non-UI interface, SlingPostServlet for example, the character [ gets replaced with its percent-encoded representation (%5B) and no error is generated.
Is there some kind of list/doc that would show which characters are not allowed in CQ5?
I am using CQ5.4
Thank you
The JCR naming restrictions are described in the Repository Model section of the JCR specification.
Specifically, the following characters are not allowed:
("/", ":", "[", "]", "|", "*")
The com.day.cq.commons.jcr.JcrUtil class has a createValidName(title) method which may be able to help you. By default, the STANDARD_LABEL_CHAR_MAPPING will replace all illegal characters with an underscore. It is possible to use the HYPHEN_LABEL_CHAR_MAPPING to replace them with a hyphen instead, using the createValidName(title, labelCharMapping[]) method.
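If JcrUtil is not on the classpath, the idea can be approximated in plain Java. The following is only a sketch of replacing the six characters listed above. It is not JcrUtil's actual implementation, and the class and method names are made up for illustration.

```java
// Hypothetical helper, not part of CQ5/AEM.
final class JcrNameSketch {

    // The characters the answer above lists as not allowed in JCR names.
    private static final String ILLEGAL = "/:[]|*";

    // Replaces every illegal character with the given replacement character.
    static String toValidName(String name, char replacement) {
        StringBuilder out = new StringBuilder(name.length());
        for (char c : name.toCharArray()) {
            out.append(ILLEGAL.indexOf(c) >= 0 ? replacement : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toValidName("my[file]:v1*.png", '_')); // my_file__v1_.png
        System.out.println(toValidName("my[file]:v1*.png", '-')); // my-file--v1-.png
    }
}
```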
Clientlibs used by the coral-ui on AEM's backend use the following regex to filter:
_ILLEGAL_FILENAME_REGEX: /[\"\.%/\\:*?\[\]|\n\t\r ]|[\x7f-\uffff]/g
Here's a sample:
var text = "äüö?abcdefghijklmnopqrstuvwxyz!\"§$%&/()=?´`+*#'-_.:,;<>^°";
var regex = /[\"\.%/\\:*?\[\]|\n\t\r ]|[\x7f-\uffff]/g;
console.log(text.toLowerCase().replace(regex, '-'));

Handling multiple encodings in a single file

I'm encountering some weird encoding issues. I need to parse an HTML document from the web, and I'm using the 'Content-Type' charset metadata to determine the encoding.
One page has been giving me trouble. It is declared as 'Shift_jis' (Japanese), and the parser result contains some garbled characters.
When I parse the same document as UTF-8, the characters that were garbled before are parsed correctly, but everything else is now garbled.
I'm assuming the document contains text in two different encodings.
Is there any way I could parse this document correctly?
Also, I don't know how, but all the browsers seem to deal well with the issue and present the page nicely.
Would really appreciate any thoughts on this.
The page that I need to parse : http://ao.recruit.co.jp/form.html
First of all, what the browser sees is:
莨夂、セ讎りヲ
What is shown in the rendered HTML is not the same because of the CSS text-indent: -9999px and the background image laid over it. But the text is there. Removing those styles will show the text the browser is seeing.
Out of the box, decoding as Shift-Jis should give you 莨夂、セ讎りヲ?, but if you want the same results as in a browser, you should use a custom CharsetDecoder with IGNORE:
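import java.io.*;
import java.net.URL;
import java.nio.charset.*;
import org.apache.commons.io.IOUtils; // Apache Commons IO
URL url = new URL("http://ao.recruit.co.jp/form.html");
BufferedInputStream bis = new BufferedInputStream(url.openStream());
// Drop bytes that are malformed or unmappable in Shift-JIS instead of emitting replacement characters
CharsetDecoder decoder = Charset.forName("Shift-Jis").newDecoder();
decoder.onMalformedInput(CodingErrorAction.IGNORE);
decoder.onUnmappableCharacter(CodingErrorAction.IGNORE);
Reader inputReader = new InputStreamReader(bis, decoder);
String result = IOUtils.toString(inputReader);
System.out.print(result);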
This will give you the same result as browsers. Of course, it won't parse the text from the image file.
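The difference between IGNORE and the more common REPLACE action can be checked without a network call. The sketch below decodes a small in-memory byte array. It uses UTF-8 so the example has no dependency on extended charsets, but the same helper should accept Charset.forName("Shift_JIS"). The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class comparing decoder error actions.
final class LenientDecodeDemo {

    // Decodes bytes with the given charset, applying the same action to
    // malformed input and to unmappable characters.
    static String decode(byte[] bytes, Charset charset, CodingErrorAction action) {
        CharsetDecoder decoder = charset.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            // Only reachable when the action is CodingErrorAction.REPORT.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // 0xFF is never valid in UTF-8, so it stands in for a stray byte.
        byte[] bytes = {'a', 'b', (byte) 0xFF, 'c'};

        String ignored = decode(bytes, StandardCharsets.UTF_8, CodingErrorAction.IGNORE);
        String replaced = decode(bytes, StandardCharsets.UTF_8, CodingErrorAction.REPLACE);

        System.out.println(ignored);                      // abc
        System.out.println(replaced.equals("ab\uFFFDc")); // true
    }
}
```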

Sinatra encoding query string

I've made a very simple Sinatra application which displays a frameset with 3 frames. When I set the 'src' parameter for the frames, however, Sinatra re-encodes the query string I chose.
For example, I enter:
"url.com/page?var1=val1&var2=val2"
What I end up seeing, however, is something like:
"url.com/page?var1=val1&amp;var2=val2"
All my &'s were turned into &-a-m-p-;'s. Is there any way to disable this? Why does this happen?
Thanks,
This is probably happening because you have an "escape all html" flag turned on somewhere. Your template language should support flagging strings as "safe"--check out http://www.sinatrarb.com/faq.html#auto_escape_html for more details.