Sitecore: Valid item names - web-config

How do I expand the list of valid characters in item names, to include æøåÆØÅ?
By default, the valid characters seem to be defined by this rule in web.config:
<setting name="ItemNameValidation" value="^[\w\*\$][\w\s\-\$]*(\(\d{1,}\)){0,1}$" />
Changing the regex to:
<setting name="ItemNameValidation" value="^[\wæøåÆØÅ\*\$][\wæøåÆØÅ\s\-\$]*(\(\d{1,}\)){0,1}$" />
should in theory allow the characters, but that just "kills" Sitecore.
Edit:
A regex that allows dots works perfectly, like this:
<setting name="ItemNameValidation" value="^[\w\*\$][\w\.\s\-\$]*(\(\d{1,}\)){0,1}$" />
So I am allowed to change some aspects of it, just not for the æøå characters?!?!?
Note:
- Using æøå in item names is for some reason possible from the "Page Editor", when creating and saving new content items, but it is not possible to do the same from the "Content Editor"!
- We are using SC v6.6.0 (rev. 120918).
The cause of the error was that the file was not saved as UTF-8.

Make sure your config file is saved as "UTF-8"
A bit late, but adding as an answer :)
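A minimal sketch of why the encoding matters, assuming the config declares (or defaults to) UTF-8 while the editor saves it in a legacy single-byte code page such as windows-1252. The shortened setting value and the parses() helper are hypothetical, and Python's XML parser stands in for the one Sitecore uses:

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened config fragment; the file declares UTF-8.
CONFIG = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<setting name="ItemNameValidation" value="^[\\wæøåÆØÅ]*$" />'
)


def parses(raw: bytes) -> bool:
    """Return True if the bytes form well-formed XML."""
    try:
        ET.fromstring(raw)
        return True
    except ET.ParseError:
        return False


saved_as_utf8 = CONFIG.encode("utf-8")
saved_as_ansi = CONFIG.encode("cp1252")  # what a legacy "ANSI" save would write

print(parses(saved_as_utf8))  # bytes match the declared encoding
print(parses(saved_as_ansi))  # æøå bytes are not valid UTF-8
```

The dot in the working regex is plain ASCII, which has the same bytes in both encodings, so that edit would survive either way.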

How to add extra "end characters" to the "First sentence should end with a period." Checkstyle rule

I'm trying to make it so that ":" is also a valid end-of-sentence character.
I found out that the rule is JavadocStyle and the property is endOfSentenceFormat. However, when I use the value listed as the default, I get an error in Eclipse saying that parsing of the Checkstyle file failed.
After looking on the web, I found out that it's because the '<' in the default value ([.?!][ \t\n\r\f<])|([.?!]$) needs to be escaped as &lt;, since it's an XML file. I didn't get an error myself, as I was using a text editor.
So the value, if I want to add ":" as a valid end of sentence, needs to be
<module name="JavadocStyle">
<property name="endOfSentenceFormat" value="([.?!:][ \t\n\r\f&lt;])|([.?!:]$)"/>
</module>
Hope it helps, as there was not a whole lot of documentation.
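As a rough check of the escaping, here is a small sketch that escapes the pattern for use in an XML attribute and confirms that a parser hands the original regex back. Checkstyle itself uses Java regexes; Python's re is only used here because this particular pattern behaves the same way in both, and the sample sentence is made up:

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The pattern as the regex engine should see it (raw string, so \t stays
# as backslash + t rather than a real tab).
PATTERN = r"([.?!:][ \t\n\r\f<])|([.?!:]$)"

# escape() turns &, < and > into entities so the value is safe inside XML.
escaped = escape(PATTERN)
module_xml = (
    '<module name="JavadocStyle">'
    f'<property name="endOfSentenceFormat" value="{escaped}"/>'
    "</module>"
)

# The XML parser undoes the escaping before the value reaches its consumer.
round_tripped = ET.fromstring(module_xml).find("property").get("value")

print(escaped)
print(round_tripped == PATTERN)
print(bool(re.search(round_tripped, "Returns the following:")))
```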

How do I use FXML and properties files with non-Latin characters?

I need to create i18n properties files for non-Latin languages (simplified Chinese, Japanese Kanji, etc.) With the Swing portion of our product, we use Java properties files with the raw UTF-8 characters in them, which Netbeans automatically converts to 8859-1 for us, and it works fine. With JavaFX, this strategy isn't working. Our strategy matches this answer precisely which doesn't seem to be working in this case.
In my investigation into the problem, I discovered this old article indicating that I need to use native2ascii to convert the characters in the properties file; still doesn't work.
In order to eliminate as many variables as possible, I created a sample FXML project to illustrate the problem. There are three internationalized labels in Japanese Kanji. The first label has the text in the FXML document. The second loads the raw unescaped character from the properties file. The third loads the escaped Unicode (matching native2ascii output).
jp_test.properties
btn.one=閉じる
btn.two=\u00e9\u2013\u2030\u00e3\ufffd\u02dc\u00e3\u201a\u2039
jp_test.fxml
<?xml version="1.0" encoding="UTF-8"?>
<?import java.lang.*?>
<?import java.util.*?>
<?import javafx.scene.control.*?>
<?import javafx.scene.layout.*?>
<?import javafx.scene.paint.*?>
<?scenebuilder-preview-i18n-resource jp_test.properties?>
<AnchorPane id="AnchorPane" maxHeight="-Infinity" maxWidth="-Infinity" minHeight="-Infinity" minWidth="-Infinity" prefHeight="147.0" prefWidth="306.0" xmlns:fx="http://javafx.com/fxml">
<children>
<Label layoutX="36.0" layoutY="33.0" text="閉じる" />
<Label layoutX="36.0" layoutY="65.0" text="%btn.one" />
<Label layoutX="36.0" layoutY="97.0" text="%btn.two" />
<Label layoutX="132.0" layoutY="33.0" text="Static Label" textFill="RED" />
<Label layoutX="132.0" layoutY="65.0" text="Properties File Unescaped" textFill="RED" />
<Label layoutX="132.0" layoutY="97.0" text="Properties File Escaped" textFill="RED" />
</children>
</AnchorPane>
Result
As you can see, the third label is not rendered correctly.
Environment:
Java 7 u21, u27, u45, u51, 32-bit and 64-bit. (JavaFX 2.2.3-2.2.45)
Windows 7 Enterprise, Professional 64-bit.
UPDATE
I've verified that the properties file is ISO 8859-1.
Most IDEs (NetBeans at least) handle the files in Unicode by default. If you are creating the properties files in NetBeans and entering the Japanese text in it, then the entered text will be automatically converted to Unicode escapes. To see this, open the properties file with Notepad(++); you will see that the Japanese characters are escaped.
The Unicode-escaped equivalent of "閉じる" is "\u9589\u3058\u308b", whereas "\u00e9\u2013\u2030\u00e3\ufffd\u02dc\u00e3\u201a\u2039" decodes to "é–‰ã�˜ã‚‹". So the program output in the picture is correct. Additionally, if you reopen the jp_test.properties file in NetBeans, you will see the escaped text displayed in decoded form.
EDIT: as per comment,
Why does it do this?
It may be because you are omitting the -encoding parameter of native2ascii, in which case it falls back to the default charset of your system, which may not be UTF-8. This may be the reason for that output.
Also, why is it that Java and Swing have no problems with our properties files as they are,
but FXML can't handle it?
That cannot be the case, because FXML is Java. The only difference may be "using the system charset" versus "overriding the charset in some configuration place".
Anyway, I suggest passing the right -encoding parameter to native2ascii according to the input file's encoding. More specifically, convert the properties files to UTF-8 first, then do the rest. If you are using NetBeans as your IDE, there is no need for native2ascii.
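The two escape sequences compared above can be reproduced with a short sketch. It assumes the system default charset was windows-1252 (a guess that happens to reproduce the sequence from the question exactly), and escape_like_native2ascii is a hypothetical stand-in for the tool, not its real implementation:

```python
TEXT = "閉じる"


def escape_like_native2ascii(s: str) -> str:
    """Escape every non-ASCII character as a 4-digit hex escape (BMP only)."""
    return "".join(c if ord(c) < 128 else f"\\u{ord(c):04x}" for c in s)


utf8_bytes = TEXT.encode("utf-8")

# Correct: the UTF-8 file is read as UTF-8 before escaping.
correct = escape_like_native2ascii(utf8_bytes.decode("utf-8"))

# Wrong: the same bytes are read as windows-1252. Byte 0x81 has no mapping
# there, so it becomes the replacement character U+FFFD.
misread = utf8_bytes.decode("cp1252", errors="replace")
wrong = escape_like_native2ascii(misread)

print(correct)
print(wrong)
```

The second line printed matches the btn.two value in the question, which suggests the file was UTF-8 on disk but was converted without the matching -encoding option.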
Properties files should be ISO 8859-1 encoded, not UTF-8.
Characters can be escaped using \uXXXX.
Tools such as NetBeans are doing this by default, AFAIK.
http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html
http://docs.oracle.com/javase/tutorial/i18n/text/convertintro.html

How can I spell check for multiple languages in emacs?

I mostly write my documentation in HTML, using Emacs as my main editor. Emacs lets you interactively spell-check the current buffer with the command ispell-buffer.
Since I switch between a number of languages, I have an HTML comment at the end of the file specifying the main dictionary and personal dictionary for that file. E.g. for Norwegian (norsk) I use the following pair of dictionaries:
<!-- Local IspellDict: norsk -->
<!-- Local IspellPersDict: ~/.aspell/personal.dict -->
This works great.
However, sometimes I have a paragraph in another language (e.g. English) embedded in an otherwise Norwegian document. Example:
<p xml:lang="en">This paragraph is in English.</p>
The spell-checker naturally flags all the words in such a paragraph as misspellings (since the dictionary only contains Norwegian words).
To avoid this, I've tried to add a "british" dictionary to the document, like this:
<!-- Local IspellDict: british -->
<!-- Local IspellDict: norsk -->
<!-- Local IspellPersDict: ~/.aspell/personal.dict -->
Unfortunately, this does not work. The "british" dictionary is simply ignored.
My preferred solution would be to load an additional dictionary and use it, together with the primary dictionary, for spell-checking. Is this possible?
However, I am also interested in a solution that lets me mark paragraphs as not to be spell-checked. It is not ideal, but it would stop valid English words from being flagged as misspellings.
PS: I have also looked at the answer to this question: Multilingual spell checking with language detection, but it is much broader and does not address the specific use of Emacs ispell for doing the spell-check.
Try ispell-multi and flyspell-xml-lang http://www.dur.ac.uk/p.j.heslin/Software/Emacs/
You can spawn multiple instances of ispell, and use the xml:lang tag to decide which language to check for.

What causes Tuckey's UrlRewriteFilter to malform urlencoded unicode characters (e.g. %C3%B6 for ö) and how can I avoid it?

We are using a simple UrlRewriteFilter rule to permanently (301) redirect HTTP requests without trailing slash to the same URL with trailing slash.
In some cases our presentation layer needs URLs with encoded special characters (e.g. %C3%B6 for ö) in it, which works fine as long as the UrlRewriteFilter is not involved. But when the rule kicks in I can see the encoded character getting malformed while redirecting, e.g.
www.mydomain.com/asdf%C3%B6asdf/ --> 301 --> www.mydomain.com/asdf%F6asdf/
%F6 is not a valid UTF-8 sequence (it ends up as a question mark in a black diamond when URL-decoded).
We use UTF-8 throughout our application; it's set in the response headers as well as in the HTML's <head> section. The malformed encoding occurs on both Windows and Linux machines. The rewrite rule looks as follows:
<rule enabled="true" match-type="regex" >
<name>Force trailing slash</name>
<note>...</note>
<condition type="request-uri" operator="notequal">...</condition> <!-- some URLs shall not be redirected -->
<from>(^[^\?]*)(\?.*)?$</from>
<to type="permanent-redirect" last="true" >$1/$2</to> <!-- adding trailing slash and query string, if present -->
</rule>
I'd be happy for any ideas on how this could be solved. I've played with the decode-using and encode attributes, but it did not help.
I had a similar problem. What I did was set decode-using to null:
<urlrewrite decode-using="null">
The issue I described seems to be related to this bug report, which was filed in 2010 and has been untouched since then. I'll probably have to work around this by handling the request "manually" using Java. Other ideas are still welcome, though.
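The %C3%B6 to %F6 change is consistent with the path being decoded as UTF-8 and then re-encoded as ISO 8859-1. Whether UrlRewriteFilter actually does this internally is an assumption; the sketch below only reproduces the symptom with Python's standard library:

```python
from urllib.parse import quote, unquote

original = "asdf%C3%B6asdf"

# Step 1: percent-decode the path as UTF-8, giving the real character.
decoded = unquote(original, encoding="utf-8")

# Step 2 (assumed faulty step): percent-encode again, but as ISO 8859-1,
# where the character is the single byte 0xF6.
reencoded = quote(decoded, encoding="latin-1")

print(reencoded)
# A UTF-8 consumer cannot decode the lone byte 0xF6 and substitutes U+FFFD.
print(unquote(reencoded))
# Re-encoding as UTF-8 instead round-trips cleanly.
print(quote(decoded, encoding="utf-8"))
```

This would also explain why skipping the decode step (decode-using="null") avoids the problem: the already-encoded bytes are passed through untouched.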

cruisecontrol config.xml special characters svnbootstrapper

I have a line like:
<svnbootstrapper LocalWorkingCopy="${projects.dir}/${project.name}" Password="4udr=qudafe$h$&e4Rub" Username="televic-education" />
in my config.xml. Because of the special characters in the Password, the CruiseControl service won't start. Is there a way to solve this?
Maybe with setting a property? Or escaping characters?
thx, Lieven Cardoen
Replace & with &amp; in the password attribute. I don't know if that's the problem, but it's definitely a problem.
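A quick way to see the difference, sketched with Python's XML parser rather than CruiseControl's own (the element is cut down to the one attribute, and is_well_formed is a hypothetical helper):

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

PASSWORD = "4udr=qudafe$h$&e4Rub"


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Raw value pasted straight into the attribute, as in the question.
raw_element = f'<svnbootstrapper Password="{PASSWORD}" />'

# quoteattr() escapes the value and wraps it in quotes.
safe_element = f"<svnbootstrapper Password={quoteattr(PASSWORD)} />"

print(is_well_formed(raw_element))
print(safe_element)
print(ET.fromstring(safe_element).get("Password") == PASSWORD)
```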