I have a few different languages in a GWT project. For some reason the accented characters in locale_fr.properties work fine, but the accented characters in locale_es.properties do not display correctly (they show as "?" in a triangle). The host HTML page uses UTF-8 encoding, as does the project .xml file, so it doesn't seem like the issue is with the type of encoding I'm using. I can't figure out why the accented characters would work for one language but not another. There is no difference in how the .properties files are set up from what I can tell. Any suggestions on what could be causing this issue?
I am using Redactor.
When pasting in text, whether it's from Word or TextEdit, I am getting really odd characters after saving.
For example an apostrophe is being changed to ’ and any time I open that page and re-save it, it adds even more characters like this ‚¢ÃƒÂÂ
The page uses utf-8 charset
Any help would be great. Thank you
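One common way to get this symptom, offered as a guess rather than a diagnosis of this particular Redactor setup, is text saved as UTF-8 being read back as Windows-1252 and then saved as UTF-8 again. The Python sketch below reproduces that round trip for a curly apostrophe. The function name misread_as_cp1252 is made up for illustration, and Windows-1252 as the misreading charset is an assumption.

```python
# Sketch: how one UTF-8 -> Windows-1252 misread turns a curly apostrophe
# into three characters, and how a second save makes the run longer.
# Assumption: the misreading charset is Windows-1252; yours may differ.

def misread_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")

apostrophe = "\u2019"  # right single quotation mark, as pasted from Word
once = misread_as_cp1252(apostrophe)
twice = misread_as_cp1252(once)

print(repr(once), len(once))
print(repr(twice), len(twice))

# Reversing a single misread restores the original character.
repaired = once.encode("cp1252").decode("utf-8")
print(repr(repaired))
```

If the stored text grows on every save like this, the usual place to look is any step between the editor and the database (form submission, database connection, template output) that does not agree on UTF-8.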
I am developing a website in the Georgian language. The Georgian alphabet has its own Unicode range, but there are also special fonts which have Georgian glyphs in place of English characters, a bit like the "Symbol" and "Dingbats" fonts.
For example the string "saqarTvelo" will be rendered as "საქართველო" with these fonts. So now I have two options and don't know what to do:
1. Using Georgian Unicode for my website, but the problem is that all fonts are created for English Unicode and don't work with Georgian Unicode.
2. Using Georgian fonts with English Unicode. But I don't know how search engines will react.
Please tell me what to do, I am stuck!
The short answer is that using the approach you mean in option 2, search engines will see the word “საქართველო” in your text as “saqarTvelo”, so normal searches will fail.
The question seems to refer to two different ways of using Georgian letters on web pages:
1. Using Unicode encoding, so that characters will be rendered using a Unicode-encoded font (which is what most fonts are, but most fonts don’t contain Georgian letters).
2. Using a nonstandard, “private” encoding, usually one that maps 256 different code positions (8-bit combinations) to whatever characters are needed for some purpose. This presumes that the text is rendered using a font encoded the same way.
Method 2 can be characterized as a wrong approach, but it has been used on the web since the early days (even when CSS was not available and one had to resort to <font face=...> for setting the font). It really does not work unless the user’s computer has the specific, “privately” encoded font (or some font encoded exactly the same way). Since search engines are font-agnostic, they only see the 8-bit codes and try to interpret them in the encoding declared or implied for the page, not in the “private” encoding (which cannot be declared since it has no published definition and no standard name, or any name for that matter).
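To make the search problem with method 2 concrete, here is a small Python sketch. The mapping table is hypothetical and partial, reconstructed only from the “saqarTvelo” example in the question; a real privately encoded font would define its own full table.

```python
# Hypothetical, partial mapping for a "privately" encoded Georgian font:
# the page stores Latin letters, the font merely draws Georgian glyphs.
# Only the letters needed for the example from the question are listed.
PRIVATE_FONT_GLYPHS = {
    "s": "ს", "a": "ა", "q": "ქ", "r": "რ", "T": "თ",
    "v": "ვ", "e": "ე", "l": "ლ", "o": "ო",
}

def as_rendered(stored: str) -> str:
    """What a reader sees when the special font is installed."""
    return "".join(PRIVATE_FONT_GLYPHS.get(ch, ch) for ch in stored)

stored_text = "saqarTvelo"      # what is actually in the page source
query = "საქართველო"            # what a Georgian user types into a search box

print(as_rendered(stored_text))  # looks right on screen with the font
print(query in stored_text)      # a plain text search finds no match
```

The search engine only ever works with stored_text, never with the glyphs the font draws, which is why the Georgian query cannot match.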
Method 1 has the problem that for it to work, the user’s computer needs to have some (Unicode-encoded) font that supports the characters used. Nowadays, this can be reasonably well solved using a downloadable font (web font) via @font-face. Fonts that support Georgian letters include some useful free fonts like the DejaVu fonts, the GNU FreeFont fonts, and Quivira. For more info on this approach, see my Guide to using special characters in HTML.
Using method 1, search engines will see the Georgian letters correctly, provided that the document’s encoding (normally UTF-8) has been properly declared or can be inferred by the search engine.
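As a rough illustration of why the declared encoding matters for method 1, the Python sketch below encodes the Georgian word as UTF-8 and then decodes the same bytes twice: once as UTF-8 and once with a wrong guess. Windows-1252 as the wrong guess is only an example chosen here, not something the answer names.

```python
word = "საქართველო"

# Every character sits in the Georgian Unicode block (U+10A0..U+10FF).
in_georgian_block = all(0x10A0 <= ord(ch) <= 0x10FF for ch in word)

# Each of these letters takes three bytes in UTF-8.
utf8_bytes = word.encode("utf-8")

# Decoded with the declared encoding, the text survives intact.
declared_ok = utf8_bytes.decode("utf-8")

# Decoded with a wrong guess (Windows-1252, an assumed example),
# the same bytes turn into unrelated characters.
wrong_guess = utf8_bytes.decode("cp1252", errors="replace")

print(in_georgian_block, len(utf8_bytes))
print(declared_ok == word, wrong_guess == word)
```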
I have a search index that handles both English and Chinese content. All documents are imported to Solr by Solr.NET.
When I search the Chinese content from the browser (localhost:8389/solr/...) everything works fine, but when I execute the search using Solr.NET I get 0 hits :(
I tried to set up a logger to see what's the difference between browser search and Solr.NET search but I get question marks (??????) instead of Chinese characters.
Any help would be greatly appreciated!
Open solr0.log with your browser instead of Notepad++ in order to make sure that this is a real problem and to find out which encoding the file was saved as - if it's not Unicode, chances are that Solr.NET can't search it.
Notepad++ has been known to have trouble with Chinese, either because of the encoding (Notepad++ doesn't support typical Chinese encodings like GB or Big5) or because of the default font, which may or may not support Chinese characters. Browsers are more forgiving and will do anything in order to display the file correctly. In fact, to convert non-Unicode Asian encodings to Unicode, I often open a file in my browser, which will detect the correct encoding and display the raw contents, and then CTRL+A, CTRL+C, CTRL+V everything into an editor, save as UTF-8, done.
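The browser trick can also be approximated in a few lines of Python. This is a minimal sketch, not a replacement for real encoding detection: the helper names first_decodable and to_utf8 and the candidate list are assumptions made for the example.

```python
def first_decodable(data: bytes, candidates: list) -> str:
    """Return the first candidate encoding that decodes data without error."""
    for name in candidates:
        try:
            data.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings fit")

def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from source_encoding to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")

# Stand-in for the contents of a log file saved in a Chinese legacy encoding.
sample = "中文".encode("gb18030")

guess = first_decodable(sample, ["utf-8", "gb18030", "big5"])
converted = to_utf8(sample, guess)
print(guess, converted.decode("utf-8"))
```

Trial decoding only proves that the bytes are legal in an encoding, not that the encoding is the right one (Big5 text can decode as GB18030 without error and come out as the wrong characters), so the result still needs a look by someone who can read it.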
I have a WordPress installation that clients can edit, and all characters display OK there. On the main homepage I query the same database for the same title and post content, but it doesn't display correctly, just a question mark.
I have tried sending the UTF-8 headers manually, through .htaccess and through meta tags. I have used SET NAMES utf8 (which turns the characters into the diamond symbol with a question mark inside).
I genuinely can't figure out what it could be now and I really need these characters to display correctly.
Here's the homepage. You can see in the Sounddhism 6 preview that there are lots of question marks; if you click on it you will see what they are meant to look like.
http://nottingham.subverb.net
I have passed it through the validator and it gives me this error:
Sorry, I am unable to validate this document because on line 373 it contained one or more bytes that I cannot interpret as utf-8 (in other words, the bytes found are not valid values in the specified Character Encoding). Please check both the content of the file and the character encoding indication.
The error was: utf8 "\xA0" does not map to Unicode
Which, I appreciate, is supposed to help me, but I don't know what to do about it. Especially since on that line the character generating the error is supposed to be a space and comes AFTER the offending question marks.
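For what it's worth, the byte in that message can be examined in isolation. The Python sketch below uses a made-up byte string (not the real page) to show that 0xA0 on its own is not valid UTF-8, that it is a no-break space in Latin-1, and how to list the offsets of such bytes. The helper name find_invalid_utf8 is hypothetical.

```python
def find_invalid_utf8(data: bytes) -> list:
    """Return (offset, byte value) for each byte where UTF-8 decoding fails."""
    bad = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the rest of the data is valid
        except UnicodeDecodeError as err:
            offset = pos + err.start
            bad.append((offset, data[offset]))
            pos = offset + 1  # skip the bad byte and keep scanning
    return bad

# Made-up sample: ASCII text with a lone 0xA0 byte in the middle.
page_bytes = b"abc\xa0def"
print(find_invalid_utf8(page_bytes))

# In Latin-1 the byte 0xA0 is a no-break space, which looks like a space.
as_latin1 = b"\xa0".decode("latin-1")

# The same character written as proper UTF-8 takes two bytes.
as_utf8 = as_latin1.encode("utf-8")
print(repr(as_latin1), as_utf8)
```

A lone 0xA0 that looks like a space suggests that at least part of the page content is arriving as Latin-1 (or Windows-1252) bytes while the page is labelled UTF-8.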
Can anyone help?
Compare the encoding used by the back-end scripts in WordPress with the encoding used by your homepage script. If you're using IE, right-click the page and check the encoding. Sometimes it's set to "Auto-detect" and IE will often detect a different encoding for different pages, causing strange issues like this.
If you're not using IE, try using a tool like Fiddler to see exactly what encoding is declared and what bytes are being sent back and forth, both in the back-end and in your homepage script.
If forcing UTF-8 on your homepage script doesn't work, I would guess that the back-end is not using UTF-8.
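The two symptoms described in the question (a plain "?" and the diamond with a question mark) can both be reproduced from a mismatch like this. Here is a hedged Python sketch; the title "café" is a stand-in, since the real post content is not shown here, and Latin-1 as the back-end encoding is an assumption.

```python
title = "café"  # hypothetical title standing in for the real post content

# Assumption for the example: one side hands over Latin-1 bytes.
latin1_bytes = title.encode("latin-1")

# Symptom 1: Latin-1 bytes read as UTF-8 give U+FFFD,
# which browsers draw as a diamond with a question mark.
diamond = latin1_bytes.decode("utf-8", errors="replace")

# Symptom 2: text forced into a charset that lacks the character
# gives a plain "?".
plain_q = title.encode("ascii", errors="replace").decode("ascii")

# When both ends agree on UTF-8, the text round-trips unchanged.
consistent = title.encode("utf-8").decode("utf-8")

print(repr(diamond), repr(plain_q), repr(consistent))
```

Seeing the symptom change from "?" to the diamond after forcing UTF-8 on one side fits the guess above: one layer switched to UTF-8 while another is still producing non-UTF-8 bytes.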
I have a file with Chinese text that I want to use in my Xcode project (I'm planning to load it through a database as it is a lot of text). The problem is I don't know how to add the font to my project so that it's viewable when used on an iPhone.
Thanks :)
I currently live in China and deal with this all of the time. Usually the problem is not the font, it's the way the characters are represented. Most Unix variants (and so most OSes) use UTF-8, while Windows uses UTF-16 internally. The cool thing about UTF-8 is that it is backward-compatible with ASCII. Open your text in TextEdit or Firefox. In Firefox you can tell the browser to try different encodings, then save it to a file. If it is in the wrong encoding, Mac TextEdit can convert between UTF-8 and UTF-16. Once you have the string in UTF-8 encoding, you can display it in your text field.
When displaying text in a text field, make sure to display a UTF-8 string, not an ASCII string.
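The points about ASCII compatibility and UTF-16 to UTF-8 conversion can be checked with a few lines of Python. This is only an illustration of the encodings themselves, not of the iPhone APIs, and the sample strings are made up.

```python
# ASCII text has exactly the same bytes in ASCII and in UTF-8.
ascii_text = "hello"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

# Chinese text saved as UTF-16 (Python's "utf-16" codec adds a byte order mark).
chinese = "中文"
utf16_data = chinese.encode("utf-16")

# Converting to UTF-8: decode with the source encoding, encode with the target.
utf8_data = utf16_data.decode("utf-16").encode("utf-8")

# ASCII has no room for Chinese characters at all.
try:
    chinese.encode("ascii")
    ascii_can_hold_chinese = True
except UnicodeEncodeError:
    ascii_can_hold_chinese = False

print(same_bytes, len(utf16_data), len(utf8_data), ascii_can_hold_chinese)
```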
If you are interested in the details of UTF-8, just say so and I will expand on the UTF-8 design.
The iPhone already has Chinese fonts installed by default.
I've had some success using the FontLabel library. It allows you to use arbitrary .ttf fonts in your app and it's Apache-licensed:
http://github.com/zynga/FontLabel
For the majority of cases this has worked perfectly for me.