How can I put a Unicode character in the label (constructor argument) of a GWT CheckBox? If I put an HTML entity such as &euml; in, GWT escapes the & and I end up with the literal text &euml; in the label of the checkbox instead of ë.
Unicode characters in Java String literals follow a special syntax.
In your case, you could write it like this:
new CheckBox("H\u00ebllo")
The code for "ë" is 00eb - you can use e.g. this table. By the way, 00eb (hexadecimal) = 235 (decimal).
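As a quick check of the escape syntax, the following sketch confirms that the escape yields a single character with code 235. The class and method names are made up for illustration, and the GWT CheckBox is left out so the snippet runs on a plain JVM.

```java
final class UnicodeEscapeDemo {
    // Hypothetical helper: the label text from the answer, written with an escape.
    static String label() {
        return "H\u00ebllo";
    }

    public static void main(String[] args) {
        String label = label();
        // The escape denotes one char whose numeric value is 0xEB.
        System.out.println((int) label.charAt(1)); // 235
        // The whole label is five chars long, not ten.
        System.out.println(label.length());        // 5
    }
}
```

The numeric values are printed rather than the string itself, because how "ë" appears on a console depends on the console's own encoding.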
Another possibility is to save your Java files as UTF-8. Then you can write your literals without escaping these characters. This, however, also requires that the compiler reads the sources as UTF-8, for example via the JVM option -Dfile.encoding=UTF-8 (for plain javac, the corresponding option is -encoding UTF-8). Many IDEs do this automatically if you set the encoding preference for the file to UTF-8.
Another important factor is that you should set the charset of your HTML page correctly (usually UTF-8):
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
international html files archived by wget
should contain chars like this
(example hebrew and thai:)
אב
הם
and ยคน
instead they are saved like this:
íäáåãéú and ÃÒ¡à§é
How do I get these displayed properly?
iconv filename.html
iconv: illegal input sequence at position 1254
SOLVED: There was nothing wrong.
I just hadn't noticed that the default php.ini sets the charset in the HTTP header, but
to use various charsets like this meta http-equiv="Content-Type" content="text/html; charset=windows-874" you need to set: default_charset = "empty";
....
The pages aren't "saved like this"; whatever you're using to view the file is simply interpreting the encoding incorrectly. To know what encoding the file is in, you should have paid attention to the HTTP Content-Type header during download; that's gone now.
Your only other chance is to parse the equivalent HTML meta tag in the <head>, if the document has one.
Otherwise, you can only guess the encoding of the document.
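A minimal sketch of the meta-tag fallback, assuming the declaration sits within the first few kilobytes of the file and the encoding is ASCII-compatible. The class name, the 4096-byte limit, and the regular expression are illustrative choices, not a full implementation of the browser sniffing rules.

```java
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MetaCharsetSniffer {
    // Matches charset=NAME, which covers both <meta charset="...">
    // and the http-equiv form with content="text/html; charset=...".
    private static final Pattern CHARSET = Pattern.compile(
        "charset\\s*=\\s*[\"']?([A-Za-z0-9_.:-]+)", Pattern.CASE_INSENSITIVE);

    static Optional<String> sniff(byte[] rawHtml) {
        // Decode as ISO-8859-1 so every byte maps to exactly one char;
        // the tag itself is plain ASCII in any ASCII-compatible encoding.
        int limit = Math.min(rawHtml.length, 4096);
        String head = new String(rawHtml, 0, limit, StandardCharsets.ISO_8859_1);
        Matcher m = CHARSET.matcher(head);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        byte[] page = ("<html><head><meta http-equiv=\"Content-Type\" "
            + "content=\"text/html; charset=windows-874\"></head></html>")
            .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(sniff(page).orElse("unknown")); // windows-874
    }
}
```

If the result is empty, the document declares nothing and guessing is all that remains.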
See What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text for more required background knowledge.
I have some .html with the font defined as:
<font color="white" face="Arial">
I have no other style applied to my tag. In it, when I display data like:
<b> “Software” </b>
or
<b>“Software”</b>
they both display characters I do not want in the UIWebView. It looks like this on a black background:
How do I avoid that? If I don't use font face="arial", it works fine.
This is an encoding issue. Make sure you use the same encoding everywhere. UTF8 is probably the best choice.
You can put a line
<meta http-equiv="content-type" content="text/html;charset=UTF-8" />
in your html to tell UIWebView about the encoding.
To be precise, â€œ is what you get when you take the UTF-8 encoding of “ and interpret it as ISO-8859-1 (in practice Windows-1252, which is how browsers treat that label). So your data is encoded in UTF-8, which is good, and you just need to set the content type to UTF-8 instead of ISO-8859-1 (e.g. using the <meta> tag above).
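The effect can be reproduced outside the web view. The sketch below encodes the left curly quote as UTF-8 and decodes the bytes as Windows-1252. It assumes the JVM provides the windows-1252 charset, which common JDKs do although the Java specification does not guarantee it; the class and method names are made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CurlyQuoteMojibake {
    // Encode as UTF-8, then decode the same bytes with the wrong charset.
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        String garbled = misread("\u201c"); // left double quotation mark
        // Print the numeric values so the result does not depend on the console.
        for (int i = 0; i < garbled.length(); i++) {
            System.out.println(Integer.toHexString(garbled.charAt(i)));
        }
        // e2   (a with circumflex)
        // 20ac (euro sign)
        // 153  (oe ligature)
    }
}
```

One character turns into three because its UTF-8 form is three bytes long (E2 80 9C) and the wrong charset maps each byte to a character of its own.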
You shouldn’t generally use the curly quote characters themselves; character encodings will always mess you up somehow. No idea why it works correctly when you don’t use Arial (though that suggests a great idea: don’t use Arial), but your best bet is to use the HTML entities &ldquo; and &rdquo; instead.
I use a CellList like this
CellList<String> cellList = new CellList<String>(new TextCell());
and then give it an ArrayList<String>.
If a String contains an "ü" I get a question mark in the browser (FF4, GWT Dev Plugin). If I use &uuml; I get the literal text &uuml;.
Where can I specify the encoding, so that "ü" works? (I'm not sure if it makes a difference, but the "ü" is currently hardcoded in the .java file and not read from somewhere else).
The GWT compiler assumes that your Java files are encoded in UTF-8. Make sure that your editor is set to save in that encoding.
You should also make sure to set the encoding of the HTML page to a Unicode-capable encoding like UTF-8 (this allows you to use even more exotic characters that you won't find in other charsets):
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
...
Moreover, if you later want to retrieve the strings from a database, make sure that it is also set up to handle Unicode, and that your JDBC driver connects in Unicode mode (required for some databases).
We built a Java EE web project and use JDBC for storing our data.
The problem is that German umlauts like äöü are in use and properly stored in the MySQL database. We don't know why, but in the browser those characters are broken, displaying weird stuff like
ö�
instead.
I've already tried setting the encoding of the JDBC connection as described in this question:
JDBC character encoding
And the encoding of the html page is correctly set:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
Any ideas how to fix that?
Update
connection.prepareStatement("SET CHARACTER SET utf8").execute();
won't make umlauts work.
changing the meta-tag to
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
doesn't change anything either.
"We don't know why, but in the browser those characters are broken"
Well, that's the first thing to find out. You should trace your data at every stage:
As you fetch it out of the database (with logging)
When you inject it into the page (with logging)
On the wire (via Wireshark)
When you log, don't just log the strings: log the Unicode characters that make up the strings, as integers. Just cast each character in the string to an integer and log it. It's primitive, but it'll tell you what you need to know.
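A minimal sketch of such logging. It uses String.codePoints() rather than a per-char cast so that characters outside the Basic Multilingual Plane are reported as one value; the class name and the U+XXXX output format are illustrative choices.

```java
final class CodePointLogger {
    // Render each code point of the string as U+XXXX, separated by spaces.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append("U+").append(String.format("%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Three umlauts as they should look once fetched from the database.
        System.out.println(describe("\u00e4\u00f6\u00fc")); // U+00E4 U+00F6 U+00FC
        // A string that was already damaged before it reached this stage.
        System.out.println(describe("\u00c3\u00b6"));       // U+00C3 U+00B6
    }
}
```

If a stage logs U+00C3 U+00B6 where U+00F6 was expected, the damage happened before that stage, and the search can move upstream.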
When you look on the wire, of course, you'll be seeing bytes rather than characters as such. You should work out what bytes you expect for your chosen encoding, and check those against what's actually coming across the network.
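To work out the expected bytes, a small helper like the following can print them for each candidate encoding. The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ExpectedBytes {
    // Encode the text with the given charset and render the bytes as hex.
    static String hex(String text, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(cs)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so bytes above 0x7F are not sign-extended.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String oUmlaut = "\u00f6";
        System.out.println(hex(oUmlaut, StandardCharsets.UTF_8));      // C3 B6
        System.out.println(hex(oUmlaut, StandardCharsets.ISO_8859_1)); // F6
    }
}
```

Seeing C3 B6 on the wire while the page is declared as ISO-8859-1 would explain a two-character rendering of a single "ö".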
You've specified the encoding in the HTML - but have you told whatever's generating your page that you want it in ISO Latin 1? That's likely to be responsible for both setting the content-type header and performing the actual conversion from text to bytes.
Additionally, is there any reason why you're using ISO Latin 1 instead of UTF-8? Why would you deliberately restrict yourself like that? (ISO Latin 1 can only handle the first 256 characters of Unicode, instead of the full range of Unicode characters. UTF-8 can handle everything, and is just as efficient for ASCII.)
I'm developing a web app with the Lift Framework and GlassFish v3, and there is a problem with diacritics in my app. I just bind the value to the model, and when I log the value from the input text field, the letters with diacritics are already broken. Where could the problem be?
bind("entry", content,
  "place" -> SHtml.text(lib.place, lib.place = _),
  "submit" -> SHtml.submit("Kaboom", () => {
    Logger.getAnonymousLogger.severe(lib.place)
    Service.library.save(lib)
  })
)
It's probably a general java problem, not limited to Lift.
I enter š and I see Å¡ as the output from logger.
Would it make you feel better or worse to know that the source of your entire problem is located between the keyboard and the back of your chair?
Here's what happened:
You wanted to print out š, a lower-case s-with-caron, which is represented in Unicode by the number 0x161. You printed it out to a file, and your I/O system dutifully (and correctly) encoded it in UTF-8 as 0xC5, 0xA1. Then you asked to view that file without explaining to your viewing program that it was a UTF-8 file. Your viewing program, whatever it was, interpreted the file as ISO 8859-1, a very common, if somewhat elderly, format. The 0xC5 was displayed as Å, A-with-a-ring, and the 0xA1 as ¡, an inverted exclamation mark.
To summarize, there's nothing wrong with the output, there's just something wrong with the way you are looking at it. Bring the log up in an editor and set the encoding to UTF-8 or bring it up in a web browser and select View / Character Encoding / UTF-8 .
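The sequence of events can be replayed in a few lines, which also shows that the data is intact and only the decoding is wrong. This is a sketch with made-up class and method names; the reverse step works here because ISO 8859-1 maps every byte to a character and back without loss.

```java
import java.nio.charset.StandardCharsets;

final class LogMojibake {
    // What the viewer did: UTF-8 bytes decoded as ISO 8859-1.
    static String misreadAsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Undo it: recover the original bytes and decode them as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u0161"; // s with caron
        String garbled = misreadAsLatin1(original);
        System.out.println(Integer.toHexString(garbled.charAt(0))); // c5
        System.out.println(Integer.toHexString(garbled.charAt(1))); // a1
        System.out.println(repair(garbled).equals(original));       // true
    }
}
```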
My guess would be that the clue to this might be the browser. What encoding does the browser assume for the page? Do you have an encoding meta-tag in the head; like this:
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
This could well be a problem in the logger or your console. Try logging to a file and opening that in an editor you know can handle UTF8
One option to get UTF-8 encoding is to specify the encoding in the sun-web.xml file like this:
<sun-web-app error-url="">
<parameter-encoding default-charset="UTF-8"/>
</sun-web-app>
The other option is to set the encoding in the Lift bootstrapping class:
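def boot {
  // Register a function that runs early in the processing of each request.
  LiftRules.early.append(makeUtf8)
}

// Have the request parameters decoded as UTF-8.
private def makeUtf8(req: HTTPRequest) {
  req.setCharacterEncoding("UTF-8")
}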