Encoding UTF-8 for Czech chars

As a beginner, I want to ask what basic settings you use to make a document's encoding UTF-8.
Below is how I do it; please correct me if something is wrong. I want to be able to rely on the text rendering as it should on all devices, in different browsers and with different user settings, so I do the following:
I use Notepad++. First, in the Format tab, I choose "change the encoding to UTF-8" (if it isn't UTF-8 already).
I mostly use <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">, or sometimes <!DOCTYPE html>, so I then choose the matching meta tag for the head: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/> for the former, or <meta charset="UTF-8" /> for the latter.
I'm concerned mainly about the Czech characters.
Am I right, or is it not that simple if I expect HTML, PHP, JS and maybe MySQL to work together?
Thank you for your answers, and sorry for my imperfect English.

If you read text from a database, make sure that the database is set to utf8 and that the columns are as well. Then you can use SET NAMES UTF8 to make sure the connection encoding is utf8 too. Just make it your first query to the database.
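The editor and meta-tag steps from the question can be sketched in Python, purely as an illustration. The helper name write_utf8_page and the file name index.html are made up for this example; the point is only that the bytes on disk and the encoding declared in the markup agree.

```python
import os
import tempfile

CZECH_SAMPLE = "ěščřžýáí"

def write_utf8_page(path, body_text):
    """Write a minimal HTML5 page to `path`, encoded as UTF-8 without a BOM."""
    page = (
        "<!DOCTYPE html>\n"
        '<html lang="cs">\n'
        "<head>\n"
        '<meta charset="UTF-8">\n'
        "<title>Test</title>\n"
        "</head>\n"
        f"<body><p>{body_text}</p></body>\n"
        "</html>\n"
    )
    # encoding= is explicit so the result does not depend on the OS default
    with open(path, "w", encoding="utf-8", newline="\n") as fh:
        fh.write(page)

with tempfile.TemporaryDirectory() as tmp:
    page_path = os.path.join(tmp, "index.html")
    write_utf8_page(page_path, CZECH_SAMPLE)
    with open(page_path, "rb") as fh:
        raw = fh.read()

# Raises UnicodeDecodeError if the file on disk is not valid UTF-8
decoded = raw.decode("utf-8")
print(CZECH_SAMPLE in decoded)
```

The same check (read the raw bytes, decode them with the encoding the page declares) works for any file an editor has saved, and is a quick way to confirm that the editor setting and the meta tag match.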

Related

Set Chinese Fonts on HTML Emails (Outlook)

Is it possible to set a Chinese font on HTML Emails for Outlook 2013? I want to be able to change the style of the punctuation (commas and full stops),
so that it looks similar to the Microsoft JhengHei font instead of the SimSun font.
There are a couple of things you can do to make sure Chinese characters display on the web or in email. First, some code for the email <head>:
<!DOCTYPE html>
<!--
Set HTML language attribute
zh = Chinese
zh-Hans = Chinese (Simplified)
zh-Hant = Chinese (Traditional)
-->
<html lang="zh" xmlns="http://www.w3.org/1999/xhtml" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office">
<head>
<!--
utf-8 works for most cases, including Chinese
-->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
</html>
You must make sure that you save your document in UTF-8 format and upload it to your server or ESP in a way that preserves that encoding. Some editors don't save as UTF-8, or aren't configured to do so by default, so you may need to check that.
But ultimately these fonts won't display if a user doesn't have them installed on their local system. Specifying an appropriate font stack behind Microsoft JhengHei will help ensure that something shows up.

A trouble with czech encoding

I'm new here and I have a question about encoding.
I created a simple HTML page and I use Czech characters in it (ěščřžýáí)
But when I open it in a browser, the characters are deformed and they look... Russian... and the encoding is set to "windows-1251" instead of "windows-1250" as it should be.
So I added this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//CZ" "http://www.w3.org/TR/html4/strict.dtd">
And this:
<meta charset="windows-1250">
But it didn't help. It still looks Russian. So, could you please help me?
TL;DR version:
Shows "dnщ zbэvб do zaибtku novй шady" instead of "dnů zbývá do začátku nové řady"
Thank you very much!
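You could use UTF-8. Make sure your editor is also saving the file as UTF-8. Read this, it helped me a lot.
Also, for HTML 4, you need the longer form of the meta tag, something like <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">, with the charset replaced by the encoding the file is actually saved in (windows-1250 or UTF-8 in your case).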
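The garbled output in the question can be reproduced in a few lines of Python. This is a sketch under one assumption: the file was saved as windows-1250 (Central European) and the browser decoded the same bytes as windows-1251 (Cyrillic). The assumption is not stated in the question, but it yields exactly the text quoted there.

```python
ORIGINAL = "dnů zbývá do začátku nové řady"

# Bytes as an editor would save them when set to windows-1250
saved_bytes = ORIGINAL.encode("cp1250")

# What a browser shows if it decodes those same bytes as windows-1251
misread = saved_bytes.decode("cp1251")

# Decoding with the encoding that was actually used restores the text
restored = saved_bytes.decode("cp1250")

print(misread)
print(restored == ORIGINAL)
```

Nothing is lost in the file itself; the bytes are fine and only the label is wrong. That is why declaring the real encoding (or re-saving as UTF-8 and declaring UTF-8) fixes the display.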

Name of uploaded files in unicode

I'm having problems getting the correct names of files uploaded to a NancyFx web application.
I'm Spanish, and we have uncommon characters such as
ñ á é í ó ú (plus their uppercase forms) and many more.
When I pick up the already uploaded file from this.Request.Files.FirstOrDefault().Name, the names are always badly encoded.
I have tried a lot of transformations with no success.
Any suggestions are highly appreciated.
Does your HTML page contain a
<META http-equiv="Content-Type" content="text/html; charset=utf-8">
within the <HEAD> element?
I have had the same experience with Korean file names.
After some more googling, I found this NancyFx GitHub issue: https://github.com/NancyFx/Nancy/issues/1850
It's a fixed bug (but I am using a Nancy 0.x version, so it did not help me).
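The question does not show what the broken names look like, so the following is only a guess at one common failure mode: the browser sends the file name as UTF-8 bytes and the server decodes them as Latin-1. It is a Python sketch of the byte-level effect, not NancyFx code, and the sample name and the function repair_latin1_mojibake are invented for illustration.

```python
def repair_latin1_mojibake(name):
    """Undo a UTF-8-read-as-Latin-1 mix-up; return the input unchanged if it does not apply."""
    try:
        return name.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return name

sent = "ñandú.txt"

# UTF-8 bytes from the client, wrongly decoded as Latin-1 on the server
received = sent.encode("utf-8").decode("latin-1")

repaired = repair_latin1_mojibake(received)
print(received, repaired)
```

If the names you see have two odd characters where one accented letter should be, this kind of round trip may recover them. If they contain question marks or replacement characters instead, the original bytes are already gone and the fix has to happen where the request is parsed.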

how to make german web site

What type of encoding do I need, or what else do I have to do, to make my web site properly display text with German characters, like this: Käse, and not like this: K�se?
Here is what I use for doctype:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
and here is what I use for encoding:
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
The collation that I use in MySQL is utf8_general_ci. I have never built a web site from scratch in any language other than English, and I don't know what I am missing!
Thank you for your time!
Your encoding choice looks fine.
There are just two steps left: you have to make sure that the content type in the HTTP header says the same thing, and you have to make sure that what you actually send is encoded as UTF-8.
UTF-8 should be used for sites that cater for many languages, so it is suitable for your needs.
The meta tag is correct too, though you may want to ensure that the server is sending the right Content-Type header.
Ensure that the HTML file is also encoded with UTF-8 and not ASCII or another codepage.
In general, you need to ensure that all steps from the DB to the browser use UTF-8 (so, DB columns are UTF-8, transferred to the server as UTF-8, rendered as UTF-8, transferred to the browser as UTF-8 with the right headers and meta tags).
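The two server-side points above (the header must name UTF-8, and the bytes sent must really be UTF-8) can be sketched with Python's standard library. This is an illustrative stand-in, not the asker's PHP setup; the page content, build_response and Utf8Handler are names made up for the example.

```python
from http.server import BaseHTTPRequestHandler

PAGE = """<!DOCTYPE html>
<html lang="de">
<head><meta charset="utf-8"><title>Käse</title></head>
<body><p>Käse, Brötchen, Straße</p></body>
</html>
"""

def build_response(text):
    """Return (headers, body) with the body encoded as UTF-8 and a matching header."""
    body = text.encode("utf-8")
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        # Content-Length counts bytes, not characters
        "Content-Length": str(len(body)),
    }
    return headers, body

class Utf8Handler(BaseHTTPRequestHandler):
    """Serves PAGE on every GET with a header that matches the body encoding."""

    def do_GET(self):
        headers, body = build_response(PAGE)
        self.send_response(200)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)

headers, body = build_response(PAGE)
print(headers["Content-Type"], len(PAGE), len(body))
```

To try it locally, something like HTTPServer(("127.0.0.1", 8000), Utf8Handler).serve_forever() from http.server should serve the page; the address and port are arbitrary. The body is longer in bytes than in characters because each umlaut and the ß take two bytes in UTF-8.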
From my experience, for UTF-8 to work right:
The MySQL data needs to be in one of the "utf8" collations
The meta tag needs to define the charset as "utf-8"
The MySQL connector needs to be set to "utf-8" (for PHP, it's mysql_set_charset, or mysqli_set_charset with the newer mysqli extension)
The server-side file (*.php or the like) needs to be saved in UTF-8 (not actually necessary, but it saves some pain)
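To find out which link in that chain is broken, it helps to look at the bytes. The Python sketch below assumes that the K�se symptom comes from Latin-1 bytes being served under a UTF-8 label, which is a common cause but not confirmed by the question; looks_like_utf8 is a helper name invented here.

```python
def looks_like_utf8(data):
    """Return True if `data` (bytes) decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

word = "Käse"
as_utf8 = word.encode("utf-8")      # b'K\xc3\xa4se'
as_latin1 = word.encode("latin-1")  # b'K\xe4se'

# Latin-1 bytes served under a UTF-8 label: the lone 0xE4 byte is invalid,
# so a decoder substitutes U+FFFD, the replacement character
shown = as_latin1.decode("utf-8", errors="replace")

print(looks_like_utf8(as_utf8), looks_like_utf8(as_latin1), shown)
```

Running looks_like_utf8 on the raw bytes of a saved file, or on a value fetched from the database connection, narrows down whether the file, the connection or the output step is still using another encoding.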

why "»" shows as a question mark("?") in my page?

Are there any restrictions for it to show normally?
Sounds like an encoding problem. For special characters like that, I prefer to use HTML entities. In this case, try &raquo;
In my experience, a question mark usually replaces special characters that cannot be decoded when you encode them as UTF-8, because web browsers by default decode the web page as ISO Latin-1. You can, and should, explicitly declare the encoding of your web page using the following directive:
<?xml version="1.0" encoding="UTF-8" ?>
for xhtml, or
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
(inside the <head> element), for HTML.
Regard this post as a supplement, because I guess that using the XML/HTML entities like &raquo; or &#187; mentioned above is the better way to go.
You can also use &#187;
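As a quick check of the entity approach, the Python standard library can show that the named and numeric spellings refer to the same character, and how a literal question mark can appear when a character is forced into an encoding that lacks it. This is a small illustration, not a description of what any particular server does.

```python
import html

RAQUO = "\u00bb"  # the » character, code point 187

# Both entity spellings decode to the same character
named = html.unescape("&raquo;")
numeric = html.unescape("&#187;")

# Going the other way: emit ASCII-only markup, turning every
# non-ASCII character into a numeric character reference
ascii_markup = "next \u00bb".encode("ascii", errors="xmlcharrefreplace").decode("ascii")

# Forcing the character into an encoding that lacks it, with
# errors="replace", produces a literal question mark
lossy = RAQUO.encode("ascii", errors="replace")

print(named == numeric == RAQUO, ascii_markup, lossy)
```

Because an entity consists of plain ASCII characters, it survives any mismatch between the declared and the actual encoding, which is why it sidesteps the problem rather than fixing it.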
If your Apache server is configured with...
AddDefaultCharset UTF-8
...in the httpd.conf file (which, strangely, was the default on my server), then Content-Type specs in the .html files (e.g., <meta http-equiv=Content-Type content="text/html; charset=windows-1252">) will be ignored, because the charset sent in the HTTP Content-Type header takes precedence over the meta tag. Character codes above 127 are then interpreted incorrectly.
Comment out the AddDefaultCharset line and restart Apache.
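The effect of that Apache setting can be modelled in a few lines of Python. This is a deliberately simplified sketch: it only encodes the rule that a charset in the HTTP header wins over the meta tag, and it ignores byte order marks and manual browser overrides. The function names are made up for the example.

```python
from email.message import Message

def charset_from_content_type(value):
    """Extract the charset parameter from a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()

def effective_charset(http_content_type, meta_charset):
    """Simplified model: a charset in the HTTP header wins over the meta tag."""
    from_header = charset_from_content_type(http_content_type) if http_content_type else None
    return from_header or meta_charset

# A page saved as windows-1252 that contains » (byte 0xBB)
page_bytes = "a \u00bb b".encode("cp1252")

# With AddDefaultCharset UTF-8 the header carries a charset; without it, it does not
with_default = effective_charset("text/html; charset=UTF-8", "windows-1252")
without_default = effective_charset("text/html", "windows-1252")

broken = page_bytes.decode(with_default, errors="replace")
correct = page_bytes.decode(without_default)

print(with_default, without_default, broken, correct)
```

In this model the page only displays correctly once the header stops announcing UTF-8, which matches the advice above. The alternative is to keep the header and re-save the files as UTF-8 so that the header becomes true.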