UTF-8 Unicode in HTML and utf8_general_ci in MySQL

I have a MySQL database with collation utf8_general_ci.
In HTML the text is shown as ????????
P.S. I am using
<meta charset="utf-8">
and this
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
In the MySQL database the text is shown as needed, but in HTML it is shown as ????????
P.S. It's Georgian Unicode.

use "set_charset" function after connecting to Mysql database.

Related

Rails 5 request not sending German letters to controller

Using "ISO-8859-1" encoding.
layout/application.html.erb
<meta charset="ISO-8859-1">
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
<meta content="IE=edge" http-equiv="X-UA-Compatible"/>
<meta content="width=device-width, initial-scale=1" name="viewport"/>
<?xml version="1.0" encoding="ISO-8859-1"?>
application.rb
config.encoding = "ISO-8859-1"
database.yml
adapter: mysql2
encoding: latin1
collation: latin1_swedish_ci
charset: latin1
Above are my declarations in Rails.
So now I can see proper data on the html.erb page in German format, e.g. WÄRTSILÄ.
But while sending the request from the page to the controller, the params are omitting special chars like Ä. I am getting the following string in the controller:
"name"=>"WRTSIL"
I am getting the following error for simple_format():
ActionView::Template::Error (incompatible character encodings: ISO-8859-1 and UTF-8)
Are the above declarations right for German letters?
Or what encoding should be used at the page level to accept German letters, and what should be updated in database.yml to save them in the database?
Or is there something I need to update for the application server (Puma / Thin)?
Versions: Rails 5, Ruby 2.5.1

Preserve encoding for included files

I have used UTF-8 encoding and Classic ASP with VBScript as the default scripting language on my website. I have separated the files into smaller parts for better management.
I always use this trick in the first lines of the separated files to preserve UTF-8 encoding while saving the files; otherwise the language characters are converted to weird characters.
mainfile.asp
<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body>
<!--#include file="sub.asp"-->
</body>
</html>
sub.asp
<%if 1=2 then%>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
<%end if%>
These are some characters in another language:
تست متن به زبان فارسی
This trick works well for offline saving and also works well when the page is running on the server, because these extra lines are omitted (the condition is always false!).
Is there a better way to preserve encoding in separated files?
I use Microsoft expression web for editing files.
I use Textpad to ensure that all main files and includes are saved in UTF-8 encoding. Just hit 'Save As' and set the encoding dropdown on the dialog to the one you want.
Keep the meta tag as well because that is still necessary.

Represent encoding used for a text file

How is the encoding for a simple text file stored?
In an email there's a header
Content-Type: text/plain; charset="UTF-8"
In html we have a meta tag
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
That leaves me with the question of how a text editor knows what encoding is used, since we don't explicitly set this in a text file as we do with an HTML file.
If it's a standard complex format, like .docx or .pdf, the encoding is likely to be stored there as some sort of property.
If it's a simple file, like .txt or .csv, the encoding will not be stored anywhere. A text editor will use heuristics to determine which encoding was used to save the file, but it will only be a guess.
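As an illustration only, here is what such a guess might look like using PHP's mbstring functions; the file name and the candidate encoding list are assumptions, not what any particular editor actually does:
<?php
// A plain text file stores no encoding metadata, so we can only guess.
function guess_encoding(string $bytes): string
{
    // A UTF-8 byte order mark at the start of the file is a strong hint,
    // although UTF-8 files are not required to have one.
    if (substr($bytes, 0, 3) === "\xEF\xBB\xBF") {
        return "UTF-8 (with BOM)";
    }
    // Otherwise try candidate encodings in order; strict mode rejects byte
    // sequences that are invalid for the candidate. The order matters, because
    // single-byte encodings such as Windows-1252 accept almost any bytes.
    $guess = mb_detect_encoding($bytes, ["UTF-8", "Windows-1252", "ISO-8859-1"], true);
    return $guess !== false ? $guess : "unknown";
}

echo guess_encoding(file_get_contents("example.txt"));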
Read more:
How to detect the encoding of a file?
Heuristic to detect encoding

JavaScript can put an ANSI string in a text field, but not UTF-8?

I always use UTF-8 everywhere. But I just stumbled upon a strange issue.
Here's a minimal example html file:
<html>
<head>
<meta charset="utf-8" />
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<script type="text/javascript">
function Foo()
{
var eacute_utf8 = "\xC3\xA9";
var eacute_ansi = "\xE9";
document.getElementById("bla1").value = eacute_utf8;
document.getElementById("bla2").value = eacute_ansi;
}
</script>
</head>
<body onload="Foo()">
<input type="text" id="bla1">
<input type="text" id="bla2">
</body>
</html>
The HTML contains a UTF-8 charset header, so the page uses UTF-8 encoding. Hence I would expect the first field to contain an 'é' (e acute) character, and the second field something like '�', as a single E9 byte is not a valid UTF-8 encoded string.
However, to my surprise, the first contains 'Ã©' (as if the UTF-8 data is interpreted as some ANSI variant, probably ISO-8859-1 or Windows-1252), and the second contains the actual 'é' char. Why is this!?
Note that my problem is not related to the particular encoding that my text editor uses; that is exactly why I used the explicit \x character constructions. They contain the correct binary representation (in ASCII-compatible notation) of this character in ANSI and UTF-8 encoding.
Suppose I wanted to insert an 'ę' character, which is Unicode U+0119, or 0xC4 0x99 in UTF-8 encoding, and does not exist in ISO-8859-1, Windows-1252 or Latin-1. How would that even be possible?
JavaScript strings are always strings of Unicode characters, never bytes. Encoding headers and meta tags do not affect the interpretation of escape sequences, and the \x escapes do not specify bytes; they are shorthand for individual Unicode characters, so \xC3 is equivalent to \u00C3. The behavior is therefore expected: "\xC3\xA9" is the two-character string "Ã©", and "\xE9" is "é". To insert 'ę' (U+0119), write the character itself in the string literal or use the Unicode escape "\u0119"; there is no need to think in terms of UTF-8 bytes.

Zend_Form doesn't accept Latin characters (ú, ë, etc.)?

I can't get Zend_Form to accept any inserted Latin characters (ü, é, etc.).
Even if I'm not validating, it doesn't accept them.
Does anyone know how to get this to work?
Gr. Tosh
After doing a couple of tests, it seems to be a simple character encoding issue.
Your server is probably not delivering documents with UTF-8 encoding. You can easily force this in your view / layout by placing this in your <head> (preferably as the first child)
<meta http-equiv="content-type" content="text/html;charset=UTF-8" />
or, if using an HTML5 doctype,
<meta charset="utf-8">
It probably doesn't hurt to set the Zend_View encoding as well in your application config file, though this wasn't necessary in my tests (I think "UTF-8" is the default anyway):
resources.view.encoding = "utf-8"
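If the meta tag alone doesn't help, you can also force the charset at the HTTP level. This is a general PHP approach rather than anything Zend_Form-specific, and it assumes it runs before any output is sent (e.g. early in the bootstrap):
<?php
// Send an explicit charset in the Content-Type response header, so the
// browser doesn't fall back to guessing or to a non-UTF-8 default.
header('Content-Type: text/html; charset=utf-8');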