Which encoding should a Dockerfile use? - encoding

While writing my Dockerfile, I got to this line:
...
MAINTAINER Ramón <ramon#example.com>
...
Which encoding shall I use to save this file?
Shall I escape non-ASCII characters?

Considering Docker is written in Go, and Go has native support for UTF-8, it is best to save a Dockerfile encoded directly in UTF-8.
That way, all characters (ASCII or not) are supported.
See "Dealing with encodings in Go".
Even though Go has good support for UTF-8 (and minimal support for UTF-16), it has no built-in support for any other encoding.
If you have to use other encodings (e.g. when dealing with user input), you have to use third-party packages, such as go-charset.
Here, it is best if the Dockerfile is directly encoded in UTF-8.
Update July 2016, docker 1.12-rc5 adds:
PR 23372: Support unicode characters in parseWords
PR 23234: Skip UTF-8 BOM bytes from Dockerfile and .dockerignore if exist
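To see what encoding a Dockerfile currently has, and to convert it if necessary, something like the following works on a Unix-like system (the ISO-8859-1 source encoding here is only an assumption; substitute whatever file reports):
file Dockerfile
# e.g. "Dockerfile: UTF-8 Unicode text" or "Dockerfile: ASCII text" (plain ASCII is valid UTF-8)
iconv -f ISO-8859-1 -t UTF-8 Dockerfile > Dockerfile.utf8 && mv Dockerfile.utf8 Dockerfile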

You need to set the locale correctly (or remove the accent): check the container's current settings with a basic docker run -it container env, and then configure a correct encoding. The "Bible" on that topic is http://jaredmarkell.com/docker-and-locales/
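A minimal sketch of that approach, assuming a Debian/Ubuntu-based image (package names and locale tooling differ on other base images):
# in the Dockerfile: generate and activate a UTF-8 locale
RUN apt-get update && apt-get install -y locales && locale-gen en_US.UTF-8
ENV LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8
After rebuilding, docker run -it <image> env should list the UTF-8 locale variables.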

Related

Babel writes a utf-16 file; how can I make it write utf-8?

When I run babel --plugins transform-react-jsx like_button.jsx > like_button.js, the resulting like_button.js is UTF-16 encoded (and like_button.jsx has some 8-bit encoding, probably UTF-8).
How can I make Babel write like_button.js UTF-8 encoded?
Babel's output is definitely UTF-8. Since you are seeing UTF-16 in your file, and the file is being written by your shell's > redirection rather than by Babel, it seems most likely that your shell is re-encoding the data before writing it to the file (Windows PowerShell, for instance, writes UTF-16 by default when redirecting with >).
The easiest option for you would be to change from
-babel --plugins transform-react-jsx like_button.jsx > like_button.js
+babel --plugins transform-react-jsx like_button.jsx --out-file like_button.js
so that Babel itself is responsible for writing the output file, which removes the shell from the equation.
If you don't want to do that, you'll need to look into your shell's options to see if there is an explicit encoding set somewhere.
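To verify the result, you can check the output file's encoding (this assumes the file utility is available, e.g. on a Unix-like system or in Git Bash):
file like_button.js
# should now report UTF-8 (or plain ASCII) text rather than UTF-16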

Does pg_dump preserve all Unicode characters when .sql file is ANSI?

I use
pg_dump.exe -U postgres -f "file-name.sql" database-name
to back up UTF-8 encoded databases on PostgreSQL 8.4 and 9.5 on a Windows host. Some may have foreign characters, such as Chinese or Thai, stored in character columns.
The resulting .sql file shows ANSI encoding when opened in Notepad++ (I'm NOT applying ANSI to opened files by default). How do I know whether Unicode characters are always preserved in the dump file? Should I be using an archive (object) backup file instead?
Quote from the manual:
By default, the dump is created in the database encoding.
There is no difference between a text file in ANSI encoding and one in UTF-8 if no extended characters are used. Maybe your dump contains no special characters, so the editor doesn't identify it as UTF-8.
If you want the SQL dump in a specific encoding, use the --encoding=encoding parameter or the PGCLIENTENCODING environment variable.
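A hedged example of forcing a UTF-8 dump on the Windows host from the question (UTF8 is PostgreSQL's spelling of the encoding name; file and database names are taken from the question):
pg_dump.exe -U postgres --encoding=UTF8 -f "file-name.sql" database-name
rem or, equivalently, via the environment variable:
set PGCLIENTENCODING=UTF8
pg_dump.exe -U postgres -f "file-name.sql" database-name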

ElFinder and NTFS UTF-16 file names

I use WAMP server and ElFinder 2.x. It works fine, except that filenames are encoded in UTF-8 when uploaded, so they look like Список предприятий ВРК123.xlsx in Windows Explorer. That's OK, but it would be nice to be able to copy files with Unicode filenames into ElFinder's folder via Windows Explorer.
As far as I know, NTFS uses UTF-16. nao-pon answered here that one needs to set encoding and locale in the connector options for multi-byte encodings. I've tried setting these options to 'UTF-16' and 'ru_RU.UTF-16', but then ElFinder cannot load the folder at all and gives an Invalid backend configuration. Readable volumes not available error.
Update: it works fine with 'encoding' => 'CP1251', but then it doesn't list files with names like 한자.txt.

STDERR of pg_dump in UTF-8

I am redirecting the stderr of pg_dump to file:
pg_dump ...... 2>pg_dump.log
but this file is ANSI-encoded. I would like to see it in UTF-8 or Unicode. Is this possible?
man pg_dump
-E encoding
--encoding=encoding
Create the dump in the specified character set encoding. By default, the dump is created in the database encoding.
BTW: regarding "UTF-8 or Unicode", the "or" does not make sense; UTF-8 is one of the encodings of Unicode (another is UTF-16).
Update: Sorry, I misunderstood your problem. Are you interested in the text of error messages generated by PostgreSQL, or in text coming from queries/data of your own? If the former, the LC_MESSAGES setting should work: http://www.postgresql.org/docs/9.2/interactive/locale.html
Otherwise, you can always use iconv.
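For converting the log after the fact, a sketch (this assumes the log is in your Windows ANSI code page, e.g. WINDOWS-1252; adjust the source encoding to match your system):
iconv -f WINDOWS-1252 -t UTF-8 pg_dump.log > pg_dump_utf8.log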

UTF-8 encoded text inside database looks encrypted

I converted my database following this tutorial:
http://en.gentoo-wiki.com/wiki/Convert_latin1_to_UTF-8_in_MySQL
but I didn't notice that the Arabic characters INSIDE the database look encrypted (garbled), like:
اوÙاµ ®ØµØ… „Ù‡ Øكلق§Ø‡Ø°Ù…ا؄مشٳÙÙ‹ ÙÙ„...
Through the PHP script connected to the database everything is GOOD, but inside the database the Arabic characters look like that.
I tried to return the database to the old encoding, which is WINDOWS-1256, using iconv with the following command:
# iconv -f UTF-8 -t WINDOWS-1252 database.sql > database_1252.sql
I got this error:
iconv: illegal input sequence at position
so I tried to run the command again using the -c option:
# iconv -c -f UTF-8 -t WINDOWS-1252 database.sql > database_1252.sql
It worked, and I can see the Arabic characters inside the database, but a lot of characters are missing. For example:
i would like to go shopping
after the conversion:
i would like to
I want to know how I can fix the Arabic characters so they read normally inside the database, complete, with nothing missing.
Thanks
Wait, wait... you say your database was in WINDOWS-1256 (or WINDOWS-1252?), and you converted it based on a latin1 -> utf8 tutorial? No wonder the characters are malformed.
I wouldn't trust the tutorial's solution at all. I would recommend that you restore your former version of the database and use MySQL's ALTER TABLE command to change the encoding.
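A minimal sketch of that approach, assuming the restored tables correctly declare their actual character set (e.g. cp1256 for Arabic Windows); the table name products is hypothetical, and the statement has to be repeated per table:
-- converts both the column definitions and the stored data to UTF-8
ALTER TABLE products CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
(On MySQL versions that predate utf8mb4, utf8 with utf8_unicode_ci is the equivalent choice.)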