How do I use Mandarin characters in matplotlib? - unicode

I have been trying to use matplotlib's text and annotate methods with Mandarin Chinese characters, but somehow they end up showing as boxes. Any ideas on this?

Here is a solution that works for me on Python 2.7 and Python 3.3, using both the text and annotate methods with Chinese characters:
import matplotlib.pyplot as plt
from matplotlib.font_manager import FontProperties

fig = plt.figure()
ax = fig.add_subplot(111)

ChineseFont1 = FontProperties(fname='C:\\Windows\\Fonts\\simsun.ttc')
ChineseFont2 = FontProperties('SimHei')

ax.text(3, 2, u'我中文是写得到的', fontproperties=ChineseFont1)
ax.text(5, 1, u'我中文是写得到的', fontproperties=ChineseFont2)
ax.annotate(u'我中文是写得到的', xy=(2, 1), xytext=(3, 4),
            arrowprops=dict(facecolor='black', shrink=0.05),
            fontproperties=ChineseFont1)
ax.axis([0, 10, 0, 10])
plt.show()
ChineseFont1 is hard-coded to a font file, while ChineseFont2 grabs a font by family name (though for ChineseFont2 I had to try a couple of names before finding one that worked). Both are particular to my system in that they reference fonts I have, so you will quite likely need to change them to reference fonts or paths on your own system.
The font loaded by default doesn't seem to support Chinese characters, so it was primarily a font choice issue.
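If you are not sure which of your installed fonts cover Chinese, one quick way to look for candidates is to walk matplotlib's font cache. This is only a sketch: the name keywords below are guesses, so adjust them to whatever families your system actually ships.
from matplotlib import font_manager

# Print every font matplotlib knows about whose family name hints at CJK coverage.
for font in font_manager.fontManager.ttflist:
    if any(key in font.name for key in ("SimSun", "SimHei", "Hei", "Song", "Han", "WenQuanYi")):
        print(font.name, font.fname)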

Another solution is to use the pgf backend, which uses XeTeX. This allows you to use UTF-8 directly:
#!/usr/bin/env python2
# -*- coding: utf-8 -*-
import matplotlib
matplotlib.use("pgf")

pgf_with_custom_preamble = {
    # "font.size": 18,
    "pgf.rcfonts": False,
    "text.usetex": True,
    "pgf.preamble": [
        # math setup:
        r"\usepackage{unicode-math}",
        # fonts setup:
        r"\setmainfont{WenQuanYi Zen Hei}",
        r"\setsansfont{WenQuanYi Zen Hei}",
        r"\setmonofont{WenQuanYi Zen Hei Mono}",
    ],
}
matplotlib.rcParams.update(pgf_with_custom_preamble)

from matplotlib import pyplot as plt

x = range(5)
y = range(5)

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x, y, label=u"我")
ax.legend(u"中")
ax.set_xlabel(u"是")
ax.set_ylabel(u"写")
ax.set_title(u"得")
ax.text(3, 2, u'到')
ax.annotate(u'的', xy=(2, 1), xytext=(3, 1),
            arrowprops=dict(arrowstyle="<|-", connectionstyle="arc3", color='k'))
fig.savefig("pgf-mwe.png")
Result: (the rendered pgf-mwe.png figure with the Chinese labels)
This solution requires matplotlib 1.2+ and probably XeTeX installed on your system. The easiest way to get a working XeTeX is to install any modern LaTeX distribution: TeX Live (available for all platforms) or MiKTeX (Windows only).
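If you are unsure whether those requirements are met, a quick sanity check along these lines may help before switching backends. This is only a sketch; it uses distutils.spawn because the script above targets Python 2.
import matplotlib
from distutils.spawn import find_executable

print matplotlib.__version__        # the pgf backend needs matplotlib 1.2+
print find_executable("xelatex")    # prints a path if XeTeX is on the PATH, None otherwise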

matplotlib.rc('font', family='Source Han Sans CN')
ax = quarterly_gdp.plot(title='国内生产总值')
You only have to set up the font family for matplotlib, and after that you can plot with Chinese labels. I've set the font to Source Han Sans CN, as it's the only Chinese font available on my computer.
You can list the available Chinese fonts with the command fc-list :lang=zh.
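If you prefer not to hard-code a single family, the same idea can be expressed through rcParams with a fallback list. This is a sketch only: it assumes at least one of the listed CJK fonts is installed, and quarterly_gdp is the DataFrame from the snippet above.
import matplotlib.pyplot as plt

# Families are tried in order; replace with whatever fc-list :lang=zh reports on your system.
plt.rcParams['font.family'] = 'sans-serif'
plt.rcParams['font.sans-serif'] = ['Source Han Sans CN', 'SimHei', 'WenQuanYi Zen Hei']
plt.rcParams['axes.unicode_minus'] = False  # keep the minus sign renderable with a CJK font

ax = quarterly_gdp.plot(title='国内生产总值')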

Related

Matlab misreading ASCII text file

This is a problem in analyzing some text files using Matlab, which is garbling some of the text. I am using R2017a (9.2.0.538062) 64-bit (maci64). Please note the accented characters in the passage below.
Other text editors (TextMate, Emacs, TextEdit, and GNU Octave) read the file ("War and Peace.txt") correctly, as do other programs (Python, Ruby, Mathematica):
It was in July, 1805, and the speaker was the well-known Anna Pávlovna Schérer, maid of honor and favorite of the Empress Márya Fëdorovna.
Whereas in Matlab the accented characters come out garbled:
It was in July, 1805, and the speaker was the well-known Anna Pávlovna Schérer, maid of honor and favorite of the Empress Márya Fëdorovna.
My Question
Is there a Matlab (preferences?) setting that will read this text accurately? Matlab appears to be garbling valid extended (non-ASCII) characters, mostly in the 200-255 range.
I actually faced the same problem when trying to read a string from a text file. The problem in my case was that I had saved the .txt file in ANSI encoding. After many trials, I came up with a solution. First, save the file in UTF-8 encoding (for example via your editor's "Save As" encoding option).
Then, in your MATLAB code, specify the encodingIn argument in the fopen command.
A test code can be something like:
close all;clearvars;clc;
fileID = fopen('text.txt', 'r', 'n', 'UTF-8');
C = textscan(fileID, '%s');
fclose(fileID);
celldisp(C)
The output of this code would be:
C{1}{1} =
It
C{1}{2} =
was
C{1}{3} =
in
C{1}{4} =
July,
C{1}{5} =
1805,
C{1}{6} =
and
C{1}{7} =
the
C{1}{8} =
speaker
C{1}{9} =
was
C{1}{10} =
the
C{1}{11} =
well-known
C{1}{12} =
Anna
C{1}{13} =
Pávlovna
C{1}{14} =
Schérer,
C{1}{15} =
maid
C{1}{16} =
of
C{1}{17} =
honor
C{1}{18} =
and
C{1}{19} =
favorite
C{1}{20} =
of
C{1}{21} =
the
C{1}{22} =
Empress
C{1}{23} =
Márya
C{1}{24} =
Fëdorovna.

Print output to file no longer working - RPi 2, PN532 NFC RDW

For my classroom, I have a PN532 NFC card reader/writer hooked up via UART to a Raspberry Pi 2, and I'm using Type 2 NXP NTAG213 NFC cards to store information, specifically in the text record. While I'm weak in Python, I used the example under subheader 8.3 in the NFCPy documentation to write to the card, and used "How to redirect 'print' output to a file using python?" to send the output to a text file. For a while, the reading, writing, and outputting to my text file worked:
import nfc
import nfc.ndef
import nfc.tag
import os, sys
import subprocess
import glob
from os import path
import datetime

f = open('BankTransactions.txt', 'a')
sys.stdout = f
path = '/home/pi/BankTransactions.txt'

def connected(tag): print(tag); return False

clf = nfc.ContactlessFrontend('tty:AMA0:pn532')
clf.connect(rdwr={'on-connect': connected})
tag = clf.connect(rdwr={'on-connect': connected})

record_1 = tag.ndef.message[0]
signature = nfc.tag.tty2_nxp.NTAG213
today = datetime.date.today()

print(record_1.pretty())

if tag.ndef is not None:
    print(tag.ndef.message.pretty())
    if tag.ndef.is_writeable:
        text_record = nfc.ndef.TextRecord("Jessica has 19 GP on card")
        tag.ndef.message = nfc.ndef.Message(text_record)

print >> f, "Edited by Roman", today, record_1, signature, '\n'
f.close()
Now, however, when I use the same card for testing, it will not append the data to the text file. The data is still being written to the card, as I can read the information on it with a simple read program.

Special character in Borland c++ Builder

I just want to use the Delta sign "Δ" in Borland C++ Builder 5, for example in a label:
Label1->Caption = "delta sign here?";
Thanks.
C++Builder 5 uses an ANSI-based VCL and ANSI-based Win32 API calls, where the ANSI encoding is dictated by the active user's locale settings in Windows.
If your app is running on a Greek machine that uses Latin-7/ISO-8859-7 (Windows codepage 28597) as its native locale, or at least has Greek fonts installed, you should be able to set the Label1->Font->Charset to GREEK_CHARSET (161) and Label1->Font->Name to a Greek font, and then assign the Delta character like this:
// using an implicit conversion from Unicode
// to ANSI on a Greek-locale machine...
Label1->Caption = L"Δ";
Label1->Caption = L"\x0394";
Label1->Caption = (wchar_t) 0x0394;
Label1->Caption = (wchar_t) 916;
Or:
// using an explicit Greek ANSI codeunit
// on a Greek font machine...
Label1->Caption = (char) 0xC4;
Label1->Caption = (char) 196;
However, if you need to display the Delta character on a non-Greek machine, or at least one that does not have any Greek fonts installed, you will have to use a third-party Unicode-enabled Label component, such as from the old TNTWare component suite, so that you can use Unicode codepoint U+0394 directly, eg:
TntLabel1->Caption = L"Δ";
TntLabel1->Caption = L"\x0394";
TntLabel1->Caption = (wchar_t) 0x0394;
TntLabel1->Caption = (wchar_t) 916;
If you are on Windows, try ALT + 30. It works! ▲▲▲▲

Python with Gtk3 not setting unicode properly

I have some simple code that isn't working as expected. First, the docs say that Gtk.Clipboard.get(Gdk.SELECTION_PRIMARY).set_text() should accept just the text, with the length argument optional, but that doesn't work (see below). Second, pasting a unicode ° symbol breaks setting the text: it can't be retrieved from the clipboard afterwards (and won't paste into other programs). It gives this warning:
Gdk-WARNING **: Error converting selection from UTF8_STRING
>>> from gi.repository.Gtk import Clipboard
>>> from gi.repository.Gdk import SELECTION_PRIMARY
>>> d='\u00B0'
>>> print(d)
°
>>> cb=Clipboard
Clipboard
>>> cb=Clipboard.get(SELECTION_PRIMARY)
>>> cb.set_text(d) #this should work
Traceback (most recent call last):
File "<ipython-input-6-b563adc3e800>", line 1, in <module>
cb.set_text(d)
File "/usr/lib/python3/dist-packages/gi/types.py", line 43, in function
return info.invoke(*args, **kwargs)
TypeError: set_text() takes exactly 3 arguments (2 given)
>>> cb.set_text(d, len(d))
>>> cb.wait_for_text()
(.:13153): Gdk-WARNING **: Error converting selection from UTF8_STRING
'\\Uffffffff\\Uffffffff'
From the documentation for Gtk.Clipboard, it looks like the set_text method needs a second argument: the first is the text, the second is its length. If you don't want to provide the length, you can pass -1 and it will be calculated for you.
gtk.Clipboard.set_text
def set_text(text, len=-1)
text : a string.
len : the length of text, in bytes, or -1, to calculate the length.
I've tested it on Python 3 and it works with cb.set_text(d, -1).
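Putting that together, here is a minimal sketch of the full round trip with the -1 fix, using the same Clipboard and Gdk objects as in the question:
import gi
gi.require_version('Gtk', '3.0')
from gi.repository import Gtk, Gdk

d = '\u00B0'                                   # the degree sign from the question
cb = Gtk.Clipboard.get(Gdk.SELECTION_PRIMARY)
cb.set_text(d, -1)                             # -1 lets GTK compute the byte length itself
print(cb.wait_for_text())                      # should print ° instead of the garbled escapes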
Since GTK version 3.16 there is an easier way of getting the clipboard: you can get it with the get_default() method:
import gi
gi.require_version('Gtk', '3.0')
from gi.repository import Gtk, Gdk

display = Gdk.Display.get_default()
clipboard = Gtk.Clipboard.get_default(display)
string = "text to put on the clipboard"  # any str will do
clipboard.set_text(string, -1)
Also, for me it worked without calling clipboard.store().
Reference: https://lazka.github.io/pgi-docs/Gtk-3.0/classes/Clipboard.html#Gtk.Clipboard.get_default
In Python 3.4, this is only needed for GtkEntryBuffer; in the case of GtkTextBuffer, set_text works without the second parameter.
Example 1 works as usual:
settinginfo = 'serveradres = ' + server + '\n poortnummer = ' + poort
GtkTextBuffer2.set_text(settinginfo)
Example 2 needs the extra parameter len:
ErrorTextDate = 'choose earlier date'
GtkEntryBuffer1.set_text(ErrorTextDate, -1)

OK, I've read all of the unicode/mako posts but I still can't get this simple code to work

I'm running Python 2.6.6.
cat tbuild.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
from mako.template import Template

_template = """
% for v in my_list:
${'abc'.encode('utf-8')}
${'風連町?'.encode('utf-8')}
% endfor
"""

print Template(_template).render_unicode(my_list=[1, 2],
                                         input_encoding='utf-8',
                                         output_encoding='utf-8',
                                         encoding_errors='replace')
Running ./tbuild.py gives:
File "./tbuild.py",
line 15, in <module> print Template(_template).render_unicode(my_list = [1, 2],
File "/usr/lib/python2.6/site-packages/mako/template.py",
line 91, in __init__ (code, module) = _compile_text(self, text, filename)
File "/usr/lib/python2.6/site-packages/mako/template.py",
line 357, in _compile_text node = lexer.parse()
File "/usr/lib/python2.6/site-packages/mako/lexer.py",
line 192, in parse self.filename,)
File "/usr/lib/python2.6/site-packages/mako/lexer.py",
line 184, in decode_raw_stream 0, 0, filename)
mako.exceptions.CompileException: Unicode decode operation of
encoding 'ascii' failed at line: 0 char: 0
If I remove the line with the Japanese text, it works fine.
There is clearly something fundamental that I'm misunderstanding.
Thanks for your help,
eo
I would be surprised even if ${'á'.encode('utf-8')} worked. You need to specify unicode strings as such, using the unicode literal prefix u. Rewrite ${'風連町?'.encode('utf-8')} as ${u'風連町?'.encode('utf-8')}, and do the same for any other text you are handling.
EDIT:
Taking mako into consideration:
# -*- coding: utf-8 -*-
from mako.template import Template
_template=u"${u'風連町?'}"
x = Template(_template, output_encoding='utf-8')
print x.render()
The output_encoding parameter makes sense when creating a Template; it has no meaning in the render method. Also, why would you encode the input, decode the input using the same encoding, and then use render_unicode? In fact, render_unicode ignores output_encoding, so it seems you actually want to use render.
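To make that distinction concrete, here is a small sketch (Python 2, reusing the template from the snippet above): render() honours output_encoding and returns an encoded byte string, while render_unicode() returns a unicode object and ignores output_encoding.
# -*- coding: utf-8 -*-
from mako.template import Template

t = Template(u"${u'風連町?'}", output_encoding='utf-8')
print type(t.render())            # <type 'str'>     -- UTF-8 encoded bytes
print type(t.render_unicode())    # <type 'unicode'> -- output_encoding is ignored here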