g++ unicode variable name

I am trying to use Unicode variable names in g++, but it does not appear to work.
Does g++ not support Unicode variable names, or does it support only some subset of Unicode that I have not been testing with?
Thanks!

You have to specify the -fextended-identifiers flag when compiling, and you also have to write non-ASCII characters as \uXXXX or \UXXXXXXXX escapes (at least in gcc, it's Unicode).
Identifiers (variable names, class names, etc.) in g++ can't be written in raw UTF-8, UTF-16, or any other encoding;
they have to match this grammar:
identifier:
    nondigit
    identifier nondigit
    identifier digit
a nondigit is
nondigit: one of
    universal-character-name
    _ a b c d e f g h i j k l m n o p q r s t u v w x y z
    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
and a universal-character-name is
universal-character-name:
    \UXXXXXXXX
    \uXXXX
Thus, even if you save your source file as UTF-8, you cannot have a variable such as:
int høyde = 10;
It has to be written as:
int h\u00F8yde = 10;
(which in my opinion defeats the whole purpose, so just stick with a-z)
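The rewriting of høyde into h\u00F8yde shown above can be automated. The following is only a sketch in Python (not part of g++ or any compiler tool); the helper name to_ucn is hypothetical, and it assumes every non-ASCII character in the input is one the compiler accepts in an identifier.

```python
def to_ucn(identifier: str) -> str:
    """Rewrite non-ASCII characters as C/C++ universal-character-names.

    Hypothetical helper for illustration: it does not check whether the
    compiler actually allows each code point inside an identifier.
    """
    parts = []
    for ch in identifier:
        cp = ord(ch)
        if cp < 0x80:
            parts.append(ch)                 # plain ASCII stays as it is
        elif cp <= 0xFFFF:
            parts.append("\\u%04X" % cp)     # 4-digit form: \uXXXX
        else:
            parts.append("\\U%08X" % cp)     # 8-digit form: \UXXXXXXXX
    return "".join(parts)


print("int %s = 10;" % to_ucn("høyde"))  # prints: int h\u00F8yde = 10;
```

The 4-digit form is used whenever the code point fits in 16 bits, and the 8-digit form otherwise, matching the two universal-character-name shapes in the grammar above.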

A one-line patch to the cpp preprocessor allows UTF-8 input. Details for gcc are given at
https://www.raspberrypi.org/forums/viewtopic.php?p=802657
Since the preprocessor is shared, the same patch should work for g++ as well. As of gcc-5.2, the patch needed is
diff -cNr gcc-5.2.0/libcpp/charset.c gcc-5.2.0-ejo/libcpp/charset.c
*** gcc-5.2.0/libcpp/charset.c Mon Jan 5 04:33:28 2015
--- gcc-5.2.0-ejo/libcpp/charset.c Wed Aug 12 14:34:23 2015
***************
*** 1711,1717 ****
struct _cpp_strbuf to;
unsigned char *buffer;
! input_cset = init_iconv_desc (pfile, SOURCE_CHARSET, input_charset);
if (input_cset.func == convert_no_conversion)
{
to.text = input;
--- 1711,1717 ----
struct _cpp_strbuf to;
unsigned char *buffer;
! input_cset = init_iconv_desc (pfile, "C99", input_charset);
if (input_cset.func == convert_no_conversion)
{
to.text = input;
Note that for the above patch to work, the installed version of iconv must support C99 conversions. Run iconv --list to verify this; if C99 is not listed, you can install a newer version of iconv along with gcc as described in the link above. Change the configure command to
$ ../gcc-5.2.0/configure -v --disable-multilib \
--with-libiconv-prefix=/usr/local/gcc-5.2 \
--prefix=/usr/local/gcc-5.2 \
--enable-languages="c,c++"
if you are building for x86 and want to include the c++ compiler as well.

VScode Exception occur: System.ArgumentOutOfRangeException

When I try to run any C++ program in VS Code, an exception occurs and the program does not run:
Oops, something went wrong. Please report this bug with the details below.
Report on GitHub: https://github.com/lzybkr/PSReadLine/issues/new
-----------------------------------------------------------------------
Last 101 Keys:
c d Space " c : \ U s e r s \ U S E R
\ D o c u m e n t s \ p r o g r a m m i n g \ " Space ; Space i f Space ( $ ?
) Space { Space g + + Space c o d e 2 . c p p Space - o Space c o d e 2 Space
} Space ; Space i f Space ( $ ? ) Space { Space . \ c o d e 2 Space } Enter
Exception:
System.ArgumentOutOfRangeException: The value must be greater than or equal to zero and less than the console's buffer size in that dimension.
Parameter name: left
Actual value was -1.
at System.Console.SetCursorPosition(Int32 left, Int32 top)
at Microsoft.PowerShell.Internal.VirtualTerminal.set_CursorLeft(Int32 value)
at Microsoft.PowerShell.PSConsoleReadLine.ReallyRender(RenderData renderData, String defaultColor)
at Microsoft.PowerShell.PSConsoleReadLine.ForceRender()
at Microsoft.PowerShell.PSConsoleReadLine.Insert(Char c)
at Microsoft.PowerShell.PSConsoleReadLine.SelfInsert(Nullable`1 key, Object arg)
at Microsoft.PowerShell.PSConsoleReadLine.ProcessOneKey(ConsoleKeyInfo key, Dictionary`2 dispatchTable, Boolean ignoreIfNoAction, Object arg)
at Microsoft.PowerShell.PSConsoleReadLine.InputLoop()
at Microsoft.PowerShell.PSConsoleReadLine.ReadLine(Runspace runspace, EngineIntrinsics engineIntrinsics)
-----------------------------------------------------------------------
If I type the command manually in the terminal, I can run my C++ program.
I found some issues about this problem in the VS Code GitHub repository, but I could not understand any of the solutions.

robotic arm not working with pyserial and gcode

I am working on a robotic arm.
M106 turns on the fan
M17 turns the steppers on
M18 turns the steppers off
G1 X.. Y.. Z.. gives the coordinates of a movement
The port is correct, and the terminal prints "hello" and "hi".
However, the robotic arm is not moving, and I have no clue why this is happening.
Is there some problem with my code?
import serial
import struct
def gcode_encode(gcode):
    gcode += '\r\n'
    return struct.pack(f'<{len(gcode)}s', gcode.encode(encoding='utf-8'))
print("hello")
# ser = serial.Serial('COM7', 9600, timeout=0, parity=serial.PARITY_EVEN, rtscts=1)
ser = serial.Serial()
ser.port = 'COM7'
ser.baudrate = 9600
ser.timeout = 0
ser.open()
g = gcode_encode('M106')
ser.write(b'g')
g = gcode_encode('M17')
ser.write(b'g')
g = gcode_encode('M18')
ser.write(b'g')
g = gcode_encode('G1 X0 Y120 Z120')
ser.write(b'g')
g = gcode_encode('G1 X50 Y120 Z60')
ser.write(b'g')
ser.close()
print("hi")
You are writing only the character 'g' to the port: b'g' is a bytes literal containing the letter g, not the contents of the variable g. To send the encoded command, pass the variable itself, i.e. ser.write(g); it already holds bytes. You also don't need to pack the string with struct, encoding it is enough.
Also add some pauses between commands using time.sleep().
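The points above can be put together in a small sketch. It is not tested against a real arm: encode_gcode, send_gcode and fake_port are hypothetical names, io.BytesIO stands in for the serial port so the example runs without hardware, and the delay between commands is a placeholder you would have to tune for your firmware.

```python
import io
import time


def encode_gcode(command: str) -> bytes:
    """Append CR LF and encode one G-code line as bytes (no struct needed)."""
    return (command + "\r\n").encode("utf-8")


def send_gcode(port, commands, delay=0.0):
    """Write each command to a port-like object, pausing between commands."""
    for command in commands:
        port.write(encode_gcode(command))  # pass the bytes, not b'g'
        time.sleep(delay)


# Stand-in for an open serial port (assumption: anything with write()).
fake_port = io.BytesIO()
send_gcode(fake_port, ["M17", "G1 X0 Y120 Z120"])
print(fake_port.getvalue())  # b'M17\r\nG1 X0 Y120 Z120\r\n'
```

With real hardware you would pass the opened serial.Serial object from the question in place of fake_port and choose a non-zero delay.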

kdb: guid encoding in c results in invalid serialization

I'm trying to manipulate GUIDs from C++. Whenever I attempt to serialize a GUID, I get a null pointer.
U g={0};
auto k = ku(g);
auto p = ::b9(2, k);
The first two lines are straight from the manual for creating a null GUID. This results in p == 0.
What I was really trying to do was create a list of GUIDs and then serialize it:
k = ktn(UU, 3)
kU(k)[0] = <an instance of U with the g bytes initialized>
kU(k)[1] = <an instance of U with the g bytes initialized>
kU(k)[2] = <an instance of U with the g bytes initialized>
Serializing that list did not work either.
I believe you should be using 3 as the first argument to b9. For example:
jmcmurray@homer ~/c $ more test.c
#include"k.h"
K f(K x)
{
K k = ktn(UU,3);I j=0;
for(j=0;j<3;j++){
U g={0};I i=0;
for(i=j;i<j+16;i++){
g.g[i] = (unsigned char)i;
}
kU(k)[0] = g;
}
return b9(3,k);
}
jmcmurray@homer ~/c $ gcc -shared -fPIC -DKXVER=3 test.c -o test.so
jmcmurray@homer ~/c $ q
KDB+ 3.5 2017.11.30 Copyright (C) 1993-2017 Kx Systems
l64/ 8()core 16048MB jmcmurray homer.aquaq.co.uk 192.168.1.57 EXPIRE 2019.06.30 AquaQ #52428
q)f:`:./test 2:(`f;1)
q)f[]
0x010000003e000000020003000000000002030405060708090a0b0c0d0e0f00ae67af727f000..
q)-9!f[]
00000203-0405-0607-0809-0a0b0c0d0e0f 001868af-727f-0000-6062-67af727f0000 a0a..
q)
Here I am able to return a serialised list of GUIDs from my shared object and deserialize it on the q side. When I tried with 2, as in your example, I got a 'type error when running the function in q.
According to https://code.kx.com/q/interfaces/capiref/#b9-serialize, 3 means
unenumerate, compress, allow serialization of timespan and timestamp
and 2 is the same without "compress". So I guess GUIDs must be serialized with compression enabled?

Passing C structs through SystemVerilog DPI-C layer

The SystemVerilog LRM has some examples that show how to pass structs between SystemVerilog and C through the DPI-C layer. However, when I try my own example it does not work at all in the Incisive or Vivado simulators (it does work in ModelSim). I would like to know whether I am doing something wrong or whether it is an issue with the simulators. My example is as follows:
#include <stdio.h>
typedef struct {
char f1;
int f2;
} s1;
void SimpleFcn(const s1 * in,s1 * out){
printf("In the C function the struct in has f1: %d\n",in->f1);
printf("In the C function the struct in has f2: %d\n",in->f2);
out->f1=!(in->f1);
out->f2=in->f2+1;
}
I compile the above code into a shared library:
gcc -c -fPIC -Wall -ansi -pedantic -Wno-long-long -fwrapv -O0 dpi_top.c -o dpi_top.o
gcc -shared -lm dpi_top.o -o dpi_top.so
And the SystemVerilog code:
`timescale 1ns / 1ns
typedef struct {
bit f1;
int f2;
} s1;
import "DPI-C" function void SimpleFcn(input s1 in,output s1 out);
module top();
s1 in,out;
initial
begin
in.f1=1'b0;
in.f2 = 400;
$display("The input struct in SV has f1: %h and f2:%d",in.f1,in.f2);
SimpleFcn(in,out);
$display("The output struct in SV has f1: %h and f2:%d",out.f1,out.f2);
end
endmodule
In Incisive I run it using irun:
irun -sv_lib ./dpi_top.so -sv ./top.sv
But it SegV's.
In Vivado I run it using
xvlog -sv ./top.sv
xelab top -sv_root ./ -sv_lib dpi_top.so -R
It runs fine until it exits simulation, then there is a memory corruption:
Vivado Simulator 2017.4
Time resolution is 1 ns
run -all
The input struct in SV has f1: 0 and f2: 400
In the C function the struct in has f1: 0
In the C function the struct in has f2: 400
The output struct in SV has f1: 1 and f2: 401
exit
*** Error in `xsim.dir/work.top/xsimk': double free or corruption (!prev): 0x00000000009da2c0 ***
You were lucky that this worked in ModelSim. Your SystemVerilog prototype does not match your C prototype: f1 is a char (a byte) in C but a bit in SystemVerilog.
Modelsim/Questa has a -dpiheader switch that produces a C header file that you can #include in your dpi_top.c file. That way you get a compiler error when the prototypes don't match instead of an unpredictable run-time error. This is the C prototype for your SV code:
typedef struct {
svBit f1;
int f2;
} s1;
void SimpleFcn(
const s1* in,
s1* out);
But I would recommend sticking with C-compatible types in SystemVerilog.

Unicode letters with more than 1 alphabetic latin character?

I'm not really sure how to express it, but I'm searching for Unicode characters that look like more than one Latin letter.
I have found these in Word so far:
DZ
Dz
dz
NJ
Lj
LJ
Nj
nj
Any others?
Here are some of the characters I've found. I first did this manually by looking at some likely blocks, but I later wrote a Python script to do it automatically; you can find it at the end of this answer.
Digraphs
Two Glyphs | Digraph | Unicode Code Point | HTML
DZ, Dz, dz | Ǳ, ǲ, ǳ | U+01F1 U+01F2 U+01F3 | Ǳ ǲ ǳ
DŽ, Dž, dž | Ǆ, ǅ, ǆ | U+01C4 U+01C5 U+01C6 | Ǆ ǅ ǆ
IJ, ij | Ĳ, ĳ | U+0132 U+0133 | Ĳ ĳ
LJ, Lj, lj | Ǉ, ǈ, ǉ | U+01C7 U+01C8 U+01C9 | Ǉ ǈ ǉ
NJ, Nj, nj | Ǌ, ǋ, ǌ | U+01CA U+01CB U+01CC | Ǌ ǋ ǌ
Ligatures
Non-ligature | Ligature | Unicode | HTML
AA, aa | Ꜳ, ꜳ | U+A732, U+A733 | Ꜳ ꜳ
AE, ae | Æ, æ | U+00C6, U+00E6 | Æ æ
AO, ao | Ꜵ, ꜵ | U+A734, U+A735 | Ꜵ ꜵ
AU, au | Ꜷ, ꜷ | U+A736, U+A737 | Ꜷ ꜷ
AV, av | Ꜹ, ꜹ | U+A738, U+A739 | Ꜹ ꜹ
AV, av (with bar) | Ꜻ, ꜻ | U+A73A, U+A73B | Ꜻ ꜻ
AY, ay | Ꜽ, ꜽ | U+A73C, U+A73D | Ꜽ ꜽ
et | 🙰 | U+1F670 | 🙰
ff | ﬀ | U+FB00 | ﬀ
ffi | ﬃ | U+FB03 | ﬃ
ffl | ﬄ | U+FB04 | ﬄ
fi | ﬁ | U+FB01 | ﬁ
fl | ﬂ | U+FB02 | ﬂ
OE, oe | Œ, œ | U+0152, U+0153 | Œ œ
OO, oo | Ꝏ, ꝏ | U+A74E, U+A74F | Ꝏ ꝏ
ſs, ſz | ẞ, ß | U+1E9E, U+00DF | ß
st | ﬆ | U+FB06 | ﬆ
ſt | ﬅ | U+FB05 | ﬅ
TZ, tz | Ꜩ, ꜩ | U+A728, U+A729 | Ꜩ ꜩ
ue | ᵫ | U+1D6B | ᵫ
VY, vy | Ꝡ, ꝡ | U+A760, U+A761 | Ꝡ ꝡ
There are a few other ligatures that are used for phonetic transcription but look like Latin characters
Non-ligature | Ligature | Unicode | HTML
db | ȸ | U+0238 | ȸ
dz | ʣ | U+02A3 | ʣ
IJ, ij | Ĳ, ĳ | U+0132, U+0133 | Ĳ ĳ
ls | ʪ | U+02AA | ʪ
lz | ʫ | U+02AB | ʫ
qp | ȹ | U+0239 | ȹ
ts | ʦ | U+02A6 | ʦ
ui | ꭐ | U+AB50 | ꭐ
turned ui | ꭑ | U+AB51 | ꭑ
https://en.wikipedia.org/wiki/List_of_precomposed_Latin_characters_in_Unicode#Digraphs_and_ligatures
Edit:
There are more letterlike symbols besides ℻ and ℡, which the OP found in the comments:
℀ ℁ ⅍ ℅ ℆ ℔ ℠ ™
Longer sequences of letters are mainly found in the CJK Compatibility block
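U+XXXX | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F
U+338x | ㎀ | ㎁ | ㎂ | ㎃ | ㎄ | ㎅ | ㎆ | ㎇ | ㎈ | ㎉ | ㎊ | ㎋ | ㎌ | ㎍ | ㎎ | ㎏
U+339x | ㎐ | ㎑ | ㎒ | ㎓ | ㎔ | ㎕ | ㎖ | ㎗ | ㎘ | ㎙ | ㎚ | ㎛ | ㎜ | ㎝ | ㎞ | ㎟
U+33Ax | ㎠ | ㎡ | ㎢ | ㎣ | ㎤ | ㎥ | ㎦ | ㎧ | ㎨ | ㎩ | ㎪ | ㎫ | ㎬ | ㎭ | ㎮ | ㎯
U+33Bx | ㎰ | ㎱ | ㎲ | ㎳ | ㎴ | ㎵ | ㎶ | ㎷ | ㎸ | ㎹ | ㎺ | ㎻ | ㎼ | ㎽ | ㎾ | ㎿
U+33Cx | ㏀ | ㏁ | ㏂ | ㏃ | ㏄ | ㏅ | ㏆ | ㏇ | ㏈ | ㏉ | ㏊ | ㏋ | ㏌ | ㏍ | ㏎ | ㏏
U+33Dx | ㏐ | ㏑ | ㏒ | ㏓ | ㏔ | ㏕ | ㏖ | ㏗ | ㏘ | ㏙ | ㏚ | ㏛ | ㏜ | ㏝ | ㏞ | ㏟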
Among the 3-letter-like symbols are ㎈ ㎑ ㎒ ㎓ ㎔ ㏒ ㏕ ㏖ ㏙ ㎪ ㎫ ㎬ ㎭ ㏆ ㏿ ㍱... The ones with the most letters are probably ㎉ and ㎯
Unicode even has code points for Roman numerals, where another 4-letter-like character can be found: Ⅷ
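U+XXXX | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F
U+215x | ⅐ | ⅑ | ⅒ | ⅓ | ⅔ | ⅕ | ⅖ | ⅗ | ⅘ | ⅙ | ⅚ | ⅛ | ⅜ | ⅝ | ⅞ | ⅟
U+216x | Ⅰ | Ⅱ | Ⅲ | Ⅳ | Ⅴ | Ⅵ | Ⅶ | Ⅷ | Ⅸ | Ⅹ | Ⅺ | Ⅻ | Ⅼ | Ⅽ | Ⅾ | Ⅿ
U+217x | ⅰ | ⅱ | ⅲ | ⅳ | ⅴ | ⅵ | ⅶ | ⅷ | ⅸ | ⅹ | ⅺ | ⅻ | ⅼ | ⅽ | ⅾ | ⅿ
U+218x | ↀ | ↁ | ↂ | Ↄ | ↄ | ↅ | ↆ | ↇ | ↈ | ↉ | ↊ | ↋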
If ordinary digits count as well, there are some other code points with multiple digits, such as ⒆ ⒇ ⓳ ⓴ in Enclosed Alphanumerics
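U+XXXX | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F
U+246x | ① | ② | ③ | ④ | ⑤ | ⑥ | ⑦ | ⑧ | ⑨ | ⑩ | ⑪ | ⑫ | ⑬ | ⑭ | ⑮ | ⑯
U+247x | ⑰ | ⑱ | ⑲ | ⑳ | ⑴ | ⑵ | ⑶ | ⑷ | ⑸ | ⑹ | ⑺ | ⑻ | ⑼ | ⑽ | ⑾ | ⑿
U+248x | ⒀ | ⒁ | ⒂ | ⒃ | ⒄ | ⒅ | ⒆ | ⒇ | ⒈ | ⒉ | ⒊ | ⒋ | ⒌ | ⒍ | ⒎ | ⒏
U+249x | ⒐ | ⒑ | ⒒ | ⒓ | ⒔ | ⒕ | ⒖ | ⒗ | ⒘ | ⒙ | ⒚ | ⒛ | ⒜ | ⒝ | ⒞ | ⒟
U+24Ax | ⒠ | ⒡ | ⒢ | ⒣ | ⒤ | ⒥ | ⒦ | ⒧ | ⒨ | ⒩ | ⒪ | ⒫ | ⒬ | ⒭ | ⒮ | ⒯
U+24Bx | ⒰ | ⒱ | ⒲ | ⒳ | ⒴ | ⒵ | Ⓐ | Ⓑ | Ⓒ | Ⓓ | Ⓔ | Ⓕ | Ⓖ | Ⓗ | Ⓘ | Ⓙ
U+24Cx | Ⓚ | Ⓛ | Ⓜ | Ⓝ | Ⓞ | Ⓟ | Ⓠ | Ⓡ | Ⓢ | Ⓣ | Ⓤ | Ⓥ | Ⓦ | Ⓧ | Ⓨ | Ⓩ
U+24Dx | ⓐ | ⓑ | ⓒ | ⓓ | ⓔ | ⓕ | ⓖ | ⓗ | ⓘ | ⓙ | ⓚ | ⓛ | ⓜ | ⓝ | ⓞ | ⓟ
U+24Ex | ⓠ | ⓡ | ⓢ | ⓣ | ⓤ | ⓥ | ⓦ | ⓧ | ⓨ | ⓩ | ⓪ | ⓫ | ⓬ | ⓭ | ⓮ | ⓯
U+24Fx | ⓰ | ⓱ | ⓲ | ⓳ | ⓴ | ⓵ | ⓶ | ⓷ | ⓸ | ⓹ | ⓺ | ⓻ | ⓼ | ⓽ | ⓾ | ⓿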
and in Enclosed Alphanumeric Supplement
🅫, 🅪, 🆋, 🆌, 🆍, 🄭, 🄮, 🅊, 🅋, 🅌, 🅍, 🅎, 🅏
A few more:
Currency symbol group
₧ ₨ ₶ ₯ ₠ ₢ ₷
Miscellaneous technical group
⎂ ⏨
Control pictures (probably you'll need to zoom out to see)
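U+XXXX | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F
U+240x | ␀ | ␁ | ␂ | ␃ | ␄ | ␅ | ␆ | ␇ | ␈ | ␉ | ␊ | ␋ | ␌ | ␍ | ␎ | ␏
U+241x | ␐ | ␑ | ␒ | ␓ | ␔ | ␕ | ␖ | ␗ | ␘ | ␙ | ␚ | ␛ | ␜ | ␝ | ␞ | ␟
U+242x | ␠ | ␡ | ␢ | ␣ | ␤ | ␥ | ␦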
Alchemical Symbols
🜀 🜅 🜆 🜇 🜈 🝪 🝫 🝬 🝛 🝜 🝝
Musical Symbols
𝄶 𝄷 𝄸 𝄹 𝄉 𝄊 𝄫
And there are the emojis 🔟 💤🆔🚾🆖🆗🔢🔡🔠 💯🆘🆎🆑™🔙🔚🔜🔝🔛📆🗓🔞
Vertical bars may be considered uppercase i or lowercase L (like your 〷 example, which is actually the IDEOGRAPHIC TELEGRAPH LINE FEED SEPARATOR SYMBOL), and we have
Vai syllable see: ꔖ U+A516
Large triple vertical bar operator: ⫼ U+2AFC
Counting rod tens digit three: 𝍫 U+1D36B
Suzhou numerals: 〢 〣
Chinese river: 川
║ BOX DRAWINGS DOUBLE VERTICAL...
Here's the automatic script to find the multi-character letters:
import unicodedata
for c in range(0, 0x10FFFF + 1):
    d = unicodedata.normalize('NFKD', chr(c))
    if len(d) > 1 and d.isascii() and d.isalpha():
        print("U+%04X (%s): %s\n" % (c, chr(c), d))
It won't find many ligatures such as æ or œ, because they're not considered orthographic ligatures and have no decomposition in Unicode. Here's the result in Unicode 11.0.0 (checked with unicodedata.unidata_version):
U+0132 (IJ): IJ
U+0133 (ij): ij
U+01C7 (LJ): LJ
U+01C8 (Lj): Lj
U+01C9 (lj): lj
U+01CA (NJ): NJ
U+01CB (Nj): Nj
U+01CC (nj): nj
U+01F1 (DZ): DZ
U+01F2 (Dz): Dz
U+01F3 (dz): dz
U+20A8 (₨): Rs
U+2116 (№): No
U+2120 (℠): SM
U+2121 (℡): TEL
U+2122 (™): TM
U+213B (℻): FAX
U+2161 (Ⅱ): II
U+2162 (Ⅲ): III
U+2163 (Ⅳ): IV
U+2165 (Ⅵ): VI
U+2166 (Ⅶ): VII
U+2167 (Ⅷ): VIII
U+2168 (Ⅸ): IX
U+216A (Ⅺ): XI
U+216B (Ⅻ): XII
U+2171 (ⅱ): ii
U+2172 (ⅲ): iii
U+2173 (ⅳ): iv
U+2175 (ⅵ): vi
U+2176 (ⅶ): vii
U+2177 (ⅷ): viii
U+2178 (ⅸ): ix
U+217A (ⅺ): xi
U+217B (ⅻ): xii
U+3250 (㉐): PTE
U+32CC (㋌): Hg
U+32CD (㋍): erg
U+32CE (㋎): eV
U+32CF (㋏): LTD
U+3371 (㍱): hPa
U+3372 (㍲): da
U+3373 (㍳): AU
U+3374 (㍴): bar
U+3375 (㍵): oV
U+3376 (㍶): pc
U+3377 (㍷): dm
U+337A (㍺): IU
U+3380 (㎀): pA
U+3381 (㎁): nA
U+3383 (㎃): mA
U+3384 (㎄): kA
U+3385 (㎅): KB
U+3386 (㎆): MB
U+3387 (㎇): GB
U+3388 (㎈): cal
U+3389 (㎉): kcal
U+338A (㎊): pF
U+338B (㎋): nF
U+338E (㎎): mg
U+338F (㎏): kg
U+3390 (㎐): Hz
U+3391 (㎑): kHz
U+3392 (㎒): MHz
U+3393 (㎓): GHz
U+3394 (㎔): THz
U+3396 (㎖): ml
U+3397 (㎗): dl
U+3398 (㎘): kl
U+3399 (㎙): fm
U+339A (㎚): nm
U+339C (㎜): mm
U+339D (㎝): cm
U+339E (㎞): km
U+33A9 (㎩): Pa
U+33AA (㎪): kPa
U+33AB (㎫): MPa
U+33AC (㎬): GPa
U+33AD (㎭): rad
U+33B0 (㎰): ps
U+33B1 (㎱): ns
U+33B3 (㎳): ms
U+33B4 (㎴): pV
U+33B5 (㎵): nV
U+33B7 (㎷): mV
U+33B8 (㎸): kV
U+33B9 (㎹): MV
U+33BA (㎺): pW
U+33BB (㎻): nW
U+33BD (㎽): mW
U+33BE (㎾): kW
U+33BF (㎿): MW
U+33C3 (㏃): Bq
U+33C4 (㏄): cc
U+33C5 (㏅): cd
U+33C8 (㏈): dB
U+33C9 (㏉): Gy
U+33CA (㏊): ha
U+33CB (㏋): HP
U+33CC (㏌): in
U+33CD (㏍): KK
U+33CE (㏎): KM
U+33CF (㏏): kt
U+33D0 (㏐): lm
U+33D1 (㏑): ln
U+33D2 (㏒): log
U+33D3 (㏓): lx
U+33D4 (㏔): mb
U+33D5 (㏕): mil
U+33D6 (㏖): mol
U+33D7 (㏗): PH
U+33D9 (㏙): PPM
U+33DA (㏚): PR
U+33DB (㏛): sr
U+33DC (㏜): Sv
U+33DD (㏝): Wb
U+33FF (㏿): gal
U+FB00 (ff): ff
U+FB01 (fi): fi
U+FB02 (fl): fl
U+FB03 (ffi): ffi
U+FB04 (ffl): ffl
U+FB05 (ſt): st
U+FB06 (st): st
U+1F12D (🄭): CD
U+1F12E (🄮): WZ
U+1F14A (🅊): HV
U+1F14B (🅋): MV
U+1F14C (🅌): SD
U+1F14D (🅍): SS
U+1F14E (🅎): PPV
U+1F14F (🅏): WC
U+1F16A (🅪): MC
U+1F16B (🅫): MD
U+1F190 (🆐): DJ
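To see why some characters from the tables are missing from this output, the same test can be applied to single characters. This is a small sketch reusing the condition from the script above; the helper name ascii_expansion is hypothetical, and the results depend on the Unicode data shipped with your Python version.

```python
import unicodedata


def ascii_expansion(ch):
    """Return the NFKD form of ch if it is 2+ ASCII letters, else None."""
    d = unicodedata.normalize('NFKD', ch)
    if len(d) > 1 and d.isascii() and d.isalpha():
        return d
    return None


print(ascii_expansion('\uFB01'))  # fi ligature decomposes to 'fi'
print(ascii_expansion('\u2167'))  # Roman numeral eight decomposes to 'VIII'
print(ascii_expansion('\u00E6'))  # ae has no decomposition, so None
print(ascii_expansion('\u01C4'))  # DZ with caron keeps a combining caron, so None
```

The last case shows that Ǆ, ǅ and ǆ are skipped because their decomposition contains a non-ASCII combining mark, not because they lack a decomposition.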