I'm looking for a way to represent an emoji 📄 in my code as Unicode, so that it is displayed as an actual 'image' in the output text. I'd like to use http://apps.timwhitlock.info/unicode/inspect/hex/1F4C4 to display the 'page facing up' character in my application, but I don't like the idea of having pictures in my code (though it is working fine) ;)
You can use arbitrary Unicode characters directly in your source code:
let string = "📄"
or use the Swift Unicode escape sequence:
let string = "\u{1F4C4}"
More information in the section about "String Literals" in the Swift reference.
Related
I'm trying to put a custom font (https://andrewgioia.github.io/Mana/cheatsheet.html) into my app. I've followed the instructions in the Apple docs and confirmed that the font is usable in the storyboard, etc.
But when I try to use the UTF8 representation in a string I get the following error:
Expected '}' in \u{...} escape sequence
I am just trying to do something simple for now like:
let s = "\u{}"
I don't see this error in a Google search and am hoping someone can point me in the right direction. Thanks!
The proper syntax is:
let s = "\u{e600}"
Just put the hex code in the curly braces.
BTW - That's a private use area character so unless you have a special font that uses that character, don't expect to see much.
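To check whether a given code point falls in a private use area before relying on it, something like the following could help. This is only a sketch in JavaScript rather than Swift, and isPrivateUse is a made-up helper name, not a library function; the three ranges are the private use ranges defined by Unicode.

```javascript
// Hypothetical helper: true if the code point lies in one of Unicode's
// three private use ranges (BMP PUA, plane 15, plane 16).
function isPrivateUse(codePoint) {
  return (
    (codePoint >= 0xe000 && codePoint <= 0xf8ff) ||
    (codePoint >= 0xf0000 && codePoint <= 0xffffd) ||
    (codePoint >= 0x100000 && codePoint <= 0x10fffd)
  );
}

console.log(isPrivateUse(0xe600));  // true: needs a font that defines this glyph
console.log(isPrivateUse(0x1f4c4)); // false: a regular assigned emoji
```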
I want to use emojis in my iOS and Android app. I checked the list of emojis here and it lists the hex code for each emoji. When I try to use a hex code such as U+1F600 directly, I don't see the emoji within the app. I found one other way of representing an emoji, which looks like \uD83D\uDE00. When using this notation, the emoji is seen within the app without any extra code. I think this is a Unicode string for the emoji. I think this is more of a general question than one specific to emojis: how can I convert an emoji hex code to the Unicode string shown above? I didn't find any list where the Unicode strings for the emojis are listed.
It seems that your question is really one of "how do I display a character, knowing its code point?"
This question turns out to be rather language-dependent! Modern languages have little trouble with this. In Swift, we do this:
$ swift
Welcome to Apple Swift version 3.0.2 (swiftlang-800.0.63 clang-800.0.42.1). Type :help for assistance.
1> "\u{1f600}"
$R0: String = "😀"
In JavaScript, it is the same:
$ node
> "\u{1f600}"
'😀'
In Java, you have to do a little more work. If you want to use the code point directly you can say:
new StringBuilder().appendCodePoint(0x1f600).toString();
The sequence "\uD83D\uDE00" also works in all three languages. This is because those "characters" are actually what Unicode calls surrogates and when they are combined together a certain way they stand for a single character. The details of how this all works can be found on the web in many places (look for UTF-16 encoding). The algorithm is there. In a nutshell you take the code point, subtract 10000 hex, and spread out the 20 bits of that difference like this: 110110xxxxxxxxxx110111xxxxxxxxxx.
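The bit-spreading described above can be sketched in a few lines. This is an illustration in JavaScript, and toSurrogatePair and toEscape are names invented here, not built-in functions.

```javascript
// Split a code point above U+FFFF into its UTF-16 surrogate pair.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10ffff) {
    throw new RangeError("code point must be in U+10000..U+10FFFF");
  }
  const offset = codePoint - 0x10000;     // the 20-bit difference
  const high = 0xd800 | (offset >> 10);   // 110110 followed by the top 10 bits
  const low = 0xdc00 | (offset & 0x3ff);  // 110111 followed by the bottom 10 bits
  return [high, low];
}

// Format the pair as a \uXXXX\uXXXX escape string.
function toEscape(codePoint) {
  return toSurrogatePair(codePoint)
    .map((unit) => "\\u" + unit.toString(16).toUpperCase().padStart(4, "0"))
    .join("");
}

console.log(toEscape(0x1f600)); // \uD83D\uDE00
```

For 0x1F600 the difference is 0xF600, whose top ten bits are 0x3D and bottom ten bits are 0x200, which gives D83D and DE00.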
But rather than worrying about this translation, you should use the code point directly if your language supports it well. You might also be able to copy-paste the emoji character into a good text editor (make sure the encoding is set to UTF-8). If you need to use the surrogates, your best bet is to look up a Unicode chart that shows you something called the "UTF-16 encoding."
In Delphi XE, #$1F600 is equivalent to #55357#56832, i.e. the surrogate pair D83D DE00 (smile).
Within a program, I use it in the following way:
const smilepage : array [1..3] of WideString =(#$1F600,#$1F60A,#$2764);
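JavaScript, in both directions:
// emoji -> hex code point
let hex = "😀".codePointAt(0).toString(16);
// hex code point -> emoji (parse the hex string explicitly instead of relying on "0x" coercion)
let emo = String.fromCodePoint(parseInt(hex, 16));
console.log(hex, emo);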
I am working on an app in which I need to convert a string containing Arabic letters into a const char *. I have the following code, but it returns nil.
I tried different encodings such as NSISOLatin1StringEncoding, NSASCIIStringEncoding, etc.
My code is as follows.
My string cmpnyname contains the Arabic characters.
const char *textcmnylogo = [cmpnyname cStringUsingEncoding:NSSymbolStringEncoding];
textcmnylogo comes out nil.
Please let me know the right encoding style.
Thanks in advance
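Assuming the text is in an NSString, you need to pick an encoding that can represent the Arabic characters; cStringUsingEncoding: returns NULL when the string cannot be converted losslessly, which is what happens with NSSymbolStringEncoding. NSUTF8StringEncoding will handle it, as will the other UTF encodings.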
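The same point can be shown outside Objective-C. The sketch below is JavaScript for Node.js (Buffer is Node-specific), and the sample string is a made-up stand-in for cmpnyname. Unlike cStringUsingEncoding:, Node does not return null for an unsuitable encoding; it silently drops information, which is the same underlying problem.

```javascript
const name = "شركة"; // hypothetical Arabic company name

// UTF-8 can represent every Unicode character, so the round trip is lossless.
const utf8Bytes = new TextEncoder().encode(name);
const roundTripped = new TextDecoder("utf-8").decode(utf8Bytes);
console.log(roundTripped === name); // true

// Latin-1 keeps only the low byte of each UTF-16 code unit,
// so the Arabic letters cannot survive the round trip.
const latin1 = Buffer.from(name, "latin1").toString("latin1");
console.log(latin1 === name); // false
```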
I'm parsing RTF 1.5+ files generated by Word 2003+ that may have content from other languages. This content is usually encoded as hex literals (\'xx). I would like to convert these literals to unicode values.
I know my document's code page by looking for ansicpg (\ansi\ansicpg1252).
When I use the ansicpg codepage to decode to Unicode, many languages (like French) seem to convert to the Unicode char values that I expect.
However, when I see Russian text (like below), code page 1252 decodes the content to gibberish.
\f277\lang1049\langfe1033\langnp1049\insrsid5989826\charrsid6817286
\'d1\'f2\'f0\'e0\'ed\'e8\'f6\'fb \'e1\'e5\'e7 \'ed\'e0\'e7\'e2\'e0\'ed\'e8\'ff. \'dd\'f2
\'e0 \'f1\'f2\'f0\'e0\'ed\'e8\'f6\'e0 \'ed\'e5 \'e4\'ee\'eb\'e6\'ed\'e0
\'ee\'f2\'ee\'e1\'f0\'e0\'e6\'e0\'f2\'fc\'f1\'ff \'e2 \'f2\'e0\'e1\'eb\'e8\'f6\'e5
\'e2 \'f1\'ee\'e4\'e5\'f0\'e6\'e0\'ed\'e8\'e8.
I assume that lang1049, langfe1033, langnp1049 should provide me clues so I can programmatically choose a different (non-default) code page for the text that they reference? If so, where can I find information that explains how to map a lang* code to a codepage? Or should I be looking for some other RTF command/directive to provide me with the information I'm looking for? (Or must I use \f277 as a font reference and see if it has an associated codepage?)
\lang really only marks up particular stretches of the text as being in a particular language, and shouldn't impact what code page is to be used for the old non-Unicode \' escapes.
Putting an \ansicpg token in the header should perhaps do it, but seems to be ignored by Word (for both raw bytes and \' escapes).
Or must I use \f277 as a font reference and see if it has an associated codepage?
It looks that way. Changing the \fcharset of the font assigned to a particular stretch of text is the only way I can get Word to change how it treats the bytes, anyway. The codes in this token (see e.g. here for a list) are, aggravatingly, different again from both the language ID and the code page number.
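As a rough illustration of that approach, here is a JavaScript sketch that decodes the \' escapes from the question once the font's charset has been resolved to a code page. It assumes font 277 was declared with \fcharset204 (the Russian charset, which corresponds to Windows-1251); the question does not show the font table, so that is a guess. The table and function names are invented for this example, the table is deliberately partial, and the decoder only covers ASCII plus the basic А..я range of Windows-1251.

```javascript
// Partial, illustrative table: \fcharsetN value -> Windows code page.
const FCHARSET_TO_CODEPAGE = { 0: 1252, 204: 1251 };

// Decode one Windows-1251 byte. Only ASCII and the contiguous А..я block
// (0xC0..0xFF -> U+0410..U+044F) are handled in this sketch.
function decodeCp1251Byte(byte) {
  if (byte < 0x80) return String.fromCharCode(byte);
  if (byte >= 0xc0) return String.fromCharCode(0x0410 + (byte - 0xc0));
  return "\ufffd"; // bytes 0x80..0xBF are left out of this sketch
}

// Replace every \'xx escape with the character chosen by decodeByte.
function decodeRtfHexEscapes(text, decodeByte) {
  return text.replace(/\\'([0-9a-fA-F]{2})/g, (_, hex) =>
    decodeByte(parseInt(hex, 16))
  );
}

const sample = "\\'d1\\'f2\\'f0\\'e0\\'ed\\'e8\\'f6\\'fb \\'e1\\'e5\\'e7";
console.log(FCHARSET_TO_CODEPAGE[204]);                    // 1251
console.log(decodeRtfHexEscapes(sample, decodeCp1251Byte)); // Страницы без
```

A real parser would track the current \fN per group and look up that font's \fcharset in the font table before picking the decoder.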
It is not well documented, but you can use the RichEdit control to convert the RTF to UTF-8, according to MSDN:
http://msdn.microsoft.com/en-us/library/windows/desktop/bb774304(v=vs.85).aspx
Take a look at the SF_USECODEPAGE flag for the EM_STREAMOUT message.
I am using a hidden RichTextBox to retrieve the Text property from a RichEditCtrl.
rtb->Text; returns the text portion in either English or national languages – just great!
But I need this text as \u12232? \u32232? instead of national characters and symbols, to work with my db and RichEditCtrl. Any idea how to get from “пассажирским поездом Невский” to “\u12415?\u12395?\u23554?\u20219?\u30456?\u35527?\u21729?” (where each national character is represented as “\u23232?”)?
If you have, that would be great.
I am using visual studio 2008 C++ combination of MFC and managed code.
Cheers and have a wonderful weekend
If you need a System::String as an output as well, then something like this would do it:
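// Requires: using namespace System; using namespace System::Text;
String^ s = rtb->Text;
StringBuilder^ sb = gcnew StringBuilder(s->Length);
for (int i = 0; i < s->Length; ++i) {
    // "\\u" emits a literal backslash-u; {0:D5} is the UTF-16 code unit as 5 decimal digits
    sb->AppendFormat("\\u{0:D5}?", (int)s[i]);
}
String^ result = sb->ToString();  // the escaped text lives in the builder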
By the way, are you sure the format is as described? \u is traditionally an escape sequence for a hexadecimal Unicode code point, exactly 4 hex digits long, e.g. \u0F3A. It's also not normally followed by ?. If you actually want that, the format specifier {0:X4} should do the trick.
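If the target really is RTF, the shape in the question does match RTF's own \uN keyword: N is a signed 16-bit decimal number per UTF-16 code unit (values above 32767 are written as negative numbers), followed by a fallback character such as ?. A minimal sketch of that encoding in JavaScript, assuming ASCII can pass through unchanged apart from RTF's three special characters; toRtfUnicodeEscapes is a name invented for this example:

```javascript
// Encode non-ASCII text as RTF \uN? escapes, one per UTF-16 code unit.
function toRtfUnicodeEscapes(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    if (unit < 0x80) {
      // ASCII passes through; backslash and braces must be escaped in RTF.
      const ch = text[i];
      out += ch === "\\" || ch === "{" || ch === "}" ? "\\" + ch : ch;
    } else {
      // RTF wants a signed 16-bit value, so wrap anything above 0x7FFF.
      const signed = unit > 0x7fff ? unit - 0x10000 : unit;
      out += "\\u" + signed + "?";
    }
  }
  return out;
}

console.log(toRtfUnicodeEscapes("поездом"));
// \u1087?\u1086?\u1077?\u1079?\u1076?\u1086?\u1084?
```

Note that the Cyrillic letters come out as four-digit values around 1040 to 1103, not the five-digit values shown in the question.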
You don't need to use escaping to put formatted Unicode in a RichText control. You can use UTF-8. See my answer here: Unicode RTF text in RichEdit.
I'm not sure what your restrictions are on your database, but maybe you can use UTF-8 there too.