What is this unicode character - u'\xf1'? [closed] - unicode

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a book, tool, software library, tutorial or other off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 8 years ago.
What is this Unicode character, u'\xf1'?
Is there a lookup table on the web somewhere? I have seen tables, but nowhere can I search for this character and get its actual representation.
Thanks.

It is ñ (ntilde).
Unicode Hexadecimal: 0x00F1
Unicode Decimal: 241
UCS-2 Hexadecimal: 0x00F1 (stored as the bytes F1 00 in little-endian order)
UCS-2 Decimal: 241
HTML Hexadecimal: &#xF1;
HTML Decimal: &#241;
http://www.fileformat.info/info/unicode/char/f1/index.htm
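If Python is at hand, the same information can be looked up locally instead of on the web. This is a minimal sketch using only the standard library; the helper name `describe` is made up for illustration.

```python
import unicodedata

def describe(ch):
    """Return a few common representations of a single character."""
    cp = ord(ch)
    return {
        "name": unicodedata.name(ch),          # official Unicode name
        "code_point": f"U+{cp:04X}",           # hexadecimal code point
        "decimal": cp,                         # decimal code point
        "html_hex": f"&#x{cp:X};",             # HTML hexadecimal reference
        "html_decimal": f"&#{cp};",            # HTML decimal reference
        "utf8_bytes": ch.encode("utf-8").hex(),
        "utf16le_bytes": ch.encode("utf-16-le").hex(),
    }

info = describe("\xf1")
for key, value in info.items():
    print(f"{key}: {value}")
```

Any other single character can be passed to `describe` in the same way.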

A search for "unicode character f1" returns what you ask for.
http://www.fileformat.info/info/unicode/char/f1/index.htm
See http://www.unicode.org/charts/ for a full 'lookup table' (several hundreds of these actually).

Related

What is the machine encoding of 4 -- Is it 011 0100 (ascii) or 0100 (binary)? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 3 years ago.
I have a file "num.txt" which has only a number 4 in it.
With xxd num.txt, I found that the number is encoded as its ASCII code, 0x34 that is 011 0100. Why is the number not simply encoded as its binary form 0100?
[Edit] My question is really about why 4 is encoded in ASCII, not in its binary form?
What you have is the character '4', which is code point 0x34 in ASCII (and Unicode, for that matter).
In ASCII, code point 4 is EOT (end of transmission), commonly entered as CTRL-D.
As to your edit:
My question is really about why 4 is encoded in ASCII, not in its binary form?
The answer to that is that it's a text file. Whatever has created it has decided it wants to store the values as textual rather than binary information. It's really that simple :-)
If you want to go deeper into that particular question, you're going to have to ask the person who developed the software that creates the file, I'm afraid.
011 0100 isn't 34. It's 0x34. 0x34 is the ASCII encoding of the digit '4'.
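The difference between the character '4' and the number 4 can be seen by producing both byte forms. This is a small sketch in Python, not the tool that created the original file; the variable names are only illustrative.

```python
# The character '4' as a text editor would store it (ASCII code 0x34).
text_form = "4".encode("ascii")

# The integer 4 stored as one raw byte (value 0x04, the EOT control code).
binary_form = (4).to_bytes(1, "big")

print(text_form.hex())              # hex of the text byte
print(binary_form.hex())            # hex of the raw integer byte
print(format(text_form[0], "07b"))  # the text byte as 7 binary digits
```

A program that wants the raw value has to write it in binary mode on purpose; a text editor always writes the character.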

How to convert character to unicode? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I have this character.
&#8211
How to convert this character to unicode?
Sorry if it is a silly question.
It's not a silly question; character encoding can be tricky to get your head around. I highly recommend reading The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) (I'm sure you can guess the topic).
Unicode itself isn't an encoding, it's a very long list of characters and code points. What I'm guessing you want to do is display the dash character in some way. Where are you wanting to display or store the data? If it's in a browser, then that representation should work as that's the HTML encoded version. If you want to store it in a database then you'll need to convert that encoded version to a string and then convert that string to whatever encoding the database is using.
Take a look at this source, which lists the character's encoding in different formats:
http://www.fileformat.info/info/unicode/char/2013/index.htm
Note that each programming language has its own rules for writing this character in a string or char literal.
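As one example of such language rules, here is how the conversion might look in Python. The sketch assumes the input is the HTML numeric reference written with its trailing semicolon; `html.unescape` and `unicodedata.name` are standard-library functions.

```python
import html
import unicodedata

entity = "&#8211;"             # HTML decimal reference (assumed to end with ';')
ch = html.unescape(entity)     # decode the reference into a real character

code_point = f"U+{ord(ch):04X}"
name = unicodedata.name(ch)
print(code_point, name)

# The same character written as a Python string literal.
literal = "\u2013"

# Converting the string to a concrete encoding, e.g. for storage.
utf8_bytes = ch.encode("utf-8")
print(utf8_bytes.hex())
```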

Why does windows notepad give possibility to save document in unicode and in utf-8? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 8 years ago.
UTF-8 is "a variable-width encoding that can represent every character in the Unicode character set" (Wikipedia); Unicode is a "standard for the consistent encoding, representation and handling of text" (Wikipedia). They're different things. Why does Windows Notepad offer to save a document in either Unicode or UTF-8? How can I compare two different things?
To simplify,
Unicode says what number should represent each character.
UTF-8 says how to arrange the bits to form strings of Unicode values.
According to this thread, what "Unicode" means in Notepad is UTF-16 Little Endian (UTF-16LE), which is another way of arranging the bits to form strings of Unicode values.
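To make the distinction concrete, the sketch below encodes one Unicode character both ways in Python. The byte order mark at the end is an assumption about what a file saved as UTF-16LE typically starts with, not something taken from Notepad's documentation.

```python
import codecs

text = "\xf1"  # one Unicode character: U+00F1

# Same code point, two different byte arrangements.
utf8 = text.encode("utf-8")
utf16le = text.encode("utf-16-le")

print("code point:", f"U+{ord(text):04X}")
print("UTF-8 bytes:   ", utf8.hex())
print("UTF-16LE bytes:", utf16le.hex())

# Assumption: a UTF-16LE file usually begins with this byte order mark.
with_bom = codecs.BOM_UTF16_LE + utf16le
print("with BOM:      ", with_bom.hex())

# Both byte strings decode back to the same Unicode text.
same_text = utf8.decode("utf-8") == utf16le.decode("utf-16-le")
```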

GB18030 vs Big5 Chinese character encodings sizewise [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 9 years ago.
We've two encodings available for Chinese characters, GB18030 and Big5 for Chinese Simplified and Chinese Traditional respectively.
How many bytes (octets) does a single Chinese character take in each encoding?
Going by Wikipedia:
GB 18030 - Guójiā Biāozhǔn (国家标准) is a variable-width encoding scheme that uses 1, 2, or 4 octets (bytes) per character; the most common Chinese characters take 2 octets. See also GB18030 - New Chinese Encoding Standard.
Big-5 or Big5 is a 2-octet (byte) encoding scheme. Here every Chinese character takes 2 octets.
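The sizes can also be measured rather than read off a description. This is a rough check using Python's built-in `gb18030` and `big5` codecs; the helper `encoded_length` and the sample characters are chosen only for illustration, and the results reflect Python's codec tables.

```python
def encoded_length(ch, codec):
    """Return the number of bytes ch takes in codec, or None if unsupported."""
    try:
        return len(ch.encode(codec))
    except UnicodeEncodeError:
        return None

samples = {
    "ASCII letter": "A",
    "common Han character": "\u4e2d",       # U+4E2D
    "rare Han character": "\U00020000",     # U+20000, outside the BMP
}

for label, ch in samples.items():
    gb = encoded_length(ch, "gb18030")
    b5 = encoded_length(ch, "big5")
    print(f"{label}: gb18030={gb} bytes, big5={b5} bytes")
```

Measured this way, GB18030 comes out as variable in width rather than fixed, while the common Han character takes 2 bytes in both encodings.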

What languages does the character encoding UTF-8 support? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 11 years ago.
What languages does UTF-8 support?
And how many languages does UTF-8 support?
See the page Supported Scripts on unicode.org. UTF-8 supports all Unicode characters.
Note that Unicode defines character encodings, not languages.
The Unicode Standard encodes scripts rather than languages per se. ...
UTF-8 is supposed to be able to represent any Unicode character.
http://en.wikipedia.org/wiki/UTF-8
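A quick way to see this is to encode characters from several scripts and decode them again. The sketch below is in Python; the sample characters are arbitrary picks, not a complete list of supported scripts.

```python
samples = {
    "Latin": "A",
    "Latin with tilde": "\xf1",
    "Cyrillic": "\u0416",
    "Han": "\u4e2d",
    "Emoji": "\U0001F600",
}

lengths = {}
for script, ch in samples.items():
    encoded = ch.encode("utf-8")
    # Every sample survives a round trip through UTF-8 unchanged.
    assert encoded.decode("utf-8") == ch
    lengths[script] = len(encoded)
    print(f"{script}: U+{ord(ch):04X} -> {encoded.hex()} ({len(encoded)} bytes)")
```

The byte count grows with the code point, which is what "variable-width" means in practice.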