I saw musical symbols in HTML plain text, but does anyone know exactly how this works? [closed] - special-characters

Closed. This question is off-topic. It is not currently accepting answers.
Closed 11 years ago.
♯
♭
I saw these two symbols and copied them.
I tried HTML entities and special characters, but I can't get any result.
I can't find any information on Google either, because these symbols are not searchable.
Can anyone explain which standard defines these flat and sharp musical symbols?
And how can I type or generate them and any of their siblings?
♯
♭
♪
♬
♫

The standard that defines these characters is Unicode.
See Unicode Miscellaneous Symbols (includes common music symbols like ♯) and Unicode Musical Symbols (other music symbols). I found these by searching for "unicode musical symbols"; there are many more hits.
Happy coding.
See How to enter Unicode characters in Microsoft Windows, or use the Windows Character Map. However, you need to know the code point (or the general code-point area).
:-) Other operating systems have different input methods and utilities.
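If you have already copied a symbol and only need its code point and official name, a scripting language can tell you. A minimal sketch in Python, using the standard unicodedata module (the helper name describe is just for illustration):

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# The symbols from the question
for ch in "♯♭♪♬♫":
    print(ch, describe(ch))

# The reverse direction: look a character up by its official name.
flat = unicodedata.lookup("MUSIC FLAT SIGN")
```

Once you know the code point, you can use it with whatever input method your operating system offers.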

A quick Google search finds the following page, which lists entity codes for musical notes:
http://www.danshort.com/HTMLentities/index.php?w=music

It is in Unicode, and you can insert any Unicode character by putting a numeric character reference in HTML/XHTML markup:
&#x266C;
This gives ♬, i.e. you write &#x, followed by the hex code of the character, and end it with ;
P.S.: This technique is a last resort when facing character encoding problems.
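To generate such references for several symbols at once, something like the following Python sketch should work. The function name to_hex_reference is made up for this example; html.unescape is from the standard library and is used here only to check the round trip.

```python
import html

def to_hex_reference(ch: str) -> str:
    """Build the hexadecimal numeric character reference for one character."""
    return f"&#x{ord(ch):X};"

references = {ch: to_hex_reference(ch) for ch in "♯♭♪♬♫"}

for ch, ref in references.items():
    # html.unescape decodes the reference back to the character,
    # which is what a browser does when it renders the markup.
    print(ch, ref, html.unescape(ref) == ch)
```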

explain how this flat and sharp musical symbol exist in which standard?
Unicode
and how to type or generate them and any siblings?
Most operating systems ship with utilities for picking characters from Unicode.

Related

How does compiler understand Unicode characters so quickly? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I recently wrote a document-based program.
What intrigues me is how a compiler (in my case, for Objective-C) can convert any character into Unicode so fast when these characters are only visual representations.
I think A~Z and other common characters can be converted from ASCII to Unicode very easily. But what about other special characters, such as brand icons and the copyright sign?
I am solely interested in the internal workings of such a conversion.
Example:
How does the compiler understand what "©" is in the blink of an eye? Is it by looking it up in a Unicode table? But if I have 1000000 "©" characters, does my compiler look them up in the table 1000000 times? That is very time-consuming, isn't it?
The compiler doesn't see "©". It sees whatever numerical representation of "©" occurs in the source file it's processing. No lookup is needed, because it's already in the form the compiler uses. (Some conversions might be needed if, for example, the source file is in UTF-8 and the compiler uses UTF-32 internally, but such conversions don't require a full Unicode table.)
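To illustrate why no table is needed: turning UTF-8 bytes into a code point is just bit shifting and masking. Below is a rough Python sketch of that arithmetic. The function decode_first is invented for this example and skips the validation a real decoder performs (overlong forms, bad continuation bytes, and so on).

```python
def decode_first(data: bytes) -> int:
    """Decode the first UTF-8 sequence in data to a code point (no validation)."""
    lead = data[0]
    if lead < 0x80:                # 0xxxxxxx: plain ASCII, one byte
        return lead
    if lead >> 5 == 0b110:         # 110xxxxx: one continuation byte follows
        extra, cp = 1, lead & 0x1F
    elif lead >> 4 == 0b1110:      # 1110xxxx: two continuation bytes follow
        extra, cp = 2, lead & 0x0F
    elif lead >> 3 == 0b11110:     # 11110xxx: three continuation bytes follow
        extra, cp = 3, lead & 0x07
    else:
        raise ValueError("not a valid UTF-8 lead byte")
    for byte in data[1:1 + extra]:
        # Each continuation byte (10xxxxxx) contributes six more bits.
        cp = (cp << 6) | (byte & 0x3F)
    return cp

copyright_bytes = "©".encode("utf-8")   # b"\xc2\xa9" in the source file
print(hex(decode_first(copyright_bytes)))
```

A million "©" characters therefore cost a million repetitions of a few shifts and masks, not a million table searches.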

How to convert a normal font face to Unicode font [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I want to know how to convert a font to a Unicode font. I have a PDF file in my native language, but its text has been written in a specific font file (a TTF file). So I want to convert that text into a Unicode font.
So how can I convert that text into Unicode? Is there any free online software available, or do I have to write code in some language?
I have tried in PHP, but without much success.
Your question mixes several basic concepts (it is unclear whether you want to convert a font or the text it's written with), and I suggest you look a bit deeper into font technology before asking "then so how would I do it".
"Normal" fonts use a Unicode encoding. The "encoding" of a font describes which character image inside a font gets output for a given character code. A font can contain several encodings, such as MacRoman or Windows Western, and nowadays including a Unicode encoding is practically standard.
A font that does not comply with the Unicode encoding (or any of the common ones) cannot be used without a translation from its character set to Unicode.
Your description suggests that the font in your PDF may be such a non-conforming font, so you need a table that maps its character codes to Unicode values. Use Google to see if someone else did this before you; if not, you will have to create the table yourself.
However.
Since your text comes out of a PDF, you can no longer rely on the encoding! When a PDF is created, the software that creates it is free to move characters around to different positions. Usually it creates a subset font from the original, and it can be convenient to reassign character codes. Friendly PDF creators may also include their own encoding in the PDF, but it is not mandatory. If it is missing, and your font is subsetted, then there is only one solution: you will have to create a translation table for that particular PDF. It will not be of any use for other documents using "the same" font, because they will most likely have a different subset.
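Once you have such a translation table, applying it is the easy part. Here is a minimal Python sketch. The mapping below is entirely made up for illustration (it pretends the legacy font draws Devanagari letters at the positions of Latin A and B); you would have to fill in the real values for your font or your particular PDF.

```python
# Hypothetical table: legacy character code -> Unicode code point.
# These two entries are invented; a real table comes from inspecting the font.
LEGACY_TO_UNICODE = {
    0x41: 0x0905,  # legacy "A" position -> DEVANAGARI LETTER A
    0x42: 0x0906,  # legacy "B" position -> DEVANAGARI LETTER AA
}

def legacy_to_unicode(text: str) -> str:
    """Translate text extracted in the legacy font's codes to Unicode.

    Characters without an entry in the table are left unchanged.
    """
    return text.translate(LEGACY_TO_UNICODE)

print(legacy_to_unicode("AB"))
```

The hard work is building the table, not running it, and for a subsetted PDF font the table is only valid for that one document.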

What is Unicode? and how Encoding works? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
A few hours ago I was reading a C programming book, and I came across the terms "character encoding" and "Unicode". I started googling for information about Unicode and learned that the Unicode character set has every character from every language, and that UTF-8, UTF-16 and UTF-32 can encode the characters listed in the Unicode character set.
But I was not able to understand how it works.
Does Unicode depend on the operating system?
How is it related to software and programs?
Is UTF-8 software that was installed on my computer when I installed the operating system?
Or is it related to hardware?
And how does a computer encode things?
I find it very confusing. Please answer in detail.
I am new to these things, so please keep that in mind when you answer.
Thank you.
I have written about this extensively in What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text. Here are some highlights:
encodings are plentiful, encodings define how a "character" like "A" can be encoded as bits and bytes
most encodings only specify this for a small number of selected characters; for example all (or at least most) characters needed to write English or Czech; single byte encodings typically support a set of up to 256 characters
Unicode is one large standards effort which has catalogued and specified a number ⟷ character relationship for virtually all characters and symbols of every major language in use, which amounts to well over a hundred thousand characters
UTF-8, 16 and 32 are different sub-standards for how to encode this ginormous catalog of numbers to bytes, each with different size tradeoffs
software needs to specifically support Unicode and its UTF-* encodings, just like it needs to support any other kind of specialized encoding; most of the work is done by the OS these days which exposes supporting functions to an application
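To make the size tradeoffs concrete, here is a small Python sketch that encodes the same characters with the three UTF encodings and counts the bytes. The helper name encoded_sizes is just for this example; the "-be" codec variants are used so that no byte order mark is added to the count.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes each UTF encoding needs for one character."""
    encodings = ("utf-8", "utf-16-be", "utf-32-be")
    return {enc: len(ch.encode(enc)) for enc in encodings}

# An ASCII letter, a symbol from the Basic Multilingual Plane,
# and a character beyond it (MUSICAL SYMBOL G CLEF, U+1D11E).
for ch in ("A", "♬", "𝄞"):
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

The number assigned by Unicode stays the same in every case; only the byte representation changes.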

how are non-english programming/scripting languages developed? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
How are non-English programming/scripting languages developed?
Do you need to be a computer scientist?
You need to understand how Unicode works to build a parser for an international language, and yes, you do need to be a CS major or be able to teach yourself compiler design.
Study Unicode: learn to use ICU, or use a language with GOOD Unicode support.
Decide on and build a VM (or use an existing one).
Write a lexer/parser, or use something like ANTLR (Java-based).
Decide on an AST.
Generate the instruction stream for the VM.
Check out "Principles of Compiler Design".
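As a taste of the lexer step, here is a rough Python sketch of a tokenizer whose keywords are not English. Everything about the toy language is an assumption made for illustration: the Russian keywords, the token names, and the tiny grammar. It relies on Python's re module treating \w and \d as Unicode-aware for str patterns.

```python
import re
import unicodedata

# Hypothetical keyword table for a toy language with Russian keywords.
KEYWORDS = {"если": "IF", "иначе": "ELSE"}

TOKEN_RE = re.compile(
    r"(?P<NUMBER>\d+)"        # digits
    r"|(?P<NAME>[^\W\d]\w*)"  # identifier: letter or underscore, then word chars
    r"|(?P<OP>[^\w\s])"       # any single punctuation or symbol character
    r"|(?P<WS>\s+)"           # whitespace, skipped below
)

def tokenize(source: str) -> list:
    """Split source into (kind, text) pairs, recognizing non-English keywords."""
    # Normalize first so that visually identical identifiers compare equal.
    source = unicodedata.normalize("NFC", source)
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "WS":
            continue
        if kind == "NAME" and text in KEYWORDS:
            kind = KEYWORDS[text]
        tokens.append((kind, text))
    return tokens

print(tokenize("если x > 10"))
```

The rest of the pipeline (parser, AST, code generation) does not care which human language the keywords came from; only the lexer has to.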
You use a character encoding capable of representing extended characters, such as UTF-8. UTF-8 encodes each code point as one to four bytes, UTF-16 as one or two 16-bit code units, and UTF-32 as a single 32-bit unit. The problem that arises with the multi-byte units of UTF-16 and UTF-32 is byte order (endianness), which is unrelated to bidi, the bidirectional text layout used by right-to-left scripts: different machines may store the bytes of a code unit in different orders. One solution is to put a byte order mark (U+FEFF) at the start of the data so the reader can detect the order. The other is to state the byte order in the encoding name itself: UTF-16BE is big endian, with the most significant byte first and no byte order mark required, and UTF-16LE, little endian, is the opposite.
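The difference between the big endian and little endian forms of UTF-16 is easy to see by encoding a single character. A short Python sketch using only the standard codecs module (the variable names are arbitrary):

```python
import codecs

note = "♬"  # U+266C

big = note.encode("utf-16-be")      # most significant byte first
little = note.encode("utf-16-le")   # least significant byte first
print(big.hex(), little.hex())

# With a byte order mark in front, the generic "utf-16" decoder
# can work out the order by itself.
with_bom_be = codecs.BOM_UTF16_BE + big
with_bom_le = codecs.BOM_UTF16_LE + little
print(with_bom_be.decode("utf-16"), with_bom_le.decode("utf-16"))

# Decoding with the wrong order silently yields a different character.
wrong = big.decode("utf-16-le")
print(f"U+{ord(wrong):04X}")
```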
There is also the UCS, the Universal Character Set defined by ISO/IEC 10646. The term is still used, and its code points are kept in sync with Unicode, but the old two-byte UCS-2 encoding is obsolete because it cannot represent characters whose code points need more than two bytes. For information about the differences between UCS and Unicode please read this: http://en.wikipedia.org/wiki/Universal_Character_Set#Differences_between_ISO_10646_and_Unicode
Some examples are the following:
IRI - RFC 3987 - http://www.ietf.org/rfc/rfc3987.txt - mandates UTF8 encoding
Mail Markup Language - http://mailmarkup.org/ - mandates UTF16BE encoding

Unicode "end of story" [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 13 years ago.
I'm looking for a good character that means "end-of-story" in Unicode. I remember seeing one once that looked like a fractal and was really cool. Does anyone know where I can find this character? More importantly, where can I go to find a Unicode character with a special meaning when I don't know its name? Google wasn't very helpful.
Edit: I found something that looks kinda like a fractal, and also happens to be called "end-of-story." It's a Thai character.
Is this what you were looking for?
http://www.decodeunicode.org/en/u+0e5b/data/k//XS/khomut31910809.jpg
End of story: The Khomut sign is a terminal punctuation character which is placed in old books at the end of a verse in a poem, the end of a chapter or at the end of a story.
Compare to U+17DA Khmer Sign Koomuut
Btw: I found this with a Google Image Search for "end of story" unicode; it was the 4th result. That's probably the best way to search for any kind of symbol. Without the name of the character, though, it would probably have been impossible to find, since searching for "unicode fractal" didn't return anything useful.
Go and have a look at the unicode.org code charts. You can browse through them and find a character that you like by what they look like. http://www.unicode.org/charts/
Alternatively, browse through the names of the characters using the data file that has the official character name. Do a search using your browser or editor search function. http://www.unicode.org/Public/5.1.0/ucd/UnicodeData.txt
When you find a character and want to see what it looks like, just search for its character code. E.g. character 0087 (the first field in the UnicodeData.txt file) is searched for as U+0087. FileFormat.info usually has all of the characters. For example, END OF SELECTED AREA.
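If you would rather script the name search than scroll through UnicodeData.txt, something along these lines should work in Python. The function find_by_name is written just for this example. It uses the character database bundled with your Python version, and it will not find control characters such as U+0087, because unicodedata.name has no name for them.

```python
import sys
import unicodedata

def find_by_name(fragment: str) -> list:
    """Return (code point, name) pairs whose official name contains fragment."""
    fragment = fragment.upper()
    matches = []
    for cp in range(sys.maxunicode + 1):
        # The second argument is returned for code points without a name.
        name = unicodedata.name(chr(cp), "")
        if fragment in name:
            matches.append((f"U+{cp:04X}", name))
    return matches

for code, name in find_by_name("khomut"):
    print(code, name)
```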
Are you using Windows? Use the Character Map (Start | Accessories | System Tools). I personally like the Greek Omega (U+03A9) or the Ohm sign which is an Omega (U+2126).