How to convert a normal font face to Unicode font [closed] - unicode

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I want to know how to convert a font to a Unicode font. I have a PDF file in my native language, but its text was written using a specific font file (a TTF file), so I want to convert that text to Unicode.
How can I do this? Is there free online software available, or do I have to write the code myself in some language?
I have tried in PHP, but without much success.

Your question mixes several basic concepts (it is unclear whether you want to convert a font or the text it's written with), and I suggest you look a bit deeper into font technology before asking "then so how would I do it".
"Normal" fonts use a Unicode encoding. The "encoding" of a font describes which character image inside the font gets output for a given character code. A font can contain several encodings -- MacRoman, Windows Western -- and nowadays including a Unicode encoding is practically standard.
A font that does not comply with the Unicode encoding (or any of the common ones) cannot be used without a translation from its character set to Unicode.
Your description suggests that the font in your PDF may be such a non-conforming font, so you need a table that maps its character codes to Unicode values. Use Google to see if someone else did this before you; if not, you will have to create the table yourself.
However.
Since your text comes out of a PDF, you cannot rely anymore on the encoding! If a PDF gets created, the software that does it is free to move characters around to different positions -- usually it creates a subset font from the original, and it can be convenient to reassign character codes. Friendly PDF creators may also include their own encoding in the PDF, but it is not mandatory. If it is missing, and your font is subsetted, then there is only one solution: you will have to create a translation table for that particular PDF. It will not be of any use for other documents using "the same" font, because that most likely will have a different subset.
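The translation table described above could look roughly like the following sketch. The mapping entries are invented purely for illustration (they pretend that the legacy font draws Devanagari letters at the positions of 'A', 'B' and 'C'); a real table has to be built by inspecting the actual font, or the actual subset inside one particular PDF. The names `LEGACY_TO_UNICODE` and `legacy_to_unicode` are hypothetical.

```python
# Hypothetical mapping from a legacy font's character codes to Unicode.
# These three entries are made up for illustration only.
LEGACY_TO_UNICODE = {
    "A": "\u0915",  # pretend the glyph stored at 'A' is DEVANAGARI LETTER KA
    "B": "\u0916",  # pretend the glyph stored at 'B' is DEVANAGARI LETTER KHA
    "C": "\u0917",  # pretend the glyph stored at 'C' is DEVANAGARI LETTER GA
}


def legacy_to_unicode(text, table=LEGACY_TO_UNICODE):
    """Replace every mapped legacy code with its Unicode character.

    Characters that have no entry in the table pass through unchanged.
    """
    return text.translate(str.maketrans(table))


if __name__ == "__main__":
    print(legacy_to_unicode("ABC A"))
```

For a subsetted PDF font, the same function would need a different table per document, as explained above.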


Map special characters to URL safe, readable versions [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 8 years ago.
I am looking for a mapping table, a Perl module, or anything else that makes it possible to map characters to a URL-safe version that is still readable.
I need to build URLs without any special characters. The base words are city names in their native language, which means they can contain special characters from that language.
For example, for the Polish city name 'łódź' I need a readable version like 'lodz'.
The major browsers show and accept non-ASCII characters in the URL bar even if they need to be encoded during transmission.
For example,
http://.../city/Montr%C3%A9al
will appear as
http://.../city/Montréal
in the browser's URL bar.
But if you want to convert to a subset of ASCII, you'd start by using Text::Unidecode's unidecode. Then you have to decide what to do with the characters that must be escaped in URLs.
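Both options can be sketched in Python using only the standard library. This is not Text::Unidecode: it is a much cruder approximation that strips combining marks after Unicode decomposition and relies on a small hand-written table (an assumption, far from complete) for letters such as 'ł' that have no decomposition. The names `MANUAL` and `to_ascii_slug` are hypothetical.

```python
import unicodedata
from urllib.parse import quote

# Letters without a Unicode decomposition need explicit replacements.
# This table is an illustrative assumption and is not complete.
MANUAL = str.maketrans({
    "ł": "l", "Ł": "L",
    "ø": "o", "Ø": "O",
    "đ": "d", "Đ": "D",
    "ß": "ss",
})


def to_ascii_slug(name):
    """Return a lowercase, ASCII-only, hyphen-separated version of name."""
    text = name.translate(MANUAL)
    # NFKD splits 'ó' into 'o' plus a combining acute accent, and so on.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Drop whatever is still outside ASCII.
    ascii_only = stripped.encode("ascii", "ignore").decode("ascii")
    # Keep letters and digits; any run of other characters becomes one hyphen.
    words = "".join(ch if ch.isalnum() else " " for ch in ascii_only).split()
    return "-".join(words).lower()


if __name__ == "__main__":
    print(to_ascii_slug("łódź"))
    # Alternative: keep the original name and percent-encode it as UTF-8.
    print(quote("Montréal"))
```

Characters that survive neither the table nor the decomposition are silently dropped here, which may or may not be acceptable for a given set of city names.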

What is Unicode? and how Encoding works? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
A few hours ago I was reading a C programming book and came across the terms character encoding and Unicode. I started googling for information about Unicode and learned that the Unicode character set has every character from every language, and that UTF-8, UTF-16 and UTF-32 can encode the characters listed in the Unicode character set.
But I was not able to understand how it works.
Does Unicode depend on the operating system?
How is it related to software and programs?
Is UTF-8 software that was installed on my computer when I installed the operating system?
Or is it related to hardware?
And how does a computer encode things?
I have found it very confusing. Please answer in detail.
I am new to these things, so please keep that in mind when you answer.
Thank you.
I have written about this extensively in What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text. Here are some highlights:
encodings are plentiful, encodings define how a "character" like "A" can be encoded as bits and bytes
most encodings only specify this for a small number of selected characters; for example all (or at least most) characters needed to write English or Czech; single byte encodings typically support a set of up to 256 characters
Unicode is one large standard effort which has catalogued and specified a number ⟷ character relationship for virtually all characters and symbols of every major language in use, which is well over a hundred thousand characters
UTF-8, 16 and 32 are different sub-standards for how to encode this ginormous catalog of numbers to bytes, each with different size tradeoffs
software needs to specifically support Unicode and its UTF-* encodings, just like it needs to support any other kind of specialized encoding; most of the work is done by the OS these days which exposes supporting functions to an application
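The size tradeoffs between the UTF-* encodings mentioned above can be seen with a short Python sketch. The helper name `encoded_sizes` is made up for this example; the little-endian codec variants are used only so that no byte order mark is added to the output.

```python
def encoded_sizes(ch):
    """Return the number of bytes one character needs in each UTF encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),  # -le: no byte order mark
        "utf-32": len(ch.encode("utf-32-le")),
    }


if __name__ == "__main__":
    for ch in ("A", "é", "€", "🔑"):
        # ord() gives the Unicode number (code point) of the character.
        print(f"U+{ord(ch):04X}", ch, encoded_sizes(ch))

    # The same character becomes different bytes under different encodings.
    print("é".encode("utf-8"))    # two bytes
    print("é".encode("latin-1"))  # one byte in a single-byte encoding
```

The code point is the same in every case; only the byte representation changes with the encoding.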

I saw a musical symbol in HTML plain text; does anyone know how exactly this happens? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 11 years ago.
♯
♭
I saw these two symbols and copied them.
I tried HTML entities and special characters, but I can't get any result.
I can't find any information on Google either, because these symbols are not searchable.
Can anyone explain which standard these flat and sharp musical symbols come from?
And how can I type or generate them and their siblings?
♯
♭
♪
♬
♫
The standard used to define the characters is Unicode
See Unicode Miscellaneous Symbols (includes common music symbols like ♯) and Unicode Musical Symbols (other music symbols) -- I did a search for "unicode musical symbols", there are many more hits.
Happy coding.
See How to enter Unicode characters in Microsoft Windows -- or use the Windows Character Map. However, you need to know the code-point (or general code-point area) :-)
Other operating systems have different input methods and utilities.
A quick Google search finds the following page, which lists entity codes for musical notes:
http://www.danshort.com/HTMLentities/index.php?w=music
It is in Unicode, and you can insert any Unicode character by putting this in HTML/XHTML markup:
&#x266C;
This gives ♬, i.e. you write &#x followed by the hex code of the character, and end it with ;
P.S: This technique is used as the last resort when facing character encoding problems.
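To go from a copied symbol back to its numeric character reference, a few lines of Python are enough. This is only a sketch; the helper names `hex_entity` and `dec_entity` are made up, while `html.unescape` is the standard library function that decodes such references.

```python
import html


def hex_entity(ch):
    """Return the hexadecimal numeric character reference for one character."""
    return f"&#x{ord(ch):X};"


def dec_entity(ch):
    """Return the decimal numeric character reference for one character."""
    return f"&#{ord(ch)};"


if __name__ == "__main__":
    for ch in "♯♭♪♬♫":
        print(ch, hex_entity(ch), dec_entity(ch))

    # Decoding goes the other way round.
    print(html.unescape("&#x266C;"))
```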
explain how this flat and sharp musical symbol exist in which standard?
Unicode
and how to type or generate them and any siblings?
There are utilities for picking characters from unicode distributed with most operating systems.

Is there a Unicode glyph that looks like a "key" icon? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
Unicode has a million icon-like glyphs, but they're not always easy to search by, since I don't always know what they look like.
Is there a Unicode glyph that looks like a "key"? Or is there a symbol that's used in database circles to mean "primary key", which is in Unicode?
U+1F511 🔑 KEY
(128273 decimal)
Also:
U+1F5DD 🗝 (Decimal: 128477) OLD KEY
U+26BF ⚿ SQUARED KEY
U+1F510 🔐 CLOSED LOCK WITH KEY
U+1F512 🔒 LOCK
U+1F513 🔓 OPEN LOCK
U+1F50F 🔏 LOCK WITH INK PEN
To find useful symbols, I have this resource:
http://shapecatcher.com
Allows you to draw a shape, which it then searches for similarly shaped unicode symbols.
I often end up using shapecatcher these days just because it's a fun break just to be able to draw the shape that you want and have the site pull it up for you. At least, sometimes it will pull it up.
Misc. Symbols Blocks
http://shapecatcher.com/unicode/block/Miscellaneous_Symbols_And_Pictographs is also a great category of unicode symbols, though as with all unicode, you may have to test compatibility.
This is duplicated from my answer here because I think the approach will be useful to others besides just me: What Unicode character do you use in your website? (instead of image icons)
I used a little Python 3 script to look, and the closest I found does not display here for me (does display in Idle on my machine), but it is:
9897 ⚩ HORIZONTAL MALE WITH STROKE SIGN
(Looks like a male sign pointed right with a perpendicular stroke added between the arrow and circle)
I searched for various matches like "KEY" and "LOCK" in the unicode names using Python's unicodedata module and no luck there.
Editing to add - Ah hah - one that looks even more like a key:
9911 ⚷ CHIRON
I give both of the above code points in decimal. To see them and their hex codes, go to this link:
http://www.unicode.org/charts/PDF/U2600.pdf
See 26B7 in particular for the Chiron.
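A name search of the kind described in this answer can be sketched with Python's unicodedata module as follows. This is not the answerer's original script, and the function name `find_by_word` is made up. The results depend on the Unicode version bundled with the Python interpreter, which may be why an older interpreter found no "KEY" matches while a current one does.

```python
import sys
import unicodedata


def find_by_word(word):
    """Return (code point, name) pairs whose name contains word as a whole word."""
    word = word.upper()
    hits = []
    for cp in range(sys.maxunicode + 1):
        # The second argument is returned for code points without a name.
        name = unicodedata.name(chr(cp), "")
        if word in name.split():
            hits.append((cp, name))
    return hits


if __name__ == "__main__":
    print("Unicode database version:", unicodedata.unidata_version)
    for cp, name in find_by_word("KEY"):
        # Show the hex code point, the decimal value, the character and its name.
        print(f"U+{cp:04X} ({cp}) {chr(cp)} {name}")
```

Matching on whole words keeps names such as MONKEY out of the results; a plain substring test would include them.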
Check out #26bf.
9919 ⚿ SQUARED KEY (HTML: &#9919;)
It's the parental lock, which is a key inside a square. It's a newer Unicode specification so standard fonts don't support it, but if you can find a font that has it, you're home free.
I've found Google to be the best way to find Unicode characters. I didn't see anything useful for a key symbol, however.
If you want to search visually, use the PDF charts, since HTML-based listings will only show symbols that occur in the particular set of fonts you have installed.
Lacking any specific symbol, I would just use "I" to indicate an index and "PK" for a primary key.
I browsed through all the symbols (using a PHP script I created a while back) and can't see a key symbol. You could try one of these:
A mathematic-looking P:
ℙ (#8473)
Various star shapes:
★ (#9733)
☆ (#9734)
✶ (#10038)
There doesn't seem to be a unicode character that fits your description, but I'd recommend the silk icon set by famfamfam if you can use icons in your situation--just a suggestion :P

Unicode "end of story" [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 13 years ago.
I'm looking for a good character that means "end-of-story" in Unicode. I remember seeing one once that looked like a fractal and was really cool. Does anyone know where I can find this character? More importantly, where can I go to find a Unicode character with a special meaning when I don't know its name? Google wasn't very helpful.
Edit: I found something that looks kinda like a fractal, and also happens to be called "end-of-story." It's a Thai character.
Is this what you were looking for?
http://www.decodeunicode.org/en/u+0e5b/data/k//XS/khomut31910809.jpg
End of story: The Khomut sign is a terminal punctuation character which is placed in old books at the end of a verse in a poem, the end of a chapter or at the end of a story.
Compare to U+17DA Khmer Sign Koomuut
Btw: I found this with a Google Image Search on "end of story" unicode--It was the 4th result. That's probably the best way to search for any kind of symbol. Though without the name of the character it would probably have been impossible to find, since unicode fractal didn't return anything useful.
Go and have a look at the unicode.org code charts. You can browse through them and find a character that you like by what they look like. http://www.unicode.org/charts/
Alternatively, browse through the names of the characters using the data file that has the official character name. Do a search using your browser or editor search function. http://www.unicode.org/Public/5.1.0/ucd/UnicodeData.txt
When you find a character that you want to see what it looks like, just do a search for the character code. e.g. character 0087 (the first field in the UnicodeData.txt file) is searched as U+0087. FileFormat.info usually has all of the characters. For example, END OF SELECTED AREA.
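Going from a character code to its U+ notation and name can also be done locally. The sketch below uses Python's unicodedata module instead of UnicodeData.txt itself; the function name `describe` is made up. One difference from the data file is that control characters such as U+0087 have no name in this module, so a placeholder is returned for them.

```python
import unicodedata


def describe(code):
    """Format a code point as 'U+XXXX NAME'.

    code may be an int, or a hex string such as '0E5B' or 'U+0E5B'.
    """
    if isinstance(code, str):
        code = int(code.upper().replace("U+", ""), 16)
    # The second argument is returned when the character has no name.
    name = unicodedata.name(chr(code), "<no name in this database>")
    return f"U+{code:04X} {name}"


if __name__ == "__main__":
    print(describe("0E5B"))
    print(describe("U+17DA"))
    print(describe(0x0087))
    # The reverse direction: look a character up by its exact name.
    print(unicodedata.lookup("THAI CHARACTER KHOMUT"))
```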
Are you using Windows? Use the Character Map (Start | Accessories | System Tools). I personally like the Greek Omega (U+03A9) or the Ohm sign which is an Omega (U+2126).