iTextSharp special characters like <

I am generating a PDF using iTextSharp. I have a requirement to display << and >> on the PDF page. How can I do this?

If you know the entity number of the special character (its decimal Unicode code point), you can create that character simply by casting the number to char:
document.Add(Phrase.GetInstance(" This is " + (char)945));
Substitute your own entity number for 945 (945 is the Greek letter α). Hope this helps.
Edit: the FontSelector class in iTextSharp can help further, provided you have a font that contains the required symbols.
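As a language-neutral illustration of the cast above, here is a minimal Python sketch. It assumes the question means the guillemets « and » (decimal code points 171 and 187) rather than the plain ASCII pairs; the constant names are made up for this example.

```python
# Decimal code points taken from the Unicode charts.
ALPHA = chr(945)            # Greek small letter alpha, the example used above
LEFT_GUILLEMET = chr(171)   # U+00AB (assumed meaning of "<<")
RIGHT_GUILLEMET = chr(187)  # U+00BB (assumed meaning of ">>")

# Build the string that would then be handed to the PDF library.
text = LEFT_GUILLEMET + " quoted " + RIGHT_GUILLEMET
print(text, ALPHA)
```

Whether the glyphs actually appear in the PDF still depends on the chosen font containing them.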

Related

How to include special characters in A-frame web VR (čšž...)

I want to create a website that is readable not only in English, but I have problems with special characters. I've tried ASCII HTML codes.
Any ideas?
If you have trouble with the text component, there are three ways I can think of:
1) The proper way: find or generate a font from a fontset containing those characters. The docs describe how to use custom fonts:
<a-entity text="text: Hello World; font: ../fonts/CustomFnt.fnt;
fontImage: ../fonts/CustomFnt.png"></a-entity>
But you need to have a font file + a .png grid with the font images.
The docs provide a link to a tool for generating fonts, as well as a tutorial.
2) Check out Don McCurdy's custom font generator.
3) The workaround: you could make a transparent image containing your text and put it on an <a-plane>, like I did here.

Unicode characters in document info dictionary keys

How do I create document info dictionary keys containing Unicode characters (typically Swedish characters, for instance ä, U+00E4, UTF-8 bytes C3 A4)? I would like to use the PdfStamper to enter my own metadata in the document info dictionary, but I can't get it to accept the Swedish characters.
Entering custom metadata using Acrobat works fine, and looking at the PDF in a text editor I can see that the characters get encoded as, for instance, #C3#A4 for the character mentioned above. So is there a way to achieve this programmatically using iText PdfStamper?
regards
Mattias
PS. There is no problem having unicode characters in the info dictionary values, but the keys are a different story.
Please take a look at the NameObject example, and give it a try. You'll see that iText automatically escapes special characters in names.
iText follows the ISO-32000-1 specification, which states (7.3.5, Name Objects):
Beginning with PDF 1.2 a name object is an atomic symbol uniquely
defined by a sequence of any characters (8-bit values) except null
(character code 0). Uniquely defined means that any two name objects
made up of the same sequence of characters denote the same object.
Atomic means that a name has no internal structure; although it is
defined by a sequence of characters, those characters are not
considered elements of the name.
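When writing a name in a PDF file, a SOLIDUS (2Fh) (/) shall be used to introduce a name. The SOLIDUS is
not part of the name but is a prefix indicating that what follows is a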
sequence of characters representing the name in the PDF file and shall
follow these rules:
a) A NUMBER SIGN (23h) (#) in a name shall be written by using its
2-digit hexadecimal code (23), preceded by the NUMBER SIGN.
b) Any character in a name that is a regular character (other than
NUMBER SIGN) shall be written as itself or by using its 2-digit
hexadecimal code, preceded by the NUMBER SIGN.
c) Any character that is not a regular character shall be written
using its 2-digit hexadecimal code, preceded by the NUMBER SIGN only.
NOTE 1: There is not a unique encoding of names into the PDF file
because regular characters may be coded in either of two ways.
White space used as part of a name shall always be coded using the
2-digit hexadecimal notation and no white space may intervene between
the SOLIDUS and the encoded name.
Regular characters that are outside the range EXCLAMATION MARK(21h)
(!) to TILDE (7Eh) (~) should be written using the hexadecimal
notation.
The token SOLIDUS (a slash followed by no regular characters)
introduces a unique valid name defined by the empty sequence of
characters.
NOTE 2 The examples shown in Table 4 and containing # are not valid
literal names in PDF 1.0 or 1.1.
I'm not copy/pasting table 4, but I don't see any example that uses characters that consist of two bytes. Can you share a PDF that contains a name with a two-byte character that behaves in the way you desire? The PDF specification explicitly says that characters in the context of names are 8-bit values. You seem to be talking about 16-bit values...
Additional note: in the current implementation of iText, we only look at 8 bits:
c = (char)(chars[k] & 0xff);
We deliberately throw away all the higher bits when characters with more than 8 bits are passed.
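The escaping rules quoted above, combined with the 8-bit masking, can be sketched roughly as follows. This is a Python illustration and not iText's actual code; `escape_pdf_name` and `DELIMITERS` are names invented for this example, and lowercase hex digits are an arbitrary choice.

```python
# Delimiter characters of PDF syntax; they are not "regular" characters,
# so they are always hex-escaped inside a name.
DELIMITERS = set(b"()<>[]{}/%")

def escape_pdf_name(name: str) -> str:
    """Write a name the way it would appear in the PDF file (sketch)."""
    out = ["/"]  # the SOLIDUS introduces the name but is not part of it
    for ch in name:
        code = ord(ch) & 0xFF  # keep only the low 8 bits, as described above
        if code == 0:
            raise ValueError("null is not allowed in a PDF name")
        if 0x21 <= code <= 0x7E and code != 0x23 and code not in DELIMITERS:
            out.append(chr(code))         # printable regular character: as itself
        else:
            out.append("#%02x" % code)    # '#', white space, delimiters, non-ASCII
    return "".join(out)

print(escape_pdf_name("Special Character: \u00e4"))
```

Because of the masking, any character above U+00FF is silently reduced to its low byte, which is why only 8-bit values survive in a key.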
Actually, I think I have answered your question. Initially, I thought you were asking to add this character: http://www.fileformat.info/info/unicode/char/c3a4/index.htm
As it turns out, you only need "\u00e4" (ä). I've made a small code sample that demonstrates how one would add a custom entry containing this character to the document info dictionary: ChangeInfoDictionary.
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
    PdfReader reader = new PdfReader(src);
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    // Start from the existing info dictionary entries
    Map<String, String> info = reader.getInfo();
    // Add a custom entry whose key and value both contain the special character
    info.put("Special Character: \u00e4", "\u00e4");
    stamper.setMoreInfo(info);
    stamper.close();
    reader.close();
}
Granted, when you open the PDF in a PDF viewer, you don't necessarily see "Special Character: ä" as the key value, but that's a problem of the PDF viewer. When you open the PDF in a text editor, you clearly see:
/Special#20Character:#20#e4(ä)
Which means that iText has correctly escaped the special character.
However: as you pointed out in your comment, the character doesn't show up in Adobe Reader. Based on a PDF I created using Acrobat, I found a workaround by using the following code:
StringBuffer buf = new StringBuffer();
buf.append((char) 0xc3);
buf.append((char) 0xa4);
info.put(buf.toString(), "\u00e4");
Now the character is shown correctly. In other words: it's a matter of encoding...
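To make the encoding point concrete: the workaround feeds the two UTF-8 bytes of ä into the key as two separate 8-bit characters. A minimal Python sketch of that idea, assuming Adobe Reader interprets the key bytes as UTF-8 (as the Acrobat-generated file suggests); `utf8_bytes_as_chars` is a hypothetical helper name.

```python
def utf8_bytes_as_chars(text: str) -> str:
    """Return a string with one 8-bit character per UTF-8 byte of text."""
    # Decoding as Latin-1 maps every byte 0x00-0xFF to the same code point.
    return text.encode("utf-8").decode("latin-1")

key = utf8_bytes_as_chars("\u00e4")
print([hex(ord(c)) for c in key])

# After name escaping, each of these characters becomes one #XX pair.
escaped = "".join("#%02X" % ord(c) for c in key)
print(escaped)
```

The two resulting characters are the same values the StringBuffer code appends by hand, and the same idea extends to longer keys.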
Just wanted to share a little experiment in C# illustrating one rather effortless way of getting the special characters into the document info dictionary keys.
string inputString = "My key with åäö";
byte[] inputBytes = Encoding.UTF8.GetBytes(inputString);
string convertedString = Encoding.UTF7.GetString(inputBytes);
info.Add(convertedString, "My value with åäö");
(info is the Dictionary used for adding metadata.) Then just use the PdfStamper to get the info into the PDF. The metadata is stored correctly in the PDF and can be interpreted by Adobe Reader.

Formatting Field values using itextsharp

How can I format a string such as "i am fine here" using iTextSharp?
fields.SetField("tgPara2", message2);
Here message2 is "i am fine here", and I want only the word "fine" to be bold.
Any help would be
iText has partial support for "rich text values" for text fields. You can get and set the rich values, but iText won't actually draw those values properly. You need to turn off SetGenerateAppearances, and open the PDF in Acrobat/Reader to see the rich text.
This means flattening isn't going to work (unless you open the PDF in Acrobat, then save it again... clunky).
You might want to check out the PDF Specification (Section 12.7.3.4, Rich Text Strings) for further information on what is and isn't legal. <b> is legal, as is the font-weight CSS style.

How to draw Thai text to PDF file by using libharu library

I am using the free PDF library libharu to generate a PDF file,
but I have an encoding problem: I cannot draw Thai text in the PDF file.
All the text shows up as "???..".
Does somebody know how to fix it?
Thanks
I have succeeded in rendering CJK text (not Thai, but Chinese and Japanese) using libharu. First of all, I used Unicode mode; please refer to the documentation of the HPDF_UseUTFEncodings() function.
For C, here is the sequence of libharu API calls that should resolve the problem:
HPDF_UseUTFEncodings(docHandle);
HPDF_SetCurrentEncoder(docHandle, "UTF-8");
Here docHandle is a valid HPDF_Doc object.
The next part is loading the font with the UTF-8 encoding:
const char * libFontName = HPDF_LoadTTFontFromFile(docHandle, fontFileName.c_str(), font_embed::EmbedFonts);
HPDF_Font font = HPDF_GetFont(docHandle, libFontName, "UTF-8");
After these calls you can render Unicode text containing Thai characters. Also note the embedding flag (the third parameter of HPDF_LoadTTFontFromFile): without embedding, your PDF file may be unreadable because it only references external fonts. If the size of the output PDF is not a concern, simply embed the fonts.
I've tested a couple of Thai .ttf fonts found via Google and they rendered OK. Also (it may be important, but I'm not sure) I'm using the libharu fork https://github.com/kdeforche/libharu, which has since been merged into the master branch.
When you write text to the PDF, use the correct font and encoding. In the libharu documentation you have all the possibilities: https://github.com/libharu/libharu/wiki/Fonts
In your case, you must use the ISO8859-11 (Thai, TIS 620-2569) character set.
An example (in Spanish):
HPDF_Font fontEn = HPDF_GetFont(pdf, "Helvetica-Bold", "ISO8859-2");
HPDF_Page_TextOut(page1, 50.00, 750.00, [@"Código para correcta codificación en libharu" cStringUsingEncoding:NSISOLatin1StringEncoding]);

RichTextBox use to retrieve Text property in C++

I am using a hidden RichTextBox to retrieve the Text property from a RichEditCtrl.
rtb->Text; returns the text whether it is in English or in a national language, which is just great!
But to work with my database and the RichEditCtrl, I need this text as \u12232? \u32232? instead of national characters and symbols. Any idea how to get from “пассажирским поездом Невский” to “\u12415?\u12395?\u23554?\u20219?\u30456?\u35527?\u21729?” (where each national character is represented as “\u23232?”)?
If you have, that would be great.
I am using Visual Studio 2008 and C++, with a combination of MFC and managed code.
Cheers and have a wonderful weekend
If you need a System::String as an output as well, then something like this would do it:
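String^ s = rtb->Text;
StringBuilder^ sb = gcnew StringBuilder(s->Length);
for (int i = 0; i < s->Length; ++i) {
    // The backslash is doubled so the literal contains the two characters \u
    sb->AppendFormat("\\u{0:D5}?", (int)s[i]);
}
// Read the result from the builder, not from the original string
String^ result = sb->ToString();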
By the way, are you sure the format is as described? \u is traditionally an escape sequence for a hexadecimal Unicode code point, exactly 4 hex digits long, e.g. \u0F3A. It's also not normally followed by ?. If you actually want that, the format specifier {0:X4} should do the trick.
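For comparison, the decimal form followed by ? matches RTF's \uN control word, where N is a signed 16-bit decimal value and the next character is a fallback for readers without Unicode support. Assuming RTF is the target format, the conversion can be sketched in Python like this; `rtf_unicode_escape` is a hypothetical helper, not part of any RichEdit API.

```python
import struct

def rtf_unicode_escape(text: str, fallback: str = "?") -> str:
    """Escape non-ASCII text as RTF \\uN control words (sketch)."""
    out = []
    # Walk UTF-16 code units, because each \uN carries one 16-bit unit.
    for (unit,) in struct.iter_unpack("<H", text.encode("utf-16-le")):
        ch = chr(unit)
        if ch in "\\{}":
            out.append("\\" + ch)   # RTF syntax characters need a backslash
        elif unit < 0x80:
            out.append(ch)          # plain ASCII passes through unchanged
        else:
            # N is signed 16-bit, so units above 32767 are written as negatives.
            signed = unit - 0x10000 if unit > 0x7FFF else unit
            out.append("\\u%d%s" % (signed, fallback))
    return "".join(out)

print(rtf_unicode_escape("поезд"))
```

Unlike the D5 format above, this version does not zero-pad the number.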
You don't need to use escaping to put formatted Unicode in a RichText control. You can use UTF-8. See my answer here: Unicode RTF text in RichEdit.
I'm not sure what your restrictions are on your database, but maybe you can use UTF-8 there too.