How to fix orphaned punctuation in iText
I saw in "How to fix iText's text wrapping for chinese characters" that another user had a problem similar to the one we're facing. A response by Bruno Lowagie (https://stackoverflow.com/users/1622493/bruno-lowagie) indicated that DefaultSplitCharacter has taken Chinese characters into account since iText 5. We're using iText 5.5.6, but still see the problem.
As near as I can tell, DefaultSplitCharacter is working correctly, but the problem appears to be that the ColumnText class allows lines to begin with these punctuation marks.
Here's a screenshot of the PdfChunks in the BidiLine class being used to render the text.
However, in the rendered result the 3rd and 5th lines both begin with punctuation characters, as shown in this image of the PDF output.
I can simply add some new lines in the proper places to make it look correct, but this would mean if the text is ever re-translated internally my fix may no longer work. Does anyone know how to ensure that iText won't begin a line with these punctuation characters?
For breaking lines in Asian languages you need to write your own implementation of SplitCharacter. A good reference for line breaking is Unicode® Standard Annex #14, "Unicode Line Breaking Algorithm". Another one is https://msdn.microsoft.com/en-us/library/cc194864.aspx.
Having suffered through implementing this for Japanese, I'm posting the example code I wrote for Japanese text mixed with English text. This code could be modified for Chinese fairly easily using the references above.
Here is a snippet showing JapaneseSplitCharacter in use:
Chunk chunk = new Chunk(<asian text>,<asian font>);
chunk.setSplitCharacter(JapaneseSplitCharacter.SplitCharacter);
Paragraph paragraph = new Paragraph(chunk);
Here is the code for JapaneseSplitCharacter:
import com.itextpdf.text.SplitCharacter;
import com.itextpdf.text.pdf.DefaultSplitCharacter;
import com.itextpdf.text.pdf.PdfChunk;
/**
* <p/>
* For basic latin characters spaces, periods, commas, etc. are split characters. For Japanese characters lines can break
* anywhere, unless prohibited. This class uses logic for Japanese, non-starting and non-ending characters based on the
* kinsoku rule and uses the DefaultSplitCharacter class for basic latin characters while writing free flowing text to a PDF.
* <p/>
*/
public class JapaneseSplitCharacter implements SplitCharacter {
// line of text cannot start or end with this character
static final char u2060 = '\u2060'; // - WORD JOINER (zero-width, non-breaking)
// a line of text cannot start with any following characters in NOT_BEGIN_CHARACTERS[]
static final char u30fb = '\u30fb'; // ・ - KATAKANA MIDDLE DOT
static final char u2022 = '\u2022'; // • - BLACK SMALL CIRCLE (BULLET)
static final char uff65 = '\uff65'; // ・ - HALFWIDTH KATAKANA MIDDLE DOT
static final char u300d = '\u300d'; // 」 - RIGHT CORNER BRACKET
static final char uff09 = '\uff09'; // ) - FULLWIDTH RIGHT PARENTHESIS
static final char u0021 = '\u0021'; // ! - EXCLAMATION MARK
static final char u0025 = '\u0025'; // % - PERCENT SIGN
static final char u0029 = '\u0029'; // ) - RIGHT PARENTHESIS
static final char u002c = '\u002c'; // , - COMMA
static final char u002e = '\u002e'; // . - FULL STOP
static final char u003f = '\u003f'; // ? - QUESTION MARK
static final char u005d = '\u005d'; // ] - RIGHT SQUARE BRACKET
static final char u007d = '\u007d'; // } - RIGHT CURLY BRACKET
static final char uff61 = '\uff61'; // 。 - HALFWIDTH IDEOGRAPHIC FULL STOP
static final char uff63 = '\uff63'; // 」 - HALFWIDTH RIGHT CORNER BRACKET
static final char uff64 = '\uff64'; // 、 - HALFWIDTH IDEOGRAPHIC COMMA
static final char uff67 = '\uff67'; // ァ - HALFWIDTH KATAKANA LETTER SMALL A
static final char uff68 = '\uff68'; // ィ - HALFWIDTH KATAKANA LETTER SMALL I
static final char uff69 = '\uff69'; // ゥ - HALFWIDTH KATAKANA LETTER SMALL U
static final char uff6a = '\uff6a'; // ェ - HALFWIDTH KATAKANA LETTER SMALL E
static final char uff6b = '\uff6b'; // ォ - HALFWIDTH KATAKANA LETTER SMALL O
static final char uff6c = '\uff6c'; // ャ - HALFWIDTH KATAKANA LETTER SMALL YA
static final char uff6d = '\uff6d'; // ュ - HALFWIDTH KATAKANA LETTER SMALL YU
static final char uff6e = '\uff6e'; // ョ - HALFWIDTH KATAKANA LETTER SMALL YO
static final char uff6f = '\uff6f'; // ッ - HALFWIDTH KATAKANA LETTER SMALL TU
static final char uff70 = '\uff70'; // ー - HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK
static final char uff9e = '\uff9e'; // ゙ - HALFWIDTH KATAKANA VOICED SOUND MARK
static final char uff9f = '\uff9f'; // ゚ - HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK
static final char u3001 = '\u3001'; // 、 - IDEOGRAPHIC COMMA
static final char u3002 = '\u3002'; // 。 - IDEOGRAPHIC FULL STOP
static final char uff0c = '\uff0c'; // , - FULLWIDTH COMMA
static final char uff0e = '\uff0e'; // . - FULLWIDTH FULL STOP
static final char uff1a = '\uff1a'; // : - FULLWIDTH COLON
static final char uff1b = '\uff1b'; // ; - FULLWIDTH SEMICOLON
static final char uff1f = '\uff1f'; // ? - FULLWIDTH QUESTION MARK
static final char uff01 = '\uff01'; // ! - FULLWIDTH EXCLAMATION MARK
static final char u309b = '\u309b'; // ゛ - KATAKANA-HIRAGANA VOICED SOUND MARK
static final char u309c = '\u309c'; // ゜ - KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
static final char u30fd = '\u30fd'; // ヽ - KATAKANA ITERATION MARK
static final char u30fe = '\u30fe'; // ヾ - KATAKANA VOICED ITERATION MARK
static final char u309d = '\u309d'; // ゝ - HIRAGANA ITERATION MARK
static final char u309e = '\u309e'; // ゞ - HIRAGANA VOICED ITERATION MARK
static final char u3005 = '\u3005'; // 々 - IDEOGRAPHIC ITERATION MARK
static final char u30fc = '\u30fc'; // ー - KATAKANA-HIRAGANA PROLONGED SOUND MARK
static final char u2019 = '\u2019'; // ’ - RIGHT SINGLE QUOTATION MARK
static final char u201d = '\u201d'; // ” - RIGHT DOUBLE QUOTATION MARK
static final char u3015 = '\u3015'; // 〕 - RIGHT TORTOISE SHELL BRACKET
static final char uff3d = '\uff3d'; // ] - FULLWIDTH RIGHT SQUARE BRACKET
static final char uff5d = '\uff5d'; // } - FULLWIDTH RIGHT CURLY BRACKET
static final char u3009 = '\u3009'; // 〉 - RIGHT ANGLE BRACKET
static final char u300b = '\u300b'; // 》 - RIGHT DOUBLE ANGLE BRACKET
static final char u300f = '\u300f'; // 』 - RIGHT WHITE CORNER BRACKET
static final char u3011 = '\u3011'; // 】 - RIGHT BLACK LENTICULAR BRACKET
static final char u00b0 = '\u00b0'; // ° - DEGREE SIGN
static final char u2032 = '\u2032'; // ′ - PRIME
static final char u2033 = '\u2033'; // ″ - DOUBLE PRIME
static final char u2103 = '\u2103'; // ℃ - DEGREE CELSIUS
static final char u00a2 = '\u00a2'; // ¢ - CENT SIGN
static final char uff05 = '\uff05'; // % - FULLWIDTH PERCENT SIGN
static final char u2030 = '\u2030'; // ‰ - PER MILLE SIGN
static final char u3041 = '\u3041'; // ぁ - HIRAGANA LETTER SMALL A
static final char u3043 = '\u3043'; // ぃ - HIRAGANA LETTER SMALL I
static final char u3045 = '\u3045'; // ぅ - HIRAGANA LETTER SMALL U
static final char u3047 = '\u3047'; // ぇ - HIRAGANA LETTER SMALL E
static final char u3049 = '\u3049'; // ぉ - HIRAGANA LETTER SMALL O
static final char u3063 = '\u3063'; // っ - HIRAGANA LETTER SMALL TU
static final char u3083 = '\u3083'; // ゃ - HIRAGANA LETTER SMALL YA
static final char u3085 = '\u3085'; // ゅ - HIRAGANA LETTER SMALL YU
static final char u3087 = '\u3087'; // ょ - HIRAGANA LETTER SMALL YO
static final char u308e = '\u308e'; // ゎ - HIRAGANA LETTER SMALL WA
static final char u30a1 = '\u30a1'; // ァ - KATAKANA LETTER SMALL A
static final char u30a3 = '\u30a3'; // ィ - KATAKANA LETTER SMALL I
static final char u30a5 = '\u30a5'; // ゥ - KATAKANA LETTER SMALL U
static final char u30a7 = '\u30a7'; // ェ - KATAKANA LETTER SMALL E
static final char u30a9 = '\u30a9'; // ォ - KATAKANA LETTER SMALL O
static final char u30c3 = '\u30c3'; // ッ - KATAKANA LETTER SMALL TU
static final char u30e3 = '\u30e3'; // ャ - KATAKANA LETTER SMALL YA
static final char u30e5 = '\u30e5'; // ュ - KATAKANA LETTER SMALL YU
static final char u30e7 = '\u30e7'; // ョ - KATAKANA LETTER SMALL YO
static final char u30ee = '\u30ee'; // ヮ - KATAKANA LETTER SMALL WA
static final char u30f5 = '\u30f5'; // ヵ - KATAKANA LETTER SMALL KA
static final char u30f6 = '\u30f6'; // ヶ - KATAKANA LETTER SMALL KE
static final char[] NOT_BEGIN_CHARACTERS = new char[]{u30fb, u2022, uff65, u300d, uff09, u0021, u0025, u0029, u002c,
u002e, u003f, u005d, u007d, uff61, uff63, uff64, uff67, uff68, uff69, uff6a, uff6b, uff6c, uff6d, uff6e,
uff6f, uff70, uff9e, uff9f, u3001, u3002, uff0c, uff0e, uff1a, uff1b, uff1f, uff01, u309b, u309c, u30fd,
u30fe, u309d, u309e, u3005, u30fc, u2019, u201d, u3015, uff3d, uff5d, u3009, u300b, u300f, u3011, u00b0,
u2032, u2033, u2103, u00a2, uff05, u2030, u3041, u3043, u3045, u3047, u3049, u3063, u3083, u3085, u3087,
u308e, u30a1, u30a3, u30a5, u30a7, u30a9, u30c3, u30e3, u30e5, u30e7, u30ee, u30f5, u30f6, u2060};
// a line of text cannot end with any following characters in NOT_ENDING_CHARACTERS[]
static final char u0024 = '\u0024'; // $ - DOLLAR SIGN
static final char u0028 = '\u0028'; // ( - LEFT PARENTHESIS
static final char u005b = '\u005b'; // [ - LEFT SQUARE BRACKET
static final char u007b = '\u007b'; // { - LEFT CURLY BRACKET
static final char u00a3 = '\u00a3'; // £ - POUND SIGN
static final char u00a5 = '\u00a5'; // ¥ - YEN SIGN
static final char u201c = '\u201c'; // “ - LEFT DOUBLE QUOTATION MARK
static final char u2018 = '\u2018'; // ‘ - LEFT SINGLE QUOTATION MARK
static final char u300a = '\u300a'; // 《 - LEFT DOUBLE ANGLE BRACKET
static final char u3008 = '\u3008'; // 〈 - LEFT ANGLE BRACKET
static final char u300c = '\u300c'; // 「 - LEFT CORNER BRACKET
static final char u300e = '\u300e'; // 『 - LEFT WHITE CORNER BRACKET
static final char u3010 = '\u3010'; // 【 - LEFT BLACK LENTICULAR BRACKET
static final char u3014 = '\u3014'; // 〔 - LEFT TORTOISE SHELL BRACKET
static final char uff62 = '\uff62'; // 「 - HALFWIDTH LEFT CORNER BRACKET
static final char uff08 = '\uff08'; // ( - FULLWIDTH LEFT PARENTHESIS
static final char uff3b = '\uff3b'; // [ - FULLWIDTH LEFT SQUARE BRACKET
static final char uff5b = '\uff5b'; // { - FULLWIDTH LEFT CURLY BRACKET
static final char uffe5 = '\uffe5'; // ¥ - FULLWIDTH YEN SIGN
static final char uff04 = '\uff04'; // $ - FULLWIDTH DOLLAR SIGN
static final char[] NOT_ENDING_CHARACTERS = new char[]{u0024, u0028, u005b, u007b, u00a3, u00a5, u201c, u2018, u3008,
u300a, u300c, u300e, u3010, u3014, uff62, uff08, uff3b, uff5b, uffe5, uff04, u2060};
/**
* A shared instance of JapaneseSplitCharacter.
*/
public static final JapaneseSplitCharacter SplitCharacter = new JapaneseSplitCharacter();
/**
* An instance of DefaultSplitCharacter used for Basic Latin characters.
*/
private static final SplitCharacter defaultSplitCharacter = new DefaultSplitCharacter();
public JapaneseSplitCharacter() { }
/**
* Custom SplitCharacter method to handle Japanese characters.
* Returns <CODE>true</CODE> if the character can split a line. The splitting implementation
* is free to look ahead or look behind characters to make a decision.
*
* @param start the lower limit of <CODE>cc</CODE> inclusive
* @param current the pointer to the character in <CODE>cc</CODE>
* @param end the upper limit of <CODE>cc</CODE> exclusive
* @param cc an array of characters at least <CODE>end</CODE> sized
* @param ck an array of <CODE>PdfChunk</CODE>. The main use is to be able to call
* {@link PdfChunk#getUnicodeEquivalent(int)}. It may be <CODE>null</CODE>
* or shorter than <CODE>end</CODE>. If <CODE>null</CODE> no conversion takes place.
* If shorter than <CODE>end</CODE> the last element is used
* @return <CODE>true</CODE> if the character(s) can split a line
*/
public boolean isSplitCharacter(int start, int current, int end, char[] cc, PdfChunk[] ck) {
// Note: without this try/catch, iText fails silently when there is an issue in isSplitCharacter(),
// and you have no idea there was a problem.
try {
char charCurrent = getCharacter(current, cc, ck);
int next = current + 1;
// 'end' is the exclusive upper limit of the valid characters; cc itself may be longer
if (next < end) {
char charNext = getCharacter(next, cc, ck);
for (char not_begin_character : NOT_BEGIN_CHARACTERS) {
if (charNext == not_begin_character) {
return false;
}
}
}
for (char not_ending_character : NOT_ENDING_CHARACTERS) {
if (charCurrent == not_ending_character) {
return false;
}
}
boolean isBasicLatin = Character.UnicodeBlock.of(charCurrent) == Character.UnicodeBlock.BASIC_LATIN;
if (isBasicLatin)
return defaultSplitCharacter.isSplitCharacter(start, current, end, cc, ck);
return true;
} catch (Exception ex) {
ex.printStackTrace();
}
return true;
}
/**
* Returns a character in the array (Note: modified from the iText default version with the additional null
* check of '|| ck[Math.min(position, ck.length - 1)] == null').
*
* @param position position in the array
* @param cc the character array that has to be checked
* @param ck chunk array
* @return the character
*/
protected char getCharacter(int position, char[] cc, PdfChunk[] ck) {
if (ck == null || ck[Math.min(position, ck.length - 1)] == null) {
return cc[position];
}
return (char) ck[Math.min(position, ck.length - 1)].getUnicodeEquivalent(cc[position]);
}
}
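The two prohibition tables above are scanned linearly for every candidate break position. As a rough, iText-independent sketch of the same "cannot start a line / cannot end a line" check, the lookup can also be done with sets, which makes the rule easy to unit-test on its own. The class name KinsokuRules, the method canBreakAfter, and the heavily abbreviated character lists are assumptions made for illustration only; they are not part of iText or of the class above, and Latin text would still need the DefaultSplitCharacter delegation shown there.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of iText: a set-based version of the kinsoku check.
final class KinsokuRules {
    // Abbreviated subset of the "must not begin a line" table (closing marks, full stops, commas).
    private static final Set<Character> NOT_BEGIN =
            toSet("\u3001\u3002\uff0c\uff0e\uff01\uff1f\u300d\u300f\uff09,.!?)]}");
    // Abbreviated subset of the "must not end a line" table (opening brackets).
    private static final Set<Character> NOT_END = toSet("\u300c\u300e\uff08([{");

    private static Set<Character> toSet(String chars) {
        Set<Character> set = new HashSet<>();
        for (char c : chars.toCharArray()) {
            set.add(c);
        }
        return set;
    }

    /** Returns true if a line may be broken between text[current] and text[current + 1]. */
    static boolean canBreakAfter(char[] text, int current, int end) {
        int next = current + 1;
        // The following character must be allowed to start a line.
        if (next < end && NOT_BEGIN.contains(text[next])) {
            return false;
        }
        // The current character must be allowed to end a line.
        return !NOT_END.contains(text[current]);
    }

    public static void main(String[] args) {
        // Left corner bracket, hiragana A, right corner bracket, ideographic full stop, hiragana I.
        char[] text = "\u300c\u3042\u300d\u3002\u3044".toCharArray();
        for (int i = 0; i < text.length; i++) {
            System.out.println(i + " -> " + canBreakAfter(text, i, text.length));
        }
    }
}
```

A set lookup keeps the per-character cost constant as the tables grow, which matters little for short labels but may help for long documents.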
Hope this helps.
I'm using iTextSharp.
I wrote an ISplitCharacter following k.f.'s sample.
using System;
using System.Globalization;
using iTextSharp.text;
using iTextSharp.text.pdf;
public class CJKSplitCharacter : ISplitCharacter
{
public static ISplitCharacter Default = new CJKSplitCharacter();
private static ISplitCharacter defaultSplit = new DefaultSplitCharacter();
public bool IsSplitCharacter(int start, int current, int end, char[] cc, PdfChunk[] ck)
{
char charCurrent = GetChar(current, cc, ck);
int next = current + 1;
if (next < cc.Length)
{
char charNext = GetChar(next, cc, ck);
// if next char is close char, do not break here
if (IsCloseChar(charNext))
{
return false;
}
// otherwise, if current char is close char, mark as breakable
else if (IsCloseChar(charCurrent))
{
return true;
}
}
// if current char is open char, do not break here
if (IsOpenChar(charCurrent))
{
return false;
}
// default:
// split every CJK character
if (Char.GetUnicodeCategory(charCurrent) == UnicodeCategory.OtherLetter) // CJK Letters
{
return true;
}
else
{
return defaultSplit.IsSplitCharacter(start, current, end, cc, ck);
}
}
private char GetChar(int position, char[] cc, PdfChunk[] ck)
{
char c;
if (ck == null || ck[Math.Min(position, ck.Length - 1)] == null)
{
c = cc[position];
}
else
{
c = (char)ck[Math.Min(position, ck.Length - 1)].GetUnicodeEquivalent(cc[position]);
}
return c;
}
private bool IsCloseChar(char c)
{
UnicodeCategory cat = Char.GetUnicodeCategory(c);
return (cat == UnicodeCategory.ClosePunctuation //right bracket/brace, eg: )]
|| cat == UnicodeCategory.FinalQuotePunctuation //right quote, eg: ”
|| cat == UnicodeCategory.OtherPunctuation //other punctuation, eg: ,。
);
}
private bool IsOpenChar(char c)
{
UnicodeCategory cat = Char.GetUnicodeCategory(c);
return (cat == UnicodeCategory.OpenPunctuation //left bracket/brace, eg: ([
|| cat == UnicodeCategory.InitialQuotePunctuation //left quote, eg: “
);
}
}
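For readers on the Java side, the category-based classification above can be approximated with java.lang.Character.getType, whose constants correspond to the same Unicode general categories. This is only a sketch of the classification helpers, not a full SplitCharacter; the class name CategorySplitRules and its method names are made up for illustration.

```java
// Hypothetical Java counterpart of IsCloseChar / IsOpenChar, using Unicode general categories.
final class CategorySplitRules {
    /** Closing brackets, closing quotes and other punctuation: should not start a line. */
    static boolean isCloseChar(char c) {
        int type = Character.getType(c);
        return type == Character.END_PUNCTUATION          // Pe, e.g. ) ]
                || type == Character.FINAL_QUOTE_PUNCTUATION // Pf, e.g. right double quote
                || type == Character.OTHER_PUNCTUATION;      // Po, e.g. comma, ideographic full stop
    }

    /** Opening brackets and opening quotes: should not end a line. */
    static boolean isOpenChar(char c) {
        int type = Character.getType(c);
        return type == Character.START_PUNCTUATION            // Ps, e.g. ( [
                || type == Character.INITIAL_QUOTE_PUNCTUATION; // Pi, e.g. left double quote
    }

    /** Ideographs and kana fall under "other letter" (Lo). */
    static boolean isCjkLetter(char c) {
        return Character.getType(c) == Character.OTHER_LETTER;
    }
}
```

Compared with the explicit tables in the Japanese example, the category approach is shorter but coarser: small kana such as っ are classified as letters (Lo), so they are not kept off the start of a line, and "other punctuation" also covers ASCII characters such as @ and #.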
Related
ASCII conversion
I wanted to convert ASCII values to their corresponding characters, so I wrote this simple code:
public class Test {
    public static void main(String[] args) {
        int i=0;
        char ch='c';
        for(i=0;i<127;i++) {
            ch=(char)i;
            System.out.print(ch+"\t");
        }
        System.out.println("finish");
    }
}
But the output shows nothing, and control does not even seem to leave the loop, although the process finishes. Please explain this behavior and show the correct code.
As other people have pointed out, your loop includes the control characters; if you alter the loop as below, you get the full set without these control characters:
public static void main(String[] args) {
    for (int i = 33; i < 127; i++) {
        char ch = (char) i;
        System.out.print(i + ":" + ch + "\t");
    }
    System.out.println("finish");
}
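Instead of hard-coding the range, the control characters can also be filtered with Character.isISOControl. The following is a small sketch under that assumption; the class name AsciiTable and the method printableAscii are illustrative only. Unlike the loop above, it keeps the space character (32), because a space is not a control character.

```java
// Hypothetical helper: builds the "code:char" table while skipping control characters.
final class AsciiTable {
    static String printableAscii() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 128; i++) {
            char ch = (char) i;
            if (Character.isISOControl(ch)) {
                continue; // skips 0..31 and 127 (DEL)
            }
            sb.append(i).append(':').append(ch).append('\t');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(printableAscii());
        System.out.println("finish");
    }
}
```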
How to programmatically add header and footer to an existing form-based PDF using iText?
I need to programmatically add header and footer to an existing form-based PDF using iText. The existing PDF comes from user and it contains no space for header and footer. So the solution is to create a new PDF by concatenating the contents of the existing PDF with the header and footer. However, this approach only works for PDF containing no form. For interactive PDF that contains AcroForm or XFA Form, it fails as follows: (1) AcroForm gets flattened in the new PDF. (2) XFA Form doesn't import at all - the new PDF shows "Please wait...If this message is not eventually replaced by proper contents of the document, your PDF viewer may not be able to display this type of document...". Here's my code: import java.awt.Color; import java.io.ByteArrayInputStream; import java.io.ByteArrayOutputStream; import java.io.FileOutputStream; import java.io.IOException; import java.io.InputStream; import java.util.ArrayList; import java.util.List; import com.lowagie.text.Document; import com.lowagie.text.DocumentException; import com.lowagie.text.Element; import com.lowagie.text.Rectangle; import com.lowagie.text.pdf.PdfContentByte; import com.lowagie.text.pdf.PdfReader; import com.lowagie.text.pdf.PdfWriter; import com.lowagie.text.Font; import com.lowagie.text.pdf.PdfGState; import com.lowagie.text.pdf.PdfStamper; import com.lowagie.text.FontFactory; import com.lowagie.text.pdf.BaseFont; import com.lowagie.text.Phrase; import com.lowagie.text.pdf.ColumnText; import com.lowagie.text.pdf.PdfImportedPage; public class PdfFormCopyTest { private static final String ACRO_FORM_PDF = "AcroForm.pdf"; private static final String XFA_FORM_PDF = "XfaForm.pdf"; private static final String NO_FORM_PDF = "NoForm.pdf"; private static final String ACRO_FORM_PDF_NEW = "AcroForm-new.pdf"; private static final String XFA_FORM_PDF_NEW = "XfaForm-new.pdf"; private static final String NO_FORM_PDF_NEW = "NoForm-new.pdf"; private static final float MARGIN_LEFT = 36.0f; private static final float 
MARGIN_RIGHT = 36.0f; private static final float MARGIN_BOTTOM = 56.0f; private static final float MARGIN_TOP = 36.0f; private static final float FONT_SIZE = 10.0f; private static final float MIN_LINE_HEIGHT = FONT_SIZE * 1.5f; /** * #param args */ public static void main(String[] args) { try { createPdfFromAcroFormBasedPdf(); createPdfFromXfaFormBasedPdf(); createPdfFromFormlessPdf(); } catch (Exception error) { System.out.println(error.getMessage()); } } private static void createPdfFromAcroFormBasedPdf() throws IOException, DocumentException { System.out.println("Creating new PDF from an existing PDF containing AcroForm....."); PdfReader reader = new PdfReader(ACRO_FORM_PDF); createNewPdfWithHeaderFooter(reader, ACRO_FORM_PDF_NEW); System.out.println("Success"); } private static void createPdfFromXfaFormBasedPdf() throws IOException, DocumentException { System.out.println("Creating new PDF from an existing PDF containing XfaForm......"); PdfReader reader = new PdfReader(XFA_FORM_PDF); createNewPdfWithHeaderFooter(reader, XFA_FORM_PDF_NEW); System.out.println("Success"); } private static void createPdfFromFormlessPdf() throws IOException, DocumentException { System.out.println("Creating new PDF from an existing PDF containing no form......"); PdfReader reader = new PdfReader(NO_FORM_PDF); createNewPdfWithHeaderFooter(reader, NO_FORM_PDF_NEW); System.out.println("Success"); } /** * Creates a new PDF which contains header and footer from the specified input PdfReader object * and saves the result as the specified output file. * #param reader A PdfReader for the existing PDF. * #param outputFileName Name of the PDF file which contains header and footer. 
* #throws IOException * #throws DocumentException */ private static void createNewPdfWithHeaderFooter(PdfReader reader, String outputFileName) throws IOException, DocumentException { String footer = getFooter(); String header = getHeader(); List<Float> footerHeights = computeHeights(footer, reader, Font.NORMAL); List<Float> headerHeights = computeHeights(header, reader, Font.BOLD); InputStream resizedPdfStream = createPdfWithHeaderFooterSpace(reader, footerHeights, headerHeights); PdfStamper stamper = null; try { FileOutputStream fos = new FileOutputStream(outputFileName); PdfReader newReader = new PdfReader(resizedPdfStream); stamper = new PdfStamper(newReader, fos); int numberOfPages = stamper.getReader().getNumberOfPages(); for (int pageNumber = 1; pageNumber <= numberOfPages; pageNumber++) { Rectangle rect = stamper.getReader().getPageSize(pageNumber); PdfContentByte pageContent = stamper.getOverContent(pageNumber); pageContent.saveState(); pageContent.setGState(new PdfGState()); renderHeaderFooter(rect, pageContent, header, footer); pageContent.restoreState(); } } finally { if (stamper != null) { stamper.close(); } } } /** * Computes the height of the specified content for each page * in the specified PdfReader with the specified font weight. * #param content The string content for which the height of each page is computed. * #param reader A PdfReader containing the existing PDF. * #param fontWeight The font weight. * #return A list of float representing the height of each page. 
* #throws IOException * #throws DocumentException */ private static List<Float> computeHeights(String content, PdfReader reader, int fontWeight) throws IOException, DocumentException { List<Float> contentHeights = new ArrayList<Float>(); int numberOfPages = reader.getNumberOfPages(); for (int pageNumber = 1; pageNumber <= numberOfPages; pageNumber++) { Rectangle pageSize = reader.getPageSize(pageNumber); float height = computeWrappedTextHeight(content, pageSize.getWidth(), fontWeight); contentHeights.add(pageNumber - 1, height); } return contentHeights; } /** * Creates a new PDF with place holder for header and footer from the specified parameters. * #param reader The PdfReader storing the contents of the PDF to be created. * #param footerHeights The footer height for each page. * #param headerHeights The header height for each page. * #return An InputStream representing the new PDF. * #throws IOException * #throws DocumentException */ private static InputStream createPdfWithHeaderFooterSpace(PdfReader reader, List<Float> footerHeights, List<Float> headerHeights) throws IOException, DocumentException { ByteArrayOutputStream baos = null; Document newDocument = null; try { baos = new ByteArrayOutputStream(); newDocument = new Document(); PdfWriter newPdfWriter = PdfWriter.getInstance(newDocument, baos); PdfContentByte newPdfCanvas = null; int numberOfPages = reader.getNumberOfPages(); for (int pageNumber = 1; pageNumber <= numberOfPages; pageNumber++) { Rectangle oldPageSize = reader.getPageSize(pageNumber); float oldPageWidth = oldPageSize.getWidth(); float oldPageHeight = oldPageSize.getHeight(); float footerHeight = footerHeights.get(pageNumber - 1); float headerHeight = headerHeights.get(pageNumber - 1); float newPageHeight = calculateNewPageHeight(oldPageHeight, headerHeight, footerHeight); float newPageWidth = calculateNewPageWidth(oldPageWidth); Rectangle newPageSize = new Rectangle(0, 0, newPageWidth, newPageHeight); newDocument.setPageSize(newPageSize); if 
(!newDocument.isOpen()) { newDocument.open(); newPdfCanvas = newPdfWriter.getDirectContent(); } float xFactor = 1.0f; float yFactor = 1.0f; float xOffset = MARGIN_LEFT; float yOffset = MARGIN_BOTTOM + footerHeight; PdfImportedPage importedPage = newPdfWriter.getImportedPage(reader, pageNumber); newPdfCanvas.addTemplate(importedPage, xFactor, 0, 0, yFactor, xOffset, yOffset); newDocument.newPage(); } } finally { if (newDocument != null && newDocument.isOpen()) { newDocument.close(); } } return new ByteArrayInputStream(baos.toByteArray()); } /** * Computes the height of the specified string content which must * wrap at the specified maximum line width with the specified font weight. * #param content The string content for which the height is computed. * #param maxLineWidth The maximum line width at which the content must wrap. * #param fontWeight The font weight. * #return The height of the specified content which wraps at * the specified maximum line width with the specified font weight. */ private static float computeWrappedTextHeight(String content, float maxLineWidth, int fontWeight) { float totalHeight = 0.0f; Font font = FontFactory.getFont(BaseFont.HELVETICA, FONT_SIZE); font.setStyle(fontWeight); BaseFont baseFont = font.getCalculatedBaseFont(true); String lineText = ""; int currentWordStart = -1; float lineHeight; for (int charIndex = 0; charIndex < content.length(); charIndex++) { String currentChar = content.substring(charIndex, charIndex + 1); lineText = lineText + currentChar; boolean isCurrentCharWordSeparator = isWordSeperator(currentChar); float lineWidth = computeLineWidth(lineText, baseFont); if (charIndex == 0 || (!isCurrentCharWordSeparator && currentWordStart < 0)) { currentWordStart = charIndex; } if (lineWidth > maxLineWidth || currentChar.equals("\n")) { // Start a new line. if (isCurrentCharWordSeparator) { // The current character is a word separator - break the line at the current character. 
lineHeight = computeLineHeight(lineText, baseFont); // Reset line text. if (currentChar.equals("\n")) { lineText = ""; } else { lineText = currentChar; } } else { // The current character is in the middle of a word - break the line at the previous word separator. int lineEnd = lineText.length() - (charIndex - currentWordStart) - 1; if (lineEnd > 0) { String currentWordExcludedLineText = lineText.substring(0, lineEnd); lineHeight = computeLineHeight(currentWordExcludedLineText, baseFont); charIndex = currentWordStart; // New line starts at the beginning of the current word. lineText = ""; } else { lineHeight = computeLineHeight(lineText, baseFont); lineText = currentChar; } } totalHeight = totalHeight + lineHeight; } // If it is at a new word break, reset the current word starting index so that // the next iteration can set it at the beginning of the next word. if (charIndex > 0 && isCurrentCharWordSeparator && currentWordStart >= 0) { currentWordStart = -1; } } lineHeight = computeLineHeight(lineText, baseFont); totalHeight = totalHeight + lineHeight; return totalHeight; } /** * Determines if the specified string is a word separator. * #param c The string to test. * #return true if the specified string is a word separator; false othewise. */ private static boolean isWordSeperator(String c) { return (c.equals("\n") || c.equals("\t") || c.equals(" ")); } /** * Computes the line width of the specified line text with the specified base font. * #param lineText The line text. * #param baseFont A BaseFont object representing the base font of the line. * #return A float representing the width of the line. */ private static float computeLineWidth(String lineText, BaseFont baseFont) { return baseFont.getWidthPoint(lineText, FONT_SIZE); } /** * Computes the line height with the specified parameters. * #param lineText The line text. * #param baseFont A BaseFont object representing the base font of the line. * #return A float value representing the height of the line. 
*/ private static float computeLineHeight(String lineText, BaseFont baseFont) { float lineHeight = baseFont.getAscentPoint(lineText, FONT_SIZE) - baseFont.getDescentPoint(lineText, FONT_SIZE); if (lineHeight < MIN_LINE_HEIGHT) { lineHeight = MIN_LINE_HEIGHT; } return lineHeight; } /** * Renders the header and footer to the specified Rectangle with the specified page content, header and footer. * #param rect A Rectangle to render the header and footer. * #param pageContent A PdfContentByte representing the content of the page. * #param header The page header. * #param footer The page footer. * #throws DocumentException * #throws IOException */ private static void renderHeaderFooter(Rectangle rect, PdfContentByte pageContent, String header, String footer) throws DocumentException, IOException { float margin = 36.0f; int sides = 2; float footerHeight = (float)Math.ceil(computeWrappedTextHeight(footer, rect.getWidth() - margin * sides, Font.NORMAL)); float headerHeight = (float)Math.ceil(computeWrappedTextHeight(footer, rect.getWidth() - margin * sides, Font.BOLD)); if (headerHeight < MIN_LINE_HEIGHT) { headerHeight = MIN_LINE_HEIGHT; } // Render header. Font headerFont = getDefaultFont(); headerFont.setStyle(Font.BOLD); Phrase headerPhrase = new Phrase(header, headerFont); ColumnText headerRenderer = new ColumnText(pageContent); headerRenderer.setSimpleColumn(headerPhrase, margin, rect.getHeight() - headerHeight - margin + 4, rect.getWidth() - margin, rect.getHeight() - margin + 4, MIN_LINE_HEIGHT, Element.ALIGN_RIGHT); headerRenderer.go(); // Render footer. Phrase footerPhrase = new Phrase(footer, getDefaultFont()); ColumnText footerRender = new ColumnText(pageContent); footerRender.setSimpleColumn(footerPhrase, margin, margin, rect.getWidth() - margin, footerHeight + margin, MIN_LINE_HEIGHT, Element.ALIGN_CENTER); footerRender.go(); } /** * Calculates the height of the new page with the specified parameters. * #param oldPageHeight The height of the old page. 
* #param headerHeight The height of header. * #param footerHeight The height of footer. * #return The height of the new page. */ private static float calculateNewPageHeight(float oldPageHeight, float headerHeight, float footerHeight) { return oldPageHeight + MARGIN_TOP + headerHeight + footerHeight + MARGIN_BOTTOM; } /** * Calculates the width of the new page with the specified width of old page. * #param oldPageWidth The width of the old page. * #return The width of the new page. */ private static float calculateNewPageWidth(float oldPageWidth) { return oldPageWidth + MARGIN_LEFT + MARGIN_RIGHT; } private static String getHeader() { return "This is dynamically added header."; } private static String getFooter() { StringBuilder footerBuilder = new StringBuilder(); footerBuilder.append("This is the dynamically added footer."); footerBuilder.append("\n\n"); footerBuilder.append("Footer paragraph 1 content. Footer paragraph 1 content. Footer paragraph 1 content. Footer paragraph 1 content. "); footerBuilder.append("Footer paragraph 1 content. Footer paragraph 1 content. Footer paragraph 1 content. Footer paragraph 1 content. "); footerBuilder.append("Footer paragraph 1 content. Footer paragraph 1 content. Footer paragraph 1 content. Footer paragraph 1 content. "); footerBuilder.append("Footer paragraph 1 content. Footer paragraph 1 content. Footer paragraph 1 content. Footer paragraph 1 content. "); footerBuilder.append("Footer paragraph 1 content. Footer paragraph 1 content. Footer paragraph 1 content. Footer paragraph 1 content. "); footerBuilder.append("Footer paragraph 1 content. Footer paragraph 1 content. Footer paragraph 1 content. Footer paragraph 1 content. "); footerBuilder.append("\n\n"); footerBuilder.append("Footer paragraph 2 content. Footer paragraph 2 content. Footer paragraph 2 content. Footer paragraph 2 content. "); footerBuilder.append("Footer paragraph 2 content. Footer paragraph 2 content. Footer paragraph 2 content. Footer paragraph 2 content. 
"); footerBuilder.append("Footer paragraph 2 content. Footer paragraph 2 content. Footer paragraph 2 content. Footer paragraph 2 content. "); footerBuilder.append("Footer paragraph 2 content. Footer paragraph 2 content. Footer paragraph 2 content. Footer paragraph 2 content. "); footerBuilder.append("\n\n"); footerBuilder.append("Footer paragraph 3 content. Footer paragraph 3 content. Footer paragraph 3 content. Footer paragraph 3 content."); return footerBuilder.toString(); } private static Font getDefaultFont() { return FontFactory.getFont(BaseFont.HELVETICA, FONT_SIZE, Color.BLACK); } }
> I need to programmatically add header and footer to an existing form-based PDF using iText. The existing PDF comes from user and it contains no space for header and footer. So the solution is to create a new PDF by concatenating the contents of the existing PDF with the header and footer.

No, this is not a good solution. You had better use a PdfStamper, change the sizes of the existing pages, and add headers and footers in the new page area, in particular as you already use a PdfStamper for the final step.
@Mark Storer in this old answer shows how to manipulate the bottom of the MediaBox. Likewise you can also change its top. And as Mark remarks in his answer, you may also have to change the CropBox.

> However, this approach only works for PDF containing no form. For interactive PDF that contains AcroForm or XFA Form, it fails as follows: (1) AcroForm gets flattened in the new PDF.

With your code the AcroForm form elements should not get flattened (i.e. their appearances should not get added to the static PDF content); instead, they should get lost. Sometimes, though, border lines or other indications of form field boundaries actually are already part of the static content. This might be the case for you.
The reason is that your code uses PdfWriter.getImportedPage, a method that only takes the page content stream but no interactive features like AcroForm form field widget annotations.

> (2) XFA Form doesn't import at all - the new PDF shows "Please wait...If this message is not eventually replaced by proper contents of the document, your PDF viewer may not be able to display this type of document...".

XFA forms are a document type of their own which merely use PDF files as a transport medium. Your PdfWriter.getImportedPage does not even see the XFA data in the document and only copies a page that your XFA PDF document shows on PDF viewers without XFA support.
In case of XFA forms the PDF page objects usually have no part in what eventually is displayed. Instead the PDF transports an XFA XML. Thus, all your changes to any existing PDF pages remain unseen. You have to extract that XFA XML, manipulate it, and store it again. iText only has limited support for XFA, and the ancient version you appear to use has none at all.
How to copy a line with fstream including white spaces?
So if the file I was reading from contained:
2:Hello World
I would like num to equal the number at the start of each line and text to equal the remaining string of the line. My question is: how can I make my ifstream take the entire line, including the spaces between words, instead of taking only "Hello"?
#include <fstream>
using namespace std;

int main(int argc, char *argv[]) {
    ifstream fin;
    fin.open(file);
    string s = "";
    fin >> s;
    loadPacket(s);
}

void loadPacket(string s) {
    static const size_t npos = -1;
    num = pString.substr(0,1);
    text = pString.substr(4, npos);
}
How can you eliminate white-space in multiple columns using iTextSharp?
I'd like to add a Paragraph of text to pages in 2 columns. I understand that MultiColumnText has been eliminated. I know I can create column 1, write to it, and if there is more text create column 2 and write to it. If there is still more text, go to the next page and repeat. However, I always end up with either a long chunk of text orphaned in the left column, or a full left column and a partially used right column. How can I format my content in 2 columns while reducing white space, such as compressing the columns so I end up with 2 full columns of equal length? Thanks!
You should add your column twice, once in simulation mode and once for real. I have adapted the ColumnTextParagraphs example to show what is meant by simulation mode. Take a look at the ColumnTextParagraphs2 example:
We add the column in simulation mode to obtain the total height needed for the column. This is done in the following method:
public float getNecessaryHeight(ColumnText ct) throws DocumentException {
    ct.setSimpleColumn(new Rectangle(0, 0, COLUMN_WIDTH, -500000));
    ct.go(true);
    return -ct.getYLine();
}
We use this height when we add the left and right column:
Rectangle left;
float top = COLUMNS[0].getTop();
float middle = (COLUMNS[0].getLeft() + COLUMNS[1].getRight()) / 2;
float columnheight;
int status = ColumnText.START_COLUMN;
while (ColumnText.hasMoreText(status)) {
    if (checkHeight(height)) {
        columnheight = COLUMNS[0].getHeight();
        left = COLUMNS[0];
    }
    else {
        columnheight = (height / 2) + ERROR_MARGIN;
        left = new Rectangle(
            COLUMNS[0].getLeft(),
            COLUMNS[0].getTop() - columnheight,
            COLUMNS[0].getRight(),
            COLUMNS[0].getTop()
        );
    }
    // left half
    ct.setSimpleColumn(left);
    ct.go();
    height -= COLUMNS[0].getTop() - ct.getYLine();
    // separator
    canvas.moveTo(middle, top - columnheight);
    canvas.lineTo(middle, top);
    canvas.stroke();
    // right half
    ct.setSimpleColumn(COLUMNS[1]);
    status = ct.go();
    height -= COLUMNS[1].getTop() - ct.getYLine();
    // new page
    document.newPage();
}
This is how we check the height:
public boolean checkHeight(float height) {
    height -= COLUMNS[0].getHeight() + COLUMNS[1].getHeight() + ERROR_MARGIN;
    return height > 0;
}
As you can see, we add the full columns as long as the height of both columns is smaller than the remaining height. When columns are added, we adjust the remaining height. As soon as the remaining height is lower than the height of two columns, we adapt the height of the first column. Note that we work with an ERROR_MARGIN because dividing by two often leads to a situation where the second column has one line more than the first column.
It is better when it's the other way around. This is the full example at your request:
/**
 * Example written by Bruno Lowagie in answer to:
 * http://stackoverflow.com/questions/29378407/how-can-you-eliminate-white-space-in-multiple-columns-using-itextsharp
 */
package sandbox.objects;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Paragraph;
import com.itextpdf.text.Rectangle;
import com.itextpdf.text.pdf.ColumnText;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfWriter;

public class ColumnTextParagraphs2 {

    public static final String DEST = "results/objects/column_paragraphs2.pdf";
    public static final String TEXT = "This is some long paragraph that will be added over and over again to prove a point.";
    public static final float COLUMN_WIDTH = 254;
    public static final float ERROR_MARGIN = 16;
    public static final Rectangle[] COLUMNS = {
        new Rectangle(36, 36, 36 + COLUMN_WIDTH, 806),
        new Rectangle(305, 36, 305 + COLUMN_WIDTH, 806)
    };

    public static void main(String[] args) throws IOException, DocumentException {
        File file = new File(DEST);
        file.getParentFile().mkdirs();
        new ColumnTextParagraphs2().createPdf(DEST);
    }

    public void createPdf(String dest) throws IOException, DocumentException {
        // step 1
        Document document = new Document();
        // step 2
        PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(dest));
        // step 3
        document.open();
        // step 4
        PdfContentByte canvas = writer.getDirectContent();
        ColumnText ct = new ColumnText(canvas);
        addContent(ct);
        float height = getNecessaryHeight(ct);
        addContent(ct);
        Rectangle left;
        float top = COLUMNS[0].getTop();
        float middle = (COLUMNS[0].getLeft() + COLUMNS[1].getRight()) / 2;
        float columnheight;
        int status = ColumnText.START_COLUMN;
        while (ColumnText.hasMoreText(status)) {
            if (checkHeight(height)) {
                columnheight = COLUMNS[0].getHeight();
                left = COLUMNS[0];
            }
            else {
                columnheight = (height / 2) + ERROR_MARGIN;
                left = new Rectangle(
                    COLUMNS[0].getLeft(),
                    COLUMNS[0].getTop() - columnheight,
                    COLUMNS[0].getRight(),
                    COLUMNS[0].getTop()
                );
            }
            // left half
            ct.setSimpleColumn(left);
            ct.go();
            height -= COLUMNS[0].getTop() - ct.getYLine();
            // separator
            canvas.moveTo(middle, top - columnheight);
            canvas.lineTo(middle, top);
            canvas.stroke();
            // right half
            ct.setSimpleColumn(COLUMNS[1]);
            status = ct.go();
            height -= COLUMNS[1].getTop() - ct.getYLine();
            // new page
            document.newPage();
        }
        // step 5
        document.close();
    }

    public void addContent(ColumnText ct) {
        for (int i = 0; i < 35; i++) {
            ct.addElement(new Paragraph(String.format("Paragraph %s: %s", i, TEXT)));
        }
    }

    public float getNecessaryHeight(ColumnText ct) throws DocumentException {
        ct.setSimpleColumn(new Rectangle(0, 0, COLUMN_WIDTH, -500000));
        ct.go(true);
        return -ct.getYLine();
    }

    public boolean checkHeight(float height) {
        height -= COLUMNS[0].getHeight() + COLUMNS[1].getHeight() + ERROR_MARGIN;
        return height > 0;
    }
}
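To see how the balancing rule in the example above plays out page by page, here is a rough plain-Java model of just the arithmetic, with no iText calls. It is only a sketch under loud assumptions: the class name ColumnBalanceSketch and the method planLeftColumnHeights are hypothetical, the text is assumed to fill each column exactly (a real ColumnText stops at line boundaries), and the Math.min clamp is an addition that is not in the original example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of iText: models only the height bookkeeping.
class ColumnBalanceSketch {

    // Same numbers as in the example: columns run from y = 36 to y = 806.
    static final float COLUMN_HEIGHT = 806f - 36f;
    static final float ERROR_MARGIN = 16f;

    // Mirrors checkHeight(): true while more than two full columns of text remain.
    static boolean checkHeight(float height) {
        return height - (2 * COLUMN_HEIGHT + ERROR_MARGIN) > 0;
    }

    // Returns the left-column height chosen for each page, in page order.
    static List<Float> planLeftColumnHeights(float totalHeight) {
        List<Float> plan = new ArrayList<>();
        float remaining = totalHeight;
        while (remaining > 0) {
            float left;
            if (checkHeight(remaining)) {
                // Plenty of text left: use the full left column.
                left = COLUMN_HEIGHT;
            } else {
                // Last page: half of what remains, plus the margin so the
                // left column tends to end up with the extra line.
                // Assumption: clamp to the column height (not in the original).
                left = Math.min(COLUMN_HEIGHT, remaining / 2 + ERROR_MARGIN);
            }
            plan.add(left);
            // Assumption: the left column consumes exactly its height,
            // then the right column takes up to a full column.
            remaining -= Math.min(left, remaining);
            remaining -= Math.min(COLUMN_HEIGHT, remaining);
        }
        return plan;
    }

    public static void main(String[] args) {
        System.out.println(planLeftColumnHeights(2000f));
    }
}
```

With 2000 units of text, the model uses two full columns on the first page and splits the remaining 460 units on the second page, giving the left column 246 units (230 plus the margin).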
add "rotation" when we encrypt an uppercase letter by rotating 13
package edu.secretcode;

import java.util.Scanner;

/**
 * Creates the secret code class.
 */
public class SecretCode {

    /**
     * Perform the ROT13 operation
     *
     * @param plainText
     *            the text to encode
     * @return the rot13'd encoding of plainText
     */
    public static String rotate13(String plainText) {
        StringBuffer cryptText = new StringBuffer("");
        for (int i = 0; i < plainText.length() - 1; i++) {
            int currentChar = plainText.charAt(i);
            String cS = currentChar+"";
            currentChar = (char) ((char) (currentChar - (int) 'A' + 13) % 255 + (int)'A');
            if ((currentChar >= 'A') && (currentChar <= 'Z')) {
                currentChar = (((currentChar - 'A')+13) % 26) + 'A' - 1;
            } else {
                cryptText.append(currentChar);
            }
        }
        return cryptText.toString();
    }

    /**
     * Main method of the SecretCode class
     *
     * @param args
     */
    public static void main(String[] args) {
        Scanner input = new Scanner(System.in);
        while (1 > 0) {
            System.out.println("Enter plain text to encode, or QUIT to end");
            Scanner keyboard = new Scanner(System.in);
            String plainText = keyboard.nextLine();
            if (plainText.equals("QUIT")) {
                break;
            }
            String cryptText = SecretCode.rotate13(plainText);
            String encodedText = SecretCode.rotate13(plainText);
            System.out.println("Encoded Text: " + encodedText);
        }
    }
}
I need to make this rotation work by adding 13 to a character. If the resulting character is greater than 'Z', I am supposed to subtract 'Z', then add 'A', then subtract 1 (the number 1, not the letter '1'), and do this only for capital letters. I did this in the if statement, but when I typed in "HELLO WORLD!" I got 303923011009295302 when I was supposed to get "URYYB JBEYQ!", so the program is not encoding correctly. Any help would be appreciated. Thanks in advance.
You're appending an int rather than a char to cryptText. Use:
cryptText.append((char) currentChar);
Update:
I wouldn't bother with the character value manipulation stuff. You're making all sorts of character set assumptions as it is (try running on an IBM i, which uses EBCDIC rather than ASCII, and watch it all break). Use a lookup table instead:
private static final String in = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
private static final String out = "NOPQRSTUVWXYZABCDEFGHIJKLM";
...
final int idx = in.indexOf(ch);
cryptText.append((-1 == idx) ? ch : out.charAt(idx));
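For completeness, here is one way the lookup-table idea above might be put together into a whole method. It is a minimal sketch, not the answerer's exact code: the class name Rot13Sketch is made up for illustration, and the loop deliberately runs up to plainText.length(), since the loop in the question stops one character short and would drop the last character.

```java
// Hypothetical class name; only the two lookup strings come from the answer.
class Rot13Sketch {

    private static final String IN  = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static final String OUT = "NOPQRSTUVWXYZABCDEFGHIJKLM";

    // Rotates uppercase A-Z by 13 places; every other character is copied as-is.
    static String rotate13(String plainText) {
        StringBuilder cryptText = new StringBuilder(plainText.length());
        for (int i = 0; i < plainText.length(); i++) {
            char ch = plainText.charAt(i);
            int idx = IN.indexOf(ch);
            // idx is -1 for anything that is not an uppercase letter.
            cryptText.append(idx == -1 ? ch : OUT.charAt(idx));
        }
        return cryptText.toString();
    }

    public static void main(String[] args) {
        System.out.println(rotate13("HELLO WORLD!")); // prints URYYB JBEYQ!
    }
}
```

Because both branches of the conditional are char values, append receives a char and not an int, which avoids the numeric output described in the question. Applying rotate13 twice returns the original text.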