Bolding with Rich Text Values in iTextSharp - itext

Is it possible to bold a single word within a sentence with iTextSharp? I'm working with large paragraphs of text coming from xml, and I am trying to bold several individual words without having to break the string into individual phrases.
Eg:
document.Add(new Paragraph("this is <b>bold</b> text"));
should output...
this is bold text

As @kuujinbo pointed out, there is the XMLWorker object, which is where most of the new HTML parsing work is being done. But if you've just got simple tags like bold or italic, you can use the native iTextSharp.text.html.simpleparser.HTMLWorker class. You could wrap it in a helper method such as:
private Paragraph CreateSimpleHtmlParagraph(String text) {
//Our return object
Paragraph p = new Paragraph();
//ParseToList requires a TextReader (here a StringReader) instead of just text
using (StringReader sr = new StringReader(text)) {
//Parse and get a collection of elements
List<IElement> elements = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(sr, null);
foreach (IElement e in elements) {
//Add those elements to the paragraph
p.Add(e);
}
}
//Return the paragraph
return p;
}
Then instead of this:
document.Add(new Paragraph("this is <b>bold</b> text"));
You could use this:
document.Add(CreateSimpleHtmlParagraph("this is <b>bold</b> text"));
document.Add(CreateSimpleHtmlParagraph("this is <i>italic</i> text"));
document.Add(CreateSimpleHtmlParagraph("this is <b><i>bold and italic</i></b> text"));

I know that this is an old question, but I could not get the other examples here to work for me. Adding the text in Chunks with different fonts did.
//define a bold font to be used
Font boldFont = FontFactory.GetFont(FontFactory.HELVETICA_BOLD, 12);
//create a phrase and add Chunks to it
var phrase2 = new Phrase();
phrase2.Add(new Chunk("this is "));
phrase2.Add(new Chunk("bold", boldFont));
phrase2.Add(new Chunk(" text"));
document.Add(phrase2);
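If the text arrives with inline <b> tags, the Chunk approach can be wrapped in a small helper so the string does not have to be split by hand. This is only a sketch: BuildPhraseWithBold is a hypothetical name (not part of iTextSharp), it assumes iTextSharp 5.x, and it only handles simple, non-nested <b>...</b> pairs.

```csharp
using System.Text.RegularExpressions;
using iTextSharp.text;

public static class BoldChunkHelper
{
    // Hypothetical helper: turns "this is <b>bold</b> text" into a Phrase of Chunks.
    public static Phrase BuildPhraseWithBold(string text, float size)
    {
        Font normalFont = FontFactory.GetFont(FontFactory.HELVETICA, size);
        Font boldFont = FontFactory.GetFont(FontFactory.HELVETICA_BOLD, size);

        // Regex.Split keeps the captured group in the result, so the
        // odd-indexed parts are the text that was inside <b>...</b>.
        string[] parts = Regex.Split(text, "<b>(.*?)</b>");

        Phrase phrase = new Phrase();
        for (int i = 0; i < parts.Length; i++)
        {
            if (parts[i].Length == 0) continue; // skip empty pieces
            bool isBold = (i % 2 == 1);
            phrase.Add(new Chunk(parts[i], isBold ? boldFont : normalFont));
        }
        return phrase;
    }
}
```

It could then be called like this:
document.Add(BoldChunkHelper.BuildPhraseWithBold("this is <b>bold</b> text", 12));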

Not sure how complex your Xml is, but try XMLWorker. Here's a working example with an ASP.NET HTTP handler:
<%@ WebHandler Language="C#" Class="boldText" %>
using System;
using System.IO;
using System.Web;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.xml;
using iTextSharp.tool.xml;
public class boldText : IHttpHandler {
public void ProcessRequest (HttpContext context) {
HttpResponse Response = context.Response;
Response.ContentType = "application/pdf";
StringReader xmlSnippet = new StringReader(
"<p>This is <b>bold</b> text</p>"
);
using (Document document = new Document()) {
PdfWriter writer = PdfWriter.GetInstance(
document, Response.OutputStream
);
document.Open();
XMLWorkerHelper.GetInstance().ParseXHtml(
writer, document, xmlSnippet
);
}
}
public bool IsReusable { get { return false; } }
}
You may have to pre-process your XML before sending it to XMLWorker (notice the snippet is a bit different from yours). Support for parsing HTML/XML was released relatively recently, so your mileage may vary.

Here is another XMLWorker example that uses a different overload of ParseXHtml and returns a Phrase instead of writing it directly to the document.
private static Phrase CreateSimpleHtmlParagraph(String text)
{
var p = new Phrase();
var mh = new MyElementHandler();
using (TextReader sr = new StringReader("<html><body><p>" + text + "</p></body></html>"))
{
XMLWorkerHelper.GetInstance().ParseXHtml(mh, sr);
}
foreach (var element in mh.elements)
{
foreach (var chunk in element.Chunks)
{
p.Add(chunk);
}
}
return p;
}
private class MyElementHandler : IElementHandler
{
public List<IElement> elements = new List<IElement>();
public void Add(IWritable w)
{
if (w is iTextSharp.tool.xml.pipeline.WritableElement)
{
elements.AddRange(((iTextSharp.tool.xml.pipeline.WritableElement)w).Elements());
}
}
}

Related

Insert multiple lines of text into a Rich Text content control with OpenXML

I'm having difficulty getting a content control to follow multi-line formatting. It seems to interpret everything I'm giving it literally. I am new to OpenXML and I feel like I must be missing something simple.
I am converting my multi-line string using this function.
private static void parseTextForOpenXML(Run run, string text)
{
string[] newLineArray = { Environment.NewLine, "<br/>", "<br />", "\r\n" };
string[] textArray = text.Split(newLineArray, StringSplitOptions.None);
bool first = true;
foreach (string line in textArray)
{
if (!first)
{
run.Append(new Break());
}
first = false;
Text txt = new Text { Text = line };
run.Append(txt);
}
}
I insert it into the control with this
public static WordprocessingDocument InsertText(this WordprocessingDocument doc, string contentControlTag, string text)
{
SdtElement element = doc.MainDocumentPart.Document.Body.Descendants<SdtElement>().FirstOrDefault(sdt => sdt.SdtProperties.GetFirstChild<Tag>().Val == contentControlTag);
if (element == null)
throw new ArgumentException("ContentControlTag " + contentControlTag + " doesn't exist.");
element.Descendants<Text>().First().Text = text;
element.Descendants<Text>().Skip(1).ToList().ForEach(t => t.Remove());
return doc;
}
I call it with something like...
doc.InsertText("Primary", primaryRun.InnerText);
Although I've tried InnerXml and OuterXml as well. The results look something like
Example AttnExample CompanyExample AddressNew York, NY 12345 or
<w:r xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"><w:t>Example Attn</w:t><w:br /><w:t>Example Company</w:t><w:br /><w:t>Example Address</w:t><w:br /><w:t>New York, NY 12345</w:t></w:r>
The method works fine for simple text insertion. It's just when I need it to interpret the XML that it doesn't work for me.
I feel like I must be super close to getting what I need, but my fiddling is getting me nowhere. Any thoughts? Thank you.
I believe the way I was trying to do it was doomed to fail. Setting the Text property of an element is always going to be interpreted as text to be displayed, it seems. I ended up having to take a slightly different tack and created a new insert method.
public static WordprocessingDocument InsertText(this WordprocessingDocument doc, string contentControlTag, Paragraph paragraph)
{
SdtElement element = doc.MainDocumentPart.Document.Body.Descendants<SdtElement>().FirstOrDefault(sdt => sdt.SdtProperties.GetFirstChild<Tag>().Val == contentControlTag);
if (element == null)
throw new ArgumentException("ContentControlTag " + contentControlTag + " doesn't exist.");
OpenXmlElement cc = element.Descendants<Text>().First().Parent;
cc.RemoveAllChildren();
cc.Append(paragraph);
return doc;
}
It starts the same, and gets the Content Control by searching for its Tag. But then I get its parent, remove the Content Control elements that were there, and just replace them with a paragraph element.
It's not exactly what I had envisioned, but it seems to work for my needs.
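For reference, the Paragraph passed to that method can be built from a multi-line string along the same lines as the parseTextForOpenXML function in the question. A minimal sketch, assuming the Open XML SDK; MultilineText and BuildMultilineParagraph are hypothetical names:

```csharp
using System;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class MultilineText
{
    // Hypothetical helper: one Run holding a Text per line, separated by Break elements.
    public static Paragraph BuildMultilineParagraph(string text)
    {
        string[] separators = { "\r\n", "\n", "<br/>", "<br />" };
        string[] lines = text.Split(separators, StringSplitOptions.None);

        Run run = new Run();
        for (int i = 0; i < lines.Length; i++)
        {
            // A line break goes between lines, not before the first one
            if (i > 0)
            {
                run.Append(new Break());
            }
            // Preserve leading and trailing spaces within each line
            run.Append(new Text(lines[i]) { Space = SpaceProcessingModeValues.Preserve });
        }
        return new Paragraph(run);
    }
}
```

The call would then look something like:
doc.InsertText("Primary", MultilineText.BuildMultilineParagraph(addressText));
where addressText stands for whatever multi-line string you need to insert.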

Remove Content controls after adding text using open xml

With the help of some very kind community members here, I managed to programmatically create a function to replace text inside content controls in a Word document using Open XML. After the document is generated, the text loses its formatting once I replace it.
Any ideas on how I can keep the formatting in Word and remove the content control tags?
This is my code:
using (var wordDoc = WordprocessingDocument.Open(mem, true))
{
var mainPart = wordDoc.MainDocumentPart;
ReplaceTags(mainPart, "FirstName", _firstName);
ReplaceTags(mainPart, "LastName", _lastName);
ReplaceTags(mainPart, "WorkPhoe", _workPhone);
ReplaceTags(mainPart, "JobTitle", _jobTitle);
mainPart.Document.Save();
SaveFile(mem);
}
private static void ReplaceTags(MainDocumentPart mainPart, string tagName, string tagValue)
{
//grab all the tag fields
IEnumerable<SdtBlock> tagFields = mainPart.Document.Body.Descendants<SdtBlock>().Where
(r => r.SdtProperties.GetFirstChild<Tag>().Val == tagName);
foreach (var field in tagFields)
{
//remove all paragraphs from the content block
field.SdtContentBlock.RemoveAllChildren<Paragraph>();
//create a new paragraph containing a run and a text element
Paragraph newParagraph = new Paragraph();
Run newRun = new Run();
Text newText = new Text(tagValue);
newRun.Append(newText);
newParagraph.Append(newRun);
//add the new paragraph to the content block
field.SdtContentBlock.Append(newParagraph);
}
}
Keeping the style is a tricky problem as there could be more than one style applied to the text you are trying to replace. What should you do in that scenario?
Assuming a simple case of one style (but potentially over many Paragraphs, Runs and Texts) you could keep the first Text element you come across per SdtBlock and place your required value in that element then delete any further Text elements from the SdtBlock. The formatting from the first Text element will then be maintained. Obviously you can apply this theory to any of the Text blocks; you don't have to necessarily use the first. The following code should show what I mean:
private static void ReplaceTags(MainDocumentPart mainPart, string tagName, string tagValue)
{
IEnumerable<SdtBlock> tagFields = mainPart.Document.Body.Descendants<SdtBlock>().Where
(r => r.SdtProperties.GetFirstChild<Tag>().Val == tagName);
foreach (var field in tagFields)
{
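//Materialize the list first: Descendants<Text>() is lazy, so removing
//elements while indexing into it would skip every other Text element
List<Text> texts = field.SdtContentBlock.Descendants<Text>().ToList();
for (int i = 0; i < texts.Count; i++)
{
Text text = texts[i];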
if (i == 0)
{
text.Text = tagValue;
}
else
{
text.Remove();
}
}
}
}

c#-openxml word Replacement and page break

I am a new member, and I really like this site because it always helps me.
My problem is this: I want to replace text in a Word document using Open XML, add a page break, and then write the replaced text on the second page.
Here is my code:
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(@"d:\a.docx", true))
{
using (StreamReader reader = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
{
text = reader.ReadToEnd();
}
Regex regexText = new Regex("#db#");
text = regexText.Replace(text, textBox4.Text.Trim());
using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
{
sw.Write(text);
}
MainDocumentPart mainPart = wordDoc.MainDocumentPart;
Run r = new Run();
Paragraph para = new Paragraph(new Run(new Break() { Type = BreakValues.Page }));
using (StreamWriter sw1 = new StreamWriter(mainPart.GetStream(FileMode.Create)))
{
sw1.Write(text);
}
mainPart.Document.Body.InsertAfter(para, mainPart.Document.Body.LastChild);
mainPart.Document.Save();
}
}
I suggest you insert a page break in your a.docx in advance. Then use a MergeField to mark the location where the replacement text should go.
Here is the example

Merge 2 pdf byte streams using Itextsharp

I have a method that returns a PDF byte stream (from a fillable PDF). Is there a straightforward way to merge two streams into one stream and make one PDF out of it? I need to run my method twice but need the two PDFs in one PDF stream. Thanks.
You didn't say if you're flattening the filled forms with the PdfStamper, so I'll just say you must flatten them before trying to merge them. Here's a working .ashx HTTP handler:
<%@ WebHandler Language="C#" Class="mergeByteForms" %>
using System;
using System.IO;
using System.Web;
using iTextSharp.text;
using iTextSharp.text.pdf;
public class mergeByteForms : IHttpHandler {
HttpServerUtility Server;
public void ProcessRequest (HttpContext context) {
Server = context.Server;
HttpResponse Response = context.Response;
Response.ContentType = "application/pdf";
using (Document document = new Document()) {
using (PdfSmartCopy copy = new PdfSmartCopy(
document, Response.OutputStream) )
{
document.Open();
for (int i = 0; i < 2; ++i) {
PdfReader reader = new PdfReader(_getPdfBtyeStream(i.ToString()));
copy.AddPage(copy.GetImportedPage(reader, 1));
}
}
}
}
public bool IsReusable { get { return false; } }
// simulate your method to use __one__ byte stream for __one__ PDF
private byte[] _getPdfBtyeStream(string data) {
// replace with __your__ PDF template
string pdfTemplatePath = Server.MapPath(
"~/app_data/template.pdf"
);
PdfReader reader = new PdfReader(pdfTemplatePath);
using (MemoryStream ms = new MemoryStream()) {
using (PdfStamper stamper = new PdfStamper(reader, ms)) {
AcroFields form = stamper.AcroFields;
// replace this with your form field data
form.SetField("title", data);
// ...
// this is __VERY__ important; since you're using the same fillable
// PDF, if you don't set this property to true the second page will
// lose the filled fields.
stamper.FormFlattening = true;
}
return ms.ToArray();
}
}
}
Hopefully the inline comments make sense. The _getPdfBtyeStream() method above simulates your PDF byte streams. The reason you need to set FormFlattening to true is that when you fill PDF form fields, names are supposed to be unique. In your case the second page is the same fillable PDF form, so it has the same field names as the first page, and when you fill them they're ignored. Comment out the example line above:
stamper.FormFlattening = true;
to see what I mean.
In other words, a lot of the generic code to merge PDFs on the Internet and even here on Stack Overflow will not work (for fillable forms) because AcroFields are not being accounted for. In fact, if you take a look at the itextsharp tag's "SO FAQ & Popular" entry on merging PDFs, it's mentioned in the third comment on the accepted answer by @Ray Cheng.
Another way to merge fillable PDFs (without flattening the form) is to rename the form fields for the second/following page(s), but that's more work.
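As a rough idea of what the renaming step could look like, here is a sketch that appends a suffix to every field name in one PDF before it is merged. FormFieldRenamer and RenameAllFields are hypothetical names, the code assumes iTextSharp 5.x, and it covers only the renaming, not the merge itself.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text.pdf;

public static class FormFieldRenamer
{
    // Hypothetical helper: returns a copy of the PDF with every field renamed to name + suffix.
    public static byte[] RenameAllFields(byte[] pdf, string suffix)
    {
        PdfReader reader = new PdfReader(pdf);
        using (MemoryStream ms = new MemoryStream())
        {
            using (PdfStamper stamper = new PdfStamper(reader, ms))
            {
                AcroFields form = stamper.AcroFields;
                // Copy the keys first; renaming changes the Fields collection
                List<string> names = new List<string>(form.Fields.Keys);
                foreach (string name in names)
                {
                    // Only the last part of a field name changes here,
                    // e.g. "title" becomes "title_2"
                    form.RenameField(name, name + suffix);
                }
            }
            return ms.ToArray();
        }
    }
}
```

Each filled PDF after the first would get its own suffix (for example "_2", "_3") so that the field names stay unique in the merged document.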

Syncfusion DocIO -- how to insert image (local file) at bookmark using BookmarksNavigator

I have been using Syncfusion DocIO for generating MS Word documents from my .net applications (winforms). So far I have dealt with plain text and it is fairly straightforward to insert text in a word document template where bookmarks serve as reference points for text insertion.
I am navigating the bookmarks using BookmarksNavigator.MoveToBookmark() . Now I need to insert an image at a bookmark but I am at a loss at how to go about it.
Please help...
Thanks.
Specifically for adding it to a bookmark :
//Move to the specified bookmark
bk.MoveToBookmark(bookmark);
//Clear the existing bookmark content so the picture replaces it
bk.DeleteBookmarkContent(true);
// we assume the text is a full pathname for an image file
// get the image file
System.Drawing.Image image = System.Drawing.Image.FromFile(sText);
IWParagraph paragraph = new WParagraph(document);
paragraph.AppendPicture(image);
bk.InsertParagraph(paragraph);
private System.Drawing.Image LoadSignature(string sFileName)
{
string sImagePath = sFileName;
System.Drawing.Image image = System.Drawing.Image.FromFile(sImagePath);
return image;
}
private void MergeSignature(WordDocument doc, string sFile, string sBalise)
{
System.Drawing.Image iSignature = LoadSignature(sFile);
WordDocument ImgDoc = new WordDocument();
ImgDoc.AddSection();
ImgDoc.Sections[0].AddParagraph().AppendPicture(iSignature);
if (iSignature != null)
{
TextSelection ts = null ;
Regex pattern = new Regex(sBalise);
ts = doc.Find(pattern);
if (ts != null)
{
doc.ReplaceFirst = true;
doc.Replace(pattern, ImgDoc, false);
}
}
iSignature.Dispose();
}
See here: https://help.syncfusion.com/file-formats/docio/working-with-mailmerge
1) Create a docx file named "Template.docx". This file will be used as the template.
In your docx file, create a field of type MergeField.
2) Create a MergeField with the name Image:Krishna
3)
using Syncfusion.DocIO.DLS;
using System.Drawing;
public class Source
{
public Image Krishna { get; set; } = Image.FromFile(@"C:\1.png");
}
and generating code:
public static void Generate()
{
WordDocument doc = new WordDocument("Template.docx");
Source data = new Source();
var dataTable = new MailMergeDataTable("", new Source[] { data });
doc.MailMerge.ExecuteGroup(dataTable);
doc.Save("result.docx");
doc.Close();
}