Open XML: Word - Getting all Paragraphs marked as "Heading1" style - ms-word

Using Word I have created a Docx with the standard normal.dot as a test. Hello-world level complexity.
I wish to get all the paragraphs which are styled with the "Heading1" style in Word.
I can get all the paragraphs, but don't know how to filter down to Heading1.
using (var doc = WordprocessingDocument.Open(documentFileName, false))
{
paragraphs = doc.MainDocumentPart.Document.Body
.OfType<Paragraph>().ToList();
}

[Test]
public void FindHeadingParagraphs()
{
var paragraphs = new List<Paragraph>();
// Open the file read-only since we don't need to change it.
using (var wordprocessingDocument = WordprocessingDocument.Open(documentFileName, false))
{
paragraphs = wordprocessingDocument.MainDocumentPart.Document.Body
.OfType<Paragraph>()
.Where(p => p.ParagraphProperties != null &&
p.ParagraphProperties.ParagraphStyleId != null &&
p.ParagraphProperties.ParagraphStyleId.Val.Value.Contains("Heading1")).ToList();
}
}

Related

How to remove the extra page at the end of a word document which created during mail merge

I have written a piece of code to create a word document by mail merge using Syncfusion (Assembly Syncfusion.DocIO.Portable, Version=17.1200.0.50,), Angular 7+ and .NET Core. Please see the code below.
private MemoryStream MergePaymentPlanInstalmentsScheduleToPdf(List<PaymentPlanInstalmentReportModel>
PaymentPlanDetails, byte[] templateFileBytes)
{
if (templateFileBytes == null || templateFileBytes.Length == 0)
{
return null;
}
var templateStream = new MemoryStream(templateFileBytes);
var pdfStream = new MemoryStream();
WordDocument mergeDocument = null;
using (mergeDocument = new WordDocument(templateStream, FormatType.Docx))
{
if (mergeDocument != null)
{
var mergeList = new List<PaymentPlanInstalmentScheduleMailMergeModel>();
var obj = new PaymentPlanInstalmentScheduleMailMergeModel();
obj.Applicants = 0;
if (PaymentPlanDetails != null && PaymentPlanDetails.Any()) {
var applicantCount = PaymentPlanDetails.GroupBy(a => a.StudentID)
.Select(s => new
{
StudentID = s.Key,
Count = s.Select(a => a.StudentID).Distinct().Count()
});
obj.Applicants = applicantCount?.Count() > 0 ? applicantCount.Count() : 0;
}
mergeList.Add(obj);
var reportDataSource = new MailMergeDataTable("Report", mergeList);
var tableDataSource = new MailMergeDataTable("PaymentPlanDetails", PaymentPlanDetails);
List<DictionaryEntry> commands = new List<DictionaryEntry>();
commands.Add(new DictionaryEntry("Report", ""));
commands.Add(new DictionaryEntry("PaymentPlanDetails", ""));
MailMergeDataSet ds = new MailMergeDataSet();
ds.Add(reportDataSource);
ds.Add(tableDataSource);
mergeDocument.MailMerge.ExecuteNestedGroup(ds, commands);
mergeDocument.UpdateDocumentFields();
using (var converter = new DocIORenderer())
{
using (var pdfDocument = converter.ConvertToPDF(mergeDocument))
{
pdfDocument.Save(pdfStream);
pdfDocument.Close();
}
}
mergeDocument.Close();
}
}
return pdfStream;
}
Once the document is generated, I notice there is a blank page (with the footer) at the end. I searched for a solution on the internet over and over again, but I was not able to find a solution. According to experts, I have done the initial checks such as making sure that the initial word template file has no page breaks, etc.
I am wondering if there is something that I can do from my code to remove any extra page breaks or anything like that, which can cause this.
Any other suggested solution for this, even including MS Word document modifications also appreciated.
Please refer the below documentation link to remove empty page at the end of Word document using Syncfusion Word library (Essential DocIO).
https://www.syncfusion.com/kb/10724/how-to-remove-empty-page-at-end-of-word-document
Please reuse the code snippet before converting Word to PDF in your sample application.
Note: I work for Syncfusion.

Insert HTML in docx file

I have made an application that fills wordfiles with customxmlparts now I am trying to put text into a textfield, but it has HTML in it and I want it to show the styling of it. I tried converting it to rich text format but that just gets pasted in the word file. Here is an example of the code:
var taskId = Guid.NewGuid();
var tempFilePath = $"{Path.GetTempPath()}/{taskId}";
using (var templateStream = new FileStream($"{tempFilePath}.docx", FileMode.CreateNew))
{
templateStream.Write(template, 0, template.Length);
// 1. Fill template.
using (WordprocessingDocument doc = WordprocessingDocument.Open(templateStream, true))
{
MainDocumentPart mainDocument = doc.MainDocumentPart;
if (mainDocument.CustomXmlParts != null)
{
mainDocument.DeleteParts<CustomXmlPart>(mainDocument.CustomXmlParts);
}
CustomXmlPart cxp = mainDocument.AddCustomXmlPart(CustomXmlPartType.CustomXml);
foreach (var line in data.Lines)
{
if (line.MoreInfo != null && line.MoreInfo != " ") {
}
}
var xmlData = ObjectToXml(data);
using (var stream = GenerateStreamFromString(tempFilePath, xmlData))
{
cxp.FeedData(stream);
}
mainDocument.Document.Save();
}
}
You can't just write the HTML formatted text into a DOCX field, you would need to convert it into a WordprocessingML format.
However, there is another way that you could try and that is to insert an "AltChunk" element. That element represents a sort of like a placeholder which can reference a HTML file and then when the DOCX file is opened in MS Word, it will make that HTML to WordprocessingML conversion for you. For details see: How to Use altChunk for Document Assembly
Alternatively you could use some third party, like GemBox.Document, which can make that HTML to WordprocessingML conversion for you.
For example check this Set Content example:
// Set content using HTML tags
document.Sections[0].Blocks[4].Content.LoadText(
"Paragraph 5 <b>(part of this paragraph is bold)</b>", LoadOptions.HtmlDefault);

Custom properties are not updated for the word via openXML

I am trying to update custom properties of word document thru Open XML programming but it seems the updated properties are not getting saved properly for the word document. So when I opening document after successful execution of the update custom property code, I am getting the message box which is "This document contains field that may refer to other files; Do you want to update the fields in the Document?" If I am pressing 'NO' button then all the update properties would not be saved to the document. If we are going for yes option then it will update properties but I need to save the properties explicitly. Please suggest to save properties to the document without getting confirmation message or corrupting the document. :)
the code snippet is given as below,
public void SetCustomValue(
WordprocessingDocument document, string propname, string aValue)
{
CustomFilePropertiesPart oDocCustomProps = document.CustomFilePropertiesPart;
Properties props = oDocCustomProps.Properties;
if (props != null)
{
//logger.Debug("props is not null");
foreach (var prop in props.Elements<CustomDocumentProperty>())
{
if (prop != null && prop.Name == propname)
{
//logger.Debug("Setting Property: " + prop.Name + " to value: " + aValue);
prop.Remove();
var newProp = new CustomDocumentProperty();
newProp.FormatId = "{D5CDD505-2E9C-101B-9397-08002B2CF9AE}";
newProp.Name = prop.Name;
VTLPWSTR vTLPWSTR1 = new VTLPWSTR();
vTLPWSTR1.Text = aValue;
newProp.Append(vTLPWSTR1);
props.AppendChild(newProp);
props.Save();
}
}
int pid = 2;
foreach (CustomDocumentProperty item in props)
{
item.PropertyId = pid++;
}
props.Save();
}
}
I am using .Net framework 3.5 with Open XML SDK 2.0 and Office 2013.
Try this one
var CustomeProperties = xmlDOc.CustomFilePropertiesPart.Properties;
foreach (CustomDocumentProperty customeProperty in CustomeProperties)
{
if (customeProperty.Name == "DocumentName")
{
customeProperty.VTLPWSTR = new VTLPWSTR("My Custom Name");
}
else if (customeProperty.Name == "DocumentID")
{
customeProperty.VTLPWSTR = new VTLPWSTR("FNP.SMS.IQC");
}
else if (customeProperty.Name == "DocumentLastUpdate")
{
customeProperty.VTLPWSTR = new VTLPWSTR(DateTime.Now.ToShortDateString());
}
}
//Open Word Setting File
DocumentSettingsPart settingsPart = xmlDOc.MainDocumentPart.GetPartsOfType<DocumentSettingsPart>().First();
//Update Fields
UpdateFieldsOnOpen updateFields = new UpdateFieldsOnOpen();
updateFields.Val = new OnOffValue(true);
settingsPart.Settings.PrependChild<UpdateFieldsOnOpen>(updateFields);
settingsPart.Settings.Save();
you have to update your document fields on open.

Remove Content controls after adding text using open xml

By the help of some very kind community members here I managed to programatically create a function to replace text inside content controls in a Word document using open xml. After the document is generated it removes the formatting of the text after I replace the text.
Any ideas on how I can still keep the formatting in word and remove the content control tags ?
This is my code:
using (var wordDoc = WordprocessingDocument.Open(mem, true))
{
var mainPart = wordDoc.MainDocumentPart;
ReplaceTags(mainPart, "FirstName", _firstName);
ReplaceTags(mainPart, "LastName", _lastName);
ReplaceTags(mainPart, "WorkPhoe", _workPhone);
ReplaceTags(mainPart, "JobTitle", _jobTitle);
mainPart.Document.Save();
SaveFile(mem);
}
private static void ReplaceTags(MainDocumentPart mainPart, string tagName, string tagValue)
{
//grab all the tag fields
IEnumerable<SdtBlock> tagFields = mainPart.Document.Body.Descendants<SdtBlock>().Where
(r => r.SdtProperties.GetFirstChild<Tag>().Val == tagName);
foreach (var field in tagFields)
{
//remove all paragraphs from the content block
field.SdtContentBlock.RemoveAllChildren<Paragraph>();
//create a new paragraph containing a run and a text element
Paragraph newParagraph = new Paragraph();
Run newRun = new Run();
Text newText = new Text(tagValue);
newRun.Append(newText);
newParagraph.Append(newRun);
//add the new paragraph to the content block
field.SdtContentBlock.Append(newParagraph);
}
}
Keeping the style is a tricky problem as there could be more than one style applied to the text you are trying to replace. What should you do in that scenario?
Assuming a simple case of one style (but potentially over many Paragraphs, Runs and Texts) you could keep the first Text element you come across per SdtBlock and place your required value in that element then delete any further Text elements from the SdtBlock. The formatting from the first Text element will then be maintained. Obviously you can apply this theory to any of the Text blocks; you don't have to necessarily use the first. The following code should show what I mean:
private static void ReplaceTags(MainDocumentPart mainPart, string tagName, string tagValue)
{
IEnumerable<SdtBlock> tagFields = mainPart.Document.Body.Descendants<SdtBlock>().Where
(r => r.SdtProperties.GetFirstChild<Tag>().Val == tagName);
foreach (var field in tagFields)
{
IEnumerable<Text> texts = field.SdtContentBlock.Descendants<Text>();
for (int i = 0; i < texts.Count(); i++)
{
Text text = texts.ElementAt(i);
if (i == 0)
{
text.Text = tagValue;
}
else
{
text.Remove();
}
}
}
}

Syncfusion DocIO -- how to insert image (local file) at bookmark using BookmarksNavigator

I have been using Syncfusion DocIO for generating MS Word documents from my .net applications (winforms). So far I have dealt with plain text and it is fairly straightforward to insert text in a word document template where bookmarks serve as reference points for text insertion.
I am navigating the bookmarks using BookmarksNavigator.MoveToBookmark() . Now I need to insert an image at a bookmark but I am at a loss at how to go about it.
Please help...
Thanks.
Specifically for adding it to a bookmark :
//Move to the specified bookmark
bk.MoveToBookmark(bookmark);
//Insert the picture into the specified bookmark location
bk.DeleteBookmarkContent(true);
// we assume the text is a full pathname for an image file
// get the image file
System.Drawing.Image image = System.Drawing.Image.FromFile(sText);
IWParagraph paragraph = new WParagraph(document);
paragraph.AppendPicture(image);
bk.InsertParagraph(paragraph);
private System.Drawing.Image LoadSignature(string sFileName)
{
string sImagePath = sFileName;
System.Drawing.Image image = System.Drawing.Image.FromFile(sImagePath);
return image;
}
private void MergeSignature(WordDocument doc, string sFile, string sBalise)
{
System.Drawing.Image iSignature = LoadSignature(sFile);
WordDocument ImgDoc = new WordDocument();
ImgDoc.AddSection();
ImgDoc.Sections[0].AddParagraph().AppendPicture(iSignature);
if (iSignature != null)
{
TextSelection ts = null ;
Regex pattern = new Regex(sBalise);
ts = doc.Find(pattern);
if (ts != null)
{
doc.ReplaceFirst = true;
doc.Replace(pattern, ImgDoc, false);
}
}
iSignature.Dispose();
}
See here: https://help.syncfusion.com/file-formats/docio/working-with-mailmerge
1) You should create docx file with name "Template.docx". This file will use as template.
In your docx file create Field of type MergeField.
2) Create MergeFiled with name Image:Krishna
3)
using Syncfusion.DocIO.DLS;
using System.Drawing;
public class Source
{
public Image Krishna { get; set; } = Image.FromFile(#"C:\1.png");
}
and generating code:
public static void Generate()
{
WordDocument doc = new WordDocument("Template.docx");
Source data = new Source();
var dataTable = new MailMergeDataTable("", new Source[] { data });
doc.MailMerge.ExecuteGroup(dataTable);
doc.Save("result.docx");
doc.Close();
}