How to remove the extra page at the end of a word document which created during mail merge - ms-word

I have written a piece of code to create a word document by mail merge using Syncfusion (Assembly Syncfusion.DocIO.Portable, Version=17.1200.0.50,), Angular 7+ and .NET Core. Please see the code below.
private MemoryStream MergePaymentPlanInstalmentsScheduleToPdf(List<PaymentPlanInstalmentReportModel>
PaymentPlanDetails, byte[] templateFileBytes)
{
if (templateFileBytes == null || templateFileBytes.Length == 0)
{
return null;
}
var templateStream = new MemoryStream(templateFileBytes);
var pdfStream = new MemoryStream();
WordDocument mergeDocument = null;
using (mergeDocument = new WordDocument(templateStream, FormatType.Docx))
{
if (mergeDocument != null)
{
var mergeList = new List<PaymentPlanInstalmentScheduleMailMergeModel>();
var obj = new PaymentPlanInstalmentScheduleMailMergeModel();
obj.Applicants = 0;
if (PaymentPlanDetails != null && PaymentPlanDetails.Any()) {
var applicantCount = PaymentPlanDetails.GroupBy(a => a.StudentID)
.Select(s => new
{
StudentID = s.Key,
Count = s.Select(a => a.StudentID).Distinct().Count()
});
obj.Applicants = applicantCount?.Count() > 0 ? applicantCount.Count() : 0;
}
mergeList.Add(obj);
var reportDataSource = new MailMergeDataTable("Report", mergeList);
var tableDataSource = new MailMergeDataTable("PaymentPlanDetails", PaymentPlanDetails);
List<DictionaryEntry> commands = new List<DictionaryEntry>();
commands.Add(new DictionaryEntry("Report", ""));
commands.Add(new DictionaryEntry("PaymentPlanDetails", ""));
MailMergeDataSet ds = new MailMergeDataSet();
ds.Add(reportDataSource);
ds.Add(tableDataSource);
mergeDocument.MailMerge.ExecuteNestedGroup(ds, commands);
mergeDocument.UpdateDocumentFields();
using (var converter = new DocIORenderer())
{
using (var pdfDocument = converter.ConvertToPDF(mergeDocument))
{
pdfDocument.Save(pdfStream);
pdfDocument.Close();
}
}
mergeDocument.Close();
}
}
return pdfStream;
}
Once the document is generated, I notice there is a blank page (with the footer) at the end. I searched for a solution on the internet over and over again, but I was not able to find a solution. According to experts, I have done the initial checks such as making sure that the initial word template file has no page breaks, etc.
I am wondering if there is something that I can do from my code to remove any extra page breaks or anything like that, which can cause this.
Any other suggested solution for this, even including MS Word document modifications also appreciated.

Please refer the below documentation link to remove empty page at the end of Word document using Syncfusion Word library (Essential DocIO).
https://www.syncfusion.com/kb/10724/how-to-remove-empty-page-at-end-of-word-document
Please reuse the code snippet before converting Word to PDF in your sample application.
Note: I work for Syncfusion.

Related

Custom properties are not updated for the word via openXML

I am trying to update custom properties of word document thru Open XML programming but it seems the updated properties are not getting saved properly for the word document. So when I opening document after successful execution of the update custom property code, I am getting the message box which is "This document contains field that may refer to other files; Do you want to update the fields in the Document?" If I am pressing 'NO' button then all the update properties would not be saved to the document. If we are going for yes option then it will update properties but I need to save the properties explicitly. Please suggest to save properties to the document without getting confirmation message or corrupting the document. :)
the code snippet is given as below,
public void SetCustomValue(
WordprocessingDocument document, string propname, string aValue)
{
CustomFilePropertiesPart oDocCustomProps = document.CustomFilePropertiesPart;
Properties props = oDocCustomProps.Properties;
if (props != null)
{
//logger.Debug("props is not null");
foreach (var prop in props.Elements<CustomDocumentProperty>())
{
if (prop != null && prop.Name == propname)
{
//logger.Debug("Setting Property: " + prop.Name + " to value: " + aValue);
prop.Remove();
var newProp = new CustomDocumentProperty();
newProp.FormatId = "{D5CDD505-2E9C-101B-9397-08002B2CF9AE}";
newProp.Name = prop.Name;
VTLPWSTR vTLPWSTR1 = new VTLPWSTR();
vTLPWSTR1.Text = aValue;
newProp.Append(vTLPWSTR1);
props.AppendChild(newProp);
props.Save();
}
}
int pid = 2;
foreach (CustomDocumentProperty item in props)
{
item.PropertyId = pid++;
}
props.Save();
}
}
I am using .Net framework 3.5 with Open XML SDK 2.0 and Office 2013.
Try this one
var CustomeProperties = xmlDOc.CustomFilePropertiesPart.Properties;
foreach (CustomDocumentProperty customeProperty in CustomeProperties)
{
if (customeProperty.Name == "DocumentName")
{
customeProperty.VTLPWSTR = new VTLPWSTR("My Custom Name");
}
else if (customeProperty.Name == "DocumentID")
{
customeProperty.VTLPWSTR = new VTLPWSTR("FNP.SMS.IQC");
}
else if (customeProperty.Name == "DocumentLastUpdate")
{
customeProperty.VTLPWSTR = new VTLPWSTR(DateTime.Now.ToShortDateString());
}
}
//Open Word Setting File
DocumentSettingsPart settingsPart = xmlDOc.MainDocumentPart.GetPartsOfType<DocumentSettingsPart>().First();
//Update Fields
UpdateFieldsOnOpen updateFields = new UpdateFieldsOnOpen();
updateFields.Val = new OnOffValue(true);
settingsPart.Settings.PrependChild<UpdateFieldsOnOpen>(updateFields);
settingsPart.Settings.Save();
you have to update your document fields on open.

iText not returning text contents of a PDF after first page

I am trying to use the iText library with c# to capture the text portion of pdf files.
I created a pdf from excel 2013 (exported) and then copied the sample from the web of how to use itext (added the lib ref to the project).
It reads perfectly the first page but it gets garbled info after that. It is keeping part of the first page and merging the info with the next page. The commented lines is when I was trying to solve the problem, the string "thePage" is recreated inside the for loop.
Here is the code. I can email the pdf to whoever can help with this issue.
Thanks in advance
public static string ExtractTextFromPdf(string path)
{
ITextExtractionStrategy its = new iTextSharp.text.pdf.parser.LocationTextExtractionStrategy();
using (PdfReader reader = new PdfReader(path))
{
StringBuilder text = new StringBuilder();
//string[] theLines;
//theLines = new string[COLUMNS];
//string thePage;
for (int i = 1; i <= reader.NumberOfPages; i++)
{
string thePage = "";
thePage = PdfTextExtractor.GetTextFromPage(reader, i, its);
string [] theLines = thePage.Split('\n');
foreach (var theLine in theLines)
{
text.AppendLine(theLine);
}
// text.AppendLine(" ");
// Array.Clear(theLines, 0, theLines.Length);
// thePage = "";
}
return text.ToString();
}
}
A strategy object collects text data and does not know if a new page has started or not.
Thus, use a new strategy object for each page.

Openxml: Added ImagePart is not showing in Powerpoint / Missing RelationshipID

I'm trying to dynamically create a PowerPoint presentation. One slide has a bunch of placeholder images that need to be changed based on certain values.
My approach is to create a new ImagePart and link it to the according Blip. The image is downloaded and stored to the presentation just fine. The problem is, that there is no relationship created in slide.xml.rels file for the image, which leads to an warning about missing images and empty boxes on the slide.
Any ideas what I am doing wrong?
Thanks in advance for your help! Best wishes
SPSecurity.RunWithElevatedPrivileges(delegate()
{
using (SPSite siteCollection = new SPSite(SPContext.Current.Site.RootWeb.Url))
{
using (SPWeb oWeb = siteCollection.OpenWeb())
{
SPList pictureLibrary = oWeb.Lists[pictureLibraryName];
SPFile imgFile = pictureLibrary.RootFolder.Files[imgPath];
byte[] byteArray = imgFile.OpenBinary();
int pos = Convert.ToInt32(name.Replace("QQ", "").Replace("Image", ""));
foreach (DocumentFormat.OpenXml.Presentation.Picture pic in pictureList)
{
var oldimg = pic.BlipFill.Blip.Embed.ToString(); ImagePart ip = (ImagePart)slidePart.AddImagePart(ImagePartType.Png, oldimg+pos);
using (var writer = new BinaryWriter(ip.GetStream()))
{
writer.Write(byteArray);
}
string newId = slidePart.GetIdOfPart(ip);
setDebugMessage("new img id: " + newId);
pic.BlipFill.Blip.Embed = newId;
}
slidePart.Slide.Save();
}
}
});
So, for everyone who's experiencing a similar problem, I finally found the solution. Quite a stupid mistake. Instad of PresentationDocument document = PresentationDocument.Open(mstream, true); you have to use
using (PresentationDocument document = PresentationDocument.Open(mstream, true))
{
do your editing here
}
This answer brought me on the right way.

c#-openxml word Replacement and page break

i am a new member an i really like this site because it help me always
my problem is
i want replace word document using openxml and add a page break
end then i want to write replaced text second page
here my codes
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(#"d:\a.docx", true))
{
using (StreamReader reader = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
{
text = reader.ReadToEnd();
}
Regex regexText = new Regex("#db#");
text = regexText.Replace(text, textBox4.Text.Trim());
using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
{
sw.Write(text);
}
MainDocumentPart mainPart = wordDoc.MainDocumentPart;
Run r = new Run();
Paragraph para = new Paragraph(new Run(new Break() { Type = BreakValues.Page }));
using (StreamWriter sw1 = new StreamWriter(mainPart.GetStream(FileMode.Create)))
{
sw1.Write(text);
}
mainPart.Document.Body.InsertAfter(para, mainPart.Document.Body.LastChild);
mainPart.Document.Save();
}
}
I suggest you insert a page break in your a.docx in advance. Then, use MergeField to locate where you want to replace with.
Here is the example

' ', hexadecimal value 0x1F, is an invalid character. Line 1, position 1

I am trying to read a xml file from the web and parse it out using XDocument. It normally works fine but sometimes it gives me this error for day:
**' ', hexadecimal value 0x1F, is an invalid character. Line 1, position 1**
I have tried some solutions from Google but they aren't working for VS 2010 Express Windows Phone 7.
There is a solution which replace the 0x1F character to string.empty but my code return a stream which doesn't have replace method.
s = s.Replace(Convert.ToString((byte)0x1F), string.Empty);
Here is my code:
void webClient_OpenReadCompleted(object sender, OpenReadCompletedEventArgs e)
{
using (var reader = new StreamReader(e.Result))
{
int[] counter = { 1 };
string s = reader.ReadToEnd();
Stream str = e.Result;
// s = s.Replace(Convert.ToString((byte)0x1F), string.Empty);
// byte[] str = Convert.FromBase64String(s);
// Stream memStream = new MemoryStream(str);
str.Position = 0;
XDocument xdoc = XDocument.Load(str);
var data = from query in xdoc.Descendants("user")
select new mobion
{
index = counter[0]++,
avlink = (string)query.Element("user_info").Element("avlink"),
nickname = (string)query.Element("user_info").Element("nickname"),
track = (string)query.Element("track"),
artist = (string)query.Element("artist"),
};
listBox.ItemsSource = data;
}
}
XML file:
http://music.mobion.vn/api/v1/music/userstop?devid=
0x1f is a Windows control character. It is not valid XML. Your best bet is to replace it.
Instead of using reader.ReadToEnd() (which by the way - for a large file - can use up a lot of memory.. though you can definitely use it) why not try something like:
string input;
while ((input = sr.ReadLine()) != null)
{
string = string + input.Replace((char)(0x1F), ' ');
}
you can re-convert into a stream if you'd like, to then use as you please.
byte[] byteArray = Encoding.ASCII.GetBytes( input );
MemoryStream stream = new MemoryStream( byteArray );
Or else you could keep doing readToEnd() and then clean that string of illegal characters, and convert back to a stream.
Here's a good resource for cleaning illegal characters in your xml - chances are, youll have others as well...
https://seattlesoftware.wordpress.com/tag/hexadecimal-value-0x-is-an-invalid-character/
What could be happening is that the content is compressed in which case you need to decompress it.
With HttpHandler you can do this the following way:
var client = new HttpClient(new HttpClientHandler
{
AutomaticDecompression = DecompressionMethods.GZip
| DecompressionMethods.Deflate
});
With the "old" WebClient you have to derive your own class to achieve the similar effect:
class MyWebClient : WebClient
{
protected override WebRequest GetWebRequest(Uri address)
{
HttpWebRequest request = base.GetWebRequest(address) as HttpWebRequest;
request.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
return request;
}
}
Above taken from here
To use the two you would do something like this:
HttpClient
using (var client = new HttpClient(new HttpClientHandler { AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate }))
{
using (var stream = client.GetStreamAsync(url))
{
using (var sr = new StreamReader(stream.Result))
{
using (var reader = XmlReader.Create(sr))
{
var feed = System.ServiceModel.Syndication.SyndicationFeed.Load(reader);
foreach (var item in feed.Items)
{
Console.WriteLine(item.Title.Text);
}
}
}
}
}
WebClient
using (var stream = new MyWebClient().OpenRead("http://myrss.url"))
{
using (var sr = new StreamReader(stream))
{
using (var reader = XmlReader.Create(sr))
{
var feed = System.ServiceModel.Syndication.SyndicationFeed.Load(reader);
foreach (var item in feed.Items)
{
Console.WriteLine(item.Title.Text);
}
}
}
}
This way you also recieve the benefit of not having to .ReadToEnd() since you are working with the stream instead.
Consider using System.Web.HttpUtility.HtmlDecode if you're decoding content read from the web.
If you are having issues replacing the character
For me there were some issues if you try to replace using the string instead of the char. I suggest trying some testing values using both to see what they turn up. Also how you reference it has some effect.
var a = x.IndexOf('\u001f'); // 513
var b = x.IndexOf(Convert.ToString((byte)0x1F)); // -1
x = x.Replace(Convert.ToChar((byte)0x1F), ' '); // Works
x = x.Replace(Convert.ToString((byte)0x1F), " "); // Fails
I blagged this
I had the same issue and found that the problem was a  embedded in the xml.
The solution was:
s = s.Replace("", " ")
I'd guess it's probably an encoding issue but without seeing the XML I can't say for sure.
In terms of your plan to simply replace the character but not being able to, because you have a stream rather than a text, simply read the stream into a string and then remove the characters you don't want.
Works for me.........
string.Replace(Chr(31), "")
I used XmlSerializer to parse XML and faced the same exception.
The problem is that the XML string contains HTML codes of invalid characters
This method removes all invalid HTML codes from string (based on this thread - https://forums.asp.net/t/1483793.aspx?Need+a+method+that+removes+illegal+XML+characters+from+a+String):
public static string RemoveInvalidXmlSubstrs(string xmlStr)
{
string pattern = "&#((\\d+)|(x\\S+));";
Regex regex = new Regex(pattern, RegexOptions.IgnoreCase);
if (regex.IsMatch(xmlStr))
{
xmlStr = regex.Replace(xmlStr, new MatchEvaluator(m =>
{
string s = m.Value;
string unicodeNumStr = s.Substring(2, s.Length - 3);
int unicodeNum = unicodeNumStr.StartsWith("x") ?
Convert.ToInt32(unicodeNumStr.Substring(1), 16)
: Convert.ToInt32(unicodeNumStr);
//according to https://www.w3.org/TR/xml/#charsets
if ((unicodeNum == 0x9 || unicodeNum == 0xA || unicodeNum == 0xD) ||
((unicodeNum >= 0x20) && (unicodeNum <= 0xD7FF)) ||
((unicodeNum >= 0xE000) && (unicodeNum <= 0xFFFD)) ||
((unicodeNum >= 0x10000) && (unicodeNum <= 0x10FFFF)))
{
return s;
}
else
{
return String.Empty;
}
})
);
}
return xmlStr;
}
Nobody can answer if you don't show relevant info - I mean the Xml content.
As a general advice I would put a breakpoint after ReadToEnd() call. Now you can do a couple of things:
Reveal Xml content to this forum.
Test it using VS Xml visualizer.
Copy-paste the string into a txt file and investigate it offline.