Getting page margins of a Word document using OpenXML - ms-word

I tried different versions of Word documents with different page margins in them, but nothing I found on the web did the job.
Here's my attempt at reading page margins from a .docx file.
var document = WordprocessingDocument.Open(sourceDocxPath, true);
var myMargins = document.MainDocumentPart.Document.GetFirstChild<PageMargin>();
// always null

In the truest programmer fashion, I figured out the answer five minutes after posting:
var leftMargin = document.MainDocumentPart.Document.Body
    .GetFirstChild<SectionProperties>()
    .GetFirstChild<PageMargin>().Left;
I figured it out by inspecting the OuterXml of the document and seeing where the elements are positioned: PageMargin sits inside SectionProperties, which is a child of Body, not a direct child of Document.
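For anyone who wants to check the same element path without the SDK, here is a minimal sketch in Python using only the standard library. It assumes the main part lives at word/document.xml (the conventional location), that the margins of interest are on the body-level sectPr, and that the attribute values are plain integers in twips. The function name read_page_margins is my own.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in document.xml).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_page_margins(docx_file):
    """Return the w:pgMar attributes of the body-level section as a dict.

    Hypothetical helper: assumes word/document.xml is the main part and
    that margin values are integer twips (1/20 of a point).
    """
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read("word/document.xml"))

    # Same path as the C# answer: document -> body -> sectPr -> pgMar.
    pg_mar = root.find(f"{{{W_NS}}}body/{{{W_NS}}}sectPr/{{{W_NS}}}pgMar")
    if pg_mar is None:
        return None

    prefix = f"{{{W_NS}}}"
    return {
        name[len(prefix):]: int(value)
        for name, value in pg_mar.attrib.items()
        if name.startswith(prefix)
    }
```

Calling read_page_margins("some.docx") would return something like a dict with keys such as "left" and "top", or None when the body has no section properties with page margins.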

Related

Issue while adding/updating TOC in MS Word using OpenXML

I have a requirement to add/update a TOC (Table of Contents) in an MS Word document using OpenXML, and I am having trouble achieving this. I am using MS Office 2016.
I have tried all the options from this post:
How to generate Table Of Contents using OpenXML SDK 2.0?
I have also gone through Eric White's videos.
I am trying to use the UpdateField option and am able to add an empty TOC by following the sample code from the link above.
However, when I open the document, I do not get the pop-up dialog that asks to update the TOC.
Here is the sample code:
var sdtBlock = new SdtBlock();
sdtBlock.InnerXml = GetTOC(); //TOC Xml
document.MainDocumentPart.Document.Body.AppendChild(sdtBlock);
DocumentFormat.OpenXml.Wordprocessing.SimpleField f;
f = new SimpleField();
f.Instruction = "sdtContent";
f.Dirty = true;
document.MainDocumentPart.Document.Body.Append(f);
var setting = document.MainDocumentPart.DocumentSettingsPart;
if (setting != null)
{
    document.MainDocumentPart.DocumentSettingsPart.Settings.Append(new UpdateFieldsOnOpen() { Val = new DocumentFormat.OpenXml.OnOffValue(true)});
    document.MainDocumentPart.DocumentSettingsPart.Settings.Save();
}
It displays the default message even though I have valid entries (headings).
Is this due to MS Office 2016? Is that why the update-fields pop-up does not appear?
I don't want to go with the options below:
Word Automation - because it requires Microsoft.Office.Interop.Word
Word Automation Services - because it requires SharePoint
Adding a macro - because it asks to save the document every time I open it.
Also, let me know if there is a better option to create/update a TOC.
Any answer or comment would be really helpful.

How to get MS Word total pages count using Open XML SDK?

I am using the code below to get the page count, but it does not give the actual page count. What is a better way to get the total page count?
var pageCount = doc.ExtendedFilePropertiesPart.Properties.Pages.Text.Trim();
Note: We cannot use the Office Primary Interop Assemblies in my Azure web app service
Thanks in advance.
In theory, the following property can return that information from the Word Open XML file, using the Open XML SDK:
// Pages.Text is a string, so it has to be parsed rather than cast.
int pageCount = int.Parse(document.ExtendedFilePropertiesPart.Properties.Pages.Text);
In practice, however, this isn't reliable. It might work, but then again, it might not. It all depends on 1) what Word managed to save in the file before it was closed and 2) what kind of editing may have been done on the closed file.
The only sure way to get a page number or a page count is to open the document in the Word application interface. Page numbers and the page count are calculated dynamically by Word during editing. When a document is closed, this information is static and not necessarily what it will be when the document is opened or printed.
See also https://github.com/OfficeDev/Open-XML-SDK/issues/22 for confirmation.
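To make the "static, possibly stale" point concrete, here is a rough Python sketch (standard library only) that reads the same cached value straight from the package. It assumes the extended properties part is stored at docProps/app.xml, which is the usual location but strictly speaking is resolved through the package relationships. The function name cached_page_count is hypothetical. Whatever it returns is only the number Word last wrote, not a freshly computed page count.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the extended file properties part (docProps/app.xml).
EP_NS = "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"


def cached_page_count(docx_file):
    """Return the <Pages> value Word last saved, or None if it is absent.

    Hypothetical helper: assumes the part is at docProps/app.xml.
    The value is a cache and may not match the real page count.
    """
    with zipfile.ZipFile(docx_file) as package:
        if "docProps/app.xml" not in package.namelist():
            return None
        root = ET.fromstring(package.read("docProps/app.xml"))

    pages = root.find(f"{{{EP_NS}}}Pages")
    if pages is None or not (pages.text or "").strip():
        return None
    return int(pages.text.strip())
```

A None result means the file carries no cached count at all, which happens with documents generated by tools other than Word.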
This code worked for me. It adds "Page X of Y" to the document.
para = new Paragraph(new Run(
new Text() { Text = "Page ", Space = SpaceProcessingModeValues.Preserve },
new SimpleField() { Instruction = "PAGE" },
new Text() { Text = " of ", Space = SpaceProcessingModeValues.Preserve },
new SimpleField() { Instruction = "NUMPAGES \\*MERGEFORMAT" }));

iTextSharp insert image PushbuttonField not working

I have searched and searched and cannot find the answer to my problem. I've tried many different approaches in my code, but I've hit a wall and I'm not sure where to go from here. I seem to be wanting to do the same thing as these two threads:
Trying to insert an image into a pdf in c#
Add image in an existing PDF with itextsharp
They are very similar and the answer is the same. However, when I use that exact code, the result is a PDF without an image. Here is my code:
using (var existingFileStream = new FileStream(fileNameExisting, FileMode.Open))
using (var newFileStream = new FileStream(fileNameNew, FileMode.Create))
{
    var pdfReader = new PdfReader(existingFileStream);
    var stamper = new PdfStamper(pdfReader, newFileStream, '\0', true);
    var form = stamper.AcroFields;
    var fieldKeys = form.Fields.Keys;
    foreach (var field in form.Fields)
    {
        if (field.Key == "form1[0].ec_Bldg_Photo_1[0].ImageField2[0]")
        {
            PushbuttonField imageField = form.GetNewPushbuttonFromField(field.Key);
            imageField.Layout = PushbuttonField.LAYOUT_ICON_ONLY;
            imageField.IconReference = null;
            imageField.ProportionalIcon = true;
            imageField.Image = Image.GetInstance(@"PATH_TO_IMAGE\front.jpg");
            form.ReplacePushbuttonField(field.Key, imageField.Field);
        }
    }
    stamper.FormFlattening = false;
    stamper.Close();
    pdfReader.Close();
}
I have tried to rule out all of the obvious things. My path to the image is correct, the field is indeed a PushbuttonField when I read the existing PDF field and get the field type. If I open the PDF in Adobe Reader and click on the placeholder for the image, it allows me to pick a file from my PC. When I place an image in the file, save, and then read in that PDF, I can then change my code to this:
imageField.ProportionalIcon = false;
And now all of a sudden the image is stretched on the saved copy. So I can see that the code is changing this part, but only when I have entered the image manually in Adobe Reader. When I read in the field after setting the image in Adobe Reader, where it shows correctly, I see a couple of interesting things: the field.Image property IS NULL and field.IconReference is NOT NULL. When I use the original code to try to insert the image, it is reversed: Image is NOT NULL but IconReference IS NULL.
Any help would be greatly appreciated, thank you!!
EDIT 1: Ok so I didn't see it the first time, but I went back and checked more thoroughly and I did find that key. Here it is:
Several things are at play here.
Usage Rights:
The PDF is digitally signed with a private key owned by Adobe.
You can see this using RUPS here (in your screen shot you didn't go deep enough):
This has two implications:
The signature unlocks special permissions in Adobe Reader, such as the permission to save a filled out form locally.
Making any changes to the original PDF breaks the signature and removes the special permissions leading to an ugly error message in Adobe Reader.
This functionality is deprecated in (and even removed from) PDF 2.0. It's old technology that became obsolete with the emergence of PDF viewers other than Adobe Reader.
My suggestion: remove the usage rights to avoid breaking the signature. See the FAQ entry "Why do I get an error saying that "use of extended features is no longer available"?" iText 7 / iText 5
This is the iText 7 code:
public void removeUsageRights(PdfDocument pdfDoc) {
    PdfDictionary perms = pdfDoc.getCatalog().getPdfObject().getAsDictionary(PdfName.Perms);
    if (perms == null) {
        return;
    }
    perms.remove(new PdfName("UR"));
    perms.remove(PdfName.UR3);
    if (perms.size() == 0) {
        pdfDoc.getCatalog().remove(PdfName.Perms);
    }
}
This is the iText 5 code:
PdfReader reader = new PdfReader(old_file);
if (reader.hasUsageRights()) {
    reader.removeUsageRights();
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(new_file));
    stamper.close();
}
reader.close();
This is the iText 5 answer.
Hybrid Form:
If you click on the /AcroForm entry, you see this:
There is a /Fields array with references to field dictionaries that are also widget annotations. That means that the document has an AcroForm form inside. However, there is also an /XFA entry with a series of XML snippets. That means that the document has an XFA form inside.
In other words: the same form description is added twice inside. You are changing a button in one form (the AcroForm part), but not in the other (the XFA form) and that leads to inconsistencies.
XFA has been deprecated in PDF 2.0 because there weren't many vendors supporting that technology. It's kind of frustrating to be confronted with forms that use deprecated technology.
My suggestion: I would remove the XFA part. See the FAQ entry "Is it safe to remove XFA?" iText 5 / iText 7
In iText 5, removing XFA is done like this:
AcroFields form = stamper.getAcroFields();
form.removeXfa();
Important: my suggestion is to remove all the deprecated functionality from the PDF, but if the government expects that functionality to be present, then you're out of luck. In that case, you will need to use Adobe software to process the form. If that's the case, you could complain to the government that their requirements lead to a de facto vendor lock-in. By the way: iText Software is also a vendor. It's an open source company that offers open source software under the AGPL license. The AGPL license allows free use under certain circumstances (see How do I make sure my software complies with AGPL: How can I use iText for free?) If you don't meet those requirements, you will have to purchase a commercial license for your use of iText.

Keeping track of text position on pdf using FPDF and GetY

I am trying to keep track of the current Y position on a PDF page created using FPDF so that I can start a new page at the right point and ensure tables do not cross a page break. Firstly, am I right in using GetY to monitor this, and if so, what is the correct syntax? I am trying
$currentYposition = GetY();
but it does not seem to work. Any advice?
No idea why this works - but it does:
If you just grab Y after the call, it seems to be the value before the MultiCell.
Grabbing it before and after and taking the difference gives you the height.
$oldY = $this->getY();
$this->MultiCell(150, 4, utf8_decode($description), 0, "L");
$newY = $this->getY();
$multiCellHeight = $newY-$oldY;
This one worked for me.
$y = $pdf->GetY();
I came to this question when programming in Python and using the fpdf module. I'll post this in case anyone else needs it. I could not find this solution in the official documentation, but the following worked for me:
from fpdf import FPDF
pdf = FPDF()
pdf.add_page()
current_y = pdf.get_y()  # equivalent to FPDF.get_y(pdf)
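Once you have the Y value, the page-break decision the question asks about is just a comparison. Here is a small sketch of that check in plain Python. The function name and parameters are my own, not part of FPDF: current_y would come from GetY()/get_y(), block_height is the height you expect the table to take, and all values are assumed to be in the same unit as the document (mm by default in FPDF).

```python
def needs_page_break(current_y, block_height, page_height, bottom_margin):
    """Return True if a block of the given height would cross the bottom margin.

    Hypothetical helper: all arguments must use the same unit as the PDF.
    """
    usable_bottom = page_height - bottom_margin
    return current_y + block_height > usable_bottom


# Example with an A4 page (297 mm high) and a 20 mm bottom margin:
# a 40 mm table starting at y = 250 would not fit, so a new page is needed.
if needs_page_break(250, 40, 297, 20):
    print("start a new page before drawing the table")
```

If the check returns True, add a page first and then draw the table; otherwise draw it at the current position.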

How to access hyperlinks in PDF documents (iPhone)?

Is it possible to get access to "internal" links in PDF documents using CGPDFDocument, or other means? I'm building a simple reader app and would like to deliver my content in PDF form, but if I can't support links between pages in the doc this probably isn't going to work.
This question is similar, but does not address the issue of how to support hyperlinks.
See my answer here. Basically, you'll need to get familiar with PDF link annotations.
See this sample code; PDF hyperlinks work in it:
https://github.com/vfr/Reader
If you're using Quartz to open and view a PDF, then yes, it looks like you will have access to internal links. Quartz will also let you add new links to PDFs. I don't have any first hand experience with iPhone/Mac development, but it would be quite strange for them to let you add hyperlinks, but not use them.
You need to do this in two steps.
First: parse your PDF to locate the marked-content operators.
Here is an example of the parsing code:
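// Callback invoked for every BMC (begin marked content) operator.
// Defined before parseContent so that it is declared when it is registered.
static void myOperator_BMC(CGPDFScannerRef s, void *info)
{
    const char *name;
    if (CGPDFScannerPopName(s, &name)) {
        // name now holds the marked-content tag; record it via info as needed.
    }
}
- (void)parseContent {
    CGPDFOperatorTableRef myTable = CGPDFOperatorTableCreate();
    CGPDFOperatorTableSetCallback(myTable, "BMC", &myOperator_BMC);
    // page (a CGPDFPageRef) and autoZoomAreas (the info pointer) come from your own code.
    CGPDFContentStreamRef myContentStream = CGPDFContentStreamCreateWithPage(page);
    CGPDFScannerRef myScanner = CGPDFScannerCreate(myContentStream, myTable, autoZoomAreas);
    CGPDFScannerScan(myScanner);
    CGPDFScannerRelease(myScanner);
    CGPDFContentStreamRelease(myContentStream);
    CGPDFOperatorTableRelease(myTable);
}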
(You need to complete and adjust this code to match your requirement)
Second: respond to the touchesEnded message to handle taps on those zones and make the UI respond accordingly.