Mailmerge Field not always saved the same way in Word .docx - ms-word

I have created a Word document with Word 2003 and inserted some MergeField via the GUI.
I saved it as a .docx using the Microsoft Office Compatibility Pack for Word, Excel, and PowerPoint 2007 File Formats. Some MergeFields are stored as a SimpleField, while others are stored as a FieldCode (with start-FieldChar and end-FieldChar). Some Googling brought me to this blog. As you can see, the author is facing the same problem but hasn't found a solution yet.
I'm using the following code sample on Codeplex [Fill Mergefields] to replace the MergeFields with the actual values from different datasources.
Any help is welcome.

If a field's value is just simple text with consistent formatting, it can be stored as a fldSimple node. However, if the field's value has varying formatting, it has to be saved as a complex field (fldChar Start, optional Separate, and End) so that each run within the field's value can have its own formatting defined in the run properties <w:rPr>. I think that also happens if Word uses the rsid attributes to track changes. The fldChar Start/Separate/End form is also necessary if fields are nested, such as multiple IF fields, so that the field's value can hold an arbitrary number of <w:r>, <w:p>, and <w:tbl> elements.
And sometimes it stores them like that for seemingly no good reason. (As that blog noted).
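Code that fills merge fields therefore has to recognise both storage forms. The following is a minimal Python sketch, not the Codeplex sample's method: it reads the text of word/document.xml and collects MERGEFIELD names from fldSimple nodes and from fldChar/instrText sequences. The function name merge_field_names is made up for illustration, and the sketch assumes fields are not nested.

```python
import re
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
# Matches ' MERGEFIELD Name ...' and ' MERGEFIELD "Two Words" ...'
MERGEFIELD = re.compile(r'\s*MERGEFIELD\s+(?:"([^"]+)"|(\S+))', re.IGNORECASE)


def _name_from_instruction(instruction):
    match = MERGEFIELD.match(instruction)
    return (match.group(1) or match.group(2)) if match else None


def merge_field_names(document_xml):
    """Return merge field names in document order (assumes no nested fields)."""
    names = []
    pending = None  # instruction text gathered since the last 'begin' fldChar
    for node in ET.fromstring(document_xml).iter():
        if node.tag == f"{{{W}}}fldSimple":
            # Simple form: the whole instruction sits in one attribute.
            name = _name_from_instruction(node.get(f"{{{W}}}instr", ""))
            if name:
                names.append(name)
        elif node.tag == f"{{{W}}}fldChar":
            kind = node.get(f"{{{W}}}fldCharType")
            if kind == "begin":
                pending = ""
            elif kind in ("separate", "end") and pending is not None:
                name = _name_from_instruction(pending)
                if name:
                    names.append(name)
                pending = None
        elif node.tag == f"{{{W}}}instrText" and pending is not None:
            # Complex form: Word may split the instruction across several runs.
            pending += node.text or ""
    return names
```

Joining the instrText pieces before parsing matters because Word can split a single instruction across several runs.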

Related

Ms-Word Mail Merging format

I'm using MS Office 2016 to do a mail merge for a contract where the original data is stored in an Excel file.
I have a problem formatting the plate number. It should be in a format like this: ##-#####, and that's how it's shown in Excel. However, when I do the merge it shows like this: #####-##, and here is a photo showing exactly one entry.
I tried placing the merge field outside the table, tweaked right-to-left and left-to-right, and tried to build a format based on what I saw on the internet about formatting dates and currency, but I got no positive results.
I would appreciate it if you could help me format this field or direct me to a reference where I can learn how this formatting works.
Also, I don't know if this is viable, but every time I open the .docx file it says that it needs to run an SQL query. Can I access that query? I have basic knowledge of SQL and may be able to do the formatting through the query.

Insert a word document into another word document without losing formatting

I am working on my portfolio for grad school and I am having an issue with inserting a word document into another and keeping the original formatting of both. I built the main document so that all I would need to do is insert my supporting documentation which has to be of a different format. I tried next page section breaks and it generates the next page but all the formatting is still tied to the main document. Thanks in advance.
Just do a regular copy and paste: Control+C in the source and Control+V in the destination. Best.

Is there a way to search text contained in linked documents in Enterprise Architect?

Plenty of information is being stored in Enterprise Architect as linked documents. Is there a way to search content contained in these documents?
In the GUI, the only way to do this is to open the document and search inside it, which I assume is not what you're after.
I don't think an in-EA SQL search would get this done either. The linked documents are in RTF, so you'd have to parse that to find the text you're looking for.
But you could do it with a script or an Add-In.
The Object Model API method Element.GetLinkedDocument() returns the RTF contents from an element's linked document. Then, you can use Repository.GetFieldFromFormat() and Repository.GetFormatFromField() to convert that to plain text.

Can I use VSTO instead of Open XML to manipulate altChunk features?

I would like to embed one Word document (call it "hidden.docx") into another Word document (call it "host.docx"). The document hidden.docx would not be visible at all when host.docx is opened in Word by an end-user. Document hidden.docx would only be carried inside host.docx, sort of as unstructured cargo data.
All research I have done points me to the use of something called altChunk offered by the Open XML SDK. I have installed Open XML SDK and got a sample working: http://msdn.microsoft.com/en-us/library/gg490656%28v=office.14%29.aspx
My question: In order to insert an altChunk into a docx, do I really need the Open XML SDK? Can this not be accomplished using VSTO? If so, how?
[PS: My ultimate goal is, for a pair of documents where one document is the original text and the other is its translated version in another language, to be able to preserve the original document within the translated document, so as not to lose it. For any document pair, there's always the risk that the two documents become separated through misplacement of one of them.]
Yes and No.
1.) That's not what AltChunks do. AltChunks are a way to embed one document into another document such that they get merged together. They are not hidden. If you create a docx package with an AltChunk in it, and then open up Word, Word will immediately merge that AltChunk into the document. (If that AltChunk is another Word document that also contains child AltChunks they will be recursively merged into the parent as well.) Basically, it's an easy way to merge content together without having to reconsolidate all their styles, rIDs, etc. -- if you save the document and examine it, the AltChunk will be gone, and you will notice that Word has merged the document back together into a single document again.
2.) Range.InsertXML, if provided a valid Flat Package for a full Word document, will under the hood invoke the same merge functionality (down to having the same bugs, etc.) that you would get from an AltChunk. The two behave identically, and you can even create a document package with the OpenXML SDK that contains embedded AltChunks, and insert those (I've done this in Word 2007, 2010, and 2013) -- of course, as I mentioned above, the AltChunks are never persisted; they're immediately merged into the document.
If you want to save hidden data in a document, I recommend using Custom XML (take a look at Document.CustomXMLParts). Keep in mind though, at least in Word 2010, Undo does not revert changes to CustomXML parts.
If you simply want to include some file in the Open XML package, then the simplest way is to use the Open XML SDK's packaging API from the DocumentFormat.OpenXml.Packaging namespace (first obtain a reference to the main document part of the host document):
// Add an embedded part typed as a Word document, then copy the hidden file into it.
EmbeddedPackagePart hiddenDocumentPart = mainDocumentPart.AddEmbeddedPackagePart(@"application/vnd.openxmlformats-officedocument.wordprocessingml.document");
using (FileStream hiddenStream = File.Open(hiddenDocumentFile, FileMode.Open))
    hiddenDocumentPart.FeedData(hiddenStream);
Just to be sure, this way the hidden document will be in no way part of the host document content. It will only be part of its file (package). You can later extract it with a similar method: Get the main part of the host document, find the embedded (hidden) part and get/read the data from it.
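The extraction step can also be done without the SDK, since a .docx is a ZIP package. The sketch below is an illustration in Python, not the answer's own code: it reads the main part's relationships file and returns the bytes of every part linked with the "package" relationship type, which is assumed here to be the type used for embedded packages. The function name and the default main part path word/document.xml are also assumptions.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
# Assumption: embedded packages are linked with this relationship type.
PACKAGE_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/package"


def extract_embedded_packages(docx_file, main_part="word/document.xml"):
    """Return {part_name: bytes} for each embedded package of the main part."""
    folder, name = posixpath.split(main_part)
    rels_name = posixpath.join(folder, "_rels", name + ".rels")
    found = {}
    with zipfile.ZipFile(docx_file) as package:
        if rels_name not in package.namelist():
            return found  # the main part has no relationships at all
        rels = ET.fromstring(package.read(rels_name))
        for rel in rels.iter(f"{{{REL_NS}}}Relationship"):
            if rel.get("Type") != PACKAGE_REL:
                continue
            if rel.get("TargetMode") == "External":
                continue  # external targets are not stored in the package
            # Targets are relative to the main part's folder unless absolute.
            target = posixpath.join("/" + folder, rel.get("Target"))
            part_name = posixpath.normpath(target).lstrip("/")
            found[part_name] = package.read(part_name)
    return found
```

Calling extract_embedded_packages("host.docx") would then give the raw bytes of hidden.docx, which can be written back to disk unchanged.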

Unique ID for MS Word 2007 paragraph

I am writing large MS Word 2007 documents, which are changed often. I have to number paragraphs with stable, unique numbers that do not change when the documents are edited. The numbers must be unique and must not change even if earlier numbers are deleted. The order of the list is not mandatory, and a new number may be added before existing numbers (for instance, the sequence 1, 4, 3 means that paragraphs 1-3 were written, then #2 was deleted, then #4 was added; #3 was not affected by the later editing).
The mechanism should be internal to the document, as I am working on line and off line. The numbers are allocated to every document individually.
Since I don't know how to program in MS Word, I'd appreciate a complete solution.
Nope, it's not possible out of the box. Word does not assign a permanent index to paragraphs. The simplest way, though it isn't that simple, is to programmatically assign an index number to each Paragraph range item through a CustomXML control that wraps the paragraph, either on load or whenever you run it. For this or any other solution, you'll need to learn the Word object model and program it through VBA, VSTO, or OpenXML.
You can wrap a paragraph in a content control (structured document tag); these can have IDs.
Iirc, Word 2010 allows paragraphs to have IDs. M$ added this because they needed it for the concurrent editing introduced in 2010.
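To see what the content control suggestion gives you, the Python sketch below reads the text of word/document.xml and maps each content control's ID (the w:val of w:sdtPr/w:id) to the text it wraps. It is only an illustration under assumptions: the function name is made up, and it shows how to read IDs that already exist, not how to allocate new ones.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def content_control_ids(document_xml):
    """Map each content control's ID to the text inside the control."""
    result = {}
    for sdt in ET.fromstring(document_xml).iter(f"{{{W}}}sdt"):
        id_node = sdt.find(f"{{{W}}}sdtPr/{{{W}}}id")
        if id_node is None:
            continue  # a content control without an explicit ID
        content = sdt.find(f"{{{W}}}sdtContent")
        text = ""
        if content is not None:
            # Concatenate every text run wrapped by the control.
            text = "".join(t.text or "" for t in content.iter(f"{{{W}}}t"))
        result[int(id_node.get(f"{{{W}}}val"))] = text
    return result
```

Because the ID is stored on the control and not derived from its position, deleting one wrapped paragraph does not renumber the others.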