Getting the Content of a ContentControl Field as RichText - ms-word

I am trying to create a Word template in which users enter notes with basic formatting options (bullet lists, indentation, etc.). I need that content as rich text so that I can display it in a browser.
I thought the RichText ContentControl field would save its content as, well, rich text, but it turns out to be OOXML.
Is there a converter that turns the content of a rich text field into actual rich text?
Currently its content looks like this:
<w:sdtContent>
<w:p>
<w:pPr>
<w:pStyle w:val="Listenabsatz" />
<w:ind w:left="4320" />
</w:pPr>
<w:proofErr w:type="spellStart" />
<w:r>
<w:t>Ztzjtzj</w:t>
</w:r>
<w:proofErr w:type="spellEnd" />
</w:p>
<w:p>
<w:pPr>
<w:pStyle w:val="Listenabsatz" />
<w:ind w:left="4320" />
</w:pPr>
<w:proofErr w:type="spellStart" />
<w:r>
<w:t>Tzjtzjtzj</w:t>
</w:r>
<w:proofErr w:type="spellEnd" />
</w:p>
</w:sdtContent>
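Assuming "rich text for the browser" means HTML, one option is to convert the OOXML paragraphs yourself. The following is only a minimal sketch: the function name `sdt_content_to_html` is made up for illustration, it keeps just the paragraph text and the left indent, and it ignores styles, list numbering, and run formatting such as bold or italic.

```python
import xml.etree.ElementTree as ET
from html import escape

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def sdt_content_to_html(fragment: str) -> str:
    """Turn each <w:p> in a w:sdtContent fragment into an HTML <p>."""
    # The fragment as posted carries no namespace declaration, so wrap it
    # in a dummy root element that declares the "w" prefix.
    root = ET.fromstring(f'<root xmlns:w="{W}">{fragment}</root>')
    parts = []
    for p in root.iterfind(".//w:p", NS):
        # Concatenate the text of all runs in the paragraph.
        text = "".join(t.text or "" for t in p.iterfind(".//w:t", NS))
        ind = p.find("w:pPr/w:ind", NS)
        left_twips = int(ind.get(f"{{{W}}}left", "0")) if ind is not None else 0
        # w:ind values are in twips; 20 twips = 1 point.
        style = f' style="margin-left:{left_twips / 20:g}pt"' if left_twips else ""
        parts.append(f"<p{style}>{escape(text)}</p>")
    return "\n".join(parts)
```

For the fragment above this would produce one `<p style="margin-left:216pt">` element per paragraph. Anything beyond plain paragraphs (real list bullets, nested tables, images) would need considerably more mapping logic or a dedicated conversion library.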

Related

Define tab size in Word document

I would like to calculate the tab size in white spaces, that is, roughly how many spaces a tab is equivalent to.
Here is a snippet of the Open XML document:
<w:body>
<w:p w14:paraId="606C743E" w14:textId="78B4DBA2" w:rsidR="009322DC" w:rsidRPr="00641B97" w:rsidRDefault="00641B97" w:rsidP="00641B97">
<w:pPr>
<w:pStyle w:val="KNotice"/>
<w:keepNext/>
<w:spacing w:after="0"/>
<w:rPr>
<w:szCs w:val="20"/>
</w:rPr>
</w:pPr>
<w:r>
<w:rPr>
<w:szCs w:val="20"/>
</w:rPr>
<w:tab/>
</w:r>
<w:r w:rsidR="00B846CC">
<w:rPr>
<w:szCs w:val="20"/>
</w:rPr>
<w:t>Some text.</w:t>
</w:r>
</w:p>
<w:sectPr w:rsidR="009322DC" w:rsidRPr="00641B97">
<w:pgSz w:w="12240" w:h="15840"/>
<w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440" w:header="720" w:footer="720" w:gutter="0"/>
<w:cols w:space="720"/>
<w:docGrid w:linePitch="360"/>
</w:sectPr>
</w:body>
I will use the Open XML SDK for parsing, but first I am trying to find the property responsible for the tab size. For example, changing <w:szCs w:val="20"/> has no effect.
I noticed that the size changes when I manipulate First Line Indent or Left Indent, so I assume there is no single property that stores the tab size and that I may have to combine several properties.
Is there such a property in the document structure, or does the size have to be calculated in some way?
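As far as I know there is no single "tab width" property. A <w:tab/> advances to the next tab stop, which is either a custom stop (w:tabs/w:tab with w:pos in the paragraph properties or the style) or the next multiple of w:defaultTabStop from settings.xml (720 twips, half an inch, when that element is omitted). That would also explain why the visible width changes with the indents: they move the position the tab starts from. A rough sketch of that calculation follows; the function names and the `space_width` parameter are hypothetical, and the sketch ignores tab alignment (center, right, decimal), cleared stops, and the implicit stop Word adds for hanging indents.

```python
def next_tab_stop(pos, custom_stops=(), default_interval=720):
    """Return the position (in twips) that a tab starting at `pos` advances to."""
    # Custom stops win if one lies to the right of the current position.
    for stop in sorted(custom_stops):
        if stop > pos:
            return stop
    # Otherwise fall back to the grid of default tab stops.
    return (pos // default_interval + 1) * default_interval


def approx_tab_in_spaces(pos, space_width, custom_stops=(), default_interval=720):
    """Estimate how many space characters fit into the tab.

    `space_width` is the width of one space in twips. It depends on the
    font and size, so the caller has to supply it (an assumption here).
    """
    width = next_tab_stop(pos, custom_stops, default_interval) - pos
    return width / space_width
```

For example, `next_tab_stop(0)` gives 720 with the defaults, while `next_tab_stop(100, custom_stops=[360])` gives 360. Converting twips to a number of spaces is only an approximation because it needs the actual font metrics.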

Word field definition issue after save

I have a field definition in Word 2016 (Office 365 has the same problem) like this:
<w:r>
<w:fldChar w:fldCharType="begin"/>
<w:instrText xml:space="preserve"> DOCPROPERTY "UohsDate" \* MERGEFORMAT </w:instrText>
<w:fldChar w:fldCharType="separate"/>
</w:r>
After filling the field with docx4j, editing the document in Word, and saving it again, my field definition is split into two runs:
<w:r>
<w:instrText xml:space="preserve"> DOCPROPERTY "UohsDa</w:instrText>
</w:r>
<w:r>
<w:instrText xml:space="preserve">te" \* MERGEFORMAT </w:instrText>
</w:r>
<w:r>
<w:fldChar w:fldCharType="separate"/>
</w:r>
Any suggestions, please? Thanks a lot.
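Word seems free to split instruction text across several runs when it saves, so code that reads fields usually has to concatenate every w:instrText between the "begin" and "separate" (or "end") field characters instead of expecting a single element. Here is a minimal sketch of that idea using only the standard library; `field_instructions` is an illustrative name, and nested fields are not handled.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def field_instructions(fragment):
    """Return the joined instruction string of each field in the fragment."""
    # Wrap the fragment so the "w" prefix is declared.
    root = ET.fromstring(f'<root xmlns:w="{W}">{fragment}</root>')
    fields, current = [], None
    for el in root.iter():  # document order
        if el.tag == f"{{{W}}}fldChar":
            kind = el.get(f"{{{W}}}fldCharType")
            if kind == "begin":
                current = []  # start collecting instruction pieces
            elif kind in ("separate", "end") and current is not None:
                fields.append("".join(current).strip())
                current = None
        elif el.tag == f"{{{W}}}instrText" and current is not None:
            current.append(el.text or "")
    return fields
```

Applied to the split runs above (with the preceding "begin" run included), this yields the single instruction `DOCPROPERTY "UohsDate" \* MERGEFORMAT` again.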

When does suppressLineNumbers not suppress line numbers?

I have a Word document with this style:
<w:style w:type="paragraph" w:customStyle="1" w:styleId="myStyle">
<w:name w:val="myStyle"/>
<w:basedOn w:val="ListParagraph"/>
<w:qFormat/>
<w:rsid w:val="005F5C84"/>
<w:pPr>
<w:widowControl/>
<w:numPr>
<w:numId w:val="1"/>
</w:numPr>
<w:suppressLineNumbers/>
<w:spacing w:before="240" w:line="360" w:lineRule="auto"/>
<w:ind w:left="0" w:firstLine="720"/>
<w:contextualSpacing w:val="0"/>
</w:pPr>
</w:style>
Because of the <w:suppressLineNumbers/> tag, I'd expect that Word would not show line numbers, but Word does show the line numbers.
How do I know when <w:suppressLineNumbers/> should take effect? I'm not finding any good documentation on this.
Below is the paragraph in case that helps.
<w:p ...>
<w:pPr>
<w:pStyle w:val="myStyle"/>
</w:pPr>
<w:r w:rsidRPr="00126430">
<w:t xml:space="preserve">Text</w:t>
</w:r>
</w:p>
Word line numbering is set per section, not per style (see Microsoft's "Add or remove line numbers" article).
It's possible to drop this into the XML of a paragraph to suppress line numbering for that paragraph:
<w:pPr>
<w:suppressLineNumbers/>
</w:pPr>
But there's nothing analogous for use in a style. There's no online documentation for using that parameter in a style, either.
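If you want to apply the per-paragraph approach programmatically, something like the following might work. It is only a sketch: `suppress_line_numbers` is a made-up helper, and the list of elements that must precede w:suppressLineNumbers inside w:pPr reflects my reading of the schema order, so treat it as an assumption and validate the result.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)  # keep the "w" prefix when serializing

# Children assumed to come before suppressLineNumbers inside w:pPr.
BEFORE = {f"{{{W}}}{name}" for name in (
    "pStyle", "keepNext", "keepLines", "pageBreakBefore",
    "framePr", "widowControl", "numPr")}


def suppress_line_numbers(p):
    """Add <w:suppressLineNumbers/> to the paragraph element `p` in place."""
    ppr = p.find(f"{{{W}}}pPr")
    if ppr is None:
        # No paragraph properties yet: pPr must be the first child of w:p.
        ppr = ET.Element(f"{{{W}}}pPr")
        p.insert(0, ppr)
    tag = f"{{{W}}}suppressLineNumbers"
    if ppr.find(tag) is None:
        # Insert right after the last element that has to precede it.
        index = 0
        for i, child in enumerate(ppr):
            if child.tag in BEFORE:
                index = i + 1
        ppr.insert(index, ET.Element(tag))
    return p
```

Calling it twice on the same paragraph is harmless, since the element is only added when it is missing.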

Word ooxml difference between Mac and Windows

When getting the OOXML from a document, I've found a discrepancy between platforms.
Take a document with a cross-reference that is not yet broken but will break when updated (that is, right-clicking it and choosing Update turns it into "Error! Reference source not found."). On Windows, iPad, and Word Online, the XML shows the displayed text value:
<w:r>
<w:fldChar w:fldCharType="begin"/>
</w:r>
<w:r>
<w:instrText xml:space="preserve"> REF _Ref274197646 \n \h </w:instrText>
</w:r>
<w:r>
<w:fldChar w:fldCharType="separate"/>
</w:r>
<w:bookmarkStart w:id="39" w:name="_Read0190056i"/>
<w:bookmarkStart w:id="40" w:name="_Read0190047i"/>
<w:r>
<w:t>(i)</w:t>
</w:r>
<w:bookmarkEnd w:id="39"/>
<w:bookmarkEnd w:id="40"/>
<w:r>
<w:fldChar w:fldCharType="end"/>
</w:r>
However, on the Mac, even though the screen displays the text, the XML shows the broken-reference text:
<w:r>
<w:rPr>
<w:b/>
<w:bCs/>
</w:rPr>
<w:t>Error! Reference source not found.</w:t>
</w:r>
This feels like a bug, as the different versions of Word should report largely the same XML.

Issue with listnum track changes in OpenXml(Revisions)

I turned on the Track Changes (Revisions) option in Word and made some changes. All of the changes were tracked and appear in the Open XML content, except that I do not see the deleted listnum value there, and the listnum values simply continue from the next paragraph. How can I track or get the deleted listnum value in Open XML?
More details on the issue: we have 5 paragraphs with listnums (a) to (e). I turned on Track Changes and deleted the listnum value (b), so the second paragraph now has no listnum. I expected to find the value (b) in the Open XML since Track Changes was on, but I am not able to get the deleted value (b) from it.
Thanks,
Manu
A single bullet point may use the following XML. It's a single Paragraph containing the text 'Item1' in a Run. The ParagraphProperties apply the style 'ListParagraph' and refer to a numbering definition:
<w:p>
<w:pPr>
<w:pStyle w:val="ListParagraph" />
<w:numPr>
<w:ilvl w:val="0" />
<w:numId w:val="1" />
</w:numPr>
</w:pPr>
<w:r>
<w:t>Item1</w:t>
</w:r>
</w:p>
If Track Changes is enabled and I delete the text 'Item1' I get xml like the following:
<w:p>
<w:pPr>
<w:pStyle w:val="ListParagraph" />
<w:pPrChange w:author="Daniel Brixen" w:date="2017-02-16T09:37:00Z" w:id="0">
<w:pPr>
<w:pStyle w:val="ListParagraph" />
<w:numPr>
<w:numId w:val="1" />
</w:numPr>
<w:ind w:hanging="360" />
</w:pPr>
</w:pPrChange>
</w:pPr>
<w:del w:author="Daniel Brixen" w:date="2017-02-16T09:37:00Z" w:id="2">
<w:r>
<w:delText>Item2</w:delText>
</w:r>
</w:del>
</w:p>
Two things to note:
The deleted text is in a DeletedRun-element
The change in paragraph-properties is recorded by a ParagraphPropertiesChange-element.
So you should be able to find the deleted text by using something like this:
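using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
// Open read-only (false), since nothing is modified here.
using (var doc = WordprocessingDocument.Open(@"c:\temp\test.docx", false))
{
    var deletedText = doc.MainDocumentPart.Document.Body.Descendants<DeletedText>();
    Console.WriteLine(String.Join(" ", deletedText.Select(t => t.Text)));
}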
Using Open XML Productivity Tool is helpful when debugging stuff like this.
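The numbering that was removed can be recovered in a similar way, because the old w:numPr survives inside w:pPrChange. As a rough illustration outside the SDK, here is a Python sketch using only the standard library; `removed_numbering` is a hypothetical helper. It reports the old numId together with the deleted text, not the rendered label such as "(b)", which would have to be computed from numbering.xml and the paragraph's position in the list.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def removed_numbering(fragment):
    """List (old numId, deleted text) for paragraphs whose numbering was removed."""
    # Wrap the fragment so the "w" prefix is declared.
    root = ET.fromstring(f'<root xmlns:w="{W}">{fragment}</root>')
    results = []
    for p in root.iterfind(".//w:p", NS):
        # Numbering before the tracked change (stored in pPrChange)...
        old_num = p.find("w:pPr/w:pPrChange/w:pPr/w:numPr/w:numId", NS)
        # ...and numbering in the current paragraph properties.
        new_num = p.find("w:pPr/w:numPr/w:numId", NS)
        if old_num is not None and new_num is None:
            deleted = "".join(
                t.text or "" for t in p.iterfind(".//w:delText", NS))
            results.append((old_num.get(f"{{{W}}}val"), deleted))
    return results
```

Run against the tracked-change paragraph shown above, this returns the old numId "1" paired with the text found in its w:delText element.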