Can the Open XML SDK be used to create XML files?

Is it possible to use the Open XML SDK to generate an XML file that contains some metadata of a particular docx file?
Details: I have a docx file from which I want to extract some metadata (using Open XML), output it as an XML file, and later use jQuery to present it in a more readable form.

You can use the SDK to extract information from the various properties parts that may be present in the docx (for example, the core properties part, which includes Dublin Core-style information).
You can extract it in its native XML form:
<cp:coreProperties
xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
xmlns:dc="http://purl.org/dc/elements/1.1/" .. >
<dc:creator>Joe</dc:creator>
<cp:lastModifiedBy>Joe</cp:lastModifiedBy>
<cp:revision>1</cp:revision>
<dcterms:created xsi:type="dcterms:W3CDTF">2010-11-10T00:32:00Z</dcterms:created>
<dcterms:modified xsi:type="dcterms:W3CDTF">2010-11-10T00:33:00Z</dcterms:modified>
</cp:coreProperties>
or, in some other XML dialect of your own choosing.
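As a rough illustration of the "XML dialect of your own choosing" route: a docx is a ZIP package, so the core properties part can also be read without the SDK. The sketch below uses only the Python standard library. It assumes the part sits at the conventional docProps/core.xml location, and the `metadata` element and the field names in `FIELDS` are made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the core properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical output element name -> element to look up in the core properties.
FIELDS = {
    "creator": "dc:creator",
    "lastModifiedBy": "cp:lastModifiedBy",
    "revision": "cp:revision",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def core_properties_as_xml(docx):
    """Return a small <metadata> XML string built from the core properties.

    `docx` is a path or a binary file-like object. Assumes the part is stored
    at docProps/core.xml, which is the usual location but not guaranteed.
    """
    with zipfile.ZipFile(docx) as package:
        core = ET.fromstring(package.read("docProps/core.xml"))

    root = ET.Element("metadata")
    for name, path in FIELDS.items():
        node = core.find(path, NS)
        # Skip properties that are absent or empty in this document.
        if node is not None and node.text:
            ET.SubElement(root, name).text = node.text
    return ET.tostring(root, encoding="unicode")
```

Calling `core_properties_as_xml("report.docx")` (file name hypothetical) gives a flat XML string that a page script can fetch and render.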

I know this question was posted a long time ago, but the first result of a Google search sent me here. So for others looking for a solution, there is a snippet on the MSDN website: https://msdn.microsoft.com/en-us/library/office/cc489219.aspx
Short answer: use XmlTextWriter. As far as I know, this also applies to Office 2013:
// Add the CoreFilePropertiesPart part in the new word processing document.
var coreFilePropPart = wordDoc.AddCoreFilePropertiesPart();
using (XmlTextWriter writer = new XmlTextWriter(coreFilePropPart.GetStream(FileMode.Create), System.Text.Encoding.UTF8))
{
writer.WriteRaw("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n<cp:coreProperties xmlns:cp=\"http://schemas.openxmlformats.org/package/2006/metadata/core-properties\"></cp:coreProperties>");
writer.Flush();
}

Related

Can OpenXML be used to launch a new Word instance?

I'm able to generate Word documents without issue. I save the resulting *.docx file to a temporary location and then need to launch the file in Word.
The requirement is not to "open" the file in Word (easily done with a Process.Start) but to have it load into Word as a new, unsaved file. This is because certain proprietary integrations for Word need to take over when a user saves the file, and they don't kick in if the file is already saved to a location on disk.
I've achieved this by using Interop calls to the Word application, adding the new document to Word's workspace. My problem is with Interop, which tends to break on various client machines, particularly when Office upgrades take place (say a client had 32-bit Office but upgraded to a 64-bit version).
I'm somewhat new to OpenXML, but can it be used to automate Word or is Interop my only real option?
object oFilename = tmpFileName;
object oNewTemplate = false;
object oDocumentType = 0;
object oVisible = true;
Document document = _application.Documents.Add(ref oFilename, ref oNewTemplate, ref oDocumentType, ref oVisible);
No, the Open XML technology has no way of interacting with the Office (Word) application - it's for file creation/manipulation, only. The interop is required in order to do anything with the Word application.
There is sort of a way around this, and it's only possible with Word (no other Office application has it): convert the Open XML content to the OPC flat-file format. This "concatenates" the various parts that make up the zip package into a pure text string, essentially a single XML file.
XML content in the OPC flat-file format can then be written to an already opened (even newly created) Word document using the Range.InsertXML method via "the interop". In a way, this "streams" the Open XML content into the opened Word document.
The problem with this approach is that certain document-level properties are not written to the target document, so not all aspects of the opened document can be changed. For example: page size, orientation, headers, footers... So if this kind of thing also needs to be affected the interop is required for such settings.
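To make the flat-file idea concrete, here is a minimal sketch of the conversion using only the Python standard library. It is an illustration under assumptions, not a tested replacement for a proper converter: it assumes every XML part is UTF-8 encoded, it leaves out the optional extras that Word writes into its own flat files, and whether a given result is accepted by Range.InsertXML has to be verified against Word itself.

```python
import base64
import re
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

CT = "{http://schemas.openxmlformats.org/package/2006/content-types}"
PKG = "http://schemas.microsoft.com/office/2006/xmlPackage"


def docx_to_flat_opc(docx):
    """Return the package at `docx` (path or file-like) as one flat OPC string."""
    with zipfile.ZipFile(docx) as package:
        # [Content_Types].xml maps extensions and part names to content types.
        types = ET.fromstring(package.read("[Content_Types].xml"))
        defaults = {d.get("Extension").lower(): d.get("ContentType")
                    for d in types.findall(CT + "Default")}
        overrides = {o.get("PartName"): o.get("ContentType")
                     for o in types.findall(CT + "Override")}

        parts = []
        for name in package.namelist():
            if name == "[Content_Types].xml" or name.endswith("/"):
                continue
            part_name = "/" + name
            extension = name.rsplit(".", 1)[-1].lower()
            content_type = (overrides.get(part_name)
                            or defaults.get(extension, "application/octet-stream"))
            data = package.read(name)
            if content_type.endswith("xml"):
                # Assumption: XML parts are UTF-8. Drop the XML declaration and
                # embed the markup verbatim so namespace prefixes are preserved.
                text = re.sub(r"^\s*<\?xml[^>]*\?>\s*", "", data.decode("utf-8-sig"))
                body = "<pkg:xmlData>" + text + "</pkg:xmlData>"
            else:
                # Binary parts (images and so on) are stored as base64 text.
                encoded = base64.b64encode(data).decode("ascii")
                body = "<pkg:binaryData>" + encoded + "</pkg:binaryData>"
            parts.append("<pkg:part pkg:name=%s pkg:contentType=%s>%s</pkg:part>"
                         % (quoteattr(part_name), quoteattr(content_type), body))

    return ('<?xml version="1.0" encoding="UTF-8"?>'
            '<pkg:package xmlns:pkg="%s">%s</pkg:package>' % (PKG, "".join(parts)))
```

The returned string is what would then be handed to the interop side; the Open XML side never talks to Word directly.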

Domino Document to MS Word

I need to export a Domino document with RTF (images, tables, etc.) to MS Word in background mode or via the Web. I have tried POI4XPages, but I don't know how to export it.
You need to write your POI document to an output stream. This could be a FileOutputStream or the response. When using the response, you need to set the header to reflect the file type. Look at XAgents for a sample of how to do that. See: https://www.wissel.net/blog/2008/12/xagents-web-agents-xpages-style.html

How to make a section optional when mapped to optional data in a Word OpenXml Part?

I'm using the OpenXml SDK to generate Word 2013 files. I'm running on a server (part of a server solution), so automation is not an option.
Basically I have an xml file that is output from a backend system. Here's a very simplified example:
<my:Data
xmlns:my="https://schemas.mycorp.com">
<my:Customer>
<my:Details>
<my:Name>Customer Template</my:Name>
</my:Details>
<my:Orders>
<my:Count>2</my:Count>
<my:OrderList>
<my:Order>
<my:Id>1</my:Id>
<my:Date>19/04/2017 10:16:04</my:Date>
</my:Order>
<my:Order>
<my:Id>2</my:Id>
<my:Date>20/04/2017 10:16:04</my:Date>
</my:Order>
</my:OrderList>
</my:Orders>
</my:Customer>
</my:Data>
Then I use Word's Xml Mapping pane to map this data to content control:
I simply duplicate the word file, and write new Xml data when generating new files.
This is working as expected. When I update the xml part, it reflects the data from my backend.
However, there's a case that does not work. If a customer has no orders, the template content is kept in the document. The XML data is:
<my:Data
xmlns:my="https://schemas.mycorp.com">
<my:Customer>
<my:Details>
<my:Name>Some customer</my:Name>
</my:Details>
<my:Orders>
<my:Count>0</my:Count>
<my:OrderList>
</my:OrderList>
</my:Orders>
</my:Customer>
</my:Data>
(see the empty order list).
In Word, the xml pane reflects the correct data (meaning no Order node):
But as you can see, the template content is still here.
Basically, I'd like to hide the order list when there are no orders (or at least show an empty table).
How can I do that?
PS: If it can help, I uploaded the word and xml files, and a small PowerShell script that injects the data : repro.zip
Thanks for sharing your files so we can better help you.
I had a difficult time trying to solve your problem with your existing Word Content Controls, XML files and the PowerShell script that added the XML to the Word document. I found what seemed to be Microsoft's VSTO example solution to your problem, but I couldn't get this to work cleanly.
I was however able to write a simple C# console application that generates a Word file based on your XML data. The OpenXML code to generate the Word file was generated code from the Open XML Productivity Tool. I then added some logic to read your XML file and generate the second table rows dynamically depending on how many orders there are in the data. I have uploaded the code for you to use if you are interested in this solution. Note: The xml data file should be in c:\temp and the generated word files will be in c:\temp also.
Another added bonus to this solution is if you were to add all of the customer data into one XML file, the application will create separate word files in your temp directory like so:
customer_<name1>.docx
customer_<name2>.docx
customer_<name3>.docx
etc.
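The decision logic described here (one output file per customer, table rows driven by the number of orders) can be sketched independently of the Word generation code. The snippet below is only an illustration in Python with the standard library, not the uploaded C# application: `plan_documents` and the keys of the returned dictionaries are invented for this example, and it assumes the `my:` namespace and element layout from the sample data in the question.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the sample data in the question.
NS = {"my": "https://schemas.mycorp.com"}


def plan_documents(xml_text):
    """Return one plan per customer: target file name, table rows, table flag."""
    root = ET.fromstring(xml_text)
    plans = []
    for customer in root.findall("my:Customer", NS):
        name = customer.findtext("my:Details/my:Name", default="", namespaces=NS)
        # One (id, date) tuple per order; empty when <my:OrderList> has no children.
        rows = [
            (order.findtext("my:Id", namespaces=NS),
             order.findtext("my:Date", namespaces=NS))
            for order in customer.findall("my:Orders/my:OrderList/my:Order", NS)
        ]
        plans.append({
            "file": "customer_%s.docx" % name,
            "rows": rows,
            # The generator can skip the orders table entirely when this is False.
            "include_orders_table": bool(rows),
        })
    return plans
```

Whether an empty order list should drop the table or leave a single empty row is a design choice; the flag just makes that choice explicit for whatever code builds the document.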
Here is the document generated from the first xml file
Here is the document generated from the second xml file with the empty row
Hope this helps.

iTextSharp: Problem loading XMP into a PDF with C#

I am using iTextSharp to load XMP into a PDF file.
Reference: Is it possible to load XMP file in PDF using iTextSharp?
Following those instructions, I loaded the XMP data into the PDF file, but there is one problem.
In the Keywords section, "; " (a semicolon and a single space) is added as a prefix by default, as shown in the screenshot below.
PDF Properties Window:
XMP Sample I used to load:
I went through the source code to sort out this problem, but I couldn't; I am still searching. I also wanted to let the iTextSharp author know, which is why I am posting this question.
Note:
If I instead set the Keywords entry in the info dictionary with
Dictionary<String, String> info = reader.Info;
info.Add("Keywords", ",key1; key2");
it works fine.
The problem is probably caused by the XMP file you are adding. Adobe Reader is adding extra stuff to the keywords you define in the <dc:subject> based on what is present or missing in the pdf:Keywords attribute.
Please take a look at this example: xmp_metadata_added.pdf
This is what the XMP file looks like:
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.1.0-jc003">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about=""
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:pdf="http://ns.adobe.com/pdf/1.3/"
xmlns:xmp="http://ns.adobe.com/xap/1.0/"
dc:format="application/pdf"
pdf:Keywords="Metadata, iText, PDF"
pdf:Producer="iText® 5.5.1 ©2000-2014 iText Group NV (AGPL-version); modified using iText® 5.5.1 ©2000-2014 iText Group NV (AGPL-version)"
xmp:CreateDate="2014-05-16T17:04:59+01:00"
xmp:CreatorTool="My program using iText"
xmp:ModifyDate="2014-05-16T17:04:59+01:00"
xmp:MetadataDate="2014-05-16T17:04:59+01:00">
<dc:description>
<rdf:Alt>
<rdf:li xml:lang="x-default">This example shows how to add metadata</rdf:li>
</rdf:Alt>
</dc:description>
<dc:creator>
<rdf:Seq>
<rdf:li>Bruno Lowagie</rdf:li>
</rdf:Seq>
</dc:creator>
<dc:subject>
<rdf:Bag>
<rdf:li>Metadata</rdf:li>
<rdf:li>iText</rdf:li>
<rdf:li>PDF</rdf:li>
</rdf:Bag>
</dc:subject>
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="x-default">Hello World example</rdf:li>
</rdf:Alt>
</dc:title>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
You recognize:
<dc:subject>
<rdf:Bag>
<rdf:li>Metadata</rdf:li>
<rdf:li>iText</rdf:li>
<rdf:li>PDF</rdf:li>
</rdf:Bag>
</dc:subject>
But do you see:
pdf:Keywords="Metadata, iText, PDF"
You need that part too.
This is a screen shot with that part:
When I remove pdf:Keywords="Metadata, iText, PDF", I can reproduce your problem:
This proves that your problem is caused by your XMP file, not by iText.
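A quick way to check an XMP file for this before adding it to a PDF is to look for both pieces, the dc:subject bag and pdf:Keywords. The following is a minimal sketch in Python with the standard library only; `inspect_keywords` and the keys of its result are names made up for this example, and it looks for pdf:Keywords either as an attribute (as in the sample above) or as a child element of rdf:Description.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
RDF_DESCRIPTION = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}Description"
PDF_KEYWORDS = "{http://ns.adobe.com/pdf/1.3/}Keywords"


def inspect_keywords(xmp_text):
    """Report the dc:subject items and the pdf:Keywords value of an XMP packet."""
    root = ET.fromstring(xmp_text)
    subjects = [li.text for li in root.findall(".//dc:subject/rdf:Bag/rdf:li", NS)]

    keywords = None
    for description in root.iter(RDF_DESCRIPTION):
        # pdf:Keywords may be written as an attribute or as a child element.
        value = description.get(PDF_KEYWORDS)
        if value is None:
            value = description.findtext(PDF_KEYWORDS)
        if value is not None:
            keywords = value
            break

    return {
        "subjects": subjects,
        "keywords": keywords,
        # True when keywords exist only in dc:subject, the case reproduced above.
        "missing_pdf_keywords": bool(subjects) and keywords is None,
    }
```

If `missing_pdf_keywords` comes back true, adding a matching pdf:Keywords value to the XMP file is the fix suggested in this answer.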

Do we have any Equivalent of Response.AppendHeader in windows application

I came across this technique for converting a DataTable to Excel:
http://www26.brinkster.com/mvark/dyna/downloadasexcel.html
Is there an equivalent of Response.AppendHeader for a Windows application in C#?
Regards
Hema
The trick in the code sample you mention for dynamically generating an Excel file relies on the fact that documents can be converted from Word/Excel to HTML (File->Save As) and vice versa. Essentially, an HTML page containing Office XML is created, and in a web application a file download is triggered with the help of the following Response.AppendHeader statements:
Response.AppendHeader("Content-Type", "application/vnd.ms-excel");
Response.AppendHeader("Content-disposition", "attachment; filename=my.xls");
If you want to use this technique in a WinForms application, just save the string content as a text file and give the file an ".xls" extension. Replace the last 3 lines of the sample's Page_Load method with this line:
System.IO.File.WriteAllText(@"C:\Report.xls", strBody);
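For readers who want to see the whole idea in one place, here is a small language-neutral sketch in Python (standard library only). It builds a plain HTML table rather than the Office XML markup of the linked sample, and `build_excel_html` and `save_as_xls` are names invented for this example.

```python
from html import escape


def build_excel_html(headers, rows):
    """Return an HTML document holding one table with a header row and data rows."""
    head = "".join("<th>%s</th>" % escape(str(h)) for h in headers)
    body = "".join(
        "<tr>%s</tr>" % "".join("<td>%s</td>" % escape(str(cell)) for cell in row)
        for row in rows
    )
    return "<html><body><table><tr>%s</tr>%s</table></body></html>" % (head, body)


def save_as_xls(path, headers, rows):
    """Write the HTML to `path`; the caller chooses a name ending in .xls."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(build_excel_html(headers, rows))
```

In a desktop program there is no response to add headers to, so the file extension alone decides which application opens the file.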
HTH