How to remove macros from binary MS Office documents?

How can I programmatically remove macros from binary Microsoft Office documents (doc, xls, ppt)?
I do not want to do it for the XML format (xlsx, docx, pptx).
I do not want to disable macros in Office; I want to modify the files themselves and strip them of any macros.

This is assuming you want to automate the actual applications.
Add references to:
Microsoft.Office.Interop.Word
Microsoft.Office.Interop.Excel
Microsoft.Office.Interop.PowerPoint
Microsoft Visual Basic for Applications Extensibility Library
The following should get rid of your macro code for Excel. Automating Word and PowerPoint will be similar.
using System;
using System.Linq;
using Excel = Microsoft.Office.Interop.Excel;
using Word = Microsoft.Office.Interop.Word;
using PPT = Microsoft.Office.Interop.PowerPoint;
using Microsoft.Vbe.Interop;

namespace OfficeMacros
{
    class Program
    {
        static void Main(string[] args)
        {
        }

        // Note: Excel must have "Trust access to the VBA project object model"
        // enabled, otherwise reading workbook.VBProject fails.
        static void RemoveMacrosExcel(string fileName, string newFileName)
        {
            var excel = new Excel.Application();
            var workbook = excel.Workbooks.Open(fileName,
                Type.Missing, Type.Missing, Type.Missing, Type.Missing,
                Type.Missing, Type.Missing, Type.Missing, Type.Missing,
                Type.Missing, Type.Missing, Type.Missing, Type.Missing,
                Type.Missing, Type.Missing);

            // Snapshot the collection first: removing components while
            // enumerating the live collection can skip items.
            var components = workbook.VBProject.VBComponents.Cast<VBComponent>().ToList();
            foreach (VBComponent component in components)
            {
                switch (component.Type)
                {
                    case vbext_ComponentType.vbext_ct_StdModule:
                    case vbext_ComponentType.vbext_ct_MSForm:
                    case vbext_ComponentType.vbext_ct_ClassModule:
                        workbook.VBProject.VBComponents.Remove(component);
                        break;
                    default:
                        // Document modules (ThisWorkbook, sheets) cannot be removed,
                        // so clear their code instead.
                        if (component.CodeModule.CountOfLines > 0)
                        {
                            component.CodeModule.DeleteLines(1, component.CodeModule.CountOfLines);
                        }
                        break;
                }
            }

            // Save the stripped workbook under the new name and shut Excel down.
            workbook.Close(true, newFileName, Type.Missing);
            excel.Quit();

            // Release variables
            workbook = null;
            excel = null;
            // Collect garbage
            GC.Collect();
        }
    }
}
If you want to parse the binary file structures yourself, you will need to review these documents:
[MS-DOC]: Word (.doc) Binary File Format
[MS-XLS]: Excel Binary File Format (.xls) Structure Specification
[MS-PPT]: PowerPoint (.ppt) Binary File Format
If that seems too daunting, there is an excellent library created by DIaLOGIKa called b2xtranslator (Binary (doc, xls, ppt) to OpenXML Translator). It has most (if not all) of the binary structures for .doc, .xls, and .ppt mapped to C# objects.
While b2xtranslator's intention is to translate binary office documents to the newer OpenXML format, you could use the library to parse the documents and remove the macro elements yourself.
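Whichever route you take, it helps to confirm up front that a file really is in the binary format, since the question excludes the XML formats. The sketch below is only an illustration (the class and method names are made up for this example): it checks the first bytes of a stream against the compound file signature used by .doc, .xls and .ppt, and against the ZIP signature used by .docx, .xlsx and .pptx. Keep in mind that the compound file signature only identifies the container, so other compound files (an .msi, for instance) will match as well.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration; not part of any Office library.
public static class OfficeFormatSniffer
{
    // Signature of a compound file, the container used by .doc, .xls and .ppt.
    static readonly byte[] CompoundFileMagic =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // ZIP local file header signature, used by .docx, .xlsx and .pptx.
    static readonly byte[] ZipMagic = { 0x50, 0x4B, 0x03, 0x04 };

    public static bool IsCompoundFile(Stream stream)
    {
        return StartsWith(stream, CompoundFileMagic);
    }

    public static bool IsZipPackage(Stream stream)
    {
        return StartsWith(stream, ZipMagic);
    }

    // Reads magic.Length bytes from the start of a seekable stream and compares them.
    static bool StartsWith(Stream stream, byte[] magic)
    {
        var header = new byte[magic.Length];
        stream.Position = 0;
        int read = 0;
        while (read < header.Length)
        {
            int n = stream.Read(header, read, header.Length - read);
            if (n == 0)
            {
                return false; // stream is shorter than the signature
            }
            read += n;
        }
        return header.SequenceEqual(magic);
    }
}
```

Typical use would be to open the file with File.OpenRead and only hand it to the macro-stripping code when IsCompoundFile returns true.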

Related

How to edit pasted content using the Open XML SDK

I have a custom template in which I'd like to control (as best I can) the types of content that can exist in a document. To that end, I disable controls, and I also intercept pastes to remove some of those content types, e.g. charts. I am aware that this content can also be drag-and-dropped, so I also check for it later, but I'd prefer to stop or warn the user as soon as possible.
I have tried a few strategies:
RTF manipulation
Open XML manipulation
RTF manipulation is so far working fairly well, but I'd really prefer to use Open XML as I expect it to be more useful in the future. I just can't get it working.
Open XML Manipulation
The wonderfully-undocumented (as far as I can tell) "Embed Source" appears to contain a compound document object, which I can use to modify the copied content using the Open XML SDK. But I have been unable to put the modified content back into an object that lets it be pasted correctly.
The modification part seems to work fine. I can see, if I save the modified content to a temporary .docx file, that the changes are being made correctly. It's the return to the clipboard that seems to be giving me trouble.
I have tried assigning just the Embed Source object back to the clipboard (so that the other types such as RTF get wiped out), and in this case nothing at all gets pasted. I've also tried re-assigning the Embed Source object back to the clipboard's data object, so that the remaining data types are still there (but with mismatched content, probably), which results in an empty embedded document getting pasted.
Here's a sample of what I'm doing with Open XML:
using OpenMcdf;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
...
Forms.IDataObject dataObj = Forms.Clipboard.GetDataObject();
object embedSrcObj = dataObj.GetData("Embed Source");
if (embedSrcObj is Stream)
{
    // read it with OpenMCDF
    Stream stream = embedSrcObj as Stream;
    CompoundFile cf = new CompoundFile(stream);
    CFStream cfs = cf.RootStorage.GetStream("package");
    byte[] bytes = cfs.GetData();
    string savedDoc = Path.GetTempFileName() + ".docx";
    File.WriteAllBytes(savedDoc, bytes);
    // And then use the OpenXML SDK to read/edit the document:
    using (WordprocessingDocument openDoc = WordprocessingDocument.Open(savedDoc, true))
    {
        OpenXmlElement body = openDoc.MainDocumentPart.RootElement.ChildElements[0];
        foreach (OpenXmlElement ele in body.ChildElements)
        {
            if (ele is Paragraph)
            {
                Paragraph para = (Paragraph)ele;
                if (para.ParagraphProperties != null && para.ParagraphProperties.ParagraphStyleId != null)
                {
                    string styleName = para.ParagraphProperties.ParagraphStyleId.Val;
                    Run run = para.LastChild as Run; // I know I'm assuming things here but it's sufficient for a test case
                    run.RunProperties = new RunProperties();
                    run.RunProperties.AppendChild(new DocumentFormat.OpenXml.Wordprocessing.Text("test"));
                }
            }
            // etc.
        }
        openDoc.MainDocumentPart.Document.Save(); // I think this is redundant in later versions than what I'm using
    }
    // repackage the document
    bytes = File.ReadAllBytes(savedDoc);
    cf.RootStorage.Delete("Package");
    cfs = cf.RootStorage.AddStream("Package");
    cfs.Append(bytes);
    MemoryStream ms = new MemoryStream();
    cf.Save(ms);
    ms.Position = 0;
    dataObj.SetData("Embed Source", ms);
    // or,
    // Clipboard.SetData("Embed Source", ms);
}
Question
What am I doing wrong? Is this just a bad/unworkable approach?

OpenXml ChangeDocumentType

I need to convert a PowerPoint template from potx to pptx. As seen here: http://www.codeproject.com/Tips/366463/Create-PowerPoint-presentation-using-PowerPoint-te I have tried the following code. However, the resulting pptx document is invalid and can't be opened by PowerPoint. If I skip the newdoc.ChangeDocumentType line, the resulting document is valid but is not converted to pptx.
templateContentBytes is a byte array containing the content of the potx document.
tempPath points to its local copy.
using (var stream = new MemoryStream())
{
    stream.Write(templateContentBytes, 0, templateContentBytes.Length);
    using (var newdoc = PresentationDocument.Open(stream, true))
    {
        newdoc.ChangeDocumentType(PresentationDocumentType.Presentation);
        PresentationPart presentationPart = newdoc.PresentationPart;
        presentationPart.PresentationPropertiesPart.AddExternalRelationship(
            "http://schemas.openxmlformats.org/officeDocument/2006/" + "relationships/attachedTemplate",
            new Uri(tempPath, UriKind.Absolute));
        presentationPart.Presentation.Save();
        File.WriteAllBytes(tempPathResult, stream.ToArray());
    }
}
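I had the same problem. Just move
File.WriteAllBytes(tempPathResult, stream.ToArray());
outside of the inner using block (the one that opens the PresentationDocument). The package only writes its pending changes to the stream when it is closed, so copying the bytes before that point produces an incomplete file.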
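To see why the position of that line matters, here is a small self-contained sketch. It does not use the Open XML SDK; it uses ZipArchive from System.IO.Compression as a stand-in, on the reasoning that a pptx is a ZIP package and shows the same behaviour: the container is only completed when the archive object is disposed. The class and method names are invented for the illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Illustration only: ZipArchive stands in for PresentationDocument here.
public static class FlushOnDisposeDemo
{
    // Builds a one-entry ZIP in memory and copies the buffer either while
    // the archive is still open or after it has been disposed.
    public static byte[] Build(bool copyInsideUsing)
    {
        using (var stream = new MemoryStream())
        {
            byte[] early = null;
            using (var zip = new ZipArchive(stream, ZipArchiveMode.Create, leaveOpen: true))
            {
                using (var writer = new StreamWriter(zip.CreateEntry("hello.txt").Open()))
                {
                    writer.Write("hello");
                }
                if (copyInsideUsing)
                {
                    // Too early: the central directory has not been written yet.
                    early = stream.ToArray();
                }
            }
            // After the archive is disposed the stream holds a complete ZIP.
            return early ?? stream.ToArray();
        }
    }

    // Returns true when the bytes can be opened as a ZIP with exactly one entry.
    public static bool IsReadableZip(byte[] bytes)
    {
        try
        {
            using (var zip = new ZipArchive(new MemoryStream(bytes), ZipArchiveMode.Read))
            {
                return zip.Entries.Count == 1;
            }
        }
        catch (InvalidDataException)
        {
            return false;
        }
    }
}
```

IsReadableZip(Build(true)) reports an unreadable archive, while IsReadableZip(Build(false)) succeeds, which mirrors the invalid pptx from the question and the fix from the answer.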

How can I set XFA data in a static XFA form in iTextSharp and get it to save?

I'm having a very strange issue with XFA Forms in iText / iTextSharp (iTextSharp 5.3.3 via NuGet). I am trying to fill out a static XFA styled form, however my changes are not taking.
I have both editions of iText in Action and have been consulting the second edition as well as the iTextSharp code sample conversions from the book.
Background: I have an XFA Form that I can fill out manually using Adobe Acrobat on my computer. Using iTextSharp I can read what the Xfa XML data is and see the structure of the data. I am essentially trying to mimic that with iText.
What the data looks like when I add data and save in Acrobat (note: this is only the specific section for datasets)
Here is the XML file I am trying to read in to replace the existing data (note: this is the entire contents of that file):
However, when I pass in the path to the replacement XML file and try to set the data, the new file (a copy of the original with the data replaced) is created without any errors being thrown, but the data is not updated. I can see that the new file is created and I can open it, but there is no data in the file.
Here is the code being utilized to replace the data or populate for the first time, which is a variation of http://sourceforge.net/p/itextsharp/code/HEAD/tree/trunk/book/iTextExamplesWeb/iTextExamplesWeb/iTextInAction2Ed/Chapter08/XfaMovie.cs
public void Generate(string sourceFilePath, string destinationFilePath, string replacementXmlFilePath)
{
    PdfReader pdfReader = new PdfReader(sourceFilePath);
    using (MemoryStream ms = new MemoryStream())
    {
        using (PdfStamper stamper = new PdfStamper(pdfReader, ms))
        {
            XfaForm xfaForm = new XfaForm(pdfReader);
            XmlDocument doc = new XmlDocument();
            doc.Load(replacementXmlFilePath);
            xfaForm.DomDocument = doc;
            xfaForm.Changed = true;
            XfaForm.SetXfa(xfaForm, stamper.Reader, stamper.Writer);
        }
        var bytes = ms.ToArray();
        File.WriteAllBytes(destinationFilePath, bytes);
    }
}
Any help would be very much appreciated.
I found the issue. The replacement DomDocument needs to be the entire merged XML of the new document, not just the data or datasets portion.
I upvoted your answer, because it's not incorrect (I'm happy my reference to the demo led you to have another look at your code), but now that I have a second look at your original code, I think it's better to use the book example:
public byte[] ManipulatePdf(String src, String xml) {
    PdfReader reader = new PdfReader(src);
    using (MemoryStream ms = new MemoryStream()) {
        using (PdfStamper stamper = new PdfStamper(reader, ms)) {
            AcroFields form = stamper.AcroFields;
            XfaForm xfa = form.Xfa;
            xfa.FillXfaForm(XmlReader.Create(new StringReader(xml)));
        }
        return ms.ToArray();
    }
}
As you can see, it's not necessary to replace the whole XFA XML. If you use the FillXfaForm method, the data is sufficient.
Note: for the C# version of the examples, see http://tinyurl.com/iiacsCH08 (change the 08 into a number from 01 to 16 for the examples of the other chapters).

OpenXML SDK and MathML

I use MathML to create some data blocks and I need to insert them through the OpenXML SDK into a docx file. I've heard it is possible, but I haven't managed it. Could somebody help me with this problem?
As far as I know, the OpenXml SDK does not support presentation MathML out of the box. Instead, the OpenXml SDK supports Office MathML. So, to insert presentation MathML into a Word document, we first have to transform the presentation MathML into Office MathML.
Fortunately, Microsoft provides an XSL file (called MML2OMML.xsl) to transform presentation MathML into Office MathML. The file MML2OMML.xsl is located under %ProgramFiles%\Microsoft Office\Office12. In conjunction with the .NET Framework class XslCompiledTransform, we are able to transform presentation MathML into Office MathML.
The next step is to create an OfficeMath object from the transformed MathML. The OfficeMath class represents a run containing WordprocessingML which shall be handled as though it was Office Open XML Math. For more info please refer to MSDN.
The presentation MathML does not contain font information. To get a nice result, we must add font information to the created OfficeMath object.
In the last step we have to add the OfficeMath object to our Word document. In the example below I simply search for the first Paragraph in a Word document called template.docx and add the OfficeMath object to the found paragraph.
// Namespaces used by this snippet.
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Xsl;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

XslCompiledTransform xslTransform = new XslCompiledTransform();
// The MML2OMML.xsl file is located under
// %ProgramFiles%\Microsoft Office\Office12\
xslTransform.Load("MML2OMML.xsl");
// Load the file containing your MathML presentation markup.
using (XmlReader reader = XmlReader.Create(File.Open("mathML.xml", FileMode.Open)))
{
    using (MemoryStream ms = new MemoryStream())
    {
        XmlWriterSettings settings = xslTransform.OutputSettings.Clone();
        // Configure xml writer to omit xml declaration.
        settings.ConformanceLevel = ConformanceLevel.Fragment;
        settings.OmitXmlDeclaration = true;
        XmlWriter xw = XmlWriter.Create(ms, settings);
        // Transform our MathML to OfficeMathML
        xslTransform.Transform(reader, xw);
        // Make sure any buffered output reaches the stream before reading it back.
        xw.Flush();
        ms.Seek(0, SeekOrigin.Begin);
        StreamReader sr = new StreamReader(ms, Encoding.UTF8);
        string officeML = sr.ReadToEnd();
        Console.Out.WriteLine(officeML);
        // Create a OfficeMath instance from the
        // OfficeMathML xml.
        DocumentFormat.OpenXml.Math.OfficeMath om =
            new DocumentFormat.OpenXml.Math.OfficeMath(officeML);
        // Add the OfficeMath instance to our
        // word template.
        using (WordprocessingDocument wordDoc =
            WordprocessingDocument.Open("template.docx", true))
        {
            DocumentFormat.OpenXml.Wordprocessing.Paragraph par =
                wordDoc.MainDocumentPart.Document.Body.Descendants<DocumentFormat.OpenXml.Wordprocessing.Paragraph>().FirstOrDefault();
            foreach (var currentRun in om.Descendants<DocumentFormat.OpenXml.Math.Run>())
            {
                // Add font information to every run.
                DocumentFormat.OpenXml.Wordprocessing.RunProperties runProperties2 =
                    new DocumentFormat.OpenXml.Wordprocessing.RunProperties();
                RunFonts runFonts2 = new RunFonts() { Ascii = "Cambria Math", HighAnsi = "Cambria Math" };
                runProperties2.Append(runFonts2);
                currentRun.InsertAt(runProperties2, 0);
            }
            par.Append(om);
        }
    }
}

How to limit export formats in crystal reports

With Crystal Reports for Visual Studio .NET 2005, you can export a report to various file formats such as PDF, Excel, Word, RPT, etc. If I want to limit the user to only the Excel and Word formats and set the default file format to Excel, is there a way to do it? Sometimes too many choices are not good, are they?
Using CRVS2010, you can remove unwanted export options.
A new feature of CRVS2010 is the ability to modify the available export formats from the viewer export button. The following C# sample code demonstrates how to set the CrystalReportViewer to export only to PDF and Excel file formats:
int exportFormatFlags = (int)(CrystalDecisions.Shared.ViewerExportFormats.PdfFormat | CrystalDecisions.Shared.ViewerExportFormats.ExcelFormat);
CrystalReportViewer1.AllowedExportFormats = exportFormatFlags;
For more details, please refer to the link below:
http://scn.sap.com/community/crystal-reports-for-visual-studio/blog/2011/01/26/export-file-formats-for-sap-crystal-reports-for-vs-2010
Try this:
Dim formats As Integer
formats = (CrystalDecisions.Shared.ViewerExportFormats.PdfFormat Or CrystalDecisions.Shared.ViewerExportFormats.XLSXFormat)
CrystalReportViewer1.AllowedExportFormats = formats
You don't mention whether you are using C# / VB.NET or Web/WinForms.
C#
I don't think this is possible. You would have to implement your own Export Button.
Something along the lines of this MSDN article
C#
// Declare variables and get the export options.
ExportOptions exportOpts = new ExportOptions();
ExcelFormatOptions excelFormatOpts = new ExcelFormatOptions ();
DiskFileDestinationOptions diskOpts = new DiskFileDestinationOptions();
exportOpts = Report.ExportOptions;
// Set the excel format options.
excelFormatOpts.ExcelUseConstantColumnWidth = true;
exportOpts.ExportFormatType = ExportFormatType.Excel;
exportOpts.FormatOptions = excelFormatOpts;
// Set the disk file options and export.
exportOpts.ExportDestinationType = ExportDestinationType.DiskFile;
diskOpts.DiskFileName = fileName;
exportOpts.DestinationOptions = diskOpts;
Report.Export ();
VB.NET
' Declare variables and get the export options.
Dim exportOpts As New ExportOptions()
Dim diskOpts As New DiskFileDestinationOptions()
Dim excelFormatOpts As New ExcelFormatOptions()
exportOpts = Report.ExportOptions
' Set the excel format options.
excelFormatOpts.ExcelTabHasColumnHeadings = true
exportOpts.ExportFormatType = ExportFormatType.Excel
exportOpts.FormatOptions = excelFormatOpts
' Set the export format.
exportOpts.ExportFormatType = ExportFormatType.Excel
exportOpts.ExportDestinationType = ExportDestinationType.DiskFile
' Set the disk file options.
diskOpts.DiskFileName = fileName
exportOpts.DestinationOptions = diskOpts
Report.Export()
VB.NET
You used to be able to remove certain export DLLs from the client installation, i.e. remove all apart from the Excel DLLs, and then Excel would be the only export option displayed.
To disable the Crystal Reports RPT format, try this:
Dim formats As Integer
formats = (CrystalDecisions.Shared.ViewerExportFormats.AllFormats Xor CrystalDecisions.Shared.ViewerExportFormats.RptFormat)
CrystalReportViewer1.AllowedExportFormats = formats
Or the short version:
CrystalReportViewer1.AllowedExportFormats = (CrystalDecisions.Shared.ViewerExportFormats.AllFormats Xor CrystalDecisions.Shared.ViewerExportFormats.RptFormat)
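The answers above all come down to building an integer bit mask for AllowedExportFormats. To make the arithmetic concrete, here is a minimal C# sketch. The enum is a hypothetical stand-in with made-up values, not the real CrystalDecisions.Shared.ViewerExportFormats, so only the pattern carries over. It also shows one caveat of the Xor approach: Xor toggles the bit, so it only removes a format that is currently set, whereas And with the complement clears the bit either way.

```csharp
using System;

// Hypothetical stand-in for ViewerExportFormats; the real member values may differ.
[Flags]
public enum DemoExportFormats
{
    NoFormat = 0,
    Pdf = 1,
    Excel = 2,
    Word = 4,
    Rpt = 8,
    AllFormats = Pdf | Excel | Word | Rpt
}

public static class ExportMaskDemo
{
    // Allow only the listed formats (the "Or" approach).
    public static int Only(params DemoExportFormats[] formats)
    {
        var mask = DemoExportFormats.NoFormat;
        foreach (var format in formats)
        {
            mask |= format;
        }
        return (int)mask;
    }

    // Allow everything except one format by clearing its bit.
    public static int AllExcept(DemoExportFormats excluded)
    {
        return (int)(DemoExportFormats.AllFormats & ~excluded);
    }

    // The "Xor" approach: toggles the bit, so it adds the format
    // back if the mask did not contain it in the first place.
    public static int Toggle(int mask, DemoExportFormats format)
    {
        return mask ^ (int)format;
    }
}
```

With these demo values, Only(DemoExportFormats.Excel, DemoExportFormats.Word) gives 6, and both AllExcept(DemoExportFormats.Rpt) and Toggle((int)DemoExportFormats.AllFormats, DemoExportFormats.Rpt) give 7. Applying Toggle a second time returns 15, which is why clearing the bit is the safer choice when the starting mask is not known.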