Word ExportAsFixedFormat - powershell

I'm trying to write a script that exports corrected Word files to PDF, but with "SIMPLE" revision marks only.
For now I use ExportAsFixedFormat() from Microsoft, but the WdExportItem option is binary (0 or 7): ALL revision marks or none.
Does anyone know of an API that would help me achieve this?
Below is my PowerShell script:
$path = 'C:\path'
$wd = New-Object -ComObject Word.Application
Get-ChildItem -Path $path -Include *.doc, *.docx -Recurse |
ForEach-Object {
$doc = $wd.Documents.Open($_.Fullname)
$pdf = $_.FullName -replace $_.Extension, '.pdf'
$doc.ExportAsFixedFormat($pdf,17,$false,0,0,0,0,7,$false, $false,0,$false, $true)
$doc.Close()
}
$wd.Quit()

I'm using Word 2019
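As a reading aid for the positional call in the script above, the sketch below pairs each of its 13 arguments with the parameter name from Microsoft's documentation for Document.ExportAsFixedFormat. The variable `$exportArgs` and the output path are hypothetical placeholders introduced here, not part of the original script.

```powershell
# Reading aid only: the positional arguments of the question's call,
# paired with the documented parameter names. The path is a placeholder.
$exportArgs = [ordered]@{
    'OutputFileName'     = 'C:\path\example.pdf'
    'ExportFormat'       = 17      # wdExportFormatPDF
    'OpenAfterExport'    = $false
    'OptimizeFor'        = 0       # wdExportOptimizeForPrint
    'Range'              = 0       # wdExportAllDocument
    'From'               = 0
    'To'                 = 0
    'Item'               = 7       # wdExportDocumentWithMarkup (0 = wdExportDocumentContent)
    'IncludeDocProps'    = $false
    'KeepIRM'            = $false
    'CreateBookmarks'    = 0       # wdExportCreateNoBookmarks
    'DocStructureTags'   = $false
    'BitmapMissingFonts' = $true
}

# Print one "name = value" line per argument, in call order.
$exportArgs.GetEnumerator() | ForEach-Object { '{0,-18} = {1}' -f $_.Key, $_.Value }
```

Seen this way, the Item argument is the only switch that controls markup, which is why the export alone cannot select a "simple" markup view.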
Disclaimer: The settings that my program changes to meet the
requirements seem to be sticky, i.e., although I don't save the doc
after changing the markup settings, they persist across closing and
re-opening the document. My program does not attempt to put the markup
settings back to their original settings.
The trick here is to programmatically set Markup Insertions to None and Markup Deletions to Hidden.
These two lines of code accomplish this. See the Code section below for the complete, working, and tested program.
$wordApp.Options.InsertedTextMark = [Microsoft.Office.Interop.Word.WdInsertedTextMark]::wdInsertedTextMarkNone
$wordApp.Options.DeletedTextMark = [Microsoft.Office.Interop.Word.WdDeletedTextMark]::wdDeletedTextMarkHidden
[Screenshot: the corresponding settings in Word]
[Image: sample docx input]
[Image: sample pdf output]
Code
cls
try
{
$path = 'C:\temp\'
$Error.Clear()
$wordApp = New-Object -ComObject Word.Application
$wordApp.Visible = $false
$docOpen = $false
$wordDocFqPathList = @(Get-ChildItem -Path $path -Include *.doc, *.docx -Recurse)
foreach ($wordDocFqPath in $wordDocFqPathList)
{
$doc = $wordApp.Documents.Open($wordDocFqPath.FullName, $false, $true)
$docOpen = $true
$doc.Activate()
$doc.ActiveWindow.View.Type = [Microsoft.Office.Interop.Word.WdViewType]::wdPrintView
$doc.ShowRevisions = $true
#set tracked changes to show change bars only
$doc.ActiveWindow.View.RevisionsFilter.View = [Microsoft.Office.Interop.Word.WdRevisionsView]::wdRevisionsViewFinal
$doc.ActiveWindow.View.RevisionsFilter.Markup = [Microsoft.Office.Interop.Word.WdRevisionsMarkup]::wdRevisionsMarkupSimple
$wordApp.Options.InsertedTextMark = [Microsoft.Office.Interop.Word.WdInsertedTextMark]::wdInsertedTextMarkNone
$wordApp.Options.DeletedTextMark = [Microsoft.Office.Interop.Word.WdDeletedTextMark]::wdDeletedTextMarkHidden
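# ChangeExtension only touches the extension, regardless of case or of ".doc" appearing elsewhere in the path
$pdfDocFqPath = [System.IO.Path]::ChangeExtension($wordDocFqPath.FullName, '.pdf')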
#https://learn.microsoft.com/en-us/office/vba/api/word.document.exportasfixedformat
$doc.ExportAsFixedFormat($pdfDocFqPath,`
[Microsoft.Office.Interop.Word.WdExportFormat]::wdExportFormatPDF,`
$false,`
[Microsoft.Office.Interop.Word.WdExportOptimizeFor]::wdExportOptimizeForPrint,`
[Microsoft.Office.Interop.Word.WdExportRange]::wdExportAllDocument,`
0, 0,`
[Microsoft.Office.Interop.Word.WdExportItem]::wdExportDocumentWithMarkup,`
$true, $false)
$doc.Close([Microsoft.Office.Interop.Word.WdSaveOptions]::wdDoNotSaveChanges)
$docOpen = $false
}
}
finally
{
if ($docOpen -eq $true)
{
$doc.Close([Microsoft.Office.Interop.Word.WdSaveOptions]::wdDoNotSaveChanges)
}
$wordApp.Quit()
}
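Regarding the disclaimer about sticky settings: one possible way to put the markup options back is to remember the original values and restore them in a finally block. The sketch below is a minimal, untested-against-Word illustration; the function name `Invoke-WithTemporaryOptions` and its parameters are hypothetical, and it assumes the option properties can be read and assigned by name, as they are in the program above.

```powershell
# Hypothetical helper: apply temporary property values to an options object,
# run an action, then restore the original values even if the action fails.
function Invoke-WithTemporaryOptions {
    param(
        [Parameter(Mandatory)] $Options,                # e.g. $wordApp.Options
        [Parameter(Mandatory)] [hashtable] $Temporary,  # property name -> temporary value
        [Parameter(Mandatory)] [scriptblock] $Action
    )
    # Remember the current value of every property we are about to change.
    $saved = @{}
    foreach ($name in $Temporary.Keys) { $saved[$name] = $Options.$name }
    try {
        foreach ($name in $Temporary.Keys) { $Options.$name = $Temporary[$name] }
        & $Action
    }
    finally {
        # Always put the original values back.
        foreach ($name in $saved.Keys) { $Options.$name = $saved[$name] }
    }
}
```

In the program above, the export loop would go in the action, with `$wordApp.Options` as the options object and the InsertedTextMark and DeletedTextMark values from the two lines shown earlier as the temporary settings. Whether this fully undoes the stickiness described in the disclaimer is an assumption that would need checking in Word.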

Related

Powershell Word SaveAs command errors when run using a service account

This has been driving me nuts for days... I have a PowerShell script that converts all .doc files in a target directory to PDFs using the Word SaveAs interop.
The script works fine when run in the context of the logged-in user, but errors with "You cannot call a method on a null-valued expression." when I try to execute it using a service account (via Task Scheduler, run as another user). The service account has local admin rights.
The exception occurs at this line: $Doc.SaveAs([ref]$Name.value,[ref]17)
My code is as follows. I'm not the best coder in the world, so any advice would be gratefully received.
Thanks.
try
{
$FileSource = 'D:\PROCESSOR\NewArrivals\*.doc'
$SuccessPath = 'D:\PROCESSOR\Success\'
$docextn='.doc'
$Files=Get-ChildItem -path $FileSource
$counter = 0
$filesProcessed = 0
$Word = New-Object -ComObject Word.Application
#check files exist to be processed.
$WordFileCount = Get-ChildItem $FileSource -Filter *$docextn -File| Measure-Object | %{$_.Count} -ErrorAction Stop
If ($WordFileCount -gt 0) {
Foreach ($File in $Files) {
$Name="$(($File.FullName).substring(0, $File.FullName.lastIndexOf("."))).pdf"
$Doc = $Word.Documents.Open($File.FullName)
$Doc.SaveAs([ref]$Name.value,[ref]17)
$Doc.Close()
if ($counter -gt 100) {
$counter = 0
$Word.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($Word)
$Word = New-Object -ComObject Word.Application
}
$counter = $counter + 1
$filesProcessed = $filesProcessed + 1
}
}
$Word.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($Word)
}
catch
{
}
finally
{
}
If you are certain the service account has access to Word, then I think the exception you encounter is caused by the [ref] in the SaveAs() call.
AFAIK only Office versions below 2010 need [ref]; later versions do not.
Next, I think your code can be tidied up somewhat, for instance by releasing the COM objects ($Doc and $Word) inside the finally block, as that is always executed.
Also, there is no need to perform a Get-ChildItem twice.
Something like this:
$SuccessPath = 'D:\PROCESSOR\Success'
$FileSource = 'D:\PROCESSOR\NewArrivals'
$filesProcessed = 0
try {
$Word = New-Object -ComObject Word.Application
$Word.Visible = $false
# get a list of FileInfo objects that have the .doc extension and loop through
Get-ChildItem -Path $FileSource -Filter '*.doc' -File | ForEach-Object {
# change the extension to pdf for the output file
$pdf = [System.IO.Path]::ChangeExtension($_.FullName, '.pdf')
$Doc = $Word.Documents.Open($_.FullName)
# Check the version of Office installed. Pre-2010 versions need the [ref].
# Cast to [version] so the comparison is numeric rather than string-based.
if ([version]$Word.Version -gt [version]'14.0') {
$Doc.SaveAs($pdf, 17)
}
else {
$Doc.SaveAs([ref]$pdf,[ref]17)
}
$Doc.Close($false)
$filesProcessed++
}
}
finally {
# cleanup code
if ($Word) {
$Word.Quit()
# $Doc stays null when no file was opened, and ReleaseComObject rejects null
if ($Doc) { $null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject($Doc) }
$null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject($Word)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
$Word = $null
}
}
Then, there is the question of $SuccessPath. You never use it. Is it your intention to save the PDF files in that path? If so, change the line
$pdf = [System.IO.Path]::ChangeExtension($_.FullName, '.pdf')
into
$pdf = Join-Path -Path $SuccessPath -ChildPath ([System.IO.Path]::ChangeExtension($_.Name, '.pdf'))
Hope that helps
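A note on the version check: Word reports its version as a string such as '16.0', and comparing version strings with -gt can mislead once the number of digits differs. The sketch below illustrates the difference with a literal standing in for `$Word.Version`; the variable names are hypothetical.

```powershell
# '9.0' stands in for the string Word 2000 would report as its version.
$reported = '9.0'

# Compared as strings, '9' sorts after '1', so 9.0 looks newer than 14.0.
$asString = $reported -gt '14.0'

# Cast to [version], the comparison is numeric per component.
$asVersion = [version]$reported -gt [version]'14.0'

"String comparison:  $asString"
"Version comparison: $asVersion"
```

The string comparison yields True and the [version] comparison yields False. For the two-digit version numbers of Office 2003 and later the two agree, so this matters mainly as a safeguard.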

Spellcheck content of Document in Powershell

I have added spell checking by reading the contents of a file using a PowerShell script.
This script does the job, but I want to know whether any external packages or modules are available for the same purpose, since that would make the work easier.
$file = Get-ChildItem ./Code-Duplication/master.md
$Proofread_text = Get-Content $file.FullName
$Word = New-Object -COM Word.Application
$Document = $Word.Documents.Add()
$Textrange = $Document.Range(0)
#$english = FindLanguage("English (US)")
#$Textrange.LanguageID = $english.ID
$Textrange.InsertAfter($Proofread_text)
<#Handle misspelled words here#>
$file.Name
Write-Output "---------------"
foreach($spell_error in $textrange.SpellingErrors){
Write-Host $spell_error.Text
}
$Document.Close(0)
$Word.Quit()

Replace specific Text in Header, Footer and normal Text in Word file

I'm trying to write a PowerShell script that replaces one string with another in Word files. I need to update more than 500 Word templates and files, so I don't want to do it by hand. One problem is that I can't find the text in the footer or header, because they are all individual and consist of tables with images. I managed to find the text in the normal "body" text but haven't replaced it yet. Here is my code for finding it.
$path = "C:\Users\BambergerSv\Desktop\PS\Vorlagen"
$files = Get-Childitem $path -Include *dotm, *docx, *.dot, *.doc, *.DOT, *DOTM, *.DOCX, *.DOC -Recurse |
Where-Object { !($_.PSIsContainer) }
$application = New-Object -ComObject Word.Application
$application.Visible = $true
$findtext = "www.subdomain.domain.com"
function getStringMatch {
foreach ($file In $files) {
#Write-Host $file.FullName
$document = $application.Documents.Open($file.FullName, $false, $true)
if ($document.Content.Text -match $findtext) {
Write-Host "found text in file " $file.FullName "`n"
}
try {
$application.Documents.Close()
} catch {
Write-Host $file.FullName "is a read only file" # if it is write-protected because of the macros
continue
}
}
$application.Quit()
}
getStringMatch
I searched the internet and found an answer to this question.
First, you need to understand VBA. Write the macro below in MS Word and then save it.
Public Function CustomReplace(findValue As String, replaceValue As String) As String
For Each myStoryRange In ActiveDocument.StoryRanges
myStoryRange.find.Execute FindText:=findValue, Forward:=True, ReplaceWith:=replaceValue, replace:=wdReplaceAll
While myStoryRange.find.Found
myStoryRange.find.Execute FindText:=findValue, Forward:=True, ReplaceWith:=replaceValue, replace:=wdReplaceAll
Wend
While Not (myStoryRange.NextStoryRange Is Nothing)
Set myStoryRange = myStoryRange.NextStoryRange
myStoryRange.find.Execute FindText:=findValue, Forward:=True, ReplaceWith:=replaceValue, replace:=wdReplaceAll
While myStoryRange.find.Found
myStoryRange.find.Execute FindText:=findValue, Forward:=True,ReplaceWith:=replaceValue, replace:=wdReplaceAll
Wend
Wend
Next myStoryRange
CustomReplace = ActiveDocument.FullName
End Function
After the above macro has been added to MS Word, go to PowerShell and execute the code below.
$word = New-Object -ComObject Word.Application
$word.visible=$false
$files = Get-ChildItem "C:\Users\Ali\Desktop\Test" -Filter *.docx
$find=[ref]"Hello"
$replace=[ref]"Hi"
for ($i=0; $i -lt $files.Count; $i++) {
$filename = $files[$i].FullName
$doc = $word.Documents.Open($filename)
$word.Run("CustomReplace",$find,$replace)
$doc.Save()
$doc.close()
}
$word.quit()

Powershell Script to change the font of word docs

Hi, I am trying to create a script that changes the font of all Word docs in a specific folder. I have managed to create one that changes the font for one document, but cannot work out how to do this for all files. My script is below.
$Folder = Read-Host "Select Folder name"
$test = Test-Path C:\Users\andy.burton\Desktop\Layouts\$Folder\
$File = @(
"0.bil","0.est","0.lbl","1.arm","1.bil","1.crd","1.env","1.est","1.frm","1.gp",
"1.hos","1.ins","1.lbl","1.lc","1.lmr","1.mls","1.NON","1.OP","1.pat","1.PRS",
"1.rcl","1.rec","1.rmd","1.stm","10.pat","10.rec","11.pat","11.rec","12.pat",
"12.rec","13.pat","13.rec","14.PAT","14.rec","15.pat","16.pat","17.pat","18.pat","2.arm","2.bil","2.env","2.est","2.frm","2.gp","2.hos","2.ins","2.lbl","2.lc","2.lmr","2.mls","2.NON","2.pat","2.rcl","2.rec","2.rmd","2.stm","3.arm","3.bil","3.env","3.est","3.gp","3.hos","3.ins","3.lbl","3.lc","3.lmr","3.NON","3.pat","3.rec","3.rmd","3.STM","4.arm","4.bil","4.env","4.est","4.gp","4.hos","4.ins","4.lbl","4.lc","4.lmr","4.non","4.pat","4.rec","4.rmd","4.STM","5.arm","5.bil","5.env","5.est","5.gp","5.hos","5.ins","5.lbl","5.lc","5.lmr","5.non","5.pat","5.rec","5.rmd","6.bil","6.env","6.est","6.lbl","6.pat","6.rec","7.env","7.lbl","7.pat","7.rec","8.ENV","8.LBL","8.pat","8.rec","9.env","9.lbl","9.pat","9.rec","acchead.doc","address.lbl","apptreminder.email","apptreminder.sms","BMIBOOK.FRM","BUPA.OCR","clinicprint.rep","clinicprint2.rep","clinicprint2B.rep","clinicprint2C.rep","CLUB.BIL","Consent1.doc","consent2.doc","DEPOSIT.REC","ebs021.doc","ebs022.doc","EBS023.DOC","FConsent1.DOC","FEBS021.DOC","FEBS023.DOC","GP.OP","HCABOOK.FRM","InvCen.doc","InvFoot.doc","INVOICE.LBL","InvoiceGrid.doc","InvoiceGridNonVat.doc","InvoiceGridVAT.doc","InvoiceTotals.doc","InvoiceTotalsNonVAT.doc","InvoiceTotalsVAT.doc","J8160-95.LBL","J8160.LBL","J8162-95.LBL","J8162.LBL","J8163-95.LBL","J8163.LBL","J8165-95.LBL","J8165.LBL","J8360-95.LBL","J8360.LBL","J8362-95.LBL","J8362.LBL","J8363-95.LBL","J8363.LBL","J8365-95.LBL","J8365.LBL","J8560-95.LBL","J8560.LBL","J8562-95.LBL","J8562.LBL","J8563-95.LBL","J8563.LBL","J8565-95.LBL","J8565.LBL","L7160-95.LBL","L7160.LBL","L7161-95.LBL","L7161.LBL","L7162-95.LBL","L7162.LBL","L7163-95.LBL","L7163.LBL","L7164-95.LBL","L7164.LBL","L7165-95.LBL","L7165.LBL","L7166-95.LBL","L7166.LBL","L7167-95.LBL","L7167.LBL","L7168-95.LBL","L7168.LBL","L8162-95.LBL","letfoot.doc","lethead.doc","MC.BIL","NHS.PRS","NUFFBOOK.FRM","OPNOTE.OP","RecCen.doc","RECEIPT2.CC","ReceiptTotals.doc","recfoot.doc","remfoot.doc","shoulder1.jpg","shoulder2.jpg","TDL.FRM","TDL2.FRM",
"theatreprint.rep","VOUCHER2.CC")
if($test -eq $False) {
Copy-Item -Path 'C:\Users\andy.burton\desktop\Layouts\Arial 10' -Destination C:\Users\andy.burton\Desktop\Layouts\$Folder\ -Recurse
$file.ForEach({
$Word = New-Object -ComObject Word.Application
$Word.Visible = $False
$Doc = $word.Documents.Open()
$Selection = $word.Selection
$Doc.Select()
$Selection.Font.Name = "Calibri"
$Selection.Font.Size = 11
$Doc.Close()
$Word.Quit()
})
}
Else {Write-Warning "Folder Already Exists"}
You will want to use a Foreach loop and loop through your $File array. You may also want to rename $File to $Files
foreach($file in $Files)
{
# perform font change here
}
I needed to change the font of a bunch of Word documents in a share, and the internet seems to point back to this post for results and references about changing the font of an entire Word document using PowerShell. The code and answer here did not show it very clearly, so I am contributing this code. It gets the Word documents in a folder that match the keywords you list in the $Files section, then changes the font to whatever you put in the $Selection.Font.Name = section. I modified the original code from this post to build this, and I have already tested it on a local folder as well as a server share. Please let me know if it can be improved upon. For instance, it would be nice to output to the screen each file that is being changed.
$Folder = "\\SERVER\SHARE\FOLDER\"
$Files = Get-ChildItem -Path $Folder -Recurse -Include *misc*.doc, *.misc*.docx, *Misc*.docx, *Misc*.doc, *MISC.docx, *MISC*.doc, *test*.doc, *test*.docx, whatever.doc
foreach($File in $Files)
{
$Word = New-Object -ComObject Word.Application
$Word.Visible = $false
$Doc = $Word.Documents.Open($File.FullName)
$Selection = $Word.Selection
$Doc.Select()
$Selection.Font.Name = "Verdana Pro SemiBold"
$Doc.Close()
$Word.Quit()
}
Write-Warning "Finished With Font Change"
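On the request to print each file as it is changed, one possible variation is sketched below. It is not the tested code above: it reuses a single Word instance passed in by the caller, sets the font through the document's Content range instead of the Selection, and saves explicitly before closing. The function name `Set-WordDocumentFont` and its parameters are hypothetical.

```powershell
# Hypothetical helper: change the font of each document and report progress.
# $Word is expected to be a Word.Application COM object created by the caller.
function Set-WordDocumentFont {
    param(
        [Parameter(Mandatory)] $Word,
        [Parameter(Mandatory)] [string[]] $Path,
        [Parameter(Mandatory)] [string] $FontName
    )
    foreach ($file in $Path) {
        # Print the file currently being changed.
        Write-Host "Changing font in $file"
        $doc = $Word.Documents.Open($file)
        try {
            # Content is the main text of the document as a range.
            $doc.Content.Font.Name = $FontName
            $doc.Save()
        }
        finally {
            $doc.Close()
        }
    }
}

# Possible usage with the variables from the answer above (requires Word):
#   $Word = New-Object -ComObject Word.Application
#   $Word.Visible = $false
#   try { Set-WordDocumentFont -Word $Word -Path $Files.FullName -FontName 'Verdana Pro SemiBold' }
#   finally { $Word.Quit() }
```

Starting Word once instead of once per file should shorten long runs, though that is an expectation rather than something measured here.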

Basic Powershell - batch convert Word Docx to PDF

I am trying to use PowerShell to do a batch conversion of Word Docx to PDF - using a script found on this site:
http://blogs.technet.com/b/heyscriptingguy/archive/2013/03/24/weekend-scripter-convert-word-documents-to-pdf-files-with-powershell.aspx
# Acquire a list of DOCX files in a folder
$Files=GET-CHILDITEM "C:\docx2pdf\*.DOCX"
$Word=NEW-OBJECT -COMOBJECT WORD.APPLICATION
Foreach ($File in $Files) {
# open a Word document, filename from the directory
$Doc=$Word.Documents.Open($File.fullname)
# Swap out DOCX with PDF in the Filename
$Name=($Doc.Fullname).replace("docx","pdf")
# Save this File as a PDF in Word 2010/2013
$Doc.saveas([ref] $Name, [ref] 17)
$Doc.close()
}
And I keep on getting this error and can't figure out why:
PS C:\docx2pdf> .\docx2pdf.ps1
Exception calling "SaveAs" with "16" argument(s): "Command failed"
At C:\docx2pdf\docx2pdf.ps1:13 char:13
+ $Doc.saveas <<<< ([ref] $Name, [ref] 17)
+ CategoryInfo : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : DotNetMethodException
Any ideas?
Also - how would I need to change it to also convert doc (not docX) files, as well as use the local files (files in same location as the script location)?
Sorry - never done PowerShell scripting...
This will work for doc as well as docx files.
$documents_path = 'c:\doc2pdf'
$word_app = New-Object -ComObject Word.Application
# This filter will find .doc as well as .docx documents
Get-ChildItem -Path $documents_path -Filter *.doc? | ForEach-Object {
$document = $word_app.Documents.Open($_.FullName)
$pdf_filename = "$($_.DirectoryName)\$($_.BaseName).pdf"
$document.SaveAs([ref] $pdf_filename, [ref] 17)
$document.Close()
}
$word_app.Quit()
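One reason to build the PDF name from the directory and base name, as above, rather than with String.Replace: Replace is case-sensitive and acts on the whole path. With the folder and uppercase extension used in the question, it rewrites the folder name and leaves the extension alone. This may be contributing to the "Command failed" error, although nothing in the thread confirms that. The file name `report.DOCX` below is a made-up example.

```powershell
# A made-up file in the question's folder, with the uppercase extension
# that the question's wildcard (*.DOCX) suggests.
$original = 'C:\docx2pdf\report.DOCX'

# String.Replace is case-sensitive and replaces every match in the path.
$viaReplace = $original.Replace('docx', 'pdf')

# ChangeExtension only swaps the part after the last dot.
$viaChangeExtension = [System.IO.Path]::ChangeExtension($original, '.pdf')

"Replace:         $viaReplace"
"ChangeExtension: $viaChangeExtension"
```

The Replace result is `C:\pdf2pdf\report.DOCX`, a path in a different folder that still ends in .DOCX, while ChangeExtension gives `C:\docx2pdf\report.pdf`.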
The above answers all fell short for me, as I was doing a batch job converting around 70,000 Word documents this way. As it turns out, doing this repeatedly eventually leads to Word crashing, presumably due to memory issues (the error was some COMException that I didn't know how to parse). So, my hack to get it to proceed was to kill and restart Word every 100 docs (an arbitrarily chosen number).
Additionally, when it did crash occasionally, it left behind malformed PDFs, each generally 1-2 KB in size. So, when skipping already generated PDFs, I make sure they are at least 3 KB in size. If you don't want to skip already generated PDFs, you can delete that if statement.
Excuse me if my code doesn't look good; I don't generally use Windows and this was a one-off hack. So, here's the resulting code:
$Files=Get-ChildItem -path '.\path\to\docs' -recurse -include "*.doc*"
$counter = 0
$filesProcessed = 0
$Word = New-Object -ComObject Word.Application
Foreach ($File in $Files) {
$Name="$(($File.FullName).substring(0, $File.FullName.lastIndexOf("."))).pdf"
if ((Test-Path $Name) -And (Get-Item $Name).length -gt 3kb) {
echo "skipping $($Name), already exists"
continue
}
echo "$($filesProcessed): processing $($File.FullName)"
$Doc = $Word.Documents.Open($File.FullName)
$Doc.SaveAs($Name, 17)
$Doc.Close()
if ($counter -gt 100) {
$counter = 0
$Word.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($Word)
$Word = New-Object -ComObject Word.Application
}
$counter = $counter + 1
$filesProcessed = $filesProcessed + 1
}
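The skip rule described above (an existing PDF counts as done only if it is larger than 3 KB) can also be written as a small predicate, which keeps the loop body shorter. This is a sketch; the function name `Test-ConvertedPdf` and its parameters are hypothetical.

```powershell
# Hypothetical predicate: treat a PDF as already converted only when the file
# exists and is larger than a minimum size (malformed outputs were 1-2 KB).
function Test-ConvertedPdf {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [long] $MinimumBytes = 3kb
    )
    if (-not (Test-Path -LiteralPath $Path -PathType Leaf)) {
        return $false
    }
    return (Get-Item -LiteralPath $Path).Length -gt $MinimumBytes
}
```

In the loop above, the condition would then read `if (Test-ConvertedPdf -Path $Name) { ... continue }`.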
This works for me (Word 2007):
$wdFormatPDF = 17
$word = New-Object -ComObject Word.Application
$word.visible = $false
$folderpath = Split-Path -parent $MyInvocation.MyCommand.Path
Get-ChildItem -path $folderpath -recurse -include "*.doc" | % {
$path = ($_.fullname).substring(0,($_.FullName).lastindexOf("."))
$doc = $word.documents.open($_.fullname)
$doc.saveas($path, $wdFormatPDF)
$doc.close()
}
$word.Quit()
None of the solutions posted here worked for me on Windows 8.1 (by the way, I'm using Office 365). My PowerShell somehow does not like the [ref] arguments (I don't know why; I use PowerShell very rarely).
This is the solution that worked for me:
$Files=Get-ChildItem 'C:\path\to\files\*.docx'
$Word = New-Object -ComObject Word.Application
Foreach ($File in $Files) {
$Doc = $Word.Documents.Open($File.FullName)
$Name=($Doc.FullName).replace('docx', 'pdf')
$Doc.SaveAs($Name, 17)
$Doc.Close()
}
I've updated this one to work with the latest Office:
# Get invocation path
$curr_path = Split-Path -parent $MyInvocation.MyCommand.Path
# Create a PowerPoint object
$ppt_app = New-Object -ComObject PowerPoint.Application
#$ppt_app.Visible = $false
# Get all objects of type .ppt? in $curr_path and its subfolders
Get-ChildItem -Path $curr_path -Recurse -Filter *.ppt? | ForEach-Object {
Write-Host "Processing" $_.FullName "..."
# Open it in PowerPoint
$document = $ppt_app.Presentations.Open($_.FullName,0,0,0)
# Create a name for the PDF document; they are stored in the invocation folder!
# If you want them to be created locally in the folders containing the source PowerPoint file, replace $curr_path with $_.DirectoryName
$pdf_filename = "$($curr_path)\$($_.BaseName).pdf"
# Save as PDF -- 32 is the literal value of `ppSaveAsPDF`
#$opt= [Microsoft.Office.Interop.PowerPoint.PpSaveAsFileType]::ppSaveAsPDF
$document.SaveAs($pdf_filename,32)
# Close PowerPoint file
$document.Close()
}
# Exit and release the PowerPoint object
$ppt_app.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($ppt_app)