Removing passwords from .docx files using PowerShell

I'm very new to PowerShell and have been banging my head against this for a while; hopefully someone can point out where I am going wrong. I am trying to use PowerShell to remove the opening passwords from multiple .docx files in a folder. I can get it to change the password to something else, but I cannot get it to remove the password entirely. The line $WordDoc.Password=$null below is where I am getting tripped up, and the error details are at the bottom. I appreciate any help with this!
$path = ("FilePath")
$passwd = ("CurrentPassword")
$counter=1
$WordObj = New-Object -ComObject Word.Application
foreach ($file in $count=Get-ChildItem $path -Filter *.docx) {
$WordObj.Visible = $true
$WordDoc = $WordObj.Documents.Open($file.FullName, $null, $false, $null, $passwd)
$WordDoc.Activate()
$WordDoc.Password=$null
$WordDoc.Close()
Write-Host("Finished: "+$counter+" of "+$count.Length)
$counter++
}
$WordObj.Application.Quit()
Error details: Object reference not set to an instance of an object. At line: 14 char: 5
+ $WordDoc.Password=$Null
+ Category info: Operations Stopped: (:) [], NullReferenceException
+ FullyQualifiedErrorId: System.NullReferenceException
I got an answer elsewhere to try using .unprotect instead but not sure how to insert this into my code!

$path = 'X:\TheFolderWhereTheProtectedDocumentsAre'
$passwd = 'CurrentPassword'
$counter = 0
$WordObj = New-Object -ComObject Word.Application
$WordObj.Visible = $false
# get the .docx files. Make sure this is an array using @()
$documentFiles = @(Get-ChildItem -Path $path -Filter '*.docx' -File)
foreach ($file in $documentFiles) {
try {
# add password twice, first for the document, next for the documents template
$WordDoc = $WordObj.Documents.Open($file.FullName, $null, $false, $null, $passwd, $passwd)
$WordDoc.Activate()
$WordDoc.Password = $null
$WordDoc.Close(-1) # wdSaveChanges, see https://learn.microsoft.com/en-us/office/vba/api/word.wdsaveoptions
$counter++
}
catch {
Write-Warning "Could not open file $($file.FullName):`r`n$($_.Exception.Message)"
}
}
Write-Host "Finished: $counter documents of $($documentFiles.Count)"
# quit Word and dispose of the used COM objects in memory
$WordObj.Quit()
# $WordDoc stays $null when no document could be opened, so guard the release
if ($WordDoc) { $null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($WordDoc) }
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($WordObj)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
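On the .Unprotect suggestion mentioned in the question: as far as I know, Document.Unprotect() lifts editing restrictions (the ones reported by ProtectionType) rather than the password needed to open the file, so it would complement the Password assignment instead of replacing it. Below is a minimal sketch of where such a call could sit inside the loop. It assumes the same $WordObj, $file and $passwd as in the script above, and it assumes any editing restriction uses the same password as the open password, which may not hold for your files.

```powershell
# Sketch only: assumes $WordObj, $file and $passwd are set up as in the script above.
$wdNoProtection = -1   # WdProtectionType.wdNoProtection

$WordDoc = $WordObj.Documents.Open($file.FullName, $null, $false, $null, $passwd, $passwd)

# Lift editing restrictions first, if the document has any.
# Assumption: they were set with the same password as the open password.
if ($WordDoc.ProtectionType -ne $wdNoProtection) {
    $WordDoc.Unprotect($passwd)
}

# Then clear the password required to open the file and save the change.
$WordDoc.Password = $null
$WordDoc.Close(-1)     # -1 = wdSaveChanges
```

If the documents only have an open password and no editing restriction, the Unprotect branch is simply never taken.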

Related

Powershell Word SaveAs command errors when run using a service account

This has been driving me nuts for days. I have a PowerShell script that converts all .doc files in a target directory to PDFs using Word SaveAs interop.
The script works fine when run in the context of the logged-in user, but fails with "You cannot call a method on a null-valued expression." when I run it under a service account (via Task Scheduler, run as another user). The service account has local admin rights.
The exception occurs at this line: $Doc.SaveAs([ref]$Name.value,[ref]17)
My code is as follows. I'm not the best coder in the world, so any advice would be gratefully received.
Thanks.
try
{
$FileSource = 'D:\PROCESSOR\NewArrivals\*.doc'
$SuccessPath = 'D:\PROCESSOR\Success\'
$docextn='.doc'
$Files=Get-ChildItem -path $FileSource
$counter = 0
$filesProcessed = 0
$Word = New-Object -ComObject Word.Application
#check files exist to be processed.
$WordFileCount = Get-ChildItem $FileSource -Filter *$docextn -File| Measure-Object | %{$_.Count} -ErrorAction Stop
If ($WordFileCount -gt 0) {
Foreach ($File in $Files) {
$Name="$(($File.FullName).substring(0, $File.FullName.lastIndexOf("."))).pdf"
$Doc = $Word.Documents.Open($File.FullName)
$Doc.SaveAs([ref]$Name.value,[ref]17)
$Doc.Close()
if ($counter -gt 100) {
$counter = 0
$Word.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($Word)
$Word = New-Object -ComObject Word.Application
}
$counter = $counter + 1
$filesProcessed = $filesProcessed + 1
}
}
$Word.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($Word)
}
catch
{
}
finally
{
}
If you are certain the service account has access to Word, then I think the exception you encounter comes from the [ref] arguments in the SaveAs() call.
AFAIK only Office versions before 2010 need [ref]; later versions do not.
Next, I think your code can be tidied up somewhat, for instance by releasing the COM objects ($Doc and $Word) inside the finally block, as that is always executed.
Also, there is no need to perform Get-ChildItem twice.
Something like this:
$SuccessPath = 'D:\PROCESSOR\Success'
$FileSource = 'D:\PROCESSOR\NewArrivals'
$filesProcessed = 0
try {
$Word = New-Object -ComObject Word.Application
$Word.Visible = $false
# get a list of FileInfo objects that have the .doc extension and loop through
Get-ChildItem -Path $FileSource -Filter '*.doc' -File | ForEach-Object {
# change the extension to pdf for the output file
$pdf = [System.IO.Path]::ChangeExtension($_.FullName, '.pdf')
$Doc = $Word.Documents.Open($_.FullName)
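# Check the installed Office version. Pre-2010 versions need the [ref].
# Cast to [version] so the comparison is numeric rather than a string comparison.
if ([version]$Word.Version -gt [version]'14.0') {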
$Doc.SaveAs($pdf, 17)
}
else {
$Doc.SaveAs([ref]$pdf,[ref]17)
}
$Doc.Close($false)
$filesProcessed++
}
}
finally {
# cleanup code
if ($Word) {
$Word.Quit()
# $Doc is $null when no file was processed, so guard the release
if ($Doc) { $null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject($Doc) }
$null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject($Word)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
$Word = $null
}
}
Then, there is the question of $SuccessPath. You never use it. Is it your intention to save the PDF files in that path? If so, change the line
$pdf = [System.IO.Path]::ChangeExtension($_.FullName, '.pdf')
into
$pdf = Join-Path -Path $SuccessPath -ChildPath ([System.IO.Path]::ChangeExtension($_.Name, '.pdf'))
Hope that helps
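A note on the version check in the script above: Word reports its version as a string such as 12.0 or 16.0, and comparing two strings in PowerShell is alphabetical, not numeric. The sketch below shows the difference and wraps the check in a small helper; the function name Test-NeedsRefArgument is made up for illustration, and the 14.0 threshold simply mirrors the one used in the answer.

```powershell
# Hypothetical helper: returns $true when [ref] arguments are expected,
# i.e. for versions up to and including 14.0, mirroring the "-gt 14.0" test in the answer.
function Test-NeedsRefArgument {
    param([Parameter(Mandatory)][string]$WordVersion)
    return -not ([version]$WordVersion -gt [version]'14.0')
}

# String comparison: '9' sorts after '1', so an old version looks "newer" than 14.0.
$stringResult  = '9.0' -gt '14.0'                       # True
# Version comparison: compares major/minor numerically.
$versionResult = [version]'9.0' -gt [version]'14.0'     # False

Test-NeedsRefArgument -WordVersion '12.0'   # Word 2007: True
Test-NeedsRefArgument -WordVersion '16.0'   # Word 2016 and later: False
```

In the script itself, the result of such a helper could decide which SaveAs call to use.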

Word bypass password protected files

PURPOSE
The script should iterate through each file in a folder, convert to .txt and upload text to Azure database
PROBLEM
Everything works fine until it hits a password-protected file; I just want to skip these files. I am running this on hundreds of thousands of documents, and the script pauses at each password-protected file until you either enter the password or click Cancel.
SCRIPT
Write-Output "Processing: $($file)"
Try {
$doc = $word.Documents.OpenNoRepairDialog($file)
}
Catch {
}
if ($doc) {
$fileName = [io.path]::GetFileNameWithoutExtension($file)
$fileName = $filename + ".txt"
$doc.SaveAs("$env:TEMP\$fileName", [ref]$saveFormat)
$doc.Close()
$4ID = $fileName.split('-')[-1].replace(' ', '').replace(".txt", "")
$text = Get-Content -raw "$env:TEMP\$fileName"
$text = $text.replace("'", "")
$query += "
('$text', $4ID),"
Remove-Item -Force "$env:TEMP\$fileName"
}
SOLUTION
For anyone having the same issue, the solution was to pass a non-empty string as the password argument of the Open call, like so:
$wd.Documents.Open($file, $false, $false, $false, "ttt")
rather than
$wd.Documents.Open($file, $false, $false, $false, "")
Here is a demo script that reports which Word documents under the script's directory are password protected. If opening the file does not land in the catch block, continue your logic in the try block.
$wd = New-Object -ComObject Word.Application
$scriptpath = $MyInvocation.MyCommand.Path
$dir = Split-Path $scriptpath
$files = Get-ChildItem $dir -Include *.doc, *.docx -Recurse
foreach ($file in $files) {
try {
$doc = $wd.Documents.Open($file, $null, $null, $null, "")
} catch {
Write-Host "$file is password-protected!"
}
}
You will need to integrate the rest of your logic if you choose this approach, but it shows the general idea of checking password protected files.
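Putting the two ideas together (a non-empty dummy password plus try/catch), a batch loop that skips protected files could look roughly like the sketch below. The folder path and the variable names are placeholders, the Write-Output line stands in for the real conversion logic, and note that the catch block will also skip files that fail to open for other reasons, such as corruption.

```powershell
# Sketch only: $folder is a hypothetical path; replace the Write-Output line with real processing.
$folder        = 'C:\path\to\documents'
$dummyPassword = 'ttt'          # any non-empty string, as in the solution above
$skipped       = @()

$word = New-Object -ComObject Word.Application
$word.Visible = $false
try {
    foreach ($file in Get-ChildItem -Path $folder -Filter '*.doc*' -File) {
        $doc = $null
        try {
            # Argument order: FileName, ConfirmConversions, ReadOnly, AddToRecentFiles, PasswordDocument
            $doc = $word.Documents.Open($file.FullName, $false, $false, $false, $dummyPassword)
        }
        catch {
            # Opening failed (wrong password or another problem): remember the file and move on
            $skipped += $file.FullName
            continue
        }
        try {
            Write-Output "Opened: $($file.FullName)"
        }
        finally {
            $doc.Close($false)   # close without saving
        }
    }
}
finally {
    $word.Quit()
}
Write-Output "Skipped $($skipped.Count) file(s) that could not be opened"
```

Collecting the skipped paths in $skipped makes it easy to review them afterwards instead of losing them in the console output.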

Check if word file is password protected in powershell

I've written a script that recursively loops through every directory, finds Word files and appends "CONFIDENTIAL" to the footer. This worked fine until it came across an encrypted file, which caused the script to hang; when I clicked Cancel on the password prompt, the script crashed. I've attempted to check whether the document is encrypted before opening it, but the prompt still opens, which crashes the script. Is there a reliable way to check whether a document is password protected that works on both .doc and .docx files? I've already tried the code in the other thread: the first two methods don't work, and the third method detects every file as encrypted because it throws an exception.
Current code:
$word = New-Object -ComObject Word.Application
$word.Visible = $false
$files = Get-ChildItem -Recurse C:\temp\FooterDocuments -include *.docx,*.doc
$restricted = "CONFIDENTIAL"
foreach ($file in $files) {
$filename = $file.FullName
Write-Host $filename
try {
$document = $word.Documents.Open($filename, $null, $null, $null, "")
if($document.ProtectionType -ne -1) {
$document.Close()
Write-Host "$filename is encrypted"
continue
}
} catch {
Write-Host "$filename is encrypted"
continue
}
foreach ($section in $document.Sections) {
$footer = $section.Footers.Item(1)
$footer.Range.Characters.Last.InsertAfter("`n" + $restricted)
}
$document.Save()
$document.Close()
}
$word.Quit()
You can just use get-content.
$filelist = dir c:\tmp\*.docx
foreach ($file in $filelist) {
[pscustomobject]@{
File = $file.FullName
HasPassword = [bool]((get-content $file.FullName) -match "http://schemas.microsoft.com/office/2006/keyEncryptor/password" )
}
}
sample output:
File                                         HasPassword
----                                         -----------
C:\tmp\New Microsoft Word Document (2).docx  False
C:\tmp\New Microsoft Word Document.docx      True
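A variation on the same idea that avoids reading the whole file: a normal .docx is a ZIP package (it starts with the bytes PK), whereas a .docx encrypted with an open password is stored in an OLE compound file, which starts with the signature D0 CF 11 E0 A1 B1 1A E1. The sketch below checks only those first eight bytes. The function name Test-DocxEncrypted is made up for illustration, and the check is only meaningful for .docx: legacy .doc files always use the OLE container, so it cannot tell protected and unprotected .doc files apart, and a .doc that was merely renamed to .docx would be flagged as well.

```powershell
# Hypothetical helper: returns $true when the file starts with the OLE compound-file signature.
function Test-DocxEncrypted {
    param([Parameter(Mandatory)][string]$Path)

    $oleMagic = [byte[]](0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1)
    $buffer   = New-Object byte[] 8

    # Resolve to a full path: .NET file APIs do not follow PowerShell's current location
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $stream   = [System.IO.File]::OpenRead($fullPath)
    try {
        $read = $stream.Read($buffer, 0, $buffer.Length)
    }
    finally {
        $stream.Dispose()
    }

    if ($read -lt $oleMagic.Length) { return $false }
    for ($i = 0; $i -lt $oleMagic.Length; $i++) {
        if ($buffer[$i] -ne $oleMagic[$i]) { return $false }
    }
    return $true
}

# Usage (path is a placeholder):
# Get-ChildItem 'C:\tmp\*.docx' | ForEach-Object {
#     [pscustomobject]@{ File = $_.FullName; HasPassword = Test-DocxEncrypted -Path $_.FullName }
# }
```

Because nothing is opened in Word, no password prompt can appear, which is what made the original script hang.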

Powershell Script to change the font of word docs

Hi, I am trying to create a script that changes the font of all Word docs in a specific folder. I have managed to create one that changes the font for one document, but I cannot work out how to do this for all files. My script is below:
$Folder = Read-Host "Select Folder name"
$test = Test-Path C:\Users\andy.burton\Desktop\Layouts\$Folder\
$File = @(
"0.bil","0.est","0.lbl","1.arm","1.bil","1.crd","1.env","1.est","1.frm","1.gp",
"1.hos","1.ins","1.lbl","1.lc","1.lmr","1.mls","1.NON","1.OP","1.pat","1.PRS",
"1.rcl","1.rec","1.rmd","1.stm","10.pat","10.rec","11.pat","11.rec","12.pat","12.rec",
"13.pat","13.rec","14.PAT","14.rec","15.pat","16.pat","17.pat","18.pat",
"2.arm","2.bil","2.env","2.est","2.frm","2.gp","2.hos","2.ins","2.lbl","2.lc","2.lmr","2.mls","2.NON","2.pat","2.rcl","2.rec","2.rmd","2.stm",
"3.arm","3.bil","3.env","3.est","3.gp","3.hos","3.ins","3.lbl","3.lc","3.lmr","3.NON","3.pat","3.rec","3.rmd","3.STM",
"4.arm","4.bil","4.env","4.est","4.gp","4.hos","4.ins","4.lbl","4.lc","4.lmr","4.non","4.pat","4.rec","4.rmd","4.STM",
"5.arm","5.bil","5.env","5.est","5.gp","5.hos","5.ins","5.lbl","5.lc","5.lmr","5.non","5.pat","5.rec","5.rmd",
"6.bil","6.env","6.est","6.lbl","6.pat","6.rec","7.env","7.lbl","7.pat","7.rec",
"8.ENV","8.LBL","8.pat","8.rec","9.env","9.lbl","9.pat","9.rec",
"acchead.doc","address.lbl","apptreminder.email","apptreminder.sms","BMIBOOK.FRM","BUPA.OCR",
"clinicprint.rep","clinicprint2.rep","clinicprint2B.rep","clinicprint2C.rep","CLUB.BIL",
"Consent1.doc","consent2.doc","DEPOSIT.REC","ebs021.doc","ebs022.doc","EBS023.DOC",
"FConsent1.DOC","FEBS021.DOC","FEBS023.DOC","GP.OP","HCABOOK.FRM",
"InvCen.doc","InvFoot.doc","INVOICE.LBL","InvoiceGrid.doc","InvoiceGridNonVat.doc","InvoiceGridVAT.doc",
"InvoiceTotals.doc","InvoiceTotalsNonVAT.doc","InvoiceTotalsVAT.doc",
"J8160-95.LBL","J8160.LBL","J8162-95.LBL","J8162.LBL","J8163-95.LBL","J8163.LBL","J8165-95.LBL","J8165.LBL",
"J8360-95.LBL","J8360.LBL","J8362-95.LBL","J8362.LBL","J8363-95.LBL","J8363.LBL","J8365-95.LBL","J8365.LBL",
"J8560-95.LBL","J8560.LBL","J8562-95.LBL","J8562.LBL","J8563-95.LBL","J8563.LBL","J8565-95.LBL","J8565.LBL",
"L7160-95.LBL","L7160.LBL","L7161-95.LBL","L7161.LBL","L7162-95.LBL","L7162.LBL","L7163-95.LBL","L7163.LBL",
"L7164-95.LBL","L7164.LBL","L7165-95.LBL","L7165.LBL","L7166-95.LBL","L7166.LBL","L7167-95.LBL","L7167.LBL",
"L7168-95.LBL","L7168.LBL","L8162-95.LBL",
"letfoot.doc","lethead.doc","MC.BIL","NHS.PRS","NUFFBOOK.FRM","OPNOTE.OP","RecCen.doc","RECEIPT2.CC","ReceiptTotals.doc",
"recfoot.doc","remfoot.doc","shoulder1.jpg","shoulder2.jpg","TDL.FRM","TDL2.FRM","theatreprint.rep","VOUCHER2.CC")
if($test -eq $False) {
Copy-Item -Path 'C:\Users\andy.burton\desktop\Layouts\Arial 10' -Destination C:\Users\andy.burton\Desktop\Layouts\$Folder\ -Recurse
$file.ForEach({
$Word = New-Object -ComObject Word.Application
$Word.Visible = $False
$Doc = $word.Documents.Open()
$Selection = $word.Selection
$Doc.Select()
$Selection.Font.Name = "Calibri"
$Selection.Font.Size = 11
$Doc.Close()
$Word.Quit()
})
}
Else {Write-Warning "Folder Already Exists"}
You will want to use a foreach loop to go through your $File array. You may also want to rename $File to $Files, so the loop variable and the array have different names:
foreach($file in $Files)
{
# perform font change here
}
I needed to change the font of a bunch of Word documents on a share, and the internet seems to point back to this post for changing the font of an entire Word document using PowerShell. The code and answer above did not show it very clearly, so I am contributing this code. It gets the Word documents in a folder that match the patterns you list in the $Files line, then changes the font to whatever you put in the $Selection.Font.Name line. I modified the original code from this post to build it, and I have tested it on a local folder as well as a server share. Please let me know if it can be improved upon. For instance, it would be nice to output each file to the screen as it is being changed.
$Folder = "\\SERVER\SHARE\FOLDER\"
$Files = Get-ChildItem -Path $Folder -Recurse -Include *misc*.doc, *.misc*.docx, *Misc*.docx, *Misc*.doc, *MISC.docx, *MISC*.doc, *test*.doc, *test*.docx, whatever.doc
foreach($File in $Files)
{
$Word = New-Object -ComObject Word.Application
$Word.Visible = $false
$Doc = $Word.Documents.Open($File.FullName)
$Selection = $Word.Selection
$Doc.Select()
$Selection.Font.Name = "Verdana Pro SemiBold"
$Doc.Close()
$Word.Quit()
}
Write-Warning "Finished With Font Change"
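On the wish to print each file as it is changed: the sketch below is one possible reworking, not the original author's method. It starts Word once instead of once per file, writes each file name to the screen, sets the font through the document's Content range rather than the Selection, and saves explicitly on close. The folder path and the -Include patterns are placeholders, and Content covers the main document body only, so headers, footers and text boxes are left untouched here.

```powershell
# Sketch only: $Folder and the -Include patterns are placeholders.
$Folder = '\\SERVER\SHARE\FOLDER\'
$Files  = Get-ChildItem -Path $Folder -Recurse -Include *.doc, *.docx

$Word = New-Object -ComObject Word.Application
$Word.Visible = $false
try {
    foreach ($File in $Files) {
        Write-Host "Changing font in $($File.FullName)"
        $Doc = $Word.Documents.Open($File.FullName)
        try {
            # Content is the Range covering the main text of the document
            $Doc.Content.Font.Name = 'Verdana Pro SemiBold'
        }
        finally {
            $Doc.Close(-1)   # -1 = wdSaveChanges
        }
    }
}
finally {
    # Quit Word once, after all files are done, and release the COM object
    $Word.Quit()
    $null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject($Word)
}
Write-Host "Finished with font change: $(@($Files).Count) file(s)"
```

Reusing a single Word instance should also be noticeably faster on large folders than launching and quitting Word for every document.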

Basic Powershell - batch convert Word Docx to PDF

I am trying to use PowerShell to do a batch conversion of Word Docx to PDF - using a script found on this site:
http://blogs.technet.com/b/heyscriptingguy/archive/2013/03/24/weekend-scripter-convert-word-documents-to-pdf-files-with-powershell.aspx
# Acquire a list of DOCX files in a folder
$Files=GET-CHILDITEM "C:\docx2pdf\*.DOCX"
$Word=NEW-OBJECT -COMOBJECT WORD.APPLICATION
Foreach ($File in $Files) {
# open a Word document, filename from the directory
$Doc=$Word.Documents.Open($File.fullname)
# Swap out DOCX with PDF in the Filename
$Name=($Doc.Fullname).replace("docx","pdf")
# Save this File as a PDF in Word 2010/2013
$Doc.saveas([ref] $Name, [ref] 17)
$Doc.close()
}
And I keep on getting this error and can't figure out why:
PS C:\docx2pdf> .\docx2pdf.ps1
Exception calling "SaveAs" with "16" argument(s): "Command failed"
At C:\docx2pdf\docx2pdf.ps1:13 char:13
+ $Doc.saveas <<<< ([ref] $Name, [ref] 17)
+ CategoryInfo : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : DotNetMethodException
Any ideas?
Also - how would I need to change it to also convert doc (not docX) files, as well as use the local files (files in same location as the script location)?
Sorry - never done PowerShell scripting...
This will work for doc as well as docx files.
$documents_path = 'c:\doc2pdf'
$word_app = New-Object -ComObject Word.Application
# This filter will find .doc as well as .docx documents
Get-ChildItem -Path $documents_path -Filter *.doc? | ForEach-Object {
$document = $word_app.Documents.Open($_.FullName)
$pdf_filename = "$($_.DirectoryName)\$($_.BaseName).pdf"
$document.SaveAs([ref] $pdf_filename, [ref] 17)
$document.Close()
}
$word_app.Quit()
The above answers all fell short for me, as I was doing a batch job converting around 70,000 word documents this way. As it turns out, doing this repeatedly eventually leads to Word crashing, presumably due to memory issues (the error was some COMException that I didn't know how to parse). So, my hack to get it to proceed was to kill and restart word every 100 docs (arbitrarily chosen number).
Additionally, when it did crash occasionally, there would be resulting malformed pdfs, each of which were generally 1-2 kb in size. So, when skipping already generated pdfs, I make sure they are at least 3kb in size. If you don't want to skip already generated PDFs, you can delete that if statement.
Excuse me if my code doesn't look good, I don't generally use Windows and this was a one-off hack. So, here's the resulting code:
$Files=Get-ChildItem -path '.\path\to\docs' -recurse -include "*.doc*"
$counter = 0
$filesProcessed = 0
$Word = New-Object -ComObject Word.Application
Foreach ($File in $Files) {
$Name="$(($File.FullName).substring(0, $File.FullName.lastIndexOf("."))).pdf"
if ((Test-Path $Name) -And (Get-Item $Name).length -gt 3kb) {
echo "skipping $($Name), already exists"
continue
}
echo "$($filesProcessed): processing $($File.FullName)"
$Doc = $Word.Documents.Open($File.FullName)
$Doc.SaveAs($Name, 17)
$Doc.Close()
if ($counter -gt 100) {
$counter = 0
$Word.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($Word)
$Word = New-Object -ComObject Word.Application
}
$counter = $counter + 1
$filesProcessed = $filesProcessed + 1
}
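The "skip PDFs that already exist and are bigger than 3 KB" rule from the script above can be pulled into a small function, which makes it easier to reuse and to change the threshold. This is only a sketch: the function name Test-PdfAlreadyConverted and the parameter names are made up for illustration, and the 3 KB default is the value chosen in the answer, not a general rule.

```powershell
# Hypothetical helper: $true when a PDF exists and is larger than the minimum size.
function Test-PdfAlreadyConverted {
    param(
        [Parameter(Mandatory)][string]$PdfPath,
        [long]$MinimumBytes = 3kb   # smaller files are treated as failed conversions
    )
    if (-not (Test-Path -LiteralPath $PdfPath -PathType Leaf)) {
        return $false
    }
    return (Get-Item -LiteralPath $PdfPath).Length -gt $MinimumBytes
}

# Usage inside the conversion loop (names as in the script above):
# if (Test-PdfAlreadyConverted -PdfPath $Name) { continue }
```

Malformed 1-2 KB leftovers from a crashed run then fail the check and are converted again on the next pass.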
This works for me (Word 2007):
$wdFormatPDF = 17
$word = New-Object -ComObject Word.Application
$word.visible = $false
$folderpath = Split-Path -parent $MyInvocation.MyCommand.Path
Get-ChildItem -path $folderpath -recurse -include "*.doc" | % {
$path = ($_.fullname).substring(0,($_.FullName).lastindexOf("."))
$doc = $word.documents.open($_.fullname)
$doc.saveas($path, $wdFormatPDF)
$doc.close()
}
$word.Quit()
None of the solutions posted here worked for me on Windows 8.1 (by the way, I'm using Office 365). My PowerShell somehow does not like the [ref] arguments (I don't know why; I use PowerShell very rarely).
This is the solution that worked for me:
$Files=Get-ChildItem 'C:\path\to\files\*.docx'
$Word = New-Object -ComObject Word.Application
Foreach ($File in $Files) {
$Doc = $Word.Documents.Open($File.FullName)
$Name=($Doc.FullName).replace('docx', 'pdf')
$Doc.SaveAs($Name, 17)
$Doc.Close()
}
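One thing to watch with .replace('docx', 'pdf'): it replaces every occurrence of the text in the full path, not just the extension, and it is case-sensitive. With the folder name from the question, C:\docx2pdf, the target path ends up in a different folder, which may be what triggers the "Command failed" error there (I have not verified that). A small sketch of the difference, using made-up file names:

```powershell
$source = 'C:\docx2pdf\Report.docx'

# String.Replace touches every match, including the one in the folder name.
$viaReplace = $source.Replace('docx', 'pdf')                      # C:\pdf2pdf\Report.pdf

# It is also case-sensitive, so an upper-case extension is left alone.
$upperCase = 'C:\files\Report.DOCX'.Replace('docx', 'pdf')        # C:\files\Report.DOCX

# ChangeExtension only swaps the final extension, whatever its case.
$viaChangeExtension = [System.IO.Path]::ChangeExtension($source, '.pdf')   # C:\docx2pdf\Report.pdf

$viaReplace
$upperCase
$viaChangeExtension
```

In the loops above, that would mean building the name with [System.IO.Path]::ChangeExtension($File.FullName, '.pdf') instead of calling replace on the path.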
I've updated this one to work with PowerPoint files on the latest Office:
# Get invocation path
$curr_path = Split-Path -parent $MyInvocation.MyCommand.Path
# Create a PowerPoint object
$ppt_app = New-Object -ComObject PowerPoint.Application
#$ppt.visible = $false
# Get all objects of type .ppt? in $curr_path and its subfolders
Get-ChildItem -Path $curr_path -Recurse -Filter *.ppt? | ForEach-Object {
Write-Host "Processing" $_.FullName "..."
# Open it in PowerPoint
$document = $ppt_app.Presentations.Open($_.FullName,0,0,0)
# Create a name for the PDF document; they are stored in the invocation folder!
# If you want them to be created locally in the folders containing the source PowerPoint file, replace $curr_path with $_.DirectoryName
$pdf_filename = "$($curr_path)\$($_.BaseName).pdf"
# Save as PDF -- 32 is the literal value of `ppSaveAsPDF`
#$opt= [Microsoft.Office.Interop.PowerPoint.PpSaveAsFileType]::ppSaveAsPDF
$document.SaveAs($pdf_filename,32)
# Close PowerPoint file
$document.Close()
}
# Exit and release the PowerPoint object
$ppt_app.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($ppt_app)