PURPOSE
The script should iterate through each file in a folder, convert it to .txt, and upload the text to an Azure database.
PROBLEM
Everything works fine until the script hits a password-protected file; I just want to skip those files. I am running this on hundreds of thousands of documents, and the script pauses on every password-protected file until you either enter the password or click Cancel.
SCRIPT
Write-Output "Processing: $($file)"
Try {
$doc = $word.Documents.OpenNoRepairDialog($file)
}
Catch {
}
if ($doc) {
$fileName = [io.path]::GetFileNameWithoutExtension($file)
$fileName = $filename + ".txt"
$doc.SaveAs("$env:TEMP\$fileName", [ref]$saveFormat)
$doc.Close()
$4ID = $fileName.split('-')[-1].replace(' ', '').replace(".txt", "")
$text = Get-Content -raw "$env:TEMP\$fileName"
$text = $text.replace("'", "")
$query += "
('$text', $4ID),"
Remove-Item -Force "$env:TEMP\$fileName"
}
SOLUTION
For anyone having the same issue, the solution was to pass a non-empty string as the password argument of the Open call, like so:
$wd.Documents.Open($file, $false, $false, $false, "ttt")
rather than
$wd.Documents.Open($file, $false, $false, $false, "")
With a non-empty dummy password, a protected file makes the call fail with an error instead of showing the password prompt, while unprotected files still open normally.
Here is a demo script that reports which Word documents under the script's directory are password protected. If opening a file does not land in the catch block, continue your logic in the try block.
$wd = New-Object -ComObject Word.Application
$scriptpath = $MyInvocation.MyCommand.Path
$dir = Split-Path $scriptpath
$files = Get-ChildItem $dir -Include *.doc, *.docx -Recurse
foreach ($file in $files) {
try {
# pass a non-empty dummy password so Word throws instead of prompting
$doc = $wd.Documents.Open($file.FullName, $null, $null, $null, "ttt")
} catch {
Write-Host "$file is password-protected!"
}
}
You will need to integrate the rest of your logic if you choose this approach, but it shows the general idea of checking password protected files.
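As a rough sketch of how that integration could look, the open call can be wrapped in a small helper that returns $null for any file Word refuses to open. The function name Open-WordDocumentOrSkip and the 'ttt' default are made up for illustration; the sketch assumes $Word is a Word.Application COM object and that Word raises an error when the dummy password does not match.

```powershell
function Open-WordDocumentOrSkip {
    param(
        [Parameter(Mandatory)]$Word,            # assumed: a Word.Application COM object
        [Parameter(Mandatory)][string]$Path,    # full path of the document to open
        [string]$DummyPassword = 'ttt'          # any non-empty string; hypothetical default
    )
    try {
        # Arguments: FileName, ConfirmConversions, ReadOnly, AddToRecentFiles, PasswordDocument
        return $Word.Documents.Open($Path, $false, $false, $false, $DummyPassword)
    }
    catch {
        # A wrong password surfaces here as an exception instead of a dialog
        Write-Warning "Skipping '$Path': $($_.Exception.Message)"
        return $null
    }
}
```

In the original loop this would be used as `$doc = Open-WordDocumentOrSkip -Word $word -Path $file`, and the existing `if ($doc) { ... }` block can stay as it is.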
Related
I'm very new to PowerShell and have been banging my head against this for a while; hopefully someone can point out where I am going wrong. I am trying to use PowerShell to remove the opening passwords from multiple .docx files in a folder. I can change the password to something else but cannot remove it entirely. The line where I am getting tripped up is $WordDoc.Password=$null, and the error details are at the bottom. I appreciate any help with this!
$path = ("FilePath")
$passwd = ("CurrentPassword")
$counter=1
$WordObj = New-Object -ComObject Word.Application
foreach ($file in $count=Get-ChildItem $path -Filter *.docx) {
$WordObj.Visible = $true
$WordDoc = $WordObj.Documents.Open($file.FullName, $null, $false, $null, $passwd)
$WordDoc.Activate()
$WordDoc.Password=$null
$WordDoc.Close()
Write-Host("Finished: "+$counter+" of "+$count.Length)
$counter++
}
$WordObj.Application.Quit()
Error details - Object reference not set to an instance of an object. At line: 14 char: 5
+ $WordDoc.Password=$Null
+ Category info: Operations Stopped: (:) [], NullReferenceException
+ FullyQualifiedErrorId: System.NullReferenceException
Elsewhere I was advised to try .Unprotect instead, but I'm not sure how to work that into my code!
$path = 'X:\TheFolderWhereTheProtectedDocumentsAre'
$passwd = 'CurrentPassword'
$counter = 0
$WordObj = New-Object -ComObject Word.Application
$WordObj.Visible = $false
# get the .docx files. Make sure this is an array using @()
$documentFiles = @(Get-ChildItem -Path $path -Filter '*.docx' -File)
foreach ($file in $documentFiles) {
try {
# add password twice, first for the document, next for the documents template
$WordDoc = $WordObj.Documents.Open($file.FullName, $null, $false, $null, $passwd, $passwd)
$WordDoc.Activate()
$WordDoc.Password = $null
$WordDoc.Close(-1) # wdSaveChanges, see https://learn.microsoft.com/en-us/office/vba/api/word.wdsaveoptions
$counter++
}
catch {
Write-Warning "Could not open file $($file.FullName):`r`n$($_.Exception.Message)"
}
}
Write-Host "Finished: $counter documents of $($documentFiles.Count)"
# quit Word and dispose of the used COM objects in memory
$WordObj.Quit()
# $WordDoc is still $null if no document could be opened, so guard the release
if ($WordDoc) { $null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($WordDoc) }
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($WordObj)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
This has been driving me nuts for days. I have a PowerShell script that converts all .doc files in a target directory to PDFs using Word SaveAs interop.
The script works fine when run in the context of the logged-in user, but fails with "You cannot call a method on a null-valued expression." when I execute it under a service account (via Task Scheduler, run as another user). The service account has local admin rights.
The exception occurs at this line: $Doc.SaveAs([ref]$Name.value,[ref]17)
My code is as follows. I'm not the best coder in the world, so any advice would be gratefully received.
Thanks.
try
{
$FileSource = 'D:\PROCESSOR\NewArrivals\*.doc'
$SuccessPath = 'D:\PROCESSOR\Success\'
$docextn='.doc'
$Files=Get-ChildItem -path $FileSource
$counter = 0
$filesProcessed = 0
$Word = New-Object -ComObject Word.Application
#check files exist to be processed.
$WordFileCount = Get-ChildItem $FileSource -Filter *$docextn -File| Measure-Object | %{$_.Count} -ErrorAction Stop
If ($WordFileCount -gt 0) {
Foreach ($File in $Files) {
$Name="$(($File.FullName).substring(0, $File.FullName.lastIndexOf("."))).pdf"
$Doc = $Word.Documents.Open($File.FullName)
$Doc.SaveAs([ref]$Name.value,[ref]17)
$Doc.Close()
if ($counter -gt 100) {
$counter = 0
$Word.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($Word)
$Word = New-Object -ComObject Word.Application
}
$counter = $counter + 1
$filesProcessed = $filesProcessed + 1
}
}
$Word.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($Word)
}
catch
{
}
finally
{
}
If you are certain the service account has access to Word, then I think the exception you encounter comes from the [ref] arguments in the SaveAs() call.
AFAIK only Office versions before 2010 need [ref]; later versions do not.
Next, I think your code can be tidied up somewhat, for instance by releasing the COM objects ($Doc and $Word) inside the finally block, as that is always executed.
Also, there is no need to call Get-ChildItem twice.
Something like this:
$SuccessPath = 'D:\PROCESSOR\Success'
$FileSource = 'D:\PROCESSOR\NewArrivals'
$filesProcessed = 0
try {
$Word = New-Object -ComObject Word.Application
$Word.Visible = $false
# get a list of FileInfo objects that have the .doc extension and loop through
Get-ChildItem -Path $FileSource -Filter '*.doc' -File | ForEach-Object {
# change the extension to pdf for the output file
$pdf = [System.IO.Path]::ChangeExtension($_.FullName, '.pdf')
$Doc = $Word.Documents.Open($_.FullName)
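# Check the installed Office version. Pre-2010 versions (below 14.0) need the [ref]
# Cast to [version]: $Word.Version is a string, and strings compare character by character
if ([version]$Word.Version -ge [version]'14.0') {
$Doc.SaveAs($pdf, 17) # 17 = wdFormatPDF
}
else {
$Doc.SaveAs([ref]$pdf,[ref]17)
}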
$Doc.Close($false)
$filesProcessed++
}
}
finally {
# cleanup code
if ($Word) {
$Word.Quit()
# $Doc is $null when no file was opened, so guard the release
if ($Doc) { $null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject($Doc) }
$null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject($Word)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
$Word = $null
}
}
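One thing to watch with a version check like this: $Word.Version is a string, and PowerShell compares two strings character by character, so casting to [version] is the safer comparison. A small sketch follows; the helper name Test-WordNeedsRef is hypothetical, and the 14.0 threshold (Word 2010) is taken from the comment in the code above rather than verified independently.

```powershell
function Test-WordNeedsRef {
    param([Parameter(Mandatory)][string]$WordVersion)
    # Assumption from the answer: versions before 14.0 (Word 2010) need [ref]
    return ([version]$WordVersion -lt [version]'14.0')
}

# String comparison: the character '9' sorts after '1', so this yields True
# even though 9.0 is the older version.
'9.0' -gt '14.0'

# Version comparison orders the numbers correctly.
Test-WordNeedsRef -WordVersion '9.0'    # True
Test-WordNeedsRef -WordVersion '16.0'   # False
```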
Then, there is the question of $SuccessPath. You never use it. Is it your intention to save the PDF files in that path? If so, change the line
$pdf = [System.IO.Path]::ChangeExtension($_.FullName, '.pdf')
into
$pdf = Join-Path -Path $SuccessPath -ChildPath ([System.IO.Path]::ChangeExtension($_.Name, '.pdf'))
Hope that helps
I've written a script that recursively loops through every directory, finds Word files and appends "CONFIDENTIAL" to the footer. This worked fine until it came across an encrypted file, which caused the script to hang; when I clicked Cancel on the password prompt, the script crashed. I've attempted to check whether the document is encrypted before opening it, but the prompt still opens and crashes the script. Is there a reliable way to check whether a document is password protected that works on both .doc and .docx files? I've already tried the code in the other thread: the first two methods don't work, and the third method detects every file as encrypted because it throws an exception.
Current code:
$word = New-Object -ComObject Word.Application
$word.Visible = $false
$files = Get-ChildItem -Recurse C:\temp\FooterDocuments -include *.docx,*.doc
$restricted = "CONFIDENTIAL"
foreach ($file in $files) {
$filename = $file.FullName
Write-Host $filename
try {
$document = $word.Documents.Open($filename, $null, $null, $null, "")
if($document.ProtectionType -ne -1) {
$document.Close()
Write-Host "$filename is encrypted"
continue
}
} catch {
Write-Host "$filename is encrypted"
continue
}
foreach ($section in $document.Sections) {
$footer = $section.Footers.Item(1)
$footer.Range.Characters.Last.InsertAfter("`n" + $restricted)
}
$document.Save()
$document.Close()
}
$word.Quit()
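You can just use Get-Content. A password-protected .docx typically contains the key-encryptor URI shown below, so a text match on it flags the file without opening Word.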
$filelist = dir c:\tmp\*.docx
foreach ($file in $filelist) {
[pscustomobject]@{
File = $file.FullName
HasPassword = [bool]((get-content $file.FullName) -match "http://schemas.microsoft.com/office/2006/keyEncryptor/password" )
}
}
Sample output:
File                                        HasPassword
----                                        -----------
C:\tmp\New Microsoft Word Document (2).docx False
C:\tmp\New Microsoft Word Document.docx     True
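If reading whole files with Get-Content is too slow for a large batch, a cheaper variant is to look only at the first few bytes. This is a sketch resting on an assumption that does not come from the answer above: a normal .docx is a ZIP archive (it starts with PK), while a password-encrypted one is wrapped in an OLE compound file, which starts with the bytes D0 CF 11 E0 A1 B1 1A E1. The function name Test-DocxLooksEncrypted is made up. Legacy .doc files are always OLE compound files, so this check says nothing about them.

```powershell
function Test-DocxLooksEncrypted {
    param([Parameter(Mandatory)][string]$Path)   # pass a full path; .NET ignores the PowerShell location

    # Signature of an OLE compound file, the container used for encrypted .docx files
    $oleMagic = [byte[]](0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1)
    $header = New-Object byte[] 8

    $stream = [System.IO.File]::OpenRead($Path)
    try {
        $read = $stream.Read($header, 0, $header.Length)
    }
    finally {
        $stream.Dispose()
    }

    # Files shorter than the signature cannot be compound files
    if ($read -lt $oleMagic.Length) { return $false }
    for ($i = 0; $i -lt $oleMagic.Length; $i++) {
        if ($header[$i] -ne $oleMagic[$i]) { return $false }
    }
    return $true
}
```

A call such as `Get-ChildItem c:\tmp\*.docx | Where-Object { Test-DocxLooksEncrypted $_.FullName }` would then list the candidates to skip. A .doc file that was merely renamed to .docx would be flagged as well.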
I'm trying to write a PowerShell script that replaces one string with another in Word files. I need to update more than 500 Word templates and files, so I don't want to do it by hand. One problem is that I can't find the text in the footer or header, because they are all individual and consist of tables with images. I manage to find the text in the normal "body" text but haven't replaced it yet. Here is my code for finding it.
$path = "C:\Users\BambergerSv\Desktop\PS\Vorlagen"
$files = Get-Childitem $path -Include *dotm, *docx, *.dot, *.doc, *.DOT, *DOTM, *.DOCX, *.DOC -Recurse |
Where-Object { !($_.PSIsContainer) }
$application = New-Object -ComObject Word.Application
$application.Visible = $true
$findtext = "www.subdomain.domain.com"
function getStringMatch {
foreach ($file In $files) {
#Write-Host $file.FullName
$document = $application.Documents.Open($file.FullName, $false, $true)
if ($document.Content.Text -match $findtext) {
Write-Host "found text in file " $file.FullName "`n"
}
try {
$application.Documents.Close()
} catch {
# report first: a statement placed after continue would never run
Write-Host $file.FullName "is a read only file" #if it is write protected because of the macros
continue
}
}
$application.Quit()
}
getStringMatch
I searched the internet and found an answer to this question.
First you need a little VBA. Write the macro below in MS Word and then save it.
Public Function CustomReplace(findValue As String, replaceValue As String) As String
For Each myStoryRange In ActiveDocument.StoryRanges
myStoryRange.find.Execute FindText:=findValue, Forward:=True, ReplaceWith:=replaceValue, replace:=wdReplaceAll
While myStoryRange.find.Found
myStoryRange.find.Execute FindText:=findValue, Forward:=True, ReplaceWith:=replaceValue, replace:=wdReplaceAll
Wend
While Not (myStoryRange.NextStoryRange Is Nothing)
Set myStoryRange = myStoryRange.NextStoryRange
myStoryRange.find.Execute FindText:=findValue, Forward:=True, ReplaceWith:=replaceValue, replace:=wdReplaceAll
While myStoryRange.find.Found
myStoryRange.find.Execute FindText:=findValue, Forward:=True, ReplaceWith:=replaceValue, replace:=wdReplaceAll
Wend
Wend
Next myStoryRange
CustomReplace = ActiveDocument.FullName
End Function
After the macro above has been added to MS Word, go to PowerShell and execute the code below.
$word = New-Object -ComObject Word.Application
$word.visible=$false
$files = Get-ChildItem "C:\Users\Ali\Desktop\Test" -Filter *.docx
$find=[ref]"Hello"
$replace=[ref]"Hi"
for ($i=0; $i -lt $files.Count; $i++) {
$filename = $files[$i].FullName
$doc = $word.Documents.Open($filename)
$word.Run("CustomReplace",$find,$replace)
$doc.Save()
$doc.close()
}
$word.quit()
I am trying to rewrite an Add-Content script as a StreamWriter version, reason being that the file is ~140 MB and Add-Content is far too slow.
This is my Add-Content version, which loops through each row until it can find a header row starting FILE| and creates a new file with a filename of the second delimited (by pipe) value in that row. The Add-Content works as intended, but is really slow. It takes 35-40 mins to do it:
Param(
[string]$filepath = "\\fileserver01\Transfer",
[string]$filename = "sourcedata.txt"
)
$Path = $filepath
$InputFile = (Join-Path $Path $filename)
$Reader = New-Object System.IO.StreamReader($InputFile)
while (($Line = $Reader.ReadLine()) -ne $null) {
if ($Line -match 'FILE\|([^\|]+)') {
$OutputFile = "$($matches[1]).txt"
}
Add-Content (Join-Path $Path $OutputFile) $Line
}
From what I've read, StreamWriter should be faster. Here is my attempt, but I get the error:
The process cannot access the file '\\fileserver01\Transfer\datafile1.txt' because it is being used by another process.
Param(
[string]$filepath = "\\fileserver01\Transfer",
[string]$filename = "sourcedata.txt"
)
$Path = $filepath
$InputFile = (Join-Path $Path $filename)
$Reader = New-Object System.IO.StreamReader($InputFile)
while (($Line = $Reader.ReadLine()) -ne $null) {
if ($Line -match 'FILE\|([^\|]+)') {
$OutputFile = "$($matches[1])"
}
$sw = New-Object System.IO.StreamWriter (Join-Path $Path $OutputFile)
$sw.WriteLine($line)
}
I assume it's something to do with using it in my loop.
Sample data:
FILE|datafile1|25/04/17
25044|0001|37339|10380|TT75
25045|0001|37339|10398|TT75
25046|0001|78711|15940|TT75
FILE|datafile2|25/04/17
25047|0001|98745|11263|TT75
25048|0001|96960|13011|TT84
FILE|datafile3|25/04/17
25074|0001|57585|13639|TT84
25075|0001|59036|10495|TT84
FILE|datafile4|25/04/17
25076|0001|75844|13956|TT84
25077|0001|17430|01111|TT84
The desired outcome is one file per FILE| header row, using the second delimited value as the file name.
You're creating the writer inside the while loop without ever closing it, thus your code is trying to re-open the already opened output file with every iteration. Close an existing writer and open a new one whenever your filename changes:
while (($Line = $Reader.ReadLine()) -ne $null) {
if ($Line -match 'FILE\|([^\|]+)') {
if ($sw) { $sw.Close(); $sw.Dispose() }
$sw = New-Object IO.StreamWriter (Join-Path $Path $matches[1])
}
$sw.WriteLine($line)
}
if ($sw) { $sw.Close(); $sw.Dispose() }
Note that this assumes that you won't open the same file twice. If the same output file can appear multiple times in the input file you need to open the file for appending. In that case replace
$sw = New-Object IO.StreamWriter (Join-Path $Path $matches[1])
with
$sw = [IO.File]::AppendText((Join-Path $Path $matches[1]))
Note also that the code doesn't do any error handling (e.g. input file doesn't begin with a FILE|... line, input file is empty, etc.). You may want to change that.
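A sketch of what that error handling might look like, wrapped in a function. Split-ByFileHeader and its parameter names are hypothetical. It appends so that repeated headers are safe, adds the .txt extension used in the original Add-Content version, counts (rather than fails on) lines that appear before the first FILE| header, and closes both streams in a finally block.

```powershell
function Split-ByFileHeader {
    param(
        [Parameter(Mandatory)][string]$InputFile,        # full path of the source file
        [Parameter(Mandatory)][string]$OutputDirectory   # where the per-header files go
    )
    $reader = New-Object System.IO.StreamReader($InputFile)
    $writer = $null
    $skipped = 0
    try {
        while ($null -ne ($line = $reader.ReadLine())) {
            if ($line -match '^FILE\|([^|]+)') {
                # New header row: close the previous output file, open the next one
                if ($writer) { $writer.Dispose() }
                $target = Join-Path $OutputDirectory "$($Matches[1]).txt"
                $writer = [System.IO.File]::AppendText($target)
            }
            if ($writer) {
                $writer.WriteLine($line)
            }
            else {
                # Data row before any FILE| header: nowhere to write it
                $skipped++
            }
        }
    }
    finally {
        if ($writer) { $writer.Dispose() }
        $reader.Dispose()
    }
    # Number of lines that could not be assigned to an output file
    $skipped
}
```

For the question's setup the call would be something like `Split-ByFileHeader -InputFile (Join-Path $filepath $filename) -OutputDirectory $filepath`. An empty input file simply produces no output files and returns 0.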