I'm extremely new to PowerShell (as in, this is the first solution I've tried). I'm trying to write a script that changes every font in a given Word document to Arial. So far I have the following, which works for body text:
$WordExts = '.docx','.doc','.docm'
$Word = New-Object -ComObject Word.Application
$Word.Visible = $false
$folder = Get-ChildItem $PSScriptRoot | ? {$_.Extension -in $WordExts}
foreach ($file in $folder){
    echo $file.FullName
    $worddoc = $Word.Documents.Open($file.FullName)
    # Select the whole main story so the font change covers all body text
    $Selection = $Word.Selection
    $Selection.WholeStory()
    $Selection.Font.Name = "Arial"
    $worddoc.Save()
    $worddoc.Close()
}
$Word.Quit()
$worddoc = $null
$Word = $null
This changes the body text of the Word document as intended. However, it does not touch text inside text boxes or the table of contents. Which objects do I need to target to reach these shapes and change the text within them? I've been looking through the PowerShell documentation but have been unable to find the answer. Thanks in advance.

I started with the link in jonsson's comment, and was able to figure out how to iterate through headers and footers.
Text boxes took some more work, since they aren't located in the same place. I asked for some help from my husband and we started using echo to look at what was in all those sections (and then in THOSE sections) until we found the right object. Also, while headers and footers are clearly defined, text boxes are scattered across several different sections, so we had to iterate through each section in the document to look for the textbox object ShapeRange. I also later added a method to remove document protection, because it was holding up the script.
$WordExts = '.docx','.doc','.docm'
$Word = New-Object -ComObject Word.Application
$Word.Visible = $false
$folder = Get-ChildItem $PSScriptRoot | ? {$_.Extension -in $WordExts}
#Remove document protection
ForEach ($file in $folder) {
    $file | Unblock-File
}
#Iterate through each file in the parent folder
foreach ($file in $folder){
    echo $file.FullName
    $worddoc = $Word.Documents.Open($file.FullName)
    #Change the font in the main body of the document
    $Selection = $Word.Selection
    $Selection.WholeStory()
    $Selection.Font.Name = "Arial"
    #Iterate through each section to locate ShapeRange (textbox) to change font
    $target = $worddoc.Sections.Count
    $count = 1;
    While($count -lt ($target + 1)){
        ForEach ($section in $worddoc.Sections($count).Range.ShapeRange) {
            ForEach($item in $section.TextFrame){
                ForEach($i in $item.TextRange){
                    $i.Font.Name = "Arial"
                }
            }
        }
        $count++
    }
    #Iterate through each header and footer to change font
    ForEach ($section in $worddoc.Sections) {
        ForEach ($header in $section.Headers) {
            $header.Range.Font.Name = "Arial"
        }
        ForEach ($footer in $section.Footers) {
            $footer.Range.Font.Name = "Arial"
        }
    }
    $worddoc.Save()
    $worddoc.Close()
}
$Word.Quit()
$worddoc = $null
$Word = $null


Search large .log for specific string quickly without streamreader

Problem: I need to search a large log file that is currently being used by another process. I cannot stop this other process or put a lock on the .log file. I need to quickly search this file, and I can't read it all into memory. I get that StreamReader() is the fastest, but I can't figure out how to avoid it attempting to grab a lock on the file.
$p = "Seachterm:Search"
$files = "\\remoteserver\c\temp\tryingtofigurethisout.log"
$SearchResult= Get-Content -Path $files | Where-Object { $_ -eq $p }
The below doesn't work because I can't get a lock of the file.
$reader = New-Object System.IO.StreamReader($files)
$lines = @()
if ($reader -ne $null) {
    while (!$reader.EndOfStream) {
        $line = $reader.ReadLine()
        if ($line.Contains($p)) {
            $lines += $line
        }
    }
}
$lines | Select-Object -Last 1
This takes too long:
get-content $files -ReadCount 500 |
foreach { $_ -match $p }
I would greatly appreciate any pointers in how I can go about quickly and efficiently (memory wise) searching a large log file.
Perhaps this will work for you. It tries to read the lines of the file as fast as possible, but with a difference to your second approach (which is approx. the same as what [System.IO.File]::ReadAllLines() would do).
To collect the lines, I use a List object which will perform faster than appending to an array using +=
$p = "Seachterm:Search"
$path = "\\remoteserver\c$\temp\tryingtofigurethisout.log"
if (!(Test-Path -Path $path -PathType Leaf)) {
    Write-Warning "File '$path' does not exist"
}
else {
    try {
        $fileStream = [System.IO.FileStream]::new($path, [System.IO.FileMode]::Open, [System.IO.FileAccess]::Read, [System.IO.FileShare]::ReadWrite)
        $streamReader = [System.IO.StreamReader]::new($fileStream)
        # or use this syntax:
        # $fileMode = [System.IO.FileMode]::Open
        # $fileAccess = [System.IO.FileAccess]::Read
        # $fileShare = [System.IO.FileShare]::ReadWrite
        # $fileStream = New-Object -TypeName System.IO.FileStream $path, $fileMode, $fileAccess, $fileShare
        # $streamReader = New-Object -TypeName System.IO.StreamReader -ArgumentList $fileStream
        # use a List object of type String or an ArrayList to collect the strings quickly
        $lines = New-Object System.Collections.Generic.List[string]
        # read the lines as fast as you can and add them to the list
        while (!$streamReader.EndOfStream) {
            $lines.Add($streamReader.ReadLine())
        }
        # close and dispose the objects used
        $streamReader.Close()
        $fileStream.Dispose()
        # do the 'Contains($p)' after reading the file to not slow that part down
        $lines.ToArray() | Where-Object { $_.Contains($p) } | Select-Object -Last 1
    }
    catch [System.IO.IOException] {}
}
Basically, it does what your second code does, but with the difference that using just the StreamReader, the file is opened with [System.IO.FileShare]::Read, whereas this code opens the file with [System.IO.FileShare]::ReadWrite
Note that there may be exceptions thrown using this because another application has write permissions to the file, hence the try{...} catch{...}
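Since the question also asks to avoid holding the whole file in memory, here is a minimal sketch of the same FileShare.ReadWrite idea that keeps only the most recent matching line instead of collecting every line first. The function name Find-LastMatchingLine and its parameters are made up for illustration; they are not part of the answer above.

```powershell
# Hypothetical helper (name and parameters are assumptions, not from the answer above).
# Opens the file with FileShare.ReadWrite so a process that is writing to it is not blocked,
# then streams it line by line and remembers only the last line containing $Term.
function Find-LastMatchingLine {
    param(
        [Parameter(Mandatory)][string]$Path,
        [Parameter(Mandatory)][string]$Term
    )
    $reader = $null
    $fileStream = [System.IO.FileStream]::new(
        $Path,
        [System.IO.FileMode]::Open,
        [System.IO.FileAccess]::Read,
        [System.IO.FileShare]::ReadWrite)
    try {
        $reader = [System.IO.StreamReader]::new($fileStream)
        $last = $null
        # ReadLine() returns $null at end of stream
        while ($null -ne ($line = $reader.ReadLine())) {
            if ($line.Contains($Term)) { $last = $line }
        }
        return $last
    }
    finally {
        # always release the handles, even if reading fails part-way
        if ($reader) { $reader.Dispose() }
        $fileStream.Dispose()
    }
}
```

It would be called as Find-LastMatchingLine -Path $path -Term $p. Memory use stays at roughly one line at a time, at the cost of running Contains() while reading.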
Hope that helps

Using Powershell to Print a Folder of Text files to PDF (Retaining the Original Base name)

First time posting - but I think this is a good one as I've spent 2 days researching, talked with local experts, and still haven't found this done.
Individual print jobs must be regularly initiated on a large set of files (.txt files), and this must be converted through the print job to a local file (i.e. through a PDF printer) which retains the original base name for each file. Further, the script must be highly portable.
The objective will not be met if the file is simply converted (and not printed), the original base file name is not retained, or the print process requires manual interaction at each print.
After my research, this is what stands so far in PowerShell:
PROBLEM: This script does everything but actually print the contents of the file.
It iterates through the files, and "prints" a .pdf while retaining the original file name base; but the .pdf is empty.
I know I'm missing something critical (i.e. maybe a stream use?); but after searching and searching have not been able to find it. Any help is greatly appreciated.
As mentioned in the code, the heart of the print function is gathered from this post:
# The heart of this script (ConvertTo-PDF) is largely taken and slightly modified from
# The $OutputFolder variable can be disregarded at the moment. It is an added bonus, and a work in progress, but not critical to the objective.
function ConvertTo-PDF {
    param(
        $TextDocumentPath, $OutputFolder
    )
    Write-Host "TextDocumentPath = $TextDocumentPath"
    Write-Host "OutputFolder = $OutputFolder"
    Add-Type -AssemblyName System.Drawing
    $doc = New-Object System.Drawing.Printing.PrintDocument
    $doc.DocumentName = $TextDocumentPath
    $doc.PrinterSettings = new-Object System.Drawing.Printing.PrinterSettings
    $doc.PrinterSettings.PrinterName = 'Microsoft Print to PDF'
    $doc.PrinterSettings.PrintToFile = $true
    $file = [io.fileinfo]$TextDocumentPath
    Write-Host "file = $file"
    $pdf= [io.path]::Combine($file.DirectoryName, $file.BaseName) + '.pdf'
    Write-Host "pdf = $pdf"
    $doc.PrinterSettings.PrintFileName = $pdf
    $doc.Print()
    Write-Host "Attempted Print: $pdf"
    $doc.Dispose()
}
# get the relative path of the TestFiles and OutputFolder folders.
$scriptPath = split-path -parent $MyInvocation.MyCommand.Definition
Write-Host "scriptPath = $scriptPath"
$TestFileFolder = "$scriptPath\TestFiles\"
Write-Host "TestFileFolder = $TestFileFolder"
$OutputFolder = "$scriptPath\OutputFolder\"
Write-Host "OutputFolder = $OutputFolder"
# initialize the files variable with content of the TestFiles folder (relative to the script location).
$files = Get-ChildItem -Path $TestFileFolder
# Send each test file to the print job
foreach ($testFile in $files)
{
    $testFile = "$TestFileFolder$testFile"
    Write-Host "Attempting Print from: $testFile"
    Write-Host "Attempting Print to : $OutputFolder"
    ConvertTo-PDF $testFile $OutputFolder
}
You are missing a handler that reads the text file and passes the text to the printer. It is defined as a scriptblock like this:
$PrintPageHandler =
{
    param([object]$sender, [System.Drawing.Printing.PrintPageEventArgs]$ev)
    # More code here - see below for details
}
and is added to the PrintDocument object like this:
$doc.add_PrintPage($PrintPageHandler)
The full code that you need is below:
$PrintPageHandler =
{
    param([object]$sender, [System.Drawing.Printing.PrintPageEventArgs]$ev)
    $linesPerPage = 0
    $yPos = 0
    $count = 0
    $leftMargin = $ev.MarginBounds.Left
    $topMargin = $ev.MarginBounds.Top
    $line = $null
    $printFont = New-Object System.Drawing.Font "Arial", 10
    # Calculate the number of lines per page.
    $linesPerPage = $ev.MarginBounds.Height / $printFont.GetHeight($ev.Graphics)
    # Print each line of the file.
    while ($count -lt $linesPerPage -and (($line = $streamToPrint.ReadLine()) -ne $null))
    {
        $yPos = $topMargin + ($count * $printFont.GetHeight($ev.Graphics))
        $ev.Graphics.DrawString($line, $printFont, [System.Drawing.Brushes]::Black, $leftMargin, $yPos, (New-Object System.Drawing.StringFormat))
        $count++
    }
    # If more lines exist, print another page.
    if ($line -ne $null)
    {
        $ev.HasMorePages = $true
    }
    else
    {
        $ev.HasMorePages = $false
    }
}
function Out-Pdf
{
    param($InputDocument, $OutputFolder)
    Add-Type -AssemblyName System.Drawing
    $doc = New-Object System.Drawing.Printing.PrintDocument
    $doc.DocumentName = $InputDocument.FullName
    $doc.PrinterSettings = New-Object System.Drawing.Printing.PrinterSettings
    $doc.PrinterSettings.PrinterName = 'Microsoft Print to PDF'
    $doc.PrinterSettings.PrintToFile = $true
    $streamToPrint = New-Object System.IO.StreamReader $InputDocument.FullName
    # Register the handler so each printed page pulls its lines from $streamToPrint
    $doc.add_PrintPage($PrintPageHandler)
    $doc.PrinterSettings.PrintFileName = "$($InputDocument.DirectoryName)\$($InputDocument.BaseName).pdf"
    $doc.Print()
    $streamToPrint.Close()
}
Get-Childitem -Path "$PSScriptRoot\TextFiles" -File -Filter "*.txt" |
    ForEach-Object { Out-Pdf $_ $_.Directory }
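To see how the handler's paging logic behaves without a printer, the loop can be simulated in plain PowerShell. This is only a sketch: Split-LinesIntoPages is a hypothetical helper, the reader is a StringReader rather than a file, and the lines-per-page value is passed in instead of being computed from the margins and font height.

```powershell
# Hypothetical helper that mimics the PrintPage handler: each pass of the outer loop
# stands for one PrintPage event, and $hasMorePages plays the role of $ev.HasMorePages.
function Split-LinesIntoPages {
    param(
        [string]$Text,
        [int]$LinesPerPage
    )
    $reader = New-Object System.IO.StringReader $Text
    $pages = New-Object System.Collections.Generic.List[object]
    $hasMorePages = $true
    while ($hasMorePages) {
        $count = 0
        $line = $null
        $pageLines = @()
        # same condition as the handler: stop at a full page or at end of input
        while ($count -lt $LinesPerPage -and ($null -ne ($line = $reader.ReadLine()))) {
            $pageLines += $line
            $count++
        }
        $pages.Add($pageLines)
        # a non-null last line means the page filled up before the input ran out
        $hasMorePages = ($null -ne $line)
    }
    $reader.Dispose()
    return ,$pages.ToArray()
}
```

For example, Split-LinesIntoPages -Text "a`nb`nc`nd`ne" -LinesPerPage 2 yields three pages. One consequence of this loop shape is that when the line count is an exact multiple of the page size, a final empty page is produced, because the end of input is only detected on the next read.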
Incidentally, this is based on the official Microsoft C# example here:

Optimize Word document keyword search

I'm trying to search for keywords across a large number of MS Word documents, and return the results to a file. I've got a working script, but I wasn't aware of the scale, and what I've got isn't nearly efficient enough: it would take days to plod through everything.
The script as it stands now takes keywords from CompareData.txt and runs it through all the files in a specific folder, then appends it to a file.
So when I'm done I will know how many files have each specific keyword.
[CmdletBinding()]
Param(
    $Path = "C:\willscratch\"
) #end param
$findTexts = (Get-Content c:\scratch\CompareData.txt)
Foreach ($Findtext in $FindTexts)
{
    $matchCase = $false
    $matchWholeWord = $true
    $matchWildCards = $false
    $matchSoundsLike = $false
    $matchAllWordForms = $false
    $forward = $true
    $wrap = 1
    $application = New-Object -comobject word.application
    $application.visible = $False
    $docs = Get-childitem -path $Path -Recurse -Include *.docx
    $i = 1
    $totaldocs = 0
    Foreach ($doc in $docs)
    {
        Write-Progress -Activity "Processing files" -status "Processing $($doc.FullName)" -PercentComplete ($i /$docs.Count * 100)
        $document = $application.documents.open($doc.FullName)
        $range = $document.content
        $null = $range.movestart()
        $wordFound = $range.find.execute($findText,$matchCase,
            $matchWholeWord,$matchWildCards,$matchSoundsLike,
            $matchAllWordForms,$forward,$wrap)
        if($wordFound)
        {
            $totaldocs ++
        } #end if $wordFound
        $document.close()
        $i++
    } #end foreach $doc
    $application.quit()
    "There are $totaldocs total files with $findText" | Out-File -Append C:\scratch\output.txt
    #clean up stuff
    [System.Runtime.InteropServices.Marshal]::ReleaseComObject($range) | Out-Null
    [System.Runtime.InteropServices.Marshal]::ReleaseComObject($document) | Out-Null
    [System.Runtime.InteropServices.Marshal]::ReleaseComObject($application) | Out-Null
    Remove-Variable -Name application
}
What I'd like to do is figure out a way to search each file for everything in CompareData.txt once, rather than iterate through it a bunch of times. If I was dealing with a small set of data, the approach I've got would get the job done - but I've come to find out that both the data in CompareData.txt and the source Word file directory will be very large.
Any ideas on how to optimize this?
Right now you're doing this (pseudocode):
foreach $Keyword {
    create Word Application
    foreach $File {
        load Word Document from $File
        find $Keyword
    }
}
That means that if you have 100 keywords and 10 documents, you're opening and closing 100 instances of Word and loading a thousand Word documents before you're done.
Do this instead:
create Word Application
foreach $File {
    load Word Document from $File
    foreach $Keyword {
        find $Keyword
    }
}
So you only launch one instance of Word and only load each document once.
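The reordered loops can be sketched in runnable form without Word by working on text that has already been extracted. Everything here is an assumption for illustration: the function name Get-KeywordFileCount, the hashtable of file name to text, and the whole-word regex used as a rough stand-in for matchWholeWord.

```powershell
# Hypothetical helper: counts, per keyword, how many documents contain it.
# $DocumentText maps a file name to that file's full text (already loaded once).
function Get-KeywordFileCount {
    param(
        [hashtable]$DocumentText,
        [string[]]$Keywords
    )
    $counts = [ordered]@{}
    foreach ($keyword in $Keywords) { $counts[$keyword] = 0 }

    foreach ($file in $DocumentText.Keys) {      # outer loop: each document is visited once
        $text = $DocumentText[$file]
        foreach ($keyword in $Keywords) {        # inner loop: every keyword against that text
            # whole-word, case-insensitive match on the escaped keyword
            $pattern = '\b' + [regex]::Escape($keyword) + '\b'
            if ($text -match $pattern) {
                $counts[$keyword] = $counts[$keyword] + 1
            }
        }
    }
    return $counts
}
```

With real documents, the outer loop body would be where the file is opened and its text read, so that cost is paid once per file rather than once per keyword.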
As noted in the comments, you may optimize the whole process by using the OpenXML SDK, rather than launching Word:
(assuming you've installed OpenXML SDK in its default location)
# Import the OpenXML library
Add-Type -Path 'C:\Program Files (x86)\Open XML SDK\V2.5\lib\DocumentFormat.OpenXml.dll'
# Grab the keywords and file names
$Keywords = Get-Content C:\scratch\CompareData.txt
$Documents = Get-childitem -path $Path -Recurse -Include *.docx
# hashtable to store results per document
$KeywordMatches = @{}
# store OpenXML word document type in variable as a shorthand
$WordDoc = [DocumentFormat.OpenXml.Packaging.WordprocessingDocument] -as [type]
foreach($Docx in $Documents)
{
    # create array to hold matched keywords
    $KeywordMatches[$Docx.FullName] = @()
    # open document, wrap content stream in streamreader
    $Document = $WordDoc::Open($Docx.FullName, $false)
    $DocumentStream = $Document.MainDocumentPart.GetStream()
    $DocumentReader = New-Object System.IO.StreamReader $DocumentStream
    # read entire document
    $DocumentContent = $DocumentReader.ReadToEnd()
    # release the reader and the package before moving on
    $DocumentReader.Dispose()
    $Document.Dispose()
    # test for each keyword
    foreach($Keyword in $Keywords)
    {
        $Pattern = [regex]::Escape($KeyWord)
        $WordFound = $DocumentContent -match $Pattern
        if($WordFound)
        {
            $KeywordMatches[$Docx.FullName] += $Keyword
        }
    }
}
Now, you can show the number of matched keywords for each document:
$KeywordMatches.GetEnumerator() |Select @{n="File";E={$_.Key}},@{n="Count";E={$_.Value.Count}}

Import Excel data into PowerShell variables

I have an Excel File which has an unknown number of records in it, and these 3 columns:
Variable Name, Store Number, Email Address
I use this in QlikView to import data for certain stores and then create a separate report for each store in the list. I then need to email each report to each individual store (store number will be in the report file name).
So in PowerShell I would like to read the Excel File and set variables for each store:
$Store1 = The Store Number in Row 2 of the Excel File
$Store1Email = The Store Email in Row 2 of the Excel File
$Store2 = The Store Number in Row 3 of the Excel File
$Store2Email = The Store Email in Row 3 of the Excel File
etc. for each Store in the file (can be any number of stores).
Please note the "Variable Name" in the Excel file must be ignored (that is for QlikView) and the PowerShell variables must be named as per my above examples, each time incrementing the number.
Check out my PowerShell Excel Module on Github. You can also grab it from the PowerShell Gallery.
$stores = Import-Excel C:\Temp\store.xlsx
'All stores'
Ok, first off if you are going to be working with actual .XLS or .XLSX or .XLSM files I would highly suggest using the Import-XLS function from the TechNet gallery (found here).
After that, just reference the object it imports to send the emails instead of making objects for each store. Such as:
$StoreList = Import-XLS <path to Excel file>
# GCI (Get-ChildItem) returns file objects, which is what .BaseName below needs
GCI <report folder> | %{
    $Current = $_
    $Store = $StoreList|?{$_.StoreNumber -match $Current.BaseName}|Select -ExpandProperty StoreNumber
    $Email = $StoreList|?{$_.StoreNumber -match $Current.BaseName}|Select -ExpandProperty StoreEmail
    <code to send $Current to $Email>
}
My preference is to Save-As the Excel file to a '.csv' type. The comma separated value can easily be imported into PowerShell.
$csvFile = Import-Csv -Path c:\scripts\temp\excelFile.csv
#now the entire Excel '.csv' file is saved into csvFile variable
$csvFile |Get-Member
#look at the properties
Remember to study the greats so your PowerShell script looks great: Jeffrey Snover, Jason Hicks, Don Jones, Ashley McGlone, and anyone on their friends list, ha ha.
The above answers usually work, but I just had a project with excel datasheets that caused some problems.
edit: Here's a much more advanced version that will pull it into an object, can handle blank and duplicate column names, and can skip human information at the beginning of the worksheet by looking for something in the header row. I've also included some example usages
Your example:
$file = New-Module -AsCustomObject -ScriptBlock $file_template
$file.from_excel("c:\folder\file.xls") # placeholder path, same as in Example 2
$Store1 = $file.data[0]."Store Number" #first row, column named "Store Number"
$Store1Email = $file.data[0]."Store Email" #first row, column named "Store Email"
foreach ($row in $file.data)
{
    write-host "Store: $($row."Store Number")"
    write-host "Store Email: $($row."Store Email")"
}
Example 1:
# Simplest example
$file = New-Module -AsCustomObject -ScriptBlock $file_template
$file.from_excel("c:\folder\file.xls") # placeholder path, same as in Example 2
Example 2:
#advanced usage
$file = New-Module -AsCustomObject -ScriptBlock $file_template
$file.header_contains="First Name" # if included it will drop everything before the first line that contains this, useful if there are instructions for humans in the worksheet
$file.indexer_column = 5 # Default: 1 (first column); This column's contents will set the minimum number of rows, use if there are blank rows in your file but more data after them
$file.worksheet_index = "January" # Default: 1; can be a sheet index or sheet name
$file.filename = "c:\folder\file.xls" #can set this independently, useful for validation and troubleshooting
$file.from_excel() #This is where we actually pull from excel
$collected = $file.data|ogv -PassThru #this is a neat way to select some rows you want
$file.headers.count # It stores an array of the headers here, useful for troubleshooting and advanced logic
Excel Reader pseudoclass
$file_template = {
    # -- universal --
    $filename = ""
    $delimiter = ","
    $headers = @()
    $data = @()
    # -- used by some functions --
    # we put these here to allow assigning them before calling functions, which improves readability and auditability
    $header_contains = ""  # assumed default: an empty value matches the first row
    $indexer_column = 1    # default taken from Example 2
    $worksheet_index = 1   # default taken from Example 2

    function from_excel($filename = $this.filename, $worksheet_index = $this.worksheet_index) {
        $this.filename = $filename
        $this.worksheet_index = $worksheet_index
        $data_by_row = $this.from_excel_as_csv() # $data_by_row = $file.from_excel_as_csv($test_file)
        $data_by_row = $data_by_row -split "`n"
        #if ($this.headers.count -lt 1) {$this.headers = $data_by_row[0] -split $this.delimiter} #this would let us set headers elsewhere which is more flexible but less adaptive. Because columns change unpredictably we need something more adaptive
        $temp_headers = $data_by_row[0] -split $this.delimiter
        $temp_headers = $this.fix_blank_headers($temp_headers)
        $this.headers = $this.dedupe_headers($temp_headers)
        $this.data = $data_by_row|select -Skip 1|ConvertFrom-Csv -Header $this.headers -Delimiter $this.delimiter
    }

    function from_csv($filename = $this.filename) {
        $this.filename = $filename
        $this.headers = (Get-Content $this.filename -ReadCount 1|select -first 1) -split $this.delimiter
        $this.data = Get-Content $this.filename|ConvertFrom-Csv -Delimiter $this.delimiter
    }

    function from_excel_as_csv($filename = $this.filename, $worksheet_index = $this.worksheet_index) {
        $this.filename = $filename
        $this.worksheet_index = $worksheet_index
        #set up excel
        Write-Host "Importing from excel, this may take a little while..."
        $excel = New-Object -ComObject Excel.Application
        $excel.DisplayAlerts = $false
        $excel.Visible = $false
        $workbook = $excel.Workbooks.Open($this.filename)
        $worksheet = $workbook.Worksheets.Item($this.worksheet_index)
        #import from excel
        $data_by_row = ""
        $indexed_column = $worksheet.columns.item($this.indexer_column).value2 #we use this to work around some files having headers with blank space
        $minimum_rows = (($indexed_column -join "◘").TrimEnd("◘") -split "◘").count # This strips the million or so extra blank rows excel appends to get a realistic column length.
        $i = 1
        [bool]$header_found = 0
        try {
            do {
                $row = $worksheet.rows.item($i).value2
                $row_as_text = $row -join "◘" # ◘ (alt+8) is just a placeholder that's unlikely to show up in the text
                $row_as_text = $row_as_text -replace $this.delimiter,"."
                $row_as_text = $row_as_text.TrimEnd("◘")
                $row_as_text = $row_as_text -replace "◘",$this.delimiter
                if ($row_as_text -like "*$($this.header_contains)*"){[bool]$header_found=1}
                if ($header_found) {$data_by_row+="$row_as_text`n"}
                $i++
            }
            while ( ($row_as_text.Length -gt 1) -or ($i -lt $minimum_rows) )
        }
        catch {Write-Warning "ERROR Importing from excel"}
        #close excel
        $workbook.Close($false)
        $excel.Quit()
        write-host "Done importing from excel"
        return $data_by_row
    }

    function dedupe_headers($headers){
        $dupes = ($headers|group)|?{$_.count -gt 1}
        if ($dupes.count -ge 1)
        {
            foreach ($dupe in $dupes)
            {   #$dupe = $dupes[0]
                $i = 0
                $new_headers = @()
                foreach ($header in $headers)
                {   #$header = $headers[0]
                    if ($header -eq $dupe.name)
                    {
                        $header = "$($header)_$($i)" # "header_#"
                        $i++
                    }
                    $new_headers += $header
                }
                # carry the renamed headers into the next pass so earlier fixes are kept
                $headers = $new_headers
            }
        }
        else {$new_headers = $headers} # no duplicates found
        return $new_headers
    }

    function fix_blank_headers($headers) {
        $replace_blanks_with = "_"
        $new_headers = @()
        foreach ($header in $headers)
        {
            if ($header -eq "") {$new_headers += $replace_blanks_with}
            else {$new_headers += $header}
        }
        if ($new_headers.count -ne $headers.count)
        {
            $error_json = @($headers),@($new_headers)|ConvertTo-Json -Compress
            Write-Error "Error when fixing blank headers, original and new counts are different $($error_json)"
        }
        return $new_headers
    }

    <# function some_function($some_parameter){return $some_parameter} #>
    Export-ModuleMember -Function * -Variable *
}
Forgive the ugliness here. I am not a programmer, so there are undoubtedly more optimized ways to do this, as well as better formatting. It will work, however, if I understand your requirements correctly.
$excelfile = import-csv "c:\myfile.csv"
$i = 1
$excelfile | ForEach-Object {
    New-Variable "Store$i" $_."Store Number"
    $iemail = $i.ToString() + "Email"
    New-Variable "Store$iemail" $_."Email Address"
    $i++
}
edit: as per the reply to your original post, this works with a csv file. Just save it to csv first if necessary.
$excelfile = import-csv "C:\Temp\store.csv"
$i = 1
$excelfile | ForEach-Object {
    $NA = $_."Name"
    $SN = $_."StoreNumber"
    Write-Output "row $i"
    $i++
}
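As a self-contained illustration of the CSV route, the sketch below builds the numbered variables from inline sample data instead of a file. The sample rows, the example.com addresses, and the $StoreCount variable are made up; the column names are assumed to match the three columns described in the question.

```powershell
# Sample data standing in for the exported CSV (values are invented for illustration).
$csvText = @'
Variable Name,Store Number,Email Address
vStoreA,101,store101@example.com
vStoreB,205,store205@example.com
'@

$rows = $csvText | ConvertFrom-Csv
$i = 1
foreach ($row in $rows) {
    # "Variable Name" is ignored; only the store number and email are kept
    Set-Variable -Name "Store$i" -Value $row.'Store Number'
    Set-Variable -Name "Store${i}Email" -Value $row.'Email Address'
    $i++
}
# hypothetical convenience variable: how many stores were loaded
$StoreCount = $i - 1
```

After this runs, $Store1 and $Store1Email hold the first data row, $Store2 and $Store2Email the second, and so on. Replacing the inline text with Import-Csv on the saved file should give the same result for a real workbook export.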

preserving header footer while combining several doc files into one

This code creates several files, each with a different header and footer.
I want to merge them into one document while preserving their headers and footers.
How can I do that in PowerShell?
$val=ls $dir
set-variable -name wdAlignPageNumberCenter -value 1
foreach ($file in $val){
    $Word = New-Object -ComObject Word.Application
    $Word.Visible = $true
    $Doc = $Word.Documents.Add()
    $Section = $Doc.Sections.Item(1)
    $Header = $Section.Headers.Item(1)
    $Footer = $Section.Footers.Item(1)
    ****some code******
    $filedata = (get-content $filename)
    $head="$abcd`t`tFile ID: $file"
    $Header.Range.Text = $head
    $line_count = 0
    ***some code*******
}
Start a new section in your new Word file before adding each old Word file. Each section of a Word document can have its own header/footer. But, if you don't specify that the new header/footer is independent, you will change the previous header/footer. Do it manually first to get the hang of it.
I just realized, this is basically the same thing you were told yesterday, with a similar problem.