I am attempting to take a large document, search for a "^m" (page break) and create a new text file for each page break I find.
Using:
$SearchText = "^m"
$word = new-object -ComObject "word.application"
$path = "C:\Users\me\Documents\Test.doc"
$doc = $word.documents.open("$path")
$doc.content.find.execute("$SearchText")
I am able to find text, but how do I save the text before the page break into a new file? In VBScript, I would just do a readline and save it to a buffer, but PowerShell is quite different.
EDIT:
$text = $word.Selection.MoveUntil (cset:="^m")
returns an error:
Missing ')' in method call.
I think my solution is kinda stupid, but here is my own solution (please help me find a better one):
Param(
[string]$file
)
#$file = "C:\scripts\docSplit\test.docx"
$word = New-Object -ComObject "word.application"
$doc=$word.documents.open($file)
$txtPageBreak = "<!--PAGE BREAK--!>"
$fileInfo = Get-ChildItem $file
$folder = $fileInfo.directoryName
$fileName = $fileInfo.name
$newFileName = $fileName.replace(".", "")
#$findtext = "^m"
#$replaceText = $txtPageBreak
function Replace-Word ([string]$Document,[string]$FindText,[string]$ReplaceText) {
#Variables used to Match And Replace
$ReplaceAll = 2
$FindContinue = 1
$MatchCase = $False
$MatchWholeWord = $True
$MatchWildcards = $False
$MatchSoundsLike = $False
$MatchAllWordForms = $False
$Forward = $True
$Wrap = $FindContinue
$Format = $False
$Selection = $Word.Selection
$Selection.Find.Execute(
$FindText,
$MatchCase,
$MatchWholeWord,
$MatchWildcards,
$MatchSoundsLike,
$MatchAllWordForms,
$Forward,
$Wrap,
$Format,
$ReplaceText,
$ReplaceAll
)
$newFileName = "$folder\$newFileName.txt"
$Doc.saveAs([ref]"$newFileName",[ref]2)
$doc.close()
}
Replace-Word $file "^m" $txtPageBreak # space-separated arguments; parentheses with commas would pass one array
$word.quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($word)
Remove-Variable word
#begin txt file manipulation
#add end of file marker
$eof = "`n<!--END OF FILE!-->"
Add-Content $newfileName $eof
$masterTextFile = Get-Content $newFileName
$buffer = ""
foreach($line in $masterTextFile){
if($line.compareto($eof) -eq 0){
#end of file, save buffer to new file, be done
}
else {
$found = $line.CompareTo($txtPageBreak)
if ($found -ne 0) { # CompareTo returns 0 only when the line equals the marker
$buffer = "$buffer $line `n"
}
else {
#save the buffer to a new file (still have to write this part)
}
}
}
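One possible way to finish the splitting step, sketched under assumptions: the text has already been exported with the page-break marker in place (as the script above does), and the helper name Split-TextOnMarker, the temp folder, and the page_N.txt naming are hypothetical, not part of the original script.

```powershell
# Hypothetical helper: split exported text into pages on a literal marker.
function Split-TextOnMarker {
    param(
        [string]$Text,
        [string]$Marker = '<!--PAGE BREAK--!>'
    )
    # String.Split with a string[] separator treats the marker literally (no regex).
    $pages = $Text.Split([string[]]@($Marker), [System.StringSplitOptions]::None)
    # Trim surrounding whitespace and drop pages that end up empty.
    $pages | ForEach-Object { $_.Trim() } | Where-Object { $_ -ne '' }
}

# Example on an in-memory string; real input could come from Get-Content -Raw.
$sample = "first page<!--PAGE BREAK--!>second page<!--PAGE BREAK--!>third page"
$pages = @(Split-TextOnMarker -Text $sample)

# Write each page to its own file (folder and naming scheme are assumptions).
$outDir = Join-Path ([System.IO.Path]::GetTempPath()) 'docSplit'
New-Item -ItemType Directory -Path $outDir -Force | Out-Null
for ($n = 0; $n -lt $pages.Count; $n++) {
    $target = Join-Path $outDir ("page_{0}.txt" -f ($n + 1))
    Set-Content -Path $target -Value $pages[$n]
}
```

Splitting the whole text at once avoids the line-by-line buffer and the end-of-file marker, at the cost of holding the full document text in memory.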
Related
I'm trying to replace multiple strings in a word document using PowerShell, but only one string is replaced when running the code below:
#Includes
Add-Type -AssemblyName System.Windows.Forms
#Functions
#Function to find and replace in a word document
function FindAndReplace($objSelection, $findText,$replaceWith){
$matchCase = $true
$matchWholeWord = $true
$matchWildcards = $false
$matchSoundsLike = $false
$matchAllWordForms = $false
$forward = $true
$wrap = [Microsoft.Office.Interop.Word.WdFindWrap]::wdReplaceAll
$format = $false
$replace = [Microsoft.Office.Interop.Word.WdFindWrap]::wdFindContinue
$objSelection.Find.Execute($findText,$matchCase,$matchWholeWord,$matchWildcards,$matchSoundsLike,$matchAllWordForms,$forward,$wrap,$format,$replaceWith, $replace) > $null
}
$item1 = "Should"
$item2 = "this"
$item3 = "work"
$item4 = "?"
$fileName = "NewFile"
#Opens a file browsers to select a word document
$FileBrowser = New-Object System.Windows.Forms.OpenFileDialog -Property @{
InitialDirectory = [Environment]::GetFolderPath('Desktop')
Filter = 'Documents (*.docx)|*.docx'
}
Write-Host "Select word template file"
$FileBrowser.ShowDialog()
$templateFile = $FileBrowser.FileName
$word = New-Object -comobject Word.Application
$word.Visible = $false
$template = $word.Documents.Open($templateFile)
$selection = $template.ActiveWindow.Selection
FindAndReplace $selection '#ITEM1#' $item1
FindAndReplace $selection '#ITEM2#' $item2
FindAndReplace $selection '#ITEM3#' $item3
FindAndReplace $selection '#ITEM4#' $item4
$fileName = $fileName
$template.SaveAs($fileName)
$word.Quit()
If I comment out some of the FindAndReplace calls, the first one that runs works, but subsequent calls do not.
For example running this as is results in:
Input Output
#ITEM1# Should
#ITEM2# #ITEM2#
#ITEM3# #ITEM3#
#ITEM4# #ITEM4#
I'm not sure what I'm missing; any help would be appreciated.
As was suggested it appears that the cursor was not returning to the beginning of the document. I added the following code:
Set-Variable -Name wdGoToLine -Value 3 -Option Constant
Set-Variable -Name wdGoToAbsolute -Value 1 -Option Constant
To the beginning of my script and:
$objSelection.GoTo($wdGoToLine, $wdGoToAbsolute, 1) > $null
as the first line in my FindAndReplace function, and now it works as expected.
There may be a more elegant solution, but this works for me
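An alternative that avoids moving the cursor is to run the search on a Range's Find object (for example the one from $template.Content) instead of the Selection, since a range search does not depend on where the selection currently sits. A minimal sketch, assuming the numeric values 1 (wdFindContinue) and 2 (wdReplaceAll) for the Word constants; the function name Invoke-ReplaceAll is hypothetical.

```powershell
# Hypothetical helper: replace every match using a Find object taken from a Range.
function Invoke-ReplaceAll {
    param($Find, [string]$FindText, [string]$ReplaceWith)
    $wdFindContinue = 1   # WdFindWrap value: keep searching when the range end is reached
    $wdReplaceAll   = 2   # WdReplace value: replace all occurrences
    # Argument order: FindText, MatchCase, MatchWholeWord, MatchWildcards,
    # MatchSoundsLike, MatchAllWordForms, Forward, Wrap, Format, ReplaceWith, Replace
    $Find.Execute($FindText, $true, $true, $false, $false, $false,
                  $true, $wdFindContinue, $false, $ReplaceWith, $wdReplaceAll)
}

# Intended usage against an open document (requires Word, so shown as a comment):
# Invoke-ReplaceAll -Find $template.Content.Find -FindText '#ITEM1#' -ReplaceWith $item1 > $null
```

Note that Wrap is the eighth argument and Replace the eleventh, so the wrap constant and the replace constant must not be swapped.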
I am sort of new to scripting and here's my task:
A folder with X files. Each file contains some Word documents, Excel sheets, etc. In these files, there is a client name and I need to assign an ID number.
This change will affect all the files in this folder that contain this client's name.
How can I do this using Windows PowerShell?
$configFiles = Get-ChildItem . *.config -rec
foreach ($file in $configFiles)
{
(Get-Content $file.PSPath) |
Foreach-Object { $_ -replace " JOHN ", "123" } |
Set-Content $file.PSPath
}
Is this the right approach?
As @lee_Daily pointed out, you would need different code to perform a find and replace in each file type. Here is an example of how you could go about doing that:
$objWord = New-Object -comobject Word.Application
$objWord.Visible = $false
foreach ( $file in (Get-ChildItem . -r ) ) {
Switch ( $file.Extension ) {
".config" {
(Get-Content $file.FullName) |
Foreach-Object { $_ -replace " JOHN ", "123" } |
Set-Content $file.FullName
}
{ $_ -in '.doc', '.docx' } {
### Replace in word document using $file.fullname as the target
}
'.xlsx' {
### Replace in spreadsheet using $file.fullname as the target
}
}
}
For the actual code to perform the find and replace, I would suggest COM objects for both.
Example of Word find and replace: https://codereview.stackexchange.com/questions/174455/powershell-script-to-find-and-replace-in-word-document-including-header-footer
Example of Excel find and replace: "Search & Replace in Excel without looping?"
I would also suggest learning the ImportExcel module; it is a great tool that I use a lot.
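The per-extension dispatch can also be pulled into a small function so it is easy to check on its own. This is only a sketch: the function name Get-ReplaceKind and the labels it returns are hypothetical placeholders for the real replace routines.

```powershell
# Hypothetical helper: map a file extension to the kind of replace routine to use.
function Get-ReplaceKind {
    param([string]$Extension)
    # switch compares strings case-insensitively by default, so '.DOCX' matches too.
    switch ($Extension) {
        '.config'                  { 'text';  break }
        { $_ -in '.doc', '.docx' } { 'word';  break }
        '.xlsx'                    { 'excel'; break }
        default                    { 'skip' }
    }
}

$kinds = '.config', '.DOCX', '.xlsx', '.png' | ForEach-Object { Get-ReplaceKind $_ }
```

A script block condition such as { $_ -in '.doc', '.docx' } is evaluated against the switch value, whereas a script block containing only a non-empty string literal is always true and would match every file.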
For Word documents, this is what I'm using. I just can't figure out how this script could also change the header and footer in a Word document:
$objWord = New-Object -comobject Word.Application
$objWord.Visible = $false
$list = Get-ChildItem "C:\Users\*.*" -Include *.doc*
foreach($item in $list){
$objDoc = $objWord.Documents.Open($item.FullName,$true)
$objSelection = $objWord.Selection
$wdFindContinue = 1
$FindText = " BLAH "
$MatchCase = $False
$MatchWholeWord = $true
$MatchWildcards = $False
$MatchSoundsLike = $False
$MatchAllWordForms = $False
$Forward = $True
$Wrap = $wdFindContinue
$Format = $False
$wdReplaceNone = 0
$ReplaceWith = "help "
$wdFindContinue = 1
$ReplaceAll = 2
$a = $objSelection.Find.Execute($FindText,$MatchCase,$MatchWholeWord, `
$MatchWildcards,$MatchSoundsLike,$MatchAllWordForms,$Forward,`
$Wrap,$Format,$ReplaceWith,$ReplaceAll)
$objDoc.Save()
$objDoc.Close()
}
$objWord.Quit()
What if I try to do this in C#? Is anything else missing?
string rootfolder = @"C:\Temp";
string[] files = Directory.GetFiles(rootfolder, "*.*",SearchOption.AllDirectories);
foreach (string file in files)
{ try
{ string contents = File.ReadAllText(file);
contents = contents.Replace(@"Text to find", @"Replacement text");
// Make files writable
File.SetAttributes(file, FileAttributes.Normal);
File.WriteAllText(file, contents);
}
catch (Exception ex)
{ Console.WriteLine(ex.Message);
}
}
I want to replace a string with a hyperlink.
I tried something like this:
Update:
$FindText = "[E-mail]"
$email = "asdadasd@asdada.com"
$a = $objSelection.Find.Execute($FindText)
$newaddress = $objSelection.Hyperlinks.Add($objSelection.Range, $email)
But this inserts the email at the beginning of the Word file instead of replacing the string "[E-mail]".
Add-Type -AssemblyName "Microsoft.Office.Interop.Word"
$wdunits = "Microsoft.Office.Interop.Word.wdunits" -as [type]
$objWord = New-Object -ComObject Word.Application
$objWord.Visible = $false
$findText = "[E-mail]"
$emailAddress = "someemail@example.com"
$mailTo = "mailto:"+$emailAddress
$objDoc = $objWord.Documents.Open("Path\to\input.docx")
$saveAs = "Path\to\output.docx"
$range = $objDoc.Content
$null = $range.movestart($wdunits::wdword,$range.start)
$objSelection = $objWord.Selection
$matchCase = $false
$matchWholeWord = $true
$matchWildcards = $false
$matchSoundsLike = $false
$matchAllWordForms = $false
$forward = $true
$wrap = 1
$format = $False
$wdReplaceNone = 0
$wdFindContinue = 1
$wdReplaceAll = 2
$wordFound = $range.find.execute($findText,$matchCase,$matchWholeWord,$matchWildCards,$matchSoundsLike,$matchAllWordForms,$forward,$wrap)
if($wordFound)
{
if ($range.style.namelocal -eq "normal")
{
$null = $objDoc.Hyperlinks.Add($range,$mailTo,$null,$null,$emailAddress)
}
}
$objDoc.SaveAs($saveAs)
$objDoc.Close()
$objWord.Quit()
Remove-Variable -Name objWord
[gc]::Collect()
[gc]::WaitForPendingFinalizers()
Kinda ugly, but this script will do what you need. It opens the .docx as $objDoc, finds $findText, replaces it with a mailto link for $emailAddress, and then saves the changes to $saveAs. As written it performs a single search, so only the first match is converted; repeat the find in a loop to handle every instance.
Most of this is based on a "Hey, Scripting Guy" article.
The code below works fine: we get the start and end points of the text that needs to be extracted, but I'm not able to get Range.SetRange/Select to work.
I'm able to get the positions from the code below; I just need to extract the range and save it to a CSV file...
$found = $paras2.Range.SetRange($startPosition, $endPosition) - this piece doesn't work.
$file = "D:\Files\Scan.doc"
$SearchKeyword1 = 'Keyword1'
$SearchKeyword2 = 'Keyword2'
$word = New-Object -ComObject Word.Application
$word.Visible = $false
$doc = $word.Documents.Open($file,$false,$true)
$sel = $word.Selection
$paras = $doc.Paragraphs
$paras1 = $doc.Paragraphs
$paras2 = $doc.Paragraphs
foreach ($para in $paras)
{
if ($para.Range.Text -match $SearchKeyword1)
{
Write-Host $para.Range.Text
$startPosition = $para.Range.Start
}
}
foreach ($para in $paras1)
{
if ($para.Range.Text -match $SearchKeyword2)
{
Write-Host $para.Range.Text
$endPosition = $para.Range.Start
}
}
Write-Host $startPosition
Write-Host $endPosition
$found = $paras2.Range.SetRange($startPosition, $endPosition)
# cleanup com objects
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($doc) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($word) | Out-Null
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
This line of code is the problem
$found = $paras2.Range.SetRange($startPosition, $endPosition)
When designating a Range by the start and end position it's necessary to do so relative to the document. The code above refers to a Paragraphs collection. In addition, it uses SetRange, but should only use the Range method. So:
$found = $doc.Range($startPosition, $endPosition)
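A sketch of the extraction and CSV step, assuming $doc, $startPosition and $endPosition from the question. The helper name Export-RangeText, the column names and the output path are hypothetical.

```powershell
# Hypothetical helper: read the text between two document offsets and save it as CSV.
function Export-RangeText {
    param($Document, [int]$RangeStart, [int]$RangeEnd, [string]$CsvPath)
    # Range(start, end) is called on the document; offsets are relative to the document.
    $range = $Document.Range($RangeStart, $RangeEnd)
    [pscustomobject]@{
        Start = $RangeStart
        End   = $RangeEnd
        Text  = $range.Text
    } | Export-Csv -Path $CsvPath -NoTypeInformation
}

# Intended usage with the variables from the question (requires Word, so shown as a comment):
# Export-RangeText -Document $doc -RangeStart $startPosition -RangeEnd $endPosition -CsvPath 'D:\Files\Scan.csv'
```

Export-Csv quotes the text field, so paragraph marks inside the extracted range stay within one CSV cell.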
I am trying to put together a PowerShell script to do multiple find and replace throughout a whole Word Document, that is including Headers, Footers and any Shape potentially displaying text.
There are plenty of VBA examples around so it's not too difficult, but there is a known bug that is circumvented in VBA with a solution dubbed "Peter Hewett's VBA trickery". See this example and also this one.
I have tried to address this bug in a similar fashion in PowerShell but it is not working as expected. Some TextBoxes in Header or Footer are still being ignored.
I noticed, however, that running my script twice does end up working.
Any idea as to a solution to this problem would be greatly appreciated.
$folderPath = "C:\Users\user\folder\*" # multi-folders: "C:\fso1*", "C:\fso2*"
$fileType = "*.doc" # *.doc will take all .doc* files
$textToReplace = @{
# "TextToFind" = "TextToReplaceWith"
"This1" = "That1"
"This2" = "That2"
"This3" = "That3"
}
$word = New-Object -ComObject Word.Application
$word.Visible = $false
$storyTypes = [Microsoft.Office.Interop.Word.WdStoryType]
#Val, Name
# 1, wdMainTextStory
# 2, wdFootnotesStory
# 3, wdEndnotesStory
# 4, wdCommentsStory
# 5, wdTextFrameStory
# 6, wdEvenPagesHeaderStory
# 7, wdPrimaryHeaderStory
# 8, wdEvenPagesFooterStory
# 9, wdPrimaryFooterStory
# 10, wdFirstPageHeaderStory
# 11, wdFirstPageFooterStory
# 12, wdFootnoteSeparatorStory
# 13, wdFootnoteContinuationSeparatorStory
# 14, wdFootnoteContinuationNoticeStory
# 15, wdEndnoteSeparatorStory
# 16, wdEndnoteContinuationSeparatorStory
# 17, wdEndnoteContinuationNoticeStory
Function findAndReplace($objFind, $FindText, $ReplaceWith) {
#simple Find and Replace to execute on a Find object
$matchCase = $true
$matchWholeWord = $true
$matchWildcards = $false
$matchSoundsLike = $false
$matchAllWordForms = $false
$forward = $true
$findWrap = [Microsoft.Office.Interop.Word.WdReplace]::wdReplaceAll
$format = $false
$replace = [Microsoft.Office.Interop.Word.WdFindWrap]::wdFindContinue
$objFind.Execute($FindText, $matchCase, $matchWholeWord, $matchWildCards, $matchSoundsLike, $matchAllWordForms, `
$forward, $findWrap, $format, $ReplaceWith, $replace) > $null
}
Function findAndReplaceAll($objFind, $FindText, $ReplaceWith) {
findAndReplace $objFind $FindText $ReplaceWith
While ($objFind.Found) {
findAndReplace $objFind $FindText $ReplaceWith
}
}
Function findAndReplaceMultiple($objFind, $lookupTable) {
#apply multiple Find and Replace on the same Find object
$lookupTable.GetEnumerator() | ForEach-Object {
findAndReplaceAll $objFind $_.Key $_.Value
}
}
Function findAndReplaceMultipleWholeDoc($Document, $lookupTable) {
ForEach ($storyRge in $Document.StoryRanges) {
#Loop through each StoryRange
Do {
findAndReplaceMultiple $storyRge.Find $lookupTable
#check if the StoryRange has shapes (we check only StoryTypes 6 to 11, basically Headers and Footers)
# as the Shapes inside the wdMainTextStory will be checked
# see http://wordmvp.com/FAQs/Customization/ReplaceAnywhere.htm
# and http://gregmaxey.com/using_a_macro_to_replace_text_wherever_it_appears_in_a_document.html
If (($storyRge.StoryType -ge $storyTypes::wdEvenPagesHeaderStory) -and `
($storyRge.StoryType -le $storyTypes::wdFirstPageFooterStory)) {
If ($storyRge.ShapeRange.Count) { #non-zero is True
ForEach ($shp in $storyRge.ShapeRange) {
If ($shp.TextFrame.HasText) { #non-zero is True, in case of text .HasText = -1
findAndReplaceMultiple $shp.TextFrame.TextRange.Find $lookupTable
}
}
}
}
#check for linked Ranges
$storyRge = $storyRge.NextStoryRange
} Until (!$storyRge) #non-null is True
}
}
Function processDoc {
$doc = $word.Documents.Open($_.FullName)
# The "VBA trickery" translated to PowerShell...
$junk = $doc.Sections.Item(1).Headers.Item(1).Range.StoryType
#... but not working
findAndReplaceMultipleWholeDoc $doc $textToReplace
$doc.Close([ref]$true)
}
$sw = [Diagnostics.Stopwatch]::StartNew()
$countf = 0
Get-ChildItem -Path $folderPath -Recurse -Filter $fileType | ForEach-Object {
Write-Host "Processing `"$($_.Name)`"..."
processDoc
$countf++
}
$sw.Stop()
$elapsed = $sw.Elapsed.toString()
Write-Host "Done. $countf files processed in $elapsed"
$word.Quit()
$word = $null
[gc]::collect()
[gc]::WaitForPendingFinalizers()
I checked the Microsoft documentation here, and I think the code below can do it.
$word = New-Object -ComObject Word.Application
$word.visible=$false
$files = Get-ChildItem "C:\Users\Ali\Desktop\Test" -Filter *.docx
$find="Hello"
$replace="Bye"
$wdHeaderFooterPrimary = 1
$ReplaceAll = 2
$FindContinue = 1
$MatchCase = $false
$MatchWholeWord = $false
$MatchWildcards = $false
$MatchSoundsLike = $false
$MatchAllWordForms = $false
$Forward = $true
$Wrap = $findContinue
$Format = $false
foreach ($file in $files) { # foreach avoids clashing with the inner $i shape loops below
$filename = $file.FullName
$doc = $word.Documents.Open($filename)
ForEach ($StoryRange In $doc.StoryRanges){
$StoryRange.Find.Execute($find,$MatchCase,
$MatchWholeWord,$MatchWildcards,$MatchSoundsLike,
$MatchAllWordForms,$Forward,$Wrap,$Format,
$replace,$ReplaceAll)
While ($StoryRange.find.Found){
$StoryRange.Find.Execute($find,$MatchCase,
$MatchWholeWord,$MatchWildcards,$MatchSoundsLike,
$MatchAllWordForms,$Forward,$Wrap,$Format,
$replace,$ReplaceAll)
}
While (-Not($StoryRange.NextStoryRange -eq $null)){
$StoryRange = $StoryRange.NextStoryRange
$StoryRange.Find.Execute($find,$MatchCase,
$MatchWholeWord,$MatchWildcards,$MatchSoundsLike,
$MatchAllWordForms,$Forward,$Wrap,$Format,
$replace,$ReplaceAll)
While ($StoryRange.find.Found){
$StoryRange.Find.Execute($find,$MatchCase,
$MatchWholeWord,$MatchWildcards,$MatchSoundsLike,
$MatchAllWordForms,$Forward,$Wrap,$Format,
$replace,$ReplaceAll)
}
}
}
#shapes in footers and headers
for ($j=1; $j -le $doc.Sections.Count; $j++) {
$FooterShapesCount = $doc.Sections($j).Footers($wdHeaderFooterPrimary).Shapes.Count
$HeaderShapesCount = $doc.Sections($j).Headers($wdHeaderFooterPrimary).Shapes.Count
for ($i=1; $i -le $FooterShapesCount; $i++) {
$TextRange = $doc.Sections($j).Footers($wdHeaderFooterPrimary).Shapes($i).TextFrame.TextRange
$TextRange.Find.Execute($find,$MatchCase,
$MatchWholeWord,$MatchWildcards,$MatchSoundsLike,
$MatchAllWordForms,$Forward,$Wrap,$Format,
$replace,$ReplaceAll)
}
for ($i=1; $i -le $HeaderShapesCount; $i++) {
$TextRange = $doc.Sections($j).Headers($wdHeaderFooterPrimary).Shapes($i).TextFrame.TextRange
$TextRange.Find.Execute($find,$MatchCase,
$MatchWholeWord,$MatchWildcards,$MatchSoundsLike,
$MatchAllWordForms,$Forward,$Wrap,$Format,
$replace,$ReplaceAll)
}
}
$doc.Save()
$doc.close()
}
$word.quit()