Parsing Word Document using PowerShell

I have to write a PowerShell script that can parse a Word document. Specifically, it has to read text from text boxes and store it in a variable. I know how to read text from a paragraph, for example, but I don't know how to deal with text boxes. I would appreciate a sample.
Here is a part of my code:
$Word = New-Object -comobject Word.Application
$Word.Visible = $False
$datasheet = $word.Documents.Open($files[$i].FullName)
$value = ....?
Here is a screenshot of the box I need to read from:

Enumerate the contents of the Shapes collection and filter for shapes of the msoTextBox type (17):
$datasheet.Shapes |Where-Object {$_.Type -eq 17} |ForEach-Object {
# copy $_.TextFrame.TextRange.Text to wherever you need it in here
# $_.TextFrame.TextRange is a range object like any other (useful if you need formatting details or raw XML for example)
}
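For completeness, here is a minimal end-to-end sketch of the approach above, collecting the text of every text box into a single variable. The file path and the variable names ($path, $boxTexts) are placeholders rather than anything from the question, and the sketch assumes the text boxes are floating shapes in the main document body (text boxes inside headers, footers, or groups would need extra handling).

```powershell
# Hypothetical path; replace with the document you want to read.
$path = 'C:\Temp\datasheet.docx'

$word = New-Object -ComObject Word.Application
$word.Visible = $false
try {
    # Second argument is ConfirmConversions, third is ReadOnly.
    $doc = $word.Documents.Open($path, $false, $true)

    # 17 = msoTextBox; keep only text boxes that actually contain text.
    $boxTexts = @(
        $doc.Shapes |
            Where-Object { $_.Type -eq 17 -and $_.TextFrame.HasText } |
            ForEach-Object { $_.TextFrame.TextRange.Text.Trim() }
    )

    $doc.Close(0)   # 0 = wdDoNotSaveChanges
}
finally {
    # Always shut Word down, even if opening or reading failed.
    $word.Quit()
}

# One string with one text box per line.
$value = $boxTexts -join [Environment]::NewLine
```

The try/finally is there so that a failed Open does not leave a hidden WINWORD.EXE process behind.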

Related

How to add autocorrect entries with a hyperlink using a PowerShell script

In MS Outlook, is there a way to automatically replace certain words such as Google, MSN, Facebook, etc. (I have an exhaustive list in a CSV file) with a hyperlink that redirects to the correct website?
So basically, when I type google, it is transformed into a hyperlink.
My CSV file:
Word, URL
Facebook, https://facebook.com
MSN, https://msn.com
Google, https://google.com
What I have so far is a script that adds autocorrect entries that replace one word with another, reading them from a Word document rather than a CSV. But I'm not able to replace a word with a hyperlink: it causes an error saying that autocorrect entries accept only strings, not objects (hyperlinks).
Reference: Add formatted text to Word autocorrect via PowerShell
When I manually create a hyperlink in Outlook, add it to autocorrect, and then run the following PowerShell script, I can't find the autocorrect entry:
(New-Object -ComObject word.application).AutoCorrect.Entries | where{$_.Value -like "*http*"}
I want to adapt this code coming from Use PowerShell to Add Bulk AutoCorrect Entries to Word
If someone has an idea on how to add a hyperlink to the autocorrect entries, I would be grateful.
Thanks!
I finally managed to add autocorrect entries for both Word and Outlook.
First create a .docx file containing a table with X rows and 2 columns: the first column contains the word to autocorrect, like 'google', and the second column contains the corresponding hyperlink.
$objWord = New-Object -Com Word.Application
$filename = 'C:\Users\id097109\Downloads\test3.docx'
$objDocument = $objWord.Documents.Open($filename)
$LETable = $objDocument.Tables.Item(1)
$LETableCols = $LETable.Columns.Count
$LETableRows = $LETable.Rows.Count
$entries = $objWord.AutoCorrect.entries
for($r=1; $r -le $LETableRows; $r++) {
$replace = $LETable.Cell($r,1).Range.Text
$replace = $replace.Substring(0,$replace.Length-2)
$withRange = $LETable.Cell($r,2).Range
$withRange.End = $withRange.End -1
# $with = $withRange.Text
Try {
$entries.AddRichText($replace, $withRange) | out-null
}
Catch [system.exception] {
Write-Host $_.Exception.ToString()
}
}
$objDocument.Close()
$objWord.Quit()
[gc]::collect()
[gc]::WaitForPendingFinalizers()
$rc = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($objWord)
This code modifies Normal.dotm, the file that stores the autocorrect entries (in C:\Users\{your user id}\AppData\Roaming\Microsoft\Templates), so that the entries map to hyperlinks.
To apply those changes to Outlook, you then have to delete 'NormalEmail.dotm', copy 'Normal.dotm', and rename the copy to 'NormalEmail.dotm'.
This script avoids doing that manually:
$FileName = 'C:\Users\{your id}\AppData\Roaming\Microsoft\Templates\Normal.dotm'
$SaveTo = 'C:\Users\{your id}\AppData\Roaming\Microsoft\Templates\NormalEmail.dotm'
# remove the existing Outlook template
Remove-Item -Path $SaveTo
$Word = New-Object -ComObject Word.Application
$Document = $Word.Documents.Open($FileName)
# save a copy of Normal.dotm under the Outlook template name
$Document.SaveAs($SaveTo)
$Document.Close()
$Word.Quit()
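The same template copy can also be done without starting Word at all. This is only a sketch: it assumes Word and Outlook are both closed, that the templates live in the default per-user folder under %APPDATA%, and the ".bak" backup name is an arbitrary choice.

```powershell
# Assumption: default per-user template folder; adjust if yours differs.
$templates = Join-Path $env:APPDATA 'Microsoft\Templates'
$source    = Join-Path $templates 'Normal.dotm'
$target    = Join-Path $templates 'NormalEmail.dotm'

# Keep a backup of the current Outlook template before overwriting it.
if (Test-Path $target) {
    Copy-Item -Path $target -Destination "$target.bak" -Force
}

# Overwrite NormalEmail.dotm with a copy of Normal.dotm.
Copy-Item -Path $source -Destination $target -Force
```

Using $env:APPDATA avoids hard-coding the user id in the path.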

Creating powershell array

I am a newbie to PowerShell and am trying to automate a complex process. I need to copy and paste three columns of data from one spreadsheet to a second one, which will then be exported to an SAP transaction line by line. In my search I found the link below, which discusses using $arr1 = @(0) * 20 to create the array. I have two questions.
The first question is what is the (0) referencing?
Also, using this formula how do I reference the workbook, sheet, and column that I need to use to create the array? Any help would be greatly appreciated.
PowerShell array initialization
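@() is the array subexpression operator (typically used to define an empty or single-element array). @(0) is therefore a one-element array containing the value 0, and * 20 repeats it to give an array of 20 zeros.
You can also use a list object (System.Collections.ArrayList), which might be more convenient since it grows dynamically.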
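A short illustration of both options, independent of Excel. The variable names other than $arr1 are made up for the example.

```powershell
# @(0) is a one-element array holding the value 0;
# multiplying an array by 20 repeats its contents 20 times.
$arr1 = @(0) * 20
$arr1.Count            # 20

# The value in the parentheses is only the initial element, not an index.
$blanks = @('') * 3    # three empty strings
$blanks.Count          # 3

# An ArrayList has no fixed size and grows as items are added.
$list = New-Object System.Collections.ArrayList
[void]$list.Add('first')    # Add returns the new index; [void] discards it
[void]$list.Add('second')
$list.Count            # 2
```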
Below is a quick and dirty example with COM object (reads data from range into array list)
$excel= new-object -com excel.application
$excel.Visible = $false
$wb = $excel.workbooks.open("$home\Documents\book1.xlsx")
$ws = $wb.Sheets.Item("Sheet1")
$data = New-Object System.Collections.ArrayList
foreach ($i in 1..20){
$data.Add( (New-Object PSObject -Property @{A=$ws.Range("A$i").Value2; B=$ws.Range("B$i").Value2}) ) | Out-Null
}
Write-Output $data
$wb.Close()
# quit Excel so no hidden instance is left running
$excel.Quit()

Columns Page Layout in Word with Powershell

I'm creating a Word document with PowerShell and I need to create a two-column layout similar to the GUI method shown in the screenshot below:
I've researched other websites that explain basic PowerShell Word objects, properties and methods, such as this one. However, there seems to be a lot more functionality that is "hidden" deep in the pages and pages of properties and methods. I'm looking to create a two-column layout in my Word doc. Here is the code I used to create the document and write to it:
$fileName = 'C:\template.docx'
$word = New-Object -Com Word.Application
$word.Visible = $true
$document = $word.Documents.Open($fileName)
$selection = $word.Selection
$text = "Test Text."
$selection.TypeText($text)
$document.SaveAs($fileName)
$document.Close()
$word.Quit()
$word = $null
Having worked with Excel ComObjects, it's not the easiest to figure out how to make it work with PowerShell.
You're missing this line:
$selection.PageSetup.TextColumns.SetCount(2)
How to get there?
Check the Word Interop Com Object MSDN page
It's probably the PageSetup object we're interested in (because in the GUI the two columns appear under Layout > Page Setup > Columns)
Googling "word com object pagesetup" leads to a better MSDN documentation page that lists the properties
Repeat this process for TextColumns - it has the methods in the "Remarks", but I prefer a doc page which lists the members
Finally, find the SetCount method.
Hopefully this helps you figure out how to navigate the Word ComObject documentation in future. The examples are in VBA or C# at best and need to be translated to PowerShell.
$fileName = 'C:\Template.docx'
$binding = "System.Reflection.BindingFlags" -as [type]
$word = New-Object -Com Word.Application
$word.Visible = $true
$document = $word.Documents.Open($fileName)
$selection = $word.Selection
$text = "Test Text."
$selection.TypeText($text)
$selection.PageSetup.TextColumns.SetCount(2)
# check the GUI here.
# You will see the Layout > Page Setup > Columns > Two is selected
$document.SaveAs($fileName)
$document.Close()
$word.Quit()
$word = $null
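Besides the online documentation, the object model can be explored from the console with Get-Member. The snippet below is a rough sketch that uses a throwaway blank document; it assumes Word's type information is available to PowerShell, which is normally the case when Office is installed.

```powershell
$word = New-Object -ComObject Word.Application
$document = $word.Documents.Add()   # blank document, used only for exploration

# List the properties PageSetup exposes (TextColumns is among them),
# then list the methods available on TextColumns (SetCount is among them).
$document.PageSetup | Get-Member -MemberType Property
$document.PageSetup.TextColumns | Get-Member -MemberType Method

$document.Close(0)   # 0 = wdDoNotSaveChanges
$word.Quit()
```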

How to Format the csv file so that on opening in excel, data should be displayed in a formatted manner using powershell scripting

We have a CSV file containing details of all logged users. Currently we display the required details as a table in the mail body and attach the whole list, but we have to open the attachment in Excel and format it manually to read the details.
Is there any way to solve this so that the attachment is already formatted when it is opened from the mail?
Any help is really appreciated!!!
Thanks
gv
You could check out the ImportExcel module. I use it to generate a report on a remote server where Excel is not installed. It conveniently does most of the formatting for you. There are a few issues, but with -PassThru you can get back an OfficeOpenXml (EPPlus) package object, which is relatively easy to work with once you know where everything is and get used to the quirks (for example, settings that apply to the top row don't account for an added title, and indexing starts from 1 instead of 0).
An example of a spreadsheet with two worksheets and the sort of stuff you can do:
$data | Export-Excel -WorkSheetname "MyData" -Title $atitle -TitleSize 20 -Path "$report\Data.xlsx" # no -PassThru here, so the file is saved before the second sheet is added
$xl = $data2 | Export-Excel -WorkSheetname "MyData2" -Title $btitle -TitleSize 20 -Path "$report\Data.xlsx" -PassThru
$ws = $xl.Workbook.Worksheets
1..($ws.Count) | Foreach-Object {
Foreach ($col in 2..($ws[$_].Dimension.Columns))
{
$ws[$_].Column($col).Style.HorizontalAlignment = "Center" # Align centre except first column
}
$ws[$_].Cells["A2:H2"].AutoFilter = $true # Set autofilter on headers
$ws[$_].Cells["A1:H2"].Style.Font.Bold = $true # Bold title and headers
$ws[$_].Row(2).Height = 40 # Increase height of header row
$ws[$_].Row(2).Style.VerticalAlignment = "Center" # Center header row
$ws[$_].Row(2).Style.Border.Bottom.Style = "Thin" # Underline header row
$ws[$_].Cells["B3:C" + ($ws[$_].Dimension.Rows).ToString()].Style.NumberFormat.Format = "0.0%;[Red]-0.0%" # Format activity columns as percentages
$ws[$_].View.FreezePanes(3, 1) # Freeze top two rows (arguments are the first unfrozen row and column)
$ws[$_].Cells["A1:H1"].Merge = $true
$ws[$_].Cells["A2:H" + ($ws[$_].Dimension.Rows).ToString()].AutoFitColumns() # Autofit columns excluding header
Foreach ($col in 1..($ws[$_].Dimension.Columns))
{
$ws[$_].Column($col).Width = $ws[$_].Column($col).Width + 2 # Bump up column width as autosize seems to underestimate
}
}
$xl.Save()
As indicated here the easiest way to produce formatted output is convert to HTML. Excel in turn can easily read html.
For instance this code creates list of processes and exports it excel/html:
get-process | ConvertTo-Html | out-file c:\result.xls
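A slightly fuller sketch of the HTML route, adding a small stylesheet through the -Head parameter so that the header row stands out. The CSS, the selected columns, and the output path are illustrative choices, and how much of the styling Excel honors may vary by version; Excel may also warn that the file format does not match the .xls extension.

```powershell
# Minimal CSS for the generated table.
$style = @'
<style>
  table  { border-collapse: collapse; }
  th     { background-color: #DDDDDD; font-weight: bold; }
  td, th { border: 1px solid #999999; padding: 4px; }
</style>
'@

# Hypothetical output path; a fixed column list keeps the sheet readable.
Get-Process |
    Select-Object Name, Id, CPU |
    ConvertTo-Html -Head $style |
    Out-File 'C:\Temp\result.xls'
```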
You can't preserve formatting in a CSV - a CSV is a flat file with nothing except data.
If you have Excel installed on the machine that's producing the extract, the code below will open your CSV, add some formatting and save it as an xlsx.
# create an Excel com object
$Excel = New-Object -ComObject Excel.Application
# set it to visible (not needed, but it means we can see what's happening)
$Excel.Visible = $True
# open our csv file
$Workbook = $Excel.Workbooks.Open('C:\Path\To\File.csv')
# get a handle to the worksheet
$Sheet = $Workbook.ActiveSheet
# set the first row to a bold, size 10 font
$Sheet.Rows.Item(1).Font.Size = 10
$Sheet.Rows.Item(1).Font.Bold = $true
# add an autofilter
$Sheet.UsedRange.AutoFilter()
# make the columns autofit
$Sheet.UsedRange.EntireColumn.AutoFit()
# select the 2nd row
$Sheet.Rows.Item(2).Select()
# freeze the first row
$Excel.ActiveWindow.FreezePanes = $true
# save the csv as an excel doc, so we keep our formatting
$Workbook.SaveAs('C:\Path\To\File.xlsx',51)
# close the workbook and quit Excel
$Workbook.Close()
$Excel.Quit()
Otherwise, you could try PSExcel, or EPPlus.

Powershell Select-All from Word Doc

I want to open a Word document and select all of its text, without any of the properties, formatting, etc. I have searched this site and googled it to no end. Basically, it is similar to opening a Word doc, pressing Ctrl-A, and assigning the result to a variable.
$word = New-Object -ComObject Word.Application
$word.visible = $True
$wordfilepath = "\\symphony1\powershell\Phones\Phone.docx"
$doc = $word.Documents.Open($wordfilepath)
????
$selection >> $textfilepath
Basically a newbie question, but can anyone help?
Thanks.
This will probably suit your needs. It creates a new word object, opens your existing file, and pulls the text from it.
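$filePath = '<your file here>' # full path to the .docx
# start Word (the application object), then open the document
$word = New-Object -ComObject Word.Application
$doc = $word.Documents.Open($filePath)
# Range() with no arguments covers the whole document
$text = $doc.Range().Text
$doc.Close()
$word.Quit()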
Be forewarned that it will strip out even very basic formatting features such as new lines. Here's a nice list of other range members and properties that you may find helpful.
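To write the extracted text to a file, as the question intends, something along these lines should work. It is a sketch that assumes $text still holds the string read from the document, and the output path is hypothetical. Word returns paragraph marks as bare carriage returns, which is one reason line breaks can appear lost; the replace below converts them to the platform newline.

```powershell
# Assumes $text holds the string read from the Word document.
# Hypothetical output path.
$textFilePath = 'C:\Temp\Phone.txt'

# Convert Word's paragraph marks (carriage returns) to regular newlines, then save.
($text -replace "`r", [Environment]::NewLine) | Set-Content -Path $textFilePath
```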