How to access contents of OneNote Page? - powershell

I'm new to powershell, and have gotten this far looking for code examples online, however I'm unable to find any examples that show me how to get at the actual page contents of a OneNote page with Powershell.
$OneNote = New-Object -ComObject OneNote.Application
[xml]$Hierarchy = ""
$OneNote.GetHierarchy("",
[Microsoft.Office.InterOp.OneNote.HierarchyScope]::hsPages, [ref]$Hierarchy)
foreach ($notebook in $Hierarchy.Notebooks.Notebook ) {
$notebook.Name
"=============="
foreach ($section in $notebook.Section) {
"# TAB: " + $section.Name
foreach ($page in $section.page) {
" " + $page.Name
#$page.GetAttribute.ToString()
#$page.Attributes
#$page.InnerText
# How do I get to the contents of the page?
}
}
" "
}

Since I was looking for a solution today on how to read OneNote notebook contents using powershell, I was initially pleased that I found this article here. But then I didn't get any further at first because I kept getting an error; namely in the line
Select-Xml -xml($xml.Value) -Namespace $schema -Xpath "//one:Notebook/one:Section" |foreach{
It always said that $xml.Value has the value NULL, and it cannot be cast to the type XmlNode[].
After some trial and error, I found that all I had to do was simply change this line to
Select-Xml -xml $xml -Namespace $schema -Xpath "//one:Notebook/one:Section" |foreach{
so just omit the .Value and the surrounding brackets.
And yes, of course: Thanks to the article creator! I would never have come up with this solution on my own.
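Dropping .Value most likely works because $xml in that loop is already an XML element rather than a [ref] wrapper, and the -Xml parameter of Select-Xml accepts XML nodes directly. Here is a minimal sketch of the same query against a made-up hierarchy fragment. $sample and its notebook and section names are hypothetical, not real OneNote output:

```powershell
# Hypothetical stand-in for the hierarchy XML that OneNote returns
$sample = [xml]@'
<one:Notebooks xmlns:one="http://schemas.microsoft.com/office/onenote/2013/onenote">
  <one:Notebook name="Work">
    <one:Section name="Meetings" />
    <one:Section name="Projects" />
  </one:Notebook>
</one:Notebooks>
'@

# Map the "one" prefix used in the XPath to the OneNote namespace
$schema = @{ one = 'http://schemas.microsoft.com/office/onenote/2013/onenote' }

# Pass the XML object itself to -Xml; there is no .Value to unwrap
$sectionNames = Select-Xml -Xml $sample -Namespace $schema -XPath '//one:Notebook/one:Section' |
    ForEach-Object { $_.Node.GetAttribute('name') }

$sectionNames
```

If the XML arrives as a plain string instead (for example from a [ref] variable), casting it with [xml] first gives Select-Xml the node type it expects.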

You are almost there. Since the OneNote content is held in an XmlElement, you will need to use Select-Xml to get the information.
Below is an example of how to get the Page property of the XMLElement:
$OneNote = New-Object -ComObject OneNote.Application
$schema = @{one="http://schemas.microsoft.com/office/onenote/2013/onenote"}
[xml]$Hierarchy = ""
$OneNote.GetHierarchy("",
[Microsoft.Office.InterOp.OneNote.HierarchyScope]::hsPages, [ref]$Hierarchy)
foreach ($notebook in $Hierarchy.Notebooks.Notebook ) {
$notebook.Name
"=============="
foreach ($section in $notebook.Section) {
foreach ($page in $section.page) {
" " + $page
foreach($xml in $page)
{
Select-Xml -xml($xml.Value) -Namespace $schema -Xpath "//one:Notebook/one:Section" |foreach{
$node = $psitem.node
$npath = Split-Path -Path $node.Path -Parent
#This is where all the magic happens
$props = [ordered]@{
Workbook= Split-Path -Path $npath -Leaf
Section = $node.name
Child = $node.ChildNodes
Page = $node.Page #This is your page content
}
New-Object -TypeName PSObject -Property $props
}
}
}
}
}
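Note that the hierarchy XML lists pages by name and ID but does not carry their body text. Below is a rough sketch of pulling the text out of page XML, assuming the page XML has been fetched separately (the OneNote COM object has a GetPageContent method for this, which the code above does not call). The $pageXml sample is hand-written for illustration:

```powershell
# Hypothetical page XML; real content would come from OneNote itself
$pageXml = [xml]@'
<one:Page xmlns:one="http://schemas.microsoft.com/office/onenote/2013/onenote" name="Demo page">
  <one:Outline>
    <one:OEChildren>
      <one:OE><one:T><![CDATA[First paragraph]]></one:T></one:OE>
      <one:OE><one:T><![CDATA[Second paragraph]]></one:T></one:OE>
    </one:OEChildren>
  </one:Outline>
</one:Page>
'@

$schema = @{ one = 'http://schemas.microsoft.com/office/onenote/2013/onenote' }

# Each one:T element holds one run of text inside the page outline
$pageText = Select-Xml -Xml $pageXml -Namespace $schema -XPath '//one:T' |
    ForEach-Object { $_.Node.InnerText }

$pageText
```

Real pages often store HTML fragments inside the one:T elements, so the extracted strings may need further cleanup.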

Related

Powershell function not receiving parameters

I have a Powershell script. My ultimate goal is to compare the two Excel files and highlight differences in both versions. Part of my "preparatory code" is this:
function DefineVars () {
Clear-Host
# Define some basic variables
$Directory = Split-Path -Parent $PSCommandPath
$FilePath = $Directory + "\xlsx\"
$FileName1 = $FilePath + "Firewallv2.xlsx"
$FileName2 = $FilePath + "Firewallv3.xlsx"
$OutFile1 = $FilePath + "file1_raw.csv"
$OutFile2 = $FilePath + "file2_raw.csv"
# Create an Object Excel.Application using Com interface
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $false
$Excel.DisplayAlerts = $false
# Generate the Workbook Objects
$WorkBook1 = $Excel.Workbooks.Open($FileName1)
$WorkBook2 = $Excel.Workbooks.Open($FileName2)
return $Directory, $FilePath, $FileName1, $FileName2, $OutFile1, $OutFile2, $Excel, $WorkBook1, $WorkBook2
}
function GenerateData ($WorkBook, $OutFile) {
$Results = @()
Write-Host $OutFile
foreach ($CurrentWorkSheet in $WorkBook.Worksheets) {
$CurrentWorkSheetName = $CurrentWorkSheet.Name
$CurrentWorkSheetRows = $CurrentWorkSheet.UsedRange.Rows.Count
$CurrentWorkSheetColumns = $CurrentWorkSheet.UsedRange.Columns.Count
$CurrentWorkSheet.Activate()
for ($CurrentColumn = 1; $CurrentColumn -le $CurrentWorkSheetColumns; $CurrentColumn++) {
for ($CurrentRow = 1; $CurrentRow -le $CurrentWorkSheetRows; $CurrentRow++) {
$CurrentCell = $CurrentWorksheet.Cells.Item($CurrentRow, $CurrentColumn)
$CurrentCellContent = $CurrentCell.Text
if ([System.IO.File]::Exists($OutFile)) {
Write-Host "true"
#","+$CurrentCellContent | Out-File $OutFile -Append
} else {
Write-Host "false"
#$CurrentCellContent | Out-File $OutFile
}
}
}
}
return $Results
}
function CloseExcel () {
$WorkBook1.Close($true)
$WorkBook2.Close($true)
$Excel.Quit()
spps -n Excel
}
$Directory, $FilePath, $FileName1, $FileName2, $OutFile1, $OutFile2, $Excel, $WorkBook1, $WorkBook2 = DefineVars
$ResultsFile1 = GenerateData($WorkBook1, $OutFile1)
$ResultsFile2 = GenerateData($WorkBook2, $OutFile2)
CloseExcel
My problem is that the parameter call to the GenerateData functions of the $OutFile variables won't work for some reason. All the other parameters appear to be passed successfully, e.g. the WorkBooks. But if I insert a Write-Host $OutFile at the beginning of the GenerateData function, the string is empty (which means it doesn't get passed, if I am not mistaken).
I am sure this is easily explained, but I just can't seem to figure this one out.
Thanks and best
Simon
I got it. My problem was the syntax in the main method. Being caught up in other languages, I thought I needed parentheses and commas to pass arguments. Yet, it's much simpler with Powershell:
$ResultsFile1 = GenerateData $WorkBook1 $OutFile1
$ResultsFile2 = GenerateData $WorkBook2 $OutFile2
CompareObjects $ResultsFile1 $ResultsFile2
CloseExcel
This did the trick! The only odd thing is that PowerShell doesn't throw an error if you stick to the parentheses-and-comma style. It treats the parenthesized list as a single array, binds that array to the first parameter, and leaves the second parameter empty.
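To see the difference between the two call styles, here is a small sketch. Show-Args is a hypothetical helper written only for this demonstration; it reports how many values reached the first parameter and whether the second one was left empty:

```powershell
# Hypothetical helper: reports what each parameter actually received
function Show-Args ($First, $Second) {
    [pscustomobject]@{
        FirstCount   = @($First).Count
        SecondIsNull = $null -eq $Second
    }
}

# Method-style call: ('a', 'b') is one array expression, so it is ONE argument
$viaParens = Show-Args('a', 'b')

# Command-style call: two space-separated arguments
$viaSpaces = Show-Args 'a' 'b'

$viaParens
$viaSpaces
```

With the parentheses, $First receives both values and $Second stays $null. Running Set-StrictMode -Version 2 (or later) turns method-style function calls into an error, which helps catch this mistake early.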

Windows Power Shell rename files

I am sort of new to scripting and here's my task:
A folder with X files. Each file contains some Word documents, Excel sheets, etc. In these files, there is a client name and I need to assign an ID number.
This change will affect all the files in this folder that contain this client's name.
How can I do this using Windows PowerShell?
$configFiles = Get-ChildItem . *.config -rec
foreach ($file in $configFiles)
{
(Get-Content $file.PSPath) |
Foreach-Object { $_ -replace " JOHN ", "123" } |
Set-Content $file.PSPath
}
Is this the right approach?
As @lee_Daily pointed out, you would need different code to perform a find and replace in each file type. Here is an example of how you could go about doing that:
$objWord = New-Object -comobject Word.Application
$objWord.Visible = $false
foreach ( $file in (Get-ChildItem . -r ) ) {
Switch ( $file.Extension ) {
".config" {
(Get-Content $file.FullName) |
Foreach-Object { $_ -replace " JOHN ", "123" } |
Set-Content $file.FullName
}
{ $_ -in '.doc', '.docx' } {
### Replace in word document using $file.fullname as the target
}
".xlsx" {
### Replace in spreadsheet using $file.fullname as the target
}
}
}
For the actual code to perform the find and replace, I would suggest COM objects for both.
Example of word find and replace https://codereview.stackexchange.com/questions/174455/powershell-script-to-find-and-replace-in-word-document-including-header-footer
Example of Excel find and replace: "Search & Replace in Excel without looping?"
I would also suggest learning the ImportExcel module; it is a great tool that I use a lot.
For Word Document : This is what I'm using. Just can't figure out how this script could also change Header and Footer in a Word Document
$objWord = New-Object -comobject Word.Application
$objWord.Visible = $false
$list = Get-ChildItem "C:\Users\*.*" -Include *.doc*
foreach($item in $list){
$objDoc = $objWord.Documents.Open($item.FullName,$true)
$objSelection = $objWord.Selection
$wdFindContinue = 1
$FindText = " BLAH "
$MatchCase = $False
$MatchWholeWord = $true
$MatchWildcards = $False
$MatchSoundsLike = $False
$MatchAllWordForms = $False
$Forward = $True
$Wrap = $wdFindContinue
$Format = $False
$wdReplaceNone = 0
$ReplaceWith = "help "
$wdFindContinue = 1
$ReplaceAll = 2
$a = $objSelection.Find.Execute($FindText,$MatchCase,$MatchWholeWord, `
$MatchWildcards,$MatchSoundsLike,$MatchAllWordForms,$Forward,`
$Wrap,$Format,$ReplaceWith,$ReplaceAll)
$objDoc.Save()
$objDoc.Close()
}
$objWord.Quit()
What if I try to run this in C#? Is anything else missing?
}
string rootfolder = @"C:\Temp";
string[] files = Directory.GetFiles(rootfolder, "*.*",SearchOption.AllDirectories);
foreach (string file in files)
{ try
{ string contents = File.ReadAllText(file);
contents = contents.Replace(@"Text to find", @"Replacement text");
// Make files writable
File.SetAttributes(file, FileAttributes.Normal);
File.WriteAllText(file, contents);
}
catch (Exception ex)
{ Console.WriteLine(ex.Message);
}
}

How do I use PowerShell to pull headers from a OneNote document

Background:
In my work environment, we have a transitional location for our knowledgebase notes. These reside in a number of OneNote 2016 workbooks which have been maintained over years. I am currently in the middle of delegating content update efforts to our staff, and part of this work involves importing all our OneNote notebook names and section names into an Excel spreadsheet for hierarchy management.
Task: I spent ages looking online for an easy and quick way to export hierarchy information from OneNote to csv using PowerShell and could not for the life of me find an easy way that worked. The following code resonated through the interwebs but each time I tried to run the code, I kept getting errors.
$onenote = New-Object -ComObject OneNote.Application
$scope = [Microsoft.Office.Interop.OneNote.HierarchyScope]::hsPages
[ref]$xml = $null
$onenote.GetHierarchy($null, $scope, $xml)
$schema = @{one="http://schemas.microsoft.com/office/onenote/2013/onenote"}
$xpath = "//one:Notebook/one:Section"
Select-Xml -Xml (
$xml.Value) -Namespace $schema -XPath $xpath |
foreach {
$node = $psitem.Node
$npath = Split-Path -Path $node.Path -Parent
$props = [ordered]@{
Workbook = Split-Path -Path $npath -Leaf
Section = $node.Name
}
New-Object -TypeName PSObject -Property $props
}
Error:
The error I would get from executing this code was as follows:
value of type "System.String" to type "System.Xml.XmlNode".
At line:10 char:17
+ Select-Xml -Xml (
Solution:
In the end I had to break down the established connection to the OneNote application and found a workable solution for OneNote 2016. I've provided my solution but am keen to hear of any other possible ways to manipulate this data effectively in the future:
Function Get-OneNoteHeaders{
[CmdletBinding()]
Param()
Begin
{
$onenote = New-Object -ComObject OneNote.Application
$scope = [Microsoft.Office.Interop.OneNote.HierarchyScope]::hsPages
[ref]$xml = $null
$csvOutput = "c:\temp\onenote-headers.csv"
}
Process
{
$onenote.GetHierarchy($null, $scope, $xml)
[xml]$result = ($xml.Value)
Foreach($notebook in $($result.DocumentElement.Notebook)){
Add-content -Path $csvOutput -Value "$($notebook.name)"
Foreach($section in $($notebook.section)){
Add-content -Path $csvOutput -Value ",$($section.name)"
Foreach($page in $section.page){
Add-content -Path $csvOutput -Value ",,$($page.name)"
}
}
}
}
End{}
}
#Get-OneNoteHeaders

How to extract metadata using a specific filename (get-childitem) rather than looping through ComObject namespace items

I have found multiple code snippets to scroll through a folder and display the metadata of each item in the folder, like this:
function funLine($strIN)
{
$strLine = "=" * $strIn.length
Write-Host -ForegroundColor Yellow "`n$strIN"
Write-Host -ForegroundColor Cyan $strLine
}
$sfolder = "S:\Temp"
$objShell = New-Object -ComObject Shell.Application
$objFolder = $objShell.namespace($sFolder)
foreach ($strFileName in $objFolder.items())
{funline "$($strFileName.name)"
for ($a ; $a -le 266; $a++)
{
$a
if($objFolder.getDetailsOf($strFileName, $a))
{
$hash += @{ $($objFolder.getDetailsOf($objFolder.items, $a)) = $a.tostring() + $($objFolder.getDetailsOf($strFileName, $a)) }
$hash | out-file c:\temp\output.txt -Append
$hash.clear()
}
}
$a=0
}
But in my script, I would like to loop through the folder(s) using Get-ChildItem and for selected files, I would like to use the getDetailsOf() to extract the authors of MS Office documents.
So, knowing the filename (example: $strFileName), can I skip the looping through each $strFileName in $objFolder.items() and just access the metadata details (where $a = 20) for the authors of $strFileName?
I have seen it done using "New-Object -ComObject word.application" but I believe that opens the document, so on a large file system with many files locked by users, this could be slow and painful.
Can I just jump to the index of $objFolder.items() for my selected filename?
Here, I was curious how it'd be done too so I looked it up and made a function that'll add that property to your [FileInfo] object (what's normally passed for a file by the Get-ChildItem cmdlet).
Function Get-CreatedBy{
[cmdletbinding()]
Param(
[Parameter(ValueFromPipelineByPropertyName=$true)]
[Alias("Path")]
[string[]]$FullName
)
Begin{
$Shell = New-Object -ComObject Shell.Application
}
Process{
ForEach($FilePath in $FullName){
$NameSpace = $Shell.NameSpace((Split-Path $FilePath))
$File = $NameSpace.ParseName((Split-Path $FilePath -Leaf))
$CreatedBy = $NameSpace.GetDetailsOf($File,20)
[System.IO.FileInfo]$FilePath|Add-Member 'CreatedBy' $CreatedBy -PassThru
}
}
}
Then you can just pipe things to that, or specify a path directly like:
Get-ChildItem *.docx | Get-CreatedBy | FT Name,CreatedBy
or
Get-CreatedBy 'C:\Temp\File.docx' | Select -Expand CreatedBy
Edit: Fixed for arrays of files! Sorry about the previous error.
Thanks Matt! Although that question was different, it had the one piece I was looking for - how to reference $objFolder.items().item($_.Name)
So this makes a quick little snippet to display the Authors (or any other metadata field):
$FullName = "S:\Temp\filename.xlsx"
$Folder = Split-Path $FullName
$File = Split-Path $FullName -Leaf
$objShell = New-Object -ComObject Shell.Application
$objFolder = $objShell.namespace($Folder)
$Item = $objFolder.items().item($File)
$Author = $objFolder.getDetailsOf($Item, 20)
Write-Host "$FullName is owned by $Author"
Here 20 is the index of the Authors field passed to getDetailsOf.

How can I get programmatic access to the "Date taken" field of an image or video using powershell?

I'm trying convert a bunch of pictures and videos, but when I convert it to a new format I obviously lose the properties of the original file. I'd like to be able to read the "Date taken" property from the old file and update it on the new one using powershell.
I can't test it right now (I don't have any images with EXIF data lying around), but I think this should work:
[reflection.assembly]::LoadWithPartialName("System.Drawing")
$pic = New-Object System.Drawing.Bitmap('C:\PATH\TO\SomePic.jpg')
$bitearr = $pic.GetPropertyItem(36867).Value
$string = [System.Text.Encoding]::ASCII.GetString($bitearr)
$DateTime = [datetime]::ParseExact($string,"yyyy:MM:dd HH:mm:ss`0",$Null)
$DateTime
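EXIF tag 36867 stores the date as ASCII text followed by a null terminator, which is why the format string above ends in a null character. Here is a sketch of just the conversion step, using a hard-coded byte array instead of a real image (the sample date is made up). It trims the terminator before parsing instead of matching it in the format string:

```powershell
# Hypothetical sample of the raw property bytes: ASCII date plus null terminator
$bytes = [System.Text.Encoding]::ASCII.GetBytes("2021:03:14 15:09:26`0")

# Decode the bytes and drop the trailing null character
$text = [System.Text.Encoding]::ASCII.GetString($bytes).TrimEnd([char]0)

# Parse the EXIF layout (colons in the date part) with a fixed culture
$taken = [datetime]::ParseExact($text, 'yyyy:MM:dd HH:mm:ss', [cultureinfo]::InvariantCulture)

$taken
```

Passing an explicit culture keeps the parse independent of the regional settings of the machine running the script.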
In general, you can access any extended property for a file shown in explorer through the shell GetDetailsOf method. Here's a short example, adapted from another answer:
$file = Get-Item IMG_0386.jpg
$shellObject = New-Object -ComObject Shell.Application
$directoryObject = $shellObject.NameSpace( $file.Directory.FullName )
$fileObject = $directoryObject.ParseName( $file.Name )
$property = 'Date taken'
for(
$index = 5;
$directoryObject.GetDetailsOf( $directoryObject.Items, $index ) -ne $property;
++$index ) { }
$value = $directoryObject.GetDetailsOf( $fileObject, $index )
However, according to the comments on another question, there is no general-purpose mechanism for setting these properties. The System.Drawing.Bitmap class that EBGreen mentioned will work for images, but I'm afraid I also do not know of a .NET option for video files.
This works for me, thanks to the above help and others.
try{
Get-ChildItem C:\YourFolder\Path | Where-Object {$_.extension -eq '.jpg'} |
ForEach-Object {
$path = $_.FullName
Add-Type -AssemblyName System.Drawing
$bitmap = New-Object System.Drawing.Bitmap($path)
$propertyItem = $bitmap.GetPropertyItem(36867)
$bytes = $propertyItem.Value
$string = [System.Text.Encoding]::ASCII.GetString($bytes)
$dateTime = [DateTime]::ParseExact($string,"yyyy:MM:dd HH:mm:ss`0",$Null)
$bitmap.Dispose()
$_.LastWriteTime = $dateTime
$_.CreationTime = $dateTime
}}
finally
{
}
To read and write the "date taken" property of an image, use the following code (building on the answer of @EBGreen):
try
{
$path = "C:\PATH\TO\SomePic.jpg"
$pathModified = "C:\PATH\TO\SomePic_MODIFIED.jpg"
Add-Type -AssemblyName System.Drawing
$bitmap = New-Object System.Drawing.Bitmap($path)
$propertyItem = $bitmap.GetPropertyItem(36867)
$bytes = $propertyItem.Value
$string = [System.Text.Encoding]::ASCII.GetString($bytes)
$dateTime = [DateTime]::ParseExact($string,"yyyy:MM:dd HH:mm:ss`0",$Null)
$dateTimeModified = $dateTime.AddDays(1) # Set new date here
$stringModified = $dateTimeModified.ToString("yyyy:MM:dd HH:mm:ss`0",$Null)
$bytesModified = [System.Text.Encoding]::ASCII.GetBytes($stringModified)
$propertyItem.Value = $bytesModified
$bitmap.SetPropertyItem($propertyItem)
$bitmap.Save($pathModified)
}
finally
{
$bitmap.Dispose()
}