I'm trying to iterate through multiple HTML files on multiple computers.
My code is below:
ForEach ($system in (Get-Content C:\temp\computers.txt)) {
$folder = "\\$system\c`$\ProgramData\Autodesk\AdLM\"
Get-ChildItem $folder *.html |
Foreach-Object {
$c = $_.BaseName
$html = New-Object -ComObject "HTMLFile"
$HTML.IHTMLDocument2_write($(Get-content $_.Name -Raw ))
$para1 = $HTML.getElementById('para1') | % InnerText
Add-Content -path c:\temp\results.csv "$c,$system,$para1"
}
}
I'm getting the following error:
New-Object : Cannot find parameter Raw
You can use the Internet Explorer COM object to do what you'd like the HTMLFile COM object to do. HTMLFile doesn't work reliably in all versions of PowerShell, so this is a viable alternative.
ForEach ($system in (Get-Content C:\temp\computers.txt)) {
$folder = "\\$system\c`$\ProgramData\Autodesk\AdLM\"
Get-ChildItem $folder *.html |
ForEach-Object {
$c = $_.BaseName
$ie = New-Object -ComObject InternetExplorer.Application
# Pass the full path; "$_" may expand to the bare file name
$ie.Navigate($_.FullName)
while ($ie.Busy) {
Start-Sleep -Milliseconds 500
}
$doc = $ie.Document
$para1 = $doc.GetElementByID('para1').innerText
Add-Content -Path c:\temp\results.csv "$c,$system,$para1"
# Close this IE instance so one process per file is not left running
$ie.Quit()
}
}
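A side note on the -Raw error quoted in the question: Get-Content only gained the -Raw switch in PowerShell 3.0, so on an older host the file has to be read some other way. Below is a minimal sketch, assuming that is the cause of the error. Read-AllText is a hypothetical helper name and is not part of either script above.

```powershell
# Hypothetical helper: returns a file's contents as one string without using -Raw.
function Read-AllText {
    param([Parameter(Mandatory = $true)][string]$Path)
    # Resolve to a full path first; .NET methods do not follow PowerShell's current location.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    [System.IO.File]::ReadAllText($fullPath)
}

# Possible use inside the loop. Note FullName rather than Name, because the
# files live on a remote share and not in the current directory:
# $html.IHTMLDocument2_write((Read-AllText $_.FullName))
```

Passing FullName matters independently of the -Raw issue: Name alone only resolves if the current directory happens to be the remote folder.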
I have a PowerShell script that automatically downloads attachments from Outlook and saves them to a folder I have already set. The script works fine, but I then realised that some of the downloaded attachments are corrupted. Here is the script that I use:
Function saveattachmentexcel
{
$Null = Add-type -Assembly "Microsoft.Office.Interop.Outlook"
#olFolders = "Microsoft.Office.Interop.Outlook.olDefaultFolders" -as [type]
#olFolderInbox = 6
$outlook = new-object -comobject outlook.application
$namespace = $outlook.GetNameSpace("MAPI")
$folder = $nameSpace.GetDefaultFolder([Microsoft.Office.Interop.Outlook.OlDefaultFolders]::olFolderInbox)
$filepath = "D:\DMR Folder\"
$folder.Items | Where {$_.UnRead -eq $True -and $($_.attachments).filename -match '.xlsm'} | ForEach-object {
$filename = $($_.attachments | where filename -match '.xlsm').filename
foreach($file in $filename)
{
$outpath = join-path $filepath $file
$($_.attachments).saveasfile($outpath)
}
$_.UnRead = $False
}
}
saveattachmentexcel
I do not know why this is happening. Could anyone please help me?
This is likely because you attempt to save every single attachment to the same file name on disk with the $($_.attachments).saveasfile($outpath) statement.
Change this:
$filename = $($_.attachments | where filename -match '.xlsm').filename
foreach($file in $filename)
{
$outpath = join-path $filepath $file
$($_.attachments).saveasfile($outpath)
}
to:
foreach($attachment in $_.attachments)
{
if($attachment.Filename -like '*.xlsm'){
$outpath = Join-Path $filepath $attachment.Filename
# Only save this particular attachment to disk - not all of them
$attachment.SaveAsFile($outpath)
}
}
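To see why the original statement misbehaves, here is a small sketch that needs no Outlook at all. FakeAttachment is a hypothetical stand-in that records the paths it is asked to save to, and the target folder is an assumed temp location. It shows that calling a method on a collection makes PowerShell call it on every element, which is what the explanation above describes. It requires PowerShell 5 or later because it uses a class.

```powershell
# Stand-in for an Outlook attachment (hypothetical; records paths instead of writing files).
class FakeAttachment {
    [string]$FileName
    [System.Collections.Generic.List[string]]$SavedTo = [System.Collections.Generic.List[string]]::new()
    FakeAttachment([string]$fileName) { $this.FileName = $fileName }
    [void] SaveAsFile([string]$path) { $this.SavedTo.Add($path) }
}

$attachments = @([FakeAttachment]::new('report.xlsm'), [FakeAttachment]::new('logo.png'))
$filepath = Join-Path ([System.IO.Path]::GetTempPath()) 'DMR Folder'   # assumed target folder

# Original pattern: the method is invoked on the collection, so PowerShell
# calls it on every element and both attachments target the same path.
$attachments.SaveAsFile((Join-Path $filepath 'report.xlsm'))
$writesToSamePath = @($attachments | Where-Object { $_.SavedTo.Count -gt 0 }).Count   # 2

# Corrected pattern: one call per matching attachment, each under its own name.
$attachments | ForEach-Object { $_.SavedTo.Clear() }
foreach ($attachment in $attachments) {
    if ($attachment.FileName -like '*.xlsm') {
        $attachment.SaveAsFile((Join-Path $filepath $attachment.FileName))
    }
}
$writesAfterFix = @($attachments | Where-Object { $_.SavedTo.Count -gt 0 }).Count     # 1
```

With real attachments the last write wins, so an .xlsm file on disk can end up holding the bytes of some other attachment, which would look like corruption. The -like '*.xlsm' test is also stricter than -match '.xlsm', which is an unanchored regex where the dot matches any character.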
Say foo.zip contains:
a
b
c
|- c1.exe
|- c2.dll
|- c3.dll
where a, b, c are folders.
If I
Expand-Archive .\foo.zip -DestinationPath foo
all files/folders in foo.zip are extracted.
I would like to extract only the c folder.
Try this:
Add-Type -Assembly System.IO.Compression.FileSystem
#list the file entries under myzipdir/c/ in myzipdir.zip (directory entries have an empty Name)
$zip = [IO.Compression.ZipFile]::OpenRead("c:\temp\myzipdir.zip")
$entries=$zip.Entries | where {$_.FullName -like 'myzipdir/c/*' -and $_.Name}
#create dir for result of extraction
New-Item -ItemType Directory -Path "c:\temp\c" -Force
#extraction (files from nested subfolders are flattened into c:\temp\c)
$entries | foreach {[IO.Compression.ZipFileExtensions]::ExtractToFile( $_, "c:\temp\c\" + $_.Name) }
#free object
$zip.Dispose()
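If the folder contains nested subfolders, a variant that keeps the relative paths may be more useful. The following is a sketch under a few assumptions: the folder sits at the root of the archive (as c does in foo.zip from the question), the archive is trusted, and Expand-ZipFolder is a hypothetical function name.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: extracts one top-level folder from a zip, keeping its subfolder layout.
function Expand-ZipFolder {
    param(
        [Parameter(Mandatory = $true)][string]$ZipPath,
        [Parameter(Mandatory = $true)][string]$FolderName,
        [Parameter(Mandatory = $true)][string]$Destination
    )
    # Use full paths: .NET methods do not follow PowerShell's current location.
    $zipFull  = (Resolve-Path -LiteralPath $ZipPath).ProviderPath
    $destRoot = (New-Item -ItemType Directory -Path $Destination -Force).FullName
    $prefix   = $FolderName.Trim('/') + '/'

    $zip = [IO.Compression.ZipFile]::OpenRead($zipFull)
    try {
        foreach ($entry in $zip.Entries) {
            # Skip entries outside the folder, and directory entries (these have an empty Name).
            if (-not $entry.FullName.StartsWith($prefix, [StringComparison]::Ordinal) -or -not $entry.Name) {
                continue
            }
            $target = Join-Path $destRoot $entry.FullName
            $null = New-Item -ItemType Directory -Path (Split-Path $target -Parent) -Force
            [IO.Compression.ZipFileExtensions]::ExtractToFile($entry, $target, $true)
        }
    }
    finally {
        $zip.Dispose()
    }
}

# Example call for the archive in the question:
# Expand-ZipFolder -ZipPath .\foo.zip -FolderName c -Destination .\foo
```

The try/finally makes sure the archive handle is released even if one entry fails to extract.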
This one does not use external libraries:
$shell= New-Object -Com Shell.Application
$shell.NameSpace("$(resolve-path foo.zip)").Items() | where Name -eq "c" | ForEach-Object {
$shell.NameSpace("$PWD").copyhere($_) }
Perhaps it can be simplified a bit.
Here is something that works for me. Of course, you will need to edit the code to fit your objective (it assumes $Path holds the zip file paths and $dest the destination folder; neither is defined here):
$results = @()
$shell = New-Object -ComObject Shell.Application
foreach ($p in $Path)
{
$zip = $shell.NameSpace("$p")
$results += $zip.Items() | Where-Object { $_.Name -like "*C*" -or $_.Name -like "*b*" }
}
foreach ($item in $results)
{
$shell.NameSpace($dest).CopyHere($item)
}
I have drafted a PowerShell script that searches for a string among a large number of Word files. The script is working fine, but I have around 1 GB of data to search through and it is taking around 15 minutes.
Can anyone suggest any modifications I can do to make it run faster?
Set-StrictMode -Version latest
$path = "c:\Tester1"
$output = "c:\Scripts\ResultMatch1.csv"
$application = New-Object -comobject word.application
$application.visible = $False
$findtext = "Roaming"
$charactersAround = 30
$results = @()
Function getStringMatch
{
For ($i=1; $i -le 4; $i++) {
$j="D"+$i
$finalpath=$path+"\"+$j
$files = Get-Childitem $finalpath -Include *.docx,*.doc -Recurse | Where-Object { !($_.psiscontainer) }
# Loop through all *.doc files in the $path directory
Foreach ($file In $files)
{
$document = $application.documents.open($file.FullName,$false,$true)
$range = $document.content
If($range.Text -match ".{$($charactersAround)}$($findtext).{$($charactersAround)}"){
$properties = @{
File = $file.FullName
Match = $findtext
TextAround = $Matches[0]
}
$results += New-Object -TypeName PsCustomObject -Property $properties
$document.close()
}
}
}
If($results){
$results | Export-Csv $output -NoTypeInformation
}
$application.quit()
}
getStringMatch
import-csv $output
As mentioned in comments, you might want to consider using the OpenXML SDK library (you can also get the newest version of the SDK on GitHub), since it's way less overhead than spinning up an instance of Word.
Below I've turned your current function into a more generic one, using the SDK and with no dependencies on the caller/parent scope:
function Get-WordStringMatch
{
param(
[Parameter(Mandatory,ValueFromPipeline)]
[System.IO.FileInfo[]]$Files,
[string]$FindText,
[int]$CharactersAround
)
begin {
# import the OpenXML library
Add-Type -Path 'C:\Program Files (x86)\Open XML SDK\V2.5\lib\DocumentFormat.OpenXml.dll' |Out-Null
# make a "shorthand" reference to the word document type
$WordDoc = [DocumentFormat.OpenXml.Packaging.WordprocessingDocument] -as [type]
# construct the regex pattern
$Pattern = ".{$CharactersAround}$([regex]::Escape($FindText)).{$CharactersAround}"
}
process {
# loop through all the *.docx files (the SDK cannot open legacy binary .doc files)
foreach ($File In $Files)
{
# open document read-only, wrap content stream in streamreader
$Document = $WordDoc::Open($File.FullName, $false)
try {
$DocumentStream = $Document.MainDocumentPart.GetStream()
$DocumentReader = New-Object System.IO.StreamReader $DocumentStream
# read the entire main part (raw XML) and test it against the pattern
if($DocumentReader.ReadToEnd() -match $Pattern)
{
# got a match? output our custom object
New-Object psobject -Property @{
File = $File.FullName
Match = $FindText
TextAround = $Matches[0]
}
}
}
finally {
# clean up after every file, not just the last one
if($DocumentReader){ $DocumentReader.Dispose() }
$Document.Dispose()
}
}
}
}
Now that you have a nice function that supports pipeline input, all you need to do is gather your documents and pipe them to it!
# variables
$path = "c:\Tester1"
$output = "c:\Scripts\ResultMatch1.csv"
$findtext = "Roaming"
$charactersAround = 30
# gather the files
$files = 1..4|ForEach-Object {
$finalpath = Join-Path $path "D$_"
Get-Childitem $finalpath -Recurse | Where-Object { !($_.PsIsContainer) -and @('.docx','.doc') -contains $_.Extension}
}
# run them through our new function
$results = $files |Get-WordStringMatch -FindText $findtext -CharactersAround $charactersAround
# got any results? export it all to CSV
if($results){
$results |Export-Csv -Path $output -NoTypeInformation
}
Since all of our components now support pipelining, you could do it all in one go:
1..4|ForEach-Object {
$finalpath = Join-Path $path "D$_"
Get-Childitem $finalpath -Recurse | Where-Object { !($_.PsIsContainer) -and @('.docx','.doc') -contains $_.Extension}
} |Get-WordStringMatch -FindText $findtext -CharactersAround $charactersAround |Export-Csv -Path $output -NoTypeInformation
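One thing to be aware of with the pattern built from $CharactersAround: .{30} on both sides demands exactly 30 characters of context, so a hit closer than 30 characters to the start or end of the text is not reported. The sketch below compares it with a more lenient .{0,30} form. The lenient pattern and the sample text are illustrative assumptions and do not come from the original script.

```powershell
$findText = 'Roaming'
$around   = 30

# Pattern as used above: requires exactly $around characters on each side.
$strict  = ".{$around}$([regex]::Escape($findText)).{$around}"
# Assumed alternative: accepts anything from 0 up to $around characters on each side.
$lenient = ".{0,$around}$([regex]::Escape($findText)).{0,$around}"

# Sample text in which the search term sits at the very start.
$text = 'Roaming profiles are stored on the file server.'

$strictHit  = $text -match $strict    # False: there are no 30 characters before the term
$lenientHit = $text -match $lenient   # True
$context    = $Matches[0]             # the term plus up to 30 characters of context
```

[regex]::Escape matters in both forms: without it, a search term containing characters such as . or ( would be interpreted as regex syntax.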
I got this script running and have nearly completed my mission: printing the attachments that land in a specific subfolder of Outlook.
$OutputFolder = 'C:\tests';
$outlook = New-Object -ComObject Outlook.Application;
$olFolderInbox = 6;
$ns = $outlook.GetNameSpace("MAPI");
$inbox = $ns.GetDefaultFolder($olFolderInbox);
$inbox.Folders `
| ? Name -eq 'colour' `
| % Items `
| % Attachments `
| % {
$OutputFileName = Join-Path -Path $OutputFolder -ChildPath $_.FileName;
if (Test-Path $OutputFileName) {
$FileDirectoryName = [System.IO.Path]::GetDirectoryName($OutputFileName);
$FileNameWithoutExtension = [System.IO.Path]::GetFileNameWithoutExtension($OutputFileName);
$FileExtension = [System.IO.Path]::GetExtension($OutputFileName);
for ($i = 2; Test-Path $OutputFileName; $i++) {
$OutputFileName = "{0} ({1}){2}" -f (Join-Path -Path $FileDirectoryName -ChildPath $FileNameWithoutExtension), $i, $FileExtension;
}
}
Write-Host $OutputFileName;
$_.SaveAsFile($OutputFileName)
}
Remove-Item -Path C:\tests\*.jpg
Dir C:\tests\ | Out-Printer -name xerox-b8
Remove-Item -Path C:\tests\*.*
When I try to pipe the objects to the printer, I get either XML printed out or just the directory contents.
I have tried:
select-object (wrong)
get-childitem (wrong)
DIR C:\tests\*.* (only returns directory listing printout)
These either return a load of XML rubbish or just a directory listing.
How can I pipe the contents of a folder to a printer using PowerShell? Surely this can be done.
I am new to PowerShell and am struggling a bit. I have obtained an example of the sort of function I want to use and adapted it partially. What I want is for it to loop through each subdirectory of C:\Test\, and combine just the PDFs in each subdirectory together (leaving the resulting PDF in each subdirectory).
At the moment I can get it to comb through the subdirectories, but it then combines the contents of all subdirectories into one giant PDF in the top level directory, which is not what I want. I feel like maybe I need to use an array of sorts but I don't know PowerShell well enough yet.
BTW this uses PDFsharp, a .NET library.
Function PDFCombine {
$filepath = 'C:\Test\'
$filename = '.\Combined' #<--- ???
$output = New-Object PdfSharp.Pdf.PdfDocument
$PdfReader = [PdfSharp.Pdf.IO.PdfReader]
$PdfDocumentOpenMode = [PdfSharp.Pdf.IO.PdfDocumentOpenMode]
foreach($i in (gci $filepath *.pdf -Recurse)) {
$input = New-Object PdfSharp.Pdf.PdfDocument
$input = $PdfReader::Open($i.fullname, $PdfDocumentOpenMode::Import)
$input.Pages | %{$output.AddPage($_)}
}
$output.Save($filename)
}
Your question was unclear about how many levels you need to go down. You can try this (untested). It goes one level down from $filepath, gets all PDF files in that folder and its subfolders, and combines them into Subfoldername-Combined.pdf:
Function PDFCombine {
$filepath = 'C:\Test\'
$PdfReader = [PdfSharp.Pdf.IO.PdfReader]
$PdfDocumentOpenMode = [PdfSharp.Pdf.IO.PdfDocumentOpenMode]
#Foreach subfolder(FIRST LEVEL ONLY!)
Get-ChildItem $filepath | Where-Object { $_.PSIsContainer } | Foreach-Object {
#Create new output pdf-file
$output = New-Object PdfSharp.Pdf.PdfDocument
$outfilepath = Join-Path $_.FullName "$($_.Name)-Combined.pdf"
#Find and add pdf files in subfolders
Get-ChildItem -Path $_.FullName -Filter *.pdf -Recurse | ForEach-Object {
#$input = New-Object PdfSharp.Pdf.PdfDocument #Don't think this one's necessary
$input = $PdfReader::Open($_.fullname, $PdfDocumentOpenMode::Import)
$input.Pages | %{ $output.AddPage($_) }
}
#Save
$output.Save($outfilepath)
}
}
So you should get this:
c:\Test\Folder1\Folder1-Combined.pdf #(should include all pages in Folder1 and ANY subfolders below)
c:\Test\Folder2\Folder2-Combined.pdf #(should include all pages in Folder2 and ANY subfolders below)
#etc.
If you need it to create a combined PDF for every subfolder (not only the first level), then you could try this (untested):
Function PDFCombine {
$filepath = 'C:\Test\'
$PdfReader = [PdfSharp.Pdf.IO.PdfReader]
$PdfDocumentOpenMode = [PdfSharp.Pdf.IO.PdfDocumentOpenMode]
#Foreach subfolder with pdf files
Get-ChildItem -Path $filepath -Filter *.pdf -Recurse | Group-Object DirectoryName | ForEach-Object {
#Create new output pdf-file
$output = New-Object PdfSharp.Pdf.PdfDocument
$outfilepath = Join-Path $_.Name "Combined.pdf"
#Find and add pdf files in subfolders
$_.Group | ForEach-Object {
#$input = New-Object PdfSharp.Pdf.PdfDocument #I don't think you need this
$input = $PdfReader::Open($_.fullname, $PdfDocumentOpenMode::Import)
$input.Pages | %{ $output.AddPage($_) }
}
#Save
$output.Save($outfilepath)
#Remove output-object
Remove-Variable output
}
}
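The Group-Object DirectoryName step can be checked on its own before PDFsharp gets involved. Here is a minimal sketch that only plans the work: for each folder containing PDFs it reports the output path and the input files. Get-PdfCombinePlan is a hypothetical name, and skipping an existing Combined.pdf is an added assumption so that a second run does not merge the previous result into itself.

```powershell
# Hypothetical helper: lists, per folder, which PDFs would be merged and where the result would go.
function Get-PdfCombinePlan {
    param([Parameter(Mandatory = $true)][string]$Root)

    Get-ChildItem -Path $Root -Filter *.pdf -Recurse |
        Where-Object { -not $_.PSIsContainer -and $_.Name -ne 'Combined.pdf' } |
        Group-Object DirectoryName |
        ForEach-Object {
            # $_.Name is the group key (the folder path); $_.Group holds that folder's files.
            New-Object psobject -Property @{
                OutputPath = Join-Path $_.Name 'Combined.pdf'
                Inputs     = @($_.Group | Sort-Object Name | ForEach-Object { $_.FullName })
            }
        }
}

# Example: Get-PdfCombinePlan -Root 'C:\Test\' | Format-List
```

Once the plan looks right, each entry maps directly onto one PdfDocument: open every path in Inputs in Import mode, add its pages, and save to OutputPath.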
Not tested:
Function PDFCombine {
$filepath = 'C:\Test\'
$filename = 'Combined.pdf'
$PdfReader = [PdfSharp.Pdf.IO.PdfReader]
$PdfDocumentOpenMode = [PdfSharp.Pdf.IO.PdfDocumentOpenMode]
$output = $null
$lastdir = ""
# sort by directory so each folder's files are processed together
foreach($i in (gci $filepath *.pdf -Recurse | sort DirectoryName, Name)) {
if ($lastdir -ne $i.DirectoryName) {
# new folder: save the previous folder's document, then start a fresh one
if ($output) { $output.Save((Join-Path $lastdir $filename)) }
$lastdir = $i.DirectoryName
$output = New-Object PdfSharp.Pdf.PdfDocument
}
$pdf = $PdfReader::Open($i.FullName, $PdfDocumentOpenMode::Import)
$pdf.Pages | %{$output.AddPage($_)}
}
# save the document for the last folder
if ($output) { $output.Save((Join-Path $lastdir $filename)) }
}