Optimize full-path extraction and iteration over roughly 1 million files in PowerShell

I am a programming enthusiast and novice, and I am using PowerShell to solve the following problem:
I need to extract the full paths of all files with the .img extension inside a folder that contains roughly 900 thousand folders and roughly a million files, of which about 900,000 are .img files.
Each .img file must then be processed by an exe, with its path read from a file.
Which is better: storing the result of Get-ChildItem in a variable or in a file?
I would greatly appreciate your guidance on optimizing this and on finding the best balance between speed and resource consumption.
Thank you in advance!
This is the code I am currently using:
$PSDefaultParameterValues['*:Encoding'] = 'Ascii'
$host.ui.RawUI.WindowTitle = "DICOM IMPORT IN PROGRESS"
#region SET WINDOW FIXED WIDTH
$pshost = get-host
$pswindow = $pshost.ui.rawui
$newsize = $pswindow.buffersize
$newsize.height = 3000
$newsize.width = 150
$pswindow.buffersize = $newsize
$newsize = $pswindow.windowsize
$newsize.height = 50
$newsize.width = 150
$pswindow.windowsize = $newsize
#endregion
#
$out = ("$pwd\log_{0:yyyyMMdd_HH.mm.ss}_import.txt" -f (Get-Date))
cls
"`n" | tee -FilePath $out -Append
"*****************" | tee -FilePath $out -Append
"**IMPORT SCRIPT**" | tee -FilePath $out -Append
"*****************" | tee -FilePath $out -Append
"`n" | tee -FilePath $out -Append
#
# SET SEARCH FOLDERS #
"Working Folder" | tee -FilePath $out -Append
$path1 = Read-Host "Enter folder location" | tee -FilePath $out -Append
"`n" | tee -FilePath $out -Append
#
#
# SET & SHOW HOSTNAME
"SERVER NAME" | tee -FilePath $out -Append
$ht = hostname | tee -FilePath $out -Append
Write-Host $ht
Start-Sleep -Seconds 3
"`n" | tee -FilePath $out -Append
#
#
# GET FILES
"`n" | tee -FilePath $out -Append
#"SEARCHING IMG FILES, PLEASE WAIT..." | tee -FilePath $out -Append
$files = $path1 | Get-ChildItem -recurse -file -filter *.img | ForEach-Object { $_.FullName }
# SHOW Get-ChildItem PROCESS ON CONSOLE
Out-host -InputObject $files
"`n" | tee -FilePath $out -Append
Write-Output ($files | Measure).Count "IMG FILES FOUND TO PUSH" | tee -FilePath $out -Append
# DUMP Get-ChildItem OUTPUT INTO A FILE
$files > $pwd\pf
Start-Sleep -Seconds 5
# TIMESTAMP
"`n" | tee -FilePath $out -Append
"IMPORT START" | tee -FilePath $out -Append
("{0:yyyy/MM/dd HH:mm:ss}" -f (Get-Date)) | tee -FilePath $out -Append
"********************************" | tee -FilePath $out -Append
"`n" | tee -FilePath $out -Append
#
#
#SET TOOL
$ir = $Env:folder_tool
$pt = "utils\tool.exe"
#
#PROCESSING FILES
$n = 1
$pe = foreach ($file in Get-Content $pwd\pf ) {
$tb = (Get-Date -f HH:mm:ss) | tee -FilePath $out -Append
$fp = "$n. $file" | tee -FilePath $out -Append
#
$ep = & $ir$pt -c $ht"FIR" -i $file | tee -FilePath $out -Append
$as = "`n" | tee -FilePath $out -Append
# PRINT CONSOLE IMG FILES PROCESS
Write-Host $tb
Write-Host $fp
Out-host -InputObject $ep
Write-Host $as
$n++
}
#
#TIMESTAMP
"********************************" | tee -FilePath $out -Append
"IMPORT END" | tee -FilePath $out -Append
("{0:yyyy/MM/dd HH:mm:ss}" -f (Get-Date)) | tee -FilePath $out -Append
"`n" | tee -FilePath $out -Append

Try running the work in parallel with PoshRSJob.
Replace Start-Process in Process-File with your own code, and note that the jobs have no access to the console. Process-File must return a string.
Adjust $JobCount and $inData to suit your machine and folder.
The main idea is to load the whole file list into a ConcurrentQueue, start $JobCount background jobs, and wait for them to exit. Each job takes a value from the queue, passes it to Process-File, and repeats until the queue is empty.
NOTE: If you stop the script, the RS jobs keep running until they finish or PowerShell is closed. Use Get-RSJob | Stop-RSJob and Get-RSJob | Remove-RSJob to stop the background work.
Import-Module PoshRSJob
Function Process-File
{
Param(
[String]$FilePath
)
$process = Start-Process -FilePath 'ping.exe' -ArgumentList '-n 5 127.0.0.1' -PassThru -WindowStyle Hidden
$process.WaitForExit();
return "Processed $FilePath"
}
$JobCount = [Environment]::ProcessorCount - 2
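# AllDirectories makes the enumeration recurse into subfolders
$inData = [System.Collections.Concurrent.ConcurrentQueue[string]]::new(
[System.IO.Directory]::EnumerateFiles('S:\SCRIPTS\FileTest', '*.img', [System.IO.SearchOption]::AllDirectories)
)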
$JobScript = [scriptblock]{
$inQueue = [System.Collections.Concurrent.ConcurrentQueue[string]]$args[0]
$outBag = [System.Collections.Concurrent.ConcurrentBag[string]]$args[1]
$currentItem = $null
while($inQueue.TryDequeue([ref] $currentItem) -eq $true)
{
try
{
# Add result to OutBag
$result = Process-File -FilePath $currentItem -EA Stop
$outBag.Add( $result )
}
catch
{
# Catch error
Write-Output $_.Exception.ToString()
}
}
}
$resultData = [System.Collections.Concurrent.ConcurrentBag[string]]::new()
$i_cur = $inData.Count
$i_max = $i_cur
# Start jobs
$jobs = @(1..$JobCount) | % { Start-RSJob -ScriptBlock $JobScript -ArgumentList @($inData, $resultData) -FunctionsToImport @('Process-File') }
# Wait queue to empty
while($i_cur -gt 0)
{
Write-Progress -Activity 'Doing job' -Status "$($i_cur) left of $($i_max)" -PercentComplete (100 - ($i_cur / $i_max * 100))
Start-Sleep -Seconds 3 # Update frequency
$i_cur = $inData.Count
}
# Wait jobs to complete
$logs = $jobs | % { Wait-RSJob -Job $_ } | % { Receive-RSJob -Job $_ }
$jobs | % { Remove-RSJob -Job $_ }
$Global:resultData = $resultData
$Global:logs = $logs
$Global:resultData holds the strings returned by Process-File, and $Global:logs holds whatever the jobs wrote to their output (here, any caught exceptions).
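The dequeue loop at the heart of each job can be tried on its own, without PoshRSJob. The following is a minimal single-threaded sketch, assuming PowerShell 5 or later; the file names and the $processed list are hypothetical stand-ins and not part of the code above.

```powershell
# Hypothetical file names standing in for the real .img paths
$queue = [System.Collections.Concurrent.ConcurrentQueue[string]]::new()
'a.img', 'b.img', 'c.img' | ForEach-Object { $queue.Enqueue($_) }

$processed = [System.Collections.Generic.List[string]]::new()
$item = $null

# TryDequeue returns $false once the queue is empty, which ends the loop.
# With several workers, each item is handed to exactly one of them.
while ($queue.TryDequeue([ref] $item)) {
    $processed.Add("Processed $item")
}

$processed        # one line per dequeued item
$queue.Count      # 0 after the loop
```

Because TryDequeue is safe to call from several threads at once, the same loop can run unchanged in every background job.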

Which is better: storing the result of Get-ChildItem in a variable or in a file?
If you're hoping to keep memory utilization low, the best solution is not to store the paths at all. Simply consume the output from Get-ChildItem directly:
$pe = Get-ChildItem -Path $path1 -Recurse -File -Filter *.img | ForEach-Object {
$file = $_.FullName
$tb = (Get-Date -f HH:mm:ss) | tee -FilePath $out -Append
$fp = "$n. $file" | tee -FilePath $out -Append
#
$ep = & $ir$pt -c $ht"FIR" -i $file | tee -FilePath $out -Append
$as = "`n" | tee -FilePath $out -Append
# PRINT CONSOLE IMG FILES PROCESS
Write-Host $tb
Write-Host $fp
Out-host -InputObject $ep
Write-Host $as
$n++
}
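The streaming idea can be tried on a small throwaway folder. The sketch below is only an illustration under stated assumptions: the demo folder, the file names, and the import.log name are hypothetical, and it swaps the per-line tee -Append calls for a single StreamWriter, since each tee -Append call opens and closes the log file again, which may become noticeable over hundreds of thousands of files.

```powershell
# Hypothetical demo tree in the temp folder, standing in for $path1
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("imgdemo_" + [guid]::NewGuid().ToString('N'))
$sub  = Join-Path $root 'sub'
New-Item -ItemType Directory -Path $sub -Force | Out-Null
New-Item -ItemType File -Path (Join-Path $root 'a.img') | Out-Null
New-Item -ItemType File -Path (Join-Path $sub 'b.img') | Out-Null
New-Item -ItemType File -Path (Join-Path $sub 'c.txt') | Out-Null

$log    = Join-Path $root 'import.log'
$writer = [System.IO.StreamWriter]::new($log, $true)   # $true = append mode
$count  = 0
try {
    # Each FileInfo flows straight from Get-ChildItem into the loop body;
    # the full list is never held in a variable or a temporary file.
    Get-ChildItem -Path $root -Recurse -File -Filter *.img | ForEach-Object {
        $count++
        $writer.WriteLine(('{0}. {1}' -f $count, $_.FullName))
        # The call to the external tool would go here.
    }
}
finally {
    $writer.Dispose()   # flush and close the log even if the loop fails
}
$count   # number of .img files seen
```

Memory use stays roughly constant this way, because only one file object is alive in the pipeline at a time.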

Related

Out-File only printing half of my results

I currently have a CSV file that has 2,440 lines of data. The data looks something like:
server1:NT:Y:N:N:00:N
server2:NT:Y:N:n:33:N
This is what I have so far:
$newCsvPath = Get-Content .\sever.csv |
Where-Object { $_ -notmatch '^#|^$|^"#' }
[int]$windows = 0
[int]$totalsever = 0
$Results = @()
$date = Get-Date -Format g
Clear-Content .\results.csv -Force
foreach ($thing in $newCsvPath) {
$totalsever++
$split = $thing -split ":"
if ($split[1] -contains "NT") {
$windows++
$thing | Out-File results.csv -Append -Force
} else {
continue
}
}
Clear-Content .\real.csv -Force
$servers = Get-Content results.csv
foreach ($server in $servers) {
$server.Split(':')[0] | Out-File real.csv -Append -Force
}
My issue is that when the script gets to the $server.Split(':')[0] | Out-File real.csv -Append -Force part, for some reason it writes only 1,264 lines to "real.csv" instead of all 2,440. However, when I remove | Out-File real.csv -Append -Force, $server holds ALL 2,440 server names.
Does anyone have any idea of why this is happening?

Powershell script using more and more memory

I have a PowerShell script with a while loop. When I first start the script it uses 15,504 KB of memory, but every pass through the while loop seems to increase the memory usage by about 100 KB, and it never seems to go down.
$exe = 'C:\Program Files (x86)\MyApp\MyApp.exe'
$logOutput = 'C:\Log\Log.log'
$logdir1 = 'C:\Log'
$storage = 'ext'
$date = (Get-Date)
"{0:dd/MM/yy HH:mm:ss} - START" -f $date | Out-File -filepath $logOutput -NoClobber -Append -Encoding ASCII
While ($true){
$date = (Get-Date)
"{0:dd/MM/yy HH:mm:ss} - Calling exe for D:\" -f $date | Out-File -filepath $logOutput -NoClobber -Append -Encoding ASCII
& $exe /logdir $logdir1 /storage $storage
Start-Sleep -s 5
$date = (Get-Date)
"{0:dd/MM/yy HH:mm:ss} - Calling exe for E:\" -f $date | Out-File -filepath $logOutput -NoClobber -Append -Encoding ASCII
& $exe /logdir $logdir1 /storage $storage
Start-Sleep -s 10
}
Is there something wrong with my script?

Powershell Out-file not working

I have a powershell script which gathers some file information on some remote servers.
When writing to the console, I see the results just fine, but they don't seem to be getting written into the specified out-file:
out-file -filepath $filepath -inputobject $date -force -encoding ASCII -width 50
ForEach($server in $serverLists)
{
$result += $server
$result += $break
ForEach($filePath in $filePaths)
{
$result += $filePath
$result += $break
$command = "forfiles /P " + $filePath + " /S /D -1 > C:\temp\result.txt"
$cmd = "cmd /c $command"
Invoke-WmiMethod -class Win32_process -name Create -ArgumentList $cmd -ComputerName $server
sleep 2
$result += Get-Content \\$server\C$\temp\result.txt
$result += $break
$result += $break
write-host $result
#out-file -filepath $filepath -Append -inputobject $result -encoding ASCII -width 200
Remove-Item \\$server\C$\temp\result.txt
}
$result += $break
}
out-file -filepath $filepath -Append -inputobject $result -encoding ASCII -width 200
You use the same variable, $filepath, in two places: as the loop variable of the inner foreach loop and outside the loop as the output path (PowerShell variable names are case-insensitive, so $filePath and $filepath are the same variable). A foreach loop does not restore the previous value of its loop variable when it ends. So, right after the loop, the variable holds the last value it was assigned in the loop, or the value at which the loop was interrupted by a break statement. If you do not want the variable to be overwritten by the foreach loop, choose a different name for the loop variable.
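The effect is easy to reproduce in isolation. This is a minimal sketch with hypothetical paths; $outFile and $folder are illustrative names, not taken from the script above.

```powershell
$filePath = 'C:\temp\report.txt'        # hypothetical output file

# Variable names are case-insensitive, so this loop reuses $filePath
foreach ($filepath in 'D:\logs', 'E:\logs') {
    # loop body omitted
}

$filePath   # now 'E:\logs', the last value assigned by the loop

# Using a distinct loop variable leaves the output path untouched
$outFile = 'C:\temp\report.txt'
foreach ($folder in 'D:\logs', 'E:\logs') { }
$outFile    # still 'C:\temp\report.txt'
```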

Get the return value from another PowerShell script

I have a PS script that calls Invoke-Expression on other scripts on the same computer.
Here is the code:
$webMemory = "C:\Memory_Script\WebMemory_Script.ps1"
$intMemory = "C:\Memory_Script\IntMemory_Script.ps1"
$hungWeb = "C:\Scripts\HungWeb_Script.ps1"
$hungInt = "C:\Scripts\HungInt_Script.ps1"
$intMemoryResult = @()
$webMemoryResult = @()
$hungWebResult = @()
$hungIntResult = @()
$date = Get-Date
$shortDate = (get-date -format ddMMyyy.hhmm)
$filepath = "C:\Scripts\Memory&HungResults\Results" + $shortdate + ".txt"
$break = "`r`n"
out-file -filepath $filepath -inputobject $date -force -encoding ASCII -width 50
out-file -filepath $filepath -Append -inputobject $break -encoding ASCII -width 50
$intMemoryResult += Invoke-Expression $intMemory
$webMemoryResult += Invoke-Expression $webMemory
$hungWebResult += Invoke-Expression $hungWeb
$hungIntResult += Invoke-Expression $hungInt
Write-host $webMemoryResult
out-file -filepath $filepath -Append -inputobject $intMemoryResult -encoding ASCII -width 200
out-file -filepath $filepath -Append -inputobject $break -encoding ASCII -width 200
out-file -filepath $filepath -Append -inputobject $webMemoryResult -encoding ASCII -width 200
out-file -filepath $filepath -Append -inputobject $break -encoding ASCII -width 200
out-file -filepath $filepath -Append -inputobject $hungIntResult -encoding ASCII -width 200
out-file -filepath $filepath -Append -inputobject $break -encoding ASCII -width 200
out-file -filepath $filepath -Append -inputobject $hungWebResult -encoding ASCII -width 200
out-file -filepath $filepath -Append -inputobject $break -encoding ASCII -width 200
Code in one of the scripts being called (the other three have similar functions)
$serverList = @("List of servers")
$w3wpMemory = @()
$w3wpMemory += "---------- W3WP Memory Consumption ----------"
$w3wpresult = @()
$toBeRecycled = @()
$toBeRecycled += "******************** THE INT SERVERS BELOW NEED TO BE RECYCLED (Hung) ********************" + "`r`n"
$date = Get-Date
$shortDate = (get-date -format ddMMyyy.hhmm)
$filepath = "C:\Scripts\HungIntResults\HungServerResults" + $shortdate + ".txt"
$break = "`r`n"
out-file -filepath $filepath -inputobject $date -force -encoding ASCII -width 50
out-file -filepath $filepath -Append -inputobject $break -encoding ASCII -width 50
ForEach($server in $serverList)
{
$w3wpresult += (get-wmiobject Win32_Process -filter "commandline like '%serviceoptimization%'" -computername $server).privatepagecount / 1gb
$w3wpMemory += $server + ":" + $w3wpresult + "`n"
}
$i = 0
ForEach($server in $serverList)
{
$w3wpresult2 = (get-wmiobject Win32_Process -filter "commandline like '%serviceoptimization%'" -computername $server).privatepagecount / 1gb
Write-Host $w3wpresult2 " , " ($w3wpresult | select-object -index $i)
if($w3wpresult -contains ($w3wpresult2))
{
$toBeRecycled += $server + "`r`n"
}
$i = $i + 1
}
$toBeRecycled += "*******************************************************************************"
$toBeRecycled += "`r`n"
Write-Host $toBeRecycled
out-file -filepath $filepath -Append -inputobject $toBeRecycled -encoding ASCII -width 100
return $toBeRecycled
When the script runs, I see the output from the execution of the other scripts.
However, the results captured from the Invoke-Expression calls are null. Why is this?
Write-Host writes directly to the host display. If you want to capture this output, use Write-Output instead, or just put the variable on a line by itself, because anything not otherwise consumed goes to the output stream by default:
$toBeRecycled
BTW when you execute a PowerShell script from another PowerShell script, the child script will execute synchronously (unless you are using jobs).
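The difference shows up as soon as the result is assigned to a variable. This is a minimal sketch; Show-Only and Send-Output are hypothetical helper names used only for illustration.

```powershell
function Show-Only   { Write-Host 'W3WP report' }     # hypothetical helper
function Send-Output { 'W3WP report' }                # hypothetical helper

$fromHost   = Show-Only      # text appears on screen, nothing is captured
$fromOutput = Send-Output    # string is captured in the variable

$null -eq $fromHost          # True
$fromOutput                  # W3WP report
```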

PowerShell passing parameter with Invoke-Command

I can't seem to find the correct syntax to pass two variables from the CALL-script to the execution script so that it runs on the remote server. I tried single quotes, double quotes, and brackets, but nothing I can find passes the $Target and $OlderThanDays parameters to the script.
Thank you for your help.
The CALL-Script:
#================= VARIABLES ==================================================
$ScriptDir = "\\Server\Scripts"
#================= BODY =======================================================
# Invoke-Command -ComputerName SERVER1 -FilePath $ScriptDir\"Auto_Clean.ps1"
Invoke-Command -FilePath .\Test.ps1 -ComputerName SERVER01 -ArgumentList {-Target ´E:\Share\Dir1\Dir2´,-OlderThanDays ´10´}
The execution Script:
#================= PARAMETERS =================================================
Param(
[Parameter(Mandatory=$True,Position=1)]
[string]$Target,
[Parameter(Mandatory=$True,Position=2)]
[string]$OlderThanDays
)
#================= BODY =======================================================
# Set start time & logname
$StartTime = (Get-Date).ToShortDateString()+", "+(Get-Date).ToLongTimeString()
$LogName = "Auto_Clean.log"
# Format header for log
$TimeStamp = (Get-Date).ToShortDateString()+" | "+(Get-Date).ToLongTimeString()+" |"
$Header = "`n$TimeStamp Deleting files and folders that are older than $OlderThanDays days:`n"
Write-Output "$Header" # to console
Out-File $Target\$LogName -inputobject $Header -Append # to log
# PS 2.0 Workaround (`tee-object -append`) // PS 4.0: `Write-Output "`nDeleting folders that are older than $OlderThanDays days:`n" | Tee-Object $LogFile -Append`
# Remove files older than
Get-ChildItem -Path $Target -Exclude $LogName -Recurse |
Where-Object { $_.LastWriteTime -lt (Get-Date).AddDays(-$OlderThanDays) } | ForEach {
$Item = $_.FullName
Remove-Item $Item -Recurse -Force -ErrorAction SilentlyContinue
$Timestamp = (Get-Date).ToShortDateString()+" | "+(Get-Date).ToLongTimeString()
# If folder can't be removed
if (Test-Path $Item)
{ "$Timestamp | FAILLED: $Item (IN USE)" }
else
{ "$Timestamp | REMOVED: $Item" }
} | Out-File $Target\$LogName -Append
# PS 4.0: ´| Tee-Object $Target\$LogName -Append` # Output folder names to console & logfile at the same time
# Remove empty folders
while (Get-ChildItem $Target -recurse | where {!@(Get-ChildItem -force $_.FullName)} | Test-Path) {
Get-ChildItem $Target -recurse | where {!@(Get-ChildItem -force $_.FullName)} | Remove-Item
}
# Format footer
$EndTime = (Get-Date).ToShortDateString()+", "+(Get-Date).ToLongTimeString()
$TimeTaken = New-TimeSpan -Start $StartTime -End $EndTime
Write-Output ($Footer = @"
Start Time : $StartTime
End Time : $EndTime
Total Runtime : $TimeTaken
$("-"*79)
"@)
# Write footer to log
Out-File -FilePath $Target\$LogName -Append -InputObject $Footer
# Clean up variables
$Target=$StartTime=$EndTime=$OlderThanDays = $null
The execution script:
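You have to use straight quotes (" or '), not the acute accent (´), and -ArgumentList takes plain positional values rather than named parameters inside a script block:
-ArgumentList @('E:\Share\Dir1\Dir2',10)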
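The positional binding can be checked locally before involving a remote server. The sketch below is an assumption-laden stand-in: it uses -ScriptBlock without -ComputerName so that it runs on the local machine, and the script block is a hypothetical substitute for the body of Test.ps1.

```powershell
# Hypothetical stand-in for the body of Test.ps1
$script = {
    Param(
        [string]$Target,
        [string]$OlderThanDays
    )
    "Target=$Target; OlderThanDays=$OlderThanDays"
}

# Values are bound by position: first to $Target, second to $OlderThanDays
$result = Invoke-Command -ScriptBlock $script -ArgumentList 'E:\Share\Dir1\Dir2', 10
$result   # Target=E:\Share\Dir1\Dir2; OlderThanDays=10
```

The same -ArgumentList value should carry over to the original call with -FilePath .\Test.ps1 -ComputerName SERVER01.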