How can I speed up a PowerShell foreach loop - powershell

I have a PowerShell script that connects to a database and pulls a list of user data. I then loop over that data with a foreach loop, running a script for each entry.
This works, but it's slow: the result set can be 1000+ entries, and Script.bat has to complete for User A before it can start for User B. Each user's Script.bat run is independent of the others and takes ~30s.
Is there a way to speed this up at all? I've been playing with -Parallel, ForEach-Object and workflow, but I can't get it to work, likely because I'm a PowerShell noob.
foreach ($row in $Dataset.tables[0].rows)
{
    $UserID = $row.value
    $DeviceID = $row.value1
    $EmailAddress = $row.email_address
    cmd.exe /c "`"$PSScriptRoot`"\bin\Script.bat -c `" -Switch $UserID`" >> `"$PSScriptRoot`"\${FileName3}_REST_${DateTime}.txt 2> nul"
}

You said it yourself, your bottleneck is with the batch file in your script, not the loop itself. foreach (as opposed to ForEach-Object) is already the faster foreach loop mechanism in PowerShell. Investigate your batch file to find out why it takes 30 seconds to complete, and optimize it where you can.
Using Jobs
Note: Start-Job will run the job under another process. If you have PowerShell Core you can make use of the Start-ThreadJob cmdlet in lieu of Start-Job. This will start your job as part of another thread of the same process instead of starting another process.
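As a quick illustration (this sketch is mine, not part of the original answer), the swap is a drop-in change; Start-ThreadJob also accepts a -ThrottleLimit parameter that caps how many thread jobs run concurrently:
# Same shape as Start-Job; -ThrottleLimit 10 is a hypothetical cap.
$job = Start-ThreadJob -ThrottleLimit 10 -ScriptBlock { "work for one user here" }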
If you can't optimize your batch script, or can't optimize it enough to meet your needs, then you can consider using Start-Job to kick off each execution asynchronously, and then check the result and get any output from it using Receive-Job. For example:
# Master list of jobs you need to check the result of later
$jobs = New-Object System.Collections.Generic.List[System.Management.Automation.Job]

# Run your script for each row
foreach ($row in $Dataset.tables[0].rows)
{
    $UserID = $row.value
    $DeviceID = $row.value1
    $EmailAddress = $row.email_address

    # Use Start-Job here to kick off the script and store the job information
    # for later retrieval.
    # The $using: scope modifier allows you to make use of variables that were
    # defined in the session calling Start-Job (note that $DateTime needs it too).
    $job = Start-Job -ScriptBlock { cmd.exe /c "`"${using:PSScriptRoot}`"\bin\Script.bat -c `" -Switch ${using:UserID}`" >> `"${using:PSScriptRoot}`"\${using:FileName3}_REST_${using:DateTime}.txt 2> nul" }

    # Add the execution to the $jobs list to check the result of later.
    # Casting to void here prevents the Add method from returning the object
    # we've added.
    [void]$jobs.Add($job)
}
# Wait for the jobs to be done
Write-Host 'Waiting for all jobs to complete...'
while ( $jobs | Where-Object { $_.State -eq 'Running' } ) {
    Start-Sleep -Seconds 10
}

# Retrieve the output of the jobs
foreach ( $j in $jobs ) {
    Receive-Job $j
}
Note: Since you need to execute this script ~1000 times, you may want to write your logic so that only a certain number of jobs run at a time; one way to throttle is sketched below. My example above starts all necessary jobs without limiting how many execute at once.
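By way of illustration, here is a minimal throttling sketch of my own (the cap of 10 is a hypothetical value; tune it to your machine). It polls the running-job count and blocks before each Start-Job until a slot frees up:
$maxConcurrentJobs = 10   # hypothetical cap
$jobs = New-Object System.Collections.Generic.List[System.Management.Automation.Job]
foreach ($row in $Dataset.tables[0].rows)
{
    # Block until fewer than $maxConcurrentJobs jobs are still running.
    # (Get-Job counts every job in the session, so clean up unrelated jobs first.)
    while (@(Get-Job -State Running).Count -ge $maxConcurrentJobs) {
        Start-Sleep -Seconds 1
    }
    $UserID = $row.value
    $job = Start-Job -ScriptBlock { cmd.exe /c "`"${using:PSScriptRoot}`"\bin\Script.bat -c `" -Switch ${using:UserID}`"" }
    [void]$jobs.Add($job)
}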
For more information about jobs and the properties you can inspect on a running/completed job, check the links below:
About Jobs
Job Class
Using Scope*
* The documentation states that the using scope can only be declared when working with remote sessions, but this seems to work fine with Start-Job even if the job is local.

Related

Do threads still execute using -asjob with wait-job?

Hello all and good afternoon!
I had a quick question regarding -AsJob running with Invoke-Command.
If I run 2 Invoke-Command calls using -AsJob, do they run simultaneously when I try to receive the output? Does this mean Wait-Job waits until the first job specified has finished running before getting the next results?
Write-Host "Searching for PST and OST files. Please be patient!" -BackgroundColor White -ForegroundColor DarkBlue
$pSTlocation = Invoke-Command -ComputerName localhost -ScriptBlock {Get-Childitem "C:\" -Recurse -Filter "*.pst" -ErrorAction SilentlyContinue | % {Write-Host $_.FullName,$_.lastwritetime}} -AsJob
$OSTlocation = Invoke-Command -ComputerName localhost -ScriptBlock {Get-Childitem "C:\Users\me\APpdata" -Recurse -Filter "*.ost" -ErrorAction SilentlyContinue | % {Write-Host $_.FullName,$_.lastwritetime} } -AsJob
$pSTlocation | Wait-Job | Receive-Job
$OSTlocation | Wait-Job | Receive-Job
Also, another question: can I save the output of the jobs to a variable without it showing in the console? I'm trying to make it check whether there's any return value: if there is, output it; if there isn't, do something else.
I tried:
$job1 = $pSTlocation | Wait-Job | Receive-Job
if (!$job1) { Write-Host "PST Found: $job1" } else { "No PST Found" }
$job2 = $OSTlocation | Wait-Job | Receive-Job
if (!$job2) { Write-Host "OST Found: $job2" } else { "No OST Found" }
No luck, it outputs the following:
Note: This answer does not directly answer the question - see the other answer for that; instead, it shows a reusable idiom for waiting for multiple jobs to finish in a non-blocking fashion.
The following sample code uses the child-process-based Start-Job cmdlet to create local jobs, but the solution works equally well with local thread-based jobs created by Start-ThreadJob and with jobs based on remotely executing Invoke-Command -ComputerName ... -AsJob commands, as used in the question.
It shows a reusable idiom for waiting for multiple jobs to finish in a non-blocking fashion that allows for other activity while waiting, along with collecting per-job output in an array.
Here, the output is only collected after each job completes, but note that collecting it piecemeal, as it becomes available, is also an option, using (potentially multiple) Receive-Job calls even before a job finishes.
# Start two jobs, which run in parallel, and store the objects
# representing them in array $jobs.
# Replace the Start-Job calls with your
#   Invoke-Command -ComputerName ... -AsJob
# calls.
$jobs = (Start-Job { Get-Date; sleep 1 }),
        (Start-Job { Get-Date '1970-01-01'; sleep 2 })

# Initialize a helper array to keep track of which jobs haven't finished yet.
$remainingJobs = $jobs

# Wait iteratively *without blocking* until any job finishes and receive and
# output its output, until all jobs have finished.
# Collect all results in $jobResults.
$jobResults =
  while ($remainingJobs) {
    # Check if at least 1 job has terminated.
    if ($finishedJob = $remainingJobs | Where State -in Completed, Failed, Stopped, Disconnected | Select -First 1) {
      # Output the just-finished job's results as part of a custom object
      # that also contains the original command and the
      # specific termination state.
      [pscustomobject] @{
        Job    = $finishedJob.Command
        State  = $finishedJob.State
        Result = $finishedJob | Receive-Job
      }
      # Remove the just-finished job from the array of remaining ones...
      $remainingJobs = @($remainingJobs) -ne $finishedJob
      # ... and also as a job managed by PowerShell.
      Remove-Job $finishedJob
    } else {
      # Do other things...
      Write-Host . -NoNewline
      Start-Sleep -Milliseconds 500
    }
  }

# Output the jobs' results
$jobResults
Note:
It's tempting to try $remainingJobs | Wait-Job -Any -Timeout 0 to momentarily check for termination of any one job without blocking execution, but as of PowerShell 7.1 this doesn't work as expected: even already completed jobs are never returned - this appears to be a bug, discussed in GitHub issue #14675.
If I run 2 Invoke-Command's using -asjob, does it run simultaneously when I try to receive the output?
Yes, PowerShell jobs always run in parallel, whether they're executing remotely, as in your case (with Invoke-Command -AsJob, assuming that localhost in the question is just a placeholder for the actual name of a different computer), or locally (using Start-Job or Start-ThreadJob).
However, by using (separate) Wait-Job calls, you are synchronously waiting for each job to finish (in a fixed sequence, too). That is, each Wait-Job call blocks further execution until the target job terminates.[1]
Note, however, that both jobs continue to execute while you're waiting for the first one to finish.
If, instead of waiting in a blocking fashion, you want to perform other operations while you wait for both jobs to finish, you need a different approach, detailed in the other answer.
can i save the output of the jobs to a variable without it showing to the console?
Yes, but the problem is that in your remotely executing script block ({ ... }) you're mistakenly using Write-Host in an attempt to output data.
Write-Host is typically the wrong tool to use, unless the intent is to write to the display only, bypassing the success output stream and with it the ability to send output to other commands, capture it in a variable, or redirect it to a file. To output a value, use it by itself; e.g., $value instead of Write-Host $value (or use Write-Output $value, though that is rarely needed); see this answer.
Therefore, your attempt to collect the job's output in a variable failed, because the Write-Host output bypassed the success output stream that variable assignments capture and went straight to the host (console):
# Because the job's script block uses Write-Host, its output goes to the *console*,
# and nothing is captured in $job1
$job1 = $pSTlocation | Wait-Job | Receive-Job
(Incidentally, the command could be simplified to $job1 = $pSTlocation | Receive-Job -Wait.)
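To illustrate the fix (this sketch is mine, not part of the original answer), the script block can emit objects instead of Write-Host text, so the output travels the success stream and a variable assignment can capture it; note the truth test is also un-inverted relative to the question's attempt:
# Emit the file info itself rather than Write-Host text.
$pSTlocation = Invoke-Command -ComputerName localhost -AsJob -ScriptBlock {
    Get-ChildItem "C:\" -Recurse -Filter "*.pst" -ErrorAction SilentlyContinue |
        Select-Object FullName, LastWriteTime
}
$job1 = $pSTlocation | Receive-Job -Wait
if ($job1) { "PST Found:"; $job1 } else { "No PST Found" }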
[1] Note that Wait-Job has an optional -Timeout parameter, which allows you to limit waiting to at most a given number of seconds and return without output if the target job hasn't finished yet. However, as of PowerShell 7.1, -Timeout 0 for non-blocking polling for whether jobs have finished does not work - see GitHub issue #14675.

Stop a process running longer than an hour

I posted a question a couple of days ago: I needed a PowerShell script that would start a service if it was stopped, stop the process if it had been running longer than an hour and then start it again, and do nothing if it had been running for less than an hour. I was given a great script that really helped, but I'm trying to convert it to work with a "process". I have the following code (below) but am getting the following error:
Error
"cmdlet Start-Process at command pipeline position 3
Supply values for the following parameters:
FilePath: "
PowerShell
# for debugging
$PSDefaultParameterValues['*Process:Verbose'] = $true

$str = Get-Process -Name "Chrome"
if ($str.Status -eq 'stopped') {
    $str | Start-Process
} elseif ($str.StartTime -lt (Get-Date).AddHours(-1)) {
    $str | Stop-Process -PassThru | Start-Process
} else {
    'Chrome is running and StartTime is within the past hour!'
}
# other logic goes here
Your $str is storing a list of all processes with the name "Chrome", so I imagine you want a single process. You'll need to specify an ID in Get-Process or use $str[0] to single out a specific process in the list.
When you store a single process in $str, you'll see that printing $str.Status outputs nothing, because Status isn't a property of a process: a process is either running or it doesn't exist. Given that, you may want your logic to instead check whether it can find the process, and start the process if it can't - in which case it needs the path to the executable to start it (a sketch follows below). More info with examples can be found here: https://technet.microsoft.com/en-us/library/41a7e43c-9bb3-4dc2-8b0c-f6c32962e72c?f=255&MSPPError=-2147217396
If you're using PowerShell ISE, try storing the process in a variable in the terminal, then type the variable name followed by a dot, and IntelliSense (if it's on) will list all of its available properties.
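Here is a minimal sketch of that logic (my illustration, not from the linked page; the chrome.exe path is a hypothetical placeholder):
$chromePath = 'C:\Program Files\Google\Chrome\Application\chrome.exe'   # hypothetical path
$procs = Get-Process -Name 'chrome' -ErrorAction SilentlyContinue
if (-not $procs) {
    # Not running: Start-Process needs -FilePath, which caused the original error.
    Start-Process -FilePath $chromePath
} elseif (($procs | Sort-Object StartTime | Select-Object -First 1).StartTime -lt (Get-Date).AddHours(-1)) {
    # The oldest instance has been running for over an hour: restart.
    $procs | Stop-Process
    Start-Process -FilePath $chromePath
} else {
    'Chrome is running and StartTime is within the past hour!'
}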

Assign a variable inside of a scriptblock while running a job

Related to Terminate part of powershell script and continue.
Partially related to Powershell Job Always Shows Complete.
My script runs locally and accesses the registry hive of a remote PC. I need the values of registry keys to be written into a $RegHive variable. And I want to monitor it as a job so that, in case some PC freezes, I can terminate the command and move on to another PC.
My original code would be:
$global:RegHive = $null
$job = Start-Job -ScriptBlock {
    $RegHive = [Microsoft.Win32.RegistryKey]::OpenRemoteBaseKey("SomeKeyName", "SomePCName")
}
But no matter what I do, the variable $RegHive is empty.
If I do $RegHive = (Get-Job | Receive-Job), some value gets assigned to $RegHive that on the one hand looks exactly as if I had run the command normally, without a job/scriptblock, i.e.:
$RegHive = [Microsoft.Win32.RegistryKey]::OpenRemoteBaseKey("SomeKeyName", "SomePCName")
and it even has the same $RegHive.SubKeyCount.
But the "normal" one has the $RegHive.GetSubKeyNames() method and the one from the job doesn't.
How do I escape assigning a variable with Receive-Job and do the assignment directly inside the scriptblock, which is run as a job?
In simple words:
$job = Start-Job -ScriptBlock { $a = 1 + 2 }
How do I get $a to equal 3 without $a = (Get-Job | Receive-Job)?
This might be helpful for you. The job itself acts sort of like a variable.
What you can do is name the job, and then call it by name with -Keep to keep its stored value - it will store all final output inside itself until you call it. (The output can be kept, but the default is to discard it once received.)
$global:RegHive = $null
Start-Job -Name "RegHive" -ScriptBlock {
    [Microsoft.Win32.RegistryKey]::OpenRemoteBaseKey("SomeKeyName", "SomePCName")
}
Receive-Job -Name "RegHive" -Keep
Obviously, calling Receive-Job immediately afterwards defeats the purpose of jobs: they add a lot of overhead and are only efficient when you need to do multiple things at once. If you start hundreds or thousands at once, you can do Get-Job | Wait-Job and then, once they're finished, start using their outputs. Wait-Job also accepts job names, or it can wait on your entire list of jobs.
Another option to set the variable is:
$RegHive = Receive-Job -Name "RegHive" -Keep
And finally, you can use the value inline:
get-<insert command> -value "$(Receive-Job -Name 'RegHive' -Keep)" -argument2 "YADA YADA"
Remember, -Keep does not delete the stored output, so it can be "received" again later.
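Putting this together for the freeze scenario in the question, here is a minimal sketch of my own ("SomeKeyName"/"SomePCName" are the question's placeholders): wait with a timeout, and only receive the output if the job actually finished. Note that, as the question itself observed, the received object is a deserialized snapshot, so methods such as GetSubKeyNames() won't be available on it.
$job = Start-Job -Name "RegHive" -ScriptBlock {
    [Microsoft.Win32.RegistryKey]::OpenRemoteBaseKey("SomeKeyName", "SomePCName")
}
# Give the remote PC up to 30 seconds; a frozen PC leaves the job Running.
$null = Wait-Job -Job $job -Timeout 30
if ($job.State -eq 'Completed') {
    $RegHive = Receive-Job -Job $job
} else {
    Stop-Job -Job $job    # terminate and move on to the next PC
}
Remove-Job -Job $job -Force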

Forcing a powershell script to the next line

I have a PowerShell script that at one point calls 2 other PowerShell scripts. It runs one script to completion, then the other, which makes the whole thing take longer. Can I force the script to kick off the other scripts and keep cycling through? When I used to run these scripts manually I would have 20-30 sessions running and walk away while they worked; what I wrote took the monotony out of clicking through them manually.
Here's the parent script:
$List = Get-Content C:\archive\${env:id}.txt
$Batch = New-Object System.Collections.ArrayList
foreach ($Data in $List) {
    if ($Data -eq "" -or $Data -eq $List[-1]) {
        $ProjectName = $Batch[0]
        Out-File C:\archive\"$ProjectName".txt
        foreach ($Data in $Batch -ne $Batch[0]) {
            Add-Content -Path C:\archive\"$ProjectName".txt -Exclude $Batch[0] -Value $Data
        }
        # --> these two calls run one after the other; each blocks until done
        C:\archive\GetPrograms.ps1 $ProjectName
        C:\archive\GetNetwork.ps1 $ProjectName
        $Batch = New-Object System.Collections.ArrayList
    }
    else {
        [void]$Batch.Add($Data)
    }
}
The parent script is not contingent on the data produced by the other 2 scripts; it simply executes them, passing in data.
Honestly, based on your use case description, you really want to be looking at parallel job/task/script processing.
Here is a post along the lines of what should satisfy your goals.
How do I run my PowerShell scripts in parallel without using Jobs?
"Update - While this answer explains the process and mechanics of PowerShell runspaces and how they can help you multi-thread non-sequential workloads, fellow PowerShell aficionado Warren 'Cookie Monster' F has gone the extra mile and incorporated these same concepts into a single tool called Invoke-Parallel - it does what I describe below, and he has since expanded it with optional switches for logging and prepared session state including imported modules, really cool stuff - I strongly recommend you check it out before building your own shiny solution!"
https://serverfault.com/questions/626711/how-do-i-run-my-powershell-scripts-in-parallel-without-using-jobs
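For a quick illustration of the jobs-based route (my sketch, not from the linked post, and assuming the child scripts take the project name as a positional parameter), the two child-script calls inside the loop could be launched asynchronously and awaited after the loop:
$childJobs = @()   # initialize once, before the loop
# Inside the loop, replace the two blocking calls with:
$childJobs += Start-Job -FilePath C:\archive\GetPrograms.ps1 -ArgumentList $ProjectName
$childJobs += Start-Job -FilePath C:\archive\GetNetwork.ps1 -ArgumentList $ProjectName
# After the loop: wait for all child scripts and collect any output.
$childJobs | Wait-Job | Receive-Job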

copy / create multiple files

I need to first create and then copy some hundreds of folders & files via PowerShell (first create them on a local store, then copy them to a remote store).
However, when my foreach loop runs, every 40th or so write attempt fails because "another process" is blocking the file/folder.
I currently work around the issue with a simple sleep between every file creation (100 ms). However, I wonder if there is a better way? Especially when copying multiple files, the right sleep duration depends on the network latency, so this doesn't seem like a good solution to me.
Is there a way to wait until the write operation on a file has completed before starting another operation? Or to check whether a file is still blocked by a process and wait until it's free again?
Have you tried running your code as a job? Example:
foreach ($file in $files) {
    $job = Start-Job -ScriptBlock {
        # operation here...
    } | Wait-Job
    # Log the result of the job here, using e.g. '$job | Receive-Job' to get its output
}
You could also extend it to create multiple jobs, and then use Get-Job | Wait-Job to wait for them all to finish before you proceed, as sketched below.
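A minimal sketch of that extension (my illustration; the copy operation and destination are placeholders):
foreach ($file in $files) {
    Start-Job -ScriptBlock {
        # Placeholder operation: copy one file to the remote store.
        Copy-Item -Path $using:file -Destination '\\remote\store\'   # hypothetical destination
    } | Out-Null
}
# Wait for all jobs, collect their output, then clean up.
Get-Job | Wait-Job | Receive-Job
Get-Job | Remove-Job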