powershell start process within for each loop - powershell

I have a foreach loop that I wish to run in parallel. However, the iterations do not kick off at uniform times, which is leading to some edge-case timing issues.
As such, I've changed my loop just to be sequential. However, I don't want to wait for the iteration commands to complete.
Each iteration is essentially doing:
Invoke-Pester @{Path = "tests.ps1"; Parameters = @{...}} -Tag 'value' -OutputFile $xmlpath -OutputFormat NUnitXML -EnableExit
I want each iteration to run sequentially (tests expect to be run sequentially) however I don't wish to wait for tests to complete.
What is the best way to ensure the iteration does not wait for the Invoke-Pester command to complete, such that the next iteration kicks off after the previous iteration has initiated? I've tried using Start-Process Invoke-Pester which I think invalidated further code structure.
Thank you

One way to run work asynchronously is with PowerShell jobs. A very basic example:
$jobs = foreach ($xmlPath in (...)) {
Start-job { Invoke-Pester -OutputFile $Args[0] ... } -ArgumentList $xmlPath
}
# get the job results later:
$jobs | Receive-Job
However, jobs are slow, because each one starts a separate PowerShell process. A faster, though slightly more complex, approach is to use background runspaces:
$jobs = foreach ($xmlPath in (...)) {
$ps = [Powershell]::Create()
[void]$ps.AddScript({ Invoke-Pester -OutputFile $Args[0] ... })
[void]$ps.AddArgument($xmlPath)
@{
PowerShell = $ps
AsyncResult = $ps.BeginInvoke()
}
}
# get results:
foreach ($job in $jobs) {
$job.PowerShell.EndInvoke($job.AsyncResult)
$job.PowerShell.Dispose()
}
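If the caller should keep working while the runspaces run, the IsCompleted property of each async result can be checked before calling EndInvoke(). The following is only a minimal sketch of that idea: the script block (a short sleep followed by a multiplication) is a stand-in for the real Invoke-Pester call, and the variable names are illustrative.

```powershell
# Start three background runspaces; the script block is a placeholder workload.
$tasks = foreach ($n in 1..3) {
    $ps = [powershell]::Create()
    [void] $ps.AddScript({ param($x) Start-Sleep -Milliseconds 200; $x * 2 }).AddArgument($n)
    @{
        PowerShell  = $ps
        AsyncResult = $ps.BeginInvoke()   # returns immediately
    }
}

# ... other work can happen here ...

# Poll with a short sleep (not a tight loop) until every runspace has finished.
while ($tasks | Where-Object { -not $_.AsyncResult.IsCompleted }) {
    Start-Sleep -Milliseconds 50
}

# Collect the output in start order and release the resources.
$results = foreach ($t in $tasks) {
    $t.PowerShell.EndInvoke($t.AsyncResult)
    $t.PowerShell.Dispose()
}
$results
```

Because each BeginInvoke() call returns as soon as the runspace has started, the iterations are initiated strictly in loop order without waiting for earlier ones to finish.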

Related

Test-Path timeout for PowerShell

I'm trying to routinely check the presence of particular strings in text files on hundreds of computers on our domain.
foreach ($computer in $computers) {
$hostname = $computer.DNSHostName
if (Test-Connection $hostname -Count 2 -Quiet) {
$FilePath = "\\" + $hostname + "\c$\SomeDirectory\SomeFile.txt"
if (Test-Path -Path $FilePath) {
# Check for string
}
}
}
For the most part, the pattern of Test-Connection and then Test-Path is effective and fast. There are certain computers, however, that ping successfully but Test-Path takes around 60 seconds to resolve to FALSE. I'm not sure why, but it may be a domain trust issue.
For situations like this, I would like to have a timeout for Test-Path that defaults to FALSE if it takes more than 2 seconds.
Unfortunately the solution in a related thread (How can I wrap this Powershell cmdlet into a timeout function?) does not apply to my situation. The proposed do-while loop gets hung up in the code block.
I've been experimenting with Jobs but it appears even this won't force quit the Test-Path command:
Start-Job -ScriptBlock {param($Path) Test-Path $Path} -ArgumentList $Path | Wait-Job -Timeout 2 | Remove-Job -Force
The job continues to hang in the background. Is this the cleanest way I can achieve my requirements above? Is there a better way to timeout Test-Path so the script doesn't hang besides spawning asynchronous activities? Many thanks.
Wrap your code in a [powershell] object and call BeginInvoke() to execute it asynchronously, then use the associated WaitHandle to wait for it to complete only for a set amount of time.
$sleepDuration = Get-Random 2,3
$ps = [powershell]::Create().AddScript("Start-Sleep -Seconds $sleepDuration; 'Done!'")
# execute it asynchronously
$handle = $ps.BeginInvoke()
# Wait 2500 milliseconds for it to finish
if(-not $handle.AsyncWaitHandle.WaitOne(2500)){
throw "timed out"
return
}
# WaitOne() returned $true, let's fetch the result
$result = $ps.EndInvoke($handle)
return $result
In the example above, we randomly sleep for either 2 or 3 seconds, but set a 2 and a half second timeout - try running it a couple of times to see the effect :)
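Applied to the original question, the same pattern could be wrapped in a helper. This is a sketch under assumptions: the function name Test-PathWithTimeout and its parameters are hypothetical, and on timeout the blocked call is only asked to stop, so the underlying thread may linger until Test-Path itself gives up.

```powershell
function Test-PathWithTimeout {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [int] $TimeoutMs = 2000
    )
    $ps = [powershell]::Create()
    [void] $ps.AddScript({ param($p) Test-Path -Path $p }).AddArgument($Path)

    # Run Test-Path asynchronously and wait at most $TimeoutMs for it.
    $handle = $ps.BeginInvoke()
    if ($handle.AsyncWaitHandle.WaitOne($TimeoutMs)) {
        $output = $ps.EndInvoke($handle)
        $ps.Dispose()
        return ($output.Count -gt 0 -and [bool] $output[0])
    }

    # Timed out: request a stop without blocking and report $false.
    # Dispose() is deliberately skipped here, because it could block
    # on the still-running call.
    [void] $ps.BeginStop($null, $null)
    return $false
}
```

Inside the loop from the question, the call Test-PathWithTimeout -Path $FilePath -TimeoutMs 2000 would take the place of the plain Test-Path call.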

Copy-item using invoke-async in Powershell

This article shows how to use Invoke-Async in PowerShell: https://sqljana.wordpress.com/2018/03/16/powershell-sql-server-run-in-parallel-collect-sql-results-with-print-output-from-across-your-sql-farm-fast/
I wish to run in parallel the copy-item cmdlet in PowerShell because the alternative is to use FileSystemObject via Excel and copy one file at a time out of a total of millions of files.
I have cobbled together the following:
<#
.SYNOPSIS
<Brief description>
For examples type:
Get-Help .\<filename>.ps1 -examples
.DESCRIPTION
Copies files from one path to another
.PARAMETER FileList
e.g. C:\path\to\list\of\files\to\copy.txt
.PARAMETER NumCopyThreads
default is 8 (but can be 100 if you want to stress the machine to maximum!)
.EXAMPLE
.\CopyFilesToBackup -filelist C:\path\to\list\of\files\to\copy.txt
.NOTES
#>
[CmdletBinding()]
Param(
[String] $FileList = "C:\temp\copytest.csv",
[int] $NumCopyThreads = 8
)
$filesToCopy = New-Object "System.Collections.Generic.List[fileToCopy]"
$csv = Import-Csv $FileList
foreach($item in $csv)
{
$file = New-Object fileToCopy
$file.SrcFileName = $item.SrcFileName
$file.DestFileName = $item.DestFileName
$filesToCopy.add($file)
}
$sb = [scriptblock] {
param($file)
Copy-item -Path $file.SrcFileName -Destination $file.DestFileName
}
$results = Invoke-Async -Set $filesToCopy -SetParam file -ScriptBlock $sb -Verbose -Measure:$true -ThreadCount 8
$results | Format-Table
Class fileToCopy {
[String]$SrcFileName = ""
[String]$DestFileName = ""
}
the csv input for which looks like this:
SrcFileName,DestFileName
C:\Temp\dummy-data\101438\101438-0154723869.zip,\\backupserver\Project Archives\101438\0154723869.zip
C:\Temp\dummy-data\101438\101438-0165498273.xlsx,\\backupserver\Project Archives\101438\0165498273.xlsx
What am I missing to get this working, because when I run .\CopyFiles.ps1 -FileList C:\Temp\test.csv nothing happens. The files exist in the source path, but the file objects aren't being pulled from the -Set collection. (Unless I have misunderstood how the collection is used?)
No, I can't use robocopy to do this because there are millions of files which resolve to different paths depending upon their original location.
I have no explanation for your symptom based on the code in your question (see bottom section), but I suggest basing your solution on the (now) standard Start-ThreadJob cmdlet (comes with PowerShell Core; in Windows PowerShell, install it with Install-Module ThreadJob -Scope CurrentUser, for instance[1]):
Such a solution is more efficient than use of the third-party Invoke-Async function, which as of this writing is flawed in that it waits for jobs to finish in a tight loop, which creates unnecessary processing overhead.
Start-ThreadJob jobs are a lightweight, thread-based alternative to the process-based Start-Job background jobs, yet they integrate with the standard job-management cmdlets, such as Wait-Job and Receive-Job.
Here's a self-contained example based on your code that demonstrates its use:
Note: Whether you use Start-ThreadJob or Invoke-Async, you won't be able to explicitly reference custom classes such as [fileToCopy] in a script block that runs in a separate thread (runspace; see bottom section), so the solution below simply uses [pscustomobject] instances with the properties of interest, for simplicity and brevity.
# Create sample CSV file with 10 rows.
$FileList = Join-Path ([IO.Path]::GetTempPath()) "tmp.$PID.csv"
@'
Foo,SrcFileName,DestFileName,Bar
1,c:\tmp\a,\\server\share\a,baz
2,c:\tmp\b,\\server\share\b,baz
3,c:\tmp\c,\\server\share\c,baz
4,c:\tmp\d,\\server\share\d,baz
5,c:\tmp\e,\\server\share\e,baz
6,c:\tmp\f,\\server\share\f,baz
7,c:\tmp\g,\\server\share\g,baz
8,c:\tmp\h,\\server\share\h,baz
9,c:\tmp\i,\\server\share\i,baz
10,c:\tmp\j,\\server\share\j,baz
'@ | Set-Content $FileList
# How many threads at most to run concurrently.
$NumCopyThreads = 8
Write-Host 'Creating jobs...'
$dtStart = [datetime]::UtcNow
# Import the CSV data and transform it to [pscustomobject] instances
# with only .SrcFileName and .DestFileName properties - they take
# the place of your original [fileToCopy] instances.
$jobs = Import-Csv $FileList | Select-Object SrcFileName, DestFileName |
ForEach-Object {
# Start the thread job for the file pair at hand.
Start-ThreadJob -ThrottleLimit $NumCopyThreads -ArgumentList $_ {
param($f)
$simulatedRuntimeMs = 2000 # How long each job (thread) should run for.
# Delay output for a random period.
$randomSleepPeriodMs = Get-Random -Minimum 100 -Maximum $simulatedRuntimeMs
Start-Sleep -Milliseconds $randomSleepPeriodMs
# Produce output.
"Copied $($f.SrcFileName) to $($f.DestFileName)"
# Wait for the remainder of the simulated runtime.
Start-Sleep -Milliseconds ($simulatedRuntimeMs - $randomSleepPeriodMs)
}
}
Write-Host "Waiting for $($jobs.Count) jobs to complete..."
# Synchronously wait for all jobs (threads) to finish and output their results
# *as they become available*, then remove the jobs.
# NOTE: Output will typically NOT be in input order.
Receive-Job -Job $jobs -Wait -AutoRemoveJob
Write-Host "Total time lapsed: $([datetime]::UtcNow - $dtStart)"
# Clean up the temp. file
Remove-Item $FileList
The above yields something like:
Creating jobs...
Waiting for 10 jobs to complete...
Copied c:\tmp\b to \\server\share\b
Copied c:\tmp\g to \\server\share\g
Copied c:\tmp\d to \\server\share\d
Copied c:\tmp\f to \\server\share\f
Copied c:\tmp\e to \\server\share\e
Copied c:\tmp\h to \\server\share\h
Copied c:\tmp\c to \\server\share\c
Copied c:\tmp\a to \\server\share\a
Copied c:\tmp\j to \\server\share\j
Copied c:\tmp\i to \\server\share\i
Total time lapsed: 00:00:05.1961541
Note that the output received does not reflect the input order, and that the overall runtime is roughly 2 times the per-thread runtime of 2 seconds (plus overhead), because 2 "batches" have to be run due to the input count being 10, whereas only 8 threads were made available.
If you upped the thread count to 10 or more (50 is the default), the overall runtime would drop to 2 seconds plus overhead, because all jobs then run concurrently.
Caveat: The above numbers stem from running in PowerShell Core, version on Microsoft Windows 10 Pro (64-bit; Version 1903), using version 2.0.1 of the ThreadJob module.
Inexplicably, the same code is much slower in Windows PowerShell, v5.1.18362.145.
However, for performance and memory consumption it is better to use batching (chunking) in your case, i.e., to process multiple file pairs per thread.
The following solution demonstrates this approach; tweak $chunkSize to find a batch size that works for you.
# Create sample CSV file with 10 rows.
$FileList = Join-Path ([IO.Path]::GetTempPath()) "tmp.$PID.csv"
@'
Foo,SrcFileName,DestFileName,Bar
1,c:\tmp\a,\\server\share\a,baz
2,c:\tmp\b,\\server\share\b,baz
3,c:\tmp\c,\\server\share\c,baz
4,c:\tmp\d,\\server\share\d,baz
5,c:\tmp\e,\\server\share\e,baz
6,c:\tmp\f,\\server\share\f,baz
7,c:\tmp\g,\\server\share\g,baz
8,c:\tmp\h,\\server\share\h,baz
9,c:\tmp\i,\\server\share\i,baz
10,c:\tmp\j,\\server\share\j,baz
'@ | Set-Content $FileList
# How many threads at most to run concurrently.
$NumCopyThreads = 8
# How many files to process per thread
$chunkSize = 3
# The script block to run in each thread, which now receives a
# $chunkSize-sized *array* of file pairs.
$jobScriptBlock = {
param([pscustomobject[]] $filePairs)
$simulatedRuntimeMs = 2000 # How long each job (thread) should run for.
# Delay output for a random period.
$randomSleepPeriodMs = Get-Random -Minimum 100 -Maximum $simulatedRuntimeMs
Start-Sleep -Milliseconds $randomSleepPeriodMs
# Produce output for each pair.
foreach ($filePair in $filePairs) {
"Copied $($filePair.SrcFileName) to $($filePair.DestFileName)"
}
# Wait for the remainder of the simulated runtime.
Start-Sleep -Milliseconds ($simulatedRuntimeMs - $randomSleepPeriodMs)
}
Write-Host 'Creating jobs...'
$dtStart = [datetime]::UtcNow
$jobs = & {
# Process the input objects in chunks.
$i = 0
$chunk = [pscustomobject[]]::new($chunkSize)
Import-Csv $FileList | Select-Object SrcFileName, DestFileName | ForEach-Object {
$chunk[$i % $chunkSize] = $_
if (++$i % $chunkSize -ne 0) { return }
# Note the need to wrap $chunk in a single-element helper array (, $chunk)
# to ensure that it is passed *as a whole* to the script block.
Start-ThreadJob -ThrottleLimit $NumCopyThreads -ArgumentList (, $chunk) -ScriptBlock $jobScriptBlock
$chunk = [pscustomobject[]]::new($chunkSize) # we must create a new array
}
# Process any remaining objects.
# Note: $chunk -ne $null returns those elements in $chunk, if any, that are non-null
if ($remainingChunk = $chunk -ne $null) {
Start-ThreadJob -ThrottleLimit $NumCopyThreads -ArgumentList (, $remainingChunk) -ScriptBlock $jobScriptBlock
}
}
Write-Host "Waiting for $($jobs.Count) jobs to complete..."
# Synchronously wait for all jobs (threads) to finish and output their results
# *as they become available*, then remove the jobs.
# NOTE: Output will typically NOT be in input order.
Receive-Job -Job $jobs -Wait -AutoRemoveJob
Write-Host "Total time lapsed: $([datetime]::UtcNow - $dtStart)"
# Clean up the temp. file
Remove-Item $FileList
While the output is effectively the same, note how only 4 jobs were created this time, each of which processed (up to) $chunkSize (3) file pairs.
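The chunking step can also be looked at in isolation. The sketch below uses a hypothetical helper, Split-IntoChunk (not part of any module), and operates on an in-memory array rather than on streamed pipeline input, so it trades the memory efficiency of the streaming version above for simplicity.

```powershell
function Split-IntoChunk {
    param(
        [Parameter(Mandatory)] [object[]] $InputObject,
        [Parameter(Mandatory)] [int] $ChunkSize
    )
    for ($i = 0; $i -lt $InputObject.Count; $i += $ChunkSize) {
        $last = [Math]::Min($i + $ChunkSize, $InputObject.Count) - 1
        # The unary comma emits each chunk as a single array object
        # instead of letting the pipeline enumerate its elements.
        , ($InputObject[$i..$last])
    }
}

# 10 inputs with a chunk size of 3 give 4 chunks: 3 + 3 + 3 + 1.
$chunks = @(Split-IntoChunk -InputObject (1..10) -ChunkSize 3)
"Chunk count: $($chunks.Count)"
"Last chunk:  $($chunks[-1] -join ', ')"
```

Each element of $chunks could then be passed to Start-ThreadJob via -ArgumentList (, $chunk), as in the solution above.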
As for what you tried:
The screen shot you show suggests that the problem is that your custom class, [fileToCopy], isn't visible to the script block run by Invoke-Async.
Since Invoke-Async invokes the script block via the PowerShell SDK in separate runspaces that know nothing about the caller's state, it is to be expected that these runspaces don't know your class (this equally applies to Start-ThreadJob).
However, it is unclear why that is a problem in your code, because your script block doesn't make an explicit reference to your class: your script-block parameter $file is not type-constrained (it is implicitly [object]-typed).
Therefore, simply accessing the properties of your custom-class instance inside the script block should work, and indeed does in my tests on Windows PowerShell v5.1.18362.145 on Microsoft Windows 10 Pro (64-bit; Version 1903).
However, if your real script-block code were to explicitly reference custom class [fileToCopy] - such as by defining the parameter as param([fileToCopy] $file) - you would see the symptom.
[1] In Windows PowerShell v3 and v4, which do not come with the PowerShellGet module, Install-Module isn't available by default. However, the module can be installed on demand, as described in Installing PowerShellGet.

Increment a variable in a job

I want to increment a variable in a PowerShell job with a number defined before the job starts.
I tried with a global variable but it didn't work, so now I try to write in a file and load it in my job but that didn't work either.
I summarize my loop:
$increment = 1
$Job_Nb = 1..3
foreach ($nb in $Job_Nb) {
$increment > "C:\increment.txt"
Start-Job -Name $nb -ScriptBlock {
$increment_job = Get-Content -Path "C:\increment.txt"
$increment_job
}
$increment++
}
I want my 2 variables $increment_job and $increment to be equal.
I get the correct result with the Wait-Job command, like this:
$increment = 1
$Job_Nb = 1..3
foreach ($nb in $Job_Nb) {
$increment > "C:\increment.txt"
Start-Job -Name $nb -ScriptBlock {
$increment_job = Get-Content -Path "C:\increment.txt"
$increment_job
} | Wait-Job | Receive-Job
$increment++
}
But I can't wait for each job to finish before starting the next one; it takes too long. I need to execute a lot of jobs in the background.
For me, even $nb, $increment and $increment_job can be equal.
If it can help you to understand, a really simple way to put it:
$nb = 1
$Job_Nb = 1..3
foreach ($nb in $Job_Nb) {
Start-Job -Name $nb -ScriptBlock {$nb}
$nb++
}
If you want the two variables to be equal, you can just pass $increment into your script block as an argument.
# Configure the Jobs
$increment = 1
$Job_Nb = 1..3
Foreach ($nb in $Job_Nb) {
Start-Job -Name $nb -ScriptBlock {
$increment_job = $args[0]
$increment_job
} -ArgumentList $increment
$increment++
}
# Retrieve the Jobs After Waiting For All to Complete
Wait-Job -Name $job_nb | Receive-Job
The problem with your initial approach as you have discovered is that PowerShell processes the entire loop before a single job completes. Therefore, the job doesn't read the increment.txt file until after its contents are set to 3.
When a script block has no param block, values passed to -ArgumentList are automatically assigned to the $args array. Comma-separated values each become an element of the array. A single value can be retrieved as either $args or $args[0]; the difference is that $args is of type Object[], whereas $args[0] has the type of the value you passed in.
Obviously, you do not need to wait for all jobs to complete. You can just use Get-Job to retrieve whichever jobs you want at any time.
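As a variation, the value can be bound to a named parameter, or read through the $using: scope modifier (available in PowerShell v3 and later). The sketch below shows both side by side; the property names FromParam and FromUsing are illustrative only.

```powershell
# Start one job per value. Each job receives the value as it was
# when Start-Job was called, not as it is when the job later runs.
$jobs = foreach ($increment in 1..3) {
    Start-Job -ScriptBlock {
        param($Value)
        [pscustomobject]@{
            FromParam = $Value             # bound from -ArgumentList
            FromUsing = $using:increment   # captured from the caller's scope
        }
    } -ArgumentList $increment
}

# Wait for all jobs, collect their output, then clean up.
$results = $jobs | Wait-Job | Receive-Job
$jobs | Remove-Job
$results | Sort-Object FromParam | Format-Table FromParam, FromUsing
```

Either mechanism avoids the shared file, so there is no race between the loop and the jobs.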

Powershell command timeout

I am trying to execute a function or a scriptblock in powershell and set a timeout for the execution.
Basically I have the following (translated into pseudocode):
function query{
#query remote system for something
}
$computerList = Get-Content "C:\scripts\computers.txt"
foreach ($computer in $computerList){
$result = query
#do something with $result
}
The query can range from a WMI query using Get-WmiObject to a HTTP request and the script has to run in a mixed environment, which includes Windows and Unix machines which do not all have a HTTP interface.
Some of the queries will therefore necessarily hang or take a VERY long time to return.
In my quest for optimization I have written the following:
$blockofcode = {
#query remote system for something
}
foreach ($computer in $computerList){
$Job = Start-Job -ScriptBlock $blockofcode -ArgumentList $computer
Wait-Job $Job.ID -Timeout 10 | out-null
$result = Receive-Job $Job.ID
#do something with result
}
But unfortunately jobs seem to carry a LOT of overhead. In my tests a query that executes in 1.066 seconds (according to timers inside $blockofcode) took 6.964 seconds to return a result when executed as a Job. Of course it works, but I would really like to reduce that overhead. I could also start all jobs together and then wait for them to finish, but the jobs can still hang or take ridiculous amounts of time to complete.
So, on to the question: is there any way to execute a statement, function, scriptblock or even a script with a timeout that does not comprise the kind of overhead that comes with jobs? If possible I would like to run the commands in parallel, but that is not a deal-breaker.
Any help or hints would be greatly appreciated!
EDIT: running powershell V3 in a mixed windows/unix environment
Today, I ran across a similar question, and noticed that there wasn't an actual answer to this question. I created a simple PowerShell class, called TimedScript. This class provides the following functionality:
Method: Start() method to kick off the job, when you're ready
Method: GetResult() method, to retrieve the output of the script
Constructor: A constructor that takes two parameters:
ScriptBlock to execute
[int] timeout period, in milliseconds
It currently lacks:
Passing in arguments to the PowerShell ScriptBlock
Other useful features you think up
Class: TimedScript
class TimedScript {
[System.Timers.Timer] $Timer = [System.Timers.Timer]::new()
[powershell] $PowerShell
[runspace] $Runspace = [runspacefactory]::CreateRunspace()
[System.IAsyncResult] $IAsyncResult
TimedScript([ScriptBlock] $ScriptBlock, [int] $Timeout) {
$this.PowerShell = [powershell]::Create()
$this.PowerShell.AddScript($ScriptBlock)
$this.PowerShell.Runspace = $this.Runspace
$this.Timer.Interval = $Timeout
Register-ObjectEvent -InputObject $this.Timer -EventName Elapsed -MessageData $this -Action ({
$Job = $event.MessageData
$Job.PowerShell.Stop()
$Job.Runspace.Close()
$Job.Timer.Enabled = $False
})
}
### Method: Call this when you want to start the job.
[void] Start() {
$this.Runspace.Open()
$this.Timer.Start()
$this.IAsyncResult = $this.PowerShell.BeginInvoke()
}
### Method: Once the job has finished, call this to get the results
[object[]] GetResult() {
return $this.PowerShell.EndInvoke($this.IAsyncResult)
}
}
Example Usage of TimedScript Class
# EXAMPLE: The timeout period is set longer than the execution time of the script, so this will succeed
$Job1 = [TimedScript]::new({ Start-Sleep -Seconds 2 }, 4000)
# EXAMPLE: This script will fail. Even though Get-Process returns quickly, the Start-Sleep call will cause it to be terminated by its Timer.
$Job2 = [TimedScript]::new({ Get-Process -Name s*; Start-Sleep -Seconds 3 }, 2000)
# EXAMPLE: This job will fail, because the timeout is less than the script execution time.
$Job3 = [TimedScript]::new({ Start-Sleep -Seconds 3 }, 1000)
$Job1.Start()
$Job2.Start()
$Job3.Start()
Code is also hosted on GitHub Gist.
I think you might want to investigate using Powershell runspaces:
http://learn-powershell.net/2012/05/13/using-background-runspaces-instead-of-psjobs-for-better-performance/
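A rough sketch of what the runspace approach might look like for this question follows. It assumes PowerShell v3 or later, uses hypothetical host names, and replaces the real query with a placeholder script block. The timeout is applied per task while the results are collected in order, so the worst-case total wait can exceed a single timeout.

```powershell
$computerList = 'host-a', 'host-b', 'host-c'   # hypothetical names
$timeoutMs = 10000

# A pool of up to 4 runspaces that run the queries in parallel.
$pool = [runspacefactory]::CreateRunspacePool(1, 4)
$pool.Open()

$tasks = foreach ($computer in $computerList) {
    $ps = [powershell]::Create()
    $ps.RunspacePool = $pool
    # Placeholder for the real remote query.
    [void] $ps.AddScript({ param($c) "queried $c" }).AddArgument($computer)
    [pscustomobject]@{
        Computer   = $computer
        PowerShell = $ps
        Handle     = $ps.BeginInvoke()
    }
}

$results = foreach ($t in $tasks) {
    if ($t.Handle.AsyncWaitHandle.WaitOne($timeoutMs)) {
        $t.PowerShell.EndInvoke($t.Handle)
        $t.PowerShell.Dispose()
    }
    else {
        # Ask the hung query to stop without blocking the loop.
        [void] $t.PowerShell.BeginStop($null, $null)
        Write-Warning "$($t.Computer) timed out"
    }
}

# Note: closing the pool may block if a timed-out query is still running.
$pool.Close()
$results
```

Since the runspaces live inside the current process, this avoids the per-job process startup cost that makes Start-Job slow.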

Powershell: How do I get the exit code returned from a process run inside a PsJob?

I have the following job in powershell:
$job = start-job {
...
c:\utils\MyToolReturningSomeExitCode.cmd
} -ArgumentList $JobFile
How do I access the exit code returned by c:\utils\MyToolReturningSomeExitCode.cmd ? I have tried several options, but the only one I could find that works is this:
$job = start-job {
...
c:\utils\MyToolReturningSomeExitCode.cmd
$LASTEXITCODE
} -ArgumentList $JobFile
...
# collect the output
$exitCode = $job | Wait-Job | Receive-Job -ErrorAction SilentlyContinue
# output all, except the last line
$exitCode[0..($exitCode.Length - 2)]
# the last line is the exit code
exit $exitCode[-1]
I find this approach too wry to my delicate taste. Can anyone suggest a nicer solution?
Important, I have read in the documentation that powershell must be run as administrator in order for the job related remoting stuff to work. I cannot run it as administrator, hence -ErrorAction SilentlyContinue. So, I am looking for solutions not requiring admin privileges.
Thanks.
If all you need is to do something in background while the main script does something else then PowerShell class is enough (and it is normally faster). Besides it allows passing in a live object in order to return something in addition to output via parameters.
$code = @{}
$job = [PowerShell]::Create().AddScript({
param($JobFile, $Result)
cmd /c exit 42
$Result.Value = $LASTEXITCODE
'some output'
}).AddArgument($JobFile).AddArgument($code)
# start the job
$async = $job.BeginInvoke()
# do some other work while $job is working
#.....
# end the job, get results
$job.EndInvoke($async)
# the exit code is $code.Value
"Code = $($code.Value)"
UPDATE
The original code was with [ref] object. It works in PS V3 CTP2 but does not work in V2. So I corrected it, we can use other objects instead, a hashtable, for example, in order to return some data via parameters.
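If a regular background job is preferred, one alternative to parsing the last output line is to have the job return a single object that carries both the tool's output and its exit code. This is only a sketch: the property names are illustrative, and the current PowerShell executable with an "exit 42" command stands in for the real tool.

```powershell
# Path of the running PowerShell executable, used as a stand-in native tool.
$exe = (Get-Process -Id $PID).Path

$job = Start-Job -ScriptBlock {
    param($ToolPath)
    # Run the native command; capture its output and its exit code separately.
    $output = & $ToolPath -NoProfile -Command "Write-Output 'some output'; exit 42"
    [pscustomobject]@{
        Output   = $output
        ExitCode = $LASTEXITCODE
    }
} -ArgumentList $exe

# Wait for the job, fetch the result object, and remove the job.
$r = $job | Receive-Job -Wait -AutoRemoveJob
"Output   = $($r.Output)"
"ExitCode = $($r.ExitCode)"
```

The object comes back deserialized, but simple properties such as strings and integers survive the round trip intact.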
One way you can detect if the background job failed or not based on an exit code is to evaluate the exit code inside the background job itself and throw an exception if the exit code indicates an error occurred. For instance, consider the following example:
# Define the script block first; it is started as a job further down.
$job = {
# ...
$output = & C:\utils\MyToolReturningSomeExitCode.cmd 2>&1
if ($LASTEXITCODE -ne 0) {
throw "Job failed. The error was: {0}." -f ([string] $output)
}
}
$myJob = Start-Job -ScriptBlock $job -ArgumentList $JobFile | Wait-Job
if ($myJob.State -eq 'Failed') {
Receive-Job -Job $myJob
}
A couple things of note in this example. I am redirecting the standard error output stream to the standard output stream to capture all textual output from the batch script and returning it if the exit code is non-zero indicating it failed to run. By throwing an exception this way the background job object State property will let us know the result of the job.