I have a recursive function that is executed around 750~ times - iterating over XML files and processing. The code is running using Start-Job
Example below:
$job = Start-Job -ScriptBlock {
function Test-Function {
Param
(
$count
)
Write-Host "Count is: $count"
$count++
Test-Function -count $count
}
Test-Function -count 1
}
Output:
$job | Receive-Job
Count is: 224
Count is: 225
Count is: 226
Count is: 227
The script failed due to call depth overflow.
The depth overflow occurs at 227 consistently on my machine. If I remove Start-Job, I can reach 750~ (and further). I am using jobs for batch processing.
Is there a way to configure the depth overflow value when using Start-Job?
Is this a limitation of PowerShell Jobs?
I can't answer about the specifics of call depth overflow limitations in PS 5.1 / 7.2 but you could do your recursion based-off a Queue within the job.
So instead of doing the recursion within the function, you do it from outside (still within the job though).
Here's what this look like.
$job = Start-Job -ScriptBlock {
$Queue = [System.Collections.Queue]::new()
function Test-Function {
Param
(
$count
)
Write-Host "Count is: $count"
$count++
# Next item to process.
$Queue.Enqueue($Count)
}
# Call the function once
Test-Function -count 1
# Process the queue
while ($Queue.Count -gt 0) {
Test-Function -count $Queue.Dequeue()
}
}
Reference:
.net Queue class
Not an answer that would solve the problem but rather an informative one. You could use an instance of the PowerShell SDK's [powershell] class instead of Start-Job, which can handle more levels of recursion (by a big amount) in case it helps, here are my results:
Technique
Iterations
PowerShell Version
Operating System
Start-Job
226~
5.1
Windows 10
Start-Job
2008~
7.2.1
Linux
PowerShell Instance
4932~
5.1
Windows 10
PowerShell Instance
11193~
7.2.1
Linux
Code to reproduce
$instance = [powershell]::Create().AddScript({
function Test-Function {
Param($count)
Write-Host "Count is: $count"
$count++
Test-Function -count $count
}
Test-Function -count 1
})
$handle = $instance.BeginInvoke()
do {
$done = $handle.AsyncWaitHandle.WaitOne(500)
} until($done)
$instance.Streams.Information[-1]
$instance.Dispose()
Related
Assuming Get-Foo and Get-Foo2 and Deploy-Jobs are 3 functions that are part of a very large module. I would like to use Get-Foo and Get-Foo2 in Deploy-Jobs's Start-ThreadJob (below) without reloading the entire module each time.
Is an working example available for how to do this?
function Deploy-Jobs {
foreach ($Device in $Devices) {
Start-ThreadJob -Name $Device -ThrottleLimit 50 -InitializationScript $initScript -ScriptBlock {
param($Device)
Get-Foo | Get-Foo2 -List
} -ArgumentList $Device | out-null
}
}
The method you can use to pass the function's definition to a different scope is the same for Invoke-Command (when PSRemoting), Start-Job, Start-ThreadJob and ForeEach-Object -Parallel. Since you want to invoke 2 different functions in your job's script block, I don't think -InitializationScript is an option, and even if it is, it might make the code even more complicated than it should be.
You can use this as an example of how you can store 2 function definitions in an array ($def), which is then passed to the scope of each TreadJob, this array is then used to define each function in said scope to be later used by each Job.
function Say-Hello {
"Hello world!"
}
function From-ThreadJob {
param($i)
"From ThreadJob # $i"
}
$def = #(
${function:Say-Hello}.ToString()
${function:From-ThreadJob}.ToString()
)
function Run-Jobs {
param($numerOfJobs, $functionDefinitions)
$jobs = foreach($i in 1..$numerOfJobs) {
Start-ThreadJob -ScriptBlock {
# bring the functions definition to this scope
$helloFunc, $threadJobFunc = $using:functionDefinitions
# define them in this scope
${function:Say-Hello} = $helloFunc
${function:From-ThreadJob} = $threadJobFunc
# sleep random seconds
Start-Sleep (Get-Random -Maximum 10)
# combine the output from both functions
(Say-Hello) + (From-ThreadJob -i $using:i)
}
}
Receive-Job $jobs -AutoRemoveJob -Wait
}
Run-Jobs -numerOfJobs 10 -functionDefinitions $def
Assuming Get-Foo and Get-Foo2 and Deploy-Jobs are 3 functions that are part of a very large module. I would like to use Get-Foo and Get-Foo2 in Deploy-Jobs's Start-ThreadJob (below) without reloading the entire module each time.
Is an working example available for how to do this?
function Deploy-Jobs {
foreach ($Device in $Devices) {
Start-ThreadJob -Name $Device -ThrottleLimit 50 -InitializationScript $initScript -ScriptBlock {
param($Device)
Get-Foo | Get-Foo2 -List
} -ArgumentList $Device | out-null
}
}
The method you can use to pass the function's definition to a different scope is the same for Invoke-Command (when PSRemoting), Start-Job, Start-ThreadJob and ForeEach-Object -Parallel. Since you want to invoke 2 different functions in your job's script block, I don't think -InitializationScript is an option, and even if it is, it might make the code even more complicated than it should be.
You can use this as an example of how you can store 2 function definitions in an array ($def), which is then passed to the scope of each TreadJob, this array is then used to define each function in said scope to be later used by each Job.
function Say-Hello {
"Hello world!"
}
function From-ThreadJob {
param($i)
"From ThreadJob # $i"
}
$def = #(
${function:Say-Hello}.ToString()
${function:From-ThreadJob}.ToString()
)
function Run-Jobs {
param($numerOfJobs, $functionDefinitions)
$jobs = foreach($i in 1..$numerOfJobs) {
Start-ThreadJob -ScriptBlock {
# bring the functions definition to this scope
$helloFunc, $threadJobFunc = $using:functionDefinitions
# define them in this scope
${function:Say-Hello} = $helloFunc
${function:From-ThreadJob} = $threadJobFunc
# sleep random seconds
Start-Sleep (Get-Random -Maximum 10)
# combine the output from both functions
(Say-Hello) + (From-ThreadJob -i $using:i)
}
}
Receive-Job $jobs -AutoRemoveJob -Wait
}
Run-Jobs -numerOfJobs 10 -functionDefinitions $def
I have the following PowerShell function to help me do benchmarks. The idea is that you provide the command and a number and the function will run the code that number of times. After that the function will report the testing result, such as Min, Max and Average time taken.
function Measure-MyCommand()
{
[CmdletBinding()]
Param (
[Parameter(Mandatory = $True)] [scriptblock] $ScriptBlock,
[Parameter()] [int] $Count = 1,
[Parameter()] [switch] $ShowOutput
)
$time_elapsed = #();
while($Count -ge 1) {
$timer = New-Object 'Diagnostics.Stopwatch';
$timer.Start();
$temp = & $ScriptBlock;
if($ShowOutput) {
Write-Output $temp;
}
$timer.Stop();
$time_elapsed += $timer.Elapsed;
$Count--;
}
$stats = $time_elapsed | Measure-Object -Average -Minimum -Maximum -Property Ticks;
Write-Host "Min: $((New-Object 'System.TimeSpan' $stats.Minimum).TotalMilliseconds) ms";
Write-Host "Max: $((New-Object 'System.TimeSpan' $stats.Maximum).TotalMilliseconds) ms";
Write-Host "Avg: $((New-Object 'System.TimeSpan' $stats.Average).TotalMilliseconds) ms";
}
The problem is with the switch parameter $ShowOutput. As I understand, when you provide a switch parameter, its value is true. Otherwise it's false. However it doesn't seem to work. See my testing.
PS C:\> Measure-MyCommand -ScriptBlock {Write-Host "f"} -Count 3 -ShowOutput
f
f
f
Min: 0.4935 ms
Max: 0.8392 ms
Avg: 0.6115 ms
PS C:\> Measure-MyCommand -ScriptBlock {Write-Host "f"} -Count 3
f
f
f
Min: 0.4955 ms
Max: 0.8296 ms
Avg: 0.6251 ms
PS C:\>
Can anyone help to explain it please?
This is because Write-Host doesn't return the object and is not a part of pipeline stream, instead it sends text directly to console. You can't simply hide Write-Host output by writing it to variable, because Write-Host writes to console stream which isn't exposed in PowerShell before version 5. What you need is to replace Write-Host with Write-Output cmdlet call. Then you get what you expect. This solution is valid for PowerShell 2.0+.
Update:
Starting with PowerShell 5, Write-Host writes to its dedicated stream and you can handle and redirect this stream somewhere else. See this response for more details: https://stackoverflow.com/a/60353648/3997611.
The problem is with the specific script block you're passing, not with your [switch] parameter declaration (which works fine):
{Write-Host "f"}
By using Write-Host, you're bypassing PowerShell's success output stream (which the assignment to variable $temp collects) and printing directly to the host[1] (the console), so your output always prints.
To print to the success output stream, use Write-Output, or better yet, use PowerShell's implicit output feature:
# "f", due not being capture or redirected, is implicitly sent
# to the success output stream.
Measure-MyCommand -ScriptBlock { "f" } -Count 3 -ShowOutput
[1] In PowerShell 5.0 and higher, Write-Host now writes to a new stream, the information stream (number 6), which by default prints to the host. See about_Redirection.
Therefore, a 6> redirection now does allow you to capture Write-Host output; e.g.: $temp = Write-Host hi 6>&1. Note that the type of the objects captured this way is System.Management.Automation.InformationRecord.
I'm basically building my own parallel foreach pipeline function, using runspaces.
My problem is: I call my function like this:
somePipeline | MyNewForeachFunction { scriptBlockHere } | pipelineGoesOn...
How can I pass the $_ parameter correctly into the ScriptBlock? It works when the ScriptBlock contains as first line
param($_)
But as you might have noticed, the powershell built-in ForEach-Object and Where-Object do not need such a parameter declaration in every ScriptBlock that is passed to them.
Thanks for your answers in advance
fjf2002
EDIT:
The goal is: I want comfort for the users of function MyNewForeachFunction - they shoudln't need to write a line param($_) in their script blocks.
Inside MyNewForeachFunction, The ScriptBlock is currently called via
$PSInstance = [powershell]::Create().AddScript($ScriptBlock).AddParameter('_', $_)
$PSInstance.BeginInvoke()
EDIT2:
The point is, how does for example the implementation of the built-in function ForEach-Object achieve that $_ need't be declared as a parameter in its ScriptBlock parameter, and can I use that functionality, too?
(If the answer is, ForEach-Object is a built-in function and uses some magic I can't use, then this would disqualify the language PowerShell as a whole in my opinion)
EDIT3:
Thanks to mklement0, I could finally build my general foreach loop. Here's the code:
function ForEachParallel {
[CmdletBinding()]
Param(
[Parameter(Mandatory)] [ScriptBlock] $ScriptBlock,
[Parameter(Mandatory=$false)] [int] $PoolSize = 20,
[Parameter(ValueFromPipeline)] $PipelineObject
)
Begin {
$RunspacePool = [runspacefactory]::CreateRunspacePool(1, $poolSize)
$RunspacePool.Open()
$Runspaces = #()
}
Process {
$PSInstance = [powershell]::Create().
AddCommand('Set-Variable').AddParameter('Name', '_').AddParameter('Value', $PipelineObject).
AddCommand('Set-Variable').AddParameter('Name', 'ErrorActionPreference').AddParameter('Value', 'Stop').
AddScript($ScriptBlock)
$PSInstance.RunspacePool = $RunspacePool
$Runspaces += New-Object PSObject -Property #{
Instance = $PSInstance
IAResult = $PSInstance.BeginInvoke()
Argument = $PipelineObject
}
}
End {
while($True) {
$completedRunspaces = #($Runspaces | where {$_.IAResult.IsCompleted})
$completedRunspaces | foreach {
Write-Output $_.Instance.EndInvoke($_.IAResult)
$_.Instance.Dispose()
}
if($completedRunspaces.Count -eq $Runspaces.Count) {
break
}
$Runspaces = #($Runspaces | where { $completedRunspaces -notcontains $_ })
Start-Sleep -Milliseconds 250
}
$RunspacePool.Close()
$RunspacePool.Dispose()
}
}
Code partly from MathiasR.Jessen, Why PowerShell workflow is significantly slower than non-workflow script for XML file analysis
The key is to define $_ as a variable that your script block can see, via a call to Set-Variable.
Here's a simple example:
function MyNewForeachFunction {
[CmdletBinding()]
param(
[Parameter(Mandatory)]
[scriptblock] $ScriptBlock
,
[Parameter(ValueFromPipeline)]
$InputObject
)
process {
$PSInstance = [powershell]::Create()
# Add a call to define $_ based on the current pipeline input object
$null = $PSInstance.
AddCommand('Set-Variable').
AddParameter('Name', '_').
AddParameter('Value', $InputObject).
AddScript($ScriptBlock)
$PSInstance.Invoke()
}
}
# Invoke with sample values.
1, (Get-Date) | MyNewForeachFunction { "[$_]" }
The above yields something like:
[1]
[10/26/2018 00:17:37]
What I think you're looking for (and what I was looking for) is to support a "delay-bind" script block, supported in PowerShell 5.1+. The Microsoft documentation tells a bit about what's required, but doesn't provide any user-script examples (currently).
The gist is that PowerShell will implicitly detect that your function can accept a delay-bind script block if it defines an explicitly typed pipeline parameter (either by Value or by PropertyName), as long as it's not of type [scriptblock] or type [object].
function Test-DelayedBinding {
param(
# this is our typed pipeline parameter
# per doc this cannot be of type [scriptblock] or [object],
# but testing shows that type [object] may be permitted
[Parameter(ValueFromPipeline, Mandatory)][string]$string,
# this is our scriptblock parameter
[Parameter(Position=0)][scriptblock]$filter
)
Process {
if (&$filter $string) {
Write-Output $string
}
}
}
# sample invocation
>'foo', 'fi', 'foofoo', 'fib' | Test-DelayedBinding { return $_ -match 'foo' }
foo
foofoo
Note that the delay-bind will only be applied if input is piped into the function, and that the script block must use named parameters (not $args) if additional parameters are desired.
The frustrating part is that there is no way to explicitly specify that delay-bind should be used, and errors resulting from incorrectly structuring your function may be non-obvious.
Maybe this can help.
I'd normally run auto-generated jobs in parallel this way:
Get-Job | Remove-Job
foreach ($param in #(3,4,5)) {
Start-Job -ScriptBlock {param($lag); sleep $lag; Write-Output "slept for $lag seconds" } -ArgumentList #($param)
}
Get-Job | Wait-Job | Receive-Job
If I understand you correctly, you are trying to get rid of param() inside the scriptblock. You may try to wrap that SB with another one. Below is the workaround for my sample:
Get-Job | Remove-Job
#scriptblock with no parameter
$job = { sleep $lag; Write-Output "slept for $lag seconds" }
foreach ($param in #(3,4,5)) {
Start-Job -ScriptBlock {param($param, $job)
$lag = $param
$script = [string]$job
Invoke-Command -ScriptBlock ([Scriptblock]::Create($script))
} -ArgumentList #($param, $job)
}
Get-Job | Wait-Job | Receive-Job
# I was looking for an easy way to do this in a scripted function,
# and the below worked for me in PSVersion 5.1.17134.590
function Test-ScriptBlock {
param(
[string]$Value,
[ScriptBlock]$FilterScript={$_}
)
$_ = $Value
& $FilterScript
}
Test-ScriptBlock -Value 'unimportant/long/path/to/foo.bar' -FilterScript { [Regex]::Replace($_,'unimportant/','') }
I am trying to execute a function or a scriptblock in powershell and set a timeout for the execution.
Basically I have the following (translated into pseudocode):
function query{
#query remote system for something
}
$computerList = Get-Content "C:\scripts\computers.txt"
foreach ($computer in $computerList){
$result = query
#do something with $result
}
The query can range from a WMI query using Get-WmiObject to a HTTP request and the script has to run in a mixed environment, which includes Windows and Unix machines which do not all have a HTTP interface.
Some of the queries will therefore necessarily hang or take a VERY long time to return.
In my quest for optimization I have written the following:
$blockofcode = {
#query remote system for something
}
foreach ($computer in $computerList){
$Job = Start-Job -ScriptBlock $blockofcode -ArgumentList $computer
Wait-Job $Job.ID -Timeout 10 | out-null
$result = Receive-Job $Job.ID
#do something with result
}
But unfortunately jobs seem to carry a LOT of overhead. In my tests a query that executes in 1.066 seconds (according to timers inside $blockofcode) took 6.964 seconds to return a result when executed as a Job. Of course it works, but I would really like to reduce that overhead. I could also start all jobs together and then wait for them to finish, but the jobs can still hang or take ridiculous amounts to time to complete.
So, on to the question: is there any way to execute a statement, function, scriptblock or even a script with a timeout that does not comprise the kind of overhead that comes with jobs? If possible I would like to run the commands in parallel, but that is not a deal-breaker.
Any help or hints would be greatly appreciated!
EDIT: running powershell V3 in a mixed windows/unix environment
Today, I ran across a similar question, and noticed that there wasn't an actual answer to this question. I created a simple PowerShell class, called TimedScript. This class provides the following functionality:
Method: Start() method to kick off the job, when you're ready
Method:GetResult() method, to retrieve the output of the script
Constructor: A constructor that takes two parameters:
ScriptBlock to execute
[int] timeout period, in milliseconds
It currently lacks:
Passing in arguments to the PowerShell ScriptBlock
Other useful features you think up
Class: TimedScript
class TimedScript {
[System.Timers.Timer] $Timer = [System.Timers.Timer]::new()
[powershell] $PowerShell
[runspace] $Runspace = [runspacefactory]::CreateRunspace()
[System.IAsyncResult] $IAsyncResult
TimedScript([ScriptBlock] $ScriptBlock, [int] $Timeout) {
$this.PowerShell = [powershell]::Create()
$this.PowerShell.AddScript($ScriptBlock)
$this.PowerShell.Runspace = $this.Runspace
$this.Timer.Interval = $Timeout
Register-ObjectEvent -InputObject $this.Timer -EventName Elapsed -MessageData $this -Action ({
$Job = $event.MessageData
$Job.PowerShell.Stop()
$Job.Runspace.Close()
$Job.Timer.Enabled = $False
})
}
### Method: Call this when you want to start the job.
[void] Start() {
$this.Runspace.Open()
$this.Timer.Start()
$this.IAsyncResult = $this.PowerShell.BeginInvoke()
}
### Method: Once the job has finished, call this to get the results
[object[]] GetResult() {
return $this.PowerShell.EndInvoke($this.IAsyncResult)
}
}
Example Usage of TimedScript Class
# EXAMPLE: The timeout period is set longer than the execution time of the script, so this will succeed
$Job1 = [TimedScript]::new({ Start-Sleep -Seconds 2 }, 4000)
# EXAMPLE: This script will fail. Even though Get-Process returns quickly, the Start-Sleep call will cause it to be terminated by its Timer.
$Job2 = [TimedScript]::new({ Get-Process -Name s*; Start-Sleep -Seconds 3 }, 2000)
# EXAMPLE: This job will fail, because the timeout is less than the script execution time.
$Job3 = [TimedScript]::new({ Start-Sleep -Seconds 3 }, 1000)
$Job1.Start()
$Job2.Start()
$Job3.Start()
Code is also hosted on GitHub Gist.
I think you might want to investigate using Powershell runspaces:
http://learn-powershell.net/2012/05/13/using-background-runspaces-instead-of-psjobs-for-better-performance/