What is the best way to structure an advanced multi-threaded Powershell function to work both on the pipeline as well as non-pipeline calls? - powershell

I have a function that flattens directories in parallel for multiple folders. It works great when I call it in a non-pipeline fashion:
$Files = Get-Content $FileList
Merge-FlattenDirectory -InputPath $Files
But now I want to update my function to work both on the pipeline as well as when called off the pipeline. Someone on discord recommended the best way to do this is to defer all processing to the end block, and use the begin and process blocks to add pipeline input to a list. Basically this:
function Merge-FlattenDirectory {
[CmdletBinding()]
param (
[Parameter(Mandatory,Position = 0,ValueFromPipeline)]
[string[]]
$InputPath
)
begin {
$List = [System.Collections.Generic.List[PSObject]]#()
}
process {
if(($InputPath.GetType().BaseType.Name) -eq "Array"){
Write-Host "Array detected"
$List = $InputPath
} else {
$List.Add($InputPath)
}
}
end {
$List | ForEach-Object -Parallel {
# Code here...
} -ThrottleLimit 16
}
}
However, this is still not working on the pipeline for me. When I do this:
$Files | Merge-FlattenDirectory
It actually passes individual arrays of length 1 to the function. So testing for ($InputPath.GetType().BaseType.Name) -eq "Array" isn't really the way forward, as only the first pipeline value gets used.
My million dollar question is the following:
What is the most robust way in the process block to differentiate between pipeline input and non-pipeline input? The function should add all pipeline input to a generic list, and in the case of non-pipeline input, should skip this step and process the collection as-is moving directly to the end block.
The only thing I could think of is the following:
if((($InputPath.GetType().BaseType.Name) -eq "Array") -and ($InputPath.Length -gt 1)){
$List = $InputPath
} else {
$List.Add($InputPath)
}
But this just doesn't feel right. Any help would be extremely appreciated.

You might just do
function Merge-FlattenDirectory {
[CmdletBinding()]
param (
[Parameter(Mandatory,Position = 0,ValueFromPipeline)]
[string[]]
$InputPath
)
begin {
$List = [System.Collections.Generic.List[String]]::new()
}
process {
$InputPath.ForEach{ $List.Add($_) }
}
end {
$List |ForEach-Object -Parallel {
# Code here...
} -ThrottleLimit 16
}
}
Which will process the input values either from the pipeline or the input parameter.
But that doesn't comply with the Strongly Encouraged Development Guidelines to Support Well Defined Pipeline Input (SC02) especially for Implement for the Middle of a Pipeline
This means if you correctly want to implement the PowerShell Pipeline, you should directly (parallel) process your items in the Process block and immediately output any results from there:
function Merge-FlattenDirectory {
[CmdletBinding()]
param (
[Parameter(Mandatory,Position = 0,ValueFromPipeline)]
[string[]]
$InputPath
)
begin {
$SharedPool = New-ThreadPool -Limit 16
}
process {
$InputPath |ForEach-Object -Parallel -threadPool $Using:SharedPool {
# Process your current item ($_) here ...
}
}
}
In general, script authors are advised to use idiomatic PowerShell which often comes down to lesser object manipulations and usually results in a correct PowerShell pipeline implementation with less memory usage.
Please let me know if you intent to collect (and e.g. order) the output based on this suggestion.
Caveat
The full invocation of the ForEach-Object -Parallel cmdlet itself is somewhat inefficient as you open and close a new pipeline each iteration. To resolve this, my whole general statement about idiomatic PowerShell falls a bit apart, but should be resolvable by using a steppable pipeline.
To implement this, you might use the ForEach-Object cmdlet as a template:
[System.Management.Automation.ProxyCommand]::Create((Get-Command ForEach-Object))
And set the ThrottleLimit of the ThreadPool in the Begin Block
function Merge-FlattenDirectory {
[CmdletBinding()]
param (
[Parameter(Mandatory,Position = 0,ValueFromPipeline)]
[string[]]
$InputPath
)
begin {
$PSBoundParameters += #{
ThrottleLimit = 4
Parallel = {
Write-Host (Get-Date).ToString('HH:mm:ss.s') 'Started' $_
Start-Sleep -Seconds 3
Write-Host (Get-Date).ToString('HH:mm:ss.s') 'finished' $_
}
}
$wrappedCmd = $ExecutionContext.InvokeCommand.GetCommand('ForEach-Object', [System.Management.Automation.CommandTypes]::Cmdlet)
$scriptCmd = {& $wrappedCmd #PSBoundParameters }
$steppablePipeline = $scriptCmd.GetSteppablePipeline($myInvocation.CommandOrigin)
$steppablePipeline.Begin($PSCmdlet)
}
process {
$InputPath.ForEach{ $steppablePipeline.Process($_) }
}
end {
$steppablePipeline.End()
}
}
1..5 |Merge-FlattenDirectory
17:57:40.40 Started 3
17:57:40.40 Started 2
17:57:40.40 Started 1
17:57:40.40 Started 4
17:57:43.43 finished 3
17:57:43.43 finished 1
17:57:43.43 finished 4
17:57:43.43 finished 2
17:57:43.43 Started 5
17:57:46.46 finished 5

Here's how I would write it with comments where I have changed it.
function Merge-FlattenDirectory {
[CmdletBinding()]
param (
[Parameter(Mandatory,Position = 0,ValueFromPipeline)]
$InputPath # <this may be a string, a path object, a file object,
# or an array
)
begin {
$List = #() # Use an array for less than 100K objects.
}
process {
#even if InputPath is a string for each will iterate once and set $p
#if it is an array of strings add each. If it is one or more objects,
#try to find the right property for the path.
foreach ($p in $inputPath) {
if ($p -is [String]) {$list += $p }
elseif ($p.Path) {$list += $p.Path}
elseif ($p.FullName) {$list += $p.FullName}
elseif ($p.PSPath) {$list += $p.PSPath}
else {Write-warning "$P makes no sense"}
}
}
end {
$List | ForEach-Object -Parallel {
# Code here...
} -ThrottleLimit 16
}
}
#iRon That "write for the middle of the pipeline" in the docs does not mean write everything in the process block .
function one { #(1,2,3,4,5) }
function two {
param ([parameter(ValueFromPipeline=$true)] $p )
begin {Write-host "Two begins" ; $a = #() }
process {Write-host "Two received $P" ; $a += $p }
end {Write-host "Two ending" ; $a; Write-host "Two ended"}
}
function three {
param ([parameter(ValueFromPipeline=$true)] $p )
begin {Write-host "three Starts" }
process {Write-host "Three received $P" }
end {Write-host "Three ended" }
}
one | two | three
One is treated as an end block.
One, two and three all run their begins (one's is empty).
One's output goes to the process block in two, which just collects the data. Two's end block starts after one's end-block ends, and sends output
At this point three's process block gets input. After two's end block ends, three's endblock runs.
Two is "in the middle" it has a process block to deal with multiple piped items (if it were all one 'end' block it would only process the last one).

Related

Redirect/Capture Write-Host output even with -NoNewLine

The function Select-WriteHost from an answer to another Stackoverflow question (see code below) will redirect/capture Write-Host output:
Example:
PS> $test = 'a','b','c' |%{ Write-Host $_ } | Select-WriteHost
a
b
c
PS> $test
a
b
c
However, if I add -NoNewLine to Write-Host, Select-WriteHost will ignore it:
PS> $test = 'a','b','c' |%{ Write-Host -NoNewLine $_ } | Select-WriteHost
abc
PS> $test
a
b
c
Can anyone figure out how to modify Select-WriteHost (code below) to also support -NoNewLine?
function Select-WriteHost
{
[CmdletBinding(DefaultParameterSetName = 'FromPipeline')]
param(
[Parameter(ValueFromPipeline = $true, ParameterSetName = 'FromPipeline')]
[object] $InputObject,
[Parameter(Mandatory = $true, ParameterSetName = 'FromScriptblock', Position = 0)]
[ScriptBlock] $ScriptBlock,
[switch] $Quiet
)
begin
{
function Cleanup
{
# Clear out our proxy version of write-host
remove-item function:\write-host -ea 0
}
function ReplaceWriteHost([switch] $Quiet, [string] $Scope)
{
# Create a proxy for write-host
$metaData = New-Object System.Management.Automation.CommandMetaData (Get-Command 'Microsoft.PowerShell.Utility\Write-Host')
$proxy = [System.Management.Automation.ProxyCommand]::create($metaData)
# Change its behavior
$content = if($quiet)
{
# In quiet mode, whack the entire function body,
# simply pass input directly to the pipeline
$proxy -replace '(?s)\bbegin\b.+', '$Object'
}
else
{
# In noisy mode, pass input to the pipeline, but allow
# real Write-Host to process as well
$proxy -replace '(\$steppablePipeline\.Process)', '$Object; $1'
}
# Load our version into the specified scope
Invoke-Expression "function ${scope}:Write-Host { $content }"
}
Cleanup
# If we are running at the end of a pipeline, we need
# to immediately inject our version into global
# scope, so that everybody else in the pipeline
# uses it. This works great, but it is dangerous
# if we don't clean up properly.
if($pscmdlet.ParameterSetName -eq 'FromPipeline')
{
ReplaceWriteHost -Quiet:$quiet -Scope 'global'
}
}
process
{
# If a scriptblock was passed to us, then we can declare
# our version as local scope and let the runtime take
# it out of scope for us. It is much safer, but it
# won't work in the pipeline scenario.
#
# The scriptblock will inherit our version automatically
# as it's in a child scope.
if($pscmdlet.ParameterSetName -eq 'FromScriptBlock')
{
. ReplaceWriteHost -Quiet:$quiet -Scope 'local'
& $scriptblock
}
else
{
# In a pipeline scenario, just pass input along
$InputObject
}
}
end
{
Cleanup
}
}
PS: I tried inserting -NoNewLine to the line below (just to see how it would react) however, its producing the exception, "Missing function body in function declaration"
Invoke-Expression "function ${scope}:Write-Host { $content }"
to:
Invoke-Expression "function ${scope}:Write-Host -NoNewLine { $content }"
(Just to recap) Write-Host is meant for host, i.e. display / console output only, and originally couldn't be captured (in-session) at all. In PowerShell 5, the ability to capture Write-Host output was introduced via the information stream, whose number is 6, enabling techniques such as redirection 6>&1 in order to merge Write-Host output into the success (output) stream (whose number is 1), where it can be captured as usual.
However, due to your desire to use the -NoNewLine switch across several calls, 6>&1 by itself is not enough, because the concept of not emitting a newline only applies to display output, not to distinct objects in the pipeline.
E.g., in the following call -NoNewLine is effectively ignored, because there are multiple Write-Host calls producing multiple output objects (strings) that are captured separately:
'a','b','c' | % { Write-Host $_ -NoNewline } 6>&1
Your Select-WriteHost function - necessary in PowerShell 4 and below only - would have the same problem if you adapted it to support the -NoNewLine switch.
An aside re 6>&1: The strings that Write-Host invariably outputs are wrapped in [System.Management.Automation.InformationRecord] instances, due to being re-routed via the information stream. In display output you will not notice the difference, but to get the actual string you need to access the .MessageData.Message property or simply call .ToString().
There is no general solution I am aware of, but situationally the following may work:
If you know that the code of interest uses only Write-Host -NoNewLine calls:
Simply join the resulting strings after the fact without a separator to emulate -NoNewLine behavior:
# -> 'abc'
# Note: Whether or not you use -NoNewLine here makes no difference.
-join ('a','b','c' | % { Write-Host -NoNewLine $_ })
If you know that all instances of Write-Host -NoNewLine calls apply only to their respective pipeline input, you can write a simplified proxy function that collects all input up front and performs separator-less concatenation of the stringified objects:
# -> 'abc'
$test = & {
# Simplified proxy function
function Write-Host {
param([switch] $NoNewLine)
if ($MyInvocation.ExpectingInput) { $allInput = $Input }
else { $allInput = $args }
if ($NoNewLine) { -join $allInput.ForEach({ "$_" }) }
else { $allInput.ForEach({ "$_" }) }
}
# Important: pipe all input directly.
'a','b','c' | Write-Host -NoNewLine
}

Function pipeline how to skip process block? Similar to continue in a foreach loop

I'm working on a function to accept pipeline input of multiple objects.
I want to be able to skip the processing of an object if the Property matches certain criteria; similar to using "Continue" in a foreach loop. I realize that I could use switch/If statements but I'm trying to find if there's a way to just skip that object altogether.
The simple code below shows more or less what I'm trying to accomplish
Function Get-Foobar{
Param (
[parameter(ValueFromPipelineByPropertyName)][string]$Item
)
Begin{
# DO SOME INIT STUFF
}
Process{
If($Item -ne "Foobar") {
#SOMEHOW SKIP THE PROCESS BLOCK#
# CONTINUE? BREAK? RETURN? WHAT?
}
Else{
Write-host "Item was $($Item)"
}
#Continue doing stuff here.
Write-host "Evertying was totally $Item"
}
End {
# DO CLEAN UP STUFF
}
}
$test = #([pscustomobject]#{Item = "Foobar"},[pscustomobject]#{Item = "NotFubar"} )
$test |Get-Foobar
Item was Foobar
Evertying was totally Foobar
Evertying was totally NotFubar <---- liked to skip this without having it in the upper if/Else
The Process block is a self-contained scriptblock, so continue won't affect it.
Use return inside process to skip to the next input item in the pipeline:
function Get-Foobar {
param(
[Parameter(ValueFromPipelineByPropertyName = $true)]
[string]$Item
)
Begin {
# DO SOME INIT STUFF
}
Process {
if ($Item -ne "Foobar") {
# skip it!
return
}
else {
Write-host "Item was $($Item)"
}
#Continue doing stuff here.
Write-host "Evertying was totally $Item"
}
End {
# DO CLEAN UP STUFF
}
}

How to pass $_ ($PSItem) in a ScriptBlock

I'm basically building my own parallel foreach pipeline function, using runspaces.
My problem is: I call my function like this:
somePipeline | MyNewForeachFunction { scriptBlockHere } | pipelineGoesOn...
How can I pass the $_ parameter correctly into the ScriptBlock? It works when the ScriptBlock contains as first line
param($_)
But as you might have noticed, the powershell built-in ForEach-Object and Where-Object do not need such a parameter declaration in every ScriptBlock that is passed to them.
Thanks for your answers in advance
fjf2002
EDIT:
The goal is: I want comfort for the users of function MyNewForeachFunction - they shoudln't need to write a line param($_) in their script blocks.
Inside MyNewForeachFunction, The ScriptBlock is currently called via
$PSInstance = [powershell]::Create().AddScript($ScriptBlock).AddParameter('_', $_)
$PSInstance.BeginInvoke()
EDIT2:
The point is, how does for example the implementation of the built-in function ForEach-Object achieve that $_ need't be declared as a parameter in its ScriptBlock parameter, and can I use that functionality, too?
(If the answer is, ForEach-Object is a built-in function and uses some magic I can't use, then this would disqualify the language PowerShell as a whole in my opinion)
EDIT3:
Thanks to mklement0, I could finally build my general foreach loop. Here's the code:
function ForEachParallel {
[CmdletBinding()]
Param(
[Parameter(Mandatory)] [ScriptBlock] $ScriptBlock,
[Parameter(Mandatory=$false)] [int] $PoolSize = 20,
[Parameter(ValueFromPipeline)] $PipelineObject
)
Begin {
$RunspacePool = [runspacefactory]::CreateRunspacePool(1, $poolSize)
$RunspacePool.Open()
$Runspaces = #()
}
Process {
$PSInstance = [powershell]::Create().
AddCommand('Set-Variable').AddParameter('Name', '_').AddParameter('Value', $PipelineObject).
AddCommand('Set-Variable').AddParameter('Name', 'ErrorActionPreference').AddParameter('Value', 'Stop').
AddScript($ScriptBlock)
$PSInstance.RunspacePool = $RunspacePool
$Runspaces += New-Object PSObject -Property #{
Instance = $PSInstance
IAResult = $PSInstance.BeginInvoke()
Argument = $PipelineObject
}
}
End {
while($True) {
$completedRunspaces = #($Runspaces | where {$_.IAResult.IsCompleted})
$completedRunspaces | foreach {
Write-Output $_.Instance.EndInvoke($_.IAResult)
$_.Instance.Dispose()
}
if($completedRunspaces.Count -eq $Runspaces.Count) {
break
}
$Runspaces = #($Runspaces | where { $completedRunspaces -notcontains $_ })
Start-Sleep -Milliseconds 250
}
$RunspacePool.Close()
$RunspacePool.Dispose()
}
}
Code partly from MathiasR.Jessen, Why PowerShell workflow is significantly slower than non-workflow script for XML file analysis
The key is to define $_ as a variable that your script block can see, via a call to Set-Variable.
Here's a simple example:
function MyNewForeachFunction {
[CmdletBinding()]
param(
[Parameter(Mandatory)]
[scriptblock] $ScriptBlock
,
[Parameter(ValueFromPipeline)]
$InputObject
)
process {
$PSInstance = [powershell]::Create()
# Add a call to define $_ based on the current pipeline input object
$null = $PSInstance.
AddCommand('Set-Variable').
AddParameter('Name', '_').
AddParameter('Value', $InputObject).
AddScript($ScriptBlock)
$PSInstance.Invoke()
}
}
# Invoke with sample values.
1, (Get-Date) | MyNewForeachFunction { "[$_]" }
The above yields something like:
[1]
[10/26/2018 00:17:37]
What I think you're looking for (and what I was looking for) is to support a "delay-bind" script block, supported in PowerShell 5.1+. The Microsoft documentation tells a bit about what's required, but doesn't provide any user-script examples (currently).
The gist is that PowerShell will implicitly detect that your function can accept a delay-bind script block if it defines an explicitly typed pipeline parameter (either by Value or by PropertyName), as long as it's not of type [scriptblock] or type [object].
function Test-DelayedBinding {
param(
# this is our typed pipeline parameter
# per doc this cannot be of type [scriptblock] or [object],
# but testing shows that type [object] may be permitted
[Parameter(ValueFromPipeline, Mandatory)][string]$string,
# this is our scriptblock parameter
[Parameter(Position=0)][scriptblock]$filter
)
Process {
if (&$filter $string) {
Write-Output $string
}
}
}
# sample invocation
>'foo', 'fi', 'foofoo', 'fib' | Test-DelayedBinding { return $_ -match 'foo' }
foo
foofoo
Note that the delay-bind will only be applied if input is piped into the function, and that the script block must use named parameters (not $args) if additional parameters are desired.
The frustrating part is that there is no way to explicitly specify that delay-bind should be used, and errors resulting from incorrectly structuring your function may be non-obvious.
Maybe this can help.
I'd normally run auto-generated jobs in parallel this way:
Get-Job | Remove-Job
foreach ($param in #(3,4,5)) {
Start-Job -ScriptBlock {param($lag); sleep $lag; Write-Output "slept for $lag seconds" } -ArgumentList #($param)
}
Get-Job | Wait-Job | Receive-Job
If I understand you correctly, you are trying to get rid of param() inside the scriptblock. You may try to wrap that SB with another one. Below is the workaround for my sample:
Get-Job | Remove-Job
#scriptblock with no parameter
$job = { sleep $lag; Write-Output "slept for $lag seconds" }
foreach ($param in #(3,4,5)) {
Start-Job -ScriptBlock {param($param, $job)
$lag = $param
$script = [string]$job
Invoke-Command -ScriptBlock ([Scriptblock]::Create($script))
} -ArgumentList #($param, $job)
}
Get-Job | Wait-Job | Receive-Job
# I was looking for an easy way to do this in a scripted function,
# and the below worked for me in PSVersion 5.1.17134.590
function Test-ScriptBlock {
param(
[string]$Value,
[ScriptBlock]$FilterScript={$_}
)
$_ = $Value
& $FilterScript
}
Test-ScriptBlock -Value 'unimportant/long/path/to/foo.bar' -FilterScript { [Regex]::Replace($_,'unimportant/','') }

How to convert infinity while loop to a pipeline statement

I'm trying to use PowerShell pipeline for some recurring tasks and checks, like
perform certain checks X times or skip forward after the response in pipeline will have different state.
The simplest script I can write to do such checks is something like this:
do {
$result=Update-ACMEIdentifier dns1 -ChallengeType dns-01
if($result.Status -ne 'pending')
{
"Hurray"
break
}
"Still pending"
Start-Sleep -s 3
} while ($true)
The question is - how can I write this script as a single pipeline.
It looks like the only thing I need is infinity pipeline to start with:
1..Infinity |
%{ Start-Sleep -Seconds 1 | Out-Null; $_ } |
%{Update-ACMEIdentifier dns1 -ChallengeType dns-01 } |
Select -ExpandProperty Status | ?{$_ -eq 'pending'} |
#some code here to stop infinity producer or stop the pipeline
So is there any simple one-liner, which allows me to put infinity object producer on one side of the pipeline?
Good example of such object may be a tick generator that generates current timestamp into pipeline every 13 seconds
#PetSerAl gave the crucial pointer in a comment on the question: A script block containing an infinite loop, invoked with the call operator (&), creates an infinite source of objects that can be sent through a pipeline:
& { while ($true) { ... } }
A later pipeline segment can then stop the pipeline on demand.
Note:
As of PS v5, only Select-Object is capable of directly stopping a pipeline.
An imperfect generic pipeline-stopping function can be found in this answer of mine.
Using break to stop the pipeline is tricky, because it doesn't just stop the pipeline, but breaks out of any enclosing loop - safe use requires wrapping the pipeline in a dummy loop.
Alternatively, a Boolean variable can be used to terminate the infinite producer.
Here are examples demonstrating each approach:
A working example with Select-Object -First:
& { while ($true) { Get-Date; Start-Sleep 1 } } | Select-Object -First 5
This executes Get-Date every second indefinitely, but is stopped by Select-Object after 5 iterations.
An equivalent example with break and a dummy loop:
do {
& { while ($true) { Get-Date; Start-Sleep 1 } } |
% { $i = 0 } { $_; if (++$i -eq 5) { break } } # `break` stops the pipeline and
# breaks out of the dummy loop
} while ($false)
An equivalent example with a Boolean variable that terminates the infinite producer:
& { while (-not $done) { Get-Date; Start-Sleep 1 } } |
% { $done = $false; $i = 0 } { $_; if (++$i -eq 5) { $done = $true } }
Note how even though $done is only initialized in the 2nd pipeline segment - namely in the ForEach-Object (%) cmdlet's (implicit) -Begin block - that initialization still happens before the 1st pipeline segment - the infinite producer - starts executing.Thanks again, #PetSerAl.
Not sure why you'd want to use a pipeline over a loop in this scenario, but it is possible by using a bit of C#; e.g.
$Source = #"
using System.Collections.Generic;
public static class Counter
{
public static bool Running = false;
public static IEnumerable<long> Run()
{
Running = true;
while(Running)
{
for (long l = 0; l <= long.MaxValue; l++)
{
yield return l;
if (!Running) {
break;
}
}
}
}
}
"#
Add-Type -TypeDefinition $Source -Language CSharp
[Counter]::Run() | %{
start-sleep -seconds 1
$_
} | %{
"Hello $_"
if ($_ -eq 12) {
[Counter]::Running = $false;
}
}
NB: Because the numbers are generated in parallel with the pipeline execution it's possible that the generator may create a backlog of numbers before it's stopped. In my testing that didn't happen; but I believe that scenario is possible.
You'll also notice that I've stuck a for loop inside the while loop; that's to ensure that the values produced are valid; i.e. so I don't overrun the max value for the data type.
Update
Per #PetSerAl's comment above, here's an adapted version in pure PowerShell:
$run=$true; &{for($i=0;$run;$i++){$i}} | %{ #infinite loop outputting to pipeline demo
"hello $_";
if($_ -eq 10){"stop";$run=$false <# stopping condition demo #>}
}

How does Select-Object stop the pipeline in PowerShell v3?

In PowerShell v2, the following line:
1..3| foreach { Write-Host "Value : $_"; $_ }| select -First 1
Would display:
Value : 1
1
Value : 2
Value : 3
Since all elements were pushed down the pipeline. However, in v3 the above line displays only:
Value : 1
1
The pipeline is stopped before 2 and 3 are sent to Foreach-Object (Note: the -Wait switch for Select-Object allows all elements to reach the foreach block).
How does Select-Object stop the pipeline, and can I now stop the pipeline from a foreach or from my own function?
Edit: I know I can wrap a pipeline in a do...while loop and continue out of the pipeline. I have also found that in v3 I can do something like this (it doesn't work in v2):
function Start-Enumerate ($array) {
do{ $array } while($false)
}
Start-Enumerate (1..3)| foreach {if($_ -ge 2){break};$_}; 'V2 Will Not Get Here'
But Select-Object doesn't require either of these techniques so I was hoping that there was a way to stop the pipeline from a single point in the pipeline.
Check this post on how you can cancel a pipeline:
http://powershell.com/cs/blogs/tobias/archive/2010/01/01/cancelling-a-pipeline.aspx
In PowerShell 3.0 it's an engine improvement. From the CTP1 samples folder ('\Engines Demos\Misc\ConnectBugFixes.ps1'):
# Connect Bug 332685
# Select-Object optimization
# Submitted by Shay Levi
# Connect Suggestion 286219
# PSV2: Lazy pipeline - ability for cmdlets to say "NO MORE"
# Submitted by Karl Prosser
# Stop the pipeline once the objects have been selected
# Useful for commands that return a lot of objects, like dealing with the event log
# In PS 2.0, this took a long time even though we only wanted the first 10 events
Start-Process powershell.exe -Args '-Version 2 -NoExit -Command Get-WinEvent | Select-Object -First 10'
# In PS 3.0, the pipeline stops after retrieving the first 10 objects
Get-WinEvent | Select-Object -First 10
After trying several methods, including throwing StopUpstreamCommandsException, ActionPreferenceStopException, and PipelineClosedException, calling $PSCmdlet.ThrowTerminatingError and $ExecutionContext.Host.Runspace.GetCurrentlyRunningPipeline().stopper.set_IsStopping($true) I finally found that just utilizing select-object was the only thing that didn't abort the whole script (versus just the pipeline). [Note that some of the items mentioned above require access to private members, which I accessed via reflection.]
# This looks like it should put a zero in the pipeline but on PS 3.0 it doesn't
function stop-pipeline {
$sp = {select-object -f 1}.GetSteppablePipeline($MyInvocation.CommandOrigin)
$sp.Begin($true)
$x = $sp.Process(0) # this call doesn't return
$sp.End()
}
New method follows based on comment from OP. Unfortunately this method is a lot more complicated and uses private members. Also I don't know how robust this - I just got the OP's example to work and stopped there. So FWIW:
# wh is alias for write-host
# sel is alias for select-object
# The following two use reflection to access private members:
# invoke-method invokes private methods
# select-properties is similar to select-object, but it gets private properties
# Get the system.management.automation assembly
$smaa=[appdomain]::currentdomain.getassemblies()|
? location -like "*system.management.automation*"
# Get the StopUpstreamCommandsException class
$upcet=$smaa.gettypes()| ? name -like "*upstream*"
filter x {
[CmdletBinding()]
param(
[parameter(ValueFromPipeline=$true)]
[object] $inputObject
)
process {
if ($inputObject -ge 5) {
# Create a StopUpstreamCommandsException
$upce = [activator]::CreateInstance($upcet,#($pscmdlet))
$PipelineProcessor=$pscmdlet.CommandRuntime|select-properties PipelineProcessor
$commands = $PipelineProcessor|select-properties commands
$commandProcessor= $commands[0]
$null = $upce.RequestingCommandProcessor|select-properties *
$upce.RequestingCommandProcessor.commandinfo =
$commandProcessor|select-properties commandinfo
$upce.RequestingCommandProcessor.Commandruntime =
$commandProcessor|select-properties commandruntime
$null = $PipelineProcessor|
invoke-method recordfailure #($upce, $commandProcessor.command)
1..($commands.count-1) | % {
$commands[$_] | invoke-method DoComplete
}
wh throwing
throw $upce
}
wh "< $inputObject >"
$inputObject
} # end process
end {
wh in x end
}
} # end filter x
filter y {
[CmdletBinding()]
param(
[parameter(ValueFromPipeline=$true)]
[object] $inputObject
)
process {
$inputObject
}
end {
wh in y end
}
}
1..5| x | y | measure -Sum
PowerShell code to retrieve PipelineProcessor value through reflection:
$t_cmdRun = $pscmdlet.CommandRuntime.gettype()
# Get pipelineprocessor value ($pipor)
$bindFlags = [Reflection.BindingFlags]"NonPublic,Instance"
$piporProp = $t_cmdRun.getproperty("PipelineProcessor", $bindFlags )
$pipor=$piporProp.GetValue($PSCmdlet.CommandRuntime,$null)
Powershell code to invoke method through reflection:
$proc = (gps)[12] # semi-random process
$methinfo = $proc.gettype().getmethod("GetComIUnknown", $bindFlags)
# Return ComIUnknown as an IntPtr
$comIUnknown = $methinfo.Invoke($proc, #($true))
I know that throwing a PipelineStoppedException stops the pipeline. The following example will simulate what you see with Select -first 1 in v3.0, in v2.0:
filter Select-Improved($first) {
begin{
$count = 0
}
process{
$_
$count++
if($count -ge $first){throw (new-object System.Management.Automation.PipelineStoppedException)}
}
}
trap{continue}
1..3| foreach { Write-Host "Value : $_"; $_ }| Select-Improved -first 1
write-host "after"