Handling pipeline and parameter input in a Powershell function - powershell

I'm confused about something I saw in the book Learn PowerShell in a Month of lunches. In chapter 21 when the author discusses functions that accept input via parameter binding or the pipeline he gives two patterns.
The first as follows
function someworkerfunction {
# do some work
}
function Get-SomeWork {
param ([string[]]$computername)
BEGIN {
$usedParameter = $False
if($PSBoundParameters.ContainsKey('computername')) {
$usedParameter = $True
}
}
PROCESS {
if($usedParameter) {
foreach($computer in $computername) {
someworkerfunction -computername $comptuer
}
} else {
someworkerfunction -comptuername $_
}
}
END {}
}
The second like this
function someworkerfunction {
# do stuff
}
function Get-Work {
[CmdletBinding()]
param(
[Parameter(Mandatory=$True,
ValueFromPipelineByPropertyName=$True)]
[Alias('host')]
[string[]]$computername
)
BEGIN {}
PROCESS {
foreach($computer in $computername) {
someworkerfunction -comptuername $computer
}
}
END {}
}
I know the second sample is a standard Powershell 2.0 Advanced function. My question is with Powershell 2.0 support for the cmdletbinding directive would you ever want to use the first pattern. Is that just a legacy from Powershell 1.0? Basically is there ever a time when using Powershell 2.0 that I would want to mess around with the first pattern, when the second pattern is so much cleaner.
Any insight would be appreciated.
Thank you.

If you want to process pipeline input in your function but don't want to add all the parameter attributes or want backwards compatibility go with the cmdletbindingless way.
If you want to use the additional features of PowerShell script cmdlets like the parameter attributes, parameter sets etc... then go with the second one.

If anyone wishes for a very, very simple explanation of how to read from piped input see
How do you write a powershell function that reads from piped input?
Had this ^ existed when I had this question, I would have saved a lot of time because this thread is quite complicated and doesn't actually explain how to handle pipelined input into a function.

No, the first example is not just legacy. In order to create a PowerShell function that uses an array parameter and takes pipeline input you have to do some work.
I will even go as far as to say that the second example does not work. At least I could not get it to work.
Take this example...
function PipelineMadness()
{
[cmdletbinding()]
param (
[Parameter(Mandatory = $true, ValueFromPipeline=$true)]
[int[]] $InputArray
)
Write-Host ('$InputArray.Count {0}' -f $InputArray.Count)
Write-Host $InputArray
Write-Host ('$input.Count {0}' -f $input.Count)
Write-Host $input
if($input) { Write-Host "input is true" }
else { Write-Host "input is false" }
}
results ...
PS C:\Windows\system32> 1..5 | PipelineMadness
$InputArray.Count 1
5
$input.Count 5
1 2 3 4 5
input is true
PS C:\Windows\system32> PipelineMadness (1..5)
$InputArray.Count 5
1 2 3 4 5
$input.Count 1
input is false
Notice that when the pipeline is used the $InputArray variable is a single value of 5...
Now with BEGIN and PROCESS blocks
function PipelineMadnessProcess()
{
[cmdletbinding()]
param (
[Parameter(Mandatory = $true, ValueFromPipeline=$true)]
[int[]] $InputArray
)
BEGIN
{
Write-Host 'BEGIN'
Write-Host ('$InputArray.Count {0}' -f $InputArray.Count)
Write-Host $InputArray
Write-Host ('$input.Count {0}' -f $input.Count)
Write-Host $input
if($input) { Write-Host "input is true" }
else { Write-Host "input is false" }
}
PROCESS
{
Write-Host 'PROCESS'
Write-Host ('$InputArray.Count {0}' -f $InputArray.Count)
Write-Host $InputArray
Write-Host ('$input.Count {0}' -f $input.Count)
Write-Host $input
if($input) { Write-Host "input is true" }
else { Write-Host "input is false" }
}
}
Now this is where it gets weird
PS C:\Windows\system32> 1..5 | PipelineMadnessProcess
BEGIN
$InputArray.Count 0
$input.Count 0
input is false
PROCESS
$InputArray.Count 1
1
$input.Count 1
1
input is true
PROCESS
$InputArray.Count 1
2
$input.Count 1
2
input is true
...
PROCESS
$InputArray.Count 1
5
$input.Count 1
5
input is true
The BEGIN block does not have any data in there at all. And the process block works well however if you had a foreach like the example it would actually work but it would be running the foreach with 1 entry X times. Or if you passed in the array it would run the foreach once with the full set.
So I guess technically the example would work but it may not work the way you expect it to.
Also note that even though the BEGIN block had no data the function passed syntax validation.

To answer your question, I would say that the first pattern is just a legacy from PowerShell 1.0, you also can use $input in classical functions without Process script block. As far as you are just writting code for PowerShell 2.0 you can forget it.
Regarding pipeline functions, in powerShell V1.0 they can be handled with filters.
you just have to know that it have been done like that when you take samples from the Net or when you have to debug old Powerhell code.
Personally I still use old functions and filters inside my modules I reserve cmdletbinding for export functions or profile functions.
Powershell is a bit like lego blocks, you can do many things in many different ways.

The first form is expecting one or more computer names as string arguments either from an argumentlist or from the pipeline.
The second form is expecting either an array of string arguments from an argument list, or input objects from the pipeline that have the computer names as a property.

Related

What is the best way to structure an advanced multi-threaded Powershell function to work both on the pipeline as well as non-pipeline calls?

I have a function that flattens directories in parallel for multiple folders. It works great when I call it in a non-pipeline fashion:
$Files = Get-Content $FileList
Merge-FlattenDirectory -InputPath $Files
But now I want to update my function to work both on the pipeline as well as when called off the pipeline. Someone on discord recommended the best way to do this is to defer all processing to the end block, and use the begin and process blocks to add pipeline input to a list. Basically this:
function Merge-FlattenDirectory {
[CmdletBinding()]
param (
[Parameter(Mandatory,Position = 0,ValueFromPipeline)]
[string[]]
$InputPath
)
begin {
$List = [System.Collections.Generic.List[PSObject]]#()
}
process {
if(($InputPath.GetType().BaseType.Name) -eq "Array"){
Write-Host "Array detected"
$List = $InputPath
} else {
$List.Add($InputPath)
}
}
end {
$List | ForEach-Object -Parallel {
# Code here...
} -ThrottleLimit 16
}
}
However, this is still not working on the pipeline for me. When I do this:
$Files | Merge-FlattenDirectory
It actually passes individual arrays of length 1 to the function. So testing for ($InputPath.GetType().BaseType.Name) -eq "Array" isn't really the way forward, as only the first pipeline value gets used.
My million dollar question is the following:
What is the most robust way in the process block to differentiate between pipeline input and non-pipeline input? The function should add all pipeline input to a generic list, and in the case of non-pipeline input, should skip this step and process the collection as-is moving directly to the end block.
The only thing I could think of is the following:
if((($InputPath.GetType().BaseType.Name) -eq "Array") -and ($InputPath.Length -gt 1)){
$List = $InputPath
} else {
$List.Add($InputPath)
}
But this just doesn't feel right. Any help would be extremely appreciated.
You might just do
function Merge-FlattenDirectory {
[CmdletBinding()]
param (
[Parameter(Mandatory,Position = 0,ValueFromPipeline)]
[string[]]
$InputPath
)
begin {
$List = [System.Collections.Generic.List[String]]::new()
}
process {
$InputPath.ForEach{ $List.Add($_) }
}
end {
$List |ForEach-Object -Parallel {
# Code here...
} -ThrottleLimit 16
}
}
Which will process the input values either from the pipeline or the input parameter.
But that doesn't comply with the Strongly Encouraged Development Guidelines to Support Well Defined Pipeline Input (SC02) especially for Implement for the Middle of a Pipeline
This means if you correctly want to implement the PowerShell Pipeline, you should directly (parallel) process your items in the Process block and immediately output any results from there:
function Merge-FlattenDirectory {
[CmdletBinding()]
param (
[Parameter(Mandatory,Position = 0,ValueFromPipeline)]
[string[]]
$InputPath
)
begin {
$SharedPool = New-ThreadPool -Limit 16
}
process {
$InputPath |ForEach-Object -Parallel -threadPool $Using:SharedPool {
# Process your current item ($_) here ...
}
}
}
In general, script authors are advised to use idiomatic PowerShell which often comes down to lesser object manipulations and usually results in a correct PowerShell pipeline implementation with less memory usage.
Please let me know if you intent to collect (and e.g. order) the output based on this suggestion.
Caveat
The full invocation of the ForEach-Object -Parallel cmdlet itself is somewhat inefficient as you open and close a new pipeline each iteration. To resolve this, my whole general statement about idiomatic PowerShell falls a bit apart, but should be resolvable by using a steppable pipeline.
To implement this, you might use the ForEach-Object cmdlet as a template:
[System.Management.Automation.ProxyCommand]::Create((Get-Command ForEach-Object))
And set the ThrottleLimit of the ThreadPool in the Begin Block
function Merge-FlattenDirectory {
[CmdletBinding()]
param (
[Parameter(Mandatory,Position = 0,ValueFromPipeline)]
[string[]]
$InputPath
)
begin {
$PSBoundParameters += #{
ThrottleLimit = 4
Parallel = {
Write-Host (Get-Date).ToString('HH:mm:ss.s') 'Started' $_
Start-Sleep -Seconds 3
Write-Host (Get-Date).ToString('HH:mm:ss.s') 'finished' $_
}
}
$wrappedCmd = $ExecutionContext.InvokeCommand.GetCommand('ForEach-Object', [System.Management.Automation.CommandTypes]::Cmdlet)
$scriptCmd = {& $wrappedCmd #PSBoundParameters }
$steppablePipeline = $scriptCmd.GetSteppablePipeline($myInvocation.CommandOrigin)
$steppablePipeline.Begin($PSCmdlet)
}
process {
$InputPath.ForEach{ $steppablePipeline.Process($_) }
}
end {
$steppablePipeline.End()
}
}
1..5 |Merge-FlattenDirectory
17:57:40.40 Started 3
17:57:40.40 Started 2
17:57:40.40 Started 1
17:57:40.40 Started 4
17:57:43.43 finished 3
17:57:43.43 finished 1
17:57:43.43 finished 4
17:57:43.43 finished 2
17:57:43.43 Started 5
17:57:46.46 finished 5
Here's how I would write it with comments where I have changed it.
function Merge-FlattenDirectory {
[CmdletBinding()]
param (
[Parameter(Mandatory,Position = 0,ValueFromPipeline)]
$InputPath # <this may be a string, a path object, a file object,
# or an array
)
begin {
$List = #() # Use an array for less than 100K objects.
}
process {
#even if InputPath is a string for each will iterate once and set $p
#if it is an array of strings add each. If it is one or more objects,
#try to find the right property for the path.
foreach ($p in $inputPath) {
if ($p -is [String]) {$list += $p }
elseif ($p.Path) {$list += $p.Path}
elseif ($p.FullName) {$list += $p.FullName}
elseif ($p.PSPath) {$list += $p.PSPath}
else {Write-warning "$P makes no sense"}
}
}
end {
$List | ForEach-Object -Parallel {
# Code here...
} -ThrottleLimit 16
}
}
#iRon That "write for the middle of the pipeline" in the docs does not mean write everything in the process block .
function one { #(1,2,3,4,5) }
function two {
param ([parameter(ValueFromPipeline=$true)] $p )
begin {Write-host "Two begins" ; $a = #() }
process {Write-host "Two received $P" ; $a += $p }
end {Write-host "Two ending" ; $a; Write-host "Two ended"}
}
function three {
param ([parameter(ValueFromPipeline=$true)] $p )
begin {Write-host "three Starts" }
process {Write-host "Three received $P" }
end {Write-host "Three ended" }
}
one | two | three
One is treated as an end block.
One, two and three all run their begins (one's is empty).
One's output goes to the process block in two, which just collects the data. Two's end block starts after one's end-block ends, and sends output
At this point three's process block gets input. After two's end block ends, three's endblock runs.
Two is "in the middle" it has a process block to deal with multiple piped items (if it were all one 'end' block it would only process the last one).

Redirect/Capture Write-Host output even with -NoNewLine

The function Select-WriteHost from an answer to another Stackoverflow question (see code below) will redirect/capture Write-Host output:
Example:
PS> $test = 'a','b','c' |%{ Write-Host $_ } | Select-WriteHost
a
b
c
PS> $test
a
b
c
However, if I add -NoNewLine to Write-Host, Select-WriteHost will ignore it:
PS> $test = 'a','b','c' |%{ Write-Host -NoNewLine $_ } | Select-WriteHost
abc
PS> $test
a
b
c
Can anyone figure out how to modify Select-WriteHost (code below) to also support -NoNewLine?
function Select-WriteHost
{
[CmdletBinding(DefaultParameterSetName = 'FromPipeline')]
param(
[Parameter(ValueFromPipeline = $true, ParameterSetName = 'FromPipeline')]
[object] $InputObject,
[Parameter(Mandatory = $true, ParameterSetName = 'FromScriptblock', Position = 0)]
[ScriptBlock] $ScriptBlock,
[switch] $Quiet
)
begin
{
function Cleanup
{
# Clear out our proxy version of write-host
remove-item function:\write-host -ea 0
}
function ReplaceWriteHost([switch] $Quiet, [string] $Scope)
{
# Create a proxy for write-host
$metaData = New-Object System.Management.Automation.CommandMetaData (Get-Command 'Microsoft.PowerShell.Utility\Write-Host')
$proxy = [System.Management.Automation.ProxyCommand]::create($metaData)
# Change its behavior
$content = if($quiet)
{
# In quiet mode, whack the entire function body,
# simply pass input directly to the pipeline
$proxy -replace '(?s)\bbegin\b.+', '$Object'
}
else
{
# In noisy mode, pass input to the pipeline, but allow
# real Write-Host to process as well
$proxy -replace '(\$steppablePipeline\.Process)', '$Object; $1'
}
# Load our version into the specified scope
Invoke-Expression "function ${scope}:Write-Host { $content }"
}
Cleanup
# If we are running at the end of a pipeline, we need
# to immediately inject our version into global
# scope, so that everybody else in the pipeline
# uses it. This works great, but it is dangerous
# if we don't clean up properly.
if($pscmdlet.ParameterSetName -eq 'FromPipeline')
{
ReplaceWriteHost -Quiet:$quiet -Scope 'global'
}
}
process
{
# If a scriptblock was passed to us, then we can declare
# our version as local scope and let the runtime take
# it out of scope for us. It is much safer, but it
# won't work in the pipeline scenario.
#
# The scriptblock will inherit our version automatically
# as it's in a child scope.
if($pscmdlet.ParameterSetName -eq 'FromScriptBlock')
{
. ReplaceWriteHost -Quiet:$quiet -Scope 'local'
& $scriptblock
}
else
{
# In a pipeline scenario, just pass input along
$InputObject
}
}
end
{
Cleanup
}
}
PS: I tried inserting -NoNewLine to the line below (just to see how it would react) however, its producing the exception, "Missing function body in function declaration"
Invoke-Expression "function ${scope}:Write-Host { $content }"
to:
Invoke-Expression "function ${scope}:Write-Host -NoNewLine { $content }"
(Just to recap) Write-Host is meant for host, i.e. display / console output only, and originally couldn't be captured (in-session) at all. In PowerShell 5, the ability to capture Write-Host output was introduced via the information stream, whose number is 6, enabling techniques such as redirection 6>&1 in order to merge Write-Host output into the success (output) stream (whose number is 1), where it can be captured as usual.
However, due to your desire to use the -NoNewLine switch across several calls, 6>&1 by itself is not enough, because the concept of not emitting a newline only applies to display output, not to distinct objects in the pipeline.
E.g., in the following call -NoNewLine is effectively ignored, because there are multiple Write-Host calls producing multiple output objects (strings) that are captured separately:
'a','b','c' | % { Write-Host $_ -NoNewline } 6>&1
Your Select-WriteHost function - necessary in PowerShell 4 and below only - would have the same problem if you adapted it to support the -NoNewLine switch.
An aside re 6>&1: The strings that Write-Host invariably outputs are wrapped in [System.Management.Automation.InformationRecord] instances, due to being re-routed via the information stream. In display output you will not notice the difference, but to get the actual string you need to access the .MessageData.Message property or simply call .ToString().
There is no general solution I am aware of, but situationally the following may work:
If you know that the code of interest uses only Write-Host -NoNewLine calls:
Simply join the resulting strings after the fact without a separator to emulate -NoNewLine behavior:
# -> 'abc'
# Note: Whether or not you use -NoNewLine here makes no difference.
-join ('a','b','c' | % { Write-Host -NoNewLine $_ })
If you know that all instances of Write-Host -NoNewLine calls apply only to their respective pipeline input, you can write a simplified proxy function that collects all input up front and performs separator-less concatenation of the stringified objects:
# -> 'abc'
$test = & {
# Simplified proxy function
function Write-Host {
param([switch] $NoNewLine)
if ($MyInvocation.ExpectingInput) { $allInput = $Input }
else { $allInput = $args }
if ($NoNewLine) { -join $allInput.ForEach({ "$_" }) }
else { $allInput.ForEach({ "$_" }) }
}
# Important: pipe all input directly.
'a','b','c' | Write-Host -NoNewLine
}

PowerShell switch parameter doesn't work as expected

I have the following PowerShell function to help me do benchmarks. The idea is that you provide the command and a number and the function will run the code that number of times. After that the function will report the testing result, such as Min, Max and Average time taken.
function Measure-MyCommand()
{
[CmdletBinding()]
Param (
[Parameter(Mandatory = $True)] [scriptblock] $ScriptBlock,
[Parameter()] [int] $Count = 1,
[Parameter()] [switch] $ShowOutput
)
$time_elapsed = #();
while($Count -ge 1) {
$timer = New-Object 'Diagnostics.Stopwatch';
$timer.Start();
$temp = & $ScriptBlock;
if($ShowOutput) {
Write-Output $temp;
}
$timer.Stop();
$time_elapsed += $timer.Elapsed;
$Count--;
}
$stats = $time_elapsed | Measure-Object -Average -Minimum -Maximum -Property Ticks;
Write-Host "Min: $((New-Object 'System.TimeSpan' $stats.Minimum).TotalMilliseconds) ms";
Write-Host "Max: $((New-Object 'System.TimeSpan' $stats.Maximum).TotalMilliseconds) ms";
Write-Host "Avg: $((New-Object 'System.TimeSpan' $stats.Average).TotalMilliseconds) ms";
}
The problem is with the switch parameter $ShowOutput. As I understand, when you provide a switch parameter, its value is true. Otherwise it's false. However it doesn't seem to work. See my testing.
PS C:\> Measure-MyCommand -ScriptBlock {Write-Host "f"} -Count 3 -ShowOutput
f
f
f
Min: 0.4935 ms
Max: 0.8392 ms
Avg: 0.6115 ms
PS C:\> Measure-MyCommand -ScriptBlock {Write-Host "f"} -Count 3
f
f
f
Min: 0.4955 ms
Max: 0.8296 ms
Avg: 0.6251 ms
PS C:\>
Can anyone help to explain it please?
This is because Write-Host doesn't return the object and is not a part of pipeline stream, instead it sends text directly to console. You can't simply hide Write-Host output by writing it to variable, because Write-Host writes to console stream which isn't exposed in PowerShell before version 5. What you need is to replace Write-Host with Write-Output cmdlet call. Then you get what you expect. This solution is valid for PowerShell 2.0+.
Update:
Starting with PowerShell 5, Write-Host writes to its dedicated stream and you can handle and redirect this stream somewhere else. See this response for more details: https://stackoverflow.com/a/60353648/3997611.
The problem is with the specific script block you're passing, not with your [switch] parameter declaration (which works fine):
{Write-Host "f"}
By using Write-Host, you're bypassing PowerShell's success output stream (which the assignment to variable $temp collects) and printing directly to the host[1] (the console), so your output always prints.
To print to the success output stream, use Write-Output, or better yet, use PowerShell's implicit output feature:
# "f", due not being capture or redirected, is implicitly sent
# to the success output stream.
Measure-MyCommand -ScriptBlock { "f" } -Count 3 -ShowOutput
[1] In PowerShell 5.0 and higher, Write-Host now writes to a new stream, the information stream (number 6), which by default prints to the host. See about_Redirection.
Therefore, a 6> redirection now does allow you to capture Write-Host output; e.g.: $temp = Write-Host hi 6>&1. Note that the type of the objects captured this way is System.Management.Automation.InformationRecord.

Write Powershell function, that can be called with pipeline input AND "normal" input"

I've a simple function:
function Write-Log {
[CmdletBinding()]
param (
# Lines to log
[Parameter(Mandatory , ValueFromPipeline )]
[AllowEmptyString()]
$messages
)
process {
Write-Host $_
}
}
Based on ValueFromPipeline in can use the function with pipeline input, e.g.
"a","b", "c" | Write-Log
a
b
c
That's ok. But if I want to use my function in that way:
Write-Log -message "a", "b", "c"
the automatic $_ variable is empty, and nothing is printed.
I also found these two stackoverflow links:
Handling pipeline and parameter input in a Powershell function
How do I write a PowerShell script that accepts pipeline input?
Both of them suggest the following pattern:
function Write-Log {
[CmdletBinding()]
param (
# Lines to log
[Parameter(Mandatory , ValueFromPipeline )]
[AllowEmptyString()]
$messages
)
process {
foreach ($message in $messages){
Write-Host $message
}
}
}
Above pattern works for my use case. But from my point of view is feels weird to call foreach in the `processĀ“ block, since (as far as I've understood) the process block is called for every pipeline item. As there a better cleaner way to write functions supporting both use cases?
Thx.
That's the way you have to do it if you want to pass an array to a parameter like
Write-Log -Messages a,b,c
Otherwise you can only do
Write-Log -Messages a
You can still pipe an array in without the foreach:
Echo a b c | Write-Log
I appreciate cmdlets that can do both, like get-process.

Is it possible to customize the verbose message prefix in Powershell?

The current verbose message prefix is simply VERBOSE:
I would like to modify it to VERBOSE[N]:, where N is the current thread Id.
Is it possible?
This behavior (or rather the format string) is hard-coded into the default PowerShell host and there are no hooks to override it. You'd have to implement your own host or modify the calling code to use a proper logging framework, neither of which are particularly simple.
If you at least control the outermost invocation, you have the option to redirect the verbose stream output, and we can use this in combination with a cmdlet to "sort of" customize things:
function Verbosi-Tee {
[CmdletBinding()]
Param (
[Parameter(ValueFromPipeline = $true)]
$o
)
Process {
if ($o -is [System.Management.Automation.VerboseRecord]) {
Write-Verbose "[$([System.Threading.Thread]::CurrentThread.ManagedThreadId)] $($o.Message)"
} else {
$o
}
}
}
Sample use:
$VerbosePreference = "continue"
$x = (&{
Write-Verbose "This is verbose."
Write-Output "This is just regular output."
} >4&1 | Verbosi-Tee) # redirect, then pipe
"We captured the output in `$x: $x"
Output (on my system):
VERBOSE: [16] This is verbose.
We captured the output in $x: This is just regular output.
The name of the cmdlet is a lie because this doesn't in fact implement a full tee, but a good pun is its own reward.