PowerShell: idiomatic way of mutating items passing through a stream?

When processing an input stream it's sometimes necessary to effect changes on the objects passing through the stream. It's also useful to allow those objects to pass out the other end so that they may be piped into other processes. Is there a more idiomatic/concise way than this to mutate and pass along?
$input | % {
if ($_.Office.length -ne 11) { $_.Errors += "Bad office" }
$_ #needed to allow additional piping
}
Some other operator? Something like... %!
$input | %! {
if ($_.Office.length -ne 11) { $_.Errors += "Bad office" }
} #whatever goes in must come out

I don't see anything wrong with the first example there, but you could write your own function and then use that:
function Do-Modification {
[CmdletBinding()]
param(
[Parameter(
ValueFromPipeline=$true
)]
$myObject
)
Process {
if ($myObject.Office.length -ne 11) {
$myObject.Errors += "Bad Office"
}
$myObject
}
}
$inputs | Do-Modification
# or to use it in a larger ForEach-Object (even though it's redundant here)
$inputs | ForEach-Object {
$_ | Do-Modification
}
Edit to add:
This is a contrived example; it's probably not worth it to reinvent the wheel (writing your own pipeline function) except for code re-use (where it is very much worth it in my opinion). It's up to you to decide where one approach or the other makes sense.
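For what it's worth, the hypothetical %! operator from the question can be approximated with a small helper. This is only a sketch: Invoke-PassThru is a made-up name (not a built-in cmdlet) and the sample objects are invented for illustration. It runs a script block against each object for its side effects, discards anything the block emits, and then passes the original object along.

```powershell
# Hypothetical helper, not part of PowerShell: "whatever goes in must come out".
function Invoke-PassThru {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true, Position = 0)]
        [scriptblock]$Action,

        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        $InputObject
    )
    process {
        # ForEach-Object binds $_ to the current object inside $Action;
        # anything $Action writes to the output stream is thrown away.
        $null = $InputObject | ForEach-Object $Action
        # Emit the (possibly mutated) object for the next pipeline stage.
        $InputObject
    }
}

# Invented sample data standing in for the question's input objects.
$people = @(
    [pscustomobject]@{ Office = 'ABC-1234567'; Errors = @() }
    [pscustomobject]@{ Office = 'short';       Errors = @() }
)

$result = $people | Invoke-PassThru {
    if ($_.Office.Length -ne 11) { $_.Errors += 'Bad office' }
}
```

Both objects come out the other end, and only the second one has an entry in Errors. Whether this is clearer than simply ending the ForEach-Object block with $_ is a matter of taste.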

Related

Filter processes by module's FileName

If I need to filter processes by a module's FileName, the following code does the job:
Get-Process | where { $_.Modules.FileName -eq "xxx\yyy.dll" }
But if I need to filter modules by FileName starting with a string, the following code doesn't seem to work:
Get-Process | where { $_.Modules.FileName.StartsWith("xxx\yyy.dll") }
As a result, I see all the processes in the output. I'm confused about why filtering doesn't seem to work with StartsWith.
The member modules might be a collection. Thus it needs to be iterated too. Like so,
(get-process) | % {
if($_.modules -ne $null) { # No modules, no action
$_.modules | ? { $_.filename.tolower().startswith("c:\program") }
}
}
As for the question, there are actually two iterations. Let's use explicit variables instead of pipelining, and print the actual module files. Passing multiple $_s around is not easy-to-read syntax anyway. Like so,
foreach ($p in get-process) {
if ($p.modules -ne $null){
write-host $p.id $p.ProcessName
foreach($m in $p.modules){
if ($m.filename.tolower().startswith("c:\program") ) {
write-host "`t" $m.moduleName $m.FileName
}
}
write-host
}
}
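A likely explanation for why -eq filters but StartsWith does not, sketched on a plain string array rather than live process data (the paths below are invented, and the behaviour described assumes PowerShell 3 or later, where member enumeration is available):

```powershell
# Invented sample paths standing in for $_.Modules.FileName.
$names = 'C:\Windows\a.dll', 'C:\Program Files\b.dll', 'D:\c.dll'

# Calling a string method on the array runs it on every element,
# so the result is an array of Booleans rather than a single answer.
$flags = $names.StartsWith('C:\Program')

# An array with two or more elements converts to $true regardless of its
# contents, so a Where-Object block returning $flags accepts everything.
$alwaysTrue = [bool]@($false, $false)

# Comparison operators act as filters on a collection: the result holds only
# the matching elements and is empty (and therefore falsy) if nothing matches.
$hits = $names -like 'C:\Program*'
```

Under the same assumption, something like Get-Process | where { $_.Modules.FileName -like 'xxx\yyy*' } would be the one-line counterpart of the explicit loops above.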

Writing $null to Powershell Output Stream

There are powershell cmdlets in our project for finding data in a database. If no data is found, the cmdlets write out a $null to the output stream as follows:
Write-Output $null
Or, more accurately since the cmdlets are implemented in C#:
WriteObject(null)
I have found that this causes some behavior that is very counter to the conventions employed elsewhere, including in the built-in cmdlets.
Are there any guidelines/rules, especially from Microsoft, that talk about this? I need help better explaining why this is a bad idea, or to be convinced that writing $null to the output stream is an okay practice. Here is some detail about the resulting behaviors that I see:
If the results are piped into another cmdlet, that cmdlet executes despite no results being found and the pipeline variable ($_) is $null. This means that I have to add checks for $null.
Find-DbRecord -Id 3 | ForEach-Object { if ($_ -ne $null) { <# do something with $_ #> }}
Similarly, if I want to get the array of records found, ensuring that it is an array, I might do the following:
$recsFound = @(Find-DbRecord -Category XYZ)
foreach ($record in $recsFound)
{
$record.Name = "Something New"
$record.Update()
}
By the convention I have seen, this should work without issue: if no records are found, the foreach loop doesn't execute. But since the Find cmdlet writes null to the output, the $recsFound variable is set to an array with one item that is $null. Now I need to check each item in the array for $null, which clutters my code.
$null is not void. If you don't want null values in your pipeline, either don't write null values to the pipeline in the first place, or remove them from the pipeline with a filter like this:
... | Where-Object { $_ -ne $null } | ...
Depending on what you want to allow through the filter you could simplify it to this:
... | Where-Object { $_ } | ...
or (using the ? alias for Where-Object) to this:
... | ? { $_ } | ...
which would remove all values that PowerShell interprets as $false ($null, 0, empty string, empty array, etc.).
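The difference between "nothing" and $null is easy to see with two stand-in functions. This is a sketch only: Find-Nothing and Find-Null are invented names that imitate a cmdlet writing no output versus one writing $null.

```powershell
# Invented stand-ins for the database cmdlet.
function Find-Nothing { }          # writes nothing to the output stream
function Find-Null    { $null }    # writes a single $null object

$emptyCount = @(Find-Nothing).Count   # 0: the array is empty
$nullCount  = @(Find-Null).Count      # 1: the array holds one $null

# A foreach over the second result runs once, with $record set to $null.
$iterations = 0
foreach ($record in @(Find-Null)) { $iterations++ }

# Filtering restores the "no results means no iterations" behaviour.
$filtered = @(Find-Null | Where-Object { $null -ne $_ })
```

Here $filtered ends up empty, which is what the calling code in the question expected in the first place.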

PowerShell how to use $MyInvocation for recursion

I am trying to implement some recursive functions with PowerShell. Here is the basic function:
function MyRecursiveFunction {
param(
[parameter(Mandatory=$true,ValueFromPipeline=$true)]
$input
)
if ($input -is [System.Array] -And $input.Length -eq 1) {
$input = $input[0]
}
if ($input -is [System.Array]) {
ForEach ($i in $input) {
$i | ##### HOW DO I USE $MyInvocation HERE TO CALL MyRecursiveFunction??? #####
}
return
}
# Do something with the single object...
}
I have looked at Invoke-Expression and Invoke-Item but have not been able to get the syntax right. For instance I tried
$i | Invoke-Expression $MyInvocation.MyCommand.Name
I'm guessing there is an easy way to do this if you know the right syntax :-)
A bit of an old question, but since there's no satisfactory answer and I've had the same question, here's my experience.
Calling the function by name will break if the name is changed, or if the function is out of scope. It's also not very nice to manage as changing the function name requires editing all the recursive calls, and it'll likely break in modules imported using the prefix option.
# A simple recursive countdown function
function countdown {
param([int]$count)
$count
if ($count -gt 0) { countdown ($count - 1) }
}
# And a way to break it
$foo = ${function:countdown}
function countdown { 'failed' }
& $foo 5
$MyInvocation.InvocationName is slightly nicer to work with, but will still break in the above example (although for different reasons).
The best way would seem to be calling the scriptblock of the function, $MyInvocation.MyCommand.ScriptBlock. That way it'll still work regardless of the function name/scope.
function countdown {
param([int]$count)
$count
if ($count -gt 0) { & $MyInvocation.MyCommand.ScriptBlock ($count - 1) }
}
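Applied to the array-walking shape of the original question, the same idea might look like the sketch below. Expand-Nested and its $Value parameter are names made up for this example, and it takes the value as an argument instead of from the pipeline to keep things short.

```powershell
# Hypothetical example: recursively flattens nested arrays without ever
# referring to the function by name.
function Expand-Nested {
    param($Value)

    if ($Value -is [System.Array]) {
        foreach ($item in $Value) {
            # Recurse through the function's own script block.
            & $MyInvocation.MyCommand.ScriptBlock $item
        }
        return
    }

    # Do something with the single object; here it is simply emitted.
    $Value
}

$flat = Expand-Nested @(1, @(2, @(3, 4)), 5)
$flat -join ','    # -> 1,2,3,4,5
```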
Just call the function:
$i | MyRecursiveFunction
To call it without knowing the name of the function you should be able to call it with $myInvocation.InvocationName:
Invoke-Expression "$i | $($myInvocation.InvocationName)"

Exiting the function while in loops like Foreach-Object in PowerShell

I have a function like this in Powershell:
function F()
{
$something | Foreach-Object {
if ($_ -eq "foo"){
# Exit from F here
}
}
# do other stuff
}
If I use exit in the if statement, it exits PowerShell, and I don't want this behavior. If I use return in the if statement, the foreach keeps executing and the rest of the function is also executed. I came up with this:
function F()
{
$failed = $false
$something | Foreach-Object {
if ($_ -eq "foo"){
$failed = $true
break
}
}
if ($failed){
return
}
# do other stuff
}
I basically introduced a sentinel variable holding if I broke out of the loop or not. Is there a cleaner solution?
Any help?
function F()
{
Trap { Return }
$something |
Foreach-Object {
if ($_ -eq "foo"){ Throw }
else {$_}
}
}
$something = "a","b","c","foo","d","e"
F
'Do other stuff'
a
b
c
Do other stuff
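Another possibility, if the input does not have to be streamed through a pipeline, is to use the foreach statement instead of the ForEach-Object cmdlet. Inside a ForEach-Object script block, return only ends the current invocation of that block, which matches what the question describes; inside a foreach statement it leaves the enclosing function. A minimal sketch, with the $something parameter added here purely to make the example self-contained:

```powershell
function F {
    param($something)

    foreach ($item in $something) {
        if ($item -eq 'foo') {
            return          # leaves F entirely, not just the loop
        }
        $item
    }

    # Only reached when no element was 'foo'.
    'do other stuff'
}

$stopped  = F 'a', 'b', 'foo', 'c'   # -> a, b
$finished = F 'a', 'b'               # -> a, b, do other stuff
```

Output written before the return is kept, so no sentinel variable is needed.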
I'm not entirely sure of your specific requirements, but I think you can simplify this by looking at it a different way. It looks like you just want to know if any $something == "foo" in which case this would make things a lot easier:
if($something | ? {$_ -eq 'foo'}) { return }
? is an alias for Where-Object. The downside to this is that it will iterate over every item in the array even after finding a match, so...
If you're indeed searching a string array, things can get even simpler:
if($something -Contains 'foo') { return }
If the array is more costly to iterate over, you might consider implementing an equivalent of the LINQ "Any" extension method in Powershell which would allow you to do:
if($something | Test-Any {$_ -eq 'foo'}) { return }
As an aside, while exceptions in the CLR aren't that costly, using them to direct procedural flow is an anti-pattern, as it can lead to code that's hard to follow or, put formally, it violates the principle of least surprise.
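Test-Any is not a built-in command, so it has to be written. One possible sketch, assuming PowerShell 3 or later so that Select-Object -First can stop the inner pipeline early; the parameter name $Predicate is my own choice:

```powershell
# Hypothetical helper resembling LINQ's Any(): returns $true as soon as one
# input object satisfies the predicate.
function Test-Any {
    param(
        [Parameter(Mandatory = $true, Position = 0)]
        [scriptblock]$Predicate
    )

    # $input holds the objects piped to the function. Select-Object -First 1
    # stops the inner pipeline after the first match, so the predicate is not
    # evaluated for the remaining items.
    $firstMatch = @($input | Where-Object $Predicate | Select-Object -First 1)
    $firstMatch.Count -gt 0
}

$something = 'a', 'b', 'c', 'foo', 'd', 'e'
$hasFoo = $something | Test-Any { $_ -eq 'foo' }   # -> True
$hasBar = $something | Test-Any { $_ -eq 'bar' }   # -> False
```

Note the limitation: because the function has no process block, all piped input is collected before the test starts, so this saves predicate evaluations rather than upstream work. Counting the matches instead of testing the matched object itself avoids a wrong answer when the match is a falsy value such as 0.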

Is it possible to terminate or stop a PowerShell pipeline from within a filter

I have written a simple PowerShell filter that pushes the current object down the pipeline if its date is between the specified begin and end date. The objects coming down the pipeline are always in ascending date order, so as soon as the date exceeds the specified end date I know my work is done, and I would like to tell the pipeline that the upstream commands can abandon their work so that the pipeline can finish. I am reading some very large log files and will frequently want to examine just a portion of the log. I am pretty sure this is not possible, but I wanted to ask to be sure.
It is possible to break a pipeline with anything that would otherwise break an outside loop or halt script execution altogether (like throwing an exception). The solution then is to wrap the pipeline in a loop that you can break if you need to stop the pipeline. For example, the below code will return the first item from the pipeline and then break the pipeline by breaking the outside do-while loop:
do {
Get-ChildItem|% { $_;break }
} while ($false)
This functionality can be wrapped into a function like this, where the last line accomplishes the same thing as above:
function Breakable-Pipeline([ScriptBlock]$ScriptBlock) {
do {
. $ScriptBlock
} while ($false)
}
Breakable-Pipeline { Get-ChildItem|% { $_;break } }
It is not possible to stop an upstream command from a downstream command. The downstream command will continue to filter out objects that do not match your criteria, but the first command will process everything it was set to process.
The workaround is to do more filtering in the upstream cmdlet or function/filter. Working with log files makes it a bit more complicated, but perhaps using Select-String and a regular expression to filter out the undesired dates might work for you.
Unless you know how many lines you want to take and from where, the whole file will be read to check for the pattern.
You can throw an exception when ending the pipeline.
gc demo.txt -ReadCount 1 | %{$num=0}{$num++; if($num -eq 5){throw "terminated pipeline!"}else{write-host $_}}
or
Look at this post about how to terminate a pipeline: https://web.archive.org/web/20160829015320/http://powershell.com/cs/blogs/tobias/archive/2010/01/01/cancelling-a-pipeline.aspx
Not sure about your exact needs, but it may be worth your time to look at Log Parser to see if you can't use a query to filter the data before it even hits the pipe.
If you're willing to use non-public members here is a way to stop the pipeline. It mimics what select-object does. invoke-method (alias im) is a function to invoke non-public methods. select-property (alias selp) is a function to select (similar to select-object) non-public properties - however it automatically acts like -ExpandProperty if only one matching property is found. (I wrote select-property and invoke-method at work, so can't share the source code of those).
# Get the system.management.automation assembly
$script:smaa=[appdomain]::currentdomain.getassemblies()|
? location -like "*system.management.automation*"
# Get the StopUpstreamCommandsException class
$script:upcet=$smaa.gettypes()| ? name -like "*StopUpstreamCommandsException*"
function stop-pipeline {
# Create a StopUpstreamCommandsException
$upce = [activator]::CreateInstance($upcet,@($pscmdlet))
$PipelineProcessor=$pscmdlet.CommandRuntime|select-property PipelineProcessor
$commands = $PipelineProcessor|select-property commands
$commandProcessor= $commands[0]
$ci = $commandProcessor|select-property commandinfo
$upce.RequestingCommandProcessor | im set_commandinfo @($ci)
$cr = $commandProcessor|select-property commandruntime
$upce.RequestingCommandProcessor| im set_commandruntime @($cr)
$null = $PipelineProcessor|
invoke-method recordfailure @($upce, $commandProcessor.command)
if ($commands.count -gt 1) {
$doCompletes = @()
1..($commands.count-1) | % {
write-debug "Stop-pipeline: added DoComplete for $($commands[$_])"
$doCompletes += $commands[$_] | invoke-method DoComplete -returnClosure
}
foreach ($DoComplete in $doCompletes) {
$null = & $DoComplete
}
}
throw $upce
}
EDIT: per mklement0's comment:
Here is a link to the Nivot ink blog on a script on the "poke" module which similarly gives access to non-public members.
As far as additional comments, I don't have meaningful ones at this point. This code just mimics what a decompilation of select-object reveals. The original MS comments (if any) are of course not in the decompilation. Frankly I don't know the purpose of the various types the function uses. Getting that level of understanding would likely require a considerable amount of effort.
My suggestion: get Oisin's poke module. Tweak the code to use that module. And then try it out. If you like the way it works, then use it and don't worry how it works (that's what I did).
Note: I haven't studied "poke" in any depth, but my guess is that it doesn't have anything like -returnClosure. However adding that should be easy as this:
if (-not $returnClosure) {
$methodInfo.Invoke($arguments)
} else {
{$methodInfo.Invoke($arguments)}.GetNewClosure()
}
Here's an - imperfect - implementation of a Stop-Pipeline cmdlet (requires PS v3+), gratefully adapted from this answer:
#requires -version 3
Filter Stop-Pipeline {
$sp = { Select-Object -First 1 }.GetSteppablePipeline($MyInvocation.CommandOrigin)
$sp.Begin($true)
$sp.Process(0)
}
# Example
1..5 | % { if ($_ -gt 2) { Stop-Pipeline }; $_ } # -> 1, 2
Caveat: I don't fully understand how it works, though fundamentally it takes advantage of Select -First's ability to stop the pipeline prematurely (PS v3+). However, in this case there is one crucial difference to how Select -First terminates the pipeline: downstream cmdlets (commands later in the pipeline) do not get a chance to run their end blocks.
Therefore, aggregating cmdlets (those that must receive all input before producing output, such as Sort-Object, Group-Object, and Measure-Object) will not produce output if placed later in the same pipeline; e.g.:
# !! NO output, because Sort-Object never finishes.
1..5 | % { if ($_ -gt 2) { Stop-Pipeline }; $_ } | Sort-Object
Background info that may lead to a better solution:
Thanks to PetSerAl, my answer here shows how to produce the same exception that Select-Object -First uses internally to stop upstream cmdlets.
However, there the exception is thrown from inside the cmdlet that is itself connected to the pipeline to stop, which is not the case here:
Stop-Pipeline, as used in the examples above, is not connected to the pipeline that should be stopped (only the enclosing ForEach-Object (%) block is), so the question is: How can the exception be thrown in the context of the target pipeline?
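For comparison, the built-in behaviour that all of this tries to imitate can be observed directly. The sketch below (PowerShell 3 or later assumed; the counter variable is only there to make the effect visible) shows that Select-Object -First stops the upstream stage early while still letting downstream aggregating cmdlets finish:

```powershell
# Count how many objects the upstream stage actually processes.
$seen = 0
$firstThree = 1..1000 |
    ForEach-Object { $seen++; $_ } |
    Select-Object -First 3

# $seen is 3, not 1000: the upstream ForEach-Object was stopped early.

# Unlike the Stop-Pipeline filter above, downstream end blocks still run,
# so Sort-Object produces its output.
$sorted = 1..5 | Select-Object -First 2 | Sort-Object -Descending   # -> 2, 1
```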
Try these filters. They force the pipeline to stop after the first object (or the first n elements) and store the result in a variable. You need to pass the name of the variable; if you don't, the object(s) are pushed out but cannot be assigned to a variable.
filter FirstObject ([string]$vName = '') {
if ($vName) {sv $vName $_ -s 1} else {$_}
break
}
filter FirstElements ([int]$max = 2, [string]$vName = '') {
if ($max -le 0) {break} else {$_arr += ,$_}
if (!--$max) {
if ($vName) {sv $vName $_arr -s 1} else {$_arr}
break
}
}
# can't assign to a variable directly
$myLog = get-eventLog security | ... | firstObject
# pass the varName
get-eventLog security | ... | firstObject myLog
$myLog
# can't assign to a variable directly
$myLogs = get-eventLog security | ... | firstElements 3
# pass the number of elements and the varName
get-eventLog security | ... | firstElements 3 myLogs
$myLogs
####################################
get-eventLog security | % {
if ($_.timegenerated -lt (date 11.09.08) -and`
$_.timegenerated -gt (date 11.01.08)) {$log1 = $_; break}
}
#
$log1
Another option would be to use the -file parameter on a switch statement. Using -file will read the file one line at a time, and you can use break to exit immediately without reading the rest of the file.
switch -file $someFile {
# Parse current line for later matches.
{ $script:line = [DateTime]$_ } { }
# If less than min date, keep looking.
{ $line -lt $minDate } { Write-Host "skipping: $line"; continue }
# If greater than max date, stop checking.
{ $line -gt $maxDate } { Write-Host "stopping: $line"; break }
# Otherwise, date is between min and max.
default { Write-Host "match: $line" }
}
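In the same spirit as switch -file, a plain foreach over [System.IO.File]::ReadLines also reads the file lazily, so break stops the read without touching the rest of the file. The sketch below rests on assumptions of mine: the log format (an ISO date in the first ten characters of each line), the sample lines, and the date range are all invented, and a temporary file stands in for the real log.

```powershell
# Build a small sample log in a temporary file (invented content).
$path = [System.IO.Path]::GetTempFileName()
Set-Content -Path $path -Value @(
    '2008-01-01 first'
    '2008-02-01 second'
    '2008-03-01 third'
    '2008-04-01 fourth'
)

$minDate = [datetime]'2008-01-15'
$maxDate = [datetime]'2008-03-15'

# ReadLines returns a lazy enumerable: lines are read only as the loop asks.
$hits = foreach ($line in [System.IO.File]::ReadLines($path)) {
    $date = [datetime]::ParseExact(
        $line.Substring(0, 10), 'yyyy-MM-dd',
        [System.Globalization.CultureInfo]::InvariantCulture)

    if ($date -lt $minDate) { continue }   # before the range: keep looking
    if ($date -gt $maxDate) { break }      # past the range: stop reading
    $line                                  # inside the range: emit the line
}
```

With the sample data, $hits contains the second and third lines, and the fourth line is never parsed.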