Do chained powershell replace commands execute one after the other? - powershell

This would likely be a non-issue with expert regex comprehension. And only matters because I am running multiple chained replace commands that affect some of the same text in a text file. I also imagine partitioning the txt files based on how delimiter words --that are requiring multiple replaces-- are used, before replace, would help. With that said basic structural knowledge of powershell is useful and I have not found many great resources (open to suggestions!).
The question: Do chained powershell replace commands execute one after the other?
-replace "hello:","hello " `
-replace "hello ","hello:"
} | out-file ...
Would this silly example above yield hello:'s where there were initially hello:'s?
From working through some projects I gather that the above works most of the time. Yet there always seem to be some edge cases. Is this another aspect of the script or is the order that chained commands (decent number of them) execute in never variable?

What you have there are operators, not commands.
I say that not to be pedantic, but because "command" has a specific meaning in PowerShell (it is a general name encompassing functions, cmdlets, aliases, applications, filters, configurations (this is a DSC construct), workflows, and scripts), and because the way they can be used together is different.
Most operators are reserved words that begin with - (but other things count as operators, like casting), and you can indeed use them chained together. They also execute in order.
I need to clarify; they don't necessarily execute in the order given when you mix operators. Multiple of the same operator will because they all have the same precedence, but you should check about_Operator_Precedence to see the order that will be used when you combine them.
Note that some operators can "short-circuit" (which may sound like a malfunction, but it isn't): certain Boolean operators skip evaluating their right-hand operand when it cannot change the result.
For example:
$true -or $false
In this example, the $false part of the expression will never actually be evaluated. This is important if the next part of the expression is complex or even invalid. Consider these:
$true -or $(throw)
$false -or $(throw)
The first returns $true without ever evaluating $(throw), because nothing on the right-hand side could make the result $false.
The second line must evaluate the second expression, and in doing so it throws an exception, halting the program.
So, aside from that aside, yes, you can continue to chain your operators. You also don't need a line continuation character (backtick `) at the end of the line if the operator itself is at the end. More useful with boolean operators:
$a -and
$b -or
$c -xor
$false
A little awkward with something like replace:
'apple' -replace
'p',
'z'
Regarding this:
And only matters because I am running multiple chained replace
commands that affect some of the same text in a text file.
These operators aren't touching anything in a file, they are working with data in memory, as literals or variables in your script (what you do with it then, like writing to a file is your business).
Further, even then it doesn't change any values already in variables, it returns new ones, which you may assign to a variable or use in any other way.
$var = 'apple'
$var -replace 'p','Z'
$var
The value of the replacement will be returned, but nothing was done with it so it went out to the console. Then you can see that $var was not modified at all, as opposed to:
$var = 'apple'
$var = $var -replace 'p','Z'
$var
Where the value of $var was overwritten.
If there are edge cases, it's likely to be a misunderstanding of something in the sequence of events (an incorrect regular expression, not assigning or using a value, incorrect logic, etc.), as the order of operations will be consistent. If you have any such edge cases, please post them!
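To make the left-to-right evaluation concrete, here is a minimal sketch built on the strings from the question. The sample inputs and variable names are made up for illustration; the point is only that each -replace receives the output of the one before it.

```powershell
# Chained -replace operators run left to right; each one operates on the
# result of the previous one.
$roundTrip = 'hello:' -replace 'hello:', 'hello ' -replace 'hello ', 'hello:'
$roundTrip    # hello:

# Text that ALREADY contained 'hello ' is rewritten by the second -replace
# too, which is a plausible source of the "edge cases" in the question.
$sideEffect = 'hello world' -replace 'hello:', 'hello ' -replace 'hello ', 'hello:'
$sideEffect   # hello:world
```

So the order is stable; what varies is whether a later pattern also matches text that an earlier step did not produce.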

Related

How to use line breaks in PowerShell Method-Chaining

I am trying to use a Retry Service that is written in the fluent-api pattern.
The methods return the service and allow a chaining of methods.
However, even though I am using --> ` <-- I see a lot of errors as shown below.
Is there any workaround or other possibility to not write everything into one line?
(I already checked the methods names and return types)
(Entry point of RetryService)
Unfortunately about_Methods doesn't seem to have a clarification on method chaining and its parsing rules. If you want to chain multiple methods on new lines, the dot . has to be at the end of each statement then a newline is allowed. The backticks are not needed.
For example:
[powershell]::Create().
AddScript({ "hello $args" }).
AddArgument('world').
Invoke()
Another way to chain method calls is to use the ForEach-Object command (alias %). This relies on the parameter set with the -MemberName parameter (implicitly by passing a string as the 1st argument).
PowerShell 7+ even lets you write the pipe symbol | on a new line:
[powershell]::Create()
|% AddScript { "hello $args" }
|% AddArgument 'world'
|% Invoke
If there are multiple method arguments, you have to separate them by , (parentheses are unnecessary).
For PS 5 and below you have to use a slightly different syntax because the pipe symbol has to be on the same line as the previous command:
[powershell]::Create() |
% AddScript { "hello $args" } |
% AddArgument 'world' |
% Invoke
Is this a better way than using the member access operator .? I don't think so, it's just a different way. IMO it does look more consistent compared to regular PowerShell commands. Performance might be even worse than . but for high-level code it probably doesn't matter (I haven't measured).

How to pipe results into output array

After playing around with some PowerShell script for a while I was wondering if there is a version of this without using C#. It feels like I am missing some information on how to pipe things properly.
$packages = Get-ChildItem "C:\Users\A\Downloads" -Filter "*.nupkg" |
%{ $_.Name }
# Select-String -Pattern "(?<packageId>[^\d]+)\.(?<version>[\w\d\.-]+)(?=.nupkg)" |
# %{ @($_.Matches[0].Groups["packageId"].Value, $_.Matches[0].Groups["version"].Value) }
foreach ($package in $packages){
$match = [System.Text.RegularExpressions.Regex]::Match($package, "(?<packageId>[^\d]+)\.(?<version>[\w\d\.-]+)(?=.nupkg)")
Write-Host "$($match.Groups["packageId"].Value) - $($match.Groups["version"].Value)"
}
Originally I tried to do this with PowerShell only and thought that with @(1,2,3) you could create an array.
I ended up bypassing the issue by doing the regex with C# instead of PowerShell, which works, but I am curious how this would have been done with PowerShell only.
While there are 4 packages, doing just the PowerShell version produced 8 lines. So accessing my data like $packages[0][0] to get a package id never worked, because the 8 lines were strings while I expected 4 arrays to be returned.
Terminology note re without using c#: You mean without direct use of .NET APIs. By contrast, C# is just another .NET-based language that can make use of such APIs, just like PowerShell itself.
Note:
The next section answers the following question: How can I avoid direct calls to .NET APIs for my regex-matching code in favor of using PowerShell-native commands (operators, automatic variables)?
See the bottom section for the Select-String solution that was your true objective; the tl;dr is:
# Note the `, `, which ensures that the array is output *as a single object*
%{ , @($_.Matches[0].Groups["packageId"].Value, $_.Matches[0].Groups["version"].Value) }
The PowerShell-native (near-)equivalent of your code is (note that the assumption is that $package contains a single input string, such as one file name):
# Caveat: -match is case-INSENSITIVE; use -cmatch for case-sensitive matching.
if ($package -match '(?<packageId>[^\d]+)\.(?<version>[\w\d\.-]+)(?=.nupkg)') {
"$($Matches['packageId']) - $($Matches['version'])"
}
-match, the regular-expression matching operator, is the equivalent of [System.Text.RegularExpressions.Regex]::Match() (which you can shorten to [regex]::Match()) in that it only looks for (at most) one match.
Caveat re case-sensitivity: -match (and its rarely used alias -imatch) is case-insensitive by default, as all PowerShell operators are; for case-sensitive matching, use the c-prefixed variant, -cmatch.
By contrast, .NET APIs are case-sensitive by default; you'd have to pass the [System.Text.RegularExpressions.RegexOptions]::IgnoreCase flag to [regex]::Match() for case-insensitive matching (you may use 'IgnoreCase', which PowerShell auto-converts for you).
As of PowerShell 7.2.x, there is no operator that is the equivalent of the related return-ALL-matches .NET API, [regex]::Matches(). See GitHub issue #7867 for a green-lit but yet-to-be-implemented proposal to introduce one, named -matchall.
However, instead of directly returning an object describing what was (or wasn't) matched, -match returns a Boolean, i.e. $true or $false, to indicate whether matching succeeded.
Only if -match returns $true does information about a match become available, namely via the automatic $Matches variable, which is a hashtable reflecting the matching parts of the input string: entry 0 is always the full match, with optional additional entries reflecting what any capture groups ((...)) captured, either by index, if they're anonymous (starting with 1) or, as in your case, for named capture groups ((?<name>...)) by name.
Syntax note: Given that PowerShell allows use of dot notation (property-access syntax) even with hashtables, the above command could have used $Matches.packageId instead of $Matches['packageId'], for instance, which also works with the numeric (index-based) entries, e.g., $Matches.0 instead of $Matches[0]
Caveat: If an array (enumerable) is used as the LHS operand, -match's behavior changes:
$Matches is not populated.
filtering is performed; that is, instead of returning a Boolean indicating whether matching succeeded, the subarray of matching input strings is returned.
Note that the $Matches hashtable only provides the matched strings, not also metadata such as index and length, as found in [regex]::Match()'s return object, which is of type [System.Text.RegularExpressions.Match].
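The difference between a scalar and an array LHS can be sketched as follows. The file names are hypothetical sample data; the first regex is the one from the question.

```powershell
# Hypothetical sample file names.
$names = 'Foo.Bar.1.2.3.nupkg', 'readme.txt', 'Baz.0.9.nupkg'

# Scalar LHS: -match returns a Boolean and populates $Matches.
$isPackage = $names[0] -match '(?<packageId>[^\d]+)\.(?<version>[\w\d\.-]+)(?=.nupkg)'
$id      = $Matches['packageId']   # Foo.Bar
$version = $Matches.version        # 1.2.3 (dot notation works as well)

# Array LHS: -match acts as a filter and returns the matching elements
# instead of a Boolean.
$filtered = $names -match '\.nupkg$'
$filtered   # Foo.Bar.1.2.3.nupkg, Baz.0.9.nupkg
```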
Select-String solution:
$packages |
Select-String '(?<packageId>[^\d]+)\.(?<version>[\w\d\.-]+)(?=.nupkg)' |
ForEach-Object {
"$($_.Matches[0].Groups['packageId'].Value) - $($_.Matches[0].Groups['version'].Value)"
}
Select-String outputs Microsoft.PowerShell.Commands.MatchInfo instances, whose .Matches collection contains one or more [System.Text.RegularExpressions.Match] instances, i.e. instances of the same type as returned by [regex]::Match()
Unless -AllMatches is also passed, .Matches only ever has one entry, hence the use of [0] to target that entry above.
As you can see, working with Select-String's output objects requires you to ultimately work with the same .NET type as when you call [regex]::Match() directly.
However, no method calls are required, and discovering the properties of the output objects is made easy in PowerShell via the Get-Member cmdlet.
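As a small illustration of the -AllMatches behavior described above, the following sketch uses a made-up input string; only the cmdlets and properties already named in this answer are involved.

```powershell
# Hypothetical input containing three runs of digits.
$text = 'a1b22c333'

# Without -AllMatches, .Matches holds only the first match.
$firstOnly = ($text | Select-String -Pattern '\d+').Matches.Count   # 1

# With -AllMatches, .Matches holds every match; member-access enumeration
# collects the .Value of each one.
$info = $text | Select-String -Pattern '\d+' -AllMatches
$all  = $info.Matches.Value   # 1, 22, 333

# Get-Member lists the properties available on the MatchInfo object.
$info | Get-Member -MemberType Property
```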
If you want to capture the matches in a jagged array:
$capturedStrings = @(
$packages |
Select-String '(?<packageId>[^\d]+)\.(?<version>[\w\d\.-]+)(?=.nupkg)' |
ForEach-Object {
# Output an array of all capture-group matches,
# *as a single object* (note the `, `)
, $_.Matches[0].Groups.Where({ $_.Name -ne '0' }).Value
}
)
This returns an array of arrays, each element of which is the array of capture-group matches for a given package, so that $capturedStrings[0][0] returns the packageId value for the first package, for instance.
Note:
$_.Matches[0].Groups.Where({ $_.Name -ne '0' }).Value programmatically enumerates all capture-group matches and returns their .Value property values as an array, using member-access enumeration; note how name '0' must be excluded, as it represents the whole match.
With the capture groups in your specific regex, the above is equivalent to the following, as shown in a commented-out line in your question:
@($_.Matches[0].Groups['packageId'].Value, $_.Matches[0].Groups['version'].Value)
, ..., the unary form of the array-construction operator, is used as a shortcut for outputting the array (symbolized by ... here) as a whole, as a single object. By default, enumeration would occur and the elements would be emitted one by one. , ... is in effect a shortcut to the conceptually clearer Write-Output -NoEnumerate ... - see this answer for an explanation of the technique.
Additionally, @(...), the array subexpression operator, is needed in order to ensure that a jagged array (nested array) is returned even in the event that only one array is returned across all $packages.
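The effect of the unary comma is easiest to see side by side. This is a minimal sketch with made-up numbers rather than package names; it also mirrors the "8 lines instead of 4 arrays" symptom from the question.

```powershell
# Without the unary comma, each inner array is enumerated on output,
# so the pipeline emits 4 separate numbers.
$flat = @(1, 2 | ForEach-Object { @($_, ($_ * 10)) })
$flat.Count      # 4

# With the unary comma, each inner array is emitted as a single object,
# so the result is an array of 2 arrays.
$jagged = @(1, 2 | ForEach-Object { , @($_, ($_ * 10)) })
$jagged.Count    # 2
$jagged[1][1]    # 20

# Conceptually clearer equivalent of the unary comma.
$viaWriteOutput = @(1, 2 | ForEach-Object { Write-Output -NoEnumerate @($_, ($_ * 10)) })
$viaWriteOutput.Count   # 2
```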

Editing Powershell Object

I'm using powershell to run a command like so:
$getlist=rclone sha1sum remote:"\My Pictures\2009\03" --dry-run
Write-Output $getlist
that outputs an object with the results. The problem is that I only want the first column of those results. I've tried things like custom-format --Depth 1 and the other *-format commands, but they don't work on this object.
that outputs an object with the results
While that is technically true, it is more specifically an [object[]]-typed array of lines ([string] instances), which PowerShell implicitly creates when you assign the stream of output lines produced by the external rclone program to a variable. (Arrays created by PowerShell are [object[]]-typed, even if all the elements are of the same type, such as [string] in this case.)
PowerShell fundamentally only "speaks text" when communicating with external programs.
Therefore, to extract substrings from these lines you must perform text parsing, as implied by AdminOfThings' comment on the question.
A simplified approach is to use the unary form of the -split operator:
# Simulate lines input whose first whitespace-separated token is to
# be extracted.
$getlist = 'foo bar baz', 'more stuff here'
$getlist.ForEach({ (-split $_)[0] })
The above yields:
foo
more
zett42's helpful answer shows a simpler alternative that relies on the -replace operator's (among others) ability to operate directly on each element of an array-valued LHS.
However, the -split approach is useful if you want to extract multiple column values.
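A sketch of extracting more than one column, reusing the simulated input from above. The property names First and Rest are invented for the example, and the second variant swaps in the binary form of -split with a substring limit, which is an assumption about what is wanted when the last column may itself contain spaces.

```powershell
# Simulated input lines, as above.
$lines = 'foo bar baz', 'more stuff here'

# Unary -split: pick several whitespace-separated tokens by index.
$tokens = -split $lines[0]
$firstAndThird = $tokens[0], $tokens[2]   # foo, baz

# Binary -split with a limit of 2: first token, then the untouched remainder.
$rows = $lines | ForEach-Object {
    $first, $rest = $_ -split '\s+', 2
    [pscustomobject]@{ First = $first; Rest = $rest }
}
$rows[0].First   # foo
$rows[0].Rest    # bar baz
```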
If you don't need / want to capture all of the external program's (rclone's) output in memory first, you can use streaming processing in the pipeline, via the ForEach-Object cmdlet:
'foo bar baz', 'more stuff here' | ForEach-Object { (-split $_)[0] }
Note: While slightly slower than collecting all lines in memory up front, the advantage of a pipeline-based approach is reduced memory load: only the extracted substrings are kept in memory (if assigned to a variable).
You can use a regular expression to remove the undesired parts of the output:
$getlist = $getlist -replace '\s.*'
When a PowerShell operator such as -replace is applied to a collection, it will be applied to each element individually, creating a new array that stores the results (see Substitution in a collection).
The regular expression removes everything from the first whitespace up to the end of the string.
RegEx breakdown:
\s - a single whitespace character like space and tab
.* - any character, zero or more times

Powershell splatting: pass ErrorAction = Ignore in hash table

Here's a script to list directories / files passed on the command line -- recursively or not:
param( [switch] $r )
$gci_args = @{
Recurse = $r
ErrorAction = Ignore
}
$args | gci @gci_args
Now this does not work because Ignore is interpreted as a literal. What's the canonical way to pass an ErrorAction?
I see that both "Ignore" and (in PS7) { Ignore } work as values, but neither seems to make a difference for my use case (bad file name created under Linux, which stops PS5 regardless of the ErrorAction, but does not bother PS7 at all). So I'm not even sure the parameter has any effect.
because Ignore is interpreted as a literal
No, Ignore is interpreted as a command to execute, because it is parsed in argument mode (command invocation, like a shell) rather than in expression mode (like a traditional programming language) - see this answer for more information.
While using a [System.Management.Automation.ActionPreference] enumeration value explicitly, as in filimonic's helpful answer, is definitely an option, you can take advantage of the fact that PowerShell automatically converts back and forth between enum values and their symbolic string representations.
Therefore, you can use string 'Ignore' as a more convenient alternative to [System.Management.Automation.ActionPreference]::Ignore:[1]
$gci_args = @{
# ...
ErrorAction = 'Ignore'
}
Note that it is the quoting ('...') that signals to PowerShell that expression-mode parsing should be used, i.e. that the token is a string literal rather than a command.
Also note that -ErrorAction only operates on non-terminating errors (which are the typical kind, however) - see this answer for more information.
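Putting the pieces together, here is a minimal sketch of splatting a quoted 'Ignore' value. The randomly named path is an assumption made so that Get-ChildItem has a non-terminating "path not found" error to suppress; the variable names are illustrative.

```powershell
# Hypothetical path that is assumed not to exist.
$missing = Join-Path ([System.IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())

$gciArgs = @{
    LiteralPath = $missing
    ErrorAction = 'Ignore'   # a string; converted to the enum when the parameter is bound
}

# The non-terminating "path not found" error is suppressed; nothing is returned.
$result = Get-ChildItem @gciArgs

# The same string-to-enum conversion, made explicit with a cast.
$asEnum = [System.Management.Automation.ActionPreference] 'Ignore'
$asEnum.GetType().FullName   # System.Management.Automation.ActionPreference
```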
As for discovery of the permissible -ErrorAction values:
The conceptual about_CommonParameters help topic covers all common parameters, of which -ErrorAction is one.
Many common parameters have corresponding preference variables (which accept the same values), covered in about_Preference_Variables, which allow you to preset common parameters.
Interactively, you can use tab-completion to see the permissible values (as unquoted symbolic names, which you simply need to wrap in quotes); e.g.:
# Pressing the Tab key repeatedly where indicated
# cycles through the acceptable arguments.
Get-ChildItem -ErrorAction <tab>
[1] Note that using a string does not mean giving up type safety, if the context unambiguously calls for a specific enum type, such as in this case. Validation only happens at runtime either way, given that PowerShell is an interpreted language.
However, it is possible for a PowerShell-aware editor - such as Visual Studio Code with the PowerShell extension - to flag incorrect values at design time. As of version 2020.6.0, however, that does not yet appear to be the case. Fortunately, however, tab-completion and IntelliSense work as expected, so the problem may not arise.
That said, as zett42 points out, in the context of defining a hashtable entry for later splatting the expected type is not (yet) known, so explicit use of [System.Management.Automation.ActionPreference] does have advantages: (a) IntelliSense in the editor can guide you, and (b) - assuming that Set-StrictMode -Version 2 or higher is in effect - an invalid value will be reported earlier at runtime, namely at the point of assignment, which makes troubleshooting easier. As of PowerShell 7.1, a caveat regarding Set-StrictMode -Version 2 or higher is that you will not be able to use the intrinsic (PowerShell-supplied) .Count property on objects that don't have it type-natively, due to the bug described in GitHub issue #2798.
I think the best way is to use native type.
$ErrorActionPreference.GetType().FullName # System.Management.Automation.ActionPreference
So, use
$gci_args = @{
Recurse = $r
ErrorAction = [System.Management.Automation.ActionPreference]::Ignore
}

Determine if parameter is regular expression?

I have created a PowerShell routine for setting mp3 tags on songs, where I'd like some of my parameters to act as either a regular expression or a "simple" string. To be specific, if the parameter can be said to work as a regular expression, the function should try to use this for retrieving its value; if it can't, it should simply use that value.
I've just browsed Parameter sets, and don't think this would suit me since I want to be flexible with the parameter handling; i.e. I'd like several parameters to act this way independently. But maybe I'm wrong in this? Anyway, help would be appreciated.
You don't really need the try/catch if you use:
IF ($string -as [regex])
If the cast is successful it will return the regex, if not it will return $null, so used as a boolean test in the IF, it will be $true if it is a valid regex, and $false if it is not.
That being said, I'd agree with Joey that you should settle on a single match type (either wildcard or regex) and stick with that. There's too much potential for unintended consequences in trying to determine if a regex metacharacter was intended to be matched literally or not.
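A quick sketch of the -as test, using made-up sample strings. It also shows why the test cannot reveal intent: an ordinary file name is a perfectly valid regex.

```powershell
# A syntactically valid pattern: -as returns a [regex] instance.
$valid = '[a-z]+' -as [regex]

# An invalid pattern (unterminated character class): -as returns $null.
$invalid = '[a-z' -as [regex]

# A plain file name also converts successfully, although its '.' would
# then match ANY character rather than a literal dot.
$plain = 'song.mp3' -as [regex]

[bool] $valid     # True
[bool] $invalid   # False
[bool] $plain     # True
```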
You can try converting the string to a regular expression and look for failures. If there is an exception, just use it as string:
$isParamRegex = $(try { $null = [regex]$Param; $true } catch { $false })
As for the parameter type, just make it a string and document it appropriately.
However, I'd say you might want to go a different route there:
Either make the argument always a regex, to avoid surprises with metacharacters.
Or make it a pattern for -like instead of -match which is a bit more predictable for users (imho).
In both cases provide a LiteralParam argument, akin to LiteralPath to just pass things as plain strings which are handled as such.
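One way the "always a regex, plus a literal variant" suggestion could look is sketched below. The function name Test-TagMatch and its parameter names are hypothetical and not taken from the question; [regex]::Escape() is used to turn the literal string into a pattern that matches it verbatim.

```powershell
# Hypothetical helper: -Pattern is always treated as a regex, while
# -LiteralPattern is escaped first so that metacharacters match literally.
function Test-TagMatch {
    param(
        [Parameter(Mandatory)] [string] $Value,
        [string] $Pattern,
        [string] $LiteralPattern
    )
    if ($PSBoundParameters.ContainsKey('LiteralPattern')) {
        $Pattern = [regex]::Escape($LiteralPattern)
    }
    $Value -match $Pattern
}

Test-TagMatch -Value 'ab'  -Pattern '.'          # True:  '.' matches any character
Test-TagMatch -Value 'ab'  -LiteralPattern '.'   # False: no literal dot in 'ab'
Test-TagMatch -Value 'a.b' -LiteralPattern '.'   # True:  literal dot is present
```

With this split the caller states the intent explicitly, so the function never has to guess whether a string was meant as a regex.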