In a text like the one below, I need to read the project code, which is unique per text file.
devices :
meta : @{Projectcode=rvmf99999}
public_keys : @{Key=ssh-
Select-String -Pattern "rvmf" picks up the whole line; I just need rvmf and the digits after it.
# Sample input
$txt = @'
devices :
meta : @{Projectcode=rvmf99999}
public_keys : @{Key=ssh-
'@
$txt | Select-String 'rvmf\d+' | foreach { $_.Matches[0].Value } # -> 'rvmf99999'
Regex 'rvmf\d+' captures substring 'rvmf' followed by 1 or more (+) digits (\d).
The object output by Select-String has a .Matches property whose first entry's .Value property contains what the regex captured.
Specifically, the output objects are of type Microsoft.PowerShell.Commands.MatchInfo, which contains the input line (property .Line) as well as metadata about the source of the line and details about the regex-matching operation in the .Matches property.
The .Matches property contains a collection of match-information objects; unless -AllMatches was passed to Select-String, however, there will be only one element.
Each element of the .Matches collection is a System.Text.RegularExpressions.Match instance, whose .Value property contains what the regex captured as a whole.
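To make the object structure described above concrete, here is a minimal sketch. The input line is hypothetical (a second project code was added purely to show the effect of -AllMatches); only the property names come from the explanation above.

```powershell
# Hypothetical input line containing two project codes.
$line = 'meta : @{Projectcode=rvmf99999} backup=rvmf12345'

# Select-String emits one MatchInfo object per matching input line.
$info = $line | Select-String 'rvmf\d+' -AllMatches

$info.GetType().FullName   # Microsoft.PowerShell.Commands.MatchInfo
$info.Line                 # the whole input line
$info.Matches.Count        # 2, because -AllMatches was passed

# Member-access enumeration collects .Value from every Match object.
$codes = $info.Matches.Value
$codes                     # rvmf99999, rvmf12345
```

Without -AllMatches, $info.Matches would hold only the first match, which is why the command above indexes it with [0].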
Note: There is an upcoming feature - green-lighted, but not yet implemented as of PowerShell Core 7.0.0-preview.5 - that will greatly simplify the command:
# NOT YET IMPLEMENTED as of PowerShell Core 7.0.0-preview.5
$txt | Select-String 'rvmf\d+' -OnlyMatching # -> 'rvmf99999'
-OnlyMatching will only output the part of the line that was matched.
Related
I am using this to find if file name contains exactly 7 digits
if ($file.Name -match '\D(\d{7})(?:\D|$)') {
    $result = $matches[1]
}
The problem is when there is a file name that contains 2 groups of 7 digits
for an example:
patch-8.6.22 (1329214-1396826-Increase timeout.zip
In this case the result will be the first one (1329214).
In most cases there is only one number, so the regex works, but I need to detect when there is more than one group and integrate that check into the if ().
The -match operator only ever looks for one match.
To get multiple ones, you must currently use the underlying .NET APIs directly, specifically [regex]::Matches():
Note: There's a green-lighted proposal to implement a -matchall operator, but as of PowerShell 7.3.0 it hasn't been implemented yet - see GitHub issue #7867.
# Sample input.
$file = [pscustomobject] @{ Name = 'patch-8.6.22 (1329214-1396826-Increase timeout.zip' }
# Note:
# * If *nothing* matches, $result will contain $null
# * If *one* substring matches, $result will be a single string.
# * If *two or more* substrings match, $result will be an *array* of strings.
$result = ([regex]::Matches($file.Name, '(?<=\D)\d{7}(?=\D|$)')).Value
.Value uses member-access enumeration to extract matching substrings, if any, from the elements of the collection returned by [regex]::Matches().
I've tweaked the regex to use lookaround assertions ((?<=...) and (?=...)) so that only the substrings of interest are captured.
See this regex101.com page for an explanation of the regex and the ability to experiment with it.
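Since the question asks how to fold the "more than one group" check into the if, here is one possible sketch that keeps the match collection around and branches on its count. The variable names $found and $status are illustrative only.

```powershell
# Sample input, as above.
$file = [pscustomobject] @{ Name = 'patch-8.6.22 (1329214-1396826-Increase timeout.zip' }

# Keep the whole MatchCollection so that its .Count can be inspected.
$found = [regex]::Matches($file.Name, '(?<=\D)\d{7}(?=\D|$)')

if ($found.Count -gt 1) {
    # Two or more 7-digit groups: report all of them.
    $status = "ambiguous: $($found.Value -join ', ')"
} elseif ($found.Count -eq 1) {
    $status = "single: $($found[0].Value)"
} else {
    $status = 'none'
}
$status   # ambiguous: 1329214, 1396826
```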
I'm looking to pad IP addresses with 0's
example
1.2.3.4 -> 001.002.003.004
50.51.52.53 -> 050.051.052.053
Tried this:
[string]$paddedIP = $IPvariable
[string]$paddedIP.PadLeft(3, '0')
Also tried split as well, but I'm new to powershell...
You can use a combination of .Split() and -join.
('1.2.3.4'.Split('.') |
ForEach-Object {$_.PadLeft(3,'0')}) -join '.'
With this approach, you are working with strings the entire time. Split('.') creates an array element at every . character. .PadLeft(3,'0') ensures 3 characters with leading zeroes if necessary. -join '.' combines the array into a single string with each element separated by a ..
You can take a similar approach with the format operator -f.
"{0:d3}.{1:d3}.{2:d3}.{3:d3}" -f ('1.2.3.4'.Split('.') |
Foreach-Object { [int]$_ } )
The :dN format string enables N (number of digits) padding with leading zeroes.
This approach creates a string array like in the first solution. Then each element is pipelined and converted to an [int]. Lastly, the formatting is applied to each element.
To complement AdminOfThings' helpful answer with a more concise alternative using the -replace operator with a script block ({ ... }), which requires PowerShell Core (v6.1+):
PSCore> '1.2.3.50' -replace '\d+', { '{0:D3}' -f [int] $_.Value }
001.002.003.050
The script block is called for every match of regex \d+ (one or more digits), and $_ inside the script block refers to a System.Text.RegularExpressions.Match instance that represents the match at hand; its .Value property contains the matched text (string).
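In Windows PowerShell, where -replace does not accept a script block, a comparable effect can be sketched with the .NET [regex]::Replace() method, which accepts a script block as its match evaluator. The parameter name $m is arbitrary.

```powershell
# The script block is converted to a MatchEvaluator delegate and is
# invoked once per match; $m is the Match object at hand.
$padded = [regex]::Replace('1.2.3.50', '\d+', { param($m) '{0:D3}' -f [int] $m.Value })
$padded   # 001.002.003.050
```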
I have some Powershell that works with mail from Outlook folders. There is a footer on most emails starting with text "------". I want to dump all text after this string.
I have added an expression to Select-Object as follows:
$cleanser = {($_.Body).Substring(0, ($_.Body).IndexOf("------"))}
$someObj | Select-Object -Property @{ Name = 'Body'; Expression = $cleanser}
This works when the IndexOf() returns a match... but when there is no match my Select-Object outputs null.
How can I update my expression to return the original string when IndexOf returns null?
PetSerAl, as countless times before, has provided the crucial pointer in a comment on the question:
Use PowerShell's -replace operator, which implements regex-based string replacement that returns the input string as-is if the regex doesn't match:
# The script block to use in a calculated property with Select-Object later.
$cleanser = { $_.Body -replace '(?s)------.*' }
If you want to ensure that ------ only matches at the start of a line, use (?sm)^------.*; if you also want to remove the preceding newline, use (?s)\r?\n------.*
(?s) is an inline regex option that makes . match newlines too, so that .* effectively matches all remaining text, across lines.
By not specifying a replacement operand, '' (the empty string) is implied, which effectively removes the matching part from the input string (technically, a copy of the original string with the matching part removed is returned).
If regex '(?s)------.*' does not match, $_.Body is returned as-is (technically, it is the input string itself that is returned, not a copy).
The net effect is that anything starting with ------ is removed, if present.
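A short sketch of the calculated property in action, using two made-up mail objects (one with a footer, one without) and the variant of the regex that also removes the preceding newline:

```powershell
# Hypothetical sample objects standing in for Outlook mail items.
$withFooter = [pscustomobject] @{ Body = "Hello`r`nWorld`r`n------`r`nConfidential footer" }
$noFooter   = [pscustomobject] @{ Body = 'No footer here' }

# Remove the footer line and everything after it, including the newline before it.
$cleanser = { $_.Body -replace '(?s)\r?\n------.*' }

$clean = $withFooter, $noFooter |
    Select-Object -Property @{ Name = 'Body'; Expression = $cleanser }

$clean[0].Body   # "Hello`r`nWorld" (footer removed)
$clean[1].Body   # 'No footer here' (returned unchanged)
```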
I agree with @mklement0 and @PetSerAl: regular expressions give the best answer. Yay! Regular expressions to the rescue!
Edit:
Fixing my original post.
Going with @Adam's idea of using a script block in the expression, you simply need to add more logic to the script block to check the index first before using it:
$cleanser = {
    $index = $_.Body.IndexOf("------")
    if ($index -eq -1) {
        $index = $_.Body.Length
    }
    $_.Body.Substring(0, $index)
}
$someObj | Select-Object -Property @{ Name = 'Body'; Expression = $cleanser }
Struggling to extract value within square brackets from below strings using PowerShell
in relation to any Facility C Loan [?10%?] per cent. per annum;
"Facility A Commitments" means the aggregate of the Facility A Commitments, being [????????10 million?????] at the date of this Agreement.
Output required:
10%
10 million
With a single, multiline string in memory (PSv4+):
$str = @'
in relation to any Facility C Loan [?10%?] per cent. per annum;
"Facility A Commitments" means the aggregate of the Facility A Commitments, being [????????10 million?????] at the date of this Agreement.
'@
[regex]::matches($str,'\[\?+([^?]+)\?+\]').ForEach({ $_.Groups[1].Value })
Using the pipeline with Get-Content and Select-String for line-by-line processing (PSv3+):
$lines = @'
in relation to any Facility C Loan [?10%?] per cent. per annum;
"Facility A Commitments" means the aggregate of the Facility A Commitments, being [????????10 million?????] at the date of this Agreement.
'@ -split '\r?\n'
# Substitute your `Get-Content someFile.txt` call for $lines
$lines |
Select-String '\[\?+([^?]+)\?+\]' |
ForEach-Object { $_.Matches.Groups[1].Value }
Explanation of regex \[\?+([^?]+)\?+\]:
\[ matches a literal [
\?+ matches one or more (+) literal ?
([^?]+) is a capture group ((...)) that matches one or more (+) characters from the set of characters ([...]) that are not (^) part of the set, i.e., any character that is not the ? character - this is the value of interest to extract.
\?+ matches one or more literal ?
\] matches a literal ]
Both [regex]::Matches() and the .Matches property on the objects that Select-String emits yield a collection of [System.Text.RegularExpressions.Match] objects, whose .Groups property contains both the full match (index 0) and what each capture group captured (1 containing the 1st capture group's value, ...).
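To illustrate the difference between index 0 and index 1 of .Groups, here is a minimal sketch using a single match against a shortened version of the first sample line:

```powershell
# [regex]::Match() (singular) returns only the first match.
$m = [regex]::Match('any Facility C Loan [?10%?] per cent.', '\[\?+([^?]+)\?+\]')

$m.Success           # True
$m.Groups[0].Value   # [?10%?]  (the full match, same as $m.Value)
$m.Groups[1].Value   # 10%      (what the 1st capture group captured)
```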
This is your regex for both cases:
(?<=\[\?+)[^\?]*(?=\?+\])
You can play with it at https://regex101.com
But depending on the regex flavor selected, that site may not support variable-width lookbehinds (the first plus). It should work in .NET/PowerShell though.
This will be good for you:
https://www.regular-expressions.info/lookaround.html
For the first one run:
$message -match '\[\?(\d*%)\?\]'
echo $Matches[1]
For the second one:
$message -match '\[\?*(\d* million)\?*\]'
echo $Matches[1]
In each iteration, you can simply check whether $message -match '...' returns $True, and then read the values in the $Matches variable (an automatic variable that holds the result of the most recent successful regex match).
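The per-iteration check can be sketched as follows. This sketch assumes the lines are already in an array and, for brevity, uses one general pattern for both cases instead of the two specific ones above; $values is an illustrative name.

```powershell
$lines = @(
    'in relation to any Facility C Loan [?10%?] per cent. per annum;'
    'the Facility A Commitments, being [????????10 million?????] at the date of this Agreement.'
)

$values = foreach ($message in $lines) {
    # -match returns $True or $False for a single string and,
    # on success, fills the automatic $Matches variable.
    if ($message -match '\[\?+([^?]+)\?+\]') {
        $Matches[1]
    }
}
$values   # 10%, 10 million
```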
I have a long string which contains letters, numbers, and other symbols.
I need to filter everything that matches the form number.number.number. For example 1.0.90 should pass the filter (it's a version number).
Afterwards, I need to convert the number after the last period (in the above example - 90) to a number which I can manipulate.
I didn't find any good explanation out there.
Use a regular expression to match the version number and capture the revision number for extraction (via the automatic variable $matches):
... | Where-Object {
    $_ -match '\d+\.\d+\.(\d+)'
} | ForEach-Object {
    $revision = [int]$matches[1]
}
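If the input really is one long string rather than a stream of separate items, a variation on the above is to call [regex]::Matches() on it directly. This is a sketch with a made-up sample string; note that it swaps the -match operator for the .NET method so that every version number in the string is found.

```powershell
# Hypothetical long string containing two version numbers.
$text = 'build-x 1.0.90 released; see also v2.13.7 (beta) and 3.x notes'

# Capture group 1 holds the number after the last period; cast it to [int].
$revisions = [regex]::Matches($text, '\d+\.\d+\.(\d+)') |
    ForEach-Object { [int] $_.Groups[1].Value }

$revisions           # 90, 7
$revisions[0] + 1    # 91, showing that the values are real integers
```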