PowerShell filter specific form from string - powershell

I have a long string which contains letters, numbers, and other symbols.
I need to filter everything that matches the form number.number.number. For example 1.0.90 should pass the filter (it's a version number).
Afterwards, I need to convert the number after the last period (in the above example - 90) to a number which I can manipulate.
I didn't find any good explanation out there.

Use a regular expression to match the version number and capture the revision number for extraction (via the automatic variable $matches):
... | Where-Object {
    $_ -match '\d+\.\d+\.(\d+)'
} | ForEach-Object {
    $revision = [int]$matches[1]
}
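If the input is one long string rather than pipeline input, the same idea can be sketched with [regex]::Match(). This is only an illustration: the sample text and the variable names are made up here, not taken from the question.

```powershell
# Hypothetical sample text; only the 1.0.90 part comes from the question.
$text = 'Build output: product v1.0.90 (x64), finished'

# Same pattern as above: the capture group holds the digits after the last period.
$m = [regex]::Match($text, '\d+\.\d+\.(\d+)')

if ($m.Success) {
    $revision = [int]$m.Groups[1].Value   # 90, now a real integer
    $next     = $revision + 1             # arithmetic works: 91
}
```

Because $revision is cast to [int], it can be compared or incremented like any other number.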

Related

Powershell - Need to recognize if there is more than one result (regex)

I am using this to find if file name contains exactly 7 digits
if ($file.Name -match '\D(\d{7})(?:\D|$)') {
$result = $matches[1]
}
The problem is when there is a file name that contains 2 groups of 7 digits
for an example:
patch-8.6.22 (1329214-1396826-Increase timeout.zip
In this case the result will be the first one (1329214).
In most cases there is only one number, so the regex works, but I must recognize when there is more than one group and integrate that check into the if ().
The -match operator only ever looks for one match.
To get multiple ones, you must currently use the underlying .NET APIs directly, specifically [regex]::Matches():
Note: There's a green-lighted proposal to implement a -matchall operator, but as of PowerShell 7.3.0 it hasn't been implemented yet - see GitHub issue #7867.
# Sample input.
$file = [pscustomobject] @{ Name = 'patch-8.6.22 (1329214-1396826-Increase timeout.zip' }
# Note:
# * If *nothing* matches, $result will contain $null.
# * If *one* substring matches, $result will be a single string.
# * If *two or more* substrings match, $result will be an *array* of strings.
$result = ([regex]::Matches($file.Name, '(?<=\D)\d{7}(?=\D|$)')).Value
.Value uses member-access enumeration to extract matching substrings, if any, from the elements of the collection returned by [regex]::Matches().
I've tweaked the regex to use lookaround assertions ((?<=...) and (?=...)) so that only the substrings of interest are captured.
See this regex101.com page for an explanation of the regex and the ability to experiment with it.
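To fold the "more than one group" check into a condition, one option is to look at the Count of the collection that [regex]::Matches() returns. The sketch below assumes that is what the question needs; the $status variable and its values are purely illustrative.

```powershell
$name  = 'patch-8.6.22 (1329214-1396826-Increase timeout.zip'
$found = [regex]::Matches($name, '(?<=\D)\d{7}(?=\D|$)')

if ($found.Count -gt 1) {
    $status = 'ambiguous'            # two or more 7-digit groups
} elseif ($found.Count -eq 1) {
    $status = 'unique'
    $result = $found[0].Value        # the single 7-digit group
} else {
    $status = 'none'                 # no 7-digit group at all
}
$status
```

For the sample file name both 1329214 and 1396826 match, so the first branch is taken.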

Powershell number format

I am creating a script that converts a CSV file into another format.
To do so, I need my numbers to have a fixed format that respects the column size: 00000000000000000,00 (20 characters, 2 digits after the comma).
I have tried to format the number with -f and with the method $value.ToString("#################.##"), without success.
Here is an example Input :
4000000
45817,43
400000
570425,02
15864155,69
1068635,69
128586256,9
8901900,04
29393,88
126858346,88
1190011,46
2358411,95
139594,82
13929,74
11516,85
55742,78
96722,57
21408,86
717,01
54930,49
391,13
2118,64
Any hints are welcome :)
Thank you !
tl;dr:
Use 0 instead of # in the format string:
PS> $value = 128586256.9; $value.ToString('00000000000000000000.00')
00000000000128586256.90
Note:
Alternatively, you could construct the format string as an expression:
$value.ToString('0' * 20 + '.00')
The resulting string reflects the current culture with respect to the decimal mark; e.g., with fr-FR (French) in effect, , rather than . would be used; you can pass a specific [cultureinfo] object as the second argument to control what culture is used for formatting; see the docs.
As in your question, I'm assuming that $value already contains a number, which implies that you've already converted the CSV column values - which are invariably strings - to numbers.
To convert a string culture-sensitively to a number, use [double]::Parse('1,2'), for instance (this method too has an overload that allows specifying what culture to use).
Caveat: By contrast, a PowerShell cast (e.g. [double] '1.2') is by design always culture-invariant and only recognizes . as the decimal mark, irrespective of the culture currently in effect.
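The parsing and formatting steps can be combined as follows. This is a sketch under assumptions: instead of a named culture such as fr-FR, it uses a cloned NumberFormatInfo with , as the decimal separator, so the result does not depend on which cultures are installed. The format string uses 17 integer digits so that the total width is the 20 characters the question asks for.

```powershell
# Assumption: this cloned NumberFormatInfo stands in for a real culture object.
$nfi = [cultureinfo]::InvariantCulture.NumberFormat.Clone()
$nfi.NumberGroupSeparator   = '.'   # set first so it cannot clash with the decimal separator
$nfi.NumberDecimalSeparator = ','

$text   = '128586256,9'                    # one value from the sample input
$value  = [double]::Parse($text, $nfi)     # string -> number, comma-aware
$format = '0' * 17 + '.00'                 # 17 integer digits + separator + 2 decimals
$fixed  = $value.ToString($format, $nfi)
$fixed                                     # 00000000128586256,90
```

Passing the same format provider to both Parse() and ToString() keeps the decimal mark consistent on the way in and on the way out.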
zerocukor287 has provided the crucial pointer:
To unconditionally represent a digit in a formatted string and default to 0 in the absence of an available digit, use 0, the zero placeholder in a .NET custom numeric format string
By contrast, #, the digit placeholder, represents only digits actually present in the input number.
To illustrate the difference:
PS> (9.1).ToString('.##')
9.1 # only 1 decimal place available, nothing is output for the missing 2nd
PS> (9.1).ToString('.00')
9.10 # only 1 decimal place available, 0 is output for the missing 2nd
Since your input uses commas as decimal point, you can split on the comma and format the whole number and the decimal part separately.
Something like this:
$csv = @'
Item;Price
Item1;4000000
Item2;45817,43
Item3;400000
Item4;570425,02
Item5;15864155,69
Item6;1068635,69
Item7;128586256,9
Item8;8901900,04
Item9;29393,88
Item10;126858346,88
Item11;1190011,46
Item12;2358411,95
Item13;139594,82
Item14;13929,74
Item15;11516,85
Item16;55742,78
Item17;96722,57
Item18;21408,86
Item19;717,01
Item20;54930,49
Item21;391,13
Item22;2118,64
'@ | ConvertFrom-Csv -Delimiter ';'
foreach ($item in $csv) {
    $num, $dec = $item.Price -split ','
    # Right-pad the decimal part as text so that ',9' becomes ',90' (not ',09');
    # a missing decimal part becomes '00'.
    $item.Price = '{0:D20},{1}' -f [int64]$num, ([string]$dec).PadRight(2, '0')
}
# show on screen
$csv
# output to (new) csv file
$csv | Export-Csv -Path 'D:\Test\formatted.csv' -Delimiter ';'
Output on screen:
Item Price
---- -----
Item1 00000000000004000000,00
Item2 00000000000000045817,43
Item3 00000000000000400000,00
Item4 00000000000000570425,02
Item5 00000000000015864155,69
Item6 00000000000001068635,69
Item7 00000000000128586256,90
Item8 00000000000008901900,04
Item9 00000000000000029393,88
Item10 00000000000126858346,88
Item11 00000000000001190011,46
Item12 00000000000002358411,95
Item13 00000000000000139594,82
Item14 00000000000000013929,74
Item15 00000000000000011516,85
Item16 00000000000000055742,78
Item17 00000000000000096722,57
Item18 00000000000000021408,86
Item19 00000000000000000717,01
Item20 00000000000000054930,49
Item21 00000000000000000391,13
Item22 00000000000000002118,64
I do things like this all the time, usually for generating computer names. The custom numeric format string reference will come in handy. If you want a literal period, you have to escape it with a backslash.
1..5 | % tostring 00000000000000000000.00
00000000000000000001.00
00000000000000000002.00
00000000000000000003.00
00000000000000000004.00
00000000000000000005.00
Adding commas to long numbers:
psdrive c | % free | % tostring '0,0' # or '#,#'
18,272,501,760
"Per mille" character ‰ :
.00354 | % tostring '#0.##‰'
3.54‰

Pad IP addresses with leading 0's - powershell

I'm looking to pad IP addresses with 0's
example
1.2.3.4 -> 001.002.003.004
50.51.52.53 -> 050.051.052.053
Tried this:
[string]$paddedIP = $IPvariable
[string]$paddedIP.PadLeft(3, '0')
Also tried split as well, but I'm new to powershell...
You can use a combination of .Split() and -join.
('1.2.3.4'.Split('.') |
ForEach-Object {$_.PadLeft(3,'0')}) -join '.'
With this approach, you are working with strings the entire time. Split('.') splits the string into array elements at every . character. .PadLeft(3,'0') ensures 3 characters, adding leading zeroes if necessary. -join '.' combines the array into a single string, with the elements separated by . characters.
You can take a similar approach with the format operator -f.
"{0:d3}.{1:d3}.{2:d3}.{3:d3}" -f ('1.2.3.4'.Split('.') |
Foreach-Object { [int]$_ } )
The :dN format string enables N (number of digits) padding with leading zeroes.
This approach creates a string array like in the first solution. Then each element is pipelined and converted to an [int]. Lastly, the formatting is applied to each element.
To complement AdminOfThings' helpful answer with a more concise alternative using the -replace operator with a script block ({ ... }), which requires PowerShell Core (v6.1+):
PSCore> '1.2.3.50' -replace '\d+', { '{0:D3}' -f [int] $_.Value }
001.002.003.050
The script block is called for every match of regex \d+ (one or more digits), and $_ inside the script block refers to a System.Text.RegularExpressions.Match instance that represents the match at hand; its .Value property contains the matched text (string).
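On Windows PowerShell, where -replace does not accept a script block, a comparable result can be sketched by calling [regex]::Replace() directly and passing a script block, which PowerShell converts to a MatchEvaluator delegate. The variable names are illustrative only.

```powershell
$ip = '1.2.3.50'

# The script block receives each Match object as its first argument ($m)
# and returns the replacement text for that match.
$padded = [regex]::Replace($ip, '\d+', { param($m) '{0:D3}' -f [int]$m.Value })
$padded   # 001.002.003.050
```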

Returning the whole string when no match in a Powershell Substring(0, IndexOf)

I have some Powershell that works with mail from Outlook folders. There is a footer on most emails starting with text "------". I want to dump all text after this string.
I have added an expression to Select-Object as follows:
$cleanser = {($_.Body).Substring(0, ($_.Body).IndexOf("------"))}
$someObj | Select-Object -Property @{ Name = 'Body'; Expression = $cleanser}
This works when the IndexOf() returns a match... but when there is no match my Select-Object outputs null.
How can I update my expression to return the original string when IndexOf returns null?
PetSerAl, as countless times before, has provided the crucial pointer in a comment on the question:
Use PowerShell's -replace operator, which implements regex-based string replacement that returns the input string as-is if the regex doesn't match:
# The script block to use in a calculated property with Select-Object later.
$cleanser = { $_.Body -replace '(?s)------.*' }
If you want to ensure that ------ only matches at the start of a line, use (?sm)^------.*; if you also want to remove the preceding newline, use (?s)\r?\n------.*
(?s) is an inline regex option that makes . match newlines too, so that .* effectively matches all remaining text, across lines.
By not specifying a replacement operand, '' (the empty string) is implied, which effectively removes the matching part from the input string (technically, a copy of the original string with the matching part removed is returned).
If regex '(?s)------.*' does not match, $_.Body is returned as-is (technically, it is the input string itself that is returned, not a copy).
The net effect is that anything starting with ------ is removed, if present.
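A quick way to see both cases is to run the calculated property against two made-up objects, one with a footer and one without. The sample bodies here are hypothetical.

```powershell
$cleanser = { $_.Body -replace '(?s)------.*' }

# Hypothetical stand-ins for the mail objects in the question.
$mails = [pscustomobject]@{ Body = "Hello`nWorld`n------`nConfidential footer" },
         [pscustomobject]@{ Body = 'No footer here' }

$cleaned = $mails | Select-Object -Property @{ Name = 'Body'; Expression = $cleanser }

$cleaned[0].Body   # "Hello`nWorld`n" (everything from ------ onward is removed)
$cleaned[1].Body   # 'No footer here' (no match, so the string is returned unchanged)
```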
I agree with @mklement0 and @PetSerAl: regular expressions give the best answer. Yay! Regular expressions to the rescue!
Edit:
Fixing my original post.
Going with @Adam's idea of using a script block in the expression, you simply need to add more logic to the script block to check the index before using it:
$cleanser = {
    $index = ($_.Body).IndexOf("------")
    if ($index -eq -1) {
        # No footer found: keep the whole string.
        $index = $_.Body.Length
    }
    ($_.Body).Substring(0, $index)
}
$someObj | Select-Object -Property @{ Name = 'Body'; Expression = $cleanser}

powershell extracting data from strings or other suggestions

I am writing a script that reads data from an Excel document generated by another tool. It lists file ages in the format shown below. I would like to process each cell value and change the cell color based on that value: anything older than 1 year gets changed to RED, and 90+ days gets yellow\orange.
After a bit of research, I elected to use an if statement to detect when the age is greater than 0 years, which seems to work fine. However, for the days portion I'm not sure how to extract JUST the digits to the left of d in each cell, stopping at the y if it is there. Alternatively, I could read the left digits only if $_ contains d and then check whether that value is -gt 90. I am unsure how to extract variable-length strings only when they are digits to the left of a character. I considered using a combination of the method below (finding a character) and returning everything up to y, or something else.
Find character position and update file name
Possible Age Formats:
13y170d
3y249d
8h7m
1y109d
1y109d
1y109d
5d22h
3y281d
3y184d
11y263d
7m25s
1h14m
[regex]$years = "\d{1,3}[0-9]y"
[regex]$days_90 = "\d{0,3}[0-9]d"
conditionally formatting/coloring row based on age (years)
if ( $( A$_ -match "$years") -eq $True ) {
$($test_home).$("Last Accessed") | ForEach-Object { $( $($_.Contains("y") -eq $True ) { New-ConditionalText -Text Red } }
conditionally formatting/coloring row based on age (90+ days)
if ( $( A$_ -match "$days_90") -eq $True ) { New-ConditionalText -Text Yellow }
What you are after is a positive lookahead and lookbehind. Effectively, it gets the text between two characters or sets. This is really handy if you have a consistently formatted set of data to work with.
[regex]$days_90 = '(?<=y).*?(?=d)'
. Matches any characters without line breaks.
* Matches 0 or more of the preceding token.
? Makes the regex lazy and try to match as few as possible.
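Note that the lookbehind (?<=y) requires a y before the digits, so an age such as 5d22h would not match. As an alternative sketch, not the answer's method, capture groups can pull out the years and days independently. Get-AgeColor is a hypothetical helper, and the thresholds (1 year for red, 90 days for yellow) are taken from the question.

```powershell
# Hypothetical helper: maps an age string such as '3y249d' to a color name.
function Get-AgeColor {
    param([string]$Age)

    # Capture the digits directly before 'y' and 'd'; a missing unit counts as 0.
    $years = if ($Age -match '(\d+)y') { [int]$Matches[1] } else { 0 }
    $days  = if ($Age -match '(\d+)d') { [int]$Matches[1] } else { 0 }

    if ($years -ge 1) {
        'Red'
    } elseif ($days -ge 90) {
        'Yellow'
    } else {
        'None'
    }
}

'13y170d', '5d22h', '8h7m' | ForEach-Object { '{0} -> {1}' -f $_, (Get-AgeColor $_) }
```

The returned color name could then drive whatever conditional formatting call the script uses.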