PadLeft doesn't work when used with regex in PowerShell

Here are two code blocks that show PadLeft behaving strangely.
$array = "z", "fg"
$array -replace "(\w{1,2})", '$1'.PadLeft(2, "0")
# output: z, fg
$array = "z", "fg"
$array -replace "(\w{1,2})", '$1'.PadLeft(3, "0")
# output: 0z, 0fg
Why isn't the pad length fixed?
EDIT: @LotPings
I'm asking this as a separate question because your approach, when applied to a Rename-Item statement, does not affect files with brackets in their names.
$file_names = ls "G:\Donwloads\Srt Downloads\15" -Filter *.txt
# the files are: "Data Types _ C# _ Tutorial 5.txt", "If Statements (con't) _ C# _ Tutorial 15.txt"
$file_names |
ForEach{
if ($_.basename -match '^([^_]+)_[^\d]+(\d{1,2})$')
{
$file_names |
Rename-Item -NewName `
{ ($_.BaseName -replace $matches[0], ("{0:D2}. {1}" -f [int]$matches[2],$matches[1])) + $_.Extension }
}
}
# output: 05. Data Types.txt
# If Statements (con't) _ C# _ Tutorial 15.txt
As for .PadLeft: I thought the regex replacement group was a string, so .PadLeft should work on it, but it doesn't.

Ansgar's comment on your last question should have shown that your assumption about the order of operations was wrong.
And Lee_Dailey proved you wrong another time.
My answer to your previous question presented an alternative which also works here:
("Data Types _ C# _ Tutorial 5", "If Statements (con't) _ C# _ Tutorial 15") |
ForEach{
if ($_ -match '^([^_]+)_[^\d]+(\d{1,2})$'){
"{0:D2}. {1}" -f [int]$matches[2],$matches[1]
}
}
Sample output:
05. Data Types
15. If Statements (con't)
The last edit to your question is in fact a new question...
Rename-Item accepts piped input, so no ForEach-Object is necessary when you also use
Where-Object with the -match operator to replace the if.
The $Matches collection is supplied the same way.
I really don't know why you insist on using the -replace operator when you are building the NewName from scratch with the -f format operator.
$file_names = Get-ChildItem "G:\Donwloads\Srt Downloads\15" -Filter *.txt
$file_names | Where-Object BaseName -match '^([^_]+)_[^\d]+(\d{1,2})$' |
Rename-Item -NewName {"{0:D2}. {1}{2}" -f [int]$matches[2],$matches[1].Trim(),$_.Extension} -WhatIf
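One likely reason the question's -replace $matches[0] attempt skipped the file with parentheses, offered as a sketch rather than a diagnosis of the exact script: -replace treats its first operand as a regex, so the (con't) in the file name is parsed as a capture group and no longer matches the literal parentheses. The variable names below are made up for illustration.

```powershell
# Hypothetical base name, taken from the second file in the question
$name = "If Statements (con't) _ C# _ Tutorial 15"

# Used as a pattern, "(con't)" is a capture group, so the literal parentheses
# in the input are never matched and the string comes back unchanged.
$unescaped = $name -replace $name, 'X'

# [regex]::Escape turns every metacharacter into a literal.
$escaped = $name -replace [regex]::Escape($name), 'X'

$unescaped   # If Statements (con't) _ C# _ Tutorial 15
$escaped     # X
```

Building the new name with the -f operator, as above, sidesteps the escaping problem entirely.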

Several days after asking this question, I happened to figure out the problem.
The $number capture group references in -replace syntax are merely literal strings!
PowerShell never treats them as anything special, but the regex engine does. Look at the example below:
$array = "z", "fg"
$array -replace "(\w{1,2})", '$1'.Length
#output: 2
# 2
Looks strange? How can the $1 capture group have a Length of 2 for both "z" and "fg"? The answer is that the length being calculated is that of the literal string $1 rather than of "z" or "fg"!
Let's look at another example. This time let's replace a letter within the capture group and see what happens:
$array -replace "(\w{1,2})", '$1'.Replace("z", "6")
#output: z
# fg
The output shows that .Replace wasn't applied to the content of capture group 1.
$array -replace "(\w{1,2})", '$1'.Replace("1", "6")
#output: $6
# $6
See? The string being replaced is $1 itself.
Now the cause of the .PadLeft problem should be clear. PowerShell pads the literal string $1 first, and the regex engine then substitutes the content of the group for $1 in the result.
When I pad it with .PadLeft(2, "0"), nothing happens because the string $1 itself already has a length of 2.
$array -replace "(\w{1,2})", '$1'.PadLeft(2, "0")
# output: z
# fg
If I instead pad it with .PadLeft(3, "0"), the pad method does take effect: it produces the string 0$1, so the result shows a single "0" preceding the content of group 1.
$array -replace "(\w{1,2})", '$1'.PadLeft(3, "0")
#output: 0z
# 0fg
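To pad the captured text itself rather than the literal $1, the replacement has to run per match. A minimal sketch, assuming the goal is a fixed width of 3, is to call the .NET [regex]::Replace method with a script block; the $padded variable is introduced here only for illustration.

```powershell
$array = "z", "fg"

# The script block is converted to a MatchEvaluator delegate and runs once per
# match, so PadLeft sees the captured text ("z", "fg"), not the literal '$1'.
$padded = $array | ForEach-Object {
    [regex]::Replace($_, '\w{1,2}', { param($m) $m.Value.PadLeft(3, '0') })
}

$padded   # 00z
          # 0fg
```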

Related

Split string in Powershell

I have something written in a ".ini" file that I want to read from PowerShell. The file gives the value "notepad.exe" and I want to store the value "notepad" in a variable. So I do the following:
$CLREXE = Get-Content -Path "T:\keeran\Test-kill\test.ini" | Select-String -Pattern 'CLREXE'
#split the value from "CLREXE ="
$CLREXE = $CLREXE -split "="
#everything is fine until here
$CLREXE = $CLREXE[1]
#I am trying to omit ".exe" here, but it doesn't work
$d = $CLREXE -split "." | Select-String -NotMatch 'exe'
How can I do this?
@Mathias R. Jessen has already answered your question.
But instead of splitting the filename, you could use the GetFileNameWithoutExtension method from the .NET Path class.
$CLREXE = "notepad.exe"
$fileNameWithoutExtension = [System.IO.Path]::GetFileNameWithoutExtension($CLREXE)
Write-Host $fileNameWithoutExtension # this will print just 'notepad'
-split is a regex operator, and . is a special metacharacter in regex - so you need to escape it:
$CLREXE -split '\.'
A better way would be to use the -replace operator to remove the last . and everything after it:
$CLREXE -replace '\.[^\.]+$'
The regex pattern matches one literal dot (\.), then 1 or more non-dots ([^\.]+) followed by the end of the string $.
If you're not comfortable with regular expressions, you can also use .NET's native string methods for this:
$CLREXE.Remove($CLREXE.LastIndexOf('.'))
Here, we use String.LastIndexOf to locate the index (the position in the string) of the last occurrence of ., then remove everything from there on.
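The approaches above agree for notepad.exe but diverge once a name contains more than one dot. The comparison below is a small sketch using a made-up file name, my.app.exe, to show which ones strip only the final extension.

```powershell
$name = 'my.app.exe'   # hypothetical name with more than one dot

$bySplit   = ($name -split '\.')[0]                                 # my
$byReplace = $name -replace '\.[^\.]+$'                             # my.app
$byRemove  = $name.Remove($name.LastIndexOf('.'))                   # my.app
$byPath    = [System.IO.Path]::GetFileNameWithoutExtension($name)   # my.app
```

Taking the first element of the split drops everything after the first dot, while the other three remove only the last extension.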

Need to batch add characters to filenames using Powershell

I have a series of files all named something like:
PRT14_WD_14220000_1.jpg
I need to add two zeroes after the last underscore and before the number so it looks like PRT14_WD_14220000_001.jpg
I've tried:
(dir) | rename-Item -new { $_.name -replace '*_*_*_','*_*_*_00' }
Appreciate any help.
The closest thing to what you attempted would be this. In regex, the wildcard is .*, and parentheses create groups that you can refer to later with the dollar-sign numbers.
dir *.jpg | rename-Item -new { $_.name -replace '(.*)_(.*)_(.*)_','$1_$2_$3_00' } -whatif
What if: Performing the operation "Rename File" on target "Item: C:\users\admin\foo\PRT14_WD_14220000_1.jpg Destination: C:\users\admin\foo\PRT14_WD_14220000_001.jpg".
OK, here's my take for when you want the number padded with at most two zeros. $num has to be an integer for the .ToString() method I want.
dir *.jpg | rename-item -newname { $a,$b,$c,$num = $_.basename -split '_'
$num = [int]$num
$a + '_' + $b + "_" + $c + '_' + $num.tostring('000') + '.jpg'
} -whatif
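The same padding can be sketched with the -f format operator instead of .ToString('000'). This version works on a plain string rather than on files, and the variable names are illustrative only.

```powershell
$baseName = 'PRT14_WD_14220000_1'   # sample base name from the question
$parts = $baseName -split '_'

# D3 formats an integer with at least three digits, so the cast to [int] matters
$parts[-1] = '{0:D3}' -f [int]$parts[-1]

$newName = ($parts -join '_') + '.jpg'
$newName   # PRT14_WD_14220000_001.jpg
```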
the following presumes your last part of the .BaseName will always need two zeros added to it. what it does ...
fakes getting the fileinfo object that you get from Get-Item/Get-ChildItem
replace that with the appropriate cmdlet. [grin]
splits the .BaseName into parts using the _ as the split target
adds two zeros to the final part from the above split
merges the parts into a $NewBaseName
gets the .FullName and replaces the original BaseName with the $newBaseName
displays that new file name
you will still need to do your rename, but that is pretty direct. [grin]
here's the code ...
# fake getting a file info object
# in real life, use Get-Item or Get-ChildItem
$FileInfo = [System.IO.FileInfo]'PRT14_WD_14220000_1.jpg'
$BNParts = $FileInfo.BaseName.Split('_')
$BNParts[-1] = '00{0}' -f $BNParts[-1]
$NewBasename = $BNParts -join '_'
$NewFileName = $FileInfo.FullName.Replace($FileInfo.BaseName, $NewBaseName)
$NewFileName
output = D:\Data\Scripts\PRT14_WD_14220000_001.jpg
The -replace operator operates on regexes (regular expressions), not wildcard expressions such as * (by itself), which is what you're trying to use.
A conceptually more direct approach is to focus the replacement on the end of the string:
Get-ChildItem | # `dir` is a built-in alias for Get-ChildItem
Rename-Item -NewName { $_.Name -replace '(?<=_)[^_]+(?=\.)', '00$&' } -WhatIf
Note: The -WhatIf common parameter in the command above previews the operation. Remove -WhatIf once you're sure the operation will do what you want.
(?<=_)[^_]+(?=\.) matches a nonempty run (+) of non-_ chars ([^_]) preceded by _ ((?<=_)) and followed by a literal . ((?=\.)), excluding both the preceding _ and the following . from what is captured by the match ((?<=...) and (?=...) are non-capturing look-around assertions).
In short: This matches and captures the characters after the last _ and before the start of the filename extension.
00$& replaces what was matched with 00, followed by what the match captured ($&).
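A quick check of that pattern on plain strings, as a sketch; the second file name is hypothetical and only there to show that 00$& inserts two zeros regardless of how many digits are already present.

```powershell
$pattern = '(?<=_)[^_]+(?=\.)'

# $& stands for the whole match, so the matched "1" becomes "001"
$renamed = 'PRT14_WD_14220000_1.jpg' -replace $pattern, '00$&'
$renamed    # PRT14_WD_14220000_001.jpg

# Hypothetical two-digit case: the same two zeros are prepended
$twoDigit = 'PRT14_WD_14220000_12.jpg' -replace $pattern, '00$&'
$twoDigit   # PRT14_WD_14220000_0012.jpg
```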
In a follow-up comment you mention wanting to not just blindly insert 00, but to 0-left-pad the number after the last _ to 3 digits, whatever the number may be.
In PowerShell [Core] 6.1+, this can be achieved as follows:
Get-ChildItem |
Rename-Item -NewName {
$_.Name -replace '(?<=_)[^_]+(?=\.)', { $_.Value.PadLeft(3, '0') }
} -WhatIf
The script block ({ ... }) as the replacement operand receives each match as a Match instance stored in automatic variable $_, whose .Value property contains the captured text.
Calling .PadLeft(3, '0') on that captured text 0-left-pads it to 3 digits and outputs the result, which replaces the regex match at hand.
A quick demonstration:
PS> 'PRT14_WD_14220000_1.jpg' -replace '(?<=_)[^_]+(?=\.)', { $_.Value.PadLeft(3, '0') }
PRT14_WD_14220000_001.jpg # Note how '_1.jpg' was replaced with '_001.jpg'
In earlier PowerShell versions, you must make direct use of the .NET [regex] type's .Replace() method, which fundamentally works the same:
Get-ChildItem |
Rename-Item -NewName {
[regex]::Replace($_.Name, '(?<=_)[^_]+(?=\.)', { param($m) $m.Value.PadLeft(3, '0') })
} -WhatIf

Splitting in Powershell

I want to be able to split some text out of a txtfile:
For example:
Brackets#Release 1.11.6#Path-to-Brackets
Atom#v1.4#Path-to-Atom
I just want to have the "Release 1.11.6" part. I am doing a where-object starts with Brackets but I don't know the full syntax. Here is my code:
"Get-Content -Path thisfile.txt | Where-Object{$_ < IM STUCK HERE > !
You could do this:
((Get-Content thisfile.txt | Where-Object { $_ -match '^Brackets' }) -Split '#')[1]
This uses the -match operator to filter out any lines that don't start with Brackets (the ^ special regex character indicates that what follows must be at the beginning of the line). Then it uses the -Split operator to split those lines on # and then it uses the array index [1] to get the second element of the split (arrays start at 0).
Note that this assumes the text you want is always the second element of the split: if the split on # doesn't return at least two elements, the index [1] yields nothing (or an error under Set-StrictMode -Version 3.0 or later).
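A guarded variant of the split approach might look like the sketch below. It uses the sample lines inline instead of reading thisfile.txt, and it only emits the second field when the line really has one; $lines, $parts and $release are illustrative names.

```powershell
$lines = 'Brackets#Release 1.11.6#Path-to-Brackets', 'Atom#v1.4#Path-to-Atom'

$release = $lines |
    Where-Object { $_ -match '^Brackets' } |
    ForEach-Object {
        $parts = $_ -split '#'
        # Only emit the second field when the line actually has one
        if ($parts.Count -ge 2) { $parts[1] }
    }

$release   # Release 1.11.6
```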
$bracketsRelease = Get-Content -path thisfile.txt | foreach-object {
if ( $_ -match 'Brackets#(Release [^#]+)#' )
{
$Matches[1]
}
}
or
(select-string -Path file.txt -Pattern 'Brackets#(Release [^#]+)#').Matches[0].Groups[1].value

Performing A String Operation in a -replace Expression

I'm trying to make use of String.Substring() to replace every string with its substring from a certain position. I'm having a hard time figuring out the right syntax for this.
$dirs = Get-ChildItem -Recurse $path | Format-Table -AutoSize -HideTableHeaders -Property @{n='Mode';e={$_.Mode};width=50}, @{n='LastWriteTime';e={$_.LastWriteTime};width=50}, @{n='Length';e={$_.Length};width=50}, @{n='Name';e={$_.FullName -replace "(.:.*)", "*($(str($($_.FullName)).Substring(4)))*"}} | Out-String -Width 40960
I'm referring to the following expression
e={$_.FullName -replace "(.:.*)", "*($(str($($_.FullName)).Substring(4)))*"}}
The substring from the 4th character isn't replacing the Full Name of the path.
The paths in question are longer than 4 characters.
The output is just empty for the Full Name when I run the script.
Can someone please help me out with the syntax?
EDIT
The unaltered list of strings (as Get-ChildItem recurses) would be
D:\this\is\where\it\starts
D:\this\is\where\it\starts\dir1\file1
D:\this\is\where\it\starts\dir1\file2
D:\this\is\where\it\starts\dir1\file3
D:\this\is\where\it\starts\dir1\dir2\file1
The $_.FullName will therefore take on the value of each of the strings listed above.
Given an input like D:\this\is or D:\this\is\where, then I'm computing the length of this input (including the delimiter \) and then replacing $_.FullName with a substring beginning from the nth position where n is the length of the input.
If input is D:\this\is, then length is 10.
Expected output is
\where\it\starts
\where\it\starts\dir1\file1
\where\it\starts\dir1\file2
\where\it\starts\dir1\file3
\where\it\starts\dir1\dir2\file1
If you want to remove a particular prefix from a string you can do so like this:
$prefix = 'D:\this\is'
...
$_.FullName -replace ('^' + [regex]::Escape($prefix))
To remove a prefix of a given length you can do something like this:
$len = 4
...
$_.FullName -replace "^.{$len}"
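Both variants can be tried on plain strings. The sketch below uses two of the paths from the question in place of $_.FullName, and takes the length from the prefix itself instead of hard-coding 4.

```powershell
$prefix = 'D:\this\is'
$paths  = 'D:\this\is\where\it\starts', 'D:\this\is\where\it\starts\dir1\file1'

# Escape the prefix so the backslashes are matched literally
$byPrefix = $paths -replace ('^' + [regex]::Escape($prefix))

# Or drop a fixed number of leading characters (10 for this prefix)
$len = $prefix.Length
$byLength = $paths -replace "^.{$len}"

$byPrefix   # \where\it\starts
            # \where\it\starts\dir1\file1
```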
When having trouble, simplify:
This function will do what you are apparently trying to accomplish:
Function Remove-Parent {
param(
[string]$Path,
[string]$Parent)
$len = $Parent.length
$Path.SubString($Len)
}
The following is not the way you likely would use it but does demonstrate that the function returns the expected results:
@'
D:\this\is\where\it\starts
D:\this\is\where\it\starts\dir1\file1
D:\this\is\where\it\starts\dir1\file2
D:\this\is\where\it\starts\dir1\file3
D:\this\is\where\it\starts\dir1\dir2\file1
'@ -split "`n" | ForEach-Object { Remove-Parent $_ 'D:\This\Is' }
# Outputs
\where\it\starts
\where\it\starts\dir1\file1
\where\it\starts\dir1\file2
\where\it\starts\dir1\file3
\where\it\starts\dir1\dir2\file1
Just call the function with the current path ($_.fullname) and the "prefix" you are expecting to remove.
The function above is doing this strictly on 'length' but you could easily adapt it to match the actual string with either a string replace or a regex replace.
Function Remove-Parent {
param(
[string]$Path,
[string]$Parent
)
$remove = [regex]::Escape($Parent)
$Path -replace "^$remove"
}
The output was the same as above.

powershell multiple block expressions

I am replacing multiple strings in a file. The following works, but is it the best way to do it? I'm not sure if doing multiple block expressions is a good way.
(Get-Content $tmpFile1) |
ForEach-Object {$_ -replace 'replaceMe1.*', 'replacedString1'} |
% {$_ -replace 'replaceMe2.*', 'replacedString2'} |
% {$_ -replace 'replaceMe3.*', 'replacedString3'} |
Out-File $tmpFile2
You don't really need a ForEach-Object for each replace operation. Those operators can be chained in a single command:
@(Get-Content $tmpFile1) -replace 'replaceMe1.*', 'replacedString1' -replace 'replaceMe2.*', 'replacedString2' -replace 'replaceMe3.*', 'replacedString3' |
Out-File $tmpFile2
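Chaining works because -replace, applied to an array, returns a new array with the replacement done on each element, and the next -replace then operates on that result. A small sketch with made-up in-memory lines instead of a file:

```powershell
# Hypothetical sample lines standing in for the file content
$lines = 'replaceMe1 something', 'replaceMe2 something else', 'untouched line'

$result = $lines -replace 'replaceMe1.*', 'replacedString1' -replace 'replaceMe2.*', 'replacedString2'

$result   # replacedString1
          # replacedString2
          # untouched line
```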
I'm going to assume that your patterns and replacements don't really just have a digit on the end that is different, so you might want to execute different code based on which regex actually matched.
If so you can consider using a single regular expression but using a function instead of a replacement string. The only catch is you have to use the regex Replace method instead of the operator.
PS C:\temp> set-content -value @"
replaceMe1 something
replaceMe2 something else
replaceMe3 and another
"@ -path t.txt
PS C:\temp> Get-Content t.txt |
ForEach-Object { ([regex]'replaceMe([1-3])(.*)').Replace($_,
{ Param($m)
$head = switch($m.Groups[1]) { 1 {"First"}; 2 {"Second"}; 3 {"Third"} }
$tail = $m.Groups[2]
"Head: $head, Tail: $tail"
})}
Head: First, Tail: something
Head: Second, Tail: something else
Head: Third, Tail: and another
This may be overly complex for what you need today, but it is worth remembering you have the option to use a function.
The -replace operator uses regular expressions, so you can merge your three script blocks into one like this:
Get-Content $tmpFile1 `
| ForEach-Object { $_ -replace 'replaceMe([1-3]).*', 'replacedString$1' } `
| Out-File $tmpFile2
That will search for the literal text 'replaceMe' followed by a '1', '2', or '3' and replace it with 'replacedString' followed by whichever digit was found (the '$1').
Also, note that -replace works like -match, not -like; that is, it works with regular expressions, not wildcards. When you use 'replaceMe1.*' it doesn't mean "the text 'replaceMe1.' followed by zero or more characters" but rather "the text 'replaceMe1' followed by zero or more occurrences ('*') of any character ('.')". The following demonstrates text that will be replaced even though it wouldn't match with wildcards:
PS> 'replaceMe1_some_extra_text_with_no_period' -replace 'replaceMe1.*', 'replacedString1'
replacedString1
The wildcard pattern 'replaceMe1.*' would be written in regular expressions as 'replaceMe1\..*', which you'll see produces the expected result (no replacement performed):
PS> 'replaceMe1_some_extra_text_with_no_period' -replace 'replaceMe1\..*', 'replacedString1'
replaceMe1_some_extra_text_with_no_period
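The difference between the two pattern languages can also be seen directly with the -like (wildcard) and -match (regex) operators. A short sketch using the same test string; the variable names are illustrative.

```powershell
$text = 'replaceMe1_some_extra_text_with_no_period'

$asWildcard   = $text -like  'replaceMe1.*'     # False: wildcards need a literal '.'
$asRegex      = $text -match 'replaceMe1.*'     # True:  '.*' is "any characters"
$asEscapedDot = $text -match 'replaceMe1\..*'   # False: '\.' requires a literal '.'
```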
You can read more about regular expressions in the .NET Framework documentation.