find and replace special characters powershell - powershell

I have a method with an if statement that detects whether a string contains a special character. What I want to do now is find the position of each special character and replace it with _A
Some Examples
test# becomes test_A
I#hope#someone#knows#the#answer# becomes I_Ahope_Asomeone_Aknows_Athe_Aanswer_A
or if it has more than one special character
You?didnt#understand{my?Question# becomes You_Adidnt_Aunderstand_Amy_AQuestion_A
Would I have to loop through the whole string and when I reach that character change it to _A or is there a quicker way of doing this?

# is just a character like any other, you can use the -replace operator:
PS C:\>'I#hope#someone#knows#the#answer#' -replace '#','_A'
I_Ahope_Asomeone_Aknows_Athe_Aanswer_A
Regex is magic, you can define all the special cases you like (braces will have to be escaped):
PS C:\>'You?didnt#understand{my?Question#' -replace '[#?\{]','_A'
You_Adidnt_Aunderstand_Amy_AQuestion_A
So your function could look something like this:
function Replace-SpecialChars {
param($InputString)
$SpecialChars = '[#?\{\[\(\)\]\}]'
$Replacement = '_A'
$InputString -replace $SpecialChars,$Replacement
}
Replace-SpecialChars -InputString 'You?didnt#write{a]very[good?Question#'
If you are unsure of which characters to escape, have the regex class do it for you!
function Replace-SpecialChars {
param(
[string]$InputString,
[string]$Replacement = "_A",
[string]$SpecialChars = "#?()[]{}"
)
$rePattern = ($SpecialChars.ToCharArray() |ForEach-Object { [regex]::Escape($_) }) -join "|"
$InputString -replace $rePattern,$Replacement
}
Alternatively, you can use the .NET string method Replace():
'You?didnt#understand{my?Question#'.Replace('#','_A').Replace('?','_A').Replace('{','_A')
But I feel the regex way is more concise
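If the set of special characters is open-ended, a negated character class may be simpler than listing them. The following is only a sketch: it assumes that "special" means anything other than ASCII letters and digits, which the question does not state, and Replace-NonAlphanumeric is a hypothetical name.

```powershell
# Assumption: "special" means any character other than ASCII letters and digits.
function Replace-NonAlphanumeric {
    param(
        [string]$InputString,
        [string]$Replacement = '_A'
    )
    # [^a-zA-Z0-9] matches one character that is outside the listed ranges.
    $InputString -replace '[^a-zA-Z0-9]', $Replacement
}

Replace-NonAlphanumeric -InputString 'You?didnt#understand{my?Question#'
# You_Adidnt_Aunderstand_Amy_AQuestion_A
```

Note that spaces, dots and underscores would also be replaced under this assumption, so add them to the class if they should be kept.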

Related

Switch statement with RegEx - having trouble getting a switch statement not to match if string has a dot

Let's say I have a switch statement like so:
$NewName = "test.psd"
switch -RegEX ($NewName) {
"^\..*" { #if string starts with "." it means only change extension
'Entry Starts with a "."'
}
".*\..*" { # "." is in the middle , change both basename and extension
'Entry does not start with a "."'
}
"[^.]" { # if no "." at all, it means only change base name
'No "." present'
}
}
The first and second conditions work as expected, but the last one always triggers. It will trigger against:
$NewName = "test.psd"
$NewName = ".psd"
$NewName = "test"
Doesn't the regex "[^.]" mean: if there is a dot, don't match? Essentially, only trigger in the absence of a dot.
My expected outcome is for the last statement to trigger only if there is no dot present.
Any help on this would be welcome.
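"[^.]" matches any single character that is not a dot, so it succeeds as soon as the string contains at least one non-dot character. It would only fail if every character were ".". To require that no character is a dot, anchor the pattern and repeat the class over the whole string. See also Regex - Does not contain certain Characters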
'a.' -match '^[^.]+$'
False
'ab' -match '^[^.]+$'
True
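The dot is a special character in regular expressions and normally needs to be escaped when you want a literal dot. Inside a character class, however, it is already literal, so "[^\.]" behaves exactly like "[^.]" and will not fix the third case on its own. The pattern has to be anchored, e.g. "^[^.]+$".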
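Putting the anchored pattern '^[^.]+$' into the switch could look like the sketch below. The function name Get-RenameMode and the returned labels are made up for illustration; break is added so that only the first matching branch runs, which the original code did not do.

```powershell
# Hypothetical wrapper around the switch from the question.
function Get-RenameMode {
    param([string]$NewName)

    switch -Regex ($NewName) {
        '^\.'     { 'ExtensionOnly'; break }          # starts with a dot
        '^[^.]+$' { 'BaseNameOnly'; break }           # no dot anywhere
        '\.'      { 'BaseNameAndExtension'; break }   # dot somewhere after the first character
    }
}

Get-RenameMode -NewName '.psd'       # ExtensionOnly
Get-RenameMode -NewName 'test'       # BaseNameOnly
Get-RenameMode -NewName 'test.psd'   # BaseNameAndExtension
```

An empty string matches none of the branches and produces no output, so add a default branch if that case matters.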

Powershell - Remove text and capitalise some letters

Been scratching my head on this one...
I'd like to remove .com and capitalize S and T from: "sometext.com"
So output would be Some Text
Thank you in advance
For most of this you can use the replace() member of the String object.
The syntax is:
$string = $string.replace('what you want replaced', 'what you will replace it with')
Replace can be used to erase things by using blank quotes '' for the second argument. That's how you can get rid of .com
$string = $string.replace('.com','')
It can also be used to insert things. You can insert a space between some and text like this:
$string = $string.replace('et', 'e t')
Note that using replace does NOT change the original variable. The command below will print "that" to your screen, but the value of $string will still be "this"
$string = 'this'
$string.replace('this', 'that')
You have to set the variable to the new value with =
$string = "this"
$string = $string.replace("this", "that")
This command will change the value of $string to that.
The tricky part here comes in changing the first t to capital T without changing the last t. With strings, replace() replaces every instance of the text.
$string = "text"
$string = $string.replace('t', 'T')
This will set $string to TexT. To get around this, you can use Regex. Regex is a complex topic. Here just know that Regex objects look like strings, but their replace method works a little differently. You can add a number as a third argument to specify how many items to replace
$string = "aaaaaa"
[Regex]$reggie = 'a'
$string = $reggie.Replace($string,'A',3)
This code sets $string to AAAaaa.
So here's the final code to change sometext.com to Some Text.
$string = 'sometext.com'
#Use replace() to remove text.
$string = $string.Replace('.com','')
#Use replace() to change text
$string = $string.Replace('s','S')
#Use replace() to insert text.
$string = $string.Replace('et', 'e t')
#Use a Regex object to replace the first instance of a string.
[regex]$pattern = 't'
$string = $pattern.Replace($string, 'T', 1)
What you're trying to achieve isn't well-defined, but here's a concise PowerShell Core solution:
PsCore> 'sometext.com' -replace '\.com$' -replace '^s|t(?!$)', { $_.Value.ToUpper() }
SomeText
-replace '\.com$' removes a literal trailing .com from your input string.
-replace '^s|t(?!$)', { ... } matches an s char. at the start (^), and a t that is not (!) at the end ($); (?!...) is a so-called negative look-ahead assertion that looks ahead in the input string without including what it finds in the overall match.
Script block { $_.Value.ToUpper() } is called for each match, and converts the match to uppercase.
-replace (a.k.a -ireplace) is case-INsensitive by default; use -creplace for case-SENSITIVE replacements.
For more information about PowerShell's -replace operator see this answer.
Passing a script block ({ ... }) to dynamically determine the replacement string isn't supported in Windows PowerShell, so a Windows PowerShell solution requires direct use of the .NET [regex] class:
WinPs> [regex]::Replace('sometext.com' -replace '\.com$', '^s|t(?!$)', { param($m) $m.Value.ToUpper() })
SomeText

Validate against an array

I'm trying to setup a validation check against an array. I have the following
$ValidDomain = "*.com","*.co.uk"
$ForwardDomain = Read-Host "What domain do you want to forward to? e.g. contoso.com"
#while (!($ForwardDomain -contains $ValidDomain)) {
while (!($ValidDomain.Contains($ForwardDomain))) {
Write-Warning "$ForwardDomain isn't a valid domain name format. Please try again."
$ForwardDomain = Read-Host "What domain do you want to forward to? e.g. contoso.com"
}
The commented while line is just an alternative way I've been testing this.
If I enter, when prompted by Read-Host, "fjdkjfl.com" this displays the warning message rather than saying it's valid and keeps looping.
I have tried using -match instead of -contains but get the message:
parsing "*com *co.uk" - Quantifier {x,y} following nothing.
Have I got this completely wrong?
Contains() and -contains don't work the way you expect. Use a regular expression instead:
$ValidDomain = '\.(com|co\.uk)$'
$ForwardDomain = Read-Host ...
while ($ForwardDomain -notmatch $ValidDomain) {
...
}
You can construct $ValidDomain from a list of strings like this:
$domains = 'com', 'co.uk'
$ValidDomain = '\.({0})$' -f (($domains | ForEach-Object {[regex]::Escape($_)}) -join '|')
Regular expression breakdown:
.: The dot is a special character that matches any character except newlines. To match a literal dot you need the escape sequence \..
(...): Parentheses define a (capturing) group or subexpression.
|: The pipe defines an alternation (basically an "OR"). Alternations are typically put in grouping constructs to distinguish the alternation from the rest of the expression.
$: The dollar sign is a special character that matches the end of a string.
The {0} in the format string for the -f operator is not part of the regular expression, but a placeholder that defines where (and optionally how) the second argument of the operator is inserted into the format string.
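As a rough sketch, the pattern construction and the check can be wrapped in a small function. Test-DomainSuffix and its parameter names are hypothetical, and the default suffix list is just the two values from the question.

```powershell
# Hypothetical helper: returns $true when $Name ends in a dot followed by one of the allowed suffixes.
function Test-DomainSuffix {
    param(
        [string]$Name,
        [string[]]$Suffix = @('com', 'co.uk')
    )
    # Escape each suffix so that the dot in 'co.uk' is matched literally.
    $escaped = $Suffix | ForEach-Object { [regex]::Escape($_) }
    $pattern = '\.({0})$' -f ($escaped -join '|')
    $Name -match $pattern
}

Test-DomainSuffix -Name 'fjdkjfl.com'   # True
Test-DomainSuffix -Name 'example.org'   # False
```

The Read-Host loop from the question could then use while (-not (Test-DomainSuffix -Name $ForwardDomain)) { ... }. This only checks the suffix; it does not validate the rest of the domain name.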

Remove first and last three character of a word with powershell

I have a list of users in a text file whose names are in the following format: xn-tsai-01.
How do I script to remove the xn- KEEP THIS -01 so the output is like: tsai
I know how to do this in bash but not too familiar with powershell.
Thanks in advance!
Why not use the Substring method? If you will always trim the first three characters, you can do the following, assuming the variable is a string.
$string = 'xn-tsai-01'
$string.Substring(3)
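Substring(3) on its own only drops the prefix and leaves tsai-01. To drop the last three characters as well, the two-argument overload Substring(startIndex, length) can be used. This is a minimal sketch that assumes the prefix and suffix are always exactly three characters long; Remove-Ends is a hypothetical name, and returning an empty string for inputs that are too short is also an assumption.

```powershell
# Hypothetical helper: strips $Count characters from both ends of $Text.
function Remove-Ends {
    param(
        [string]$Text,
        [int]$Count = 3
    )
    # Guard against strings too short to trim; Substring would throw otherwise.
    if ($Text.Length -le 2 * $Count) { return '' }
    $Text.Substring($Count, $Text.Length - 2 * $Count)
}

Remove-Ends -Text 'xn-tsai-01'   # tsai
```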
Here is a quick way to do it using regex:
'xn-tsai-01' -replace '.*?-(.*)-.*','$1'
Example with a list:
(Get-Content list.txt) -Replace '.*?-(.*)-.*','$1'
You can use the .NET string method IndexOf("-") to find the first, and LastIndexOf("-") to find the last occurrence of "-" within the string.
Use these indexes with Substring() to remove the unnecessary parts:
function Clean-Username {
param($Name)
$FirstDash = $Name.IndexOf("-") + 1
$LastDash = $Name.LastIndexOf("-")
return $Name.Substring( $FirstDash, $LastDash - $FirstDash )
}
PS C:\> Clean-UserName -Name "xn-tsai-01"
tsai
Boe's example is probably going to be the most efficient.
Another way is to use the split() method if they're in a uniform format.
Get-Content .\list.txt | % { ($_.Split('-'))[1] }
% is an alias for ForEach-Object

powershell: remove text from end of string

--deleted earlier text - I asked the wrong question!
ahem....
What I have is $var = "\\unknowntext1\alwaysSame\unknowntext2"
I need to keep only "\\unknowntext1"
Try regular expressions:
$foo = 'something_of_unknown' -replace 'something.*','something'
Or if you know only partially the 'something', then e.g.
'something_of_unknown' -replace '(some[^_]*).*','$1'
'some_of_unknown' -replace '(some[^_]*).*','$1'
'somewhatever_of_unknown' -replace '(some[^_]*).*','$1'
The $1 is reference to group in parenthesis (the (some[^_]*) part).
Edit (after changed question):
If you use regex, then special characters need to be escaped:
"\\unknowntext1\alwaysSame\unknowntext2" -replace '\\\\unknowntext1.*', '\\unknowntext1'
or (another regex magic) use lookbehind like this:
"\\unknowntext1\alwaysSame\unknowntext2" -replace '(?<=\\\\unknowntext1).*', ''
(which is: take anything (.*), but only if \\unknowntext1 comes immediately before it ((?<=\\\\unknowntext1)), and replace it with an empty string.)
Edit (last)
If you know that there is something known in the middle (the alwaysSame), this might help:
"\\unknowntext1\alwaysSame\unknowntext2" -replace '(.*?)\\alwaysSame.*', '$1'
function Remove-TextAfter {
param (
[Parameter(Mandatory=$true)]
$string,
[Parameter(Mandatory=$true)]
$value,
[Switch]$Insensitive
)
$comparison = [System.StringComparison]"Ordinal"
if($Insensitive) {
$comparison = [System.StringComparison]"OrdinalIgnoreCase"
}
$position = $string.IndexOf($value, $comparison)
if($position -ge 0) {
$string.Substring(0, $position + $value.Length)
}
}
Remove-TextAfter "something_of_unknown" "SoMeThInG" -Insensitive
What I have is $var = "\\unknowntext1\alwaysSame\unknowntext2"
I need to keep only "\\unknowntext1"
Not sure this requires a regular expression. Assuming alwaysSame is literally always the same, as the discussion around stej's answer suggests, it seems by far the most straightforward way to accomplish this would be:
$var.substring(0, $var.indexOf("\alwaysSame"));
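One caveat with the one-liner: IndexOf returns -1 when the marker is not present, and Substring(0, -1) then throws. A guarded version might look like this sketch. Get-TextBefore is a hypothetical name, and returning the input unchanged when the marker is missing is an assumption about the desired behaviour.

```powershell
# Hypothetical helper: returns the part of $Text that precedes the first occurrence of $Marker.
function Get-TextBefore {
    param(
        [Parameter(Mandatory = $true)][string]$Text,
        [Parameter(Mandatory = $true)][string]$Marker
    )
    # Ordinal comparison: case-sensitive and culture-independent.
    $index = $Text.IndexOf($Marker, [System.StringComparison]::Ordinal)
    if ($index -lt 0) { return $Text }   # marker not found: leave the input as-is (assumption)
    $Text.Substring(0, $index)
}

Get-TextBefore -Text '\\unknowntext1\alwaysSame\unknowntext2' -Marker '\alwaysSame'
# \\unknowntext1
```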