I am trying to remove parts of a name "- xx_xx" from the end of multiple files. I'm using this and it works well.
dir | Rename-Item -NewName { $_.Name -replace " - xx_xx","" }
However, there are other parts like:
" - yy_yy"
" - zz_zz"
What can I do to remove all of these at once instead of running it again and again changing the part of the name I want removed?
Easiest way
You can keep on stringing -replace statements until the cows come home, if you need to.
$myLongFileName = "Something xx_xx yy_yy zz_zz" -replace "xx_xx","" -replace "yy_yy"
More Terse Syntax
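If every file has these, you can also make an array of the pieces you want to remove, separated by commas, and join them with | into a single regex alternation. (Passing the array itself as the pattern does not work in general: it is converted to one space-joined string, so it only matches when the pieces appear in exactly that order.)
$stuffWeDontWantInOurFile = @("xx_xx", "yy_yy", "zz_zz")
$myLongFileName -replace ($stuffWeDontWantInOurFile -join "|"), ""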
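Applied to the rename from the question, the list of pieces can be turned into one anchored alternation pattern. This is only a sketch: it assumes every suffix looks like " - xx_xx" and sits at the end of the base name, and the names $suffixes, $alternation, $pattern and the sample names are made up for illustration.

```powershell
# Hypothetical list of suffixes to strip (without the leading " - ").
$suffixes = @('xx_xx', 'yy_yy', 'zz_zz')

# Escape each piece so it is matched literally, then join with | (regex "or").
$alternation = ($suffixes | ForEach-Object { [regex]::Escape($_) }) -join '|'

# Optional whitespace, a dash, optional whitespace, one of the suffixes, end of string.
$pattern = "\s*-\s*($alternation)$"

# Sample base names standing in for real files.
$names = @('report - xx_xx', 'notes - yy_yy', 'photo - zz_zz', 'untouched')
$cleaned = $names -replace $pattern, ''
$cleaned
```

With real files, the same $pattern could presumably be used as Rename-Item -NewName { ($_.BaseName -replace $pattern) + $_.Extension } -WhatIf, so the extension is left alone and nothing is renamed until -WhatIf is removed.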
Yet another way
If your file elements are separated by spaces or dashes or something predictable, you can split the file name on that.
$myLongFileName = "Something xx_xx yy_yy zz_zz"
PS> $myLongFileName.Split()
Something
xx_xx
yy_yy
zz_zz
PS> $myLongFileName.Split()[0] #select just the first piece
Something
For spaces, you use the Split() method with no arguments.
If it were dashes or another character, you'd provide it like so: Split("-"). Between these techniques, you should be able to do what you want to do.
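For the " - xx_xx" naming from the question, splitting on the dash might look like the sketch below. It assumes the part you want to keep never contains a dash itself; $baseName and the two result variables are illustrative names.

```powershell
# Made-up base name in the style of the question.
$baseName = 'Something - xx_xx'

# .Split('-') splits on the dash only, so the leftover space must be trimmed.
$viaMethod = $baseName.Split('-')[0].Trim()

# The -split operator takes a regex, so the surrounding spaces can be
# made part of the separator instead.
$viaOperator = ($baseName -split '\s*-\s*')[0]

$viaMethod
$viaOperator
```

Both lines should print Something. If the kept part can contain dashes (for example "my-file - xx_xx"), this approach cuts too early, and an anchored -replace is the safer choice.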
If as you say, the pattern - xx_xx is always at the end of the file name, I'd suggest using something like this:
Get-ChildItem -Path '<TheFolderWhereTheFilesAre>' -File |
Rename-Item -NewName {
'{0}{1}' -f ($_.BaseName -replace '\s*-\s*.._..$'), $_.Extension
} -WhatIf
Remove the -WhatIf switch once you are satisfied with the results shown in the console.
Result:
D:\Test\blah - xx_yy.txt --> D:\Test\blah.txt
D:\Test\somefile - zy_xa.txt --> D:\Test\somefile.txt
Regex details:
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
- Match the character “-” literally
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
. Match any single character that is not a line break character
. Match any single character that is not a line break character
_ Match the character “_” literally
. Match any single character that is not a line break character
. Match any single character that is not a line break character
$ Assert position at the end of the string (or before the line break at the end of the string, if any)
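Before touching any files, the pattern can be tried on plain strings. The sketch below uses made-up base names; note that because . matches any character, a suffix such as " - 12_34" is removed as well.

```powershell
# Made-up base names (no extension), standing in for $_.BaseName.
$baseNames = @('blah - xx_yy', 'somefile - zy_xa', 'x - 12_34', 'keep-me')

# Same pattern as above; with no replacement operand the match is simply removed.
$result = $baseNames -replace '\s*-\s*.._..$'
$result
```

The last name is left unchanged because it does not end in two characters, an underscore, and two more characters.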
Related
I have text that prints out like this:
mdbAppText_Arr: [0]: The cover is open. {goes to next line here}
Please close the cover. and [1] Backprinter cover open
46
I tried getting rid of the newline after open., and it's still there. Any idea of a better way or fix for what I'm doing? I need to get rid of the newline because it's going to a csv file, and messing up formatting (going to newline there).
This is my code:
$mdbAppText_Arr = $mdbAppText.Split("|")
$mdbAppText_Arr[0].replace("`r",";").replace("`n",";").replace("`t",";").replace("&",";")
#replace newline/carriage return/tab with semicolon
if($alarmIdDef -eq "12-7")
{
Write-Host "mdbAppText_Arr: [0]: $($mdbAppText_Arr[0]) and [1] $($mdbAppText_Arr[1]) "
[byte] $mdbAppText_Arr[0][31]
}
I've been looking at:
replace
replace - this one has a link to an ASCII table, but it's unclear to me which column of the table holds the byte equivalent.
I'm using PowerShell 5.1.
-replace is a regex operator, so you need to supply a valid regular expression pattern as the right-hand side operand.
You can replace most newline sequences with a pattern describing a substring consisting of:
an optional carriage return (\r? in regex), followed by
a (non-optional) newline character (\n in regex):
$mdbAppText_Arr = $mdbAppText_Arr -replace '\r?\n'
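Note that the assignment matters: .NET strings are immutable, so both .Replace() and -replace return a new string and leave the original alone. A small sketch with made-up sample text; replacing the line break with a space rather than removing it is my assumption about what suits the CSV output.

```powershell
# Sample text with a Windows-style line break in the middle (made-up data).
$text = "The cover is open.`r`nPlease close the cover."

# Calling .Replace() without capturing the result changes nothing in $text.
$null = $text.Replace("`r`n", ' ')
$stillHasNewline = $text.Contains("`n")

# Capturing the result of -replace gives the cleaned-up string.
$oneLine = $text -replace '\r?\n', ' '
$oneLine
```

Here $stillHasNewline is True, while $oneLine holds the text on a single line.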
I have a HTML file with a load of links in it.
They are in the format
http:/oldsite/showFile.asp?doc=1234&lib=lib1
I'd like to replace them with
http://newsite/?lib=lib1&doc=1234
(1234 and lib1 are variable)
Any idea on how to do that?
Thanks
P
I don't think your examples are correct.
http:/oldsite/showFile.asp?doc=1234&lib=lib1 should be
http:/oldsite/showFile.asp?doc=1234&lib=lib1
and
http://newsite/?lib=lib1&doc=1234 should be http://newsite?lib=lib1&doc=1234
To do the replacement on these, you can do
'http:/oldsite/showFile.asp?doc=1234&lib=lib1' -replace 'http:/oldsite/showFile\.asp\?(doc=\d+)&(lib=\w+)', 'http://newsite?$2&$1'
which returns http://newsite?lib=lib1&doc=1234
To replace these in a file you can use:
(Get-Content -Path 'X:\TheHtmlFile.html' -Raw) -replace 'http:/oldsite/showFile\.asp\?(doc=\d+)&(lib=\w+)', 'http://newsite?$2&$1' |
Set-Content -Path 'X:\TheNewHtmlFile.html'
Regex details:
http:/oldsite/showFile Match the characters “http:/oldsite/showFile” literally
\. Match the character “.” literally
asp Match the characters “asp” literally
\? Match the character “?” literally
( Match the regular expression below and capture its match into backreference number 1
doc= Match the characters “doc=” literally
\d Match a single digit 0..9
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
& Match the character “&” literally
( Match the regular expression below and capture its match into backreference number 2
lib= Match the characters “lib=” literally
\w Match a single character that is a “word character” (letters, digits, etc.)
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
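As a variation on the same idea (not the pattern used above), named groups can make the replacement string easier to read. This sketch captures only the values, so the parameter names are written out in the replacement; the second link and the variable names are invented for illustration.

```powershell
# Two made-up links in the old format.
$links = @(
    'http:/oldsite/showFile.asp?doc=1234&lib=lib1'
    'http:/oldsite/showFile.asp?doc=77&lib=archive'
)

# (?<name>...) captures into a named group; ${name} refers to it in the replacement.
# The replacement is single-quoted so PowerShell does not expand ${lib} itself.
$pattern = 'http:/oldsite/showFile\.asp\?doc=(?<doc>\d+)&lib=(?<lib>\w+)'
$updated = $links -replace $pattern, 'http://newsite?lib=${lib}&doc=${doc}'
$updated
```

The single quotes around the replacement are important: in a double-quoted string PowerShell would try to expand ${lib} and ${doc} as its own variables before the regex engine ever saw them.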
Read in the file, loop through each line, replace the old value with the new value, and send the output to a new file:
gc file.html | % { $_.Replace('oldsite...','newsite...') } | out-file new-file.html
Need to replace strings after pattern matching. Using powershell v4.
Log line is -
"08:02:37.961" level="DEBUG" "Outbound message: [32056][Sent: HTTP]" threadId="40744"
Need to remove level and threadId completely. Expected line is -
"08:02:37.961" "Outbound message: [32056][Sent: HTTP]"
Have already tried following but did not work -
$line.Replace('level="\w+"','')
AND
$line.Replace('threadId="\d+"','')
Help needed with correct replace command. Thanks.
Try this regex:
$line = '"08:02:37.961" level="DEBUG" "Outbound message: [32056][Sent: HTTP]" threadId="40744"'
$line -replace '(\s*(level|threadId)="[^"]+")'
Result:
"08:02:37.961" "Outbound message: [32056][Sent: HTTP]"
Regex details:
( # Match the regular expression below and capture its match into backreference number 1
\s # Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
( # Match the regular expression below and capture its match into backreference number 2
# Match either the regular expression below (attempting the next alternative only if this one fails)
level # Match the characters “level” literally
| # Or match regular expression number 2 below (the entire group fails if this one fails to match)
threadId # Match the characters “threadId” literally
)
=" # Match the characters “="” literally
[^"] # Match any character that is NOT a “"”
+ # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
" # Match the character “"” literally
)
The .Replace() method doesn't use regex (see https://learn.microsoft.com/en-us/dotnet/api/system.string.replace?view=netframework-4.8); the -replace operator does.
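The difference is easy to see side by side. In this sketch the log line is the one from the question; the variable names are illustrative.

```powershell
# Single quotes keep the embedded double quotes literal.
$line = '"08:02:37.961" level="DEBUG" "Outbound message: [32056][Sent: HTTP]" threadId="40744"'

# .Replace() looks for the literal text level="\w+", which is not in the line,
# so the string comes back unchanged.
$literal = $line.Replace('level="\w+"', '')

# -replace treats the same text as a regex, so \w+ matches DEBUG.
$viaRegex = $line -replace '\s*level="\w+"'

$literal -ceq $line   # True: nothing was replaced
$viaRegex
```

After this, $viaRegex no longer contains the level attribute, while threadId is still present because this particular pattern only targets level.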
I have a list of emails, for example:
johnsmith at gmail dot com
username at gmail.com
random atsign outlook dot com
The username and the provider is always separated by a custom word between spaces.
The problem here is that the domain can have a custom separator like this (dot, or any text) OR just a dot, like gmail.com
If it would have only spaces, I would simply read the lines and split them at the spaces, then write the first, @, the third, . and then the fifth items from the list.
However, the possible john at gmail.com format is problematic for me. How could I handle this format along with the simple name at gmail dot com formats in one script?
For the examples you give, a bit of regex will do it:
$emails = @"
johnsmith at gmail dot com
username at gmail.com
random atsign outlook dot com
"@ -split '\r?\n'
$emails | ForEach-Object {
    # replace all repeating whitespace characters by a single space
    # and split into 3 parts
    $pieces = $_ -replace '\s+', ' ' -split ' ', 3
    # output the username, followed by the '@' sign, followed by the domain
    '{0}@{1}' -f $pieces[0], ($pieces[2] -replace ' [^\.]+ ', '.')
}
Output:
johnsmith@gmail.com
username@gmail.com
random@outlook.com
Regex details for the domain part:
\ Match the character “ ” literally
[^\.] Match any character that is NOT a “.” character
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
\ Match the character “ ” literally
A PowerShell v6.1+ solution, which uses the ability of the -replace operator to accept a script block ({ ... }) to process each match.
For a solution that also works in Windows PowerShell, see Theo's helpful answer.
# Simulate an array of input lines.
$emails = @'
johnsmith at gmail dot com
username at gmail.com
random atsign outlook dot com
'@ -split '\r?\n'
# Synthesize a valid email address from each line.
# (If the lines came from file, say, 'emails.txt', replace `$emails`
# with `(Get-Content emails.txt)`)
$emails -replace '^([^ ]+) \w+ ([^ ]+|[^ ]+ [^ ]+ [^ ]+)$',
  { '{0}@{1}' -f $_.Groups[1].Value, ($_.Groups[2].Value -replace ' [^ ]+ ', '.') }
Note:
I've assumed that the tokens in your input line are separated by exactly one space char.; to support multiple spaces as well, replace each separating space in the regex with \s+.
[^ ]+ is a nonempty (+) run of non-space ([^ ]) characters; loosely speaking, a word.
The regex matches each line in full, capturing the parts of interest via capture groups ((...))
The script block ({ ... }) receives the match at hand in automatic variable $_, as a Match instance, from which the capture groups can be extracted via .Groups[<n>].Value, starting with index 1.
The above yields:
johnsmith@gmail.com
username@gmail.com
random@outlook.com
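If the per-match script block is wanted in Windows PowerShell too, one option is to call the .NET [regex]::Replace() method directly, since PowerShell converts a script block to the MatchEvaluator delegate that method expects. The following is a sketch of that alternative, not the code from either answer; $pattern is copied from above and the other variable names are illustrative.

```powershell
# Same sample lines as above, written as an array literal.
$emails = @(
    'johnsmith at gmail dot com'
    'username at gmail.com'
    'random atsign outlook dot com'
)

$pattern = '^([^ ]+) \w+ ([^ ]+|[^ ]+ [^ ]+ [^ ]+)$'

$fixed = foreach ($line in $emails) {
    # The script block is called once per match and receives the Match object.
    [regex]::Replace($line, $pattern, {
        param($match)
        # Turn "gmail dot com" into "gmail.com"; "gmail.com" is left as-is.
        $domain = $match.Groups[2].Value -replace ' [^ ]+ ', '.'
        '{0}@{1}' -f $match.Groups[1].Value, $domain
    })
}
$fixed
```

Unlike the operator form, the match arrives as an explicit parameter (here $match) rather than in $_.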
I'm using the .Replace() function to replace line feeds in the file I'm working on with a carriage return and a line feed but I would also like to match any number of spaces preceding the line feed. Can this be done in the same operation using a regular expression?
I've tried various combinations of "\s +*" but none have worked, except with a fixed number of manually typed spaces.
This version works for the one space case:
.Replace(" `n","`r`n")
For example, a file like this:
...end of line one\n
...end of line two \n
would look like:
...end of line one\r\n
...end of line two\r\n
The .Replace() method of the .NET [string] type performs literal string replacements.
By contrast, PowerShell's -replace operator is based on regexes (regular expressions), so it allows you to match a variable number of spaces (including none) with *:
"...end of line two `n" -replace ' *\n', "`r`n"
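For a whole multi-line string, the same idea might look like the sketch below. The optional \r? is my addition, on the assumption that some lines may already end in a carriage return plus line feed; without it those lines would end up with a doubled carriage return. The sample text is made up.

```powershell
# Made-up input: one clean line, one with trailing spaces, one already CRLF.
$text = "...end of line one`n...end of line two   `n...end of line three `r`n"

# Any run of spaces, an optional CR, then LF becomes exactly CR LF.
$normalized = $text -replace ' *\r?\n', "`r`n"

# Show the result with the control characters made visible.
$normalized.Replace("`r", '\r').Replace("`n", '\n')
```

The last line uses the literal .Replace() method on purpose: it only makes the invisible characters readable for display and needs no regex.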