Replace text in file names within a folder using PowerShell

I have a folder that contains files like 'goodthing 2007adsdfff.pdf', 'betterthing 2007adfdsw.pdf', and 'bestthing_2007fdsfad.pdf'. I want to rename each one, removing everything from " 2007" or "_2007" to the end of the name while keeping .pdf, to get this result: 'goodthing.pdf', 'betterthing.pdf', 'bestthing.pdf'. I've tried this with the "_2007", but haven't figured out a conditional to also handle the " 2007". Any advice on how to accomplish this is greatly appreciated.
Get-ChildItem 'C:\Temp\' -Name -Filter *.pdf | foreach { $_.Split("_2007")[0].substring(0)}

Try the following:
Get-ChildItem 'C:\Temp' -Filter *.pdf |
Rename-Item -NewName { $_.Name -replace '[_ ][^.]+' } -WhatIf
Note: The -WhatIf common parameter in the command above previews the operation. Remove -WhatIf once you're sure the operation will do what you want.
The above uses Rename-Item with a delay-bind script block and the -replace operator as follows:
Regex [_ ][^.]+ matches everything from the first space or _ char. (character set [_ ]) up to, but not including, the following literal . char. ([^.]+ matches one or more chars. other than (^) a literal .) - that is, everything from the first space or _ through to the filename extension (excluding the .).
Note: To guard against file names such as _2007.pdf matching (which would result in just .pdf as the new name), use the following regex instead: '(?<=.)[_ ][^.]+'
By not providing a replacement operand to -replace, what is matched is replaced with the empty string and therefore effectively removed.
The net effect is that input files named
'goodthing 2007adsdfff.pdf', 'betterthing 2007adfdsw.pdf', 'bestthing_2007fdsfad.pdf'
are renamed to
'goodthing.pdf', 'betterthing.pdf', 'bestthing.pdf'
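The effect of both regexes can be checked on plain strings before touching any files. The following is a minimal sketch that uses the sample names from the question plus the edge case '_2007.pdf'; the variable names are only illustrative.

```powershell
# Sample names from the question, plus an edge case that starts with '_'.
$names = 'goodthing 2007adsdfff.pdf', 'betterthing 2007adfdsw.pdf',
         'bestthing_2007fdsfad.pdf', '_2007.pdf'

# Plain pattern: remove everything from the first space or underscore
# up to (but not including) the extension dot.
$plain = $names | ForEach-Object { $_ -replace '[_ ][^.]+' }

# Guarded pattern: the look-behind (?<=.) requires at least one character
# before the separator, so a name that starts with '_' is left alone.
$guarded = $names | ForEach-Object { $_ -replace '(?<=.)[_ ][^.]+' }

$plain    # goodthing.pdf, betterthing.pdf, bestthing.pdf, .pdf
$guarded  # goodthing.pdf, betterthing.pdf, bestthing.pdf, _2007.pdf
```

Because -replace returns the input unchanged when nothing matches, the guarded variant simply passes '_2007.pdf' through.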

Without knowing the names of all the potential files, I can offer this solution, which works for all of the sample names:
PS> $flist = ("goodthing 2007adsdfff.pdf","betterthing 2007adfdsw.pdf","bestthing_2007fdsfad.pdf")
PS> foreach ($f in $flist) {$nicename = ($f -replace "([\w\s]+)2007.*(\.\w+)", '$1$2') -replace "[\s_]\.","." ;$nicename}
goodthing.pdf
betterthing.pdf
bestthing.pdf
Two challenges:
The underscore is part of the \w character class, so [\w\s]+ also captures an underscore that precedes 2007. The alternatives were to complicate the regex or to assume that there will always be exactly one '_' before the 2007. Both seemed risky to me, so a second -replace removes the leftover separator instead.
If there are spaces in filenames, there is no telling whether you might encounter more than one. This solution removes only the one right before 2007.
The magic:
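The -replace operator lets you capture text with () and reuse it in the replacement string through references such as $1 and $2. Put the replacement string in single quotes so that PowerShell does not try to expand $1 as a variable. If you have more complex captures, the groups are numbered from left to right by the position of their opening parenthesis.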
Hope this helps.
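As an illustration of capture groups, here is a sketch of a single-pass variant. It is not the regex used in the answer above: it assumes that the separator before 2007 is at most one space or underscore, and it uses a lazy first group so that the separator is not captured.

```powershell
# Group 1: the name before the optional separator (lazy, so it stops early).
# Group 2: the extension, anchored to the end of the string.
$pattern = '^(.+?)[ _]?2007.*(\.\w+)$'

$flist = 'goodthing 2007adsdfff.pdf', 'betterthing 2007adfdsw.pdf', 'bestthing_2007fdsfad.pdf'

# Single quotes keep PowerShell from expanding $1 and $2 as variables.
$nicenames = $flist | ForEach-Object { $_ -replace $pattern, '$1$2' }

$nicenames  # goodthing.pdf, betterthing.pdf, bestthing.pdf
```

Names that do not contain 2007 do not match the pattern and are returned unchanged.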

Related

Remove "." in file name while retaining file extension

I am trying to use PS to rename a bunch of files within a big share and one of the requirements is to remove a dot from the file name. I have tested a few things with my rather basic skills and of course the most basic of scripts zap the file extension.
I finally came up with something like this:
gci *.xlsx | rename-item -newname {$_.Name.replace(".","") + $_.extension }
But that leaves the extension text at the end of the file name in addition to the real extension (for example, 'a.b.xlsx' becomes 'abxlsx.xlsx').
I thought I could zap the last four symbols using something like this:
gci *.xlsx | rename-item -newname { $_.basename.substring(0,$_.basename.length-4) + $_.extension }
Overall this seems like an overly complicated operation which could also mess up files without dots (unless I specify xlsx as only 4 symbols to be removed)
Would anyone be able to point me in the right direction to an easier solution? ;-)
You were on the right track with your second attempt: using the .BaseName and .Extension properties of the [System.IO.FileInfo] instances[1] output by Get-ChildItem allows you to modify the base name (the file name without its extension) separately, and then re-append the extension to form the full file name:
Get-ChildItem *.xlsx |
Rename-Item -NewName { ($_.BaseName -replace '\.') + $_.Extension } -WhatIf
Note: The -WhatIf common parameter in the command above previews the operation. Remove -WhatIf once you're sure the operation will do what you want.
The above uses the regex-based -replace operator to remove all . instances from the base name; because . is a regex metacharacter (representing any single character), it must be escaped as \. in order to be used literally.
In this simple case, you could have used the [string] type's .Replace() method as well ($_.BaseName.Replace('.', '')), but -replace offers more features and has fewer surprises - see this answer for more information.
Case in point: Say you wanted to remove only the first . from the base name; -replace allows you to do that as follows (but you couldn't do it with .Replace()):
'foo.bar.baz' -replace '\.(.*)$', '$1' # -> 'foobar.baz'
[1] .BaseName isn't a native property of this type; instead, it is a property that PowerShell decorates instances of the type with, using its ETS (Extended Type System).
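The base-name/extension split can also be tried on a plain string. This sketch uses the .NET [System.IO.Path] methods as stand-ins for the .BaseName and .Extension properties, and the file name is a made-up example.

```powershell
# Hypothetical file name with extra dots in the base name.
$name = 'report.v1.final.xlsx'

$base = [System.IO.Path]::GetFileNameWithoutExtension($name)  # report.v1.final
$ext  = [System.IO.Path]::GetExtension($name)                 # .xlsx

# Remove the dots from the base name only, then re-append the extension.
$fixed = ($base -replace '\.') + $ext

# For comparison: stripping dots from the full name first leaves the
# extension text behind, which is what the first attempt in the question did.
$firstAttempt = $name.Replace('.', '') + $ext

$fixed         # reportv1final.xlsx
$firstAttempt  # reportv1finalxlsx.xlsx
```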

Appending string to the end of all file names in PowerShell

I have files that look like
data.svg
map.svg
application.svg
...
*.svg
I am trying to add the string -b to the end of all file names by using the PowerShell rename command like
D:\icons> Dir | Rename-Item -NewName {$_.name -replace ".","-b."}
to get these
data-b.svg
map-b.svg
application-b.svg
but this is not changing anything. How can I achieve this?
PowerShell's -replace operator is based on regular expressions. And since . is a wildcard in regex, what should be happening is that each character in the file name is being replaced with the replacement string. So test.txt would become -b.-b.-b.-b.-b.-b.-b.-b. in your example.
You likely want to use the Replace method of the .NET String type like this instead.
dir | Rename-Item -NewName { $_.Name.Replace('.','-b.') }
If you want to keep using -replace, you need to escape the . in your expression like this.
dir | Rename-Item -NewName { $_.Name -replace '\.','-b.' }
Both of these have a couple edge case problems that you may want to avoid. The first is narrowing the scope of your dir (which is just an alias for Get-ChildItem) to avoid including files or directories you don't actually want to rename. The second is that a simple replace in the file name doesn't account for file names that contain multiple dots. So you may want to ultimately do something like this if you only care about SVG files that may have multiple dots.
Get-ChildItem *.svg -File | Rename-Item -NewName { "$($_.BaseName)-b$($_.Extension)" }
The -replace operator uses regex. Therefore your . needs to be escaped, otherwise it just stands for any character. I would generally make sure to be as specific as possible when writing regexes. The following is a possible solution:
Get-ChildItem *.svg | Rename-Item -NewName { $_.Name -replace '\.svg$','-b.svg' }
The $ anchors the expression to the end of the string which makes sure it only changes the extension and not any other text inside the file names.
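The difference between the approaches is easy to see on plain strings. This is a small sketch; 'my.map.svg' is a made-up name that stands for a file with more than one dot.

```powershell
# Unescaped dot: every one of the 7 characters is replaced by '-b.'.
$unescaped = 'map.svg' -replace '.', '-b.'

# Escaped dot and the literal .Replace() method give the same result here.
$escaped = 'map.svg' -replace '\.', '-b.'   # map-b.svg
$literal = 'map.svg'.Replace('.', '-b.')    # map-b.svg

# With several dots, only the anchored pattern touches just the extension.
$everyDot = 'my.map.svg' -replace '\.', '-b.'          # my-b.map-b.svg
$anchored = 'my.map.svg' -replace '\.svg$', '-b.svg'   # my.map-b.svg
```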

How to bulk rename files in folder such that all characters BEFORE and including "_" is removed

I've been searching online for some help on this but can't seem to find the right answer.
Everything I've come across so far helps with renaming files in batch, but only such that the files are renamed by trimming all characters AFTER a special character (in my case it's "_"). I would actually like to know how to rename all files in a folder such that I trim all characters BEFORE (and including) the underscore.
Example: I have "AB CD_2019481-1" and want the name to be "2019481-1"
I would be open to using Powershell or CMD!
Thanks in advance for any help.
If you know that there is one and only one underscore in all of the file names, you can do a -split on the underscore character, then take the right side of the split.
$Filename = 'AB CD_2019481-1'
$NewFilename = ($Filename -split '_')[1]
The -split '_' splits the string into an array based on the delimiter, underscore. Then the [1] retrieves the 2nd element from the left, which should be the right-hand side of the filename.
Try this out. With the -whatif, it's harmless. It should do what you ask. If your filename has more than one underscore, it may not do what you want. You can pipe get-item or get-childitem to it.
get-item 'AB CD_2019481-1' |
rename-item -newname { $_.Name -replace '.*_' } -whatif
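When a name contains more than one underscore, the two approaches above behave differently. The following sketch shows the variants on a made-up name with two underscores; which result is wanted depends on the files at hand.

```powershell
# Hypothetical name with two underscores.
$name = 'AB_CD_2019481-1'

$second     = ($name -split '_')[1]      # CD            (second piece only)
$afterFirst = ($name -split '_', 2)[1]   # CD_2019481-1  (split into at most 2 pieces)
$last       = ($name -split '_')[-1]     # 2019481-1     (last piece)

# Regex equivalents: greedy .* runs to the last underscore,
# while [^_]* stops at the first one.
$greedy = $name -replace '.*_'           # 2019481-1
$first  = $name -replace '^[^_]*_'       # CD_2019481-1
```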

Removing Parts of a File Name based on a Delimiter

I have various file names that are categorized in two different ways. They either just have a code like: "866655" or contain a suffix and prefix "eu_866655_001". My hope is to write to a text file the names of files in the same format. I cannot figure out a successful method for removing the suffix and prefix.
Currently this is what I have in my loop in PowerShell:
$docs = Get-ChildItem -Path $source | Where-Object {$_.Name -match '.doc*'}
if ($docs.basename -contains 'eu_*')
{
Write-Output ([io.fileinfo]"$doc").basename.split("_")
}
I'm hoping to turn "eu_866655_001" into "866655" by using "_" as the delimiter.
I'm aware that the answer is staring me down but I still can't seem to figure it out.
You could do something like the following. Feel free to tweak the -Filter on the Get-ChildItem command.
$source = 'c:\path\*'
$docs = Get-ChildItem -Path $source -File -Filter "*_*_*" -Include '*.doc','*.docx'
$docs | Rename-Item -NewName { "{0}{1}" -f $_.Basename.Split('_')[1],$_.Extension }
The important thing to remember is that, in order to use the -Include parameter here, you need an * at the end of the -Path value.
Explanation:
-Filter allows us to filter on names that contain two underscores separating three substrings.
-Include allows us to only list files ending in extensions .docx and .doc.
Rename-Item -NewName supports delay-bind script blocks. This allows us to use a script block to perform any necessary operations for each piped object (each file).
Since the target files will always have two underscores, the .Split('_') method returns a three-element array of the substrings between the _ characters. You have specified that you always want the second substring, which is at index 1 ([1]).
The format operator (-f) puts the substring and extension together, completing the file name.
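The split-and-format step can be tried on plain strings. The sketch below also keeps names that contain no underscores as they are, which is an assumption based on the two name styles described in the question; 'us_123456_002' is a made-up name.

```powershell
# Base names in both styles from the question, plus one hypothetical name.
$baseNames = '866655', 'eu_866655_001', 'us_123456_002'

$codes = foreach ($base in $baseNames) {
    $parts = $base.Split('_')
    # Three parts means prefix_code_suffix; anything else is kept unchanged.
    if ($parts.Count -eq 3) { $parts[1] } else { $base }
}

# Combine the code with an extension, as in the Rename-Item call above.
$newName = '{0}{1}' -f 'eu_866655_001'.Split('_')[1], '.docx'

$codes    # 866655, 866655, 123456
$newName  # 866655.docx
```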

Rename files with Powershell if file has certain structure

I am trying to rename files in multiple folders with the same name structure. I have the following files:
(1).txt
(2).txt
(3).txt
I want to add the following text in front of each name: "Subject is missing".
I only want to rename these files; all others should remain the same.
Tip of the hat to LotPings for suggesting the use of a look-ahead assertion in the regex.
Get-ChildItem -File | Rename-Item -NewName {
$_.Name -replace '^(?=\(\d+\)\.)', 'Subject is missing '
} -WhatIf
-WhatIf previews the renaming operation; remove it to perform actual renaming.
Get-ChildItem -File enumerates files only, but without a name filter - while you could try to apply a wildcard-based filter up front - e.g., -Filter '([0-9]).*' - you couldn't ensure that multi-digit names (e.g., (13).txt) are properly matched.
You can, however, pre-filter the results, with -Filter '(*).*'
The Rename-Item call uses a delay-bind script block to derive the new name.
It takes advantage of the fact that (a) -replace returns the input string unmodified if the regex doesn't match, (b) Rename-Item does nothing if the new filename is the same as the old.
In the regex passed to -replace, the positive look-ahead assertion (?=...) (which is matched at the start of the input string (^)) looks for a match for subexpression \(\d+\)\. without considering what it matches to be part of what should be replaced. In effect, only the start position (^) of an input string is matched and "replaced".
Subexpression \(\d+\)\. matches a literal ( (escaped as \(), followed by 1 or more (+) digits (\d), followed by a literal ) and a literal . (\.), which marks the start of the filename extension. (Replace \. with $, the end-of-input assertion, if you want to match filenames that have no extension.)
Therefore, replacement operand 'Subject is missing ' is effectively prepended to the input string so that, e.g., (1).txt returns Subject is missing (1).txt.
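The look-ahead behavior can be verified on plain strings first. This is a small sketch in which every name other than (1).txt is a made-up example.

```powershell
# Names that should and should not receive the prefix.
$names = '(1).txt', '(13).txt', 'notes (2).txt', 'report.txt'

# The look-ahead only asserts that "(digits)." follows the start of the
# string, so the match itself is empty and the prefix is inserted in front.
$renamed = $names | ForEach-Object {
    $_ -replace '^(?=\(\d+\)\.)', 'Subject is missing '
}

$renamed
# Subject is missing (1).txt
# Subject is missing (13).txt
# notes (2).txt
# report.txt
```

Names that do not start with a parenthesized number come back unchanged, which is what lets Rename-Item skip them.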