PowerShell - Add extension to files containing dots

I have a script that adds the extension .xml to files without an extension.
Get-ChildItem "C:\TEST" -file "*." | rename-item -NewName {"{0}.xml" -f $_.fullname}
This works perfectly for a file such as:
BC Logistics SPA con Socio Duo - 2022-03-31 - FT 123456VESE
However, it does not work for a file such as:
A.B.C. Mini MAGAZZINI - 2022-02-25 - FT MM9 000000123
This is because of the dots. The directory also contains files that already have .xml as extension, as well as .pdf-files.
How can I add the extension .xml to files without an extension but with dots in them?
Excluding .pdf and .xml files is not an option as the directory also contains other files that are deleted in the process.

The challenge is that nowadays filename extensions are a convention, and that most APIs, including .NET's, consider anything after the last ., if any, to be the extension, even if it isn't meant to be one. E.g., in A.B.C. Mini, . Mini (sic) is considered the extension, whereas -Filter *. in effect only matches names that contain no . at all.
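To see this behavior without touching any files, a minimal sketch can call .NET's [System.IO.Path]::GetExtension() directly on the two sample names from the question. The variable names are illustrative only.

```powershell
# Sample names taken from the question.
$plain  = 'BC Logistics SPA con Socio Duo - 2022-03-31 - FT 123456VESE'
$dotted = 'A.B.C. Mini MAGAZZINI - 2022-02-25 - FT MM9 000000123'

# No "." at all: the reported extension is the empty string.
$extPlain = [System.IO.Path]::GetExtension($plain)

# Everything from the last "." onward is reported as the extension.
$extDotted = [System.IO.Path]::GetExtension($dotted)

"[$extPlain]"    # []
"[$extDotted]"   # [. Mini MAGAZZINI - 2022-02-25 - FT MM9 000000123]
```

The brackets are only there to make the empty string and the leading ". " visible in the output.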
If you're willing to assume that any file that doesn't end in . followed by 3 characters (e.g., .pdf) is an extension-less file, you can use the following:
# Note: Works with 3-character extensions only; doesn't limit what those chars
# may be.
Get-ChildItem -File C:\TEST |
Where-Object Extension -NotLike '.???' |
Rename-Item -NewName { '{0}.xml' -f $_.FullName } -WhatIf
Note: The -WhatIf common parameter in the command above previews the operation. Remove -WhatIf once you're sure the operation will do what you want.
If you need to match what characters are considered part of an extension more specifically and want to consider a range of character counts after the . - say 1 through 4 - so that, say, .1 and .html would be considered extensions too:
# Works with 1-4 character extensions that are letters, digits, or "_"
Get-ChildItem -File C:\TEST |
Where-Object Extension -NotMatch '^\.\w{1,4}$' |
Rename-Item -NewName { '{0}.xml' -f $_.FullName } -WhatIf
Note the use of a regex with the -notmatch operator.
Regex ^\.\w{1,4}$ in essence means: match any extension that has between 1 and 4 word characters (\w), where a word character is defined as either a letter, a digit, or an underscore (_).
See this regex101.com page for a detailed explanation and the ability to experiment.
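Before running either command against real files, the two filters can be compared on plain strings. The following is a rough sketch, assuming a handful of hypothetical extension values of the kind Get-ChildItem would report in its Extension property. The property names NeedsXmlByLike and NeedsXmlByRegex are made up for the illustration.

```powershell
# Hypothetical extension strings, including the "extension" reported for the dotted sample name.
$samples = '.pdf', '.xml', '.1', '.html', '', '. Mini MAGAZZINI - 2022-02-25 - FT MM9 000000123'

$report = foreach ($ext in $samples) {
    [pscustomobject]@{
        Extension       = $ext
        # True means: treated as extension-less, so .xml would be appended.
        NeedsXmlByLike  = $ext -notlike '.???'
        NeedsXmlByRegex = $ext -notmatch '^\.\w{1,4}$'
    }
}

$report | Format-Table -AutoSize
```

In this sketch the two filters disagree only on .1 and .html: the wildcard version would rename such files, while the regex version treats them as already having an extension.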

Related

PowerShell to remove all letters and dashes leaving UPC

I have about 1,800 .pdfs including a UPC with dashes and text that need to be removed to manage. I found a code to remove extra spaces and underscores.
How do I remove all text leaving just the UPCs?
01182232110_V1R1_CartonOL_KP_DNV15.pdf
... to ...
01182232110.pdf
# Targets .pdf files in the current dir.
# Add a -LiteralPath / -Path argument to target a different dir.
# Add -Recurse to target .pdf files in the target dir's entire *subtree*.
Get-ChildItem -Filter *.pdf |
Rename-Item -NewName { $_.Name -replace '_.+(?=\.)' } -WhatIf
Note: The -WhatIf common parameter in the command above previews the operation. Remove -WhatIf once you're sure the operation will do what you want.
-replace '_.+(?=\.)' in essence removes all characters starting with the first _ from the input file's base name (preserving the extension).
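As a quick check, the same -replace operation can be applied to the sample name from the question as a plain string. This is only a sketch and involves no files; $sample and $newName are illustrative names.

```powershell
# Sample file name from the question.
$sample = '01182232110_V1R1_CartonOL_KP_DNV15.pdf'

# Remove everything from the first "_" up to (but not including) the last ".".
$newName = $sample -replace '_.+(?=\.)'

$newName   # 01182232110.pdf
```

Because (?=\.) is a lookahead, the dot itself is not consumed, which is why the .pdf extension survives.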

How to recursively append to file name in powershell?

I have multiple .txt files in folders/their sub-folders.
I want to append _old to their file names.
I tried:
Get-ChildItem -Recurse | Rename-Item -NewName {$_.name -replace '.txt','_old.txt' }
This results in:
Some files get updated correctly
Some files get updated incorrectly - they get _old twice - example: .._old_old.txt
There are few errors: Rename-Item : Source and destination path must be different.
To prevent already renamed files from accidentally reentering the file enumeration and therefore getting renamed multiple times, enclose your Get-ChildItem call in (), the grouping operator, which ensures that all output is collected first[1], before sending the results through the pipeline:
(Get-ChildItem -Recurse) |
Rename-Item -NewName { $_.name -replace '\.txt$', '_old.txt' }
Note that I've used \.txt$ as the regex[2], so as to ensure that only a literal . (\.) followed by string txt at the end ($) of the file name is matched, so as to prevent false positives (e.g., a file named Atxt.csv or even a directory named AtxtB would accidentally match your original regex).
Note: The need to collect all Get-ChildItem output first arises from how the PowerShell pipeline fundamentally works: objects are (by default) sent to the pipeline one by one, and processed by a receiving command as they're being received. This means that, without (...) around Get-ChildItem, Rename-Item starts renaming files before Get-ChildItem has finished enumerating files, which causes problems. See this answer for more information about how the PowerShell pipeline works.
Tip of the hat to Matthew for suggesting inclusion of this information.
However, I suggest optimizing your command as follows:
(Get-ChildItem -Recurse -File -Filter *.txt) |
Rename-Item -NewName { $_.BaseName + '_old' + $_.Extension }
-File limits the output to files (doesn't also return directories).
-Filter is the fastest way to limit results to a given wildcard pattern.
$_.BaseName + '_old' + $_.Extension uses simple string concatenation via the sub-components of a file name.
An alternative is to stick with -replace:
$_.Name -replace '\.[^.]+$', '_old$&'
Note that if you wanted to run this repeatedly and needed to exclude files renamed in a previous run, add -Exclude *_old.txt to the Get-ChildItem call.
[1] Due to a change in how Get-ChildItem is implemented in PowerShell [Core] 6+ (it now internally sorts the results, which invariably requires collecting them all first), the (...) enclosure is no longer strictly necessary, but this could be considered an implementation detail, so for conceptual clarity it's better to continue to use (...).
[2] PowerShell's -replace operator operates on regexes (regular expressions); it doesn't perform literal substring searches the way that the [string] type's .Replace() method does.
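The regex points above can be tried on plain strings before any file is renamed. The sketch below uses made-up file names (notes.txt is hypothetical; Atxt.csv is the false-positive example mentioned earlier).

```powershell
# Unescaped "." matches any character, so an unrelated name is altered.
$falsePositive = 'Atxt.csv' -replace '.txt', '_old.txt'      # _old.txt.csv

# Escaped and anchored, the same name is left alone.
$untouched = 'Atxt.csv' -replace '\.txt$', '_old.txt'        # Atxt.csv

# A real .txt file gets the suffix inserted before the extension.
$anchored = 'notes.txt' -replace '\.txt$', '_old.txt'        # notes_old.txt

# $& in the replacement re-inserts whatever the regex matched (here: ".txt").
$generic = 'notes.txt' -replace '\.[^.]+$', '_old$&'         # notes_old.txt

# The [string] method treats its arguments literally (and case-sensitively).
$literal = 'notes.txt'.Replace('.txt', '_old.txt')           # notes_old.txt

$falsePositive, $untouched, $anchored, $generic, $literal
```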
The below command will return ALL files from the current folder and sub-folders within the current directory the command is executed from.
Get-ChildItem -Recurse
Because of this, you are also returning all the files you have already updated to have the _old suffix.
What you need to do is use the -Include and -Exclude parameters of the Get-ChildItem cmdlet to select only files that meet your include criteria and to ignore files that already have the _old suffix, for example:
Get-ChildItem -Recurse -Include "*.txt" -Exclude "*_old.txt"
Then pipe the results into your Rename-Item command.
Get-ChildItem cmdlet explanation can be found here.
https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.management/get-childitem?view=powershell-7
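To try a rerun-safe rename without risking real data, one option is a throwaway folder under the temp directory. The sketch below is an assumption-laden illustration: the folder and file names are invented, and it filters already-renamed files with Where-Object on BaseName rather than with -Exclude, which is a swapped-in technique chosen here for simplicity.

```powershell
# Sandbox: a throwaway folder under the temp directory (hypothetical names).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("rename-demo-" + [guid]::NewGuid())
$sub  = Join-Path $root 'sub'
New-Item -ItemType Directory -Path $sub | Out-Null
New-Item -ItemType File -Path (Join-Path $root 'a.txt'), (Join-Path $sub 'b_old.txt') | Out-Null

# Collect all matches first, skip names that already carry the suffix, then rename.
(Get-ChildItem -LiteralPath $root -Recurse -File -Filter *.txt) |
    Where-Object BaseName -NotLike '*_old' |
    Rename-Item -NewName { $_.BaseName + '_old' + $_.Extension }

# a.txt has become a_old.txt; b_old.txt was left alone.
$names = (Get-ChildItem -LiteralPath $root -Recurse -File).Name | Sort-Object
$names

# Clean up the sandbox.
Remove-Item -LiteralPath $root -Recurse -Force
```

Running the rename step a second time in such a sandbox should change nothing, since every remaining base name already ends in _old.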

Removing multiple consecutive periods from file names

I am working on cleaning up a file share for a SharePoint migration, and I am writing a script to either remove or replace unwanted characters from file names. I am struggling to remove multiple consecutive periods (file..example.txt as an example of what I am dealing with).
I was able to use the simple replace script below to deal with all of the other objectionable characters, but the script fails when attempting to replace double period errors.
dir -recurse | rename-item -NewName {$_.name -replace ".." , ""}
I expect that a file with a name like file..example.txt to become fileexample.txt, however nothing changes.
As Matt mentioned in the comments, -replace uses regex. In regex, the . character is a wildcard representing any single character. To actually select a dot, you must use \..
The regex for selecting anything with two or more dots is \.\.+ (RegExr)
Therefore, your command should be:
dir -Recurse | Rename-Item -NewName {$_.name -replace "\.\.+" , ""}
However, dir is an alias for Get-ChildItem. It's a good practice when writing scripts to avoid aliases whenever possible as it can create a situation where your script does not work in certain environments. With that in mind, your command should be:
Get-ChildItem -Recurse | Rename-Item -NewName {$_.name -replace "\.\.+" , ""}
You can use the [string] type's .Replace() method instead, and not have to worry about the regex. Note that Rename-Item is using a delay-bind script block; see https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_parameters?view=powershell-5.1
Get-Childitem -Recurse -Filter *..* |
Rename-Item -NewName { $_.Name.Replace('..','.') } -WhatIf
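The difference between the regex and literal approaches can be seen on plain strings. The following sketch uses the sample name from the question plus a hypothetical three-dot variant. The \.{2,} pattern is an equivalent way of writing "two or more dots" and is not taken from the answers above.

```powershell
$name = 'file..example.txt'

# Unescaped: ".." matches ANY two characters, so nearly the whole name is consumed.
$unescaped = $name -replace '..', ''                 # t

# Escaped: only runs of two or more literal dots are removed.
$removed = $name -replace '\.\.+', ''                # fileexample.txt

# Collapsing a run of dots to a single dot instead of removing it.
$collapsed = 'file...example.txt' -replace '\.{2,}', '.'    # file.example.txt

# Literal method: works for exactly two dots ...
$literal2 = $name.Replace('..', '.')                 # file.example.txt

# ... but a single pass leaves a pair behind when there are three.
$literal3 = 'file...example.txt'.Replace('..', '.')  # file..example.txt

$unescaped, $removed, $collapsed, $literal2, $literal3
```

Note that the two answers differ in outcome: the regex version removes the dots entirely, whereas the .Replace('..','.') version keeps one dot.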

Batch copy and rename files with PowerShell

I'm trying to use PowerShell to batch copy and rename files.
The original files are named AAA001A.jpg, AAB002A.jpg, AAB003A.jpg, etc.
I'd like to copy the files with new names, by stripping the first four characters from the filenames, and the character before the period, so that the copied files are named 01.jpg, 02.jpg, 03.jpg, etc.
I have experience with Bash scripts on Linux, but I'm stumped on how to do this with PowerShell. After a couple of hours of trial-and-error, this is as close as I've gotten:
Get-ChildItem AAB002A.jpg | foreach { copy-item $_ "$_.name.replace ("AAB","")" }
(it doesn't work)
Note:
* While perhaps slightly more complex than abelenky's answer, it (a) is more robust in that it ensures that only *.jpg files that fit the desired pattern are processed, (b) shows some advanced regex techniques, (c) provides background information and explains the problem with the OP's approach.
* This answer uses PSv3+ syntax.
Get-ChildItem *.jpg |
Where-Object Name -match '^.{4}(.+).\.(.+)$' |
Copy-Item -Destination { $Matches.1 + '.' + $Matches.2 } -WhatIf
To keep the command short, the destination directory is not explicitly controlled, so the copies will be placed in the current dir. To ensure placement in the same dir. as the input files, use
Join-Path $_.PSParentPath ($Matches.1 + '.' + $Matches.2) inside { ... }.
-WhatIf previews what files would be copied to; remove it to perform actual copying.
Get-ChildItem *.jpg outputs all *.jpg files - whether or not they fit the pattern of files to be renamed.
Where-Object Name -match '^.{4}(.+).\.(.+)$' then narrows the matches down to those that fit the pattern, using a regex (regular expression):
^...$ anchors the regular expression to ensure that it matches the whole input (^ matches the start of the input, and $ its end).
.{4} matches the first 4 characters (.), whatever they may be.
(.+) matches any nonempty sequence of characters and, due to being enclosed in (...), captures that sequence in a capture group, which is reflected in the automatic $Matches variable, accessible as $Matches.1 (due to being the first capture group).
. matches the character just before the filename extension.
\. matches a literal ., due to being escaped with \ - i.e., the start of the extension.
(.+) is the 2nd capture group that captures the filename extension (without the preceding . literal), accessible as $Matches.2.
Copy-Item -Destination { $Matches.1 + '.' + $Matches.2 } then renames each input file based on the capture-group values extracted from the input filenames.
Directly piping to a cmdlet, where feasible, is generally preferable to piping to the ForEach-Object cmdlet (whose built-in alias is foreach), for performance reasons.
In the Copy-Item command above, the target path is specified via a script-block argument, which is evaluated for each input path with $_ bound to the input file at hand.
Note: The above assumes that the copies should be placed in the current directory, because the script block outputs a mere filename, not a path.
To control the target path explicitly, use Join-Path inside the -Destination script block.
For instance, to ensure that the copies are always placed in the same folder as the input files - irrespective of what the current dir. is - use:
Join-Path $_.PSParentPath ($Matches.1 + '.' + $Matches.2)
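The capture-group logic can be checked on plain strings, independently of any files. A minimal sketch, assuming the sample names from the question plus one hypothetical name (x.jpg) that is too short to fit the pattern:

```powershell
# Two names from the question and one that should NOT match.
$names = 'AAA001A.jpg', 'AAB002A.jpg', 'x.jpg'

$result = foreach ($name in $names) {
    # -match populates the automatic $Matches variable on success.
    if ($name -match '^.{4}(.+).\.(.+)$') {
        $Matches[1] + '.' + $Matches[2]
    }
}

$result   # 01.jpg, 02.jpg
```

Names that do not fit the pattern simply produce no output, which mirrors how the Where-Object filter drops them from the pipeline.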
As for what you've tried:
Inside "..." (double-quoted strings), you must use $(...), the subexpression operator, in order to embed expressions that should be replaced with their value.
Irrespective of that, .replace ("AAB", "") (a) breaks syntactically due to the space char. before ( (did you confuse the [string] type's .Replace() method with PowerShell's -replace operator?), (b) hard-codes the prefix to remove, (c) is limited to 3 characters, and (d) doesn't remove the character before the period.
The destination-location caveat applies as well: if your expression worked, it would evaluate to a mere filename, which would place the resulting file in the current directory rather than in the same directory as the input file (which wouldn't be a problem if you ran the command from the input files' directory, or if that is your actual intent).
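The string-interpolation point can be illustrated with a plain string variable. In this sketch, $name is a stand-in for a file name and is not part of the original command.

```powershell
$name = 'AAB002A.jpg'

# Without $(...), only the variable itself is expanded; ".Length" stays literal text.
$naive = "$name.Length"                        # AAB002A.jpg.Length

# With $(...), the whole expression is evaluated and its result embedded.
$embedded = "$($name.Length)"                  # 11

# The same applies to method calls such as .Replace().
$replaced = "$($name.Replace('AAB', ''))"      # 002A.jpg

$naive, $embedded, $replaced
```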
In Powershell:
(without nasty regexs. We hates the regexs! We does!)
Get-ChildItem *.jpg | Copy-Item -Destination {($_.BaseName.Substring(4) -replace ".$")+$_.Extension} -WhatIf
Details on the expression:
$_.BaseName.Substring(4) :: Chop the first 4 letters of the filename.
-replace ".$" :: Chop the last letter.
+$_.Extension :: Append the Extension
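The three steps can be traced on a plain string. The sketch below uses a sample base name from the question; the length check at the end is an added assumption about how one might guard against short names, since Substring(4) throws when the string has fewer than 4 characters.

```powershell
$baseName  = 'AAA001A'
$extension = '.jpg'

# Step 1: drop the first 4 characters.
$afterPrefix = $baseName.Substring(4)          # 01A

# Step 2: drop the last character (".$" is a regex: any one character at the end).
$trimmed = $afterPrefix -replace '.$'          # 01

# Step 3: re-append the extension.
$newName = $trimmed + $extension               # 01.jpg

# Hypothetical guard: only names with at least 4 characters are safe for Substring(4).
$safeToTrim = 'AB'.Length -ge 4                # False

$newName
```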
Not Powershell, but Batch File:
(since someone wants to be ultra-pedantic about comments)
@echo off
setlocal enabledelayedexpansion
for %%a in (*.jpg) do (
rem Save the extension
set "EXT=%%~xa"
rem Get the source filename (no extension)
set "SRC_FILE=%%~na"
rem Chop the first 4 chars
set "DST_FILE=!SRC_FILE:~4!"
rem Chop the last char
set "DST_FILE=!DST_FILE:~0,-1!"
rem Copy the file (quoted, in case a name contains spaces)
copy "!SRC_FILE!!EXT!" "!DST_FILE!!EXT!"
)
try this:
Get-ChildItem "C:\temp\Test" -file -filter "*.jpg" | where BaseName -match '.{5,}' |
%{ Copy-Item $_.FullName (Join-Path $_.Directory ("{0}{1}" -f $_.BaseName.Substring(4, $_.BaseName.Length - 5), $_.Extension)) }

powershell- recursive call to rename file names in bulk

I am trying to rename files recursively.
My sample file name is:
2011.02.21 Work Plan - Greg Graham_v1.0__977a6c84-340a-442f-997e-aea94308b382.pdf
I want to delete the string __977a6c84-340a-442f-997e-aea94308b382 which starts with two underscore + 36 characters of identifier.
So result filename will be :
2011.02.21 Work Plan - Greg Graham_v1.0.pdf
All the files are in the mentioned folders or subfolders.
I am using following PowerShell :
Get-ChildItem -Path E:\Recover\test -Recurse | Rename-Item -NewName{$_.name -replace{$_.name.SubString({$_name.IndexOf("__")},38)},""}
When I use -WhatIf, it lists all the files, but when I run it without -WhatIf, it doesn't rename anything.
With -WhatIf, it shows the same name for both the source and the destination.
Appreciate your help.
I think you'd be better off with a regex match. Something like:
GCI $path -recurse | Where{$_.BaseName -match "(.+?)__.{36}$"} | ForEach{Rename-Item -Path $_.FullName -NewName "$($Matches[1])$($_.extension)"}
That will capture the beginning of the file's name (assuming the file name without extension ends in two underscores followed by 36 characters), and then rename the file based on that capture, and the file's original extension.
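The capture can be verified on the sample name from the question without touching any files. The second half of the sketch shows a single -replace on the full name as a possible alternative. That alternative pattern is not from the answer above and assumes the identifier sits immediately before the final extension.

```powershell
# Base name (no extension) of the sample file from the question.
$baseName = '2011.02.21 Work Plan - Greg Graham_v1.0__977a6c84-340a-442f-997e-aea94308b382'

# Capture everything before the trailing "__" + 36 characters.
if ($baseName -match '(.+?)__.{36}$') {
    $newName = $Matches[1] + '.pdf'
}
$newName      # 2011.02.21 Work Plan - Greg Graham_v1.0.pdf

# Alternative sketch: strip the identifier from the full name in one step.
$fullName   = $baseName + '.pdf'
$viaReplace = $fullName -replace '__.{36}(?=\.[^.]+$)'
$viaReplace   # 2011.02.21 Work Plan - Greg Graham_v1.0.pdf
```

As with the other commands here, adding -WhatIf to the actual Rename-Item call is a cheap way to preview the result before committing to it.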