PowerShell to remove all letters and dashes leaving UPC

I have about 1,800 .pdf files whose names consist of a UPC followed by dashes (underscores) and text that need to be removed to make the files manageable. I found code to remove extra spaces and underscores.
How do I remove all the text, leaving just the UPCs?
01182232110_V1R1_CartonOL_KP_DNV15.pdf
... to ...
01182232110.pdf

# Targets .pdf files in the current dir.
# Add a -LiteralPath / -Path argument to target a different dir.
# Add -Recurse to target .pdf files in the target dir's entire *subtree*.
Get-ChildItem -Filter *.pdf |
Rename-Item -NewName { $_.Name -replace '_.+(?=\.)' } -WhatIf
Note: The -WhatIf common parameter in the command above previews the operation. Remove -WhatIf once you're sure the operation will do what you want.
-replace '_.+(?=\.)' in essence removes all characters starting with the first _ from the input file's base name (preserving the extension).
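To check the regex without touching any files, the same -replace operation can be applied to plain strings first. This is a minimal sketch; the variable names and the second sample name are made up for illustration:

```powershell
# Sample names; only the first comes from the question, the second is hypothetical.
$names = '01182232110_V1R1_CartonOL_KP_DNV15.pdf', '00012345678_V2_Label.pdf'

# '_.+(?=\.)' matches from the first "_" up to (but not including) the last ".",
# so replacing the match with nothing leaves "<UPC>.pdf".
$renamed = $names -replace '_.+(?=\.)'

$renamed
```

Because -replace accepts an array on the left-hand side, each name is processed separately and an array of results is returned.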

Related

Powershell - Add extension to files containing dots

I have a script that adds the extension .xml to files without an extension.
Get-ChildItem "C:\TEST" -file "*." | rename-item -NewName {"{0}.xml" -f $_.fullname}
This works perfectly for a file such as:
BC Logistics SPA con Socio Duo - 2022-03-31 - FT 123456VESE
However, it does not work for a file such as:
A.B.C. Mini MAGAZZINI - 2022-02-25 - FT MM9 000000123
This is because of the dots. The directory also contains files that already have .xml as extension, as well as .pdf-files.
How can I add the extension .xml to files without an extension but with dots in them?
Excluding .pdf and .xml files is not an option as the directory also contains other files that are deleted in the process.
The challenge is that nowadays filename extensions are a convention, and that most APIs, including .NET's, consider anything after the last ., if any, to be the extension, even if it isn't meant to be one. E.g., in A.B.C. Mini, . Mini (sic) is considered the extension, whereas -Filter *. in effect only matches names that contain no . at all.
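The "anything after the last ." rule can be observed directly with .NET's [System.IO.Path]::GetExtension() method. The sketch below uses the two file names from the question; the variable names are illustrative, and the .Extension property of the objects that Get-ChildItem outputs should behave the same way:

```powershell
# File names taken from the question; neither has a "real" extension.
$plain  = 'BC Logistics SPA con Socio Duo - 2022-03-31 - FT 123456VESE'
$dotted = 'A.B.C. Mini MAGAZZINI - 2022-02-25 - FT MM9 000000123'

# GetExtension() returns everything from the last "." onward,
# or an empty string if the name contains no "." at all.
$plainExt  = [System.IO.Path]::GetExtension($plain)
$dottedExt = [System.IO.Path]::GetExtension($dotted)

"[$plainExt]"
"[$dottedExt]"
```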
If you're willing to assume that any file that doesn't end in . followed by 3 characters (e.g., .pdf) is an extension-less file, you can use the following:
# Note: Works with 3-character extensions only; doesn't limit what those
# characters may be.
Get-ChildItem -File C:\TEST |
Where-Object Extension -NotLike '.???' |
Rename-Item -NewName { '{0}.xml' -f $_.FullName } -WhatIf
Note: The -WhatIf common parameter in the command above previews the operation. Remove -WhatIf once you're sure the operation will do what you want.
If you need to match what characters are considered part of an extension more specifically and want to consider a range of character counts after the . - say 1 through 4 - so that, say, .1 and .html would be considered extensions too:
# Works with 1-4 character extensions that are letters, digits, or "_"
Get-ChildItem -File C:\TEST |
Where-Object Extension -NotMatch '^\.\w{1,4}$' |
Rename-Item -NewName { '{0}.xml' -f $_.FullName } -WhatIf
Note the use of a regex with the -notmatch operator.
Regex ^\.\w{1,4}$ in essence means: match any extension that has between 1 and 4 word characters (\w), where a word character is defined as either a letter, a digit, or an underscore (_).
See this regex101.com page for a detailed explanation and the ability to experiment.
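As a quick check, the regex can be run against a handful of extension strings before it is used in the pipeline. This is only a sketch; the list of candidate values is made up for illustration, apart from the last entry, which is the "extension" .NET reports for the dotted file name in the question:

```powershell
# Candidate extension values, including an empty one (no "." in the name).
$extensions = '.pdf', '.xml', '.1', '.html', '', '.docx1',
              '. Mini MAGAZZINI - 2022-02-25 - FT MM9 000000123'

# Keep only the values NOT recognized as a 1-4 word-character extension,
# i.e. those whose files would get ".xml" appended.
$needXml = $extensions | Where-Object { $_ -notmatch '^\.\w{1,4}$' }

$needXml | ForEach-Object { "[$_]" }
```

Only the empty value, .docx1 (five characters), and the long pseudo-extension pass the filter; .pdf, .xml, .1, and .html are treated as real extensions.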

Powershell script to read and split filename from a folder

I'm new to PowerShell. I want to read the name of each file in a folder, e.g. 900_CA_2022.pdf, remove the _ from the file name, and create a new text file named 900CA2022900_CA_2022.txt
Basically, I want to remove the _ from the extension-less file name, append the latter as-is, and use a new extension, .txt
The easiest way is to use the source file's name plus .txt to create the text files, which leaves the source file names untouched.
E.g.
900_CA_2022.pdf -----> 900_CA_2022.pdf.txt
My alternative solution
Initial files
Script Code
$files = Get-ChildItem "./*.pdf" -Recurse -Force
$files | ForEach-Object{
New-Item -Path "$($_ | Split-Path)/$($_ | Split-Path -Leaf).txt" -Force
}
Result files
Update:
The stated requirements are unusual, but the next section provides a solution to address them.
The fact that you later accepted Joma's answer indicates that simply appending .txt to each input file name is what you actually needed; this is most easily accomplished as follows:
Get-ChildItem -Filter *.pdf | New-Item -Path { $_.FullName + '.txt' } -WhatIf
Note: The -WhatIf common parameter in the command above previews the operation. Remove -WhatIf once you're sure the operation will do what you want.
Important: All solutions below create the new files in the current directory. If needed, construct the target file path with an explicit directory path, using Join-Path, e.g.: Join-Path C:\target (($_.BaseName -replace '_') + $_.BaseName + '.txt')
To create new, empty files whose names should be derived from the input files, use New-Item:
Get-ChildItem -Filter *.pdf |
New-Item -Path { ($_.BaseName -replace '_') + $_.BaseName + '.txt' } -WhatIf
Note: If the target file exists, an error occurs. If you add -Force, the existing file is truncated instead - use with caution.
$_.BaseName is the input file's name without the extension.
-replace '_' removes all _ chars. from it.
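The name derivation can be tried on a plain string first. A minimal sketch using the base name from the question's example (the variable names are illustrative):

```powershell
# Base name from the question's example (900_CA_2022.pdf without the extension).
$baseName = '900_CA_2022'

# Underscore-free copy + original base name + new extension.
$newName = ($baseName -replace '_') + $baseName + '.txt'

$newName   # 900CA2022900_CA_2022.txt
```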
To create new files whose names should be derived from the input files and fill them, use ForEach-Object:
Get-ChildItem -Filter *.pdf |
ForEach-Object {
# Construct the new file path.
$newFilePath = ($_.BaseName -replace '_') + $_.BaseName + '.txt'
# Create and fill the new file.
# `>` acts like Out-File. To control the encoding, use
# something like `| Out-File -Encoding utf8 $newFilePath` instead.
"content for $newFilePath" > $newFilePath
}
Note that > / Out-File and Set-Content (for string data) all quietly replace the contents of an existing target file.
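If quietly replacing an existing file is not acceptable, one option that is not part of the solution above is Out-File's -NoClobber switch, which reports an error instead of overwriting. The sketch below demonstrates this with a throwaway file in the temp directory; the file name and variable names are made up for illustration:

```powershell
# Hypothetical scratch file in the temp directory.
$path = Join-Path ([System.IO.Path]::GetTempPath()) ('demo_{0}.txt' -f (New-Guid))

# First write: the file doesn't exist yet, so it is created.
'first' | Out-File -FilePath $path -NoClobber

# Second write: -NoClobber refuses to replace the existing file.
try {
    'second' | Out-File -FilePath $path -NoClobber -ErrorAction Stop
    $overwritten = $true
} catch {
    $overwritten = $false
}

# The original content is still in place.
$content = (Get-Content -Path $path -Raw).Trim()
Remove-Item -Path $path
```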

Find and Rename large quantities of files

I am using PowerShell to find, move, and rename a large amount of audit files. These files are in a shared folder with hundreds of gigabytes of extra junk. Manually clicking and dragging would take hours or even days as they are in many nested folders.
All files are currently named the same (audit.log, or audit1.log if there is a second log in the same folder). I need to find those files, copy them to a central location and rename them so they don't overwrite one another (not necessarily in that order).
I am not a programmer by any standard. This is what I have tried so far based on this website:
cd "H:\Flights\SCP\Log Analysis\1st Quarter"
Get-ChildItem -Filter "audit*.log" -Recurse `
| Rename-Item -NewName {$_.Name -replace 'audit', "$_.Fullname"} -WhatIf `
| Move-Item -Destination "H:\Flights\SCP\Log Analysis\Audit logs" -WhatIf
I use -WhatIf to make sure I do not make a mistake since I cannot overwrite the files. My original line of thought was to simply replace the word audit with the file path, but any reasonable method to rename the files in a way which will not overwrite will be helpful.
Theo and Mathias R. Jessen have provided all the crucial pointers in comments:
Rename-Item only accepts a mere name as a -NewName argument.
Move-Item can perform both moving and renaming in a single operation.
Delay-bind script blocks ({ ... }) can be passed to both Rename-Item's -NewName and Move-Item's -Destination parameters, which enables deriving the target name / path dynamically, for each input object ($_).
To put it all together:
Get-ChildItem -Filter audit*.log -Recurse |
Move-Item -Destination {
"H:\Flights\SCP\Log Analysis\Audit logs\$($_.FullName -replace '[:\\/]', '_')"
} -WhatIf
Note: The -WhatIf common parameter in the command above previews the operation. Remove -WhatIf once you're sure the operation will do what you want.
Note:
The target directory of the move operation must already exist (-Force does not create it for you, it would only allow you to replace an existing file).
$_.FullName -replace '[:\\/]', '_' transforms the full path of the original file into something that can be used as a file name, by replacing :, \ (and /) characters with _.
The caveat is that with long paths you may run into the 255-characters-per-name limit.
An alternative is to use an abstract, unique identifier of fixed length, which you can generate with the New-Guid cmdlet, as Mathias suggests.
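Both naming schemes can be compared on a plain string before any files are moved. This is a sketch only: the source path below is hypothetical, and keeping the original extension on the GUID-based name is an assumption rather than something the answer prescribes:

```powershell
# Hypothetical source path, for illustration only.
$fullName = 'H:\Flights\SCP\Log Analysis\1st Quarter\Flight01\audit.log'

# Option 1: flatten the full path into a single file name.
$flatName = $fullName -replace '[:\\/]', '_'

# Option 2: fixed-length unique name via New-Guid, keeping the original extension.
$guidName = '{0}{1}' -f (New-Guid), [System.IO.Path]::GetExtension($fullName)

$flatName
$guidName
```

In the Move-Item pipeline, the second option would amount to building the destination with something like Join-Path <target dir> ('{0}{1}' -f (New-Guid), $_.Extension) inside the delay-bind script block. Note that GUID-based names no longer reveal where a log came from, so the flattened-path variant may be preferable when the paths are short enough.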

How to recursively append to file name in powershell?

I have multiple .txt files in folders/their sub-folders.
I want to append _old to their file names.
I tried:
Get-ChildItem -Recurse | Rename-Item -NewName {$_.name -replace '.txt','_old.txt' }
This results in:
Some files get updated correctly
Some files get updated incorrectly - they get _old twice - example: .._old_old.txt
There are few errors: Rename-Item : Source and destination path must be different.
To prevent already renamed files from accidentally reentering the file enumeration and therefore getting renamed multiple times, enclose your Get-ChildItem call in (), the grouping operator, which ensures that all output is collected first[1], before sending the results through the pipeline:
(Get-ChildItem -Recurse) |
Rename-Item -NewName { $_.name -replace '\.txt$', '_old.txt' }
Note that I've used \.txt$ as the regex[2], so as to ensure that only a literal . (\.) followed by string txt at the end ($) of the file name is matched, so as to prevent false positives (e.g., a file named Atxt.csv or even a directory named AtxtB would accidentally match your original regex).
Note: The need to collect all Get-ChildItem output first arises from how the PowerShell pipeline fundamentally works: objects are (by default) sent to the pipeline one by one, and processed by a receiving command as they're being received. This means that, without (...) around Get-ChildItem, Rename-Item starts renaming files before Get-ChildItem has finished enumerating files, which causes problems. See this answer for more information about how the PowerShell pipeline works.
Tip of the hat to Matthew for suggesting inclusion of this information.
However, I suggest optimizing your command as follows:
(Get-ChildItem -Recurse -File -Filter *.txt) |
Rename-Item -NewName { $_.BaseName + '_old' + $_.Extension }
-File limits the output to files (doesn't also return directories).
-Filter is the fastest way to limit results to a given wildcard pattern.
$_.BaseName + '_old' + $_.Extension uses simple string concatenation via the sub-components of a file name.
An alternative is to stick with -replace:
$_.Name -replace '\.[^.]+$', '_old$&'
Note that if you wanted to run this repeatedly and needed to exclude files renamed in a previous run, add -Exclude *_old.txt to the Get-ChildItem call.
[1] Due to a change in how Get-ChildItem is implemented in PowerShell [Core] 6+ (it now internally sorts the results, which invariably requires collecting them all first), the (...) enclosure is no longer strictly necessary, but this could be considered an implementation detail, so for conceptual clarity it's better to continue to use (...).
[2] PowerShell's -replace operator operates on regexes (regular expressions); it doesn't perform literal substring searches the way that the [string] type's .Replace() method does.
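The difference between the patterns discussed above can be seen on plain strings. A small sketch; notes.txt is a made-up file name, and the variable names are illustrative:

```powershell
# The original pattern: "." is a regex wildcard, so "Atxt" matches ".txt".
$falsePositive = 'Atxt.csv' -replace '.txt', '_old.txt'     # _old.txt.csv

# Escaped and anchored pattern: only a literal ".txt" at the end matches.
$untouched = 'Atxt.csv'  -replace '\.txt$', '_old.txt'      # Atxt.csv
$renamed   = 'notes.txt' -replace '\.txt$', '_old.txt'      # notes_old.txt

# Extension-agnostic variant: $& refers to the matched text (the extension).
$generic = 'notes.txt' -replace '\.[^.]+$', '_old$&'        # notes_old.txt

# Literal (non-regex) replacement via the [string] type's .Replace() method.
$literal = 'notes.txt'.Replace('.txt', '_old.txt')          # notes_old.txt
```

The replacement operand is single-quoted so that PowerShell passes $& through to the regex engine instead of trying to expand it as a variable.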
The below command will return ALL files from the current folder and sub-folders within the current directory the command is executed from.
Get-ChildItem -Recurse
Because of this you are also returning all the files you have already updated to have the _old suffix.
What you need to do is use the -Include and -Exclude parameters of the Get-ChildItem cmdlet to select only the files that meet your include criteria and to ignore files that already have the _old suffix, for example:
Get-ChildItem -Recurse -Include "*.txt" -Exclude "*_old.txt"
Then pipe the results into your Rename-Item command.
Get-ChildItem cmdlet explanation can be found here.
https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.management/get-childitem?view=powershell-7

Powershell rename file to directory with truncating name after a certain string

I am trying to rename files in directories to the directory name but I need to truncate the name after a certain string in the directory name.
Example: Dir name "filename123_Release v.1"
I want the file in the directory to be called "filename123" (Leave the extension)
This is what I have so far. I need to truncate the names after "Release". I don't want "Release" or anything after it in the file name.
Get-ChildItem D:\rename -Include *.mp4,*.mkv -Recurse |Rename-Item -NewName { $_.Directory.Name+$_.Extension}
I got it working:
Get-ChildItem D:\rename -Include *.mp4,*.mkv -Recurse | Rename-Item -NewName { $_.Directory.Name.Substring(0, $_.Directory.Name.IndexOf("Release")) + $_.Extension }
The following should work.
Essentially it splits the directory name into an array with [0] being the part before the word "Release". It's worth noting that based on your example you will be left with an underscore at the end of the file name which may not be the desired result.
Get-ChildItem D:\Rename -Include *.mp4,*.mkv -Recurse | Rename-Item -NewName { ((($_.Directory.Name) -split 'Release')[0])+$_.Extension }
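If the trailing underscore is unwanted, one possible refinement (an assumption about the desired result, not something the question confirms) is to remove an optional separator together with "Release" and everything after it via -replace. A sketch on the example directory name:

```powershell
# Directory name from the question's example.
$dirName = 'filename123_Release v.1'

# Splitting on 'Release' keeps the trailing underscore.
$viaSplit = ($dirName -split 'Release')[0]        # filename123_

# Removing an optional "_" or space plus 'Release' and everything after it.
$trimmed = $dirName -replace '[_ ]?Release.*$'    # filename123
```

In the pipeline this would correspond to -NewName { ($_.Directory.Name -replace '[_ ]?Release.*$') + $_.Extension }. Unlike the Substring() / IndexOf() approach, which throws when the directory name doesn't contain "Release" (IndexOf() returns -1), -replace simply leaves such names unchanged.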