Using Rename-Item to make numeric identifiers & names consistent - powershell

I have several thousand file names delimited as such:
Last, First-000000-Title-MonYYYY.pdf
Problem 1: Some files conform to the 6-digit convention while others need leading zeroes for consistency.
Problem 2: Some names are entered with dashes (which are, problematically, delimiters) which need to be joined as such:
Last-Last, First > LastLast, First
I'm able to perform a simple Rename-Item function for each file but have not been able to create a broader Get-ChildItem function taking into account the several iterations of file names to generate a consistent output.
Apologies for the entry-level question but I cannot seem to coherently draw together the required functions.

Based on your explanations:
Set-Location -Path "C:\path" # replace this with actual path to where the files are
$cFiles = Get-ChildItem -Filter "*.pdf" # Getting all PDFs in the folder
foreach ($oFile in $cFiles) {
$sName = $oFile.Name
# This regex captures a number of 1 to 5 digits between two dashes.
$sPattern = '(?:(?<=-))(\d{1,5})(?:(?=-))'
if ($sName -match $sPattern) {
# Extracting number.
[UInt32]$iNumber = $sName -replace (".*" + $sPattern + ".*"), '$1'
# Padding number with zeros.
$sNumber = "{0:D6}" -f $iNumber
# Updating filename string.
$sName = $sName -replace $sPattern, $sNumber
} else {
# This regex captures a 6-digit number between two dashes.
$sPattern = '.*-(\d{6})-.*'
# Extracting number.
$sNumber = $sName -replace $sPattern, '$1'
}
# Splitting filename string on 6 digits number.
$cParts = $sName -split $sNumber
# Removing dashes from first/last names and re-assembling filename string.
$sName = ($cParts[0] -replace '-') + '-' + $sNumber + $cParts[1]
Rename-Item -Path $oFile.Name -NewName $sName
}
Tested on the following sample:
Last, First-000000-Title-Jan1900.pdf
One-Two, Three-123-Title-Feb2000.pdf
Four, Five-Six-456-Title-Mar2010.pdf
Seven-Eight, Nine-Ten-7890-Title-Sep2012.pdf
May not work if there are more complicated cases.
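As an alternative sketch, the same normalization can be expressed with one regex and named groups. The function name ConvertTo-ConsistentName is hypothetical, and the sketch assumes the Title part never contains a dash, so that exactly one dash follows the identifier; names that do not fit the pattern are returned unchanged.

```powershell
# Hypothetical helper: assumes the layout Name-Digits-Title-MonYYYY.pdf
# and that the Title part contains no dashes.
function ConvertTo-ConsistentName {
    param([Parameter(Mandatory)][string]$Name)

    if ($Name -match '^(?<who>.+)-(?<id>\d{1,6})-(?<rest>[^-]+-[^-]+)$') {
        $who = $Matches['who'] -replace '-'          # join dashed names
        $id  = '{0:D6}' -f [int]$Matches['id']       # left-pad to 6 digits
        '{0}-{1}-{2}' -f $who, $id, $Matches['rest']
    } else {
        $Name                                        # leave unrecognized names alone
    }
}

ConvertTo-ConsistentName 'Seven-Eight, Nine-Ten-7890-Title-Sep2012.pdf'
# SevenEight, NineTen-007890-Title-Sep2012.pdf
```

It could then be used as Get-ChildItem -Filter *.pdf | Rename-Item -NewName { ConvertTo-ConsistentName $_.Name } -WhatIf, removing -WhatIf once the preview looks right.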

Related

Rename files in a specific way. Target nth string between symbols

Apologies in advance for a bit vague question (no coding progress).
I have files (they can be .csv but dont have .csv, but that I can add via script easy). The files' name is something like this:
TRD_123456789_ABC123456789_YYMMDD_HHMMSS_12345678_12345_blabla_blabla_blabla_blabla
Now I would need a script that renames the file in a way that it keeps original name except:
It would cut off the ending (blabla_blabla_blabla_blabla) part.
Changes the 12345 before blabla to random 5 characters (can be numbers too)
Change timestamp of HHMMSS to current Hours, minutes, seconds.
In regards to point 3, I think that I can insert an arbitrary PowerShell expression into any string in " " quotes. So when renaming the files, I was thinking I could just add
Rename-Item -NewName {... + $(get-date -f hhmmss) + ...}
However, I am lost how to write renaming script that renames parts between 4th & 5th _ symbol. And removes string part after 7th _ symbol.
Can somebody help me with the script or help me how to in powershell script target string between Nth Symbols?
Kind Regards,
Kamil.
Split the string on _:
$string = 'TRD_123456789_ABC123456789_YYMMDD_HHMMSS_12345678_12345_blabla_blabla_blabla_blabla'
$parts = $string -split '_'
Then discard all but the first 6 substrings (eg. drop the 12345 part and anything thereafter):
$parts = $parts[0..5]
Now add your random 5-digit number:
$parts = @($parts; '{0:D5}' -f $(Get-Random -Maximum 100000))
Update the string at index 4 (the HHMMSS string):
$parts[4] = Get-Date -Format 'HHmmss'
And finally join all the substrings together with _ again:
$newString = $parts -join '_'
Putting it all together, you could write a nice little helper function:
function Get-NewName {
param(
[string]$Name
)
# split and discard
$parts = $Name -split '_' |Select -First 6
# add random number
$parts = @($parts; '{0:D5}' -f $(Get-Random -Maximum 100000))
# update timestamp
$parts[4] = Get-Date -Format 'HHmmss'
# return new string
return $parts -join '_'
}
And then do:
Get-ChildItem -File -Filter TRD_* |Rename-Item -NewName { Get-NewName $_.Name }
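The question allows any 5 random characters for that segment, not only digits. A minimal sketch of that variant follows; the function name New-RandomToken and the choice of digits plus uppercase letters are assumptions for illustration.

```powershell
# Hypothetical helper: builds a token from digits 0-9 and letters A-Z.
function New-RandomToken {
    param([int]$Length = 5)

    $pool = (48..57) + (65..90)   # character codes for 0-9 and A-Z
    -join (1..$Length | ForEach-Object { [char](Get-Random -InputObject $pool) })
}

$parts   = 'TRD_123456789_ABC123456789_YYMMDD_HHMMSS_12345678_12345_blabla' -split '_'
$newName = @($parts[0..5]; New-RandomToken) -join '_'
$newName
```

Each character is drawn independently, so repeats are possible.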

Need to batch add characters to filenames using Powershell

I have a series of files all named something like:
PRT14_WD_14220000_1.jpg
I need to add two zeroes after the last underscore and before the number so it looks like PRT14_WD_14220000_001.jpg
I've tried:
(dir) | rename-Item -new { $_.name -replace '*_*_*_','*_*_*_00' }
Appreciate any help.
The closest thing to what you attempted would be this. In regex, the wildcard is .*. And the parentheses do grouping to refer to later with the dollar sign numbers.
dir *.jpg | rename-Item -new { $_.name -replace '(.*)_(.*)_(.*)_','$1_$2_$3_00' } -whatif
What if: Performing the operation "Rename File" on target "Item: C:\users\admin\foo\PRT14_WD_14220000_1.jpg Destination: C:\users\admin\foo\PRT14_WD_14220000_001.jpg".
Ok, here's my take when you want the number with max two zeroes padding. $num has to be an integer for the .tostring() method I want.
dir *.jpg | rename-item -newname { $a,$b,$c,$num = $_.basename -split '_'
$num = [int]$num
$a + '_' + $b + "_" + $c + '_' + $num.tostring('000') + '.jpg'
} -whatif
the following presumes your last part of the .BaseName will always need two zeros added to it. what it does ...
fakes getting the fileinfo object that you get from Get-Item/Get-ChildItem
replace that with the appropriate cmdlet. [grin]
splits the .BaseName into parts using the _ as the split target
adds two zeros to the final part from the above split
merges the parts into a $NewBaseName
gets the .FullName and replaces the original BaseName with the $newBaseName
displays that new file name
you will still need to do your rename, but that is pretty direct. [grin]
here's the code ...
# fake getting a file info object
# in real life, use Get-Item or Get-ChildItem
$FileInfo = [System.IO.FileInfo]'PRT14_WD_14220000_1.jpg'
$BNParts = $FileInfo.BaseName.Split('_')
$BNParts[-1] = '00{0}' -f $BNParts[-1]
$NewBasename = $BNParts -join '_'
$NewFileName = $FileInfo.FullName.Replace($FileInfo.BaseName, $NewBaseName)
$NewFileName
output = D:\Data\Scripts\PRT14_WD_14220000_001.jpg
The -replace operator operates on regexes (regular expressions), not wildcard expressions such as * (by itself), which is what you're trying to use.
A conceptually more direct approach is to focus the replacement on the end of the string:
Get-ChildItem | # `dir` is a built-in alias for Get-ChildItem
Rename-Item -NewName { $_.Name -replace '(?<=_)[^_]+(?=\.)', '00$&' } -WhatIf
Note: The -WhatIf common parameter in the command above previews the operation. Remove -WhatIf once you're sure the operation will do what you want.
(?<=_)[^_]+(?=\.) matches a nonempty run (+) of non-_ characters ([^_]) preceded by _ ((?<=_)) and followed by a literal . ((?=\.)), excluding both the preceding _ and the following . from what is captured by the match ((?<=...) and (?=...) are non-capturing look-around assertions).
In short: This matches and captures the characters after the last _ and before the start of the filename extension.
00$& replaces what was matched with 00, followed by what the match captured ($&).
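A quick check of that replacement on plain strings, with no files touched; the two-digit file name is a made-up input added for illustration:

```powershell
$pattern = '(?<=_)[^_]+(?=\.)'

# The sample name from the question, plus a hypothetical two-digit variant.
$oneDigit = 'PRT14_WD_14220000_1.jpg'  -replace $pattern, '00$&'
$twoDigit = 'PRT14_WD_14220000_12.jpg' -replace $pattern, '00$&'

$oneDigit   # PRT14_WD_14220000_001.jpg
$twoDigit   # PRT14_WD_14220000_0012.jpg
```

The second result shows that a fixed 00 prefix only yields three digits when the original number has exactly one digit.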
In a follow-up comment you mention wanting to not just blindly insert 00, but to 0-left-pad the number after the last _ to 3 digits, whatever the number may be.
In PowerShell [Core] 6.1+, this can be achieved as follows:
Get-ChildItem |
Rename-Item -NewName {
$_.Name -replace '(?<=_)[^_]+(?=\.)', { $_.Value.PadLeft(3, '0') }
} -WhatIf
The script block ({ ... }) as the replacement operand receives each match as a Match instance stored in automatic variable $_, whose .Value property contains the captured text.
Calling .PadLeft(3, '0') on that captured text 0-left-pads it to 3 digits and outputs the result, which replaces the regex match at hand.
A quick demonstration:
PS> 'PRT14_WD_14220000_1.jpg' -replace '(?<=_)[^_]+(?=\.)', { $_.Value.PadLeft(3, '0') }
PRT14_WD_14220000_001.jpg # Note how '_1.jpg' was replaced with '_001.jpg'
In earlier PowerShell versions, you must make direct use of the .NET [regex] type's .Replace() method, which fundamentally works the same:
Get-ChildItem |
Rename-Item -NewName {
[regex]::Replace($_.Name, '(?<=_)[^_]+(?=\.)', { param($m) $m.Value.PadLeft(3, '0') })
} -WhatIf

Rename file - delete all characters AFTER 2nd underscore

I need to replace the time\date stamp that's included in the filename after 2nd underscore (needs to be in the same format yyyyMMddHHmmss)
example file: 123456_123456_20190716163001.xml
sometimes the file in question gets created with an additional character which invalidates the file, in this case I need to replace this with the current timestamp.
example: 123456_123456_current Timestamp here.xml
The file should never exceed 32 characters(including extension)
I found a script but it deletes everything after the 1st underscore not the 2nd and I'm struggling to find a way to replace the text with the current timestamp.
Get-ChildItem c:\test -Filter 123456_123456*.xml | Foreach-Object -Process {
$NewName = [Regex]::Match($_.Name,"^[^_]*").Value + '.xml'
$_ | Rename-Item -NewName $NewName
}
timestamp after 2nd underscore to be updated to the current timestamp if original file exceeds 32 characters
123456_123456_current Timestamp here.xml
this takes advantage of the way a [fileinfo] object is structured. the .BaseName is easy to get to & use .Split() on. then one can use -join to put it back into one basename & finally add the extension onto the basename.
# fake reading in a file info object
# in real life, use Get-ChildItem or Get-Item
$FileObject = [System.IO.FileInfo]'123456_123456_current Timestamp here.xml'
$NewName = -join (($FileObject.BaseName.Split('_')[0,1] -join '_'), $FileObject.Extension)
$NewName
output = 123456_123456.xml
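The snippet above stops at the two leading parts. A minimal sketch that also appends a timestamp could look like the following; the function name Get-TimestampedName and its -Timestamp parameter are hypothetical, and the parameter exists only so that the result can be checked against a fixed value.

```powershell
# Hypothetical helper: keeps the first two _-separated parts of the base name
# and appends a timestamp in yyyyMMddHHmmss format, then the original extension.
function Get-TimestampedName {
    param(
        [Parameter(Mandatory)][string]$Name,
        [string]$Timestamp = (Get-Date -Format 'yyyyMMddHHmmss')
    )

    $base   = [System.IO.Path]::GetFileNameWithoutExtension($Name)
    $ext    = [System.IO.Path]::GetExtension($Name)
    $prefix = ($base -split '_')[0,1] -join '_'

    '{0}_{1}{2}' -f $prefix, $Timestamp, $ext
}

Get-TimestampedName -Name '123456_123456_current Timestamp here.xml' -Timestamp '20190716163001'
# 123456_123456_20190716163001.xml
```

With 6-digit parts and a 14-digit timestamp the result is 32 characters long, which matches the limit stated in the question.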
Sticking with the regex theme, you can do the following:
$CurrentTime = Get-Date -Format 'yyyyMMddHHmmss'
$RegexReplace = "(.*?_.*?_).*(\..*)"
Get-ChildItem c:\test -Filter 123456_123456*.xml |
Rename-Item -NewName {$_.Name -replace $RegexReplace,"`${1}$CurrentTime`${2}"}
If duplicate file names are a concern, you can build in an increment to $CurrentTime.
$CurrentTime = Get-Date -Format 'yyyyMMddHHmmss'
$RegexReplace = "(.*?_.*?_).*(\..*)"
Get-ChildItem c:\test -Filter 123456_123456*.xml |
Rename-Item -NewName {
$NewName = $_.Name -replace $RegexReplace,"`${1}$CurrentTime`${2}"
if (test-path $NewName) {
$CurrentTime = [double]$CurrentTime + 1
$NewName = $_.Name -replace $RegexReplace,"`${1}$CurrentTime`${2}"
}
$NewName
}
Explanation:
$RegexReplace contains the regex expression that will need to be matched for the ideal rename operation to happen. The regex mechanisms are explained below:
.*?_.*?_: Matches a minimal number of characters (lazy matching) followed by an underscore and then another minimal number of characters followed by an underscore.
.*: Greedily matches any characters
\.: Literally matches the dot character (.).
(): The parentheses here represent capture groups with the first set being 1 and the second set being 2. These are later referenced as ${1} and ${2} in the -replace operation.
Since Rename-Item -NewName accepts a delay-bind script block, we can just pipe Get-ChildItem output directly to it. Inside the script block, the current pipeline object is $_.
The -replace operation uses the variable $CurrentTime, which must be expanded in order for a successful outcome. For that reason, we use double quotes around the replacement. Since we do not want capture groups ${1} and ${2} expanded, we backtick escape them.
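The replacement can be tried on a plain string first. In this sketch the timestamp is a fixed assumed value and the input is a made-up name with one extra trailing digit:

```powershell
$CurrentTime  = '20190716163001'        # fixed value, for illustration only
$RegexReplace = '(.*?_.*?_).*(\..*)'

# Hypothetical invalid name: 15 digits after the second underscore.
$result = '123456_123456_201907161630019.xml' -replace $RegexReplace, "`${1}$CurrentTime`${2}"
$result   # 123456_123456_20190716163001.xml
```

Because the final .* is greedy, the second capture group ends up holding only the last dot and the extension.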

Find and Replace character only in certain column positions in each line

I'm trying to write a script to find all the periods in the first 11 characters or last 147 characters of each line (lines are fixed width of 193, so I'm attempting to ignore characters 12 through 45).
First I want a script that will just find all the periods in the first or last part of each line. Then, if I find any, I would like to replace them with 0's while leaving periods in the 12th through 45th characters in place. It would scan all the *.dat files in the directory and create period-free copies in a subfolder. So far I have:
$data = get-content "*.dat"
foreach($line in $data)
{
$line.substring(0,12)
$line.substring(46,147)
}
Then I run this with > Output.txt then do a select-string Output.txt -pattern ".". As you can see I'm a long ways from my goal as presently my program is mashing all the files together, and I haven't figured out how to do any replacement yet.
# $OutputDir must already hold the path of an existing output subfolder.
Get-Item *.dat |
ForEach-Object {
$file = $_
$_ |
Get-Content |
ForEach-Object {
$beginning = $_.Substring(0,12) -replace '\.','0'
$middle = $_.Substring(12,34)
$end = $_.Substring(46,147) -replace '\.','0'
'{0}{1}{2}' -f $beginning,$middle,$end
} |
Set-Content -Path (Join-Path $OutputDir $file.Name)
}
You can use the PowerShell -replace operator to replace the "." with "0". Then use Substring, as you already do, to build the three portions of the line and join them into the updated string. This will output an updated line for each line of your input.
$data = get-content "*.dat"
foreach($line in $data)
{
($line.SubString(0,12) -replace "\.","0") + $line.SubString(12,34) + ($line.substring(46,147) -replace "\.","0")
}
Note that the -replace operator performs a regular expression match and the "." is a special regular expression character so you need to escape it with a "\".
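Since Substring takes a start index and a length (not an end index), the offsets are easy to get wrong. A small sketch that derives the third segment from the first two is shown below; the function name Convert-FixedWidthLine and its parameters are hypothetical, the defaults follow the offsets used in the question's own code, and each line is assumed to be at least KeepStart + KeepLength characters long.

```powershell
# Hypothetical helper: replaces periods with zeros outside a protected region.
function Convert-FixedWidthLine {
    param(
        [Parameter(Mandatory)][string]$Line,
        [int]$KeepStart  = 12,   # zero-based index of the first protected character
        [int]$KeepLength = 34    # number of protected characters
    )

    $head = $Line.Substring(0, $KeepStart).Replace('.', '0')
    $keep = $Line.Substring($KeepStart, $KeepLength)
    $tail = $Line.Substring($KeepStart + $KeepLength).Replace('.', '0')

    $head + $keep + $tail
}

# Small-scale check: protect 3 characters starting at index 2.
Convert-FixedWidthLine -Line 'a.b.c.d.e' -KeepStart 2 -KeepLength 3
# a0b.c0d0e
```

The String.Replace method is used here because it treats the period literally, so no regex escaping is needed.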

Reversing a file name based on delimiter then truncating part

I need to rename many hundreds of files to follow a new naming convention, but I'm having awful trouble. This really needs to be scripted in PowerShell or VBS so we can automate the task on a regular basis.
Original File Name
Monday,England.txt
New File Name
EnglanMo
Convention Rules:
The file name is reversed around the delimiter (,) to England,Monday and then truncated to 6/2 char
Englan,Mo
The Delimiter is then removed
englanmo.txt
Say we had Wednesday,Spain.txt spain being 5 char, this is not subject to any reduction
SpainWe.txt
All the txt files can be accessed in one directory, or from a CSV, whatever is easiest.
Without having the exact details of your file paths, where it'll run, etc. you'll have to adapt this to point at the appropriate path(s).
$s= "Monday,England.txt";
#$s = "Wednesday,Spain.txt";
$nameparts = $s.split(".")[0].split(",");
if ($nameparts[1].length -gt 6) {
$newname = $nameparts[1].substring(0,6);
} else {
$newname = $nameparts[1];
}
if ($nameparts[0].length -gt 2) {
$newname += $nameparts[0].substring(0,2);
} else {
$newname += $nameparts[0];
}
$newname = $newname.toLower() + "."+ $s.split(".")[1];
$newname;
get-item $s |rename-item -NewName $newname;
I'm certain this isn't the most efficient/elegant way to do this, but it works with both of your test cases.
Use Get-ChildItem to grab the files, then on files that match your criteria, use regular expressions to capture the first two characters of the day of the week and the first six characters of the location, then use those captures to create a new filename. This is my best guess. Use -WhatIf on the Move-Item cmdlet until you get the regular expression and the destination path correct.
Get-ChildItem C:\Path\To\Files *.txt |
Where-Object { $_.BaseName -match '^([^,]{2})[^,]*,(.{1,6})' } |
Move-Item -WhatIf -Destination {
$newFileName = '{0}{1}.txt' -f $matches[1],$matches[2]
Join-Path C:\Path\To\Files $newFileName.ToLower()
}
I think you should be able to achieve this by splitting the string into arrays in PowerShell and then reordering the array elements to get your reversed name.
For example:
$fileNameExtension = "Monday,England.txt"
$fileName = $fileNameExtension.Split(".")   # gets you an array [Monday,England][txt]
$fileparts = $fileName[0].Split(",")        # gets you an array [Monday][England]
# Create the new substring parts; notice you can now pull items from the array in any order you need.
# Check the length before calling Substring, because it throws if the string is too short.
$part1 = $fileparts[1].Substring(0, [Math]::Min(6, $fileparts[1].Length))
$part2 = $fileparts[0].Substring(0, [Math]::Min(2, $fileparts[0].Length))
# Now construct the new file name by rebuilding the string
$newfileName = $part1 + $part2 + "." + $fileName[1]
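The truncation rule from the question (at most six characters of the location, at most two of the day) can also be sketched as a reusable helper. The function name Get-ReversedName is hypothetical, the lowercasing follows the first answer rather than a stated requirement, and the sketch assumes every base name contains a comma.

```powershell
# Hypothetical helper: 'Monday,England.txt' -> 'englanmo.txt'
function Get-ReversedName {
    param([Parameter(Mandatory)][string]$FileName)

    $base = [System.IO.Path]::GetFileNameWithoutExtension($FileName)
    $ext  = [System.IO.Path]::GetExtension($FileName)

    # Split into at most two parts around the first comma.
    $day, $place = $base -split ',', 2

    $placePart = $place.Substring(0, [Math]::Min(6, $place.Length))
    $dayPart   = $day.Substring(0, [Math]::Min(2, $day.Length))

    ($placePart + $dayPart + $ext).ToLower()
}

Get-ReversedName 'Monday,England.txt'    # englanmo.txt
Get-ReversedName 'Wednesday,Spain.txt'   # spainwe.txt
```

A possible use is Get-ChildItem -Filter *.txt | Rename-Item -NewName { Get-ReversedName $_.Name } -WhatIf, dropping -WhatIf after checking the preview.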