How to rename files with sequential even and odd numbers in PowerShell?

I would like to know how I can rename the files in a specific folder with a sequence of only even or only odd numbers in PowerShell. E.g. Folder1: pag_001.jpg, pag_003.jpg, pag_005.jpg ... pag_201.jpg; Folder2: pag_002.jpg, pag_004.jpg, pag_006.jpg ... pag_200.jpg. I have a document that was scanned with all the odd pages first and then all the even pages, so the file names run in one consecutive sequence from 1 to 201. I then moved the first half of the files, the odd pages, into a new place, Folder1, and the second half, the even pages, into Folder2. That is why I would like to rename them first and then join them back together under their new names.
I have tried this, based on a similar post.
At the moment I can generate even number sequences like this:
ForEach ($number in 1..100 ) { $number * 2}
and odd numbers like this:
ForEach ($number in 0..100 ) { $number *2+1}
and I wanted to apply the sequences generated above to rename my files like this:
cd C:\test\Folder1
$i = $number * 2
Get-ChildItem *.jpg | %{Rename-Item $_ -NewName ('pag_{0:D3}.jpg' -f $i++)}
but it doesn't work! Any suggestions are welcome.

Your $i++ adds 1 each time, which is why it also produces even numbers.
You can create an array of odd numbers and then use $i++ to step through that array one item at a time, like this:
$path = "C:\test\Folder1"
$oddNumbersArray = 0..100 | % {$_ *2 +1}
$i = 0
Get-ChildItem $path -Filter *.jpg | % {Rename-Item $_ -NewName ("pag_$($oddNumbersArray[$i]).jpg") ;$i++}
For even numbers, change the $oddNumbersArray line to use {$_ * 2} instead; a sketch of that version follows.
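For example, a small sketch of the even-number version (assuming the second folder is C:\test\Folder2; the range starts at 1 so the sequence begins at 2, and the '{0:D3}' format keeps the zero-padding from the question):
$path = "C:\test\Folder2"
$evenNumbersArray = 1..100 | % { $_ * 2 }   # 2, 4, 6, ... 200
$i = 0
Get-ChildItem $path -Filter *.jpg | % {
    Rename-Item $_.FullName -NewName ('pag_{0:D3}.jpg' -f $evenNumbersArray[$i])
    $i++
}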

There are a bunch of ways to do this. For mine, we add each index as a member so that it is more easily accessible in the Rename-Item script block.
$index = 0
Get-ChildItem $path -Filter "*.jpg" | ForEach-Object{
    $index = $index + 2
    $_ | Add-Member -Name "Index" -MemberType NoteProperty -Value $index -PassThru
} | Rename-Item -NewName {'pag_{0:D3}.jpg' -f $_.Index} -WhatIf
Using Add-Member inside ForEach-Object, we update the value of $index and then add it as a property of the same name. Then, in your Rename-Item script block, we can call that property. Remove the -WhatIf after you have verified that the new names are what you wanted. Start $index at 0 for even numbers or at -1 for odd numbers.
Another method uses a global index variable and works around the pipeline scoping by using calculated properties to create the pipeline variables that Rename-Item uses.
$path = "C:\Temp\csv"
$global:index = 0 # Use -1 for odd
Get-ChildItem $path -Filter "*.csv" |
Select-Object @{Name="Path";Expression={$_.FullName}},
    @{Name="NewName";Expression={$global:index = $global:index + 2; 'pag_{0:D3}.jpg' -f $global:index}} |
Rename-Item -WhatIf
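If the eventual goal is to merge the two renamed folders back into a single set, as the question describes, something along these lines might do it (untested sketch; the destination folder name is an assumption):
$destination = 'C:\test\Merged'   # assumed destination for the combined set
New-Item -ItemType Directory -Path $destination -Force | Out-Null
Get-ChildItem 'C:\test\Folder1', 'C:\test\Folder2' -Filter *.jpg |
    Sort-Object Name |
    Copy-Item -Destination $destination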

Related

Bulk renaming files with different extensions in order using powershell

Is there a way to bulk-rename items so that a folder with the items arranged in order would have their names changed into zero-padded numbers, regardless of extension?
for example, a folder with files named:
file1.jpg
file2.jpg
file3.jpg
file4.png
file5.png
file6.png
file7.png
file8.jpg
file9.jpg
file10.mp4
would end up like this:
01.jpg
02.jpg
03.jpg
04.png
05.png
06.png
07.png
08.jpg
09.jpg
10.mp4
I had a script I found somewhere that can rename files in alphabetical order. However, it seems to only accept conventionally bulk-renamed files (done by selecting all the files and renaming them so that they read "file (1).jpg" etc.), which messes up the ordering when dealing with differing file extensions. It also doesn't seem to rename files with variations in their file names. Here is what the code looked like:
Get-ChildItem -Path C:\Directory -Filter file* | % {
    $matched = $_.BaseName -match "\((?<number>\d+)\)"
    if (-not $matched) { break }
    [int]$number = $Matches["number"]
    Rename-Item -Path $_.FullName -NewName "$($number.ToString("000"))$($_.Extension)"
}
If your intent is to rename the files based on the ending digits of their BaseName, you can use Get-ChildItem in combination with Where-Object to filter them, and then pipe the result to Rename-Item using a delay-bind script block.
Needless to say, this code does not handle file collisions: if there is more than one file with the same ending digits and the same extension, it will error out.
Get-ChildItem -Filter file* | Where-Object { $_.BaseName -match '\d+$' } |
    Rename-Item -NewName {
        $basename = '{0:00}' -f [int][regex]::Match($_.BaseName, '\d+$').Value
        $basename + $_.Extension
    }
To test the code you can use the following:
@'
file1.jpg
file2.jpg
file3.jpg
file4.png
file5.png
file6.png
file7.png
file8.jpg
file9.jpg
file10.mp4
'@ -split '\r?\n' -as [System.IO.FileInfo[]] | ForEach-Object {
    $basename = '{0:00}' -f [int][regex]::Match($_.BaseName, '\d+$').Value
    $basename + $_.Extension
}
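If collisions are a concern, a variant of the rename above (untested sketch) could check for an existing target first and skip it instead of erroring out:
Get-ChildItem -Filter file* | Where-Object { $_.BaseName -match '\d+$' } |
    ForEach-Object {
        $newName = ('{0:00}' -f [int][regex]::Match($_.BaseName, '\d+$').Value) + $_.Extension
        if (Test-Path -LiteralPath (Join-Path $_.DirectoryName $newName)) {
            # target already exists; report and leave the file alone
            Write-Warning "Skipping $($_.Name): $newName already exists"
        }
        else {
            Rename-Item -LiteralPath $_.FullName -NewName $newName
        }
    }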
You could just use the number of files found in the folder to create the appropriate 'numbering' format for renaming them.
$files = (Get-ChildItem -Path 'D:\Test' -File) | Sort-Object Name
# depending on the number of files, create a formating template
# to get the number of leading zeros correct.
# example: 645 files would create this format: '{0:000}{1}'
$format = '{0:' + '0' * ($files.Count).ToString().Length + '}{1}'
# a counter for the index number
$index = 1
# now loop over the files and rename them
foreach ($file in $files) {
$file | Rename-Item -NewName ($format -f $index++, $file.Extension) -WhatIf
}
The -WhatIf switch is a safety measure. With it, no file actually gets renamed; you will only see in the console what WOULD happen. Once you are happy with that, remove the -WhatIf switch from the code and run it again to rename all the files in the folder.

Renaming multiple files with correct numerical order in Powershell

I have .mp3 files in my folder with these names: 2.mp3,4.mp3,6.mp3,8.mp3,10.mp3,12.mp3,...102.mp3,104.mp3...
and I would like to rename them all to get the following result: 0001.mp3, 0002.mp3, 0003.mp3, 0004.mp3...
The following code changes the names of all the files, but the problem is that it does not rename them in their original numerical order as they are displayed in my folder.
$i = 1
Get-ChildItem *.mp3 | %{Rename-Item $_ -NewName ('{0:D4}.mp3' -f $i++)}
I want to achieve something like what is written below:
2.mp3 -> 0001.mp3
4.mp3 -> 0002.mp3
6.mp3 -> 0003.mp3
8.mp3 -> 0004.mp3
10.mp3 -> 0005.mp3
12.mp3 -> 0006.mp3
...
I have currently come to this half-solution, which does not work, and I do not know how to proceed.
$i = 1
Get-ChildItem *.mp3 | %{Rename-Item $_ -NewName {'{0:D4}.mp3' -f [int]($_.BaseName -replace '\D') $i++ }}
Use Sort-Object to sort them according to the numerical value in the name before you start renaming:
$i = 1
Get-ChildItem *.mp3 |Sort-Object {[int]($_.BaseName -replace '\D')} | Rename-Item ...
Although you may run into unexpected errors because $i++ does not have the effect you intend - the ++ operation creates a new local copy of $i without ever incrementing $i in the parent scope.
We can solve that by explicitly passing a reference to $i:
$i = 1
Get-ChildItem *.mp3 |Sort-Object {[int]($_.BaseName -replace '\D')} | Rename-Item -NewName {'{0:D4}.mp3' -f ([ref]$i).Value++}

Powershell/Unix Bash - Find missing numbers for a set of files

I'm currently trying to detect some missing files for batches.
This batch generates files with following convention:
File_1_01.txt (first hour, first iteration)
File_1_02.txt
I want to create a script to find out which iteration is missing.
I found this piece of code in powershell
function missingNumbers {
Get-ChildItem -path ./* -include *.txt
| Where{$_.Name -match '\d+'}
| ForEach{$Matches[0]}
| sort {[int]$_} |% {$i = 1}
{while ($i -lt $_){$i;$i++};$i++}}
However, when I use this script against
File_1_01.txt
File_1_02.txt
File_1_04.txt
It doesn't return entry 03
Investigations
Removing the hourly part (the File_<1>_ prefix) helps, and the script behaves as expected.
If I concatenate the hour and the occurrence, the script displays all numbers before 101.
I'm open to a Unix solution as well.
I had another approach in mind (removing the common text between all the file names), but I have no idea how to do it.
Instead of matching the 1 every time, you can match the 2 digits \d\d at the end. Putting the pipe symbol on the next line like that will only work in powershell 7.
Get-ChildItem -path ./* -include *.txt |
    Where Name -match '\d\d' |
    ForEach{ $Matches[0] } |
    sort {[int]$_} |
    % { $i = 1 } { while ($i -lt $_){ $i; $i++ }; $i++ }
3
You could put all the numbers together like this, but then they'd have to all be right after each other. Adjust the match pattern or get-childitem for your needs.
Get-ChildItem -path ./* -include *.txt |
Where Name -match '(\d).(\d\d)' |
ForEach{ $Matches[1] + $Matches[2] }
101
102
104

How to use Powershell to list duplicate files in a folder structure that exist in one of the folders

I have a source tree, say c:\s, with many sub-folders. One of the sub-folders is called "c:\s\Includes" which can contain one or more .cs files recursively.
I want to make sure that none of the .cs files in the c:\s\Includes... path exist in any other folder under c:\s, recursively.
I wrote the following PowerShell script which works, but I'm not sure if there's an easier way to do it. I've had less than 24 hours experience with PowerShell so I have a feeling there's a better way.
I can assume at least PowerShell 3 being used.
I will accept any answer that improves my script, but I'll wait a few days before accepting the answer. When I say "improve", I mean it makes it shorter, more elegant or with better performance.
Any help from anyone would be greatly appreciated.
The current code:
$excludeFolder = "Includes"
$h = @{}
foreach ($i in ls $pwd.path *.cs -r -file | ? DirectoryName -notlike ("*\" + $excludeFolder + "\*")) { $h[$i.Name]=$i.DirectoryName }
ls ($pwd.path + "\" + $excludeFolder) *.cs -r -file | ? { $h.Contains($_.Name) } | Select @{Name="Duplicate";Expression={$h[$_.Name] + " has file with same name as " + $_.Fullname}}
1
I stared at this for a while, determined to write it without studying the existing answers, but I'd already glanced at the first sentence of Matt's answer mentioning Group-Object. After some different approaches, I got basically the same answer, except that his is long-form and robust, with regex character escaping and setup variables, while mine is terse, because you asked for shorter answers and because that's more fun.
$inc = '^c:\\s\\includes'
$cs = (gci -R 'c:\s' -File -I *.cs) | group name
$nopes = $cs |?{($_.Group.FullName -notmatch $inc)-and($_.Group.FullName -match $inc)}
$nopes | % {$_.Name; $_.Group.FullName}
Example output:
someFile.cs
c:\s\includes\wherever\someFile.cs
c:\s\lib\factories\alt\someFile.cs
c:\s\contrib\users\aa\testing\someFile.cs
The concept is:
Get all the .cs files in the whole source tree
Split them into groups of {filename: {files which share this filename}}
For each group, keep only those where the set of files contains at least one file with a path that matches the includes folder and at least one file with a path that does not. This step covers:
duplicates (if a file only exists once it cannot pass both tests)
duplicates across the includes/not-includes divide, rather than files duplicated within only one branch
triplicates, n-tuplicates, and so on.
Edit: I added the ^ to $inc to say it has to match at the start of the string, so the regex engine can fail faster for paths that don't match. Maybe this counts as premature optimization.
2
After that pretty dense attempt, the shape of a cleaner answer is much much easier:
Get all the files, split them into include, not-include arrays.
Nested for-loop testing every file against every other file.
Longer, but enormously quicker to write (it runs slower, though) and I imagine easier to read for someone who doesn't know what it does.
$sourceTree = 'c:\\s'
$allFiles = Get-ChildItem $sourceTree -Include '*.cs' -File -Recurse
$includeFiles = $allFiles | where FullName -imatch "$($sourceTree)\\includes"
$otherFiles = $allFiles | where FullName -inotmatch "$($sourceTree)\\includes"
foreach ($incFile in $includeFiles) {
    foreach ($oFile in $otherFiles) {
        if ($incFile.Name -ieq $oFile.Name) {
            write "$($incFile.Name) clash"
            write "* $($incFile.FullName)"
            write "* $($oFile.FullName)"
            write "`n"
        }
    }
}
3
Because code-golf is fun. If the hashtables are faster, what about this even less tested one-liner...
$h=@{};gci c:\s -R -file -Filt *.cs|%{$h[$_.Name]+=@($_.FullName)};$h.Values|?{$_.Count -gt 1 -and $_ -like 'c:\s\includes*'}
Edit: explanation of this version: It's doing much the same solution approach as version 1, but the grouping operation happens explicitly in the hashtable. The shape of the hashtable becomes:
$h = {
    'fileA.cs': @('c:\cs\wherever\fileA.cs', 'c:\cs\includes\fileA.cs'),
    'file2.cs': @('c:\cs\somewhere\file2.cs'),
    'file3.cs': @('c:\cs\includes\file3.cs', 'c:\cs\x\file3.cs', 'c:\cs\z\file3.cs')
}
It hits the disk once for all the .cs files, iterates the whole list to build the hashtable. I don't think it can do less work than this for that bit.
It uses +=, so it can add files to the existing array for that filename, otherwise it would overwrite each of the hashtable lists and they would be one item long for only the most recently seen file.
It uses @() - because when it hits a filename for the first time, $h[$_.Name] won't return anything, and the script needs to put an array into the hashtable at that point, not a string. If it were +=$_.FullName, then the first file would go into the hashtable as a string, the += next time would do string concatenation, and that's no use to me. This forces the first file in the hashtable to start an array by forcing every file to be a one-item array. The least-code way to get this result is with +=@(..), but that churn of creating throwaway arrays for every single file is needless work. Maybe changing it to longer code which does less array creation would help?
Changing the section
%{$h[$_.Name]+=@($_.FullName)}
to something like
%{if (!$h.ContainsKey($_.Name)){$h[$_.Name]=@()};$h[$_.Name]+=$_.FullName}
(I'm guessing, I don't have much intuition for what's most likely to be slow PowerShell code, and haven't tested).
After that, using $h.Values isn't going over every file a second time; it's going over every array in the hashtable, one per unique filename. That has to happen to check the array size and prune the non-duplicates, but the -and operation short-circuits: when the Count -gt 1 test fails, the bit on the right checking the path name doesn't run.
If the array has two or more files in it, the -and $_ -like ... executes and pattern matches to see if at least one of the duplicates is in the includes path. (Bug: if all the duplicates are in c:\cs\includes and none anywhere else, it will still show them; a possible tweak is sketched just below.)
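A possible tweak for that bug (untested sketch, keeping the same one-liner style) is to also require that at least one copy sits outside the includes path:
$h=@{};gci c:\s -R -file -Filt *.cs|%{$h[$_.Name]+=@($_.FullName)};$h.Values|?{$_.Count -gt 1 -and ($_ -like 'c:\s\includes*') -and ($_ -notlike 'c:\s\includes*')}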
--
4
This is edited version 3 with the hashtable initialization tweak, and now it keeps track of seen files in $s, and then only considers those it's seen more than once.
$h=@{};$s=@{};gci 'c:\s' -R -file -Filt *.cs|%{if($h.ContainsKey($_.Name)){$s[$_.Name]=1}else{$h[$_.Name]=@()};$h[$_.Name]+=$_.FullName};$s.Keys|%{if ($h[$_] -like 'c:\s\includes*'){$h[$_]}}
Assuming it works, that's what it does, anyway.
--
Edit branch of topic; I keep thinking there ought to be a way to do this with the things in the System.Data namespace. Anyone know if you can connect System.Data.DataTable().ReadXML() to gci | ConvertTo-Xml without reams of boilerplate?
I'd do more or less the same, except I'd build the hashtable from the contents of the includes folder and then run over everything else to check for duplicates:
$root = 'C:\s'
$includes = "$root\includes"
$includeList = @{}
Get-ChildItem -Path $includes -Filter '*.cs' -Recurse -File |
% { $includeList[$_.Name] = $_.DirectoryName }
Get-ChildItem -Path $root -Filter '*.cs' -Recurse -File |
? { $_.FullName -notlike "$includes\*" -and $includeList.Contains($_.Name) } |
% { "Duplicate of '{0}': {1}" -f $includeList[$_.Name], $_.FullName }
I'm not as impressed with this as I would like but I thought that Group-Object might have a place in this question so I present the following:
$base = 'C:\s'
$unique = "$base\includes"
$extension = "*.cs"
Get-ChildItem -Path $base -Filter $extension -Recurse |
    Group-Object Name |
    Where-Object{ ($_.Count -gt 1) -and (($_.Group).FullName -match [regex]::Escape($unique)) } |
    ForEach-Object {
        $filename = $_.Name
        ($_.Group).FullName -notmatch [regex]::Escape($unique) | ForEach-Object{
            "'{0}' has file with same name as '{1}'" -f (Split-Path $_), $filename
        }
    }
Collect all the files with the extension filter $extension. Group the files based on their names. Then, of those groups, find every group that contains more than one file and where at least one of the group members is in the directory $unique. Take those groups and print out all the files that are not from the unique directory.
From Comment
For what it's worth, this is what I used for testing to create a bunch of files. (I know folder 9 is empty.)
$base = "E:\Temp\dev\cs"
Remove-Item "$base\*" -Recurse -Force
0..9 | %{[void](New-Item -ItemType directory "$base\$_")}
1..1000 | %{
$number = Get-Random -Minimum 1 -Maximum 100
$folder = Get-Random -Minimum 0 -Maximum 9
[void](New-Item -Path $base\$folder -ItemType File -Name "$number.txt" -Force)
}
After looking at all the others, I thought I would try a different approach.
$includes = "C:\s\includes"
$root = "C:\s"
# First script
Measure-Command {
[string[]]$filter = ls $includes -Filter *.cs -Recurse | % name
ls $root -include $filter -Recurse -Filter *.cs |
Where-object{$_.FullName -notlike "$includes*"}
}
# Second Script
Measure-Command {
$filter2 = ls $includes -Filter *.cs -Recurse
ls $root -Recurse -Filter *.cs |
Where-object{$filter2.name -eq $_.name -and $_.FullName -notlike "$includes*"}
}
In my first script, I get all the include files into a string array. Then I use that string array as an -Include parameter on Get-ChildItem. In the end, I filter out the include folder from the results.
In my second script, I enumerate everything and then filter after the pipe.
Remove the measure-command to see the results. I was using that to check the speed. With my dataset, the first one was 40% faster.
$FilesToFind = Get-ChildItem -Recurse 'c:\s\includes' -File -Include *.cs | Select-Object -ExpandProperty Name
Get-ChildItem -Recurse C:\S -File -Include *.cs | ? { $_.Name -in $FilesToFind -and $_.Directory -notmatch '^c:\\s\\includes' } | Select Name, Directory
Create a list of file names to look for.
Find all files that are in the list but not part of the directory the list was generated from
Print their name and directory

Remove Top Line of Text File with PowerShell

I am trying to just remove the first line of about 5000 text files before importing them.
I am still very new to PowerShell, so I'm not sure what to search for or how to approach this. My current concept, using pseudo-code:
set-content file (get-content unless line contains amount)
However, I can't seem to figure out how to do something like contains.
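For reference, a literal sketch of that pseudo-code (where 'amount' just stands in for the text to look for) might be:
# read the whole file first, then write back only the lines that do not contain 'amount'
Set-Content $file ((Get-Content $file) | Where-Object { $_ -notmatch 'amount' })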
While I really admire the answer from @hoge, both for a very concise technique and for a wrapper function to generalize it, and I encourage upvotes for it, I am compelled to comment on the other two answers that use temp files (it gnaws at me like fingernails on a chalkboard!).
Assuming the file is not huge, you can force the pipeline to operate in discrete sections, thereby obviating the need for a temp file, with judicious use of parentheses:
(Get-Content $file | Select-Object -Skip 1) | Set-Content $file
... or in short form:
(gc $file | select -Skip 1) | sc $file
It is not the most efficient in the world, but this should work:
get-content $file |
select -Skip 1 |
set-content "$file-temp"
move "$file-temp" $file -Force
Using variable notation, you can do it without a temporary file:
${C:\file.txt} = ${C:\file.txt} | select -skip 1
function Remove-Topline ( [string[]]$path, [int]$skip=1 ) {
    if ( -not (Test-Path $path -PathType Leaf) ) {
        throw "invalid filename"
    }
    ls $path |
        % { iex "`${$($_.fullname)} = `${$($_.fullname)} | select -skip $skip" }
}
I just had to do the same task, and gc | select ... | sc took over 4 GB of RAM on my machine while reading a 1.6 GB file. It didn't finish for at least 20 minutes after reading the whole file in (as reported by Read Bytes in Process Explorer), at which point I had to kill it.
My solution was to use a more .NET approach: StreamReader + StreamWriter.
See this answer for a great discussion of the performance: In Powershell, what's the most efficient way to split a large text file by record type?
Below is my solution. Yes, it uses a temporary file, but in my case, it didn't matter (it was a freaking huge SQL table creation and insert statements file):
PS> (measure-command{
    $i = 0
    $ins = New-Object System.IO.StreamReader "in/file/pa.th"
    $outs = New-Object System.IO.StreamWriter "out/file/pa.th"
    while( !$ins.EndOfStream ) {
        $line = $ins.ReadLine();
        if( $i -ne 0 ) {
            $outs.WriteLine($line);
        }
        $i = $i+1;
    }
    $outs.Close();
    $ins.Close();
}).TotalSeconds
It returned:
188.1224443
Inspired by AASoft's answer, I went out to improve it a bit more:
Avoid the loop variable $i and the comparison with 0 in every loop
Wrap the execution into a try..finally block to always close the files in use
Make the solution work for an arbitrary number of lines to remove from the beginning of the file
Use a variable $p to reference the current directory
These changes lead to the following code:
$p = (Get-Location).Path
(Measure-Command {
    # Number of lines to skip
    $skip = 1
    $ins = New-Object System.IO.StreamReader ($p + "\test.log")
    $outs = New-Object System.IO.StreamWriter ($p + "\test-1.log")
    try {
        # Skip the first N lines, but allow for fewer than N, as well
        for( $s = 1; $s -le $skip -and !$ins.EndOfStream; $s++ ) {
            $ins.ReadLine()
        }
        while( !$ins.EndOfStream ) {
            $outs.WriteLine( $ins.ReadLine() )
        }
    }
    finally {
        $outs.Close()
        $ins.Close()
    }
}).TotalSeconds
The first change brought the processing time for my 60 MB file down from 5.3s to 4s. The rest of the changes are more cosmetic.
$x = get-content $file
$x[1..$x.count] | set-content $file
Just that much. Long boring explanation follows. Get-content returns an array. We can "index into" array variables, as demonstrated in this and other Scripting Guys posts.
For example, if we define an array variable like this,
$array = @("first item","second item","third item")
so $array returns
first item
second item
third item
then we can "index into" that array to retrieve only its 1st element
$array[0]
or only its 2nd
$array[1]
or a range of index values from the 2nd through the last.
$array[1..$array.count]
I just learned from a website:
Get-ChildItem *.txt | ForEach-Object { (get-Content $_) | Where-Object {(1) -notcontains $_.ReadCount } | Set-Content -path $_ }
Or you can use the aliases to make it short, like:
gci *.txt | % { (gc $_) | ? { (1) -notcontains $_.ReadCount } | sc -path $_ }
Another approach to removing the first line from a file uses the multiple-assignment technique:
$firstLine, $restOfDocument = Get-Content -Path $filename
$modifiedContent = $restOfDocument
$modifiedContent | Out-String | Set-Content $filename
Select -Skip didn't work for me, so my workaround is:
$LinesCount = $(get-content $file).Count
get-content $file |
select -Last $($LinesCount-1) |
set-content "$file-temp"
move "$file-temp" $file -Force
Following on from Michael Soren's answer.
If you want to edit all .txt files in the current directory and remove the first line from each:
Get-ChildItem (Get-Location).Path -Filter *.txt |
Foreach-Object {
(Get-Content $_.FullName | Select-Object -Skip 1) | Set-Content $_.FullName
}
For smaller files you could use this:
& C:\windows\system32\more +1 oldfile.csv > newfile.csv | out-null
... but it's not very effective at processing my example file of 16MB. It doesn't seem to terminate and release the lock on newfile.csv.