Create directories and subdirectories in Powershell to move files - powershell

I am trying to create directories and subdirectories based on the names of existing files. After that I want to move those files into the corresponding directories. I have already come pretty far, also with the help of here and here, but I am failing at some point.
Existing test files          Folder structure
(actually about 5000 files)  (this is how it should look afterwards)
MM0245AK625_G03_701.txt      MM\MM0245\625\G03\MM0245AK625_G03_701.txt
MM0245AK830_G04_701.txt      MM\MM0245\830\G04\MM0245AK830_G04_701.txt
VY0245AK_G03.txt             VY\VY0245\VY0245AK_G03.txt
VY0245AK_G03_701.txt         VY\VY0245\G03\VY0245AK_G03_701.txt
VY0245AK625_G03.txt          VY\VY0245\625\VY0245AK625_G03.txt
VY0245AK625_G03_701.txt      VY\VY0245\625\G03\VY0245AK625_G03_701.txt
VY0345AK625_G03_701.txt      VY\VY0345\625\G03\VY0345AK625_G03_701.txt
Code for creating those files is at the end of this post.
As you can see, the files do match some kind of pattern, but not consistently. I use multiple copies of my code with different 'parameters' to sort each type of file pattern, but there has to be a more streamlined way.
Existing code
$dataPath = "$PSScriptRoot\Test"
#$newDataPath = "$PSScriptRoot\"
Get-ChildItem $dataPath -Filter *.txt | % {
    $g1 = $_.BaseName.Substring(0, 2)
    $g2 = $_.BaseName.Substring(0, 6)
    $g3 = $_.BaseName.Substring(8, 3)
    $g4 = $_.BaseName.Substring(12, 3)
    $path = "$DataPath\$g1\$g2\$g3\$g4"
    if (-not (Test-Path $path)) {
        New-Item -ItemType Directory -Path $path
    }
    Move-Item -Path $_.FullName -Destination $path
}
This code also creates directories in the 3rd $g3 layer for files in "the shorter format", e.g. XX0000AK_G00.txt. This file should however not be moved further than layer $g2. Of course the code above is not capable of doing this, so I tried it with regex below.
This is an alternative idea (not worked out further than creating directories), but I failed to continue after Select-Object -Unique. I cannot use $Matches[1] in New-Item, because I can only pass the variable $_ through Select-Object -Unique, not $Matches[1] or even the subdirectory "$($Matches[1])$($Matches[2])". The following code is my attempt.
cd $PSScriptRoot\Test
# Create Folder Layer 1
Get-ChildItem |
    % {
        $_.BaseName -match "^(\w{2})(\d{4})AN(\d{3})?_(G\d{2})(_\d{3})?$" | Out-Null
        $Matches[1]
        "$($Matches[1])$($Matches[2])"
    } |
    Select-Object -Unique |
    % {
        New-Item -ItemType directory $_
    } | Out-Null
I am fairly new to powershell, please don't be too harsh :) I also don't have a programming background, so please excuse the use of incorrect wording.
new-item $dataPath\MM0245AK830_G04_701.txt -ItemType File
new-item $dataPath\VY0245AK_G03.txt -ItemType File
new-item $dataPath\VY0245AK_G03_701.txt -ItemType File
new-item $dataPath\VY0245AK625_G03.txt -ItemType File
new-item $dataPath\VY0245AK625_G03_701.txt -ItemType File
new-item $dataPath\VY0345AK625_G03_701.txt -ItemType File

i am truly bad at complex regex patterns [blush], so this is done with simple string ops, mostly.
what the code does ...
fakes reading in some files
when you have tested this and it works as needed on all your test files, replace the entire #region/#endregion block with a Get-ChildItem call.
iterates thru the collection
splits the BaseName on ak & saves it for later use
checks for the two short file layouts
checks for 1 _ versus 2
builds the $Dir string for each of those 2 filename layouts
builds the long file name $Dir
uses the previous $Dir stuff to build the $FullDest for each file
shows the various stages for each file
that last section would be replaced with your mkdir & Move-Item commands.
the code ...
#region >>> fake reading in files
# when ready to use the real things, use Get-ChildItem
$InStuff = @'
MM0245AK625_G03_701.txt
MM0245AK830_G04_701.txt
VY0245AK_G03.txt
VY0245AK_G03_701.txt
VY0245AK625_G03.txt
VY0245AK625_G03_701.txt
VY0345AK625_G03_701.txt
'@ -split [System.Environment]::NewLine |
    ForEach-Object {
        [System.IO.FileInfo]$_
    }
#endregion >>> fake reading in files
foreach ($IS_Item in $InStuff)
{
    $BNSplit_1 = $IS_Item.BaseName -split 'ak'
    if ($BNSplit_1[-1].StartsWith('_'))
    {
        if (($BNSplit_1[-1] -replace '[^_]').Length -eq 1)
        {
            $Dir = '{0}\{1}' -f $IS_Item.BaseName.Substring(0, 2),
                                $IS_Item.BaseName.Substring(0, 6)
        }
        else
        {
            $Dir = '{0}\{1}\{2}' -f $IS_Item.BaseName.Substring(0, 2),
                                    $IS_Item.BaseName.Substring(0, 6),
                                    $IS_Item.BaseName.Split('_')[1]
        }
    }
    else
    {
        $Dir = '{0}\{1}\{2}\{3}' -f $IS_Item.BaseName.Substring(0, 2),
                                    $IS_Item.BaseName.Substring(0, 6),
                                    $BNSplit_1[-1].Split('_')[0],
                                    $BNSplit_1[-1].Split('_')[1]
    }
    $FullDest = Join-Path -Path $Dir -ChildPath $IS_Item
    #region >>> show what was done with each file
    # replace this block with your MkDir & Move-Item commands
    $IS_Item.Name
    $Dir
    $FullDest
    'depth = {0}' -f ($FullDest.Split('\').Count - 1)
    '=' * 20
    #endregion >>> show what was done with each file
}
the output ...
MM0245AK625_G03_701.txt
MM\MM0245\625\G03
MM\MM0245\625\G03\MM0245AK625_G03_701.txt
depth = 4
====================
MM0245AK830_G04_701.txt
MM\MM0245\830\G04
MM\MM0245\830\G04\MM0245AK830_G04_701.txt
depth = 4
====================
VY0245AK_G03.txt
VY\VY0245
VY\VY0245\VY0245AK_G03.txt
depth = 2
====================
VY0245AK_G03_701.txt
VY\VY0245\G03
VY\VY0245\G03\VY0245AK_G03_701.txt
depth = 3
====================
VY0245AK625_G03.txt
VY\VY0245\625\G03
VY\VY0245\625\G03\VY0245AK625_G03.txt
depth = 4
====================
VY0245AK625_G03_701.txt
VY\VY0245\625\G03
VY\VY0245\625\G03\VY0245AK625_G03_701.txt
depth = 4
====================
VY0345AK625_G03_701.txt
VY\VY0345\625\G03
VY\VY0345\625\G03\VY0345AK625_G03_701.txt
depth = 4
====================

I would first split each file BaseName on the underscore. Then use a regex to split the first part into several array elements, combine that with a possible second part of the split in order to create the destination folder path for the files.
$DataPath = "$PSScriptRoot\Test"
$files = Get-ChildItem -Path $DataPath -Filter '*_*.txt' -File
foreach ($file in $files) {
    $parts = $file.BaseName -split '_'
    # regex your way to split the first part into path elements (remove empty items)
    $folders = [regex]::Match($parts[0], '(?i)^(.{2})(\d{4})[A-Z]{2}(\d{3})?').Groups[1..3].Value | Where-Object { $_ -match '\S'}
    # the second part is a merge with the first part
    $folders[1] = $folders[0] + $folders[1]
    # if there was a third part after the split on the underscore, add $parts[1] (i.e. 'Gxx') to the folders array
    if ($parts.Count -gt 2) { $folders += $parts[1] }
    # join the array elements with a backslash (i.e. [System.IO.Path]::DirectorySeparatorChar)
    # and join all that to the $DataPath to create the full destination for the file
    $target = Join-Path -Path $DataPath -ChildPath ($folders -join '\')
    # create the folder if that does not yet exist
    $null = New-Item -Path $target -ItemType Directory -Force
    # move the file to that (new) directory
    $file | Move-Item -Destination $target -WhatIf
}
The -WhatIf switch makes the code not move anything to the new destination; it will only display where each file would go. Once you are happy with that information, remove -WhatIf and run the code again.
After moving, your file structure will look like this:
D:\TEST
+---MM
|   \---MM0245
|       +---625
|       |   \---G03
|       |           MM0245AK625_G03_701.txt
|       |
|       \---830
|           \---G04
|                   MM0245AK830_G04_701.txt
|
\---VY
    +---VY0245
    |   |   VY0245AK_G03.txt
    |   |
    |   +---625
    |   |   |   VY0245AK625_G03.txt
    |   |   |
    |   |   \---G03
    |   |           VY0245AK625_G03_701.txt
    |   |
    |   \---G03
    |           VY0245AK_G03_701.txt
    |
    \---VY0345
        \---625
            \---G03
                    VY0345AK625_G03_701.txt
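As a variation on the same idea, a single regex with named groups can decide which folder levels apply to a file. This is only a minimal sketch: the function name Get-TargetFolder and the group names are made up for illustration, and the pattern assumes every base name follows one of the layouts listed in the question.

```powershell
# Hypothetical helper: derive the relative target folder from a file's base name.
# Assumed layout: <2 chars><4 digits><2 letters>[3 digits]_<Gxx>[_<3 digits>]
function Get-TargetFolder {
    param([Parameter(Mandatory)][string]$BaseName)

    $pattern = '^(?<prefix>\w{2})(?<number>\d{4})[A-Za-z]{2}(?<code>\d{3})?_(?<group>G\d{2})(?<suffix>_\d{3})?$'
    if ($BaseName -match $pattern) {
        # the first two levels always exist: 'VY' and 'VY0245'
        $folders = @($Matches['prefix'], ($Matches['prefix'] + $Matches['number']))
        # optional 3-digit block directly after the two letters
        if ($Matches['code'])   { $folders += $Matches['code'] }
        # the Gxx level is only used when a trailing _nnn part exists
        if ($Matches['suffix']) { $folders += $Matches['group'] }
        $folders -join '\'
    }
    else {
        # the name does not fit the assumed layout
        $null
    }
}

Get-TargetFolder -BaseName 'VY0245AK_G03'
Get-TargetFolder -BaseName 'VY0245AK625_G03'
Get-TargetFolder -BaseName 'MM0245AK625_G03_701'
```

The returned string can be passed to Join-Path as the -ChildPath, in the same place where ($folders -join '\') is used above. Files that do not match return nothing, so they can be skipped or logged instead of being moved to a wrong folder.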

Related

Bulk renaming files with different extensions in order using powershell

is there a way to bulk rename items such that a folder with the items arranged in order would have their name changed into numbers with zero padding regardless of extension?
for example, a folder with files named:
file1.jpg
file2.jpg
file3.jpg
file4.png
file5.png
file6.png
file7.png
file8.jpg
file9.jpg
file10.mp4
would end up like this:
01.jpg
02.jpg
03.jpg
04.png
05.png
06.png
07.png
08.jpg
09.jpg
10.mp4
i had a script i found somewhere that can rename files in alphabetical order. however, it seems to only accept conventionally bulk renamed files (done by selecting all the files, and renaming them such that they read "file (1).jpg" etc), which messes up the ordering when dealing with differing file extensions. it also doesn't seem to rename files with variations in their file names. here is what the code looked like:
Get-ChildItem -Path C:\Directory -Filter file* | % {
    $matched = $_.BaseName -match "\((?<number>\d+)\)"
    if (-not $matched) {break;}
    [int]$number = $Matches["number"]
    Rename-Item -Path $_.FullName -NewName "$($number.ToString("000"))$($_.Extension)"
}
If your intent is to rename the files based on the ending digits of their BaseName you can use Get-ChildItem in combination with Where-Object for filtering them and then pipe this result to Rename-Item using a delay-bind script block.
Needless to say, this code does not handle file name collisions. If there is more than one file with the same ending digits and the same extension, this will error out.
Get-ChildItem -Filter file* | Where-Object { $_.BaseName -match '\d+$' } |
    Rename-Item -NewName {
        $basename = '{0:00}' -f [int][regex]::Match($_.BaseName, '\d+$').Value
        $basename + $_.Extension
    }
To test the code you can use the following:
@'
file1.jpg
file2.jpg
file3.jpg
file4.png
file5.png
file6.png
file7.png
file8.jpg
file9.jpg
file10.mp4
'@ -split '\r?\n' -as [System.IO.FileInfo[]] | ForEach-Object {
    $basename = '{0:00}' -f [int][regex]::Match($_.BaseName, '\d+$').Value
    $basename + $_.Extension
}
You could just use the number of files found in the folder to create the appropriate 'numbering' format for renaming them.
$files = (Get-ChildItem -Path 'D:\Test' -File) | Sort-Object Name
# depending on the number of files, create a formatting template
# to get the number of leading zeros correct.
# example: 645 files would create this format: '{0:000}{1}'
$format = '{0:' + '0' * ($files.Count).ToString().Length + '}{1}'
# a counter for the index number
$index = 1
# now loop over the files and rename them
foreach ($file in $files) {
    $file | Rename-Item -NewName ($format -f $index++, $file.Extension) -WhatIf
}
The -WhatIf switch is a safety measure. With this, no file actually gets renamed; you will only see in the console what WOULD happen. Once you are content with that, remove the -WhatIf switch from the code and run it again to rename all the files in the folder.
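One thing to watch with Sort-Object Name is that it sorts as text, so file10 comes before file2. A minimal sketch of a numeric-aware sort key combined with the format template, working on plain name strings rather than real files; the variable names $naturalKey and $newNames are made up for illustration:

```powershell
# The sample names from the question, deliberately out of order.
$names = 'file10.mp4', 'file2.jpg', 'file1.jpg', 'file4.png', 'file3.jpg',
         'file6.png', 'file5.png', 'file9.jpg', 'file8.jpg', 'file7.png'

# Sort key: left-pad every run of digits with zeros, so 'file2' sorts before 'file10'.
$naturalKey = { [regex]::Replace($_, '\d+', { $args[0].Value.PadLeft(20, '0') }) }
$sorted = $names | Sort-Object -Property $naturalKey

# Same template idea as above: as many zeros as the item count has digits.
$format = '{0:' + ('0' * $sorted.Count.ToString().Length) + '}{1}'

$index = 1
$newNames = foreach ($name in $sorted) {
    $format -f $index, [System.IO.Path]::GetExtension($name)
    $index++
}
$newNames
```

With real files the same key works on the Name property, for example Sort-Object { [regex]::Replace($_.Name, '\d+', { $args[0].Value.PadLeft(20, '0') }) }.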

Powershell Scan for missing files in multiple folders

I'm checking for missing XYZ map tiles using Powershell, but coming unstuck in the nested loops. Essentially the map tiles exist from a "base" folder, within this base folder are multiple directories. Within each directory are the map tiles.
e.g.
C:\My Map\17\ # this is the Base folder, zoom level 17
C:\My Map\17\1234\ # this is a folder containing map tiles
C:\My Map\17\1234\30200.png # this is a map tile
C:\My Map\17\1234\30201.png # this is a map tile
C:\My Map\17\1234\30203.png # this is a map tile, but we're missing 30202.png (have tiles either side)
C:\My Map\17\1234\30204.png # this is a map tile
C:\My Map\17\1235\ # this is another folder containing map tiles [...]
So my idea is for each folder, scan for gaps where we have tiles each side and try to download them.
This is what I have so far:
$BasePath = "C:\_test\17\"
$ColumnDirectories = Get-ChildItem $BasePath -Directory
$ColumnDirectories | ForEach-Object {
    $ColumnDirectory = $ColumnDirectories.FullName
    $MapTiles = Get-ChildItem -Path $ColumnDirectory -Filter *.png -file
    $MapTiles | ForEach-Object {
        #Write-Host $MapTiles.FullName
        $TileName = $MapTiles.Name -replace '.png',''
        $TileNamePlus1 = [int]$TileName + 1
        $TileNamePlus2 = [int]$TileName + 2
        Write-Host $TileName
    }
}
But I'm getting Cannot convert the "System.Object[]" value of type "System.Object[]" to type "System.Int32".
Eventually I want to run Test-Path on each of $TileName, $TileNamePlus1 and $TileNamePlus2 and, where the middle one doesn't exist, download it again.
e.g.
C:\My Map\17\1234\30201.png -- Exists
C:\My Map\17\1234\30202.png -- Not exists, download from https://somemapsrv.com/17/1234/30202.png
C:\My Map\17\1234\30203.png -- Exists
Any help appreciated! I'm fairly new to Powershell.
The whole problem here is an understanding of how ForEach-Object loops work. Within the loop the automatic variable $_ represents the current iteration of the loop. So as suggested by the comments by dugas and Santiago Squarzon you need to change this line:
$TileName = $MapTiles.Name -replace '.png',''
to this:
$TileName = $_.Name -replace '\.png',''
Or more simply this (the BaseName property is the file name without the extension):
$TileName = $_.BaseName
Since all your png files have basenames as integer numbers, you could do something like this:
$BasePath = 'C:\_test\17'
$missing = Get-ChildItem -Path $BasePath -Directory | ForEach-Object {
    $ColumnDirectory = $_.FullName
    # get an array of the files in the folder, take the BaseName only
    $MapTiles = (Get-ChildItem -Path $ColumnDirectory -Filter '*.png' -File).BaseName
    # create an array of integer numbers taken from the files BaseName
    $sequence = $MapTiles | ForEach-Object { [int]$_ } | Sort-Object
    $sequence[0]..$sequence[-1] | Where-Object { $MapTiles -notcontains $_ } | ForEach-Object {
        Join-Path -Path $ColumnDirectory -ChildPath ('{0}.png' -f $_)
    }
}
# missing in this example has only one file, but could also be an array of missing sequential numbered files
$missing # --> C:\_test\17\1234\30202.png
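If your file names have leading zeros, this won't work, because the numbers in the range are compared against the BaseName strings, and '0030202' is not the same string as '30202'.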
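If leading zeros are a possibility, one way around it is to compare integers with integers. A minimal sketch working on plain strings; the function name Get-MissingNumber is made up for illustration, and the seven-digit width in the last line is only an assumption about how such files might be named:

```powershell
# Hypothetical helper: return the numbers missing between the lowest and highest value.
function Get-MissingNumber {
    param([Parameter(Mandatory)][string[]]$BaseName)

    # convert once to integers, so '0030202' and '30202' count as the same number
    $numbers = @($BaseName | ForEach-Object { [int]$_ } | Sort-Object)
    # walk the full range and keep what is not in the integer list
    $numbers[0]..$numbers[-1] | Where-Object { $numbers -notcontains $_ }
}

$missing = @(Get-MissingNumber -BaseName '0030200', '0030201', '0030203', '0030204')
$missing

# rebuild a zero-padded file name (the width of 7 is an assumption)
'{0:D7}.png' -f $missing[0]
```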

How to add MD5 hash to a PowerShell file dump

Looking for a line to add that pulls the file information as below but also includes an MD5 hash.
It can be from certutil, but there is no means to download that module, so I am looking for a way that uses PowerShell without an additional update of PowerShell.
We want to compare two disks for missing files, even when a file might be located in a different location.
cls
$filPath="G:/"
Set-Location -path $filPath
Get-ChildItem -Path $filPath -recurse |`
foreach-object{
    $Item=$_
    $Path =$_.FullName
    $ParentS=($_.FullName).split("/")
    $Parent=$ParentS[@($ParentS.Length-2)]
    $Folder=$_.PSIsContainer
    #$Age=$_.CreationTime
    #$Age=$_.ModifiedDate
    $Modified=$_.LastWriteTime
    $Type=$_.Extension
    $Path | Select-Object `
        @{n="Name";e={$Item}},`
        @{n="LastModified";e={$Modified}},`
        @{n="Extension";e={$Type}},`
        @{n="FolderName";e={if($Parent){$Parent}else{$Parent}}},`
        @{n="filePath";e={$Path}}`
} | Export-csv Q:/lpdi/fileDump.csv -NoTypeInformation
Possible answer here: (Thanks Guenther)
@{name="Hash";expression={(Get-FileHash -Algorithm MD5 -Path $Path).hash}}
This meets the file hash requirement and keeps the name of the file, which makes it possible to find the file in the folder and to know it matches one in another location based on the hash.
I'm not sure what goes into the file hash itself. If it includes the name of the file, the hash will be different. If it covers only the file content and the path doesn't matter, it should meet the requirement. I'm also not sure how to include it in the code above.
Your code could be simplified so you don't need all those 'in-between' variables.
Also, the path separator character in Windows is a backslash (\), not a forward slash (/), so this part of your code, $ParentS=($_.FullName).split("/"), does not do what you expect.
Try
$SourcePath = 'G:\'
Get-ChildItem -Path $SourcePath -File -Recurse | ForEach-Object {
    # remove the next line if you do not want console output
    Write-Host "Processing file '$($_.FullName)'.."
    $md5 = ($_ | Get-FileHash -Algorithm MD5).Hash
    $_ | Select-Object @{Name = 'Name'; Expression = { $_.Name }},
                       @{Name = 'LastModified'; Expression = { $_.LastWriteTime }},
                       @{Name = 'Extension'; Expression = { $_.Extension }},
                       @{Name = 'FolderName'; Expression = { $_.Directory.Name }},
                       @{Name = 'FilePath'; Expression = { $_.FullName }},
                       @{Name = 'FileHash'; Expression = { $md5 }}
} | Export-Csv -Path 'Q:/lpdi/fileDump.csv' -NoTypeInformation
Because getting hash values is a time consuming process I've added a Write-Host line, so you know the script did not 'hang'..
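On the question of whether the file name affects the hash: Get-FileHash reads only the file content, so two copies with different names or locations produce the same hash. A minimal sketch to check this, assuming PowerShell 4.0 or later (where Get-FileHash is available); the temporary folder and the file names are made up for the demonstration:

```powershell
# Create a scratch folder with two identical files and one different file.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
$null = New-Item -Path $dir -ItemType Directory
Set-Content -Path (Join-Path $dir 'a.txt') -Value 'same content'
Set-Content -Path (Join-Path $dir 'b.txt') -Value 'same content'
Set-Content -Path (Join-Path $dir 'c.txt') -Value 'other content'

$hashA = (Get-FileHash -Path (Join-Path $dir 'a.txt') -Algorithm MD5).Hash
$hashB = (Get-FileHash -Path (Join-Path $dir 'b.txt') -Algorithm MD5).Hash
$hashC = (Get-FileHash -Path (Join-Path $dir 'c.txt') -Algorithm MD5).Hash

$hashA -eq $hashB   # same content, different name
$hashA -eq $hashC   # different content

# clean up the scratch folder
Remove-Item -Path $dir -Recurse -Force
```

That is what makes the FileHash column usable for matching files across two disks regardless of where they are stored.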
Edit: Okay so, here is my workaround as promised.
Before we start, requirements are:
Have python 3.8 or above installed and registered in windows PATH
edit the ps1 file variables accordingly
edit the python file variables accordingly
bypass powershell script execution policies
There are 4 files in the working directory (different from your target directory):
addMD5.ps1 (static)
addMD5.py (static)
fileDump-original.csv (auto-generated)
fileDump-modified.csv (auto-generated)
Here are the contents of those 4 files:
addMD5.ps1
$targetDir="C:\Users\USERname4\Desktop\myGdrive"
$workingDir="C:\Users\USERname4\Desktop\myWorkingDir"
$pythonName="addMD5.py"
$exportName = "fileDump-original.csv"
Set-Location -path $workingDir
if (Test-Path $exportName)
{
    Remove-Item $exportName
}
Get-ChildItem -Path $targetDir -recurse |`
foreach-object{
    $Item=$_
    $Path =$_.FullName
    $ParentS=($_.FullName).split("/")
    $Parent=$ParentS[@($ParentS.Length-2)]
    $Folder=$_.PSIsContainer
    #$Age=$_.CreationTime
    #$Age=$_.ModifiedDate
    $Modified=$_.LastWriteTime
    $Type=$_.Extension
    $Path | Select-Object `
        @{n="Name";e={$Item}},`
        @{n="LastModified";e={$Modified}},`
        @{n="Extension";e={$Type}},`
        @{n="FolderName";e={if($Parent){$Parent}else{$Parent}}},`
        @{n="filePath";e={$Path}}`
} | Export-csv $exportName -NoTypeInformation
python $pythonName
addMD5.py
import os, hashlib
def file_len(fname):
    with open(fname) as fp:
        for i, line in enumerate(fp):
            pass
    return i + 1
def read_nth(fname,intNth):
    with open(fname) as fp:
        for i, line in enumerate(fp):
            if i == (intNth-1):
                return line
def getMd5(fname):
    file_hash = hashlib.md5()
    with open(fname, "rb") as f:
        chunk = f.read(8192)
        while chunk:
            file_hash.update(chunk)
            chunk = f.read(8192)
    return file_hash.hexdigest()
file1name = "fileDump-original.csv"
file2name = "fileDump-modified.csv"
try:
    os.remove(file2name)
except:
    pass
file2 = open(file2name , "w")
for linenum in range(file_len(file1name)):
    if (linenum+1) == 1:
        file2.write(read_nth(file1name,linenum+1).strip()+',"md5"\n')
    else:
        innerfilename = read_nth(file1name,linenum+1).split(",")[4].strip()[1:-1]
        file2.write(read_nth(file1name,linenum+1).strip()+',"'+getMd5(innerfilename)+'"\n')
file2.close()
fileDump-original.csv
"Name","LastModified","Extension","FolderName","filePath"
"test1.txt","20-Jun-21 12:50:44 PM",".txt","C:\Users\USERname4\Desktop\myGdrive\test1.txt","C:\Users\USERname4\Desktop\myGdrive\test1.txt"
"test2.txt","20-Jun-21 12:50:37 PM",".txt","C:\Users\USERname4\Desktop\myGdrive\test2.txt","C:\Users\USERname4\Desktop\myGdrive\test2.txt"
fileDump-modified.csv
"Name","LastModified","Extension","FolderName","filePath","md5"
"test1.txt","20-Jun-21 12:50:44 PM",".txt","C:\Users\USERname4\Desktop\myGdrive\test1.txt","C:\Users\USERname4\Desktop\myGdrive\test1.txt","d659c1bc0a3010b0bdd45d9a8fee3196"
"test2.txt","20-Jun-21 12:50:37 PM",".txt","C:\Users\USERname4\Desktop\myGdrive\test2.txt","C:\Users\USERname4\Desktop\myGdrive\test2.txt","d55749658669d28f8549d94cd01b72ba"

Append file name using CSV

I'm trying to rename files that match values in column one of a CSV, adding the value in column 3 to the beginning of the file name and leaving the rest of the file name intact. Here is what I have so far. I can't seem to figure out the Rename-Item.
# Common Paths
$PathRoot = "C:\Temp\somefiles" #Where the files are to rename
# Get csv file
$ClientAccounts = Import-CSV -path "\\server\some\path\to\csv\file.csv"
# Get each file and rename it
ForEach($row in $ClientAccounts)
{
    $CurrentClientTaxId = $row[-1].TaxId
    $CurrentClientName = $row[-1].ClientName
    #loop through files
    $FileExists = Test-Path -Path "$PathTotal\*$CurrentClientLB_Number*" #See if there is a file.
    If ($FileExists -eq $true) #The file does exist.
    {
        #ReName File
        Rename-Item -Path $PathRoot -NewName {$CurrentClientName + " " + $_.name}
    }
}
Let's suppose your CSV file looks similar to this:
"LB_Number","TaxId","ClientName"
"987654","12345","Microsoft"
"321456","91234","Apple"
"741852","81234","HP"
Column 1 has the portion of the existing file name to match
Column 3 has the client name you want to prepend to the file name
Then your function could be something like this:
# Common Paths
$PathRoot = "C:\Temp\somefiles" # Where the files are to rename
# Get csv file
$ClientAccounts = Import-CSV -path "\\server\some\path\to\csv\file.csv"
# Loop through all clients in the CSV
foreach($client in $ClientAccounts) {
    $CurrentClientLB_Number = $client.LB_Number
    $CurrentClientTaxId = $client.TaxId # unused...??
    $CurrentClientName = $client.ClientName
    # get the file(s) using wildcards (there can be more than one)
    # and rename them
    Get-ChildItem -Path "$PathRoot\*$CurrentClientLB_Number*" -File | ForEach-Object {
        $_ | Rename-Item -NewName ($CurrentClientName + " " + $_.Name)
    }
    # Curly braces work also, although this is not very common practice:
    # Get-ChildItem -Path "$PathRoot\*$CurrentClientLB_Number*" -File |
    #     Rename-Item -NewName { ($CurrentClientName + " " + $_.Name) }
}
I use the -File parameter with Get-ChildItem so the function will only return files; not directories. If you are using PowerShell version 2.0, you need to replace that with | Where-Object { !$_.PSIsContainer }.
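Before touching real files it can help to preview which names would change. A minimal dry-run sketch that works purely on strings; the file names in $fileNames are made up, and the CSV text is the sample from above pasted inline instead of being read with Import-Csv:

```powershell
# Sample CSV data, inline so the snippet runs without any files.
$clients = @'
"LB_Number","TaxId","ClientName"
"987654","12345","Microsoft"
"321456","91234","Apple"
"741852","81234","HP"
'@ | ConvertFrom-Csv

# Hypothetical file names standing in for the content of $PathRoot.
$fileNames = 'report_987654_2021.pdf', 'invoice_321456.pdf', 'unrelated.pdf'

# Build a list of old/new name pairs without renaming anything.
$plan = @(foreach ($client in $clients) {
    foreach ($name in $fileNames) {
        if ($name -like "*$($client.LB_Number)*") {
            [pscustomobject]@{
                OldName = $name
                NewName = $client.ClientName + ' ' + $name
            }
        }
    }
})
$plan | Format-Table -AutoSize
```

Adding -WhatIf to the Rename-Item call in the loop above gives a similar preview against the real folder.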

How do I consolidate copy and rename commands into one using Powershell?

Currently I am using PS to copy files from a network location based on a CSV file, then I am renaming them using a variation of the same data. This requires that I run two separate commands.
How do I consolidate these commands into one?
Copy:
import-csv C:\TEST\test.csv | foreach {copy-item -path $_.npath -destination 'C:\TEST\'}
Rename:
import-csv C:\TEST\test.csv | foreach {rename-item -path $_.lpath -newname $_.newalias}
Notice that the -path parameter in each case refers to a different column header, npath vs. lpath, which correspond to the network file location and a local file location that has been entered manually.
On the same note, how could I concatenate this variable with constant data? If I have a variable fn which represents the file name and another, path, could I theoretically do:
foreach {rename-item -path 'C:\TEST\' + $_.fn
Or:
foreach {rename-item -path $_.path + $_.fn
Just append the two commands, separated by a semicolon:
import-csv C:\TEST\test.csv | foreach {copy-item -path $_.npath -destination 'C:\TEST\';rename-item -path $_.lpath -newname $_.newalias }
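Copy-Item can also write the copy under a new name directly, because -Destination accepts a full file path and not only a folder. A minimal sketch, assuming the newalias column holds just the new file name; the temporary folders and the sample row are made up so the snippet can run without the real CSV:

```powershell
# Scratch folders standing in for the network share and C:\TEST.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
$src  = Join-Path $root 'network'
$dest = Join-Path $root 'local'
$null = New-Item -Path $src, $dest -ItemType Directory
Set-Content -Path (Join-Path $src 'report.txt') -Value 'demo'

# One made-up row with the same column names as the CSV in the question.
$rows = @(
    [pscustomobject]@{ npath = (Join-Path $src 'report.txt'); newalias = 'renamed.txt' }
)

# Copy and rename in a single call: the destination is folder + new name.
$rows | ForEach-Object {
    Copy-Item -Path $_.npath -Destination (Join-Path $dest $_.newalias)
}

$copied      = Test-Path (Join-Path $dest 'renamed.txt')
$oldNameUsed = Test-Path (Join-Path $dest 'report.txt')
$copied, $oldNameUsed

# clean up the scratch folders
Remove-Item -Path $root -Recurse -Force
```

With the real CSV this would make the lpath column unnecessary, since the local path is built from the destination folder and newalias.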
For your second question, there are lots of ways to concatenate strings:
C:(...)WindowsPowerShell>$data = "bob"
C:(...)WindowsPowerShell>echo "this is a $data"
C:(...)WindowsPowerShell>$concat = "hi" + " george"
C:(...)WindowsPowerShell>$concat
hi george
C:(...)WindowsPowerShell>[string]::Format("{0} {1}","string 1","string 2")
string 1 string 2
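One detail for the rename-item examples in the question: when the concatenation is passed as a parameter value, it has to be wrapped in parentheses, otherwise PowerShell treats the + and the second part as separate arguments. A minimal sketch using a made-up row object with the fn and path properties from the question:

```powershell
# Made-up row, standing in for one line of the imported CSV.
$row = [pscustomobject]@{ path = 'C:\TEST\'; fn = 'report.txt' }

# Three equivalent ways to build the full path as a single string.
$viaPlus    = 'C:\TEST\' + $row.fn
$viaSubExpr = "C:\TEST\$($row.fn)"
$viaFormat  = '{0}{1}' -f $row.path, $row.fn
$viaPlus, $viaSubExpr, $viaFormat

# As a parameter value the expression needs parentheses, for example:
#   Rename-Item -Path ('C:\TEST\' + $_.fn) -NewName $_.newalias
#   Rename-Item -Path ($_.path + $_.fn)    -NewName $_.newalias
```

Join-Path is another option when the folder part may or may not end with a backslash.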