Copy subset of files keeping folder structure using Powershell - powershell

Looking for some Powershell help with a copying challenge.
I need to copy all MS Office files from a fairly large NAS (over 4 million files, a little over 5 TB) to another drive, retaining the existing folder structure wherever a file is copied.
I have a text file of all the common Office file types (about 40 of them): extns.txt.
At this stage, being a good StackExchanger, I'd post the script I've got so far, but I've spent best part of a day on this and, not only is what I've got embarrassingly awful, I suspect that even the basic algorithm is wrong.
I started by running gci over the entire tree on the old NAS, once for each file type.
Then I thought it would be better to traverse once and compare every file to the list of valid types.
Then I got into a complete mess about rebuilding the folder structure. I started by splitting on '\' and iterating through the path then wasted an hour of searching because I thought I remembered reading about a simple way to duplicate a path if it doesn't exist.
Another alternative is that I dump out a 4 million line text file of all the files (with full path) I want to copy (this is easy as I imported the entire structure into SQL Server to analyse what was there) and use that as a list of sources
I'm not expecting a 'please write the codez for me' answer but some pointers/thoughts on the best way to approach this would be appreciated.

I'm not sure if this is the best approach, but the script below is at least a passable solution.
$sourceRootPath = "D:\Source"
$DestFolderPath = "E:\Dest"
$extensions = Get-Content "D:\extns.txt"
# Prefix "*." to items in $extensions, replacing any leading "*", "." or "*." they already have
$extensions = $extensions -replace "^\*?\.?","*."
$copiedItemsList = New-Object System.Collections.ArrayList
foreach ( $ext in $extensions ) {
$copiedItems = Copy-Item -Path $sourceRootPath -Filter $ext -Destination $DestFolderPath -Container -Recurse -PassThru
$copiedItems | % { $copiedItemsList.Add($_) | Out-Null }
}
$copiedItemsList = $copiedItemsList | select -Unique
# Remove Empty 'Deletable' folders that get created while maintaining the folder structure with Copy-Item cmdlet's Container switch
While ( $DeletableFolders = $copiedItemsList | ? { (Test-Path $_.FullName) -and $_.PSIsContainer -and -not (gci $_.FullName -Force | select -First 1) } ) {
$DeletableFolders | Remove-Item -Confirm:$false
}
The Copy-Item's -Container switch is going to preserve the folder structure for us. However, we may encounter empty folders with this approach.
So, I'm using an arraylist named $copiedItemsList to add the copied objects into, which I will later use to determine empty 'Deletable' folders which are then removed at the end of the script.
Hope this helps!
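The single-traversal idea from the question can also be sketched directly: walk the tree once, keep only files whose extension is in the list, and recreate each file's parent folder before copying. This is a minimal sketch, not the answer author's method. The function name Copy-FilteredTree and its parameters are hypothetical, it assumes absolute paths, and it has not been tried at the 4-million-file scale described.

```powershell
# Hypothetical helper; the name and parameters are assumptions for illustration.
function Copy-FilteredTree {
    param(
        [Parameter(Mandatory)] [string]   $SourceRoot,   # absolute path assumed
        [Parameter(Mandatory)] [string]   $DestRoot,     # absolute path assumed
        [Parameter(Mandatory)] [string[]] $Extensions    # e.g. 'docx', '.xlsx' or '*.pptx'
    )

    # Normalise the root and build a case-insensitive lookup of wanted extensions
    $root   = (Resolve-Path -LiteralPath $SourceRoot).ProviderPath.TrimEnd('\', '/')
    $wanted = New-Object 'System.Collections.Generic.HashSet[string]' ([System.StringComparer]::OrdinalIgnoreCase)
    foreach ($e in $Extensions) { [void]$wanted.Add('.' + $e.TrimStart('*', '.')) }

    # Walk the tree once, streaming through the pipeline instead of collecting every item
    Get-ChildItem -LiteralPath $root -Recurse -File |
        Where-Object { $wanted.Contains($_.Extension) } |
        ForEach-Object {
            # Path of the file relative to the source root
            $relative  = $_.FullName.Substring($root.Length).TrimStart('\', '/')
            $target    = Join-Path $DestRoot $relative
            $targetDir = Split-Path -Path $target -Parent
            # Creates all missing parent folders; does nothing if the folder already exists
            [void][System.IO.Directory]::CreateDirectory($targetDir)
            Copy-Item -LiteralPath $_.FullName -Destination $target
        }
}
```

Because folders are only created when a matching file is found, this variant should not leave empty folders behind. A call might look like `Copy-FilteredTree -SourceRoot 'D:\Source' -DestRoot 'E:\Dest' -Extensions (Get-Content 'D:\extns.txt')`.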

Related

How can I merge this code so all stages use $RootPath? [closed]

I'm not a beginner to scripting with PowerShell and have discovered just how powerful and amazing it truly is. I get confused with some things though so I am here seeking help with a few things with the script that I'm in the process of creating. My script manages a few things. It:
• prompts the user to select a directory
• recursively moves files that are many levels deep into the parent folder
• deletes all empty folders after the move
• renames the parent folders by removing periods and other "illegal" characters, because the program that uses these files will crash if there are any characters besides numbers or letters
• renames the files to the parent's name
• exits when finished
The files don't have a file format extension, they're approx 32 characters long and are alphanumeric.
Unfortunately, the script cannot get past the first step (moving the files) if it is placed in a directory other than the one that contains the folders and files. If I place it in the root of the directory containing those files and folders, it works flawlessly. If it is run in another directory containing other files, it will operate on the files and folders there after finishing the first step. Only the first step uses $RootPath; the rest of the script does not, and I need to figure out how to edit my code to use it.
However no matter what I do, I fail. I know I can just run it from the directory containing the files that need to be moved but I intend to release this on a forum that I frequent and want to make sure it is safe for those that use it. ie: I don't want their system getting messed up through carelessness or ignorance.
Full Disclosure: I'm not good at writing code on my own, I find chunks of code offered in forums, test and if it accomplishes what I need, I tweak it to work with my intended use. Most of this code I found here.
How can I get the last ⅔ of my script to use my $RootPath instead of the directory the script resides in? I have tried a few things but end up breaking its functionality. In my mind I can see why it's not working, but reading the code is where I have a Patrick-Star-drooling moment. This is when I get overwhelmed and take a break or focus on something else that I do understand. I know I need to make the rest of my code use the $RootPath that gets set when selecting a directory, but I can't figure out how to do it.
Additionally, I would like the final step to append "_1" to the file name when there is a naming conflict. I can't seem to figure out how to get this step to carry over from the first step.
Here is a pastebin link of my script. It is a bit long, I have also pasted the code in case that is preferred.
# You need this to use System.Windows.MessageBox
Add-Type -AssemblyName 'PresentationFramework'
Add-Type -AssemblyName System.Windows.Forms
$continue = 'No'
$caption = "'Bulk File Renamer Script' by MyLegzRwheelz."
$message = "Have you read the ENTIRE disclaimer (from the very top, I know, it is a lot) in the console window along the instructions provided and do you agree that you are responsible for your own negligence and anything that can go wrong IF YOU DO NOT FOLLOW MY INSTRUCTIONS PRECISELY? If so, then click 'Yes' to proceed, 'No' to exit."
$continue = [System.Windows.MessageBox]::Show($message, $caption, 'YesNo');
if ($continue -eq 'Yes') {
$characters = "?!'._" # These are the characters the script finds and removes
$regex = "[$([regex]::Escape($characters))]"
$filesandfolders = Get-ChildItem -recurse | Where-Object {$_.name -match $regex}
$browser = New-Object System.Windows.Forms.FolderBrowserDialog
$null = $browser.ShowDialog()
$RootPath = $browser.SelectedPath
# Get list of parent folders in root path
$ParentFolders = Get-ChildItem -Path $RootPath | Where-Object {$_.PSIsContainer}
# For each parent folder get all files recursively and move to parent, append number to file to avoid collisions
ForEach ($Parent in $ParentFolders) {
Get-ChildItem -Path $Parent.FullName -Recurse | Where-Object {!$_.PSIsContainer -and ($_.DirectoryName -ne $Parent.FullName)} | ForEach-Object {
$FileInc = 1
Do {
If ($FileInc -eq 1) {$MovePath = Join-Path -Path $Parent.FullName -ChildPath $_.Name}
Else {$MovePath = Join-Path -Path $Parent.FullName -ChildPath "$($_.BaseName)($FileInc)$($_.Extension)"}
$FileInc++
}
While (Test-Path -Path $MovePath -PathType Leaf)
Move-Item -Path $_.FullName -Destination $MovePath
}
}
$filesandfolders | Where-Object {$_.PsIscontainer} | foreach {
$New=$_.name -Replace $regex
Rename-Item -path $_.Fullname -newname $New -passthru
}
# For this to work, we need to temporarily append a file extension to the file name
Get-ChildItem -File -Recurse | where-object {!($_.Extension)} | Rename-Item -New {$_.basename+'.ext'}
# Removes alphanumeric subdirectories after moving renamed game into the parent folder
Get-ChildItem -Recurse -Directory | ? { -Not ($_.EnumerateFiles('*',1) | Select-Object -First 1) } | Remove-Item -Recurse
# Recursively searches for the files we renamed to .ext and renames it to the parent folder's name
# ie: "B2080E9FFF47FE2DA382BD55EDFCA2152078AEBD58.ext" becomes "0 day Attack on Earth" and will be
# found in the directory of the same name.
ls -Recurse -Filter *.ext | %{
$name = ([IO.DirectoryInfo](Split-Path $_.FullName -Parent)).Name
Rename-Item -Path $_.FullName -NewName "$($name)"
}
} else {
{Exit}
}
I have tried using $ParentFolders in various places so that it uses $RootPath as the working directory. I have also tried copying and pasting the "file inc" part into the final step, but it is not working. To test this out, create a folder and make it your root folder. Within that folder, create additional folders with multiple subfolders and a file with no extension (just create a .txt file and remove the extension), then run the script from the newly created root folder.
Do not run this in a directory with files you care about. This is why I am trying to get the rest of the code to use only the directory set at launch. To check whether it works regardless of the script's location, place the script in another folder and run it. When the explorer dialog pops up (after clicking Yes), select the test root directory. If you place the script in the root directory and run it, it works as it should, but not in any other directory. The desired result is for it to run and work to completion regardless of the location of the script.
Here is code to add _1 to filename
$filename = "abcdefg.csv"
$lastIndex = $filename.LastIndexOf('.')
$extension = $filename.Substring($lastIndex)
$filename = $filename.Substring(0,$lastIndex)
Write-Host "filename = " $filename ",extension = " $extension
$filename = $filename + "_1" + $extension
$filename
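The snippet above assumes the name contains a dot, whereas the files in the question have no extension. A variant that copes with both cases and keeps counting (_1, _2, ...) until the name is free might look like the sketch below. Get-UniquePath is a hypothetical helper name, not part of the original script.

```powershell
# Hypothetical helper; returns a path in $Directory that does not exist yet.
function Get-UniquePath {
    param(
        [Parameter(Mandatory)] [string] $Directory,
        [Parameter(Mandatory)] [string] $Name
    )
    $base = [System.IO.Path]::GetFileNameWithoutExtension($Name)
    $ext  = [System.IO.Path]::GetExtension($Name)   # empty string when there is no extension

    $candidate = Join-Path $Directory $Name
    $n = 1
    # Append _1, _2, ... until no file or folder with that name exists
    while (Test-Path -LiteralPath $candidate) {
        $candidate = Join-Path $Directory ("{0}_{1}{2}" -f $base, $n, $ext)
        $n++
    }
    $candidate
}
```

In the asker's script, the final rename could then use something like `Split-Path (Get-UniquePath -Directory $_.DirectoryName -Name $name) -Leaf` as the new name. As for the main question, the later Get-ChildItem and ls calls have no -Path argument, so they operate on the current location; passing `-Path $RootPath` to each of them is probably what is needed, though that has not been tested against the full script.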

Sorting jpg/pdf files into folders that already exist by name but only using the first 6 digits

I am having issues trying to sort a large number of files into folders. (Screenshot of the files that I need sorted; new files are added daily.)
My main issue is coming from the naming format of the files relative to the folders. Is there a way to move them by the first 6 digits into the corresponding folders that include those digits and if the folder doesn't exist have one created? I couldn't get name-split to work since the beginning of the file name isn't broken up by a break. Does anybody have any code that could do this for me? I'm still learning powershell, not great at writing from scratch yet :)
Use String.Substring() or String.Remove() to extract the first 6 characters:
$sourceItemFolder = 'C:\unsorted'
$targetRootFolder = 'C:\folder\with\directories'
Get-ChildItem $sourceItemFolder -File |ForEach-Object {
if($_.Name.Length -ge 6){
# Extract prefix from file name
$prefix = $_.Name.Remove(6)
# Use prefix to find appropriate folder, pick the first match
$targetFolder = Get-ChildItem $targetRootFolder -Filter "${prefix}*" -Directory |Select -First 1
if(-not $targetFolder){
# No matching folder found, create one
$targetFolder = New-Item -Path $targetRootFolder -Name $prefix -Type Directory
}
# Move the file
$_ |Move-Item -Destination $targetFolder.FullName
}
}
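To see what the two string methods do in isolation, here is a small sketch. The sample file name is made up, and Get-NumericPrefix is a hypothetical helper that additionally checks that the first six characters really are digits, which the answer's code does not do.

```powershell
$name = '123456_invoice.pdf'            # made-up file name for illustration

$viaRemove    = $name.Remove(6)         # drops everything from index 6 onwards
$viaSubstring = $name.Substring(0, 6)   # takes 6 characters starting at index 0

# Hypothetical helper: returns the prefix only when it consists of six digits
function Get-NumericPrefix {
    param([Parameter(Mandatory)] [string] $FileName)
    if ($FileName -match '^\d{6}') { $Matches[0] }
}

$viaRemove                              # 123456
$viaSubstring                           # 123456
Get-NumericPrefix 'readme.txt'          # returns nothing, so such a file could be skipped
```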

How to create a powershell script to move specific files to a different location?

So I have been tasked to write a script that will move files from one folder to another folder, which is easy enough. The problem I am having is that the files are for accounts, so there will be a file called DEA05292020.pdf and another file called TENSJ05292020, and each file needs to go to a specific folder (e.g. the DEA05292020.pdf file needs to be moved to a folder called DEA and the TENSJ05292020 file to the TENSJ folder). There are over a hundred different accounts that each have their own folder. The files all start off in our Recon folder and need to be moved at the end of each month to their respective account's folder. So my question is: how could I go about creating a PowerShell script to make that happen? I am very new to PowerShell and have been studying "Learn Powershell in a Month of Lunches" and have a basic grasp of it. What I have so far is very simple; I can copy the file over to the new folder:
copy-item -path "\Sageshare\share\Reconciliation\PDF Recon Center\DEA RECON 05292020" -destination "Sageshare\share\Account Rec. Sheets\Seperate Accounts\DEA"
This works, but I need a lot more automation for separating all the different account names in the PDF Recon Center folder. How do I make a script that can extract the account name (e.g. DEA) and also the month and year from the name of the file (e.g. 052020 pulled out of the 05292020 part of the filename)?
Thanks!
If @Lee_Dailey wants to write the code and post it here, I'll delete my answer. He solved the problem, I just code monkeyed it.
Please don't test on everything at once, run it in batches so you can monitor its behavior and not mess up your environment. It moves files in ways you may not want, i.e. if there is a folder named a it'll move everything that matches that folder into it. If you want to prevent this you can write the prescanning if there is a folder more "closely matching" that name before it actually creates the folder itself. Pretty sure it does everything you want however in the simplest way to understand. :)
$names = $(gci -af).name |
ForEach-Object {
if (-not ($_.Contains(".git"))){
$_
}
}
if ( $null -eq $names ) {
Write-Host "No files to move!"
Start-Sleep 5
Exit
}
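$removedNames = $names |
ForEach-Object {
$base = [System.IO.Path]::GetFileNameWithoutExtension($_) # Remove extension; also safe when there is none
$base -replace '[^a-zA-Z-]','' # Keep only letters and hyphens (drops the date digits)
} |
Where-Object { $_ } # Skip names that had no letters at all
$removedNames = $removedNames |
Sort-Object -Unique # Get unique folder names (Get-Unique only collapses adjacent duplicates)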
$names |
ForEach-Object {
$name = $_
$removedNames |
ForEach-Object {
if ($name.Contains($_)) # If it matches a name
{
if (-not (Test-Path ".\$_")) { # If it doesn't see the folder
New-Item -Path ".\" `
-Name "$_" `
-ItemType "directory"
}
Move-Item -Path ".\$name" `
-Destination ".\$_" # Move file to folder
}
}
}
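The question also asks how to pull the month and year out of the file name. Assuming every name follows the pattern shown in the question (letters for the account, then an eight-digit MMDDYYYY date), a regular expression with named groups can split it. Split-AccountFileName is a hypothetical helper name, and the pattern is an assumption based on the two example names only.

```powershell
# Hypothetical helper; assumes names like DEA05292020.pdf (account letters + MMDDYYYY).
function Split-AccountFileName {
    param([Parameter(Mandatory)] [string] $FileName)

    $base = [System.IO.Path]::GetFileNameWithoutExtension($FileName)
    if ($base -match '^(?<account>[A-Za-z]+)(?<month>\d{2})(?<day>\d{2})(?<year>\d{4})$') {
        [pscustomobject]@{
            Account   = $Matches['account']
            MonthYear = $Matches['month'] + $Matches['year']
        }
    }
    # Names that do not fit the pattern produce no output
}

Split-AccountFileName 'DEA05292020.pdf'   # Account = DEA, MonthYear = 052020
Split-AccountFileName 'TENSJ05292020'     # Account = TENSJ, MonthYear = 052020
```

Matching the account exactly this way, rather than with Contains, would also avoid the "folder named a" problem mentioned above.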

How do I move a selection of files into several different directories in one loop?

First time toying seriously with Powershell. I'm running into the problem that my little loop doesn't do what I want it to; it creates a list of series names from the files found in a directory and creates the directories needed to hold the files.
[TAG]Series first.txt
[TAG]Series something else.txt
File.jpg
etc.
This should be sorted into
Series first [Sometag]\[TAG]Series first.txt
Series something else [Sometag]\[TAG]Series something else.txt
File.jpg
etc.
But I can't get Move-Item to actually move the files into the new directories. It leads to extensionless files or errors stating that the file (directory) already exists.
$details = ' [Sometag]'
$series = Get-ChildItem . -Name -File -Filter *.txt |
% {$_.Replace("[TAG]", "").Split("-")[0].Trim()} |
Get-Unique
$series | ForEach-Object {
New-Item -ErrorAction Ignore -Name $_$details -ItemType Directory
}
This is the command that should move the files that it finds using the $series collection into the directories previously created.
foreach ($serie in $series) {
Get-ChildItem -File -Filter *$serie*.txt |
Move-Item -Destination '.\$serie$details'
}
Which results in it complaining that the file already exists. What would be the best way to deal with this, and can I optimize the first two lines?
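One likely contributor, offered as a guess rather than a confirmed diagnosis: the destination is written in single quotes, and PowerShell does not expand variables inside single-quoted strings. Every file would then be moved to the same literal path, which would explain both the extensionless file and the "already exists" error. The sketch below only demonstrates the quoting difference, using the values from the question.

```powershell
$serie   = 'Series first'
$details = ' [Sometag]'

$literal  = '.\$serie$details'   # single quotes: no variable expansion
$expanded = ".\$serie$details"   # double quotes: variables are expanded

$literal    # .\$serie$details
$expanded   # .\Series first [Sometag]
```

Square brackets are also wildcard characters in PowerShell -Path arguments, so the [Sometag] and [TAG] parts may need separate attention (for example -LiteralPath where a cmdlet offers it); that part is not covered by this sketch.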

Powershell getfiles.count() to exclude thumbs.db

We have a script running daily that removes old files and directories from an area that people use to transfer data around. Everything works great except for one little section. I want to delete a folder if it's older than 7 days and it's empty. The script always shows 1 file in the folder because of the thumbs.db file. I guess I could check to see if the one file is thumb.db and if so just delete the folder but I'm sure there is a better way.
$location = Get-ChildItem \\dropzone -exclude thumbs.db
foreach ($item in $location) {
# other stuff here going deeper into the tree...
if(($item.GetFiles().Count -eq 0) -and ($item.GetDirectories().Count -eq 0)) {
# This is where I delete the folder, but because the folder always has
# the Thumbs.db system file we never get here
}
}
$NumberOfFiles = @(gci -Force $dir | ?{$_.Name -notmatch '^thumbs\.db$'}).Count
You can try the get-childitem -exclude option where all files/items in your directory will be
counted except those that end in db:
$location = get-childitem -exclude *.db
It also works out if you specify the file to exclude, in this case thumbs.db
$location = get-childitem -exclude thumbs.db
Let me know if this works out.
Ah, I also just noticed something,
$location = get-childitem -exclude *.db
will only handle .db items in the location directory. If you're going deeper into the tree (say, via your GetFiles() and GetDirectories() methods) then you may still find a thumbs.db. Hence you'll have to apply the exclusion at those points as well to ignore thumbs.db.
So, for example in your $item.getFiles() method, if you use get-childitem you will have to specify the -exclude option as well.
Sorry, I should have read your question more closely.
Use this method to provide an exclusion list in the form of a simple text file to exclude specific files or extensions from your count:
$dir = 'C:\YourDirectory'
#Type one filename.ext or *.ext per line in this txt file
$exclude = Get-Content "C:\Somefolder\exclude.txt"
$count = (dir $dir -Exclude $exclude).count
$count
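For the specific "is this folder empty apart from Thumbs.db" check, filtering by name with Where-Object is another option. The following is a sketch only; Get-VisibleItemCount and its -IgnoreNames parameter are hypothetical names, and the comparison is by exact file name (case-insensitive) rather than by wildcard.

```powershell
# Hypothetical helper; counts the items in a folder, ignoring the given file names.
function Get-VisibleItemCount {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [string[]] $IgnoreNames = @('Thumbs.db')
    )
    # -Force includes hidden and system items, so Thumbs.db is seen and then filtered out by name
    $items = @(Get-ChildItem -LiteralPath $Path -Force |
        Where-Object { $IgnoreNames -notcontains $_.Name })
    $items.Count
}
```

A folder could then be treated as deletable when `(Get-VisibleItemCount -Path $item.FullName) -eq 0`, in addition to the age check the script already performs.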