Flatten directory structure - PowerShell

The function below flattens a directory structure, copying every file whose last write time falls within the chosen date range.
function mega-copy($srcdir, $destdir, $startdate, $enddate)
{
    $files = Get-ChildItem $srcdir -Recurse |
        Where-Object { $_.LastWriteTime -ge $startdate -and $_.LastWriteTime -le $enddate -and $_.PSIsContainer -eq $false }
    $files | ForEach-Object {
        Copy-Item $_.FullName (Join-Path $destdir $_.Name) -Verbose
    }
}
This worked very well on smaller directories, but when I attempted to use it on directories with many sub-directories and file counts ranging from the hundreds of thousands to the tens of millions, it simply stalled. I ran it and let it sit for 24 hours: not a single file was copied, and nothing appeared in the PowerShell console window. In this particular instance there were roughly 27 million files.
However, a simplistic batch file did the job with no issue whatsoever, though it was very slow.

The simple answer is this: collecting everything into an intermediate variable meant the entire recursive listing had to complete before the first file copy could begin. Couple that with using
    -and $_.PSIsContainer -eq $false
as opposed to simply using the -File switch, and the answer was a few simple modifications to my script, resulting in this:
function mega-copy($srcdir, $destdir, $startdate, $enddate)
{
    Get-ChildItem $srcdir -Recurse -File |
        Where-Object { $_.LastWriteTime -ge $startdate -and $_.LastWriteTime -le $enddate } |
        ForEach-Object {
            Copy-Item $_.FullName (Join-Path $destdir $_.Name) -Verbose
        }
}
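For reference, a typical invocation might look like this (the paths and dates are placeholders, not values from the original post):

    mega-copy 'D:\Source' 'E:\Flattened' '2015-01-01' '2015-06-30'

Because everything now streams through the pipeline, copying begins as soon as the first matching file is enumerated instead of after the full directory scan. One caveat when flattening: files with the same name in different sub-directories will overwrite one another at the destination.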

Delete all files and folders working very slow

I need to delete all the archived files and folders older than 15 days.
I have implemented a solution using a PowerShell script, but it is taking more than a day to delete all the files. The total size of the folder is less than 100 GB.
$StartFolder = "\\Guru\Archive\"
$deletefilesolderthan = "15"
# Get folder names for the ForEach loop
$SubFolders = Get-ChildItem -Path $StartFolder |
    Where-Object {$_.PSIsContainer -eq "True"} |
    Select-Object Name
# Loop through the folders
foreach ($Subfolder in $SubFolders) {
    Write-Host "Processing Folder:" $Subfolder
    # For each folder, recurse and delete files older than the specified number of days while leaving the folder structure intact.
    Get-ChildItem -Path $StartFolder$($Subfolder.name) -Include *.* -File -Recurse |
        Where LastWriteTime -lt (Get-Date).AddDays(-$deletefilesolderthan) |
        foreach {$_.Delete()}
    # $dirs will be an array of empty directories returned after filtering; loop until $dirs is empty, excluding the "Inbound" and "Outbound" folders.
    do {
        $dirs = gci $StartFolder$($Subfolder.name) -Exclude Inbound,Outbound -Directory -Recurse |
            Where {(gci $_.FullName).Count -eq 0} |
            select -ExpandProperty FullName
        $dirs | ForEach-Object {Remove-Item $_}
    } while ($dirs.Count -gt 0)
}
Write-Host "Completed" -ForegroundColor Green
#Read-Host -Prompt "Press Enter to exit"
Please suggest some way to optimise the performance.
If you have many small files, the long delete time is not abnormal, because each file descriptor has to be processed individually. Some improvements can be made depending on your version; I'm going to assume you're on at least v4.
#requires -Version 4
param(
    [string]
    $start = '\\Guru\Archive',
    [int]
    $thresholdDays = 15
)

# getting the name wasn't useful; keep objects as objects
foreach ($folder in Get-ChildItem -Path $start -Directory) {
    "Processing Folder: $folder"

    # get all items once
    $folders, $files = ($folder | Get-ChildItem -Recurse).
        Where({ $_.PSIsContainer }, 'Split')

    # process files
    $files.Where{
        $_.LastWriteTime -lt (Get-Date).AddDays(-$thresholdDays)
    } | Remove-Item -Force

    # process folders
    $folders.Where{
        $_.Name -notin 'Inbound', 'Outbound' -and
        ($_ | Get-ChildItem).Count -eq 0
    } | Remove-Item -Force
}
"Complete!"
The reason it takes so much time is that you are deleting files and folders over the network, which requires additional network communication for every single file and folder. You can easily verify that with a network analyzer. The best approach here is to run the code that performs the file operations on the remote machine itself, for example using one of:
- WinRM
- PsExec (first copy the script to the remote machine, then execute it using PsExec)
- remote WMI (using CIM_Datafile)
- or even adding the needed task to the Task Scheduler
I would prefer WinRM, but PsExec is also a good choice (if you don't want to perform the additional WinRM configuration).
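A minimal sketch of the WinRM approach, assuming the share \\Guru\Archive lives on a server named Guru with remoting enabled (the server name and the local path D:\Archive are assumptions for illustration):

    Invoke-Command -ComputerName Guru -ScriptBlock {
        # Runs locally on the file server, so there is no per-file network round-trip
        Get-ChildItem -Path 'D:\Archive' -File -Recurse |
            Where-Object { $_.LastWriteTime -lt (Get-Date).AddDays(-15) } |
            Remove-Item -Force
    }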

Prioritizing "$_.Name" over "$_.LastAccessTime"

I created a script that allows me to search for and ignore directories from a Remove-Item statement, and the script works, but not necessarily to the extent I need it to.
Get-ChildItem -Path $Path |
    Where-Object {
        ($_.LastAccessTime -lt $Limit) -and
        -not ($_.PSIsContainer -eq $True -and $_.Name -contains ("2013","2014","2015"))
    } | Remove-Item -Force -Recurse -WhatIf
This script is currently finding and deleting all objects that:
- have not been accessed in the given time period.
But what I need this script to do is find and delete all objects that:
- have not been accessed in the given time period, AND
- exclude directories whose name contains "2013", "2014", or "2015".
I'm not arguing that the script "isn't working properly", but the thesis of my question is this:
How do I program this script to look at the directory name first, and then the last access date? I don't know where and how to tell this script that the $_.Name needs to take precedence over the $_.LastAccessTime -lt $Limit.
Currently the logic of your condition is this:
Delete objects that were last accessed before $Limit and are not folders whose name contains the array ["2013","2014","2015"].
The second condition is never true, because a string can never contain an array of strings.
Also, the last modification time is stored in the LastWriteTime property.
What you actually want is something like this:
Where-Object {
    $_.LastWriteTime -lt $Limit -and
    -not ($_.PSIsContainer -and $_.Name -match '2013|2014|2015')
}
If the directory names consist only of the year and nothing else you could also use this:
Where-Object {
    $_.LastWriteTime -lt $Limit -and
    -not ($_.PSIsContainer -and '2013','2014','2015' -contains $_.Name)
}
Note the reversed order of the last clause (array -contains value).
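A quick console demonstration of the difference (the name Backup_2014 is made up for illustration):

    'Backup_2014' -contains ('2013','2014','2015')   # False: a string never contains an array
    '2013','2014','2015' -contains '2014'            # True:  array -contains value
    'Backup_2014' -match '2013|2014|2015'            # True:  regex matches anywhere in the name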

PowerShell 3.0: Returning a list of file names with last write times within the past 24 hours

I am writing a script that checks, recursively, all folders and file names in a directory, and then returns the names and last write times to a text file. I only want the names of files and folders that have been added within the past twenty four hours.
$date = Get-Date
Get-ChildItem 'R:\Destination\Path' -recurse |
Where-Object { $_.LastWriteTime -lt $date -gt $date.AddDays(-1) } |
Select LastWriteTime, Name > 'C:\Destination\Path\LastWriteTime.txt' |
Clear-Host
Invoke-Item 'C:\Destination\Path\LastWriteTime.txt'
The .txt file that is invoked is blank, which, based on the test conditions I have set up should not be the case. What am I doing wrong?
You are missing a logical -and. Change:
Where-Object { $_.LastWriteTime -lt $date -gt $date.AddDays(-1) }
to
Where-Object { $_.LastWriteTime -lt $date -and $_.LastWriteTime -gt $date.AddDays(-1) }
Even better, use parentheses: had they been there, the expression with the missing -and would have failed to parse instead of silently returning nothing:
Where-Object { ($_.LastWriteTime -lt $date) -and ($_.LastWriteTime -gt $date.AddDays(-1)) }
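Dropped back into the original script, the corrected version might look like this (the redirection to the text file consumes all pipeline output, so the original trailing | Clear-Host would never receive anything and is omitted here):

    $date = Get-Date
    Get-ChildItem 'R:\Destination\Path' -Recurse |
        Where-Object { ($_.LastWriteTime -lt $date) -and ($_.LastWriteTime -gt $date.AddDays(-1)) } |
        Select-Object LastWriteTime, Name > 'C:\Destination\Path\LastWriteTime.txt'
    Invoke-Item 'C:\Destination\Path\LastWriteTime.txt'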

Get parent directory and modification date from a file using PowerShell

I'm looking into building a script that will get me both a date and parent directory of files created during a certain period.
So far this is what I've come up with:
Get-ChildItem -Recurse | Where-Object { ($_.LastWriteTime -gt "7/1/2013") -and ($_.LastWriteTime -le "7/22/2013") }
I'm a bit clueless as to how to separate "Directory" and "LastWriteTime" (minus the time) into variables.
Would appreciate the help.
Thanks!
I wouldn't put them into separate variables. I'd just select the 2 properties:
$files = Get-ChildItem -Recurse | ? {
    -not $_.PSIsContainer -and
    $_.LastWriteTime -gt "7/1/2013" -and
    $_.LastWriteTime -le "7/22/2013"
} | select Directory, @{n='LastWriteDate'; e={Get-Date $_.LastWriteTime -UFormat "%m/%d/%Y"}}
Then you can access those values like this:
$files[0].Directory.FullName
$files[0].LastWriteDate

PowerShell Get-ChildItem and Skip

I'm writing a PowerShell script that deletes all but the X most recent folders, excluding a folder named Data. My statement to gather the folders to delete looks like this:
$folders1 = Get-ChildItem $parentFolderName |
? { $_.PSIsContainer -and $_.Name -ne "Data" } |
sort CreationTime -desc |
select -Skip $numberOfFoldersToKeep
foreach ($objItem in $folders1) {
    Write-Host $webServerLocation\$objItem
    Remove-Item -Recurse -Force $parentFolderName\$objItem -WhatIf
}
This works great when I pass a $numberOfFoldersToKeep that is fewer than the number of folders in the starting directory $parentFolderName. For example, with 5 subdirectories in my target folder, this works as expected:
myScript.ps1 C:\StartingFolder 3
But if I were to pass a high number of folders to skip, my statement seems to return the value of $parentFolderName itself! So this won't work:
myScript.ps1 C:\StartingFolder 15
Because the skip variable exceeds the number of items in the Get-ChildItem collection, the script tries to delete C:\StartingFolder\ which was not what I expected at all.
What am I doing wrong?
try this:
$folders1 = Get-ChildItem $parentFolderName |
? { $_.PSIsContainer -and $_.Name -ne "Data" } |
sort CreationTime -desc |
select -Skip $numberOfFoldersToKeep
if ($folders1 -ne $null)
{
    foreach ($objItem in $folders1) {
        Write-Host $($objItem.fullname)
        Remove-Item -Recurse -Force $objItem.fullname -WhatIf
    }
}
I gave @C.B. credit for the answer, but there's another way to solve the problem: forcing the output of Get-ChildItem to an array using the @( ... ) syntax.
$folders1 = @(Get-ChildItem $parentFolderName |
    ? { $_.PSIsContainer -and $_.Name -ne "Data" } |
    sort CreationTime -desc |
    select -Skip $numberOfFoldersToKeep)
foreach ($objItem in $folders1) {
    Write-Host $webServerLocation\$objItem
    Remove-Item -Recurse -Force $parentFolderName\$objItem -WhatIf
}
This returns an array of length zero, so the body of the foreach statement is not executed.
As C.B. noted in the comments above, the problem is that if you pass a null collection into a foreach statement in PowerShell, the body of the foreach statement is executed once.
This was completely unintuitive to me, coming from a .NET background. Apparently, it's unintuitive to lots of other folks as well, since there are bug reports filed for this behavior on MSDN: https://connect.microsoft.com/feedback/ViewFeedback.aspx?FeedbackID=281908&SiteID=99
Apparently, this bug has been fixed in PowerShell V3.
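A minimal illustration of the behavior change, runnable in any console:

    $x = $null
    foreach ($item in $x) { 'ran' }    # v2: prints 'ran' once; v3 and later: prints nothing
    foreach ($item in @()) { 'ran' }   # an empty array never runs the body in any version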