Delete a massive number of files without running out of memory - PowerShell

We have a COTS app that creates reports and never deletes them, so we need to start cleaning up. I started with a foreach over the results and would run out of memory on the server (36 GB) once it got to around 50 million files. After searching, it seemed you could change it like so
Get-ChildItem -path $Path -recurse | foreach {
so the results are streamed through the pipeline one item at a time instead of being collected first. That gets me to about 140 million files before I run out of memory.
Clear-Host
# Set the age to look for
$TimeLimit = (Get-Date).AddMonths(-4)
$Path = "D:\CC\LocalStorage"
$TotalFileCount = 0
$TotalDeletedCount = 0
Get-ChildItem -Path $Path -Recurse | foreach {
    if ($_.LastWriteTime -le $TimeLimit) {
        $TotalDeletedCount += 1
        $_.Delete()
    }
    $TotalFileCount += 1
    $FileDiv = $TotalFileCount % 10000
    if ($FileDiv -eq 0 -and $TotalFileCount -ne 0) {
        $TF = [string]::Format('{0:N0}', $TotalFileCount)
        $TD = [string]::Format('{0:N0}', $TotalDeletedCount)
        Write-Host "Files Scanned : " -ForegroundColor Green -NoNewline
        Write-Host "$TF" -ForegroundColor Yellow -NoNewline
        Write-Host " Deleted: " -ForegroundColor Green -NoNewline
        Write-Host "$TD" -ForegroundColor Yellow
    }
}
Is there a better way to do this? My only other thought was not to use -Recurse but to write my own function that calls itself for each directory.
EDIT:
I used the code provided in the first answer and it does not solve the issue. Memory is still growing.
$limit = (Get-Date).Date.AddMonths(-3)
$totalcount = 0
$deletecount = 0
$Path = "D:\CC\"
Get-ChildItem -Path $Path -Recurse -File | Where-Object { $_.LastWriteTime -lt $limit } | Remove-Item -Force

Using ForEach-Object and the pipeline should indeed prevent the code from running out of memory. If you're still getting OOM exceptions, I suspect that you're doing something in your code that counteracts this effect, which you didn't tell us about.
With that said, you should be able to clean up your data directory with something like this:
$limit = (Get-Date).Date.AddMonths(-4)
Get-ChildItem -Path $Path -Recurse -File |
    Where-Object { $_.LastWriteTime -lt $limit } |
    Remove-Item -Force -WhatIf
Remove the -WhatIf switch after you have verified that everything is working.
If you need the total file count and the number of deleted files, add counters like this:
$totalcount = 0
$deletecount = 0
Get-ChildItem -Path $Path -Recurse -File |
    ForEach-Object { $totalcount++; $_ } |
    Where-Object { $_.LastWriteTime -lt $limit } |
    ForEach-Object { $deletecount++; $_ } |
    Remove-Item -Force -WhatIf
I don't recommend printing status information to the console when you're bulk-processing large numbers of files. The output could significantly slow down the processing. If you must have that information, write it to a log file and tail that file separately.
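If the pipeline itself is not the culprit and memory still grows, one further option (my own suggestion, not something from the question) is to skip FileInfo objects entirely and enumerate raw path strings lazily via .NET, which also covers the "write my own recursive function" idea from the question. A minimal sketch, assuming the same $Path and $TimeLimit as in the question and a hypothetical log file location; it logs progress to a file instead of the console, as recommended above:
# EnumerateFiles yields one path at a time instead of materializing the whole listing.
$TimeLimit = (Get-Date).AddMonths(-4)
$Path = 'D:\CC\LocalStorage'
$LogFile = 'D:\CC\cleanup.log'   # hypothetical log location
$scanned = 0
$deleted = 0
foreach ($file in [System.IO.Directory]::EnumerateFiles($Path, '*', [System.IO.SearchOption]::AllDirectories)) {
    try {
        if ([System.IO.File]::GetLastWriteTime($file) -le $TimeLimit) {
            [System.IO.File]::Delete($file)
            $deleted++
        }
    }
    catch {
        Add-Content -Path $LogFile -Value "Failed to delete $file : $_"
    }
    $scanned++
    if ($scanned % 100000 -eq 0) {
        Add-Content -Path $LogFile -Value "Scanned: $scanned  Deleted: $deleted"
    }
}
You can then follow the log from a second console with Get-Content -Path D:\CC\cleanup.log -Wait.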

Related

Filter and delete files and folders (and files inside of folders) older than x days in PowerShell

This is my first post on this forum. I'm a beginner in coding and I need help with one of my very first self-coded tools.
I made a small script which deletes files if they are older than date x (LastWriteTime). Now to my problem: I want the script to also check files inside of folders inside a directory, and only delete a folder afterwards if it is truly empty. I can't figure out how to solve the recursion in this problem; it seems the script just deletes the entire folder based on date x. Could anyone please tell me what I missed in this code and help me create my own recursion to solve the problem, or fix the code? Thanks to you all, guys! Here is my code:
I would also be glad if someone knows how to make the code work using a function.
$path = Read-Host "please enter your path"
"
"
$timedel = Read-Host "Enter days in the past (e.g -12)"
$dateedit = (Get-Date).AddDays($timedel)
"
"
Get-ChildItem $path -File -Recurse | foreach {
    if ($_.LastWriteTime -and !$_.LastAccessTimeUtc -le $dateedit) {
        Write-Output "older as $timedel days: ($_)"
    }
}
"
"
pause
Get-ChildItem -Path $path -Force -Recurse | Where-Object { $_.PsisContainer -and $_.LastWriteTime -le $dateedit } | Remove-Item -Force -Recurse
""
Write-Output "Files deleted"
param(
    [IO.DirectoryInfo]$targetFolder = "d:\tmp",
    [DateTime]$dateTimeX = "2020-11-15 00:00:00"
)
Get-ChildItem $targetFolder -Directory -Recurse | Sort-Object { $_.FullName } -Descending | ForEach-Object {
    Get-ChildItem $_ -File | Where-Object { $_.LastWriteTime -lt $dateTimeX } | Remove-Item -Force -WhatIf
    if ((Get-ChildItem $_).Count -eq 0) { Remove-Item $_ -Force -WhatIf }
}
Remove the -WhatIf switches after testing.
Removing folders that are older than the set number of days (if they are empty) leaves you with a problem: as soon as a file is removed from such a folder, the folder's LastWriteTime is updated to that moment in time.
This means you should get the list of older folders first, before you start deleting older files, and use that list afterwards to also remove these folders if they are empty.
Also, a minimal check on user input from Read-Host should be done (i.e. the path must exist, and the number of days must be convertible to an integer). For the latter I chose to simply cast it to [int], because if that fails the code would throw an exception anyway.
Try something like this:
$path = Read-Host "please enter your path"
# test the user input
if (-not (Test-Path -Path $path -PathType Container)) {
    Write-Error "The path $path does not exist!"
}
else {
    $timedel = Read-Host "Enter days in the past (e.g. -12)"
    # convert to int and make sure it is a negative value
    $timedel = -[Math]::Abs([int]$timedel)
    $dateedit = (Get-Date).AddDays($timedel).Date  # .Date sets this date to midnight (00:00:00)

    # get a list of all folders (FullNames only) that have a LastWriteTime older than the set date.
    # we check this list later to see if any of the folders are empty and if so, delete them.
    $folders = (Get-ChildItem -Path $path -Directory -Recurse | Where-Object { $_.LastWriteTime -le $dateedit }).FullName

    # get a list of files to remove
    Get-ChildItem -Path $path -File -Recurse | Where-Object { $_.LastWriteTime -le $dateedit } | ForEach-Object {
        Write-Host "older than $timedel days: $($_.FullName)"
        $_ | Remove-Item -Force -WhatIf  # see below about the -WhatIf safety switch
    }

    # now that old files are gone, test the folder list we got earlier and remove any if empty
    $folders | ForEach-Object {
        if ((Get-ChildItem -Path $_ -Force).Count -eq 0) {
            Write-Host "Deleting empty folder: $_"
            $_ | Remove-Item -Force -WhatIf  # see below about the -WhatIf safety switch
        }
    }
    Write-Host "All Done!" -ForegroundColor Green
}
The -WhatIf switch used on Remove-Item is there for your own safety. With it, no file or folder is actually deleted; instead, the console shows what would be deleted. Once you are satisfied that this is all good, remove the -WhatIf and run the code again to really delete the files and folders.
Try something like this:
$timedel = -12
# remove old files
Get-ChildItem "C:\temp" -Recurse -File | Where LastWriteTime -lt (Get-Date).AddDays($timedel) | Remove-Item -Force
# remove directories that contain no files
Get-ChildItem "C:\temp\" -Recurse -Directory | where {(Get-ChildItem $_.FullName -Recurse -File).count -eq 0} | Remove-Item -Force -Recurse

Repeat foreach loop after all iterations completed in PowerShell

Can anyone here help me repeat the code from the beginning after all iterations in the foreach loop have completed? The code below gets all the files containing the 'qwerty' pattern, feeds the list into a foreach loop, and displays the filename and the last 10 lines of each file; the script should terminate when there is no new or updated file within a certain amount of time.
$today = (Get-Date).Date
$FILES = Get-ChildItem -Path C:\Test\ | `
    Where-Object {$_.LastWriteTime -ge $today} | `
    Select-String -Pattern "qwerty" | `
    Select-Object FileName -Unique
foreach ($i in $FILES) {
    Write-Host $i -ForegroundColor Red
    Get-Content -Path \\XXXXXX\$i -Tail 10
    Start-Sleep 1
}
You can use this:
for ($r = 0; $r -lt $NumberOfTimesToRepeat; $r++) {
    $today = (Get-Date).Date
    $FILES = Get-ChildItem -Path C:\Test\ | `
        Where-Object {$_.LastWriteTime -ge $today} | `
        Select-String -Pattern "qwerty" | `
        Select-Object FileName -Unique
    foreach ($i in $FILES) {
        Write-Host $i -ForegroundColor Red
        Get-Content -Path \\XXXXXX\$i -Tail 10
        Start-Sleep 1
    }
}
PS: Replace $NumberOfTimesToRepeat with the number of times you want the loop to repeat.
If I understand the question properly, you would like to test for files in a certain folder containing a certain string. For each of these files, the last 10 lines should be displayed.
The first difficulty comes from the fact that you want to do this inside a loop and test new or updated files.
That means you need to keep track of files you have already tested and only display new or updated files. The code below uses a Hashtable $alreadyChecked for that so we can test if a file is either new or updated.
If no new or updated files are found during a certain time, the code should end. To do that, I'm using two other variables: $endTime and $checkTime.
$checkTime gets updated on every iteration, making it the current time
$endTime only gets updated if files were found.
$today = (Get-Date).Date
$sourceFolder = 'D:\Test'
$alreadyChecked = @{}  # a Hashtable to keep track of files already checked
$maxMinutes = 5        # the max time in minutes to perform the loop when no new files or updates are added
$endTime = (Get-Date).AddMinutes($maxMinutes)
do {
    $checkTime = Get-Date
    $files = Get-ChildItem -Path $sourceFolder -File |
        # only files created today and that have not been checked already
        Where-Object { $_.LastWriteTime -ge $today -and
                       (!$alreadyChecked.ContainsKey($_.FullName) -or
                        $alreadyChecked[$_.FullName].LastWriteTime -ne $_.LastWriteTime) } |
        ForEach-Object {
            $filetime = $_.LastWriteTime
            $_ | Select-String -Pattern "qwerty" -SimpleMatch |  # -SimpleMatch if you don't use a Regex match
                Select-Object Path, FileName, @{Name = 'LastWriteTime'; Expression = { $filetime }}
        }
    if ($files) {
        foreach ($item in $files) {
            Write-Host $item.FileName -ForegroundColor Red
            Write-Host (Get-Content -Path $item.Path -Tail 10)
            Write-Host
            # update the Hashtable to keep track of files already done
            $alreadyChecked[$item.Path] = $item | Select-Object FileName, LastWriteTime
            Start-Sleep 1
        }
        # files were found, so update the time to check for no updates/new files
        $endTime = (Get-Date).AddMinutes($maxMinutes)
    }
    # exit the loop if no new or updated files have been found during $maxMinutes time
} while ($checkTime -le $endTime)
For demo, I'm using 5 minutes to wait for the loop to expire if no new or updated files are found, but you can change that to suit your needs.

Deleting all files and folders works very slowly

I need to delete all the archived files and folders older than 15 days.
I have implemented a solution using a PowerShell script, but it takes more than a day to delete all the files. The total size of the folder is less than 100 GB.
$StartFolder = "\\Guru\Archive\"
$deletefilesolderthan = "15"
# Get folder names for the ForEach loop
$SubFolders = Get-ChildItem -Path $StartFolder |
    Where-Object {$_.PSIsContainer -eq "True"} |
    Select-Object Name
# Loop through folders
foreach ($Subfolder in $SubFolders) {
    Write-Host "Processing Folder:" $Subfolder
    # For each folder, recurse and delete files older than the specified number of days
    # while the folder structure is left intact.
    Get-ChildItem -Path $StartFolder$($Subfolder.name) -Include *.* -File -Recurse |
        Where LastWriteTime -lt (Get-Date).AddDays(-$deletefilesolderthan) |
        foreach {$_.Delete()}
    # $dirs will be an array of empty directories returned after filtering; loop until
    # $dirs is empty, while excluding the "Inbound" and "Outbound" folders.
    do {
        $dirs = gci $StartFolder$($Subfolder.name) -Exclude Inbound,Outbound -Directory -Recurse |
            Where {(gci $_.FullName).Count -eq 0} |
            select -ExpandProperty FullName
        $dirs | ForEach-Object {Remove-Item $_}
    } while ($dirs.Count -gt 0)
}
Write-Host "Completed" -ForegroundColor Green
#Read-Host -Prompt "Press Enter to exit"
Please suggest some way to optimise the performance.
If you have many smaller files, the long delete time is not abnormal, because each individual file entry has to be processed. Some improvements can be made depending on your version; I'm going to assume you're on at least v4.
#requires -Version 4
param(
    [string]
    $start = '\\Guru\Archive',

    [int]
    $thresholdDays = 15
)
# getting the name wasn't useful; keep objects as objects
foreach ($folder in Get-ChildItem -Path $start -Directory) {
    "Processing Folder: $folder"
    # get all items once
    $folders, $files = ($folder | Get-ChildItem -Recurse).
        Where({ $_.PSIsContainer }, 'Split')
    # process files
    $files.Where{
        $_.LastWriteTime -lt (Get-Date).AddDays(-$thresholdDays)
    } | Remove-Item -Force
    # process folders
    $folders.Where{
        $_.Name -notin 'Inbound', 'Outbound' -and
        ($_ | Get-ChildItem).Count -eq 0
    } | Remove-Item -Force
}
"Complete!"
The reason it takes so much time is that you are deleting the files and folders over the network, which requires additional network communication for every single file and folder. You can easily verify that using a network analyzer. The best approach here is to run the code that performs the file operations on the remote machine itself, for example using:
WinRM
psexec (first copy the script to the remote machine, then execute it using psexec)
remote WMI (using CIM_Datafile)
or even adding the needed task to the scheduler on the remote machine
I would prefer WinRM, but psexec is also a good choice (if you don't want to perform the additional configuration of WinRM).
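A minimal sketch of the WinRM route, assuming PowerShell Remoting is already enabled on the file server (here called Guru) and that the \\Guru\Archive share corresponds to D:\Archive locally on that machine (both names are assumptions):
# Run the cleanup on the server itself so every delete is a local disk
# operation instead of a network round-trip.
Invoke-Command -ComputerName Guru -ScriptBlock {
    param([string]$LocalPath, [int]$Days)
    Get-ChildItem -Path $LocalPath -File -Recurse |
        Where-Object { $_.LastWriteTime -lt (Get-Date).AddDays(-$Days) } |
        Remove-Item -Force
} -ArgumentList 'D:\Archive', 15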

PowerShell: copy files after a date has passed, keeping the file structure

I am trying to copy files off a server onto another one, keeping the structure intact (e.g. C:\folder\folder\file). If the folder is already there, copy the file into it; if it is not, create the folders and then copy into it.
I would also like to filter out the files that are still needed: I want to keep files for 30 days and then move them.
[int]$Count = 0
$filter = (Get-Date).AddDays(-15).ToString("MM/dd/yyyy")
Get-WMIObject Win32_LogicalDisk | ForEach-Object {
    $SearchFolders = Get-ChildItem ($_.DeviceID + "\crams") -Recurse
    $FileList = $SearchFolders |
        Where-Object {$_.Name -like "Stdout_*" -and $_.LastWriteTime -le $filter}
    [int]$Totalfiles = ($FileList | Measure-Object).Count
    Write-Host "There are a total of $Totalfiles found."
    echo $FileList
    Start-Sleep 30
    [int]
    ForEach ($Item in $FileList) {
        $Count++
        $File = $Item
        Write-Host "Now Moving $File"
        $destination = "C:\StdLogFiles\"
        $path = Test-Path (Get-ChildItem $destination -Exclude "Stdout_*")
        if ($path -eq $true) {
            Write-Host "Directory Already exists"
            Copy-Item $File -Destination $destination
        }
        elseif ($path -eq $false) {
            cd $destination
            mkdir $File
            Copy-Item $File -Destination $destination
        }
    }
}
That is what I have so far. It has changed a lot while trying to get it to work, but the search works and so does the date part; I just cannot get it to keep the structure of the files.
Okay, I took out the bottom part and put in the following (I also tried Get-Content, but the path is null):
ForEach ($Item in Get-ChildItem $FileList) {
    $Count++
    $destination = "C:\StdLogFiles"
    $File = $Item
    Write-Host "Now Moving $File to $destination"
    Copy-Item -Path $File.FullName -Destination $destination -Force
}
It is copying everything that is in C:\ into that folder, but not the files, and I do not understand what it is doing now. I had it copying the files before, and even after going back to an older version I can't get it to work again. I am going to leave it alone before I break it more!
Any help or thoughts would be appreciated.
I think RoboCopy is probably a simpler solution for you, to be honest. But if you insist on using PowerShell, you are going to need to set up your destination better if you want to keep your file structure. You also want to leave your filter date as a [DateTime] object instead of converting it to a string, since what you are comparing it against (LastWriteTime) is a [DateTime] object. You'll need to do something like:
$filter = (Get-Date).AddDays(-15)
$FileList = Get-WMIObject Win32_LogicalDisk | ForEach-Object {
    Get-ChildItem ($_.DeviceID + "\crams") -Recurse | Where-Object {$_.Name -like "Stdout_*" -and $_.LastWriteTime -le $filter}
}
$Totalfiles = $FileList.Count
for ($i = 1; $i -le $Totalfiles; $i++) {
    $File = $FileList[($i - 1)]
    Write-Progress -Activity "Backing up old files" -CurrentOperation ("Copying file: " + $File.Name) -Status "$i of $Totalfiles files" -PercentComplete ($i * 100 / $Totalfiles)
    $Destination = (Split-Path $File.FullName) -replace "^.*?\\crams", "C:\StdLogFiles"
    if (!(Test-Path $Destination)) {
        New-Item -Path $Destination -ItemType Directory | Out-Null
    }
    Copy-Item $File -Destination $Destination
}
Write-Progress -Activity "Backing up old files" -Completed
That gathers all the files you need to move from all disks, takes a count of them, and then enters a loop that will cycle as many times as you have files. In the loop it assigns the current item to a variable, then updates a progress bar based on progress. It then derives the destination by replacing the beginning of the file's full path (minus the file name) with your target destination of 'C:\StdLogFiles'. So D:\Crams\HolyPregnantNunsBatman\Stdout04122015.log becomes C:\StdLogFiles\HolyPregnantNunsBatman. Then it tests the path, and if it's not valid it creates it (piped to Out-Null to avoid spam). Then we copy the file to the destination and move on to the next item. After the files are done we close out the progress bar.
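To make the path rewrite concrete, here is the same -replace applied to the example path from the paragraph above (the path itself is just an illustration):
$full = 'D:\Crams\HolyPregnantNunsBatman\Stdout04122015.log'
(Split-Path $full) -replace '^.*?\\crams', 'C:\StdLogFiles'
# -replace is case-insensitive by default, so '\Crams' matches '\crams'
# -> C:\StdLogFiles\HolyPregnantNunsBatman
And if you do go the RoboCopy route suggested above, a rough equivalent would be something like this sketch (assuming a single source drive and the same 15-day threshold):
# /S copies subdirectories (keeping the folder structure), /MINAGE:15 skips
# files newer than 15 days, /MOV deletes the source files after copying.
robocopy D:\crams C:\StdLogFiles Stdout_* /S /MINAGE:15 /MOV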

PowerShell Remove-Item silently doesn't remove a single file

I have mixed together a PowerShell script that browses folders recursively, searching for pictures whose resolution is too low.
I have googled for a solution for a couple of hours and so far I haven't managed to get those images deleted. So what's the deal with this?
Turns out that square brackets in file names were causing problems. But now Remove-Item complains that the files are used by another process, which I assume is this script itself.
$kuveja = 0
$pikkusii = 0
[void][Reflection.Assembly]::LoadWithPartialName("System.Drawing")
function Get-Image {
    process {
        $file = $_
        [Drawing.Image]::FromFile($_.FullName) |
            ForEach-Object {
                $_ | Add-Member -PassThru NoteProperty FullName ('{0}' -f $file.FullName)
            }
    }
}
$path = 'H:\Juttui\'
Get-ChildItem $path\* -Include *.jpg, *.jpeg* -Recurse | ? {
    $kuva = $_ | Get-Image
    if ($kuva.Width -lt 960 -and $kuva.Height -lt 960) {
        Write-Host $_.FullName -ForegroundColor Red
        Remove-Item $_.FullName -Force
        $pikkusii++
    }
    elseif ($kuva.Width -lt 540 -or $kuva.Height -lt 540) {
        Write-Host $_.FullName -ForegroundColor Yellow
        Remove-Item $_.FullName -Force
        $pikkusii++
    }
    $kuveja++
}
Write-Host "Pics browsed: $kuveja" -ForegroundColor Green
Write-Host "Small: $pikkusii" -ForegroundColor Green
Please understand that I am a complete noob in PowerShell, and I don't know a better place to ask this than here.
I think your $kuva reference is keeping the file open. Try calling the .Dispose() method before trying to remove the file. So, something like this:
Get-ChildItem $path\* -Include *.jpg, *.jpeg* -Recurse | ? {
    $kuva = $_ | Get-Image
    if ($kuva.Width -lt 960 -and $kuva.Height -lt 960) {
        Write-Host $_.FullName -ForegroundColor Red
        $kuva.Dispose()
        Remove-Item $_.FullName -Force
        $pikkusii++
    }
    elseif ($kuva.Width -lt 540 -or $kuva.Height -lt 540) {
        Write-Host $_.FullName -ForegroundColor Yellow
        $kuva.Dispose()
        Remove-Item $_.FullName -Force
        $pikkusii++
    }
    $kuveja++
}
Get-ChildItem $path\* -Include *.jpg, *.jpeg* -Recurse | ? {
    # ...
}
? means Where-Object, but I think what you really want is ForEach-Object. The alias for ForEach-Object is %, but you should not use aliases in scripts for this very reason (it's less clear).
Try this:
Get-ChildItem $path\* -Include *.jpg, *.jpeg* -Recurse | ForEach-Object {
    # ...
}
If that doesn't work: you have Write-Host calls before the removes, so are you seeing the red and yellow text that indicates the remove should be happening? You should also try running your script in the ISE. Set a breakpoint right before the Remove-Item call, then step through the code.
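A minimal sketch of that debugging setup (the script file name here is a placeholder):
# Break into the debugger every time the script is about to call Remove-Item.
Set-PSBreakpoint -Script .\Clean-Images.ps1 -Command Remove-Item
.\Clean-Images.ps1   # at each break, inspect $kuva and $_ before continuing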