PowerShell: Get-ChildItem performance with a large number of files

The scenario: a folder on a remote server is shared so that people can access its log files.
The log files are kept for around 30 days before they age out, and around 1000 log files are generated each day.
For problem analysis, I need to copy log files to my own machine, selected by file timestamp.
My previous strategy was:
1. Use the dir /OD command to get the list of files and save it to a file on my local PC
2. Open that file, find the timestamps, and build the list of files I need to copy
3. Use the copy command to copy the actual log files
It works but needs some manual work; in step 2, for example, I use Notepad++ and a regular expression to filter by timestamp.
I tried to use PowerShell instead:
Get-ChildItem -Path $remotedir | where-object {$_.lastwritetime -gt $starttime -and $_.lastwritetime -lt $endtime } | foreach {copy-item $_.fullname -destination .\}
However, this approach ran for hours without copying a single file. By comparison, the dir solution took around 7-8 minutes to generate the file list, and the copy itself then took some time, but not hours.
I guess most of the time is spent filtering the files, but I'm not sure why Get-ChildItem performs so poorly.
Can you please advise if there's anything I can change?
Thanks

For directories with a lot of files, Get-ChildItem is too slow. It looks like most of the time is spent enumerating the directory; only then does the pipeline filter through 'where' and copy each file.
Use .NET directly, in particular [io.directoryinfo] with the GetFileSystemInfos() method.
e.g.
$remotedir = [io.directoryinfo]'\\server\share'
$destination = '.\'
$filemask = '*.*'
$starttime = [datetime]'jan-21-2021 1:23pm'
$endtime = [datetime]'jan-21-2021 4:56pm'
$remotedir.GetFileSystemInfos($filemask, [System.IO.SearchOption]::TopDirectoryOnly) | % {
    if ($_.lastwritetime -gt $starttime -and $_.lastwritetime -lt $endtime) {
        Copy-Item -Path $_.fullname -Destination $destination
    }
}
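As a variation on the snippet above, here is a minimal sketch that uses DirectoryInfo.EnumerateFiles() (available from .NET Framework 4), which yields entries lazily so copying can begin before the whole listing has been read. The function name Copy-FilesInWindow and its parameters are hypothetical, chosen only for illustration; whether it is actually faster on a given share is something to measure rather than assume.

```powershell
# Hypothetical helper; the name and parameters are illustrative only.
function Copy-FilesInWindow {
    param(
        [Parameter(Mandatory)] [string]   $Source,
        [Parameter(Mandatory)] [string]   $Destination,
        [Parameter(Mandatory)] [datetime] $StartTime,
        [Parameter(Mandatory)] [datetime] $EndTime,
        [string] $FileMask = '*'
    )
    # Pass a full path: .NET resolves relative paths against the process
    # working directory, which can differ from the PowerShell location.
    $dir = [System.IO.DirectoryInfo]$Source
    # EnumerateFiles returns FileInfo objects one at a time (top directory only).
    foreach ($file in $dir.EnumerateFiles($FileMask)) {
        if ($file.LastWriteTime -gt $StartTime -and $file.LastWriteTime -lt $EndTime) {
            Copy-Item -LiteralPath $file.FullName -Destination $Destination
            $file.Name   # emit the name of each copied file
        }
    }
}
```

It could be called as, for example, Copy-FilesInWindow -Source '\\server\share' -Destination 'C:\Logs' -StartTime $starttime -EndTime $endtime, where the destination folder is assumed to exist already.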

Related

Copy files after time x based on modification date

I need a script that copies only files whose modification date is at least 5 minutes old. Does anyone have a solution for this?
I couldn't find any script online.
The answer from jdweng is a good solution to identify the files in scope.
You could make your script something like this to easily re-use it with other paths or file age.
# Customizable variables
$Source = 'C:\Temp\Input'
$Destination = 'C:\Temp\Output'
[int32]$FileAgeInMinutes = 5
# Script Execution
Get-ChildItem -Path $Source | Where-Object { $_.LastWriteTime -lt (Get-Date).AddMinutes(-$FileAgeInMinutes) } | Copy-Item -Destination $Destination
You could then run this script as a scheduled task and have it run periodically, depending on your needs.

using 7zip to zip in powershell 5.0

I have customized a PowerShell script to zip files older than 7 days from a source folder into a subfolder and then delete the original files from the source once zipping is complete. The code works fine with the built-in Compress-Archive and Remove-Item cmdlets for a small volume of files, but it takes more time and system memory for a large volume. So I'm working on a solution using 7-Zip instead, as it's faster.
The script below zips correctly but does not honor the condition of only files older than 7 days, and it deletes all the files from the source folder. It should zip and delete only files older than 7 days.
I have tried every way I can think of to troubleshoot this, with no luck. Can anybody suggest a possible solution?
if (-not (test-path "$env:ProgramFiles\7-Zip\7z.exe")) {throw "$env:ProgramFiles\7-Zip\7z.exe needed"}
set-alias sz "$env:ProgramFiles\7-Zip\7z.exe"
$Date = Get-Date -format yyyy-MM-dd_HH-mm
$Source = "C:\Users\529817\New folder1\New folder_2\"
$Target = "C:\Users\529817\New folder1\New folder_2\ARCHIVE\"
Get-ChildItem -path $Source | sz a -mx=9 -sdel $Target\$Date.7z $Source
There are several problems here. The first is that 7-Zip doesn't accept a list of files from a pipe. Furthermore, even if it did, your GCI is selecting every file rather than selecting by date. The only reason it works at all is that you are passing the source folder as a parameter to 7-Zip.
7-Zip accepts the list of files to zip as command line arguments:
Usage: 7z <command> [<switches>...] <archive_name> [<file_names>...] [@listfile]
And you can select the files you want by filtering the output from GCI by LastWriteTime.
Try changing your last line to this:
sz a -mx=9 -sdel $Target\$Date.7z (gci -Path $Source |? LastWriteTime -lt (Get-Date).AddDays(-7) | select -expandproperty FullName)
If you have hundreds of files and long paths then you may run into problems with the length of the command line in which case you might do this instead:
gci -Path $Source |? LastWriteTime -lt (Get-Date).AddDays(-7) |% { sz a -mx=9 -sdel $Target\$Date.7z $_.FullName }
Consider a temporary file with a list of those files which need to be compressed:
$tmp = "$($(New-Guid).guid).tmp"
set-content $tmp (gci -Path $Source |? LastWriteTime -lt (Get-Date).AddDays(-7)).FullName
sz a -mmt=8 out.7z "@$tmp"
Remove-Item $tmp
Also looking at the parameters to 7-Zip: -mx=9 will be slowest for potentially a small size gain. Perhaps leave that parameter out and take the default and consider adding -mmt=8 to use multiple threads.
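The list-file approach can be wrapped up as a minimal sketch like the one below. The function name New-OldFileList and its parameters are hypothetical, introduced only for illustration. It writes the full paths of files older than a cutoff to a temporary list file (UTF-8 without a byte order mark) and returns the path, or $null when nothing matches so the caller can skip the 7-Zip call entirely.

```powershell
# Hypothetical helper; the name and parameters are illustrative only.
function New-OldFileList {
    param(
        [Parameter(Mandatory)] [string] $Source,
        [int] $OlderThanDays = 7
    )
    $cutoff = (Get-Date).AddDays(-$OlderThanDays)
    # Collect full paths of files (not folders) last written before the cutoff.
    $names = @(Get-ChildItem -LiteralPath $Source -File |
        Where-Object { $_.LastWriteTime -lt $cutoff } |
        ForEach-Object { $_.FullName })
    # Nothing to archive: return $null instead of creating an empty list file.
    if ($names.Count -eq 0) { return $null }
    $listPath = Join-Path ([System.IO.Path]::GetTempPath()) ("{0}.lst" -f [guid]::NewGuid())
    # WriteAllLines emits UTF-8 without a BOM, one path per line.
    [System.IO.File]::WriteAllLines($listPath, [string[]]$names)
    return $listPath
}

# Usage sketch, assuming the sz alias, $Source, $Target and $Date from the question:
# $list = New-OldFileList -Source $Source -OlderThanDays 7
# if ($list) { sz a -sdel "$Target\$Date.7z" "@$list"; Remove-Item $list }
```

Quoting the argument as "@$list" keeps PowerShell from treating the leading @ as anything other than literal text.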

Powershell Get-ChildItem returns nothing for folder

A friend of mine asked me to write a short script for him. The script should check a specific folder, find all files and subfolders older than X days and remove them. Simple so far, I wrote the script, successfully tested it on my own system and sent it to him. Here's the thing - it doesn't work on his system. To be more specific, the Get-ChildItem cmdlet does not return anything for the provided path, but it gets weirder even, more on that later. I'm using the following code to first find the files and folders (and log them before deleting them later on):
$Folder = "D:\Data\Drive_B$\General\ExchangeFolder"
$CurrentDate = Get-Date
$TimeSpan = "-1"
$DatetoDelete = $CurrentDate.AddDays($TimeSpan)
$FilesInFolder = (Get-ChildItem -Path $Folder -Recurse -ErrorAction SilentlyContinue | Where-Object {$_.LastWriteTime -lt $DatetoDelete})
All variables are filled and we both know that the folder is filled to the brim with files and subfolders older than one day, which was our timespan we chose for the test. Now, the interesting part is that not only does Get-ChildItem not return anything - going to the folder itself and typing in "dir" does not return anything either. Never seen behaviour like this. I've checked everything I could think of - is it DFS?, typos, folder permissions, share permissions, hidden files, ExecutionPolicy. Everything is as it should be to allow this script to work properly as it did on my own system when initially testing it. The script does not return any errors whatsoever.
So for some reason, the content of the folder cannot be found by powershell. Does anyone know of a reason why this could be happening? I'm at a loss here :-/
Thanks for your time & help,
Fred
.AddDays() takes a double, so I would pass it a number rather than a string.
Filter then action
This code will work for you.
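Add-Type -AssemblyName System.Windows.Forms   # required for MessageBox
$folder = Read-Host -Prompt 'File path'
$datetodel = (Get-Date).AddDays(-1)
# Collect items older than the cutoff; on a terminating error, keep the error record instead
$results = try { gci -Path $folder -Recurse | select FullName, LastWriteTime | ? { $_.LastWriteTime -lt $datetodel } } catch { $_ }
$info = "{0} files older than: {1} deleting ...." -f @($results).Count, $datetodel
# Show the candidates in a grid; continue only if the user selects rows and clicks OK
if ($results | ogv -PassThru) {
    [System.Windows.Forms.MessageBox]::Show($info)
    # Put your code here for the removal of the files
    # $results | % { del -LiteralPath $_.FullName -Force }
} else {
    [System.Windows.Forms.MessageBox]::Show("Exiting")
}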

Increase speed of PowerShell Get-ChildItem large directory

I have a script that references a .csv document of filenames and then runs a Get-ChildItem over a large directory to find the file and pull the 'owner'. Finally the info outputs into another .csv document. We use this to find who created files. Additionally I have it create .txt files with filename and timestamp to see how fast the script is finding the data. The code is as follows:
Get-ChildItem -Path $serverPath -Filter $properFilename -Recurse -ErrorAction 'SilentlyContinue' |
Where-Object {$_.LastWriteTime -lt (get-date).AddDays(30) -and
$_.Extension -eq ".jpg"} |
Select-Object -Property @{
Name='Owner'
Expression={(Get-Acl -Path $_.FullName).Owner}
},'*' |
Export-Csv -Path "$desktopPath\Owner_Reports\Owners.csv" -NoTypeInformation -Append
$time = (get-date -f 'hhmm')
out-file "$desktopPath\Owner_Reports\${fileName}_$time.txt"
}
This script serves its purpose but is extremely slow because of the large size of the parent directory. Currently it takes 12 minutes per filename. We query approximately 150 files at a time, and this long wait is hindering production.
Does anyone have better logic that could increase the speed? I assume that each time the script runs Get-ChildItem it recreates the index of the parent directory, but I am not sure. Is there a way we can create the index one time instead of for each filename?
I am open to any and all suggestions! If more data is required (such as the variable naming etc) I will provide upon request.
Thanks!
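One way to read the "index once" idea from the question is sketched below: walk the tree a single time, build a hashtable keyed by file name, and then look up each requested name in memory. The function name New-FileNameIndex is hypothetical, and this is only a sketch under the assumption that the directory contents do not change between lookups. Note also that EnumerateFiles with AllDirectories throws if it reaches a folder it cannot read, unlike Get-ChildItem with -ErrorAction SilentlyContinue.

```powershell
# Hypothetical helper; the name and parameter are illustrative only.
function New-FileNameIndex {
    param([Parameter(Mandatory)] [string] $Root)
    # PowerShell hashtable literals compare string keys case-insensitively.
    $index = @{}
    $dir = [System.IO.DirectoryInfo]$Root
    # Walk the whole tree exactly once.
    foreach ($file in $dir.EnumerateFiles('*', [System.IO.SearchOption]::AllDirectories)) {
        if (-not $index.ContainsKey($file.Name)) {
            $index[$file.Name] = [System.Collections.Generic.List[System.IO.FileInfo]]::new()
        }
        # The same name may occur in several subfolders, so keep a list per name.
        $index[$file.Name].Add($file)
    }
    return $index
}

# Usage sketch, assuming $serverPath from the question and a $fileNames array read from the .csv:
# $index = New-FileNameIndex -Root $serverPath
# foreach ($name in $fileNames) {
#     foreach ($file in $index[$name]) { (Get-Acl -Path $file.FullName).Owner }
# }
```

With this shape the expensive directory walk happens once per run instead of once per filename, and the per-file Get-Acl call is made only for the matches.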

Copy files in PowerShell too slow

Good day, all. New member here and relatively new to PowerShell so I'm having trouble figuring this one out. I have searched for 2 days now but haven't found anything that quite suits my needs.
I need to copy folders created on the current date to another location using mapped drives. These folders live under 5 other folders, based on language.
Folder1\Folder2\Folder3\Folder4\chs, enu, jpn, kor, tha
The folders to be copied all start with the same letters followed by numbers - abc123456789_111. With the following script, I don't need to worry about folder names because only the folder I need will have the current date.
The folders that the abc* folders live in have about 35k files and over 1500 folders each.
I have gotten all of this to work using Get-ChildItem but it is so slow that the developer could manually copy the files by the time the script completes. Here is my script:
GCI -Path $SrcPath -Recurse |
Where {$_.LastWriteTime -ge (Get-Date).Date} |
Copy -Destination {
if ($_.PSIsContainer) {
Join-Path $DestPath $_.Parent.FullName.Substring($SrcPath.length)
} else {
Join-Path $DestPath $_.FullName.Substring($SrcPath.length)
}
} -Force -Recurse
(This only copies to one destination folder at the moment.)
I have also been looking into using cmd /c dir and cmd /c forfiles but haven't been able to work it out. Dir will list the folders but not by date. Forfiles has turned out to be pretty slow, too.
I'm not a developer but I'm trying to learn as much as possible. Any help/suggestions are greatly appreciated.
@BaconBits is right: you have a -Recurse on your Copy-Item as well as on your Get-ChildItem. This causes a lot of extra, pointless copies which are just overwrites due to your -Force parameter. Change your script to use a foreach loop and drop the -Recurse parameter from Copy-Item:
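GCI -Path $SrcPath -Recurse |
Where {$_.LastWriteTime -ge (Get-Date).Date} | % {
    # Work out the destination relative to $SrcPath
    if ($_.PSIsContainer) {
        $dest = Join-Path $DestPath $_.Parent.FullName.Substring($SrcPath.Length)
    } else {
        $dest = Join-Path $DestPath $_.FullName.Substring($SrcPath.Length)
    }
    # Copy this one item explicitly; no -Recurse, since GCI already walks the tree
    Copy-Item -LiteralPath $_.FullName -Destination $dest -Force
}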