First run of script is slower - PowerShell

I am doing some performance profiling and noticed something very odd. I wanted to compare two ways of building a list of XML files. One approach uses Get-ChildItem twice: the first call filters for a single file, the second filters on a wildcard file name. The other approach uses Get-Item for the single file and Get-ChildItem for the multiple files. I wrapped both in Measure-Command.
When I run it, the first run shows a much longer time for the very first Measure-Command, but it doesn't matter which approach is first. And it's only the first one over a fair amount of time. So, if we call the two approaches GIGC (Get-Item & Get-ChildItem) and GCGC (Get-ChildItem & Get-ChildItem), and I have the order GIGC then GCGC, when I run it I will see 2.5 seconds for GIGC and 1.5 seconds for GCGC. If I immediately rerun it they will both be around 1.5 seconds, and they stay around 1.5 seconds as I rerun over and over. Let the console sit for a few minutes and GIGC will again be around 2.5 seconds. BUT, if I reverse it, GCGC first and GIGC second, the pattern stays the same: the first Measure-Command, this time of GCGC, will be 2.5 seconds, and all the rest will be 1.5. Let the console sit long enough and the first one will go back up.
$firmAssets = '\\Mac\Support\Px Tools\Dev 4.0'
Measure-Command {
    [Collections.ArrayList]$sourceDefinitions = @(Get-Item "$firmAssets\Definitions.xml") + @(Get-ChildItem $firmAssets -Filter:Definitions_*.xml -Recurse)
}
Measure-Command {
    [Collections.ArrayList]$sourceDefinitions = @(Get-ChildItem $firmAssets -Filter:Definitions.xml) + @(Get-ChildItem $firmAssets -Filter:Definitions_*.xml -Recurse)
}
At first I thought the issue might be with Measure-Command, so I changed the code to use Get-Date as the timer. Same results.
$startTime = Get-Date
[Collections.ArrayList]$sourceDefinitions = @(Get-Item "$firmAssets\Definitions.xml") + @(Get-ChildItem $firmAssets -Filter:Definitions_*.xml -Recurse)
Write-Host "$(New-TimeSpan -Start:$startTime -End:(Get-Date))"
$startTime = Get-Date
[Collections.ArrayList]$sourceDefinitions = @(Get-ChildItem $firmAssets -Filter:Definitions.xml) + @(Get-ChildItem $firmAssets -Filter:Definitions_*.xml -Recurse)
Write-Host "$(New-TimeSpan -Start:$startTime -End:(Get-Date))"
So, as a last test, I thought it might be something related to the console, so I converted it to a script. Weirdly, same result! The first run is always substantially slower. My (uneducated) guess is that there is some memory initialization happening on the first run that then stays initialized across multiple uses of PowerShell, be it scripts or console. Basically some sort of PowerShell overhead, which feels like something we can't work around. But hopefully someone has a better answer than my "That's just how PowerShell works, get over it".
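If it really is one-time initialization overhead, one pragmatic way to keep it out of the comparison is to repeat each measurement and discard the first pass. A minimal sketch, reusing $firmAssets from above (the run count of 5 is arbitrary):
# Repeat the measurement so any cold-start cost only lands on the first pass.
$times = 1..5 | ForEach-Object {
    Measure-Command {
        [Collections.ArrayList]$sourceDefinitions = @(Get-Item "$firmAssets\Definitions.xml") +
            @(Get-ChildItem $firmAssets -Filter:Definitions_*.xml -Recurse)
    }
}
# Drop the first (cold) run and average the remaining TimeSpans.
$times | Select-Object -Skip 1 | Measure-Object -Property TotalSeconds -Average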

Related

PowerShell: Why does it take 3 seconds to access a recycled bin item?

EDIT: Because of helpful comments, I'm revising this question, combining the two sample scripts given before into one.
The following PowerShell code accesses the first ten items in the recycle bin and outputs their original filenames and paths in a CSV file. It works, but it reports an elapsed time of about 3 to 4 seconds each time the code asks for the object of an item in the bin. It also reports a similar amount of time to count all the items in the bin. So it seems as if the delay may be caused just by the code accessing the bin object, and once the object is open, it can do whatever it wants in negligible time. But I can't speed this up by piping because I need a separate object for each bin item as an input parameter for the GetDetailsOf() method. (Am I wrong about that? Is there a way to pipe this process?)
Other people have reported in the comments that they've run this code (earlier version in two segments before this edit) with access of each bin item taking negligible time. So the problem is not the code, but something in my computer when the code accesses a bin object.
$oShell = New-Object -com shell.application
$ssfBitBucket = 10
$oRecycleBin = $oShell.Namespace($ssfBitBucket)
$aasBin = New-Object System.Collections.ArrayList
$dTimeAllStart = $(get-date)
echo ""
$dTimeCountBinStart = $(get-date)
echo ("Count of recycle bin objects = " + $oRecycleBin.Items().count)
echo ("Elapsed time on count of bin: " + ($(get-date) - $dTimeCountBinStart) + ".")
for ($nCount = 0 ; $nCount -lt 10 ; $nCount++)
{$dTimeAccessOneItemStart = $(get-date)
$oRBItem = $oRecycleBin.Items().Item($nCount)
echo ("Elapsed time on item " + $nCount + ": " + ($(get-date) - $dTimeAccessOneItemStart) + ".")
$asBinItem = New-Object PsObject -property `
@{'Name' = $oRecycleBin.GetDetailsOf($oRBItem, 0)
'OriginalLocation' = $oRecycleBin.GetDetailsOf($oRBItem, 1)}
$aasBin.add($asBinItem)
echo $oRBItem.Name
}
$aasBin `
| select Name, OriginalLocation `
| Export-Csv -Path $sOutputFilespec -NoTypeInformation
out-file -InputObject ("") -FilePath $sOutputFilespec -append
$dTimeAllEnd = $(get-date)
$dTimeAllElapsed = $dTimeAllEnd - $dTimeAllStart
$sTimeAllElapsed = $dTimeAllElapsed.ToString("hh\:mm\:ss")
echo ("Elapsed time: " + $sTimeAllElapsed + ".")
This code is stripped-down from a script I want to run on the entire recycle bin. With 17,000 items, at 3 seconds per item, that's going to take 14 hours, which is way too long.
An additional wrinkle is that if I've been doing something else and come back to this, or if I restart the computer and run this with nothing else loaded, then the first access takes about 7 seconds. If the piece of code that counts the bin is there, then that takes about 7 seconds and everything else takes about 3 to 4. If the bin is not counted, then accessing the first item takes about 7 seconds and the rest take about 3 to 4. But with or without counting the bin, if I run the code shortly after it has just run, then everything takes 3 to 4 seconds.
As reported in response to the comments:
Task Manager does not show any bump in memory usage when running the code.
From running my larger script, I know that the ten items range in size from 15 KB to 5.5 MB. All are local on one SSD. All but one had their original location in one folder. The deletion dates range from April 2021 to Nov. 2022. In one particular run, the access times range from 3.0 to 3.8 seconds.
While NTFS can have performance problems with very large folders, those problems appear to occur with file counts over 100k, whereas my recycle bin has 17,189 items.
Edit 2: To check if my Internet security software, SentinelOne, is causing the delay, I went offline, disabled the SentinelOne agent, then ran the script again. The numbers seemed to change somewhat, but not enough to make a big difference.
How can I find out what my computer is doing that's making it take several seconds to access an object of the recycled bin?

How to find a file from files via PowerShell?

I had an Excel script that searched for files with a single command.
I found this example on the forum; it says that to search for a file by name you write the name followed by (*), but when I run it, it does not find anything:
Get-ChildItem -Path "C:\Folder\test*"
What can I do to simplify the code and make it much faster? Waiting 10 minutes to find one file out of 10,000 is far too long.
I have a folder with 10,000 files, and the Excel/VBA script finds the file in about 2-3 seconds.
But when I script it in PowerShell like this:
$find = Get-ChildItem -Path "C:\Folder"
for ($f = 0; $f -lt $find.Count; $f++) {
    $path_name = $find[$f].Name
    if ($path_name -eq 'test') {
        Write-Host 'success'
    }
}
it takes far too long: the script hangs for 10 minutes without responding, and may or may not eventually finish.
How can I find a file by filter using Get-ChildItem?
To make your search faster you can use Get-ChildItem's -Filter parameter.
$fileName = "test.txt"
$filter = "*.txt"
$status = Get-ChildItem -Path "C:\PS\" -Recurse -Filter $filter | Where-Object {$_.Name -match $fileName}
if ($status) {
    Write-Host "$($status.Name) is found"
} else {
    Write-Host "No such file is available"
}
You could also compare the speed of searching by using Measure-Command
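For instance, a rough comparison along these lines (paths and names as in the snippet above) would show the difference between filtering at the provider level and filtering everything afterwards with Where-Object:
# Rough timing comparison: provider-side -Filter vs. filtering everything afterwards.
Measure-Command {
    Get-ChildItem -Path "C:\PS\" -Recurse -Filter "*.txt" |
        Where-Object { $_.Name -match "test.txt" }
}
Measure-Command {
    Get-ChildItem -Path "C:\PS\" -Recurse |
        Where-Object { $_.Name -match "test.txt" }
}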
If the disk the data is on is slow then it'll be slow no matter what you do.
If the folder is full of files then it'll also be slow depending on the amount of RAM in the system.
Fewer files per folder equals more performance, so try to split them up into several folders if possible.
Doing that may also mean you can run several Get-ChildItems at once (disk permitting) using PSJobs; see the sketch after the example below.
Using several loops to take care of a related problem usually makes the whole thing run "number of loops" times as long. That's what Where-Object is for (in addition to the -Filter, -Include and -Exclude parameters of Get-ChildItem).
Console I/O takes A LOT of time. Do NOT output ANYTHING unless you have to, especially not inside loops (or cmdlets that act like loops).
For example, including basic statistics:
$startTime = Get-Date
$FileList = Get-ChildItem -Path "C:\Folder" -File -Filter 'test'
$EndTime = Get-Date
$FileList
$AfterOutputTime = Get-Date
'Seconds taken for listing:'
($EndTime - $startTime).TotalSeconds
'Seconds taken including output:'
($AfterOutputTime - $StartTime).TotalSeconds
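And, if the files have been split across several folders as suggested above, the searches could be run in parallel with background jobs. A rough sketch (the folder names are placeholders):
# Sketch: search several folders in parallel with PSJobs, then collect the results.
$folders = 'C:\Folder\Part1', 'C:\Folder\Part2', 'C:\Folder\Part3'
$jobs = foreach ($folder in $folders) {
    Start-Job -ScriptBlock {
        param($path)
        Get-ChildItem -Path $path -File -Filter 'test*'
    } -ArgumentList $folder
}
$results = $jobs | Wait-Job | Receive-Job
$jobs | Remove-Job
$results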

Get-Random increased randomness HELP Powershell

I need a little help with increasing the randomness of Get-Random so that it does not repeat so often.
I have been using this with powershell:
$formats = @("*.mpg")
$dir = Split-Path $MyInvocation.MyCommand.Path
gci "$dir\*" -include $formats | Get-Random -Count 1 |
Invoke-Item
My application is an in-home virtual television channel that randomly selects 1 file from multiple folders at 6am and plays them all day until midnight, powers off, starts up, and repeats the process daily. The problem I am finding is that whatever Get-Random uses has a tendency to choose the exact same file often. Thus, after months of running this script, I am seeing the exact same movies chosen day after day and some that are never chosen. I'm guessing this is because Get-Random is using the clock as its factor for choosing a number?
Is there a way to increase the odds of getting a broader selection of files (.mpg's in this instance) and fewer repeats of the same .mpg's?
My other option was to find a script that would keep track of the .mpg's chosen and "mark" them, or sort by date accessed and not play the same file twice, until all files of a particular folder have been played once; then "unmarking" all the files and starting over. To me that sounds like advanced scripting that I just don't have the knowledge to procure on my own.
Any insight into my dilemma would be vastly appreciated. Any questions about my specifics to help you ascertain a conclusion to this query will be forthcoming.
Maybe what I want to know is how to increase the random distribution? I'm looking for a way to have more variety in the files chosen day after day and fewer repeats.
That (access time) should be fairly easy to check:
# Only consider files that have not been played (accessed) in the last 30 days
$item = gci "$dir\*" -include $formats | where {$_.LastAccessTime -le (Get-Date).AddDays(-30)} |
Get-Random -Count 1
# Stamp the file so it is excluded from the next 30 days of picks
$item.LastAccessTime = Get-Date
$item | Invoke-Item
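As for the other option mentioned in the question, marking files that have already been played until everything has had a turn, a rough sketch of that idea using a plain-text log (the played.txt name is made up for illustration):
# Sketch of the "mark played files" idea: log what has been played, reset once everything has been used.
$playedLog = Join-Path $dir 'played.txt'
$all    = gci "$dir\*" -include $formats
$played = if (Test-Path $playedLog) { Get-Content $playedLog } else { @() }
$remaining = $all | Where-Object { $played -notcontains $_.FullName }
if (-not $remaining) {
    # Every file has been played once; clear the log and start over.
    Remove-Item $playedLog -ErrorAction SilentlyContinue
    $remaining = $all
}
$pick = $remaining | Get-Random -Count 1
Add-Content -Path $playedLog -Value $pick.FullName
$pick | Invoke-Item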

Powershell script works on Powershell window line-by-line, but not in script

I have a script I'm using to loop through a bunch of domains and get dates from whois.exe. This works line-by-line, but when run as a script, it'll freeze. Here is where it gets stuck:
ForEach ($domain in $domains)
{
$domainname = $domain.Name
Write-Host "Processing $domainname..."
# WhoIsCL responds with different information depending on if it's a .org or something else.
if($domainname -like "*.org" -and $domainname)
{
$date = .\WhoIs.exe -v "$domainname" | Select-String -Pattern "Registry Expiry Date: " -AllMatches
Write-Host "Domain is a .org" -ForegroundColor "Yellow"
When I CTRL+C to cancel the command, I can verify that $domain is the correct variable. I can then write this:
if($domainname -like "*.org" -and $domainname)
{
"Test"
}
... and "Test" appears in the command line. I then run:
$date = .\WhoIs.exe -v "$domainname" | Select-String -Pattern "Registry Expiry Date: " -AllMatches
Upon checking the date, it comes out right and I get the appropriate date. Given it freezes right as it says "Processing $domainname..." and right before "Domain is a .org", I can only assume WhoIs.exe is freezing. So, why does this happen as the script is being run, but not directly from the Powershell window?
Lastly, I did a final test by simply copying and pasting the entire script into a Powershell window (which is just silly, but it appears to function) and get the same result. It freezes at whois.exe.
My best guess is that whois.exe needs to be run differently to be reliable in Powershell in my for-loop. However, I don't seem to have a way to test using it in a Start-Process and get string output.
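As a side note on that last point, Start-Process can still yield string output if stdout is redirected to a temporary file and read back; a small sketch (the example.org argument is just a placeholder):
# Sketch: run WhoIs.exe through Start-Process and capture its text output via a temp file.
$tmp = [System.IO.Path]::GetTempFileName()
Start-Process -FilePath ".\WhoIs.exe" -ArgumentList "-v", "example.org" `
    -RedirectStandardOutput $tmp -NoNewWindow -Wait
$output = Get-Content $tmp
Remove-Item $tmp
$output | Select-String -Pattern "Registry Expiry Date: " -AllMatches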
Anyways, advice would be great. I've definitely hit a wall.
Thanks!
If your script is running through lots of domains, it could be that you're being throttled. Here is a quote from the Nominet AUP:
The maximum query rate is 5 queries per second with a maximum of 1,000
queries per rolling 24 hours. If you exceed the query limits a block
will be imposed. For further details regarding blocks please see the
detailed instructions for use page. These limits are not per IP
address, they are per user.
http://registrars.nominet.org.uk/registration-and-domain-management/acceptable-use-policy
Different registrars may behave differently, but I'd expect some sort of rate limit. This would explain why a script (with high volume) behaves differently to ad-hoc manual lookups.
The proposed solution from the comments is to add Start-Sleep -Seconds 1 to the loop, between each WhoIs lookup.
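A minimal sketch of how that might look applied to the loop from the question (the rest of the .org/non-.org handling omitted):
ForEach ($domain in $domains)
{
    $domainname = $domain.Name
    Write-Host "Processing $domainname..."
    if ($domainname -like "*.org")
    {
        $date = .\WhoIs.exe -v "$domainname" | Select-String -Pattern "Registry Expiry Date: " -AllMatches
    }
    # ... handling for other TLDs ...
    Start-Sleep -Seconds 1   # stay under the registrar's per-second query limit
}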

Check if data coming continuously

Every hour, data comes into every folder of my dir tree. I need to check that it does come in every hour, or whether there was any interruption (for example, no data coming in for 2–3 hours).
I am trying to write a PowerShell script that will check LastWriteTime for every folder, but that would not solve the problem of gaps. If I checked the logs in the morning, I would see that all is OK as long as some data came into the folder an hour ago, even if nothing had arrived for a few hours before that.
So IMHO LastWriteTime is not suitable for it.
Also, there is a problem with subfolders. I need to check only the last (deepest) folder in every directory tree, and I do not know how to skip the parent folders, like:
Z:\temp #need to drop
Z:\temp\foo #need to drop
Z:\temp\foo\1 #need to check
I had tried to write a script that checks the LastAccessTime, but it throws an error:
Expressions are only allowed as the first element of a pipeline.
The script is as follows:
$rootdir = "Z:\SatData\SatImages\nonprojected\"
$timedec1 = (Get-date).AddHours(-1)
$timedec2 = (Get-date).AddHours(-2)
$timedec3 = (Get-date).AddHours(-3)
$timedec4 = (Get-date).AddHours(-4)
$dir1 = get-childitem $rootdir –recurse | ?{ $_.PSIsContainer } | Select-Object FullName | where-object {$_.lastwritetime -lt $timedec1} | $_.LastWriteTime -gt $timedec4 -and $_.LastWriteTime -lt $timedec3
$dir1
But as I said before it does not solve the problem.
--
The main question is exactly about checking the continuity of the data collection. I could build the directory tree by hand, but I need a way to check whether data came into each folder every hour or whether there were any hours without data...
You can try to set up the PowerShell script to run from Windows Task Scheduler every hour. This way, the script will only have to check whether any data arrived within the past hour.
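A rough sketch of what that hourly check could look like, assuming PowerShell 3.0+ for the -Directory/-File switches and treating "deepest" folders as those with no child folders (the log path is made up for illustration):
# Sketch: run hourly from Task Scheduler; flag any leaf folder with no file written in the past hour.
$rootdir = "Z:\SatData\SatImages\nonprojected\"
$cutoff  = (Get-Date).AddHours(-1)
$leafFolders = Get-ChildItem $rootdir -Recurse -Directory |
    Where-Object { -not (Get-ChildItem $_.FullName -Directory) }
foreach ($folder in $leafFolders) {
    $recent = Get-ChildItem $folder.FullName -File |
        Where-Object { $_.LastWriteTime -gt $cutoff }
    if (-not $recent) {
        # No new data in the past hour: record it (or send an alert) here.
        "$(Get-Date -Format s)  no new data in $($folder.FullName)" |
            Out-File "Z:\datacheck.log" -Append
    }
}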