Possible to slow down a PowerShell Recursive Search? - powershell

This is going to be a weird one for ya.
I need to rate-limit or slow down a PowerShell one-liner that I am using to search a local file system, as well as network shares in an enterprise environment. I would like to slow down the search to minimize the possibility of network impact.
The script searches files (mostly document or text files) for certain keywords and phrases. Getting this done quickly is not an issue, as I am not on a time crunch; safety is key.
Here is the one liner:
Get-ChildItem -path C:\ -recurse -Filter *.txt -ErrorAction Continue | Select-String -Pattern "xxxx" | select filename, Linenumber, Line, Path | Format-Table

Now that you've added your code: I think I was on the right track. Here is your original one-liner with the delay added (formatted a bit):
Get-ChildItem -path C:\ -recurse -Filter *.txt -ErrorAction Continue |
ForEach-Object -Process {
    Start-Sleep -Seconds 1
    $_
} |
Select-String -Pattern "xxxx" |
select filename, Linenumber, Line, Path |
Format-Table
Want to make it look more idiomatic? Write your own function that accepts pipeline input and delays execution. I might even use a filter which is a shorthand way of writing a pipeline-aware function:
filter Delay-Object ([int]$Milliseconds) {
    Start-Sleep -Milliseconds $Milliseconds
    $_
}
Get-ChildItem -path C:\ -recurse -Filter *.txt -ErrorAction Continue |
Delay-Object -Milliseconds 1000 |
Select-String -Pattern "xxxx" |
select filename, Linenumber, Line, Path |
Format-Table
Without your code this is pure speculation, but let's say you're doing something like this:
Get-ChildItem \\my\share\*.* | ForEach-Object {
    # do your search here
}
You can just introduce a delay right into your iteration:
Get-ChildItem \\my\share\*.* | ForEach-Object {
    Start-Sleep -Seconds 1
    # do your search here
}
If you're not using your own script block, let's say you're using Select-String:
Get-ChildItem \\my\share\*.* | Select-String findme
Then the solution is the same: insert a ForEach-Object!
Get-ChildItem \\my\share\*.* |
ForEach-Object {
    Start-Sleep -Seconds 1
    $_ # have to return the original object back to the pipeline
} |
Select-String findme
You might be thinking that Get-ChildItem is going to find all the files first and pass them all along at once, and that even that would be too much stress on the network drives, but the pipeline doesn't work that way. *
Each item found is going to be passed to the next command in the pipeline one by one, so your delays will happen between each item. Therefore, you can basically insert a delay between any pipeline commands.
* some pipeline commands like Sort-Object need to collect all of the items and then pass them all out at once; from the POV of the next commands it still looks the same, but it will change how/where you need to put delays.
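A toy example (not from the question) makes the streaming behavior easy to see; both lines below are safe to paste into a console:

```powershell
# Streams: each number is delayed, then printed, one at a time (~1 s apart).
1..3 | ForEach-Object { Start-Sleep -Seconds 1; $_ } | ForEach-Object { "got $_" }

# Sort-Object collects all of its input first, so all three "got" lines
# appear together only after the full ~3 s delay.
1..3 | ForEach-Object { Start-Sleep -Seconds 1; $_ } | Sort-Object | ForEach-Object { "got $_" }
```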

Invoke-Command has a -ThrottleLimit parameter which might help. It limits the number of concurrent operations, so it might help limit the throughput.
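A hedged sketch of how that might look (the computer names below are hypothetical, and note this fans the search out across machines rather than slowing it down, so it only applies if you are querying several hosts):

```powershell
# Hypothetical sketch: run the search on three machines, at most two at a time.
Invoke-Command -ComputerName server1, server2, server3 -ThrottleLimit 2 -ScriptBlock {
    Get-ChildItem -Path C:\ -Recurse -Filter *.txt -ErrorAction SilentlyContinue |
        Select-String -Pattern "xxxx" |
        Select-Object Filename, LineNumber, Line, Path
}
```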

Related

Powershell navigate to specific directory

Total powershell noob and trying to go into a directory. I know it is the last directory (alphabetically) and I can get that with
Get-ChildItem -Directory | Select-Object -Last 1
I then want to Set-Location to that but I can't figure out how to call it with the output of the above.
I was trying something like:
Get-ChildItem -Directory | Select-Object -Last 1
Set-Variable -Name "Dir" -Value { Get-ChildItem -Directory | Select-Object -Last 1 }
Set-Location -Path $Dir
but that doesn't work at all. Any pointers?
There's good information in the comments, let me try to summarize and complement them:
{ ... } is a script block literal in PowerShell: a reusable piece of code to be invoked later with &, the call operator; that is, what's inside the curly braces is not executed right away.
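A minimal illustration of that difference:

```powershell
$sb = { Get-Date }   # nothing runs yet; $sb just holds the code
& $sb                # the call operator executes it now and emits the result
```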
Given that you want to execute a command and use its output as the argument to another command, use (), the grouping operator[1].
Set-Variable -Name Dir -Value (Get-ChildItem -Directory | Select-Object -Last 1)
However, there is rarely a need to use the Set-Variable cmdlet, given that simple assignments - $Dir = ... work too, in which case you don't even need the parentheses:
$Dir = Get-ChildItem -Directory | Select-Object -Last 1
Of course, just like you can pass an (...) expression to Set-Variable, you can pass it to Set-Location directly, in which case you don't need an intermediate variable at all (parameter -Path is positionally implied):
Set-Location (Get-ChildItem -Directory | Select-Object -Last 1)
You can make this more concise by directly indexing the Get-ChildItem call's output[2]; [-1] refers to the last array element (output object):
Set-Location (Get-ChildItem -Directory)[-1]
The alternative is to use the pipeline (see about_Pipelines):
Get-ChildItem -Directory | Select-Object -Last 1 | Set-Location
Note that there are subtle differences between the last 3 commands in the event that no subdirectories exist in the current directory (meaning that Get-ChildItem -Directory produces no output):
Set-Location (Get-ChildItem -Directory | Select-Object -Last 1) will report an error, because you're then effectively passing $null as the -Path argument.
Set-Location (Get-ChildItem -Directory)[-1] will also report an error, because you cannot apply an index to a $null value.
Get-ChildItem -Directory | Select-Object -Last 1 | Set-Location, by contrast, will be a no-op, because Set-Location effectively won't be invoked at all, due to not receiving input via the pipeline.
If not having subdirectories is an unexpected condition, you can force the script to abort by adding -ErrorAction Stop to one of the first two commands - see about_CommonParameters.
[1] While $(...), the subexpression operator, works too, it's usually not necessary - see this answer.
[2] Note that doing so means that all output from the Get-ChildItem call is then collected in memory first, but that is unlikely to be a problem in this case. See this answer for more information.

Powershell Preferred method to remove the file older than x days

I see the two methods below doing the same operation in PowerShell; which is the preferred method to remove files older than 1 day?
Option 1:
Get-ChildItem -Path c:\temp -File | ?{($_.LastWriteTime -lt (Get-Date).AddDays(-1)) -and ($_.Name -like "a*") -and ($_.Extension -eq ".csv")} | Select-Object -ExpandProperty FullName | %{Remove-Item $_ -Force -WhatIf}
Option 2
Get-ChildItem -Path c:\temp -Filter "a*.csv" -File | Where LastWriteTime -lt (Get-Date).AddDays(-1) | Remove-Item -Force -WhatIf
Thanks
SR
Option 2.
? and where are aliases for the Where-Object cmdlet. Filtering aside, both options check whether LastWriteTime is older than 24 hours in the same way.
As stated by HAL9256, if you need to filter, you should do that first.
If you need fancier filtering (e.g. regex), filtering with Where-Object is the next best thing.
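For example (a hypothetical pattern, adapting the question's filter), -like cannot express "an 'a' followed only by digits", but -match can:

```powershell
# Hypothetical example: keep only files named like a123.csv that are older than a day.
Get-ChildItem -Path C:\temp -File |
    Where-Object { $_.Name -match '^a\d+\.csv$' -and $_.LastWriteTime -lt (Get-Date).AddDays(-1) }
```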
Go with Option 2, because you always want to filter as early in the pipeline as possible. Passing less data down the pipeline is always better.
Option 1 sends every file object (e.g. hundreds of files) down the pipeline before it can begin applying the Where clause.
Option 2 filters to "a*.csv" files first, which should always yield fewer files (e.g. tens), before continuing down the pipeline to apply the Where clause.
I would do it this way.
Get-ChildItem | Where-Object { (Get-Date) - $_.LastWriteTime -gt (New-TimeSpan -Days 1) } | Remove-Item -WhatIf

Powershell Find and Replace Loop, OutOfMemoryException

I have a working PowerShell script to find and replace a few different strings with a new string in thousands of files, without changing the modified date on the files. In any given file there could be hundreds of instances of said strings to replace. The files themselves aren't very large and probably range from 1-50MB (a quick glance at the directory I am testing with shows the largest as ~33MB).
I'm running the script inside a Server 2012 R2 VM with 4 vCPUs and 4GB of RAM. I have set the MaxMemoryPerShellMB value for PowerShell to 3GB. As mentioned previously, the script works, but after 2-4 hours PowerShell will start throwing OutOfMemoryExceptions and crash. The script is 'V2 friendly' and I haven't adapted it to V3+, but I doubt that matters too much.
My question is whether or not the script can be improved to prevent/eliminate the memory exceptions I am running into at the moment. I don't mind if it runs slower, as long as it can get the job done without having to check back every couple of hours and restart it.
$i=0
$all = Get-ChildItem -Recurse -Include *.txt
$scriptfiles = Select-String -Pattern string1,string2,string3 $all
$output = "C:\Temp\scriptoutput.txt"
foreach ($file in $scriptFiles)
{
$filecreate=(Get-ChildItem $file.Path).creationtime
$fileaccess=(Get-ChildItem $file.Path).lastaccesstime
$filewrite=(Get-ChildItem $file.Path).lastwritetime
"$($file.Path),Created: $filecreate,Accessed: $fileaccess,Modified: $filewrite" | Out-File -FilePath $output -Append
(Get-Content $file.Path) | ForEach-Object {$_ -replace "string1", "newstring" `
-replace "string2", "newstring" `
-replace "string3", "newstring"
} | Set-Content $file.Path
(Get-ChildItem $file.Path).creationtime=$filecreate
(Get-ChildItem $file.Path).lastaccesstime=$fileaccess
(Get-ChildItem $file.Path).lastwritetime=$filewrite
$filecreate=(Get-ChildItem $file.Path).creationtime
$fileaccess=(Get-ChildItem $file.Path).lastaccesstime
$filewrite=(Get-ChildItem $file.Path).lastwritetime
"$($file.Path),UPDATED Created: $filecreate,UPDATED Accessed: $fileaccess,UPDATED Modified: $filewrite" | Out-File -FilePath $output -Append
$i++}
Any comments, criticisms, and suggestions welcomed.
Thanks
The biggest issue I can see is that you are repeatedly fetching the file object for every property you query. Replace that with one call per loop pass and save the result to be reused during the pass. Also, Out-File is one of the slower methods of writing data to a file.
$output = "C:\Temp\scriptoutput.txt"
$scriptfiles = Get-ChildItem -Recurse -Include *.txt |
Select-String -Pattern string1,string2,string3 |
Select-Object -ExpandProperty Path
$scriptfiles | ForEach-Object{
$file = Get-Item $_
# Save current file times
$filecreate=$file.creationtime
$fileaccess=$file.lastaccesstime
$filewrite=$file.lastwritetime
"$file,Created: $filecreate,Accessed: $fileaccess,Modified: $filewrite"
# Update content.
(Get-Content $file) -replace "string1", "newstring" `
-replace "string2", "newstring" `
-replace "string3", "newstring" | Set-Content $file
# Write all the original times back.
$file.creationtime=$filecreate
$file.lastaccesstime=$fileaccess
$file.lastwritetime=$filewrite
# Verify the changes... Should not be required but it is what you were doing.
$filecreate=$file.creationtime
$fileaccess=$file.lastaccesstime
$filewrite=$file.lastwritetime
"$file,UPDATED Created: $filecreate,UPDATED Accessed: $fileaccess,UPDATED Modified: $filewrite"
} | Set-Content $output
Not tested but should be fine.
Depending on what your replacements actually look like, you could probably save some time there as well. Obviously, test first before running in production.
I removed the counter you had, since it appeared nowhere else in the code.
Your logging could easily be CSV-based, since you have all the objects ready to go, but I just want to be sure we are on the right track before we go too far.

Printing recursive file and folder count in powershell?

I am trying to compare two sets of folders to determine discrepancies in file and folder counts. I have found a command that will output the data I am looking for, but cannot find a way to print it to a file. Here is the command I am using currently:
dir -recurse | ?{ $_.PSIsContainer } | %{ Write-Host $_.FullName (dir $_.FullName | Measure-Object).Count }
This is getting me the desired data but I need to find a way to print this to a text file. Any help would be greatly appreciated.
The problem is the use of the Write-Host cmdlet, which bypasses almost all pipeline handling. In this case, it is also unnecessary, as any output that isn't used by a cmdlet is automatically passed into the pipeline (or to the console if there's nothing further).
Here is your code rewritten to output a string to the pipeline instead of using Write-Host. It uses PowerShell's subexpression operator $() inside an expandable string. At the console it will look the same, but it can be piped to a file or another cmdlet.
gci -Recurse -Directory | %{ "$($_.FullName) $((gci $_.FullName).Count)" }
You may also find it useful to put the data into a PSCustomObject. Once you have the object, you can do further processing such as sorting or filtering based on the count.
$folders = gci -Recurse -Directory | %{ [PSCustomObject]@{ Name = $_.FullName; Count = (dir $_.FullName).Count } }
$folders | sort Count
$folders | where Count -ne 0
Some notes on idioms: dir is an alias for Get-ChildItem, as is gci. Using gci's -Directory parameter is the best way to list only directories, rather than the PSIsContainer check. Finally, Measure-Object is unnecessary; you can take the Count of the file listing directly.
See also Write-Host Considered Harmful from the inventor of PowerShell

Delete files containing string

How can I delete all files in a directory that contain a string using powershell?
I've tried something like
$list = get-childitem *.milk | select-string -pattern "fRating=2" | Format-Table Path
$list | foreach { rm $_.Path }
And that worked for some files but did not remove everything. I've tried other various things but nothing is working.
I can easily get the list of file names and can create an array with the path's only using
$lista = @(); foreach ($f in $list) { $lista += $f.Path }
but can't seem to get any command (del, rm, or Remove-Item) to do anything. Just returns immediately without deleting the files or giving errors.
Thanks
First we can simplify your code as:
Get-ChildItem "*.milk" | Select-String -Pattern "fRating=2" | Select-Object -ExpandProperty Path | Remove-Item -Force -Confirm
The lack of action and errors might be addressable by one of two things. The first is the -Force parameter, which:
Allows the cmdlet to remove items that cannot otherwise be changed,
such as hidden or read-only files or read-only aliases or variables.
I would also suggest that you run this script as administrator. Depending on where these files are located, you might not have permissions. If this is not the case or it does not work, please include the error you are getting.
I'm going to guess the error is:
remove-item : Cannot remove item C:\temp\somefile.txt: The process cannot access the file 'C:\temp\somefile.txt'
because it is being used by another process.
Update
In testing, I was also getting a similar error. Upon research, it looks like the Select-String cmdlet was holding onto the file, preventing its deletion (an assumption, based on the fact that I have never seen Get-ChildItem do this before). The solution in that case would be to enclose the first part in parentheses as a subexpression, so it processes all the files before going through the pipe.
(Get-ChildItem | Select-String -Pattern "tes" | Select-Object -ExpandProperty path) | Remove-Item -Force -Confirm
Remove -Confirm if desired; it exists as a precaution so that you don't open a new PowerShell in C:\Windows\System32 and paste a Remove-Item cmdlet in there.
Another Update
[ and ] are wildcard characters in PowerShell; to escape them in some cmdlets, you use -LiteralPath. Also, Select-String can return multiple hits per file, so we should use -Unique:
(Get-ChildItem *.milk | Select-String -Pattern "fRating=2" | Select-Object -ExpandProperty path -Unique) | ForEach-Object{Remove-Item -Force -LiteralPath $_}
Why do you use select-string -pattern "fRating=2"? Do you want to select all files with this string in their name?
I don't think Format-Table Path will work: the objects Get-ChildItem emits don't have a property called "Path".
Does this snippet work for you?
$list = get-childitem *.milk | Where-Object -FilterScript {$_.Name -match "fRating=2"}
$list | foreach { rm $_.FullName }
The following code gets all files of type *.milk and puts them in $listA, then uses that list to get all the files that contain the string fRating=[01] and stores them in $listB. The files in $listB are deleted, and then the number of files deleted versus the number of files that contained the match is displayed (they should be equal).
sv -Name listA -Value (Get-ChildItem *.milk)
sv -Name listB -Value ($listA | Select-String -Pattern "fRating=[01]")
$listB | Select-Object -ExpandProperty Path | ForEach-Object { Remove-Item -Force -LiteralPath $_ }
sv -Name FCount -Value ((Get-ChildItem *.milk).Count)
Write-Host "Files Deleted $($listA.Count - $FCount)/$($listB.Count)"
No need to complicate things:
$sourcePath = "\\path\to\the\file\"
Remove-Item "$sourcePath*whatever*"
I tried the answer above; unfortunately, errors always seemed to come up. However, I managed to find a way to get this done:
Without using Get-ChildItem, you can use Select-String directly to search for files matching a certain string. Yes, this returns the filename, count, content, and so on, but internally these are named properties you can choose or omit; the one you need is FileName, so pipe into Select-Object and choose FileName from the output.
So, to select all *.MSG files that have the pattern "Subject: WebServices Restarted", you can do the following:
Select-String -Path .\*.MSG -Pattern 'Subject: WebServices Restarted' -List | Select-Object FileName
Also, to remove these files on the fly, you can pipe into a ForEach-Object with the rm command as follows:
Select-String -Path .\*.MSG -Pattern 'Subject: WebServices Restarted' -List | Select-Object FileName | ForEach-Object { rm $_.FileName }
I tried this myself; it works 100%.
I hope this helps