PowerShell memory exhaustion using NTFSSecurity module on a deep folder traverse

I have been tasked with reporting all of the ACLs on each folder in our shared drive structure. Added to that, I need to look up the membership of each unique group that gets returned.
I'm using the NTFSSecurity module in conjunction with the Get-ChildItem2 cmdlet to get past the 260-character path length limit. The path(s) I am traversing are many hundreds of folders deep and long since passed the 260-character limit.
I have been banging on this for a couple of weeks. My first challenge was crafting my script to do the whole task at once, but now I'm thinking that's my problem... The issue at hand is resources, specifically memory exhaustion. Once the script gets into one of the deep folders, it consumes all RAM and starts swapping to disk, and I eventually run out of disk space.
Here is the script:
$csvfile = 'C:\users\user1\Documents\acl cleanup\dept2_Dir_List.csv'
foreach ($record in Import-Csv $csvfile)
{
    $Groups = Get-ChildItem2 -Directory -Path $record.FullName -Recurse | Get-NTFSAccess | where -Property AccountType -eq -Value Group
    $Groups2 = $Groups | where -Property Account -notmatch -Value '^builtin|^NT AUTHORITY\\|^Creator|^AD\\Domain'
    $Groups3 = $Groups2 | select Account -Unique
    $GroupMembers = ForEach ($Group in $Groups3) {
        Get-ADGroup $Group.Account.Sid | Get-ADGroupMember | select Name, @{N="GroupName";e={$Group.Account}}
    }
    $Groups2 | select FullName,Account,AccessControlType,AccessRights,IsInherited | Export-Csv "C:\Users\user1\Documents\acl cleanup\Dept2\$($record.name).csv"
    $GroupMembers | Export-Csv "C:\Users\user1\Documents\acl cleanup\Dept2\$($record.name)_GroupMembers.csv"
}
NOTE: The directory list it reads in contains the top-level folders, created with get-childitem2 -directory | export-csv filename.csv
During the run, it appears to not be flushing memory properly. This is just a guess from observation. At the end of each pass through the loop, I thought the variables should be getting overwritten, but memory usage never goes back down, which makes it look like the memory isn't being released. Like I said, a guess... I have been reading about runspaces but I am confused about how to implement that with this script. Is that the right direction for this?
Thanks in advance for any assistance...!

Funny you should post about this, as I just finished a modified version of the script that I think works much better. A friend turned me on to 'filter functions', which seem to work well here. I'll test it on the big directories tomorrow to see how much better the memory management is, but so far it looks great.
#Define the function 'filter' here and call it 'GetAcl'. PROCESS is the keyword that tells the function to deal with each item in the pipeline one at a time
Function GetAcl {
    PROCESS {
        Get-NTFSAccess $_ | where -Property AccountType -eq -Value Group | where -Property Account -notmatch -Value '^builtin|^NT AUTHORITY\\|^Creator|^AD\\Domain'
    }
}
#Import the directory top level paths
$Paths = import-csv 'C:\users\rknapp2\Documents\acl cleanup\dept2_Dir_List.csv'
#Process each line from the imported CSV one at a time and run Get-ChildItem2 against it.
#Notice the second part: the results of Get-ChildItem2 are piped to the function, which, because of its PROCESS block, handles each item one at a time
#When done, pass the results to Export-Csv and send them to a file named after the path. This puts each dir into its own file.
ForEach ($Path in $paths) {
    (Get-ChildItem2 -Path $Path.FullName -Recurse -Directory) | GetAcl | Export-Csv "C:\Users\rknapp2\Documents\acl cleanup\TestFilter\$($Path.name).csv"
}

Related

Script triggers virus scanner. How can we slow it down?

This script that identifies duplicate files triggers a virus scanner. How can we slow it down?
Get-ChildItem -Recurse -File `
| Group-Object -Property Length `
| ?{ $_.Count -gt 1 } `
| %{ $_.Group } `
| Get-FileHash `
| Group-Object -Property Hash `
| ?{ $_.Count -gt 1 } `
| %{ $_.Group }
| %{ $_.path -replace "$([regex]::escape($(pwd)))",'' }
Is there a way to put like a 2 second pause in between files so it takes a long time to complete?
TIA
Edits for comments:
Don't want to say the antivirus software but it's very advanced.
I got the backticks from stack overflow, so garbage in garbage out :) [seriously thanks for the tip]
It works flawlessly on network shares with about 100 files on it.
The speed of your script isn't what matters to an A/V scanner. My guess is that the use of [regex]::replace(pattern, text) and Get-FileHash could be something your A/V flags during heuristic analysis. Without knowing the A/V software, it's impossible to know whether others have experienced and resolved the same problem you are having.
The real correct answer is to open a ticket with your A/V vendor about it flagging false positives. Signing your scripts is also known to help with some A/V products. Some A/Vs allow whitelisting by checksum, which you could use to approve your scripts if your vendor doesn't have any alternatives. Using the checksum of a signed script is even safer, as you can guarantee the code came from your organization at the time the checksum was calculated.
You can also configure A/V software to whitelist a directory, and you can effectively work around the issue by running scripts out of that directory while you sort the issue with your vendor. However, whitelisting by path should not be your permanent solution. Figure out why your scripts are getting flagged with the vendor, then follow their recommendations.
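For illustration, computing a checksum to submit for whitelisting and signing a script could look something like this (the script name is just a placeholder, and it assumes a code-signing certificate is already installed in your user certificate store):
# Checksum you could submit to the A/V vendor for whitelisting (hypothetical script name)
Get-FileHash -Path .\Find-Duplicates.ps1 -Algorithm SHA256
# Sign the script with an existing code-signing certificate from the user store
$cert = Get-ChildItem Cert:\CurrentUser\My -CodeSigningCert | Select-Object -First 1
Set-AuthenticodeSignature -FilePath .\Find-Duplicates.ps1 -Certificate $cert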
That said, to answer your original question "Is there a way to put like a 2 second pause in between files....?", yes. Start-Sleep will achieve what you want (but I have serious doubts it would affect your A/V results). The last block can be one line but is made multiline for readability (the semicolon ; is required if on one line):
Note: I've also replaced the backticks with better multi-line formatting. You can end a line with the | operator and continue the expression on the next line; this works with some other operators as well.
This change also fixes an issue in your original sample where you forgot the penultimate backtick. Backticks are easy to forget and hard to spot, which is one reason their use is not recommended for multi-line expressions.
Get-ChildItem -Recurse -File |
    Group-Object -Property Length |
    ?{ $_.Count -gt 1 } |
    %{ $_.Group } |
    Get-FileHash |
    Group-Object -Property Hash |
    ?{ $_.Count -gt 1 } |
    %{ $_.Group } |
    %{
        $_.path -replace "$([regex]::escape($(pwd)))",'';
        Start-Sleep 2
    }

Get-Childitem - improve memory usage and performance

I would like to be able to also retrieve the file owner, LastAccessTime, LastWriteTime, and CreationTime. Get-ChildItem has known performance issues when scaled to large directory structures.
We had some performance issues while looking for files in a folder that has more than 100,000 subfolders.
Here is my script:
$Dir = Get-ChildItem "W:\DATA" -Recurse -Force
$Dir | Select-Object Name, FullName, LastAccessTime, LastWriteTime, CreationTime, @{N='Owner';E={$_.GetAccessControl().Owner}} | Export-Csv -Path C:\Scripts\xlsx.csv -NoTypeInformation
thanks in advance,
Memory
PowerShell objects (PSCustomObject) are optimized for streaming (one-at-a-time processing) and are therefore quite heavy.
Using parentheses ((...)) or assigning your stream to a variable (like $Dir =) will choke the pipeline and pile up all the objects in memory.
To reduce memory usage, immediately pass your objects through the pipeline by chaining the concerned cmdlets with a pipe character:
Get-childitem "W:\DATA" -recurse -force |
Select-Object astAccessTime, LastWriteTime, CreationTime |
Export-Csv -path C:\Scripts\xlsx.csv -NoTypeInformation
Performance
Starting with a quote from PowerShell scripting performance considerations:
PowerShell scripts that leverage .NET directly and avoid the pipeline tend to be faster than idiomatic PowerShell. Idiomatic PowerShell typically uses cmdlets and PowerShell functions heavily, often leveraging the pipeline, and dropping down into .NET only when necessary.
In your case, the performance bottleneck is likely not in PowerShell but in the server and the network, meaning that leveraging .NET directly would probably not have much effect on performance.
In fact, using the PowerShell pipeline might even be faster in this case, because you do not have to wait until the last file-info item is loaded into memory: the native PowerShell pipeline starts processing the first item immediately, while the next items are (slowly) provided by the server.
If you change the last cmdlet (Export-Csv) to ConvertTo-Csv you will probably see the difference: a (correctly set up) pipeline starts producing output almost on the fly, while other solutions take a while before writing any data to the console.
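For example, to watch that streaming behavior in the console, you could run the same pipeline with only the last cmdlet swapped (just an illustration; the path and property list mirror the snippet above):
Get-ChildItem "W:\DATA" -Recurse -Force |
    Select-Object LastAccessTime, LastWriteTime, CreationTime |
    ConvertTo-Csv -NoTypeInformation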
The numbers tell the tale
(In Dutch: "meten is weten", which literally means: measuring is knowing)
If you aren't sure which technique would give you the best performance, I recommend simply testing it (on a subset), like:
Measure-Command {
    Get-ChildItem "W:\DATA" -Recurse -Force |
        Select-Object LastAccessTime, LastWriteTime, CreationTime |
        Export-Csv -Path C:\Scripts\xlsx.csv -NoTypeInformation
} | Select-Object TotalMilliseconds
and compare the results.
Give this a try; it should be faster than Get-ChildItem. You could also use [SearchOption]::AllDirectories and skip the Collections.Queue, but I'm not certain whether that would consume less memory (see the sketch after the code below).
using namespace System.Collections
using namespace System.IO

class InfoProps {
    [string]   $Name
    [string]   $FullName
    [datetime] $LastAccessTime
    [datetime] $LastWriteTime
    [datetime] $CreationTime
    [string]   $Owner

    InfoProps([object]$FileInfo)
    {
        $this.Name           = $FileInfo.Name
        $this.FullName       = $FileInfo.FullName
        $this.LastAccessTime = $FileInfo.LastAccessTime
        $this.LastWriteTime  = $FileInfo.LastWriteTime
        $this.CreationTime   = $FileInfo.CreationTime
        $this.Owner          = $FileInfo.GetAccessControl().Owner
    }
}

$initialDirectory = $pwd.Path
$queue = [Queue]::new()
$queue.Enqueue($initialDirectory)

& {
    while ($queue.Count)
    {
        $target = $queue.Dequeue()
        foreach ($child in [Directory]::EnumerateDirectories($target)) {
            $queue.Enqueue($child)
        }
        [InfoProps] [DirectoryInfo] $target # => Remove this line if you want only files!
        [InfoProps[]] [FileInfo[]] [Directory]::GetFiles($target)
    }
} | Export-Csv test.csv -NoTypeInformation
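For comparison, the [SearchOption]::AllDirectories variant I mentioned would look roughly like this (an untested sketch; it reuses the InfoProps class above, the output file name is a placeholder, and note that EnumerateFiles will stop with an error at the first folder you don't have access to):
& {
    # Let .NET walk the whole tree in one call instead of managing a queue ourselves
    foreach ($file in [Directory]::EnumerateFiles($pwd.Path, '*', [SearchOption]::AllDirectories)) {
        [InfoProps] [FileInfo] $file
    }
} | Export-Csv test2.csv -NoTypeInformation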

Get ACL Folder & Subfolder + Users Using Powershell

Is it possible to get the permissions of a folder and its sub-folders, then display the path, group, and users associated with that group? It would look something like this. Or will it have to be done one folder at a time?
-Folder1
-Line separator
-Group
-Line separator
-List of users
-Folder2
-Line separator
-Group
-Line separator
-List of users
Here is the script I've come up with so far; be warned, I have very little experience with PowerShell. (Don't worry, my boss knows.)
Param([string]$filePath)
$Version=$PSVersionTable.PSVersion
if ($Version.Major -lt 3) {Throw "Powershell version out of date. Please update powershell." }
Get-ChildItem $filePath -Recurse | Get-Acl | where { $_.Access | where { $_.IsInherited -eq $false } } | select -exp Access | select IdentityReference -Unique | Out-File .\Groups.txt
$Rspaces=(Get-Content .\Groups.txt) -replace 'JAC.*?\\|','' |
Where-Object {$_ -notmatch 'BUILTIN|NT AUTHORITY|CREATOR|-----|Identity'} | ForEach-Object {$_.TrimEnd()}
$Rspaces | Out-File .\Groups.txt
$ErrorActionPreference= 'SilentlyContinue'
$Groups=Get-Content .\Groups.txt
ForEach ($Group in $Groups)
{
    Write-Host ""
    $Group
    Write-Host --------------
    Get-ADGroupMember -Identity $Group -Recursive | Get-ADUser -Property DisplayName | Select Name
}
This only shows the groups and users, but not the path.
Ok, let's take it from the top! Excellent, you actually declare a parameter. What you might want to consider is setting a default value for the parameter. What I would do is use the current directory, which conveniently has an automatic variable $PWD (I believe that's short for PowerShell Working Directory).
Param([string]$filePath = $PWD)
Now if a path is provided it will use that, but if no path is provided it just uses the current folder as a default value.
The version check is fine. I'm pretty sure there are more elegant ways to do it, but I honestly have never done any version checking.
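(As an aside, I believe a #Requires statement at the top of the script would do the same check declaratively, though I haven't used it myself:)
#Requires -Version 3.0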
Now you are querying AD for each group and user that is found (after some filtering, granted). I would propose that we keep track of groups and members so that we only have to query AD once for each one. It may not save a lot of time, but it'll save some if any group is used more than once. So for that purpose we're going to make an empty hashtable to track groups and their members.
$ADGroups = @{}
Now starts a bad trend... writing to files and then reading those files back in. Outputting to a file is fine for final results, saved configurations, or anything you'll need again outside of the current PowerShell session, but writing to a file just to read it back into the current session is a waste. Instead you should either save the results to a variable, or work with them directly. So, rather than getting the folder listing, piping it directly into Get-Acl, and losing the paths, we're going to do a ForEach loop on the folders. Mind you, I added the -Directory switch so it will only look at folders and ignore files. This happens at the provider level, so you will get much faster results from Get-ChildItem this way.
ForEach($Folder in (Get-ChildItem $filePath -Recurse -Directory)){
Now, you wanted to output the path of the folder, and a line. That's easy enough now that we aren't ditching the folder object:
$Folder.FullName
'-'*$Folder.FullName.Length
Next we get the ACLs for the current folder:
$ACLs = Get-Acl -Path $Folder.FullName
And here's where things get complicated. I'm getting the group names from the ACLs, but I've combined a couple of your Where statements, and also added a check that it is an Allow rule (because including Deny rules in this would just be confusing). I've used ?, which is an alias for Where, as well as %, which is an alias for ForEach-Object. You can have a natural line break after a pipe, so I've done that for ease of reading. I included comments on each line for what I'm doing, but if any of it is confusing just let me know what you need clarification on.
$Groups = $ACLs.Access | #Expand the Access property
?{ $_.IsInherited -eq $false -and $_.AccessControlType -eq 'Allow' -and $_.IdentityReference -notmatch 'BUILTIN|NT AUTHORITY|CREATOR|-----|Identity'} | #Only instances that allow access, are not inherited, and aren't a local group or special case
%{$_.IdentityReference -replace 'JAC.*?\\'} | #Expand the IdentityReference property, and replace anything that starts with JAC all the way to the first backslash (likely domain name trimming)
Select -Unique #Select only unique values
Now we'll loop through the groups, starting off by outputting the group name and a line.
ForEach ($Group in $Groups){
$Group
'-'*$Group.Length
For each group I'll see if we already know who's in it by checking the list of keys on the hashtable. If we don't find the group there we'll query AD and add the group as a key, and the members as the associated value.
If($ADGroups.Keys -notcontains $Group){
$Members = Get-ADGroupMember $Group -Recursive -ErrorAction Ignore | % Name
$ADGroups.Add($Group,$Members)
}
Now that we're sure that we have the group members we will display them.
$ADGroups[$Group]
We can close the ForEach loop pertaining to groups, and since this is the end of the loop for the current folder we'll add a blank line to the output, and close that loop as well
}
"`n"
}
So I wrote this up and then ran it against my C:\temp folder. It did tell me that I need to clean up that folder, but more importantly it showed me that most of the folders don't have any non-inherited permissions, so it would just give me the path with an underline and a blank line, then move to the next folder. I ended up with a ton of output like:
C:\Temp\FolderA
---------------
C:\Temp\FolderB
---------------
C:\Temp\FolderC
---------------
That doesn't seem useful to me. If it is useful to you, then use the lines above as I have them. Personally, I chose to get the ACLs, check for groups, and if there are no groups move on to the next folder. The script below is the product of that.
Param([string]$filePath = $PWD)

$Version = $PSVersionTable.PSVersion
if ($Version.Major -lt 3) {Throw "Powershell version out of date. Please update powershell." }

#Create an empty hashtable to track groups
$ADGroups = @{}

#Get a recursive list of folders and loop through them
ForEach($Folder in (Get-ChildItem $filePath -Recurse -Directory)){
    #Get ACLs for the folder
    $ACLs = Get-Acl -Path $Folder.FullName
    #Do a bunch of filtering to just get AD groups
    $Groups = $ACLs |
        % Access | #Expand the Access property
        where { $_.IsInherited -eq $false -and $_.AccessControlType -eq 'Allow' -and $_.IdentityReference -notmatch 'BUILTIN|NT AUTHORITY|CREATOR|-----|Identity'} | #Only instances that allow access, are not inherited, and aren't a local group or special case
        %{$_.IdentityReference -replace 'JAC.*?\\'} | #Expand the IdentityReference property, and replace anything that starts with JAC all the way to the first backslash (likely domain name trimming)
        Select -Unique #Select only unique values
    #If there are no groups to display for this folder move to the next folder
    If($Groups.Count -eq 0){Continue}
    #Display folder path
    $Folder.FullName
    #Put a dashed line under the folder path (using the length of the folder path for the length of the line, just to look nice)
    '-'*$Folder.FullName.Length
    #Loop through each group and display its name and users
    ForEach ($Group in $Groups){
        #Display the group name
        $Group
        #Add a line under the group name
        '-'*$Group.Length
        #Check if we already have this group, and if not get the group from AD
        If($ADGroups.Keys -notcontains $Group){
            $Members = Get-ADGroupMember $Group -Recursive -ErrorAction Ignore | % Name
            $ADGroups.Add($Group,$Members)
        }
        #Display the group members
        $ADGroups[$Group]
    }
    #Output a blank line, for some separation between folders
    "`n"
}
I have managed to get this working for me.
I edited the section below to show the name and username of each user.
$Members = Get-ADGroupMember $Group -Recursive -ErrorAction Ignore | % Name | Get-ADUser -Property DisplayName | Select-Object DisplayName,Name | Sort-Object DisplayName
This works really well, but would there be a way to get it to stop listing the same group access if it's repeated down the folder structure?
For example, "\example1\example2" was assigned a group called "group1" and we had the following folder structure:
\\example1\example2\folder1
\\example1\example2\folder2
\\example1\example2\folder1\randomfolder
\\example1\example2\folder2\anotherrandomfolder
All the folders are assigned the group "group1", and the current code will list each directory's group and users, even though it's the same. Would it be possible to get it to only list the group and users once if it's repeated down the directory structure?
The -notcontains doesn't seem to work for me
If that makes sense?
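One rough way to do that (an untested sketch building on the script above; $ReportedGroups is a new variable I'm introducing, not part of the original) is to keep a second hashtable of groups that have already been printed and skip them on later folders:
#Initialize once, before the ForEach loop over the folders
$ReportedGroups = @{}
#Then, inside the folder loop, the group loop becomes something like:
ForEach ($Group in $Groups){
    #Skip groups that were already reported for a folder higher up the tree
    If($ReportedGroups.ContainsKey($Group)){Continue}
    $ReportedGroups[$Group] = $true
    $Group
    '-'*$Group.Length
    If($ADGroups.Keys -notcontains $Group){
        $Members = Get-ADGroupMember $Group -Recursive -ErrorAction Ignore | % Name
        $ADGroups.Add($Group,$Members)
    }
    $ADGroups[$Group]
}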

PowerShell script file modify time>10h and return a value if nothing is found

I am trying to compose a script/one-liner which will find files in a specific folder that were modified more than 10 hours ago, and if there are no such files I need it to print some value or string.
Get-ChildItem -Path C:\blaa\*.* | where {$_.Lastwritetime -lt (date).addhours(-10)}) | Format-table Name,LastWriteTime -HideTableHeaders"
With that one-liner I am getting the wanted result when there are files with a modify time more than 10 hours old, but I also need it to print a value/string when there are no results, so that I can monitor it properly.
The reason for this is to utilize the script/one liner for monitoring purposes.
The Get-ChildItem cmdlet and Where clause you have would return null if nothing was found, so you have to account for that separately. I would also caution against using Format-Table for output unless you are just reading it on screen. If you wanted a "one-liner" you could do this. All PowerShell code can be a one-liner if you want it to be.
$results = Get-ChildItem -Path C:\blaa\*.* | where {$_.Lastwritetime -lt (date).addhours(-10)} | Select Name,LastWriteTime; if($results){$results}else{"No files found matching criteria"}
You have an extra bracket in your code (it might be a copy artifact) that I had to remove. Coded properly, it would look like this:
$results = Get-ChildItem -Path "C:\blaa\*.*" |
Where-Object {$_.Lastwritetime -lt (date).addhours(-10)} |
Select Name,LastWriteTime
if($results){
    $results
} else {
    "No files found matching criteria"
}

powershell slow(?) - write names of subfolders to a text file

My PowerShell script seems slow; when I run the code below in the ISE, it keeps running and doesn't stop.
I am trying to write the list of subfolders in a folder (the folder path is in $scratchpath) to a text file. There are more than 30k subfolders.
$limit = (Get-Date).AddDays(-15)
$path = "E:\Data\PathToScratch.txt"
$scratchpath = Get-Content $path -TotalCount 1
Get-ChildItem -Path $scratchpath -Recurse -Force | Where-Object { $_.PSIsContainer -and $_.CreationTime -lt $limit } | Add-Content C:\Data\eProposal\POC\ScratchContents.txt
Let me know if my approach is not optimal. Ultimately, I will read the text file, zip the subfolders for archival and delete them.
Thanks for your help in advance. I am new to PS and have watched a few videos on MVA.
Add-Content, Set-Content, and even Out-File are notoriously slow in PowerShell. This is because each call opens the file, writes to it, and closes the handle. It never does anything more intelligently than that.
That doesn't sound bad until you consider how pipelines work with Get-ChildItem (and Where-Object and Select-Object). It doesn't wait until it has completed before it begins passing objects into the pipeline. It starts passing objects as soon as the provider returns them. For a large result set, this means that objects are still feeding into the pipeline long after several have finished processing. Generally speaking, this is great! It means the system functions more efficiently, and it's why stuff like this:
$x = Get-ChildItem;
$x | ForEach-Object { [...] };
Is significantly slower than stuff like this:
Get-ChildItem | ForEach-Object { [...] };
And it's why stuff like this appears to stall:
Get-ChildItem | Sort-Object Name | ForEach-Object { [...] };
The Sort-Object cmdlet needs to wait until it has received all pipeline objects before it sorts. It kind of has to, in order to be able to sort. The sort itself is nearly instantaneous; it's just the cmdlet waiting until it has the full results.
The issue with Add-Content is that, well, it experiences the pipeline not as, "Here's a giant string to write once," but instead as, "Here's a string to write. Here's a string to write. Here's a string to write. Here's a string to write." You'll be sending content to Add-Content here line by line. Each line will instantiate a new call to Add-Content, requiring the file to open, write, and close. You'll likely see better performance if you assign the result of Get-ChildItem [...] | Where-Object [...] to a variable, and then write the entire variable to the file at once:
$limit = (Get-Date).AddDays(-15);
$path = "E:\Data\PathToScratch.txt";
$scratchpath = Get-Content $path -TotalCount 1;
$Results = Get-ChildItem -Path $scratchpath -Recurse -Force -Directory |
    Where-Object{ $_.CreationTime -lt $limit } |
    Select-Object -ExpandProperty FullName;
Add-Content C:\Data\eProposal\POC\ScratchContents.txt -Value $Results;
However, you might be concerned about memory usage if your results are actually going to be extremely large. You can actually use System.IO.StreamWriter for this purpose, too. My process improved in speed by nearly two orders of magnitude (from 12 hours to 20 minutes) by switching to StreamWriter and also only calling StreamWriter when I had about 250 lines to write (that seemed to be the break-even point for StreamWriter's overhead). But I was parsing all ACLs for user home and group shares for about 10,000 users and nearly 10 TB of data. Your task might not be as large.
Here's a good blog explaining the issue.
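If you do go the StreamWriter route, a rough, untested sketch of that buffered approach might look like the following (the 250-line buffer size is just the break-even point mentioned above, and the paths reuse the ones from the question):
$limit = (Get-Date).AddDays(-15)
$scratchpath = Get-Content "E:\Data\PathToScratch.txt" -TotalCount 1
$writer = [System.IO.StreamWriter]::new('C:\Data\eProposal\POC\ScratchContents.txt')
$buffer = [System.Collections.Generic.List[string]]::new()
try {
    Get-ChildItem -Path $scratchpath -Recurse -Force -Directory |
        Where-Object { $_.CreationTime -lt $limit } |
        ForEach-Object {
            $buffer.Add($_.FullName)
            if ($buffer.Count -ge 250) {
                # Write the whole buffer in one call instead of one open/write/close per line
                $writer.WriteLine(($buffer -join [Environment]::NewLine))
                $buffer.Clear()
            }
        }
    if ($buffer.Count) { $writer.WriteLine(($buffer -join [Environment]::NewLine)) }
}
finally {
    $writer.Dispose()
}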
Do you have at least PowerShell 3.0? If you do, you should be able to reduce the time by filtering out the files, since you are returning those as well.
Get-ChildItem -Path $scratchpath -Recurse -Force -Directory | ...
Currently you are returning all files and folders and then filtering out the files with $_.PSIsContainer, which is slower. So you should end up with something like this:
Get-ChildItem -Path $scratchpath -Recurse -Force -Directory |
    Where-Object{ $_.CreationTime -lt $limit } |
    Select-Object -ExpandProperty FullName |
    Add-Content C:\Data\eProposal\POC\ScratchContents.txt