So I have a script that gets the filename of songs contained in a CSV list and checks a directory to see if the file exists, then exports the missing information if there is any. The CSV file looks something like this:
Now, my script seems to work when I test on a smaller directory but when I run it against my actual directory contained on an external drive (about 10TB of files), I get a "system.outofmemoryexception" error before the script can complete.
$myPath = 'Z:\Music\media'
$myCSV = 'C:\Users\Me\Documents\Test.csv'
$CSVexport = 'C:\Users\Me\Documents\Results.csv'
$FileList = Get-ChildItem $myPath -Recurse *.wav | Select-Object -ExpandProperty Name -Unique
Import-CSV -Path $myCSV |
Where-Object {$FileList -notcontains $_.Filename} |
Select ID, AlbumTitle, TrackNo, Filename | Export-CSV $CSVexport -NoTypeInformation
$missing = Import-CSV $CSVexport | Select-Object -ExpandProperty Filename
If(!([string]::IsNullOrEmpty($missing))){
Write-Output "Missing files:`n" $missing}
Is there a way to make this script consume less memory, or a more efficient way to run it against a large directory of files? I am new to PowerShell scripting and am having trouble finding a way around this.
When @TheIncorrigible says "iteratively", he means something like the code below. Note that I am using different file paths because I don't have a Z: drive. The idea is to load the CSV items into a variable, iterate over them with a foreach loop, and test whether each file exists. Any item whose file is missing is added to a new variable, and once the loop finishes that variable is exported to CSV.
$myPath = "C:\temp\"
$myCsv = "C:\temp\testcsv.csv"
$CSVexport = "C:\temp\results.csv"
$CsvItems = Import-Csv -Path $myCsv
$MissingItems = @() # start with an empty array so that + appends each missing item
foreach($item in $CsvItems)
{
$DoesFileExist = Test-Path ($myPath + $item.Filename)
If($DoesFileExist -eq $false)
{
$MissingItems = $MissingItems + $item
}
}
$MissingItems | Export-Csv $CSVexport -NoTypeInformation
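The loop above assumes every file sits directly under $myPath. If the files are spread across subfolders, as the -Recurse in the question suggests, one possible alternative is to stream the file names into a hash set and test each CSV row against it. This is a minimal sketch, assuming PowerShell 5 or later; Get-MissingCsvItem and its parameter names are hypothetical, and [System.IO.Directory]::EnumerateFiles will throw if it hits a folder it cannot read.

```powershell
# Hypothetical helper (not from the original posts): returns the CSV rows
# whose Filename is not found anywhere under $Root.
function Get-MissingCsvItem {
    param(
        [Parameter(Mandatory)] [string]   $Root,
        [Parameter(Mandatory)] [object[]] $CsvItems
    )

    # Case-insensitive set of bare file names; lookups do not scan a list.
    $names = [System.Collections.Generic.HashSet[string]]::new(
        [System.StringComparer]::OrdinalIgnoreCase)

    # EnumerateFiles yields one path string at a time instead of building
    # a full collection of FileInfo objects first.
    $all = [System.IO.SearchOption]::AllDirectories
    foreach ($file in [System.IO.Directory]::EnumerateFiles($Root, '*.wav', $all)) {
        [void]$names.Add([System.IO.Path]::GetFileName($file))
    }

    # Emit only the rows whose file name was never seen on disk.
    $CsvItems | Where-Object { -not $names.Contains($_.Filename) }
}
```

It could then be used as Get-MissingCsvItem -Root $myPath -CsvItems (Import-Csv $myCsv) | Export-Csv $CSVexport -NoTypeInformation. Only the unique names are kept in memory, which may be enough to avoid the out-of-memory error, though that has not been measured here.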
I have a workflow that I created that recursively scans hundreds of file shares for specific file names/extensions. Everything works great, but I'm wondering if there is a way to improve its speed.
In the foreach -parallel, I'm getting child items and ACL recursively from a list of shares, 20 at a time. What I'm wondering is if there is a way to also make it process in parallel within the GCI.
In other words, is foreach -parallel nesting possible/doable, and if so can you offer suggestions such as syntax, the dangers of doing it, etc. I've found very little documentation on it and am hoping for some expert advice
If that doesn't make sense, here's a different way to explain it. I have 5 shares that I'm searching recursively, all with different sizes and a different number of files:
\share1\folder1 Size: 5GB
\share2\folder1 Size: 79GB
\share3\folder1 Size: 2GB
\share4\folder1 Size: 8GB
\share5\folder1 Size: 103GB
The GCI will process all 5 at the same time; however, share2 and share5 are going to take a LOT more time than the others due to their size. Is there a way to make a script process all 5 shares in parallel, and also process multiple files at once within the GCI? So it would start searching all 20 shares at once, and get information on, say, 5 files within each share at a time?
Here's a piece of what I'm doing, shortened down a bit for length reasons with variable names/input changed. I'm sure there are things that I can improve upon in here as well, but overall it works.
workflow Scan-Shares
{
[cmdletbinding()]
param(
[int]$ThrottleLimit = 20
)
$File1 = Import-csv C:\Drives.csv
$ExtList = @((Invoke-WebRequest -Uri "www.websitewithfilenames.com").content | convertfrom-json | % {$_.filters})
foreach -parallel -throttlelimit $throttlelimit ($Line in $File1)
{
inlinescript
{
$Line=$using:Line
$Extlist=$using:Extlist
$Path = $Line.Path + '\*'
$Directory = Get-ChildItem -Path $path -Recurse -Force -ErrorAction SilentlyContinue -WarningAction SilentlyContinue | select-object name, Extension, FullName, LastWriteTime, Directory -ExpandProperty Name
$singleRegex = ($extlist | %{ '^' + $_ + '$' }) -join '|'
[array]$Files = $Directory -match $singleRegex
if($files.count -gt 0 )
{
foreach($SingleFile in $Files)
{
$Owner = (Get-Acl $SingleFile.FullName -Erroraction SilentlyContinue -WarningAction SilentlyContinue).Owner
New-Object -TypeName PSCustomObject -Property @{
Path = $SingleFile.FullName
Filename = $SingleFile.Name
LastWriteTime = $SingleFile.LastWriteTime
Owner = $Owner
Directory = $SingleFile.Directory } |
Export-Csv -Force C:\Share-Results.CSV -Append
}
}
}
}
}
I have a task to carry out 3 times a day on a WS2012R2 to get the Disk size, total number of files and folders including subdirectories from a folder on a remote server.
Currently I get this information by RDP'ing to the target, navigating to the folder and right clicking the folder to copy the info:
I have already tried the PowerShell script :
Get-ChildItem E:\Data -Recurse -File | Measure-Object | %{$_.Count}
and other PowerShell scripts.
These produced countless errors about not having permissions for some subdirectories, or simply gave results I didn't want.
I have tried VBscript but VBscript simply cannot get this information.
You can just access the count property:
$items = Get-ChildItem E:\Data -Recurse # no -File here, otherwise the folder count is always 0
($items | Where { -not $_.PSIsContainer }).Count #files
($items | Where { $_.PSIsContainer }).Count #folders
To summarise the comments so far.
You can get the size of the C drive with:
$Drive = "C"
Get-PSDrive -Name $Drive | Select-Object @{N="Used Space on $Drive Drive (GB)"; E={[Math]::Round($_.Used/1GB)}}
Note, however, that code like the snippet below cannot reliably count the files and folders of the whole C drive; it does work for simple file structures and simple drives.
I use this code for my other drives and for my home path; just change $Path to, e.g., "$Env:Homepath".
There's the path length problem, which is a .NET thing and won't be fixed until everyone (and PowerShell) is using .NET 4.6.2. Then there's the fact that you are doing the counting yourself, rather than the operating system doing it.
[Int]$NumFolders = 0
[Int]$NumFiles = 0
$Path = "C:\"
$Objects = Get-ChildItem $Path -Recurse
$Size = $Objects | Measure-Object -Property Length -Sum # reuse the listing instead of scanning the drive a second time
$NumFiles = ($Objects | Where { -not $_.PSIsContainer}).Count
$NumFolders = ($Objects | Where {$_.PSIsContainer}).Count
$Summary = New-Object PSCustomObject -Property @{"Path" = $Path ; "Files" = $NumFiles ; "Folders" = $NumFolders ; "Size in MB" = ([Math]::Round($Size.sum /1MB))}
$Summary | Format-list
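A variation on the code above that tallies everything as items stream past, so the full listing never has to be held in a variable. This is only a sketch: Get-FolderSummary is a hypothetical name, -Force (which includes hidden and system items) is an addition the code above does not use, and unreadable folders are skipped and counted instead of reported.

```powershell
# Hypothetical helper: walks $Path once and tallies files, folders and bytes.
function Get-FolderSummary {
    param([Parameter(Mandatory)] [string] $Path)

    [long]$files = 0; [long]$folders = 0; [long]$bytes = 0

    # Errors (e.g. access denied) are suppressed but collected in $skipped.
    Get-ChildItem -LiteralPath $Path -Recurse -Force -ErrorAction SilentlyContinue -ErrorVariable skipped |
        ForEach-Object {
            if ($_.PSIsContainer) { $folders++ }
            else { $files++; $bytes += $_.Length }
        }

    [pscustomobject]@{
        Path    = $Path
        Files   = $files
        Folders = $folders
        SizeMB  = [Math]::Round($bytes / 1MB)
        Skipped = $skipped.Count   # number of errors raised while listing
    }
}
```

Calling Get-FolderSummary -Path $Env:Homepath | Format-List gives output in the same spirit as $Summary above. The long path limitation mentioned earlier still applies.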
To run this on remote computers, I would recommend using New-PsSession to the computer/computers, then Invoke-Command to run the code using the new sessions.
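The session approach could look roughly like this. It is a sketch under assumptions: PowerShell remoting must already be enabled on the target, and the names $countFiles, Invoke-RemoteFileCount and FILESERVER01 are placeholders, not anything from the original posts.

```powershell
# Script block that counts files under a path. It runs on whichever
# machine invokes it, so it can be tried locally first.
$countFiles = {
    param([string] $Path)
    @(Get-ChildItem -LiteralPath $Path -Recurse -File -Force -ErrorAction SilentlyContinue).Count
}

# Hypothetical wrapper: opens a session, runs the block remotely, then cleans up.
function Invoke-RemoteFileCount {
    param(
        [Parameter(Mandatory)] [string]      $ComputerName,
        [Parameter(Mandatory)] [string]      $Path,
        [Parameter(Mandatory)] [scriptblock] $ScriptBlock
    )
    $session = New-PSSession -ComputerName $ComputerName
    try     { Invoke-Command -Session $session -ScriptBlock $ScriptBlock -ArgumentList $Path }
    finally { Remove-PSSession -Session $session }
}

# Example call (the server name is a placeholder):
# Invoke-RemoteFileCount -ComputerName 'FILESERVER01' -Path 'E:\Data' -ScriptBlock $countFiles
```

Because the enumeration runs on the remote server, only the final number crosses the network, which also makes this suitable for a scheduled task instead of an RDP session.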
I am fairly new to PowerShell, and wrote a script to analyze network shares, dump it to CSV and import it to SQL. Our NAS device has several hidden shares, so I specify the (NAS) server name, a string of share names to search, and the folder depth that I want to search. (like 3 or 4 levels for quick testing).
The script tries to convert the security permissions to show simple "List, Read or Modify" access to folders. Can this user/group "list" the files, view the files, or modify them? The user info is put into a comma-separated list for each access type.
I suspect that although the code is functional, it may not be very efficient and I wonder if there are some significant improvements that could be made?
To deal with long pathnames, I use the "File System Security PowerShell Module 3.2.3" which appends a "2" to several modules, like "Get-ChildItem2".
I used to just specify one share folder, and I'm also wondering if my ForEach-Object that processes multiple shares has introduced a bug in how the objects are handled. It seems to use a lot more memory and slows down, and doesn't seem to process the last share in the list properly.
Here is the code: (split into 3 pieces)
# This script reads through the specified shares on the server and creates a CSV file containing the folder information
# The data is written to a SQL server
$Server = '\\MyServer'
$Shares = 'data$,share$'.Split(',')
$Levels = 99 # specify 3 or 4 for faster testing with less info
$ScanDate = Get-Date -Format 'yyyy-MM-dd HH:mm:ss'
$CSVFile = 'C:\FolderInfo\' + $ScanDate.Replace(':','-') + '.csv'
Write-Debug "ScanDate will be set to: $ScanDate"
Write-Debug "Data will be written to: $CSVFile"
$Separator = ',' # Separate the AD groups
$ListRights = 'ListDirectory,GenericExecute'.Split(',')
$ReadRights = 'Read,ReadAndExecute,GenericRead'.Split(',')
$ModifyRights = 'CreateFiles,AppendData,Write,Modify,FullControl,GenericAll'.Split(',')
$ErrorPref = 'Continue'
$ErrorActionPreference = $ErrorPref
$DebugPreference = 'Continue'
$DataBase = 'Folders.dbo.FolderInfo'
Function Get-Subs {
Param([String]$Path,[Byte]$Depth)
$CurrentDepth = $Path.Length - $Path.Replace('\','').Length
new-object psobject -property @{Path=$Path} # Object 'Path' is for the pipe output
If ( $CurrentDepth -lt ($Depth + 1) ) {
Get-ChildItem2 -Path $Path -Directory | ForEach {
Get-Subs $PSItem.FullName $Depth }
}
}
The next line has a commented out line of code that I was using to test how it is processing multiple share names, and it works properly, but the remaining code below seems to mess up on the last sharename in the list.
$Shares | ForEach-Object {Get-Subs (Resolve-Path $Server\$_).ProviderPath $Levels} | Get-Item2 | #ForEach-Object { new-object psobject -property @{Path=$_.FullName} } | Select Path
And the remaining code: (I hope this breakage doesn't confuse everyone :)
ForEach-Object {
$ListUsers = @()
$ReadUsers = @()
$ModifyUsers = @()
$Folder = $PSItem.FullName
Write-Debug $Folder
$Inherited = $true
try {$Owner = (Get-NTFSOwner -Path $Folder).Owner.AccountName.Replace('MyDomain\','')
}
catch {Write-Debug "Access denied: $Folder"
$Owner = 'access denied'
$Inherited = $false
}
$Levels = $Folder.Length - $Folder.Replace('\','').Length - 3 # Assuming \\server\share as base = 0
Get-NTFSAccess $Folder | Where { $PSItem.Account -ne 'BUILTIN\Administrators' } | ForEach-Object {
$Account = $PSItem.Account.AccountName.Replace('MyDomain\','')
$Rights = $PSItem.AccessRights -split(',')
If ($PSItem.IsInherited -eq $false) {$Inherited = $false}
IF ($PSItem.InheritanceFlags -eq 'ContainerInherit') { # Folders only or 'ContainerInherit, ObjectInherit' = Folders and Files
If (@(Compare -ExcludeDifferent -IncludeEqual ($ListRights)($Rights)).Length -and $Account) {$ListUsers += $Account}
If (@(Compare -ExcludeDifferent -IncludeEqual ($ReadRights)($Rights)).Length -and $Account) {$ListUsers += $Account}
If (@(Compare -ExcludeDifferent -IncludeEqual ($ModifyRights)($Rights)).Length -and $Account) {$ListUsers += $Account
Write-Debug "Modify anomaly found on Container only: $Account with $Rights in $Folder"
}
}
Else {
If (@(Compare -ExcludeDifferent -IncludeEqual ($ListRights)($Rights)).Length -and $Account) {$ListUsers += $Account}
If (@(Compare -ExcludeDifferent -IncludeEqual ($ReadRights)($Rights)).Length -and $Account) {$ReadUsers += $Account}
If (@(Compare -ExcludeDifferent -IncludeEqual ($ModifyRights)($Rights)).Length -and $Account) {$ModifyUsers += $Account}
}
}
$FileCount = Get-ChildItem2 -Path $Folder -File -IncludeHidden -IncludeSystem | Measure-Object -property Length -Sum
If ($FileCount.Sum) {$Size = $FileCount.Sum} else {$Size = 0}
If ($FileCount.Count) {$NumFiles = $FileCount.Count} else {$NumFiles = 0}
$ErrorActionPreference = 'SilentlyContinue'
Remove-Variable FolderInfo
Remove-Variable Created, LastAccessed, LastModified
$ErrorActionPreference = $ErrorPref
$FolderInfo = @{} # create empty hashtable, new properties will be auto-created
$LastModified = Get-ChildItem2 -Path $Folder -File | Measure-Object -property LastWriteTime -Maximum
IF ($LastModified.Maximum) {$FolderInfo.LastModified = $LastModified.Maximum.ToString('yyyy-MM-dd hh:mm:ss tt')}
else {$FolderInfo.LastModified = $PSItem.LastWriteTime.ToString('yyyy-MM-dd hh:mm:ss tt')}
$LastAccessed = Get-ChildItem2 -Path $Folder -File | Measure-Object -property LastAccessTime -Maximum
IF ($LastAccessed.Maximum) {$FolderInfo.LastAccessed = $LastAccessed.Maximum.ToString('yyyy-MM-dd hh:mm:ss tt')}
else {$FolderInfo.LastAccessed = $PSItem.LastAccessTime.ToString('yyyy-MM-dd hh:mm:ss tt')}
$Created = Get-ChildItem2 -Path $Folder -File | Measure-Object -Property CreationTime -Maximum
IF ($Created.Maximum) {$FolderInfo.Created = $Created.Maximum.ToString('yyyy-MM-dd hh:mm:ss tt')}
else {$FolderInfo.Created = $PSItem.CreationTime.ToString('yyyy-MM-dd hh:mm:ss tt')}
$FolderInfo.FolderName = $Folder
$FolderInfo.Levels = $Levels
$FolderInfo.Owner = $Owner
$FolderInfo.ListUsers = $ListUsers -join $Separator
$FolderInfo.ReadUsers = $ReadUsers -join $Separator
$FolderInfo.ModifyUsers = $ModifyUsers -join $Separator
$FolderInfo.Inherited = $InheritedFrom
$FolderInfo.Size = $Size
$FolderInfo.NumFiles = $NumFiles
$FolderInfo.ScanDate = $ScanDate
Write-Debug $Folder
Write-Output (New-Object –Typename PSObject –Prop $FolderInfo)
} | Select FolderName, Levels, Owner, ListUsers, ReadUsers, ModifyUsers, Inherited, Size, NumFiles, Created, LastModified, LastAccessed, ScanDate |
ConvertTo-csv -NoTypeInformation -Delimiter '|' |
ForEach-Object {$PSItem.Replace('"','')} |
Out-File -FilePath $CSVFile -Force
Write-debug 'Starting import...'
$Query = @"
BULK INSERT $DataBase FROM '$CSVFile' WITH (DATAFILETYPE = 'widechar', FIRSTROW = 2, FIELDTERMINATOR = '|', ROWTERMINATOR = '\n')
"@
sqlcmd -S MyComputer\SQLExpress -E -Q $Query
Arrays can be defined as a comma-separated list. Each element belongs in quotes.
For $Shares = 'data$,share$'.Split(','), try, $Shares = 'data$','share$'
Similarly for most of your arrays where you used .Split(',')
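A quick way to convince yourself the two forms are interchangeable (the variable names here are only for illustration):

```powershell
# Single quotes keep the $ characters literal in both forms.
$fromSplit   = 'data$,share$'.Split(',')
$fromLiteral = 'data$', 'share$'

# Compare-Object emits nothing when both arrays hold the same elements,
# so $same is $true when the two definitions are equivalent.
$same = -not (Compare-Object $fromSplit $fromLiteral)
```

The literal form skips the method call and makes it obvious at a glance that the variable is an array.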
The filename can be done in many, many ways. Here you use a custom format then change it immediately. Recommend you replace
$ScanDate = Get-Date -Format 'yyyy-MM-dd HH:mm:ss'
$CSVFile = 'C:\FolderInfo\' + $ScanDate.Replace(':','-') + '.csv'
With
$ScanDate = Get-Date -Format 'yyyy-MM-dd-HH-mm-ss'
$CSVFile = "C:\FolderInfo\$ScanDate.csv"
This uses custom date format to set to what you wanted without the extra operation AND leverages PS way of evaluating variables within strings if the string is in double quotes. YMMV, but I also prefer 'yyyyMMdd-HHmmss' for datestamps.
Why are you defining a variable only to use it to define a second variable?
$ErrorPref = 'Continue'
$ErrorActionPreference = $ErrorPref
Why not $ErrorActionPreference = 'Continue'?
I found it later. Could probably use an explanation of what you're doing and how to do it when you define your preference. e.g. # Set default errorAction to 'Continue' during development, to show errors for debugging. Change this to 'silentlycontinue' to ignore errors during run. This will really help when you come back to this script in 18 months and are like WTF does this do?
Also, research Advanced Functions and CmdletBinding() so that you can build your function like a cmdlet, including support for a -Debug switch, so you can write with debugging in mind.
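As a rough illustration of what that buys you, here is a minimal advanced function. Get-SubFolder is a made-up name and the body is deliberately trivial; the point is the [CmdletBinding()] attribute, which adds the common parameters such as -Debug and -Verbose. It assumes PowerShell 3 or later.

```powershell
# Hypothetical example of an advanced function.
function Get-SubFolder {
    [CmdletBinding()]
    param(
        # Accepts one or more folder paths, also from the pipeline.
        [Parameter(Mandatory, ValueFromPipeline)]
        [string] $Path
    )
    process {
        # Only shown when the caller passes -Debug or sets $DebugPreference.
        Write-Debug "Listing subfolders of $Path"
        Get-ChildItem -LiteralPath $Path -Directory |
            Select-Object -ExpandProperty FullName
    }
}
```

Running Get-SubFolder -Path C:\Temp -Debug then shows the Write-Debug message (Windows PowerShell may also prompt before continuing), without having to set $DebugPreference globally at the top of the script.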
What does Function Get-Subs actually do? It looks like some kind of recursion to get the paths using the custom cmdlet Get-ChildItem2. Do you need the full path? If so, Get-ChildItem $path -Directory -Recurse | select FullName does that, where $path is your UNC path, a local path, or a path on any other provider, really.
Get-NTFSOwner: not sure where this comes from, perhaps your custom module. You can use Get-ACL in PowerShell 3 (not sure about 2, I don't remember): $owner = (Get-ACL 'path\file.ext').Owner.Replace('mydomain\','')
No idea what you're doing with all the inheritance stuff. Just keep in mind that you can get paths from Get-ChildItem | select Fullname and owner from Get-ACL. These may allow you to skip the custom module.
For:
$FileCount = Get-ChildItem2 -Path $Folder -File -IncludeHidden -IncludeSystem | Measure-Object -property Length -Sum
If ($FileCount.Sum) {$Size = $FileCount.Sum} else {$Size = 0}
If ($FileCount.Count) {$NumFiles = $FileCount.Count} else {$NumFiles = 0}
Use:
$files = Get-ChildItem -Force -File -Path $folder
-Force shows all files, including hidden and system ones. -File limits the output to files only. The number of files is then $files.Count. This works with empty folders, folders with one hidden file, and folders with both hidden and normal files.
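The size that the original Measure-Object call produced can be derived from the same listing. A minimal sketch, assuming PowerShell 3 or later (needed for -File); Get-FolderFileStat is a hypothetical name:

```powershell
# Hypothetical helper: counts the files directly inside one folder
# (no recursion) and sums their sizes in bytes.
function Get-FolderFileStat {
    param([Parameter(Mandatory)] [string] $Folder)

    # @(...) guarantees an array, so .Count is reliable for zero or one file.
    $files = @(Get-ChildItem -Force -File -LiteralPath $Folder)

    # Sum is empty when there are no files, so fall back to 0.
    $sum = ($files | Measure-Object -Property Length -Sum).Sum
    if (-not $sum) { $sum = 0 }

    [pscustomobject]@{
        NumFiles = $files.Count
        Size     = $sum
    }
}
```

This replaces both If/else lines from the original, since the zero cases are handled inside the function.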
For $folderinfo consider using a custom object. If you create it within the loop it should destroy the previous one. Then you can assign values directly to the object instead of storing them in a variable then inserting the variable into the hash table.
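Something along these lines, as a sketch only: the folder names and the output path are placeholders, and [pscustomobject] needs PowerShell 3 or later.

```powershell
# Placeholder output location for the demo file.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'FolderInfo-demo.csv'

# One fresh object per iteration; property order is kept as written,
# so no Select is needed afterwards just to order the columns.
$rows = foreach ($name in 'Alpha', 'Beta') {      # placeholder folder names
    [pscustomobject]@{
        FolderName = $name
        NumFiles   = 0
        ScanDate   = Get-Date -Format 'yyyy-MM-dd HH:mm:ss'
    }
}

$rows | Export-Csv -Path $csvPath -NoTypeInformation
```

Each pass through the loop builds a new object, so there is nothing left over from the previous folder to remove with Remove-Variable.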
Using Get-ChildItem native will help you maintain this script far more easily than your customized module.
For:
| Select FolderName, Levels, Owner, ListUsers, ReadUsers, ModifyUsers, Inherited, Size, NumFiles, Created, LastModified, LastAccessed, ScanDate |
ConvertTo-csv -NoTypeInformation -Delimiter '|' |
ForEach-Object {$PSItem.Replace('"','')} |
Out-File -FilePath $CSVFile -Force
Try:
| Export-CSV -NoTypeInformation $CSVFile # by default this will overwrite, but you're timestamping down to the second so I don't think this will be an issue.
Overall:
Get rid of the custom module, I think everything you're doing here can be done in native PowerShell.
Consider recursion
Use an advanced function, complete with documentation. Ref: https://technet.microsoft.com/en-us/magazine/hh360993.aspx
Definitely use a custom object to store data for each file/folder. Passing a collection of objects to Export-CSV produces excellent output.
Pipe to Select -ExpandProperty when an object contains a hashtable or another object, e.g. Get-ACL 'file.txt' | select -ExpandProperty Access gives a list of the access rule objects in the ACL.
Get-Help and Get-Member are amongst the most powerful commands in PowerShell.
Select-Object and Where-Object are up there too.
My Copy-Item doesn't work when included in a foreach loop.
Much like Powershell: Copy-Item not working when in ForEach loop
The difference is that my destination folder is not the same as the originating folder, which seemed to be the problem there.
This is the very basic function. My objective is to grab the latest log files from a directory containing log files for lots of stuff. I'm only interested in a few defined in $servers. A line in Servers.txt looks like this: \\clientname\d$\logdirectory\processlog\
When I Set-Location to a path in servers.txt and run Get-ChildItem it works as expected.
I also need to generate a new folder for each object in \Logs\ but one thing at a time.
$servers = @()
$servers = Get-Content c:\Test\Servers.txt
$destServer = @()
$destServer = ( "clientname")
$destinationFolder = "\\" + $destServer + "\d$\Logs\"
foreach ($serverpath in $servers) {
Write-Host " Copying from $serverpath "
Set-Location -literalpath $serverpath |
Get-ChildItem |
Sort-Object -Descending LastWriteTime |
Select -First 2 |
Copy-Item -Destination $destinationFolder -Recurse -Force
}
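One likely cause is the pipe after Set-Location: that cmdlet emits nothing unless -PassThru is given, so Get-ChildItem sits in the middle of a pipeline that never receives any input and never runs its listing. A possible rework passes the path to Get-ChildItem directly. This is only a sketch; Copy-LatestItem and its parameters are hypothetical names.

```powershell
# Hypothetical helper: copies the $Count most recently written items
# from each source folder into $Destination.
function Copy-LatestItem {
    param(
        [Parameter(Mandatory)] [string[]] $SourcePath,
        [Parameter(Mandatory)] [string]   $Destination,
        [int] $Count = 2
    )
    foreach ($source in $SourcePath) {
        Write-Host " Copying from $source "
        # List the folder explicitly instead of changing location first.
        Get-ChildItem -LiteralPath $source |
            Sort-Object -Property LastWriteTime -Descending |
            Select-Object -First $Count |
            Copy-Item -Destination $Destination -Recurse -Force
    }
}
```

With the variables from the question, the call would be Copy-LatestItem -SourcePath $servers -Destination $destinationFolder. The destination folder has to exist already; creating one folder per source is left out here.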