Speeding up Test-Path and GCI on large data set

I have a large list of customer IDs (40,000). Files for each customer have been saved in many different locations. I first run Test-Path to see whether the directory exists; if it does, I run Get-ChildItem and filter the results to find the file I need (any file whose name matches the string 'Contract'). I am hoping to learn how to speed this up. I have tried introducing another variable, 'Trigger', to prevent some of the excess code from running once the matching file is found. Looping through this 40,000 times takes a very long time, so if there is a better way, any help is greatly appreciated. Many thanks.
Here is the code I'm currently using:
ForEach ($Customer in $CustomerList)
{
    $Trigger = 0
    $Result1 = Test-Path "$Location1\$Customer"
    $Result2 = Test-Path "$Location2\$Customer"
    $Result3 = Test-Path "$Location3\$Customer"
    IF ($Result1 -eq $True)
    {
        $1Files = GCI "$Location1\$Customer" -Recurse
        ForEach ($File IN $1Files)
        {
            IF ($Trigger -eq 0)
            {
                $FileName = $File.Name
                $FileLocation = $File.FullName
                IF ($FileName -match 'Contract')
                {
                    $Report += "$FileName $FileLocation"
                    $Trigger = 1
                }
            }
        }
    }
    ELSEIF ($Result2 -eq $True)
    {
        # Same as the $Result1 code block, but using $Location2
    }
    ELSEIF ($Result3 -eq $True)
    {
        # Same as the $Result1 code block, but using $Location3
    }
}
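One direction that may help, sketched here rather than measured: loop over the locations instead of repeating the block three times, only test the paths you actually need, and stop enumerating as soon as the first match turns up. The temporary folders, the customer IDs and the $Locations array below are made-up stand-ins for $Location1 to $Location3, and [IO.Directory]::EnumerateFiles is swapped in for GCI as an assumption about where the time goes. Note that its '*Contract*' wildcard is not identical to -match 'Contract' (it is case-sensitive on non-Windows systems), and it throws if a subfolder cannot be read.

```powershell
# Hypothetical demo data: a temp root standing in for $Location1..$Location3.
$root = Join-Path ([IO.Path]::GetTempPath()) ("gci-demo-" + [guid]::NewGuid())
$Locations = 'Loc1', 'Loc2', 'Loc3' | ForEach-Object { Join-Path $root $_ }
$Locations | ForEach-Object { New-Item -ItemType Directory -Path $_ -Force | Out-Null }
$docs = Join-Path (Join-Path $Locations[1] 'C001') 'Docs'
New-Item -ItemType Directory -Path $docs -Force | Out-Null
Set-Content -Path (Join-Path $docs 'Signed Contract.pdf') -Value 'x'
$CustomerList = 'C001', 'C002'

# Collect the loop output directly instead of growing $Report with +=.
$Report = foreach ($Customer in $CustomerList) {
    foreach ($Location in $Locations) {
        $dir = Join-Path $Location $Customer
        if (-not [IO.Directory]::Exists($dir)) { continue }

        # EnumerateFiles yields paths lazily; Select-Object -First 1 stops
        # the enumeration at the first match instead of listing everything.
        $hit = [IO.Directory]::EnumerateFiles($dir, '*Contract*', [IO.SearchOption]::AllDirectories) |
            Select-Object -First 1
        if ($hit) {
            [pscustomobject]@{
                Customer = $Customer
                FileName = [IO.Path]::GetFileName($hit)
                FullName = $hit
            }
        }
        # Like the IF/ELSEIF chain, only the first existing location is searched.
        break
    }
}
$Report

Remove-Item $root -Recurse -Force   # clean up the demo folders
```

If a customer folder can exist in more than one location and you want to keep looking until a contract is found, move the break inside the if ($hit) block.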

Related

Powershell: How to find size of several folders and all of their subfolders

I've been trying to write a script which will do the following:
a.) Show the path of several folders and all of their subfolders
b.) Show the number of files in all of these folders and subfolders
c.) Show the size of the contents of these folders and each of their subfolders
So far, a.) and b.) have been simple, with something like the following:
$folders = @('C:\Directory1','C:\Directory2','C:\Directory3')
$output = foreach ($folder in $folders) {
    Get-ChildItem $folder -Recurse -Directory | ForEach-Object{
        [pscustomobject]@{
            Folder = $_.FullName
            Count = @(Get-ChildItem -Path $_.Fullname -File).Count
        }
    } | Select-Object Folder,Count
}
$output | export-csv C:\Temp\Folderinfo.csv
This worked great, but I haven't been able to get Powershell to output the folder sizes alongside the paths and numbers of files. I tried to use the Get-DirectorySize function from this StackOverflow thread, and could only get it to output the size of the top-level folder, and never the subfolders. I have also tried passing Get-ChildItem to Measure-Object -Property Length -sum but ran into similar problems, where it would only show the size of the top-level folder.
Does anyone know the correct way to incorporate Measure-Object or Get-DirectorySize into this script, or one like it, so that it works with the needed recursion, and outputs the folder size of each path?
Thanks!

I will go ahead and assume you're looking to show the size of the folders recursively, similar to how explorer does it. In which case you would need to gather the sum of the files size in a folder (which you already have) but also add to that sum, the sum of each child folder in that parent folder. For this you can use an OrderedDictionary which will serve as an indexer, a structure that will hold information of each processed folder and allow for fast lookups of processed folders to add to the stored size.
Function Definition
function Get-DirectoryInfo {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, ValueFromPipelineByPropertyName, ValueFromPipeline)]
        [alias('FullName')]
        [ValidateScript({
            if(Test-Path $_ -PathType Container) {
                return $true
            }
            throw 'Directories only!'
        })]
        [string[]] $LiteralPath,
        [Parameter()]
        [switch] $Force
    )
    begin {
        class Tree {
            [string] $FullName
            [Int64] $Count
            [Int64] $Size
            [string] $FriendlySize
            [void] AddSize([Int64] $Size) {
                $this.Size += $Size
            }
            [void] SetFriendlySize() {
                # Inspired from https://stackoverflow.com/a/40887001/15339544
                $suffix = "B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"
                $index = 0
                $length = $this.Size
                while ($Length -ge 1kb) {
                    $Length /= 1kb
                    $index++
                }
                $this.FriendlySize = [string]::Format(
                    '{0} {1}',
                    [math]::Round($Length, 2), $suffix[$index]
                )
            }
        }
        $indexer = [ordered]@{}
        $stack = [Collections.Generic.Stack[IO.DirectoryInfo]]::new()
        $PSBoundParameters['Directory'] = $true
    }
    process {
        foreach($folder in $LiteralPath | Get-Item) {
            $stack.Push($folder)
        }
        while($stack.Count) {
            $PSBoundParameters['LiteralPath'] = $stack.Pop().FullName
            foreach($folder in Get-ChildItem @PSBoundParameters) {
                $stack.Push($folder)
                $count = 0
                $size = 0
                foreach($file in $folder.EnumerateFiles()) {
                    $count++
                    $size += $file.Length
                }
                $indexer[$folder.FullName] = [Tree]@{
                    FullName = $folder.FullName
                    Count = $count
                    Size = $size
                }
                $parent = $folder.Parent
                while($parent -and $indexer.Contains($parent.FullName)) {
                    $indexer[$parent.FullName].AddSize($size)
                    $parent = $parent.Parent
                }
            }
        }
        $output = $indexer.PSBase.Values
        if($output.Count) {
            $output.SetFriendlySize()
            $output
        }
        $indexer.Clear()
    }
}
Usage
Get-ChildItem 'C:\Directory1', 'C:\Directory2', 'C:\Directory3' |
Get-DirectoryInfo
The logic used in this function is more or less similar to the one used in this module to display File / Folder hierarchy (similar to the tree command).

Identifying best photos from a photo collection

I borrowed code from this website and it looks helpful to me as a photographer. Since I do not know PowerShell I would like to ask if this code looks OK. I do not want to lose my photos by mistake.
$source_to_use_as_reference = "C:\photos\mytrip_to_hawai\Best\"
$destination_to_copy = "C:\photos\mytrip_to_hawai\Best\Best_CR2\"
$location_to_find_CR2_files = "C:\photos\mytrip_to_hawai\CR2\"
# these are the codes to find CR2 files that matches with JPG files and copy
# them to new destination
cls
$count_all = 0
$count_matches = 0
$a = Get-ChildItem -Path $source_to_use_as_reference -Recurse -File
foreach ($item in $a) {
    $count_all += 1
    if ($item.Name -match "JPG") {
        $name_as_CR2 = $item.Name.Replace('JPG','CR2')
        $location_and_cr2_name = $location_to_find_CR2_files + $name_as_CR2
        if (Test-Path -Path $location_and_cr2_name ) {
            $destination_and_CR2_name = $destination_to_copy + $name_as_CR2
            if (Test-Path -Path $destination_and_CR2_name) {
                Write-Output "already exists I skipped ... " $destination_and_CR2_name
            } else {
                $count_matches += 1
                Write-Host "I found it " $destination_and_CR2_name
                Copy-Item -Path $location_and_cr2_name -Destination $destination_to_copy
            }
        } else {
        }
    }
}
Write-Output "$count_matches matches found and files copied to destination $destination_to_copy"

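There's nothing sinister in that script. For each JPG in the reference folder it looks for a CR2 file with the same name, counts the matches and copies those CR2 files to a second location. It only copies; it never moves or deletes anything, and it skips files that already exist at the destination.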
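If you want extra reassurance before running it on real photos, Copy-Item accepts the common -WhatIf switch, which only reports what would be copied. A small sketch of that, using throwaway temp folders and a made-up file name (IMG_0001.CR2) rather than your actual paths:

```powershell
# Hypothetical temp folders standing in for the CR2 source and the destination.
$tmp  = Join-Path ([IO.Path]::GetTempPath()) ("photo-demo-" + [guid]::NewGuid())
$src  = Join-Path $tmp 'CR2'
$dest = Join-Path $tmp 'Best_CR2'
New-Item -ItemType Directory -Path $src, $dest -Force | Out-Null
Set-Content -Path (Join-Path $src 'IMG_0001.CR2') -Value 'raw'

# Dry run: -WhatIf prints the intended operation but writes nothing.
Copy-Item -Path (Join-Path $src 'IMG_0001.CR2') -Destination $dest -WhatIf
$copiedDuringDryRun = Test-Path (Join-Path $dest 'IMG_0001.CR2')

# Real run: the file is copied and the original stays where it was.
Copy-Item -Path (Join-Path $src 'IMG_0001.CR2') -Destination $dest
$copiedForReal    = Test-Path (Join-Path $dest 'IMG_0001.CR2')
$sourceStillThere = Test-Path (Join-Path $src 'IMG_0001.CR2')

Remove-Item $tmp -Recurse -Force   # clean up the demo folders
```

In the script from the question, that would mean temporarily adding -WhatIf to its Copy-Item line and reading the output before running it for real.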

Why Isn't This Counting Correctly | PowerShell

Right now, I have a CSV file which contains 3,800+ records. This file contains a list of server names, followed by an abbreviation stating if the server is a Windows server, Linux server, etc. The file also contains comments or documentation, where each line starts with "#", stating it is a comment. What I have so far is as follows.
$file = Get-Content .\allsystems.csv
$arraysplit = @()
$arrayfinal = @()
[int]$windows = 0
foreach ($thing in $file){
    if ($thing.StartsWith("#")) {
        continue
    }
    else {
        $arraysplit = $thing.Split(":")
        $arrayfinal = @($arraysplit[0], $arraysplit[1])
    }
}
foreach ($item in $arrayfinal){
    if ($item[1] -contains 'NT'){
        $windows++
    }
    else {
        continue
    }
}
$windows
The goal of this script is to count the total number of Windows servers. My issue is that the first "foreach" block works fine, but the second one results in "$Windows" being 0. I'm honestly not sure why this isn't working. Two example lines of data are as follows:
example:LNX
example2:NT

If the goal is to count the Windows servers, why do you need the array?
Can't you just say something like this?
foreach ($thing in $file)
{
    if ($thing -notmatch "^#" -and $thing -match "NT") { $windows++ }
}

$arrayfinal = @($arraysplit[0], $arraysplit[1])
This replaces the array on every iteration, so after the loop it holds only the last line's two strings.
Changing it to += gave another issue: it simply appended each individual element. I used the info from this post to fix it, sort of forcing a 2D array: How to create array of arrays in powershell?
$file = Get-Content .\allsystems.csv
$arraysplit = @()
$arrayfinal = @()
[int]$windows = 0
foreach ($thing in $file){
    if ($thing.StartsWith("#")) {
        continue
    }
    else {
        $arraysplit = $thing.Split(":")
        $arrayfinal += ,$arraysplit
    }
}
foreach ($item in $arrayfinal){
    if ($item[1] -contains 'NT'){
        $windows++
    }
    else {
        continue
    }
}
$windows
1
I also changed the file around and added more instances of both NT and other random garbage. Seems it works fine.
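To make the difference between += and += , concrete, here is a small self-contained sketch using the two sample lines from the question. The variable names $flat and $nested are mine, purely for illustration.

```powershell
$lines = 'example:LNX', 'example2:NT'
$flat   = @()
$nested = @()

foreach ($line in $lines) {
    $pair = $line.Split(':')
    $flat   += $pair     # appends the two strings individually
    $nested += ,$pair    # unary comma wraps the pair, so it is appended as ONE element
}

# $flat   is 'example','LNX','example2','NT'  -> $flat[1] is the string 'LNX'
# $nested is two string arrays                -> $nested[1][1] is 'NT'
$windows = 0
foreach ($item in $nested) {
    if ($item[1] -eq 'NT') { $windows++ }
}
$windows
```

With the flat array, $item is a single string inside the counting loop, so $item[1] is just its second character, which is why the comparison with 'NT' never succeeded.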

I'd avoid making another ForEach loop just to bump the count of occurrences. Your $arrayfinal also gets overwritten every time, so I used an ArrayList.
$file = Get-Content "E:\Code\PS\myPS\2018\Jun\12\allSystems.csv"
$arrayFinal = New-Object System.Collections.ArrayList($null)
foreach ($thing in $file){
    if ($thing.StartsWith("#")) {
        continue
    }
    else {
        $arraysplit = $thing -split ":"
        if($arraysplit[1] -match "NT" -or $arraysplit[1] -match "Windows")
        {
            $arrayfinal.Add($arraysplit[1]) | Out-Null
        }
    }
}
Write-Host "Entries with 'NT' or 'Windows' $($arrayFinal.Count)"
I'm not sure whether you want to keep 'example', 'example2' and so on, so I have skipped adding them to $arrayFinal, assuming the goal is to count "NT" or "Windows" occurrences.

The goal of this script is to count the total number of Windows servers.
I'd suggest the easy way: using cmdlets built for this.
$csv = Get-Content -Path .\file.csv |
    Where-Object { -not $_.StartsWith('#') } |
    ConvertFrom-Csv
@($csv.servertype).Where({ $_.Equals('NT') }).Count
# Compatibility mode:
# ($csv.servertype | Where-Object { $_.Equals('NT') }).Count
Replace servertype and 'NT' with whatever that header/value is called.
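Since the sample lines in the question (example:LNX, example2:NT) use a colon as separator and have no header row, the same idea might look like the sketch below. The column names Name and ServerType are invented for the example, and the inline $lines array stands in for Get-Content on the real file.

```powershell
# Inline stand-in for: Get-Content .\allsystems.csv
$lines = @(
    '# inventory of servers'
    'example:LNX'
    'example2:NT'
    '# more hosts follow'
    'example3:NT'
)

$csv = $lines |
    Where-Object { -not $_.StartsWith('#') } |
    ConvertFrom-Csv -Delimiter ':' -Header 'Name', 'ServerType'   # hypothetical column names

# @() keeps .Count reliable even when only one row matches.
$windows = @($csv | Where-Object { $_.ServerType -eq 'NT' }).Count
$windows
```

Because -Header supplies the column names, every remaining line is treated as data rather than the first one being consumed as a header.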

Find first available file to output log to

I am working on a complete rewrite of the logging function that I use in a couple hundred scripts, and I am trying to make it as robust as possible. I want it to go through a very basic set of checks to find the first available log that it can write to.
I am trying to write it so that it attempts to write to each log file (in case the files have different permissions than the directories).
Logic path:
Go through each directory in the list
See if there are any logs I can append to
If there are, append to them
If not, try to create a new log with a number appended to its name
If a new log cannot be created, move on to the next directory
This script isn't very difficult, and I've written much more complex scripts, but for some reason I can't wrap my head around this one: I keep coming up with non-robust, very repetitive functions, while efficiency and speed are my most important priorities.
Function TestLogger {
    $WriteTee = @{}
    $WriteTee.LogName = 'WriteTee.log'
    #$WriteTee.LogName = "$(((Split-Path -Leaf $script:MyInvocation.MyCommand.Definition)).BaseName).txt"
    $WriteTee.LogPaths = "C:\Windows\",
    'C:\Users\1151577373E\Documents\Powershell Scripts\AutoUpdater\',
    "$Env:UserProfile"
    #"$(Split-Path -Parent $script:MyInvocation.MyCommand.Definition)"
    foreach ($Path in $WriteTee.LogPaths) {
        $Path = [System.IO.DirectoryInfo]$Path
        #Ensure the directory exists and if not, create it.
        if (![System.IO.Directory]::Exists($Path)) {
            try {
                New-Item -Path $Path.Parent.FullName -Name $Path.BaseName -ItemType 'Directory' -ErrorAction Stop -Force | Out-Null
            } catch {
                continue
            }
        }
        #Check to see if there are any existing logs
        $WriteTee.ExistingLogs = Get-ChildItem -LiteralPath $Path -Filter "$(([System.IO.FileInfo]$WriteTee.LogName).BaseName)*$(([System.IO.FileInfo]$WriteTee.LogName).Extension)" |Sort-Object
        if ($WriteTee.ExistingLogs.Count -eq 0) {
            $WriteTee.LastLogName = $null
        } else {
            foreach ($ExistingLog in $WriteTee.ExistingLogs) {
                try {
                    [IO.File]::OpenWrite($ExistingLog.FullName).close() | Out-Null
                    $WriteTee.LogFile = $ExistingLog.FullName
                    break
                } catch {
                    $WriteTee.LastLogName = $ExistingLog
                    continue
                }
            }
        }
        #If no previous logs can be added to create a new one.
        if (-not $WriteTee.ContainsKey('LogFile')) {
            switch ($WriteTee.LastLogName.Name) {
                {$_ -eq $Null} {
                    $WriteTee.ExistingLogs.count
                    Write-Host Create New File
                }
                {$_ -match '.*\[[0-9]+\]\.'} {
                    Write-Host AAAAAA
                    $WriteTee.NextLogName = $WriteTee.NextLogName.FullName.Split('[]')
                    $WriteTee.NextLogName = $WriteTee.NextLogName[0] + "[" + ([int]($WriteTee.NextLogName[1]) + 1) + "]" + $WriteTee.NextLogName[2]
                }
                default {}
            }
        }
        #Determine if log file is available or not.
        if ($WriteTee.ContainsKey('LogFile')) {
            Write-Host "Function Success"
            break
        } else {
            continue
        }
    }
    return $WriteTee.LogFile
}
clear
TestLogger
clear
TestLogger

I think I just burnt myself out yesterday; a good night's sleep got me going again. Here is what I ended up with. I really hope someone else finds some use in it.
Function TestLogger {
    $WriteTee = @{}
    $WriteTee.LogName = 'WriteTee.log'
    #$WriteTee.LogName = "$(((split-path -leaf $Script:MyInvocation.MyCommand.Definition)).BaseName).Log"
    $WriteTee.LogPaths = 'C:\Windows\',
    "C:\Users\1151577373E\Documents\Powershell Scripts\AutoUpdater\",
    "$Env:UserProfile"
    #"$(split-path -parent $Script:MyInvocation.MyCommand.Definition)"
    Foreach ($Path in $WriteTee.LogPaths) {
        If ($WriteTee.ContainsKey('LogFile')) { Break }
        $Path = [System.IO.DirectoryInfo]$Path
        #Ensure the directory exists and if not, create it.
        If (![System.IO.Directory]::Exists($Path)) {
            Try {
                #Create the directory because .Net will error out if you try to create a file in a directory that doesn't exist yet.
                New-Item -Path $Path.Parent.FullName -Name $Path.BaseName -ItemType 'Directory' -ErrorAction Stop -Force |Out-Null
            } Catch {
                Continue
            }
        }#End-If
        #Check to see if there are any existing logs
        $WriteTee.ExistingLogs = Get-ChildItem -LiteralPath $Path -Filter "$(([System.IO.FileInfo]$WriteTee.LogName).BaseName)*$(([System.IO.FileInfo]$WriteTee.LogName).Extension)" |Sort-Object
        If ($WriteTee.ExistingLogs.Count -GT 0) {
            ForEach ($ExistingLog in $WriteTee.ExistingLogs) {
                Try {
                    [io.file]::OpenWrite($ExistingLog.FullName).close() |Out-Null
                    $WriteTee.LogFile = $ExistingLog.FullName
                    break
                } Catch {
                    $WriteTee.LastLogName = $ExistingLog
                    Continue
                }
            }
        }#End-If
        #If no previous logs can be added to create a new one.
        switch ($WriteTee.ExistingLogs.Count) {
            {$PSItem -EQ 0} {
                $WriteTee.TestLogFile = Join-Path -Path $Path -ChildPath $WriteTee.LogName
            }
            {$PSItem -EQ 1} {
                $WriteTee.TestLogFile = Join-Path -Path $Path -ChildPath ($WriteTee.LastLogName.basename + '[0]' + $WriteTee.LastLogName.Extension)
            }
            {$PSItem -GE 2} {
                $WriteTee.TestLogFile = $WriteTee.LastLogName.FullName.Split('[]')
                $WriteTee.TestLogFile = ($WriteTee.TestLogFile[0] + '[' + (([int]$WriteTee.TestLogFile[1]) + 1) + ']' + $WriteTee.TestLogFile[2])
            }
            Default {
                Write-Host "If you are looking for an explanation of how you got here, I can tell you I don't have one. But what I do have are a very particular lack of answers that I have aquired over a very long career that make these problems a nightmare for people like me."
                Continue
            }
        }#End-Switch
        #Last but not least, try to create the file and hope it is successful.
        Try {
            [IO.File]::Create($WriteTee.TestLogFile, 1, 'None').close() |Out-Null
            $WriteTee.LogFile = $WriteTee.TestLogFile
            Break
        } Catch {
            Continue
        }
    }#End-ForEach
    Return $WriteTee.LogFile
}

What is the cleanest way to join in one array the result of two or more calls to Get-ChildItem?

I'm facing the problem of moving and copying some items on the file system with PowerShell.
I know from experiments that, even with PowerShell v3, the cmdlets Copy-Item, Move-Item and Remove-Item cannot correctly handle reparse points such as junctions and symbolic links, and can lead to disasters if used with the -Recurse switch.
I want to prevent this eventuality. I have to handle two or more folders on each run, so I was thinking of something like this:
$Strings = @{ ... }
$ori = Get-ChildItem $OriginPath -Recurse
$dri = Get-ChildItem $DestinationPath -Recurse
$items = ($ori + $dri) | where { $_.Attributes -match 'ReparsePoint' }
if ($items.Length -gt 0)
{
    Write-Verbose ($Strings.LogExistingReparsePoint -f $items.Length)
    $items | foreach { Write-Verbose " $($_.FullName)" }
    throw ($Strings.ErrorExistingReparsePoint -f $items.Length)
}
This doesn't work because $ori and $dri can also be single items rather than arrays, in which case op_Addition will fail. Changing it to
$items = @(@($ori) + @($dri)) | where { $_.Attributes -match 'ReparsePoint' }
poses another problem, because $ori and $dri can also be $null, so I can end up with an array containing $null. When piping the joined result to Where-Object, I can again end up with $null, a single item, or an array.
The only solution that appears to work is the following, more complex code:
$items = $()
if ($ori -ne $null) { $items += @($ori) }
if ($dri -ne $null) { $items += @($dri) }
$items = $items | where { $_.Attributes -match 'ReparsePoint' }
if ($items -ne $null)
{
    Write-Verbose ($Strings.LogExistingReparsePoint -f @($items).Length)
    $items | foreach { Write-Verbose " $($_.FullName)" }
    throw ($Strings.ErrorExistingReparsePoint -f @($items).Length)
}
Is there a better approach?
I would certainly like to know whether there is a way to handle reparse points correctly with PowerShell cmdlets, but I'm much more interested in how to join and filter two or more "PowerShell collections".
I'll conclude by observing that, at present, this feature of PowerShell, the "polymorphic array", doesn't seem like much of a benefit to me.
Thanks for reading.

Just add a filter to throw out nulls. You're on the right track.
$items = @(@($ori) + @($dri)) | ? { $_ -ne $null }

I've been on Powershell 3 for a while now but from what I can tell this should work in 2.0 as well:
$items = @($ori, $dri) | %{ $_ } | ? { $_.Attributes -match 'ReparsePoint' }
Basically %{ $_ } is a foreach loop that unrolls the inner arrays by iterating over them and passing each inner element ($_) down the pipeline. Nulls will automatically be excluded from the pipeline.
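Here is a small sketch of that pattern you can run without touching the file system. New-FakeItem and the paths are hypothetical stand-ins for Get-ChildItem results, chosen to cover the three awkward cases from the question: nothing, a single item, and an array. Wrapping the whole pipeline in @( ) is an addition on my part, so the result is always an array and .Count can be used directly.

```powershell
# Hypothetical helper: builds an object with just the two properties used below.
function New-FakeItem([string] $Name, [string] $Attributes) {
    [pscustomobject]@{ FullName = $Name; Attributes = $Attributes }
}

$ori   = $null                                                # no results at all
$dri   = New-FakeItem 'C:\dst\link' 'Directory, ReparsePoint' # a single item
$third = @(                                                   # several items
    (New-FakeItem 'C:\x\a' 'Directory'),
    (New-FakeItem 'C:\x\b' 'Archive, ReparsePoint')
)

# ForEach-Object unrolls inner arrays one level; the Where-Object filter
# drops anything (including a $null entry) whose Attributes do not match.
$items = @(
    $ori, $dri, $third |
        ForEach-Object { $_ } |
        Where-Object { $_.Attributes -match 'ReparsePoint' }
)

$items.Count
$items | ForEach-Object { $_.FullName }
```

The same shape extends to any number of Get-ChildItem results: add them to the comma-separated list at the start of the pipeline.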
