I have a PowerShell script which uses Compare-Object to diff/compare a list of MD5 checksums against each other ... how can I speed this up? It's been running for hours!
$diffmd5 =
(Compare-Object -ReferenceObject $localmd5 -DifferenceObject $remotefilehash |
Where-Object { ($_.SideIndicator -eq '=>') } |
Select-Object -ExpandProperty InputObject)
Compare-Object is convenient, but indeed slow; also avoiding the pipeline altogether is important for maximizing performance.
I suggest using a [System.Collections.Generic.HashSet[T]] instance, which supports high-performance lookups in a set of unordered[1] values:[2]
# Two sample arrays
$localmd5 = 'Foo1', 'Bar1', 'Baz1'
$remotefilehash = 'foo1', 'bar1', 'bar2', 'baz1', 'more'
# Create a hash set from the local hashes.
# Make lookups case-*insensitive*.
# Note: Strongly typing the input array ([string[]]) is a must.
$localHashSet = [System.Collections.Generic.HashSet[string]]::new(
[string[]] $localmd5,
[System.StringComparer]::OrdinalIgnoreCase
)
# Loop over all remote hashes to find those not among the local hashes.
$remotefilehash.Where({ -not $localHashSet.Contains($_) })
The above yields collection 'bar2', 'more'.
Note that if case-sensitive lookups are sufficient, which is the default (for string elements), a simple cast is sufficient to construct the hash set:
$localHashSet = [System.Collections.Generic.HashSet[string]] $localmd5
Note: Your later feedback states that $remotefilehash is a hashtable(-like) collection of key-value pairs rather than a collection of mere file-hash strings, in which the keys store the hash strings. In that case:
To find just the differing hash strings (note the .Keys property access to get the array of key values):
$remotefilehash.Keys.Where({ -not $localHashSet.Contains($_) })
To find those key-value pairs whose keys are not in the hash set (note the .GetEnumerator() call to enumerate all entries (key-value pairs)):
$remotefilehash.GetEnumerator().Where({ -not $localHashSet.Contains($_.Key) })
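As a minimal, self-contained sketch of the hashtable case described above (the sample keys and the file-name values are made up for illustration; your actual data will differ):

```powershell
# Hypothetical sample data: local hashes as strings, remote hashes as hashtable keys.
$localmd5 = 'Foo1', 'Bar1', 'Baz1'
$remotefilehash = @{ foo1 = 'a.txt'; bar2 = 'b.txt'; more = 'c.txt' }

# Case-insensitive hash set built from the local hashes.
$localHashSet = [System.Collections.Generic.HashSet[string]]::new(
  [string[]] $localmd5,
  [System.StringComparer]::OrdinalIgnoreCase
)

# Hash strings (keys) that have no local counterpart.
$missingKeys = $remotefilehash.Keys.Where({ -not $localHashSet.Contains($_) })

# The corresponding key-value pairs (entries with .Key and .Value).
$missingEntries = $remotefilehash.GetEnumerator().Where({ -not $localHashSet.Contains($_.Key) })
```

Note that a hashtable does not guarantee the order of its keys, so sort the results if order matters.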
Alternatively, if the input collections are (a) of the same size and (b) have corresponding elements (that is, element 1 from one collection should be compared to element 1 from the other, and so on), using Compare-Object with -SyncWindow 0, as shown in js2010's helpful answer, with subsequent .SideIndicator filtering may be an option; to speed up the operation, the -PassThru switch should be used, which forgoes wrapping the differing objects in [pscustomobject] instances (the .SideIndicator property is then added as a NoteProperty member directly to the differing objects).
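A rough sketch of that positional approach, assuming two made-up arrays of equal size whose elements correspond by index:

```powershell
# Hypothetical sample data: same size, elements correspond by position.
$localmd5       = 'A1', 'B2', 'C3'
$remotefilehash = 'A1', 'X9', 'C3'

# -SyncWindow 0 compares position by position only;
# -PassThru emits the differing strings themselves, decorated with .SideIndicator.
$diffmd5 = Compare-Object $localmd5 $remotefilehash -SyncWindow 0 -PassThru |
  Where-Object SideIndicator -eq '=>'
```

Here $diffmd5 receives only the remote-side value that differs from its local counterpart at the same position.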
[1] There is a related type for maintaining sorted values, System.Collections.Generic.SortedSet[T], but - as of .NET 6 - no built-in type for maintaining values in input order, though you can create your own type by deriving from [System.Collections.ObjectModel.KeyedCollection[TKey, TItem]]
[2] Note that a hash set - unlike a hash table - has no values associated with its entries. A hash set is "all keys", if you will - all it supports is testing for the presence of a key == value.
By default, compare-object compares every element in the first array with every element in the second array (up to about 2 billion positions), so the order doesn't matter, but large lists would be very slow. -syncwindow 0 would be much faster but would require matches to be in the same exact positions:
Compare-Object $localmd5 $remotefilehash -syncwindow 0
As a simple demo of -syncwindow:
compare-object 1,2,3 1,3,2 -SyncWindow 0 # not equal
InputObject SideIndicator
----------- -------------
3 =>
2 <=
2 =>
3 <=
compare-object 1,2,3 1,3,2 -SyncWindow 1 # equal
compare-object 1,2,3 1,2,3 -SyncWindow 0 # equal
I feel this should be faster than Compare-Object
$result = [system.collections.generic.list[string]]::new()
foreach($hash in $remotefilehash)
{
if(-not($localmd5.contains($hash)))
{
$result.Add($hash)
}
}
The problem here is that .contains method is case sensitive, I believe all MD5 hashes have uppercase letters but if this was not the case you would need to call the .toupper() or .tolower() methods to normalize the arrays.
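One way to sidestep the case-sensitivity concern without normalizing the arrays is PowerShell's -notin operator (PSv3+), which is case-insensitive by default. A minimal sketch with made-up sample values follows; note that this is still a linear search per element, so it is not expected to be faster than .Contains():

```powershell
# Hypothetical sample data with differing case.
$localmd5       = 'ABC123', 'DEF456'
$remotefilehash = 'abc123', 'def456', 'fff999'

$result = [System.Collections.Generic.List[string]]::new()
foreach ($hash in $remotefilehash) {
  # -notin compares case-insensitively by default (use -cnotin for case-sensitive).
  if ($hash -notin $localmd5) { $result.Add($hash) }
}
```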
Let's say we have an array of objects $objects. Let's say these objects have a "Name" property.
This is what I want to do
$results = @()
$objects | %{ $results += $_.Name }
This works, but can it be done in a better way?
If I do something like:
$results = $objects | select Name
$results is an array of objects having a Name property. I want $results to contain an array of Names.
Is there a better way?
I think you might be able to use the ExpandProperty parameter of Select-Object.
For example, to get the list of the current directory and just have the Name property displayed, one would do the following:
ls | select -Property Name
This does not return plain strings, however: Select-Object -Property outputs objects that have just a Name property. You can always inspect the type coming through the pipeline by piping to Get-Member (alias gm).
ls | select -Property Name | gm
So, to expand the object to be that of the type of property you're looking at, you can do the following:
ls | select -ExpandProperty Name
In your case, you can just do the following to have a variable be an array of strings, where the strings are the Name property:
$objects = ls | select -ExpandProperty Name
As an even easier solution, you could just use:
$results = $objects.Name
Which should fill $results with an array of all the 'Name' property values of the elements in $objects.
To complement the preexisting, helpful answers with guidance of when to use which approach and a performance comparison.
Outside of a pipeline[1], use (requires PSv3+):
$objects.Name # returns .Name property values from all objects in $objects
as demonstrated in rageandqq's answer, which is both syntactically simpler and much faster.
Accessing a property at the collection level to get its elements' values as an array (if there are 2 or more elements) is called member-access enumeration and is a PSv3+ feature.
Alternatively, in PSv2, use the foreach statement, whose output you can also assign directly to a variable: $results = foreach ($obj in $objects) { $obj.Name }
If collecting all output from a (pipeline) command in memory first is feasible, you can also combine pipelines with member-access enumeration; e.g.:
(Get-ChildItem -File | Where-Object Length -lt 1gb).Name
Tradeoffs:
Both the input collection and output array must fit into memory as a whole.
If the input collection is itself the result of a command (pipeline) (e.g., (Get-ChildItem).Name), that command must first run to completion before the resulting array's elements can be accessed.
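A small sketch of member-access enumeration and the foreach alternative described above, using made-up [pscustomobject] instances; it also shows the "2 or more elements" caveat:

```powershell
# Hypothetical sample objects.
$objects = [pscustomobject] @{ Name = 'one' }, [pscustomobject] @{ Name = 'two' }

# 2+ elements: member-access enumeration returns an array of the values.
$results = $objects.Name

# Single element: the value itself is returned, not a one-element array.
$single = @($objects[0]).Name

# The foreach statement's output can be assigned directly, too.
$viaForeach = foreach ($obj in $objects) { $obj.Name }
```

If you always need an array, wrap the expression in @(...), e.g. @($objects.Name).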
In a pipeline, in case you must pass the results to another command, notably if the original input doesn't fit into memory as a whole, use: $objects | Select-Object -ExpandProperty Name
The need for -ExpandProperty is explained in Scott Saad's answer (you need it to get only the property value).
You get the usual pipeline benefits of the pipeline's streaming behavior, i.e. one-by-one object processing, which typically produces output right away and keeps memory use constant (unless you ultimately collect the results in memory anyway).
Tradeoff:
Use of the pipeline is comparatively slow.
For small input collections (arrays), you probably won't notice the difference, and, especially on the command line, sometimes being able to type the command easily is more important.
Here is an easy-to-type alternative, which, however, is the slowest approach; it uses ForEach-Object via its built-in alias, %, with simplified syntax (again, PSv3+); e.g., the following solution is easy to append to an existing command:
$objects | % Name # short for: $objects | ForEach-Object -Process { $_.Name }
Note: Use of the pipeline is not the primary reason this approach is slow, it is the inefficient implementation of the ForEach-Object (and Where-Object) cmdlets, up to at least PowerShell 7.2. This excellent blog post explains the problem; it led to feature request GitHub issue #10982; the following workaround greatly speeds up the operation (only somewhat slower than a foreach statement, and still faster than .ForEach()):
# Speed-optimized version of the above.
# (Use `&` instead of `.` to run in a child scope)
$objects | . { process { $_.Name } }
The PSv4+ .ForEach() array method, more comprehensively discussed in this article, is yet another, well-performing alternative, but note that it requires collecting all input in memory first, just like member-access enumeration:
# By property name (string):
$objects.ForEach('Name')
# By script block (more flexibility; like ForEach-Object)
$objects.ForEach({ $_.Name })
This approach is similar to member-access enumeration, with the same tradeoffs, except that pipeline logic is not applied; it is marginally slower than member-access enumeration, though still noticeably faster than the pipeline.
For extracting a single property value by name (string argument), this solution is on par with member-access enumeration (though the latter is syntactically simpler).
The script-block variant ({ ... }) allows arbitrary transformations; it is a faster - all-in-memory-at-once - alternative to the pipeline-based ForEach-Object cmdlet (%).
Note: The .ForEach() array method, like its .Where() sibling (the in-memory equivalent of Where-Object), always returns a collection (an instance of [System.Collections.ObjectModel.Collection[psobject]]), even if only one output object is produced.
By contrast, member-access enumeration, Select-Object, ForEach-Object and Where-Object return a single output object as-is, without wrapping it in a collection (array).
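To illustrate the difference in return types, here is a minimal sketch with a single made-up input object:

```powershell
# Hypothetical single-element input.
$objects = @([pscustomobject] @{ Name = 'only' })

# .ForEach() wraps even a single result in a collection.
$viaMethod = $objects.ForEach('Name')

# Member-access enumeration returns the single value as-is.
$viaEnumeration = $objects.Name

$viaMethod.GetType().FullName       # a Collection[psobject] type
$viaEnumeration.GetType().FullName  # System.String
```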
Comparing the performance of the various approaches
Here are sample timings for the various approaches, based on an input collection of 10,000 objects, averaged across 10 runs; the absolute numbers aren't important and vary based on many factors, but they should give you a sense of relative performance (the timings come from a single-core Windows 10 VM):
Important
The relative performance varies based on whether the input objects are instances of regular .NET types (e.g., as output by Get-ChildItem) or [pscustomobject] instances (e.g., as output by ConvertFrom-Csv).
The reason is that [pscustomobject] properties are dynamically managed by PowerShell, and it can access them more quickly than the regular properties of a (statically defined) regular .NET type. Both scenarios are covered below.
The tests use already-in-memory-in-full collections as input, so as to focus on the pure property extraction performance. With a streaming cmdlet / function call as the input, performance differences will generally be much less pronounced, as the time spent inside that call may account for the majority of the time spent.
For brevity, alias % is used for the ForEach-Object cmdlet.
General conclusions, applicable to both regular .NET type and [pscustomobject] input:
The member-enumeration ($collection.Name) and foreach ($obj in $collection) solutions are by far the fastest, by a factor of 10 or more faster than the fastest pipeline-based solution.
Surprisingly, % Name performs much worse than % { $_.Name } - see this GitHub issue.
PowerShell Core consistently outperforms Windows PowerShell here.
Timings with regular .NET types:
PowerShell Core v7.0.0-preview.3
Factor Command                                       Secs (10-run avg.)
------ -------                                       ------------------
1.00   $objects.Name                                 0.005
1.06   foreach($o in $objects) { $o.Name }           0.005
6.25   $objects.ForEach('Name')                      0.028
10.22  $objects.ForEach({ $_.Name })                 0.046
17.52  $objects | % { $_.Name }                      0.079
30.97  $objects | Select-Object -ExpandProperty Name 0.140
32.76  $objects | % Name                             0.148
Windows PowerShell v5.1.18362.145
Factor Command                                       Secs (10-run avg.)
------ -------                                       ------------------
1.00   $objects.Name                                 0.012
1.32   foreach($o in $objects) { $o.Name }           0.015
9.07   $objects.ForEach({ $_.Name })                 0.105
10.30  $objects.ForEach('Name')                      0.119
12.70  $objects | % { $_.Name }                      0.147
27.04  $objects | % Name                             0.312
29.70  $objects | Select-Object -ExpandProperty Name 0.343
Conclusions:
In PowerShell Core, .ForEach('Name') clearly outperforms .ForEach({ $_.Name }). In Windows PowerShell, curiously, the latter is faster, albeit only marginally so.
Timings with [pscustomobject] instances:
PowerShell Core v7.0.0-preview.3
Factor Command                                       Secs (10-run avg.)
------ -------                                       ------------------
1.00   $objects.Name                                 0.006
1.11   foreach($o in $objects) { $o.Name }           0.007
1.52   $objects.ForEach('Name')                      0.009
6.11   $objects.ForEach({ $_.Name })                 0.038
9.47   $objects | Select-Object -ExpandProperty Name 0.058
10.29  $objects | % { $_.Name }                      0.063
29.77  $objects | % Name                             0.184
Windows PowerShell v5.1.18362.145
Factor Command                                       Secs (10-run avg.)
------ -------                                       ------------------
1.00   $objects.Name                                 0.008
1.14   foreach($o in $objects) { $o.Name }           0.009
1.76   $objects.ForEach('Name')                      0.015
10.36  $objects | Select-Object -ExpandProperty Name 0.085
11.18  $objects.ForEach({ $_.Name })                 0.092
16.79  $objects | % { $_.Name }                      0.138
61.14  $objects | % Name                             0.503
Conclusions:
Note how with [pscustomobject] input .ForEach('Name') by far outperforms the script-block based variant, .ForEach({ $_.Name }).
Similarly, [pscustomobject] input makes the pipeline-based Select-Object -ExpandProperty Name faster, in Windows PowerShell virtually on par with .ForEach({ $_.Name }), but in PowerShell Core still about 50% slower.
In short: With the odd exception of % Name, with [pscustomobject] the string-based methods of referencing the properties outperform the scriptblock-based ones.
Source code for the tests:
Note:
Download function Time-Command from this Gist to run these tests.
Assuming you have looked at the linked code to ensure that it is safe (which I can personally assure you of, but you should always check), you can install it directly as follows:
irm https://gist.github.com/mklement0/9e1f13978620b09ab2d15da5535d1b27/raw/Time-Command.ps1 | iex
Set $useCustomObjectInput to $true to measure with [pscustomobject] instances instead.
$count = 1e4 # max. input object count == 10,000
$runs = 10 # number of runs to average
# Note: Using [pscustomobject] instances rather than instances of
# regular .NET types changes the performance characteristics.
# Set this to $true to test with [pscustomobject] instances below.
$useCustomObjectInput = $false
# Create sample input objects.
if ($useCustomObjectInput) {
# Use [pscustomobject] instances.
$objects = 1..$count | % { [pscustomobject] @{ Name = "$foobar_$_"; Other1 = 1; Other2 = 2; Other3 = 3; Other4 = 4 } }
} else {
# Use instances of a regular .NET type.
# Note: The actual count of files and folders in your file-system
# may be less than $count
$objects = Get-ChildItem / -Recurse -ErrorAction Ignore | Select-Object -First $count
}
Write-Host "Comparing property-value extraction methods with $($objects.Count) input objects, averaged over $runs runs..."
# An array of script blocks with the various approaches.
$approaches = { $objects | Select-Object -ExpandProperty Name },
{ $objects | % Name },
{ $objects | % { $_.Name } },
{ $objects.ForEach('Name') },
{ $objects.ForEach({ $_.Name }) },
{ $objects.Name },
{ foreach($o in $objects) { $o.Name } }
# Time the approaches and sort them by execution time (fastest first):
Time-Command $approaches -Count $runs | Select Factor, Command, Secs*
[1] Technically, even a command without |, the pipeline operator, uses a pipeline behind the scenes, but for the purpose of this discussion using the pipeline refers only to commands that use |, the pipeline operator, and therefore by definition involve multiple commands.
Caution, member enumeration only works if the collection itself has no member of the same name. So if you had an array of FileInfo objects, you couldn't get an array of file lengths by using
$files.length # evaluates to array length
And before you say "well obviously", consider this. If you had an array of objects with a capacity property then
$objarr.capacity
would work fine UNLESS $objarr were actually not an [Array] but, for example, an [ArrayList]. So before using member enumeration you might have to look inside the black box containing your collection.
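The name collision can be reproduced without real files; the following sketch uses made-up objects with a Length property and shows two ways to get the element-level values anyway:

```powershell
# Hypothetical objects whose property name collides with the array's own .Length.
$items = [pscustomobject] @{ Length = 10 }, [pscustomobject] @{ Length = 20 }

# Resolves to the array's own member: the number of elements.
$arrayLength = $items.Length

# Explicit enumeration bypasses the collision.
$elementLengths = foreach ($item in $items) { $item.Length }
$viaPipeline = $items | Select-Object -ExpandProperty Length
```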
(Note to moderators: this should be a comment on rageandqq's answer but I don't yet have enough reputation.)
I learn something new every day! Thank you for this. I was trying to achieve the same. I was directly doing this:
$ListOfGGUIDs = $objects.{Object GUID}
Which basically made my variable an object again! I later realized I needed to define it first as an empty array,
$ListOfGGUIDs = @()
So I have the following code that ingests AD users on a domain controller. The following throws an error:
# User Props to select
$user_props = @(
'Name',
'DistinguishedName',
'SamAccountName',
'Enabled',
'SID'
)
# Get AD groups an AD user is a member of
$user_groups = @{ label = 'GroupMemberships'; expression = { (Get-ADPrincipalGroupMembership -Identity $_.DistinguishedName).Name } }
# Get AD Users
$users = Get-ADUser -Filter * -Property $user_props | Select-Object $user_props, $user_groups -ErrorAction Stop -ErrorVariable _error
However, if I were to change $users to the following:
$users = Get-ADUser -Filter * -Property $user_props | Select-Object Name, DistinguishedName, SamAccountName, Enabled, SID, $user_groups -ErrorAction Stop -ErrorVariable _error
I no longer get this error. Is there a way I can define $user_props such that I don't need to type out each property and still use my custom calculated property $user_groups?
I believe the issue has to do with mixing an array ($user_props) with a hashtable ($user_groups) but I'm unsure how to best write this. Thank you for the help!
The easiest way to "concatenate" two variables into one flat array is to use the @(...) array subexpression operator:
... |Select-Object @($user_props;$user_groups) ...
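To try this without a domain controller, here is a minimal sketch that substitutes a made-up [pscustomobject] for the AD user and a trivial calculated property (NameLength) for the group lookup; both are stand-ins for illustration only:

```powershell
# Hypothetical stand-ins for the AD data.
$user_props  = 'Name', 'Enabled'
$user_groups = @{ label = 'NameLength'; expression = { $_.Name.Length } }
$user = [pscustomobject] @{ Name = 'alice'; Enabled = $true; Extra = 1 }

# @(...) collects the property names and the calculated property in one flat array.
$result = $user | Select-Object @($user_props; $user_groups)

# $result now has exactly these properties: Name, Enabled, NameLength
$result.psobject.Properties.Name
```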
Since this issue keeps coming up, let me complement Mathias R. Jessen's effective solution with some background information:
Select-Object's (potentially positionally implied) -Property parameter requires a flat array of property names and/or calculated properties (for brevity, both referred to as just property names below).
Therefore, if you need to combine two variables containing arrays of property names, or combine literally enumerated property name(s) with such variables, you must explicitly construct a flat array.
In particular, simply placing , between your array variables does not work, because it creates a jagged (nested) array:
# Two arrays with property names.
$user_props = @('propA', 'propB')
$user_groups = @('grpA', 'grpB')
# !! WRONG: This passes a *jagged* array, not a flat one.
# -> ERROR: "Cannot convert System.Object[] to one of the following types
# {System.String, System.Management.Automation.ScriptBlock}."
'foo' | Select-Object $user_props, $user_groups
# !! WRONG:
# The same applies if you tried to combine one or more *literal* property names
# with an array variable.
'foo' | Select-Object 'propA', $user_groups
'foo' | Select-Object $user_groups, 'propA', 'propB'
That is, $user_props, $user_groups effectively passes
@( @('propA', 'propB'), @('grpA', 'grpB') ), i.e. a jagged (nested) array,
whereas what you need to pass is
@('propA', 'propB', 'grpA', 'grpB'), i.e. a flat array.
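The difference between the two shapes is easy to verify by counting elements; a short sketch using the same sample names as above:

```powershell
$user_props  = @('propA', 'propB')
$user_groups = @('grpA', 'grpB')

# The , operator nests the two arrays: 2 elements, each itself an array.
$jagged = $user_props, $user_groups

# @(...) with ; enumerates both arrays into one flat array: 4 elements.
$flat = @($user_props; $user_groups)

$jagged.Count           # 2
$jagged[0] -is [array]  # True
$flat.Count             # 4
```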
Mathias' solution is convenient in that you needn't know or care whether $user_props and $user_groups contain arrays or just a single property name, due to how @(...), the array-subexpression operator works - the result will be a flat array:
# Note how the variable references are *separate statements*, whose
# output @(...) collects in an array:
'foo' | Select-Object @($user_props; $user_groups)
# Ditto, with a literal property name.
# Note the need to *quote* the literal name in this case.
'foo' | Select-Object @('propA'; $user_groups)
In practice it won't make a difference for this use case, so this is a convenient and pragmatic solution, but generally it's worth noting that @(...) enumerates array variables used as statements inside it, and then collects the results in a new array. That is, both $user_props and $user_groups are sent to the pipeline element by element, and the resulting, combined elements are collected in a new array.
A direct way to flatly concatenate arrays (or append a single element to an array) is to use the + operator with (at least) an array-valued LHS. This, of necessity, returns a new array that is a copy of the LHS array with the element(s) of the RHS directly appended:
# Because $user_props is an *array*, "+" returns an array with the RHS
# element(s) appended to the LHS elements.
'foo' | Select-Object ($user_props + $user_groups)
If you're not sure if $user_props is an array, you can simply cast to [array], which also works with a single, literal property name:
# The [array] cast ensures that $user_props is treated as an array, even if it isn't one.
# Note:
# 'foo' | Select-Object (@($user_props) + $user_groups)
# would work too, but would again needlessly enumerate the array first.
'foo' | Select-Object ([array] $user_props + $user_groups)
# Ditto, with a single, literal property name
'foo' | Select-Object ([array] 'propA' + $user_groups)
# With *multiple* literal property names (assuming they come first), the
# cast is optional:
'foo' | Select-Object ('propA', 'propB' + $user_groups)
Note:
The above uses (...), the grouping operator, in order to pass expressions as arguments to a command - while you frequently see $(...), the subexpression operator used instead, this is not necessary and can have unwanted side effects - see this answer.
@(...) isn't strictly needed to declare array literals, such as @('foo', 'bar') - 'foo', 'bar' is sufficient, though you may prefer enclosing in @(...) for visual clarity. In argument parsing mode, quoting is optional for simple strings, so that Write-Output foo, name is the simpler alternative to Write-Output @('foo', 'name').
,, the array constructor operator, has perhaps surprisingly high precedence, so that 1, 2 + 3 is parsed as (1, 2) + 3, resulting in 1, 2, 3.
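The precedence rules mentioned in these notes can be checked with a few one-liners; the sketch below reuses the sample property names from above and adds a contrasting case without an array-valued LHS:

```powershell
$user_groups = 'grpA', 'grpB'

# , binds more tightly than +, so this is (1, 2) + 3.
$a = 1, 2 + 3

# Two literals on the LHS already form an array, so + appends.
$b = 'propA', 'propB' + $user_groups

# A single literal needs the [array] cast to get array concatenation ...
$c = [array] 'propA' + $user_groups

# ... otherwise + performs *string* concatenation with the stringified array.
$d = 'propA' + $user_groups

$a.Count; $b.Count; $c.Count   # 3, 4, 3
$d                             # a single string, not an array
```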
In PowerShell, if $dt is a datatable, I am used to using foreach() to do row-by-row operations. For example...
foreach ($tmpRow in $dt) {
Write-Host $tmpRow.blabla
}
I just want to get the first row (of n columns). I could introduce a counter $i and just break the foreach loop on the first iteration, but that seems clunky. Is there a more direct way of achieving this?
For a collection (array) that is already in memory, use indexing, namely [0]:
Note: Normally, $dt[0] should suffice, but in this case the index must be applied to the .Rows property, as Theo advises:
$dt.Rows[0].blabla
Given that PowerShell automatically enumerates a System.Data.DataTable by enumerating the System.Data.DataRow instances stored in its .Rows property - both in the pipeline and in a foreach loop, as in your code - the need to specify .Rows explicitly for indexing is surprising.
With $dt containing a System.Data.DataTable instance, $dt[0] is actually the same as just $dt itself, because PowerShell in this context considers $dt a single object, and generally supports indexing even into such single objects, in the interest of unified treatment of scalars and arrays - see this answer for background information.
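A small sketch for exploring this behavior yourself, assuming a made-up two-column table (the column names and values are placeholders):

```powershell
# Hypothetical sample table.
$dt = [System.Data.DataTable]::new()
[void] $dt.Columns.Add('ID', [int])
[void] $dt.Columns.Add('Name', [string])
[void] $dt.Rows.Add(1, 'first')
[void] $dt.Rows.Add(2, 'second')

# Indexing into .Rows yields the first DataRow.
$firstRow = $dt.Rows[0]
$firstRow.Name

# Indexing into the table itself: inspect the type to see what you actually got.
$viaTableIndex = $dt[0]
$viaTableIndex.GetType().FullName
```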
For command output, use Select-Object -First 1. Using the example of Invoke-Sqlcmd:
(Invoke-Sqlcmd ... | Select-Object -First 1).blabla
Note: Since Invoke-Sqlcmd by default outputs individual System.Data.DataRow instances (one by one), you can directly access property .blabla on the result.
The advantage of using Select-Object -First 1 is that it short-circuits the pipeline and returns once the first output object has been received, obviating the need to retrieve further objects.
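The short-circuiting can be observed with a counter in an upstream pipeline segment; this is only an illustrative sketch in which a simple number range stands in for a real data-producing command:

```powershell
# Count how many input objects the upstream segment actually processes.
$processed = 0
$first = 1..1000 | ForEach-Object { $processed++; $_ } | Select-Object -First 1

$first       # the first object
$processed   # how many objects were processed before the pipeline was stopped
```

With Select-Object -First 1, the upstream command is stopped as soon as the first object has been received, so the counter stays far below 1000.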
PowerShell automatically enumerates all rows when you pipe a DataTable, so you could pipe it to Select-Object -First 1:
# set up sample table
$dt = [System.Data.DataTable]::new()
[void]$dt.Columns.Add('ID', [int])
[void]$dt.Columns.Add('Name', [string])
# initialize with 2 rows
[void]$dt.Rows.Add(1, "Clive")
[void]$dt.Rows.Add(2, "Mathias")
# enumerate only 1 row
foreach ($tmpRow in $dt |Select-Object -First 1) {
Write-Host "Row with ID '$($tmpRow.ID)' has name '$($tmpRow.Name)'"
}
Expected screen buffer output:
Row with ID '1' has name 'Clive'
What's the best way to rewrite the following PowerShell code that compares two lists of files, ensuring they have the same (or greater) file count, and that the second list contains every file in the first list:
$noNewFiles = $NewFiles.Count -ge $OldFiles.Count
foreach ($oldFile in $OldFiles){
if (!$NewFiles.Contains($oldFile)) {
return $false
}
}
PSv3+ syntax (? is a built-in alias for the Where-Object cmdlet):
(Compare-Object $NewFiles $OldFiles | ? SideIndicator -eq '=>').Count -eq 0
More efficient PSv4+ alternative, using the Where() method (as suggested by Brian (the OP) himself):
(Compare-Object $NewFiles $OldFiles).Where({ $_.SideIndicator -eq '=>' }).Count -eq 0
By default, Compare-Object only returns differences between two sets and outputs objects whose .SideIndicator property indicates the set that an element is unique to.
Since string value => indicates an element that is unique to the 2nd set (RHS), we can filter the differences down to elements unique to the 2nd set, so if their count is 0, the implication is that there are no elements unique to the 2nd set.
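A worked sketch with made-up file names; wrapping the Compare-Object call in @(...) is an extra precaution added here so that .Where() is always called on an array, even when there are no differences at all:

```powershell
# Hypothetical file lists.
$NewFiles = 'a.txt', 'b.txt', 'c.txt'
$OldFiles = 'a.txt', 'b.txt'

# All old files are present among the new ones: no '=>' differences.
$allOldPresent = @(Compare-Object $NewFiles $OldFiles).Where({ $_.SideIndicator -eq '=>' }).Count -eq 0

# 'z.txt' exists only in the old list, so one '=>' difference is reported.
$OldFiles2 = 'a.txt', 'z.txt'
$allOldPresent2 = @(Compare-Object $NewFiles $OldFiles2).Where({ $_.SideIndicator -eq '=>' }).Count -eq 0
```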
Side note:
How "sameness" (equality) is determined depends on the data type of the elements.
A pitfall is that instances of reference types are compared by their .ToString() values, which can result in disparate objects being considered equal. For instance, Compare-Object @{ one=1 } @{ two=2 } produces no output.
I have the following code that pulls in some server information from a text file and spits it into a hashtable.
Get-Content $serverfile | Foreach-Object {
if($_ -match '^([^\W+]+)\s+([^\.+]+)')
{
$sh[$matches[1]] = $matches[2]
}
}
$sh.GetEnumerator()| sort -Property Name
This produces the following:
Name Value
---- -----
Disk0 40
Disk1 40
Disk2 38
Disk3 43
Memory 4096
Name Value
Number_of_disks 1
Number_of_network_cards 2
Number_of_processors 1
ServerName WIN02
Depending on the server there may be one Disk0 or many more.
My challenge here is to pull each Disk* value from each of the varying number of Disk keys and return the values in a comma separated list, for example;
$disks = 40,40,38,43
I have tried varying approaches to this problem however none have met the criteria of being dynamic and including the ',' after each disk.
Any help would be appreciated.
I assume that when you say "Depending on the server there may be one Disk0 or many more", you mean "one Disk or many more", each with a different number? You can't have more than one Disk0, because key names can't be duplicated in a hash.
This will give you a list of all the hash values for keys starting with "Disk":
$sh.Keys | ?{$_ -match '^Disk'} | %{$sh.$_}
If you actually want to get a comma-separated list (a single string value), you can use the -join operator:
$disks = ($sh.Keys | ?{$_ -match '^Disk'} | %{$sh.$_}) -join ','
However, if the reason you want a comma-separated list is in order to get an array of the values, you don't really need the comma-separated list; just assign the results (which are already an array) to the variable:
$disks = $sh.Keys | ?{$_ -match '^Disk'} | %{$sh.$_}
Note, BTW, that hashes are not ordered. There's no guarantee that the order of the keys listed will be the same as the order in which you added them or in ascending alphanumeric order. So, in the above example, your result could be 38,40,43,40. If order does matter (i.e. you're counting on the values in $disks to be in the order of their respective Disk numbers), you have two options.
Filter the listing of the keys through Sort-Object:
$sh.Keys | ?{$_ -match '^Disk'} | sort | %{$sh.$_}
(You can put the | sort between $sh.Keys and | ?{..., but it's more efficient this way...which makes little difference here but would matter with larger data sets.)
Use an ordered dictionary, which functions pretty much the same as a hash, but maintains the keys in the order added:
$sh = New-Object System.Collections.Specialized.OrderedDictionary
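Putting the ordered dictionary together with the filtering and -join shown earlier, a minimal sketch with hard-coded sample entries (in place of the values parsed from $serverfile) might look like this:

```powershell
# Ordered dictionary: keys are kept in insertion order.
$sh = New-Object System.Collections.Specialized.OrderedDictionary

# Hypothetical entries, added in the order they would appear in the file.
$sh['Disk0']  = 40
$sh['Disk1']  = 40
$sh['Memory'] = 4096
$sh['Disk2']  = 38
$sh['Disk3']  = 43

# Values of all Disk* keys, in insertion order, as a comma-separated string.
$disks = ($sh.Keys | Where-Object { $_ -match '^Disk' } | ForEach-Object { $sh[$_] }) -join ','
$disks   # 40,40,38,43
```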