I have trouble getting the index of the current element when multiple elements are exactly the same object:
$b = "A","D","B","D","C","E","D","F"
$b | ? { $_ -contains "D" }
Alternative version:
$b = "A","D","B","D","C","E","D","F"
[Array]::FindAll($b, [Predicate[String]]{ $args[0] -contains "D" })
This will return:
D
D
D
But this code:
$b | % { $b.IndexOf("D") }
Alternative version:
[Array]::FindAll($b, [Predicate[String]]{ $args[0] -contains "D" }) | % { $b.IndexOf($_) }
Returns:
1
1
1
So it always points at the index of the first element. How do I get the indexes of the other elements?
You can do this:
$b = "A","D","B","D","C","E","D","F"
(0..($b.Count-1)) | where {$b[$_] -eq 'D'}
1
3
6
mjolinor's answer is conceptually elegant, but slow with large arrays, presumably due to having to build a parallel array of indices first (which is also memory-inefficient).
It is conceptually similar to the following LINQ-based solution (PSv3+), which is more memory-efficient and about twice as fast, but still slow:
$arr = 'A','D','B','D','C','E','D','F'
[Linq.Enumerable]::Where(
[Linq.Enumerable]::Range(0, $arr.Length),
[Func[int, bool]] { param($i) $arr[$i] -eq 'D' }
)
While any PowerShell looping solution is ultimately slow compared to a compiled language, the following alternative, while more verbose, is still much faster with large arrays:
PS C:\> & { param($arr, $val)
$i = 0
foreach ($el in $arr) { if ($el -eq $val) { $i } ++$i }
} ('A','D','B','D','C','E','D','F') 'D'
1
3
6
Note:
Perhaps surprisingly, this solution is even faster than Matt's solution, which calls [array]::IndexOf() in a loop instead of enumerating all elements.
Use of a script block (invoked with call operator & and arguments), while not strictly necessary, is used to prevent polluting the enclosing scope with helper variable $i.
The foreach statement is faster than the Foreach-Object cmdlet (whose built-in aliases are % and, confusingly, also foreach).
Simply (implicitly) outputting $i for each match makes PowerShell collect multiple results in an array.
If only one index is found, you'll get a scalar [int] instance instead; wrap the whole command in @(...) to ensure that you always get an array.
While $i by itself outputs the value of $i, ++$i by design does NOT (though you could use (++$i) to achieve that, if needed).
Unlike Array.IndexOf(), PowerShell's -eq operator is case-insensitive by default; for case-sensitivity, use -ceq instead.
It's easy to turn the above into a (simple) function (note that the parameters are purposely untyped, for flexibility):
function get-IndicesOf($Array, $Value) {
$i = 0
foreach ($el in $Array) {
if ($el -eq $Value) { $i }
++$i
}
}
# Sample call
PS C:\> get-IndicesOf ('A','D','B','D','C','E','D','F') 'D'
1
3
6
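The scalar-versus-array point from the notes can be illustrated with a small sketch. It repeats the function under the name Get-IndicesOf and uses made-up sample arrays; the variable names $single, $wrapped, and $none are only for illustration.

```powershell
function Get-IndicesOf($Array, $Value) {
    $i = 0
    foreach ($el in $Array) {
        if ($el -eq $Value) { $i }
        ++$i
    }
}

$single  = Get-IndicesOf ('A','B','C') 'B'      # one match: a scalar [int]
$wrapped = @(Get-IndicesOf ('A','B','C') 'B')   # one match, forced into an array
$none    = @(Get-IndicesOf ('A','B','C') 'Z')   # no match: an empty array

$single.GetType().Name    # Int32
$wrapped.GetType().Name   # Object[]
$wrapped.Count            # 1
$none.Count               # 0
```

Wrapping the call in @(...) at the call site keeps the function itself simple while still letting callers rely on .Count and indexing.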
You would still need to loop with the static methods from [array], but if you are still curious, something like this would work.
$b = "A","D","B","D","C","E","D","F"
$results = @()
$singleIndex = -1
Do{
$singleIndex = [array]::IndexOf($b,"D",$singleIndex + 1)
If($singleIndex -ge 0){$results += $singleIndex}
}While($singleIndex -ge 0)
$results
1
3
6
Loop until no match is found. Start by initializing $singleIndex to -1 (which is also what a failed search returns), so that the first search begins at index 0. Each time a match is found, add its index to the results array.
Related
I would like to be able to add functionality to a PowerShell script to move one spot forwards or one spot backwards when in a foreach loop. While looping through a foreach statement, is it possible to move forwards and backwards or to identify what point in the array the current item lies?
Edit - the items in the array are files, not numbers
Yes - it's called a for loop!
For data structures that we can index into (arrays, lists, etc.), the foreach loop statement can easily be translated to a for loop statement, like so:
# This foreach loop statement
foreach($item in $array){
Do-Something $item
}
# ... is functionally identical to this for loop statement
for($i = 0; $i -lt $array.Length; $i++){
$item = $array[$i]
Do-Something $item
}
Since we have direct access to the current index (via $i) in the for loop, we can now use that as an offset to "look around" that position in the array:
$array = @(1,2,3,4)
for($i = 0; $i -lt $array.Length;$i++)
{
$item = $array[$i]
if($i -gt 0){
# If our index position $i is greater than zero,
# then there must be _at least_ 1 item "behind" us
$behind = $array[$i - 1]
}
if($i -lt ($array.Length - 1)){
# Not at end of array, we can also "look ahead"
$ahead = $array[$i + 1]
}
}
Be careful not to use an $i value of $array.Length or greater (which results in $null) or less than 0 (in which case PowerShell starts reading the array backwards from the end!)
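A short sketch of those two edge cases, assuming default settings (no strict mode). The sample array and the helper name Get-Neighbors are hypothetical, introduced only for illustration.

```powershell
$array = 'a', 'b', 'c'

$array[-1]             # c    (negative indices count from the end)
$null -eq $array[3]    # True (an index past the end yields $null, not an error)

# Hypothetical helper: returns the neighbors of position $Index,
# using $null where there is no neighbor on that side.
function Get-Neighbors($Array, [int] $Index) {
    $behind = if ($Index -gt 0) { $Array[$Index - 1] }
    $ahead  = if ($Index -lt $Array.Count - 1) { $Array[$Index + 1] }
    [pscustomobject]@{ Behind = $behind; Ahead = $ahead }
}

$first = Get-Neighbors $array 0    # Behind is $null, Ahead is 'b'
$last  = Get-Neighbors $array 2    # Behind is 'b', Ahead is $null
```

Guarding both sides explicitly avoids silently picking up the last element when $i - 1 becomes -1.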
The increase assignment operator (+=) is often used in [PowerShell] questions and answers on the StackOverflow site to construct a collection of objects, e.g.:
$Collection = @()
1..$Size | ForEach-Object {
$Collection += [PSCustomObject]@{Index = $_; Name = "Name$_"}
}
Yet it appears to be a very inefficient operation.
Is it Ok to generally state that the increase assignment operator (+=) should be avoided for building an object collection in PowerShell?
Yes, the increase assignment operator (+=) should be avoided for building an object collection, see also: PowerShell scripting performance considerations.
Apart from the fact that using the += operator usually requires more statements (because of the array initialization = @()) and encourages storing the whole collection in memory rather than pushing it immediately into the pipeline, it is inefficient.
The reason it is inefficient is because every time you use the += operator, it will just do:
$Collection = $Collection + $NewObject
Because arrays are immutable in terms of element count, the whole collection will be recreated with every iteration.
The correct PowerShell syntax is:
$Collection = 1..$Size | ForEach-Object {
[PSCustomObject]@{Index = $_; Name = "Name$_"}
}
Note: as with other cmdlets, if there is just one item (iteration), the output will be a scalar and not an array. To force it to an array, you might either use the [Array] type: [Array]$Collection = 1..$Size | ForEach-Object { ... } or use the array subexpression operator @( ): $Collection = @(1..$Size | ForEach-Object { ... })
It is even recommended not to store the results in a variable ($a = ...) at all, but to pass them immediately into the pipeline to save memory, e.g.:
1..$Size | ForEach-Object {
[PSCustomObject]@{Index = $_; Name = "Name$_"}
} | Export-Csv .\Outfile.csv
Note: Using the System.Collections.ArrayList class could also be considered, this is generally almost as fast as the PowerShell pipeline but the disadvantage is that it consumes a lot more memory than (properly) using the PowerShell pipeline.
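As a rough sketch of the list-based alternative: the example below uses System.Collections.Generic.List[object] rather than ArrayList. That substitution is mine, chosen because its Add() method returns nothing, whereas ArrayList.Add() returns the new index, which would otherwise end up in the output. $Size = 1000 is an arbitrary value, and the ::new() syntax assumes PowerShell 5 or later.

```powershell
$Size = 1000   # arbitrary sample size

# A resizable list grows in place instead of being recreated on every addition.
$List = [System.Collections.Generic.List[object]]::new()
foreach ($i in 1..$Size) {
    $List.Add([pscustomobject]@{ Index = $i; Name = "Name$i" })
}

$List.Count      # 1000
$List[0].Name    # Name1
```

This still holds the whole collection in memory, so it does not replace streaming through the pipeline when the results are only passed on.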
see also: Fastest Way to get a uniquely index item from the property of an array and Array causing 'system.outofmemoryexception'
Performance measurement
To show how performance degrades as the collection size grows, consider the following test results:
1..20 | ForEach-Object {
    $size = 1000 * $_
    $Performance = @{Size = $Size}
    $Performance.Pipeline = (Measure-Command {
        $Collection = 1..$Size | ForEach-Object {
            [PSCustomObject]@{Index = $_; Name = "Name$_"}
        }
    }).Ticks
    $Performance.Increase = (Measure-Command {
        $Collection = @()
        1..$Size | ForEach-Object {
            $Collection += [PSCustomObject]@{Index = $_; Name = "Name$_"}
        }
    }).Ticks
    [pscustomobject]$Performance
} | Format-Table *,@{n='Factor'; e={$_.Increase / $_.Pipeline}; f='0.00'} -AutoSize
Size Increase Pipeline Factor
---- -------- -------- ------
1000 1554066 780590 1.99
2000 4673757 1084784 4.31
3000 10419550 1381980 7.54
4000 14475594 1904888 7.60
5000 23334748 2752994 8.48
6000 39117141 4202091 9.31
7000 52893014 3683966 14.36
8000 64109493 6253385 10.25
9000 88694413 4604167 19.26
10000 104747469 5158362 20.31
11000 126997771 6232390 20.38
12000 148529243 6317454 23.51
13000 190501251 6929375 27.49
14000 209396947 9121921 22.96
15000 244751222 8598125 28.47
16000 286846454 8936873 32.10
17000 323833173 9278078 34.90
18000 376521440 12602889 29.88
19000 422228695 16610650 25.42
20000 475496288 11516165 41.29
Meaning that with a collection size of 20,000 objects using the += operator is about 40x slower than using the PowerShell pipeline for this.
Steps to correct a script
Apparently some people struggle with correcting a script that already uses the increase assignment operator (+=). Therefore, I have created a little instruction to do so:
Remove all the <variable> += assignments from the concerned iteration, just leave only the object item. By not assigning the object, the object will simply be put on the pipeline.
It doesn't matter if there are multiple increase assignments in the iteration or if there are embedded iterations or functions; the end result will be the same.
Meaning, this:
ForEach ( ... ) {
$Array += $Object1
$Array += $Object2
ForEach ( ... ) {
$Array += $Object3
$Array += Get-Object
}
}
Is essentially the same as:
ForEach ( ... ) {
$Object1
$Object2
ForEach ( ... ) {
$Object3
Get-Object
}
}
Note: if there is no iteration, there is probably no reason to change your script, as it likely only concerns a few additions.
Assign the output of the iteration (everything that is put on the pipeline) to the concerned variable. This is usually at the same level as where the array was initialized ($Array = @()), e.g.:
$Array = ForEach ( ... ) { ...
Note 1: Again, if you want a single object to act as an array, you probably want to use the array subexpression operator @( ), but you might also consider doing this at the moment you use the array, like: @($Array).Count or ForEach ($Item in @($Array))
Note 2: Again, you're better off not assigning the output at all. Instead, pass the pipeline output directly to the next cmdlet to free up memory: ... | ForEach-Object {...} | Export-Csv .\File.csv.
Remove the array initialization <Variable> = @()
For a full example, see: Comparing Arrays within Powershell
Note that the same applies to using += to build strings (see: Is there a string concatenation shortcut in PowerShell?) and also to building hashtables like:
$HashTable += @{ $NewName = $Value }
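For those two cases, a minimal sketch of alternatives that avoid += might look like this. The sample values and key names are made up; joining with -join and assigning by key are just two common options, not the only ones.

```powershell
# Strings: produce the pieces first, then join them once.
$text = (1..5 | ForEach-Object { "Name$_" }) -join ','
$text              # Name1,Name2,Name3,Name4,Name5

# Hashtables: assign by key instead of merging in a new hashtable each time.
$HashTable = @{}
foreach ($i in 1..3) {
    $HashTable["Name$i"] = $i
}
$HashTable.Count   # 3
```

Assigning by key modifies the existing hashtable, whereas += with a hashtable literal builds a new merged hashtable on each iteration.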
$array = @('blue','red','purple','pink')
$array2 = @('brown','red','black','yellow')
$array | ForEach-Object {
if ($array2 -contains $_) {
Write-Host "`$array2 contains the `$array1 string [$_]"
}
}
how to get the index of the match string?
While PowerShell's -in / -contains operators allow you to test for containment of a given value in a collection (whether a given value is an element of the collection), there is no direct support for getting an element's index using only PowerShell's own features.
For .NET arrays (such as the ones created in your question[1]) you can use their .IndexOf() instance method, which uses case-SENSITIVE, ordinal comparison (elements are compared with their .Equals() method, irrespective of culture); e.g.:
$array.IndexOf('red') # -> 1; case-SENSITIVE, ordinal comparison
Note that PowerShell itself is generally case-INSENSITIVE, and with -eq (and in other contexts) uses the invariant culture for comparison.
A case-INSENSITIVE solution based on the invariant culture, using the Array type's static [Array]::FindIndex() method:
$array = 'blue', 'ReD', 'yellow'
[Array]::FindIndex($array, [Predicate[string]] { 'red' -eq $args[0] }) # -> 1
Note that by delegating to a PowerShell script block ({ ... }) in which each element ($args[0]) is tested against the target value with -eq, you implicitly get PowerShell's case-insensitive, culture-invariant behavior.
Alternatively, you could use the -ceq operator for case-sensitive (but still culture-invariant) matching.
($args[0].Equals('red', 'Ordinal') would give you behavior equivalent to the .IndexOf() solution above.)
Generally, this approach enables more sophisticated matching techniques, such as by using the regex-based -match operator, or the wildcard-based -like operator.
The above solutions find the index of the first matching element, if any.
To find the index of the last matching element, if any, use:
.LastIndexOf()
[Array]::FindLastIndex()
Note: While there is an [Array]::FindAll() method for returning all elements that meet a given predicate (criterion), there is no direct method for finding all indices.
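Since there is no built-in method for all indices, one way to collect them is a plain index loop. This is a sketch with a made-up sample array and a wildcard test; any of the operators mentioned above could be used in the condition instead.

```powershell
$array = 'blue', 'ReD', 'yellow', 'red'

# -like is case-insensitive by default, just like -eq; use -clike for
# case-sensitive matching. @( ) guarantees an array even for 0 or 1 matches.
$indices = @(
    for ($i = 0; $i -lt $array.Count; $i++) {
        if ($array[$i] -like 'r*') { $i }
    }
)

$indices   # 1, 3
```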
[1] Note that you do not need @(), the array-subexpression operator, to create an array from individually enumerated elements: enumerating them with ,, the array constructor operator, alone is enough:
$array = 'blue','red','purple','pink'
Looks like a homework exercise to me. In any case, as mentioned, things are a lot easier if you format your code properly. It's also easier if you name your variables rather than relying on $_, because it changes as it goes through a nested loop.
There are also other ways to do this - do you want the index number or the contents? I assumed the latter
$array = @('blue','red','purple','pink')
$array2 = @('brown','red','black','yellow')
ForEach ($a in $array) {
if ($array2 -contains $a) {
Write-Host "`$array2 contains the `$array1 string $a"
}
}
$array2 contains the $array1 string red
You can try something with an index counter you can use. If $array2.ToLower() contains that element.ToLower(), then loop through that second array to find out where that element actually is.
Note that this will not scale to large arrays, because the time needed for the nested search grows with the size of both arrays. But for small samples like this one, it works fine.
$array = 'blue','Red','purple','pink', 'browN'
$array2 = 'brown','rEd','black','yellow'
$array | ForEach-Object {
if ($array2.ToLower() -contains $_.ToLower()) {
$index = 0
foreach($arrElement in $array2) {
#$index++ # based on index starting with 1
if ($arrElement -eq $_) {
Write-Host "`$array2 contains the `$array1 string [$_] at index: $index"
}
$index++ # based on index starting with 0
}
}
}
# produces output
$array2 contains the $array1 string [Red] at index: 1
$array2 contains the $array1 string [browN] at index: 0
If there are duplicates in the $array2, you'll get two separate lines that would show each index entry.
$array = 'blue','Red','purple','pink', 'browN'
$array2 = 'brown','rEd','black','yellow', 'red'
#Output would be with above code:
$array2 contains the $array1 string [Red] at index: 1
$array2 contains the $array1 string [Red] at index: 4
$array2 contains the $array1 string [browN] at index: 0
You could also do a for loop using an index counter:
$array = 'blue','red','purple','pink', 'black'
$array2 = 'brown','red','black','yellow', 'red'
for ($i = 0; $i -lt $array2.Count; $i++) {
if ($array -contains $array2[$i]) {
Write-Host "`$array2 contains the string '$($array2[$i])' at index: $i"
}
}
Result:
$array2 contains the string 'red' at index: 1
$array2 contains the string 'black' at index: 2
$array2 contains the string 'red' at index: 4
This is a practical example that uses BinarySearch and relies on your look-up array being sorted by the property of "interest".
Uses IComparer to force case insensitivity
# BinarySearch needs a sorted array
$mySortedArray = Get-ChildItem $env:TEMP | Sort-Object -Property Name
# Provide files available on your machine
$anotherArray = @(
'mat-debug-23484.log'
'MSIaa547.LOG'.ToLower()
)
foreach ($item in $anotherArray) {
$index = $null
# BinarySearch defaults to being case sensitive
$index = [array]::BinarySearch($mySortedArray.Name, $item,[Collections.CaseInsensitiveComparer]::Default)
# If no matches found index will be negative
if ($index -ge 0) {
Write-Host ('Index {0} filename {1} found!' -f $index, $mySortedArray[$index].Name) -ForegroundColor Green
}
}
# Adjusted to meet your example code
$array = @('blue','red','purple','pink')
$array2 = @('brown','red','black','yellow') | Sort-Object
$array | ForEach-Object {
$currentObject = $_
$index = $null
$index = [array]::BinarySearch($array2, $currentObject, [System.Collections.CaseInsensitiveComparer]::Default)
if ($index -ge 0) {
Write-Host ('Index={0} array2 value="{1}" found!' -f $index, $array2[$index]) -ForegroundColor Green
}
}
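The negative return value is more than a "not found" flag: per the .NET documentation for Array.BinarySearch, it is the bitwise complement of the index of the next larger element (or of the array length if there is none). A small sketch with a made-up, already sorted array:

```powershell
$sorted = 'black', 'brown', 'red', 'yellow'   # must already be sorted
$cmp = [System.Collections.CaseInsensitiveComparer]::Default

$hit  = [array]::BinarySearch($sorted, 'RED', $cmp)    # 2  (found, ignoring case)
$miss = [array]::BinarySearch($sorted, 'pink', $cmp)   # -3 (not found)

# -bnot turns the negative result into the index where 'pink' would be inserted.
$insertAt = -bnot $miss                                # 2
```

That insertion index is handy when the sorted look-up array needs to stay sorted as new items arrive.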
I've been experimenting with the different forms of operators/expressions involving parentheses, but I can't find an explanation for an interaction I'm running into. Namely, ( ) and $( ) (subexpression operator) are not equivalent, nor is either equivalent to @( ) (array subexpression operator). For most cases this doesn't matter, but when trying to evaluate the contents of the parentheses as an expression (for example, variable assignment), they're different. I'm looking for an answer on what parentheses are doing when they aren't explicitly one operator or another, since the about_ documents don't call this out.
($var = Test-Something) # -> this passes through
$($var = Test-Something) # -> $null
@($var = Test-Something) # -> $null
about_Operators
For the array and subexpression operators, the parentheses are simply needed syntactically. Their only purpose is to wrap the expression the operator should be applied to.
Some examples:
# always return array, even if no child items found
@(Get-ChildItem -Filter "*.log").Count
# if's don't work inside regular parentheses
$(if ($true) { 1 } else { 0 })
When you put (only) parentheses around a variable assignment, this is called variable squeezing.
$v = 1 # sets v to 1 and returns nothing
($v = 1) # sets v to 1 and returns assigned value
You can get the pass-thru version for all your examples by combining variable squeezing with the subexpression operator syntax (that is, adding a second pair of parentheses):
($var = Test-Something)
$(($var = Test-Something))
@(($var = Test-Something))
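The five forms can be compared side by side with a literal value in place of Test-Something. This is only an illustrative sketch; the variable names $a through $e are made up.

```powershell
$a = ($v = 1)       # parentheses pass the assigned value through
$b = $($v = 2)      # the bare assignment produces no output
$c = @($v = 3)      # no output either, so the result is an empty array
$d = $(($v = 4))    # inner parentheses pass the value through
$e = @(($v = 5))    # same, wrapped in a one-element array

$a              # 1
$null -eq $b    # True
$c.Count        # 0
$d              # 4
$e.Count        # 1
```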
$( ) is unique. You can put multiple statements inside it:
$(echo hi; echo there) | measure | % count
2
You can also put things you can't normally pipe from, like foreach () and if, although the values won't come out until the whole thing is finished. This allows you to put multiple statements anywhere that just expects a value.
$(foreach ($i in 1..5) { $i } ) | measure | % count
5
$x = 10
if ( $( if ($x -lt 5) { $false } else { $x } ) -gt 20)
{$false} else {$true}
for ($i=0; $($y = $i*2; $i -lt 5); $i++) { $y }
$err = $( $output = ls foo ) 2>&1
Apologies for what is probably a newbish question.
I am writing some Powershell scripts that run various queries against AD. They will usually return a bunch of results, which are easy to deal with, ie:
$results = myQuery
Write-Host "Returned $($results.Count) results."
foreach ($result in $results) { doSomething }
No worries. However, if there is only 1 result, Powershell automagically translates that result into a single object, rather than an array that contains 1 object. So the above code would break, both at the Count and the foreach. I'm sure the same problem would occur with 0 results.
Could someone suggest an elegant way to handle this? Perhaps some way to cast the results so they are always an array?
Change the first line of code to
$results = @(myQuery)
This will always return an array. See this blog entry for additional details.
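A quick way to see the effect for zero, one, and several results is the sketch below. Get-QueryResults is a hypothetical stand-in for myQuery that simply emits the requested number of strings.

```powershell
# Hypothetical stand-in for myQuery: emits $Count strings (possibly none).
function Get-QueryResults([int] $Count) {
    if ($Count -gt 0) { 1..$Count | ForEach-Object { "result$_" } }
}

$report = foreach ($n in 0, 1, 3) {
    $results = @(Get-QueryResults $n)        # always an array
    "Returned $($results.Count) results."
}

$report
# Returned 0 results.
# Returned 1 results.
# Returned 3 results.
```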
Actually, the foreach works just fine. All uses of foreach (the foreach keyword, the ForEach-Object cmdlet, and ForEach-Object's aliases "foreach" and "%") have the same behavior of "wrapping" the object in question in an array if needed. Thus, all uses of foreach will work with both scalar and array values.
Annoyingly, this means they work with null values too. Say I do:
$x = $null
foreach ($y in $x) {Write-Host "Hello world 1"}
$x | Foreach-Object {Write-Host "Hello world 2"}
I'll get
"Hello world 1"
"Hello world 2"
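out of that. (This is PowerShell 2.0 behavior; since version 3.0 the foreach statement skips a $null collection, so only "Hello world 2" is printed.)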
This has bitten me as well. No clever ideas on how to fix $results.Count, but the foreach can be fixed by switching to a pipeline.
$scalar = 1
$list = (1,2)
$list | % { $_ }
prints
1
2
$scalar | % { $_ }
prints
1