Creating an array with large initial size in powershell

The only way I know to create an array in PowerShell is
$arr = @(1, 2, 3)
However, this approach is not convenient if I want to create an array with a large initial size, such as 10000, because I don't want to write code like this:
$arr = @(0, 0, 0, 0, 0, 0, ... ,0) # 10000 0s in this line of code
Writing code like the following is not efficient:
$arr = @()
for ($i = 1; $i -le 10000; $i++) {
    $arr += 0
}
That's because every time the += operator is executed, all of the elements in the old array are copied into a newly created array.
What's the best way to create an array with a large initial size in PowerShell?

Use New-Object in this case:
PS> $arr = New-Object int[] 10000; $arr.length
10000
Or, in PSv5+, using the static new() method on the type:
PS> $arr = [int[]]::new(10000); $arr.length
10000
These commands create a strongly typed array, using base type [int] in this example.
If the use case allows it, this is preferable for reasons of performance and type safety.
If you need to create an "untyped" array the same way that PowerShell does ([System.Object[]]), substitute object for int; e.g., [object[]]::new(10000); the elements of such an array will default to $null.
TessellatingHeckler's helpful answer, however, shows a much more concise alternative that even allows you to initialize all elements to a specific value.
Arrays have a fixed size; if you need an array-like data structure that you can preallocate and grow dynamically, see Bluecakes' helpful [System.Collections.ArrayList]-based answer.
[System.Collections.ArrayList] is the resizable analog to [System.Object[]], and its generic equivalent - which like the [int[]] example above - allows you to use a specific type for performance and robustness (if feasible), is [System.Collections.Generic.List[<type>]], e.g.:
PS> $lst = [System.Collections.Generic.List[int]]::New(10000); $lst.Capacity
10000
Note that - as with [System.Collections.ArrayList] - specifying an initial capacity (10000, here) does not create any elements: the list's .Count is still 0. The capacity (exposed as property .Capacity) only determines the size of the internally used array, i.e. how many elements can be added before the list has to allocate a larger internal array and copy the existing elements to it.
[System.Collections.Generic.List[<type>]]'s .Add() method commendably does not produce output, whereas [System.Collections.ArrayList]'s does (it returns the index of the element that was just added).
# The non-generic ArrayList's .Add() produces (usually undesired) output.
PS> $al = [System.Collections.ArrayList]::new(); $al.Add('first elem')
0 # .Add() outputs the index of the newly added item
# Simplest way to suppress this output:
PS> $null = $al.Add('first elem')
# NO output.
# The generic List[T]'s .Add() does NOT produce output.
PS> $gl = [System.Collections.Generic.List[string]]::new(); $gl.Add('first elem')
# NO output from .Add()
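To make the difference between a preallocated array and a list with reserved capacity concrete, here is a minimal sketch. The variable names are illustrative only, and the static ::new() syntax assumes PSv5+:

```powershell
# A typed array is allocated at full size; every element gets the type's default value.
$ints = [int[]]::new(5)
$objs = [object[]]::new(5)

# A list created with an initial capacity has room reserved but contains no elements yet.
$list = [System.Collections.Generic.List[int]]::new(5)

$ints.Length          # 5
$ints[0]              # 0
$null -eq $objs[0]    # True
$list.Capacity        # 5
$list.Count           # 0
```

In other words, indexing into the array works right away, whereas the list must be filled with .Add() before its elements can be indexed.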

Array literal multiplication:
$arr = @($null) * 10000
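The same technique can initialize every element to a specific value. A short sketch, with variable names chosen only for illustration:

```powershell
# Repeat a one-element array 10000 times; the result is an [object[]] array.
$zeros = @(0) * 10000

# Type-constrain the variable if a strongly typed array is wanted.
[int[]] $typedZeros = @(0) * 10000

$zeros.Count                  # 10000
$zeros.GetType().Name         # Object[]
$typedZeros.GetType().Name    # Int32[]
```

Note that with reference-type values, every element refers to the same object, so this works best for value types, strings, and $null.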

In PowerShell you are correct that += will discard the old array and create a new array containing the old items plus the new one.
For working with a large collection of items, I would highly recommend using the ArrayList type from .NET. It is not a fixed-size array, so PowerShell does not have to recreate it every time you add an item, and I've found this to work better in my projects.
Using ArrayList also means that you don't need to start with 10000 items. Because PowerShell won't need to recreate your array every time, you can start with 0 items and add each one as it's needed.
So in your script I would create an empty ArrayList like so:
[System.Collections.ArrayList]$arr = @()
and then, when you need to add something to it, just call .Add() (you don't need to prepopulate the list with 10000 items; it will expand as you add items):
$arr.Add([int]0)
Your example using an ArrayList:
[System.Collections.ArrayList]$arr = @()
for ($i = 1; $i -le 10000; $i++) {
    $arr.Add([int]0)
}
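Each ArrayList .Add() call in the loop above also outputs the index of the added item. As one possible variation, the same loop can be sketched with a generic list, whose .Add() is silent; the initial capacity of 10000 is optional and used here only to reserve room up front:

```powershell
# Reserve room for 10000 items, then add them one by one.
$list = [System.Collections.Generic.List[int]]::new(10000)
for ($i = 1; $i -le 10000; $i++) {
    $list.Add(0)    # List[T].Add() returns nothing, so no output needs suppressing
}
$list.Count         # 10000
```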

You could always do this:
$a = @(0..9999)
$a.length
This is nearly like what you are already doing, except that you don't have to write out all the values.

Related

Returning a ArrayList from a function in powershell contains indexes [duplicate]

I am new to PowerShell and there is a weird behavior I cannot explain. I call a function that returns a [System.Collections.ArrayList], but when I print the variable that receives the content of the array, if I have one value (for example: logXXX_20210222_075234355.txt), then I get 0 logXXX_20210222_075234355.txt. The value 0 gets added for some reason, as if it were the index of the value.
If I have 4 values, it will look like this:
0 1 2 3 logXXX_20210222_075234315.txt logXXX_20210225_090407364.txt
logXXX_20210204_120318221.txt logXXX_20210129_122737751.txt
Can anyone help?
Here is a simple code that does that:
function returnAnArray{
$arrayToReturn =[System.Collections.ArrayList]::new()
$arrayToReturn.Add('logICM_20210222_075234315.txt')
return $arrayToReturn
}
$fileNames = returnAnArray
Write-Host $fileNames
0 logICM_20210222_075234315.txt

It's characteristic of the ArrayList class to output the index on .Add(...). However, PowerShell returns all output, which will cause it to intermingle the index numbers with the true or other intended output.
My favorite solution is to simply cast the output from the .Add(...) method to [Void]:
function returnAnArray{
$arrayToReturn = [System.Collections.ArrayList]::new()
[Void]$arrayToReturn.Add('logICM_20210222_075234315.txt')
return $arrayToReturn
}
You can also use Out-Null for this purpose but in many cases it doesn't perform as well.
Another method is to assign it to $null like:
function returnAnArray{
$arrayToReturn = [System.Collections.ArrayList]::new()
$null = $arrayToReturn.Add('logICM_20210222_075234315.txt')
return $arrayToReturn
}
In some cases this can be marginally faster. However, I prefer the [Void] syntax and haven't observed whatever minor performance differential there may be.
Note: $null = ... works in all cases, while there are some cases where [Void] will not; see this answer (thanks again mklement0) for more information.
As an aside, you can use casting to establish the list:
$arrayToReturn = [System.Collections.ArrayList]@()
Update Incorporating Important Comments from @mklement0:
return $arrayToReturn may not behave as intended. PowerShell's output behavior is to enumerate (stream) arrays down the pipeline. In such cases a 1 element array will end up returning a scalar. A multi-element array will return a typical object array [Object[]], not [Collection.ArrayList] as seems to be the intention.
The comma operator can be used to guarantee the return type by making the ArrayList the first element of another array. See this answer for more information.
Example without ,:
Function Return-ArrayList { [Collections.ArrayList]@(1,2,3,4,5,6) }
$ArrReturn = Return-ArrayList
$ArrReturn.gettype().FullName
Returns: System.Object[]
Example with ,:
Function Return-ArrayList { , [Collections.ArrayList]@(1,2,3,4,5,6) }
$ArrReturn = Return-ArrayList
$ArrReturn.gettype().FullName
Returns: System.Collections.ArrayList
Of course, this can also be handled by the calling code, most commonly by wrapping the call in an array subexpression @(...). A call like $filenames = @(returnAnArray) will force $filenames to be a typical object array ([Object[]]). Casting like $filenames = [Collections.ArrayList]@(returnAnArray) will make it an ArrayList.
For the latter approach, I always question if it's really needed. The typical use case for an ArrayList is to work around poor performance associated with using += to increment arrays. Often this can be accomplished by allowing PowerShell to return the array for you (see below). But, even if you're forced to use it inside the function, it doesn't mean you need it elsewhere in the code.
For Example:
$array = 1..10 | ForEach-Object{ $_ }
Is preferred over:
$array = [Collections.ArrayList]@()
1..10 | ForEach-Object{ [Void]$array.Add( $_ ) }
Persisting the ArrayList type beyond the function and through to the caller should be based on a persistent need. For example, if there's a need to easily add/remove elements further along in the program.
Still More Information:
Notice the Return statement isn't needed either. This very much ties back to why you were getting extra output. Anything a function outputs is returned to the caller. Return isn't explicitly needed for this case. More commonly, Return can be used to exit a function at desired points...
A function like:
Function Demo-Return {
1
return
2
}
This will return 1 but not 2 because Return exited the function beforehand. However, if the function were:
Function Demo-Return
{
1
return 2
}
This returns 1, 2.
However, that's equivalent to Return 1,2 or just 1,2 without Return.
Update based on comments from @zett42:
You could avoid the ArrayList behavior altogether by using a different collection type, most commonly a generic list, [Collections.Generic.List[object]]. Microsoft's documentation recommends against using [ArrayList] for new development, making generic lists a better option. Furthermore, the generic list's .Add() method doesn't output anything, so you do not need [Void] or any other nullification method. Generic lists are slightly faster than ArrayLists, and skipping the nullification operation is a further, albeit still small, performance advantage.
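Putting the generic list and the comma operator together, a minimal sketch might look like the following. The function name Get-LogFileNames is hypothetical, and ::new() assumes PSv5+:

```powershell
function Get-LogFileNames {    # hypothetical function name, for illustration only
    $names = [System.Collections.Generic.List[string]]::new()
    $names.Add('logICM_20210222_075234315.txt')    # no output, nothing to suppress
    , $names    # the unary comma keeps the list from being enumerated on output
}

$fileNames = Get-LogFileNames
$fileNames.GetType().Name    # List`1
$fileNames.Count             # 1
$fileNames[0]                # logICM_20210222_075234315.txt
```

Without the leading comma, the single-element list would be enumerated and $fileNames would receive a plain string.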

ArrayList appears to store alternating indexes and values:
PS /home/alistair> $filenames[0]
0
PS /home/alistair> $filenames[1]
logICM_20210222_075234315.txt

Trying and failing to pass an array of custom objects by reference

I am creating an array of custom objects in my powershell script
$UnreachablePCs = [PSCustomObject]@()
Then I am passing this into a function like this
function GetComputerData {
param (
$Computers, [ref]$unreachable
)
...
$unreachablePC = [PSCustomObject]@{
ComputerName = $i.DNSHostName
CPU = "n/a"
Cores = "n/a"
IP = "n/a"
Memory = "n/a"
Uptime = "n/a"
OS = "n/a"
Board = "n/a"
}
$UnreachablePCs += $unreachablePC
Write-Output $UnreachablePCs.Count
...
}
GetComputerData -Computers $TraderWorkstations -unreachable ([ref]$UnreachablePCs)
Write-Output $UnreachablePCs.Count
$TraderWorkstations is a list of pcs which are iterated over in the function. All pcs that aren't reachable are added to the $UnreachablePCs array in an else branch in the function. In the function the .Count I'm calling will increment as workstations are added to the list. But After the function is called the final .Count is returning 0. What am I missing here?

Don't use [ref] parameters in PowerShell code: [ref]'s purpose is to facilitate calling .NET API methods; in PowerShell code, it is syntactically awkward and can lead to subtle bugs, such as in your case - see this answer for guidance on when [ref] use is appropriate.
Instead, make your function output the objects that make up the result array (possibly one by one), and let PowerShell collect them for you in an array:
function GetComputerData {
param (
$Computers # NO [ref] parameter
)
# ...
# Output (one or more) [pscustomobject] instances.
[PSCustomObject]@{
ComputerName = $i.DNSHostName
CPU = "n/a"
Cores = "n/a"
IP = "n/a"
Memory = "n/a"
Uptime = "n/a"
OS = "n/a"
Board = "n/a"
}
# ...
}
# Collect the [pscustomobject] instances output by the function
# in an array.
$UnReachablePCs = @(GetComputerData -Computers $TraderWorkstations)
@(), the array-subexpression operator, always creates an [object[]] array. To create a strongly typed array instead, use:
[pscustomobject[]] $unreachablePCs = GetComputerData -Computers $TraderWorkstations
Important:
[PSCustomObject]@{ ... directly produces output from the function, due to PowerShell's implicit output feature, where any command or expression whose output isn't captured or redirected automatically contributes to the enclosing function's (script's) output (written to PowerShell's success output stream) - see this answer for details. All the objects written to the function's success output stream are captured by a variable assignment such as $UnReachablePCs = ...
Write-Output is the explicit (rarely needed) way to write to the success output stream, which also implies that you cannot use it for ad hoc debugging output to the display, because its output too becomes part of the function's "return value" (the objects sent to the success output stream).
If you want to display output that doesn't "pollute" the success output stream, use Write-Host. Preferably, use cmdlets that target other, purpose-specific output streams, such as Write-Verbose and Write-Debug, though both of them require opt-in to produce visible output (see the linked docs).
As for the problems with your original approach:
$UnreachablePCs = [PSCustomObject]@()
This doesn't create an array of custom objects.
Instead, it creates an [object[]] array that is (uselessly, mostly invisibly) wrapped in a [psobject] instance.[1]
Use the following instead:
[PSCustomObject[]] $UnreachablePCs = @()
As for use of [ref] with an array variable updated with +=:
Fundamentally, you need to update a (parameter) variable containing a [ref] instance by assigning to its .Value property, not to the [ref] instance as a whole ($UnreachablePCs.Value = ... rather than $UnreachablePCs = ...)
However, the += technique should be avoided except for small arrays, because every += operation requires creating a new array behind the scenes (comprising the original elements and the new one(s)), which is necessary, because arrays are fixed-size data structures.
Either: Use an efficiently extensible list data type, such as [System.Collections.Generic.List[PSCustomObject]] and grow it via its .Add() method (in fact, if you create the list instance beforehand, you could pass it as a normal (non-[ref]) argument to a non-[ref] parameter, and the function would still directly update the list, due to operating on the same list instance when calling .Add() - that said, the output approach at the top is generally still preferable):
$unreachablePCs = [System.Collections.Generic.List[PSCustomObject]] @()
foreach ($i in 1..2) {
$unreachablePCs.Add([pscustomobject] @{ foo = $i })
}
Or - preferably - when possible: Use PowerShell's loop statements as expressions, and let PowerShell collect all outputs in an array for you (as also shown with output from a whole function above); e.g.:
# Automatically collects the two custom objects output by the loop.
[array] $unreachablePCs = foreach ($i in 1..2) {
[pscustomobject] @{ foo = $i }
}
[1] This behavior is unfortunate, but stems from the type accelerators [pscustomobject] and [psobject] being the same. A [pscustomobject] cast only works meaningfully in the context of creating a single custom object literal, i.e. if followed by a hashtable (e.g., [pscustomobject] @{ foo = 1 }). In all other cases, the mostly invisible wrapping in [psobject] occurs; e.g., [pscustomobject] @() -is [object[]] is $true - i.e., the result behaves like a regular array - but [pscustomobject] @() -is [psobject] is also $true, indicating the presence of a [psobject] wrapper.
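For completeness, the point about assigning to .Value can be sketched as follows. This only illustrates the mechanics; the output-based approach shown earlier generally remains preferable. The function name Add-Unreachable and the sample computer names are hypothetical:

```powershell
function Add-Unreachable {    # hypothetical helper, for illustration only
    param ([ref] $Unreachable, [string] $Name)
    # Update the caller's variable through .Value; assigning to $Unreachable
    # itself would merely replace the local parameter variable.
    $Unreachable.Value += [pscustomobject] @{ ComputerName = $Name }
}

$unreachablePCs = @()
Add-Unreachable -Unreachable ([ref] $unreachablePCs) -Name 'PC1'
Add-Unreachable -Unreachable ([ref] $unreachablePCs) -Name 'PC2'
$unreachablePCs.Count    # 2
```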

Count the number of variables containing objects

I have an issue with .Count. It has to do with the way it counts objects and also arrays containing objects.
For dull reasons, I have a situation where I don't know if what is being passed into my function is a single array of objects or 2 or more variables holding array lists of objects.
I thought it would be easy to detect this, but because of the way Count tries to help me out, I am having trouble. When 2 or more arrays of objects are returned, Count reflects the number of arrays. But when only one is returned, Count reflects the number of objects in that array.
Simplified demo using simple arrays:
$Arr1 = "a","b","c"
$Arr2 ="d","f","e"
$arrayjoined = $Arr1,$Arr2
#count showing the number of arrays = 2
$arrayjoined.count
# Count on just one array = 3
$Arr1.count
My original function works apart from occasions when only one array is returned. Is there any thing I can do to check for this, or force the count to return 1 to reflect just one array returned?
PS. I know I can use += to get the results into the same array in the example above but I can't do this with the arrays coming into my function.

So what can we do if we want to count arrays in an array?
$Array = 1,2,3
$CountArray = $Array
$CountArray.Length
$CountArray.Count
Will return
3
3
We can instead prepend the array with a ,
$Array = 1,2,3
$CountArray = ,$Array
$CountArray.Length
$CountArray.Count
Which will return
1
1
You can also add to an array like this
$Array = @()
$Array += ,@(1,2,3)
$Array.Length
$Array.Count
Returns
1
1
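If the caller cannot be changed to use the comma operator, one possible alternative is to inspect the first element inside the function. This is only a sketch under the assumption that the inputs are plain arrays (as in the simplified demo), not ArrayList instances; the function name Get-ArrayCount is hypothetical:

```powershell
function Get-ArrayCount {    # hypothetical helper, for illustration only
    param ($InputObject)
    # If the first element is itself an array, treat the input as an array of arrays;
    # otherwise treat the whole input as one array.
    if ($InputObject[0] -is [array]) { $InputObject.Count } else { 1 }
}

$Arr1 = 'a', 'b', 'c'
$Arr2 = 'd', 'f', 'e'
$single = Get-ArrayCount -InputObject $Arr1
$joined = Get-ArrayCount -InputObject ($Arr1, $Arr2)
$single    # 1
$joined    # 2
```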

Update a value in a array of objects

If I have an array of objects like
$Results = @(
[PSCustomObject]@{
Email = $_.email
Type = $Type
}
# ...
)
and I want to update the 100th $results.type value from X to y. I thought this code would be
$results.type[100] = 'Hello'
But this is not working and I am not sure why?

$Results.Type[100] = 'Hello' doesn't work because $Results.Type isn't real!
$Results is an array. It contains a number of objects, each of which have a Type property. The array itself also has a number of properties, like its Length.
When PowerShell sees the . member invocation operator in the expression $Results.Type, it first attempts to resolve any properties of the array itself, but since the [array] type doesn't have a Type property, it instead tries to enumerate the items contained in the array, and invoke the Type member on each of them. This feature is known as member enumeration, and produces a new one-off array containing the values enumerated.
Change the expression to:
$Results[100].Type = 'Hello'
Here, instead, we reference the 101st item contained in the $Results array (which is real, it already exists), and overwrite the value of the Type property of that object.
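The difference can be observed with a small sketch; the two sample objects and their values are made up for illustration:

```powershell
$Results = @(
    [pscustomobject] @{ Email = 'a@example.com'; Type = 'X' }
    [pscustomobject] @{ Email = 'b@example.com'; Type = 'X' }
)

# Member enumeration builds a new array of the Type values;
# this assignment writes to that temporary array only.
$Results.Type[1] = 'Hello'
$afterEnumeration = $Results[1].Type    # still X

# Indexing first selects the actual object, so its property is updated.
$Results[1].Type = 'Hello'
$afterIndexing = $Results[1].Type       # Hello
```

The first assignment raises no error, which is why the mistake is easy to miss.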

@MathiasR.Jessen answered this. The way I was indexing the array was wrong.
The correct way is $results[100].type = 'Hello'

Add Members to Distribution Groups DLs

I have below .csv input file. These are the Distribution Groups.
Level-4 is member of Level3,
Level-3 is member of Level2,
Level-2 is member of Level1,
Level-1 is member of Level0.
So far, I have tried the below code. Starting to add Level-4 into Level-3. I have marked them for better understanding. However, I am not able to select the object and iterate in correct way using PowerShell.
e.g. First instance of Level-3 DL is 'DL_L3_US1' and it will contain members from Level-4 i.e. DL_L4_US1 and DL_L4_US2. How do I make this work?
$DLlist = Import-Csv C:\tempfile\Book2.csv
$Test = $DLlist | select Level3,Level4 | ? {$_.level3 -notlike $null }
foreach ($x in $DLlist)
{Add-DistributionGroupMember -Identity "$($x.Level3)" -Member "$($x.Level4)"}

So my first answer wasn't correct. I misunderstood the question. Here's a new example:
$Props = 'Level-0','Level-1','Level-2','Level-3','Level-4'
$GroupDefs = Import-Csv C:\temp\Book2.csv
For( $i = 0; $i -lt $GroupDefs.Count; ++$i )
{ # Loop through the objects that came from the CSV file...
For( $p = 1; $p -lt $Props.Count; ++$p )
{ # Loop through the known properties...
$Prop = $Props[$p] # Convenience var
$GroupProp = $Props[$p-1] # Convenience var
If( $GroupDefs[$i].$Prop ) {
# If the property is populated then loop backwards from $i-1,
# the group def record just prior to the current one.
$Member = $GroupDefs[$i].$Prop
:Inner For($r = ($i-1); $r -ge 0; --$r)
{
If( $GroupDefs[$r].$GroupProp ) {
# This means you hit the first record behind your current
# position that has the previous property populated. Now
# we know the group...
$Group = $GroupDefs[$r].$GroupProp
Add-DistributionGroupMember -Identity $Group -Member $Member -WhatIf
Break Inner
}
}
}
}
}
By using traditional For Loops we can find values at other positions. So, the way I worked this out is to nest a loop of the known properties in a loop of the group definitions. When I find a property that has a value, I then loop backward from the current position in $GroupDefs until I find the previous property populated. By that point, I've managed to find both the group and the member, so I can run the command.
Update for Comments:
There is no Dot sourcing in this program. Dot referencing is used. As previously mentioned . is an operator in the sense that the right-hand side will be evaluated before the property is referenced, hence we can use a variable or expression.
Imagine that you are going through the spreadsheet line by line. That is the outer loop of $GroupDefs. You can look at the value of $i as if it's a row#.
Now, for each of those rows, I want to look at each of a known set of property names. So we're going to loop through $Props. If one of those properties has a value, I then want to look at previous rows to find the nearest previous row where the previous property ($Props[$p-1]) is populated. For example, if Level-2 is populated in Row# 3, I know I have to look back through rows 2, 1, 0 to find the first previous value for property Level-1. That part is the innermost loop, which moves backward through the $GroupDefs array from the current position -1 ($r = ($i-1)) to 0. When the first populated previous value for Level-1 is found, I know that's the group name.
$Member is a convenience variable. It's set to $GroupDefs[$i].$Prop because the value of the given property is the member we wish to add to the yet to be determined group.
Note: $GroupDefs.$i returning nothing is expected. At any given moment $i is a number determined by the loop, but it is not the name of a property on the $GroupDefs array or any objects within it. So, it will neither return any property value from the array itself nor will it unroll (enumerate) any properties from the objects contained within.
Note: The value of $Prop will change as you loop through the properties and until a value is found on a property by the given name.
I realize this is confusing, but if you step through the code you will better understand what's happening. First, try to understand what's literally being done. Then the code should make more sense...
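To step through the look-back logic without a CSV file or Exchange cmdlets, here is a self-contained sketch. The inline data, its staggered layout (each parent group on an earlier row than its members), and the group names are assumptions made purely for illustration; instead of calling Add-DistributionGroupMember, it only collects the group/member pairs that would be passed to it:

```powershell
$Props = 'Level-0', 'Level-1', 'Level-2'

# Made-up sample data standing in for the CSV file.
$GroupDefs = @'
Level-0,Level-1,Level-2
DL_L0,,
,DL_L1_A,
,,DL_L2_A1
,,DL_L2_A2
,DL_L1_B,
,,DL_L2_B1
'@ | ConvertFrom-Csv

$pairs = for ($i = 0; $i -lt $GroupDefs.Count; ++$i) {
    for ($p = 1; $p -lt $Props.Count; ++$p) {
        $member = $GroupDefs[$i].($Props[$p])
        if (-not $member) { continue }    # nothing in this column on this row
        # Walk backward to the nearest earlier row whose parent-level column is populated.
        for ($r = $i - 1; $r -ge 0; --$r) {
            $group = $GroupDefs[$r].($Props[$p - 1])
            if ($group) {
                [pscustomobject] @{ Group = $group; Member = $member }
                break
            }
        }
    }
}
$pairs
```

With this sample data, five pairs are produced; for example, DL_L2_A2 is paired with DL_L1_A because that is the nearest earlier Level-1 value.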
