Returning an ArrayList from a function in PowerShell contains indexes [duplicate] - powershell

I am new to PowerShell and there is a weird behavior I cannot explain. I call a function that returns a [System.Collections.ArrayList], but when I print the variable that receives the content of the array and it holds one value (for example: logXXX_20210222_075234355.txt), I get 0 logXXX_20210222_075234355.txt. The value 0 gets added for some reason, as if it were the index of the value.
If I have 4 values, it will look like this:
0 1 2 3 logXXX_20210222_075234315.txt logXXX_20210225_090407364.txt
logXXX_20210204_120318221.txt logXXX_20210129_122737751.txt
Can anyone help?
Here is a simple code that does that:
function returnAnArray{
$arrayToReturn =[System.Collections.ArrayList]::new()
$arrayToReturn.Add('logICM_20210222_075234315.txt')
return $arrayToReturn
}
$fileNames = returnAnArray
Write-Host $fileNames
0 logICM_20210222_075234315.txt

The ArrayList class's .Add(...) method returns the index of the element it just added. Because PowerShell returns all uncaptured output from a function, those index numbers get intermingled with the intended output.
My favorite solution is to simply cast the output from the .Add(...) method to [Void]:
function returnAnArray{
$arrayToReturn = [System.Collections.ArrayList]::new()
[Void]$arrayToReturn.Add('logICM_20210222_075234315.txt')
return $arrayToReturn
}
You can also use Out-Null for this purpose but in many cases it doesn't perform as well.
Another method is to assign it to $null like:
function returnAnArray{
$arrayToReturn = [System.Collections.ArrayList]::new()
$null = $arrayToReturn.Add('logICM_20210222_075234315.txt')
return $arrayToReturn
}
In some cases this can be marginally faster. However, I prefer the [Void] syntax and haven't observed whatever minor performance differential there may be.
Note: $null = ... works in all cases, while there are some cases where [Void] will not; see this answer (thanks again mklement0) for more information.
As an aside, you can use a cast to create the list:
$arrayToReturn = [System.Collections.ArrayList]@()
Update Incorporating Important Comments from @mklement0:
return $arrayToReturn may not behave as intended. PowerShell's output behavior is to enumerate (stream) arrays down the pipeline. In such cases a 1 element array will end up returning a scalar. A multi-element array will return a typical object array [Object[]], not [Collection.ArrayList] as seems to be the intention.
The comma operator can be used to guarantee the return type by making the ArrayList the first element of another array. See this answer for more information.
Example without ,:
Function Return-ArrayList { [Collections.ArrayList]@(1,2,3,4,5,6) }
$ArrReturn = Return-ArrayList
$ArrReturn.gettype().FullName
Returns: System.Object[]
Example with ,:
Function Return-ArrayList { , [Collections.ArrayList]@(1,2,3,4,5,6) }
$ArrReturn = Return-ArrayList
$ArrReturn.gettype().FullName
Returns: System.Collections.ArrayList
Of course, this can also be handled by the calling code, most commonly by wrapping the call in the array subexpression operator @(...). A call like $filenames = @(returnAnArray) will force $filenames to be a typical object array ([Object[]]). A cast like $filenames = [Collections.ArrayList]@(returnAnArray) will make it an ArrayList.
For the latter approach, I always question if it's really needed. The typical use case for an ArrayList is to work around poor performance associated with using += to increment arrays. Often this can be accomplished by allowing PowerShell to return the array for you (see below). But, even if you're forced to use it inside the function, it doesn't mean you need it elsewhere in the code.
For Example:
$array = 1..10 | ForEach-Object{ $_ }
Is preferred over:
$array = [Collections.ArrayList]@()
1..10 | ForEach-Object{ [Void]$array.Add( $_ ) }
Persisting the ArrayList type beyond the function and through to the caller should be based on a persistent need, for example, if there's a need to easily add or remove elements further along in the program.
Still More Information:
Notice the Return statement isn't needed either. This very much ties back to why you were getting extra output. Anything a function outputs is returned to the caller. Return isn't explicitly needed for this case. More commonly, Return can be used to exit a function at desired points...
A function like:
Function Demo-Return {
1
return
2
}
This will return 1 but not 2 because Return exited the function beforehand. However, if the function were:
Function Demo-Return
{
1
return 2
}
This returns 1, 2.
However, that's equivalent to Return 1,2 OR just 1,2 without Return
Update based on comments from @zett42:
You could avoid the ArrayList behavior altogether by using a different collection type, most commonly a generic list such as [Collections.Generic.List[object]]. Microsoft's documentation recommends against using ArrayList in new development, which makes generic lists the better option. Furthermore, the generic list's .Add() method doesn't output anything, so you don't need [Void] or any other nullification method. Generic lists are slightly faster than ArrayLists, and skipping the nullification operation gives a further, albeit still small, performance advantage.
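As a minimal sketch of the generic-list approach (the function name Get-LogFileNames and the file names are hypothetical, and ::new() assumes PowerShell 5 or later), the following builds the list without any output suppression and uses the unary comma described above to keep the list type intact:

```powershell
function Get-LogFileNames {
    # Generic list of strings; its .Add() method returns nothing (void),
    # so no index values leak into the function's output.
    $list = [System.Collections.Generic.List[string]]::new()
    $list.Add('a.txt')
    $list.Add('b.txt')

    # The unary comma wraps the list in a one-element array, so PowerShell
    # enumerates the wrapper and hands the list itself to the caller.
    , $list
}

$fileNames = Get-LogFileNames
$fileNames.Count            # 2
$fileNames.GetType().Name   # List`1
```

Without the leading comma, the caller would receive a plain [Object[]] containing the two strings instead of the list.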

ArrayList appears to store alternating indexes and values:
PS /home/alistair> $filenames[0]
0
PS /home/alistair> $filenames[1]
logICM_20210222_075234315.txt

Related

Powershell logical function not executing when called from if statement

I have a question about why a logical function is not executed or evaluated at all when called from an if statement. I tried with PowerShell versions 4 and 7 on Windows; both failed.
I have already worked around this, but I'm curious how this could happen.
Can anyone please help? I'm not sure whether this is related to the environment setup or a PowerShell grammar issue. Any opinions would be appreciated.
# Example 1:
function isValidN {
    echo "isValidN is called"
    return 0 -lt 1;
}
function Main {
    if (isValidN) {
        echo "t"
    } else {
        echo "f"
    }
}
Main;
# Result: isValidN does not seem to execute at all; "isValidN is called" is never printed. Why?
# Example 2:
function isValidN {
    echo "isValidN is called"
    return 0 -lt 1;
}
function Main {
    isValidN;
    if (isValidN) {
        echo "t"
    } else {
        echo "f"
    }
}
Main;
# Result: the first isValidN call, before the if statement, executes and prints "isValidN is called". However, the second one, in the if statement, does NOT. Why?
The crucial information is in the helpful comments by Santiago Squarzon and Mathias R. Jessen, but let me break it down conceptually:
As will become clear later, your isValidN function was called in the if statement, but produced no display output.
PowerShell functions and scripts do not have return values - they produce output that is typically written to the success output stream, PowerShell's analog to stdout (standard output).
In that sense, PowerShell functions more like a traditional shell rather than a typical programming language.
Unlike traditional shells, however, PowerShell offers six output streams, and targeting these streams is a way to output information that isn't data, such as debugging messages. Dedicated cmdlets exist to target each stream, as detailed in the conceptual about_Output_Streams help topic.
Any statement in a function or script may produce output - either implicitly, if output from a command call or the value of an expression isn't captured or redirected, or explicitly, with the - rarely necessary - Write-Output cmdlet, for which echo is a built-in alias.
The implication is that any script or function can have an open-ended number of (multiple) "return values".
By contrast, Write-Host is meant for printing information to the host (display, terminal), and bypasses the success output stream (since version 5 of PowerShell, it writes to the information stream, which - as all streams do - goes to the host by default). As such, it is sometimes used as a quick way to provide debugging output, though it's generally preferable to use Write-Debug (for debugging only) or Write-Verbose (to provide extended information to the user on demand), but note that making their output visible requires opt-in.
Unlike in other languages, return is not needed to output data - unless there is a need to exit the scope at that point, its use is optional - and may be used independently of output statements; as syntactic sugar, data to output may be passed to return; e.g., return 0 -lt 1 is short for the following two separate statements 0 -lt 1; return: 0 -lt 1 outputs the value of the -lt operation, and return then exits the scope.
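The points above can be sketched as follows. The function name Test-Streams is hypothetical; the example only illustrates that Write-Host bypasses the success output stream, that a bare expression is output implicitly, and that return merely exits the function:

```powershell
function Test-Streams {
    Write-Host 'status message'   # goes to the host, not the success output stream
    'data'                        # implicit output to the success output stream
    return                        # exits the function; outputs nothing itself
    'never reached'
}

# 'status message' still prints to the display, but is not captured.
$captured = Test-Streams
$captured   # data
```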
Applying the above to what you tried:
function isValidN {
echo "isValidN is called"
return 0 -lt 1
}
is the equivalent of:
function isValidN {
Write-Output "isValidN is called"
Write-Output (0 -lt 1)
return
}
Or, using implicit output, and omitting the redundant return call, given that the scope is implicitly exited at the end of the function:
function isValidN {
"isValidN is called"
0 -lt 1
}
Thus, your function outputs two objects: String "isValidN is called", and the evaluation of 0 -lt 1, which is $true
Using a command call (including scripts and functions) in an if conditional captures all output from that command, so that:
if (isValidN) # ...
effectively becomes:
if (@("isValidN is called", $true)) # ...
That is, the two-object output from function isValidN, when captured, turned into a two-element array (of type [object[]]), which was implicitly evaluated as a Boolean ([bool]) due to its use with if.
PowerShell allows any object to be converted to [bool], and a 2+-element array always evaluates to $true, irrespective of its content - see the bottom section of this answer for a summary of PowerShell's to-Boolean conversion rules.
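A short sketch of the array-to-Boolean rules described above (the variable names are illustrative only):

```powershell
$empty  = [bool] @()                               # empty array
$single = [bool] @($false)                         # one element: converts like that element
$pair   = [bool] @($false, $false)                 # two or more elements: always $true
$fromFn = [bool] @('isValidN is called', $true)    # what the if statement effectively saw

"$empty $single $pair $fromFn"   # False False True True
```

This is why the stray string in the function's output makes the conditional succeed no matter what the comparison itself returns.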
To summarize:
What you meant to print to the display, "isValidN is called", became part of the function's output ("return value"), "polluting" the function's output with incidental data, making the if conditional evaluate to $true always.
Because "isValidN is called" was part of the output, and that output was captured (and consumed) by the if conditional, it never printed to the display.
What I assume you meant to do:
function isValidN {
Write-Debug 'isValidN is called'
0 -lt 1
}
# Make Write-Debug output visible.
# To turn it back off:
# $DebugPreference = 'SilentlyContinue'
$DebugPreference = 'Continue'
# Call the function
if (isValidN) { 't' } else { 'f' }
Output:
DEBUG: isValidN is called
t
Note:
If you make a function or script an advanced one, you can turn debugging output on on a per-command call basis, by passing the common -Debug parameter, as an alternative to setting the $DebugPreference preference variable, which affects all commands in the current scope and descendant scopes.

Trying and failing to pass an array of custom objects by reference

I am creating an array of custom objects in my powershell script
$UnreachablePCs = [PSCustomObject]@()
Then I am passing this into a function like this
function GetComputerData {
param (
$Computers, [ref]$unreachable
)
...
$unreachablePC = [PSCustomObject]@{
ComputerName = $i.DNSHostName
CPU = "n/a"
Cores = "n/a"
IP = "n/a"
Memory = "n/a"
Uptime = "n/a"
OS = "n/a"
Board = "n/a"
}
$UnreachablePCs += $unreachablePC
Write-Output $UnreachablePCs.Count
...
}
GetComputerData -Computers $TraderWorkstations -unreachable ([ref]$UnreachablePCs)
Write-Output $UnreachablePCs.Count
$TraderWorkstations is a list of pcs which are iterated over in the function. All pcs that aren't reachable are added to the $UnreachablePCs array in an else branch in the function. In the function the .Count I'm calling will increment as workstations are added to the list. But After the function is called the final .Count is returning 0. What am I missing here?
Don't use [ref] parameters in PowerShell code: [ref]'s purpose is to facilitate calling .NET API methods; in PowerShell code, it is syntactically awkward and can lead to subtle bugs, such as in your case - see this answer for guidance on when [ref] use is appropriate.
Instead, make your function output the objects that make up the result array (possibly one by one), and let PowerShell collect them for you in an array:
function GetComputerData {
param (
$Computers # NO [ref] parameter
)
# ...
# Output (one or more) [pscustomobject] instances.
[PSCustomObject]@{
ComputerName = $i.DNSHostName
CPU = "n/a"
Cores = "n/a"
IP = "n/a"
Memory = "n/a"
Uptime = "n/a"
OS = "n/a"
Board = "n/a"
}
# ...
}
# Collect the [pscustomobject] instances output by the function
# in an array.
$UnReachablePCs = @(GetComputerData -Computers $TraderWorkstations)
@(), the array-subexpression operator, always creates an [object[]] array. To create a strongly typed array instead, use:
[pscustomobject[]] $unreachablePCs = GetComputerData -Computers $TraderWorkstations
Important:
[PSCustomObject]@{ ... directly produces output from the function, due to PowerShell's implicit output feature, where any command or expression whose output isn't captured or redirected automatically contributes to the enclosing function's (script's) output (written to PowerShell's success output stream) - see this answer for details. All the objects written to the function's success output stream are captured by a variable assignment such as $UnReachablePCs = ...
Write-Output is the explicit (rarely needed) way to write to the success output stream, which also implies that you cannot use it for ad hoc debugging output to the display, because its output too becomes part of the function's "return value" (the objects sent to the success output stream).
If you want to print to the display without "polluting" the success output stream, use Write-Host. Preferably, use cmdlets that target other, purpose-specific output streams, such as Write-Verbose and Write-Debug, though both of them require opt-in to produce visible output (see the linked docs).
As for the problems with your original approach:
$UnreachablePCs = [PSCustomObject]@()
This doesn't create an array of custom objects.
Instead, it creates an [object[]] array that is (uselessly, mostly invisibly) wrapped in a [psobject] instance.[1]
Use the following instead:
[PSCustomObject[]] $UnreachablePCs = @()
As for use of [ref] with an array variable updated with +=:
Fundamentally, you need to update a (parameter) variable containing a [ref] instance by assigning to its .Value property, not to the [ref] instance as a whole ($UnreachablePCs.Value = ... rather than $UnreachablePCs = ...)
However, the += technique should be avoided except for small arrays, because every += operation requires creating a new array behind the scenes (comprising the original elements and the new one(s)), which is necessary, because arrays are fixed-size data structures.
Either: Use an efficiently extensible list data type, such as [System.Collections.Generic.List[PSCustomObject]] and grow it via its .Add() method (in fact, if you create the list instance beforehand, you could pass it as a normal (non-[ref]) argument to a non-[ref] parameter, and the function would still directly update the list, due to operating on the same list instance when calling .Add() - that said, the output approach at the top is generally still preferable):
$unreachablePCs = [System.Collections.Generic.List[PSCustomObject]] @()
foreach ($i in 1..2) {
$unreachablePCs.Add([pscustomobject] @{ foo = $i })
}
Or - preferably - when possible: Use PowerShell's loop statements as expressions, and let PowerShell collect all outputs in an array for you (as also shown with output from a whole function above); e.g.:
# Automatically collects the two custom objects output by the loop.
[array] $unreachablePCs = foreach ($i in 1..2) {
[pscustomobject] @{ foo = $i }
}
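To illustrate the two points made above about [ref] and list instances, here is a minimal sketch. The helper names Add-ViaRef and Add-ViaList and the values 'pc1' and 'pc2' are hypothetical; the first helper shows that a [ref] parameter must be updated through its .Value property, the second that a list can be passed as an ordinary argument and still be updated in place:

```powershell
# Hypothetical helper: appends to the caller's array through a [ref] parameter.
function Add-ViaRef {
    param([ref] $Target)
    # Assign to .Value; assigning to $Target itself would only replace the
    # local parameter variable and leave the caller's variable untouched.
    $Target.Value += 'pc1'
}

# Hypothetical helper: receives the list object itself, no [ref] needed.
function Add-ViaList {
    param($Target)
    $Target.Add('pc2')   # same list instance as the caller's
}

$array = @()
Add-ViaRef -Target ([ref] $array)
$array.Count   # 1

$list = [System.Collections.Generic.List[string]]::new()
Add-ViaList -Target $list
$list.Count    # 1
```

Even so, having the function output the objects and letting the caller collect them remains the simpler design.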
[1] This behavior is unfortunate, but stems from the type accelerators [pscustomobject] and [psobject] being the same. A [pscustomobject] cast only works meaningfully in the context of creating a single custom object literal, i.e. if followed by a hashtable (e.g., [pscustomobject] @{ foo = 1 }). In all other cases, the mostly invisible wrapping in [psobject] occurs; e.g., [pscustomobject] @() -is [object[]] is $true - i.e., the result behaves like a regular array - but [pscustomobject] @() -is [psobject] is also $true, indicating the presence of a [psobject] wrapper.

Update a value in a array of objects

If I have an array of objects like
$Results = @(
[PSCustomObject]@{
Email = $_.email
Type = $Type
}
# ...
)
and I want to update the 100th $results.type value from X to y. I thought this code would be
$results.type[100] = 'Hello'
But this is not working and I am not sure why?
$Results.Type[100] = 'Hello' doesn't work because $Results.Type isn't real!
$Results is an array. It contains a number of objects, each of which have a Type property. The array itself also has a number of properties, like its Length.
When PowerShell sees the . member invocation operator in the expression $Results.Type, it first attempts to resolve any properties of the array itself, but since the [array] type doesn't have a Type property, it instead tries to enumerate the items contained in the array, and invoke the Type member on each of them. This feature is known as member enumeration, and produces a new one-off array containing the values enumerated.
Change the expression to:
$Results[100].Type = 'Hello'
Here, instead, we reference the 101st item contained in the $Results array (which is real, it already exists), and overwrite the value of the Type property of that object.
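A small sketch of the difference, using two hypothetical objects instead of the question's larger array:

```powershell
$Results = @(
    [PSCustomObject]@{ Email = 'a@example.com'; Type = 'X' }
    [PSCustomObject]@{ Email = 'b@example.com'; Type = 'X' }
)

# Member enumeration yields a new array holding copies of the Type values.
$types = $Results.Type
$types[1] = 'Hello'            # changes only that separate array
$before = $Results[1].Type     # X: the object itself is unchanged

# Index into the array first, then set the property on that object.
$Results[1].Type = 'Hello'
$after = $Results[1].Type      # Hello
```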
@MathiasR.Jessen answered this. The way I was indexing the array was wrong.
The correct way is $results[100].type = 'Hello'

Creating an array with large initial size in powershell

The only way I know to create an array in powershell is
$arr = @(1, 2, 3)
However, this creating method is not convenient if I want to create an array with large initial size such as 10000.
Because I don't want to write code like this
$arr = @(0, 0, 0, 0, 0, 0, ... ,0) # 10000 0s in this line of code
Writing code like the following is not efficient.
$arr = @()
for ($i = 1; $i -le 10000; $i++) {
$arr += 0
}
Because whenever += operator is executed, all of the elements in the old array would be copied into a newly created array.
What's the best way to create an array with large initial size in powershell?
Use New-Object in this case:
PS> $arr = New-Object int[] 10000; $arr.length
10000
Or, in PSv5+, using the static new() method on the type:
PS> $arr = [int[]]::new(10000); $arr.length
10000
These commands create a strongly typed array, using base type [int] in this example.
If the use case allows it, this is preferable for reasons of performance and type safety.
If you need to create an "untyped" array the same way that PowerShell does ([System.Object[]]), substitute object for int; e.g., [object[]]::new(10000); the elements of such an array will default to $null.
TessellatingHeckler's helpful answer, however, shows a much more concise alternative that even allows you to initialize all elements to a specific value.
Arrays have a fixed size; if you need an array-like data structure that you can preallocate and grow dynamically, see Bluecakes' helpful [System.Collections.ArrayList]-based answer.
[System.Collections.ArrayList] is the resizable analog to [System.Object[]], and its generic equivalent - which like the [int[]] example above - allows you to use a specific type for performance and robustness (if feasible), is [System.Collections.Generic.List[<type>]], e.g.:
PS> $lst = [System.Collections.Generic.List[int]]::New(10000); $lst.Capacity
10000
Note that - as with [System.Collections.ArrayList] - specifying an initial capacity (10000, here) does not create any elements: the list's .Count is still 0. The capacity only sizes the internal array that backs the list (exposed via the .Capacity property), leaving room for the list to grow to that many elements without reallocating.
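The difference between a list's capacity and its element count, compared with a fixed-size array, can be sketched like this (variable names are illustrative; ::new() assumes PowerShell 5 or later):

```powershell
# A list created with an initial capacity holds no elements yet.
$lst = [System.Collections.Generic.List[int]]::new(10000)
$lst.Capacity   # 10000
$lst.Count      # 0

$lst.Add(42)
$lst.Count      # 1
$lst[0]         # 42

# An array, by contrast, has all of its elements from the start.
$arr = [int[]]::new(3)
$arr.Length     # 3
$arr[0]         # 0 (the default value for [int])
```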
[System.Collections.Generic.List[<type>]]'s .Add() method commendably does not produce output, whereas [System.Collections.ArrayList]'s does (it returns the index of the element that was just added).
# The non-generic ArrayList's .Add() produces (usually undesired) output.
PS> $al = [System.Collections.ArrayList]::new(); $al.Add('first elem')
0 # .Add() outputs the index of the newly added item
# Simplest way to suppress this output:
PS> $null = $al.Add('first elem')
# NO output.
# The generic List[T]'s .Add() does NOT produce output.
PS> $gl = [System.Collections.Generic.List[string]]::new(); $gl.Add('first elem')
# NO output from .Add()
Array literal multiplication:
$arr = @($null) * 10000
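The same technique works with a specific initial value. A brief sketch (the variable names are illustrative; the [int[]] cast is an optional extra step for when a strongly typed array is wanted):

```powershell
# Repeat a one-element array 10000 times; the result is an [object[]].
$zeros = @(0) * 10000
$zeros.Length            # 10000
$zeros[9999]             # 0
$zeros.GetType().Name    # Object[]

# Optionally convert the result to a strongly typed array.
$typed = [int[]] (@(0) * 10000)
$typed.GetType().Name    # Int32[]
```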
You are correct that in PowerShell += will destroy the old array and create a new array with the new items.
For working with a large collection of items I would highly recommend using the ArrayList type from .NET, as this is not a fixed-size array, so PowerShell will not destroy it every time you add an item to it. I've found this to work better in my projects.
Using ArrayList also means that you don't need to start with 10000 items. Because PowerShell won't need to recreate your array every time, you can start with 0 and then add each item as it's needed.
So in your script I would create an empty ArrayList like so:
[System.Collections.ArrayList]$arr = @()
and then, when you need to add something to it, just call .Add() (you don't need to prepopulate the array with 10000 items; it will expand as you add items).
$arr.Add([int]0)
Your example using an ArrayList:
[System.Collections.ArrayList]$arr = @()
for ($i = 1; $i -le 10000; $i++) {
# .Add() returns the new element's index; assign it to $null to discard it.
$null = $arr.Add([int]0)
}
You could always do this:
$a = @(0..9999)
$a.length
This is nearly like what you are already doing, except that you don't have to
write out all the values.

Pass a PowerShell array to a .NET function

I want to get some 10 values of type short from a .NET function.
In C# it works like this:
Int16[] values = new Int16[10];
Control1.ReadValues(values);
The C# syntax is ReadValues(short[] values).
I tried something like this:
$Control1.ReadValues([array][int16]$Result)
But there are only zeroes in the array.
In the comments you mention:
I believe that the C# function have a ref
So, the method signature is really:
ReadValues(ref short[] values)
Luckily, PowerShell has a [ref] type accelerator for this sort of situation:
# Start by creating an array of Int16, length 10
$Result = [int16[]]@( ,0 * 10 )
# Pass the variable reference with the [ref] keyword
$Control1.ReadValues([ref]$Result)
For more information, see the about_Ref help file.
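Since Control1.ReadValues is not available outside the asker's environment, here is the same [ref] pattern sketched with a standard .NET method as a stand-in: [int]::TryParse, whose second parameter is an out parameter. The variable names are illustrative only.

```powershell
# The variable must exist before [ref] can wrap it.
$parsed = 0

# TryParse writes its result through the reference and returns a success flag.
$ok = [int]::TryParse('42', [ref] $parsed)

$ok       # True
$parsed   # 42
```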