PowerShell reference type memory consumption

A PowerShell reference consumes a lot more memory than simply storing the object, which is weird; it's supposed to consume less.
Store with object
- it consumes less memory
Store with ref of object
- it consumes roughly 2x the memory compared to storing the object directly
I guess it is the class causing this, but I don't know why.
class LinkedListNode {
$value
$next = @()
$previous = @()
LinkedListNode($value) {
$this.value = $value
}
}
class test {
$hash = @{}
[object] Append($value) {
$newNode = New-Object LinkedListNode $value
$newNode.previous = $null
$newNode.next = $null
$this.hash.Add($value, [ref] $newNode) # with ref
# $this.hash.Add($value, $newNode) # with object
return $this
}
}
$t = [test]::new()
for ($i = 0; $i -lt 30000; $i++) {
$t.Append($i)
}
For the code below, [ref] consumes less memory, which I think is the usual case.
$test = @{}  # assumed: the hashtable that the loop below populates
for ($i = 0; $i -lt 30000; $i++) {
$testObject = New-Object -TypeName PSObject -Property @{
'forTest' = "test"
}
$test.Add($i, [ref] $testObject) # with ref
# $test.Add($i, $testObject) # with object
}

This is a tricky one because references in PowerShell aren't the same as C++ references, and don't do what you think they do. Basically, when you read about_Ref it indicates that it treats Variables and Objects differently.
Passing a variable of type int can be by reference or by value.
Passing an Object is always by reference.
What that means is that, in the example you used:
The "by Object" used a true reference.
The "by Ref" actually wrapped the LinkedListNode object in a System.Management.Automation.PSReference object. This System.Management.Automation.PSReference object takes up some space, and due to the small object sizes, made it "seem" like it took up twice the memory.
The [ref] is meant for interacting with .NET functions that require it; see: [ref] doesn't work with class members and Restrict use of [ref] to variables
Also, using [ref] with functions in PowerShell:
When passing a variable by reference, the function can change the data
and that change persists after the function executes.
This is different from how C++ uses references.
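To see the wrapping in action, here is a minimal sketch (an illustrative object, not the question's LinkedListNode class) showing that [ref] produces a separate System.Management.Automation.PSReference object that merely points at the original instance - that per-entry wrapper is the extra allocation behind the doubled memory usage:
$node = [pscustomobject]@{ value = 1 }
$wrapped = [ref] $node
$wrapped.GetType().FullName                        # the PSReference wrapper type, not the node's own type
[object]::ReferenceEquals($wrapped.Value, $node)   # True - the wrapper just points at the same object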


How to extend and override Add in a collection class

Background
I have a data object in PowerShell with 4 properties, 3 of which are strings and the 4th a hashtable. I would like to arrange for a new type that is defined as a collection of this data object.
In this collection class, I wish to enforce a particular format that will make my code elsewhere in the module more convenient. Namely, I wish to override the add method with a new definition, such that unique combinations of the 3 string properties add the 4th property as a hashtable, while duplicates of the 3 string properties simply update the hashtable property of the already existing row with the new input hashtable.
This will allow me to abstract the expansion of the collection and ensure that when the Add method is called on it, it will retain my required format of hashtables grouped by unique combinations of the 3 string properties.
My idea was to create a class that extends a collection, and then override the add method.
Code so far
As a short description for my code below, there are 3 classes:
A data class for a namespace based on 3 string properties (which I can reuse in my script for other things).
A class specifically for adding an id property to this data class. This id is the key in a hashtable with values that are configuration parameters in the namespace of my object.
A 3rd class to handle a collection of these objects, where I can define the add method. This is where I am having my issue.
Using namespace System.Collections.Generic
Class Model_Namespace {
[string]$Unit
[string]$Date
[string]$Name
Model_Namespace([string]$unit, [string]$date, [string]$name) {
$this.Unit = $unit
$this.Date = $date
$this.Name = $name
}
}
Class Model_Config {
[Model_Namespace]$namespace
[Hashtable]$id
Model_Config([Model_Namespace]$namespace, [hashtable]$config) {
$this.namespace = $namespace
$this.id = $config
}
Model_Config([string]$unit, [string]$date, [string]$name, [hashtable]$config) {
$this.namespace = [Model_Namespace]::new($unit, $date, $name)
$this.id = $config
}
}
Class Collection_Configs {
$List = [List[Model_Config]]@()
[void] Add ([Model_Config]$newConfig ){
$checkNamespaceExists = $null
$u = $newConfig.Unit
$d = $newConfig.Date
$n = $newConfig.Name
$id = $newConfig.id
$checkNamespaceExists = $this.List | Where { $u -eq $_.Unit -and $d -eq $_.Date -and $n -eq $_.Name }
If ($checkNamespaceExists){
($this.List | Where { $u -eq $_.Unit -and $d -eq $_.Date -and $n -eq $_.Name }).id += $id
}
Else {
$this.List.add($newConfig)
}
}
}
Problem
I would like the class Collection_Configs to extend a built-in collection type and override the Add method. Like a generic List<> type, I could simply output the variable referencing my collection and automatically return the collection. This way, I won't need to dot into the List property to access the collection. In fact I wouldn't need the List property at all.
However, when I inherit from System.Array, I need to supply a fixed array size in the constructor. I'd like to avoid this, as my collection should be mutable. I tried inheriting from List, but I can't get the syntax to work; PowerShell throws a type not found error.
Is there a way to accomplish this?
Update
After mklement0's helpful answer, I modified the last class as:
Using namespace System.Collections.ObjectModel
Class Collection_Configs : System.Collections.ObjectModel.Collection[Object]{
[void] Add ([Model_Config]$newConfig ){
$checkNamespaceExists = $null
$newConfigParams = $newConfig.namespace
$u = $newConfigParams.Unit
$d = $newConfigParams.Date
$n = $newConfigParams.Name
$id = $newConfig.id
$checkNamespaceExists = $this.namespace | Where { $u -eq $_.Unit -and $d -eq $_.Date -and $n -eq $_.Name }
If ($checkNamespaceExists){
($this | Where { $u -eq $_.namespace.Unit -and $d -eq $_.namespace.Date -and $n -eq $_.namespace.Name }).id += $id
}
Else {
([Collection[object]]$this).add($newConfig)
}
}
}
This seems to work. In addition to the inheritance, I had to correct how I dotted into my input types, load the collection class separately after the other 2 classes, and use the base class's Add method in my Else statement.
Going forward, I will have to do some other validation to ensure that a model_config type is entered. Currently the custom collection accepts any input, even though I auto-convert the add parameter to model_config, e.g.,
$config = [model_config]::new('a','b','c',@{'h'='t'})
$collection = [Collection_Configs]::new()
$collection.Add($config)
works, but
$collection.Add('test')
also works when it should fail validation. Perhaps it is not overriding correctly and using the base class's overload?
Last update
Everything seems to be working now. The last update to the class is:
using namespace System.Collections.ObjectModel
Class Collection_Configs : Collection[Model_Config]{
[void] Add ([Model_Config]$newConfig ){
$checkNamespaceExists = $null
$namespace = $newConfig.namespace
$u = $namespace.Unit
$d = $namespace.Date
$n = $namespace.Name
$id = $newConfig.id
$checkNamespaceExists = $this.namespace | Where { $u -eq $_.Unit -and $d -eq $_.Date -and $n -eq $_.Name }
If ($checkNamespaceExists){
($this | Where { $u -eq $_.namespace.Unit -and $d -eq $_.namespace.Date -and $n -eq $_.namespace.Name }).id += $id
}
Else {
[Collection[Model_Config]].GetMethod('Add').Invoke($this, [Model_Config[]]$newConfig)
}
}
}
Notice in the else statement that ....GetMethod('Add')... is necessary for Windows PowerShell, as pointed out in the footnote of mklement0's super useful and correct answer. If you are able to work with Core, then mklement0's syntax will work (I tested).
Also mentioned by mklement0, the types need to be loaded separately. FYI this can be done on the commandline for quick provisional testing by typing in the model_namespace and model_config classes and pressing enter before doing the same for Collection_Configs.
In summary this will create a custom collection type with custom methods in PowerShell.
It is possible to subclass System.Collections.Generic.List`1, as this simplified example, which derives from a list with [regex] elements, demonstrates:[1]
using namespace System.Collections.Generic
# Subclass System.Collections.Generic.List`1 with [regex] elements.
class Collection_Configs : List[regex] {
# Override the .Add() method.
# Note: You probably want to override .AddRange() too.
Add([regex] $item) {
Write-Verbose -Verbose 'Doing custom things...'
# Call the base-class method.
([List[regex]] $this).Add($item)
}
}
# Sample use.
$list = [Collection_Configs]::new()
$list.Add([regex] 'foo')
$list
However, as you note, it is recommended to derive custom collections from base class System.Collections.ObjectModel.Collection`1:
using namespace System.Collections.ObjectModel
# Subclass System.Collections.ObjectModel.Collection`1 with [regex] elements.
class Collection_Configs : Collection[regex] {
# Override the .Add() method.
# Note: Unlike with List`1, there is no .AddRange() method.
Add([regex] $item) {
Write-Verbose -Verbose 'Doing custom things...'
# Call the base-class method.
([Collection[regex]] $this).Add($item)
}
}
As for the pros and cons:
List`1 has more built-in functionality (methods) than Collection`1, such as .Reverse(), .Exists(), and .ForEach().
In the case of .ForEach() that actually works to the advantage of Collection`1: not having such a method avoids a clash with PowerShell's intrinsic .ForEach() method.
Note that in either case it is important to use the specific type that your collection should be composed of as the generic type argument for the base class: [regex] in the example above, [Model_Config] in your real code (see next section).
If you use [object] instead, your collection won't be type-safe, because it'll have a void Add(object item) method that PowerShell will select whenever you call the .Add() method with an instance of a type that is not the desired type (or cannot be converted to it).
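As a quick, hypothetical illustration of that pitfall (the class name is made up for the demo), an [object]-based collection silently falls back to the inherited Add(object) overload:
using namespace System.Collections.ObjectModel
# Hypothetical class; the [object] generic argument is the point of the demo.
class LooseCollection : Collection[object] {
    Add([regex] $item) { ([Collection[object]] $this).Add($item) }
}
$c = [LooseCollection]::new()
$c.Add(@{})   # no error: binds to the inherited Add([object]) overload, bypassing the custom Add([regex])
$c.Count      # 1
This is exactly why $collection.Add('test') succeeded in the question's update: the string bound to the base class's Add([object]) overload rather than the custom Add([Model_Config]) override.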
However, there's an additional challenge in your case:
As of PowerShell 7.3.1, because the generic type argument that determines the list element type is another custom class, that other class must unexpectedly be loaded beforehand, in a separate script, before the script that defines the dependent Collection_Configs class runs.
This requirement is unfortunate, and at least conceptually related to the general (equally unfortunate) need to ensure that .NET types referenced in class definitions have been loaded before the enclosing script executes - see this post, whose accepted answer demonstrates workarounds.
However, given that all classes involved are part of the same script file in your case, a potential fix should be simpler than the one discussed in the linked post - see GitHub issue #18872.
[1] Note: There appears to be a bug in Windows PowerShell, where calling the base class' .Add() method fails if the generic type argument (element type) happens to be [pscustomobject] aka [psobject]: That is, while ([List[pscustomobject]] $this).Add($item) works as expected in PowerShell (Core) 7+, an error occurs in Windows PowerShell, which requires the following reflection-based workaround: [List[pscustomobject]].GetMethod('Add').Invoke($this, [object[]] $item)
A couple of notes on the original code:
The using keyword is conventionally written in lowercase (using), although PowerShell keywords are case-insensitive, so Using works as well.
The $List property in the Collection_Configs class was not type-constrained; declaring it as [List[Model_Config]] $List keeps it strongly typed.

Change nested property of an object dynamically (eg: $_.Property.SubProperty) [duplicate]

Say I have JSON like:
{
"a" : {
"b" : 1,
"c" : 2
}
}
Now ConvertFrom-Json will happily create PSObjects out of that. If I want to access an item I could do $json.a.b and get 1 - nicely nested properties.
Now, if I have the string "a.b", the question is how to use that string to access the same item in that structure? It seems like there should be some special syntax I'm missing, like & for dynamic function calls, because otherwise you have to interpret the string yourself using Get-Member repeatedly, I expect.
No, there is no special syntax, but there is a simple workaround, using iex, the built-in alias[1] for the Invoke-Expression cmdlet:
$propertyPath = 'a.b'
# Note the ` (backtick) before $json, to prevent premature expansion.
iex "`$json.$propertyPath" # Same as: $json.a.b
# You can use the same approach for *setting* a property value:
$newValue = 'foo'
iex "`$json.$propertyPath = `$newValue" # Same as: $json.a.b = $newValue
Caveat: Do this only if you fully control or implicitly trust the value of $propertyPath.
Only in rare situation is Invoke-Expression truly needed, and it should generally be avoided, because it can be a security risk.
Note that if the target property contains an instance of a specific collection type and you want to preserve it as-is, which is not common (e.g., if the property value is a strongly typed array such as [int[]], or an instance of a list type such as [System.Collections.Generic.List`1]), use the following:
# "," constructs an aux., transient array that is enumerated by
# Invoke-Expression and therefore returns the original property value as-is.
iex ", `$json.$propertyPath"
Without the , technique, Invoke-Expression enumerates the elements of a collection-valued property and you'll end up with a regular PowerShell array, which is of type [object[]] - typically, however, this distinction won't matter.
Note: If you were to send the result of the , technique directly through the pipeline, a collection-valued property value would be sent as a single object instead of getting enumerated, as usual. (By contrast, if you save the result in a variable first and then send it through the pipeline, the usual enumeration occurs.) While you can force enumeration simply by enclosing the Invoke-Expression call in (...), there is no reason to use the , technique to begin with in this case, given that enumeration invariably entails loss of the information about the type of the collection whose elements are being enumerated.
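For instance, here is a quick, hypothetical illustration (a $json object whose a.b property holds a generic list):
$json = [pscustomobject]@{ a = [pscustomobject]@{ b = [System.Collections.Generic.List[int]] (1, 2) } }
$propertyPath = 'a.b'
(iex "`$json.$propertyPath").GetType().Name     # Object[] - the list was enumerated and re-collected into a regular array
(iex ", `$json.$propertyPath").GetType().Name   # List`1   - the original list instance is preserved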
Read on for packaged solutions.
Note:
The following packaged solutions originally used Invoke-Expression combined with sanitizing the specified property paths in order to prevent inadvertent/malicious injection of commands. However, the solutions now use a different approach, namely splitting the property path into individual property names and iteratively drilling down into the object, as shown in Gyula Kokas's helpful answer. This not only obviates the need for sanitizing, but turns out to be faster than use of Invoke-Expression (the latter is still worth considering for one-off use).
The no-frills, get-only, always-enumerate version of this technique would be the following function:
# Sample call: propByPath $json 'a.b'
function propByPath { param($obj, $propPath) foreach ($prop in $propPath.Split('.')) { $obj = $obj.$prop }; $obj }
What the more elaborate solutions below offer: parameter validation, the ability to also set a property value by path, and - in the case of the propByPath function - the option to prevent enumeration of property values that are collections (see next point).
The propByPath function offers a -NoEnumerate switch to optionally request preserving a property value's specific collection type.
By contrast, this feature is omitted from the .PropByPath() method, because there is no syntactically convenient way to request it (methods only support positional arguments). A possible solution is to create a second method, say .PropByPathNoEnumerate(), that applies the , technique discussed above.
Helper function propByPath:
function propByPath {
param(
[Parameter(Mandatory)] $Object,
[Parameter(Mandatory)] [string] $PropertyPath,
$Value, # optional value to SET
[switch] $NoEnumerate # only applies to GET
)
Set-StrictMode -Version 1
# Note: Iteratively drilling down into the object turns out to be *faster*
# than using Invoke-Expression; it also obviates the need to sanitize
# the property-path string.
$props = $PropertyPath.Split('.') # Split the path into an array of property names.
if ($PSBoundParameters.ContainsKey('Value')) { # SET
$parentObject = $Object
if ($props.Count -gt 1) {
foreach ($prop in $props[0..($props.Count-2)]) { $parentObject = $parentObject.$prop }
}
$parentObject.($props[-1]) = $Value
}
else { # GET
$value = $Object
foreach ($prop in $props) { $value = $value.$prop }
if ($NoEnumerate) {
, $value
} else {
$value
}
}
}
Instead of the Invoke-Expression call you would then use:
# GET
propByPath $obj $propertyPath
# GET, with preservation of the property value's specific collection type.
propByPath $obj $propertyPath -NoEnumerate
# SET
propByPath $obj $propertyPath 'new value'
You could even use PowerShell's ETS (extended type system) to attach a .PropByPath() method to all [pscustomobject] instances (PSv3+ syntax; in PSv2 you'd have to create a *.types.ps1xml file and load it with Update-TypeData -PrependPath):
'System.Management.Automation.PSCustomObject',
'Deserialized.System.Management.Automation.PSCustomObject' |
Update-TypeData -TypeName { $_ } `
-MemberType ScriptMethod -MemberName PropByPath -Value { #`
param(
[Parameter(Mandatory)] [string] $PropertyPath,
$Value
)
Set-StrictMode -Version 1
$props = $PropertyPath.Split('.') # Split the path into an array of property names.
if ($PSBoundParameters.ContainsKey('Value')) { # SET
$parentObject = $this
if ($props.Count -gt 1) {
foreach ($prop in $props[0..($props.Count-2)]) { $parentObject = $parentObject.$prop }
}
$parentObject.($props[-1]) = $Value
}
else { # GET
# Note: Iteratively drilling down into the object turns out to be *faster*
# than using Invoke-Expression; it also obviates the need to sanitize
# the property-path string.
$value = $this
foreach ($prop in $PropertyPath.Split('.')) { $value = $value.$prop }
$value
}
}
You could then call $obj.PropByPath('a.b') or $obj.PropByPath('a.b', 'new value')
Note: Type Deserialized.System.Management.Automation.PSCustomObject is targeted in addition to System.Management.Automation.PSCustomObject in order to also cover deserialized custom objects, which are returned in a number of scenarios, such as using Import-CliXml, receiving output from background jobs, and using remoting.
.PropByPath() will be available on any [pscustomobject] instance in the remainder of the session (even on instances created prior to the Update-TypeData call [2]); place the Update-TypeData call in your $PROFILE (profile file) to make the method available by default.
[1] Note: While it is generally advisable to limit aliases to interactive use and use full cmdlet names in scripts, use of iex to me is acceptable, because it is a built-in alias and enables a concise solution.
[2] Verify with (all on one line) $co = New-Object PSCustomObject; Update-TypeData -TypeName System.Management.Automation.PSCustomObject -MemberType ScriptMethod -MemberName GetFoo -Value { 'foo' }; $co.GetFoo(), which outputs foo even though $co was created before Update-TypeData was called.
This workaround may be useful to somebody.
The $result variable goes one level deeper on each iteration, until it hits the right object.
$json=(Get-Content ./json.json | ConvertFrom-Json)
$result=$json
$search="a.c"
$search.split(".")|% {$result=$result.($_) }
$result
You can have 2 variables.
$json = '{
"a" : {
"b" : 1,
"c" : 2
}
}' | convertfrom-json
$a,$b = 'a','b'
$json.$a.$b
1

How in general can I find functions that contain bugs due to array output?

The output of unassigned expressions is emitted by PowerShell functions. So, for instance, in the function below you will get an object array returned instead of a string, because the output of the [regex]::Matches(...) pipeline is unassigned and needs to be assigned, suppressed, or whatever.
These bugs are very hard to detect when scanning codebases, since some functions DO output an object array willingly (which is then handled correctly), while for others it is indeed a bug. In this case the output was used to write somewhere, and since an array gets transformed to a string it was undetectable.
function Get-PascalizedString {
param(
[string]$String
)
$rx = "(?:[^a-zA-Z0-9]*)(?<first>[a-zA-Z0-9])(?<reminder>[a-zA-Z0-9]*)(?:[^a-zA-Z0-9]*)"
$result = ""
[regex]::Matches($String, $rx) | ForEach-Object {$_.Groups} {
$TextInfo = (Get-Culture).TextInfo
$part = $TextInfo.ToTitleCase($_.Value.ToLower()).Trim()
$part = $part -replace "[^a-zA-Z0-9]"
$result = $result + $part
}
return $result
}
$a = Get-PascalizedString -String "aaa"
write-host $a
$a.GetType() # but...
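Here is a stripped-down reproduction of the same phenomenon (hypothetical function name):
function Get-Example {
    Get-Date | Out-Null                # output suppressed - does not become part of the return value
    [regex]::Matches('abc', '.')       # output NOT suppressed - leaks into the return value
    return 'intended result'
}
(Get-Example).GetType().Name           # Object[] - an array, not just the intended string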
So is there a smart way to detect these kind of bugs in larger codebases?

Make select -unique on arraylist return arraylist instead of string

I have three arraylists in the class below. I want to keep them unique. However, if there's only one item (a string) in the arraylist and you use select -unique (or any other method to achieve this), it will return the string instead of a list of strings. Surrounding it with @() also doesn't work, because that transforms it into an array instead of an arraylist, which I can't add stuff to.
Any suggestions that are still performant? I tried HashSets before but somehow had horrible experiences with those; see my previous post for that: Post on hashset issue
Code below:
Class OrgUnit
{
[String]$name
$parents
$children
$members
OrgUnit($name){
$this.name = $name
$this.parents = New-Object System.Collections.ArrayList
$this.children = New-Object System.Collections.ArrayList
$this.members = New-Object System.Collections.ArrayList
}
addChild($child){
# > $null to supress output
$tmp = $this.children.Add($child)
$this.children = $this.children | select -Unique
}
addParent($parent){
# > $null to supress output
$tmp = $this.parents.Add($parent)
$this.parents = $this.parents | select -Unique
}
addMember($member){
# > $null to supress output
$tmp = $this.members.Add($member)
$this.members = $this.members | select -Unique
}
}
You're adding a new item to the array, then selecting unique items from it, and reassigning it every time you add a member. This is extremely inefficient; maybe try the following instead:
if (-not $this.parents.Contains($parent)) {
$this.parents.Add($parent) | out-null
}
This would be much faster, even with the least efficient way of suppressing output, piping to Out-Null.
Check with .Contains() if the item is already added, so you don't have to eliminate duplicates with Select-Object -Unique afterwards all the time.
if (-not $this.children.Contains($child))
{
[System.Void]($this.children.Add($child))
}
As has been pointed out, it's worth changing your approach due to its inefficiency:
Instead of blindly appending and then possibly removing the new element with Select-Object -Unique if it turns out to be a duplicate, use a test to decide whether an element needs to be appended or is already present.
Patrick's helpful answer is a straightforward implementation of this optimized approach that will greatly speed up your code and should perform acceptably unless the array lists get very large.
As a side effect of this optimization - because the array lists are only ever modified in-place with .Add() - your original problem goes away.
To answer the question as asked:
Simply type-constrain your (member) variables if you want them to retain a given type even during later assignments.
That is, just as you did with $name, place the type you want the member to be constrained to, to the left of the member variable declarations:
[System.Collections.ArrayList] $parents
[System.Collections.ArrayList] $children
[System.Collections.ArrayList] $members
However, that will initialize these member variables to $null, which means you won't be able to just call .Add() in your .add*() methods; therefore, construct an (initially empty) instance as part of the declaration:
[System.Collections.ArrayList] $parents = [System.Collections.ArrayList]::new()
[System.Collections.ArrayList] $children = [System.Collections.ArrayList]::new()
[System.Collections.ArrayList] $members = [System.Collections.ArrayList]::new()
Also, you do have to use @(...) around your Select-Object -Unique pipeline; while that indeed outputs an array (type [object[]]), the type constraint causes that array to be converted to a [System.Collections.ArrayList] instance, as explained below.
The need for @(...) is somewhat surprising - see bottom section.
Notes on type constraints:
If you assign a value that isn't already of the type that the variable is constrained to, PowerShell attempts to convert it to that type; you can think of it as implicitly performing a cast to the constraining type on every assignment:
This can fail, if the assigned value simply isn't convertible; PowerShell's type conversions are generally very flexible, however.
In the case of collection-like types such as [System.Collections.ArrayList], any other collection-like type can be assigned, such as the [object[]] arrays returned by @(...) (PowerShell's array-subexpression operator). Note that, of necessity, this involves constructing a new [System.Collections.ArrayList] every time, which becomes, loosely speaking, a shallow clone of the input collection (see the sketch right after these notes).
Pitfalls re assigning $null:
If the constraining type is a value type (if its .IsValueType property reports $true), assigning $null will result in the type's default value; e.g., after executing [int] $i = 42; $i = $null, $i isn't $null, it is 0.
If the constraining type is a reference type (such as [System.Collections.ArrayList]), assigning $null will truly store $null in the variable, though later attempts to assign non-null values will again result in conversion to the constraining type.
In essence, this is the same technique used in parameter variables, and can also be used in regular variables.
With regular variables (local variables in a function or script), you must also initialize the variable in order for the type constraint to work (for the variable to even be created); e.g.:
[System.Collections.ArrayList] $alist = 1, 2
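A minimal sketch of the conversion behavior described in these notes:
[System.Collections.ArrayList] $al = @(1, 2, 3)   # the [object[]] produced by @(...) is converted to a new ArrayList
$al.GetType().Name                                # ArrayList
$before = $al
$al = @(4, 5)                                     # the constraint persists, so another new ArrayList is created
[object]::ReferenceEquals($al, $before)           # False - a different ArrayList instance now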
Applied to a simplified version of your code:
Class OrgUnit
{
[string] $name
# Type-constrain $children too, just like $name above, and initialize
# with an (initially empty) instance.
[System.Collections.ArrayList] $children = [System.Collections.ArrayList]::new()
addChild($child){
# Add a new element.
# Note the $null = ... to suppress the output from the .Add() method.
$null = $this.children.Add($child)
# (As noted, this approach is inefficient.)
# Note the required @(...) around the RHS (see notes in the last section).
# Due to its type constraint, $this.children remains a [System.Collections.ArrayList] (a new instance is created from the
# [object[]] array that @(...) outputs).
$this.children = @($this.children | Select-Object -Unique)
}
}
With the type constraint in place, the .children property now remains a [System.Collections.ArrayList]:
PS> $ou = [OrgUnit]::new(); $ou.addChild(1); $ou.children.GetType().Name
ArrayList # Proof that $children retained its type identity.
Note: The need for @(...) - to ensure an array-valued assignment value in order to successfully convert to [System.Collections.ArrayList] - is somewhat surprising, given that the following works with the similar generic list type, [System.Collections.Generic.List[object]]:
# OK: A scalar (single-object) input results in a 1-element list.
[System.Collections.Generic.List[object]] $list = 'one'
By contrast, this does not work with [System.Collections.ArrayList]:
# !! FAILS with a scalar (single object)
# Error message: Cannot convert the "one" value of type "System.String" to type "System.Collections.ArrayList".
[System.Collections.ArrayList] $list = 'one'
# OK
# Forcing the RHS to an array ([object[]]) fixes the problem.
[System.Collections.ArrayList] $list = @('one')
Try this one:
Add-Type -AssemblyName System.Collections
Class OrgUnit
{
[String]$name
$parents
$children
$members
OrgUnit($name){
$this.name = $name
$this.parents = [System.Collections.Generic.List[object]]::new()
$this.children = [System.Collections.Generic.List[object]]::new()
$this.members = [System.Collections.Generic.List[object]]::new()
}
addChild($child){
# > $null to supress output
$tmp = $this.children.Add($child)
$this.children = [System.Collections.Generic.List[object]]@($this.children | select -Unique)
}
addParent($parent){
# > $null to supress output
$tmp = $this.parents.Add($parent)
$this.parents = [System.Collections.Generic.List[object]]@($this.parents | select -Unique)
}
addMember($member){
# > $null to supress output
$tmp = $this.members.Add($member)
$this.members = [System.Collections.Generic.List[object]]@($this.members | select -Unique)
}
}

Creating a reference to a hash table element

So, I'm trying to create a tree-type variable that I could use for data navigation. I've run into an issue while trying to use reference variables on hash tables in PowerShell. Consider the following code:
$Tree = @{ TextValue = "main"; Children = @() }
$Item = @{ TextValue = "sub"; Children = @() }
$Pointer = [ref] $Tree.Children
$Pointer.Value += $Item
$Tree
When checking the reference variable $Pointer, it shows the appropriate values, but the main variable $Tree is not affected. Is there no way to create references to a hash table element in PowerShell, or will I have to switch to a 2-dimensional array?
Edit with more info:
I've accepted Mathias' answer, as using List looks like exactly what I need, but there's a little more clarity needed on how arrays and references interact. Try this code:
$Tree1 = @()
$Pointer = $Tree1
$Pointer += 1
Write-Host "tree1 is " $Tree1
$Tree2 = @()
$Pointer = [ref] $Tree2
$Pointer.Value += 1
Write-Host "tree2 is " $Tree2
As you can see from the output, it is possible to get a reference to an array and then modify the size of the array via that reference. I thought it would also work if an array is an element of another array or a hash table, but it does not. PowerShell seems to handle those differently.
I suspect this to be an unfortunate side-effect of the way += works on arrays.
When you use += on a fixed-size array, PowerShell replaces the original array with a new (and bigger) array. We can verify that $Pointer.Value no longer references the same array with GetHashCode():
PS C:\> $Tree = @{ Children = @() }
PS C:\> $Pointer = [ref]$Tree.Children
PS C:\> $Tree.Children.GetHashCode() -eq $Pointer.Value.GetHashCode()
True
PS C:\> $Pointer.Value += "Anything"
PS C:\> $Tree.Children.GetHashCode() -eq $Pointer.Value.GetHashCode()
False
One way of going about this is to avoid using @() and +=.
You could use a List type instead:
$Tree = @{ TextValue = "main"; Children = New-Object System.Collections.Generic.List[psobject] }
$Item = @{ TextValue = "sub"; Children = New-Object System.Collections.Generic.List[psobject] }
$Pointer = [ref] $Tree.Children
$Pointer.Value.Add($Item)
$Tree
To complement Mathias R. Jessen's helpful answer:
Indeed, any array is of fixed size and cannot be extended in place (@() creates an empty [object[]] array).
+= in PowerShell quietly creates a new array, with a copy of all the original elements plus the new one(s), and assigns that to the LHS.
Your use of [ref] is pointless, because $Pointer = $Tree.Children alone is sufficient to copy the reference to the array stored in $Tree.Children.
See bottom section for a discussion of appropriate uses of [ref].
Thus, both $Tree.Children and $Pointer would then contain a reference to the same array, just as $Pointer.Value does in your [ref]-based approach.
Because += creates a new array, however, whatever is on the LHS - be it $Pointer.Value or, without [ref], just $Pointer - simply receives a new reference to the new array, whereas $Tree.Children still points to the old one.
You can verify this by using the direct way to determine whether two variables or expressions "point" to the same instance of a reference type (which all collections are):
PS> [object]::ReferenceEquals($Pointer.Value, $Tree.Children)
False
Note that [object]::ReferenceEquals() is only applicable to reference types, not value types - variables containing the latter store values directly instead of referencing data stored elsewhere.
Mathias' approach solves your problem by using a [List`1] instance instead of an array, which can be extended in place with its .Add() method, so that the reference stored in $Pointer[.Value] never needs to change and continues to refer to the same list as $Tree.Children.
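You can verify the same way that the list-based approach keeps the reference intact (a minimal sketch):
$Tree = @{ Children = [System.Collections.Generic.List[psobject]]::new() }
$Pointer = $Tree.Children                              # plain assignment suffices; no [ref] needed
$Pointer.Add('Anything')
[object]::ReferenceEquals($Pointer, $Tree.Children)    # True - still the very same list instance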
Regarding your follow-up question: appropriate uses of [ref]:
$Tree2 = @()
$Pointer = [ref] $Tree2
In this case, because [ref] is applied to a variable - as designed - it creates an effective variable alias: $Pointer.Value keeps pointing to whatever $Tree2 contains even if different data is assigned to $Tree2 later (irrespective of whether that data is a value-type or reference-type instance):
PS> $Tree2 = 'Now I am a string.'; $Pointer.Value
Now I am a string.
Also note that the typical [ref] use case is to pass variables to .NET API methods that have ref or out parameters; while you can use it with PowerShell scripts and functions too in order to pass by-reference parameters, as shown in the following example, this is best avoided:
# Works, but best avoided in PowerShell code.
PS> function foo { param([ref] $vRef) ++$vRef.Value }; $v=1; foo ([ref] $v); $v
2 # value of $v was incremented via $vRef.Value
By contrast, you cannot use [ref] to create such a persistent indirect reference to data, such as the property of an object contained in a variable, and use of [ref] is essentially pointless there:
$Tree2 = @{ prop = 'initial val' }
$Pointer = [ref] $Tree2.prop # [ref] is pointless here
Later changing $Tree2.prop is not reflected in $Pointer.Value, because $Pointer.Value statically refers to the reference originally stored in $Tree2.prop:
PS> $Tree2.prop = 'later val'; $Pointer.Value
initial val # $Pointer.Value still points to the *original* data
PowerShell should arguably prevent use of [ref] with anything that is not a variable. However, there is a legitimate - albeit exotic - "off-label" use for [ref], for facilitating updating values in the caller's scope from descendant scopes, as shown in the conceptual about_Ref help topic.
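A minimal sketch of that off-label technique (illustrative variable names):
$count = 0
$countRef = [ref] $count        # a reference to the caller-scope variable
& { $countRef.Value += 10 }     # a child scope updates the caller's value through the reference
$count                          # 10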
You can create the pointer into your tree structure:
$Tree = #{ TextValue = "main"; Children = [ref]#() }
$Item1 = #{ TextValue = "sub1"; Children = [ref]#() }
$Item2 = #{ TextValue = "sub2"; Children = [ref]#() }
$Item3 = #{ TextValue = "subsub"; Children = [ref]#() }
$Pointer = $Tree.Children
$Pointer.Value += $Item1
$Pointer.Value += $Item2
$Pointer.Value.Get(0).Children.Value += $Item3
function Show-Tree {
param ( [hashtable] $Tree )
Write-Host $Tree.TextValue
if ($Tree.Children.Value.Count -ne 0) {
$Tree.Children.Value | ForEach-Object { Show-Tree $_ }
}
}
Show-Tree $Tree
Output:
main
sub1
subsub
sub2