Creating a reference to a hash table element - powershell

So, I'm trying to create a tree-type variable that I could use for data navigation. I've run into an issue while trying to use reference variables on hash tables in PowerShell. Consider the following code:
$Tree = @{ TextValue = "main"; Children = @() }
$Item = @{ TextValue = "sub"; Children = @() }
$Pointer = [ref] $Tree.Children
$Pointer.Value += $Item
$Tree
When checking reference variable $Pointer, it shows appropriate values, but main variable $Tree is not affected. Is there no way to create references to a hash table element in PowerShell, and I'll have to switch to a 2-dimensional array?
Edit with more info:
I've accepted Mathias' answer, as using List looks like exactly what I need, but there's a little more clarity needed on how arrays and references interact. Try this code:
$Tree1 = @()
$Pointer = $Tree1
$Pointer += 1
Write-Host "tree1 is " $Tree1
$Tree2 = @()
$Pointer = [ref] $Tree2
$Pointer.Value += 1
Write-Host "tree2 is " $Tree2
As you can see from the output, it is possible to get a reference to an array and then modify the size of the array via that reference. I thought it would also work if an array is an element of another array or a hash table, but it does not. PowerShell seems to handle those differently.

I suspect this to be an unfortunate side-effect of the way += works on arrays.
When you use += on a fixed-size array, PowerShell replaces the original array with a new (and bigger) array. We can verify that $Pointer.Value no longer references the same array with GetHashCode():
PS C:\> $Tree = @{ Children = @() }
PS C:\> $Pointer = [ref]$Tree.Children
PS C:\> $Tree.Children.GetHashCode() -eq $Pointer.Value.GetHashCode()
True
PS C:\> $Pointer.Value += "Anything"
PS C:\> $Tree.Children.GetHashCode() -eq $Pointer.Value.GetHashCode()
False
One way of going about this is to avoid using @() and +=.
You could use a List type instead:
$Tree = @{ TextValue = "main"; Children = New-Object System.Collections.Generic.List[psobject] }
$Item = @{ TextValue = "sub"; Children = New-Object System.Collections.Generic.List[psobject] }
$Pointer = [ref] $Tree.Children
$Pointer.Value.Add($Item)
$Tree
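As a quick check of the list-based approach, the following sketch mirrors the code above (it assumes PowerShell 5 or later for the ::new() syntax) and confirms that $Pointer.Value and $Tree.Children still refer to the same list after .Add():

```powershell
# A List is extended in place, so the stored reference never has to change.
$Tree = @{ TextValue = 'main'; Children = [System.Collections.Generic.List[psobject]]::new() }
$Item = @{ TextValue = 'sub';  Children = [System.Collections.Generic.List[psobject]]::new() }

$Pointer = [ref] $Tree.Children
$Pointer.Value.Add($Item)

# Both expressions still refer to the same list instance.
$sameList   = [object]::ReferenceEquals($Pointer.Value, $Tree.Children)
$childCount = $Tree.Children.Count
```

Here $sameList should be True and $childCount should be 1, in contrast to the array-based GetHashCode() comparison earlier in this answer.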

To complement Mathias R. Jessen's helpful answer:
Indeed, any array is of fixed size and cannot be extended in place (@() creates an empty [object[]] array).
+= in PowerShell quietly creates a new array, with a copy of all the original elements plus the new one(s), and assigns that to the LHS.
Your use of [ref] is pointless, because $Pointer = $Tree.Children alone is sufficient to copy the reference to the array stored in $Tree.Children.
See bottom section for a discussion of appropriate uses of [ref].
Thus, both $Tree.Children and $Pointer would then contain a reference to the same array, just as $Pointer.Value does in your [ref]-based approach.
Because += creates a new array, however, whatever is on the LHS - be it $Pointer.Value or, without [ref], just $Pointer - simply receives a new reference to the new array, whereas $Tree.Children still points to the old one.
You can verify this by using the direct way to determine whether two variables or expressions "point" to the same instance of a reference type (which all collections are):
PS> [object]::ReferenceEquals($Pointer.Value, $Tree.Children)
False
Note that [object]::ReferenceEquals() is only applicable to reference types, not value types - variables containing the latter store values directly instead of referencing data stored elsewhere.
Mathias' approach solves your problem by using a [List`1] instance instead of an array, which can be extended in place with its .Add() method, so that the reference stored in $Pointer[.Value] never needs to change and continues to refer to the same list as $Tree.Children.
Regarding your follow-up question: appropriate uses of [ref]:
$Tree2 = @()
$Pointer = [ref] $Tree2
In this case, because [ref] is applied to a variable - as designed - it creates an effective variable alias: $Pointer.Value keeps pointing to whatever $Tree2 contains even if different data is assigned to $Tree2 later (irrespective of whether that data is a value-type or reference-type instance):
PS> $Tree2 = 'Now I am a string.'; $Pointer.Value
Now I am a string.
Also note that the typical [ref] use case is to pass variables to .NET API methods that have ref or out parameters; while you can also use it with PowerShell scripts and functions in order to pass by-reference parameters, as shown in the following example, this is best avoided:
# Works, but best avoided in PowerShell code.
PS> function foo { param([ref] $vRef) ++$vRef.Value }; $v=1; foo ([ref] $v); $v
2 # value of $v was incremented via $vRef.Value
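For comparison, here is a minimal sketch of the typical use case mentioned above, a .NET method with an out parameter. [int]::TryParse() is a real .NET method; the variable names are only illustrative:

```powershell
# Typical [ref] use: pass a variable to a .NET method that has an out parameter.
$number = 0
$ok = [int]::TryParse('42', [ref] $number)

# $ok reports whether parsing succeeded; $number receives the parsed value.
```

Because [ref] is applied to a variable here, the method can write its result straight into $number.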
By contrast, you cannot use [ref] to create such a persistent indirect reference to data, such as the property of an object contained in a variable, and use of [ref] is essentially pointless there:
$Tree2 = @{ prop = 'initial val' }
$Pointer = [ref] $Tree2.prop # [ref] is pointless here
Later changing $Tree2.prop is not reflected in $Pointer.Value, because $Pointer.Value statically refers to the reference originally stored in $Tree2.prop:
PS> $Tree2.prop = 'later val'; $Pointer.Value
initial val # $Pointer.Value still points to the *original* data
PowerShell should arguably prevent use of [ref] with anything that is not a variable. However, there is a legitimate - albeit exotic - "off-label" use for [ref], for facilitating updating values in the caller's scope from descendant scopes, as shown in the conceptual about_Ref help topic.
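A rough sketch of that "off-label" pattern, assuming a hypothetical function name Step-Counter: the value is wrapped in [ref] once, and descendant scopes then update it through .Value instead of assigning to a variable.

```powershell
# Wrap a value in [ref] so that descendant scopes can update it.
$counter = [ref] 0

function Step-Counter {
    # $counter is found in the parent scope; only its .Value property is
    # modified, so no new local variable is created in this scope.
    $counter.Value++
}

Step-Counter
Step-Counter
$total = $counter.Value
```

A plain `$counter++` inside the function would instead create a local copy and leave the caller's variable untouched.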

You can create the pointer into your tree structure:
$Tree = @{ TextValue = "main"; Children = [ref]@() }
$Item1 = @{ TextValue = "sub1"; Children = [ref]@() }
$Item2 = @{ TextValue = "sub2"; Children = [ref]@() }
$Item3 = @{ TextValue = "subsub"; Children = [ref]@() }
$Pointer = $Tree.Children
$Pointer.Value += $Item1
$Pointer.Value += $Item2
$Pointer.Value.Get(0).Children.Value += $Item3
function Show-Tree {
param ( [hashtable] $Tree )
Write-Host $Tree.TextValue
if ($Tree.Children.Value.Count -ne 0) {
$Tree.Children.Value | ForEach-Object { Show-Tree $_ }
}
}
Show-Tree $Tree
Output:
main
sub1
subsub
sub2

Related

Powershell passing multiple parameters from one script to another [duplicate]

I've seen the @ symbol used in PowerShell to initialise arrays.
What exactly does the @ symbol denote and where can I read more about it?
In PowerShell V2, @ is also the Splat operator.
PS> # First use it to create a hashtable of parameters:
PS> $params = @{path = "c:\temp"; Recurse= $true}
PS> # Then use it to SPLAT the parameters - which is to say to expand a hash table
PS> # into a set of command line parameters.
PS> dir @params
PS> # That was the equivalent of:
PS> dir -Path c:\temp -Recurse:$true
PowerShell will actually treat any comma-separated list as an array:
"server1","server2"
So the @ is optional in those cases. However, for associative arrays, the @ is required:
@{"Key"="Value";"Key2"="Value2"}
Officially, @ is the "array operator." You can read more about it in the documentation that installed along with PowerShell, or in a book like "Windows PowerShell: TFM," which I co-authored.
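To make the distinction concrete, here is a small sketch (variable names are illustrative only) showing that the comma operator alone builds an array, while a hashtable literal needs @{ }:

```powershell
$servers = "server1", "server2"      # the comma operator alone builds an array
$single  = , "server1"               # unary comma: a one-element array
$table   = @{ "Key" = "Value" }      # a hashtable literal requires @{ }

$serversType = $servers.GetType().Name
```

$serversType should report Object[], the same type that @("server1", "server2") would produce.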
While the above responses provide most of the answer it is useful--even this late to the question--to provide the full answer, to wit:
Array sub-expression (see about_arrays)
Forces the value to be an array, even if a singleton or a null, e.g. $a = @(ps | where name -like 'foo')
Hash initializer (see about_hash_tables)
Initializes a hash table with key-value pairs, e.g.
$HashArguments = @{ Path = "test.txt"; Destination = "test2.txt"; WhatIf = $true }
Splatting (see about_splatting)
Lets you invoke a cmdlet with parameters from an array or a hash table rather than the more customary individually enumerated parameters, e.g. using the hash table just above, Copy-Item @HashArguments
Here strings (see about_quoting_rules)
Lets you create strings with easily embedded quotes, typically used for multi-line strings, e.g.:
$data = @"
line one
line two
something "quoted" here
"@
Because this type of question (what does 'x' notation mean in PowerShell?) is so common here on StackOverflow as well as in many reader comments, I put together a lexicon of PowerShell punctuation, just published on Simple-Talk.com. Read all about @ as well as % and # and $_ and ? and more at The Complete Guide to PowerShell Punctuation. Attached to the article is a wallchart that gives you everything on a single sheet.
You can also wrap the output of a cmdlet (or pipeline) in @() to ensure that what you get back is an array rather than a single item.
For instance, dir usually returns a list, but depending on the options, it might return a single object. If you are planning on iterating through the results with a foreach-object, you need to make sure you get a list back. Here's a contrived example:
$results = @( dir c:\autoexec.bat)
One more thing... an empty array (like to initialize a variable) is denoted @().
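The effect of wrapping a pipeline in @() can be seen in this small sketch, which uses a number range in place of dir so that it runs anywhere:

```powershell
$one  = 1..5 | Where-Object { $_ -eq 3 }      # single match: a scalar, not an array
$arr  = @(1..5 | Where-Object { $_ -eq 3 })   # same pipeline wrapped: a 1-element array
$none = @(1..5 | Where-Object { $_ -gt 9 })   # no matches: an empty array, not $null
```

With @(), code that indexes or counts the result behaves the same way whether the pipeline returned zero, one, or many objects.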
The Splatting Operator
To create an array, we create a variable and assign the array. Arrays are noted by the "@" symbol. Let's take the discussion above and use an array to connect to multiple remote computers:
$strComputers = @("Server1", "Server2", "Server3")
They are used for arrays and hashes.
PowerShell Tutorial 7: Accumulate, Recall, and Modify Data
Array Literals In PowerShell
I hope this helps to understand it a bit better.
You can store "values" within a key and return that value to do something.
In this case I have just provided @{a="";b="";c="";} and if not in the options i.e "keys" (a, b or c) then don't return a value
$array = @{
a = "test1";
b = "test2";
c = "test3"
}
foreach($elem in $array.GetEnumerator()){
if ($elem.key -eq "a"){
$key = $elem.key
$value = $elem.value
}
elseif ($elem.key -eq "b"){
$key = $elem.key
$value = $elem.value
}
elseif ($elem.key -eq "c"){
$key = $elem.key
$value = $elem.value
}
else{
Write-Host "No other value"
}
Write-Host "Key: " $key "Value: " $value
}
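If the goal is only to fetch the value for a known key, a direct lookup is usually enough. The following is a sketch of that alternative rather than a rewrite of the loop above; Get-TableValue is a hypothetical helper name:

```powershell
$table = @{ a = 'test1'; b = 'test2'; c = 'test3' }

function Get-TableValue {            # hypothetical helper, for illustration only
    param([hashtable] $Table, [string] $Key)
    # Return the stored value if the key exists, otherwise $null.
    if ($Table.ContainsKey($Key)) { $Table[$Key] } else { $null }
}

$found   = Get-TableValue -Table $table -Key 'b'
$missing = Get-TableValue -Table $table -Key 'z'
```

$found should hold test2, while $missing stays $null because z is not one of the keys.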

Make select -unique on arraylist return arraylist instead of string

I have three arraylists in the class below. I want to keep them unique. However, if there's only one item (string) in the arraylist and you use select -unique (or any other method to achieve this), it will return the string instead of a list of strings. Surrounding it with @() also doesn't work because that transforms it to an array instead of an arraylist, which I can't add stuff to.
Any suggestions that are still performant? I tried HashSets before but somehow had horrible experiences with those. See my previous post on the HashSet issue for that.
Code below:
Class OrgUnit
{
[String]$name
$parents
$children
$members
OrgUnit($name){
$this.name = $name
$this.parents = New-Object System.Collections.ArrayList
$this.children = New-Object System.Collections.ArrayList
$this.members = New-Object System.Collections.ArrayList
}
addChild($child){
# > $null to suppress output
$tmp = $this.children.Add($child)
$this.children = $this.children | select -Unique
}
addParent($parent){
# > $null to suppress output
$tmp = $this.parents.Add($parent)
$this.parents = $this.parents | select -Unique
}
addMember($member){
# > $null to suppress output
$tmp = $this.members.Add($member)
$this.members = $this.members | select -Unique
}
}
You're adding a new item to the array, then selecting unique items from it, and reassigning it every time you add a member. This is extremely inefficient; maybe try the following instead:
if (-not $this.parents.Contains($parent)) {
$this.parents.Add($parent) | out-null
}
This would be much faster, even with Out-Null, which is the least efficient way to suppress output.
Check with .Contains() if the item is already added, so you don't have to eliminate duplicates with Select-Object -Unique afterwards all the time.
if (-not $this.children.Contains($child))
{
[System.Void]($this.children.Add($child))
}
As has been pointed out, it's worth changing your approach due to its inefficiency:
Instead of blindly appending and then possibly removing the new element if it turns out to be duplicate with Select-Object -Unique, use a test to decide whether an element needs to be appended or is already present.
Patrick's helpful answer is a straightforward implementation of this optimized approach that will greatly speed up your code and should perform acceptably unless the array lists get very large.
As a side effect of this optimization - because the array lists are only ever modified in-place with .Add() - your original problem goes away.
To answer the question as asked:
Simply type-constrain your (member) variables if you want them to retain a given type even during later assignments.
That is, just as you did with $name, place the type you want the member to be constrained to the left of the member variable declarations:
[System.Collections.ArrayList] $parents
[System.Collections.ArrayList] $children
[System.Collections.ArrayList] $members
However, that will initialize these member variables to $null, which means you won't be able to just call .Add() in your .add*() methods; therefore, construct an (initially empty) instance as part of the declaration:
[System.Collections.ArrayList] $parents = [System.Collections.ArrayList]::new()
[System.Collections.ArrayList] $children = [System.Collections.ArrayList]::new()
[System.Collections.ArrayList] $members = [System.Collections.ArrayList]::new()
Also, you do have to use @(...) around your Select-Object -Unique pipeline; while that indeed outputs an array (type [object[]]), the type constraint causes that array to be converted to a [System.Collections.ArrayList] instance, as explained below.
The need for @(...) is somewhat surprising - see bottom section.
Notes on type constraints:
If you assign a value that isn't already of the type that the variable is constrained to, PowerShell attempts to convert it to that type; you can think of it as implicitly performing a cast to the constraining type on every assignment:
This can fail, if the assigned value simply isn't convertible; PowerShell's type conversions are generally very flexible, however.
In the case of collection-like types such as [System.Collections.ArrayList], any other collection-like type can be assigned, such as the [object[]] arrays returned by @(...) (PowerShell's array-subexpression operator). Note that, of necessity, this involves constructing a new [System.Collections.ArrayList] every time, which becomes, loosely speaking, a shallow clone of the input collection.
Pitfalls re assigning $null:
If the constraining type is a value type (if its .IsValueType property reports $true), assigning $null will result in the type's default value; e.g., after executing [int] $i = 42; $i = $null, $i isn't $null, it is 0.
If the constraining type is a reference type (such as [System.Collections.ArrayList]), assigning $null will truly store $null in the variable, though later attempts to assign non-null values will again result in conversion to the constraining type.
In essence, this is the same technique used in parameter variables, and can also be used in regular variables.
With regular variables (local variables in a function or script), you must also initialize the variable in order for the type constraint to work (for the variable to even be created); e.g.:
[System.Collections.ArrayList] $alist = 1, 2
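The notes on type constraints can be tried out with a short sketch like the following (variable names are illustrative), which covers both the conversion on assignment and the $null pitfalls:

```powershell
[int] $i = 42
$i = $null                      # value type: $null is converted to the default value

[System.Collections.ArrayList] $alist = 1, 2
$alist = 3, 4, 5                # the [object[]] array is converted to a new ArrayList
$typeAfterAssign = $alist.GetType().Name

$alist = $null                  # reference type: $null is stored as-is
$isNull = $null -eq $alist
```

After running this, $i should be 0 rather than $null, $typeAfterAssign should be ArrayList, and $isNull should be True.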
Applied to a simplified version of your code:
Class OrgUnit
{
[string] $name
# Type-constrain $children too, just like $name above, and initialize
# with an (initially empty) instance.
[System.Collections.ArrayList] $children = [System.Collections.ArrayList]::new()
addChild($child){
# Add a new element.
# Note the $null = ... to suppress the output from the .Add() method.
$null = $this.children.Add($child)
# (As noted, this approach is inefficient.)
# Note the required @(...) around the RHS (see notes in the last section).
# Due to its type constraint, $this.children remains a [System.Collections.ArrayList] (a new instance is created from the
# [object[]] array that @(...) outputs).
$this.children = @($this.children | Select-Object -Unique)
}
}
With the type constraint in place, the .children property now remains a [System.Collections.ArrayList]:
PS> $ou = [OrgUnit]::new(); $ou.addChild(1); $ou.children.GetType().Name
ArrayList # Proof that $children retained its type identity.
Note: The need for @(...) - to ensure an array-valued assignment value in order to successfully convert to [System.Collections.ArrayList] - is somewhat surprising, given that the following works with the similar generic list type, [System.Collections.Generic.List[object]]:
# OK: A scalar (single-object) input results in a 1-element list.
[System.Collections.Generic.List[object]] $list = 'one'
By contrast, this does not work with [System.Collections.ArrayList]:
# !! FAILS with a scalar (single object)
# Error message: Cannot convert the "one" value of type "System.String" to type "System.Collections.ArrayList".
[System.Collections.ArrayList] $list = 'one'
# OK
# Forcing the RHS to an array ([object[]]) fixes the problem.
[System.Collections.ArrayList] $list = @('one')
Try this one:
Add-Type -AssemblyName System.Collections
Class OrgUnit
{
[String]$name
$parents
$children
$members
OrgUnit($name){
$this.name = $name
$this.parents = [System.Collections.Generic.List[object]]::new()
$this.children = [System.Collections.Generic.List[object]]::new()
$this.members = [System.Collections.Generic.List[object]]::new()
}
addChild($child){
# > $null to suppress output
$tmp = $this.children.Add($child)
$this.children = [System.Collections.Generic.List[object]]@($this.children | select -Unique)
}
addParent($parent){
# > $null to suppress output
$tmp = $this.parents.Add($parent)
$this.parents = [System.Collections.Generic.List[object]]@($this.parents | select -Unique)
}
addMember($member){
# > $null to suppress output
$tmp = $this.members.Add($member)
$this.members = [System.Collections.Generic.List[object]]@($this.members | select -Unique)
}
}

What is '@{}' meaning in PowerShell

I have line of scripts for review here, I noticed variable declaration with a value:
function readConfig {
Param([string]$fileName)
$config = @{}
Get-Content $fileName | Where-Object {
$_ -like '*=*'
} | ForEach-Object {
$key, $value = $_ -split '\s*=\s*', 2
$config[$key] = $value
}
return $config
}
I wonder what @{} means in $config = @{}?
@{} in PowerShell defines a hashtable, a data structure for mapping unique keys to values (in other languages this data structure is called "dictionary" or "associative array").
@{} on its own defines an empty hashtable, that can then be filled with values, e.g. like this:
$h = @{}
$h['a'] = 'foo'
$h['b'] = 'bar'
Hashtables can also be defined with their content already present:
$h = @{
'a' = 'foo'
'b' = 'bar'
}
Note, however, that when you see similar notation in PowerShell output, e.g. like this:
abc: 23
def: @{"a"="foo";"b"="bar"}
that is usually not a hashtable, but the string representation of a custom object.
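This can be reproduced with a short sketch (variable names are illustrative): converting a custom object to a string yields the @{...} notation even though the object is not a hashtable.

```powershell
$obj = [pscustomobject]@{ a = 'foo'; b = 'bar' }

$asString    = "$obj"                  # string representation of the custom object
$isHashtable = $obj -is [hashtable]    # it is not actually a hashtable
```

$asString should read @{a=foo; b=bar}, while $isHashtable is False.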
The meaning of the @{}
can be seen in different ways.
If the @{} is empty, an empty hash table is defined.
But if there is something between the curly brackets it can be used in the context of a splatting operation.
Hash Table
Splatting
I think there is no need to explain what a hash table is.
Splatting is a method of passing a collection of parameter values to a command as unit.
$prints = @{
Name = "John Doe"
Age = 18
Haircolor = "Red"
}
Write-Host @prints
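Splatting is most useful when the hashtable keys match parameter names of the target command. As a minimal sketch, here is the same idea with Select-Object and its -First and -Skip parameters (the choice of cmdlet is mine, not from the answer above):

```powershell
# Keys must match parameter names of the command being called.
$selectArgs = @{ First = 2; Skip = 1 }

# Equivalent to: 1..5 | Select-Object -First 2 -Skip 1
$result = 1..5 | Select-Object @selectArgs
```

$result should contain 2 and 3: the first element is skipped and the next two are returned.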
Hope it helps! BR
Edit:
Regarding the updated code from the questioner the answer is
It defines an empty hash table.
Be aware that Get-Content has its own parameters!
THE MOST IMPORTANT 1:
[-Raw]
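To see what -Raw changes, here is a small sketch that writes a scratch file (the temp file and its contents are made up for the demonstration) and reads it both ways:

```powershell
$file = [System.IO.Path]::GetTempFileName()   # scratch file for the demo
Set-Content -Path $file -Value 'a = 1', 'b = 2'

$lines = Get-Content -Path $file         # default: one string per line
$whole = Get-Content -Path $file -Raw    # -Raw: the whole file as a single string

Remove-Item -Path $file
```

A line-by-line pipeline such as the one in readConfig relies on the default behaviour; with -Raw, Where-Object and ForEach-Object would receive the file as one string.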

Powershell reference type memory consumption

A PowerShell reference consumes a lot more memory than simply storing the object, which is weird; it is supposed to consume less.
Store with object
- it consumes less memory
Store with ref of object
- it consumes 2x the memory compared with not using ref
I guess it is the class causing this, but I don't know why.
class LinkedListNode {
$value
$next = @()
$previous = @()
LinkedListNode($value) {
$this.value = $value
}
}
class test {
$hash = @{}
[object] Append($value) {
$newNode = New-Object LinkedListNode $value
$newNode.previous = $null
$newNode.next = $null
$this.hash.Add($value, [ref] $newNode) # with ref
# $this.hash.Add($value, $newNode) # with object
return $this
}
}
$t = [test]::new()
for ($i = 0; $i -lt 30000; $i++) {
$t.Append($i)
}
For the code below, ref consumes less memory, which I think is the usual case.
for ($i = 0; $i -lt 30000; $i++) {
$testObject = New-Object -TypeName PSObject -Property @{
'forTest' = "test"
}
$test.Add($i, [ref] $testObject) # with ref
# $test.Add($i, $testObject) # with object
}
This is a tricky one because references in PowerShell aren't the same as C++ references, and don't do what you think they do. Basically, when you read about_Ref it indicates that it treats Variables and Objects differently.
Passing a variable of type int can be by reference or by value.
Passing an Object is always by reference.
What that means, is that the example you used:
The "by Object" used a true reference.
The "by Ref" actually wrapped the LinkedListNode object in a System.Management.Automation.PSReference object. This System.Management.Automation.PSReference object takes up some space, and due to the small object sizes, made it "seem" like it took up twice the memory.
[ref] is meant for interacting with .NET methods that require it; see "[ref] doesn't work with class members" and "Restrict use of [ref] to variables".
Also, using [ref] with functions in PowerShell:
When passing a variable by reference, the function can change the data
and that change persists after the function executes.
This is different than how C++ would use references.
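The wrapping described above can be observed directly. In this sketch (names are illustrative), the same hashtable is stored once as-is and once via [ref]:

```powershell
$node = @{ data = 1 }

$direct  = @{ n = $node }          # stores the reference to $node itself
$wrapped = @{ n = [ref] $node }    # stores an extra PSReference wrapper object

$direct.n.data = 2                 # the change is visible through $node: same object

$isWrapper = $wrapped.n -is [System.Management.Automation.PSReference]
$sameObj   = [object]::ReferenceEquals($direct.n, $node)
```

Both $isWrapper and $sameObj should be True: storing the object directly already shares it by reference, and [ref] only adds one more object per entry.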

Splatting after passing hashtable by reference in Powershell

I ran into a snag when I passed a hash table by reference to a function for splatting purposes. How can I fix this?
Function AllMyChildren {
param (
[ref]$ReferenceToHash
)
get-childitem @ReferenceToHash.Value
# etc.etc.
}
$MyHash = @{
'path' = '*'
'include' = '*.ps1'
'name' = $null
}
AllMyChildren ([ref]$MyHash)
Result: an error ("Splatted variables cannot be used as part of a property or array expression. Assign the result of the expression to a temporary variable then splat the temporary variable instead.").
Tried to do this:
$newVariable = $ReferenceToHash.Value
get-childitem @NewVariable
That did work and seemed right per the error message. Is it the preferred syntax in a case like this?
1) Passing hashtables (or any instances of classes, i.e. reference types) with [ref] makes no sense because they are always passed by reference themselves. [ref] is used with value types (scalars and instances of structures).
2) The splatting operator can be applied to a variable directly, not an expression.
Thus, in order to resolve the problem simply pass the hashtable in the function as it is:
Function AllMyChildren {
param (
[hashtable]$ReferenceToHash # it is a reference itself
)
get-childitem @ReferenceToHash
# etc.etc.
}
$MyHash = @{
'path' = '*'
'include' = '*.ps1'
'name' = $null
}
AllMyChildren $MyHash
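To illustrate point 1, this sketch (Add-Marker is a hypothetical helper) shows that a hashtable passed as an ordinary parameter can be modified by the function, with the change visible to the caller:

```powershell
function Add-Marker {            # hypothetical helper, for illustration only
    param([hashtable] $Table)
    $Table['marker'] = $true     # mutates the caller's hashtable; no [ref] needed
}

$MyHash = @{ path = '*' }
Add-Marker $MyHash
$hasMarker = $MyHash.ContainsKey('marker')
```

$hasMarker should be True, because the function received a reference to the same hashtable rather than a copy.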