Static typing of Hash table members - powershell

I know PowerShell by default is dynamically typed, which makes a lot of sense for "quick and dirty" one-liners and short scripts. But I have started declaring variables by type in my longer scripts, as it avoids certain bugs. This works fine with regular variables, even when initializing as $null like this: [int]$int = $null.
However, I also use lots of hash tables to return multiple values from functions, and I would like to use static typing there too. But you can't use a similar approach like this...
$hash = @{
    [int]int = $null
    [string]string = $null
}
You can cast like this...
$hash = @{
    int = [int]$null
    string = [string]$null
}
but that still leaves the hash members dynamic, so $hash.int = 'string' is valid.
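To make the contrast concrete, here is a minimal sketch (the variable names are only illustrative). It shows that the cast fixes the type of the initial value only, whereas a typed variable keeps enforcing its constraint on later assignments.

```powershell
# Casting sets the type of the initial value only.
$hash = @{
    int    = [int]$null
    string = [string]$null
}
$before = $hash.int.GetType().Name    # Int32

# Nothing constrains the entry, so a later assignment can change its type.
$hash.int = 'string'
$after = $hash.int.GetType().Name     # String

# A typed variable, by contrast, rejects a value it cannot convert.
[int]$int = $null
try {
    $int = 'string'
    $rejected = $false
} catch {
    $rejected = $true
}
```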
I could switch to using custom objects, but I find that rather ugly, the v3 type accelerator notwithstanding. Sadly I am also pretty much locked in to supporting v2, so I feel like hash tables are still the way to go.
So, is there a way to do this in hash tables that I am missing? Or is this the reason for custom objects?

Related

Strong typing dictionary contents

As part of a refactor of a large PowerShell program from PS2.0, functions and scripting quick practices to PS5.0, classes and programming best practices, I have been moving to strong typing everywhere and finding some places where that brings up questions. The latest one being hash tables.
With both [Array] and [Hashtable] you can have a mix of content, which then makes enumerating that collection impossible to strongly type. For Arrays you have options like [String[]] or moving to Generic Lists with [System.Collections.Generic.List[String]]. But Dictionaries seem to pose a problem. I can't find a way to create a dictionary and limit its values to a particular type. Something like [System.Collections.Specialized.OrderedDictionary[Int32,Int32]] fails.
So, IS there a way to make Dictionaries and OrderedDictionaries with strong typing of the index, the value or both? And if there isn't a way, is this considered a bit of a problem that must be overcome, or not a problem and if so why is it not a problem?
You can create generic dictionaries, but as per Create new System.Collections.Generic.Dictionary object fails in PowerShell you have to use a backtick to escape the comma in the list of generic type parameters:
e.g.
PS> $dict = new-object System.Collections.Generic.Dictionary[int`,string]
Note that PowerShell will still do its best to coerce types, so it will only throw an exception if it can't convert the type:
PS> $dict.Add("1", "aaa") # equivalent to $dict.Add(1, "aaa")
And I don't believe there is a generic OrderedDictionary out-of-the-box, which is probably why that fails :-)
PS> new-object System.Collections.Specialized.OrderedDictionary[Int32`,Int32]
New-Object: Cannot find type [System.Collections.Specialized.OrderedDictionary[Int32,Int32]]: verify that the assembly containing this type is loaded.
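The coercion behaviour described above can be sketched as follows. This reuses the same New-Object call; the key "not a number" is just an illustrative value that cannot be converted to Int32.

```powershell
# Generic dictionary with Int32 keys and String values (comma escaped with a backtick).
$dict = New-Object System.Collections.Generic.Dictionary[int`,string]

# The string "1" is converted to the Int32 key 1 before the call.
$dict.Add("1", "aaa")
$keyType = @($dict.Keys)[0].GetType().Name

# A key that cannot be converted makes the method call fail.
try {
    $dict.Add("not a number", "bbb")
    $failed = $false
} catch {
    $failed = $true
}
```

So the dictionary is strongly typed from .NET's point of view, but PowerShell's argument conversion means that a wrong-but-convertible type will not raise an error.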

ArrayList .Add vs .AddRange vis-a-vis the Pipeline

Given a properly defined variable
$test = New-Object System.Collections.ArrayList
.Add pollutes the pipeline with the count of items in the array, while .AddRange does not.
$test.Add('Single') will dump the count to the console. $test.AddRange(@('Single2')) will be clean with no extra effort. Why the different behavior? Is it just an oversight, or is there some intentional behavior I am not understanding?
Given that .AddRange requires coercing to an array when not using a variable (that is already an array) I am tending towards using [void]$variable.Add('String') when I know I need to only add one item, and [void]$test.AddRange($variable) when I am adding an array to an array, even when $variable only contains, or could only contain, a single item. The [void] here isn't required, but I wonder if it's just best practice to have it, depending of course on the answer above. Or am I missing something there too?
Why the different behavior? Is it just an oversight, or is there some intentional behavior I am not understanding?
Because many years ago, someone decided that's how ArrayList should behave!
Add() returns the index at which the argument was inserted into the list, which may indeed be useful and makes sense.
With AddRange() on the other hand, it's not immediately clear why it should return anything, and if yes, what? The index of the first item in the input arguments? The last? Or should it return a variable-sized array with all the insert indices? That would be awkward! So whoever implemented ArrayList decided not to return anything at all.
In C# or VB.NET, for which ArrayList was initially designed, "polluting the pipeline" doesn't really exist as a concept: if someone invokes .Add() without assigning the result to a variable, the return value is simply discarded.
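A small sketch of the two return values, capturing them in variables so nothing reaches the console (the variable names are only for illustration):

```powershell
$test = New-Object System.Collections.ArrayList

# Add() returns the index at which the item was inserted.
$first  = $test.Add('Single')     # index 0
$second = $test.Add('Double')     # index 1

# AddRange() returns nothing, so the captured result is $null.
$range = $test.AddRange(@('Single2'))
```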
The [void] here isn't required, but I wonder if it's just best practice to have it, depending of course on the answer above. Or am I missing something there too?
No, it's completely unnecessary. AddRange() is not magically one day gonna change to output anything.
If you don't ever need to know the insert index, use a [System.Collections.Generic.List[psobject]] instead:
$list = [System.Collections.Generic.List[psobject]]::new()
# this won't return anything, no need for `[void]`
$list.Add(123)
If for some reason you must use an ArrayList, you can "silence" it by overriding the Add() method:
function New-SilentArrayList {
    # Create a new ArrayList
    $newList = [System.Collections.ArrayList]::new()
    # Create a new `Add()` method, then return the list
    $newAdd = @{
        InputObject = $newList
        MemberType = 'ScriptMethod'
        Name = 'Add'
        Value = {param($obj) $this.AddRange(@($obj))}
    }
    Write-Output $(
        Add-Member @newAdd -Force -PassThru
    ) -NoEnumerate
}
Now your ArrayList's Add() will never make a peep again!
PS C:\> $list = New-SilentArrayList
PS C:\> $list.Add(123)
PS C:\> $list
123
Apparently I didn't quite understand where you were heading.
"Add pollutes the pipeline" is, on second thought, a correct statement, but .NET methods like $variable.Add('String') do not use the PowerShell pipeline by themselves. The return value only reaches the pipeline when you leave it unassigned, at which point it is emitted as output (Write-Output is the implicit default for anything you do not assign to a variable).
The Write-Output cmdlet is typically used in scripts to display
strings and other objects on the console. However, because the default
behavior is to display the objects at the end of a pipeline, it is
generally not necessary to use the cmdlet.
The point is that the Add method of ArrayList returns an [Int32] ("The ArrayList index at which the value has been added"), while AddRange doesn't return anything. So if you don't assign the result to something (which includes $Null = $test.Add('Single')), it will indeed be output to the PowerShell pipeline.
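Where this matters most is inside a function, because the stray index becomes part of the function's output. A minimal sketch, using two hypothetical functions (Get-Leaky and Get-Clean are illustrative names, not existing commands):

```powershell
function Get-Leaky {
    $list = New-Object System.Collections.ArrayList
    $list.Add('a')            # the returned index is emitted as output
    'done'
}

function Get-Clean {
    $list = New-Object System.Collections.ArrayList
    $null = $list.Add('a')    # index discarded by assignment to $null
    [void]$list.Add('b')      # index discarded by a [void] cast
    'done'
}

$leaky = @(Get-Leaky)         # two items: the index and 'done'
$clean = @(Get-Clean)         # one item: 'done'
```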
Alternatively, consider the Add method of the List class, which also doesn't return anything; see also: ArrayList vs List<> in C#.
But in general, I recommend using native PowerShell commands that do use the pipeline.
(I can't give you a good example as it is not clear what output you expect but I noticed another question you removed and from that question, I presume that this Why should I avoid using the increase assignment operator (+=) to create a collection answer might help you further)

Is there a PowerShell equivalent for C#'s default operator, and if so what is the syntax?

In C#, it's possible to get the default value of any type using the default operator:
var i = default(int); // i == 0
// in C# 7.1+
int j = default; // j == 0
Is there a similar construct in PowerShell, and if so what is it? As far as I've been able to determine in my Googling and testing, default is only recognized by PS when present in switch blocks.
PowerShell has no direct language construct for it because it doesn't need it -- due to its loose typing you are almost never required to produce a value of a specific type and there is no support for creating generic types or functions. Untyped variables start off as $null if you do nothing special. Typed variables start off as whatever value you explicitly give them, and that's generally sufficient due to PowerShell's liberal rules for conversion ([int] "" and [int] $null are both 0).
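As a quick illustration of those conversion rules (the variable names are arbitrary), initializing typed variables from $null or an empty string yields the same values C#'s default would give for these types, except that [string] becomes an empty string rather than null:

```powershell
[int]$i    = $null    # converted to 0
[int]$j    = ''       # converted to 0
[bool]$b   = $null    # converted to $false
[string]$s = $null    # converted to an empty string
```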
Only in rare cases does this fail, like attempting to declare a variable of type DateTimeOffset, as there is no default constructor and $null or "" won't convert. Arguably, the fix there is to just explicitly construct a value using whatever the type does offer ([DateTimeOffset] $d = [DateTimeOffset]::Now, [DateTimeOffset] $d = [DateTimeOffset]::MinValue, [DateTimeOffset] $d = "0001-01-01 00:00Z"). Only in the very rare case that you have a dynamic type, and you'd like to get what C# would give you with default, would you need some special code. You can do it in pure PowerShell (well, almost, we need to call a method available since .NET 1.0):
Function Get-Default([Type] $t) { [Array]::CreateInstance($t, 1)[0] }
And then [DateTimeOffset] $d = Get-Default DateTimeOffset works (there is no way to infer the type in this case, though you are of course free to omit it from the variable).
Of course this does create a garbage array on every invocation; it does not invoke any constructors of the type itself, however. There are more involved approaches that avoid array creation, but they all involve getting complicated with generic methods (relying on LINQ) or explicitly compiling C# and aren't really worth demonstrating as they're less general. Obviously, even the function above should be used only in the unusual case where it might be needed and not as a general way of initializing variables -- typically you know the type and how to initialize it, or you don't care about the type in the first place.
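For reference, a short usage sketch of the Get-Default function from above, repeated here so the snippet is self-contained. The three types are only examples: a value type, a reference type, and the DateTimeOffset case discussed earlier.

```powershell
Function Get-Default([Type] $t) { [Array]::CreateInstance($t, 1)[0] }

# Value type: the element of a fresh Int32 array is 0.
$intDefault = Get-Default int

# Reference type: the element of a fresh String array is $null.
$strDefault = Get-Default string

# Struct without a usable conversion from $null: all fields zeroed.
[DateTimeOffset] $d = Get-Default DateTimeOffset
```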

Generic List in a Hash Table

I can define an array as a generic list like this
$array = [Collections.Generic.List[String]]@()
And I can define an element in a hash table as an array like this
$hash = @{
    array = @()
}
But I can't define an element in a hash table as a Generic List, like this
$hash = @{
    array = [Collections.Generic.List[String]]@()
}
Instead I get this error
Cannot convert the "System.Object[]" value of type "System.Object[]"
to type "System.Collections.Generic.List`1[System.String]
I have been using Generic Lists to avoid the (minor in my case, to be sure) performance issue with regularly adding to a standard array. But this is the first time I have needed to create a hash table that contains a generic list (for a complex return value).
So, first question, is this even possible? And second question, what is the difference under the hood between simply setting a variable and a hash table element?
EDIT: This is interesting. I CAN use
[System.Collections.ArrayList]@()
and it works. So, now I am curious what exactly is the difference between
[System.Collections.ArrayList]
and
[Collections.Generic.List[String]]
I guess this is the downside of being self-taught. I found a reference to [Collections.Generic.List[String]] on a blog, and maybe [System.Collections.ArrayList] is a much better answer? What I think I understand from this is that the former is specifically typed as a list of strings, while the latter is a list of generic objects, which must then be cast in use, with potential for bugs and performance issues. Still, I wonder why the typed generic doesn't work in a hash table.
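One way to sidestep the cast, offered as a sketch and not as an explanation of the error: construct the list directly instead of converting an empty array. This assumes only that New-Object can resolve the generic type name, which is quoted here so the brackets are passed through as a plain string.

```powershell
# Build the generic list with New-Object instead of casting @().
$hash = @{
    array = New-Object 'System.Collections.Generic.List[String]'
}

# List[String].Add() returns nothing, so no output needs suppressing.
$hash.array.Add('one')
$hash.array.Add('two')

$isTypedList = $hash.array -is [System.Collections.Generic.List[String]]
$count = $hash.array.Count
```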

Is there a simple way to create custom types in Powershell?

I just want to make a simple custom type from [System.Collections.ArrayList] with a shorter name, say [arrayList] or something like that, and put it into a module for convenience. I looked into Add-Type but couldn't figure out whether it fits and how to do it exactly. What I want to get is:
[ArrayList]<-[System.Collections.ArrayList] #Something like that
$myArList=New-Object ArrayList
$myArList.Add(1,2,3)
You're looking for a type accelerator.
[accelerators]::add('arrayList','System.Collections.ArrayList')
I would avoid using non-standard accelerators. PowerShell has good tab completion support for classes since at least v3.
So if you type [arraylTAB then it will complete the full name for you.
Ryan Bemrose brought up a great point; the [accelerators] type accelerator is not available by default, but you can create it like so:
$acc = [psobject].assembly.gettype("System.Management.Automation.TypeAccelerators")
$acc::Add('accelerators', $acc)
If you simply want to avoid re-typing System.Collections.ArrayList all the time, you can simply assign a type literal to a variable and use that:
$ListType = [System.Collections.ArrayList]
$MyArrayList = New-Object $ListType
# more code
$AnotherArrayList = New-Object $ListType
or, using the v5.0 new() constructor:
$MyArrayList = $ListType::new()