Generic List in a Hash Table - powershell

I can define an array as a generic list like this
$array = [Collections.Generic.List[String]]@()
And I can define an element in a hash table as an array like this
$hash = @{
    array = @()
}
But I can't define an element in a hash table as a Generic List, like this
$hash = @{
    array = [Collections.Generic.List[String]]@()
}
Instead I get this error
Cannot convert the "System.Object[]" value of type "System.Object[]"
to type "System.Collections.Generic.List`1[System.String]".
I have been using Generic Lists to avoid the (minor in my case, to be sure) performance issue with regularly adding to a standard array. But this is the first time I have needed to create a hash table that contains a generic list (for a complex return value).
So, first question: is this even possible? And second question: what is the difference under the hood between simply setting a variable and setting a hash table element?
EDIT: This is interesting. I CAN use
[System.Collections.ArrayList]@()
and it works. So, now I am curious what exactly is the difference between
[System.Collections.ArrayList]
and
[Collections.Generic.List[String]]
I guess this is the downside of being self-taught. I found a reference to [Collections.Generic.List[String]] on a blog, and maybe [System.Collections.ArrayList] is a much better answer? What I think I understand from this is that the former is specifically typed as a list of strings, while the latter is a list of untyped objects, which must then be cast on use, with potential bug and performance issues. Still, I wonder why the typed generic doesn't work in a hash table.
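One way to sidestep the cast altogether is to construct the list with New-Object inside the hash table. The sketch below is not a diagnosis of the conversion error above; it only illustrates the typed-versus-untyped distinction described in the edit. It uses List[int] rather than List[String] because PowerShell converts almost anything to a string, which would hide the difference. All variable names are illustrative.

```powershell
# Sketch: a hash table element that holds a generic list, built with
# New-Object rather than by casting @().
$hash = @{
    array = New-Object 'System.Collections.Generic.List[int]'
}
$hash.array.Add(1)
$hash.array.Add('2')            # PowerShell converts the string '2' to the int 2

# A typed list rejects values that cannot be converted to its element type.
$rejected = $false
try {
    $hash.array.Add('not a number')
} catch {
    $rejected = $true
}

# An ArrayList stores plain objects, so it accepts anything.
$loose = New-Object System.Collections.ArrayList
[void]$loose.Add(1)             # Add returns the new index; [void] discards it
[void]$loose.Add('not a number')
```

After this runs, $hash.array holds the two integers 1 and 2, $rejected is $true, and $loose holds both the integer and the string.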

Related

Strong typing dictionary contents

As part of a refactor of a large PowerShell program from PS2.0, functions and scripting quick practices to PS5.0, classes and programming best practices, I have been moving to strong typing everywhere and finding some places where that brings up questions. The latest one being hash tables.
With both [Array] and [Hashtable] you can have a mix of content, which then makes enumerating that collection impossible to strongly type. For Arrays you have options like [String[]] or moving to Generic Lists with [System.Collections.Generic.List[String]]. But Dictionaries seem to pose a problem. I can't find a way to create a dictionary and limit its values to a particular type. Something like [System.Collections.Specialized.OrderedDictionary[Int32,Int32]] fails.
So, IS there a way to make Dictionaries and OrderedDictionaries with strong typing of the index, the value or both? And if there isn't a way, is this considered a bit of a problem that must be overcome, or not a problem and if so why is it not a problem?
You can create generic dictionaries, but as per Create new System.Collections.Generic.Dictionary object fails in PowerShell you have to use a backtick to escape the comma in the list of generic type parameters:
e.g.
PS> $dict = new-object System.Collections.Generic.Dictionary[int`,string]
Note that PowerShell will still do its best to coerce types, so it will only throw an exception if it can't convert the type:
PS> $dict.Add("1", "aaa") # equivalent to $dict.Add(1, "aaa")
And I don't believe there is a generic OrderedDictionary out-of-the-box, which is probably why that fails :-)
PS> new-object System.Collections.Specialized.OrderedDictionary[Int32`,Int32]
New-Object: Cannot find type [System.Collections.Specialized.OrderedDictionary[Int32,Int32]]: verify that the assembly containing this type is loaded.
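A small sketch of the coercion behaviour described above, assuming only the built-in Dictionary type. Quoting the type name is an alternative to escaping the comma with a backtick. The variable names are illustrative.

```powershell
# Quoting the whole type name avoids having to escape the comma.
$dict = New-Object 'System.Collections.Generic.Dictionary[int,string]'

$dict.Add('1', 'aaa')           # the string '1' is converted to the int key 1

# A key that cannot be converted to int makes the call fail.
$failed = $false
try {
    $dict.Add('abc', 'bbb')
} catch {
    $failed = $true
}
```

Only the first Add succeeds, so the dictionary ends up with the single key 1.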

Static typing of Hash table members

I know PowerShell by default is dynamically typed, which makes a lot of sense for "quick and dirty" one liners and short scripts. But I have started trying to declare variables by type in my long scripts, as it avoids certain bugs. This works fine with regular variables, even when initializing as $null like this [int]$int = $null.
However, I also use lots of hash tables to return multiple values from functions, and I would like to use static typing there too. But you can't use a similar approach like this...
$hash = @{
    [int]int = $null
    [string]string = $null
}
You can cast like this...
$hash = @{
    int = [int]$null
    string = [string]$null
}
but that still leaves the hash members dynamic, so $hash.int = 'string' is valid.
I could switch to using custom objects, but I find that rather ugly, the v3 type accelerator notwithstanding. Sadly I am also pretty much locked in to supporting v2, so I feel like hash tables are still the way to go.
So, is there a way to do this in hash tables that I am missing? Or is this the reason for custom objects?

Select item from PowerShell hashtable without using Foreach

I have this list of PSObjects, each of which contains a Hashtable. Currently I can get it out like this:
foreach ($item in $myListOfItems) { $item.Metadata["Title"] }
However, I am wondering if I can do it somehow with piping and Select. Is this possible? For example:
$myListOfItems | Select $_.Metadata["Title"]
...which only outputs a whole bunch of blank lines :(
Any ideas? Thanks so much in advance!
What about
$myListOfItems | select @{ Label="Title";Expression={$_.Metadata["Title"]}}
The original idea was close but just got the syntax wrong. To use $_, the expression must be within a script-block parameter (PowerShell V3+):
$myListOfItems | select { $_.Metadata["title"] }
Without the enclosing script-block, $_ is evaluated in the context of the whole pipeline instead of the pipeline element (Select-Object in this case). At this level, $_ is not defined so evaluates to $null. PowerShell is somewhat forgiving about using $null inappropriately. In particular, attempting to retrieve a property of $null is not an error, it just returns $null. So $null.Metadata["title"] → $null["title"] → $null. Select-Object $null just outputs an empty selection object for each input object thus resulting in the blank lines.
It should be noted that, strictly speaking, neither the originally suggested Select-Object (as fixed) nor the hash table version in the answer provide the same output as the ForEach loop. The loop produces a sequence of values from the .Metadata["Title"] members while Select-Object produces a sequence of selection objects [Selected.System.Collections.Hashtable] each containing a member (named $_.Metadata["Title"] and Title respectively) which holds the value. Whether this causes a problem depends on the usage.
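To make the difference concrete, here is a sketch with made-up sample data standing in for $myListOfItems. ForEach-Object is used as a pipeline form that emits the bare values, as the foreach loop does.

```powershell
# Hypothetical sample data: two objects, each with a Metadata hashtable.
$myListOfItems = @(
    New-Object PSObject -Property @{ Metadata = @{ Title = 'First' } }
    New-Object PSObject -Property @{ Metadata = @{ Title = 'Second' } }
)

# Select-Object wraps each value in a new object that has a Title property.
$wrapped = $myListOfItems |
    Select-Object @{ Label = 'Title'; Expression = { $_.Metadata['Title'] } }

# ForEach-Object emits the values themselves, with no wrapper object.
$values = $myListOfItems | ForEach-Object { $_.Metadata['Title'] }
```

Here $wrapped[0].Title and $values[0] both give 'First', but only $values[0] is itself a string.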
For a really short version, try:
$myListOfItems.Metadata.Title
Since PowerShell V3, specifying a non-existent member of a collection causes PowerShell to perform a member enumeration†, whereby it enumerates the collection and returns a sequence of values of the specified member of each element of the collection (assuming there is one). If we assume left to right evaluation then the expression would be evaluated as:
($myListOfItems.Metadata).Title
First, the collection of items is enumerated, as the collection itself has no Metadata member. Each item does have a Metadata member, which results in an array of the Metadata values. Then this collection of Metadata hashtables, having no Title member, is enumerated to produce an array of the Title elements of the hashtables.
† I think the term member enumeration is somewhat misleading. It is not the members which are being enumerated but the objects in the collection, and an array of the specified member of each object is then created. Indeed, even though the manual about_Hash_Tables states that the hash table has properties named after the keys with corresponding values, this is actually just PowerShell shorthand for invoking the indexer [] (which actually appears to override member access, so beware of hash tables with keys "Count", "Keys", "Values", ... if you want to access the actual hash table members of these names). So, strictly speaking, there is no 'Title' member. The term "enumerated collection member access" (ECMA ;) would better describe what is happening, but after all "A rose by any other name".
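The two-step evaluation can be checked with a small sketch, assuming PowerShell 3.0 or later (needed for both member enumeration and [PSCustomObject]). The sample data is made up.

```powershell
# Hypothetical sample data: two objects, each with a Metadata hashtable.
$items = @(
    [PSCustomObject]@{ Metadata = @{ Title = 'First' } }
    [PSCustomObject]@{ Metadata = @{ Title = 'Second' } }
)

# Step 1: the array has no Metadata member, so each element's Metadata is collected.
$metadata = $items.Metadata

# Step 2: the resulting array of hashtables is enumerated again for Title.
$titles = $items.Metadata.Title
```

$metadata is an array of the two hashtables, and $titles holds the two title strings in order.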

List of Tuple<int,string> in PowerShell

I want to create a List<Tuple<int,string>> in PowerShell, but
New-Object System.Collections.Generic.List[System.Collections.Generic.Tuple[int,string]]
does not work. What am I missing?
Lee's answer is the correct way to create a List of Tuples (although you can make the statement much shorter by omitting the System namespace). However, the better questions to ask while programming in PowerShell are:
Why should I return a strongly typed object?
Do I really want to output a list?
The first one has its pros and cons. Strongly typed objects are useful to return if they have methods or events that will be useful for the next step in the pipeline. If, on the other hand, you just want to return a bunch of items with a name and an int, I'd use something like:
[PSCustomObject]@{string="string";int=1}
This will create a property bag (what most devs know as a Tuple) containing the data you need, with more descriptive names than a .NET tuple will give you on the object. It's also pretty fast. If, on the other hand, the data is meant for an API, then by all means create the strongly typed object it expects.
The second question is a little bit harder to understand but has a much clearer answer. In many cases, you'll want to accept input for another function from the output of one function. In this, for many reasons, a strongly typed list is not your best friend. Strongly typed lists do not always cleanly convert into arrays (this is especially true for generics), and, as arguments to a function, severely limit the different types of data you can put into the function. They also end up providing somewhat misleading and harder to use output (especially when piping in objects and producing multiple results), since the whole list will be displayed as one outputted item, instead of displaying each item on its own. Most annoyingly, strongly typed lists behave differently than arrays in PowerShell when you "over-index" (i.e. ask for item 10000 in a list of 5 items): arrays will quietly return null, while lists will barf loudly. More practically, accumulating items into a list and then outputting the list will "hold" the pipeline until all items are in. This may be what you want, but in most cases it's nice to see output coming out of a function as it runs. Finally, lists add to the memory overhead of the function, as you need to accumulate a set of objects in the function's stack.
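The over-indexing difference can be seen in a short sketch. It assumes strict mode is not enabled, since Set-StrictMode can make the array case an error as well. Variable names are illustrative.

```powershell
$array = 1..5
$list = New-Object 'System.Collections.Generic.List[int]'
$list.AddRange([int[]](1..5))

# Indexing past the end of an array yields $null without an error.
$fromArray = $array[10000]

# The same index on a generic list raises an out-of-range error.
$listThrew = $false
try {
    $null = $list[10000]
} catch {
    $listThrew = $true
}
```

With strict mode off, $fromArray is $null and $listThrew ends up $true.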
What I generally do is simply emit multiple objects. That is, I avoid using the return keyword and I take advantage of PowerShell's ability to return objects that are not captured into a variable. If I assign the result into a variable, the items will be accumulated within an arraylist and returned to you as an array. This quick little demonstration function shows you how.
function Get-RandomData {
    param($count = 10)
    foreach ($n in 1..$count) {
        # Each object is emitted to the pipeline as soon as it is created
        [PSCustomObject]@{Name="Number$n";Number=Get-Random}
    }
}
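A usage sketch for the function above, repeated here so the example is self-contained. It assumes PowerShell 3.0 or later for [PSCustomObject].

```powershell
function Get-RandomData {
    param($count = 10)
    foreach ($n in 1..$count) {
        [PSCustomObject]@{ Name = "Number$n"; Number = Get-Random }
    }
}

# Capturing the output collects the emitted objects into an ordinary array.
$data = Get-RandomData -count 3

# Piping instead lets each object flow downstream as it is produced.
$names = Get-RandomData -count 3 | ForEach-Object { $_.Name }
```

$data is a three-element array, and $names contains Number1, Number2 and Number3.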
It's worth noting that specialized collections are still quite useful. I very often use Queues and Stacks when the need arises. However, I very rarely find myself using generics or lists unless I am working with a part of .NET that specifically requires generics or lists. This is pretty personally ironic, since I was the person who tested support for generics in PowerShell V2. It's absolutely required when you want to work with a piece of .NET that can only take a list of tuples. It's slightly to severely counterproductive in all other cases.
You can create it with:
New-Object 'Collections.Generic.List[Tuple[int,string]]'
You are spelling Generic wrongly, and the Tuple types are in the System namespace, not System.Collections.Generic.
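For completeness, a sketch of filling and reading such a list. It relies on [Tuple]::Create inferring Tuple[int,string] from its arguments. The values are made up.

```powershell
$list = New-Object 'Collections.Generic.List[Tuple[int,string]]'

# [Tuple]::Create(1, 'one') produces a Tuple[int,string], matching the list's element type.
$list.Add([Tuple]::Create(1, 'one'))
$list.Add([Tuple]::Create(2, 'two'))

# Elements are read through the Item1, Item2, ... properties.
$second = $list[1].Item2
```

$second holds the string 'two' after this runs.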
This is what worked for me. I am also providing sample code to insert into the list:
$myList = New-Object System.Collections.ArrayList
#add range
$myList.AddRange((
[Tuple]::Create(1,"string 1"),
[Tuple]::Create(2,"string 2"),
[Tuple]::Create(3,"string 3")
));
#add single item
$myList.Add([Tuple]::Create(4,"string 4"))
#create variable and add to list
$myTuple = [Tuple]::Create(5,"string 5")
$myList.Add( $myTuple)
Write-Host $myList
Reference:
Using and Understanding Tuples in PowerShell
You can create a tuple with up to 7 elements like this:
$tuple = [tuple]::Create(1,2,3,4,5,6,7)
and you can get the value of an element by naming its item (starting with item1):
$tuple.item1
1
If you have 8 or more elements then use another tuple for the 8th element:
$tuple = [tuple]::Create(1,2,3,4,5,6,7,[tuple]::create(8,9))
Internally, the 8th element is called "Rest". You can get the values like this:
$tuple.rest.item1.item1
8
If you need to specify a type for each element then do it in front of each value:
$tuple = [tuple]::create([string]"a", [int]1, [byte]255)
Finally, adding a tuple to a list works like this:
$list = New-Object 'Collections.ArrayList'
$tuple = [tuple]::create([string]"a", [int]1, [byte]255)
$list.add($tuple)
There is no need to specify the tuple-details for creating the list.
Keep in mind that you cannot change a value of a tuple later, and the default sort method sorts ascending in the order of Item1, Item2, etc. (otherwise you need a custom IComparer). But using them is super fast (way faster than working with large lists of PSObject or PSCustomObject, and also faster than Import-Csv)!

What does [] mean here?

$self->[UTF8] = $conf->{utf8};
Never seen such code before.
What does [] mean here?
In this case, the object $self is implemented as a blessed array reference rather than the far more common method of using a blessed hash reference. The syntax $foo->[42] accesses a single element from an array reference. Presumably, UTF8 is a constant that returns a numeric index into the array.
You see this idiom sometimes when people become convinced (usually incorrectly) that hash lookups on object attributes result in significant overhead and try to prematurely optimize their code.
The [] implies that $self is a reference to a list/array (assuming the code works). This looks a bit odd, though, as list indexes should be numeric.