I want to create a List<Tuple<int,string>> in PowerShell, but
New-Object System.Collections.Generic.List[System.Collections.Generic.Tuple[int,string]]
does not work. What am I missing?
Lee's answer is the correct way to create a List of Tuples (although you can make the statement much shorter by omitting the System namespace). However, the better questions to ask while programming in PowerShell are:
Why should I return a strongly typed object?
Do I really want to output a list?
The first one has its pros and cons. Strongly typed objects are useful to return if they have methods or events that will be useful for the next step in the pipeline. If, on the other hand, you just want to return a bunch of items with a name and an int, I'd use something like:
[PSCustomObject]@{string="string";int=1}
This will create a property bag (what most devs know as a Tuple) containing the data you need, with more descriptive names than a .NET tuple will give you on the object. It's also pretty fast. If, on the other hand, the data is meant for an API, then by all means create the strongly typed object it expects.
The second question is a little harder to understand but has a much clearer answer. In many cases, you'll want the output of one function to serve as input for another. For this, for many reasons, a strongly typed list is not your best friend. Strongly typed lists do not always convert cleanly into arrays (this is especially true for generics) and, as arguments to a function, severely limit the different types of data you can put into the function. They also end up producing somewhat misleading, harder-to-use output (especially when piping in objects and producing multiple results), since the whole list will be displayed as one output item instead of each item being displayed on its own. Most annoyingly, strongly typed lists behave differently than arrays in PowerShell when you "over-index" (i.e. ask for item 10000 in a list of 5 items): arrays will quietly return null, while lists will barf loudly. More practically, accumulating items into a list and then outputting the list will "hold" the pipeline until all items are in. This may be what you want, but in most cases it's nice to see output coming out of a function as it runs. Finally, lists add to the memory overhead of the function, as you need to accumulate a set of objects in the function's stack.
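The over-indexing difference is easy to see in a short sketch. The variable names are illustrative, the `::new()` syntax needs PowerShell 5 or later, and the array behaviour assumes strict mode is not enabled (Set-StrictMode changes it):

```powershell
# Illustrative names; assumes Set-StrictMode is not in effect.
$array = 1..5
$list  = [System.Collections.Generic.List[int]]::new()
foreach ($n in 1..5) { $list.Add($n) }

# Over-indexing an array quietly yields $null.
$fromArray = $array[10000]

# Over-indexing a generic list raises an error instead.
$listThrew = $false
try {
    $null = $list[10000]
}
catch {
    $listThrew = $true
}

"Array result is null: $($null -eq $fromArray)"
"List threw an error:  $listThrew"
```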
What I generally do is simply emit multiple objects. That is, I avoid using the return keyword and I take advantage of PowerShell's ability to return objects that are not captured into a variable. If I assign the result into a variable, the items will be accumulated within an arraylist and returned to you as an array. This quick little demonstration function shows you how.
function Get-RandomData {
    param($count = 10)
    foreach ($n in 1..$count){
        [PSCustomObject]@{Name="Number$n";Number=Get-Random}
    }
}
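To see what "emitting multiple objects" means for a caller, here is a minimal sketch. Get-StreamedData is a hypothetical stand-in with the same shape as the demonstration function, repeated only so the example is self-contained:

```powershell
# Hypothetical helper, same shape as the demonstration function.
function Get-StreamedData {
    param($Count = 10)
    foreach ($n in 1..$Count) {
        [PSCustomObject]@{ Name = "Number$n"; Number = Get-Random }
    }
}

# Capturing the output collects the emitted objects into an array.
$captured = Get-StreamedData -Count 3
$captured.GetType().Name   # Object[]
$captured.Count            # 3

# Because items are emitted one at a time, a downstream command can stop early.
$firstTwo = Get-StreamedData -Count 1000 | Select-Object -First 2
$firstTwo.Count            # 2
```

Note that a function emitting exactly one object hands back that object itself rather than a one-element array, so wrap the call in @( ) if you always need an array.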
It's worth noting that specialized collections are still quite useful. I very often use Queues and Stacks when the need arises. However, I very rarely find myself using generics or lists unless I am working with a part of .NET that specifically requires generics or lists. This is pretty personally ironic, since I was the person who tested support for generics in PowerShell V2. It's absolutely required when you want to work with a piece of .NET that can only take a list of tuples. It's slightly to severely counterproductive in all other cases.
You can create it with:
New-Object 'Collections.Generic.List[Tuple[int,string]]'
You are spelling Generic wrongly, and the Tuple types are in the System namespace, not System.Collections.Generic.
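As a rough sketch of how the resulting list might then be filled (the sample values are made up, and the `::new()` call assumes PowerShell 5 or later):

```powershell
# Create the strongly typed list, as in the answer above.
$list = New-Object 'Collections.Generic.List[Tuple[int,string]]'

# [Tuple]::Create infers Tuple[int,string] from the argument types.
$list.Add([Tuple]::Create(1, 'one'))

# The tuple can also be constructed with explicit type arguments (PSv5+).
$list.Add([Tuple[int,string]]::new(2, 'two'))

$list | ForEach-Object { '{0} -> {1}' -f $_.Item1, $_.Item2 }
```

Unlike ArrayList.Add(), the generic List's Add() returns nothing, so no [void] cast is needed.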
This is what worked for me. I am also providing sample code to insert into the list:
$myList = New-Object System.Collections.ArrayList
#add range
$myList.AddRange((
[Tuple]::Create(1,"string 1"),
[Tuple]::Create(2,"string 2"),
[Tuple]::Create(3,"string 3")
));
#add single item
$myList.Add([Tuple]::Create(4,"string 4"))
#create variable and add to list
$myTuple = [Tuple]::Create(5,"string 5")
$myList.Add( $myTuple)
Write-Host $myList
Reference:
Using and Understanding Tuples in PowerShell
You can create a tuple with up to 7 elements like this:
$tuple = [tuple]::Create(1,2,3,4,5,6,7)
and you can get the value of an element by naming its item (starting with item1):
$tuple.item1
1
If you have 8 or more elements then use another tuple for the 8th element:
$tuple = [tuple]::Create(1,2,3,4,5,6,7,[tuple]::create(8,9))
internally, the 8th element is called "rest". You can get the values like this:
$tuple.rest.item1.item1
8
If you need to specify a type for each element then do it in front of each value:
$tuple = [tuple]::create([string]"a", [int]1, [byte]255)
Finally, adding a tuple to a list works like this:
$list = New-Object 'Collections.ArrayList'
$tuple = [tuple]::create([string]"a", [int]1, [byte]255)
$list.add($tuple)
There is no need to specify the tuple-details for creating the list.
Keep in mind that you cannot change a value of a tuple later, and that the default sort method sorts ascending in the order of item1, item2, etc. (otherwise you need a custom IComparer). But using them is super fast (way faster than working with large lists of PsObject or PSCustomObject, and also faster than an Import-Csv)!
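A small sketch of both points, the default sort order and the immutability, using made-up values:

```powershell
$list = New-Object System.Collections.ArrayList
[void]$list.Add([Tuple]::Create(2, 'b'))
[void]$list.Add([Tuple]::Create(1, 'z'))
[void]$list.Add([Tuple]::Create(1, 'a'))

# Default sort: compares Item1 first, then Item2.
$list.Sort()
$list | ForEach-Object { "$($_.Item1) $($_.Item2)" }

# Tuples are immutable: assigning to an item raises an error.
$assignmentFailed = $false
try {
    $list[0].Item1 = 99
}
catch {
    $assignmentFailed = $true
}
"Assignment failed: $assignmentFailed"
```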
Well, I have a simple (?) issue: I can't manage to remove a given hashtable from an ArrayList:
$testarraylist = New-Object System.Collections.ArrayList
$testarraylist.Add(1)
$testarraylist.Add(2)
$testarraylist.Add(3)
$testarraylist.Add(4)
$testarraylist.Add(@{1=1})
$testarraylist.Add(@{2=2})
$testarraylist.Add(@{3=3})
$testarraylist.Add(5)
$testarraylist.Add(6)
$testarraylist.Add(7)
I can remove a simple "list" element, but I fail at removing a hashtable one
$testarraylist.remove(x)
The only way I found is by using
$testarraylist.removeat(4)
Which works, but isn't there an easier way?
I searched quite a bit and found lots of examples, but strangely none on this specific case.
Well, maybe not so strange: it may be so easy that no one ever had to ask this question?
Or my Google skills are failing me...?
Thanks in advance.
This all comes from how the ArrayList looks for the object you want to remove. If you have a look at the .NET specification again, numbers like System.Int32 are value types, but collections like System.Collections.Hashtable are reference types.
Basically, this means that value types are always passed "by value", but for reference types, only a "reference" to the instance is passed.
You can try that out: 1 -eq 1, because both have the same value, but @{1=1} -ne @{1=1}, because they are two separate instances, and thus two different references.
So, what you would have to do, is store the reference to the original instance in a variable first:
$h = @{1=1}
$testarraylist.Add($h)
$testarraylist.Remove($h)
Because this is such a basic and important concept in .NET, and basically all programming languages, I recommend you to take a few minutes and read more about it.
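If the hashtable was added without keeping a variable around, one possible workaround (a sketch, not the only way) is to look up the stored instance first and then remove that exact reference. The search condition below, a hashtable containing the key 1, is just an assumption for illustration:

```powershell
$testarraylist = New-Object System.Collections.ArrayList
[void]$testarraylist.Add(1)
[void]$testarraylist.Add(@{1=1})
[void]$testarraylist.Add(@{2=2})

# A new, equal-looking hashtable is a different instance, so nothing is removed.
$testarraylist.Remove(@{1=1})
$testarraylist.Count   # still 3

# Find the stored instance, then remove that exact reference.
$target = $null
foreach ($item in $testarraylist) {
    if ($item -is [hashtable] -and $item.ContainsKey(1)) { $target = $item; break }
}
if ($null -ne $target) { $testarraylist.Remove($target) }
$testarraylist.Count   # now 2
```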
As part of a refactor of a large PowerShell program from PS2.0, functions and scripting quick practices to PS5.0, classes and programming best practices, I have been moving to strong typing everywhere and finding some places where that brings up questions. The latest one being hash tables.
With both [Array] and [Hashtable] you can have a mix of content, which then makes enumerating that collection impossible to strongly type. For Arrays you have options like [String[]] or moving to Generic Lists with [System.Collections.Generic.List[String]]. But Dictionaries seem to pose a problem. I can't find a way to create a dictionary and limit its values to a particular type. Something like [System.Collections.Specialized.OrderedDictionary[Int32,Int32]] fails.
So, IS there a way to make Dictionaries and OrderedDictionaries with strong typing of the index, the value or both? And if there isn't a way, is this considered a bit of a problem that must be overcome, or not a problem and if so why is it not a problem?
You can create generic dictionaries, but as per Create new System.Collections.Generic.Dictionary object fails in PowerShell you have to use a backtick to escape the comma in the list of generic type parameters:
e.g.
PS> $dict = new-object System.Collections.Generic.Dictionary[int`,string]
Note that PowerShell will still do its best to coerce types, so it will only throw an exception if it can't convert the type:
PS> $dict.Add("1", "aaa") # equivalent to $dict.Add(1, "aaa")
And I don't believe there is a generic OrderedDictionary out-of-the-box, which is probably why that fails :-)
PS> new-object System.Collections.Specialized.OrderedDictionary[Int32`,Int32]
New-Object: Cannot find type [System.Collections.Specialized.OrderedDictionary[Int32,Int32]]: verify that the assembly containing this type is loaded.
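For illustration, here is a sketch of the coercion behaviour with a generic dictionary. It uses the PSv5+ `::new()` syntax, where the type literal needs no backtick; the keys and values are made up:

```powershell
# PSv5+ syntax; the type literal needs no backtick escaping.
$dict = [System.Collections.Generic.Dictionary[int,string]]::new()

$dict.Add(1, 'aaa')
$dict.Add('2', 'bbb')      # the string '2' is coerced to the int 2

# A key that cannot be converted to int is rejected.
$rejected = $false
try {
    $dict.Add('not a number', 'ccc')
}
catch {
    $rejected = $true
}

"Keys: $($dict.Keys -join ', ')"
"Unconvertible key rejected: $rejected"
```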
Given a properly defined variable
$test = New-Object System.Collections.ArrayList
.Add pollutes the pipeline with the count of items in the array, while .AddRange does not.
$test.Add('Single') will dump the count to the console. $test.AddRange(@('Single2')) will be clean with no extra effort. Why the different behavior? Is it just an oversight, or is there some intentional behavior I am not understanding?
Given that .AddRange requires coercing to an array when not using a variable (that is already an array) I am tending towards using [void]$variable.Add('String') when I know I need to only add one item, and [void]$test.AddRange($variable) when I am adding an array to an array, even when $variable only contains, or could only contain, a single item. The [void] here isn't required, but I wonder if it's just best practice to have it, depending of course on the answer above. Or am I missing something there too?
Why the different behavior? Is it just an oversight, or is there some intentional behavior I am not understanding?
Because many years ago, someone decided that's how ArrayList should behave!
Add() returns the index at which the argument was inserted into the list, which may indeed be useful and makes sense.
With AddRange() on the other hand, it's not immediately clear why it should return anything, and if yes, what? The index of the first item in the input arguments? The last? Or should it return a variable-sized array with all the insert indices? That would be awkward! So whoever implemented ArrayList decided not to return anything at all.
In C# or VB.NET, for which ArrayList was initially designed, "polluting the pipeline" doesn't really exist as a concept; the runtime would simply omit copying the return value back to the caller if someone invokes .Add() without assigning to a variable.
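In PowerShell, by contrast, the unassigned return value becomes part of a function's output. A minimal sketch of the effect, with hypothetical function names:

```powershell
function Get-LeakyItems {
    $items = New-Object System.Collections.ArrayList
    $items.Add('a')          # emits 0 (the insert index)
    $items.Add('b')          # emits 1
    $items.AddRange(@('c'))  # emits nothing
    $items                   # emits 'a', 'b', 'c'
}

function Get-CleanItems {
    $items = New-Object System.Collections.ArrayList
    [void]$items.Add('a')    # index discarded
    $null = $items.Add('b')  # same effect
    $items.AddRange(@('c'))
    $items
}

$leaky = @(Get-LeakyItems)
$clean = @(Get-CleanItems)
"Leaky output: $($leaky -join ', ')"   # 0, 1, a, b, c
"Clean output: $($clean -join ', ')"   # a, b, c
```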
The [void] here isn't required, but I wonder if it's just best practice to have it, depending of course on the answer above. Or am I missing something there too?
No, it's completely unnecessary. AddRange() is not magically one day gonna change to output anything.
If you don't ever need to know the insert index, use a [System.Collections.Generic.List[psobject]] instead:
$list = [System.Collections.Generic.List[psobject]]::new()
# this won't return anything, no need for `[void]`
$list.Add(123)
If for some reason you must use an ArrayList, you can "silence" it by overriding the Add() method:
function New-SilentArrayList {
    # Create a new ArrayList
    $newList = [System.Collections.ArrayList]::new()
    # Create a new `Add()` method, then return the list
    $newAdd = @{
        InputObject = $newList
        MemberType = 'ScriptMethod'
        Name = 'Add'
        Value = {param($obj) $this.AddRange(@($obj))}
    }
    Write-Output $(
        Add-Member @newAdd -Force -PassThru
    ) -NoEnumerate
}
Now your ArrayList's Add() will never make a peep again!
PS C:\> $list = New-SilentArrayList
PS C:\> $list.Add(123)
PS C:\> $list
123
Apparently I didn't quite understand where you were heading.
"Add pollutes the pipeline" is, on second thought, a correct statement, but .NET methods like $variable.Add('String') do not use the PowerShell pipeline by themselves (until the moment you output the array using the Write-Output command, which is the default command if you do not assign it to a variable).
The Write-Output cmdlet is typically used in scripts to display
strings and other objects on the console. However, because the default
behavior is to display the objects at the end of a pipeline, it is
generally not necessary to use the cmdlet.
The point is that the Add method of ArrayList returns an [Int32] ("The ArrayList index at which the value has been added") and AddRange doesn't return anything. This means that if you don't assign the result to something else (which includes $Null = $test.Add('Single')), it will indeed be output to the PowerShell pipeline.
Instead you might also consider to use the Add method of the List class which also doesn't return anything, see also: ArrayList vs List<> in C#.
But in general, I recommend to use native PowerShell commands that do use the Pipeline
(I can't give you a good example as it is not clear what output you expect but I noticed another question you removed and from that question, I presume that this Why should I avoid using the increase assignment operator (+=) to create a collection answer might help you further)
Suppose I have a list:
$DeletedUsers = New-Object System.Collections.Generic.List[System.object]
So I can easily add and remove users from the collection.
I want to be able to pass this list to a function that does something, but without modifying the original list, and it must stay of the same generic list type.
convertAll() seems to do exactly what I want without having to script out the creation of a new list myself with foreach-object, but I don't understand how to utilize the overload definitions (or quite understand what they mean).
There are many examples in C#, but I haven't been able to find one that demonstrates it in PoSH.
Example Scenario:
Assume $DeletedUsers contains a list of User objects of PSCustomObject type, with typical "User" properties such as department or employment status. This list should be capable of being passed to functions that change the status of a user's properties, with the results then added to a separate output list of the same Generic.List type.
Currently any changes by the example function.
Function ProcessUser {
    [Cmdletbinding()]
    Param($DeletedUsers)
    begin{$DeletedUsersClone = $($DeletedUsers).psobject.copy()} #OR similar
    process{
        $DeletedUsersClone | foreach { $_ | Add-Member -NotePropertyName "Processed" -NotePropertyValue "Processed:00"; $Outputlist.add($_)}
    }
}
Impacts the original $DeletedUsers, erroneously adding processed information to a list that should stay static.
There are alternate ways to prevent this from impacting the ultimate objective of the script, but the question is:
How do I create a True, non-referenced clone of a System.Collections.Generic.List[System.object] using built-in C# methods.
The trick is to use a scriptblock with an explicit cast to the delegate type. This looks like:
$DeletedUsers.ConvertAll([converter[object,object]] {param ($i) <# do convert #> })
Note:
As became clear later, the OP is looking for a deep clone of the original list; i.e., not only should the list as a whole be cloned, but also its elements.
This answer only shows how to create a shallow clone (and how to pass a list read-only).
See Bruce Payette's helpful answer for a deep-cloning approach based on the .ConvertAll method; with [pscustomobject] instances, you'd use the following (but note that .psobject.Copy() only creates shallow copies of the [pscustomobject] instances themselves):
$DeletedUsers.ConvertAll([converter[pscustomobject, pscustomobject]] {param ($obj) $obj.psobject.copy() })
If you want to pass a shallow clone of your list to a callee:
Pass [Collections.Generic.List[object]]::new($DeletedUsers) (PSv5+ syntax)
Alternatively, if the type of the list elements isn't known or if you don't want to repeat it, pass: $DeletedUsers.GetRange(0, $DeletedUsers.Count)
If you just want to prevent accidental modification of your list by a callee:
Pass $DeletedUsers.AsReadOnly() - however, that does change the type, namely to [Collections.ObjectModel.ReadOnlyCollection[object]]
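To make the shallow/deep distinction concrete, here is a small sketch with made-up user objects (PSv5+ syntax). Note that .psobject.Copy() copies only the top-level properties of each element, so this is "deep" only for flat objects like these:

```powershell
$DeletedUsers = [System.Collections.Generic.List[object]]::new()
$DeletedUsers.Add([pscustomobject]@{ Name = 'alice'; Status = 'Deleted' })
$DeletedUsers.Add([pscustomobject]@{ Name = 'bob';   Status = 'Deleted' })

# Shallow clone: a new list that holds the same element instances.
$shallow = [System.Collections.Generic.List[object]]::new($DeletedUsers)

# Element-wise clone: a new list holding copies of each element.
$deep = $DeletedUsers.ConvertAll([converter[object,object]] { param($obj) $obj.psobject.Copy() })

$shallow[0].Status = 'Changed'     # also changes the original element
$deep[1].Status    = 'Processed'   # the original element is untouched

"Original[0]: $($DeletedUsers[0].Status)"   # Changed
"Original[1]: $($DeletedUsers[1].Status)"   # Deleted
```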
If I have an ArrayList<Double> dblList and a Predicate<Double> IS_EVEN I am able to remove all even elements from dblList using:
Collections2.filter(dblList, IS_EVEN).clear()
if dblList however is a result of a transformation like
dblList = Lists.transform(intList, TO_DOUBLE)
this does not work any more as the transformed list is immutable :-)
Any solution?
Lists.transform() accepts a List and helpfully returns a result that is a RandomAccess list. Iterables.transform() only accepts an Iterable, and the result is not RandomAccess. Finally, Iterables.removeIf (and as far as I can see, this is the only one in Iterables) has an optimization in case the given argument is RandomAccess, the point of which is to make the algorithm linear instead of quadratic, e.g. think what would happen if you had a big ArrayList (and not an ArrayDeque, which should be more popular) and kept removing elements from its start until it's empty.
But the optimization depends not on iterator remove(), but on List.set(), which cannot possibly be supported in a transformed list. If this were to be fixed, we would need another marker interface to denote that "the optional set() actually works".
So the options you have are:
Call Iterables.removeIf() version, and run a quadratic algorithm (it won't matter if your list is small or you remove few elements)
Copy the List into another List that supports all optional operations, then call Iterables.removeIf().
The following approach should work, though I haven't tried it yet.
Collection<Double> dblCollection =
Collections.checkedCollection(dblList, Double.class);
Collections2.filter(dblCollection, IS_EVEN).clear();
The checkedCollection() method generates a view of the list that doesn't implement List. [It would be cleaner, but more verbose, to create a ForwardingCollection instead.] Then Collections2.filter() won't call the unsupported set() method.
The library code could be made more robust. Iterables.removeIf() could generate a composed Predicate, as Michael D suggested, when passed a transformed list. However, we previously decided not to complicate the code by adding special-case logic of that sort.
Maybe:
Collection<Double> odds = Collections2.filter(dblList, Predicates.not(IS_EVEN));
or
dblList = Lists.newArrayList(Lists.transform(intList, TO_DOUBLE));
Collections2.filter(dblList, IS_EVEN).clear();
As long as you have no need for the intermediate collection, then you can just use Predicates.compose() to create a predicate that first transforms the item, then evaluates a predicate on the transformed item.
For example, suppose I have a List<Double> from which I want to remove all items where the Integer part is even. I already have a Function<Double,Integer> that gives me the Integer part, and a Predicate<Integer> that tells me if it is even.
I can use these to get a new predicate, INTEGER_PART_IS_EVEN
Predicate<Double> INTEGER_PART_IS_EVEN = Predicates.compose(IS_EVEN, DOUBLE_TO_INTEGER);
Collections2.filter(dblList, INTEGER_PART_IS_EVEN).clear();
After some tries, I think I've found it :)
final ArrayList<Integer> ints = Lists.newArrayList(1, 2, 3, 4, 5);
Iterables.removeIf(Iterables.transform(ints, intoDouble()), even());
System.out.println(ints);
[1,3,5]
I don't have a solution, instead I found some kind of a problem with Iterables.removeIf() in combination with Lists.TransformingRandomAccessList.
The transformed list implements RandomAccess, thus Iterables.removeIf() delegates to Iterables.removeIfFromRandomAccessList() which depends on an unsupported List.set() operation.
Calling Iterators.removeIf() however would be successful, as the remove() operation IS supported by Lists.TransformingRandomAccessList.
see: Iterables: 147
Conclusion: instanceof RandomAccess does not guarantee List.set().
Addition:
In special situations calling removeIfFromRandomAccessList() even works:
if and only if the elements to erase form a compact group at the tail of the List or all elements are covered by the Predicate.