Case insensitive [Linq.Enumerable]::SequenceEqual()

Case insensitive [Linq.Enumerable]::SequenceEqual() - powershell

I have been using [Linq.Enumerable]::SequenceEqual() to compare the order of items in two arrays, for example
$validOrder = #('one', 'two', 'three')
$provided = #('one', 'two', 'three')
[Linq.Enumerable]::SequenceEqual($validOrder, $provided)
This works but now I realize I want to address capitalization errors independently, so I want to test for order in a case insensitive way.
I found this that documents a different method signature, with IEqualityComparer<T> as the third value. That sure seems like the right direction, but I haven't found anything that helps me implement this in powershell. I tried just using 'OrdinalIgnoreCase' for the last argument, which works for [String].FindIndex() as was pointed out in another thread. But alas not here. I also found this which actually makes a custom comparer for a different object type, but that seems like I am just then implementing manually the thing that I actually want, and I am not sure what the value would be of using [Linq.Enumerable]::SequenceEqual(), I could just pass my arrays to my class method and do the work there directly.
I have also made this approach work
if (-not (Compare-Object -SyncWindow 0 $validOrder $provided)) {
$result = 'ordered'
} else {
$result = 'disordered'
}
And it is already case insensitive. But it is also slower, and I may have a lot of these tests to do, so speed will be of benefit.
Lastly I see that this seems to work, and is SUPER simple, case insensitive (if I want) and seemingly fast. The total number of items in the array will always be small, it is the number of repeats with different array that is the performance issue.
$result = ($provided -join ' ') -eq ($validOrder -join ' ')
So, is this last option viable, or am I missing something obvious that argues against it?
Also, I feel like I will have other situations where IEqualityComparer<T> is a useful argument, so knowing how to do it would be useful. Assuming I am reading this right and IEqualityComparer<T> provides a mechanism for a different form of comparison, other than just rolling my own comparison.

I tried just using 'OrdinalIgnoreCase' for the last argument, ...
You're so close:
$lower = 'a b c'.Split()
$upper = 'A B C'.Split()
$ignoreCaseComparer = [System.StringComparer]::OrdinalIgnoreCase
[Linq.Enumerable]::SequenceEqual($lower, $upper, $ignoreCaseComparer)
The StringComparer class - NOT to be confused with the StringComparison enum type - implements IEqualityComparer<string>, and so satisfies the type constraint.
Please note that the above method invocation only works because I used String.Split() to create the method arguments, and Split() explicitly returns a [string[]] - for most other array-like expressions you may find yourself needing to explicitly type the input arguments in order for PowerShell to pass the correct type parameters:
# This will fail - type of `#()` is [object[]],
# and `SequenceEquals<object>()` doesn't accept stringcomparer
$lower = #(-split 'a b c')
$upper = #(-split 'A B C')
$ignoreCaseComparer = [System.StringComparer]::OrdinalIgnoreCase
[Linq.Enumerable]::SequenceEqual($lower, $upper, $ignoreCaseComparer)
# This will succeed thanks to the cast(s) to [string[]] at the call site
$lower = #(-split 'a b c')
$upper = #(-split 'A B C')
$ignoreCaseComparer = [System.StringComparer]::OrdinalIgnoreCase
[Linq.Enumerable]::SequenceEqual([string[]]$lower, [string[]]$upper, $ignoreCaseComparer)

Probably not the best answer to your exact question, but left here for posterity...
If you want to use your own custom comparer (e.g. reversing values from one array before comparing) you can so something like this:
$expected = #("one", "two", "three");
$actual = #("eno", "owt", "eerht");
# see https://stackoverflow.com/a/32635196/3156906
class ReversingComparer : System.Collections.Generic.IEqualityComparer[string] {
[bool]Equals([string]$x, [string]$y) { return $x -eq ($y[-1..-$y.Length] -join ''); }
[int]GetHashCode([string] $x) { return $x.GetHashCode(); }
}
[ReversingComparer] $comparer = [ReversingComparer]::new();
[Linq.Enumerable]::SequenceEqual([string[]]$expected, [string[]]$actual, $comparer);
# true

Related

Returning a ArrayList from a function in powershell contains indexes [duplicate]

This question already has answers here:
Powershell Join-Path showing 2 dirs in result instead of 1 - accidental script/function output
(1 answer)
Why does Range.BorderAround emit "True" to the console?
(1 answer)
Create a Single-Element Json Array Object Using PowerShell
(2 answers)
Closed 1 year ago.
I am new to PowerShell and there is a weird behavior I cannot explain. I call a function that returns a [System.Collections.ArrayList] but when I print my variable that receives the content of the array, if I have one value(for example: logXXX_20210222_075234355.txt), then I get 0 logXXX_20210222_075234355.txt. The value 0 gets added for some reason as if it has the index of the value.
If I have 4 values, it will look like this:
0 1 2 3 logXXX_20210222_075234315.txt logXXX_20210225_090407364.txt
logXXX_20210204_120318221.txt logXXX_20210129_122737751.txt
Can anyone help?
Here is a simple code that does that:
function returnAnArray{
$arrayToReturn =[System.Collections.ArrayList]::new()
$arrayToReturn.Add('logICM_20210222_075234315.txt')
return $arrayToReturn
}
$fileNames = returnAnArray
Write-Host $fileNames
0 logICM_20210222_075234315.txt

It's characteristic of the ArrayList class to output the index on .Add(...). However, PowerShell returns all output, which will cause it to intermingle the index numbers with the true or other intended output.
My favorite solution is to simply cast the the output from the .Add(...) method to [Void]:
function returnAnArray{
$arrayToReturn = [System.Collections.ArrayList]::new()
[Void]$arrayToReturn.Add('logICM_20210222_075234315.txt')
return $arrayToReturn
}
You can also use Out-Null for this purpose but in many cases it doesn't perform as well.
Another method is to assign it to $null like:
function returnAnArray{
$arrayToReturn = [System.Collections.ArrayList]::new()
$null = $arrayToReturn.Add('logICM_20210222_075234315.txt')
return $arrayToReturn
}
In some cases this can be marginally faster. However, I prefer the [Void] syntax and haven't observed whatever minor performance differential there may be.
Note: $null = ... works in all cases, while there are some cases where [Void] will not; See this answer (thanks again mklement0) for more information.
An aside, you can use casting to establish the list:
$arrayToReturn = [System.Collections.ArrayList]#()
Update Incorporating Important Comments from #mklement0:
return $arrayToReturn may not behave as intended. PowerShell's output behavior is to enumerate (stream) arrays down the pipeline. In such cases a 1 element array will end up returning a scalar. A multi-element array will return a typical object array [Object[]], not [Collection.ArrayList] as seems to be the intention.
The comma operator can be used to guarantee the return type by making the ArrayList the first element of another array. See this answer for more information.
Example without ,:
Function Return-ArrayList { [Collections.ArrayList]#(1,2,3,4,5,6) }
$ArrReturn = Return-ArrayList
$ArrReturn.gettype().FullName
Returns: System.Object[]
Example with ,:
Function Return-ArrayList { , [Collections.ArrayList]#(1,2,3,4,5,6) }
$ArrReturn = Return-ArrayList
$ArrReturn.gettype().FullName
Returns: System.Collections.ArrayList
Of course, this can also be handled by the calling code. Most commonly by wrapping the call in an array subexpression #(...). a call like: $filenames = #(returnAnArray) will force $filenames to be a typical object array ([Object[]]). Casting like $filenames = [Collections.ArrayList]#(returnArray) will make it an ArrayList.
For the latter approach, I always question if it's really needed. The typical use case for an ArrayList is to work around poor performance associated with using += to increment arrays. Often this can be accomplished by allowing PowerShell to return the array for you (see below). But, even if you're forced to use it inside the function, it doesn't mean you need it elsewhere in the code.
For Example:
$array = 1..10 | ForEach-Object{ $_ }
Is preferred over:
$array = [Collections.ArrayList]#()
1..10 | ForEach-Object{ [Void]$array.Add( $_ ) }
Persisting the ArrayList type beyond the function and through to the caller should be based on a persistent need. For example, if there's a need easily add/remove elements further along in the program.
Still More Information:
Notice the Return statement isn't needed either. This very much ties back to why you were getting extra output. Anything a function outputs is returned to the caller. Return isn't explicitly needed for this case. More commonly, Return can be used to exit a function at desired points...
A function like:
Function Demo-Return {
1
return
2
}
This will return 1 but not 2 because Return exited the function beforehand. However, if the function were:
Function Demo-Return
{
1
return 2
}
This returns 1, 2.
However, that's equivalent to Return 1,2 OR just 1,2 without Return
Update based on comments from #zett42:
You could avoid the ArrayList behavior altogether by using a different collection type. Most commonly a generic list, [Collections.Generic.List[object]]. Technically [ArrayList] is deprecated already making generic lists a better option. Furthermore, the .Add() method doesn't output anything, thus you do not need [Void] or any other nullification method. Generic lists are slightly faster than ArrayLists, and saving the nullification operation a further, albeit still small performance advantage.

ArrayList appears to store alternating indexes and values:
PS /home/alistair> $filenames[0]
0
PS /home/alistair> $filenames[1]
logICM_20210222_075234315.txt

Trying and failing to pass an array of custom objects by reference

I am creating an array of custom objects in my powershell script
$UnreachablePCs = [PSCustomObject]#()
Then I am passing this into a function like this
function GetComputerData {
param (
$Computers, [ref]$unreachable
)
...
$unreachablePC = [PSCustomObject]#{
ComputerName = $i.DNSHostName
CPU = "n/a"
Cores = "n/a"
IP = "n/a"
Memory = "n/a"
Uptime = "n/a"
OS = "n/a"
Board = "n/a"
}
$UnreachablePCs += $unreachablePC
Write-Output $UnreachablePCs.Count
...
}
GetComputerData -Computers $TraderWorkstations -unreachable ([ref]$UnreachablePCs)
Write-Output $UnreachablePCs.Count
$TraderWorkstations is a list of pcs which are iterated over in the function. All pcs that aren't reachable are added to the $UnreachablePCs array in an else branch in the function. In the function the .Count I'm calling will increment as workstations are added to the list. But After the function is called the final .Count is returning 0. What am I missing here?

Don't use [ref] parameters in PowerShell code: [ref]'s purpose is to facilitate calling .NET API methods; in PowerShell code, it is syntactically awkward and can lead to subtle bugs, such as in your case - see this answer guidance on when [ref] use is appropriate.
Instead, make your function output the objects that make up the result array (possibly one by one), and let PowerShell collect them for you in an array:
function GetComputerData {
param (
$Computers # NO [ref] parameter
)
# ...
# Output (one or more) [pscustomobject] instances.
[PSCustomObject]#{
ComputerName = $i.DNSHostName
CPU = "n/a"
Cores = "n/a"
IP = "n/a"
Memory = "n/a"
Uptime = "n/a"
OS = "n/a"
Board = "n/a"
}
# ...
}
# Collect the [pscustomobject] instances output by the function
# in an array.
$UnReachablePCs = #(GetComputerData -Computers $TraderWorkstations)
#(), the array-subexpression operator, always creates an [object[]] array. To create a strongly typed array instead, use:
[pscustomobject[]] $unreachablePCs = GetComputerData -Computers $TraderWorkstations
Important:
[PSCustomObject]#{ ... directly produces output from the function, due to PowerShell's implicit output feature, where any command or expression whose output isn't captured or redirected automatically contributes to the enclosing function's (script's) output (written to PowerShell's success output stream) - see this answer for details. All the objects written to the function's success output streams are captured by a variable assignment such as $UnReachablePCs = ...
Write-Output is the explicit (rarely needed) way to write to the success output stream, which also implies that you cannot use it for ad hoc debugging output to the display, because its output too becomes part of the function's "return value" (the objects sent to the success output stream).
If you want to-display output that doesn't "pollute" the success output stream, use Write-Host. Preferably, use cmdlets that target other, purpose-specific output streams, such as Write-Verbose and Write-Debug, though both of them require opt-in to produce visible output (see the linked docs).
As for the problems with your original approach:
$UnreachablePCs = [PSCustomObject]#()
This doesn't create an array of custom objects.
Instead, it creates an [object[]] array that is (uselessly, mostly invisibly) wrapped in a [psobject] instance.[1]
Use the following instead:
[PSCustomObject[]] $UnreachablePCs = #()
As for use of [ref] with an array variable updated with +=:
Fundamentally, you need to update a (parameter) variable containing a [ref] instance by assigning to its .Value property, not to the [ref] instance as a whole ($UnreachablePCs.Value = ... rather than $UnreachablePCs = ...)
However, the += technique should be avoided except for small arrays, because every += operation requires creating a new array behind the scenes (comprising the original elements and the new one(s)), which is necessary, because arrays are fixed-size data structures.
Either: Use an efficiently extensible list data type, such as [System.Collections.Generic.List[PSCustomObject]] and grow it via its .Add() method (in fact, if you create the list instance beforehand, you could pass it as a normal (non-[ref]) argument to a non-[ref] parameter, and the function would still directly update the list, due to operating on the same list instance when calling .Add() - that said, the output approach at the top is generally still preferable):
$unreachablePCs = [System.Collections.Generic.List[PSCustomObject]] #()
foreach ($i in 1..2) {
$unreachablePCs.Add([pscustomobject] #{ foo = $i })
}
Or - preferably - when possible: Use PowerShell's loop statements as expressions, and let PowerShell collect all outputs in an array for you (as also shown with output from a whole function above); e.g.:
# Automatically collects the two custom objects output by the loop.
[array] $unreachablePCs = foreach ($i in 1..2) {
[pscustomobject] #{ foo = $i }
}
[1] This behavior is unfortunate, but stems from the type accelerators [pscustomobject] and [psobject] being the same. A [pscustomobject] cast only works meaningfully in the context of creating a single custom object literal, i.e. if followed by a hashtable (e.g., [pscustomobject] #{ foo = 1 }). In all other cases, the mostly invisible wrapping in [psobject] occurs; e.g., [pscustomobject] #() -is [object[]] is $true - i.e., the result behaves like a regular array - but [pscustomobject] #() -is [psobject] is also $true, indicating the presence of a [psobject] wrapper.

Raku: Trouble Accessing Value of a Multidimensional Hash

I am having issues accessing the value of a 2-dimensional hash. From what I can tell online, it should be something like: %myHash{"key1"}{"key2"} #Returns value
However, I am getting the error: "Type Array does not support associative indexing."
Here's a Minimal Reproducible Example.
my %hash = key1-dim1 => key1-dim2 => 42, key2-dim1 => [42, 42];
say %hash{'key1-dim1'}{'key1-dim2'}; # 42
say %hash{'key2-dim1'}{'foo bar'}; # Type Array does not support associative indexing.
Here's another reproducible example, but longer:
my #tracks = 'Foo Bar', 'Foo Baz';
my %count;
for #tracks -> $title {
$_ = $title;
my #words = split(/\s/, $_);
if (#words.elems > 1) {
my $i = 0;
while (#words.elems - $i > 1) {
my %wordHash = ();
%wordHash.push: (#words[$i + 1] => 1);
%counts.push: (#words[$i] => %wordHash);
say %counts{#words[$i]}{#words[$i+1]}; #===============CRASHES HERE================
say %counts.kv;
$i = $i + 1;
}
}
}
In my code above, the problem line where the 2-d hash value is accessed will work once in the first iteration of the for-loop. However, it always crashes with that error on the second time through. I've tried replacing the array references in the curly braces with static key values in case something was weird with those, but that did not affect the result. I can't seem to find what exactly is going wrong by searching online.
I'm very new to raku, so I apologize if it's something that should be obvious.

After adding the second elements with push to the same part of the Hash, the elment is now an array. Best you can see this by print the Hash before the crash:
say "counts: " ~ %counts.raku;
#first time: counts: {:aaa(${:aaa(1)})}
#second time: counts: {:aaa($[{:aaa(1)}, {:aaa(1)}])}
The square brackets are indicating an array.
Maybe BagHash does already some work for you. See also raku sets without borders
my #tracks = 'aa1 aa2 aa2 aa3', 'bb1 bb2', 'cc1';
for #tracks -> $title {
my $n = BagHash.new: $title.words;
$n.raku.say;
}
#("aa2"=>2,"aa1"=>1,"aa3"=>1).BagHash
#("bb1"=>1,"bb2"=>1).BagHash
#("cc1"=>1).BagHash

Let me first explain the minimal example:
my %hash = key1-dim1 => key1-dim2 => 42,
key2-dim1 => [42, 42];
say %hash{'key1-dim1'}{'key1-dim2'}; # 42
say %hash{'key2-dim1'}{'key2-dim2'}; # Type Array does not support associative indexing.
The problem is that the value associated with key2-dim1 isn't itself a hash but is instead an Array. Arrays (and all other Positionals) only support indexing by position -- by integer. They don't support indexing by association -- by string or object key.
Hopefully that explains that bit. See also a search of SO using the [raku] tag plus 'Type Array does not support associative indexing'.
Your longer example throws an error at this line -- not immediately, but eventually:
say %counts{...}{...}; # Type Array does not support associative indexing.
The hash %counts is constructed by the previous line:
%counts.push: ...
Excerpting the doc for Hash.push:
If a key already exists in the hash ... old and new value are both placed into an Array
Example:
my %h = a => 1;
%h.push: (a => 1); # a => [1,1]
Now consider that the following code would have the same effect as the example from the doc:
my %h;
say %h.push: (a => 1); # {a => 1}
say %h.push: (a => 1); # {a => [1,1]}
Note how the first .push of a => 1 results in a 1 value for the a key of the %h hash, while the second .push of the same pair results in a [1,1] value for the a key.
A similar thing is going on in your code.
In your code, you're pushing the value %wordHash into the #words[$i] key of the %counts hash.
The first time you do this the resulting value associated with the #words[$i] key in %counts is just the value you pushed -- %wordHash. This is just like the first push of 1 above resulting in the value associated with the a key, from the push, being 1.
And because %wordHash is itself a hash, you can associatively index into it. So %counts{...}{...} works.
But the second time you push a value to the same %counts key (i.e. when the key is %counts{#words[$i]}, with #words[$i] set to a word/string/key that is already held by %counts), then the value associated with that key will not end up being associated with %wordHash but instead with [%wordHash, %wordHash].
And you clearly do get such a second time in your code, if the #tracks you are feeding in have titles that begin with the same word. (I think the same is true even if the duplication isn't the first word but instead later ones. But I'm too confused by your code to be sure what the exact broken combinations are. And it's too late at night for me to try understand it, especially given that it doesn't seem important anyway.)
So when your code then evaluates %counts{#words[$i]}{#words[$i+1]}, it is the same as [%wordHash, %wordHash]{...}. Which doesn't make sense, so you get the error you see.
Hopefully the foregoing has been helpful.
But I must say I'm both confused by your code, and intrigued as to what you're actually trying to accomplish.
I get that you're just learning Raku, and that what you've gotten from this SO might already be enough for you, but Raku has a range of nice high level hash like data types and functionality, and if you describe what you're aiming at we might be able to help with more than just clearing up Raku wrinkles that you and we have been dealing with thus far.
Regardless, welcome to SO and Raku. :)

Well, this one was kind of funny and surprising. You can't go wrong if you follow the other question, however, here's a modified version of your program:
my #tracks = ['love is love','love is in the air', 'love love love'];
my %counts;
for #tracks -> $title {
$_ = $title;
my #words = split(/\s/, $_);
if (#words.elems > 1) {
my $i = 0;
while (#words.elems - $i > 1) {
my %wordHash = ();
%wordHash{#words[$i + 1]} = 1;
%counts{#words[$i]} = %wordHash;
say %counts{#words[$i]}{#words[$i+1]}; # The buck stops here
say %counts.kv;
$i = $i + 1;
}
}
}
Please check the line where it crashed before. Can you spot the difference? It was kind of a (un)lucky thing that you used i as a loop variable... i is a complex number in Raku. So it was crashing because it couldn't use complex numbers to index an array. You simply had dropped the $.
You can use sigilless variables in Raku, as long as they're not i, or e, or any of the other constants that are already defined.
I've also made a couple of changes to better reflect the fact that you're building a Hash and not an array of Pairs, as Lukas Valle said.

References to arrays/objects in PowerShell

I use the reference Variable to pass parameters to functions and to manipulate values across functions using the same base variable.
For my other script this worked fine, and maybe this is just a thought problem here, but why this isn't working?:
$Script:NestedLists = #("test", #("test_level_2"))
function AddToReference
{
param([ref]$RefVar)
$RefVar.Value += #("hi")
}
AddToReference -RefVar ([ref]($Script:NestedLists[1]))
$Script:NestedLists[1]
I thought the output of $Script:NestedLists[1] would be "test_level_2" and "hi" but it is just "test_level_2"

This little change made it work
$Script:NestedLists = #("test",#("test_level_2"))
function AddToReference
{
param ([ref]$RefVar) ($RefVar.value)[1] += , "hi"
}
addtoreference ([ref]$Script:NestedLists)
$Script:NestedLists[1]
Why moving the [1] to $refvar made it work, I have no idea, I wish I had a better understanding. Also, this get's really tricky because if you add a value to the first array in the $script it moves the [1] to [2] etc...
I would personally do something to keep each array separate, and at the end just combine them as needed...
$a = "first","array"
$b = "second","array"
$script:nested = $null
$script:nested += , $a
$script:nested += , $b
The "+= ," combines each array in a nested array. so [0] would equal $a and [1] would equal $b

Storing datatypes and embedding them in other datatypes

I can store a data type in a variable like this
$type = [int]
and use it like this:
$type.GetType().Name
but, how can I embed it in another declaration? e.g.
[Func[$type]]
* Update I *
so the invoke-expression will do the trick (thanks Mike z). what I was trying to do is create a lambda expression. this is how I can do it now:
$exp = [System.Linq.Expressions.Expression]
$IR = [Neo4jClient.Cypher.ICypherResultItem]
Invoke-Expression "`$FuncType = [Func[$IR]]"
$ret = $exp::Lambda($FuncType, ...)
but also thanks to #PetSerAl and #Jan for interesting alternatives

This does not appear to be possible directly, at least according to the PowerShell 3.0 specification.
The [type] syntax is called a type-literal by the spec and its definition does not included any parts that can be expressions. It is composed of type-names which are composed of type-characters but there is nothing that is dynamic about them.
Reading through the spec, I noticed that something like this however works:
$type = [int]
$try = Read-Host
$type::"$(if ($try) { 'Try' } else { '' })Parse"
Now you might wonder why $type::$variable is allowed. That is because :: is an operator who's left hand side is an expression that must evaluate to a type. The right hand side is a member-name which allows simple names, string literals, and use of the subexpression operator.
However PowerShell is extremely resilient and you can do almost anything dynamicly via Invoke-Expression. Let's say you want to declare variable that is a generic delegate based on a type you know only at runtime:
$type = [int] # This could come from somewhere else entirely
Invoke-Expression "`$f = [Func[$type]]{ return 1 }"
Now $f has your delegate. You will need to test this out if $type is some complex nested or generic type but it should work for most basic types. I tested with [int] and [System.Collections.Generic.List[int]] it worked fine for both.

It can be achieved by reflection:
$type = [int]
$Func = [Func``1] # you have to use mangled name to get generic type definition.
$Func.MakeGenericType($type)

Unfortunately, I don't thing you can do this. Have a look at this question Is possible to cast a variable to a type stored in another variable?.
There is a suggestion that a conversion is possible using Convert.ChangeType method on objects that implement IConvertible, but as far as I can tell this is not implemented in PowerShell.
You can fake it a little bit, by using your stored type in a scriptblock, but this may not be what you are after.
$type = [byte]
$code = [scriptblock]::create("[$type]`$script:var = 10")
& $code
$var.gettype()
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True Byte System.ValueType

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Case insensitive [Linq.Enumerable]::SequenceEqual() - powershell

Related

Returning a ArrayList from a function in powershell contains indexes [duplicate]

Trying and failing to pass an array of custom objects by reference

Raku: Trouble Accessing Value of a Multidimensional Hash

References to arrays/objects in PowerShell

Storing datatypes and embedding them in other datatypes

Categories

Resources