Compare objects based on subset of properties - powershell

Say I have two PowerShell hashtables, one big and one small, and for a specific purpose I want to treat them as equal if, for every key in the small one, the big hashtable has the same value.
Also, I don't know the names of the keys in advance. I can use the following function, which relies on Invoke-Expression, but I am looking for nicer solutions that don't depend on it.
Function Compare-Subset {
Param(
[hashtable] $big,
[hashtable] $small
)
$keys = $small.keys
Foreach($k in $keys) {
$expression = '$val = $big.' + "$k" + ' -eq ' + '$small.' + "$k"
Invoke-Expression $expression
If(-not $val) {return $False}
}
return $True
}
$big = @{name='Jon'; car='Honda'; age='30'}
$small = @{name = 'Jon'; car='Honda'}
Compare-Subset $big $small

A simple $true/$false is easy to get. The following returns $true if there are no differences:
[string]::IsNullOrWhiteSpace($($small|Select -Expand Keys|Where{$Small[$_] -ne $big[$_]}))
For every key in $small, it checks whether the value in $small is the same as the value for that key in $big. The pipeline outputs only the keys whose values differ. It's wrapped in the IsNullOrWhiteSpace() method of the [String] type, so if any differences are found it returns $false. If you want to list the differences, just remove that method call.
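The same check can also be written as a function that indexes both hashtables by key, which avoids Invoke-Expression entirely. The following is only a sketch: the name Test-HashtableSubset is made up for illustration, it assumes scalar values, and it assumes that a key missing from the big hashtable should count as a mismatch.

```powershell
function Test-HashtableSubset {
    param(
        [hashtable] $Big,
        [hashtable] $Small
    )
    foreach ($key in $Small.Keys) {
        # Assumption: a key that is absent from $Big counts as a mismatch.
        if (-not $Big.ContainsKey($key)) { return $false }
        # Index by key instead of building an expression string.
        if ($Big[$key] -ne $Small[$key]) { return $false }
    }
    return $true
}

$big   = @{ name = 'Jon'; car = 'Honda'; age = '30' }
$small = @{ name = 'Jon'; car = 'Honda' }
Test-HashtableSubset $big $small   # True
```

Because the keys are only ever used as index values, nothing in them is evaluated as code, which is the main risk with the Invoke-Expression version.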

This could be the start of something. Not sure what output you are looking for but this will output the differences between the two groups. Using the same sample data that you provided:
$results = Compare-Object ($big.GetEnumerator() | % { $_.Name }) ($small.GetEnumerator() | % { $_.Name })
$results | ForEach-Object{
$key = $_.InputObject
Switch($_.SideIndicator){
"<="{"Only reference object has the key: '$key'"}
"=>"{"Only difference object has the key: '$key'"}
}
}
In production you would want something more polished, but for illustration, the above yields the following output:
Only reference object has the key: 'age'

Related

IndexOf() or .FindIndex() case-insensitive

I am trying to validate some XML with verbose logging of issues, including required order of attributes and miscapitalization. If the required order of attributes is one, two, three and the XML in question has one, three, two, I want to log it. And if an attribute is simply miscapitalized, say TWO instead of two, I want to log that as well.
Currently I have two arrays, $ordered with the names of the attributes as they should be (correct capitalization) and $miscapitalized with the names of the miscapitalized attributes.
So, given attributes of one, three, TWO and required order of one, two, three
$ordered = one, two, three
$miscapitalized = TWO
From here I want to append the miscapitalization, so a new variable
$logged = one, two (TWO), three
I can get the index of $ordered where the miscapitalization occurs with
foreach ($attribute in $ordered) {
if ($attribute -iin $miscapitalized) {
$indexOrdered = [array]::IndexOf($ordered, $attribute)
}
}
However, I can't get the index in $miscapitalized based on the (correctly capitalized) $attribute. I tried
$miscapitalized = @('one', 'two', 'three')
$miscapitalized.IndexOf('TWO')
which doesn't work because .IndexOf() is case sensitive. I found this that says [Collections.Generic.List[Object]] will work, so I thought perhaps Generic.List was where the functionality came from. So I tried
$miscapitalized = [System.Collections.Generic.List[String]]@('one', 'two', 'three')
$miscapitalized.FindIndex('TWO')
Which throws
Cannot find an overload for "FindIndex" and the argument count: "1".
That led me to this that says I need an actual predicate type, not just a string. At which point I am in WAY over my head, and the only thing that I could come up with is $miscapitalized.FindIndex([System.Predicate]::new('TWO')) which doesn't work. I suspect a Predicate could/should be a regex somehow, but I can't seem to find anything that points me in the right direction, or at least that I can understand and recognize that it is pointing me in the right direction. I also found https://www.powershellstation.com/2010/05/18/passing-predicates-as-parameters-in-powershell/ that talks about a code block as predicate, but I am not clear that it's the same usage of the term predicate (it is a widely used term) nor can I grok how to even make a code block that would be helpful here.
I did come up with this approach, which uses the same foreach search in $miscapitalized as in $ordered and it does work. But I wonder if there is a more graceful approach that doesn't require nested loops. Plus, understanding Predicate as it applies here seems useful, as well as (possibly) how a codeblock might be used.
$ordered = @('one', 'two', 'three')
$miscapitalized = @('TWO')
$replacements = [System.Collections.Specialized.OrderedDictionary]::new()
foreach ($orderedAttribute in $ordered) {
if ($orderedAttribute -iin $miscapitalized) {
$indexOrdered = [array]::IndexOf($ordered, $orderedAttribute)
foreach ($miscapitalizedAttribute in $miscapitalized) {
if (($miscapitalizedAttribute -iin $ordered) -and ($miscapitalizedAttribute -ieq $orderedAttribute) -and ($miscapitalizedAttribute -cne $orderedAttribute)) {
#$indexMiscapitalized = [array]::IndexOf($miscapitalized, $miscapitalizedAttribute)
$replacements.Add($indexOrdered, "$orderedAttribute ($miscapitalizedAttribute)")
}
}
}
}
if ($replacements.Count -gt 0) {
foreach ($index in $replacements.Keys) {
$ordered[$index] = $replacements.$index
}
}
$ordered
EDIT: Based on comments below, I have tried this
$ordered = @('one', 'two', 'three')
$miscapitalized = @('TWO', 'Three')
$replacements = [System.Collections.Specialized.OrderedDictionary]::new()
foreach ($orderedAttribute in $ordered) {
if ($orderedAttribute -iin $miscapitalized) {
$indexOrdered = [array]::IndexOf($ordered, $orderedAttribute)
if ($indexMiscapitalized = $miscapitalized.FindIndex({param($s) $s -eq $orderedAttribute})) {
$replacements.Add($indexOrdered, "$orderedAttribute ($($miscapitalized[$indexMiscapitalized]))")
}
}
}
if ($replacements.Count -gt 0) {
foreach ($index in $replacements.Keys) {
$ordered[$index] = $replacements.$index
}
}
$ordered
Which gets the last one (three/Three) but is missing two/TWO. But lots of possible solutions to try tomorrow, since there will be something to learn from each one.
You can substitute a scriptblock for the predicate required by FindIndex():
PS ~> $miscapitalized = [System.Collections.Generic.List[String]]@('one', 'two', 'three')
PS ~> $predicate = {param($s) $s -eq 'TWO'}
PS ~> $miscapitalized.FindIndex($predicate)
1
This will work as expected since PowerShell's -eq operator is case-insensitive by default.
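FindIndex() returns -1 when nothing matches and 0 when the first element matches, so its result is safer to compare against 0 explicitly than to use directly as an if condition. The sketch below illustrates that with placeholder data; the variable names $attributes and $wanted are made up for the example.

```powershell
$attributes = [System.Collections.Generic.List[string]]@('TWO', 'Three')
$wanted = 'two'

# Cast the scriptblock to the delegate type that FindIndex() expects.
$predicate = [System.Predicate[string]] { param($s) $s -eq $wanted }
$index = $attributes.FindIndex($predicate)

# 0 is a valid index but is falsy in PowerShell, so test with -ge 0.
if ($index -ge 0) {
    "Found '$($attributes[$index])' at index $index"
}
else {
    "No match for '$wanted'"
}
```

An assignment used directly as a condition, as in if ($i = $list.FindIndex(...)), skips a match at index 0 and treats -1 as a hit, which would explain a first element going missing.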
Perhaps, you're overthinking this. You could use Compare-Object to do all the hard work and then you can inspect results and log them accordingly:
# Reference array for attributes order and capitalization
[array]$reference = @(
'one'
'two'
'three'
'four'
)
# Example XML
[xml]$xml = '<foo one="1" oNe="oNe" thrEE="thrEE" two="2">dummy</foo>'
# Compare XML attributes to reference array
# -SyncWindow 0 - Order of items in the array matters
# https://stackoverflow.com/questions/40507552/powershell-order-sensitive-compare-objects-diff
Compare-Object -ReferenceObject $reference -DifferenceObject $xml.foo.Attributes.Name -SyncWindow 0 -CaseSensitive -includeEqual
This will produce:
InputObject SideIndicator
----------- -------------
one ==
oNe =>
two <=
thrEE =>
three <=
two =>
four <=
As you can see, the one attribute is at the correct index (==) and properly cased. We also have an additional oNe attribute that is out of place.
You could also group the Compare-Object result and produce hashtable, which you can use for advanced logging. You could do all kinds of lookups and comparisons using SideIndicator and InputObject properties.
$group = Compare-Object -ReferenceObject $reference -DifferenceObject $xml.foo.Attributes.Name -SyncWindow 0 -CaseSensitive -includeEqual |
Group-Object -Property InputObject -AsHashTable -AsString
$group
Result
four {@{InputObject=four; SideIndicator=<=}}
one {@{InputObject=one; SideIndicator===}, @{InputObject=oNe; SideIndicator==>}}
thrEE {@{InputObject=thrEE; SideIndicator==>}, @{InputObject=three; SideIndicator=<=}}
two {@{InputObject=two; SideIndicator=<=}, @{InputObject=two; SideIndicator==>}}
In this case hashtable keys will be case-insensitive, so you can do stuff like this:
foreach ($r in $reference) {
$ret = $group.$r | Where-Object {
$_.SideIndicator -ne '==' -and $_.InputObject -cne $r
} | Select-Object -ExpandProperty InputObject |
ForEach-Object {
'Index of "{0}": {1}' -f $_, $xml.foo.Attributes.Name.IndexOf($_)
}
if ($ret) {
@{ $r = $ret }
}
}
Name Value
---- -----
one Index of "oNe": 1
three Index of "thrEE": 2
You could add a small helper function that finds the index case-insensitive:
function Find-Index {
param (
[Parameter(Mandatory = $true, Position = 0)]
[string[]]$Array,
[Parameter(Mandatory = $true, Position = 1)]
[string]$Value
)
for ($i = 0; $i -lt $Array.Count; $i++) {
if ($Array[$i] -eq $Value) { return $i }
}
-1
# or combine the elements with some unlikely string
# convert that to lowercase and split on the same unlikely string
# then use regular IndexOf() against the value which is also lower-cased:
# (($Array -join '~#~').ToLowerInvariant() -split '~#~').IndexOf($Value.ToLowerInvariant())
}
Then below that, use it like this:
# if any of the below arrays has only one item, wrap it inside @()
$ordered = 'one','two','three'
$miscapitalized = 'One','TWO'
$logged = foreach ($item in $ordered) {
$index = Find-Index $miscapitalized $item
if ($index -ge 0) {
'{0} ({1})' -f $item, $miscapitalized[$index]
}
else { $item }
}
$logged -join ','
Output
one (One),two (TWO),three
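Another way to avoid the nested loop is a lookup table. Hashtables created with the @{} literal compare string keys case-insensitively, so the miscapitalized names can be stored once and then looked up by their correct spelling. A sketch under that assumption, reusing the sample data above ($lookup is a name introduced here):

```powershell
$ordered        = 'one', 'two', 'three'
$miscapitalized = 'One', 'TWO'

# Keys in a @{} hashtable are matched case-insensitively,
# so 'two' finds the entry that was stored as 'TWO'.
$lookup = @{}
foreach ($name in $miscapitalized) { $lookup[$name] = $name }

$logged = foreach ($item in $ordered) {
    # -cne is the case-sensitive "not equal", so exact matches are left alone.
    if ($lookup.ContainsKey($item) -and $lookup[$item] -cne $item) {
        '{0} ({1})' -f $item, $lookup[$item]
    }
    else { $item }
}
$logged -join ','
```

This makes one pass over each array instead of scanning $miscapitalized for every ordered attribute.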

Powershell sorting hash table

I am seeing some seemingly very weird behavior with a hash table I am sorting and then trying to review the results. I build the hash table, then I need to sort that table based on values, and I see two bits of weirdness.
This works fine outside of a class
$hash = [hashtable]::New()
$type = 'conformset'
$hash.Add($type, 1)
$type = 'applyset'
$hash.Add($type , 1)
$type = 'conformset'
$hash.$type ++
$hash.$type ++
$hash
Write-Host
$hash = $hash.GetEnumerator() | Sort-Object -property:Value
$hash
I see the contents of the hash twice, unsorted and then sorted.
However, when using a class it does nothing.
class Test {
# Constructor (abstract class)
Test () {
$hash = [hashtable]::New()
$type = 'conformset'
$hash.Add($type, 1)
$type = 'applyset'
$hash.Add($type , 1)
$type = 'conformset'
$hash.$type ++
$hash.$type ++
$hash
Write-Host
$hash = $hash.GetEnumerator() | Sort-Object -property:Value
$hash
}
}
[Test]::New()
This just echoes Test to the console, with nothing related to the hash table. My assumption here is that it relates somehow to how the pipeline is interrupted, which, let's be honest, is a great reason to move to classes, given how common polluted-pipeline errors are. So, moving to a loop-based approach: this fails to show the second, sorted hash table, in a class or not.
$hash = [hashtable]::New()
$type = 'conformset'
$hash.Add($type, 1)
$type = 'applyset'
$hash.Add($type , 1)
$type = 'conformset'
$hash.$type ++
$hash.$type ++
foreach ($key in $hash.Keys) {
Write-Host "$key $($hash.$key)!"
}
Write-Host
$hash = ($hash.GetEnumerator() | Sort-Object -property:Value)
foreach ($key in $hash.Keys) {
Write-Host "$key $($hash.$key)!!"
}
But, very weirdly, this shows only the first loop based output, but BOTH of the direct dumps.
$hash = [hashtable]::New()
$type = 'conformset'
$hash.Add($type, 1)
$type = 'applyset'
$hash.Add($type , 1)
$type = 'conformset'
$hash.$type ++
$hash.$type ++
foreach ($key in $hash.Keys) {
Write-Host "$key $($hash.$key)!"
}
$hash
Write-Host
$hash = ($hash.GetEnumerator() | Sort-Object -property:Value)
foreach ($key in $hash.Keys) {
Write-Host "$key $($hash.$key)!!"
}
$hash
The output now is
conformset 3!
applyset 1!
Name Value
---- -----
conformset 3
applyset 1
applyset 1
conformset 3
So obviously $hash is being sorted. But the loop won't show it? Huh? Is this buggy behavior, or intended behavior I just don't understand the reason for, and therefore the way around?
Vasil Svilenov Nikolov's helpful answer explains the fundamental problem with your approach:
You fundamentally cannot sort a hash table ([hashtable] instance) by its keys: the ordering of keys in a hash table is not guaranteed and cannot be changed.
What $hash = $hash.GetEnumerator() | Sort-Object -property:Value does is to instead create an array of [System.Collections.DictionaryEntry] instances; the resulting array has no .Keys property, so your second foreach ($key in $hash.Keys) loop is never entered.
An unrelated problem is that you generally cannot implicitly write to the output stream from PowerShell classes:
Writing to the output stream from a class method requires explicit use of return; similarly, errors must be reported via Throw statements.
In your case, the code is in a constructor for class Test, and constructors implicitly return the newly constructed instance - you're not allowed to return anything from them.
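To make the point about classes concrete, here is a sketch in which the constructor only stores state and a method hands the sorted entries back with an explicit return. The property name Counts and the method name GetSortedEntries are invented for this illustration and are not part of the original code.

```powershell
class Test {
    [hashtable] $Counts = @{}

    Test() {
        # Constructors cannot emit output; keep the data on the instance instead.
        $this.Counts['conformset'] = 3
        $this.Counts['applyset']   = 1
    }

    # A method that produces output declares a return type and uses 'return'.
    [object[]] GetSortedEntries() {
        return @($this.Counts.GetEnumerator() | Sort-Object -Property Value)
    }
}

$entries = [Test]::new().GetSortedEntries()
$entries | ForEach-Object { '{0} = {1}' -f $_.Key, $_.Value }
```

The caller then decides what to do with the returned entries, for example printing them as above.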
To solve your problem, you need a specialized data type that combines the features of a hash table with maintaining the entry keys in sort order.[1]
.NET type System.Collections.SortedList provides this functionality (there's also a generic version, as Lee Dailey notes):
You can use that type to begin with:
# Create a SortedList instance, which will maintain
# the keys in sorted order, as entries are being added.
$sortedHash = [System.Collections.SortedList]::new()
$type = 'conformset'
$sortedHash.Add($type, 1) # Or: $sortedHash[$type] = 1 or: $sortedHash.$type = 1
$type = 'applyset'
$sortedHash.Add($type , 1)
$type = 'conformset'
$sortedHash.$type++
$sortedHash.$type++
Or even convert from (and to) an existing hash table:
# Construct the hash table as before...
$hash = [hashtable]::new() # Or: $hash = @{}
$type = 'conformset'
$hash.Add($type, 1)
$type = 'applyset'
$hash.Add($type , 1)
$type = 'conformset'
$hash.$type++
$hash.$type++
# ... and then convert it to a SortedList instance with sorted keys.
$hash = [System.Collections.SortedList] $hash
[1] Note that this is different from an ordered dictionary, which PowerShell offers with literal syntax [ordered] @{ ... }: an ordered dictionary maintains the keys in the order in which they are inserted, not based on sorting. Ordered dictionaries are of type System.Collections.Specialized.OrderedDictionary
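If the goal is to walk the entries in order of their values rather than their keys, as in the question, one option is to sort the enumerated entries and copy them into an ordered dictionary. This is a sketch using the counts from the question; $byValue is a name introduced here, and the dictionary will not re-sort itself if values change later.

```powershell
$hash = @{ conformset = 3; applyset = 1 }

# Sort the entries by value, then re-insert them in that order.
$byValue = [ordered]@{}
foreach ($entry in ($hash.GetEnumerator() | Sort-Object -Property Value)) {
    $byValue[$entry.Key] = $entry.Value
}

# The ordered dictionary still has .Keys, so a key-based loop keeps working.
foreach ($key in $byValue.Keys) {
    Write-Host "$key $($byValue[$key])"
}
```

Unlike the array produced by Sort-Object alone, the result can still be indexed by key.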
When you do $hash = $hash.GetEnumerator() | Sort-Object -property:Value, you are re-assigning $hash from a hashtable to an array (try $hash.GetType()), and that will of course behave differently than a hashtable. You can check out its methods etc. with Get-Member -InputObject $hash.
I don't think you can sort a hashtable, and you do not need to. You might instead try an ordered dictionary: $hash = [Ordered]@{}
Ordered dictionaries differ from hash tables in that the keys always
appear in the order in which you list them. The order of keys in a
hash table is not determined.
One of the things I like best about a hashtable is the speed of lookup.
For example, you can instantly get the value for a name in the hashtable the following way: $hash['applyset']
If you want to know more about how hashtables work, and how/when to use them, I think this article is a good start:
https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_hash_tables?view=powershell-6

PS Object unescape character

I have a small error when running my code. I assign a string to a custom object, but it's parsing the string by itself and throwing an error.
Code:
foreach ($item in $hrdblistofobjects) {
[string]$content = Get-Content -Path $item
[string]$content = $content.Replace("[", "").Replace("]", "")
#here is line 43 which is shown as error as well
foreach ($object in $listofitemsdb) {
$result = $content -match $object
$OurObject = [PSCustomObject]@{
ObjectName = $null
TestObjectName = $null
Result = $null
}
$OurObject.ObjectName = $item
$OurObject.TestObjectName = $object #here is line 52 which is other part of error
$OurObject.Result = $result
$Resultsdb += $OurObject
}
}
This code loads an item and checks whether an object exists within that item; basically, whether one string occurs within another, and then it saves the result to a variable. I am using this code for other objects and items, but they don't have that \p part, which I am assuming is the issue. I can't put $object into single quotes for obvious reasons (this was suggested on the internet, but in my case it's not possible). So is there any other way to unescape \p? I tried $object.Replace("\PMS","\\PMS") but that did not work either (this was suggested somewhere too).
EDIT:
$Resultsdb = @(foreach ($item in $hrdblistofobjects) {
[string]$content = Get-Content -Path $item
[string]$content = $content.Replace("[", "").Replace("]", "")
foreach ($object in $listofitemsdb) {
[PSCustomObject]@{
ObjectName = $item
TestObjectName = $object
Result = $content -match $object
}
}
}
)
$Resultsdb is not defined as an array, hence you get that error when you try to add one object to another object when that doesn't implement the addition operator.
You shouldn't be appending to an array in a loop anyway. That will perform poorly, because with each iteration it creates a new array with the size increased by one, copies all elements from the existing array, puts the new item in the new free slot, and then replaces the original array with the new one.
A better approach is to just output your objects in the loop and collect the loop output in a variable:
$Resultsdb = foreach ($item in $hrdblistofobjects) {
...
foreach ($object in $listofitemsdb) {
[PSCustomObject]@{
ObjectName = $item
TestObjectName = $object
Result = $content -match $object
}
}
}
Run the loop in an array subexpression if you need to ensure that the result is an array; otherwise it will be empty or a single object when the loop returns fewer than two results.
$Resultsdb = @(foreach ($item in $hrdblistofobjects) {
...
})
Note that you need to suppress other output on the default output stream in the loop, so that it doesn't pollute your result.
I changed the match part to this and it's working fine: $result = $content -match $object.Replace("\PMS","\\PMS")
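Since -match treats its right-hand operand as a regular expression, a backslash sequence such as \P in the search string is parsed as regex syntax. A more general alternative to hand-replacing one backslash is to escape the whole string with [regex]::Escape(), or to skip regex altogether with a substring test. The sketch below uses made-up sample values, since the real $content and $object come from files that are not shown in the question.

```powershell
# Hypothetical sample data standing in for the file content and the search term.
$content = 'server01\PMS_Table, server01\Other_Table'
$object  = 'server01\PMS_Table'

# [regex]::Escape() backslash-escapes every regex metacharacter in the string.
$regexResult = $content -match [regex]::Escape($object)

# String.Contains() does a literal, case-sensitive substring test with no regex involved.
$literalResult = $content.Contains($object)

$regexResult, $literalResult
```

Note the difference in case handling: -match ignores case by default, while Contains() does not.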
Sorry for errors in posting. I will amend that.

What is '@{}' meaning in PowerShell

I have a script to review here, and I noticed a variable declared with this value:
function readConfig {
Param([string]$fileName)
$config = @{}
Get-Content $fileName | Where-Object {
$_ -like '*=*'
} | ForEach-Object {
$key, $value = $_ -split '\s*=\s*', 2
$config[$key] = $value
}
return $config
}
I wonder what @{} means in $config = @{}?
@{} in PowerShell defines a hashtable, a data structure for mapping unique keys to values (in other languages this data structure is called "dictionary" or "associative array").
@{} on its own defines an empty hashtable that can then be filled with values, e.g. like this:
$h = @{}
$h['a'] = 'foo'
$h['b'] = 'bar'
Hashtables can also be defined with their content already present:
$h = @{
'a' = 'foo'
'b' = 'bar'
}
Note, however, that when you see similar notation in PowerShell output, e.g. like this:
abc: 23
def: @{"a"="foo";"b"="bar"}
that is usually not a hashtable, but the string representation of a custom object.
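A quick way to see that distinction is to nest a custom object inside another one and convert it to a string. This is only an illustrative sketch: the property names abc and def mirror the output above, and the values are placeholders.

```powershell
$obj = [PSCustomObject]@{
    abc = 23
    def = [PSCustomObject]@{ a = 'foo'; b = 'bar' }
}

# Converting a custom object to a string yields the @{...} notation.
$text = "$($obj.def)"
$text

# It looks similar to a hashtable literal, but the value is not a hashtable.
$obj.def -is [hashtable]
```

So the notation in output describes an object's properties; it is not something that can be indexed like $h['a'].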
The meaning of @{} can be seen in different ways.
If the @{} is empty, an empty hash table is defined.
But if there is something between the curly brackets, it can also be used in the context of a splatting operation.
Hash Table
Splatting
I think there is no need to explain what a hash table is.
Splatting is a method of passing a collection of parameter values to a command as a unit.
# The keys must be parameter names of the command being called
$prints = @{
Object = "John Doe"
ForegroundColor = "Red"
}
Write-Host @prints
Hope it helps! BR
Edit:
Regarding the updated code from the questioner, the answer is:
It defines an empty hash table.
Be aware that Get-Content has its own parameters!
The most important one:
[-Raw]

Powershell: how to fetch a single column from a multi-dimensional array?

Is there a function, method, or language construct that retrieves a single column from a multi-dimensional array in PowerShell?
$my_array = @()
$my_array += ,@(1,2,3)
$my_array += ,@(4,5,6)
$my_array += ,@(7,8,9)
# I currently use that, and I want to find a better way:
foreach ($line in $my_array) {
[array]$single_column += $line[1] # fetch column 1
}
# now $single_column contains only 2 and 5 and 8
My final goal is to find non-duplicated values from one column.
Sorry, I don't think anything like that exists. I would go with:
@($my_array | foreach { $_[1] })
To quickly find unique values I tend to use the hashtable-keys hack:
$UniqueArray = @($my_array | foreach -Begin {
$unique = @{}
} -Process {
$unique.($_[1]) = $null
} -End {
$unique.Keys
})
Obviously it has its limitations...
To extract one column:
$single_column = $my_array | foreach { $_[1] }
To extract any columns:
$some_columns = $my_array | foreach { ,@($_[2],$_[1]) } # any order
To find non-duplicated values from one column:
$unique_array = $my_array | foreach {$_[1]} | sort-object -unique
# caveat: the resulting array is sorted,
# so BartekB has a better solution if sorting is a problem
I tried @BartekB's solution and it worked for me. But for the unique part I did the following.
@($my_array | foreach { $_[1] } | select -Unique)
I am not very familiar with PowerShell, but I am posting this hoping it helps others, since it worked for me.