Merging hashtables in PowerShell: how? - powershell

I am trying to merge two hashtables, overwriting key-value pairs in the first if the same key exists in the second.
To do this I wrote this function, which first removes all key-value pairs in the first hashtable if the same key exists in the second hashtable.
When I type this into PowerShell line by line it works. But when I run the entire function, PowerShell asks me to provide (what it considers) missing parameters to foreach-object.
function mergehashtables($htold, $htnew)
{
$htold.getenumerator() | foreach-object
{
$key = $_.key
if ($htnew.containskey($key))
{
$htold.remove($key)
}
}
$htnew = $htold + $htnew
return $htnew
}
Output:
PS C:\> mergehashtables $ht $ht2
cmdlet ForEach-Object at command pipeline position 1
Supply values for the following parameters:
Process[0]:
$ht and $ht2 are hashtables containing two key-value pairs each, one of them with the key "name" in both hashtables.
What am I doing wrong?

Merge-Hashtables
Instead of removing keys, you might consider simply overwriting them:
$h1 = @{a = 9; b = 8; c = 7}
$h2 = @{b = 6; c = 5; d = 4}
$h3 = @{c = 3; d = 2; e = 1}
Function Merge-Hashtables {
$Output = @{}
ForEach ($Hashtable in ($Input + $Args)) {
If ($Hashtable -is [Hashtable]) {
ForEach ($Key in $Hashtable.Keys) {$Output.$Key = $Hashtable.$Key}
}
}
$Output
}
For this cmdlet you can use several syntaxes and you are not limited to two input tables:
Using the pipeline: $h1, $h2, $h3 | Merge-Hashtables
Using arguments: Merge-Hashtables $h1 $h2 $h3
Or a combination: $h1 | Merge-Hashtables $h2 $h3
All above examples return the same hash table:
Name Value
---- -----
e 1
d 2
b 6
c 3
a 9
If there are any duplicate keys in the supplied hash tables, the value of the last hash table is taken.
(Added 2017-07-09)
Merge-Hashtables version 2
In general, I prefer more global functions that can be customized with parameters to specific needs, as in the original question: "overwriting key-value pairs in the first if the same key exists in the second". Why let the last one overrule and not the first? Why remove anything at all? Maybe someone else wants to merge or join the values, or get the largest value, or just the average...
The version below no longer supports supplying hash tables as arguments (you can only pipe hash tables to the function), but it has a parameter that lets you decide how to treat the value array in duplicate entries by operating on the value array assigned to the hash key, presented in the current object ($_).
Function
Function Merge-Hashtables([ScriptBlock]$Operator) {
$Output = @{}
ForEach ($Hashtable in $Input) {
If ($Hashtable -is [Hashtable]) {
ForEach ($Key in $Hashtable.Keys) {$Output.$Key = If ($Output.ContainsKey($Key)) {@($Output.$Key) + $Hashtable.$Key} Else {$Hashtable.$Key}}
}
}
If ($Operator) {ForEach ($Key in @($Output.Keys)) {$_ = @($Output.$Key); $Output.$Key = Invoke-Command $Operator}}
$Output
}
Syntax
HashTable[] <Hashtables> | Merge-Hashtables [-Operator <ScriptBlock>]
Default
By default, all values from duplicated hash table entries will be added to an array:
PS C:\> $h1, $h2, $h3 | Merge-Hashtables
Name Value
---- -----
e 1
d {4, 2}
b {8, 6}
c {7, 5, 3}
a 9
Examples
To get the same result as version 1 (using the last values) use the command: $h1, $h2, $h3 | Merge-Hashtables {$_[-1]}. If you would like to use the first values instead, the command is: $h1, $h2, $h3 | Merge-Hashtables {$_[0]} or the largest values: $h1, $h2, $h3 | Merge-Hashtables {($_ | Measure-Object -Maximum).Maximum}.
More examples:
PS C:\> $h1, $h2, $h3 | Merge-Hashtables {($_ | Measure-Object -Average).Average} # Take the average values
Name Value
---- -----
e 1
d 3
b 7
c 5
a 9
PS C:\> $h1, $h2, $h3 | Merge-Hashtables {$_ -Join ""} # Join the values together
Name Value
---- -----
e 1
d 42
b 86
c 753
a 9
PS C:\> $h1, $h2, $h3 | Merge-Hashtables {$_ | Sort-Object} # Sort the values list
Name Value
---- -----
e 1
d {2, 4}
b {6, 8}
c {3, 5, 7}
a 9

I see two problems:
The open brace should be on the same line as Foreach-object
You shouldn't modify a collection while enumerating through a collection
The example below illustrates how to fix both issues:
function mergehashtables($htold, $htnew)
{
$keys = $htold.getenumerator() | foreach-object {$_.key}
$keys | foreach-object {
$key = $_
if ($htnew.containskey($key))
{
$htold.remove($key)
}
}
$htnew = $htold + $htnew
return $htnew
}

Not a new answer; this is functionally the same as @Josh-Petitt's, with improvements.
In this answer:
Merge-HashTable uses the correct PowerShell syntax, in case you want to drop it into a module
The original wasn't idempotent. I added cloning of the HashTable input; otherwise your input was clobbered, which was not the intention
I added a proper usage example
function Merge-HashTable {
param(
[hashtable] $default, # Your original set
[hashtable] $uppend # The set you want to update/append to the original set
)
# Clone for idempotence
$default1 = $default.Clone();
# We need to remove any key-value pairs in $default1 that we will
# be replacing with key-value pairs from $uppend
foreach ($key in $uppend.Keys) {
if ($default1.ContainsKey($key)) {
$default1.Remove($key);
}
}
# Union both sets
return $default1 + $uppend;
}
# Real-life example of dealing with IIS AppPool parameters
$defaults = @{
enable32BitAppOnWin64 = $false;
runtime = "v4.0";
pipeline = 1;
idleTimeout = "1.00:00:00";
} ;
$options1 = @{ pipeline = 0; };
$options2 = @{ enable32BitAppOnWin64 = $true; pipeline = 0; };
$results1 = Merge-HashTable -default $defaults -uppend $options1;
# Name Value
# ---- -----
# enable32BitAppOnWin64 False
# runtime v4.0
# idleTimeout 1.00:00:00
# pipeline 0
$results2 = Merge-HashTable -default $defaults -uppend $options2;
# Name Value
# ---- -----
# idleTimeout 1.00:00:00
# runtime v4.0
# enable32BitAppOnWin64 True
# pipeline 0

In case you want to merge the whole hashtable tree
function Join-HashTableTree {
param (
[Parameter(Mandatory = $true, ValueFromPipeline = $true)]
[hashtable]
$SourceHashtable,
[Parameter(Mandatory = $true, Position = 0)]
[hashtable]
$JoinedHashtable
)
$output = $SourceHashtable.Clone()
foreach ($key in $JoinedHashtable.Keys) {
$oldValue = $output[$key]
$newValue = $JoinedHashtable[$key]
$output[$key] =
if ($oldValue -is [hashtable] -and $newValue -is [hashtable]) { $oldValue | ~+ $newValue }
elseif ($oldValue -is [array] -and $newValue -is [array]) { $oldValue + $newValue }
else { $newValue }
}
$output;
}
Then, it can be used like this:
Set-Alias -Name '~+' -Value Join-HashTableTree -Option AllScope
@{
a = 1;
b = @{
ba = 2;
bb = 3
};
c = @{
val = 'value1';
arr = @(
'Foo'
)
}
} |
~+ @{
b = @{
bb = 33;
bc = 'hello'
};
c = @{
arr = @(
'Bar'
)
};
d = @(
42
)
} |
ConvertTo-Json
It will produce the following output:
{
"a": 1,
"d": 42,
"c": {
"val": "value1",
"arr": [
"Foo",
"Bar"
]
},
"b": {
"bb": 33,
"ba": 2,
"bc": "hello"
}
}

I just needed to do this and found this works:
$HT += $HT2
The contents of $HT2 get added to the contents of $HT.
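One caveat worth checking before relying on this: as far as I know, the + operator on hashtables throws when both sides contain the same key, so += only works when the two tables have no keys in common. A minimal sketch of both cases (the names $dup and $failed are made up for illustration):

```powershell
$HT  = @{ a = 1; b = 2 }
$HT2 = @{ c = 3 }

# No shared keys: the contents of $HT2 are added to $HT.
$HT += $HT2

# Shared key 'a': the addition throws, so $HT is left as it was.
$dup = @{ a = 99 }
$failed = $false
try {
    $HT += $dup
} catch {
    $failed = $true
    Write-Host "Merge failed: $($_.Exception.Message)"
}

Write-Host "Count: $($HT.Count), a: $($HT['a']), failed: $failed"
```

If overlapping keys are possible, one of the overwrite-style functions from the other answers is probably the safer choice.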

The open brace has to be on the same line as ForEach-Object or you have to use the line continuation character (backtick).
This is the case because the code within { ... } is really the value for the -Process parameter of ForEach-Object cmdlet.
-Process <ScriptBlock[]>
Specifies the script block that is applied to each incoming object.
This will get you past the current issue at hand.
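To make the two options concrete, here is a small sketch that does not touch the hashtables at all. The variable names $doubled and $tripled are just for illustration:

```powershell
# Option 1: open brace on the same line, so the script block binds to -Process.
$doubled = 1..3 | ForEach-Object {
    $_ * 2
}

# Option 2: open brace on the next line, joined to the command by a trailing backtick.
# The backtick must be the very last character on its line.
$tripled = 1..3 | ForEach-Object `
{
    $_ * 3
}

Write-Host ($doubled -join ',')   # 2,4,6
Write-Host ($tripled -join ',')   # 3,6,9
```

The first form is the more common one, mostly because a trailing backtick is easy to break with invisible trailing whitespace.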

I think the most compact code to merge (without overwriting existing keys) would be this:
function Merge-Hashtables($htold, $htnew)
{
$htnew.keys | where {$_ -notin $htold.keys} | foreach {$htold[$_] = $htnew[$_]}
}
I borrowed it from Union and Intersection of Hashtables in PowerShell

I wanted to point out that one should not reference base properties of the hashtable indiscriminately in generic functions, as they may have been overridden (or overloaded) by items of the hashtable.
For instance, the hashtable $hash=@{'keys'='lots of them'} will have the base hashtable property Keys overridden by the item keys, and thus foreach ($key in $hash.Keys) will enumerate the value of the hashed item keys instead of the base property Keys.
Instead the method GetEnumerator or the keys property of the PSBase property, which cannot be overridden, should be used in functions that may have no idea if the base properties have been overridden.
Thus, Jon Z's answer is the best.
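A short sketch of the effect described above. The extra entry "other" and the variable names are only there for illustration:

```powershell
# A hashtable with an entry whose name collides with the Keys property.
$hash = @{ keys = 'lots of them'; other = 1 }

# Dot notation resolves to the entry, not to the property.
$shadowed = $hash.Keys

# PSBase exposes the underlying .NET object, so this is the real key collection.
$realKeys = @($hash.PSBase.Keys)

# GetEnumerator() is a method call, so the entry does not get in its way either.
$enumerated = @($hash.GetEnumerator() | ForEach-Object { $_.Key })

Write-Host "shadowed: $shadowed"
Write-Host "real key count: $($realKeys.Count)"
Write-Host "enumerated key count: $($enumerated.Count)"
```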

To 'inherit' key-values from parent hashtable ($htOld) to child hashtables($htNew), without modifying values of already existing keys in the child hashtables,
function MergeHashtable($htOld, $htNew)
{
$htOld.Keys | %{
if (!$htNew.ContainsKey($_)) {
$htNew[$_] = $htOld[$_];
}
}
return $htNew;
}
Please note that this will modify the $htNew object.
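If the child hashtable must stay untouched, one possible variation is to work on a clone. This is only a sketch: the function name Merge-HashtableCopy and the sample data are made up, and Clone() makes a shallow copy, so nested objects are still shared.

```powershell
function Merge-HashtableCopy($htOld, $htNew)
{
    # Work on a shallow copy so the caller's $htNew is not modified.
    $result = $htNew.Clone()
    foreach ($key in $htOld.PSBase.Keys) {
        if (-not $result.ContainsKey($key)) {
            $result[$key] = $htOld[$key]
        }
    }
    return $result
}

$parent = @{ color = 'red'; size = 1 }
$child  = @{ size = 2 }
$merged = Merge-HashtableCopy $parent $child

Write-Host "merged: color=$($merged['color']) size=$($merged['size'])"
Write-Host "child still has $($child.Count) key(s)"
```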

Here is a function version that doesn't use the pipeline (not that the pipeline is bad, just another way to do it). It returns a merged hashtable, but note that it removes the overlapping keys from $a in the process, so pass in a clone of $a if the original must stay unchanged.
function MergeHashtable($a, $b)
{
foreach ($k in $b.keys)
{
if ($a.containskey($k))
{
$a.remove($k)
}
}
return $a + $b
}

I just wanted to expand on, or rather simplify, jon Z's answer. There seem to be too many lines and missed opportunities to use Where-Object. Here is my simplified version:
Function merge_hashtables($htold, $htnew) {
# @(...) copies the keys first, so removing entries does not disturb the enumeration
@($htold.Keys) | ? { $htnew.ContainsKey($_) } | % {
$htold.Remove($_)
}
$htold += $htnew
return $htold
}

Related

Array of reference to var in powershell

Ok I guess this question has already been answered somewhere but I do not find it. So here is my few lines of codes
$a = 0
$b = 0
$c = 0
$array = @($a, $b, $c)
foreach ($var in $array) {
$var = 3
}
Write-Host "$a : $b : $c"
What I try to do is loop into $array and modify a, b and c variables to get 3 : 3 : 3 ... I find something about [ref] but I am not sure I understood how to use it.
You'll need to wrap the values in objects of a reference type (eg. a PSObject) and then assign to a property on said object:
$a = [pscustomobject]@{ Value = 0 }
$b = [pscustomobject]@{ Value = 0 }
$c = [pscustomobject]@{ Value = 0 }
$array = @($a, $b, $c)
foreach ($var in $array) {
$var.Value = 3
}
Write-Host "$($a.Value) : $($b.Value) : $($c.Value)"
Since $a and $array[0] now both contain a reference to the same object, updates to properties on either will be reflected when accessed through the other
As you mentioned, you can use [ref]. It creates a reference object with a "Value" property, and that is what you have to set in order to change the original variables.
$a = 1
$b = 2
$c = 3
$array = @(
([ref] $a),
([ref] $b),
([ref] $c)
)
foreach ($item in $array)
{
$item.Value = 3
}
Write-Host "a: $a, b: $b, c: $c" # a: 3, b: 3, c: 3
You could also use the function Get-Variable to get variables:
$varA = Get-Variable -Name a
This way you can get more information about the variable like the name.
And if your variables have some kind of prefix you could get them all using a wildcard.
$variables = Get-Variable -Name my*
And you would get all variables that start with "my".
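Applied to the original question, this could look roughly like the sketch below. It assumes the variables share a prefix; "demo_" is made up here, and it deliberately avoids "my*", which would also match the automatic variable $MyInvocation. The loop touches every variable in the session that matches the wildcard, so pick a prefix that is really yours.

```powershell
$demo_a = 0
$demo_b = 0
$demo_c = 0

# Get-Variable returns PSVariable objects; assigning to .Value updates the variable itself.
foreach ($variable in Get-Variable -Name 'demo_*') {
    $variable.Value = 3
}

Write-Host "$demo_a : $demo_b : $demo_c"   # 3 : 3 : 3
```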

Powershell - Normal variables are behaving like reference variables

Here is the simple powershell code -
$arr = @()
$a = [PSCustomObject]@{
a = 'a'
b = 'b'
}
$arr += $a
$b = [PSCustomObject]@{
a = 'c'
b = 'd'
}
$arr += $b
$f = $arr | Where-Object {$_.a -eq 'a'}
$f.a = '1'
Write-Host "`$a.a=$($a.a); `$f.a=$($f.a)"
$f.a = '11'
Write-Host "`$a.a=$($a.a); `$f.a=$($f.a)"
Output:
$a.a=1; $f.a=1
$a.a=11; $f.a=11
My problem is - How the changing of $f value is also changing the value of $a value? I'm not aware of this concept.
And, what can I do to avoid this behavior?
Type [PSCustomObject] is a reference type, so multiple variables can "point to" (reference) a given instance; see this answer for more information about reference types vs. value types.
If you want to create a copy (a shallow clone) of a [PSCustomObject] instance, call .psobject.Copy():
$f = ($arr | Where-Object {$_.a -eq 'a'}).psobject.Copy()
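A minimal sketch of the difference between sharing a reference and working on a copy. The variable names are illustrative only:

```powershell
$original = [pscustomobject]@{ a = 'a'; b = 'b' }

# Plain assignment copies the reference: both names point to the same object.
$sameObject = $original

# psobject.Copy() creates a new object with the same (shallowly copied) properties.
$copy = $original.psobject.Copy()

$sameObject.a = '1'   # visible through $original as well
$copy.a = '2'         # affects only the copy

Write-Host "original: $($original.a); copy: $($copy.a)"   # original: 1; copy: 2
```

Because the copy is shallow, a property that itself holds a reference type (another object, an array, a hashtable) would still be shared between the two.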

Powershell array of arrays [duplicate]

This question already has answers here:
Powershell create array of arrays
(3 answers)
Closed 5 years ago.
This is building $ret into a long 1 dimensional array rather than an array of arrays. I need it to be an array that is populated with $subret objects. Thanks.
$ret = @()
foreach ($item in $items){
$subret = @()
$subRet = $item.Name , $item.Value
$ret += $subret
}
There might be other ways, but ArrayList normally works for me. In this case I would do:
$ret = New-Object System.Collections.ArrayList
and then
$ret.add($subret)
The suspected preexisting duplicate question is indeed a duplicate:
Given that + with an array as the LHS concatenates arrays, you must nest the RHS with the unary form of , (the array-construction operator) if it is an array that should be added as a single element:
# Sample input
$items = [pscustomobject] @{ Name = 'n1'; Value = 'v1'},
[pscustomobject] @{ Name = 'n2'; Value = 'v2'}
$ret = @() # create an empty *array*
foreach ($item in $items) {
$subret = $item.Name, $item.Value # use of "," implicitly creates an array
$ret += , $subret # unary "," creates a 1-item array
}
# Show result
$ret.Count; '---'; $ret[0]; '---'; $ret[1]
This yields:
2
---
n1
v1
---
n2
v2
The reason the use of [System.Collections.ArrayList] with its .Add() method worked too - a method that is generally preferable when building large arrays - is that .Add() only accepts a single object as the item to add, irrespective of whether that object is a scalar or an array:
# Sample input
$items = [pscustomobject] @{ Name = 'n1'; Value = 'v1'},
[pscustomobject] @{ Name = 'n2'; Value = 'v2'}
$ret = New-Object System.Collections.ArrayList # create an *array list*
foreach ($item in $items) {
$subret = $item.Name, $item.Value
# .Add() appends whatever object you pass it - even an array - as a *single* element.
# Note the need for $null = to suppress output of .Add()'s return value.
$null = $ret.Add($subret)
}
# Produce sample output
$ret.Count; '---'; $ret[0]; '---'; $ret[1]
The output is the same as above.
Edit
It is more convoluted to create an array of tuples than fill an array with PsObjects containing Name Value as the two properties.
Select the properties you want from $item then add them to the array
$item = $item | select Name, Value
$arr = @()
$arr += $item
You can reference the values in this array by doing this
foreach($obj in $arr)
{
$name = $obj.Name
$value = $obj.Value
# Do actions with the values
}

Compare objects based on subset of properties

Say I have two PowerShell hashtables, one big and one small, and for a specific purpose I want to call them equal if, for every key in the small one, the big hashtable holds the same value.
Also I don't know the names of the keys in advance. I can use the following function that uses Invoke-Expression but I am looking for nicer solutions, that don't rely on this.
Function Compare-Subset {
Param(
[hashtable] $big,
[hashtable] $small
)
$keys = $small.keys
Foreach($k in $keys) {
$expression = '$val = $big.' + "$k" + ' -eq ' + '$small.' + "$k"
Invoke-Expression $expression
If(-not $val) {return $False}
}
return $True
}
$big = @{name='Jon'; car='Honda'; age='30'}
$small = @{name = 'Jon'; car='Honda'}
Compare-Subset $big $small
A simple $true/$false can easily be gotten. This will return $true if there are no differences:
[string]::IsNullOrWhiteSpace($($small|Select -Expand Keys|Where{$Small[$_] -ne $big[$_]}))
It checks for all keys in $small to see if the value of that key in $small is the same of the value for that key in $big. It will only output any values that are different. It's wrapped in a IsNullOrWhitespace() method from the [String] type, so if any differences are found it returns false. If you want to list differences just remove that method.
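The same check can also be spelled out as a loop without Invoke-Expression. The following is a sketch rather than a drop-in replacement: the function name Test-HashtableSubset is made up, and it assumes the values are scalars, since -ne behaves as a filter when its left side is an array.

```powershell
function Test-HashtableSubset {
    param(
        [hashtable] $Big,
        [hashtable] $Small
    )
    foreach ($key in $Small.PSBase.Keys) {
        # Fail when the key is missing from $Big or the two values differ.
        if (-not $Big.ContainsKey($key) -or $Big[$key] -ne $Small[$key]) {
            return $false
        }
    }
    return $true
}

$big   = @{ name = 'Jon'; car = 'Honda'; age = '30' }
$small = @{ name = 'Jon'; car = 'Honda' }

Test-HashtableSubset $big $small               # True
Test-HashtableSubset $big @{ name = 'Jane' }   # False
```

Indexing with $Big[$key] instead of building an expression string also means key names containing spaces or special characters need no extra handling.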
This could be the start of something. Not sure what output you are looking for but this will output the differences between the two groups. Using the same sample data that you provided:
$results = Compare-Object ($big.GetEnumerator() | % { $_.Name }) ($small.GetEnumerator() | % { $_.Name })
$results | ForEach-Object{
$key = $_.InputObject
Switch($_.SideIndicator){
"<="{"Only reference object has the key: '$key'"}
"=>"{"Only difference object has the key: '$key'"}
}
}
In primetime you would want something different but just to show you the above would yield the following output:
Only reference object has the key: 'age'

Updating hash table values in a 'foreach' loop in PowerShell

I'm trying to loop through a hash table and set the value of each key to 5 and PowerShell gives an error:
$myHash = @{}
$myHash["a"] = 1
$myHash["b"] = 2
$myHash["c"] = 3
foreach($key in $myHash.keys){
$myHash[$key] = 5
}
An error occurred while enumerating through a collection:
Collection was modified; enumeration operation may not execute..
At line:1 char:8
+ foreach <<<< ($key in $myHash.keys){
+ CategoryInfo : InvalidOperation: (System.Collecti...tableEnumer
ator:HashtableEnumerator) [], RuntimeException
+ FullyQualifiedErrorId : BadEnumeration
What gives and how do I resolve this problem?
You can't modify Hashtable while enumerating it. This is what you can do:
$myHash = @{}
$myHash["a"] = 1
$myHash["b"] = 2
$myHash["c"] = 3
$myHash = $myHash.keys | foreach{$r=@{}}{$r[$_] = 5}{$r}
Edit 1
Is this any simpler for you:
$myHash = @{}
$myHash["a"] = 1
$myHash["b"] = 2
$myHash["c"] = 3
foreach($key in $($myHash.keys)){
$myHash[$key] = 5
}
There is a much simpler way of achieving this. You cannot change a hashtable while enumerating it, because any modification invalidates the enumerator. It's exactly the same story in .NET.
Use the following syntax to get around it. We convert the keys collection into a basic array using the @() notation. That makes a copy of the keys collection, and we enumerate that array instead, which means we can now edit the hashtable.
$myHash = @{}
$myHash["a"] = 1
$myHash["b"] = 2
$myHash["c"] = 3
foreach($key in @($myHash.keys)){
$myHash[$key] = 5
}
You do not need to clone the whole hashtable for this example. Just enumerating the key collection by forcing it to an array @(...) is enough:
foreach($key in @($myHash.keys)) {...
Use clone:
foreach($key in ($myHash.clone()).keys){
$myHash[$key] = 5
}
Or in the one-liner:
$myHash = ($myHash.clone()).keys | % {} {$myHash[$_] = 5} {$myHash}
I'm new to PowerShell, but I'm quite a fan of using in-built functions, because I find it more readable. This is how I would tackle the problem, using GetEnumerator and Clone. This approach also allows one to reference to the existing hash values ($_.value) for modifying purposes.
$myHash = @{}
$myHash["a"] = 1
$myHash["b"] = 2
$myHash["c"] = 3
$myHash.Clone().GetEnumerator() | foreach-object {$myHash.Set_Item($_.key, 5)}
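Since the example above always writes 5, here is a small sketch of the case where the new value depends on the existing one ($_.Value). Doubling is just an arbitrary choice for illustration:

```powershell
$myHash = @{ a = 1; b = 2; c = 3 }

# Enumerate a clone so that writing to $myHash does not disturb the enumeration.
$myHash.Clone().GetEnumerator() | ForEach-Object {
    $myHash[$_.Key] = $_.Value * 2
}

Write-Host "a=$($myHash['a']) b=$($myHash['b']) c=$($myHash['c'])"   # a=2 b=4 c=6
```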
You have to get creative!
$myHash = @{}
$myHash["a"] = 1
$myHash["b"] = 2
$myHash["c"] = 3
$keys = @()
[array] $keys = $myHash.keys
foreach($key in $keys)
{
$myHash.Set_Item($key, 5)
}
$myHash
Name Value
---- -----
c 5
a 5
b 5
As mentioned in a previous answer, clone is the way to go. I needed to replace any null values in a hash with "Unknown", and this one-liner does the job.
($record.Clone()).keys | %{if ($record.$_ -eq $null) {$record.$_ = "Unknown"}}
$myHash = @{
Americas = 0;
Asia = 0;
Europe = 0;
}
$countries = @("Americas", "Asia", "Europe", "Americas", "Asia")
foreach($key in $($myHash.Keys))
{
foreach($Country in $countries)
{
if($key -eq $Country)
{
$myHash[$key] += 1
}
}
}
$myHash
Updating a hash value if array elements matched with a hash key.
It seems when you update the hash table inside the foreach loop, the enumerator invalidates itself. I got around this by populating a new hash table:
$myHash = @{}
$myHash["a"] = 1
$myHash["b"] = 2
$myHash["c"] = 3
$newHash = @{}
foreach($key in $myHash.keys){
$newHash[$key] = 5
}
$myHash = $newHash