Comparing Two Arrays Without Using -Compare - powershell

I have two arrays: one contains multiple columns read in from a CSV file, and the other contains just server names. Both hold strings. For this comparison, I plan on using only the Name column from the CSV file. I don't want to use -compare because I still want to be able to use all the CSV columns with the results. Here is an example of data from each array.
csvFile.Name:
linu40944
windo2094
windo4556
compareFile:
linu40944
windo2094
linu24455
As you can see, they contain similar server names, except $csvFile.Name contains 25,000+ records, and $compareFile contains only 3,500.
I've tried:
foreach ($server in $compareFile) {
    if ($csvFile.Name -like $server) {
        $count++
    }
}
Every time I run this, it takes forever to run, and results in $count having a value in the millions when it should be roughly 3,000. I've tried different variations of -match, -eq, etc. where -like is. Also note that my end goal is to do something else where $count is, but for now I'm just trying to make sure it is outputting as much as it should, which it is not.
Am I doing something wrong here? Am I using the wrong formatting?

One possible thought given the size of your data.
Create a hashtable (dictionary) for every name in the first/larger file. Name is the Key. Value is 0 for each.
For each name in your second/smaller/compare file, add 1 to the value in your hashtable IF it exists. If it does not exist, what is your plan???
Afterwards, you can dump all keys and values and see which ones are 0, 1, or >1 which may or may not be of value to you.
If you need help with this code, I may be able to edit my answer. Since you are new to StackOverflow, perhaps you want to try this yourself first.
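A minimal sketch of the hashtable idea above, under some assumptions: the sample rows reuse the server names from the question, the Role column is made up for illustration, and the hashtable stores the whole CSV row as its value (rather than a counter) so that every column stays available after the match. Note that keys in a PowerShell @{} hashtable are matched case-insensitively.

```powershell
# Sample data standing in for the real inputs (Role is a made-up column).
$csvFile = @'
Name,Role
linu40944,web
windo2094,db
windo4556,app
'@ | ConvertFrom-Csv

$compareFile = 'linu40944', 'windo2094', 'linu24455'

# Index the larger set once: server name -> full CSV row.
$byName = @{}
foreach ($row in $csvFile) {
    $byName[$row.Name] = $row
}

# One hashtable lookup per name in the smaller set.
$found = @(foreach ($server in $compareFile) {
    if ($byName.ContainsKey($server)) { $byName[$server] }
})
$missing = @($compareFile | Where-Object { -not $byName.ContainsKey($_) })

$found.Count    # 2
$missing        # linu24455
```

Each lookup costs roughly the same regardless of how many rows the CSV has, so the work grows with the sum of the two list sizes instead of their product.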

Build custom objects from $compareFile (so that you can compare the same property), then use Compare-Object with the parameter -PassThru for the comparison. Discriminate the results using the SideIndicator.
$ref = $compareFile | ForEach-Object {
    New-Object -Type PSObject -Property @{
        'Name' = $_
    }
}
Compare-Object $csvFile $ref -Property Name -PassThru | Where-Object {
    $_.SideIndicator -eq '<='
} | Select-Object -Property * -Exclude SideIndicator
The trailing Select-Object removes the additional property SideIndicator that Compare-Object adds to the result.

Related

Powershell eq operator saying hashes are different, while Write-Host is showing the opposite

I have a script that periodically generates a list of all files in a directory, and then writes a text file of the results to a different directory.
I'd like to change this so it checks the newest text file in the output directory, and only makes a new one if there's differences. It seemed simple enough.
Here's what I tried:
First I get the most recent file in the directory, grab the hash, and write my variable values to the console:
$lastFile = gci C:\ReportOutputDir | sort LastWriteTime | select -last 1 | Select-Object -ExpandProperty FullName
$oldHash = Get-FileHash $lastFile | Select-Object Hash
Write-Host 'lastFile = '$lastFile
Write-Host 'oldHash = '$oldHash
Output:
lastFile = C:\ReportOutputDir\test1.txt
oldHash = @{Hash=E7787C54F5BAE236100A24A6F453A5FDF6E6C7333B60ED8624610EAFADF45521}
Then I do the exact same gci on the FileList dir, and create a new file (new_test.txt), then grab the hash of this file:
gci -Path C:\FileLists -File -Recurse -Name -Depth 2 | Sort-Object | out-file C:\ReportOutputDir\new_test.txt
$newFile = gci C:\ReportOutputDir | sort LastWriteTime | select -last 1 | Select-Object -ExpandProperty FullName
$newHash = Get-FileHash $newFile | Select-Object Hash
Write-Host 'newFile = '$newFile
Write-Host 'newHash = '$newHash
Output:
newFile = C:\ReportOutputDir\new_test.txt
newHash = @{Hash=E7787C54F5BAE236100A24A6F453A5FDF6E6C7333B60ED8624610EAFADF45521}
Finally, I attempt my -eq comparison, where I'd usually simply remove the newFile if the hashes are equal. For now, I'm just doing a simple check:
if ($newHash -eq $oldHash){
'files are equal'
}
else {'files are not equal'}
And somehow, I'm getting
files are not equal
What gives? Also, for the record I was originally trying to save the gci output to a variable and comparing the contents of the last file to the gci output, but was also having trouble with the -eq operator. Fairly new to powershell stuff so I'm sure I'm doing something wrong here.
Select-Object Hash creates an object with a .Hash property and it is that property that contains the hash string.
The object returned is of type [pscustomobject], and two instances of this type never compare as equal - even if all their property names and values are equal:
The reason is that reference equality is tested, because [pscustomobject] is a .NET reference type that doesn't define custom equality-testing logic.
Testing reference equality means that only two references to the very same instance compare as equal.
A quick example:
PS> [pscustomobject] @{ foo = 1 } -eq [pscustomobject] @{ foo = 1 }
False # !! Two distinct instances aren't equal, no matter what they contain.
You have two options:
Compare the .Hash property values, not the objects as a whole:
if ($newHash.Hash -eq $oldHash.Hash) { # ...
If you don't need a [pscustomobject] wrapper for the hash strings, use Select-Object's -ExpandProperty parameter instead of the (possibly positionally implied) -Property parameter:
Select-Object -ExpandProperty Hash
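To make the two options concrete, here is a small sketch that hashes two temporary files with identical content. The file contents and variable names are made up for illustration; only Get-FileHash, Select-Object and the comparison mirror the question.

```powershell
# Two temporary files with identical content (stand-ins for the old and new reports).
$oldFile = [System.IO.Path]::GetTempFileName()
$newFile = [System.IO.Path]::GetTempFileName()
Set-Content -Path $oldFile -Value 'a.txt', 'b.txt'
Set-Content -Path $newFile -Value 'a.txt', 'b.txt'

# Wrapped: Select-Object Hash yields objects that carry a .Hash property.
$oldWrapped = Get-FileHash $oldFile | Select-Object Hash
$newWrapped = Get-FileHash $newFile | Select-Object Hash
$wrappedEqual  = $newWrapped -eq $oldWrapped              # False: two distinct instances
$propertyEqual = $newWrapped.Hash -eq $oldWrapped.Hash    # True: compares the strings

# Unwrapped: -ExpandProperty returns the hash strings themselves.
$oldHash = Get-FileHash $oldFile | Select-Object -ExpandProperty Hash
$newHash = Get-FileHash $newFile | Select-Object -ExpandProperty Hash
$stringEqual = $newHash -eq $oldHash                      # True

Remove-Item $oldFile, $newFile
```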
As for why the Write-Host output matched:
When you force objects to be converted to string representations - essentially, Write-Host calls .ToString() on its arguments - the string representations of distinct [pscustomobject] instances that have the same properties and values will be the same:
PS> "$([pscustomobject] @{ foo = 1 })" -eq "$([pscustomobject] @{ foo = 1 })"
True # Same as: '@{foo=1}' -eq '@{foo=1}'
However, you should not rely on these hashtable-like string representations to determine equality of [pscustomobject]s as a whole, because of the inherent limitations of these representations, which can easily yield false positives.
This answer shows how to compare [pscustomobject] instances as a whole by passing all of their property names to Compare-Object -Property, which compares all property values. Note that this works only if every property value is a string or an instance of a .NET value type; otherwise, corresponding properties must either reference the very same instance of a .NET reference type or be of a type that implements custom equality-comparison logic.
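A rough sketch of that whole-object comparison follows. The helper name Test-SameProperties and the sample objects are hypothetical; the sketch assumes both objects have the same set of properties and that all values are strings or value types, as described above.

```powershell
$a = [pscustomobject]@{ Name = 'srv1'; Port = 443 }
$b = [pscustomobject]@{ Name = 'srv1'; Port = 443 }
$c = [pscustomobject]@{ Name = 'srv1'; Port = 8443 }

# Hypothetical helper: compares two objects on every property of the left one.
function Test-SameProperties {
    param($Left, $Right)
    $names = $Left.psobject.Properties.Name
    # Compare-Object emits nothing when all listed property values match.
    -not (Compare-Object $Left $Right -Property $names)
}

Test-SameProperties $a $b   # True
Test-SameProperties $a $c   # False
```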

Powershell sort two fields and get latest from CSV

I am trying to find a way to sort a CSV by two fields and retrieve only the latest item.
CSV fields: time, computer, type, domain.
Item that works is below but is slow due to scale of CSV and I feel like there is a better way.
$sorted = $csv | Group-Object {$_.computer} | ForEach {$_.Group | Sort-Object Time -Descending | Select-Object -First 1}
As Lee_Dailey suggests, you'll probably have better luck with a hashtable instead; Group-Object (unless used with the -NoElement parameter) is fairly slow and memory-hungry.
The fastest way off the top of my head would be something like this:
# use the call operator & with a script block instead of ForEach-Object to avoid overhead from pipeline parameter binding
$csv | & {
    begin {
        # create a hashtable to hold the newest object per computer
        $newest = @{}
    }
    process {
        # test if the object in the pipeline is newer than the one we have
        if (-not $newest.ContainsKey($_.Computer) -or $newest[$_.Computer].Time -lt $_.Time) {
            # update our hashtable with the newest object
            $newest[$_.Computer] = $_
        }
    }
    end {
        # return the newest-per-computer object
        $newest.Values
    }
}
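One caveat worth illustrating: Import-Csv returns every field as a string, so -lt on the Time column is a text comparison. That gives the right answer for sortable formats such as yyyy-MM-dd HH:mm, but not necessarily for others. The sketch below is a variation that casts to [datetime] before comparing; the sample rows are made up, and it assumes the time column can be parsed by a [datetime] cast.

```powershell
# Made-up sample data using the column names from the question.
$csv = @'
time,computer,type,domain
2021-03-01 10:00,PC1,logon,corp
2021-03-05 09:30,PC2,logon,corp
2021-03-04 18:15,PC1,logoff,corp
'@ | ConvertFrom-Csv

$newest = @{}
foreach ($row in $csv) {
    $key  = $row.computer
    $time = [datetime]$row.time    # compare as dates, not as text
    if (-not $newest.ContainsKey($key) -or [datetime]$newest[$key].time -lt $time) {
        $newest[$key] = $row
    }
}

# One row per computer, each being that computer's latest entry.
$latest = @($newest.Values | Sort-Object computer)
```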

In Powershell, is there a better way to store/find data in an n-dimensional array than a custom object

I find myself continually faced with the need to store mixed-type data in some kind of a structure for later lookup.
For a recent example, I am performing data migration and I will store the old UUID, new UUID, source environment, target environment, and schema for an unknown number of entries.
I have been meeting this need by creating an array and inserting System.Objects with NoteProperty members for each of the columns of data.
This strikes me as a very clumsy approach but I feel like I may be limited by Powershell's functionality. If I need to, for example, locate all entries that used a particular schema, I write a foreach loop that sticks each entry with a matching schema name in a whole new array that I can return. I would really like the ability to more easily search for all objects that contain a member matching a particular value, modify existing members, etc.
Is there a better built-in data structure that will suit my needs, or is creating a custom object the right thing to do?
For reference, I'm doing something like this to create my structure:
$objectArray = @();
foreach(thing to process){
$tempObj = New-Object System.Object;
$tempObj | Add-Member -MemberType NoteProperty -Name "membername" -Value xxxxx
....repeat for each member...
$objectArray += $tempObj
}
If I need to find something in it, I then have to:
$matchingObjs = @()
foreach ($obj in $objectArray){
if($obj.thing -eq value){$matchingObjs += $obj}
}
This really sucks and I know there has to be a more elegant way. I'm still fairly new to powershell so I don't know what utilities it has to help me. I'm using v5.
With PowerShell 3.0 or later you can use [PSCustomObject]; here's an article on the different object creation methods.
Also setting the array equal to the output of the foreach loop will be more efficient than repeatedly recreating an array with +=.
$objectArray = foreach ($item in $collection) {
    [pscustomobject]@{
        "membername" = "xxxxx"
    }
}
The Where-Object cmdlet or the .where() method looks like what you need in your second loop.
$matchingObjs = $objectArray | Where-Object {$_.thing -eq "value"}
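As a rough illustration of both forms, the sketch below filters a small made-up set of migration entries (the OldId, NewId and Schema property names are assumptions based on the question) and then modifies the matches in place. The .Where() method is available from PowerShell 4.0 onward.

```powershell
# Made-up sample entries.
$objectArray = @(
    [pscustomobject]@{ OldId = 'a1'; NewId = 'b1'; Schema = 'sales' }
    [pscustomobject]@{ OldId = 'a2'; NewId = 'b2'; Schema = 'hr' }
    [pscustomobject]@{ OldId = 'a3'; NewId = 'b3'; Schema = 'sales' }
)

# Pipeline form.
$viaCmdlet = @($objectArray | Where-Object { $_.Schema -eq 'sales' })

# Method form; it avoids the pipeline and is often quicker on in-memory arrays.
$viaMethod = $objectArray.Where({ $_.Schema -eq 'sales' })

# The results reference the original objects, so edits show up in $objectArray too.
foreach ($match in $viaMethod) { $match.Schema = 'sales_v2' }
$objectArray[0].Schema    # sales_v2
```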
It also sounds like you could use Where-Object/.where() to filter the initial data and just create an object which matches what you are looking for. For example:
$matchingObjs = $InputData |
    Where-Object {$_.thing -eq "value"} |
    ForEach-Object {
        [pscustomobject]@{
            "membername" = xxxxx
        }
    }
If your data can be expressed as key-value pairs, then a hashtable will be the most efficient; see about_Hash_Tables for more info.
There is no built-in way to do what you are asking. One way is to segment your data into separate hashtables so you can do easy lookups by a common key, say the ID.
# Create a hashtable for the IDs
$ids = @{}
foreach(thing to process){
    $ids.Add($uid, 'Value')
}
# Check whether $uid exists
$keyExists = $ids.ContainsKey($uid)
# Get the value stored for $uid
$keyValue = $ids[$uid]
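One way to sketch the "separate hashtables per lookup key" idea, using made-up entries and assumed property names (OldId, NewId, Schema): one hashtable maps each unique ID to its entry, and a second one, built with Group-Object -AsHashTable, maps each schema name to all entries that use it.

```powershell
# Made-up sample entries.
$entries = @(
    [pscustomobject]@{ OldId = 'a1'; NewId = 'b1'; Schema = 'sales' }
    [pscustomobject]@{ OldId = 'a2'; NewId = 'b2'; Schema = 'hr' }
    [pscustomobject]@{ OldId = 'a3'; NewId = 'b3'; Schema = 'sales' }
)

# Unique key: old ID -> the single matching entry.
$byOldId = @{}
foreach ($entry in $entries) { $byOldId[$entry.OldId] = $entry }

# Non-unique key: schema name -> all entries with that schema.
$bySchema = $entries | Group-Object -Property Schema -AsHashTable -AsString

$byOldId['a2'].NewId        # b2
$bySchema['sales'].Count    # 2
```

Both hashtables hold references to the same objects, so a change made through one lookup is visible through the other.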
As a side note, you don't have to create a System.Object; you can simply do this:
$objectArray = @()
gci | % {
    $objectArray += @{
        'Key1' = 'Value 1'
        'Key2' = 'Value 2'
    }
}
If you need to compare complex objects, you can build them with @{} and then use Compare-Object on the two objects, just another idea.
For example, this will get a file listing of two different directories, and tell me what file exists or doesn't exist between the two directories:
# -replace takes a regex, so escape the dot and anchor the extension to the end of the name
$packages = (gci $boxStarterRepo -Recurse *.nuspec | Select-Object -ExpandProperty Name) -replace '\.nuspec$', ''
$packages += (gci $boxStarterPrivateRepo -Recurse *.nuspec | Select-Object -ExpandProperty Name) -replace '\.nuspec$', ''
$packages = $packages | Sort-Object
Compare-Object $packages $done

Why is this not working? - Trying to save properties in a variable for use multiple times in a function

I am trying to find a way to save the properties for a select statement in PowerShell but it isn't working. I haven't found a way to make an entire statement a literal so that it isn't reviewed until the variable is opened.
Here is what works:
$wsus.GetSummariesPercomputerTarget($CurrentMonthUpdateScope, $ComputerScope) |
Select-Object @{L="WSUSServer";E={$Server}},
@{L="FromDate";E={$($CurrentMonthUpdateScope.FromCreationDate).ToString("MM/dd/yyyy")}},
@{L="ToDate";E={$($CurrentMonthUpdateScope.ToCreationDate).ToString("MM/dd/yyyy")}},
@{L='Computer';E={($wsus.GetComputerTarget([guid]$_.ComputerTargetID)).FullDomainName}},
DownloadedCount,
NotInstalledCount,
InstalledPendingRebootCount,
FailedCount,
Installedcount |
Sort-Object -Property "Computer"
and I am trying to get the properties mentioned (starting just after the Select-Object statement and ending just before the last pipe) placed in a variable so that I can use the same properties multiple times with different scopes.
I have tried this:
$Properties = '@{L="WSUSServer";E={$Server}},
@{L="FromDate";E={$($CurrentMonthUpdateScope.FromCreationDate).ToString("MM/dd/yyyy")}},
@{L="ToDate";E={$($CurrentMonthUpdateScope.ToCreationDate).ToString("MM/dd/yyyy")}},
@{L="Computer";E={($wsus.GetComputerTarget([guid]$_.ComputerTargetID)).FullDomainName}},
DownloadedCount,
NotInstalledCount,
InstalledPendingRebootCount,
FailedCount,
Installedcount'
$wsus.GetSummariesPercomputerTarget($CurrentMonthUpdateScope, $ComputerScope) |
Select-Object $Properties |
Sort-Object -Property "Computer"
While this runs it doesn't give any data and I think it confuses PowerShell.
This gives the same response:
$Properties = "@{L=`"WSUSServer`";E={$Server}},
@{L=`"FromDate`";E={$($CurrentMonthUpdateScope.FromCreationDate).ToString(`"MM/dd/yyyy`")}},
@{L=`"ToDate`";E={$($CurrentMonthUpdateScope.ToCreationDate).ToString(`"MM/dd/yyyy`")}},
@{L=`"Computer`";E={($wsus.GetComputerTarget([guid]$_.ComputerTargetID)).FullDomainName}},
DownloadedCount,
NotInstalledCount,
InstalledPendingRebootCount,
FailedCount,
Installedcount"
Any options, thoughts, etc.?
The -Property argument of Select-Object expects an array, not a string. So something like this:
$Properties = @(@{L="WSUSServer";E={$Server}},
@{L="FromDate";E={$($CurrentMonthUpdateScope.FromCreationDate).ToString("MM/dd/yyyy")}},
@{L="ToDate";E={$($CurrentMonthUpdateScope.ToCreationDate).ToString("MM/dd/yyyy")}},
@{L="Computer";E={($wsus.GetComputerTarget([guid]$_.ComputerTargetID)).FullDomainName}},
"DownloadedCount",
"NotInstalledCount",
"InstalledPendingRebootCount",
"FailedCount",
"Installedcount")
Note, you will need to turn the simple property names into strings within your array.
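To show the array being reused, here is a reduced sketch that needs no WSUS server. The input objects, the 'wsus01' value and the shortened property list are made up for illustration. The script blocks inside the array are not evaluated when the array is built; they run when Select-Object processes each object, which is what makes the stored list reusable.

```powershell
$Server = 'wsus01'    # hypothetical server name

# Calculated properties are hashtables; plain property names are strings.
$Properties = @(
    @{ L = 'WSUSServer'; E = { $Server } },
    @{ L = 'Computer';   E = { $_.Name.ToUpper() } },
    'FailedCount'
)

# Made-up stand-ins for the summary objects returned by the WSUS API.
$summaries = @(
    [pscustomobject]@{ Name = 'pc2'; FailedCount = 1; InstalledCount = 7 }
    [pscustomobject]@{ Name = 'pc1'; FailedCount = 0; InstalledCount = 9 }
)

# The same $Properties array can be passed to any number of Select-Object calls.
$report = @($summaries | Select-Object $Properties | Sort-Object -Property Computer)
```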

Index into powershell Import-Csv row like an array

I am importing data from various csv files, usually with 4 or 5 fields.
e.g. one might look like:
id, name, surname, age
1,tom,smith,32
2,fred,bloggs,50
I have managed to grab the header row titles into an array that looks like:
id, name, surname, age
The first data row looks like:
@{ID=1; name=tom; surname=smith; age=32}
Say I assign it to $myRow.
What I want to be able to do is access the ID, name, etc. fields in $myRow by index (0, 1, 2, etc.), not by property name.
Is this possible?
Thanks
You can do something like this, but it may be slow for large sets of rows and/or properties:
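$users = Import-Csv myusers.csv |
    ForEach-Object {
        $i = 0
        foreach ($property in $_.psobject.Properties.Name) {
            # add an alias named "0", "1", ... that points at the real column
            $_ | Add-Member -MemberType AliasProperty -Name $i -Value $property
            $i++
        }
        # emit each row once, after all of its aliases have been added
        $_
    }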
That just adds an alias property for each property name in the object, so the first column is available as $row.'0', the second as $row.'1', and so on.
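A lighter-weight alternative, sketched here with the sample data from the question, is to read the row's values through psobject.Properties, which lists the properties in column order. This does not modify the row objects at all; it assumes PowerShell 3.0 or later, where .Value and .Name can be read across the whole property collection.

```powershell
# Sample data from the question.
$rows = @'
id,name,surname,age
1,tom,smith,32
2,fred,bloggs,50
'@ | ConvertFrom-Csv

$myRow  = $rows[0]
$values = @($myRow.psobject.Properties.Value)    # field values in column order
$names  = @($myRow.psobject.Properties.Name)     # matching column names

$values[1]    # tom
$names[3]     # age
$values[3]    # 32
```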
When I wanted to do something similar, I went about it differently.
I used Import-Csv to get the contents into a table. Then I stepped through the table, row by row, and used an inner loop to retrieve the field values, one by one into variables with the same name as the column name.
This created a context where I could apply the values to variables embedded in some kind of template. Here is an edited version of the code.
foreach ($item in $list) {
$item | Get-Member -membertype properties | foreach {
Set-variable -name $_.name -value $item.$($_.name)
}
Invoke-expression($template) >> Outputfile.txt
}
I'm writing the expanded templates to an output file, but you get the idea. This ends up working more or less the way mail merge applies a mailing list to a form letter.
I wouldn't use this approach for more than a few hundred rows and a dozen columns. It gets slow.
Explanation:
The inner loop needs more explanation. $list is a table that contains
the imported image of a csv file. $item is one row from this table.
Get-Member gets each field (called a property) from that row. Each
field has a name and a value. $_.name delivers the name of the
current field. $item.($_.name) delivers the value. Set-Variable
creates a variable. It's very inefficient to create the same
variables over and over again for each row in the table, but I don't
care.
This snippet was clipped from a larger snippet that imports a list and a template, produces an expansion of the template for each item in the list, and outputs the series of expansions into a text file. I didn't include the whole snippet because it's too far afield from the question that was asked.
You can actually index your array of rows, e.g. ($MyRow[0]).age, to get the age of the first row (indexes are zero-based, so [1] is the second row).