PowerShell: list CSV file rows where at least one value between the 3rd and last column is equal to "0" or "1" - powershell

In my PowerShell script, I'm working with a CSV file that looks like this (with a number of rows and columns that can vary, but there will always be at least the headers and the first 2 columns):
OS;IP;user0;user1;user3
Windows;10.0.0.1;;;
Linux;hostname2;0;;1
Linux;10.0.0.3;;0;0
Linux;hostname4;;;
Windows;hostname5;1;1;1
I basically list servers in the first column and users in the first row (CSV header). This represents a user "access granting" matrix to servers (1 for "give access", 0 for "remove access", and void for "don't change").
I'm looking for a way to extract only the rows that include a value equal to "1" or "0" between (and including) the 3rd and last column. (= to eventually get the list of servers where access rights should be changed)
So taking the above example, I only want the following lines returned:
Linux;hostname2;0;;1
Linux;10.0.0.3;;0;0
Windows;hostname5;1;1;1
Any hints to make this possible? Or the opposite (getting the ones without any 0 or 1)?
Even if it means using "Get-Content" instead of "Import-CSV". I don't care about the 1st (headers) row; I know how to exclude that.
Thank you!
--- Final solution, thanks to @Tomalak's answer:
$AccessMatrix = Import-CSV $CSVfile -delimiter ';'
$columns = $AccessMatrix | Get-Member -MemberType NoteProperty | Select-Object -Skip 2 -ExpandProperty Name
$AccessMatrix = $AccessMatrix | ForEach-Object {
$row = $_
foreach ($col in $columns) {
if ($row.$col.trim() -eq "1" -OR $row.$col.trim() -eq "0") {
$row # this pushes the $row onto the pipeline
break
}
}
}

The following uses Get-Member to select the names of all columns after the first two.
Then, using ForEach-Object, we can output only those rows that have a value in any of those columns.
$data = ConvertFrom-Csv "OS;IP;user0;user1;user3
Windows;10.0.0.1;;;
Linux;hostname2;0;;1
Linux;10.0.0.3;;0;0
Linux;hostname4;;;
Windows;hostname5;1;1;1" -Delimiter ";"
$columns = $data | Get-Member -MemberType NoteProperty | Select-Object -Skip 2 -ExpandProperty Name
$data | ForEach-Object {
$row = $_
foreach ($col in $columns) {
if ($row.$col -ne "") {
$row # this pushes the $row onto the pipeline
break
}
}
}
The break statement stops the execution of the inner foreach loop because there is no point in further checking as soon as the first column with any value is found.
This is equivalent to the above, if you prefer Where-Object:
$data | Where-Object {
$row = $_
foreach ($col in $columns) {
if ($row.$col -ne "") {
return $true
}
}
}
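As a hedged variation on the snippets above: Get-Member returns member names sorted alphabetically, so skipping the first two names only matches the first two file columns when the headers happen to sort that way (as OS and IP do here). The sketch below instead reads the column order from PSObject.Properties, and it tests for the literal values "0" and "1" that the question asked about rather than for any non-empty value. The variable names ($lines, $changed) are illustrative only.

```powershell
# Sample data from the question, split into lines and parsed with ';' as delimiter.
$lines = @'
OS;IP;user0;user1;user3
Windows;10.0.0.1;;;
Linux;hostname2;0;;1
Linux;10.0.0.3;;0;0
Linux;hostname4;;;
Windows;hostname5;1;1;1
'@ -split '\r?\n'
$data = $lines | ConvertFrom-Csv -Delimiter ';'

# Column names in file order, skipping the first two (OS and IP).
$columns = $data[0].PSObject.Properties.Name | Select-Object -Skip 2

$changed = $data | Where-Object {
    $row = $_
    foreach ($col in $columns) {
        # Keep the row as soon as one user column holds exactly "0" or "1".
        if ("$($row.$col)".Trim() -in '0', '1') { return $true }
    }
    return $false
}

# Servers whose access rights should change.
$changed.IP
```

With the sample data this should list hostname2, 10.0.0.3 and hostname5. Inverting the test (-notin, or swapping the return values) gives the rows without any 0 or 1.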

Related

PowerShell: Find unique values from multiple CSV files

Let's say that I have several CSV files and I need to check a specific column and find values that exist in one file, but not in any of the others. I'm having a bit of trouble coming up with the best way to go about it, as I wanted to use Compare-Object and possibly keep all columns and not just the one that contains the values I'm checking.
So I do indeed have several CSV files and they all have a Service Code column, and I'm trying to create a list for each Service Code that only appears in one file. So I would have "Service Codes only in CSV1", "Service Codes only in CSV2", etc.
Based on some testing and a semi-related question, I've come up with a workable solution, but with all of the nesting and For loops, I'm wondering if there is a more elegant method out there.
Here's what I do have:
$files = Get-ChildItem -LiteralPath "C:\temp\ItemCompare" -Include "*.csv"
$HashList = [System.Collections.Generic.List[System.Collections.Generic.HashSet[String]]]::New()
For ($i = 0; $i -lt $files.Count; $i++){
$TempHashSet = [System.Collections.Generic.HashSet[String]]::New([String[]](Import-Csv $files[$i])."Service Code")
$HashList.Add($TempHashSet)
}
$FinalHashList = [System.Collections.Generic.List[System.Collections.Generic.HashSet[String]]]::New()
For ($i = 0; $i -lt $HashList.Count; $i++){
$UniqueHS = [System.Collections.Generic.HashSet[String]]::New($HashList[$i])
For ($j = 0; $j -lt $HashList.Count; $j++){
#Skip the check when the HashSet would be compared to itself
If ($j -eq $i){Continue}
$UniqueHS.ExceptWith($HashList[$j])
}
$FinalHashList.Add($UniqueHS)
}
It seems a bit messy to me using so many different .NET references, and I know I could make it cleaner with a tag to say using namespace System.Collections.Generic, but I'm wondering if there is a way to make it work using Compare-Object which was my first attempt, or even just a simpler/more efficient method to filter each file.
I believe I found an "elegant" solution based on Group-Object, using only a single pipeline:
# Import all CSV files.
Get-ChildItem $PSScriptRoot\csv\*.csv -File -PipelineVariable file | Import-Csv |
# Add new column "FileName" to distinguish the files.
Select-Object *, @{ label = 'FileName'; expression = { $file.Name } } |
# Group by ServiceCode to get a list of files per distinct value.
Group-Object ServiceCode |
# Filter by ServiceCode values that exist only in a single file.
# Sort-Object -Unique takes care of possible duplicates within a single file.
Where-Object { ( $_.Group.FileName | Sort-Object -Unique ).Count -eq 1 } |
# Expand the groups so we get the original object structure back.
ForEach-Object Group |
# Format-Table requires sorting by FileName, for -GroupBy.
Sort-Object FileName |
# Finally pretty-print the result.
Format-Table -Property ServiceCode, Foo -GroupBy FileName
Test Input
a.csv:
ServiceCode,Foo
1,fop
2,fip
3,fap
b.csv:
ServiceCode,Foo
6,bar
6,baz
3,bam
2,bir
4,biz
c.csv:
ServiceCode,Foo
2,bla
5,blu
1,bli
Output
FileName: b.csv
ServiceCode Foo
----------- ---
4 biz
6 bar
6 baz
FileName: c.csv
ServiceCode Foo
----------- ---
5 blu
Looks correct to me. The values 1, 2 and 3 are duplicated between multiple files, so they are excluded. 4, 5 and 6 exist only in single files, while 6 is a duplicate value only within a single file.
Understanding the code
Maybe it is easier to understand how this code works, by looking at the intermediate output of the pipeline produced by the Group-Object line:
Count Name Group
----- ---- -----
2 1 {@{ServiceCode=1; Foo=fop; FileName=a.csv}, @{ServiceCode=1; Foo=bli; FileName=c.csv}}
3 2 {@{ServiceCode=2; Foo=fip; FileName=a.csv}, @{ServiceCode=2; Foo=bir; FileName=b.csv}, @{ServiceCode=2; Foo=bla; FileName=c.csv}}
2 3 {@{ServiceCode=3; Foo=fap; FileName=a.csv}, @{ServiceCode=3; Foo=bam; FileName=b.csv}}
1 4 {@{ServiceCode=4; Foo=biz; FileName=b.csv}}
1 5 {@{ServiceCode=5; Foo=blu; FileName=c.csv}}
2 6 {@{ServiceCode=6; Foo=bar; FileName=b.csv}, @{ServiceCode=6; Foo=baz; FileName=b.csv}}
Here the Name contains the unique ServiceCode values, while Group "links" the data to the files.
From here it should already be clear how to find values that exist only in single files. If duplicate ServiceCode values within a single file wouldn't be allowed, we could even simplify the filter to Where-Object Count -eq 1. Since it was stated that dupes within single files may exist, we need the Sort-Object -Unique to count multiple equal file names within a group as only one.
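To make the difference between the two filters concrete, here is a minimal sketch that runs both on the test input, held in memory instead of read from disk. The $files table and the names $strict and $tolerant are stand-ins introduced for illustration; they are not part of the pipeline above.

```powershell
# In-memory stand-ins for a.csv, b.csv and c.csv from the test input.
$files = [ordered]@{
    'a.csv' = 'ServiceCode,Foo', '1,fop', '2,fip', '3,fap'
    'b.csv' = 'ServiceCode,Foo', '6,bar', '6,baz', '3,bam', '2,bir', '4,biz'
    'c.csv' = 'ServiceCode,Foo', '2,bla', '5,blu', '1,bli'
}

# Tag every row with the file it came from.
$rows = foreach ($name in $files.Keys) {
    $files[$name] | ConvertFrom-Csv |
        Select-Object *, @{ label = 'FileName'; expression = { $name } }
}

$groups = $rows | Group-Object ServiceCode

# Strict filter: the code occurs exactly once overall.
# This drops 6, because it is repeated inside b.csv.
$strict = $groups | Where-Object Count -eq 1

# Tolerant filter: the code occurs in exactly one file, however often.
$tolerant = $groups | Where-Object {
    @($_.Group.FileName | Sort-Object -Unique).Count -eq 1
}

$strict.Name
$tolerant.Name
```

On this input the strict filter should keep the codes 4 and 5, while the tolerant one keeps 4, 5 and 6.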
It is not completely clear what you expect as an output.
If this is just the ServiceCodes that intersect then this is actually a duplicate with:
Comparing two arrays & get the values which are not common
Union and Intersection in PowerShell?
But taking that you actually want the related object and files, you might use this approach:
$HashTable = @{}
ForEach ($File in Get-ChildItem .\*.csv) {
ForEach ($Object in (Import-Csv $File)) {
$HashTable[$Object.ServiceCode] = $Object |Select-Object *,
@{ n='File'; e={ $File.Name } },
@{ n='Count'; e={ $HashTable[$Object.ServiceCode].Count + 1 } }
}
}
$HashTable.Values |Where-Object Count -eq 1
Here is my take on this fun exercise, I'm using a similar approach as yours with the HashSet but adding [System.StringComparer]::OrdinalIgnoreCase to leverage the .Contains(..) method:
using namespace System.Collections.Generic
# Generate Random CSVs:
$charset = 'abABcdCD0123xXyYzZ'
$ran = [random]::new()
$csvs = @{}
foreach($i in 1..50) # Create 50 CSVs for testing
{
$csvs["csv$i"] = foreach($z in 1..50) # With 50 Rows
{
$index = (0..2).ForEach({ $ran.Next($charset.Length) })
[pscustomobject]@{
ServiceCode = [string]::new($charset[$index])
Data = $ran.Next()
}
}
}
# Get Unique 'ServiceCode' per CSV:
$result = @{}
foreach($key in $csvs.Keys)
{
# Get all unique `ServiceCode` from the other CSVs
$tempHash = [HashSet[string]]::new(
[string[]]($csvs[$csvs.Keys -ne $key].ServiceCode),
[System.StringComparer]::OrdinalIgnoreCase
)
# Filter the unique `ServiceCode`
$result[$key] = foreach($line in $csvs[$key])
{
if(-not $tempHash.Contains($line.ServiceCode))
{
$line
}
}
}
# Test if the code worked,
# If something is returned from here means it didn't work
foreach($key in $result.Keys)
{
$tmp = $result[$result.Keys -ne $key].ServiceCode
foreach($val in $result[$key])
{
if($val.ServiceCode -in $tmp)
{
$val
}
}
}
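The core of this approach, stripped of the random test data, can be sketched on a handful of hard-coded values. The sample codes below are made up for illustration; the point is only that a HashSet built with OrdinalIgnoreCase treats 'ab1' and 'AB1' as the same entry when .Contains(..) is called.

```powershell
# Codes seen in the *other* files (made-up sample values).
$otherCodes = [System.Collections.Generic.HashSet[string]]::new(
    [string[]]('ab1', 'CD2', 'xY3'),
    [System.StringComparer]::OrdinalIgnoreCase
)

# Codes from the file currently being checked (also made up).
$candidates = 'AB1', 'cd2', 'zz9'

# Keep only the codes that do not occur in any other file, ignoring case.
$onlyHere = $candidates | Where-Object { -not $otherCodes.Contains($_) }
$onlyHere
```

Here only zz9 should remain, since AB1 and cd2 match existing entries once case is ignored.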
I was able to get the unique items as follows:
# Get all items of CSVs in a single variable with adding the file name at the last column
$CSVs = Get-ChildItem "C:\temp\ItemCompare\*.csv" | ForEach-Object {
$CSV = Import-CSV -Path $_.FullName
$FileName = $_.Name
$CSV | Select-Object *,@{N='Filename';E={$FileName}}
}
Foreach($line in $CSVs){
$ServiceCode = $line.ServiceCode
$file = $line.Filename
if (!($CSVs | where {$_.ServiceCode -eq $ServiceCode -and $_.filename -ne $file})){
$line
}
}

Compare 2 .csv files

I have two .csv files containing a lot of information. If a row ends with an "M", I have to check whether that row is in the other file. If it is there, I have to check whether the code at the beginning of the row is the same: if not, I do nothing, but if it is the same, I have to create a new file.
This is the information I have to look for in the other file:
You can see that the information is here:
I also have rows with a "B" at the end but this is unimportant:
Now, when the information is here, I have to export all rows that are same in both files.
I have to export to a new file the rows that have the same code at the beginning, which is circled in red:
I have tried different solutions that I found on the Internet, but nothing really works.
Perhaps something like this?
$datenbank = Import-Csv "C:\Users\information1.csv"
$zentral = Import-Csv "C:\Users\information2.csv"
$new = ""
foreach ($line in $datenbank) {
$Spalte = $line.Split(",")
foreach ($z in $Zentral) {
$found = $false
foreach ($d in $Datenbanktyp) {
if ($d.$Spalte[1] -eq $z.$Spalte[1]) {
$found = $true
}
}
if ($found -eq $true) {
$new += $z
}
}
}
Or can it work with a if..elseif..else loop?
Let's see if I got this right. You have one file where the second-last column contains a letter. If that letter is "M" you want to check if the value of the column before that (partially) matches a column from a second file. If it does, you then want to export all rows from the second file that have the same value in the first column as the matched row to a new file.
Since you didn't reveal the column names I'm going to dub the third- and second-last columns from the first file "Erin" and "Marty", the match column from the second file "Pat", and the first column from the second file "Gene".
$datenbank | Where-Object {
$_.Marty -ceq 'M'
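} | Select-Object -Expand Erin -Unique | ForEach-Object {
$value = $_ # capture the outer value; $_ is rebound inside the nested Where-Object blocks
$outfile = "export_${value}.csv" # adjust output filename as you see fit
$firstcol = $zentral |
Where-Object { $_.Pat -like "*${value}*" } |
Select-Object -Expand Gene
$zentral | Where-Object {
$firstcol -contains $_.Gene # $firstcol may hold several values
} | Export-Csv $outfile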
}
Another approach would be to group your second file by the first column and then check if the groups contain a matching value.
$groups = $zentral | Group-Object Gene
$datenbank | Where-Object {
$_.Marty -ceq 'M'
} | Select-Object -Expand Erin -Unique | ForEach-Object {
$value = $_ # capture the outer value; $_ is rebound inside the nested Where-Object block
$outfile = "export_${value}.csv" # adjust output filename as you see fit
$groups | Where-Object {
$_.Group.Pat -like "*${value}*"
} | Select-Object -Expand Group | Export-Csv $outfile
}
Replace "Erin", "Marty", "Pat" and "Gene" with the actual column titles from your CSV files. Should your files not contain column titles you need to specify them via the -Header parameter of Import-Csv, otherwise the cmdlet will interpret the first data row as the headers.
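As a small illustration of the -Header point, the sketch below uses ConvertFrom-Csv (which takes the same -Header parameter as Import-Csv) on two made-up, headerless lines. The column names are the placeholders from above and the data values are invented for the example.

```powershell
# Two headerless sample lines (invented values).
$raw = 'A1,foo,M', 'B2,bar,B'

# Without -Header the first line would be consumed as column titles.
$withoutHeader = $raw | ConvertFrom-Csv

# With -Header both lines are treated as data rows.
$rows = $raw | ConvertFrom-Csv -Header 'Gene', 'Erin', 'Marty'

# Case-sensitive filter on the letter column, as in the answer above.
$marked = $rows | Where-Object { $_.Marty -ceq 'M' }
$marked
```

With -Header, both lines survive as data and the filter should return the single row whose Marty column is "M". Without it, only one data row is left, with properties named A1, foo and M.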

PowerShell : compare 2 excel files or 2 sheets

I have to write a script that compares 2 Excel files or sheets and, if one of the cells isn't the same, tells me which one it is. I don't know how to do this. I looked at another question about a similar situation but couldn't get it to work. Can you help me?
My files are test1.csv and test2.csv.
Try this.
$file1 = Import-Csv test1.csv
$file2 = Import-Csv test2.csv
Compare-Object $file1 $file2 -property "HeaderProperty" -IncludeEqual
@Vivek Kumar: Be careful, Compare-Object has a -SyncWindow parameter with a default value, which can cause only part of the results to be returned.
A very good explanation here: http://community.idera.com/powershell/powershell_com_featured_blogs/b/tobias/posts/tipps-amp-tricks-using-compare-object
One way to bypass this "problem" is to set -SyncWindow to half the length of the -ReferenceObject:
$file1 = Import-Csv test1.csv
$file2 = Import-Csv test2.csv
Compare-Object -ReferenceObject $file1 -DifferenceObject $file2 -SyncWindow ($file1.length / 2)
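For readers without test files at hand, the following sketch shows what Compare-Object reports when rows are compared on their column values. The two tables and the column names Name and Value are invented for the example, and the data is built with ConvertFrom-Csv rather than read from test1.csv and test2.csv.

```powershell
# Two small in-memory tables standing in for test1.csv and test2.csv.
$file1 = 'Name,Value', 'a,1', 'b,2', 'c,3' | ConvertFrom-Csv
$file2 = 'Name,Value', 'a,1', 'b,20', 'd,4' | ConvertFrom-Csv

# Compare on both columns; rows count as equal only if every listed property matches.
$diff = Compare-Object -ReferenceObject $file1 -DifferenceObject $file2 -Property Name, Value

# '<=' marks rows found only in $file1, '=>' rows found only in $file2.
$diff | Sort-Object Name, SideIndicator | Format-Table Name, Value, SideIndicator
```

The row a,1 is identical on both sides and is not reported. The changed row b shows up twice, once per side, which is how Compare-Object represents a modification.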
Since you mention the files are CSV, you can do all you need with standard PS functions.
However, if using Excel (XLSX/XLS) files, you may be interested in this library: https://github.com/RamblingCookieMonster/PSExcel. Just switch Import-CSV for Import-XLSX.
Below's a very basic example of how this could be done.
Code
function Report-OffendingCell { #NB: doesn't follow good naming conventions
[CmdletBinding()]
param (
[Parameter(Mandatory=$true, ValueFromPipeline = $true)]
[long]$ColumnIndex
,
[Parameter(Mandatory=$true)]
[long]$RowIndex
,
[Parameter(Mandatory=$true)]
[string]$SheetName
,
[Parameter(Mandatory=$false)]
[string]$Explanation
)
process {
#If you want column letters instead of numbers, use something like Convert-NumberToA1 from https://gallery.technet.microsoft.com/office/Powershell-function-that-88f9f690
#"[{0}]!{1}{2}" -f $SheetName, (Convert-NumberToA1 $ColumnIndex + 1), ($RowIndex + 1)
#I've returned an object instead, since that's more useful for any further PS automation
(New-Object -TypeName PSObject -Property @{
ColumnNo = $ColumnIndex + 1
RowNo = $RowIndex + 1
SheetName = $SheetName
Explanation = $Explanation
})
}
}
function Compare-Tables {
[CmdletBinding()]
param (
[Parameter(Mandatory=$true)]
[PSObject[]]$Table1
,
[Parameter(Mandatory=$true)]
[PSObject[]]$Table2
,
[Parameter(Mandatory=$false)]
[string]$Table1Name = 'Table1'
,
[Parameter(Mandatory=$false)]
[string]$Table2Name = 'Table2'
)
begin {
[long]$t1Cols = ($Table1[0].PSObject.Properties | Measure-Object).Count - 1
[long]$t2Cols = ($Table2[0].PSObject.Properties | Measure-Object).Count - 1
[long]$t1Rows = $Table1.Count - 1
[long]$t2Rows = $Table2.Count - 1
[long]$minCols = [System.Math]::Min($t1Cols, $t2Cols)
[long]$maxCols = [System.Math]::Max($t1Cols, $t2Cols)
[long]$minRows = [System.Math]::Min($t1Rows, $t2Rows)
[long]$maxRows = [System.Math]::Max($t1Rows, $t2Rows)
[string]$offendingColTable = if ($maxCols -eq $t1Cols){$Table1Name}else{$Table2Name}
[string]$offendingRowTable = if ($maxRows -eq $t1Rows){$Table1Name}else{$Table2Name}
write-verbose $offendingColTable
write-verbose $offendingRowTable
write-verbose $maxCols
write-verbose $t1Cols
write-verbose $t2Cols
}
process {
0..$minRows | %{ #loop through each row which is populated in both sheets
[long]$row = $_
0..$minCols |
?{(@($Table1[$row].PSObject.Properties)[$_].Value) -ne (@($Table2[$row].PSObject.Properties)[$_].Value)} |
Report-OffendingCell -RowIndex $row -SheetName $Table2Name -Explanation 'Values differ between sheets!' #sheetname could be Table1 or Table2 here; since the cell exists in both sheets
($minCols + 1)..$maxCols | Report-OffendingCell -RowIndex $row -SheetName $offendingColTable -Explanation 'Entire Column only exists on one sheet!'
}
($minRows + 1)..$maxRows | %{ #for any rows which don't exist in one of the sheets, output that
[long]$row = $_
0..$maxCols | Report-OffendingCell -RowIndex $row -SheetName $offendingRowTable -Explanation 'Entire Row only exists on one sheet!'
}
}
}
$test1 = Import-CSV -Path '.\test1.csv'
$test2 = Import-CSV -Path '.\test2.csv'
Compare-Tables -Table1 $test1 -Table2 $test2 -Table1Name 'test1' -Table2Name 'test2' -Verbose | ft SheetName, ColumnNo, RowNo, Explanation
#just so I don't mess up your session with my mock
if((Get-Command Import-Csv).Source -ne 'Microsoft.PowerShell.Utility') {
Remove-Item 'function:Import-Csv'
}
Code for Testing
To provide the example output below, you can use the following code. This overwrites the Import-CSV function with a mocked version of that function which simply returns fixed value data. This code is not required for the real-world scenario; just for those who don't have suitable test CSV files who want something to experiment with.
#region 'Mocked Standard Functions'
#you don't need this function; this is just to make testing simple
function Import-CSV {
param($Path)
switch ($Path) {
'.\test1.csv' {
@(
@{
'Column A Heading'='Row 1 Cell 1';
'Column B Heading'='Row 1 Cell 2';
'Column C Heading'='Row 1 Cell 3';
'Column D Heading'='Row 1 Cell 4';
}
, @{
'Column A Heading'='Row 2 Cell 1';
'Column B Heading'='Row 2 Cell 2';
'Column C Heading'='Row 2 Cell 3';
'Column D Heading'='Row 2 Cell 4';
}
, @{
'Column A Heading'='Row 3 Cell 1';
'Column B Heading'='Row 3 Cell 2';
'Column C Heading'='Row 3 Cell 3';
'Column D Heading'='Row 3 Cell 4';
}
) | %{(New-Object -TypeName PSObject -Property $_)} | select 'Column A Heading', 'Column B Heading', 'Column C Heading', 'Column D Heading' #select needed to ensure columns are returned in the correct order
}
'.\test2.csv' {
@(
@{
'Column Heading 1'='Row 1 Cell 1';
'Column B Heading'='Row 1 Cell 2';
'Column C Heading'='Row 1 Cell 3 difference';
'Column D Heading'='Row 1 Cell 4';
}
, @{
'Column Heading 1'='Row 2 Cell 1';
'Column B Heading'='Row 2 Cell 2';
'Column C Heading'='Row 2 Cell 3';
'Column D Heading'='Row 2 Cell 4';
'Column E Heading'='Row 2 Cell 5 bonus ball!'; #note that though we've not defined on the previous "row", the import function assumes a table, so we'll still have a property on the previous row; only it'll be null
}
) | %{(New-Object -TypeName PSObject -Property $_)}| select 'Column Heading 1', 'Column B Heading', 'Column C Heading', 'Column D Heading', 'Column E Heading' #select needed to ensure columns are returned in the correct order
}
default {throw "no dummy data defined for $Path"}
}
}
#endregion 'Mocked Standard Functions'
Example Output
SheetName ColumnNo RowNo Explanation
--------- -------- ----- -----------
test2 3 1 Values differ between sheets!
test2 5 1 Entire Column only exists on one sheet!
test2 5 2 Entire Column only exists on one sheet!
test1 1 3 Entire Row only exists on one sheet!
test1 2 3 Entire Row only exists on one sheet!
test1 3 3 Entire Row only exists on one sheet!
test1 4 3 Entire Row only exists on one sheet!
test1 5 3 Entire Row only exists on one sheet!
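The heart of Compare-Tables is the positional, cell-by-cell comparison via PSObject.Properties. A stripped-down sketch of just that step, on two invented tables with identical shape, might look like this. The names $t1, $t2 and $cellDiffs are illustrative, and the handling of extra rows and columns from the full function is deliberately left out.

```powershell
# Two small tables with the same shape; one cell differs (row 2, column 2).
$t1 = 'A,B', '1,2', '3,4' | ConvertFrom-Csv
$t2 = 'A,B', '1,2', '3,5' | ConvertFrom-Csv

$rowCount = [Math]::Min(@($t1).Count, @($t2).Count)

$cellDiffs = for ($r = 0; $r -lt $rowCount; $r++) {
    # Properties are returned in column order, so they can be compared by index.
    $p1 = @($t1[$r].PSObject.Properties)
    $p2 = @($t2[$r].PSObject.Properties)
    for ($c = 0; $c -lt [Math]::Min($p1.Count, $p2.Count); $c++) {
        if ($p1[$c].Value -ne $p2[$c].Value) {
            # Report 1-based positions, like the function above.
            [pscustomobject]@{
                RowNo    = $r + 1
                ColumnNo = $c + 1
                Left     = $p1[$c].Value
                Right    = $p2[$c].Value
            }
        }
    }
}
$cellDiffs
```

Because the comparison is by position rather than by column name, a renamed header (as in the mocked test2 data) does not by itself count as a difference.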
Function Compare-WorkSheet {
<#
.Synopsis
Compares two worksheets with the same name in different files.
.Description
This command takes two file names, a worksheet name and a name for a key column.
It reads the worksheet from each file and decides the column names.
It builds as hashtable of the key column values and the rows they appear in
It then uses PowerShell's compare object command to compare the sheets (explicitly checking all column names which have not been excluded)
For the difference rows it adds the row number for the key of that row - we have to add the key after doing the comparison,
otherwise rows will be considered as different simply because they have different row numbers
We also add the name of the file in which the difference occurs.
If -BackgroundColor is specified the difference rows will be changed to that background.
.Example
Compare-WorkSheet -Referencefile 'Server56.xlsx' -Differencefile 'Server57.xlsx' -WorkSheetName Products -key IdentifyingNumber -ExcludeProperty Install* | format-table
The two workbooks in this example contain the result of redirecting a subset of properties from Get-WmiObject -Class win32_product to Export-Excel
The command compares the "products" pages in the two workbooks, but we don't want to register a difference if the software was installed on a
different date or from a different place, so Excluding Install* removes InstallDate and InstallSource.
This data doesn't have a "name" column" so we specify the "IdentifyingNumber" column as the key.
The results will be presented as a table.
.Example
compare-WorkSheet "Server54.xlsx" "Server55.xlsx" -WorkSheetName services -GridView
This time two workbooks contain the result of redirecting Get-WmiObject -Class win32_service to Export-Excel
Here the -Differencefile and -Referencefile parameter switches are assumed , and the default setting for -key ("Name") works for services
This will display the differences between the "services" sheets using a grid view
.Example
Compare-WorkSheet 'Server54.xlsx' 'Server55.xlsx' -WorkSheetName Services -BackgroundColor lightGreen
This version of the command outputs the differences between the "services" pages and also highlights any different rows in the spreadsheet files.
.Example
Compare-WorkSheet 'Server54.xlsx' 'Server55.xlsx' -WorkSheetName Services -BackgroundColor lightGreen -FontColor Red -Show
This builds on the previous example: this time Where two changed rows have the value in the "name" column (the default value for -key),
this version adds highlighting of the changed cells in red; and then opens the Excel file.
.Example
Compare-WorkSheet 'Pester-tests.xlsx' 'Pester-tests.xlsx' -WorkSheetName 'Server1','Server2' -Property "full Description","Executed","Result" -Key "full Description"
This time the reference file and the difference file are the same file and two different sheets are used. Because the tests include the
machine name and time the test was run the command specifies a limited set of columns should be used.
.Example
Compare-WorkSheet 'Server54.xlsx' 'Server55.xlsx' -WorkSheetName general -Startrow 2 -Headername Label,value -Key Label -GridView -ExcludeDifferent
The "General" page has a title and two unlabelled columns with a row for CPU, Memory, Domain, Disk and so on
So the command is instructed to starts at row 2 to skip the title and to name the columns: the first is "label" and the Second "Value";
the label acts as the key. This time we interested the rows which are the same in both sheets,
and the result is displayed using grid view. Note that grid view works best when the number of columns is small.
.Example
Compare-WorkSheet 'Server1.xlsx' 'Server2.xlsx' -WorkSheetName general -Startrow 2 -Headername Label,value -Key Label -BackgroundColor White -Show -AllDataBackgroundColor LightGray
This version of the previous command highlights all the cells in lightgray and then sets the changed rows back to white; only
the unchanged rows are highlighted
#>
[cmdletbinding(DefaultParameterSetName)]
Param(
#First file to compare
[parameter(Mandatory=$true,Position=0)]
$Referencefile ,
#Second file to compare
[parameter(Mandatory=$true,Position=1)]
$Differencefile ,
#Name(s) of worksheets to compare.
$WorkSheetName = "Sheet1",
#Properties to include in the DIFF - supports wildcards, default is "*"
$Property = "*" ,
#Properties to exclude from the the search - supports wildcards
$ExcludeProperty ,
#Specifies custom property names to use, instead of the values defined in the column headers of the TopRow.
[Parameter(ParameterSetName='B', Mandatory)]
[String[]]$Headername,
#Automatically generate property names (P1, P2, P3, ..) instead of the using the values the top row of the sheet
[Parameter(ParameterSetName='C', Mandatory)]
[switch]$NoHeader,
#The row from where we start to import data, all rows above the StartRow are disregarded. By default this is the first row.
[int]$Startrow = 1,
#If specified, highlights all the cells - so you can make Equal cells one colour, and Diff cells another.
[System.Drawing.Color]$AllDataBackgroundColor,
#If specified, highlights the DIFF rows
[System.Drawing.Color]$BackgroundColor,
#If specified identifies the tabs which contain DIFF rows (ignored if -backgroundColor is omitted)
[System.Drawing.Color]$TabColor,
#Name of a column which is unique and will be used to add a row to the DIFF object, default is "Name"
$Key = "Name" ,
#If specified, highlights the DIFF columns in rows which have the same key.
[System.Drawing.Color]$FontColor,
#If specified opens the Excel workbooks instead of outputting the diff to the console (unless -passthru is also specified)
[Switch]$Show,
#If specified, the command tries to the show the DIFF in a Gridview and not on the console. (unless-Passthru is also specified). This Works best with few columns selected, and requires a key
[switch]$GridView,
#If specified -Passthrough full set of diff data is returned without filtering to the specified properties
[Switch]$PassThru,
#If specified the result will include equal rows as well. By default only different rows are returned
[Switch]$IncludeEqual,
#If Specified the result includes only the rows where both are equal
[Switch]$ExcludeDifferent
)
#if the filenames don't resolve, give up now.
try { $oneFile = ((Resolve-Path -Path $Referencefile -ErrorAction Stop).path -eq (Resolve-Path -Path $Differencefile -ErrorAction Stop).path)}
Catch { Write-Warning -Message "Could not Resolve the filenames." ; return }
#If we have one file, we must have two different worksheet names. If we have two files we can have a single string or two strings.
if ($onefile -and ( ($WorkSheetName.count -ne 2) -or $WorkSheetName[0] -eq $WorkSheetName[1] ) ) {
Write-Warning -Message "If both the Reference and difference file are the same then worksheet name must provide 2 different names"
return
}
if ($WorkSheetName.count -eq 2) {$worksheet1 = $WorkSheetName[0] ; $WorkSheet2 = $WorkSheetName[1]}
elseif ($WorkSheetName -is [string]) {$worksheet1 = $WorkSheet2 = $WorkSheetName}
else {Write-Warning -Message "You must provide either a single worksheet name or two names." ; return }
$params= @{ ErrorAction = [System.Management.Automation.ActionPreference]::Stop }
foreach ($p in @("HeaderName","NoHeader","StartRow")) {if ($PSBoundParameters[$p]) {$params[$p] = $PSBoundParameters[$p]}}
try {
$Sheet1 = Import-Excel -Path $Referencefile -WorksheetName $WorkSheet1 @params
$Sheet2 = Import-Excel -Path $Differencefile -WorksheetName $WorkSheet2 @Params
}
Catch {Write-Warning -Message "Could not read the worksheet from $Referencefile and/or $Differencefile." ; return }
#Get Column headings and create a hash table of Name to column letter.
$headings = $Sheet1[-1].psobject.Properties.name # This preserves the sequence - using get-member would sort them alphabetically!
$headings | ForEach-Object -Begin {$columns = @{} ; $i=65 } -Process {$Columns[$_] = [char]($i ++) }
#Make a list of property headings using the Property (default "*") and ExcludeProperty parameters
if ($Key -eq "Name" -and $NoHeader) {$key = "p1"}
$propList = @()
foreach ($p in $Property) {$propList += ($headings.where({$_ -like $p}) )}
foreach ($p in $ExcludeProperty) {$propList = $propList.where({$_ -notlike $p}) }
if (($headings -contains $key) -and ($propList -notcontains $Key)) {$propList += $Key}
$propList = $propList | Select-Object -Unique
if ($propList.Count -eq 0) {Write-Warning -Message "No Columns are selected with -Property = '$Property' and -excludeProperty = '$ExcludeProperty'." ; return}
#Add RowNumber, Sheetname and file name to every row
$FirstDataRow = $startRow + 1
if ($Headername -or $NoHeader) {$FirstDataRow -- }
$i = $FirstDataRow ; foreach ($row in $Sheet1) {Add-Member -InputObject $row -MemberType NoteProperty -Name "_Row" -Value ($i ++)
Add-Member -InputObject $row -MemberType NoteProperty -Name "_Sheet" -Value $worksheet1
Add-Member -InputObject $row -MemberType NoteProperty -Name "_File" -Value $Referencefile}
$i = $FirstDataRow ; foreach ($row in $Sheet2) {Add-Member -InputObject $row -MemberType NoteProperty -Name "_Row" -Value ($i ++)
Add-Member -InputObject $row -MemberType NoteProperty -Name "_Sheet" -Value $worksheet2
Add-Member -InputObject $row -MemberType NoteProperty -Name "_File" -Value $Differencefile}
if ($ExcludeDifferent -and -not $IncludeEqual) {$IncludeEqual = $true}
#Do the comparison and add file,sheet and row to the result - these are prefixed with "_" to show they are added the addition will fail if the sheet has these properties so split the operations
[PSCustomObject[]]$diff = Compare-Object -ReferenceObject $Sheet1 -DifferenceObject $Sheet2 -Property $propList -PassThru -IncludeEqual:$IncludeEqual -ExcludeDifferent:$ExcludeDifferent |
Sort-Object -Property "_Row","File"
#if BackgroundColor was specified, set it on extra or changed rows
if ($diff -and $BackgroundColor) {
#Differences may only exist in one file. So gather the changes for each file; open the file, update each impacted row in the sheet, save the file
$updates = $diff.where({$_.SideIndicator -ne "=="}) | Group-object -Property "_File"
foreach ($file in $updates) {
try {$xl = Open-ExcelPackage -Path $file.name }
catch {Write-warning -Message "Can't open $($file.Name) for writing." ; return}
if ($AllDataBackgroundColor) {
$file.Group._sheet | Sort-Object -Unique | ForEach-Object {
$ws = $xl.Workbook.Worksheets[$_]
if ($headerName) {$range = "A" + $startrow + ":" + $ws.dimension.end.address}
else {$range = "A" + ($startrow + 1) + ":" + $ws.dimension.end.address}
Set-Format -WorkSheet $ws -BackgroundColor $AllDataBackgroundColor -Range $Range
}
}
foreach ($row in $file.group) {
$ws = $xl.Workbook.Worksheets[$row._Sheet]
$range = $ws.Dimension -replace "\d+",$row._row
Set-Format -WorkSheet $ws -Range $range -BackgroundColor $BackgroundColor
}
if ($TabColor) {
foreach ($tab in ($file.group._sheet | Select-Object -Unique)) {
$xl.Workbook.Worksheets[$tab].TabColor = $TabColor
}
}
$xl.save() ; $xl.Stream.Close() ; $xl.Dispose()
}
}
#if font colour was specified, set it on changed properties where the same key appears in both sheets.
if ($diff -and $FontColor -and ($propList -contains $Key) ) {
$updates = $diff.where({$_.SideIndicator -ne "=="}) | Group-object -Property $Key | Where-Object {$_.count -eq 2}
if ($updates) {
$XL1 = Open-ExcelPackage -path $Referencefile
if ($oneFile ) {$xl2 = $xl1}
else {$xl2 = Open-ExcelPackage -path $Differencefile }
foreach ($u in $updates) {
foreach ($p in $propList) {
if($u.Group[0].$p -ne $u.Group[1].$p ) {
Set-Format -WorkSheet $xl1.Workbook.Worksheets[$u.Group[0]._sheet] -Range ($Columns[$p] + $u.Group[0]._Row) -FontColor $FontColor
Set-Format -WorkSheet $xl2.Workbook.Worksheets[$u.Group[1]._sheet] -Range ($Columns[$p] + $u.Group[1]._Row) -FontColor $FontColor
}
}
}
$xl1.Save() ; $xl1.Stream.Close() ; $xl1.Dispose()
if (-not $oneFile) {$xl2.Save() ; $xl2.Stream.Close() ; $xl2.Dispose()}
}
}
elseif ($diff -and $FontColor) {Write-Warning -Message "To match rows to set changed cells, you must specify -Key and it must match one of the included properties." }
#if nothing was found write a message which wont be redirected
if (-not $diff) {Write-Host "Comparison of $Referencefile::$worksheet1 and $Differencefile::$WorkSheet2 returned no results." }
if ($show) {
Start-Process -FilePath $Referencefile
if (-not $oneFile) { Start-Process -FilePath $Differencefile }
if ($GridView) { Write-Warning -Message "-GridView is ignored when -Show is specified" }
}
elseif ($GridView -and $propList -contains $key) {
if ($IncludeEqual -and -not $ExcludeDifferent) {
$GroupedRows = $diff | Group-Object -Property $key
}
else { #to get the right row numbers on the grid we need to have all the rows.
$GroupedRows = Compare-Object -ReferenceObject $Sheet1 -DifferenceObject $Sheet2 -Property $propList -PassThru -IncludeEqual |
Group-Object -Property $key
}
#Additions, deletions and unchanged rows will give a group of 1; changes will give a group of 2 .
#If one sheet has extra rows we can get a single "==" result from compare, with the row number from the reference sheet,
#but the row number in the other sheet might differ, so we will look up the row number from the key field; build a hash table for that
$Sheet2 | ForEach-Object -Begin {$Rowhash = @{} } -Process {$Rowhash[$_.$key] = $_._row }
$ExpandedDiff = ForEach ($g in $GroupedRows) {
#we're going to create a custom object from a hash table. We want the fields to be ordered
$hash = [ordered]@{}
foreach ($result IN $g.Group) {
# if result indicates equal or "in Reference" set the reference side row. If we did that on a previous result keep it. Otherwise set to "blank"
if ($result.sideindicator -ne "=>") {$hash["<Row"] = $result._Row }
elseif (-not $hash["<Row"]) {$hash["<Row"] = "" }
#if we have already set the side, this is the second record, so set side to indicate "changed"
if ($hash.Side) {$hash.side = "<>"} else {$hash["Side"] = $result.sideindicator}
#if result is "in reference" and we don't have a matching "in difference" (meaning a change) the lookup will be blank. Which we want.
$hash[">Row"] = $Rowhash[$g.Name]
#position the key as the next field (only appears once)
$Hash[$key] = $g.Name
#For all the other fields we care about create <=FieldName and/or =>FieldName
foreach ($p in $propList.Where({$_ -ne $key})) {
if ($result.SideIndicator -eq "==") {$hash[("=>$P")] = $hash[("<=$P")] =$result.$P}
else {$hash[($result.SideIndicator+$P)] =$result.$P}
}
}
[Pscustomobject]$hash
}
#Sort by reference row number, and fill in any blanks in the difference-row column
$ExpandedDiff = $ExpandedDiff | Sort-Object -Property "<row"
for ($i = 1; $i -lt $ExpandedDiff.Count; $i++) {if (-not $ExpandedDiff[$i].">row") {$ExpandedDiff[$i].">row" = $ExpandedDiff[$i-1].">row" } }
#Sort by difference row number, and fill in any blanks in the reference-row column
$ExpandedDiff = $ExpandedDiff | Sort-Object -Property ">row"
for ($i = 1; $i -lt $ExpandedDiff.Count; $i++) {if (-not $ExpandedDiff[$i]."<row") {$ExpandedDiff[$i]."<row" = $ExpandedDiff[$i-1]."<row" } }
#if we had to put the equal rows back, take them out; sort, make sure all the columns are present in row 1 so the grid puts them in, and output
if ( $ExcludeDifferent) {$ExpandedDiff = $ExpandedDiff.where({$_.side -eq "=="}) | Sort-Object -Property "<row" ,">row" }
elseif ( $IncludeEqual) {$ExpandedDiff = $ExpandedDiff | Sort-Object -Property "<row" ,">row" }
else {$ExpandedDiff = $ExpandedDiff.where({$_.side -ne "=="}) | Sort-Object -Property "<row" ,">row" }
$ExpandedDiff | Update-FirstObjectProperties | Out-GridView -Title "Comparing $Referencefile::$worksheet1 (<=) with $Differencefile::$WorkSheet2 (=>)"
}
elseif ($GridView ) {Write-Warning -Message "To use -GridView you must specify -Key and it must match one of the included properties." }
elseif (-not $PassThru) {return ($diff | Select-Object -Property (@(@{n="_Side";e={$_.SideIndicator}},"_File" ,"_Sheet","_Row") + $propList))}
if ( $PassThru) {return $diff }
}

powershell: Check if any of a bunch of properties is set

I'm importing a csv-file which looks like this:
id,value1.1,value1.2,value1.3,Value2.1,Value2.2,Value3.1,Value3.2
row1,v1.1,,v1.3
row2,,,,v2.1,v2.2
row3,,,,,,,v3.2
Now I want to check, if any of the value-properties in one group is set.
I can do
Import-Csv .\test.csv | where {$_.'Value1.1' -or $_.'Value1.2' -or $_.'Value1.3'}
or
Import-Csv .\test.csv | foreach {
if ($_.Value1 -or $_.Value2 -or $_.Value3) {
Write-Output $_
}
}
But my "real" csv-file contains about 200 columns and I have to check 31 properties x 5 different object types that are mixed up in this csv. So my code will be really ugly.
Is there anything like
where {$_.Value1.*}
or
where {$ArrayWithPropertyNames}
?
You could easily use the Get-Member cmdlet to get the properties which have the correct prefix (just use * as a wildcard after the prefix).
So to achieve what you want you could just filter the data based on whether any of the properties with the correct prefix contains data.
The script below uses your sample data, with a row4 added, and filters the list to find all items which have a value in any property starting with value1.
$csv = @"
id,value1.1,value1.2,value1.3,Value2.1,Value2.2,Value3.1,Value3.2
row1,v1.1,,v1.3
row2,,,,v2.1,v2.2
row3,,,,,,,v3.2
row4,v1.1,,v1.3
"@
$data = ConvertFrom-csv $csv
$data | Where {
$currentDataItem = $_
$propertyValues = $currentDataItem |
# Gets all the properties with the correct prefix
Get-Member 'value1*' -MemberType NoteProperty |
# Gets the values for each of those properties
Foreach { $currentDataItem.($_.Name) } |
# Only keep the property value if it has a value
Where { $_ }
# Could just return $propertyValues, but this makes the intention clearer
$hasValueOnPrefixedProperty = @($propertyValues).Count -gt 0
Write-Output $hasValueOnPrefixedProperty
}
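Alternate solution (keeps rows where at least one matching property is non-empty):
$PropsToCheck = 'Value1*'
Import-csv .\test.csv |
Where {
# -match against a collection returns the matching elements, so the result is truthy only if some value is non-blank
(($_ | Select $PropsToCheck).psobject.properties.value) -match '\S'
}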
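Since the real file mixes several property groups, the same idea can be wrapped in a small helper that takes one or more wildcard patterns. This is only a sketch: `Test-AnyPropertySet` is a hypothetical name, not a built-in cmdlet, and it assumes PowerShell 3.0 or later and that "set" means "not null and not blank".

```powershell
# Hypothetical helper (not part of the answers above): returns $true when any
# property whose name matches one of the wildcard patterns has a non-blank value.
function Test-AnyPropertySet {
    param(
        [Parameter(Mandatory)] $InputObject,
        [Parameter(Mandatory)] [string[]] $Pattern
    )
    foreach ($prop in $InputObject.psobject.Properties) {
        foreach ($p in $Pattern) {
            # -like is case-insensitive, so 'value2*' also matches 'Value2.1'
            if ($prop.Name -like $p -and -not [string]::IsNullOrWhiteSpace($prop.Value)) {
                return $true
            }
        }
    }
    return $false
}

# Sample data from the question
$csv = @"
id,value1.1,value1.2,value1.3,Value2.1,Value2.2,Value3.1,Value3.2
row1,v1.1,,v1.3
row2,,,,v2.1,v2.2
row3,,,,,,,v3.2
"@
$data = ConvertFrom-Csv $csv

# Rows with at least one value in the value1.* group
$group1 = $data | Where-Object { Test-AnyPropertySet -InputObject $_ -Pattern 'value1*' }
# Rows with at least one value in the value2.* or value3.* groups
$group23 = $data | Where-Object { Test-AnyPropertySet -InputObject $_ -Pattern 'value2*', 'value3*' }
```

Returning on the first hit avoids walking all 200 columns for rows that match early.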

compare columns in two csv files

With all of the examples out there you would think I could have found my solution. :-)
Anyway, I have two csv files; one with two columns, one with 4. I need to compare one column from each one using powershell. I thought I had it figured out but when I did a compare of my results, it comes back as false when I know it should be true. Here's what I have so far:
$newemp = Import-Csv -Path "C:\Temp\newemp.csv" -Header login_id, lastname, firstname, other | Select-Object "login_id"
$ps = Import-Csv -Path "C:\Temp\Emplid_LoginID.csv" | Select-Object "login id"
If ($newemp -eq $ps)
{
write-host "IDs match" -foregroundcolor green
}
Else
{
write-host "Not all IDs match" -backgroundcolor yellow -foregroundcolor black
}
I had to specify headers for the first file because it doesn't have any. What's weird is that I can call each variable to see what it holds and they end up with the same info, but for some reason the comparison still comes up as false. This occurs even if there is only one row (not counting the header row).
I started to parse them as arrays but wasn't quite sure that was the right thing. What's important is that I compare row1 of the first file with row1 of the second file. I can't just do a simple -match or -contains.
EDIT: One annoying thing is that the variables seem to hold the header row as well. When I call each one, the header is shown. But if I call both variables, I only see one header but two rows.
I just added the following check but I'm getting the same results (False for everything):
$results = Compare-Object -ReferenceObject $newemp -DifferenceObject $ps -PassThru | ForEach-Object { $_.InputObject }
Using latkin's answer from here I think this would give you the result set you're looking for. As per latkin's comment, the property comparison is redundant for your purposes but I left it in as it's good to know. Additionally the header is specified even for the csv with headers to prevent the header row being included in the comparison.
$newemp = Import-Csv -Path "C:\Temp\_sotemp\Book1.csv" -Header loginid |
Select-Object "loginid"
$ps = Import-Csv -Path "C:\Temp\_sotemp\Book2.csv" -Header loginid |
Select-Object "loginid"
#get list of (imported) CSV properties
$props1 = $newemp | gm -MemberType NoteProperty | select -expand Name | sort
$props2 = $ps | gm -MemberType NoteProperty | select -expand Name | sort
#first check that properties match
#omit this step if you know for sure they will be
if(Compare-Object $props1 $props2){
throw "Properties are not the same! [$props1] [$props2]"
}
#pass properties list to Compare-Object
else{
Compare-Object $newemp $ps -Property $props1
}
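The question also asks for a positional check (row 1 of the first file against row 1 of the second), which Compare-Object does not do by itself. As far as I can tell, `$newemp -eq $ps` comes back false because `-eq` with a collection on the left acts as a filter, and the objects produced by Select-Object are distinct objects rather than plain strings. A minimal sketch of a row-by-row comparison on the extracted ID strings follows; the inline sample rows are made up for illustration and stand in for the two CSV files, and PowerShell 3.0 or later is assumed.

```powershell
# Made-up sample data standing in for newemp.csv (no header row) ...
$newemp = @"
abc123,Patel,John,x
zxy321,Smith,Kohn,y
sdf120,Scott,Maun,z
"@ | ConvertFrom-Csv -Header login_id, lastname, firstname, other

# ... and for Emplid_LoginID.csv (has a header row with a space in the name)
$ps = @"
login id
abc123
zxy999
sdf120
"@ | ConvertFrom-Csv

# Pull the ID columns out as plain string arrays
$ids1 = @($newemp | ForEach-Object { $_.login_id })
$ids2 = @($ps | ForEach-Object { $_.'login id' })

# Compare position by position; an index past the end yields $null
$max = [Math]::Max($ids1.Count, $ids2.Count)
$rowResults = for ($i = 0; $i -lt $max; $i++) {
    [pscustomobject]@{
        Row     = $i + 1
        File1   = $ids1[$i]
        File2   = $ids2[$i]
        IsMatch = $ids1[$i] -eq $ids2[$i]
    }
}

# True only when every row matched
$allMatch = @($rowResults | Where-Object { -not $_.IsMatch }).Count -eq 0
$rowResults | Format-Table -AutoSize
```

With real files, replace the two here-strings with the Import-Csv calls from the question.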
In the second line there is a space in "login id", while the first line doesn't have it. Could that be the issue? Try using the same header name in both .csv files. It also works without providing headers or Select-Object statements. Below is my experiment based on your input.
emp.csv
loginid firstname lastname
------------------------------
abc123 John patel
zxy321 Kohn smith
sdf120 Maun scott
tiy123 Dham rye
k2340 Naam mason
lk10j5 Shaan kelso
303sk Doug smith
empids.csv
loginid
-------
abc123
zxy321
sdf120
tiy123
PS C:\>$newemp = Import-csv C:\scripts\emp.csv
PS C:\>$ps = Import-CSV C:\scripts\empids.csv
PS C:\>$results = Compare-Object -ReferenceObject $newemp -DifferenceObject $ps | foreach { $_.InputObject}
Shows the difference objects that are not in $ps
loginid firstname lastname SideIndicator
------- --------- -------- -------------
k2340 Naam mason <=
lk10j5 Shaan kelso <=
303sk Doug smith <=
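For reference, here is a sketch of the same experiment that names the column to compare explicitly with -Property and keeps the other columns with -PassThru. It assumes the two files are comma-separated and contain the rows shown above; the here-strings stand in for emp.csv and empids.csv.

```powershell
# Stand-in for emp.csv (assumed comma-separated)
$newemp = @"
loginid,firstname,lastname
abc123,John,patel
zxy321,Kohn,smith
sdf120,Maun,scott
tiy123,Dham,rye
k2340,Naam,mason
lk10j5,Shaan,kelso
303sk,Doug,smith
"@ | ConvertFrom-Csv

# Stand-in for empids.csv
$ps = @"
loginid
abc123
zxy321
sdf120
tiy123
"@ | ConvertFrom-Csv

# Compare on the loginid column only; -PassThru returns the original rows
# with a SideIndicator property added ('<=' means "only in the reference set")
$onlyInEmp = Compare-Object -ReferenceObject $newemp -DifferenceObject $ps -Property loginid -PassThru |
    Where-Object { $_.SideIndicator -eq '<=' }

$onlyInEmp | Select-Object loginid, firstname, lastname
```

This is a set comparison on loginid, so it does not care about row order.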
I am not sure if this is what you are looking for, but I have used PowerShell to do some CSV formatting for myself.
$test = Import-Csv .\Desktop\Vmtools-compare.csv
foreach ($i in $test) {
foreach ($n in $i.name) {
foreach ($m in $test) {
$check = "yes"
if ($n -eq $m.prod) {
$check = "no"
break
}
}
if ($check -ne "no") {$n}
}
}
This is what my Excel CSV file looks like:
prod name
1 3
2 5
3 8
4 2
5 0
and script outputs this:
8
0
So basically the script takes each number in the name column and checks it against the prod column. If the number is there, it is not displayed; otherwise it is displayed.
I have also done it the opposite way:
$test = Import-Csv c:\test.csv
foreach ($i in $test) {
foreach ($n in $i.name) {
foreach ($m in $test) {
$check = "yes"
if ($n -eq $m.prod) {echo $n}
}
}
}
This is what my Excel CSV looks like:
prod name
1 3
2 5
3 8
4 2
5 0
and script outputs this:
3
5
2
So this version shows only the matching entries.
You can play around with the code to look at different columns.
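The nested loops above can probably be shortened with the -contains and -notcontains operators. This is a sketch under the assumption that the file is comma-separated with the headers prod and name; the here-string stands in for the CSV file.

```powershell
# Stand-in for the CSV file (assumed comma-separated)
$test = @"
prod,name
1,3
2,5
3,8
4,2
5,0
"@ | ConvertFrom-Csv

# Collect each column as a plain array of strings
$prod  = @($test | ForEach-Object { $_.prod })
$names = @($test | ForEach-Object { $_.name })

# Values in the name column that do not appear anywhere in the prod column
$missing = @($names | Where-Object { $prod -notcontains $_ })

# Values in the name column that do appear in the prod column
$matching = @($names | Where-Object { $prod -contains $_ })

"Missing:  $($missing -join ', ')"
"Matching: $($matching -join ', ')"
```

Both variants keep the order of the name column, so the results line up with the outputs shown above.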