compare columns in two csv files - powershell

With all of the examples out there you would think I could have found my solution. :-)
Anyway, I have two csv files; one with two columns, one with 4. I need to compare one column from each one using powershell. I thought I had it figured out but when I did a compare of my results, it comes back as false when I know it should be true. Here's what I have so far:
$newemp = Import-Csv -Path "C:\Temp\newemp.csv" -Header login_id, lastname, firstname, other | Select-Object "login_id"
$ps = Import-Csv -Path "C:\Temp\Emplid_LoginID.csv" | Select-Object "login id"
If ($newemp -eq $ps)
{
write-host "IDs match" -foregroundcolor green
}
Else
{
write-host "Not all IDs match" -backgroundcolor yellow -foregroundcolor black
}
I had to specify headers for the first file because it doesn't have any. What's weird is that I can call each variable to see what it holds, and they end up with the same info, but for some reason the comparison still comes up as false. This occurs even if there is only one row (not counting the header row).
I started to parse them as arrays but wasn't quite sure that was the right thing. What's important is that I compare row1 of the first file with row1 of the second file. I can't just do a simple -match or -contains.
EDIT: One annoying thing is that the variables seem to hold the header row as well. When I call each one, the header is shown. But if I call both variables, I only see one header but two rows.
I just added the following check but getting the same results (False for everything):
$results = Compare-Object -ReferenceObject $newemp -DifferenceObject $ps -PassThru | ForEach-Object { $_.InputObject }

Using latkin's answer from here, I think this will give you the result set you're looking for. As per latkin's comment, the property comparison is redundant for your purposes, but I left it in as it's good to know. The header is specified for both files so that the property names match. Note that with -Header, Import-Csv treats the first line of the file as data, so if a file already contains a header row, skip that row (for example with Select-Object -Skip 1) or it will be included in the comparison.
$newemp = Import-Csv -Path "C:\Temp\_sotemp\Book1.csv" -Header loginid |
Select-Object "loginid"
$ps = Import-Csv -Path "C:\Temp\_sotemp\Book2.csv" -Header loginid |
Select-Object "loginid"
#get list of (imported) CSV properties
$props1 = $newemp | gm -MemberType NoteProperty | select -expand Name | sort
$props2 = $ps | gm -MemberType NoteProperty | select -expand Name | sort
#first check that properties match
#omit this step if you know for sure they will be
if(Compare-Object $props1 $props2){
throw "Properties are not the same! [$props1] [$props2]"
}
#pass properties list to Compare-Object
else{
Compare-Object $newemp $ps -Property $props1
}
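As a side note on why the original If ($newemp -eq $ps) test comes back false: with an array on the left, -eq works as a filter rather than returning a single True/False, and imported CSV rows are objects that are not compared by their property values. Since the question asks for row1 against row1, here is a minimal sketch of a row-by-row check. It uses small made-up in-memory data instead of the real files; the sample values and the $rowResults name are assumptions for illustration only.

```powershell
# Made-up stand-ins for the two files (the first one has no header row).
$newemp = 'abc123,patel,John,x', 'zxy321,smith,Kohn,y' |
    ConvertFrom-Csv -Header login_id, lastname, firstname, other
$ps = 'emplid,login id', '1001,abc123', '1002,zzz999' | ConvertFrom-Csv

# Compare the login column of row N in one file with row N in the other.
$rowResults = for ($i = 0; $i -lt $newemp.Count; $i++) {
    $left  = $newemp[$i].login_id
    $right = $ps[$i].'login id'
    [pscustomobject]@{ Row = $i + 1; Left = $left; Right = $right; Match = ($left -eq $right) }
}
$rowResults
```

Each output object says whether that row pair matches, so "all IDs match" becomes a check that no row has Match set to False.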

In the second line there is a space in "login id", while the first line uses "login_id". Could that be the issue? Try using the same header name in both .csv files. It also works without providing -Header or Select-Object statements. Below is my experiment based on your input.
emp.csv
loginid firstname lastname
------------------------------
abc123 John patel
zxy321 Kohn smith
sdf120 Maun scott
tiy123 Dham rye
k2340 Naam mason
lk10j5 Shaan kelso
303sk Doug smith
empids.csv
loginid
-------
abc123
zxy321
sdf120
tiy123
PS C:\>$newemp = Import-csv C:\scripts\emp.csv
PS C:\>$ps = Import-CSV C:\scripts\empids.csv
PS C:\>$results = Compare-Object -ReferenceObject $newemp -DifferenceObject $ps | foreach { $_.InputObject}
Shows the difference objects that are not in $ps
loginid firstname lastname SideIndicator
------- --------- -------- -------------
k2340 Naam mason <=
lk10j5 Shaan kelso <=
303sk Doug smith <=
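When only one column matters, it may be more reliable to tell Compare-Object which property to compare instead of comparing whole rows. A minimal sketch using a cut-down in-memory copy of the sample data above (the $diff name is just for illustration):

```powershell
# Cut-down copies of emp.csv and empids.csv, built in memory.
$emp    = 'loginid,firstname,lastname', 'abc123,John,patel', 'k2340,Naam,mason' | ConvertFrom-Csv
$empids = 'loginid', 'abc123' | ConvertFrom-Csv

# Compare on the loginid column only.
$diff = Compare-Object -ReferenceObject $emp -DifferenceObject $empids -Property loginid
$diff
```

A SideIndicator of <= marks a loginid that exists only in the reference object ($emp here).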

I am not sure if this is what you are looking for, but I have used PowerShell to do some CSV formatting for myself.
$test = Import-Csv .\Desktop\Vmtools-compare.csv
foreach ($i in $test) {
foreach ($n in $i.name) {
foreach ($m in $test) {
$check = "yes"
if ($n -eq $m.prod) {
$check = "no"
break
}
}
if ($check -ne "no") {$n}
}
}
This is what my Excel CSV file looks like:
prod name
1 3
2 5
3 8
4 2
5 0
and script outputs this:
8
0
So basically, the script takes each number in the name column and checks it against the prod column. If the number is found there, it is not displayed; otherwise it is displayed.
I have also done it the opposite way:
$test = Import-Csv c:\test.csv
foreach ($i in $test) {
foreach ($n in $i.name) {
foreach ($m in $test) {
$check = "yes"
if ($n -eq $m.prod) {echo $n}
}
}
}
This is what my Excel CSV looks like:
prod name
1 3
2 5
3 8
4 2
5 0
and script outputs this:
3
5
2
So the script shows only the matching entries.
You can play around with the code to look at different columns.
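For what it's worth, the same two results can probably be had without the nested loops by using the -in and -notin operators (PowerShell 3.0 or later). A minimal sketch on the same sample data, built in memory; the variable names are made up for illustration:

```powershell
# Same sample data as above, built in memory instead of read from a file.
$test = 'prod,name', '1,3', '2,5', '3,8', '4,2', '5,0' | ConvertFrom-Csv

# Values in the name column that do not occur in the prod column.
$onlyInName = $test.name | Where-Object { $_ -notin $test.prod }

# Values in the name column that also occur in the prod column.
$inBoth = $test.name | Where-Object { $_ -in $test.prod }

"Only in name: $($onlyInName -join ', ')"   # Only in name: 8, 0
"In both:      $($inBoth -join ', ')"       # In both:      3, 5, 2
```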

Related

PowerShell: Find unique values from multiple CSV files

Let's say that I have several CSV files and I need to check a specific column and find values that exist in one file, but not in any of the others. I'm having a bit of trouble coming up with the best way to go about it, as I wanted to use Compare-Object and possibly keep all columns and not just the one that contains the values I'm checking.
So I do indeed have several CSV files and they all have a Service Code column, and I'm trying to create a list for each Service Code that only appears in one file. So I would have "Service Codes only in CSV1", "Service Codes only in CSV2", etc.
Based on some testing and a semi-related question, I've come up with a workable solution, but with all of the nesting and For loops, I'm wondering if there is a more elegant method out there.
Here's what I do have:
# -Filter is used because -Include has no effect without -Recurse or a wildcard in the path
$files = Get-ChildItem -LiteralPath "C:\temp\ItemCompare" -Filter "*.csv"
$HashList = [System.Collections.Generic.List[System.Collections.Generic.HashSet[String]]]::New()
For ($i = 0; $i -lt $files.Count; $i++){
$TempHashSet = [System.Collections.Generic.HashSet[String]]::New([String[]](Import-Csv $files[$i].FullName)."Service Code")
$HashList.Add($TempHashSet)
}
$FinalHashList = [System.Collections.Generic.List[System.Collections.Generic.HashSet[String]]]::New()
For ($i = 0; $i -lt $HashList.Count; $i++){
$UniqueHS = [System.Collections.Generic.HashSet[String]]::New($HashList[$i])
For ($j = 0; $j -lt $HashList.Count; $j++){
#Skip the check when the HashSet would be compared to itself
If ($j -eq $i){Continue}
$UniqueHS.ExceptWith($HashList[$j])
}
$FinalHashList.Add($UniqueHS)
}
It seems a bit messy to me using so many different .NET references, and I know I could make it cleaner with a tag to say using namespace System.Collections.Generic, but I'm wondering if there is a way to make it work using Compare-Object which was my first attempt, or even just a simpler/more efficient method to filter each file.
I believe I found an "elegant" solution based on Group-Object, using only a single pipeline:
# Import all CSV files.
Get-ChildItem $PSScriptRoot\csv\*.csv -File -PipelineVariable file | Import-Csv |
# Add new column "FileName" to distinguish the files.
Select-Object *, @{ label = 'FileName'; expression = { $file.Name } } |
# Group by ServiceCode to get a list of files per distinct value.
Group-Object ServiceCode |
# Filter by ServiceCode values that exist only in a single file.
# Sort-Object -Unique takes care of possible duplicates within a single file.
Where-Object { ( $_.Group.FileName | Sort-Object -Unique ).Count -eq 1 } |
# Expand the groups so we get the original object structure back.
ForEach-Object Group |
# Format-Table requires sorting by FileName, for -GroupBy.
Sort-Object FileName |
# Finally pretty-print the result.
Format-Table -Property ServiceCode, Foo -GroupBy FileName
Test Input
a.csv:
ServiceCode,Foo
1,fop
2,fip
3,fap
b.csv:
ServiceCode,Foo
6,bar
6,baz
3,bam
2,bir
4,biz
c.csv:
ServiceCode,Foo
2,bla
5,blu
1,bli
Output
FileName: b.csv
ServiceCode Foo
----------- ---
4 biz
6 bar
6 baz
FileName: c.csv
ServiceCode Foo
----------- ---
5 blu
Looks correct to me. The values 1, 2 and 3 are duplicated between multiple files, so they are excluded. 4, 5 and 6 exist only in single files, while 6 is a duplicate value only within a single file.
Understanding the code
Maybe it is easier to understand how this code works, by looking at the intermediate output of the pipeline produced by the Group-Object line:
Count Name Group
----- ---- -----
2 1 {@{ServiceCode=1; Foo=fop; FileName=a.csv}, @{ServiceCode=1; Foo=bli; FileName=c.csv}}
3 2 {@{ServiceCode=2; Foo=fip; FileName=a.csv}, @{ServiceCode=2; Foo=bir; FileName=b.csv}, @{ServiceCode=2; Foo=bla; FileName=c.csv}}
2 3 {@{ServiceCode=3; Foo=fap; FileName=a.csv}, @{ServiceCode=3; Foo=bam; FileName=b.csv}}
1 4 {@{ServiceCode=4; Foo=biz; FileName=b.csv}}
1 5 {@{ServiceCode=5; Foo=blu; FileName=c.csv}}
2 6 {@{ServiceCode=6; Foo=bar; FileName=b.csv}, @{ServiceCode=6; Foo=baz; FileName=b.csv}}
Here the Name contains the unique ServiceCode values, while Group "links" the data to the files.
From here it should already be clear how to find values that exist only in single files. If duplicate ServiceCode values within a single file were not allowed, we could even simplify the filter to Where-Object Count -eq 1. Since it was stated that dupes within single files may exist, we need the Sort-Object -Unique to count multiple equal file names within a group as only one.
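To try the filter in isolation, here is a minimal sketch that skips the file handling and starts from made-up rows that already carry a FileName column (the rows and the $singleFileCodes name are assumptions for illustration):

```powershell
# Made-up rows, shaped like the objects after the Select-Object step above.
$rows = @(
    [pscustomobject]@{ ServiceCode = '1'; FileName = 'a.csv' }
    [pscustomobject]@{ ServiceCode = '1'; FileName = 'c.csv' }
    [pscustomobject]@{ ServiceCode = '4'; FileName = 'b.csv' }
    [pscustomobject]@{ ServiceCode = '6'; FileName = 'b.csv' }
    [pscustomobject]@{ ServiceCode = '6'; FileName = 'b.csv' }
)

# Keep the codes whose rows all come from one and the same file.
$singleFileCodes = $rows | Group-Object ServiceCode |
    Where-Object { @($_.Group.FileName | Sort-Object -Unique).Count -eq 1 } |
    ForEach-Object { $_.Name }

$singleFileCodes   # 4 and 6
```

Code 6 survives even though it appears twice, because both rows come from b.csv. Code 1 is dropped because it appears in two different files.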
It is not completely clear what you expect as an output.
If this is just the ServiceCodes that intersect then this is actually a duplicate with:
Comparing two arrays & get the values which are not common
Union and Intersection in PowerShell?
But taking that you actually want the related object and files, you might use this approach:
$HashTable = @{}
ForEach ($File in Get-ChildItem .\*.csv) {
ForEach ($Object in (Import-Csv $File)) {
$HashTable[$Object.ServiceCode] = $Object |Select-Object *,
@{ n='File'; e={ $File.Name } },
@{ n='Count'; e={ $HashTable[$Object.ServiceCode].Count + 1 } }
}
}
$HashTable.Values |Where-Object Count -eq 1
Here is my take on this fun exercise, I'm using a similar approach as yours with the HashSet but adding [System.StringComparer]::OrdinalIgnoreCase to leverage the .Contains(..) method:
using namespace System.Collections.Generic
# Generate Random CSVs:
$charset = 'abABcdCD0123xXyYzZ'
$ran = [random]::new()
$csvs = @{}
foreach($i in 1..50) # Create 50 CSVs for testing
{
$csvs["csv$i"] = foreach($z in 1..50) # With 50 Rows
{
$index = (0..2).ForEach({ $ran.Next($charset.Length) })
[pscustomobject]@{
ServiceCode = [string]::new($charset[$index])
Data = $ran.Next()
}
}
}
# Get Unique 'ServiceCode' per CSV:
$result = @{}
foreach($key in $csvs.Keys)
{
# Get all unique `ServiceCode` from the other CSVs
$tempHash = [HashSet[string]]::new(
[string[]]($csvs[$csvs.Keys -ne $key].ServiceCode),
[System.StringComparer]::OrdinalIgnoreCase
)
# Filter the unique `ServiceCode`
$result[$key] = foreach($line in $csvs[$key])
{
if(-not $tempHash.Contains($line.ServiceCode))
{
$line
}
}
}
# Test if the code worked,
# If something is returned from here means it didn't work
foreach($key in $result.Keys)
{
$tmp = $result[$result.Keys -ne $key].ServiceCode
foreach($val in $result[$key])
{
if($val.ServiceCode -in $tmp)
{
$val
}
}
}
I was able to get the unique items as follows:
# Get all items of CSVs in a single variable with adding the file name at the last column
$CSVs = Get-ChildItem "C:\temp\ItemCompare\*.csv" | ForEach-Object {
$CSV = Import-CSV -Path $_.FullName
$FileName = $_.Name
$CSV | Select-Object *,@{N='Filename';E={$FileName}}
}
Foreach($line in $CSVs){
$ServiceCode = $line.ServiceCode
$file = $line.Filename
if (!($CSVs | where {$_.ServiceCode -eq $ServiceCode -and $_.filename -ne $file})){
$line
}
}

PowerShell: list CSV file rows where at least one value between the 3rd and last column is equal to "0" or "1"

In my PowerShell script, I'm working with a CSV file that looks like this (with a number of rows and columns that can vary, but there will always be at least the headers and the first 2 columns):
OS;IP;user0;user1;user3
Windows;10.0.0.1;;;
Linux;hostname2;0;;1
Linux;10.0.0.3;;0;0
Linux;hostname4;;;
Windows;hostname5;1;1;1
I basically list servers in the first column and users in the first row (CSV header). This represents a user "access granting" matrix to servers (1 for "give access", 0 for "remove access", and void for "don't change").
I'm looking for a way to extract only the rows that include a value equal to "1" or "0" between (and including) the 3rd and last column. (= to eventually get the list of servers where access rights should be changed)
So taking the above example, I only want the following lines returned:
Linux;hostname2;0;;1
Linux;10.0.0.3;;0;0
Windows;hostname5;1;1;1
Any hints to make this possible? Or the opposite (getting the ones without any 0 or 1)?
Even if it means using "Get-Content" instead of "Import-CSV". I don't care about the 1st (headers) row; I know how to exclude that.
Thank you!
--- Final solution, thanks to @Tomalak's answer:
$AccessMatrix = Import-CSV $CSVfile -delimiter ';'
$columns = $AccessMatrix | Get-Member -MemberType NoteProperty | Select-Object -Skip 2 -ExpandProperty Name
$AccessMatrix = $AccessMatrix | ForEach-Object {
$row = $_
foreach ($col in $columns) {
if ($row.$col.trim() -eq "1" -OR $row.$col.trim() -eq "0") {
$row # this pushes the $row onto the pipeline
break
}
}
}
The following uses Get-Member to select the names of all columns after the first two.
Then, using ForEach-Object, we can output only those rows that have a value in any of those columns.
$data = ConvertFrom-Csv "OS;IP;user0;user1;user3
Windows;10.0.0.1;;;
Linux;hostname2;0;;1
Linux;10.0.0.3;;0;0
Linux;hostname4;;;
Windows;hostname5;1;1;1" -Delimiter ";"
$columns = $data | Get-Member -MemberType NoteProperty | Select-Object -Skip 2 -ExpandProperty Name
$data | ForEach-Object {
$row = $_
foreach ($col in $columns) {
if ($row.$col -ne "") {
$row # this pushes the $row onto the pipeline
break
}
}
}
The break statement stops the execution of the inner foreach loop because there is no point in further checking as soon as the first column with any value is found.
This is equivalent to the above, if you prefer Where-Object:
$data | Where-Object {
$row = $_
foreach ($col in $columns) {
if ($row.$col -ne "") {
return $true
}
}
}
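One caveat worth hedging: Get-Member lists members sorted by name, so -Skip 2 drops the right columns here only because IP and OS happen to sort before the user* names. A minimal sketch that takes the columns in file order instead, via the psobject.Properties collection of the first row, and keeps only rows with a "0" or "1" as asked in the question (sample data shortened, $changed is a made-up name):

```powershell
# Shortened sample data from the question, built in memory.
$data = 'OS;IP;user0;user1;user3',
        'Windows;10.0.0.1;;;',
        'Linux;hostname2;0;;1',
        'Linux;hostname4;;;' | ConvertFrom-Csv -Delimiter ';'

# Column names in file order, skipping the first two (OS and IP).
$columns = $data[0].psobject.Properties.Name | Select-Object -Skip 2

# Keep rows where at least one of those columns holds "0" or "1".
$changed = $data | Where-Object {
    $row = $_
    foreach ($col in $columns) {
        if ($row.$col -in '0', '1') { return $true }
    }
}
$changed
```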

Which operator provides quicker output -match -contains or Where-Object for large CSV files

I am trying to build a logic where I have to query 4 large CSV files against 1 CSV file. Particularly finding an AD object against 4 domains and store them in variable for attribute comparison.
I have tried importing all files in different variables and used below 3 different codes to get the desired output. But it takes longer time for completion than expected.
CSV import:
$AllMainFile = Import-csv c:\AllData.csv
#Input file contains below
EmployeeNumber,Name,Domain
Z001,ABC,Test.com
Z002,DEF,Test.com
Z003,GHI,Test1.com
Z001,ABC,Test2.com
$AAA = Import-csv c:\AAA.csv
#Input file contains below
EmployeeNumber,Name,Domain
Z001,ABC,Test.com
Z002,DEF,Test.com
Z003,GHI,Test1.com
Z001,ABC,Test2.com
Z004,JKL,Test.com
$BBB = Import-Csv C:\BBB.csv
$CCC = Import-Csv C:\CCC.csv
$DDD = Import-Csv c:\DDD.csv
Sample code 1:
foreach ($x in $AllMainFile) {
$AAAoutput += $AAA | ? {$_.employeeNumber -eq $x.employeeNumber}
$BBBoutput += $BBB | ? {$_.employeeNumber -eq $x.employeeNumber}
$CCCoutput += $CCC | ? {$_.employeeNumber -eq $x.employeeNumber}
$DDDoutput += $DDD | ? {$_.employeeNumber -eq $x.employeeNumber}
if ($DDDoutput.Count -le 1 -and $AAAoutput.Count -le 1 -and $BBBoutput.Count -le 1 -and $CCCoutput.Count -le 1) {
#### My Other script execution code here
} else {
#### My Other script execution code here
}
}
Sample code 2 (just replacing with -match instead of Where-Object):
foreach ($x in $AllMainFile) {
$AAAoutput += $AAA -match $x.EmployeeNumber
$BBBoutput += $BBB -match $x.EmployeeNumber
$CCCoutput += $CCC -match $x.EmployeeNumber
$DDDoutput += $AllMainFile -match $x.EmployeeNumber
if ($DDDoutput.Count -le 1 -and $AAAoutput.Count -le 1 -and $BBBoutput.Count -le 1 -and $CCCoutput.Count -le 1) {
#### My Other script execution code here
} else {
#### My Other script execution code here
}
}
Sample code 3 (just replacing with -contains operator):
foreach ($x in $AllMainFile) {
foreach ($c in $AAA){ if ($AllMainFile.employeeNumber -contains $c.employeeNumber) {$AAAoutput += $c}}
foreach ($c in $BBB){ if ($AllMainFile.employeeNumber -contains $c.employeeNumber) {$BBBoutput += $c}}
foreach ($c in $CCC){ if ($AllMainFile.employeeNumber -contains $c.employeeNumber) {$CCCoutput += $c}}
foreach ($c in $DDD){ if ($AllMainFile.employeeNumber -contains $c.employeeNumber) {$DDDoutput += $c}}
if ($DDDoutput.Count -le 1 -and $AAAoutput.Count -le 1 -and $BBBoutput.Count -le 1 -and $CCCoutput.Count -le 1) {
#### My Other script execution code here
} else {
#### My Other script execution code here
}
}
I am expecting to execute the script as quick and fast as possible by comparing and lookup all 4 CSV files against 1 input file. Each files contains more than 1000k objects/rows with 5 columns.
Performance
Before answering the question, I would like to clear the air about measuring the performance of PowerShell cmdlets. Native PowerShell is very good at streaming objects and can therefore save a lot of memory if streamed correctly (do not assign a stream to a variable or wrap it in parentheses). PowerShell is also capable of invoking almost every existing .NET method (like Add()) and technologies like LINQ.
The usual way of measuring the performance of a command is:
(Measure-Command {<myCommand>}).TotalMilliseconds
If you use this on native PowerShell streaming cmdlets, they appear not to perform very well in comparison with statements and .NET commands. It is often concluded that e.g. LINQ outperforms native PowerShell commands by well over a factor of a hundred. The reason is that LINQ is reactive and uses deferred (lazy) execution: it reports that the job is done, but it actually does the work at the moment you need a result (besides, it caches a lot of results, which is easiest to exclude from a benchmark by starting a new session). Native PowerShell, on the other hand, is rather proactive: it passes each resolved item immediately back into the pipeline, and the next cmdlet (e.g. Export-Csv) might then finalize the item and release it from memory.
In other words, if you have a slow input (see: Advocating native PowerShell) or have a large amount of data to process (e.g. larger than the physical memory available), it might be better and easier to use the native PowerShell approach.
Anyway, if you are comparing results, you should test in practice and end-to-end, and not just on data that is already available in memory.
Building a list
I agree that using the Add() method on a list is much faster than using +=, which concatenates the new item with the current array and then reassigns it back to the array.
But again, both approaches stall the pipeline, as they collect all the data in memory, where you might be better off releasing intermediate results to disk.
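To make the difference concrete, here is a minimal sketch of both ways of building a collection. The 5000 made-up values stand in for real rows, and actual timings will depend on the PowerShell version and data size:

```powershell
# Made-up input; in the real script the items would come from Import-Csv.
$numbers = 1..5000

# Array with +=: each addition creates a new array and copies the old items over.
$array = @()
foreach ($n in $numbers) { $array += "Z$n" }

# Generic list: Add() appends to the existing list.
$list = [System.Collections.Generic.List[string]]::new()
foreach ($n in $numbers) { $list.Add("Z$n") }

"{0} items in the array, {1} items in the list" -f $array.Count, $list.Count
```

Wrapping each loop in Measure-Command should show the gap growing with the number of items.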
HashTables
You will probably find the biggest performance improvement in using a hash table, as it looks up a key by its hash in roughly constant time instead of scanning the whole collection.
As two collections need to be compared with each other, you can't stream both. As explained, it is probably best and easiest to use one hash table for one side and compare it with each item in a stream on the other side. Because you want to compare AllData with each of the other tables, it is best to index that table in memory (in the form of a hash table).
This is how I would do this:
$All = Import-Csv C:\AllData.csv # The main file ($AllMainFile in the question)
$Main = @{}
ForEach ($Item in $All) {
    $Main[$Item.EmployeeNumber] = @{MainName = $Item.Name; MainDomain = $Item.Domain}
}
ForEach ($Name in 'AAA', 'BBB', 'CCC', 'DDD') {
    Import-Csv "C:\$Name.csv" | Where-Object {$Main.ContainsKey($_.EmployeeNumber)} | ForEach-Object {
        [PSCustomObject](@{EmployeeNumber = $_.EmployeeNumber; Name = $_.Name; Domain = $_.Domain} + $Main[$_.EmployeeNumber])
    } | Export-Csv "C:\Output$Name.csv"
}
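Note that EmployeeNumber is not unique in the sample data (Z001 appears twice), and assigning to $Main[...] keeps only the last row per key. If every row per key is needed, one option is to store a list per key. A minimal sketch with made-up in-memory rows; $index is a hypothetical name, not part of the code above:

```powershell
# Made-up rows in the shape of the main file from the question.
$all = 'EmployeeNumber,Name,Domain',
       'Z001,ABC,Test.com',
       'Z002,DEF,Test.com',
       'Z001,ABC,Test2.com' | ConvertFrom-Csv

# Index: EmployeeNumber -> list of all rows with that number.
$index = @{}
foreach ($item in $all) {
    if (-not $index.ContainsKey($item.EmployeeNumber)) {
        $index[$item.EmployeeNumber] = [System.Collections.Generic.List[object]]::new()
    }
    $index[$item.EmployeeNumber].Add($item)
}

# A lookup is now one hash table access instead of a scan over all rows.
$index['Z001'].Domain   # Test.com and Test2.com
```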
Addendum
Based on the comment (and the duplicates in the lists), it appears that actually a join on all keys is requested and not just on the EmployeeNumber. For this you need to concatenate the concerned keys (separated with a separator that is not used in the data) and use that as key for the hash table.
Not in the question but from the comment it appears also that full-join is expected. For the right-join part this can be done by returning the right object in case there is no match found in the main table ($Main.ContainsKey($Key)). For the left-join part this is more complex as you will need to track ($InnerMain) which items in main are already matched and return the leftover items in the end:
$Main = @{}
$Separator = "`t" # Choose a separator that isn't used in any value
ForEach ($Item in $All) {
    $Key = $Item.EmployeeNumber, $Item.Name, $Item.Domain -Join $Separator
    $Main[$Key] = @{MainEmployeeNumber = $Item.EmployeeNumber; MainName = $Item.Name; MainDomain = $Item.Domain} # What output is expected?
}
ForEach ($Name in 'AAA', 'BBB', 'CCC', 'DDD') {
    $InnerMain = @{} # Keys of the main items that were matched in this file
    Import-Csv "C:\$Name.csv" | ForEach-Object {
        $Key = $_.EmployeeNumber, $_.Name, $_.Domain -Join $Separator
        If ($Main.ContainsKey($Key)) {
            $InnerMain[$Key] = $True
            [PSCustomObject](@{EmployeeNumber = $_.EmployeeNumber; Name = $_.Name; Domain = $_.Domain} + $Main[$Key])
        } Else {
            [PSCustomObject](@{EmployeeNumber = $_.EmployeeNumber; Name = $_.Name; Domain = $_.Domain; MainEmployeeNumber = $Null; MainName = $Null; MainDomain = $Null})
        }
    } | Export-Csv "C:\Output$Name.csv"
    # Left-join part: append the main items that found no match in this file.
    # A ForEach statement cannot be piped, so the ForEach-Object cmdlet is used here.
    $Main.Keys | Where-Object { !$InnerMain.ContainsKey($_) } | ForEach-Object {
        [PSCustomObject](@{EmployeeNumber = $Null; Name = $Null; Domain = $Null} + $Main[$_])
    } | Export-Csv "C:\Output$Name.csv" -Append
}
Join-Object
Just FYI, I have made a few improvements to the Join-Object cmdlet (use and installation are very simple, see: In Powershell, what's the best way to join two tables into one?), including easier chaining of multiple joins, which might come in handy for a request like this one. I still do not fully understand what exactly you are looking for, though (and have minor questions like: how could the domains differ in a domain column if it is an extract from one specific domain?).
I take the general description "Particularly finding an AD object against 4 domains and store them in variable for attribute comparison" as leading.
Here I presume that $AllMainFile is actually just an intermediate table consisting of a concatenation of all the tables concerned (and not really necessary, just confusing, as it might contain two types of duplicates: the employee numbers from the same domain and the employee numbers from other domains). If this is correct, you can just omit this table using the Join-Object cmdlet:
$AAA = ConvertFrom-Csv @'
EmployeeNumber,Name,Domain
Z001,ABC,Domain1
Z002,DEF,Domain2
Z003,GHI,Domain3
'@
$BBB = ConvertFrom-Csv @'
EmployeeNumber,Name,Domain
Z001,ABC,Domain1
Z002,JKL,Domain2
Z004,MNO,Domain4
'@
$CCC = ConvertFrom-Csv @'
EmployeeNumber,Name,Domain
Z005,PQR,Domain2
Z001,ABC,Domain1
Z001,STU,Domain2
'@
$DDD = ConvertFrom-Csv @'
EmployeeNumber,Name,Domain
Z005,VWX,Domain4
Z006,XYZ,Domain1
Z001,ABC,Domain3
'@
$AAA | FullJoin $BBB -On EmployeeNumber -Discern AAA |
FullJoin $CCC -On EmployeeNumber -Discern BBB |
FullJoin $DDD -On EmployeeNumber -Discern CCC,DDD | Format-Table
Result:
EmployeeNumber AAAName AAADomain BBBName BBBDomain CCCName CCCDomain DDDName DDDDomain
-------------- ------- --------- ------- --------- ------- --------- ------- ---------
Z001           ABC     Domain1   ABC     Domain1   ABC     Domain1   ABC     Domain3
Z001           ABC     Domain1   ABC     Domain1   STU     Domain2   ABC     Domain3
Z002           DEF     Domain2   JKL     Domain2
Z003           GHI     Domain3
Z004                             MNO     Domain4
Z005                                               PQR     Domain2   VWX     Domain4
Z006                                                                 XYZ     Domain1

Multiple Criteria Matching in PowerShell

Hello PowerShell Scriptwriters,
I have an objective to count rows based on matching multiple criteria. My PowerShell script is able to produce the end result, but it consumes too much time (the more rows there are, the more time it takes). Is there a way to optimize my existing code? I've shared my code for your reference.
$csvfile = Import-csv "D:\file\filename.csv"
$name_unique = $csvfile | ForEach-Object {$_.Name} | Select-Object -Unique
$region_unique = $csvfile | ForEach-Object {$_."Region Location"} | Select-Object -Unique
$cost_unique = $csvfile | ForEach-Object {$_."Product Cost"} | Select-Object -Unique
Write-host "Save Time on Report" $csvfile.Length
foreach($nu in $name_unique)
{
$inc = 1
foreach($au in $region_unique)
{
foreach($tu in $cost_unique)
{
foreach ($mainfile in $csvfile)
{
if (($mainfile."Region Location" -eq $au) -and ($mainfile.'Product Cost' -eq $tu) -and ($mainfile.Name -eq $nu))
{
$inc++ #Matching Counter
}
}
}
}
$inc #expected to display Row values with the total count.And export the result as csv
}
You can do this quite simply using the Group-Object cmdlet (alias Group).
$csvfile = Import-csv "D:\file\filename.csv"
$csvfile | Group Name,"Region Location","Product Cost" | Select Name, Count
This gives output something like the below
Name Count
---- ------
f1, syd, 10 2
f2, syd, 10 1
f3, syd, 20 1
f4, melb, 10 2
f2, syd, 40 1
P.S. the code you provided above is not matching all of the fields, it is simply checking the Name parameter (looping through the other parameters needlessly).
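If the report should keep the three criteria as separate columns (for example before exporting with Export-Csv), one option is to read them back from the first row of each group. A minimal sketch with made-up rows; the output column names Region and Cost are chosen for illustration:

```powershell
# Made-up rows using the column names from the question.
$rows = 'Name,Region Location,Product Cost',
        'f1,syd,10',
        'f1,syd,10',
        'f2,syd,10' | ConvertFrom-Csv

# One output row per distinct combination, with its number of occurrences.
$report = $rows | Group-Object Name, 'Region Location', 'Product Cost' | ForEach-Object {
    [pscustomobject]@{
        Name   = $_.Group[0].Name
        Region = $_.Group[0].'Region Location'
        Cost   = $_.Group[0].'Product Cost'
        Count  = $_.Count
    }
}
$report
```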

powershell: Check if any of a bunch of properties is set

I'm importing a csv-file which looks like this:
id,value1.1,value1.2,value1.3,Value2.1,Value2.2,Value3.1,Value3.2
row1,v1.1,,v1.3
row2,,,,v2.1,v2.2
row3,,,,,,,v3.2
Now I want to check, if any of the value-properties in one group is set.
I can do
Import-Csv .\test.csv | where {$_.'Value1.1' -or $_.'Value1.2' -or $_.'Value1.3'}
or
Import-Csv .\test.csv | foreach {
if ($_.Value1 -or $_.Value2 -or $_.Value3) {
Write-Output $_
}
}
But my "real" csv-file contains about 200 columns and I have to check 31 properties x 5 different object types that are mixed up in this csv. So my code will be really ugly.
Is there anything like
where {$_.Value1.*}
or
where {$ArrayWithPropertyNames}
?
You could easily use the Get-Member cmdlet to get the properties which have the correct prefix (just use * as a wildcard after the prefix).
So to achieve what you want you could just filter the data based on whether any of the properties with the correct prefix contains data.
The script below uses your sample data, with a row4 added, and filters the list to find all items which have a value in any property starting with value1.
$csv = @"
id,value1.1,value1.2,value1.3,Value2.1,Value2.2,Value3.1,Value3.2
row1,v1.1,,v1.3
row2,,,,v2.1,v2.2
row3,,,,,,,v3.2
row4,v1.1,,v1.3
"@
$data = ConvertFrom-csv $csv
$data | Where {
    $currentDataItem = $_
    $propertyValues = $currentDataItem |
        # Gets all the properties with the correct prefix
        Get-Member 'value1*' -MemberType NoteProperty |
        # Gets the values for each of those properties
        Foreach { $currentDataItem.($_.Name) } |
        # Only keep the property value if it has a value
        Where { $_ }
    # Could just return $propertyValues, but this makes the intention clearer
    # @(...) makes sure a single value is counted as one item rather than measured as a string
    $hasValueOnPrefixedProperty = @($propertyValues).Count -gt 0
    Write-Output $hasValueOnPrefixedProperty
}
Alternate solution:
$PropsToCheck = 'Value1*'
Import-csv .\test.csv |
Where {
(($_ | Select $PropsToCheck).psobject.properties.value) -match '\S'
}
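Another way to express "any property with this prefix has a value", sketched here on a shortened, made-up version of the sample data, is to filter the psobject.Properties collection of each row by name. The $withValue1 name is hypothetical:

```powershell
# Shortened sample data, built in memory.
$data = 'id,value1.1,value1.2,Value2.1',
        'row1,v1.1,,',
        'row2,,,v2.1' | ConvertFrom-Csv

# Keep rows where at least one property named value1* is non-empty.
$withValue1 = $data | Where-Object {
    @($_.psobject.Properties |
        Where-Object { $_.Name -like 'value1*' -and $_.Value }).Count -gt 0
}
$withValue1
```

Changing the wildcard to 'Value2*' would select row2 instead.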