Compare-Object multi values powershell - powershell

Could you please help me find a way to compare two CSV files in which one column holds multiple values?
File1.csv
Teams,Category,Members
Team1,A,Smith;Johnson
Team1,C,Jones;Miller;Garcia
Team3,E,Wilson;Martinez
Team4,A,Martin;Jackson;White;Williams
File2.csv
Teams,Category,Members
Team1,A,Smith;Johnson
Team2,C,Jones;Miller;Garcia
Team3,E,Wilson;Martinez;Gonzalez;Hall
Team4,A,Martin;Jackson;Williams
Differences:
Gonzalez and Hall were added to Team3
White was removed from Team4
$1 = Import-Csv -Path ".\File1.csv" -Delimiter ','
$2 = Import-Csv -Path ".\File2.csv" -Delimiter ','
Compare-Object $1 $2 -Property Members -PassThru
Result :
Teams Category Members SideIndicator
Team3 E Wilson;Martinez;Gonzalez;Hall =>
Team4 A Martin;Jackson;Williams =>
Team3 E Wilson;Martinez <=
Team4 A Martin;Jackson;White;Williams <=
Expected result:
Teams Category Members SideIndicator
Team3 E Gonzalez and Hall =>
Team4 A White <=

I'd compare the objects first to find the differences (note that I compare two properties, Teams and Members, to avoid missing entries when different teams happen to have the same membership) and then compare the arrays created from the matching objects:
$1 = Import-Csv -Path ".\File1.csv" -Delimiter ','
$2 = Import-Csv -Path ".\File2.csv" -Delimiter ','
$comparisonRes = Compare-Object $1 $2 -Property Teams,Members -PassThru
foreach ($obj in $comparisonRes | Where-Object SideIndicator -eq "=>") {
# $obj = ($comparisonRes | Where-Object SideIndicator -eq "=>")[0]
$matchingEntry = $1 | Where-Object {$_.Teams -eq $obj.Teams}
$matchingEntryMembers = $matchingEntry.Members -split ";"
$currentEntryMembers = $obj.Members -split ";"
$diffMembers = Compare-Object $matchingEntryMembers $currentEntryMembers
# Uncomment to log
# $diffMembers
# Do something with $diffMembers here
}
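To get from that loop to the table shape asked for in the question, a minimal sketch follows. It inlines only the Team3 and Team4 rows so it runs without files, and it assumes that each team name is unique within a file and present in both files; the names $old, $new, $oldByTeam and $result are illustrative, not part of the original answer.

```powershell
# Sample rows inlined so the sketch runs without File1.csv / File2.csv.
$old = @'
Teams,Category,Members
Team3,E,Wilson;Martinez
Team4,A,Martin;Jackson;White;Williams
'@ | ConvertFrom-Csv

$new = @'
Teams,Category,Members
Team3,E,Wilson;Martinez;Gonzalez;Hall
Team4,A,Martin;Jackson;Williams
'@ | ConvertFrom-Csv

# Index the old rows by team name (assumption: Teams is unique per file).
$oldByTeam = @{}
foreach ($row in $old) { $oldByTeam[$row.Teams] = $row }

$result = foreach ($row in $new) {
    $before = @($oldByTeam[$row.Teams].Members -split ';')
    $after  = @($row.Members -split ';')
    # Emit one row per team and side, joining the changed names back together.
    foreach ($side in '=>', '<=') {
        $names = @(Compare-Object $before $after |
            Where-Object SideIndicator -eq $side |
            ForEach-Object InputObject)
        if ($names.Count -gt 0) {
            [pscustomobject]@{
                Teams         = $row.Teams
                Category      = $row.Category
                Members       = $names -join ';'
                SideIndicator = $side
            }
        }
    }
}
$result
```

With this sample data the result has one row for Team3 (the added members, side =>) and one for Team4 (the removed member, side <=). Teams that exist in only one file would need separate handling.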

You might want to use JSON instead of CSV, since JSON supports arrays and numbers. Otherwise each Members value is just a single semicolon-separated string.
file1.json
[
{"Teams":"Team1","Category":"A","Members":["Smith","Johnson"]},
{"Teams":"Team1","Category":"C","Members":["Jones","Miller","Garcia"]},
{"Teams":"Team3","Category":"E","Members":["Wilson","Martinez"]},
{"Teams":"Team4","Category":"A","Members":["Martin","Jackson","White","Williams"]}
]
file2.json
[
{"Teams":"Team1","Category":"A","Members":["Smith","Johnson"]},
{"Teams":"Team2","Category":"C","Members":["Jones","Miller","Garcia"]},
{"Teams":"Team3","Category":"E","Members":["Wilson","Martinez","Gonzalez","Hall"]},
{"Teams":"Team4","Category":"A","Members":["Martin","Jackson","Williams"]}
]
$1 = cat file1.json | convertfrom-json
$2 = cat file2.json | convertfrom-json
Compare-Object $1 $2 -Property Members -PassThru
Teams Category Members SideIndicator
----- -------- ------- -------------
Team3 E {Wilson, Martinez, Gonzalez, Hall} =>
Team4 A {Martin, Jackson, Williams} =>
Team3 E {Wilson, Martinez} <=
Team4 A {Martin, Jackson, White, Williams} <=
Here's a closer answer. Run compare-object on members only one line at a time, then add teams and category to it.
$1 = cat file1.json | convertfrom-json
$2 = cat file2.json | convertfrom-json
for($i = 0; $i -lt $1.length; $i++) {
compare-object $1[$i].members $2[$i].members |
select @{n='Teams'; e={$1[$i].teams}},
@{n='Category'; e={$1[$i].Category}},
@{n='Members'; e={$_.inputobject}},
sideindicator
}
Teams Category Members SideIndicator
----- -------- ------- -------------
Team3 E Gonzalez =>
Team3 E Hall =>
Team4 A White <=
Here's another way, applying a zip function (see "PowerShell/CLI: 'Foreach' loop with multiple arrays") to both lists of objects.
$1 = cat file1.json | convertfrom-json
$2 = cat file2.json | convertfrom-json
function Zip($a1, $a2) { # function allows it to stream
while ($a1) {
$x, $a1 = $a1 # $a1 gets the tail of the list
$y, $a2 = $a2
[tuple]::Create($x, $y)
}
}
zip $1 $2 | % {
$whole = $_ # will lose this $_ in the select
compare-object $whole.item1.members $whole.item2.members |
select @{n='Teams'; e={$whole.item1.teams}},
@{n='Category'; e={$whole.item1.Category}},
inputobject,sideindicator
}
Teams Category InputObject SideIndicator
----- -------- ----------- -------------
Team3 E Gonzalez =>
Team3 E Hall =>
Team4 A White <=

Related

Add a column to a csv file and fill up new column based on an existing column powershell

I have been trying to add a new column to a csv file and populating the new column based on value in an existing column.
I have a table like this:
|name | number | state | desc|
| ---- | ------ |-------|-----|
|a | 1 | n | i |
|b | 2 | n | j |
|c | 3 | l | j |
|d | 4 | m | k |
I want to add a new column data and populate it based on number column matching with an array.
This is my code so far:
$a=("a","b","c")
$b=("p","q","r")
.
.
.
$c= import-csv -Path "C:\..."
$b |where-object {filtered the file based on some criteria}| select-object number, state, desc, @{Name="data"; Expression={Foreach-object {if ($_.number in $a){$_data = "x"}
elseif($_.number in $b){$_.data = "y"}.......} | export-csv -notypeinformation -path "C:\...."
The script runs but does not populate the new column. Please help.
You've got the right idea. Import-Csv will produce an array of objects and you can use Select-Object to add calculated properties, then pipe again to Export-Csv. However, it's not exactly clear from the description or the example code what the expression should be. How do you want to define the new "data" property?
For now I'll work with what we have. The array variables $a & $b will never match anything. Also you can't use ForEach-Object like that, nor will assigning to $data work. The returning value of the Expression script block gets assigned to the property you named data. The following example demonstrates the point:
$a = ( "1", "2", "3")
$b = ( "4", "5", "6")
Import-Csv -Path "C:\temp\12-22-20.csv"|
Select-Object number, state, desc,
@{Name = 'Data'; Expression = { If( $_.Number -in $a ){ 'x' } elseif( $_.Number -in $b ){ 'y' } Else { $null }}} |
Export-Csv -Path "C:\temp\12-22-20_New.csv" -NoTypeInformation
The resulting Csv file will look something like:
number state desc Data
------ ----- ---- ----
1 n i x
2 n j x
3 l j x
4 m k y
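The same idea can be checked without touching any files. The sketch below is only an illustration: it inlines a few rows with ConvertFrom-Csv (the row with number 9 is made up to exercise the $null branch) and shows that whatever the Expression script block returns becomes the value of the new property.

```powershell
# Inline sample rows; the last row is hypothetical and matches neither array.
$rows = @'
name,number,state,desc
a,1,n,i
d,4,m,k
e,9,z,q
'@ | ConvertFrom-Csv

$a = '1', '2', '3'
$b = '4', '5', '6'

# The script block's return value is what ends up in the Data column.
$out = $rows | Select-Object number, state, desc,
    @{ Name = 'Data'; Expression = {
        if     ($_.number -in $a) { 'x' }
        elseif ($_.number -in $b) { 'y' }
        else                      { $null }
    } }

$out | ConvertTo-Csv -NoTypeInformation
```

Swapping ConvertTo-Csv for Export-Csv with a -Path writes the same rows to disk.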
Update: Example Using Add-Member
You do not need to use a loop to add the property:
$a = ( "1", "2", "3")
$b = ( "4", "5", "6")
Import-Csv -Path "C:\temp\12-22-20.csv" |
Add-Member -MemberType ScriptProperty -Name "data" -Value { If( $this.Number -in $a ){ 'x' } elseif( $this.Number -in $b ){ 'y' } Else { $null }} -PassThru |
Export-Csv -Path C:\temp\12-22-20_New.csv -NoTypeInformation
By using a MemberType of ScriptProperty we can make a slight modification to the script block, replacing $_ with $this. The pipeline is an implicit loop. I'm not sure whether there are any drawbacks to using a ScriptProperty, but this exports as expected. This approach doesn't require storing the output in $c, but -PassThru would facilitate that if preferred.
99% of the time Select-Object is used for this. The only difference I'm aware of is that Select-Object converts the objects to PSCustomObjects. Add-Member will preserve the underlying type; however, Import-Csv only outputs PSCustomObjects in the first place, so there's no impact here.
Try iterating over the $c array of imported objects and add the new property to all objects. You want to make sure the new column exists in all of the objects. You can either use Select-Object as in your example, or you can use Add-Member to add it to the imported object.
$a=("a","b","c")
$b=("p","q","r")
...
$c = Import-Csv -Path "C:\..."
$c | ForEach-Object {
$value = ""
# custom logic for value of "data"
# if (...) { $value = ... }
$_ | Add-Member -MemberType NoteProperty -Name "data" -Value $value
}
$c | Export-Csv -NoTypeInformation -path "C:\...."

Group by and add or subtract depending on the indicator

I have a file which has transaction_date, transaction_amount and debit_credit_indicator. I want to write a program which shows for each date total count and total amount.
Total amount is calculated as follows -
if debit_credit_indicator is 'C' add else if 'D' subtract.
I got as far as grouping by indicator but don't know how to proceed afterwards.
My output looks like this:
TRANSACTION_DATE DEBIT_CREDIT_INDICATOR TotalAmount Count
---------------- ---------------------- ----------- -----
2019-02-26 C 1478
2019-02-25 D 100
2019-02-26 D 200
param([string]$inputFileName=30)
(Get-Content $inputFileName) -replace '\|', ',' | Set-Content c:\learnpowershell\test.csv
$transactionData = Import-csv c:\learnpowershell\test.csv | Group-Object -Property TRANSACTION_DATE, DEBIT_CREDIT_INDICATOR
[Array] $newsbData += foreach($gitem in $transactionData)
{
$gitem.group | Select -Unique TRANSACTION_DATE, DEBIT_CREDIT_INDICATOR, `
@{Name = 'TotalAmount';Expression = {(($gitem.group) | measure -Property TRANSACTION_AMOUNT -sum).sum}},
@{Name = 'Count';Expression = {(($gitem.group) | Measure-Object -count).count}}
};
write-output $newsbData
I suppose you replace '|' with ',' because you don't know about the -Delimiter option; otherwise, keep your replace code. Here is my code for your problem:
# import and group by date
import-csv "c:\learnpowershell\test.csv" -Delimiter '|' | group TRANSACTION_DATE | %{
$TotalCredit=0
$TotalDebit=0
$CountRowCredit=0
$CountRowDebit=0
$HasProblem=$false
#calculation by date for every group
$_.Group | %{
if ($_.DEBIT_CREDIT_INDICATOR -EQ 'C')
{
$TotalCredit+=$_.transaction_amount
$CountRowCredit++
}
elseif ($_.DEBIT_CREDIT_INDICATOR -EQ 'D')
{
$TotalDebit+=$_.transaction_amount
$CountRowDebit++
}
else
{
$HasProblem=$true
}
}
#output result
[pscustomobject]@{
TRANSACTION_DATE=$_.Name
CountRow=$_.Count
Credit_Total=$TotalCredit
Credit_CountRow=$CountRowCredit
Debit_Total=-$TotalDebit
Debit_CountRow=$CountRowDebit
Total_DebitCredit=$TotalCredit - $TotalDebit
HasProblem=$HasProblem
}
}
You can append ' | Format-Table' if you want the result printed as a formatted table.
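If only the per-date count and net total are needed, a shorter variant is to turn each amount into a signed number and let Measure-Object add them up. This is a sketch under assumptions: the sample rows reuse the amounts shown in the question, the column order is guessed, and rows whose indicator is neither 'C' nor 'D' are silently skipped.

```powershell
# Pipe-delimited sample data; the column order is an assumption.
$data = @'
TRANSACTION_DATE|TRANSACTION_AMOUNT|DEBIT_CREDIT_INDICATOR
2019-02-25|100|D
2019-02-26|1478|C
2019-02-26|200|D
'@ | ConvertFrom-Csv -Delimiter '|'

$summary = $data | Group-Object TRANSACTION_DATE | ForEach-Object {
    # Credits count as positive, debits as negative; other indicators emit nothing.
    $signed = $_.Group | ForEach-Object {
        $amount = [decimal]$_.TRANSACTION_AMOUNT
        if     ($_.DEBIT_CREDIT_INDICATOR -eq 'C') { $amount }
        elseif ($_.DEBIT_CREDIT_INDICATOR -eq 'D') { -$amount }
    }
    [pscustomobject]@{
        TRANSACTION_DATE = $_.Name
        Count            = $_.Count
        TotalAmount      = ($signed | Measure-Object -Sum).Sum
    }
}
$summary
```

For 2019-02-26 this nets the 1478 credit against the 200 debit; Count is the number of rows for the date regardless of indicator.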

Compare-Object - Separate side columns

Is it possible to display the results of a PowerShell Compare-Object in two columns showing the differences of reference vs difference objects?
For example using my current cmdline:
Compare-Object $Base $Test
Gives:
InputObject SideIndicator
987654 =>
555555 <=
123456 <=
In reality the list is rather long. For easier data reading is it possible to format the data like so:
Base Test
555555 987654
123456
So each column shows which elements exist in that object vs the other.
For bonus points it would be fantastic to have a count in the column header like so:
Base(2) Test(1)
555555 987654
123456
Possible? Sure. Feasible? Not so much. PowerShell wasn't really built for creating this kind of tabular output. What you can do is collect the differences in a hashtable as nested arrays by input file:
$ht = @{}
Compare-Object $Base $Test | ForEach-Object {
$value = $_.InputObject
switch ($_.SideIndicator) {
'=>' { $ht['Test'] += @($value) }
'<=' { $ht['Base'] += @($value) }
}
}
then transpose the hashtable:
$cnt = $ht.Values |
ForEach-Object { $_.Count } |
Sort-Object |
Select-Object -Last 1
$keys = $ht.Keys | Sort-Object
0..($cnt-1) | ForEach-Object {
$props = [ordered]@{}
foreach ($key in $keys) {
$props[$key] = $ht[$key][$_]
}
New-Object -Type PSObject -Property $props
} | Format-Table -AutoSize
To include the item count in the header name change $props[$key] to $props["$key($($ht[$key].Count))"].
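Put together with the counted headers, the whole thing might look like the sketch below. The values in $Base and $Test are sample data ('111111' is a made-up entry present on both sides), and both keys are pre-created so a side with no differences still gets a column.

```powershell
# Sample inputs; '111111' exists in both lists and therefore never shows up.
$Base = '555555', '123456', '111111'
$Test = '987654', '111111'

# Collect the differences per side.
$ht = @{ Base = @(); Test = @() }
Compare-Object $Base $Test | ForEach-Object {
    if ($_.SideIndicator -eq '=>') { $ht['Test'] += $_.InputObject }
    else                           { $ht['Base'] += $_.InputObject }
}

# Transpose: one output row per index, one column per side.
$rows = [Math]::Max($ht['Base'].Count, $ht['Test'].Count)
$table = for ($i = 0; $i -lt $rows; $i++) {
    $props = [ordered]@{}
    foreach ($key in 'Base', 'Test') {
        # Indexing past the end of the shorter list yields $null (an empty cell).
        $props["$key($($ht[$key].Count))"] = $ht[$key][$i]
    }
    [pscustomobject]$props
}
$table | Format-Table -AutoSize
```

Using a for loop instead of 0..($cnt-1) also avoids producing rows when there are no differences at all.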

if loop not seems to be working

I have a code as given below:
$datearray = @()
$temp = Get-Content "C:\temp.txt"
$temp1 = Get-Content "C:\temp1.txt"
foreach ($te in $temp) {
$t = $te -split '-'
$da = $t[1]
$mo = $t[2]
$yea = $t[3]
$fulldate = "$da-$mo-$yea"
if ($temp1 -match $fulldate) {
if ($fulldate -match $te) {
$datearray += $_
$fmt = 'dd-MM-yy-HH-mm'
$culture = [Globalization.CultureInfo]::InvariantCulture
*!* $datearray | sort { [DateTime]::ParseExact(($_ -split '-', 2)[1], $fmt, $culture) } | select -Last 1 | Add-Content "c:\temp4.txt"
} else {
#some operation
}
} else {
#some operation
}
}
For reference, this is what temp1.txt looks like:
17-07-15
18-07-15
19-07-15
20-07-15
21-07-15
22-07-15
23-07-15
temp.txt is:
testdatabase-17-07-15-22-00
testdatabase-17-07-15-23-00
testdatabase-21-07-15-10-00
testdatabase-21-07-15-23-00
The problem is that whenever execution reaches the line marked with *!*, it jumps back to the top of the foreach loop every time. The marked code is never executed.
Can someone please tell me the solution?
Use the Group-Object cmdlet to group the databases by date, then select the most recent database name from each group:
$fmt = 'dd-MM-yy-HH-mm'
$culture = [Globalization.CultureInfo]::InvariantCulture
Get-Content 'C:\temp.txt' |
select @{n='Timestamp';e={[DateTime]::ParseExact(($_ -split '-', 2)[1], $fmt, $culture)}},
@{n='Database';e={$_}} |
group { $_.Timestamp.Date } |
% { $_.Group | sort Timestamp | select -Last 1 -Expand Database }
The code uses a select statement to transform the list of lines into a list of custom objects with a Timestamp and a Database property in order to simplify grouping and sorting the database names by date.
Inspecting the output after each step of the pipeline should help you understand the logic behind this. Get-Content produces a list of strings with the lines from the file:
PS C:\> Get-Content 'C:\temp.txt'
testdatabase-17-07-15-22-00
testdatabase-17-07-15-23-00
testdatabase-21-07-15-10-00
testdatabase-21-07-15-23-00
By using Select-Object with calculated properties the list of strings is transformed into a list of custom objects with 2 properties, the database name and the timestamp (as a DateTime object):
PS C:\> Get-Content 'C:\temp.txt' |
>> select @{n='Timestamp';e={[DateTime]::ParseExact(($_ -split '-', 2)[1], $fmt, $culture)}},
>> @{n='Database';e={$_}}
>>
Timestamp Database
--------- --------
17.07.2015 22:00:00 testdatabase-17-07-15-22-00
17.07.2015 23:00:00 testdatabase-17-07-15-23-00
21.07.2015 10:00:00 testdatabase-21-07-15-10-00
21.07.2015 23:00:00 testdatabase-21-07-15-23-00
Grouping these objects by the date portion of the timestamp gets you a list of GroupInfo objects whose Group property contains a list of the database names for a given date:
PS C:\> Get-Content 'C:\temp.txt' |
>> select @{n='Timestamp';e={[DateTime]::ParseExact(($_ -split '-', 2)[1], $fmt, $culture)}},
>> @{n='Database';e={$_}} |
>> group { $_.Timestamp.Date }
>>
Count Name Group
----- ---- -----
2 17.07.2015 00:00:00 {@{Timestamp=17.07.2015 22:00:00; Database=testdatabase-17-07-15-22-00}, @{Timestamp...
2 21.07.2015 00:00:00 {@{Timestamp=21.07.2015 10:00:00; Database=testdatabase-21-07-15-10-00}, @{Timestamp...
The ForEach-Object loop then sorts the elements of each group by timestamp and selects the last (most recent) database name from each group:
PS C:\> Get-Content 'C:\temp.txt' |
>> select @{n='Timestamp';e={[DateTime]::ParseExact(($_ -split '-', 2)[1], $fmt, $culture)}},
>> @{n='Database';e={$_}} |
>> group { $_.Timestamp.Date } |
>> % { $_.Group | sort Timestamp | select -Last 1 -Expand Database }
>>
testdatabase-17-07-15-23-00
testdatabase-21-07-15-23-00

Convert the result of select to array

I have the following code.
$l = @("A", "B", "X", "Y")
echo "A,B,X`n1,2,3,4" > .\myFile # Create test file
$f = cat myFile | ConvertFrom-Csv | gm -MemberType NoteProperty | select Name
compare $l $f
$a = .... # convert $f to array
compare $l $a
How can I convert $f to an array so it can be compared with an array? Wrapping it in @(...) doesn't work.
I got the following result when comparing $l and $f.
PS C:\Users\nick> compare $l $f
InputObject SideIndicator
----------- -------------
@{Name=A} =>
@{Name=B} =>
@{Name=X} =>
A <=
B <=
X <=
Y <=
Replace select Name with select -Expand Name or ForEach-Object { $_.Name }.
Another approach, if you are looking to get an array from a single property would be to use the "ExpandProperty" switch from Select like this:
$f = cat myFile | ConvertFrom-Csv | gm -MemberType NoteProperty | Select -ExpandProperty Name
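To see why this helps, here is a small sketch that skips the test file and feeds the CSV text straight to ConvertFrom-Csv (the data row is shortened to three values). The variable names $props, $objects, $strings and $diff are illustrative.

```powershell
$l = @('A', 'B', 'X', 'Y')

# Same pipeline as in the question, with the CSV lines passed in directly.
$props = 'A,B,X', '1,2,3' | ConvertFrom-Csv | Get-Member -MemberType NoteProperty

$objects = $props | Select-Object Name                  # objects that each carry a Name property
$strings = $props | Select-Object -ExpandProperty Name  # the bare property values

$objects[0] -is [string]   # False: still a wrapper object, so it never equals 'A'
$strings[0] -is [string]   # True: plain strings compare as expected

$diff = Compare-Object $l $strings
$diff
```

With plain strings on both sides, only the entry missing from the CSV header ('Y') is reported as a difference.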
You can cast both objects to ArrayLists then compare them.
[System.Collections.ArrayList]$array1 = $l
[System.Collections.ArrayList]$array2 = $f
Try this:
$a = New-Object System.Collections.ArrayList;
$f | % { $null = $a.Add($_.Name); }
compare $l $a;