Join-Object two different csv files using PowerShell - powershell

The first .csv file contains one month's backup size in KB per client name. The second .csv file contains the following month's backup size in KB per client name.
Column A lists the client names, column B has the corresponding policy name, and the last column holds the backup size in KB (e.g. 487402463).
If the difference in size for a client (1638838488 - 1238838488 KB = 0.37 TB) is greater than 0.10 TB, the result should be written in TB to a CSV file like the one below.
A client may also be associated with multiple policy names.
My question: I want to handle some additional cases.
The backup size may decrease in the next month, as for hostname15,Company_Policy_11.
A client such as hostname55,Company_Policy_XXX may have a different policy name in the second file.
There may be duplicate client and policy name pairs, such as hostnameXX,Company_Policy_XXX,0 and hostnameXX,Company_Policy_XXX,41806794. If such a pair does not exist in CSV2, I want to display the difference as negative (-0.14), as shown below. A duplicate pair may also exist in CSV2, such as hostnameZZ,Company_Policy_XXX.
Lastly, a pair may exist only in CSV2, such as hostnameSS,Company_Policy_XXX.
I used the Join-Object module. https://github.com/ili101/Join-Object
Example CSVFile1.csv
Client Name,Policy Name,KB Size
hostname1,Company_Policy,487402463
hostname2,Company_Policy,227850336
hostname3,Company_Policy_11,8360960
hostname4,Company_Policy_11,1238838488
hostname15,Company_Policy_11,3238838488
hostname1,Company_Policy_55,521423110
hostname10,Company_Policy,28508975
hostname3,Company_Policy_66,295925
hostname5,Company_Policy_22,82001824
hostname2,Company_Policy_33,26176885
hostnameXX,Company_Policy_XXX,0
hostnameXX,Company_Policy_XXX,141806794
hostnameYY,Company_Policy_XXX,121806794
hostname55,Company_Policy_XXX,41806794
hostnameZZ,Company_Policy_XXX,0
hostnameZZ,Company_Policy_XXX,141806794
Example CSVFile2.csv
Client Name,Policy Name,KB Size
hostname1,Company_Policy,487402555
hostname2,Company_Policy,227850666
hostname3,Company_Policy_11,8361200
hostname4,Company_Policy_11,1638838488
hostname1,Company_Policy_55,621423110
hostname15,Company_Policy_11,1238838488
hostname10,Company_Policy,28908975
hostname3,Company_Policy_66,295928
hostname5,Company_Policy_22,92001824
hostname2,Company_Policy_33,36176885
hostname22,Company_Policy,291768854
hostname23,Company_Policy,291768854
hostname55,Company_Policy_BBB,191806794
hostnameZZ,Company_Policy_XXX,0
hostnameZZ,Company_Policy_XXX,291806794
hostnameSS,Company_Policy_XXX,0
hostnameSS,Company_Policy_XXX,291806794
Desired Output :
Client Name,Policy Name,TB Size
hostname4,Company_Policy_11,0.37
hostname22,Company_Policy,0.27
hostname23,Company_Policy,0.27
hostnameYY,Company_Policy_XXX,-0.12
hostnameXX,Company_Policy_XXX,-0.14
hostname15,Company_Policy_11,-2
hostname55,Company_Policy_BBB,0.15
hostnameZZ,Company_Policy_XXX,0.15
hostnameSS,Company_Policy_XXX,0.29
Here is my script so far :
$CSV2 | FullJoin $CSV1 `
-On 'Client Name','Policy Name' `
-Property 'Client Name',
'Policy Name',
@{'TB Size' = {[math]::Round(($Left.'KB Size' - $Right.'KB Size') * 1KB / 1TB, 2)}} |
Where-Object {[math]::Abs($_.'TB Size') -gt 0.10} | Export-Csv C:\Toolbox\DataReport.csv -NoTypeInformation

You could do something similar to the following. This assumes you want to subtract CSV1 values from CSV2 values.
# Read CSV files and make CSV1 sizes negative. Makes summing totals simpler.
$1 = Import-Csv CSVFile1.csv | Foreach-Object { $_.'KB Size' = -$_.'KB Size'; $_ }
$2 = Import-Csv CSVFile2.csv
# Calculated Properties to be used with Select-Object
$CalculatedProperties = @{n='Client Name';e={$_.Group.'Client Name' | Get-Unique}},
@{n='Policy Name';e={$_.Group.'Policy Name' | Get-Unique}},
@{n='TB Size';e={[math]::Round(($_.Group.'KB Size' | Measure -Sum).Sum*1KB/1TB,2)}}
# Grouping objects based on unique client and policy name combinations
$1 + $2 | Group-Object 'Client Name','Policy Name' |
Select-object $CalculatedProperties |
Where {[math]::Abs($_.'TB Size') -gt 0.10}
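To check the arithmetic behind this approach, here is a minimal, self-contained sketch that uses a few inline rows taken from the question instead of the real files. The variable names ($old, $new, $report) are only illustrative, and the sketch assumes every "KB Size" value is numeric.

```powershell
# Inline sample rows standing in for CSVFile1.csv and CSVFile2.csv
$old = @'
Client Name,Policy Name,KB Size
hostname4,Company_Policy_11,1238838488
hostnameYY,Company_Policy_XXX,121806794
'@ | ConvertFrom-Csv

$new = @'
Client Name,Policy Name,KB Size
hostname4,Company_Policy_11,1638838488
hostname22,Company_Policy,291768854
'@ | ConvertFrom-Csv

# Negate the old sizes so that a plain sum per group yields (new - old)
foreach ($row in $old) { $row.'KB Size' = -[double]$row.'KB Size' }

$report = @($old) + @($new) |
    Group-Object 'Client Name', 'Policy Name' |
    ForEach-Object {
        $kb = ($_.Group | Measure-Object -Property 'KB Size' -Sum).Sum
        [pscustomobject]@{
            'Client Name' = $_.Group[0].'Client Name'
            'Policy Name' = $_.Group[0].'Policy Name'
            # KB -> bytes -> TB, rounded to two decimals
            'TB Size'     = [math]::Round($kb * 1KB / 1TB, 2)
        }
    } |
    Where-Object { [math]::Abs($_.'TB Size') -gt 0.10 }

$report | Format-Table
```

Rows that exist in only one file still form a group of their own, so they show up with a positive size (CSV2 only) or a negative size (CSV1 only).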

Related

Powershell - Compare CSV 1 to CSV 2 and then update CSV1

I am not looking for a writing service, but please can someone point me in the right direction, as I am completely at a loss as to how to proceed.
Overview
I have a CSV which contains a lot of data, some of which comes from a script and some of which is manually input. I can run the script and get new data, which is good. What I would like to do is find a way to compare the original CSV 1 to the new CSV 2 and update CSV 1.
Code I currently have
$Vips_to_check = @{}
Import-Csv 'C:\Users\user\Documents\20221201\Netscaler VIPs per Cluster_edited - Raw Data.csv' |
Where-Object {$_.PRD -match "No PRD code from VIP IP and VIP has no backend IPs" -or
$_.PRD -match "No PRD code found from VIP or backend IPs" -or
$_.PRD -match "No PRD code found from backend IPs" -and
$_.ipv46 -notcontains "0.0.0.0"} |
$Results_from_PIM = Import-Csv 'C:\Users\user\Documents\20221201\VIP-Owners_edited.csv'
Both of the CSV's have the same headers and layout which is good. I assume!
CSV 1
Name IPV46 Port Curstate Suggested PRD Display Name tech Owner Slack Channel Support Email
name 1 1.2.3.4 8080 Down No No No No No No No
CSV 2
Name IPV46 Port Curstate Suggested PRD Display Name tech Owner Slack Channel Support Email
name 1 1.2.3.4 8080 Down No PRD123 TMOL Gary TMOL Support Support@email.com nsr.sys
I would guess at creating a hashtable but I just can't seem to get my head around the format of them. I tried
$ht = $Results_from_pim @{}
$_.Name = (cant figure out how to reference the cell)
$_.PRD =
$_.("Display Name")
$_.("Tech Owner")
Once I have the data in the hash table how do I overwrite the CSV 1 data?
Any points or guides would be great. I have tried reading up on https://learn.microsoft.com/en-gb/powershell/scripting/learn/deep-dives/everything-about-hashtable?view=powershell-7.3 and https://learn.microsoft.com/en-us/powershell/scripting/learn/deep-dives/everything-about-pscustomobject?view=powershell-7.3
But that left me even more confused.
At the moment the difference is only 4 or 5 entries and it would have been quicker for me to edit them manually in Excel, but as this script gets larger I can see it becoming more time-consuming to do manually.
As always thank you.
UPDATE
$ht = @{}
foreach ($item in $Results_from_PIM) {
"name = $($item.name)"
"prd = $($item.PRD)"
"Display Name = $($item.'Display Name')"
"Tech Owner = $($item.'Tech Owner')"
"Slack Channel = $($item.'Slack Channel')"
"Support Email = $($Item.'Support Email')"
}
I have created the hash table that I wanted from the CSV 2. Just got to get it to compare to CSV 1.
Update 2
Further to @theo's request, I have adjusted the question. To clarify: when I merge the CSVs, it is based on matching the Name, IPV46 and Port in both files and then moving the updated data from CSV2 into CSV1.
You can do that with the code below (no extra module needed):
$csv1 = 'C:\Users\user\Documents\20221201\Netscaler VIPs per Cluster_edited - Raw Data.csv'
$csv2 = 'C:\Users\user\Documents\20221201\VIP-Owners_edited.csv'
$Results_from_PIM = Import-Csv -Path $csv2
$newData = Import-Csv -Path $csv1 | ForEach-Object {
$search = $_.Name + $_.IPV46 + $_.Port # combine these fields into a single string
$compare = $Results_from_PIM | Where-Object { ($_.Name + $_.IPV46 + $_.Port) -eq $search }
if ($compare) {
# output the result from csv2
$compare
}
else {
# output the original row from csv1
$_
}
}
# now you can save the updated data to a new file or overwrite csv1 if you like
$csv3 = 'C:\Users\user\Documents\20221201\VIP-Owners_Updated.csv'
$newData | Export-Csv -Path $csv3 -NoTypeInformation
P.S. Please read about Formatting
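Since the question asks about hashtables, here is a hedged sketch of the same merge using a hashtable as a lookup index instead of scanning CSV 2 with Where-Object for every row. The inline sample data, the reduced column set and the names $lookup and $merged are assumptions made for illustration only.

```powershell
# Hypothetical sample data standing in for the two CSV files
$csv1 = @'
Name,IPV46,Port,PRD
name 1,1.2.3.4,8080,No
name 2,5.6.7.8,443,No
'@ | ConvertFrom-Csv

$csv2 = @'
Name,IPV46,Port,PRD
name 1,1.2.3.4,8080,PRD123
'@ | ConvertFrom-Csv

# Index CSV 2 by a composite key; the '|' separator keeps the fields apart
$lookup = @{}
foreach ($row in $csv2) {
    $key = ($row.Name, $row.IPV46, $row.Port) -join '|'
    $lookup[$key] = $row
}

# Emit the CSV 2 row when the key matches, otherwise the original CSV 1 row
$merged = foreach ($row in $csv1) {
    $key = ($row.Name, $row.IPV46, $row.Port) -join '|'
    if ($lookup.ContainsKey($key)) { $lookup[$key] } else { $row }
}

$merged | Format-Table
```

Building the index once means each CSV 1 row costs a single lookup, which may matter as the files grow. Note that keys in a hashtable created with @{} are compared case-insensitively.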
After being directed to "In PowerShell, what's the best way to join two tables into one?" by @jdweng, I ran the following, which seems to have met my requirements:
Install-Module -Name JoinModule -Scope CurrentUser
$Vips_to_check = Import-Csv 'C:\Users\user\Documents\20221201\Netscaler VIPs per Cluster - Raw Data.csv'
$Results_from_PIM = Import-Csv 'C:\Users\user\Documents\20221201\VIP-Owners.csv'
$Vips_to_check | Update-Object $Results_from_PIM -On name, Ipv46, port | Export-Csv 'C:\Users\user\Documents\20221201\Final_data1.csv'
Going to do further testing with larger data sets but appears to work as required.

How to calculate the avg of a column from a csv which does not have headers with PowerShell?

The data inside the csv, which does not have headers (they should be: pkg, pp0, pp1, dram, time):
37.0036,27.553,0,0,0.100111
35.622,26.1947,0,0,0.200702
34.931,25.5656,0,0,0.300765
34.814,25.4795,0,0,0.400826
34.924,25.5676,0,0,0.500888
34.8971,25.5443,0,0,0.600903
I want to get the avg value of the columns and produce output like:
The avg of Pkg: xxx
The avg of pp0: xxx
The avg of pp1: xxx
The avg of time: xxx
How can I do this?
When you use Import-Csv, PowerShell treats the first row as the header row. The error you're getting,
import-csv : The member "0" is already present.
occurs because that first row contains the value 0 twice, so two columns would end up with the same header name. To supply your own header names, use the -Header parameter of Import-Csv.
From there, you can use the Measure-Object cmdlet to determine the averages:
$myData = Import-Csv .\a.csv -Header pkg,pp0,pp1,dram,time
Write-Host "The avg of Pkg: $(($myData | Measure-Object -Property pkg -Average).Average)"
Write-Host "The avg of pp0: $(($myData | Measure-Object -Property pp0 -Average).Average)"
Write-Host "The avg of pp1: $(($myData | Measure-Object -Property pp1 -Average).Average)"
Write-Host "The avg of time: $(($myData | Measure-Object -Property time -Average).Average)"
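The four near-identical lines above can also be written as a loop. The following is a minimal sketch that uses the sample rows from the question inline (via ConvertFrom-Csv instead of a file) so it can be run as-is; the $averages table is an illustrative name, not part of the original answer.

```powershell
# Sample rows from the question; -Header supplies the missing column names
$rows = @'
37.0036,27.553,0,0,0.100111
35.622,26.1947,0,0,0.200702
34.931,25.5656,0,0,0.300765
34.814,25.4795,0,0,0.400826
34.924,25.5676,0,0,0.500888
34.8971,25.5443,0,0,0.600903
'@ | ConvertFrom-Csv -Header pkg, pp0, pp1, dram, time

# Collect one average per requested column
$averages = [ordered]@{}
foreach ($name in 'pkg', 'pp0', 'pp1', 'time') {
    $averages[$name] = ($rows | Measure-Object -Property $name -Average).Average
    # ${name} is needed because "$name:" would be parsed as a scope or drive prefix
    "The avg of ${name}: $($averages[$name])"
}
```

Keeping the results in $averages makes them available for later use, for example to export them or compare runs.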

Powershell: Create CSV entries

I have a CSV File and in Column 1 are words.
I want to modify the words, for example I want to add the String "cat" at the end and write it down in Column 2.
I posted a question a few days ago where @Theo achieved this:
$CSV = Import-CSV -Path 'C:\path.csv' -Header Column1
$newCsv = foreach ($row in $CSV) {
# output an Object that gets collected in variable $newCsv
# Select-Object * takes everything already in $row,
# @{Name = 'Column2'; Expression = {$row.Column1 + 'cat'}} adds the extra column to it.
$row | Select-Object *, @{Name = 'Column2'; Expression = {$row.Column1 + 'cat'}}
}
# output on screen:
$newCsv
# output to new CSV file
$newCsv | Export-Csv -Path 'C:\path.csv' -NoTypeInformation
Output (on screen):
Column1 Column2
------- -------
Wild Wildcat
Copy Copycat
Hell Hellcat
Tom Tomcat
Snow Snowcat
So far so good. Now I want to generate a password and write it to Column 3.
I would create a variable holding a randomized password, using one of the many PowerShell generators already posted online.
I would also like to store Column1 and Column2 in variables, because I need all three column entries further down the road to create a .txt file that includes them.
In case you're curious why I create a .txt file AND a CSV:
Column1 is basically a system name, Column2 a username and Column3 a password.
I document the access data in the CSV and create a script in the .txt file, so I can create the users in Exchange by script (for 1000+ systems).
I appreciate any hint!
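One possible shape for this, offered only as a sketch: keep the three values in variables inside the loop and emit them as one object per row. New-SamplePassword is a hypothetical placeholder built on Get-Random, which is not a cryptographically strong source, so it should be swapped for whichever generator you actually trust. The inline sample rows stand in for the real CSV file.

```powershell
# Hypothetical helper: NOT cryptographically strong, replace with your own generator
function New-SamplePassword {
    param([int]$Length = 12)
    $chars = [char[]]((48..57) + (65..90) + (97..122))   # 0-9, A-Z, a-z
    -join (1..$Length | ForEach-Object { $chars | Get-Random })
}

# Inline sample data standing in for the CSV file
$csv = @'
Wild
Copy
'@ | ConvertFrom-Csv -Header Column1

$newCsv = foreach ($row in $csv) {
    # The three values are plain variables, so they can be reused for the .txt file
    $system   = $row.Column1
    $username = $system + 'cat'
    $password = New-SamplePassword

    [pscustomobject]@{
        Column1 = $system
        Column2 = $username
        Column3 = $password
    }
}

$newCsv | Format-Table
```

Because each row is a complete object, the same $newCsv collection can be piped to Export-Csv and also looped over to write the script lines for the .txt file.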

Combine two CSV files in powershell without changing the order of columns

I have "a.csv" and "b.csv" . I tried to merge them with below commands
cd c:/users/mine/test
Get-Content a.csv, b.csv | Select-Object -Unique | Set-Content -Encoding ASCII joined.csv
But the output file had the rows of b.csv appended below the rows of a.csv. I want the columns of b.csv appended to the right of the columns of a.csv instead.
Vm Resource SID
mnvb vclkn vxjcb
vjc.v vnxc,m bvkxncb
Vm 123 456 789
mnvb apple banana orange
vjc.v lemon onion tomato
My expected output should be like below. Without changing the order
Vm Resource SID 123 456 789
mnvb vclkn vxjcb apple banana orange
vjc.v vnxc,m bvkxncb lemon onion tomato
From here, there are two ways to do it -
Join-Object custom function by RamblingCookieMonster. This is short and sweet. After you import the function in your current PoSh environment, you can use the below command to get your desired result -
Join-Object -Left $a -Right $b -LeftJoinProperty vm -RightJoinProperty vm | Export-Csv Joined.csv -NTI
The accepted answer from mklement which would work for you as below -
# Read the 2 CSV files into collections of custom objects.
# Note: This reads the entire files into memory.
$doc1 = Import-Csv a.csv
$doc2 = Import-Csv b.csv
$outFile = 'Joined.csv'
# Determine the column (property) names that are unique to document 2.
$doc2OnlyColNames = (
Compare-Object $doc1[0].psobject.properties.name $doc2[0].psobject.properties.name |
Where-Object SideIndicator -eq '=>'
).InputObject
# Initialize an ordered hashtable that will be used to temporarily store
# each document 2 row's unique values as key-value pairs, so that they
# can be appended as properties to each document-1 row.
$htUniqueRowD2Props = [ordered] @{}
# Process the corresponding rows one by one, construct a merged output object
# for each, and export the merged objects to a new CSV file.
$i = 0
$(foreach($rowD1 in $doc1) {
# Get the corresponding row from document 2.
$rowD2 = $doc2[$i++]
# Extract the values from the unique document-2 columns and store them in the ordered
# hashtable.
foreach($pname in $doc2OnlyColNames) { $htUniqueRowD2Props.$pname = $rowD2.$pname }
# Add the properties represented by the hashtable entries to the
# document-1 row at hand and output the augmented object (-PassThru).
$rowD1 | Add-Member -NotePropertyMembers $htUniqueRowD2Props -PassThru
}) | Export-Csv -NoTypeInformation -Encoding Utf8 $outFile
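To see the row-by-row merge in isolation, here is a small sketch that runs on inline copies of the sample data from the question. It builds a new ordered object per row rather than using Add-Member, which is a swapped-in variation and not the accepted answer's exact method. Like the code above, it assumes both files list their rows in the same order; the names $extraCols and $joined are illustrative.

```powershell
# Inline copies of a.csv and b.csv (the value containing a comma is quoted)
$doc1 = @'
Vm,Resource,SID
mnvb,vclkn,vxjcb
vjc.v,"vnxc,m",bvkxncb
'@ | ConvertFrom-Csv

$doc2 = @'
Vm,123,456,789
mnvb,apple,banana,orange
vjc.v,lemon,onion,tomato
'@ | ConvertFrom-Csv

# Columns that exist only in the second document, in their original order
$doc1Cols  = $doc1[0].psobject.Properties.Name
$extraCols = $doc2[0].psobject.Properties.Name | Where-Object { $_ -notin $doc1Cols }

$i = 0
$joined = foreach ($row1 in $doc1) {
    $row2 = $doc2[$i++]            # rows are paired purely by position
    $out  = [ordered]@{}
    foreach ($p in $row1.psobject.Properties) { $out[$p.Name] = $p.Value }
    foreach ($name in $extraCols)             { $out[$name]   = $row2.$name }
    [pscustomobject]$out
}

$joined | ConvertTo-Csv -NoTypeInformation
```

The ordered hashtable is what keeps the columns of a.csv first and the extra columns of b.csv after them.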

Compare two different csv files using PowerShell

I'm looking for a way to compare two .csv files and report the differences.
The first .csv file contains one month's backup size in KB per client name. The second .csv file contains the following month's backup size in KB per client name.
Column A lists the client names, column B has the corresponding policy name, and the last column holds the backup size in KB (e.g. 487402463).
If the difference in size for a client (1638838488 - 1238838488 KB = 0.37 TB) is greater than 0.10 TB, the result should be written in TB to a CSV file like the one below.
A client may also be associated with multiple policy names.
My question: I want to handle some additional cases. Sometimes there may be duplicate client and policy name pairs such as hostnameXX,Company_Policy_XXX, or pairs that differ only in case, such as HOSTNAMEXX,Company_Policy_XXX.
Additionally, if, say, hostnameYY,Company_Policy_XXX,41806794 does not exist in CSV2, I want to display it as negative, as shown below.
I used the Join-Object module.
Example CSVFile1.csv
Client Name,Policy Name,KB Size
hostname1,Company_Policy,487402463
hostname2,Company_Policy,227850336
hostname3,Company_Policy_11,8360960
hostname4,Company_Policy_11,1238838488
hostname1,Company_Policy_55,521423110
hostname10,Company_Policy,28508975
hostname3,Company_Policy_66,295925
hostname5,Company_Policy_22,82001824
hostname2,Company_Policy_33,26176885
hostnameXX,Company_Policy_XXX,0
hostnameXX,Company_Policy_XXX,41806794
hostnameYY,Company_Policy_XXX,41806794
Example CSVFile2.csv
Client Name,Policy Name,KB Size
hostname1,Company_Policy,487402555
hostname2,Company_Policy,227850666
hostname3,Company_Policy_11,8361200
hostname4,Company_Policy_11,1638838488
hostname1,Company_Policy_55,621423110
hostname10,Company_Policy,28908975
hostname3,Company_Policy_66,295928
hostname5,Company_Policy_22,92001824
hostname2,Company_Policy_33,36176885
hostname22,Company_Policy,291768854
hostname23,Company_Policy,291768854
Desired Output :
Client Name,Policy Name,TB Size
hostname4,Company_Policy_11,0.37
hostname22,Company_Policy,0.27
hostname23,Company_Policy,0.27
hostnameYY,Company_Policy_XXX,-0.03
hostnameXX,Company_Policy_XXX,-0.04
Using this Join-Object cmdlet (see also: what's the best way to join two tables into one?):
$CSV2 | FullJoin $CSV1 `
-On 'Client Name','Policy Name' `
-Property 'Client Name',
'Policy Name',
@{'TB Size' = {[math]::Round(($Left.'KB Size' - $Right.'KB Size') * 1KB / 1TB, 2)}} |
Where-Object {[math]::Abs($_.'TB Size') -gt 0.01}
Result:
Client Name Policy Name TB Size
----------- ----------- -------
hostname4 Company_Policy_11 -0.37
hostname1 Company_Policy_55 -0.09
hostnameXX Company_Policy_XXX 0.04
hostnameYY Company_Policy_XXX 0.04
hostname22 Company_Policy -0.27
hostname23 Company_Policy -0.27
Update 2019-11-24
Improved -Where parameter which will now also apply to outer joins.
You can now use the -Where parameter instead of the Where-Object cmdlet for this type of query, e.g.:
$Actual = $CSV2 | FullJoin $CSV1 `
-On 'Client Name','Policy Name' `
-Property 'Client Name',
'Policy Name',
@{'TB Size' = {[math]::Round(($Left.'KB Size' - $Right.'KB Size') / 1GB, 2)}} `
-Where {[math]::Abs($Left.'KB Size' - $Right.'KB Size') -gt 100MB}
The advantage of using the -Where parameter is that there is a slight performance improvement as some output objects aren't required to be created at all.
Note 1: The -Where parameter applies to the $Left and $Right objects that represent respectively each $LeftInput and $RightInput object and not the Output Object. In other words you can't use e.g. the calculated TB Size property in the -Where expression for this example.
Note 2: The $Right object always exists in a Left Join or full join even if there is no relation. In case there is no relation, all properties of the $Right object will be set to $Null. The same applies to the $Left object in a right join or full join.
I have never used the Join-Object module, so I wrote it using standard cmdlets.
$data1 = Import-Csv "CSVFile1.csv"
$data1 | ForEach-Object { $_."KB Size" = -1 * $_."KB Size" } # Convert to negative value
$data2 = Import-Csv "CSVFile2.csv"
@($data2; $data1) | Group-Object "Client Name","Policy Name" | ForEach-Object {
$size = [Math]::Round(($_.Group | Measure-Object "KB Size" -Sum).Sum * 1KB / 1TB, 2)
if ($size -ge 0 -and $size -lt 0.1) { return }
[pscustomobject]@{
"Client Name" = $_.Group[0]."Client Name"
"Policy Name" = $_.Group[0]."Policy Name"
"TB Size" = $size
}
}