I have a csv like this:
MPN,Per_Pallet,Customer_Order,Customer_Order_Date,Backordered_by_Pallet,Reserved_Sum
501,116.82,12055,4/28/2021,3.18,1.02
501,116.82,12421,6/7/2021,2.36,1.02
501,116.82,12424,6/7/2021,3.91,1.02
2243,30,12014,4/26/2021,1.4,1
2243,30,12425,6/7/2021,4.8,1
2243,30,12817,7/21/2021,0.4,1
2243,30,13359,9/29/2021,0.6,1
2435,50.22,12014,4/26/2021,1,2
2435,50.22,13311,9/24/2021,1.14,2
218,40,13236,9/15/2021,3,5
218,40,13382,10/4/2021,3,5
7593,64,12670,7/2/2021,5,5
484,8,12582,6/22/2021,0.38,2
484,8,12798,7/16/2021,1.38,2
484,8,13255,9/18/2021,1,2
484,8,13288,9/22/2021,1,2
5647,87,13304,9/23/2021,0.01,1
I need to group by the MPN column then check the oldest order first to see if Backordered_by_Pallet is greater than or equal to Reserved_Sum.
If it is -cge display only that row for that group. if its not, then check to see if the next order plus the first order is and display both of them and so on. until the backorered total is greater than Reserved_Sum
This is what it looks like in my head:
look at oldest order first for matching MPN
if oldest orders Backordered > Reserved Sum
Then only display oldest order
Else if oldest order + second oldest order > Reserved Sum
then display both orders
Else If Less Than, Add Next Order etc
Expected Output:
MPN,Per_Pallet,Customer_Order,Customer_Order_Date,Backordered_by_Pallet,Reserved_Sum
501,116.82,12055,4/28/2021,3.18,1.02
2243,30,12014,4/26/2021,1.4,1
2435,50.22,13311,9/24/2021,1.14,2
218,40,13236,9/15/2021,3,5
218,40,13382,10/4/2021,3,5
7593,64,12670,7/2/2021,5,5
484,8,12582,6/22/2021,0.38,2
484,8,12798,7/16/2021,1.38,2
484,8,13255,9/18/2021,1,2
5647,87,13304,9/23/2021,0.01,1
I have gotten different pieces to work, but i cant figure out how to put it all together:
returning if its greater or not is easy enough:
$Magic | ForEach-Object {
If ($_.Backordered_by_Pallet -cge $_.Reserved_Sum) {$_}
Else {"Nothing To Order"}
}
and i have tried adding in a group by
$Magic | Group-Object MPN | ForEach-Object {
If ($_.group.Backordered_by_Pallet -cge $_.group.Reserved_Sum) {$_}
Else {"Nothing_Left_To_Order"}
}
but that displays the whole group or nothing and im not sure how to combine it all, not to mention how to add the previous rows amount if needed.
I believe i need to do a several layer deep for-each so i group the MPN, make an array for just that one mpn, then a for each on that array (sorted by oldest) (not sure how to pull the previous row to add) then export just the results, then the loop moves on to the next group and so on.
Like this? I know this is not real, i jut cant figure it out
$Magic_Hash = $Magic_File | Group-Object -Property MPN -AsHashTable | Sort $_.group.Customer_Order_Date
ForEach ($item in $Magic_Hash) {
If ($item.group.Backordered_by_Pallet -cge $_.group.Reserved_Sum) {$_}
Elseif ($item.group.Backordered_by_Pallet + $item.group.Backordered_by_Pallett["2nd oldest order"] -cge $_.group.Reserved_Sum) {$_}
else {"Nothing_Left"}
}
```
Thank you so much for all your help this community is amazing
The code itself is quite awful, but I believe this works. I added comments to understand more or less the thought process.
One thing to note is, "Nothing To Order" has no place or is not defined how you want to display this since, it is a string and if you need to display this information it would probably have to be inserted on one of the cells or create a new column for this.
#'
MPN,Per_Pallet,Customer_Order,Customer_Order_Date,Backordered_by_Pallet,Reserved_Sum
501,116.82,12055,4/28/2021,3.18,1.02
501,116.82,12421,6/7/2021,2.36,1.02
501,116.82,12424,6/7/2021,3.91,1.02
2243,30,12014,4/26/2021,1.4,1
2243,30,12425,6/7/2021,4.8,1
2243,30,12817,7/21/2021,0.4,1
2243,30,13359,9/29/2021,0.6,1
2435,50.22,12014,4/26/2021,1,2
2435,50.22,13311,9/24/2021,1.14,2
218,40,13236,9/15/2021,3,5
218,40,13382,10/4/2021,3,5
7593,64,12670,7/2/2021,5,5
484,8,12582,6/22/2021,0.38,2
484,8,12798,7/16/2021,1.38,2
484,8,13255,9/18/2021,1,2
484,8,13288,9/22/2021,1,2
5647,87,13304,9/23/2021,0.01,1
'# |ConvertFrom-Csv |
Group-Object MPN | ForEach-Object {
$skip = $false
[double]$backorderSum = 0
# Sort by Customer_Order_Date, oldest will be first in line
foreach($line in $_.Group | Sort-Object {[datetime]$_.Customer_Order_Date})
{
if($skip)
{
continue
}
# If Backordered_by_Pallet is greater than or equal to Reserved_Sum
if([double]$line.Backordered_by_Pallet -ge [double]$line.Reserved_Sum)
{
# Display this line and skip the rest
$skip = $true
$line
}
else
{
# Display this line
$line
# Keep a record of previous Values
$backorderSum += $line.Backordered_by_Pallet
# Until this record is greater than or equal to Reserved_Sum
if($backorderSum -ge [double]$line.Reserved_Sum)
{
# Skip the rest when this condition is met
$skip = $true
}
}
}
} | FT
OUTPUT
MPN Per_Pallet Customer_Order Customer_Order_Date Backordered_by_Pallet Reserved_Sum
--- ---------- -------------- ------------------- --------------------- ------------
501 116.82 12055 4/28/2021 3.18 1.02
2243 30 12014 4/26/2021 1.4 1
2435 50.22 12014 4/26/2021 1 2
2435 50.22 13311 9/24/2021 1.14 2
218 40 13236 9/15/2021 3 5
218 40 13382 10/4/2021 3 5
7593 64 12670 7/2/2021 5 5
484 8 12582 6/22/2021 0.38 2
484 8 12798 7/16/2021 1.38 2
484 8 13255 9/18/2021 1 2
5647 87 13304 9/23/2021 0.01 1
First step is to group the records based on the MPN column/property, so let's do that first, using the aptly named Group-Object cmdlet:
$records = #'
MPN,Per_Pallet,Customer_Order,Customer_Order_Date,Backordered_by_Pallet,Reserved_Sum
501,116.82,12055,4/28/2021,3.18,1.02
501,116.82,12421,6/7/2021,2.36,1.02
501,116.82,12424,6/7/2021,3.91,1.02
2243,30,12014,4/26/2021,1.4,1
2243,30,12425,6/7/2021,4.8,1
2243,30,12817,7/21/2021,0.4,1
2243,30,13359,9/29/2021,0.6,1
2435,50.22,12014,4/26/2021,1,2
2435,50.22,13311,9/24/2021,1.14,2
218,40,13236,9/15/2021,3,5
218,40,13382,10/4/2021,3,5
7593,64,12670,7/2/2021,5,5
484,8,12582,6/22/2021,0.38,2
484,8,12798,7/16/2021,1.38,2
484,8,13255,9/18/2021,1,2
484,8,13288,9/22/2021,1,2
5647,87,13304,9/23/2021,0.01,1
'# |ConvertFrom-Csv
$groups = $records |Group-Object MPN
Now that they're all grouped together correctly, we can start going through each group, sort the associated records by date/order number, and then output the first one that matches the condition:
foreach($group in $groups){
# sort records by order number
$recordsInGroup = $group.Group |Sort-Object Customer_Order
# filter records based on the criteria, output only the first 1
$recordsInGroup |Where-Object { +$_.Backordered_by_Pallet -ge $_.Reserved_Sum } |Select-Object -First 1
}
The + in front of $_.Backordered_by_Pallet in the Where-Object filter will mae PowerShell convert the value to a [double], ensuring correct numeric comparison with $_.Reserved_Sum
I have a csv file, that contains the next data:
Pages,Pages BN,Pages Color,Customer
145,117,28,Report_Alexis
46,31,15,Report_Alexis
75,27,48,Report_Alexis
145,117,28,Report_Jack
46,31,15,Report_Jack
75,27,48,Report_Jack
145,117,28,Report_Amy
46,31,15,Report_Amy
75,27,48,Report_Amy
So what i need to do , is sum each column based on the report name and the export to another csv file like this
Pages,Pages BN,Pages Color,Customer
266,175,91,Report_Alexis
266,175,91,Report_Jack
266,175,91,Report_Amy
How can i do this?
I tried with this:
$coutnpages = Import-Csv "C:\temp\testcount\final file2.csv" |where {$_.Filename -eq 'Report_Jack'} | Measure-Object -Property Pages -Sum
then
$Countpages.Sum | Set-Content -Path "C:\temp\testcount\final file3.csv"
But this is just one, and then i dont know how to follow.
Can you please help me?
Working code
$IdentityColumns = #('Customer')
$ColumnsToSum = #('Pages', 'Pages BN', 'Pages Color')
$CSVFileInput = 'S:\SCRIPTS\1.csv'
Import-Csv -Path $CSVFileInput |
Group-Object -Property $IdentityColumns |
ForEach-Object {
$resultHT = #{ Customer = $_.Name } # This is result HashTable (Key-Value collection). We add here sum's next line.
#($_.Group | Measure-Object -Property $ColumnsToSum -Sum ) | # Run calculating of sum for all $ColumnsToSum`s in one line
ForEach-Object { $resultHT[$_.Property] = $_.Sum } # For each calculated property we set property in result HashTable
return [PSCustomObject]$resultHT # Convert HashTable to PSCustomObject. This better.
} | # End of ForEach-Object by groups
Select #($ColumnsToSum + $IdentityColumns) | # This sets order of columns. It may be important.
Out-GridView # Or replace with Export-Csv
#Export-Csv ...
Explanation:
Use Group-Object to make collection of groups. Groups have 4 properties:
Name - Name of group, equals to stingified values of property(-ies) you're grouping by
Values - Collection of values of properties you're grouping by (not stringified)
Count - Count of elements grouped into this group
Group - Values of elements grouped into this group
For grouping by single string properties (in this case it is ok), you can easily use Name of group, otherwise, always use Values.
So after Group-Object, you iterate not on collection-of-rows of CSV, but on collection-of-collections-of-rows grouped by some condition.
Measure-Object can process more than one propertiy for single pass (not mixing between values from different properties), we use this actively. This results in array of objects with attribute Property equal to passed to Measure-Object and value (Sum in our case). We move those Property=Sum pairs to hashtable.
[PSCustomObject] converts hashtable to object. Objects are always better for output.
I have a question similar to this one but with a twist:
Powershell Group Object in CSV and exporting it
My file has 42 existing headers. The delimiter is a standard comma, and there are no quotation marks in this file.
master_account_number,sub,txn,cur,last,first,address,address2,city,state,zip,ssn,credit,email,phone,cell,workphn,dob,chrgnum,cred,max,allow,neg,plan,downpayment,pmt2,min,clid,cliname,owner,merch,legal,is_active,apply,ag,offer,settle_perc,min_pay,plan2,lstpmt,orig,placedate
The file's data (the first 6 columns) looks like this:
master_account_number,sub,txn,cur,last,first
001,12,35,50.25,BIRD, BIG
001,34,47,100.10,BIRD, BIG
002,56,9,10.50,BUNNY, BUGS
002,78,3,20,BUNNY, BUGS
003,54,7,250,DUCK, DAFFY
004,44,88,25,MOUSE, JERRY
I am only working with the first column master_account_number and the 4th column cur.
I want to check for duplicates of the"master_account_number" column, if found then add the totals up from the 4th column "cur" for only those dupes found and then do a combine for any rows that we just did a sum on. The summed value from the dupes should replace the cur value in our combined row.
With that said, our out-put should look like so.
master_account_number,sub,txn,cur,last,first
001,12,35,150.35,BIRD, BIG
002,56,9,30.50,BUNNY, BUGS
003,54,7,250,DUCK, DAFFY
004,44,88,25,MOUSE, JERRY
Now that we have that out the way, here is how this question differs. I want to keep all 42 columns intact in the out-put file. In the other question I referenced above, the input was 5 columns and the out-put was 4 columns and this is not what I'm trying to achieve. I have so many more headers, I'd hate to have specify individually all 42 columns. That seems inefficient anyhow.
As for what I have so far for code... not much.
$revNB = "\\server\path\example.csv"
$global:revCSV = import-csv -Path $revNB | ? {$_.is_active -eq "Y"}
$dupesGrouped = $revCSV | Group-Object master_account_number | Select-Object #{Expression={ ($_.Group|Measure-Object cur -Sum).Sum }}
Ultimately I want the output to look identical to the input, only the output should merge duplicate account numbers rows, and add all the "cur" values, where the merged row contains the sum of the grouped cur values, in the cur field.
Last Update: Tried Rich's solution and got an error. Modified what he had to this $dupesGrouped = $revCSV | Group-Object master_account_number | Select-Object Name, #{Name='curSum'; Expression={ ($_.Group | Measure-Object cur -Sum).Sum}}
And this gets me exactly what my own code got me so I am still looking for a solution. I need to output this CSV with all 42 headers. Even for items with no duplicates.
Other things I've tried:
This doesn't give me the data I need in the columns, the columns are there but they are blank.
$dupesGrouped = $revCSV | Group-Object master_account_number | Select-Object #{ expression={$_.Name}; label='master_account_number' },
sub_account_number,
charge_txn,
#{Name='current_balance'; Expression={ ($_.Group | Measure-Object current_balance -Sum).Sum },
last,
}
You're pretty close, but you used current_balance where you probably meant cur.
Here's a start:
$dupesGrouped = $revCSV | Group-Object master_account_number |
Select-Object Name, #{N='curSum'; E={ ($_.Group | Measure-Object cur -Sum).Sum},
#{N='last'; E={ ($_.Group | Select-Object last -first 1).last} }
You can add the other fields by adding Name;Expression hashtables for each of the fields you want to summarize. I assumed you would want to select the first occurrence of repeated last name for the same master_account_number. The output will be incorrect if the last name differs for the same master_account_number.
In the case of changing only part of the data, there is also the following way.
$dupesGrouped = $revCSV | Group-Object master_account_number | ForEach-Object {
# copy the first data in order not to change original data
$new = $_.Group[0].psobject.Copy()
# update the value of cur property
$new.cur = ($_.Group | Measure-Object cur -Sum).Sum
# output
$new
}