Powershell - group array objects by properties and sum - powershell

I am working on getting some data out of a CSV file with a script and have no idea how to solve the most important part. I have an array with a few hundred lines; there are about 50 Ids in those lines, and each Id has a few different services attached to it. Each line has a price attached.
I want to group the lines by Id and Service, and I want each of those groups in some sort of variable so I can sum the prices. I filter out the unique Ids and Services earlier in the script because they are different every time.
Some example data:
$data = @(
    [pscustomobject]@{Id='1';Service='Service1';Propertyx=1;Price='5'}
    [pscustomobject]@{Id='1';Service='Service2';Propertyx=1;Price='4'}
    [pscustomobject]@{Id='2';Service='Service1';Propertyx=1;Price='17'}
    [pscustomobject]@{Id='3';Service='Service1';Propertyx=1;Price='3'}
    [pscustomobject]@{Id='2';Service='Service2';Propertyx=1;Price='11'}
    [pscustomobject]@{Id='4';Service='Service1';Propertyx=1;Price='7'}
    [pscustomobject]@{Id='2';Service='Service3';Propertyx=1;Price='5'}
    [pscustomobject]@{Id='3';Service='Service2';Propertyx=1;Price='4'}
    [pscustomobject]@{Id='4';Service='Service2';Propertyx=1;Price='12'}
    [pscustomobject]@{Id='1';Service='Service3';Propertyx=1;Price='8'})
$ident = $data.Id | select -unique | sort
$Serv = $data.Service | select -unique | sort
All help will be appreciated!

Use Group-Object to group objects by common values across one or more properties.
For example, to calculate the sum per Id, do:
$data |Group-Object Id |ForEach-Object {
    [pscustomobject]@{
        Id = $_.Name
        Sum = $_.Group |Measure-Object Price -Sum |ForEach-Object Sum
    }
}
Which should yield output like:
Id Sum
-- ---
1 17
2 33
3 7
4 19
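Since the question asks for sums per Id and Service, here is a minimal sketch of the same approach grouping on both properties. The sample rows are made up for illustration (two rows share Id 1 and Service1 so the sum is visible), and reading the key parts from Values is just one way to do it:

```powershell
# Hypothetical sample rows; Price is a string, as it would be after Import-Csv
$data = @(
    [pscustomobject]@{ Id = '1'; Service = 'Service1'; Price = '5' }
    [pscustomobject]@{ Id = '1'; Service = 'Service1'; Price = '4' }
    [pscustomobject]@{ Id = '1'; Service = 'Service2'; Price = '8' }
    [pscustomobject]@{ Id = '2'; Service = 'Service1'; Price = '17' }
)

# Group on both properties; Values holds the Id and the Service, in that order
$sums = $data | Group-Object Id, Service | ForEach-Object {
    [pscustomobject]@{
        Id      = $_.Values[0]
        Service = $_.Values[1]
        Sum     = ($_.Group | Measure-Object Price -Sum).Sum
    }
}

$sums
```

Each output object is one Id/Service combination, so the results can be filtered or exported like any other objects.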

Related

Powershell Display rows only if column 2 -cge column 3 for a group based on column 1

I have a csv like this:
MPN,Per_Pallet,Customer_Order,Customer_Order_Date,Backordered_by_Pallet,Reserved_Sum
501,116.82,12055,4/28/2021,3.18,1.02
501,116.82,12421,6/7/2021,2.36,1.02
501,116.82,12424,6/7/2021,3.91,1.02
2243,30,12014,4/26/2021,1.4,1
2243,30,12425,6/7/2021,4.8,1
2243,30,12817,7/21/2021,0.4,1
2243,30,13359,9/29/2021,0.6,1
2435,50.22,12014,4/26/2021,1,2
2435,50.22,13311,9/24/2021,1.14,2
218,40,13236,9/15/2021,3,5
218,40,13382,10/4/2021,3,5
7593,64,12670,7/2/2021,5,5
484,8,12582,6/22/2021,0.38,2
484,8,12798,7/16/2021,1.38,2
484,8,13255,9/18/2021,1,2
484,8,13288,9/22/2021,1,2
5647,87,13304,9/23/2021,0.01,1
I need to group by the MPN column, then check the oldest order first to see if Backordered_by_Pallet is greater than or equal to Reserved_Sum.
If it is (-cge), display only that row for that group. If it's not, check whether the next order plus the first order is, and display both of them, and so on, until the backordered total is greater than Reserved_Sum.
This is what it looks like in my head:
look at oldest order first for matching MPN
if oldest orders Backordered > Reserved Sum
Then only display oldest order
Else if oldest order + second oldest order > Reserved Sum
then display both orders
Else If Less Than, Add Next Order etc
Expected Output:
MPN,Per_Pallet,Customer_Order,Customer_Order_Date,Backordered_by_Pallet,Reserved_Sum
501,116.82,12055,4/28/2021,3.18,1.02
2243,30,12014,4/26/2021,1.4,1
2435,50.22,13311,9/24/2021,1.14,2
218,40,13236,9/15/2021,3,5
218,40,13382,10/4/2021,3,5
7593,64,12670,7/2/2021,5,5
484,8,12582,6/22/2021,0.38,2
484,8,12798,7/16/2021,1.38,2
484,8,13255,9/18/2021,1,2
5647,87,13304,9/23/2021,0.01,1
I have gotten different pieces to work, but I can't figure out how to put it all together.
Returning whether it's greater or not is easy enough:
$Magic | ForEach-Object {
If ($_.Backordered_by_Pallet -cge $_.Reserved_Sum) {$_}
Else {"Nothing To Order"}
}
And I have tried adding a group-by:
$Magic | Group-Object MPN | ForEach-Object {
If ($_.group.Backordered_by_Pallet -cge $_.group.Reserved_Sum) {$_}
Else {"Nothing_Left_To_Order"}
}
but that displays the whole group or nothing, and I'm not sure how to combine it all, not to mention how to add the previous row's amount if needed.
I believe I need a foreach several layers deep: group by MPN, make an array for just that one MPN, then loop over that array (sorted by oldest; not sure how to pull the previous row to add), export just the results, and then move on to the next group, and so on.
Like this? I know this is not real code, I just can't figure it out:
$Magic_Hash = $Magic_File | Group-Object -Property MPN -AsHashTable | Sort $_.group.Customer_Order_Date
ForEach ($item in $Magic_Hash) {
If ($item.group.Backordered_by_Pallet -cge $_.group.Reserved_Sum) {$_}
Elseif ($item.group.Backordered_by_Pallet + $item.group.Backordered_by_Pallett["2nd oldest order"] -cge $_.group.Reserved_Sum) {$_}
else {"Nothing_Left"}
}
Thank you so much for all your help this community is amazing
The code itself is quite awful, but I believe this works. I added comments to explain the thought process, more or less.
One thing to note: it is not defined where "Nothing To Order" should be displayed. Since it is a string, showing it would probably require inserting it into one of the cells or creating a new column for it.
@'
MPN,Per_Pallet,Customer_Order,Customer_Order_Date,Backordered_by_Pallet,Reserved_Sum
501,116.82,12055,4/28/2021,3.18,1.02
501,116.82,12421,6/7/2021,2.36,1.02
501,116.82,12424,6/7/2021,3.91,1.02
2243,30,12014,4/26/2021,1.4,1
2243,30,12425,6/7/2021,4.8,1
2243,30,12817,7/21/2021,0.4,1
2243,30,13359,9/29/2021,0.6,1
2435,50.22,12014,4/26/2021,1,2
2435,50.22,13311,9/24/2021,1.14,2
218,40,13236,9/15/2021,3,5
218,40,13382,10/4/2021,3,5
7593,64,12670,7/2/2021,5,5
484,8,12582,6/22/2021,0.38,2
484,8,12798,7/16/2021,1.38,2
484,8,13255,9/18/2021,1,2
484,8,13288,9/22/2021,1,2
5647,87,13304,9/23/2021,0.01,1
'@ |ConvertFrom-Csv |
Group-Object MPN | ForEach-Object {
    $skip = $false
    [double]$backorderSum = 0
    # Sort by Customer_Order_Date, oldest will be first in line
    foreach($line in $_.Group | Sort-Object {[datetime]$_.Customer_Order_Date})
    {
        if($skip)
        {
            continue
        }
        # If Backordered_by_Pallet is greater than or equal to Reserved_Sum
        if([double]$line.Backordered_by_Pallet -ge [double]$line.Reserved_Sum)
        {
            # Display this line and skip the rest
            $skip = $true
            $line
        }
        else
        {
            # Display this line
            $line
            # Keep a record of previous Values
            $backorderSum += $line.Backordered_by_Pallet
            # Until this record is greater than or equal to Reserved_Sum
            if($backorderSum -ge [double]$line.Reserved_Sum)
            {
                # Skip the rest when this condition is met
                $skip = $true
            }
        }
    }
} | FT
OUTPUT
MPN Per_Pallet Customer_Order Customer_Order_Date Backordered_by_Pallet Reserved_Sum
--- ---------- -------------- ------------------- --------------------- ------------
501 116.82 12055 4/28/2021 3.18 1.02
2243 30 12014 4/26/2021 1.4 1
2435 50.22 12014 4/26/2021 1 2
2435 50.22 13311 9/24/2021 1.14 2
218 40 13236 9/15/2021 3 5
218 40 13382 10/4/2021 3 5
7593 64 12670 7/2/2021 5 5
484 8 12582 6/22/2021 0.38 2
484 8 12798 7/16/2021 1.38 2
484 8 13255 9/18/2021 1 2
5647 87 13304 9/23/2021 0.01 1
First step is to group the records based on the MPN column/property, so let's do that first, using the aptly named Group-Object cmdlet:
$records = @'
MPN,Per_Pallet,Customer_Order,Customer_Order_Date,Backordered_by_Pallet,Reserved_Sum
501,116.82,12055,4/28/2021,3.18,1.02
501,116.82,12421,6/7/2021,2.36,1.02
501,116.82,12424,6/7/2021,3.91,1.02
2243,30,12014,4/26/2021,1.4,1
2243,30,12425,6/7/2021,4.8,1
2243,30,12817,7/21/2021,0.4,1
2243,30,13359,9/29/2021,0.6,1
2435,50.22,12014,4/26/2021,1,2
2435,50.22,13311,9/24/2021,1.14,2
218,40,13236,9/15/2021,3,5
218,40,13382,10/4/2021,3,5
7593,64,12670,7/2/2021,5,5
484,8,12582,6/22/2021,0.38,2
484,8,12798,7/16/2021,1.38,2
484,8,13255,9/18/2021,1,2
484,8,13288,9/22/2021,1,2
5647,87,13304,9/23/2021,0.01,1
'@ |ConvertFrom-Csv
$groups = $records |Group-Object MPN
Now that they're all grouped together correctly, we can start going through each group, sort the associated records by date/order number, and then output the first one that matches the condition:
foreach($group in $groups){
    # sort records by order number
    $recordsInGroup = $group.Group |Sort-Object Customer_Order
    # filter records based on the criteria, output only the first 1
    $recordsInGroup |Where-Object { +$_.Backordered_by_Pallet -ge $_.Reserved_Sum } |Select-Object -First 1
}
The + in front of $_.Backordered_by_Pallet in the Where-Object filter makes PowerShell convert the value to a number, ensuring a correct numeric comparison with $_.Reserved_Sum.
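To see why the conversion matters, here is a small sketch. The column names Backordered and Reserved are made up for the demo; the point is that values coming out of ConvertFrom-Csv or Import-Csv are strings, and the type of the left operand decides how -ge compares:

```powershell
# Values read by ConvertFrom-Csv / Import-Csv are strings
$row = '10,9' | ConvertFrom-Csv -Header Backordered, Reserved

# Left operand is a string, so this is a text comparison ('10' sorts before '9')
$asText = $row.Backordered -ge $row.Reserved

# Unary + turns the left operand into a number, so the right side is converted as well
$asNumber = +$row.Backordered -ge $row.Reserved

"$asText $asNumber"   # False True
```

Casting explicitly with [double], as the other answer does, has the same effect and may read more clearly.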

Sum various columns to get subtotal depending on a criteria from a row using Powershell

I have a csv file, that contains the next data:
Pages,Pages BN,Pages Color,Customer
145,117,28,Report_Alexis
46,31,15,Report_Alexis
75,27,48,Report_Alexis
145,117,28,Report_Jack
46,31,15,Report_Jack
75,27,48,Report_Jack
145,117,28,Report_Amy
46,31,15,Report_Amy
75,27,48,Report_Amy
So what I need to do is sum each column based on the report name and then export to another CSV file like this:
Pages,Pages BN,Pages Color,Customer
266,175,91,Report_Alexis
266,175,91,Report_Jack
266,175,91,Report_Amy
How can i do this?
I tried with this:
$coutnpages = Import-Csv "C:\temp\testcount\final file2.csv" |where {$_.Filename -eq 'Report_Jack'} | Measure-Object -Property Pages -Sum
then
$Countpages.Sum | Set-Content -Path "C:\temp\testcount\final file3.csv"
But this is just one, and then i dont know how to follow.
Can you please help me?
Working code
$IdentityColumns = @('Customer')
$ColumnsToSum = @('Pages', 'Pages BN', 'Pages Color')
$CSVFileInput = 'S:\SCRIPTS\1.csv'
Import-Csv -Path $CSVFileInput |
    Group-Object -Property $IdentityColumns |
    ForEach-Object {
        $resultHT = @{ Customer = $_.Name } # Result hashtable (key-value collection). The sums are added on the next lines.
        @($_.Group | Measure-Object -Property $ColumnsToSum -Sum ) | # Calculate the sums for all $ColumnsToSum in one pass
            ForEach-Object { $resultHT[$_.Property] = $_.Sum } # Store each calculated sum in the result hashtable under its property name
        return [PSCustomObject]$resultHT # Convert the hashtable to a PSCustomObject, which is better for output
    } | # End of ForEach-Object over groups
    Select @($ColumnsToSum + $IdentityColumns) | # This sets the order of columns. It may be important.
    Out-GridView # Or replace with Export-Csv
#Export-Csv ...
Explanation:
Use Group-Object to make a collection of groups. Groups have 4 properties:
Name - Name of the group, equal to the stringified values of the property (or properties) you're grouping by
Values - Collection of values of the properties you're grouping by (not stringified)
Count - Number of elements grouped into this group
Group - The elements grouped into this group
For grouping by a single string property (as in this case), you can simply use the Name of the group; otherwise, always use Values.
So after Group-Object, you iterate not over a collection of CSV rows, but over a collection of collections of rows grouped by some condition.
Measure-Object can process more than one property in a single pass (without mixing values from different properties), and we rely on this. The result is an array of objects, each with a Property attribute equal to the name passed to Measure-Object and a value (Sum in our case). We move those Property=Sum pairs into the hashtable.
[PSCustomObject] converts the hashtable to an object. Objects are always better for output.
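As a rough illustration of the Measure-Object step described above, this sketch uses a shortened, made-up version of the CSV (two numeric columns only) and prints one line per group and measured property, so the Property/Sum pairs are visible before they are folded into a single object:

```powershell
# Shortened, hypothetical version of the input file
$rows = @'
Pages,Pages BN,Customer
145,117,Report_Alexis
46,31,Report_Alexis
75,27,Report_Jack
'@ | ConvertFrom-Csv

$result = $rows | Group-Object Customer | ForEach-Object {
    $group = $_
    # Measure-Object returns one result object per property, each carrying Property and Sum
    $group.Group | Measure-Object -Property 'Pages', 'Pages BN' -Sum | ForEach-Object {
        [pscustomobject]@{ Customer = $group.Name; Column = $_.Property; Sum = $_.Sum }
    }
}

$result
```

For Report_Alexis this gives 191 for Pages and 148 for Pages BN; the working code above stores the same pairs in a hashtable instead of emitting them as rows.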

Add random users to groups from CSV file

I have a question. I'm adding users from a CSV file to some groups; in this example it's a maximum of 3 users per group. But what I want is to have 10 groups and then add the random users from the CSV file to those groups. I might then end up with 7 groups of 4 users and the last 3 with 3 users, and that's OK.
But how do I change this script from just adding 3 users per group to adding users from the CSV file to the "hardcoded" 10 groups?
At the moment I have this:
$deltager = Import-Csv C:\Users\Desktop\liste.csv
$holdstr = 3
$maxTeams = [math]::ceiling($deltager.Count/$holdstr)
$teams = @{}
$shuffled = $deltager | Get-Random -Count $deltager.Count
$shuffled | ForEach-Object { $i = 1 }{
    $teams["$([Math]::Floor($i / $holdstr))"] += @($_.Navn)
    $i++
}
$Grupper = $teams | Out-String
Write-Host $Grupper
Use the remainder/modulo operator % to "wrap around" the group index, then simply add one user at a time for optimum distribution:
# define number of groups
$numberOfTeams = 10
# read participant records from csv
$participants = Import-Csv C:\Users\Desktop\liste.csv
# create jagged array for the team rosters
$teams = ,@() * $numberOfTeams
# go through participant list, add to "next" group in line
$index = 0
$participants |%{
    $teams[$index++ % $numberOfTeams] += @($_.Navn)
}
If you want to randomize the order of participants, simply sort the list randomly before populating the groups:
$participants = Import-Csv C:\Users\Desktop\liste.csv |Sort-Object {Get-Random}
Each item in $teams will be another array of names, so to enumerate them:
0..($teams.Length - 1) |%{
    Write-Host "Team $($_+1):" $($teams[$_] -join ', ')
}
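To check the distribution without a CSV file, here is a small self-contained sketch of the same round-robin idea. The names are hypothetical stand-ins for the Navn column, and three teams are used instead of ten to keep the output short:

```powershell
$numberOfTeams = 3
# Hypothetical stand-ins for the Navn column of the CSV
$names = 'Anna', 'Bo', 'Carl', 'Dina', 'Erik', 'Freja', 'Gert'

$teams = ,@() * $numberOfTeams
$index = 0
foreach ($name in $names) {
    # Index cycles 0,1,2,0,1,2,0 so team sizes never differ by more than one
    $teams[$index % $numberOfTeams] += @($name)
    $index++
}

$sizes = $teams | ForEach-Object { $_.Count }
$sizes -join ','   # 3,2,2
```

With 10 teams and a participant count that is not a multiple of 10, the first few teams simply end up with one extra member, which matches the 4/3 split described in the question.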

Powershell - Import-CSV Group-Object SUM a number from grouped objects and then combine all grouped objects to single rows

I have a question similar to this one but with a twist:
Powershell Group Object in CSV and exporting it
My file has 42 existing headers. The delimiter is a standard comma, and there are no quotation marks in this file.
master_account_number,sub,txn,cur,last,first,address,address2,city,state,zip,ssn,credit,email,phone,cell,workphn,dob,chrgnum,cred,max,allow,neg,plan,downpayment,pmt2,min,clid,cliname,owner,merch,legal,is_active,apply,ag,offer,settle_perc,min_pay,plan2,lstpmt,orig,placedate
The file's data (the first 6 columns) looks like this:
master_account_number,sub,txn,cur,last,first
001,12,35,50.25,BIRD, BIG
001,34,47,100.10,BIRD, BIG
002,56,9,10.50,BUNNY, BUGS
002,78,3,20,BUNNY, BUGS
003,54,7,250,DUCK, DAFFY
004,44,88,25,MOUSE, JERRY
I am only working with the first column master_account_number and the 4th column cur.
I want to check for duplicates in the "master_account_number" column. If any are found, add up the totals from the 4th column "cur" for only those dupes, and then combine the rows that we just summed. The summed value from the dupes should replace the cur value in the combined row.
With that said, our out-put should look like so.
master_account_number,sub,txn,cur,last,first
001,12,35,150.35,BIRD, BIG
002,56,9,30.50,BUNNY, BUGS
003,54,7,250,DUCK, DAFFY
004,44,88,25,MOUSE, JERRY
Now that we have that out the way, here is how this question differs. I want to keep all 42 columns intact in the out-put file. In the other question I referenced above, the input was 5 columns and the out-put was 4 columns and this is not what I'm trying to achieve. I have so many more headers, I'd hate to have specify individually all 42 columns. That seems inefficient anyhow.
As for what I have so far for code... not much.
$revNB = "\\server\path\example.csv"
$global:revCSV = import-csv -Path $revNB | ? {$_.is_active -eq "Y"}
$dupesGrouped = $revCSV | Group-Object master_account_number | Select-Object @{Expression={ ($_.Group|Measure-Object cur -Sum).Sum }}
Ultimately I want the output to look identical to the input, only the output should merge duplicate account numbers rows, and add all the "cur" values, where the merged row contains the sum of the grouped cur values, in the cur field.
Last Update: Tried Rich's solution and got an error. Modified what he had to this: $dupesGrouped = $revCSV | Group-Object master_account_number | Select-Object Name, @{Name='curSum'; Expression={ ($_.Group | Measure-Object cur -Sum).Sum}}
And this gets me exactly what my own code got me so I am still looking for a solution. I need to output this CSV with all 42 headers. Even for items with no duplicates.
Other things I've tried:
This doesn't give me the data I need in the columns, the columns are there but they are blank.
$dupesGrouped = $revCSV | Group-Object master_account_number | Select-Object @{ expression={$_.Name}; label='master_account_number' },
sub_account_number,
charge_txn,
@{Name='current_balance'; Expression={ ($_.Group | Measure-Object current_balance -Sum).Sum },
last,
}
You're pretty close, but you used current_balance where you probably meant cur.
Here's a start:
$dupesGrouped = $revCSV | Group-Object master_account_number |
    Select-Object Name, @{N='curSum'; E={ ($_.Group | Measure-Object cur -Sum).Sum} },
        @{N='last'; E={ ($_.Group | Select-Object last -first 1).last} }
You can add the other fields by adding Name;Expression hashtables for each of the fields you want to summarize. I assumed you would want to select the first occurrence of repeated last name for the same master_account_number. The output will be incorrect if the last name differs for the same master_account_number.
In the case of changing only part of the data, there is also the following way.
$dupesGrouped = $revCSV | Group-Object master_account_number | ForEach-Object {
    # copy the first data in order not to change original data
    $new = $_.Group[0].psobject.Copy()
    # update the value of cur property
    $new.cur = ($_.Group | Measure-Object cur -Sum).Sum
    # output
    $new
}
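Here is a self-contained sketch of that copy-and-update approach on a cut-down, made-up version of the file (four columns instead of 42). It is meant to show that every column of the first row in each group survives without being listed, and that the source rows stay unchanged:

```powershell
# Cut-down, hypothetical version of the input; the real file has 42 columns
$rows = @'
master_account_number,sub,cur,last
001,12,50.25,BIRD
001,34,100.5,BIRD
002,56,10.5,BUNNY
'@ | ConvertFrom-Csv

$merged = $rows | Group-Object master_account_number | ForEach-Object {
    # Shallow copy of the first row in the group, so all of its columns come along
    $new = $_.Group[0].psobject.Copy()
    # Replace cur with the total for the whole group
    $new.cur = ($_.Group | Measure-Object cur -Sum).Sum
    $new
}

$merged | ConvertTo-Csv -NoTypeInformation
```

Piping $merged to Export-Csv instead of ConvertTo-Csv would write the merged rows with the same headers as the input.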

re-arrange and combine powershell custom objects

I have a system that currently reads data from a CSV file produced by a separate system that is going to be replaced.
The imported CSV file looks like this
PS> Import-Csv .\SalesValues.csv
Sale Values AA BB
----------- -- --
10 6 5
5 3 4
3 1 9
To replace this process I hope to produce an object that looks identical to the CSV above, but I do not want to continue to use a CSV file.
I already have a script that reads data in from our database and extracts the data that I need to use. I'll not detail the fairly long script that precedes this point, but in effect it looks like this:
$SQLData = Custom-SQLFunction "SELECT * FROM SALES_DATA WHERE LIST_ID = $LISTID"
$SQLData will contain ~5000+ DataRow objects that I need to query.
One of those DataRow objects looks something like this:
lead_id : 123456789
entry_date : 26/10/2018 16:51:16
modify_date : 01/11/2018 01:00:02
status : WRONG
user : mrexample
vendor_lead_code : TH1S15L0NGC0D3
source_id : A543212
list_id : 333004
list_name : AA Some Text
gmt_offset_now : 0.00
SaleValue : 10
list_name is going to be prefixed with AA or BB.
SaleValue can be any integer 3 and up, however realistically extremely unlikely to be higher than 100 (as this is a monthly donation) and will be one of 3,5,10 in the vast majority of occurrences.
I already have script that takes the content of list_name, creates and populates the data I need to use into two separate psobjects ($AASalesValues and $BBSalesValues) that collates the total numbers of 'SaleValue' across the data set.
Because I cannot reliably anticipate the value of any SaleValue I have to dynamically create the psobjects properties like this
foreach ($record in $SQLData) {
    if ($record.list_name -match "BB") {
        if ($record.SaleValue -gt 0) {
            if ($BBSalesValues | Get-Member -Name $($record.SaleValue) -MemberType Properties) {
                $BBSalesValues.$($record.SaleValue) = $BBSalesValues.$($record.SaleValue)+1
            } else {
                $BBSalesValues | Add-Member -Name $($record.SaleValue) -MemberType NoteProperty -Value 1
            }
        }
    }
}
The two resultant objects look like this:
PS> $AASalesValues
10 5 3 50
-- - - --
17 14 3 1
PS> $BBSalesvalues
3 10 5 4
- -- - -
36 12 11 1
I now have the data that I need, however I need to format it in a way that replicates the format of the CSV so I can pass it directly to another existing powershell script that is configured to expect the data in the format that the CSV is in, but I do not want to write the data to a file.
I'd prefer to pass this directly to the next part of the script.
Ultimately what I want to do is to produce a new object/some output that looks like the output from Import-Csv command at the top of this post.
I'd like a new object, say $OverallSalesValues, to look like this:
PS>$overallSalesValues
Sale Values AA BB
50 1 0
10 17 12
5 14 11
4 0 1
3 3 36
In the above example the values from $AASalesValues is listed under the AA column, the values from $BBSalesValues is listed under the BB column, with the rows matching the headers of the two original objects.
I did try this with hashtables but I was unable to work out how to both create them from dynamic values and format them to how I needed them to look.
Finally got there.
$TotalList = @()
foreach($n in 3..200){
    if($AASalesValues.$n -or $BBSalesValues.$n){
        $AACount = $AASalesValues.$n
        $BBcount = $BBSalesValues.$n
        $values = [PSCustomObject]@{
            'Sale Value'= $n
            AA = $AACount
            BB = $BBcount
        }
        $TotalList += $values
    }
}
$TotalList
produces an output of
Sale Value AA BB
---------- -- --
3 3 36
4 2
5 14 11
10 18 12
50 1
Just need to add a bit to include '0' values instead of $null.
I'm going to assume that $record contains a list of the database results for either $AASalesValues or $BBSalesValues, not both, otherwise you'd need some kind of selector to avoid counting records of one group with the other group.
Group the records by their SaleValue property as LotPings suggested:
$BBSalesValues = $record | Group-Object SaleValue -NoElement
That will give you a list of the SaleValue values with their respective count.
PS> $BBSalesValues
Count Name
----- ----
36 3
12 10
11 5
1 4
You can then update your CSV data with these values like this:
$file = 'C:\path\to\data.csv'
# read CSV into a hashtable mapping the sale value to the complete record
# (so that we can lookup the record by sale value)
$csv = @{}
Import-Csv $file | ForEach-Object {
    $csv[$_.'Sale Values'] = $_
}
# Add records for missing sale values
$($AASalesValues; $BBSalesValues) | Select-Object -Expand Name -Unique | ForEach-Object {
    if (-not $csv.ContainsKey($_)) {
        $csv[$_] = New-Object -Type PSObject -Property @{
            'Sale Values' = $_
            'AA' = 0
            'BB' = 0
        }
    }
}
# update records with values from $AASalesValues
$AASalesValues | ForEach-Object {
    [int]$csv[$_.Name].AA += $_.Count
}
# update records with values from $BBSalesValues
$BBSalesValues | ForEach-Object {
    [int]$csv[$_.Name].BB += $_.Count
}
# write updated records back to file
$csv.Values | Export-Csv $file -NoType
Even with your updated question the approach would be pretty much the same, you'd just add another level of grouping for collecting the sales numbers:
$sales = @{}
$record | Group-Object {$_.list_name.Split()[0]} | ForEach-Object {
    $sales[$_.Name] = $_.Group | Group-Object SaleValue -NoElement
}
and then adjust the merging to something like this:
$file = 'C:\path\to\data.csv'
# read CSV into a hashtable mapping the sale value to the complete record
# (so that we can lookup the record by sale value)
$csv = @{}
Import-Csv $file | ForEach-Object {
    $csv[$_.'Sale Values'] = $_
}
# Add records for missing sale values
$sales.Values | Select-Object -Expand Name -Unique | ForEach-Object {
    if (-not $csv.ContainsKey($_)) {
        $prop = @{'Sale Values' = $_}
        $sales.Keys | ForEach-Object {
            $prop[$_] = 0
        }
        $csv[$_] = New-Object -Type PSObject -Property $prop
    }
}
# update records with values from $sales
$sales.GetEnumerator() | ForEach-Object {
    $name = $_.Key
    $_.Value | ForEach-Object {
        [int]$csv[$_.Name].$name += $_.Count
    }
}
# write updated records back to file
$csv.Values | Export-Csv $file -NoType
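Since the question asks to avoid writing a file, a variation that builds the table directly in memory may be closer to what is wanted. This is only a sketch under assumptions: $records is a made-up stand-in for $SQLData with just the two properties used here, and the AA/BB prefixes are hard-coded rather than discovered dynamically:

```powershell
# Hypothetical stand-in for $SQLData: only list_name and SaleValue are used
$records = @(
    [pscustomobject]@{ list_name = 'AA Some Text'; SaleValue = 10 }
    [pscustomobject]@{ list_name = 'AA Some Text'; SaleValue = 10 }
    [pscustomobject]@{ list_name = 'AA Other';     SaleValue = 5 }
    [pscustomobject]@{ list_name = 'BB Some Text'; SaleValue = 10 }
    [pscustomobject]@{ list_name = 'BB Some Text'; SaleValue = 3 }
)

# One output row per distinct SaleValue, with a count per list prefix
$overallSalesValues = $records | Group-Object SaleValue | ForEach-Object {
    $rows = $_.Group
    [pscustomobject]@{
        'Sale Values' = [int]$_.Name
        # @(...) makes sure Count is 0 when no rows match, rather than empty
        AA = @($rows | Where-Object { $_.list_name -like 'AA*' }).Count
        BB = @($rows | Where-Object { $_.list_name -like 'BB*' }).Count
    }
} | Sort-Object 'Sale Values' -Descending

$overallSalesValues
```

The resulting objects have the same three properties as the rows returned by Import-Csv in the question, so they can be passed straight to the next script, with 0 filled in where one list has no sales at a given value.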