I have the following scenario.
A .csv file contains orders. Orders with multiple items span separate rows.
I'm grouping rows by order id and sku to sum some columns before exporting to .csv.
The code below performs the grouping and sums and can write the results out to a separate .csv.
What I need to do is update the original .csv file by replacing the original rows with the summed rows.
Any help is greatly appreciated.
For example:
$output = @()
# Import .CSV and group on amazon-order-id and sku
# Filter group to only give lines with multiple occurrences of each sku per order
Import-Csv D:\Exports\Test\AMAZON\*.csv | Group-Object amazon-order-id, sku | Where-Object {$_.Count -gt 1} |
# Loop through group object. Take first line of each group and place in $new variable
# Using dot notation, sum required columns and add rows to $output variable
ForEach-Object {
$new = $_.Group[0].psobject.Copy()
$new.'quantity-shipped' = ($_.Group | Measure-Object quantity-shipped -Sum).Sum
$new.'item-price' = ($_.Group | Measure-Object item-price -Sum).Sum
$new.'item-tax' = ($_.Group | Measure-Object item-tax -Sum).Sum
$new.'shipping-price' = ($_.Group | Measure-Object shipping-price -Sum).Sum
$new.'shipping-tax' = ($_.Group | Measure-Object shipping-tax -Sum).Sum
$new.'gift-wrap-price' = ($_.Group | Measure-Object gift-wrap-price -Sum).Sum
$new.'gift-wrap-tax' = ($_.Group | Measure-Object gift-wrap-tax -Sum).Sum
$new.'item-promotion-discount' = ($_.Group | Measure-Object item-promotion-discount -Sum).Sum
$new.'ship-promotion-discount' = ($_.Group | Measure-Object ship-promotion-discount -Sum).Sum
$output += $new
}
#Select all group members and export to .csv file
$output | select * | Export-Csv D:\Exports\Test\AMAZON\Import_Me.csv -NoTypeInformation
I am assuming that you want to group common rows into a single row and only be left with those grouped single rows plus uncommon rows.
$CSVFiles = Get-ChildItem -Path D:\Exports\Test\AMAZON -File -Filter '*.csv' | Where Extension -eq '.csv'
$output = foreach ($csv in $CSVFiles) {
$csvOutput = Import-Csv $csv.FullName
$group = $csvOutput | Group-Object amazon-order-id, sku
$group | Where Count -gt 1 | Foreach-Object {
$_.Group[0].'quantity-shipped' = ($_.Group | Measure-Object quantity-shipped -Sum).Sum
$_.Group[0].'item-price' = ($_.Group | Measure-Object item-price -Sum).Sum
$_.Group[0].'item-tax' = ($_.Group | Measure-Object item-tax -Sum).Sum
$_.Group[0].'shipping-price' = ($_.Group | Measure-Object shipping-price -Sum).Sum
$_.Group[0].'shipping-tax' = ($_.Group | Measure-Object shipping-tax -Sum).Sum
$_.Group[0].'gift-wrap-price' = ($_.Group | Measure-Object gift-wrap-price -Sum).Sum
$_.Group[0].'gift-wrap-tax' = ($_.Group | Measure-Object gift-wrap-tax -Sum).Sum
$_.Group[0].'item-promotion-discount' = ($_.Group | Measure-Object item-promotion-discount -Sum).Sum
$_.Group[0].'ship-promotion-discount' = ($_.Group | Measure-Object ship-promotion-discount -Sum).Sum
}
$group | Foreach-Object { $_.Group[0] } # Output to variable only
$group | Foreach-Object { $_.Group[0] } | Export-Csv $csv.FullName -NoType
}
$output
Explanation:
$CSVFiles is a collection of CSV files. We need to be able to target them one-by-one to know which file to update. Each file is targeted using a foreach loop with the current file being $csv.
Since $csvOutput is the contents of a CSV file as an array of PSCustomObjects, we can update each object with a Foreach-Object and it will be reflected back in $csvOutput.
$group is assigned the Group-Object output of each CSV file. Using a variable here minimizes the grouping action to only once per file. First, perform the modifications on each group where there are multiple matches. Using your logic, the first object in a grouping is modified. A Foreach-Object is used to go through all the groupings.
Once all modifications are done for one CSV file, $group is output (this includes multiple and single groupings) and only the first object in a grouping is selected ($_.Group[0]) using another Foreach-Object, which works for the single object groupings as well. That output is passed into Export-Csv to update the appropriate file.
$output lists all CSV contents after modifications.
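The approach described above can be tried without touching any files. The following is a minimal sketch, assuming hypothetical sample rows and only two of the summed columns. It collapses each order id/sku group into its first row and passes single-row groups through unchanged.

```powershell
# Hypothetical sample data standing in for one imported CSV file.
$rows = @'
amazon-order-id,sku,quantity-shipped,item-price
111,AAA,1,10.00
111,AAA,2,20.00
111,BBB,1,5.00
222,AAA,1,10.00
'@ | ConvertFrom-Csv

# Columns to sum; the real file has more (item-tax, shipping-price, ...).
$sumColumns = 'quantity-shipped', 'item-price'

$collapsed = $rows | Group-Object amazon-order-id, sku | ForEach-Object {
    $first = $_.Group[0]
    if ($_.Count -gt 1) {
        # Overwrite the first row of the group with the group totals.
        foreach ($column in $sumColumns) {
            $first.$column = ($_.Group | Measure-Object -Property $column -Sum).Sum
        }
    }
    # Emit exactly one row per group, summed or not.
    $first
}

$collapsed | Format-Table -AutoSize
```

Piping `$collapsed` to `Export-Csv` with the original file's path would then replace the file's contents with the collapsed rows.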
I am trying to import multiple CSV files and output a total score. I don't want to create another CSV for the output. Below is how CSV 1 is stored:
and this is CSV 2:
I want to group by Name and total the wins. Here is the code I have tried:
get-item -Path "File Path" |
ForEach-Object {
import-csv $_|
Group-Object Name
Select-Object Name, @{ n='Wins'; e={ ($_.Group | Measure-Object Wins -Sum).Sum } }
}
I was hoping for an outcome like the one below.
Any help would be awesome.
For some reason the current code is showing the output below.
It's looking better but still not grouping on Name.
This will give you the output you are expecting, with the names and total wins for each player.
$csv1 = import-csv "File path of CSV 1"
$csv2 = import-csv "File path of CSV 2"
$allRecords = $csv1 + $csv2
$allRecords | Group-Object Name | Select-Object Name, @{ n='Wins'; e={ ($_.Group | Measure-Object Wins -Sum).Sum } }
The output:
Update
With multiple Csv Files
$allRecords = @()
$directory = "Path of the directory containing the CSV files"
$filePaths = Get-ChildItem -Path $directory -Filter "*.csv"
foreach ($filePath in $filePaths) {
$csv = import-csv $filePath.FullName
$allRecords += $csv
}
$allRecords | Group-Object Name | Select-Object Name, @{ n='Wins'; e={ ($_.Group | Measure-Object Wins -Sum).Sum } }
If you have a very large number of CSV files, you'll find something like this much faster, because += on an array copies the whole array on every iteration:
$CombinedRecords = Get-ChildItem -Filter *.csv -Path C:\temp | Select-Object -ExpandProperty FullName | Import-Csv
$CombinedRecords | Group-Object Name | Select-Object Name, @{ n='Wins'; e={ ($_.Group | Measure-Object Wins -Sum).Sum } }
It can even be a one-liner:
Get-ChildItem -Filter *.csv -Path C:\temp | Select-Object -ExpandProperty FullName | Import-Csv | Group-Object Name | Select-Object Name, @{ n='Wins'; e={ ($_.Group | Measure-Object Wins -Sum).Sum } }
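The key point in the answers above is that the records from all files must be combined before grouping; grouping inside a per-file loop only produces per-file totals. Here is a minimal sketch of that idea, assuming two hypothetical in-memory tables (the names and win counts are made up) in place of the real CSV files:

```powershell
# Hypothetical stand-ins for the two CSV files.
$csv1 = @'
Name,Wins
Alice,2
Bob,1
'@ | ConvertFrom-Csv

$csv2 = @'
Name,Wins
Alice,3
Bob,4
'@ | ConvertFrom-Csv

# Combine first, then group once across all records.
$allRecords = @($csv1) + @($csv2)

$totals = $allRecords |
    Group-Object Name |
    Select-Object Name, @{ n = 'Wins'; e = { ($_.Group | Measure-Object Wins -Sum).Sum } }

$totals | Format-Table -AutoSize
```

With this sample data, each name appears once in `$totals`, with its wins summed across both tables.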
I created a powershell command to collect and sort a txt file.
Input example:
a,1
a,1
b,3
c,4
z,5
The output that I have to get:
a,2
b,3
c,4
z,5
Here is my code so far:
$filename = 'test.txt'
Get-Content $filename | ForEach-Object {
$Line = $_.Trim() -Split ','
New-Object -TypeName PSCustomObject -Property @{
Alphabet= $Line[0]
Value= [int]$Line[1]
}
}
example with negative value input
a,1,1
a,1,2
b,3,1
c,4,1
z,5,0
Import your text file as a CSV file using Import-Csv with given column names (-Header), which parses the lines into objects with the column names as property names.
Then use Group-Object to group the objects by shared .Letter values, i.e. the letter that is in the first field of each line.
Using ForEach-Object, process each group of objects (lines), and output a single string that contains the shared letter and the sum of all .Number property values across the objects that make up the group, obtained via Measure-Object -Sum:
@'
a,1
a,1
b,3
c,4
z,5
'@ > ./test.txt
Import-Csv -Header Letter, Number test.txt |
Group-Object Letter |
ForEach-Object {
'{0},{1}' -f $_.Name, ($_.Group | Measure-Object -Sum -Property Number ).Sum
}
Note: The above OOP approach is flexible, but potentially slow.
Here's a plain-text alternative that will likely perform better:
Get-Content test.txt |
Group-Object { ($_ -split ',')[0] } |
ForEach-Object {
'{0},{1}' -f $_.Name, ($_.Group -replace '^.+,' | Measure-Object -Sum).Sum
}
See also:
-split, the string splitting operator
-replace, the regular-expression-based string replacement operator
Grouping with summing of multiple fields:
It is easy to extend the OOP approach: add another header field to name the additional column, and add another output field that sums that added column's values for each group too:
@'
a,1,10
a,1,10
b,3,30
'@ > ./test.txt
Import-Csv -Header Letter, NumberA, NumberB test.txt |
Group-Object Letter |
ForEach-Object {
'{0},{1},{2}' -f $_.Name,
($_.Group | Measure-Object -Sum -Property NumberA).Sum,
($_.Group | Measure-Object -Sum -Property NumberB).Sum
}
Output (note the values in the a line):
a,2,20
b,3,30
Extending the plain-text approach requires a bit more work:
@'
a,1,10
a,1,10
b,3,30
c,4,40
z,5,50
'@ > ./test.txt
Get-Content test.txt |
Group-Object { ($_ -split ',')[0] } |
ForEach-Object {
'{0},{1},{2}' -f $_.Name,
($_.Group.ForEach({ ($_ -split ',')[1] }) | Measure-Object -Sum).Sum,
($_.Group.ForEach({ ($_ -split ',')[2] }) | Measure-Object -Sum).Sum
}
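As a further alternative to both approaches above, the sums can be accumulated in a single pass with a dictionary instead of `Group-Object`. This is only a sketch of a swapped-in technique, not part of the original answer; the `$lines` array is a hypothetical stand-in for `Get-Content test.txt`.

```powershell
# Hypothetical stand-in for: Get-Content test.txt
$lines = 'a,1', 'a,1', 'b,3', 'c,4', 'z,5'

# An ordered dictionary keeps the letters in first-seen order.
$sums = [ordered]@{}
foreach ($line in $lines) {
    $key, $value = $line.Trim() -split ',', 2
    # A missing key yields $null, which [int] converts to 0.
    $sums[$key] = [int]$sums[$key] + [int]$value
}

$result = foreach ($key in $sums.Keys) {
    '{0},{1}' -f $key, $sums[$key]
}
$result
```

For the sample input this yields the four lines `a,2`, `b,3`, `c,4`, `z,5`.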
One way to go about this is using Group-Object for the count, and then replacing the current number after the comma with the count.
$filename = 'test.txt'
Get-Content $filename | Group-Object |
ForEach-Object -Process {
if ($_.Count -ne 1) {
$_.Name -replace '\d',$_.Count
}
else {
$_.Name
}
} | ConvertFrom-Csv -Header 'Alphabet','Value'
I have a question. I have a script that gets every user's username, sums the sizes of their PST files, and outputs a table of all users with the total size of their PST files in GB.
$Gesamt = Get-ChildItem $Verzeichniss |Select-Object Name,@{Name='TotalSizeInGB';Expression={ (Get-ChildItem -Path $Verzeichniss$($_.Name)\ , $Verzeichniss$($_.Name)\Archiv\ , $Verzeichniss$($_.Name)\Outlook\ -Filter *$($FileType) | Measure Length -Sum).Sum / 1GB}} | Sort-Object -Property TotalSizeInGB -Descending | Select-Object -Property Name,TotalSizeInGB -First 30
My problem is that the size of the PST files has about 10 digits after the decimal point, and I don't know how to limit it to 2 digits after the decimal point. Do you have any ideas?
Try using Round, like so:
$a = 111.2226
[math]::Round($a,2)
will give you: 111.22 and
[math]::Round($a)
will give you: 111
Solution:
$Gesamt = Get-ChildItem $Verzeichniss |Select-Object Name,@{Name='TotalSizeInGB';Expression={ [math]::Round(((Get-ChildItem -Path $Verzeichniss$($_.Name)\ , $Verzeichniss$($_.Name)\Archiv\ , $Verzeichniss$($_.Name)\Outlook\ -Filter *$($FileType) | Measure Length -Sum).Sum / 1GB),2)}} | Sort-Object -Property TotalSizeInGB -Descending | Select-Object -Property Name,TotalSizeInGB -First 30
Modify the query accordingly.
ref: https://devblogs.microsoft.com/scripting/powertip-use-powershell-to-round-to-specific-decimal-place/
If this is a display problem, then you can use string formatting to get you there
$Gesamt = Get-ChildItem $Verzeichniss |Select-Object Name,@{Name='TotalSizeInGB';Expression={ (Get-ChildItem -Path $Verzeichniss$($_.Name)\ , $Verzeichniss$($_.Name)\Archiv\ , $Verzeichniss$($_.Name)\Outlook\ -Filter *$($FileType) | Measure Length -Sum).Sum / 1GB}} | Sort-Object -Property TotalSizeInGB -Descending | Select-Object -Property Name,TotalSizeInGB -First 30
Output results using format string specifying digits
$Gesamt | ForEach-Object { Write-Output ( 'Name: {0}; Total Size(GB): {1:F2}' -f $_.Name, $_.TotalSizeInGB ) }
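The two suggestions above differ in what they return, which matters if the value is sorted or calculated with afterwards. A small sketch, assuming a hypothetical byte count:

```powershell
# Hypothetical total size in bytes.
$totalBytes = 1234567890
$sizeInGB   = $totalBytes / 1GB

# [math]::Round returns a number, so sorting and arithmetic still work.
$rounded = [math]::Round($sizeInGB, 2)

# The -f operator returns a string; the decimal separator follows the
# current culture (for example a comma under de-DE).
$formatted = '{0:F2}' -f $sizeInGB

$rounded.GetType().Name     # Double
$formatted.GetType().Name   # String
```

Rounding inside the calculated property keeps `TotalSizeInGB` numeric for `Sort-Object`; formatting is better left to the final display step.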
I have a .csv file which I'm grouping on two properties 'DN', 'SKU' and then performing a sum on another property 'DeliQty'
This works fine and the sum is reflected back to the group.
However, I then need to regroup just on 'DN' and write out to separate files.
I've tried Select-Object -Expand Group, but this reverts to the original contents without the summed lines.
Is there a way to ungroup while preserving the summed lines and then group again?
$CSVFiles = Get-ChildItem -Path C:\Scripts\INVENTORY\ASN\IMPORT\ -Filter *.csv
foreach ($csv in $CSVFiles) {
$group = Import-Csv $csv.FullName | Group-Object DN, SKU
$group | Where Count -gt 1 | ForEach-Object {
$_.Group[0].'DeliQty' = ($_.Group | Measure-Object DeliQty -Sum).Sum
}
}
You may do the following:
$CSVFiles = Get-ChildItem -Path C:\Scripts\INVENTORY\ASN\IMPORT\ -Filter *.csv
foreach ($csv in $CSVFiles) {
$group = Import-Csv $csv.FullName | Group-Object DN, SKU | Foreach-Object {
if ($_.Count -gt 1) {
$_.Group[0].DeliQty = ($_.Group | Measure-Object DeliQty -Sum).Sum
}
$_.Group[0]
}
# outputs summed and single group objects
$group
}
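The code above stops at producing the summed rows. One possible way to finish the task from the question (regrouping on DN and writing one file per DN) is sketched below. The sample rows, the temporary output folder, and the `<DN>.csv` file naming are all assumptions made for illustration.

```powershell
# Hypothetical sample data standing in for one imported CSV file.
$rows = @'
DN,SKU,DeliQty
1001,AAA,1
1001,AAA,2
1001,BBB,5
1002,AAA,4
'@ | ConvertFrom-Csv

# Step 1: collapse each DN/SKU group into one summed row.
$summed = $rows | Group-Object DN, SKU | ForEach-Object {
    if ($_.Count -gt 1) {
        $_.Group[0].DeliQty = ($_.Group | Measure-Object DeliQty -Sum).Sum
    }
    $_.Group[0]
}

# Step 2: regroup the summed rows on DN only and write one file per DN.
# The output folder is a hypothetical location under the temp directory.
$outDir = Join-Path ([System.IO.Path]::GetTempPath()) 'asn-by-dn'
New-Item -ItemType Directory -Path $outDir -Force | Out-Null

$summed | Group-Object DN | ForEach-Object {
    $path = Join-Path $outDir "$($_.Name).csv"
    $_.Group | Export-Csv -Path $path -NoTypeInformation
}
```

Because step 1 emits plain row objects rather than group objects, the second `Group-Object` sees the summed rows, not the original ones.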
Here is what I am trying to do:
Search my computer for files ending with a .doc, .docx, .xls, or .xlsx
Output the filenames and sizes (in groups by file extension) to a text file named “File_Summary.txt”.
I also want the total of the number of files and total file size for each file extension listed in the output.
I can't even get past the check folder part:
$Folder_To_Check = C:\AIU
$Report_File_Location = "File_Summary.txt"
$files= Get-Childitem -Path $Folder_To_Check-Include *doc, *docx, *xls, *xlsx $Report_File_Location
$totalfiles = ($files | Measure-Object).Count
$totalsize = ($files | Measure-Object -Sum Length).Sum
Update. Here is my code again with some changes I made from the suggestions, but I'm still coming up empty.
$Report_File_Location = "File_Summary.txt"
$files= Get-Childitem C:\AIU -include "*doc", "*docx", "*xls", "*xlsx"-recurse | Sort-Object | Get-Unique -asString
$files | Out-File $Report_File_Location
$totalfiles = ($files | Measure-Object).Count
$totalsize = ($files | Measure-Object -Sum Length).Sum
write-host "totalfiles: $totalfiles"
write-host "totalsize: $totalsize"
The more I look into this, the more I think I shouldn't use Sort-Object but rather Group Extension -NoElement | Sort Count -Descending. Would that give me the total number of files for each type?
UPDATE
Thanks to help of people here I got my code to work. But I had an issue where it was saying that my file didn't exist. The problem? I needed to list the entire folder path and use SINGLE quotes.
This code works:
$Folder_To_Check = 'C:\Users\Sarah\Documents\AIU'
$Report_File_Location = "File_Summary.txt"
$results = Get-ChildItem $Folder_To_Check -Include *.doc,*.docx,*.xls,*.xlsx -Recurse
$results | Group-Object extension | ForEach-Object {
[PSCustomObject]@{
Results = $_.Name
Count = $_.Count
Size = [Math]::Round(($_.Group | Measure-Object -Sum Length | Select-Object -ExpandProperty Sum) / 1MB,2)
}
} | Out-File $Report_File_Location -Append
BIG props to Matt for helping me organize my results so nice. Thank you for helping me learn.
$Folder_To_Check = 'C:\AIU'
$Report_File_Location = "File_Summary.txt"
$results = Get-ChildItem $Folder_To_Check -Include *.doc,*.docx,*.xls,*.xlsx -Recurse
$results | Group-Object extension | ForEach-Object {
[PSCustomObject]@{
Extension = $_.Name
Count = $_.Count
Size = [Math]::Round(($_.Group | Measure-Object -Sum Length | Select-Object -ExpandProperty Sum) / 1MB,2)
}
} | Out-File $Report_File_Location -Append
Get all of the files you are looking for with Get-ChildItem, much like you were. Vasja also mentioned that you might want to use -Recurse to get results from subdirectories as well. Use Group-Object to collect the files by extension. For each collection, output a custom object with the extension and file count, which both come from Group-Object, and the size of all the files of that particular extension converted to MB and rounded to 2 decimal places.
Update for 2.0
In case you only have 2.0 installed, I wanted to provide an answer that works for that.
$results | Group-Object extension | ForEach-Object {
$properties = @{
Extension = $_.Name
Count = $_.Count
Size = [Math]::Round(($_.Group | Measure-Object -Sum Length | Select-Object -ExpandProperty Sum) / 1MB,2)
}
New-Object -TypeName PSObject -Property $properties
}
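The question also asked for the individual file names and sizes listed under each extension, which the summary objects above do not include. A minimal sketch of one way to produce such a report follows. It assumes a hypothetical demo folder with three small files under the temp directory, and the report line layout is made up.

```powershell
# Hypothetical demo folder and files so the sketch is self-contained.
$folder = Join-Path ([System.IO.Path]::GetTempPath()) 'file-summary-demo'
New-Item -ItemType Directory -Path $folder -Force | Out-Null
Set-Content -Path (Join-Path $folder 'a.doc')  -Value 'one'
Set-Content -Path (Join-Path $folder 'b.doc')  -Value 'two'
Set-Content -Path (Join-Path $folder 'c.xlsx') -Value 'three'

$report = Get-ChildItem -Path $folder -Recurse -Include *.doc, *.docx, *.xls, *.xlsx |
    Group-Object Extension |
    ForEach-Object {
        # Header line for the extension, then one line per file, then totals.
        "Extension: $($_.Name)"
        $_.Group | ForEach-Object { '  {0}  {1} bytes' -f $_.Name, $_.Length }
        '  Total files: {0}, total bytes: {1}' -f $_.Count,
            ($_.Group | Measure-Object Length -Sum).Sum
    }

$report | Set-Content -Path (Join-Path $folder 'File_Summary.txt')
```

Each group contributes a header line, one line per file, and a totals line, so the report stays readable as plain text.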
Added some quotes.
Also you probably want -Recurse on Get-Childitem
$Folder_To_Check = "C:\AIU"
$Report_File_Location = "E:\tmp\File_Summary.txt"
$files = Get-Childitem -Path $Folder_To_Check -Include *doc, *docx, *xls, *xlsx -Recurse
$files | Out-File $Report_File_Location
$totalfiles = ($files | Measure-Object).Count
$totalsize = ($files | Measure-Object -Sum Length).Sum
write-host "totalfiles: $totalfiles"
write-host "totalsize: $totalsize"
Yep, you need a collection of strings for the -Include argument. So, what you tried is one string, that being:
"*doc, *docx, *xls, *xlsx"
While the commas do need to separate the extensions, when you include them within the quotes it thinks they are part of the one thing to include, so it's seriously looking for files that have anything (as per the asterisk) then "doc," then anything then "docx," then anything then... you see where I'm going. It thinks it has to include all of that. Instead you need a collection of strings like:
-Include "*doc","*docx","*xls","*xlsx"
I hope that helps. Here's your line modified to what should work:
$files= Get-Childitem -Path $Folder_To_Check -Include "*doc", "*docx", "*xls", "*xlsx"