I have a CSV File called Products.csv
Product_ID,Category
1,A
2,A
3,A
4,B
I want a PowerShell script that shows the unique categories along with the count of each, and exports the result to CSV.
i.e.
A,3
B,1
I have used the following code to extract the Unique Categories, but cannot get the Count:
Import-Csv Products.csv -DeLimiter ","|
Select 'Category' -Unique |
Export-Csv Summary.csv -DeLimiter "," -NoTypeInformation
Can anyone help me out?
Thanks.
You can use Group-Object to get the count.
Import-Csv Products.csv -DeLimiter "," |
Group-Object Category | Foreach-Object {
"{0},{1}" -f $_.Name,$_.Count
}
If you want CSV output of the counts, your data needs headers. Group-Object emits objects with a Name property (the grouped property value) and a Count property (the number of items in that group).
Import-Csv Products.csv -DeLimiter "," |
Group-Object Category | Select-Object Name,Count |
Export-Csv Summary.csv -Delimiter ',' -NoType
You can take the above code a step further and use Select-Object's calculated properties. Then you can create custom named columns and/or values with expressions.
Import-Csv Products.csv -DeLimiter "," |
Group-Object Category |
Select-Object @{n='Product_ID';e={$_.Name}},Count |
Export-Csv Summary.csv -Delimiter ',' -NoType
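To see what Group-Object hands down the pipeline before anything is exported, here is a minimal self-contained sketch. It inlines the sample rows from the question instead of reading Products.csv. The -NoElement switch and the column name Category are my own choices for illustration, not part of the code above.

```powershell
# Sample data from the question, inlined so the sketch needs no file on disk.
$products = @'
Product_ID,Category
1,A
2,A
3,A
4,B
'@ | ConvertFrom-Csv

# -NoElement drops the grouped rows and keeps only Name and Count.
# The calculated property renames Name to Category (an assumed column name).
$summary = $products |
    Group-Object -Property Category -NoElement |
    Select-Object @{ Name = 'Category'; Expression = { $_.Name } }, Count

# Print the CSV text instead of writing Summary.csv.
$summary | ConvertTo-Csv -NoTypeInformation
```

Swapping the last line for Export-Csv Summary.csv -NoTypeInformation writes the same rows to a file.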
I have a source CSV file (no header, all columns delimited by a comma) which I am trying to split into separate CSV files based on the value in the first column, using that value as the output file name.
Input file:
S00000009,2016,M04 01/07/2016,0.00,0.00,0.00,0.00,0.00,0.00,750.00,0.00,0.00
S00000009,2016,M05 01/08/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00
S00000009,2016,M06 01/09/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00
S00000010,2015,W28 05/10/2015,2275.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
S00000010,2015,W41 04/01/2016,0.00,0.00,0.00,0.00,0.00,0.00,568.75,0.00,0.00
S00000010,2015,W42 11/01/2016,0.00,0.00,0.00,0.00,0.00,0.00,568.75,0.00,0.00
S00000012,2015,W10 01/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
S00000012,2015,W11 08/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
S00000012,2015,W12 15/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
My PowerShell script looks like this:
Import-Csv INPUT_FILE.csv -Header service_id,year,period,cash_exp,cash_inc,cash_def,act_exp,act_inc,act_def,comm_exp,comm_inc,comm_def |
Group-Object -Property "service_id" |
Foreach-Object {
$path = $_.Name + ".csv";
$_.group | Export-Csv -Path $path -NoTypeInformation
}
Output files:
S00000009.csv:
"service_id","year","period","cash_exp","cash_inc","cash_def","act_exp","act_inc","act_def","comm_exp","comm_inc","comm_def"
"S00000009","2016","M04 01/07/2016","0.00","0.00","0.00","0.00","0.00","0.00","750.00","0.00","0.00"
"S00000009","2016","M05 01/08/2016","0.00","0.00","0.00","0.00","0.00","0.00","600.00","0.00","0.00"
"S00000009","2016","M06 01/09/2016","0.00","0.00","0.00","0.00","0.00","0.00","600.00","0.00","0.00"
S00000010.csv:
"service_id","year","period","cash_exp","cash_inc","cash_def","act_exp","act_inc","act_def","comm_exp","comm_inc","comm_def"
"S00000010","2015","W28 05/10/2015","2275.00","0.00","0.00","0.00","0.00","0.00","0.00","0.00","0.00"
"S00000010","2015","W41 04/01/2016","0.00","0.00","0.00","0.00","0.00","0.00","568.75","0.00","0.00"
"S00000010","2015","W42 11/01/2016","0.00","0.00","0.00","0.00","0.00","0.00","568.75","0.00","0.00"
It is generating the new files using the header value in column 1 (service_id).
There are 2 problems.
The output CSV file contains a header row which I don't need.
The columns are enclosed with double quotes which I don't need.
First of all, a CSV file normally has a header row, and Export-Csv quotes every field as part of the CSV structure. If you don't want either, you can treat the output as plain text instead:
$temp = Import-Csv INPUT_FILE.csv -Header service_id,year,period,cash_exp,cash_inc,cash_def,act_exp,act_inc,act_def,comm_exp,comm_inc,comm_def | Group-Object -Property "service_id" |
Foreach-Object {
$path=$_.name+".csv"
$temp0 = $_.group | ConvertTo-Csv -NoTypeInformation | Select-Object -Skip 1
$temp1 = $temp0.replace("""","")
$temp1 > $path
}
But this output is not a "real" csv file.
Hope that helps.
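If you happen to be on PowerShell 7.0 or later, ConvertTo-Csv and Export-Csv accept a -UseQuotes parameter, which avoids the string replacement. The following is only a sketch under that assumption. It uses two shortened sample rows (four of the twelve columns) and writes to a scratch folder whose name, split-demo, is made up for the example.

```powershell
# Assumption: PowerShell 7.0+, where ConvertTo-Csv has -UseQuotes.
# Two shortened sample rows keep the sketch small.
$rows = @(
    'S00000009,2016,M04 01/07/2016,750.00'
    'S00000010,2015,W28 05/10/2015,2275.00'
) | ConvertFrom-Csv -Header service_id, year, period, comm_exp

# Hypothetical scratch folder so the demo does not write into the current directory.
$outDir = Join-Path ([System.IO.Path]::GetTempPath()) 'split-demo'
New-Item -ItemType Directory -Path $outDir -Force | Out-Null

$rows | Group-Object -Property service_id | ForEach-Object {
    $path = Join-Path $outDir ($_.Name + '.csv')
    # -UseQuotes Never leaves the fields unquoted; -Skip 1 drops the header row.
    $_.Group |
        ConvertTo-Csv -UseQuotes Never |
        Select-Object -Skip 1 |
        Set-Content -Path $path
}

Get-Content (Join-Path $outDir 'S00000009.csv')
```

On Windows PowerShell 5.1 the -UseQuotes parameter is not available, so the replace-based approach is still needed there.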
For your particular scenario you could probably use a simpler approach. Read the input file as a plain text file, group the lines by splitting off the first field, then write the groups to output files named after the groups:
Get-Content 'INPUT_FILE.csv' |
Group-Object { $_.Split(',')[0] } |
ForEach-Object { $_.Group | Set-Content ($_.Name + '.csv') }
Another solution, which uses plain numbers as headers (they aren't wanted in the output anyway), avoids unnecessary temporary files, and removes only the field-delimiting double quotes:
Import-Csv INPUT_FILE.csv -Header (1..12) |
Group-Object -Property "1" | Foreach-Object {
($_.Group | ConvertTo-Csv -NoType | Select-Object -Skip 1).Trim('"') -replace '","',',' |
Set-Content -Path ("{0}.csv" -f $_.Name)
}
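To make the quote handling in that pipeline easier to follow, here is the same Trim/-replace step applied to a single line. This is just an illustration on a made-up shortened row. Fields that are empty at the start or end of a line, or that contain quotes or commas themselves, would need more care than this.

```powershell
# One line as ConvertTo-Csv emits it: every field wrapped in double quotes.
$line = '"S00000009","2016","M04 01/07/2016","0.00"'

# Trim('"') removes the outermost quotes; -replace turns each "," separator into a bare comma.
$clean = $line.Trim('"') -replace '","', ','
$clean   # S00000009,2016,M04 01/07/2016,0.00
```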
I have a series of files that have changed some header naming and column counts over time. However, the files always have the first column as the start date and second column as the end date.
I would like to get just these two columns, but the name has changed over time.
What I have tried is this:
$FileContents=Import-CSV -Path "$InputFilePath"
foreach ($line in $FileContents)
{
$StartDate=$line[0]
$EndDate=$line[1]
}
...but $FileContents is (I believe) an array of a type (objects?) that I'm not sure how to positionally access in PowerShell. Any help would be appreciated.
Edit: The files switched from comma delimiter to pipe delimiter a while back and there are 1000s of files to work with, so I use Import-CSV because it can implicitly read either format.
You could use the -Header parameter to give the first two columns of the CSV the header names you want, and then skip the first line, which holds the old header.
$FileContents = Import-CSV -Path "$InputFilePath" -Header "StartDate","EndDate" | Select-Object "StartDate","EndDate" -Skip 1
foreach ($line in $FileContents) {
$StartDate = $line.StartDate
$EndDate = $line.EndDate
}
Here's an example:
Example.csv
a,b,c
1,2,3
4,5,6
Import-CSV -Path Example.csv -Header "StartDate","EndDate" | Select-Object "StartDate","EndDate" -Skip 1
StartDate EndDate
--------- -------
1         2
4         5
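Regarding the edit in the question: as far as I know, Import-Csv defaults to a comma and needs -Delimiter for pipe-delimited files, so with mixed files you may have to pick the delimiter per file. Below is a rough sketch of that idea. The function name Get-DateColumns and the demo file are made up for illustration, and it assumes a pipe never appears in the header of a comma-delimited file.

```powershell
# Hypothetical helper: returns StartDate/EndDate from a comma- or pipe-delimited file.
function Get-DateColumns {
    param([Parameter(Mandatory)][string]$Path)

    # Look only at the header line to decide which delimiter the file uses.
    $firstLine = Get-Content -Path $Path -TotalCount 1
    $delimiter = if ($firstLine.Contains('|')) { '|' } else { ',' }

    # Rename the first two columns, then skip the row holding the old header.
    Import-Csv -Path $Path -Delimiter $delimiter -Header StartDate, EndDate |
        Select-Object -Property StartDate, EndDate -Skip 1
}

# Demo with a throwaway pipe-delimited file.
$demo = Join-Path ([System.IO.Path]::GetTempPath()) 'dates-demo.csv'
Set-Content -Path $demo -Value 'From|To|Note', '2016-01-01|2016-01-31|x'
$dates = @(Get-DateColumns -Path $demo)
$dates
```

To run it over many files, something like Get-ChildItem *.csv | ForEach-Object { Get-DateColumns -Path $_.FullName } should do.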
If you use Import-Csv, PowerShell will indeed create objects for you. The "columns" are called properties, and you can select them with Select-Object, but you have to name the properties you want. Since you don't know the property names in advance, you can read them from the first object. Get-Member would list them too, but it sorts the names alphabetically, so its first two entries are not necessarily the first two columns; PSObject.Properties keeps the column order.
Use the following sample code and apply it to your script:
$csv = @'
1,2,3,4,5
a,b,c,d,e
g,h,i,j,k
'@
$csv = $csv | ConvertFrom-Csv
# PSObject.Properties preserves column order (Get-Member sorts names alphabetically)
$properties = $csv[0].PSObject.Properties.Name | Select-Object -First 2
$csv | Select-Object -Property $properties
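If all you need is the values by position, you can also index into each row's property values directly, which is close to what the question tried with $line[0]. A small sketch follows. The header names Begin, Finish and Other are invented sample data.

```powershell
# Sample rows with arbitrary header names, inlined for the demo.
$rows = @'
Begin,Finish,Other
2016-01-01,2016-01-31,x
2016-02-01,2016-02-29,y
'@ | ConvertFrom-Csv

$pairs = foreach ($row in $rows) {
    # PSObject.Properties lists the columns in file order, so index 0 and 1
    # are the first and second column whatever they are called.
    $values = @($row.PSObject.Properties.Value)
    $StartDate = $values[0]
    $EndDate   = $values[1]
    "$StartDate -> $EndDate"
}
$pairs
```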
How about this:
$FileContents = Get-Content -Path "$InputFilePath"
# Start at 1 so the header row is skipped
for ($i = 1; $i -lt $FileContents.Count; $i++) {
    # Splits on a comma; use "|" for the pipe-delimited files
    $textrow = ($FileContents[$i]).Split(",")
    $StartDate = $textrow[0]
    $EndDate = $textrow[1]
    # Do what you want with the variables
    Write-Host $StartDate
    Write-Host $EndDate
}
This assumes you are referencing a CSV file.
Another solution, with ForEach-Object (% is its alias) and -split:
Get-Content "example.csv" | select -skip 1 | %{$row=$_ -split ',', 3; [pscustomobject]@{NewCol1=$row[0];NewCol2=$row[1]}}
You can also build the split into the select as calculated properties, like this:
Get-Content "example.csv" | select @{N="Newcol1";E={($_ -split ',', 3)[0]}}, @{N="Newcol2";E={($_ -split ',', 3)[1]}} -skip 1
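In case the second argument to -split looks odd: the 3 caps the number of pieces, so everything after the second comma stays together in the last element. A tiny illustration on a made-up line:

```powershell
# A sample line whose third column itself contains commas.
$line = '2016-01-01,2016-01-31,free text, with commas'

# Split into at most 3 pieces; the remainder is kept whole in the last one.
$row = $line -split ',', 3
$row[0]   # 2016-01-01
$row[1]   # 2016-01-31
$row[2]   # free text, with commas
```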
With ConvertFrom-Csv:
Get-Content "example.csv" | ConvertFrom-Csv -Delimiter ',' -Header col1, col2 | select -skip 1