Get Unique Column and Count from CSV file in Powershell - powershell

I have a CSV File called Products.csv
Product_ID,Category
1,A
2,A
3,A
4,B
I want a PowerShell script that shows the unique categories along with their counts and exports the result to CSV.
i.e.
A,3
B,1
I have used the following code to extract the Unique Categories, but cannot get the Count:
Import-Csv Products.csv -DeLimiter ","|
Select 'Category' -Unique |
Export-Csv Summary.csv -DeLimiter "," -NoTypeInformation
Can anyone help me out?
Thanks.

You can use Group-Object to get the count.
Import-Csv Products.csv -DeLimiter "," |
Group-Object Category | Foreach-Object {
"{0},{1}" -f $_.Name,$_.Count
}
If you want CSV output of the counts, your data needs headers. Group-Object outputs a Name property, which holds the grouped property value, and a Count property, which holds the number of items in that group.
Import-Csv Products.csv -DeLimiter "," |
Group-Object Category | Select-Object Name,Count |
Export-Csv Summary.csv -Delimiter ',' -NoType
You can take the above code a step further and use Select-Object's calculated properties. Then you can create custom named columns and/or values with expressions.
Import-Csv Products.csv -DeLimiter "," |
Group-Object Category |
Select-Object @{n='Category';e={$_.Name}},Count |
Export-Csv Summary.csv -Delimiter ',' -NoType
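To try the grouping without touching the file system, the same pipeline can be run against in-memory data. This is only a minimal sketch: it assumes the sample rows from the question, and the variable names $rows and $summary are made up for illustration.

```powershell
# Sample data from the question, parsed in memory instead of read from Products.csv
$rows = @'
Product_ID,Category
1,A
2,A
3,A
4,B
'@ | ConvertFrom-Csv

# Group by Category, then rename the group's Name property to Category
$summary = $rows |
    Group-Object Category |
    Select-Object @{n='Category';e={$_.Name}}, Count

# Render as CSV text; use Export-Csv Summary.csv -NoTypeInformation to write a file instead
$summary | ConvertTo-Csv -NoTypeInformation
```

ConvertTo-Csv is used here only so the result is visible on the console; the grouping and the calculated property are the same as in the file-based version.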

Related

Select-Object not selecting object from csv-file

I'm trying to import some CSV files to work with them further and export them at the end. They all have two header lines, of which I only need the second. I also need to drop most columns and keep only a few. Unfortunately, it seems you have to choose between skipping rows with Get-Content and excluding columns with Import-Csv. Neither can do both, so I came up with a workaround:
$out="bla\bla\out.csv"
$in="bla\bla\in.csv"
$header= (get-content $in -TotalCount 2 )[-1]
$out = Import-csv $in -Header $header -Delimiter ";"|select column1 | Export-Csv -Path $out -NoTypeInformation
This returns an empty CSV containing only the header name column1. What am I doing wrong?
Edit:
The input csv looks like:
filename;filename;...
column1;column2;...
1;a;...
2;b;...
...
I guess that -Header can't read arrays without single quotation marks, so I'm currently trying to find a solution to that.
If you know the name of the header you want to filter on, the following should do the trick and only requires reading the file once:
$out = "out.csv"
$in = "in.csv"
Get-Content $in | Select-Object -Skip 1 |
ConvertFrom-Csv -Delimiter ';' | Select-Object column1 |
Export-Csv $out -NoTypeInformation
If, however, you don't know the name of the header you need to filter on (column1 in the example above) but you know it's the first column, an extra step is required:
$csv = Get-Content $in | Select-Object -Skip 1 | ConvertFrom-Csv -Delimiter ';'
$csv | Select-Object $csv[0].PSObject.Properties.Name[0] | Export-Csv $out -NoTypeInformation
We take the first object of the object array ($csv[0]), read its properties through PSObject.Properties, and select the name of the first property (.Name[0], which is column1 in this case).
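The first-column lookup can be checked against in-memory lines. The sketch below is a hedged example: the $lines array is a made-up stand-in for in.csv, and $firstColumn and $result are illustrative names. It also wraps the property names in @(), because with a single-column file .Name would be a plain string and [0] would then return its first character rather than the column name.

```powershell
# In-memory stand-in for in.csv: the first line is the unwanted header row
$lines = 'filename;filename', 'column1;column2', '1;a', '2;b'

# Skip the first line so that the second one becomes the header
$csv = $lines | Select-Object -Skip 1 | ConvertFrom-Csv -Delimiter ';'

# Wrap in @() so a one-column file still yields an array rather than a single string
$firstColumn = @($csv[0].PSObject.Properties.Name)[0]

# Keep only the first column and render it as CSV text
$result = $csv | Select-Object $firstColumn
$result | ConvertTo-Csv -NoTypeInformation
```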

export csv rows where duplicate values found in column

I have a csv file where I am trying to export rows into another csv file only where the values in the id column have duplicates.
I have the following csv file...
"id","blablah"
"valOne","valTwo"
"valOne","asdfdsa"
"valThree","valFour"
"valFive","valSix"
"valFive","qwreweq"
"valSeven","valEight"
I need the output csv file to look like the following...
"valOne","valTwo"
"valOne","asdfdsa"
"valFive","valSix"
"valFive","qwreweq"
Here is the code I have so far:
$inputCsv = Import-CSV './test.csv' -delimiter ","
#$output = @()
$inputCsv | Group-Object -prop id, blablah | Where-Object {$_.id -gt 1} |
Select-Object
#@{n='id';e={$_.Group[0].id}},
#@{n='blablah';e={$_.Group[0].blablah}}
#Export-Csv 'C:\scripts\powershell\output.csv' -NoTypeInformation
#Write-Host $output
#$output | Export-Csv 'C:\scripts\powershell\output.csv' -NoTypeInformation
I've searched multiple how-tos but can't seem to find the right syntax. Can anyone help with this?
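Group on the id property alone: grouping on both id and blablah, as in your attempt, puts every row in its own group. Any group whose Count is greater than 1 holds duplicates, so expand those groups and export them.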
$inputCsv = Import-CSV './test.csv' -delimiter ","
$inputCsv |
Group-Object -Property ID |
Where-Object count -gt 1 |
Select-Object -ExpandProperty group |
Export-Csv output.csv -NoTypeInformation
output.csv will contain
"id","blablah"
"valOne","valTwo"
"valOne","asdfdsa"
"valFive","valSix"
"valFive","qwreweq"
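The same filter can be tried on in-memory data. This is a small sketch that assumes the sample rows from the question; $rows and $duplicates are illustrative names.

```powershell
# Sample rows from the question, parsed in memory
$rows = @'
"id","blablah"
"valOne","valTwo"
"valOne","asdfdsa"
"valThree","valFour"
"valFive","valSix"
"valFive","qwreweq"
"valSeven","valEight"
'@ | ConvertFrom-Csv

# Keep only groups with more than one member, then unwrap them back into rows
$duplicates = $rows |
    Group-Object -Property id |
    Where-Object { $_.Count -gt 1 } |
    Select-Object -ExpandProperty Group

$duplicates | ConvertTo-Csv -NoTypeInformation
```

Select-Object -ExpandProperty Group is what turns each surviving group back into its original rows, so both columns are preserved.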

Getting only a repeating files from directory and subdirectories

I'm trying to write a script that finds non-unique files.
The script should take one .csv file listing each file's Name, LastWriteTime and Length. From it I want to build another .csv containing only those records whose combination of Name+Length+LastWriteTime is NON-unique.
I tried the following script, which uses $csvfile containing the file list:
$csvdata = Import-Csv -Path $csvfile -Delimiter '|'
$csvdata |
Group-Object -Property Name, LastWriteTime, Length |
Where-Object -FilterScript { $_.Count -gt 1 } |
Select-Object -ExpandProperty Group -Unique |
Export-Csv $csvfile2 -Delimiter '|' -NoTypeInformation -Encoding Unicode
$csvfile was created by:
{
Get-ChildItem -Path $mainFolderPath -Recurse -File |
Sort-Object $sortMode |
Select-Object Name, LastWriteTime, Length, Directory |
Export-Csv $csvfile -Delimiter '|' -NoTypeInformation -Encoding Unicode
}
(Get-Content $csvfile) |
ForEach-Object { $_ -replace '"' } |
Out-File $csvfile -Encoding Unicode
But somehow $csvfile2 contains only one (the first) non-unique record. Does anyone have an idea how to improve the script so that it lists all non-unique records?
You need to use -Property * -Unique to get a list of unique objects. However, you cannot use -Property and -ExpandProperty at the same time here, because you want the latter parameter to apply to the input objects ($_) and the former parameter to apply to an already expanded property of those input objects ($_.Group).
Expand the property Group first, then select the unique objects:
... |
Select-Object -ExpandProperty Group |
Select-Object -Property * -Unique |
...
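Put together, the two-step selection might look like the sketch below. The file list is invented sample data (names, timestamps, and directories are all hypothetical), and $files and $nonUnique are illustrative variable names.

```powershell
# Hypothetical file list in the same pipe-delimited layout as $csvfile
$files = @'
Name|LastWriteTime|Length|Directory
a.txt|2020-01-01 10:00:00|100|C:\dir1
a.txt|2020-01-01 10:00:00|100|C:\dir2
b.txt|2020-01-02 11:00:00|200|C:\dir1
'@ | ConvertFrom-Csv -Delimiter '|'

$nonUnique = $files |
    Group-Object -Property Name, LastWriteTime, Length |
    Where-Object { $_.Count -gt 1 } |
    # First unwrap the groups back into the original records ...
    Select-Object -ExpandProperty Group |
    # ... then drop only records that match in every property
    Select-Object -Property * -Unique

$nonUnique | ConvertTo-Csv -Delimiter '|' -NoTypeInformation
```

With this data both a.txt records survive, because they differ in Directory; only records that are identical in every column would be collapsed into one.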

Delete line in CSV if two columns not equal

I have big CSV files; here is an example of the content:
Name;Number;Type;AlterName
Prag;1418;2;2012;Prag
Prag;1836;3;2012;Prag
Prag;1836;514;2012;Moscow
...
I need to delete every line where Name and AlterName are not equal.
In this case:
Prag;1836;514;2012;Moscow
Simply check if the fields are equal.
Import-Csv 'C:\path\to\input.csv' -Delimiter ';' |
Where-Object { $_.Name -eq $_.AlterName } |
Export-Csv 'C:\path\to\output.csv' -Delimiter ';' -NoType
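Note that the sample in the question lists four header names but five fields per row, so the comparison only behaves as intended if the header names every field. The sketch below assumes such a header; the Year column name is hypothetical, as are the variable names $rows and $kept.

```powershell
# Sample rows from the question, with an assumed five-column header
# ("Year" is a made-up name for the unnamed fourth field)
$rows = @'
Name;Number;Type;Year;AlterName
Prag;1418;2;2012;Prag
Prag;1836;3;2012;Prag
Prag;1836;514;2012;Moscow
'@ | ConvertFrom-Csv -Delimiter ';'

# Keep only rows whose Name and AlterName match
$kept = $rows | Where-Object { $_.Name -eq $_.AlterName }

$kept | ConvertTo-Csv -Delimiter ';' -NoTypeInformation
```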

Merge Columns from CSV

I've got a CSV like this:
Group;Name;Color
Fruit;Apple;green
Vegetable;Carrot;orange
Fruit;Banana;yellow
Fruit;cherry;red
Vegetable;cucumber;green
and want to merge it (via PowerShell) so that each Group appears only once, with the corresponding Names next to it in an array(?), like this:
Group;Name;color
Fruit;{Apple,Banana,Cherry};{green,yellow,red}
Vegetable;{Carrot,cucumber};{orange,green}
Use Group-Object for grouping objects by their properties:
Import-Csv 'C:\path\to\input.csv' -Delimiter ';' |
Group-Object Group |
select @{n='Group';e={$_.Name}},
@{n='Name';e={'{{{0}}}' -f ($_.Group.Name -join ',')}},
@{n='Color';e={'{{{0}}}' -f ($_.Group.Color -join ',')}} |
Export-Csv 'C:\path\to\output.csv' -NoType -Delimiter ';'
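The format string '{{{0}}}' may look odd: in the -f operator, {{ and }} are escaped literal braces, so the pattern wraps the joined values in one pair of curly brackets. A minimal in-memory sketch, assuming the sample rows from the question ($rows and $merged are illustrative names):

```powershell
# Sample rows from the question, parsed in memory
$rows = @'
Group;Name;Color
Fruit;Apple;green
Vegetable;Carrot;orange
Fruit;Banana;yellow
Fruit;cherry;red
Vegetable;cucumber;green
'@ | ConvertFrom-Csv -Delimiter ';'

$merged = $rows |
    Group-Object Group |
    Select-Object @{n='Group';e={$_.Name}},
                  # $_.Group.Name collects the Name of every row in the group
                  @{n='Name';e={'{{{0}}}' -f ($_.Group.Name -join ',')}},
                  @{n='Color';e={'{{{0}}}' -f ($_.Group.Color -join ',')}}

$merged | ConvertTo-Csv -Delimiter ';' -NoTypeInformation
```

The values are joined in the order the rows appear in the input, and the result is a single string per cell rather than a real array, since a CSV cell can only hold text.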