Select-Object not selecting object from csv-file - powershell

I'm trying to import some CSV files to work with them further and export them in the end. They all have two header lines, of which I only need the second one. I also need to delete most columns except a few. Unfortunately, it seems you have to decide whether to skip rows with Get-Content or exclude columns with Import-Csv. Neither can do both, so I came up with a workaround:
$out="bla\bla\out.csv"
$in="bla\bla\in.csv"
$header= (get-content $in -TotalCount 2 )[-1]
$out = Import-csv $in -Header $header -Delimiter ";"|select column1 | Export-Csv -Path $out -NoTypeInformation
This returns an empty CSV with the header name column1. What am I doing wrong?
Edit:
The input csv looks like:
filename;filename;...
column1;column2;...
1;a;...
2;b;...
...
I guess that -Header can't read arrays without single quotation marks, so I'm trying to find a solution to that at the moment.

If you know the name of the header you want to filter on, the following should do the trick and only requires reading the file once:
$out = "out.csv"
$in = "in.csv"
Get-Content $in | Select-Object -Skip 1 |
ConvertFrom-Csv -Delimiter ';' | Select-Object column1 |
Export-Csv $out -NoTypeInformation
If, however, you don't know the name of the header you need to filter on (column1 in the example above) but you know it's the first column, an extra step is required:
$csv = Get-Content $in | Select-Object -Skip 1 | ConvertFrom-Csv -Delimiter ';'
$csv | Select-Object $csv[0].PSObject.Properties.Name[0] | Export-Csv $out -NoTypeInformation
We can take the first object of the array ($csv[0]), get its properties through PSObject.Properties, and then select the name of the first property (.Name[0], which is column1 in this case).
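As a side note on why the original attempt comes back empty: (Get-Content $in -TotalCount 2)[-1] is a single string such as "column1;column2;...", so -Header most likely sees one long column name and Select-Object finds no property called column1. A minimal sketch of the fix, using made-up inline sample lines ($lines is a stand-in for the file content), splits that string into an array first:

```powershell
# Sample input mirroring the question: two header lines, then data (made-up values).
$lines = 'filename;filename', 'column1;column2', '1;a', '2;b'

# Split the second line into an array of column names.
$header = $lines[1] -split ';'

# Skip both header lines and parse the remaining rows with the explicit header.
$result = $lines | Select-Object -Skip 2 |
    ConvertFrom-Csv -Delimiter ';' -Header $header |
    Select-Object column1

$result
```

With real files, the same pattern would read the lines with Get-Content $in instead of using $lines.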

Related

Get Unique Column and Count from CSV file in Powershell

I have a CSV File called Products.csv
Product_ID,Category
1,A
2,A
3,A
4,B
I want a powershell script that will show me the Unique Categories along with the Count and export to CSV.
i.e.
A,3
B,1
I have used the following code to extract the Unique Categories, but cannot get the Count:
Import-Csv Products.csv -DeLimiter ","|
Select 'Category' -Unique |
Export-Csv Summary.csv -DeLimiter "," -NoTypeInformation
Can anyone help me out?
Thanks.
You can use Group-Object to get the count.
Import-Csv Products.csv -DeLimiter "," |
Group-Object Category | Foreach-Object {
"{0},{1}" -f $_.Name,$_.Count
}
If you want a CSV output of the count, you need headers for your data. Group-Object outputs property Name which is the grouped property value and Count which is the number of items in that group.
Import-Csv Products.csv -DeLimiter "," |
Group-Object Category | Select-Object Name,Count |
Export-Csv Summary.csv -Delimiter ',' -NoType
You can take the above code a step further and use Select-Object's calculated properties. Then you can create custom named columns and/or values with expressions.
Import-Csv Products.csv -DeLimiter "," |
Group-Object Category |
Select-Object @{n='Product_ID';e={$_.Name}},Count |
Export-Csv Summary.csv -Delimiter ',' -NoType
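To try the grouping without touching any files, here is a small sketch that feeds the sample data from the question through ConvertFrom-Csv. The variable names $products and $summary are only illustrative, and -NoElement simply tells Group-Object not to keep the grouped rows:

```powershell
# Sample data from the question, parsed in memory instead of from Products.csv.
$products = @'
Product_ID,Category
1,A
2,A
3,A
4,B
'@ | ConvertFrom-Csv

# One output object per distinct Category, carrying the group name and its size.
$summary = $products |
    Group-Object Category -NoElement |
    Select-Object Name, Count

# Show the CSV text that Export-Csv would write.
$summary | ConvertTo-Csv -NoTypeInformation
```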

Export-Csv adding unwanted header double quotes

I have got a source CSV file (without a header, all columns delimited by a comma) which I am trying to split out into separate CSV files based upon the value in the first column, using that column value as the output file name.
Input file:
S00000009,2016,M04 01/07/2016,0.00,0.00,0.00,0.00,0.00,0.00,750.00,0.00,0.00
S00000009,2016,M05 01/08/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00
S00000009,2016,M06 01/09/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00
S00000010,2015,W28 05/10/2015,2275.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
S00000010,2015,W41 04/01/2016,0.00,0.00,0.00,0.00,0.00,0.00,568.75,0.00,0.00
S00000010,2015,W42 11/01/2016,0.00,0.00,0.00,0.00,0.00,0.00,568.75,0.00,0.00
S00000012,2015,W10 01/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
S00000012,2015,W11 08/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
S00000012,2015,W12 15/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
My PowerShell script looks like this:
Import-Csv INPUT_FILE.csv -Header service_id,year,period,cash_exp,cash_inc,cash_def,act_exp,act_inc,act_def,comm_exp,comm_inc,comm_def |
Group-Object -Property "service_id" |
Foreach-Object {
$path = $_.Name + ".csv";
$_.group | Export-Csv -Path $path -NoTypeInformation
}
Output files:
S00000009.csv:
"service_id","year","period","cash_exp","cash_inc","cash_def","act_exp","act_inc","act_def","comm_exp","comm_inc","comm_def"
"S00000009","2016","M04 01/07/2016","0.00","0.00","0.00","0.00","0.00","0.00","750.00","0.00","0.00"
"S00000009","2016","M05 01/08/2016","0.00","0.00","0.00","0.00","0.00","0.00","600.00","0.00","0.00"
"S00000009","2016","M06 01/09/2016","0.00","0.00","0.00","0.00","0.00","0.00","600.00","0.00","0.00"
S00000010.csv:
"service_id","year","period","cash_exp","cash_inc","cash_def","act_exp","act_inc","act_def","comm_exp","comm_inc","comm_def"
"S00000010","2015","W28 05/10/2015","2275.00","0.00","0.00","0.00","0.00","0.00","0.00","0.00","0.00"
"S00000010","2015","W41 04/01/2016","0.00","0.00","0.00","0.00","0.00","0.00","568.75","0.00","0.00"
"S00000010","2015","W42 11/01/2016","0.00","0.00","0.00","0.00","0.00","0.00","568.75","0.00","0.00"
It is generating the new files using the header value in column 1 (service_id).
There are 2 problems.
The output CSV file contains a header row which I don't need.
The columns are enclosed with double quotes which I don't need.
First of all, Export-Csv always writes a header row, and in Windows PowerShell it also wraps every field in double quotes. If you don't want either, you can write the rows out as plain text instead:
$temp = Import-Csv INPUT_FILE.csv -Header service_id,year,period,cash_exp,cash_inc,cash_def,act_exp,act_inc,act_def,comm_exp,comm_inc,comm_def | Group-Object -Property "service_id" |
Foreach-Object {
$path=$_.name+".csv"
$temp0 = $_.group | ConvertTo-Csv -NoTypeInformation | Select-Object -Skip 1
$temp1 = $temp0.replace("""","")
$temp1 > $path
}
But this output is not a "real" csv file.
Hope that helps.
For your particular scenario you could probably use a simpler approach. Read the input file as a plain text file, group the lines by splitting off the first field, then write the groups to output files named after the groups:
Get-Content 'INPUT_FILE.csv' |
Group-Object { $_.Split(',')[0] } |
ForEach-Object { $_.Group | Set-Content ($_.Name + '.csv') }
Another solution,
using no named headers but simply numbers (as they aren't wanted in the output anyway),
avoiding unnecessary temporary files,
removing only the field-delimiting double quotes:
Import-Csv INPUT_FILE.csv -Header (1..12) |
Group-Object -Property "1" | Foreach-Object {
($_.Group | ConvertTo-Csv -NoType | Select-Object -Skip 1).Trim('"') -replace '","',',' |
Set-Content -Path ("{0}.csv" -f $_.Name)
}
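If PowerShell 7 or later is available (an assumption, since the question doesn't say which version is in use), ConvertTo-Csv and Export-Csv accept a -UseQuotes parameter, which would make the quote stripping unnecessary. A rough sketch with two shortened, made-up input lines:

```powershell
# Assumption: PowerShell 7+; -UseQuotes does not exist in Windows PowerShell 5.1.
# Shortened sample lines (only four of the twelve columns).
$lines = 'S00000009,2016,M04 01/07/2016,0.00',
         'S00000010,2015,W28 05/10/2015,2275.00'

# Numeric placeholder headers, as in the answer above.
$rows = $lines | ConvertFrom-Csv -Header (1..4)

# Emit unquoted CSV text and drop the header row.
$out = $rows |
    ConvertTo-Csv -NoTypeInformation -UseQuotes Never |
    Select-Object -Skip 1

$out
```

Note that -UseQuotes Never also leaves fields unquoted when they contain the delimiter, so it only suits data like this where no field contains a comma.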

Extract differences of CSV files into a separate file

I have a CSV file (with headers) filled with assortment data. The file will be updated once every day. I need to find the differences in those files (the old and the new one) and extract them into a separate file.
For instance: in the old file there could be a price of "18,50" and now it's an updated one of "17,90". The script should now extract this row into a new file.
So far, I was able to import both CSV files (via Import-Csv) but my current solution is to compare each row by findstr.
The problems are:
In 9 of 10 cases the strings are too long to compare.
What if a new row will be inserted - I guess the comparison wouldn't work any longer if the row isn't inserted at the end of the file.
My current code is:
foreach ($oldData in (Import-Csv $PSScriptRoot\old.csv -Delimiter ";" -Encoding "default")) {
foreach ($newData in (Import-Csv $PSScriptRoot\new.csv -Delimiter ";" -Encoding "default")) {
findstr.exe /v /c:$oldData $newData > $PSScriptRoot\diff.txt
}
}
Read both files into separate variables and use Compare-Object for the comparison:
$fields = 'idArtikel', 'Preis', ...
$csv1 = Import-Csv $PSScriptRoot\old.csv -Delimiter ';'
$csv2 = Import-Csv $PSScriptRoot\new.csv -Delimiter ';'
Compare-Object -ReferenceObject $csv1 -DifferenceObject $csv2 -Property $fields -PassThru | Where-Object {
$_.SideIndicator -eq '=>'
} | Select-Object $fields | Export-Csv 'C:\path\to\diff.csv' -Delimiter ';'
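To see what the comparison returns, here is a small in-memory sketch. The rows are made up (only the two prices come from the question), and it assumes idArtikel and Preis are the only fields of interest:

```powershell
# Hypothetical old and new data, parsed in memory instead of from old.csv / new.csv.
$old = @'
idArtikel;Preis
1;18,50
2;9,99
'@ | ConvertFrom-Csv -Delimiter ';'

$new = @'
idArtikel;Preis
1;17,90
2;9,99
3;4,20
'@ | ConvertFrom-Csv -Delimiter ';'

$fields = 'idArtikel', 'Preis'

# Rows marked '=>' exist only in the new data: changed rows and inserted rows.
$diff = Compare-Object -ReferenceObject $old -DifferenceObject $new -Property $fields |
    Where-Object SideIndicator -eq '=>' |
    Select-Object $fields

$diff
```

Because rows are matched on property values rather than position, a row inserted in the middle of the new file is still reported only once.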
$csv1 | Join $csv2 idArtikel -Merge {$Right.$_} | Export-CSV 'C:\path\to\diff.csv' -Delimiter ';'
For details on Join (Join-Object), see: https://stackoverflow.com/a/45483110/1701026

Get first two items positionally from Import-CSV row

I have a series of files that have changed some header naming and column counts over time. However, the files always have the first column as the start date and second column as the end date.
I would like to get just these two columns, but the name has changed over time.
What I have tried is this:
$FileContents=Import-CSV -Path "$InputFilePath"
foreach ($line in $FileContents)
{
$StartDate=$line[0]
$EndDate=$line[1]
}
...but $FileContents is (I believe) an array of a type (objects?) that I'm not sure how to positionally access in PowerShell. Any help would be appreciated.
Edit: The files switched from comma delimiter to pipe delimiter a while back and there are 1000s of files to work with, so I use Import-CSV because it can implicitly read either format.
You could use the -Header parameter to give the first two columns of the CSV the header names you want. Then skip the first line, which contains the old header.
$FileContents = Import-CSV -Path "$InputFilePath" -Header "StartDate","EndDate" | Select-Object "StartDate","EndDate" -Skip 1
foreach ($line in $FileContents) {
$StartDate = $line.StartDate
$EndDate = $line.EndDate
}
Here's an example:
Example.csv
a,b,c
1,2,3
4,5,6
Import-CSV -Path Example.csv -Header "StartDate","EndDate" | Select-Object "StartDate","EndDate" -Skip 1
StartDate EndDate
--------- -------
1 2
4 5
If you use Import-Csv, PowerShell will indeed create objects for you. The "columns" are called properties. You can select properties with Select-Object, but you have to name the properties you want to select. Since you don't know the property names in advance, you can get them with Get-Member. Note that Get-Member lists members sorted by name, so the first two names match the first two columns only if those column names also sort first (as they do in the sample below).
Use the following sample code and apply it to your script:
$csv = @'
1,2,3,4,5
a,b,c,d,e
g,h,i,j,k
'@
$csv = $csv | ConvertFrom-Csv
$properties = $csv | Get-Member -MemberType NoteProperty | Select-Object -First 2 -ExpandProperty Name
$csv | Select-Object -Property $properties
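If the column names do not happen to sort in column order, reading the names from PSObject.Properties keeps the order of the file. A minimal sketch with made-up column names (Start, End, Comment), chosen so that alphabetical order differs from column order:

```powershell
# Hypothetical data: sorted by name the columns would be Comment, End, Start.
$csv = @'
Start,End,Comment
2020-01-01,2020-01-31,first
2020-02-01,2020-02-29,second
'@ | ConvertFrom-Csv

# PSObject.Properties lists the properties in the order the columns were read.
$firstTwo = $csv[0].PSObject.Properties.Name | Select-Object -First 2

# Keep only the first two columns, whatever they are called.
$dates = $csv | Select-Object -Property $firstTwo
$dates
```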
How about this:
$FileContents=get-content -Path "$InputFilePath"
for ($i=0;$i -lt $FileContents.count;$i++){
$textrow = ($FileContents[$i]).split(",")
$StartDate=$textrow[0]
$EndDate=$textrow[1]
#do what you want with the variables
write-host $startdate
write-host $EndDate
}
Provided you are referencing a comma-delimited CSV file.
Another solution with ForEach-Object (% is its alias) and -split:
Get-Content "example.csv" | select -skip 1 | %{$row=$_ -split ',', 3; [pscustomobject]@{NewCol1=$row[0];NewCol2=$row[1]}}
You can also build calculated properties into the select like this:
Get-Content "example.csv" | select @{N="Newcol1";E={($_ -split ',', 3)[0]}}, @{N="Newcol2";E={($_ -split ',', 3)[1]}} -skip 1
With ConvertFrom-Csv:
Get-Content "example.csv" | ConvertFrom-Csv -Delimiter ',' -Header col1, col2 | select -skip 1

PowerShell write integer to file after x number of tabs

I'm sure this is ridiculously easy, but I'm a noob and trying to learn PowerShell.
I want to write an integer to each line of a tab delimited file, i.e. each line has 20 tabs; put a 1 after the nth tab.
No need to overwrite what's already there because in the current scenario there isn't anything.
Thanks!
If there is a header line then just import the file as a CSV, run it through a ForEach-Object loop and set that column to the integer that you want, then export the CSV again.
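# The parentheses finish reading the file before it is overwritten; the trailing $_ passes each row on.
(Import-CSV $File -Delimiter "`t") | ForEach-Object {$_.ColumnName = $Integer; $_} | Export-CSV $File -Delimiter "`t" -NoTypeInformation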
If there is no header you could do the same thing and define your own headers. Except you would use ConvertTo-CSV instead of Export-CSV and then use Select to skip the header row, and use Set-Content to write the file. For my example I set the 7th column to $Integer.
$Headers = 1..20|ForEach{"Col$_"}
(Import-CSV $File -Delimiter "`t" -Header $Headers) | ForEach-Object {$_.Col7 = $Integer; $_} | ConvertTo-CSV -Delimiter "`t" -NoTypeInformation | Select-Object -Skip 1 | Set-Content $File
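A different technique that avoids the CSV cmdlets altogether is to split each line on tabs, set one field, and join the fields again. This is only a sketch on a single in-memory line; $n and $line are illustrative names, and the line is assumed to contain at least $n tabs:

```powershell
# A line consisting of 20 tabs, i.e. 21 empty fields (as described in the question).
$line = "`t" * 20
$n = 7   # hypothetical position: put the value after the 7th tab

# Field index $n (0-based) is the field that follows the $n-th tab.
$fields = $line -split "`t"
$fields[$n] = '1'

# Reassemble the line with the same number of tabs.
$newLine = $fields -join "`t"
$newLine
```

For a whole file, the same three steps could run inside a ForEach-Object over Get-Content, with the result written back through Set-Content.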