Using Import-Csv in PowerShell, ignoring commented lines

I think that I must be missing something obvious because I'm trying to use Import-CSV to import CSV files that have commented out lines (always beginning with a # as the first character) at the top of the file, so the file looks like this:
#[SpecialCSV],,,,,,,,,,,,,,,,,,,,
#Version,1.0.0,,,,,,,,,,,,,,,,,,,
#,,,,,,,,,,,,,,,,,,,,
#,,,,,,,,,,,,,,,,,,,,
#[Table],,,,,,,,,,,,,,,,,,,,
Header1,Header2,Header3,Header4,Header5,Header6,Header7,...
Data1,Data2,Data3,Data4,Data5,Data6,Data7,...
I'd like to ignore those first 5 lines, but still use Import-Csv to get the rest of the information nicely into PowerShell.
Thanks

Simple - just use Select-String to exclude commented lines with a regex, and pipe to ConvertFrom-Csv:
Get-Content <path to CSV file> | Select-String '^[^#]' | ConvertFrom-Csv
The difference between Import-Csv and ConvertFrom-Csv is that the former takes input from a file and the latter takes pipeline input; otherwise they do the same thing: convert CSV data to an array of PSCustomObjects. So, by using ConvertFrom-Csv you can do this without modifying the CSV file or using a temp file. You can assign the results to an array or pipe to a ForEach-Object block just as you'd do with Import-Csv:
$array = Get-Content <path to CSV file> | Select-String '^[^#]' | ConvertFrom-Csv
or
Get-Content <path to CSV file> | Select-String '^[^#]' | ConvertFrom-Csv | %{
<whatever you want do with the data>
}
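As a quick check of this approach, here is a minimal sketch that runs the same filter against in-memory lines rather than a file. The sample lines and the variable names ($lines, $rows) are made up for illustration, and the explicit .Line step is an addition that hands plain strings to ConvertFrom-Csv.

```powershell
# Hypothetical sample standing in for the output of Get-Content.
$lines = @(
    '#[SpecialCSV],,'
    '#Version,1.0.0,'
    'Header1,Header2,Header3'
    'Data1,Data2,Data3'
)

# Keep lines whose first character is not '#', then parse the rest as CSV.
$rows = @(
    $lines |
        Select-String -Pattern '^[^#]' |
        ForEach-Object { $_.Line } |
        ConvertFrom-Csv
)

$rows[0].Header2   # Data2
```

Note that the pattern '^[^#]' requires at least one character, so it also drops empty lines.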

CSV has no notion of "comments" - it's just flat data. You'll need to use Get-Content and inspect each line. If a line starts with #, ignore it, otherwise process it.
If you're OK with using a temp file:
Get-content special.csv |where-object{!$_.StartsWith("#")}|add-content -path $(join-path -path $env:temp -childpath "special-filtered.csv");
$mydata = import-csv -path $(join-path -path $env:temp -childpath "special-filtered.csv");
remove-item -path $(join-path -path $env:temp -childpath "special-filtered.csv")
$mydata |format-table -autosize; #Just for illustration
Edit: Forgot about convertfrom-csv. It gets much simpler this way.
$mydata = Get-Content special.csv |
Where-Object { !$_.StartsWith("#") } |
ConvertFrom-Csv

If you feed convertfrom-csv csv data as an array of lines it seems to automatically filter out comments. I frequently use convertfrom-csv this way but I haven't seen it documented.
cat data.csv | convertfrom-csv #skips commented lines automagically
("co1,col2,col3", "abc,def,ghi", "#this,is,a,comment", "abc1,def1,ghi1")|convertfrom-csv
co1 col2 col3
--- ---- ----
abc def ghi
abc1 def1 ghi1
However, the following will not skip comments:
"co1,col2,col3
abc,def,ghi
#this,is,a,comment
abc1,def1,ghi1
"|convertfrom-csv
co1 col2 col3
--- ---- ----
abc def ghi
#this is a
abc1 def1 ghi1
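Since the automatic skipping described here appears to be undocumented, a more defensive option is to split a multi-line string into lines and filter the comments explicitly. A sketch, with made-up sample data and variable names:

```powershell
# Hypothetical multi-line CSV text containing one comment line.
$text = @'
co1,col2,col3
abc,def,ghi
#this,is,a,comment
abc1,def1,ghi1
'@

# Split into lines, drop empty lines and lines starting with '#', then parse.
$rows = @(
    $text -split '\r?\n' |
        Where-Object { $_ -and -not $_.StartsWith('#') } |
        ConvertFrom-Csv
)

$rows.Count   # 2
```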

Where-Object will work after Import-Csv as well, as long as the commented rows come after the header row. You just have to reference the first CSV column in the filter clause.
e.g.:
$EscapeCharacter = '#'
$FilteredData = Import-Csv -Path "$($Home)\Documents\sample.csv" -Delimiter "`t" -Encoding UTF8 | Where-Object {$_.coll1 -notlike "$EscapeCharacter*"}
The sample of tab delimited csv:
coll1 coll2
#Kotehulky SomeValue
Cakovice OtherValue
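The same filter can be tried without a file by feeding tab-delimited lines to ConvertFrom-Csv. This is only a sketch; the $tab, $lines and $filtered names are illustrative.

```powershell
$tab = "`t"

# In-memory stand-in for the tab-delimited sample file.
$lines = @(
    "coll1${tab}coll2"
    "#Kotehulky${tab}SomeValue"
    "Cakovice${tab}OtherValue"
)

# Parse first, then drop rows whose first column starts with '#'.
$filtered = @(
    $lines |
        ConvertFrom-Csv -Delimiter $tab |
        Where-Object { $_.coll1 -notlike '#*' }
)

$filtered[0].coll1   # Cakovice
```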

Related

PowerShell remove last column of pipe delimited text file

I have a folder of pipe delimited text files that I need to remove the last column on. I'm not seasoned in PS but I found enough through searches to help. I have two pieces of code. The first creates new text files in my destination path, keeps the pipe delimiter, but doesn't remove the last column. There are 11 columns. Here is that script:
$OutputFolder = "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Load_To_IMS"
ForEach ($File in (Get-ChildItem "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Stage_To_IMS\*.txt"))
{
(Get-Content $File) | Foreach-Object { $_.split()[0..9] -join '|' } | Out-File $OutputFolder\$($File.Name)
}
Then this second code I tried creates the new text files on my destination path, it DOES get rid of the last column, but it loses the pipe delimiter. Ugh.
$OutputFolder = "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Load_To_IMS"
ForEach ($File in (Get-ChildItem "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Stage_To_IMS\*.txt"))
{
Import-Csv $File -Header col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11 -Delimiter '|' |
Foreach-Object {"{0} {1} {2} {3} {4} {5} {6} {7} {8} {9}" -f $_.col1,$_.col2,$_.col3,$_.col4,$_.col5,$_.col6,$_.col7,$_.col8,$_.col9,$_.col10} | Out-File $destination\$($File.Name)
}
I have no clue what I'm doing wrong. I have no preference as to which way I get this done, but I need to keep the delimiter and have the last column removed. Any help would be greatly appreciated.
In your plain-text processing attempt with Get-Content, you simply need to split each line by | first (.Split('|')), before extracting the fields of interest with a range operation (..) and joining them back with |:
Get-Content $File |
Foreach-Object { $_.Split('|')[0..9] -join '|' } |
Out-File $OutputFolder\$($File.Name)
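To see the split/range/join step in isolation, here is a small sketch on a made-up 11-field line. The $generic variant is an illustrative addition that drops the last field without hard-coding the column count; it assumes each line has at least two fields.

```powershell
$line = 'a|b|c|d|e|f|g|h|i|j|k'   # hypothetical line with 11 fields

$fields = $line.Split('|')

# Fixed range, as in the answer: keep fields 0 through 9.
$fixed = $fields[0..9] -join '|'

# Count-based range: keep everything except the last field.
$generic = $fields[0..($fields.Count - 2)] -join '|'

$fixed     # a|b|c|d|e|f|g|h|i|j
$generic   # a|b|c|d|e|f|g|h|i|j
```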
In your Import-Csv-based attempt, you can take advantage of the fact that it will only read as many columns as you supply column names for, via -Header:
# Pass only 10 column names to -Header
Import-Csv $File -Header (0..9).ForEach({ 'col' + $_ }) -Delimiter '|' |
ConvertTo-Csv -Delimiter '|' | # convert back to CSV with delimiter '|'
Select-Object -Skip 1 | # skip the header row
Out-File $destination\$($File.Name)
Note that ConvertTo-Csv, just like Export-Csv, by default double-quotes each field in the resulting CSV data / file.
In Windows PowerShell, you cannot avoid this, but in PowerShell (Core) 7+ you can control this behavior with -UseQuotes Never, for instance.
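A minimal sketch of the -UseQuotes option, assuming PowerShell 7 or later (the parameter does not exist in Windows PowerShell 5.1). The sample object and variable names are made up.

```powershell
# Requires PowerShell 7+.
$row = [pscustomobject]@{ col0 = 'a'; col1 = 'b'; col2 = 'c' }

# Convert to pipe-delimited text without quotes, then drop the header row.
$out = $row |
    ConvertTo-Csv -Delimiter '|' -UseQuotes Never |
    Select-Object -Skip 1

$out   # a|b|c
```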
You can give this a try; it should be more efficient than using Import-Csv. Note, however, that it always excludes the last column of each file no matter how many columns it has, and it assumes the files are pipe-delimited:
$OutputFolder = "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Load_To_IMS"
foreach ($File in (Get-ChildItem "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Stage_To_IMS\*.txt")) {
[IO.File]::ReadAllLines($File.FullName) | & {
process{
-join ($_ -split '(?=\|)' | Select-Object -SkipLast 1)
}
} | Set-Content (Join-Path $OutputFolder -ChildPath $File.Name)
}
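The lookahead split used above can be checked on a single made-up line. Splitting on '(?=\|)' cuts before each pipe without consuming it, so every piece after the first keeps its leading delimiter and a plain -join restores the line:

```powershell
$line = 'a|b|c|d'   # hypothetical sample line

# Zero-width split before each '|': yields 'a', '|b', '|c', '|d'.
$parts = $line -split '(?=\|)'

# Drop the last piece (the final column with its delimiter) and glue the rest.
$result = -join ($parts | Select-Object -SkipLast 1)

$result   # a|b|c
```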

how to prepend filename to every record in a csv?

How do we prepend the filename to ALL the csv files in a specific directory?
I've got a bunch of csv files that each look like this:
ExampleFile.Csv
2323, alex, gordon
4382, liza, smith
The output I'd like is:
ExampleFile.Csv, 2323, alex, gordon
ExampleFile.Csv, 4382, liza, smith
I've attempted the following solution:
Get-ChildItem *.csv | ForEach-Object {
$CSV = Import-CSV -Path $_.FullName -Delimiter ","
$FileName = $_.Name
$CSV | Select-Object *,@{E={$FileName}} | Export-CSV $_.FullName -NTI -Delimiter ","
}
However, this did not work because it was altering the first row. (My data does not have a header row). Also, this script will append to each record at the end rather than prepend at the beginning.
You're missing the column header name I think. Take a look at the duplicate (or original, rather) and see Shay's answer. Your Select-Object should look like:
$CSV | Select-Object @{Name='FileName';Expression={"$filename"}},* | Export-Csv -Path $FileName -NoTypeInformation -Delimiter ','
That worked fine for me with multiple CSVs in a directory when using the rest of your sample code verbatim.
If your files do not have headers and the column count is unknown or unpredictable, you can read each line with Get-Content, make the changes, and then use Set-Content to make the update.
Get-ChildItem *.csv | ForEach-Object {
$Filename = $_.Name
$Fullname = $_.FullName
$contents = Get-Content -Path $Fullname | Foreach-Object {
"{0}, {1}" -f $Filename,$_
}
$contents | Set-Content -Path $Fullname
}
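Here is a self-contained sketch of the same read, prefix and rewrite loop, run against a throwaway directory so it can be tried safely. The temp directory name and the variables ($dir, $file, $result) are illustrative only.

```powershell
# Create a scratch directory with one headerless CSV file (sample data from the question).
$dir = Join-Path ([IO.Path]::GetTempPath()) ("prepend-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
$file = Join-Path $dir 'ExampleFile.csv'
Set-Content -Path $file -Value '2323, alex, gordon', '4382, liza, smith'

Get-ChildItem -Path $dir -Filter '*.csv' | ForEach-Object {
    $name = $_.Name
    # Read all lines first, so the file is closed before it is rewritten.
    $updated = Get-Content -Path $_.FullName | ForEach-Object { '{0}, {1}' -f $name, $_ }
    $updated | Set-Content -Path $_.FullName
}

$result = Get-Content -Path $file
Remove-Item -Path $dir -Recurse

$result[0]   # ExampleFile.csv, 2323, alex, gordon
```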

PowerShell adds quotes to split CSV file

I'm trying to split a csv file by the first digits of the longitude column. Here is a sample:
X,Y,TYPE,SPEED,DirType,Direction
-44.058251,-19.945982,1,30,1,339
-54.629503,-20.497509,1,30,1,263
-54.646202,-20.496151,1,30,1,86
I have no powershell knowledge but I found some script online and it did what I wanted:
Import-Csv maparadar.csv
| Group-Object -Property {($_.x)[0..2] -join ""}
| Foreach-Object {$path=$_.name+".csv" ; $_.group
| Export-Csv -Path $path -NoTypeInformation}
With this I get output files like -44.csv, -54.csv
But it adds unwanted quotes to every field in the output file like:
"X","Y","TYPE","SPEED","DirType","Direction"
"-46.521991","-23.690235","1","30","1","169"
"-46.670774","-23.756021","1","30","1","281"
"-46.549897","-23.120720","1","30","1","99"
Is there any way I can export the csv without adding those quotes?
The following should provide the desired output:
Import-Csv maparadar.csv |
Group-Object -Property {($_.x)[0..2] -join ""} |
Foreach-Object { $path=$_.name+".csv" ; ($_.group |
ConvertTo-Csv -NoTypeInformation) -Replace '"' |
Set-Content -Path $path }
Explanation:
We replaced your Export-Csv with ConvertTo-Csv, which provides the CSV output to the console/pipeline rather than outputting to the file. Those CSV formatted outputs are sent through the -Replace operator to replace the literal " characters. Finally the formatted output is sent to the desired file using Set-Content -Path $path.
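The grouping and quote-stripping steps can be sketched in memory, collecting each group's lines in a hashtable instead of writing files. The $data and $result names are made up; the values are a subset of the sample columns.

```powershell
# Hypothetical rows standing in for Import-Csv output (only X and Y kept for brevity).
$data = @(
    [pscustomobject]@{ X = '-44.058251'; Y = '-19.945982' }
    [pscustomobject]@{ X = '-54.629503'; Y = '-20.497509' }
    [pscustomobject]@{ X = '-54.646202'; Y = '-20.496151' }
)

$result = @{}
$data | Group-Object -Property { ($_.X)[0..2] -join '' } | ForEach-Object {
    # ConvertTo-Csv emits quoted text lines; -replace strips every double quote.
    $result[$_.Name] = ($_.Group | ConvertTo-Csv -NoTypeInformation) -replace '"'
}

$result['-54']
```

Stripping every quote like this is only safe when no field contains a comma or a double quote, which holds for the numeric data in the question.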

Export-Csv adding unwanted header double quotes

I have got a source CSV file (without a header, all columns delimited by a comma) which I am trying to split out into separate CSV files based upon the value in the first column, using that column value as the output file name.
Input file:
S00000009,2016,M04 01/07/2016,0.00,0.00,0.00,0.00,0.00,0.00,750.00,0.00,0.00
S00000009,2016,M05 01/08/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00
S00000009,2016,M06 01/09/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00
S00000010,2015,W28 05/10/2015,2275.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
S00000010,2015,W41 04/01/2016,0.00,0.00,0.00,0.00,0.00,0.00,568.75,0.00,0.00
S00000010,2015,W42 11/01/2016,0.00,0.00,0.00,0.00,0.00,0.00,568.75,0.00,0.00
S00000012,2015,W10 01/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
S00000012,2015,W11 08/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
S00000012,2015,W12 15/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
My PowerShell script looks like this:
Import-Csv INPUT_FILE.csv -Header service_id,year,period,cash_exp,cash_inc,cash_def,act_exp,act_inc,act_def,comm_exp,comm_inc,comm_def |
Group-Object -Property "service_id" |
Foreach-Object {
$path = $_.Name + ".csv";
$_.group | Export-Csv -Path $path -NoTypeInformation
}
Output files:
S00000009.csv:
"service_id","year","period","cash_exp","cash_inc","cash_def","act_exp","act_inc","act_def","comm_exp","comm_inc","comm_def"
"S00000009","2016","M04 01/07/2016","0.00","0.00","0.00","0.00","0.00","0.00","750.00","0.00","0.00"
"S00000009","2016","M05 01/08/2016","0.00","0.00","0.00","0.00","0.00","0.00","600.00","0.00","0.00"
"S00000009","2016","M06 01/09/2016","0.00","0.00","0.00","0.00","0.00","0.00","600.00","0.00","0.00"
S00000010.csv:
"service_id","year","period","cash_exp","cash_inc","cash_def","act_exp","act_inc","act_def","comm_exp","comm_inc","comm_def"
"S00000010","2015","W28 05/10/2015","2275.00","0.00","0.00","0.00","0.00","0.00","0.00","0.00","0.00"
"S00000010","2015","W41 04/01/2016","0.00","0.00","0.00","0.00","0.00","0.00","568.75","0.00","0.00"
"S00000010","2015","W42 11/01/2016","0.00","0.00","0.00","0.00","0.00","0.00","568.75","0.00","0.00"
It is generating the new files using the header value in column 1 (service_id).
There are 2 problems.
The output CSV file contains a header row which I don't need.
The columns are enclosed with double quotes which I don't need.
First of all, a CSV file written by Export-Csv always has a header row, and by default every field is double-quoted. If you don't want either, you can post-process the text that ConvertTo-Csv produces instead:
$temp = Import-Csv INPUT_FILE.csv -Header service_id,year,period,cash_exp,cash_inc,cash_def,act_exp,act_inc,act_def,comm_exp,comm_inc,comm_def | Group-Object -Property "service_id" |
Foreach-Object {
$path=$_.name+".csv"
$temp0 = $_.group | ConvertTo-Csv -NoTypeInformation | Select-Object -Skip 1
$temp1 = $temp0.replace("""","")
$temp1 > $path
}
But this output is not a "real" csv file.
Hope that helps.
For your particular scenario you could probably use a simpler approach. Read the input file as a plain text file, group the lines by splitting off the first field, then write the groups to output files named after the groups:
Get-Content 'INPUT_FILE.csv' |
Group-Object { $_.Split(',')[0] } |
ForEach-Object { $_.Group | Set-Content ($_.Name + '.csv') }
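To preview how the lines would be grouped before any file is written, the same Group-Object call can be run on a few in-memory lines. This is a sketch; the lines are shortened versions of the sample input and $groups is an illustrative name.

```powershell
# Shortened, hypothetical versions of the input lines.
$lines = @(
    'S00000009,2016,M04 01/07/2016,750.00'
    'S00000009,2016,M05 01/08/2016,600.00'
    'S00000010,2015,W28 05/10/2015,2275.00'
)

# Group raw lines by the text before the first comma.
$groups = $lines | Group-Object { $_.Split(',')[0] }

# Report which output file each group would go to.
$groups | ForEach-Object { '{0}.csv gets {1} line(s)' -f $_.Name, $_.Count }
```

Because the lines are never parsed as CSV, they are written back exactly as read, with no header row and no added quotes.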
Another solution:
using numbers rather than named headers (as headers aren't wanted in the output anyway),
avoiding unnecessary temporary variables,
removing only the field-delimiting double quotes.
Import-Csv INPUT_FILE.csv -Header (1..12) |
Group-Object -Property "1" | Foreach-Object {
($_.Group | ConvertTo-Csv -NoType | Select-Object -Skip 1).Trim('"') -replace '","',',' |
Set-Content -Path ("{0}.csv" -f $_.Name)
}

Merges csv files from directory into a single csv file PowerShell

How can I run one single PowerShell script that does the following in series?
Adds the filename of each csv file in a directory as a column at the end of that file, using this script:
Get-ChildItem *.csv | ForEach-Object {
$CSV = Import-CSV -Path $_.FullName -Delimiter ","
$FileName = $_.Name
$CSV | Select-Object *,@{N='Filename';E={$FileName}} | Export-CSV $_.FullName -NTI -Delimiter ","}
Merges all csv files in the directory into a single csv file
Keeps the header (first row) only from the first csv and excludes the first row of every other file.
Similar to what kemiller2002 has done here, except as one script with csv inputs and a csv output.
Bill's answer allows you to combine CSVs, but doesn't tack file names onto the end of each row. I think the best way to do that would be to use the PipelineVariable common parameter to add that within the ForEach loop.
Get-ChildItem \inputCSVFiles\*.csv -PipelineVariable File |
ForEach-Object { Import-Csv $_ | Select *,@{l='FileName';e={$File.Name}}} |
Export-Csv \outputCSVFiles\newOutputFile.csv -NoTypeInformation
That should accomplish what you're looking for.
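A self-contained sketch of the -PipelineVariable pattern, using a scratch directory with two tiny CSV files and collecting the merged rows in memory rather than exporting them. The file names, columns and variable names are invented for the example.

```powershell
# Build a throwaway directory with two small CSV files sharing the same header.
$dir = Join-Path ([IO.Path]::GetTempPath()) ("merge-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'a.csv') -Value 'id,name', '1,alex'
Set-Content -Path (Join-Path $dir 'b.csv') -Value 'id,name', '2,liza'

# $File holds the current file object for every downstream pipeline stage.
$merged = @(
    Get-ChildItem -Path $dir -Filter '*.csv' -PipelineVariable File |
        ForEach-Object {
            Import-Csv -Path $_.FullName |
                Select-Object *, @{ Name = 'FileName'; Expression = { $File.Name } }
        }
)

Remove-Item -Path $dir -Recurse
$merged | Format-Table
```

Piping $merged to Export-Csv -NoTypeInformation would then write a single header row followed by all data rows.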
This is the general pattern:
Get-ChildItem \inputCSVFiles\*.csv |
ForEach-Object { Import-Csv $_ } |
Export-Csv \outputCSVFiles\newOutputFile.csv -NoTypeInformation
Make sure the output CSV file has a different filename pattern, or use a different directory name (like in this example).
If your CSV files don't always have the same header, you can do this:
$Dir="c:\temp\"
# get the header from the first CSV file found
$header=Get-ChildItem $Dir -file -Filter "*.csv" | select -First 1 | Get-Content -head 1
#header + all rows without header into new file
$header, (Get-ChildItem $Dir -file -Filter "*.csv" | %{Get-Content $_.fullname | select -skip 1}) | Out-File "c:\temp\result.csv"