I have a folder of pipe-delimited text files from which I need to remove the last column. I'm not seasoned in PS, but I found enough through searches to help. I have two pieces of code. The first creates new text files in my destination path and keeps the pipe delimiter, but doesn't remove the last column. There are 11 columns. Here is that script:
$OutputFolder = "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Load_To_IMS"
ForEach ($File in (Get-ChildItem "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Stage_To_IMS\*.txt"))
{
(Get-Content $File) | Foreach-Object { $_.split()[0..9] -join '|' } | Out-File $OutputFolder\$($File.Name)
}
Then this second script I tried creates the new text files in my destination path, and it DOES get rid of the last column, but it loses the pipe delimiter. Ugh.
$OutputFolder = "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Load_To_IMS"
ForEach ($File in (Get-ChildItem "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Stage_To_IMS\*.txt"))
{
Import-Csv $File -Header col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11 -Delimiter '|' |
Foreach-Object {"{0} {1} {2} {3} {4} {5} {6} {7} {8} {9}" -f $_.col1,$_.col2,$_.col3,$_.col4,$_.col5,$_.col6,$_.col7,$_.col8,$_.col9,$_.col10} | Out-File $destination\$($File.Name)
}
I have no clue what I'm doing wrong. I have no preference in how I get this done, but I need to keep the delimiter and have the last column removed. Any help would be greatly appreciated.
In your plain-text processing attempt with Get-Content, you simply need to split each line by | first (.Split('|')), before extracting the fields of interest with a range operation (..) and joining them back with |:
Get-Content $File |
Foreach-Object { $_.Split('|')[0..9] -join '|' } |
Out-File $OutputFolder\$($File.Name)
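If you'd rather not hard-code the [0..9] index range, a sketch of a variant that simply drops whatever the last field happens to be (assuming your PowerShell version supports Select-Object -SkipLast):
# Drop the last field regardless of how many columns each line has
Get-Content $File |
    Foreach-Object { ($_.Split('|') | Select-Object -SkipLast 1) -join '|' } |
    Out-File $OutputFolder\$($File.Name)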
In your Import-Csv-based attempt, you can take advantage of the fact that it will only read as many columns as you supply column names for, via -Header:
# Pass only 10 column names to -Header
Import-Csv $File -Header (0..9).ForEach({ 'col' + $_ }) -Delimiter '|' |
ConvertTo-Csv -Delimiter '|' | # convert back to CSV with delimiter '|'
Select-Object -Skip 1 | # skip the header row
Out-File $destination\$($File.Name)
Note that ConvertTo-Csv, just like Export-Csv, by default double-quotes each field in the resulting CSV data / file.
In Windows PowerShell, you cannot avoid this, but in PowerShell (Core) 7+ you can control this behavior with -UseQuotes Never, for instance.
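For example, a sketch of the Import-Csv variant above for PowerShell 7+ only, mirroring the snippet's output path:
# PowerShell 7+ only: -UseQuotes Never omits the double quotes around fields
Import-Csv $File -Header (0..9).ForEach({ 'col' + $_ }) -Delimiter '|' |
    ConvertTo-Csv -Delimiter '|' -UseQuotes Never |
    Select-Object -Skip 1 | # skip the header row
    Out-File $destination\$($File.Name)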
You can give this a try; it should be more efficient than using Import-Csv. Note, however, that it always excludes the last column of your files, no matter how many columns they have, assuming they're pipe-delimited:
$OutputFolder = "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Load_To_IMS"
foreach ($File in (Get-ChildItem "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Stage_To_IMS\*.txt")) {
    # Read all lines at once, then stream each line through a filtering script block
    [IO.File]::ReadAllLines($File.FullName) | & {
        process {
            # Split with a lookahead so each field keeps its leading '|',
            # then drop the last field and rejoin without re-adding delimiters
            -join ($_ -split '(?=\|)' | Select-Object -SkipLast 1)
        }
    } | Set-Content (Join-Path $OutputFolder -ChildPath $File.Name)
}
I have a source CSV file (without a header, all columns delimited by a comma) which I am trying to split into separate CSV files based upon the value in the first column, using that column value as the output file name.
Input file:
S00000009,2016,M04 01/07/2016,0.00,0.00,0.00,0.00,0.00,0.00,750.00,0.00,0.00
S00000009,2016,M05 01/08/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00
S00000009,2016,M06 01/09/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00
S00000010,2015,W28 05/10/2015,2275.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
S00000010,2015,W41 04/01/2016,0.00,0.00,0.00,0.00,0.00,0.00,568.75,0.00,0.00
S00000010,2015,W42 11/01/2016,0.00,0.00,0.00,0.00,0.00,0.00,568.75,0.00,0.00
S00000012,2015,W10 01/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
S00000012,2015,W11 08/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
S00000012,2015,W12 15/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
My PowerShell script looks like this:
Import-Csv INPUT_FILE.csv -Header service_id,year,period,cash_exp,cash_inc,cash_def,act_exp,act_inc,act_def,comm_exp,comm_inc,comm_def |
Group-Object -Property "service_id" |
Foreach-Object {
$path = $_.Name + ".csv";
$_.group | Export-Csv -Path $path -NoTypeInformation
}
Output files:
S00000009.csv:
"service_id","year","period","cash_exp","cash_inc","cash_def","act_exp","act_inc","act_def","comm_exp","comm_inc","comm_def"
"S00000009","2016","M04 01/07/2016","0.00","0.00","0.00","0.00","0.00","0.00","750.00","0.00","0.00"
"S00000009","2016","M05 01/08/2016","0.00","0.00","0.00","0.00","0.00","0.00","600.00","0.00","0.00"
"S00000009","2016","M06 01/09/2016","0.00","0.00","0.00","0.00","0.00","0.00","600.00","0.00","0.00"
S00000010.csv:
"service_id","year","period","cash_exp","cash_inc","cash_def","act_exp","act_inc","act_def","comm_exp","comm_inc","comm_def"
"S00000010","2015","W28 05/10/2015","2275.00","0.00","0.00","0.00","0.00","0.00","0.00","0.00","0.00"
"S00000010","2015","W41 04/01/2016","0.00","0.00","0.00","0.00","0.00","0.00","568.75","0.00","0.00"
"S00000010","2015","W42 11/01/2016","0.00","0.00","0.00","0.00","0.00","0.00","568.75","0.00","0.00"
It is generating the new files using the header value in column 1 (service_id).
There are two problems:
The output CSV file contains a header row, which I don't need.
The columns are enclosed in double quotes, which I don't need.
First of all, a .csv file normally has headers and quote marks as part of its structure. But if you don't want them, you can go on with a plain text file instead:
Import-Csv INPUT_FILE.csv -Header service_id,year,period,cash_exp,cash_inc,cash_def,act_exp,act_inc,act_def,comm_exp,comm_inc,comm_def |
    Group-Object -Property "service_id" |
    Foreach-Object {
        $path = $_.Name + ".csv"
        # Convert each group back to CSV text, drop the header row, then strip all double quotes
        $temp0 = $_.Group | ConvertTo-Csv -NoTypeInformation | Select-Object -Skip 1
        $temp1 = $temp0.Replace('"', '')
        $temp1 > $path
    }
But this output is not a "real" csv file.
Hope that helps.
For your particular scenario you could probably use a simpler approach. Read the input file as a plain text file, group the lines by splitting off the first field, then write the groups to output files named after the groups:
Get-Content 'INPUT_FILE.csv' |
Group-Object { $_.Split(',')[0] } |
ForEach-Object { $_.Group | Set-Content ($_.Name + '.csv') }
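Because the lines are never parsed into objects, the output files contain the original lines verbatim, with no header row and no quoting. With the sample input above, S00000009.csv would contain:
S00000009,2016,M04 01/07/2016,0.00,0.00,0.00,0.00,0.00,0.00,750.00,0.00,0.00
S00000009,2016,M05 01/08/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00
S00000009,2016,M06 01/09/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00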
Another solution:
- using no named headers but simply numbers (as they aren't wanted in the output anyway)
- avoiding unnecessary temporary files
- removing only the field-delimiting double quotes
Import-Csv INPUT_FILE.csv -Header (1..12) |
Group-Object -Property "1" | Foreach-Object {
($_.Group | ConvertTo-Csv -NoType | Select-Object -Skip 1).Trim('"') -replace '","',',' |
Set-Content -Path ("{0}.csv" -f $_.Name)
}
I want to read a CSV file and output a CSV file with only one (1) field. I have tried to create a concise example.
PS C:\src\powershell> Get-Content .\t.csv
field1,field2,field3
1,2,3
4,55,6
7,888,9
PS C:\src\powershell> Import-Csv -Path .\t.csv | `
>> ForEach-Object {
>> $_.field2 `
>> } | `
>> Export-Csv -Path .\x.csv -NoTypeInformation
>>
The problem is that the Length of field2 is written to the exported CSV file. I want the field header to be "field2" and the values to be the values from the original CSV file. Also, I only want quotes where they are required, not everywhere.
I have read "Export-CSV exports length but not name" and "Export to CSV only returning string length", but these do not seem to address producing an actual CSV file with a header and one field value.
PS C:\src\powershell> get-content .\x.csv
"Length"
"1"
"2"
"3"
A CSV object uses note properties in each row to store its fields, so we'll need to filter each row object and keep just the field(s) we want using the Select-Object cmdlet (alias: select), which processes the entire CSV object at once:
Import-Csv 1.csv | select field2 | Export-Csv 2.csv -NoTypeInformation
Note that there's no need to escape the end of a line if it ends with |, {, (, or ,.
It's possible to specify several fields: select field2, field3.
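Combining both notes, the same pipeline can be spread across lines without backticks and extended to a second field (a sketch using the t.csv/x.csv names from the question; quote stripping is addressed below):
Import-Csv .\t.csv |
    Select-Object field2, field3 |
    Export-Csv .\x.csv -NoTypeInformation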
To strip unneeded double quotes (general multi-field case):
Import-Csv 1.csv |
    select field2 |
    %{
        # temporarily replace embedded quotes with control character 01
        $_.PSObject.Properties | %{ $_.value = $_.value -replace '"', [char]1 }
        $_
    } |
    ConvertTo-Csv -NoTypeInformation |
    # unquote fields that contain no whitespace, then restore embedded quotes as doubled ""
    %{ $_ -replace '"(\S*?)"', '$1' -replace '\x01', '""' } |
    Out-File 2.csv -Encoding ascii
Simplified one-field case:
Import-Csv 1.csv |
select field2 |
%{
$_.field2 = $_.field2 -replace '"', [char]1
$_
} |
ConvertTo-Csv -NoTypeInformation |
%{ $_ -replace '"(\S*?)"', '$1' -replace '\x01', '""' } |
Out-File 2.csv -Encoding ascii
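Applied to the t.csv sample from the question (reading t.csv instead of 1.csv and writing x.csv instead of 2.csv), the output would be:
field2
2
55
888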
The tricky case of embedded quotes inside a field is handled by temporarily replacing them with control character 01 (in a typical, non-broken text file only a few control characters ever appear: 09/tab, 0A/line feed, 0D/carriage return, so 01 is safe to use as a placeholder).
As per WOxxOm's response, Select-Object is the best way to select only one field from an input and pipe it to the output.
Regarding the quote marks, this is a known (and frustrating) issue with PowerShell. Specifying , as the delimiter did not help.
I have gotten round it by using ConvertTo-Csv and Foreach-Object replacements. The replacements will need to be more complex if your data contains quote marks.
Import-Csv .\1.csv |
Select-Object field2 |
ConvertTo-Csv -NoTypeInformation |
ForEach-Object {$_ -replace '"',''} |
Out-File .\2.csv