PowerShell: How to remove columns from delimited text input?

I have a text file with 5 columns of text delimited by whitespace. For example:
10 45 5 23 78
89 3 56 12 56
999 4 67 93 5
Using PowerShell, how do I remove the rightmost two columns? The resulting file should be:
10 45 5
89 3 56
999 4 67
I can extract the individual items using the -split operator. But, the items appear on different lines and I do not see how I can get them back as 3 items per line.
And to make the question more generic (and helpful to others): How to use PowerShell to remove the data at multiple columns in the range [0,n-1] given an input that has lines with delimited data of n columns each?

Read the file content, convert it to a csv and select just the first 3 columns:
Import-Csv .\file.txt -Header col1,col2,col3,col4,col5 -Delimiter ' ' | Select-Object col1,col2,col3
If you want just the values (without a header):
Import-Csv .\file.txt -Header col1,col2,col3,col4,col5 -Delimiter ' ' | Select-Object col1,col2,col3 | Format-Table -HideTableHeaders -AutoSize
To save back the results to the file:
(Import-Csv .\file.txt -Header col1,col2,col3,col4,col5 -Delimiter ' ') | Foreach-Object { "{0} {1} {2}" -f $_.col1,$_.col2,$_.col3} | Out-File .\file.txt
UPDATE:
Just another option:
(Get-Content .\file.txt) | Foreach-Object { $_.split()[0..2] -join ' ' } | Out-File .\file.txt

One way is:
gc input.txt | %{[string]::join(" ",$_.split()[0..2]) } | out-file output.txt
(replace 2 by n-1)

Here is the generic solution:
param
(
    # Input data file
    [string]$Path = 'data.txt',
    # Columns to be removed, any order, dupes are allowed
    [int[]]$Remove = (4, 3, 4, 3)
)
# sort indexes descending and remove dupes
$Remove = $Remove | Sort-Object -Unique -Descending
# read input lines
Get-Content $Path | .{process{
    # split and add to an ArrayList, which allows removing items
    $list = [Collections.ArrayList]($_ -split '\s')
    # remove data at the indexes (from tail to head due to descending order)
    foreach($i in $Remove) {
        $list.RemoveAt($i)
    }
    # join and output
    $list -join ' '
}}
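The same idea can be wrapped in a reusable function. The following is a minimal sketch rather than part of the answers above: the function name Remove-Column and its parameter names are hypothetical. Instead of deleting items from an ArrayList, it keeps every position that is not in the removal list, so duplicate or out-of-range indexes are simply ignored.

```powershell
function Remove-Column {
    param(
        # One line of whitespace-delimited text (hypothetical parameter name)
        [Parameter(ValueFromPipeline)]
        [string]$Line,
        # Zero-based indexes of the columns to drop
        [int[]]$Index = @(3, 4)
    )
    process {
        # Split on runs of whitespace; Trim() avoids an empty leading item
        $items = @($Line.Trim() -split '\s+')
        # Keep only the positions that are not listed in $Index
        $kept = for ($i = 0; $i -lt $items.Count; $i++) {
            if ($Index -notcontains $i) { $items[$i] }
        }
        $kept -join ' '
    }
}

# Usage: drop the last two of five columns
'10 45 5 23 78', '89 3 56 12 56' | Remove-Column -Index 3, 4
```

For the two sample lines this should emit "10 45 5" and "89 3 56". To process a file, pipe Get-Content into the function and redirect the result to a different output file.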

Related

Powershell--split CSV into 2 files based on column value

I have several large CSV files that I need to split based on a match in one column.
The column is called "Grading Period Title" and there are up to 10 different values. I want to separate all values of "Overall" into overall.CSV file, and all other values to term.CSV and preserve all the other columns of data.
Grading Period Title    Score
Overall                 5
22-23 MC T2             6
Overall                 7
22-23 T2                1
I found this code to group and split by all the values, but I can't get it to split overall and all other values into 2 files
#Splitting a CSV file into multiple files based on column value
$groups = Import-Csv -Path "csvfile.csv" | Group-Object 'Grading Period Title' -eq '*Overall*'
$groups | ForEach-Object {$_.Group | Export-Csv "$($_.Name).csv" -NoTypeInformation}
Count Name Group
278 22-23 MC T2
71657 Overall
71275 22-23 T2
104 22-23 8th Blk Q2
So they are grouped, but I don't know how to select the Overall group as one file, and the rest as the 2nd file and name them.
thanks!
To just split it, you can filter with Where-Object, for example:
# overall group
Import-Csv -Path "csvfile.csv" |
Where-Object { $_.'Grading Period Title' -Like '*Overall*' } |
Export-CSV -Path "overall.csv" -NoTypeInformation
# looks like
Grading Period Title Score
-------------------- -----
Overall 5
Overall 7
# term group
Import-Csv -Path "csvfile.csv" |
Where-Object { $_.'Grading Period Title' -NotLike '*Overall*' } |
Export-CSV -Path "term.csv" -NoTypeInformation
# looks like
Grading Period Title Score
-------------------- -----
22-23 MC T2 6
22-23 T2 1
To complement the clear answer from @Cpt.Whale, you can do this in one iteration using the steppable pipeline:
Import-Csv .\csvfile.csv |
ForEach-Object -Begin {
    $Overall = { Export-CSV -notype -Path .\Overall.csv }.GetSteppablePipeline()
    $Term = { Export-CSV -notype -Path .\Term.csv }.GetSteppablePipeline()
    $Overall.Begin($True)
    $Term.Begin($True)
} -Process {
    if ( $_.'Grading Period Title' -Like '*Overall*' ) {
        $Overall.Process($_)
    }
    else {
        $Term.Process($_)
    }
} -End {
    $Overall.End()
    $Term.End()
}
For details, see: Mastering the (steppable) pipeline.
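If the file fits in memory, another single-pass option is the .Where() collection method with the Split mode, which returns the matching and the non-matching items as two collections. This is a minimal sketch that assumes the column name from the question. The inline sample data stands in for the real CSV file, and the output file names are illustrative.

```powershell
# Sample rows standing in for: Import-Csv -Path "csvfile.csv"
$rows = @'
Grading Period Title,Score
Overall,5
22-23 MC T2,6
Overall,7
22-23 T2,1
'@ | ConvertFrom-Csv

# 'Split' mode returns two collections: the matches first, then the rest
$overall, $term = $rows.Where({ $_.'Grading Period Title' -like '*Overall*' }, 'Split')

$overall | Export-Csv -Path overall.csv -NoTypeInformation
$term    | Export-Csv -Path term.csv -NoTypeInformation
```

Unlike the steppable pipeline, this holds all rows in memory before writing, so it trades streaming for brevity.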

Get results of For-Each arrays and display in a table with column headers one line per results

I am trying to get a list of files and a count of the number of rows in each file displayed in a table consisting of two columns, Name and Lines.
I have tried using format table but I don't think the problem is with the format of the table and more to do with my results being separate results. See below
#Get a list of files in the filepath location
$files = Get-ChildItem $filepath
$files | ForEach-Object { $_ ; $_ | Get-Content | Measure-Object -Line} | Format-Table Name,Lines
Expected results
Name    Lines
File A  9
File B  89
Actual Results
Name Lines
File A
9
File B
89
Another approach to building a custom object is to use PowerShell's calculated properties:
$files | Select-Object -Property @{ N = 'Name' ; E = { $_.Name } },
                                 @{ N = 'Lines'; E = { ($_ | Get-Content | Measure-Object -Line).Lines } }
Name Lines
---- -----
dotNetEnumClass.ps1 232
DotNetVersions.ps1 9
dotNETversionTable.ps1 64
Typically you would make a custom object like this, instead of outputting two different kinds of objects.
$files | ForEach-Object {
    $lines = $_ | Get-Content | Measure-Object -Line
    [pscustomobject]@{
        name  = $_.name
        lines = $lines.lines
    }
}
name lines
---- -----
rof.ps1 11
rof.ps1~ 7
wai.ps1 2
wai.ps1~ 1

Powershell - Trying to calculate the average of a csv file using a function

Below you can see my code. I'm trying to calculate the average, line by line, of my CSV file. The only way I have been able to do this is by using it as an array. My question is: is there a way to pass each line through the function so that I don't have to create multiple functions?
The file looks like this:
V1 V2 V3
5 9 3
5 6 2
Script:
Function Average($a) {
Foreach ($line in $a) {
$total = [int]$a[0].V1 + [int]$a[0].V2 + [int]$a[0].V3
}
return "The Average is $($total / 3)"
}
#Variables
$a = (Import-Csv "Document.csv")
#Logic
Average
You can access the properties of each line (without knowing their names) through the psobject reference:
function Get-CsvAverage
{
    param(
        [psobject[]]$Lines
    )
    # Loop through all lines of input
    foreach($Line in $Lines){
        # For each "line" object, loop through its properties
        $Numbers = $Line.psobject.Properties | ForEach-Object {
            # Attempt to cast the property's value as an integer
            $_.Value -as [int]
        }
        # Output the average for the current line
        Write-Output ($Numbers | Measure-Object -Average).Average
    }
}
Then use like:
$CsvLines = Import-Csv Document.csv
Get-CsvAverage $CsvLines
For your sample input this would produce:
5.66666666666667
4.33333333333333
If the data separator in your file is a space:
Solution 1:
import-csv C:\temp\test.csv -Delimiter ' ' | select @{Name="Average";Expression={[math]::Round(([Decimal]$_.V1 + [Decimal]$_.V2 + [Decimal]$_.V3)/3, 2)}}
Solution 2:
import-csv C:\temp\test.csv -Delimiter ' ' | %{[math]::Round(([Decimal]$_.V1 + [Decimal]$_.V2 + [Decimal]$_.V3)/3, 2)}
Solution 3:
Get-Content C:\temp\test.csv | select -Skip 1 | %{$avg=0; $_ -split " " | %{$avg+=[decimal]$_};[math]::Round($avg/3, 2) }

PowerShell: ConvertTo-Csv and then Set-Content by object filter

I have the following variables:
$input = "H:\input_file.txt"
$output = "H:\output_file.txt"
$data = Get-Content -Path $input | ConvertFrom-Csv -Delimiter '|' -Header 'Col1','Col2','Col3','Col4','Col5'
As seen above, the input file has 5 columns. This file has various record types in it. Not all record types have 5 columns defined. So, let's say I have 3 record types -- A, B, and C. A has 3 columns, B has 4 columns, and C has 5 columns. An example input file looks like:
A|x|1
B|y|2|stuff
C|z|3|stuff|other
B|y|3|other
A|z|2
My script then makes some modifications to the values in some of the columns (except for Col1) in $data. I want to output all rows in $data to a text file.
If I do something like
$data | Select Col1,Col2,Col3,Col4,Col5 | ConvertTo-Csv -Delimiter '|' -NoTypeInformation | % {$_ -replace '"', ""} | Select-Object -Skip 1 | Set-Content -Path $output
it will append unnecessary pipe characters to record types A and B (because they have fewer than 5 columns, and yet I am doing Select Col1,Col2,Col3,Col4,Col5).
Is there a clean way to output $data to a text file without the unnecessary pipe characters on record types A and B? My best guess at the moment is to have 3 separate pipelines for record types A, B, and C, such that I am doing the correct Select for the given record type, and then gluing them all together somehow.
The following code might be useful for you. It strips the trailing pipe characters from each line:
@"
A|x|1||
B|y|2|stuff|
C|z|3|stuff|other
B|y|3|other|
A|z|2||
"@ -split '\r?\n' | % { $_ -replace '\|+$', '' }
This is a little tricky, and I feel it is better handled with some string manipulation:
$data|ConvertTo-Csv -NoTypeInformation|ForEach-Object{$_.replace('","','|').replace('"','').replace(',','')}|Select-Object -Skip 1|Set-Content $outputtext
Here, by avoiding -Delimiter, we can make use of PowerShell's string-handling methods. Note that the final replace also removes any commas that appear inside the values.
Regards,
Prasoon Karunan V
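Another way to avoid the trailing delimiters is to skip ConvertTo-Csv and join the property values directly, trimming the pipe characters left by empty trailing columns. This is a minimal sketch under two assumptions: no value contains a pipe or needs quoting, and empty columns only occur at the end of a record. The sample data and the $lines variable are illustrative.

```powershell
# Sample records standing in for the contents of the input file
$data = 'A|x|1', 'B|y|2|stuff', 'C|z|3|stuff|other' |
    ConvertFrom-Csv -Delimiter '|' -Header 'Col1','Col2','Col3','Col4','Col5'

$lines = $data | ForEach-Object {
    # Join all property values with '|', then drop the pipes
    # produced by empty trailing columns
    ($_.psobject.Properties.Value -join '|').TrimEnd('|')
}
$lines
```

The resulting strings can then be written with Set-Content, as in the question.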

How to change column position in powershell?

Is there an easy way to change column positions? I want to move column 1 from the beginning to the end of each row, and I would also like to add a zero column as the second-to-last column. Please see the sample text file below.
Thank you for any suggestions.
File sample
TEXT1,02/10/2015,55.930,57.005,55.600,56.890,1890
TEXT2,02/10/2015,51.060,52.620,50.850,52.510,4935
TEXT3,02/10/2015,50.014,50.74,55.55,52.55,5551
Output:
02/10/2015,55.930,57.005,55.600,56.890,1890,0,TEXT1
02/10/2015,51.060,52.620,50.850,52.510,4935,0,TEXT2
02/10/2015,50.014,50.74,55.55,52.55,5551,0,TEXT3
Another option:
#Prepare test file
(@'
TEXT1,02/10/2015,55.930,57.005,55.600,56.890,1890
TEXT2,02/10/2015,51.060,52.620,50.850,52.510,4935
TEXT3,02/10/2015,50.014,50.74,55.55,52.55,5551
'@).split("`n") |
foreach {$_.trim()} |
sc testfile.txt
#Script starts here
$file = 'testfile.txt'
(get-content $file -ReadCount 0) |
foreach {
'{1},{2},{3},{4},{5},{6},0,{0}' -f $_.split(',')
} | Set-Content $file
#End of script
#show results
get-content $file
02/10/2015,55.930,57.005,55.600,56.890,1890,0,TEXT1
02/10/2015,51.060,52.620,50.850,52.510,4935,0,TEXT2
02/10/2015,50.014,50.74,55.55,52.55,5551,0,TEXT3
Sure, split on commas, spit the results back minus the first result joined by commas, add a 0, and then add the first result to the end and join the whole thing with commas. Something like:
# $lines is used instead of $Input, which is an automatic variable in PowerShell
$lines = @"
TEXT1,02/10/2015,55.930,57.005,55.600,56.890,1890
TEXT2,02/10/2015,51.060,52.620,50.850,52.510,4935
TEXT3,02/10/2015,50.014,50.74,55.55,52.55,5551
"@ -split "`n" | ForEach { $_.Trim() }
$lines | ForEach {
    $split = $_.Split(',')
    ($split[1..($split.Count-1)] -join ','), 0, $split[0] -join ','
}
I created the file test.txt to contain your sample data. I assigned each field a name ("one", "two", "three", etc.) so that I could select the fields by name, then selected them and exported back to CSV in the order you wanted.
First, add the zero to the end of each line. After the reordering it will end up as the second-to-last column.
gc .\test.txt | %{ "$_,0" } | Out-File test1.txt
Then, rearrange the order, reading the file written in the previous step.
Import-Csv .\test1.txt -Header "one","two","three","four","five","six","seven","eight" | Select-Object -Property two,three,four,five,six,seven,eight,one | Export-Csv test2.txt -NoTypeInformation
This will take the output file and get rid of quotes and header line if you would rather not have them.
gc .\test2.txt | %{ $_.replace('"','')} | Select-Object -Skip 1 | out-file test3.txt