I have several large CSV files that I need to split based on a match in one column.
The column is called "Grading Period Title" and there are up to 10 different values. I want to write all rows with the value "Overall" to overall.csv and all rows with any other value to term.csv, preserving all the other columns of data.
Grading Period Title  Score
Overall               5
22-23 MC T2           6
Overall               7
22-23 T2              1
I found this code to group and split by all the values, but I can't get it to split overall and all other values into 2 files
#Splitting a CSV file into multiple files based on column value
$groups = Import-Csv -Path "csvfile.csv" | Group-Object 'Grading Period Title' -eq '*Overall*'
$groups | ForEach-Object {$_.Group | Export-Csv "$($_.Name).csv" -NoTypeInformation}
Count Name Group
278 22-23 MC T2
71657 Overall
71275 22-23 T2
104 22-23 8th Blk Q2
So they are grouped, but I don't know how to select the Overall group as one file, and the rest as the 2nd file and name them.
thanks!
To just split it, you can filter with Where-Object, for example:
# overall group
Import-Csv -Path "csvfile.csv" |
Where-Object { $_.'Grading Period Title' -Like '*Overall*' } |
Export-CSV -Path "overall.csv" -NoTypeInformation
# looks like
Grading Period Title Score
-------------------- -----
Overall 5
Overall 7
# term group
Import-Csv -Path "csvfile.csv" |
Where-Object { $_.'Grading Period Title' -NotLike '*Overall*' } |
Export-CSV -Path "term.csv" -NoTypeInformation
# looks like
Grading Period Title Score
-------------------- -----
22-23 MC T2 6
22-23 T2 1
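If you would rather stay close to the Group-Object attempt from the question, a minimal sketch is to group on a computed key instead of the raw column value, so every row falls into one of two buckets. The temp folder and the sample rows below are assumptions made only to keep the example self-contained; swap in your own path. Note that Group-Object collects all rows in memory before writing, so for very large files the streaming approaches may be the better choice.

```powershell
# Build a small sample file (hypothetical data mirroring the question).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'split-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$source = Join-Path $dir 'csvfile.csv'
@(
    [pscustomobject]@{ 'Grading Period Title' = 'Overall';     Score = 5 }
    [pscustomobject]@{ 'Grading Period Title' = '22-23 MC T2'; Score = 6 }
    [pscustomobject]@{ 'Grading Period Title' = 'Overall';     Score = 7 }
    [pscustomobject]@{ 'Grading Period Title' = '22-23 T2';    Score = 1 }
) | Export-Csv -Path $source -NoTypeInformation

# Group by a script block that returns 'overall' or 'term',
# then write each group to a file named after its key.
Import-Csv -Path $source |
    Group-Object -Property {
        if ($_.'Grading Period Title' -like '*Overall*') { 'overall' } else { 'term' }
    } |
    ForEach-Object {
        $_.Group | Export-Csv -Path (Join-Path $dir "$($_.Name).csv") -NoTypeInformation
    }
```

The group's Name property holds whatever the script block returned, which is why it can double as the output file name.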
To complement the clear answer from @Cpt.Whale, you can do this in one iteration using the Steppable Pipeline:
Import-Csv .\csvfile.csv |
    ForEach-Object -Begin {
        $Overall = { Export-CSV -notype -Path .\Overall.csv }.GetSteppablePipeline()
        $Term = { Export-CSV -notype -Path .\Term.csv }.GetSteppablePipeline()
        $Overall.Begin($True)
        $Term.Begin($True)
    } -Process {
        if ( $_.'Grading Period Title' -Like '*Overall*' ) {
            $Overall.Process($_)
        }
        else {
            $Term.Process($_)
        }
    } -End {
        $Overall.End()
        $Term.End()
    }
For details, see: Mastering the (steppable) pipeline.
I am trying to get a list of files and a count of the number of rows in each file displayed in a table consisting of two columns, Name and Lines.
I have tried using format table but I don't think the problem is with the format of the table and more to do with my results being separate results. See below
#Get a list of files in the filepath location
$files = Get-ChildItem $filepath
$files | ForEach-Object { $_ ; $_ | Get-Content | Measure-Object -Line} | Format-Table Name,Lines
Expected results
Name    Lines
File A  9
File B  89
Actual Results
Name Lines
File A
9
File B
89
Another way to build a custom object like this is to use PowerShell's calculated properties:
$files | Select-Object -Property @{ N = 'Name' ; E = { $_.Name } },
                                 @{ N = 'Lines'; E = { ($_ | Get-Content | Measure-Object -Line).Lines } }
Name Lines
---- -----
dotNetEnumClass.ps1 232
DotNetVersions.ps1 9
dotNETversionTable.ps1 64
Typically you would make a custom object like this, instead of outputting two different kinds of objects.
$files | ForEach-Object {
    $lines = $_ | Get-Content | Measure-Object -Line
    [pscustomobject]@{
        name  = $_.Name
        lines = $lines.Lines
    }
}
name lines
---- -----
rof.ps1 11
rof.ps1~ 7
wai.ps1 2
wai.ps1~ 1
Below you can see my code. I'm trying to calculate the average of each line of my CSV file. The only way I have been able to do this is by using it as an array. My question is, is there a way to pass each line through the function so that I don't have to create multiple functions?
The file looks like this:
V1 V2 V3
5 9 3
5 6 2
Script:
Function Average($a) {
Foreach ($line in $a) {
$total = [int]$a[0].V1 + [int]$a[0].V2 + [int]$a[0].V3
}
return "The Average is $($total / 3)"
}
#Variables
$a = (Import-Csv "Document.csv")
#Logic
Average
You can access the properties of each line (without knowing their names) through the psobject reference:
function Get-CsvAverage
{
param(
[psobject[]]$Lines
)
# Loop through all lines of input
foreach($Line in $Lines){
# For each "line" object, loop through its properties
$Numbers = $Line.psobject.Properties |ForEach-Object {
# Attempt to cast the property's value as an integer
$_.Value -as [int]
}
# Output the average for the current line
Write-Output ($numbers | Measure-Object -Average).Average
}
}
Then use like:
$CsvLines = Import-Csv Document.csv
Get-CsvAverage $CsvLines
For your sample input this would produce:
5.66666666666667
4.33333333333333
If the data separator in your file is a space:
Solution 1:
import-csv C:\temp\test.csv -Delimiter ' ' | select #{Name="Average";Expression={[math]::Round(([Decimal]$_.V1 + [Decimal]$_.V2 + [Decimal]$_.V3)/3, 2)}}
Solution 2
import-csv C:\temp\test.csv -Delimiter ' ' | %{[math]::Round(([Decimal]$_.V1 + [Decimal]$_.V2 + [Decimal]$_.V3)/3, 2)}
Solution 3
Get-Content C:\temp\test.csv | select -Skip 1 | %{$avg=0; $_ -split " " | %{$avg+=[decimal]$_};[math]::Round($avg/3, 2) }
I have the following variables:
$input = "H:\input_file.txt"
$output = "H:\output_file.txt"
$data = Get-Content -Path $input | ConvertFrom-Csv -Delimiter '|' -Header 'Col1','Col2','Col3','Col4','Col5'
As seen above, the input file has 5 columns. This file has various record types in it. Not all record types have 5 columns defined. So, let's say I have 3 record types -- A, B, and C. A has 3 columns, B has 4 columns, and C has 5 columns. An example input file looks like:
A|x|1
B|y|2|stuff
C|z|3|stuff|other
B|y|3|other
A|z|2
My script then makes some modifications to the values in some of the columns (except for Col1) in $data. I want to output all rows in $data to a text file.
If I do something like
$data | Select Col1,Col2,Col3,Col4,Col5 | ConvertTo-Csv -Delimiter '|' -NoTypeInformation | % {$_ -replace '"', ""} | Select-Object -Skip 1 | Set-Content -Path $output
it will append unnecessary pipe characters to record types A and B (because they have fewer than 5 columns, and yet I am doing Select Col1,Col2,Col3,Col4,Col5).
Is there a clean way to output $data to a text file without the unnecessary pipe characters on record types A and B? My best guess at the moment is to have 3 separate pipelines for record types A, B, and C, such that I am doing the correct Select for the given record type, and then gluing them all together somehow.
The following code might be useful for you:
@"
A|x|1||
B|y|2|stuff|
C|z|3|stuff|other
B|y|3|other|
A|z|2||
"@ -split '\r?\n' | % { $_ -replace '\|+$', '' }
This seems a little tricky; I think it is better handled with some string manipulation:
$data|ConvertTo-Csv -NoTypeInformation|ForEach-Object{$_.replace('","','|').replace('"','').replace(',','')}|Select-Object -Skip 1|Set-Content $outputtext
Here, by avoiding -Delimiter, we can make use of PowerShell's string handling methods.
Regards,
Prasoon Karunan V
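Another option, sketched here under the assumption that ConvertFrom-Csv leaves the missing trailing columns as $null (rather than empty strings), is to skip ConvertTo-Csv entirely and join only the properties that actually have a value. The sample rows are inlined so the snippet stands alone, and $lines is just an illustrative name.

```powershell
# Sample records with 3, 4 and 5 fields, parsed the same way as in the question.
$rows = @'
A|x|1
B|y|2|stuff
C|z|3|stuff|other
'@ -split '\r?\n' |
    ConvertFrom-Csv -Delimiter '|' -Header 'Col1','Col2','Col3','Col4','Col5'

# For each record, keep only the properties that were present in the input
# (missing trailing columns are assumed to be $null) and join them with '|'.
$lines = $rows | ForEach-Object {
    $values = $_.psobject.Properties |
        Where-Object { $null -ne $_.Value } |
        ForEach-Object { $_.Value }
    $values -join '|'
}

$lines
```

Piping $lines to Set-Content would then write the file with no quotes, no header row and no padding pipes. A field that is present but empty in the input stays as an empty string and is kept, so only the columns that are truly absent get dropped.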
Is there an easy way to change column position? I'm looking for a way to move column 1 from the beginning to the end of each row, and I would also like to add a zero column as the second-to-last column. Please see the txt file example below.
Thank you for any suggestions.
File sample
TEXT1,02/10/2015,55.930,57.005,55.600,56.890,1890
TEXT2,02/10/2015,51.060,52.620,50.850,52.510,4935
TEXT3,02/10/2015,50.014,50.74,55.55,52.55,5551
Output:
02/10/2015,55.930,57.005,55.600,56.890,1890,0,TEXT1
02/10/2015,51.060,52.620,50.850,52.510,4935,0,TEXT2
02/10/2015,50.014,50.74,55.55,52.55,5551,0,TEXT3
Another option:
#Prepare test file
(@'
TEXT1,02/10/2015,55.930,57.005,55.600,56.890,1890
TEXT2,02/10/2015,51.060,52.620,50.850,52.510,4935
TEXT3,02/10/2015,50.014,50.74,55.55,52.55,5551
'@).split("`n") |
foreach {$_.trim()} |
sc testfile.txt
#Script starts here
$file = 'testfile.txt'
(get-content $file -ReadCount 0) |
foreach {
'{1},{2},{3},{4},{5},{6},0,{0}' -f $_.split(',')
} | Set-Content $file
#End of script
#show results
get-content $file
02/10/2015,55.930,57.005,55.600,56.890,1890,0,TEXT1
02/10/2015,51.060,52.620,50.850,52.510,4935,0,TEXT2
02/10/2015,50.014,50.74,55.55,52.55,5551,0,TEXT3
Sure, split on commas, spit the results back minus the first result joined by commas, add a 0, and then add the first result to the end and join the whole thing with commas. Something like:
# $Lines is used instead of $Input, because $Input is an automatic variable in PowerShell
$Lines = @"
TEXT1,02/10/2015,55.930,57.005,55.600,56.890,1890
TEXT2,02/10/2015,51.060,52.620,50.850,52.510,4935
TEXT3,02/10/2015,50.014,50.74,55.55,52.55,5551
"@ -split "`n"|ForEach{$_.trim()}
$Lines|ForEach{
$split = $_.split(',')
($Split[1..($split.count-1)]-join ','),0,$split[0] -join ','
}
I created the file test.txt to contain your sample data. I assigned each field a name ("one", "two", "three", etc.) so that I could select them by name, then selected and exported them back to CSV in the order you wanted.
First, append the zero to the end of each line; after the reorder it ends up second to last.
gc .\test.txt | %{ "$_,0" } | Out-File test1.txt
Then, rearrange order.
Import-Csv .\test1.txt -Header "one","two","three","four","five","six","seven","eight" | Select-Object -Property two,three,four,five,six,seven,eight,one | Export-Csv test2.txt -NoTypeInformation
This will take the output file and get rid of quotes and header line if you would rather not have them.
gc .\test2.txt | %{ $_.replace('"','')} | Select-Object -Skip 1 | out-file test3.txt