How to select the first 10 columns of a headerless csv file using PowerShell? - powershell

I have a CSV file called test.csv ($testCSV).
There are many columns in this file but I would simply like to select the first 10 columns and put these 10 columns in to another CSV file.
Please note that I DO NOT HAVE ANY COLUMN HEADERS so can not select columns based on a column name.
The below line of code will get the first 10 ROWS of the file:
$first10Rows = Get-Content $testCSV | select -First 10
However I need all the data for the first 10 COLUMNS and I am struggling to find a solution.
I have also had a look at splitting the file and attempting to return the first column as follows:
$split = ( Get-Content $testCSV) -split ','
$FirstColumn = $split[0]
I had hoped the $split[0] would return the entire first column but it only returns the very first field in the file.
Any help in solving this problem is very much appreciated.
Thanks in advance.
******UPDATE******
I am using the method as answered below by vonPryz to solve this problem, i.e.:
Import-Csv -Delimiter "," -Header @("a","b","c") -Path $testCSV | Select a,b
However I am now also trying to import the CSV file only where column b is not null by adding this extra bit of code:
Import-Csv -Delimiter "," -Header @("a","b","c") -Path $testCSV | Select a,b | where b -notmatch $null
I need to do this to speed up the script as there are tens of thousands of lines where column b is null and I do not need to import these lines.
However, the above code returns no data, either meaning the code must be wrong or it thinks the field b is not null. An example of 2 lines of the text file is:
1,2,3
x,,z
And I only want the line(s) where the second column is occupied.
I hope I've explained that well and again, any help is appreciated.
*******************ANSWER********************
Import-Csv -Delimiter "," -Header @("a","b","c") -Path $testCSV | Select a,b | Where-Object { $_.b -ne '' }
Thanks!
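A minimal sketch of why the `-notmatch $null` attempt returns nothing while the `-ne ''` filter works. It uses the two sample lines from the question, fed through ConvertFrom-Csv instead of a file; the variable names are made up for the illustration. On the right-hand side of -match / -notmatch, $null is converted to an empty pattern, and an empty pattern matches every string.

```powershell
# The two sample lines from the question, parsed with placeholder headers a, b, c
$rows = '1,2,3', 'x,,z' | ConvertFrom-Csv -Header 'a','b','c'

# $null becomes an empty regex, which matches any string,
# so -notmatch $null is false for every row
$viaNotMatch = @($rows | Where-Object { $_.b -notmatch $null })

# Comparing against the empty string keeps only rows with a value in column b
$viaNotEqual = @($rows | Where-Object { $_.b -ne '' } | Select-Object a, b)

"rows kept by -notmatch: $($viaNotMatch.Count)"
"rows kept by -ne '':    $($viaNotEqual.Count)"
```

Note that the filter runs after each line has been parsed, so it trims the result set and does not skip the import work for the empty rows.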

Lack of column headers is no problem. The Import-Csv cmdlet can supply column names through its -Header parameter. Assuming test data is saved as C:\temp\headerless.csv and contains
val11,val12,val13,val14
val21,val22,val23,val24
val31,val32,val33,val34
Importing it as CSV is trivial:
Import-Csv -Delimiter "," -Header @("a","b","c","d") -Path C:\temp\headerless.csv
#Output
a b c d
- - - -
val11 val12 val13 val14
val21 val22 val23 val24
val31 val32 val33 val34
Selecting just columns a and b is not hard either:
Import-Csv -Delimiter "," -Header @("a","b","c","d") -Path C:\temp\headerless.csv | select a,b | ft -auto
#Output
a b
- -
val11 val12
val21 val22
val31 val32
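For the original "first 10 columns" case, typing ten names by hand can be avoided. The sketch below is one way to do it, under a few assumptions: the file paths and the c1..c10 placeholder names are invented for the example, and the sample file is generated on the spot. It relies on Import-Csv dropping any columns beyond the last supplied header.

```powershell
# Hypothetical sample file: 12 columns, no header row
$testCSV = Join-Path ([System.IO.Path]::GetTempPath()) 'headerless-demo.csv'
$outCSV  = Join-Path ([System.IO.Path]::GetTempPath()) 'first10-demo.csv'
@(
    (1..12 | ForEach-Object { "a$_" }) -join ','
    (1..12 | ForEach-Object { "b$_" }) -join ','
) | Set-Content -Path $testCSV

# Placeholder names c1..c10; columns without a header are not imported
$header = 1..10 | ForEach-Object { "c$_" }

Import-Csv -Path $testCSV -Header $header |
    ConvertTo-Csv -NoTypeInformation |
    Select-Object -Skip 1 |      # drop the placeholder header line again
    Set-Content -Path $outCSV

$first10 = Get-Content -Path $outCSV
$first10
```

By default ConvertTo-Csv wraps each value in double quotes, so the output file will contain quoted fields even if the input did not.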

To start I want to mention that vonPryz's answer is a superb way of dealing with this. I just wanted to chime in about what you were trying to do and why it was not working.
You had the right idea in splitting the data on commas. However, you were not doing this line by line; you split the file as a whole, which was the source of your woes.
Get-Content $testCSV | ForEach-Object{
$split = $_ -split ","
$FirstColumn = $split[0]
}
That splits each line individually, so $FirstColumn holds the first field of the line currently being processed. To build the whole column, collect those values as the loop runs.
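As a rough sketch of the per-line split taken one step further, the following collects the first column and also keeps the first N fields of every line. The sample file, `$columnCount` and `$firstColumns` are made up for the example. It assumes no field contains a quoted comma; a plain -split would cut such a field in two, which is where Import-Csv is the safer choice.

```powershell
# Hypothetical sample file with four columns and no header
$testCSV = Join-Path ([System.IO.Path]::GetTempPath()) 'split-demo.csv'
'1,2,3,4', 'x,y,z,w' | Set-Content -Path $testCSV

$columnCount = 2   # would be 10 for the original question

# Collect the first field of every line into an array
$FirstColumn = Get-Content $testCSV | ForEach-Object { ($_ -split ',')[0] }

# Keep the first $columnCount fields of every line
$firstColumns = Get-Content $testCSV | ForEach-Object {
    $fields = $_ -split ','
    $last = [Math]::Min($columnCount, $fields.Count) - 1   # guard against short lines
    $fields[0..$last] -join ','
}

$FirstColumn
$firstColumns
```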

Related

Read CSV row 1 columns and save them to variables

I would like to read data from CSV or other text files. Only row 1 should be read: a few of its columns should be saved to variables, and after saving them the row should be deleted. So far I have done it like this:
Get-ChildItem -Path C:\path | ForEach-Object -Process {
$YourContent = Get-Content -Path $_.FullName
$YourVariable = $YourContent | Select-Object -First 1
$YourContent | Select-Object -Skip 1 | Set-Content -Path $_.FullName
}
My problem is that my variable prints out like this:
Elvis;867.5390;elvis@geocities.com
So I would like to save each value to its own variable, one per column. An example of what the CSV could look like:
Elvis | 867.5309 | Elvis@Geocities.com
Sammy | 555.1234 | SamSosa@Hotmail.com
Use Import-Csv instead of Get-Content:
Import-Csv file.csv -Delimiter ";" -Header A, B, C
here's one way to do what i think you want.
the first statements, up to the Set-Content call, make a file to work with. [grin]
the Get-Content call reads in that file
the $TempObject assignment converts the 1st line into an object & removes the unwanted property
the next statement grabs all BUT the 1st line & sends it to overwrite the source file
the remaining lines show what was done [grin]
Code:
$FileName = "$env:TEMP\Pimeydentimo.txt"
# create a file to work with
@'
Alfa;123.456;Some unwanted info;Alfa@example.com
Bravo;234.567;More info that can be dropped;Bravo@example.com
Charlie;345.678;This is also ignoreable;Charlie@example.com
'@ | Set-Content -LiteralPath $FileName
$InStuff = Get-Content -LiteralPath $FileName
$TempObject = $InStuff[0] |
ConvertFrom-Csv -Delimiter ';' -Header 'Name', 'Number', 'DropThisOne', 'Email' |
Select-Object -Property * -ExcludeProperty DropThisOne
$InStuff[1..$InStuff.GetUpperBound(0)] |
Set-Content -LiteralPath $FileName
$InStuff
'=' * 30
$TempObject
'=' * 30
Get-Content -LiteralPath $FileName
output ...
Alfa;123.456;Some unwanted info;Alfa@example.com
Bravo;234.567;More info that can be dropped;Bravo@example.com
Charlie;345.678;This is also ignoreable;Charlie@example.com
==============================
Name Number Email
---- ------ -----
Alfa 123.456 Alfa@example.com
==============================
Bravo;234.567;More info that can be dropped;Bravo@example.com
Charlie;345.678;This is also ignoreable;Charlie@example.com
Thanks for the answers!
Let me clarify a bit more what I was trying to do. The answers might do it already, but I'm not yet that good at PowerShell and am still learning a lot.
If I have a CSV or any other text file, I want to read the first row of the file. The row contains more than one piece of information, and I want to save each piece to its own variable. After saving the information to variables, I would like to delete the row.
Example:
Car Model Year
Ford Fiesta 2015
Audi A6 2018
In this example, I would like to save Ford, Fiesta and 2015 from row 1 to variables ($Car, $Model, $Year) and then delete that row. The 2nd row should not be deleted, because it is used later on.
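One possible sketch for the clarified request. It assumes the file is semicolon-delimited and starts with a header line (Car;Model;Year); the temp file name and its contents are invented for the illustration, so adjust the delimiter to whatever the real files use.

```powershell
# Hypothetical sample file with a header line and two data rows
$FileName = Join-Path ([System.IO.Path]::GetTempPath()) 'cars-demo.csv'
'Car;Model;Year', 'Ford;Fiesta;2015', 'Audi;A6;2018' | Set-Content -LiteralPath $FileName

# Read the whole file first so it can be safely overwritten afterwards
$lines = @(Get-Content -LiteralPath $FileName)

# Header line plus first data row -> one object with named properties
$first = $lines[0..1] | ConvertFrom-Csv -Delimiter ';'
$Car   = $first.Car
$Model = $first.Model
$Year  = $first.Year

# Write back the header and every row after the first data row
@($lines[0]) + @($lines | Select-Object -Skip 2) | Set-Content -LiteralPath $FileName

"$Car / $Model / $Year"
Get-Content -LiteralPath $FileName
```

The values arrive as strings; cast `$Year` with `[int]` if a number is needed.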

Selecting columns from flat file in power shell with no column name

I am new to PowerShell, and I have data in the format below (pipe-delimited) with no column names:
01|1|06/28/2017 00:00:00|06/28/2017 00:00:00
I want to select the third (or any other) column from this format. I have tried the code below:
$columns=(Get-Content $filepath | Out-String | select -Skip 2 -First 1).Split("|")
but it is not working. Can anyone help, please?
Use Import-Csv with -Header and -Delimiter specified; that way, you get a structure (PSCustomObject[]) with properties that you can reference directly and meaningfully. For example,
$EntryList = Import-CSV -Path $FilePath -Header ID,Type,StartTime,EndTime -Delimiter '|'
gets you an array of PSCustomObjects, where each object has the indicated fields. You can then (for example) refer to $EntryList[$n].ID, $EntryList[$n].StartTime, and so on.
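A short sketch of both routes to the third column, by name and by position. The temp file is made up for the example and holds the single sample line from the question. As for the original attempt: Out-String merges all lines into one string, so `select -Skip 2` skips that single item and nothing is left to call .Split on.

```powershell
# Hypothetical sample file holding the line from the question
$FilePath = Join-Path ([System.IO.Path]::GetTempPath()) 'pipe-demo.txt'
'01|1|06/28/2017 00:00:00|06/28/2017 00:00:00' | Set-Content -Path $FilePath

# By name: uses the header names suggested in the answer above
$EntryList = @(Import-Csv -Path $FilePath -Header ID,Type,StartTime,EndTime -Delimiter '|')
$thirdByName = $EntryList | ForEach-Object { $_.StartTime }

# By position: -split takes a regular expression, so the pipe must be escaped
$thirdByIndex = Get-Content -Path $FilePath | ForEach-Object { ($_ -split '\|')[2] }

$thirdByName
$thirdByIndex
```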

Powershell removing columns and rows from CSV

I'm having trouble making some changes to a series of CSV files, all with the same data structure. I'm trying to combine all of the files into one CSV file or one tab delimited text file (don't really mind), however each file needs to have 2 empty rows removed and two of the columns removed, below is an example:
col1,col2,col3,col4,col5,col6 <-remove
col1,col2,col3,col4,col5,col6 <-remove
col1,col2,col3,col4,col5,col6
col1,col2,col3,col4,col5,col6
           ^         ^
         remove    remove
End Result:
col1,col2,col4,col6
col1,col2,col4,col6
This is my attempt at doing this (I'm very new to Powershell)
$ListofFiles = "example.csv" #this is an list of all the CSV files
ForEach ($file in $ListofFiles)
{
$content = Get-Content ($file)
$content = $content[2..($content.Count)]
$contentArray = @()
[string[]]$contentArray = $content -split ","
$content = $content[0..2 + 4 + 6]
Add-Content '...\output.txt' $content
}
Where am I going wrong here...
Your example file should be read before the foreach, to fetch the list of files:
$ListofFiles = get-content "example.csv"
Inside the foreach you are getting the content of the main file
$content = Get-Content ($ListofFiles)
instead of
$content = Get-Content $file
For removing rows, I recommend selecting the rows to keep by index:
$obj = get-content C:\t.csv | select -Index 0,1,3
For removing columns, keep only the column indexes you want (here 0, 1, 3 and 5):
$obj | %{(($_.split(","))[0,1,3,5]) -join "," } | out-file test.csv -Append
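Putting the row and column steps together for several files might look like the sketch below. The directory, file names and sample rows are all invented for the illustration, and it assumes, as in the question's diagram, that the first two lines of each file are the ones to drop. The output gets a .txt name so that it is not picked up by the *.csv filter.

```powershell
# Hypothetical working directory with two identical six-column files
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'merge-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$lines = 1..4 | ForEach-Object { $r = $_; (1..6 | ForEach-Object { "r${r}c$_" }) -join ',' }
$lines | Set-Content -Path (Join-Path $dir 'one.csv')
$lines | Set-Content -Path (Join-Path $dir 'two.csv')
$outFile = Join-Path $dir 'output.txt'

Get-ChildItem -Path $dir -Filter *.csv | ForEach-Object {
    Get-Content -Path $_.FullName |
        Select-Object -Skip 2 |                                   # drop the first two rows
        ForEach-Object { ($_ -split ',')[0,1,3,5] -join ',' }     # keep columns 1, 2, 4 and 6
} | Set-Content -Path $outFile                                    # written once, after all files

$merged = Get-Content -Path $outFile
$merged
```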
Given that the initial files look like
col1,col2,col3,col4,col5,col6
col1,col2,col3,col4,col5,col6
,,,,,
,,,,,
You can also try this one liner
Import-Csv D:\temp\*.csv -Header 'C1','C2','C3','C4','C5','C6' | where {$_.c1 -ne ''} | select -Property 'C1','C2','C5' | Export-Csv 'd:\temp\final.csv' -NoTypeInformation
Since your CSVs all have the same structure, you can open them directly while providing the header, then remove the objects with missing data, then export all the objects to a single CSV file.
It is enough to specify fictitious column names (their number may exceed the number of columns in the file), filter on whichever column you want, and exclude the columns you do not want to keep.
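# -Property * is needed for -ExcludeProperty to take effect in Windows PowerShell;
# Export-Csv sits after the loop so that each file does not overwrite the previous result
gci "c:\yourdirwithcsv" -file -filter *.csv |
%{ Import-Csv $_.FullName -Header C1,C2,C3,C4,C5,C6 |
where C1 -ne '' |
select -Property * -ExcludeProperty C3, C4
} |
export-csv "c:\temp\merged.csv" -NoTypeInformation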

Use Import-Csv to read changable column Titles by location

I'm trying to see if there is a way to read the column values in a csv file based on the column location. The reason for this is that the file I'm being handed always has its titles changed...
For example, let's say csv file column A (via Excel) looks like the following:
ColumnOne
ValueOne
ValueTwo
ValueThree
Now the user changes the title:
Column 1
ValueOne
ValueTwo
ValueThree
Now I want to create an array of the first column. Normally what I do is the following:
$arrayFirstColumn = Import-Csv 'C:\test\test1.csv' | where-object {$_.ColumnOne} | select-object -expand 'ColumnOne'
However, as we can see if ColumnOne is changed to Column 1, it breaks this code. How can I create this array to allow an interchangeable column title, but the column location will always be the same?
You can specify headers of your own on import:
Import-Csv 'C:\path\to\your.csv' -Header 'MyHeaderA','MyHeaderB',...
As long as you don't export the data back to a CSV (or don't require the original headers to be in the output CSV as well) you can use whatever names you like. You can also specify as many header names as you like. If their number is less than the number of the columns in the CSV the additional columns will be omitted, if it's greater then the columns for the additional headers will be empty.
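One detail worth a quick sketch: when -Header is supplied for a file that already has a title row, that row is read as ordinary data, so it usually needs to be skipped. The file path, the second column and the names First and Second are invented for this illustration.

```powershell
# Hypothetical sample file whose title row may change between deliveries
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'titles-demo.csv'
'Column 1,Other', 'ValueOne,1', 'ValueTwo,2', 'ValueThree,3' | Set-Content -Path $csvPath

$arrayFirstColumn = Import-Csv -Path $csvPath -Header 'First','Second' |
    Select-Object -Skip 1 |              # the file's own title row arrives as data
    Where-Object { $_.First } |          # ignore rows with an empty first column
    Select-Object -ExpandProperty First

$arrayFirstColumn
```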
If you need to preserve the original headers you could get the header name(s) you need to work with in variable(s) like this:
$csv = Import-Csv 'C:\test\test1.csv'
$firstCol = $csv | Select-Object -First 1 | ForEach-Object {
$_.PSObject.Properties | Select-Object -First 1 -Expand Name
}
$arrayFirstColumn = $csv | Where-Object {$_.$firstCol} |
Select-Object -Expand $firstCol
Or you could simply read the first line from the CSV and split it to get an array with the headers:
$headers = (Get-Content 'C:\test\test1.csv' -TotalCount 1) -split ','
$firstCol = $headers[0]
One option:
$ImportFile = 'C:\test\test1.csv'
$FirstColumn = ((Get-Content $ImportFile -TotalCount 2 | ConvertFrom-Csv).psobject.properties.name)[0]
$FirstColumn
$arrayFirstColumn = Import-Csv $ImportFile | where-object {$_.$FirstColumn} | select-object -expand $FirstColumn
If you are using PowerShell v2.0 then the expression for $FirstColumn in mjolinor's answer would be:
$FirstColumn = ((Get-Content $ImportFile -TotalCount 2 | ConvertFrom-Csv).psobject.properties | ForEach-Object {$_.name})[0]
(Apologies for starting a new answer; I do not yet have enough reputation to add a comment to mjolinor's post)

PowerShell: ConvertTo-Csv and then Set-Content by object filter

I have the following variables:
$input = "H:\input_file.txt"
$output = "H:\output_file.txt"
$data = Get-Content -Path $input | ConvertFrom-Csv -Delimiter '|' -Header 'Col1','Col2','Col3','Col4','Col5'
As seen above, the input file has 5 columns. This file has various record types in it. Not all record types have 5 columns defined. So, let's say I have 3 record types -- A, B, and C. A has 3 columns, B has 4 columns, and C has 5 columns. An example input file looks like:
A|x|1
B|y|2|stuff
C|z|3|stuff|other
B|y|3|other
A|z|2
My script then makes some modifications to the values in some of the columns (except for Col1) in $data. I want to output all rows in $data to a text file.
If I do something like
$data | Select Col1,Col2,Col3,Col4,Col5 | ConvertTo-Csv -Delimiter '|' -NoTypeInformation | % {$_ -replace '"', ""} | Select-Object -Skip 1 | Set-Content -Path $output
it will append unnecessary pipe characters to record types A and B (because they have less than 5 columns, and yet I am doing Select Col1,Col2,Col3,Col4,Col5).
Is there a clean way to output $data to a text file without the unnecessary pipe characters on record types A and B? My best guess at the moment is to have 3 separate pipelines for record types A, B, and C, such that I am doing the correct Select for the given record type, and then gluing them all together somehow.
The following code might be useful for you:
@"
A|x|1||
B|y|2|stuff|
C|z|3|stuff|other
B|y|3|other|
A|z|2||
"@ -split '\r\n' | % { $_ -replace '\|+$', '' }
It seems a little tricky, and I feel it is better to do some string manipulation:
$data|ConvertTo-Csv -NoTypeInformation|ForEach-Object{$_.replace('","','|').replace('"','').replace(',','')}|Select-Object -Skip 1|Set-Content $outputtext
Here, by avoiding -Delimiter, we can make use of some of the string handling methods in PowerShell.
Regards,
Prasoon Karunan V
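As a sketch of how the trailing-pipe regex from the first answer could be applied to the $data pipeline from the question: the sample rows are fed in directly instead of from H:\input_file.txt, and `$linesOut` is a name made up for the example. It assumes the values contain no double quotes or pipes of their own, and that a record never legitimately ends in an empty column, since those trailing delimiters would be stripped as well.

```powershell
# Sample records from the question, parsed with the same five placeholder headers
$data = 'A|x|1', 'B|y|2|stuff', 'C|z|3|stuff|other' |
    ConvertFrom-Csv -Delimiter '|' -Header 'Col1','Col2','Col3','Col4','Col5'

$linesOut = $data |
    ConvertTo-Csv -Delimiter '|' -NoTypeInformation |
    Select-Object -Skip 1 |                                        # drop the header line
    ForEach-Object { ($_ -replace '"', '') -replace '\|+$', '' }   # strip quotes, then trailing pipes

$linesOut
```

Piping `$linesOut` to Set-Content then writes the records with their original column counts.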