Filter CSV file with powershell - powershell

I am new to PowerShell. I am trying to filter one of my scan results (.csv) on a specific value. I can do this for simple CSV files, but the automated scan result appears to be nested.
I need to filter column "F" where the value is "Vuln" (image added) and write the matching rows to a new CSV file.
Can anyone please give me some leads?
I tried simple lines like this, but I didn't get any result:
Import-CSV -Path "C:\test.csv" | Where-Object {$_.Type -eq "Vuln"}
Sample CSV file format

You're mostly there; it looks like the problem is coming from the .csv file itself.
Frankly, it isn't a truly valid CSV file: rather than a single comma-separated table, it contains several types of data.
However, something close to a real CSV table begins on line 6, so we need to skip the first few lines of the file and then convert the rest from CSV. I made my own .csv like yours.
I can read the file and skip the first five lines like so:
Get-Content C:\temp\input.csv | Select-Object -Skip 5 | ConvertFrom-Csv
HostName  OS    Type
--------  --    ----
SomePC123 WinXp Vuln
SomePC234 Win7  Vuln
SomePc345 Win10 Patched
And then filter down to just items of Type Vuln with this command:
Get-Content C:\temp\input.csv | Select-Object -Skip 5 | ConvertFrom-Csv | Where-Object Type -eq 'Vuln'
HostName  OS    Type
--------  --    ----
SomePC123 WinXp Vuln
SomePC234 Win7  Vuln
To use this, just count down the number of lines until the spreadsheet begins within your .CSV and edit the -Skip parameter to match.
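If counting lines by hand is inconvenient, the header row can also be located programmatically. The following is only a sketch: the in-memory sample lines and the assumption that the table's first column is called HostName are illustrative, not taken from the real scan file.

```powershell
# Hypothetical sample: five preamble lines followed by the real CSV table.
$lines = @(
    'Scan report',
    'Generated by: scanner',
    '',
    'Summary',
    '',
    'HostName,OS,Type',
    'SomePC123,WinXp,Vuln',
    'SomePC234,Win7,Vuln',
    'SomePc345,Win10,Patched'
)

# Walk forward until the first line that starts with the known first column name.
$headerIndex = 0
while ($headerIndex -lt $lines.Count -and $lines[$headerIndex] -notmatch '^HostName,') {
    $headerIndex++
}

# Skip everything above the header, parse the rest as CSV, and keep only the Vuln rows.
$vulns = $lines |
    Select-Object -Skip $headerIndex |
    ConvertFrom-Csv |
    Where-Object Type -eq 'Vuln'

$vulns
```

With a real file, $lines would come from Get-Content instead of the literal array.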
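If you want to keep the header information (the lines above the table), you can use an approach like this one:
# The five preamble lines that precede the real CSV table
$crap = Get-Content C:\temp\input.csv | Select-Object -First 5
# Convert the filtered rows back to CSV text so they can be joined with the preamble strings
$OnlyVulns = Get-Content C:\temp\input.csv | Select-Object -Skip 5 | ConvertFrom-Csv | Where-Object Type -eq 'Vuln' | ConvertTo-Csv -NoTypeInformation
$CrapAndOnlyVulns = @($crap) + @($OnlyVulns)
$CrapAndOnlyVulns | Set-Content C:\pathTo\NewFile.csv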

Here is the final script :)
$import = get-content .\Scan.csv
$import1= $import | Select-Object -First 7
$import | Select-Object -Skip 7 | ConvertFrom-Csv | Where-Object {$_.Type -eq "Vuln"} | Export-Csv Output1.csv -NoClobber -NoTypeInformation
$import1 + (Get-Content Output1.csv) | Set-Content Output2.csv

Related

Export-Csv adding unwanted header double quotes

I have got a source CSV file (without a header, all columns delimited by a comma) which I am trying split out into separate CSV files based upon the value in the first column and using that column value as the output file name.
Input file:
S00000009,2016,M04 01/07/2016,0.00,0.00,0.00,0.00,0.00,0.00,750.00,0.00,0.00
S00000009,2016,M05 01/08/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00
S00000009,2016,M06 01/09/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00
S00000010,2015,W28 05/10/2015,2275.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
S00000010,2015,W41 04/01/2016,0.00,0.00,0.00,0.00,0.00,0.00,568.75,0.00,0.00
S00000010,2015,W42 11/01/2016,0.00,0.00,0.00,0.00,0.00,0.00,568.75,0.00,0.00
S00000012,2015,W10 01/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
S00000012,2015,W11 08/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
S00000012,2015,W12 15/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
My PowerShell script looks like this:
Import-Csv INPUT_FILE.csv -Header service_id,year,period,cash_exp,cash_inc,cash_def,act_exp,act_inc,act_def,comm_exp,comm_inc,comm_def |
Group-Object -Property "service_id" |
Foreach-Object {
$path = $_.Name + ".csv";
$_.group | Export-Csv -Path $path -NoTypeInformation
}
Output files:
S00000009.csv:
"service_id","year","period","cash_exp","cash_inc","cash_def","act_exp","act_inc","act_def","comm_exp","comm_inc","comm_def"
"S00000009","2016","M04 01/07/2016","0.00","0.00","0.00","0.00","0.00","0.00","750.00","0.00","0.00"
"S00000009","2016","M05 01/08/2016","0.00","0.00","0.00","0.00","0.00","0.00","600.00","0.00","0.00"
"S00000009","2016","M06 01/09/2016","0.00","0.00","0.00","0.00","0.00","0.00","600.00","0.00","0.00"
S00000010.csv:
"service_id","year","period","cash_exp","cash_inc","cash_def","act_exp","act_inc","act_def","comm_exp","comm_inc","comm_def"
"S00000010","2015","W28 05/10/2015","2275.00","0.00","0.00","0.00","0.00","0.00","0.00","0.00","0.00"
"S00000010","2015","W41 04/01/2016","0.00","0.00","0.00","0.00","0.00","0.00","568.75","0.00","0.00"
"S00000010","2015","W42 11/01/2016","0.00","0.00","0.00","0.00","0.00","0.00","568.75","0.00","0.00"
It is generating the new files using the header value in column 1 (service_id).
There are 2 problems.
The output CSV file contains a header row which I don't need.
The columns are enclosed with double quotes which I don't need.
Export-Csv always writes a header row, and in Windows PowerShell it always wraps every field in double quotes. If you don't want either, you can convert to CSV text and strip them yourself:
$temp = Import-Csv INPUT_FILE.csv -Header service_id,year,period,cash_exp,cash_inc,cash_def,act_exp,act_inc,act_def,comm_exp,comm_inc,comm_def | Group-Object -Property "service_id" |
Foreach-Object {
$path=$_.name+".csv"
$temp0 = $_.group | ConvertTo-Csv -NoTypeInformation | Select-Object -Skip 1
$temp1 = $temp0.replace("""","")
$temp1 > $path
}
Note that this strips every double quote, so the output is only valid CSV as long as no field contains a comma or a quote.
Hope that helps.
For your particular scenario you could probably use a simpler approach. Read the input file as a plain text file, group the lines by splitting off the first field, then write the groups to output files named after the groups:
Get-Content 'INPUT_FILE.csv' |
Group-Object { $_.Split(',')[0] } |
ForEach-Object { $_.Group | Set-Content ($_.Name + '.csv') }
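To see what the grouping step produces before any files are written, here is a minimal sketch on in-memory lines. The shortened sample rows are made up for illustration; the script block passed to Group-Object is the same one used above.

```powershell
# Hypothetical, shortened sample rows (no header, first field is the service id).
$lines = @(
    'S00000009,2016,M04 01/07/2016,0.00',
    'S00000009,2016,M05 01/08/2016,0.00',
    'S00000010,2015,W28 05/10/2015,2275.00'
)

# Group the raw text lines by everything before the first comma.
$groups = $lines | Group-Object { $_.Split(',')[0] }

# Each group's Name becomes the file name; its Group holds the untouched lines.
$groups | ForEach-Object { '{0}.csv <- {1} line(s)' -f $_.Name, $_.Count }
```

Because the lines are never parsed into objects, no header or quotes are added when they are written back out.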
Another solution, which uses plain numbers instead of named headers (they aren't wanted in the output anyway), avoids unnecessary temporary files, and removes only the field-delimiting double quotes:
Import-Csv INPUT_FILE.csv -Header (1..12) |
Group-Object -Property "1" | Foreach-Object {
($_.Group | ConvertTo-Csv -NoType | Select-Object -Skip 1).Trim('"') -replace '","',',' |
Set-Content -Path ("{0}.csv" -f $_.Name)
}
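The quote-stripping expression can be checked on a single record. This is a sketch with a made-up three-column object; it assumes ConvertTo-Csv quotes every field, which is its default behaviour.

```powershell
# Hypothetical record using numeric property names, as in the pipeline above.
$row = [pscustomobject]@{ '1' = 'S00000009'; '2' = '2016'; '3' = 'M04 01/07/2016' }

# ConvertTo-Csv emits a header line and a data line; drop the header.
$quoted = $row | ConvertTo-Csv -NoTypeInformation | Select-Object -Skip 1

# Trim the outermost quotes, then turn each "," field boundary into a bare comma.
$line = $quoted.Trim('"') -replace '","', ','

$line
```

Quotes inside a field value are left alone, since only the outer quotes and the "," boundaries are touched. On PowerShell 7 and later, ConvertTo-Csv and Export-Csv also accept -UseQuotes Never, which may make the string manipulation unnecessary.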

Powershell: Comparing 2 CSVs and adding from 1 to the other, without duplicating

I had a question about copying from one CSV to another, without creating duplicates. I figured it out. See the accepted answer below.
Thanks.
Yet, the answer is pretty easy:
@(Get-Content .\masterlist.csv) + @(Get-Content .\update.csv) | Select -Unique | Out-File .\masterlist.csv
This is what I came up with (adds unique from update to master, then checks master for duplicates):
$updatefile = 'C:\path\to\file\update.csv'
$masterlist = 'C:\path\to\file\masterlist.csv'
get-content $updatefile | Select -Unique | add-content $masterlist
(Get-Content $masterlist | Group-Object | %{$_.group | select -First 1}) | Out-File $masterlist -encoding ASCII
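The merge-and-deduplicate idea can be tried on in-memory lines first. The sample rows below are invented for illustration; with real files the two arrays would come from Get-Content.

```powershell
# Hypothetical contents of the master list and the update file, one string per line.
$master = @('id,name', '1,alice', '2,bob')
$update = @('id,name', '2,bob', '3,carol')

# Concatenate both line arrays, then keep only the first occurrence of each line.
$merged = @($master) + @($update) | Select-Object -Unique

$merged
```

The comparison is on whole lines, so two rows that differ only in whitespace or quoting are treated as different.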

Import-Csv Select -Skip

I have multiple CSV files that need to be merged to one. In every single CSV file there is a header and in the second row some text that I don't need.
I noticed the | Select -Skip 1 statement for the headers. Now I was wondering how I can skip the 3rd row?
I tried this, but this gives me an empty file
Get-ChildItem -Path $CSVFolder -Recurse -Filter "*.csv" | %{
Import-Csv $_.FullName -Header header1, header3, header4 |
Select -Skip 1 | Select -Skip 2
} | Export-Csv "C:\Export\result.csv" -NoTypeInformation
Select-Object doesn't allow you to skip arbitrary rows in between other rows. If you want to remove a particular row from the input, you can do so with a counter. The counter is zero-based, so this example drops the fourth record:
$cnt = 0
Import-Csv 'C:\path\to\input.csv' |
Where-Object { ($cnt++) -ne 3 } |
Export-Csv 'C:\path\to\output.csv' -NoType
If the records in your input CSV don't have nested line breaks you could also use Get-Content/Set-Content, which is probably a little faster than Import-Csv/Export-Csv (due to less parsing overhead). Increase the line number you want to skip by one to account for the header line.
$cnt = 0
Get-Content 'C:\path\to\input.csv' |
Where-Object { ($cnt++) -ne 4 } |
Set-Content 'C:\path\to\output.csv'
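A quick way to confirm which record the counter removes is to run it on in-memory data. This is a sketch with invented rows; ConvertFrom-Csv stands in for Import-Csv so no file is needed.

```powershell
# Hypothetical CSV text: one header line and five records.
$csv = @('Name,Code', 'a,1', 'b,2', 'c,3', 'd,4', 'e,5')

# The counter starts at 0 and is incremented once per record,
# so the record seen when it equals 3 (the fourth one) is dropped.
$cnt = 0
$kept = $csv | ConvertFrom-Csv | Where-Object { ($cnt++) -ne 3 }

$kept
```

Change the number in the comparison to target a different record.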
try this
$i=0;
import-csv "C:\temp2\missing.csv" | %{$i++; if ($i -ne 3) {$_}} | export-csv "C:\temp2\result.csv" -NoTypeInformation
If all you are doing is skipping the first three rows in all use cases, just use -Skip 3.
Get-Content -Path 'D:\Temp\UserRecord.csv'
# Results
<#
Name Codes
------- ---------
John AJFKC,EFUY
Ben EFOID, EIUF
Alex OIPORE, OUOIJE
#>
# Return everything after the first 3 lines (header, separator line, and first record)
(Get-Content -Path 'D:\Temp\UserRecord.csv') |
Select -Skip 3
# Results
<#
Ben EFOID, EIUF
Alex OIPORE, OUOIJE
#>
See also:
Parsing Text with PowerShell (1/3)

How to write the header from a .csv file to an array using powershell?

How do I read only the header from a CSV file and write the column names into an array?
I have found a solution using following cmdlets:
$obj = Import-Csv '.\users.csv' -Delimiter ';'
$headerarray = ($obj | Get-member -MemberType 'NoteProperty' | Select-Object -ExpandProperty 'Name')
But the problem is that the names are automatically sorted alphabetically.
Does anyone have a solution for this?
You can get the column names of a CSV file like this:
import-csv <csvfilename> |
select-object -first 1 | foreach-object { $_.PSObject.Properties } |
select-object -expandproperty Name
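The difference between the two approaches shows up clearly on a header that is not already in alphabetical order. This is a sketch on in-memory data with made-up column names.

```powershell
# Hypothetical semicolon-delimited data whose columns are deliberately unsorted.
$rows = @('zeta;alpha;mid', '1;2;3') | ConvertFrom-Csv -Delimiter ';'

# PSObject.Properties keeps the columns in file order.
$ordered = $rows |
    Select-Object -First 1 |
    ForEach-Object { $_.PSObject.Properties } |
    Select-Object -ExpandProperty Name

# Get-Member returns the same names sorted alphabetically.
$sorted = $rows |
    Get-Member -MemberType NoteProperty |
    Select-Object -ExpandProperty Name

"File order:   $($ordered -join ', ')"
"Sorted order: $($sorted -join ', ')"
```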

Using Import-CSV in Powershell, ignoring commented lines

I think that I must be missing something obvious because I'm trying to use Import-CSV to import CSV files that have commented out lines (always beginning with a # as the first character) at the top of the file, so the file looks like this:
#[SpecialCSV],,,,,,,,,,,,,,,,,,,,
#Version,1.0.0,,,,,,,,,,,,,,,,,,,
#,,,,,,,,,,,,,,,,,,,,
#,,,,,,,,,,,,,,,,,,,,
#[Table],,,,,,,,,,,,,,,,,,,,
Header1,Header2,Header3,Header4,Header5,Header6,Header7,...
Data1,Data2,Data3,Data4,Data5,Data6,Data7,...
I'd like to ignore those first 5 lines, but still use Import-Csv to get the rest of the information nicely into PowerShell.
Thanks
Simple - just use Select-String to exclude commented lines with a regex, and pipe to ConvertFrom-Csv:
Get-Content <path to CSV file> | Select-String '^[^#]' | ConvertFrom-Csv
The difference between Import-Csv and ConvertFrom-Csv is that the former takes input from a file and the latter takes pipeline input; otherwise they do the same thing: convert CSV data to an array of PSCustomObjects. So, by using ConvertFrom-Csv you can do this without modifying the CSV file or using a temp file. You can assign the results to an array or pipe to a Foreach-Object block just as you'd do with Import-Csv:
$array = Get-Content <path to CSV file> | Select-String '^[^#]' | ConvertFrom-Csv
or
Get-Content <path to CSV file> | Select-String '^[^#]' | ConvertFrom-Csv | %{
<whatever you want do with the data>
}
CSV has no notion of "comments" - it's just flat data. You'll need to use Get-Content and inspect each line. If a line starts with #, ignore it, otherwise process it.
If you're OK with using a temp file:
Get-content special.csv |where-object{!$_.StartsWith("#")}|add-content -path $(join-path -path $env:temp -childpath "special-filtered.csv");
$mydata = import-csv -path $(join-path -path $env:temp -childpath "special-filtered.csv");
remove-item -path $(join-path -path $env:temp -childpath "special-filtered.csv")
$mydata |format-table -autosize; #Just for illustration
Edit: Forgot about convertfrom-csv. It gets much simpler this way.
$mydata = Get-Content special.csv |
Where-Object { !$_.StartsWith("#") } |
ConvertFrom-Csv
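The same filter can be exercised without a file. The lines below are a shortened, invented version of the sample in the question.

```powershell
# Hypothetical, shortened sample: commented preamble, then header and data.
$lines = @(
    '#[SpecialCSV],,',
    '#Version,1.0.0,',
    '#[Table],,',
    'Header1,Header2,Header3',
    'Data1,Data2,Data3'
)

# Drop every line that starts with '#', then parse what is left as CSV.
$mydata = $lines |
    Where-Object { -not $_.StartsWith('#') } |
    ConvertFrom-Csv

$mydata
```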
If you feed convertfrom-csv csv data as an array of lines it seems to automatically filter out comments. I frequently use convertfrom-csv this way but I haven't seen it documented.
cat data.csv | convertfrom-csv #skips commented lines automagically
("co1,col2,col3", "abc,def,ghi", "#this,is,a,comment", "abc1,def1,ghi1")|convertfrom-csv
co1  col2 col3
---  ---- ----
abc  def  ghi
abc1 def1 ghi1
However, the following will not skip comments:
"co1,col2,col3
abc,def,ghi
#this,is,a,comment
abc1,def1,ghi1
"|convertfrom-csv
co1   col2 col3
---   ---- ----
abc   def  ghi
#this is   a
abc1  def1 ghi1
Where-object will work after import-csv as well. You just have to reference the first column from csv in the clause.
e.g.:
$EscapeCharacter = '#'
$FilteredData = Import-Csv -Path "$($Home)\Documents\sample.csv" -Delimiter "`t" -Encoding UTF8 | Where-Object {$_.coll1 -notlike "$EscapeCharacter*"}
The sample of tab delimited csv:
coll1 coll2
#Kotehulky SomeValue
Cakovice OtherValue
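A small in-memory sketch of this filter, using ConvertFrom-Csv in place of Import-Csv so that no file is required. The tab characters are written explicitly as `t because the sample above does not show them.

```powershell
# Hypothetical tab-delimited lines matching the sample: header first, then data.
$lines = @(
    "coll1`tcoll2",
    "#Kotehulky`tSomeValue",
    "Cakovice`tOtherValue"
)

$EscapeCharacter = '#'

# Parse first, then drop records whose first column starts with the escape character.
$FilteredData = $lines |
    ConvertFrom-Csv -Delimiter "`t" |
    Where-Object { $_.coll1 -notlike "$EscapeCharacter*" }

$FilteredData
```

This only suits comment rows that appear after the header. If commented lines come before the header, as in the question, the first of them would be read as the header row, so they need to be removed before parsing.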