Powershell Replace Regex Import CSV File - powershell

I have a CSV file named test.csv (C:\testing\test.csv) in this format:
File Name,Location,Added (GMT),Created (GMT),Last Modified (GMT),File Size (Bytes),File Size,Extension,Incident Type
10-MB-Test (1).docx,\\blah\Test 3,10/8/2020 21:13,10/8/2020 19:33,10/8/2020 16:26,10723331,10.23 (MB),docx,low_data_discover
10-MB-Test (1).xlsx,\\blah2\Test 3\,10/8/2020 21:14,10/8/2020 19:33,10/8/2020 16:25,9566567,9.12 (MB),xlsx,high_data_discover
1-MB-Test.docx,\\blah3\Test 3\,10/8/2020 21:13,10/8/2020 19:33,10/8/2020 16:37,1045970,1021.46 (KB),docx,medium_data_discover
I'm trying to strip the trailing "\" character (if one exists) from each value in the Location column using this PowerShell code:
$file1 = import-csv -path "C:\testing\test.csv" | % {$_."Location" -replace "\\$",""} | Select-Object * | export-csv -NoTypeInformation "C:\testing\blah.csv"
However, when I run the code, the only output I get is a column named "Length" with a numerical value. Can you assist?

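You're only sending the new string (the updated location) down the pipeline, not the original row objects. Export-Csv writes the properties of whatever it receives, and a string's only property is Length, which is why you get a single Length column. Instead, update the Location property on each object and export the objects at the end.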
$file1 = import-csv -path "C:\testing\test.csv"
$file1 | ForEach-Object {$_.location = $_.location -replace '\\$'}
$file1 | export-csv -NoTypeInformation "C:\testing\blah.csv"
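To see why the original pipeline produced a Length column, here is a small sketch that uses in-memory data instead of C:\testing\test.csv. The two sample rows and the variable names are made up for illustration. It pipes strings and then objects into ConvertTo-Csv, which formats rows the same way Export-Csv does.

```powershell
# Strings expose a single property, Length, so that is the only column produced.
$fromStrings = 'abc', 'de' | ConvertTo-Csv -NoTypeInformation

# Hypothetical two-row sample standing in for test.csv.
$rows = @'
File Name,Location
a.docx,\\blah\Test 3
b.xlsx,\\blah2\Test 3\
'@ | ConvertFrom-Csv

# Update the property in place; the row objects keep all their columns.
$rows | ForEach-Object { $_.Location = $_.Location -replace '\\$' }
$fixed = $rows | ConvertTo-Csv -NoTypeInformation

$fromStrings
$fixed
```

The first result contains only a "Length" header followed by the string lengths, while the second keeps both columns and drops the trailing backslash.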

Related

Copy altered CSV Data to new CSV

The whole point of this issue is going to be: How to copy data from one CSV to another without knowing/listing the headers of the original CSV.
The cmdlet I'm building is meant to convert a report from CSV to a spreadsheet eventually. And if I write the column headers to the code, each time somebody changes the report, the code will break and it would have to be updated.
The steps I would take right now:
# Import the Source CSV. Gonna pull data from this later.
$SourceCSV = Import-Csv -Path $reportSourceCSV -Delimiter ";"
# Remove NULL characters, white spaces and change comma separator to semicolon.
(Get-Content -Path $reportSourceCSV | Where-Object {-not [string]::IsNullOrWhiteSpace($PSItem)}).Replace('","',";") | Out-File -FilePath $TMP1
# Import the modified new temp CSV.
$Input = Import-Csv -Path $TMP1 -Delimiter ";"
# Take existing CSV file headers and append some new ones. Rename a long column name.
((($GetHeaders = foreach ($Header in $SourceCSV[0].PSObject.Properties.Name) {
"`"$Header`""
}) + '"column4"','"column5"','"column6"') -join ";").Replace("VerylongOldColumnName","ShortName") | Out-File -FilePath $TMP2
foreach ($Item in $Input) {
"`"$($Item.column1)`";`"$($Item.'column2')`";`"$($Item.column3)`"" | Out-File -FilePath $TMP2 -Append
}
$exportToXLSX = Import-Csv -Path $TMP2 -Delimiter ";" | Export-Excel -Path $Target -WorkSheetname "reportname" -TableName "tablename" -TableStyle Medium2 -FreezeTopRow -AutoSize -PassThru
$exportToXLSX.Save()
$exportToXLSX.Dispose()
Remove-Item -Path $TMP1, $TMP2
This works! But I don't want to create an endless number of different reports with just as many different logic blocks to process them.
So this is as far as I was able to get trying a more dynamic way of processing the report CSVs:
(Get-Content -Path $reportSourceCSV | Where-Object {-not [string]::IsNullOrWhiteSpace($PSItem)}).Replace('","',";") | Out-File -FilePath $TMP1
$import = Import-Csv -Path $TMP1 -Delimiter ";"
$headers = ($import[0].PSObject.Properties.Name).Replace("VerylongOldColumnName","ShortName")
$headers | Out-File -FilePath "C:\TEMP\test.csv"
foreach ($item in $import) {
for ($h = 0; $h -le ($headers).Count; $h++) {
$($item.$($headers[$h]))
}
}
Now, this works... kind of. If I run the script like this, it shows me the output I want, but I was NOT able to export this to CSV.
I added Export-Csv to this line: $($item.$($headers[$h])) so this particular line would look like this:
$($item.$($headers[$h])) | Export-Csv -Path $Output -Delimiter ";" -Append -NoTypeInformation
And this is the error I get:
Export-Csv : Cannot append CSV content to the following file: C:\TEMP\test.csv.
The appended object does not have a property that corresponds to the following
column: column1. To continue with mismatched properties, add the -Force parameter,
and then retry the command.
At line:11 char:36
+ ... ers[$h])) | Export-Csv -Path $Output -Delimiter ";" -Append -NoTypeIn ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidData: (column1:String) [Export-Csv], InvalidOperationException
+ FullyQualifiedErrorId : CannotAppendCsvWithMismatchedPropertyNames,Microsoft.PowerShell.Commands.ExportCsvCommand
If I add -Force parameter, the output will be the headers and a bunch of empty lines.
As far as I understand, the output is for some reason a string? To my knowledge everything should be an object in PS unless it is converted to a string (the Write-Host cmdlet being an exception), and I don't really know how to force the output back into objects.
Edit: Added sample source CSV
"Plugin","Plugin Name","Family","Severity","IP Address","Protocol","Port","Exploit?","Repository","DNS Name","NetBIOS Name","Plugin Text","Synopsis","Description","Solution","See Also","Vulnerability Priority Rating","CVSS V3 Base Score","CVSS V3 Temporal Score","CVSS V3 Vector","CPE","CVE","Cross References","First Discovered","Last Observed","Vuln Publication Date","Patch Publication Date","Exploit Ease","Exploit Frameworks"
"65057","Insecure Windows Service Permissions","Windows","High","127.0.0.1","TCP","445","No","Individual Scan","computer.domain.tld","NetBIOS Name","Plugin Output:
Path : c:\program files (x86)\application\folder\service.exe
Used by services : application
File write allowed for groups : Users, Authenticated Users
Full control of directory allowed for groups : Users, Authenticated Users","At least one improperly configured Windows service may have a privilege escalation vulnerability.","At least one Windows service executable with insecure permissions was detected on the remote host. Services configured to use an executable with weak permissions are vulnerable to privilege escalation attacks.
An unprivileged user could modify or overwrite the executable with arbitrary code, which would be executed the next time the service is started. Depending on the user that the service runs as, this could result in privilege escalation.
This plugin checks if any of the following groups have permissions to modify executable files that are started by Windows services :
- Everyone
- Users
- Domain Users
- Authenticated Users","Ensure the groups listed above do not have permissions to modify or write service executables. Additionally, ensure these groups do not have Full Control permission to any directories that contain service executables.","http://www.nessus.org/u?e4e766b2","","8.4","","AV:L/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H","cpe:/o:microsoft:windows","","","Jul 11, 2029 06:48:20 CEST","Jul 11, 2029 06:48:20 CEST","N/A","N/A","",""
Edit: I think I found another way to accomplish this, and looking at it, it seems I tried to overdo it quite a bit.
# Doing cleanup, changing delimiters, renaming that one known column. All in one line.
$importCSV = 'C:\TEMP\sourceReport.csv'
(Get-Content -Path $importCSV | Where-Object {-not [string]::IsNullOrWhiteSpace($PSItem)}).Replace('","','";"').Replace("VerylongOldColumnName","ShortName") | Out-File -FilePath C:\TEMP\tmp1.csv
# Adding additional columns and exporting it all to result CSV.
Import-Csv -Path C:\TEMP\tmp1.csv -Delimiter ";" | Select-Object *, "Column1", "Column2" | Export-Csv -Path C:\TEMP\result.csv -NoTypeInformation -Delimiter ";"
You should not simply replace , with ; because the fields themselves contain commas, as in "..Additionally, ensure these groups ..". With a blanket replacement, part of a field gets separated from the rest of its content and you end up with a misaligned CSV.
The below approach will do what you want, leaving the structure of the csv file intact:
$importCSV = 'C:\TEMP\sourceReport.csv'
$exportCSV = 'C:\TEMP\result.csv'
$columnsToAdd = "Column1", "Column2"
# read the file as string array, not including empty lines
$content = Get-Content -Path $importCSV | Where-Object { $_ -match '\S' }
# replace the column header in the top line only
$content[0] = $content[0].Replace("VerylongOldColumnName", "ShortName")
# join the string array with newlines and convert that to an object with ConvertFrom-Csv
# add the columns to the object and export it using the semi-colon as delimiter
($content -join [Environment]::NewLine) | ConvertFrom-Csv |
Select-Object -Property (@('*') + $columnsToAdd) |
Export-Csv -Path $exportCSV -NoTypeInformation -Delimiter ";"
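As a rough illustration of why parsing beats text replacement here, the sketch below feeds a tiny made-up CSV (one quoted field containing a comma and a line break) through ConvertFrom-Csv, adds the two extra columns, and switches the delimiter on output. The sample text and variable names are assumptions. ConvertTo-Csv is used in place of Export-Csv so that nothing is written to disk.

```powershell
# Hypothetical sample: the Description field contains a comma and a newline.
$csvText = @'
"Name","Description"
"svc","Line one, with a comma
and a second line"
'@

# The parser keeps the quoted field together as one value.
$rows = @($csvText | ConvertFrom-Csv)

# Append empty columns, then emit with ';' as the delimiter.
$extra = 'Column1', 'Column2'
$out = $rows |
    Select-Object -Property (@('*') + $extra) |
    ConvertTo-Csv -NoTypeInformation -Delimiter ';'

$rows.Count
$out[0]
```

Only one data row comes back, and the header line gains the two new column names, which suggests the embedded comma and line break never need special handling.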

I use -NoTypeInformation so why do I get header back when using Out-File?

I filtered by date this file data1.csv
2017.11.1,09:55,1.1,1.2,1.3,1.4,1
2017.11.2,09:55,1.5,1.6,1.7,1.8,2
I don't get a header with -NoTypeInformation:
$CutOff = (Get-Date).AddDays(-2)
$filePath = "data1.csv"
$Data = Import-Csv $filePath -Header Date,Time,A,B,C,D,E
$Data2 = $Data | Where-Object {$_.Date -as [datetime] -gt $Cutoff} | convertto-csv -NoTypeInformation -Delimiter "," | % {$_ -replace '"',''}
But when rewriting with Out-File
$Data2 | Out-File "data2.csv" -Encoding utf8 -Force
I get header back as data2.csv contains:
Date,Time,A,B,C,D,E
2017.11.2,09:55,1.5,1.6,1.7,1.8,2
Why do I have Date,Time,A,B,C,D,E ?
-NoTypeInformation is not about the column header but about the #TYPE line that describes the object type of the rows. Remove it (in Windows PowerShell) to see what shows up. From Microsoft:
Omits the type information header from the output. By default, the string in the output contains #TYPE followed by the fully-qualified name of the object type.
Emphasis mine.
CSVs need headers. That is why it is making one. If you don't want to see the header in the output use Select-Object -Skip 1 to remove it.
$Data |
Where-Object {$_.Date -as [datetime] -gt $Cutoff} |
ConvertTo-CSV -NoTypeInformation -Delimiter "," |
Select-Object -Skip 1 |
% {$_ -replace '"'}
I would not pipe Out-File to itself. You could pipe to Set-Content here just as well.
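A small sketch of the point above, using two shortened sample rows and made-up variable names: the header row is produced regardless of -NoTypeInformation, and Select-Object -Skip 1 is what removes it.

```powershell
# Shortened, hypothetical sample rows with three columns.
$rows = '2017.11.1,09:55,1.1', '2017.11.2,09:55,1.5' |
    ConvertFrom-Csv -Header Date, Time, A

# The first output line is still the column header.
$withHeader = $rows | ConvertTo-Csv -NoTypeInformation

# Drop the header line, then strip the quotes from the data lines.
$dataOnly = $withHeader | Select-Object -Skip 1 | ForEach-Object { $_ -replace '"' }

$withHeader[0]
$dataOnly
```

Note that in PowerShell 6 and later the #TYPE line is already omitted by default, so the switch mainly matters in Windows PowerShell.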
I am guessing this whole process is to keep the source file in the same state just with some lines filtered out based on date. You could skip most of this just by parsing the date out in each line.
$threshold = (Get-Date).AddDays(-2)
$filePath = "c:\temp\bagel.txt"
(Get-Content $filePath) | Where-Object{
$date,$null=$_.Split(",",2)
[datetime]$date -gt $threshold
} | Set-Content $filePath
Now you don't have to worry about PowerShell CSV object structure or output since we act on the raw data of the file itself.
That takes each line of the input file and drops it unless the parsed date is later than the threshold. Change the encoding on the input and output cmdlets as you see necessary. $date,$null=$_.Split(",",2) splits the line on the first comma into 2 parts. The first part becomes $date, and since this is just a filtering condition, the rest of the line is dumped into $null.
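The two-part split can be tried on a single line in isolation. In this sketch the rest of the line is kept in a variable named $rest (instead of $null) purely to show what is discarded, and the date is parsed with ParseExact and an explicit format, which is a stricter substitute for the [datetime] cast used above and an assumption about the file's date layout.

```powershell
$line = '2017.11.2,09:55,1.5,1.6,1.7,1.8,2'

# Split on the first comma only: part one is the date, part two is everything else.
$date, $rest = $line.Split(',', 2)

# Assumed layout year.month.day; parsed culture-independently.
$parsed = [datetime]::ParseExact($date, 'yyyy.M.d', [cultureinfo]::InvariantCulture)

$date
$rest
$parsed.ToString('yyyy-MM-dd')
```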
Properly-formed CSV files must have column headers. Your use of -NoTypeInformation in generating the CSV does not affect column headers; instead, it affects whether the PowerShell object type information is included. If you Export-CSV without -NoTypeInformation, the first line of your CSV file will have a line that looks like #TYPE System.PSCustomObject, which you don't want if you're going to open the CSV in a spreadsheet program.
If you subsequently Import-CSV, the headers (Date, Time, A, B, C, D, E) are used to create the fields of a PSObject, so that you can refer to them using the standard dot notation (e.g., $CSV[$line].Date).
The ability to specify -Header on Import-CSV is essentially a "hack" to allow the cmdlet to handle files that are comma-separated, but which did not include column headers.

How can I shift column values and add new ones in a CSV

I have to create a new column in my CSV data with PowerShell.
There is my code:
$csv = Import-Csv .\test1.csv -Delimiter ';'
$NewCSVObject = @()
foreach ($item in $csv)
{
$NewCSVObject += $item | Add-Member -name "ref" -value " " -MemberType NoteProperty
}
$NewCSVObject | export-csv -Path ".\test2.csv" -NoType
$csv | Export-CSV -Path ".\test2.csv" -NoTypeInformation -Delimiter ";" -Append
When I open the file, the column is there but at the right, and I would like to have it at the left, as column A. I also don't know if I can export the two objects in one line like this (it doesn't work):
$csv,$NewCSVObject | Export-CSV -Path ".\test2.csv" -NoTypeInformation -Delimiter ";" -Append
The input file (It would have more lines than just the one):
A B C D E F G H
T-89 T-75 T-22 Y-23 Y-7 Y-71
The current output file:
A B C D E F G H
Y-23 Y-7 Y-71 ref: ref2:
The expected result in the Excel table, displaying "ref:" and "ref2:" before the product columns:
A B C D E F G H
ref: T-89 T-75 T-22 ref2: Y-23 Y-7 Y-71
This might be simpler if we just treat the file as a flat text file and save it in CSV format. You could use the CSV objects and shift the values into other columns, but that is not really necessary. Your approach of adding columns via Add-Member does not accomplish this goal, because it appends new columns instead of shifting values, so it would not match your desired output. Export-Csv also expects every object it writes to a file to have the same properties; you were mixing objects with different properties, which gave you the unexpected results.
This is a verbose way of doing this. You could shorten this easily with something like regular expressions (see below). I opted for this method since it is a little easier to follow what is going on.
# Equivalent to Get-Content $filepath. This just shows what I am doing and is a portable solution.
$fileContents = "A;B;C;D;E;F;G;H",
"T-89;T-75;T-22;Y-23;Y-7;Y-71",
"T-89;T-75;T-22;Y-23;Y-7;Y-71"
$newFile = "C:\temp\csv.csv"
# Write the header to the output file.
$fileContents[0] | Set-Content $newFile
# Process the rest of the lines.
$fileContents | Select-Object -Skip 1 | ForEach-Object{
# Split the line into its elements
$splitLine = $_ -split ";"
# Rejoin the elements. adding the ref strings
(@("ref:") + $splitLine[0..2] + "ref2:" + $splitLine[3..5]) -join ";"
} | Add-Content $newFile
What the last line is doing is concatenating arrays. It starts with "ref:", adds the first 3 elements of the split line, followed by "ref2:" and the remaining elements. That new array is joined on semicolons and sent down the pipeline to be written to the file.
If you are willing to give regex a shot this could be done with less code.
$fileContents = Get-Content "C:\source\file\path.csv"
$newFile = "C:\new\file\path.csv"
$fileContents[0] | Set-Content $newFile
($fileContents | Select-Object -Skip 1) -replace "((?:.*?;){3})(.*)",'ref:;$1ref2:;$2' | Add-Content $newFile
What that does is split each line beyond the first at the 3rd semicolon. The replacement string is built from the ref strings and the matched content.
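To check that the two approaches agree, this short sketch runs both on one sample line taken from the question. The variable names are made up for illustration.

```powershell
$line = 'T-89;T-75;T-22;Y-23;Y-7;Y-71'

# Array approach: split, then splice the ref markers in before joining again.
$parts = $line -split ';'
$viaArray = (@('ref:') + $parts[0..2] + 'ref2:' + $parts[3..5]) -join ';'

# Regex approach: group 1 is the first three fields (with their semicolons),
# group 2 is the remainder of the line.
$viaRegex = $line -replace '((?:.*?;){3})(.*)', 'ref:;$1ref2:;$2'

$viaArray
$viaRegex
```

Both expressions yield ref:;T-89;T-75;T-22;ref2:;Y-23;Y-7;Y-71. The regex form is shorter, while the array form is easier to adapt if the column positions change.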
You can use Select-Object to specify order.
Assuming your headers are A-H (I know that instead of A it should be ref, from the code, but not sure if T-89 etc are your other headers)
$NewCSVObject | Select-Object A,B,C,D,E,F,G,H | Export-Csv -Path ".\test2.csv" -NoType
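A minimal sketch of the ordering idea, assuming a row with only two made-up columns (A and B) plus the added ref property: Add-Member appends the new property at the end, and Select-Object then decides the column order that is written out.

```powershell
# Hypothetical row with two columns.
$row = [pscustomobject]@{ A = 'T-89'; B = 'T-75' }

# The new property is appended after the existing ones.
$row | Add-Member -NotePropertyName ref -NotePropertyValue ' '

# Listing ref first moves it to the leftmost column.
$ordered = $row | Select-Object ref, A, B | ConvertTo-Csv -NoTypeInformation -Delimiter ';'

$ordered
```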

Powershell removing columns and rows from CSV

I'm having trouble making some changes to a series of CSV files, all with the same data structure. I'm trying to combine all of the files into one CSV file or one tab delimited text file (don't really mind), however each file needs to have 2 empty rows removed and two of the columns removed, below is an example:
col1,col2,col3,col4,col5,col6 <-remove
col1,col2,col3,col4,col5,col6 <-remove
col1,col2,col3,col4,col5,col6
col1,col2,col3,col4,col5,col6
^ ^
remove remove
End Result:
col1,col2,col4,col6
col1,col2,col4,col6
This is my attempt at doing this (I'm very new to Powershell)
$ListofFiles = "example.csv" #this is an list of all the CSV files
ForEach ($file in $ListofFiles)
{
$content = Get-Content ($file)
$content = $content[2..($content.Count)]
$contentArray = @()
[string[]]$contentArray = $content -split ","
$content = $content[0..2 + 4 + 6]
Add-Content '...\output.txt' $content
}
Where am I going wrong here...
Your example file should be read before the foreach, to fetch the file list:
$ListofFiles = get-content "example.csv"
Inside the foreach you are getting the content of the main file,
$content = Get-Content ($ListofFiles)
instead of
$content = Get-Content $file
For removing rows I would recommend this, which keeps only the lines at the listed indexes:
$obj = get-content C:\t.csv | select -Index 0,1,3
For removing columns (keeping the columns at indexes 0, 1, 3 and 5):
$obj | %{(($_.split(","))[0,1,3,5]) -join "," } | out-file test.csv -Append
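Applied to the layout in the question (two leading rows to drop, columns 3 and 5 to drop), the same indexing technique might look like the sketch below. The four sample lines and the variable names are made up, and Select-Object -Skip 2 is used here instead of -Index because the rows to discard are the first two.

```powershell
# Hypothetical file content: the first two rows are unwanted.
$lines = 'x1,x2,x3,x4,x5,x6',
         'y1,y2,y3,y4,y5,y6',
         'a1,a2,a3,a4,a5,a6',
         'b1,b2,b3,b4,b5,b6'

# Skip the first two rows, then keep the fields at indexes 0, 1, 3 and 5.
$kept = $lines | Select-Object -Skip 2 | ForEach-Object {
    ($_.Split(','))[0, 1, 3, 5] -join ','
}

$kept
```

This plain split only suits files whose fields never contain quoted commas; for those, a CSV parser such as Import-Csv is the safer route.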
Assuming the initial files look like this:
col1,col2,col3,col4,col5,col6
col1,col2,col3,col4,col5,col6
,,,,,
,,,,,
You can also try this one liner
Import-Csv D:\temp\*.csv -Header 'C1','C2','C3','C4','C5','C6' | where {$_.c1 -ne ''} | select -Property 'C1','C2','C4','C6' | Export-Csv 'd:\temp\final.csv' -NoTypeInformation
Since your CSVs all have the same structure, you can open them directly while providing the header, remove the objects with missing data, then export all the objects to one CSV file.
It is enough to specify fictitious column names (their number can even exceed the number of columns in the file), adjust the where filter as needed, and exclude the columns that you do not want to keep.
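# -ExcludeProperty only takes effect together with a property list (here *) in Windows PowerShell.
# Export once, after the loop, so each file does not overwrite the previous one.
gci "c:\yourdirwithcsv" -file -filter *.csv |
%{ Import-Csv $_.FullName -Header C1,C2,C3,C4,C5,C6 |
where C1 -ne '' |
select * -ExcludeProperty C3, C5
} |
export-csv "c:\temp\merged.csv" -NoTypeInformation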

Powershell import-csv with empty headers

I'm using PowerShell to import a tab-separated file with headers. The generated file has a few empty strings "" at the end of the first (header) line. PowerShell fails with an error:
"Cannot process argument because the
value of argument "name" is invalid.
Change the value of the "name"
argument and run the operation again"
because the headers require a name.
I'm wondering if anyone has any ideas on how to manipulate the file to either remove the double quotes or enumerate them with a "1" "2" "3" ... "10" etc.
Ideally I would like to not modify my original file. I was thinking something like this
$fileContents = Get-Content -Path $tsvFileName
$firstLine = $fileContents[0].ToString().Replace("`t`"`"","")
$fileContents[0] = $firstLine
Import-Csv $fileContents -Delimiter "`t"
But Import-Csv is expecting $fileContents to be a path. Can I get it to use Content as a source?
You can either provide your own headers and ignore the first line of the csv, or you can use convertfrom-csv on the end like Keith says.
ps> import-csv -Header a,b,c,d,e foo.csv
Now the invalid header line in the file is just a row that you can skip.
-Oisin
If you want to work with strings instead use ConvertFrom-Csv e.g.:
PS> 'FName,LName','John,Doe','Jane,Doe' | ConvertFrom-Csv | Format-Table -Auto
FName LName
----- -----
John Doe
Jane Doe
I ended up needing to handle multiple instances of this issue. Rather than use -Header and manually set up each import instance, I wrote a more generic method to apply to all of them. I strip all of the `t"" instances from the first line, save the result as $filename + _bak, and import that file.
$fileContents = Get-Content -Path $tsvFileName
if( ([string]$fileContents[0]).ToString().Contains('""') )
{
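# Double quotes are needed so that `t expands to a tab; the inner quotes are escaped with backticks.
[string]$fixedFirstLine = $fileContents[0].ToString().Replace("`t`"`"","")
$fileContents[0] = $fixedFirstLine
$tsvFileName = [string]::Format("{0}_bak",$tsvFileName)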
$fileContents | Out-File -FilePath $tsvFileName
}
Import-Csv $tsvFileName -Delimiter "`t"
My solution if you have many columns:
$index=0
$ColumnsName=(Get-Content "C:\temp\yourCSCFile.csv" | select -First 1) -split ";" | %{
$index++
"Header_{0:d5}" -f $index
}
import-csv "C:\temp\yourCSCFile.csv" -Header $ColumnsName -Delimiter ";"
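For the original goal of not touching the source file, one possible variant combines generated header names with ConvertFrom-Csv, which accepts content directly instead of a path. The sketch below works on two made-up tab-separated lines held in memory; the Header_NN naming and all variable names are assumptions.

```powershell
# Hypothetical file content: the header line ends with two empty quoted names.
$lines = "Name`tAge`t`"`"`t`"`"", "Ann`t30`t`t"

# Generate one placeholder name per column found on the first line.
$count = ($lines[0] -split "`t").Count
$headers = 1..$count | ForEach-Object { 'Header_{0:d2}' -f $_ }

# Skip the original header line and parse the rest with the generated names.
$rows = @($lines | Select-Object -Skip 1 |
    ConvertFrom-Csv -Delimiter "`t" -Header $headers)

$headers -join ', '
$rows[0].Header_01
```

Because the first line is skipped, the unusable header never reaches the parser and no temporary copy of the file is needed.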