How to replace comma with pipe? - powershell

I need to replace the comma delimiter with a pipe using PowerShell. The fields are enclosed in double quotes, so I need to remove those as well. However, some fields contain commas in the data, so when I replace the delimiter, the commas inside fields must not become pipes; they should turn into white space instead.
The data looks like this:
"Whaling, Mark","faflitto@cgf.com"
I want the data to look like this:
Whaling Mark |faflitto@cgf.com
How can I achieve this using PowerShell? Any help is appreciated.
This is my script so far:
(Get-Content -ReadCount 0 Compliance2022.txt) -replace ',','|' -replace '"',' ' | Set-Content COMPLIANCE2022_.txt

I did NOT test this whole line, just the code between the Get-Content and Set-Content commands. The regex came from here.
(Get-Content -ReadCount 0 Compliance2022.txt) | ForEach-Object {($_ -Split '(?!\B"[^"]*),(?![^"]*"\B)') -Join(' | ')} | Set-Content COMPLIANCE2022_.txt
Odds are it misses exactly what you want, but maybe we can fine-tune it to get it to work.
EDIT:
This version removes the quotes.
(Get-Content -ReadCount 0 Compliance2022.txt) | ForEach-Object {(($_ -Split '(?!\B"[^"]*),(?![^"]*"\B)') -Join(' | ')) -replace '"', " "} | Set-Content COMPLIANCE2022_.txt
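As an alternative to the split regex above, here is a minimal sketch that lets PowerShell's own CSV parser deal with the quoted fields. It assumes the file has no header row and exactly two columns; the sample lines and the column names Name and Email are made up for illustration.

```powershell
# Hypothetical sample lines in the same shape as the question's data.
$lines = @(
    '"Doe, Jane","jane@example.com"'
    '"Roe, Richard","richard@example.com"'
)

$result = $lines |
    ConvertFrom-Csv -Header Name, Email |   # assumed column names; the parser strips the quotes
    ForEach-Object {
        # Remove commas inside each field value, then join the fields with a pipe.
        ($_.PSObject.Properties.Value -replace ',', '') -join '|'
    }

$result
```

For a real file, the same pipeline should work with Import-Csv -Header in place of ConvertFrom-Csv, followed by Set-Content to write the result.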

Related

How to add quotes and commas in PowerShell?

I have a CSV file from which I only need one column, called "SerialNumber". I need to combine the text lines, remove any blank entries, enclose each value in quotes, and separate the values with commas.
So far I have multiple lines of code that work, but they add quotes at the end of each value and not at the beginning.
$SerialList = import-csv .\log.csv | select -ExpandProperty Serialnumber | Out-File -FilePath .\Process.txt
(gc process.txt) | ? {$_.trim() -ne "" } | set-content process.txt
gc .\process.txt | %{$_ -replace '$','","'} | out-file process1.csv
Get-Content .\process1.csv| foreach {
$out = $out + $_
}
$out| Out-File .\file2.txt
Output:
SerialNumber
1234
1234
4567
4567
Expected Output:
"1234","1234","4567","4567"
Try the following (PSv3+):
(Import-Csv .\log.csv).SerialNumber -replace '^.*$', '"$&"' -join "," > .\file2.txt
(Import-Csv .\log.csv).SerialNumber imports the CSV file and .SerialNumber uses member-access enumeration to extract the SerialNumber column values as an array.
-replace '^.*$', '"$&"' encloses each array element in "...".
Regex ^.*$ matches each array element in full.
Replacement expression "$&" replaces the element with what was matched ($&) enclosed in " chars. - for background, see this answer
-join "," joins the resulting array elements with , as the separator.
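To see the -replace / -join steps in isolation, here is a small sketch that runs them on an in-memory array instead of the CSV file. The $raw values are made up, and the trimming and blank-entry filter are an addition of mine for the "remove any blank space" part of the question.

```powershell
# Hypothetical column values, including stray whitespace and a blank entry.
$raw = '1234', ' 1234 ', '', '4567', '4567'

# Trim each value and drop the ones that end up empty.
$clean = $raw | ForEach-Object { $_.Trim() } | Where-Object { $_ -ne '' }

# Wrap every element in double quotes ($& is the whole match), then join with commas.
$line = $clean -replace '^.*$', '"$&"' -join ','

$line   # "1234","1234","4567","4567"
```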

Powershell Remove spaces in the header only of a csv

The first line of the CSV looks like this; there are trailing spaces after Path as well:
author ,Revision ,Date ,SVNFolder ,Rev,Status,Path
I am trying to remove only the spaces; the rest of the content should stay the same:
author,Revision,Date,SVNFolder,Rev,Status,Path
I tried the following:
Import-CSV .\script.csv | ForEach-Object {$_.Trimend()}
Expanding on the comment with an example, since it looks like you may be new:
$text = get-content .\script.csv
$text[0] = $text[0] -replace " ", ""
$csv = $text | ConvertFrom-CSV
Note: The solutions below avoid loading the entire CSV file into memory.
First, get the header row and fix it by removing all whitespace from it:
$header = (Get-Content -TotalCount 1 .\script.csv) -replace '\s+'
If you want to rewrite the CSV file to fix its header problem:
# Write the corrected header and the remaining lines to the output file.
# Note: I'm outputting to a *new* file, to be safe.
# If the file fits into memory as a whole, you can enclose
# Get-Content ... | Select-Object ... in (...) and write back to the
# input file, but note that there's a small risk of data loss, if
# writing back gets interrupted.
& { $header; Get-Content .\script.csv | Select-Object -Skip 1 } |
Set-content -Encoding utf8 .\fixed.csv
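Note: I've chosen -Encoding utf8 as the example output character encoding; adjust as needed. Note that in Windows PowerShell the default for Set-Content is the system's ANSI code page (not UTF-8), which can result in data loss for characters outside that code page.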
If you just want to import the CSV using the fixed headers:
& { $header; Get-Content .\script.csv | Select-Object -Skip 1 } | ConvertFrom-Csv
As for what you tried:
Import-Csv uses the column names in the header as property names of the custom objects it constructs from the input rows.
These property names are locked in at the time of reading the file, and cannot be changed later - unless you explicitly construct new custom objects from the old ones with the property names trimmed.
Import-Csv ... | ForEach-Object {$_.Trimend()}
Since Import-Csv outputs [pscustomobject] instances, reflected one by one in $_ in the ForEach-Object block, your code tries to call .TrimEnd() directly on them, which will fail (because only [string] instances have such a method).
Aside from that, as stated, your goal is to trim the property names of these objects, and that cannot be done without constructing new objects.
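For completeness, constructing new objects with trimmed property names could look roughly like the sketch below. The sample CSV text and the variable names are invented for illustration; fixing the header line before parsing, as shown earlier, is usually simpler.

```powershell
# Hypothetical CSV text whose first two header names carry trailing spaces.
$csvText = @'
author ,Revision ,Path
alice,42,/trunk
'@

$rows = $csvText | ConvertFrom-Csv

# Build a new object per row, copying each value under its trimmed property name.
$fixed = @(
    foreach ($row in $rows) {
        $props = [ordered]@{}
        foreach ($p in $row.PSObject.Properties) {
            $props[$p.Name.Trim()] = $p.Value
        }
        [pscustomobject]$props
    }
)

$fixed[0].author   # alice
```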
Read the whole file into an array:
$a = Get-Content test.txt
Replace the spaces in the first array element ([0]) with empty strings:
$a[0] = $a[0] -replace " ", ""
Write over the original file: (Don't forget backups!)
$a | Set-Content test.txt
$inFilePath = "C:\temp\headerwithspaces.csv"
$content = Get-Content $inFilePath
$csvColumnNames = ($content | Select-Object -First 1) -Replace '\s',''
$csvColumnNames = $csvColumnNames -Replace '\s',''
$remainingFile = ($content | Select-Object -Skip 1)

Parse text file with powershell, print out lines starting with 2 different strings

I have a text file containing repeating patterns of text (a STIG review document)
Sample:
Group ID (Vulid):  V-71989
Group Title:  SRG-OS-000445-GPOS-00199
.
Vulnerability Discussion:  ...
Check Content:...
<hash symbol> some command
.
I want to output the line beginning with "Group ID (Vulid)"
AND the line beginning with "#" in the order present in the file.
I have tried:
Get-Content C:\in-file.txt | (Where-Object {$_ -match 'Group ID'}) | (Where-Object {$_ -match '#'}) | Set-Content C:\out.txt
but it barfs on the "Or".
What Bill_Stewart is trying to say is it should be this:
Get-Content C:\in-file.txt |
Where-Object {$_ -match 'Group ID' -or $_ -match '#'} |
Set-Content C:\out.txt
Multi-line is just for readability; you can have it as one line in your code.
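Note that -match '#' matches a # anywhere in the line. If only lines that begin with the two strings are wanted, as the question states, a single anchored pattern is one way to do it. A small sketch on made-up sample lines (the third line is invented to show the difference):

```powershell
# Hypothetical sample lines modeled on the question's STIG excerpt.
$lines = @(
    'Group ID (Vulid):  V-71989'
    'Group Title:  SRG-OS-000445-GPOS-00199'
    'Check Content: see item # 3'
    '# some command'
)

# ^ anchors the match to the start of the line; the parentheses in the label are escaped.
$matched = $lines | Where-Object { $_ -match '^(Group ID \(Vulid\)|#)' }

$matched
```

With real input, replace $lines with Get-Content C:\in-file.txt and pipe the result to Set-Content as above.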

Removing blank space in columns in pipe delimited file in powershell

I have an issue where I need to remove blank spaces in columns 3 and 4 which may or may not exist in a pipe delimited text file using powershell.
The input file looks like this :
COLMABQ1|02112017|001000 08248|BQ|Name|
COLMABP1|02112017|00100009693|B P|Name|
COLAL3|02112017|001000 12032|C D|Name|
COLMAAO|02112017|00100014915|AO|Name|
COLAL1H|02112017|00100 017939|C D|Name|
I need the output file to look like this :
COLMABQ1|02112017|00100008248|BQ|Name|
COLMABP1|02112017|00100009693|BP|Name|
COLAL3|02112017|00100012032|CD|Name|
COLMAAO|02112017|00100014915|AO|Name|
COLAL1H|02112017|00100017939|CD|Name|
The nearest I have come to solving it so far is converting the file to a .csv file, replacing every | with a ",", running the code below against columns 3 & 4 then changing all the "," back to |
$headers = 1..5|%{"H{0}" -f $_}
$Csv = Import-Csv $infile -Header $Headers
$Csv|ft -auto
ForEach ($Row in $Csv) {
$Row.H3 = $Row.H3 -Replace ' '
}
$CSV | ConvertTo-CSV -NoType | Select -Skip 1 | Set-Content $outfile
Even this doesn't work exactly as I'd like and I'm convinced there must be a far easier way to do this...but 2 days' worth of Googling seems to suggest otherwise!
Any help anyone can give with this would be gratefully received as it's driving me insane.
One possibility:
Get-Content $infile |
ForEach-Object {
$parts = $_.split("|")
$parts[2] = $parts[2].replace(" ","")
$parts[3] = $parts[3].replace(" ","")
$parts -join "|"
} | Add-Content $Outfile
(Get-Content 'C:\Vincent imp\Test\Test.txt') -replace '(^.*\|.*\|.*) (.*\|.*\|.*\|)','$1$2' -replace '(^.*\|.*\|.*\|.*) (.*\|.*\|)','$1$2'
COLMABQ1|02112017|00100008248|BQ|Name|
COLMABP1|02112017|00100009693|BP|Name|
COLAL3|02112017|00100012032|CD|Name|
COLMAAO|02112017|00100014915|AO|Name|
COLAL1H|02112017|00100017939|CD|Name|
If only columns 3 & 4 contain blank spaces, you can simply remove all whitespace from every line:
(Get-Content $infile) -replace '\s+' | Set-Content $infile

How to avoid double quote when using export-csv in Powershell [duplicate]

I am using ConvertTo-Csv to get comma separated output
get-process | convertto-csv -NoTypeInformation -Delimiter ","
It outputs like:
"__NounName","Name","Handles","VM","WS",".....
However I would like to get output without quotes, like
__NounName,Name,Handles,VM,WS....
Here is a way to remove the quotes
get-process | convertto-csv -NoTypeInformation -Delimiter "," | % {$_ -replace '"',''}
But it has a serious drawback: if one of the items contains a " character, that will be removed as well!
Hmm, I have Powershell 7 preview 1 on my mac, and Export-Csv has a -UseQuotes option that you can set to AsNeeded. :)
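A minimal sketch of that option, assuming PowerShell 7 or later (where, as far as I know, ConvertTo-Csv accepts the same -UseQuotes parameter as Export-Csv). The sample object is made up:

```powershell
# Requires PowerShell 7+; the object below is hypothetical sample data.
$row = [pscustomobject]@{ ID = 5; Name = 'Stephanie'; Note = 'likes a, b' }

# AsNeeded quotes only the fields that need it, e.g. those containing the delimiter.
$csv = $row | ConvertTo-Csv -UseQuotes AsNeeded

$csv
```

Only the Note value should come out quoted here, because it contains a comma.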
I was working on a table today and thought about this very question as I was previewing the CSV file in notepad and decided to see what others had come up with. It seems many have over-complicated the solution.
Here's a real simple way to remove the quote marks from a CSV file generated by the Export-Csv cmdlet in PowerShell.
Create a TEST.csv file with the following data.
"ID","Name","State"
"5","Stephanie","Arizona"
"4","Melanie","Oregon"
"2","Katie","Texas"
"8","Steve","Idaho"
"9","Dolly","Tennessee"
Save As: TEST.csv
Store file contents in a $Test variable
$Test = Get-Content .\TEST.csv
Load $Test variable to see results of the get-content cmdlet
$Test
Load $Test variable again and replace all ( "," ) with a comma, then trim start and end by removing each quote mark
$Test.Replace('","',",").TrimStart('"').TrimEnd('"')
Save/Replace TEST.csv file
$Test.Replace('","',",").TrimStart('"').TrimEnd('"') | Out-File .\TEST.csv -Force -Confirm:$false
Test new file Output with Import-Csv and Get-Content:
Import-Csv .\TEST.csv
Get-Content .\TEST.csv
To sum it all up, the work can be done with 2 lines of code:
$Test = Get-Content .\TEST.csv
$Test.Replace('","',",").TrimStart('"').TrimEnd('"') | Out-File .\TEST.csv -Force -Confirm:$false
I ran into this issue, found this question, but was not satisfied with the answers because they all seem to suffer if the data you are using contains a delimiter, which should remain quoted. Getting rid of the unneeded double quotes is a good thing.
The solution below appears to solve this issue for a general case, and for all variants that would cause issues.
I found this answer elsewhere, Removing quotes from CSV created by PowerShell, and have used it to code up an example answer for the SO community.
Attribution: Credit for the regex, goes 100% to Russ Loski.
Code in a Function, Remove-DoubleQuotesFromCsv
function Remove-DoubleQuotesFromCsv
{
    param (
        [Parameter(Mandatory=$true)]
        [string]
        $InputFile,
        [string]
        $OutputFile
    )
    if (-not $OutputFile)
    {
        $OutputFile = $InputFile
    }
    $inputCsv = Import-Csv $InputFile
    $quotedData = $inputCsv | ConvertTo-Csv -NoTypeInformation
    $outputCsv = $quotedData | % {$_ -replace `
        '\G(?<start>^|,)(("(?<output>[^,"]*?)"(?=,|$))|(?<output>".*?(?<!")("")*?"(?=,|$)))' `
        ,'${start}${output}'}
    $outputCsv | Out-File $OutputFile -Encoding utf8 -Force
}
Test Code
$csvData = @"
id,string,notes,number
1,hello world.,classic,123
2,"a comma, is in here","test data 1",345
3,",a comma, is in here","test data 2",346
4,"a comma, is in here,","test data 3",347
5,"a comma, is in here,","test data 4`r`nwith a newline",347
6,hello world2.,classic,123
"@
$data = $csvData | ConvertFrom-Csv
"`r`n---- data ---"
$data
$quotedData = $data | ConvertTo-Csv -NoTypeInformation
"`r`n---- quotedData ---"
$quotedData
# this regular expression comes from:
# http://www.sqlmovers.com/removing-quotes-from-csv-created-by-powershell/
$fixedData = $quotedData | % {$_ -replace `
'\G(?<start>^|,)(("(?<output>[^,"\n]*?)"(?=,|$))|(?<output>".*?(?<!")("")*?"(?=,|$)))' `
,'${start}${output}'}
"`r`n---- fixedData ---"
$fixedData
$fixedData | Out-File e:\test.csv -Encoding ascii -Force
"`r`n---- e:\test.csv ---"
Get-Content e:\test.csv
Test Output
---- data ---
id string               notes          number
-- ------               -----          ------
1  hello world.         classic        123
2  a comma, is in here  test data 1    345
3  ,a comma, is in here test data 2    346
4  a comma, is in here, test data 3    347
5  a comma, is in here, test data 4... 347
6  hello world2.        classic        123
---- quotedData ---
"id","string","notes","number"
"1","hello world.","classic","123"
"2","a comma, is in here","test data 1","345"
"3",",a comma, is in here","test data 2","346"
"4","a comma, is in here,","test data 3","347"
"5","a comma, is in here,","test data 4
with a newline","347"
"6","hello world2.","classic","123"
---- fixedData ---
id,string,notes,number
1,hello world.,classic,123
2,"a comma, is in here",test data 1,345
3,",a comma, is in here",test data 2,346
4,"a comma, is in here,",test data 3,347
5,"a comma, is in here,","test data 4
with a newline","347"
6,hello world2.,classic,123
---- e:\test.csv ---
id,string,notes,number
1,hello world.,classic,123
2,"a comma, is in here",test data 1,345
3,",a comma, is in here",test data 2,346
4,"a comma, is in here,",test data 3,347
5,"a comma, is in here,","test data 4
with a newline","347"
6,hello world2.,classic,123
This is pretty similar to the accepted answer but it helps to prevent unwanted removal of "real" quotes.
$delimiter = ','
Get-Process | ConvertTo-Csv -Delimiter $delimiter -NoTypeInformation | foreach { $_ -replace '^"','' -replace "`"$delimiter`"",$delimiter -replace '"$','' }
This will do the following:
Remove quotes that begin a line
Remove quotes that end a line
Replace quotes that wrap a delimiter with the delimiter alone.
Therefore, the only way this would go wrong is if one of the values actually contained not only quotes, but specifically a quote-delimiter-quote sequence, which hopefully should be pretty uncommon.
Once the file is generated, you can run
set-content FILENAME.csv ((get-content FILENAME.csv) -replace '"')
Depending on how pathological (or "full-featured") your CSV data is, one of the posted solutions will already work.
The solution posted by Kory Gill is almost perfect - the only issue remaining is that quotes are removed also for cells containing the line separator \r\n, which is causing issues in many tools.
The solution is adding a newline to the character class expression:
$fixedData = $quotedData | % {$_ -replace `
'\G(?<start>^|,)(("(?<output>[^,"\n]*?)"(?=,|$))|(?<output>".*?(?<!")("")*?"(?=,|$)))' `
,'${start}${output}'}
I wrote this for my needs:
function ConvertTo-Delimited {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline=$true,Mandatory=$true)]
        [psobject[]]$InputObject,
        [string]$Delimiter='|',
        [switch]$ExcludeHeader
    )
    Begin {
        if ( $ExcludeHeader -eq $false ) {
            @(
                $InputObject[0].PsObject.Properties | `
                Select-Object -ExpandProperty Name
            ) -Join $Delimiter
        }
    }
    Process {
        foreach ($item in $InputObject) {
            @(
                $item.PsObject.Properties | `
                Select-Object Value | `
                ForEach-Object {
                    if ( $null -ne $_.Value ) {$_.Value.ToString()}
                    else {''}
                }
            ) -Join $Delimiter
        }
    }
    End {}
}
Usage:
$Data = @(
    [PSCustomObject]@{
        A = $null
        B = Get-Date
        C = $null
    }
    [PSCustomObject]@{
        A = 1
        B = Get-Date
        C = 'Lorem'
    }
    [PSCustomObject]@{
        A = 2
        B = Get-Date
        C = 'Ipsum'
    }
    [PSCustomObject]@{
        A = 3
        B = $null
        C = 'Lorem Ipsum'
    }
)
# with headers
PS> ConvertTo-Delimited $Data
A|B|C
1|7/17/19 9:07:23 PM|Lorem
2|7/17/19 9:07:23 PM|Ipsum
||
# without headers
PS> ConvertTo-Delimited $Data -ExcludeHeader
1|7/17/19 9:08:19 PM|Lorem
2|7/17/19 9:08:19 PM|Ipsum
||
Here's another approach:
Get-Process | ConvertTo-Csv -NoTypeInformation -Delimiter "," |
foreach { $_ -replace '^"|"$|"(?=,)|(?<=,)"','' }
This replaces matches with the empty string, in each line. Breaking down the regex above:
| is like an OR, used to unite the following 4 sub-regexes
^" matches quotes in the beginning of the line
"$ matches quotes in the end of the line
"(?=,) matches quotes that are immediately followed by a comma
(?<=,)" matches quotes that are immediately preceded by a comma
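To make the breakdown concrete, here is a small sketch that applies the same regex to two made-up lines. The second line is there to illustrate a limitation: a field that contains the delimiter loses its quotes too, so its boundaries can no longer be recovered.

```powershell
# The pattern from the answer above: four alternatives joined by |.
$pattern = '^"|"$|"(?=,)|(?<=,)"'

# Hypothetical input lines in the shape ConvertTo-Csv produces.
$simple = '"5","Stephanie","Arizona"'
$tricky = '"2","a comma, is in here","345"'

$simpleOut = $simple -replace $pattern, ''
$trickyOut = $tricky -replace $pattern, ''

$simpleOut   # 5,Stephanie,Arizona
$trickyOut   # 2,a comma, is in here,345
```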
I found that Kory's answer didn't work for the case where the original string included more than one blank field in a row. I.e. "ABC",,"0" was fine but "ABC",,,"0" wasn't handled properly. It stopped replacing quotes after the ",,,". I fixed it by adding "|(?<output>)" near the end of the first parameter, like this:
% {$_ -replace `
'\G(?<start>^|,)(("(?<output>[^,"]*?)"(?=,|$))|(?<output>".*?(?<!")("")*?"(?=,|$))|(?<output>))', `
'${start}${output}'}
I haven't spent much time looking for removing the quotes. But, here is a workaround.
get-process | Export-Csv -NoTypeInformation -Verbose -Path $env:temp\test.csv
$csv = Import-Csv -Path $env:temp\test.csv
This is a quick workaround and there may be a better way to do this.
A slightly modified variant of JPBlanc's answer:
I had an existing csv file which looked like this:
001,002,003
004,005,006
I wanted to export only the first and third column to a new csv file. And for sure I didn't want any quotes ;-)
It can be done like this:
Import-Csv -Path .\source.csv -Delimiter ',' -Header A,B,C | select A,C | ConvertTo-Csv -NoTypeInformation -Delimiter ',' | % {$_ -replace '"',''} | Out-File -Encoding utf8 .\target.csv
Couldn't find an answer to a similar question so I'm posting what I've found here...
For exporting as Pipe Delimited with No Quotes for string qualifiers, use the following:
$objtable | convertto-csv -Delimiter "|" -notypeinformation | select -Skip $headers | % { $_ -replace '"\|"', "|"} | % { $_ -replace '""', '"'} | % { $_ -replace "^`"",''} | % { $_ -replace "`"$",''} | out-file "$OutputPath$filename" -fo -en ascii
This was the only thing I could come up with that could handle quotes and commas within the text; especially things like a quote and comma next to each other at the beginning or ending of a text field.
This function takes a powershell csv object from the pipeline and outputs like convertto-csv but without adding quotes (unless needed).
function convertto-unquotedcsv {
    param([Parameter(ValueFromPipeline=$true)]$csv, $delimiter=',', [switch]$noheader=$false)
    begin {
        $NeedQuotesRex = "($([regex]::escape($delimiter))|[\n\r\t])"
        if ($noheader) { $names = @($true) } else { $names = @($false) }
    }
    process {
        $psop = $_.psobject.properties
        if (-not $names) {
            $names = $psop.name | % {if ($_ -match $NeedQuotesRex) {'"' + $_ + '"'} else {$_}}
            $names -join $delimiter # unquoted csv header
        }
        $values = $psop.value | % {if ($_ -match $NeedQuotesRex) {'"' + $_ + '"'} else {$_}}
        $values -join $delimiter # unquoted csv line
    }
    end {
    }
}
$names gets an array of noteproperty names and $values gets an array of noteproperty values. Outputting the header takes that special step in the process block, which receives the CSV objects one at a time.
Here is a test run
$delimiter = ','; $csvData = @"
id,string,notes,"points per 1,000",number
4,"a delimiter$delimiter is in here,","test data 3",1,348
5,"a comma, is in here,","test data 4`r`nwith a newline",0.5,347
6,hello world2.,classic,"3,000",123
"@
$csvdata | convertfrom-csv | sort number | convertto-unquotedcsv -delimiter $delimiter
id,string,notes,"points per 1,000",number
6,hello world2.,classic,"3,000",123
5,"a comma, is in here,","test data 4
with a newline",0.5,347
4,"a delimiter, is in here,",test data 3,1,348