I am using ConvertTo-Csv to get comma-separated output:
get-process | convertto-csv -NoTypeInformation -Delimiter ","
It outputs like:
"__NounName","Name","Handles","VM","WS",".....
However I would like to get output without quotes, like
__NounName,Name,Handles,VM,WS....
Here is a way to remove the quotes
get-process | convertto-csv -NoTypeInformation -Delimiter "," | % {$_ -replace '"',''}
But it has a serious drawback: if one of the items contains a ", that quote will be removed too!
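To make the drawback concrete, here is a minimal sketch. The object and its property names (Id, Size) are made up for illustration; only ConvertTo-Csv and -replace come from the question.

```powershell
# Hypothetical sample object whose value contains a literal double quote.
$row = [pscustomobject]@{ Id = 1; Size = '6" pipe' }

# ConvertTo-Csv escapes the embedded quote by doubling it.
$csv = $row | ConvertTo-Csv -NoTypeInformation
$csv[1]        # "1","6"" pipe"

# Blindly stripping every quote also destroys the one that was real data.
$stripped = $csv | ForEach-Object { $_ -replace '"', '' }
$stripped[1]   # 1,6 pipe
```

The inch mark in the Size value is lost, which is exactly the data corruption described above.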
Hmm, I have Powershell 7 preview 1 on my mac, and Export-Csv has a -UseQuotes option that you can set to AsNeeded. :)
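As a rough sketch of that option, assuming PowerShell 7.0 or later (Windows PowerShell 5.1 does not have -UseQuotes), the same parameter is also available on ConvertTo-Csv. The sample rows below are invented for illustration.

```powershell
# Requires PowerShell 7.0 or later; the sample data is hypothetical.
$rows = @(
    [pscustomobject]@{ Id = 1; Text = 'plain' }
    [pscustomobject]@{ Id = 2; Text = 'has, comma' }
)

# AsNeeded quotes only the fields that need it (here, the one holding a comma).
$csv = $rows | ConvertTo-Csv -UseQuotes AsNeeded
$csv
```

With AsNeeded, the header and the plain values come out unquoted while the value containing the delimiter stays quoted, so the result can still be read back with Import-Csv or ConvertFrom-Csv.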
I was working on a table today and thought about this very question as I was previewing the CSV file in notepad and decided to see what others had come up with. It seems many have over-complicated the solution.
Here's a real simple way to remove the quote marks from a CSV file generated by the Export-Csv cmdlet in PowerShell.
Create a TEST.csv file with the following data.
"ID","Name","State"
"5","Stephanie","Arizona"
"4","Melanie","Oregon"
"2","Katie","Texas"
"8","Steve","Idaho"
"9","Dolly","Tennessee"
Save As: TEST.csv
Store file contents in a $Test variable
$Test = Get-Content .\TEST.csv
Load $Test variable to see results of the get-content cmdlet
$Test
Load $Test variable again and replace all ( "," ) with a comma, then trim start and end by removing each quote mark
$Test.Replace('","',",").TrimStart('"').TrimEnd('"')
Save/Replace TEST.csv file
$Test.Replace('","',",").TrimStart('"').TrimEnd('"') | Out-File .\TEST.csv -Force -Confirm:$false
Test new file Output with Import-Csv and Get-Content:
Import-Csv .\TEST.csv
Get-Content .\TEST.csv
To sum it all up, the work can be done with two lines of code:
$Test = Get-Content .\TEST.csv
$Test.Replace('","',",").TrimStart('"').TrimEnd('"') | Out-File .\TEST.csv -Force -Confirm:$false
I ran into this issue, found this question, but was not satisfied with the answers because they all seem to suffer if the data you are using contains a delimiter, which should remain quoted. Getting rid of the unneeded double quotes is a good thing.
The solution below appears to solve this issue for a general case, and for all variants that would cause issues.
I found this answer elsewhere, Removing quotes from CSV created by PowerShell, and have used it to code up an example answer for the SO community.
Attribution: Credit for the regex, goes 100% to Russ Loski.
Code in a Function, Remove-DoubleQuotesFromCsv
function Remove-DoubleQuotesFromCsv
{
param (
[Parameter(Mandatory=$true)]
[string]
$InputFile,
[string]
$OutputFile
)
if (-not $OutputFile)
{
$OutputFile = $InputFile
}
$inputCsv = Import-Csv $InputFile
$quotedData = $inputCsv | ConvertTo-Csv -NoTypeInformation
$outputCsv = $quotedData | % {$_ -replace `
'\G(?<start>^|,)(("(?<output>[^,"]*?)"(?=,|$))|(?<output>".*?(?<!")("")*?"(?=,|$)))' `
,'${start}${output}'}
$outputCsv | Out-File $OutputFile -Encoding utf8 -Force
}
Test Code
$csvData = @"
id,string,notes,number
1,hello world.,classic,123
2,"a comma, is in here","test data 1",345
3,",a comma, is in here","test data 2",346
4,"a comma, is in here,","test data 3",347
5,"a comma, is in here,","test data 4`r`nwith a newline",347
6,hello world2.,classic,123
"@
$data = $csvData | ConvertFrom-Csv
"`r`n---- data ---"
$data
$quotedData = $data | ConvertTo-Csv -NoTypeInformation
"`r`n---- quotedData ---"
$quotedData
# this regular expression comes from:
# http://www.sqlmovers.com/removing-quotes-from-csv-created-by-powershell/
$fixedData = $quotedData | % {$_ -replace `
'\G(?<start>^|,)(("(?<output>[^,"\n]*?)"(?=,|$))|(?<output>".*?(?<!")("")*?"(?=,|$)))' `
,'${start}${output}'}
"`r`n---- fixedData ---"
$fixedData
$fixedData | Out-File e:\test.csv -Encoding ascii -Force
"`r`n---- e:\test.csv ---"
Get-Content e:\test.csv
Test Output
---- data ---
id string               notes          number
-- ------               -----          ------
1  hello world.         classic        123
2  a comma, is in here  test data 1    345
3  ,a comma, is in here test data 2    346
4  a comma, is in here, test data 3    347
5  a comma, is in here, test data 4... 347
6  hello world2.        classic        123
---- quotedData ---
"id","string","notes","number"
"1","hello world.","classic","123"
"2","a comma, is in here","test data 1","345"
"3",",a comma, is in here","test data 2","346"
"4","a comma, is in here,","test data 3","347"
"5","a comma, is in here,","test data 4
with a newline","347"
"6","hello world2.","classic","123"
---- fixedData ---
id,string,notes,number
1,hello world.,classic,123
2,"a comma, is in here",test data 1,345
3,",a comma, is in here",test data 2,346
4,"a comma, is in here,",test data 3,347
5,"a comma, is in here,","test data 4
with a newline","347"
6,hello world2.,classic,123
---- e:\test.csv ---
id,string,notes,number
1,hello world.,classic,123
2,"a comma, is in here",test data 1,345
3,",a comma, is in here",test data 2,346
4,"a comma, is in here,",test data 3,347
5,"a comma, is in here,","test data 4
with a newline","347"
6,hello world2.,classic,123
This is pretty similar to the accepted answer but it helps to prevent unwanted removal of "real" quotes.
$delimiter = ','
Get-Process | ConvertTo-Csv -Delimiter $delimiter -NoTypeInformation | foreach { $_ -replace '^"','' -replace "`"$delimiter`"",$delimiter -replace '"$','' }
This will do the following:
Remove quotes that begin a line
Remove quotes that end a line
Replace quotes that wrap a delimiter with the delimiter alone.
Therefore, the only way this would go wrong is if one of the values actually contained not only quotes, but specifically a quote-delimiter-quote sequence, which hopefully should be pretty uncommon.
Once the file is generated, you can run
set-content FILENAME.csv ((get-content FILENAME.csv) -replace '"')
Depending on how pathological (or "full-featured") your CSV data is, one of the posted solutions will already work.
The solution posted by Kory Gill is almost perfect - the only issue remaining is that quotes are removed also for cells containing the line separator \r\n, which is causing issues in many tools.
The solution is adding a newline to the character class expression:
$fixedData = $quotedData | % {$_ -replace `
'\G(?<start>^|,)(("(?<output>[^,"\n]*?)"(?=,|$))|(?<output>".*?(?<!")("")*?"(?=,|$)))' `
,'${start}${output}'}
I wrote this for my needs:
function ConvertTo-Delimited {
[CmdletBinding()]
param(
[Parameter(ValueFromPipeline=$true,Mandatory=$true)]
[psobject[]]$InputObject,
[string]$Delimiter='|',
[switch]$ExcludeHeader
)
Begin {
if ( $ExcludeHeader -eq $false ) {
@(
$InputObject[0].PsObject.Properties | `
Select-Object -ExpandProperty Name
) -Join $Delimiter
}
}
Process {
foreach ($item in $InputObject) {
@(
$item.PsObject.Properties | `
Select-Object Value | `
ForEach-Object {
if ( $null -ne $_.Value ) {$_.Value.ToString()}
else {''}
}
) -Join $Delimiter
}
}
End {}
}
Usage:
$Data = @(
[PSCustomObject]@{
A = $null
B = Get-Date
C = $null
}
[PSCustomObject]@{
A = 1
B = Get-Date
C = 'Lorem'
}
[PSCustomObject]@{
A = 2
B = Get-Date
C = 'Ipsum'
}
[PSCustomObject]@{
A = 3
B = $null
C = 'Lorem Ipsum'
}
)
# with headers
PS> ConvertTo-Delimited $Data
A|B|C
1|7/17/19 9:07:23 PM|Lorem
2|7/17/19 9:07:23 PM|Ipsum
||
# without headers
PS> ConvertTo-Delimited $Data -ExcludeHeader
1|7/17/19 9:08:19 PM|Lorem
2|7/17/19 9:08:19 PM|Ipsum
||
Here's another approach:
Get-Process | ConvertTo-Csv -NoTypeInformation -Delimiter "," |
foreach { $_ -replace '^"|"$|"(?=,)|(?<=,)"','' }
This replaces matches with the empty string, in each line. Breaking down the regex above:
| is like an OR, used to unite the following 4 sub-regexes
^" matches quotes in the beginning of the line
"$ matches quotes in the end of the line
"(?=,) matches quotes that are immediately followed by a comma
(?<=,)" matches quotes that are immediately preceded by a comma
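As a quick check of the breakdown above, here is a small sketch that applies the same pattern to a single hand-written line instead of Get-Process output. The sample line is an assumption chosen for illustration.

```powershell
# Hypothetical CSV line in the style that ConvertTo-Csv produces.
$line = '"ID","First Name","State"'

# Same four alternatives as above: leading quote, trailing quote,
# quote before a comma, quote after a comma.
$unquoted = $line -replace '^"|"$|"(?=,)|(?<=,)"', ''
$unquoted   # ID,First Name,State
```

Note that the pattern only looks at a quote's neighbours, so a quote that sits next to a comma inside a value would be removed as well.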
I found that Kory's answer didn't work for the case where the original string included more than one blank field in a row. I.e. "ABC",,"0" was fine but "ABC",,,"0" wasn't handled properly. It stopped replacing quotes after the ",,,". I fixed it by adding "|(?<output>)" near the end of the first parameter, like this:
% {$_ -replace `
'\G(?<start>^|,)(("(?<output>[^,"]*?)"(?=,|$))|(?<output>".*?(?<!")("")*?"(?=,|$))|(?<output>))', `
'${start}${output}'}
I haven't spent much time looking into removing the quotes, but here is a workaround.
get-process | Export-Csv -NoTypeInformation -Verbose -Path $env:temp\test.csv
$csv = Import-Csv -Path $env:temp\test.csv
This is a quick workaround and there may be a better way to do this.
A slightly modified variant of JPBlanc's answer:
I had an existing csv file which looked like this:
001,002,003
004,005,006
I wanted to export only the first and third column to a new csv file. And for sure I didn't want any quotes ;-)
It can be done like this:
Import-Csv -Path .\source.csv -Delimiter ',' -Header A,B,C | select A,C | ConvertTo-Csv -NoTypeInformation -Delimiter ',' | % {$_ -replace '"',''} | Out-File -Encoding utf8 .\target.csv
Couldn't find an answer to a similar question so I'm posting what I've found here...
For exporting as Pipe Delimited with No Quotes for string qualifiers, use the following:
$objtable | convertto-csv -Delimiter "|" -notypeinformation | select -Skip $headers | % { $_ -replace '"\|"', "|"} | % { $_ -replace '""', '"'} | % { $_ -replace "^`"",''} | % { $_ -replace "`"$",''} | out-file "$OutputPath$filename" -fo -en ascii
This was the only thing I could come up with that could handle quotes and commas within the text; especially things like a quote and comma next to each other at the beginning or ending of a text field.
This function takes a powershell csv object from the pipeline and outputs like convertto-csv but without adding quotes (unless needed).
function convertto-unquotedcsv {
param([Parameter(ValueFromPipeline=$true)]$csv, $delimiter=',', [switch]$noheader=$false)
begin {
$NeedQuotesRex = "($([regex]::escape($delimiter))|[\n\r\t])"
if ($noheader) { $names = @($true) } else { $names = @($false) }
}
process {
$psop = $_.psobject.properties
if (-not $names) {
$names = $psop.name | % {if ($_ -match $NeedQuotesRex) {'"' + $_ + '"'} else {$_}}
$names -join $delimiter # unquoted csv header
}
$values = $psop.value | % {if ($_ -match $NeedQuotesRex) {'"' + $_ + '"'} else {$_}}
$values -join $delimiter # unquoted csv line
}
end {
}
}
$names gets an array of NoteProperty names and $values gets an array of NoteProperty values. The header needs a special step: it is output only once, when the first object arrives. The process block receives the CSV objects one at a time.
Here is a test run
$delimiter = ','; $csvData = @"
id,string,notes,"points per 1,000",number
4,"a delimiter$delimiter is in here,","test data 3",1,348
5,"a comma, is in here,","test data 4`r`nwith a newline",0.5,347
6,hello world2.,classic,"3,000",123
"@
$csvdata | convertfrom-csv | sort number | convertto-unquotedcsv -delimiter $delimiter
id,string,notes,"points per 1,000",number
6,hello world2.,classic,"3,000",123
5,"a comma, is in here,","test data 4
with a newline",0.5,347
4,"a delimiter, is in here,",test data 3,1,348
I'm working on a script that combines parts of two text files. These files are not too large (about 2000 lines each).
I'm seeing strange output from select-string that I don't think should be there.
Here are samples of my two files:
CC.csv - 2026 lines
LS126L47L6/1L2#519,07448,1,B
LS126L47L6/1R1-1#503,07449,1,B
LS126L47L6/1L3#536,07450,1,B
LS126L47L6/2R1#515,07451,1,B
LS126L47L6/10#525,07452,1,B
LS126L47L6/1L4#538,07453,1,B
GI.txt - 1995 lines
07445,B,SH,1
07446,B,SH,1
07448,B,SH,1
07449,B,SH,1
07450,B,SH,1
07451,B,SH,1
07452,B,SH,1
07453,B,SH,1
07454,B,SH,1
And here's a sample of the output file:
output in myfile.csv
LS126L47L6/3R1#516,07446,1,B
LS126L47L6/1L2#519,07448,1,B
LS126L47L6/1R1-1#503,07449,1,B
System.Object[],B
LS126L47L6/2R1#515,07451,1,B
This is the script I'm using:
sc ./myfile.csv "col1,col2,col3,col4"
$mn = gc cc.csv | select -skip 1 | % {$_.tostring().split(",")[1]}
$mn | % {
$a = (gc cc.csv | sls $_ ).tostring() -replace ",[a-z]$", ""
if (gc GI.txt | sls $_ | select -first 1)
{$b = (gc GI.txt | sls $_ | select -first 1).tostring().split(",")[1]}
else {$b = "NULL"
write-host "$_ is not present in GI file"}
$c = $a + ',' + $b
ac ./myfile.csv -value $c
}
The $a variable is where I am sometimes seeing the returned string as System.Object[]
Any ideas why? Also, this script takes quite some time to finish. Any tips for a newb on how to speed it up?
Edit: I should add that I've taken one line from the cc.csv file, saved in a new text file, and run through the script in console up through assigning $a. I can't get it to return "system.object[]".
Edit 2: After following the advice below and trying a couple of things, I've noticed that if I run
$mn | %{(gc cc.csv | sls $_).tostring()}
I get System.Object[].
But if I run
$mn | %{(gc cc.csv | sls $_)} | %{$_.tostring()}
It comes out fine. Go figure.
The problem is caused by a change in multiplicity of matches. If there are multiple matching elements an Object[] array (of MatchInfo elements) is returned; a single matching element results in a single MatchInfo object (not in an array); and when there are no matches, null is returned.
Consider these results, when executed against the "cc.csv" test-data supplied:
# matches many
(gc cc.csv | Select-String "LS" ).GetType().Name # => Object[]
# matches one
(gc cc.csv | Select-String "538").GetType().Name # => MatchInfo
# matches none
(gc cc.csv | Select-String "FAIL") # => null
The result of calling ToString on Object[] is "System.Object[]" while the result is a more useful concatenation of the matched values when invoked directly upon a MatchInfo object.
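The immediate problem can be fixed by appending | Select -First 1 to the pipeline, which results in a MatchInfo being returned for the first two cases. In PowerShell 2.0, Select-String still searches the entire input and the extra results are simply discarded; from PowerShell 3.0 on, Select-Object -First stops the upstream pipeline once it has its result.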
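A minimal sketch of that fix, using three rows copied from the question as in-memory strings rather than reading cc.csv (the variable names are illustrative):

```powershell
# Three sample rows from the question, held in memory for the demo.
$lines = @(
    'LS126L47L6/1L2#519,07448,1,B'
    'LS126L47L6/1L3#536,07450,1,B'
    'LS126L47L6/1L4#538,07453,1,B'
)

# Several matches: the result is an array, so ToString() is useless.
$many = $lines | Select-String 'LS126'
$many.GetType().Name    # Object[]
$many.ToString()        # System.Object[]

# Keeping only the first match always yields a single MatchInfo.
$first = $lines | Select-String 'LS126' | Select-Object -First 1
$first.GetType().Name   # MatchInfo
$first.Line             # LS126L47L6/1L2#519,07448,1,B
```

Reading the Line property instead of calling ToString() is a small design choice here: it returns the matched text without any file name or line number prefix.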
However, it seems like the look-back into "cc.csv" (with the Select-String) could be eliminated entirely as that is where $_ originally comes from. Here is a minor [untested] adaptation, of what it may look like:
gc cc.csv | Select -Skip 1 | %{
$num = $_.Split(",")[1]
$a = $_ -Replace ",[a-z]$", ""
# This is still O(m*n) and could be improved with a hash/set probe.
$gc_match = Select-String $num -Path GI.txt -SimpleMatch | Select -First 1
if ($gc_match) {
# Use of "Select -First 1" avoids the initial problem; but
# it /may/ be more appropriate for an error to indicate data problems.
# (Likewise, an error in the original may need further investigation.)
$b = $gc_match.ToString().Split(",")[1]
} else {
$b = "NULL"
Write-Host "$_ is not present in GI file"
}
$c = $a + ',' + $b
ac ./myfile.csv -Value $c
}
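The hash probe mentioned in the code comment could look roughly like the sketch below. It is untested against the real files and makes two assumptions: the number is matched exactly against the first field of each GI line (the original used a substring match via Select-String), and the sample rows stand in for Get-Content cc.csv (without its header) and Get-Content GI.txt. The row ending in 09999 is hypothetical and only there to show the NULL case.

```powershell
# Stand-ins for the two files; the last CC row is a made-up non-match.
$ccLines = @(
    'LS126L47L6/1L2#519,07448,1,B'
    'LS126L47L6/10#525,07452,1,B'
    'LS126L47L6/9X#999,09999,1,B'
)
$giLines = @(
    '07448,B,SH,1'
    '07452,B,SH,1'
)

# One pass over GI: map the number (field 0) to its second field.
$giByNumber = @{}
foreach ($line in $giLines) {
    $fields = $line.Split(',')
    if (-not $giByNumber.ContainsKey($fields[0])) {
        $giByNumber[$fields[0]] = $fields[1]   # keep the first occurrence
    }
}

# One pass over CC: each lookup is a hash probe, not a file scan.
$result = foreach ($line in $ccLines) {
    $num = $line.Split(',')[1]
    $a = $line -replace ',[a-z]$', ''
    $b = if ($giByNumber.ContainsKey($num)) { $giByNumber[$num] } else { 'NULL' }
    "$a,$b"
}
$result
```

Collecting the lines in $result and writing them once (for example with Set-Content) would also avoid reopening the output file for every row, which is part of why the original script is slow.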
I have a series of documents that go through the following function, which is designed to count word occurrences in each document. The function works fine when outputting to the console, but now I want to generate a text file containing the information, with the file name prepended to each word in the list.
My current console output is:
"processing document1 with x unique words occuring as follows"
"word1 12"
"word2 8"
"word3 3"
"word4 4"
"word5 1"
I want a delimited file in this format:
document1;word1;12
document1;word2;8
document1;word3;3
document1;word4;4
document1;word1;1
document2;word1;16
document2;word2;11
document2;word3;9
document2;word4;9
document2;word1;13
While the function below gets me the lists of words and occurrences, I'm having a hard time figuring out where or how to insert the filename variable so that it prints at the head of each line. MSDN has been less than helpful, and most of the places I try to insert the variable result in errors (see below).
function Count-Words ($docs) {
$document = get-content $docs
$document = [string]::join(" ", $document)
$words = $document.split(" `t",[stringsplitoptions]::RemoveEmptyEntries)
$uniq = $words | sort -uniq
$words | % {$wordhash=@{}} {$wordhash[$_] += 1}
Write-Host $docs "contains" $wordhash.psbase.keys.count "unique words distributed as follows."
$frequency = $wordhash.psbase.keys | sort {$wordhash[$_]}
-1..-25 | %{ $frequency[$_]+" "+$wordhash[$frequency[$_]]} | Out-File c:\out-file-test.txt -append
$grouped = $words | group | sort count
}
Do I need to create a string to pass to the Out-File cmdlet? Is this just something I've been putting in the wrong place on the last few tries? I'd like to understand WHY it goes in a particular place as well. Right now I'm just guessing, because I have no idea where to put the Out-File to achieve my desired results.
I've tried formatting my command per powershell help, using -$docs and -FilePath, but each time I add anything to the out-file above that runs successfully, I get the following error:
Out-File : Cannot validate argument on parameter 'Encoding'. The argument "c:\out-file-test.txt" does not belong to the set "unicode,utf7,utf8,utf32,ascii,bigendianunicode,default,oem" specified by the ValidateSet attribute. Supply an argument that is in the set and then try the command again.
At C:\c.ps1:39 char:71
+ -1..-25 | %{ $frequency[$_]+" "+$wordhash[$frequency[$_]]} | Out-File <<<< -$docs -width 1024 c:\users\x46332\count-test.txt -append
+ CategoryInfo : InvalidData: (:) [Out-File], ParameterBindingValidationException
+ FullyQualifiedErrorId : ParameterArgumentValidationError,Microsoft.PowerShell.Commands.OutFileCommand
I rewrote most of your code. You should use objects to make it easier to format the output the way you want. This version splits on spaces and groups identical words together. Try this:
Function Count-Words ($paths) {
$output = @()
foreach ($path in $paths) {
$file = Get-ChildItem $path
((Get-Content $file) -join " ").Split(" ", [System.StringSplitOptions]::RemoveEmptyEntries) | Group-Object | Select-Object -Property @{n="FileName";e={$file.BaseName}}, Name, Count | % {
$output += "$($_.FileName);$($_.Name);$($_.Count)"
}
}
$output | Out-File test-out2.txt -Append
}
$filepaths = ".\test.txt", ".\test2.txt"
Count-Words -paths $filepaths
It outputs in the format you asked for (document;word;count). If you want the document name to include the extension, change $file.BaseName to $file.Name. Test output:
test;11;1
test;9;2
test;13;1
test2;word11;5
test2;word1;4
test2;12;1
test2;word2;2
Slightly different approach:
function Get-WordCounts ($doc)
{
$text_ = [IO.File]::ReadAllText($doc.fullname)
$WordHash = @{}
$text_ -split '\b' -match '\w+'|
foreach {$WordHash[$_]++}
$WordHash.GetEnumerator() |
foreach {
New-Object PSObject -Property @{
Word = $_.Key
Count = $_.Value
}
}
}
$docs = gci c:\testfiles\*.txt |
sort name
&{
foreach ($doc in dir $docs)
{
Get-WordCounts $doc |
sort Count -Descending |
foreach {
(&{$doc.Name;$_.Word;$_.Count}) -join ';'
}
}
} | out-file c:\somedir\wordcounts.txt
Try this:
$docs = @("document1", "document2", ...)
$docs | % {
$doc = $_
Get-Content $doc `
| % { $_.split(" `t",[stringsplitoptions]::RemoveEmptyEntries) } `
| Group-Object `
| select @{n="Document";e={$doc}}, Name, Count
} | Export-CSV output.csv -Delimiter ";" -NoTypeInfo
If you want to make this into a function you could do it like this:
function Count-Words($docs) {
foreach ($doc in $docs) {
Get-Content $doc `
| % { $_.split(" `t",[stringsplitoptions]::RemoveEmptyEntries) } `
| Group-Object `
| select @{n="Document";e={$doc}}, Name, Count
}
}
$files = @("document1", "document2", ...)
Count-Words $files | Export-CSV output.csv -Delimiter ";" -NoTypeInfo