I am trying to create a PowerShell script that wraps quotes around each column of the file on export to CSV. However, the Export-Csv cmdlet only places quotes where they are needed, i.e. where the text contains a space or similar.
I have tried to use the following to wrap each value in quotes, but it ends up wrapping three quotes around each column.
$r.SURNAME = '"'+$r.SURNAME+'"';
Is anyone able to share how to force quotes on each column of the file? So far I can only find info on stripping them out.
Thanks
Perhaps a better approach would be to convert to CSV (not export), add the quotes with a simple regex, and then pipe the result out to a file.
Assuming you are exporting the whole object $r:
$r | ConvertTo-Csv -NoTypeInformation `
| % { $_ -replace ',(.*?),',',"$1",' } `
| Select -Skip 1 | Set-Content C:\temp\file.csv
The Select -Skip 1 removes the header. If you want to keep the header, remove that step from the pipeline.
To clarify what the regex is doing:
Match: ,(.*?),
Explanation: This matches any section of a line that starts with a comma, followed by any number of characters (.*) matched non-greedily (the ? means it matches only the minimum number of characters needed to complete the match), and ends with a comma. The parentheses capture everything between the two commas in a group to be used later in the replacement.
Replace: ,"$1",
Explanation: $1 holds the text captured by the parentheses in the match above. I surround it with quotes and re-add the commas: because the match included them, they must be put back or they are simply consumed. Note that while the match portion of -replace can be in double quotes without issue, the replacement must be in single quotes, or PowerShell interprets $1 as a PowerShell variable rather than a capture group.
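One caveat with the pattern above: because each match consumes both commas, the field immediately after a match cannot itself be matched, and the first and last fields on a line are never wrapped. The following is a minimal sketch of a lookaround-based alternative, assuming the fields are unquoted and contain no embedded commas or quotes; the sample line is hypothetical.

```powershell
# Hypothetical unquoted CSV line, used only for illustration.
$line = 'a,b,c,d,e'

# Pattern from the answer: each match consumes both commas,
# so the field right after a match is skipped.
$line -replace ',(.*?),', ',"$1",'
# a,"b",c,"d",e

# Lookaround variant: the delimiters are checked but not consumed,
# so every field (including the first and last) is wrapped.
$quoted = $line -replace '(?<=^|,)[^,"]*(?=,|$)', '"$0"'
$quoted
# "a","b","c","d","e"
```

On PowerShell 7 and later, ConvertTo-Csv and Export-Csv also accept -UseQuotes Always, which may avoid the need for a regex altogether.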
You can also use the following code:
$r.SURNAME = "`"$($r.SURNAME)`""
I have cheated to get what I want by re-parsing the file through the following; I guess it acts as a simple find and replace on the file.
get-content C:\Data\Downloads\file2.csv |
foreach-object { $_ -replace '"""', '"' } |
set-content C:\Data\Downloads\file3.csv
Thanks for the help on this.
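For anyone wondering where the three quotes come from: the CSV cmdlets escape a literal quote inside a field by doubling it, and then wrap the field in its own pair of quotes. A small sketch with a hypothetical one-row object shows the effect.

```powershell
# Hypothetical one-row object whose value already contains literal quotes.
$row = [pscustomobject]@{ SURNAME = '"Smith"' }

# ConvertTo-Csv wraps the field in quotes and doubles each embedded quote.
$lines = $row | ConvertTo-Csv -NoTypeInformation
$lines[1]
# """Smith"""
```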
Related
I am writing a script which at one point has to check in a text file and remove certain strings. So far I have this:
powershell -Command "(gc myFile.txt) -replace 'foo', 'bar' | Out-File -encoding ASCII myFile.txt"
The only problem is that this can find and replace, but it will not remove the line altogether.
The second problem is that if, say, I am removing the line that has Mark, it must not remove a line that has something like Markus.
I don't know if this is possible with the powershell interface?
Your current code will only replace foo with bar; that is what -replace does.
Removing the whole line when it matches requires a different, almost backwards approach: you can use -notmatch to output only the lines that do not match your filter, effectively removing the ones that do.
Also, using regex word boundaries means the pattern matches Mark but not Markus:
(Get-Content file.txt) | Where-Object {$_ -notmatch "\bMark\b"} | Set-Content file.txt
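The same filter can be tried on an in-memory array before touching a file. This is only a sketch; the sample strings are hypothetical.

```powershell
# Hypothetical file contents, held in memory for illustration.
$lines = 'Mark', 'Markus', 'Hello Mark!', 'Denmark'

# Applied to an array, -notmatch returns only the elements that do NOT match.
# \b requires a word boundary on both sides of "Mark".
$kept = $lines -notmatch '\bMark\b'
$kept
# Markus
# Denmark
```

Note that -notmatch is case-insensitive by default; -cnotmatch is the case-sensitive form.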
I have a script that I need to replace a couple of lines in. The first replace is going fine but the second is wiping out my file and duplicating the line multiple times.
My code
(get-content $($sr)) -replace 'remoteapplicationname:s:SHAREDAPP',"remoteapplicationcmdline:s:$($sa)" | Out-File $($sr)
(get-content $($sr)) -replace 'remoteapplicationprogram:s:||SHAREDAPP',"remoteapplicationprogram:s:||$($sa)" | Out-File $($sr)
The first replace works perfectly. The second one is causing this:
remoteapplicationprogram:s:||stagaredrremoteapplicationprogram:s:||stagarederemoteapplicationprogram:s:||stagareddremoteapplicationprogram:s:||stagarediremoteapplicationprogram:s:||stagaredrremoteapplicationprogram:s:||stagarederemoteapplicationprogram:s:||stagaredcremoteapplicationprogram:s:||stagaredtremoteapplicationprogram:s:||stagaredcremoteapplicationprogram:s:||stagaredlremoteapplicationprogram:s:||stagarediremoteapplicationprogram:s:||stagaredpremoteapplicationprogram:s:||stagaredbremoteapplicationprogram:s:||stagaredoremoteapplicationprogram:s:||stagaredaremoteapplicationprogram:s:||stagaredrremoteapplicationprogram:s:||stagareddremoteapplicationprogram:s:||stagared:remoteapplicationprogram:s:||stagarediremoteapplicationprogram:s:||stagared:remoteapplicationprogram:s:||stagared1remoteapplicationprogram:s:||stagared
etc...
Is this because of the ||? If so, how do I get around it?
Thanks!
To begin with, you should be using slightly more meaningful names for your variables. Especially if you want someone else to be reviewing your code.
The gist of your issue is that -replace supports regexes (regular expressions), and you have regex control characters in your pattern string. Consider the following simple example, and notice everywhere the replacement string is found:
PS C:\Users\Matt> "ABCD" -replace "||", "bagel"
bagelAbagelBbagelCbagelDbagel
-replace is also an array operator, so it works on every line of the input file, which is nice. For simplicity's sake, if you are not using a regex, you should just consider using the string method .Replace(), but it is case-sensitive, so that might not be ideal. So let's escape those control characters in the easiest way possible:
$patternOne = [regex]::Escape('remoteapplicationname:s:SHAREDAPP')
$patternTwo = [regex]::Escape('remoteapplicationprogram:s:||SHAREDAPP')
(get-content $sr) -replace $patternOne, "remoteapplicationcmdline:s:$sa" | Out-File $($sr)
(get-content $sr) -replace $patternTwo, "remoteapplicationprogram:s:||$sa" | Out-File $($sr)
Now both patterns match as you have them written. Run $patternTwo on the console to see how it has changed! $patternOne, as written, has no regex control characters in it, but it does not hurt to use the escape method if you are just expecting simple matching.
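As a quick check of what the escaping does, here is a small sketch on a single in-memory line. The value assigned to $sa is a hypothetical stand-in for the real application name.

```powershell
# [regex]::Escape puts a backslash in front of each regex metacharacter.
$escaped = [regex]::Escape('remoteapplicationprogram:s:||SHAREDAPP')
$escaped
# remoteapplicationprogram:s:\|\|SHAREDAPP

# Hypothetical line from the file and hypothetical replacement value.
$line = 'remoteapplicationprogram:s:||SHAREDAPP'
$sa = 'MyApp'

$result = $line -replace $escaped, "remoteapplicationprogram:s:||$sa"
$result
# remoteapplicationprogram:s:||MyApp
```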
Aside from the main issue pointed out, there is also some redundancy and misconception that can be addressed here. I presume you are updating a source file to replace all occurrences of those strings, yes? Well, you don't need to read the file in twice, given that you can chain -replace:
$patternOne = [regex]::Escape('remoteapplicationname:s:SHAREDAPP')
$patternTwo = [regex]::Escape('remoteapplicationprogram:s:||SHAREDAPP')
(get-content $sr) -replace $patternOne, "remoteapplicationcmdline:s:$sa" -replace $patternTwo, "remoteapplicationprogram:s:||$sa" |
Set-Content $sr
Perhaps that will do what you intended.
You might notice that I've removed the subexpression operators ($(...)) around your variables. While they have their place, they don't need to be used here. They are only needed inside more complicated strings, such as when you need to expand object properties.
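To illustrate when the subexpression operator does matter, here is a short sketch. The object and its Name property are hypothetical and only serve the example.

```powershell
# Hypothetical values for illustration.
$sa  = 'MyApp'
$app = [pscustomobject]@{ Name = 'MyApp' }

# A plain variable expands on its own inside double quotes.
$simple = "program:s:||$sa"

# Without $(), only $app is expanded and ".Name" is appended as literal text.
$broken = "program:s:||$app.Name"

# $() evaluates the property access before the string is built.
$fixed = "program:s:||$($app.Name)"
$fixed
# program:s:||MyApp
```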
Using PowerShell, I have code that will count the number of times a value appears anywhere in a .csv file. If I put "\bhello\b", it will count the times "hello" appears anywhere in the .csv. The problem is that it doesn't work for counting the times null appears in the CSV. It gives me a number bigger than the number of values in the entire CSV file.
(select-string -Path 'D:\AaronR\Desktop\Book.csv' -Pattern "\b$null\b" -AllMatches | Select-Object -ExpandProperty Matches).Count
There are 3 problems with your regular expression:
You defined your pattern as a double quoted string \b$null\b, so PowerShell automatically expands the variable $null to a null value and casts that to an empty string (to fit it into the string). Because of this you're effectively matching a pattern \b\b.
The character $ has a special meaning in regular expressions (the end of a string), so it must be escaped (\$) in order to match a literal $ character.
The character $ isn't a word-character, so even when escaping the $ the pattern \b\$word\b will only match if you have a word-character right before the $ (e.g. something$null).
If you want to match the literal string $null in your CSV, you need to escape the $ and put the first word boundary marker between $ and n. Also, I'd recommend using single quotes for regular expression strings, unless you want variables expanded in them.
Select-String -Pattern '\$\bnull\b' ...
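The difference between the two patterns can be seen on a single string. This is a minimal sketch; the sample line is hypothetical.

```powershell
# Double quotes: $null expands to an empty string before the regex engine sees it.
$expanded = "\b$null\b"
$expanded
# \b\b

# Hypothetical CSV line for illustration.
$line = 'a,$null,b,$nullable'

# \b\b matches at every word boundary, which explains the inflated count.
$boundaryCount = [regex]::Matches($line, $expanded).Count

# Single quotes keep the pattern literal; only the standalone $null matches.
$nullCount = [regex]::Matches($line, '\$\bnull\b').Count
$nullCount
# 1
```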
"ID","Full Name","Age"
"1","Jone Micale","25"
Here is a sample from a CSV file that I created, and now I want to remove the double quotes from only the ID and Age column values.
I tried different ways but I don't want to create a new file out of it. I just want to update the file with changes using PowerShell v1.
Export-Csv will always put all fields in double quotes, so you have to remove the undesired quotes the hard way. Something like this might work:
$csv = 'C:\path\to\your.csv'
(Get-Content $csv) -replace '^"(.*?)",(.*?),"(.*?)"$', '$1,$2,$3' |
Set-Content $csv
Regular expression breakdown:
^ and $ match the beginning and end of a string respectively (Get-Content returns an array with the lines from the file).
"(.*?)" matches text between two double quotes and captures the match (without the double quotes) in a group.
,(.*?), matches text between two commas and captures the match (including double quotes) in a group.
$1,$2,$3 replaces a matching string with the comma-separated first, second and third group from the match.
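Applied to the two sample lines from the question (held in memory here rather than read from a file), the replacement behaves as follows. Note that the pattern assumes exactly three columns with no commas inside the first or last field.

```powershell
# Sample lines from the question, held in memory instead of a file.
$lines = '"ID","Full Name","Age"', '"1","Jone Micale","25"'

# Strip the quotes from the first and last fields; the middle field keeps its quotes.
$result = $lines -replace '^"(.*?)",(.*?),"(.*?)"$', '$1,$2,$3'
$result
# ID,"Full Name",Age
# 1,"Jone Micale",25
```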
For part 1, see this SO post
I have a CSV that has certain fields separated by the " symbol as a TextQualifier.
See below for an example. Note that each integer (e.g. 1, 2, 3, etc.) is supposed to be a string. The qualified strings are surrounded by the " symbol.
1,2,3,"qualifiedString1",4,5,6,7,8,9,10,11,12,13,14,15,16,"qualifiedString2""
Notice how the last qualified string has a " symbol as part of the string.
User #mjolinor suggested this powershell script, which works to fix the above scenario, but it does not fix the "Part 2" scenario below.
(get-content file.txt -ReadCount 0) -replace '([^,]")"','$1' |
set-content newfile.txt
Here is part 2 of the question. I need a solution for this:
The extra " symbol can appear randomly in the string. Here's another example:
1,2,3,"qualifiedString1",4,5,6,7,8,9,10,11,12,13,14,15,16,"qualifiedS"tring2"
Can you suggest an elegant way to automate the cleaning of the CSV to eliminate redundant " qualifiers?
You just need a different regex:
(get-content file.txt -ReadCount 0) -replace '(?<!,)"(?!,|$)',''|
set-content newfile.txt
That one will remove any double quote that is neither immediately preceded by a comma nor immediately followed by a comma or the end of the line.
$text = '1,2,3,"qualifiedString1",4,5,6,7,8,9,10,11,12,13,14,15,16,"qualifiedS"tring2"'
$text -replace '(?<!,)"(?!,|$)',''
1,2,3,"qualifiedString1",4,5,6,7,8,9,10,11,12,13,14,15,16,"qualifiedString2"
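One possible edge case, not covered by the examples in the question: if a line begins with a qualified string, its opening quote is not preceded by a comma, so the pattern above would remove it as well. The following sketch, using a hypothetical line, adds the start of the line to the lookbehind as a second field boundary.

```powershell
# Hypothetical line that starts with a qualified string.
$text = '"qualifiedS"tring1",2,3'

# Original pattern: the opening quote is not preceded by a comma, so it is removed too.
$original = $text -replace '(?<!,)"(?!,|$)', ''
$original
# qualifiedString1",2,3

# Variant that also treats the start of the line as a field boundary.
$fixed = $text -replace '(?<!^|,)"(?!,|$)', ''
$fixed
# "qualifiedString1",2,3
```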