How can I remove CRLF if anywhere between double quotes, using PowerShell? - powershell

My text file looks like this.
"MikeCRLF","","","Dell","DevelCRLFCRLFoper"CRLF
"SuCRLFsan","","","Apple","ManagCRLFer"CRLF
Desired result:
"Mike","","","Dell","Developer"LF
"Susan","","","Apple","Manager"LF
I tried this on PowerShell:
$path = "C:\Users\abc\Desktop\1.txt"
(Get-Content $path -Raw).Replace("`r`n","`n") | Set-Content $path -Force
When I do this, I don't get the desired result. Also, I am left with one CRLF at the end. I don't want that either.
Please tell me how to do this using PowerShell v3.

This method avoids checking whether each \r\n is inside quotes. Instead, it finds the "real" line endings (a closing quote, a CRLF, then an opening quote) and converts those to LF first. Then it simply removes every remaining \r\n.
(Get-Content test.txt -Raw) -replace '([^,]")(\s*\r\n\s*)+("[^,])',"`$1`n`$3" -replace '\r\n',''
I think this should handle most of the stuff you throw at it, but let me know if you find a special case.
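To see what the two chained -replace passes do, here is a minimal sketch that runs them on an in-memory copy of the sample instead of a file. The $raw, $fixed and $expected names, and the # placeholder used to build the test string, are assumptions for illustration only.

```powershell
# Build the sample text; '#' is only a placeholder that is swapped for a real CRLF.
$raw = ('"Mike#","","","Dell","Devel##oper"#' +
        '"Su#san","","","Apple","Manag#er"#').Replace('#', "`r`n")

# Pass 1: quote, CRLF(s), quote marks a real record boundary, so keep it as LF.
# Pass 2: every CRLF that is left sits inside a field or at the very end, so drop it.
$fixed = $raw -replace '([^,]")(\s*\r\n\s*)+("[^,])', "`$1`n`$3" -replace '\r\n', ''

$expected = '"Mike","","","Dell","Developer"' + "`n" + '"Susan","","","Apple","Manager"'
$fixed -ceq $expected   # True
```

Note that Set-Content appends its own line break when it writes the result. On PowerShell v3, which has no -NoNewline switch, [System.IO.File]::WriteAllText($path, $fixed) with a full path is one way to write the text exactly as is.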

If you are using the PowerShell Community Extensions, you can use the ConvertTo-UnixLineEnding command e.g.:
ConvertTo-UnixLineEnding C:\users\abc\desktop1.txt -dest desktop1-converted.txt -Enc ascii

Related

Batch File to Find and Replace in text file using whole word only?

I am writing a script which at one point has to check in a text file and remove certain strings. So far I have this:
powershell -Command "(gc myFile.txt) -replace 'foo', 'bar' | Out-File -encoding ASCII myFile.txt"
The only problem is that this finds and replaces text but will not remove the line altogether.
The second problem is that if, say, I am removing the line that contains Mark, it must not remove a line that contains something like Markus.
I don't know if this is possible with PowerShell?
Your current code will only replace foo with bar; this is what -replace does.
Removing the whole line if it matches requires a different approach, almost backwards: you can use -notmatch to output only the lines that do not match your filter, effectively removing the rest.
Using regex word boundaries (\b) then matches Mark but not Markus:
(Get-Content file.txt) | Where-Object {$_ -notmatch "\bMark\b"} | Set-Content file.txt
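As a quick check of the word-boundary filter, the following sketch applies it to a hypothetical in-memory list rather than a file ($lines and $kept are illustrative names).

```powershell
# Hypothetical sample lines standing in for the file's contents.
$lines = 'Mark', 'Markus', 'Hello Mark!', 'Denmark'

# -notmatch keeps only the lines where the whole word Mark does not occur.
$kept = $lines | Where-Object { $_ -notmatch '\bMark\b' }

$kept -join ', '   # Markus, Denmark
```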

Set-content keeping busy my files

I'm trying to replace multiple strings with new ones, always in the same file.
Here is an example. This gives me no problems.
(get-content modTags.bas) | %{$_ -replace "rng_origin.Offset(ColumnOffset:=1)", "rng_origin.Offset(ColumnOffset:=0)"} | set-content modTags.bas
But if I repeat this line in the script (in fact, I must do it about 20 times), I get an error saying that the file is currently in use.
I have tried wrapping Set-Content in parentheses as with (Get-Content), but it doesn't work; it seems that is only allowed for the first command in the pipeline.
I already know how to "bypass" this error.
Typing all my replacements inline (or continuing the code on a new line) works, like this:
(get-content modTags.bas) | %{$_ -replace "X","Y" `
-replace "A","B"} | set-content modTags.bas
So this is a question about why Set-Content keeps the file locked for the next command in the same script, and how that can be avoided. With Get-Content it is easy with the () solution, and I was kind of expecting something similar for Set-Content.
Second, could you suggest a better alternative for replacing multiple strings with different ones and saving the result in the same file (without creating a file.new.txt, file.old.txt, or something like that)?
Thanks!
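One possible way to handle many replacements, building on the inline workaround above, is to keep the pairs in a list and apply them all between a single read and a single write. This is only a sketch: the pairs and sample lines are made up, and it uses the literal, case-sensitive .Replace() string method rather than -replace so that characters such as ( and ) are not treated as regex syntax.

```powershell
# Hypothetical find/replace pairs (literal text, not regex).
$pairs = @(
    @('Offset(ColumnOffset:=1)', 'Offset(ColumnOffset:=0)'),
    @('OldName', 'NewName')
)

# Stand-in for the lines returned by (Get-Content modTags.bas).
$lines = 'Set r = rng_origin.Offset(ColumnOffset:=1)', 'Call OldName'

$result = foreach ($line in $lines) {
    foreach ($pair in $pairs) {
        $line = $line.Replace($pair[0], $pair[1])
    }
    $line   # emit the fully processed line
}

$result -join ' | '   # Set r = rng_origin.Offset(ColumnOffset:=0) | Call NewName
```

With a real file, $lines would come from Get-Content and $result would be piped to Set-Content once, after the loop has finished.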

Pipes in replace causing line to be duplicated

I have a script that I need to replace a couple of lines in. The first replace is going fine but the second is wiping out my file and duplicating the line multiple times.
My code
(get-content $($sr)) -replace 'remoteapplicationname:s:SHAREDAPP',"remoteapplicationcmdline:s:$($sa)" | Out-File $($sr)
(get-content $($sr)) -replace 'remoteapplicationprogram:s:||SHAREDAPP',"remoteapplicationprogram:s:||$($sa)" | Out-File $($sr)
The first replace works perfectly. The second one is causing this:
remoteapplicationprogram:s:||stagaredrremoteapplicationprogram:s:||stagarederemoteapplicationprogram:s:||stagareddremoteapplicationprogram:s:||stagarediremoteapplicationprogram:s:||stagaredrremoteapplicationprogram:s:||stagarederemoteapplicationprogram:s:||stagaredcremoteapplicationprogram:s:||stagaredtremoteapplicationprogram:s:||stagaredcremoteapplicationprogram:s:||stagaredlremoteapplicationprogram:s:||stagarediremoteapplicationprogram:s:||stagaredpremoteapplicationprogram:s:||stagaredbremoteapplicationprogram:s:||stagaredoremoteapplicationprogram:s:||stagaredaremoteapplicationprogram:s:||stagaredrremoteapplicationprogram:s:||stagareddremoteapplicationprogram:s:||stagared:remoteapplicationprogram:s:||stagarediremoteapplicationprogram:s:||stagared:remoteapplicationprogram:s:||stagared1remoteapplicationprogram:s:||stagared
etc...
Is this because of the ||? If so, how do I get around it?
Thanks!
To begin with, you should be using slightly more meaningful names for your variables. Especially if you want someone else to be reviewing your code.
The gist of your issue is that -replace supports regexes (regular expressions), and you have regex control characters in your pattern string. Consider the following simple example, and notice everywhere the replacement string is found:
PS C:\Users\Matt> "ABCD" -replace "||", "bagel"
bagelAbagelBbagelCbagelDbagel
-replace is also an array operator, so it works on every line of the input file, which is nice. For simplicity's sake, if you are not using a regex, you should just consider using the string method .Replace(), but it is case-sensitive, so that might not be ideal. So let's escape those control characters in the easiest way possible:
$patternOne = [regex]::Escape('remoteapplicationname:s:SHAREDAPP')
$patternTwo = [regex]::Escape('remoteapplicationprogram:s:||SHAREDAPP')
(get-content $sr) -replace $patternOne, "remoteapplicationcmdline:s:$sa" | Out-File $($sr)
(get-content $sr) -replace $patternTwo, "remoteapplicationprogram:s:||$sa" | Out-File $($sr)
Now both patterns match as you have them written. Run $patternTwo on the console to see how it has changed! $patternOne, as written, has no regex control characters in it, but it does not hurt to use the escape method if you are just expecting simple matching.
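For reference, this is roughly what the escaped pattern looks like on the console; the $escaped and $replaced variable names are the only assumptions here.

```powershell
$escaped = [regex]::Escape('remoteapplicationprogram:s:||SHAREDAPP')
$escaped   # remoteapplicationprogram:s:\|\|SHAREDAPP

# The escaped pattern matches the literal text once instead of matching everywhere.
$replaced = 'remoteapplicationprogram:s:||SHAREDAPP' -replace $escaped, 'remoteapplicationprogram:s:||stagared'
$replaced  # remoteapplicationprogram:s:||stagared
```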
Aside from the main issue pointed out, there is also some redundancy and misconception that can be addressed here. I presume you are updating a source file to replace all occurrences of those strings, yes? Well, you don't need to read the file in twice, given that you can chain -replace:
$patternOne = [regex]::Escape('remoteapplicationname:s:SHAREDAPP')
$patternTwo = [regex]::Escape('remoteapplicationprogram:s:||SHAREDAPP')
(get-content $sr) -replace $patternOne, "remoteapplicationcmdline:s:$sa" -replace $patternTwo, "remoteapplicationprogram:s:||$sa" |
Set-Content $sr
Perhaps that will do what you intended.
You might notice that I've removed the subexpression operators ($(...)) around your variables. While they have their place, they are not needed here. They are only needed inside more complicated strings, such as when you need to expand an object's properties.
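A small illustration of where $(...) does matter, using a hypothetical hashtable $app: without the subexpression, only the variable itself is expanded and .Name is left as literal text.

```powershell
$app = @{ Name = 'stagared' }   # hypothetical object with a property to expand

$plain  = "program: $app.Name"     # expands $app only, then appends the text ".Name"
$subexp = "program: $($app.Name)"  # evaluates $app.Name inside the string

$subexp   # program: stagared
```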

Replacing contents of a text file using PowerShell

I've looked all around this site and can't quite seem to find anything that fits my situation. Basically, I am trying to write an addition to the NETLOGON file that will replace text in a text file on all of our users' desktops. The current text is static across the board.
The text I want it changed to will be unique to each user. I want to change the current text (user1) to the users AD username (i.e. johnd, janed, etc.). I am using Windows Server 2008 R2 and all the workstations are Windows 7 Professional SP1 64 bit.
Here's what I have tried so far (with a few variables, which none have worked for one reason or the other):
gc c:\Users\%USERNAME%\desktop\VPN.txt' -replace "user1",$env:username | out-file c:\Users\%USERNAME%\desktop\VPN.txt
I didn't get an error, but it also did not go back to the normal "PS C:>" prompt, just ">>>" and the file did not change as anticipated.
If that is exactly how you have the code, then I suppose it is because you have an opening single quote without a closing one. The >>> is the line-continuation prompt: the parser knows that the code is not complete and is giving you the option to continue it. If you were purposely writing a single statement across multiple lines, you would consider this a feature. Beyond the quote, you still have two other problems, and your code already contains the answer to one of them.
$path = "c:\Users\$($env:username)\desktop\VPN.txt"
(Get-Content $path) -replace "user1",$env:username | out-file $path
Closed the path in quotes and used a variable, since you used the path twice.
%name% is Command Prompt syntax. Environment variables in PowerShell use the $env: provider, which you did use once in your snippet.
-replace is a regex replacement operator that can work against the output of Get-Content, but you need to wrap that command in parentheses first so that its result is captured.
Since -replace is for regex and your string is not regex based, you could just use .Replace() as well.
Set-Content is generally preferred over Out-File for performance reasons.
All that being said...
you could also try something like this.
$path = "c:\Users\$($env:username)\desktop\VPN.txt"
(Get-Content $path).Replace("user1",$env:username) | Set-Content $path
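The difference between the two approaches can be sketched as follows; the sample strings are made up. -replace treats its pattern as a case-insensitive regex, while the .Replace() string method matches literal text and is case-sensitive.

```powershell
$text = 'User1 logs in as user1 (v1.0)'

$viaOperator = $text -replace 'user1', 'johnd'   # regex, ignores case
$viaMethod   = $text.Replace('user1', 'johnd')   # literal, case-sensitive

$viaOperator   # johnd logs in as johnd (v1.0)
$viaMethod     # User1 logs in as johnd (v1.0)

# A dot is a wildcard for -replace but a plain character for .Replace().
$dotRegex   = 'v1.0 v1x0' -replace 'v1.0', 'v2'    # v2 v2
$dotLiteral = 'v1.0 v1x0'.Replace('v1.0', 'v2')    # v2 v1x0
```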
Do you want to replace only the first occurrence?
You could use a little regex here, with a tweak to how you use Get-Content:
$path = "c:\Users\$($env:username)\desktop\VPN.txt"
(Get-Content $path | Out-String) -replace "(?s)(.*?)user1(.*)",('$1{0}$2' -f $env:username) | out-file $path
The regex matches the entire file; the (?s) option lets . match line breaks as well, which it does not do by default. There are two groups which it captures.
(.*?) - Up until the first "user1"
(.*) - Everything after that
Then we use the format operator to sandwich the new username in between those capture groups.
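A minimal sketch of the first-occurrence idea on an in-memory string (the sample text and the name johnd are made up). It assumes the (?s) option so that . can span line breaks, and it also shows the .NET Regex.Replace overload that takes a maximum replacement count as an alternative.

```powershell
$text = "server=user1`nbackup=user1"
$name = 'johnd'   # stands in for $env:username

# (?s) lets . match newlines, so the single match covers the whole text.
$viaGroups = $text -replace '(?s)(.*?)user1(.*)', ('$1{0}$2' -f $name)

# Alternative: ask the regex object to replace at most one match.
$viaCount = ([regex]'user1').Replace($text, $name, 1)

$viaGroups -ceq "server=johnd`nbackup=user1"   # True
$viaCount -ceq $viaGroups                      # True
```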
Use:
(Get-Content $fileName) | % {
    if ($_.ReadCount -eq 1) {
        $_ -replace "$original", "$content"
    }
    else {
        $_
    }
} | Set-Content $fileName

Select-String pattern not matching

I have the text of a couple hundred Word documents saved into individual .txt files in a folder. I am having an issue where a MergeField in the Word document wasn't formatted correctly, and now I need to find all the instances in the folder where the incorrect formatting occurs. The incorrect formatting is the string \#,$##,##0.00\* so I'm trying to use PowerShell as follows:
select-string -path MY_PATH\.*txt -pattern '\#,$##,##0.00\*'
select-string -path MY_PATH\.*txt -pattern "\#`,`$##`,##0.00\*"
But neither of those commands finds any results, even though I'm sure the string exists in at least one file. I feel like the error is occurring because there are special characters in the parameter (specifically $ and ,) that I'm not escaping correctly, but I'm not sure how else to format the pattern. Any suggestions?
If you are actually looking for \#,$##,##0.00\* then you need to be aware that Select-String uses regex by default, and you have a lot of control characters in there. Your pattern should be
\\\#,\$\#\#,\#\#0\.00\\\*
Or you can use the static Escape method of the regex class to do the dirty work for you. Single quotes keep PowerShell from trying to expand anything after the $.
[regex]::Escape('\#,$##,##0.00\*')
To put this all together, you would get the following:
select-string -path MY_PATH\.*txt -pattern ([regex]::Escape('\#,$##,##0.00\*'))
Or, even simpler, use the -SimpleMatch parameter, since it does not interpret the string as a regex and just searches for it as is.
select-string -path MY_PATH\.*txt -SimpleMatch '\#,$##,##0.00\*'
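To compare the options without touching any files, the sketch below tests a made-up line of text against the unescaped pattern, the escaped pattern, and -SimpleMatch. The $line content and the variable names are assumptions for illustration.

```powershell
$needle = '\#,$##,##0.00\*'                                  # the literal text to find
$line   = 'MERGEFIELD Amount \#,$##,##0.00\* MERGEFORMAT'    # hypothetical file content

$unescaped = $line -match $needle                    # $ acts as an end-of-line anchor here
$escaped   = $line -match [regex]::Escape($needle)   # every special character is escaped
$simple    = [bool]($line | Select-String -Pattern $needle -SimpleMatch)

"$unescaped $escaped $simple"   # False True True
```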
My try, similar to Matt's:
select-string -path .\*.txt -pattern '\\#,\$##,##0\.00\\\*'
result:
test.txt:1:\#,$##,##0.00\*