Remove all carriage return and line feed from file - powershell

Last week I asked how to replace a string with a newline character from a .bat script. I have since realized that my file already contains some carriage return and line feed characters, which I need to remove first before doing the replacement.
To replace '#####' with a line break I am using the line below.
(gc $Source) -replace "#####", "`r`n"|set-content $Destination
So I tried to implement the same logic to replace \r and \n as well, however it did not work.
(gc $Source) -replace "`n", ""|set-content $Destination
My file looks like this:
abc|d
ef|123#####xyz|tuv|567#####
and I need to make it look like this:
abc|def|123
xyz|tuv|567
As I said, replacing the row delimiter with a newline works, but I need to remove all CR and LF characters before I do that.
For small files the script below works, but my file is larger than 1.5 GB and the script throws an OutOfMemoryException.
param
(
[string]$Source,
[string]$Destination
)
echo $Source
echo $Destination
$Writer = New-Object IO.StreamWriter $Destination
$Writer.Write( [String]::Join("", $(Get-Content $Source)) )
$Writer.Close()

Use the function below to remove special characters. Put every character you want to remove in $SpecChars and call the function with the text as its parameter.
Function Convert-ToFriendlyName
{param ($Text)
# Unwanted characters (includes spaces and '-') converted to a regex:
#Whatever characters you want to remove, put it here with comma separation.
$SpecChars = '\', ' ','\\','-'
$remspecchars = [string]::join('|', ($SpecChars | % {[regex]::escape($_)}))
# Convert the text given to correct naming format (Uppercase)
$name = (Get-Culture).textinfo.totitlecase("$Text".tolower())
# Remove unwanted characters
$name = $name -replace $remspecchars, ""
$name
}
Hope it helps...!!!

This is VBScript. Windows isn't consistent about line endings. Mostly it breaks on CR and removes LF (all inbuilt programming languages), but Edit controls (i.e. Notepad) break on LF and ignore CR (unless it precedes an LF).
Set Inp = WScript.Stdin
Set Outp = Wscript.Stdout
Do Until Inp.AtEndOfStream
Text = Inp.readall
Text = Replace(Text, vbcr, "")
Text = Replace(Text, vblf, "")
Text = Replace(Text, "#####", vblf)
outp.write Text
Loop
This uses redirection of StdIn and StdOut.
Filtering the output of a command
YourProgram | Cscript //nologo script.vbs > OutputFile.txt
Filtering a file
Cscript //nologo script.vbs < InputFile.txt > OutputFile.txt
See my CMD Cheat Sheet about the Windows' command line Command to run a .bat file
So this removes the line endings in win.ini and prints the resulting single-line win.ini to the screen.
cscript //nologo "C:\Users\David Candy\Desktop\Replace.vbs" < C:\windows\win.ini
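For a file too large to hold in memory, the same three replacements can be done in chunks. The following is only a sketch, not something from the answers above: the function name Convert-RowDelimiter and its parameters are made up for illustration, it assumes PowerShell 5 or later (for [string]::new), a delimiter that contains no CR or LF, and a file that StreamReader can decode with its default UTF-8/BOM detection. It strips CR and LF from each chunk, replaces complete delimiters, and holds back any tail that could be the start of a delimiter split across two chunks.

```powershell
function Convert-RowDelimiter {
    param(
        [string]$Source,              # full path; .NET ignores the PowerShell location
        [string]$Destination,         # full path of the file to create
        [string]$Delimiter = '#####', # row delimiter to turn into CRLF
        [int]$BufferSize = 65536      # characters read per chunk (hypothetical default)
    )
    $reader = New-Object IO.StreamReader $Source
    $writer = New-Object IO.StreamWriter $Destination
    try {
        $buffer = New-Object char[] $BufferSize
        $carry  = ''
        while (($read = $reader.Read($buffer, 0, $buffer.Length)) -gt 0) {
            # Strip CR/LF from the new chunk only, then prepend the held-back tail
            $chunk = [string]::new($buffer, 0, $read).Replace("`r", '').Replace("`n", '')
            $text  = ($carry + $chunk).Replace($Delimiter, "`r`n")

            # Hold back the longest tail that is a proper prefix of the delimiter
            $hold = [Math]::Min($Delimiter.Length - 1, $text.Length)
            while ($hold -gt 0) {
                $tail = $text.Substring($text.Length - $hold)
                if ($Delimiter.StartsWith($tail, [StringComparison]::Ordinal)) { break }
                $hold--
            }
            $writer.Write($text.Substring(0, $text.Length - $hold))
            $carry = $text.Substring($text.Length - $hold)
        }
        # Whatever is still held back was not a delimiter after all
        $writer.Write($carry)
    }
    finally {
        $reader.Dispose()
        $writer.Dispose()
    }
}

# Example call (paths are placeholders):
# Convert-RowDelimiter -Source 'C:\data\in.txt' -Destination 'C:\data\out.txt'
```

Memory use stays around the chunk size regardless of file size. The output is written as UTF-8 without a BOM; pass an explicit encoding to the StreamReader and StreamWriter constructors if the file uses something else.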

Related

How to replace multiple lines of file via powershell

I have an httpd.conf file that contains a block like this:
<ThisBlock *:4443>
This Part can contain any random lines
This Part can contain any random lines
This Part can contain any random lines
</ThisBlock>
What I want is to swap the above block with this new block using PowerShell or cmd:
<ThisBlock *:4443>
This Part contain Static lines
This Part contain Static lines
</ThisBlock>
You could use a regex with the Singleline option, so that . also matches line breaks and all text between the tags is replaced:
$newtext = "<ThisBlock *:4443>
This Part contain Static lines
This Part contain Static lines
</ThisBlock>"
$text = Get-Content -Path C:\httpd.conf -Encoding UTF8 -raw
$option = [System.Text.RegularExpressions.RegexOptions]::Singleline
# the * has to be escaped with \ because it is a special character in a regex
$pattern = "<ThisBlock \*:4443>.*?</ThisBlock>"
$rgx = [regex]::new($pattern, $option)
$result = $rgx.Replace($text, $newtext)
$result
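To see why the Singleline option matters, here is a small sketch on an in-memory string (the sample text is made up for illustration). It uses the -replace operator with the inline (?s) flag, which is an alternative way of switching on the same option rather than the code from the answer.

```powershell
# Hypothetical stand-in for the contents of httpd.conf
$text = "Listen 80`n<ThisBlock *:4443>`nold line 1`nold line 2`n</ThisBlock>`nListen 443"
$newBlock = "<ThisBlock *:4443>`nThis Part contain Static lines`n</ThisBlock>"
$pattern = '<ThisBlock \*:4443>.*?</ThisBlock>'

# Without Singleline, . stops at line breaks, so nothing matches and the text is unchanged
$withoutOption = $text -replace $pattern, $newBlock

# (?s) turns Singleline on inline, so .*? can span the lines inside the block
$withOption = $text -replace "(?s)$pattern", $newBlock

$withoutOption -eq $text   # True
$withOption
```

To write the result back, something like Set-Content -Path C:\httpd.conf -Value $result -Encoding UTF8 should work, with the caveat that a replacement text containing $ would need escaping because $ has a special meaning in regex replacement strings.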

HOWTO remove whitespace from an automation string to export to a SQLPlus SELECT command LockedOut.sql file

This is an automation of a command to SQLPlus 12c on Linux from Windows 18_3 version on PowerShell 5.1 with Microsoft modules loaded.
I need to clean out the whitespace of the string to input wildcard data on an automation Select script (the final script will find a missing TIFF image and reinsert it).
I am UNABLE to remove the white space before the tee.
The latest attempts are in the post but I have tried Trim, Split, Replace, Remove, Substring, >>, Write-Host -NoNewline,... I am SO close.
When I Write-Host -NoNewline I succeeded in removing the CRLF but not so as I can Tee, Write-Out, or Out-File the content that way.
#Add-Type -AssemblyName System.Data.OracleClient
$filefolder = "C:\EMSCadre\iGateway\clint\Input_Images\"
$Files = Get-ChildItem $FileFolder -Name -File
$longname = $Files.Get(2)
$shortname = $longname.Replace("_tiff","").Replace("cns","").Substring(9).Split('".tif"')
echo "select LD_CASE_NUMBER FROM LOG_data where ld_message_3 like %$shortname%" |
tee -Verbose c:\scripts\input\lockedout_test.sql
type c:\scripts\input\lockedout_test.sql
#Failed attempts
#echo "select LD_CASE_NUMBER FROM LOG_data where ld_message_3 like %($shortname1.TrimEnd('_',"")%" |
# tee -Verbose c:\scripts\input\lockedout_test.sql
Latest Results showing Whitespaces before last %:
select LD_CASE_NUMBER FROM LOG_data where ld_message_3 like %100838953_180130001 %
select LD_CASE_NUMBER FROM LOG_data where ld_message_3 like %100838953_180130001 %
Details to help troubleshoot:
PS C:\scripts> $Files
2823910000.tif
2823910002.tif
cns20180827_100838953_180130001_tiff.tif
exposureworks-dynamic-range-test-f16-graded-TIFF-RGB-parade.jpg
PS C:\scripts> $shortname
100838953_180130001
Looks to me like the last step (Split()) of the statement
$longname.Replace("_tiff","").Replace("cns","").Substring(9).Split('".tif"')
is supposed to remove the extension from the file name. That is not how Split() works. The method interprets the string ".tif" as a character array and splits the given string at any of those characters (", ., f, i, t). Splitting the string 100838953_180130001.tif that way gives you an array with 5 elements, the last 4 of which are empty strings:
[ '100838953_180130001', '', '', '', '' ]
Putting the variable with that array into a string mangles the array into a string by concatenating its elements using the output field separator ($OFS), which by default is a single space, thus producing the trailing spaces you observed.
To remove the prefix cns..._ and the substring _tiff as well as the extension .tif from the file name use the following:
$shortname = $longname -replace '^cns\d*_|_tiff|\.tif$'
That regular expression replacement will remove the substring "cns" followed by any number of digits and an underscore from the beginning of a string (^), the substring "_tiff" from anywhere in a string, and the substring ".tif" from the end of a string ($).
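The behaviour described above can be reproduced with the file name from the question. This is a sketch for illustration only; the [char[]] cast is added so that Split() is forced to treat the argument as a set of characters on every PowerShell version, which is what Windows PowerShell 5.1 does implicitly.

```powershell
$longname = 'cns20180827_100838953_180130001_tiff.tif'

# What the original chain produces before Split(): '100838953_180130001.tif'
$base = $longname.Replace('_tiff', '').Replace('cns', '').Substring(9)

# Split() on the characters " . t i f yields the name plus four empty strings
$parts = $base.Split([char[]]'".tif"')
$parts.Count          # 5

# Interpolating the array joins its elements with $OFS (a space by default),
# which is where the blanks before the final % come from
"like %$parts%"

# The regex replacement returns a single string without trailing blanks
$shortname = $longname -replace '^cns\d*_|_tiff|\.tif$'
"like %$shortname%"   # like %100838953_180130001%
```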

Adding a newline (line break) to a Powershell script

I have a script I am running in PowerShell, and I want to be able to put a line break in my resulting text file output between the script name and the script content itself.
Currently, from the below, the line $str_msg = $file,[System.IO.File]::ReadAllText($file.FullName) is what I need, but I need a line to separate $file and the result of the next expression. How can I do this?
foreach ($file in [System.IO.Directory]::GetFiles($sqldir,"*.sql",
[System.IO.SearchOption]::AllDirectories))
{
$file = [System.IO.FileInfo]::new($file);
$Log.SetLogDir("");
$str_msg = $file,[System.IO.File]::ReadAllText($file.FullName);
$Log.AddMsg($str_msg);
Write-Output $str_msg;
# ...
}
$str_msg = $file,[System.IO.File]::ReadAllText($file.FullName) doesn't create a string, it creates a 2-element array ([object[]]), composed of the $file [System.IO.FileInfo] instance, and the string with the contents of that file.
Presumably, the .AddMsg() method expects a single string, so PowerShell stringifies the array in order to convert it to a single string; PowerShell stringifies an array by concatenating the elements with a single space as the separator by default; e.g.:
[string] (1, 2) yields '1 2'.
Therefore, it's best to compose $str_msg as a string to begin with, with an explicit newline as the separator, e.g.:
$strMsg = "$file`r`n$([System.IO.File]::ReadAllText($file.FullName))"
Note the use of escape sequence "`r`n" to produce a CRLF, the Windows-specific newline sequence; on Unix-like platforms, you'd use just "`n" (LF).
.NET offers a cross-platform abstraction, [Environment]::NewLine, which returns the platform-appropriate newline sequence (which you could alternatively embed as $([Environment]::NewLine) inside "...").
An alternative to string interpolation is to use -f, the string-formatting operator, which is based on the .NET String.Format() method:
$strMsg = '{0}{1}{2}' -f $file,
[Environment]::NewLine,
[System.IO.File]::ReadAllText($file.FullName)
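The difference between the implicit stringification and an explicit separator can be checked with two plain strings. This is a sketch with made-up values standing in for the file name and file contents; it also shows the -join operator, which is a further alternative not mentioned above.

```powershell
$name    = 'C:\scripts\example.sql'   # hypothetical file name
$content = 'SELECT 1;'                # hypothetical file contents

$asArray  = $name, $content
$implicit = [string]$asArray          # 'C:\scripts\example.sql SELECT 1;'

# -join lets you choose the separator instead of the default single space
$joined = $asArray -join [Environment]::NewLine
$joined
```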
Backtick-r+backtick-n will do a carriage return with a new line in PS. You could do a Get-Content of your $file variable as a new array variable, and insert the carriage return at a particular index:
Example file: test123.txt
If the file contents were this:
line1
line2
line3
Store the contents in an array variable so you have indices
[Array]$fileContent = Get-Content C:\path\to\test123.txt
To add a carriage return between line2 and line3:
$fileContent2 = $fileContent[0..1] + "`r`n" + $fileContent[2]
Then output a new file:
$fileContent2 | Out-File -FilePath C:\path\to\newfile.txt
You need to use the carriage return powershell special character, which is "`r".
Use it like this to add a carriage return in your line :
$str_msg = $file,"`r",[System.IO.File]::ReadAllText($file.FullName);
Check this documentation for more details on PowerShell special characters.

Using PowerShell in a .bat file, replace a string with multiple strings

I'm using a .bat file to move several files into another folder, but before the actual move I want to replace the LAST line (it is a known line). For example, I have a file output.txt like this:
HEADER
BODY
FOOTER
Using this snippet of code:
powershell -Command "(gc output.txt) -replace 'FOOTER', 'ONE_MORE_LINE `r`n FOOTER' | Out-File output.txt"
The return that I expected was
HEADER
BODY
ONE_MORE_LINE
FOOTER
But what I got was:
HEADER
BODY
ONE_MORE_LINE `r`n FOOTER
I've tried:
\n
<br>
"`r`n"
"`n"
echo ONE_MORE_LINE >> output.txt; echo. >> output.txt; echo FOOTER >> output.txt"
This last one got close, but the result was some broken characters.
Suggestions other than PowerShell are welcome. I'm only using it because it was an easy way to add the lines and do the replacement.
EDIT :
Tried this command
powershell -Command "(gc output.txt) -replace 'FOOTER;', ""ONE_MORE_LINE `r`n FOOTER"" | Out-File output.txt "
And returned this error:
The string is missing the terminator: ".
+ CategoryInfo : ParserError: (:) [], ParentContainsErrorRecordException
+ FullyQualifiedErrorId : TerminatorExpectedAtEndOfString
EDIT2 - Possible Solution:
I realized that using the PowerShell command altered the encoding of the file, which broke the echo ONE_MORE_LINE approach. Using the suggestion from @AnsgarWiechers, I wrote this code:
findstr /v "FOOTER" output.sql > new_output.sql
TYPE new_output.sql > output.sql
del new_output.sql
ECHO. >> %%f
ECHO ONE_MORE_LINE >> %%f
ECHO FOOTER >> %%f
ECHO. >> %%f
It uses the command findstr /v "FOOTER" to find all lines in output.sql that are not FOOTER and writes them to new_output.sql.
Then I TYPE it back to the original file and DEL new_output.sql.
Then I echo all the lines I need right after it.
It works, BUT for big files I think that rewriting the file twice will take a lot of time, and I can't figure out another solution.
When working with big files it's best to use a file stream. More typical methods of reading a file line-by-line using a Batch for /f loop or using Get-Content in PowerShell to read the entire file into memory can slow the process to a crawl with large files. Using a file stream on the other hand, you can nearly instantly seek back from the end of the file to the beginning of the last line, insert your desired data, and then reassemble the bytes you overwrote.
The following example will use PowerShell's access to .NET methods to open a file as a byte stream for rapid reading and writing. See inline comments for details. File encoding will hopefully be preserved. Save this with a .bat extension and give it a shot.
<# : batch portion
@echo off & setlocal
set "file=test.txt"
set "line=Line to insert!"
powershell -noprofile "iex (${%~f0} | out-string)"
goto :EOF
: end batch / begin PowerShell hybrid #>
# construct a file stream for reading and writing $env:file
$IOstream = new-object IO.FileStream((gi $env:file).FullName,
[IO.FileMode]::OpenOrCreate, [IO.FileAccess]::ReadWrite)
# read BOM to determine file encoding
$reader = new-object IO.StreamReader($IOstream)
[void]$reader.Read((new-object byte[] 3), 0, 3)
$encoding = $reader.CurrentEncoding
$reader.DiscardBufferedData()
# convert line-to-insert to file's native encoding
$utf8line = [Text.Encoding]::UTF8.GetBytes("`r`n$env:line")
$line = [Text.Encoding]::Convert([Text.Encoding]::UTF8, $encoding, $utf8line)
$charSize = [math]::ceiling($line.length / $utf8line.length)
# move pointer to the end of the stream
$pos = $IOstream.Seek(0, [IO.SeekOrigin]::End)
# walk back pointer while stream returns no error
while ($char -gt -1) {
$IOstream.Position = --$pos
$char = $reader.Peek()
$reader.DiscardBufferedData()
# break out of loop when line feed preceding non-whitespace is found
if ($foundPrintable) { if ($char -eq 10) { break } }
else { if ([char]$char -match "\S") { $foundPrintable++ } }
}
# step pointer back to carriage return and read to end into $buffer with $line prepended
$pos -= $charSize
$IOstream.Position = $pos
$buffer = $encoding.GetBytes($encoding.GetString($line) + $reader.ReadToEnd())
$IOStream.Position = $pos
"Inserting data at byte $pos"
$IOstream.Write($buffer, 0, $buffer.Length)
# Garbage collection
$reader.Dispose()
$IOstream.Dispose()
This method should be much more efficient than reading the file from the beginning, or copying the entire file into memory or on disk with a new line inserted. In my testing, it inserts the line into a hundred meg file in about 1/3 of a second.
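As a side note on the symptom in the question (the literal `r`n in the output): PowerShell only expands backtick escape sequences inside double-quoted strings, and the replacement text in the question is single-quoted. A minimal sketch, run directly in PowerShell with made-up in-memory lines instead of output.txt, shows the difference; it leaves aside how the double quotes then have to be passed through cmd.

```powershell
$lines = 'HEADER', 'BODY', 'FOOTER'   # stand-in for (gc output.txt)

# Single quotes: the backticks stay in the text as ordinary characters
$literal = $lines -replace 'FOOTER', 'ONE_MORE_LINE `r`n FOOTER'

# Double quotes: `r`n becomes a real CR+LF pair
$expanded = $lines -replace 'FOOTER', "ONE_MORE_LINE`r`nFOOTER"

$literal[2]
$expanded[2]
```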

Adding variable to text string and adding new line in powershell

I have the following code
$scriptpath = "C:\Test"
$scriptname = "mount.bat"
$myimage = Read-Host 'Enter the file name of your image'
if (Test-Path $scriptpath\$scriptname) {
Remove-Item $scriptpath\$scriptname
}
Add-Content $scriptpath\$scriptname ':Loop 'n "C:\Program
Files\file.exe" -f \\host\"Shared Folders"\$myimage -m V: `n if not
%errorlevel% equal 0 goto :Loop'
I can't get PowerShell to output the variable correctly in the output batch file: it just says "$myimage" and not the file name. I have tried using the break ` ' symbols but no luck. I also cannot get PowerShell to write onto a separate line. If anyone could help that would be great.
Since you're using (a) variable references ($myimage) and (b) escape sequences such as `n (to represent a newline) in your string, you must use double quotes to get the expected result.
(Single-quoted strings treat their contents literally - no interpolation takes place.[1])
Furthermore, since your string has embedded double quotes, they must be escaped as `"
Here's a fixed version; note that I've used actual line breaks for readability (rather than `n escape sequences in a single-line string):
$myImage = 'test.png'
":Loop
`"C:\Program Files\file.exe`" -f \\host\`"Shared Folders`"\$myimage -m V:
if not %errorlevel% equal 0 goto :Loop"
[1] The only interpretation that takes place is to recognize escape sequence '' as an embedded '.
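Another way to avoid escaping the embedded double quotes is a double-quoted here-string, which still expands variables. The following is a sketch rather than part of the answer above: the temp-folder output path is an assumption chosen so the example runs anywhere, and the comparison is written as equ because that is the operator cmd actually accepts (the question's text has "equal").

```powershell
$myimage = 'test.png'   # hypothetical value; the question reads it with Read-Host

# Inside @" ... "@ double quotes need no escaping and $myimage is still expanded
$batch = @"
:Loop
"C:\Program Files\file.exe" -f \\host\"Shared Folders"\$myimage -m V:
if not %errorlevel% equ 0 goto :Loop
"@

# Hypothetical output location; the question uses C:\Test\mount.bat
$target = Join-Path ([IO.Path]::GetTempPath()) 'mount.bat'
Set-Content -Path $target -Value $batch
Get-Content $target
```

Set-Content overwrites an existing file, so the Test-Path / Remove-Item step from the question is not needed with this approach.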