How to replace multiple lines of a file via PowerShell

I have an httpd.conf file that contains a section like this:
<ThisBlock *:4443>
This Part can contain any random lines
This Part can contain any random lines
This Part can contain any random lines
</ThisBlock>
What I want is to swap the above block with this new block, using PowerShell or cmd:
<ThisBlock *:4443>
This Part contain Static lines
This Part contain Static lines
</ThisBlock>

You could use a regex with the Singleline option, so that . also matches newlines and all text between the tags is replaced:
$newtext = "<ThisBlock *:4443>
This Part contain Static lines
This Part contain Static lines
</ThisBlock>"
$text = Get-Content -Path C:\httpd.conf -Encoding UTF8 -raw
$option = [System.Text.RegularExpressions.RegexOptions]::Singleline
# The * character has to be escaped as \* because it is a regex metacharacter
$pattern = "<ThisBlock \*:4443>.*?</ThisBlock>"
$rgx = [regex]::new($pattern, $option)
$result = $rgx.Replace($text, $newtext)
$result
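The snippet above only prints the result. A minimal sketch of writing it back to disk might look as follows; the temp-file path and the sample content are stand-ins invented for illustration (not the asker's real httpd.conf), (?s) is used as the inline equivalent of the Singleline option, and -NoNewline assumes PowerShell 5 or later.

```powershell
# Hypothetical sample file in the temp directory (stand-in for C:\httpd.conf)
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'httpd-sample.conf'
Set-Content -Path $path -Value @'
Listen 4443
<ThisBlock *:4443>
old line 1
old line 2
</ThisBlock>
ServerName example
'@

$newBlock = @'
<ThisBlock *:4443>
This Part contain Static lines
This Part contain Static lines
</ThisBlock>
'@

# Read the whole file as one string so the regex can span several lines
$text = Get-Content -Path $path -Raw

# (?s) lets . match newlines; .*? stops at the first closing tag
$updated = $text -replace '(?s)<ThisBlock \*:4443>.*?</ThisBlock>', $newBlock

# -NoNewline avoids appending an extra trailing newline to the raw text
Set-Content -Path $path -Value $updated -NoNewline
```

Lines outside the block (Listen, ServerName in this sample) are left untouched, because only the text between the opening and closing tags is matched.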

Related

PowerShell Extract text between two strings with -Tail and -Wait

I have a text file with a large number of log messages.
I want to extract the messages between two string patterns. I want the extracted message to appear as it is in the text file.
I tried the following methods. It works, but doesn't support Get-Content's -Wait and -Tail options. Also, the extracted results are displayed in one line, but not like the text file. Inputs are welcome :-)
Sample Code
function GetTextBetweenTwoStrings($startPattern, $endPattern, $filePath){
# Get content from the input file
$fileContent = Get-Content $filePath
# Regular expression (Regex) of the given start and end patterns
$pattern = "$startPattern(.*?)$endPattern"
# Perform the Regex operation
$result = [regex]::Match($fileContent,$pattern).Value
# Finally return the result to the caller
return $result
}
# Clear the screen
Clear-Host
$input = "THE-LOG-FILE.log"
$startPattern = 'START-OF-PATTERN'
$endPattern = 'END-OF-PATTERN'
# Call the function
GetTextBetweenTwoStrings -startPattern $startPattern -endPattern $endPattern -filePath $input
Improved script based on Theo's answer.
The following points need to be improved:
The beginning and end of the output are somehow trimmed, even though I adjusted the buffer size in the script.
How to wrap each matched result into START and END string?
Still I could not figure out how to use the -Wait and -Tail options
Updated Script
# Clear the screen
Clear-Host
# Adjust the buffer size of the window
$bw = 10000
$bh = 300000
if ($host.name -eq 'ConsoleHost') # or -notmatch 'ISE'
{
[console]::bufferwidth = $bw
[console]::bufferheight = $bh
}
else
{
$pshost = get-host
$pswindow = $pshost.ui.rawui
$newsize = $pswindow.buffersize
$newsize.height = $bh
$newsize.width = $bw
$pswindow.buffersize = $newsize
}
function Get-TextBetweenTwoStrings ([string]$startPattern, [string]$endPattern, [string]$filePath){
# Get content from the input file
$fileContent = Get-Content -Path $filePath -Raw
# Regular expression (Regex) of the given start and end patterns
$pattern = '(?is){0}(.*?){1}' -f [regex]::Escape($startPattern), [regex]::Escape($endPattern)
# Perform the Regex operation and output
[regex]::Match($fileContent,$pattern).Groups[1].Value
}
# Input file path
$inputFile = "THE-LOG-FILE.log"
# The patterns
$startPattern = 'START-OF-PATTERN'
$endPattern = 'END-OF-PATTERN'
Get-TextBetweenTwoStrings -startPattern $startPattern -endPattern $endPattern -filePath $inputFile
You need to perform streaming processing of your Get-Content call, in a pipeline, such as with ForEach-Object, if you want to process lines as they're being read.
This is a must if you're using Get-Content -Wait, because such a call doesn't terminate by itself (it keeps waiting for new lines to be added to the file, indefinitely), but inside a pipeline its output can be processed as it is being received, even before the command terminates.
You're trying to match across multiple lines, which with Get-Content output would only work if you used the -Raw switch - by default, Get-Content reads its input file(s) line by line.
However, -Raw is incompatible with -Wait.
Therefore, you must stick with line-by-line processing, which requires that you match the start and end patterns separately, and keep track of when you're processing lines between those two patterns.
Here's a proof of concept, but note the following:
-Tail 100 is hard-coded - adjust as needed or make it another parameter.
The use of -Wait means that the function will run indefinitely - waiting for new lines to be added to $filePath - so you'll need to use Ctrl-C to stop it.
While you can use a Get-TextBetweenTwoStrings call itself in a pipeline for object-by-object processing, assigning its result to a variable ($result = ...) won't work when terminating with Ctrl-C, because this method of termination also aborts the assignment operation.
To work around this limitation, the function below is defined as an advanced function, which automatically enables support for the common -OutVariable parameter, which is populated even in the event of termination with Ctrl-C; your sample call would then look as follows (as Theo notes, don't use the automatic $input variable as a custom variable):
# Look for blocks of interest in the input file, indefinitely,
# and output them as they're being found.
# After termination with Ctrl-C, $result will also contain the blocks
# found, if any.
Get-TextBetweenTwoStrings -OutVariable result -startPattern $startPattern -endPattern $endPattern -filePath $inputFile
Per your feedback you want the block of lines to encompass the full lines on which the start and end patterns match, so the regexes below are enclosed in .*
The word pattern in your $startPattern and $endPattern parameters is a bit ambiguous in that it suggests that they themselves are regexes that can therefore be used as-is or embedded as-is in a larger regex on the RHS of the -match operator.
However, in the solution below I am assuming that they are to be treated as literal strings, which is why they are escaped with [regex]::Escape(); simply omit these calls if these parameters are indeed regexes themselves; i.e.:
$startRegex = '.*' + $startPattern + '.*'
$endRegex = '.*' + $endPattern + '.*'
The solution assumes there is no overlap between blocks and that, in a given block, the start and end patterns are on separate lines.
Each block found is output as a single, multi-line string, using LF ("`n") as the newline character; if you want CRLF newline sequences instead, use "`r`n"; for the platform-native newline format (CRLF on Windows, LF on Unix-like platforms), use [Environment]::NewLine.
# Note the use of "-" after "Get", to adhere to PowerShell's
# "<Verb>-<Noun>" naming convention.
function Get-TextBetweenTwoStrings {
# Make the function an advanced one, so that it supports the
# -OutVariable common parameter.
[CmdletBinding()]
param(
$startPattern,
$endPattern,
$filePath
)
# Note: If $startPattern and $endPattern are themselves
# regexes, omit the [regex]::Escape() calls.
$startRegex = '.*' + [regex]::Escape($startPattern) + '.*'
$endRegex = '.*' + [regex]::Escape($endPattern) + '.*'
$inBlock = $false
$block = [System.Collections.Generic.List[string]]::new()
Get-Content -Tail 100 -Wait $filePath | ForEach-Object {
if ($inBlock) {
if ($_ -match $endRegex) {
$block.Add($Matches[0])
# Output the block of lines as a single, multi-line string
$block -join "`n"
$inBlock = $false; $block.Clear()
}
else {
$block.Add($_)
}
}
elseif ($_ -match $startRegex) {
$inBlock = $true
$block.Add($Matches[0])
}
}
}
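To try the block-tracking logic without a call that waits forever, the same state machine can be sketched as a pipeline function fed from an in-memory array. Select-TextBlock and the sample lines are hypothetical names made up for this illustration; unlike the function above, this variant keeps whole lines as-is instead of using $Matches[0].

```powershell
# Hypothetical helper: the same in-block/out-of-block tracking, fed from the pipeline
function Select-TextBlock {
  [CmdletBinding()]
  param(
    [Parameter(Mandatory)] [string] $StartPattern,
    [Parameter(Mandatory)] [string] $EndPattern,
    [Parameter(ValueFromPipeline)] [string] $Line
  )
  begin {
    # Patterns are treated as literal strings, hence the escaping
    $startRegex = [regex]::Escape($StartPattern)
    $endRegex = [regex]::Escape($EndPattern)
    $inBlock = $false
    $block = [System.Collections.Generic.List[string]]::new()
  }
  process {
    if ($inBlock) {
      $block.Add($Line)
      if ($Line -match $endRegex) {
        # Emit the finished block as one multi-line string, then reset
        $block -join "`n"
        $inBlock = $false
        $block.Clear()
      }
    }
    elseif ($Line -match $startRegex) {
      $inBlock = $true
      $block.Add($Line)
    }
  }
}

# Made-up sample input standing in for the log file
$sampleLines = 'noise', 'x START-OF-PATTERN y', 'payload', 'z END-OF-PATTERN', 'more noise'
$blocks = @($sampleLines | Select-TextBlock -StartPattern 'START-OF-PATTERN' -EndPattern 'END-OF-PATTERN')
```

Because the input comes from the pipeline, the same function could presumably be placed after Get-Content -Tail 100 -Wait $filePath as well.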
First of all, you should not use $input as a self-defined variable name, because it is an automatic variable.
Then, you are reading the file as a string array, where you would rather read it as a single, multiline string. For that, append the -Raw switch to the Get-Content call.
The regex you are creating does not allow for regex special characters in the start and end patterns you give, so I would suggest using [regex]::Escape() on these patterns when creating the regex string.
While your regex does use a capturing group (the part inside the parentheses), you are not using it when it comes to getting the value you seek.
Finally, I would recommend using the PowerShell naming convention (Verb-Noun) for the function name.
Try
function Get-TextBetweenTwoStrings ([string]$startPattern, [string]$endPattern, [string]$filePath){
# Get content from the input file
$fileContent = Get-Content -Path $filePath -Raw
# Regular expression (Regex) of the given start and end patterns
$pattern = '(?is){0}(.*?){1}' -f [regex]::Escape($startPattern), [regex]::Escape($endPattern)
# Perform the Regex operation and output
[regex]::Match($fileContent,$pattern).Groups[1].Value
}
$inputFile = "D:\Test\THE-LOG-FILE.log"
$startPattern = 'START-OF-PATTERN'
$endPattern = 'END-OF-PATTERN'
Get-TextBetweenTwoStrings -startPattern $startPattern -endPattern $endPattern -filePath $inputFile
Would result in something like:
blahblah
more lines here
The (?is) makes the regex case-insensitive and have the dot match linebreaks as well
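The function above returns only the first match. If several blocks are expected, a small sketch using [regex]::Matches() instead of [regex]::Match() could collect all of them; the sample text and the short START/END markers are invented for illustration.

```powershell
# Made-up sample text containing two blocks, one of which spans a line break
$text = "a START one`nline END b START two END"

# Same pattern construction as in the function above
$pattern = '(?is){0}(.*?){1}' -f [regex]::Escape('START'), [regex]::Escape('END')

# Matches() returns every non-overlapping match; group 1 is the text between the markers
$found = @([regex]::Matches($text, $pattern) | ForEach-Object { $_.Groups[1].Value })
```

Here $found holds one string per block, so each block can be processed or wrapped individually.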
Nice to see you're using my version of the Get-TextBetweenTwoStrings function; however, I believe you are mistaking the way the console displays output for the way a dedicated text editor would display it. In the console, lines that are too long are truncated, whereas in a text editor like Notepad, you can choose to wrap long lines or have a horizontal scrollbar.
If you simply append
| Set-Content -Path 'X:\wherever\theoutput.txt'
to the Get-TextBetweenTwoStrings .. call, you will find the lines are NOT truncated when you open it in Word or notepad for instance.
In fact, you can have that line followed by
notepad 'X:\wherever\theoutput.txt'
to have notepad open that file straight away.

Powershell- match split and replace based on index

I have a file
AB*00*Name1First*Name1Last*test
BC*JCB*P1*Church St*Texas
CD*02*83*XY*Fax*LM*KY
EF*12*Code1*TX*1234*RJ
I need to replace the 5th element in the CD segment alone, from LM to ET, in each of the files in the folder. The element delimiter is *, as shown in the sample file content above. I am new to PowerShell and tried the code below, but unfortunately it is not giving the desired results. Can any of you please provide some help?
foreach($xfile in $inputfolder)
{
If ($_ match "^CD\*")
{
[System.IO.File]::ReadAllText($xfile).replace(($_.split("*")[5],"ET") | Set-Content $xfile
}
[System.IO.File]::WriteAllText($xfile),((Get-Content $xfile -join("~")))
}
here's a slightly different way to get there ... [grin] what it does ...
fakes reading in a test file
when ready to do this for real, remove the entire #region/#endregion block and use Get-Content.
sets the constants
iterates thru the imported text file lines
checks for a line that starts with the target pattern
if found ...
== escapes the old value with [regex]::Escape() to deal with the asterisks
== replaces the escaped old value with the new value
== outputs the new version of that line
if NOT found, outputs the line as-is
stores all the lines into the $OutStuff var
displays that on screen
the code ...
#region >>> fake reading in a plain text file
# in real life, use Get-Content
$InStuff = @'
AB*00*Name1First*Name1Last*test
BC*JCB*P1*Church St*Texas
CD*02*83*XY*Fax*LM*KY
EF*12*Code1*TX*1234*RJ
'@ -split [System.Environment]::NewLine
#endregion >>> fake reading in a plain text file
$TargetLineStart = 'CD*'
$OldValue = '*LM*'
$NewValue = '*ET*'
$OutStuff = foreach ($IS_Item in $InStuff)
{
if ($IS_Item.StartsWith($TargetLineStart))
{
$IS_Item -replace [regex]::Escape($OldValue), $NewValue
}
else
{
$IS_Item
}
}
$OutStuff
output ...
AB*00*Name1First*Name1Last*test
BC*JCB*P1*Church St*Texas
CD*02*83*XY*Fax*ET*KY
EF*12*Code1*TX*1234*RJ
i will leave saving that to a new file [or overwriting the old one] to the user. [grin]
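For the saving step and the per-folder loop from the question, a minimal sketch might look like this. It swaps in an index-based split/join rather than the regex replace above, and the temp folder, the sample file and the *.txt filter are assumptions made up for illustration.

```powershell
# Hypothetical demo folder with one sample file (stand-in for the real input folder)
$folder = Join-Path ([System.IO.Path]::GetTempPath()) 'segment-demo'
New-Item -ItemType Directory -Path $folder -Force | Out-Null
Set-Content -Path (Join-Path $folder 'sample.txt') -Value @(
  'AB*00*Name1First*Name1Last*test'
  'BC*JCB*P1*Church St*Texas'
  'CD*02*83*XY*Fax*LM*KY'
  'EF*12*Code1*TX*1234*RJ'
)

foreach ($file in Get-ChildItem -Path $folder -Filter '*.txt' -File) {
  # The parentheses force the whole file to be read before it is overwritten
  $newLines = (Get-Content -Path $file.FullName) | ForEach-Object {
    if ($_.StartsWith('CD*')) {
      $fields = $_.Split('*')
      # Index 5 is the element after "Fax" in the sample (counting the segment name as index 0)
      if ($fields.Count -gt 5 -and $fields[5] -eq 'LM') { $fields[5] = 'ET' }
      $fields -join '*'
    }
    else {
      $_
    }
  }
  Set-Content -Path $file.FullName -Value $newLines
}
```

Working by index means an LM value in some other position of the CD segment is left alone, which the plain '*LM*' text replacement cannot guarantee.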
You could capture everything that comes before the match in group 1, and then match LM.
In the replacement, use $1ET
^(CD\*(?:[^*\r\n]+\*){4})LM\b
Regex demo
If you don't want to match LM literally, you could also match any char other than * or a newline.
^(CD\*(?:[^*\r\n]+\*){4})[^*\r\n]+\b
Replace example
$allText = Get-Content -Raw file.txt
$allText -replace '(?m)^(CD\*(?:[^*\r\n]+\*){4})LM\b','$1ET'
Output
AB*00*Name1First*Name1Last*test
BC*JCB*P1*Church St*Texas
CD*02*83*XY*Fax*ET*KY
EF*12*Code1*TX*1234*RJ

Select String From Text File and Create variable

I have a text file containing a string that I need to store in a variable: the value of "file". How can I capture it? For the sample below, the value is "\\APPSRV\I\Run\OPTI\CLIENT\20171031\25490175\Data\brtctybv\". This data will change per file, but it will retain the same format: it will start with \\ and end with \
Example Text File
order_id = 25490175-brtctybv
file = \\APPSRV\I\Run\OPTI\CLIENT\20171031\25490175\Data\brtctybv\
copies = 1
volume = 20171031-brtctybv
label = \\domain.com\prodmaster\jobs\OPTI\CLIENT\Cdlab\somefile.file
merge = \\APPSRV\I\Run\OPTI\CLIENT\20171031\25490175\mrg\25490175-brtctybv.MRG
FIXATE = NOAPPEND
$file = ((Get-Content -path file.txt) | Select-String -pattern "^file\s*=\s*(\\\\.*\\)").matches.groups[1].value
$file
See Regex Demo to see the regex in action. The .matches.groups[1].value is grabbing the value of capture group 1. The capture group is created by the () within the pattern. See Select-String for more information about the cmdlet.
Regexes are powerful, but complex; sometimes there are conceptually simpler alternatives:
PS> ((Get-Content -Raw file.txt).Replace('\', '\\') | ConvertFrom-StringData).file
\\APPSRV\I\Run\OPTI\CLIENT\20171031\25490175\Data\brtctybv\
The ConvertFrom-StringData cmdlet is built for parsing key-value pairs separated by =
\ in the values is interpreted as an escape character, however, hence the doubling of \ in the input file with .Replace('\', '\\').
The result is a hash table (type [hashtable]); Get-Content -Raw, which reads the input file as a single string, is used to ensure that a single hash table is output. Accessing its file key retrieves the associated value.
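As a self-contained illustration of the same idea, the sketch below parses an in-memory copy of the first few sample lines instead of a file; the variable names are made up. It also shows that the other keys are available from the same hash table.

```powershell
# In-memory stand-in for the first lines of the sample file
$sample = @'
order_id = 25490175-brtctybv
file = \\APPSRV\I\Run\OPTI\CLIENT\20171031\25490175\Data\brtctybv\
copies = 1
'@

# Double every backslash first, because ConvertFrom-StringData treats \ as an escape character
$settings = $sample.Replace('\', '\\') | ConvertFrom-StringData

$file = $settings.file
$orderId = $settings['order_id']
```

All values come back as strings, so a numeric entry such as copies would need an explicit cast like [int] $settings.copies.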

Adding a newline (line break) to a Powershell script

I have a script I am running in PowerShell, and I want to be able to put a line in my resulting text file output between the script name and the script content itself.
Currently, from the below, the line $str_msg = $file,[System.IO.File]::ReadAllText($file.FullName) is what I need, but I need a line to separate $file and the result of the next expression. How can I do this?
foreach ($file in [System.IO.Directory]::GetFiles($sqldir,"*.sql",
[System.IO.SearchOption]::AllDirectories))
{
$file = [System.IO.FileInfo]::new($file);
$Log.SetLogDir("");
$str_msg = $file,[System.IO.File]::ReadAllText($file.FullName);
$Log.AddMsg($str_msg);
Write-Output $str_msg;
# ...
}
$str_msg = $file,[System.IO.File]::ReadAllText($file.FullName) doesn't create a string, it creates a 2-element array ([object[]]), composed of the $file [System.IO.FileInfo] instance, and the string with the contents of that file.
Presumably, the .AddMsg() method expects a single string, so PowerShell stringifies the array in order to convert it to a single string; PowerShell stringifies an array by concatenating the elements with a single space as the separator by default; e.g.:
[string] (1, 2) yields '1 2'.
Therefore, it's best to compose $str_msg as a string to begin with, with an explicit newline as the separator, e.g.:
$strMsg = "$file`r`n$([System.IO.File]::ReadAllText($file.FullName))"
Note the use of escape sequence "`r`n" to produce a CRLF, the Windows-specific newline sequence; on Unix-like platforms, you'd use just "`n" (LF).
.NET offers a cross-platform abstraction, [Environment]::NewLine, which returns the platform-appropriate newline sequence (which you could alternatively embed as $([Environment]::NewLine) inside "...").
An alternative to string interpolation is to use -f, the string-formatting operator, which is based on the .NET String.Format() method:
$strMsg = '{0}{1}{2}' -f $file,
[Environment]::NewLine,
[System.IO.File]::ReadAllText($file.FullName)
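The difference between the stringified array and an explicitly joined string can be seen in a small sketch; the file name and content here are placeholders, not values from the question.

```powershell
# Placeholder values standing in for $file and the file's content
$name = 'C:\scripts\demo.sql'
$content = 'SELECT 1;'

# Casting an array to [string] joins the elements with a single space by default
$arrayMsg = [string] ($name, $content)

# Joining explicitly puts the platform-appropriate newline between the two parts
$lineMsg = ($name, $content) -join [Environment]::NewLine
```

Here $arrayMsg is a single line with a space between the parts, whereas $lineMsg has the name and the content on separate lines.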
Backtick-r+backtick-n will do a carriage return with a new line in PS. You could do a Get-Content of your $file variable as a new array variable, and insert the carriage return at a particular index:
Example file: test123.txt
If the file contents were this:
line1
line2
line3
Store the contents in an array variable so you have indices
[Array]$fileContent = Get-Content C:\path\to\test123.txt
To add a carriage return between line2 and line3:
$fileContent2 = $fileContent[0..1] + "`r`n" + $fileContent[2]
Then output a new file:
$fileContent2 | Out-File -FilePath C:\path\to\newfile.txt
You need to use the carriage return powershell special character, which is "`r".
Use it like this to add a carriage return in your line :
$str_msg = $file,"`r",[System.IO.File]::ReadAllText($file.FullName);
Check this documentation for more details on PowerShell special characters.

Remove all carriage return and line feed from file

Last week I have asked you guys to replace a string with newline character with .bat script. I have realized that my file has some carriage return and newline characters already, which I need to remove first and then do the replace.
to replace '#####' with linefeed I am using the line below.
(gc $Source) -replace "#####", "`r`n"|set-content $Destination
So I tried to implement the same logic to replace \r and \n as well, however it did not work.
(gc $Source) -replace "`n", ""|set-content $Destination
my file looks like :
abc|d ef|123#####xyz|tuv|567#####
and I need to make it look like
abc|def|123 xyz|tuv|567
like I said, replacing the row delimiter character with new line works, but I need to remove all cr and lf characters first before I do that.
For small files the script below works, but my file is >1.5GB and it throws an OutOfMemoryException error
param
(
[string]$Source,
[string]$Destination
)
echo $Source
echo $Destination
$Writer = New-Object IO.StreamWriter $Destination
$Writer.Write( [String]::Join("", $(Get-Content $Source)) )
$Writer.Close()
Use the function below to remove the special characters. Put whichever characters you want to remove in $SpecChars and call the function with the text data as a parameter.
Function Convert-ToFriendlyName
{param ($Text)
# Unwanted characters (includes spaces and '-') converted to a regex:
#Whatever characters you want to remove, put it here with comma separation.
$SpecChars = '\', ' ','\\','-'
$remspecchars = [string]::join('|', ($SpecChars | % {[regex]::escape($_)}))
# Convert the text given to correct naming format (Uppercase)
$name = (Get-Culture).textinfo.totitlecase("$Text".tolower())
# Remove unwanted characters
$name = $name -replace $remspecchars, ""
$name
}
Hope it helps...!!!
This is vbscript. Windows isn't consistent. Mostly it breaks on CR and removes LF (all inbuilt programming languages). But Edit controls (ie Notepad) break on LF and ignore CR (unless preceding a LF).
Set Inp = WScript.Stdin
Set Outp = Wscript.Stdout
Do Until Inp.AtEndOfStream
Text = Inp.readall
Text = Replace(Text, vbcr, "")
Text = Replace(Text, vblf, "")
Text = Replace(Text, "#####", vblf)
outp.write Text
Loop
This uses redirection of StdIn and StdOut.
Filtering the output of a command
YourProgram | Cscript //nologo script.vbs > OutputFile.txt
Filtering a file
Cscript //nologo script.vbs < InputFile.txt > OutputFile.txt
See my CMD cheat sheet about the Windows command line, under "Command to run a .bat file".
So this removes the line endings in win.ini and prints the resulting single-line win.ini to the screen.
cscript //nologo "C:\Users\David Candy\Desktop\Replace.vbs" < C:\windows\win.ini
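For the >1.5GB case from the question, a streaming approach in PowerShell might avoid holding the whole file in memory. The following is only a sketch under stated assumptions: the temp-file paths and sample content are invented, the input is assumed to contain line breaks often enough that single lines fit in memory, and full paths are used because .NET resolves relative paths against the process working directory rather than the PowerShell location.

```powershell
# Hypothetical demo files (stand-ins for the real $Source and $Destination)
$Source = Join-Path ([System.IO.Path]::GetTempPath()) 'crlf-demo-in.txt'
$Destination = Join-Path ([System.IO.Path]::GetTempPath()) 'crlf-demo-out.txt'
[System.IO.File]::WriteAllText($Source, "abc|d`r`nef|123###`n##xyz|tuv|567#####")

$reader = [System.IO.StreamReader]::new($Source)
$writer = [System.IO.StreamWriter]::new($Destination)
try {
  $carry = ''
  # ReadLine() strips CR, LF and CRLF, so unwanted line breaks vanish here
  while ($null -ne ($line = $reader.ReadLine())) {
    $chunk = ($carry + $line).Replace('#####', "`r`n")
    # Hold back up to 4 trailing '#' in case a delimiter is split across a line break
    $keep = 0
    while ($keep -lt 4 -and $keep -lt $chunk.Length -and
           $chunk[$chunk.Length - 1 - $keep] -eq '#') { $keep++ }
    $writer.Write($chunk.Substring(0, $chunk.Length - $keep))
    $carry = $chunk.Substring($chunk.Length - $keep)
  }
  # Flush whatever was held back at the end of the file
  $writer.Write($carry)
}
finally {
  $reader.Dispose()
  $writer.Dispose()
}
```

Only one line plus a few held-back characters are in memory at any time, which is the design choice that differs from joining the whole Get-Content output into one string.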