I am trying to create a Word document from a text file, but the text file contains page breaks. I want to remove or replace those page breaks so that I can add page breaks to the Word document only where I actually need them.
Below is the PowerShell code I tried for replacing the page break in the text file. It doesn't work: using "`f" in place of the page break has no effect.
$oldWord = "`fPage"
$newWord = "Page"
#-- replace the page breaks in the file
(Get-Content $inputFilePath) -replace '$oldWord', '$newWord' | Set-Content $inputFilePath
The symbol shown for pageBreak in the text editor UltraEdit is ♀
Replacing the character in UltraEdit is easy. I want to replace or remove this using Powershell.
Below is a related question, but it is still unanswered with regard to PowerShell code.
How to remove unknown line break (special character) in text file?
For page breaks, you can use:
[io.file]::ReadAllText( 'H:\oldFile.txt') | %{$_.replace("`f","")} >h:\newFile.txt
The snippet below works from PowerShell v3 onward:
cat H:\oldFile.txt -raw | %{$_.replace("`f","")} >h:\newFile.txt
Thanks for the question! This one was interesting.
So the form-feed special character is a bugger in PowerShell. If you echo it out, you just get an odd character, or a square if it cannot be displayed. But if you copy and paste it back into the PowerShell terminal, it just moves your command entry point to the top of the screen. Odd.
What I did was try to find ways of replacing general special characters. You can use regexes in PowerShell with $oldWord -replace 'REGEX_GOES_HERE', 'THING_TO_REPLACE_WITH_HERE', so what I came up with is this:
$oldWord -replace '[\f]', '' #You can also use \r for carriage return, \n for new line, \t for tab, \s for ALL whitespace
This will simply remove all instances of the Form-Feed character.
Hope this helps! Cheers!
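As a minimal sketch of the approaches above: the original attempt cannot match because single-quoted '$oldWord' is passed to -replace literally instead of being expanded to the variable's value. Reading the file as one string and replacing \f avoids that. The sample file name and its contents are hypothetical, created in the temp folder purely for illustration.

```powershell
# Hypothetical sample file containing one form feed (assumption for illustration)
$inputFilePath = Join-Path ([IO.Path]::GetTempPath()) 'formfeed-demo.txt'
[IO.File]::WriteAllText($inputFilePath, "Page 1`r`n`fPage 2`r`n")

# Read the whole file as one string, strip every form feed, write it back
$text  = [IO.File]::ReadAllText($inputFilePath)
$clean = $text -replace '\f', ''
[IO.File]::WriteAllText($inputFilePath, $clean)

$clean.Contains("`f")   # False
```

If variables are preferred for the pattern, pass them unquoted or in double quotes, for example `$text -replace $oldWord, $newWord`.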
Related
I am trying to remove all lines of text after a single line of text, "[info]". Here is an example:
Top=1266
[info]
name=tod
space=456
number=221,441,111,0
[version]
version=1
I only need the top, the other text will be replaced later on in the script. Here is all that I have tried
$Content -replace '\[Info\]*',''
This only removes the [info] line and nothing after it. I have tried to loop, but I can't seem to find the line with a Where-Object search.
What is a quick and easy way to remove all lines of code after a single line of set text?
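To let the pattern span line breaks, add (?s) so that . also matches newline characters. $Content must be a single string (for example, read with Get-Content -Raw) rather than an array of lines.
$Content -replace '(?s)\[Info\].*'
You also need .* to match everything after [Info]; with (?s) it runs to the end of the text. The replacement operand is optional: since you're replacing the match with nothing, you can simply omit it.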
Read more about regular expressions in PowerShell
https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_regular_expressions?view=powershell-7.1
and operators
https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_operators?view=powershell-7.1
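A small sketch of the (?s) approach, assuming the file has already been read into a single string. The literal text below stands in for the output of Get-Content with -Raw and is illustrative only.

```powershell
# Sample text standing in for (Get-Content $path -Raw); the contents are illustrative
$Content = "Top=1266`n[info]`nname=tod`nspace=456`n[version]`nversion=1"

# (?s) lets . match newlines, so .* consumes everything from [info] to the end.
# -replace is case-insensitive by default, so [Info] would match as well.
$top = ($Content -replace '(?s)\[info\].*').TrimEnd()

$top   # Top=1266
```

TrimEnd() is only there to drop the line break that preceded [info]; leave it out if the trailing newline should stay.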
I've been messing around with PowerShell and googling various things as I go along. This one is a little hard to put into words that Google would understand. I can get the individual lines of a text file in PowerShell by indexing:
$textFile = Get-Content "myText.txt"
$textFile[0]
This outputs the first line of the text file. But when I put the variable in quotes, it outputs all lines, even with the index:
"$textFile[0]"
How can I get only the line I want while wrapping the variable in quotes? If I try "$textFile"[0], it just gives me the whole file as before. The reason I'm trying to do this is that I want to make that one line of the text file part of a bigger string that I can execute:
$remote = "Enter-PSSession -ComputerName`", textFile[0]"
Invoke-Expression $remote
This is my way of illustrating what I'm trying to do.
You can use any of the following methods:
# Sub-expression operator
"Some Text $($textFile[0])"
# String format operator
"My Text {0}" -f $textFile[0]
# Concatenation
("Text"+$textFile[0])
Surrounding double quotes tell PowerShell to expand the string inside, so any variables within are interpolated. A variable reference begins with $, and the name that follows can contain only certain characters without special escaping. [ is not one of them, so PowerShell treats the variable name as ending just before the [. Therefore $textFile is interpolated (all lines of the file are joined into one string), and the literal text [0] is appended to the end.
You can see details of the operators at About_Operators.
See About_Variables for how to create a variable including cases with special characters even if that doesn't directly apply here.
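The difference can be seen in a short sketch. The two-element array is a stand-in for the Get-Content output, and the server names are hypothetical.

```powershell
# Stand-in for: $textFile = Get-Content "myText.txt" (names are hypothetical)
$textFile = @('server01', 'server02')

# Plain interpolation stops at the variable name; [0] stays literal text
$whole = "$textFile[0]"                 # server01 server02[0]

# The sub-expression operator evaluates the index before interpolating
$line = "$($textFile[0])"               # server01

# The format operator gives the same result without nesting quotes
$fmt = "Name: {0}" -f $textFile[0]      # Name: server01
```

For the remoting case in the question, the indexed value can probably be passed directly as an argument (`Enter-PSSession -ComputerName $textFile[0]`), which avoids building a string for Invoke-Expression at all.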
I want to replace the lines between two strings [REPORT] and [TAGS]. File looks like this
Many lines
many lines
they remain the same
[REPORT]
some text
some more text412
[TAGS]
text that I Want
to stay the same!!!
I used sed within cygwin:
sed -e '/[REPORT]/,/[TAGS]/c\[REPORT]\nmy text goes here\nAnd a new line down here\n[TAGS]' minput.txt > moutput.txt
which gave me this:
Many lines
many lines
they remain the same
[REPORT]
my text goes here
And a new line down here
[TAGS]
text that I Want
to stay the same!!!
When I do this and open the output file in Notepad, it doesn't show the new lines. I assume this is a line-ending formatting issue that a simple Dos2Unix-style conversion should resolve.
But because of this, and mainly because not all of my colleagues have access to cygwin, I was wondering if there's a way to do this in cmd (or PowerShell if there is no way to do it in a batch file).
Eventually, I want to run this on a number of files and change this section of them (between the two aforementioned markers) to the text that I am providing.
Use PowerShell, present from Windows 7 on.
## Q:\Test\2018\10\30\SO_53073481.ps1
## defining variable with a here string
$Text = @"
Many lines
many lines
they remain the same
[REPORT]
some text
some more text412
[TAGS]
text that I Want
to stay the same!!!
"@
$Text -Replace "(?sm)(?<=^\[REPORT\]`r?`n).*?(?=`r?`n\[TAGS\])",
"my text goes here`nAnd a new line down here"
The -replace regular expression uses non-consuming lookarounds, so the [REPORT] and [TAGS] marker lines and the line breaks next to them are left untouched.
Sample output:
Many lines
many lines
they remain the same
[REPORT]
my text goes here
And a new line down here
[TAGS]
text that I Want
to stay the same!!!
To read the text from a file, replace, and write back (even without storing it in a variable), you can use:
(Get-Content ".\file.txt" -Raw) -Replace "(?sm)(?<=^\[REPORT\]`r?`n).*?(?=`r?`n\[TAGS\])",
"my text goes here`nAnd a new line down here" |
Set-Content ".\file.txt"
The parentheses are necessary to reuse the same file name in one pipeline: they make Get-Content finish reading before Set-Content opens the file for writing.
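Since the question mentions running this over a number of files, here is a hedged sketch of one way to loop the same replacement. The folder name, the sample file, and the use of -NoNewline (which assumes PowerShell 5 or later) are assumptions for illustration, not part of the original answer.

```powershell
# Hypothetical demo folder with one sample file (assumption for illustration)
$dir = Join-Path ([IO.Path]::GetTempPath()) 'report-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$sample = Join-Path $dir 'a.txt'
[IO.File]::WriteAllText($sample, "keep`r`n[REPORT]`r`nold text`r`n[TAGS]`r`nkeep too`r`n")

# Same lookaround idea as above: only the text between the marker lines is matched
$pattern = '(?sm)(?<=^\[REPORT\]\r?\n).*?(?=\r?\n\[TAGS\])'
$newText = "my text goes here`r`nAnd a new line down here"

Get-ChildItem -Path $dir -Filter *.txt | ForEach-Object {
    $updated = (Get-Content -LiteralPath $_.FullName -Raw) -replace $pattern, $newText
    # -NoNewline keeps Set-Content from appending an extra line break
    Set-Content -LiteralPath $_.FullName -Value $updated -NoNewline
}

$result = [IO.File]::ReadAllText($sample)
```

Using `r`n in the replacement keeps the line endings consistent for Notepad; adjust it if the files use bare line feeds.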
Set Inp = WScript.Stdin
Set Outp = Wscript.Stdout
Set regEx = New RegExp
regEx.Pattern = "\n"
regEx.IgnoreCase = True
regEx.Global = True
Outp.Write regEx.Replace(Inp.ReadAll, vbcrlf)
To use
cscript //nologo "C:\Folder\Replace.vbs" < "C:\Windows\Win.ini" > "%userprofile%\Desktop\Test.txt"
So you can use your RegEx.
I need to output the content of a powershell variable to the clipboard, preserving all the newline characters except for the last -trailing- one.
At the moment I am just piping the output of a variable readout to clip.exe, but that gives a trailing newline.
$Text = "line1`nline2"
$Text | clip.exe
gives the following:
"line1,
line2
"
I would like it to output
"line1,
line2"
How might I achieve this?
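Piping a string to a native program such as clip.exe can result in PowerShell adding a trailing newline. You can use Set-Clipboard instead, which should avoid the newline issue.
You can also use the .NET option:
Add-Type -AssemblyName System.Windows.Forms   # load the assembly that defines the Clipboard class
[System.Windows.Forms.Clipboard]::SetText("line1`r`nline2")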
Get-Content $user| Foreach-Object{
$user = $_.Split('=')
New-Variable -Name $user[0] -Value $user[1]}
I'm trying to write a script that splits a text file into an array, with one element per line.
What should I change the "=" sign to?
It depends on the exact encoding of the textfile, but [Environment]::NewLine usually does the trick.
"This is `r`na string.".Split([Environment]::NewLine)
Output:
This is
a string.
The problem with the String.Split method is that it splits on each character in the given string. Hence, if the text file has CRLF line separators, you will get empty elements.
Better solution, using the -Split operator.
"This is `r`na string." -Split "`r`n" #[Environment]::NewLine, if you prefer
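The difference between the two approaches can be checked with a short sketch. The explicit [char[]] cast is an addition here, used to force the character-wise overload so the behaviour is the same across PowerShell versions.

```powershell
$text = "This is`r`na string."

# Splitting on the individual characters CR and LF leaves an empty element between them
$byChars = $text.Split([char[]]"`r`n")
$byChars.Count    # 3

# The -split operator treats its right-hand side as one regex, so CRLF is a single delimiter
$byRegex = $text -split "`r`n"
$byRegex.Count    # 2

# \r?\n also accepts bare LF line endings
$either = "a`nb`r`nc" -split '\r?\n'
$either.Count     # 3
```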
You can use the String.Split method to split on CRLF and not end up with the empty elements by using the Split(String[], StringSplitOptions) method overload.
There are a couple different ways you can use this method to do it.
Option 1
$input.Split([string[]]"`r`n", [StringSplitOptions]::None)
This will split on the combined CRLF (Carriage Return and Line Feed) string represented by `r`n. The [StringSplitOptions]::None option will allow the Split method to return empty elements in the array, but there should not be any if all the lines end with a CRLF.
Option 2
$input.Split([Environment]::NewLine, [StringSplitOptions]::RemoveEmptyEntries)
This splits on either a carriage return or a line feed, which on its own would leave empty elements interspersed with the actual strings. The [StringSplitOptions]::RemoveEmptyEntries option instructs the Split method not to include those empty elements.
The answers given so far consider only Windows as the running environment. If your script needs to run in a variety of environments (Linux, Mac and Windows), consider using the following snippet:
$lines = $input.Split(
@("`r`n", "`r", "`n"),
[StringSplitOptions]::None)
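A sketch of the cross-platform split on a string with mixed line endings, shown both with the -split operator and with the method overload. The variable name $text is used instead of $input here because $input is also an automatic variable in PowerShell; that renaming is an assumption of this example.

```powershell
# Mixed CRLF, LF, and CR line endings in one illustrative string
$text = "one`r`ntwo`nthree`rfour"

# Regex alternation: CRLF is listed first so it is consumed as one delimiter
$viaOperator = $text -split '\r\n|\r|\n'

# Same idea with String.Split; the explicit [string[]] cast selects the string-array overload
$viaMethod = $text.Split([string[]]@("`r`n", "`r", "`n"), [StringSplitOptions]::None)

$viaOperator.Count   # 4
$viaMethod.Count     # 4
```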
There is a simple and unusual way to do this.
$lines = [string[]]$input
This will split $input like:
$input.Split(@("`r`n", "`n"))
This is undocumented, at least in the docs for Conversions.
Beware, this will not remove empty entries.
It also doesn't work for carriage return (\r) line endings, at least on Windows.
Tested in PowerShell 7.2.
This article also explains a lot about how it works with carriage return and line ends. https://virot.eu/powershell-and-newlines/
Having had some issues with additional empty lines and such, I found the solution by understanding the issue. Excerpt from virot.eu:
So what makes up a new line. Here comes the tricky part, it depends.
To understand this we need to go to the line feed the character.
Line feed is the ASCII character 10. It in most programming languages
escaped by writing \n, but in powershell it is `n. But Windows is not
content with just one character, Windows also uses carriage return
which is ASCII character 13. Escaped \r. So what is the difference?
Line feed advances the pointer down one row and carriage return
returns it to the left side again. If you store a file in Windows by
default are linebreaks are stored as first a carriage return and then
a line feed (\r\n). When we aren’t using any parameters for the
split() command it will split on all white-space characters, that is
both carriage return, linefeed, tabs and a few more. This is why we
are getting 5 results when there is both carriage return and line
feeds.