Split until end of line in PowerShell

I'm trying to write a PowerShell script that adds a hyphen after every two characters in a text file. I have it mostly working, but I am facing an issue.
Code:
$file = get-content .\textfile.txt
($file -split "([a-z0-9]{2})" | ?{ $_.length -ne 0 }) -join "-" | Set-Content .\textfile.txt
If I have values like the ones below in a .txt file:
000000000000
111111111111
The output comes out like this:
00-00-00-00-00-00-11-11-11-11-11-11
I need output like this:
00-00-00-00-00-00
11-11-11-11-11-11
Kindly suggest what I should change.

Get-Content removes all the newlines, and outputs strings to the pipeline, one for each line.
$file is an array of two strings, @('000000000000', '111111111111'). When you -split it, the split applies to both of the strings, and the result is @('00', '00', '00', '00', '00', '00', '11', '11', '11', '11', '11', '11'), so you can no longer tell where the lines start or end.
To fix it, you need to process each line separately:
(Get-Content .\textfile.txt) | ForEach-object {
($_ -split "([a-z0-9]{2})" |? { $_ }) -join "-"
} | Set-Content .\textfile.txt
Or change what you're doing to do a replace, that will work within the lines instead of merging them together:
(gc .\textfile.txt) -replace '([a-z0-9]{2})\B', '$1-' | sc textfile.txt
The \B only matches where there is no word boundary, so the last pair on a line, which is followed by the end of the line, does not get a - appended.
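To see the difference between the two operators without touching a file, here is a minimal sketch that works on an in-memory array. The $lines array is made up for illustration and stands in for what Get-Content would return.

```powershell
# Stand-in for (Get-Content .\textfile.txt): one string per line
$lines = @('000000000000', '111111111111')

# -split flattens every piece into a single array, so line boundaries are lost
$pieces = $lines -split '([a-z0-9]{2})' | Where-Object { $_ }

# -replace is applied to each string separately and returns one string per line
$hyphenated = $lines -replace '([a-z0-9]{2})\B', '$1-'

$pieces.Count    # number of two-character pieces across both lines
$hyphenated      # one hyphenated string per input line
```

Because $hyphenated still has one element per input line, piping it to Set-Content writes each line back separately.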

As answered on the TechNet Forums at https://social.technet.microsoft.com/Forums/en-US/e60e33d2-f065-4219-82cf-5797aaf10891/split-foreach-line?forum=winserverpowershell
This will add a dash (-) after every two characters in a line. Since it will also do this at the end of a line, we trim it.
$inputFile = Get-Content -Path C:\temp\file.txt
$newFile = foreach ($line in $inputFile) {
($line -replace '(..)', '$1-').trim('-')
}
$newFile | Set-Content -Path C:\temp\file.txt

Related

Combine multiple Get-Content cmdlet in powershell

I am a beginner with PowerShell and would like to use it to automate some file editing. Below is my current work to make an "OR"-delimited string:
$inputFile = "C:\Users\David Kao\Desktop\Powershell\Input\input.txt"
$outputFile = "C:\Users\David Kao\Desktop\Powershell\Output\output.txt"
$final = "C:\Users\David Kao\Desktop\Powershell\Output\final.txt"
(Get-Content $inputFile ) -replace ' \[.*','' -replace ' \(.*','' -replace ';','' -replace ',','' -replace '- ',''|
Where-Object { $_ -notmatch '[^\p{IsBasicLatin}]' }|
Sort-Object -Unique |
Set-Content $outputFile
(Get-Content $outputFile) -join '/' -replace '/','" OR "' -replace '\t" OR ', '' -replace '$', '"'|
Set-Content $final
Here is the sample data:
multiple sclerosis [A0484253/AOD/DE/0000006106]
ms [A1145632/BI/AB/BI00548]
multiple sclerosis [A0484254/BI/PT/BI00548]
MS [A0432904/CCPSS/PT/0056346]
MULTIPLE SCLEROSIS [A0433042/CCPSS/PT/0037395]
Multiple sclerosis [A0436411/CCS/MD/6.2.2]
Multiple sclerosis [A0436412/CCS/SD/80]
Multiple sclerosis [A31482484/CCSR_10/SD/NVS005]
disseminated sclerosis [A18685620/CHV/SY/0000008328]
insular sclerosis [A18685621/CHV/SY/0000008328]
MS [A18592794/CHV/SY/0000008328]
MS multiple sclerosis [A18685622/CHV/SY/0000008328]
multiple sclerosis [A18611430/CHV/SY/0000008328]
multiple sclerosis (MS) [A18555705/CHV/PT/0000008328]
multiple sclerosis MS [A18574147/CHV/SY/0000008328]
And the output would be something like:
"multiple sclerosis" OR "MS" OR "insular sclerosis" OR ....
So far this code works well, but I would like to get rid of the second section and fold it into the first one to make the script more concise, efficient and professional. Something like this:
$inputFile = "C:\Users\helloworld\Desktop\Powershell\Input\input.txt"
$outputFile = "C:\Users\helloworld\Desktop\Powershell\Output\output.txt"
(Get-Content $inputFile ) -replace ' \[.*','' -replace ' \(.*','' -replace ';','' -replace ',','' -replace '- ',''|
Where-Object { $_ -notmatch '[^\p{IsBasicLatin}]' }|
Sort-Object -Unique |
(Get-Content $_) -join '/' -replace '/','" OR "' -replace '\t" OR ', '' -replace '$', '"'|
Set-Content $outputFile
I have googled a lot about this issue but got stuck for a while.
Can anyone help me out?
Thanks!
I am now adding ForEach-Object:
$inputFile = "C:\Users\David Kao\Desktop\Powershell\Input\input.txt"
$outputFile = "C:\Users\David Kao\Desktop\Powershell\Output\output.txt"
(Get-Content $inputFile ) -replace ' \[.*','' -replace ' \(.*','' -replace ';','' -replace ',','' -replace '- ',''|
Where-Object { $_ -notmatch '[^\p{IsBasicLatin}]' }|
Sort-Object -Unique |
ForEach-Object { $_ -join '/' -replace '/','" OR "' -replace '\t" OR ', '' -replace '$', '"'}|
Set-Content $outputFile
But it seems that -join does not work inside ForEach-Object. If I can fix this, then I think everything would be fine.
I would do the following, which assumes all spaces are a single space and names will only contain alpha characters and spaces.
$inputfile = Get-Content inputfile.txt
($inputfile -replace '(^[a-z][a-z ]*) [^a-z].*$','"$1"' -ne '' |
sort -unique) -join ' OR ' | Set-Content output.txt
See Regex for matching explanation. Capture group 1 is everything matched within the first () grouping. It is substituted in the replace string as $1.
-ne '' is to remove solitary blank lines. If there are other spaces on those lines you may need an additional -notmatch '^\s*$'.
If you have a custom definition of uniqueness, then the code will need to be altered slightly rather than doing just sort -unique.
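As a rough illustration of the capture group and of why the join has to happen outside the pipeline, here is a small sketch on an in-memory array. The $sample array is a made-up subset of the data above, not the real input file.

```powershell
# Made-up subset of the sample data, including one empty line
$sample = @(
    'multiple sclerosis [A0484253/AOD/DE/0000006106]',
    'MS [A0432904/CCPSS/PT/0056346]',
    '',
    'multiple sclerosis (MS) [A18555705/CHV/PT/0000008328]'
)

# Capture group 1 keeps the leading letters and spaces; the rest of the line is dropped.
# -ne '' then filters out the empty string left by the blank line.
$quoted = $sample -replace '(^[a-z][a-z ]*) [^a-z].*$', '"$1"' -ne ''

# Inside ForEach-Object, $_ is a single string, so -join there has nothing to combine.
# Wrapping the pipeline in parentheses collects all items first, then joins them.
$joined = ($quoted | Sort-Object -Unique) -join ' OR '
$joined
```

The parentheses are the important part: -join only sees the whole collection once the pipeline inside them has finished.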
here's another way to do this ... [grin]
what it does ...
creates a set of sample data to work with
when ready to work with real data, replace the entire #region/#endregion block with a Get-Content call.
splits the lines on the open [
filters out the lines that have a closing ']'
that gets rid of the unwanted remainder from the split.
filters out any lines that contain 2-or-more consecutive spaces
this gets rid of those odd lines that have just 4 spaces on them.
trims away any leading or trailing spaces
sorts the items and removes any exact dupes
joins the items with double-quoted OR strings
assigns that to $Result
adds the required leading and trailing double quotes to the previous string
assigns that to $FinalResult
displays that last item on screen
the code ...
#region >>> fake reading in a plain text file
# in real life, use Get-Content
$InStuff = @'
multiple sclerosis [A0484253/AOD/DE/0000006106]
ms [A1145632/BI/AB/BI00548]
multiple sclerosis [A0484254/BI/PT/BI00548]
MS [A0432904/CCPSS/PT/0056346]
MULTIPLE SCLEROSIS [A0433042/CCPSS/PT/0037395]
Multiple sclerosis [A0436411/CCS/MD/6.2.2]
Multiple sclerosis [A0436412/CCS/SD/80]
Multiple sclerosis [A31482484/CCSR_10/SD/NVS005]
disseminated sclerosis [A18685620/CHV/SY/0000008328]
insular sclerosis [A18685621/CHV/SY/0000008328]
MS [A18592794/CHV/SY/0000008328]
MS multiple sclerosis [A18685622/CHV/SY/0000008328]
multiple sclerosis [A18611430/CHV/SY/0000008328]
multiple sclerosis (MS) [A18555705/CHV/PT/0000008328]
multiple sclerosis MS [A18574147/CHV/SY/0000008328]
'@ -split [System.Environment]::NewLine
#endregion >>> fake reading in a plain text file
# split on the open "["
$Result = (($InStuff -split '\[').
Where({
# filter out the lines that have a closing "]"
$_ -notmatch '\]' -and
# filter out the lines that have two-or-more consecutive spaces
$_ -notmatch '\s{2,}'
}).
# trim away any leading/trailing spaces
Trim() |
# sort the items and toss out any dupes.
# does a preliminary join with double-quoted OR strings
Sort-Object -Unique) -join '" OR "'
# adds leading and trailing double quotes
$FinalResult = '"{0}"' -f $Result
$FinalResult
output ...
"disseminated sclerosis" OR "insular sclerosis" OR "ms" OR "MS multiple sclerosis" OR "multiple sclerosis" OR "multiple sclerosis (MS)" OR "multiple sclerosis MS"

powershell replace command if line starts with a specific character

I have a text file that I would like to read and do some replacements in using PowerShell, but only if the line starts with a specific character.
Say I want to change every dash (-) to an 'x' if and only if the line starts with a y.
I tried using the command
(Get-Content trial.log2) | Foreach-Object {$_ -replace "-", 'x'} | Set-Content trial.log2
However, it replaces all occurrences of the dash, not only those on lines that start with a y.
Can this be also done if I want to have multiple find replace and string manipulation using one get content command?
I have another string manipulation but only if it starts with an F
If line starts with an F, then get first 4 characters of the line, then append 'NEW' then get the next characters from character 20 to 30.
if line starts with a y, then do a replace of - with an X.
$F=(get-content $file) -like 'F*'
(Get-Content $file) | Foreach-Object {
$_ -replace "^F.+", -join("$F".Substring(0,4), "$NEW3",
} | Set-Content trial.log2
Get-Content trial.log2 | ForEach-Object {
if ( $_ -match '^y' ) {
$_ -replace '-', 'X'
}
else {
$_
}
} | Set-Content trial.log3
However, if I do this, the text is written twice. I think there is something wrong with how I look for the line that starts with F.
Any help is appreciated. Thanks!
You can use a look-behind ((?<=pattern)) to assert that the preceding characters include a y following the start of the string:
(Get-Content trial.log2) | Foreach-Object {$_ -replace '(?<=^y.*)-','x'} | Set-Content trial.log2
How about something like:
Get-Content trial.log2 | ForEach-Object {
if ( $_ -match '^y' ) {
$_ -replace '-', 'x'
}
else {
$_
}
} | Out-File trial.log2.temp
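Both rules from the question can also be handled in a single pass. The following is only a sketch under stated assumptions: the $lines array is made up, 'NEW' is treated as a literal string, and "character 20 to 30" is read as 1-based positions 20 through 30, which is Substring(19, 11). F lines shorter than 30 characters are passed through unchanged.

```powershell
# Made-up sample lines standing in for (Get-Content trial.log2)
$lines = @(
    'y-1-2',
    'n-1-2',
    ('FABC' + ('-' * 15) + '0123456789X' + 'tail')
)

$result = foreach ($line in $lines) {
    switch -Regex ($line) {
        '^y' {
            # Lines starting with y: replace every dash
            $line -replace '-', 'x'
            break
        }
        '^F' {
            # Assumption: positions 20..30 (1-based) map to Substring(19, 11);
            # shorter lines are emitted unchanged to avoid an out-of-range error
            if ($line.Length -ge 30) {
                $line.Substring(0, 4) + 'NEW' + $line.Substring(19, 11)
            }
            else {
                $line
            }
            break
        }
        default { $line }   # every other line is emitted exactly once
    }
}
$result
```

Each input line produces exactly one output line, which avoids the duplicated text that comes from running two separate passes. Note that switch -Regex is case-insensitive by default; add -CaseSensitive if 'Y' and 'y' must be treated differently.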

Remove [ character from a text file with PowerShell

How can I remove the [ character from a text file?
My text file contains lines like this:
[Adminlogin] 172.16.48.131 Wednesday Jan102018 07:05:36
And I would like to remove the [.
When I run this,
$file = "MyFile.txt"
Get-Content $file | Foreach {$_ -replace "[", ""} | Set-Content "Myfile-1.txt"
I get an error
The regular expression pattern [ is not valid
However when I run this to remove the ],
$file = "MyFile.txt"
Get-Content $file | Foreach {$_ -replace "]", ""} | Set-Content "Myfile-1.txt"
It runs with no problem.
[ is a regular expression meta character, so you need to escape it.
The simplest way is to use:
{$_ -replace "\[", ""}
Or you can use the [Regex]::Escape($str) method. See blog post PowerShell Tip - Escape Regex MetaCharacters for a more detailed example.
Using single quotes works as well, as long as the bracket is still escaped for the regex engine:
Foreach {$_ -replace '\[', ""}
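If you would rather not hand-escape the character, a sketch like the following shows two alternatives on the sample line from the question: [regex]::Escape() builds the escaped pattern for you, and the string method .Replace() does a literal replacement with no regex involved. The variable names are made up for illustration.

```powershell
$line = '[Adminlogin] 172.16.48.131 Wednesday Jan102018 07:05:36'

# Let .NET escape the metacharacter before it reaches -replace
$viaEscape = $line -replace [regex]::Escape('['), ''

# String.Replace() treats its arguments literally, so no escaping is needed
$viaMethod = $line.Replace('[', '')

$viaEscape
$viaMethod
```

The closing ] works unescaped in the question because a lone ] with no opening [ before it is treated as a literal character by the .NET regex engine.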

Remove empty rows from csv in powershell [duplicate]

I know that I can use:
gc c:\FileWithEmptyLines.txt | where {$_ -ne ""} > c:\FileWithNoEmptyLines.txt
to remove empty lines. But how can I remove them with '-replace'?
I found a nice one liner here >> http://www.pixelchef.net/remove-empty-lines-file-powershell. Just tested it out with several blanks lines including newlines only as well as lines with just spaces, just tabs, and combinations.
(gc file.txt) | ? {$_.trim() -ne "" } | set-content file.txt
See the original for some notes about the code. Nice :)
This piece of code from Randy Skretka is working fine for me, but I had the problem, that I still had a newline at the end of the file.
(gc file.txt) | ? {$_.trim() -ne "" } | set-content file.txt
So I added finally this:
$content = [System.IO.File]::ReadAllText("file.txt")
$content = $content.Trim()
[System.IO.File]::WriteAllText("file.txt", $content)
You can use -match instead of -ne if you also want to exclude lines that contain only whitespace characters:
@(gc c:\FileWithEmptyLines.txt) -match '\S' | out-file c:\FileWithNoEmptyLines
Not specifically using -replace, but you get the same effect parsing the content using -notmatch and regex.
(get-content 'c:\FileWithEmptyLines.txt') -notmatch '^\s*$' > c:\FileWithNoEmptyLines.txt
To resolve this with RegEx, you need to use the multiline flag (?m):
((Get-Content file.txt -Raw) -replace "(?m)^\s*`r`n",'').trim() | Set-Content file.txt
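The pattern above expects Windows line endings. As a sketch, the variant below uses \r?\n so it also handles files with LF-only line endings; the $text string is made up and stands in for what Get-Content -Raw would return.

```powershell
# Stand-in for (Get-Content file.txt -Raw): one empty line and one whitespace-only line
$text = "a`n`n  `nb`n"

# (?m) makes ^ match at the start of every line; \r?\n accepts CRLF or LF
$cleaned = ($text -replace '(?m)^\s*\r?\n', '').Trim()

$cleaned
```

Lines that contain text after leading whitespace are kept with their indentation, because the pattern only matches when nothing but whitespace precedes the line break.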
If you actually want to filter blank lines from a file then you may try this:
(gc $source_file).Trim() | ? {$_.Length -gt 0}
You can't really do this by replacing: -replace has to replace SOMETHING with SOMETHING, and with an empty line you have neither.
This will remove empty lines or lines with only whitespace characters (tabs/spaces).
[IO.File]::ReadAllText("FileWithEmptyLines.txt") -replace '\s+\r\n+', "`r`n" | Out-File "c:\FileWithNoEmptyLines.txt"
(Get-Content c:\FileWithEmptyLines.txt) |
Foreach { $_ -Replace "Old content", " New content" } |
Set-Content c:\FileWithEmptyLines.txt;
file
PS /home/edward/Desktop> Get-Content ./copy.txt
[Desktop Entry]
Name=calibre
Exec=~/Apps/calibre/calibre
Icon=~/Apps/calibre/resources/content-server/calibre.png
Type=Application*
Start by getting the content of the file and trimming any whitespace from each line of the text document. That becomes the object passed to Where-Object, which goes through the array and keeps each member whose string length is greater than 0. The result is then written back over the content of the file you started with. It would probably be better to make a new file...
The last thing to do is read back the newly written file's content and see your awesomeness.
(Get-Content ./copy.txt).Trim() | Where-Object{$_.length -gt 0} | Set-Content ./copy.txt
Get-Content ./copy.txt
This removes trailing whitespace and blank lines from file.txt
PS C:\Users\> (gc file.txt) | Foreach {$_.TrimEnd()} | where {$_ -ne ""} | Set-Content file.txt
Get-Content returns a fixed-size array of lines. You can convert it to a mutable list and delete the necessary lines by index. You can get the particular indexes with a match. After that you can write the result to a file with Set-Content. With this approach you avoid the empty lines that -replace leaves behind when you replace something with "". Note that I don't guarantee perfect performance. I'm not a professional PowerShell developer.
# @() keeps the result an array even when the file has a single line
$fileLines = @(Get-Content $filePath)
# Take the first matching line only; Select-String may return several matches
$neccessaryLine = Select-String -Path $filePath -Pattern 'something' | Select-Object -First 1
if (-Not $neccessaryLine) { exit }
# LineNumber is 1-based, ArrayList indexes are 0-based
$neccessaryLineIndex = $neccessaryLine.LineNumber - 1
$updatedFileContent = [System.Collections.ArrayList]::new($fileLines)
$updatedFileContent.RemoveAt($neccessaryLineIndex)
$updatedFileContent | Set-Content $filePath
Set-Content -Path "File.txt" -Value (get-content -Path "File.txt" | Select-String -Pattern '^\s*$' -NotMatch)
This works for me, originally got the line from here and added Joel's suggested '^\s*$': Using PowerShell to remove lines from a text file if it contains a string

How to remove some words from all text files in a folder with PowerShell?

I have a situation where I need to remove some words from all text files in a folder.
I know how to do that for a single file, but I need to do it automatically for all text files in that folder. I have no idea how to do it in PowerShell.
The name of the files are random.
Please help.
This is the code
$txt = get-content c:\work\test\01.i
$txt[0] = $txt[0] -replace '-'
$txt[$txt.length - 1 ] = $txt[$txt.length - 1 ] -replace '-'
$txt | set-content c:\work\test\01.i
Basically, it just removes the "-" characters from the first line and the last line, but I need to do this for all files in the folder.
Get-ChildItem c:\yourfolder -Filter *.txt | Foreach-Object{
... your code goes here ...
... you can access the current file name via $_.FullName ...
}
Here is a full working example:
Get-ChildItem c:\yourdirectory -Filter *.txt | Foreach-Object{
(Get-Content $_.FullName) |
Foreach-Object {$_ -replace "what you want to replace", "what to replace it with"} |
Set-Content $_.FullName
}
Now for a quick explanation:
Get-ChildItem with a Filter: gets all items ending in .txt
1st ForEach-Object: will perform the commands within the curly brackets
Get-Content $_.FullName: reads the content of the current .txt file, using its full path
2nd ForEach-Object: will perform the replacement of text within the file
Set-Content $_.FullName: replaces the original file with the new file containing the changes
Important Note: -replace works with a regular expression, so if your search text contains any special characters they need to be escaped, for example with [regex]::Escape().
something like this ?
ls c:\temp\*.txt | %{ $newcontent=(gc $_) -replace "test","toto" |sc $_ }
$files = get-item c:\temp\*.txt
foreach ($file in $files){(Get-Content $file) | ForEach-Object {$_ -replace 'ur word','new word'} | Out-File $file}
I hope this helps.
Use Get-Childitem to filter for the files you want to modify. Per response to previous question "Powershell, like Windows, uses the extension of the file to determine the filetype."
Also:
You will replace ALL "-" with "" on the first and last lines, using what your example shows, IF you use this instead:
$txt[0] = $txt[0] -replace '-', ''
$txt[$txt.length - 1 ] = $txt[$txt.length - 1 ] -replace '-', ''
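Putting the pieces together, here is a minimal sketch that applies the first-line and last-line edit to every .txt file in a folder. To keep it self-contained it creates a throwaway folder and a made-up sample file under the temp directory; in real use, $folder would point at your own directory and the setup and cleanup lines would go away.

```powershell
# Demo setup (hypothetical): a temporary folder with one sample file
$folder = Join-Path ([System.IO.Path]::GetTempPath()) ('demo-' + [guid]::NewGuid())
New-Item -ItemType Directory -Path $folder | Out-Null
$sampleFile = Join-Path $folder 'a.txt'
Set-Content -Path $sampleFile -Value '-first-', 'mid-dle', '-last-'

Get-ChildItem -Path $folder -Filter *.txt | ForEach-Object {
    # @() keeps $txt an array even when the file has only one line
    $txt = @(Get-Content -Path $_.FullName)
    if ($txt.Count -gt 0) {
        $txt[0]  = $txt[0]  -replace '-', ''   # first line
        $txt[-1] = $txt[-1] -replace '-', ''   # last line
        $txt | Set-Content -Path $_.FullName
    }
}

# Read the file back to check the result, then remove the demo folder
$after = Get-Content -Path $sampleFile
$after
Remove-Item -Path $folder -Recurse -Force
```

Only the first and last lines are touched, so a dash in the middle of the file is left alone. The Count check skips empty files, where indexing into $txt would fail.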