How can I search the first line and the last line in a text file? - powershell

I need to only search the 1st line and last line in a text file to find a "-" and remove it.
How can I do it?
I tried select-string, but I don't know how to find the 1st and last line and only remove "-" from there.
Here is what the text file looks like:
% 01-A247M15 G70
N0001 G30 G17 X-100 Y-100 Z0
N0002 G31 G90 X100 Y100 Z45
N0003 ; --PART NO.: NC-HON.PHX01.COVER-SHOE.DET-1000.050
N0004 ; --TOOL: 8.55 X .3937
N0005 ;
N0006 % 01-A247M15 G70
Something like this?
$1 = Get-Content C:\work\test\01.I
$1 | select-object -index 0, ($1.count-1)

Ok, so after looking at this for a while, I decided there had to be a way to do this with a one liner. Here it is:
(gc "c:\myfile.txt") | % -Begin {$test = (gc "c:\myfile.txt" | select -first 1 -last 1)} -Process {if ( $_ -eq $test[0] -or $_ -eq $test[-1] ) { $_ -replace "-" } else { $_ }} | Set-Content "c:\myfile.txt"
Here is a breakdown of what this is doing:
First, the aliases for those not familiar. I only used them because the command is long enough as it is, so this helps keep things manageable:
gc means Get-Content
% means ForEach-Object
$_ is for the current pipeline value (this isn't an alias, but I thought I would define it since you said you were new)
Ok, now here is what is happening in this:
(gc "c:\myfile.txt") | --> Gets the content of c:\myfile.txt and sends it down the line
% --> Does a foreach loop (goes through each item in the pipeline individually)
-Begin {$test = (gc "c:\myfile.txt" | select -first 1 -last 1)} --> This is a begin block, it runs everything here before it goes onto the pipeline stuff. It is loading the first and last line of c:\myfile.txt into an array so we can check for first and last items
-Process {if ( $_ -eq $test[0] -or $_ -eq $test[-1] ) --> This runs a check on each item in the pipeline, checking if it's the first or the last item in the file
{ $_ -replace "-" } else { $_ } --> if it's the first or last, it does the replacement, if it's not, it just leaves it alone
| Set-Content "c:\myfile.txt" --> This puts the new values back into the file.
Please see the following sites for more information on each of these items:
Get-Content uses
Get-Content definition
Foreach
The Pipeline
Begin and Process part of the Foreach (these are usually for custom functions, but they work in the foreach loop as well)
If ... else statements
Set-Content
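One caveat with the one-liner: it compares each line by value, so a middle line that happens to be identical to the first or last line would also lose its dashes. A minimal sketch of the same -Begin/-Process idea using a line counter instead, run on an in-memory array so it is safe to try (the sample strings and the variable names $i and $last are made up for illustration):

```powershell
# Sample data standing in for the file content
$lines = 'a-b', 'c-d', 'e-f'

$result = $lines | ForEach-Object -Begin {
    # Runs once before the first item: set up the counter and the last index
    $i = 0
    $last = $lines.Count - 1
} -Process {
    # Only the first and last positions get the replacement
    if ($i -eq 0 -or $i -eq $last) { $_ -replace '-' } else { $_ }
    $i++
}
$result
```

Because the check is positional, duplicate lines in the middle of the file are left alone.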
So I was thinking about what if you wanted to do this to many files, or wanted to do this often. I decided to make a function that does what you are asking. Here is the function:
function Replace-FirstLast {
[CmdletBinding()]
param(
[Parameter( `
Position=0, `
Mandatory=$true)]
[String]$File,
[Parameter( `
Position=1, `
Mandatory=$true)]
[ValidateNotNull()]
[regex]$Regex,
[Parameter( `
position=2, `
Mandatory=$false)]
[string]$ReplaceWith=""
)
Begin {
$lines = Get-Content $File
} #end begin
Process {
foreach ($line in $lines) {
if ( $line -eq $lines[0] ) {
$lines[0] = $line -replace $Regex,$ReplaceWith
} #end if
if ( $line -eq $lines[-1] ) {
$lines[-1] = $line -replace $Regex,$ReplaceWith
}
} #end foreach
}#End process
end {
$lines | Set-Content $File
}#end end
} #end function
This will create a command called Replace-FirstLast. It would be called like this:
Replace-FirstLast -File "C:\myfiles.txt" -Regex "-" -ReplaceWith "NewText"
The -ReplaceWith parameter is optional; if it is omitted, matches are simply removed (the default value is ""). The -Regex parameter takes the regular expression to match. For information on placing this into your profile, check this article.
Please note: If your file is very large (several GB), this isn't the best solution. It loads the whole file into memory, which could potentially cause other issues.

try:
$txt = get-content c:\myfile.txt
$txt[0] = $txt[0] -replace '-'
$txt[$txt.length - 1 ] = $txt[$txt.length - 1 ] -replace '-'
$txt | set-content c:\myfile.txt
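A hedged sketch of the same index-based approach wrapped in a function, assuming the file fits in memory. The function name Remove-DashFirstLast and the temp-file demo are hypothetical. The @() wrapper is there because Get-Content returns a plain string for a one-line file, and indexing a string would give you characters rather than lines:

```powershell
function Remove-DashFirstLast {
    param([Parameter(Mandatory = $true)][string]$Path)

    # @() guarantees an array even when the file has zero or one line
    $lines = @(Get-Content -LiteralPath $Path)
    if ($lines.Count -eq 0) { return }   # empty file: nothing to do

    $lines[0] = $lines[0] -replace '-'
    if ($lines.Count -gt 1) {
        $lines[-1] = $lines[-1] -replace '-'
    }
    $lines | Set-Content -LiteralPath $Path
}

# Demo on a throwaway file (lines borrowed from the question)
$demo = [System.IO.Path]::GetTempFileName()
Set-Content -LiteralPath $demo -Value '% 01-A247M15 G70', 'N0001 G30 G17 X-100 Y-100 Z0', 'N0006 % 01-A247M15 G70'
Remove-DashFirstLast -Path $demo
Get-Content -LiteralPath $demo
```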

You can use the select-object cmdlet to help you with this, since get-content basically spits out a text file as one huge array.
Thus, you can do something like this
get-content "path_to_my_awesome_file" | select -first 1 -last 1
To remove the dash after that, you can use the -replace operator to find the dash and remove it. This is better than using the System.String.Replace(...) method because it can match regular expressions and operate on whole arrays of strings too!
That would look like:
# gc = Get-Content. The parens tell Powershell to do whatever's inside of it
# then treat it like a variable.
(gc "path_to_my_awesome_file" | select -first 1 -last 1) -Replace '-',''

If your file is very large you might not want to read the whole file to get the last line. gc -Tail will get the last line very quickly for you.
function GetFirstAndLastLine($path){
return New-Object PSObject -Property @{
First = Get-Content $path -TotalCount 1
Last = Get-Content $path -Tail 1
}
}
GetFirstAndLastLine "u_ex150417.log"
I tried this on a 20 gb log file and it returned immediately. Reading the file takes hours.
You will still need to read the whole file if you want to keep all existing content and only remove something from the end. Using -Tail is a quick way to check whether it is there.
I hope it helps.
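If the file really is too big to load, one option is to stream it and write a cleaned copy. This is only a sketch under assumptions: the function name Remove-DashFirstLastStreaming is made up, it writes to a second file rather than editing in place, and the paths should be full paths because .NET resolves relative paths against the process directory, not the PowerShell location. It keeps one line buffered so it knows which line is the last:

```powershell
function Remove-DashFirstLastStreaming {
    param(
        [Parameter(Mandatory = $true)][string]$InPath,
        [Parameter(Mandatory = $true)][string]$OutPath
    )
    $reader = New-Object System.IO.StreamReader $InPath
    $writer = New-Object System.IO.StreamWriter $OutPath
    try {
        $previous = $reader.ReadLine()            # $null if the file is empty
        if ($null -ne $previous) {
            $previous = $previous -replace '-'    # first line
        }
        while ($null -ne ($current = $reader.ReadLine())) {
            $writer.WriteLine($previous)          # the buffered line was not the last one
            $previous = $current
        }
        if ($null -ne $previous) {
            # The buffer now holds the last line (already cleaned if it was also the first)
            $writer.WriteLine(($previous -replace '-'))
        }
    }
    finally {
        $writer.Close()
        $reader.Close()
    }
}

# Demo with throwaway files
$in  = [System.IO.Path]::GetTempFileName()
$out = [System.IO.Path]::GetTempFileName()
Set-Content -LiteralPath $in -Value '% 01-A247M15 G70', 'N0005 ; --TOOL', 'N0006 % 01-A247M15 G70'
Remove-DashFirstLastStreaming -InPath $in -OutPath $out
Get-Content -LiteralPath $out
```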

A cleaner answer to the above:
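# Read the file first so its length is known inside the loop
$Lines = @(Get-Content "path_to_ridiculously_excellent_file")
$Line_number_were_on = 0
$Awesome_file = $Lines | %{
$Line = $_
# Strip dashes only on the first and last lines
if ($Line_number_were_on -eq 0 -or $Line_number_were_on -eq ($Lines.Length - 1))
{ $Line -Replace '-','' }
else
{ $Line } ;
$Line_number_were_on++
}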
I like one-liners, but I find that readability tends to suffer sometimes when I put terseness over function. If what you're doing is going to be part of a script that other people will be reading/maintaining, readability might be something to consider.

Following Nick's answer: I do need to do this on all text files in the directory tree and this is what I'm using now:
Get-ChildItem -Path "c:\work\test" -Filter *.i | where { !$_.PSIsContainer } | % {
$txt = Get-Content $_.FullName;
$txt[0] = $txt[0] -replace '-';
$txt[$txt.length - 1 ] = $txt[$txt.length - 1 ] -replace '-';
$txt | Set-Content $_.FullName
}
and it looks like it's working well now.

Simple process:
Replace $file_txt with the path to your file:
Get-Content $file_txt | Select-Object -last 1

I was recently searching for comments in the last line of .bat files. It seems to mess up the error code of previous commands. I found this useful for searching for a pattern in the last line of files. Pspath is a hidden property that get-content outputs. If I used select-string, I would lose the filename. *.bat gets passed as -filter for speed.
get-childitem -recurse . *.bat | get-content -tail 1 | where { $_ -match 'rem' } |
select pspath
PSPath
------
Microsoft.PowerShell.Core\FileSystem::C:\users\js\foo\file.bat

Related

How to make changes to file content and save it to another file using powershell?

I want to do this
read the file
go through each line
if the line matches the pattern, do some changes with that line
save the content to another file
For now I use this script:
$file = [System.IO.File]::ReadLines("C:\path\to\some\file1.txt")
$output = "C:\path\to\some\file2.txt"
ForEach ($line in $file) {
if($line -match 'some_regex_expression') {
$line = $line.replace("some","great")
}
Out-File -append -filepath $output -inputobject $line
}
As you can see, here I write line by line. Is it possible to write the whole file at once ?
Good example is provided here :
(Get-Content c:\temp\test.txt) -replace '\[MYID\]', 'MyValue' | Set-Content c:\temp\test.txt
But my problem is that I have additional IF statement...
So, what could I do to improve my script ?
You could do it like that:
Get-Content -Path "C:\path\to\some\file1.txt" | foreach {
if($_ -match 'some_regex_expression') {
$_.replace("some","great")
}
else {
$_
}
} | Out-File -filepath "C:\path\to\some\file2.txt"
Get-Content reads a file line by line (array of strings) by default so you can just pipe it into a foreach loop, process each line within the loop and pipe the whole output into your file2.txt.
In this case an array or an ArrayList (lists are better for large collections) would be the most elegant solution. Simply add the strings to the list until the ForEach loop ends, then flush the list to a file.
This is Array List example
$file = [System.IO.File]::ReadLines("C:\path\to\some\file1.txt")
$output = "C:\path\to\some\file2.txt"
$outputData = New-Object System.Collections.ArrayList
ForEach ($line in $file) {
if($line -match 'some_regex_expression') {
$line = $line.replace("some","great")
}
[void]$outputData.Add($line) # Add() returns the new index; [void] keeps it out of the output
}
$outputData |Out-File $output
I think the if statement can be avoided in a lot of cases by using regular expression groups (e.g. (.*)) and placeholders (e.g. $1, $2 etc.).
As in your example:
(Get-Content .\File1.txt) -Replace 'some(_regex_expression)', 'great$1' | Set-Content .\File2.txt
And for the "good example", where [MYID] might be somewhere inline:
(Get-Content c:\temp\test.txt) -Replace '^(.*)\[MYID\](.*)$', '$1MyValue$2' | Set-Content c:\temp\test.txt
(see also How to replace first and last part of each line with powershell)
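The reason this works without an if is that -replace leaves non-matching strings untouched, so every line passes through and only the matching ones change. A small sketch on in-memory strings (the sample lines are invented for illustration):

```powershell
# Invented sample lines; only some of them match
$lines = 'id=[MYID];', 'nothing here', 'some_regex_expression'

# Lines without [MYID] come back unchanged
$a = $lines -replace '\[MYID\]', 'MyValue'

# Single quotes keep $1 literal so the regex engine sees the group placeholder
$b = $lines -replace 'some(_regex_expression)', 'great$1'

$a
$b
```

Note that this is not strictly equivalent to the original if: the original replaced every "some" in a line that matched the pattern, while the group version only rewrites the matched text itself.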

Cycling through multiple variables in for loop

I just started working with Powershell and this is my first script.
I am checking for 3 strings in the last 50 lines of a log file. I need to find all three strings and print an error message if any one of them is missing. I have written the following script, but it does not give me the expected results.
(Get-Content C:\foo\bar.log )[-1..-50] | Out-File C:\boom\shiva\log.txt
$PO1 = Get-Content C:\boom\shiva\log.txt | where {$_ -match "<Ping:AD_P01_RCV> ok"}
$PO2 = Get-Content C:\boom\shiva\log.txt | where {$_ -match "<Ping:AD_P02_SND> ok"}
$PO3 = Get-Content C:\boom\shiva\log.txt | where {$_ -match "<Ping:AD_P03_RCV> ok"}
I am satisfied with the above piece of code. The problem is with the part below. I don't want to use if-else three times. I am struggling to draft a for loop that saves space and still gives me the same result.
if (!$PO1)
{
"PO1 is critical"
}
else
{
"PO1 is OK"
}
if (!$PO2)
{
"PO2 is critical"
}
else
{
"PO2 is OK"
}
if (!$PO3)
{
"PO3 is critical"
}
else
{
"PO3 is OK"
}
Can someone give me a small example of how I can fit these 3 if-else statements into one for loop?
If you only want to find out that all 3 strings are present this script will also show which one is missing.
(binary encoded in the variable $Cnt)
## Q:\Test\2018\07\13\SO_51323760.ps1
##
$Last50 = Get-Content 'C:\foo\bar.log' | Select-Object -Last 50
$Cnt = 0
if ($Last50 -match "<Ping:AD_P01_RCV> ok"){$Cnt++}
if ($Last50 -match "<Ping:AD_P02_SND> ok"){$Cnt+=2}
if ($Last50 -match "<Ping:AD_P03_RCV> ok"){$Cnt+=4}
if ($cnt -eq 7){
"did find all 3 strings "
} else {
"didn't find all 3 strings ({0})" -f $cnt
}
A variant that immediately reports which of PO1..PO3 is missing:
$Last50 = Get-Content 'C:\foo\bar.log' | Select-Object -Last 50
if (!($Last50 -match "<Ping:AD_P01_RCV> ok")) {"PO1 is critical"}
if (!($Last50 -match "<Ping:AD_P02_SND> ok")) {"PO2 is critical"}
if (!($Last50 -match "<Ping:AD_P03_RCV> ok")) {"PO3 is critical"}
Sorry I'm a bit slow this monday.
To check in a loop different variables by building the variable name:
1..3| ForEach-Object {
If (!(Get-Variable -name "PO$_").Value){"`$PO$_ is critical"}
}
What you're trying to do is better addressed with a hashtable than with individually named variables.
$data = Get-Content 'C:\boom\shiva\log.txt'
$ht = @{}
1..3 | ForEach-Object {
$key = 'P{0:d2}' -f $_
$str = if ($_ -eq 2) {"${key}_SND"} else {"${key}_RCV"}
$ht[$key] = $data -match "<Ping:AD_${str}> ok"
}
$ht.Keys | ForEach-Object {
if ($ht[$_]) {
"$_ found in log."
} else {
"$_ not found in log."
}
}
You can check if all lines were present at least once with something like this:
if (($ht.Values | Where-Object { $_ }).Count -lt 3) {
'Line missing from log.'
}
PSv3 introduced the -Tail (-Last) parameter to Get-Content, which is the most efficient way to extract a fixed number of lines from the end of a file.
You can pipe its output to Select-String, which accepts an array of regex patterns, any of which produces a match (implicit OR logic).
$matchingLines = Get-Content -Tail 50 C:\foo\bar.log |
Select-String '<Ping:AD_P01_RCV> ok', '<Ping:AD_P02_SND> ok', '<Ping:AD_P03_RCV> ok'
if ($matchingLines) { # at least 1 of the regexes matched
$matchingLines.Line # output the matching lines
} else { # nothing matched
Write-Warning "Nothing matched."
}
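Since Select-String with several patterns uses OR logic, it cannot tell you on its own that all three strings are present. A rough sketch that checks each pattern separately and prints one status line per check; the names PO1..PO3 come from the question, while the sample lines and the $report variable are assumptions standing in for the real log:

```powershell
# One entry per check; [ordered] (PSv3+) keeps the output in a predictable order
$patterns = [ordered]@{
    PO1 = '<Ping:AD_P01_RCV> ok'
    PO2 = '<Ping:AD_P02_SND> ok'
    PO3 = '<Ping:AD_P03_RCV> ok'
}

# Sample lines standing in for: Get-Content -Tail 50 C:\foo\bar.log
$last50 = '12:00 <Ping:AD_P01_RCV> ok', '12:01 <Ping:AD_P03_RCV> ok'

$report = foreach ($name in $patterns.Keys) {
    $pattern = $patterns[$name]
    # -SimpleMatch treats the pattern as literal text; [bool] normalizes the -Quiet result
    $found = [bool]($last50 | Select-String -SimpleMatch -Pattern $pattern -Quiet)
    if ($found) { "$name is OK" } else { "$name is critical" }
}
$report
```

Adding a fourth check then only means adding one more entry to $patterns.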
I finally got the draft below, which resolved my query about cycling variables through a for loop. I had to convert the individual variables to an array, but this gives me the expected result. Basically, I need this script to provide input to my Nagios plugin, which needs minor modification, but it's done.
(Get-Content C:\foo\bar.log )[-1..-50] | Out-File C:\boom\shiva\log.txt
$j = 1
$PO = new-object object[] 3
$PO[0] = Get-Content C:\boom\shiva\log.txt | where {$_ -match "<Ping:AD_P01_RCV> ok"}
$PO[1] = Get-Content C:\boom\shiva\log.txt | where {$_ -match "<Ping:AD_P02_SND> ok"}
$PO[2] = Get-Content C:\boom\shiva\log.txt | where {$_ -match "<Ping:AD_P03_RCV> ok"}
foreach( $i in $PO){
if (!$i){
"PO "+$j+" is CRITICAL"}
else{
"PO "+$j+" is OK"}
$j+=1
}
Thank you LotPings, Ansgar and mklement0 for your support and responses. I picked up a few things from your answers.

Powershell - reading ahead and While

I have a text file in the following format:
.....
ENTRY,PartNumber1,,,
FIELD,IntCode,123456
...
FIELD,MFRPartNumber,ABC123,,,
...
FIELD,XPARTNUMBER,ABC123
...
FIELD,InternalPartNumber,3214567
...
ENTRY,PartNumber2,,,
...
...
the ... indicates there is other data between these fields. The ONLY thing I can be certain of is that the field starting with ENTRY is a new set of records. The rows starting with FIELD can be in any order, and not all of them may be present in each group of data.
I need to read in a chunk of data
Search for any field matching the string ABC123
If ABC123 found, search for the existence of the InternalPartNumber field & return that row of data.
I have not seen a way to use Get-Content that can read in a variable number of rows as a set & be able to search it.
Here is the code I currently have, which will read a file, searching for a string & replacing it with another. I hope this can be modified to be used in this case.
$ftype = "*.txt"
$fnames = gci -Path $filefolder1 -Filter $ftype -Recurse|% {$_.FullName}
$mfgPartlist = Import-Csv -Path "C:\test\mfrPartList.csv"
foreach ($file in $fnames) {
$contents = Get-Content -Path $file
foreach ($partnbr in $mfgPartlist) {
$oldString = $partnbr.OldValue
$newString = $partnbr.NewValue
if (Select-String -Path $file -SimpleMatch $oldString -Debug -Quiet) {
$stringData = $contents -imatch $oldString
$stringData = $stringData -replace "[\n\r]","|"
foreach ($dataline in $stringData) {
$file +"|"+$stringData+"|"+$oldString+"|"+$newString|Out-File "C:\test\Datachanges.txt" -Width 2000 -Append
}
$contents = $contents -replace $oldString, $newString
Set-Content -Path $file -Value $contents
}
}
}
Is there a way to read & search a text file in "chunks" using Powershell? Or to do a Read-ahead & determine what to search?
Assuming your file isn't too big to read into memory all at once:
$Text = Get-Content testfile.txt -Raw
($Text -split '(?ms)^(?=ENTRY)') |
foreach {
if ($_ -match '(?ms)^FIELD\S+ABC123')
{$_ -replace '(?ms).+(^Field\S+InternalPartNumber.+?$).+','$1'}
}
FIELD,InternalPartNumber,3214567
That reads the entire file in as a single multiline string, and then splits it at the beginning of any line that starts with 'ENTRY'. Then it tests each segment for a FIELD line that contains 'ABC123', and if it does, removes everything except the FIELD line for the InternalPartNumber.
This is not my best work as I have just got back from vacation. You could use a while loop reading the text and set an entry flag to gobble up the text in chunks. However if your files are not too big then you could just read up the text file at once and use regex to split up the chunks and then process accordingly.
$pattern = "ABC123"
$matchedRowToReturn = "InternalPartNumber"
$fileData = Get-Content "d:\temp\test.txt" | Where-Object{$_ -match '^(entry|field)'} | Out-String
$parts = $fileData | Select-String '(?smi)(^Entry).*?(?=^Entry|\Z)' -AllMatches | Select-Object -ExpandProperty Matches | Select-Object -ExpandProperty Value
$parts | Where-Object{$_ -match $pattern} | Select-String "$matchedRowToReturn.*$" | Select-Object -ExpandProperty Matches | Select-Object -ExpandProperty Value
What this will do is read in the text file, drop any lines that are not entry or field related, join the rest into one long string, and split it up into chunks that start with lines beginning with the word "Entry".
Then we drop those "parts" that do not contain the $pattern. Of the remaining that match extract the InternalPartNumber line and present.
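The loop-with-a-flag idea mentioned above can be sketched roughly like this. It collects lines into a chunk until the next ENTRY line, then checks the finished chunk. The function name Get-FieldFromMatchingEntry, its parameters, and the sample data are all hypothetical; lines that appear before the first ENTRY are treated as one chunk of their own:

```powershell
function Get-FieldFromMatchingEntry {
    param(
        [string[]]$Lines,
        [string]$Pattern = 'ABC123',
        [string]$FieldName = 'InternalPartNumber'
    )
    $chunk = New-Object System.Collections.Generic.List[string]

    # Emit the wanted FIELD row if the finished chunk mentions the pattern
    $flush = {
        if ($chunk.Count -gt 0 -and ($chunk -match [regex]::Escape($Pattern))) {
            $chunk | Where-Object { $_ -match "^FIELD,$([regex]::Escape($FieldName))," }
        }
    }

    foreach ($line in $Lines) {
        if ($line -match '^ENTRY,') {
            & $flush          # a new ENTRY closes the previous chunk
            $chunk.Clear()
        }
        $chunk.Add($line)
    }
    & $flush                  # do not forget the final chunk
}

# Invented sample in the question's format
$sample = @(
    'ENTRY,PartNumber1,,,'
    'FIELD,IntCode,123456'
    'FIELD,MFRPartNumber,ABC123,,,'
    'FIELD,InternalPartNumber,3214567'
    'ENTRY,PartNumber2,,,'
    'FIELD,MFRPartNumber,XYZ999,,,'
    'FIELD,InternalPartNumber,7654321'
)
$result = @(Get-FieldFromMatchingEntry -Lines $sample)
$result
```

For a real file you could pass (Get-Content $file) as -Lines; only one chunk is held in the list at a time, although Get-Content itself still reads the whole file in that case.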

Powershell V2 find and replace

I am trying to change dates programmatically in a file. The line I need to fix looks like this:
set @@dateto = '03/15/12'
I need to write a powershell V2 script that replaces what's inside the single quotes, and I have no idea how to do this.
The closest I've come looks like this:
gc $file | ? {$_ -match "set @@dateto ="} | % {$temp=$_.split("'");$temp[17]
=$CorrectedDate;$temp -join ","} | -outfile newfile.txt
Problems with this: It gives an error about the index 17 being out of range. Also, the outfile only contains one line (The unmodified line). I'd appreciate any help with this. Thanks!
You can do something like this ( though you may want to handle the corner cases) :
$CorrectedDate = '10/09/09'
gc $file | %{
if($_ -match "^set @@dateto = '(\d\d/\d\d/\d\d)'") {
$_ -replace $matches[1], $CorrectedDate;
}
else {
$_
}
} | out-file test2.txt
mv test2.txt $file -force
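One of the corner cases: the code above replaces the captured date wherever it occurs in the line, and it only handles the exact nn/nn/nn format. A hedged alternative is to capture the text around the quotes and rebuild the line. This is a sketch on in-memory strings; the sample lines are invented, and the pattern is deliberately loose about the prefix in front of "dateto":

```powershell
$CorrectedDate = '10/09/09'

# Invented sample lines; only the first should change
$lines = "set @@dateto = '03/15/12'", 'select * from t'

# ${1} and ${2} are regex group references. The braces matter here:
# a bare $1 followed by the digits of the date would be read as group 110.
$replacement = '${1}' + $CorrectedDate + '${2}'
$result = $lines -replace "^(set\s+\S*dateto\s*=\s*')[^']*(')", $replacement
$result
```

Lines that do not match pass through unchanged, so the result can be piped straight to Out-File without an if/else.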

Remove Top Line of Text File with PowerShell

I am trying to just remove the first line of about 5000 text files before importing them.
I am still very new to PowerShell so not sure what to search for or how to approach this. My current concept using pseudo-code:
set-content file (get-content unless line contains amount)
However, I can't seem to figure out how to do something like contains.
While I really admire the answer from @hoge, both for a very concise technique and a wrapper function to generalize it, and I encourage upvotes for it, I am compelled to comment on the other two answers that use temp files (it gnaws at me like fingernails on a chalkboard!).
Assuming the file is not huge, you can force the pipeline to operate in discrete sections--thereby obviating the need for a temp file--with judicious use of parentheses:
(Get-Content $file | Select-Object -Skip 1) | Set-Content $file
... or in short form:
(gc $file | select -Skip 1) | sc $file
It is not the most efficient in the world, but this should work:
get-content $file |
select -Skip 1 |
set-content "$file-temp"
move "$file-temp" $file -Force
Using variable notation, you can do it without a temporary file:
${C:\file.txt} = ${C:\file.txt} | select -skip 1
function Remove-Topline ( [string[]]$path, [int]$skip=1 ) {
if ( -not (Test-Path $path -PathType Leaf) ) {
throw "invalid filename"
}
ls $path |
% { iex "`${$($_.fullname)} = `${$($_.fullname)} | select -skip $skip" }
}
I just had to do the same task, and gc | select ... | sc took over 4 GB of RAM on my machine while reading a 1.6 GB file. It didn't finish for at least 20 minutes after reading the whole file in (as reported by Read Bytes in Process Explorer), at which point I had to kill it.
My solution was to use a more .NET approach: StreamReader + StreamWriter.
See this answer for a great answer discussing the perf: In Powershell, what's the most efficient way to split a large text file by record type?
Below is my solution. Yes, it uses a temporary file, but in my case, it didn't matter (it was a freaking huge SQL table creation and insert statements file):
PS> (measure-command{
$i = 0
$ins = New-Object System.IO.StreamReader "in/file/pa.th"
$outs = New-Object System.IO.StreamWriter "out/file/pa.th"
while( !$ins.EndOfStream ) {
$line = $ins.ReadLine();
if( $i -ne 0 ) {
$outs.WriteLine($line);
}
$i = $i+1;
}
$outs.Close();
$ins.Close();
}).TotalSeconds
It returned:
188.1224443
Inspired by AASoft's answer, I went out to improve it a bit more:
Avoid the loop variable $i and the comparison with 0 in every loop
Wrap the execution into a try..finally block to always close the files in use
Make the solution work for an arbitrary number of lines to remove from the beginning of the file
Use a variable $p to reference the current directory
These changes lead to the following code:
$p = (Get-Location).Path
(Measure-Command {
# Number of lines to skip
$skip = 1
$ins = New-Object System.IO.StreamReader ($p + "\test.log")
$outs = New-Object System.IO.StreamWriter ($p + "\test-1.log")
try {
# Skip the first N lines, but allow for fewer than N, as well
for( $s = 1; $s -le $skip -and !$ins.EndOfStream; $s++ ) {
$ins.ReadLine()
}
while( !$ins.EndOfStream ) {
$outs.WriteLine( $ins.ReadLine() )
}
}
finally {
$outs.Close()
$ins.Close()
}
}).TotalSeconds
The first change brought the processing time for my 60 MB file down from 5.3s to 4s. The rest of the changes are more cosmetic.
$x = get-content $file
$x[1..$x.count] | set-content $file
Just that much. Long boring explanation follows. Get-content returns an array. We can "index into" array variables, as demonstrated in this and other Scripting Guys posts.
For example, if we define an array variable like this,
$array = @("first item","second item","third item")
so $array returns
first item
second item
third item
then we can "index into" that array to retrieve only its 1st element
$array[0]
or only its 2nd
$array[1]
or a range of index values from the 2nd through the last.
$array[1..$array.count]
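For comparison, here is a small sketch of three ways to drop the first element of an in-memory array, using the same sample items (the variable names $bySlice, $bySkip, $first, and $rest are made up). Note that 1..0 counts downward, so the slice with an explicit last index needs at least two elements; Select-Object -Skip 1 has no such caveat:

```powershell
$array = @("first item", "second item", "third item")

# Slice with an explicit last index (needs at least two elements)
$bySlice = $array[1..($array.Count - 1)]

# Pipeline form; also fine for one-element arrays
$bySkip = $array | Select-Object -Skip 1

# Multiple assignment: the first element goes to $first, the remainder to $rest
$first, $rest = $array

$bySlice
```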
I just learned from a website:
Get-ChildItem *.txt | ForEach-Object { (get-Content $_) | Where-Object {(1) -notcontains $_.ReadCount } | Set-Content -path $_ }
Or you can use the aliases to make it short, like:
gci *.txt | % { (gc $_) | ? { (1) -notcontains $_.ReadCount } | sc -path $_ }
Another approach to remove the first line from a file is the multiple assignment technique (refer to the link):
$firstLine, $restOfDocument = Get-Content -Path $filename
$modifiedContent = $restOfDocument
$modifiedContent | Out-String | Set-Content $filename
-Skip didn't work for me, so my workaround is:
$LinesCount = $(get-content $file).Count
get-content $file |
select -Last $($LinesCount-1) |
set-content "$file-temp"
move "$file-temp" $file -Force
Following on from Michael Soren's answer.
If you want to edit all .txt files in the current directory and remove the first line from each.
Get-ChildItem (Get-Location).Path -Filter *.txt |
Foreach-Object {
(Get-Content $_.FullName | Select-Object -Skip 1) | Set-Content $_.FullName
}
For smaller files you could use this:
& C:\windows\system32\more +1 oldfile.csv > newfile.csv | out-null
... but it's not very effective at processing my example file of 16MB. It doesn't seem to terminate and release the lock on newfile.csv.