I have a script that searches for a series of strings (stored in a txt file) in the contents of files in a directory. I would like to modify it to also list the text around the string found (these are regular strings, not regex expressions). I played around a lot and it seems like I need to use -Context, but I am not sure how to get the text from that.
Also, the files I am searching may not have linefeeds, so if it could just get the xx characters before and after the search term, that would be better.
Here's what I have so far (I omitted the parts that loop through the files):
$result = Get-Content $file.FullName | Select-String $control -quiet
If ($result -eq $True)
{
$match = $file.FullName
"Match on string : $control in file : $match" | Out-File $output -Append
Write-host "Match on string : $control in file : $match"
}
If it could write the context, that would be perfect. Seems like I need to use $_Matches, but not sure how.
If $control is just a regular string, can you turn it into a regular expression?
$n = 3
$re = "(.{0,$n})(" + [Regex]::Escape($control) + ")(.{0,$n})"
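# -Raw (PowerShell 3+) reads the file as a single string; with an array on the left,
# -match only filters the lines and does not populate $matches
$result = (Get-Content $file.FullName -Raw) -match $re
With this, the $matches hashtable should give you access to the $n characters before and after the first match:
if ($result) {
echo "Before: $($matches[1])"
echo "After: $($matches[3])"
}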
Here is what I have now and it seems to work:
$regex = "[\s\S]{0,$ContextChars}$([regex]::Escape($SearchTerm))[\s\S]{0,$ContextChars}"
$results = Get-Content $file.FullName | Select-String -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value }
if ($results)
{
foreach($result in $results)
{
$display = $result
"File: $file Match ---$display---"
}
}
The only thing I wish I had but don't know how to get it is the line number the match is found on.
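For the line number, one possibility is to keep the MatchInfo objects that Select-String emits instead of unwrapping them straight away, since each one carries a LineNumber property next to Matches. The following is only a sketch: the sample lines and the values of $SearchTerm and $ContextChars are made up for illustration, and the search term is escaped on the assumption that it is a literal string.

```powershell
# Hypothetical sample data standing in for: Get-Content $file.FullName
$lines        = 'first line', 'has needle inside', 'last line with needle'
$SearchTerm   = 'needle'
$ContextChars = 4

# Escape the term so it is treated as literal text, not as a regex
$regex = "[\s\S]{0,$ContextChars}" + [regex]::Escape($SearchTerm) + "[\s\S]{0,$ContextChars}"

$found = $lines | Select-String -Pattern $regex -AllMatches | ForEach-Object {
    $info = $_    # MatchInfo object: exposes LineNumber and Matches
    foreach ($m in $info.Matches) {
        [pscustomobject]@{ LineNumber = $info.LineNumber; Text = $m.Value }
    }
}

$found | ForEach-Object { "Line $($_.LineNumber): ---$($_.Text)---" }
```

If the file has no line feeds at all, every match will report line 1, so the number is only useful for files that do have line breaks.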
I have a text file that contains a string I want to modify.
Example text file contents:
abc=1
def=2
ghi=3
If I run this code:
$file = "c:\test.txt"
$MinX = 100
$MinY = 100
$a = (Get-Content $file) | %{
if($_ -match "def=(\d*)"){
if($Matches[1] -gt $MinX){$_ -replace "$($Matches[1])","$($MinX)" }
}
}
$a
The result is:
def=100
If I omit the greater-than check like so:
$a = (Get-Content $file) | %{
if($_ -match "def=(\d*)"){
$_ -replace "$($Matches[1])","$($MinX)"
}
}
$a
The result is correct:
abc=1
def=100
ghi=3
I don't understand how a simple integer comparison before doing the replace could screw things up so badly, can anyone advise what I'm missing?
The comparison with -gt does not do what you expect, because $matches[1] is a string; cast it to [int] first so that two integers are compared.
As integers, 2 is never greater than 100, so change the operator to -lt instead.
Your code outputs only one line because you forgot to also output the unchanged lines that do not match the regex.
$file = 'c:\test.txt'
$MinX = 100
$MinY = 100
$a = (Get-Content $file) | ForEach-Object {
if ($_ -match '^def=(\d+)' -and [int]$matches[1] -lt $MinX) {
# the value is below the minimum: raise it
$_ -replace $matches[1],$MinX
}
else {
# every other line passes through unchanged
$_
}
}
$a
Or use switch (which is also faster than using Get-Content):
$file = 'c:\test.txt'
$MinX = 100
$MinY = 100
$a = switch -Regex -File $file {
'^def=(\d+)' {
# keep the line as-is when the value is already large enough
if([int]$matches[1] -lt $MinX){ $_ -replace $matches[1],$MinX } else { $_ }
}
default { $_ }
}
$a
Output:
abc=1
def=100
ghi=3
That's because the expression ($Matches[1] -gt $MinX) is a string comparison. In PowerShell, the left-hand side of a comparison dictates the comparison type, and since that is of type [string], PowerShell has to convert the right-hand side of the expression to [string] as well. Your expression is therefore evaluated as ([string]$Matches[1] -gt [string]$MinX).
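A small sketch of that difference, using the values from the question (the variable names $value, $asString and $asInt are only there for illustration):

```powershell
$MinX  = 100
$value = '2'    # what $Matches[1] holds: a [string], not a number

# Left side is a string, so 100 is converted to '100' and the text is compared
$asString = $value -gt $MinX

# Left side is an [int], so the comparison is numeric
$asInt = [int]$value -gt $MinX

"string comparison : $asString"
"integer comparison: $asInt"
```

The string comparison comes out as True because '2' sorts after '1', which is why the original code replaced the value even though 2 is smaller than 100.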
I want to do this
read the file
go through each line
if the line matches the pattern, do some changes with that line
save the content to another file
For now I use this script:
$file = [System.IO.File]::ReadLines("C:\path\to\some\file1.txt")
$output = "C:\path\to\some\file2.txt"
ForEach ($line in $file) {
if($line -match 'some_regex_expression') {
$line = $line.replace("some","great")
}
Out-File -append -filepath $output -inputobject $line
}
As you can see, here I write line by line. Is it possible to write the whole file at once ?
Good example is provided here :
(Get-Content c:\temp\test.txt) -replace '\[MYID\]', 'MyValue' | Set-Content c:\temp\test.txt
But my problem is that I have additional IF statement...
So, what could I do to improve my script ?
You could do it like that:
Get-Content -Path "C:\path\to\some\file1.txt" | foreach {
if($_ -match 'some_regex_expression') {
$_.replace("some","great")
}
else {
$_
}
} | Out-File -filepath "C:\path\to\some\file2.txt"
Get-Content reads a file line by line (array of strings) by default so you can just pipe it into a foreach loop, process each line within the loop and pipe the whole output into your file2.txt.
In this case an array or an ArrayList (lists are better for large collections) would be the most elegant solution. Simply add the strings to the list until the ForEach loop ends, then flush the list to the file in one go.
Here is an ArrayList example:
$file = [System.IO.File]::ReadLines("C:\path\to\some\file1.txt")
$output = "C:\path\to\some\file2.txt"
$outputData = New-Object System.Collections.ArrayList
ForEach ($line in $file) {
if($line -match 'some_regex_expression') {
$line = $line.replace("some","great")
}
[void]$outputData.Add($line)   # Add() returns the new index; discard it
}
$outputData |Out-File $output
I think the if statement can be avoided in a lot of cases by using regular expression groups (e.g. (.*)) and placeholders (e.g. $1, $2 etc.).
As in your example:
(Get-Content .\File1.txt) -Replace 'some(_regex_expression)', 'great$1' | Set-Content .\File2.txt
And for the "good example", where [MYID] might be somewhere inline:
(Get-Content c:\temp\test.txt) -Replace '^(.*)\[MYID\](.*)$', '$1MyValue$2' | Set-Content c:\temp\test.txt
(see also How to replace first and last part of each line with powershell)
I just started working with Powershell and this is my first script.
I am checking for 3 strings in last 50 lines of a log file. I need to find all three strings and print error message if any one of those is missing. I have written following script but it does not give me the expected results.
(Get-Content C:\foo\bar.log )[-1..-50] | Out-File C:\boom\shiva\log.txt
$PO1 = Get-Content C:\boom\shiva\log.txt | where {$_ -match "<Ping:AD_P01_RCV> ok"}
$PO2 = Get-Content C:\boom\shiva\log.txt | where {$_ -match "<Ping:AD_P02_SND> ok"}
$PO3 = Get-Content C:\boom\shiva\log.txt | where {$_ -match "<Ping:AD_P03_RCV> ok"}
I am satisfied with the above piece of code. The problem is with the part below. I don't want to use if-else three times. I am struggling to draft a for loop which can save space and still give me the same result.
if (!$PO1)
{
"PO1 is critical"
}
else
{
"PO1 is OK"
}
if (!$PO2)
{
"PO2 is critical"
}
else
{
"PO2 is OK"
}
if (!$PO3)
{
"PO3 is critical"
}
else
{
"PO3 is OK"
}
Can someone give me a small example of how I can fit these 3 if-else statements into one for loop?
If you only want to find out whether all 3 strings are present, this script will also show which ones were found (binary encoded in the variable $Cnt, so any value other than 7 means something is missing):
## Q:\Test\2018\07\13\SO_51323760.ps1
##
$Last50 = Get-Content 'C:\foo\bar.log' | Select-Object -Last 50
$Cnt = 0
if ($Last50 -match "<Ping:AD_P01_RCV> ok"){$Cnt++}
if ($Last50 -match "<Ping:AD_P02_SND> ok"){$Cnt+=2}
if ($Last50 -match "<Ping:AD_P03_RCV> ok"){$Cnt+=4}
if ($cnt -eq 7){
"did find all 3 strings "
} else {
"didn't find all 3 strings ({0})" -f $cnt
}
Variant that immediately complains about a missing PO1..PO3:
$Last50 = Get-Content 'C:\foo\bar.log' | Select-Object -Last 50
if (!($Last50 -match "<Ping:AD_P01_RCV> ok")) {"PO1 is critical"}
if (!($Last50 -match "<Ping:AD_P02_SND> ok")) {"PO2 is critical"}
if (!($Last50 -match "<Ping:AD_P03_RCV> ok")) {"PO3 is critical"}
Sorry I'm a bit slow this monday.
To check the different variables in a loop by building the variable name:
1..3| ForEach-Object {
If (!(Get-Variable -name "PO$_").Value){"`$PO$_ is critical"}
}
What you're trying to do is better addressed with a hashtable than with individually named variables.
$data = Get-Content 'C:\boom\shiva\log.txt'
$ht = @{}
1..3 | ForEach-Object {
$key = 'P{0:d2}' -f $_
$str = if ($_ -eq 2) {"${key}_SND"} else {"${key}_RCV"}
$ht[$key] = $data -match "<Ping:AD_${str}> ok"
}
$ht.Keys | ForEach-Object {
# $_ is the current key here; $key would still hold the last value from the loop above
if ($ht[$_]) {
"$_ found in log."
} else {
"$_ not found in log."
}
}
You can check if all lines were present at least once with something like this:
if (($ht.Values | Where-Object { $_ }).Count -lt 3) {
'Line missing from log.'
}
PSv3 introduced the -Tail (-Last) parameter to Get-Content, which is the most efficient way to extract a fixed number of lines from the end of a file.
You can pipe its output to Select-String, which accepts an array of regex patterns, any of which produces a match (implicit OR logic).
$matchingLines = Get-Content -Tail 50 C:\foo\bar.log |
Select-String '<Ping:AD_P01_RCV> ok', '<Ping:AD_P02_SND> ok', '<Ping:AD_P03_RCV> ok'
if ($matchingLines) { # at least 1 of the regexes matched
$matchingLines.Line # output the matching lines
} else { # nothing matched
Write-Warning "Nothing matched."
}
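Because Select-String with several patterns uses OR logic, the check above only tells you that at least one pattern matched. To get the per-string report the question asks for in a single loop, something along these lines might work. It is a sketch under assumptions: the log lines are made up, and the labels PO1..PO3 are taken from the question.

```powershell
# Made-up stand-in for: Get-Content -Tail 50 C:\foo\bar.log
$last50 = '10:00 <Ping:AD_P01_RCV> ok', '10:01 <Ping:AD_P03_RCV> ok', '10:02 something else'

# Label -> search string; [ordered] keeps the report in the order written here
$checks = [ordered]@{
    PO1 = '<Ping:AD_P01_RCV> ok'
    PO2 = '<Ping:AD_P02_SND> ok'
    PO3 = '<Ping:AD_P03_RCV> ok'
}

$report = foreach ($name in $checks.Keys) {
    # -SimpleMatch treats the search string as literal text; -Quiet yields a Boolean
    $ok = [bool]($last50 | Select-String -SimpleMatch $checks[$name] -Quiet)
    if ($ok) { "$name is OK" } else { "$name is critical" }
}

$report
```

Adding a fourth check then only means adding one more entry to $checks.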
I finally got the draft below, which resolved my question of how to cycle the variables through a for loop. I had to convert those individual variables to an array, but this gives me the expected result. Basically I need this script to provide input to my Nagios plugin, which needs a minor modification, but it's done.
(Get-Content C:\foo\bar.log )[-1..-50] | Out-File C:\boom\shiva\log.txt
$j = 1
$PO = new-object object[] 3
$PO[0] = Get-Content C:\boom\shiva\log.txt | where {$_ -match "<Ping:AD_P01_RCV> ok"}
$PO[1] = Get-Content C:\boom\shiva\log.txt | where {$_ -match "<Ping:AD_P02_SND> ok"}
$PO[2] = Get-Content C:\boom\shiva\log.txt | where {$_ -match "<Ping:AD_P03_RCV> ok"}
foreach( $i in $PO){
if (!$i){
"PO "+$j+" is CRITICAL"}
else{
"PO "+$j+" is OK"}
$j+=1
}
Thank you LotPings, Ansgar and mklement0 for your support and responses. I picked up a few things from your answers.
I have a text file in the following format:
.....
ENTRY,PartNumber1,,,
FIELD,IntCode,123456
...
FIELD,MFRPartNumber,ABC123,,,
...
FIELD,XPARTNUMBER,ABC123
...
FIELD,InternalPartNumber,3214567
...
ENTRY,PartNumber2,,,
...
...
the ... indicates there is other data between these fields. The ONLY thing I can be certain of is that the field starting with ENTRY is a new set of records. The rows starting with FIELD can be in any order, and not all of them may be present in each group of data.
I need to:
read in a chunk of data
search for any field matching the string ABC123
if ABC123 is found, search for the existence of the InternalPartNumber field & return that row of data.
I have not seen a way to use Get-Content that can read in a variable number of rows as a set & be able to search it.
Here is the code I currently have, which will read a file, searching for a string & replacing it with another. I hope this can be modified to be used in this case.
$ftype = "*.txt"
$fnames = gci -Path $filefolder1 -Filter $ftype -Recurse|% {$_.FullName}
$mfgPartlist = Import-Csv -Path "C:\test\mfrPartList.csv"
foreach ($file in $fnames) {
$contents = Get-Content -Path $file
foreach ($partnbr in $mfgPartlist) {
# use the current row ($partnbr), not the whole list
$oldString = $partnbr.OldValue
$newString = $partnbr.NewValue
if (Select-String -Path $file -SimpleMatch $oldString -Debug -Quiet) {
$stringData = $contents -imatch $oldString
$stringData = $stringData -replace "[\n\r]","|"
foreach ($dataline in $stringData) {
$file +"|"+$dataline+"|"+$oldString+"|"+$newString|Out-File "C:\test\Datachanges.txt" -Width 2000 -Append
}
# -replace needs a comma between pattern and replacement
$contents = $contents -replace $oldString, $newString
Set-Content -Path $file -Value $contents
}
}
}
Is there a way to read & search a text file in "chunks" using Powershell? Or to do a Read-ahead & determine what to search?
Assuming your file isn't too big to read into memory all at once:
$Text = Get-Content testfile.txt -Raw
($Text -split '(?ms)^(?=ENTRY)') |
foreach {
if ($_ -match '(?ms)^FIELD\S+ABC123')
{$_ -replace '(?ms).+(^Field\S+InternalPartNumber.+?$).+','$1'}
}
FIELD,InternalPartNumber,3214567
That reads the entire file in as a single multiline string, and then splits it at the beginning of any line that starts with 'ENTRY'. Then it tests each segment for a FIELD line that contains 'ABC123', and if it does, removes everything except the FIELD line for the InternalPartNumber.
This is not my best work as I have just got back from vacation. You could use a while loop reading the text and set an entry flag to gobble up the text in chunks. However if your files are not too big then you could just read up the text file at once and use regex to split up the chunks and then process accordingly.
$pattern = "ABC123"
$matchedRowToReturn = "InternalPartNumber"
$fileData = Get-Content "d:\temp\test.txt" | Where-Object{$_ -match '^(entry|field)'} | Out-String
$parts = $fileData | Select-String '(?smi)(^Entry).*?(?=^Entry|\Z)' -AllMatches | Select-Object -ExpandProperty Matches | Select-Object -ExpandProperty Value
$parts | Where-Object{$_ -match $pattern} | Select-String "$matchedRowToReturn.*$" | Select-Object -ExpandProperty Matches | Select-Object -ExpandProperty Value
What this will do is read in the text file, drop any lines that are not entry or field related, join the rest into one long string, and split that up into chunks that start with lines beginning with the word "Entry".
Then we drop those "parts" that do not contain the $pattern. From the remaining parts, we extract the InternalPartNumber line and output it.
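For files that are too large to read in one go, the "read line by line and collect chunks" idea mentioned above could look roughly like this. It is only a sketch: the sample data is written to a temporary file so it can run on its own, Test-Chunk is a hypothetical helper name, and the field layout is assumed to be exactly as shown in the question.

```powershell
# Made-up sample written to a temp file so the sketch can run on its own
$path = [System.IO.Path]::GetTempFileName()
@'
ENTRY,PartNumber1,,,
FIELD,IntCode,123456
FIELD,MFRPartNumber,ABC123,,,
FIELD,InternalPartNumber,3214567
ENTRY,PartNumber2,,,
FIELD,MFRPartNumber,XYZ999,,,
FIELD,InternalPartNumber,7654321
'@ | Set-Content -Path $path

$pattern = 'ABC123'
$chunk   = New-Object System.Collections.Generic.List[string]   # lines of the current ENTRY
$hits    = New-Object System.Collections.Generic.List[string]   # rows to return

# Hypothetical helper: inspect one finished chunk and keep the wanted row
function Test-Chunk($lines, $pattern, $hits) {
    $all = @($lines)
    if ($all -match [regex]::Escape($pattern)) {
        $all -match '^FIELD,InternalPartNumber,' | ForEach-Object { $hits.Add($_) }
    }
}

# switch -File reads the file one line at a time
switch -Regex -File $path {
    '^ENTRY,' { Test-Chunk $chunk $pattern $hits; $chunk.Clear(); $chunk.Add($_) }
    default   { $chunk.Add($_) }
}
Test-Chunk $chunk $pattern $hits   # the last chunk has no ENTRY line after it

Remove-Item $path
$hits
```

Only one chunk is held in memory at a time, so memory use depends on the largest ENTRY block rather than on the size of the file.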
I need to only search the 1st line and last line in a text file to find a "-" and remove it.
How can I do it?
I tried select-string, but I don't know how to find the 1st and last line and remove "-" only from there.
Here is what the text file looks like:
% 01-A247M15 G70
N0001 G30 G17 X-100 Y-100 Z0
N0002 G31 G90 X100 Y100 Z45
N0003 ; --PART NO.: NC-HON.PHX01.COVER-SHOE.DET-1000.050
N0004 ; --TOOL: 8.55 X .3937
N0005 ;
N0006 % 01-A247M15 G70
Something like this?
$1 = Get-Content C:\work\test\01.I
$1 | select-object -index 0, ($1.count-1)
Ok, so after looking at this for a while, I decided there had to be a way to do this with a one liner. Here it is:
(gc "c:\myfile.txt") | % -Begin {$test = (gc "c:\myfile.txt" | select -first 1 -last 1)} -Process {if ( $_ -eq $test[0] -or $_ -eq $test[-1] ) { $_ -replace "-" } else { $_ }} | Set-Content "c:\myfile.txt"
Here is a breakdown of what this is doing:
First, the aliases, for those not familiar with them. I only used them because the command is long enough as it is, so this helps keep things manageable:
gc means Get-Content
% means ForEach-Object
$_ is for the current pipeline value (this isn't an alias, but I thought I would define it since you said you were new)
Ok, now here is what is happening in this:
(gc "c:\myfile.txt") | --> Gets the content of c:\myfile.txt and sends it down the line
% --> Does a foreach loop (goes through each item in the pipeline individually)
-Begin {$test = (gc "c:\myfile.txt" | select -first 1 -last 1)} --> This is a begin block, it runs everything here before it goes onto the pipeline stuff. It is loading the first and last line of c:\myfile.txt into an array so we can check for first and last items
-Process {if ( $_ -eq $test[0] -or $_ -eq $test[-1] ) --> This runs a check on each item in the pipeline, checking if it's the first or the last item in the file
{ $_ -replace "-" } else { $_ } --> if it's the first or last, it does the replacement, if it's not, it just leaves it alone
| Set-Content "c:\myfile.txt" --> This puts the new values back into the file.
Please see the following sites for more information on each of these items:
Get-Content uses
Get-Content definition
Foreach
The Pipeline
Begin and Process parts of the Foreach (these are usually for custom functions, but they work in the foreach loop as well)
If ... else statements
Set-Content
So I was thinking about what if you wanted to do this to many files, or wanted to do this often. I decided to make a function that does what you are asking. Here is the function:
function Replace-FirstLast {
[CmdletBinding()]
param(
[Parameter( `
Position=0, `
Mandatory=$true)]
[String]$File,
[Parameter( `
Position=1, `
Mandatory=$true)]
[ValidateNotNull()]
[regex]$Regex,
[Parameter( `
position=2, `
Mandatory=$false)]
[string]$ReplaceWith=""
)
Begin {
$lines = Get-Content $File
} #end begin
Process {
foreach ($line in $lines) {
if ( $line -eq $lines[0] ) {
$lines[0] = $line -replace $Regex,$ReplaceWith
} #end if
if ( $line -eq $lines[-1] ) {
$lines[-1] = $line -replace $Regex,$ReplaceWith
}
} #end foreach
}#End process
end {
$lines | Set-Content $File
}#end end
} #end function
This will create a command called Replace-FirstLast. It would be called like this:
Replace-FirstLast -File "C:\myfiles.txt" -Regex "-" -ReplaceWith "NewText"
The -ReplaceWith parameter is optional; if it is left out, the matched text is simply removed (default value of ""). The -Regex parameter takes a regular expression describing the text to replace. For information on placing this into your profile check this article
Please note: If your file is very large (several GBs), this isn't the best solution. This would cause the whole file to live in memory, which could potentially cause other issues.
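For very large files, one alternative is to stream the file and hold back one line at a time, so that the last line can still be edited without loading everything into memory. The following is a sketch rather than a drop-in replacement for the function above: it writes to a second file instead of editing in place, and the sample lines and temp file names are made up.

```powershell
# Made-up sample file so the sketch can run on its own
$source = [System.IO.Path]::GetTempFileName()
$target = [System.IO.Path]::GetTempFileName()
'% 01-A247M15 G70', 'N0001 G30 G17 X-100 Y-100 Z0', 'N0006 % 01-A247M15 G70' |
    Set-Content $source

$reader = New-Object System.IO.StreamReader($source)
$writer = New-Object System.IO.StreamWriter($target)
try {
    $previous = $null    # line that has been read but not yet written
    $isFirst  = $true
    while ($null -ne ($line = $reader.ReadLine())) {
        if ($null -ne $previous) { $writer.WriteLine($previous) }
        if ($isFirst) { $line = $line -replace '-'; $isFirst = $false }
        $previous = $line
    }
    # Whatever is still held back is the last line of the file
    if ($null -ne $previous) { $writer.WriteLine($previous -replace '-') }
}
finally {
    $reader.Dispose()
    $writer.Dispose()
}

$result = Get-Content $target
Remove-Item $source, $target
$result
```

Only two lines are in memory at any moment; the price is a second file of the same size on disk.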
try:
$txt = get-content c:\myfile.txt
$txt[0] = $txt[0] -replace '-'
$txt[$txt.length - 1 ] = $txt[$txt.length - 1 ] -replace '-'
$txt | set-content c:\myfile.txt
You can use the select-object cmdlet to help you with this, since get-content basically spits out a text file as one huge array.
Thus, you can do something like this
get-content "path_to_my_awesome_file" | select -first 1 -last 1
To remove the dash after that, you can use the -replace operator to find the dash and remove it. This is better than using the System.String.Replace(...) method because it can match regular expressions and can operate on whole arrays of strings too!
That would look like:
# gc = Get-Content. The parens tell Powershell to do whatever's inside of it
# then treat it like a variable.
(gc "path_to_my_awesome_file" | select -first 1 -last 1) -Replace '-',''
If your file is very large you might not want to read the whole file to get the last line. gc -Tail will get the last line very quickly for you.
function GetFirstAndLastLine($path){
return New-Object PSObject -Property @{
First = Get-Content $path -TotalCount 1
Last = Get-Content $path -Tail 1
}
}
GetFirstAndLastLine "u_ex150417.log"
I tried this on a 20 gb log file and it returned immediately. Reading the file takes hours.
You will still need to read the whole file if you want to keep all existing content and only remove something from the end. Using -Tail is a quick way to check whether it is there.
I hope it helps.
A cleaner answer to the above:
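# Read the file first so its length is known before the loop starts
$Awesome_file = @(Get-Content "path_to_ridiculously_excellent_file")
$Last_index = $Awesome_file.Length - 1
$Line_number_were_on = 0
$Awesome_file = $Awesome_file | %{
$Line = $_
# only the first (index 0) and the last line lose their dashes
if ($Line_number_were_on -eq 0 -or $Line_number_were_on -eq $Last_index)
{ $Line -Replace '-','' }
else
{ $Line } ;
$Line_number_were_on++
}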
I like one-liners, but I find that readability tends to suffer sometimes when I put terseness over function. If what you're doing is going to be part of a script that other people will be reading/maintaining, readability might be something to consider.
Following Nick's answer: I do need to do this on all text files in the directory tree and this is what I'm using now:
Get-ChildItem -Path "c:\work\test" -Filter *.i | where { !$_.PSIsContainer } | % {
$txt = Get-Content $_.FullName;
$txt[0] = $txt[0] -replace '-';
$txt[$txt.length - 1 ] = $txt[$txt.length - 1 ] -replace '-';
$txt | Set-Content $_.FullName
}
and it looks like it's working well now.
Simple process:
Replace $file_txt with your filename:
Get-Content $file_txt | Select-Object -Last 1
I was recently searching for comments in the last line of .bat files. It seems to mess up the error code of previous commands. I found this useful for searching for a pattern in the last line of files. Pspath is a hidden property that get-content outputs. If I used select-string, I would lose the filename. *.bat gets passed as -filter for speed.
get-childitem -recurse . *.bat | get-content -tail 1 | where { $_ -match 'rem' } |
select pspath
PSPath
------
Microsoft.PowerShell.Core\FileSystem::C:\users\js\foo\file.bat