How to take a substring with the endpoint being a carriage return and/or line feed? - powershell

How do I take a substring where I don't know the length of the thing I want, but I know that the end of it is a CR/LF?
I'm communicating with a server trying to extract some information. The start point of the substring is well defined, but the end point can be variable. In other scripting languages, I'd expect there to be a find() command, but I haven't found one in PowerShell yet. Most articles and SE questions refer to Get-Content, substring, and Select-String, with the intent to replace a CRLF rather than just find it.
The device I am communicating with has a telnet-like command structure. It starts out with its model as a prompt. You can give it commands and it responds. I'm trying to grab the hostname from it. This is what a prompt, command, and response look like in a terminal:
TSS-752>hostname
Host Name: ThisIsMyHostname
TSS-752>
I want to extract the hostname. I came across IndexOf(), which seems to work like the find command I am looking for. ":" is a good start point, and then I want to truncate it to the next CRLF.
NOTE: I have made my code work to my satisfaction, but in the interest of not receiving anymore downvotes (3 at the time of this writing) or getting banned again, I will not post the solution, nor delete the question. Those are taboo here. Taking into account the requests for more info from the comments has only earned me downvotes, so I think I'm just stuck in the SO-Catch-22.

You could probably have found the first 20 examples in C# outlining this exact same approach, but here goes with PowerShell examples.
If you want to find the index at which CR/LF occurs, use String.IndexOf():
PS C:\> "  `r`n".IndexOf("`r`n")
2
Use it to calculate the length parameter argument for String.Substring():
$String = "    This phrase starts at index 4 ends at some point`r`nand then there's more"
# Define the start index
$Offset = 4
# Find the index of the end marker, searching from the start index onwards
$CRLFIndex = $String.IndexOf("`r`n", $Offset)
# Check that the end marker was actually found
if($CRLFIndex -eq -1){
    throw "CRLF not found in string"
}
# Calculate length based on end marker index - start index
$Length = $CRLFIndex - $Offset
# Generate substring
$Substring = $String.Substring($Offset,$Length)
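For comparison, here is a minimal sketch of the same find-then-slice idea in Python rather than PowerShell, applied to the device reply from the question. The function name extract_after_marker and its defaults are made up for illustration, and the reply string is an assumption about what the raw bytes look like (CRLF line endings, as the question describes).

```python
def extract_after_marker(response, marker=":", terminator="\r\n"):
    """Return the text between the first marker and the next terminator."""
    start = response.find(marker)
    if start == -1:
        raise ValueError("start marker not found")
    start += len(marker)
    # Search for the terminator only after the start position.
    end = response.find(terminator, start)
    if end == -1:
        raise ValueError("terminator not found after start marker")
    # strip() drops the space that follows the colon.
    return response[start:end].strip()


# Assumed raw form of the prompt, command and response shown in the question.
reply = "TSS-752>hostname\r\nHost Name: ThisIsMyHostname\r\nTSS-752>"
hostname = extract_after_marker(reply)
print(hostname)
```

Searching for the terminator from the start position onwards matters here, because the reply already contains a CRLF before the colon.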

Related

Regsubbing simple matches

I'm looking for a regsub example that does the following:
123tcl456TCL789 => 123!tcl!456!TCL!789
This is an Tcl example => This is an !Tcl! example
Yes, I could use string first to find a position and mash things together, but I have seen a regsub command in the past that does what I want and I can't recall it. What would be the regsub command that allows that? I would guess regsub -all -nocase is a start.
I am bad at regsub and regexps. I wonder if there is a site or tool/script that we can supply a string, the final result and then we get the regsub form.
You're looking at the right tool, but there are various options, depending on exactly what the conditions are when faced with other text. Here's one that wraps each occurrence of "Tcl" (any capitalisation) with exclamation marks:
set inputString "123tcl456TCL789"
set replaced [regsub -all -nocase {tcl} $inputString {!&!}]
puts $replaced
That's using a very simple regular expression with the -nocase option, and the replacement means "put ! on either side of the substring matched".
Another (perhaps more generally applicable) option is to put ! after any run of letters that is followed by a digit, and after any run of digits that is followed by a letter.
set replaced [regsub -all {[A-Za-z]+(?=[0-9])|[0-9]+(?=[A-Za-z])} $inputString {&!}]
Note that doing things correctly typically requires understanding the real input data fairly well. For example, whether the numbers include floating point numbers in scientific notation, or whether the substrings to delimit are of fixed length.
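As a cross-check of what the two substitutions do, here is a rough Python equivalent using re.sub. It is only a sketch: the function names are invented for illustration, and \g<0> plays the role that & plays in the Tcl replacement string.

```python
import re


def wrap_matches(text, word="tcl"):
    """Wrap every case-insensitive occurrence of word in exclamation marks."""
    return re.sub(re.escape(word), r"!\g<0>!", text, flags=re.IGNORECASE)


def mark_boundaries(text):
    """Put ! after letter runs followed by a digit, and digit runs followed by a letter."""
    pattern = r"[A-Za-z]+(?=[0-9])|[0-9]+(?=[A-Za-z])"
    return re.sub(pattern, r"\g<0>!", text)


print(wrap_matches("123tcl456TCL789"))
print(wrap_matches("This is an Tcl example"))
print(mark_boundaries("123tcl456TCL789"))
```

The two approaches agree on the first sample but not in general: mark_boundaries leaves "This is an Tcl example" unchanged because nothing in it sits next to a digit.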

How to parse logs and mask specific characters using Powershell

I have a problem that I really hope to get some help with.
It's rather complex but I will try and keep my explanation as simple and objective as possible. In a nutshell, I have log files that contain thousands of lines. Each line consists of information like date/time, source, type and message.
In this case the message contains a variable size ...999 password that I need to mask. Basically the message looks something like this (it's an ISO message):
year-day-month 00:00:00,computername,source, info,rx 0210 22222222222222333333333333333333444444444444444444444444455555008PASSWORD6666666666666666677777777777777777777777ccccdddddddddddffffffffffffff
For each line I need to zero in on password length identifier (008) do a count on it and then proceed to mask the number of following characters, which would be PASSWORD in this case. I would change it to something like XXXXXXXX instead so once done the line would look like this:
year-day-month 00:00:00,computername,source, info,rx 0210 22222222222222333333333333333333444444444444444444444444455555008XXXXXXXX6666666666666666677777777777777777777777ccccdddddddddddffffffffffffff
I honestly have no idea how to start doing this with PowerShell. I need to loop through each line in the log file, and identify the number of characters to mask.
I've kept this high level as a starting point, there are some other complexities that I hope to figure out at a later time, like the fact that there are different types of messages and depending on the type the password length starts at another character position. I might be able to build on my aforementioned question first but if anyone understands what I mean then I would appreciate some help or tips about that too.
Any help is appreciated.
Thanks!
Additional information to original post:
Firstly, thank you to everyone for your answers thus far; it's been greatly appreciated. Now that I have a baseline for how your answers are being formulated based on my information, I feel I need to provide some more details.
1) There was a question about whether or not the password starting position is fixed and the logic behind it.
The password position is not fixed. In an ISO message (which these are), the position of the password, like that of everything else in the message, depends on which data elements are present, and those in turn are indicated by the bitmap. The bitmap is also part of the message. So in my case, I need to script additional logic beyond the answers provided to come full circle.
2) This is what I know and these are the steps I hope to accomplish with the script.
What I know:
- There are 3 different msg types that contain passwords. I've figured out where the starting position of the password is for each msg type based on the bitmap and the data elements present.
For example 0210 contains one in this case:
year-day-month 00:00:00,computername,source, info,rx 0210 22222222222222333333333333333333444444444444444444444444455555008PASSWORD6666666666666666677777777777777777777777ccccdddddddddddffffffffffffff
What I need to do:
Pass the log file to the script
For each line in the log identify if the line has a msg type that contains a password
If the message type contains a password, then determine the length of the password by reading the 3 digits preceding the password ("ans ...999", which means alphanumeric - special with a max length of 999 and 3-digit length info). Let's say the character position of the password would be 107 in this case for argument's sake, so we know to read the 3 numbers before it.
Starting at the character position of the password, mask the number of characters required with XXX. Loop through log until complete.
It does seem as though you're indicating the position of the password and the length of the password will vary. As long as you have the '008' and something like '666' to indicate a starting and stopping point something like this should work.
$filePath = '.\YourFile.log'
(Get-Content $filePath) | ForEach-Object {
    $markerIndex = $_.IndexOf('008')
    $startIndex = $markerIndex + 3
    $endIndex = if ($markerIndex -ge 0) { $_.IndexOf('666', $startIndex) } else { -1 }
    if ($endIndex -lt 0) {
        # Markers not found: pass the line through unchanged
        $_
    } else {
        $passwordLength = $endIndex - $startIndex
        $obfuscation = 'X' * $passwordLength
        # Replace by position so identical text elsewhere in the line is untouched
        $_.Remove($startIndex, $passwordLength).Insert($startIndex, $obfuscation)
    }
} | Set-Content $filePath
If the file is too large to load into memory, then you will have to use StreamReader and StreamWriter to write the content to a new file and delete the old one.
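The streaming variant can be sketched as follows, in Python rather than with .NET StreamReader/StreamWriter. This is only an illustration of the idea: the function names are invented, and the '008' and '666' markers are the same assumptions as in the snippet above.

```python
import os
import tempfile


def mask_between(line, start_marker="008", end_marker="666"):
    """Replace the text between the two markers with X's; return the line as-is if either is missing."""
    marker = line.find(start_marker)
    if marker == -1:
        return line
    start = marker + len(start_marker)
    end = line.find(end_marker, start)
    if end == -1:
        return line
    return line[:start] + "X" * (end - start) + line[end:]


def mask_file(path):
    """Stream the file line by line into a temporary file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    # newline="" keeps the original line endings untouched.
    with os.fdopen(fd, "w", newline="") as out, open(path, newline="") as src:
        for line in src:
            out.write(mask_between(line))
    os.replace(tmp_path, path)
```

Only one line is held in memory at a time, and the original file is replaced only after the masked copy has been written completely.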
Assuming a fixed position where the password-length field starts, based on your sample line (if that position is variable, as you've hinted at, you need to tell us more):
$line = '22222222222222333333333333333333444444444444444444444444455555008PASSWORD6666666666666666677777777777777777777777ccccdddddddddddffffffffffffff'
$posStart = 62 # fixed 0-based pos. where length-of-password field starts
$pwLenFieldLen = 3 # length of length-of-password field
$pwLen = [int] $line.SubString($posStart, $pwLenFieldLen) # extract password length
$pwSubstitute = 'X' * $pwLen # determine the password replacement string
# replace the password with all Xs
$line -replace "(?<=^.{$($posStart + $pwLenFieldLen)}).{$pwLen}(?=.*)", $pwSubstitute
Note: This is not the most efficient way to do it, but it is concise.
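If the position of the length field differs per message type, as the question's update suggests, one possible shape is a lookup table keyed by message type. The sketch below is in Python and is purely illustrative: the table contents, the offset 10, and the name mask_password are all hypothetical, since the real offsets depend on the bitmap and are not given in the question.

```python
# Hypothetical table: message type -> 0-based offset of the 3-digit
# length field within the message body. Real offsets are not known here.
LENGTH_FIELD_OFFSETS = {"0210": 10}
LENGTH_FIELD_WIDTH = 3


def mask_password(msg_type, body):
    """Mask the password whose length is given by the 3-digit field before it."""
    offset = LENGTH_FIELD_OFFSETS.get(msg_type)
    if offset is None:
        return body  # this message type carries no password
    length_field = body[offset:offset + LENGTH_FIELD_WIDTH]
    if len(length_field) != LENGTH_FIELD_WIDTH or not length_field.isdecimal():
        return body  # malformed or truncated line: leave it alone
    pw_start = offset + LENGTH_FIELD_WIDTH
    pw_end = min(pw_start + int(length_field), len(body))
    return body[:pw_start] + "X" * (pw_end - pw_start) + body[pw_end:]


# Shortened, made-up body with the length field at offset 10.
body = "2222233333" + "008" + "PASSWORD" + "66667777"
print(mask_password("0210", body))
```

Because the mask is applied by position and length, it does not depend on what follows the password, unlike the marker-based approach.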

PCRE Regex - How to return matches with multiline string looking for multiple strings in any order

I need to use Perl-compatible regex to match several strings which appear over multiple lines in a file.
The matches need to appear in any order (server servernameA.company.com followed by servernameZ.company.com followed by servernameD.company.com or any order combination of the three). Note: All matches will appear at the beginning of each line.
In my testing with grep -P, I haven't even been able to produce a match on simple string terms that appear in any order over new lines (even when using the /s and /m modifiers). I am pretty sure from reading I need a look-ahead assertion but the samples I used didn't produce a match for me even after analyzing each bit of the regex to make sure it was relevant to my scenario.
Since I need to support this in Production, I would like an answer that is simple and relatively straight-forward to interpret.
Sample Input
irrelevant_directive = 0
# Comment
server servernameA.company.com iburst
additional_directive = yes
server servernameZ.company.com iburst
server servernameD.company.com iburst
# Additional Comment
final_directive = true
Expectation
The regex should match and return the 3 lines beginning with server (that appear in any order) if and only if there is a perfect match for strings 'serverA.company.com', 'serverZ.company.com', and 'serverD.company.com' followed by iburst. All 3 strings must be included.
Finally, if the answer (or a very similar form of the answer) can address checking for strings in any order on a single line, that would be very helpful. For example, if I have a single-line string of: preauth param audit=true silent deny=5 severe=false unlock_time=1000 time=20ms and I want to ensure the terms deny=5 and time=20ms appear in any order and if so match.
Thank you in advance for your assistance.
Regarding the main issue [for the secondary question see Casimir et Hippolyte answer] (using x modifier): https://regex101.com/r/mkxcap/5
(?:
(?<a>.*serverA\.company\.com\s+iburst.*)
|(?<z>.*serverZ\.company\.com\s+iburst.*)
|(?<d>.*serverD\.company\.com\s+iburst.*)
|[^\n]*(?:\n|$)
)++
(?(a)(?(z)(?(d)(*ACCEPT))))(*SKIP)(*F)
The matches are now all in the a, z and d capturing groups.
It's not the most efficient (it goes three times over each line with backtracking...), but the main takeaway is to register the matches with capturing groups and then check whether they are defined.
You don't need to use the PCRE features, you can simply write in ERE:
grep -E '.*(\bdeny=5\b.*\btime=20ms\b|\btime=20ms\b.*\bdeny=5\b).*' file
The PCRE approach will be different: (however you can also use the previous pattern)
grep -P '^(?=.*\bdeny=5\b).*\btime=20ms\b.*' file
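Where a single regex is not a hard requirement, the "all of these, in any order" check is often easier to maintain as a set comparison. The following Python sketch swaps in that technique; it is not PCRE and not what the answers above do. The server names follow the question's Expectation section, and the function names are invented for illustration.

```python
import re

REQUIRED = {"serverA.company.com", "serverZ.company.com", "serverD.company.com"}
# One "server <name> iburst" directive at the start of a line.
SERVER_LINE = re.compile(r"^server[ \t]+(\S+)[ \t]+iburst\b.*$", re.MULTILINE)


def matching_server_lines(text, required=REQUIRED):
    """Return the matching server lines only if every required name is present."""
    hits = [m for m in SERVER_LINE.finditer(text) if m.group(1) in required]
    found = {m.group(1) for m in hits}
    return [m.group(0) for m in hits] if found == set(required) else []


def has_all_terms(line, terms):
    """True if every term occurs as a whitespace-delimited token, in any order."""
    return all(
        re.search(r"(?<!\S)" + re.escape(term) + r"(?!\S)", line) for term in terms
    )
```

For the single-line case, has_all_terms(line, ["deny=5", "time=20ms"]) treats unlock_time=1000 as a different token from time=20ms, which is the same distinction the \b anchors make in the grep patterns.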

Powershell: search backwards from end of file

My script reads a log file once a minute and selects (and acts upon) the lines where the timestamp begins with the previous minute.
This is easy (the regex is simply "^$timestamp"), but when the log gets big it can take a while.
My thinking is the lines I want will always be near the bottom of the file, so I'd be searching far fewer lines if I started at the bottom and searched upwards, stopping when I get to the minute prior to the one I'm interested in.
My question is, how can I search from the bottom of the file instead of the top? Can I even say "read line $length", or even "read line n" (if so I could do a sort of binary search thing to find the length of the file and work backwards from there)?
Last question: would this even be faster (I'd still like to know how to do it even if it wouldn't be faster)?
Ideally, I'd like to do this all in my own code without installing anything extra.
Thanks
get-content bigfile.txt -tail 10
This works on huge files nearly instantly without any big memory usage.
I did it with a 22 GB text file in my testing.
Doing something like "Get-Content bigfile.txt | select -Last 10" works, but it seems to load all of the lines (objects, in PowerShell) before it does the select.
May I suggest just changing the regex to equal Get-Date + whatever time period you want?
For example (and this is without your log, so I apologize)
$a = Get-Date
$hr = $a.Hour
$min = $a.Minute
Then work off those values to build out the regex to select the times you want. And if you don't already use it, this website is awesome for building regexes quickly and easily: http://gskinner.com/RegExr/ .
Got another fix, I think you will like this..
$a = get-content .\biglog.text
Use the length to walk the array from back to front. Change Write-Host to Select-String with your regex, or whatever else you want to do in reverse:
foreach($x in ($a.Length - 1)..0){ write-host $a[$x] }
Another option, again after the Get-Content cmdlet: reverse the array in place, and then you are reading $a from bottom to top.
[array]::Reverse($a)
If you only want the last bit of the file, depending on the format, you can just do this:
Get-Content C:\Windows\WindowsUpdate.log | Select -last 10
This will return the last 10 lines found in the file.
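To illustrate what reading from the end involves when it is done by hand, here is a minimal Python sketch that seeks to the end of the file and reads fixed-size blocks backwards until enough lines are buffered. The name tail_lines, the block size, and the UTF-8 decoding are assumptions made for illustration; this is not a description of how Get-Content -Tail is implemented.

```python
import os


def tail_lines(path, count, block_size=4096):
    """Return the last count lines of a file, reading blocks from the end."""
    if count <= 0:
        return []
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        position = handle.tell()
        data = b""
        # Keep reading backwards until more than count newlines are buffered
        # (so the oldest wanted line is complete) or the start is reached.
        while position > 0 and data.count(b"\n") <= count:
            step = min(block_size, position)
            position -= step
            handle.seek(position)
            data = handle.read(step) + data
    lines = data.splitlines()[-count:]
    return [line.decode("utf-8", errors="replace") for line in lines]
```

The returned lines can then be filtered by timestamp prefix, so only the tail of a large log is ever read.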

Perl: pattern match a string and then print next line/lines

I am using Net::Whois::Raw to query a list of domains from a text file and then parse through this to output relevant information for each domain.
It was all going well until I hit Nominet results as the information I require is never on the same line as that which I am pattern matching.
For instance:
Name servers:
ns.mistral.co.uk 195.184.229.229
So what I need to do is pattern match for "Name servers:" and then display the next line or lines but I just can't manage it.
I have read through all of the answers on here but they either don't seem to work in my case or confuse me even further as I am a simple bear.
The code I am using is as follows:
while ($record = <DOMAINS>) {
$domaininfo = whois($record);
if ($domaininfo=~ m/Name servers:(.*?)\n/){
print "Nameserver: $1\n";
}
}
I have tried an example of Stackoverflow where
<DOMAINS>;
will take the next line but this didn't work for me and I assume it is because we have already read the contents of this into $domaininfo.
EDIT: Forgot to say thanks!
how rude.
So, the $domaininfo string contains your domain?
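What you probably need is to match across the line break. Your string is a multi-line string, so you can match on the \n character directly. The m modifier at the end of the regular expression only changes ^ and $ so that they match at line boundaries; it is harmless here but not what makes \n match. This works for me: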
my $domaininfo =<<DATA;
Name servers:
ns.mistral.co.uk 195.184.229.229
DATA
$domaininfo =~ m/Name servers:\n(\S+)\s+(\S+)/m;
print "Server name = $1\n";
print "IP Address = $2\n";
Now, I can match the \n at the end of the Name servers: line and capture the name and IP address which is on the next line.
This might have to be munged a bit to get it to work in your situation.
This is half a question and perhaps half an answer (the question's in here as I am not yet allowed to write comments...). Okay, here we go:
Name servers:
ns.mistral.co.uk 195.184.229.229
Is this what an entry in the file you're parsing looks like? What will follow immediately afterwards - more domain names and IP addresses? And will there be blank lines in between?
Anyway, I think your problem may (in part?) be related to your reading the file line by line. Once you get to the IP address line, the info about 'Name servers:' having been present will be gone. Multiline matching will not help if you're looking at your file line by line. Thus I'd recommend switching to paragraph mode:
{
local $/ = ''; # one paragraph instead of one line constitutes a record
while ($record = <DOMAINS>) {
# $record will now contain all consecutive lines that were NOT separated
# by blank lines; once there are >= 1 blank lines $record will have a
# new value
# do stuff, e.g. pattern matching
}
}
But then you said
I have tried an example of Stackoverflow where
<DOMAINS>;
will take the next line but this didn't work for me and I assume it is because we have already read the contents of this into $domaininfo.
so maybe you've already tried what I have just suggested? An alternative would be to just add another variable ($indicator or whatever) which you'll set to 1 once 'Name servers:' has been read, and as long as it's equal to 1 all following lines will be treated as containing the data you need. Whether this is feasible, however, depends on you always knowing what else your data file contains.
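The indicator-variable idea can be sketched like this, in Python rather than Perl. It assumes the section of interest ends at the first blank line, which may not hold for every whois format; the function name and the sample text (apart from the two lines quoted from the question) are invented for illustration.

```python
def lines_after_header(text, header="Name servers:"):
    """Collect the non-blank lines that directly follow the header line."""
    collected = []
    in_section = False  # plays the role of the $indicator flag
    for raw in text.splitlines():
        line = raw.strip()
        if in_section:
            if not line:
                break  # assumed: a blank line ends the section
            collected.append(line)
        elif line == header:
            in_section = True
    return collected


# Illustrative whois-style text; only the first two lines come from the question.
sample = (
    "Name servers:\n"
    "    ns.mistral.co.uk 195.184.229.229\n"
    "\n"
    "Other section:\n"
    "    ignored\n"
)
print(lines_after_header(sample))
```

This works on the whole whois reply held in one string, so it does not depend on how the domain list itself is read.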
I hope something in here has been helpful to you. If there are any questions, please ask :)