Alternating string - powershell

I am trying to use a regular expression to match a string that starts with 7 digits, then has a "K", and then ends with 3 more digits. For example:
1234567K890.
I currently have $_a -match '^\d{7}K\d{3}'. However, this does not work for my purposes. Does anyone have a solution?

PS C:\> "1234567K890" -match "\d{7}(k)\d{3}"
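Here \d{7} matches 7 digits, (k) matches the letter K (-match is case-insensitive by default), and \d{3} matches three digits. Because the pattern is not anchored, it also matches when the string contains extra characters before or after.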

Tested this, works for your example and some others:
$string = "1234567K890"
$string -match '^[0-9]{7}(k)[0-9]{3}$'
It matches exactly 7 digits, then a K (case does not matter), then exactly 3 digits. The ^ and $ characters at the beginning and end of the pattern anchor the match to the start and end of the string, which rules out whitespace (or anything else) before or after -- if you want to allow that, you can just remove them.
Here's a powershell regex reference, which may help in the future.
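To see how the anchors and the case-insensitive default interact, here is a minimal sketch. The sample inputs are made up for illustration and are not from the question; -cmatch is the case-sensitive counterpart of -match.

```powershell
# Anchored pattern: 7 digits, K, 3 digits, and nothing else.
$pattern = '^\d{7}K\d{3}$'

# Hypothetical sample inputs (assumptions, not taken from the question).
$samples = '1234567K890', '1234567k890', '12345678K890', '1234567K8901', ' 1234567K890'

$results = foreach ($s in $samples) {
    [pscustomobject]@{
        Input         = $s
        Match         = $s -match $pattern     # case-insensitive by default
        CaseSensitive = $s -cmatch $pattern    # case-sensitive variant
    }
}
$results | Format-Table -AutoSize
```

Only the first two samples should match with -match, and only the first with -cmatch; the extra digit and the leading space are rejected by the anchors.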

Related

powershell -ilike operations too similar

Is there a way to say that a * may only stand for a numeric value of, say, two digits?
I want to select items, but the ways I can tell them apart are very limited.
I want to store values like "31.04.2003" with the following line of code:
$contentDateReal = $content_ -ilike '*"*.*.*",'
This works most of the time, but sometimes I get values like: "Installation Acrobat Reader 10.0.1 "
Those also pass the -ilike filter, but I don't want them. Is there a way to say that I only want values that contain numbers, with exactly 2 digits ("xx") before the first dot, 2 digits ("xx") after the first dot, and four digits after the second dot, like "xxxx" or "2020"?
While you can use character ranges such as [0-9] to match a character (digit) in that range, PowerShell's wildcard expressions do not support matching a varying number of these characters.
That is, '10' -like '[0-9][0-9]' is $true, but '2' -like '[0-9][0-9]' is not.
Note: -ilike is just an alias for -like, which is case-insensitive by default, as all PowerShell operators are; conversely, use -clike for case-sensitive matching. This naming convention applies to all operators that (also) process text.
While you do want to match fixed numbers of digits, matching with a fixed number of [0-9] ranges may still yield false positives if additional digits are present at the start or at the end, so to rule these out you need the more sophisticated matching that regular expressions (regexes) provide.
PowerShell supports regexes via the -match operator (among others), so you could use the following:
('Some Software 31.04.2003', 'Installation Acrobat Reader 10.0.1').ForEach({
if ($_ -match '\b(\d{2}\.\d{2}\.\d{4})\b') {
"'$_' matched; extracted version number: $($Matches[1])"
}
})
The above yields the following, because only the first string matched:
'Some Software 31.04.2003' matched; extracted version number: 31.04.2003
Explanation of the regex:
\b matches at a word boundary, i.e. a position where a word character (a letter, a digit, or _) is adjacent to something other than a word character (which can include the start and end of the string).
\d matches a digit (roughly equivalent to [0-9], the latter limiting matching to the decimal digits in the ASCII sub-range of Unicode); {2}, for instance, stipulates that exactly 2 instances of digits must be present.
\. represents a verbatim . (it must be \-escaped, because . is a regex metacharacter representing any character).
Enclosing a subexpression in (...) creates a so-called capture group, which additionally captures what the subexpression matched, and makes that available starting with index 1 (for the first of potentially multiple (unnamed) capture groups) in the automatic $Matches variable.
Note that -match - unlike -like - matches substrings by default, so there's no need to also match what comes before or after the version number.
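As a rough side-by-side of the wildcard filter from the question and a regex filter, here is a small sketch. The three input lines are assumptions modeled on the values quoted in the question, and the regex includes the surrounding double quotes, which is also an assumption about the input format.

```powershell
# Hypothetical input lines, modeled on the values quoted in the question.
$lines = '"31.04.2003",', '"Installation Acrobat Reader 10.0.1 ",', '"1.4.2003",'

# Wildcard filter from the question: any text with two dots inside quotes.
$wildcard = @($lines -like '*"*.*.*",')

# Regex filter: exactly 2 digits, dot, 2 digits, dot, 4 digits between the quotes.
$regex = @($lines -match '"\d{2}\.\d{2}\.\d{4}"')

"Wildcard kept $($wildcard.Count) line(s); regex kept $($regex.Count) line(s)."
```

With an array on the left-hand side, both -like and -match act as filters and return the matching elements, so the wildcard keeps all three lines while the regex keeps only the first.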

Design Powershell script for find the Numbers which contain file

Could everyone help me design a script to find the numbers the file contains?
For example:
20200514_EE#998501_12.
I need the number 12, and then to write it to a txt file.
The content will be generated with different sequence numbers.
For example: #20200514_EE#998501_123.#
So here I need the number 123, and then to write it to the txt file.
How can I write this script in PowerShell or as a bat file?
Much appreciated!
Thanks
Tony
You can do the following as a start. You have not provided enough information/examples to work through any issues you are experiencing.
'#20200514_EE#998501_123.#' -replace '^.*?(\d+)\D*$','$1'
'#20200514_EE#998501_123' -replace '^.*?(\d+)\D*$','$1'
-replace uses regex matching and then replaces with a string and/or matched substitute. ^ is the start of the string. .*? lazily matches all characters. \d+ matches one or more digits in a capture group due to the encapsulating (). \D* matches zero or more non-digits. $ matches the end of the string. For the replacement, $1 is capture group 1, which is what was captured by (\d+).
You can also use the .Split() method in combination with -replace.
'#20200514_EE#998501_123.#'.Split('_')[-1] -replace '\D+$'
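Since the question also asks to write the number to a txt file, here is a minimal sketch of that last step. It uses -match with a capture group rather than -replace, and the output path (numbers.txt in the temp folder) is purely an assumption for illustration.

```powershell
# Sample inputs taken from the question.
$names = '20200514_EE#998501_12.', '#20200514_EE#998501_123.#'

# Capture the digits after the last underscore, allowing trailing non-digits.
$numbers = foreach ($name in $names) {
    if ($name -match '_(\d+)\D*$') { $Matches[1] }
}

# Hypothetical output location; adjust to wherever the file should go.
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'numbers.txt'
$numbers | Set-Content -Path $outFile
```

Set-Content overwrites the file on each run; Add-Content would append instead.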

How to remove characters after the Nth matching character from a string in powershell

How do I remove all the characters after the nth occurrence of a matching character in PowerShell?
Example:
\1CF\0101\FIXED\PIPING\0101-000\0101-000-000\Crkg_O_S_I\1997_O_S_I
I want to remove all the characters after the 7th "\" so the output would be
\1CF\0101\FIXED\PIPING\0101-000\0101-000-000 or
\1CF\0101\FIXED\PIPING\0101-000\0101-000-000\
Either output is fine.
[moving this from my comment] I think splitting the string into parts based on backslashes, then taking the first 7 parts and ignoring the rest, and joining those 7 back up with new backslashes is quite a short, sensible approach:
$string.split('\')[0..6] -join '\'
Another approach would be to repeatedly call $index = $string.IndexOf('\', $index + 1) until it has found the location of the 7th backslash, and then use $string.Substring(). For a small saving of memory (no array created to hold the split pieces), this is likely not worth it.
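For completeness, the IndexOf approach could look roughly like this. The variable names ($n, $index, $result) are my own, and returning the whole string when there are fewer than $n backslashes is an assumption about the desired behavior.

```powershell
$string = '\1CF\0101\FIXED\PIPING\0101-000\0101-000-000\Crkg_O_S_I\1997_O_S_I'
$n = 7          # which occurrence of '\' to cut at
$index = -1     # so that the first search starts at position 0

for ($i = 0; $i -lt $n; $i++) {
    $index = $string.IndexOf('\', $index + 1)
    if ($index -lt 0) { break }   # fewer than $n backslashes present
}

# Keep everything before the nth backslash, or the whole string if not found.
$result = if ($index -ge 0) { $string.Substring(0, $index) } else { $string }
$result
```

Using Substring(0, $index + 1) instead would keep the trailing backslash, which the question also accepts.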

Text file search for match strings regex

I am trying to understand how regex works and what are the possibilities of working with it.
So I have a txt file and I am trying to search for 8 char long strings containing numbers. for now I use a quite simple option:
clear
Get-ChildItem random.txt | Select-String -Pattern [0-9][a-z] | foreach {$_.line}
It sort of works but I am trying to find a better option. ATM it takes too long to read through the left out text since it writes entire lines and it does not filter them by length.
You can use a lookahead to assert that a string contains at least 1 digit, then specify the length of the match and finally anchor it with ^ (start of string) and $ (end of string) if the string is on a line of its own, or \b (word boundary) if it's part of an HTML document as your comments seem to suggest:
Get-ChildItem C:\files\ |Select-String -Pattern '^(?=.*\d)\w{8}$'
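Get-ChildItem C:\files\ |Select-String -Pattern '\b(?=\w*\d)\w{8}\b'
In the \b variant the lookahead uses \w* rather than .*, so that the digit has to occur inside the 8-character word itself and not just somewhere later on the same line.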
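Note that Select-String still reports whole lines by default. If only the matched 8-character tokens are wanted, something along these lines should work. The sample lines are made up, and the lookahead is written as (?=\w*\d) on the assumption that the digit should sit inside the candidate word.

```powershell
# Hypothetical sample lines standing in for the file content.
$lines = 'id abc12345 and abcdefgh', 'then 12345678 x1'

# -AllMatches collects every match per line; .Matches.Value yields just the text.
$tokens = $lines |
    Select-String -Pattern '\b(?=\w*\d)\w{8}\b' -AllMatches |
    ForEach-Object { $_.Matches.Value }

$tokens
```

Here abcdefgh is skipped because it contains no digit and x1 because it is too short. For files, the same pipeline can start from Get-ChildItem instead of a string array.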
The pattern [0-9][a-z] matches a digit followed by a letter. If you want to match a sequence of 8 characters use .{8}. The dot in regular expressions matches any character except newlines. A number in curly brackets matches the preceding expression the given number of times.
If you want to match non-whitespace characters use \S instead of .. If you want to match only digits and letters use [0-9a-z] (a character class) instead of ..
For a more thorough introduction please go find a tutorial. The subject is way too complex to be covered by a single answer on SO.
What you're currently searching for is a single number ranging from 0-9 followed by a single lowercase letter ranging from a-z.
This, for example, will match any run of 8 word characters (letters, digits, or underscore):
\w{8}
I often forget what some regex classes are, so I use this as a point of reference, and it may be useful to you as a learning tool: http://regexr.com/
It can also validate what you're typing inline via a text field so you can see if what you're doing works or not.
If you need more of a tutorial than a reference, I found this extremely useful when I learned: regexone.com

Need Regular expression - perl

I am looking for a regex for the expression below:
my $text = "1170 KB/s (244475 bytes in 2.204s)"; # I want to retrieve last ‘2.204’ from this String.
$text =~ m/\d+[^\d*](\d+)/; #Regexp
my $num = $1;
print " $num ";
Output:
204
But I need 2.204 as output, please correct me.
Can any one help me out?
The regex is doing exactly what you asked it to: it matches digits \d+, followed by one character that is neither a digit nor a star [^\d*], followed by digits \d+. The only place that matches in your string is 2.204, and because the parentheses enclose only the last \d+, only 204 is captured.
If you want a quick fix, you can just move the parentheses:
m/(\d+[^\d*]\d+)/
This would (with the above input) match what you want. A more exact way to put it would be:
m/(\d+\.\d+)/
Of course this will match any floating-point number, so if your string can contain more of those, that's not a good idea. You can shore it up by using an anchor, like so:
m/(\d+\.\d+)s\)/
Where s\) forces the match to occur at only that place. Further strictures:
m/\(\d+\D+(\d+\.\d+)s\)/
You might also want to account for the possibility of your target number not being a float:
m/\(\d+\D+(\d+\.?\d*)s\)/
By using ? and * we allow for those parts not to match at all. This is not recommended to do unless you are using anchors. You can also replace everything in the capture group with [\d.]+.
If you are not fond of matching the parentheses, you can match the text:
m/bytes in ([\d.]+)s/
I'd go with the seconds marker (the trailing s) as an indicator of where you are in the string:
my ($num) = ($text =~ /(\d+\.\d+)s/);
with explanations:
/( # start of matching group
\d+ # first digits
\. # a literal '.', take \D if you want non-numbers
\d+ # second digits
)/x # close the matching group and the regex
You had the capture group in the wrong place. Also, [^\d] is a bit excessive; generally you can negate the backslashed special classes (\d, \h, \s and \w) with their respective uppercase letter, so [^\d] is just \D.
Try this regex:
$text =~ m/\d+[^\d]*(\d+\.?\d*)s/;
That should capture 1+ digits, a decimal point if there is one, and 0 or more decimal places, and make sure the number is followed by an "s".