Trouble determining the pattern for NSRegularExpression...? - iphone

I am relatively new to NSRegularExpression and can't come up with a pattern to find a string within a string.
Here is the string:
##$294#001#[12345-678[123-456-7#15665#2
I want to extract this substring:
#001#[12345-678[123-456-7#
For more info: I know that there will be 3 digits (like 001) between the first two #'s and 20 characters between the last two #'s.
I have tried any number of combinations, but nothing seems to work. Any help is appreciated.

How about something like this:
#[0-9]{3}#.{20}#
If you know that the 20 characters will always consist of digits, [ and -, your pattern would become:
#[0-9]{3}#[0-9\[\-]{20}#
Be careful with the backslashes: when you create the pattern from a string literal (@"..."), you need to add an extra backslash before each backslash.
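As a quick sanity check of the two patterns above, here is a minimal sketch in Python rather than Objective-C (the variable names are illustrative, not from the original answer). Python's raw strings (r"...") play the role of the extra backslashes mentioned above, so the patterns can be written exactly as shown.

```python
import re

# Sample input from the question.
text = "##$294#001#[12345-678[123-456-7#15665#2"

# Loose pattern: '#', three digits, '#', any 20 characters, '#'.
loose = re.search(r"#[0-9]{3}#.{20}#", text)

# Stricter pattern: the 20 characters may only be digits, '[' or '-'.
strict = re.search(r"#[0-9]{3}#[0-9\[\-]{20}#", text)

print(loose.group(0))   # #001#[12345-678[123-456-7#
print(strict.group(0))  # #001#[12345-678[123-456-7#
```

Both patterns skip the leading ##$294 because neither # there is followed by exactly three digits and another #.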

You can test NSRegularExpression patterns without recompiling each time by using RegexTester https://github.com/liyanage/regextester

Related

Regex to remove data between 2 semicolons in Perl

I have a string that contains semicolons, and I want to remove all the data between each pair of semicolons and leave the rest as it is. I am using a Perl regex to remove the unwanted data from the string:
String:
$val="Data;test is here ;&data=1dffvdviofv;&dt&;&data=343";
Now I want to remove all the data between each pair of semicolons, throughout the string:
$val=~s/(.*)(\;.*\;)(.*)$/$1$3/g;
But this is not working for me. The final output should look like this:
Data &data=1dffvdviofv&data=343
One of the problems is that .* is greedy, that is, it will consume as much as it can. You can make it non-greedy by writing .*?, but that alone won't fix your regex since you've anchored it to the end of the string with $. Personally I don't think there is a need for the capture groups, you can just write
$val =~ s/;.*?;//g;
I'm assuming that the extra space in your expected output (Data &data...) is a typo.
You might also want to consider using a proper parser for whatever data format this is.
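To see the difference between the greedy and non-greedy forms, here is a small sketch in Python instead of Perl (re.sub replaces every match, like the /g flag; the variable names are illustrative):

```python
import re

val = "Data;test is here ;&data=1dffvdviofv;&dt&;&data=343"

# Greedy: .* runs from the first ';' to the LAST ';' in the string.
greedy = re.sub(r";.*;", "", val)

# Non-greedy: .*? stops at the NEXT ';', so each pair is removed separately.
non_greedy = re.sub(r";.*?;", "", val)

print(greedy)      # Data&data=343
print(non_greedy)  # Data&data=1dffvdviofv&data=343
```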

Partial String Replacement using PowerShell

Problem
I am working on a script that has a user provide a specific IP address, and I want to mask this IP in some fashion so that it isn't stored in the logs. I can easily do this when I know what the first three values of the IP typically are; however, I want to avoid hard-coding those values if at all possible. I also want to be able to replace the values even if the first three are unknown to me.
Examples:
10.11.12.50 would display as XX.XX.XX.50
10.12.11.23 would also display as XX.XX.XX.23
I have looked up partial string replacements, but none of the questions or problems that I found came close to doing this. I have tried doing things like:
# This ended up replacing all of the numbers
$tempString = $str -replace '[0-9]', 'X'
I know that I am partway there, but I am aiming to replace only the first 3 sets of digits (basically, every digit that comes before a '.'), and I haven't been able to achieve this.
Question
Is what I'm trying to do possible to achieve with PowerShell? Is there a best practice way of achieving this?
Here's an example of how you can accomplish this:
Get-Content 'File.txt' |
ForEach-Object { $_ -replace '\d{1,3}\.\d{1,3}\.\d{1,3}', 'xx.xx.xx' }
This example matches a digit 1-3 times, then a literal period, and repeats that pattern, so it will match anything from 0-999.0-999.0-999 and replace it with xx.xx.xx. The result of -replace is emitted directly to the pipeline; assigning it back to $_ would produce no output.
TheIncorrigible1's helpful answer is an exact way of solving the problem (replacement only happens if 3 consecutive .-separated groups of 1-3 digits are matched.)
A looser, but shorter solution that replaces everything but the last .-prefixed digit group:
PS> '10.11.12.50' -replace '.+(?=\.\d+$)', 'XX.XX.XX'
XX.XX.XX.50
(?=\.\d+$) is a (positive) lookahead assertion ((?=...)) that matches the enclosed subexpression (a literal . followed by 1 or more digits (\d) at the end of the string ($)), but doesn't include it in the overall match.
The net effect is that only what .+ matched - everything before the lookahead assertion's match - is replaced with 'XX.XX.XX'.
Applied to the above example input string, 10.11.12.50:
(?=\.\d+$) matches the .-prefixed digit group at the end, .50.
.+ matches everything before .50, which is 10.11.12.
Since the (?=...) part isn't captured, it is therefore not included in what is replaced, so it is only substring 10.11.12 that is replaced, namely with XX.XX.XX, yielding XX.XX.XX.50 as a result.
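The two approaches can be compared side by side in a short sketch. This is Python rather than PowerShell, and the function names mask_loose and mask_strict are made up for illustration. The strict variant is an assumption of mine that combines the digit-group pattern with the lookahead; it is not taken from either answer.

```python
import re

def mask_loose(ip):
    # Replace everything before the final ".digits" group.
    return re.sub(r".+(?=\.\d+$)", "XX.XX.XX", ip)

def mask_strict(ip):
    # Only replace when the input is three 1-3 digit groups
    # followed by a final ".digits" group.
    return re.sub(r"^\d{1,3}\.\d{1,3}\.\d{1,3}(?=\.\d{1,3}$)", "XX.XX.XX", ip)

print(mask_loose("10.11.12.50"))   # XX.XX.XX.50
print(mask_strict("10.12.11.23"))  # XX.XX.XX.23
print(mask_loose("hostname.5"))    # XX.XX.XX.5  (loose pattern accepts non-IPs)
print(mask_strict("hostname.5"))   # hostname.5  (left untouched)
```

The loose form is shorter but will happily rewrite input that is not an IP address, which may or may not matter for log masking.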

Removing Unique Characters in Different Parts of String

I'm trying to cut the ? character and the word pixels out of a column in a text file export.
Sample String: ?300 dpi
@{N='Dpi'; E={$_.'Horizontal resolution'.Split(" ")[0]}}
I am using split to successfully remove dpi although I also want to remove the ? at the start of the string.
"Name","Path","BaseName","Dpi","Width(Pixels)","Height(Pixels)","DpiTest"
"test.png","\\directory\TCG\Labels\test.png","test","?300","?2623","?1229","?2623 pixels"
You can use the TrimStart() method to remove one or more characters at the start of a string:
$_.'Horizontal resolution'.Split(" ")[0].TrimStart('?')
But I would suggest using the -replace operator for both operations:
$_.'Horizontal resolution' -replace '\?(\d+).*','$1'
The regex matches a literal ?, then 1 or more digits (captured as group 1), then anything that follows, and replaces the whole match with just the digits.
Or simply:
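For comparison, the same capture-and-replace idea looks like this in Python (a sketch only; the function name strip_marker_and_unit is hypothetical, and \1 is Python's spelling of PowerShell's $1):

```python
import re

def strip_marker_and_unit(value):
    # Match a literal '?', capture the digits, swallow the rest,
    # then keep only the captured digits.
    return re.sub(r"\?(\d+).*", r"\1", value)

print(strip_marker_and_unit("?300 dpi"))      # 300
print(strip_marker_and_unit("?2623 pixels"))  # 2623

# Rough equivalent of Split(" ")[0].TrimStart('?'):
print("?300 dpi".split(" ")[0].lstrip("?"))   # 300
```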
$_.'Horizontal resolution'.Split(" ")[0].Replace('?', '')

How to use numbers as delimiters in MATLAB strsplit function

As the title suggests, I'm looking to detect where the numbers are in a string and then take just the non-numeric substring from the larger string. For example:
If I have, say, zero89 or eight78, I would just like zero or eight returned. When using the strsplit function I have:
strsplit('zero89', but what do I put here?)
Would you be interested in regexp, which gives you more options to explore?
Extract numeric digits -
regexp('zero89','\d','match')
Extract anything other than digits -
regexp('zero89','\d+','Split')
strsplit('zero89', '\d', 'DelimiterType', 'RegularExpression')
Or just using regexp:
regexp('zero89','\D+','match')
I got the \D+ from here
Assuming you mean this strsplit?
strsplit('zero89', '8')
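The three regex ideas from the answers above (split on digits, match non-digits, match digits) behave the same way in most regex engines. A minimal sketch in Python, not MATLAB, with illustrative variable names:

```python
import re

s = "zero89"

# Split on runs of digits; the trailing empty string is what follows "89".
parts = re.split(r"\d+", s)

# Match runs of non-digits directly.
words = re.findall(r"\D+", s)

# Match each digit individually.
digits = re.findall(r"\d", s)

print(parts)   # ['zero', '']
print(words)   # ['zero']
print(digits)  # ['8', '9']
```

Matching \D+ avoids the empty trailing element that splitting produces.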

matlab regexprep

How can I use MATLAB's regexprep for multiple expressions and replacements?
file='http:xxx/sys/tags/Rel/total';
I want to replace 'sys' with 'sys1' and 'total' with 'total1'. For a single expression and replacement it works like this:
strrep(file,'sys', 'sys1')
and I would like something like
strrep(file,'sys','sys1','total','total1')
I know this doesn't work for strrep.
Why not just issue the command twice?
file = 'http:xxx/sys/tags/Rel/total';
file = strrep(file,'sys','sys1');
file = strrep(file,'total','total1')
To solve it you need regex substitution with capture groups. Look in MATLAB's regex functions for something similar to this PHP example:
$string = 'http:xxx/sys/tags/Rel/total';
preg_replace('/http:(.*?)\//', 'http:${1}1/', $string);
${1} refers to the first capture group, that is, whatever the parenthesized (.*?) matched.
http:(.*?)\/ - match pattern
http:${1}1/ - replacement pattern; the second 1 is the literal character to append (the first 1 is the group number)
http:xxx/sys/tags/Rel/total - input string
The trick is that whatever is matched by (.*?) (whether xxx or yyyy or 1234) is inserted in place of ${1} in the replacement pattern, and the result then replaces the matched text in the input string. The PHP documentation has more examples of this substitution functionality.
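The same group-reference idea can be sketched in Python, where \g<1> plays the role of PHP's ${1}. The braces (or angle brackets) matter for the same reason in both languages: without them, the group number and the literal 1 that follows would run together and be read as group 11.

```python
import re

s = "http:xxx/sys/tags/Rel/total"

# (.*?) lazily captures everything between "http:" and the first "/".
# \g<1> inserts that capture; the following "1" is a literal character.
result = re.sub(r"http:(.*?)/", r"http:\g<1>1/", s)

print(result)  # http:xxx1/sys/tags/Rel/total
```

Note that this pattern, like the PHP one, appends 1 to the segment after http: rather than to sys or total.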
As documented in the help page for regexprep, you can specify pairs of patterns and replacements like this:
file='http:xxx/sys/tags/Rel/total';
regexprep(file, {'sys' 'total'}, {'sys1' 'total1'})
ans =
http:xxx/sys1/tags/Rel/total1
It is even possible to use tokens, should you be able to define a match pattern for everything you want to replace:
regexprep(file, '/([st][yo][^/$]*)', '/$11')
ans =
http:xxx/sys1/tags/Rel/total1
However, care must be taken with the first approach under certain circumstances, because MATLAB applies the pairs one after another. If the first pattern matches and its replacement produces text that a later pattern also matches, that text is replaced again, even though it would not have matched the later pattern in the original string.
Example:
regexprep('This\is{not}LaTeX.', {'\\' '([{}])'}, {'\\textbackslash{}' '\\$1'})
ans =
This\textbackslash\{\}is\{not\}LaTeX.
=> This\{}is{not}LaTeX.
and
regexprep('This\is{not}LaTeX.', {'([{}])' '\\'}, {'\\$1' '\\textbackslash{}'})
ans =
This\textbackslash{}is\textbackslash{}{not\textbackslash{}}LaTeX.
=> This\is\not\LaTeX.
Both results are unintended, and there seems to be no way around this with consecutive replacements instead of simultaneous ones.
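To illustrate the difference between consecutive and simultaneous replacement, here is a sketch in Python, not MATLAB. The helper replace_simultaneously is hypothetical: it builds one alternation from all the keys and replaces each match in a single pass, so replacement text is never rescanned. It only handles literal strings, not general patterns.

```python
import re

def replace_simultaneously(text, mapping):
    # One pass over the text: longer keys first so they win over prefixes.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group(0)], text)

text = "This\\is{not}LaTeX."

# Consecutive: the braces added by the first step are escaped by the second.
step1 = text.replace("\\", "\\textbackslash{}")
sequential = re.sub(r"([{}])", r"\\\1", step1)

# Simultaneous: each original character is replaced exactly once.
latex = {"\\": "\\textbackslash{}", "{": "\\{", "}": "\\}"}
simultaneous = replace_simultaneously(text, latex)

print(sequential)    # This\textbackslash\{\}is\{not\}LaTeX.
print(simultaneous)  # This\textbackslash{}is\{not\}LaTeX.
```

The same helper also covers the original question: replace_simultaneously('http:xxx/sys/tags/Rel/total', {'sys': 'sys1', 'total': 'total1'}) gives http:xxx/sys1/tags/Rel/total1.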