Using AMPScript to extract text from a String after nth delimiter - substring

I need to extract the text that follows the nth delimiter in a string.
In this case, the delimiter is an underscore.
The challenge is that the last delimiter could be in the 2nd, 3rd, 4th or 5th position.
Example:
LB_AB_BookingReminder_123-1-2-1S (3rd position)
LB_AB_123-1-2-1S (2nd position)
LB_AB_Booking_Reminder_123-1-2-1S (4th position)
Output Needed: 123-1-2-1S
Thank You

Using the RegExMatch() function may be your best bet:
%%[
set @pattern = "^.+_(.+\-\d+\-\d+\-.+)"
set @s1 = "LB_AB_BookingReminder_123-1-2-1S"
set @s2 = "LB_AB_123-1-2-1S"
set @s3 = "LB_AB_Booking_Reminder_123-1-2-1S"
set @match1 = RegExMatch(@s1, @pattern, 1)
set @match2 = RegExMatch(@s2, @pattern, 1)
set @match3 = RegExMatch(@s3, @pattern, 1)
]%%
<br>%%=v(@match1)=%%
<br>%%=v(@match2)=%%
<br>%%=v(@match3)=%%
Output:
123-1-2-1S
123-1-2-1S
123-1-2-1S
Regex101 snippet: https://regex101.com/r/DJeKjd/1
AMPscript tester: https://mcsnippets.herokuapp.com/s/F4WISQvc
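For anyone who wants to sanity-check the pattern outside Marketing Cloud, here is a minimal Python sketch. It assumes Python's re module treats this pattern the same way RegExMatch() does, and the helper names extract_suffix and after_last_underscore are made up for illustration. The second helper is a regex-free alternative that takes whatever follows the last underscore, which only works if the wanted part never contains an underscore itself.

```python
import re

# Same pattern as in the AMPscript answer (hyphens need no escaping here).
PATTERN = re.compile(r"^.+_(.+-\d+-\d+-.+)")


def extract_suffix(name: str) -> str:
    """Return capture group 1, or an empty string when nothing matches."""
    m = PATTERN.match(name)
    return m.group(1) if m else ""


def after_last_underscore(name: str) -> str:
    """Regex-free alternative: everything after the final underscore."""
    return name.rpartition("_")[2]


samples = [
    "LB_AB_BookingReminder_123-1-2-1S",
    "LB_AB_123-1-2-1S",
    "LB_AB_Booking_Reminder_123-1-2-1S",
]

for s in samples:
    print(extract_suffix(s), after_last_underscore(s))
```

The greedy leading .+ is what makes the position of the last underscore irrelevant: it consumes as much as it can and only backs off far enough for the rest of the pattern to match.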

Related

Extract or import two Specific values form Text file - MATLAB

I am working on a problem in MATLAB in which I need to import two specific values (highlighted) from a text file, as shown in Figure 1.
The corresponding .txt file is attached at the following link (Link).
When you want to extract specific values from a text file you should always consider using regular expressions.
For getting the two values you highlighted you can use:
raw = fileread('M7_soil_FN_1.txt');
val = regexp(raw,'(\d+)\s+(\d+\.\d+)\s+(?=NPTS)','tokens')
The regular expression says:
(\d+) Match digits and capture.
\s+ Match whitespace.
(\d+\.\d+) Match and capture digits, a full stop, more digits.
\s+ Match whitespace.
(?=NPTS) Positive lookahead to ensure that what follows is NPTS.
Then convert val to double:
val = str2double(val{:})
>>val(1)
5991
>>val(2)
0.0050
If you are interested, you can see the regular expression in action live here.
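To see what the pattern captures without having the original file, here is a small Python sketch. The sample text is invented for illustration (the real contents of M7_soil_FN_1.txt are not reproduced here); the only thing assumed is the layout "integer, decimal number, NPTS". The function name read_npts_dt is also hypothetical.

```python
import re

# Hypothetical header text; only the "<int> <decimal> NPTS" layout is assumed.
sample = "ACCELERATION TIME HISTORY\n  5991    0.0050    NPTS, DT\n"

# Same pattern as in the MATLAB answer: two captured numbers, then a
# lookahead that requires NPTS without consuming it.
pattern = re.compile(r"(\d+)\s+(\d+\.\d+)\s+(?=NPTS)")


def read_npts_dt(text):
    """Return the two captured values as (int, float)."""
    m = pattern.search(text)
    if m is None:
        raise ValueError("no '<int> <decimal> NPTS' sequence found")
    return int(m.group(1)), float(m.group(2))


npts, dt = read_npts_dt(sample)
print(npts, dt)
```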
Hopefully this will work.
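filename = 'M7_soil_FN_1.txt'; % file name taken from the answer above
delimiter = ' ';
startRow = 7; % row to read (the first startRow-1 lines are skipped as headers)
fileID = fopen(filename,'r');
formatSpec = '%f%f%[^\n\r]';
% apply the format once, so only a single row is parsed
dataArray = textscan(fileID, formatSpec, 1, 'Delimiter', delimiter, 'HeaderLines', startRow-1, 'ReturnOnError', false, 'MultipleDelimsAsOne', 1, 'EndOfLine', '\r\n');
fclose(fileID);
val1 = dataArray{1,1};
val2 = dataArray{1,2};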

strsplit excluding one, but including another

A very short question. I have a string
str = 'var(:,1),var(:,2),var(:,3)';
I need to split it with strsplit by ',' but not by ':,' so that I will end up with a cell array
cel = {'var(:,1)','var(:,2)','var(:,3)'};
I am not good with regular expressions at all. I tried ,^(:,) but this fails; I thought ^ means "not" and () is a group.
How can it be done?
Use a regular expression with negative lookbehind:
cel = regexp(str, '(?<!:),', 'split');
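The lookbehind (?<!:) makes the pattern match a comma only when the character before it is not a colon. As a quick cross-check outside MATLAB, a minimal Python sketch (assuming Python's re handles this lookbehind the same way; split_outside_colon is a made-up name):

```python
import re


def split_outside_colon(text: str) -> list:
    """Split on commas that are NOT immediately preceded by a colon."""
    return re.split(r"(?<!:),", text)


s = "var(:,1),var(:,2),var(:,3)"
print(split_outside_colon(s))
```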

Extract character between two specified characters of a string using Matlab

I have a string and want to extract the numbers between the characters 'w' and 's'. The positions of the characters vary between different strings.
For example:
s = '1w12s01'
desired result: '12'
and
s = '102w22s21'
desired result: '22'
It can also be done using a regular expression with lookahead and lookbehind:
regexp(s,'(?<=w).*(?=s)','match')
The function strfind will do this easily enough. This will work as long as the number is always directly between a 'w' and an 's', both occur only once in the target string, and the number you're after is the only thing between those two characters.
s = '102w22s21';
r = s((strfind(s, 'w')+1):(strfind(s, 's')-1));
Use this:
e = extractBetween(s,'w','s');
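One caveat on the lookaround pattern: .* is greedy, so if the string contained a second 's' the match would run up to the last one. A minimal Python sketch of the difference, assuming the wanted part is always digits (the function names are made up for illustration):

```python
import re


def between_greedy(text: str) -> str:
    """Pattern from the answer: anything between 'w' and the LAST 's'."""
    m = re.search(r"(?<=w).*(?=s)", text)
    return m.group(0) if m else ""


def between_digits(text: str) -> str:
    """Stricter variant: only digits between 'w' and the next 's'."""
    m = re.search(r"(?<=w)\d+(?=s)", text)
    return m.group(0) if m else ""


for s in ["1w12s01", "102w22s21", "1w12s01s"]:
    print(s, between_greedy(s), between_digits(s))
```

For the two strings in the question both variants agree; they only differ when more than one 's' follows the 'w'.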

Regexp to find a matching condition in a string

Hi, I need help using regexp for condition matching.
For example, my file has the following content:
{hello.program='function';
bye.program='script'; }
I am trying to use regexp to match the strings that contain .program='function':
pattern = '[.program]+\=(function)'
I also tried pattern='[^\n]*(.hello=function)[^\n]*';
pattern_match = regexp(myfilename,pattern , 'match')
but this returns pattern_match={} while I expect the result to be hello.program='function';
If 'function' comes with string-markers, you need to include these in the match. Also, you need to escape the dot (otherwise, it's considered "any character"). [.program]+ looks for one or several letters contained in the square brackets - but you can just look for program instead. Also, you don't need to escape the =-sign (which is probably what messed up the match).
cst = {'hello.program=''function''';'bye.program=''script'''; };
pat = 'hello\.program=''function''';
out = regexp(cst,pat,'match');
out{1}{1} %# first string from list, first match
hello.program='function'
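To make the character-class point concrete, here is a small Python sketch (assuming Python's re behaves like MATLAB's regexp for these simple patterns; the helper name matches is made up). It shows that [.program]+ matches any run of those characters, and that the pattern from the question fails because the quote after = is never accounted for.

```python
import re

lines = ["hello.program='function'", "bye.program='script'"]

# Pattern from the question (an escaped = is the same as a plain =).
original = re.compile(r"[.program]+=(function)")
# Pattern from the answer: escaped dot, quotes included.
fixed = re.compile(r"hello\.program='function'")


def matches(pattern, items):
    """Return the items in which the pattern is found."""
    return [x for x in items if pattern.search(x)]


# A character class matches any of its characters, in any order.
print(re.findall(r"[.program]+", "grammar."))
print(matches(original, lines))
print(matches(fixed, lines))
```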
EDIT
In response to the comment
my file contains
m2 = S.Parameter;
m2.Value = matlabstandard;
m2.Volatility = 'Tunable';
m2.Configurability = 'None';
m2.ReferenceInterfaceFile ='';
m2.DataType = 'auto';
my objective is to find all the lines that match, .DataType='auto'
Here's how you find the matching lines with regexp
%# read the file with textscan into a variable txt
fid = fopen('myFile.m');
txt = textscan(fid,'%s','Delimiter','\n'); %# one cell per line; the default delimiter would split on spaces
fclose(fid);
txt = txt{1};
%# find the match. Allow spaces and equal signs between DataType and 'auto'
match = regexp(txt,'\.DataType[ =]+''auto''','match')
%# match is not empty only if a match was found. Identify the non-empty match
matchedLine = find(~cellfun(@isempty,match));
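The same line-by-line search can be sketched in Python to check the pattern against the sample from the comment. This is only an illustration: the lines are typed in by hand instead of being read from myFile.m, and matching_line_numbers is a made-up name. Line numbers are 1-based to mirror what find returns in MATLAB.

```python
import re

# Sample lines copied from the comment above.
lines = [
    "m2 = S.Parameter;",
    "m2.Value = matlabstandard;",
    "m2.Volatility = 'Tunable';",
    "m2.Configurability = 'None';",
    "m2.ReferenceInterfaceFile ='';",
    "m2.DataType = 'auto';",
]

# Allow any mix of spaces and equal signs between DataType and 'auto'.
pattern = re.compile(r"\.DataType[ =]+'auto'")


def matching_line_numbers(items):
    """Return 1-based numbers of the lines that contain a match."""
    return [i for i, line in enumerate(items, start=1) if pattern.search(line)]


print(matching_line_numbers(lines))
```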
Try this as it matches .program='function' exactly:
(\.)program='function'
I think this did not work:
'[.program]+\=(function)'
because of how the []'s work. Here is a link explaining why I say that: http://www.regular-expressions.info/charclass.html

Working string in MATLAB

I have the following string in MATLAB, for example
##%%F1_USA(40)_u
and I want
F1_USA_40__u
Is there a function for this?
Your best bet is probably regexprep which allows you to replace parts of a string using regular expressions:
s_new = regexprep(regexprep(s, '[()]', '_'), '[^A-Za-z0-9_]', '')
Update: based on your updated comment, this is probably what you want:
s_new = regexprep(regexprep(s, '^[^A-Za-z0-9_]*', ''), '[^A-Za-z0-9_]', '')
or:
s_new = regexprep(regexprep(s, '[^A-Za-z0-9_]', '_'), '^_*', '')
One way to do this is to use the function ISSTRPROP to find the indices of alphanumeric characters and replace or remove the others accordingly:
>> str = '##%%F1_USA(40)_u'; %# Sample string
>> index = isstrprop(str,'alphanum'); %# Find indices of alphanumeric characters
>> str(~index) = '_'; %# Set non-alphanumeric characters to '_'
>> str = str(find(index,1):end) %# Remove any leading '_'
str =
F1_USA_40__u %# Result
If you want to use regular expressions (which can get a little more complicated) then the last suggestion from Tamas will work. However, it can be greatly simplified to the following:
str = regexprep(str,{'\W','^_*'},{'_',''});
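As far as I know, regexprep applies the patterns in a cell array one after the other, each on the result of the previous one. A minimal Python sketch of the same two steps (the function name clean is made up, and re.ASCII is used on the assumption that only ASCII letters, digits and underscores count as word characters):

```python
import re


def clean(text: str) -> str:
    """Replace every non-word character with '_', then drop leading '_'."""
    text = re.sub(r"\W", "_", text, flags=re.ASCII)
    return re.sub(r"^_*", "", text)


print(clean("##%%F1_USA(40)_u"))
```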