How to split Matlab string into two with known suffix? - matlab

I need to split a string into two components. As an example I have the string:
s = 'Hello1_1000_10_1_data'
and I want to split it into the two strings
str1 = 'Hello1_1000_10_1'
and
str2 = '_data'
The important point is that I can't be sure of the format of the first string; the only certainty is that the suffix, which should go into the second string, always reads '_data'. What is the best way to do this? I looked up the documentation on strtok and regexp, but they do not seem to offer what I want.

If you always know the length of the suffix, you could just use that:
s = 'Hello1_1000_10_1_data'
str1 = s(1:end-5)
Or otherwise:
s = 'Hello1_1000_10_1_data'
suffix = length('_data')
str1 = s(1:end-suffix)

You can use:
s = 'Hello1_1000_10_1_data';
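str = regexp(s, '(.*)(_data)', 'tokens', 'once'); %// 'once' returns the tokens of the first match directly; MATLAB cannot index a function call with {1}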
str{1} %// Hello1_1000_10_1
str{2} %// _data
Because .* is greedy, the split happens at the last occurrence of _data, so this still works if _data occurs several times in the file name.

You can also use strsplit() as follows:
s = 'Hello1_1000_10_1_data';
suffix = '_data';
str = strsplit(s,suffix);
str1 = str{1}
In addition, you can use strsplit() with multiple delimiters.

You can use strfind():
s = 'Hello1_1000_10_1_data';
suffix = '_data';
idx = strfind(s,suffix);
if ~isempty(idx)
    idx = idx(end);        % use the last occurrence of the suffix
    prefix = s(1:idx-1);
end

Related

Remove end folder part of string in MATLAB

Say if we have this string
a = 'C:/my_folder/folder/mac/data/';
How can I use regexprep to reduce the string to:
'C:/my_folder/folder/mac/';
Actually, I found a way to do it.
[pathstr] = fileparts(a);
regexprep(pathstr, '(?<=/)[^/]*$', '')
You can also cut the last 5 characters from the end of the string (this works here because 'data/' is 5 characters long):
a = a(1:end-5)

matlab: reading numbers that also contain characters

I have to read a textfile which contains a list of companycodes. The format of the textfile is:
[1233A12; 1233B88; 2342Q85; 2266738]
Once I have read the file, is it possible to compare these codes with regular numbers? I have the codes from two different databases: one of them has regular firm numbers (no characters) and the other has characters inside the firm numbers.
By the way, the file is big (50+ MB).
Edit: I have added an additional number in the example because not all the numbers have a character inside
If you want to compare part of a string with a number, you could do it as follows:
combiString = '1234AB56'
myNumber= 1234
str2num(combiString(1:4))==myNumber
str2num(combiString(7:8))==myNumber
You can achieve this result by using regular expressions. For example, if str = '1233A12' you can write
nums = regexp(str, '(\d+)[A-Z]*(\d+)', 'tokens');
str1 = nums{1}(1);
num1 = str2num(str1{1});
str2 = nums{1}(2);
num2 = str2num(str2{1});

Regexp to find a matching condition in a string

Hi, I need help using regexp for condition matching.
For example, my file has the following content:
{hello.program='function'`;
bye.program='script'; }
I am trying to use regexp to match the string that has .program='function' in them:
pattern = '[.program]+\=(function)'
also tried pattern='[^\n]*(.hello=function)[^\n]*';
pattern_match = regexp(myfilename,pattern , 'match')
but this returns pattern_match = {} while I expect the result to be hello.program='function'`;
If 'function' comes with string-markers, you need to include these in the match. Also, you need to escape the dot (otherwise, it's considered "any character"). [.program]+ looks for one or several letters contained in the square brackets - but you can just look for program instead. Also, you don't need to escape the =-sign (which is probably what messed up the match).
cst = {'hello.program=''function''';'bye.program=''script'''; };
pat = 'hello\.program=''function''';
out = regexp(cst,pat,'match');
out{1}{1} %# first string from list, first match
hello.program='function'
EDIT
In response to the comment
my file contains
m2 = S.Parameter;
m2.Value = matlabstandard;
m2.Volatility = 'Tunable';
m2.Configurability = 'None';
m2.ReferenceInterfaceFile ='';
m2.DataType = 'auto';
my objective is to find all the lines that match, .DataType='auto'
Here's how you find the matching lines with regexp
%# read the file with textscan into a variable txt
fid = fopen('myFile.m');
txt = textscan(fid,'%s','Delimiter','\n'); %# read whole lines; by default %s would split at whitespace
fclose(fid);
txt = txt{1};
%# find the match. Allow spaces and equal signs between DataType and 'auto'
match = regexp(txt,'\.DataType[ =]+''auto''','match')
%# match is not empty only if a match was found. Identify the non-empty match
matchedLine = find(~cellfun(@isempty,match));
Try this as it matches .program='function' exactly:
(\.)program='function'
I think this did not work:
'[.program]+\=(function)'
because of how the []'s work. Here is a link explaining why I say that: http://www.regular-expressions.info/charclass.html

How can I rearrange the chars in a char*?

I have char* myChar = "HELLO". I would like to switch the places of the E and the O. I tried doing myChar[1] = myChar[4], but that doesn't work. Please help!
First off, that string literal is probably being stored in read-only memory. You can fix that by declaring the string as an array of characters:
char myChar[] = "HELLO";
To swap the characters, you'll have to use a temporary variable:
char c1 = myChar[1];
myChar[1] = myChar[4];
myChar[4] = c1;
You assigned whatever is in myChar[4] to myChar[1] (that's all you did there).
You need to create a temporary variable char temp; and do the following:
Edit: As mentioned by Tim Cooper, declare the string as char myChar[] = "HELLO"; so that it is a writable array rather than a pointer to a string literal.
temp = myChar[1];
myChar[1] = myChar[4];
myChar[4] = temp;
This is a very common 'algorithm' to swap two things.
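The temporary-variable swap from the answers above can be wrapped in a small function. This is only a sketch: swap_chars is a hypothetical helper name, not something from the answers, and it assumes the caller passes a writable, NUL-terminated array (not a string literal).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* swap_chars is a hypothetical helper (the name is an assumption).
 * It swaps s[i] and s[j] in place using a temporary variable.
 * s must point to writable memory, e.g. char myChar[] = "HELLO";
 * Returns 0 on success, -1 if either index is past the end of the string. */
int swap_chars(char *s, size_t i, size_t j)
{
    assert(s != NULL);

    size_t len = strlen(s);
    if (i >= len || j >= len)
        return -1;          /* refuse to touch memory outside the string */

    char tmp = s[i];        /* keep the first character before overwriting it */
    s[i] = s[j];
    s[j] = tmp;
    return 0;
}
```

With char myChar[] = "HELLO"; a call to swap_chars(myChar, 1, 4) leaves myChar holding "HOLLE".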

Working string in MATLAB

I have the following string in MATLAB, for example
##%%F1_USA(40)_u
and I want
F1_USA_40__u
Is there a function for this?
Your best bet is probably regexprep which allows you to replace parts of a string using regular expressions:
s_new = regexprep(regexprep(s, '[()]', '_'), '[^A-Za-z0-9_]', '')
Update: based on your updated comment, this is probably what you want:
s_new = regexprep(regexprep(s, '^[^A-Za-z0-9_]*', ''), '[^A-Za-z0-9_]', '')
or:
s_new = regexprep(regexprep(s, '[^A-Za-z0-9_]', '_'), '^_*', '')
One way to do this is to use the function ISSTRPROP to find the indices of alphanumeric characters and replace or remove the others accordingly:
>> str = '##%%F1_USA(40)_u'; %# Sample string
>> index = isstrprop(str,'alphanum'); %# Find indices of alphanumeric characters
>> str(~index) = '_'; %# Set non-alphanumeric characters to '_'
>> str = str(find(index,1):end) %# Remove any leading '_'
str =
F1_USA_40__u %# Result
If you want to use regular expressions (which can get a little more complicated) then the last suggestion from Tamas will work. However, it can be greatly simplified to the following:
str = regexprep(str,{'\W','^_*'},{'_',''});