I want to split the string into parts (e.g. en, wikipedia, org, wiki, hostname, ...) and then check whether those parts appear in A11. However, I am running into a problem when I use importdata.
str = 'http://en.wikipedia.org/wiki/hostname';
split_URL = regexp(str,'[:/.]*','split') %// note the different search string
A11 = 'hostname From wikipedia, the free encyclopedia Jump to: navigation, search In computer networking, a hostname (archaically nodename .....';
feature11 = (~cellfun('isempty', regexpi(A11, split_URL, 'once')))
I replaced str = 'http://en.wikipedia.org/wiki/hostname'; with data = importdata(urlq), where urlq is a text file (saved from Notepad) that contains http://en.wikipedia.org/wiki/hostname.
How do I use importdata in this case?
I can split str = 'http://en.wikipedia.org/wiki/hostname'; into a 1x6 cell array, but I cannot do the same with the output of importdata. Any ideas?
I have to read a text file which contains a list of company codes. The format of the text file is:
[1233A12; 1233B88; 2342Q85; 2266738]
Once I have read the file, is it possible to compare these codes with regular numbers? I have codes from two different databases: one of them has regular firm numbers (no characters) and the other has characters inside the firm numbers.
Btw, the file is big (50+ MB).
Edit: I have added an additional number to the example because not all the numbers have a character inside.
If you want to compare part of a string with a number, you could do it as follows:
combiString = '1234AB56'
myNumber= 1234
str2num(combiString(1:4))==myNumber
str2num(combiString(7:8))==myNumber
You can achieve this result by using regular expressions. For example, if str = '1233A12' you can write
nums = regexp(str, '(\d+)[A-Z]*(\d+)', 'tokens');
str1 = nums{1}(1);
num1 = str2num(str1{1});
str2 = nums{1}(2);
num2 = str2num(str2{1});
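For a whole list of codes, the same idea can be applied to every entry at once. This is only a sketch: it assumes the codes are already in a cell array called codes, that the part worth comparing is the leading run of digits, and that firmNumbers (a made-up example vector) holds the plain firm numbers from the other database.

```matlab
% Example codes from the question, already split into a cell array (assumption).
codes = {'1233A12', '1233B88', '2342Q85', '2266738'};

% Leading digits of every code, converted to numbers in one call.
prefix = str2double(regexp(codes, '^\d+', 'match', 'once'));

% Flag the codes that contain at least one letter.
hasLetter = ~cellfun('isempty', regexp(codes, '[A-Z]', 'once'));

% Hypothetical firm numbers from the second database.
firmNumbers = [1233 2266738 9999];

% True where the numeric prefix of a code appears among the firm numbers.
isKnown = ismember(prefix, firmNumbers);
```

str2double returns NaN for text it cannot parse instead of evaluating it, which makes it a safer choice than str2num for data read from a file.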
After extracting the keywords from the URL, how do I check whether each keyword exists in the content of the page, like the content below? If it does, return 1, otherwise return 0. I tried strfind, but I have no idea why it does not work:
str = 'http://en.wikipedia.org/wiki/hostname'
Paragraph = 'hostname From wikipedia, the free encyclopedia Jump to: navigation, search In computer networking, a hostname (archaically nodename .....'
SplitStrings = regexp(str,'[/.]','split')
for it = SplitStrings
c( it{1} ) = strfind(Paragraph, it{1} )
end
SplitStrings = {};
feature11=(cellfun(@(n) isempty(n), strfind(Paragraph, SplitStrings{1})))
I can use the code below for checking whether 'https' exists or not. But how do I modify it so that 'SplitStrings' is used in place of 'B6'?
str = 'https://en.wikipedia.org/wiki/hostname'
A6 = regexp(str,'\w*://','match','once')
B6 = {'https'};
feature6=(cellfun(@(n) isempty(n), strfind(A6, B6{1})))
It is absolutely not clear to me what you want to do here...
I suspect it is this:
str = 'http://en.wikipedia.org/wiki/hostname';
haystack = 'hostname From wikipedia, the free encyclopedia Jump to: navigation, search In computer networking, a hostname (archaically nodename .....';
needles = regexp(str,'[:/.]*','split') %// note the different search string
%// What I think you want to do
~cellfun('isempty', regexpi(haystack, needles, 'once'))
Results:
needles =
'http' 'en' 'wikipedia' 'org' 'wiki' 'hostname'
ans =
0 1 1 0 1 1
but if this is not the case, please edit your question and include your desired outputs for some example inputs.
EDIT
OK, so if I understand you correctly now, you want whole words and not partial matches. You must tell this to regexp, in the following way:
%// NOTE: these metacharacters indicate that match is to occur
%// at beginning AND end of word (so whole words only)
needles = strcat('\<', regexpi(str,'[:/.]*','split'), '\>')
%// Search for these words in the paragraph
~cellfun('isempty', regexpi(haystack, needles, 'once'))
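To see what the word-boundary anchors change, here is a small sketch that runs both variants side by side. The haystack is shortened and the needles are made up for illustration ('host' is not one of the URL parts; it is only there to show a partial match).

```matlab
% Shortened haystack and illustrative needles (assumptions, not the original data).
haystack = 'hostname From wikipedia, the free encyclopedia';
needles  = {'host', 'wikipedia', 'wiki'};

% Partial matches: a needle counts if it occurs anywhere, even inside a longer word.
partial = ~cellfun('isempty', regexpi(haystack, needles, 'once'));

% Whole-word matches: \< and \> anchor the needle to the start and end of a word.
wholeWord = ~cellfun('isempty', regexpi(haystack, strcat('\<', needles, '\>'), 'once'));
```

Here 'host' and 'wiki' are found by the partial search (inside 'hostname' and 'wikipedia') but rejected by the whole-word search.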
You can try this:
f = @(str) ~isempty(strfind(Paragraph, str)); % 1 if str occurs anywhere in Paragraph, 0 otherwise
cellfun(f, SplitStrings)
To match whole words only, split Paragraph into words first and then test membership:
SplitParagraph=regexp(Paragraph,'[ ,:.()]','split');
I=ismember(SplitStrings,SplitParagraph);
SplitStrings(I)
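One thing to keep in mind is that ismember compares strings case-sensitively. A possible workaround, sketched here with a shortened Paragraph whose capitalization is made up for the example, is to lower-case both sides before comparing:

```matlab
% Shortened example text with mixed case (assumption for illustration).
Paragraph    = 'Hostname From Wikipedia, the free encyclopedia';
SplitStrings = {'en', 'wikipedia', 'org', 'wiki', 'hostname'};

% Split the lower-cased paragraph on runs of separators so no empty words appear.
words = regexp(lower(Paragraph), '[ ,:.()]+', 'split');

% 1 where the lower-cased keyword is one of the words in the paragraph.
found = ismember(lower(SplitStrings), words);
```

With this input only 'wikipedia' and 'hostname' are reported as found; 'wiki' is not, because it is not a whole word in the text.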
I have a bunch of classes that I am iterating through, collecting the classes the student is failing. If the student fails, I collect the name of the class in a vector called retake.
retake =[Math History Science]
I have line breaks so when the classes print in the command window it shows as:
retake=
Math
History
Science.
However, I am trying to display retake in a static text box in a GUIDE GUI so that it looks like the above. Instead, the static text box shows:
MathHistoryScience
set(handles.text13,'String', retake) % this is what I tried
Can you please show me how to make it print:
Math
History
Science
It looks to me like you need to add carriage returns.
Assuming you have a cell array with strings (rather than concatenated strings using [], which will give you a single long line), you can do it as follows:
retake = {'Math', 'History', 'Science'};
rString = '';
for ii = 1:numel(retake)-1
rString = [rString sprintf('%s\n', retake{ii})];
end
rString = [rString retake{end}];
Notice the use of '' to denote strings, {} to denote a cell array, '\n' as the end-of-line character, and [a b] to do simple string concatenation.
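If your MATLAB version has strjoin (R2013a or later, as far as I know), the loop can probably be replaced by a single call. This is just a sketch of that alternative; the set call is commented out because handles.text13 only exists inside your GUI.

```matlab
% Class names as a cell array of strings, as in the loop version.
retake = {'Math', 'History', 'Science'};

% Join the names with a newline between them (no trailing newline).
rString = strjoin(retake, sprintf('\n'));

% Inside the GUI callback you would then pass it to the static text box:
% set(handles.text13, 'String', rString);
```

Passing the cell array itself as the 'String' value is another option worth trying, since a static text control shows each cell on its own line.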
I am trying to add '\' before all special characters in a string in MATLAB. Could anyone please help me out? Here is the example:
tStr = 'Hi, I'm a Big (Not So Big) MATLAB addict; Since my school days!';
I want this string to be changed to:
'Hi\, I\'m a Big \(Not so Big \) MATLAB addict\; Since my school days\!'
The escape character in MATLAB is the single quote ('), not the backslash (\) as in C: a literal apostrophe inside a string is written as two single quotes. Thus, your string must be written like this:
tStr = 'Hi\, I\''m a Big (Not so Big ) MATLAB addict\; Since my school days!'
I took the list of special characters defined on the MathWorks webpage to do this:
special = '[]{}()=''.().....,;:%%{%}!#';
tStr = 'Hi, I''m a Big (Not So Big) MATLAB addict; Since my school days!';
outStr = '';
for l = tStr
if (length(find(special == l)) > 0)
outStr = [outStr, '\', l];
else
outStr = [outStr, l];
end
end
which will automatically add those \s. You do need to use two single quotes ('') in place of the apostrophe in your input string. If tStr is obtained with the function input(), or something similar, this procedure will still work.
Edited:
Or using regular expressions:
regexprep(tStr,'([[\]{}()=''.(),;:%%{%}!#])','\\$1')
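As a smaller, easier-to-check variant, here is a sketch that escapes only the characters that actually occur in the example string. The set of characters in the class is my assumption based on the desired output in the question, not a complete list of MATLAB special characters.

```matlab
% Input string; the apostrophe is written as two single quotes.
tStr = 'Hi, I''m a Big (Not So Big) MATLAB addict; Since my school days!';

% Capture any one of , ; ! ( ) ' and put a backslash in front of it.
% In the replacement, '\\' is a literal backslash and $1 is the captured character.
outStr = regexprep(tStr, '([,;!()''])', '\\$1');

% The result written out by hand, for comparison.
expected = 'Hi\, I\''m a Big \(Not So Big\) MATLAB addict\; Since my school days\!';
```

Extending the class to more characters only means adding them between the square brackets, taking care to escape ] and \ there.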
Hi, I need help using regexp for condition matching.
For example, my file has the following content:
{hello.program='function'`;
bye.program='script'; }
I am trying to use regexp to match the string that has .program='function' in them:
pattern = '[.program]+\=(function)'
I also tried pattern='[^\n]*(.hello=function)[^\n]*';
pattern_match = regexp(myfilename,pattern , 'match')
but this returns pattern_match={} while I expect the result to be hello.program='function'`;
If 'function' comes with string-markers, you need to include these in the match. Also, you need to escape the dot (otherwise, it's considered "any character"). [.program]+ looks for one or several letters contained in the square brackets - but you can just look for program instead. Also, you don't need to escape the =-sign (which is probably what messed up the match).
cst = {'hello.program=''function''';'bye.program=''script'''; };
pat = 'hello\.program=''function''';
out = regexp(cst,pat,'match');
out{1}{1} %# first string from list, first match
hello.program='function'
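If the name in front of .program is not known in advance, the pattern can be loosened a little. The following is a sketch under that assumption: \w+ stands for any identifier, and the cell array is the same example data as above.

```matlab
% Same example data as above: one entry per line of the file.
cst = {'hello.program=''function'''; 'bye.program=''script'''};

% Any identifier, a literal dot, then program='function' including the quotes.
pat = '\w+\.program=''function''';

% With 'once', each cell holds the matched text, or an empty string if nothing matched.
out = regexp(cst, pat, 'match', 'once');

% Keep only the entries that matched.
matched = cst(~cellfun('isempty', out));
```

Only the first entry survives, because the second one assigns 'script' rather than 'function'.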
EDIT
In response to the comment
my file contains
m2 = S.Parameter;
m2.Value = matlabstandard;
m2.Volatility = 'Tunable';
m2.Configurability = 'None';
m2.ReferenceInterfaceFile ='';
m2.DataType = 'auto';
my objective is to find all the lines that match, .DataType='auto'
Here's how you find the matching lines with regexp
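%# read the file line by line with textscan into a variable txt
%# (the newline delimiter keeps each line whole; a bare '%s' would split on spaces)
fid = fopen('myFile.m');
txt = textscan(fid,'%s','Delimiter','\n');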
fclose(fid);
txt = txt{1};
%# find the match. Allow spaces and equal signs between DataType and 'auto'
match = regexp(txt,'\.DataType[ =]+''auto''','match')
%# match is not empty only if a match was found. Identify the non-empty match
matchedLine = find(~cellfun(@isempty,match));
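To try the matching step without touching a file, the lines from the comment can be put into a cell array by hand. This is a sketch, not a replacement for the code above: the variable names are made up, and the pattern is a slightly stricter variant that requires exactly one equal sign with optional whitespace around it.

```matlab
% The lines from the comment, one per cell (typed in by hand for the example).
fileLines = {'m2 = S.Parameter;'; ...
             'm2.Value = matlabstandard;'; ...
             'm2.Volatility = ''Tunable'';'; ...
             'm2.Configurability = ''None'';'; ...
             'm2.ReferenceInterfaceFile ='''';'; ...
             'm2.DataType = ''auto'';'};

% Literal .DataType, optional whitespace, one equal sign, optional whitespace, 'auto'.
pat = '\.DataType\s*=\s*''auto''';

% Indices of the lines that contain a match.
matchedIdx = find(~cellfun('isempty', regexp(fileLines, pat, 'once')));
```

Using \s*=\s* instead of [ =]+ rejects lines where the equal sign is missing or doubled, which may or may not matter for your files.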
Try this as it matches .program='function' exactly:
(\.)program='function'
I think this did not work:
'[.program]+\=(function)'
because of how the []'s work. Here is a link explaining why I say that: http://www.regular-expressions.info/charclass.html