I'm trying to write an algorithm in MATLAB that scans a character array from left to right. If it encounters a single space, it should do nothing, but if it encounters two consecutive spaces, it should print the rest of the array starting on the next line. For example:
inpuut='a bc  d';
After applying this algorithm, the final output should be:
a bc
d
but this algorithm is giving me the output as:
a bc
d d
Also, if someone has a simpler algorithm for this task, please help me :)
m=1; t=1;
inpuut='a bc  d';
while(m<=(length(inpuut)))
if((inpuut(m)==' ')&&(inpuut(m+1)==' '))
n=m;
fprintf(inpuut(t:(n-1)));
fprintf('\n');
t=m+2;
end
fprintf(inpuut(t));
if(t<length(inpuut))
t=t+1;
elseif(t==length(inpuut))
t=t-1;
else
end
m=m+1;
end
fprintf('\n');
I gave up trying to explain why your code doesn't work. Here is a working version.
inpuut='a bc  d ';
% remove trailing spaces
while ~isempty(inpuut) && inpuut(end)==' '
    inpuut(end)=[];
end
% split wherever two consecutive spaces occur
str = regexp(inpuut, ' {2}', 'split');
for ii = 1:length(str)
    fprintf('%s\n', str{ii});
end
regexp with the 'split' option splits the string into a cell array, using the matched expression as the delimiter.
fprintf accepts a format string, so it can do much more than print a single string.
You can remove the trailing spaces before splitting, or handle them inside the loop (check whether the last cell is empty), but that is more costly.
You can use regexprep to replace two consecutive spaces by a line feed:
result_string = regexprep(inpuut, ' {2}', '\n');
If you need to remove trailing spaces, do this first:
inpuut = regexprep(inpuut, ' +$', '');
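As a minimal sketch of the two regexprep calls used together (the sample input, with two spaces before 'd' and two trailing spaces, and the variable name trimmed are assumptions for illustration):

```matlab
inpuut = 'a bc  d  ';                               % two spaces before 'd', two trailing
trimmed = regexprep(inpuut, ' +$', '');             % drop trailing spaces first
result_string = regexprep(trimmed, ' {2}', '\n');   % each pair of spaces becomes a line feed
fprintf('%s\n', result_string);
```

Writing the pattern as ' {2}' rather than two literal spaces makes the count visible even if whitespace gets collapsed when the code is copied.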
I have a solution without using regex, but I assumed you wanted to print on 2 lines maximum.
Example: with 'a b  c  hello':
a b
c  hello
and not:
a b
c
hello
In any case, here is the code:
inpuut = 'a b  c';
while length(inpuut) > 2
    % Read the next 2 characters
    first2char = inpuut(1:2);
    switch first2char
        case '  ' % 2 white spaces
            % add a new line and print the rest of the input
            fprintf('\n%s', inpuut(3:end));
            inpuut = [];
        otherwise % not 2 white spaces
            % just print one character
            fprintf('%s', inpuut(1));
            inpuut(1) = [];
    end
end
fprintf('%s\n', inpuut);
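If every double space should start a new line and you still want to avoid regular expressions, one possible sketch uses deblank and strrep. The sample input and the variable names trimmed and result are assumptions; runs of three or more spaces may be handled differently than in the regexprep version.

```matlab
inpuut = 'a b  c  hello  ';
trimmed = deblank(inpuut);                  % removes trailing whitespace only
result = strrep(trimmed, '  ', char(10));   % char(10) is the line feed character
fprintf('%s\n', result);
```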
How can I go about doing this? So far I've opened the file like this
fileID = fopen('hamlet.txt','r');
[A,count] = fscanf(fileID, '%s');
fclose(fileID);
Getting spaces from the file
First, if you want to capture spaces, you'll need to change your format specifier. %s reads only non-whitespace characters.
>> fileID = fopen('space.txt','r');
>> A = fscanf(fileID, '%s');
>> fclose(fileID);
>> A
A = Thistexthasspacesinit.
Instead, we can use %c:
>> fileID = fopen('space.txt','r');
>> A = fscanf(fileID, '%c');
>> fclose(fileID);
>> A
A = This text has spaces in it.
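To try the difference between the two format specifiers without an existing file, here is a small sketch that writes a temporary file first. The temporary file name and the variables no_spaces and with_spaces are assumptions made for the example.

```matlab
fname = [tempname '.txt'];                 % throwaway file for the demonstration
fid = fopen(fname, 'w');
fprintf(fid, 'This text has spaces in it.');
fclose(fid);

fid = fopen(fname, 'r');
no_spaces = fscanf(fid, '%s');             % whitespace is skipped
frewind(fid);                              % go back to the start of the file
with_spaces = fscanf(fid, '%c');           % every character is kept
fclose(fid);
delete(fname);
```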
Mapping between characters and values (array indices)
We could create a character array that contains all of the target characters to look for:
search_chars = ['A':'Z', 'a':'z', ',', '.', ' '];
That would work, but to map the character to a position in the array you'd have to do something like:
>> char_pos = find(search_chars == 'q')
char_pos = 43
You could also use containers.Map, but that seems like overkill.
Instead, let's use the ASCII value of each character. For convenience, we'll use only values 1:126 (0 is NUL and 127 is DEL; we should never encounter either of those). Converting from characters to their ASCII codes is easy:
>> c = 's'
c = s
>> a = uint8(c) % MATLAB actually does this using double(). Seems wasteful to me.
a = 115
>> c2 = char(a)
c2 = s
Note that by doing this, you're counting characters that are not in your desired list like ! and *. If that's a problem, then use search_chars and figure out how you want to map from characters to indices.
Looping solution
The most intuitive way to count each character is a loop. For each character in A, find its ASCII code and increment the counter array at that index.
char_count = zeros(1, 126);
for current_char = A
c = uint8(current_char);
char_count(c) = char_count(c) + 1;
end
Now you've got an array of counts for each character with ASCII codes from 1 to 126. To find out how many instances of 's' there are, we can just use its ASCII code as an index:
>> char_count(115)
ans = 4
We can even use the character itself as an index:
>> char_count('s')
ans = 4
Vectorized solution
As you can see with that last example, MATLAB's weak typing makes characters and their ASCII codes pretty much equivalent. In fact:
>> 's' == 115
ans = 1
That means that we can use implicit broadcasting and == to create a logical 2D array where L(c,a) == 1 if character c in our string A has an ASCII code of a. Then we can get the count for each ASCII code by summing along the columns.
L = (A.' == [1:126]);
char_count = sum(L, 1);
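A quick way to convince yourself that the loop and the vectorized version agree is to run both on the same text. This is only a sketch; the sample string and the names loop_count and vec_count are assumptions, and the comparison relies on implicit expansion being available.

```matlab
A = 'This text has spaces in it.';

% Looping version
loop_count = zeros(1, 126);
for ch = A
    code = double(ch);
    loop_count(code) = loop_count(code) + 1;
end

% Vectorized version: rows are characters of A, columns are ASCII codes
vec_count = sum(A.' == (1:126), 1);
```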
A one-liner
Just for fun, I'll show one more way to do this: histcounts. This is meant to put values into bins, but as we said before, characters can be treated like values.
char_count = histcounts(uint8(A), 1:127); % 127 edges give 126 bins, one per code from 1 to 126
There are dozens of other possibilities, for instance you could use the search_chars array and ismember(), but this should be a good starting point.
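The search_chars and ismember() route mentioned above could look roughly like this. It is a sketch under assumptions: the sample string and the names found, pos and counts are made up for illustration, and accumarray is just one way to tally the positions.

```matlab
A = 'This text has spaces in it.';
search_chars = ['A':'Z', 'a':'z', ',', '.', ' '];

% pos(k) is the index of A(k) in search_chars, or 0 if it is not listed
[found, pos] = ismember(A, search_chars);

% Tally how often each position occurs; characters not listed are ignored
counts = accumarray(pos(found).', 1, [numel(search_chars), 1]).';

s_count = counts(search_chars == 's');
```

Unlike the ASCII-indexed array, counts only has entries for the characters you asked for, at the cost of one lookup to translate a character into its position.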
With [A,count] = fscanf(fileID, '%s'); you only get a single overall count, not a count for each letter. You can use regexp here instead: give it a cell array of the letters you want to search for, and it returns a cell array in which each cell holds the indices where that letter occurs. Counting the indices in each cell then gives you the count for each letter:
fileID = fopen('hamlet.txt','r');
A = fscanf(fileID, '%c'); % %c keeps the spaces; %s would drop them
fclose(fileID);
indexCellArray = regexp(A,{'A','B','C','D',... % add the remaining letters here
'a','b','c','d',...
',','\.',' '}); % '\.' matches a literal period; a bare '.' matches any character
letterCount = cellfun(@(x) numel(x),indexCellArray);
You may want to put the counts in a struct with one field name per letter, otherwise you might lose track of which count belongs to which letter.
There may be a much easier solution, because typing all the letters into the regexp call is tedious, but this works.
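One way to avoid typing every letter by hand is to generate the patterns. The sketch below is an assumption-laden variation on the answer above: num2cell builds one cell per character, and regexptranslate escapes characters such as '.' that are special in regular expressions. The sample string and the names letters and patterns are made up.

```matlab
A = 'This text has spaces in it.';

% One cell per character to look for
letters = num2cell(['A':'Z', 'a':'z', ',', '.', ' ']);

% Escape regex metacharacters so that '.' matches only a literal period
patterns = cellfun(@(ch) regexptranslate('escape', ch), letters, ...
                   'UniformOutput', false);

% Each cell of idx holds the start indices of one character in A
idx = regexp(A, patterns);
letterCount = cellfun(@numel, idx);

% letters and letterCount line up, so a count can be looked up by character
s_count = letterCount(strcmp(letters, 's'));
```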
In MATLAB I'm coding a Caesar cipher, but each space shows up as a 'y' character.
How can I replace that with a space?
case 4
disp('Breaking Ceaser Cipher')
cs = menu('Please Enter your Choice','Encryption','Decryption');
if cs==1
c = input('Enter the message: ','s');
sh = str2double(input('Enter shift: ','s'));
c=upper(c);
lc=length(c);
for i=1:lc
p(i)=int16(c(i))-65+sh;
end
p=mod(p,26)+97;
p=char(p);
disp( p)
end
end
output example:
Breaking Ceaser Cipher
Enter the message:
my name is jeff
Enter shift:
5
rdysfrjynxyojkk
Here we see that the encryption of the letters is correct, but each space is replaced by 'y'. A 'y' in the input is not affected; it is the space character that somehow comes out as 'y'.
I've also tried using p2 = regexprep(c, 'y', ' ') to replace the 'y' characters with spaces, and I looked into the isspace function. No luck.
You are halfway there:
spaces = isspace(c);
% make an array of blanks the same length as c
out = blanks(numel(c));
% get array without spaces
c = c(~spaces);
% do stuff to c, without spaces
p = double(c) - 65 + sh;
p = mod(p,26)+97;
p = char(p);
% fill p into the non-space locations
out(~spaces) = p;
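The reason a space turns into 'y' with a shift of 5 is plain arithmetic: a space has code 32, so 32 - 65 + 5 = -28, mod(-28, 26) = 24, and 24 + 97 = 121, which is 'y'. A self-contained sketch of the masking idea, using the message and shift from the question (the variable name space_result is an assumption):

```matlab
c  = upper('my name is jeff');
sh = 5;

% What the original loop does to a space character
space_result = char(mod(double(' ') - 65 + sh, 26) + 97);

% Shift only the non-space characters and leave blanks where the spaces were
spaces = isspace(c);
out = blanks(numel(c));
p = mod(double(c(~spaces)) - 65 + sh, 26) + 97;
out(~spaces) = char(p);
disp(out)
```

This only protects whitespace; digits and punctuation would still be pushed through the letter arithmetic unless they are masked out in the same way.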
I am trying to capitalize the first and last letter of only the three letter words in a string. So far, I have tried
spaces = strfind(str, ' ');
spaces = [0 spaces];
lw = diff(spaces);
lw3 = find(lw ==4);
a3 = lw-1;
b3 = spaces(a3+1);
b4 = b3 + 2 ;
str(b3) = upper(str(b3));
str(b4) = upper(str(b4));
We first had to find where the three-letter words are, which is what the first four lines of code do. The remaining lines try to locate the first and last letters of those words and capitalize them.
I would use regular expressions to identify the 3-letter words and then use regexprep combined with an anonymous function to perform the case conversion.
str = 'abcd efg hijk lmn';
% Custom function to capitalize the first and last letter of a word
f = @(x)[upper(x(1)), x(2:end-1), upper(x(end))];
% This will match 3-letter words and apply function f to them
out = regexprep(str, '\<\w{3}\>', '${f($0)}')
% abcd EfG hijk LmN
Regular expressions are definitely the way to go. I am going to suggest a slightly different route, and that is to return the indices using the tokenExtents flag for regexpi:
str = 'abcd efg hijk lmn';
% Tokenize the words and return the first and last index of each
idx = regexpi(str, '(\<\w{3}\>)', 'tokenExtents');
% Convert those indices to upper case
str([idx{:}]) = upper(str([idx{:}]));
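For comparison, here is a sketch that avoids regular expressions altogether by splitting on single spaces. It is an alternative, not what either answer above proposes, and it treats anything between spaces as a word, so 'efg,' would count as four characters. The names words and result are assumptions.

```matlab
str = 'abcd efg hijk lmn';

% Split on single spaces; keep empty pieces so repeated spaces survive the round trip
words = strsplit(str, ' ', 'CollapseDelimiters', false);

for k = 1:numel(words)
    if numel(words{k}) == 3
        % Capitalize the first and last character of each 3-character word
        words{k}([1 end]) = upper(words{k}([1 end]));
    end
end

result = strjoin(words, ' ');
```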
Using the matlab_ipsum function from the File Exchange, I generated a 1000-paragraph random text string with a mean word length of 4 +/- 2.
str = lower(matlab_ipsum('WordLength', 4, 'Paragraphs', 1000));
The result was a 177,575 character string with 5,531 3-letter words. I used timeit to check the execution time of using regexprep and regexpi with tokenExtents. Using regexpi is an order of magnitude faster:
regexpi = 0.013979s
regexprep = 0.14401s
I have a text file that looks like this:
(a (bee (cold down)))
if I load it using
c=textscan(fid,'%s');
I get this:
'(a'
'(bee'
'(cold'
'down)))'
What I would like to get is:
'('
'a'
'('
'bee'
'('
'cold'
'down'
')'
')'
')'
I know I can delimit with '(' and ')' by specifying 'Delimiter' in textscan, but then I will lose these characters, which I want to keep.
Thank you in Advance.
The %s specifier indicates that you want strings; what you want is individual chars. Use %c instead.
c=textscan(fid,'%c');
Update: if you want to keep your words intact, then you'll want to load your text using the %s specifier. After the text is loaded, you can either solve this problem with regular expressions (not my forte) or write your own parser that parses each word individually and saves the parentheses and words to a new cell array.
AFAIK, there is no canned routine capable of preserving arbitrary delimiters.
You'd have to do it yourself:
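string = '(a (bee (cold down)))';
bo = string == '(';
bc = string == ')';
sp = string == ' ';
% a parenthesis directly after a word takes two cells, so reserve room for that
output = cell(2*nnz(bo|bc)+nnz(sp)+1,1);
j = 1;
for ii = 1:numel(string)
    if bo(ii) || bc(ii)
        % close any word still being built so the parenthesis does not overwrite it
        if ~isempty(output{j}), j = j + 1; end
        output{j} = string(ii);
        j = j + 1;
    elseif sp(ii)
        j = j + 1;
    else
        output{j} = [output{j} string(ii)];
    end
end
% drop the cells that were never filled
output = output(~cellfun('isempty', output));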
Which can probably be improved -- the growing character array will prevent the loop from being JIT'ed. The array bc | bo | sp holds all the information to vectorize this thing, I just don't see how at this hour...
Nevertheless, it should give you a place to start.
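If a regular expression is acceptable, the whole loop can be replaced by a single regexp call with the 'match' option. This is a different technique from the loop above, shown only as a sketch; the pattern and the variable names txt and tokens are assumptions. It matches either one parenthesis or a run of characters that are neither parentheses nor whitespace.

```matlab
txt = '(a (bee (cold down)))';

% '[()]'       one opening or closing parenthesis
% '[^()\s]+'   one or more characters that are not parentheses or whitespace
tokens = regexp(txt, '[()]|[^()\s]+', 'match');
```

The result is a row cell array; use tokens.' if a column is needed, as in the desired output of the question.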
Matlab has a strtok function similar to C. Its format is:
token = strtok(str)
token = strtok(str, delimiter)
[token, remain] = strtok('str', ...)
there is also a string replace function strrep:
modifiedStr = strrep(origStr, oldSubstr, newSubstr)
What I would do is modify the original string with strrep to add in delimiters, then use strtok. Since you already scanned the string into c:
c = strrep(c,'(','( '); % add a space after each open paren
c = strrep(c,')',' ) '); % add a space before and after each close paren
token = {}; % tokens are text, so collect them in a cell array
remain = c; % assumes c is a character vector, not the cell array textscan returns
while any(~isspace(remain))
    [token{end+1}, remain] = strtok(remain, ' ');
end
This gives you a flat cell array containing each of the tokens you requested.
strtok reference: http://www.mathworks.com/help/techdoc/ref/strtok.html
strrep reference: http://www.mathworks.com/help/techdoc/ref/strrep.html
The following two statements read the first line from an input file (fid) and parse said line into strings delimited by whitespace.
a = textscan(fid,'%s',1,'Delimiter','\n');
b = textscan(a{1}{1},'%s');
I would like to know if this action can be accomplished in a single statement, having a form similar to the following (which is syntactically invalid).
b = textscan(textscan(fid,'%s',1,'Delimiter','\n'),'%s');
Thanks.
Instead of
a = textscan(fid, '%s', 1, 'Delimiter', '\n');
you can use
a = fgetl(fid);
That will return the next line in fid as a string (the newline character at the end is stripped). You can then split that line into white-space separated chunks as follows:
b = regexp(a, '\s*', 'split');
Combined:
b = regexp(fgetl(fid), '\s*', 'split');
Note that this is not 100% equivalent to your code, since using textscan adds another cell-layer (representing different lines in the file). That's not a problem, though, simply use
b = {regexp(fgetl(fid), '\s*', 'split')};
if you need that extra cell-layer.
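One thing to watch for is that fgetl returns -1 instead of text when the end of the file has been reached, and leading whitespace produces an empty first piece when splitting. A hedged sketch that guards against both; the temporary file, its contents, and the names first_line and b are assumptions for the example.

```matlab
fname = [tempname '.txt'];                       % throwaway file for the demonstration
fid = fopen(fname, 'w');
fprintf(fid, '  alpha  beta gamma\nsecond line\n');
fclose(fid);

fid = fopen(fname, 'r');
first_line = fgetl(fid);                         % -1 (numeric) if the file is empty
fclose(fid);
delete(fname);

if ischar(first_line)
    % strtrim avoids an empty first cell when the line starts with whitespace
    b = regexp(strtrim(first_line), '\s+', 'split');
else
    b = {};
end
```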