How do I compute the number of times a character appears in a character array in MATLAB? [duplicate] - matlab

I want to determine the number of times a character appears in a character array, excluding the time it appears at the last position.
How would I do this?

In MATLAB, all variables are arrays, and text in single quotes is stored as a character array (type char). So your character array is what is usually called a string, and you can index and compare it like any other array. To count the occurrences of a character, ignoring the last position, in a character array named yourStringVar, you can do this:
YourSubString = yourStringVar(1:end-1);
% YourSubString is the main string without its last character, which you wanted to ignore
numberOfOccurrence = length(find(YourSubString == 'a')) % replace 'a' with the character you want to count
Ray pointed out that length(find()) is not a good approach: find builds an intermediate index vector only for it to be measured and thrown away, whereas nnz counts the nonzero entries directly. So alternatively you could do:
numberOfOccurrence = nnz(YourSubString == 'a') % replace 'a' with the character you want to count
numberOfOccurrence will give you your desired result.
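As a quick check of the nnz approach, here is a minimal sketch on a made-up input. The string 'abracadabra' and the variable ch are illustrative assumptions, not part of the question:

```matlab
% Hypothetical input: count 'a' everywhere except the last position
yourStringVar = 'abracadabra';
ch = 'a';                                   % single character to count

% Drop the last character, compare element-wise, count the matches
numberOfOccurrence = nnz(yourStringVar(1:end-1) == ch);
```

Here 'a' occurs five times in total, once at the last position, so numberOfOccurrence is 4.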

What you can do is map each character into a unique integer ID, then determine the count of each character through histcounts. Use unique to complete the first step. The first output of unique will give you a list of all possible unique characters in your string. If you want to exclude the last time each character occurs in the string, just subtract 1 from the total count. Assuming S is your character array:
%// Get all unique characters and assign them to a unique ID
[unq,~,id] = unique(S);
%// Count up how many times we see each character (one bin per integer ID) and subtract 1
counts = histcounts(id, 'BinMethod', 'integers') - 1;
%// Show table of occurrences with characters
T = table(cellstr(unq(:)), counts.', 'VariableNames', {'Character', 'Counts'});
The last piece of code displays everything in a nice table. We ensure that the unique characters are placed as individual cells in a cell array.
Example:
>> S = 'ABCDABABCDEFFGACEG';
Running the above code, we get:
>> T
T =
    Character    Counts
    _________    ______
    'A'          3
    'B'          2
    'C'          2
    'D'          1
    'E'          1
    'F'          1
    'G'          1
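A possible alternative sketch, assuming the same example string S, replaces histcounts with accumarray, which adds a 1 into the slot selected by each ID and therefore always returns exactly one count per unique character:

```matlab
S = 'ABCDABABCDEFFGACEG';

% Map each character to an integer ID (1 = first unique character, and so on)
[unq, ~, id] = unique(S);

% Accumulate a 1 for every occurrence of each ID, then drop one occurrence
counts = accumarray(id(:), 1).' - 1;
```

This is only a variation on the answer above, not a different counting rule: the subtraction of 1 still removes one occurrence of every character.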

Related

Converting a .txt file with 1 million digits of "e" into a vector in matlab

I have a text file with 1 million decimal digits of "e" number with 80 digits on each line excluding the first and the last line which have 76 and 4 digits and the file has 12501 lines. I want to convert it into a vector in matlab with each digit on each row. I tried num2str function, but the problem is that it gets converted like for example '7.1828e79' (13 characters). What can I do?
P.S.1: The first two lines of the text file (76 and 80 digits) are:
7182818284590452353602874713526624977572470936999595749669676277240766303535 47594571382178525166427427466391932003059921817413596629043572900334295260595630
P.S.2: I used "dlmread" and got a 12501x1 vector, with the first and second row of 7.18281828459045e+75 and 4.75945713821785e+79 and the problem is that when I use num2str for example for the first row value, I get: '7.182818284590453e+75' as a string and not the whole 76 digits. My aim was to do something like this:
e1=dlmread('e.txt');
es1=num2str(e1);
for i=1:12501
for j=1:length(es1(1,:))
a1((i-1)*length(es1(1,:))+j)=es1(i,j);
end
end
e_digits=a1.';
but I get a string like this:
a1='7.182818284590453e+754.759457138217852e+797.381323286279435e+799.244761460668082e+796.133138458300076e+791.416928368190255e+79 5...'
with 262521 characters instead of 1 million digits.
P.S.3: I think the problem might be solved if I can manipulate the text file in a way that I have one digit on each line and simply use dlmread.
Well, this is not hard, there are many ways to do it.
First, load the file as a char array (a char array makes it easy to strip out the line breaks), using something simple like:
C = fileread('yourfile.txt'); %loads file as Char Array
D = C(~isspace(C)); %Removes whitespace, including the line breaks
Next, insert a space after each character (this is because you want to use str2num, and MATLAB needs the spaces to treat each digit as a separate number). You can do this with reshape, strtrim, or simply a regular expression:
E = strtrim(regexprep(D, '.{1}', '$0 ')); %Add a SPACE after each Numeric Digit
Now you can transform it using str2num into a Vector:
e_digits = str2num(E)'; %Converts the Char Array to a column vector of digits
which will give you a single digit each row.
Using your example, I get a vector of 156 x 1, with 1 digit each row as you required.
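As a hedged alternative to the regexprep and str2num steps, the digit characters can be converted arithmetically by subtracting the character '0'. The sketch below uses a short made-up string in place of fileread('e.txt') so that it runs on its own; the sample digits and the use of isstrprop to keep only digit characters are assumptions for illustration:

```matlab
% Stand-in for C = fileread('e.txt'): two short lines of digits
C = sprintf('71828\n18284\n');

% Keep only the digit characters (drops line breaks and any other whitespace)
D = C(isstrprop(C, 'digit'));

% '0'..'9' have consecutive character codes, so subtracting '0' yields 0..9
e_digits = (D - '0').';     % column vector, one digit per row
```

This avoids building an intermediate string that is twice as long, which may matter for a million digits.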
You can get a digit per row like this
fid = fopen('e.txt','r');
c = textscan(fid,'%s');
fclose(fid);
c=cat(1,c{:});
c = cellfun(@(x) str2num(reshape(x,[],1)),c,'un',0);
c=cat(1,c{:});
And it is not the only possible way.
Could you please tell what is the final task, how do you plan using the array of e digits?

How to separate the elements of a matrix with comma in Matlab

I would like to separate each element in the matrix below with a comma.
1 2 3
4 5 6
7 8 9
Here's my attempt:
s= sprintf('%.17g,',matrix)
Output=1,2,3,4,5,6,7,8,9,
Desired output:
1, 2, 3
4, 5, 6
7, 8, 9
Thanks in advance for your suggestions.
You just need to specify the formatting of the entire first line:
s = sprintf('%.17g, %.17g, %.17g\n',matrix.')
MATLAB keeps re-using the formatting string as long as there are elements left in matrix.
To generalize this process, use the following expression:
s = sprintf([strjoin(repmat({'%.17g'},1,size(matrix,2)), ', ') '\n'], matrix.')
So there's a lot going on in this one line - let's unpack it from inside out:
repmat({'%.17g'},1,size(matrix,2))
This sub-expression takes a single cell array of size 1x1, containing the string %.17g, and duplicates it into a cell array with dimensions specified by the next two arguments. We want to construct a cell array with a single row (hence the argument 1) representing all the format specifiers (%...) we need. Since we want one instance of %.17g for each column, we use size(matrix,2) as the last argument to repmat, since that returns the number of columns of the matrix.
As an example, if you have 5 columns, you get this:
>> repmat({'%.17g'},1,5)
ans =
'%.17g' '%.17g' '%.17g' '%.17g' '%.17g'
Next, since you want columns delimited by commas and spaces, you can use strjoin():
>> strjoin(repmat({'%.17g'},1,5), ', ')
ans =
%.17g, %.17g, %.17g, %.17g, %.17g
Note the use of a comma followed by a space as the second argument (the delimiting string) to strjoin(). Adjust the number of spaces according to your display needs. We need one more thing to be able to print a multi-line matrix: a newline (\n) at the end of each row. To do this, we use the fact that two strings in square brackets [] are concatenated by MATLAB:
[strjoin(repmat({'%.17g'},1,size(matrix,2)), ', ') '\n']
This produces the final formatting string that we need. All that is left, is to add the sprintf and pass in the matrix argument. As Rijul Sudhir pointed out, you do have to transpose your matrix because MATLAB will walk down a column to pair the matrix elements with the format specifiers.
EDIT: Stewie Griffin was correct about the transpose operation (.') - code has been corrected.
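Putting the pieces together, here is a minimal sketch applied to the 3x3 matrix from the question. The variable name fmt is introduced here only for readability:

```matlab
matrix = [1 2 3; 4 5 6; 7 8 9];

% One '%.17g' per column, joined by ', ', with a newline closing each row
fmt = [strjoin(repmat({'%.17g'}, 1, size(matrix, 2)), ', ') '\n'];

% sprintf consumes elements column by column, so transpose to feed it rows
s = sprintf(fmt, matrix.');
```

For this input, s contains three lines: "1, 2, 3", "4, 5, 6" and "7, 8, 9", each terminated by a newline.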

convert number from cell to double, cell2mat not working

I think I'm losing my mind.
I have a 3 x 2 cell which looks exactly like below.
Region Code
US 1
EU 2
I then have the following code to determine the row number for the EU region.
eq_code_index = find(ismember(fund.type_des(:, 1), 'EU'));
eq_code = cell2mat(fund.type_des(eq_code_index, 2));
eq_code_index returns 3, which is correct (row headers are included in the output). So I want the value in row 3, column 2, which is 2. I then use cell2mat to convert it from a cell value to an integer, but it doesn't work: the value is of type char. I haven't a clue why cell2mat isn't working.
Update
Even if I do the following two lines of code below I can't get the codes into a vector, they turn into char's
codes = fund.type_des(2:end, end);
codes = cell2mat(codes)
To access a single element in a cell array, use curly braces:
fund.type_des{eq_code_index, 2};
This is generally simpler than using cell2mat(). If the contents of the cell are chars and you want an integer, you have to perform the conversion. str2num() is one of many options for this:
eq_code = str2num(fund.type_des{eq_code_index, 2});
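A small self-contained sketch of the same idea follows. The cell array type_des is a hypothetical stand-in for fund.type_des, assuming (as the symptoms suggest) that the codes are stored as char. It uses str2double instead of str2num; str2double does not evaluate its input as code and also accepts a cell array of char vectors, which covers the "Update" part of the question:

```matlab
% Hypothetical stand-in for fund.type_des, with the codes stored as text
type_des = {'Region', 'Code'; 'US', '1'; 'EU', '2'};

% Row of the 'EU' region (the header row counts as row 1)
eq_code_index = find(strcmp(type_des(:, 1), 'EU'));

% Curly braces extract the char content; str2double converts it to a number
eq_code = str2double(type_des{eq_code_index, 2});

% Convert the whole code column (without the header) in one call
codes = str2double(type_des(2:end, 2));
```

If the codes turn out to be stored as numbers rather than text, no conversion is needed and the curly-brace indexing alone returns a double.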

Read specific character from cell-array of string

I have a cell array of dimensions 1x6 like this:
A = {'25_2.mat','25_3.mat','25_4.mat','25_5.mat','25_6.mat','25_7.mat'};
I want to read, for example from A{1}, the number after the '_', i.e. 2 in my example.
Using cellfun, strfind and str2double
out = cellfun(@(x) str2double(x(strfind(x,'_')+1:strfind(x,'.')-1)),A)
How does it work?
The code finds the index one position after the '_' character (call it start_index) and the index one position before the '.' character (call it end_index). It then extracts all the characters from start_index to end_index, inclusive, and converts them to a number using str2double.
Sample Input:
A = {'2545_23.mat','2_3.mat','250_4.mat','25_51.mat','25_6.mat','25_7.mat'};
Output:
>> out
out =
23 3 4 51 6 7
You can access the contents of the cell by using the curly braces{...}. Once you have access to the contents, you can use indexes to access the elements of the string as you would do with a normal array. For example:
test = {'25_2.mat', '25_3.mat', '25_4.mat', '25_5.mat', '25_6.mat', '25_7.mat'}
character = test{1}(4);
If your string length is variable, you can use strfind to find the index of the character you want.
Assuming the numbers are non-negative integers after the _ sign: use a regular expression with lookbehind, and then convert from string to number:
numbers = cellfun(@(x) str2num(x{1}), regexp(A, '(?<=\_)\d+', 'match'));
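A variation on the regular-expression approach, sketched under the assumption that every name has the form prefix_number.extension, captures the digits with a token instead of a lookbehind and converts them with str2double. The variable name tok is illustrative:

```matlab
A = {'25_2.mat','25_3.mat','25_4.mat','25_5.mat','25_6.mat','25_7.mat'};

% Capture the digits between '_' and '.'; 'once' returns one match per name
tok = regexp(A, '_(\d+)\.', 'tokens', 'once');

% Each element of tok is a 1x1 cell holding the captured text
numbers = cellfun(@(t) str2double(t{1}), tok);
```

A name that does not match the pattern would yield an empty token and make the cellfun call fail, so this sketch relies on all names matching.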

Function to split string in matlab and return second number

I have a string and I need two characters to be returned.
I tried with strsplit but the delimiter must be a string and I don't have any delimiters in my string. Instead, I always want to get the second number in my string. The number is always 2 digits.
Example: 001a02.jpg I use the fileparts function to delete the extension of the image (jpg), so I get this string: 001a02
The expected return value is 02
Another example: 001A43a . Return values: 43
Another one: 002A12. Return values: 12
All the filenames are in a matrix 1002x1. Maybe I can use textscan but in the second example, it gives "43a" as a result.
(Just so this question doesn't remain unanswered, here's a possible approach: )
One way to go about this uses splitting with regular expressions (MATLAB's strsplit which you mentioned):
str = '001a02.jpg';
C = strsplit(str,'[a-zA-Z.]','DelimiterType','RegularExpression');
Results in:
C =
'001' '02' ''
In older versions of MATLAB, before strsplit was introduced, similar functionality was achieved using regexp(...,'split').
If you want to learn more about regular expressions (abbreviated as "regex" or "regexp"), there are many online resources (JGI..)
In your case, if you only need to take the 5th and 6th characters from the string you could use:
D = str(5:6);
... and if you want to convert those into numbers you could use:
E = str2double(str(5:6));
If your number is always at a certain position in the string, you can simply index this position.
In the examples you gave, the number is always the 5th and 6th characters in the string.
filename = '002A12';
num = str2num(filename(5:6));
Otherwise, if the formatting is more complex, you may want to use a regular expression. There is a similar question, "matlab - extracting numbers from (odd) string". Modifying the code found there, you can do the following:
all_num = regexp(filename, '\d+', 'match'); %Find all numbers in the filename
num = str2num(all_num{2}) %Convert second number from str
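Since the question mentions a 1002x1 collection of filenames, the same regular expression can be applied to all of them at once. The sketch below assumes the names are stored in a cell array of char vectors (called names here, with the three examples from the question as sample data) and that every name contains at least two runs of digits:

```matlab
% Sample data taken from the question; the real array would have 1002 rows
names = {'001a02'; '001A43a'; '002A12'};

% For each name, find every run of digits
all_num = regexp(names, '\d+', 'match');

% Take the second run of digits from each name and convert it to a number
second_num = cellfun(@(m) str2double(m{2}), all_num);
```

To keep the leading zero (for example '02' rather than 2), the text can be kept instead of converted, for instance with cellfun(@(m) m{2}, all_num, 'UniformOutput', false).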