Creating an array from a string concatenation in MATLAB

Hello, I have these two vectors:
Q = [1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4]
and
Year = [2000,2000,2000,2000,2001,2001,2001,2001,2002.....]
and I would like to concatenate them into one single array Time
Time = [20001,20002,20003,20004,20010....]
Or
Time= {'2000Q1', '2000Q2', '2000Q3', '2000Q4', '2001Q1'....}
So far I tried with this code
m = zeros(136,1)
for i=1:136
m(i,1)= strcat(Q(i),Year(i));
end
And MATLAB gave me this error:
Subscripted assignment dimension mismatch.
Can anyone help, please?

If your vectors Year and Q have the same number of elements, you do not need a loop. Just transpose them (or otherwise make sure they are column vectors), then concatenate with the [] operator:
Time = [ num2str(Year.') num2str(Q.') ] ;
will give you:
20001
20002
20003
20004
20011
...
And if you want the 'Q' character, insert it in the expression:
Time = [ num2str(Year.') repmat('Q',length(Q),1) num2str(Q.') ]
Will give you:
2000Q1
2000Q2
2000Q3
2000Q4
2001Q1
...
This will be a char array, if you want a cell array, use cellstr on the same expression:
time = cellstr( [num2str(Year.') repmat('Q',length(Q),1) num2str(Q.')] ) ;
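For reference, here is a minimal sketch of the steps above on a short sample (the five-element vectors are made up for illustration; the vectors in the question are longer). It assumes all years have the same number of digits and all quarters are single digits, otherwise num2str pads the shorter rows with spaces.

```matlab
% Small sample data (assumed; not the full vectors from the question)
Year = [2000 2000 2000 2000 2001];
Q    = [1 2 3 4 1];

% Column char arrays: one row per element
yearChars = num2str(Year(:));          % 5x4 char
qChars    = num2str(Q(:));             % 5x1 char
sep       = repmat('Q', numel(Q), 1);  % 5x1 char, one 'Q' per row

TimeChar = [yearChars sep qChars];     % 5x6 char array
TimeCell = cellstr(TimeChar);          % 5x1 cell array of char vectors

disp(TimeCell{5})                      % prints 2001Q1
```

Using Year(:) instead of Year.' forces a column whatever the input orientation is.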

To obtain strings:
strtrim(mat2cell(num2str([Year(:) Q(:) ],'%i%i'), ones(1,numel(Q))));
Explanation:
Concat both numeric vectors as two columns (using [...])
Convert to char array, where each row is the concatenation of two numbers (using num2str with sprintf-like format specifiers). It is assumed that all numbers are integers (if not, change the format specifiers). This may introduce unwanted spaces if not all the concatenated numbers have the same number of digits.
Convert to a cell array, putting each row in a different cell (using mat2cell).
Remove leading and trailing whitespace in each cell (using strtrim).
To obtain numbers: apply str2double to the above:
str2double(strtrim(mat2cell(num2str([Year(:) Q(:) ],'%i%i'), ones(1,numel(Q)))));
Or compute directly
10.^(floor(log10(max(Q)))+1)*Year + Q; % shift Year by the digit count of the largest Q (also correct when max(Q) is a power of 10)
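A small worked sketch of the direct computation, assuming both vectors hold non-negative integers and the second one has a maximum of at least 1. The data here is made up (months instead of quarters) so that the second vector reaches two digits; nDigits is a name introduced only for this example.

```matlab
% Hypothetical data: the second vector needs two digits
Year  = [2000 2000 2001];
Month = [9 12 1];

% Number of decimal digits of the largest value to append
nDigits = floor(log10(max(Month))) + 1;   % 2

% Shift Year left by that many decimal places, then add
Time = Year * 10^nDigits + Month;         % [200009 200012 200101]
```

Note that every value is padded to the same width, so month 9 becomes 09 in the result.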

You can use arrayfun.
If you want your output as strings (with a 'Q' in the middle), use sprintf to format each one:
Time = arrayfun( @(y,q) sprintf('%dQ%d', y, q ), Year, Q, 'uni', 0 );
This results in a cell array:
Time =
'2000Q1' '2000Q2' '2000Q3' '2000Q4' '2001Q1' '2001Q2' '2001Q3'...
Alternatively, if you skip the 'Q', you can store each number in a numeric array:
Time = arrayfun( @(y,q) y*10+q, Year, Q )
This results in a regular array:
Time =
20001 20002 20003 20004 20011 20012 20013 ...

That's because you initialize m with zeros(136,1) and then try to store a whole string in a single element, and a single double cannot hold a string.
Here are two options; I favor the first one.
1. Use a cell array, so your code becomes:
m = cell(136,1);
for ii=1:136
m{ii} = sprintf('%dQ%d', Year(ii), Q(ii)); % strcat does not convert numbers to text
end
and then m{1} will be '2000Q1'.
2. Or, if you know that your strings will ALWAYS be the same length (in your case it looks like they are always 6 characters), you can do:
strsize = 6;
m = zeros(136,strsize);
for ii=1:136
m(ii,:) = sprintf('%dQ%d', Year(ii), Q(ii));
end
and then m(1,:) will be [ 50 48 48 48 81 49 ], which read as ASCII codes is 2000Q1 (char(m) gives the text back).
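To make the cause of the error concrete, here is a short sketch (variable names are mine) showing that a six-character string does not fit into one element of a double array, while a full row of matching width accepts it and stores the character codes.

```matlab
label = sprintf('%dQ%d', 2000, 1);   % '2000Q1', a 1x6 char vector

% One element of a double array cannot take six characters at once.
m = zeros(3, 1);
try
    m(1, 1) = label;
catch err
    disp(err.message);               % a size-mismatch error, as in the question
end

% A whole row of the right width can; the characters become their codes.
m = zeros(3, numel(label));
m(1, :) = label;
codes = m(1, :);                     % [50 48 48 48 81 49]
back  = char(codes);                 % '2000Q1'
```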

Related

Converting a .txt file with 1 million digits of "e" into a vector in matlab

I have a text file containing 1 million decimal digits of the number e. The file has 12501 lines with 80 digits each, except the first and the last line, which have 76 and 4 digits. I want to convert it into a vector in MATLAB with one digit per row. I tried the num2str function, but the problem is that each number gets converted to something like '7.1828e79' (13 characters). What can I do?
P.S.1: The first two lines of the text file (76 and 80 digits) are:
7182818284590452353602874713526624977572470936999595749669676277240766303535 47594571382178525166427427466391932003059921817413596629043572900334295260595630
P.S.2: I used "dlmread" and got a 12501x1 vector, with the first and second row of 7.18281828459045e+75 and 4.75945713821785e+79 and the problem is that when I use num2str for example for the first row value, I get: '7.182818284590453e+75' as a string and not the whole 76 digits. My aim was to do something like this:
e1=dlmread('e.txt');
es1=num2str(e1);
for i=1:12501
for j=1:length(es1(1,:))
a1((i-1)*length(es1(1,:))+j)=es1(i,j);
end
end
e_digits=a1.';
but I get a string like this:
a1='7.182818284590453e+754.759457138217852e+797.381323286279435e+799.244761460668082e+796.133138458300076e+791.416928368190255e+79 5...'
with 262521 characters instead of 1 million digits.
P.S.3: I think the problem might be solved if I can manipulate the text file in a way that I have one digit on each line and simply use dlmread.
This is not hard, and there are many ways to do it.
First, load the file as a char array (a char array makes it easy to drop the line breaks):
C = fileread('yourfile.txt'); % loads the file as a char array
D = C(~isspace(C)); % removes all whitespace, including the line breaks
Next, insert a space after each character, because str2num needs the spaces to see separate numbers. You can do this with reshape, strtrim, or simply a regular expression:
E = strtrim(regexprep(D, '.{1}', '$0 ')); % adds a space after each digit
Now you can convert it into a vector using str2num:
str2num(E)'; % converts the char array to a numeric column vector
which will give you a single digit each row.
Using your example, I get a vector of 156 x 1, with 1 digit each row as you required.
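As a variation on the same idea (not part of the answer above), the digits can also be converted by character arithmetic, which avoids building a million-number string for str2num. This is only a sketch and assumes the file contains nothing but the characters 0-9 and whitespace; the two short lines below stand in for the result of fileread('e.txt').

```matlab
% Stand-in for C = fileread('e.txt'); two short lines of digits
C = sprintf('71828\n18284\n');

% Drop line breaks and any other whitespace
D = C(~isspace(C));          % '7182818284'

% Subtracting the code of '0' maps '0'..'9' to 0..9
e_digits = (D - '0').';      % 10x1 double, one digit per row
```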
You can get a digit per row like this:
fid = fopen('e.txt','r');
c = textscan(fid,'%s'); % read each line as a string
fclose(fid);
c = cat(1,c{:});
c = cellfun(@(x) str2num(reshape(x,[],1)),c,'un',0); % one column of digits per line
c = cat(1,c{:});
And it is not the only possible way.
Could you tell us what the final task is, i.e. how you plan to use the array of e digits?

How to separate the elements of a matrix with comma in Matlab

I would like to separate each element in the matrix below with a comma.
1 2 3
4 5 6
7 8 9
Here's my attempt:
s= sprintf('%.17g,',matrix)
Output = 1,2,3,4,5,6,7,8,9,
Desired output:
1, 2, 3
4, 5, 6
7, 8, 9
Thanks in advance for your suggestions.
You just need to specify the formatting of the entire first line:
s = sprintf('%.17g, %.17g, %.17g\n',matrix.')
MATLAB keeps re-using the formatting string as long as there are elements left in matrix.
To generalize this process, use the following expression:
s = sprintf([strjoin(repmat({'%.17g'},1,size(matrix,2)), ', ') '\n'], matrix.')
So there's a lot going on in this one line - let's unpack it from inside out:
repmat({'%.17g'},1,size(matrix,2))
This sub-expression takes a single cell array of size 1x1, containing the string %.17g, and duplicates it into a cell array with dimensions specified by the next two arguments. We want to construct a cell array with a single row (hence the argument 1) representing all the format specifiers (%...) we need. Since we want one instance of %.17g for each column, we use size(matrix,2) as the last argument to repmat, since that returns the number of columns of the matrix.
As an example, if you have 5 columns, you get this:
>> repmat({'%.17g'},1,5)
ans =
'%.17g' '%.17g' '%.17g' '%.17g' '%.17g'
Next, since you want columns delimited by commas and spaces, you can use strjoin():
>> strjoin(repmat({'%.17g'},1,5), ', ')
ans =
%.17g, %.17g, %.17g, %.17g, %.17g
Note the use of a comma and a space as the second argument (the delimiting string) to strjoin(). Adjust the number of spaces according to your display needs. We need one more thing to be able to print a multi-line matrix: a newline. To do this, we use the fact that two strings in square brackets [] are concatenated by MATLAB:
[strjoin(repmat({'%.17g'},1,size(matrix,2)), ', ') '\n']
This produces the final formatting string that we need. All that is left is to add the sprintf call and pass in the matrix argument. As Rijul Sudhir pointed out, you do have to transpose your matrix, because MATLAB walks down each column when pairing the matrix elements with the format specifiers.
EDIT: Stewie Griffin was correct about the transpose operation (.') - code has been corrected.
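Putting the pieces together on the 3x3 matrix from the question gives the following sketch (rowFmt is a name I introduced for the format string):

```matlab
matrix = [1 2 3; 4 5 6; 7 8 9];

% One '%.17g' per column, joined by ', ', plus a newline escape for sprintf
rowFmt = [strjoin(repmat({'%.17g'}, 1, size(matrix, 2)), ', ') '\n'];

% sprintf consumes elements column by column, so pass the transpose
s = sprintf(rowFmt, matrix.');

fprintf('%s', s);
% 1, 2, 3
% 4, 5, 6
% 7, 8, 9
```

Since the format is built from size(matrix,2), the same lines work for any number of columns.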

Read specific character from cell-array of string

I have a cell array of dimensions 1x6 like this:
A = {'25_2.mat','25_3.mat','25_4.mat','25_5.mat','25_6.mat','25_7.mat'};
For example, from A{1} I want to read the number after the '_', i.e. 2 in my example.
Use cellfun, strfind and str2double:
out = cellfun(@(x) str2double(x(strfind(x,'_')+1:strfind(x,'.')-1)),A)
How does it work?
The code finds the index of the character right after the '_'; call it start_index. It then finds the index of the character right before the '.'; call it end_index. It retrieves all the characters from start_index to end_index and finally converts them to a number using str2double.
Sample Input:
A = {'2545_23.mat','2_3.mat','250_4.mat','25_51.mat','25_6.mat','25_7.mat'};
Output:
>> out
out =
23 3 4 51 6 7
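The same steps written out for a single file name, as a sketch that assumes the name contains exactly one '_' and one '.' (with more than one of either, strfind returns several indices and the range below would need adjusting):

```matlab
x = '2545_23.mat';

start_index = strfind(x, '_') + 1;   % 6, first character after '_'
end_index   = strfind(x, '.') - 1;   % 7, last character before '.'

piece = x(start_index:end_index);    % '23'
value = str2double(piece);           % 23
```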
You can access the contents of the cell by using the curly braces{...}. Once you have access to the contents, you can use indexes to access the elements of the string as you would do with a normal array. For example:
test = {'25_2.mat', '25_3.mat', '25_4.mat', '25_5.mat', '25_6.mat', '25_7.mat'}
character = test{1}(4);
If your string length is variable, you can use strfind to find the index of the character you want.
Assuming the numbers after the _ sign are non-negative integers, use a regular expression with lookbehind, and then convert from string to number:
numbers = cellfun(@(x) str2num(x{1}), regexp(A, '(?<=\_)\d+', 'match'));
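A possible variation of the lookbehind approach, sketched here rather than taken from the answer: with the 'once' option regexp returns one char vector per cell, and str2double accepts the resulting cell array directly, so no cellfun is needed. A name without a match yields NaN instead of an error.

```matlab
A = {'25_2.mat', '25_3.mat', '250_41.mat'};

% First run of digits that directly follows an underscore, one per cell
tok = regexp(A, '(?<=_)\d+', 'match', 'once');   % {'2', '3', '41'}

% str2double converts every cell of a cell array of char vectors
numbers = str2double(tok);                       % [2 3 41]
```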

Function to split string in matlab and return second number

I have a string and I need two characters to be returned.
I tried with strsplit but the delimiter must be a string and I don't have any delimiters in my string. Instead, I always want to get the second number in my string. The number is always 2 digits.
Example: 001a02.jpg I use the fileparts function to delete the extension of the image (jpg), so I get this string: 001a02
The expected return value is 02
Another example: 001A43a . Return values: 43
Another one: 002A12. Return values: 12
All the filenames are in a matrix 1002x1. Maybe I can use textscan but in the second example, it gives "43a" as a result.
(Just so this question doesn't remain unanswered, here's a possible approach: )
One way to go about this uses splitting with regular expressions (MATLAB's strsplit which you mentioned):
str = '001a02.jpg';
C = strsplit(str,'[a-zA-Z.]','DelimiterType','RegularExpression');
Results in:
C =
'001' '02' ''
In older versions of MATLAB, before strsplit was introduced, similar functionality was achieved using regexp(...,'split').
If you want to learn more about regular expressions (abbreviated as "regex" or "regexp"), there are many online resources (JGI..)
In your case, if you only need to take the 5th and 6th characters from the string you could use:
D = str(5:6);
... and if you want to convert those into numbers you could use:
E = str2double(str(5:6));
If your number is always at a certain position in the string, you can simply index this position.
In the examples you gave, the number is always the 5th and 6th characters in the string.
filename = '002A12';
num = str2num(filename(5:6));
Otherwise, if the formatting is more complex, you may want to use a regular expression. There is a similar question, "matlab - extracting numbers from (odd) string". Modifying the code found there, you can do the following:
all_num = regexp(filename, '\d+', 'match'); %Find all numbers in the filename
num = str2num(all_num{2}) %Convert second number from str
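Applied to all three example names from the question at once, the regular-expression approach could look like this sketch. It assumes every name contains at least two runs of digits; a name with fewer would make the c{2} indexing fail.

```matlab
names = {'001a02'; '001A43a'; '002A12'};

% For each name, a cell array with all runs of digits
allNums = regexp(names, '\d+', 'match');

% Take the second run of each name and convert it to a number
second = cellfun(@(c) str2double(c{2}), allNums);   % [2; 43; 12]
```

If the leading zero matters (02 rather than 2), keep c{2} as text instead of converting it.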

Convert nonuniform cell array to numeric array

I am using xlsread in MATLAB to read in sheets from an Excel file. My goal is to have each column of the Excel sheet read as a numeric array. One of the columns has a mix of numbers and numbers followed by a character. For example, the values could be 200, 300A, 450, 500A, 200A, 100. Here is what I have so far:
[num, txt, raw] = xlsread(fileIn, sheets{ii}); % Reading in each sheet from a for loop
myCol = raw(:, 4) % I want all rows of column 4
for kk=1:numel(myCol)
if iscellstr(myCol(kk))
myCol(kk) = (cellfun(@(x)strrep(x, 'A', ''), myCol(kk), 'UniformOutput', false));
end
end
myCol = cell2mat(myCol);
This is able to strip off the char from the number but then I am left with:
myCol =
[200]
'300'
[450]
'500'
'200'
[100]
which errors out on cell2mat with:
cell2mat(myCol)
??? Error using ==> cell2mat at 46
All contents of the input cell array must be of the same data type.
I feel like I am probably mixing up () and {} somewhere. Can someone help me out with this?
Let me start by reading the file:
[num, txt, raw] = xlsread('test.xlsx');
myCol = raw(:, 4);
idx = cellfun(@ischar,myCol ); %# find strings
data = zeros(size(myCol)); %# preallocate matrix for numeric data
data(~idx) = cell2mat(myCol(~idx)); %# convert numeric data
data(idx) = str2double(regexprep(myCol(idx),'\D','')); %# remove non-digits and convert to numeric
The variable myCol is initially a cell array containing both numbers and strings, something like this in your example:
myCol = {200; '300A'; 450; '500A'; '200A'; 100};
The steps you have to follow to convert the string entries into numeric values are:
Identify the cell entries in myCol that are strings. You can use a loop to do this, as in your example, or you can use the function CELLFUN to get a logical index like so:
index = cellfun(@ischar,myCol);
Remove the letters. If you know the letters to remove will always be 'A', as in your example, you can use a simple function like STRREP on all of your indexed cells like so:
strrep(myCol(index),'A','')
If you can have all sorts of other characters and letters in the string, then a function like REGEXPREP may work better for you. For your example, you could do this:
regexprep(myCol(index),'\D','')
Convert the strings of numbers to numeric values. You can do this for all of your indexed cells using the function STR2DOUBLE:
str2double(regexprep(myCol(index),'\D',''))
The final result of the above can then be combined with the original numeric values in myCol. Putting it all together, you get the following:
>> index = cellfun(@ischar,myCol);
>> result(index,1) = str2double(regexprep(myCol(index),'\D',''));
>> result(~index) = [myCol{~index}]
result =
200
300
450
500
200
100
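One caveat with the '\D' pattern is that it also removes decimal points and minus signs. If the column can contain such values (an assumption; the question only shows positive integers), a sketch along these lines keeps them. The sample values and the name isText are made up for illustration.

```matlab
% Hypothetical column mixing numbers and text with a trailing letter
myCol = {200; '300A'; 4.5; '-12.5B'};

isText = cellfun(@ischar, myCol);     % logical index of the text entries
result = zeros(size(myCol));          % preallocate the numeric output

% Keep digits, '.' and '-'; drop every other character, then convert
cleaned = regexprep(myCol(isText), '[^0-9.\-]', '');
result(isText)  = str2double(cleaned);
result(~isText) = [myCol{~isText}];   % copy the entries that are already numeric
```

Entries that are not valid numbers after cleaning come back from str2double as NaN, which makes them easy to spot.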