how to fix and remove zero among a number to equalize the digits in an array - matlab

consider an array in MATLAB :
a = [102 20 1 30 8 255];
In this array, I need to make all the numbers to three digits by prefixing zero to all values lie this :
a = 102 020 001 030 008 255
After that, I need to reverse it again to the same. how can i do this?
I tried to separate the digits and do this. But it failed.

You want to use the notation of fprintf, which can be saved as a string with sprintf:
>> a = [102 20 1 30 8 255]
a =
102 20 1 30 8 255
>> b = sprintf('%.3d ',a) % b is a single string
b =
102 020 001 030 008 255
>> a = str2num(b)
a =
102 20 1 30 8 255

You probably need to convert to string. Have a look at the int2str or num2strfunctions for example. Then you can easily concatenate zeros at the beginning. For example:
s = int2str(10);
['0' s]
This gives you 010 as an output.
You can then revert with the str2num function.

Related

How to convert alphabets to numerical values with spaces and return it back to alphabets?

Want to convert the alphabet to numerical values and transform it back to alphabets using some mathematical techniques like fast Fourier transform in MATLAB.
Example:
The following is the text saved in "text2figure.txt" file
Hi how r u am fine take care of your health
thank u very much
am 2.0
Reading it in MATLAB:
data=fopen('text2figure.txt','r')
d=fscanf(data,'%s')
temp = fileread( 'text2figure.txt' )
temp = regexprep( temp, ' {6}', ' NaN' )
c=cellstr(temp(:))'
Now I wish to convert cell array with spaces to numerical values/integers:
coding = 'abcdefghijklmnñopqrstuvwxyz .,;'
str = temp %// example text
[~, result] = ismember(str, coding)
y=result
result =
Columns 1 through 18
0 9 28 8 16 24 28 19 28 22 28 1 13 28 6 9 14 5
Columns 19 through 36
28 21 1 11 5 28 3 1 19 5 28 16 6 28 26 16 22 19
Columns 37 through 54
28 8 5 1 12 21 8 28 0 0 21 8 1 14 11 28 22 28
Columns 55 through 71
23 5 19 26 28 13 22 3 8 0 0 1 13 28 0 29 0
Now I wish to convert the numerical values back to alphabets:
Hi how r u am fine take care of your health
thank u very much
am 2.0
How to write a MATLAB code to return the numerical values in the variable result to alphabets?
Most of the code in the question doesn't have any useful effects. These three lines are the ones that lead to result:
str = fileread('test2figure.txt');
coding = 'abcdefghijklmnñopqrstuvwxyz .,;';
[~, result] = ismember(str, coding);
ismember returns, in the second output argument, the indices into coding for each element of str. Thus, result are indices that we can use to index into coding:
out = coding(result);
However, this does not work because some elements of str do not occur in coding, and for those elements ismember returns 0, which is not a valid index. We can replace the zeros with a new character:
coding = ['*',coding];
out = coding(result+1);
Basically, we're shifting each code by one, adding a new code for 1.
One of the characters we're missing here is the newline character. Thus the three lines have become one line. You can add a code for the newline character by adding it to the coding table:
str = fileread('test2figure.txt');
coding = ['abcdefghijklmnñopqrstuvwxyz .,;',char(10)]; % char(10) is the newline character
[~, result] = ismember(str, coding);
coding = ['*',coding];
out = coding(result+1);
All of this is easier to achieve just using the ASCII code table:
str = fileread('test2figure.txt');
result = double(str);
out = char(result);

How to load a text file in Matlab when the number of values in every line are different

I have a none rectangular text file like A which has 10 values in first line, 14 values in 2nd line, 16 values in 3rd line and so on. Here is an example of 4 lines of my text file:
line1:
1.68595314026 -1.48498177528 2.39820933342 27 20 15 2 4 62 -487.471069336 -517.781921387 5 96 -524.886108398 -485.697143555
Line2:
1.24980998039 -0.988095104694 1.89048337936 212 209 191 2 1 989 -641.149658203 -249.001220703 3 1036 -608.681762695 -300.815673828
Line3:
8.10434532166 -4.81520080566 4.90576314926 118 115 96 3 0 1703 749.967773438 -754.015136719 1 1359 1276.73632813 -941.855895996 2 1497 1338.98852539 -837.659179688
Line 4:
0.795098006725 -0.98456710577 1.89322447777 213 200 68 5 0 1438 -1386.39111328 -747.421386719 1 1565 -1153.50915527 -342.951965332 2 1481 -1341.57043457 -519.307800293 3 1920 -1058.8828125 -371.696960449 4 1303 -1466.5802002 -308.764587402
Now, I want to load this text file in to a matrix M in Matlab. I tired to use importdata function for loading it
M = importdata('A.txt');
but it loads the file in a rectangular matrix (all rows have same number of columns!!!) which is not right. The expected created matrix size should be like this:
size(M(1,:))= 1 10
size(M(2,:))= 1 14
size(M(3,:))= 1 16
How can I load this text file in a correct way into Matlab?
As #Jens suggested, you should use a cell array. Assuming your file contains only numeric values separated by whitespaces, for instance:
1 3 6
7 8 9 12 15
1 2
0 3 7
You can parse it into cell array like this:
% Read full file
str = fileread('A.txt');
% Convert text in a cell array of strings
c = textscan(str, '%s', 'Delimiter', '\n');
c = c{1};
% Convert 'string' elements to 'double'
n = cellfun(#str2num, c, 'UniformOutput', false)
You can then access individual lines like this:
>> n{1}
ans =
1 3 6
>> n{2}
ans =
7 8 9 12 15

Matlab beginner median , mode and binning

I am a beginner with MATLAB and I am struggling with this assignment. Can anyone guide me through it?
Consider the data given below:
x = [ 1 , 48 , 81 , 2 , 10 , 25 , ,14 , 18 , 53 , 41, 56, 89,0, 1000, , ...
34, 47, 455, 21, , 22, 100 ];
Once the data is loaded, see if you can find any:
Outliers or
Missing data in the data file
Correct the missing values using median, mode and noisy data using median binning, mean binning and bin boundaries.
This isn't so bad. First off, take a look at the distribution of your data. You can see that the majority of your data has double digits. The outliers are those with single digits, or those that are way larger than double digits. Mind you, this is totally subjective so someone else may tell you that the single digits are part of your data too. Also, the missing data are those numbers that are spaces in between the commas. Let's write some MATLAB code and change these to NaN (or not-a-number), because if you try copying and pasting this code directly into MATLAB, it will give you a syntax error because if you are explicitly defining numbers this way, you have to be sure all of them are there.
To do this, use regexprep so that any parts of this string that have a comma, space, then another comma, put a NaN in between. To do this, we need to put this statement as a string first. We then use eval to convert this string to an actual MATLAB statement:
x = '[ 1 , 48 , 81 , 2 , 10 , 25 , ,14 , 18 , 53 , 41, 56, 89,0, 1000, , 34, 47, 455, 21, , 22, 100 ];'
y = eval(regexprep(x, ', ,', ', NaN, '));
If we display this data, we get:
y =
Columns 1 through 6
1 48 81 2 10 25
Columns 7 through 12
NaN 14 18 53 41 56
Columns 13 through 18
89 0 1000 NaN 34 47
Columns 19 through 23
455 21 NaN 22 100
As such, to answer our first question, any values that are missing are denoted as NaN and those numbers that are bigger than double digits are outliers.
For the next question, we simply extract those values that are not missing, calculate the mean and median of what is not missing, and fill in those NaN values with the mean and median. For the bin boundaries, this is the same thing as using the values to the left (or right... depends on your definition, but let's use left) of the missing value and fill those in. As such:
yMissing = isnan(y); %// Which values are missing?
y_noNaN = y(~yMissing); %// Extract the non-missing values
meanY = mean(y_noNaN); %// Get the mean
medianY = median(y_noNaN); %// Get the median
%// Output - Fill in missing values with median
yMedian = y;
yMedian(yMissing) = medianY;
%// Same for mean
yMean = y;
yMean(yMissing) = meanY;
%// Bin boundaries
yBinBound = y;
yBinBound(yMissing) = y(find(yMissing)-1);
The mean and median for the data of the non-missing values is:
meanY =
105.8500
medianY =
37.5000
The outputs for each of these, in addition to the original data with the missing values looks like:
format bank; %// Do this to show just the first two decimal places for compact output
format compact;
y =
Columns 1 through 5
1 48 81 2 10
Columns 6 through 10
25 NaN 14 18 53
Columns 11 through 15
41 56 89 0 1000
Columns 16 through 20
NaN 34 47 455 21
Columns 21 through 23
NaN 22 100
yMean =
Columns 1 through 5
1.00 48.00 81.00 2.00 10.00
Columns 6 through 10
25.00 105.85 14.00 18.00 53.00
Columns 11 through 15
41.00 56.00 89.00 0 1000.00
Columns 16 through 20
105.85 34.00 47.00 455.00 21.00
Columns 21 through 23
105.85 22.00 100.00
yMedian =
Columns 1 through 5
1.00 48.00 81.00 2.00 10.00
Columns 6 through 10
25.00 37.50 14.00 18.00 53.00
Columns 11 through 15
41.00 56.00 89.00 0 1000.00
Columns 16 through 20
37.50 34.00 47.00 455.00 21.00
Columns 21 through 23
37.50 22.00 100.00
yBinBound =
Columns 1 through 5
1.00 48.00 81.00 2.00 10.00
Columns 6 through 10
25.00 25.00 14.00 18.00 53.00
Columns 11 through 15
41.00 56.00 89.00 0 1000.00
Columns 16 through 20
1000.00 34.00 47.00 455.00 21.00
Columns 21 through 23
21.00 22.00 100.00
If you take a look at each of the output values, this fills in our data with the mean, median and also the bin boundaries as per the question.

returning a logical vector the same size as vector your checking

I have a matrix of 6500 x 3000, called deal_matrix. I have a logical vector called active_deals 6500 x 1. Lastly I have another vector called country_codes which is 6500 x 1.
active_deals = isnan(deal_matrix(:, 1));
cty = 42 % 42 is the code for the US
cty_active = zeros(6500, 1);
cty_active = active_deals(country_codes == cty, 1);
currently cty_active will be a 3200 x 1 vector, which is correct in that there are 3200 active US deals. However I would like a 6500 x 1 vector returned for calculations later on.
Please see a brief example below,
country_codes
42
51
42
42
30
33
65
based on cty = 42 the logical vector I would like return is the same size as country_codes and should look like below,
result vec
1
0
1
1
0
0
0

How do I select n elements of a sequence in windows of m ? (matlab)

Quick MATLAB question.
What would be the best/most efficient way to select a certain number of elements, 'n' in windows of 'm'. In other words, I want to select the first 50 elements of a sequence, then elements 10-60, then elements 20-70 ect.
Right now, my sequence is in vector format(but this can easily be changed).
EDIT:
The sequences that I am dealing with are too long to be stored in my RAM. I need to be able to create the windows, and then call upon the window that I want to analyze/preform another command on.
Do you have enough RAM to store a 50-by-nWindow array in memory? In that case, you can generate your windows in one go, and then apply your processing on each column
%# idxMatrix has 1:50 in first col, 11:60 in second col etc
idxMatrix = bsxfun(#plus,(1:50)',0:10:length(yourVector)-50); %'#
%# reshapedData is a 50-by-numberOfWindows array
reshapedData = yourVector(idxMatrix);
%# now you can do processing on each column, e.g.
maximumOfEachWindow = max(reshapedData,[],1);
To complement Kerrek's answer: if you want to do it in a loop, you can use something like
n = 50
m = 10;
for i=1:m:length(v)
w = v(i:i+n);
% Do something with w
end
There's a slight issue with the description of your problem. You say that you want "to select the first 50 elements of a sequence, then elements 10-60..."; however, this would translate to selecting elements:
1-50
10-60
20-70
etc.
That first sequence should be 0-10 to fit the pattern which of course in MATLAB would not make sense since arrays use one-indexing. To address this, the algorithm below uses a variable called startIndex to indicate which element to start the sequence sampling from.
You could accomplish this in a vectorized way by constructing an index array. Create a vector consisting of the starting indices of each sequence. For reuse sake, I put the length of the sequence, the step size between sequence starts, and the start of the last sequence as variables. In the example you describe, the length of the sequence should be 50, the step size should be 10 and the start of the last sequence depends on the size of the input data and your needs.
>> startIndex = 10;
>> sequenceSize = 5;
>> finalSequenceStart = 20;
Create some sample data:
>> sampleData = randi(100, 1, 28)
sampleData =
Columns 1 through 18
8 53 10 82 82 73 15 66 52 98 65 81 46 44 83 9 14 18
Columns 19 through 28
40 84 81 7 40 53 42 66 63 30
Create a vector of the start indices of the sequences:
>> sequenceStart = startIndex:sequenceSize:finalSequenceStart
sequenceStart =
10 15 20
Create an array of indices to index into the data array:
>> index = cumsum(ones(sequenceSize, length(sequenceStart)))
index =
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
>> index = index + repmat(sequenceStart, sequenceSize, 1) - 1
index =
10 15 20
11 16 21
12 17 22
13 18 23
14 19 24
Finally, use this index array to reference the data array:
>> sampleData(index)
ans =
98 83 84
65 9 81
81 14 7
46 18 40
44 40 53
Use (start : step : end) indexing: v(1:1:50), v(10:1:60), etc. If the step is 1, you can omit it: v(1:50).
Consider the following vectorized code:
x = 1:100; %# an example sequence of numbers
nwind = 50; %# window size
noverlap = 40; %# number of overlapping elements
nx = length(x); %# length of sequence
ncol = fix((nx-noverlap)/(nwind-noverlap)); %# number of sliding windows
colindex = 1 + (0:(ncol-1))*(nwind-noverlap); %# starting index of each
%# indices to put sequence into columns with the proper offset
idx = bsxfun(#plus, (1:nwind)', colindex)-1; %'
%# apply the indices on the sequence
slidingWindows = x(idx)
The result (truncated for brevity):
slidingWindows =
1 11 21 31 41 51
2 12 22 32 42 52
3 13 23 33 43 53
...
48 58 68 78 88 98
49 59 69 79 89 99
50 60 70 80 90 100
In fact, the code was adapted from the now deprecated SPECGRAM function from the Signal Processing Toolbox (just do edit specgram.m to see the code).
I omitted parts that zero-pad the sequence in case the sliding windows do not evenly divide the entire sequence (for example x=1:105), but you can easily add them again if you need that functionality...