Visualizing duplicate points in a scatter plot in MATLAB

I have the following problem: I need to build a scatter plot of my data. That works fine, except that the data contains some duplicates:
x = [11, 10, 3, 8, 2, 6, 2, 3, 3, 2, 3, 2, 3, 2, 2, 2, 3, 3, 2, 2];
y = [29, 14, 28, 19, 25, 21, 27, 15, 24, 23, 23, 18, 0, 26, 11, 27, 23, 30, 30, 25];
One can see that there are two elements each at (2,25), (2,27) and (3,23).
So if I plot this data with a regular scatter(x,y), I lose this information.
The workaround I found is to use the undocumented 'jitter' parameter:
scatter(x,y, 'jitter','on', 'jitterAmount', 0.06);
But I do not like the look of the result.
What I am trying to achieve is a plot where the number of duplicates is shown next to the point (if the number is more than 1), or maybe inside the point.
Any idea how to achieve this?

You can do that pretty easily. Let's break it down into two parts:
First you need to identify the unique 2D points and count them. That's what the unique and accumarray functions are for. Read through the documentation if you don't immediately understand what they do and what outputs they return:
x = [11 10 3 8 2 6 2 3 3 2 3 2 3 2 2 2 3 3 2 2];
y = [29 14 28 19 25 21 27 15 24 23 23 18 0 26 11 27 23 30 30 25];
A=[x' y'];
[Auniq,~,IC] = unique(A,'rows');
cnt = accumarray(IC,1);
Now each row of Auniq contains one unique 2D point, while cnt contains the number of occurrences of each of those points:
>> [cnt Auniq]
ans =
1 2 11
1 2 18
1 2 23
2 2 25
1 2 26
...etc
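As a quick sanity check of the counting step, the duplicated points can be listed directly. This is only a sketch built on the same unique/accumarray calls; the variable name dups is introduced here for illustration:

```matlab
x = [11 10 3 8 2 6 2 3 3 2 3 2 3 2 2 2 3 3 2 2];
y = [29 14 28 19 25 21 27 15 24 23 23 18 0 26 11 27 23 30 30 25];
[Auniq,~,IC] = unique([x' y'],'rows');   % unique (x,y) rows, sorted
cnt = accumarray(IC,1);                  % how often each unique row occurs
dups = Auniq(cnt>1,:)                    % only the points that occur more than once
% dups =
%      2    25
%      2    27
%      3    23
```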
There are many ways to display the number of occurrences. As you mentioned, you could put the numbers inside or next to the scatter markers; other options are color encoding, marker size, and so on. Let's go through a few of these; you can of course also combine them.
Number next to marker
scatter(Auniq(:,1), Auniq(:,2));
for ii=1:numel(cnt)
if cnt(ii)>1
text(Auniq(ii,1)+0.2,Auniq(ii,2),num2str(cnt(ii)), ...
'HorizontalAlignment','left', ...
'VerticalAlignment','middle', ...
'FontSize', 6);
end
end
xlim([1 11]);ylim([0 30]);
Number inside marker
scatter(Auniq(:,1), Auniq(:,2), (6+2*(cnt>1)).^2); % make the ones where we'll put a number inside a bit bigger
for ii=1:numel(cnt)
if cnt(ii)>1
text(Auniq(ii,1),Auniq(ii,2),num2str(cnt(ii)), ...
'HorizontalAlignment','center', ...
'VerticalAlignment','middle', ...
'FontSize', 6);
end
end
As you can see, I enlarged the markers simply by passing a size argument to the scatter function itself.
Color encoding
scatter(Auniq(:,1), Auniq(:,2), [], cnt);
colormap(jet(max(cnt))); % just for the looks of it
After this you can add a colorbar or legend to indicate the number of occurrences per color.
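The marker-size option mentioned earlier is not spelled out in the answer. A minimal sketch, assuming the marker area should grow linearly with the count (the factor 36 is an arbitrary choice made here), combined with the color encoding:

```matlab
x = [11 10 3 8 2 6 2 3 3 2 3 2 3 2 2 2 3 3 2 2];
y = [29 14 28 19 25 21 27 15 24 23 23 18 0 26 11 27 23 30 30 25];
[Auniq,~,IC] = unique([x' y'],'rows');
cnt = accumarray(IC,1);
sz = 36*cnt;                                         % marker area in points^2, one entry per unique point
scatter(Auniq(:,1), Auniq(:,2), sz, cnt, 'filled');  % size and color both encode the count
colorbar;                                            % maps colors back to counts
```

Duplicated points then show up both larger and in a different color than single points.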

Related

replacing values of a matrix with an if operation using matlab

mn = 1
for kn = 1:199
for sn = 1:19773
if abs((x1c{kn+1,1}(sn)) - (x1c{kn,1}(sn))) >= 20
extract{mn} = x1c{kn+1,1}(sn);
mn = mn+1;
end
end
end
extend = cell2mat(extract) + 40;
How can I change the values of "x1c" with the values of "extend"?
You are performing the operation on a cell. Since you're comparing numbers, this can be done far more efficiently with matrices.
I therefore suggest you convert the cell (or a subset of it) to a matrix and then use vectorized operations, like this:
>> a={[13, 2, 3], [14, 25, 8], [100, 9, 10], [101, 8, 32], [140, 20, 3]};
>>
>> x = transpose(reshape(cell2mat(a), 3, []));
>> z = abs(x(2:end, :) - x(1:end-1,:)) > 20;
>> z2 = [zeros(1,3); z]
z2 =
0 0 0
0 1 0
1 0 0
0 0 1
1 0 1
>> x(logical(z2)) = x(logical(z2)) - 200
x =
13 2 3
14 -175 8
-100 9 10
101 8 -168
-60 20 -197
There are two alternatives if you really must use cells (I don't recommend it for speed reasons):
1. Store the indices (kn, sn) of the cell items where your condition holds true, and then loop over the elements again (very inefficient).
2. Store the previous and next cell "row" in temporary variables and compare using those. When the condition holds, edit in place and carry the temporary variable over to the next iteration of the loop. The code below shows how this is done:
a={[13, 2, 3], [14, 25, 8], [100, 9, 10], [101, 8, 32], [140, 20, 3]};
curr_row = a{1};
for rowind=1:4
next_row = a{rowind+1};
for colind=1:3
if abs(next_row(1, colind) - curr_row(1, colind)) > 20
a{rowind+1}(1, colind) = a{rowind+1}(1, colind) + 40;
end
end
curr_row = next_row;
end
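To connect this back to the question's own condition (>= 20, then + 40), here is a hedged sketch of the matrix route that writes the result back into the cell. The small x1c below is a hypothetical stand-in for the asker's data, and it assumes every cell holds a numeric vector of the same length:

```matlab
x1c = {[13 2 3]; [14 25 8]; [100 9 10]};          % hypothetical stand-in, one vector per cell
X = cell2mat(cellfun(@(v) v(:).', x1c, 'UniformOutput', false));  % one matrix row per cell
jump = abs(diff(X, 1, 1)) >= 20;                  % compare each row with the row before it
mask = [false(1, size(X, 2)); jump];              % first row has no predecessor
X(mask) = X(mask) + 40;                           % same adjustment as "extend" in the question
x1c = num2cell(X, 2);                             % back to a column cell of row vectors
```

As in the loop version, each comparison uses the original (unmodified) values of the previous row.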

How to fprintf column headers in MATLAB? [duplicate]

I have two matrices A and B as given below:
A = [1
2
3
4
5]
B = [10 11 12 13
15 16 17 18
17 12 15 13
20 21 22 17
40 41 32 33]
and I would like to output it to a text file in the form given below with column headers as shown:
Desired text output
A B B B B
1 10 11 12 13
2 15 16 17 18
3 17 12 15 13
4 20 21 22 17
5 40 41 32 33
Reproducible code
A = [1; 2; 3; 4; 5];
B = [10, 11, 12, 13;
15, 16, 17, 18;
17, 12, 15, 13;
20, 21, 22, 17;
40, 41, 32, 33;];
ALL = [A B];
ALL_cell = mat2cell(ALL, ones(size(ALL,1),1), size(ALL,2));
fID = fopen('output.dat','w');
f = @(x) fprintf(fID,'%s\n',sprintf('%f\t',x));
cellfun(f,ALL_cell);
fclose(fID);
How to insert the column headers as shown above with MATLAB? Sometimes, the columns in B could be over 100, as an example I have given only 4.
In my private collection of utility scripts, I have a function to 'pretty-print' matrices:
function pprint(fid, M, cols)
fprintf(fid, '%s\t', cols{:});
fprintf(fid, '\n');
for irow=1:size(M, 1)
fprintf(fid, '% .3f\t', M(irow,:));
fprintf(fid, '\n');
end
You could use it like this:
>> headers = [repmat({'A'}, 1, size(A, 2)), repmat({'B'}, 1, size(B, 2))]
headers =
'A' 'B' 'B' 'B' 'B'
>> fid = 1; % print to stdout
>> pprint(fid, [A, B], headers)
A B B B B
1.000 10.000 11.000 12.000 13.000
2.000 15.000 16.000 17.000 18.000
3.000 17.000 12.000 15.000 13.000
4.000 20.000 21.000 22.000 17.000
5.000 40.000 41.000 32.000 33.000
Note that the headers and the columns only line up nicely if the column labels are not too long. You might have to play with adding extra tabs, or use spaces instead of tabs (i.e. use '%10s' instead of '%s\t', and '%10.3f' instead of '% .3f\t').
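The fixed-width variant suggested in the note can be sketched as follows, writing to the asker's output.dat. The column width of 10 is taken from the note; building one format string with repmat is a choice made for this sketch, not part of pprint:

```matlab
A = [1; 2; 3; 4; 5];
B = [10 11 12 13; 15 16 17 18; 17 12 15 13; 20 21 22 17; 40 41 32 33];
M = [A B];
headers = [repmat({'A'}, 1, size(A, 2)), repmat({'B'}, 1, size(B, 2))];
fid = fopen('output.dat', 'w');
fprintf(fid, '%10s', headers{:});                 % right-aligned labels, 10 characters each
fprintf(fid, '\n');
rowfmt = [repmat('%10.3f', 1, size(M, 2)) '\n'];  % one format for a whole row
fprintf(fid, rowfmt, M.');                        % transpose: fprintf consumes data column by column
fclose(fid);
```

Because the format string is generated from size(M, 2), the same code should work when B has 100 or more columns.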

Select numbers from array which are much greater than the rest

Say there is an array of n elements, and among those n elements there are some numbers which are much bigger than the rest.
So, I might have:
16, 1, 1, 0, 5, 0, 32, 6, 54, 1, 2, 5, 3
In this case, I'd be interested in 32, 16 and 54.
Or, I might have:
32, 105, 26, 5, 1, 82, 906, 58, 22, 88, 967, 1024, 1055
In this case, I'd be interested in 1024, 906, 967 and 1055.
I'm trying to write a function to extract the numbers of interest. The problem is that I can't define a threshold to determine what's "much greater", and I can't just tell it to get the x biggest numbers because both of these will vary depending on what the function is called against.
I'm a little stuck. Does anyone have any ideas how to attack this?
Just taking all the numbers larger than the mean doesn't always cut it. For example, suppose you have only one number which is much larger, but many more numbers which are close to each other. The one large number won't shift the mean very much, which results in taking too many numbers:
data = [ones(1,10) 2*ones(1,10) 10];
data(data>mean(data))
ans =
2 2 2 2 2 2 2 2 2 2 10
If you look at the differences between numbers, this problem is solved:
>> data = [16, 1, 1, 0, 5, 0, 32, 6, 54, 1, 2, 5, 3];
sorted_data = sort(data);
dd = diff(sorted_data);
mean_dd = mean(dd);
ii = find(dd> 2*mean_dd,1,'first');
large_numbers = sorted_data(ii+1:end) % dd(ii) is the gap between sorted_data(ii) and sorted_data(ii+1)
large_numbers =
16 32 54
The threshold factor (2 in this case) lets you tune how large a gap has to be before the numbers above it count as "much greater".
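Applied to the second array from the question, the gap-based approach can be sketched like this. Since dd(ii) is the difference between sorted_data(ii) and sorted_data(ii+1), the sketch takes the elements from ii+1 onward, so that the last "small" number is not included:

```matlab
data = [32, 105, 26, 5, 1, 82, 906, 58, 22, 88, 967, 1024, 1055];
sorted_data = sort(data);
dd = diff(sorted_data);                    % gaps between neighbouring sorted values
ii = find(dd > 2*mean(dd), 1, 'first');    % first gap that is unusually large
large_numbers = sorted_data(ii+1:end)      % everything above that gap
% large_numbers =
%    906   967  1024  1055
```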
If it were me, I'd use a little more statistical insight, which would give the most flexibility for the code in the future.
x = [1 2 3 2 2 1 4 6 15 83 2 4 22 81 0 8 7 7 7 3 1 2 3]
EpicNumbers = x( x>(mean(x) + std(x)) )
Then you can increase or decrease the number of standard deviations to broaden or tighten your threshold.
LessEpicNumbers = x( x>(mean(x) + 2*std(x)) )
MoreEpicNumbers = x( x>(mean(x) + 0.5*std(x)) )
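For comparison, here is the same idea on the first array from the question, with the number of standard deviations pulled out into a variable (k is a name introduced for this sketch). With k = 1 it keeps 32 and 54 but not 16, which shows that the multiplier still has to be tuned per data set:

```matlab
x = [16, 1, 1, 0, 5, 0, 32, 6, 54, 1, 2, 5, 3];
k = 1;                               % number of standard deviations above the mean
picked = x(x > mean(x) + k*std(x))   % logical indexing keeps the original order
% picked =
%     32    54
```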
A simple solution would be to use find and a threshold based on the mean value (or multiples thereof):
a = [16, 1, 1, 0, 5, 0, 32, 6, 54, 1, 2, 5, 3]
find(a>mean(a))
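Note that find returns the positions of the matching elements, not the elements themselves. A short sketch showing both (idx and vals are names chosen here):

```matlab
a = [16, 1, 1, 0, 5, 0, 32, 6, 54, 1, 2, 5, 3];
idx = find(a > mean(a))   % positions of the elements above the mean
vals = a(idx)             % the elements themselves; a(a > mean(a)) gives the same
% idx  =  1   7   9
% vals = 16  32  54
```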

Getting the indices of the max values of matrix columns in MATLAB

I need to get the indices of the maximum values of the columns in a matrix, for example:
a =
16 2 3 13
5 11 10 8
9 7 6 12
4 14 15 1
and I want to get
[1, 4, 4, 1]
which are the indices of 16,14,15,13 i.e. the maximum value in each column. I discovered that
max(a,[],1)
returns
[16, 14, 15, 13]
How can I get their indices?
You need to find indices, not the numbers themselves, so you need the second output argument.
[~,I] = max(a)
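A small sketch with the matrix from the question. Passing the dimension explicitly, as in the asker's max(a,[],1), keeps the column-wise behaviour even if a happens to have a single row; the conversion to linear indices with sub2ind is an optional extra added here:

```matlab
a = [16 2 3 13; 5 11 10 8; 9 7 6 12; 4 14 15 1];
[m, I] = max(a, [], 1)                      % m: column maxima, I: their row indices
% m = 16 14 15 13
% I =  1  4  4  1
lin = sub2ind(size(a), I, 1:size(a, 2));    % linear indices, so that a(lin) equals m
```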

Changing the values in a matrix in MATLAB

I have an UNBALANCED dataset containing five fields like:
a_code b_code sector year value
1 2 15 1970 1000
2 3 16 1971 2900
3 2 15 1970 3900
I want to create a 4-dimensional matrix in MATLAB for the "value" field. So I want to have a value field in a matrix such as M(a_code,b_code,sector,year) = value. I have 75 a_code, 75 b_code, 19 sectors and 45 years. So a NaN matrix is (75,75,19,45).
Since my dataset is not balanced (for example I don't have a value for a_code = 3, b_code = 1, sector = 15, year = 1970), I don't have a value for each (a_code, b_code, sector, year) combination. For the unavailable values, I want to have NaN. I know how to create a 4-dimensional matrix with NaN values, but how do I replace these NaN values with the ones in my dataset?
Probably I should write a loop, but I don't know how.
Here is some simple code to fulfill your requirements:
D= [1 2 15 1970 1000; 2 3 16 1971 2900; 3 2 15 1970 3900];
m= min(D(:, 1: end- 1))- 1;
shape= max(D(:, 1: end- 1))- m; % the shifted indices n= D- m run from 1 to max- m
X= NaN(shape);
for k= 1: size(D, 1)
n= D(k, 1: end- 1)- m;
X(sub2ind(shape, n(1), n(2), n(3), n(4)))= D(k, end);
end
X(1, 1, 1, 1) %=> 1000
X(2, 2, 2, 2) %=> 2900
X(3, 1, 1, 1) %=> 3900
You may want to elaborate on your specific situation; there may exist more suitable approaches. For example, from your question it's not quite clear why you need your data represented as a 4D array.
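The loop above can also be written without a loop, because sub2ind accepts whole index columns. This is a sketch under the same assumption as the answer, namely that indices are shifted so the smallest code in each field maps to 1:

```matlab
D = [1 2 15 1970 1000; 2 3 16 1971 2900; 3 2 15 1970 3900];
m = min(D(:, 1:4), [], 1) - 1;                   % per-field offset so indices start at 1
n = bsxfun(@minus, D(:, 1:4), m);                % shifted subscripts, one row per record
shape = max(n, [], 1);                           % smallest array that holds all records
X = NaN(shape);                                  % missing combinations stay NaN
X(sub2ind(shape, n(:,1), n(:,2), n(:,3), n(:,4))) = D(:, 5);
X(1, 1, 1, 1) %=> 1000
X(2, 2, 2, 2) %=> 2900
X(3, 1, 1, 1) %=> 3900
```

If the codes already run from 1 to 75 and the array should have the fixed size from the question, shape could instead be set by hand and only the sector and year columns shifted.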