Say there is an array of n elements, and among them there are some numbers that are much bigger than the rest.
So, I might have:
16, 1, 1, 0, 5, 0, 32, 6, 54, 1, 2, 5, 3
In this case, I'd be interested in 32, 16 and 54.
Or, I might have:
32, 105, 26, 5, 1, 82, 906, 58, 22, 88, 967, 1024, 1055
In this case, I'd be interested in 1024, 906, 967 and 1055.
I'm trying to write a function to extract the numbers of interest. The problem is that I can't define a threshold to determine what's "much greater", and I can't just tell it to get the x biggest numbers because both of these will vary depending on what the function is called against.
I'm a little stuck. Does anyone have any ideas how to attack this?
Just taking all the numbers larger than the mean doesn't always cut it. For example, suppose you have only one number that is much larger, plus many more numbers that are close to each other. The one large number won't shift the mean very much, so too many numbers are selected:
data = [ones(1,10) 2*ones(1,10) 10];
data(data>mean(data))
ans =
2 2 2 2 2 2 2 2 2 2 10
If you look at the differences between numbers, this problem is solved:
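>> data = [16, 1, 1, 0, 5, 0, 32, 6, 54, 1, 2, 5, 3];
sorted_data = sort(data);
dd = diff(sorted_data);
mean_dd = mean(dd);
% dd(ii) is the gap between sorted_data(ii) and sorted_data(ii+1),
% so the large numbers start one position after the first big gap
ii = find(dd > 2*mean_dd, 1, 'first');
large_numbers = sorted_data(ii+1:end)
large_numbers =
16 32 54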
The threshold factor (2 in this case) lets you play with the meaning of "how much greater" a number has to be.
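For anyone who wants to try the same idea outside MATLAB, here is a rough Python sketch using only the standard library. The function name large_by_gap and the factor parameter are my own choices, not anything from the answer above. Note the i + 1: the gap at position i sits between element i and element i + 1 of the sorted list, so the large values start one position after the gap.

```python
from statistics import mean

def large_by_gap(values, factor=2.0):
    """Return the values that lie above the first unusually large gap.

    Hypothetical helper: 'factor' plays the role of the 2 in 2*mean_dd.
    """
    ordered = sorted(values)
    # Differences between sorted neighbours, like diff(sort(data)).
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        return []
    cutoff = factor * mean(gaps)
    for i, gap in enumerate(gaps):
        if gap > cutoff:
            # gaps[i] separates ordered[i] from ordered[i + 1].
            return ordered[i + 1:]
    return []  # no gap stands out

first = large_by_gap([16, 1, 1, 0, 5, 0, 32, 6, 54, 1, 2, 5, 3])
second = large_by_gap([32, 105, 26, 5, 1, 82, 906, 58, 22, 88, 967, 1024, 1055])
print(first)   # [16, 32, 54]
print(second)  # [906, 967, 1024, 1055]
```

On the two example arrays from the question this picks out exactly the numbers of interest, but it is only a sketch: data with several well-separated clusters may need a different factor.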
If it were me, I'd use a little more statistical insight; that would give the code the most flexibility in the future.
x = [1 2 3 2 2 1 4 6 15 83 2 4 22 81 0 8 7 7 7 3 1 2 3]
EpicNumbers = x( x>(mean(x) + std(x)) )
Then you can increase or decrease the number of standard deviations to broaden or tighten your threshold.
LessEpicNumbers = x( x>(mean(x) + 2*std(x)) )
MoreEpicNumbers = x( x>(mean(x) + 0.5*std(x)) )
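The same thresholding can be sketched in Python with the statistics module. The helper name above_mean_plus_k_std is made up for illustration. I am assuming the sample standard deviation (statistics.stdev, normalised by N-1), which is also what MATLAB's std uses by default.

```python
from statistics import mean, stdev

def above_mean_plus_k_std(values, k=1.0):
    """Keep the values greater than mean + k * (sample standard deviation)."""
    threshold = mean(values) + k * stdev(values)
    return [v for v in values if v > threshold]

x = [1, 2, 3, 2, 2, 1, 4, 6, 15, 83, 2, 4, 22, 81, 0, 8, 7, 7, 7, 3, 1, 2, 3]

epic = above_mean_plus_k_std(x)           # k = 1
looser = above_mean_plus_k_std(x, k=0.0)  # plain "greater than the mean"
print(epic)    # [83, 81]
print(looser)  # [15, 83, 22, 81]
```

For this particular x the two extreme values inflate the standard deviation so much that k = 0.5, 1 and 2 all select the same two numbers; only dropping k towards 0 lets 15 and 22 through.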
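A simple solution would be to use find and a threshold based on the mean value (or multiples thereof). Note that find returns the indices of the matching elements, so use a(find(...)) or a(a>mean(a)) to get the values themselves: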
a = [16, 1, 1, 0, 5, 0, 32, 6, 54, 1, 2, 5, 3]
find(a>mean(a))
I'm trying to make a sample lottery checker.
I'm using Python.
x = [1,2,3,4,5,6]
y = [[1,2,3,4,5,6] # 6 numbers hit
,[1,2,3,4,6,7] # 5 numbers hit
,[2,3,4,6,7,8] # 4 numbers hit
,[4,5,6,7,8,9] # 3 numbers hit
,[1,2,7,8,9,10] # 2 numbers hit
,[4,7,8,9,10,11] # 1 number hit
,[7,8,9,10,11,12]]
output: (including the number of hits)
[1,2,3,4,5,6] 6 numbers hit
[1,2,3,4,6,7] 5 numbers hit
[2,3,4,6,7,8] 4 numbers hit
[4,5,6,7,8,9] 3 numbers hit
[1,2,7,8,9,10] 2 numbers hit
[4,7,8,9,10,11] 1 number hit
I tried using the any() function, but it only returns True or False.
Please help.
Data:
x = [1,2,3,4,5,6]
y = [[1,2,3,4,5,6] # 6 numbers hit
,[1,2,3,4,6,7] # 5 numbers hit
,[2,3,4,6,7,8] # 4 numbers hit
,[4,5,6,7,8,9] # 3 numbers hit
,[1,2,7,8,9,10] # 2 numbers hit
,[4,7,8,9,10,11] # 1 number hit
,[7,8,9,10,11,12]]
Code:
for ticket in y:
    print(ticket)
    count = 0
    # count how many of the winning numbers appear on this ticket
    for item in x:
        if item in ticket:
            count += 1
    print(count, "numbers hit!")
Output:
[1, 2, 3, 4, 5, 6]
6 numbers hit!
[1, 2, 3, 4, 6, 7]
5 numbers hit!
[2, 3, 4, 6, 7, 8]
4 numbers hit!
[4, 5, 6, 7, 8, 9]
3 numbers hit!
[1, 2, 7, 8, 9, 10]
2 numbers hit!
[4, 7, 8, 9, 10, 11]
1 numbers hit!
[7, 8, 9, 10, 11, 12]
0 numbers hit!
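A shorter variant, in case it helps: a sketch that counts hits with a set intersection instead of the nested loop. It assumes no number is repeated within a ticket (true for lottery tickets), and it skips tickets with zero hits, as in the output the question asks for. The names winning and hits are mine.

```python
x = [1, 2, 3, 4, 5, 6]
y = [[1, 2, 3, 4, 5, 6],
     [1, 2, 3, 4, 6, 7],
     [2, 3, 4, 6, 7, 8],
     [4, 5, 6, 7, 8, 9],
     [1, 2, 7, 8, 9, 10],
     [4, 7, 8, 9, 10, 11],
     [7, 8, 9, 10, 11, 12]]

winning = set(x)
# One hit count per ticket: size of the overlap with the winning numbers.
hits = [len(winning.intersection(ticket)) for ticket in y]

for ticket, n in zip(y, hits):
    if n > 0:
        print(ticket, n, "number hit" if n == 1 else "numbers hit")
```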
I have 2 arrays (vectors, in Matlab vernacular?) and I want to sort them in unison. How can I achieve this in Matlab?
For example; I have found the peaks from a histogram and they are stored in 2 arrays; peakXVals, peakYVals. They will always be arranged in ascending x axis index. So they will always look like:
peakXVals = [0, 3, 20, 77, 240];
peakYVals = [10, 999, 30, 40, 20];
I wish to sort both arrays based on the values in peakYVals in descending order, i.e. from largest peak to smallest peak. So the desired result is:
peakXVals = [3, 77, 20, 240, 0];
peakYVals = [999, 40, 30, 20, 10];
What functions can I use to achieve this in Matlab?
Use sort:
peakXVals = [0, 3, 20, 77, 240];
peakYVals = [10, 999, 30, 40, 20];
>> [B,I] = sort(peakYVals, 'descend')
B =
999 40 30 20 10
I =
2 4 3 5 1
Then:
>> peakXVals_sorted = peakXVals(I)
peakXVals_sorted =
3 77 20 240 0
>> peakYVals_sorted = B
peakYVals_sorted =
999 40 30 20 10
You can arrange the two vectors as columns of a matrix and sort the rows of that matrix as atoms, in lexicographical order. Then the results are the columns of the sorted matrix:
tmp = sortrows([peakYVals(:) peakXVals(:)], 'descend');
peakYVals = tmp(:,1).';
peakXVals = tmp(:,2).';
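For comparison, the index-based approach (sort one vector, keep the permutation, apply it to the other) can be sketched in plain Python as follows. The variable names are my own, and the permutation here is 0-based, unlike the I returned by Matlab's sort.

```python
peak_x = [0, 3, 20, 77, 240]
peak_y = [10, 999, 30, 40, 20]

# Permutation that sorts peak_y from largest to smallest (0-based indices).
order = sorted(range(len(peak_y)), key=lambda i: peak_y[i], reverse=True)

# Apply the same permutation to both lists so they stay in step.
sorted_x = [peak_x[i] for i in order]
sorted_y = [peak_y[i] for i in order]

print(order)     # [1, 3, 2, 4, 0]
print(sorted_x)  # [3, 77, 20, 240, 0]
print(sorted_y)  # [999, 40, 30, 20, 10]
```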
Is there a handy NumPy equivalent of the Matlab function randperm that randomly selects K items out of M items in total (M > K) and returns the selected indices?
In Matlab,
randperm(100,10)
ans =
82 90 13 89 61 10 27 51 97 88
Yes, with the numpy.random.choice function.
>>> numpy.random.choice(100, 10, replace=False)
array([89, 99, 27, 39, 80, 31, 6, 0, 40, 93])
Note that the resulting range is 0 to M-1. If you need 1 to M like MATLAB, add 1 to the result:
>>> numpy.random.choice(100, 10, replace=False) + 1
array([ 28, 23, 15, 90, 18, 65, 86, 100, 99, 1])
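If NumPy is not available, a similar draw without replacement can be sketched with the standard library. This is a swapped-in alternative, not the numpy.random.choice call shown above, and pick_indices is a hypothetical helper name. It returns 1-based values like Matlab's randperm(M, K).

```python
import random

def pick_indices(m, k):
    """Pick k distinct integers from 1..m in random order (1-based, like Matlab)."""
    return random.sample(range(1, m + 1), k)

picked = pick_indices(100, 10)
print(picked)  # ten distinct values between 1 and 100; differs on every run
```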
mn = 1
for kn = 1:199
    for sn = 1:19773
        if abs((x1c{kn+1,1}(sn)) - (x1c{kn,1}(sn))) >= 20
            extract{mn} = x1c{kn+1,1}(sn);
            mn = mn+1;
        end
    end
end
extend = cell2mat(extract) + 40;
How can I change the values of "x1c" with the values of "extend"?
You are performing the operation on a cell array. Since you're comparing numbers, this is done far more efficiently with matrices.
I therefore suggest you convert the cell (or a subset of it) to a matrix and then use vectorized operations, like this:
>> a={[13, 2, 3], [14, 25, 8], [100, 9, 10], [101, 8, 32], [140, 20, 3]};
>>
>> x = transpose(reshape(cell2mat(a), 3, []));
>> z = abs(x(2:end, :) - x(1:end-1,:)) > 20;
>> z2 = [zeros(1,3); z]
z2 =
0 0 0
0 1 0
1 0 0
0 0 1
1 0 1
>> x(logical(z2)) = x(logical(z2)) - 200
x =
13 2 3
14 -175 8
-100 9 10
101 8 -168
-60 20 -197
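To make the mask logic explicit, here is the same computation as a plain-Python sketch on the toy data above (threshold 20 and offset -200 copied from the example; the names rows, mask and adjusted are mine). The mask is built from the original values first and only then applied, which is what the vectorized Matlab version does as well.

```python
rows = [[13, 2, 3], [14, 25, 8], [100, 9, 10], [101, 8, 32], [140, 20, 3]]

# First row has no predecessor, so nothing is flagged there (like zeros(1,3)).
mask = [[False] * len(rows[0])]
# Flag every element that differs from the one directly above it by more than 20.
mask += [[abs(b - a) > 20 for a, b in zip(prev, curr)]
         for prev, curr in zip(rows, rows[1:])]

# Apply the offset only where the mask is set.
adjusted = [[v - 200 if flagged else v for v, flagged in zip(row, mask_row)]
            for row, mask_row in zip(rows, mask)]

for row in adjusted:
    print(row)
```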
There are two alternatives if you really must use cells (I don't recommend it, for speed reasons):
1. Store the indices (kn, sn) of the cell items where your condition holds true. You then have to loop over those elements again (very inefficient).
2. Store the previous and next cell "row" in temporary variables and compare using those. When the condition holds, edit in place and carry the temporary variable over into the next iteration of the loop. The code below shows how this is done:
a={[13, 2, 3], [14, 25, 8], [100, 9, 10], [101, 8, 32], [140, 20, 3]};
curr_row = a{1};
for rowind=1:4
    next_row = a{rowind+1};
    for colind=1:3
        if abs(next_row(1, colind) - curr_row(1, colind)) > 20
            a{rowind+1}(1, colind) = a{rowind+1}(1, colind) + 40;
        end
    end
    curr_row = next_row;
end
I have the following problem: I need to build the scatterplot of the data. Everything nice, but there is some duplicate data there:
x = [11, 10, 3, 8, 2, 6, 2, 3, 3, 2, 3, 2, 3, 2, 2, 2, 3, 3, 2, 2];
y = [29, 14, 28, 19, 25, 21, 27, 15, 24, 23, 23, 18, 0, 26, 11, 27, 23, 30, 30, 25];
One can see that three points occur twice: (2, 25), (2, 27) and (3, 23).
So if I plot this data with a regular scatter(x,y), I lose this information, because the duplicates are drawn on top of each other.
The way out of this that I have found is to use the undocumented 'jitter' parameter:
scatter(x,y, 'jitter','on', 'jitterAmount', 0.06);
But I do not like how that looks.
What I am trying to achieve is a plot where the number of duplicates is shown next to the point (if the number is more than 1), or maybe inside the point.
Any idea how to achieve this?
You can do that pretty easily. Let's break it down into two parts:
First you're gonna need to identify the unique 2d points and count them. That's what we have the unique and accumarray function for. Read through the documentation if you don't immediately understand what they're doing and what outputs they have:
x = [11 10 3 8 2 6 2 3 3 2 3 2 3 2 2 2 3 3 2 2];
y = [29 14 28 19 25 21 27 15 24 23 23 18 0 26 11 27 23 30 30 25];
A=[x' y'];
[Auniq,~,IC] = unique(A,'rows');
cnt = accumarray(IC,1);
Now each row of Auniq contains one unique 2d point, while cnt contains the number of occurrences of each of those points:
>> [cnt Auniq]
ans =
1 2 11
1 2 18
1 2 23
2 2 25
1 2 26
...etc
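The counting step itself is not specific to Matlab. As a cross-check, here is a small Python sketch that counts the (x, y) pairs with collections.Counter; the names counts and duplicates are my own.

```python
from collections import Counter

x = [11, 10, 3, 8, 2, 6, 2, 3, 3, 2, 3, 2, 3, 2, 2, 2, 3, 3, 2, 2]
y = [29, 14, 28, 19, 25, 21, 27, 15, 24, 23, 23, 18, 0, 26, 11, 27, 23, 30, 30, 25]

# Pair the coordinates up and count how often each 2d point occurs.
counts = Counter(zip(x, y))

# Keep only the points that occur more than once.
duplicates = {point: n for point, n in counts.items() if n > 1}

print(len(counts))  # 17 unique points
print(duplicates)   # {(2, 25): 2, (2, 27): 2, (3, 23): 2}
```

The keys of counts correspond to the rows of Auniq and the values to cnt, although Counter keeps insertion order rather than the sorted order that unique(A,'rows') produces.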
For displaying the number of occurrences, there are a great many possibilities. As you mentioned, you could put the numbers inside or next to the scatter markers; other options are color encoding or the size of the markers. Let's do all of these. You can of course also combine them!
Number next to marker
scatter(Auniq(:,1), Auniq(:,2));
for ii=1:numel(cnt)
    if cnt(ii)>1
        text(Auniq(ii,1)+0.2,Auniq(ii,2),num2str(cnt(ii)), ...
            'HorizontalAlignment','left', ...
            'VerticalAlignment','middle', ...
            'FontSize', 6);
    end
end
xlim([1 11]);ylim([0 30]);
Number inside marker
scatter(Auniq(:,1), Auniq(:,2), (6+2*(cnt>1)).^2); % make the ones where we'll put a number inside a bit bigger
for ii=1:numel(cnt)
    if cnt(ii)>1
        text(Auniq(ii,1),Auniq(ii,2),num2str(cnt(ii)), ...
            'HorizontalAlignment','center', ...
            'VerticalAlignment','middle', ...
            'FontSize', 6);
    end
end
As you can see, I enlarged the markers very simply via the size argument of the scatter function itself.
Color encoding
scatter(Auniq(:,1), Auniq(:,2), [], cnt);
colormap(jet(max(cnt))); % just for the looks of it
After that you can add a colorbar or legend to indicate the number of occurrences per color.