Given a vector of numbers (size = n), we want to find the number that is the 'best'.
The criterion for the best number is that its frequency should be > 50% of the total size of the vector.
It is given that there will always be exactly one best number.
How would you solve this with O(1) space complexity and O(n) time complexity?
e.g. input: [1, 1, 1, 3, 3, 3, 3]
ans: 3, because its frequency (4) is greater than 50% of 7 (i.e. 3.5)
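One standard technique that meets these bounds (not named in the question, but the usual choice) is the Boyer-Moore majority vote: keep one candidate and a counter, and let mismatching values cancel the candidate out. A minimal MATLAB sketch, assuming the majority element is guaranteed to exist:
nums = [1, 1, 1, 3, 3, 3, 3];
candidate = nums(1);
count = 0;
for k = 1:numel(nums)
    if count == 0
        candidate = nums(k);   % adopt a new candidate once the old one is cancelled out
    end
    if nums(k) == candidate
        count = count + 1;
    else
        count = count - 1;
    end
end
candidate   % 3 for this input
Because a true majority element outnumbers all other values combined, it always survives the cancellations, and only the candidate and the counter are stored.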
Let's say that we have the following series of arrays:
A = [1, 2, -2, -24];
B = [1, 4, -7, -2];
C = [3, 1, -7, -14];
D = [11, 4, -7, -1];
E = [1, 2, -3, -4];
F = [5, 14, -17, -12];
I would like to create two arrays:
the first will be the maximum of each column over all arrays,
i.e.
Maxi = [11, 14, -2, -1];
the second will be the minimum of each column over all arrays,
i.e.
Mini = [1, 1, -17, -24];
I have been trying all day, using loops with max and abs, but I can't make it work.
In my actual problem I have a matrix (100, 200), so the above example is just a simplified version of the problem. The ultimate goal is to get a kind of fit of the 100 y_lines of 200 x_points each. The idea is to calculate two lines (i.e. max, min) that will be the "visual" borders of all lines (the maximum and minimum values for each x). The next step will be to calculate an array that is the average of these two arrays, so that in the end there is a line between all the lines.
Any help is more than welcome!
How about this?
Suppose you stack all the row vectors, namely A, B, ..., F, as
arr = [A; B; C; D; E; F]; % stack the vectors
and then use the max(), min() and mean() functions provided by MATLAB. That is,
Maxi = max(arr); % Maxi is a row vector carrying the max of each column of arr
Mini = min(arr);
Meani = mean(arr);
You just have to stack them as shown above. If you have hundreds of row vectors, you can build arr in a loop instead, as sketched below.
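For instance, with 100 rows of 200 points each (hypothetical random data below as a stand-in for the real measurements), the stacking and the column-wise reductions might look roughly like this:
nRows = 100; nCols = 200;
arr = zeros(nRows, nCols);        % preallocate the stacked matrix
for k = 1:nRows
    arr(k, :) = rand(1, nCols);   % replace with the k-th measured line
end
Maxi = max(arr);                  % column-wise maxima: the upper "border" line
Mini = min(arr);                  % column-wise minima: the lower "border" line
Middle = (Maxi + Mini) / 2;       % a line halfway between the two borders
Here Middle is just an illustrative name for the average of the two border lines mentioned in the question.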
Let X = [1, 2, 3, 4, 5] and Y = [1, 2, 1, 0, 1] be vectors where X maps into Y.
Now I want to identify the maximum and minimum of Y, which is easy: [value_min, id_min] = min(Y) = [0, 4] and [value_max, id_max] = max(Y) = [2, 2].
Then I want to remove the element from X corresponding to the minimum in Y and expand evenly around the element in X corresponding to the maximum in Y, while keeping the number of points equal. For this example we remove X(4)=[]. Then we expand like X(2)=(X(2) + X(1))/2 and X(3)=(X(3) + X(2))/2 (using the original values, so the old X(2) is replaced by the two midpoints with its neighbours), such that X looks like X = [1, 1.5, 2.5, 3, 5]. How can I achieve this? I think there is a general pattern.
Solution
The following snippet should work for any vector of length N. Note that the first and final elements are fixed (only the interior points of Y are searched).
N = length(X);
[~, id_max] = max(Y(2:N-1)); id_max = id_max + 1;   % offset back into the full vector
% replace the maximum point by the midpoints with its two neighbours
X = [X(1:id_max-1), (X(id_max-1) + X(id_max))/2, (X(id_max) + X(id_max+1))/2, X(id_max+1:end)];
[~, id_min] = min(Y(2:N-1)); id_min = id_min + 1;
if id_min > id_max, id_min = id_min + 1; end        % the insertion shifted later indices
X(id_min) = [];                                     % remove the point at the minimum, restoring the original length
Here is a solution to your problem, but there are a few things you should take care of.
% Any Vector should work
X=[1 2 3 4 5];
Y=[1 2 1 0 1];
% We don't need the actual min/max values, only their indices
[~,MIN]=min(Y(2:end-1));
[~,MAX]=max(Y(2:end-1));
% We don't look at the first element, so the indices have to be increased by 1
MIN=MIN+1;
MAX=MAX+1;
X(MIN)=[]; % taking out the smallest element
Xnew = [X(1:MAX) X(MAX:end)]; % extend the vector by taking the MAX value twice
% the mean of 2 elements is (A+B)/2
Xnew(MAX) = mean(Xnew(MAX-1:MAX)); % the left copy and its left neighbour
Xnew(MAX+1) = mean(Xnew(MAX+1:MAX+2)); % the right copy and its right neighbour
%rewrite X and clear Xnew
X=Xnew;
clear Xnew;
First of all, this isn't very efficient, but if it's just used to modify some vectors and isn't called a million times a day, it will do the trick.
In your text you say to remove the minimum and then stretch around the maximum; in your solution metacode it is the other way around. This will influence the outcome when the min and max are next to each other, so please check which way you prefer.
Y isn't changed by this at all, so the procedure can't be performed multiple times on the same vector.
Is N (the length) of any importance later on? If not, you can always just refer to end.
Good day,
I have a question about something I want to achieve without a loop, if possible. As the title says, I need to do a windowed subtraction of vectors that are not the same size and then find the mean of the results.
As an example, let's say that we have vector a = [2 3 4 5 6] and vector b = [1 2].
The program has to move a window the size of the smaller vector (in this example vector b) over the bigger one (vector a): it starts at the first two elements of vector a, subtracts vector b element-wise, then sums the results and finds the mean.
In this example it computes the subtractions 2-1 = 1 and 3-2 = 1, sums the results (1+1 = 2) and divides by 2 (because vector b has that size). The final result is 1.
Then we move the window to the second position in vector a (values 3 and 4, i.e. indices 2 and 3) and repeat the process up to the last elements of vector a.
As the final result we need to get a vector c, which consists of the elements [1 2 3 4] for this example.
Is this possible to do without looping? I have data sets over 10k in size. Thanks in advance.
I can solve it with only one loop, iterating through "b" (two iterations in your example).
Declare the vectors (as columns! This is needed for the MATLAB computations below to work):
a = [2 3 4 5 6]';
b = [1 2]';
Declare a matrix for the computed results. Each column holds the subtractions of elements in "a" by one of the elements in "b":
c = zeros(length(a)-length(b)+1,length(b));
for k = 1:length(b)
c(:,k) = a(k:length(a)-length(b)+k)-b(k);
end
Now just sum the elements in "c" row-wise and divide by the length of "b" to get the mean:
result = sum(c,2)/length(b);
You can simplify this for your exact example, but this is a generic solution for any vectors "a" and "b", where "b" is the smaller vector.
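If you want to drop the loop completely: subtracting b inside each window and then averaging is the same as taking the moving mean of a over windows of length(b) and subtracting mean(b). So, as a sketch (only checked against the small example above), conv can do the whole job in one line:
a = [2 3 4 5 6]';
b = [1 2]';
% moving mean of "a" over windows of length(b), minus the mean of "b"
c = conv(a, ones(length(b), 1) / length(b), 'valid') - mean(b);
% c is [1; 2; 3; 4] for this example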
I have an N-by-M matrix, for example named A. After some processing I want to count the zero elements.
How can I do this in one line of code? I tried A==0, which returns a 2D matrix.
There is a function to find the number of nonzero matrix elements: nnz. You can use this function on a logical matrix, where it returns the number of true elements.
In this case, we apply nnz to the matrix A==0, whose elements are true if the original element was 0 and false for any other element.
A = [1, 3, 1;
0, 0, 2;
0, 2, 1];
nnz(A==0) %// returns 3, i.e. the number of zeros of A (the amount of true in A==0)
The credits for the benchmarking belong to Divarkar.
Benchmarking
Using the following parameters and inputs, one can benchmark the solutions presented here with timeit; a minimal sketch of such a harness is shown after the list below.
Input sizes
Small sized datasize - 1:10:100
Medium sized datasize - 50:50:1000
Large sized datasize - 500:500:4000
Varying % of zeros
~10% of zeros case - A = round(rand(N)*5);
~50% of zeros case - A = rand(N);A(A<=0.5)=0;
~90% of zeros case - A = rand(N);A(A<=0.9)=0;
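As a rough idea of how such a harness could be set up (the variable names below are illustrative, not the original benchmark script), timing the NNZ and SUM solutions for the medium-sized, ~50% zeros case:
datasizes = 50:50:1000;                    % medium-sized case
t_nnz = zeros(size(datasizes));
t_sum = zeros(size(datasizes));
for k = 1:numel(datasizes)
    N = datasizes(k);
    A = rand(N); A(A <= 0.5) = 0;          % ~50% of zeros case
    t_nnz(k) = timeit(@() nnz(A == 0));    % NNZ method
    t_sum(k) = timeit(@() sum(A(:) == 0)); % SUM method
end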
The results are shown in three sets of plots, for small, medium, and large datasizes (plots not reproduced here).
Observations
If you look closely at the NNZ and SUM performance plots for medium and large datasizes, you will notice that their performance gets closer for the 10% and 90% zeros cases. For the 50% zeros case, the performance gap between the SUM and NNZ methods is comparatively wider.
As a general observation across all datasizes and all three zero-fraction cases, the SUM method seems to be the undisputed winner. Interestingly, the general-case solution sum(A(:)==0) seems to perform better than sum(~A(:)).
Some basic MATLAB to know: the (:) operator flattens any matrix into a column vector, ~ is the NOT operator (flipping zeros to ones and nonzero values to zero), and then we just use sum:
sum(~A(:))
This should also be about 10 times faster than the length(find(...)) scheme, in case efficiency is important.
Edit: in the case of NaN values, you can resort to the solution:
sum(A(:)==0)
I'll add something to the mix as well. You can use histc and compute the histogram of the entire matrix. The second parameter specifies the bin edges at which the numbers should be collected; if we just want to count the zeroes, we can simply specify 0 as that parameter. However, if you pass a matrix to histc, it operates along the columns, whereas we want to operate on the entire matrix. As such, simply reshape the matrix into a column vector A(:) and use histc. In other words, do this:
histc(A(:), 0)
This should be equivalent to counting the number of zeroes in the entire matrix A.
Well, I don't know if I'm answering the question well, but you could code it as follows:
% Random Matrix
M = [1 0 4 8 0 6;
0 0 7 4 8 0;
8 7 4 0 6 0];
n = size(M,1); % Number of rows of M
p = size(M,2); % Number of columns of M
nbrOfZeros = 0; % counter
for i = 1:n
    for j = 1:p
        if M(i,j) == 0
            nbrOfZeros = nbrOfZeros + 1;
        end
    end
end
nbrOfZeros
I'm trying to program a function (or even better, find one if it already exists) in Scilab that calculates regularly timed samples of values.
I.e. I have a vector 'values' which contains the value of a signal at different times. These times are in the vector 'times', so at time times(N) the signal has value values(N).
At the moment the times are not regular, so the variables 'times' and 'values' can look like:
times = [0, 2, 6, 8, 14]
values= [5, 9, 10, 1, 6]
This means the signal had value 5 from second 0 to second 2, value 9 from second 2 to second 6, etc.
Therefore, if I want to calculate the signal's average value, I cannot just take the average of the vector 'values': the signal can keep the same value for a long time, yet there will be only one entry in the vector.
One option is to use the time deltas to calculate the mean, but I will also need to perform other calculations: averages, etc.
Another option is to create a function that, given a deltaT, resamples the time and values vectors to produce an equally spaced time vector and corresponding values. For example, with deltaT = 2 and the previous vectors,
[sampledTime, sampledValues] = regularSample(times, values, 2)
sampledTime = [0, 2, 4, 6, 8, 10, 12, 14]
sampledValues = [5, 9, 9, 10, 1, 1, 1, 6]
This is easy if deltaT is small enough to fit exactly between all the times. If deltaT is bigger, then some averaging of the values or another approximation must be done...
Is there anything already done in Scilab?
How can this function be programmed?
Thanks a lot!
PS: I don't know if this is the correct forum to post Scilab questions, so any pointer would also be useful.
If you'd like to implement it yourself, you can use a weighted sum.
times = [0, 2, 6, 8, 14]
values = [5, 9, 10, 1, 6]
weightedSum = 0
highestIndex = length(times)
for i=1:(highestIndex-1)
// Get the amount of time a certain value contributed
deltaTime = times(i+1) - times(i);
// Add the weighted amount to the total weighted sum
weightedSum = weightedSum + deltaTime * values(i);
end
totalTimeDelta = times($) - times(1);
average = weightedSum / totalTimeDelta
printf( "Result is %f", average )
Or, if you want functionally the same but less readable code:
timeDeltas = diff(times)
sum(timeDeltas.*values(1:$-1))/sum(timeDeltas)
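As for the regularSample function asked about above: if a simple zero-order hold is enough (each requested time takes the last recorded value at or before it, with no averaging inside a step, i.e. the 'easy' case where deltaT fits the recorded times), a rough Scilab sketch could look like this. The name regularSample and the hold behaviour are assumptions from the question, not an existing Scilab function:
function [sampledTime, sampledValues] = regularSample(times, values, deltaT)
    // Sketch only: assumes times is sorted and deltaT > 0
    sampledTime = times(1):deltaT:times($);
    sampledValues = zeros(sampledTime);
    for k = 1:length(sampledTime)
        idx = find(times <= sampledTime(k));  // samples at or before this time
        sampledValues(k) = values(idx($));    // hold the last known value
    end
endfunction
[sampledTime, sampledValues] = regularSample([0, 2, 6, 8, 14], [5, 9, 10, 1, 6], 2)
// sampledValues is [5, 9, 9, 10, 1, 1, 1, 6] for this input
For a coarser deltaT, the hold would have to be replaced by a time-weighted average of the samples inside each step, along the lines of the weighted sum shown above.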