I have a vector containing a time series with different values and some missing values in between that are set to zero:
X=[0,0,2,0,5,0,0,0,4,0];
I want to create a new vector where the missing values (zeros) are populated by the previous value, if one exists, so that I get a new vector looking like:
Z=[0,0,2,2,5,5,5,5,4,4];
I have been browsing through the Matlab help and forums like this one to find a neat function that would solve this in one line or similar, but I have failed to do so. I can solve the problem in a few steps as shown below, but I am guessing that there must be a better and easier solution available?
Current solution:
X=[0,0,2,0,5,0,0,0,4,0];
ix=logical(X);
Y = X(ix);
ixc=cumsum(ix);
Z=[zeros(1,sum(~logical(ixc))) Y(ixc(logical(ixc)))];
This does the trick, but it seems like an overly complicated solution to a simple problem, so can anyone help me with a better one? Thanks.
Here's a somewhat simpler version using cumsum:
X=[0,0,2,0,5,0,0,0,4,0];
%# find the entries where X is different from zero
id = find(X);
%# If we ran cumsum on X directly, we'd have the
%# problem that each non-zero entry to the left would
%# be added to subsequent non-zero entries. Thus,
%# subtract each non-zero entry from its non-zero
%# neighbor to the right
X(id(2:end)) = X(id(2:end)) - X(id(1:end-1));
%# run cumsum to fill in values from the left
Y = cumsum(X)
Y =
0 0 2 2 5 5 5 5 4 4
Here's a little something I wrote up. Does this do the trick?
% INPUT: the array you would like to populate
% OUTPUT: the populated array
function popArray = populate(array)
popArray = array;
% Loops through all the array elements and if it equals zero, replaces it
% with the previous element
%
% Since there is no element before the first to potentially populate it, this
% starts with the second element.
for ii = 2:length(popArray)
    if array(ii) == 0
        popArray(ii) = popArray(ii-1);
    end
end
disp(popArray);
Let me suggest another vectorized solution (though I like the one by @Jonas better):
X = [0 0 2 0 5 0 0 0 4 0]
id = find(X);
X(id(1):end) = cell2mat( arrayfun(@(a,b)a(ones(1,b)), ...
    X(id), [diff(id) numel(X)-id(end)+1], 'UniformOutput',false) )
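For comparison, here is a minimal sketch of an index-based variant that leaves X untouched. It is not taken from any of the answers above; the names idx and vals are only illustrative. cumsum over the non-zero mask gives, for each position, the number of non-zero entries seen so far, which can be used to look up the most recent non-zero value.

```matlab
X = [0,0,2,0,5,0,0,0,4,0];
%# idx(k) = number of non-zero entries in X(1:k); 0 means "none seen yet"
idx = cumsum(X ~= 0);
%# prepend a 0 so that leading zeros (idx == 0) map to 0
vals = [0, X(X ~= 0)];
%# look up the most recent non-zero value for every position
Z = vals(idx + 1);
```

The +1 offset is the only subtle part: it shifts the lookup so that positions before the first non-zero entry read the prepended 0.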
I need to find all possible combinations of numbers 1:8 such that sum of all elements is equal to 8
The combinations need to be arranged in an ascending order.
Eg
1 7
2 2 4
1 3 5
1 2 2 3
1 1 1 1 1 1 1 1
A number can repeat within a combination, but a combination must not appear more than once in a different order,
i.e. 1 2 2 3 and 2 1 2 3 count as the same combination.
I need the solutions in ascending order, so there will be only one arrangement of every combination.
I tried a few codes online suggested on Find vector elements that sum up to specific number in MATLAB
VEC = [1:8];
NUM = 8;
n = length(VEC);
finans = zeros(2^n-1,NUM);
for i = 1:(2^n - 1)
ndx = dec2bin(i,n) == '1';
if sum(VEC(ndx)) == NUM
l = length(VEC(ndx));
VEC(ndx)
end
end
but they don't include the possibilities where the numbers repeat.
I found a better approach through recursion and it's more elegant (I like elegant) and faster than my previous attempt (0.00399705213 seconds on my computer).
EDIT: You will need my custom function stretchmat.m that stretches a vector to fit the size of another matrix. Kinda like repmat but stretching the first parameter (see help for details). Very useful!
script.m
% Define function to prepend a variable i to a cell x
cellprepend = @(x,i) {[i x]};
% Execute and time function
tic;
a = allcomb(cellprepend,1,8); % Solution in a
toc;
allcomb.m
function a = allcomb( cellprepend, m, n )
% Add entire block as a combination
a{1} = n;
% Exit recursion if block size 1
if n == 1
return;
end
% Recurse cutting blocks at different segments
for i = m:n/2
b = allcomb(cellprepend,i,n-i);
a = [a cellfun( cellprepend, b, num2cell( stretchmat( i, b ) ) )];
end
end
The idea is simple. Searching every candidate for those that add to 8 is exhaustive; if you generate only valid answers, you can do a depth-first search by breaking the problem into 2 blocks. This can be written recursively as I did above and is kinda similar to Merge Sort. The allcomb call takes the block size (n) and finds all the ways of breaking it up into smaller pieces.
We want non-zero pieces, so in principle the first block could range over 1:n-1; the loop above narrows this to m:n/2 so that the first block is never larger than what remains. It then prepends the first block to all the combinations of the second block. By only calling allcomb on one of the blocks, we can ensure that all solutions are unique.
As for the sorting, I'm not quite sure what you mean by ascending. From what I see, you appear to be sorting from the last number in ascending order. Can you confirm? Any sort can be appended to the end of script.m.
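Since stretchmat.m is not shown here, the following is a minimal dependency-free sketch of the same idea. It is an iterative depth-first search rather than the recursion above, and the names stack, parts, remaining and lo are only illustrative. Each partial sequence is extended with a value that is at least as large as its last element, and only if the remainder can still be completed.

```matlab
n = 8;
stack = {[]};   % partial sequences still to be extended
parts = {};     % completed sequences that sum to n
while ~isempty(stack)
    p = stack{end};
    stack(end) = [];
    remaining = n - sum(p);
    if remaining == 0
        parts{end+1} = p; %#ok<SAGROW>
        continue
    end
    % the next piece must not be smaller than the previous one
    if isempty(p), lo = 1; else, lo = p(end); end
    for next = lo:remaining
        % extend only if the rest is used up or can still hold a piece >= next
        if remaining == next || remaining - next >= next
            stack{end+1} = [p next]; %#ok<SAGROW>
        end
    end
end
```

Every sequence in parts is in non-decreasing order, so each combination appears exactly once; for n = 8 this yields 22 sequences.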
EDIT 2/3 Notes
For the permutatively unique case, the code can be found here
Thanks to @Simon for helping me QA the code multiple times
EDIT: Look at my second more efficient answer!
The Naive approach! Where the cartprod.m function can be found here.
% Create all permutations
p(1:8) = {0:8};
M = fliplr( cartprod( p{:} ) );
% Check sums
r = sum( M, 2 ) == 8;
M = M(r,:); % Solution here
There are definitely more efficient solutions than this but if you just need a quick and dirty solution for small permutations, this will work. Please note that this made Matlab take 3.5 GB of RAM to temporarily store the permutations.
First save all combinations with repetitions in a cell array. In order to do that, just use nmultichoosek.
v = 1 : 8;
combs = cell(1,length(v));
for i = v
combs{i} = nmultichoosek(v,i);
end
In this way, each element of combs contains a matrix where each row is a combination. For instance, the i-th row of combs{4} is a combination of four numbers.
Now you need to check the sum. In order to do that to all the combinations, use cellfun
sums = cellfun(@(x)sum(x,2),combs,'UniformOutput',false);
sums contains the vectors with the sum of all combinations. For instance, sums{4} has the sums of the numbers in each combination of combs{4}.
The next step is check for the fixed sum.
fixed_sum = 10;
indices = cellfun(@(x)x==fixed_sum,sums,'UniformOutput',false);
indices contains arrays of logical values, telling if the combination satisfies the fixed sum. For instance, indices{4}(1) tells you if the first combination with 4 numbers sums to fixed_sum.
Finally, retrieve all valid combinations in a new cell array, sorting them at the same time.
valid_combs = cell(1,length(v));
for i = v
idx = indices{i};
c = combs{i};
valid_combs{i} = sortrows(c(idx,:));
end
valid_combs is a cell similar to combs, but with only combinations that sum up to your desired value, and sorted by the number of numbers used: valid_combs{1} has all valid combinations with 1 number, valid_combs{2} with 2 numbers, and so on. Also, thanks to sortrows, combinations with the same amount of numbers are also sorted. For instance, if fixed_sum = 10 then valid_combs{8} is
1 1 1 1 1 1 1 3
1 1 1 1 1 1 2 2
This code is quite efficient, on my very old laptop I am able to run it in 0.016947 seconds.
The title is a little vague, but I'm not sure how to put it differently. What I have is a pretty long array, say of length 10000, that contains the values 1,2 and 3. They are often located in long strings of the same number, such as
[1111111111122222222211111222222222233333332222]
The data denote 3 states of something, which are 1, 2 and 3. The only transitions possible are 1 <-> 2, 2 <-> 3, not 1 <-> 3.
In general the strings are very long, and it is thus unlikely to observe something like [111121111], where it changes to 2 for a single element and then back. However, due to errors in the measurements these things do come in, and I'm trying to find a way to filter them out in MATLAB. So what I want to do is remove all elements, for which the number of consecutive identical elements is smaller than some number X. If it is very difficult to do for general X, X = 1 is a very good start!
Personally, I have no idea how to tackle this. I imagine using diff can tell you where the elements change and where they change again, and then somehow, from those indices, you can find the length of the sequences. Then, using some if conditions, you can remove them. This should probably be done backwards, as the size of the array will change. I'm still trying to get something working along these lines, but no success so far. Maybe someone could give me a hint?
Approach 1 (uses bsxfun. Inefficient. I recommend the second approach; see note 1 at the end.)
The following code detects the beginning of short runs. What to do then is not clear from your question (Remove those entries? Fill them with the preceding value?).
x = '1111111111122222222211111222222222233333332222'; %// data (string)
len = 5; %// runs of this length or shorter will be detected
ind = find(diff(x-'0')~=0) + 1; %// index of changes
mat = bsxfun(@minus, ind.', ind); %'// distance between changes
mat = tril(mat); %// only distance to *previous* changes, not to *later* changes
mat(mat==0) = NaN;
result = ind(any(mat<=len)); %// index of beginning of short runs
In this example the result is
result =
21
Note that the last run is not considered. So in the example, even though the last run is shorter than len, it is not detected as too short. If you need to also detect that run, change the ind line to
ind = find([diff(x-'0') inf]~=0) + 1;
In this case,
result =
21 43
Approach 2 (uses diff. Much more efficient than approach 1.)
It suffices to compare each index with the preceding index, instead of with all other indices as above. Also, as per comments, short runs need to be replaced with the preceding value, and the last run should also be detected if it's short:
%// Data
x = '1111111111122222222211111222222222233333332222'; %// data (string)
len = 5; %// runs of this length or shorter will be detected
%// Detect beginning of short runs
ind = find([diff(x-'0') inf]~=0) + 1;
starts = ind(diff(ind)<=len); %// index of beginning of short runs
%// Replace short runs with preceding value
ind = [ind numel(x)+1]; %// extend ind in case last run was detected as short
for k = find(diff(ind)<=len)
x(ind(k):ind(k+1)-1) = x(ind(k)-1); %// replace
end
1 Why do I keep approach 1, then? Well, it got me four upvotes before approach 2 occurred to me, so there must be something to it (I suspect that has something to do with bsxfun...)
This could be one approach -
%%// Input string
a1 = '111111111112222222221111122222222221111133333332222'
th = 10 %%// Less than or equal to 10 consecutive occurrences shall be removed
str1 = num2str(a1=='1','%1d')
t1 = strfind(['0' str1 '0'],'01')' %%//'
t2 = strfind(['0' str1 '0'],'10')' %%//'
t3 = [t1 t2-1]
t4 = t3([t2-t1]<=th,:)
ind1 = true(size(a1))
for k=1:size(t4,1)
ind1(t4(k,1):t4(k,2))=false;
end
out = a1(ind1) %%// Output string
Output -
out =
11111111111222222222222222222233333332222
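For the removal variant described in the question (drop every run shorter than some minimum length, whatever its value), here is a minimal sketch on a numeric vector. It is not from the answers above; minLen, runId and runLen are illustrative names, and the short test vector is made up for the example.

```matlab
x = [1 1 1 1 1 2 1 1 1 1 2 2 2 2 3 3 3 3 2];   %// made-up data
minLen = 2;                          %// runs shorter than this are removed
isStart = [true, diff(x) ~= 0];      %// true where a new run begins
runId = cumsum(isStart);             %// run number of every element
runLen = accumarray(runId(:), 1).';  %// length of each run
keep = runLen(runId) >= minLen;      %// per element: is its run long enough?
out = x(keep);
```

Because the mask is built for the whole vector before anything is deleted, there is no need to work backwards. Note that removing a short run can join its two neighbours into one longer run, as happens with the single 2 between the runs of 1 here.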
I need to calculate the frequency of each value in another vector in MATLAB.
I can use something like
for i=1:length(pdata)
gt(i)=length(find(pf_test(:,1)==pdata(i,1)));
end
But I prefer not to use loop because my dataset is quite large. Is there anything like histc (which is used to find the frequency of values in one vector) to find the frequency of one vector value in another vector?
If your values are only integers, you could do the following:
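range = min(pf_test):max(pf_test);   % candidate integer values
count = histc(pf_test,range);        % occurrences of each candidate in pf_test
[tf,loc] = ismember(pdata,range);    % position of each pdata value in range
gt = zeros(size(pdata));
gt(tf) = count(loc(tf));             % values absent from pf_test keep a count of 0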
If you can't guarantee that the values are integers, it's a bit more complicated. One possible method of it would be the following:
% sort your first vector (as a column) so that equal values are adjacent
spf_test = sort(pf_test(:));
% Find the first and last occurrence of each distinct value
[vals,last] = unique(spf_test,'last');
[~,first] = unique(spf_test,'first');
% Number of occurrences of each distinct value
count = (last-first)+1;
% Initialise gt, one entry per element of the second vector
gt = zeros(size(pdata));
% Fill gt, restricting yourself to values that appear in the first vector
[filter,loc] = ismember(pdata,vals);
gt(filter) = count(loc(filter));
EDIT: Note that I may have got the vectors the wrong way around - if this doesn't work as expected, switch pf_test and pdata. It wasn't immediately clear to me which was which.
You mention histc. Why are you not using it (in its version with two input parameters)?
>> pdata = [1 1 3 2 3 1 4 4 5];
>> pf_test = 1:6;
>> histc(pdata,pf_test)
ans =
3 1 2 2 1 0
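If the values are not integers and no natural bin edges exist, one loop-free alternative is to count with unique and accumarray. This is only a sketch under the assumption that gt should hold, for each element of pdata, its number of occurrences in pf_test; the sample data and the names vals, grp and counts are made up for illustration.

```matlab
pf_test = [0.5 1.5 0.5 2.5 0.5 1.5];   % made-up data to count in
pdata   = [0.5 2.5 3.5];               % made-up values to look up
% distinct values of pf_test and, for each element, the index of its value
[vals, ~, grp] = unique(pf_test(:));
% number of occurrences of each distinct value
counts = accumarray(grp, 1);
% look up each pdata value; values not present in pf_test keep a count of 0
[tf, loc] = ismember(pdata, vals);
gt = zeros(size(pdata));
gt(tf) = counts(loc(tf));
```

With this data, 0.5 occurs three times, 2.5 once, and 3.5 not at all.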
Hi everyone. Let's say I have the following (3x3) matrix A:
0 1 3
0 0 3
0 0 0
My question is how to find the value that occurs only once in that matrix using MATLAB.
In this case, the result should be 1.
I have tried using
value=unique(A)
but it returned the vector [0;1;3], which is not what I want.
I would much appreciate it if you could help me solve this problem. Thank you!
Here is a short one
value = A(sum(bsxfun(@eq, A(:), A(:).'))==1);
It compares all pairs of elements in the matrix and counts how many times they are equal and returns the ones that have been counted only once.
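To make the pairwise counting concrete, here is the same one-liner split into steps on the matrix from the question. The intermediate names eqMat and counts are only illustrative.

```matlab
A = [0 1 3; 0 0 3; 0 0 0];
v = A(:);                          % column-major: 0 0 0 1 0 0 3 3 0
% eqMat(i,j) is true when v(i) equals v(j)
eqMat = bsxfun(@eq, v, v.');
% column sums: how many elements equal v(j), including v(j) itself
counts = sum(eqMat);
% keep the elements that are equal only to themselves
value = A(counts == 1);
```

The comparison matrix has numel(A)^2 entries, so this is best kept for small matrices.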
Here is a one line alternative:
find(histc(A(:), 0:3)==1) - 1
or more generally:
find(histc(A(:), min(A(:)):max(A(:)))==1) + min(A(:)) - 1
OR to generalize it even further (to handle floats)
p = 0.1; %//Set a precision.
min(A(:)) + (find(histc(A(:), min(A(:)):p:max(A(:)))==1) - 1)*p
The method of counting I generally prefer uses sort and diff as follows,
[x,sortinds] = sort(A(:));
dx = diff(x);
thecount = diff(find([1; dx; 1]));
uniqueinds = [find(dx); numel(x)];
countwhat = x(uniqueinds);
Then you grab the value(s) with only one occurrence:
lonelyValues = countwhat(thecount==1)
If you want the location of these value(s) in the matrix:
valueInds = sortinds(uniqueinds(thecount==1))
[valRows,valCols] = ind2sub(size(A),valueInds)
If you expect any NaN and/or Inf values in your matrix, you have to do additional bookkeeping, but the idea is the same.
Here is another alternative using unique() and hist():
count elements:
[elements,indices,~] = unique(A); % get each value with index once
counts = hist(A(:), elements); % count occurrences of elements within A
get elements:
uniqueElements = elements(counts==1); % find unique elements
get indices:
uniqueIndices = indices(counts==1); % find unique indices
[uRow, uCol] = ind2sub(size(A),uniqueIndices); % get row/column representation
I have an n x 2 matrix in Octave, and I would like to find every row where the matrix(row, 1) and matrix(row, 2) elements are both non-zero. I could use a for loop like this:
[nrows, ncols] = size(data);
for i = 1:nrows
if(data(i, 1) ~= 0 && data(i, 2) ~= 0)
% Do something
end
end
The issue with that is that n is about 3 million, and iteration in Octave takes forever. I feel like there is a way to do it with find, but I haven't been able to figure it out yet.
Anyone have any advice?
Thanks!
You can use logical indexing:
idx = all(data(:,1:2)~=0, 2);
The resulting vector idx contains 1s in every row where both cells are non-zero and 0 otherwise.
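A small worked example of how idx can then be used, either to get the row numbers (as find would) or to pull out the rows directly. The sample data and the names rows and filtered are made up for illustration.

```matlab
data = [1 2; 0 3; 4 0; 5 6; 0 0];    % made-up sample data
% true for rows where both columns are non-zero
idx = all(data(:,1:2) ~= 0, 2);
% row numbers of the matching rows
rows = find(idx);
% the matching rows themselves, without any loop
filtered = data(idx, :);
```

Here rows is [1; 4] and filtered contains the rows [1 2] and [5 6].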
I think in this case (since it is related with zero values) the following also should work
idx=(data(:,1).*data(:,2)~=0)
But @H.Muster's solution is the one that works in all cases, hence a better one.
If performance is the issue: maybe try converting both columns to logical:
useful = and(logical(data(:,1)),logical(data(:,2)))
Then you can again use logical indexing:
filtered = data(useful,:)