Finding means from fields in a structure without for loop - matlab

I have data from an experiment in a structure like this:
data.subject.trial
I need to find the means for scores on trials across all participants (e.g. what is the mean score of all participants on trial x?).
I can get there using a for loop as below but it feels like there should be an easier one-liner to achieve the same thing (values in "trial" are numeric in this instance). Any tips? Many thanks!
for i = 1:length(data.subject)
    for j = 1:length(data.subject(i).trial)
        a(i,j) = data.subject(i).trial(j);
    end
end
trialMeans = mean(a);

I think I've stumbled across an answer to my own question...
A = cell2mat({data.subject.trial}); % Put all scores from all trials into 1 vector
B = reshape(A,[],length(data.subject))'; % Reshape into rows of however many subjects there are
trialMeans = mean(B);
Thanks!
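For what it's worth, here is a minimal sketch of the same idea on made-up data. It assumes every subject stores its trial scores as a row vector of the same length; the field names follow the question, the numbers are invented. Under that assumption vertcat can stack the rows directly, without the cell2mat/reshape step:

```matlab
% Hypothetical example data: 3 subjects, 4 trials each (values are made up)
data.subject(1).trial = [1 2 3 4];
data.subject(2).trial = [3 4 5 6];
data.subject(3).trial = [5 6 7 8];

% Stack each subject's row vector into one subjects-by-trials matrix
scores = vertcat(data.subject.trial);

% Column-wise mean = mean score per trial across all subjects
trialMeans = mean(scores, 1);
```

Passing the dimension explicitly to mean keeps the result per-trial even if there happens to be only one subject.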

Related

MATLAB randsample in function form

The goal is to generate six lotto numbers, which obviously have to be unique. This has to be written as a function, though; the equivalent using the library is:
(randsample(42,6))'
My idea was to create a vector with all the possibilities, pick one number at a time by index, and make it impossible to pick that number again by removing it from the vector before the next one is picked.
function numbers = lottonumbers()
% Draw six unique numbers from 1..42 by sampling without replacement.
pool = 1:42;
numbers = zeros(1,6);
for i = 1:6
    % randi gives a uniform index; round(1+j*rand) under-samples both ends
    randIndex = randi(numel(pool));
    numbers(i) = pool(randIndex);
    % Delete the drawn number so it cannot be picked again
    pool(randIndex) = [];
end
end
Since I'm pretty noob at MATLAB (just noob at programming really) and since I solved it myself while asking the question, I'm just going to leave it here and ask you guys for suggestions (better style, other algorithm...)
Lotto is based on permutations where the order does not play a role.
% p = randperm(n,k) returns a row vector containing k unique integers selected randomly from 1 to n inclusive.
randperm( 42, 6 )
should do the trick.
From the code: "This is sometimes referred to as a K-permutation of 1:N or as sampling without replacement."
Another approach is to use rejection sampling: generate the numbers independently, and if they are not all different start again. This is efficient as long as the chance of numbers not being all different is small.
N = 6;
M = 42;
done = false;
while ~done
result = randi(M,1,N); %// generate N numbers from [1,...,M]
done = all(diff(sort(result))); %// if all are different, we're done
end
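To judge whether rejection sampling is efficient for a given N and M, one can compute the probability that a single attempt succeeds. The following is a small sketch under the assumption that the N draws are independent and uniform on 1..M (as with randi above); the variable names pAllDifferent and expectedTries are just illustrative:

```matlab
N = 6;   % numbers to draw
M = 42;  % size of the pool

% Probability that N independent draws from 1..M are all different:
% (M/M) * ((M-1)/M) * ... * ((M-N+1)/M)
pAllDifferent = prod((M - (0:N-1)) / M);

% Expected number of passes through the while loop (geometric distribution)
expectedTries = 1 / pAllDifferent;

fprintf('P(all different) = %.3f, expected tries = %.2f\n', pAllDifferent, expectedTries);
```

For the lotto case this comes out at roughly 0.69, so on average fewer than two attempts are needed; for N close to M the probability collapses and randperm is the better choice.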

How can I vectorize a large number of subtractions in Matlab

I have one array (the "true" cartesian coordinates) which is of size (natoms*3,1) where natoms is the number of atoms. I also have a large number (500,000) of observations stored in an array of size (nobs, natoms*3). Now, I want to create an array of the differences between all observations against the true coordinates. I would like to simply vectorize this by doing something like
for iat = 1:natoms
xyz_dif = xyz_obs(:, 3*iat-2:3*iat) - xyz_true(3*iat-2:3*iat)
end
but this does not work. Instead I am forced to go through each of the observations like so:
for iat = 1:natoms
    for iobs = 1:nobs
        % xyz_true is a column vector, so transpose the slice to match the row on the left
        xyz_diff(iobs, 3*iat-2:3*iat) = xyz_obs(iobs, 3*iat-2:3*iat) - xyz_true(3*iat-2:3*iat)';
    end
end
but this seems quite inefficient. Is there a faster, more efficient way to do this?
Thanks.
Use bsxfun:
xyz_diff = bsxfun(@minus, xyz_obs, xyz_true');
An alternative solution, which in my view is more readable, is to use matrix multiplication (note the transpose, since xyz_true is a column vector):
xyz_diff = xyz_obs - ones(nobs,1)*xyz_true';
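As a quick sanity check, the sketch below compares the bsxfun approach with explicit replication on a tiny made-up data set. The sizes and values are invented for illustration; only the variable names xyz_true, xyz_obs, natoms and nobs come from the question:

```matlab
% Made-up sizes and data, for illustration only
natoms = 2;
nobs = 4;
xyz_true = (1:natoms*3)';                    % (natoms*3)-by-1 column, as in the question
xyz_obs = repmat(xyz_true', nobs, 1) + 0.5;  % nobs-by-(natoms*3), each entry offset by 0.5

% bsxfun expands the 1-by-(natoms*3) row across all observations
diff_bsxfun = bsxfun(@minus, xyz_obs, xyz_true');

% Explicit replication with repmat gives the same result
diff_repmat = xyz_obs - repmat(xyz_true', nobs, 1);
```

In MATLAB R2016b and later, implicit expansion should make the plain expression xyz_obs - xyz_true' behave the same way, so bsxfun is mainly needed for older releases.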

Matlab: Avoid for-loop by using clever matrix indexing & find? How?

I've been getting into Matlab more and more lately and another question came up during my latest project.
I generate several rectangles (or meshes) within an overall boundary.
These meshes can have varying spacings/intervals.
I do so because I want to decrease the mesh/pixel resolution of certain areas of a digital elevation model. So far, everything works fine.
But because the rectangles can be chosen in a GUI, it might happen that the rectangles overlap. This overlap is what I want to find and remove. If they had the same spacing, the rectangles would look something like this:
[t1x, t1y] = meshgrid(1:1:9,1:1:9);
[t2x, t2y] = meshgrid(7:1:15,7:1:15);
[t3x, t3y] = meshgrid(5:1:17,7:1:24);
In this case, I could just use unique, to find the overlapping areas.
However, they look more like this:
[t1x, t1y] = meshgrid(1:2:9,1:2:9);
[t2x, t2y] = meshgrid(7:3:15,7:3:15);
[t3x, t3y] = meshgrid(5:4:17,7:4:24);
Therefore, unique cannot be applied, because mesh 1 might very well overlap with mesh 2 without having the same nodes. For convenience and further processing, all rectangles / meshes are brought into column notation and put in one result matrix within my code:
result = [[t1x(:), t1y(:)]; [t2x(:), t2y(:)]; [t3x(:), t3y(:)]];
Now I was thinking about using two nested for-loops to solve this problem, something like this (which does not quite work yet):
res = zeros(length(result),1);
for i=1:length(result)
currX = result(i,1);
currY = result(i,2);
for j=1:length(result)
if result(j,1)< currX < result(j+1,1) && result(j,2)< currY < result(j+1,2)
res(j) = 1;
end
end
end
However, this does not quite work yet: I get an out-of-bounds error because result(j+1,:) is accessed when j = length(result), and moreover res(j) = 1 seems to get overwritten by the loop.
But this was just for testing and demonstration anyway.
Because the meshes shown here are just examples and the ones I use are fairly big, the result matrix contains up to 2000x2000 = 4 million nodes, so length(result) is about 4 million.
Putting this into a nested for-loop running over the entire length will most likely kill my memory.
Therefore I was hoping to find a sophisticated solution which does not require a nested loop, but takes advantage of MATLAB's find and clever matrix indexing.
I am not able to think of something, but was hoping to get help here.
Discussion and help are very much appreciated!
Cheers,
Theo
Here follows a quick stab (not extensively tested):
% Example meshes
[t1x, t1y] = meshgrid(1:2:9,1:2:9);
[t2x, t2y] = meshgrid(7:3:15,7:3:15);
% Group points for convenience
A = [t1x(:), t1y(:)];
B = [t2x(:), t2y(:)];
% Find which points of A lie within the edges of B (and vice versa)
idxA = A(:,1) >= B(1,1) & A(:,1) <= B(end,1) & A(:,2) >= B(1,2) & A(:,2) <= B(end,2);
idxB = B(:,1) >= A(1,1) & B(:,1) <= A(end,1) & B(:,2) >= A(1,2) & B(:,2) <= A(end,2);
% Plot result of identified points
plot(A(:,1),A(:,2), '*r')
hold on
plot(B(:,1),B(:,2), '*b')
plot([A(idxA,1); B(idxB,1)], [A(idxA,2); B(idxB,2)], 'sk')
In the resulting plot, the points identified as overlapping are marked with black squares.
Also, related to your question is this Puzzler: overlapping rectangles by Doug Hull of TMW.
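The snippet above relies on the first and last rows of A and B being opposite corners, which holds for meshgrid output. A slightly more defensive sketch, assuming axis-aligned rectangles, takes the bounding boxes with min and max instead and then drops the overlapping points of one mesh. Which mesh keeps its points in the overlap is a policy choice; keeping B here is purely an assumption for illustration:

```matlab
% Example meshes from the question
[t1x, t1y] = meshgrid(1:2:9, 1:2:9);
[t2x, t2y] = meshgrid(7:3:15, 7:3:15);
A = [t1x(:), t1y(:)];
B = [t2x(:), t2y(:)];

% Bounding boxes as [xmin ymin] and [xmax ymax], independent of point order
minA = min(A, [], 1);  maxA = max(A, [], 1);
minB = min(B, [], 1);  maxB = max(B, [], 1);

% Logical masks: points of one mesh inside the other mesh's bounding box
idxA = A(:,1) >= minB(1) & A(:,1) <= maxB(1) & A(:,2) >= minB(2) & A(:,2) <= maxB(2);
idxB = B(:,1) >= minA(1) & B(:,1) <= maxA(1) & B(:,2) >= minA(2) & B(:,2) <= maxA(2);

% Assumed policy: keep B intact and drop the points of A that fall inside B
A_kept = A(~idxA, :);
result = [A_kept; B];
```

Everything is done with logical indexing on column vectors, so the cost grows linearly with the number of nodes per pair of meshes rather than quadratically as in the nested loop.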

Adjacency matrix from edge list (preferably in Matlab)

I have a list of triads (vertex1, vertex2, weight) representing the edges of a weighted directed graph. Since prototype implementation is going on in Matlab, these are imported as a Nx3 matrix, where N is the number of edges. So the naive implementation of this is
id1 = L(:,1);
id2 = L(:,2);
weight = L(:,3);
m = max(max(id1, id2)); % to find the necessary size
V = zeros(m,m);
for i = 1:size(L,1) % loop over the N edges, not over the m vertices
    V(id1(i),id2(i)) = weight(i);
end
The trouble with tribbles is that "id1" and "id2" are nonconsecutive; they're codes. This gives me two problems: (1) huge matrices with far too many "phantom", spurious vertices, which distorts the results of the algorithms to be used with that matrix, and (2) I need to recover the codes in the results of said algorithms (suffice to say this would be trivial if the ID codes were consecutive 1:m).
Answers in Matlab are preferable, but I think I can hack back from answers in other languages (as long as they're not pre-packaged solutions of the kind "R has a library that does this").
I'm new to StackOverflow, and I hope to be contributing meaningfully to the community soon. For the time being, thanks in advance!
Edit: This would be a solution if we didn't have vertices at the origin of multiple edges. (This implies a 1:1 match between the list of edge origins and the list of identities.)
for i=1:n
    for j=1:n
        if id1(i) > 0 && id2(j) > 0
            V(i,j) = weight(i);
        end
    end
end
You can use the function sparse:
sparse(id1,id2,weight,m,m)
If your problem is that the node ID numbers are nonconsecutive, why not re-map them onto consecutive integers? All you need to do is create a dictionary of all unique node ID's and their correspondence to new IDs.
This is really no different to the case where you're asked to work with named nodes (Australia, Britain, Canada, Denmark...) - you would map these onto consecutive integers first.
You can use the GRP2IDX function to convert your ID codes to consecutive numbers; the IDs can be numeric or not, it does not matter. Just keep the mapping information.
[idx1, gname1, gmap1] = grp2idx(id1);
[idx2, gname2, gmap2] = grp2idx(id2);
You can recover the original ids with gmap1(idx1).
If your id1 and id2 are from the same set you can apply grp2idx to their union:
[idx, gname,gmap] = grp2idx([id1; id2]);
idx1 = idx(1:numel(id1));
idx2 = idx(numel(id1)+1:end);
For the reordering see a recent question - how to assign a set of coordinates in Matlab?
You can use ACCUMARRAY or SUB2IND to solve this problem.
V = accumarray([idx1 idx2], weight);
or
V = zeros(max(idx1),max(idx2)); %# or V = zeros(max(idx));
V(sub2ind(size(V),idx1,idx2)) = weight;
Check whether you have non-unique combinations of id1 and id2; if so, you will have to decide how to handle them.
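To illustrate what happens with repeated (id1, id2) pairs, here is a small sketch on an invented edge list. By default accumarray sums the weights of duplicate edges; passing a function handle as the fourth argument selects a different rule. The data and the names V_sum and V_max are made up for this example:

```matlab
% Made-up edge list in which the pair (1,2) appears twice
idx1 = [1; 2; 1];
idx2 = [2; 3; 2];
weight = [0.5; 1.5; 2.0];

% Default accumarray behaviour: weights of duplicate edges are summed
V_sum = accumarray([idx1 idx2], weight, [3 3]);

% A function handle changes the rule, e.g. keep the largest weight
V_max = accumarray([idx1 idx2], weight, [3 3], @max);
```

Which rule is appropriate depends on what a repeated edge means in your data, so this is a choice to make rather than something the code can decide.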
Here is another solution:
First put together all your vertex IDs, since there might be a sink vertex in your graph:
v_id_from = edge_list(:,1);
v_id_to = edge_list(:,2);
v_id_all = [v_id_from; v_id_to];
Then find the unique vertex ids:
v_id_unique = unique(v_id_all);
Now you can use the ismember function to get the mapping between your vertex ids and their consecutive index mappings:
[~,from] = ismember(v_id_from, v_id_unique);
[~,to] = ismember(v_id_to, v_id_unique);
Now you can use sub2ind to populate your adjacency matrix:
adjacency_matrix = zeros(numel(v_id_unique)); % one row/column per distinct vertex, not per edge
linear_ind = sub2ind(size(adjacency_matrix), from, to);
adjacency_matrix(linear_ind) = edge_list(:,3);
You can always go back from the mapped consecutive id to the original vertex id:
original_vertex_id = v_id_unique(mapped_consecutive_id);
Hope this helps.
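The same remapping can also be sketched with the third output of unique, which avoids the two ismember calls. This is only an illustration on a made-up edge list; the variable names are hypothetical and duplicate edges would be summed by accumarray:

```matlab
% Made-up edge list: [from_id, to_id, weight] with nonconsecutive ID codes
edge_list = [10 20 0.5; 20 300 1.5; 10 300 2.0];
n_edges = size(edge_list, 1);

% ids: sorted unique codes; idx: position of every stacked entry within ids
[ids, ~, idx] = unique([edge_list(:,1); edge_list(:,2)]);
from = idx(1:n_edges);
to = idx(n_edges+1:end);

% Square matrix with one row/column per distinct vertex
n = numel(ids);
V = accumarray([from to], edge_list(:,3), [n n]);

% Map a matrix index back to the original code
code_of_third_vertex = ids(3);
```

Because ids is kept, any row or column index in the results of later algorithms can be translated back to the original vertex code.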
Your first solution is close to what you want. However it is probably best to iterate over your edge list instead of the adjacency matrix.
edge_indexes = edge_list(:, 1:2);
n_vertices = max(edge_indexes(:)); % largest vertex index gives the matrix size
adj_matrix = zeros(n_vertices);
for local_edge = edge_list' % transpose in order to iterate by edge
    adj_matrix(local_edge(1), local_edge(2)) = local_edge(3);
end

Storing Results of an Operation in a Matrix

Let's say I want to take the sin of 1 through 100 (in degrees).
I come from a C background so my instinct is to loop 1 through 100 in a for loop (something I can do in Matlab). In a matrix/vector/array I would store sin(x) where x is the counter of the for loop.
I cannot figure out how to do this in Matlab. Do I create a array like
x = [1 .. 100];
And then do
x[offset] = numberHere;
I know the "correct" way. For operations like addition you use .+ instead of + and with a function like sin I'm pretty sure you just do
resultArray = sin(x);
I just want to know that I could do it the C way in case that ever came up, thus my question here on SO. :)
% vectorized
x = sin((1:100)*pi/180);
or
% nonvectorized
x=[];
for i = 1:100
x(i) = sin(i*pi/180);
end
I believe this can actually be done as a one-liner in MATLAB:
x = sind(1:100);
Note that you use sind() instead of sin(), because sin() takes its argument in radians.
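To check that the loop, the radian conversion and sind agree, one could run something like the following sketch. The names deg, loopResult, vecResult, sindResult and maxDiff are only illustrative:

```matlab
deg = 1:100;

% Loop version with preallocation (the "C way")
loopResult = zeros(1, numel(deg));
for k = 1:numel(deg)
    loopResult(k) = sin(deg(k) * pi / 180);
end

% Vectorized versions
vecResult = sin(deg * pi / 180);
sindResult = sind(deg);

% Largest disagreement between the approaches (floating-point rounding only)
maxDiff = max(abs(loopResult - sindResult));
```

Any difference should be at the level of floating-point rounding, so comparing with a tolerance is safer than testing for exact equality.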
As others have already pointed out there are for-loops in MATLAB as well.
help for
should give you everything you need about how it works. The difference from C is that the loop can go over objects and not only an integer:
objects = struct('Name', {'obj1', 'obj2'}, 'Field1', {'Value1','Value2'});
for x = objects
    disp(sprintf('Object %s Field1 = %s', x.Name, x.Field1)) % %s, since Field1 holds text
end
That example will output:
Object obj1 Field1 = Value1
Object obj2 Field1 = Value2
This could have been done as
for i=1:length(objects)
x = objects(i);
    disp(sprintf('Object %s Field1 = %s', x.Name, x.Field1)) % %s, since Field1 holds text
end
And now to what I really wanted to say: if you ever write a for loop in MATLAB, stop and think! For most tasks you can vectorize the code so that it uses matrix operations and built-in functions instead of looping over the data. This usually gives a huge speed gain; it is not uncommon for vectorized code to execute 100x faster than looping code. Recent versions of MATLAB have JIT compilation, which makes the difference less dramatic than before, but still: always vectorize if you can.
@Daniel Fath
I think you'll need the final line to read
resultArray(i) = sin(x(i)) (rather than x(1))
I think you can also do:
for i = x
...
though that will behave differently if x is not a simple 1-100 vector
Hmm, if I understand correctly, you want a loop-like structure:
resultArray = zeros(1,length(x)); % preallocation is optional; MATLAB grows the array on assignment
for i = 1:length(x) % indices start at 1 instead of zero
    resultArray(i) = sin(x(i));
end
Warning I didn't test this but it should be about right.