I'm currently using MATLAB and I am plotting the contents of the rows of a matrix, where each column is an independent data set. As the matrix is large, I don't want to go through the tedious task of writing the plot label for each data set individually. Is there a way to attach a handle/name to each column so that the plot label is applied automatically and adjusts accordingly if columns are added to or removed from the matrix?
Thanks!
Specifics, if they help:
Amplified spontaneous emission (ASE) in an optical fibre amplifier. Rows act as storage for a discretised ASE spectrum, columns are a given position along the fibre amplifier (it is this position -- the distance along the fibre corresponding to the column -- which I want to use as the label) and each element contains power information. The plot gives spectral power of ASE in the fibre for different positions along its length.
If by labels you mean the plot legend, you can do that with a cell array of strings. Consider matrix A
A = repmat([1:3], 3, 1)
A =
1 2 3
1 2 3
1 2 3
You can call plot to plot the columns of the matrix
plot(A);
Here, you will get 3 horizontal lines at y=1, 2 and 3. You can create your legend as follows
l{1} = 'dataset1';
l{2} = 'dataset2';
l{3} = 'dataset3';
Then you type
legend(l)
to show the legend. However, no one will create the legend for you, so you must create the cell array yourself. You can do it automatically, of course, e.g. the above legend can be created by a simple loop
for i = 1:size(A, 2)
    l{i} = ['dataset' num2str(i)];
end
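For the fibre amplifier case in the question, the same idea can build the labels from the position each column corresponds to. This is only a minimal sketch: it assumes the positions are kept in a vector z (a hypothetical name, taken to be in metres) with one entry per column of a power matrix P (also hypothetical).

```matlab
% Hypothetical data: 50 spectral samples (rows) at 4 fibre positions (columns).
z = [0 0.5 1.0 1.5];            % assumed position along the fibre for each column, in metres
P = rand(50, numel(z));         % stand-in for the ASE power matrix

% One label per column, generated from the position vector.
labels = arrayfun(@(zk) sprintf('z = %.1f m', zk), z, 'UniformOutput', false);

plot(P);
legend(labels);
```

If a column is added or removed, only z and P need to change together; the labels are regenerated from z.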
I have a matrix of air-fuel ratio (AFR) values at certain engine speeds and throttle positions (e.g. the AFR is 14 at 2500rpm and 60% throttle).
The matrix is currently 25x10: the engine speed ranges from 1200 to 6000rpm in steps of 200rpm, and the throttle ranges from 0.1 to 1 in steps of 0.1.
Say I have measured new values, e.g. an AFR of 13.5 at 2138rpm and 74.3% throttle. How do I merge that into the matrix? The closest values in the matrix are 2000 or 2200rpm and 70 or 80% throttle. I also don't want new data to replace the older data. How can I make the matrix take this value in and adjust its values to take the new value into account?
Simplified, I have the following x-axis values (top row) and 1x4 matrix (below):
2 4 6 8
14 16 18 20
I just measured an AFR value of 15.5 at 3 rpm. Interpolating the AFR matrix would have given 15, so this value is out of the ordinary.
I want the matrix to take this data and adjust the other values to it, i.e. average everything so that the more data I put in, the more reliable and accurate the matrix becomes. So in the simplified case the matrix would become something like:
2 4 6 8
14.3 16.3 18.2 20.1
So it averages between old and new data. I've read the documentation about concatenation, but I believe my problem can't be solved with that function.
EDIT: To clarify my question, the following visual clarification.
The 'matrix' keeps the same size of 5 points while a new data point is added. It takes the new data into account and adjusts the matrix accordingly. This is what I'm trying to achieve. The more scattered data I get, the more accurate the matrix becomes. (And yes, the green dot in this case would be an outlier, but it explains my case.)
Cheers
This is not a matter of simple merge/average. I don't think there's a quick method to do this unless you have simplifying assumptions. What you want is a statistical inference of the underlying trend. I suggest using Gaussian process regression to solve this problem. There's a great MATLAB toolbox by Rasmussen and Williams called GPML. http://www.gaussianprocess.org/gpml/
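As a rough illustration of the idea, here is a minimal sketch on the simplified 1-D example from the question. Note the substitution: it uses fitrgp from the Statistics and Machine Learning Toolbox rather than GPML, so it does not show the GPML interface, and it assumes default kernel settings are acceptable.

```matlab
% Measurements from the simplified example, plus the new point at x = 3.
x = [2; 3; 4; 6; 8];
y = [14; 15.5; 16; 18; 20];

% fitrgp is used here in place of GPML (an assumption made for this sketch).
mdl = fitrgp(x, y);

% Re-tabulate the inferred trend at the original grid points.
grid_x = (2:2:8)';
grid_y = predict(mdl, grid_x);
disp([grid_x, grid_y]);
```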
This sounds more like a data-fitting task to me. What you are describing is a set of measurements for which you want the best linear fit. Instead of updating the table directly, keep a list of all the measured values and find the best fit to those values; the table can then be regenerated from the fit. So, for example, I could create a matrix, A, which holds all of the recorded values. Let's start with:
A=[2,14;3,15.5;4,16;6,18;8,20];
I now need a matrix of inputs for my fitting curve (which, in this instance, let's assume is linear, so the inputs are the values 1 and x)
B=[ones(size(A,1),1), A(:,1)];
We can find the linear fit parameters (where it cuts the y-axis and the gradient) using:
B\A(:,2)
Or, if you want the points that the line goes through for the values of x:
B*(B\A(:,2))
This results in the points:
2, 14.1897
3, 15.1552
4, 16.1207
6, 18.0517
8, 19.9828
which represents the best fit line through these points.
You can manually extend this to polynomial fitting if you want, or you can use the Matlab function polyfit. To manually extend the process you should use a revised B matrix. You can also produce only a specified set of points in the last line. The complete code would then be:
% Original measurements - could be read in from a file,
% but for this example we will set it to a matrix
% Note that not all tabulated values need to be present
A=[2,14; 3,15.5; 4,16; 5,17; 8,20];
% Now create the polynomial values of x corresponding to
% the data points. Choosing a second order polynomial...
B=[ones(size(A,1),1), A(:,1), A(:,1).^2];
% Find the polynomial coefficients for the best fit curve
coeffs=B\A(:,2);
% Now generate a table of values at specific points
% First define the x-values
tabinds = 2:2:8;
% Then generate the polynomial values of x
tabpolys=[ones(length(tabinds),1), tabinds', (tabinds').^2];
% Finally, multiply by the coefficients found
curve_table = [tabinds', tabpolys*coeffs];
% and display the results
disp(curve_table);
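For comparison, the polyfit route mentioned above could look like the following. This is a sketch on the same data; the only thing to watch is that polyfit returns coefficients from the highest power down, the reverse of the order in coeffs.

```matlab
A = [2,14; 3,15.5; 4,16; 5,17; 8,20];   % same measurements as above

% Second-order fit; polyfit returns coefficients from highest power to lowest.
p = polyfit(A(:,1), A(:,2), 2);

% Evaluate the fitted polynomial at the tabulated x-values.
tabinds = 2:2:8;
curve_table = [tabinds', polyval(p, tabinds)'];
disp(curve_table);
```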
I want to carry out hierarchical clustering in Matlab and plot the clusters on a scatterplot. I have used the evalclusters function to first investigate what a 'good' number of clusters would be using different criteria values eg Silhouette, CalinskiHarabasz. Here is the code I used for the evaluation (x is my data with 200 observations and 10 variables):
E = evalclusters(x,'linkage','CalinskiHarabasz','KList',[1:10])
% store the optimal number of clusters
optk=E.OptimalK;
% save the outputs to a structure
clust_struc(1).Optimalk=optk;
clust_struc(1).method={'CalinskiHarabasz'}
I then used code similar to what I have found online:
gscatter(x(:,1),x(:,2),E.OptimalY,'rbgckmr','xod*s.p')
%OptimalY is a vector 200 long with the cluster numbers
and this is what I get:
My question may be silly, but I don't understand why I am only using the first two columns of data to produce the scatter plot. I realise that the clusters themselves are being incorporated through the use of OptimalY, but should I not be using all of the data in x?
Each row in x is an observation with properties in size(x,2) dimensions. All of these dimensions are used for clustering the rows of x.
However, when plotting the clusters, we cannot plot more than 2-3 dimensions, so we try to represent each element by its key properties. I'm not sure that x(:,1),x(:,2) are the best option, but you have to choose two for a 2-D plot.
Usually you would have some property of interest that you want to plot. Have a look at the example in the MATLAB documentation: the fisheriris data has 4 different variables, the length and width measurements of the sepals and petals of three species of iris flowers. It is up to you to decide which to plot against each other (in the example they chose petal length and petal width).
Here is a comparison between taking Petals measurements and Sepals measurements as the axis for plotting the grouping:
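A comparison of this kind could be produced roughly as follows. This is only a sketch: the choice of Ward linkage and of 3 clusters is an assumption made for illustration, not something taken from the question.

```matlab
load fisheriris                     % provides meas (150x4) and species

% Hierarchical clustering on all four variables, cut into 3 clusters (assumed).
idx = cluster(linkage(meas, 'ward'), 'maxclust', 3);

% Same cluster labels, two different choices of plotting axes.
subplot(1, 2, 1);
gscatter(meas(:,3), meas(:,4), idx);
xlabel('Petal length'); ylabel('Petal width');

subplot(1, 2, 2);
gscatter(meas(:,1), meas(:,2), idx);
xlabel('Sepal length'); ylabel('Sepal width');
```

The clustering uses all four columns in both panels; only the two columns chosen for display differ.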
I have a 2-column matrix, where each row holds observations for healthy (column 1) and not healthy (column 2) patients. I also have 5 partition values which should be used to plot the ROC curve.
Could you please help me to understand how to get the inputs from this data for the perfcurve function?
Thank you for any reply!
I've made a small script that shows the basics of perfcurve given a two-column matrix input. If you run it in MATLAB and read through it carefully, you should have no trouble using perfcurve.
%Simulate your data as Gaussian data with 1000 measurements in each group.
%Let's give them a mean difference of 1 and a standard deviation of 1.
Data = zeros(1000,2);
Data(:,1) = normrnd(0,1,1000,1);
Data(:,2) = normrnd(1,1,1000,1);
%Now the data is reshaped to a vector (required for perfcurve) and I create the labels.
Data = reshape(Data,2000,1);
Labels = zeros(size(Data,1),1);
Labels(end/2+1:end) = 1;
%The bottom half of the data (originally the second column) is now group 1,
%the top half is group 0.
%Let's set the positive class to group 1.
PosClass = 1;
%Now we have all required variables to call perfcurve. We will give
%perfcurve the 'XVals' input to define the values at which the ROC curve is
%calculated. This parameter can be left out to let MATLAB calculate the
%curve at all values.
[X, Y] = perfcurve(Labels,Data,PosClass, 'XVals', 0:0.25:1);
%Let's plot this
plot(X,Y)
%One limitation of scripting it like this is that you must have equal group
%sizes for healthy and sick. If you reshape your Data matrix to a vector
%and keep a separate labels vector then you can also handle groups of
%different sizes.
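The closing remark about unequal group sizes might look like this in practice. It is a sketch with simulated scores; the group sizes and the variable names healthy and sick are made up for illustration.

```matlab
rng(0);                                  % reproducible simulated data
healthy = randn(800, 1);                 % hypothetical scores, group 0
sick    = 1 + randn(1200, 1);            % hypothetical scores, group 1

% Stack the scores and build a matching label vector of the same length.
scores = [healthy; sick];
labels = [zeros(numel(healthy), 1); ones(numel(sick), 1)];

% Positive class is 1 (sick); the fourth output is the area under the curve.
[X, Y, T, AUC] = perfcurve(labels, scores, 1);
plot(X, Y);
xlabel('False positive rate'); ylabel('True positive rate');
```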
I am trying to plot 3 vectors in a MATLAB GUI from a serial object's callback.
I want to plot them on an axes handle, but the problem is that only the last vector is plotted:
plot(handles.axes1,sensor1,'r');
plot(handles.axes1,sensor2,'b');
plot(handles.axes1,sensor3,'g');
I searched the internet and found that this issue can be solved with hold on and hold off, so I tried this
plot(handles.axes1,sensor1,'r');
hold on ;
plot(handles.axes1,sensor2,'b');
plot(handles.axes1,sensor3,'g');
hold off;
but in this case a new figure is opened (I don't know why) and again only the last plot is drawn.
I am stuck. Does anyone have an idea of what the issue could be?
Thanks
I'm not sure why your first try using "hold" didn't work. Seems like it should have.
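One possible cause, offered as an assumption rather than a diagnosis: hold on without an axes argument acts on the current axes, and inside a serial callback the GUI axes may not be the current axes, in which case MATLAB can create a new figure. Passing the axes handle to hold avoids relying on the current axes. A minimal sketch, with ax standing in for handles.axes1 and made-up sensor data:

```matlab
fig = figure('Visible', 'off');      % stand-in for the GUI figure
ax  = axes('Parent', fig);           % stand-in for handles.axes1
sensor1 = rand(1, 20); sensor2 = rand(1, 20); sensor3 = rand(1, 20);

plot(ax, sensor1, 'r');
hold(ax, 'on');                      % hold applied to this axes, not to gca
plot(ax, sensor2, 'b');
plot(ax, sensor3, 'g');
hold(ax, 'off');
```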
But in any case, you can get the desired behavior in a single command:
plot(handles.axes1,1:length(sensor1),sensor1,'r',...
    1:length(sensor2),sensor2,'b',...
    1:length(sensor3),sensor3,'g');
This specifies both an X = 1:length(sensor_) and a Y = sensor_ for each trace. When you give plot only a Y input, it assumes X = 1:length(Y). But you can't combine multiple traces in a single plot command by giving only the Y input for each, because plot will try to treat the inputs as X,Y pairs.
As the vectors are the same length we can simply combine them as the columns of a matrix and then plot the matrix
plot(handles.axes1,[sensor1',sensor2',sensor3'])
However, these will use the default colour order, and without specifying x values it is tricky to set colours within the plot command. Luckily, the default order (in MATLAB releases before R2014b; later releases use a different default) starts:
blue,green,red...
so swapping the column order will plot the lines with the colours requested
plot(handles.axes1,[sensor2',sensor3',sensor1'])
(this assumes the vectors are rows, if they are columns don't transpose them)
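An alternative that does not depend on the default colour order is to keep the line handles that plot returns and set each colour explicitly. A minimal sketch, again with ax standing in for handles.axes1 and made-up row vectors:

```matlab
fig = figure('Visible', 'off');
ax  = axes('Parent', fig);           % stand-in for handles.axes1
sensor1 = rand(1, 20); sensor2 = rand(1, 20); sensor3 = rand(1, 20);

% One line handle is returned per column of the matrix.
h = plot(ax, [sensor1', sensor2', sensor3']);

% Assign a colour to each line explicitly: red, blue, green.
set(h, {'Color'}, {'r'; 'b'; 'g'});
```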
I am creating a program where the user can select multiple files to plot and compare data. The program can properly graph the data, the problem I have encountered is within the legend.
I tried posting an image, however I do not have a high enough reputation. So I will try to explain the graph in detail. Two sets of points are plotted (two matrices of different sizes). The curves are labeled by the user, and in this example they are: "PS, Cs" and "PS, Po."
The program successfully plots the "PS, Cs" curve with the red squares then plots the "PS, Po" with the blue circles however the legend continues to show the red squares for both sets of points. Below is the loop within the code that does the plotting.
fig = small_group_struct;
mystyles = {'bo','rs','go'};
mat_len = size(small_group_struct,2);
for q = 1:mat_len
    plotstyle = mystyles{mod(q,mat_len)+1};
    semilogy(1:size(small_group_struct(1).values),small_group_struct(q).values,plotstyle);
    hold all;
    [~,~,~,current_entries] = legend;
    legend([current_entries {small_group_struct(q).name}]);
end
hold off;
%legend(small_group_struct.values,{small_group_struct.name});
Other threads that I have seen suggested storing the handle returned by the plot command, but since each set of points is an n-by-m matrix of a different size, the program does not like this.
Also, as mentioned at the beginning, the user selects the number of files. While this is typically two, it will not always be the case, which is why I am plotting inside a for loop.
Any suggestions and comments would be greatly appreciated.
EDIT: I now have a high enough reputation to post images, so here is a screenshot of the graph
You can use handles to specify what labels go with what data in the legend.
You say that "each set of points is a nxm matrix of different sizes." Plotting an mxn matrix creates n line objects and returns n handles. You can keep track of all of these handles and assign labels to them when you create the legend.
Here's an example:
% Cell array of data. Each element is a different size.
data = {rand(100, 1), rand(150, 2)};
styles = {'ro', 'gs'};
% Vector to store the handles to the line objects in.
h = [];
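figure
hold on
for t = 1:length(data)
    % plots the data and stores the handle or handles to the line object.
    h = [h; semilogx(data{t}, styles{t})];
end
% Because hold on was set before the first semilogx call, the axes stay
% linear unless the scale is set explicitly.
set(gca, 'XScale', 'log')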
% There are three strings in the legend because a total of three columns of data are
% plotted. One column of data is from the first element of data, two columns of data
% are from the second element of data.
strings = {'data1' ,'data2', 'data3'};
legend(h,strings)
You might want to do something different with the legend but hopefully this will get you started.
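If the goal is one legend entry per file regardless of how many columns each matrix has, one option is to keep only the first handle from each plot call. The sketch below uses a hypothetical struct array groups as a stand-in for small_group_struct, with made-up data and styles.

```matlab
% Hypothetical stand-in for small_group_struct: a name and an n-by-m matrix per file.
groups(1).name = 'PS, Cs';  groups(1).values = rand(100, 1) + 0.1;
groups(2).name = 'PS, Po';  groups(2).values = rand(150, 2) + 0.1;
styles = {'rs', 'bo', 'g^'};

figure;
h = [];
for q = 1:numel(groups)
    style = styles{mod(q - 1, numel(styles)) + 1};   % cycle through the styles
    hq = semilogy(groups(q).values, style);          % one handle per column
    hold on;
    h = [h; hq(1)];                                  % keep one handle per file for the legend
end
hold off;
legend(h, {groups.name});
```

Because the handles are passed to legend explicitly, each label is tied to the marker style of its own file, and the number of files can vary.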