MATLAB plot(x,y) with different data types

I have data (2945 x 3) of mixed types imported as a cell array. The 1st column was imported as text (dates, e.g. 1/1/1990), whereas the 2nd and 3rd columns are numbers.
So far I have used cell2mat to convert the 2nd and 3rd columns to double, so plot(y) works (y being either the 2nd or 3rd column). However, I am wondering how to handle the text data in the 1st column so that I can use plot(x,y).
Any ideas would be appreciated. Cheers.
--------sample.csv-------------
Date LAST Rt
1/27/2018 20 0.234556
1/26/2019 20.05 0.184556
1/23/2040 20.1 0.134556
1/22/1990 20.15 0.084556
1/21/1991 20.2 0.034556
1/20/1993 20.25 -0.015444
1/19/1998 20.3 -0.065444
1/16/2050 20.35 -0.115444
1/15/2030 20.4 -0.165444
--------cell array appearance------------
1 | 2 | 3
1| '1/27/2018' 20 0.234556
2| '1/26/2019' 20.05 0.184556
3| '1/23/2040' 20.1 0.134556
4| '1/22/1990' 20.15 0.084556
5| '1/21/1991' 20.2 0.034556
6| '1/20/1993' 20.25 -0.015444
7| '1/19/1998' 20.3 -0.065444
8| '1/16/2050' 20.35 -0.115444
9| '1/15/2030' 20.4 -0.165444

You could also use datenum to convert the text to a serial date number (copied from Octave command line):
>> test
test =
{
[1,1] = 1/1/2000
[1,2] = 1/2/2001
[1,3] = 10/2/2001
[1,4] = 10/3/2001
[1,5] = 10/3/2005
}
>> x_dates = cellfun('datenum',test(:,1))
x_dates =
730486 730853 731126 731127 732588
>> y = rand(size(x_dates));
>> plot(x_dates,y)
>> datetick('x','dd/mm/yyyy')
Update:
It looks like cellfun requires a function handle in MATLAB, so you probably need to do something like:
x_dates = cellfun(@datenum,test(:,1))
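Applied to the layout in the question, the datenum approach might look like the sketch below. The three-row cell array is a made-up stand-in for the imported data, and the explicit 'mm/dd/yyyy' format is an assumption based on the sample file. Since the sample dates are not in chronological order, the sketch sorts them before plotting.

```matlab
% Hypothetical 3x3 cell array mirroring the layout in the question.
data = {'1/27/2018', 20,    0.234556;
        '1/22/1990', 20.15, 0.084556;
        '1/21/1991', 20.2,  0.034556};

x = datenum(data(:,1), 'mm/dd/yyyy');   % serial date numbers, one per row
y = cell2mat(data(:,2));                % numeric column as double

[x, order] = sort(x);                   % put the dates in chronological order
y = y(order);                           % reorder y to match

plot(x, y);
datetick('x', 'mm/dd/yyyy');            % show the ticks as dates again
```

Passing the whole column to datenum avoids the cellfun call altogether; whether that is faster on 2945 rows has not been measured here.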

You could use XTick and XTickLabel. The former sets where and how many ticks appear on the X axis (you probably want one for each X value, but you may also want to step by 10, for example). The latter sets the labels at those tick positions. If there are fewer labels than ticks, the labels repeat, so be careful with that.
Let me illustrate with an example:
x = [0 1 2 3];
y = [2 0 1 1];
plot (x, y);
yourstrings={'Banana', 'T', 'Potato', '45'};
set(gca,'XTick',x(1):x(end))
set(gca,'XTickLabel',yourstrings)
A second option would be to use text, which lets you put text wherever you like in the plot. Let me illustrate again. Of course this does not try to look "nice", but if you'd like, you could offset the text positions and so on to get a more "beautiful" plot.
x = [0 1 2 3];
y = [2 0 1 1];
plot (x, y);
yourstrings={'Banana', 'T', 'Potato', '45'};
for ii=1:length(yourstrings)
text(x(ii),y(ii),yourstrings{ii})
end
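For the date strings in the question, the XTickLabel option could be sketched as follows. The three dates and values are copied from the sample file; plotting against the row index is an assumption of this sketch, and it means the points are evenly spaced rather than positioned in proportion to elapsed time.

```matlab
% Three rows taken from the sample data in the question.
dates = {'1/22/1990', '1/21/1991', '1/20/1993'};
y = [20.15 20.2 20.25];

x = 1:numel(y);                              % plot against the row index
plot(x, y);
set(gca, 'XTick', x, 'XTickLabel', dates);   % one tick and one label per point
```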

You can create a new matrix with usable data (or overwrite your current one) by converting the strings in matrix(:,1) with the cellfun function.
Or, just to plot:
plot(cellfun(@(x)str2double(x), Measures(:,i)))
where i = 1:length(matrix)

Related

How can I delete table row values in pairs? For example, if either column is less than 0.01, how do I delete the row?

I have two sets of data from different instruments that have common X-variables (XThompsons) but various Y-variables (YCounts) due to various experimental conditions. The data resemble the example below:
[Table1]
XThompsons | YCounts (1) | YCounts (2) | YCounts (3) | .... | ....
------------------------------------------------------------------
[Table2]
XThompsons | YCounts (1) | YCounts (2) | YCounts (3) | .... | ....
------------------------------------------------------------------
When I have two sets of data like this, I use a script that takes a single Y-column from Table1 and does some math against all Y-columns in Table2. However, when comparing two table columns, if either column has a value at or below a specific threshold (0.10), I want to delete that row. In the example below I want to delete rows 4 and 6 because one of the columns contains a value of 0.10 or less.
XThompsons | Table1.YCounts(1) | Table2.YCounts(2)
--------------------------------------------------
1 1.00 0.50
2 0.22 0.12
3 0.29 0.14
4 0.29 0.09 (delete row)
5 0.11 0.49
6 0.02 0.83 (delete row)
How can I carry this out in Matlab? My current code is below; I convert each table row to an array first. How can I make it so that if Y < 0.10 delete the row?
datax = readtable('table1.xls'); % Instrument 1
datay = readtable('table2.xls'); % Instrument 2
SIDATA = [];
for idx=2:width(datay);
% Read the indexed column of datax (instrument 1) then normalize to 1
x = table2array(datax(:,idx));
x = x ./ max(x);
% Read indexed column of datay (instrument 2) and carry out loop
for idy=2:width(datay);
% Normalize y data to 1
y = table2array(datay(:,idy));
y = y ./ max(y);
% Calculate similarity index (SI) at using the datax index for all collision energies for datay
xynum = sum(sqrt(x) .* sqrt(y));
xyden = sqrt(sum(x) .* sum(y));
SIDATA(idy,idx) = (xynum/xyden);
end
end
Help would be appreciated.
Thanks!
Generally, when looping through and pruning values, you want to iterate from the last row of the matrix back to the first; this way, if you delete any rows, you don't skip any. (If you delete row 2 and then advance to row 3, you skip the data that was formerly in row 3.)
To me, the easiest way to do this, if all your data is in one matrix A with columns Y1 and Y2, is
APruned = A((A(:,1) > 0.1) & (A(:,2) > 0.1),:)
This takes the A matrix, finds the rows where Y1 > 0.1, finds the rows where Y2 > 0.1, finds the overlap, and then outputs only the rows in A where both of these are true.
You should read about logical indexing for more on this topic.
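As a minimal sketch of that logical-indexing step, here it is applied to the six rows from the question. The variable name keep is illustrative, and all(..., 2) is used in place of the explicit & so that the same line works for any number of Y columns.

```matlab
% Y values copied from the example table in the question (two Y columns).
A = [1.00 0.50;
     0.22 0.12;
     0.29 0.14;
     0.29 0.09;
     0.11 0.49;
     0.02 0.83];

keep = all(A > 0.10, 2);   % true where every column in the row exceeds 0.10
APruned = A(keep, :);      % rows 4 and 6 are dropped
```

The same logical vector can be used to index the rows of a table, for example T(keep, :), if the data is kept in table form.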
EDIT: It looks like you could also clean up your earlier code using element-wise operations (assuming datax and datay have first been converted to numeric arrays, e.g. with table2array):
A = [datax./max(datax) datay./max(datay)];

Polyfit for two variables

I have some data and want to find the equation (polynomial coefficients) that fits it. For example, the equation for the sample data below is simply a^2*b+10:
a\b 5 10 15
________________________
3| 55 100 145
4| 90 170 250
5| 135 260 385
6| 190 370 550
I checked polyfit, but it only works for one variable.
As Dusty Campbell pointed out you can use the fit function. To do this you have to build a mesh with your data
a = [3 4 5 6];
b = [5 10 15];
[A, B] = meshgrid(a, b);
C = (A.^2).*B + 10;
and then call fit with a custom equation
ft = fittype('p1*a^2*b + p2', 'independent',{'a','b'}, 'dependent','c');
opts = fitoptions('Method','NonlinearLeastSquares', 'StartPoint',[0.5,1]);
[fitresult, gof] = fit([A(:), B(:)], C(:), ft, opts);
As you'll see the solver converges to the correct solution p1 = 1, p2 = 10.
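Because this particular model is linear in the coefficients p1 and p2, ordinary linear least squares is another option and needs no toolbox. This is a different technique from the fit call above, shown only as a sketch; the names M and p are illustrative.

```matlab
a = [3 4 5 6];
b = [5 10 15];
[A, B] = meshgrid(a, b);
C = (A.^2).*B + 10;                          % sample data from the question

% Design matrix: one column per unknown coefficient of p1*a^2*b + p2.
M = [A(:).^2 .* B(:), ones(numel(A), 1)];
p = M \ C(:);                                % least-squares solution [p1; p2]
```

This only works when the unknowns enter the model linearly; for a model that is nonlinear in its coefficients, fit or lsqnonlin remains the appropriate tool.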
polyfitn should help...
Another approach: In the general case of non-linear data fitting you can easily use lsqnonlin.
Looks like you need the fit function from the Curve Fitting Toolbox. Or perhaps polyfitn created and shared by another Matlab user.

vectorizing histogram on matrix rows in matlab

Hello I need to calculate a histogram for every row in a big matrix.
For the first row for example I get this:
AA = hist(symbolic_data(1,:), 1:8);
With symbolic_data(1,:)=[7 6 7 8 7], I get AA=[0 0 0 0 0 1 3 1].
Of course this is easy using a simple for loop, but my symbolic_data matrix is really big.
Is there a way to vectorize this?
I've been fiddling with bsxfun, but I can't make it work.
Any help would be much appreciated.
Thanks for your time.
From Matlab help:
N = hist(Y) bins the elements of Y into 10 equally spaced containers
and returns the number of elements in each container. If Y is a
matrix, hist works down the columns.
so:
AA = hist(symbolic_data', 1:8);
will do what you want, with one column of AA per row of symbolic_data.
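A small sketch of the transposed call, using the row from the question plus a second made-up row. The final transpose is an addition of this sketch, so that row i of AA corresponds to row i of symbolic_data.

```matlab
% First row is from the question; the second row is made up for illustration.
symbolic_data = [7 6 7 8 7;
                 1 1 2 3 8];

% hist works down columns, so transpose the input to bin each original row,
% then transpose the counts back to get one histogram per row of AA.
AA = hist(symbolic_data', 1:8)';
% AA(1,:) is [0 0 0 0 0 1 3 1], matching the result in the question
```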
The answer by @Mercury is the way to go. But if you want to do it with bsxfun:
If you only have integer values, use
bin_centers = 1:8;
AA = squeeze(sum(bsxfun(@eq, permute(symbolic_data,[2 3 1]), bin_centers(:).')));
If the values are not necessarily integer:
bin_centers = 1:8;
AA = squeeze(sum( bsxfun(@le, permute(symbolic_data,[2 3 1]), bin_centers(:).'+.5) &...
bsxfun(@gt, permute(symbolic_data,[2 3 1]), bin_centers(:).'-.5) ));

Ignoring similar columns when concating matrixes vertically

In matlab I have a 128 by n matrix, which we can call
[A B C]
where each letter is a 128-by-1 matrix.
So what I want to do is concat the above matrix with another matrix,
[A~ D E].
Where A~ is similar in its values to A.
What I want to get as the result of the concat would be:
[A B C D E],
where A~ is omitted.
What is the best way to do this? Note that I do not know beforehand that A~ is similar.
To clarify, my problem is how to determine whether two columns are similar. By similar I mean that many of the corresponding row values in the two columns are close.
Maybe an illustration would help as well
Vector A: [1 2 3 4 5 6 7 8 9]'
| | | | | | | | |
Vector B: [20 2.4 4 5 0 7 7 7.6 10]'
where there are some instances where the values are completely different, but for the most part the values are close. I don't have a defined threshold for this, but ideally it would be something that I could experiment with.
If you want to omit only identical columns, this is one way to do it:
%# Define the example matrices.
Matrix1 = [ 1 2 3; 4 5 6; 7 8 9 ]';
Matrix2 = [ 4 5 6; 7 8 10 ]';
%# Concatenate the matrices and keep only unique columns.
OutputMatrix = unique([ Matrix1, Matrix2 ]', 'rows')';
To solve this, a matching algorithm called vl_ubcmatch can be used.
[matches, scores] = vl_ubcmatch(da, db) ; For each descriptor in da,
vl_ubcmatch finds the closest descriptor in db (as measured by the L2
norm of the difference between them). The index of the original match
and the closest descriptor is stored in each column of matches and the
distance between the pair is stored in scores.
source:
http://www.vlfeat.org/overview/sift.html
Thus, the solution is to find the matched columns with the lowest scores (that is, the smallest distances) and eliminate them before concatenating.
I think it's pdist2 you need.
Consider the following example:
>> X = rand(25, 5);
>> Y = rand(100, 5);
>> Y(22, : ) = 0.99*X(22,:);
>> D = pdist2(X,Y, 'euclidean');
>> [~,ind] = min(D(:));
>> [i,j]=ind2sub(size(D),ind)
i =
22
j =
22
which is indeed the entry we manipulated to be similar. Read help pdist2 or doc pdist2 for more background.
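To connect this back to the concatenation question, a sketch might look like the following. pdist2 (Statistics Toolbox) compares rows, so the matrices are transposed to compare columns. The matrices M1 and M2 and the threshold tol are made up for illustration, and the threshold would need tuning. Euclidean distance is also sensitive to a single large outlier, such as the 1 versus 20 pair in the question, so a different metric may suit that case better.

```matlab
% Hypothetical 9-row matrices: M1 holds columns A and B, M2 holds A~ and D.
M1 = [1 2 3 4 5 6 7 8 9;
      9 8 7 6 5 4 3 2 1]';
M2 = [1.1 2 3.2 4 5 6.1 7 8 9;
      5 5 5 5 0 0 0 0 0]';
tol = 1;                              % hypothetical similarity threshold

D = pdist2(M1', M2');                 % D(i,j): distance from column i of M1 to column j of M2
isDuplicate = any(D < tol, 1);        % columns of M2 that are close to some column of M1
result = [M1, M2(:, ~isDuplicate)];   % here: [A B D], with A~ omitted
```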

Generating all combinations containing at least one element of a given set in Matlab

I use combnk to generate a list of combinations. How can I generate a subset of combinations, which always includes particular values. For example, for combnk(1:10, 2) I only need combinations which contain 3 and/or 5. Is there a quick way to do this?
Well, in your specific example, choosing two integers from the set {1, ..., 10} such that one of the chosen integers is 3 or 5 yields 9+9-1 = 17 known combinations, so you can just enumerate them.
In general, finding all of the n-choose-k combinations from integers {1, ..., n} that contain integer m is the same as finding the (n-1)-choose-(k-1) combinations from integers {1, ..., m-1, m+1, ..., n} and then adding m to each of them.
In matlab, that would be
combnk([1:m-1 m+1:n], k-1)
(This code is still valid even if m is 1 or n.)
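A sketch of the full step, including putting m back into each combination. It uses nchoosek from base MATLAB as a stand-in for combnk, which is a substitution made here; the values of n, k and m are just the ones from the question.

```matlab
n = 10; k = 2; m = 3;

rest  = nchoosek([1:m-1 m+1:n], k-1);           % (k-1)-subsets that exclude m
withM = [repmat(m, size(rest,1), 1), rest];     % prepend m to every row
```

For the values above, withM has nchoosek(n-1, k-1) rows, each containing m.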
For a brute force solution, you can generate all your combinations with COMBNK then use the functions ANY and ISMEMBER to find only those combinations that contain one or more of a subset of numbers. Here's how you can do it using your above example:
v = 1:10; %# Set of elements
vSub = [3 5]; %# Required elements (i.e. at least one must appear in the
%# combinations that are generated)
c = combnk(v,2); %# Find pairwise combinations of the numbers 1 through 10
rowIndex = any(ismember(c,vSub),2); %# Get row indices where 3 and/or 5 appear
c = c(rowIndex,:); %# Keep only combinations with 3 and/or 5
EDIT:
For a more elegant solution, it looks like Steve and I had a similar idea. However, I've generalized the solution so that it works for both an arbitrary number of required elements and for repeated elements in v. The function SUBCOMBNK will find all the combinations of k values taken from a set v that include at least one of the values in the set vSub:
function c = subcombnk(v,vSub,k)
%#SUBCOMBNK All combinations of the N elements in V taken K at a time and
%# with one or more of the elements in VSUB as members.
%# Error-checking (minimal):
if ~all(ismember(vSub,v))
error('The values in vSub must also be in v.');
end
%# Initializations:
index = ismember(v,vSub); %# Index of elements in v that are in vSub
vSub = v(index); %# Get elements in v that are in vSub
v = v(~index); %# Get elements in v that are not in vSub
nSubset = numel(vSub); %# Number of elements in vSub
nElements = numel(v); %# Number of elements in v
c = []; %# Initialize combinations to empty
%# Find combinations:
for kSub = max(1,k-nElements):min(k,nSubset)
M1 = combnk(vSub,kSub);
if kSub == k
c = [c; M1];
else
M2 = combnk(v,k-kSub);
c = [c; kron(M1,ones(size(M2,1),1)) repmat(M2,size(M1,1),1)];
end
end
end
You can test this function against the brute force solution above to see that it returns the same output:
cSub = subcombnk(v,vSub,2);
setxor(c,sort(cSub,2),'rows') %# Returns an empty matrix if c and cSub
%# contain exactly the same rows
I further tested this function against the brute force solution using v = 1:15; and vSub = [3 5]; for values of N (the number of elements per combination, i.e. the argument k) ranging from 2 to 15. The combinations created were identical, and SUBCOMBNK was significantly faster in most cases, as shown by the average run times (in msec) displayed below:
N | brute force | SUBCOMBNK
---+-------------+----------
2 | 1.49 | 0.98
3 | 4.91 | 1.17
4 | 17.67 | 4.67
5 | 22.35 | 8.67
6 | 30.71 | 11.71
7 | 36.80 | 14.46
8 | 35.41 | 16.69
9 | 31.85 | 16.71
10 | 25.03 | 12.56
11 | 19.62 | 9.46
12 | 16.14 | 7.30
13 | 14.32 | 4.32
14 | 0.14 | 0.59*
15 | 0.11 | 0.33*
* This could probably be sped up by checking for simplified cases (i.e. all elements in v used).
Just to improve on Steve's answer: in your case (you want all combinations with 3 and/or 5) it will be
all (k-1)-element combinations of the other n-2 values, with 3 added
all (k-1)-element combinations of the other n-2 values, with 5 added
all (k-2)-element combinations of the other n-2 values, with 3 and 5 added
This is easily generalized to any other case of this type.
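For the k = 2 case from the question, the three groups listed above can be sketched as follows. nchoosek is used here as a base-MATLAB stand-in for combnk, and the variable names are illustrative. With k = 2 the third group needs no extra elements, so it is written out directly as the single pair [3 5].

```matlab
v = 1:10; k = 2;
others = setdiff(v, [3 5]);                  % the n-2 remaining elements

r1 = nchoosek(others, k-1);                  % (k-1)-subsets of the remaining elements
with3  = [repmat(3, size(r1,1), 1), r1];     % group 1: 3 added
with5  = [repmat(5, size(r1,1), 1), r1];     % group 2: 5 added
with35 = [3 5];                              % group 3: k-2 = 0 extra elements

c = [with3; with5; with35];                  % 8 + 8 + 1 = 17 combinations
```

The row count agrees with the 9+9-1 = 17 combinations computed in the first answer.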