I have a csv file which contains data like below:[1st row is header]
Element,State,Time
Water,Solid,1
Water,Solid,2
Water,Solid,3
Water,Solid,4
Water,Solid,5
Water,Solid,2
Water,Solid,3
Water,Solid,4
Water,Solid,5
Water,Solid,6
Water,Solid,7
Water,Solid,8
Water,Solid,7
Water,Solid,6
Water,Solid,5
Water,Solid,4
Water,Solid,3
The similar pattern is repeated for State: "Solid" replaced with Liquid and Gas.
And moreover the Element "Water" can be replaced by some other element too.
Time as Integer's are in seconds (to simplify) but can be any real number.
Additionally there might by some comment line starting with # in between the file.
Problem Statement: I want to eliminate the first dip in Time values and smooth out using some quadratic or cubic or polynomial interpolation [please notice the first change from 5->2 --->8. I want to replace these numbers to intermediate values giving a gradual/smooth increase from 5--->8].
And I wish this to be done for all the combinations of Elements and States.
Is this possible through some sort of coding in Matlab etc ?
Any Pointers will be helpful !!
Thanks in advance :)
You can use the interp1 function for 1D-interpolation. The syntax is
yi = interp1(x,y,xi,method)
where x are your original coordinates, y are your original values, xi are the coordinates at which you want the values to be interpolated at and yi are the interpolated values. method can be 'spline' (cubic spline interpolation), 'pchip' (piece-wise Hermite), 'cubic' (cubic polynomial) and others (see the documentation for details).
You have alot of options here, it really depends on the nature of your data, but I would start of with a simple moving average (MA) filter (which replaces each data point with the average of the neighboring data points), and see were that takes me. It's easy to implement, and fine-tuning the MA-span a couple of times on some sample data is usually enough.
http://www.mathworks.se/help/curvefit/smoothing-data.html
I would not try to fit a polynomial to the entire data set unless I really needed to compress it, (but to do so you can use the polyfit function).
Related
I am trying to avoid the for loops and I have been reading through all the old posts there are about it but I am not able to solve my problem. I am new in MATLAB, so apologies for my ignorance.
The thing is that I have a 300x2 cell and in each one I have a 128x128x256 matrix. Each one is an image with 128x128 pixels and 256 channels per pixel. In the first column of the 300x2 cell I have my parallel intensity values and in the second one my perpendicular intensity values.
What I want to do is to take every pixel of every image (for each component) and sum the intensity values channel by channel.
The code I have is the following:
Image_par_channels=zeros(128,128,256);
Image_per_channels=zeros(128,128,256);
Image_tot_channels=zeros(128,128,256);
for a=1:128
for b=1:128
for j=1:256
for i=1:numfiles
Image_par_channels(a,b,j)=Image_par_channels(a,b,j)+Image_cell_par_per{i,1}(a,b,j);
Image_per_channels(a,b,j)=Image_per_channels(a,b,j)+Image_cell_par_per{i,2}(a,b,j);
end
Image_tot_channels(a,b,j)=Image_par_channels(a,b,j)+2*G*Image_per_channels(a,b,j);
end
end
end
I think I could speed it up introducing (:,:,j) instead of specifying a and b. But still a for loop. I am trying to use cellfun without any success due to my lack of expertise. Could you please give me a hand?
I would really appreciate it.
Many thanks and have a nice day!
Y
I believe you could do something like
Image_par_channels=zeros(128,128,256);
Image_per_channels=zeros(128,128,256);
Image_tot_channels=zeros(128,128,256);
for i=1:numfiles
Image_par_channels = Image_par_channels + Image_cell_par_per{i,1};
Image_per_channels = Image_per_channels + Image_cell_par_per{i,2};
end
Image_tot_channels = Image_par_channels + 2*G*Image_per_channels;
I haven't work with matlab in a long time, but I seem to recall you can do something like this. g is a constant.
EDIT:
Removed the +=. Incremental assignment is not an operator available in matlab. You should also note that Image_tot_channels can be build directly in the loop, if you don't need the other two variables later.
I am trying to plot with Matlab. In particular, I try with numerous online source but none of them work.
Here is my problem, I am trying to plot the expression: y=2*(x-1)/(x-4)Kb/L, and I am interested in the range of x between 0 and 1.
K=40;
b=20;
L=0.5;
x=linspace(0,1,1000);
y=2*(x-1)/(x-4)*K*b/L;
but it returns:
y=275.01
I know linspace isn't the proper way to plot. How can I plot this function? I want to keep the K,b,L declaration because I might change them latter.
y=2*(x-1)./(x-4)*K*b/L; you should use ./ replace /
Like hzy199411 said, you should use the "." operation.
I would suggest that you type "help ." at a MATLAB command prompt. MATLAB will respond with a large index of results but look for the section on "Arithmetic Operators".
You may also try the command "doc arith" but I think the "help ." is more helpful because at least in MATLAB 2013 it verbosely lists more "dot" operators.
In short several arithmetic operators prefixed with '.' ("Dot") are "Element-by-Element" operations and as such they operate on each index of the array/matrix.
For example if you had an array s=1:20 and you performed the operation s/s you would get ans = 1, where as if you did s./s you would get an array of 1's with the same length as 's'.
I guess that you are a new matlab user :). The program is in general ok, but you should think of some things. First,
linspace is not a plotting function. The function is useful though. With your syntax it creates a vector of length 1000 with range [0,1]. For plotting, type:
plot(x,y);
Linecolor and style can be set as
plot(x,y,'r-.');
For predefined colors (here 'r-.' means a red dotted line). There are also some additional properties that can be found be checking the online help of plot.
Also as the others say, if you want to operate on each element in the vector, use ./. The / is a matrix operator.
first a little background. I'm a psychology student so my background in coding isn't on par with you guys :-)
My problem is as follow and the most important observation is that curve fitting with 2 different programs gives completly different results for my parameters, altough my graphs stay the same. The main program we have used to fit my longitudinal data is kaleidagraph and this should be seen as kinda the 'golden standard', the program I'm trying to modify is matlab.
I was trying to be smart and wrote some code (a lot at least for me) and the goal of that code was the following:
1. Taking an individual longitudinal datafile
2. curve fitting this data on a non-parametric model using lsqcurvefit
3. obtaining figures and the points where f' and f'' are zero
This all worked well (woohoo :-)) but when I started comparing the function parameters both programs generate there is a huge difference. The kaleidagraph program stays close to it's original starting values. Matlab wanders off and sometimes gets larger by a factor 1000. The graphs stay however more or less the same in both situations and both fit the data well. However it would be lovely if I would know how to make the matlab curve fitting more 'conservative' and more located near it's original starting values.
validFitPersons = true(nbValidPersons,1);
for i=1:nbValidPersons
personalData = data{validPersons(i),3};
personalData = personalData(personalData(:,1)>=minAge,:);
% Fit a specific model for all valid persons
try
opts = optimoptions(#lsqcurvefit, 'Algorithm', 'levenberg-marquardt');
[personalParams,personalRes,personalResidual] = lsqcurvefit(heightModel,initialValues,personalData(:,1),personalData(:,2),[],[],opts);
catch
x=1;
end
Above is a the part of the code i've written to fit the datafiles into a specific model.
Below is an example of a non-parametric model i use with its function parameters.
elseif strcmpi(model,'jpa2')
% y = a.*(1-1/(1+(b_1(t+e))^c_1+(b_2(t+e))^c_2+(b_3(t+e))^c_3))
heightModel = #(params,ages) abs(params(1).*(1-1./(1+(params(2).* (ages+params(8) )).^params(5) +(params(3).* (ages+params(8) )).^params(6) +(params(4) .*(ages+params(8) )).^params(7) )));
modelStrings = {'a','b1','b2','b3','c1','c2','c3','e'};
% Define initial values
if strcmpi('male',gender)
initialValues = [176.76 0.339 0.1199 0.0764 0.42287 2.818 18.52 0.4363];
else
initialValues = [161.92 0.4173 0.1354 0.090 0.540 2.87 14.281 0.3701];
end
I've tried to mimick the curve fitting process in kaleidagraph as good as possible. There I've found they use the levenberg-marquardt algorithm which I've selected. However results still vary and I don't have any more clues about how I can change this.
Some extra adjustments:
The idea for this code was the following:
I'm trying to compare different fitting models (they are designed for this purpose). So what I do is I have 5 models with different parameters and different starting values ( the second part of my code) and next I have the general curve fitting file. Since there are different models it would be interesting if I could put restrictions into how far my starting values could wander off.
Anyone any idea how this could be done?
Anybody willing to help a psychology student?
Cheers
This is a common issue when dealing with non-linear models.
If I were, you, I would try to check if you can remove some parameters from the model in order to simplify it.
If you really want to keep your solution not too far from the initial point, you can use upper bounds and lower bounds for each variable:
x = lsqcurvefit(fun,x0,xdata,ydata,lb,ub)
defines a set of lower and upper bounds on the design variables in x so that the solution is always in the range lb ≤ x ≤ ub.
Cheers
You state:
I'm trying to compare different fitting models (they are designed for
this purpose). So what I do is I have 5 models with different
parameters and different starting values ( the second part of my code)
and next I have the general curve fitting file.
You will presumably compare the statistics from fits with different models, to see whether reductions in the fitting error are unlikely to be due to chance. You may want to rely on that comparison to pick the model that not only fits your data suitably but is also simplest (which is often referred to as the principle of parsimony).
The problem is really with the model you have shown resulting in correlated parameters and therefore overfitting, as mentioned by #David. Again, this should be resolved when you compare different models and find that some do just as well (statistically speaking) even though they involve fewer parameters.
edit
To drive the point home regarding the problem with the choice of model, here are (1) results of a trial fit using simulated data (2) the correlation matrix of the parameters in graphical form:
Note that absolute values of the correlation close to 1 indicate strongly correlated parameters, which is highly undesirable. Note also that the trend in the data is practically linear over a long portion of the dataset, which implies that 2 parameters might suffice over that stretch, so using 8 parameters to describe it seems like overkill.
What is the difference between hist and imhist functions in Matlab? I have a matrix of color levels values loaded from image with imread and need to count entropy value of the image using histogram.
When using imhist the resulting matrix contains zeros in all places except the last one (lower-right) which contains some high value number (few thousands or so).
Because that output seems to be wrong, I have tried to use hist instead of imhist and the resulting values are much better, the matrix is fulfilled with correct-looking values instead of zeros.
However, according to the docs, imhist should be better in this case and hist should give weird results..
Unfortunately I am not good at Matlab, so I can not provide you with better problem description. I can add some other information in the future, though.
So I will try to better explain my problem..I have an image, for which I should count entropy and few other values (how much bytes it will take to save that image,..). I wrote this function and it works pretty well
function [entropy, bytes_image, bytes_coding] = entropy_single_pixels(im)
im = double(im);
histg = hist(im);
histg(histg==0) = [];
nzhist = histg ./ numel(im);
entropy = -sum(nzhist.*log2(nzhist));
bytes_image = (entropy*(numel(im))/8);
bytes_coding = 2*numel(unique(im));
fprintf('ENTROPY_VALUE:%s\n',num2str(entropy));
fprintf('BYTES_IMAGE:%s\n',num2str(bytes_image));
fprintf('BYTES_CODING:%s\n',num2str(bytes_coding));
end
Then I have to count the same, but I have to make "pairs" from pixels which are below each other. So I have only half the rows and the same count of columns. I need to express every unique pixel pair as a different number, so I multiplied the first one by 1000 and added the second one to it... Subsequently I need to actually apply the same function as in the first example, but that is the time, when I am getting weird numbers from the imhist function. When using hist, it seems to be OK, but I really don't think that behavior is correct, so that must be my error somewhere. I actually understand pretty good, to what I want to do, or at least I hope so, but unfortunately Matlab makes all that kind of hard for me :)
hist- compute histogram(count number of occurance of each pixel) in color image.........
imhist- compute histogram in two dimensional image.
Use im2double instead of double if you want to use imhist. The imhist function expects double or single-precision data to be in the [0,1] data range, which is why you see everything in the last bin of the histogram.
I want to plot a piecewise function, but I don't want any gaps to appear
at the junctures, for example:
t=[1:8784];
b=(26.045792 + 13.075558*sin(0.0008531214*t - 2.7773943)).*((heaviside(t-2184))-(heaviside(t-7440)));
plot(b,'r','LineWidth', 1.5);grid on
there should not be any gaps appearing in
the plot between the three intervals , but they do.
I want the graph to be continueous without gaps.
Any suggestions on how to achieve that.
Thanks in advance.
EDIT
Actually, my aim is to find the carrier function colored by yellow in the figure below. I divide the whole interval into 3 intervals: 1-constant 2-sinusoidal 3- constant, then I want to find the overall function from the these three functions
Of course there are "gaps". The composite function is identically zero for all t<2184, and for all t>7440. The relationships can only be non-zero inside of that interval. And you have not chosen a function that is zero at the endpoints, so how can you expect there not to be "gaps"?
What values does your function take on at the endpoints of the interval?
>> t = [2184 7440];
>> (26.045792 + 13.075558*sin(0.0008531214*t - 2.7773943))
ans =
15.689 20.616
So look at the hat function part of this. I'll be lazy and use ezplot.
>> ezplot(#(t) ((heaviside(t-2184))-(heaviside(t-7440))),[0,8784])
Now, combine this, multiplying it by a trig piece, and of course the result is identically zero outside of that domain.
>> ezplot(#(t) (26.045792 + 13.075558*sin(0.0008531214*t - 2.7773943)).*((heaviside(t-2184))-(heaviside(t-7440))),[0,8784])
But if your goal is some sort of continuous function across the two chosen points in the hat function, you need to chose the trig part such that it is zero at those same two points. Mathematics is not spelled mathemagics. Wishing that you get a continuous function will not make it so.
So is your real question how to chose that internal piece (segment) as one such that the final result is continuous? If so, then we need to know why you have chosen the arbitrary constants in there. Surely these numbers, {26.045792, 13.075558, 0.0008531214, 2.7773943} all must have some significance to you. And if they are important, then how can we possibly make the result a continuous function?
Perhaps, and I'm just guessing here, you want some other result out of this, such that the function is not identically zero outside of those bounds. Perhaps you wish to extrapolate as a constant function outside of those points. But to help you, you must help us.