Hi, I'm trying to add a line of best fit and an R2 value to my graph. I'm plotting date/time (x-axis) against an arbitrary value (context agnostic), and when I try to do it through the basic fitting tools in the figure information, the option is greyed out. If I plot another column that isn't time against the same arbitrary value column, it works fine, and I'm not sure why. Attached photos. The date/time column is recognised as datetime by MATLAB (yyyy,dd,mm hh:mm:ss), for example: 2021-01-10 22:30:45. For context, I am trying to analyse a time span of a few hours and run a linear regression on it. The other axis (y-axis) is just values between 0.4 and 0.9, slowly increasing. Any help would be much appreciated.
If it works when plotting against a column that is not datetime, the datetime type may be what is preventing basic fitting. You could consider converting your datetime values to Unix timestamps (for example with posixtime). You can then plot the Unix timestamps against your second column, fit the line, and replace the numeric axis labels with the original datetimes.
Related
I'm using MATLAB to predict a trend with a machine learning approach.
My data file is an .xlsx file containing a timeline in one column (various sampling timestamps, i.e. numbers that represent seconds), and in the other columns I have some integers representing my trend.
My .xlsx file is pretty much like this:
0,0100 | 0
0,0110 | 1
0,0135 | 5
And so on.
I used "|" to distinguish between columns. The sampling time is not regular.
Given 10 values of the trend taken from 10 consecutive timestamps, I'd like to predict the 11th value at a given timestamp. For example, if the 9th value is at 34,010 and the 10th value is at 34,568s I'd like to know the value at 37,431s.
How can I do it?
I've found this link: Time Series Forecasting Using Deep Learning, but there the sampling time is regular.
Should I interpolate my trend values and re-sample them with a constant sampling time?
I would separate the forecasting problem from the sampling-time problem. You are essentially dealing with missing data.
Forecasting problem: You may use any machine learning technique and simply ignore the missing data. If you are not familiar with machine learning, I would suggest using LASSO (least absolute shrinkage and selection operator), which has been shown to have predictive power (see "Sparse Signals in the Cross-Section of Returns" by Alex Chinco, Adam D. Clark-Joseph, and Mao Ye).
Missing-data imputation problem: First consider why the data are missing. Sometimes it makes no sense to impute values, because the fact that a value is missing is itself informative and should not be overwritten. Otherwise you have several options besides linear interpolation for estimating the missing values. For example, check the MATLAB function fillmissing.
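As a rough illustration of the "interpolate and re-sample with a constant sampling time" option raised in the question, here is a minimal sketch in plain Python (not MATLAB). The function name resample_linear and the 0.0005 s step are assumptions made for this example, and the three samples are the ones quoted in the question. It shows linear interpolation only and is not a forecasting method.

```python
from bisect import bisect_right

def resample_linear(times, values, step):
    """Linearly interpolate irregular samples onto a regular grid of spacing `step`.

    `times` must be sorted in increasing order. Hypothetical helper for illustration.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (time, value) pairs")
    grid, resampled = [], []
    # Number of whole steps that fit in the sampled span (small epsilon for float error).
    n_steps = int((times[-1] - times[0]) / step + 1e-9)
    for k in range(n_steps + 1):
        t = times[0] + k * step
        # Index i of the interval [times[i], times[i + 1]] used for interpolation.
        i = max(0, min(bisect_right(times, t) - 1, len(times) - 2))
        frac = (t - times[i]) / (times[i + 1] - times[i])
        grid.append(t)
        resampled.append(values[i] + frac * (values[i + 1] - values[i]))
    return grid, resampled

# The three samples quoted in the question (time in seconds, trend value).
times = [0.0100, 0.0110, 0.0135]
values = [0, 1, 5]
grid, resampled = resample_linear(times, values, step=0.0005)
```

Whether re-sampling like this is appropriate depends on why the sampling is irregular, as noted above; the regular grid is only a convenience for methods that expect a constant sampling time.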
I feel like this should be something simple to solve - but I'm struggling to find the answer anywhere.
I have a set of 'R' values and a set of time values. I want to use curve fitting (I haven't used this part of the software before) to calculate the 'R' values at a different set of time values. In other words, I want to access what is displayed in a figure created by curve fitting, but at a different set of time values. I could point the cursor at the values I want on the figure and write them down, but that is not efficient at all for the number of time values I have. The context is orbital motion: radius vs time.
Thanks in advance :)
You can use Matlab's fit function to do this very easily. Assuming you have your data in arrays r and t, you can do something like this:
f = fit(t(:), r(:), 'smoothingspline');  % fit expects column vectors
disp(f(5))                               % evaluate the fitted curve at t = 5
If you consult the documentation, you can see the various fit types available. (See https://www.mathworks.com/help/curvefit/fit.html)
I'm looking to do a linear regression to determine the estimated date of depletion for a particular resource. I have a dataset containing a column of dates and several columns of data, always decreasing. A linear regression using scikit-learn's LinearRegression() yields a bad fit.
I converted the date column to ordinal, which resulted in values of ~700,000. Relative to the y-axis values between 0 and 200, this is rather large. I imagine that the regression function is starting at low values and working its way up, eventually giving up before it finds a good enough fit. If I could assign starting values to the parameters (a large intercept and a small slope), perhaps it would fix the problem. I don't know how to do this, and I am very curious about other solutions.
Here is a link to some data-
https://pastebin.com/BKpeZGmN
And here is my current code
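import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# dates (ordinal values, 2-D as scikit-learn requires) and y come from code not shown here
model = LinearRegression().fit(dates, y)
print(model.score(dates, y))  # R^2 of the fit
y_pred = model.predict(dates)
plt.scatter(dates, y)
plt.plot(dates, y_pred, color='red')
plt.show()
print(model.intercept_)
print(model.coef_)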
This code plots the linear model over the data, and the fit is strikingly inaccurate. I would share the plot in this post, but I am not sure how to post an image from my desktop.
My original data is dates, and I convert them to ordinal in code I have not shared here. If there is an easier way to do this that would be more accurate, I would appreciate a suggestion.
Thanks,
Will
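One thing that may help narrow this down: LinearRegression solves ordinary least squares directly rather than iterating from starting values, so large ordinals by themselves should not stop it from finding the best-fit line. The sketch below, in plain Python, fits a line to ordinal dates (centering x for numerical stability) and solves for the date at which the line reaches zero. The helper fit_line and the five data points are made up for illustration and are not the pastebin data.

```python
from datetime import date

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept.

    x is centered on its mean before the sums are formed, which keeps the
    arithmetic well conditioned even when x is around 700,000.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Made-up illustrative data: a resource dropping by 2 units per day.
days = [date(2020, 1, d) for d in (1, 2, 3, 4, 5)]
level = [200.0, 198.0, 196.0, 194.0, 192.0]

ordinals = [d.toordinal() for d in days]
slope, intercept = fit_line(ordinals, level)

# The fitted line reaches zero where slope * x + intercept == 0.
depletion = date.fromordinal(round(-intercept / slope))
print(slope, depletion)
```

If a fit like this looks fine on the real data while the scikit-learn fit does not, the problem is more likely in how dates and y are built (ordering, shape, or mixed columns) than in the size of the ordinals.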
I am new to Matlab and signal processing. I am having an issue with defining the frequency range in which the spectrogram is processed. When I am plotting the spectrogram of .wav audio data, the y axis, frequency, spans from zero to around 23 kHz. The useful data I am looking for is in the range of 200-400 Hz. My code snippet is:
[samFa, fs] = audioread('samFa.wav'); %convert audio to numerical data
samFa = samFa(:,1); %take only one channel of numerical output
spectrogram(samFa,2205,1200,12800, fs,'yaxis','MinThreshold',-80);
I don't want to be some noobie that runs into a problem and instantly gives up and posts a duplicate question to stackoverflow, so I have done as much digging as I can, but am at my wit's end.
I scoured the documentation for parameters or ways to have Matlab only analyze a subset or range of the data, but found nothing. Additionally, in all of the examples the frequency range seems to automatically adapt to the data set.
I know it is possible to just calculate the spectrogram for the entire range of frequencies, and then remove all of the unnecessary data through truncating or manually changing the limits in the plot itself, but changing plotting limits does not help with the numerical data.
I went searching through many similar questions, and found an answer all the way from 2012 here: Can I adjust spectogram frequency axes?
where the suggested answer was to import a vector of specific frequencies for the spectrogram to analyze. I tried passing a vector of integer values between 200 and 400, and a few other test ranges, but got the error:
Error using welchparse>welch_options (line 297)
The sampling frequency must be a scalar.
I've tried passing the parameter in at different places in the function, to no avail, and I don't see anything regarding this parameter in the documentation, leading me to believe that this functionality was possibly removed sometime between 2012 and now.
When I plot the spectrogram without providing the sampling frequency, Matlab produces a normalized spectrogram that covers a much smaller data window, which I can visually assess to span roughly 0 to 5 kHz (an artifact of overtones in the audio). So I know that Matlab is not finding any data above this range that would make the frequency range go to 20 kHz.
I've been trying to learn some signal processing for this project, and I believe the Nyquist frequency, which is half the sampling frequency, should be the maximum frequency that a Fourier transform is able to analyze. My recording is sampled at 44,100 Hz, and the spectrogram ranges up to around 22 or 23 kHz, leading me to believe that Matlab is noticing my sampling frequency and assuming that it needs to analyze up to such a high range.
For my work I need to produce thousands of spectrograms that are then fed into much further analysis, so it is very time consuming for Matlab to process so much unnecessary data, and I would expect Matlab to have some functionality to get around this.
Sorry for the very long post, but I wanted to fully explain my problem and show that I have done as much work as I could to solve the problem before turning for help. Thank you very much.
Get the axis handle and set the visual range there:
spectrogram(samFa,2205,1200,12800, fs,'yaxis','MinThreshold',-80);
ax=gca;
ylim(ax, [0.2,0.4]); %kHz
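And if you want to compute only a specific range of frequencies to save time, you are better off using goertzel.
f = 200:10:400;                        % frequencies of interest in Hz
N = length(samFa);                     % number of samples in the signal
freq_indices = round(f/fs*N) + 1;      % nearest DFT bins (1-based)
dft_data = goertzel(samFa,freq_indices);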
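To show why this saves work, here is a minimal sketch of the Goertzel recurrence itself in plain Python (not MATLAB's goertzel). It evaluates a single DFT bin per call, so the cost scales with the number of frequencies requested rather than with the full spectrum. The function name goertzel_power, the 8000 Hz sampling rate, and the 300 Hz test tone are all assumptions chosen for this example.

```python
import math

def goertzel_power(samples, fs, freq):
    """Squared DFT magnitude of `samples` at the bin nearest to `freq` (Hz)."""
    n = len(samples)
    k = round(freq * n / fs)  # nearest DFT bin, 0-based
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        # Second-order recurrence: s[i] = x[i] + coeff * s[i-1] - s[i-2]
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # |X[k]|^2 recovered from the last two states of the recurrence
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

# Hypothetical test signal: a 300 Hz sine, 800 samples at 8000 Hz (10 Hz per bin).
fs = 8000
n = 800
tone = [math.sin(2.0 * math.pi * 300.0 * i / fs) for i in range(n)]

# Evaluate only the 200-400 Hz band in 10 Hz steps, as in the MATLAB snippet.
powers = {f: goertzel_power(tone, fs, f) for f in range(200, 401, 10)}
peak = max(powers, key=powers.get)
print(peak)
```

For a spectrogram-like result, the same function would be applied to successive windowed segments of the signal. The frequency resolution is still set by the segment length, just as with the FFT.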
I'm looking to create a plot of two time series in Matlab with separate axes, because they are of completely different magnitudes. I understand that plotyy is the function to use, but I am having difficulty with the dates. Currently the dates are in date-vector format, so there are three separate columns containing the year, the month and the day. Do I need to convert the dates, and if so, how? Thanks in advance