What is the correct way of interpreting the effect of two interaction terms in lmer?

I am interested in examining the effect of time on the levels of antibodies and how that relationship is influenced by the interaction of gender and age. This is how I have built my model. I have included a random intercept for repeated measures.
model <- lmer(Antibody ~ Time + (Sex*Age)+ (1|PID), data=metadata)
This is the result
Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: Antibody ~ Time + (Sex * Age) + (1 | PID)
Data: metadata
AIC BIC logLik deviance df.resid
441.0 461.3 -213.5 427.0 128
Scaled residuals:
Min 1Q Median 3Q Max
-2.14160 -0.28264 -0.01453 0.26966 2.27572
Random effects:
Groups Name Variance Std.Dev.
PID (Intercept) 1.8039 1.343
Residual 0.3003 0.548
Number of obs: 135, groups: PID, 95
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 6.717567 0.263584 134.327633 25.485 < 2e-16 ***
Time -0.013952 0.002131 54.645901 -6.547 2.13e-08 ***
Sex 0.740442 0.299279 92.341423 2.474 0.0152 *
Age -0.101281 0.227621 92.276736 -0.445 0.6574
Sex.Male:Age 0.655921 0.307147 91.570395 2.136 0.0354 *
How can I interpret this output? Does the significant interaction between males and age mean that older males have higher antibodies for longer? If not, how should I formulate the model to test for the impact of the interaction of age and sex on the duration of the antibody response?
Thanks in advance
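For reference, the Sex:Age term in the model above only shifts the overall antibody level; to test whether the time trend itself (how quickly antibodies decline) depends on sex and age, Time has to be part of the interaction. A minimal sketch, assuming the same metadata and lmerTest setup as above, not the original analysis:

``` r
# Sketch only: let Time interact with Sex and Age so the slope over time can
# differ by group. Time:Sex, Time:Age and Time:Sex:Age are then the terms
# that test whether the antibody trajectory depends on sex and/or age.
model_trend <- lmer(Antibody ~ Time * Sex * Age + (1 | PID), data = metadata)
summary(model_trend)
anova(model, model_trend)  # likelihood-ratio comparison against the original model
```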

Marginal Means accounting for the random effect uncertainty

When we have repeated measurements on an experimental unit, typically these units cannot be considered 'independent' and need to be modeled in a way that gives us valid estimates of our standard errors.
When I compare the intervals obtained by computing the marginal means for the treatment using a mixed model (treating the unit as a random effect) with those obtained by first averaging over the unit and THEN running a simple linear model on the averaged responses, I get exactly the same uncertainty intervals.
How do we incorporate the uncertainty of the measurements of the unit into the uncertainty of what we think our treatments look like?
In order to really propagate all the uncertainty, shouldn't we see what the treatment looks like, averaged over "all possible measurements" on a unit?
``` r
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(emmeans)
library(lme4)
#> Loading required package: Matrix
library(ggplot2)
tmp <- structure(list(treatment = c("A", "A", "A", "A", "A", "A", "A",
"A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "B", "B",
"B", "B", "B", "B"), response = c(151.27333548, 162.3933313,
159.2199999, 159.16666725, 210.82, 204.18666667, 196.97333333,
194.54666667, 154.18666667, 194.99333333, 193.48, 191.71333333,
124.1, 109.32666667, 105.32, 102.22, 110.83333333, 114.66666667,
110.54, 107.82, 105.62000069, 79.79999821, 77.58666557, 75.78666928
), experimental_unit = c("A-1", "A-1", "A-1", "A-1", "A-2", "A-2",
"A-2", "A-2", "A-3", "A-3", "A-3", "A-3", "B-1", "B-1", "B-1",
"B-1", "B-2", "B-2", "B-2", "B-2", "B-3", "B-3", "B-3", "B-3"
)), row.names = c(NA, -24L), class = c("tbl_df", "tbl", "data.frame"
))
### Option 1 - Treat the experimental unit as a random effect since there are
### 4 repeat observations for the same unit
lme4::lmer(response ~ treatment + (1 | experimental_unit), data = tmp) %>%
emmeans::emmeans(., ~ treatment) %>%
as.data.frame()
#> treatment emmean SE df lower.CL upper.CL
#> 1 A 181.0794 10.83359 4 151.00058 211.1583
#> 2 B 101.9683 10.83359 4 71.88947 132.0472
#ggplot(.,aes(treatment, emmean)) +
#geom_pointrange(aes(ymin = lower.CL, ymax = upper.CL))
### Option 2 - instead of treating the unit as random effect, we average over the
### 4 repeat observations, and run a simple linear model
tmp %>%
group_by(experimental_unit) %>%
summarise(mean_response = mean(response)) %>%
mutate(treatment = c(rep("A", 3), rep("B", 3))) %>%
lm(mean_response ~ treatment, data = .) %>%
emmeans::emmeans(., ~ treatment) %>%
as.data.frame()
#> treatment emmean SE df lower.CL upper.CL
#> 1 A 181.0794 10.83359 4 151.00058 211.1583
#> 2 B 101.9683 10.83359 4 71.88947 132.0472
#ggplot(., aes(treatment, emmean)) +
#geom_pointrange(aes(ymin = lower.CL, ymax = upper.CL))
### Whether we include a random effect for the unit, or average over it and THEN model it, we find no difference in the
### marginal means for the treatments
### How do we incorporate the variation of the repeat measurements into the marginal means of the treatments?
### Do we then ignore the variation in the 'subsamples' and simply average over them PRIOR to modeling?
```
Created on 2021-07-31 by the [reprex package](https://reprex.tidyverse.org) (v2.0.0)
emmeans() does take into account the errors of random effects. This is what I get when I remove the complex sequences of pipes:
> mmod = lme4::lmer(response ~ treatment + (1 | experimental_unit), data = tmp)
> emmeans(mmod, "treatment")
treatment emmean SE df lower.CL upper.CL
A 181 10.8 4 151.0 211
B 102 10.8 4 71.9 132
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
This is the same as what you show. If instead I fit a model that treats experimental_unit as a fixed effect, I get:
> fmod = lm(response ~ treatment + experimental_unit, data = tmp)
> emmeans(fmod, "treatment")
NOTE: A nesting structure was detected in the fitted model:
experimental_unit %in% treatment
treatment emmean SE df lower.CL upper.CL
A 181 3.25 18 174.2 188
B 102 3.25 18 95.1 109
Results are averaged over the levels of: experimental_unit
Confidence level used: 0.95
The SEs of the latter results are considerably lower, and that is because the random variations in experimental_unit are modeled as fixed variations.
Apparently the piping you did accounts for the variation of the random effects and includes those in the EMMs. I think that is because you did things separately for each experimental unit and somehow combined those results. I'm not very comfortable with a sequence of pipes that is 7 steps long, and I don't understand why that results in just one set of means.
I recommend against the as.data.frame() at the end. That zaps out annotations that can be helpful in understanding what you have. If you are doing that to get more digits of precision, I'll claim those are digits you don't need; it just exaggerates the precision you are entitled to claim.
Notes on some follow-up comments
Subsequently, I am convinced that what we see in the piped operations in the second part of the OP does indeed consist of computing the mean of each EU, then analyzing those means.
Let's look at that in the context of the formal model. We have (sorry MathJax doesn't work on stackoverflow, but I'll leave the markup there anyway)
$$ Y_{ijk} = \mu + \tau_i + U_{ij} + E_{ijk} $$
where $Y_{ijk}$ is the kth response measurement on the jth EU in the ith treatment, and the rhs terms represent, respectively, the overall mean, the (fixed) treatment effects, the (random) EU effects, and the (random) error effects. We assume the random effects are all mutually independent. With a balanced design, the EMMs are just the marginal means:
$$ \bar Y_{i..} = \mu + \tau_i + \bar U_{i.} + \bar E_{i..} $$
where a '.' subscript means we averaged over that subscript. If there are n EUs per treatment and m measurements on each EU, we get that
$$ Var(\bar Y_{i..}) = \sigma^2_U / n + \sigma^2_E / (mn) $$
Now, if we aggregate the data on EUs ahead of time, we are starting with
$$ \bar Y_{ij.} = \mu + \tau_i + U_{ij} + \bar E_{ij.} $$
However, if we then compute marginal means by averaging over j, we get exactly the same thing as we did before with $\bar Y_{i..}$, and the variance is exactly as already shown. That is why it doesn't matter if we aggregated first or not.
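As a numeric illustration (not part of the original answer), the formula above can be checked against the emmeans output for the mixed model: with n = 3 EUs per treatment and m = 4 measurements per EU, plugging the fitted variance components into $\sigma^2_U/n + \sigma^2_E/(mn)$ should reproduce, up to rounding, the SE of about 10.8 reported earlier. A minimal sketch using the tmp data defined above:

``` r
# Sketch: rebuild the EMM standard error from the fitted variance components.
mmod <- lme4::lmer(response ~ treatment + (1 | experimental_unit), data = tmp)
vc <- as.data.frame(lme4::VarCorr(mmod))
sigma2_U <- vc$vcov[vc$grp == "experimental_unit"]
sigma2_E <- vc$vcov[vc$grp == "Residual"]
n <- 3  # experimental units per treatment
m <- 4  # measurements per unit
sqrt(sigma2_U / n + sigma2_E / (m * n))  # should be close to the reported SE of ~10.8
```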

Analyze weather data stored in csv

I have some weather data stored in a csv file in the form "id, date, temperature, rainfall", with id being the weather station and, obviously, date being the date of measurement. The file contains the data of 3 different stations over a period of 10 years.
What I'd like to do is analyze the data of each station and each year. For example: I'd like to calculate day-to-day differences in temperature [abs(temp(n+1) - temp(n))] for each station and each year.
I thought while-loops could be a possibility, with the loop calculating something as long as the id value is equal to the one in the next row.
But I’ve no idea how to do it.
Best regards
If you still need assistance, I would consider importing the .csv file data using "readtable". So long as only the first row is text, MATLAB will create a 'table' variable (this shouldn't be an issue for a .csv file). The individual columns can be accessed via "tablename.header" and can be re-established as double data type (e.g. variable_1 = tablename.header). You can then concatenate your dataset as you like. As for sorting by date and station id, I would advocate using "sortrows". For example, if the station id is the first column, sortrows(data,1) will sort "data" by the station id. sortrows(data, [1 2]) will sort "data" by the first column, then by the second column. From there, you can write an if statement to compare the station ids and perform the required calculations. I hope my brief answer is somewhat helpful.
A basic code structure would be:
path=['copy and paste file path here']; % show matlab where to look
data=readtable([path '\filename.csv'], 'ReadVariableNames',1); % read the file from csv format to table
variable1=data.header1 % general example of making double type variable from table
variable2=data.header2
variable3=data.header3
double_data=[variable1 variable2 variable3]; % concatenates the three columns together
sorted_data=sortrows(double_data, [1 2]); % sorts double_data by column 1 then column 2
It always helps to have actual data to work on and specifics as to what kind of output format is expected. Basically, ins and outs :) With the little info provided, I figured I would generate random data for you in the first section, and then calculate some stats in the second. I include the loop as an example since that's what you asked, but I highly recommend using vectorized calculations whenever available, such as the one done in summary stats.
%% example for weather stations
% generation of random data to correspond to what your csv file looks like
rng(1); % keeps the random seed for testing purposes
nbDates = 1000; % number of days of data
nbStations = 3; % number of weather stations
measureDates = repmat((now()-(nbDates-1):now())',nbStations,1); % nbDates days of data ending today
stationIds = kron((1:nbStations)',ones(nbDates,1)); % assuming 3 weather stations with IDs [1,2,3]
temp = rand(nbStations*nbDates,1)*70+30; % temperatures are in F and vary between 30 and 100 degrees
rain = max(rand(nbStations*nbDates,1)*40-20,0); % rain fall is 0 approximately half the time, and between 0mm and 20mm the rest of the time
csv = table(measureDates, stationIds, temp, rain);
clear measureDates stationIds temp rain;
% augment the original dataset as needed
years = year(csv.measureDates);
data = [csv,array2table(years)];
sorted = sortrows( data, {'stationIds', 'measureDates'}, {'ascend', 'ascend'} );
% example looping through your data
for i = 1 : size( sorted, 1 )
fprintf( 'Id=%d, year=%d, temp=%g, rain=%g', sorted.stationIds( i ), sorted.years( i ), sorted.temp( i ), sorted.rain( i ) );
if( i > 1 && sorted.stationIds( i )==sorted.stationIds( i-1 ) && sorted.years( i )==sorted.years( i-1 ) )
fprintf( ' => absolute difference with day before: %g', abs( sorted.temp( i ) - sorted.temp( i-1 ) ) );
end
fprintf( '\n' ); % new line
end
% depending on the statistics you wish to do, other more efficient ways of
% accessing summary stats might be accessible, for example:
grpstats( data ...
, {'stationIds','years'} ... % group by categories
, {'mean','min','max','meanci'} ... % statistics we want
, 'dataVars', {'temp','rain'} ... % variables on which to calculate stats
) % doesn't require data to be sorted or any looping
This produces one line printed for each row of data (and only calculates difference in temperature when there is no year or station change). It also produces some summary stats at the end, here's what I get:
stationIds years GroupCount mean_temp min_temp max_temp meanci_temp mean_rain min_rain max_rain meanci_rain
__________ _____ __________ _________ ________ ________ ________________ _________ ________ ________ ________________
1_2016 1 2016 82 63.13 30.008 99.22 58.543 67.717 6.1181 0 19.729 4.6284 7.6078
1_2017 1 2017 365 65.914 30.028 99.813 63.783 68.045 5.0075 0 19.933 4.3441 5.6708
1_2018 1 2018 365 65.322 30.218 99.773 63.275 67.369 4.7039 0 19.884 4.0615 5.3462
1_2019 1 2019 188 63.642 31.16 99.654 60.835 66.449 5.9186 0 19.864 4.9834 6.8538
2_2016 2 2016 82 65.821 31.078 98.144 61.179 70.463 4.7633 0 19.688 3.4369 6.0898
2_2017 2 2017 365 66.002 30.054 99.896 63.902 68.102 4.5902 0 19.902 3.9267 5.2537
2_2018 2 2018 365 66.524 30.072 99.852 64.359 68.69 4.9649 0 19.812 4.2967 5.6331
2_2019 2 2019 188 66.481 30.249 99.889 63.647 69.315 5.2711 0 19.811 4.3234 6.2189
3_2016 3 2016 82 61.996 32.067 98.802 57.831 66.161 4.5445 0 19.898 3.1523 5.9366
3_2017 3 2017 365 63.914 30.176 99.902 61.932 65.896 4.8879 0 19.934 4.246 5.5298
3_2018 3 2018 365 63.653 30.137 99.991 61.595 65.712 5.3728 0 19.909 4.6943 6.0514
3_2019 3 2019 188 64.201 30.078 99.8 61.319 67.082 5.3926 0 19.874 4.4541 6.3312

What VIF value limit (like 4,5,6,7,....) should I select for a linear regression model which has 30 discrete variables and 4 continuous variables?

What should be the VIF value limit (like 4,5,6,7,....) for a linear regression model which has 30 discrete and 4 continuous input variables and 1 continuous output variable?
It's confusing to see that different researchers recommend different VIF limits.
I have tried it in SPSS, creating dummy variables for the discrete variables. Here is the result:
Coefficients
Predictor   B (unstd.)   Std. Error   Beta (std.)   t   Sig.   Tolerance   VIF
(Constant) .076 1.262 .060 .952
absences .014 .012 .020 1.170 .243 .776 1.289
G1 .129 .039 .109 3.326 .001 .214 4.665
G2 .857 .036 .773 23.541 .000 .215 4.645
age .027 .050 .010 .548 .584 .649 1.540
school_new -.170 .135 -.025 -1.265 .206 .588 1.702
sex_new .150 .121 .023 1.239 .216 .680 1.471
address_new -.119 .127 -.017 -.937 .349 .712 1.405
famsize_new .038 .118 .005 .320 .749 .830 1.205
pstatus_new .004 .169 .000 .025 .980 .786 1.272
schoolsup_new .197 .178 .019 1.105 .269 .811 1.234
famsup_new -.070 .110 -.011 -.632 .528 .836 1.197
paid_new .147 .222 .011 .659 .510 .865 1.156
activities_new -.009 .108 -.001 -.087 .931 .830 1.204
nursery_new .070 .132 .009 .531 .596 .879 1.137
higher_new -.124 .189 -.012 -.655 .513 .712 1.404
internet_new -.115 .134 -.015 -.858 .391 .755 1.324
romantic_new .022 .112 .003 .200 .842 .832 1.202
M_prim_edu -.046 .556 -.006 -.083 .934 .046 21.942
M_5th_TO_9th -.114 .560 -.016 -.203 .839 .038 26.474
M_secon_edu -.143 .566 -.018 -.253 .801 .045 22.328
M_higher_edu -.309 .583 -.042 -.529 .597 .036 27.719
F_prim_edu -.454 .518 -.062 -.875 .382 .046 21.795
F_5th_TO_9th -.318 .522 -.046 -.608 .543 .041 24.624
F_secon_edu -.300 .532 -.037 -.563 .574 .053 18.873
F_higher_edu -.269 .547 -.033 -.492 .623 .051 19.613
M_health_job -.195 .253 -.025 -.770 .441 .229 4.373
M_other_job .050 .256 .004 .197 .844 .541 1.849
M_services_job -.273 .225 -.041 -1.211 .226 .199 5.016
M_teacher_job -.013 .226 -.002 -.055 .956 .286 3.496
F_health_job .470 .335 .036 1.400 .162 .355 2.814
F_other_job .003 .362 .000 .008 .993 .539 1.854
F_services_job .151 .269 .023 .563 .574 .136 7.336
F_teacher_job .015 .275 .002 .054 .957 .159 6.293
reason_school_repu .239 .194 .031 1.235 .217 .364 2.746
reason_course_pref .176 .202 .023 .873 .383 .347 2.886
reason_other .364 .175 .056 2.074 .039 .320 3.129
guard_mother -.030 .129 -.004 -.234 .815 .699 1.431
guard_other .311 .259 .023 1.204 .229 .612 1.635
tra_time_15_TO_30min .043 .120 .006 .356 .722 .764 1.309
tra_time_30_TO_60min .274 .206 .023 1.327 .185 .745 1.342
tra_time_GT_60min .791 .351 .038 2.254 .025 .816 1.225
study_2_TO_5hrs_time .171 .129 .026 1.325 .186 .584 1.713
study_5_TO_10hrs_time .151 .177 .017 .853 .394 .605 1.654
study_GT_10hrs_time .073 .253 .005 .290 .772 .743 1.347
failure_1_time -.532 .189 -.051 -2.814 .005 .704 1.421
failure_2_time -.691 .362 -.033 -1.906 .057 .766 1.305
failure_3_time -.428 .375 -.019 -1.140 .255 .813 1.230
family_rela_bad -.002 .381 .000 -.004 .997 .391 2.558
family_rela_avg .012 .322 .001 .038 .970 .177 5.642
family_rela_good .011 .303 .002 .037 .971 .106 9.470
family_rela_excel -.101 .308 -.014 -.329 .743 .127 7.885
freetime_low .105 .236 .012 .447 .655 .315 3.172
freetime_avg -.038 .217 -.006 -.174 .862 .217 4.600
freetime_high -.026 .231 -.004 -.111 .911 .228 4.384
freetime_very_high -.153 .266 -.014 -.572 .567 .363 2.753
go_out_low .095 .223 .012 .424 .672 .280 3.576
go_out_avg .135 .218 .019 .619 .536 .236 4.244
go_out_high .186 .232 .024 .801 .423 .264 3.781
go_out_very_high -.132 .246 -.015 -.537 .591 .284 3.521
Dalc_low -.157 .156 -.019 -1.006 .315 .655 1.527
Dalc_avg .274 .250 .021 1.097 .273 .628 1.592
Dalc_high -.877 .352 -.043 -2.488 .013 .763 1.310
Dalc_very_high .102 .407 .005 .250 .802 .571 1.751
Walc_low .031 .144 .004 .213 .831 .656 1.526
Walc_avg -.148 .164 -.018 -.901 .368 .594 1.683
Walc_high .000 .205 .000 .002 .998 .495 2.020
Walc_very_high -.059 .309 -.005 -.190 .849 .393 2.542
health_low -.065 .205 -.006 -.314 .754 .542 1.845
health_avg -.125 .185 -.015 -.677 .499 .459 2.179
health_high -.088 .190 -.010 -.465 .642 .482 2.075
health_very_high -.234 .169 -.035 -1.381 .168 .357 2.801
a. Dependent Variable: G3
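One thing worth noting about the table above: the very high VIFs on the dummy-coded education and job variables (around 20 to 28) partly reflect dummy coding itself, since dummies built from the same categorical variable are correlated with one another by construction. In R, the car package's vif() reports a generalized VIF (GVIF) per factor rather than per dummy column; a hypothetical sketch (the data frame and column names are assumed, not taken from the question):

``` r
# Hypothetical sketch: GVIF treats each multi-level factor as a single block,
# so dummies from the same factor do not inflate one another's VIF.
library(car)
fit <- lm(G3 ~ absences + G1 + G2 + age + Medu + Fedu + Mjob + Fjob,
          data = student)  # 'student' and these column names are assumed
vif(fit)  # returns GVIF and GVIF^(1/(2*Df)) for factors with more than 1 df
```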

Matlab: Sampling Curve into Frames and Plotting

I have an output curve with the following characteristics:
http://i.imgur.com/hABfsiC.jpg
Following is the data that represents the output curve shown above:
0 1228.15406117455 1213.71796132282 1199.44623423626 1185.33715849069 1171.38902630825 1157.60014358826 1143.96882993237 1130.49341866405 1117.17225684288 1104.00370527364 1090.98613851046 1078.11794485629 1065.39752635781 1052.82329879590 1040.39369167202 1028.10714819050 1015.96212523702 1003.95709335331 992.090536708388 980.360953066379 968.766853751044 957.306763607236 945.979220959370 934.782777567031 923.715998577884 912.777462477973 901.965761039541 891.279499266498 880.717295337626 870.277780547640 859.959599246218 849.761408775075 839.681879403208 829.719694260385 819.873549268974 810.142153074209 800.524226972966 791.018504841132 781.623733059661 772.338670439371 763.162088144576 754.092769615612 745.129510490332 736.271118524637 727.516413512102 718.864227202765 710.313403221133 701.862796983477 693.511275614446 685.257717863085 677.101014018280 669.040065823703 661.073786392290 653.201100120311 645.420942601057 637.732260538220 630.134011658968 622.625164626790 615.204698954125 607.871604914830 600.624883456507 593.463546112736 586.386614915240 579.393122306018 572.482111049470 565.652634144552 558.903754736983 552.234546031533 545.644091204419 539.131483315831 532.695825222615 526.336229491134 520.051818310328 513.841723405001 507.705085949342 501.641056480713 495.648794813713 489.727469954538 483.876260015660 478.094352130827 472.380942370417 466.735235657145 461.156445682143 455.643794821435 450.196514052799 444.813842873049 439.495029215736 434.239329369281 429.046007895553 423.914337548901 418.843599195643 413.833081734033 408.882082014698 403.989904761570 399.155862493301 394.379275445192 389.659471491612 384.995786068946 380.387562099043 375.834149913207 371.334907176693 366.889198813755 362.496396933217 358.155880754588 353.867036534721 349.629257495012 345.441943749158 341.304502231452 337.216346625644 333.176897294352 329.185581209025 325.241831880470 321.345089289941 317.494799820775 313.690416190604 309.931397384121 306.217208586402 302.547321116805 298.921212363411 295.338365718043 291.798270511836 288.300421951372 284.844321055367 281.429474591926 278.055395016351 274.721600409499 271.427614416707 268.172966187255 264.957190314395 261.779826775920 258.640420875285 255.538523183269 252.473689480189 249.445480698649 246.453462866826 243.497207052304 240.576289306430 237.690290609210 234.838796814729 232.021398597103 229.237691396949 226.487275368383 223.769755326528 221.084740695544 218.431845457169 215.810688099765 213.220891567874 210.662083212278 208.133894740554 205.635962168130 203.167925769835 200.729430031931 198.320123604641 195.939659255160 193.587693821135 191.263888164634 188.967907126584 186.699419481673 184.458097893730 182.243618871551 180.055662725203 177.893913522768 175.758059047548 173.647790755713 171.562803734401 169.502796660248 167.467471758370 165.456534761766 163.469694871160 161.506664715267 159.567160311482 157.650901026989 155.757609540290 153.887011803139 152.038837002892 150.212817525260 148.408688917462 146.626189851781 144.865062089509 143.125050445289 141.405902751840 139.707369825069 138.029205429564 136.371166244459 134.733011829684 133.114504592569 131.515409754832 129.935495319916 128.374532040696 126.832293387534 125.308555516696 123.803097239110 122.315699989477 120.846147795722 119.394227248782 117.959727472736 116.542440095264 115.142159218437 113.758681389832 112.391805573974 111.041333124092 109.707067754198 108.388815511474 107.086384748976 105.799586098638 104.528232444585 103.272138896747 102.031122764767 100.805003532210 
99.5936028310572 98.3967444164976 97.2142541419957 96.0459599346504 94.8916917708293 93.7512816520819 92.6245635813268 91.5113735393103 90.4115494613329 89.3249312142428 88.2513605736907 87.1906812016462 86.1427386241704 85.1073802094438 84.0844551460460 83.0738144214844 82.0753108009696 81.0887988064351 80.1141346957970 79.1511764424544 78.1997837150241 77.2598178573099 76.3311418685023 75.4136203836077 74.5071196541023 73.6115075288110 72.7266534350062 71.8524283597264 70.9887048313103 70.1353569011454 69.2922601256279 68.4592915483322 67.6363296823873 66.8232544930576 66.0199473805265 65.2262911628803 64.4421700592893 63.6674696733859 62.9020769768354 62.1458802930985 61.3987692813833 59.4883213755835 57.6483869118112 55.8755259367794 54.1665202384763 52.5183563982524 50.9282102384589 49.3934325474650 47.9115359739955 46.4801829919653 45.0971748454356 43.7604413910364 42.4680317622626 41.2181057864981 40.0089260915283 38.8388508436906 37.7063270647464 36.6098844790632 35.5481298468222 34.5197417427304 33.5234657431651 32.5581099878283 31.6225410848696 30.7156803310698 29.8365002210865 28.9840212219663 28.1573087911403 27.3554706179645 26.5776540705486 25.8230438311588 25.0908597048888 24.3803545875820 23.6908125801659 23.0215472376397 22.3718999419403 21.7412383888150 21.1289551796556 20.5344665100015 19.9572109471129 19.3966482896451 18.8522585030342 18.3235407247356 17.8100123339373 17.3112080808174 16.8266792708190 16.3559929997865 15.8987314361504 15.4544911466551 15.0228824624129 14.6035288823265 14.1960665111621 13.8001435297733 13.4154196951792 13.0415658683801 12.6782635679663 12.3252045477276 11.9820903966137 11.6486321595240 11.3245499775272 11.0095727462157 10.7034377910043 10.4058905582723 10.1166843213306 9.83557990027752 9.56234539487254 9.29675592962605 9.03859341036164 8.78764629156238 8.54370935386321 8.30658349109817 8.07607550635358 7.85199791651758 7.63416876485290 7.42241144115244 7.21655450906823 7.01643154023247 6.82188095481515 6.63274586818697 6.44887394337809 6.27011724904377 6.09633212266684 5.92737903874406 5.76312248171972 5.60343082344452 5.44817620495165 5.29723442235448 5.15048481668219 5.00781016748056 4.86909659001510 4.73423343592343 4.60311319717196 4.47563141318063 4.35168658098654 4.23118006832444 4.11401602950876 4.00010132400749 3.88934543760435 3.78166040605072 3.67696074111369 3.57516335893142 3.47618751059112 3.37995471484913 3.28638869291620 3.19541530523493 3.10696249017925 3.02096020460939 2.93734036621838 2.85603679760903 2.77698517204319 2.70012296080697 2.62538938213866 2.55272535166765 2.48207343431511 2.41337779760910 2.34658416636844 2.28163977871173 2.21849334334935 2.15709499811800 2.09739626971883 2.03935003462156 1.98291048109845 1.92803307235332 1.87467451071194 1.82279270284141 1.77234672596718 1.72329679505757 1.67560423094658 1.62923142936686 1.58414183086548 1.54029989157635 1.49767105482374 1.45622172353228 1.41591923341968 1.37673182694912 1.33862862801892 1.30157961736815 1.26555560867691 1.23052822534142 1.19646987790401 1.16335374211922 1.13115373763752 1.09984450728875 1.06940139694817 1.03980043596818 1.01101831815952 0.983032383306248 0.955820599199133 0.929361544172696 0.903634390131477 0.878618886051614 0.854295341944170 0.830644613267103 0.807648085773110 0.785287660780989 0.763545740858505 0.742405215905110 0.721849449623219 0.701862266367018 0.682427938358193 0.663531173258183 0.645157102086936 0.627291267478379 0.609919612263140 0.593028468369314 0.576604546032312 0.560634923305132 0.545107035860604 
0.530008667077409 0.515327938401923 0.501053299978151 0.487173521538225 0.473677683546191 0.460555168587976 0.447795653000654 0.435389098734315 0.423325745440032 0.411596102777619 0.400190942937019 0.389101293367371 0.378318429707958 0.367833868915394 0.357639362581574 0.347726890437081 0.338088654034864 0.328717070609165 0.319604767104817 0.310744574372160 0.302129521522970 0.293752830442904 0.285607910456123 0.277688353137837 0.269987927270671 0.262500573940842 0.255220401770266 0.248141682280806 0.241258845386994 0.234566475013650 0.228059304834930 0.221732214131435 0.215580223762081 0.209598492247567 0.203782311962319 0.198127105431917 0.192628421733057 0.187281932993220 0.182083430987262 0.177028823828250 0.172114132749923 0.167335488978228 0.162689130689473 0.158171400052687 0.153778740353848 0.149507693199712 0.145354895799043 0.141317078319080 0.137391061315172 0.133573753231541 0.129862147971212 0.126253322533180 0.122744434714959 0.119332720878702 0.116015493779123 0.112790140451515 0.109654120158199 0.106604962391776 0.103640264933625 0.100757691966098 0.0979549722369389 0.0952298972744753 0.0925803196521720 0.0900041513011877 0.0874993618696001 0.0850639771270087 0.0826960774132594 0.0803937961300688 0.0781553182743600 0.0759788790121574 0.0738627622919147 0.0718052994961873 0.0698048681305858 0.0678598905489812 0.0659688327139552 0.0641302029915237 0.0623425509791821 0.0606044663663534 0.0589145778263389 0.0572715519389033 0.0556740921426450 0.0541209377163267 0.0526108627883675 0.0511426753737151
The curve represents the output characteristics of a tracer over a 50-minute time interval (Y-axis: activity in MBq, X-axis: time in minutes).
Now I would like to sample the output curve into 19 frames:
4 frames : Each of 15 seconds time interval
2 frames : Each of 30 seconds time interval
2 frames : Each of 60 seconds time interval
11 frames : Each of 200 seconds time interval
and plot each of the respective frames. Kindly suggest a methodology to approach this problem.
If I understand correctly, the corresponding x-values are times from 0 to 50 minutes; since you have 500 samples, that gives 10 samples per minute, or an interval of 6 seconds between samples.
To get samples at a different rate, you can interpolate your signal using interp1. If the signal you gave above is stored in Y, you can interpolate it to 15 second intervals using:
x = 0:6:3000; % The original sample time, in seconds
xi = 0:15:3000; % The interpolated sample time, in seconds
Yi = interp1(x, Y, xi);
You can interpolate to any specified x-value within the original data, so for the varying sample rate you require you can define:
xi = [0:15:60, 90:30:120, 180:60:240, 440:200:2240];
Plotting of curves in MATLAB is usually done using the plot function. For your interpolated data you can use:
figure;
plot(xi / 60, Yi);
xlabel('Time [min]');
ylabel('Signal [units]');
title('My signal');
Note how the time units have been changed from seconds to minutes for the plot. The rest of the commands given here are useful for plotting as well. figure creates a new figure window for the plot and xlabel, ylabel and title are used to annotate it.

Libsvm Classification MATLAB

I used samples 1~200 as training data and samples 201~220 as testing data.
The format looks like this: 3 classes (class 1, class 2, class 3) and 20 features
2 1:100 2:96 3:88 4:94 5:96 6:94 7:72 8:68 9:69 10:70 11:76 12:70 13:73 14:71 15:74 16:76 17:78 18:81 19:76 20:76
2 1:96 2:100 3:88 4:88 5:90 6:98 7:71 8:66 9:63 10:74 11:75 12:66 13:71 14:68 15:74 16:78 17:78 18:85 19:77 20:76
2 1:88 2:88 3:100 4:96 5:91 6:89 7:70 8:70 9:68 10:74 11:76 12:71 13:73 14:74 15:79 16:77 17:73 18:80 19:78 20:78
2 1:94 2:87 3:96 4:100 5:92 6:88 7:76 8:73 9:71 10:70 11:74 12:67 13:71 14:71 15:76 16:77 17:71 18:80 19:73 20:73
2 1:96 2:90 3:91 4:93 5:100 6:92 7:74 8:67 9:67 10:75 11:75 12:67 13:74 14:73 15:77 16:77 17:75 18:82 19:76 20:74
2 1:93 2:98 3:90 4:88 5:92 6:100 7:73 8:66 9:65 10:73 11:78 12:69 13:73 14:72 15:75 16:74 17:75 18:83 19:79 20:77
3 1:73 2:71 3:73 4:76 5:74 6:73 7:100 8:79 9:79 10:71 11:65 12:58 13:67 14:73 15:74 16:72 17:60 18:63 19:64 20:60
3 1:68 2:66 3:70 4:73 5:68 6:67 7:78 8:100 9:85 10:77 11:57 12:57 13:58 14:62 15:68 16:64 17:59 18:57 19:57 20:59
3 1:69 2:64 3:70 4:72 5:69 6:65 7:78 8:85 9:100 10:70 11:56 12:63 13:62 14:61 15:64 16:69 17:56 18:55 19:55 20:51
3 1:71 2:74 3:74 4:70 5:76 6:73 7:71 8:73 9:71 10:100 11:58 12:58 13:59 14:60 15:58 16:65 17:57 18:57 19:63 20:57
1 1:77 2:75 3:76 4:73 5:75 6:79 7:66 8:56 9:56 10:59 11:100 12:77 13:84 14:79 15:82 16:80 17:82 18:82 19:81 20:82
1 1:70 2:66 3:71 4:67 5:67 6:70 7:63 8:57 9:62 10:58 11:77 12:100 13:84 14:75 15:76 16:78 17:73 18:72 19:87 20:80
1 1:73 2:72 3:73 4:71 5:74 6:74 7:68 8:58 9:61 10:59 11:84 12:84 13:100 14:86 15:88 16:91 17:81 18:81 19:84 20:86
1 1:71 2:69 3:75 4:71 5:73 6:73 7:74 8:61 9:61 10:60 11:79 12:75 13:86 14:100 15:90 16:88 17:74 18:79 19:81 20:82
1 1:74 2:74 3:80 4:76 5:78 6:76 7:73 8:66 9:64 10:59 11:81 12:76 13:88 14:90 15:100 16:93 17:74 18:83 19:81 20:85
1 1:76 2:77 3:77 4:76 5:78 6:75 7:73 8:64 9:68 10:65 11:80 12:78 13:91 14:88 15:93 16:100 17:79 18:79 19:82 20:83
1 1:78 2:78 3:73 4:71 5:75 6:75 7:61 8:58 9:57 10:56 11:82 12:73 13:81 14:74 15:74 16:80 17:100 18:85 19:80 20:85
1 1:81 2:85 3:79 4:80 5:82 6:82 7:63 8:56 9:55 10:57 11:82 12:72 13:81 14:79 15:83 16:79 17:85 18:100 19:83 20:79
1 1:76 2:77 3:78 4:75 5:76 6:79 7:65 8:57 9:57 10:63 11:81 12:87 13:84 14:81 15:81 16:82 17:80 18:83 19:100 20:87
1 1:76 2:76 3:78 4:73 5:75 6:78 7:60 8:59 9:51 10:57 11:82 12:80 13:86 14:82 15:85 16:83 17:85 18:79 19:87 20:100
Then I wrote code to classify them:
% read the data set
[image_label, image_features] = libsvmread(fullfile('D:\...'));
[N D] = size(image_features);
% Determine the train and test index
trainIndex = zeros(N,1);
trainIndex(1:200) = 1;
testIndex = zeros(N,1);
testIndex(201:N) = 1;
trainData = image_features(trainIndex==1,:);
trainLabel = image_label(trainIndex==1,:);
testData = image_features(testIndex==1,:);
testLabel = image_label(testIndex==1,:);
% Train the SVM
model = svmtrain(trainLabel, trainData, '-c 1 -g 0.05 -b 1');
% Use the SVM model to classify the data
[predict_label, accuracy, prob_values] = svmpredict(testLabel, testData, model, '-b 1');
But the final results for predict_label are all class 1, so the accuracy is 50%, which means it cannot get the correct predicted labels for classes 2 and 3.
Is there something wrong with the format of the data, or with the code that I implemented?
Please help me, thanks very much.
To elaborate a bit more, there are at least three problems here:
1. You only check one value each of the parameters C (-c) and gamma (-g). The behaviour of an SVM depends heavily on a good choice of these parameters, so the common approach is a grid search, using cross-validation to select the best combination.
2. Data scaling also plays an important role: if some dimensions are much larger than the rest, they will bias the whole classifier. There are at least two basic ways to deal with this: (1) linearly scale each dimension to some interval (like [0,1] or [-1,1]), or (2) normalize (whiten) the data by transforming it through Sigma^(-1/2), where Sigma is the data covariance matrix.
3. Label imbalance: an SVM works best when you have roughly the same number of points in each class. When that is not the case, you should use a class weighting scheme to get valid results.
After fixing these three issues you should get reasonable results.
My guess is that you'd want to tune your parameters.
Make a loop over your -c and -g values (typically logarithmic, e.g. -c over 10^(-3:5)) and pick the combination that performs best.
That said, it is advisable to normalize your data first, e.g. scale it so that all values lie between 0 and 1.
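As an illustration of that grid-search pattern, here is a sketch using R's e1071 package, which wraps the same libsvm library (the -c and -g flags correspond to the cost and gamma arguments); trainData and trainLabel are assumed to hold the same training set as in the question:

``` r
# Sketch of a cross-validated grid search over cost (C) and gamma, with the
# features scaled, using the e1071 wrapper around libsvm.
library(e1071)
tuned <- tune.svm(x = trainData, y = factor(trainLabel),
                  cost = 10^(-3:5), gamma = 10^(-5:1),
                  scale = TRUE, tunecontrol = tune.control(cross = 5))
tuned$best.parameters  # best C/gamma combination; refit and evaluate on the test set
```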