Cross variogram negative : flip upside down - kriging

Using the cokriging method, I first computed a cross variogram. However, my two variables are negatively related, so the cross variogram comes out upside down. I was wondering whether it is possible to flip it so that it looks like the other two (direct) variograms.
CEC: cation exchange capacity
IB.samples: soil sealing index
chart.Correlation(Newdata[,.(A,pH_KCL,CEC,IB.samples)])
Chart correlation
g <- gstat(id = "IB.samples", formula = IB.samples ~ 1,
           data = na.omit(Newdata), locations = ~x + y)
g <- gstat(g, id = "CEC.res", formula = CEC.res ~ 1,
           data = na.omit(Newdata), locations = ~x + y)
v.cross <- variogram(g, cutoff = 170, width = 15)
plot(v.cross, pch = 16, col = 'black')
lmc.model <- vgm(psill = 0, model = "Sph", range = 100, nugget = 0)
LMC <- fit.lmc(v.cross, g, fit.method = 7,
               model = lmc.model, correct.diagonal = 0.95)
plot(v.cross, LMC, pch = 10, col = 'black')
Cross variogram
Possible solution, though it seems a little crude:
Perhaps I could change the sign of one of my variables to get a positive correlation between them, but I did not succeed in doing so.
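To illustrate the sign-change idea, here is a minimal base-R sketch on simulated data (not the poster's Newdata). The helper `cross_gamma` is hypothetical and only computes a single pooled cross-semivariance value; it is not part of gstat. It shows that negating one variable mirrors the cross-semivariance around zero. With gstat, the analogous step would presumably be to add a negated column (for example a hypothetical `CEC.neg <- -CEC.res`) before building the gstat object, and to negate the predictions of that variable afterwards.

```r
# Simulated stand-in data: two negatively related variables at random locations
set.seed(1)
n  <- 200
x  <- runif(n, 0, 100)
y  <- runif(n, 0, 100)
z1 <- rnorm(n)
z2 <- -0.8 * z1 + rnorm(n, sd = 0.3)

# Hypothetical helper: pooled empirical cross-semivariance over all pairs
# closer than `cutoff`, i.e. 0.5 * mean((a_i - a_j) * (b_i - b_j))
cross_gamma <- function(a, b, x, y, cutoff) {
  d  <- as.matrix(dist(cbind(x, y)))   # pairwise distances
  ok <- upper.tri(d) & d <= cutoff     # each pair once, within cutoff
  da <- outer(a, a, "-")
  db <- outer(b, b, "-")
  0.5 * mean(da[ok] * db[ok])
}

g_orig <- cross_gamma(z1,  z2, x, y, cutoff = 50)  # negative relationship
g_flip <- cross_gamma(z1, -z2, x, y, cutoff = 50)  # one variable negated

c(original = g_orig, flipped = g_flip)
```

Note that a negative cross variogram is not in itself a problem for a linear model of coregionalization, so the flip is mainly a matter of presentation.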

gtsummary::tbl_regression() - Obtain Random Effects from GLMM Zero-Inflated Model

When trying to create a table with the conditional random effects in r using the gtsummary function tbl_regression from a glmmTMB mixed effects negative-binomial zero-inflated model, I get duplicate random effects rows.
Example (using Mollie Brooks' Zero-Inflated GLMMs on Salamanders Dataset):
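library(glmmTMB)    # provides the Salamanders dataset, so load it first
library(gtsummary)  # tbl_regression(), style_ratio(), style_sigfig()
data(Salamanders)
head(Salamanders)
zinbm2 = glmmTMB(count~spp + mined +(1|site), zi=~spp + mined + (1|site), Salamanders, family=nbinom2)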
zinbm2_table_cond <- tbl_regression(
zinbm2,
tidy_fun = function(...) broom.mixed::tidy(..., component = "cond"),
exponentiate = TRUE,
estimate_fun = purrr::partial(style_ratio, digits = 3),
pvalue_fun = purrr::partial(style_sigfig, digits = 3))
zinbm2_table_cond
Output:
Random Effects Output (cond)
When extracting the random effects from the zero-inflated part of the model, I get the same problem.
Example:
zinbm2_table_zi <- tbl_regression(
zinbm2,
tidy_fun = function(...) broom.mixed::tidy(..., component = "zi"),
exponentiate = TRUE,
estimate_fun = purrr::partial(style_ratio, digits = 3),
pvalue_fun = purrr::partial(style_sigfig, digits = 3))
zinbm2_table_zi
Output:
Random Effects Output (zi)
The problem persists if I specify the effects argument in broom.mixed.
tidy_fun = function(...) broom.mixed::tidy(..., effects = "ran_pars", component = "cond"),
Looking at confidence intervals in both outputs it seems that somehow it is extracting random effects from both parts of the model and changing the estimate of the zero-inflated random effects (in 1st image; opposite in the 2nd image) to match the conditional part estimate while keeping the CI.
I am not knowledgeable enough to understand why this is happening. Since both rows have the same label I am having difficulty removing the wrong one.
Any tips on how to avoid this problem or a workaround to remove the undesired rows?
If you need more info, let me know.
Thank you in advance.
PS: Output images were changed to link due to insufficient reputation.

Issues with ordiplot3d NMDS in the vegan3d package

I am looking for some help here with this 3d NMDS code. I have 3 issues.
The layout of the plot moves significantly each time I execute the code.
The sites and species are sometimes far off of the plot.
The species text is often overlapping. How can I fix this?
I am unsure how to change the plotting environment to ggplot, so that might be out of the question.
library(vegan)
library(vegan3d)
library(tidyverse)
data("dune")
SiteID <- 1:20
NMDS = metaMDS(dune,distance="bray", try=500, wascores = TRUE, k=3)
NMDS1 = NMDS$points[,1]
NMDS2 = NMDS$points[,2]
NMDS3 = NMDS$points[,3]
NMDS = data.frame(NMDS1 = NMDS1, NMDS2 = NMDS2, NMDS3 = NMDS3, SiteID=SiteID)
NMDS_input <- metaMDS(dune,distance="bray",try=500,k=3,wascores = T)
pl4 <- with(NMDS, ordiplot3d(NMDS_input, pch=16, angle=50, main="Fish ion level 3", cex.lab=1.7,cex.symbols=1.5, tick.marks=FALSE))
sp <- scores(NMDS_input, choices=1:3, display="species", scaling="symmetric")
si <- scores(NMDS_input, choices=1:3, display="sites", scaling="symmetric")
text(pl4$xyz.convert(sp), rownames(sp), cex=0.7, xpd=TRUE)
sii <- as.data.frame(cbind(NMDS$SiteID,si))
with(NMDS, orditorp(pl4, labels = sii$V1, air=1, cex = 1))
labels must be character variables in orditorp. We always assumed so, but this was not checked in vegan::orditorp. Latest vegan version in github will take care of this and will also work with numeric labels.
ordiplot3d returns projected coordinates (in 2D) and if you want to plot those, you can just use the pl4 object that you saved and you do not need to use pl4$xyz.convert. This object will also be accepted in orditorp.
If you want to plot points that were not used in the original mock-3D plot, you must use pl4$xyz.convert for their 2D projection. This function returns the projected coordinates in a form that is directly accepted by the standard R functions text and points (and some others), but they will not be accepted by orditorp (and I won't change this). You must turn them into a two-column matrix-like object; data.frame() will work.
Your example code contains a lot of un-needed code. The following is an edit with only necessary lines and fixes that make this example work with current vegan release.
library(vegan)
library(vegan3d)
data(dune)
SiteID <- as.character(1:20) # must be character
NMDS_input <- metaMDS(dune,distance="bray",try=500,k=3,wascores = T)
pl4 <- ordiplot3d(NMDS_input, pch=16, angle=50, main="Fish ion level 3", cex.lab=1.7,cex.symbols=1.5, tick.marks=FALSE) # no with(NMDS,...)
sp <- scores(NMDS_input, choices=1:3, display="species") # no arg scaling in scores.metaMDS
text(pl4$xyz.convert(sp), rownames(sp), cex=0.7, xpd=TRUE)
orditorp(pl4, labels = SiteID, air=1, cex = 1) # character labels w/points in the same location

How to overcome indefinite matrix error (NbClust)?

I'm getting the following error when calling NbClust():
Error in NbClust(data = ds[, sapply(ds, is.numeric)], diss = NULL, distance = "euclidean", : The TSS matrix is indefinite. There must be too many missing values. The index cannot be calculated.
I've called ds <- ds[complete.cases(ds),] just before running NbClust so there's no missing values.
Any idea what's behind this error?
Thanks
I had the same issue in my research.
So I e-mailed Nadia Ghazzali, the package maintainer, and got an answer.
I'll attach my e-mail and her reply.
my e-mail:
Dear Nadia Ghazzali,
Hello Nadia. I have some questions about the NbClust function in your R library. I have tried googling but could not find satisfying answers. First, I'm very grateful to you for making this awesome R library. It is very helpful for my research. I tested the NbClust function with my own data as below.
> clust <- NbClust(data, distance = "euclidean", min.nc = 2, max.nc = 10, method = 'kmeans', index = "all")
But soon an error occurred:
Error: division by zero! Error in Indices.WBT(x = jeu, cl = cl1, P = TT, s = ss, vv = vv) : object 'scott' not found
So I ran the NbClust function line by line and found that some indices (CCC, Scott, marriot, tracecovw, tracew, friedman, and rubin) were not calculated because the object vv = 0. I'm not very familiar with algebra, so I don't know the meaning of an eigenvalue. But it seems to me that the object ss (which is the square root of eigenValues) should not be 0 after the product is taken.
So, here are my questions.
I assume that my data are so sparse (a lot of zero values) that sqrt(eigenValues) becomes too small, is that right? I'm sorry I can't attach my data, but I can attach part of eigenValues and the square roots of eigenValues.
> head(eigenValues)
[1] 0.039769880 0.017179826 0.007011972 0.005698736 0.005164871 0.004567238
> head(sqrt(eigenValues))
[1] 0.19942387 0.13107184 0.08373752 0.07548997 0.07186704 0.06758134
And if my assumption is right, what can I do about this problem? Is dropping the 7 indices the only way?
Thank you for reading, and I'll be waiting for your reply. Best regards!
and her reply:
Dear Hansol,
Thank you for your interest. Yes, your understanding is good.
Unfortunately, the seven indices could not be applied.
Best regards,
Nadia Ghazzali
@seni The cause of this error is data related. If you look at the source code of this function,
NbClust <- function(data, diss="NULL", distance = "euclidean", min.nc=2, max.nc=15, method = "ward", index = "all", alphaBeale = 0.1)
{
x<-0
min_nc <- min.nc
max_nc <- max.nc
jeu1 <- as.matrix(data)
numberObsBefore <- dim(jeu1)[1]
jeu <- na.omit(jeu1) # returns the object with incomplete cases removed
nn <- numberObsAfter <- dim(jeu)[1]
pp <- dim(jeu)[2]
TT <- t(jeu)%*%jeu
sizeEigenTT <- length(eigen(TT)$value)
eigenValues <- eigen(TT/(nn-1))$value
for (i in 1:sizeEigenTT)
{
if (eigenValues[i] < 0) {
print(paste("There are only", numberObsAfter,"nonmissing observations out of a possible", numberObsBefore ,"observations."))
stop("The TSS matrix is indefinite. There must be too many missing values. The index cannot be calculated.")
}
}
And I think the root cause of this error is the negative eigenvalues that seep in when the number of clusters is very high, i.e. when max.nc is high. So to solve the problem, you must look at your data. Check whether it has more columns than rows. Remove missing values, and check for issues such as collinearity and multicollinearity, variance, covariance, etc.
For the other error, invalid clustering method, look at the source code of the method here. Look at line number 168, 169 in the given link. You are getting this error message because the clustering method is empty.
if (is.na(method))
stop("invalid clustering method")
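As a rough diagnostic for the data checks suggested above, the following base-R sketch reproduces the same cross-product and eigenvalue computation as the excerpt from NbClust on a small made-up matrix (`X` is a hypothetical stand-in for `ds[, sapply(ds, is.numeric)]`). In the excerpt, the check that raises the error only looks at the eigenvalues of `t(jeu) %*% jeu / (nn - 1)`, so a rank-deficient data matrix (collinear columns, or more columns than rows) is one plausible way to end up with eigenvalues at or marginally below zero.

```r
# Hypothetical numeric matrix: the third column is an exact linear
# combination of the first two, so the columns are collinear.
set.seed(42)
X <- cbind(a = rnorm(10), b = rnorm(10))
X <- cbind(X, c = X[, "a"] + X[, "b"])

nn <- nrow(X)
TT <- t(X) %*% X                                   # cross-product as in the excerpt
ev <- eigen(TT / (nn - 1), symmetric = TRUE)$values

rank_X         <- qr(X)$rank                       # numerical rank of the data
rank_deficient <- rank_X < ncol(X)                 # fewer independent columns than columns

min(ev)          # at or near zero; rounding can push such values below zero
rank_deficient
```

If `rank_deficient` is TRUE for the real data, dropping or combining the redundant columns before calling NbClust() may be worth trying.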

P-adjustment (FDR) on Hierarchical Clustering On Principal Components (HCPC) in R

I'm working right now with "Hierarchical Clustering On Principal Components (HCPC)". At the end of the analysis, p-values are computed by the HCPC function.
I searched but couldn't find any function that adjusts these p-values for FDR together with HCPC. It's really important to avoid any junk data in my multivariate set. So my question is: how can I apply a p-value adjustment to the HCPC results?
This is what I'm doing right now:
#install.packages(c("FactoMineR", "factoextra", "missMDA"))
library(ggplot2)
library(factoextra)
library(FactoMineR)
library(missMDA)
library(data.table)
MyData <- fread('https://drive.google.com/open?id=1y1YbIXtUssEBqmMSEbiQGcoV5j2Bz31k')
row.names(MyData) <- MyData$ID
MyData [1] <- NULL
Mydata_frame <- data.frame(MyData)
# Compute PCA with ncp = 2 (vary based on the number of clusters)
Mydata_frame.pca <- PCA(Mydata_frame, ncp = 2, graph = FALSE)
# Compute hierarchical clustering on principal components
Mydata.hcpc <- HCPC(Mydata_frame.pca, graph = FALSE)
Mydata.hcpc$desc.var$quanti
                             v.test Mean in category  Overall mean sd in category Overall sd      p.value
CD8RAnegDRpos             12.965378     -0.059993483 -0.3760962775     0.46726224 0.53192037 1.922798e-38
TregRAnegDRpos            12.892725      0.489753272  0.1381306362     0.46877083 0.59502553 4.946490e-38
mTregCCR6pos197neg195neg  12.829277      1.107851623  0.6495813704     0.48972987 0.77933283 1.124088e-37
CD8posCCR6neg183neg194neg 12.667318      1.741757598  1.1735140264     0.45260338 0.97870842 8.972977e-37
mTregCCR6neg197neg195neg  12.109074      1.044905184  0.6408258230     0.51417779 0.72804665 9.455537e-34
CD8CD8posCD4neg           11.306215      0.724115486  0.4320918842     0.49823677 0.56351333 1.222504e-29
CD8posCCR6pos183pos194neg 11.226390     -0.239967805 -0.4982954123     0.49454619 0.50203520 3.025904e-29
TconvRAnegDRpos           11.011114     -0.296585038 -0.5279707475     0.44863446 0.45846770 3.378002e-28
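One possible approach is to adjust the p-values after the fact with base R's p.adjust(), which supports the Benjamini-Hochberg FDR method. The sketch below uses only the eight p-values printed above, typed in by hand. It assumes that on the real object the same column can be pulled out of each element of Mydata.hcpc$desc.var$quanti (for instance via a "p.value" column), which has not been checked here. A proper FDR correction should also be run over all the tests in the family, not just the rows shown.

```r
# p-values copied from the p.value column of the printed output
p_raw <- c(
  CD8RAnegDRpos             = 1.922798e-38,
  TregRAnegDRpos            = 4.946490e-38,
  mTregCCR6pos197neg195neg  = 1.124088e-37,
  CD8posCCR6neg183neg194neg = 8.972977e-37,
  mTregCCR6neg197neg195neg  = 9.455537e-34,
  CD8CD8posCD4neg           = 1.222504e-29,
  CD8posCCR6pos183pos194neg = 3.025904e-29,
  TconvRAnegDRpos           = 3.378002e-28
)

# Benjamini-Hochberg adjustment (controls the false discovery rate)
p_bh <- p.adjust(p_raw, method = "BH")

# Raw and adjusted values side by side
res <- data.frame(p.value = p_raw, p.adj.BH = p_bh)
res
```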

partial Distance Based RDA - Centroids vanished from Plot

I am trying to fit a partial db-RDA with field.ID as a condition to correct for the repeated-measurements character of the samples. However, including Condition(field.ID) makes the centroids of the main factor of interest disappear from the plot (left plot below).
The design: 12 fields were sampled repeatedly for species data in two consecutive years. In addition, 3 samples from reference fields were taken every year. These three reference fields were changed in the second year because the former fields were unavailable.
Some environmental variables were also sampled (nitrogen, soil moisture, temperature). Every field has an identifier (field.ID).
Using field.ID as the Condition seems to erroneously remove the F1 factor, whereas using the sampling campaign (SC) as the Condition does not. Is the latter the right way to correct for repeated measurements in a partial db-RDA?
set.seed(1234)
df.exp <- data.frame(field.ID = factor(c(1:12,13,14,15,1:12,16,17,18)),
SC = factor(rep(c(1,2), each=15)),
F1 = factor(rep(rep(c("A","B","C","D","E"),each=3),2)),
Nitrogen = rnorm(30,mean=0.16, sd=0.07),
Temp = rnorm(30,mean=13.5, sd=3.9),
Moist = rnorm(30,mean=19.4, sd=5.8))
df.rsp <- data.frame(Spec1 = rpois(30, 5),
Spec2 = rpois(30,1),
Spec3 = rpois(30,4.5),
Spec4 = rpois(30,3),
Spec5 = rpois(30,7),
Spec6 = rpois(30,7),
Spec7 = rpois(30,5))
data=cbind(df.exp, df.rsp)
dbRDA <- capscale(df.rsp ~ F1 + Nitrogen + Temp + Moist + Condition(SC), df.exp); ordiplot(dbRDA)
dbRDA <- capscale(df.rsp ~ F1 + Nitrogen + Temp + Moist + Condition(field.ID), df.exp); ordiplot(dbRDA)
You partial out the variation due to ID and then try to explain it with a variable that is aliased to this ID, but that variation was already partialled out. The key line in the printed output was this:
Some constraints were aliased because they were collinear (redundant)
And indeed, when you ask for details, you get
> alias(dbRDA, names=TRUE)
[1] "F1B" "F1C" "F1D" "F1E"
The F1? variables were constant within ID which already was partialled out, and nothing was left to explain.
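The aliasing can be checked without vegan. The following base-R sketch rebuilds only the two design columns from the question and shows that every field.ID belongs to exactly one F1 level, so adding F1 to a model that already contains field.ID does not raise the rank of the design matrix. This is just an illustration of the nesting, not a replacement for alias().

```r
# Design columns as defined in the question
df.exp <- data.frame(
  field.ID = factor(c(1:12, 13, 14, 15, 1:12, 16, 17, 18)),
  F1       = factor(rep(rep(c("A", "B", "C", "D", "E"), each = 3), 2))
)

# How many distinct F1 levels occur within each field?
tab <- table(df.exp$field.ID, df.exp$F1)
levels_per_field <- rowSums(tab > 0)
all(levels_per_field == 1)   # F1 is constant within every field.ID

# Rank of the design matrix with and without F1
rank_id   <- qr(model.matrix(~ field.ID, df.exp))$rank
rank_both <- qr(model.matrix(~ field.ID + F1, df.exp))$rank
c(field.ID = rank_id, field.ID_plus_F1 = rank_both)
```

Equal ranks mean that F1 adds no new directions once field.ID is in the model, which is why its dummy variables are reported as aliased.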