cv.GAMBoost {GAMBoost} | R Documentation |
Performs K-fold cross-validation for GAMBoost
to find the optimal number of boosting steps.
cv.GAMBoost(x=NULL,y,x.linear=NULL,maxstepno=500, K=10,type=c("loglik","error","L2"),pred.cutoff=0.5, just.criterion=FALSE,trace=FALSE,parallel=FALSE, upload.x=TRUE,folds=NULL,...)
x |
n * p matrix of covariates with potentially non-linear influence. If this is not given (and argument x.linear is employed), a generalized linear model is fitted. |
y |
response vector of length n . |
x.linear |
optional n * q matrix of covariates with linear influence. |
maxstepno |
maximum number of boosting steps to evaluate. |
K |
number of folds to be used for cross-validation. |
type, pred.cutoff |
goodness-of-fit criterion: likelihood ("loglik"), error rate for binary response data ("error"), or squared error for other response types ("L2").
For binary response data and the "error" criterion, pred.cutoff specifies the predicted-probability cutoff for predicting class 1 vs. 0. |
just.criterion |
logical value indicating whether a list with the goodness-of-fit information should be returned (TRUE) or a GAMBoost fit with the optimal number of steps (FALSE). |
trace |
logical value indicating whether information on progress should be printed. |
parallel |
logical value indicating whether evaluation of cross-validation folds should be performed in parallel
on a compute cluster. This requires the snowfall package. |
upload.x |
logical value indicating whether x and x.linear need to be uploaded to the
compute cluster for parallel computation. Uploading these only once (using sfExport(x,x.linear) from the snowfall package) can save considerable time for large data sets. |
folds |
if not NULL, this has to be a list of length K, each element being a vector of the indices of the observations in that fold. Useful for employing the same folds for repeated runs. |
... |
miscellaneous parameters for the calls to GAMBoost |
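As an illustration of the list structure expected by the folds argument, the following base-R sketch splits the observation indices 1, ..., n into K disjoint folds. The helper make.folds is hypothetical and not part of GAMBoost; it only shows one way such a list might be built so that the same folds can be reused across runs.

```r
## Hypothetical helper (not part of GAMBoost): split indices 1..n into K folds
make.folds <- function(n, K, seed = NULL) {
    if (!is.null(seed)) set.seed(seed)
    ## assign each observation a fold label 1..K in random order
    fold.id <- sample(rep(seq_len(K), length.out = n))
    ## collect the observation indices belonging to each fold label
    unname(split(seq_len(n), fold.id))
}

folds <- make.folds(n = 100, K = 10, seed = 1)
length(folds)     # one list element per fold
lengths(folds)    # number of observations in each fold

## The list could then be passed on, e.g.
## cv.GAMBoost(x, y, K = 10, folds = folds, ...)
```

Fixing the seed, or storing the list itself, keeps the folds identical between repeated calls, which makes criteria from different runs directly comparable.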
GAMBoost fit with the optimal number of boosting steps or, if just.criterion=TRUE, a list with the following components:
criterion |
vector with the goodness-of-fit criterion for boosting steps 1, ..., maxstepno |
se |
vector with standard error estimates for the goodness-of-fit criterion in each boosting step. |
selected |
index of the optimal boosting step. |
folds |
list of length K , where the elements are vectors of the indices of observations in the respective folds. |
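For type="error", the entries of criterion are estimated misclassification rates. The base-R sketch below shows how such a rate can be derived from predicted probabilities and a cutoff, and how the minimizing step can be read off a criterion vector with which.min, as in the example for this page. All values are illustrative toy numbers, the function error.rate is hypothetical, and the use of a strict ">" at the cutoff is an assumption rather than documented GAMBoost behaviour.

```r
## Hypothetical helper (not part of GAMBoost): misclassification rate at a cutoff
## Assumption: class 1 is predicted when the probability exceeds the cutoff
error.rate <- function(p.hat, y, cutoff = 0.5) {
    y.pred <- as.numeric(p.hat > cutoff)
    mean(y.pred != y)
}

## Toy predicted probabilities and observed 0/1 responses (illustrative only)
p.hat <- c(0.10, 0.40, 0.55, 0.80, 0.30, 0.70)
y.obs <- c(0, 1, 1, 1, 0, 0)
err <- error.rate(p.hat, y.obs, cutoff = 0.5)
err

## Toy criterion vector, one entry per boosting step (illustrative only)
crit <- c(0.30, 0.22, 0.18, 0.19, 0.21)
best.step <- which.min(crit)
best.step
```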
Harald Binder binderh@fdm.uni-freiburg.de
## Not run: 
## Generate some data
x <- matrix(runif(100*8,min=-1,max=1),100,8)
eta <- -0.5 + 2*x[,1] + 2*x[,3]^2
y <- rbinom(100,1,binomial()$linkinv(eta))

## Fit the model with smooth components
gb1 <- GAMBoost(x,y,penalty=400,stepno=100,trace=TRUE,family=binomial())

## 10-fold cross-validation with prediction error as a criterion
gb1.crit <- cv.GAMBoost(x,y,penalty=400,maxstepno=100,trace=TRUE,
                        family=binomial(),
                        K=10,type="error",just.criterion=TRUE)

## Compare AIC and estimated prediction error
which.min(gb1$AIC)
which.min(gb1.crit$criterion)

## End(Not run)