fitvario {RandomFields}R Documentation

LSQ and Maximum Likelihood Estimation of Random Field Parameters

Description

The function estimates arbitrary parameters of a random field specification with various methods.

Usage

fitvario(x, y=NULL, z=NULL, T=NULL, data, model, param,
         lower=NULL, upper=NULL, sill=NA, ...)

fitvario.default(x, y=NULL, z=NULL, T=NULL, data, model, param,
         lower=NULL, upper=NULL, sill=NA, trend,
         use.naturalscaling=TRUE, PrintLevel=RFparameters()$Print,
         optim.control=NULL, bins=20, nphi=1, ntheta=1, ntime=20,
         distance.factor=0.5,
         upperbound.scale.factor=10, lowerbound.scale.factor=20,
         lowerbound.scale.LS.factor=5,
         upperbound.var.factor=10, lowerbound.var.factor=100,
         lowerbound.sill=1E-10, scale.max.relative.factor=1000,
         minbounddistance=0.001, minboundreldist=0.02,
         approximate.functioncalls=50, refine.onborder=TRUE,
          pch=RFparameters()$pch,
         var.name="X",
         time.name="T", transform=NULL, standard.style=NULL,
         lsq.methods=c("self", "plain", "sqrt.nr", "sd.inv", "internal"),
         mle.methods=c("ml", "reml"),
         cross.methods=NULL,
         users.guess=NULL, users.beta=NULL, table.format=FALSE)

Arguments

x (n x 2)-matrix of coordinates, or vector of x-coordinates
y vector of y coordinates
z vector of z coordinates
T vector of T coordinates; these coordinates are given in triple notation, see GaussRF
data vector or matrix of values measured at coord; If also a time component is given, the in the data the indices for the spatial components run the fastest.
model string or list; covariance model, see CovarianceFct, or type PrintModelList() to get all options.
If model is a list, then the parameters with value NA are estimated. Parameters that have value NaN should be explicitely be defined by the function transform. An alternative to define NaN values and the function transform, is to replace the NaN by a real-valued function with solely parameter a list defining a covariance model. In case of the anisotropy matrix, the matrix must be replaced by a list if functions are introduced. Only the list elements variance, scale or anisotropy, and kappas can be used, and not the mean or the trend. Further, the mean or the trend cannot be set by such a function. See also transform below.
param vector or matrix or NULL. If vector then param=c(mean, variance, nugget, scale,...); the parameters must be given in this order. Further parameters are to be added in case of a parametrised class of covariance functions, see CovarianceFct. Any components set to NA are estimated; the others are kept fix. See also model above.
lower list or vector. Lower bounds for the parameters. If lower and param are vectors and length(lower) < length(param) then lower must match the number of additional parameters a,b,c,....
If param is matrix the length of lower must match the number columns of param or being 2 elements smaller (then lower is filled with NA from the left. The bounds are equally applied to all rows.
If lower is a list, then elements that are not given are considered as NA.
If lower is not given, or lower contains NA then the missing bounds are generated automatically.
upper list or vector. Upper bounds for the parameters. See also lower.
sill If not NA the sill is kept fix. Only used if the standard format for the covariance model is given. See Details.
trend Not programmed yet. May only be set if missing(param); linear formula : uses X1, X2,... and T as internal parameters for the coordinates; all parameters are estimated
matrix : must have the same number of rows as x
fixed mean + matrix or linear formula : not possible within this function (just subtract the mean from your data before calling this function)
... arguments as given in fitvario.default and listed in the following.
use.naturalscaling logical. Only used if model is given in standard (simple) way. If TRUE then internally, rescaled covariance functions will be used for which cov(1)~=0.05. use.naturalscaling has the advantage that scale and the form parameters of the model get ‘orthogonal’, but use.naturalscaling does not work for all models. See Details.
PrintLevel level to which messages are shown. See Details.
optim.control control list for optim, which uses 'L-BFGS-B'. However 'parscale' may not be given.
bins number of bins of the empirical variogram. See Details.
nphi scalar or vector of 2 components. If it is a vector then the first component gives the first angle of the xy plane and the second one gives the number of directions on the half circle. If scalar then the first angle is assumed to be zero. Note that a good estimation of the variogramm by LSQ with a anisotropic model a large value for ntheta might be needed (about 20).
ntheta scalar or vector of 2 components. If it is a vector then the first component gives the first angle in the third direction and the second one gives the number of directions on the half circle. If scalar then the first angle is assumed to be zero.
Note that a good estimation of the variogramm by LSQ with a anisotropic model a large value for ntheta might be needed (about 20).
ntime scalar or vector of 2 components. if ntimes is a vector, then the first component are the maximum time distance (in units of the grid length T[3]) and the second component gives the step size (in units of the grid length T[3]). If scalar then the step size is assumed to 1 (in units of the grid length T[3].
distance.factor relative right bound for the bins. See Details.
upperbound.scale.factor relative upper bound for scale in LSQ and MLE. See Details.
lowerbound.scale.factor relative lower bound for scale in MLE. See Details.
lowerbound.scale.LS.factor relative lower bound for scale in LSQ. See Details.
upperbound.var.factor relative upper bound for variance and nugget. See Details.
lowerbound.var.factor relative lower bound for variance. See Details.
lowerbound.sill absolute lower bound for variance and nugget. See Details.
scale.max.relative.factor relative lower bound for scale below which an additional nugget effect is detected. See Details.
minbounddistance absolute distance to the bounds below which a part of the algorithm is considered as having failed. See Details.
minboundreldist relative distance to the bounds below which a part of the algorithm is considered as having failed. See Details.
approximate.functioncalls approximate evaluations of the ML target function on a grid. See Details.
refine.onborder logical. If refine.onborder=TRUE and if the result of any maximum likelihood method or cross validation method is on a borderline, then the optimisation is redone in a modified way (which takes about double extra time)
pch character shown before evaluating any method; if pch!="" then one or two additional steps in the MLE methods are marked by “+”. Default: "*".
var.name basic name for the coordinates in the formula of the trend. Default: ‘X’
time.name basic name for the time component in the formula of the trend. Default: ‘X’
transform vector of strings. Essentially, transform allows for the definition of a parameter as a function of other estimated or fixed parameters. All the parameters are supposed to be in a vector called ‘param’ where the positions are given by parampositions. An example of transform is function(param) {param[3] <- 5 - param[1]; param}. Any parameter that is set by transform, should be NaN in the model definition. If it is NA a warning is given. Note that the mean and the trend of the model can be neither set nor used in transform. See also standard.style.
Instead of giving transform, in the model definition, all NaN values are replaced by functions whose only parameter is a bare model list, i.e., only the list elements variance, scale or anisotropy, and kappas can be used, and not the mean or the trend. Further, the mean or the trend cannot be set by such a function.
Default: NULL
standard.style logical or NULL. This variable should only be set by the advanced user. If NULL then standard.style will be TRUE if the covariance model allows for a ‘standard’ definition (see convert.to.readable and CovarianceFct) and transform is NULL. If a ‘standard’ definition is given and both the variance and the nugget are either not estimated or do not appear on the right hand side of the transform, then standard.style might be set to TRUE by the user. This accelerates the MLE algorithm. The responsibility is completely left to the user, then.
lsq.methods variants of the least squares fit of the variogram. See Details.
mle.methods variants of the maximum likelihood fit of the covariance function. See Details.
cross.methods Not implemented yet.
users.guess User's guess of the parameters. All the parameters must be given using the same rules as for either param (except that no NA's should be contained) or model.
users.beta User's guess of the trend.
table.format logical. The parameters controls the output. See Value.

Details

The optimisations are performed using optimize if one parameter has to be estimated only and optim, otherwise.

First, by means of various control parameters, see below, the algorithm first tries to estimate the bounds for the parameters to be estimated, if the bounds for the parameters are not given.
Independently whether users.guess is given, the algorithm guesses initial values for the parameters. The automatic guess and the user's guess will be called primitive methods in the following.

Second, the variogram model is fitted by various least squares methods (according to the value of lsq.methods) using the best parameter set among the primitive methods as initial value if the effective number of parameters is greater than 1.

[Remarks: (i) “best” with respect to the target value of the respective lsq method; (ii) the effective number of parameters in the optimisation algorithm can be smaller than the number of estimated parameters, since in some cases, some parameters can be calculated explicitely; relevant for the choice between optimize and optim is the effective number of parameters; (iii) optim needs]

Third, the model is fitted by various maximum likelihood methods (according to the value of mle.methods) using the best parameter set among the primitive methods and the lsq methods as initial value (if the effective number of parameters is greater than 1).

Comments on specific parameters:

Value

If table.format=TRUE then a matrix is returned where the first rows contain the estimated parameters, followed by the target values of all methods for the given set of parameters; the last rows give the lower and upper bounds used in the estimations. The columns correspond to the various estimation methods for the parameters.
If table.format=FALSE then the function returns a list with the following elements

ev list returned by EmpiricalVariogram
variogram list of fitted models; they include the fixed and the estimated parameters; each list element gives the estimated covariance model
values list of optimal target values

Acknowledgement

Thanks to Paulo Ribeiro for hints and comparing the perliminary versions of fitvario in RandomFields V1.0 to likfit of the package geoR whose homepage is at http://www.est.ufpr.br/geoR/.

Note

This function does not depend on the value of RFparameters()$PracticalRange. The function fitvario always uses the standard specification of the covariance model as given in CovarianceFct.

Author(s)

Martin Schlather, martin.schlather@math.uni-goettingen.de http://www.stochastik.math.uni-goettingen.de/institute

References

See Also

CovarianceFct, GetPracticalRange, parampositions RandomFields,

Examples

 model <-"gencauchy"
 param <- c(0, 1, 0, 1, 1, 2)
 estparam <- c(0, NA, 0, NA, NA, 2) ## NA means: "to be estimated"
 ## sequence in `estparam' is
 ## mean, variance, nugget, scale, (+ further model parameters)
 ## So, mean, variance, and scale will be estimated here.
 ## Nugget is fixed and equals zero.
 points <- 100
 x <- runif(points,0,3)
 y <- runif(points,0,3) ## 100 random points in square [0, 3]^2
 d <- GaussRF(x=x, y=y, grid=FALSE, model=model, param=param, n=10)
 str(fitvario(x=cbind(x,y), data=d, model=model, param=estparam,
              lower=c(0.1, 0.1), upper=c(1.9, 5), cross.me=NULL))

#########################################################
## The next two estimations give about the same result.
## For the first the sill is fixed to 1.5. For the second the sill
## is reached if the estimated variance is smaller than 1.5
estparam <-  c(0, NA, NA, NA, NA, NA) 
str(fitvario(x=cbind(x,y), data=d, model=model, param=estparam,
    sill=1.5, cross.me=NULL))

estparam <-  c(0, NA, NaN, NA, NA, NA) 
parampositions(model=model, param=estparam)
f <- function(param) {
   param[5] <- max(0, 1.5 - param[1])
   return(param)
}
str(fitvario(x=cbind(x,y), data=d, model=model, param=estparam, 
    transform=f, cross.me=NULL))

#########################################################
## esimation of coupled parameters (kappa1 = kappa2, here)
estmodel <- list(list(model="gencauchy", var=NA, scale=NA,
                      kappa=list(NA, function(m) m[[1]]$kappa[1])),
                 "+",
                 list(model="nugget", var=NA))
system.time(str(fitvario(x=cbind(x,y), data=d, model=estmodel,
              cross.me=NULL), vec.len=6))

## much faster implementation, but trickier coding
estparam <- c(0, NA, NA, NA, NA, NaN) 
parampositions(model=model, param=estparam)
f <- function(param) {param[3] <- param[2]; param}
system.time(str(fitvario(x=cbind(x,y), data=d, model=model,
      param=estparam, transform=f, cross.me=NULL), vec.len=6))

## about as fast as previous coding
## about the same precision as previous coding, or better
## But, a warning is given, since the user may programme
## strange things in this setup, and the programme cannot check it.
system.time(str(fitvario(x=cbind(x,y), data=d, model=model,
      param=estparam, transform=f, cross.me=NULL,
      standard.style=TRUE), vec.len=6))


#########################################################
## estimation in a anisotropic framework
  x <- y <- (1:6)/4
  model <- list(list(m="exp", v=1.5, aniso=c(4,2,-2,1)))
  z <- GaussRF(x=x, y=y, grid=TRUE, model=model, n=10)
  estmodel <- list(list(m="exp", v=NA, aniso=c(NA, NA,-2,1)))
  system.time(str(fitvario(as.matrix(expand.grid(x, y)), data=z,
                           model=estmodel, nphi=20)))


[Package RandomFields version 1.3.41 Index]