R/species_mix-class.R, R/species_mix_s3-class.R
species_mix.multifit.Rd
This version of species_mix is useful for fitting models that have complex likelihoods. It refits the model from multiple random starts, which improves the chance that optimisation of the log-likelihood reaches a good optimum.
species_mix.multifit(
  archetype_formula = NULL,
  species_formula = stats::as.formula(~1),
  all_formula = NULL,
  data,
  nArchetypes = 3,
  family = "bernoulli",
  offset = NULL,
  weights = NULL,
  bb_weights = NULL,
  size = NULL,
  power = 1.6,
  control = list(ecm_prefit = FALSE),
  inits = NULL,
  titbits = FALSE,
  nstart = 10,
  mc.cores = 1
)

# S3 method for species_mix.multifit
print(x, ...)
archetype_formula | an object of class "formula" (or an object that can be coerced to that class). The response variable (left-hand side of the formula) needs to be either 'occurrence', 'abundance', 'biomass' or 'quantity' data. The type of response data helps determine the error distribution to be used. The covariates (right-hand side) of this formula specify the dependence of the species archetype probabilities on covariates. For all models the basic formula structure looks something like this: cbind(spp1,spp2,spp3)~1+temperature+rainfall |
---|---|
species_formula | an object of class "formula" (or an object that can be coerced to that class). The right-hand side of this formula specifies the dependence of the species' data on covariates (typically different covariates to |
all_formula | an object of class "formula" that represents a single, constant set of covariates across all species and groups. You might typically use this as an alternative to an offset, where there is some bias in the data that is relatively constant and arises as an artefact of how the data were collected. |
data | a matrix or data frame that contains the 'species_data' matrix, a const and the covariates, in the structure spp1, spp2, spp3, const, temperature, rainfall. The dimensions of the matrix should be nsites*(nspecies+const+covariates). |
nArchetypes | The number of mixing components (groups) to fit. |
family | The family of statistical distribution to use within the ecomix models. The choices are "bernoulli", "poisson", "ippm", "negative.binomial" and "gaussian"; each is applicable to specific types of data. |
offset | a numeric vector of length nrow(data) (n sites) that is included in the model as an offset. It enters the conditional part of the model, where conditioning is performed on the SAM. |
weights | a numeric vector of length ncol(Y) (n species) that is used as weights in the log-likelihood calculations. If NULL (default) then all weights are assumed to be identically 1. Because we are estimating the log-likelihood over species (rather than sites), the weights should be a vector n species long. |
bb_weights | a numeric vector n species long. This is used for undertaking a Bayesian Bootstrap. See 'vcov.species_mix' for more details. |
size | The size of the sample for a binomial model (defaults to 1). |
power | The power parameter for a Tweedie model. Default is 1.6, and this value is assigned to all species. |
control | a list of control parameters for optimisation and calculation. See details. |
inits | NULL or a numeric vector that provides approximate starting values for species_mix coefficients. These are distribution specific, but at a minimum you will need pi (additive logistic transformed), alpha (intercepts) and beta (mixing coefs). |
titbits | either a boolean or a character vector. If TRUE (default for species_mix(qv)), then some objects used in the estimation of the model's parameters are returned in a list entitled "titbits" in the model object. Some functions, for example plot.species_mix(qv) and predict.species_mix(qv), will require some or all of these pieces of information. If titbits=FALSE (default for species_mix.multifit(qv)), then an empty list is returned. If a character vector, then just those objects are returned. Possible values are: "Y" for the outcome matrix, "X" for the model matrix for the SAM model, "offset" for the offset in the model, "site_spp_weights" for the model weights, "archetype_formula" for the formula for the SAMs, "species_formula" for the formula for the species-specific model, "control" for the control arguments used in model fitting, "family" for the conditional distribution of the species data. Take care when using titbits=TRUE in species_mix.multifit(qv) calls, because titbits is created for EACH OF THE MODEL FITS. If the data is large or if nstart is large, then setting titbits=TRUE may cause memory problems. |
nstart | for species_mix.multifit only. The number of random starts to perform for re-fitting. Default is 10, which will need increasing for serious use. |
mc.cores | for species_mix.multifit only. The number of cores to spread the re-fitting over. |
x | A species mix multifit object |
... | Ignored |
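
The layout described for archetype_formula, data and weights can be sketched with base R alone. This is a minimal illustration under stated assumptions, not output from ecomix: the toy data, the column names (spp1 to spp3, const, temperature, rainfall) and the object names (dat, sam_form, spp_weights) are all hypothetical and simply follow the naming used in the argument descriptions.

```r
# Hypothetical toy data: 5 sites, 3 species, 1 constant, 2 covariates.
# All names here are illustrative assumptions, not ecomix requirements.
set.seed(1)
n_sites <- 5
species_names <- paste0("spp", 1:3)

# Species occurrence columns first, then the constant, then the covariates.
dat <- data.frame(
  matrix(stats::rbinom(n_sites * 3, size = 1, prob = 0.5),
         nrow = n_sites, dimnames = list(NULL, species_names)),
  const = 1,
  temperature = stats::rnorm(n_sites),
  rainfall = stats::rnorm(n_sites)
)

# Build cbind(spp1,spp2,spp3) ~ 1 + temperature + rainfall programmatically.
sam_form <- stats::as.formula(
  paste0("cbind(", paste(species_names, collapse = ","),
         ") ~ 1 + temperature + rainfall")
)

# One weight per species (not per site), as the 'weights' entry describes.
spp_weights <- rep(1, length(species_names))

print(sam_form)
print(dim(dat))  # nsites x (nspecies + const + covariates)
```

Building the formula from a vector of species names avoids typing every response column by hand, which matters once the number of species grows beyond a handful.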

# \donttest{
library(ecomix)
set.seed(42)
sam_form <- stats::as.formula(paste0('cbind(', paste(paste0('spp', 1:20), collapse = ','), ")~x1+x2"))
sp_form <- ~ 1
beta <- matrix(c(-2.9, -3.6, -0.9, 1, .9, 1.9), 3, 2, byrow = TRUE)
dat <- data.frame(y = rep(1, 100), x1 = stats::runif(100, 0, 2.5), x2 = stats::rnorm(100, 0, 2.5))
dat[, -1] <- scale(dat[, -1])
simulated_data <- species_mix.simulate(archetype_formula = sam_form, species_formula = sp_form,
                                       data = dat, beta = beta, family = "bernoulli")
fm1 <- species_mix(archetype_formula = sam_form, species_formula = sp_form,
                   data = simulated_data, family = 'bernoulli', nArchetypes = 3)
#> initial value 819.837658
#> iter 10 value 819.706575
#> final value 819.703416
#> converged
fmods <- species_mix.multifit(archetype_formula = sam_form, species_formula = sp_form,
                              data = simulated_data, family = 'bernoulli',
                              nstart = 10, nArchetypes = 3)
#> NULL
# }
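
To illustrate why nstart and mc.cores matter, the general multi-start idea can be sketched in base R. This is not the ecomix implementation: the objective neg_loglike is a made-up multimodal function standing in for a negative log-likelihood, and n_start and n_cores are hypothetical names that play the roles of nstart and mc.cores.

```r
# Toy multimodal objective standing in for a negative log-likelihood
# (illustrative assumption only; not the ecomix likelihood).
neg_loglike <- function(par) sin(3 * par) + 0.1 * (par - 2)^2

n_start <- 10  # plays the role of 'nstart'
n_cores <- 1   # plays the role of 'mc.cores'

# Draw random starting values.
set.seed(42)
starts <- stats::runif(n_start, min = -5, max = 5)

# Fit once from each start; mclapply spreads the fits over cores.
# (Values of mc.cores above 1 are not supported on Windows.)
fits <- parallel::mclapply(starts, function(s) {
  stats::optim(par = s, fn = neg_loglike, method = "BFGS")
}, mc.cores = n_cores)

# Different starts can converge to different local optima,
# so keep the fit with the smallest objective value.
values <- vapply(fits, function(f) f$value, numeric(1))
best <- fits[[which.min(values)]]

print(round(values, 3))
print(best$par)
```

Comparing the objective values across starts is a quick check of how rugged the surface is: if most starts agree, a small number of starts is probably enough, whereas widely scattered values suggest increasing the number of starts.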