Simulates a data set from a mixture-of-experts model for RCP (region of common profile) types.

regional_mix.simulate(
  nRCP = 3,
  S = 20,
  n = 200,
  p.x = 3,
  p.w = 0,
  alpha = NULL,
  tau = NULL,
  beta = NULL,
  gamma = NULL,
  logDisps = NULL,
  powers = NULL,
  X = NULL,
  W = NULL,
  offset = NULL,
  family = "bernoulli"
)

Arguments

nRCP

Integer giving the number of RCPs

S

Integer giving the number of species

n

Integer giving the number of observations (sites)

p.x

Integer giving the number of covariates (including the intercept) for the model for the latent RCP types

p.w

Integer giving the number of covariates (excluding the intercept) for the model for the species data

alpha

Numeric vector of length S. Specifies the mean prevalence for each species, on the logit scale

tau

Numeric matrix of dimension c(nRCP-1,S). Specifies each species' difference from its overall mean to each RCP's mean, for the first nRCP-1 RCPs. The means for the last RCP are calculated using the sum-to-zero constraint.
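
As a rough illustration of how alpha and tau combine, the sketch below rebuilds the last RCP's row of tau from the sum-to-zero constraint and back-transforms to prevalences for the Bernoulli case. The sizes, the random values, and the names tau_full, eta and prevalence are hypothetical. The additive form alpha + tau is an assumption drawn from the argument descriptions, not a statement about the package internals.

```r
# Hypothetical sizes and values, chosen only for illustration
set.seed(1)
nRCP <- 3
S <- 4
alpha <- rnorm(S)  # species means on the logit scale
tau <- matrix(rnorm((nRCP - 1) * S), nrow = nRCP - 1, ncol = S)

# Sum-to-zero constraint: the last RCP's row is minus the sum of the other rows
tau_full <- rbind(tau, -colSums(tau))

# Assumed parameterisation: the linear predictor for RCP k and species s
# is alpha[s] + tau_full[k, s]
eta <- sweep(tau_full, 2, alpha, "+")

# Bernoulli case: back-transform from the logit scale to prevalences
prevalence <- plogis(eta)
```

Each column of tau_full sums to zero, so alpha remains the across-RCP average for each species on the link scale.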

beta

Numeric matrix of dimension c(nRCP-1,p.x). Specifies the RCP's dependence on the covariates (in X)
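
The following sketch shows one way a c(nRCP-1,p.x) beta matrix can map covariates to RCP membership probabilities. It assumes a multinomial-logit form with the last RCP as the reference type, which is consistent with beta having nRCP-1 rows but is not confirmed by the text above. All values and the names eta and pis are hypothetical.

```r
# Hypothetical sizes and values, chosen only for illustration
set.seed(2)
n <- 5
nRCP <- 3
p.x <- 2
X <- cbind(const = 1, x1 = runif(n, min = -1, max = 1))  # intercept is part of X
beta <- matrix(c(0.5, -0.3, 1.0, 2.0), nrow = nRCP - 1, ncol = p.x)

# Linear predictors for the first nRCP-1 types; the last type is taken as
# the reference, with its linear predictor fixed at 0 (an assumption)
eta <- cbind(X %*% t(beta), 0)

# Normalise each row so the membership probabilities sum to 1 per site
pis <- exp(eta) / rowSums(exp(eta))
```

Each row of pis holds one site's probabilities of belonging to each of the nRCP types.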

gamma

Numeric matrix of dimension c(S,p.w). Specifies the species' dependence on the covariates (in W)

logDisps

Logarithm of the (over-)dispersion parameters for each species, used for the negative binomial, Tweedie and Gaussian models

powers

Power parameters for each species for Tweedie model

X

Numeric matrix of dimension c(n,p.x). Specifies the covariates for the RCP model. Must include the intercept, if one is wanted. Default is random numbers in a matrix of the right size.

W

Numeric matrix of dimension c(n,p.w). Specifies the covariates for the species model. Must not include the intercept, unless you want it included twice. Default is to give random levels of a two-level factor.

offset

Numeric vector of length n. Specifies any offset to be included in the species-level model.

family

Text string. Specifies the family of the species data. Current options are "bernoulli" (default), "poisson", "negative.binomial", "tweedie" and "gaussian".

Examples

if (FALSE) {
  # generates synthetic data
  set.seed( 151)
  n <- 100
  S <- 10
  nRCP <- 3
  my.dist <- "negative.binomial"
  X <- as.data.frame( cbind( x1=runif( n, min=-10, max=10), x2=runif( n, min=-10, max=10)))
  Offy <- log( runif( n, min=30, max=60))
  pols <- list()
  pols[[1]] <- poly( X$x1, degree=3)
  pols[[2]] <- poly( X$x2, degree=3)
  X <- as.matrix( cbind( 1, X, pols[[1]], pols[[2]]))
  colnames( X) <- c("const", 'x1', 'x2', paste( "x1",1:3,sep='.'), paste( "x2",1:3,sep='.'))
  p.x <- ncol( X[,-(2:3)])
  p.w <- 3
  W <- matrix(sample( c(0,1), size=(n*p.w), replace=TRUE), nrow=n, ncol=p.w)
  colnames( W) <- paste( "w",1:3,sep=".")
  alpha <- rnorm( S)
  tau.var <- 0.5
  b <- sqrt( tau.var/2)
  tau <- matrix( rexp( n=(nRCP-1)*S, rate=1/b) - rexp( n=(nRCP-1)*S, rate=1/b), nrow=nRCP-1, ncol=S)
  beta <- 0.2 * matrix( c(-1.2, -2.6, 0.2, -23.4, -16.7, -18.7, -59.2, -76.0, -14.2, -28.3, -36.8, -17.8, -92.9,-2.7), nrow=nRCP-1, ncol=p.x)
  gamma <- matrix( rnorm( S*p.w), ncol=p.w, nrow=S)
  logDisp <- log( rexp( S, 1))
  set.seed(121)
  simDat <- regional_mix.simulate( nRCP=nRCP, S=S, p.x=p.x, p.w=p.w, n=n,
                                   alpha=alpha, tau=tau, beta=beta, gamma=gamma,
                                   X=X[,-(2:3)], W=W, family=my.dist,
                                   logDisp=logDisp, offset=Offy)
}