Human physical growth is not a smooth progression through time but is inherently dynamic in nature. Growth proceeds in a series of spurts, which reflect an increase in growth velocity (Bogin, 2010; Cameron & Bogin, 2012a; Stulp & Barrett, 2016). The adolescent growth spurt (also known as the pubertal growth spurt) is experienced by all individuals and is the most readily recognized aspect of adolescence. A clear knowledge of growth dynamics during adolescence is essential when planning treatment to correct skeletal discrepancies such as scoliosis, in which the spine has a sideways curve (Busscher et al., 2012; Sanders et al., 2007). Similarly, the timing of the peak growth velocity is an important factor to be considered when initiating growth modification procedures to correct skeletal malocclusion characterized by discrepancies in jaw size (Dean et al., 2016; Proffit, 2014a, 2014b).

As the notion of change is central to studying growth, a correct view of growth dynamics and the relationship between different growth phases can only be obtained from longitudinal growth data (Cameron & Bogin, 2012b; Hauspie et al., 2004a). Analysis of longitudinal growth data using an appropriate statistical method allows description of growth trajectories (distance and derivatives) and estimation of the timing and intensity of the adolescent growth spurt. Whereas linear growth curve models (GCMs) based on conventional polynomials were used extensively for studying human growth in earlier years (McArdle, 2015), nonlinear mixed-effects models are now gaining popularity in modelling longitudinal skeletal growth data such as height (Hauspie et al., 2004b; McArdle, 2015).

The superimposition by translation and rotation (*SITAR*) model (Cole et al., 2010) is a shape-invariant nonlinear mixed-effects growth curve model that fits a population average (i.e., mean) curve to the data and aligns each individual's growth trajectory to the population average curve via a set of three random effects (Cole et al., 2010): the size relative to the mean growth curve (vertical shift), the timing of the adolescent growth spurt relative to the mean age at peak growth velocity (horizontal shift), and the intensity of the growth velocity, i.e., the relative rate at which an individual grows during the adolescent growth spurt in comparison to the mean growth intensity (horizontal stretch). The concept of the shape-invariant model (SIM) was first described by Lindstrom (1995) and later used by Beath (2007) for modelling infant growth data (birth to 2 years). The current version of the *SITAR* model that we describe below was developed by Cole et al. (2010).
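To make the three random effects concrete, the shape-invariant idea can be sketched numerically: the same mean curve, shifted and stretched, yields each individual's trajectory. A minimal illustration (the `tanh` "mean curve" below is a toy stand-in, not the package's fitted spline):

```python
import numpy as np

def mean_curve(t):
    """Toy stand-in for the fitted spline mean curve, on mean-centred age t."""
    return 150.0 + 20.0 * np.tanh(t / 2.0)

def sitar_curve(t, alpha, zeta, gamma):
    """Shape-invariant transformation: vertical shift alpha (size),
    horizontal shift zeta (timing), horizontal stretch exp(gamma) (intensity)."""
    return alpha + mean_curve((t - zeta) * np.exp(gamma))

t = np.linspace(-6.0, 6.0, 1201)
mean_y = sitar_curve(t, 0.0, 0.0, 0.0)
late_grower = sitar_curve(t, 0.0, 1.0, 0.0)   # spurt shifted 1 year later
taller = sitar_curve(t, 5.0, 0.0, 0.0)        # uniformly 5 units taller
```

The steepest point of `late_grower` (its peak velocity) occurs one year after that of the mean curve, while the shape of the curve itself is unchanged; this is exactly the alignment that SITAR exploits.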

The *SITAR* model is particularly useful in modelling human physical growth during adolescence. Recent studies have used *SITAR* for analyzing height and weight data (Cole & Mori, 2018; Mansukoski et al., 2019; Nembidzane et al., 2020; Riddell et al., 2017) and also to study jaw growth during adolescence (Sandhu, 2020). All these previous studies estimated the *SITAR* model within the frequentist framework as implemented in the R package 'sitar' (T. Cole, 2022).

Consider a dataset comprising \(J\) individuals \((j = 1,..,J)\) where individual \(j\) provides \(n_j\) measurements (\(i = 1,.., n_j\)) of height (\(y_{ij}\)) recorded at age \(x_{ij}\).

\[\begin{equation} y_{ij}=\alpha_{0}+\alpha_{j}+\sum_{r=1}^{p-1}{\beta_r\mathbf{Spl}\left(\frac{x_{ij}-\bar{x}-\left(\zeta_0+\zeta_j\right)}{e^{-\left(\gamma_0+\gamma_j\right)}}\right)}+e_{ij} (\#eq:1) \end{equation}\]where **Spl(.)** is the natural cubic spline function
that generates the spline design matrix and \(\beta_1,..,\beta_{p-1}\) are the spline
regression coefficients for the mean curve, with \(\alpha_0\), \(\zeta_0\), \(\gamma_0\) as the population average size,
timing, and intensity parameters. By default, the predictor, age (\(x_{ij}\)), is mean centered by subtracting
the mean age (\(\bar{x}\)) from it,
where \(\bar{x} = \frac{1}{n}
\sum_{i=1}^{n}x_{i.}\). The individual-specific random effects
for size \((\alpha_j)\), timing \((\zeta_j)\) and intensity \((\gamma_j)\) describe how an individual's
growth trajectory differs from the mean growth curve. The residuals
\(e_{ij}\) are assumed to be
normally distributed with zero mean and residual variance parameter
\(\sigma^2\) and are independent of
the random effects. The random effects are assumed to be multivariate
normally distributed with zero means and variance-covariance matrix
\(\Omega_2\). The \(\Omega_2\) is unstructured (i.e., distinct
variances and covariances between random effects) as follows:

\[\Omega_2 = \begin{bmatrix} \sigma^2_{\alpha_j} & \sigma_{\alpha_j\zeta_j} & \sigma_{\alpha_j\gamma_j} \\ \sigma_{\alpha_j\zeta_j} & \sigma^2_{\zeta_j} & \sigma_{\zeta_j\gamma_j} \\ \sigma_{\alpha_j\gamma_j} & \sigma_{\zeta_j\gamma_j} & \sigma^2_{\gamma_j} \end{bmatrix}\]

There are two competing philosophies of model estimation (Bland & Altman, 1998; Schoot et al., 2014): the Bayesian (based on Bayes' theorem) and the frequentist (e.g., maximum likelihood estimation) approaches. While the frequentist approach was predominant in earlier years, the advent of powerful computers has given a new impetus to Bayesian analysis (Bland & Altman, 1998; Hamra et al., 2013; Schoot et al., 2014). As a result, Bayesian statistical methods are becoming ever more popular in applied and fundamental research. The key difference between Bayesian and frequentist statistical inference concerns the nature of the unknown parameters. In the frequentist framework, a parameter of interest is assumed to be unknown but fixed: it is assumed that in the population there is only one true parameter value, for example, one true mean or one true regression coefficient. In the Bayesian view of subjective probability, all unknown parameters are treated as uncertain and therefore should be described by a probability distribution (Schoot et al., 2014). A particularly attractive feature of Bayesian modelling is its ability to handle otherwise complex model specifications such as hierarchical models (i.e., multilevel/mixed-effects models) that involve nested data structures (e.g., repeated height measurements on individuals) (Hamra et al., 2013).

There are three essential components underlying Bayesian statistics
(Bayes, 1763; Stigler, 1986): the *prior
distribution,* the *likelihood function,* and the
*posterior distribution*. The *prior distribution* refers
to all knowledge available *before* seeing the data whereas the
*likelihood function* expresses the information in the data given
the parameters defined in the model. The third component (i.e.,
*posterior distribution*) is based on combining the first two
components via Bayes’ theorem and the results are summarized by the
so-called *posterior inference*. The *posterior
distribution*, therefore, reflects one’s updated knowledge,
balancing prior knowledge with observed data.
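As a toy illustration of how the posterior balances these two components, consider a conjugate normal model with known measurement error (all numbers below are invented for illustration):

```python
# Prior belief: mean height ~ Normal(150, 10). Data: three measurements
# with an assumed known measurement sd of 5. In this conjugate case the
# posterior precision is the sum of the prior and data precisions, and the
# posterior mean is the precision-weighted average of prior mean and
# sample mean.
prior_mean, prior_sd = 150.0, 10.0
data = [162.0, 158.0, 165.0]
sigma = 5.0
n = len(data)
xbar = sum(data) / n

post_precision = 1.0 / prior_sd**2 + n / sigma**2
post_mean = (prior_mean / prior_sd**2 + n * xbar / sigma**2) / post_precision
post_sd = post_precision ** -0.5
```

The posterior mean lands between the prior mean and the sample mean, pulled toward whichever source carries more precision, and the posterior standard deviation is smaller than either alone: updated knowledge, in the sense described above.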

The task of combining these components can yield a complex model in which the exact distribution of one or more variables is unknown and estimators that rely on assumptions of normality may perform poorly. This limitation is often mitigated by estimating Bayesian models using Markov chain Monte Carlo (MCMC). Unlike deterministic maximum-likelihood algorithms, MCMC is a stochastic procedure that repeatedly generates random samples that characterize the distribution of parameters of interest (Hamra et al., 2013). Popular software platforms for Bayesian estimation include the BUGS family: WinBUGS (Lunn et al., 2000), OpenBUGS (Spiegelhalter et al., 2007), and JAGS (Plummer, 2003). More recently, the software Stan has been developed to achieve higher computational and algorithmic efficiency by using the No-U-Turn Sampler (NUTS), an adaptive variant of Hamiltonian Monte Carlo (HMC) (Gelman et al., 2015; Hoffman & Gelman, 2011; Neal, 2011).
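The core stochastic idea can be sketched with a random-walk Metropolis sampler, a much simpler ancestor of the HMC/NUTS algorithm used by Stan, shown here only to illustrate how repeated random proposals characterize a target distribution:

```python
import math
import random

def metropolis(log_density, x0, n_steps, step_sd, seed=1):
    """Random-walk Metropolis: propose a jittered value and accept it with
    probability min(1, p(proposal)/p(current)); the visited states form
    (correlated) draws from the target distribution."""
    rng = random.Random(seed)
    x, draws = x0, []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_sd)
        if math.log(rng.random()) < log_density(proposal) - log_density(x):
            x = proposal
        draws.append(x)
    return draws

# Target: a standard normal density, known only up to a constant.
draws = metropolis(lambda z: -0.5 * z * z, 0.0, 20000, 1.0)
```

After discarding an initial warm-up portion, the empirical mean and variance of the draws approximate those of the target; HMC/NUTS achieves the same goal far more efficiently in high dimensions by exploiting gradient information.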

Below we describe Bayesian model specification for a two-level
*SITAR* model (see also overview
section) and the default priors.

To get a better understanding of the data-generating mechanism and for ease of showing the prior specification for each individual parameter, we re-express the above model @ref(eq:1) as follows:

\[\begin{equation} \begin{aligned} \text{y}_{ij} & \sim \operatorname{Normal}(\mu_{ij}, \sigma) \\ \\ \mu_{ij} & = \alpha_0+ \alpha_j+\sum_{r=1}^{p-1}{\beta_r\mathbf{Spl}\left(\frac{x_{ij}-\bar{x}-\left(\zeta_0+\zeta_j\right)}{e^{-\left(\gamma_0+\gamma_j\right)}}\right)} \\ \sigma & = \sigma_\epsilon \\ \\ \begin{bmatrix} \alpha_{j} \\ \zeta_{j} \\ \gamma_{j} \end{bmatrix} & \sim {\operatorname{MVNormal}\begin{pmatrix} \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},\ \mathbf S \mathbf R \mathbf S \end{pmatrix}} \\ \\ \mathbf S & = \begin{bmatrix} \sigma_{\alpha_j} & 0 & 0 \\ 0 &\sigma_{\zeta_j} & 0 \\ 0 & 0 & \sigma_{\gamma_j} \end{bmatrix} \\ \\ \mathbf R & = \begin{bmatrix} 1 & \rho_{\alpha_j\zeta_j} & \rho_{\alpha_j\gamma_j} \\\rho_{\zeta_j\alpha_j} & 1 & \rho_{\zeta_j\gamma_j} \\\rho_{\gamma_j\alpha_j} & \rho_{\gamma_j\zeta_j} & 1 \end{bmatrix} \\ \\ \alpha_0 & \sim \operatorname{student\_t}(3,\ y_{mean},\ {y_{sd}},\ autoscale= TRUE) \\ \zeta_0 & \sim \operatorname{student\_t}(3,\ 0,\ 3.5, \ autoscale= FALSE) \\ \gamma_0 & \sim \operatorname{student\_t}(3,\ 0,\ 1.5, \ autoscale= FALSE) \\ \beta_1 \text{,..,} \beta_r & \sim \operatorname{student\_t}(3,\ 0, \ \mathbf {X_{scaled}},\ autoscale= TRUE) \\ \alpha_j & \sim \operatorname{student\_t_{Half}}(3,\ {0},\ {y_{sd}}, \ autoscale= TRUE) \\ \zeta_j & \sim \operatorname{student\_t_{Half}}(3,\ {0},\ {2}, \ autoscale= FALSE) \\ \gamma_j & \sim \operatorname{student\_t_{Half}}(3,\ {0},\ {0.5}, \ autoscale= FALSE) \\ \sigma_\epsilon & \sim \operatorname{exponential}(y_{sd}, \ autoscale= TRUE) \\ \mathbf R & \sim \operatorname{LKJ}(1), \end{aligned} (\#eq:4) \end{equation}\]

The first line in the above equation is the likelihood, which states
that the outcome is normally distributed with mean \(\mu\) and standard
deviation \(\sigma\). The \(\mu_{ij}\) is a function of the growth
curve parameters described earlier @ref(eq:1). The residuals are
assumed to be independent and normally distributed with mean 0 and an
\(n_j \times n_j\) dimensional
covariance matrix \(I\sigma^2_\epsilon\), where \(I\) is the identity matrix (a diagonal matrix
of 1s) and \(\sigma_\epsilon\) is the
residual standard deviation. The assumption of homoscedastic
residuals (constant level-1 variance) can be relaxed.
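The decomposition of the random-effects covariance into standard deviations and correlations used in the specification above, \(\mathbf{S R S}\), can be checked numerically; a sketch with made-up values for the size, timing, and intensity random effects:

```python
import numpy as np

# Assumed (illustrative) standard deviations of the size, timing, and
# intensity random effects, and their correlations.
S = np.diag([4.0, 0.8, 0.15])
R = np.array([[1.0,  0.3, -0.2],
              [0.3,  1.0,  0.5],
              [-0.2, 0.5,  1.0]])

Omega = S @ R @ S   # full covariance: variances on the diagonal,
                    # covariances sd_i * sd_j * rho_ij off the diagonal
```

Parameterizing the covariance this way lets separate priors be placed on the standard deviations (half Student-t) and on the correlation matrix (LKJ), while guaranteeing a valid covariance whenever R is a valid correlation matrix.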

We follow the recommendations made in the popular packages 'rstanarm' and 'brms' for prior specification. Technically, the priors used in the 'rstanarm' and 'brms' packages are data-dependent and hence 'weakly informative': priors are scaled based on the distribution (i.e., the standard deviation) of the outcome and the predictor(s). However, the amount of information introduced is weak and mainly regularizing in nature, helping to stabilize the computation. An important feature of such priors is that the defaults are reasonable for many models.

The 'bsitar' package allows setting the prior for each parameter individually. For example, to set the priors for the population average regression coefficient and the standard deviation of the group-level random effects for the size parameter `a`, the arguments used are `a_prior_beta` and `a_prior_sd`.

Like 'brms' and 'rstanarm', the 'bsitar' package offers full flexibility in setting a wide range of priors, encouraging users to specify priors that actually reflect their prior knowledge about human growth processes. We follow a carefully crafted approach based on the recommendations made in the 'brms' and 'rstanarm' packages. For example, while we follow the 'brms' package in using the 'student_t' distribution for the regression coefficients as well as the standard deviations of the group-level random effects (with 3 degrees of freedom by default), we set an 'exponential' distribution for the residual standard deviation parameter as suggested in the 'rstanarm' package. Note that 'rstanarm' recommends the `normal` distribution for population and random effect intercepts whereas 'brms' uses the 'student_t' distribution for these parameters.

Similar to the 'brms' and 'rstanarm' packages, the 'bsitar' package allows the user to control the scale parameter for location-scale based distributions such as 'normal' and 'student_t' via an auto-scaling option. Here again we adopt an amalgamation of the strategies offered by the 'brms' and 'rstanarm' packages. While 'rstanarm' earlier set autoscale as `TRUE`, which scaled the distribution by multiplying it with a fixed value \(2.5\) (the authors recently changed this behavior to FALSE), the 'brms' package sets the scale factor as \(1.0\) or \(2.5\) depending on the median absolute deviation (MAD) of the outcome: if the MAD is less than \(2.5\), the prior is scaled by a factor of \(2.5\); otherwise the scale factor is \(1.0\) (i.e., no auto-scaling). The 'bsitar' package, on the other hand, offers full flexibility in choosing the scale factor via a built-in option `autoscale`. For example, the scaling factor can be set as any real number such as \(1.5\) (autoscale = 1.5), or setting `autoscale` as `TRUE` (autoscale = TRUE) selects the default scaling factor of \(2.5\). The `autoscale` option is available for all location-scale based distributions such as `normal`, `student_t`, `cauchy`, etc. We strongly recommend going through the documentation on priors included in the 'brms' and 'rstanarm' packages.
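Assuming the behavior described above, the two scaling strategies can be paraphrased as follows (a sketch of the stated rules, not the packages' actual source code):

```python
import statistics

def brms_style_factor(y):
    """brms-style rule as described above: scale the prior by 2.5 when the
    outcome's median absolute deviation is below 2.5, otherwise leave it
    unscaled (factor 1.0)."""
    med = statistics.median(y)
    mad = statistics.median(abs(v - med) for v in y)
    return 2.5 if mad < 2.5 else 1.0

def bsitar_factor(autoscale):
    """bsitar-style option: TRUE selects the default factor 2.5, FALSE means
    no scaling, and any real number is used directly as the factor."""
    if autoscale is True:
        return 2.5
    if autoscale is False:
        return 1.0
    return float(autoscale)
```

The prior's scale parameter (e.g., the sd of a 'student_t' prior) is then multiplied by the returned factor before sampling.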

Below we briefly describe the default approach used in setting the 'weakly informative' priors for the regression coefficients as well as the standard deviations of the random effects for the `a` (size), `b` (timing) and `c` (intensity) parameters, and their correlations.

The population average regression parameter `a` (size) is assigned a 'student_t' prior centered at \(y_{mean}\) (i.e., the mean of the outcome) with scale defined as the standard deviation of the outcome \(y_{sd}\) multiplied by the default scaling factor \(2.5\), i.e., `student_t(3, ymean, ysd, autoscale = TRUE)`. The prior on the standard deviation of the random effect parameter `a` is identical except that it is centered at \(0\), i.e., `student_t(3, 0, ysd, autoscale = TRUE)`.

For the population average regression parameter `b` (timing), the default prior is 'student_t' with mean \(0\) and scale \(3.5\), i.e., `student_t(3, 0, 3.5, autoscale = FALSE)`. Note that, unlike parameter `a`, the autoscale option is set as `FALSE`. Since the predictor `age` is typically mean centered when fitting the *SITAR* model, this prior implies that 95% of the distribution's mass (assuming the distribution approaches a normal curve) covers ages between 6 and 20 years when the mean age is 13 years. Depending on the mean age and whether the data are for males or females, the scale factor can be adjusted accordingly. The prior for the corresponding sd parameter is 'student_t' with mean \(0\) and scale \(2.0\) (`student_t(3, 0, 2.0, autoscale = FALSE)`), implying that 95% of the mass covers \(\pm 4.0\) years of individual variation in the timing parameter `b`.

The default prior for the population average intensity regression parameter `c` is 'student_t' with mean \(0\) and scale \(1.5\), i.e., `student_t(3, 0, 1.5, autoscale = FALSE)`. Since the intensity parameter is estimated on the `exp` scale, this prior implies that 95% of the mass covers intensities between `exp(-3)` and `exp(3)`, i.e., \(0.05\) and \(20.0\) units/year. The prior for the corresponding sd parameter (i.e., the random effect parameter `c`) is again 'student_t' with mean \(0\) and scale \(1.25\) (`student_t(3, 0, 1.25, autoscale = FALSE)`), indicating that 95% of the mass for individual variation in the intensity covers \(\pm \exp(1.25)\), i.e., \(0.28\) to \(3.50\) units/year.

The priors for the correlations between random effect parameters follow the Lewandowski-Kurowicka-Joe (LKJ) distribution. The 'LKJ' prior is specified via a single parameter 'eta'. If `eta = 1` (the default), all correlation matrices are equally likely *a priori*. If `eta > 1`, extreme correlations become less likely, whereas `0 < eta < 1` results in higher probabilities for extreme correlations. See the 'brms' documentation for more details.
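The implied prior ranges quoted above follow from simple arithmetic, taking \(\pm 2\) scale units as the approximate 95% mass of a near-normal prior; a quick numerical check:

```python
import math

# Timing: prior scale 3.5 around a mean age of 13 years (assumed example)
mean_age = 13.0
timing_low = mean_age - 2 * 3.5    # 6 years
timing_high = mean_age + 2 * 3.5   # 20 years

# Intensity: prior scale 1.5 on the exp scale
intensity_low = math.exp(-2 * 1.5)   # ~0.05 units/year
intensity_high = math.exp(2 * 1.5)   # ~20 units/year

# sd of intensity: the quoted 0.28 .. 3.50 range corresponds to exp(+/-1.25)
sd_c_low, sd_c_high = math.exp(-1.25), math.exp(1.25)
```

Running these lines reproduces the 6 to 20 year window for timing and the roughly 0.05 to 20 units/year window for intensity stated above.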

A two-level Bayesian *SITAR* model described earlier can be easily extended to analyze
two or more outcomes simultaneously. Consider two outcomes (NA and PA)
measured repeatedly on \(J\)
individuals \((j = 1,..,J)\) where
individual \(j\) provides \(n_j\) measurements (\(i = 1,.., n_j\)) of \(y_{ij}^\text{NA}\) (outcome NA) and \(y_{ij}^\text{PA}\) (outcome PA) recorded at
age \(x_{ij}\). A multivariate model
is then written as follows @ref(eq:5):

\[\begin{equation} \begin{aligned} \begin{bmatrix} \text{NA}_{ij} \\ \text{PA}_{ij} \end{bmatrix} & \sim \operatorname{MVNormal}\begin{pmatrix} \begin{bmatrix} \mu_{ij}^\text{NA} \\ \mu_{ij}^\text{PA} \end{bmatrix}, \mathbf {\Sigma_{Residual}} \end{pmatrix} \\ \\ \mu_{ij}^{\text{NA}} & = \alpha_0^\text{NA}+ \alpha_j^\text{NA}+\sum_{r^\text{NA}=1}^{p^\text{NA}-1}{\beta_r^\text{NA}\mathbf{Spl}^\text{NA}\left(\frac{x_{ij}-\bar{x}-\left(\zeta_0^\text{NA}+\zeta_j^\text{NA}\right)}{e^{-\left(\gamma_0^\text{NA}+\gamma_j^\text{NA}\right)}}\right)} \\ \\ \mu_{ij}^{\text{PA}} & = \alpha_0^\text{PA}+ \alpha_j^\text{PA}+\sum_{r^\text{PA}=1}^{p^\text{PA}-1}{\beta_r^\text{PA}\mathbf{Spl}^\text{PA}\left(\frac{x_{ij}-\bar{x}-\left(\zeta_0^\text{PA}+\zeta_j^\text{PA}\right)}{e^{-\left(\gamma_0^\text{PA}+\gamma_j^\text{PA}\right)}}\right)} \\ \\ \mathbf {\Sigma_{Residual}} & = \mathbf S_W \mathbf R_W \mathbf S_W \\ \\ \mathbf S_W & = \begin{bmatrix} \sigma_{ij}^\text{NA} & 0 \\ 0 & \sigma_{ij}^\text{PA} \\ \end{bmatrix} \\ \\ \mathbf R_W & = \begin{bmatrix} 1 & \rho_{\sigma_{ij}^\text{NA}\sigma_{ij}^\text{PA}} \\ \rho_{\sigma_{ij}^\text{PA}\sigma_{ij}^\text{NA}} & 1 \end{bmatrix} \\ \\ \begin{bmatrix} \alpha_{j}^\text{NA} \\ \zeta_{j}^\text{NA} \\ \gamma_{j}^\text{NA} \\ \alpha_{j}^\text{PA} \\ \zeta_{j}^\text{PA} \\ \gamma_{j}^\text{PA} \end{bmatrix} & \sim {\operatorname{MVNormal}\begin{pmatrix} \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix},\ \mathbf {\Sigma_{ID}} \end{pmatrix}} \\ \\ \mathbf {\Sigma_{ID}} & = \mathbf S_{ID} \mathbf R_{ID} \mathbf S_{ID} \\ \\ \mathbf S_{ID} & = \begin{bmatrix} \sigma_{\alpha_{j}^\text{NA}} & 0 & 0 & 0 & 0 & 0 \\ 0 & \sigma_{\zeta_{j}^\text{NA}} & 0 & 0 & 0 & 0 \\ 0 & 0 & \sigma_{\gamma_{j}^\text{NA}} & 0 & 0 & 0 \\ 0 & 0 & 0 & \sigma_{\alpha_{j}^\text{PA}} & 0 & 0 \\ 0 & 0 & 0 & 0 & \sigma_{\zeta_{j}^\text{PA}} & 0 \\ 0 & 0 & 0 & 0 & 0 & \sigma_{\gamma_{j}^\text{PA}} \\ \end{bmatrix} \\ \\ \mathbf R_{ID} & = \begin{bmatrix} 1 & \rho_{\alpha_{j}^\text{NA}\zeta_{j}^\text{NA}} & \rho_{\alpha_{j}^\text{NA}\gamma_{j}^\text{NA}} & \rho_{\alpha_{j}^\text{NA}\alpha_{j}^\text{PA}} & \rho_{\alpha_{j}^\text{NA}\zeta_{j}^\text{PA}} & \rho_{\alpha_{j}^\text{NA}\gamma_{j}^\text{PA}} \\ \rho_{\zeta_{j}^\text{NA}\alpha_{j}^\text{NA}} & 1 & \rho_{\zeta_{j}^\text{NA}\gamma_{j}^\text{NA}} & \rho_{\zeta_{j}^\text{NA}\alpha_{j}^\text{PA}} & \rho_{\zeta_{j}^\text{NA}\zeta_{j}^\text{PA}} & \rho_{\zeta_{j}^\text{NA}\gamma_{j}^\text{PA}} \\ \rho_{\gamma_{j}^\text{NA}\alpha_{j}^\text{NA}} & \rho_{\gamma_{j}^\text{NA}\zeta_{j}^\text{NA}} & 1 & \rho_{\gamma_{j}^\text{NA}\alpha_{j}^\text{PA}} & \rho_{\gamma_{j}^\text{NA}\zeta_{j}^\text{PA}} & \rho_{\gamma_{j}^\text{NA}\gamma_{j}^\text{PA}} \\ \rho_{\alpha_{j}^\text{NA}\alpha_{j}^\text{PA}} & \rho_{\zeta_{j}^\text{NA}\alpha_{j}^\text{PA}} & \rho_{\gamma_{j}^\text{NA}\alpha_{j}^\text{PA}} & 1 & \rho_{\alpha_{j}^\text{PA}\zeta_{j}^\text{PA}} & \rho_{\alpha_{j}^\text{PA}\gamma_{j}^\text{PA}} \\ \rho_{\alpha_{j}^\text{NA}\zeta_{j}^\text{PA}} & \rho_{\zeta_{j}^\text{NA}\zeta_{j}^\text{PA}} & \rho_{\gamma_{j}^\text{NA}\zeta_{j}^\text{PA}} & \rho_{\zeta_{j}^\text{PA}\alpha_{j}^\text{PA}} & 1 & \rho_{\zeta_{j}^\text{PA}\gamma_{j}^\text{PA}} \\ \rho_{\alpha_{j}^\text{NA}\gamma_{j}^\text{PA}} & \rho_{\zeta_{j}^\text{NA}\gamma_{j}^\text{PA}} & \rho_{\gamma_{j}^\text{NA}\gamma_{j}^\text{PA}} & \rho_{\gamma_{j}^\text{PA}\alpha_{j}^\text{PA}} & \rho_{\gamma_{j}^\text{PA}\zeta_{j}^\text{PA}} & 1 \end{bmatrix} \\ \\ \alpha_0^\text{NA} & \sim \operatorname{student\_t}(3,\ y^\text{NA}_{mean},\ {y^\text{NA}_{sd}},\ autoscale= TRUE) \\ \alpha_0^\text{PA} & \sim \operatorname{student\_t}(3,\ y^\text{PA}_{mean},\ {y^\text{PA}_{sd}},\ autoscale= TRUE) \\ \zeta_0^\text{NA} & \sim \operatorname{student\_t}(3,\ 0, 3.5,\ autoscale= FALSE) \\ \zeta_0^\text{PA} & \sim \operatorname{student\_t}(3,\ 0, 3.5,\ autoscale= FALSE) \\ \gamma_0^\text{NA} & \sim \operatorname{student\_t}(3,\ 0, 1.5,\ autoscale= FALSE) \\ \gamma_0^\text{PA} & \sim \operatorname{student\_t}(3,\ 0, 1.5,\ autoscale= FALSE) \\ \beta_1^\text{NA} \text{,..,} \beta_r^\text{NA} & \sim \operatorname{student\_t}(3,\ 0, \ \mathbf {X^\text{NA}_{scaled}},\ autoscale= TRUE) \\ \beta_1^\text{PA} \text{,..,} \beta_r^\text{PA} & \sim \operatorname{student\_t}(3,\ 0, \ \mathbf {X^\text{PA}_{scaled}},\ autoscale= TRUE) \\ \alpha_j^\text{NA} & \sim \operatorname{student\_t_{Half}}(3,\ {0},\ {y^\text{NA}_{sd}},\ autoscale= TRUE) \\ \alpha_j^\text{PA} & \sim \operatorname{student\_t_{Half}}(3,\ {0},\ {y^\text{PA}_{sd}},\ autoscale= TRUE) \\ \zeta_j^\text{NA} & \sim \operatorname{student\_t_{Half}}(3,\ {0},\ {2},\ autoscale= FALSE) \\ \zeta_j^\text{PA} & \sim \operatorname{student\_t_{Half}}(3,\ {0},\ {2},\ autoscale= FALSE) \\ \gamma_j^\text{NA} & \sim \operatorname{student\_t_{Half}}(3,\ {0},\ {0.5},\ autoscale= FALSE) \\ \gamma_j^\text{PA} & \sim \operatorname{student\_t_{Half}}(3,\ {0},\ {0.5},\ autoscale= FALSE) \\ \sigma_{ij}^\text{NA} & \sim \operatorname{exponential}({y^\text{NA}_{sd}}, \ autoscale = TRUE) \\ \sigma_{ij}^\text{PA} & \sim \operatorname{exponential}({y^\text{PA}_{sd}}, \ autoscale = TRUE) \\ \mathbf R_W & \sim \operatorname{LKJ}(1) \\ \mathbf R_{ID} & \sim \operatorname{LKJ}(1), \end{aligned} (\#eq:5) \end{equation}\]

where the \(\text{NA}\) and \(\text{PA}\) superscripts indicate which outcome each parameter belongs to. This is a straightforward multivariate generalization of the previous model @ref(eq:4). At the individual level, we have six parameters varying across individuals, resulting in a \(6 \times 6\) \(\mathbf S_{ID}\) matrix and a \(6 \times 6\) \(\mathbf R_{ID}\) matrix. The within-individual variability is captured by the residual parameters, which include a \(2 \times 2\) \(\mathbf S_{W}\) matrix and a \(2 \times 2\) \(\mathbf R_{W}\) matrix. The priors described above for the [Univariate model specification] are applied to each outcome. The prior on the residual correlation between outcomes is the 'LKJ' prior as described earlier (see [Univariate model specification]).
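The within-individual residual block follows the same decomposition as the random-effects covariance; with assumed residual standard deviations for the two outcomes and a cross-outcome residual correlation (illustrative values only):

```python
import numpy as np

# Illustrative residual sds for outcomes NA and PA and their correlation.
sd_na, sd_pa, rho = 2.0, 1.5, 0.4
S_W = np.diag([sd_na, sd_pa])
R_W = np.array([[1.0, rho],
                [rho, 1.0]])

Sigma_residual = S_W @ R_W @ S_W   # 2 x 2 within-individual covariance
```

The off-diagonal element, \(\sigma^\text{NA}\sigma^\text{PA}\rho\), is the residual covariance that a multivariate fit estimates but two separate univariate fits would ignore.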

Bayes, T. (1763). LII. An essay towards solving a problem in the
doctrine of chances. By the late Rev. Mr. Bayes, F.R.S., communicated by
Mr. Price, in a letter to John Canton, A.M.F.R.S. *Philosophical
Transactions of the Royal Society of London*, *53*, 370–418.

Beath, K. J. (2007). Infant growth modelling using a shape invariant
model with random effects. *Statistics in Medicine*,
*26*(12), 2547–2564. https://doi.org/10.1002/sim.2718

Bland, J. M., & Altman, D. G. (1998). Statistics notes: Bayesians
and frequentists. *BMJ : British Medical Journal*,
*317*(7166), 1151. https://doi.org/10.1136/bmj.317.7166.1151

Bogin, B. (2010). *Evolution of human growth* (M. P. Muehlenbein,
Ed.; pp. 379–395). Cambridge University Press. https://doi.org/10.1017/CBO9780511781193.028

Busscher, I., Kingma, I., Bruin, R. de, Wapstra, F. H., Verkerke, G. J.,
& Veldhuizen, A. G. (2012). Predicting the peak growth velocity in
the individual child: Validation of a new growth model. *European
Spine Journal : Official Publication of the European Spine Society, the
European Spinal Deformity Society, and the European Section of the
Cervical Spine Research Society*, *21*(1), 71–76. https://doi.org/10.1007/s00586-011-1845-z

Cameron, N., & Bogin, B. (2012a). *Human growth and
development* (2nd ed.). Academic Press. https://books.google.co.in/books?id=V8v3yD9--x8C

Cameron, N., & Bogin, B. (2012b). *Human growth and
development* (2nd ed.). Academic Press. https://books.google.co.in/books?id=V8v3yD9--x8C

Cole, T. (2022). *Sitar: Super imposition by translation and rotation
growth curve analysis*. https://CRAN.R-project.org/package=sitar

Cole, T. J., Donaldson, M. D. C., & Ben-Shlomo, Y. (2010).
SITAR—a useful instrument for growth curve analysis.
*International Journal of Epidemiology*, *39*(6),
1558–1566. https://doi.org/10.1093/ije/dyq115

Cole, T. J., & Mori, H. (2018). Fifty years of child height and
weight in Japan and South Korea: Contrasting secular trend patterns
analyzed by SITAR. *American Journal of Human Biology*,
*30*(1), e23054. https://doi.org/10.1002/ajhb.23054

Dean, J. A., Jones, J. E., & Vinson, L. A. W. (2016). *McDonald
and Avery's dentistry for the child and adolescent* (10th ed.).
Elsevier Health Sciences. https://books.google.co.uk/books?id=HqtcCgAAQBAJ

Gelman, A., Lee, D., & Guo, J. (2015). Stan: A probabilistic
programming language for Bayesian inference and optimization.
*Journal of Educational and Behavioral Statistics*,
*40*(5), 530–543. https://doi.org/10.3102/1076998615606113

Hamra, G., MacLehose, R., & Richardson, D. (2013). Markov chain
Monte Carlo: An introduction for epidemiologists. *International
Journal of Epidemiology*, *42*(2), 627–634. https://doi.org/10.1093/ije/dyt043

Hauspie, R. C., Cameron, N., & Molinari, L. (2004a). *Methods in
human growth research*. Cambridge University Press.

Hauspie, R. C., Cameron, N., & Molinari, L. (2004b). *Methods in
human growth research*. Cambridge University Press.

Hoffman, M. D., & Gelman, A. (2011). *The No-U-Turn Sampler:
Adaptively setting path lengths in Hamiltonian Monte Carlo*. https://doi.org/10.48550/ARXIV.1111.4246

Lindstrom, M. J. (1995). Self-modelling with random shift and scale
parameters and a free-knot spline shape function. *Statistics in
Medicine*, *14*(18), 2009–2021. https://doi.org/https://doi.org/10.1002/sim.4780141807

Lunn, D. J., Thomas, A., Best, N., & Spiegelhalter, D. (2000).
WinBUGS - A Bayesian modelling framework: Concepts, structure, and
extensibility. *Statistics and Computing*, *10*(4),
325–337. https://doi.org/10.1023/A:1008929526011

Mansukoski, L., Johnson, W., Brooke‐Wavell, K., Galvez‐Sobral, J. A.,
Furlan, L., Cole, T. J., & Bogin, B. (2019). Life course
associations of height, weight, fatness, grip strength, and all‐cause
mortality for high socioeconomic status Guatemalans.
*American Journal of Human Biology*, *31*(4). https://doi.org/10.1002/ajhb.23253

McArdle, J. J. (2015). *Growth curve analysis* (J. Wright, Ed.;
pp. 441–446). Elsevier. https://doi.org/http://dx.doi.org/10.1016/B978-0-08-097086-8.44030-4

Neal, R. M. (2011). *MCMC Using Hamiltonian Dynamics*. Routledge
Handbooks Online. https://doi.org/10.1201/b10905-7

Nembidzane, C., Lesaoana, M., Monyeki, K. D., Boateng, A., & Makgae,
P. J. (2020). Using the SITAR method to estimate age at peak height
velocity of children in rural South Africa: Ellisras Longitudinal Study.
*Children*, *7*(3), 17. https://doi.org/10.3390/children7030017

Plummer, M. (2003). *JAGS: A program for analysis of Bayesian
graphical models using Gibbs sampling*.

Proffit, W. R. (2014a). *Concepts of growth and development* (W.
R. Proffit, H. W. Fields, & D. M. Sarver, Eds.; 5th ed., pp. 20–65).
Elsevier Health Sciences. https://books.google.co.in/books?id=gn8xBwAAQBAJ

Proffit, W. R. (2014b). *Concepts of growth and development* (W.
R. Proffit, H. W. Fields, & D. M. Sarver, Eds.; 5th ed., pp. 20–65).
Elsevier Health Sciences. https://books.google.co.in/books?id=gn8xBwAAQBAJ

Riddell, C. A., Platt, R. W., Bodnar, L. M., & Hutcheon, J. A.
(2017). Classifying gestational weight gain trajectories using the
SITAR growth model. *Paediatric and Perinatal Epidemiology*,
*31*(2), 116–125. https://doi.org/10.1111/ppe.12336

Sanders, J. O., Browne, R. H., McConnell, S. J., Margraf, S. A., Cooney,
T. E., & Finegold, D. N. (2007). Maturity assessment and curve
progression in girls with idiopathic scoliosis. *The Journal of Bone
and Joint Surgery. American Volume*, *89*(1), 64–73. https://doi.org/10.2106/JBJS.F.00067

Sandhu, S. S. (2020). *Analysis of longitudinal jaw growth data to
study sex differences in timing and intensity of the adolescent growth
spurt for normal growth and skeletal discrepancies* [Thesis].

Schoot, R. van de, Kaplan, D., Denissen, J., Asendorpf, J. B., Neyer, F.
J., & Aken, M. A. G. van. (2014). A gentle introduction to Bayesian
analysis: Applications to developmental research.
*Child Development*, *85*(3), 842–860. https://doi.org/10.1111/cdev.12169

Spiegelhalter, D., Thomas, A., Best, N., & Lunn, D. (2007). OpenBUGS
user manual. *Version*, *3*(2), 2007.

Stigler, S. M. (1986). Laplace's 1774 memoir on inverse probability.
*Statistical Science*, *1*(3), 359–363.

Stulp, G., & Barrett, L. (2016). Evolutionary perspectives on human
height variation. *Biological Reviews*, *91*(1), 206–234.
https://doi.org/10.1111/brv.12165