Bayesian mixed models

Theory and especially practice

Andrey Anikin

2023-10-20

Why Bayes

  • Bayesian analysis: probability of model parameters given data and priors

  • Frequentist analysis: (im)probability of data given model

van de Schoot et al. 2017 “A systematic review of Bayesian articles in psychology: The last 25 years”

Advantages (my personal take)

  1. One logic, one method
McElreath 2020, p. 2

Advantages (my personal take)

  2. More flexible and powerful
  • Works better with small samples (meaningful priors improve convergence)

  • More complex models are possible

  • Straightforward to quantify uncertainty of any fitted values / contrasts / predictions

Advantages (my personal take)

  3. More intuitive

I got it!!!

In this talk

  • Generative models

  • Bayesian mixed models with brms

    • How to fit and diagnose

    • How to plot

    • How to calculate effects

    • How to report

  • Useful resources and further directions

Generative models

Generative = “generating data”

Explicitly specify the process that generated observations

Example

A single normally distributed measure, eg human height

“Normal” = Gaussian = bell curve: density ∝ exp(-x^2)

x = seq(-5, 5, length.out = 100)
plot(x, exp(-x ^ 2), type = 'l')

Draw from normal distribution

rnorm()

d = rnorm(n = 100, mean = 0, sd = 1)
hist(d)

head(d)
## [1]  1.90524940 -2.15555377 -0.05026634 -0.04281362  1.00146129  1.15929253

Probability density function (pdf)

dnorm()

quantiles = seq(-3, 3, length.out = 100)
dens = dnorm(x = quantiles, mean = 0, sd = 1)
plot(quantiles, dens, type = 'l')

# d is 100 numbers drawn from a normal distribution
d = rnorm(100, mean = 0, sd = 1)
hist(d, freq = FALSE)
# overlay the theoretical pdf
x = seq(-3, 3, length.out = 100)
points(x, dnorm(x, mean = 0, sd = 1), type = 'l', col = 'blue')

Put on the spot

Guess the mean and SD of this (hypothetical) distribution of the height of adult humans, in cm:

Guess 1

mean = 150, sd = 50

Guess 2

mean = 150, sd = 3

Guess 3

mean = 165, sd = 10

Informally…

  • Pick a generative model

  • Pick parameter values based on prior knowledge

  • Evaluate fit to observed data

  • Good fit? Keep these parameters and try again

  • Bad fit? Discard and try again

Long list of good parameters = posterior distribution = credible parameter values that might have generated our data
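
A crude rejection-sampling sketch of this recipe (all numbers are made up; real analyses use MCMC because this brute-force approach is very inefficient):

set.seed(1)
obs = rnorm(20, mean = 165, sd = 10)       # pretend these are observed heights
n_guess = 50000
m_guess = runif(n_guess, 140, 190)         # prior range for the mean
sd_guess = runif(n_guess, 1, 30)           # prior range for the sd
# log-likelihood: how well does each guess reproduce the data?
ll = sapply(seq_len(n_guess), function(i)
  sum(dnorm(obs, m_guess[i], sd_guess[i], log = TRUE)))
keep = runif(n_guess) < exp(ll - max(ll))  # keep good fits, discard bad ones
hist(m_guess[keep]); hist(sd_guess[keep])  # the "long list of good parameters"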

A bit more formally…

  • Define a process that generated data, here dnorm(mean, sd) –> LIKELIHOOD

  • Define reasonable ranges of our parameters, here mean & sd –> PRIOR

  • Explore fit to data within this “prior” range, analytically or with MCMC

  • Credible values of parameters are those that best reproduce data

And more formally still…

Posterior distribution of model parameters ∝ prior * likelihood

Bayes’ rule: posterior = probability of data * prior / average probability of data

McElreath 2020, Ch. 4.1 - 4.3
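
A grid-approximation sketch of this formula for a single parameter (the mean, with sd fixed at 10; the data and prior are made up):

obs = c(172, 168, 180, 175, 169)
mu = seq(150, 190, by = .1)                            # grid of candidate means
prior = dnorm(mu, mean = 170, sd = 20)                 # prior on the mean
lik = sapply(mu, function(m) prod(dnorm(obs, m, 10)))  # probability of data
post = prior * lik / sum(prior * lik)  # normalize by average probability of data
plot(mu, post, type = 'l')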

Bayesian updating

McElreath 2020, p. 38

How it works in practice

Model: height ~ normal(mean, SD)

library(brms)
df = data.frame(height = d)

mod1 = brm(height ~ 1, data = df)  # brms default priors

MCMC is the answer

MCMC = Markov Chain Monte Carlo

Algorithm(s) for approximating posterior distribution of model parameters

Ex.: if guessing mean/SD in rnorm() for human height, we get a JOINT posterior distribution of:

  • credible values for MEAN given a particular SD

  • credible values for SD given a particular MEAN

Note the “joint”!

mcmc = as_draws_df(mod1)
mcmc
## # A draws_df: 1000 iterations, 4 chains, and 3 variables
##    b_Intercept sigma lp__
## 1          163   9.6 -378
## 2          165  10.5 -378
## 3          165  10.7 -378
## 4          163   9.9 -377
## 5          164  10.7 -378
## 6          164   9.5 -378
## 7          164  10.1 -377
## 8          163  10.2 -377
## 9          163  10.1 -377
## 10         164  10.2 -377
## # ... with 3990 more draws
## # ... hidden reserved variables {'.chain', '.iteration', '.draw'}

plot(mcmc[, 1:2])
## Warning: Dropping 'draws_df' class as required metadata was removed.

Correlated model parameters –> difference in credible intervals ≠ credible interval on the difference
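
A toy illustration with made-up draws: the marginal intervals of two correlated parameters overlap heavily, yet the interval on their difference excludes zero:

set.seed(1)
a = rnorm(4000, mean = 10, sd = 1)
b = a + rnorm(4000, mean = .5, sd = .2)  # b is strongly correlated with a
quantile(a, c(.025, .975))               # roughly [8, 12]
quantile(b, c(.025, .975))               # roughly [8.5, 12.5] -- big overlap
quantile(b - a, c(.025, .975))           # roughly [0.1, 0.9] -- excludes 0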

Values to report

Point estimate: mode / median / mean of posterior distribution

mean(mcmc$b_Intercept)
## [1] 163.2034

“The most credible mean height is 163 cm”

Values to report

Credible interval

quantile(mcmc$b_Intercept, probs = c(.5, .025, .975))
##      50%     2.5%    97.5% 
## 163.2102 161.1560 165.2060

“The most credible mean height is 163 cm, 95% CI [161, 165]”

Values to report

Proportion of posterior distribution that satisfies condition

mean(mcmc$b_Intercept > 160)
## [1] 0.99925

“99.9% of the posterior distribution of mean height is above 160 cm”

Bayesian hypothesis testing

  • Bayes Factors (BF)

  • Region of practical equivalence (ROPE)

Bayes Factor

  • Ratio of prior-weighted likelihoods for two models

  • Very sensitive to priors

  • Same problems as p-values

  • More info:
    Dienes 2014 “Using Bayes to get the most out of non-significant results”
    Dienes 2016 “How Bayes factors change scientific practice”
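
If you do want a BF, brms can compute one with hypothesis(); a sketch reusing the height data from above (the prior here is illustrative; a point-null BF via the Savage-Dickey method requires sample_prior = TRUE):

mod_bf = brm(height ~ 1, data = df, sample_prior = TRUE,
             prior = set_prior("normal(170, 20)", class = 'Intercept'))
hypothesis(mod_bf, 'Intercept = 170')  # point null: Savage-Dickey density ratio
hypothesis(mod_bf, 'Intercept > 160')  # directional: posterior odds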

ROPE

Posterior within ROPE -> no effect

rope = c(.45, .55)
posterior = rnorm(1000, mean = .52, sd = .015)
hist(posterior, xlim = c(0, 1)); abline(v = rope, col = 'blue', lwd = 5)

Proportion of posterior within ROPE

mean(posterior > rope[1] & posterior < rope[2])
## [1] 0.975

Posterior outside ROPE -> effect

rope = c(.45, .55)
posterior = rnorm(1000, mean = .72, sd = .015)
mean(posterior > rope[1] & posterior < rope[2])
## [1] 0
hist(posterior, xlim = c(0, 1)); abline(v = rope, col = 'blue', lwd = 5)

ROPE

Posterior overlaps with ROPE -> not clear

rope = c(.45, .55)
posterior = rnorm(1000, mean = .6, sd = .1)
mean(posterior > rope[1] & posterior < rope[2])
## [1] 0.242
hist(posterior, xlim = c(0, 1)); abline(v = rope, col = 'blue', lwd = 5)

Frequentist hypothesis testing

  • Reject the null (p < .05)

  • Fail to reject the null (p >= .05)

Bayesian hypothesis testing

  • Prove a hypothesis (null / alternative / …)

    • Reject the null

    • Accept the null

    • Any posterior condition (eg mean within specific range like 3 to 5)

  • Not enough evidence either way

Power vs precision

Goal 1 (power): prove H0 (posterior within ROPE) or prove H1 (posterior outside ROPE)

Goal 2 (precision): estimate effect size(s) with some precision, eg ± 0.1

More on ROPE, precision/power

Kruschke & Liddell 2018 “The Bayesian new statistics”, Psychonomic Bulletin & Review, 25, 178–206
Kruschke 2015 “Doing Bayesian Data Analysis”, 2nd ed., ch. 12–13

Bayesian mixed models

Multilevel models with “random” intercepts and/or slopes

How it works: group-level parameters are taken from a single distribution instead of being independent
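
A generative sketch of this idea with made-up parameter values: each group’s intercept is drawn from one shared normal distribution:

set.seed(1)
logistic = function(x) 1 / (1 + exp(-x))
n_groups = 6; n_trials = 15
group_logit = rnorm(n_groups, mean = .5, sd = 1.5)  # one shared distribution
sim = data.frame(
  group = rep(paste0('g', 1:n_groups), each = n_trials),
  success = rbinom(n_groups * n_trials, size = 1,
                   prob = logistic(rep(group_logit, each = n_trials)))
)
aggregate(success ~ group, sim, mean)  # rates differ, but come from one family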

Example 1: ravens in a tool task

ani = read.csv('files_bayes/ravens.csv', stringsAsFactors = TRUE)
summary(ani)
##     raven        trial     success       
##  archie:15   Min.   : 1   Mode :logical  
##  bred  :15   1st Qu.: 4   FALSE:23       
##  k5    :15   Median : 8   TRUE :40       
##  lola  :15   Mean   : 8   NA's :27       
##  mandy :15   3rd Qu.:12                  
##  mum   :15   Max.   :15

Clean the dataset

ani = na.omit(ani)
head(ani)
##    raven trial success
## 2   lola     2   FALSE
## 3   lola     3   FALSE
## 4   lola     4   FALSE
## 9   lola     9   FALSE
## 10  lola    10   FALSE
## 11  lola    11   FALSE

Intercept-only model

mod0 = glm(success ~ 1, data = ani, family = 'binomial')
summary(mod0)
## 
## Call:
## glm(formula = success ~ 1, family = "binomial", data = ani)
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)   0.5534     0.2617   2.115   0.0345 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 82.692  on 62  degrees of freedom
## Residual deviance: 82.692  on 62  degrees of freedom
## AIC: 84.692
## 
## Number of Fisher Scoring iterations: 4

Average success rate across all ravens

logistic = function(x) 1 / (1 + exp(-x))
logistic(summary(mod0)$coef[1, 1])
## [1] 0.6349206
mean(ani$success, na.rm = TRUE)
## [1] 0.6349206

Thus: we are just reproducing the sample mean

Success rate per animal

table(ani$success, ani$raven)
##        
##         archie bred k5 lola mandy mum
##   FALSE      0    0  6    7     7   3
##   TRUE       7   12  8    0     6   7

Model with animal as “fixed” effect

mod1 = glm(success ~ raven, data = ani, family = 'binomial')
summary(mod1)
## 
## Call:
## glm(formula = success ~ raven, family = "binomial", data = ani)
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept)  1.957e+01  4.065e+03   0.005    0.996
## ravenbred    1.557e-11  5.115e+03   0.000    1.000
## ravenk5     -1.928e+01  4.065e+03  -0.005    0.996
## ravenlola   -3.913e+01  5.748e+03  -0.007    0.995
## ravenmandy  -1.972e+01  4.065e+03  -0.005    0.996
## ravenmum    -1.872e+01  4.065e+03  -0.005    0.996
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 82.692  on 62  degrees of freedom
## Residual deviance: 49.284  on 57  degrees of freedom
## AIC: 61.284
## 
## Number of Fisher Scoring iterations: 18

Predict average accuracy per animal

NB: there is no such thing as an “average raven” in this model

pr = predict(
  mod1, 
  newdata = data.frame(raven = unique(ani$raven)),  # for each raven
  type = 'response'  # probability of success
)  
round(pr * 100)
##   1   2   3   4   5   6 
##   0 100 100  57  70  46

Again, just sample means

aggregate(success ~ raven, ani, mean)
##    raven   success
## 1 archie 1.0000000
## 2   bred 1.0000000
## 3     k5 0.5714286
## 4   lola 0.0000000
## 5  mandy 0.4615385
## 6    mum 0.7000000

Bayesian with default (flat) priors

mod_brm0 = brm(success ~ 1 + raven, data = ani, 
               family = 'bernoulli', chains = 4, cores = 4,
               file = 'files_bayes/mod_noPrior.RDS')

Check model predictions

conditional_effects(mod_brm0)

Thus: same rubbish as glm()

So what’s the problem?

Each raven is treated as completely unique

  • Doesn’t generalize to unobserved participants

  • Doesn’t work for small clusters (eg few trials from some participants)

  • Can’t deal with floor/ceiling effects in some clusters (eg 100% accuracy)

Check convergence

summary(mod_brm0)
## Warning: Parts of the model have not converged (some Rhats are > 1.05). Be
## careful when analysing the results! We recommend running more iterations and/or
## setting stronger priors.
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: success ~ 1 + raven 
##    Data: ani (Number of observations: 63) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Population-Level Effects: 
##             Estimate Est.Error  l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept    2037.00   1884.67     23.15  5416.64 2.30        5       29
## ravenbred   12941.26   7679.48    511.46 28250.36 1.99        6       13
## ravenk5     -2036.69   1884.68  -5415.89   -22.55 2.30        5       29
## ravenlola  -29746.51  13291.50 -53152.14 -5801.50 2.17        5       29
## ravenmandy  -2037.17   1884.67  -5416.69   -23.40 2.30        5       29
## ravenmum    -2036.03   1884.68  -5415.51   -21.64 2.29        5       29
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Check convergence

plot(mod_brm0, variable = c('b_Intercept', 'b_ravenbred', 'b_ravenk5'))

Fix 1: set reasonable priors

par(mfrow = c(1, 3))
intercept1 = rnorm(1000, 0, .5)
hist(logistic(intercept1), main = '')

intercept2 = rnorm(1000, 0, 2)
hist(logistic(intercept2), main = '')

intercept3 = rnorm(1000, 0, 5)
hist(logistic(intercept3), main = '')

Priors on difference between ravens

par(mfrow = c(1, 3))
beta_raven1 = rnorm(1000, 0, 100)
hist(logistic(intercept2 + beta_raven1), main = '')

beta_raven2 = rnorm(1000, 0, 3)
hist(logistic(intercept2 + beta_raven2), main = '')

beta_raven3 = rnorm(1000, 3, .01)
hist(logistic(intercept2 + beta_raven3), main = '')
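
An alternative to simulating by hand: refit with sample_prior = 'only' and let brms draw from the prior (a sketch using the priors we pick for the model below):

prior_check = brm(success ~ 1 + raven, data = ani, family = 'bernoulli',
                  prior = c(set_prior("normal(0, 2)", 'Intercept'),
                            set_prior("normal(0, 1)", 'b')),
                  sample_prior = 'only', chains = 1)
pp_check(prior_check, type = 'bars')  # prior predictive check: data ignored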

Check names of priors

get_prior(success ~ 1 + raven, data = ani, family = 'bernoulli')
##                 prior     class       coef group resp dpar nlpar lb ub
##                (flat)         b                                       
##                (flat)         b  ravenbred                            
##                (flat)         b    ravenk5                            
##                (flat)         b  ravenlola                            
##                (flat)         b ravenmandy                            
##                (flat)         b   ravenmum                            
##  student_t(3, 0, 2.5) Intercept                                       
##        source
##       default
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##  (vectorized)
##       default

Fit model with our priors

library(brms)
prior = c(
  set_prior("normal(0, 2)", 'Intercept'),  # our "intercept2"
  set_prior("normal(0, 1)", 'b')           # a bit tighter than our "beta_raven2"
)
mod_brm1 = brm(success ~ 1 + raven, data = ani, family = 'bernoulli', 
            prior = prior, chains = 4, cores = 4,
            file = 'files_bayes/mod_fixed.RDS')

Check convergence

summary(mod_brm1)
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: success ~ 1 + raven 
##    Data: ani (Number of observations: 63) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Population-Level Effects: 
##            Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept      2.15      0.95     0.45     4.21 1.00     1720     1703
## ravenbred      2.54      1.88    -0.70     6.67 1.00     2048     2253
## ravenk5       -1.79      1.07    -4.03     0.16 1.00     1779     2154
## ravenlola     -5.30      1.67    -8.86    -2.43 1.00     2085     2290
## ravenmandy    -2.26      1.08    -4.51    -0.26 1.00     1926     1957
## ravenmum      -1.15      1.16    -3.55     1.00 1.00     2042     1917
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Check convergence

plot(mod_brm1, variable = c('b_Intercept', 'b_ravenbred', 'b_ravenk5'))

Check model predictions

conditional_effects(mod_brm1)

Compare to observed

table(ani$success, ani$raven)
##        
##         archie bred k5 lola mandy mum
##   FALSE      0    0  6    7     7   3
##   TRUE       7   12  8    0     6   7

Note: Lola has 0/7 successes, yet the predicted accuracy is between 0 and 30%

Why? Because our prior expects only limited difference between animals (a raven is a raven)

SHRINKAGE

An even better fix: mixed models

  • Assume that success rates of ravens belong to a single distribution

  • Learn the parameters of this distribution from the data (adaptive prior)

success ~ bernoulli(mu[raven]) –> LIKELIHOOD

logit(mu) ~ normal(mean, sd) –> adaptive PRIOR

mean and sd get their own priors (HYPERPRIORS) and are estimated from the distribution across ravens

Adaptive prior

Success rates of all ravens define prior expectations for each raven

Fit this model with default (flat’ish) hyper-priors

mod_brm2 = brm(success ~ 1 + (1|raven), data = ani, 
               family = 'bernoulli', chains = 4, cores = 4,
               file = 'files_bayes/mod_re.RDS')
# compare: mod_brm1 = brm(success ~ 1 + raven, data = ani, family = 'bernoulli', ...)

Check convergence

summary(mod_brm2)
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: success ~ 1 + (1 | raven) 
##    Data: ani (Number of observations: 63) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Group-Level Effects: 
## ~raven (Number of levels: 6) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     3.61      1.74     1.32     8.13 1.00     1099     1074
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept     0.79      1.40    -1.96     3.67 1.00      863     1000
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Check convergence

plot(mod_brm2)

Predicted accuracy per raven

newdata = data.frame(raven = levels(ani$raven))
fit = fitted(mod_brm2, newdata = newdata)
colnames(fit) = c('fit', 'se', 'lwr', 'upr')
pl = cbind(newdata, fit)
pl$obs = aggregate(success ~ raven, ani, mean)[, 2]

Predicted accuracy per raven

library(ggplot2)
ggplot(pl, aes(raven, fit, ymin = lwr, ymax = upr)) + geom_pointrange() +
  geom_point(aes(raven, obs), shape = 4, color = 'blue', size = 3)

Shrinkage

Group-level intercepts “shrink” to global mean

Amount of shrinkage depends on:

  • Group size: less data –> more shrinkage

  • Distance from global mean: extreme values –> more shrinkage

  • Similarity between clusters: most clusters similar –> more shrinkage

Thus: predicted values ≠ observed values in mixed models

Predicted accuracy for a “typical” raven

View posterior distribution of predicted success rate:

fit = fitted(mod_brm2, newdata = NA, re_formula = NA, summary = FALSE)  # re_formula = NA drops group-level effects
hist(fit)

Summarize

Asymmetric distribution –> mode ≠ median ≠ mean

mean(fit)
## [1] 0.6419514
quantile(fit, probs = c(.5, .025, .975))
##       50%      2.5%     97.5% 
## 0.6759189 0.1236150 0.9752390
# or: logistic(summary(mod_brm2)$fixed[, c(1, 3, 4)]) 
# Compare to observed global mean success rate:
mean(ani$success, na.rm = TRUE)
## [1] 0.6349206

Contrasts

  • Compare groups / conditions / ravens…

  • Method: extract posterior distribution OF THE CONTRAST

Contrasts between ravens

table(ani$raven, ani$success)
##         
##          FALSE TRUE
##   archie     0    7
##   bred       0   12
##   k5         6    8
##   lola       7    0
##   mandy      7    6
##   mum        3    7

bred > lola?

fit_bredLola = fitted(mod_brm2, 
  newdata = data.frame(raven = c('bred', 'lola')), 
  summary = FALSE)
head(fit_bredLola)
##           [,1]       [,2]
## [1,] 0.9990973 0.46316016
## [2,] 0.9992929 0.02048516
## [3,] 0.9931419 0.01229632
## [4,] 0.9973438 0.10188524
## [5,] 0.9999752 0.09040533
## [6,] 0.9999998 0.03309967

bred > lola?

d_bredLola = fit_bredLola[, 1] - fit_bredLola[, 2]
hist(d_bredLola)

bred > lola?

quantile(d_bredLola, probs = c(.5, .025, .975))
##       50%      2.5%     97.5% 
## 0.9290251 0.5771936 0.9996498

bred > lola?

mean(d_bredLola > 0)
## [1] 1

Thus: 100% of the posterior of the “bred vs. lola” contrast is positive. The most credible difference is 92.9%, 95% CI [57.7, 99.9].

mum > mandy?

fit_mumMandy = fitted(mod_brm2, 
  newdata = data.frame(raven = c('mum', 'mandy')), 
  summary = FALSE)
d_mumMandy = fit_mumMandy[, 1] - fit_mumMandy[, 2]
hist(d_mumMandy)

mum > mandy?

quantile(d_mumMandy, probs = c(.5, .025, .975))
##        50%       2.5%      97.5% 
##  0.2372693 -0.1533093  0.5679236
mean(d_mumMandy > 0)
## [1] 0.8865

Thus: quite uncertain. Only 89% of the posterior of the “mum vs. mandy” contrast is positive. The most credible difference is 23.7%, 95% CI [-15.3, 56.8].

Summary: mixed models

  • Nested data with multiple observations per subject / stimulus / family group / …

  • Group-level intercepts belong to a single distribution

  • Model with lme4::lmer() / lme4::glmer() or with brm()

  • Formula: y ~ fixed effects + (random effects), eg y ~ x + (x|group)

Summary: brm()

  • Fit model brm(y ~ x + (x|group)), with reasonable priors if possible

  • Check convergence plot(mod), posterior predictive checks pp_check(mod)

  • Get fitted values for all effects of interest –> plot with conditional_effects(mod)

  • Get posterior for comparisons of interest with fitted(mod), summarize –> plot / text
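
The whole checklist in one sketch, adding trial as a (hypothetical) fixed effect to the raven data:

mod = brm(success ~ trial + (1 | raven), data = ani, family = 'bernoulli',
          chains = 4, cores = 4)
plot(mod)                        # convergence: trace plots + posteriors
pp_check(mod, type = 'bars')     # posterior predictive check
conditional_effects(mod)         # plot the fitted effect of trial
draws = fitted(mod, summary = FALSE,
               newdata = data.frame(raven = c('bred', 'lola'), trial = 8))
quantile(draws[, 1] - draws[, 2], probs = c(.5, .025, .975))  # contrast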

Advantages of multilevel models

  • Pool info across groups (ravens, participants, schools, …)

  • Generalize better to unobserved groups

  • Better estimates for small clusters (eg few trials from some participants)

  • Deal with floor/ceiling effects in some clusters (eg 100% accuracy)

  • Predictions at both group and population levels (eg for observed raven, for new ravens, or for a “typical” raven)

Useful resources and further directions

There’s more to brm()…

  • Other families ?brmsfamily

  • Priors

  • Multiple outcomes (multivariate regression)

  • Splines

  • Autocorrelated residuals

  • Imputed missing data

  • Measurement error

Other brm() families: “categorical”

  • Outcomes with >2 categories

  • Generalization of logistic regression to multiple possible outcomes

Ex.: predict for which party someone votes from sex, age, income, education, …
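
A minimal sketch with simulated data (the variables are made up):

set.seed(1)
votes = data.frame(party = factor(sample(c('red', 'blue', 'green'),
                                         200, replace = TRUE)),
                   age = runif(200, 18, 90))
mod_cat = brm(party ~ age, data = votes, family = categorical())
conditional_effects(mod_cat, categorical = TRUE)  # P(party) as a function of age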

Other brm() families: “cumulative”

  • Ordinal outcomes like Likert scales

  • Like multinomial regression, but for ordered outcomes

Ex.: “How much do you like this picture?” 1…5
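
A minimal sketch with simulated Likert ratings:

set.seed(1)
likert = data.frame(rating = sample(1:5, 300, replace = TRUE,
                                    prob = c(.1, .2, .3, .25, .15)),
                    group = rep(c('a', 'b'), each = 150))
mod_ord = brm(rating ~ group, data = likert, family = cumulative())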

Other brm() families: “zero_one_inflated_beta”

  • Beta for proportions, eg ratings (0, 1)

  • Zero-one-inflated to include 0 and 1

  • Good for continuous ratings on a Visual Analog Scale (VAS)
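
A minimal sketch with simulated VAS ratings that include exact 0s and 1s:

set.seed(1)
vas = data.frame(rating = c(rbeta(180, 2, 2), rep(0, 10), rep(1, 10)))
mod_zoib = brm(rating ~ 1, data = vas, family = zero_one_inflated_beta())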

Multivariate regression

  • Model several outcomes simultaneously

  • Residuals of the different outcomes can be modeled as correlated

  • Saves typing

Ex.: brm(mvbind(height, weight) ~ age, …)
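
A sketch with simulated data; set_rescor(TRUE) asks brms to model the residual correlation between the two outcomes:

set.seed(1)
growth = data.frame(age = runif(100, 5, 18))
growth$height = 100 + 4.0 * growth$age + rnorm(100, 0, 5)
growth$weight =  15 + 2.5 * growth$age + rnorm(100, 0, 4)
mod_mv = brm(bf(mvbind(height, weight) ~ age) + set_rescor(TRUE), data = growth)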

Splines (smooth regression)

  • Model nonlinear relationships flexibly

  • “Wiggliness” determined automatically from the data

  • Formula: s(predictor), s(predictor, by = group), etc.

  • See ?brms::s
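
A sketch with simulated nonlinear data:

set.seed(1)
sm = data.frame(x = runif(200, 0, 10))
sm$y = sin(sm$x) + rnorm(200, 0, .3)
mod_s = brm(y ~ s(x), data = sm)
conditional_effects(mod_s)  # should recover the sine wave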

Example of Bayesian smooth regression

Amorim, M., Anikin, A., Mendes, A., Kotz, S., Lima, C., & Pinheiro, A. (2021). Changes in vocal emotion recognition across the life span. Emotion 21(2), 315-325.

Example of 2D splines

Oliva, M. & Anikin, A. (2018). Pupil dilation reflects the time course of emotion recognition in human vocalizations. Scientific Reports, 8(1), 4871.

Missing data
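
brm() can estimate missing predictor values during fitting with mi() terms (see ?brms::mi). A minimal sketch with simulated data in which x is partly missing:

set.seed(1)
d_mi = data.frame(x = rnorm(100))
d_mi$y = .5 * d_mi$x + rnorm(100)
d_mi$x[sample(100, 20)] = NA   # 20 missing predictor values
mod_mi = brm(bf(y ~ mi(x)) + bf(x | mi() ~ 1) + set_rescor(FALSE), data = d_mi)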

Autocorrelated time series

  • Autocorrelated residuals: value at time t predicts values at time (t+1)

  • Ex.: pupil size changes over time, event-related potentials, fMRI signal, skin conductance

  • Solution: use autoregressive / autocorrelated / moving-average models

  • Formula: brm(y ~ x + arma(p = 1, q = 1))

  • See ?brms::autocor-terms
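
A minimal sketch with a simulated AR(1) time series (ar() is the purely autoregressive special case of arma()):

set.seed(1)
tsd = data.frame(time = 1:200)
tsd$y = .02 * tsd$time + as.numeric(arima.sim(list(ar = .7), n = 200))
mod_ar = brm(y ~ time + ar(time = time, p = 1), data = tsd)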

Learn more about Bayes

McElreath 2020 “Statistical Rethinking” (Library: 2nd ed. in paper form, 1st ed. as eBook)

McElreath lecture videos 2022: https://www.youtube.com/playlist?list=PLDcUM9US4XdMROZ57-OIRtIK0aOynbgZN

Code in brms/tidyverse: https://bookdown.org/content/4857/

Learn more about Bayes

Kruschke & Liddell 2018 “The Bayesian New Statistics: Hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective”

Kruschke 2015 “Doing Bayesian Data Analysis”, 2nd ed.