diff --git a/.Rbuildignore b/.Rbuildignore index fe38dba..350f3f8 100644 --- a/.Rbuildignore +++ b/.Rbuildignore @@ -31,5 +31,7 @@ ^vignettes/Recurring_Events.Rmd$ ^vignettes/Super_Learning.Rmd$ ^vignettes/Time_Varying_Effects.Rmd$ +^vignettes/Causal_Effects.Rmd$ + diff --git a/.github/workflows/R-CMD-check.yml b/.github/workflows/R-CMD-check.yml index 44ca7d0..014ba73 100644 --- a/.github/workflows/R-CMD-check.yml +++ b/.github/workflows/R-CMD-check.yml @@ -21,7 +21,6 @@ jobs: config: - {os: macOS-latest, r: 'release'} - {os: windows-latest, r: 'release'} - - {os: ubuntu-latest, r: 'devel', http-user-agent: 'release'} - {os: ubuntu-latest, r: 'release'} env: diff --git a/docs/404.html b/docs/404.html index 230dede..a36db3f 100644 --- a/docs/404.html +++ b/docs/404.html @@ -52,7 +52,7 @@ JMbayes2 - 0.4-6 + 0.4-8 @@ -72,6 +72,9 @@
vignettes/Causal_Effects.Rmd
+ Causal_Effects.Rmd
We will illustrate the calculation of causal effects from joint
+models using the PBC dataset for the longitudinal outcome
+serBilir
and the composite event transplantation or death.
+We start by fitting a joint model to the data. In the longitudinal
+submodel, we specify nonlinear subject-specific trajectories using
+natural cubic splines. In the fixed-effects part, we also include the
+treatment effect and its interaction with time. In the survival
+submodel, we only include the treatment effect.
+pbc2.id$status2 <- as.numeric(pbc2.id$status != "alive")
+lmeFit <- lme(log(serBilir) ~ ns(year, 3, B = c(0, 14.4)) * drug,
+ data = pbc2, random = ~ ns(year, 3, B = c(0, 14.4)) | id,
+ control = lmeControl(opt = "optim"))
+CoxFit <- coxph(Surv(years, status2) ~ drug, data = pbc2.id)
+jmFit <- jm(CoxFit, lmeFit, time_var = "year")
+summary(jmFit)
+#>
+#> Call:
+#> jm(Surv_object = CoxFit, Mixed_objects = lmeFit, time_var = "year")
+#>
+#> Data Descriptives:
+#> Number of Groups: 312 Number of events: 169 (54.2%)
+#> Number of Observations:
+#> log(serBilir): 1945
+#>
+#> DIC WAIC LPML
+#> marginal 4217.836 4527.502 -2937.053
+#> conditional 6503.006 6206.639 -3454.250
+#>
+#> Random-effects covariance matrix:
+#>
+#> StdDev Corr
+#> (Intr) 0.9958 (Intr) n(,3,B=c(0,14.4))1 n(,3,B=c(0,14.4))2
+#> n(,3,B=c(0,14.4))1 1.5005 0.2219
+#> n(,3,B=c(0,14.4))2 1.6970 0.4318 0.7505
+#> n(,3,B=c(0,14.4))3 1.8995 0.4247 0.2076 0.6798
+#>
+#> Survival Outcome:
+#> Mean StDev 2.5% 97.5% P Rhat
+#> drugD-penicil -0.0138 0.2061 -0.4079 0.3859 0.9533 1.0037
+#> value(log(serBilir)) 1.3001 0.0881 1.1354 1.4741 0.0000 1.0109
+#>
+#> Longitudinal Outcome: log(serBilir) (family = gaussian, link = identity)
+#> Mean StDev 2.5% 97.5% P Rhat
+#> (Intercept) 0.5863 0.0808 0.4309 0.7430 0.0000 1.0007
+#> ns(,3,B=c(0,14.4))1 1.1251 0.1719 0.7905 1.4659 0.0000 1.0058
+#> ns(,3,B=c(0,14.4))2 2.1954 0.2329 1.7749 2.6913 0.0000 1.0201
+#> ns(,3,B=c(0,14.4))3 2.4177 0.3416 1.8350 3.1714 0.0000 1.0342
+#> drugD-penicil -0.1061 0.1163 -0.3386 0.1217 0.3633 1.0017
+#> n(,3,B=c(0,14.4))1: 0.1869 0.2369 -0.2802 0.6490 0.4218 1.0055
+#> n(,3,B=c(0,14.4))2: -0.4776 0.3224 -1.1454 0.1111 0.1211 1.0175
+#> n(,3,B=c(0,14.4))3: -0.6994 0.4770 -1.7031 0.1481 0.1233 1.0384
+#> sigma 0.2894 0.0065 0.2776 0.3027 0.0000 1.0061
+#>
+#> MCMC summary:
+#> chains: 3
+#> iterations per chain: 3500
+#> burn-in per chain: 500
+#> thinning: 1
+#> time: 20 sec
The coefficient for drugD-penicil
for the survival
+outcome in the output produced by the summary()
method
+denotes the residual/direct effect of treatment on the risk of the
+composite event. It does not include the effect of treatment that
+follows via the serum bilirubin pathway.
We will illustrate the calculation of causal risk differences for the +group of patients that have the same distribution of serum bilirubin +values as Patient 2:
+
+xyplot(log(serBilir) ~ year, data = pbc2, subset = id == 2, type = "b",
+ xlab = "Follow-up time (years)", ylab = "log{serum bilirubin (mg/dL)}",
+ main = "Patient 2")
We calculate the risk difference for the composite event between the
+active treatment D-penicillamine and placebo at the horizon time
+t_horiz = 6
using the longitudinal data up to year
+t0 = 4
. To achieve this, we create a dataset with this
+patient’s data. This patient received the active treatment
+D-penicillamine; hence, we also create a version of her data with the
+drug
variable set to placebo
:
+t0 <- 4
+t_horiz <- 6
+dataP2_Dpenici <- pbc2[pbc2$id == 2 & pbc2$year <= t0, ]
+dataP2_Dpenici$years <- t0
+dataP2_Dpenici$status2 <- 0
+
+dataP2_placebo <- dataP2_Dpenici
+dataP2_placebo$drug <- factor("placebo", levels = levels(pbc2$drug))
Note that in the dataP2_placebo
dataset, we need to
+specify that drug
is a factor with two levels. We also
+specify that the last time point we know the patient was still
+event-free was t0
.
We estimate the cumulative risk for the composite event at
+t_horiz
under the active treatment arm using the
+predict()
method:
+Pr1 <- predict(jmFit, newdata = dataP2_Dpenici, process = "event",
+ times = t_horiz, return_mcmc = TRUE)
We have set the argument return_mcmc
to
+TRUE
to enable the calculation of a credible interval that
+accounts for the MCMC uncertainty. We produce the same estimate under
+the placebo arm:
+Pr0 <- predict(jmFit, newdata = dataP2_placebo, process = "event",
+ times = t_horiz, return_mcmc = TRUE)
The estimated risk difference and its 95% credible interval are
+calculated by the corresponding elements of the Pr1
and
+Pr0
objects, i.e.,
+# estimate
+Pr1$pred[2L] - Pr0$pred[2L]
+#> [1] 0.002916962
+
+# MCMC variability
+quantile(Pr1$mcmc[2L, ] - Pr0$mcmc[2L, ], probs = c(0.025, 0.975))
+#> 2.5% 97.5%
+#> -0.1790865 0.1952537
An extended example with time-varying treatments / intermediate +events that showcases a calculation of the variance of the causal +effects that includes the sampling variability is available here.
+vignettes/Competing_Risks.Rmd
Competing_Risks.Rmd
vignettes/Dynamic_Predictions.Rmd
Dynamic_Predictions.Rmd
vignettes/JMbayes2.Rmd
JMbayes2.Rmd
The output of the summary()
method provides some
descriptive statistics of the sample at hand, followed by some fit
statistics based on the marginal (random effects are integrated out
@@ -427,7 +430,7 @@
As seen above, the functional_forms
argument is a named
list with elements corresponding to the longitudinal outcomes. If a
longitudinal outcome is not specified in this list, then the default,
diff --git a/docs/articles/Multi_State_Processes.html b/docs/articles/Multi_State_Processes.html
index 3b2ca8c..5ae9074 100644
--- a/docs/articles/Multi_State_Processes.html
+++ b/docs/articles/Multi_State_Processes.html
@@ -53,7 +53,7 @@
JMbayes2
- 0.4-6
+ 0.4-8
@@ -73,6 +73,9 @@
vignettes/Multi_State_Processes.Rmd
Multi_State_Processes.Rmd
vignettes/Non_Gaussian_Mixed_Models.Rmd
Non_Gaussian_Mixed_Models.Rmd
vignettes/Recurring_Events.Rmd
Recurring_Events.Rmd
+Figure 1 Visual representation of an hazard function under the gap or calendar timescale. During the folow-up the subject experienced three events. - +
The model also accommodates discontinuous risk intervals, i.e., periods in which the subject is not at risk of experiencing the recurring event (Figure 2). For example, while a patient is in the hospital, they are not at risk of being hospitalized again.
+Figure 2 Visual representation of an hazard function under the gap or calendar timescale, while accounting for non-risk periods (gray areas). During the folow-up the subject experienced three events. - +
A joint model with \(p\) normally distributed longitudinal markers, a terminal event process, and a @@ -245,8 +241,9 @@
gen_data <- function(){
n <- 500 # desired number of subjects
@@ -305,14 +302,14 @@ Data
rmv_ids <- sample(which(ter_na), sum(ter_na) - n_target)
ter_na[rmv_ids] <- FALSE # remove the excess of subjects
ter <- data.frame(id = seq_len(n)[ter_na],
- time = ter_times[ter_na],
- group = group[ter_na],
- age = age[ter_na])
+ tstop = ter_times[ter_na],
+ group = group[ter_na],
+ age = age[ter_na])
frailty <- frailty[ter_na]
b <- b[ter_na, , drop = FALSE]
cens_times <- tmax
- ter$status <- as.numeric(ter$time <= cens_times) # event indicator
- ter$time <- pmin(ter$time, cens_times) # add censoring time
+ ter$status <- as.numeric(ter$tstop <= cens_times) # event indicator
+ ter$tstop <- pmin(ter$tstop, cens_times) # add censoring time
remove(gammas_t, sigma_t, group, W_t, eta_t, alpha_t, invS_t, u_t, i, root,
n_target, rmv_ids, ter_times, cens_times, n, alphaF, age, ter_na,
sigmaF)
@@ -352,7 +349,7 @@ Data
j <- 1
for(i in seq_along(ter$id)) {
tstart <- 0
- while(!is.na(tstart) & tstart < ter$time[i]) {
+ while(!is.na(tstart) & tstart < ter$tstop[i]) {
u_r <- runif(1)
root <- try(uniroot(invS_r, interval = c(1e-05, 250), # arbitrary upper limit
u = u_r, i = i, tstart = tstart)$root, TRUE)
@@ -371,7 +368,7 @@ Data
rec$id <- match(rec$id, unique(rec$id)) # rename IDs
rec$group <- ter$group[rec$id]
rec$age <- ter$age[rec$id]
- rec$Stime <- ter$time[rec$id]
+ rec$Stime <- ter$tstop[rec$id]
rec$status <- as.numeric(!is.na(rec$tstop) & rec$tstop < rec$Stime) # event indicator
rec$tstop <- pmin(rec$tstop, rec$Stime, na.rm = TRUE) # add cens time
rec$Stime <- NULL
@@ -388,7 +385,7 @@ Data
data = long)
eta_y <- as.vector(X %*% betas + rowSums(Z * b[long$id, ]))
long$y <- rnorm(length(eta_y), eta_y, sigma_y)
- long_cens <- long$time <= rep(ter$time, times = rle(long$id)$lengths)
+ long_cens <- long$time <= rep(ter$tstop, times = rle(long$id)$lengths)
long <- long[long_cens, , drop = FALSE] # drop censored encounters
remove(kn, Bkn, X, betas, Z, b, eta_y, sigma_y, n_i, tmax, long_cens, scale)
##############################################################################
@@ -407,10 +404,9 @@ Data
in the package the rc_setup()
function:
cox_data <- rc_setup(rc_data = recu_data, trm_data = term_data,
- rc_idVar = "id", rc_statusVar = "status",
- rc_startVar = "tstart", rc_stopVar = "tstop",
- trm_idVar = "id", trm_statusVar = "status",
- trm_stopVar = "time",
+ idVar = "id", statusVar = "status",
+ startVar = "tstart", stopVar = "tstop",
+ trm_censLevel = 0,
nameStrata = "strata", nameStatus = "status")
Each subject has as many rows in the new data frame as the number of
their recurrent risk periods plus one for the terminal event. The data
@@ -425,26 +421,22 @@
Data
cox_data[cox_data$id == 1, c("id", "tstart", "tstop", "status", "strata")]
#> id tstart tstop status strata
-#> 1 1 0.0000000 0.3756627 1 Rec
-#> 2 1 0.4275324 0.7832841 1 Rec
-#> 3 1 0.8724938 1.0863887 1 Rec
-#> 4 1 1.1212953 1.8741434 1 Rec
-#> 5 1 1.9372355 2.7843451 1 Rec
-#> 6 1 2.7906559 3.4618622 1 Rec
-#> 7 1 3.5166929 3.5830169 1 Rec
-#> 8 1 3.6251219 4.0375415 0 Rec
-#> 9 1 0.0000000 4.0375415 1 Ter
+#> 1 1 0.0000000 0.3756627 1 R
+#> 2 1 0.4275324 0.7832841 1 R
+#> 3 1 0.8724938 1.0863887 1 R
+#> 4 1 1.1212953 1.8741434 1 R
+#> 5 1 1.9372355 2.7843451 1 R
+#> 6 1 2.7906559 3.4618622 1 R
+#> 7 1 3.5166929 3.5830169 1 R
+#> 8 1 3.6251219 4.0375415 0 R
+#> 9 1 0.0000000 4.0375415 1 T1
+Figure 3 Visual representation of the failure-time data during the follow-up for subject 1. The horizontal black line denotes risk periods, while the blue line non-risk periods. ‘R’ and ‘T’ represent a recurrent and terminal event, respectively. - +
vignettes/Super_Learning.Rmd
Super_Learning.Rmd
We observe that the first three models dominate the weights. We also +
We observe that the first two models dominate the weights. We also note that the integrated Brier score based on the combined predictions is smaller than the integrated Brier score of each individual model. To calculate the model weights using the expected predictive cross-entropy @@ -375,16 +378,16 @@
The EPCE results are inline with the ones from the integrated Brier +
The EPCE results are similar with the ones from the integrated Brier
score; however, now only models M2
and M3
share the weights. Again we see that the EPCE based on the combined
cross-validated predictions is smaller than the EPCE based on the
diff --git a/docs/articles/Super_Learning_files/figure-html/unnamed-chunk-1-1.png b/docs/articles/Super_Learning_files/figure-html/unnamed-chunk-1-1.png
index af20efb..752c433 100644
Binary files a/docs/articles/Super_Learning_files/figure-html/unnamed-chunk-1-1.png and b/docs/articles/Super_Learning_files/figure-html/unnamed-chunk-1-1.png differ
diff --git a/docs/articles/Time_Varying_Effects.html b/docs/articles/Time_Varying_Effects.html
index b187e73..be5e76b 100644
--- a/docs/articles/Time_Varying_Effects.html
+++ b/docs/articles/Time_Varying_Effects.html
@@ -53,7 +53,7 @@
JMbayes2
- 0.4-6
+ 0.4-8
@@ -73,6 +73,9 @@
vignettes/Time_Varying_Effects.Rmd
Time_Varying_Effects.Rmd
pbc2.id$status2 <- as.numeric(pbc2.id$status != 'alive')
-CoxFit <- coxph(Surv(years, status2) ~ sex, data = pbc2.id)
Our aim is to assess the strength of the association between the risk
of the composite event and the level of serum bilirubin. We will
describe the patient-specific profiles over time for this biomarker
using a linear mixed-effects model, where both in the fixed- and
random-effects we include an intercept, the linear and quadratic time
effect. In the fixed-effects we also include the interaction of the time
-effect and sex. The syntax to fit this model with lme()
+effect and sex. The syntax to fit this model with lme()
is:
-fm <- lme(log(serBilir) ~ poly(year, 2) * sex, data = pbc2,
- random = ~ poly(year, 2) | id, control = lmeControl(opt = 'optim'))
fm <- lme(log(serBilir) ~ poly(year, 2) * sex, data = pbc2,
+ random = ~ poly(year, 2) | id, control = lmeControl(opt = 'optim'))
The default call to jm()
adds the subject-specific
linear predictor of the mixed model as a time-varying covariate in the
survival relative risk model:
To specify that the association of serum bilirubin may change over
time we include an interaction of this time-varying covariate with a
-natural cubic spline of time using function ns()
from the
+natural cubic spline of time using function ns()
from the
splines package. Important Note: for
this to work correctly we need to explicitly specify the internal and
boundary knots for the B-splines basis, i.e., in the following example
we set the internal knots at 3, 6 and 9 years, and the boundary knots at
0 and 14.5 years:
-form_splines <- ~ value(log(serBilir)) * ns(year, k = c(3, 6, 9), B = c(0, 14.5))
+form_splines <- ~ value(log(serBilir)) * ns(year, k = c(3, 6, 9), B = c(0, 14.5))
jointFit2 <- update(jointFit1, functional_forms = form_splines,
n_iter = 6500L, n_burnin = 2500L)
summary(jointFit2)
@@ -287,14 +290,14 @@ Non Proportional Hazards#> iterations per chain: 6500
#> burn-in per chain: 2500
#> thinning: 1
-#> time: 36 sec
The spline coefficients do not have a straightforward interpretation. We therefore visualize the time-varying association of log serum bilirubin with the hazard of the composite event using the following piece of code:
x_times <- seq(0.001, 12, length = 501)
-X <- cbind(1, ns(x_times, knots = c(3, 6, 9), B = c(0, 14.5)))
+X <- cbind(1, ns(x_times, knots = c(3, 6, 9), B = c(0, 14.5)))
mcmc_alphas <- do.call('rbind', jointFit2$mcmc$alphas)
log_hr <- X %*% t(mcmc_alphas)
log_hr_mean <- rowMeans(log_hr)
diff --git a/docs/articles/Transformation_Functions.html b/docs/articles/Transformation_Functions.html
index 2a3f1a5..00d0ec2 100644
--- a/docs/articles/Transformation_Functions.html
+++ b/docs/articles/Transformation_Functions.html
@@ -53,7 +53,7 @@
JMbayes2
- 0.4-6
+ 0.4-8
vignettes/Transformation_Functions.Rmd
Transformation_Functions.Rmd
In the output this is named as value(hepatomegaly)
to
denote that the current value functional form is used. That is, we
assume that the risk at a specific time \(t\) is associated with the value of the
@@ -271,7 +274,7 @@
Other available functions to use in the definition of the
functional_forms
argument are vexp()
for
calculating the exponent, vlog()
for calculating the
@@ -325,7 +328,7 @@
The call to Dexpit(slope(hepatomegaly))
is internally
transformed to
Dexpit(value(hepatomegaly)):slope(hepatomegaly)
, which
diff --git a/docs/articles/index.html b/docs/articles/index.html
index d682fc1..1f24c84 100644
--- a/docs/articles/index.html
+++ b/docs/articles/index.html
@@ -23,7 +23,7 @@
JMbayes2
- 0.4-6
+ 0.4-8
@@ -41,6 +41,9 @@
Rizopoulos D, Papageorgiou G, Miranda Afonso P (2023). +
Rizopoulos D, Papageorgiou G, Miranda Afonso P (2024). JMbayes2: Extended Joint Models for Longitudinal and Time-to-Event Data. -https://drizopoulos.github.io/JMbayes2/, https://github.com/drizopoulos/JMbayes2. +R package version 0.4-8, https://github.com/drizopoulos/JMbayes2, https://drizopoulos.github.io/JMbayes2/.
@Manual{, title = {JMbayes2: Extended Joint Models for Longitudinal and Time-to-Event Data}, author = {Dimitris Rizopoulos and Grigorios Papageorgiou and Pedro {Miranda Afonso}}, - year = {2023}, - note = {https://drizopoulos.github.io/JMbayes2/, https://github.com/drizopoulos/JMbayes2}, + year = {2024}, + note = {R package version 0.4-8, https://github.com/drizopoulos/JMbayes2}, + url = {https://drizopoulos.github.io/JMbayes2/}, }diff --git a/docs/index.html b/docs/index.html index a006ece..d86c463 100644 --- a/docs/index.html +++ b/docs/index.html @@ -53,7 +53,7 @@ JMbayes2 - 0.4-6 + 0.4-8 @@ -73,6 +73,9 @@
Package: | JMbayes2 |
Type: | Package |
Version: | 0.4-5 |
Date: | 2023-06-23 |
License: | GPL (>=3) |
This package fits joint models for longitudinal and time-to-event data. It can accommodate multiple longitudinal outcomes of different type (e.g., continuous, dichotomous, ordinal, counts), and assuming different distributions, i.e., Gaussian, Student's-t, Gamma, Beta, unit Lindley, censored Normal, Binomial, Poisson, Negative Binomial, and Beta-Binomial. For the event time process, right, left and interval censored data can be handled, while competing risks and multi-sate processes are also covered.
+Package: | JMbayes2 |
Type: | Package |
Version: | 0.4-8 |
Date: | 2024-01-02 |
License: | GPL (>=3) |
This package fits joint models for longitudinal and time-to-event data. It can accommodate multiple longitudinal outcomes of different type (e.g., continuous, dichotomous, ordinal, counts), and assuming different distributions, i.e., Gaussian, Student's-t, Gamma, Beta, unit Lindley, censored Normal, Binomial, Poisson, Negative Binomial, and Beta-Binomial. For the event time process, right, left and interval censored data can be handled, while competing risks and multi-sate processes are also covered.
JMbayes2 fits joint models using Markov chain Monte Carlo algorithms implemented in C++. The package also offers several utility functions that can extract useful information from fitted joint models. The most important of those are included in the See also Section below.
diff --git a/docs/reference/accuracy.html b/docs/reference/accuracy.html index 62f6015..5eeb3d6 100644 --- a/docs/reference/accuracy.html +++ b/docs/reference/accuracy.html @@ -23,7 +23,7 @@ JMbayes2 - 0.4-6 + 0.4-8character string denoting the name of the subject id variable in data
.
character vector with the names of stratifying variables.
integer denoting the seed.