Time-varying variables

Variables that change during follow-up: what can vary, when baseline is enough, and where it gets hard

Published

July 21, 2026

This is a hard topic, and this page is only a short orientation. Time-varying analysis ranges from a simple reshaping of the data (what we show here) to the full causal machinery needed for time-varying treatments. For the latter, see for example What If, Part III.

Almost everything else in this guide builds on one row per person with variables measured at index (baseline). But some things change during follow-up, and then that shape does not always hold. This page helps you decide when that is a problem, and what to do about it.

The code example uses generic path and variable names. Adapt it to your project. The survival package must be installed in your R environment on DST.

What can vary?

Three different things can vary over time, and they call for very different answers:

Time-varying covariate / confounder: a background variable that changes (e.g. BMI, a blood test, a medication status, marital status). It cannot be a single value per person if it changes along the way.
Time-varying exposure / treatment: the exposure itself starts, stops or switches during follow-up (a treatment the person goes on and off). This is where it gets causally hard; see the feedback section below.
Censoring is also time-varying: people leave follow-up at different times (emigration, death, loss to follow-up). This is already handled in your follow-up time and time-to-event setup; see Time-to-event.

When is the baseline setup enough?

Often. If your exposure is fixed at index (e.g. a diagnosis or a surgery on a given date) and you adjust for confounders measured at baseline, then the ordinary setup from Phase 10-12 (one row per person, covariates frozen at index, one follow-up interval) is the right one. You do not need time-varying analysis just because the data span time. Only consider it when a variable that matters for the analysis actually changes value after index.

Rule of thumb: ask whether the extra complexity changes the answer. If a covariate drifts a little but is not a strong confounder, its baseline value is usually fine. If the exposure itself changes, or a strong confounder changes systematically with the exposure, read on.
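One way to apply the rule of thumb is to check how often a covariate actually changes after index before investing in a time-varying setup. The sketch below is only an illustration: the measurements data frame, its columns (pnr, day, value) and the toy values are hypothetical, and day 0 is assumed to be the index date.

```r
# Hypothetical long-format data: one row per person-measurement
measurements <- data.frame(
  pnr   = c("001", "001", "002", "002", "003"),
  day   = c(0, 180, 0, 200, 0),      # days since index (assumed: index = day 0)
  value = c(5.2, 6.8, 4.9, 4.9, 7.1)
)

# Keep measurements taken at or after index
after_index <- measurements[measurements$day >= 0, ]

# Number of distinct values per person during follow-up
n_distinct <- tapply(after_index$value, after_index$pnr,
                     function(v) length(unique(v)))

# Share of persons whose value changes at least once after index
share_changing <- mean(n_distinct > 1)
share_changing
```

If only a small share of persons change value, and the variable is not a strong confounder, the baseline value is probably sufficient. What counts as "small" is a judgment call for your study, not something this check decides.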

A time-varying covariate

If you have a covariate that changes along the way (and that is not caught in feedback with the exposure, see the next section), the fix is mechanical: you reshape the data into start/stop format (also called counting-process format). Instead of one row per person you get one row per person-interval, where the variable is constant within the interval. You build it with survival::tmerge() and analyse it with an ordinary Cox model.

A person with two measurements becomes two rows, each with its own tstart/tstop interval and the value that applies within it:

pnr	tstart	tstop	measure	outcome
001	0	180	5.2	0
001	180	365	6.8	1

Build start/stop data with tmerge()

library(survival) # tmerge(), tdc(), event()

# base         = one row per person: pnr, exit_day (end of follow-up), status (1 = outcome)
# measurements = many rows per person: pnr, day (time of the measurement), value

# Step 1: initialise start/stop data from base and define the event at the exit time.
# This first call also sets each person's follow-up range (0 to exit_day).
d <- tmerge(
  base,
  base,
  id = pnr,
  outcome = event(exit_day, status)
)

# Step 2: add the time-varying variable (tdc = time-dependent covariate).
# Intervals before a person's first measurement get measure = NA.
d <- tmerge(
  d,
  measurements,
  id = pnr,
  measure = tdc(day, value)
)
# -> d now has tstart, tstop, outcome and measure (one row per interval)

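event(exit_day, status) marks when follow-up ends and whether the event occurred at that time (status = 1) or not (status = 0).
tdc(day, value) lets measure change value on each measurement day; the value applies from that day until the next measurement.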
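To see the reshaping in action, the two tmerge() calls can be run on toy data for a single person. This is a minimal sketch with made-up values chosen to match the two-row table shown earlier; the data frames base and measurements here are hypothetical stand-ins for your own data.

```r
library(survival)

# Hypothetical toy data for one person, chosen to match the table above
base <- data.frame(pnr = "001", exit_day = 365, status = 1)
measurements <- data.frame(
  pnr   = c("001", "001"),
  day   = c(0, 180),     # measurement days since index
  value = c(5.2, 6.8)
)

# Same two steps as in the recipe: set up follow-up and event, then add the covariate
d <- tmerge(base, base, id = pnr, outcome = event(exit_day, status))
d <- tmerge(d, measurements, id = pnr, measure = tdc(day, value))

# One row per interval, with the value that applies within it
d[, c("pnr", "tstart", "tstop", "measure", "outcome")]

# tmerge() also keeps counts of how each addition lined up with the intervals
attr(d, "tcount")
```

The printed rows should correspond to the table: the person is split at day 180, and the event is attached to the last interval only. The tcount attribute is worth a glance on real data, since measurements that fall outside follow-up are counted there rather than reported as errors.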

Analyse in a Cox model

# age is assumed to be a baseline column in base; tmerge() carries it onto every interval
coxph(Surv(tstart, tstop, outcome) ~ measure + age, data = d) # time-varying Cox model
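The whole pipeline, from one row per person to a fitted model, can be tried end to end on simulated data. The sketch below is illustrative only: all data are random noise generated on the spot, the sample size, rates, day-365 cut-off and seed are arbitrary assumptions, and the estimated coefficients therefore carry no meaning beyond showing that the code runs.

```r
library(survival)
set.seed(1)  # arbitrary seed, only to make the sketch reproducible

n <- 200
event_day <- ceiling(rexp(n, rate = 1 / 400))   # simulated event times in days

# Hypothetical baseline data: one row per person
base <- data.frame(
  pnr      = sprintf("%03d", seq_len(n)),
  age      = round(runif(n, 40, 80)),
  exit_day = pmin(event_day, 365),              # follow-up ends at day 365 at the latest
  status   = as.integer(event_day <= 365)       # 1 = outcome observed by day 365
)

# Hypothetical measurements: two per person, at index (day 0) and at day 180
measurements <- data.frame(
  pnr   = rep(base$pnr, each = 2),
  day   = rep(c(0, 180), times = n),
  value = rnorm(2 * n, mean = 6, sd = 1)
)

# Build start/stop data, then fit the time-varying Cox model
d <- tmerge(base, base, id = pnr, outcome = event(exit_day, status))
d <- tmerge(d, measurements, id = pnr, measure = tdc(day, value))

fit <- coxph(Surv(tstart, tstop, outcome) ~ measure + age, data = d)
summary(fit)$coefficients
```

Persons whose follow-up ends before day 180 keep a single row; the others are split in two. Each person still contributes at most one event, so the model is an ordinary Cox model on more rows, not a repeated-events analysis.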

The time-to-event analysis itself (Kaplan-Meier, Cox, assumptions) is in Time-to-event. The detailed recipe for tmerge() and time-dependent effects is in the survival package vignette (CRAN PDF).

Treatment-confounder feedback

The start/stop Cox model above is not always enough, and it can even give a wrong answer. The most important trap in time-varying analysis is treatment-confounder feedback: a time-varying variable is both a confounder for the exposure and itself affected by past exposure.

Classic example: a treatment (exposure) affects a biomarker, the biomarker affects whether the person gets more treatment, and the biomarker also affects the outcome. The variable is at once a confounder (normally handled by adjustment) and an intermediate variable / mediator on the path from past treatment to outcome (adjust for it and you block part of the effect you want to measure, introducing bias).
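The classic example can be made concrete with a small simulation. This is a deliberately simplified sketch, not a recommended analysis: it uses a continuous outcome and linear regression instead of a Cox model, two treatment occasions (A0, A1), one biomarker (L1), and effect sizes that are arbitrary assumptions. By construction, the first treatment lowers the outcome by 1 (entirely through the biomarker) and the second treatment has no effect.

```r
set.seed(1)  # arbitrary seed for reproducibility
n <- 20000

A0 <- rbinom(n, 1, 0.5)          # first treatment, assigned at random
L1 <- -1 * A0 + rnorm(n)         # biomarker: lowered by the earlier treatment
A1 <- rbinom(n, 1, plogis(L1))   # later treatment: more likely when the biomarker is high
Y  <- L1 + rnorm(n)              # outcome: driven by the biomarker, A1 has no effect

# Truth by construction: effect of A0 on Y is -1, effect of A1 on Y is 0

unadjusted <- lm(Y ~ A0 + A1)        # ignores L1: A1 stays confounded by L1
adjusted   <- lm(Y ~ A0 + A1 + L1)   # conditions on L1: blocks the effect of A0

round(coef(unadjusted), 2)
round(coef(adjusted), 2)
```

In the unadjusted model the coefficient for A1 should come out clearly away from 0, even though A1 does nothing. In the adjusted model the coefficient for A0 should come out near 0, even though A0 truly lowers the outcome. Neither model recovers both effects, which is the feedback problem in miniature.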

In that situation you cannot simply put the variable into an ordinary Cox model the way we did above. Not adjusting leaves the confounding in place; adjusting blocks part of the effect. Either way the estimate is biased. The solution is a different family of methods, the g-methods:

the g-formula (standardisation step by step over time)
IP weighting / marginal structural models: the most accessible g-method. The guide introduces the weighting idea for a fixed exposure in IP weighting (IPTW and IPCW); marginal structural models extend it to a time-varying exposure
g-estimation of structural models

This falls outside this guide. If you suspect feedback, stop here, read up (for example What If, Part III), and involve a biostatistician before you analyse.
