Register paths and datastores

Confirmed paths and access methods for all registers on project 708421

Published

July 2, 2026

Warning

Check the modification date of the cleaned-data folder before running the pipeline.

file.info("E:/workdata/708421/cleaned-data/parquet-registers/")$mtime

The registers are not necessarily up to date. Confirm that their coverage matches your study period.
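The check above can be wrapped in a small helper that looks at the newest file inside the folder rather than at the folder itself. This is a minimal sketch in base R: the function names `newest_mtime()` and `is_stale()` and the 180-day threshold are assumptions for illustration, not part of the project code. A modification time only shows when files were written, not which years they cover, so it supplements the coverage check rather than replacing it.

```r
# Hypothetical helper (not part of the project code): newest modification
# time among all files below a folder.
newest_mtime <- function(dir) {
  files <- list.files(dir, recursive = TRUE, full.names = TRUE)
  if (length(files) == 0) stop("No files found in: ", dir)
  max(file.info(files)$mtime)
}

# TRUE when the newest file is older than max_age_days.
# The default of 180 days is an assumed threshold - pick your own.
is_stale <- function(dir, max_age_days = 180) {
  age_days <- as.numeric(difftime(Sys.time(), newest_mtime(dir), units = "days"))
  age_days > max_age_days
}

# Usage on the server, with path_parquet_reg defined as under "Base paths":
# if (is_stale(path_parquet_reg)) {
#   warning("cleaned-data looks old - confirm register coverage before running")
# }
```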


Base paths

# Define all paths as constants at the top of each script
path_parquet_reg  <- "E:/workdata/708421/cleaned-data/parquet-registers/"
path_parquet_ext  <- "E:/workdata/708421/cleaned-data/parquet-external/"
path_dm_pop       <- "E:/workdata/708421/cleaned-data/diabetes-register-pop/dm_population_1977_2022.rds"
path_output       <- "E:/workdata/708421/workspaces/[yourName]/BS_demens/datasets/"

Overview - all registers on project 708421

All registers were confirmed via colnames() on the DST server on 2026-05-15. As of 2026, most registers are updated through the end of 2024 (Anders Aasted Isaksen/DARTER team). Column names are shown as they appear after rename_with(tolower).

| Register | Access | Join key | Period | Critical column |
|---|---|---|---|---|
| BEF | read_register("bef") | pnr | All years | koen, foed_dag, familie_id |
| DODSAARS | read_register("dodsaars") | pnr | ~1970–2001 | d_dodsdto (death date) |
| DOD | not in cleaned-data | pnr | ~2001–2024 | doddato - see extraction guide |
| VNDS | read_register("vnds") | pnr | All years | indud_kode, haend_dato |
| LPR2 contacts | read_register("lpr_adm") | recnum | Up to March 2019 | d_inddto, c_pattype |
| LPR2 diagnoses | read_register("lpr_diag") | recnum | Up to March 2019 | c_diag, c_diagtype |
| LPR2 psych contacts | read_register("t_psyk_adm") | k_recnum → recnum | 1995–March 2019 | v_cpr → pnr |
| LPR2 psych diagnoses | read_register("t_psyk_diag") | v_recnum → recnum | 1995–March 2019 | c_diag, c_diagtype |
| LPR3 contacts | read_register("lpr_a_kontakt") | dw_ek_kontakt | March 2019+ | kont_starttidspunkt (datetime) |
| LPR3 diagnoses | read_register("lpr_a_diagnose") | dw_ek_kontakt | March 2019+ | diag_kode, diag_kode_type, senere_afkraeftet |
| LPR3 procedures | read_register("procedurer_kirurgi") | dw_ek_forloeb | 2019+ | procedurekode, dato_start |
| LPR2 procedures | read_register("lpr_sksopr") | recnum | 1996–2018 | c_opr, d_odto |
| LMDB | read_register("lmdb") | pnr | Approx. 1994+ | atc, eksd |
| UDDA | read_register("udda") | pnr | All years | hfaudd, aar |
| FAIK | read_register("faik") | familie_id | All years | famaekvivadisp_13 |
| AKM | read_register("akm") | pnr | All years | socio13, aar |
| DBSO | read_register("dbso") | pnr | 2010+ | datoper_prim, surgery flags |
| OSDC | readRDS(path_dm_pop) | PNR → rename to pnr | 1977–2022 | diabetes_type, do_dm |
| Laboratory results | read_register("laboratorieproevesvar_") | pnr | Approx. 1994+ | npu, samplingdato, samplevalue (character) |
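The column names listed for each register can be re-checked at the start of a script, so that a renamed or missing column fails early rather than deep inside the pipeline. A minimal sketch in base R follows. The helper `missing_cols()` is hypothetical, and the data frame is a toy stand-in for a register.

```r
# Hypothetical helper: which expected columns are missing from a table?
# Names are compared in lowercase, matching rename_with(tolower).
missing_cols <- function(tbl, expected) {
  setdiff(tolower(expected), tolower(colnames(tbl)))
}

# Toy stand-in for a register. On the server this would be, for example,
# read_register("bef") %>% rename_with(tolower)
bef_toy <- data.frame(
  PNR      = "0001",
  KOEN     = 1L,
  FOED_DAG = as.Date("1950-01-01")
)

# familie_id is absent from the toy data, so it is reported as missing
missing_cols(bef_toy, c("pnr", "koen", "foed_dag", "familie_id"))
# [1] "familie_id"
```

An empty result means every expected column is present.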

Critical notes

DODSAARS and DOD - deaths are split across two registers: the date of death is not in one place. dodsaars is in cleaned-data but covers only ~1970–2001 (date of death in d_dodsdto). Deaths after 2001 are in DOD (date of death in doddato, covering ~2001–2024), which is not in cleaned-data - it requires extraction from the raw SAS file via your data manager. Both join on pnr, and you need both for full coverage over a modern study period.

Why it matters: if you run on dodsaars only, everyone who dies after 2001 is treated as alive. That skews censoring and matching in 01_build_cohorts.R. See pitfall 1.
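Combining the two sources might look like the sketch below. It uses toy tibbles so it runs anywhere. On the server, `dodsaars` would come from read_register("dodsaars"), and the DOD table would be whatever your data manager extracts (the object name `dod_toy` is hypothetical). The sketch assumes both date columns are already of class Date, and that the earliest date should win if a person appears in both sources. Neither assumption is confirmed by this page.

```r
library(dplyr)

# Toy stand-ins for the two death registers (column names as in the table)
dodsaars_toy <- tibble(
  pnr       = c("A", "B"),
  d_dodsdto = as.Date(c("1995-03-01", "2000-11-20"))
)
dod_toy <- tibble(
  pnr     = c("C", "B"),
  doddato = as.Date(c("2015-06-30", "2000-11-20"))
)

# Stack both sources under one column name, then keep one date per person
deaths <- bind_rows(
  dodsaars_toy %>% transmute(pnr, death_date = d_dodsdto),
  dod_toy      %>% transmute(pnr, death_date = doddato)
) %>%
  filter(!is.na(death_date)) %>%
  group_by(pnr) %>%
  summarise(death_date = min(death_date), .groups = "drop")  # assumed tie-break
```

The result has one row per pnr and can be left-joined onto the cohort. A missing death_date then means no recorded death within the combined coverage of the two registers.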

LPR3 - duplicate risk: lpr_a_kontakt and lpr_a_diagnose contain data from two formats (LPR_F and LPR_A). Always filter on lprindberetningssystem == "LPR3". See pitfall 5.
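A minimal sketch of that filter on toy data follows. The value "LPR2" in the toy tibble is invented purely to provide a non-LPR3 row. The final check assumes you expect one row per dw_ek_kontakt in the contact table after filtering.

```r
library(dplyr)

# Toy stand-in for lpr_a_kontakt after rename_with(tolower)
kontakt_toy <- tibble(
  dw_ek_kontakt          = c(1, 1, 2),
  pnr                    = c("A", "A", "B"),
  lprindberetningssystem = c("LPR3", "LPR2", "LPR3")  # "LPR2" is an invented value
)

# Keep only rows reported through LPR3
kontakt_lpr3 <- kontakt_toy %>%
  filter(lprindberetningssystem == "LPR3")

# Sanity check: contact keys that still occur more than once
dup_keys <- kontakt_lpr3 %>%
  count(dw_ek_kontakt) %>%
  filter(n > 1)
```

On the server the same filter() goes directly after rename_with(tolower), before any join or collect().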

Laboratory results - use only one source: laboratorieproevesvar_ (>2.2 billion rows) replaces lab_forsker/lab_dm_forsker. The old files still exist and cover the same data - use only one to avoid duplicates. Because the register is so large, semi_join(tibble(pnr = kohort$pnr), by = "pnr") and select() before collect() are essential. Two things to watch when extracting:

  • Tests are identified by NPU codes in the npu column - filter on the NPU codes your analysis needs.
  • samplevalue is a character column - it can contain text like “not detected” or “negative”, not just numbers. Convert with care (as.numeric() returns NA on text values).

See pitfall 6 - Laboratory results for a code example.
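The two points above can be sketched on toy data as follows. "NPUXXXXX" is a placeholder and not a real NPU code, and the `value_num` and `is_text` columns are names chosen for this illustration. On the server, the first pipeline would run against the lazy register and end with collect(), and the conversion would run afterwards on the collected data.

```r
library(dplyr)

# Toy stand-ins; "NPUXXXXX" is a placeholder, not a real NPU code
kohort  <- tibble(pnr = c("A", "B"))
lab_toy <- tibble(
  pnr          = c("A", "A", "B", "C"),
  npu          = rep("NPUXXXXX", 4),
  samplingdato = as.Date(c("2020-01-05", "2020-02-01", "2021-03-10", "2021-04-01")),
  samplevalue  = c("48", "negative", "51", "47")
)

# Step 1: reduce rows and columns before collecting
lab <- lab_toy %>%
  semi_join(tibble(pnr = kohort$pnr), by = "pnr") %>%  # cohort members only
  filter(npu %in% c("NPUXXXXX")) %>%                   # only the tests you need
  select(pnr, npu, samplingdato, samplevalue)
# on the server: %>% collect() here

# Step 2: convert to numeric, but keep track of what was text
lab <- lab %>%
  mutate(
    value_num = suppressWarnings(as.numeric(samplevalue)),
    is_text   = is.na(value_num) & !is.na(samplevalue)
  )
```

Counting the rows where is_text is TRUE, for example with count(samplevalue), shows which text results would otherwise be silently lost as NA.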

procedurer_kirurgi: dw_ek_kontakt is NA for all rows on DST. Join to lpr_a_kontakt via dw_ek_forloeb to fetch pnr.

DBSO: The identifier column is cpr in raw parquet - renamed to pnr by 00_prepare_dbso.R. All code uses pnr after that.

OSDC: PNR is uppercase in the raw file - rename with rename(pnr = PNR) after loading.


Loading templates

Note

fastreg replaces dstDataPrep on DARTER. Registers are now loaded with fastreg::read_register("name"), the same function as in the general guide. Code that uses dstDataPrep::load_database() still runs, but write new code with fastreg. Point fastreg at DARTER’s parquet folder once per script with options(fastreg.project_workdata_dir = path_parquet_reg); read_register("bef") then works by name. See “Didn’t convert the data yourself?” in Phase 4.

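library(fastreg)   # read_register() - access to DST registers
library(dplyr)     # rename_with, rename, left_join, select

# Point fastreg at the parquet folder once per script (path_parquet_reg is defined under "Base paths"):
options(fastreg.project_workdata_dir = path_parquet_reg)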

# Standard register - via read_register:
bef <- read_register("bef") %>% rename_with(tolower)   # lazy connection; lowercase columns

# Psychiatric LPR2 - requires renaming v_cpr and k_recnum:
psyk_adm <- read_register("t_psyk_adm") %>%
  rename_with(tolower) %>%                          # lowercase columns
  rename(pnr = v_cpr, recnum = k_recnum)            # v_cpr → pnr; k_recnum → recnum

# DBSO - parquet-external (converted from SAS via 00_prepare_dbso.R):
dbso <- read_register("dbso") %>% rename_with(tolower)   # lazy connection

# OSDC - RDS file with pre-computed diabetes classification:
dm_pop <- readRDS(path_dm_pop) %>% rename(pnr = PNR)   # PNR is uppercase in raw file - rename

# LPR3 procedures - join via dw_ek_forloeb (NOT dw_ek_kontakt, which is NA for all rows):
proc <- read_register("procedurer_kirurgi") %>%
  rename_with(tolower) %>%                          # lowercase columns
  left_join(
    read_register("lpr_a_kontakt") %>%
      rename_with(tolower) %>%
      select(dw_ek_forloeb, pnr) %>%               # fetch pnr via the forloeb key
      distinct(),                                   # avoid duplicating procedures when a forloeb has several contacts
    by = "dw_ek_forloeb"                            # join key - dw_ek_kontakt cannot be used
  )
# proc is still lazy - add filter() and collect() before use
