OSDC - Open Source Diabetes Classifier

Diabetes type classification in Danish register data

Published

July 2, 2026

OSDC is an open source algorithm that classifies everyone in Denmark as having type 1 diabetes (T1D), type 2 diabetes (T2D) or no diabetes, based on data in the national registers.

Article: doi:10.2147/CLEP.S407019

Documentation and code: steno-aarhus.github.io/osdc


What does OSDC give you?

A pre-computed population with:

  • Diabetes type: T1D, T2D or no diabetes
  • Onset date: when the diabetes criteria were first met
  • Age at onset

The classification is based on a combination of diagnosis codes (LPR), medication information (LMDB) and laboratory results.


When is it used?

  • To define a diabetes population as an exposure or inclusion criterion
  • To exclude diabetes patients from a control group
  • To classify diabetes type, since LPR alone cannot reliably distinguish T1D from T2D
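
Excluding people with diabetes from a control group can be sketched as an anti-join. This is a minimal illustration on toy data, not a prescribed workflow: the `controls` table and all values are hypothetical, the column names `pnr`, `diabetes_type` and `do_dm` follow the old DARTER file after renaming and may differ in your population file, and the sketch assumes the population file lists only people classified with diabetes.

```r
library(dplyr)

# Toy stand-in for an OSDC population (hypothetical values).
dm_pop <- tibble(
  pnr = c("001", "002", "003"),
  diabetes_type = c(1, 2, 2),
  do_dm = as.Date(c("2005-03-01", "2012-07-15", "2019-11-30"))
)

# Hypothetical control group; "002" is in the diabetes population.
controls <- tibble(pnr = c("002", "004", "005"))

# Keep only controls that never appear in the diabetes population.
controls_no_dm <- controls %>%
  anti_join(dm_pop, by = "pnr")
```

This removes anyone who ever meets the diabetes criteria. To remove only people with onset before a given date, filter `dm_pop` on `do_dm` before the anti-join.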

How to use OSDC

DARTER - project 708421 (pre-computed file)

On DARTER the OSDC classification is already computed and ready to use (in several formats). You do not need to run the algorithm yourself.

Important

New 2024 population - use this going forward

The osdc package has been updated and accepted by CRAN, and a new diabetes population incorporating the latest data has been generated on DARTER. It is valid for inclusions until the end of 2024 and is intended to replace the previous population file, not to be combined with it. The previous files remain unchanged for backward compatibility and reproducibility.

The table is available in several formats:

  • Parquet: E:/workdata/708421/cleaned-data/external-parquet/osdc/2024 - on DARTER you can load it directly with read_register("osdc")
  • .csv, .rds or .dta: E:/workdata/708421/cleaned-data/diabetes_register_pop/2024

Warning

The path and the filename osdc (and read_register("osdc")) are DARTER-specific (project 708421). On other projects the file is named differently and lives elsewhere.

The output format differs slightly from the old population. The design vignette describes the output variables, in particular the difference between “raw” and “stable” inclusion dates, that is, whether the inclusion date falls in a period with enough inclusion-event data to be reliable.

# New 2024 population on DARTER - read_register resolves the path automatically:
library(fastreg)
dm_pop <- read_register("osdc")

Previous population (up to 2022) - for reproducibility only

The old file has a different output format than the 2024 population (the variable names below apply only to the old file).

dm_pop <- readRDS("E:/workdata/708421/cleaned-data/diabetes-register-pop/dm_population_1977_2022.rds")

names(dm_pop)
# [1] "PNR"           "diabetes_type"  "do_dm"          "age_at_onset"

library(dplyr)
dm_pop <- dm_pop %>% rename(pnr = PNR)   # rename PNR to lowercase pnr

# diabetes_type: 1 = T1D, 2 = T2D  |  do_dm: onset date

# Example: filter to T1D patients
t1d <- dm_pop %>%
  filter(diabetes_type == 1) %>%
  select(pnr, diabetes_type, do_dm) %>%
  rename(date_diabetes = do_dm)
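
A common next step is to flag who already had diabetes at a study index date. The sketch below uses toy data only: the `cohort` table, its `index_date` column and the `prevalent_dm` flag are hypothetical names introduced for illustration, while `pnr` and `do_dm` follow the old file after renaming.

```r
library(dplyr)

# Toy stand-in for the old population file (hypothetical values).
dm_pop <- tibble(
  pnr = c("001", "002"),
  diabetes_type = c(1, 2),
  do_dm = as.Date(c("2005-03-01", "2019-11-30"))
)

# Hypothetical study cohort with one index date per person.
cohort <- tibble(
  pnr = c("001", "002", "003"),
  index_date = as.Date(c("2015-01-01", "2015-01-01", "2015-01-01"))
)

cohort_dm <- cohort %>%
  left_join(dm_pop, by = "pnr") %>%
  mutate(
    # TRUE only when an onset date exists and is on or before the index date.
    prevalent_dm = !is.na(do_dm) & do_dm <= index_date
  )
```

People without a row in `dm_pop` get a missing onset date after the join, so the flag is FALSE for them rather than NA.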

Other projects - run the algorithm yourself

If the OSDC classification has not been pre-computed for your project, you need to run the algorithm yourself.

osdc is a CRAN package that can be installed on DST. It must be installed first and then loaded:

install.packages("osdc")
library("osdc")

On the DST server, install.packages("osdc") also updates the necessary dependencies, which matters because the pre-installed packages may be outdated.

Note

Availability across projects

The package is confirmed available on DARTER (project 708421). It is likely available on other projects too, but this has not been verified. Check on your own server with "osdc" %in% rownames(available.packages()); because the package is not pre-installed, available.packages() shows whether it can be installed. If the package is missing, contact DST, who can install requested packages for you.
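
The availability check can be wrapped in a small helper. This is only a sketch: `osdc_status()` and its `available` argument are hypothetical names introduced here, and the default relies on the same `available.packages()` call as above, so it needs a configured package repository.

```r
# Report whether osdc is installed, installable, or unavailable.
# `available` is a hypothetical argument added for illustration: it lets you
# pass package names in directly instead of querying the repository.
osdc_status <- function(available = rownames(available.packages())) {
  if (requireNamespace("osdc", quietly = TRUE)) {
    return("installed")
  }
  if ("osdc" %in% available) "installable" else "unavailable"
}
```

Calling `osdc_status()` returns "installable" when `install.packages("osdc")` should work, and "unavailable" when you would need to contact DST.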

See documentation and code on GitHub: steno-aarhus/osdc and the getting-started guide at steno-aarhus.github.io/osdc/articles/osdc.html.


Coverage and limitations

  • The DARTER dataset has a cohort built with osdc; the new 2024 population is valid for inclusions until the end of 2024
  • Classifies everyone in Denmark - not just your cohort
  • Uses LPR, LMDB and laboratory results - requires access to these registers

Note

Which diabetes drugs count as events?

Not all glucose-lowering drugs (ATC A10) count as inclusion events in the algorithm:

  • All GLP1-RAs (as single-agent preparations) are excluded as events, because this drug class is used extensively for weight loss among individuals without diabetes.
  • Among SGLT2 inhibitors, single-agent dapagliflozin and empagliflozin preparations are likewise not counted as events, because Danish guidelines recommend them for heart and kidney failure in individuals without diabetes.
  • Combination preparations of GLP1-RAs or SGLT2 inhibitors together with other glucose-lowering drugs, by contrast, do count as events.

See the exact logic in the algorithm vignette: steno-aarhus.github.io/osdc/articles/algorithm.html - or the code directly on GitHub: steno-aarhus/osdc.
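
The rules above can be illustrated with a small ATC filter. This is a sketch of the described rule, not the package's actual code: `counts_as_gld_event()` is a hypothetical helper, and the mapping of the drug classes to ATC prefixes (A10BJ for GLP1-RAs, A10BK01 for dapagliflozin, A10BK03 for empagliflozin) is an assumption that should be checked against the algorithm vignette.

```r
# Sketch of the rule described above; not the package's actual code.
counts_as_gld_event <- function(atc) {
  is_gld <- grepl("^A10", atc)  # any glucose-lowering drug
  # Assumed ATC prefixes for the single-agent preparations that do not count.
  is_excluded <- grepl("^(A10BJ|A10BK01|A10BK03)", atc)
  is_gld & !is_excluded
}

atc <- c(
  "A10BA02", # metformin
  "A10BJ06", # semaglutide (GLP1-RA)
  "A10BK01", # dapagliflozin
  "A10BK03", # empagliflozin
  "A10BD15", # metformin and dapagliflozin (combination)
  "A10AE56", # insulin degludec and liraglutide (combination)
  "C10AA05"  # atorvastatin, not a glucose-lowering drug
)

is_event <- counts_as_gld_event(atc)
```

In this sketch metformin and the two combination preparations count as events, while the single-agent GLP1-RA and SGLT2 inhibitor codes and the non-A10 code do not.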

Warning

OSDC’s type classification conditions on the future - be careful with the pre-computed population

The pre-computed OSDC population can assign a person’s diabetes type based on events that occur after your index/inclusion date. T1D vs. T2D is determined partly from insulin purchase patterns over time, so the type can depend on information that did not yet exist on your index date. If you use the type as a baseline variable, you are effectively conditioning on the future (immortal time bias).

Workaround: run the algorithm yourself and filter the input registers to only data before your index date, so classify_diabetes() only sees pre-index information. Then the type classification is not contaminated by future events.
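
The filtering step of this workaround could look roughly like the sketch below. It uses a toy table rather than a real register: `purchases`, `purchase_date` and `index_date` are hypothetical names, and the call to classify_diabetes() is deliberately left out because its inputs are described in the package documentation.

```r
library(dplyr)

# Toy stand-in for one input register; column names are hypothetical.
purchases <- tibble(
  pnr = c("001", "001", "001"),
  atc = c("A10AB01", "A10AB01", "A10BA02"),
  purchase_date = as.Date(c("2014-05-01", "2016-02-10", "2018-09-20"))
)

# Hypothetical index date shared by the whole cohort.
index_date <- as.Date("2015-01-01")

# Keep only events recorded strictly before the index date.
purchases_pre_index <- purchases %>%
  filter(purchase_date < index_date)
```

The same filter would be applied to every date-stamped input register before running the algorithm. With person-specific index dates, one option is to join the index date on `pnr` first and then filter each row against that person's own date.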

A guide to running the algorithm yourself - and to modifying it, e.g. if you want to include GLP1-RAs as events - is in progress. Note that if you modify the algorithm, the published validation no longer applies.

See the full methodology in the article and documentation site.

