Export and repatriation

GDPR, output control and what may leave DST

Published

July 21, 2026

You have built your dataset and run your analysis (Phases 0–14). What remains is the final step: getting your results safely out of DST.

Data from Statistics Denmark is microdata subject to GDPR. You cannot copy raw data out - everything must go through DST’s repatriation process, and the precise rules live in DST’s own guide.

This page is only an overview. DST’s own guide takes precedence - read and re-read it: DST - Rules for repatriation of analysis results.

What may leave DST?

Yes:

Aggregated tables
Graphs and figures
Model output - coefficients, confidence intervals, p-values

No:

Individual-level data in any form
Cells with fewer than 5 observations
Results that could identify individuals - directly or indirectly
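The cell-size rule can be checked mechanically before a table is exported. The following is a minimal base R sketch, not DST's own procedure: flag_small_cells() and the example data are hypothetical, and the threshold of 5 is taken from the list above. Empty cells (count 0) are not flagged here; whether they are acceptable in your table is something to confirm in DST's guide.

```r
# Hypothetical helper: list the cells of a count table that hold
# between 1 and (min_n - 1) observations.
flag_small_cells <- function(tab, min_n = 5) {
  counts <- as.data.frame(tab, responseName = "n", stringsAsFactors = FALSE)
  counts[counts$n > 0 & counts$n < min_n, , drop = FALSE]
}

# Made-up example data, one row per person
df_example <- data.frame(
  sex    = c(rep("F", 12), rep("M", 9)),
  region = c(rep("North", 10), rep("South", 2), rep("North", 6), rep("South", 3))
)

tab   <- table(df_example$sex, df_example$region, dnn = c("sex", "region"))
small <- flag_small_cells(tab)
print(small)  # the cells that would need merging or suppression
```

If small has any rows, the table is not ready to leave DST in its current form.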

Scripts are not exempt: code can be repatriated, but it goes through the same process as result files - it cannot be taken out freely.

NB: min, max, percentiles and median can point to one person

min, max, percentiles and median reproduce concrete values from the dataset - a minimum or maximum value is one person’s actual figure. Only report them if at least 5 people have that value; otherwise the figure can point to a single individual.

If you are unsure about a descriptive table with min, max, percentiles or median, check the threshold in DST’s guide or with your data manager before repatriating it.
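One way to see the problem is to count how many observations share the value you intend to report. This is a minimal sketch under assumptions: n_at_value() and the age vector are hypothetical, the data are assumed to hold one row per person, and the threshold of 5 is the one stated above. It does not replace checking DST's guide.

```r
# Hypothetical helper: how many observations equal the reported value?
n_at_value <- function(x, value) sum(x == value, na.rm = TRUE)

# Made-up example data, assuming one value per person
age <- c(18, 18, 18, 18, 18, 18, 25, 31, 40, 67, 93)

n_min <- n_at_value(age, min(age))
n_max <- n_at_value(age, max(age))

check <- data.frame(
  statistic = c("min", "max"),
  value     = c(min(age), max(age)),
  n_people  = c(n_min, n_max)
)
check$ok <- check$n_people >= 5  # FALSE means the value can point to a single person
print(check)
```

In this made-up example the minimum is shared by several people, while the maximum belongs to one person only and should not be reported as is.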

NB: missing values (NA) count too

A missing value is also a cell. If a category has between 1 and 4 missing values, that figure may not be repatriated either. Solutions: impute, merge categories, drop the category - or another solution agreed with your data manager.

Make a quick overview of missing values, so you see the small cells immediately instead of adding figures up in your head:

colSums(is.na(df)) # number of NAs per column - spot any 1–4 at a glance
table(df$category, useNA = "always") # shows NA as its own category (otherwise hidden)

table() hides NA by default - useNA = "always" makes the NA row appear, even when its count is 0. See Phase 7 - table() hides NA.
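The same overview can be turned into a list of the columns that need attention. This is a minimal sketch, assuming a data frame like the hypothetical df_example below; the 1–4 range follows the rule described above.

```r
# Made-up example data
df_example <- data.frame(
  income    = c(NA, 210, 305, 180, 260, 240, 290, 310),
  education = c("short", NA, NA, NA, NA, NA, "long", "long"),
  region    = c("North", "South", "North", "South", "North", "South", "North", "South")
)

na_counts <- colSums(is.na(df_example))  # number of NAs per column

# Columns where the NA count is itself a small cell (1 to 4)
risky <- names(na_counts)[na_counts >= 1 & na_counts <= 4]
print(na_counts)
print(risky)
```

Columns listed in risky are the ones to impute, merge or drop before a table that reports their missing count is repatriated.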

How it works

The repatriation itself happens via “Hjemtag Filer” in the DDV app. The system automatically scans for potential microdata and flags risks. If a file is flagged, you add a comment that precisely describes what the file contains and why it is aggregated. Use descriptive filenames (table1_descriptive_n500.csv) - not generic names like output.csv.
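A descriptive filename can be built from the table's own content, so it always matches what is in the file. The sketch below is an illustration only: make_export_name() is a hypothetical helper, the naming pattern is inferred from the example filename above (with n assumed to be the total number of observations), and tempdir() is a placeholder for your project's export folder.

```r
# Hypothetical helper: build a filename that says what the file contains.
make_export_name <- function(table_id, content, n, ext = "csv") {
  sprintf("%s_%s_n%d.%s", table_id, content, as.integer(n), ext)
}

# Made-up aggregated table (counts only, no individual-level rows)
agg <- data.frame(group = c("A", "B"), n = c(240L, 260L))

file_name <- make_export_name("table1", "descriptive", sum(agg$n))
out_path  <- file.path(tempdir(), file_name)  # placeholder location
write.csv(agg, out_path, row.names = FALSE)
print(file_name)
```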

The full step-by-step process and the rules in force are in DST’s guide - follow it.

Checklist before uploading

No individual-level data - only aggregated results
All cells ≥ 5 observations
Min, max, percentiles, median: at least 5 people behind each value
No category has 1–4 missing (NA) - impute, merge or drop
Descriptive filenames; scripts clean - no microdata in the code

You have completed the full process

When your results have been repatriated, you have gone all the way - from research question and cohort to analysis-ready dataset and publication-ready results.

Need to look up a function, a register or a pitfall? → Functions: overview · Overview of registers · DST pitfalls
Working on DARTER / project 708421? → DARTER - overview and pipeline

Further information

Further depth in The Epidemiologist R Handbook:

Import and export