Common challenges and solutions when working with R and register data

A list of common challenges facing researchers who are new to R, DST’s servers, or Parquet/DuckDB data, along with solutions to overcome them.

Author: Anders Aasted Isaksen

Warning

🚧 This website and most of its contents are often updated or modified. Many documents are at various stages of completion. 🚧

Aims

This document is a joint effort between junior (and slightly more senior) researchers working with R on the servers of Statistics Denmark (DST). It was created to address frequently encountered challenges in a public space, so that everyone can benefit from it and contribute to it. Many of these challenges relate to the register data infrastructure of Steno Aarhus’ project database at Statistics Denmark, e.g. how to use Parquet files and DuckDB/duckplyr, but all commonly encountered data processing and analysis challenges are within the scope of this guide.

Note

This website and the documents contained within are aimed at researchers, PhD students, staff, and external collaborators.