Working in the Research Analysis Platform (RAP)

Published

December 13, 2024

Because of how the UKB RAP is set up, the easiest way to do your work is with Git and GitHub, through the SDCA GitHub Organization. There are a few reasons for this:

  1. Every time you start up the UKB RAP, the working environment is completely clean: no personal files, no R packages installed, nothing. So we need some way of saving our work so that we can download it again the next time we work in the RAP.

  2. Whenever you finish working for the day, you must terminate your RStudio session in the RAP, because we pay for every hour of RStudio use. We can't leave the session running all the time, which puts us in the situation described in the point above.

  3. There are several of us working on this project and we want to be able to easily collaborate and help each other out. We will also have frequent code reviews of analyses. Both the collaboration and reviews are best done through Git and GitHub.

  4. The UKB organizing committee will review all project proposals and do some basic admin tasks, so keeping things organized and centralized on GitHub will help us be effective in our tasks.

  5. Because all code will be stored on the SDCA GitHub, it will be easier to share code and show how we do things. That way you, whether a PhD student, postdoc, or researcher, can make better use of the work everyone else is doing.

For these reasons, we expect everyone working on the Steno UK Biobank project to use Git and to store their project in Steno’s GitHub Organization account.

Steps every time you enter the UKB RAP

  1. Whenever you open up the UKB RAP, you won’t have your project files or any packages installed, so you’ll need to do a few setup tasks each time you work in the RAP. I’ve written a function that takes you through the steps needed to have everything set up. Run these two lines of code in the R Console, and afterwards follow the instructions in the Console.

    install.packages("pak")
    rstudioapi::restartSession()

    Once the session has restarted, run the next two lines:

    pak::pak("steno-aarhus/ukbAid", ask = FALSE)
    rstudioapi::restartSession()

    Unfortunately, you’ll have to wait for the session to restart once more before running the next line:

    ukbAid::proj_setup_rap()

    Note: Since the UKB RAP deletes everything when you terminate the session, you’ll be backing up your project on GitHub. The function above installs the necessary packages, installs the ukbAid package, sets up your Git config, sets up your authentication (credentials) to connect to GitHub, and finally downloads your GitHub repository into the RAP environment.

  2. Once the project has been downloaded from GitHub and created in the RAP, you can open it by clicking the .Rproj file inside the project folder.

  3. You may have to update your credentials after opening up your project, so run this code and paste in your GitHub personal access token (PAT) when prompted.

    gitcreds::gitcreds_set()

  4. Then, run these two lines of code in the Console while inside your RStudio Project:

    pak::pak(ask = FALSE)
    targets::tar_make()

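    The first line installs the package dependencies declared by your project, and the second runs your project’s targets pipeline. More details about this step are also found in your own project’s main README.md file.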

  5. When you’ve finished selecting the variables you want in data-raw/project-variables.csv, open the data-raw/create-data.R script and follow the instructions there to create your dataset.
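For step 3, one way to tell whether your credentials need updating is to ask gitcreds whether it can find any. This is a minimal sketch, assuming the gitcreds package is installed in your session; `has_github_credentials()` is a hypothetical helper name used for illustration and is not part of ukbAid.

```r
# Hypothetical helper (not part of ukbAid): returns TRUE when gitcreds can find
# a credential for github.com, and FALSE otherwise.
has_github_credentials <- function() {
  tryCatch(
    {
      # gitcreds_get() raises an error when no credential is available.
      gitcreds::gitcreds_get(url = "https://github.com")
      TRUE
    },
    error = function(e) FALSE
  )
}

if (!has_github_credentials()) {
  message("No GitHub credentials found; run gitcreds::gitcreds_set() and paste in your PAT.")
}
```

Note that this only checks that a credential exists, not that the token is still valid on GitHub.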

Remember to frequently commit and push your file changes to GitHub. Otherwise all your work on the RAP will be lost the next time you log in.
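Committing and pushing can be done from RStudio’s Git pane, or from the R Console. The following is a minimal sketch of the Console route, assuming `git` is available on the system path in your RStudio session and that your GitHub credentials are already set; `save_to_github()` is a hypothetical helper written for this example, not a ukbAid function.

```r
# Hypothetical helper (not part of ukbAid): stage the given files, commit them,
# and push to the remote. Assumes `git` is on the PATH and that the working
# directory is inside your project's Git repository.
save_to_github <- function(files, message, push = TRUE) {
  run_git <- function(args) {
    status <- system2("git", args)
    if (status != 0) {
      stop("git ", paste(args, collapse = " "), " failed", call. = FALSE)
    }
    invisible(status)
  }
  # Stage only the files named explicitly, so data files are not added by accident.
  run_git(c("add", "--", shQuote(files)))
  # Git exits with an error here when there is nothing new to commit.
  run_git(c("commit", "-m", shQuote(message)))
  if (push) run_git("push")
  invisible(TRUE)
}
```

For example, `save_to_github("data-raw/project-variables.csv", "Select project variables")` would commit and push the variable list from step 5. Naming files explicitly, rather than staging everything, is a deliberate choice that helps keep data out of the Git history.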

Other notes

  • Save your changes in the Git history and push to your GitHub BEFORE Terminating your UKB-RAP project: If you terminate before saving and pushing to GitHub, you will lose your work! Read the Important notes on using GitHub page for more detail.

  • Remember, do not save any data in the Git history unless you have discussed it with the organizing committee. You can save results for tables and figures, as long as they are aggregate or statistical results.
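Before terminating, it can help to check that nothing is left only in the RAP session, as the first note above warns. This is a minimal sketch under the same assumption that `git` is on the system path; `unsaved_work()` is a hypothetical helper, not part of ukbAid.

```r
# Hypothetical helper (not part of ukbAid): report work that exists only in the
# RAP session. Assumes `git` is on the PATH and the working directory is inside
# the project's Git repository.
unsaved_work <- function() {
  # Modified, staged, or untracked files; empty when the working tree is clean.
  uncommitted <- system2("git", c("status", "--porcelain"), stdout = TRUE)
  # Local commits not yet on the upstream branch. If the branch has no upstream
  # configured, git fails and this comes back empty, so check the remote by hand.
  unpushed <- suppressWarnings(
    system2("git", c("log", "--oneline", shQuote("@{u}..")),
            stdout = TRUE, stderr = FALSE)
  )
  list(uncommitted = uncommitted, unpushed = as.character(unpushed))
}
```

Calling `unsaved_work()` from inside your project returns two character vectors. If either `uncommitted` or `unpushed` is non-empty, commit and push before you terminate the session.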