Materials for BMS 225A workshop on reproducible research and data exploration

UCSF-DSCOLAB/BMS-225A

BMS 225A Workshop on Reproducible Research and Data Exploration

GK Fragiadakis and the Students of BMS225A

January 5th, 2021

Motivation

The goal of this workshop is to lower the barrier to entry for biological data science and to instill good practices along the way. We will cover:

  • Principles of reproducible research
  • Intro to version control
  • Data exploration and resources

Reproducible Research

Resources:

Principles:

  1. Organize your data and code
  2. Do everything with a script
  3. Use version control
  4. Turn repeated code into functions (and other good coding practices)
  5. Turn scripts into reproducible reports
  6. Package functions for future use

Version control

In this workshop we will cover an introduction to version control using git and GitHub.

Sample git workflow (your git cheatsheet)

  1. Create repo on GitHub
  2. Clone repo locally
    • git clone repo-url
  3. Locally create a branch
    • git checkout -b branch-name
    • to see which branch you're on and what exists: git branch
    • to switch between branches: git checkout branch-name
  4. Make changes on that branch
  5. Add commits on that branch
    • git status (shows which files have changes and whether they are staged)
    • git add file-name (staging your file)
    • git commit -m "commit description"
  6. Push that branch to GitHub: push commits every time you come to a stopping point (at least each day)
    • git push origin branch-name
  7. When ready, create pull request on GitHub
  8. Review on GitHub
  9. Merge branch to master
  10. Delete branch on GitHub
  11. Then locally, switch to master and pull it down
    • git checkout master
    • git pull origin master
  12. Delete branch locally
    • git branch -d branch-name
  13. Repeat from step 3 for the next set of changes
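
The steps above can be tried end to end without a GitHub account. The following is a minimal sketch, not the workshop's own script: it assumes a local bare repository (remote.git) standing in for GitHub, and a second clone (reviewer) standing in for the pull request merge that would normally happen in the GitHub web interface. The directory, branch, and file names are hypothetical.

```shell
#!/bin/sh
set -e

workdir=$(mktemp -d)
cd "$workdir"

# Step 1 (stand-in): a bare repository plays the role of the GitHub repo
git init --quiet --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master

# Step 2: clone the repo locally and seed master with a first commit
git clone --quiet remote.git project
cd project
git config user.name "Example User"
git config user.email "user@example.com"
git symbolic-ref HEAD refs/heads/master
echo "# project" > README.md
git add README.md
git commit --quiet -m "Initial commit"
git push --quiet origin master

# Step 3: create a branch
git checkout --quiet -b add-analysis

# Steps 4-5: make changes, check status, stage, commit
echo "first analysis notes" > analysis-notes.txt
git status --short
git add analysis-notes.txt
git commit --quiet -m "Add analysis notes"

# Step 6: push the branch
git push --quiet origin add-analysis

# Steps 7-10 (stand-in): a second clone merges the branch into master
# and deletes it on the remote, as the GitHub pull request page would
cd "$workdir"
git clone --quiet remote.git reviewer
cd reviewer
git config user.name "Example Reviewer"
git config user.email "reviewer@example.com"
git merge --quiet --no-ff -m "Merge add-analysis" origin/add-analysis
git push --quiet origin master
git push --quiet origin --delete add-analysis

# Step 11: back in the local clone, switch to master and pull
cd "$workdir/project"
git checkout --quiet master
git pull --quiet origin master

# Step 12: delete the merged branch locally
git branch -d add-analysis
```

With a real GitHub repository, the only parts that change are the clone URL in step 2 and steps 7-10, which are done in the browser.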

Additional tips:

Make a .gitignore file listing the files git should ignore:

  • touch .gitignore
  • write in the names of files (or patterns like *.pdf) that you don't want git to track
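
As a small illustration of the .gitignore tip, the sketch below creates a throwaway repository, ignores all PDFs plus a scratch directory, and asks git which pattern matched. The file and directory names are made up for the example.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init --quiet

# One pattern per line: every PDF, and everything under scratch/
printf '%s\n' '*.pdf' 'scratch/' > .gitignore

touch figure.pdf notes.txt

# figure.pdf is not listed as untracked; notes.txt and .gitignore are
git status --short

# Show which .gitignore line is responsible for ignoring figure.pdf
git check-ignore -v figure.pdf
```

Note that .gitignore only affects untracked files. A file that has already been committed keeps being tracked even if a pattern matches it later.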

To see changes since the last commit:

  • git diff HEAD
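
It can help to see how git diff HEAD differs from plain git diff. The sketch below uses a throwaway repository with a hypothetical notes.txt file: one change is staged and one is not, and the three diff commands each show a different slice.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

echo "line one" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"

echo "line two" >> notes.txt
git add notes.txt                 # this change is now staged
echo "line three" >> notes.txt    # this change is not staged

echo "== git diff (unstaged changes only) =="
git diff
echo "== git diff --staged (staged changes only) =="
git diff --staged
echo "== git diff HEAD (everything since the last commit) =="
git diff HEAD
```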

To un-stage a file:

  • git reset name-of-file
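
A short sketch of un-staging in practice, again in a throwaway repository with hypothetical file names. Two files are staged, then one is taken back out of the staging area. The file itself is left untouched on disk.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

echo "draft" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"

echo "more" >> notes.txt
echo "temporary" > scratch.txt
git add notes.txt scratch.txt     # both files are staged

# Un-stage scratch.txt only; its contents stay in the working directory
git reset --quiet -- scratch.txt

# notes.txt is still staged, scratch.txt is back to untracked
git status --short
```

In git 2.23 and later, git restore --staged name-of-file does the same job.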

To make a repository locally instead:

  • git init
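
For example, starting a project locally might look like the sketch below. The directory is a temporary one and the README contents are placeholders. The two commented-out lines show how the local repository could later be connected to an empty GitHub repo, with repo-url left as a placeholder.

```shell
#!/bin/sh
set -e

project=$(mktemp -d)
cd "$project"

# Turn the current directory into a git repository
git init --quiet
git config user.name "Example User"
git config user.email "user@example.com"

echo "# my-analysis" > README.md
git add README.md
git commit --quiet -m "Initial commit"

# To connect to an empty repository created on GitHub later:
# git remote add origin repo-url
# git push -u origin branch-name

git log --oneline
```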

Exploratory data analysis

Resources

  1. Pre-process and tidy your data
  2. Explore your data using the Transform-Visualize-Model loop
  3. Communicate results

Resources for getting started with CyTOF and scRNAseq

CyTOF resources:

scRNAseq resources:

Getting help

We covered a lot. Now it's time to try it on your own, and to reach out if you have questions along the way.

  • Our office hours
  • Our contact info
