Highlighting reviews - documentation and readme links #153

Merged
32 commits
fdf7847
add reviews
mallewellyn Mar 14, 2024
037adb2
replace default text in readme?
mallewellyn Mar 14, 2024
44e79f4
link to reviews in readme
mallewellyn Mar 14, 2024
b2fbee7
change reviews title for consistency with contributing, authors etc
mallewellyn Mar 14, 2024
5e5117c
Merge branch 'carpentries-incubator:main' into highlighting-reviews
mallewellyn Mar 15, 2024
abf90f2
update addressed points following meeting
mallewellyn Mar 15, 2024
e0c07dd
Merge branch 'carpentries-incubator:main' into highlighting-reviews
mallewellyn Mar 20, 2024
291b2f6
Add how to cite
mallewellyn Mar 20, 2024
f6aece7
Add empty data page file
mallewellyn Mar 20, 2024
f3c4901
add prostate data description
mallewellyn Mar 20, 2024
10ab7a6
partial fill of methylation data
mallewellyn Mar 20, 2024
c10a95b
complete methylation data description
mallewellyn Mar 21, 2024
ee59d35
add titles for other data sets
mallewellyn Mar 21, 2024
674fba9
fill Horvath, needs more info
mallewellyn Mar 21, 2024
d567d8a
is the source the right reference?
mallewellyn Mar 21, 2024
944dac0
fill out breast cancer data
mallewellyn Mar 21, 2024
66cc3ba
fill scrnaseq data, see extended description
mallewellyn Mar 21, 2024
3998874
update reviews with new PR info
mallewellyn Mar 21, 2024
5247d2f
Merge branch 'carpentries-incubator:main' into highlighting-reviews
mallewellyn Mar 21, 2024
60b9da3
remove most Horvath variables
mallewellyn Mar 25, 2024
4aa4391
remove glossary, see new data glossary and #89
mallewellyn Mar 25, 2024
a6c5d4a
alan fill data page
mallewellyn Mar 26, 2024
83fa691
update for complete alt text
mallewellyn Mar 26, 2024
4ca0a4d
update for data page and key points glossary
mallewellyn Mar 26, 2024
d023db9
update reviews for completing changes in response to recent teaching …
mallewellyn Mar 27, 2024
e962989
remove reintroduction of prostate data, episode 5
mallewellyn Mar 28, 2024
a991533
Merge branch 'carpentries-incubator:main' into highlighting-reviews
mallewellyn Mar 28, 2024
caf6723
update changes made in response to Emma Rand
mallewellyn Mar 28, 2024
5074280
cross to tick
mallewellyn Mar 28, 2024
8f5c8d5
complete challenge alignment with objective tasks as in #171
mallewellyn Mar 28, 2024
fe69966
cars and farm data removed, update reviews document
mallewellyn Mar 28, 2024
301ee1e
Adjust author list
ailithewing Apr 2, 2024
3 changes: 2 additions & 1 deletion CITATION
@@ -1 +1,2 @@
FIXME: describe how to cite this lesson.
O’Callaghan A, Robertson G, Llewellyn M, Becher H, Meynert A, Vallejos C, Ewing A. (2024). High dimensional statistics with R. https://github.com/carpentries-incubator/high-dimensional-stats-r.
20 changes: 5 additions & 15 deletions README.md
@@ -2,21 +2,7 @@

[![Create a Slack Account with us](https://img.shields.io/badge/Create_Slack_Account-The_Carpentries-071159.svg)](https://swc-slack-invite.herokuapp.com/)

**Thanks for contributing to The Carpentries Incubator!**
This repository provides a blank starting point for lessons to be developed
here.

A member of the [Carpentries Curriculum Team](https://carpentries.org/team/)
will work with you to get your lesson listed on the
[Community Developed Lessons page][community-lessons]
and make sure you have everything you need to begin developing your new lesson.

## What to do next

Before you begin developing your new lesson,
here are a few things we recommend you do:

* [ ] [Add relevant topic tags to your lesson repository][cdh-topic-tags].
This repository is part of The Carpentries Incubator, a place for The Carpentries community to collaboratively create, test, and improve lessons.

## Contributing

@@ -42,6 +28,10 @@ Look for the tag
This indicates that the maintainers will welcome a pull request fixing this
issue.

## Reviews

The lesson has been iteratively developed and improved. For information on the development process, reviews, and instructor feedback following teaching, see [REVIEWS](reviews.md).

## Maintainer(s)

Current maintainers of this lesson are
21 changes: 1 addition & 20 deletions _episodes_rmd/05-factor-analysis.Rmd
@@ -80,29 +80,10 @@ components are ordered by the amount of variance they account for.

# Prostate cancer patient data

The prostate dataset represents data from 97 men who have prostate cancer.
The data come from a study which examined the correlation between the level
of prostate specific antigen and a number of clinical measures in men who were
about to receive a radical prostatectomy. The data have 97 rows and 9 columns.
We revisit the prostate dataset of 97 men who have prostate cancer.
Although not strictly a high-dimensional dataset, as with other episodes,
we use this dataset to explore the method.


Columns are:


- `lcavol`: log (cancer volume)
- `lweight`: log (prostate weight)
- `age`: age (years)
- `lbph`: log (benign prostatic hyperplasia amount)
- `svi`: seminal vesicle invasion
- `lcp`: log (capsular penetration); amount of spread of cancer in outer walls
of prostate
- `gleason`: [Gleason score](https://en.wikipedia.org/wiki/Gleason_grading_system)
- `pgg45`: percentage Gleason scores 4 or 5
- `lpsa`: log (prostate specific antigen)


In this example, we use the clinical variables to identify factors representing
various clinical variables from prostate cancer patients. Two principal
components have already been identified as explaining a large proportion
88 changes: 88 additions & 0 deletions _extras/data.md
@@ -0,0 +1,88 @@
---
title: "Data"
---

# Prostate cancer data
[Source](https://search.r-project.org/CRAN/refmans/bayesQR/html/Prostate.html)

Prostate specific antigen values and clinical measures for 97 patients hospitalised for a radical prostatectomy. Prostate specimens underwent histological and morphometric analysis. The columns are:

- lcavol: log(cancer volume)
- lweight: log(prostate weight)
- age: age (years)
- lbph: log(benign prostatic hyperplasia amount)
- svi: seminal vesicle invasion
- lcp: log(capsular penetration)
- gleason: Gleason score
- pgg45: percentage Gleason scores 4 or 5
- lpsa: log(prostate specific antigen)

# Methylation data

[Source](https://bioconductor.org/packages/release/data/experiment/html/FlowSorted.Blood.EPIC.html)

Illumina Human Methylation data from EPIC on sorted peripheral adult blood cell populations. The data record DNA methylation assays for each individual, which measure, for many sites in the genome, the proportion of DNA that carries a methyl mark (a chemical modification that does not alter the DNA sequence). The methylation assays are recorded as normalised methylation levels (M-values), where negative values correspond to unmethylated DNA and positive values correspond to methylated DNA. The data object also contains phenotypic metadata for each individual, such as age and BMI. Specifically, the data object contains:

- assay(data): normalised methylation levels
- colData(data): individual-level information
  - Sample_Well: sample well
  - Sample_Name: name of sample
  - purity: sample cell purity
  - Sex: sex
  - Age: age in years
  - weight_kg: weight in kilograms
  - height_m: height in metres
  - bmi: BMI
  - bmi_clas: BMI class
  - Ethnicity_wide: ethnicity, wide class
  - Ethnic_self: ethnicity, self-identified
  - smoker: yes/no indicator of smoker status
  - Array: type of array from the EPIC array library
  - Slide: slide identifier

# Horvath data

[Source](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0014821#s5)

Methylation markers across different age groups. The CpGmarker variable used in this lesson contains CpG site encodings.

# Breast cancer gene expression data

[Source](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2990)

Gene expression data showing microarray results for the probes used to examine gene expression profiles in 91 breast cancer patient samples, together with metadata for the sampled patients.

- assay(data): gene expression data for each individual
- colData(data): individual-level information
  - Study: study identifier
  - Age: age in years
  - Distant.RFS: indicator of distant relapse free survival
  - ER: estrogen receptor positive or negative status
  - GGI: gene expression grade index
  - Grade: histologic grade
  - Size: tumour size in cm
  - Time.RFS: time between the date of surgery and diagnosis of relapse (time in relapse free survival, RFS)

# Single-cell RNA sequencing data

[Source](https://pubmed.ncbi.nlm.nih.gov/25700174/)

Gene expression measurements for over 9000 genes in over 3000 mouse cortex and hippocampus cells. These data are an excerpt of the original source.

- assay(data): gene expression data
- colData(data): individual cell-level information
  - tissue: tissue type
  - group #: group number
  - total mRNA mol: total number of observed mRNA molecules corresponding to this cell's unique barcode identifier
  - well: the well that this cell's cDNA was stored in during processing
  - sex: sex of the donor animal
  - age: age of the donor animal
  - diameter: estimated cell diameter
  - cell_id: cell identifier
  - level1class: a cluster label identified using a mix of computational techniques and manual annotation
  - level2class: a cluster label identified using a mix of computational techniques and manual annotation
  - sizeFactor: estimated size factor for scaling normalisation, calculated using, e.g., **`scran`**.


{% include links.md %}

1 change: 0 additions & 1 deletion reference.md
@@ -2,7 +2,6 @@
layout: reference
---

## Glossary


{% include links.md %}