[ENH] Adds parameter for normalizing donor microarray expression values #90

Merged (8 commits) on Sep 4, 2019

rmarkello (Owner) commented Sep 3, 2019

Closes #45.

Adds new donor_norm parameter to abagen.get_expression_data() (and abagen CLI command) that controls how donor microarray values are normalized prior to aggregation.

Current options include 'srs' (scaled robust sigmoid) and 'zscore'.
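
For readers unfamiliar with the two options, here is a minimal NumPy sketch of what each normalization does to a single donor's samples-by-genes matrix. This illustrates the general techniques, not the code in this PR: the function names `zscore` and `scaled_robust_sigmoid`, the use of `ddof=1`, and the 1.35 IQR scaling factor (the commonly cited form of the scaled robust sigmoid) are assumptions, and abagen's own implementation may differ in details such as NaN handling.

```python
import numpy as np


def zscore(expr):
    """Z-score each gene (column) across samples (rows)."""
    expr = np.asarray(expr, dtype=float)
    # ddof=1 (sample standard deviation) is an assumption for this sketch
    return (expr - expr.mean(axis=0)) / expr.std(axis=0, ddof=1)


def scaled_robust_sigmoid(expr):
    """Robust sigmoid per gene, then rescale each gene to the unit interval."""
    expr = np.asarray(expr, dtype=float)
    median = np.median(expr, axis=0)
    q75, q25 = np.percentile(expr, [75, 25], axis=0)
    # Median and IQR are less sensitive to outliers than mean and std;
    # IQR / 1.35 approximates the std of a normal distribution.
    iqr_scaled = (q75 - q25) / 1.35
    sig = 1.0 / (1.0 + np.exp(-(expr - median) / iqr_scaled))
    # Rescale so each gene spans exactly [0, 1]
    return (sig - sig.min(axis=0)) / (sig.max(axis=0) - sig.min(axis=0))


# Synthetic data standing in for one donor: 50 samples x 4 genes
rng = np.random.default_rng(0)
donor_expr = rng.normal(loc=5.0, scale=2.0, size=(50, 4))
donor_expr[0, 0] = 100.0  # a single outlier sample for the first gene

srs = scaled_robust_sigmoid(donor_expr)
zs = zscore(donor_expr)
```

With the outlier in place, the z-scored first gene is compressed toward zero for all other samples, whereas the sigmoid-based version saturates the outlier near 1 and preserves the spread of the remaining values.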

To do:

  • Consider adding batch normalization (i.e., linear models across donors)
  • Consider adding within-sample normalization prior to across-sample normalization

For normalizing gene expression values across regions for each donor
prior to donor aggregation.
codecov bot commented Sep 3, 2019

Codecov Report

Merging #90 into master will increase coverage by 0.48%.
The diff coverage is 99.13%.

@@            Coverage Diff             @@
##           master      #90      +/-   ##
==========================================
+ Coverage   90.57%   91.05%   +0.48%     
==========================================
  Files          32       31       -1     
  Lines        1730     1812      +82     
==========================================
+ Hits         1567     1650      +83     
+ Misses        163      162       -1

| Impacted Files | Coverage Δ |
|---|---|
| abagen/correct.py | 99% <100%> (+0.86%) ⬆️ |
| abagen/tests/test_cli.py | 100% <100%> (ø) ⬆️ |
| abagen/__init__.py | 100% <100%> (ø) ⬆️ |
| abagen/allen.py | 95% <100%> (+0.06%) ⬆️ |
| abagen/tests/test_correct.py | 100% <100%> (ø) ⬆️ |
| abagen/cli/run.py | 93.18% <80%> (+1.61%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5120ccd...a8acfdf.

All functions are moved (including tests) and imports are updated as
necessary.

Also adds a 'batch' option to the normalize_expression() function,
which uses linear models to residualize donor differences.
NaN handling is performed entirely by normalize_expression().
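
As a rough illustration of what "using linear models to residualize donor differences" can mean, the sketch below fits an ordinary least-squares model with an intercept plus donor indicator columns, then subtracts the fitted donor effects from every gene. This is a generic technique under stated assumptions, not the code added in this PR: the function name `residualize_donors`, the choice of the first donor as the reference level, and the decision to keep the intercept are all hypothetical, and NaN handling is omitted.

```python
import numpy as np


def residualize_donors(expr_by_donor):
    """Remove per-donor mean offsets from each gene using an OLS fit.

    expr_by_donor: list of (samples x genes) arrays, one per donor,
    all with the same number of genes. Hypothetical helper for illustration.
    """
    sizes = [len(expr) for expr in expr_by_donor]
    data = np.vstack(expr_by_donor).astype(float)
    n_donors = len(expr_by_donor)

    # Donor label for every stacked sample
    labels = np.repeat(np.arange(n_donors), sizes)
    # Indicator columns for every donor except the first (reference) donor
    dummies = (labels[:, None] == np.arange(1, n_donors)).astype(float)
    design = np.column_stack([np.ones(len(data)), dummies])

    # One least-squares fit covers all genes (columns of `data`) at once
    beta, *_ = np.linalg.lstsq(design, data, rcond=None)

    # Subtract only the donor effects; the intercept (reference mean) is kept
    corrected = data - dummies @ beta[1:]
    return np.split(corrected, np.cumsum(sizes)[:-1])


# Synthetic example: three donors with different offsets, 3 genes each
rng = np.random.default_rng(0)
donors = [
    rng.normal(size=(10, 3)) + 2.0,
    rng.normal(size=(8, 3)) - 1.0,
    rng.normal(size=(12, 3)),
]
corrected = residualize_donors(donors)
donor_means = [expr.mean(axis=0) for expr in corrected]
```

After correction, every donor has the same per-gene mean as the reference donor, while within-donor variation across samples is left unchanged.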
rmarkello (Owner, Author) commented

I think within-sample normalization should be relegated to a separate PR since this one is already a bit monstrous. It's different, conceptually, from what donor_norm is doing here, too! I'll open another issue for it and go from there.

rmarkello merged commit 649afc2 into master on Sep 4, 2019
rmarkello deleted the donornorm branch on September 4, 2019, 20:04
Successfully merging this pull request may close these issues.

Add additional donor normalization options