Update batch processing length normalization to match non-batch processing length normalization #441
update help print statements
I had a quick look and think this needs work.
src/main.cpp
Outdated
//<< " --error_rate Estimated error rate of long reads (required for --long)" << endl
<< " --threshold Threshold for rate of unmapped kmers per read" << endl
This looks accidental.
Could you clarify what looks accidental?
@@ -2180,7 +2180,7 @@ void usageTCCQuant(bool valid_input = true) {
<< " (default: equivalence classes are taken from the index)" << endl
<< "-f, --fragment-file=FILE File containing fragment length distribution" << endl
<< " (default: effective length normalization is not performed)" << endl
<< "--long Use version of EM for long reads " << endl
See above.
Could you clarify your question?
The same problem as in line 2075.
@@ -2380,7 +2380,7 @@ int main(int argc, char *argv[]) {
if (fld_lr_c[i] > 0.5) {
//Good results with comment below.
//flensout_f << std::fabs((double)fld_lr[i] / (double)fld_lr_c[i] - index.k);//index.target_lens_[i] - (double)fld_lr[i] / (double)fld_lr_c[i] - k); // take mean of recorded uniquely aligning read lengths
flensout_f << std::fabs(index.target_lens_[i] - ((double)fld_lr[i] / (double)fld_lr_c[i]) - index.k);
Care to elaborate a bit? Ideally in a code comment, or at least in the commit message?
Hi! Based on our analysis of effective length normalization for long reads, the updated effective length calculation gives better results.
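For readers unfamiliar with the code, here is a minimal sketch of the two expressions visible in the hunk above: the commented-out alternative and the active line. The function and parameter names are hypothetical (the PR computes these inline in main() from fld_lr, fld_lr_c, index.target_lens_ and index.k), and the excerpt does not show what the line looked like before the change, so this only contrasts the two formulas as written.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of the PR.
// Mirrors the commented-out line: |mean read length - k|.
double altEffLen(double sumReadLen, double nReads, int k) {
  assert(nReads > 0.5);  // same guard as `fld_lr_c[i] > 0.5` in the hunk
  return std::fabs(sumReadLen / nReads - k);
}

// Hypothetical helper, not part of the PR.
// Mirrors the active line: |target length - mean read length - k|.
double activeEffLen(double targetLen, double sumReadLen, double nReads, int k) {
  assert(nReads > 0.5);  // same guard as `fld_lr_c[i] > 0.5` in the hunk
  return std::fabs(targetLen - sumReadLen / nReads - k);
}
```

With illustrative values (target length 1500, three uniquely aligning reads summing to 2400 bases, k = 31), the mean read length is 800, so altEffLen gives 769 and activeEffLen gives 669. The difference is that the active line subtracts the mean read length from the target length rather than using the mean read length on its own.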
I still don't follow what you are changing and why, but I am not familiar with the code. So if this makes sense to others without further explanation, feel free to ignore my comment.
@bound-to-love: I tried to answer your questions.