-
Notifications
You must be signed in to change notification settings - Fork 661
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add MelResNet Block #705
Add MelResNet Block #705
Conversation
Codecov Report
@@ Coverage Diff @@
## master #705 +/- ##
==========================================
+ Coverage 89.05% 89.15% +0.09%
==========================================
Files 28 29 +1
Lines 2467 2489 +22
==========================================
+ Hits 2197 2219 +22
Misses 270 270
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is draft but added some comments before I forget.
Good work!
torchaudio/models/wavernn.py
Outdated
class ResBlock(nn.Module): | ||
r""" | ||
Args: | ||
num_dims (int, optional): Number of compute dimensions in ResBlock. (Default: ``128``) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see the default value declared in function signature. Did you forget?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion. I will update the file with comments. The default value will be added.
test/test_models.py
Outdated
@pytest.mark.parametrize('batch_size', [2]) | ||
@pytest.mark.parametrize('num_features', [200]) | ||
@pytest.mark.parametrize('input_dims', [100]) | ||
@pytest.mark.parametrize('output_dims', [128]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you are not really parameterizing the variables, it's better to put them inside of test definition.
If you add more parameters, please use parameterized.parameterized_expand
, instead of pytest.mark.parametrize
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will check it. Thanks.
For the docstring, let's follow transformer as an example. |
Updates:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pointed to a few nits, but this looks good to me overall :)
Can you rebase? I see a few unrelated errors that have been fixed on master.
|
torchaudio/models/wavernn.py
Outdated
(https://github.com/G-Wang/WaveRNN-Pytorch) | ||
|
||
Args: | ||
num_dims: the number of compute dimensions in the input (default=128). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thinking out loud: we decided in conventions to use n_*
see readme. What is written here aligns with wav2letter though. We may need to revisit these conventions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updates:
|
Tests should pass after rebasing, #720. |
@jimchen90 -- While we are working on implementing the full pipeline, let's make the wavernn module private by prefixing the file with an underscore, Please also add a fourth step in the description:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM :)
@@ -29,3 +29,23 @@ def test_mfcc(self): | |||
out = model(x) | |||
|
|||
assert out.size() == (batch_size, num_classes, 2) | |||
|
|||
|
|||
class TestMelResNet: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test does not subclass unittest.TestCase
so it won't run in fbcode
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch! @jimchen90 -- can you send a follow-up pull request to update this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I will update it.
Thanks for your contribution! Respectfully, the WaveRNN paper never mentions "upsampling" or "residual block" (the only mention is in reference to WaveNet) or "mel" or "spectrogram". It would be inaccurate to say:
The "MelResNet" block is never mentioned or referenced in the paper... and it would be inaccurate to state it was. |
input_dims: the number of input sequence (default=100). | ||
hidden_dims: the number of compute dimensions (default=128). | ||
output_dims: the number of output sequence (default=128). | ||
pad: the number of kernal size (pad * 2 + 1) in the first Conv1d layer (default=2). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: typo "kernel"
Thank you so much for pointing this out. @PetrochukM |
Happy to help! With regards to references... There are no papers written about "fatchord's model", that I know about. From my understanding, there also hasn't been any formal evaluation of "fatchord's model". From an identity perspective, I don't even know fatchord's name, so it's hard to credit. I'm also a bit perplexed. I'm thinking it doesn't make sense to implement a model that has not been published, peer-reviewed, or evaluated. |
Simple example to demonstrate parameter server training pattern
This MelResNet block is part of WaveRNN model . Now the test is to validate the output dimensions of this block. Other tests will be added after other blocks are combined.
Related to #446
Stack:
1. Add MelResNet Block #705 #7512. Add Upsampling Block #7243. Add WaveRNN Model #7354. Add example pipeline with WaveRNN #749