Hello, may I ask you some questions about the training process?
I have changed SR to 24 kHz and HOP_SIZE to 300, which gives an 80 Hz spectral feature as input. I trained on my own dataset, and the training curves are as follows:
The VQ loss keeps increasing, while the accuracy stays at around 75%.
Is this normal?
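For reference, the 80 Hz figure above is just the sample rate divided by the hop size. Here is a minimal sketch of that arithmetic. The helper names `frame_rate_hz` and `num_frames` are only for illustration and are not from the repository, and the frame count assumes one frame per full hop with no padding, which may differ from the actual feature extractor.

```python
# Values from my modified config.
SR = 24_000      # sample rate in Hz
HOP_SIZE = 300   # hop size in samples


def frame_rate_hz(sample_rate: int, hop_size: int) -> float:
    """Number of feature frames produced per second of audio."""
    return sample_rate / hop_size


def num_frames(num_samples: int, hop_size: int) -> int:
    """Frame count for a clip, ASSUMING one frame per full hop and no padding.

    The real extractor may pad or center frames, which shifts this by a few frames.
    """
    return num_samples // hop_size


if __name__ == "__main__":
    print(frame_rate_hz(SR, HOP_SIZE))   # frames per second
    print(num_frames(2 * SR, HOP_SIZE))  # frames in a 2-second clip
```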
My actual goal is to use this model as an unsupervised phone loss, but its input size is fixed. So I would also like to know: will the phonetic discrimination performance still be good for inputs of arbitrary length?
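To make the question concrete, one workaround I can imagine is cutting a longer input into fixed-length windows and feeding each window to the model separately. The sketch below only illustrates that idea. `split_into_windows`, `window`, and `hop` are hypothetical names, the frames are treated as a plain 1-D list for simplicity, and zero-padding the last window is my own assumption, not something the repository does. I do not know whether the model behaves well on windows cut this way.

```python
from typing import List


def split_into_windows(frames: List[float], window: int, hop: int) -> List[List[float]]:
    """Split a sequence of frames into fixed-length windows.

    The last window is zero-padded so every window has exactly `window` frames
    (padding with zeros is an assumption made for this sketch).
    """
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")

    windows = []
    start = 0
    while True:
        chunk = list(frames[start:start + window])
        # Pad the final, shorter chunk up to the fixed window length.
        chunk += [0.0] * (window - len(chunk))
        windows.append(chunk)
        if start + window >= len(frames):
            break
        start += hop
    return windows


if __name__ == "__main__":
    # 10 frames, fixed input size of 4 frames, non-overlapping windows.
    for w in split_into_windows(list(range(10)), window=4, hop=4):
        print(w)
```

With `hop` smaller than `window` the windows overlap, so the per-window outputs would need to be averaged or stitched in some way.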
Thank you.