Seed parameter while generating #6
Ah yes, thanks for spotting this - the error looks like an issue caused by trying to assign to a tensor, which is not possible since they're immutable in TensorFlow... It was originally a NumPy array, and those are mutable. Apologies, I'll get this fixed. I'm actually not sure what the result of having different sample rates during training and generation would be. With regard to the effect of seeding, I suspect it needs a larger chunk of samples to have much effect... I need to look at that feature again; sorry, I haven't had any time to work on it.
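Until a fix lands, one possible workaround for the TypeError shown below is to do the slice assignment on a mutable NumPy buffer and only convert to a tensor afterwards. This is only a sketch; the buffer length (1024) and shapes are made up for illustration, not taken from generate.py:

```python
import numpy as np
import tensorflow as tf

num_seqs, total_len, big_frame_size = 100, 1024, 64       # illustrative values only
seed_q = tf.constant(np.random.randint(0, 256, size=(num_seqs, big_frame_size, 1)))  # stand-in for quantize(seed_audio, ...)

# What raises the TypeError: slice-assigning into an EagerTensor.
# init_samples = tf.zeros((num_seqs, total_len, 1), dtype=tf.int64)
# init_samples[:, :big_frame_size, :] = seed_q             # item assignment not supported

# Workaround sketch: keep the buffer as a mutable NumPy array, assign, then convert.
init_samples = np.zeros((num_seqs, total_len, 1), dtype=np.int64)
init_samples[:, :big_frame_size, :] = seed_q.numpy()
init_samples = tf.convert_to_tensor(init_samples)
print(init_samples.shape)                                  # (100, 1024, 1)
```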
Thank you for the super-fast reply! It may be that different sample rates speed up or slow down the generated audio (something similar happened during training with MelGAN), so I switched to 16 kHz for every parameter.
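In case it helps anyone else, resampling a folder of chunks to a single rate can be done with something like the sketch below (the paths are examples, not part of the repo):

```python
import librosa
import soundfile as sf
from pathlib import Path

TARGET_SR = 16000
out_dir = Path("./chunks_16k")
out_dir.mkdir(exist_ok=True)

for wav in Path("./chunks").glob("*.wav"):
    y, _ = librosa.load(wav, sr=TARGET_SR, mono=True)   # librosa resamples while loading
    sf.write(out_dir / wav.name, y, TARGET_SR)
```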
Thanks - yes, a 4-hour dataset should yield something useful... 4 minutes, unfortunately, is unlikely to be practical. You're welcome to try it, but my guess is that the model will simply overfit, meaning it will basically just memorize your dataset. Incidentally, in terms of getting a better training workflow, we will shortly be releasing a model tuner/optimizer with the code... It's based on Keras Tuner, and there is already a branch available, although it is experimental and buggy at the moment, with no documentation on the tuner. I hope to merge this within the next week or so; I'm testing the implementation on some large datasets now. So instead of blindly picking some hyperparameters and hoping for the best, the tuner will let users find the optimal hyperparameters for a dataset, then proceed to a full training session with those hyperparameters. It's still likely to be more of an art than a science (which is not a bad thing!), but better than stabbing in the dark!
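For anyone unfamiliar with Keras Tuner, the general pattern it builds on looks roughly like this. The model, hyperparameter names and ranges here are purely illustrative and are not taken from the tuner branch:

```python
import keras_tuner as kt
import tensorflow as tf

def build_model(hp):
    # Toy next-sample classifier; only the hp.* calls matter for the pattern.
    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(hp.Choice("rnn_units", [256, 512, 1024]), input_shape=(1024, 1)),
        tf.keras.layers.Dense(256, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(hp.Float("learning_rate", 1e-4, 1e-2, sampling="log")),
        loss="sparse_categorical_crossentropy",
    )
    return model

tuner = kt.RandomSearch(build_model, objective="val_loss", max_trials=10, directory="tuning")
# tuner.search(train_ds, validation_data=val_ds, epochs=5)
# best_hp = tuner.get_best_hyperparameters(num_trials=1)[0]   # then train for real with these
```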
An actual 4-minute-long file would not work and would only produce meaningless noise. Perhaps repeating the audio until it reaches more than 25 minutes would work?
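If anyone wants to try that repeat-until-long-enough idea, here is a quick sketch (file names are examples); note that repetition adds no new information, so the overfitting concern above still applies:

```python
import numpy as np
import soundfile as sf

y, sr = sf.read("four_minute_track.wav")               # example file name
target = 25 * 60 * sr                                  # roughly 25 minutes of samples
reps = int(np.ceil(target / len(y)))
sf.write("repeated_25min.wav", np.concatenate([y] * reps)[:target], sr)
```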
@DigestContent0 @bstivic Indeed, that would work; perhaps a better solution would be some kind of data augmentation (although I still suspect you'd need more raw data than a single 4-min track). This is often used when working with images. I have recently been investigating this very issue, with a view to including a data augmentation script in a future release. I've been experimenting with audiomentations, which looks promising. Dadabots claim that they got good results from 3200 chunks (overlapped). Having worked with datasets of a few hundred chunks, I can confirm that, whilst you might be able to achieve good training accuracy, validation accuracy indicates classic overfitting after a few epochs (that's on the validate branch, which I am hoping to merge to master very shortly). I've added a gist for using audiomentations on a directory of wav files.
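For reference, a directory-level pass with audiomentations looks roughly like the sketch below. This is not the linked gist; the transforms, probabilities and paths are just examples:

```python
import librosa
import soundfile as sf
from pathlib import Path
from audiomentations import Compose, AddGaussianNoise, TimeStretch, PitchShift

# Each transform is applied with probability p, so every output chunk differs slightly.
augment = Compose([
    AddGaussianNoise(min_amplitude=0.001, max_amplitude=0.015, p=0.5),
    TimeStretch(min_rate=0.9, max_rate=1.1, p=0.5),
    PitchShift(min_semitones=-2, max_semitones=2, p=0.5),
])

out_dir = Path("./chunks_augmented")
out_dir.mkdir(exist_ok=True)
for wav in Path("./chunks").glob("*.wav"):
    y, sr = librosa.load(wav, sr=None, mono=True)      # keep the original sample rate
    sf.write(out_dir / f"aug_{wav.name}", augment(samples=y, sample_rate=sr), sr)
```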
Hi,
I have a problem with generating audio from a seed audio file. As I understood it, when we provide a seed audio file, generation continues after the first 64 samples of the seed, and it should push generation in a direction other than the default? I am trying to use a seed because I get almost identical or very similar generated audio (epoch 150, 30-minute dataset, 22050 Hz) with different training parameters every time.
The error I get when trying to seed:
!python generate.py \
  --output_path ./generated/default/test_1_t075_s10_16000.wav \
  --checkpoint_path ./logdir/default/26.09.2020_12.35.35/model.ckpt-140 \
  --seed ./chunks/chunk_22050_mono_norm_chunk_109.wav \
  --dur 10 \
  --sample_rate 16000 \
  --temperature 0.75 \
  --num_seqs 100 \
  --config_file ./default.config.json
Traceback (most recent call last):
  File "generate.py", line 225, in <module>
    main()
  File "generate.py", line 221, in main
    args.sample_rate, args.temperature, args.seed, args.seed_offset)
  File "generate.py", line 188, in generate
    init_samples[:, :model.big_frame_size, :] = quantize(seed_audio, q_type, q_levels)
TypeError: 'tensorflow.python.framework.ops.EagerTensor' object does not support item assignment
One more question: could the sample rate differences be the problem? Does the sample rate have to be the same for training, generation, and the seed audio?
Best regards,
Branimir