todos.txt


Experimentation:

Debugging checkpoints: 
    - v0 LJ model: no masking, no stop predictions (unmasked)
    - v0.5 LJ model: masking, no stop predictions 
    - v1 LJ model: masking + stop predictions

- General
    - Generate train / validation evaluation loss numbers 
- Indic
    * Rerun baseline training from scratch with tf.data pipeline, masked loss, stop prediction (v1)
    * Directly fine-tune pretrained model from LJSpeech (hyperparam search)
    * Unsupervised training on mixed language audio    
    * Transfer learning with unsupervised initialization
    * Transfer with frozen params, lower LR
    - Transfer unsupervised from indic to lj data 
    - Train SSRN on Indic dataset 
- LJSpeech 
    * Retrain M4 model with masked loss, stop predictions
    * modify attention with position encodings, w w/o guided attention
    - multi-task learning, joint optimization (long shot)

General Implementation: 

- For Transfer learning experiments    
    * Implement direct transfer learning 
    * Implement partial loading of ljspeech model 
    * Modify training graph for unsupervised training
    - Modify data loader to use both languages 
- Data pipeline (Refer: https://github.com/tensorflow/tensor2tensor/blob/master/docs/overview.md)
    * Implement dataset preprocessing
    * Implement tf.data.Dataset based pipeline
    * Fix deprecated WARNING: switch to tf.train.MonitoredTrainingSession
!   - Bucket data generator by sequence length 
!   - Implement training + validation during training 
    - Fix SSRN patch-wise data generator with tfrecords
- Other improvements/bugfixes
    * Implement fix: loss masking for padded batches (text2mel,ssrn)
    * Implement stop prediction
    - implement dropout, layernorm / other training tricks
    - Globally use only params.data_dir, remove wav, csv, other path specifications

- Documentation
    - Document params file parameters

Core Implementations:

* TextEnc implementation:
    * implement, test highway conv
    * implement, test text_enc block
    * test initialization of highway conv layers

* Decoder implementation:
    * implement, test causal conv
    * implement, test causal highway conv
    * implement, test causal decoder F2d 
    * implement, test causal decoder d2F 

* Attention implementation:
    * implement basic attention mechanism
    * add, test guided attention loss
    * test training with positional encoding

* SSRN implementation:
    * implement, test transposed convolution
    * implement full SSRN module
    * modify build graph
    * modify data pipeline for faster patch-based training 

* Misc implementation: 
    * predict_op, loss_op
    * train_op, train/predict on batch
    * training, checkpointing, evaluation framework
    * train/dev data load

* Inference/Synthesis graph 
    * implement restoring of variables 
    * implement output feedback
!   - Fix inference with stop prediction (crop mag generation)
    - Implement constrained attention at inference