I truly appreciate the FLM-101B team open-sourcing this large-scale language model! After reading the paper, I have some thoughts on optimizing the training framework, mainly in the following areas:
Progressive data selection strategy: feeding different datasets to the model at different scales, so that capability is built up gradually (sketch after this list).
Parameter-update-driven growth: inserting new layers based on how much each existing layer's parameters are still changing (sketch after this list).
Layer-wise learning rates: setting an independent learning rate for each layer (sketch after this list).
Genetic-algorithm-based model expansion.
Incremental fine-tuning for transfer learning (sketch after this list).
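On the progressive data selection point, here is a minimal sketch of what a scale-dependent data schedule could look like. The stage names follow the 16B → 51B → 101B growth sequence described in the FLM-101B paper, but the dataset names and mixes below are placeholders of my own, not the paper's actual data recipe:

```python
# Hypothetical schedule: dataset mix per growth stage (names are placeholders).
DATA_SCHEDULE = {
    "16B":  ["general_web"],
    "51B":  ["general_web", "books"],
    "101B": ["general_web", "books", "code", "academic"],
}

def datasets_for(stage: str) -> list[str]:
    """Return the dataset mix to sample from at the given model scale."""
    return DATA_SCHEDULE[stage]
```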
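For parameter-update-driven growth, one way to make "layer update status" concrete is the relative parameter change between two checkpoints, then duplicating the most active layer. This is only a sketch of the idea; the `threshold` value and the duplication-based initialization are my assumptions, not an existing method in the codebase:

```python
import copy
import torch
from torch import nn

def layer_update_norms(old_layers, new_layers):
    """Relative parameter change per layer between two checkpoints."""
    norms = []
    for old, new in zip(old_layers, new_layers):
        delta = sum(((p_new - p_old) ** 2).sum()
                    for p_old, p_new in zip(old.parameters(), new.parameters()))
        total = sum((p ** 2).sum() for p in old.parameters())
        norms.append(torch.sqrt(delta / total).item())
    return norms

def grow_where_active(layers, norms, threshold=0.1):
    """Insert a copy of the layer whose parameters are still changing the
    most, if its relative update exceeds the threshold (assumed policy)."""
    grown = list(layers)
    idx = max(range(len(norms)), key=norms.__getitem__)
    if norms[idx] > threshold:
        grown.insert(idx + 1, copy.deepcopy(layers[idx]))
    return nn.ModuleList(grown)
```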
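For layer-wise learning rates, PyTorch's optimizer parameter groups already support this directly. A minimal sketch, assuming a toy two-layer model; the layer split and the rate values are purely illustrative:

```python
import torch
from torch import nn

# Toy model standing in for a deep transformer stack (illustrative only).
model = nn.Sequential(
    nn.Linear(512, 512),  # "lower" layer, assumed closer to convergence
    nn.ReLU(),
    nn.Linear(512, 512),  # "upper" layer, assumed to still need larger steps
)

# One optimizer parameter group per layer, each with its own learning rate.
optimizer = torch.optim.AdamW([
    {"params": model[0].parameters(), "lr": 1e-5},
    {"params": model[2].parameters(), "lr": 1e-4},
])
```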
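And for incremental fine-tuning, the usual transfer-learning recipe is to freeze the transferred parameters and train only the newly added capacity. A minimal sketch, where which module names count as "new" is an assumption to be filled in per model:

```python
from torch import nn

def freeze_transferred(model: nn.Module, new_module_names: set[str]):
    """Freeze everything except the newly inserted modules, so fine-tuning
    only updates the new capacity (standard transfer-learning practice)."""
    for name, param in model.named_parameters():
        param.requires_grad = any(name.startswith(n) for n in new_module_names)
```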
I have drafted a document elaborating on these ideas in detail. If the team finds it relevant, I would be very happy to discuss these training-framework optimizations further. Please reply to this issue or contact me at [email protected].
Thank you again to the FLM-101B team for their contributions!