I am curious how you implemented these experiments. Taking Figure 6 of EAGLE 1 as an example: for tokens generated by the large target model, the feature and the unshifted token can be concatenated. But for tokens generated by the draft model, how do you obtain the features without running the target model in advance?
It is the feature & shifted token in that case. The feature predicted by the draft model is passed through the LM head to obtain a distribution, and the next token is sampled from that distribution. In the next round, we concatenate this predicted feature with the sampled token as input for the next generation step. Figure 6 illustrates this clearly.
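The loop described above can be sketched as follows. This is a minimal toy illustration, not EAGLE's actual code: `W_draft`, `lm_head`, and `embed` are hypothetical random stand-ins for the real draft network, shared LM head, and embedding table, and the initial feature would in practice come from the target model's last forward pass.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN, VOCAB = 8, 16

# Hypothetical random stand-ins (illustration only):
lm_head = rng.standard_normal((HIDDEN, VOCAB))           # shared LM head
embed = rng.standard_normal((VOCAB, HIDDEN))             # token embeddings
W_draft = rng.standard_normal((2 * HIDDEN, HIDDEN)) * 0.1  # toy draft net

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def draft_step(feature, token_id):
    """One draft round: predict the next feature from the concatenation
    of (current feature, current token embedding), push it through the
    LM head to get a distribution, and sample the next token."""
    x = np.concatenate([feature, embed[token_id]])
    next_feature = np.tanh(x @ W_draft)            # predicted feature
    probs = softmax(next_feature @ lm_head)        # vocab distribution
    next_token = int(rng.choice(VOCAB, p=probs))   # sampled next token
    return next_feature, next_token

# Seed with the target model's last feature and last sampled token;
# the draft model then rolls forward on its own predicted features,
# so no extra target-model forward passes are needed during drafting.
feature, token = rng.standard_normal(HIDDEN), 0
drafted = []
for _ in range(4):
    feature, token = draft_step(feature, token)
    drafted.append(token)
print(drafted)
```

The key point the sketch shows: after the first step, the "feature" half of the input is the draft model's own previous prediction, which is why no target-model forward pass is needed until verification.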