
My multi-GPU is not getting detected? #1935

Closed Answered by NanoCode012
ItzAmirreza asked this question in Q&A

Hey, my bad there. I also fixed the numbers in my earlier message.

I mean, sequence_len: 32768 is what you should set to use the model's full context listed above. The rope_theta param within the model's config (linked above) is what allows it to scale from 32k up to its long context.

i.e. set the below and your model will train fine and support 100k sequences during inference :)

sequence_len: 32768
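For intuition on why a larger rope_theta stretches the usable context (a standalone sketch of the RoPE math, not axolotl's or the model's actual code; the head_dim default is an illustrative assumption):

```python
import math

def longest_rope_wavelength(theta: float, head_dim: int = 128) -> float:
    """Longest RoPE wavelength (in tokens) for a given base theta.

    RoPE rotates each dimension pair i at inverse frequency
    theta ** (-2 * i / head_dim); the slowest-rotating pair
    (i = head_dim // 2 - 1) sets roughly the longest position
    the encoding can still distinguish.
    """
    slowest_inv_freq = theta ** (-(head_dim - 2) / head_dim)
    return 2 * math.pi / slowest_inv_freq

# Raising the base from a common default (10_000) to a long-context
# value (1_000_000) stretches the slowest wavelength far past 32k tokens.
w_default = longest_rope_wavelength(10_000.0)
w_long = longest_rope_wavelength(1_000_000.0)
assert w_long > w_default
```

This is why training at sequence_len: 32768 can still yield a model whose position encoding remains meaningful at much longer inference lengths, provided the config ships the larger rope_theta.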

Answer selected by ItzAmirreza