Issues: atoma-network/atoma-infer
Sending logprobs in the chat completions body should be optional
Labels: good first issue, openai-api, vLLM
#111 opened Oct 8, 2024 by jorgeantonio21
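A minimal sketch of what making logprobs optional could look like: the request struct models the field as an Option, so an omitted field deserializes to None and the engine can skip log-probability computation. The type and field names here are assumptions for illustration, not the actual atoma-infer definitions.

```rust
// Hypothetical request type: `logprobs` is optional, so clients that
// omit it pay no log-probability cost. Names are illustrative only.
#[derive(Debug, Default)]
struct ChatCompletionRequest {
    model: String,
    // `None` means the client omitted the field entirely.
    logprobs: Option<bool>,
    top_logprobs: Option<u32>,
}

// Only compute logprobs when the client explicitly asked for them.
fn wants_logprobs(req: &ChatCompletionRequest) -> bool {
    req.logprobs.unwrap_or(false)
}

fn main() {
    // A request that never mentions logprobs.
    let req = ChatCompletionRequest {
        model: "llama".to_string(),
        ..Default::default()
    };
    println!("logprobs requested: {}", wants_logprobs(&req));
    // → logprobs requested: false
}
```

With a serde-based handler the same effect would come from defaulting the field, so existing clients that never send logprobs keep working unchanged.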
Introduce multi-model hosting for the axum service
Labels: enhancement, good first issue, models, openai-api
#100 opened Oct 3, 2024 by jorgeantonio21
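One common shape for multi-model hosting is a registry keyed by model id that the HTTP layer consults to route each request. The sketch below, using only the standard library, shows that routing core; the ModelHandle and Registry names are assumptions, not types from the codebase, and a real axum service would hold the registry in shared state.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a loaded model (weights, tokenizer, etc.).
struct ModelHandle {
    name: String,
}

// Registry mapping the model id from the request body to its handle.
struct Registry {
    models: HashMap<String, ModelHandle>,
}

impl Registry {
    fn new() -> Self {
        Registry { models: HashMap::new() }
    }

    fn register(&mut self, id: &str) {
        self.models
            .insert(id.to_string(), ModelHandle { name: id.to_string() });
    }

    // Returns `None` for unknown models so the handler can answer 404.
    fn route(&self, id: &str) -> Option<&ModelHandle> {
        self.models.get(id)
    }
}

fn main() {
    let mut reg = Registry::new();
    reg.register("llama3-8b");
    reg.register("mistral-7b");

    match reg.route("llama3-8b") {
        Some(h) => println!("routed to {}", h.name),
        None => println!("model not hosted"),
    }
    // → routed to llama3-8b
}
```

In an axum service this registry would typically live in an Arc inside the router state, with each chat-completions handler looking up the model named in the request body.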
Add proper shutdown for LlmEngine
Labels: enhancement, good first issue, openai-api, vLLM
#98 opened Oct 2, 2024 by jorgeantonio21
Add proper shutdown for TokenizerWorker
Labels: enhancement, good first issue, openai-api, vLLM
#97 opened Oct 2, 2024 by jorgeantonio21
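Both shutdown issues (#98 for the LlmEngine, #97 for the TokenizerWorker) amount to the same pattern: a worker loop that drains its request channel and exits cleanly on an explicit shutdown signal instead of being dropped mid-request. The std-only sketch below shows that pattern; the Msg enum and counter are illustrative, not the project's actual message types, and the real workers are async tokio tasks rather than threads.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical worker protocol: regular work plus a shutdown signal.
enum Msg {
    Tokenize(String),
    Shutdown,
}

fn main() {
    let (tx, rx) = mpsc::channel::<Msg>();

    // Worker loop: processes messages until it sees `Shutdown`,
    // giving it a chance to flush state before exiting.
    let worker = thread::spawn(move || {
        let mut processed = 0u32;
        for msg in rx {
            match msg {
                Msg::Tokenize(_text) => processed += 1,
                Msg::Shutdown => break, // flush/drain here, then exit
            }
        }
        processed
    });

    tx.send(Msg::Tokenize("hello world".into())).unwrap();
    tx.send(Msg::Shutdown).unwrap();

    // `join` guarantees the worker finished its cleanup path.
    let processed = worker.join().unwrap();
    println!("processed {} requests before shutdown", processed);
    // → processed 1 requests before shutdown
}
```

The same structure maps onto tokio: a select! over the request channel and a shutdown signal (or a CancellationToken), with the task awaited during server teardown.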
Add support for streaming tokens and compatibility with axum server (using SSE)
Labels: openai-api, vLLM
#85 opened Sep 26, 2024 by jorgeantonio21
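Server-sent events are just a framing convention over a long-lived HTTP response: each event is a "data: ..." line followed by a blank line, and OpenAI-style streaming ends the stream with a "data: [DONE]" sentinel. The sketch below shows that wire format per generated token; the JSON payload shape is an assumption for illustration, and in the real service axum's SSE response type would do this framing over an async token stream.

```rust
// Hypothetical per-token SSE frame. In practice the payload would be
// a full chat.completion.chunk JSON object, not this minimal shape.
fn sse_frame(token: &str) -> String {
    format!("data: {{\"token\":\"{}\"}}\n\n", token)
}

fn main() {
    let tokens = ["Hello", ",", " world"];

    // Concatenate one frame per token, then the end-of-stream sentinel.
    let mut stream = String::new();
    for t in tokens {
        stream.push_str(&sse_frame(t));
    }
    stream.push_str("data: [DONE]\n\n");

    println!("emitted {} SSE frames", stream.matches("data: ").count());
    // → emitted 4 SSE frames
}
```

With axum the handler would return an SSE response built from a stream of events fed by the generation loop, so tokens reach the client as they are produced instead of after the full completion.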
Add features to README
Labels: good first issue
#84 opened Sep 26, 2024 by jorgeantonio21
Integrate the backend service with the OpenAI API
Labels: openai-api, vLLM
#77 opened Sep 25, 2024 by jorgeantonio21
Address the seqlenq_ngroups_swapped option in KV-cached flash attention
#37 opened Aug 5, 2024 by jorgeantonio21
Check determinism of our implementation
Labels: candle, cuda, determinism, llama, paged-attention
#25 opened Jul 24, 2024 by jorgeantonio21
Add multi-GPU support
Labels: candle, cuda, flash-attention2, llama, paged-attention
#24 opened Jul 24, 2024 by jorgeantonio21 (2 tasks)
Code clean-up
Labels: cuda, flash-attention2, good first issue, paged-attention
#22 opened Jul 24, 2024 by jorgeantonio21