Issues: atoma-network/atoma-infer
Sending logprobs in the chat completions body should be optional
Labels: good first issue, openai-api, vLLM
#111 opened Oct 8, 2024 by jorgeantonio21
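A minimal sketch of what making logprobs optional could look like: the request struct models the field as an Option, so an omitted field deserializes to None and the engine can skip log-probability computation. The type and field names here are assumptions for illustration, not the actual atoma-infer definitions.

```rust
// Hypothetical request type: `logprobs` is optional, so clients that
// omit it pay no log-probability cost. Names are illustrative only.
#[derive(Debug, Default)]
struct ChatCompletionRequest {
    model: String,
    // `None` means the client omitted the field entirely.
    logprobs: Option<bool>,
    top_logprobs: Option<u32>,
}

// Only compute logprobs when the client explicitly asked for them.
fn wants_logprobs(req: &ChatCompletionRequest) -> bool {
    req.logprobs.unwrap_or(false)
}

fn main() {
    // A request that never mentions logprobs.
    let req = ChatCompletionRequest {
        model: "llama".to_string(),
        ..Default::default()
    };
    println!("logprobs requested: {}", wants_logprobs(&req));
    // → logprobs requested: false
}
```

With a serde-based handler the same effect would come from defaulting the field, so existing clients that never send logprobs keep working unchanged.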
Introduce multi-model hosting for the axum service
Labels: enhancement, good first issue, models, openai-api
#100 opened Oct 3, 2024 by jorgeantonio21
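One common shape for multi-model hosting is a registry keyed by model id that the HTTP layer consults to route each request. The sketch below, using only the standard library, shows that routing core; the ModelHandle and Registry names are assumptions, not types from the codebase, and a real axum service would hold the registry in shared state.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a loaded model (weights, tokenizer, etc.).
struct ModelHandle {
    name: String,
}

// Registry mapping the model id from the request body to its handle.
struct Registry {
    models: HashMap<String, ModelHandle>,
}

impl Registry {
    fn new() -> Self {
        Registry { models: HashMap::new() }
    }

    fn register(&mut self, id: &str) {
        self.models
            .insert(id.to_string(), ModelHandle { name: id.to_string() });
    }

    // Returns `None` for unknown models so the handler can answer 404.
    fn route(&self, id: &str) -> Option<&ModelHandle> {
        self.models.get(id)
    }
}

fn main() {
    let mut reg = Registry::new();
    reg.register("llama3-8b");
    reg.register("mistral-7b");

    match reg.route("llama3-8b") {
        Some(h) => println!("routed to {}", h.name),
        None => println!("model not hosted"),
    }
    // → routed to llama3-8b
}
```

In an axum service this registry would typically live in an Arc inside the router state, with each chat-completions handler looking up the model named in the request body.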
Add proper shutdown for LlmEngine
Labels: enhancement, good first issue, openai-api, vLLM
#98 opened Oct 2, 2024 by jorgeantonio21
Add proper shutdown for TokenizerWorker
Labels: enhancement, good first issue, openai-api, vLLM
#97 opened Oct 2, 2024 by jorgeantonio21
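Both shutdown issues (#98 for the LlmEngine, #97 for the TokenizerWorker) amount to the same pattern: a worker loop that drains its request channel and exits cleanly on an explicit shutdown signal instead of being dropped mid-request. The std-only sketch below shows that pattern; the Msg enum and counter are illustrative, not the project's actual message types, and the real workers are async tokio tasks rather than threads.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical worker protocol: regular work plus a shutdown signal.
enum Msg {
    Tokenize(String),
    Shutdown,
}

fn main() {
    let (tx, rx) = mpsc::channel::<Msg>();

    // Worker loop: processes messages until it sees `Shutdown`,
    // giving it a chance to flush state before exiting.
    let worker = thread::spawn(move || {
        let mut processed = 0u32;
        for msg in rx {
            match msg {
                Msg::Tokenize(_text) => processed += 1,
                Msg::Shutdown => break, // flush/drain here, then exit
            }
        }
        processed
    });

    tx.send(Msg::Tokenize("hello world".into())).unwrap();
    tx.send(Msg::Shutdown).unwrap();

    // `join` guarantees the worker finished its cleanup path.
    let processed = worker.join().unwrap();
    println!("processed {} requests before shutdown", processed);
    // → processed 1 requests before shutdown
}
```

The same structure maps onto tokio: a select! over the request channel and a shutdown signal (or a CancellationToken), with the task awaited during server teardown.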
Add support for streaming tokens and compatibility with axum server (using SSE)
Labels: openai-api, vLLM
#85 opened Sep 26, 2024 by jorgeantonio21
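Server-sent events are just a framing convention over a long-lived HTTP response: each event is a "data: ..." line followed by a blank line, and OpenAI-style streaming ends the stream with a "data: [DONE]" sentinel. The sketch below shows that wire format per generated token; the JSON payload shape is an assumption for illustration, and in the real service axum's SSE response type would do this framing over an async token stream.

```rust
// Hypothetical per-token SSE frame. In practice the payload would be
// a full chat.completion.chunk JSON object, not this minimal shape.
fn sse_frame(token: &str) -> String {
    format!("data: {{\"token\":\"{}\"}}\n\n", token)
}

fn main() {
    let tokens = ["Hello", ",", " world"];

    // Concatenate one frame per token, then the end-of-stream sentinel.
    let mut stream = String::new();
    for t in tokens {
        stream.push_str(&sse_frame(t));
    }
    stream.push_str("data: [DONE]\n\n");

    println!("emitted {} SSE frames", stream.matches("data: ").count());
    // → emitted 4 SSE frames
}
```

With axum the handler would return an SSE response built from a stream of events fed by the generation loop, so tokens reach the client as they are produced instead of after the full completion.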
Add features to README
Labels: good first issue
#84 opened Sep 26, 2024 by jorgeantonio21
Integrate the backend service with the OpenAI API
Labels: openai-api, vLLM
#77 opened Sep 25, 2024 by jorgeantonio21
Address the seqlenq_ngroups_swapped option in KV-cached flash attention
#37 opened Aug 5, 2024 by jorgeantonio21
Check determinism of our implementation
Labels: candle, cuda, determinism, llama, paged-attention
#25 opened Jul 24, 2024 by jorgeantonio21
Add multi-GPU support
Labels: candle, cuda, flash-attention2, llama, paged-attention
#24 opened Jul 24, 2024 by jorgeantonio21 (2 tasks)
Code clean-up
Labels: cuda, flash-attention2, good first issue, paged-attention
#22 opened Jul 24, 2024 by jorgeantonio21