
bug: Concurrent chat doesn't work on Mac Silicon #1569

Closed
2 of 6 tasks
gabrielle-ong opened this issue Oct 29, 2024 · 2 comments
Assignees
Labels
category: model running Inference ux, handling context/parameters, runtime type: bug Something isn't working
Milestone

Comments

@gabrielle-ong
Contributor

Cortex version

1.0.1-203

Describe the Bug

Mac: Concurrent chats for the same model are queued rather than run in parallel

  • Models tested: tinyllama, llama3.2
  • I expect to open 2 CLI windows / a Postman window and have concurrent chats
  • Works well with separate models (e.g., a tinyllama chat and a llama3.2 chat)

May be related to the n_parallel parameter in model.yaml

Windows, Ubuntu: Working as expected
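For reference, a sketch of the model.yaml fields that plausibly govern this behavior. The field names n_parallel and cont_batching follow the cortex.llamacpp model template; the exact values and defaults here are assumptions, not the reporter's configuration:

```yaml
# model.yaml (excerpt) -- illustrative values only
name: tinyllama
engine: cortex.llamacpp
ctx_len: 4096        # total context length, shared across decoding slots
n_parallel: 1        # number of parallel decoding slots; 1 serializes requests
cont_batching: true  # continuous batching, needed for concurrent requests
```

If n_parallel defaults to 1 on the Mac build, two simultaneous chats against the same model would be processed one after the other, which matches the queued behavior described above.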

Steps to Reproduce

No response

Screenshots / Logs

No response

What is your OS?

  • MacOS
  • Windows
  • Linux

What engine are you running?

  • cortex.llamacpp (default)
  • cortex.tensorrt-llm (Nvidia GPUs)
  • cortex.onnx (NPUs, DirectML)
@gabrielle-ong gabrielle-ong added the type: bug Something isn't working label Oct 29, 2024
@github-project-automation github-project-automation bot moved this to Investigating in Menlo Oct 29, 2024
@gabrielle-ong gabrielle-ong added the category: model running Inference ux, handling context/parameters, runtime label Oct 29, 2024
@vansangpfiev vansangpfiev moved this from Investigating to Review + QA in Menlo Oct 30, 2024
@gabrielle-ong
Contributor Author

@vansangpfiev do I need to change anything for this to work?
I redownloaded the models, but chats are still non-concurrent on my local machine and the VM test-macos-13-1,
i.e., the right chat finishes, and only then does the left chat begin.
[Screenshot attached]

@gabrielle-ong gabrielle-ong added this to the v1.0.2 milestone Nov 5, 2024
@gabrielle-ong gabrielle-ong moved this from Review + QA to Completed in Menlo Nov 5, 2024
@gabrielle-ong
Contributor Author

Works with n_parallel = 2, marking as complete
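A sketch of how the fix can be verified from the command line. The port 39281 and the OpenAI-compatible /v1/chat/completions endpoint are assumptions about the local cortex server setup, not details from this thread; adjust both to your installation. The script requires a running server, so timings are only meaningful there:

```shell
#!/bin/sh
# Sketch: fire two chat requests at once and check that they overlap.
# URL and port are assumptions about a local cortex server; adjust as needed.
URL="http://127.0.0.1:39281/v1/chat/completions"
BODY='{"model":"tinyllama","messages":[{"role":"user","content":"Count to 50."}]}'

start=$(date +%s)
# Launch both requests in the background, then wait for both to finish.
curl -s -X POST "$URL" -H "Content-Type: application/json" -d "$BODY" > /tmp/chat1.json &
curl -s -X POST "$URL" -H "Content-Type: application/json" -d "$BODY" > /tmp/chat2.json &
wait
end=$(date +%s)

# With n_parallel >= 2 the total wall time should be close to a single
# request's time; with n_parallel = 1 it is roughly doubled.
echo "total seconds: $((end - start))"
```

The same check can be done informally, as in the report above, by opening two CLI windows and watching whether both chats stream at once.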

@github-project-automation github-project-automation bot moved this from Completed to Review + QA in Menlo Nov 5, 2024
@gabrielle-ong gabrielle-ong moved this from Review + QA to Completed in Menlo Nov 5, 2024
Projects
Archived in project
Development

No branches or pull requests

2 participants