llama : add pipeline parallelism support #6017

Merged on Mar 13, 2024 (23 commits)

Commits on Mar 12, 2024

  1. llama : add pipeline parallelism support for batch processing with multiple CUDA GPUs
    
    ggml-ci
    slaren committed Mar 12, 2024
    822121f
  2. 1ac668e
  3. fix server embedding test

    slaren committed Mar 12, 2024
    4ddccc2
  4. llama : fix Mamba inference for pipeline parallelism

    Tested to work correctly with both `main` and `parallel` examples.
    compilade committed Mar 12, 2024
    937966d
  5. 00a415d
  6. add LLAMA_SCHED_MAX_COPIES to configure the number of input copies for pipeline parallelism
    
    default increase to 4 (from 2)
    
    changing this value may improve performance for some systems, but increases memory usage
    slaren committed Mar 12, 2024
    89bfa1f
  7. fix hip build

    slaren committed Mar 12, 2024
    aa1e2f8
  8. deb3e24
  9. ead5c8b
  10. fix hip build

    slaren committed Mar 12, 2024
    255c1ec

Commits on Mar 13, 2024

  1. 4400153
  2. llama : fix norm backend

    slaren committed Mar 13, 2024
    9e7cecc
  3. b25a0f1
  4. swiftui : sync after decode

    ggerganov committed Mar 13, 2024
    529e749
  5. 54cdd47
  6. cda49d3
  7. 015e1bf
  8. 0d934ee
  9. 3c38789
  10. 9092883
  11. fix merge

    slaren committed Mar 13, 2024
    cb580a6
  12. small fix

    slaren committed Mar 13, 2024
    1f56481
  13. reduce default n_batch to 2048

    slaren committed Mar 13, 2024
    976176d