This repository has been archived by the owner on Aug 30, 2024. It is now read-only.

[Neural Speed] Improvements to run.py script #87

Merged: 7 commits, Feb 21, 2024

@aahouzi (Member) commented Jan 23, 2024

Type of Change

  • Handles models that require a Hugging Face access token (llama, llama2, etc.)

Description

  • Same as above
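
For readers without the diff at hand, here is a minimal sketch of how a run script might handle models that need a Hugging Face access token. It is not the code merged in this PR. The `--token` flag, the helper names, and the fallback to the `HF_TOKEN` environment variable are all assumptions made for illustration; the only library call used is `huggingface_hub.login(token=...)`.

```python
import argparse
import os
from typing import Optional, Sequence


def resolve_hf_token(cli_token: Optional[str] = None) -> Optional[str]:
    """Return the first non-empty token: the CLI value, then HF_TOKEN (assumed fallback)."""
    for candidate in (cli_token, os.environ.get("HF_TOKEN")):
        if candidate and candidate.strip():
            return candidate.strip()
    return None


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse a hypothetical command line; the real script's options may differ."""
    parser = argparse.ArgumentParser(description="Hypothetical run script")
    parser.add_argument("model", help="Hub model ID or local path")
    # --token is a hypothetical flag name, not confirmed by this PR.
    parser.add_argument("--token", default=None, help="Hugging Face access token")
    return parser.parse_args(argv)


def login_if_needed(token: Optional[str]) -> bool:
    """Log in to the Hub when a token is available; return True if a login was attempted."""
    if token is None:
        return False
    # Imported lazily so that the no-token path does not need huggingface_hub.
    from huggingface_hub import login

    login(token=token)
    return True


if __name__ == "__main__":
    args = parse_args()
    login_if_needed(resolve_hf_token(args.token))
```

With this layout, models that need no token take the same code path and simply skip the login step, so commands for them stay unchanged.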

Expected Behavior & Potential Risk

  • N/A

How has this PR been tested?

  • Tested with the commands provided in the README file on several models that require an access token: llama2-7b, llama2-13b, and llama2-70b. The script runs to completion as expected.

Dependency Change?

  • The huggingface_hub package, but I guess this is already a dependency of transformers?

@kevinintel requested review from Zhenzhong1 and zhenwei-intel and removed request for Zhenzhong1 January 26, 2024 10:00
scripts/inference.py (outdated review thread, resolved)
@aahouzi requested a review from Zhenzhong1 January 29, 2024 10:24
@aahouzi closed this Jan 31, 2024
@aahouzi reopened this Jan 31, 2024
@VincyZhang merged commit 33ffaf0 into intel:main Feb 21, 2024
5 checks passed