Hi, how can I measure tokens per second, e.g. the "prefill: xx tok/s decode: xx tok/s" figures that MLCChat displays? #231

Open
XTaoWang opened this issue Nov 27, 2024 · 3 comments

Comments

@XTaoWang

No description provided.

@wangzhaode
Owner

If you run the test from the command line, the speed is printed directly.

@XTaoWang
Author

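How do I run the test? Different phones have different hardware and parameters, so I assume the "prefill: xx tok/s decode: xx tok/s" values will differ from device to device? Any guidance would be appreciated.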

@wangzhaode
Owner

adb shell "cd /data/local/tmp && export LD_LIBRARY_PATH=. && ./cli_demo ./Qwen2-1.5B-Instruct-MNN/config.json"

This is covered in the README: run the demo directly through adb and it prints the performance figures.
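
For reference, the two figures asked about are conventionally defined as tokens divided by elapsed time for each phase: prefill covers the prompt tokens processed before the first output token, and decode covers the tokens generated afterwards. The Python sketch below only illustrates that arithmetic. The function name and the sample numbers are hypothetical, and it is an assumption (not confirmed in this thread) that cli_demo computes its printed values in exactly this way.

```python
def tokens_per_second(num_tokens: int, elapsed_seconds: float) -> float:
    """Throughput as tokens processed divided by wall-clock seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_tokens / elapsed_seconds


# Hypothetical numbers for illustration only; these are not measured results.
prefill_speed = tokens_per_second(num_tokens=64, elapsed_seconds=0.5)   # prompt tokens / prefill time
decode_speed = tokens_per_second(num_tokens=100, elapsed_seconds=4.0)   # generated tokens / decode time

print(f"prefill: {prefill_speed:.2f} tok/s  decode: {decode_speed:.2f} tok/s")
```

Because both values depend on the device's hardware, the model, and the runtime settings, they are expected to differ from phone to phone, so each device needs its own run.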
