[Question] While waiting for the model's response on an Android phone, performing other operations may cause the phone to become unresponsive or reboot. #3131

Open
yangshgetui opened this issue Feb 13, 2025 · 2 comments
Labels
question Question about the usage

@yangshgetui

❓ General Questions

While the model is generating a response on an Android phone, doing anything else on the device can make the phone unresponsive or cause it to reboot.
For example, this happens when I try to return to the home screen.

I suspect this is caused by insufficient GPU resources on the device. Switching to CPU-only execution makes the app crash with the following error:

2025-03-04 15:03:37.647 19380-19447/ai.mlc.mlcchat E/AndroidRuntime: FATAL EXCEPTION: Thread-5
Process: ai.mlc.mlcchat, PID: 19380
org.apache.tvm.Base$TVMError: TVMError: Assert fail: T.tvm_struct_get(p_model_embed_tokens_q_weight, 0, 10, "int32") == 4, Argument qwen2_q4f16_1_e396fd42f6a997ca798eafc3bf56647f_fused_dequantize_take1.p_model_embed_tokens_q_weight.device_type has an unsatisfied constraint: 4 == T.tvm_struct_get(p_model_embed_tokens_q_weight, 0, 10, "int32")

    at org.apache.tvm.Base.checkCall(Base.java:173)
    at org.apache.tvm.Function.invoke(Function.java:130)
    at ai.mlc.mlcllm.JSONFFIEngine.runBackgroundLoop(JSONFFIEngine.java:65)
    at ai.mlc.mlcllm.MLCEngine$backgroundWorker$1.invoke(MLCEngine.kt:42)
    at ai.mlc.mlcllm.MLCEngine$backgroundWorker$1.invoke(MLCEngine.kt:40)
    at ai.mlc.mlcllm.BackgroundWorker$start$1.invoke(MLCEngine.kt:19)
    at ai.mlc.mlcllm.BackgroundWorker$start$1.invoke(MLCEngine.kt:18)
    at kotlin.concurrent.ThreadsKt$thread$thread$1.run(Thread.kt:30)
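
One plausible reading of the assertion above, offered as a sketch rather than a confirmed diagnosis: the constant `4` is the value of `kDLOpenCL` in DLPack's `DLDeviceType` enum, which suggests the compiled model library was built for OpenCL and rejected a weight tensor placed on a different device, such as the CPU. The helper below only decodes the message text. The name `expected_device` and the regular expression are illustrative assumptions, not part of MLC LLM or TVM.

```python
import re

# Subset of DLPack's DLDeviceType enum (values as defined in dlpack.h).
DL_DEVICE_TYPES = {
    1: "kDLCPU",
    2: "kDLCUDA",
    4: "kDLOpenCL",
    7: "kDLVulkan",
    8: "kDLMetal",
    10: "kDLROCM",
}

# Matches "Argument <name>.device_type has an unsatisfied constraint: <N> ==".
_PATTERN = re.compile(
    r"Argument (?P<arg>\S+)\.device_type has an unsatisfied constraint: (?P<code>\d+) =="
)


def expected_device(message):
    """Return (argument name, expected device name), or None if nothing matches."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    code = int(match.group("code"))
    return match.group("arg"), DL_DEVICE_TYPES.get(code, f"unknown({code})")


# The error text from the crash log in this issue.
log_line = (
    'TVMError: Assert fail: T.tvm_struct_get(p_model_embed_tokens_q_weight, 0, 10, "int32") == 4, '
    "Argument qwen2_q4f16_1_e396fd42f6a997ca798eafc3bf56647f_fused_dequantize_take1"
    ".p_model_embed_tokens_q_weight.device_type has an unsatisfied constraint: "
    '4 == T.tvm_struct_get(p_model_embed_tokens_q_weight, 0, 10, "int32")'
)

print(expected_device(log_line))
```

If this reading is right, CPU-only execution would likely need a model library compiled for a CPU target instead of reusing the OpenCL build.
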
yangshgetui added the question label on Feb 13, 2025
@null-define

Some devices have system-level mechanisms that prevent the device from freezing completely.

Because MLC occupies the GPU for an extended period, the device's UI rendering becomes completely unresponsive.

On some devices (a Qualcomm Automotive board I tested), this can cause SystemUI to restart or forcibly interrupt the running OpenCL kernel.

In other cases (a new-generation Qualcomm phone I tested), the system may only try to kill the application.

So this may be a device-specific issue. You could try a different phone.

@Mawriyo

Mawriyo commented Feb 21, 2025

I've experienced this as well!

#2894

I would love for this to be clarified, but my workaround was to compile and use models that have _0 in their name.

Mawriyo mentioned this issue on Feb 21, 2025