Loongarch64: Small matrix opt #4711

Closed


XiWeiGu
Contributor

@XiWeiGu XiWeiGu commented May 21, 2024

Add small matrix multiplication optimization

@XiWeiGu XiWeiGu changed the title from "LoongArch: Small matrix opt" to "Loongarch64: Small matrix opt" May 21, 2024
@XiWeiGu XiWeiGu closed this May 21, 2024

codspeed-hq bot commented May 21, 2024

CodSpeed Performance Report

Merging #4711 will not alter performance

Comparing XiWeiGu:loongarch64_small_matrix (ff1e7eb) with develop (700ea74)

Summary

✅ 16 untouched benchmarks

@martin-frbg
Collaborator

I guess you could check the transa, transb settings in the permit function while you have "only" the NN kernel ready. (And don't let the somewhat experimental codspeed report discourage you: it only runs a select few tests on x86_64.)
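For illustration, a minimal sketch of what such a check might look like, assuming a permit-style function that returns nonzero when the small-matrix path may be used. The function name, the `blaslong_t` typedef, the convention that 0 means "not transposed", and the size limit of 64 are all hypothetical and are not taken from this pull request:

```c
#include <assert.h>

/* Hypothetical stand-in for the library's integer type. */
typedef long blaslong_t;

/* Hypothetical size cutoff; a real value would come from benchmarking. */
#define SMALL_DIM_LIMIT 64

/*
 * Sketch of a permit function for the case where only the NN kernel
 * exists. Assumed convention: transa/transb == 0 means "not transposed"
 * (N), any other value means transposed (T).
 * Returns 1 if the small-matrix kernel may be used, 0 to fall back to
 * the regular GEMM path.
 */
int small_matrix_permit_nn_only(int transa, int transb,
                                blaslong_t m, blaslong_t n, blaslong_t k)
{
    /* No NT, TN, or TT kernels yet: reject anything that is not NN. */
    if (transa != 0 || transb != 0)
        return 0;

    /* Reject empty or invalid problem sizes. */
    if (m <= 0 || n <= 0 || k <= 0)
        return 0;

    /* Accept only problems where every dimension is small. */
    return (m <= SMALL_DIM_LIMIT && n <= SMALL_DIM_LIMIT && k <= SMALL_DIM_LIMIT);
}

/* Usage example: NN is permitted, any transposed case is rejected. */
void small_matrix_permit_demo(void)
{
    assert(small_matrix_permit_nn_only(0, 0, 8, 8, 8) == 1);
    assert(small_matrix_permit_nn_only(1, 0, 8, 8, 8) == 0);
}
```

With a guard like this, the TN, TT, and NT cases keep using the existing code path until their own kernels are added, at which point the transpose check can be relaxed.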

@XiWeiGu
Contributor Author

XiWeiGu commented May 24, 2024

Thank you very much for your suggestion. The current progress is that I am still adding assembly optimizations for the TN, TT, and NT cases.


2 participants