[mlas] add loongarch lsx and lasx optimize code #17937

junchao-loongson · 2023-10-13T10:02:32Z

Description

Hello, we (together with @lixing-star) are developers from the Loongson team.

This PR adds 128-bit (LSX) and 256-bit (LASX) vector optimization code for the LoongArch architecture.

100% tests passed, 0 tests failed out of 7

Development Environment

CPU: 
    Loongson-3C5000L
uname -a:  
    Linux localhost.localdomain 4.19.190-6.4.lns8.loongarch64 #1 SMP Thu Jul 14 12:08:04 CST 2022 loongarch64 loongarch64 loongarch64 GNU/Linux

LoongArch Documents

junchao-loongson · 2023-10-16T00:57:19Z

@microsoft-github-policy-service agree company="Loongson Technology Corporation Limited"

junchao-loongson · 2023-10-16T00:58:53Z

@microsoft-github-policy-service agree company="Loongson Technology Corporation Limited"

junchao-loongson · 2023-10-17T08:38:34Z

@snnn Hello,
How do we trigger the CI tests?
We can provide LoongArch machines for testing purposes.

snnn · 2023-10-17T15:48:39Z

@faxu, please help review.

snnn · 2023-10-17T16:02:36Z

Is it possible to build the code and run the tests in QEMU, like https://wiki.debian.org/LoongArch/sbuildQEMU ?
Is there a way to get a cross-compiler for this arch? Like for ARM we can get one from https://www.linaro.org/downloads/.

junchao-loongson · 2023-10-18T09:51:03Z

Hello,
Here is some information about QEMU and cross-compiling.

qemu

git clone https://gitlab.com/qemu-project/qemu.git
cd qemu
./configure --target-list=loongarch64-linux-user
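The configure step above only selects the user-mode LoongArch target. A minimal sketch of how a cross-built binary could then be run, assuming a standard QEMU source build (which normally places the emulator at build/qemu-loongarch64 after running make). The SYSROOT path and the test binary name are placeholders, not values taken from this PR:

```shell
#!/bin/sh
# Sketch only: SYSROOT and the binary path below are placeholders.
QEMU=${QEMU:-./build/qemu-loongarch64}
SYSROOT=${SYSROOT:-/path/to/loongarch64-sysroot}

# Print the command that would run a LoongArch binary under user-mode QEMU.
# -L tells qemu-user where to look for the target's dynamic loader and libraries.
qemu_run_cmd() {
    printf '%s -L %s %s\n' "$QEMU" "$SYSROOT" "$1"
}

# Example: show the command for a hypothetical cross-built test binary.
qemu_run_cmd ./onnxruntime_mlas_test
```

Printing the command instead of executing it keeps the sketch safe to run on a machine that has neither QEMU nor the sysroot installed.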

download cross-compile tool

 wget https://mirrors.wsyu.edu.cn/fedora/linux/Yongbao/cross-toolchain/x86_64-cross-tools-loongarch64-gcc-libc.tar.xz

cfg

tool.cmake

 SET(CMAKE_SYSTEM_NAME Linux)
 SET(CMAKE_SYSTEM_VERSION 1)
 SET(CMAKE_C_COMPILER loongarch64-unknown-linux-gnu-gcc)
 SET(CMAKE_CXX_COMPILER loongarch64-unknown-linux-gnu-g++)
 SET(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
 SET(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
 SET(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
 SET(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE ONLY)

wget https://github.com/protocolbuffers/protobuf/releases/download/v3.18.1/protoc-3.18.1-linux-x86_64.zip
unzip protoc-3.18.1-linux-x86_64.zip -d protoc

cmake -DONNX_CUSTOM_PROTOC_EXECUTABLE=`pwd`/../protoc/bin/protoc  -DCMAKE_TOOLCHAIN_FILE=`pwd`/../tool.cmake ../cmake
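The toolchain file above sets CMAKE_SYSTEM_NAME but not CMAKE_SYSTEM_PROCESSOR, which CMake does not detect on its own when cross-compiling. A sketch of a variant that declares it, under the assumption (not confirmed in this thread) that the MLAS build selects its LoongArch-specific flags by matching that variable against loongarch64. The output file name is arbitrary:

```shell
#!/bin/sh
# Write a cross-compile toolchain file that also names the target processor.
# Assumption: the build keys -mlsx/-mlasx off CMAKE_SYSTEM_PROCESSOR.
write_toolchain() {
    cat > "$1" <<'EOF'
set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_VERSION 1)
set(CMAKE_SYSTEM_PROCESSOR loongarch64)
set(CMAKE_C_COMPILER loongarch64-unknown-linux-gnu-gcc)
set(CMAKE_CXX_COMPILER loongarch64-unknown-linux-gnu-g++)
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE ONLY)
EOF
}

write_toolchain tool-loongarch64.cmake
```

The generated file would be passed through -DCMAKE_TOOLCHAIN_FILE in the same way as tool.cmake.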

issues

When I run make, I get the following error message.
I don't get this error when I build natively using build.sh.

[ 18%] Building CXX object CMakeFiles/onnxruntime_mlas.dir/home/yala/work/plugins/onnxruntime/onnxruntime/core/mlas/lib/threading.cpp.o
In file included from /home/yala/work/plugins/onnxruntime/onnxruntime/core/mlas/lib/threading.cpp:17:
/home/yala/work/plugins/onnxruntime/onnxruntime/core/mlas/lib/mlasi.h:1277:9: error: ‘__m128’ does not name a type; did you mean ‘__int128’?
 1277 | typedef __m128 MLAS_FLOAT32X4;
      |         ^~~~~~
      |         __int128
/home/yala/work/plugins/onnxruntime/onnxruntime/core/mlas/lib/mlasi.h:1278:9: error: ‘__m128i’ does not name a type
 1278 | typedef __m128i MLAS_INT32X4;
      |         ^~~~~~~
/home/yala/work/plugins/onnxruntime/onnxruntime/core/mlas/lib/mlasi.h:1285:1: error: ‘MLAS_INT32X4’ does not name a type
 1285 | MLAS_INT32X4
      | ^~~~~~~~~~~~
/home/yala/work/plugins/onnxruntime/onnxruntime/core/mlas/lib/mlasi.h:1300:1: error: ‘MLAS_INT32X4’ does not name a type
 1300 | MLAS_INT32X4
      | ^~~~~~~~~~~~

I realized that the -mlsx (LoongArch SIMD) compiler flag is missing when compiling platform.cpp.
I'm not sure whether my cmake command-line parameters are correct.
Perhaps CMAKE_SYSTEM_PROCESSOR is not set correctly.

/home/yala/work/plugins/la-cross-tools/bin/loongarch64-unknown-linux-gnu-g++ -DCPUINFO_SUPPORTED_PLATFORM=0 -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DNSYNC_ATOMIC_CPP11 -DORT_ENABLE_STREAM -DORT_NO_RTTI -DPLATFORM_POSIX -D_GNU_SOURCE -I/home/yala/work/plugins/onnxruntime/build-cross/_deps/utf8_range-src -I/home/yala/work/plugins/onnxruntime/include/onnxruntime -I/home/yala/work/plugins/onnxruntime/include/onnxruntime/core/session -I/home/yala/work/plugins/onnxruntime/build-cross/_deps/pytorch_cpuinfo-src/include -I/home/yala/work/plugins/onnxruntime/build-cross/_deps/google_nsync-src/public -I/home/yala/work/plugins/onnxruntime/build-cross -I/home/yala/work/plugins/onnxruntime/onnxruntime -I/home/yala/work/plugins/onnxruntime/build-cross/_deps/abseil_cpp-src -I/home/yala/work/plugins/onnxruntime/onnxruntime/core/mlas/inc -I/home/yala/work/plugins/onnxruntime/onnxruntime/core/mlas/lib -I/home/yala/work/plugins/onnxruntime/build-cross/_deps/gsl-src/include -ffunction-sections -fdata-sections -Wno-restrict -DCPUINFO_SUPPORTED -fPIC -fno-rtti -Wall -Wextra -Wno-deprecated-copy -Wno-nonnull-compare -Werror -MD -MT CMakeFiles/onnxruntime_mlas.dir/home/yala/work/plugins/onnxruntime/onnxruntime/core/mlas/lib/platform.cpp.o -MF CMakeFiles/onnxruntime_mlas.dir/home/yala/work/plugins/onnxruntime/onnxruntime/core/mlas/lib/platform.cpp.o.d -o CMakeFiles/onnxruntime_mlas.dir/home/yala/work/plugins/onnxruntime/onnxruntime/core/mlas/lib/platform.cpp.o -c /home/yala/work/plugins/onnxruntime/onnxruntime/core/mlas/lib/platform.cpp
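To see which translation units lack the SIMD flags, verbose compile commands like the one above (for example, as printed by make VERBOSE=1 with the Makefile generator) can be checked mechanically. A small sketch; the helper name has_flag and the shortened command string are illustrative only:

```shell
#!/bin/sh
# Return 0 if the compile command in $1 contains the exact flag $2
# as a whole space-separated word, 1 otherwise.
has_flag() {
    case " $1 " in
        *" $2 "*) return 0 ;;
    esac
    return 1
}

# Hypothetical, shortened command line standing in for the real one.
cmd='loongarch64-unknown-linux-gnu-g++ -fPIC -Wall -Werror -c platform.cpp'

for flag in -mlsx -mlasx; do
    if has_flag "$cmd" "$flag"; then
        echo "$flag present"
    else
        echo "$flag missing"
    fi
done
```

Padding both strings with spaces makes the match word-exact, so a flag such as -mlsx is not mistaken for a longer option that merely starts with the same characters.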

snnn · 2023-10-18T16:40:01Z

Nice! Would you please confirm that the cross-compile tool you showed me is publicly available? Is it an official package from the Fedora project?

junchao-loongson · 2023-10-19T01:26:37Z

The cross-compile tool's source code comes from the GCC repository (commit id cead92b7fc4d7a545dcf2f02397120e3c9afe1a3).
I just compiled it to a binary and provided this temporary link.

junchao-loongson · 2023-10-19T03:18:03Z

We've put this toolchain into Loongson's official repository:

https://github.com/loongson/build-tools/releases/download/2023.08.08/x86_64-cross-tools-loongarch64-gcc-libc.tar.xz

snnn · 2023-10-19T04:54:01Z

I will go ahead and merge this PR.

snnn · 2023-10-19T04:59:52Z

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, Linux QNN CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows ARM64 QNN CI Pipeline, Windows CPU CI Pipeline

snnn · 2023-10-19T05:00:02Z

/azp run Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows x64 QNN CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed

azure-pipelines · 2023-10-19T05:00:43Z

Azure Pipelines successfully started running 10 pipeline(s).

azure-pipelines · 2023-10-19T05:02:07Z

Azure Pipelines successfully started running 7 pipeline(s).

snnn · 2023-10-19T15:54:39Z

Build error:
onnxruntime/core/mlas/lib/mlasi.h:1800:48: error: ‘MALS_INT32X4’ was not declared in this scope; did you mean ‘MLAS_INT32X4’?

junchao-loongson · 2023-10-20T03:36:26Z

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, Linux QNN CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows ARM64 QNN CI Pipeline, Windows CPU CI Pipeline

azure-pipelines · 2023-10-20T03:36:31Z

Commenter does not have sufficient privileges for PR 17937 in repo microsoft/onnxruntime

junchao-loongson · 2023-10-20T06:32:56Z

We fixed this issue. We also fixed some issues caused by clang-format.

snnn · 2023-10-20T16:28:35Z

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, Linux QNN CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows ARM64 QNN CI Pipeline, Windows CPU CI Pipeline

snnn · 2023-10-20T16:28:43Z

/azp run Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows x64 QNN CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed

azure-pipelines · 2023-10-20T16:29:15Z

Azure Pipelines successfully started running 7 pipeline(s).

azure-pipelines · 2023-10-20T16:29:21Z

Azure Pipelines successfully started running 10 pipeline(s).

snnn · 2023-10-23T22:47:10Z

/azp run Linux CPU CI Pipeline

azure-pipelines · 2023-10-23T22:47:22Z

Azure Pipelines successfully started running 1 pipeline(s).

junchao-loongson · 2023-11-06T02:48:33Z

Hello,
What else do we need to do to get this patch merged?

snnn · 2023-11-07T00:01:47Z

Nothing else, thanks. I've sent this PR to @yufenglee for review.

lixing-star · 2023-12-07T02:55:07Z

@snnn, how is the code review progressing? Thanks.

snnn · 2023-12-07T19:16:33Z

I will set up a CI build pipeline for this.

lixing-star · 2023-12-08T01:02:01Z

Thanks. We will also rebuild against the main branch to check our code.

[mlas] add loongarch lsx and lasx optimize code

c8ef1d2

junchao-loongson requested a review from a team as a code owner October 13, 2023 10:02

snnn assigned faxu Oct 17, 2023

revert clang-format

49d3874

MQ-mengqing mentioned this pull request Oct 20, 2023

Content suggestion for This Week in LoongArch newsletter / 《每周一龙》新闻线索信箱 loongson-community/areweloongyet#16

Open

snnn approved these changes Oct 25, 2023

yufenglee approved these changes Dec 7, 2023

snnn merged commit 4abec97 into microsoft:main Dec 7, 2023
57 of 60 checks passed
