Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BLAS] add interfaces to matrix copy/transposition routines #227

Merged

Conversation

andrewtbarker
Copy link
Contributor

Description

This PR adds interfaces for batch strided matrix copy/transposition routines omatcopy, imatcopy, and omatadd.

Currently the non-batch routines are not fully document/implemented in the oneMKL product, when they are we will add interfaces to them as well. At that point we will also implement the cublas and rocblas backends via their geam interfaces where appropriate.

Checklist

All Submissions

New interfaces

  • Have you provided motivation for adding a new feature as part of RFC and
    it was accepted? #420
  • What version of oneAPI spec the interface is targeted? (1.2 provisional rev 1)

New features

  • Have you provided motivation for adding a new feature?
  • Have you added relevant tests? Added tests are "smoke tests" to test for compilation, linking, and execution, but do not test correctness because there is no reference code for the added functionality.

test-cublas.txt
test-mklbackend.txt

Copy link
Contributor

@mmeterel mmeterel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! It looks good overall. I have two comments:

  1. Could you please fix the layout namespaces in couple places?
  2. We need to write our own reference implementation for new APIs, so that we can actually check correctness.

src/blas/backends/cublas/cublas_wrappers.cpp Outdated Show resolved Hide resolved
src/blas/backends/rocblas/rocblas_wrappers.cpp Outdated Show resolved Hide resolved
tests/unit_tests/blas/batch/imatcopy_batch_stride.cpp Outdated Show resolved Hide resolved
tests/unit_tests/blas/batch/imatcopy_batch_stride_usm.cpp Outdated Show resolved Hide resolved
@andrewtbarker
Copy link
Contributor Author

Thanks for the review @mmeterel , I think I have addressed your comments.

namespace {

template <typename fp>
void imatcopy_ref(oneapi::mkl::layout layout, oneapi::mkl::transpose trans, int64_t m, int64_t n,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move the reference implementations to this common place (tests/unit_tests/blas/include/reference_blas_templates.hpp), along with other reference implementation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, good idea!

Copy link
Contributor

@mmeterel mmeterel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks!

Copy link

@aaronjohnson aaronjohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments but nothing blocking.

And, #pragma once may suffice for header guard. But to date, both ways are used across the project.

fp alpha, beta;
int64_t i, tmp;

batch_size = 1 + std::rand() % 20;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am unsure this be the same random number every time.

Do we want reproducibility of random batch size?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, if something in a test were to fail. Is knowing the exact values useful for reproducing.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, transpose APIs, there is probably not a concern.

Copy link

@aaronjohnson aaronjohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And lastly, _precondition and _postcondition may not be necessary since they are stubs.

Then again, to match the rest of pre-existing BLAS it makes sense to include them here.

No changes required. Just some observations. Thank you, @andrewtbarker

@andrewtbarker andrewtbarker merged commit 9abb8cf into uxlfoundation:develop Sep 20, 2022
@andrewtbarker andrewtbarker deleted the transpose-interfaces2 branch September 20, 2022 22:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants