Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation for oneAPI data management backend primitives #3008

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

Vika-F
Copy link
Contributor

@Vika-F Vika-F commented Dec 4, 2024

ndview, ndarray and related primitives are documented.


PR completeness and readability

  • I have reviewed my changes thoroughly before submitting this pull request.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes or created a separate PR with update and provided its number in the description, if necessary.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have added a respective label(s) to PR if I have a permission for that.
  • I have resolved any merge conflicts that might occur with the base branch.

@Vika-F Vika-F added the docs Issue/PR related to oneDAL docs label Dec 4, 2024
@david-cortes-intel
Copy link
Contributor

Would be helpful if we could merge this PR first:
#3004

So that we could then check that this one isn't introducing new warnings that might be masked by previous ones in the CI logs.

@@ -285,11 +376,22 @@ class ndview : public ndarray_base<axis_count, order> {
}

#ifdef ONEDAL_DATA_PARALLEL
/// Returns a copy of the element of 1d immutable view located in USM at specified location.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it execute operations in a queue if the data depends on it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I do not fully understand the question. This operation is about the same as other sycl operations, it will be included into the flow of dependencies along with the other operations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suppose that the array is the result of a computation like $t(X)*X$.

The operation that calculates the numbers in the array returns a queue, and might not have been completed by the time that .at is called.

Will the call to .at force calculations of the operations in the queues before returning a scalar?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, in case the events triggered by $t(X) \cdot X$ are passed into the deps vector correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be helpful to mention that in the docs.

@@ -539,14 +712,26 @@ inline sycl::event fill(sycl::queue& q,
}
#endif

/// Multidimensional array
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does it differ from sycl's own class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think sycl has own multidimensional data representations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it does have a 1D marray. Lots of the usage for views is as 1D containers, so I was curious about their usage over sycl's.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think besides of multidimensionality, our data structures are interoperable in the sense one can easily view the same data block as dal::table, dal::array, dal::backend::primitives::ndview, etc.

Also here we implement convenience functions like slice, transpose, to_host, etc.

With sycl own structures the amount of code inside oneDAL algorithmic kernels would be much bigger. Anyway we would need to wrap some common operations into functions. This class groups those common operations on the data together.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be helpful to add a hint to prefer these custom classes instead of sycl's.

@Vika-F Vika-F marked this pull request as ready for review December 18, 2024 12:54
@Vika-F
Copy link
Contributor Author

Vika-F commented Dec 18, 2024

@rakshithgb-fujitsu, @keeranroth can you please provide the feedback?

Copy link
Contributor

@keeranroth keeranroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments look good for the code that is present, so I've refrained from commenting on the code. But for my own information, is there a plan to make the views constexpr, or to have the layout separate from the data, so that the offsets into arrays / pointers / multi-dimensional arrays can be calculated at compile time? And does the implementation support arbitrary shapes and strides, e.g. having a layout of the shape (a, (b, c), d), which is a 3 dimensional object, where the second dimension is a 2-dimensional object. This is useful in terms of doing things like cache-blocking. Not for this code review, but it would be great to see functionality like this, if it doesn't already exist

cpp/oneapi/dal/backend/primitives/ndarray.hpp Outdated Show resolved Hide resolved
cpp/oneapi/dal/backend/primitives/ndarray.hpp Outdated Show resolved Hide resolved
cpp/oneapi/dal/backend/primitives/ndarray.hpp Outdated Show resolved Hide resolved
cpp/oneapi/dal/backend/primitives/ndarray.hpp Outdated Show resolved Hide resolved
cpp/oneapi/dal/backend/primitives/ndarray.hpp Outdated Show resolved Hide resolved
cpp/oneapi/dal/backend/primitives/ndarray.hpp Outdated Show resolved Hide resolved
@Vika-F
Copy link
Contributor Author

Vika-F commented Jan 10, 2025

@keeranroth

is there a plan to make the views constexpr, or to have the layout separate from the data, so that the offsets into arrays / pointers / multi-dimensional arrays can be calculated at compile time?

No, there are no such plans for now.
Currently in our code base the sizes of ndarrays are usually related to the sizes of input/output data tables which is run-time information and cannot be constexpr.

But I understand that it might be beneficial to have ndarrays of constexpr sizes to represent some data blocks used in intermediate computations.

And does the implementation support arbitrary shapes and strides ...?

I think it would be more correct to answer "no" to this question. Because:

  • ndarray and ndview do not support strides at all. The layout of the data is defined only with the data's dimensions. For now we do not have the plans to support strided data layouts.
  • the arbitrary shapes like (a, (b, c), d) are not supported also. For the purposes of caching we usually use regular (or parallel) loops creating ndviews that cache the respective data blocks on each iteration. It is allowed to create multiple ndviews pointing to different parts of the data table, for example.
    So, this (a, (b, c), d) structure might be emulated with loops + ndviews of lower dimension.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Issue/PR related to oneDAL docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants