Implement basic element/slice indexing #15

Open · 2 tasks
dan-zheng opened this issue Oct 5, 2018 · 9 comments
Labels: feature (New feature or request), help wanted (Extra attention is needed)


dan-zheng commented Oct 5, 2018

Implement basic, non-strided indexing as described here.

  • Element tensor indexing: tensor(i).
    • This returns an element tensor, whose shape is tensor.shape.tail.
  • Subtensor indexing (i.e. "slicing [in the first dimension]"): tensor(range).
    • This returns a tensor slice, whose shape is [range.length] + tensor.shape.tail.

Tensor.apply should be adapted for these indexing strategies (rather than the current scalar indexing behavior).
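
For concreteness, a rough sketch of what the adapted overloads might look like (illustrative only: view(offset, newShape) stands in for a hypothetical helper that returns a tensor sharing this tensor's data, starting at the given offset):

// Element tensor indexing: result shape is shape.tail.
def apply(i: Rep[Int]): Tensor =
  view(i * shape.tail.product, shape.tail)

// Subtensor indexing in the first dimension:
// result shape is [range.length] + shape.tail.
def apply(range: Range): Tensor =
  view(range.start * shape.tail.product, range.length +: shape.tail)
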
These indexing strategies add expressivity: they enable minibatching and interesting recursive functions:

def print(separator: String = ", ") = {
  if (isScalar) printf("%f", this.data(0))
  else {
    // Recurse over element tensors, separating them with `separator`.
    printf("[")
    for (i <- 0 until this.shape(0): Rep[Range]) {
      if (i > 0) printf(separator)  // separator between elements only
      this(i).print(separator)
    }
    printf("]")
  }
}

Both element tensors and slices should probably be cost-free views into the data of their parent tensor. Otherwise, indexing large tensors (for minibatching) will be highly inefficient. We can implement copy-on-write semantics.

dan-zheng added the feature and help wanted labels on Oct 5, 2018
dan-zheng changed the title from "Implement basic indexing" to "Implement basic element/slice indexing" on Oct 5, 2018
feiwang3311 self-assigned this on Oct 6, 2018
feiwang3311 (Owner) commented

I am working on this, but there is a problem with range selection.
If I use apply(i: Rep[Int], j: Rep[Int]) to select a range, the resulting dimension (j - i + 1) is itself a Rep[Int], which conflicts with the design decision that dimensions are not Rep-typed.

However, this is related to a more general problem: Lantern cannot yet support unknown dimensions (typically the batch dimension, which is variable). If we add support for unknown dimensions, they would have to be Rep[Int] (e.g. for the batch dimension).

What do you guys think? @dan-zheng @TiarkRompf @GSAir


dan-zheng commented Oct 7, 2018

Could you explain why slice indexing uses apply(i: Rep[Int], j: Rep[Int]) rather than apply(i: Int, j: Int)? Or apply(range: Range), where range.step == 1?

I'm also slightly concerned about the readability of slice indexing, especially when mixed with element tensor indexing.

val tensor = Tensor.ones(4, 4, 4, 4)

tensor(0, 3)(0)(1, 2)(0)
// This is a bit sore on the eyes.
// Visually, `apply(i, j)` doesn't look like a "slice" to me.
// `apply(range)` might be more interpretable.
// tensor(0 until 3)(0)(1 until 2)(0)

Lack of support for unknown dimensions is a real issue. I don't have great ideas for solving that at the moment.

Edit: actually, is this really a problem? When reading a dataset, we can use a count variable to keep track of the leading dimension and use it to initialize the final tensor.
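
A minimal sketch of that idea (the reader API and Tensor.fromArray are hypothetical names, just for illustration):

// Count rows while reading, then build the tensor with the
// now-known leading dimension.
var count = 0
val buf = scala.collection.mutable.ArrayBuffer[Float]()
while (reader.hasNext) {    // `reader` is a hypothetical dataset reader
  buf ++= reader.next()     // one row of `rowSize` floats
  count += 1
}
val dataset = Tensor.fromArray(buf.toArray, count, rowSize)  // hypothetical constructor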

feiwang3311 (Owner) commented

Let's first be clear about the goal. Rep-typed ranges/slices allow slicing at runtime, whereas static ranges/slices only allow slicing at staging time. Which do we want: slicing with indices known at staging time, or slicing with indices unknown until runtime?
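
To make the distinction concrete (illustrative, using the proposed apply overloads):

val t = Tensor.ones(4, 4)

// Staging-time slicing: the range is an ordinary Scala value, fixed when
// the code is generated, so the result shape is statically known.
val sub = t(0 until 3)

// Runtime indexing: the index is a staged Rep[Int], unknown until the
// generated code runs.
def pick(i: Rep[Int]) = t(i)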

TiarkRompf (Collaborator) commented

tensor(0 until 3)(0)(1 until 2)(0) looks nice.

What exactly is the key use case for unknown dimensions? Is it because we want to increase batch size during training?

How does XLA deal with this? I think it also requires static, fixed shapes.

In general I think switching shapes from Array[Int] to Array[Rep[Int]] is defensible, but we should be clear about why we need it and make sure it doesn't cause any performance regressions.


dan-zheng commented Oct 7, 2018

@feiwang3311 That's a good question. I'm not sure which is better, but staging-time indexing seems fine to me.

The more important factor for performance is ensuring that element tensors and slices have copy-on-write semantics, i.e. they reuse the data of their parent tensor until they are first written to. Otherwise, every indexing operation will copy the full element/slice, which is expensive.

feiwang3311 (Owner) commented

Runtime slicing is useful for RNNs, whose input is a 3D tensor of shape (timeStep, batch, dimension). At each step i, we process input(i) together with the previous hidden state and produce the next hidden state.
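
Roughly (a sketch; rnnCell is a hypothetical step function and Tensor.zeros a hypothetical zero-initializer):

// input: shape (timeStep, batch, dimension).
// Runtime element indexing input(i) yields a (batch, dimension) view.
var hidden = Tensor.zeros(batch, hiddenSize)
for (i <- 0 until input.shape(0): Rep[Range]) {
  hidden = rnnCell(input(i), hidden)
}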

Runtime ranging may not be useful (I don't know of a use case), so I will skip it for now.

Staging-time slicing and ranging may be useful, but I have not encountered a need: for minibatching, we normally get a minibatch tensor directly, without having to slice it out of a larger tensor.

I am not sure how to implement copy-on-write (a link to some guidance would be helpful). For runtime slicing (as in the RNN case), I think making no copy at all makes sense.

TiarkRompf (Collaborator) commented

Let's make sure that anything we do has a concrete use case in one of the benchmark apps. If it's not on the critical path for at least one of them, then:

  1. it's not the best use of our time right now
  2. we may not sufficiently understand all the requirements

For example, a good COW scheme takes significant work to get right (in particular, to not slow down read-only uses) so we should approach this only when necessary.


dan-zheng commented Oct 7, 2018

I totally agree that implementing COW is not the best use of our time. For minibatching, slice tensors are not modified, so COW is unnecessary.

(The primary purpose of COW is to enable "value semantics for tensors", e.g. tensor(i) is a new Tensor value that can be modified without affecting tensor.)

If we decide that implementing indexing is a good use of time, we can simply give element/slice tensors "reference semantics" and let them share their parents' data. The only consequence is that writing to an element/slice tensor will change the data of its parent, which may be counterintuitive. The actual indexing code will simply perform pointer arithmetic, just like the current slice method.
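
For instance (a sketch; sliceOf stands in for whatever pointer-offset helper the current slice method uses, and view is a hypothetical shared-storage constructor):

// A zero-copy view: same storage, offset start, new shape.
def view(offset: Rep[Int], newShape: Seq[Int]): Tensor =
  new Tensor(sliceOf(data, offset), newShape)

// Element indexing then reduces to pointer arithmetic:
def apply(i: Rep[Int]): Tensor =
  view(i * shape.tail.product, shape.tail)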

TiarkRompf (Collaborator) commented

Makes sense. Again, let's decide based on what we encounter in the various models.
