fix tensor memory #421

Draft
wants to merge 4 commits into
base: main

nnshah1
Contributor

@nnshah1 nnshah1 commented Dec 18, 2024

Initial investigation found that the shape field of the managed tensor was being set in a way that was not cleaned up correctly. As a result, tensors were not released even when the garbage collector was forced to run.

After changing the shape to be allocated via malloc and released via free, the tests still need a forced garbage collection to remove tensors deterministically. This suggests there is a circular reference, but the exact root cause is not yet known.
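
As a rough illustration of why a circular reference would explain this symptom, the sketch below uses a hypothetical `Node` class (not part of tritonserver) to show that, in CPython, objects in a cycle are not freed by reference counting alone and only go away once the cyclic collector runs.

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for an object that may take part in a cycle."""

    def __init__(self):
        self.other = None


def released_without_gc(make_cycle: bool) -> bool:
    """Return True if the object is freed by reference counting alone."""
    gc.disable()  # keep the cyclic collector from running during the check
    try:
        a = Node()
        b = Node()
        a.other = b
        if make_cycle:
            b.other = a  # a -> b -> a: the counts can never reach zero
        probe = weakref.ref(a)  # a weak reference does not keep `a` alive
        del a, b
        return probe() is None
    finally:
        gc.enable()
        gc.collect()  # reclaim the cycle, as the tests currently have to


print(released_without_gc(make_cycle=False))
print(released_without_gc(make_cycle=True))
```

Without the cycle the object is released as soon as the last name is deleted; with the cycle it survives until `gc.collect()` runs, which matches the behavior seen in the tests.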

The Tensor object stores a reference to the original object (for example, a numpy array).
When a numpy array is created from the Tensor object, it holds a reference to the Tensor object (and, implicitly, to the original numpy array).

When the objects are deleted or dereferenced, the whole chain of references should be released automatically.
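
The intended behavior of such a chain can be sketched with plain Python objects. `Owner` is a hypothetical stand-in for the source array, the Triton Tensor, and the derived array; the sketch assumes CPython reference counting and no cycles.

```python
import weakref


class Owner:
    """Hypothetical object that keeps a strong reference to its parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def release_order():
    """Record the order in which the objects in the chain are released."""
    events = []
    source = Owner("source")
    tensor = Owner("tensor", parent=source)  # tensor keeps source alive
    array = Owner("array", parent=tensor)    # array keeps tensor alive

    # Each callback fires when the corresponding object is actually freed.
    weakref.finalize(source, events.append, "source")
    weakref.finalize(tensor, events.append, "tensor")
    weakref.finalize(array, events.append, "array")

    del source  # still reachable through tensor.parent
    del tensor  # still reachable through array.parent
    events.append("names dropped")
    del array   # last strong reference: the whole chain unwinds here
    return events


print(release_order())
```

In this acyclic setup nothing is freed until the last holder goes away, and then every object in the chain is released immediately, with no call to `gc.collect()`.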

import cupy
import tritonserver

for index in range(50):
    # Create the original source, in this case on the GPU
    tensor = cupy.ones(2**27)

    # Create a Triton Tensor object.
    # Zero copy: holds a reference to the source tensor.
    dl_pack_tensor = tritonserver.Tensor.from_dlpack(tensor)

    # Create a second array from the Triton Tensor.
    # Zero copy: holds a reference to the Triton Tensor.
    array = cupy.from_dlpack(dl_pack_tensor)

    # print(index, index*torch.numel(tensor)*tensor.element_size())

    # Delete the array, which also deletes the capsule.
    # Should dereference the Triton Tensor.
    del array

    # Delete the Triton Tensor.
    # Should dereference the original tensor.
    del dl_pack_tensor

    # Delete the original storage.
    # Should free the actual memory.
    del tensor

    print(index)
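
One possible way to narrow down which objects in a loop like this are only reclaimed by the cyclic collector is the standard `gc.DEBUG_SAVEALL` flag. The helper name `objects_freed_only_by_gc` and the `SelfRef` class are hypothetical; in practice the `action` argument would be one iteration of the loop above.

```python
import gc


def objects_freed_only_by_gc(action):
    """Run action() and return type names of objects that needed the cyclic GC."""
    gc.collect()  # start from a clean state
    gc.disable()  # no automatic collections while action() runs
    old_flags = gc.get_debug()
    gc.set_debug(gc.DEBUG_SAVEALL)  # keep unreachable objects in gc.garbage
    try:
        action()
        gc.collect()
        return sorted({type(obj).__name__ for obj in gc.garbage})
    finally:
        gc.set_debug(old_flags)
        gc.garbage.clear()
        gc.enable()
        gc.collect()  # now actually free what was saved


class SelfRef:
    """Hypothetical object that refers to itself, forming a cycle."""

    def __init__(self):
        self.me = self


def make_cycle():
    SelfRef()


def make_no_cycle():
    [object() for _ in range(3)]


print(objects_freed_only_by_gc(make_cycle))
print(objects_freed_only_by_gc(make_no_cycle))
```

Any type name reported for a real iteration would be a candidate for the circular reference; an empty result would suggest that the objects are released by reference counting alone.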
