
Add fallback memory resource for TCC devices #257

Merged: 14 commits from ksimpson/tcc_memory_resource into main on Dec 6, 2024

Conversation

@ksimpson-work (Contributor) commented Nov 28, 2024

For devices that don't support memory pools, we need to provide an alternate default memory resource.

This basic WAR (workaround) implementation works. I tested it on a Tesla T4 via a Colossus lease on Friday, Nov 29, and these were the results (a sketch of the underlying capability check follows the test output below):

using the DefaultAsyncMempool --> python -m pytest tests/test_memory.py
=============================================== short test summary info ===============================================
FAILED tests/test_memory.py::test_buffer_initialization - cuda.core.experimental._utils.CUDAError: CUDA_ERROR_NOT_SUPPORTED: operation not supported

using the implementation in this branch --> python -m pytest tests/test_memory.py
collected 4 items

tests\test_memory.py .... (SUCCESS)

Closes #208
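For context, whether a device can back the default stream-ordered (async mempool) allocator is exposed through a CUDA driver device attribute. Below is a minimal, illustrative sketch of such a check using the cuda-python driver bindings (`from cuda import cuda`); it is not the code added by this PR, and the helper names `supports_memory_pools` and `is_tcc_device` are hypothetical.

```python
# Illustrative sketch only (not the cuda.core implementation).
# Assumes the cuda-python driver bindings; helper names are hypothetical.
from cuda import cuda


def _check(result):
    # Driver calls return (CUresult, *payload); raise on any error code.
    err, *rest = result
    if err != cuda.CUresult.CUDA_SUCCESS:
        raise RuntimeError(f"CUDA driver error: {err}")
    return rest[0] if rest else None


def supports_memory_pools(device_ordinal: int = 0) -> bool:
    """True if the device can back a stream-ordered (async mempool) allocator."""
    _check(cuda.cuInit(0))
    dev = _check(cuda.cuDeviceGet(device_ordinal))
    attr = cuda.CUdevice_attribute.CU_DEVICE_ATTRIBUTE_MEMORY_POOLS_SUPPORTED
    return bool(_check(cuda.cuDeviceGetAttribute(attr, dev)))


def is_tcc_device(device_ordinal: int = 0) -> bool:
    """True if the (Windows) device is running the TCC driver."""
    _check(cuda.cuInit(0))
    dev = _check(cuda.cuDeviceGet(device_ordinal))
    attr = cuda.CUdevice_attribute.CU_DEVICE_ATTRIBUTE_TCC_DRIVER
    return bool(_check(cuda.cuDeviceGetAttribute(attr, dev)))
```

With a check like this, device initialization can select the async mempool resource when it is supported and fall back to a synchronous resource otherwise (see the second sketch further down).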

@ksimpson-work ksimpson-work added P0 High priority - Must do! feature New feature or request cuda.core Everything related to the cuda.core module labels Nov 28, 2024
@ksimpson-work ksimpson-work added this to the cuda.core beta 2 milestone Nov 28, 2024
@ksimpson-work ksimpson-work self-assigned this Nov 28, 2024

copy-pr-bot bot commented Nov 28, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@ksimpson-work ksimpson-work added bug Something isn't working and removed feature New feature or request labels Nov 28, 2024
@ksimpson-work ksimpson-work force-pushed the ksimpson/tcc_memory_resource branch from 42346c5 to 319a372 on November 29, 2024 19:17
cuda_core/cuda/core/experimental/_memory.py — review thread resolved (outdated)
cuda_core/cuda/core/experimental/_memory.py — review thread resolved (outdated)
@ksimpson-work ksimpson-work marked this pull request as ready for review December 3, 2024 21:26
cuda_core/cuda/core/experimental/_device.py — review thread resolved (outdated)
cuda_core/cuda/core/experimental/_memory.py — review thread resolved (outdated)
cuda_core/cuda/core/experimental/_memory.py — review thread resolved (outdated)
@ksimpson-work (Contributor, Author) commented

Re-tested using synchronous malloc and free on a Tesla T4 Colossus instance (a sketch of the synchronous approach follows the output below).

On the main branch:

(test_env) C:\cuda-python\cuda_core>python -m pytest tests\test_memory.py

platform win32 -- Python 3.12.7, pytest-8.3.4, pluggy-1.5.0
rootdir: C:\cuda-python\cuda_core
configfile: pyproject.toml
collected 4 items

tests\test_memory.py FFFF [100%]

With this change:

(test_env) C:\cuda-python\cuda_core>python -m pytest tests\test_memory.py

platform win32 -- Python 3.12.7, pytest-8.3.4, pluggy-1.5.0
rootdir: C:\cuda-python\cuda_core
configfile: pyproject.toml
collected 4 items

tests\test_memory.py .... [100%]
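As a rough illustration of what a synchronous fallback allocator looks like at the driver-API level (again assuming the cuda-python driver bindings and an active CUDA context; the class name `SynchronousMemoryResource` and its interface are hypothetical stand-ins, not the exact code in this branch):

```python
# Rough sketch only; assumes the cuda-python driver bindings and an active
# CUDA context. Class name and interface are hypothetical stand-ins.
from cuda import cuda


def _check(result):
    # Driver calls return (CUresult, *payload); raise on any error code.
    err, *rest = result
    if err != cuda.CUresult.CUDA_SUCCESS:
        raise RuntimeError(f"CUDA driver error: {err}")
    return rest[0] if rest else None


class SynchronousMemoryResource:
    """Allocate with cuMemAlloc/cuMemFree instead of the stream-ordered
    (async mempool) path, for devices without memory-pool support."""

    def allocate(self, size: int):
        # Synchronous allocation; no stream ordering, so it also works on
        # devices running in TCC mode.
        return _check(cuda.cuMemAlloc(size))

    def deallocate(self, ptr):
        # Synchronous free of a pointer previously returned by allocate().
        _check(cuda.cuMemFree(ptr))


# Example usage (hypothetical):
# mr = SynchronousMemoryResource()
# ptr = mr.allocate(1 << 20)  # 1 MiB
# mr.deallocate(ptr)
```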

@ksimpson-work ksimpson-work requested a review from leofang December 4, 2024 17:36
@leofang (Member) commented Dec 6, 2024

/ok to test

@leofang (Member) commented Dec 6, 2024

Windows failures are known (#271) and unrelated to this change. Let's merge. Thanks, Keenan!

@leofang leofang merged commit c6a1a94 into main Dec 6, 2024
9 of 12 checks passed
@leofang leofang deleted the ksimpson/tcc_memory_resource branch December 6, 2024 22:04
Labels: bug (Something isn't working), cuda.core (Everything related to the cuda.core module), P0 (High priority - Must do!)
Linked issue: Check TCC/WDDM modes during Device initialization (#208)
Participants: ksimpson-work, leofang