This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Make out_of_memory_recovery test trigger OOM faster. #1190

Merged
merged 1 commit into from
Jun 10, 2020

alliepiper
Collaborator

Fixes #1183. This test takes up the majority of the test
runtime on CPU backends, slowly eating away at RAM/swap for
two minutes while the rest of the system gets evicted from RAM
and stops responding.

Replaced the allocation loop with a single large allocation.
The test now runs in ~1ms and doesn't actually allocate
significant resources.

@alliepiper alliepiper requested a review from brycelelbach June 9, 2020 22:00
@alliepiper
Collaborator Author

CL 28522192.

@alliepiper alliepiper added the testing: internal ci in progress Currently testing on internal NVIDIA CI (DVS). label Jun 9, 2020
//
// Summary of 2720132:
//
// 1. The large allocation fails due to running out of memory.
Collaborator

@brycelelbach brycelelbach Jun 10, 2020

0 index your lists!

Collaborator Author

You first! :)

http://nvbugs/2720132/2

@alliepiper alliepiper merged commit 5a4abe7 into NVIDIA:master Jun 10, 2020
@alliepiper alliepiper added testing: gpuCI passed Passed gpuCI testing. testing: internal ci passed Passed internal NVIDIA CI (DVS). and removed testing: internal ci in progress Currently testing on internal NVIDIA CI (DVS). labels Jun 12, 2020
@alliepiper
Collaborator Author

A full multiconfig test cycle is now 45 minutes shorter 🎉 🎉 🎉

@alliepiper alliepiper deleted the bug/github/oom_recov_perf/1183 branch July 3, 2020 20:51
Labels
testing: gpuCI passed Passed gpuCI testing. testing: internal ci passed Passed internal NVIDIA CI (DVS).
Development

Successfully merging this pull request may close these issues.

test.out_of_memory_recovery is extremely slow
2 participants