Enables spillable/unspillable state for RapidsBuffer and allow buffer sharing #7572
Conversation
Signed-off-by: Alessandro Bellina <[email protected]>
I did a first pass through this and it looks good, but the code is complicated enough I need to dig in much deeper.
build
…bellina/spark-rapids into spill/rapids_buffer_handle_dedup_final
build
Mostly just some nits. Feeling better about this change.
I am currently running stress testing for NDS q72 at 3TB by setting the percent of GPU allocation to 30% of the original and watching it spill, especially broadcast spill (shared rapids buffers). I am debugging an inc after close seen here.
The INC AFTER CLOSE issue is not related to my change. Some of the gather code has race conditions when an exception is thrown (OOM in this case) and can cause this. It is more of a nuisance than anything as the executor is going down anyway. Filed this to look at separately: #7581
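As background on what an "inc after close" means here, a small hypothetical sketch (not the plugin's actual gather code) of how an exception thrown mid-acquisition has to be handled so that ref counts stay balanced; all names below are illustrative:

import scala.collection.mutable.ArrayBuffer

// hypothetical ref-counted resource: incRefCount after the count hits zero
// is the "INC AFTER CLOSE" style failure described above
class RefCountedBuffer(val id: Int) extends AutoCloseable {
  private var refCount = 1

  def incRefCount(): Unit = synchronized {
    require(refCount > 0, s"INC AFTER CLOSE on buffer $id")
    refCount += 1
  }

  override def close(): Unit = synchronized {
    require(refCount > 0, s"DOUBLE CLOSE on buffer $id")
    refCount -= 1
  }
}

object GatherSketch {
  def gatherAll(buffers: Seq[RefCountedBuffer]): Seq[RefCountedBuffer] = {
    val acquired = new ArrayBuffer[RefCountedBuffer]()
    try {
      buffers.foreach { b =>
        b.incRefCount() // may throw if another thread already closed b
        acquired += b
      }
      acquired.toSeq
    } catch {
      case t: Throwable =>
        // release only what this call acquired; missing this step on the
        // error path is the kind of imbalance that later surfaces as an
        // inc/close mismatch
        acquired.foreach(_.close())
        throw t
    }
  }
}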
…idsDeviceMemoryBuffer instances being created pointing at the same underlying
…ers to be spillable as soon as they are added, or handled via ref counting
…ied to spill at the same time
In my stress testing at 3TB I found a deadlock due to lock inversion with a new commit I added in response to the race condition feedback (8737302). I am debugging it.
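For context on the failure mode, a minimal, hypothetical illustration of lock inversion; the lock names and paths are made up and are not the PR's actual code:

object LockInversionDemo {
  private val catalogLock = new Object
  private val bufferLock = new Object

  // e.g. a spill path: catalog lock first, then the buffer lock
  def spillPath(): Unit = catalogLock.synchronized {
    bufferLock.synchronized {
      // copy the victim buffer out while holding both locks
    }
  }

  // e.g. a free path: buffer lock first, then the catalog lock
  def freePath(): Unit = bufferLock.synchronized {
    catalogLock.synchronized {
      // remove the buffer from the catalog while holding both locks
    }
  }

  // If spillPath and freePath run concurrently, each thread can end up
  // waiting on the lock the other holds. The usual fix is a single global
  // lock ordering (say, always catalog before buffer) or removing the
  // nested acquisition entirely.
}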
…into spill/rapids_buffer_handle_dedup_final
…into spill/rapids_buffer_handle_dedup_final
Performance-wise, overall for NDS 3TB with 5 repetitions for baseline (I used a 23.02 snapshot from last week) vs this change:
q9 was faster (6%) and q72 was slower (2.2%). I am going to focus more testing on query72 to see if this is caused perhaps by calling
For the stress testing I used query95 at 3/10/30 TB, and query72 at 3 TB with a reduced footprint. My main goal was to find deadlocks, if any, and to cause spill. I am not seeing such issues.
build
build
build
// total amount spilled in this invocation
var totalSpilled: Long = 0

if (store.currentSpillableSize > targetTotalSize) {
Not necessarily to fix for this PR, but one issue with this old logic is that it will not free anything if targetTotalSize is larger than the current spillable size. That seems pretty bad. For example, freeing just 256MB of memory might allow a 1GB allocation to succeed. Honestly, I am not sure why we would care what the current amount of spillable memory is; we should just blindly ask the store to spill until we can allocate, and the store either will spill or it won't. We shouldn't give up until the spill store is exhausted of spillable memory or we succeed in the allocation, whichever comes first.
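A rough sketch of the retry loop this comment suggests, under the assumption of a placeholder SpillStore trait; trySpill and tryAllocate are illustrative names, not the plugin's real API:

trait SpillStore {
  // attempt to spill up to `bytes`; returns how much was actually spilled
  def trySpill(bytes: Long): Long
}

object SpillUntilAllocated {
  def allocateWithSpill(
      store: SpillStore,
      tryAllocate: Long => Option[Long],
      size: Long): Option[Long] = {
    var result = tryAllocate(size)
    var exhausted = false
    while (result.isEmpty && !exhausted) {
      // keep spilling regardless of the store's reported spillable size;
      // stop only when a spill round frees nothing (the store is exhausted)
      val spilled = store.trySpill(size)
      exhausted = spilled == 0
      if (!exhausted) {
        result = tryAllocate(size)
      }
    }
    result
  }
}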
Filed this to track #7706
build
Signed-off-by: Alessandro Bellina <[email protected]>
Closes #6864
Closes #6561
This PR allows a RapidsBuffer to become "non spillable" for the period of time that a ColumnarBatch is obtained from it (and the ref count of the device memory buffer is > 1). It also enables sharing a RapidsBuffer between different RapidsBufferHandle instances. This means that when DeviceMemoryBuffers are being added to the spill framework, we detect whether we are already tracking that buffer (tracking being the window between the creation of the RapidsDeviceMemoryBuffer and the first time free is called on it).

This is a draft PR since I am still running some performance numbers and I need to add more unit tests. There are various race conditions that this creates. Because of the race conditions we may put this into 23.04 instead.
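To make the description concrete, a simplified, hypothetical sketch of the two ideas (a buffer is unspillable while a ColumnarBatch holds the device memory, and handles for the same underlying buffer share one tracked entry); class and method names are illustrative, not the plugin's actual classes:

import scala.collection.mutable

class TrackedBuffer(val id: Long) {
  private var deviceRefCount = 1 // the store's own reference
  private var handleCount = 0

  def addHandle(): Unit = synchronized { handleCount += 1 }
  def removeHandle(): Unit = synchronized { handleCount -= 1 }

  // a caller obtained a ColumnarBatch backed by this buffer: bump the device
  // ref count, which makes the buffer unspillable until the batch is released
  def acquireBatch(): Unit = synchronized { deviceRefCount += 1 }
  def releaseBatch(): Unit = synchronized { deviceRefCount -= 1 }

  // spillable only while nothing outside the store holds the device memory
  def spillable: Boolean = synchronized { deviceRefCount == 1 }
}

class Catalog {
  private val byUnderlyingBuffer = mutable.Map.empty[Long, TrackedBuffer]

  // adding the same underlying device buffer twice returns the existing
  // TrackedBuffer with one more handle, instead of tracking it twice
  def addBuffer(underlyingId: Long): TrackedBuffer = synchronized {
    val buf = byUnderlyingBuffer.getOrElseUpdate(underlyingId, new TrackedBuffer(underlyingId))
    buf.addHandle()
    buf
  }
}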
These are tests I need to document; some of them I have already done:
Table.columnViewsFromPacked calls are as cheap as we think