
Split block and evict task #8

Merged

merged 23 commits from memory-scheduling-jae into memory-scheduling on Mar 3, 2022

Conversation

@jaewan (Collaborator) commented on Nov 18, 2021

Why are these changes needed?

The previous version implemented BlockTasks against the currently submitted task, which caused a deadlock (NSDI23/single~/debug/deadlock.py).
This PR instead finds the lowest priority in the object_table and sets BlockTasks according to it; a rough sketch of the intended flow is included below.

  • Adds a separation between blocking only the dispatch of new tasks and also blocking spill (enable_BlockTasksSpill).
    With BlockTasks and no spill, there are two deadlock cases, found in NSDI23/single~/debug/deadlock1.py and 2.py.
    All files in the NSDI23 directory are microbenchmarks; you can ignore them.
    Many changes here are from ClangFormat. The files you can ignore are listed below.
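For orientation, the core of the change can be sketched as follows. This is a simplified illustration assembled from the snippets quoted in the review below, not the exact diff; in particular, it assumes that enable_BlockTasksSpill means "also hold off spilling while tasks are blocked".

```
// Simplified sketch of CreateRequestQueue's behavior when the store is full
// (illustrative only; see the actual diff hunks quoted in the review below).
if (RayConfig::instance().enable_BlockTasks()) {
  // Lowest-priority object currently resident in the object_table
  // (computed via ObjectStore::GetLowestPriObject in this PR).
  ray::Priority lowest_pri = GetLowestPriObject();
  // Ask the raylet to stop dispatching tasks with lower priority than lowest_pri.
  on_object_creation_blocked_callback_(lowest_pri, /*block=*/true, /*evict=*/false);
  if (RayConfig::instance().enable_BlockTasksSpill()) {
    // Spilling is held off too: wait for higher-priority tasks to finish.
    return Status::TransientObjectStoreFull("Waiting for higher priority tasks to finish");
  }
}
// Otherwise fall through to the usual spill path.
spill_objects_callback_();
```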

Files changed for BlockTasks

  • object_manager/plasma/create_request_queue.cc: L185 ~ 218
    L34 ~ 167 are just ClangFormat changes plus an argument added for the block_task threshold (not used for now), so you can ignore them.
  • raylet/node_manager.cc: callback function for BlockTasks

Files changed for searching for the lowest-priority object in the object_table

  • object_manager/plasma/common.h: Added GetPriority to get the priority of each object
  • object_manager/plasma/create_request_queue.h: getting the lowest priority from object_table to set block_task
  • object_manager/plasma/object_lifecycle_manager.cc
  • object_manager/plasma/object_lifecycle_manager.h
  • object_manager/plasma/object_store.cc: Search for the lowest-priority object in the object_table
  • object_manager/plasma/object_store.h
  • object_manager/plasma/protocol.cc: copying priority

Files you can skip

Many argument changes come from ClangFormat. You can ignore these changes in the following files:

  • core_worker/core_worker.cc
  • object_manager/plasma/create_request_queue.cc: L34 ~ 167 (These changes are ClangFormat and for block_task threshold)
    You will want to check out L185 ~ 218.
  • object_manager/plasma/eviction_policy.cc: some debug messages
  • object_manager/plasma/object_store.cc: check object store if it reached a threshold (for threshold blocktask)
  • object_manager/plasma/store_runner.cc
  • raylet/local_object_manager.cc
  • raylet/local_object_manager.h
  • raylet/scheduling/cluster_task_manager.cc: ClangFormat changes and an argument added for the BlockTasks callback function
  • raylet/test/local_object_manager_test.cc

@stephanie-wang (Owner) left a comment:

Can you separate the threshold changes into a separate PR, so this PR is just the change to add an option for blocking/evicting tasks? We can sync in person on how to implement the threshold.

RAY_CONFIG(bool, enable_EvictTasks, false)

// Whether to use EvictTasks when spill required
RAY_CONFIG(bool, enable_BlockandEvictTasks, false)
stephanie-wang (Owner):

Can you remove this? I think it's confusing and unnecessary with the other two flags.

return Status::TransientObjectStoreFull("Waiting for higher priority tasks to finish");
}
RAY_LOG(INFO) << "[JAE_DEBUG] should_spill set";
SetShouldSpill(false);
stephanie-wang (Owner):

I don't think we should set should_spill_ here, right? We should allow the callbacks to set this later on. What was the reason for setting it?

@@ -214,7 +256,9 @@ Status CreateRequestQueue::ProcessRequests() {

// If we make it here, then there is nothing left in the queue. It's safe to
// run new tasks again.
RAY_UNUSED(on_object_creation_blocked_callback_(ray::Priority()));
if(!RayConfig::instance().enable_BlockandEvictTasks() || RayConfig::instance().enable_BlockTasks()){
stephanie-wang (Owner):

Not sure if this is right. Shouldn't we call the callback when we are back under the threshold, inside the main loop?

stephanie-wang (Owner):

  1. ProcessRequest() will determine whether we are under or over the threshold.
  2. If we are over the threshold, we should call BlockTasks() on the lowest-priority object currently in the object store.
  3. If we are under the threshold, then we should call BlockTasks(Priority()).

This means we also need to track what the lowest-priority object currently is; a rough sketch of this logic follows.
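In other words, something along these lines (a sketch of the suggestion only; over_threshold, lowest_pri_object, and the BlockTasks helper are placeholders standing in for the PR's actual members and callback):

```
// Sketch of points 1-3 above; the names used here are placeholders.
const bool over_threshold = allocated_percentage >= block_tasks_threshold_;  // step 1
if (over_threshold) {
  // Step 2: block tasks at the priority of the lowest-priority object currently
  // in the object store (so that priority has to be tracked as objects come and go).
  BlockTasks(lowest_pri_object);
} else {
  // Step 3: back under the threshold, so clear the block.
  BlockTasks(ray::Priority());
}
```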

return Status::TransientObjectStoreFull("Waiting for higher priority tasks to finish");
}
if(RayConfig::instance().enable_BlockandEvictTasks()){
on_object_creation_blocked_callback_(queue_it->first.first);
stephanie-wang (Owner):

Instead of having two separate callbacks, can we do one callback that sets the blocked priority and a flag for whether to also evict tasks?

I think this will be cleaner and also easier to debug. Since the callbacks here are concurrent, it's going to be hard to figure out what's going on if there are too many callbacks.
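For example, the single callback could carry the blocked priority plus flags, roughly matching the three-argument calls that appear later in this diff. The typedef itself is not part of the PR, so the exact shape below is an assumption:

```
#include <functional>

// Assumed shape of a single unified callback; the three-argument calls such as
// on_object_creation_blocked_callback_(priority, true, false) elsewhere in the
// diff suggest something like this, but the actual typedef is not shown here.
using ObjectCreationBlockedCallback =
    std::function<void(const ray::Priority &blocked_priority,
                       bool block_tasks,    // stop dispatching lower-priority tasks
                       bool evict_tasks)>;  // additionally evict running tasks
```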

@stephanie-wang self-assigned this Nov 23, 2021
if(RayConfig::instance().enable_BlockTasks()){
RAY_LOG(DEBUG) << "[JAE_DEBUG] calling object_creation_blocked_callback priority "
<< queue_it->first.first.score;
on_object_creation_blocked_callback_(queue_it->first.first , true, false);
stephanie-wang (Owner):

We should call the callback a single time with the right flags, instead of calling it twice (this is to avoid having concurrent callbacks).

if (!should_spill_) {
RAY_LOG(INFO) << "Object creation of priority " << queue_it->first.first << " blocked";
return Status::TransientObjectStoreFull("Waiting for higher priority tasks to finish");
if (RayConfig::instance().enable_BlockTasks()) {
stephanie-wang (Owner):

Shouldn't this also check if block_tasks_required is true?

stephanie-wang (Owner):

Also, where is the enable_EvictTasks() flag used? I only see it at the end of the loop but I thought it should also use it here?

<< lowest_pri;
on_object_creation_blocked_callback_(lowest_pri, true, false);
if(!RayConfig::instance().enable_BlockTasksSpill()){
spill_objects_callback_();
stephanie-wang (Owner):

Could we do something simpler like just return early here?

@@ -214,15 +252,19 @@ Status CreateRequestQueue::ProcessRequests() {

// If we make it here, then there is nothing left in the queue. It's safe to
// run new tasks again.
RAY_UNUSED(on_object_creation_blocked_callback_(ray::Priority()));
if (RayConfig::instance().enable_BlockTasks() &&
!RayConfig::instance().enable_EvictTasks()) {
stephanie-wang (Owner):

Why do we check enableEvictTasks here?

ray::Priority ObjectStore::GetLowestPriObject() {
// Return the lowest priority object in object_table
auto it = object_table_.begin();
ray::Priority lowest_priority = it->second->GetPriority();
stephanie-wang (Owner):

Won't this segfault if the object table is empty?

it++;
for (; it != object_table_.end(); it++){
ray::Priority p = it->second->GetPriority();
if(lowest_priority < p){
stephanie-wang (Owner):

Isn't this actually choosing the highest priority object instead of the lowest?
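A guarded variant might look roughly like this. It is only a sketch: the comparison in the loop has to match whatever ordering ray::Priority's operator< actually defines, which is precisely the question raised above.

```
// Sketch only, not the PR's code.
ray::Priority ObjectStore::GetLowestPriObject() {
  if (object_table_.empty()) {
    // Avoid dereferencing begin() on an empty table; the default Priority is
    // used as a placeholder sentinel here.
    return ray::Priority();
  }
  auto it = object_table_.begin();
  ray::Priority lowest_priority = it->second->GetPriority();
  for (++it; it != object_table_.end(); ++it) {
    ray::Priority p = it->second->GetPriority();
    // The original `lowest_priority < p` keeps the maximum under operator<;
    // if operator< orders from low to high priority, that selects the
    // highest-priority object, so the comparison likely needs to be flipped.
    if (p < lowest_priority) {
      lowest_priority = p;
    }
  }
  return lowest_priority;
}
```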

allocated_percentage = 0;
}
if (block_tasks_required != nullptr) {
if (allocated_percentage >= block_tasks_threshold_) {
stephanie-wang (Owner):

How is the block tasks threshold set?

jaewan (Collaborator, Author):

It is set from ray_config. By default it is 1.0.
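Presumably that corresponds to a config entry along these lines (the exact flag name and default are not shown in this conversation, so treat the following as a guess):

```
// Hypothetical entry in the RAY_CONFIG definitions; name and location are guesses.
// Fraction of the object store that may be allocated before tasks are blocked.
RAY_CONFIG(float, block_tasks_threshold, 1.0)
```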

@@ -168,6 +168,14 @@ Status CreateRequestQueue::ProcessRequests() {
if (spilling_required) {
spill_objects_callback_();
}
//Block and evict tasks are called if the object store reaches over a threshold
if(RayConfig::instance().enable_BlockTasks() && block_tasks_required){
on_object_creation_blocked_callback_(lowest_pri, true, false);
stephanie-wang (Owner):

I think we may have talked about this earlier, but can we change this so that we only call the callback once? This is to reduce the chance of a race condition since these callbacks are async.

@stephanie-wang merged commit 7de7aff into memory-scheduling on Mar 3, 2022
@jaewan deleted the memory-scheduling-jae branch on March 29, 2022 at 00:05
stephanie-wang pushed a commit that referenced this pull request on Aug 5, 2022:
We encountered SIGSEGV when running Python test `python/ray/tests/test_failure_2.py::test_list_named_actors_timeout`. The stack is:

```
#0  0x00007fffed30f393 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&) ()
   from /lib64/libstdc++.so.6
#1  0x00007fffee707649 in ray::RayLog::GetLoggerName() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#2  0x00007fffee70aa90 in ray::SpdLogMessage::Flush() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#3  0x00007fffee70af28 in ray::RayLog::~RayLog() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#4  0x00007fffee2b570d in ray::asio::testing::(anonymous namespace)::DelayManager::Init() [clone .constprop.0] ()
   from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#5  0x00007fffedd0d95a in _GLOBAL__sub_I_asio_chaos.cc () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#6  0x00007ffff7fe282a in call_init.part () from /lib64/ld-linux-x86-64.so.2
#7  0x00007ffff7fe2931 in _dl_init () from /lib64/ld-linux-x86-64.so.2
#8  0x00007ffff7fe674c in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
#9  0x00007ffff7b82e79 in _dl_catch_exception () from /lib64/libc.so.6
#10 0x00007ffff7fe5ffe in _dl_open () from /lib64/ld-linux-x86-64.so.2
#11 0x00007ffff7d5f39c in dlopen_doit () from /lib64/libdl.so.2
#12 0x00007ffff7b82e79 in _dl_catch_exception () from /lib64/libc.so.6
#13 0x00007ffff7b82f13 in _dl_catch_error () from /lib64/libc.so.6
#14 0x00007ffff7d5fb09 in _dlerror_run () from /lib64/libdl.so.2
#15 0x00007ffff7d5f42a in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#16 0x00007fffef04d330 in py_dl_open (self=<optimized out>, args=<optimized out>)
    at /tmp/python-build.20220507135524.257789/Python-3.7.11/Modules/_ctypes/callproc.c:1369
```

The root cause is that when loading `_raylet.so`, `static DelayManager _delay_manager` is initialized and `RAY_LOG(ERROR) << "RAY_testing_asio_delay_us is set to " << delay_env;` is executed. However, the static variables declared in `logging.cc` are not initialized yet (in this case, `std::string RayLog::logger_name_ = "ray_log_sink"`).

It's better not to rely on the initialization order of static variables in different compilation units because it's not guaranteed. I propose to change all `RAY_LOG`s to `std::cerr` in `DelayManager::Init()`.

The crash happens in Ant's internal codebase. Not sure why this test case passes in the community version though.

BTW, I've tried different approaches:

1. Using a static local variable in `get_delay_us` and removing the global variable. This doesn't work because `init()` needs to access the variable as well.
2. Defining the global variable as `std::unique_ptr<DelayManager>` and initializing it in `get_delay_us`. This works, but it requires a lock to be thread-safe.
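A minimal sketch of the proposed change inside `DelayManager::Init()`, assuming the quoted log line is the one being replaced:

```
#include <iostream>  // std::cerr does not depend on RayLog's static state

// Before (runs during static initialization of _raylet.so, while RayLog's own
// statics in logging.cc may not be constructed yet):
//   RAY_LOG(ERROR) << "RAY_testing_asio_delay_us is set to " << delay_env;

// After:
std::cerr << "RAY_testing_asio_delay_us is set to " << delay_env << std::endl;
```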