GLPK solver fails to solve simple problems when one or more constraints are disabled. #428

Closed
ChristopherHogan opened this issue Jun 29, 2022 · 4 comments

@ChristopherHogan
Collaborator

With OR-Tools, disabling a constraint (by setting it to 0) worked fine, but the GLPK solver fails to find an optimal solution for simple problems when one or more constraints are disabled.

Here is a reproducer.

#include <assert.h>
#include <mpi.h>
#include "hermes.h"
#include "bucket.h"

namespace hapi = hermes::api;

int main(int argc, char **argv) {
  int mpi_threads_provided;
  MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &mpi_threads_provided);
  if (mpi_threads_provided < MPI_THREAD_MULTIPLE) {
    fprintf(stderr, "Didn't receive appropriate MPI threading specification\n");
    return 1;
  }

  auto run_test = [](const std::string &bkt_name,
                     float min_remaining_capacity_constraint) {
    std::shared_ptr<hapi::Hermes> hermes = hermes::InitHermesDaemon();
    hapi::Context ctx;
    ctx.minimize_io_time_options =
      hapi::MinimizeIoTimeOptions(min_remaining_capacity_constraint);
    hapi::Bucket bkt(bkt_name, hermes, ctx);

    const size_t kBlobSize = KILOBYTES(4);
    hapi::Blob blob(kBlobSize);
    std::string blob_name = "1";
    assert(bkt.Put(blob_name, blob).Succeeded());

    bkt.Destroy();

    hermes->Finalize(true);
  };

  // Works
  run_test("Default", 0.1f);
  // Fails
  run_test("DisableMinCapacityConstraint", 0.0f);

  return 0;
}
ChristopherHogan added the bug label Jun 29, 2022
@hyoklee
Member

hyoklee commented Jul 8, 2022

@ChristopherHogan, I get a "Segmentation Fault 11" error for the default case as well.

  // Works
  run_test("Default", 0.1f);

What branch/tag/commit hash did you use to run your test that works for the default case?

@ChristopherHogan
Collaborator Author

The default case works for me on the current HEAD of master. To make compilation easy, I added the file as hermes/test/issue428.cc, then added issue428 to API_TESTS in hermes/test/CMakeLists.txt, then compiled and ran it like this:

$ cd hermes/build
$ make -j 8
$ LSAN_OPTIONS=suppressions=../test/data/asan.supp bin/issue428
>>>
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0708 14:36:24.963989 38422 buffer_pool.cc:1090] 953096 bytes required for BufferPool metadata
I0708 14:36:24.964144 38422 buffer_pool.cc:1129] Device: 0
I0708 14:36:24.964150 38422 buffer_pool.cc:1132]     0-Buffers: 2487
I0708 14:36:24.964155 38422 buffer_pool.cc:1132]     1-Buffers: 680
I0708 14:36:24.964159 38422 buffer_pool.cc:1132]     2-Buffers: 170
I0708 14:36:24.964162 38422 buffer_pool.cc:1132]     3-Buffers: 85
I0708 14:36:24.964166 38422 buffer_pool.cc:1137]     Num Headers: 10647
I0708 14:36:24.964170 38422 buffer_pool.cc:1138]     Num Buffers: 3422
I0708 14:36:24.964174 38422 buffer_pool.cc:1129] Device: 1
I0708 14:36:24.964180 38422 buffer_pool.cc:1132]     0-Buffers: 3200
I0708 14:36:24.964184 38422 buffer_pool.cc:1132]     1-Buffers: 800
I0708 14:36:24.964190 38422 buffer_pool.cc:1132]     2-Buffers: 200
I0708 14:36:24.964197 38422 buffer_pool.cc:1132]     3-Buffers: 100
I0708 14:36:24.964202 38422 buffer_pool.cc:1137]     Num Headers: 4300
I0708 14:36:24.964212 38422 buffer_pool.cc:1138]     Num Buffers: 4300
I0708 14:36:24.964218 38422 buffer_pool.cc:1129] Device: 2
I0708 14:36:24.964226 38422 buffer_pool.cc:1132]     0-Buffers: 3200
I0708 14:36:24.964233 38422 buffer_pool.cc:1132]     1-Buffers: 800
I0708 14:36:24.964241 38422 buffer_pool.cc:1132]     2-Buffers: 200
I0708 14:36:24.964247 38422 buffer_pool.cc:1132]     3-Buffers: 100
I0708 14:36:24.964257 38422 buffer_pool.cc:1137]     Num Headers: 4300
I0708 14:36:24.964267 38422 buffer_pool.cc:1138]     Num Buffers: 4300
I0708 14:36:24.964275 38422 buffer_pool.cc:1129] Device: 3
I0708 14:36:24.964285 38422 buffer_pool.cc:1132]     0-Buffers: 3200
I0708 14:36:24.964293 38422 buffer_pool.cc:1132]     1-Buffers: 800
I0708 14:36:24.964303 38422 buffer_pool.cc:1132]     2-Buffers: 200
I0708 14:36:24.964313 38422 buffer_pool.cc:1132]     3-Buffers: 100
I0708 14:36:24.964321 38422 buffer_pool.cc:1137]     Num Headers: 4300
I0708 14:36:24.964326 38422 buffer_pool.cc:1138]     Num Buffers: 4300
I0708 14:36:24.964335 38422 buffer_pool.cc:1140] Total Buffers: 16322
I0708 14:36:24.966411 38422 memory_management.cc:141] PushSize added 1 bytes of padding for alignment
I0708 14:36:24.966439 38422 metadata_storage_stb_ds.cc:942] Metadata can support 7263 Blobs per node
I0708 14:36:24.980463 38422 rpc_thallium.cc:56] Serving at ofi+sockets://127.0.0.1:8080 with 1 RPC threads
I0708 14:36:25.092846 38422 rpc_thallium.cc:517] Buffer organizer serving at ofi+sockets://127.0.0.1:8081 with 1 RPC threads and 4 BO worker threads
I0708 14:36:25.206789 38422 metadata_management.cc:441] Creating Bucket 'Default'
I0708 14:36:25.207528 38422 bucket.h:273] Attaching blob '1' to Bucket 'Default'
I0708 14:36:25.208034 38422 bucket.cc:400] Destroying bucket 'Default'
I0708 14:36:25.903235 38422 buffer_pool.cc:1090] 953096 bytes required for BufferPool metadata
I0708 14:36:25.903286 38422 buffer_pool.cc:1129] Device: 0
I0708 14:36:25.903297 38422 buffer_pool.cc:1132]     0-Buffers: 2487
I0708 14:36:25.903307 38422 buffer_pool.cc:1132]     1-Buffers: 680
I0708 14:36:25.903316 38422 buffer_pool.cc:1132]     2-Buffers: 170
I0708 14:36:25.903326 38422 buffer_pool.cc:1132]     3-Buffers: 85
I0708 14:36:25.903338 38422 buffer_pool.cc:1137]     Num Headers: 10647
I0708 14:36:25.903355 38422 buffer_pool.cc:1138]     Num Buffers: 3422
I0708 14:36:25.903373 38422 buffer_pool.cc:1129] Device: 1
I0708 14:36:25.903389 38422 buffer_pool.cc:1132]     0-Buffers: 3200
I0708 14:36:25.903409 38422 buffer_pool.cc:1132]     1-Buffers: 800
I0708 14:36:25.903427 38422 buffer_pool.cc:1132]     2-Buffers: 200
I0708 14:36:25.903446 38422 buffer_pool.cc:1132]     3-Buffers: 100
I0708 14:36:25.903466 38422 buffer_pool.cc:1137]     Num Headers: 4300
I0708 14:36:25.903484 38422 buffer_pool.cc:1138]     Num Buffers: 4300
I0708 14:36:25.903508 38422 buffer_pool.cc:1129] Device: 2
I0708 14:36:25.903529 38422 buffer_pool.cc:1132]     0-Buffers: 3200
I0708 14:36:25.903550 38422 buffer_pool.cc:1132]     1-Buffers: 800
I0708 14:36:25.903569 38422 buffer_pool.cc:1132]     2-Buffers: 200
I0708 14:36:25.903589 38422 buffer_pool.cc:1132]     3-Buffers: 100
I0708 14:36:25.903606 38422 buffer_pool.cc:1137]     Num Headers: 4300
I0708 14:36:25.903625 38422 buffer_pool.cc:1138]     Num Buffers: 4300
I0708 14:36:25.903645 38422 buffer_pool.cc:1129] Device: 3
I0708 14:36:25.903666 38422 buffer_pool.cc:1132]     0-Buffers: 3200
I0708 14:36:25.903683 38422 buffer_pool.cc:1132]     1-Buffers: 800
I0708 14:36:25.903707 38422 buffer_pool.cc:1132]     2-Buffers: 200
I0708 14:36:25.903726 38422 buffer_pool.cc:1132]     3-Buffers: 100
I0708 14:36:25.903745 38422 buffer_pool.cc:1137]     Num Headers: 4300
I0708 14:36:25.903764 38422 buffer_pool.cc:1138]     Num Buffers: 4300
I0708 14:36:25.903784 38422 buffer_pool.cc:1140] Total Buffers: 16322
I0708 14:36:25.907502 38422 memory_management.cc:141] PushSize added 1 bytes of padding for alignment
I0708 14:36:25.907539 38422 metadata_storage_stb_ds.cc:942] Metadata can support 7263 Blobs per node
I0708 14:36:25.923614 38422 rpc_thallium.cc:56] Serving at ofi+sockets://127.0.0.1:8080 with 1 RPC threads
I0708 14:36:26.036576 38422 rpc_thallium.cc:517] Buffer organizer serving at ofi+sockets://127.0.0.1:8081 with 1 RPC threads and 4 BO worker threads
I0708 14:36:26.150063 38422 metadata_management.cc:441] Creating Bucket 'DisableMinCapacityConstraint'
E0708 14:36:26.150406 38422 data_placement_engine.cc:381] DPE or-tools does not find a solutionwith provided constraints.
E0708 14:36:26.150470 38422 bucket.h:308] DPE PlacementSchema is empty.
issue428: /home/chogan/dev/hermes/test/issue428.cc:27: main(int, char**)::<lambda(const string&, float)>: Assertion `bkt.Put(blob_name, blob).Succeeded()' failed.
Aborted (core dumped)

@hyoklee
Member

hyoklee commented Jul 8, 2022

I tried the same thing, adding your code as a test called "disable_min_cap".
When I ran it through ctest, both cases failed.
When I ran it directly with your command line, I got no errors in either case on my fork using Jelly:

[hyoklee@jelly]/scr/hyoklee/src/hermes-hyoklee/build> LSAN_OPTIONS=suppressions=../test/data/asan.supp bin/disable_min_cap
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0708 14:48:56.627655 32230 buffer_pool.cc:1090] 953096 bytes required for BufferPool metadata
I0708 14:48:56.629137 32230 metadata_storage_stb_ds.cc:942] Metadata can support 7263 Blobs per node
I0708 14:48:56.644026 32230 rpc_thallium.cc:56] Serving at ofi+sockets://127.0.0.1:8080 with 1 RPC threads
I0708 14:48:56.754027 32230 rpc_thallium.cc:517] Buffer organizer serving at ofi+sockets://127.0.0.1:8081 with 1 RPC threads and 4 BO worker threads
I0708 14:48:56.866076 32230 metadata_management.cc:441] Creating Bucket 'Default'
I0708 14:48:56.866159 32230 bucket.cc:400] Destroying bucket 'Default'
I0708 14:48:57.759034 32230 buffer_pool.cc:1090] 953096 bytes required for BufferPool metadata
I0708 14:48:57.761096 32230 metadata_storage_stb_ds.cc:942] Metadata can support 7263 Blobs per node
I0708 14:48:57.779410 32230 rpc_thallium.cc:56] Serving at ofi+sockets://127.0.0.1:8080 with 1 RPC threads
I0708 14:48:57.894429 32230 rpc_thallium.cc:517] Buffer organizer serving at ofi+sockets://127.0.0.1:8081 with 1 RPC threads and 4 BO worker threads
I0708 14:48:58.006932 32230 metadata_management.cc:441] Creating Bucket 'DisableMinCapacityConstraint'
I0708 14:48:58.006984 32230 bucket.cc:400] Destroying bucket 'DisableMinCapacityConstraint'

I'm quite puzzled.

@ChristopherHogan
Collaborator Author

Both failed through ctest because it runs with mpirun -n 2, but the test case is meant to be a serial program. Both succeeded with the direct command because you are in Release mode (I believe), so the assert in the line assert(bkt.Put(blob_name, blob).Succeeded()); is compiled out. Try compiling in Debug mode.
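
To illustrate the Release-mode point, here is a minimal sketch of why a call placed inside assert() can silently disappear. It does not use Hermes at all: FakePut, RELEASE_ASSERT, PutInsideAssert, and PutThenCheck are hypothetical names made up for this example, and RELEASE_ASSERT simply mirrors what the standard assert macro expands to when NDEBUG is defined (which Release builds typically do).

```cpp
#include <cassert>  // the real assert lives here; it becomes ((void)0) under NDEBUG
#include <cstdio>

// Mirrors the standard definition of assert when NDEBUG is defined:
// the argument expression is discarded and never evaluated.
#define RELEASE_ASSERT(expr) ((void)0)

// Hypothetical stand-in for bkt.Put(); counts how often it actually runs.
static int g_put_calls = 0;

bool FakePut() {
  ++g_put_calls;
  return false;  // pretend the placement failed
}

// Pattern from the reproducer: the call sits inside the assertion.
// Returns how many times FakePut was really invoked.
int PutInsideAssert() {
  g_put_calls = 0;
  RELEASE_ASSERT(FakePut());  // expands to ((void)0): FakePut never runs
  return g_put_calls;
}

// Safer pattern: always perform the call, then check the result explicitly
// so the behavior is the same in Debug and Release builds.
bool PutThenCheck() {
  g_put_calls = 0;
  bool ok = FakePut();
  if (!ok) {
    std::fprintf(stderr, "Put failed\n");
  }
  return ok;
}
```

With this sketch, PutInsideAssert() reports zero calls, which matches the Release-mode run above where neither the Put nor its failure shows up in the output. PutThenCheck() still executes the call and reports the failure regardless of build type.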

hyoklee added a commit to hyoklee/hermes that referenced this issue Jul 9, 2022
hyoklee added a commit that referenced this issue Jul 12, 2022