GH-43746: [C++] Add support for Boost 1.86 #43766

Merged
merged 49 commits into from
Sep 3, 2024

kou
Member

@kou kou commented Aug 20, 2024

Rationale for this change

boost/process/*.hpp have been deprecated since Boost 1.86, and that release also seems to introduce a backward-incompatible change. We need to use boost/process/v2/*.hpp instead.

What changes are included in this PR?

This introduces arrow::util::Process for testing. It wraps boost/process/ API. So we don't need to use boost/process/ API directly in our tests.

We still use the v1 API on Windows because the v2 API doesn't support process groups and we don't have a workaround for that on Windows. If GCS's testbench didn't use multiple processes, we could use the v2 API on Windows too, because our use case wouldn't need process groups.

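See also:
* The v2 API and process group: boostorg/process#259
* GCS's testbench and multiple processes: googleapis/storage-testbench#669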

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

@kou kou requested a review from lidavidm as a code owner August 20, 2024 05:17
@kou
Member Author

kou commented Aug 20, 2024

@github-actions crossbow submit -g cpp -g r -g linux

@github-actions github-actions bot added the awaiting committer review label Aug 20, 2024

⚠️ GitHub issue #43746 has been automatically assigned in GitHub to PR creator.

Member

@pitrou pitrou left a comment

This looks neat, thank you @kou ! Some comments below.

@@ -65,7 +48,7 @@ namespace arrow {
using internal::TemporaryDir;
namespace fs {
using internal::ConcatAbstractPath;
namespace bp = boost::process;
// namespace bp = BOOST_PROCESS_V2_NAMESPACE;
Member

Remove this?

Member Author

Yes. I should have removed this intermediate code.

std::unique_ptr<process::process> process_;
asio::io_context ctx_;

Status ResolveCurrentExecutable(process::filesystem::path* out) {
Member

Why not return a Result<path>?

Member Author

It's a good idea.
I just moved this from flight/test_util.cc but I should have improved the API.
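
For readers unfamiliar with the suggestion, here is a minimal sketch of the difference between an out-parameter and a result-style return. `DemoError`, `DemoResult`, and `ResolveExistingPath` are hypothetical names invented for this illustration; `DemoResult` is only a small stand-in built on `std::variant` and is not Arrow's actual `Result` class.

```cpp
#include <filesystem>
#include <string>
#include <system_error>
#include <utility>
#include <variant>

// Hypothetical error type for this sketch.
struct DemoError {
  std::string message;
};

// Hypothetical stand-in for a Result<T>: holds either a value or an error.
template <typename T>
class DemoResult {
 public:
  DemoResult(T value) : storage_(std::move(value)) {}
  DemoResult(DemoError error) : storage_(std::move(error)) {}

  bool ok() const { return std::holds_alternative<T>(storage_); }
  const T& value() const { return std::get<T>(storage_); }
  const std::string& error() const { return std::get<DemoError>(storage_).message; }

 private:
  std::variant<T, DemoError> storage_;
};

// Result style: the resolved path is the return value, so the caller has no
// separate output pointer to allocate, pass, or forget to check.
DemoResult<std::filesystem::path> ResolveExistingPath(
    const std::filesystem::path& candidate) {
  std::error_code ec;
  std::filesystem::path resolved = std::filesystem::canonical(candidate, ec);
  if (ec) {
    return DemoError{"cannot resolve " + candidate.string() + ": " + ec.message()};
  }
  return resolved;
}
```

A caller would check `ok()` and then read `value()` or `error()`, instead of passing a `path*` and inspecting a separate status.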

Comment on lines 161 to 166
if (buffered_output.eof()) {
buffered_output.clear();
auto last = buffered_output.str().size();
buffered_output.seekg(last);
buffered_output.seekp(last);
}
Member

What does this do? Can you add an explanatory comment?

Member Author

It's for clearing the EOF bit. I'll add a comment.
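
A minimal sketch of why this reset matters, assuming a `std::stringstream` that is used as a growing buffer (written by one piece of code and read line by line by another). `DrainLines` is a hypothetical helper written for this illustration, not code from the PR. Once a read hits the end of the buffer, the stream's error state is set, and later writes and reads are skipped until `clear()` is called.

```cpp
#include <ios>
#include <sstream>
#include <string>
#include <vector>

// Reads every line currently available in `buffer`, then resets the stream
// state so that data appended later can still be written and read.
std::vector<std::string> DrainLines(std::stringstream& buffer) {
  std::vector<std::string> lines;
  std::string line;
  while (std::getline(buffer, line)) {
    lines.push_back(line);
  }
  if (buffer.eof()) {
    // Reaching the end sets eofbit (and failbit once a read extracts nothing).
    // clear() resets the state to good.
    buffer.clear();
    // Put both the read and write positions at the end of the current content.
    const auto last = static_cast<std::streamoff>(buffer.str().size());
    buffer.seekg(last);
    buffer.seekp(last);
  }
  return lines;
}
```

With this reset in place, a caller can alternate between appending output to the buffer and draining it.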

Comment on lines 89 to 90
ARROW_RETURN_NOT_OK(
impl_->server_process_->SetEnv("MINIO_ACCESS_KEY", kMinioAccessKey));
Member

This line is duplicated?

Member Author

Oh, good catch.
I'll remove it.

impl_->server_process_->SetEnv("MINIO_ACCESS_KEY", kMinioAccessKey));
ARROW_RETURN_NOT_OK(
impl_->server_process_->SetEnv("MINIO_SECRET_KEY", kMinioSecretKey));
ARROW_RETURN_NOT_OK(impl_->server_process_->SetEnv("MINIO_BROWSER", "off"));
Member

Can you keep the comment from above?

Member Author

Ah, I should not have removed the comment. I'll re-add it.

server_process_.terminate();
server_process_.wait();
status = server_process->SetArgs({"-m", "testbench", "--port", port_});
if (!status.ok()) {
Member

Is it useful to add this information, especially as SetArgs cannot actually fail? (same question below)

Member Author

No. I added a Status return value for consistency, but I'll change the return type to void.

server_process_.wait();
}
}
~GcsTestbench() override { server_process_ = nullptr; }
Member

Is it possible to keep the "kill process group" behavior, or is that impossible with bp v2?

Member Author

It's impossible with v2...
boostorg/process#259

Member

Ah, too bad. We'll have to do without it, then.

Member Author

Hmm. It seems that we can't do it with v2. We can't call setpgid() between fork() and exec() with v2...
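
For context, a POSIX-only sketch of the "process group" pattern under discussion, written directly against `fork()`/`exec()` rather than any Boost API. `SpawnInNewProcessGroup` and `TerminateProcessGroup` are hypothetical helper names for this illustration; this is not the code used in the PR.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Starts argv[0] as the leader of a new process group.
// Returns the child's pid (which is also the group id), or -1 on failure.
pid_t SpawnInNewProcessGroup(char* const argv[]) {
  const pid_t pid = fork();
  if (pid < 0) {
    return -1;
  }
  if (pid == 0) {
    // Child: between fork() and exec(), become the leader of a new group.
    setpgid(0, 0);
    execvp(argv[0], argv);
    _exit(127);  // Only reached if exec failed.
  }
  // Parent: set the same group as well, so the group exists no matter which
  // side runs first. This may fail once the child has already called exec;
  // in that case the child has set the group itself, so the error is ignored.
  setpgid(pid, pid);
  return pid;
}

// Sends SIGTERM to every process in the group led by `pgid`, then reaps the
// leader. Returns the leader's wait status, or -1 on failure.
int TerminateProcessGroup(pid_t pgid) {
  if (kill(-pgid, SIGTERM) != 0) {  // A negative pid targets the whole group.
    return -1;
  }
  int status = 0;
  if (waitpid(pgid, &status, 0) != pgid) {
    return -1;
  }
  return status;
}
```

Any grandchildren the leader spawns stay in the same group unless they change it themselves, which is why a single `kill(-pgid, ...)` can reach a multi-process server.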

Member Author

If GCS's testbench doesn't use multiple processes, we don't need to use process group: googleapis/storage-testbench#669

Member Author

On Linux and macOS, we can terminate all of the GCS testbench's related processes with SIGTERM, because the testbench's main process terminates all related processes on SIGTERM.
Similar approaches on Windows (SendMessageW(WM_CLOSE) or GenerateConsoleCtrlEvent(CTRL_C_EVENT)) didn't work.

So let's still use v1 on Windows until googleapis/storage-testbench#669 is solved.

@github-actions github-actions bot added awaiting changes and removed awaiting merge labels Aug 22, 2024
@kou
Member Author

kou commented Aug 22, 2024

Hmm. It seems that some platforms still use Boost < 1.80.
I'll add support for Boost < 1.80.

@kou kou force-pushed the cpp-boost-process-v2 branch from 4798ea6 to a5a36ac Compare August 22, 2024 07:32
@github-actions github-actions bot added awaiting change review, awaiting changes and removed awaiting changes, awaiting change review labels Aug 22, 2024
@kou kou force-pushed the cpp-boost-process-v2 branch from e2de404 to ff9fd9f Compare August 23, 2024 00:59
@kou
Member Author

kou commented Aug 23, 2024

Hmm. python3 -m testbench can't be terminated on Windows...

@kou
Member Author

kou commented Aug 23, 2024

It seems that python3 -m testbench was terminated but something got stuck. What?

https://github.com/apache/arrow/actions/runs/10522425847/job/29155057348?pr=43766#step:12:45167

Fri, 23 Aug 2024 08:32:06 GMT 69: ~GcsTestBench(): end
Fri, 23 Aug 2024 08:36:54 GMT     Test #69: arrow-gcsfs-test .............................***Timeout 299.98 sec

@@ -1191,57 +1194,6 @@ set(Boost_USE_MULTITHREADED ON)
if(MSVC AND ARROW_USE_STATIC_CRT)
set(Boost_USE_STATIC_RUNTIME ON)
endif()
set(Boost_ADDITIONAL_VERSIONS
Member

Do we not need to keep this around for anyone running an older cmake version + boost without the config? (our min is 3.16 after all)

Member Author

Let's try our CI.
If we need to keep this, it tells us. :-)

Member Author

Hmm.
Our vcpkg doesn't have boost-cmake yet...: #43812

@github-actions github-actions bot added awaiting changes and removed awaiting change review labels Aug 29, 2024
@github-actions github-actions bot added awaiting change review and removed awaiting changes labels Aug 29, 2024
@kou
Member Author

kou commented Aug 29, 2024

@github-actions crossbow submit -g cpp -g r -g linux -g wheel java-jars

Revision: c193c0d

Submitted crossbow builds: ursacomputing/crossbow @ actions-156765ce39

Task Status
almalinux-8-amd64 GitHub Actions
almalinux-8-arm64 GitHub Actions
almalinux-9-amd64 GitHub Actions
almalinux-9-arm64 GitHub Actions
amazon-linux-2023-amd64 GitHub Actions
amazon-linux-2023-arm64 GitHub Actions
centos-7-amd64 GitHub Actions
centos-8-stream-amd64 GitHub Actions
centos-8-stream-arm64 GitHub Actions
centos-9-stream-amd64 GitHub Actions
centos-9-stream-arm64 GitHub Actions
debian-bookworm-amd64 GitHub Actions
debian-bookworm-arm64 GitHub Actions
debian-trixie-amd64 GitHub Actions
debian-trixie-arm64 GitHub Actions
java-jars GitHub Actions
r-binary-packages GitHub Actions
test-alpine-linux-cpp GitHub Actions
test-build-cpp-fuzz GitHub Actions
test-conda-cpp GitHub Actions
test-conda-cpp-valgrind GitHub Actions
test-cuda-cpp GitHub Actions
test-debian-12-cpp-amd64 GitHub Actions
test-debian-12-cpp-i386 GitHub Actions
test-fedora-39-cpp GitHub Actions
test-r-arrow-backwards-compatibility GitHub Actions
test-r-clang-sanitizer GitHub Actions
test-r-depsource-bundled Azure
test-r-depsource-system GitHub Actions
test-r-dev-duckdb GitHub Actions
test-r-devdocs GitHub Actions
test-r-extra-packages GitHub Actions
test-r-gcc-11 GitHub Actions
test-r-gcc-12 GitHub Actions
test-r-install-local GitHub Actions
test-r-install-local-minsizerel GitHub Actions
test-r-linux-as-cran GitHub Actions
test-r-linux-rchk GitHub Actions
test-r-linux-valgrind GitHub Actions
test-r-macos-as-cran GitHub Actions
test-r-minimal-build Azure
test-r-offline-maximal GitHub Actions
test-r-offline-minimal Azure
test-r-rhub-debian-gcc-devel-lto-latest Azure
test-r-rhub-debian-gcc-release-custom-ccache Azure
test-r-rhub-ubuntu-release-latest Azure
test-r-rocker-r-ver-latest Azure
test-r-rstudio-r-base-4.1-opensuse155 Azure
test-r-rstudio-r-base-4.2-focal Azure
test-r-ubuntu-22.04 GitHub Actions
test-r-versions GitHub Actions
test-ubuntu-20.04-cpp GitHub Actions
test-ubuntu-20.04-cpp-bundled GitHub Actions
test-ubuntu-20.04-cpp-minimal-with-formats GitHub Actions
test-ubuntu-20.04-cpp-thread-sanitizer GitHub Actions
test-ubuntu-22.04-cpp GitHub Actions
test-ubuntu-22.04-cpp-20 GitHub Actions
test-ubuntu-22.04-cpp-emscripten GitHub Actions
test-ubuntu-22.04-cpp-no-threading GitHub Actions
test-ubuntu-24.04-cpp GitHub Actions
test-ubuntu-24.04-cpp-gcc-13-bundled GitHub Actions
test-ubuntu-24.04-cpp-gcc-14 GitHub Actions
test-ubuntu-r-sanitizer GitHub Actions
ubuntu-focal-amd64 GitHub Actions
ubuntu-focal-arm64 GitHub Actions
ubuntu-jammy-amd64 GitHub Actions
ubuntu-jammy-arm64 GitHub Actions
ubuntu-noble-amd64 GitHub Actions
ubuntu-noble-arm64 GitHub Actions
wheel-macos-big-sur-cp310-arm64 GitHub Actions
wheel-macos-big-sur-cp311-arm64 GitHub Actions
wheel-macos-big-sur-cp312-arm64 GitHub Actions
wheel-macos-big-sur-cp313-arm64 GitHub Actions
wheel-macos-big-sur-cp38-arm64 GitHub Actions
wheel-macos-big-sur-cp39-arm64 GitHub Actions
wheel-macos-catalina-cp310-amd64 GitHub Actions
wheel-macos-catalina-cp311-amd64 GitHub Actions
wheel-macos-catalina-cp312-amd64 GitHub Actions
wheel-macos-catalina-cp313-amd64 GitHub Actions
wheel-macos-catalina-cp38-amd64 GitHub Actions
wheel-macos-catalina-cp39-amd64 GitHub Actions
wheel-manylinux-2-28-cp310-amd64 GitHub Actions
wheel-manylinux-2-28-cp310-arm64 GitHub Actions
wheel-manylinux-2-28-cp311-amd64 GitHub Actions
wheel-manylinux-2-28-cp311-arm64 GitHub Actions
wheel-manylinux-2-28-cp312-amd64 GitHub Actions
wheel-manylinux-2-28-cp312-arm64 GitHub Actions
wheel-manylinux-2-28-cp313-amd64 GitHub Actions
wheel-manylinux-2-28-cp313-arm64 GitHub Actions
wheel-manylinux-2-28-cp38-amd64 GitHub Actions
wheel-manylinux-2-28-cp38-arm64 GitHub Actions
wheel-manylinux-2-28-cp39-amd64 GitHub Actions
wheel-manylinux-2-28-cp39-arm64 GitHub Actions
wheel-manylinux-2014-cp310-amd64 GitHub Actions
wheel-manylinux-2014-cp310-arm64 GitHub Actions
wheel-manylinux-2014-cp311-amd64 GitHub Actions
wheel-manylinux-2014-cp311-arm64 GitHub Actions
wheel-manylinux-2014-cp312-amd64 GitHub Actions
wheel-manylinux-2014-cp312-arm64 GitHub Actions
wheel-manylinux-2014-cp313-amd64 GitHub Actions
wheel-manylinux-2014-cp313-arm64 GitHub Actions
wheel-manylinux-2014-cp38-amd64 GitHub Actions
wheel-manylinux-2014-cp38-arm64 GitHub Actions
wheel-manylinux-2014-cp39-amd64 GitHub Actions
wheel-manylinux-2014-cp39-arm64 GitHub Actions
wheel-windows-cp310-amd64 GitHub Actions
wheel-windows-cp311-amd64 GitHub Actions
wheel-windows-cp312-amd64 GitHub Actions
wheel-windows-cp313-amd64 GitHub Actions
wheel-windows-cp38-amd64 GitHub Actions
wheel-windows-cp39-amd64 GitHub Actions

@kou
Member Author

kou commented Aug 31, 2024

If nobody objects to this, I'll merge it next week.

Comment on lines 85 to 86
#else
#ifdef BOOST_PROCESS_HAVE_V1
Member

#elif defined(...) perhaps?

Member Author

Good catch. Let's simplify this.
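
A small sketch of the simplification being suggested: a nested `#else` / `#ifdef` pair can be flattened into a single `#elif defined(...)` chain. `DEMO_HAVE_V1`, `DEMO_HAVE_V2`, and `kProcessBackend` are hypothetical names used only for this illustration and do not reflect the actual macros or logic in process.cc.

```cpp
#include <string_view>

// Hypothetical feature macro: pretend only the "v1" backend is available.
#define DEMO_HAVE_V1 1

// Nested form (what the review comment is pointing at):
//   #ifdef DEMO_HAVE_V2
//   ...
//   #else
//   #ifdef DEMO_HAVE_V1
//   ...
//   #endif
//   #endif
//
// Flattened form with #elif: one chain, one #endif, and an explicit fallback.
#if defined(DEMO_HAVE_V2)
constexpr std::string_view kProcessBackend = "v2";
#elif defined(DEMO_HAVE_V1)
constexpr std::string_view kProcessBackend = "v1";
#else
constexpr std::string_view kProcessBackend = "none";
#endif
```

The flat chain keeps each alternative at the same nesting level, which makes it easier to see which branch is taken and to add another alternative later.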

@github-actions github-actions bot added awaiting changes, awaiting change review and removed awaiting change review, awaiting changes labels Aug 31, 2024
@raulcd
Member

raulcd commented Sep 2, 2024

@github-actions crossbow submit verify-rc-source-macos

@raulcd
Member

raulcd commented Sep 2, 2024

I am just triggering some tasks to validate this will solve some nightly failures for the verification jobs.

github-actions bot commented Sep 2, 2024

Revision: 6470664

Submitted crossbow builds: ursacomputing/crossbow @ actions-a2a3ccd6a0

Task Status
verify-rc-source-cpp-macos-amd64 GitHub Actions
verify-rc-source-cpp-macos-arm64 GitHub Actions
verify-rc-source-cpp-macos-conda-amd64 GitHub Actions
verify-rc-source-csharp-macos-amd64 GitHub Actions
verify-rc-source-csharp-macos-arm64 GitHub Actions
verify-rc-source-go-macos-amd64 GitHub Actions
verify-rc-source-go-macos-arm64 GitHub Actions
verify-rc-source-integration-macos-amd64 GitHub Actions
verify-rc-source-integration-macos-arm64 GitHub Actions
verify-rc-source-integration-macos-conda-amd64 GitHub Actions
verify-rc-source-java-macos-amd64 GitHub Actions
verify-rc-source-js-macos-amd64 GitHub Actions
verify-rc-source-js-macos-arm64 GitHub Actions
verify-rc-source-python-macos-amd64 GitHub Actions
verify-rc-source-python-macos-arm64 GitHub Actions
verify-rc-source-python-macos-conda-amd64 GitHub Actions
verify-rc-source-ruby-macos-amd64 GitHub Actions
verify-rc-source-ruby-macos-arm64 GitHub Actions

@kou kou merged commit 00d3576 into apache:main Sep 3, 2024
38 of 39 checks passed
@kou kou removed the awaiting change review label Sep 3, 2024
@kou kou deleted the cpp-boost-process-v2 branch September 3, 2024 01:08

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 00d3576.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 1 possible false positive for unstable benchmarks that are known to sometimes produce them.

mapleFU pushed a commit to mapleFU/arrow that referenced this pull request Sep 3, 2024
### Rationale for this change

`boost/process/*.hpp` have been deprecated since Boost 1.86, and that release also seems to introduce a backward-incompatible change. We need to use `boost/process/v2/*.hpp` instead.

### What changes are included in this PR?

This introduces `arrow::util::Process` for testing. It wraps boost/process/ API. So we don't need to use boost/process/ API directly in our tests.

We still use the v1 API on Windows because the v2 API doesn't support process groups and we don't have a workaround for that on Windows. If GCS's testbench didn't use multiple processes, we could use the v2 API on Windows too, because our use case wouldn't need process groups.

See also:
* The v2 API and process group: boostorg/process#259
* GCS's testbench and multiple processes: googleapis/storage-testbench#669

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#43746

Lead-authored-by: Sutou Kouhei <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
zanmato1984 pushed a commit to zanmato1984/arrow that referenced this pull request Sep 6, 2024
khwilson pushed a commit to khwilson/arrow that referenced this pull request Sep 14, 2024