Skip to content

Commit

Permalink
GH-39697: [R] Source build should check if offline (#39699)
Browse files Browse the repository at this point in the history
CRAN.

See the commit messages

Existing CI should pass and not be affected. We should confirm that source builds get all features enabled. We should test on macbuilder with this package and with one where we've inserted `download.file <- function(...) stop()` at the top of nixlibs.R to simulate offline behavior.

I hope there is only one user affected.
* Closes: #39697

Authored-by: Neal Richardson <[email protected]>
Signed-off-by: Jacob Wujciak-Jens <[email protected]>
  • Loading branch information
nealrichardson authored and assignUser committed Jan 19, 2024
1 parent 4a239b6 commit 4e24435
Show file tree
Hide file tree
Showing 6 changed files with 73 additions and 28 deletions.
2 changes: 1 addition & 1 deletion dev/tasks/r/github.linux.offline.build.yml
Original file line number Diff line number Diff line change
Expand Up @@ -79,7 +79,7 @@ jobs:
shell: Rscript {0}
- name: Install
env:
TEST_OFFLINE_BUILD: true
ARROW_OFFLINE_BUILD: true
LIBARROW_MINIMAL: false
{{ macros.github_set_sccache_envvars()|indent(8)}}
run: |
Expand Down
2 changes: 1 addition & 1 deletion dev/tasks/tasks.yml
Original file line number Diff line number Diff line change
Expand Up @@ -1372,7 +1372,7 @@ tasks:
r_org: library
r_image: r-base
r_tag: latest
flags: '-e TEST_OFFLINE_BUILD=true'
flags: '-e ARROW_OFFLINE_BUILD=true'

test-r-dev-duckdb:
ci: github
Expand Down
31 changes: 31 additions & 0 deletions r/NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,37 @@
under the License.
-->

# arrow 14.0.2.9000

## New features

* Bindings for `base::prod` have been added so you can now use it in your dplyr
pipelines (i.e., `tbl |> summarize(prod(col))`) without having to pull the
data into R (@m-muecke, #38601).
* Calling `dimnames` or `colnames` on `Dataset` objects now returns a useful
result rather than just `NULL` (#38377).
* The `code()` method on Schema objects now takes an optional `namespace`
argument which, when `TRUE`, prefixes names with `arrow::` which makes
the output more portable (@orgadish, #38144).

## Minor improvements and fixes

* Don't download cmake when ARROW_OFFLINE_BUILD=true and update `SystemRequirements` (#39602).
* Fallback to source build gracefully if binary download fails (#39587).
* An error is now thrown instead of warning and pulling the data into R when any
of `sub`, `gsub`, `stringr::str_replace`, `stringr::str_replace_all` are
passed a length > 1 vector of values in `pattern` (@abfleishman, #39219).
* Missing documentation was added to `?open_dataset` documenting how to use the
ND-JSON support added in arrow 13.0.0 (@Divyansh200102, #38258).
* To make debugging problems easier when using arrow with AWS S3
(e.g., `s3_bucket`, `S3FileSystem`), the debug log level for S3 can be set
with the `AWS_S3_LOG_LEVEL` environment variable.
See `?S3FileSystem` for more information. (#38267)
* Using arrow with duckdb (i.e., `to_duckdb()`) no longer results in warnings
when quitting your R session. (#38495)
* A large number of minor spelling mistakes were fixed (@jsoref, #38929, #38257)
* The developer documentation has been updated to match changes made in recent releases (#38220)

# arrow 14.0.2

## Minor improvements and fixes
Expand Down
2 changes: 1 addition & 1 deletion r/configure
Original file line number Diff line number Diff line change
Expand Up @@ -73,7 +73,7 @@ FORCE_BUNDLED_BUILD=`echo $FORCE_BUNDLED_BUILD | tr '[:upper:]' '[:lower:]'`
ARROW_USE_PKG_CONFIG=`echo $ARROW_USE_PKG_CONFIG | tr '[:upper:]' '[:lower:]'`
# Just used in testing: whether or not it is ok to download dependencies (in the
# bundled build)
TEST_OFFLINE_BUILD=`echo $TEST_OFFLINE_BUILD | tr '[:upper:]' '[:lower:]'`
ARROW_OFFLINE_BUILD=`echo $ARROW_OFFLINE_BUILD | tr '[:upper:]' '[:lower:]'`

VERSION=`grep '^Version' DESCRIPTION | sed s/Version:\ //`
UNAME=`uname -s`
Expand Down
62 changes: 38 additions & 24 deletions r/tools/nixlibs.R
Original file line number Diff line number Diff line change
Expand Up @@ -120,11 +120,11 @@ validate_checksum <- function(binary_url, libfile, hush = quietly) {
# The warnings from system2 if it fails pop up later in the log and thus are
# more confusing than they are helpful (so we suppress them)
checksum_ok <- suppressWarnings(system2(
"shasum",
args = c("--status", "-a", "512", "-c", checksum_file),
stdout = ifelse(quietly, FALSE, ""),
stderr = ifelse(quietly, FALSE, "")
)) == 0
"shasum",
args = c("--status", "-a", "512", "-c", checksum_file),
stdout = ifelse(quietly, FALSE, ""),
stderr = ifelse(quietly, FALSE, "")
)) == 0

if (!checksum_ok) {
checksum_ok <- suppressWarnings(system2(
Expand Down Expand Up @@ -565,8 +565,8 @@ build_libarrow <- function(src_dir, dst_dir) {
env_var_list <- c(env_var_list, ARROW_DEPENDENCY_SOURCE = "BUNDLED")
}

# On macOS, if not otherwise set, let's override Boost_SOURCE to be bundled
# Necessary due to #39590 for CRAN
# On macOS, if not otherwise set, let's override Boost_SOURCE to be bundled
# Necessary due to #39590 for CRAN
if (on_macos) {
# Using lowercase (e.g. Boost_SOURCE) to match the cmake args we use already.
deps_to_bundle <- c("Boost", "lz4")
Expand Down Expand Up @@ -906,27 +906,13 @@ on_windows <- tolower(Sys.info()[["sysname"]]) == "windows"
# For local debugging, set ARROW_R_DEV=TRUE to make this script print more
quietly <- !env_is("ARROW_R_DEV", "true")

not_cran <- env_is("NOT_CRAN", "true")

if (is_release) {
VERSION <- VERSION[1, 1:3]
arrow_repo <- paste0(getOption("arrow.repo", sprintf("https://apache.jfrog.io/artifactory/arrow/r/%s", VERSION)), "/libarrow/")
} else {
not_cran <- TRUE
arrow_repo <- paste0(getOption("arrow.dev_repo", "https://nightlies.apache.org/arrow/r"), "/libarrow/")
}

if (!is_release && !test_mode) {
VERSION <- find_latest_nightly(VERSION)
}

# To collect dirs to rm on exit, use cleanup() to add dirs
# we reset it to avoid errors on reruns in the same session.
options(.arrow.cleanup = character())
on.exit(unlink(getOption(".arrow.cleanup"), recursive = TRUE), add = TRUE)

# enable full featured builds for macOS in case of CRAN source builds.
if (not_cran || on_macos) {
not_cran <- env_is("NOT_CRAN", "true")
if (not_cran) {
# Set more eager defaults
if (env_is("LIBARROW_BINARY", "")) {
Sys.setenv(LIBARROW_BINARY = "true")
Expand All @@ -942,9 +928,37 @@ if (not_cran || on_macos) {
build_ok <- !env_is("LIBARROW_BUILD", "false")

# Check if we're authorized to download
download_ok <- !test_mode && !env_is("TEST_OFFLINE_BUILD", "true")
download_ok <- !test_mode && !env_is("ARROW_OFFLINE_BUILD", "true")
if (!download_ok) {
lg("Dependency downloading disabled. Unset ARROW_OFFLINE_BUILD to enable", .indent = "***")
}
# If not forbidden from downloading, check if we are offline and turn off downloading.
# The default libarrow source build will download its source dependencies and fail
# if they can't be retrieved.
# But, don't do this if the user has requested a binary or a non-minimal build:
# we should error rather than silently succeeding with a minimal build.
if (download_ok && Sys.getenv("LIBARROW_BINARY") %in% c("false", "") && !env_is("LIBARROW_MINIMAL", "false")) {
download_ok <- try_download("https://apache.jfrog.io/artifactory/arrow/r/", tempfile())
if (!download_ok) {
lg("Network connection not available", .indent = "***")
}
}

download_libarrow_ok <- download_ok && !env_is("LIBARROW_DOWNLOAD", "false")

# Set binary repos
if (is_release) {
VERSION <- VERSION[1, 1:3]
arrow_repo <- paste0(getOption("arrow.repo", sprintf("https://apache.jfrog.io/artifactory/arrow/r/%s", VERSION)), "/libarrow/")
} else {
arrow_repo <- paste0(getOption("arrow.dev_repo", "https://nightlies.apache.org/arrow/r"), "/libarrow/")
}

# If we're on a dev version, look for the most recent libarrow binary version
if (download_libarrow_ok && !is_release && !test_mode) {
VERSION <- find_latest_nightly(VERSION)
}

# This "tools/thirdparty_dependencies" path, within the tar file, might exist if
# create_package_with_all_dependencies() was run, or if someone has created it
# manually before running make build.
Expand Down
2 changes: 1 addition & 1 deletion r/vignettes/developers/setup.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -280,7 +280,7 @@ withr::with_makevars(list(CPPFLAGS = "", LDFLAGS = ""), remotes::install_github(
* See the user-facing [article on installation](../install.html) for a large number of
environment variables that determine how the build works and what features
get built.
* `TEST_OFFLINE_BUILD`: When set to `true`, the build script will not download
* `ARROW_OFFLINE_BUILD`: When set to `true`, the build script will not download
prebuilt the C++ library binary or, if needed, `cmake`.
It will turn off any features that require a download, unless they're available
in `ARROW_THIRDPARTY_DEPENDENCY_DIR` or the `tools/thirdparty_download/` subfolder.
Expand Down

0 comments on commit 4e24435

Please sign in to comment.