Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

progress bar and parallel compute for cartogram_ncont #36

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -38,3 +38,6 @@ vignettes/*.pdf

# pkgdown
docs

# private folder for draft test scripts
private
14 changes: 11 additions & 3 deletions DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,16 @@ Authors@R: c(
Description: Construct continuous and non-contiguous area cartograms.
URL: https://github.com/sjewo/cartogram, https://sjewo.github.io/cartogram/
BugReports: https://github.com/sjewo/cartogram/issues
Imports: methods, sf, packcircles
Suggests:
Imports:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, the furrr, parallelly, and future packages could be moved to Suggests: and the code related to them may be only used if the progress bar is on.

(I also think there should be an option to disable the progress bar)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Nowosad On a second thought, that would imply using the for loop (or purrr::map) by default with no progress bar... (though purrr::map also has progress bar). So kind of defeats the purpose. And why would parallel processing rely on the progress bar setting?

Copy link
Owner

@sjewo sjewo Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your suggestions and your work on implementing parallel processing @e-kotov! I haven't had much time to keep an eye on this topic in the past.

I would like to keep the imports as small as possible.

  • Maybe utils::txtProgressBar() is sufficient for adding a progress bar without additional imports?
  • I would also prefer to move the parallelly, and future packages to Suggests, as @Nowosad suggested, and add an if-clause for n_cpu > 1 to use futures parallel execution.
  • I don't want to be petty, but is 'furrr' absolutely necessary? If the code can only be implemented with the future package, I would prefer that.

Best,
Sebastian

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sjewo points taken. I'm not sure excluding furrr is practical, as future package only provides the framework, and then you need some other package like furrr or future.apply to do the job. So why not furrr?

I will get to it next week. What do you think of the older PR #28 ? Did you have plans to merge it at all?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your reply, @e-kotov !

Regarding furrr: i would like to keep the number of dependencies low. future.apply has half the number of imported packages. Nevertheless, if furrr is more practical in your opinion, that's fine for me!

I have not gotten the PR #28 to work and the “future” approach looks promising to me. So if you are working on a different solution that results in some speed improvements, I would certainly accept a PR.

methods,
sf,
packcircles,
furrr,
parallelly,
future
License: GPL-3
Encoding: UTF-8
RoxygenNote: 7.2.3
RoxygenNote: 7.3.2
Suggests:
testthat (>= 3.0.0)
Config/testthat/edition: 3
45 changes: 36 additions & 9 deletions R/cartogram_ncont.R
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@
#' @param k Factor expansion for the unit with the greater value
#' @param inplace If TRUE, each polygon is modified in its original place,
#' if FALSE multi-polygons are centered on their initial centroid
#' @param n_cpu Number of cores to use. Defaults to maximum available identified with \code{\link[parallelly]{availableCores}}.
#' @return An object of the same class as x with resized polygon boundaries
#' @export
#' @importFrom methods is slot as
Expand All @@ -47,7 +48,13 @@
#'plot(nc_utm_carto[,"BIR74"], add =TRUE)
#'
#' @references Olson, J. M. (1976). Noncontiguous Area Cartograms. In The Professional Geographer, 28(4), 371-380.
cartogram_ncont <- function(x, weight, k = 1, inplace = TRUE){
cartogram_ncont <- function(
x,
weight,
k = 1,
inplace = TRUE,
n_cpu = parallelly::availableCores()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that in general it is recommended to keep the default core use to 1.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree

){
UseMethod("cartogram_ncont")
}

Expand All @@ -66,15 +73,27 @@ nc_cartogram <- function(shp, ...) {
#' @rdname cartogram_ncont
#' @importFrom sf st_as_sf
#' @export
cartogram_ncont.SpatialPolygonsDataFrame <- function(x, weight, k = 1, inplace = TRUE){
as(cartogram_ncont.sf(sf::st_as_sf(x), weight, k = k, inplace = inplace), 'Spatial')
cartogram_ncont.SpatialPolygonsDataFrame <- function(
x,
weight,
k = 1,
inplace = TRUE,
n_cores = parallelly::availableCores()
){
as(cartogram_ncont.sf(sf::st_as_sf(x), weight, k = k, inplace = inplace, n_cores = n_cores), 'Spatial')
}


#' @rdname cartogram_ncont
#' @importFrom sf st_geometry st_area st_buffer st_is_longlat
#' @export
cartogram_ncont.sf <- function(x, weight, k = 1, inplace = TRUE){
cartogram_ncont.sf <- function(
x,
weight,
k = 1,
inplace = TRUE,
n_cores = parallelly::availableCores()
) {

if (isTRUE(sf::st_is_longlat(x))) {
stop('Using an unprojected map. This function does not give correct centroids and distances for longitude/latitude data:\nUse "st_transform()" to transform coordinates to another projection.', call. = F)
Expand All @@ -92,11 +111,19 @@ cartogram_ncont.sf <- function(x, weight, k = 1, inplace = TRUE){
spdf$r <- as.numeric(sqrt( wArea/ surf))
spdf$r[spdf$r == 0] <- 0.001 # don't shrink polygons to zero area
n <- nrow(spdf)
for(i in 1:n){
sf::st_geometry(spdf)[i] <- rescalePoly.sf(spdf[i, ],
inplace = inplace,
r = spdf[i,]$r)
}
crs <- st_crs(spdf) # save crs
future::plan(future::multisession, workers = n_cores)
spdf_geometry_list <- furrr::future_map(1:n, function(i) {
rescalePoly.sf(spdf[i, ],
inplace = inplace,
r = spdf$r[i])
},
.progress = TRUE,
.options = furrr::furrr_options(seed = TRUE)
)
future::plan(future::sequential)
spdf$geometry <- do.call(c, spdf_geometry_list)
st_crs(spdf) <- crs # restore crs
spdf$r <- NULL
sf::st_buffer(spdf, 0)
}
Expand Down
26 changes: 23 additions & 3 deletions man/cartogram_ncont.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions man/nc_cartogram.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

13 changes: 13 additions & 0 deletions tests/testthat.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
# This file is part of the standard setup for testthat.
# It is recommended that you do not modify it.
#
# Where should you do additional test configuration?
# Learn more about the roles of various files in:
# * https://r-pkgs.org/testing-design.html#sec-tests-files-overview
# * https://testthat.r-lib.org/articles/special-files.html

library(testthat)
library(cartogram)


test_check("cartogram")
22 changes: 22 additions & 0 deletions tests/testthat/test-cartogram_ncont.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
test_that("nc cartogram matches expected area", {
# Load North Carolina SIDS data
nc = sf::st_read(system.file("shape/nc.shp", package="sf"), quiet = TRUE)
# transform to NAD83 / UTM zone 16N
nc_utm <- sf::st_transform(nc, 26916)

# Create cartogram
nc_utm_carto <- cartogram_ncont(nc_utm, weight = "BIR74")
cartogram_area <- as.integer((sum(nc_utm_carto |> st_area()))/1000)
expect_equal(cartogram_area, 22284872, tolerance = 0)
})

test_that("nc cartogram has crs", {
# Load North Carolina SIDS data
nc = sf::st_read(system.file("shape/nc.shp", package="sf"), quiet = TRUE)
# transform to NAD83 / UTM zone 16N
nc_utm <- sf::st_transform(nc, 26916)

# Create cartogram
nc_utm_carto <- cartogram_ncont(nc_utm, weight = "BIR74")
expect_false(is.na(sf::st_crs(nc_utm_carto)$wkt))
})
Loading