-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
progress bar and parallel compute for cartogram_ncont #36
base: master
Are you sure you want to change the base?
Conversation
Update:
|
weight, | ||
k = 1, | ||
inplace = TRUE, | ||
n_cpu = parallelly::availableCores() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that in general it is recommended to keep the default core use to 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree
@@ -12,8 +12,16 @@ Authors@R: c( | |||
Description: Construct continuous and non-contiguous area cartograms. | |||
URL: https://github.com/sjewo/cartogram, https://sjewo.github.io/cartogram/ | |||
BugReports: https://github.com/sjewo/cartogram/issues | |||
Imports: methods, sf, packcircles | |||
Suggests: | |||
Imports: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, the furrr, parallelly, and future packages could be moved to Suggests: and the code related to them may be only used if the progress bar is on.
(I also think there should be an option to disable the progress bar)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Nowosad On a second thought, that would imply using the for loop (or purrr::map) by default with no progress bar... (though purrr::map also has progress bar). So kind of defeats the purpose. And why would parallel processing rely on the progress bar setting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your suggestions and your work on implementing parallel processing @e-kotov! I haven't had much time to keep an eye on this topic in the past.
I would like to keep the imports as small as possible.
- Maybe
utils::txtProgressBar()
is sufficient for adding a progress bar without additional imports? - I would also prefer to move the parallelly, and future packages to Suggests, as @Nowosad suggested, and add an if-clause for
n_cpu > 1
to use futures parallel execution. - I don't want to be petty, but is 'furrr' absolutely necessary? If the code can only be implemented with the future package, I would prefer that.
Best,
Sebastian
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sjewo points taken. I'm not sure excluding furrr
is practical, as future
package only provides the framework, and then you need some other package like furrr
or future.apply
to do the job. So why not furrr
?
I will get to it next week. What do you think of the older PR #28 ? Did you have plans to merge it at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your reply, @e-kotov !
Regarding furrr
: i would like to keep the number of dependencies low. future.apply
has half the number of imported packages. Nevertheless, if furrr
is more practical in your opinion, that's fine for me!
I have not gotten the PR #28 to work and the “future” approach looks promising to me. So if you are working on a different solution that results in some speed improvements, I would certainly accept a PR.
@Nowosad I will work some more on this PR in the coming days. Clearly it is not ready for merging. There is also something funny happening with the resulting cartogram. It does not show itself when applied to the test Error in `st_transform.sfc()`:
! cannot transform sfc object with missing crs This may have something to do with attributes that I fail to restore to the geom column. I will look into this and update the PR. |
Hi, trying to solve #35 myself I started with the simplest function
cartogram_ncont()
. I know there is an older pull request #28 , but this approach with just building onfurrr
andfuture
seems simpler to me. Arguably it can be improved with better handling of the workers. E.g. instead of users specifying then_cores
incartogram_ncont()
they should be able to set the number of workers and type of parallel processing withfuture
outside the function.This PR results in
cartogram_ncont()
gaining both progress bar and parallel processing.Before making changes to the code, I ran your default example from
cartogram_ncont()
function and obtained a control value of area of the resulting cartogram fornc
fromsf
package. I wrote simple test for this area in thetests
folder using{testthat}
. The test passes with my updated code forcartogram_ncont()
.