Adding progress bar display func #951
Not printing yet; looking for further insight. @krlmlr, any ideas?
Thanks for working on it! I think this should be an R option with a callback that is called when the option is set, see
Or, perhaps even a slot in the
Well, I'm almost done with the callback, so I'll test that first.
b0cdaba to 560eee1
Oh, the DUCKDB_DISABLE_PRINT flag probably does not help.
All right, it works; now ironing out the bugs.
…t 100, then finish.
library(duckdb)
library(cli)
progress <- function(x) {
  if (cli::cli_progress_num() == 0) {
    cli::cli_progress_bar("Duckdb SQL", total = 100, .envir = .GlobalEnv)
  }
  cli::cli_progress_update(set = x, .envir = .GlobalEnv)
  if (x > 100) {
    cli::cli_progress_done(.envir = .GlobalEnv)
  }
}
options("duckdb.progress_display" = progress)
conn <- duckdb::dbConnect(duckdb::duckdb())
duckdb::dbSendQuery(conn, "SET progress_bar_time = 0;")
q <- "CREATE OR REPLACE TABLE BOB AS (
SELECT * FROM 'ldbc-sf300-comments-creationDate.parquet')"
duckdb::dbSendQuery(conn, q)
#ifndef DUCKDB_DISABLE_PRINT seems redundant here, since the same guard is already used in printer.cpp (https://github.com/duckdb/duckdb/blob/main/src/common/printer.cpp). It also prevents using a display set via config.create_display_func when DuckDB is compiled with the flag -DDUCKDB_DISABLE_PRINT, as the duckdb-r package is, which is where I'm trying to implement a display (duckdb/duckdb-r#951). The call chain is PrintProgress -> TerminalProgressBarDisplay::Update -> TerminalProgressBarDisplay::PrintProgressInternal -> Printer::RawPrint, and the macro is already checked there. In addition, there is already a config option, enable_progress_bar, whose default is FALSE. So, can it be removed? cc: @krlmlr
I'm done on this one. Let me know if this works for you.
Testing with
library(spanishoddata)
library(duckdb)
library(tidyverse)
x_dates <- c("2022-01-01", "2022-01-02", "2022-01-03", "2022-01-04")
x <- spod_get(type = "od", zones = "distr", dates = x_dates)
dbGetQuery(x$src$con, "SELECT current_setting('enable_progress_bar');")
dbSendQuery(x$src$con, "SET enable_progress_bar = true;")
dbGetQuery(x$src$con, "SELECT current_setting('enable_progress_bar');")
progress <- function(x) {
  if (cli::cli_progress_num() == 0) {
    cli::cli_progress_bar("Duckdb SQL", total = 100, .envir = .GlobalEnv)
  }
  cli::cli_progress_update(set = x, .envir = .GlobalEnv)
  if (x > 100) {
    cli::cli_progress_done(.envir = .GlobalEnv)
  }
}
options("duckdb.progress_display" = progress)
duckdb::dbSendQuery(x$src$con, "SET progress_bar_time = 0;")
xx <- x |> group_by(id_origin, date, activity_origin) |> summarise(mean_trips = mean(n_trips)) |> collect()
And it works! @meztez, do we have to manually define the progress function, though? What is the final idea of this PR? I would expect the progress bar to just 'magically' appear as soon as we do: dbGetQuery(x$src$con, "SELECT current_setting('enable_progress_bar');")
p.s. in my case
It could provide a dummy default. It's just a function(x) called with the progress percentage from within duckdb-r. I'm not the package maintainer and I just needed it for a deliverable, so whatever works is fine by me.
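To make the "dummy default" idea concrete, here is a minimal sketch that uses only base R's utils::txtProgressBar. The name make_default_progress is hypothetical and not part of duckdb-r; the only thing taken from this thread is that the callback is a function(x) receiving a percentage. This sketch treats any value of 100 or more as "done".

```r
# Hypothetical helper, not part of duckdb-r: builds a function(x) callback
# that draws a base-R text progress bar from a 0-100 percentage.
make_default_progress <- function(file = stderr()) {
  bar <- NULL  # created lazily on the first update
  function(x) {
    # Clamp the reported percentage to the 0-100 range.
    pct <- max(0, min(100, x))
    if (is.null(bar)) {
      bar <<- utils::txtProgressBar(min = 0, max = 100, style = 3, file = file)
    }
    utils::setTxtProgressBar(bar, pct)
    # Assumption: 100 or more means the query has finished.
    if (pct >= 100) {
      close(bar)
      bar <<- NULL  # the next query starts with a fresh bar
    }
    invisible(pct)
  }
}
```

It could then be registered the same way as the cli-based examples above, with options("duckdb.progress_display" = make_default_progress()).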
Thanks for the PR! Looking at the implementation, I think the callback function should be a slot in the connection object. There could be basic reporting (opt-out, in interactive mode only) in the duckdb R package, and more sophisticated progress in duckplyr.
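One way the "opt-out, in interactive mode only" behaviour could be decided is sketched below. For simplicity it reads the duckdb.progress_display option used earlier in this thread rather than a connection slot. The function name resolve_progress_callback and the convention that setting the option to FALSE opts out are assumptions for illustration, not anything the package is known to implement.

```r
# Hypothetical helper, not part of duckdb-r: picks the progress callback to use.
resolve_progress_callback <- function(default = function(x) invisible(x),
                                      is_interactive = interactive()) {
  user_cb <- getOption("duckdb.progress_display")
  # A user-supplied function always wins.
  if (is.function(user_cb)) {
    return(user_cb)
  }
  # Assumed opt-out convention: the option is explicitly set to FALSE.
  if (isFALSE(user_cb)) {
    return(NULL)
  }
  # Otherwise report progress only in interactive sessions.
  if (is_interactive) default else NULL
}
```

A NULL result would mean "do not report progress", so scripts and batch jobs stay quiet unless the user sets a callback.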
@meztez, that totally makes sense. Thanks for the work on the internals to make this possible! Looking forward to this being merged!
In the above examples, (x > 100) indicates that the processing is complete. Shouldn't that be (x >= 100)? I think it's more common to consider 100% to indicate "done" than "still processing".
progress <- function(x) {
  if (x < 100 && cli::cli_progress_num() == 0) {
    cli::cli_progress_bar("Duckdb SQL", total = 100, .envir = .GlobalEnv)
  }
  cli::cli_progress_update(set = x, .envir = .GlobalEnv)
}
options("duckdb.progress_display" = progress)
For #199.