Skip computing size of child objects #631
Was the performance problem because we were repeatedly computing the size for all nested objects?
I guess ideally we'd start with the leaves and compute sizes of intermediary nodes from the size of their components, this way the work we're doing is proportional to the size of the tree?
The approach taken here seems like a reasonable workaround.
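The leaves-first idea above can be sketched roughly as follows. This is only an illustration on a plain tree: `Node`, `own_bytes`, and `sizes_bottom_up` are hypothetical names, not Ark types, and the sketch assumes no memory is shared between nodes, which does not hold for R objects in general.

```rust
// Hypothetical stand-in for a nested object; not a type from Ark.
struct Node {
    own_bytes: usize,
    children: Vec<Node>,
}

// Post-order traversal: children are sized before their parent, and each
// node's total is built from its children's totals. Every node is visited
// exactly once, so the work is proportional to the number of nodes.
// Totals are appended to `out` in post-order; the subtree total is returned.
fn sizes_bottom_up(node: &Node, out: &mut Vec<usize>) -> usize {
    let mut total = node.own_bytes;
    for child in &node.children {
        total += sizes_bottom_up(child, out);
    }
    out.push(total);
    total
}
```

Sizing each node independently from scratch would instead re-walk every subtree once per ancestor, which is the kind of repeated work the question above is asking about.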
crates/ark/src/variables/variable.rs
Outdated
         },
     }
 }

 /**
  * Create a new Variable from an R object
  */
-fn from(access_key: String, display_name: String, x: SEXP) -> Self {
+fn from(access_key: String, display_name: String, x: SEXP, compute_size: bool) -> Self {
From an API standpoint I think a builder-style approach would be cleaner. For instance adding a `compute_size()` method that you'd call after `new()` or `from()`. By default `size` would be 0.
Especially since computing the size is the less common case.
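A minimal sketch of what the builder-style suggestion could look like. The `Variable` struct here is a simplified, hypothetical stand-in, and passing the size computation as a closure is an assumption made to keep the example self-contained; the real method would presumably work from the R object.

```rust
// Hypothetical, simplified stand-in for the variable type under review.
struct Variable {
    access_key: String,
    display_name: String,
    size: usize,
}

impl Variable {
    // Cheap default path: no size is computed, `size` stays 0.
    fn new(access_key: String, display_name: String) -> Self {
        Self { access_key, display_name, size: 0 }
    }

    // Builder-style opt-in: consumes the variable, fills in the size,
    // and returns it so the call can be chained after `new()`.
    // The closure parameter is an assumption for this sketch.
    fn compute_size(mut self, size_of: impl FnOnce() -> usize) -> Self {
        self.size = size_of();
        self
    }
}
```

With this shape, the common case is just `Variable::new(key, name)`, and only callers that need the size append `.compute_size(...)`.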
Makes sense! I'll make this change
Actually, the problem is that `PositronVariable` doesn't hold any reference to the RObject, so we can't `compute_size()` later. We can do something like `var.size = object.size()` at the call site, what do you think of this approach?
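A rough sketch of the call-site approach, assuming simplified stand-in types. `RObject` and `PositronVariable` below are hypothetical reductions of the real ones, and the helper function names are invented for illustration.

```rust
// Hypothetical stand-in: the "object" is just a byte buffer here.
struct RObject {
    bytes: Vec<u8>,
}

impl RObject {
    fn size(&self) -> usize {
        self.bytes.len()
    }
}

// Hypothetical stand-in: holds no reference to the object it describes,
// so it cannot compute its own size after construction.
struct PositronVariable {
    display_name: String,
    size: usize,
}

impl PositronVariable {
    fn new(display_name: &str) -> Self {
        Self { display_name: display_name.to_string(), size: 0 }
    }
}

// Call site that still has the object in hand: assign the size there.
fn top_level_variable(name: &str, object: &RObject) -> PositronVariable {
    let mut var = PositronVariable::new(name);
    var.size = object.size();
    var
}

// Call site for child objects: skip the size and keep the default of 0.
fn child_variable(name: &str) -> PositronVariable {
    PositronVariable::new(name)
}
```

The trade-off is that the decision to compute a size lives with the caller, which is the only place where both the variable and the object are available.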
yep sounds good
I didn't do a detailed benchmark but, when we expand the children, we return a list of
Since we can't really tell which child object ultimately owns the actual data.frame, we would also probably
E.g. if we try to size each element of a list built like:
Anyway, skipping the size computation for child objects makes expanding the model run in ~300ms, whereas computing the sizes brings the total time to around ~1300ms.
15b4c4f to a064f87
a064f87 to 4122b8f
This speeds up expanding objects with a lot of self-references, whose size is very slow to compute.
Relates to posit-dev/positron#4636
Built on top of #629