[python/r] Enforce dataframe domain lower bound == 0 #3300

Merged: 7 commits, Nov 7, 2024

@johnkerl (Member) commented Nov 6, 2024

Arose during discussion of #2407 / [sc-51048], but the defect here predates the new-shape work.

Update 2024-11-20: please see #3358 which reverts this.

codecov bot commented Nov 6, 2024

Codecov Report

Attention: Patch coverage is 85.71429% with 1 line in your changes missing coverage. Please review.

Project coverage is 85.50%. Comparing base (208c0ab) to head (b1f55f3).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3300      +/-   ##
==========================================
+ Coverage   85.37%   85.50%   +0.12%     
==========================================
  Files          52       52              
  Lines        5499     5506       +7     
==========================================
+ Hits         4695     4708      +13     
+ Misses        804      798       -6     
Flag Coverage Δ
python 85.50% <85.71%> (+0.12%) ⬆️

Components Coverage Δ
python_api 85.50% <85.71%> (+0.12%) ⬆️
libtiledbsoma ∅ <ø> (∅)

@mojaveazure (Member)

I'm confused: does the lowest value for soma_joinid have to be 0, or does it have to be 0 or greater? IIUC, setting soma_joinid to c(10, 20, 100) was valid, but the check added here seems to say otherwise.

@johnkerl (Member, Author) commented Nov 6, 2024

@mojaveazure it means the soma_joinid domain's lower bound must be 0 at create time.

You can still write soma_joinid values such as 10, 20, 30 at write time.
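
To illustrate the distinction drawn above, here is a minimal Python sketch. It is not the tiledbsoma implementation: the `Domain` alias and both function names are hypothetical, and the domain is assumed to be a plain `(lower, upper)` pair of integers. Only the error message is taken from this PR.

```python
from typing import Iterable, Tuple

# Hypothetical representation: the soma_joinid domain as a (lower, upper) pair.
Domain = Tuple[int, int]


def check_create_time_domain(domain: Domain) -> None:
    """Create-time rule from this PR: the lower slot of the domain must be 0."""
    lower, _upper = domain
    if lower != 0:
        raise ValueError("The lower bound for soma_joinid domain must be 0")


def values_fit_domain(values: Iterable[int], domain: Domain) -> bool:
    """Write-time rule (assumed): values only need to lie inside the domain."""
    lower, upper = domain
    return all(lower <= v <= upper for v in values)


# The domain starts at 0, so it is accepted at create time.
check_create_time_domain((0, 99))

# The written values do not need to start at 0.
print(values_fit_domain([10, 20, 30], (0, 99)))  # True
```

Under these assumptions, a domain such as `(10, 99)` is rejected at create time, while writing the values 10, 20, 30 into a `(0, 99)` domain is fine.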

@mojaveazure (Member) left a comment

LGTM from the R side; I left one minor comment that can be ignored if need be. Please remember to bump the develop version and update the changelog when this gets shipped.

@johnkerl (Member, Author) commented Nov 6, 2024

@mojaveazure re "left one minor comment that can be ignored if need be": all I see is #3300 (comment) ... do you have another comment still pending?

@johnkerl (Member, Author) commented Nov 6, 2024

@mojaveazure thanks!

@nguyenv any thoughts on the Python side? (I suspect not; I think this is a very basic thing, one that in fact I thought we were already doing, but I still want to check with you.)

Comment on lines 69 to 70
lower <- domain[["soma_joinid"]][1]
stopifnot("The lower bound for soma_joinid domain must be 0" = lower == 0)
@mojaveazure (Member) commented Nov 6, 2024

Apparently, this never got submitted with my review 🤦 Is the order required, or can we allow alternate ordering so long as we have a min and max? If ordering is required, we should check for that; otherwise, we should adjust this check to test the lower bound rather than the first value.

Suggested change
- lower <- domain[["soma_joinid"]][1]
- stopifnot("The lower bound for soma_joinid domain must be 0" = lower == 0)
+ stopifnot("The lower bound for soma_joinid domain must be 0" = min(domain[["soma_joinid"]]) == 0)

(@johnkerl feel free to ignore this if neither check is worth implementing)

@johnkerl (Member, Author)

@mojaveazure the lower slot needs to be 0. Only that.

@johnkerl (Member, Author)

They can't say (9, 0); they have to say (0, 9).

@mojaveazure (Member) commented Nov 6, 2024

Then maybe also include a check that domain[["soma_joinid"]][2L] > domain[["soma_joinid"]][1L] (or >= if both are allowed to be 0)? (I'm assuming c(0L, -10L) is disallowed, but not sure)

@johnkerl (Member, Author)

Good call, thanks @mojaveazure!
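
The two checks discussed in this review thread behave differently on a reversed pair, which the following Python sketch tries to make concrete. The function names are hypothetical and this is not the code merged in the PR. The sketch also assumes that an upper bound equal to the lower bound is allowed, which the thread leaves open.

```python
from typing import Tuple


def min_is_zero(domain: Tuple[int, int]) -> bool:
    """Order-insensitive variant: only the smaller of the two slots must be 0."""
    return min(domain) == 0


def validate_joinid_domain(domain: Tuple[int, int]) -> None:
    """Slot-based variant: the first slot must be 0 and the second must not precede it."""
    lower, upper = domain
    if lower != 0:
        raise ValueError("The lower bound for soma_joinid domain must be 0")
    # Assumption: upper == lower is permitted; use `<=` here to forbid it.
    if upper < lower:
        raise ValueError("The upper bound for soma_joinid domain must be >= the lower bound")


def is_valid(domain: Tuple[int, int]) -> bool:
    """Convenience wrapper that turns the validation error into a boolean."""
    try:
        validate_joinid_domain(domain)
        return True
    except ValueError:
        return False


for candidate in [(0, 9), (9, 0), (0, -10)]:
    print(candidate, min_is_zero(candidate), is_valid(candidate))
```

With these definitions, `(9, 0)` passes the `min()`-based check but fails the slot-based one, and `(0, -10)` is rejected by the ordering check even though its first slot is 0.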

@nguyenv (Member) left a comment

lgtm!

apis/python/src/tiledbsoma/_dataframe.py (resolved)
@johnkerl (Member, Author) commented Nov 7, 2024

Thanks @nguyenv!

@johnkerl force-pushed the kerl/sdf-sjid-lower-zero branch from 5f01962 to b1f55f3 on November 7, 2024 19:47
@johnkerl merged commit 017aad0 into main on Nov 7, 2024
14 checks passed
@johnkerl deleted the kerl/sdf-sjid-lower-zero branch on November 7, 2024 20:33
johnkerl added a commit that referenced this pull request Nov 20, 2024
johnkerl added a commit that referenced this pull request Nov 20, 2024
johnkerl added a commit that referenced this pull request Nov 20, 2024
johnkerl added a commit that referenced this pull request Nov 21, 2024