Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #610 using unique constraint on subsets of data when fitting a model #611

Merged
merged 9 commits into from
Oct 29, 2021
Merged

Fixes #610 using unique constraint on subsets of data when fitting a model #611

merged 9 commits into from
Oct 29, 2021

Conversation

xamm
Copy link
Contributor

@xamm xamm commented Oct 21, 2021

Fixes issue #610 and allows fitting a model to a subset of data when also specifying the use of a Unique constraint.

The indexes are reset before each fitting process by using .reset_index(drop=True) for all data passed to the fit method.

@xamm xamm requested a review from a team as a code owner October 21, 2021 07:44
@xamm xamm requested review from pvk-developer and removed request for a team October 21, 2021 07:44
sdv/tabular/base.py Outdated Show resolved Hide resolved
tests/integration/tabular/test_base.py Outdated Show resolved Hide resolved
tests/integration/tabular/test_base.py Show resolved Hide resolved
tests/integration/tabular/test_base.py Outdated Show resolved Hide resolved
Copy link
Contributor

@katxiao katxiao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Waiting for one more approval.

Copy link
Contributor

@csala csala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this proposal @xamm !
I added a couple of comments about minor details that could be improved, but other than that, the PR seems correct. Once those minor things are addressed I think we can merge.

tests/integration/tabular/test_base.py Outdated Show resolved Hide resolved
tests/integration/tabular/test_base.py Show resolved Hide resolved
- moves sampling to Run section from Assert
- assert that Unique column is really unique
- add link to Github Issue in test
@xamm xamm requested a review from csala October 29, 2021 09:03

# Assert
assert len(model.sample(2)) == 2
assert len(samples) == 2
assert samples["error_column"].is_unique
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this is cool! I did not know about this is_unique attribute!

Copy link
Contributor

@csala csala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the comments @xamm
This now looks good to go for me! Feel free to merge when ready @katxiao

@katxiao katxiao merged commit 3a9cb24 into sdv-dev:master Oct 29, 2021
@xamm xamm deleted the gh-610-unique-constraint-on-subsets branch November 9, 2021 13:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants