Fixes #610 using unique constraint on subsets of data when fitting a model #611
- sample less than the rows used for fitting
- sample not only one row
LGTM! Waiting for one more approval.
Thanks for this proposal @xamm !
I added a couple of comments about minor details that could be improved, but other than that, the PR seems correct. Once those minor things are addressed I think we can merge.
- moves sampling to Run section from Assert
- assert that Unique column is really unique
- add link to GitHub issue in test
# Assert
assert len(model.sample(2)) == 2
assert len(samples) == 2
assert samples["error_column"].is_unique
Oh, this is cool! I did not know about this is_unique attribute!
Fixes issue #610 and allows fitting a model to a subset of data when also specifying the use of a Unique constraint. The indexes are reset before each fitting process by using .reset_index(drop=True) for all data passed to the fit method.
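
To illustrate the index behaviour described above, here is a minimal pandas-only sketch. It does not reproduce the library's internals or the actual test from this PR; the DataFrame contents and the `value` column are hypothetical, and only the `error_column` name is borrowed from the test snippet. It shows that a subset keeps the row labels of its parent frame, and that `.reset_index(drop=True)` replaces them with a contiguous range.

```python
import pandas as pd

# Hypothetical data; only the "error_column" name comes from the PR's test.
data = pd.DataFrame({
    "error_column": ["a", "b", "c", "d", "e", "f"],
    "value": [1, 2, 3, 4, 5, 6],
})

# A subset keeps the row labels of the original frame, so the index
# no longer starts at 0.
subset = data[data["value"] > 3]
print(list(subset.index))  # [3, 4, 5]

# reset_index(drop=True) discards the old labels and assigns 0..n-1.
clean = subset.reset_index(drop=True)
print(list(clean.index))  # [0, 1, 2]

# Series.is_unique is True when the column has no repeated values.
print(clean["error_column"].is_unique)  # True
```

Passing `drop=True` matters here: without it, the old labels would be kept as an extra `index` column in the data.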