COMPAT/TEST test, fix for unsafe Vector.resize(), which allows refche… #16258
Codecov Report
@@ Coverage Diff @@
## master #16258 +/- ##
==========================================
- Coverage 90.32% 90.3% -0.02%
==========================================
Files 167 167
Lines 50907 50907
==========================================
- Hits 45982 45973 -9
- Misses 4925 4934 +9
Codecov Report
@@ Coverage Diff @@
## master #16258 +/- ##
==========================================
+ Coverage 90.34% 90.35% +<.01%
==========================================
Files 167 161 -6
Lines 50908 50863 -45
==========================================
- Hits 45994 45957 -37
+ Misses 4914 4906 -8
Sorry it took so many builds to get the tests to pass; I am still learning the build system.
@@ -52,6 +52,7 @@ cdef struct Int64VectorData:
cdef class Int64Vector:
    cdef Int64VectorData *data
    cdef ndarray ao
    cdef bint external_view_exists
are you only doing this for Int64 for some reason?
The other Vector classes are defined in pandas/_libs/hashtable_class_helper.pxi.in; for some reason the original code defines this class here specifically. So I added the cdef bint external_view_exists where the other classes are defined.
@chris-b1 any opinions?
Looks reasonable to me
looks fine. minor comments. thank you.
self.ao.resize(self.data.n)
if self.data.m != self.data.n:
    if self.external_view_exists:
        raise ValueError("should have raised on append(), m=%d n=%d" % (self.data.m, self.data.n))
can you use .format() syntax and wrap the line
pandas/tests/test_algos.py
Outdated
# to_array resizes the vector
uniques.to_array()
htable.get_labels(vals, uniques, 0, -1)
# to_array may resize the vector
can you put a 1-line comment here on what you are checking
@jreback rather than use format() in the error message, I simply shortened it. Thanks for the review.
pandas/_libs/hashtable.pyx
Outdated
uniques = ObjectVector()
data = self.uniques.to_array()
for v in data:
    uniques.append(v)
for consistency maybe we should add .extend to String/Object Vectors. can you add?
thanks!
pandas-dev#16258) * COMPAT/TEST test, fix for unsafe Vector.resize(), which allows refcheck=False * COMPAT/TEST improve error msg, document test as per review * COMPAT/TEST unify interfaces as per review
closes issue #15854, supersedes pull request #16224, #16193
Adds a test showing how the uniques attribute leaks to user space, and how calling get_labels() again with different data could change the underlying ndarray. With this pull request an exception will be raised when append() is called after to_array(), which makes the test pass. It also allows addition of the refcheck=False kwarg to ndarray.resize(), which fixes the issue above.
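The guard described above can be illustrated with a small pure-Python model. This is only a sketch under stated assumptions, not the Cython code from this pull request: the class name GuardedVector, the list-backed buffer, and the doubling growth policy are hypothetical stand-ins, and only the external_view_exists flag and the append()/to_array() names come from the diff and description.

```python
class GuardedVector:
    """Toy model of a growable vector (hypothetical; not the pandas class)."""

    def __init__(self, capacity=2):
        self._buf = [0] * capacity          # backing storage, like the ndarray `ao`
        self._n = 0                         # number of slots actually used
        self.external_view_exists = False   # set once the buffer is handed out

    def append(self, value):
        if self._n == len(self._buf):       # buffer is full, a resize is needed
            if self.external_view_exists:
                # Growing now would change storage the caller already holds.
                raise ValueError("external reference but resize needed")
            # Assumed growth policy for this sketch: double the capacity.
            self._buf.extend([0] * max(len(self._buf), 1))
        self._buf[self._n] = value
        self._n += 1

    def to_array(self):
        if len(self._buf) != self._n:
            if self.external_view_exists:
                raise ValueError("should have raised on append()")
            del self._buf[self._n:]         # trim unused capacity in place
        self.external_view_exists = True    # the buffer now leaks to the caller
        return self._buf


vec = GuardedVector()
for x in (1, 2, 3):
    vec.append(x)

view = vec.to_array()       # the caller now shares the underlying buffer

try:
    vec.append(4)           # would need a resize, so the guard fires
    raised = False
except ValueError:
    raised = True
```

In this model the exported buffer can no longer change underneath the caller: once to_array() has been called, any append() that would require resizing fails loudly instead of silently modifying shared storage.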