CLN/DEPS: Clean up post numpy bump to 1.12 #23796
```diff
@@ -267,6 +267,8 @@ def maybe_promote(dtype, fill_value=np.nan):
         # for now: refuse to upcast datetime64
         # (this is because datetime64 will not implicitly upconvert
         # to object correctly as of numpy 1.6.1)
+        # TODO: remove old numpy compat code (without introducing segfault for
+        # tests/test_take.py::TestTake::test_2d_datetime64)
         if isna(fill_value):
             fill_value = iNaT
         else:
```

Review comment: So rather than do it this way, just make the top level 2 checks (one for timedelta, one for datetime); then you don't need to check twice.
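Background sketch (plain NumPy, not part of the diff): the reason a datetime64 block can be filled without upcasting to object is that NumPy reserves the minimum int64 value as its NaT sentinel, which is exactly what pandas' `iNaT` is.

```python
import numpy as np

# pandas' iNaT sentinel coincides with NumPy's NaT representation:
# the minimum int64 value, interpreted as nanoseconds since the epoch.
iNaT = np.iinfo(np.int64).min

print(np.datetime64(iNaT, 'ns'))  # NaT

# So a missing fill_value can live inside a datetime64 block directly,
# with no upcast to object dtype.
arr = np.array([0, iNaT], dtype='datetime64[ns]')
print(arr)        # ['1970-01-01T00:00:00.000000000' 'NaT']
print(arr.dtype)  # datetime64[ns]
```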
```diff
@@ -723,19 +725,30 @@ def astype_nansafe(arr, dtype, copy=True, skipna=False):

     elif is_object_dtype(arr):

-        # work around NumPy brokenness, #1987
-        if np.issubdtype(dtype.type, np.integer):
-            return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
-
         # if we have a datetime/timedelta array of objects
         # then coerce to a proper dtype and recall astype_nansafe

+        if is_timedelta64_dtype(dtype):
+            # TODO: this is an old numpy compat branch that is not necessary
+            # anymore for its original purpose (unsafe casting from object to
+            # int, see GH 1987).
+            # Currently, timedelta dtypes get routed through here, whereas
+            # uncommenting the branch below would re-call
+            # >>> astype_nansafe(to_timedelta(arr).values, dtype, copy=copy)
+            # and end up in the `is_timedelta64_dtype(arr)` branch above, which
+            # explicitly and deliberately returns a float dtype.
+            # However, the test
+            # reshape/merge/test_merge.py::TestMerge::test_other_timedelta_unit
+            # expects an explicit timedelta dtype as output.
+            # Once this is fixed, `astype_intsafe` can be deleted from lib.
+            return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
+
         elif is_datetime64_dtype(dtype):
             from pandas import to_datetime
             return astype_nansafe(to_datetime(arr).values, dtype, copy=copy)
-        elif is_timedelta64_dtype(dtype):
-            from pandas import to_timedelta
-            return astype_nansafe(to_timedelta(arr).values, dtype, copy=copy)
+        # elif is_timedelta64_dtype(dtype):
+        #     from pandas import to_timedelta
+        #     return astype_nansafe(to_timedelta(arr).values, dtype, copy=copy)

     if dtype.name in ("datetime64", "timedelta64"):
         msg = ("The '{dtype}' dtype has no unit. "
```

Review comment: @jbrockmendel Could you have a look at this? I couldn't fix it easily, and I'm not sure how relevant that test is.

Reply: Yikes. Not sure off the top what's going on here. Why the comment saying it is no longer necessary? np.timedelta64 is still a subdtype of np.integer. To the extent that this branch is just for timedelta64 dtypes, I would make that explicit instead of checking …

Reply: The original comment (and issue) was about unsafe casting from object to int. That's not the case for timedelta types, and even more so, there's a dedicated branch for such dtypes below that is never hit (and actually leads to a test failure in …).

Review comment: These comments are way too long. What exactly are you trying to do here? Let's leave this entire casting as is for now, without any changes; this needs a separate follow-up.

Reply: I'm trying to leave some breadcrumbs, as I spent 2-3 hours trying to investigate this. I've shortened the comment now.

Reply: I would rather you not, and just revert back to what it was. If you want to make an issue, OK.
```diff
@@ -408,6 +408,7 @@ def array_equivalent(left, right, strict_nan=False):

     # Object arrays can contain None, NaN and NaT.
     # string dtypes must be come to this path for NumPy 1.7.1 compat
+    # TODO: remove old numpy compat code (or comment)
     if is_string_dtype(left) or is_string_dtype(right):

         if not strict_nan:
```

Review comment: In this case, I'm not sure what's to be done here (expand below; no numpy code to be found).

Reply: Remove this comment.

Reply: I don't think that "for NumPy 1.7.1 compat" should be left standing, but OK.
```diff
@@ -1458,11 +1458,6 @@ def quantile(self, qs, interpolation='linear', axis=0, axes=None):

         def _nanpercentile1D(values, mask, q, **kw):
-            # mask is Union[ExtensionArray, ndarray]
-            # we convert to an ndarray for NumPy 1.9 compat, which didn't
-            # treat boolean-like arrays as boolean. This conversion would have
-            # been done inside ndarray.__getitem__ anyway, since values is
-            # an ndarray at this point.
             mask = np.asarray(mask)
             values = values[~mask]

             if len(values) == 0:
```
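The retained `mask = np.asarray(mask)` plus boolean indexing is all the NumPy machinery this path needs; a minimal sketch of the masking step:

```python
import numpy as np

values = np.array([1.0, np.nan, 3.0, np.nan])
mask = np.isnan(values)

# np.asarray is a no-op for an ndarray mask; the point of keeping it is
# that the mask may arrive as a boolean-like ExtensionArray.
mask = np.asarray(mask)
print(np.percentile(values[~mask], 50))  # 2.0 over the non-NA values
```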
```diff
@@ -2781,9 +2776,7 @@ def set(self, locs, values, check=False):
         -------
         None
         """
-        if values.dtype != _NS_DTYPE:
-            # Workaround for numpy 1.6 bug
-            values = conversion.ensure_datetime64ns(values)
+        values = conversion.ensure_datetime64ns(values, copy=False)

         self.values[locs] = values

```
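The dropped dtype check was redundant: normalizing to nanosecond datetime64 is effectively a pass-through when the input already has that unit. A rough NumPy-only stand-in for what `conversion.ensure_datetime64ns` provides (the `to_ns` helper is hypothetical and ignores the overflow checks the real routine performs):

```python
import numpy as np

def to_ns(values, copy=False):
    # Hypothetical sketch of ensure_datetime64ns: normalize any
    # datetime64 unit to nanoseconds, copying only when needed.
    if values.dtype == np.dtype('M8[ns]') and not copy:
        return values
    return values.astype('M8[ns]')

us = np.array(['2018-01-01'], dtype='M8[us]')
ns = to_ns(us)
print(ns.dtype)         # datetime64[ns]
print(to_ns(ns) is ns)  # True: already-ns input passes through untouched
```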
```diff
@@ -3102,7 +3095,7 @@ def _merge_blocks(blocks, dtype=None, _can_consolidate=True):
         # FIXME: optimization potential in case all mgrs contain slices and
         # combination of those slices is a slice, too.
         new_mgr_locs = np.concatenate([b.mgr_locs.as_array for b in blocks])
-        new_values = _vstack([b.values for b in blocks], dtype)
+        new_values = np.vstack([b.values for b in blocks])

         argsort = np.argsort(new_mgr_locs)
         new_values = new_values[argsort]
```
```diff
@@ -3114,17 +3107,6 @@ def _merge_blocks(blocks, dtype=None, _can_consolidate=True):
     return blocks


-def _vstack(to_stack, dtype):
-
-    # work around NumPy 1.6 bug
-    if dtype == _NS_DTYPE or dtype == _TD_DTYPE:
-        new_values = np.vstack([x.view('i8') for x in to_stack])
-        return new_values.view(dtype)
-
-    else:
-        return np.vstack(to_stack)
-
-
 def _block2d_to_blocknd(values, placement, shape, labels, ref_items):
     """ pivot to the labels shape """
     panel_shape = (len(placement),) + shape
```

Review comment: This was only used once, and doesn't need to be defined at all once the shim isn't necessary anymore.
```diff
@@ -248,9 +248,6 @@ def __getstate__(self):

     def __setstate__(self, state):
         def unpickle_block(values, mgr_locs):
-            # numpy < 1.7 pickle compat
-            if values.dtype == 'M8[us]':
-                values = values.astype('M8[ns]')
             return make_block(values, placement=mgr_locs)

         if (isinstance(state, tuple) and len(state) >= 4 and
```
```diff
@@ -776,18 +773,6 @@ def _interleave(self):

         result = np.empty(self.shape, dtype=dtype)

-        if result.shape[0] == 0:
-            # Workaround for numpy 1.7 bug:
-            #
-            #     >>> a = np.empty((0,10))
-            #     >>> a[slice(0,0)]
-            #     array([], shape=(0, 10), dtype=float64)
-            #     >>> a[[]]
-            #     Traceback (most recent call last):
-            #       File "<stdin>", line 1, in <module>
-            #     IndexError: index 0 is out of bounds for axis 0 with size 0
-            return result
-
         itemmask = np.zeros(self.shape[0])

         for blk in self.blocks:
```
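The removed workaround is easy to exercise directly; on any NumPy this PR targets (>= 1.12), both indexing forms succeed:

```python
import numpy as np

a = np.empty((0, 10))

# Both slice and fancy indexing of an empty first axis return an empty
# (0, 10) result; the IndexError quoted above no longer occurs.
print(a[slice(0, 0)].shape)  # (0, 10)
print(a[[]].shape)           # (0, 10)
```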
```diff
@@ -1170,8 +1155,7 @@ def insert(self, loc, item, value, allow_duplicates=False):
             blk.mgr_locs = new_mgr_locs

         if loc == self._blklocs.shape[0]:
-            # np.append is a lot faster (at least in numpy 1.7.1), let's use it
-            # if we can.
+            # np.append is a lot faster, let's use it if we can.
             self._blklocs = np.append(self._blklocs, 0)
             self._blknos = np.append(self._blknos, len(self.blocks))
         else:
```

Review comment: No code changed here, just the comment.
```diff
@@ -1995,13 +1979,9 @@ def _transform_index(index, func, level=None):

 def _fast_count_smallints(arr):
     """Faster version of set(arr) for sequences of small numbers."""
-    if len(arr) == 0:
-        # Handle empty arr case separately: numpy 1.6 chokes on that.
-        return np.empty((0, 2), dtype=arr.dtype)
-    else:
-        counts = np.bincount(arr.astype(np.int_))
-        nz = counts.nonzero()[0]
-        return np.c_[nz, counts[nz]]
+    counts = np.bincount(arr.astype(np.int_))
+    nz = counts.nonzero()[0]
+    return np.c_[nz, counts[nz]]


 def _preprocess_slice_or_indexer(slice_or_indexer, length, allow_fill):
```

Review comment: Is this used often? Is it worth keeping as a standalone function vs. inlining?

Reply: Got no strong opinion on that.

Reply: It's called twice as an operand for a loop, so OK to keep.
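A standalone copy of the simplified helper (reproduced here for illustration), showing why the empty-input special case could go: modern `np.bincount` accepts empty integer input:

```python
import numpy as np

def fast_count_smallints(arr):
    # Rows of (value, count) for every value with a non-zero count.
    counts = np.bincount(arr.astype(np.int_))
    nz = counts.nonzero()[0]
    return np.c_[nz, counts[nz]]

print(fast_count_smallints(np.array([1, 1, 3])))
# [[1 2]
#  [3 1]]

# Empty input simply yields zero rows; no special case needed.
print(fast_count_smallints(np.array([], dtype=np.int_)).shape)  # (0, 2)
```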
```diff
@@ -251,7 +251,7 @@ def dtype_for(t):
             'complex128': np.float64,
             'complex64': np.float32}

-# numpy 1.6.1 compat
+# windows (32 bit) compat
 if hasattr(np, 'float128'):
     c2f_dict['complex256'] = np.float128
```

Review comment: This is not just numpy compat (hence the removed comment) - including …

Reply: Maybe a comment to that effect? "windows compat" or "windows 32 bit compat" or "[whatever would be accurate] compat"?

Reply: OK.
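A minimal sketch of the guarded mapping, assuming only NumPy: the extended-precision types exist or not depending on the platform's long double, not on the NumPy version:

```python
import numpy as np

c2f_dict = {'complex128': np.float64,
            'complex64': np.float32}

# np.float128 / np.complex256 are only defined where the C long double
# is extended-precision; 32-bit Windows builds lack them, hence the
# hasattr guard rather than a NumPy version check.
if hasattr(np, 'float128'):
    c2f_dict['complex256'] = np.float128

print(sorted(c2f_dict))
```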
```diff
@@ -37,7 +37,6 @@ def test_big_print(self):
     def test_empty_print(self):
         factor = Categorical([], ["a", "b", "c"])
         expected = ("[], Categories (3, object): [a, b, c]")
-        # hack because array_repr changed in numpy > 1.6.x
         actual = repr(factor)
         assert actual == expected
```

Review comment: Fair enough to say that the repr of np.array has arrived, no? No need to call getting a list from it a hack, IMO.

Review comment (on the maybe_promote TODO above): Replacing this compat code with … leads to a segfault in tests/test_take.py::TestTake::test_2d_datetime64; I have not investigated further.