TST: Parametrized index tests #20624

WillAyd · 2018-04-06T16:41:31Z

I came across this module while working on another change and noticed that a lot of the tests could really use refactoring. There's a ton more to be done with this module, but I'm submitting as is so the PR doesn't get too large.

I can either add more commits on top of this, or this can be merged (assuming it looks OK) and I'll continue down the module in additional PR(s).

pep8speaks · 2018-04-06T16:41:34Z

Hello @WillAyd! Thanks for updating the PR.

Cheers! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on April 23, 2018 at 15:49 Hours UTC

WillAyd · 2018-04-06T16:42:30Z

pandas/tests/indexes/test_base.py

-
-    def test_constructor_from_series_period(self):
-        idx = pd.period_range('2015-01-01', freq='D', periods=3)
+        if has_tz:


I suppose this could also leverage hasattr instead but I felt the explicitness of the parameter is more useful

WillAyd · 2018-04-06T16:43:10Z

pandas/tests/indexes/test_base.py

        tm.assert_index_equal(result, expected)

+    @pytest.mark.parametrize("klass", [pd.Series, pd.DataFrame])
+    def test_constructor_from_series_freq(self, klass):


This could arguably be split into two separate tests given the size of the conditional. Open to suggestions

yes, would do that

WillAyd · 2018-04-06T16:44:05Z

pandas/tests/indexes/test_base.py

@@ -237,61 +223,63 @@ def test_constructor_int_dtype_float(self, dtype):
        result = Index([0., 1., 2., 3.], dtype=dtype)
        tm.assert_index_equal(result, expected)

-    def test_constructor_int_dtype_nan(self):
+    @pytest.mark.parametrize("dtype,klass_or_raises", [


Not sure how we feel about mixing types in klass_or_raises - could also be split into a boolean flag for raises and the klass, though the test would typically just use one or the other

haha just made this comment

WillAyd · 2018-04-06T16:45:00Z

pandas/tests/indexes/test_base.py


+    @pytest.mark.parametrize("swap_objs", [True, False])


Kind of a strange parameter but just emulating the existing test - one assertion is done with the datetime at the initial position and another assertion is done with the timedelta at the initial position

WillAyd · 2018-04-06T16:46:07Z

pandas/tests/indexes/test_base.py

+        # below should coerce
+        [1., 2., 3.], np.array([1., 2., 3.], dtype=float)
+    ])
+    def test_constructor_dtypes_to_int64(self, vals):


This is similar to a lot of the tests below and could arguably be built with parameters of "vals,dtype,klass" but I felt it was cleaner and less repetition to just break the tests by dtype

WillAyd · 2018-04-06T16:47:28Z

pandas/tests/indexes/test_base.py

-        for idx in [Index(np.array([np.timedelta64(1, 'D'), np.timedelta64(
-                1, 'D')])), Index([timedelta(1), timedelta(1)])]:
-            assert isinstance(idx, TimedeltaIndex)
+    @pytest.mark.parametrize("cast_idx", [True, False])


I've used this in a few tests and am not a huge fan of it, but I wasn't sure of a better way to emulate the existing behavior where dtype=object is in the Index constructor in half of the tests

WillAyd · 2018-04-06T16:48:40Z

pandas/tests/indexes/test_base.py


-    def test_constructor_empty(self):
+    def test_constructor_empty_gen(self):
        skip_index_keys = ["repeats", "periodIndex", "rangeIndex",
                           "tuples"]
        for key, idx in self.generate_index_types(skip_index_keys):


In general some of these magic class methods could probably be replaced with pytest functionality. I haven't looked in too much detail just yet but was planning on reviewing after giving the module a first pass at parametrization without changing class behavior

yes see my comments below

note there is already a fixture of indices which works, so we have 3 methods of specifying things ATM:

indices fixture

generate_index_types(...)

self.indices

we need to clean this up and just make fixtures that we can use generally (in conftest), with docs

jreback

generally looks good.

jreback · 2018-04-06T16:45:03Z

pandas/tests/indexes/test_base.py

+            df['date'] = dts
+            result = DatetimeIndex(df['date'], freq='MS')
+            assert df['date'].dtype == object
+            expected.name = 'date'


put some blank lines in between text like this (so it's easily readable)

jreback · 2018-04-06T16:45:41Z

pandas/tests/indexes/test_base.py

-            expected = pd.Index(array)
-            result = pd.Index(ArrayLike(array))
-            tm.assert_index_equal(result, expected)
+        expected = pd.Index(array)


can you audit for whether we use pd.Index (and similar) or Index (we are not generally consistent); I'd like to be consistent within a module

grep " Index(" -r pandas/tests --include="*.py" | wc -l
891
grep " pd.Index(" -r pandas/tests --include="*.py" | wc -l
264

There may be a more robust way of counting, but assuming these numbers are directionally accurate, the first form (bare Index) is more widely used. For this module specifically I see 175 and 48, respectively.

I personally prefer pd.Index for explicitness but am fine to change to Index for module-consistency. lmk
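
For a count that is less sensitive to spacing than the grep above, something along these lines could work. It is only a sketch: the helper names are made up for illustration, and unlike `grep | wc -l` it counts occurrences rather than matching lines, so the totals would not line up exactly with the numbers quoted.

```python
import re
from pathlib import Path

# A bare call is "Index(" not preceded by a word character or a dot, which
# excludes DatetimeIndex(...), MultiIndex(...), and pd.Index(...).
BARE_CALL = re.compile(r"(?<![\w.])Index\(")
QUALIFIED_CALL = re.compile(r"(?<![\w.])pd\.Index\(")


def count_index_calls(source):
    """Return (bare, qualified) call counts for one source string."""
    return len(BARE_CALL.findall(source)), len(QUALIFIED_CALL.findall(source))


def count_in_tree(root):
    """Sum the counts over every .py file below root."""
    bare = qualified = 0
    for path in Path(root).rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        found_bare, found_qualified = count_index_calls(text)
        bare += found_bare
        qualified += found_qualified
    return bare, qualified
```

Calling `count_in_tree("pandas/tests/indexes")` would give a per-directory figure, which may be more useful for deciding module by module.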

jreback · 2018-04-06T16:46:37Z

pandas/tests/indexes/test_base.py

@@ -237,61 +223,63 @@ def test_constructor_int_dtype_float(self, dtype):
        result = Index([0., 1., 2., 3.], dtype=dtype)
        tm.assert_index_equal(result, expected)

-    def test_constructor_int_dtype_nan(self):
+    @pytest.mark.parametrize("dtype,klass_or_raises", [


leave these as 2 separate tests. Generally, having a parametrization where one case raises is not great (you can keep the same name, however, with _errors appended to one of the test names)
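
A minimal sketch of that split, using a hypothetical pure-Python stand-in (`coerce_to_ints`) rather than the real Index constructor, so the passing cases and the raising cases each get their own parametrized test:

```python
import math

import pytest


def coerce_to_ints(values):
    """Hypothetical stand-in for a constructor that refuses NaN."""
    if any(math.isnan(v) for v in values):
        raise ValueError("cannot convert NaN to integer")
    return [int(v) for v in values]


@pytest.mark.parametrize("values,expected", [
    ([0.0, 1.0, 2.0], [0, 1, 2]),
    ([3.0], [3]),
])
def test_coerce_to_ints(values, expected):
    result = coerce_to_ints(values)
    assert result == expected


@pytest.mark.parametrize("values", [
    [float("nan")],
    [1.0, float("nan")],
])
def test_coerce_to_ints_errors(values):
    # Same base name with an _errors suffix; only raising cases live here.
    with pytest.raises(ValueError, match="cannot convert NaN"):
        coerce_to_ints(values)
```

Neither test body needs a branch on whether the case should raise, which is the main readability gain over a mixed `klass_or_raises` parameter.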

jreback · 2018-04-06T16:46:54Z

pandas/tests/indexes/test_base.py

+        na_list = [na_val, na_val]
+        exp = klass(na_list)
+        assert exp.dtype == dtype
+        tm.assert_index_equal(Index(na_list), exp)


try to use
result =
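
One way to read this suggestion, assuming the usual `result` / `expected` naming and the blank-line spacing asked for earlier in the review (the test name is hypothetical):

```python
import numpy as np
import pandas as pd
from pandas.testing import assert_index_equal


def test_index_from_nan_list_matches_array_input():
    na_list = [np.nan, np.nan]

    # Name both sides of the comparison instead of nesting the
    # constructor call inside the assertion.
    result = pd.Index(na_list)
    expected = pd.Index(np.array(na_list))

    assert_index_equal(result, expected)
```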

jreback · 2018-04-06T16:47:17Z

pandas/tests/indexes/test_base.py

+        tm.assert_index_equal(Index(np.array(na_list)), exp)
+
+    @pytest.mark.parametrize("data", [
+        [pd.NaT, np.nan], [np.nan, pd.NaT], [np.nan, np.datetime64('nat')],


we should make this into a fixture (e.g. nulls_fixture) (if you want to add to conftest). note we should change globally (but can do that in another PR)
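
A rough sketch of what such a fixture might look like in conftest. The parameter list mirrors the conftest diff that appears later in this thread; the docstring and the consuming test are assumptions added for illustration.

```python
import numpy as np
import pandas as pd
import pytest


@pytest.fixture(params=[None, np.nan, pd.NaT])
def nulls_fixture(request):
    """Yield each supported null value in turn."""
    return request.param


def test_isna_accepts_each_null(nulls_fixture):
    # pytest runs this once per fixture param.
    assert pd.isna(nulls_fixture)
```

Any test that names `nulls_fixture` as an argument is run once per null value, so individual tests no longer need their own `parametrize` list of nulls.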

jreback · 2018-04-06T16:48:32Z

pandas/tests/indexes/test_base.py

@@ -499,25 +483,25 @@ def test_insert(self):
        null_index = Index([])
        tm.assert_index_equal(Index(['a']), null_index.insert(0, 'a'))

+    @pytest.mark.parametrize("na_val", [np.nan, pd.NaT, None])


e.g. you can use the nulls_fixture from above

Hmm sorry missed this in the last update. I'm assuming that the fixture should NOT include None as a null value so I can break that off into its own test on the next push. Let me know if you do in fact want None to be included there though (will require updates to the other function using the fixture)

I think we should include None, because I think we do convert it for all but object dtypes (where it is left alone). so maybe we need 2 fixtures. might be tricky to do this in a general way.

jreback · 2018-04-06T16:49:59Z

pandas/tests/indexes/test_base.py

+        (0, Index(['b', 'c', 'd'], name='idx')),
+        (-1, Index(['a', 'b', 'c'], name='idx'))
+    ])
+    def test_delete(self, pos, exp):


note that some of the tests here should really be parametrized on index types themselves; there is an open issue on that (e.g. these set-type ops, test_delete and such, are generally tested by the subclasses), but we need a more general cleanup on that
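
As a sketch of what parametrizing on index types could look like for `delete`, assuming a small hand-picked list of indexes (the list and test name are hypothetical, and the real suite would cover many more subclasses):

```python
import pandas as pd
import pytest

# Hypothetical selection of index types for illustration only.
INDEX_CASES = [
    pd.Index(["a", "b", "c", "d"], name="idx"),
    pd.RangeIndex(4, name="idx"),
    pd.date_range("2011-01-01", periods=4, name="idx"),
]


@pytest.mark.parametrize("idx", INDEX_CASES, ids=lambda idx: type(idx).__name__)
@pytest.mark.parametrize("pos", [0, -1])
def test_delete_drops_one_position(idx, pos):
    # Dropping the first or last element should match the equivalent slice.
    expected = idx[1:] if pos == 0 else idx[:-1]

    result = idx.delete(pos)

    assert result.equals(expected)
    assert result.name == expected.name
```

The comparison uses `Index.equals` rather than a strict class check, since the concrete return type of `delete` is not something this sketch tries to pin down.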

codecov · 2018-04-10T05:11:29Z

Codecov Report

Merging #20624 into master will decrease coverage by 0.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #20624      +/-   ##
==========================================
- Coverage   91.85%   91.82%   -0.03%     
==========================================
  Files         153      153              
  Lines       49310    49310              
==========================================
- Hits        45292    45280      -12     
- Misses       4018     4030      +12

Flag	Coverage Δ
#multiple	`90.21% <ø> (-0.03%)`	⬇️
#single	`41.9% <ø> (ø)`	⬆️

Impacted Files	Coverage Δ
pandas/plotting/_converter.py	`65.07% <0%> (-1.74%)`	⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 31e77b0...6e46b57.

WillAyd

I've deferred making any changes around the Index vs pd.Index conversation. I'm assuming that we can either do that as one cleanup at the end for the entire module or bundle into a separate change somewhere else but can also include incrementally in this PR if you'd prefer

WillAyd · 2018-04-10T05:16:16Z

pandas/tests/indexes/test_base.py

-        exp = pd.DatetimeIndex([pd.NaT, pd.NaT])
-        assert exp.dtype == 'datetime64[ns]'
-
-        for data in [[pd.NaT, np.nan], [np.nan, pd.NaT],


I think this was a mistake before with pd.NaT to pair with np.nan instead of the datetime constructor. With the new parametrized test that has been adjusted

hmm, I don't think so, this is a construction with mixed types of nulls.

Right but judging off the original test right below it there is something slightly off. This one only uses np.datetime64('nat') in 2/4 parameters, but the below test uses np.timedelta64('nat') in 4/4 parameters.

FWIW it may not hurt to add more constructor tests here especially if we add None to the nulls_fixture. Will take a deeper look on next pass at this

WillAyd · 2018-04-10T05:17:24Z

pandas/tests/indexes/test_base.py

@@ -499,25 +483,25 @@ def test_insert(self):
        null_index = Index([])
        tm.assert_index_equal(Index(['a']), null_index.insert(0, 'a'))

+    @pytest.mark.parametrize("na_val", [np.nan, pd.NaT, None])


Hmm sorry missed this in the last update. I'm assuming that the fixture should NOT include None as a null value so I can break that off into its own test on the next push. Let me know if you do in fact want None to be included there though (will require updates to the other function using the fixture)

jreback

this is a messy file. prob need to do multiple PR's to get this really nice.

jreback · 2018-04-10T12:34:58Z

pandas/conftest.py

@@ -87,3 +87,11 @@ def join_type(request):
    Fixture for trying all types of join operations
    """
    return request.param
+
+
+@pytest.fixture(params=[numpy.nan, pandas.NaT])


need None as well

can you use np and pd

jreback · 2018-04-11T02:03:46Z

pandas/tests/indexes/test_base.py

        tm.assert_index_equal(result, expected)

+    @pytest.mark.parametrize("klass", [pd.Series, pd.DataFrame])
+    def test_constructor_from_series_freq(self, klass):


yes, would do that

jreback · 2018-04-11T02:06:23Z

pandas/tests/indexes/test_base.py

-        assert result.tz == idx.tz
+    @pytest.mark.parametrize("cast_as_obj", [True, False])
+    @pytest.mark.parametrize("idx,has_tz", [
+        (pd.date_range('2015-01-01 10:00', freq='D', periods=3,


would prob drop the has_tz, then can make the idx a fixture above (in the test conditional you can directly test isinstance DTI & tz is not None); then you can add a DTI w/o a tz here as well.
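
The conditional described here could be sketched as a small helper (the name `is_tz_aware_dti` is made up for illustration), which would let the `has_tz` flag be derived from the index itself:

```python
import pandas as pd


def is_tz_aware_dti(idx):
    """True only for a DatetimeIndex that carries a timezone."""
    return isinstance(idx, pd.DatetimeIndex) and idx.tz is not None


# Example inputs: tz-aware DTI, tz-naive DTI, and a PeriodIndex.
aware = pd.date_range("2015-01-01 10:00", freq="D", periods=3, tz="UTC")
naive = pd.date_range("2015-01-01 10:00", freq="D", periods=3)
periods = pd.period_range("2015-01-01", freq="D", periods=3)
```

With a check like this, a tz-naive DatetimeIndex can be added to the parametrization without touching any accompanying boolean.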

jreback · 2018-04-11T02:09:03Z

pandas/tests/indexes/test_base.py


+    @pytest.mark.parametrize("pos", [0, 1])


these tests that are strictly for datetime like things can prob just move to a new file: test_datetimelike.py (IOW tests that test all of DTI,TDI,PI but all together). can be future PR (or here).

IOW test_base should be only non-datetimelike tests.

jreback · 2018-04-11T02:10:50Z

pandas/tests/indexes/test_base.py

-        for tz in [None, 'UTC', 'US/Eastern', 'Asia/Tokyo']:
-            idx = pd.date_range('2011-01-01', periods=5, tz=tz)
-            dtype = idx.dtype
+    @pytest.mark.parametrize("tz", [


move to pandas/tests/indexes/datetimes/test_timezones.py (might be a duplicate as well)

jreback · 2018-04-11T02:11:43Z

pandas/tests/indexes/test_base.py


+    @pytest.mark.parametrize("attr", ['values', 'asi8'])
+    @pytest.mark.parametrize("klass", [pd.Index, pd.TimedeltaIndex])
+    def test_constructor_dtypes_timedelta(self, attr, klass):


move this to pandas/tests/indexes/timedeltas/test_construction (maybe duplicate)

jreback · 2018-04-11T02:12:57Z

pandas/tests/indexes/test_base.py


-    def test_constructor_empty(self):
+    def test_constructor_empty_gen(self):
        skip_index_keys = ["repeats", "periodIndex", "rangeIndex",
                           "tuples"]
        for key, idx in self.generate_index_types(skip_index_keys):


note there is already a fixture of indices which works, so we have 3 methods of specifying things ATM:

indices fixture

generate_index_types(...)

self.indices

we need to clean this up and just make fixtures that we can use generally (in conftest), with docs

jreback · 2018-04-11T02:13:39Z

pandas/tests/indexes/test_base.py

-                           labels=[[], []])
-        assert isinstance(empty, MultiIndex)
+    @pytest.mark.parametrize("empty,klass", [
+        (PeriodIndex([], freq='B'), PeriodIndex),


might be duplicated elsewhere. am iffy on where to put things like this. maybe worth extracting things from this file and making a test_construction for things like this

jreback · 2018-04-11T02:15:46Z

pandas/tests/indexes/test_base.py

@@ -499,25 +483,25 @@ def test_insert(self):
        null_index = Index([])
        tm.assert_index_equal(Index(['a']), null_index.insert(0, 'a'))

+    @pytest.mark.parametrize("na_val", [np.nan, pd.NaT, None])


I think we should include None, because I think we do convert it for all but object dtypes (where it is left alone). so maybe we need 2 fixtures. might be tricky to do this in a general way.

WillAyd · 2018-04-16T21:14:22Z

Most but not all edits made. I avoided anything that requires moving things out of the module, if only because my suggested plan of attack would be:

1. Parametrize the existing module where appropriate (this covers ~650 of ~2400 lines, so maybe 3 or 4 more similar PRs)
2. Replace any class methods / attributes with fixtures where appropriate
3. Move tests into new modules / sub-modules

Open to suggestions on approach.

jbrockmendel · 2018-04-19T20:25:27Z

pandas/conftest.py

@@ -89,6 +89,14 @@ def join_type(request):
    return request.param


+@pytest.fixture(params=[None, np.nan, pd.NaT])


This is a good idea. Some more for the list: np.timedelta64('NaT'), np.datetime64('NaT'), float('nan'), np.float('NaN').

Note that np.float('NaN') does not return np.nan (i.e. does not behave like pd.NaT), so this would catch any occurrences in pandas of e.g if foo is np.nan
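
To illustrate the identity pitfall without depending on NumPy, here is a small stdlib-only sketch. `SHARED_NAN` is an assumed stand-in for a module-level NaN object such as np.nan, and both helper names are hypothetical. (At the time, `np.float` was simply an alias for the builtin `float`, so `np.float('NaN')` built a fresh float object in the same way.)

```python
import math

# Stand-in for a shared module-level NaN object such as np.nan
# (assumption: any single float NaN object shows the same effect).
SHARED_NAN = float("nan")


def is_null_by_identity(value):
    """Fragile: only recognizes the one shared NaN object."""
    return value is SHARED_NAN


def is_null_by_value(value):
    """Recognizes any float NaN, however it was created."""
    return isinstance(value, float) and math.isnan(value)


# A new float object, distinct from SHARED_NAN even though both are NaN.
fresh_nan = float("NaN")
```

A fixture that includes a freshly built NaN would make any `is`-based check fail visibly, which is the kind of bug this suggestion is meant to surface.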

jreback · 2018-04-21T22:01:42Z

@WillAyd can you rebase.

can you expand the nulls fixture as well? to as much as you can while still having this pass?

WillAyd · 2018-04-23T16:05:30Z

Added the suggestions from @jorisvandenbossche with the exception of np.timedelta64('NaT') and np.datetime64('NaT') as they caused a failure with the below test.

https://github.com/WillAyd/pandas/blob/6e46b57bc99e70b96eb6de448ab12ff7930a2d10/pandas/tests/indexes/test_base.py#L272

The test is parametrized for a DatetimeIndex and a TimedeltaIndex and the failure occurs when you try to construct from a list that contains one instance of both (it returns an object-typed Index instead of one or the other).

I think this is a corner case for this test so we could definitely still add to the fixture and have the test be responsible for skipping where appropriate, but for now I've just kept out of the fixture
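
If the typed NaT values were added to the fixture later, the "test skips where appropriate" option might look roughly like this. It is a sketch under assumptions: the extended parameter list and the test name are hypothetical, and the assertion is a placeholder for whatever the real test checks.

```python
import numpy as np
import pandas as pd
import pytest

# NumPy's typed NaT scalars, which the affected test cannot handle generically.
TYPED_NAT = (np.datetime64, np.timedelta64)


@pytest.fixture(params=[None, np.nan, pd.NaT,
                        np.datetime64("NaT"), np.timedelta64("NaT")])
def nulls_fixture(request):
    """Hypothetical extended fixture including typed NaT values."""
    return request.param


def test_generic_null_is_na(nulls_fixture):
    if isinstance(nulls_fixture, TYPED_NAT):
        pytest.skip("typed NaT is out of scope for this check")

    assert pd.isna(nulls_fixture)
```

The trade-off is that the skip logic lives in each affected test rather than in the fixture, so the fixture stays general while the corner case stays documented where it bites.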

jreback · 2018-04-24T10:13:14Z

thanks @WillAyd

if you can create an issue for extending the nulls fixture (and issue around that), and an issue about cleaning more testing things (e.g. comments above).

TST: Parametrized index tests

a8cb08b

WillAyd commented Apr 6, 2018

jreback requested changes Apr 6, 2018

LINT fixup

7c6b00f

jreback added the labels Testing (pandas testing functions or related to the test suite), Indexing (related to indexing on series/frames, not to indexes themselves), and Dtype Conversions (unexpected or buggy dtype conversions) Apr 6, 2018

WillAyd added 6 commits April 9, 2018 18:49

Test case refactor #1

f54c27d

Created nulls_fixture

f3ca105

Consolidated nat ctor tests

918e7ca

LINT fixup

32dcffb

Split test_empty_fancy test

3687263

Merge remote-tracking branch 'upstream/master' into idx-clnup

5a4182c

WillAyd commented Apr 10, 2018

jreback requested changes Apr 11, 2018

WillAyd added 2 commits April 16, 2018 13:55

Merge remote-tracking branch 'upstream/master' into idx-clnup

4152c2c

Updated conftest; split some tests

027e67f

LINT Fixup

9f98f16

jbrockmendel reviewed Apr 19, 2018

jreback approved these changes Apr 21, 2018

jreback added this to the 0.23.0 milestone Apr 21, 2018

WillAyd added 2 commits April 23, 2018 08:49

Added to nulls fixture

7f02db0

Merge remote-tracking branch 'upstream/master' into idx-clnup

6e46b57

jreback merged commit 3bb58ac into pandas-dev:master Apr 24, 2018

WillAyd deleted the idx-clnup branch April 24, 2018 15:44

WillAyd mentioned this pull request Apr 24, 2018

TST: Cleanup tests/indexes/test_base.py #20812

Closed

8 tasks
