Replace classes with pytest in test_sql #55074

WillAyd · 2023-09-08T21:17:25Z

A lot of this is very intertwined, so it is hard to break up into smaller diffs.

WillAyd · 2023-09-08T21:21:39Z

pandas/tests/io/test_sql.py

+    dtype_backend_data,
+    dtype_backend_expected,
+):
+    if "sqlite" in conn:


This is silently passed on main. What's interesting is that read_sql_table arguably works better than read_sql because it correctly maps the boolean columns to boolean values. We can update the fixture to account for that, but I didn't want to tackle it here, to avoid growing the already enormous diff.

WillAyd · 2023-09-08T21:23:06Z

pandas/tests/io/test_sql.py

-        res = sql.read_sql_table("test_schema_other", self.conn, schema="other")
-        tm.assert_frame_equal(concat([df, df], ignore_index=True), res)
-
-        # specifying schema in user-provided meta


From here until the end of the function, nothing even runs on main, so I just deleted it to simplify.

WillAyd · 2023-09-08T21:24:35Z

pandas/tests/io/test_sql.py

-            VALUES (1, '2021-01-01T00:00:00Z');
-        """
-        )
-        if isinstance(self.conn, Engine):


This also doesn't run on main. self.conn is never an Engine for this class; it is always a connection.

WillAyd · 2023-09-08T21:27:24Z

pandas/tests/io/test_sql.py


-    def test_read_sql_parameter(self, sql_strings):


read_sql_parameter and read_sql_named_parameter already exist in PandasTest, so I still have to parametrize / convert that test, but doing so will render these unnecessary.

WillAyd · 2023-09-08T21:27:51Z

pandas/tests/io/test_sql.py


-    def test_to_sql_empty(self, test_frame1):


Replaced by test_dataframe_to_sql_empty earlier in the new module.

WillAyd · 2023-09-09T16:44:19Z

pandas/tests/io/test_sql.py


-    def dtype_backend_expected(self, storage, dtype_backend) -> DataFrame:
+@pytest.fixture
+def dtype_backend_expected():


This is the pattern I would ultimately like to follow with the types / iris tables, i.e. just create a separate fixture that takes the connection as an argument and can build / drop the tables from there @mroeschke
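A minimal sketch of what such a table fixture could look like, assuming a plain in-memory sqlite3 connection. The names `iris_table`, `sqlite_conn` and `sqlite_iris`, and the two-column schema, are hypothetical stand-ins and not the fixtures in this PR:

```python
import sqlite3
from contextlib import contextmanager

import pytest


@contextmanager
def iris_table(conn):
    """Build a tiny stand-in iris table on `conn`, and drop it on exit."""
    conn.execute("CREATE TABLE iris (sepal_length REAL, name TEXT)")
    conn.execute("INSERT INTO iris VALUES (5.1, 'Iris-setosa')")
    try:
        yield "iris"
    finally:
        # Teardown lives next to setup, so no class-level cleanup is needed.
        conn.execute("DROP TABLE IF EXISTS iris")


@pytest.fixture
def sqlite_conn():
    # Hypothetical connection fixture: knows nothing about any tables.
    conn = sqlite3.connect(":memory:")
    yield conn
    conn.close()


@pytest.fixture
def sqlite_iris(sqlite_conn):
    # Separate fixture that takes the connection and manages the table.
    with iris_table(sqlite_conn) as table_name:
        yield sqlite_conn, table_name
```

Keeping the table logic in a helper that accepts any connection means the same build / drop code could, in principle, be reused by one fixture per connection type.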

WillAyd · 2023-09-09T20:16:26Z

pandas/tests/io/test_sql.py

+@pytest.mark.db
+@pytest.mark.parametrize("conn", all_connectable_iris)
+def test_read_sql_iris_no_parameter_with_percent(conn, request, sql_strings, flavor):
+    if "mysql" in conn or "postgresql" in conn:


This test doesn't appear to even be running on main

WillAyd · 2023-09-09T20:24:17Z

The diff here is unfortunate, but I am not sure there is a lot that can be broken off piece-wise.

With the new design, we have a lot more coverage of the different connection types, particularly when using engines directly (and have also uncovered some broken behavior). The total number of tests increases from 824 to 1449.

There is still more that can be done to clean up the fixtures (in particular, the standard vs iris vs types table fixtures), but as is this should help us onboard the ADBC drivers much more effectively.

WillAyd · 2023-09-12T19:01:53Z

Hmm I'm not sure. I'll have to do another pass, but the hard part is that you end up with a lot of duplication in the intermediate state. Taking the first method of TestSQLiteFallback as an example:

class TestSQLiteFallback(SQLiteMixIn, PandasSQLTest):
    """
    Test the fallback mode against an in-memory sqlite database.

    """

    flavor = "sqlite"

    @pytest.fixture(autouse=True)
    def setup_method(self, iris_path, types_data):
        self.conn = self.connect()
        self.load_iris_data(iris_path)
        self.load_types_data(types_data)
        self.pandasSQL = sql.SQLiteDatabase(self.conn)

    def test_read_sql_parameter(self, sql_strings):
        self._read_sql_iris_parameter(sql_strings)

We would have to duplicate load_iris_data, load_types_data and read_sql_iris_parameter just to get that one test to run. We would also have to reimplement connect, which in this case fortunately isn't that different from the sqlite_buildin fixture, but that then puts us in a state where we have a fixture that does part of the setup/teardown and custom functions to do the rest.

In the current PR there is no intermediate state that mixes fixtures, functions and class-based setup; everything is managed via the fixture. It's a big diff but a cleaner end state.
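As a rough illustration of that end state, the class method above could become a module-level test parametrized by fixture name. This is a sketch only: the fixture name `sqlite_buildin_iris`, the helper bodies and the two-row table are assumptions made for the example, and `request.getfixturevalue` is the standard pytest call for resolving a fixture by name:

```python
import sqlite3

import pytest


def load_iris_data(conn):
    # Stand-in for the real loader: two rows are enough for the sketch.
    conn.execute("CREATE TABLE iris (name TEXT, sepal_length REAL)")
    conn.executemany(
        "INSERT INTO iris VALUES (?, ?)",
        [("Iris-setosa", 5.1), ("Iris-virginica", 6.3)],
    )


def read_iris_parameter(conn, name):
    # Parameterized query, mirroring what the class-based helper exercised.
    query = "SELECT name, sepal_length FROM iris WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


@pytest.fixture
def sqlite_buildin_iris():
    # The fixture owns all setup and teardown; no setup_method, no mixin.
    conn = sqlite3.connect(":memory:")
    load_iris_data(conn)
    yield conn
    conn.close()


@pytest.mark.parametrize("conn", ["sqlite_buildin_iris"])
def test_read_sql_parameter(conn, request):
    # `conn` arrives as a fixture name and is resolved lazily here.
    conn = request.getfixturevalue(conn)
    assert read_iris_parameter(conn, "Iris-setosa") == [("Iris-setosa", 5.1)]
```

Adding another connection type would then only mean appending another fixture name to the parametrize list.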

The hanging issue now comes from updating the fixture to test engines directly, not just connections. I did this in hopes of increasing test coverage, but may just need to scale it back some more.

WillAyd · 2023-09-15T11:55:31Z

OK finally got rid of all the hanging behavior when using engine arguments directly. This should be good to go

mroeschke · 2023-09-18T17:39:28Z

pandas/tests/io/test_sql.py

@@ -77,6 +77,9 @@
    SQLALCHEMY_INSTALLED = False


+pytestmark = [pytest.mark.db, pytest.mark.single_cpu]


I think this shouldn't be necessary (except pytest.mark.db, used for mysql/postgres)

The sqlite3 engine should be able to run without the pytest.mark.db marker since it uses :memory:, I believe, i.e. these tests shouldn't be skipped in the CI if no db was specified

IIRC the xdist parallelization is done per file, so tests here should be run sequentially

So are you thinking we should only mark the SQLAlchemy tests as db? Or none at all??

Yeah I think the sqlalchemy tests and maybe wrap the fixture params that use sqlalchemy connections/engines in pytest.param(..., marks=pytest.mark.db)
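A small sketch of that suggestion, assuming the connectables are lists of fixture names. The list layout and the two engine fixture names are illustrative, not necessarily what the PR ends up with:

```python
import pytest

# In-memory sqlite needs no external database, so it carries no mark.
sqlite_connectable = ["sqlite_buildin"]

# Entries that need a running database server carry the db mark.
# These fixture names are assumptions for the example.
sqlalchemy_connectable = [
    pytest.param("mysql_pymysql_engine", marks=pytest.mark.db),
    pytest.param("postgresql_psycopg2_engine", marks=pytest.mark.db),
]

all_connectable = sqlite_connectable + sqlalchemy_connectable


@pytest.mark.parametrize("conn", all_connectable)
def test_conn_name_is_string(conn):
    # Each parametrized case inherits the marks attached to its param.
    assert isinstance(conn, str)
```

With this layout, running `pytest -m "not db"` would deselect only the MySQL / PostgreSQL cases and still run the sqlite one.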

mroeschke · 2023-09-18T17:41:45Z

pandas/tests/io/test_sql.py

 @pytest.mark.parametrize("conn", all_connectable)
 def test_api_to_sql_index_label_multiindex(conn, request):
    conn_name = conn
    if "mysql" in conn_name:
        request.node.add_marker(
-            pytest.mark.xfail(reason="MySQL can fail using TEXT without length as key")
+            pytest.mark.xfail(
+                reason="MySQL can fail using TEXT without length as key", strict=False


Was this still flaky (i.e. is strict=False still needed)?

Yea I think it depends on the version of MySQL and may even be a MySQL versus MariaDB difference. I couldn't pinpoint exactly where this would have been allowed

https://stackoverflow.com/questions/1827063/mysql-error-key-specification-without-a-key-length
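For context, a sketch of how a non-strict xfail behaves when applied at runtime. The helper `mysql_key_length_xfail` and the fixture names are hypothetical; only the `request.node.add_marker` call and the xfail reason come from the diff above:

```python
import pytest


def mysql_key_length_xfail(conn_name):
    """Return a non-strict xfail mark for MySQL connections, else None."""
    if "mysql" in conn_name:
        return pytest.mark.xfail(
            reason="MySQL can fail using TEXT without length as key",
            strict=False,
        )
    return None


@pytest.mark.parametrize("conn", ["sqlite_buildin", "mysql_pymysql_engine"])
def test_index_label_multiindex_sketch(conn, request):
    mark = mysql_key_length_xfail(conn)
    if mark is not None:
        # Applied at runtime so only the MySQL case is affected.
        request.node.add_marker(mark)
    assert isinstance(conn, str)
```

With strict=False, a MySQL run that happens to pass is reported as XPASS instead of failing the suite, which suits a failure that depends on the server version.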

mroeschke · 2023-09-18T17:44:01Z

pandas/tests/io/test_sql.py

+@pytest.mark.parametrize("conn", all_connectable)
+def test_transactions(conn, request):
+    if "engine" in conn:
+        pytest.skip(reason="hangs forever in CI with engine")


Is this still the case after the refactor?

Same cases below.

Also, could you use is_ci_environment() in this condition too?

Ah sorry, these should be removed. The comment about CI might be misleading; ultimately I think it comes back to different versions having different transactional behavior

WillAyd · 2023-09-18T19:49:30Z

pandas/tests/io/test_sql.py

+    df = DataFrame({"t": [datetime(2020, 12, 31, 12)]}, dtype="datetime64[ns]")
+    df.to_sql("test", conn, if_exists="replace", index=False)
+    result = pd.read_sql("select * from test", conn).iloc[0, 0]
+    assert result == "2020-12-31 12:00:00.000000"


 @pytest.mark.db


This and the test following are marked directly against the test; this is a bit different from how the other tests work which use pytest.param

mroeschke · 2023-09-18T23:41:15Z

pandas/tests/io/test_sql.py

-        num_entries = len(test_frame1)
-        num_rows = count_rows(self.conn, "test_frame1")
-        assert num_rows == num_entries
+@pytest.fixture


This looks like it could be a regular function, but that can be a follow-up

mroeschke · 2023-09-18T23:47:28Z

Very cool! It's a cleanup that's long been needed

* initial working test
* passing mixing class removal
* converted non-sqlalchemy tests
* large refactor
* sqlite class conversion
* checkpoint
* sqlitefallback conversion
* fixup tests
* no more test classes
* factory func
* most code cleanups
* removed breakpoint; passing tests
* fixes
* fix when missing SQLAlchemy
* more fixups when no SQLAlchemy
* fixups
* xfail -> skip
* sqlite fixture use transaction for cleanup
* verbose test for hangs
* try skipping sqlite-sqlalchemy-memory on rollback test
* sqlite sqlaclchemy memory cleanup
* revert verbose logging in tests
* mark all db tests
* try single_cpu
* skip more engine tests that can hang
* try no pandasSQL without transaction
* more skip
* try verbose
* transaction skips
* remove verbose CI
* CI verbose
* no more hanging
* reverted CI files
* type ignore
* cleanup skips
* remove marks
* mark fixtures
* mark postgres fixtures

WillAyd added 6 commits September 7, 2023 15:26

initial working test

101b229

passing mixing class removal

4c84f98

converted non-sqlalchemy tests

eada6f6

large refactor

8dc41b0

sqlite class conversion

e12e5d5

checkpoint

17b8e44

WillAyd commented Sep 8, 2023

WillAyd added 4 commits September 9, 2023 10:06

sqlitefallback conversion

0a644f5

fixup tests

7efc9f5

no more test classes

d1b2dbd

factory func

97a4eba

WillAyd commented Sep 9, 2023

WillAyd added 3 commits September 9, 2023 12:49

most code cleanups

9557f14

removed breakpoint; passing tests

e8c9b6d

fixes

ea1964d

WillAyd commented Sep 9, 2023

Merge branch 'main' into more-sql-refactor

e024424

WillAyd marked this pull request as ready for review September 9, 2023 20:25

WillAyd added 7 commits September 9, 2023 17:40

fix when missing SQLAlchemy

60b7787

more fixups when no SQLAlchemy

c1c13b4

fixups

7223810

xfail -> skip

21a9321

sqlite fixture use transaction for cleanup

63a9973

Merge remote-tracking branch 'upstream/main' into more-sql-refactor

8b3bbd3

verbose test for hangs

6d3ab37

WillAyd requested a review from mroeschke as a code owner September 11, 2023 00:18

try skipping sqlite-sqlalchemy-memory on rollback test

60b1c54

WillAyd added 3 commits September 12, 2023 15:12

try no pandasSQL without transaction

8327cc6

more skip

43b9407

try verbose

420dc95

WillAyd mentioned this pull request Sep 13, 2023

Use pandasSQL transactions in sql test suite to avoid engine deadlocks #55129

Merged

WillAyd added 7 commits September 13, 2023 19:52

transaction skips

c071541

Merge branch 'main' into more-sql-refactor

f6f4f1e

remove verbose CI

7c8c54f

CI verbose

4f3997c

no more hanging

f4f5241

reverted CI files

3b13e1e

type ignore

8978d61

mroeschke reviewed Sep 18, 2023

WillAyd added 5 commits September 18, 2023 13:52

Merge remote-tracking branch 'upstream/main' into more-sql-refactor

7b33dfb

cleanup skips

6cf13d6

remove marks

424ada1

mark fixtures

359641d

mark postgres fixtures

587e53d

WillAyd commented Sep 18, 2023

mroeschke reviewed Sep 18, 2023

mroeschke approved these changes Sep 18, 2023

mroeschke added the Testing (pandas testing functions or related to the test suite) and IO SQL (to_sql, read_sql, read_sql_query) labels Sep 18, 2023

mroeschke added this to the 2.2 milestone Sep 18, 2023

mroeschke merged commit df7c0b7 into pandas-dev:main Sep 18, 2023
38 of 39 checks passed
