Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Test Codes #54

Closed
wants to merge 14 commits into from
Closed

Add Test Codes #54

wants to merge 14 commits into from

Conversation

RexWzh
Copy link

@RexWzh RexWzh commented Dec 19, 2024

Add Test Codes

Hi @lenianiva, could you review the test setup in this PR?

Changes Made

  1. The other PR's tests passed successfully. Some of the statements need fixing, check the LSP here as commented before.

  2. Added pytest framework, migrate unitests from server.py to test_server.py

You can run the test suite with:

pytest -s tests/

Next Steps

  • Add GitHub workflow for automated testing
  • Fix bugs appears in the action, checked by pytest tests -m error

For error cases, we could fix in a new PR to simplify the review process?

Please let me know if you'd like any further adjustments.

pyproject.toml Show resolved Hide resolved
@lenianiva
Copy link
Member

lenianiva commented Dec 20, 2024

I'm a bit concerned about adding minif2f to the unit testing pipeline. Building mathlib takes a while. If it is just mathlib we could lake exe cache get but Pantograph has to work in a mathlib-independent way. How long does it take to build the library?

Can you set it up so that instead of using an example skeleton project with mathlib as a dependency, the minif2f tests directly run with project root pointing to Mathlib? This way we can leverage lake exe cache get and it would run much faster.

@RexWzh RexWzh marked this pull request as ready for review December 20, 2024 20:16
@RexWzh
Copy link
Author

RexWzh commented Dec 20, 2024

Can you set it up so that instead of using an example skeleton project with mathlib as a dependency, the minif2f tests directly run with project root pointing to Mathlib? This way we can leverage lake exe cache get and it would run much faster.

You might approve this PR to run the actions, or check it here https://github.com/Lean-zh/PyPantograph/pull/2/checks
image

@pytest.mark.advance
def test_load_theorem(minif2f_server: Server, minif2f_test: DataFrame, minif2f_valid: DataFrame):
"""Comprehensive test for loading multiple theorems.
use pytest -m "not advance" to skip this test.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use pytest -m "not advance" instead to skip some tests if it takes too long.

@lenianiva
Copy link
Member

Can I push onto your dev branch?

@RexWzh
Copy link
Author

RexWzh commented Jan 12, 2025

Can I push onto your dev branch?

Yes, please, I think you have the write permissions on this dev branch.

Details about the tests:

  1. Tests expected to fail for version 4.12.0 are marked as @pytest.mark.error.

  2. For the updated minif2f dataset, you can review the corrected results here:

    • Test Dataset

    • Validation Dataset

    • Or load and inspect the results on your local machine using the code snippet provided below:

      import re
      import pandas as pd
      
      # Load MiniF2F test and validation datasets
      mini12_test = pd.read_json('experiments/minif2f/test.jsonl', lines=True)
      mini12_valid = pd.read_json('experiments/minif2f/valid.jsonl', lines=True)
      
      # Define the default Lean header
      default_header = """import Mathlib
      open BigOperators Real Nat Topology"""
      
      # Write the test dataset to a Lean file
      with open('formal_test.lean', 'w') as f:
          f.write(default_header + '\n\n')
          text = '\n\n'.join(mini12_test.formal_statement.apply(lambda thm: f"{thm} := by sorry"))
          f.write(text)
      
      # Write the validation dataset to a Lean file
      with open('formal_valid.lean', 'w') as f:
          f.write(default_header + '\n\n')
          text = '\n\n'.join(mini12_valid.formal_statement.apply(lambda thm: f"{thm} := by sorry"))
          f.write(text)

@lenianiva
Copy link
Member

Why did you delete poetry.lock?

@RexWzh
Copy link
Author

RexWzh commented Jan 12, 2025

Why did you delete poetry.lock?

I removed poetry.lock since it caused test failures. I think the pyproject.toml is sufficient for version tracking. I can add back my local lock file if you prefer.

https://github.com/Lean-zh/PyPantograph/actions/runs/12732921282/job/35488677656

image

@lenianiva
Copy link
Member

Why did you delete poetry.lock?

I removed poetry.lock since it caused test failures. I think the pyproject.toml is sufficient for version tracking. I can add back my local lock file if you prefer.

https://github.com/Lean-zh/PyPantograph/actions/runs/12732921282/job/35488677656
image

The lock file should not be removed. If having the lock file causes your unit tests to fail something's seriously messed up with your setup.

@RexWzh
Copy link
Author

RexWzh commented Jan 13, 2025

I tried not to add your review work, so I committed the lock file from your branch instead. It was a bit of a clumsy attempt, though. What I meant is that this file is machine-generated and doesn’t need to be tracked in version control. It’s better to let each environment generate its own lock file as needed.

@RexWzh RexWzh marked this pull request as draft January 13, 2025 16:24
@RexWzh
Copy link
Author

RexWzh commented Jan 14, 2025

This is a non-essential feature, and it feels like we could be losing some development efficiency by getting caught up in these details. As mentioned before, I’m excited to see people willing to contribute to open-source projects. I’ve learned a lot from the Open Source Promotion Plan, and I hope to contribute to Lean's projects through lean-zh. Perhaps we could focus on something more impactful for this project to start with.

@RexWzh RexWzh closed this Jan 14, 2025
@lenianiva
Copy link
Member

I think this is a valuable contribution but we need to refactor out all the clutter e.g. the experiments out

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants