🧪 Behavioural Testing on a Sentiment Classifier 🔬

Check how robust a sentiment classifier is to random typos in our dataset, using GitHub Actions and invariance testing.

🗺️ A bit of context about this project

This project is supporting material for the talk "Testing Schrödinger's Box, AKA ML Systems", given at CommitConf 2024 and Codemotion 2024.

Abstract of the talk:

Just like quantum mechanics and Schrödinger's cat experiment, in AI we have our own mysteries, which, curiously, are also related to boxes. We have come to create systems of such complexity that we call them "black box models" because we are unable to understand what goes on inside them. We only know what goes in and what comes out.

In this talk, we will discuss how we can test these black boxes to shed some light on what goes on inside them, or at least to ensure that they behave in a predictable way, which is not trivial at all. We will also walk through a hands-on example of how to perform automatic tests on a sentiment analysis project.

📝 Description

In this project we present the code to predict an individual's belief about climate change based on their Twitter activity.

  • Information about the dataset and data processing performed: data/README.md and docs/processed/data_processing.ipynb
  • Benchmarking of the model: docs/processed/sentiment_analysis_guide.ipynb
  • Invariance test: run_invariance_test.py will assess the robustness of our classifier to typos in our dataset.
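
To give an idea of what an invariance test for typos looks like, here is a minimal sketch. It is not the code in run_invariance_test.py: the names `add_typo`, `invariance_rate` and `toy_predict`, the labels and the single-letter substitution strategy are all assumptions made for illustration. The idea is to perturb each text with a random typo and measure how often the predicted label stays the same.

```python
import random
import string


def add_typo(text: str, rng: random.Random) -> str:
    """Replace one randomly chosen letter with a different lowercase letter."""
    positions = [i for i, ch in enumerate(text) if ch.isalpha()]
    if not positions:
        return text  # nothing to perturb
    i = rng.choice(positions)
    candidates = [c for c in string.ascii_lowercase if c != text[i].lower()]
    return text[:i] + rng.choice(candidates) + text[i + 1:]


def invariance_rate(predict, texts, rng: random.Random) -> float:
    """Fraction of texts whose predicted label is unchanged after a typo."""
    if not texts:
        raise ValueError("texts must not be empty")
    unchanged = sum(predict(t) == predict(add_typo(t, rng)) for t in texts)
    return unchanged / len(texts)


def toy_predict(text: str) -> str:
    """Hypothetical stand-in for a real classifier (keyword lookup only)."""
    return "positive" if "great" in text.lower() else "negative"


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the perturbations are reproducible
    samples = ["This is a great initiative", "I do not trust this report"]
    print(f"Invariance rate: {invariance_rate(toy_predict, samples, rng):.2f}")
```

To try it with a real model, pass its prediction function instead of `toy_predict`. A rate close to 1.0 suggests the classifier is robust to this kind of typo; a lower rate points to predictions that flip on small perturbations.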

🛠️ Set up

  1. This project requires Python >=3.7, <=3.11. Check your Python version by running:
python --version
  2. Clone the repository:
git clone https://github.com/LoboaTeresa/Behavioural-Testing-on-a-Sentiment-Clasiffier.git
  3. Install the required packages:
pip install -r requirements.txt
  4. You are ready to go! Don't forget to check the notebooks in the docs folder to understand the data processing and the benchmarking of the model.

🌱 How to use GitHub Actions to test this and your own project

Learning journey on GitHub Actions: click here

  1. Create your own GitHub repository or fork this one. GitHub Actions is integrated into your project as soon as you create the GitHub repository.

  2. Click on Actions in the top bar of your repository to check the pre-built workflows. As you can see, GitHub Actions offers easy integrations with different tools, which is indispensable for a CI/CD tool.

  3. Your automated workflows must be defined in a GitHub Actions configuration file in the .github/workflows directory. It must be a YAML file. Go check out mine.

  4. You can modify the name of the job, the name of the workflow, the Python version, the name of the test, etc. You can also add more jobs to the workflow.

  5. Once you have created the YAML file, you can push it to your repository. This will trigger the workflow, and you will be able to see the results in the Actions tab of your repository.
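
A workflow step is marked as failed when the command it runs exits with a non-zero status, so a test script can act as a quality gate by choosing its exit code. The sketch below shows one possible way to do that. The function name `exit_code` and the 0.9 threshold are assumptions for illustration, not values taken from this repository.

```python
def exit_code(invariance_rate: float, threshold: float = 0.9) -> int:
    """Return 0 (pass) if the rate meets the threshold, otherwise 1 (fail)."""
    if not 0.0 <= invariance_rate <= 1.0:
        raise ValueError("invariance_rate must be between 0 and 1")
    return 0 if invariance_rate >= threshold else 1


# Demonstration with two made-up rates (not real results).
for rate in (0.95, 0.80):
    status = "pass" if exit_code(rate) == 0 else "fail"
    print(f"invariance rate {rate:.2f} -> {status}")

# In a real script, the last line would hand the code to the shell, e.g.:
#   import sys; sys.exit(exit_code(measured_rate))
```

With a gate like this, a drop in robustness shows up as a failed run in the Actions tab instead of going unnoticed.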

👥 Acknowledgements

I would like to thank the organizers of CommitConf and Codemotion for giving me the opportunity to share my knowledge with the community. I would also like to thank the community for their support and feedback.

The code in this repository is based on the work of Max Stocker. Go give his repository a star and his Medium article some claps.
