Neutralizing Biased Text

This repo contains code for the paper, "Automatically Neutralizing Subjective Bias in Text".

Concretely, this means algorithms for:

  • Identifying biased words in sentences.
  • Neutralizing bias in sentences.
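To make the two tasks concrete, here is a toy sketch of their inputs and outputs. It is not the method from the paper: the lexicon, the function names, and the delete-the-word edit are all assumptions chosen for illustration, whereas the models in this repo are learned from data.

```python
from typing import List, Set

# Hypothetical toy lexicon; the real models learn which words are biased.
TOY_BIAS_LEXICON: Set[str] = {"notorious", "so-called", "brilliant"}


def identify_biased_words(tokens: List[str],
                          lexicon: Set[str] = TOY_BIAS_LEXICON) -> List[int]:
    """Task 1: return the positions of tokens flagged as biased."""
    return [i for i, tok in enumerate(tokens) if tok.lower() in lexicon]


def neutralize(tokens: List[str],
               lexicon: Set[str] = TOY_BIAS_LEXICON) -> List[str]:
    """Task 2: return a neutralized token sequence.

    Deleting flagged tokens is the simplest possible edit and is only an
    assumption here; a real system may also replace or rephrase words.
    """
    flagged = set(identify_biased_words(tokens, lexicon))
    return [tok for i, tok in enumerate(tokens) if i not in flagged]


if __name__ == "__main__":
    sentence = "The notorious senator gave a speech".split()
    print(identify_biased_words(sentence))  # [1]
    print(" ".join(neutralize(sentence)))   # The senator gave a speech
```

The first function maps a sentence to word positions and the second maps a sentence to a rewritten sentence, which are the input and output shapes of the two tasks listed above.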


This repo was tested with Python 3.7.7.

Quickstart

These commands download the data and a pretrained model, then run inference.

$ cd src/
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt
$ python
>>> import nltk; nltk.download("punkt")
>>> exit()
$ sh download_data_ckpt_and_run_inference.sh

You can also run sh integration_test.sh to verify that everything is installed and working correctly.

Data

Click this link to download the data (100 MB, expands to 500 MB).

Pretrained model

Click this link to download a model checkpoint. We used this command to train it.

Overview

harvest/: Code for making the dataset. It works by crawling and filtering Wikipedia for bias-driven edits.
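As a rough illustration of the filtering idea, the sketch below keeps only revisions whose edit comment mentions POV or NPOV and whose text actually changed. This is an assumption about how a bias-driven edit might be recognized, not the code in harvest/; the `Revision` type, the regular expression, and the function names are all hypothetical.

```python
import re
from typing import Iterable, List, NamedTuple


class Revision(NamedTuple):
    """Hypothetical record for one crawled Wikipedia revision."""
    rev_id: str
    comment: str  # the editor's revision comment
    before: str   # text before the edit
    after: str    # text after the edit


# Assumed heuristic: the comment mentions "POV" or "NPOV" as a whole word.
NPOV_PATTERN = re.compile(r"\bn?pov\b", re.IGNORECASE)


def is_bias_driven(rev: Revision) -> bool:
    """True if the revision comment suggests a neutrality-motivated edit."""
    return bool(NPOV_PATTERN.search(rev.comment))


def filter_revisions(revisions: Iterable[Revision]) -> List[Revision]:
    """Keep bias-driven revisions that actually changed the text."""
    return [r for r in revisions if is_bias_driven(r) and r.before != r.after]


if __name__ == "__main__":
    crawled = [
        Revision("1", "removed POV language", "a brilliant plan", "a plan"),
        Revision("2", "fixed typo", "teh plan", "the plan"),
        Revision("3", "NPOV cleanup", "same text", "same text"),
    ]
    print([r.rev_id for r in filter_revisions(crawled)])  # ['1']
```

A real pipeline would need further filters (for example, on edit size or vandalism reverts); see harvest/README.md for what the actual code does.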

src/: Code for training models and using trained models to run inference. The models implemented here are referred to as MODULAR and CONCURRENT in the paper.

Usage

Please see src/README.md for bias neutralization directions.

See harvest/README.md for making a new dataset (as opposed to downloading the one available above).