Implementation for the SIGIR 2021 paper (arXiv):

**Policy-Gradient Training of Fair and Unbiased Ranking Functions**

Himank Yadav\*, Zhengxiao Du\*, Thorsten Joachims (\*: equal contribution)
Clone the repo:

```shell
git clone https://github.com/him229/fultr
cd fultr
```
Please install PyTorch first, and then install the other dependencies with

```shell
pip install -r requirements.txt
```
The script `main.sh` contains the commands (based on Slurm) for running the various experiments in the paper.
The `datasets` folder contains links to download the datasets used in the experiments, together with the code we used to transform them into a form suitable for training. The `transformed_datasets` folder contains the final versions of the transformed datasets that we use directly for training.
We use the MSLR and German Credit datasets for training. They can be found online at the links below:

- MSLR-WEB30K (Fold 1): https://www.microsoft.com/en-us/research/project/mslr/
- German Credit: https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data)
To reproduce the preprocessing, download the raw datasets and save the files in `transformed_datasets/german/raw` and `transformed_datasets/mslr/raw`, respectively. (All of the following commands should be executed from the `datasets` directory.)
The German Credit dataset contains information about 1000 individuals, which we randomly split into train, validation, and test sets with a 1:1:1 ratio. We convert it into an LTR dataset by sampling, for each query, 20 individuals from the corresponding set with a 9:1 ratio of non-creditworthy to creditworthy individuals.

Group attribute: a binary feature indicating whether the loan purpose is radio/television (attribute id A43).

```shell
python preprocess-german.py
```
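The per-query sampling described above can be sketched as follows. This is an illustrative sketch, not the repo's `preprocess-german.py`; the `creditworthy` flag and the dictionary representation of individuals are assumptions for the example.

```python
import random

def build_queries(individuals, num_queries, seed=0):
    """Sketch: build LTR queries of 20 candidates each with a 9:1
    ratio of non-creditworthy to creditworthy individuals."""
    rng = random.Random(seed)
    # Partition the split by the (hypothetical) `creditworthy` flag.
    bad = [x for x in individuals if not x["creditworthy"]]
    good = [x for x in individuals if x["creditworthy"]]
    queries = []
    for _ in range(num_queries):
        # A 9:1 ratio over 20 candidates means 18 non-creditworthy
        # and 2 creditworthy individuals per query.
        candidates = rng.sample(bad, 18) + rng.sample(good, 2)
        rng.shuffle(candidates)
        queries.append(candidates)
    return queries
```

Sampling without replacement within each query keeps candidates distinct, while different queries may reuse the same individuals.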
For MSLR, we adopt the train, validation, and test split provided with the dataset. We binarize relevances by assigning 1 to items judged 3 or 4 and 0 to items judged 0, 1, or 2. Next, we remove queries with fewer than 20 candidates (to better compare the different methods and amplify their differences). For each remaining query, we sample 20 candidate items with at most 3 relevant items.

Group attribute: QualityScore (feature id 133), thresholded at its 40th percentile.

```shell
python preprocess-mslr.py
```
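The binarize/filter/sample steps above can be sketched as below. This is a hedged sketch under assumed data structures (a dict mapping query id to `(judgment, features)` pairs), not the repo's `preprocess-mslr.py`.

```python
import random

def preprocess_queries(queries, seed=0):
    """Sketch: binarize judgments, drop small queries, and sample
    20 candidates with at most 3 relevant items per query."""
    rng = random.Random(seed)
    out = {}
    for qid, items in queries.items():
        # Binarize: judgments 3 and 4 become 1; 0, 1, and 2 become 0.
        items = [(1 if rel >= 3 else 0, feats) for rel, feats in items]
        # Remove queries with fewer than 20 candidates.
        if len(items) < 20:
            continue
        relevant = [x for x in items if x[0] == 1]
        irrelevant = [x for x in items if x[0] == 0]
        # Keep at most 3 relevant items, then fill up to 20 candidates.
        rel_sample = rng.sample(relevant, min(3, len(relevant)))
        need = 20 - len(rel_sample)
        if len(irrelevant) < need:
            continue  # not enough non-relevant candidates to fill the query
        sampled = rel_sample + rng.sample(irrelevant, need)
        rng.shuffle(sampled)
        out[qid] = sampled
    return out
```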
To use your own dataset, first save it in the same format as MSLR, then run:

```shell
python preprocess-mslr.py --raw_directory <data directory> --output_directory <output directory> --no_log_features
```
We first train a conventional Ranking SVM on 1 percent of the full-information training data and use it as the logging policy. This logging policy then generates the rankings for which click data is logged.

The click data is generated by simulating a position-based examination model. We use a position bias that decays with the presented rank k of the item as v(k) = (1/k)^n, with n = 1 as the default (file: `generate_clicks_for_dataset.py`).
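The examination model above can be sketched in a few lines. This is an illustrative sketch of a position-based click model, not the repo's `generate_clicks_for_dataset.py`; the function name and the binary-relevance input are assumptions.

```python
import random

def simulate_clicks(relevances, eta=1.0, seed=0):
    """Sketch: an item at rank k (1-based) is examined with probability
    v(k) = (1/k)**eta; an examined relevant item is clicked."""
    rng = random.Random(seed)
    clicks = []
    for k, rel in enumerate(relevances, start=1):
        examined = rng.random() < (1.0 / k) ** eta
        clicks.append(1 if (examined and rel == 1) else 0)
    return clicks
```

Note that rank 1 has examination probability 1, so a relevant item presented first is always clicked under this model.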
To train the production ranker, first download SVM-Rank (http://www.cs.cornell.edu/people/tj/svm_light/svm_rank.html) and Propensity SVM-Rank (http://www.cs.cornell.edu/people/tj/svm_light/svm_proprank.html) into `svm_rank/` and `svm_proprank/`, respectively, and compile the software according to its instructions. Then run the following commands:

```shell
python production_ranker.py --dataset german
python production_ranker.py --dataset mslr
```
When using your own dataset:

```shell
python production_ranker.py --dataset mslr --data_directory <output directory for previous preprocessing>
```