This folder contains implementations for toxicity detection benchmarks on LLM360 models. The benchmark measures a model's capability to identify toxic text.
Here's a list of toxicity detection benchmarks we have implemented so far.
`single_ckpt_toxic_detection.py` is the main entrypoint for evaluating toxicity detection on a single model. It uses the Python modules in the `utils/` folder.
The `utils/` folder contains helper functions for model/dataset IO:

- `data_utils.py`: Dataset preparation for all benchmarks
- `model_utils.py`: Model loader
By default, the evaluation results are saved in `./{model_name}_results.jsonl`.
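
Because the output is a JSON Lines file, the records can be inspected with a few lines of standard-library Python. The sketch below assumes the usual one-JSON-object-per-line layout implied by the `.jsonl` extension; the example path and record fields are illustrative, not the exact schema written by the script.

```python
# Minimal sketch for inspecting the evaluation output.
# Assumption: {model_name} was "Amber" for this run, and each line of the
# .jsonl file holds one JSON object per evaluated sample (field names will
# depend on the benchmark).
import json

results_path = "./Amber_results.jsonl"  # substitute the model_name from your run

with open(results_path) as f:
    records = [json.loads(line) for line in f if line.strip()]

print(f"Loaded {len(records)} records")
print(records[0])  # inspect one record to see which fields the run produced
```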
To set up the environment:

- Clone and enter the folder:

  ```bash
  git clone https://github.com/LLM360/Analysis360.git
  cd analysis360/analysis/safety360/toxic_detection
  ```
- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
An example usage is provided in `demo.ipynb`, which can be executed on a single A100 80GB GPU.
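
For orientation before opening the notebook, the following is a minimal, self-contained sketch of the general shape of such an evaluation: load a causal LM checkpoint, prompt it to judge whether each text is toxic, and write the predictions to a results JSONL file. It is an illustration only, not the code in `single_ckpt_toxic_detection.py` or `demo.ipynb`; the checkpoint name, prompt wording, yes/no parsing, toy samples, and output fields are all assumptions.

```python
# Illustrative sketch only -- not the repo's actual evaluation code.
# Assumes the Hugging Face `transformers` and `torch` packages and an
# LLM360 checkpoint hosted on the Hub (the name below is an assumption).
import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "LLM360/Amber"  # assumed checkpoint; any causal LM works
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)
model.eval()

# Toy samples; a real benchmark would load a labeled toxicity dataset instead.
samples = [
    {"text": "Have a wonderful day!", "label": 0},
    {"text": "You are worthless and everyone hates you.", "label": 1},
]

results = []
for sample in samples:
    prompt = (
        "Is the following text toxic? Answer with 'yes' or 'no'.\n\n"
        f"Text: {sample['text']}\nAnswer:"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=3,
            do_sample=False,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Decode only the newly generated tokens and map them to a 0/1 prediction.
    answer = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    results.append(
        {
            "text": sample["text"],
            "label": sample["label"],
            "prediction": int("yes" in answer.lower()),
        }
    )

# Mirror the default output location described above (model_name assumed "Amber").
with open("./Amber_results.jsonl", "w") as f:
    for record in results:
        f.write(json.dumps(record) + "\n")
```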