An ongoing little project to probe and visualize the output distributions of LLMs.

lsjlsj35/LLM-Distribution-Probe

Installation

pip install -r requirements.txt

Models and Datasets

  • Datasets are from alpaca_farm
  • Models: ChatGLM2, ChatGLM3, and Phi-2 (Microsoft).

Probe

main.py

Call probe_dist() to get the probability and rank of each token in the given QA pairs. Results are saved in qa_status.
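To make "probabilities and ranks" concrete, here is a minimal sketch of how they could be computed from per-step logits. It is not the code in main.py: the names softmax and probs_and_ranks are hypothetical, the logits are toy values rather than model output, and rank 1 is assumed to mean the most likely token.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def probs_and_ranks(step_logits, answer_ids):
    """Return (probability, rank) for each answer token.

    step_logits[i] holds the logits over the vocabulary at step i,
    answer_ids[i] is the token that actually appears in the answer.
    Rank 1 means the token was the most likely one (an assumption).
    """
    results = []
    for logits, tok in zip(step_logits, answer_ids):
        probs = softmax(logits)
        p = probs[tok]
        # Rank = 1 + number of tokens strictly more probable than this one.
        rank = 1 + sum(1 for q in probs if q > p)
        results.append((p, rank))
    return results


# Toy example: vocabulary of 4 tokens, answer of 2 tokens.
step_logits = [[2.0, 1.0, 0.0, -1.0], [0.0, 0.0, 3.0, 0.0]]
answer_ids = [1, 2]
results = probs_and_ranks(step_logits, answer_ids)
```

With a real model, step_logits would come from a forward pass over the question plus answer, aligned so that step i predicts answer token i.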

This project has been tested on the alpaca-preference dataset, so many methods are tailored to that dataset's specific format.

I plan to adjust the methods in the DistributionProbe class to abstract away dataset processing, support streaming output, and provide the full probability distribution for individual tokens.

Visualize

analyze.py

Different colors mark tokens with different probabilities and ranks, as predicted by a given model. In the default setting, darker tokens are ranked lower and lighter tokens are ranked higher.

White, green, yellow, and red represent progressively lower predicted probabilities, in that order. The controller class reads the file in qa_status and displays the results in the terminal.
[Image: pic1]
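The color scheme described above can be sketched with ANSI escape codes. This is an illustration only, not the code in analyze.py: the function names color_for and colorize are hypothetical, and the probability thresholds (0.5, 0.1, 0.01) are assumed values chosen for the example.

```python
RESET = "\033[0m"
# Standard ANSI foreground colors.
COLORS = {
    "white": "\033[37m",
    "green": "\033[32m",
    "yellow": "\033[33m",
    "red": "\033[31m",
}


def color_for(prob, thresholds=(0.5, 0.1, 0.01)):
    """Pick a color name; lower probability moves from white toward red.

    The threshold values are assumptions made for this sketch.
    """
    high, mid, low = thresholds
    if prob >= high:
        return "white"
    if prob >= mid:
        return "green"
    if prob >= low:
        return "yellow"
    return "red"


def colorize(tokens, probs):
    """Wrap each token in the escape code that matches its probability."""
    return "".join(
        COLORS[color_for(p)] + tok + RESET for tok, p in zip(tokens, probs)
    )


if __name__ == "__main__":
    # Toy tokens and probabilities, not real model output.
    print(colorize(["The", " cat", " sat", " quietly"], [0.9, 0.3, 0.05, 0.001]))
```

Running the script in a terminal that supports ANSI colors prints the four toy tokens in white, green, yellow, and red respectively.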
