Hi @lorenzkuhn,
I wanted to bring to your attention a potential error in the computation of the P(True) baseline, unless I have misunderstood something here.
Currently, in the code snippet below, only the first `len(tokenized_base_prompt)` targets are set to -100:

semantic_uncertainty/code/get_prompting_based_uncertainty.py, lines 108 to 113 at commit 27adbf0
However, it seems that this approach does not ignore all of the context tokens when calculating the NLL loss, since `prompt_true` also includes the `few_shot_prompt` before the `base_prompt`:

semantic_uncertainty/code/get_prompting_based_uncertainty.py, lines 105 to 106 at commit 27adbf0
Not ignoring all of the context tokens in the NLL computation might lead to inaccurate results.
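For concreteness, here is a minimal sketch of the masking I would expect; the function and variable names are my own and this is not the repository's exact code, only an illustration of the point above:

```python
import torch


def build_p_true_targets(tokenizer, few_shot_prompt, base_prompt):
    """Sketch of how I would expect the P(True) targets to be masked."""
    prompt_true = few_shot_prompt + base_prompt + ' True'
    input_ids = torch.tensor(tokenizer(prompt_true)['input_ids'])

    target_ids = input_ids.clone()

    # Current behaviour, as I read lines 108-113: only the first
    # len(tokenized_base_prompt) positions are masked, so the few-shot
    # context tokens still contribute to the NLL.
    # target_ids[:len(tokenizer(base_prompt)['input_ids'])] = -100

    # What I would expect instead: mask the entire context
    # (few_shot_prompt + base_prompt) with -100, the ignore_index used by
    # PyTorch's cross-entropy loss, so that only the final ' True'
    # token(s) are scored.
    n_context = len(tokenizer(few_shot_prompt + base_prompt)['input_ids'])
    target_ids[:n_context] = -100

    return input_ids, target_ids
```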
Could you please provide some insights or clarification on this matter?
In addition, the current code only uses `n_samples_to_use = 2000` samples for the P(True) baseline. Are the experiment settings for P(True) different from those of the other baselines? I don't recall reading any explanation of this in the paper.