Unfortunately, closed-source large language models generally do not provide logprobs in their predictions. ChatGPT, Claude, Mistral-Large, and similar models do not expose these logprobs and therefore cannot be used with the technique proposed in the paper.
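To illustrate why full logprobs matter, here is a minimal sketch (not the library's implementation) of combining the next-token distributions of two open models with a weighted sum of log-probabilities. The model names, weights, and prompt are illustrative assumptions; the point is that the combination needs the entire logprob vector, which closed APIs do not return.

```python
# Minimal sketch: composing two models' next-token distributions requires
# their full logprob vectors over the vocabulary.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
m1 = AutoModelForCausalLM.from_pretrained("gpt2")        # illustrative choice
m2 = AutoModelForCausalLM.from_pretrained("distilgpt2")  # shares GPT-2's vocabulary

ids = tok("The weather today is", return_tensors="pt").input_ids
with torch.no_grad():
    lp1 = torch.log_softmax(m1(ids).logits[0, -1], dim=-1)
    lp2 = torch.log_softmax(m2(ids).logits[0, -1], dim=-1)

# A linear combination like this only makes sense if both vectors cover the
# whole vocabulary; closed APIs return at most a few top-k logprobs, so this
# step cannot be carried out for them. Weights 1.5 and -0.5 are arbitrary.
combined = 1.5 * lp1 - 0.5 * lp2
next_token = tok.decode(combined.argmax().item())
print(next_token)
```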
When I followed the steps to reproduce the results and then ran evaluate_toxicity.py, I encountered the following error:
| ERROR | main::129 - An error has been caught in function '<module>', process 'MainProcess' (5179), thread 'MainThread' (139954086889280):
Traceback (most recent call last):
File "/root/autodl-tmp/language-model-arithmetic/scripts/evaluate_toxicity.py", line 134, in <module>
first_model = formula.runnable_operators()[0].model
└ <model_arithmetic.runnable_operators.PromptedLLM object at 0x7f499986bc10>
AttributeError: 'PromptedLLM' object has no attribute 'runnable_operators'
This bug should now be fixed, apologies for that. Note that for reproducing our results, we advise using the "v1.0" branch, where this bug should not occur.
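If switching branches is not an option, a defensive check around the failing line is one possible stopgap. This is a hypothetical patch, not the repository's fix: only `formula.runnable_operators()[0].model` and the `PromptedLLM` class appear in the traceback, and the fallback assumes a bare `PromptedLLM` exposes a `.model` attribute of its own.

```python
# Hypothetical workaround around the failing line in scripts/evaluate_toxicity.py
# (line 134 in the report). Attribute names beyond those in the traceback are
# assumptions, not the repo's documented API.
if hasattr(formula, "runnable_operators"):
    # Formula object (e.g. a full model-arithmetic expression): use its first operator's model
    first_model = formula.runnable_operators()[0].model
else:
    # A bare PromptedLLM was passed directly: fall back to its own model (assumed attribute)
    first_model = formula.model
```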
Can this methodology be applied to closed-source large-scale models such as ChatGPT?