Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Comments and tips for the prompt #55

Open
SalvatoreRa opened this issue Jul 30, 2023 · 5 comments
Open

Comments and tips for the prompt #55

SalvatoreRa opened this issue Jul 30, 2023 · 5 comments

Comments

@SalvatoreRa
Copy link

Hi,

very solid and useful works.

In this repository they suggest an approach similar to tree-of-thoughts but which should be done in one prompt

an example of this type of prompt:

Imagine three different experts are answering this question. All experts will write down 1 step of their thinking, then share it with the group. Then all experts will go on to the next step, etc. If any expert realises they're wrong at any point then they leave. The question is...

Another interesting approach has been described in this paper: Exploring the MIT Mathematics and EECS Curriculum Using Large Language Models where they have collected impressive datasets about college questions. The authors decided to test different techniques such as self-critique, the chain of thought, and few-shot to see how these impact. In addition, the authors decided to test a new approach the authors call expert prompting. In short, the authors ask the model to nominate experts for a question and what the response of these experts would be. Finally, based on these responses make a collective decision.

example of expert prompting:

# from the official repository: https://github.com/idrori/MITQ/blob/main/code/experts.py
generic_expert = f"an MIT Professor of {department} teaching the {course_name} course"
You are " + generic_expert + f". Give an educated guess of who are three experts most capable of solving the following question.
\n Question: {question}.\n Return a comma-separated list of three names."

#example from the article

E = You are an MIT Professor of Computer Science and Mathematics teaching Calculus I.
P3 = Give an educated guess of the three experts most capable of solving this question.
System: You are E.
User: Solve Q

About CoT an interesting article about CoT has just been published by anthropic [here](Measuring Faithfulness in Chain-of-Thought Reasoning) that could be interesting to include in the review

@EliverQ
Copy link
Collaborator

EliverQ commented Aug 2, 2023

Thank you very much for your recognition of our work. We will make revisions in the next version.
We would like to include you in the acknowledgments. Could you please provide your name?

@SalvatoreRa
Copy link
Author

thank you very much, my name is Salvatore Raieli

@SalvatoreRa
Copy link
Author

Thank you, I have seen the updated version and I would suggest some new research that could be interesting to mention:

Mistral 7B has been released with the technical article. It claims it has better performance than LLaMA version 2 (7B and 13B) parameters. It is interesting to note they are used for reducing inference costs. They created this prompt as Guardrail enforcement (to avoid that the model generated dangerous answer):

Always assist with care, respect, and truth. Respond with utmost utility yet securely. Avoid harmful, unethical, prejudiced, or negative content. Ensure replies promote fairness and positivity.

They claim this is not impacting performance while avoiding the model is not responding to unsafe prompts:
Ref: https://arxiv.org/pdf/2310.06825.pdf

Another interesting topic that could be incorporated is machine unlearning (which is important since many LLMs are trained on copyright data, harmful content, or personal data). Microsoft recently showed how to make “forget” Harry Potter to LLaMA 7B. The authors used a reinforced model that is trained further on the target data to better identify the data you want your model to forget (the tokens related to the topic). Then they compare the logits between the baseline model and the reinforced model for the target data. They replace idiosyncratic expressions in the target data with generic counterparts (using GPT-4 for the mapping in a dictionary) and use the model to generate alternative labels for the tokens. Lastly, they fine-tune the model on these alternative labels, erasing the memory of target data from the model. This approach seems to work without affecting the model performance on the reasoning benchmark
Ref: https://arxiv.org/pdf/2310.02238.pdf

Another interesting article from Microsoft shows that there are some surprising failure of generalization for LLM: a model trained on a sentence of the form “A is B”, often fail to generalize to the reverse direction “B is A”
Ref: https://arxiv.org/abs/2309.12288v2

The author of this work finds an approach that finds a suffix that, when attached to a wide range of queries for an LLM to produce objectionable content, aims to maximize the probability that the model produces an affirmative response. The approached worked with different models (ChatGPT, Bard, and Claude but also open-source LLaMA-2-Chat, Pythia, Falcon). They developed an algorithm able to hijack the model constraint without the need of manual engineering
Ref: https://arxiv.org/abs/2307.15043

@StevenTang1998
Copy link
Member

Thanks again for your continued interest and valuable suggestions regarding our survey. We are currently undergoing a new revision, and it is expected to be updated in nearly one month.

@SalvatoreRa
Copy link
Author

your work is actually amazing and thank you for keeping updated that incredible survey.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants