SUCCESS: Global Search Response: I am sorry but I am unable to answer this question given the provided data. #44

Open
galen36123612 opened this issue Aug 14, 2024 · 7 comments


@galen36123612

I am using Anaconda to build my own project. I am using Python version 3.10.14, downloaded Ollama, pulled Mistral as my LLM, and pulled Nomic-Embed-Text as my embedding model. I followed the instructions step by step. Below are screenshots from when I ran the command `python -m graphrag.index --root ./ragtest`.

[Five screenshots of the indexing run output, taken 2024-08-14]

Everything seems to look fine when I build my GraphRAG index. However, when I run the query command, it doesn't produce the expected output.

[Screenshot of the query command output, taken 2024-08-14]

Has anyone had the same or a similar experience? Thanks a lot!

@ahmedkooli

same issue, also following the tutorial

@ahmedkooli

This error seems to happen because the generated response is not valid JSON (it's not in the format `{"key1": "value1", "key2": "value2"}`), even though the prompt is supposed to enforce that JSON format.

I didn't dig further, but my guess is that if the LLM's output doesn't follow the required JSON format, the query simply fails.
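To make that concrete, here is a rough, self-contained sketch of the failure mode I have in mind (`parse_points` is a hypothetical stand-in, not the repo's actual code path): the global search expects a reply it can parse as JSON with a `points` list, and a plain-prose answer raises a decode error, which presumably then surfaces as the generic "unable to answer" response.

```python
import json

def parse_points(search_response: str):
    # Hypothetical illustration of the parsing step, not graphrag's actual code.
    # The global-search prompt asks for {"points": [{"description": ..., "score": ...}]}.
    data = json.loads(search_response)  # raises json.JSONDecodeError on plain prose
    return data["points"]

good = '{"points": [{"description": "Transformers use self-attention [Data: Reports (1)]", "score": 85}]}'
bad = "Transformer Neural Networks are a type of deep learning model..."

print(parse_points(good))  # a list of point dicts, as the query pipeline expects

try:
    parse_points(bad)
except json.JSONDecodeError as e:
    print("plain text instead of JSON ->", e)  # the case reported in this issue
```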

@TheAiSingularity, if you have any idea that could confirm or rule this out, it would be very helpful, as I'm seeing multiple users hit the same error while reproducing the tutorial. Thanks a lot!

@TheAiSingularity
Owner

TheAiSingularity commented Aug 17, 2024

Well, the issue is not with the repo implementation. It works fine if the mentioned steps are followed exactly, and we have tested it several times in our environments with different settings.

You can play around with the parameters in the settings.yaml file and see.

I suspect the issue is with other configuration on the user's system. The same setup has worked for a significant number of people, hence the repo's stars.

We have observed one pattern, though: if we just change the question and query GraphRAG again, we are able to retrieve a response from the same successfully generated database.

Hope this helps.

@zmansoft

zmansoft commented Aug 22, 2024

> This error seems to happen because the generated response is not valid JSON (it's not in the format `{"key1": "value1", "key2": "value2"}`), even though the prompt is supposed to enforce that JSON format.
>
> I didn't dig further, but my guess is that if the LLM's output doesn't follow the required JSON format, the query simply fails.

I ran this command:

`python -m graphrag.query --root ./ragtest --method global "What is Transformer Neural Networks?"`

I added print(search_response) and got the following:

> Transformer Neural Networks are a type of deep learning model introduced in the paper "Attention is All You Need" by Vaswani et al. (2017). The Transformer model is primarily used for sequence-to-sequence tasks, such as machine translation, text summarization, and speech recognition. The key innovation of the Transformer model lies in its use of self-attention mechanisms instead of recurrence or convolution to handle sequential data. Self-attention allows the model to weigh the importance of different words within a sequence when generating an output for that sequence. This makes it possible to capture long-range dependencies more effectively compared to traditional RNNs (Recurrent Neural Networks) and CNNs (Convolutional Neural Networks). The Transformer architecture consists of multiple layers, each containing several self-attention mechanisms called multi-head attention layers. These layers allow the model to attend to different aspects or subspaces of the input sequence simultaneously. Additionally, the Transformer includes position-wise feedforward networks (FFNs) for modeling local dependencies within the sequence. The success of the Transformer model has led to its widespread adoption in various natural language processing tasks and inspired further research into attention mechanisms and transformer architectures.

search_response is plain text instead of JSON.

I tried mistral and qwen2:72b and got the same result.

@zmansoft

zmansoft commented Aug 22, 2024

It seems some LLMs (mistral, qwen2:72b) cannot follow the JSON format when the instruction `\n\nThe response should be JSON formatted as follows:\n{\n \"points\": [\n {\"description\": \"Description of point 1 [Data: Reports (report ids)]\", \"score\": score_value},\n {\"description\": \"Description of point 2 [Data: Reports (report ids)]\", \"score\": score_value}\n ]\n` only appears in the system role's content.

So I simply added it to the user role's content, like this:

    search_messages = [
        {"role": "system", "content": search_prompt},
        # repeat the JSON-format requirement in the user message as well
        {"role": "user", "content": query + '\n\nThe response should be JSON formatted as follows:\n{\n \"points\": [\n {\"description\": \"Description of point 1 [Data: Reports (report ids)]\", \"score\": score_value},\n {\"description\": \"Description of point 2 [Data: Reports (report ids)]\", \"score\": score_value}\n ]\n'},
    ]

It's a little ugly, but it really works :)
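For completeness, a tiny self-contained check that the patched messages now produce something the downstream code can consume; the sample search_response below is just a stand-in for what the model returns:

```python
import json

# search_response stands in for the raw string the model returns for search_messages above;
# with the format instruction repeated in the user message, it should now be valid JSON.
search_response = (
    '{"points": ['
    '{"description": "Transformers rely on self-attention [Data: Reports (2)]", "score": 90}'
    ']}'
)

data = json.loads(search_response)
for point in data["points"]:
    print(f'{point["score"]}: {point["description"]}')
```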
@TheAiSingularity

@galen36123612
Author

Thanks for the help! I really appreciate it.

@henrylee0324

The reason this error occurs is that mistral is not capable enough to follow the instruction; it cannot reliably output JSON format.
The solution is to switch from mistral to llama3.
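Before swapping models, it can be worth checking directly whether a locally pulled model follows a JSON instruction at all. A rough standalone sketch, assuming Ollama is running on its default port 11434 and both models have been pulled; FORMAT_HINT is a shortened stand-in for the full instruction quoted earlier, and none of this is the repo's code:

```python
import json
import requests

# Hypothetical standalone check, not part of graphrag-local-ollama: ask a locally pulled
# Ollama model to answer in roughly the shape the global search expects and see whether
# the reply parses as JSON.
FORMAT_HINT = (
    "\n\nRespond only with JSON of the form "
    '{"points": [{"description": "...", "score": 0}]}'
)

def follows_json(model: str, question: str) -> bool:
    resp = requests.post(
        "http://localhost:11434/api/chat",  # default local Ollama endpoint
        json={
            "model": model,
            "messages": [{"role": "user", "content": question + FORMAT_HINT}],
            "stream": False,
        },
        timeout=300,
    )
    resp.raise_for_status()
    content = resp.json()["message"]["content"]
    try:
        json.loads(content)
        return True
    except json.JSONDecodeError:
        return False

for name in ("mistral", "llama3"):
    print(name, follows_json(name, "What is Transformer Neural Networks?"))
```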
