There is some bug,Exception raised in Job[1]: ValidationError(1 validation error for ChatMessage role Input should be a valid string [type=string_type, input_value=None, input_type=NoneType] #1782

Open
Terran0629 opened this issue Dec 21, 2024 · 0 comments
Labels: bug (Something isn't working), module-metrics (this is part of metrics module)

Terran0629 commented Dec 21, 2024

[ ] I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug
When I used Ragas to evaluate a single question/answer sample, every metric job failed with the following error:
Exception raised in Job[1]: ValidationError(1 validation error for ChatMessage role Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]

Ragas version: >2.6.2
Python version: 3.10

Code to Reproduce

from langchain.schema import Document
from dotenv import load_dotenv
load_dotenv()  # load environment variables (e.g. the OpenAI API key) from a local .env file
import os
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI
from langchain.chains.retrieval import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.prompts import ChatPromptTemplate
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import (
    faithfulness,
    answer_relevancy,
    context_recall,
    context_precision,
)

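# Sample record (in Chinese): a news article ("news1") about the National Health
# Commission's "启明行动" myopia-prevention campaign, plus one question and its reference answer.
origin_data = {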
    "ID": "64fa9b27b82641eb8ecbe14c",
    "event": "2023年7月28日,国家卫生健康委在全国范围内开展“启明行动”——防控儿童青少年近视健康促进活动,发布《防控儿童青少年近视核心知识十条》。",
    "news1": "2023-07-28 10:14:27作者:白剑峰来源:人民日报    ,正文:为在全社会形成重视儿童眼健康的良好氛围,持续推进综合防控儿童青少年近视工作落实,国家卫生健康委决定在全国持续开展“启明行动”——防控儿童青少年近视健康促进活动,并发布了《防控儿童青少年近视核心知识十条》。本次活动的主题为:重视儿童眼保健,守护孩子明眸“视”界。强调预防为主,推动关口前移,倡导和推动家庭及全社会共同行动起来,营造爱眼护眼的视觉友好环境,共同呵护好孩子的眼睛,让他们拥有一个光明的未来。国家卫生健康委要求,开展社会宣传和健康教育。充分利用网络、广播电视、报纸杂志、海报墙报、培训讲座等多种形式,广泛开展宣传倡导,向社会公众传播开展儿童眼保健、保护儿童视力健康的重要意义,以《防控儿童青少年近视核心知识十条》为重点普及预防近视科学知识。创新健康教育方式和载体,开发制作群众喜闻乐见的健康教育科普作品,利用互联网媒体扩大传播效果,提高健康教育的针对性、精准性和实效性。指导相关医疗机构将儿童眼保健和近视防控等科学知识纳入孕妇学校、家长课堂内容。开展儿童眼保健及视力检查咨询指导。医疗机构要以儿童家长和养育人为重点,结合眼保健和眼科临床服务,开展个性化咨询指导。要针对儿童常见眼病和近视防控等重点问题,通过面对面咨询指导,引导儿童家长树立近视防控意识,改变不良生活方式,加强户外活动,养成爱眼护眼健康行为习惯。提高儿童眼保健专科服务能力。各地要积极推进儿童眼保健专科建设,扎实组织好妇幼健康职业技能竞赛“儿童眼保健”项目,推动各层级开展比武练兵,提升业务能力。",
    "questions": "国家卫生健康委在2023年7月28日开展的“启明行动”是为了防控哪个群体的哪种健康问题,并请列出活动发布的指导性文件名称。",
    "answers": "启明行动”是为了防控儿童青少年的近视问题,并发布了《防控儿童青少年近视核心知识十条》。"
}

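# Wrap the article in a LangChain Document, keeping the record ID as its source
docs = [Document(page_content=origin_data['news1'], metadata={'source': origin_data['ID']})]

# Split the article into 256-character chunks with a 50-character overlap
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=256,
    chunk_overlap=50,
)
split_docs = text_splitter.split_documents(docs)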

model_name = r'D:\Visual Studio Code\LLM\_LLM06期资料\09Advanced-RAG通用文档分享助手\基于langchain的文档分析助手\BAAI\bge-large-zh-v1.5'
encode_kwargs = {'normalize_embeddings': True}

embeddings = HuggingFaceBgeEmbeddings(
    model_name=model_name,
    encode_kwargs=encode_kwargs,
)

# Build the FAISS index, save it to disk, then reload it and expose a top-2 retriever
vectordb = FAISS.from_documents(split_docs, embeddings)
index_folder_path = r".\data\faiss_index"
index_name = "01"

vectordb.save_local(index_folder_path, index_name)
vectordb = FAISS.load_local(index_folder_path, embeddings, index_name, allow_dangerous_deserialization=True)
retriever = vectordb.as_retriever(search_kwargs={"k": 2})

llm = ChatOpenAI(temperature=0)
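# The prompt (in Chinese) reads: "Answer the following question based only on the
# provided context: <context>{context}</context> Question: {input}"
prompt = ChatPromptTemplate.from_messages(["""仅根据提供的上下文回答以下问题: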
<context>
{context}
</context>

问题:{input}"""])

question_answer_chain = create_stuff_documents_chain(llm, prompt)
chain = create_retrieval_chain(retriever, question_answer_chain)
response = chain.invoke({"input": origin_data['questions']})
print(response)
print('---------------------------------------------')

evaluate_data = {
    "question": [origin_data['questions']],  # must be a list
    "answer": [response['answer']],  # each answer must be a string
    "contexts": [[doc.page_content for doc in response['context']]],  # retrieved chunk texts, one list per question
    "ground_truth": [origin_data['answers']]  # each reference answer must be a string
}

print(evaluate_data)
evaluate_dataset = Dataset.from_dict(evaluate_data)

evaluate_result = evaluate(
    evaluate_dataset, 
    metrics=[
        faithfulness,
        answer_relevancy,
        context_recall,
        context_precision,
    ]
)

print(evaluate_result)
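
Before suspecting the library, it may be worth ruling out a malformed input dict. The sketch below checks the same four columns that `evaluate_data` uses above. `check_eval_columns` is a hypothetical helper written for this issue, not part of Ragas, and passing it does not explain the `ChatMessage` error; it only confirms that the inputs have the expected shape.

```python
def check_eval_columns(data):
    """Return a list of problems found in a column-oriented evaluation dict."""
    problems = []

    # Every column must be a list, and all columns must have the same length.
    lengths = {}
    for key, value in data.items():
        if isinstance(value, list):
            lengths[key] = len(value)
        else:
            problems.append(f"{key}: expected a list, got {type(value).__name__}")
    if len(set(lengths.values())) > 1:
        problems.append(f"columns have different lengths: {lengths}")

    # question / answer / ground_truth: one string per row.
    for key in ("question", "answer", "ground_truth"):
        value = data.get(key)
        if not isinstance(value, list):
            continue
        for i, item in enumerate(value):
            if not isinstance(item, str):
                problems.append(f"{key}[{i}]: expected str, got {type(item).__name__}")

    # contexts: one list of strings per row.
    contexts = data.get("contexts")
    if isinstance(contexts, list):
        for i, ctx in enumerate(contexts):
            if not isinstance(ctx, list) or not all(isinstance(c, str) for c in ctx):
                problems.append(f"contexts[{i}]: expected a list of str")

    return problems


# Placeholder rows standing in for the real evaluate_data dict.
sample = {
    "question": ["q"],
    "answer": ["a"],
    "contexts": [["chunk 1", "chunk 2"]],
    "ground_truth": ["g"],
}
print(check_eval_columns(sample) or "columns look well-formed")
```

In the script above, the call would be `check_eval_columns(evaluate_data)` just before `Dataset.from_dict`.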

Error trace

Evaluating:   0%|                                                                                                                            | 0/4 [00:00<?, ?it/s]
Exception raised in Job[3]: ValidationError(1 validation error for ChatMessage
role
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.10/v/string_type)
Evaluating:  25%|█████████████████████████████                                                                                       | 1/4 [00:04<00:14,  4.73s/it]
Exception raised in Job[2]: ValidationError(1 validation error for ChatMessage
role
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.10/v/string_type)
Evaluating:  50%|██████████████████████████████████████████████████████████                                                          | 2/4 [00:05<00:04,  2.32s/it]
Exception raised in Job[0]: ValidationError(1 validation error for ChatMessage
role
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.10/v/string_type)
Evaluating:  75%|███████████████████████████████████████████████████████████████████████████████████████                             | 3/4 [00:06<00:01,  1.66s/it]
Exception raised in Job[1]: ValidationError(1 validation error for ChatMessage
role
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.10/v/string_type)
Evaluating: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:06<00:00,  1.67s/it]
Traceback (most recent call last):
  File "d:\Visual Studio Code\LLM\_LLM05期资料\17-langchain和RAG进阶-24.11.13-于老师\01-简单RAG.py", line 79, in <module>
    evaluate_result = evaluate(
  File "D:\Anaconda\envs\DL_env\lib\site-packages\ragas\_analytics.py", line 205, in wrapper
    result = func(*args, **kwargs)
  File "D:\Anaconda\envs\DL_env\lib\site-packages\ragas\evaluation.py", line 333, in evaluate
    result = EvaluationResult(
  File "<string>", line 10, in __init__
  File "D:\Anaconda\envs\DL_env\lib\site-packages\ragas\dataset_schema.py", line 410, in __post_init__
    self.traces = parse_run_traces(self.ragas_traces, run_id)
  File "D:\Anaconda\envs\DL_env\lib\site-packages\ragas\callbacks.py", line 167, in parse_run_traces
    "output": prompt_trace.outputs.get("output", {})[0],
KeyError: 0
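
The final `KeyError: 0` is consistent with the expression on the last traceback line: if a trace has no `"output"` key, `.get("output", {})` returns the empty dict default, and indexing that dict with `0` raises `KeyError: 0`. A plausible reading (an assumption, since the trace contents are not shown here) is that the four failed jobs recorded no output, so the crash is a follow-on effect of the `ValidationError`s rather than a separate bug. The snippet below reproduces only that mechanism with a plain dict; `first_output` is a hypothetical defensive lookup for illustration, not Ragas code.

```python
# Stand-in for prompt_trace.outputs when no "output" key was recorded.
outputs = {}

error = None
try:
    outputs.get("output", {})[0]  # same expression shape as in callbacks.py
except KeyError as exc:
    error = exc
    print(f"KeyError: {exc}")


def first_output(outputs):
    """Return the first recorded output, or None if nothing was recorded."""
    value = outputs.get("output")
    if isinstance(value, (list, tuple)) and value:
        return value[0]
    return None


print(first_output(outputs))
```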

Additional context
My preliminary guess is that this is a version issue. With Python 3.10 and the following library versions, the same code runs normally:
langchain==0.2.5
langchain-community==0.2.5
langchain-core==0.2.7
langchain-openai==0.1.8
langchain-text-splitters==0.2.1
langdetect==1.0.9
langsmith==0.1.77
pydantic==2.0.2
pydantic_core==2.1.2
ragas==0.1.9
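
To compare a failing environment against the working pins above, the installed versions can be printed with the standard library. This is a minimal sketch: the package list simply mirrors the pins in this issue (minus `langdetect`), and `installed_versions` is a helper name chosen here for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages pinned in the working environment listed above.
PACKAGES = [
    "langchain",
    "langchain-community",
    "langchain-core",
    "langchain-openai",
    "langchain-text-splitters",
    "langsmith",
    "pydantic",
    "pydantic_core",
    "ragas",
]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}=={ver}" if ver else f"{name}: not installed")
```

Running this in both environments and diffing the output shows which packages differ between the working and failing setups.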

Terran0629 added the bug label on Dec 21, 2024
Terran0629 changed the title on Dec 21, 2024, from "Exception raised in Job[1]: ValidationError(1 validation error for ChatMessage role Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]" to "There is some bug,Exception raised in Job[1]: ValidationError(1 validation error for ChatMessage role Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]"
sahusiddharth added the module-metrics label on Jan 10, 2025