New feature: use a Reranker model to rerank recalled passages #2435

Merged: hzg0601 merged 2 commits into dev from reranker on Dec 21, 2023

@hzg0601 (Collaborator) commented Dec 21, 2023

No description provided.

@dosubot (bot) added the size:L label (This PR changes 100-499 lines, ignoring generated files.) Dec 21, 2023
@hzg0601 merged commit d77f778 into dev Dec 21, 2023
@hzg0601 deleted the reranker branch December 22, 2023 04:10
youfak pushed a commit to youfak/Langchain-Chatchat that referenced this pull request Jan 3, 2024
commit ad0f18e
Merge: 9dd7d23 61bc815
Author: Bad Child <[email protected]>
Date:   Wed Jan 3 10:08:24 2024 +0800

    Merge branch 'chatchat-space:master' into master

commit 61bc815
Author: liunux4odoo <[email protected]>
Date:   Tue Jan 2 11:26:03 2024 +0800

    fix: Chinese comma in requirements (chatchat-space#2523)

commit 7d4a6b5
Author: liunux4odoo <[email protected]>
Date:   Tue Jan 2 09:54:23 2024 +0800

    fix: ApiRequest.agent_chat should return dict rather than str (chatchat-space#2520)

commit 3c33ca7
Author: imClumsyPanda <[email protected]>
Date:   Sun Dec 31 20:15:35 2023 +0800

    Release v0.2.9

commit f1ae95c
Author: imClumsyPanda <[email protected]>
Date:   Sun Dec 31 20:14:01 2023 +0800

    fix typos

commit 719e271
Author: imClumsyPanda <[email protected]>
Date:   Sun Dec 31 20:13:14 2023 +0800

    fix typos

commit 349de9b
Merge: c179230 e6c376f
Author: imClumsyPanda <[email protected]>
Date:   Sun Dec 31 19:25:01 2023 +0800

    Merge branch 'master' into dev

commit e6c376f
Author: imClumsyPanda <[email protected]>
Date:   Sun Dec 31 19:24:42 2023 +0800

    update pics

commit c179230
Author: liunux4odoo <[email protected]>
Date:   Fri Dec 29 09:44:37 2023 +0800

    remove /chat/fastchat API endpoint (chatchat-space#2506)

commit 3b28f40
Author: liunux4odoo <[email protected]>
Date:   Fri Dec 29 09:31:16 2023 +0800

    update requirements: unify dependency order across files for easier comparison; remove streamlit-antd-components; install jq by default; pin numexpr to 2.8.6 for py38 compatibility

commit 5cccd5e
Merge: 9ff7bef af38f75
Author: liunux4odoo <[email protected]>
Date:   Fri Dec 29 09:10:01 2023 +0800

    merge from master

commit af38f75
Author: imClumsyPanda <[email protected]>
Date:   Thu Dec 28 15:50:30 2023 +0800

    Update README.md

commit a8f94dd
Author: imClumsyPanda <[email protected]>
Date:   Thu Dec 28 15:49:59 2023 +0800

    Add files via upload

commit 1f3a32e
Author: liunux4odoo <[email protected]>
Date:   Thu Dec 28 07:57:25 2023 +0800

    fix Yi-34b model config error(close chatchat-space#2491) (chatchat-space#2492)

commit 9ff7bef
Author: liunux4odoo <[email protected]>
Date:   Tue Dec 26 13:44:36 2023 +0800

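    New feature: knowledge base management UI supports viewing, editing, and deleting vector store documents (chatchat-space#2471)

    * New features:
    - The knowledge base management UI supports viewing, editing, and deleting vector store documents. Adding is not supported yet (adding new rows in aggrid is cumbersome and needs a separate implementation)
    - Removed the rebuild-knowledge-base and delete-knowledge-base buttons from the management UI; users are advised to run these operations from the terminal

    Fixes:
    - All database functions involving knowledge base names or file names are now case-insensitive, and all paths are normalized to posix style, to avoid duplicate data and failed operations caused by inconsistent path text (close chatchat-space#2232)

    Developers:
    - Added the update_docs_by_id function and API endpoint. Currently only FAISS is supported; it is not used yet and prepares for finer-grained knowledge base edits in the future
    - Unified DocumentWithScore and DocumentWithVsId
    - Document.metadata returned by FAISS now includes the ID, making later lookup and comparison easier
    - The /knowledge_base/search_docs endpoint supports file_name and metadata parameters for retrieving documents

    * fix bug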

commit 2e1442a
Author: zR <[email protected]>
Date:   Sat Dec 23 11:36:11 2023 +0800

    Fix OpenAI online embedding not working after the Langchain update

commit 4e5bc8b
Author: liunux4odoo <[email protected]>
Date:   Fri Dec 22 10:17:00 2023 +0800

    Fix: return an accurate error message when a zhipu-api request fails

commit 4e69033
Author: imClumsyPanda <[email protected]>
Date:   Thu Dec 21 22:19:41 2023 +0800

    Update README.md

commit 778d2d9
Author: imClumsyPanda <[email protected]>
Date:   Thu Dec 21 22:19:12 2023 +0800

    Add files via upload

commit d77f778
Merge: 60510ff 129c765
Author: Zhi-guo Huang <[email protected]>
Date:   Thu Dec 21 19:06:59 2023 +0800

    Merge pull request chatchat-space#2435 from chatchat-space/reranker

    New feature: use a Reranker model to rerank recalled passages

commit 129c765
Author: hzg0601 <[email protected]>
Date:   Thu Dec 21 19:05:11 2023 +0800

    New feature: reranker reranks the texts recalled by vector search

commit 5891f94
Author: hzg0601 <[email protected]>
Date:   Thu Dec 21 16:04:15 2023 +0800

    temporarily add reranker

commit 60510ff
Author: Zhi-guo Huang <[email protected]>
Date:   Wed Dec 20 13:33:00 2023 +0800

    Update model_config.py.example

commit c1a32d9
Author: Funkeke <[email protected]>
Date:   Wed Dec 20 08:40:53 2023 +0800

    fix: error when using an online embedding model: There is no current event loop in thread 'Any… (chatchat-space#2393)

    * fix: error when using an online embedding model: There is no current event loop in thread 'AnyIO worker thread'

    * Dynamically configure the online embedding model

    ---------

    Co-authored-by: fangkeke <[email protected]>

commit fdea406
Author: liunux4odoo <[email protected]>
Date:   Tue Dec 19 15:59:41 2023 +0800

    update requirements: unify dependency order across files for easier comparison; remove streamlit-antd-components; install jq by default

commit 9dd7d23
Merge: 53f048d bba4754
Author: Bad Child <[email protected]>
Date:   Tue Dec 19 14:52:01 2023 +0800

    Merge branch 'chatchat-space:master' into master

commit 53f048d
Author: youfak <[email protected]>
Date:   Tue Dec 19 14:47:56 2023 +0800

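    Add a global shared variable "LLM_MODEL_NAME" to store the model_name in agent_chat.py before it is initialized by the get_ChatOpenAI method
    Fix the issue where, when ChatGPT uses the agent template, the local knowledge base contains relevant data but none is returned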

commit bba4754
Author: imClumsyPanda <[email protected]>
Date:   Mon Dec 18 15:21:27 2023 +0800

    Update README.md

commit 9fdeb47
Author: imClumsyPanda <[email protected]>
Date:   Mon Dec 18 15:20:58 2023 +0800

    Add files via upload

commit a870076
Author: huangzhiguo <[email protected]>
Date:   Fri Dec 15 14:23:34 2023 +0800

    Add instructions for launching quantized qwen models to model_config.py.example

commit 7e01e82
Author: Astlvk <[email protected]>
Date:   Fri Dec 15 07:54:36 2023 +0800

    fixed incorrect iterator argument passing that made knowledge base QA raise TypeError: unhashable type: 'list' (chatchat-space#2383)

    Co-authored-by: liunux4odoo <[email protected]>

commit 7e8391e
Author: xldistance <[email protected]>
Date:   Fri Dec 15 07:53:36 2023 +0800

    Fix incorrect argument passing in knowledge_base_chat_iterator (chatchat-space#2386)

commit 332e1cc
Author: xldistance <[email protected]>
Date:   Thu Dec 14 19:56:39 2023 +0800

    Update: fix incorrect assignment of self.dims_length (chatchat-space#2380)

commit e7410e4
Author: jaluik <[email protected]>
Date:   Thu Dec 14 16:32:05 2023 +0800

    fix: documentation error (chatchat-space#2384)

commit 1cbad32
Author: imClumsyPanda <[email protected]>
Date:   Wed Dec 13 18:56:43 2023 +0800

    Update README.md

commit f45d6ab
Author: imClumsyPanda <[email protected]>
Date:   Wed Dec 13 18:56:20 2023 +0800

    Add files via upload

commit 9c5b81c
Author: lookou <[email protected]>
Date:   Wed Dec 13 17:02:27 2023 +0800

    Optimize EventSource response packets (chatchat-space#1200)

    Support the SSE protocol for JS frontends via sse_starlette (Close chatchat-space#2333)

    Co-authored-by: liunux4odoo <[email protected]>

commit 472a97a
Author: liunux4odoo <[email protected]>
Date:   Wed Dec 13 16:08:58 2023 +0800

    Switch all chat endpoints to EventSourceResponse; update ApiRequest accordingly

commit c8fef33
Author: liunux4odoo <[email protected]>
Date:   Wed Dec 13 16:52:40 2023 +0800

    merge from dev

commit db008c1
Author: hzg0601 <[email protected]>
Date:   Wed Dec 13 15:52:11 2023 +0800

    Add extra index and search configuration for milvus

commit 2604c9e
Author: liunux4odoo <[email protected]>
Date:   Tue Dec 12 21:12:33 2023 +0800

    fix: prompt template name error in file_chat (chatchat-space#2366)

commit 10d8f59
Author: youfak <[email protected]>
Date:   Mon Dec 11 18:00:59 2023 +0800

    Fix an error when initializing the vector database with openai under
    langchain==0.0.344
    langchain-experimental>=0.0.42

commit 95c0998
Author: hzg0601 <[email protected]>
Date:   Mon Dec 11 11:22:39 2023 +0800

    Provide a workaround for chatglm3-6b emitting role tags such as <|user|> and answering its own questions

commit db1c1e2
Author: hzg0601 <[email protected]>
Date:   Mon Dec 11 10:39:59 2023 +0800

    Fix index creation failing when es is called multiple times

commit 2e93198
Author: liunux4odoo <[email protected]>
Date:   Fri Dec 8 15:04:08 2023 +0800

    support .xls files

commit 7b70776
Author: liunux4odoo <[email protected]>
Date:   Fri Dec 8 13:59:55 2023 +0800

    fix: doc ext name error in LOADER_DICT

commit fdd6eb5
Author: imClumsyPanda <[email protected]>
Date:   Thu Dec 7 16:07:24 2023 +0800

    Update README.md

commit bcbeb9d
Author: imClumsyPanda <[email protected]>
Date:   Thu Dec 7 16:07:00 2023 +0800

    Add files via upload

commit ad0b133
Author: hzg0601 <[email protected]>
Date:   Wed Dec 6 21:57:59 2023 +0800

    Fix the faiss similarity threshold not being between 0 and 1

commit 4c2fda7
Merge: d1f94c2 1fac51f
Author: hzg0601 <[email protected]>
Date:   Wed Dec 6 20:42:32 2023 +0800

    Merge branch 'dev' of https://github.com/chatchat-space/Langchain-Chatchat into dev

commit d1f94c2
Author: hzg0601 <[email protected]>
Date:   Wed Dec 6 20:42:27 2023 +0800

    temporarily add reranker

commit 1fac51f
Author: hzg0601 <[email protected]>
Date:   Wed Dec 6 09:45:56 2023 +0000

    temporarily save faiss_cache

commit 67b7c99
Author: liunux4odoo <[email protected]>
Date:   Mon Dec 4 09:39:56 2023 +0800

    OCR supports GPU acceleration (requires manually installing rapidocr_paddle[gpu]); the knowledge base supports MHTML and Evernote files. (chatchat-space#2265)

    Add notes on optional document loader SDKs to requirements and the Wiki ( close chatchat-space#2264 )

commit 7d2de47
Author: liunux4odoo <[email protected]>
Date:   Sat Dec 2 19:22:44 2023 +0800

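    Make the file chat and knowledge base chat API endpoints fully asynchronous to prevent blocking (chatchat-space#2256)

    * EmbeddingFunAdapter supports async operation; the file chat and knowledge base chat API endpoints are fully asynchronous to prevent blocking

    * Fix: make list_files_from_folder return relative paths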

commit dcb7698
Author: zR <[email protected]>
Date:   Sat Dec 2 16:50:56 2023 +0800

    Fix bug where Max token was not set for Azure (chatchat-space#2254)

commit 12113be
Author: liunux4odoo <[email protected]>
Date:   Sat Dec 2 10:52:29 2023 +0800

    Automatically run create_tables in startup to ensure the database tables are created

# Conflicts:
#	document_loaders/ocr.py
#	server/agent/tools/search_knowledgebase_complex.py
#	server/chat/knowledge_base_chat.py
#	server/knowledge_base/kb_service/es_kb_service.py
#	server/knowledge_base/utils.py
@helix-dan commented

What kinds of problems does this solve?

@hzg0601 (Collaborator, Author) commented Jan 18, 2024

It reranks the retrieval results, which may improve recall and MRR (mean reciprocal rank).
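
The reply above can be illustrated with a minimal sketch. It is not the code from this PR: `rerank`, `token_overlap`, and `reciprocal_rank` are hypothetical helpers, and `token_overlap` is a toy stand-in for a real reranker model. The sketch only shows the general idea of re-scoring recalled passages against the query and how that can move the relevant passage up, which is what raises the reciprocal rank (MRR is its mean over many queries) and recall within the top k.

```python
from typing import Callable, List, Sequence


def rerank(query: str, docs: Sequence[str],
           score: Callable[[str, str], float], top_n: int) -> List[str]:
    """Re-order recalled docs by a query/doc relevance score and keep top_n."""
    ranked = sorted(docs, key=lambda d: score(query, d), reverse=True)
    return list(ranked[:top_n])


def token_overlap(query: str, doc: str) -> float:
    """Toy stand-in for a reranker model (assumption, not the PR's model):
    fraction of query tokens that also appear in the doc."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def reciprocal_rank(ranked: Sequence[str], relevant: str) -> float:
    """1 / rank of the relevant doc, or 0.0 if it is not in the list."""
    for i, d in enumerate(ranked, start=1):
        if d == relevant:
            return 1.0 / i
    return 0.0


query = "how to enable the reranker"
# Order as returned by a hypothetical vector recall step.
recalled = [
    "milvus index and search configuration",
    "faiss similarity threshold",
    "enable the reranker in model config",
]
relevant = recalled[2]

reranked = rerank(query, recalled, token_overlap, top_n=2)
rr_before = reciprocal_rank(recalled, relevant)   # relevant doc is at rank 3
rr_after = reciprocal_rank(reranked, relevant)    # relevant doc moves to rank 1
print(rr_before, rr_after)
```

In this toy case the relevant passage would have been cut off by a top-2 limit on the original order, while it survives the same limit after reranking. A real setup would replace `token_overlap` with scores from a trained reranker model.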
