Added release notes for v0.15.0 #4056

Merged
merged 4 commits into from
Dec 18, 2024
Changes from 3 commits
2 changes: 1 addition & 1 deletion README_zh.md
@@ -158,7 +158,7 @@
| nightly | ≈9 | :heavy_check_mark: | *Unstable* nightly build |
| nightly-slim | ≈2 | ❌ | *Unstable* nightly build |

> [!TIP]
> [!TIP]
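> If you cannot pull the Docker image, follow the comments on the `RAGFLOW_IMAGE` variable in **docker/.env** to select the corresponding Huawei Cloud or Alibaba Cloud mirror.
> - Huawei Cloud image name: `swr.cn-north-4.myhuaweicloud.com/infiniflow/ragflow`
> - Alibaba Cloud image name: `registry.cn-hangzhou.aliyuncs.com/infiniflow/ragflow`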
2 changes: 2 additions & 0 deletions docs/guides/deploy_local_llm.mdx
@@ -9,6 +9,8 @@ import TabItem from '@theme/TabItem';

Run models locally using Ollama, Xinference, or other frameworks.

---

RAGFlow supports deploying models locally using Ollama, Xinference, IPEX-LLM, or jina. If you have locally deployed models to leverage or wish to enable GPU or CUDA for inference acceleration, you can bind Ollama or Xinference into RAGFlow and use either of them as a local "server" for interacting with your local models.

RAGFlow seamlessly integrates with Ollama and Xinference, without the need for further environment configurations. You can use them to deploy two types of local models in RAGFlow: chat models and embedding models.
2 changes: 2 additions & 0 deletions docs/guides/run_health_check.md
@@ -7,6 +7,8 @@ slug: /run_health_check

Double-check the health status of RAGFlow's dependencies.

---

The operation of RAGFlow depends on four services:

- **Elasticsearch** (default) or [Infinity](https://github.com/infiniflow/infinity) as the document engine
9 changes: 5 additions & 4 deletions docs/references/http_api_reference.md
@@ -1372,15 +1372,15 @@ curl --request POST \
- `"model_name"`, `string`
The chat model name. If not set, the user's default chat model will be used.
- `"temperature"`: `float`
Controls the randomness of the model's predictions. A lower temperature increases the model's confidence in its responses; a higher temperature increases creativity and diversity. Defaults to `0.1`.
Controls the randomness of the model's predictions. A lower temperature results in more conservative responses, while a higher temperature yields more creative and diverse responses. Defaults to `0.1`.
- `"top_p"`: `float`
Also known as “nucleus sampling”, this parameter sets a threshold to select a smaller set of words to sample from. It focuses on the most likely words, cutting off the less probable ones. Defaults to `0.3`
- `"presence_penalty"`: `float`
This discourages the model from repeating the same information by penalizing words that have already appeared in the conversation. Defaults to `0.2`.
- `"frequency penalty"`: `float`
Similar to the presence penalty, this reduces the model’s tendency to repeat the same words frequently. Defaults to `0.7`.
- `"max_token"`: `integer`
The maximum length of the model's output, measured in the number of tokens (words or pieces of words). If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses. Defaults to `512`.
The maximum length of the model's output, measured in the number of tokens (words or pieces of words). Defaults to `512`. If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses.
- `"prompt"`: (*Body parameter*), `object`
Instructions for the LLM to follow. If it is not explicitly set, a JSON object with the following values will be generated as the default. A `prompt` JSON object contains the following attributes:
- `"similarity_threshold"`: `float` RAGFlow employs either a combination of weighted keyword similarity and weighted vector cosine similarity, or a combination of weighted keyword similarity and weighted reranking score during retrieval. This argument sets the threshold for similarities between the user query and chunks. If a similarity score falls below this threshold, the corresponding chunk will be excluded from the results. The default value is `0.2`.
@@ -1507,15 +1507,15 @@ curl --request PUT \
- `"model_name"`, `string`
The chat model name. If not set, the user's default chat model will be used.
- `"temperature"`: `float`
Controls the randomness of the model's predictions. A lower temperature increases the model's confidence in its responses; a higher temperature increases creativity and diversity. Defaults to `0.1`.
Controls the randomness of the model's predictions. A lower temperature results in more conservative responses, while a higher temperature yields more creative and diverse responses. Defaults to `0.1`.
- `"top_p"`: `float`
Also known as “nucleus sampling”, this parameter sets a threshold to select a smaller set of words to sample from. It focuses on the most likely words, cutting off the less probable ones. Defaults to `0.3`
- `"presence_penalty"`: `float`
This discourages the model from repeating the same information by penalizing words that have already appeared in the conversation. Defaults to `0.2`.
- `"frequency penalty"`: `float`
Similar to the presence penalty, this reduces the model’s tendency to repeat the same words frequently. Defaults to `0.7`.
- `"max_token"`: `integer`
The maximum length of the model's output, measured in the number of tokens (words or pieces of words). If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses. Defaults to `512`.
The maximum length of the model's output, measured in the number of tokens (words or pieces of words). Defaults to `512`. If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses.
- `"prompt"`: (*Body parameter*), `object`
Instructions for the LLM to follow. A `prompt` object contains the following attributes:
- `"similarity_threshold"`: `float` RAGFlow employs either a combination of weighted keyword similarity and weighted vector cosine similarity, or a combination of weighted keyword similarity and weighted rerank score during retrieval. This argument sets the threshold for similarities between the user query and chunks. If a similarity score falls below this threshold, the corresponding chunk will be excluded from the results. The default value is `0.2`.
@@ -2149,6 +2149,7 @@ Failure:
---

## Create session with agent

*If there are parameters in the `begin` component, the session cannot be created in this way.*

**POST** `/api/v1/agents/{agent_id}/sessions`
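
As a rough illustration of the endpoint above, the following sketch builds (but does not send) the request with Python's standard library. The base URL, agent ID, and API key are hypothetical placeholders, and both the bearer-token header and the empty JSON body are assumptions rather than details confirmed by this diff.

```python
import json
import urllib.request


def build_create_session_request(base_url, agent_id, api_key):
    """Build (but do not send) a POST request for /api/v1/agents/{agent_id}/sessions."""
    url = f"{base_url.rstrip('/')}/api/v1/agents/{agent_id}/sessions"
    headers = {
        "Content-Type": "application/json",
        # Assumption: bearer-token authentication with a RAGFlow API key.
        "Authorization": f"Bearer {api_key}",
    }
    # Assumption: an empty JSON body is enough when `begin` has no parameters.
    body = json.dumps({}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical values, for illustration only.
request = build_create_session_request("http://localhost:9380", "my-agent-id", "my-api-key")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would actually send the request; the sketch stops short of that so it can be run without a server.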
8 changes: 4 additions & 4 deletions docs/references/python_api_reference.md
@@ -950,15 +950,15 @@ The LLM settings for the chat assistant to create. Defaults to `None`. When the
- `model_name`: `str`
The chat model name. If it is `None`, the user's default chat model will be used.
- `temperature`: `float`
Controls the randomness of the model's predictions. A lower temperature increases the model's confidence in its responses; a higher temperature increases creativity and diversity. Defaults to `0.1`.
Controls the randomness of the model's predictions. A lower temperature results in more conservative responses, while a higher temperature yields more creative and diverse responses. Defaults to `0.1`.
- `top_p`: `float`
Also known as “nucleus sampling”, this parameter sets a threshold to select a smaller set of words to sample from. It focuses on the most likely words, cutting off the less probable ones. Defaults to `0.3`
- `presence_penalty`: `float`
This discourages the model from repeating the same information by penalizing words that have already appeared in the conversation. Defaults to `0.2`.
- `frequency penalty`: `float`
Similar to the presence penalty, this reduces the model’s tendency to repeat the same words frequently. Defaults to `0.7`.
- `max_token`: `int`
The maximum length of the model's output, measured in the number of tokens (words or pieces of words). If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses. Defaults to `512`.
The maximum length of the model's output, measured in the number of tokens (words or pieces of words). Defaults to `512`. If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses.
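
To make the interplay of `temperature` and `top_p` more concrete, here is a minimal, generic sketch of temperature scaling followed by nucleus (top-p) filtering. It is not RAGFlow's or any model provider's implementation; the function name `sample_token` and the toy logits are hypothetical, and only the default values `0.1` and `0.3` come from the text above.

```python
import math
import random


def sample_token(logits, temperature=0.1, top_p=0.3, rng=None):
    """Pick one token index using temperature scaling and nucleus (top-p) filtering."""
    rng = rng or random.Random()
    # Temperature rescales the logits: values below 1 sharpen the distribution.
    scaled = [value / max(temperature, 1e-6) for value in logits]
    peak = max(scaled)
    weights = [math.exp(value - peak) for value in scaled]
    total = sum(weights)
    probs = [weight / total for weight in weights]

    # Keep the most likely tokens until their cumulative probability reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, cumulative = [], 0.0
    for index in ranked:
        nucleus.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break

    # Sample from the surviving tokens, weighted by their probabilities.
    return rng.choices(nucleus, weights=[probs[i] for i in nucleus], k=1)[0]


# Toy logits for a three-token vocabulary (hypothetical values).
print(sample_token([2.0, 1.0, 0.5]))
```

With the defaults, the low temperature concentrates almost all probability on the highest-scoring token, so the nucleus usually holds a single candidate; raising either parameter lets less likely tokens through.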

#### prompt: `Chat.Prompt`

@@ -1016,11 +1016,11 @@ A dictionary representing the attributes to update, with the following keys:
- `"dataset_ids"`: `list[str]` The datasets to update.
- `"llm"`: `dict` The LLM settings:
- `"model_name"`, `str` The chat model name.
- `"temperature"`, `float` Controls the randomness of the model's predictions.
- `"temperature"`, `float` Controls the randomness of the model's predictions. A lower temperature results in more conservative responses, while a higher temperature yields more creative and diverse responses.
- `"top_p"`, `float` Also known as “nucleus sampling”, this parameter sets a threshold to select a smaller set of words to sample from.
- `"presence_penalty"`, `float` This discourages the model from repeating the same information by penalizing words that have appeared in the conversation.
- `"frequency penalty"`, `float` Similar to presence penalty, this reduces the model’s tendency to repeat the same words.
- `"max_token"`, `int` The maximum length of the model's output, measured in the number of tokens (words or pieces of words). If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses. Defaults to `512`.
- `"max_token"`, `int` The maximum length of the model's output, measured in the number of tokens (words or pieces of words). Defaults to `512`. If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses.
- `"prompt"` : Instructions for the LLM to follow.
- `"similarity_threshold"`: `float` RAGFlow employs either a combination of weighted keyword similarity and weighted vector cosine similarity, or a combination of weighted keyword similarity and weighted rerank score during retrieval. This argument sets the threshold for similarities between the user query and chunks. If a similarity score falls below this threshold, the corresponding chunk will be excluded from the results. The default value is `0.2`.
- `"keywords_similarity_weight"`: `float` This argument sets the weight of keyword similarity in the hybrid similarity score with vector cosine similarity or reranking model similarity. By adjusting this weight, you can control the influence of keyword similarity in relation to other similarity measures. The default value is `0.7`.
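
The way `similarity_threshold` and `keywords_similarity_weight` interact can be sketched as follows. This is an illustrative model only: it assumes the hybrid score is a simple weighted sum in which the two weights add up to 1, which the text above does not state, and the function name `filter_chunks` and the sample scores are hypothetical.

```python
def filter_chunks(chunks, similarity_threshold=0.2, keywords_similarity_weight=0.7):
    """Return (chunk_id, hybrid_score) pairs at or above the threshold, best first."""
    kept = []
    for chunk_id, keyword_sim, vector_sim in chunks:
        # Assumption: the remaining weight goes to the vector (or rerank) similarity.
        hybrid = (keywords_similarity_weight * keyword_sim
                  + (1.0 - keywords_similarity_weight) * vector_sim)
        # Chunks whose score falls below the threshold are excluded.
        if hybrid >= similarity_threshold:
            kept.append((chunk_id, hybrid))
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Each tuple is (chunk id, keyword similarity, vector similarity); values are made up.
candidates = [("a", 0.5, 0.5), ("b", 0.0, 0.5), ("c", 0.25, 0.0)]
print(filter_chunks(candidates))
```

Under these assumptions only chunk `a` survives: `b` and `c` score about 0.15 and 0.175, both below the default threshold of `0.2`.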
34 changes: 34 additions & 0 deletions docs/release_notes.md
@@ -7,6 +7,40 @@ slug: /release_notes

Key features, improvements and bug fixes in the latest releases.

## v0.15.0

Released on December 18, 2024.

### New features

- Introduces additional Agent-specific APIs.
- Supports using page rank score to improve retrieval performance when searching across multiple knowledge bases.
- Offers an iframe in Chat and Agent to facilitate the integration of RAGFlow into your webpage.
- Adds a Helm chart for deploying RAGFlow on Kubernetes.
- Supports importing or exporting an agent in JSON format.
- Supports stepping for Agent components/tools.
- Adds a new UI language (*contributed by the community*): Japanese.
- Supports resuming GraphRAG and RAPTOR from a failure, enhancing task management resilience.
- Adds more Mistral models.
- Adds a dark mode to the UI, allowing users to toggle between light and dark themes.

### Improvements

- Upgrades document layout recognition models for Deepdoc.
- Significantly enhances retrieval performance when using [Infinity](https://github.com/infiniflow/infinity) as the document engine.

### Related APIs

#### HTTP APIs

- [List agent sessions](https://ragflow.io/docs/dev/http_api_reference#list-agent-sessions)
- [List agents](https://ragflow.io/docs/dev/http_api_reference#list-agents)

#### Python APIs

- [List agent sessions](https://ragflow.io/docs/dev/python_api_reference#list-agent-sessions)
- [List agents](https://ragflow.io/docs/dev/python_api_reference#list-agents)

## v0.14.1

Released on November 29, 2024.
8 changes: 4 additions & 4 deletions web/src/locales/en.ts
@@ -136,7 +136,7 @@ export default {
toMessage: 'Missing end page number (excluded)',
layoutRecognize: 'Layout recognition',
layoutRecognizeTip:
'Use visual models for layout analysis to better understand the structure of the document and effectively locate document titles, text blocks, images, and tables. If disabled, only the plain text from the PDF will be retrieved.',
'Use visual models for layout analysis to better understand the structure of the document and effectively locate document titles, text blocks, images, and tables. If disabled, only the plain text in the PDF will be retrieved.',
taskPageSize: 'Task page size',
taskPageSizeMessage: 'Please input your task page size!',
taskPageSizeTip: `During layout recognition, a PDF file is split into chunks and processed in parallel to increase processing speed. This parameter sets the size of each chunk. A larger chunk size reduces the likelihood of splitting continuous text between pages.`,
@@ -398,7 +398,7 @@ The above is the content you need to summarize.`,
'Similar to the presence penalty, this reduces the model’s tendency to repeat the same words frequently.',
maxTokens: 'Max tokens',
maxTokensMessage: 'Max tokens is required',
maxTokensTip: `This sets the maximum length of the model's output, measured in the number of tokens (words or pieces of words). If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses. Defaults to 512.`,
maxTokensTip: `This sets the maximum length of the model's output, measured in the number of tokens (words or pieces of words). Defaults to 512. If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses.`,
maxTokensInvalidMessage: 'Please enter a valid number for Max Tokens.',
maxTokensMinMessage: 'Max Tokens cannot be less than 0.',
quote: 'Show quote',
@@ -430,7 +430,7 @@ The above is the content you need to summarize.`,
partialTitle: 'Partial Embed',
extensionTitle: 'Chrome Extension',
tokenError: 'Please create API Token first!',
betaError: 'Please apply an API key in system setting firstly.',
betaError: 'Please acquire a RAGFlow API key from the System Settings page first.',
searching: 'Searching...',
parsing: 'Parsing',
uploading: 'Uploading',
@@ -453,7 +453,7 @@ The above is the content you need to summarize.`,
profileDescription: 'Update your photo and personal details here.',
maxTokens: 'Max Tokens',
maxTokensMessage: 'Max Tokens is required',
maxTokensTip: `This sets the maximum length of the model's output, measured in the number of tokens (words or pieces of words). If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses. Defaults to 512.`,
maxTokensTip: `This sets the maximum length of the model's output, measured in the number of tokens (words or pieces of words). Defaults to 512. If disabled, you lift the maximum token limit, allowing the model to determine the number of tokens in its responses.`,
maxTokensInvalidMessage: 'Please enter a valid number for Max Tokens.',
maxTokensMinMessage: 'Max Tokens cannot be less than 0.',
password: 'Password',