[Bug]: Launch service from source, bash ./entrypoint.sh failed. #992

simpleyin · 2024-05-30T06:33:04Z

Is there an existing issue for the same bug?

I have checked the existing issues.

Branch name

main

Commit ID

daa4799

Other environment information

Current Repo: ragflow
Commit Id: daa4799
Operating system: centos 8 (Kernel version: 4.18.0-348.el8.x86_64)
CPU Type: x86_64
Memory: 125Gi
Docker Version: 26.1.3
Python Version: 3.11.0

Actual behavior

Followed the guide; after running entrypoint.sh:

[WARNING] Load term.freq FAIL!
[WARNING] [2024-05-30 14:26:21,304] [synonym.__init__] [line:24]: Realtime synonym is disabled, since no redis connection.
[WARNING] [2024-05-30 14:26:21,590] [redis_conn.__open__] [line:44]: Redis can't be connected.
[WARNING] Load term.freq FAIL!
[WARNING] [2024-05-30 14:26:23,257] [synonym.__init__] [line:24]: Realtime synonym is disabled, since no redis connection.
Traceback (most recent call last):
  File "/root/rag/ragflow/api/ragflow_server.py", line 26, in <module>
    from api.apps import app
  File "/root/rag/ragflow/api/apps/__init__.py", line 92, in <module>
    client_urls_prefix = [
                         ^
  File "/root/rag/ragflow/api/apps/__init__.py", line 93, in <listcomp>
    register_page(path)
  File "/root/rag/ragflow/api/apps/__init__.py", line 78, in register_page
    spec.loader.exec_module(page)
  File "/root/rag/ragflow/api/apps/api_app.py", line 27, in <module>
    from api.db.services.dialog_service import DialogService, chat
  File "/root/rag/ragflow/api/db/services/dialog_service.py", line 23, in <module>
    from api.db.services.llm_service import LLMService, TenantLLMService, LLMBundle
  File "/root/rag/ragflow/api/db/services/llm_service.py", line 18, in <module>
    from rag.llm import EmbeddingModel, CvModel, ChatModel
  File "/root/rag/ragflow/rag/llm/__init__.py", line 17, in <module>
    from .chat_model import *
  File "/root/rag/ragflow/rag/llm/chat_model.py", line 22, in <module>
    from volcengine.maas.v2 import MaasService
  File "/root/anaconda3/envs/ragflow/lib/python3.11/site-packages/volcengine/maas/__init__.py", line 1, in <module>
    from .MaasService import MaasService
  File "/root/anaconda3/envs/ragflow/lib/python3.11/site-packages/volcengine/maas/MaasService.py", line 14, in <module>
    from .models.api.api_pb2 import ChatResp
  File "/root/anaconda3/envs/ragflow/lib/python3.11/site-packages/volcengine/maas/models/api/api_pb2.py", line 16, in <module>
    from .. import base_pb2 as base__pb2
  File "/root/anaconda3/envs/ragflow/lib/python3.11/site-packages/volcengine/maas/models/base_pb2.py", line 30, in <module>
    raw_body = _descriptor.FieldDescriptor(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/anaconda3/envs/ragflow/lib/python3.11/site-packages/google/protobuf/descriptor.py", line 553, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
Traceback (most recent call last):
  File "/root/rag/ragflow/rag/svr/task_executor.py", line 48, in <module>
    from rag.app import laws, paper, presentation, manual, qa, table, book, resume, picture, naive, one
  File "/root/rag/ragflow/rag/app/resume.py", line 23, in <module>
    from deepdoc.parser.resume import step_one, step_two
  File "/root/rag/ragflow/deepdoc/parser/resume/step_two.py", line 5, in <module>
    from deepdoc.parser.resume.entities import degrees, schools, corporations
  File "/root/rag/ragflow/deepdoc/parser/resume/entities/corporations.py", line 52, in <module>
    GOOD_CORP = set([corpNorm(rmNoise(c), False) for c in GOOD_CORP])
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/rag/ragflow/deepdoc/parser/resume/entities/corporations.py", line 52, in <listcomp>
    GOOD_CORP = set([corpNorm(rmNoise(c), False) for c in GOOD_CORP])
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/rag/ragflow/deepdoc/parser/resume/entities/corporations.py", line 32, in corpNorm
    tks = rag_tokenizer.tokenize(nm).split(" ")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/rag/ragflow/rag/nlp/rag_tokenizer.py", line 249, in tokenize
    return " ".join([self.stemmer.stem(self.lemmatizer.lemmatize(t)) for t in word_tokenize(line)])
                                                                              ^^^^^^^^^^^^^^^^^^^
  File "/root/anaconda3/envs/ragflow/lib/python3.11/site-packages/nltk/tokenize/__init__.py", line 129, in word_tokenize
    sentences = [text] if preserve_line else sent_tokenize(text, language)
                                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/anaconda3/envs/ragflow/lib/python3.11/site-packages/nltk/tokenize/__init__.py", line 106, in sent_tokenize
    tokenizer = load(f"tokenizers/punkt/{language}.pickle")
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/anaconda3/envs/ragflow/lib/python3.11/site-packages/nltk/data.py", line 750, in load
    opened_resource = _open(resource_url)
                      ^^^^^^^^^^^^^^^^^^^
  File "/root/anaconda3/envs/ragflow/lib/python3.11/site-packages/nltk/data.py", line 876, in _open
    return find(path_, path + [""]).open()
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/anaconda3/envs/ragflow/lib/python3.11/site-packages/nltk/data.py", line 583, in find
    raise LookupError(resource_not_found)
LookupError: 
**********************************************************************
  Resource punkt not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('punkt')
  
  For more information see: https://www.nltk.org/data.html

  Attempted to load tokenizers/punkt/PY3/english.pickle

  Searched in:
    - '/root/nltk_data'
    - '/root/anaconda3/envs/ragflow/nltk_data'
    - '/root/anaconda3/envs/ragflow/share/nltk_data'
    - '/root/anaconda3/envs/ragflow/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - ''
**********************************************************************

Expected behavior

The ragflow API server starts successfully.

Steps to reproduce

bash ./entrypoint.sh

Additional information

No response

simpleyin · 2024-05-30T08:20:10Z

It seems my volcengine was 1.098; the problem was solved after updating to the latest version.
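The protobuf error in the report names two workarounds (protobuf 3.20.x or lower, or `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python`), and the fix reported in this comment was a newer volcengine. Before changing anything, it can help to see which versions are actually installed in the active environment. A minimal sketch using only the standard library; the helper names are hypothetical, and the 3.20 cutoff is taken from the error text, not from volcengine's documentation:

```python
import re
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def newer_than_3_20(protobuf_version):
    """True if the version is above the 3.20.x line named in the error text."""
    match = re.match(r"(\d+)\.(\d+)", protobuf_version)
    if match is None:
        raise ValueError(f"unrecognised version string: {protobuf_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) > (3, 20)


if __name__ == "__main__":
    # Print what is installed so the combination can be compared with the report.
    for package in ("volcengine", "protobuf"):
        print(package, installed_version(package))
    pb = installed_version("protobuf")
    if pb is not None and newer_than_3_20(pb):
        print("protobuf is newer than 3.20.x; old generated _pb2.py files may fail to import")
```

Run it inside the same conda environment that `entrypoint.sh` uses; a protobuf above 3.20.x next to an old volcengine would be consistent with the `TypeError` shown in the report.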

Old-Lane · 2024-06-24T09:49:27Z

I upgraded volcengine but the problem was not solved

dosometingbyme · 2024-09-18T14:45:20Z

I ran into the same problem too.

dosometingbyme · 2024-09-18T14:47:17Z

(rag-flow) andrew@node01:~/ragflow$ sudo bash entrypoint.sh
Traceback (most recent call last):
  File "/home/andrew/anaconda3/envs/rag-flow/lib/python3.11/site-packages/nltk/corpus/util.py", line 84, in __load
    root = nltk.data.find(f"{self.subdir}/{zip_name}")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andrew/anaconda3/envs/rag-flow/lib/python3.11/site-packages/nltk/data.py", line 579, in find
    raise LookupError(resource_not_found)
LookupError:

  Resource wordnet not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('wordnet')

  For more information see: https://www.nltk.org/data.html

  Attempted to load corpora/wordnet.zip/wordnet/

  Searched in:
    - '/root/nltk_data'
    - '/home/andrew/anaconda3/envs/rag-flow/nltk_data'
    - '/home/andrew/anaconda3/envs/rag-flow/share/nltk_data'
    - '/home/andrew/anaconda3/envs/rag-flow/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/andrew/ragflow/api/ragflow_server.py", line 26, in <module>
    from api.apps import app
  File "/home/andrew/ragflow/api/apps/__init__.py", line 26, in <module>
    from api.db.db_models import close_connection
  File "/home/andrew/ragflow/api/db/db_models.py", line 33, in <module>
    from api.settings import DATABASE, stat_logger, SECRET_KEY, DATABASE_TYPE
  File "/home/andrew/ragflow/api/settings.py", line 36, in <module>
    from rag.nlp import search
  File "/home/andrew/ragflow/rag/nlp/__init__.py", line 21, in <module>
    from . import rag_tokenizer
  File "/home/andrew/ragflow/rag/nlp/rag_tokenizer.py", line 26, in <module>
    from nltk import word_tokenize
  File "/home/andrew/anaconda3/envs/rag-flow/lib/python3.11/site-packages/nltk/__init__.py", line 153, in <module>
    from nltk.translate import *
  File "/home/andrew/anaconda3/envs/rag-flow/lib/python3.11/site-packages/nltk/translate/__init__.py", line 24, in <module>
    from nltk.translate.meteor_score import meteor_score as meteor
  File "/home/andrew/anaconda3/envs/rag-flow/lib/python3.11/site-packages/nltk/translate/meteor_score.py", line 14, in <module>
    from nltk.stem.api import StemmerI
  File "/home/andrew/anaconda3/envs/rag-flow/lib/python3.11/site-packages/nltk/stem/__init__.py", line 34, in <module>
    from nltk.stem.wordnet import WordNetLemmatizer
  File "/home/andrew/anaconda3/envs/rag-flow/lib/python3.11/site-packages/nltk/stem/wordnet.py", line 13, in <module>
    class WordNetLemmatizer:
  File "/home/andrew/anaconda3/envs/rag-flow/lib/python3.11/site-packages/nltk/stem/wordnet.py", line 48, in WordNetLemmatizer
    morphy = wn.morphy
             ^^^^^^^^^
  File "/home/andrew/anaconda3/envs/rag-flow/lib/python3.11/site-packages/nltk/corpus/util.py", line 120, in __getattr__
    self.__load()
  File "/home/andrew/anaconda3/envs/rag-flow/lib/python3.11/site-packages/nltk/corpus/util.py", line 86, in __load
    raise e
  File "/home/andrew/anaconda3/envs/rag-flow/lib/python3.11/site-packages/nltk/corpus/util.py", line 81, in __load
    root = nltk.data.find(f"{self.subdir}/{self.__name}")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andrew/anaconda3/envs/rag-flow/lib/python3.11/site-packages/nltk/data.py", line 579, in find
    raise LookupError(resource_not_found)
LookupError:

  Resource wordnet not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('wordnet')

  For more information see: https://www.nltk.org/data.html

  Attempted to load corpora/wordnet

  Searched in:
    - '/root/nltk_data'
    - '/home/andrew/anaconda3/envs/rag-flow/nltk_data'
    - '/home/andrew/anaconda3/envs/rag-flow/share/nltk_data'
    - '/home/andrew/anaconda3/envs/rag-flow/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'

Traceback (most recent call last):
  File "/home/andrew/anaconda3/envs/rag-flow/lib/python3.11/site-packages/nltk/corpus/util.py", line 84, in __load
    root = nltk.data.find(f"{self.subdir}/{zip_name}")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/andrew/anaconda3/envs/rag-flow/lib/python3.11/site-packages/nltk/data.py", line 579, in find
    raise LookupError(resource_not_found)
LookupError:

  Resource wordnet not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('wordnet')

  For more information see: https://www.nltk.org/data.html

  Attempted to load corpora/wordnet.zip/wordnet/

  Searched in:
    - '/root/nltk_data'
    - '/home/andrew/anaconda3/envs/rag-flow/nltk_data'
    - '/home/andrew/anaconda3/envs/rag-flow/share/nltk_data'
    - '/home/andrew/anaconda3/envs/rag-flow/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/andrew/ragflow/rag/svr/task_executor.py", line 29, in <module>
    from api.db.services.file2document_service import File2DocumentService
  File "/home/andrew/ragflow/api/db/services/__init__.py", line 18, in <module>
    from .user_service import UserService
  File "/home/andrew/ragflow/api/db/services/user_service.py", line 22, in <module>
    from api.db.db_models import DB, UserTenant
  File "/home/andrew/ragflow/api/db/db_models.py", line 33, in <module>
    from api.settings import DATABASE, stat_logger, SECRET_KEY, DATABASE_TYPE
  File "/home/andrew/ragflow/api/settings.py", line 36, in <module>
    from rag.nlp import search
  File "/home/andrew/ragflow/rag/nlp/__init__.py", line 21, in <module>
    from . import rag_tokenizer
  File "/home/andrew/ragflow/rag/nlp/rag_tokenizer.py", line 26, in <module>
    from nltk import word_tokenize

Rid7 · 2024-09-19T03:12:33Z

After running pip install nltk==3.8 in this project's environment, the error above should go away.
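Changing the nltk version addresses the case where `import nltk` itself fails. Once nltk imports cleanly, the resources named in the tracebacks in this thread (punkt and wordnet) still have to be present in one of the searched directories. A minimal sketch for checking and fetching them, assuming network access; `REQUIRED`, `missing_resources`, and `ensure_nltk_data` are hypothetical names, and the two lookup paths are copied from the "Attempted to load" lines in the logs:

```python
# Resource name -> path handed to nltk.data.find(); both paths are taken
# from the "Attempted to load" lines in the tracebacks in this thread.
REQUIRED = {
    "punkt": "tokenizers/punkt",
    "wordnet": "corpora/wordnet",
}


def missing_resources(find, required=REQUIRED):
    """Return the names of resources for which `find` raises LookupError."""
    missing = []
    for name, path in required.items():
        try:
            find(path)
        except LookupError:
            missing.append(name)
    return missing


def ensure_nltk_data():
    """Download whatever is missing. Needs nltk installed and network access."""
    import nltk  # imported lazily so missing_resources() works without nltk

    for name in missing_resources(nltk.data.find):
        nltk.download(name)
```

Call `ensure_nltk_data()` as the same user that launches `entrypoint.sh`: the log above lists `/root/nltk_data` even though the shell user is andrew, so data downloaded without sudo may land in a home directory the server never searches. Newer NLTK releases may ask for a differently named resource; in that case use the name the error message prints.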

qinguangxu · 2024-09-19T03:44:13Z

Installing the latest nltk makes the error go away.

dosometingbyme · 2024-10-09T08:25:21Z

Thank you.

simpleyin added the bug Something isn't working label May 30, 2024

simpleyin closed this as completed May 30, 2024

KevinHuSh mentioned this issue Jun 5, 2024

add version to package volcengine #1062

Merged

1 task

KevinHuSh added a commit that referenced this issue Jun 5, 2024

add version to package volcengine (#1062)

b6980d8

### What problem does this PR solve? #992 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)

Halfknow pushed a commit to Halfknow/ragflow that referenced this issue Nov 11, 2024

add version to package volcengine (infiniflow#1062)

7e3bfea

### What problem does this PR solve? infiniflow#992 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)
