[Bug]: etcd auto compaction not enabled in Milvus's Ubuntu package #16511

Closed
1 task done
schuberttobias opened this issue Apr 16, 2022 · 6 comments
Labels
kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. stale indicates no udpates for 30 days

Comments

schuberttobias commented Apr 16, 2022

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: 2.0.2-1 (APT source: http://ppa.launchpad.net/milvusdb/milvus/ubuntu bionic/main amd64 Packages)
- Deployment mode(standalone or cluster): standalone
- SDK version(e.g. pymilvus v2.0.0rc2): pymilvus==2.0.1
- OS(Ubuntu or CentOS): Ubuntu 18.04.6 LTS
- CPU/Memory: AWS instance type c5a.4xlarge. AMD EPYC 7002 2nd generation / 32 GB 
- GPU: none
- Others:

Current Behavior

When I insert approximately 10 million vectors over the course of 3 days, Milvus crashes with the following log statement:

Apr 14 13:39:08 ip-10-0-84-244 milvus[18584]: panic: etcdserver: mvcc: database space exceeded

Expected Behavior

When I insert approximately 10 million vectors over the course of 3 days, Milvus shouldn't crash.

Steps To Reproduce

1. Install Milvus standalone on Ubuntu 18.04
2. Write and execute a Python program that inserts 10 million vectors over the course of 3 days.
3. Milvus will crash (see above).

Anything else?

The problem seems to be that etcd's backend quota (set with --quota-backend-bytes, logged as quota-size-bytes, default 2 GB) was exceeded and etcd's auto-compaction was not enabled.

The issue has been reported before (#5519, #6753) and as a fix the Milvus team enabled etcd's auto compaction for Docker and Helm (#5519 (comment)).

However, it is not enabled in the Milvus standalone Ubuntu package.

On startup milvus-etcd logs:

Apr 16 16:56:54 ... milvus-etcd[3742]: {... 
"quota-size-bytes":2147483648, 
"auto-compaction-mode":"periodic",
"auto-compaction-retention":"0s",
"auto-compaction-interval":"0s",
...}
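
To check a startup log for this condition, the logged fields can be parsed and tested. The following is a minimal sketch, not part of the Milvus package: the function name compaction_status is hypothetical, the field names are copied from the log excerpt above, and it assumes that a retention of "0s" means auto-compaction is disabled (etcd documents a retention of 0 as "disable auto compaction").

```python
import json


def compaction_status(config_json: str) -> dict:
    """Summarize the compaction-related fields of an etcd startup config.

    Hypothetical helper: expects a JSON object using the field names
    that milvus-etcd prints at startup.
    """
    cfg = json.loads(config_json)
    retention = cfg.get("auto-compaction-retention", "0s")
    quota_bytes = cfg.get("quota-size-bytes", 0)
    return {
        # Assumption: a zero retention means auto-compaction is off.
        "compaction_enabled": retention not in ("", "0", "0s"),
        "quota_gib": quota_bytes / 2**30,
    }


# Fields taken from the startup log shown above.
startup_fields = (
    '{"quota-size-bytes": 2147483648,'
    ' "auto-compaction-mode": "periodic",'
    ' "auto-compaction-retention": "0s",'
    ' "auto-compaction-interval": "0s"}'
)

status = compaction_status(startup_fields)
print(status)
```

For the values logged by the Ubuntu package this reports compaction as disabled with a 2 GiB quota.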

Workaround

  1. change ExecStart in /lib/systemd/system/milvus-etcd.service to:
    ExecStart=/usr/bin/milvus-etcd --data-dir /var/lib/milvus/etcd-data --auto-compaction-retention '1000' --auto-compaction-mode 'revision' --quota-backend-bytes '4294967296'
  2. sudo systemctl stop milvus
  3. sudo systemctl stop milvus-etcd
  4. sudo systemctl daemon-reload
  5. sudo systemctl start milvus
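
The ExecStart line from step 1 can also be generated rather than typed by hand, which makes the quota arithmetic explicit (4 GiB = 4 * 1024^3 = 4294967296 bytes). This is only a sketch: build_exec_start is a hypothetical helper, and the binary path, data directory, and flag values are the ones from the workaround above.

```python
GIB = 1024 ** 3  # bytes in one GiB


def build_exec_start(binary: str, flags: dict) -> str:
    """Render a systemd ExecStart= line from a mapping of etcd flags.

    Hypothetical helper; values containing whitespace are rejected
    so that no quoting is needed in the unit file.
    """
    parts = [binary]
    for name, value in flags.items():
        text = str(value)
        if not text or any(ch.isspace() for ch in text):
            raise ValueError(f"unsupported value for --{name}: {text!r}")
        parts += [f"--{name}", text]
    return "ExecStart=" + " ".join(parts)


exec_start = build_exec_start(
    "/usr/bin/milvus-etcd",
    {
        "data-dir": "/var/lib/milvus/etcd-data",
        "auto-compaction-retention": 1000,  # keep the last 1000 revisions
        "auto-compaction-mode": "revision",
        "quota-backend-bytes": 4 * GIB,
    },
)
print(exec_start)
```

The printed line matches the one in step 1, minus the optional single quotes around the values.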

Now milvus-etcd logs:

Apr 16 19:04:33 ... milvus-etcd[28307]: { ... 
"quota-size-bytes":4294967296,
"auto-compaction-mode":"revision",
"auto-compaction-retention":"1µs",
"auto-compaction-interval":"1µs",
... }

(Note: 1µs is an implementation detail of etcd and apparently not a bug, see etcd-io/etcd#9337)
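
One plausible reading of the 1µs value, assuming (as the linked etcd issue suggests) that in revision mode etcd keeps the retention count in a nanosecond-resolution duration field without scaling it: 1000 revisions is then printed as 1000 ns, which is 1µs. The sketch below only illustrates that arithmetic; format_ns is a hypothetical helper that roughly imitates how such a duration is printed, not etcd code.

```python
def format_ns(ns: int) -> str:
    """Render a nanosecond count in the largest fitting unit.

    Hypothetical helper, a rough imitation of duration printing.
    """
    if ns == 0:
        return "0s"
    if ns < 1_000:
        return f"{ns}ns"
    if ns < 1_000_000:
        return f"{ns / 1_000:g}µs"
    if ns < 1_000_000_000:
        return f"{ns / 1_000_000:g}ms"
    return f"{ns / 1_000_000_000:g}s"


# Assumption: the revision count (1000) is stored as 1000 nanoseconds.
retention_revisions = 1000
logged_value = format_ns(retention_revisions)
print(logged_value)
```

Under that assumption the logged value says nothing about a time interval; it is the revision count shown through a duration formatter.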

@schuberttobias schuberttobias added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 16, 2022
@schuberttobias schuberttobias changed the title [Bug]: etcd auto compaction not enabled in milvus's Ubuntu package [Bug]: etcd auto compaction not enabled in Milvus's Ubuntu package Apr 16, 2022
@schuberttobias
Author

schuberttobias commented Apr 16, 2022

I looked at the auto compaction configuration in https://github.com/milvus-io/milvus/releases/download/v2.0.2/milvus-standalone-docker-compose.yml and updated my post above to use auto-compaction-mode=revision and a larger backend quota (--quota-backend-bytes) accordingly.

@xiaofan-luan
Collaborator

@schuberttobias Hi Tobias, feel free to create a PR; @LoveEachDay will help review it.

@xiaofan-luan
Collaborator

/assign @schuberttobias

@xiaofan-luan
Collaborator

/assign @LoveEachDay

@yanliang567 yanliang567 removed their assignment Apr 18, 2022
@schuberttobias
Author

@xiaofan-luan Thanks for your response, I'll set up a PR.

@stale

stale bot commented May 19, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no udpates for 30 days label May 19, 2022
@stale stale bot closed this as completed May 26, 2022