[Feature]: Support Truncate collection #26280

Open
1 task done
xiaofan-luan opened this issue Aug 11, 2023 · 7 comments
Assignees
Labels
good first issue Good for newcomers kind/feature Issues related to feature request from users

Comments

@xiaofan-luan
Collaborator

Is there an existing issue for this?

  • I have searched the existing issues

Is your feature request related to a problem? Please describe.

Truncating a collection removes all of its data.
This lets users clean up data as quickly as possible.

To truncate a collection:

  1. Disable writes to the collection
  2. Release the collection
  3. Change the collection ID to a new value
  4. Clean up all the meta of the old collection
  5. Wait for asynchronous garbage collection of the actual data
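
The five steps above can be sketched against a toy in-memory metadata store. This is only an illustration of the ordering, not Milvus code: `MetaStore`, `Collection`, `truncate`, `run_gc`, and the ID counter are all hypothetical names, and the sketch assumes that cleaning up old meta simply means detaching the old segment list and queueing it for a later GC pass.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical stand-in for an ID allocator.
_ids = count(1000)


@dataclass
class Collection:
    name: str
    collection_id: int
    writable: bool = True
    loaded: bool = True


@dataclass
class MetaStore:
    collections: dict = field(default_factory=dict)  # name -> Collection
    segments: dict = field(default_factory=dict)     # collection_id -> segment names
    gc_queue: list = field(default_factory=list)     # (old_id, segments) awaiting GC


def truncate(store, name):
    """Apply steps 1-4; step 5 happens later in run_gc()."""
    coll = store.collections[name]
    coll.writable = False                 # 1. disable writes
    coll.loaded = False                   # 2. release the collection
    old_id = coll.collection_id
    coll.collection_id = next(_ids)       # 3. switch to a new collection ID
    store.segments[coll.collection_id] = []
    old_segments = store.segments.pop(old_id, [])   # 4. drop the old meta
    store.gc_queue.append((old_id, old_segments))   # data is removed later
    return old_id


def run_gc(store):
    """5. Asynchronous GC pass: returns the segment names it deleted."""
    removed = [seg for _, segs in store.gc_queue for seg in segs]
    store.gc_queue.clear()
    return removed
```

The steps do not say when writes are re-enabled or the collection is reloaded, so the sketch leaves both flags off after `truncate` returns.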

Describe the solution you'd like.

No response

Describe an alternate solution.

No response

Anything else? (Additional Context)

No response

@xiaofan-luan xiaofan-luan added the kind/feature Issues related to feature request from users label Aug 11, 2023
@xiaofan-luan xiaofan-luan self-assigned this Aug 11, 2023
@JackLCL
Contributor

JackLCL commented Aug 23, 2023

Can truncating a partition be implemented together with this?

@xiaofan-luan
Collaborator Author

It should be OK to do that.
Assign @PowderLi

@datnguyenzzz

Good day @xiaofan-luan

May I ask whether this issue is still valid and open for contribution? I would like to pick it up. If so, could you help by pointing me to a starting point? Thanks!

@xiaofan-luan
Collaborator Author

Hi @datnguyenzzz.
I think @PowderLi did the initial version, but @czs007 left multiple comments on her PR.

Maybe you can start from this PR and work it out from there.

@datnguyenzzz

Thanks for the suggestion. I will read through the PR and its comments, then ask again if I need further clarification.
Also, could you assign this issue to me so I can keep track of it?

@xiaofan-luan
Collaborator Author

/assign @datnguyenzzz

@datnguyenzzz

datnguyenzzz commented Dec 9, 2024

Hello @xiaofan-luan

I'm in the midst of implementing this feature, after reading through the comments above. My step-by-step approach is as follows:

  1. Create a temporary collection.
  2. Build an index for the temporary collection.
  3. Swap the original collection with the temporary collection in the meta table (give the temporary collection the original name and aliases), and set the original collection's state to "Dropping".
  4. Return and end the function.
    --- The asynchronous background garbage collector job in the rootcoord will eventually pick up the original collection, which is already in the "Dropping" state, and delete it (meta, actual data, indexes, etc.).
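
To make the swap concrete, here is a minimal sketch of steps 1 to 4 over a toy in-memory meta table. Everything in it is an assumption for illustration (`MetaTable`, `Collection`, `State`, the `_tmp_` naming, and copying the index list as a placeholder for step 2); it is not the rootcoord implementation.

```python
from dataclasses import dataclass, field
from enum import Enum
from itertools import count


class State(Enum):
    CREATED = "Created"
    DROPPING = "Dropping"


@dataclass
class Collection:
    collection_id: int
    name: str
    schema: tuple
    aliases: list = field(default_factory=list)
    indexes: list = field(default_factory=list)
    state: State = State.CREATED


class MetaTable:
    """Hypothetical meta table keyed by collection ID."""

    def __init__(self):
        self.by_id = {}
        self._next_id = count(1)

    def create(self, name, schema, aliases=()):
        coll = Collection(next(self._next_id), name, tuple(schema), list(aliases))
        self.by_id[coll.collection_id] = coll
        return coll

    def lookup(self, name_or_alias):
        # Only live collections are visible by name or alias.
        for coll in self.by_id.values():
            if coll.state is State.CREATED and (
                coll.name == name_or_alias or name_or_alias in coll.aliases
            ):
                return coll
        raise KeyError(name_or_alias)

    def truncate(self, name):
        original = self.lookup(name)
        # Step 1: create a temporary collection with the same schema.
        temp = self.create(f"_tmp_{original.collection_id}", original.schema)
        # Step 2: placeholder for building indexes on the temporary collection.
        temp.indexes = list(original.indexes)
        # Step 3: hand over name and aliases, mark the original as Dropping.
        temp.name, temp.aliases = original.name, original.aliases
        original.aliases = []
        original.state = State.DROPPING
        # Step 4: return; the data is removed later by gc().
        return temp

    def gc(self):
        """Background pass: remove every collection in the Dropping state."""
        dropping = [cid for cid, c in self.by_id.items() if c.state is State.DROPPING]
        for cid in dropping:
            del self.by_id[cid]
        return dropping
```

In this sketch the name lookup resolves to the empty replacement as soon as `truncate` returns, while the old collection stays in the table until `gc` runs.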

What do you think about my approach?

If the plan sounds okay, then I have a newbie question about step 2, building an index for the temporary collection. Could you give me some suggestions on how to build indexes for the temporary collection? My idea is to list the indexes of the original collection, then create new ones with newly allocated index IDs while keeping the rest of the index information unchanged. But I'm not certain whether that idea is feasible, or which code path (or engineering documentation) I can read through to understand more about the index creation step.
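
The idea of copying index definitions under freshly allocated IDs could look roughly like the sketch below. `IndexMeta`, its fields, `clone_indexes`, and the `alloc_id` callback are hypothetical and only model the bookkeeping; whether the real index creation path can be driven this way is exactly the open question.

```python
from dataclasses import dataclass, replace
from itertools import count


@dataclass(frozen=True)
class IndexMeta:
    """Hypothetical index definition; field names are illustrative only."""
    index_id: int
    collection_id: int
    field_name: str
    index_name: str
    index_type: str
    params: tuple  # ((key, value), ...) kept immutable so copies are safe


def clone_indexes(indexes, new_collection_id, alloc_id):
    """Copy each definition, changing only the index ID and collection ID."""
    return [
        replace(idx, index_id=alloc_id(), collection_id=new_collection_id)
        for idx in indexes
    ]


# Usage: clone one definition from collection 10 onto collection 20.
id_source = count(500)  # stand-in for an ID allocator
originals = [IndexMeta(1, 10, "vec", "vec_idx", "HNSW", (("M", 16),))]
clones = clone_indexes(originals, 20, lambda: next(id_source))
```

Only the definitions are copied here; since the temporary collection is empty, no index data would need to be carried over.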

Thanks!!

4 participants