Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cleaner should now use commit timeline and not include deltacommits #539

Merged
merged 1 commit into from
Jan 28, 2019

Conversation

n3nash
Copy link
Contributor

@n3nash n3nash commented Dec 26, 2018

For MOR tables, there will be 2 commits, delta + commit (ideally for one following compaction). In this scenario, we'd have to double the number of commits to be retained. Ideally, each commit action decides how many versions should be kept, hence making the timeline to read only from "commit" action.

@vinothchandar
Copy link
Member

how did you run into this ?

@n3nash
Copy link
Contributor Author

n3nash commented Dec 31, 2018

Basically, the cleaner was retaining fewer versions for trips table before this. On debugging, realized with deltacommit being considered as part of the number of commits in the cleaner is the problem

@vinothchandar
Copy link
Member

Looks like its touching lot more than just cleaner ?let me spend more time understanding, since its around timeline.

@n3nash
Copy link
Contributor Author

n3nash commented Jan 2, 2019

Yeah, basically I introduced a new method in the timeline for allCommit and changed the existing one to allCommits. So, the changes you see across the files are just as a result of rename of the method. The new method is only used in the cleaner. Hope that helps

@n3nash
Copy link
Contributor Author

n3nash commented Jan 26, 2019

@vinothchandar @bvaradar Please take a pass at this one as well when you get a chance.

Copy link
Member

@vinothchandar vinothchandar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

/**
* Get only the completed (no-inflights) commit timeline
*/
public HoodieTimeline getCompletedCommitTimeline() {
return metaClient.getCommitsTimeline().filterCompletedInstants();
return metaClient.getCommitTimeline().filterCompletedInstants();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guess cleaner is still going to call this and you just changed other usages?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly.

@vinothchandar vinothchandar merged commit 994d42d into apache:master Jan 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants