Lucene added support for merge on refresh a while back.
Lucene creates one segment per concurrent writing thread on the index. As a result, in highly concurrent environments (more write threads), Lucene generates a lot of small segments, which increases search cost. Merging these segments immediately after they are created has a further benefit: they are potentially still hot in cache and can be read faster, which lowers the merge cost as well.
The change addresses this problem by quickly merging the small segments during refresh and commit and then opening the reader on the merged segment. The user only needs to set a configurable merge wait time, which allows Lucene's IndexWriter to attempt the merge within that duration. This feature is disabled by default.
Choosing the merge wait timeout
One typical approach could be to set the wait time (DEFAULT_MAX_FULL_FLUSH_MERGE_WAIT_MILLIS) to a fraction of the refresh interval. For longer refresh intervals, this could turn out to be a really long timeout, so maybe we can add a ceiling on this wait time.
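A minimal sketch of the "fraction of the refresh interval, with a ceiling" idea, to make the proposal concrete. The class `MergeWaitTimeout`, the method `computeMillis`, and the example fraction and ceiling are all hypothetical and chosen only for illustration. The result is presumably what would be handed to Lucene's `IndexWriterConfig` as the full-flush merge wait time.

```java
/**
 * Hypothetical helper, not part of Lucene or any existing codebase.
 * Derives a merge wait timeout from the refresh interval.
 */
final class MergeWaitTimeout {
    private MergeWaitTimeout() {}

    /**
     * @param refreshIntervalMillis refresh interval in milliseconds (>= 0)
     * @param fraction              share of the interval to spend waiting, in [0, 1]
     * @param ceilingMillis         upper bound on the wait time in milliseconds (>= 0)
     * @return the wait time in milliseconds, never above the ceiling
     */
    static long computeMillis(long refreshIntervalMillis, double fraction, long ceilingMillis) {
        if (refreshIntervalMillis < 0 || ceilingMillis < 0) {
            throw new IllegalArgumentException("durations must be non-negative");
        }
        if (fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in [0, 1]");
        }
        // Scale the interval, then cap it so long refresh intervals
        // cannot turn into very long merge waits.
        long scaled = (long) (refreshIntervalMillis * fraction);
        return Math.min(scaled, ceilingMillis);
    }
}
```

With an assumed fraction of 0.25 and an assumed ceiling of 500 ms, a 1 s refresh interval gives a 250 ms wait, while a 30 s refresh interval is capped at 500 ms instead of growing to 7.5 s.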
Should we expose this to users as an expert setting?