Keep write support for old codecs? [LUCENE-9234] #10274
Comments
Robert Muir (@rmuir) (migrated from JIRA) I don't think we should do this. Having to write not just N but N-1 and support reads for those writes later is too much.
If this is the case, then perhaps offer a "concrete bargain" from the distributed systems side. Personally I assume they are just lazy, and trying to force the work on Lucene (simply look at their tests for inconclusive proof of this!). So I would like to know what they would be willing to trade off in return. For example, Solr tests running successfully in 5 minutes on my machine?
Tomas Eduardo Fernandez Lobbe (@tflobbe) (migrated from JIRA)
Yes, this is the same problem in Solr in my experience. For existing collections, things are hidden a bit by the fact that newly elected leaders tend to be the oldest active replica (because of how leader election works), but this is in no way guaranteed. For new collections, I guess one could use placement rules to define where the replicas should land, but as you said, this creates imbalances. Certainly having a Solr cluster with more than one version is a recipe for problems.
Adrien Grand (@jpountz) (migrated from JIRA) I think this option is appealing because it doesn't require direct trade-offs from the users, but it definitely has a big maintenance/test cost.
David Smiley (@dsmiley) (migrated from JIRA) I tend to agree with Rob. Distributed systems on top of Lucene should be able to cope with the status quo, and this may mean more work for replica placement to consider the version if this wasn't thought of in the past. And a truly big/hard-core user could do some relatively basic Lucene re-packaging to ship the previous version if they were sufficiently motivated to care. Not all big search users would even care about this since a re-index or backup/restore may be feasible (it is where I work).
Currently we maintain read/write support for the latest codec in lucene/core, and read-only support for codecs of previous versions (up to {N-1}.0) in lucene/backward-codecs. We often keep write support in test-framework for testing purposes only.
This raises challenges for Elasticsearch with regard to rolling upgrades: we have some users who index very large amounts of data on clusters that are quite large, so that rolling upgrades take significant time. Meanwhile, several indices may be created.
Allocating indices when the cluster has nodes of different versions requires care as Lucene indices created on nodes with a newer version cannot be read by the nodes running the older version. It is possible to force primary replicas to be allocated on the older nodes, but this brings other problems like availability, uneven disk usage across nodes, or moving a lot of data around.
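The allocation constraint above can be sketched as a simple predicate. This is a hypothetical illustration, not Elasticsearch or Solr code: `Node`, `can_host`, `eligible_nodes`, and the `(major, minor)` tuple representation of versions are all assumptions made for the example, and the lower bound on readable versions (N-1) is deliberately ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical cluster node; lucene_version is a (major, minor) tuple."""
    name: str
    lucene_version: tuple


def can_host(node, index_created_version):
    # Simplified rule: a node can read an index only if the node's version
    # is at least as new as the version that wrote the index.
    return node.lucene_version >= index_created_version


def eligible_nodes(nodes, index_created_version):
    # Nodes on which a copy of the index could be allocated.
    return [n for n in nodes if can_host(n, index_created_version)]


if __name__ == "__main__":
    cluster = [Node("old-1", (8, 4)), Node("old-2", (8, 4)), Node("new-1", (8, 5))]
    # An index created on the upgraded node can only live on upgraded nodes.
    print([n.name for n in eligible_nodes(cluster, (8, 5))])
```

During a rolling upgrade the eligible set for newly created indices shrinks to the upgraded nodes, which is where the availability and disk-usage imbalance comes from.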
If Lucene could write data using the minimum version that exists in the cluster, this would avoid this problem as the written data could be read by any node of the cluster. I understand this change would not come for free, especially when it comes to testing as we'd need to make sure that older Lucene versions can read indices created by this "compatibility mode".
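A minimal sketch of how a distributed system might pick the format for such a "compatibility mode", assuming versions are modelled as `(major, minor)` tuples. The function names `write_format_version` and `readable_by_all` are hypothetical; Lucene does not currently expose a write mode like this, which is exactly what this issue asks about.

```python
def write_format_version(node_versions):
    """Return the oldest version present in the cluster.

    Writing with this version's format would let every node read the data,
    under the assumption that newer nodes can read older formats.
    """
    if not node_versions:
        raise ValueError("cluster has no nodes")
    return min(node_versions)


def readable_by_all(written_version, node_versions):
    # Data is readable everywhere if no node is older than the writer's format.
    return all(v >= written_version for v in node_versions)


if __name__ == "__main__":
    versions = [(8, 5), (8, 4), (8, 5)]  # cluster in the middle of a rolling upgrade
    chosen = write_format_version(versions)
    print(chosen, readable_by_all(chosen, versions))
```

Once the last node is upgraded, the minimum catches up with the newest version and new segments would be written in the latest format again.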
I'd be curious to understand whether this is a problem for Solr too and, if not, how this problem is being handled, and maybe whether there are other problems that you have encountered that would also benefit from the ability to write data with an older format.
Migrated from LUCENE-9234 by Adrien Grand (@jpountz), resolved Oct 14 2020