Use _refresh instead of reading from Translog in the RT GET case #20102
Conversation
Today we do a lot of accounting inside the engine to maintain locations of documents inside the transaction log. This is only needed to ensure we can return the document's source from the engine if it hasn't been refreshed yet. Aside from the added complexity of reading from the currently written translog and maintaining pointers into it, this also caused inconsistencies, like different values of the `_ttl` field depending on whether it was read from the translog or not. TermVectors are totally different if the document is fetched from the translog, since copy fields are ignored, etc. This change will simply call `refresh` if the document's latest version is not in the index. This streamlines the semantics of the `_get` API and allows for more optimizations inside the engine and on the transaction log. Note: `_refresh` is only called iff the requested document is not refreshed yet but has recently been updated or added.
LGTM, so simple. This goes a long way towards addressing #19787, where the version map consumes 2/3 of the indexing buffer. Version map is still alive w/ this change, but not storing the
LGTM2
In a followup we should switch the default for GETs to non-realtime. It's nice that we can still do a realtime GET, but the performance is very different. I figure we can do the docs in that PR? I'm happy to make the PR if no one else wants it.
@nik9000 What is your motivation to make GET not realtime by default? Are you worried about the worst case of someone doing a PUT then GET on the same id all the time?
Basically, yes. HTTP GETs are usually fast and don't make changes to the server. People are used to that because it is true for almost all GETs. This way of doing realtime GETs is still fairly fast in most cases, but it'll make lots of little segments. Lots of little segments are bad for performance for lots of reasons, but in this case I worry because they don't actually slow down the GET all that much, just the searches done later. I think it is the right thing to do if we're trying to push folks towards fast things by default.
I don't think this is true. We only refresh iff the document we are fetching has changed since we refreshed the last time. So I guess commonly we won't refresh at all.
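The refresh-only-when-dirty behavior described above can be sketched as a toy model: a refresh is triggered only when the requested id was written since the last refresh. All names here (`RealtimeGetSketch`, the map standing in for the live version map) are illustrative, not Elasticsearch's actual classes.

```java
import java.util.HashMap;
import java.util.Map;

class RealtimeGetSketch {
    // ids written since the last refresh (stand-in for the live version map)
    private final Map<String, Long> unrefreshed = new HashMap<>();
    int refreshCount = 0;

    void index(String id, long version) {
        unrefreshed.put(id, version);
    }

    /** Returns true if serving this GET required an internal refresh. */
    boolean realtimeGet(String id) {
        if (unrefreshed.containsKey(id)) {
            refresh();        // make the doc visible, then read from the index
            return true;
        }
        return false;         // doc is already searchable: no refresh needed
    }

    void refresh() {
        unrefreshed.clear();  // all pending docs become visible at once
        refreshCount++;
    }
}
```

Under this model, a workload that GETs documents that were not recently written never pays for a refresh; only a PUT-then-GET on the same id does.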
I removed discuss and WIP labels and updated docs, I think it's ready.
has been updated but is not yet refreshed, the get API will issue a refresh
call in-place to make the document visible. This will also make other documents
changed since the last refresh visible. In order to disable realtime GET,
one can set `realtime` parameter to `false`.
set THE
minor docs changes, otherwise LGTM
    @Override
    public long ramBytesUsed() {
-       return RamUsageEstimator.NUM_BYTES_OBJECT_HEADER + Long.BYTES + RamUsageEstimator.NUM_BYTES_OBJECT_REF +
-           (translogLocation != null ? translogLocation.size : 0);
+       return RamUsageEstimator.NUM_BYTES_OBJECT_HEADER + Long.BYTES + RamUsageEstimator.NUM_BYTES_OBJECT_REF;
so Translog.Location does not need to implement Accountable anymore?
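The accounting change in the diff above can be sanity-checked with a back-of-the-envelope model of what each version-map entry stops paying for once the translog location is no longer retained. The constants below are assumptions mirroring typical 64-bit JVM values from Lucene's `RamUsageEstimator`; the exact numbers are illustrative only.

```java
class VersionValueFootprint {
    static final long OBJECT_HEADER = 16; // assumed object header size (64-bit JVM)
    static final long OBJECT_REF = 8;     // assumed reference size

    // old accounting: header + version long + ref + retained translog location
    static long before(long translogLocationSize) {
        return OBJECT_HEADER + Long.BYTES + OBJECT_REF + translogLocationSize;
    }

    // new accounting: the location (and its variable-size payload) is gone
    static long after() {
        return OBJECT_HEADER + Long.BYTES + OBJECT_REF;
    }
}
```

The saving per entry is exactly the size of the retained location, which is what made the version map dominate the indexing buffer in #19787.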
In the event of an explosion of segments because of these refreshes, performance concerns aside (as I don't know enough to comment on them), it /could/ chew through file descriptors, which would be more problematic. Would it make sense to include a caveat to that effect in the event someone abuses this? Or are we assuming that
@tylerfontaine very good question! The assumption here is that our mechanisms of throttling indexing and only flushing compound file segments will keep the number of file descriptors at bay by triggering merging. If you have thousands of shards on a node that might become a problem, but that's orthogonal to this change. Makes sense?
As of now, one recommendation to improve indexing speed in bulk is to change the refresh interval parameter to a higher value (30 seconds) or to disable refresh while doing bulk operations.
Could it happen that we slow down indexing speed because of too many refresh operations happening constantly?
@ismael-hasan your analysis of the problem is correct. If you are updating the same doc over and over again you will hit refresh way more often. If this is the rule rather than the exception in your application, it would be recommended to bulk those updates together on your application layer. I am convinced that this is a rare corner case, would you agree given the overall user base? We are trying to reduce the footprint in terms of memory and APIs that exist solely because of this corner case, and reading from the translog in only this situation would basically revert this PR. We don't want to sacrifice the overall experience for corner cases like this, since it would make maintainability of performance and quality hard over time. Yet, I am happy to add this limitation and your use case to the docs to give people a heads-up. Can you elaborate on your use case a bit more so I can phrase it?
@s1monw It was an example use case, not an actual one. I also think it is an unusual scenario and it should be addressed from the data point of view. I saw it just a couple of times in previous jobs, for instance with partial updates: a document is inserted in a production database, triggering storing it somewhere else to be indexed later; then a new field is populated and the change is stored as well, and a second/third time; finally, when all of this goes to ingestion, we have 1 record for the original document and 3 records with additions of new fields to the object. I just wanted to raise awareness of it and to know better how we planned to handle it; if allowing the translog approach for bulk was already considered and discarded, I am happy with it! I strongly +1 adding a note to the documentation, it will save some headaches in the future with "Ingestion is going slow"; it is easier to plan ahead how to implement your bulk ingestion taking this into account than to change the approach later when you realize you are refreshing too often during bulk. I would add something like "When doing bulk indexing, sometimes the data will not be curated, including several operations to the same document in the same bulk operation. This approach can cause slow indexing, since it will trigger unnecessary refresh operations. It is recommended in this use case to pre-process the data to send only one addition/update per document per bulk operation which includes all of the modifications at once."
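The pre-processing suggested above can be sketched as a small client-side coalescing step: several partial updates to the same id are collapsed into a single document per bulk request, so repeated update/GET cycles don't force repeated refreshes. The `{docId, field, value}` triple format is made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BulkCoalescer {
    /** Merge a stream of (docId, field, value) updates into one doc per id. */
    static Map<String, Map<String, String>> coalesce(List<String[]> updates) {
        Map<String, Map<String, String>> merged = new LinkedHashMap<>();
        for (String[] u : updates) {
            // later writes to the same field win, mirroring last-write-wins
            merged.computeIfAbsent(u[0], k -> new LinkedHashMap<>()).put(u[1], u[2]);
        }
        return merged;
    }
}
```

With this, three partial updates to `doc1` become one bulk action carrying the union of the fields, instead of three operations that each dirty the same id again.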
@ismael-hasan I thought about your use case again and I think there is a misunderstanding. This really only applies to indexing if you use Elasticsearch as a primary store, not when you sync a database. For instance, if you have a trigger in the DB you generate a new version of doc X and either index it or put it aside (as you said) to bulk it later. Now another field is populated, you create version N+1 of the doc and either index it or put it aside in the same bulk, and so on. None of these indexing operations will use the GET API. if you use the
Note this is informative; by no means am I suggesting to revert this change.
This makes GET operations more consistent with `_search` operations, which expect `(stored_)fields` to work on stored fields and source filtering to work on the `_source` field. This is now possible thanks to the fact that GET operations do not read from the translog anymore (elastic#20102) and also allows getting rid of `FieldMapper#isGenerated`. The `_termvectors` API (and thus more_like_this too) was relying on the fact that GET operations would extract fields from either stored fields or the source, so the logic to do this that used to exist in `ShardGetService` has been moved to `TermVectorsService`. It would be nice if term vectors did not rely on this, but that does not seem to be low-hanging fruit.
This option has been removed in elastic#20102.
The realtime GET API currently has erratic performance in cases where a document that has just been indexed but not refreshed yet is accessed, as the implementation will force an internal refresh in that case. Refreshing can be an expensive operation, and will also block the thread that executes the GET operation, blocking other GETs from being processed. In case of frequent access to recently indexed documents, this can lead to a refresh storm and terrible GET performance. While older versions of Elasticsearch (2.x and older) did not trigger refreshes and instead opted to read from the translog in case of the realtime GET API or update API, this was removed in 5.0 (#20102) to avoid inconsistencies between values that were returned from the translog and those returned by the index. This was partially reverted in 6.3 (#29264) to allow _update and upsert to read from the translog again, as it was easier to guarantee consistency for these, and it also brought back more predictable performance characteristics of this API. Calls to the realtime GET API, however, would still always do a refresh if necessary to return consistent results. This means that users that were calling realtime GET APIs to coordinate updates on the client side (realtime GET + CAS for conditional index of the updated doc) would still see very erratic performance. This PR (together with #48707) resolves the inconsistencies between reading from the translog and the index. In particular it fixes the inconsistencies that happen when requesting stored fields, which were not available when reading from the translog. In cases where stored fields are requested, this PR will reparse the _source from the translog and derive the stored fields to be returned. With this, it changes the realtime GET API to allow reading from the translog again, avoiding refresh storms and blocking of the GET threadpool, and providing overall much better and predictable performance for this API.
NOTE: there are more APIs that are test-only now that potentially can be removed... I didn't run REST tests yet etc., so it's really WIP