Continue registering pipelines after one pipeline parse failure. #28752

talevy · 2018-02-20T21:35:20Z

Ingest has been failing to apply existing pipelines from cluster-state
into the in-memory representation that are no longer valid. One example of
this is a pipeline with a script processor. If a cluster starts up with scripting
disabled, these pipelines will not be loaded. Even though GETing a pipeline worked,
indexing operations claimed that this pipeline did not exist. This is because one
gets information from cluster-state and the other is from an in-memory data-structure.

Now, two things happen

suppress the exceptions until after other successfull pipelines are loaded
replace failed pipelines with a placeholder pipeline that carries the exception message
in its description

If the pipeline execution service encounters a pipeline stub, it is known that
something went wrong at the time of pipeline creation and an exception was thrown to
the user at some point at start-up.

closes #28269.

Ingest has been failing to apply existing pipelines from cluster-state into the in-memory representation that are no longer valid. One example of this is a pipeline with a script processor. If a cluster starts up with scripting disabled, these pipelines will not be loaded. Even though GETing a pipeline worked, indexing operations claimed that this pipeline did not exist. This is because one gets information from cluster-state and the other is from an in-memory data-structure. Now, two things happen 1. suppress the exceptions until after other successfull pipelines are loaded 2. replace failed pipelines with a placeholder pipeline called `Pipeline.EMPTY` If the pipeline execution service encounters `Pipeline.EMPTY`, it is known that something went wrong at the time of pipeline creation and an exception was thrown to the user at some point at start-up. closes elastic#28269.

martijnvg

This looks good @talevy! I left a suggestion on how to improve handeling these pipeline construction errors.

martijnvg · 2018-02-21T06:49:23Z

server/src/main/java/org/elasticsearch/ingest/PipelineStore.java

        for (PipelineConfiguration pipeline : ingestMetadata.getPipelines().values()) {
            try {
                pipelines.put(pipeline.getId(), factory.create(pipeline.getId(), pipeline.getConfigAsMap(), processorFactories));
            } catch (ElasticsearchParseException e) {
-                throw e;
+                pipelines.put(pipeline.getId(), Pipeline.EMPTY);


I wonder if we can create a pipeline with a single fail processor. This fail processor can then contain the reason why the pipeline in question couldn't be loaded. This way there is no need to check the logs.

I like this, but I will try to do this in such a way that the pipeline isn't actually executed and is only used to store the exception. If I can't find a way to cleanly do that, then I will just let the pipeline run and fail

ah, also, fail processor is inside ingest-common so that makes it slightly more annoying

talevy · 2018-02-21T18:04:18Z

@martijnvg I've updated to carry the exception's message and pass it on to the user at runtime. good call — way better than telling them to look at logs.

martijnvg

@talevy Cool! I left an additional suggestion on how to improve the error handling.

martijnvg · 2018-02-22T06:56:09Z

server/src/main/java/org/elasticsearch/ingest/PipelineStore.java

        for (PipelineConfiguration pipeline : ingestMetadata.getPipelines().values()) {
            try {
                pipelines.put(pipeline.getId(), factory.create(pipeline.getId(), pipeline.getConfigAsMap(), processorFactories));
            } catch (ElasticsearchParseException e) {
-                throw e;
+                pipelines.put(pipeline.getId(), new Pipeline("invalid_" + pipeline.getId(), e.getMessage(),


what if we create a pipeline that throws the exception for us (fail processor is in a ingest-common, but it very simple to have that behaviour here too):

pipelines.put(pipeline.getId(), substitutePipeline(pipeline.getId(), e)); private Pipeline substitutePipeline(String id, ElasticsearchParseException e) { String tag = e.getHeaderKeys().contains("processor_tag") ? e.getHeader("processor_tag").get(0) : null; String type = e.getHeaderKeys().contains("processor_type") ? e.getHeader("processor_type").get(0) : "unknown"; String errorMessage = "pipeline with id [" + id + "] could not be loaded, caused by [" + e.getDetailedMessage() + "]"; Processor failureProcessor = new AbstractProcessor(tag) { @Override public void execute(IngestDocument ingestDocument) throws Exception { throw new IllegalStateException(errorMessage); } @Override public String getType() { return type; } }; String description = "this is a place holder pipeline, because pipeline with id [" + id + "] could not be loaded"; return new Pipeline(id, description, null, new CompoundProcessor(failureProcessor)); }

This way the if statement added PipelineExecutionService.java is not needed and this failure to load logic is in one place.

Note, I updated the above comment.

Also I think it is important to throw an IllegalStateException instead of a IllegalArgumentException, because the fact that we failed to load the pipeline at this stage is a real issue and this is the best way we can deal with it?

The problem with this approach is that we will be put these substituted pipelines in PUT requests.

this will return true: https://github.com/elastic/elasticsearch/pull/28752/files#diff-fd6036d9f6ce14fecb8751a3adb9fc59R280 (testPutWithPipelineFactoryError)

hmm. nevermind.. for some reason I saw that test fail. I might have done something wrong or a different test failed

martijnvg · 2018-02-22T06:57:34Z

server/src/main/java/org/elasticsearch/ingest/PipelineStore.java

@@ -81,16 +81,22 @@ void innerUpdatePipelines(ClusterState previousState, ClusterState state) {
        }

        Map<String, Pipeline> pipelines = new HashMap<>();
+        ArrayList<ElasticsearchParseException> exceptions = new ArrayList<>();


s/ArrayList/List ?

talevy · 2018-03-06T18:45:24Z

@martijnvg I've updated the PR to reflect your suggestion around inserting a placeholder pipeline that fails during execution

martijnvg

LGTM!

(The build only failed, because of a checkstyle violation)

Also I think we should backport this to the 5.6 branch. wdyt?

talevy · 2018-03-08T16:06:45Z

@martijnvg Once I merge into 6.x and master, I'll see how the backport looks for 5.6

) Ingest has been failing to apply existing pipelines from cluster-state into the in-memory representation that are no longer valid. One example of this is a pipeline with a script processor. If a cluster starts up with scripting disabled, these pipelines will not be loaded. Even though GETing a pipeline worked, indexing operations claimed that this pipeline did not exist. This is because one gets information from cluster-state and the other is from an in-memory data-structure. Now, two things happen 1. suppress the exceptions until after other successful pipelines are loaded 2. replace failed pipelines with a placeholder pipeline If the pipeline execution service encounters the stubbed pipeline, it is known that something went wrong at the time of pipeline creation and an exception was thrown to the user at some point at start-up. closes #28269.

…stic#28752) Ingest has been failing to apply existing pipelines from cluster-state into the in-memory representation that are no longer valid. One example of this is a pipeline with a script processor. If a cluster starts up with scripting disabled, these pipelines will not be loaded. Even though GETing a pipeline worked, indexing operations claimed that this pipeline did not exist. This is because one gets information from cluster-state and the other is from an in-memory data-structure. Now, two things happen 1. suppress the exceptions until after other successful pipelines are loaded 2. replace failed pipelines with a placeholder pipeline If the pipeline execution service encounters the stubbed pipeline, it is known that something went wrong at the time of pipeline creation and an exception was thrown to the user at some point at start-up. closes elastic#28269.

* master: (28 commits) Maybe die before failing engine (elastic#28973) Remove special handling for _all in nodes info Remove Booleans use from XContent and ToXContent (elastic#28768) Update Gradle Testing Docs (elastic#28970) Make primary-replica resync failures less lenient (elastic#28534) Remove temporary file 10_basic.yml~ Use different pipeline id in test. (pipelines do not get removed between tests extending from ESIntegTestCase) Use fixture to test the repository-gcs plugin (elastic#28788) Use String.join() to describe a list of tasks (elastic#28941) Fixed incorrect test try-catch statement Plugins: Consolidate plugin and module loading code (elastic#28815) percolator: Take `matchAllDocs` and `verified` of the sub result into account when analyzing a function_score query. Build: Remove rest tests on archive distribution projects (elastic#28952) Remove FastStringReader in favor of vanilla StringReader (elastic#28944) Remove FastCharArrayReader and FastCharArrayWriter (elastic#28951) Continue registering pipelines after one pipeline parse failure. (elastic#28752) Build: Fix ability to ignore when no tests are run (elastic#28930) [rest-api-spec] update doc link for /_rank_eval Switch XContentBuilder from BytesStreamOutput to ByteArrayOutputStream (elastic#28945) Factor UnknownNamedObjectException into its own class (elastic#28931) ...

talevy added >bug :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP labels Feb 20, 2018

talevy requested a review from martijnvg February 20, 2018 21:35

talevy mentioned this pull request Feb 20, 2018

[Ingest] Ingest Node cannot find pipeline that exists #28269

Closed

talevy force-pushed the keep-registering-pipelines branch from c930300 to bcfa96d Compare February 20, 2018 22:23

martijnvg reviewed Feb 21, 2018

View reviewed changes

talevy added 2 commits February 21, 2018 09:40

Merge branch 'master' into keep-registering-pipelines

a1336c4

move from Pipeline.Empty to storing exception in description!

1680c0f

talevy added 3 commits February 21, 2018 13:11

fix pipelinestore test on error. pipeline is no longer null

0d3caba

more test fixes

e25ea29

cleanup state

2da875c

martijnvg reviewed Feb 22, 2018

View reviewed changes

talevy added 2 commits March 5, 2018 10:11

Merge branch 'master' into keep-registering-pipelines

4d2d3ee

change mode to insert pipeline with fail processor

dfb2679

martijnvg approved these changes Mar 8, 2018

View reviewed changes

martijnvg added v7.0.0 v6.3.0 v6.2.3 labels Mar 8, 2018

fix linelength

02d3f97

Merge branch 'master' into keep-registering-pipelines

31a22aa

talevy mentioned this pull request Mar 8, 2018

Ingest Pipelines not working after removing plugin #27783

Closed

fix cleanup

534e155

talevy merged commit 7784c1b into elastic:master Mar 8, 2018

talevy deleted the keep-registering-pipelines branch March 8, 2018 23:23

colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Continue registering pipelines after one pipeline parse failure. #28752

Continue registering pipelines after one pipeline parse failure. #28752

talevy commented Feb 20, 2018 •

edited

Loading

martijnvg left a comment

martijnvg Feb 21, 2018

talevy Feb 21, 2018

talevy Feb 21, 2018

talevy commented Feb 21, 2018

martijnvg left a comment

martijnvg Feb 22, 2018 •

edited

Loading

martijnvg Feb 22, 2018

talevy Mar 6, 2018 •

edited

Loading

talevy Mar 6, 2018

martijnvg Feb 22, 2018

talevy Mar 6, 2018

talevy commented Mar 6, 2018

martijnvg left a comment

talevy commented Mar 8, 2018

Continue registering pipelines after one pipeline parse failure. #28752

Continue registering pipelines after one pipeline parse failure. #28752

Conversation

talevy commented Feb 20, 2018 • edited Loading

martijnvg left a comment

Choose a reason for hiding this comment

martijnvg Feb 21, 2018

Choose a reason for hiding this comment

talevy Feb 21, 2018

Choose a reason for hiding this comment

talevy Feb 21, 2018

Choose a reason for hiding this comment

talevy commented Feb 21, 2018

martijnvg left a comment

Choose a reason for hiding this comment

martijnvg Feb 22, 2018 • edited Loading

Choose a reason for hiding this comment

martijnvg Feb 22, 2018

Choose a reason for hiding this comment

talevy Mar 6, 2018 • edited Loading

Choose a reason for hiding this comment

talevy Mar 6, 2018

Choose a reason for hiding this comment

martijnvg Feb 22, 2018

Choose a reason for hiding this comment

talevy Mar 6, 2018

Choose a reason for hiding this comment

talevy commented Mar 6, 2018

martijnvg left a comment

Choose a reason for hiding this comment

talevy commented Mar 8, 2018

talevy commented Feb 20, 2018 •

edited

Loading

martijnvg Feb 22, 2018 •

edited

Loading

talevy Mar 6, 2018 •

edited

Loading