Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Add runtime mappings to data frame analytics source config #69183

Conversation

dimitris-athanasiou
Copy link
Contributor

Users can now specify runtime mappings as part of the source config
of a data frame analytics job. Those runtime mappings become part of
the mapping of the destination index. This ensures the fields are
accessible in the destination index even if the relevant data frame
analytics job gets deleted.

Closes #65056

Users can now specify runtime mappings as part of the source config
of a data frame analytics job. Those runtime mappings become part of
the mapping of the destination index. This ensures the fields are
accessible in the destination index even if the relevant data frame
analytics job gets deleted.

Closes elastic#65056
@elasticmachine elasticmachine added the Team:ML Meta label for the ML team label Feb 18, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

@benwtrent benwtrent self-requested a review February 18, 2021 15:35
Copy link
Member

@benwtrent benwtrent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to add a test for regression, classification that utilizes runtime field mappings (didn't see one). This way we can be sure it works there and works fine during the inference phase of analytics.

import java.util.Map;

public final class RuntimeMappingsValidator {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private ctor?

Suggested change
private RuntimeMappingsValidator() { }

Map<String, Object> propNode = new HashMap<>(((Map<String, Object>) entry.getValue()));
Object typeNode = propNode.get("type");
if (typeNode == null) {
throw ExceptionsHelper.badRequestException("No type specified for runtime field [" + fieldName + "]");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
throw ExceptionsHelper.badRequestException("No type specified for runtime field [" + fieldName + "]");
throw ExceptionsHelper.badRequestException("No type specified for runtime field [{}]", fieldName);

Comment on lines 31 to 32
throw ExceptionsHelper.badRequestException("Expected map for runtime field [" + fieldName + "] " +
"definition but got a " + fieldName.getClass().getSimpleName());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
throw ExceptionsHelper.badRequestException("Expected map for runtime field [" + fieldName + "] " +
"definition but got a " + fieldName.getClass().getSimpleName());
throw ExceptionsHelper.badRequestException(
"Expected map for runtime field [{}] definition but got a {}",
fieldName,
fieldName.getClass().getSimpleName()
);

I think string formatting like this is much nicer. Feel free to ignore :)

@dimitris-athanasiou
Copy link
Contributor Author

@benwtrent Thank you for pointing out lack of coverage for regression/classification. Adding tests revealed a couple more places where I had to add runtime mappings. Pushed the fix, should be good for another review now.

@dimitris-athanasiou
Copy link
Contributor Author

@elasticmachine update branch

@dimitris-athanasiou dimitris-athanasiou merged commit 7fb98c0 into elastic:master Feb 19, 2021
@dimitris-athanasiou dimitris-athanasiou deleted the search-runtime-mappings-for-data-frame-analytics branch February 19, 2021 14:29
dimitris-athanasiou added a commit that referenced this pull request Feb 19, 2021
dimitris-athanasiou added a commit that referenced this pull request Feb 19, 2021
#69284)

Users can now specify runtime mappings as part of the source config
of a data frame analytics job. Those runtime mappings become part of
the mapping of the destination index. This ensures the fields are
accessible in the destination index even if the relevant data frame
analytics job gets deleted.

Closes #65056

Backport of #69183
dimitris-athanasiou added a commit that referenced this pull request Feb 20, 2021
dimitris-athanasiou added a commit that referenced this pull request Feb 22, 2021
…ily (#69329)

Since config runtime mappings were added to data frame analytics jobs
in  #69183, when we create the destination index we add a `runtime`
section in its mappings regardless of whether the job config contains
any runtime fields. This causes failures in the BWC tests when in a
mixed cluster where there are nodes versioned before runtime fields
were introduced.

This is the first part of the fix were we do not create a `runtime` section
in the destination index mappings unless necessary.
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Feb 22, 2021
…essarily (elastic#69329)

Since config runtime mappings were added to data frame analytics jobs
in  elastic#69183, when we create the destination index we add a `runtime`
section in its mappings regardless of whether the job config contains
any runtime fields. This causes failures in the BWC tests when in a
mixed cluster where there are nodes versioned before runtime fields
were introduced.

This is the first part of the fix were we do not create a `runtime` section
in the destination index mappings unless necessary.

Backport of elastic#69329
dimitris-athanasiou added a commit that referenced this pull request Feb 22, 2021
…essarily (#69329) (#69332)

Since config runtime mappings were added to data frame analytics jobs
in  #69183, when we create the destination index we add a `runtime`
section in its mappings regardless of whether the job config contains
any runtime fields. This causes failures in the BWC tests when in a
mixed cluster where there are nodes versioned before runtime fields
were introduced.

This is the first part of the fix were we do not create a `runtime` section
in the destination index mappings unless necessary.

Backport of #69329
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :ml Machine learning Team:ML Meta label for the ML team v7.13.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ML] Add support for search-time runtime fields to data frame analytics
4 participants