-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature] Add Basic Support for OpenSearch Ingestion controls from the Migration Console #621
[Feature] Add Basic Support for OpenSearch Ingestion controls from the Migration Console #621
Conversation
Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
…line Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #621 +/- ##
============================================
+ Coverage 75.77% 75.88% +0.10%
- Complexity 1533 1534 +1
============================================
Files 168 168
Lines 6395 6395
Branches 570 570
============================================
+ Hits 4846 4853 +7
+ Misses 1172 1164 -8
- Partials 377 378 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
SECRET_CONFIG_TEMPLATE = """ | ||
pipeline_configurations: | ||
aws: | ||
secrets: | ||
source-secret-config: | ||
secret_id: <SOURCE_SECRET_ID_PLACEHOLDER> | ||
region: <AWS_REGION_PLACEHOLDER> | ||
sts_role_arn: <STS_ROLE_ARN_PLACEHOLDER> | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These three can be replaced done with standard string interpolation, because the variables are already filled above. This would ensure that changing the variable name only has to happen in one place, or if we move away from these placeholders (to env vars or something else), this doesn't have to be changed. (same applies to lines 35-39)
SECRET_CONFIG_TEMPLATE = """ | |
pipeline_configurations: | |
aws: | |
secrets: | |
source-secret-config: | |
secret_id: <SOURCE_SECRET_ID_PLACEHOLDER> | |
region: <AWS_REGION_PLACEHOLDER> | |
sts_role_arn: <STS_ROLE_ARN_PLACEHOLDER> | |
""" | |
SECRET_CONFIG_TEMPLATE = f""" | |
pipeline_configurations: | |
aws: | |
secrets: | |
source-secret-config: | |
secret_id: {SOURCE_SECRET_ID_PLACEHOLDER} | |
region: {AWS_REGION_PLACEHOLDER} | |
sts_role_arn: {STS_ROLE_ARN_PLACEHOLDER} | |
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I slightly misunderstood how this was being used. I still think an f-string is a better way to do it, but will elaborate below
…sole Signed-off-by: Tanner Lewis <[email protected]>
secret_config = SECRET_CONFIG_TEMPLATE.replace(SOURCE_SECRET_ID_PLACEHOLDER, source_auth_secret) | ||
secret_config = secret_config.replace(STS_ROLE_ARN_PLACEHOLDER, source_pipeline_role_arn) | ||
secret_config = secret_config.replace(AWS_REGION_PLACEHOLDER, aws_region) | ||
pipeline_config = pipeline_config.replace(AWS_SECRET_CONFIG_PLACEHOLDER, secret_config) | ||
pipeline_config = pipeline_config.replace(SOURCE_AUTH_OPTIONS_PLACEHOLDER, SOURCE_BASIC_AUTH_CONFIG_TEMPLATE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I'm not sure I understand why you went with this construction method and all the placeholders, over e.g. a generate_secret_config
function that takes these variables and returns a secret config string (likely using f-strings internally, but could be whatever you want). Am I missing a reason why that wouldn't work?
I think I would basically make it look something like this:
def generate_secret_config(source_auth_secret, source_pipeline_role_arn, aws_region):
return f"""
pipeline_configurations:
aws:
secrets:
source-secret-config:
secret_id: {source_auth_secret}
region: {aws_region}
sts_role_arn: {source_pipeline_role}
"""
And then something pretty similar for pipeline config. It looks much more straightforward and testable to me, but maybe I'm missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or, if you want to "declare" the config/python templates somewhere outside of a function, you can do this sort of thing:
SECRET_CONFIG_FRAGMENT = """
secrets:
source-secret-config:
secret_id: {source_auth_secret}
region: {aws_region}
sts_role_arn: {source_pipeline_role}
"""
def generate_secret_config(**kwargs):
return SECRET_CONFIG_FRAGEMENT.format(**kwargs) # or make a dictionary with the named function params
But either of these seems like a much more canonical python way to do what you're doing "manually"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out and the examples, this went through a few iterations and ended up in a usable but unsavory state. I've refactored this a good bit to not use any placeholders internally but have left placeholders in the actual template file. My assumption is that users may want to add OSI specific settings to the template, and this seems to be the easiest way to allow users to optionally add to the template. As we actually use this though, we may find that this isn't the case and users are better off if we just completely generate the pipeline template file ourselves.
logger = logging.getLogger(__name__) | ||
DEFAULT_PIPELINE_NAME = "migration-assistant-pipeline" | ||
DEFAULT_PIPELINE_CONFIG_PATH = "./osiPipelineTemplate.yaml" | ||
AWS_SECRET_CONFIG_PLACEHOLDER = "<AWS_SECRET_CONFIG_PLACEHOLDER>" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor note, but if we stick with the PLACEHOLDER thing, can you add a blank line to create a distinction between the "real" variables whose value likely matters to the user, and the placeholders that are less relevant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
cw_log_options = { | ||
'IsLoggingEnabled': False | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear: right now we won't get any logs from the pipeline?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct, from my testing there was issue with OSI both to auto-create log group (didn't seem possible with error), and to provide an existing log-group (was looking for a very specific name, and when matched still produced issue). Need to get these resolved, but that's the current state
# Documentation lists this as a requirement but does not seem to be the case from testing: | ||
# https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/osis/client/create_pipeline.html | ||
# BufferOptions={ | ||
# 'PersistentBufferEnabled': False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which value does it default to? Do we care which one it actually is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want false since we are a pull based source, but I experienced an issue with deployment essentially saying BufferOptions shouldn't exist, yet its a requirement on the API doc 🤷♂️
@@ -34,6 +38,7 @@ RUN chmod ug+x /root/humanReadableLogs.py | |||
RUN chmod ug+x /root/simpleDocumentGenerator.py | |||
RUN chmod ug+x /root/catIndices.sh | |||
RUN chmod ug+x /root/showFetchMigrationCommand.sh | |||
RUN chmod ug+x /root/osiMigration.py | |||
RUN chmod ug+x /root/kafka-tools/kafkaExport.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor, we should probably start doing all these chmods in one line so we create fewer layers (and maybe the copies too), but I don't think that's an issue for this moment
I have some questions about the python implementation, but otherwise LTGM |
Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
Description
This change introduces basic support for creating/starting/stopping an OpenSearch Ingestion pipeline.
osiMigration.py
script to the migration console for controlling OSI pipelines. This script supports starting/stopping a pipeline as well as creating a pipeline from provided command arguments or creating the pipeline from inferred settings based on the associated deployment.See README update for details on executing. Once this is merged, I expect to copy and paste something similar to the wiki.
Issues Resolved
https://opensearch.atlassian.net/browse/MIGRATIONS-1659
Testing
Manual AWS testing on cloud
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.