[Create-Workload Enhancements] Rearchitect Create-Workload Feature #587

Open
IanHoang opened this issue Jul 17, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

IanHoang (Collaborator) commented Jul 17, 2024

Overview

This is an issue based off one of the proposed priorities in this RFC: #395

Background

As of now, OSB's create-workload is a monolith that uses two modules of functions to create a custom workload. It was designed to be a quick and easy way to build custom workloads from small corpora. While this approach has worked in the past, there is increasing demand for custom workloads built from complex workloads, and more users are relying on this feature to do so.

Users of this feature have mentioned that the create-workload code is currently difficult to extend and maintain and, for newcomers to OSB, difficult to follow and interpret.

We should rearchitect the code to be more organized and scalable, which in turn will make it easier to extend and maintain. This work will also serve as the foundation for future development, such as extracting a random sampling of the documents and repairing incomplete workloads.

Proposed Design

Exploring POCs and different possible designs

Proposed priority

The current structure also makes it difficult for newcomers to understand the code. Rearchitecting it would promote encapsulation and abstraction, making create-workload more organized and scalable as well as easier to extend and maintain.
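To make the encapsulation idea concrete, here is a minimal sketch of one possible shape for the code. It is not an agreed design and does not reflect the current OSB modules: every name in it (`IndexSpec`, `CorpusExtractor`, `FullExtractor`, `RandomSampleExtractor`, `WorkloadBuilder`) is hypothetical, and documents are held in memory instead of being fetched from a cluster. The point is only that extraction strategies, such as the random sampling mentioned above, could plug in behind a small interface without touching the builder.

```python
import random
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class IndexSpec:
    """Hypothetical description of one index to include in a custom workload."""
    name: str
    documents: List[dict] = field(default_factory=list)  # in-memory stand-in for a real index


class CorpusExtractor(ABC):
    """Hypothetical interface with a single responsibility: choose documents for an index."""

    @abstractmethod
    def extract(self, index: IndexSpec) -> List[dict]:
        ...


class FullExtractor(CorpusExtractor):
    """Returns every document of the index."""

    def extract(self, index: IndexSpec) -> List[dict]:
        return list(index.documents)


class RandomSampleExtractor(CorpusExtractor):
    """Returns a random sample of at most `sample_size` documents."""

    def __init__(self, sample_size: int, seed: Optional[int] = None):
        if sample_size < 0:
            raise ValueError("sample_size must be non-negative")
        self.sample_size = sample_size
        self._rng = random.Random(seed)  # seeded so runs can be reproduced

    def extract(self, index: IndexSpec) -> List[dict]:
        # Never ask for more documents than the index holds.
        k = min(self.sample_size, len(index.documents))
        return self._rng.sample(index.documents, k)


class WorkloadBuilder:
    """Assembles a workload description; it knows nothing about how documents are chosen."""

    def __init__(self, extractor: CorpusExtractor):
        self.extractor = extractor

    def build(self, workload_name: str, indices: List[IndexSpec]) -> Dict:
        corpora = []
        for index in indices:
            docs = self.extractor.extract(index)
            corpora.append({"index": index.name, "doc_count": len(docs), "documents": docs})
        return {"workload": workload_name, "corpora": corpora}
```

With this split, switching from a full extraction to a sample would be a one-line change at the call site, for example `WorkloadBuilder(RandomSampleExtractor(sample_size=1000, seed=42))` in place of `WorkloadBuilder(FullExtractor())`, and a new strategy would only need to implement `extract`.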

IanHoang added the enhancement and untriaged labels and removed the untriaged label on Jul 17, 2024
IanHoang self-assigned this on Jul 17, 2024
gkamat changed the title from "[Create-Workload Enhancements] Reorganize Create-Worklload Feature" to "[Create-Workload Enhancements] Reorganize Create-Workload Feature" on Jul 18, 2024
IanHoang moved this from 🆕 New to 🏗 In progress in Engineering Effectiveness Board on Jul 30, 2024
IanHoang changed the title from "[Create-Workload Enhancements] Reorganize Create-Workload Feature" to "[Create-Workload Enhancements] Rearchitect Create-Workload Feature" on Aug 9, 2024

IanHoang (Collaborator, Author) commented Aug 9, 2024

Received feedback to add support for pbzip2 compression now that OSB supports it. Will create a separate PR for it.

gkamat moved this to 🏗 In progress in OpenSearch Benchmark Roadmap on Sep 13, 2024
IanHoang moved this from 🏗 In progress to Backlog in Engineering Effectiveness Board on Sep 25, 2024

gkamat (Collaborator) commented Oct 15, 2024

@IanHoang, it may be helpful to add some child tasks to this issue, since there are multiple items here.

IanHoang moved this from 📦 Backlog to 🏗 In progress in Engineering Effectiveness Board on Jan 21, 2025

IanHoang (Collaborator, Author) commented Jan 21, 2025

Taking this back up. Will timebox this to explore POCs and different designs.

Projects
Status: 🏗 In progress
Development

No branches or pull requests

2 participants