Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vendor Agnostic Models #16

Open
1 task
chauhankaranraj opened this issue Dec 16, 2020 · 0 comments
Open
1 task

Vendor Agnostic Models #16

chauhankaranraj opened this issue Dec 16, 2020 · 0 comments

Comments

@chauhankaranraj
Copy link
Member

Feedback no. 3

One salient feature of the Backblaze dataset is that the distribution of vendors in the data is neither uniform nor exhaustive. For example, seagate comprises ~70% of data, HGST comprises ~15%, Intel drive data is absent, etc. Also, our initial assumption was that SMART metrics may behave differently for different vendors. Therefore in the current forecasting notebook, models are trained vendor-wise. However, the distribution of vendors across Ceph users is likely different and we want to support all of those vendors.

As a data scientist, I want to explore how "transferable" forecasting models are, across vendors. That is, how is performance affected when a model is trained on data from one vendor and evaluated on data from another one.

Acceptance criteria:

  • EDA notebook comparing model performance on data from the vendor it's trained on and data from other vendors
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant