[FEA] faultinj CICD #525

Open
pxLi opened this issue Sep 6, 2022 · 1 comment
Labels
build faultinj relates to fault injection tool

Comments

Collaborator

pxLi commented Sep 6, 2022

Is your feature request related to a problem? Please describe.
This ticket is for discussing the CI/CD requirements for the faultinj tooling.

There are still some open questions about the tooling:
A. The artifact is a .so file. Where should we deploy it?
To an internal-only or an external artifact store? Or do we ask developers to build it themselves whenever they need the tool?

B. What is the plan for this tooling? Do we plan to release it?
Is there a roadmap for it, e.g. what we are trying to achieve in the next release?

C. The design doc lists several scenarios, but there are still no specific test specs (SW and HW) or expected results, which we need in order to have deterministic nightly runs. A table that spells out the details of each scenario would help more than a bare command.
e.g.
spark test w/ some specific configs
some faultinj-specific configs
driver 450.xx
ubuntu 18.04
GPU w/ 12Gi mem
should return error count X. Then with driver 465.yy / centos7 / a 24Gi-mem GPU, it should return error count Y/Z/A.
Alternatively, state explicitly that the cuda/OS/GPU type does not matter here, or that we do not care about the error count, or that the setup meets our expectations as long as the test errors out. Then we could set up a regular run for it.
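
To make the kind of table I mean more concrete, here is a rough sketch of how the spec rows could be written down and checked. Everything in it is hypothetical: the `Scenario` class, its field names, and the `select` helper are made up for illustration, and the counts 3 and 5 are placeholders for the unknown X and Y, not real expectations. A field left as `None` means "does not matter", and a missing expected count means "we only require that the test errors out".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Scenario:
    """One row of a hypothetical faultinj test-spec table."""
    name: str
    driver: Optional[str] = None    # None: driver version does not matter
    os_name: Optional[str] = None   # None: OS does not matter
    gpu_mem: Optional[str] = None   # None: GPU memory size does not matter
    # None: we do not care about the exact count, only that errors occur
    expected_error_count: Optional[int] = None

    def matches(self, driver: str, os_name: str, gpu_mem: str) -> bool:
        """True if this row applies to the given environment."""
        pairs = (
            (self.driver, driver),
            (self.os_name, os_name),
            (self.gpu_mem, gpu_mem),
        )
        return all(want is None or want == got for want, got in pairs)

    def passed(self, observed_error_count: int) -> bool:
        """Compare an observed error count against this row's expectation."""
        if self.expected_error_count is None:
            return observed_error_count > 0
        return observed_error_count == self.expected_error_count


# Placeholder rows: 3 and 5 stand in for the unknown X and Y.
SCENARIOS: List[Scenario] = [
    Scenario("pinned-a", "450.xx", "ubuntu 18.04", "12Gi", expected_error_count=3),
    Scenario("pinned-b", "465.yy", "centos7", "24Gi", expected_error_count=5),
    Scenario("any-env"),  # environment and exact count do not matter
]


def select(driver: str, os_name: str, gpu_mem: str) -> List[Scenario]:
    """Return the rows that apply to the environment a nightly job runs on."""
    return [s for s in SCENARIOS if s.matches(driver, os_name, gpu_mem)]
```

A nightly job could then call `select(...)` with its own driver/OS/GPU values and check each returned row with `passed(...)`, so an undefined combination shows up as "no pinned row" instead of an unexplained failure.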

Thanks

pxLi added the build and faultinj (relates to fault injection tool) labels Sep 6, 2022
Collaborator Author

pxLi commented Sep 8, 2022

@sameerz can you share some info about the plan here? At the very least, I would like to understand what the must-haves for faultinj are for the next release, 22.10. Thanks
