Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update dot graph to include stages and remove some duplication #2761

Merged
merged 3 commits into from
Jun 23, 2021

Conversation

revans2
Copy link
Collaborator

@revans2 revans2 commented Jun 21, 2021

This uses a modified version of the code that Apache Spark uses to generate its SQL visualization. I updated it to match the style that we currently are using.

This fixes #2711
This fixes #2712

It does not add in job information it could do it, but it is not that critical.

It is not perfect with how it dedupes stages/etc but it is as good as Apache Spark is, because it reuses the code.

This rips out the compare code, that was not being used, and I plan to work on getting a stage to stage comparison setup next so we can more easily see which stages, in large queries both between the CPU and the GPU, and equivalent. That way I can more quickly see what is showing speedups or not.

@revans2 revans2 added the task Work required that improves the product but is not user facing label Jun 21, 2021
@revans2 revans2 added this to the June 21 - July 2 milestone Jun 21, 2021
@revans2 revans2 requested a review from andygrove June 21, 2021 16:39
@revans2 revans2 self-assigned this Jun 21, 2021
@revans2
Copy link
Collaborator Author

revans2 commented Jun 21, 2021

build

1 similar comment
@revans2
Copy link
Collaborator Author

revans2 commented Jun 21, 2021

build

@revans2
Copy link
Collaborator Author

revans2 commented Jun 21, 2021

Something odd happened with the previous CI job. It skipped everything and I don't know why.

@tgravescs
Copy link
Collaborator

@andygrove can you take a look

andygrove
andygrove previously approved these changes Jun 22, 2021
Copy link
Contributor

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I tested locally with some TPC-DS queries and it worked well. It's great having the stage metrics on the diagram.

Copy link
Collaborator

@tgravescs tgravescs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looks fine to me, can we add more to the GenerateDotSuite to make sure the job/stage info is present?

@revans2
Copy link
Collaborator Author

revans2 commented Jun 22, 2021

@tgravescs I updated the test and I also added in some more docs to the README to explain how the stages work.

@revans2
Copy link
Collaborator Author

revans2 commented Jun 22, 2021

build

tgravescs
tgravescs previously approved these changes Jun 23, 2021
Copy link
Collaborator

@tgravescs tgravescs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, can you upmerge to fix the conflict

@revans2
Copy link
Collaborator Author

revans2 commented Jun 23, 2021

build

@tgravescs tgravescs merged commit a646e77 into NVIDIA:branch-21.08 Jun 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
task Work required that improves the product but is not user facing
Projects
None yet
3 participants