Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query updates for performance #215

Closed
bmtcril opened this issue Apr 11, 2024 · 4 comments
Closed

Query updates for performance #215

bmtcril opened this issue Apr 11, 2024 · 4 comments

Comments

@bmtcril
Copy link
Contributor

bmtcril commented Apr 11, 2024

Run 1

Superset Reports: aspects-0.96.4-5ad897

1. Problem engagement by section/subsection
Superset time: 141.447137 (s).
2. Video engagement by section/subsection
Superset time: 97.140689 (s).
3. Distribution Of Attempts
Superset time: 90.075801 (s).
4. Distribution Of Hints Per Correct Answer
Superset time: 89.046454 (s).
5. Responses Per Problem
Superset time: 88.972371 (s).
6. Distribution Of Responses
Superset time: 50.679826 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 2
Filters: [{'column': 'emission_time'}]

7. Watched Video Segments
Superset time: 47.319447 (s).
8. Pageview engagement by section/subsection
Superset time: 46.930388 (s).
9. Partial and full views per video
Superset time: 45.919465 (s).
10. Distribution Of Problem Grades
Superset time: 39.189939 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 10
Filters: [{'column': 'emission_time'}]

11. Currently Enrolled Learners Per Day
Superset time: 20.703394 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 1560
Filters: [{'column': 'enrollment_status'}, {'column': 'enrollment_status_date'}]

12. Watches Per Video
Superset time: 18.602801 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 10000
Filters: [{'column': 'emission_time'}]

13. Course Grade Distribution
Superset time: 10.497829 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 10
Filters: [{'column': 'emission_time'}]

14. Enrollments By Enrollment Mode
Superset time: 2.516631 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 3
Filters: [{'column': 'enrollment_status'}, {'column': 'emission_time'}]

15. Enrollment Events Per Day
Superset time: 1.440251 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 1549
Filters: [{'column': 'emission_time'}]

16. Transcripts / Captions Per Video
Superset time: 1.042287 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 10000
Filters: [{'column': 'emission_time'}]

17. Posts per user
Superset time: 0.747924 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 58
Filters: []

18. Distinct forum users
Superset time: 0.44716 (s).
Succeeded: True
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 1
Filters: [{'column': 'emission_time'}]

Run 2

Superset Reports: aspects-0.96.4-264afc

1. Problem engagement by section/subsection
Superset time: 140.097964 (s).
2. Video engagement by section/subsection
Superset time: 115.120385 (s).
3. Distribution Of Hints Per Correct Answer
Superset time: 90.338544 (s).
4. Distribution Of Attempts
Superset time: 89.500845 (s).
5. Responses Per Problem
Superset time: 89.127751 (s).
6. Distribution Of Responses
Superset time: 50.785438 (s).
Succeeded: True
Query duration: 50.269 (s).
Result rows: 2
Memory Usage (MB): 4289.866982460022
Row count (superset) 2
Filters: [{'column': 'emission_time'}]

7. Pageview engagement by section/subsection
Superset time: 47.334997 (s).
8. Watched Video Segments
Superset time: 45.728415 (s).
9. Partial and full views per video
Superset time: 45.415808 (s).
10. Distribution Of Problem Grades
Superset time: 38.675151 (s).
Succeeded: True
Query duration: 38.196 (s).
Result rows: 10
Memory Usage (MB): 4131.521450996399
Row count (superset) 10
Filters: [{'column': 'emission_time'}]

11. Currently Enrolled Learners Per Day
Superset time: 19.75312 (s).
Succeeded: True
Query duration: 19.087 (s).
Result rows: 1560
Memory Usage (MB): 3465.305534362793
Row count (superset) 1560
Filters: [{'column': 'enrollment_status'}, {'column': 'enrollment_status_date'}]

12. Watches Per Video
Superset time: 18.122886 (s).
Succeeded: True
Query duration: 17.602 (s).
Result rows: 10000
Memory Usage (MB): 2978.9347105026245
Row count (superset) 10000
Filters: [{'column': 'emission_time'}]

13. Course Grade Distribution
Superset time: 9.349726 (s).
Succeeded: True
Query duration: 9.021 (s).
Result rows: 10
Memory Usage (MB): 1034.8061332702637
Row count (superset) 10
Filters: [{'column': 'emission_time'}]

14. Enrollment Events Per Day
Superset time: 1.588779 (s).
Succeeded: True
Query duration: 1.07 (s).
Result rows: 3094
Memory Usage (MB): 226.0701723098755
Row count (superset) 1549
Filters: [{'column': 'emission_time'}]

15. Enrollments By Enrollment Mode
Superset time: 1.486591 (s).
Succeeded: True
Query duration: 0.747 (s).
Result rows: 3
Memory Usage (MB): 248.0316982269287
Row count (superset) 3
Filters: [{'column': 'enrollment_status'}, {'column': 'emission_time'}]

16. Transcripts / Captions Per Video
Superset time: 1.153137 (s).
Succeeded: True
Query duration: 0.521 (s).
Result rows: 10000
Memory Usage (MB): 141.44000148773193
Row count (superset) 10000
Filters: [{'column': 'emission_time'}]

17. Posts per user
Superset time: 0.805276 (s).
Succeeded: True
Query duration: 0.432 (s).
Result rows: 58
Memory Usage (MB): 395.145396232605
Row count (superset) 58
Filters: []

18. Distinct forum users
Superset time: 0.557155 (s).
Succeeded: True
Query duration: 0.211 (s).
Result rows: 1
Memory Usage (MB): 158.88139629364014
Row count (superset) 1
Filters: [{'column': 'emission_time'}]
@bmtcril
Copy link
Contributor Author

bmtcril commented Apr 11, 2024

Full log including queries and errors is attached here.
perf_runs.log

@bmtcril
Copy link
Contributor Author

bmtcril commented Apr 18, 2024

Additional information:

Using the updated query performance script that takes a course id and filters to that course (the way these dashboards are intended to be used) we've got better numbers (using a "huge" load test course on the same dataset):

1. Problem Interactions
Superset time: 187.613377 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 160
Filters: [{'column': 'emission_time'}, {'column': 'course_key'}]

2. Problem Results
Superset time: 187.319167 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 160
Filters: [{'column': 'emission_time'}, {'column': 'course_key'}]

3. Learners with Passing Grade
Superset time: 1.707321 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 1
Filters: [{'column': 'course_key'}]

4. Cumulative Interactions
Superset time: 1.349912 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 5
Filters: [{'column': 'visited_on'}, {'column': 'course_key'}]

5. Evolution of engagement
Superset time: 1.24138 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 5
Filters: [{'column': 'visited_on'}, {'column': 'course_key'}]

6. Course Information
Superset time: 1.218131 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 1
Filters: [{'column': 'course_key'}]

7. Page views per section/subsection
Superset time: 0.913372 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 50
Filters: [{'column': 'emission_time'}, {'column': 'course_key'}]

8. Subsection Summary
Superset time: 0.91096 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 27
Filters: [{'column': 'visited_on'}, {'column': 'course_key'}]

9. Section Summary
Superset time: 0.910074 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 8
Filters: [{'column': 'visited_on'}, {'column': 'course_key'}]

10. Problems attempted per section/subsection
Superset time: 0.509738 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 50
Filters: [{'column': 'emission_time'}, {'column': 'course_key'}]

11. Partial and full views per video
Superset time: 0.481967 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 40
Filters: [{'column': 'emission_time'}, {'column': 'course_key'}]

12. Video views per section/subsection
Superset time: 0.477667 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 50
Filters: [{'column': 'emission_time'}, {'column': 'course_key'}]

13. Current Enrollees
Superset time: 0.327181 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 1
Filters: [{'column': 'emission_time'}, {'column': 'enrollment_status'}, {'column': 'course_key'}]

14. Cumulative Enrollments by Track
Superset time: 0.322249 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 5
Filters: [{'column': 'emission_time'}, {'column': 'course_key'}]

15. Enrollees per Enrollment Track
Superset time: 0.189191 (s).
Query duration: 0.0 (s).
Result rows: None
Memory Usage (MB): None
Row count (superset) 3
Filters: [{'column': 'emission_time'}, {'column': 'enrollment_status'}, {'column': 'course_key'}]

The only report that got killed for memory was "Watched Video Segments":

2024-04-18 19:38:19,771:ERROR:performance_metrics:Error fetching slice data: Watched Video Segments. Error: Error: :HTTPDriver for https://kwzev57t01.us-east-2.aws.clickhouse.cloud:8443 returned response code 500)
 Code: 241. DB::Exception: Memory limit (total) exceeded: would use 7.29 GiB (attempt to allocate chunk of 4906248 bytes), maximum: 7.20 GiB

Interestingly this consistently loads in the UI.

"Distribution of Current Course Grade" consistently failed, or returned no data in 0 seconds in spite of it actually returning data in the dashboard.

Full log:
perf_runs_2.log

@Ian2012
Copy link
Contributor

Ian2012 commented Sep 26, 2024

Problem interactions and problem results are that slow? Really?

Should we close this one and open a new one for Sumac?

@bmtcril
Copy link
Contributor Author

bmtcril commented Sep 26, 2024

Yes, we should see more recent performance. Some of the performance issues here are what I was talking about when I said that the performance script doesn't line up with what we actually see in Superset. I'm not sure where the problem lies currently, but we should use a combination of both to find the slowest queries we'd like to work on and not rely just on the script.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

No branches or pull requests

2 participants