Feedback from current round of teaching - first 2024 round #145

Closed
10 tasks done
mallewellyn opened this issue Mar 6, 2024 · 8 comments

Comments

@mallewellyn
Contributor

mallewellyn commented Mar 6, 2024

Overall, the instructors enjoyed teaching the course. They also suggested some improvements; the action points below are taken directly from their feedback.

  • Timings need to be adjusted. The feedback was that we may need 5 days to cover the material and that the first three parts are too short, especially given the time needed to complete the challenges.
  • In the last challenge of the factor analysis episode, the plot needs to be made consistent with the solution.
  • We could describe the R packages in greater detail, or rationalise them, to reduce the risk of technical issues.
  • The factor analysis episode could be combined into another episode (potentially doesn't need to be its own episode) since the course is pitched at biologists.

I feel like the topic of Factor Analysis would have been more suited to a sidebar at the end of the PCA episode (similar to the one for tidymodels at the end of the regularisation episode) than having a section to itself. For the other sections I felt that there was enough material to provide a good high-level understanding of how the techniques worked and when and how to apply them. The discussion of Factor Analysis felt superficial and incomplete in comparison, and I felt that including it as its own section risks confusing students.

  • In episode 2, the DNA methylation data discussion could be moved to episode 1.
  • In episode 2, the broom package is possibly unnecessary and we could just use summary().
  • In episode 2, the advanced content to compute the t-statistics by hand would probably be better off in a sidebar.
  • In episode 3, the Introduction and coefficient estimates section could be moved to episode 2.
  • In episode 3, the coefficient estimates section should probably go before the discussion of singularities.
  • In episode 3, more explanation of the heat map in the cross-validation section is required to clarify what it shows.

The instructors noted that the data could be documented and discussed more thoroughly, including the experiments that generated the data and the features being measured. This is addressed by #132.

@mallewellyn
Contributor Author

The FA episode is short to balance cognitive load; further improvements to address point 4 are in #118.

@mallewellyn
Contributor Author

Just re point 5: "In episode 2, the DNA methylation data discussion could be moved to episode 1."

I'm not sure about this one - I think in some ways it's good to introduce the data in the episode where it's actually used for the first time.

@alanocallaghan
Collaborator

> I'm not sure about this one - I think in some ways it's good to introduce the data in the episode where it's actually used for the first time.

Agreed. The data page would also make this redundant.

@alanocallaghan
Collaborator

Re:

> In episode 2, the broom package is possibly unnecessary and we could just use summary().

It's better than summary(mod)$coef, which produces a matrix with weird column names.
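
As a rough illustration of the column-name point, the sketch below fits a model to R's built-in `cars` data (an assumption made for illustration; it is not the course's methylation data) and compares the two outputs. The `broom::tidy()` step only runs if broom happens to be installed.

```r
# Illustrative model on the built-in `cars` data (not the course data).
mod <- lm(dist ~ speed, data = cars)

# Base R: a numeric matrix whose column names contain spaces and symbols,
# so columns have to be selected with quoted strings.
coef_mat <- summary(mod)$coefficients
print(colnames(coef_mat))
p_base <- coef_mat["speed", "Pr(>|t|)"]

# broom: a data frame with syntactic column names
# (term, estimate, std.error, statistic, p.value).
if (requireNamespace("broom", quietly = TRUE)) {
  coef_df <- broom::tidy(mod)
  print(colnames(coef_df))
  p_broom <- coef_df$p.value[coef_df$term == "speed"]
}
```

The data-frame form can be filtered and combined across many models with ordinary column access, which is presumably the convenience being referred to here.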

@mallewellyn
Contributor Author

True - I've fully converted from the broom package to base R in one of my branches, but it's incredibly messy. I think it's better to stick with broom.

@alanocallaghan
Collaborator

> In episode 2, the advanced content to compute the t-statistics by hand would probably be better off in a sidebar.

I'm probably biased here, but the point here is to give specifics on how the shrinkage is happening.

> In episode 3, more explanation of the heat map in the cross-validation section required to clarify what it shows.

This is now resolved, as the heatmap has been removed.

> In episode 3, the Introduction and coefficient estimates section could be moved to episode 2.
> In episode 3, the coefficient estimates section should probably go before discussion of singularities.

Disagree. We assume knowledge of linear models going in, so learners should know this already. We're recapping because the specifics are relevant to the explanations of regularisation and the visualisations in this episode.

@alanocallaghan
Collaborator

> True - I've fully converted from the broom package to base R in one of my branches, but it's incredibly messy. Think it's better to stick with broom

Agree. In general it makes different models more consistent to work with, so I'm keen to encourage it where possible.

@mallewellyn
Contributor Author

> I'm probably biased here, but the point here is to give specifics on how the shrinkage is happening.

Agreed, the formula at least is essential. Arguably the 'by hand' bit (explicitly calculating the estimate/standard error) could be a callout, with table_age_methyl1$statistic printed first, but I think that could be more confusing.
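
For reference, a minimal sketch of what the 'by hand' calculation amounts to, assuming an ordinary linear model: the t-statistic is the coefficient estimate divided by its standard error. The example uses R's built-in `cars` data as a stand-in (an assumption; the episode itself uses the methylation data), and the object names are hypothetical.

```r
# Stand-in model on the built-in `cars` data (not the course data).
mod <- lm(dist ~ speed, data = cars)
coefs <- summary(mod)$coefficients

# t-statistic by hand: estimate divided by standard error.
t_by_hand <- coefs[, "Estimate"] / coefs[, "Std. Error"]

# The value that summary() reports for comparison.
t_reported <- coefs[, "t value"]
print(cbind(t_by_hand, t_reported))

# Two-sided p-value from the t distribution on the residual degrees of freedom.
p_by_hand <- 2 * pt(abs(t_by_hand), df = df.residual(mod), lower.tail = FALSE)
```

Printing the reported statistic next to the hand-computed one, as above, is one way to show learners that the two agree without asking them to follow every step.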
