Teaching Intro to Deep Learning #303

Closed
NidhiGowdra opened this issue Jan 10, 2023 · 9 comments
Labels: Carpentries Lab (Needs to be fixed for Carpentries Lab)

Comments

@NidhiGowdra

Hello All,
Just wanted to make contact with the organizers/creators of the lesson as I would be teaching the lesson in March 2023 (We are still in the planning phase, timelines might vary) along with Mike Laverick and others.

@svenvanderburg added the "Carpentries Lab" label (Needs to be fixed for Carpentries Lab) on Jan 11, 2023
@svenvanderburg
Collaborator

svenvanderburg commented Jan 11, 2023

That is great to hear @NidhiGowdra! We are really curious how the lesson material works out for you. Any feedback you have is very useful to us, even just a few lines of comment in this issue. See also #178 (comment).

Let us know if you need any help in preparing for the course. We use these texts to advertise and communicate about the workshop, maybe they are useful for you as well?

@mike-ivs
Contributor

Thanks for the mention @NidhiGowdra, looking forward to getting stuck into this carpentries incubator!

@svenvanderburg for a bit more context, we're looking at presenting a Carpentries "intro to Python/ML/DL" workshop in late March 2023. Obviously the Python lessons are well established, but we're pretty excited to delve into the new ML/DL lessons in the Carpentries Incubator and help trial them out.

For our ML section we are considering using the Intro to ML - SKLearn lesson, which I can see you are somewhat familiar with [here and here] ;) . We will likely try to develop it further to fit into a broader "intro to Python/ML/DL" workshop context, so I wanted to reach out given your presence in the area and, of course, your comment above! (We're only just jumping into the Carpentries ML scene, but we're keen to help build out and trial all the resources.)

@svenvanderburg
Collaborator

svenvanderburg commented Jan 19, 2023

@mike-ivs that is great to hear! Good that you are teaching with this material, and nice that you're keen on jumping into the Carpentries ML scene, welcome 👋 If you are interested in developing the lesson further, we are organizing a lesson development sprint day on the 8th of March. (No is an answer too, of course 😋 )

I think there is a lot to choose from regarding ML lessons (as you already read in the issues you reference). Please note that we will trial scikit-learn's material in 2 weeks. I just added this: esciencecenter-digital-skills/lesson-machine-learning-intro@a7ca3dd to the readme of that repo, as we are currently neither using it nor developing it further. You can see our plans for teaching with the scikit-learn material here: https://esciencecenter-digital-skills.github.io/2023-01-30-ds-sklearn/ (check out the syllabus/schedule, for example). But I am also curious to see how the Carpentries machine learning novice lesson works for you.

@svenvanderburg
Collaborator

@mike-ivs and @NidhiGowdra how did it go? Do you have any feedback for the lesson?

@NidhiGowdra
Author

@svenvanderburg Apologies for my delayed response.

The course went well and we received valuable feedback from the cohort.

Some of the main points are:

  1. How to implement DL models for specific use cases (Admittedly, this is very hard to achieve in an intro lesson).
  2. I think reducing the number of datasets used in the lesson would have made it easier to explain the differences between the models/techniques applied. For example, the penguins dataset and the CIFAR-10 dataset are both used for classification, so those two parts could have been merged. Applying an MLP and a CNN to the same CIFAR-10 dataset would have helped to show and explain how information flows within the model and how convolutions improve performance for image classification.
  3. I think it would have been better to organize the episodes based on task i.e. MLP classification -> CNN classification -> regression.
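
One way to make the MLP-versus-CNN point in item 2 concrete is to count the parameters in the first layer of each model for a single CIFAR-10 image (32 x 32 pixels, 3 colour channels). This is only a rough sketch: the layer sizes (100 dense units, 50 filters of 3 x 3) are illustrative assumptions and are not taken from the lesson, and parameter count is just one of several reasons convolutions tend to suit images.

```python
# Rough first-layer parameter counts for one CIFAR-10 image.
# Layer sizes below are illustrative assumptions, not the lesson's architecture.


def dense_params(n_inputs: int, n_units: int) -> int:
    """One weight per (input, unit) pair plus one bias per unit."""
    return n_inputs * n_units + n_units


def conv2d_params(kernel_h: int, kernel_w: int, in_channels: int, n_filters: int) -> int:
    """Each filter has kernel_h * kernel_w * in_channels weights plus one bias."""
    return (kernel_h * kernel_w * in_channels + 1) * n_filters


height, width, channels = 32, 32, 3  # CIFAR-10 image shape

# MLP: the image is flattened to 3072 values and fed to a dense layer.
mlp_first_layer = dense_params(height * width * channels, 100)

# CNN: the same small filters are reused at every position in the image.
cnn_first_layer = conv2d_params(3, 3, channels, 50)

print(f"Dense layer (100 units):    {mlp_first_layer} parameters")
print(f"Conv layer (50 3x3 filters): {cnn_first_layer} parameters")
```

The dense layer needs a separate weight for every pixel value, whereas the convolutional layer shares its filter weights across the whole image, which is part of what a side-by-side comparison on CIFAR-10 would show.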

Technical issues:

  1. We tried to enforce a local install of the packages via Anaconda/pip, but there were issues with the permissions required on university devices.
  2. We ended up using Google Colab for the lessons.
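
For setup problems like these, a small pre-workshop check can tell learners whether their local install worked or whether they should fall back to a hosted notebook. The sketch below uses only the Python standard library; the package list is a hypothetical example and should be replaced with whatever the lesson's setup instructions actually require.

```python
import importlib.util

# Hypothetical list of import names; adjust to the lesson's real requirements.
REQUIRED_PACKAGES = ["tensorflow", "sklearn", "pandas", "seaborn"]


def missing_packages(names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED_PACKAGES)
if missing:
    print("Missing packages:", ", ".join(missing))
    print("Consider a hosted notebook if you cannot install them locally.")
else:
    print("All required packages were found.")
```

Learners can run this before the workshop and report the output, so installation problems surface early rather than on the day.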

Overall, I think it went well and I am keen to redo the lesson in H2-2023.

@svenvanderburg
Collaborator

Great, thank you for your feedback @NidhiGowdra. And good to hear that running the workshop went well. I will keep this issue open so we can think about how to incorporate your feedback!

@svenvanderburg
Collaborator

svenvanderburg commented Aug 7, 2023

@NidhiGowdra sorry it took so long to get back to your feedback!

  1. For specific use cases/real-world examples I opened "Include real-world example" (#362). We usually demo some projects to meet this need and to relate the whole lesson to real-world scientific research.
  2. From the beginning we decided to use different machine learning problems, because that effectively shows how to approach a deep learning problem 3 times. In the end we want students to apply deep learning to their own problem, and that is always a different dataset. We did notice that it was a bit time-consuming to introduce a dataset every time, so we greatly reduced the time we spend on data exploration in "Shorten data exploration in episode 2, 3 and 4" (#358).
    I like your idea of doing an MLP on CIFAR-10 first and then a CNN to demonstrate the power of CNNs. I created "Compare CNN with MLP in terms of accuracy" (#363) for it.
  3. I tend to disagree. I like the penguins dataset for introducing deep learning while keeping things simple. The weather dataset in episode 3 then allows us to demonstrate model evaluation, monitoring and hyperparameter tuning; in other words, it takes what we learned in episode 2 to the next step. The fact that it is a regression task is actually only a minor learning objective, since in practice the approach does not differ that much. CNNs are definitely the hardest topic in this lesson, so I would still insist on putting them last.

Regarding setup issues: indeed, if students don't have the right permissions to install software, Google Colab is a good alternative.

Let me know what you think!

By the way, did you teach the lesson another time? If not, why not? (out of curiosity).

@NidhiGowdra
Author

@svenvanderburg No worries and thanks for the reply.

  1. I haven't worked on genomic data, but you're right, we can show more examples of the DL projects we have worked on before the start of the lesson. We showcased a few projects, but I guess the selection was not broad enough. We will add more examples to the updated slides and highlight the real-world practical applications of DL for research/commercial projects.

  2. Agreed, and perfect. Yes, the penguins dataset cements the abstract concept of "features". It is harder to explain the same concept with image datasets.

  3. Fair enough. CNNs do need more time and require concepts from earlier episodes, so we will keep the flow the same.

We are teaching the workshop again next month. We are finalizing the lesson plan and slide deck. We are also looking into maybe adding LLMs (given their ever-growing popularity).

I will post back here after the workshop, once I have collated feedback from the students.
Thanks

@svenvanderburg
Collaborator

@NidhiGowdra thanks for your reply! Feel free to open issues if you want to contribute any of your own material or would like to see improvements to the lesson.
Indeed, LLMs are hard to ignore. I am curious to hear how teaching about them goes!

I am closing this issue now, you can open a new issue for feedback from the next edition.
