Master Topic List #1

Open
7 tasks
kescobo opened this issue Aug 21, 2019 · 6 comments

Comments

@kescobo
Member

kescobo commented Aug 21, 2019

I thought it would be good to collect topic/lesson ideas in one place. Comment here and I'll add stuff to this first comment. If the list gets long enough I'll do some organization. Some ideas off the top of my head:

  • search for, download, and examine gene sequences from NCBI
  • search for, download, and examine articles with certain keywords from PubMed
  • Get all subsequences of length N from a given protein/DNA/RNA sequence
  • Parse a FASTA file, get all unique sequences (accounting for reverse complements), and write them to a new FASTA file
  • Basically anything from rosalind.info
  • ML/DL/RL for genomics
  • Clean / align sequencing results
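
As a rough illustration of two of the items above (all subsequences of length N, and de-duplicating FASTA records while treating a sequence and its reverse complement as the same), here is a minimal sketch. It is written in plain Python purely for illustration and is not a proposed recipe; the function names (`kmers`, `canonical`, `unique_records`, `format_fasta`) are made up for this example, and it assumes DNA sequences over A/C/G/T only.

```python
from typing import Iterable, Iterator, List, Tuple

# Assumption: plain DNA alphabet; ambiguity codes (N, R, Y, ...) are left unchanged.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def kmers(seq: str, n: int) -> List[str]:
    """Return every contiguous subsequence of length n, in order."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the result."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(seq: str) -> str:
    """Pick one representative of a sequence and its reverse complement."""
    s = seq.upper()
    return min(s, reverse_complement(s))


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def unique_records(records: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep the first record for each sequence, ignoring strand."""
    seen, out = set(), []
    for header, seq in records:
        key = canonical(seq)
        if key not in seen:
            seen.add(key)
            out.append((header, seq))
    return out


def format_fasta(records: Iterable[Tuple[str, str]], width: int = 60) -> str:
    """Render records as FASTA text, wrapping sequences at `width` columns."""
    parts = []
    for header, seq in records:
        parts.append(">" + header)
        parts.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(parts) + "\n" if parts else ""
```

To go file to file, one could pass an open file handle to `parse_fasta` and write the string returned by `format_fasta(unique_records(...))` to a new file.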

Please feel free to post other ideas and I'll add them to the list.

@TransGirlCodes
Member

It's also worth considering: is this going to be a cookbook-style collection of example solutions to a list of topics, or is there some broader pedagogy or set of themes the book could follow? For example, a data gathering / curation segment could cover the NCBI and search/download topics, before moving on to processing and so on.

@kescobo
Member Author

kescobo commented Aug 21, 2019

Worth discussing in a separate issue perhaps, but my 2c is that we start collecting recipes and organize them into themes once we have enough that that makes sense. These could be the basis for something more directed later, but developing a sensible pedagogical ordering is non-trivial, and then you need to worry about level (tutorials designed for novice Julia programmers will be different from those developed for professional bioinformaticians), pacing, etc.

@alejandromerchan

The low-hanging fruit is always a cookbook with simple recipes. That gives users a simple introduction and lets them get familiar with the language, the type system, etc. I have the Julia 1.0 Programming Cookbook from Packt and it's a very nice, easy introduction. The idea would be to show how to carry a project from start to finish: downloading the files (either ones online or ones you created on your own), cleaning, aligning, all the way to gene or protein predictions. I have no idea if the ecosystem allows all this already, but that's what I can see as a really good introductory tool, using what you guys have already developed.

@XVilka

XVilka commented Aug 22, 2019

Probably a small chapter/overview of the different public databases and how to interact with them from the Julia world. Maybe also support for sequencing data formats.

Certainly applying ML/DL/RL to genomics; it's worth at least its own chapter.

@aguang

aguang commented Aug 23, 2019

I've started doing some basic bioinformatics applications in Julia and am planning on keeping track of them in a cookbook repo for my group. If any of the recipes would be helpful, feel free to use them; I'd also be interested in contributing. Topics I would find relevant are ways of parsing and merging FASTA files based on header information, subsequence searches, and obtaining and visualizing basic statistics (e.g. sequence quality distribution plots, proportion of mapped reads in BAMs, etc.). It would also be nice to see some stuff on manipulating phylogenetic trees, but it doesn't look like a phylogenetics package is being actively developed.
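
To make the "merging based on header information" and "basic statistics" ideas a bit more concrete, here is a small sketch in plain Python, for illustration only. The header layout (`sample|read`, split on `|`) is a hypothetical convention invented for this example, as are the function names. The quality decoding assumes the common Phred+33 FASTQ encoding.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str]  # (header, sequence)


def group_by_header_field(
    records: Iterable[Record], sep: str = "|", field: int = 0
) -> Dict[str, List[Record]]:
    """Group records by one separator-delimited field of the header.

    Assumption: headers look like "sample|read"; adjust sep/field as needed.
    """
    groups: Dict[str, List[Record]] = defaultdict(list)
    for header, seq in records:
        key = header.split(sep)[field].strip()
        groups[key].append((header, seq))
    return dict(groups)


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    s = seq.upper()
    if not s:
        return 0.0
    return (s.count("G") + s.count("C")) / len(s)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string into per-base Phred scores."""
    return [ord(c) - offset for c in quality]
```

The per-base scores from `phred_scores` could then be fed to any plotting library to draw a quality distribution.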

@miguelraz

Posting to stay in the loop. Awesome initiative!

6 participants