The workflow needs the following inputs:
- The DEG file:
  - A tabular file with the gene symbol in the first column and, in the second column, a boolean value indicating whether the gene is differentially expressed.
- The gene length file:
  - A tabular file with the gene symbol in the first column and the gene length in the second column.
  - You can create this file with the Gene length and GC content tool, which needs a GTF file as input.
  - If you are using featureCounts, you can set `Create gene-length file` to yes and get the gene lengths as a separate output.
- The KEGG file:
  - A tabular file with the Pathway ID in the first column and the Pathway name in the second column, like:

    | ID    | Name                      |
    |-------|---------------------------|
    | 01100 | Metabolic pathways - mmus |
    | 01200 | Carbon metabolism - mmus  |

  - You can get this information from the KEGG database.
- Genome: Select one of the available genomes
- Gene ID format: Select the format of your input genes (Ensembl, Entrez, or Symbol)
- The workflow performs a simple enrichment analysis that takes gene length into account.
- The output will be 3 files (`GO table`, `Top ontology plot`, and `DE genes in each category`) for the Cellular Component, Biological Process, and Molecular Function ontologies, plus `KEGG table` and `DE genes in each KEGG Pathways`.
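
As a rough illustration of the two-column DEG and gene length inputs described above, here is a minimal Python sketch. It is not part of the workflow. The function names, the adjusted p-value cutoff of 0.05, the `True`/`False` spelling of the boolean column, the absence of a header line, and the gene lengths are all assumptions made for the example, so check them against what your tools actually expect.

```python
import csv
import io


def build_deg_rows(de_results, padj_cutoff=0.05):
    """Turn (gene, adjusted p-value) pairs into (gene, 'True'/'False') rows.

    The cutoff and the 'True'/'False' spelling are assumptions of this sketch.
    """
    rows = []
    for gene, padj in de_results:
        # A missing adjusted p-value is treated as "not DE" (assumption).
        is_de = padj is not None and padj < padj_cutoff
        rows.append((gene, str(is_de)))
    return rows


def missing_lengths(deg_rows, gene_lengths):
    """Return genes from the DEG rows that have no entry in gene_lengths."""
    return [gene for gene, _ in deg_rows if gene not in gene_lengths]


def to_tabular(rows):
    """Serialize rows as tab-separated text with no header line (assumption)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up values, for illustration only.
de_results = [("Actb", 0.20), ("Gapdh", 0.001), ("Trp53", None)]
gene_lengths = {"Actb": 1889, "Gapdh": 1264}

deg_rows = build_deg_rows(de_results)
deg_text = to_tabular(deg_rows)                          # DEG file content
length_text = to_tabular(sorted(gene_lengths.items()))   # gene length file content
unmatched = missing_lengths(deg_rows, gene_lengths)

print(deg_text, end="")
print("Genes without a length:", unmatched)
```

Checking for genes without a length before uploading is a cheap way to spot identifier mismatches (for example, symbols in one file and Ensembl IDs in the other) between the DEG file and the gene length file.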
@nilchia wrote the workflow and the tests.