---
title: "Analysis of ChIP-seq Data"
author: "Hugo Tavares"
date: today
number-sections: false
---

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is a method used to map, across the genome, the binding sites of transcription factors and other DNA-associated proteins, as well as the locations of histone modifications.
These materials cover the fundamentals of ChIP-seq data analysis, from raw data processing to downstream applications.
We will start with an introduction to ChIP-seq methods, including important considerations when designing your experiments.
We will cover the bioinformatic steps in a standard ChIP-seq analysis workflow: raw data quality control, trimming/filtering, mapping, duplicate removal, post-mapping quality control, peak calling and peak annotation.
We will discuss metrics used for quality assessment of the called peaks when multiple replicates are available, as well as the analysis of differential binding across sample groups.
Finally, we will also cover tools and packages that can be used for visualising and exploring your results.
::: {.callout-tip}
- Describe how ChIP-seq data is generated and what information it provides about the (epi)genome
- Recall the experimental design considerations that are needed when performing ChIP-seq experiments
- Understand the bioinformatic steps involved in processing ChIP-seq data
- Interpret and assess the quality of your data and results
- Perform differential binding analysis to compare different groups of samples
:::
This course is aimed at researchers with no prior experience in the analysis of ChIP-seq data, who would like to get started in processing their data using a standardised pipeline and perform downstream analysis and visualisation of their results.
- Basic understanding of high-throughput sequencing technologies.
  - Watch this iBiology video for an excellent overview.
- A working knowledge of the UNIX command line (course registration page).
  - If you are not able to attend this prerequisite course, please work through our Unix command line materials ahead of the course (up to section 7).
- A working knowledge of R (course registration page).
  - If you are not able to attend this prerequisite course, please work through our R materials ahead of the course.
About the authors (alphabetical by surname):

- Sandra Cortijo
  Affiliation: Centre National de la Recherche Scientifique: Montpellier
  Roles: writing; conceptualisation; coding
- Sergio Martinez Cuesta
  Affiliation: AstraZeneca, Cambridge
  Roles: writing; conceptualisation; coding
- Sankari Nagarajan
  Affiliation: University of Manchester
  Roles: writing; conceptualisation
- Ashley Sawle
  Affiliation: Cancer Research UK, Cambridge Institute
  Roles: writing; conceptualisation; coding
- Denis Seyres
  Affiliation: Universitätsspital Basel: Basel
  Roles: writing; conceptualisation; coding
- Hugo Tavares
  Affiliation: Bioinformatics Training Facility, University of Cambridge
  Roles: writing; conceptualisation; coding

Please cite these materials if:
- You adapted or used any of them in your own teaching.
- These materials were useful for your research work. For example, you can cite us in the methods section of your paper: "We carried out our analyses based on the recommendations in Cortijo S et al. (2023)."
You can cite these materials as:
Cortijo S, Martinez Cuesta S, Nagarajan S, Sawle A, Seyres D, Tavares H (2023) "cambiotraining/chipseq: Analysis of ChIP-seq Data", https://cambiotraining.github.io/chipseq/
Or in BibTeX format:
@Misc{,
author = {Cortijo, Sandra AND Martinez Cuesta, Sergio AND Nagarajan, Sankari AND Sawle, Ashley AND Seyres, Denis AND Tavares, Hugo},
title = {cambiotraining/chipseq: Analysis of ChIP-seq Data},
month = {July},
year = {2023},
url = {https://cambiotraining.github.io/chipseq/}
}
There are many online resources that inspired our own materials (e.g. package vignettes) and we cite them where relevant.
We also recommend the following training materials:
- Understanding chromatin biology using high throughput sequencing from the Harvard Chan Bioinformatics Core
- Introduction to ChIPseq using HPC from the Harvard Chan Bioinformatics Core
- ChIP-seq analysis from the Babraham Institute