Datenanalyse_2022_23

This repository contains all the tools and methods developed specifically for the course “Applied data analysis in bioinformatics” from the master's program “Bioinformatik und Systembiologie” at the Justus-Liebig-University and the Technische Hochschule Mittelhessen in the winter term 2022/2023.

The goal of this course is to develop a pipeline for the Max Planck Institute for Heart and Lung Research that takes data from CATLAS and performs several analyses, mainly based on chromatin accessibility¹.

The pipeline is organized into two separate packages (WP1 and WP2), reflecting how the course was divided into groups. A short description of each package is given below:

WP1:
- The first part of the pipeline contains functions for reading .bed files, plotting, and computing quality control parameters such as the mean/median fragment length or an interpretable score. In addition, .gtf files can be used to calculate the fragment distribution around transcription start sites (TSS).
WP2:
- The second part of the pipeline contains functions for calculating, for each cell barcode, the overlap with a given feature, and for visualizing the calculated data with different plots.
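
As a rough illustration of the kind of quality control parameter WP1 computes, the following sketch derives the mean and median fragment length per cell barcode. It is not the WP1 implementation: the function name `fragment_length_stats` and the assumed column layout (chromosome, start, end, barcode, tab-separated) are assumptions made for this example only.

```python
import io
import statistics
from collections import defaultdict


def fragment_length_stats(bed_lines):
    """Return {barcode: (mean_length, median_length)}.

    Assumption: each line holds tab-separated chrom, start, end, barcode.
    The real WP1 functions may expect a different layout.
    """
    lengths = defaultdict(list)
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        chrom, start, end, barcode = line.split("\t")[:4]
        lengths[barcode].append(int(end) - int(start))
    return {
        bc: (statistics.mean(vals), statistics.median(vals))
        for bc, vals in lengths.items()
    }


# Tiny in-memory example with made-up fragments and barcodes.
example = io.StringIO(
    "chr1\t100\t250\tAAAC\n"
    "chr1\t300\t380\tAAAC\n"
    "chr2\t50\t250\tTTTG\n"
)
stats = fragment_length_stats(example)
print(stats)
```

For real data, a file handle opened on a .bed file could be passed in place of the `io.StringIO` object.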

Each package also contains a detailed README explaining its features and how to use them. The slides of the final presentation, held on 01.03.23, are available in presentation.pdf. To make the pipeline as a whole easier to understand, we developed a graphical representation, which can be seen below.

In terms of functionality, this representation can be divided between the two packages as follows. The first package starts on the left with reading the .bed files, followed by two interlocking gears: the quality control functions and parameters, and their visualization. The result of these two steps is an AnnData object, which is then used by the second package. There, the object is enriched with information from further .bed and .gtf files. After that comes another pair of gears, consisting of a feature overlap calculation and a visualization step. The result of our pipeline is a rich AnnData object ready for further analysis. More details are given in the corresponding subfolders.
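
The feature overlap step of the second package can be pictured with the following minimal sketch. It is an illustration under stated assumptions, not the WP2 code: the function name `count_feature_overlaps`, the tuple-based inputs, and the half-open interval convention (as in BED) are all chosen for this example, and the result is a plain dictionary rather than an AnnData object.

```python
from collections import defaultdict


def count_feature_overlaps(fragments, features):
    """Return {barcode: (overlapping_fragments, total_fragments)}.

    fragments: iterable of (chrom, start, end, barcode)
    features:  iterable of (chrom, start, end)
    Intervals are treated as half-open [start, end), as in BED files.
    """
    # Group feature intervals by chromosome for faster lookup.
    by_chrom = defaultdict(list)
    for chrom, start, end in features:
        by_chrom[chrom].append((start, end))

    counts = defaultdict(lambda: [0, 0])  # barcode -> [overlapping, total]
    for chrom, start, end, barcode in fragments:
        counts[barcode][1] += 1
        # Two half-open intervals overlap if each starts before the other ends.
        if any(start < f_end and f_start < end
               for f_start, f_end in by_chrom.get(chrom, ())):
            counts[barcode][0] += 1
    return {bc: tuple(c) for bc, c in counts.items()}


# Made-up fragments and a single made-up feature.
fragments = [
    ("chr1", 100, 250, "AAAC"),
    ("chr1", 300, 380, "AAAC"),
    ("chr2", 50, 250, "TTTG"),
]
features = [("chr1", 200, 300)]
overlaps = count_feature_overlaps(fragments, features)
print(overlaps)
```

The linear scan over features is enough for a small example; for genome-scale data, an interval tree or a sorted sweep would be the more usual choice.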
