STARR Seq to MPRAnalyze input

Overview

STARR_Seq_to_MPRAnalyze_input is a pipeline that converts STARR Seq data (.fastq.gz) into input files for MPRAnalyze (.csv). This repository contains the workflow and scripts for processing this data; many of the scripts were adapted and curated from work by Arpit Misha (post-doc in the Hawkins lab). For questions, bugs, or errors, please contact me at [email protected] or [email protected].

Installation

Confirm that conda is installed.
Clone this repository into the location where you want to run the pipeline.
Create the provided conda environments:

git clone https://github.com/hawkins-lab/STARR_Seq_to_MPRAnalyze_input.git \
&& cd STARR_Seq_to_MPRAnalyze_input/envs/ \
&& conda env create -f STARRSeq2MPRAnalyze_env \
&& conda env create -f umitools_env
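
Before running the commands above, it can help to confirm that the required tools are on your PATH. The following is a minimal sketch, not part of the repository: `require_tool` is a hypothetical helper name, and the environment names printed by `conda env list` depend on the `name:` field inside the environment files, which this README does not state.

```shell
# Hypothetical helper (not part of this repository): return non-zero
# with a message on stderr if the named command is not on PATH.
require_tool() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "$1 not found on PATH; please install it first" >&2
        return 1
    fi
}

# Example usage (commented out so the block only defines the helper):
#   require_tool git && require_tool conda
#   conda env list                  # shows the names of the created environments
#   conda activate <name-from-list> # name is an assumption; check the list first
```

Checking up front avoids a partially completed `git clone ... && conda env create ...` chain when conda is missing.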

Running the pipeline

Navigate to the STARR_Seq_to_MPRAnalyze_input directory.
Add your STARR Seq files to the pipeline_input directory. These should be .fastq.gz files.
Submit a job to the compute cluster with the run_pipeline.sh script.
Wait ~12 hours for the data to be processed.
Check the pipeline_output/13_final_mpranalyze_input/ directory for the MPRAnalyze input .csv files.
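
Because a run takes roughly 12 hours, a quick pre-flight check of the input directory can save a wasted submission. This is a sketch under assumptions: `count_fastq_inputs` and `check_inputs` are hypothetical helpers that are not part of the repository, and the submission command depends on your cluster's scheduler, which this README does not name.

```shell
# Hypothetical helpers (not part of this repository).

# Print the number of .fastq.gz files directly inside a directory.
count_fastq_inputs() {
    find "$1" -maxdepth 1 -type f -name '*.fastq.gz' | wc -l | tr -d ' '
}

# Succeed only if the directory exists and holds at least one .fastq.gz file.
check_inputs() {
    dir="${1:-pipeline_input}"
    if [ ! -d "$dir" ]; then
        echo "directory not found: $dir" >&2
        return 1
    fi
    n="$(count_fastq_inputs "$dir")"
    if [ "$n" -eq 0 ]; then
        echo "no .fastq.gz files found in $dir" >&2
        return 1
    fi
    echo "found $n input file(s) in $dir"
}

# Example usage from the repository root (commented out):
#   check_inputs pipeline_input && <your scheduler's submit command> run_pipeline.sh
#   ls pipeline_output/13_final_mpranalyze_input/*.csv   # after the run finishes
```

Files with other extensions, such as .fasta.gz, are deliberately not counted, since the pipeline expects .fastq.gz input.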

Just a few notes for reference:

We run all computationally heavy steps on the University of Washington Genome Sciences computing cluster.
Arpit has noted that barcode lengths may differ between datasets, which affects the first steps of processing the fastq.gz files. If that is the case, you may need to change the trimming lengths (e.g. the 50 and 40 in cutadapt -j 0 -u 50 -u -40 might differ from what is written in the scripts).
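
To make the trimming lengths easier to reason about: in cutadapt, a positive `-u N` removes N bases from the start of each read, a negative `-u -N` removes N bases from the end, and `-j 0` lets cutadapt choose the number of cores. The sketch below is illustrative only. `build_trim_cmd` and `trim_read` are hypothetical helpers, the file names are placeholders, and the actual invocation in this repository's scripts may use additional options.

```shell
# Hypothetical helpers (not part of this repository).

# Print (do not run) a cutadapt command that trims fixed-length ends.
#   $1 = bases to remove from the 5' end
#   $2 = bases to remove from the 3' end
#   $3 = input file, $4 = output file (placeholder names)
build_trim_cmd() {
    printf 'cutadapt -j 0 -u %d -u -%d -o %s %s\n' "$1" "$2" "$4" "$3"
}

# Plain-shell illustration of what the two -u values do to one read:
# drop $2 leading and $3 trailing characters from the sequence in $1.
trim_read() {
    first=$(( $2 + 1 ))
    last=$(( ${#1} - $3 ))
    if [ "$last" -lt "$first" ]; then
        echo ""    # read is too short; nothing is left after trimming
        return 0
    fi
    printf '%s\n' "$1" | cut -c "${first}-${last}"
}

# Example usage:
#   build_trim_cmd 50 40 in.fastq.gz out.fastq.gz
#     prints: cutadapt -j 0 -u 50 -u -40 -o out.fastq.gz in.fastq.gz
#   trim_read AAACCCGGGTT 3 2
#     prints: CCCGGG
```

If your barcodes are longer or shorter, only the two numeric arguments need to change.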
