Day1 Task2

A BASH script to map reads using Bowtie2

Source: https://github.com/GeoGenetics/data-analysis-2024/tree/main/reproducible-data-analysis/day1/task2
Data: https://github.com/GeoGenetics/data-analysis-2024/tree/main/reproducible-data-analysis/data/mapping
Environment: day1

This is a Bash script that takes command-line arguments, runs Bowtie2 to map reads, sorts the resulting BAM file with Samtools, and cleans up intermediate files.

Click to get the code used to generate the image!

Paste this code into the Mermaid Editor

    graph TD
    A[Input files] --> B{Command line arguments}
    B -->|index| C(index)
    B -->|reads| D(reads)
    B -->|threads| E(threads)
    C --> F[Reference index]
    D --> G[Input reads]
    E -->|Default: 2| H[Number of threads]
    F -->|Required| I{Map reads with Bowtie2}
    G --> I
    I --> J{Sort with Samtools}
    J --> K{Clean up intermediate files}
    K --> L[Output BAM file]

Let's go through the script step by step to understand what it does and how it works.

Input validation and argument parsing. This step checks the user input to make sure all required arguments are present, and that the specified files and options are valid. It also sets default values for any optional arguments that were not specified.

#!/bin/bash
set -euo pipefail # Enable strict mode
trap 'echo "Error: ${BASH_SOURCE}:${LINENO}: ${BASH_COMMAND}" >&2' ERR
trap 'echo "Interrupted" >&2 ; exit 1' INT

# Define usage message
usage() {
    echo "Usage: $(basename "$0") -i <index> -r <reads> [-t <threads>]"
}

# Parse command-line arguments
while getopts ":i:r:t:" opt; do
    case ${opt} in
    i) index="${OPTARG}" ;;
    r) reads="${OPTARG}" ;;
    t) threads="${OPTARG}" ;;
    \?)
        echo "Invalid option: -${OPTARG}" >&2
        usage
        exit 1
        ;;
    :)
        echo "Option -${OPTARG} requires an argument" >&2
        usage
        exit 1
        ;;
    esac
done

# Check that required options were provided
if [[ -z "${index:-}" || -z "${reads:-}" ]]; then
    usage
    exit 1
fi

# Set default values
threads="${threads:-2}"

Read mapping with Bowtie2 and Converting SAM to BAM and filtering mapped reads. This step uses Bowtie2, a short-read aligner, to map reads from the input FASTQ file(s) to a reference genome. Then, pipes the results from Bowtie2 into SAMtools to convert the SAM file output from the previous step to a binary BAM format. It then filters out any reads that were not successfully mapped to the reference genome, resulting in a BAM file containing only the mapped reads.
```
# Construct output file paths
file_name=$(basename ${reads})
bam_mapped_file="${file_name%%[ .]*}".mapped.bam
bam_file="${file_name%%[ .]*}".sorted.bam

# Run Bowtie2 to map the reads
bowtie2 --threads "${threads}" -x "${index}" -U "${reads}" \
    | samtools view -@ "${threads}" -bS -F 4 > "${bam_mapped_file}"
```
Sorting the BAM file. This step uses SAMtools to sort the mapped BAM file by genomic coordinates, which is necessary for downstream analysis. Are you comfortable with the concept of sorting a BAM file? Do you have any questions about the sorting process?
```
# Sort the resulting BAM file with Samtools
samtools sort -@ "${threads}" -o "${bam_file}" "${bam_mapped_file}"
```
Cleaning up intermediate files. This step removes any intermediate files generated during the previous steps, leaving only the final sorted BAM file.
```
# Clean up intermediate files
rm  -rf "${bam_mapped_file}"
```

We can call this script as follows:

cd ~/course/wdir/data-analysis-2024/reproducible-data-analysis/day1/task2
./mapping-and-sorting.sh -r ../../data/mapping/sample1.fq.gz -i ../../data/mapping/SRS121011 -t 2

Day 0
Day 1
- Setting up the environment
- Task 1: Writing a simple BASH script
- Task 2: Writing a More Complex BASH script
Day 2
- Task 1: Writing a simple workflow
- Task 2: Writing a More Complex Workflow
- Task3: Mapping reads using Bowtie2
Resources
- Reproducible Research
- Setting up a Project

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Day1 Task2

A BASH script to map reads using Bowtie2

Clone this wiki locally