Shared tools on Imperial HPC
Many tools that our lab uses are not installed on Imperial HPC. Installing tools yourself on HPC can be tricky because you don't have root permissions. However, there are some ways around this:
- Install local executable files on HPC (covered on this page).
- Create conda environments on HPC (covered here).
- Request that it be installed via ASK. Do note that this can take some time depending on how busy the HPC team is.
The Tools folder (/rds/general/project/neurogenomics-lab/live/Tools) contains executables of software that can be used by the entire neurogenomics-lab group. For each tool, this page includes the following:
- Description: Tool description.
- Download steps: How the tool was downloaded and installed on HPC. The header includes a hyperlink to the official instructions.
- Usage: Example usage. Any export commands can be pasted into your ~/.bashrc so they are automatically available when you next log into HPC. Alternatively, you could add the export commands to specific scripts.
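Pasting the same export line into ~/.bashrc more than once makes PATH grow with duplicates. The sketch below shows one way to add a line only if it is not already present. The helper name add_line_once is hypothetical, and the demo writes to a temporary file rather than your real ~/.bashrc.

```shell
# add_line_once is a hypothetical helper, not part of any HPC tooling.
# It appends a line to a file only when that exact line is not there yet.
add_line_once() {
  local line="$1" file="$2"
  touch "$file"
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demo against a temporary file; point rc_file at "$HOME/.bashrc" for real use.
rc_file="$(mktemp)"
tool_line='export PATH=/rds/general/project/neurogenomics-lab/live/Tools/cellranger-6.0.2:$PATH'
add_line_once "$tool_line" "$rc_file"
add_line_once "$tool_line" "$rc_file"   # second call changes nothing

# Count how many times the line ended up in the file.
line_count="$(grep -cxF -- "$tool_line" "$rc_file")"
echo "$line_count"
```

The single quotes around the export line keep $PATH unexpanded, so it is evaluated at login time rather than when the line is written.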
CellRanger is a toolkit for the pre-processing of (single-cell) RNA-seq data. It contains a number of tools for creating fastq files, gene count matrices, and more.
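- Download (the signed URL below includes Expires=1628114300, i.e. 4 August 2021, so obtain a fresh link from the 10x Genomics download page before running this):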
curl -o cellranger-6.0.2.tar.gz "https://cf.10xgenomics.com/releases/cell-exp/cellranger-6.0.2.tar.gz?Expires=1628114300&Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly9jZi4xMHhnZW5vbWljcy5jb20vcmVsZWFzZXMvY2VsbC1leHAvY2VsbHJhbmdlci02LjAuMi50YXIuZ3oiLCJDb25kaXRpb24iOnsiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOjE2MjgxMTQzMDB9fX1dfQ__&Signature=ksNjxn8WMwd5Q4eirSfxGXxoC6X41-RvV2XliuPR5iu1v9ftDs4f967z9W3krCUdDJxCpwAr5YGw4WOr-XZsHPc5h5eV7X5Zt7aXDEffnOUmIARhYdLn3utC1lm9bHsuzjwJVyxH3TjjsxgPa8PY7E5TXitxVSPZXvUJrJLTFHqi5d1xRKQJaVIKvMnuyN0OAJhZTxqQRwaSvjE-H7-U-y2Q1WReYFjFcYUgAwz-jkGqCbBb~e4D29-sYJyjoKCCbLLiZde3D85v1JVGy4zzsCVqkpiXYsV90uS-IPsoNgI1UaJ4nQ9LluZ4-nY0ihchjIqkRKvkexb9YHXG3kvETg__&Key-Pair-Id=APKAI7S6A5RYOXBWRPDA"
- Decompress (WARNING: takes a long time, ~1 hour):
tar -xzvf cellranger-6.0.2.tar.gz
- Set permissions:
chmod -R u=rwx,go=rx cellranger-6.0.2/
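Because the real archive takes about an hour to decompress, it can help to rehearse the decompress and permission steps on a small stand-in first. The sketch below builds a toy archive (toy-tool-1.0 is a made-up name), lists its contents without extracting, extracts it, and applies the same chmod call as above.

```shell
# Build a toy archive so the steps can be tried without the real download.
work_dir="$(mktemp -d)"
mkdir -p "$work_dir/src/toy-tool-1.0/bin" "$work_dir/install"
printf '#!/bin/sh\necho "toy-tool ok"\n' > "$work_dir/src/toy-tool-1.0/bin/toy-tool"
tar -czf "$work_dir/toy-tool-1.0.tar.gz" -C "$work_dir/src" toy-tool-1.0

# List the contents first (-t): this only reads the archive and writes nothing.
top_level="$(tar -tzf "$work_dir/toy-tool-1.0.tar.gz" | head -n 1)"

# Extract, then give the owner rwx and group/others r-x, recursively.
tar -xzf "$work_dir/toy-tool-1.0.tar.gz" -C "$work_dir/install"
chmod -R u=rwx,go=rx "$work_dir/install/toy-tool-1.0/"

# u=rwx,go=rx is the symbolic spelling of octal mode 755.
perms="$(ls -l "$work_dir/install/toy-tool-1.0/bin/toy-tool" | cut -c1-10)"
echo "$top_level $perms"
```

Listing with -t before extracting is a cheap way to confirm the top-level folder name that the later export PATH line will need to point at.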
export PATH=/rds/general/project/neurogenomics-lab/live/Tools/cellranger-6.0.2:$PATH
cellranger -h
CellRanger ARC is an extension of CellRanger that specialises in the pre-processing of Chromium Single Cell Multiome ATAC + Gene Expression sequencing data. However, CellRanger ARC also includes a number of post-processing (secondary) analysis pipelines that can, for example, generate PCA/t-SNE/UMAP projections and cell clusters as well as feature linkages.
export PATH=/rds/general/project/neurogenomics-lab/live/Tools/cellranger-arc-2.0.0:$PATH
cellranger-arc -h
CellRanger ATAC is an extension of CellRanger that specialises in the pre-processing of Chromium Single Cell ATAC data.
export PATH=/rds/general/project/neurogenomics-lab/live/Tools/cellranger-atac-1.2.0/cellranger-atac-1.2.0:$PATH
cellranger-atac -h
Update JAVA executables. The default JAVA version on HPC is outdated and can conflict with other software such as Nextflow, so you need to use this local version instead.
Download the Java SE Development Kit (JDK) from the Oracle website. Annoyingly, they now require you to log into an Oracle account in order to download their software. This means you can't simply wget/curl this software directly from HPC.
Instead, you must download via a web browser on your local computer, and then copy the file over to HPC by dragging it into your mounted folder (if you have that set up), or:
scp <local_file_path> <hpc_username>@wmcr-nskene.med.ic.ac.uk:/rds/general/project/neurogenomics-lab/live/Tools
Once the compressed file is on HPC, decompress it:
tar -xzvf jdk-11.0.12_linux-x64_bin.tar.gz
When using other software like Nextflow, you must first load the paths to this updated version of JAVA to override the HPC defaults.
export PATH=/rds/general/project/neurogenomics-lab/live/Tools/jdk-11.0.12/bin:$PATH
export JAVA_HOME=/rds/general/project/neurogenomics-lab/live/Tools/jdk-11.0.12
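After the two exports it is worth confirming that the java found on PATH is the one inside JAVA_HOME, since a mismatch means the HPC default is still being picked up. The sketch below is one way to check this. The helper java_matches_home is hypothetical, and the demo uses a stand-in JDK folder containing a stub java so it can run anywhere.

```shell
# java_matches_home is a hypothetical helper: it succeeds when the first
# `java` found on PATH is the one inside $JAVA_HOME/bin.
java_matches_home() {
  [ -n "$JAVA_HOME" ] || return 1
  [ "$(command -v java)" = "$JAVA_HOME/bin/java" ]
}

# Stand-in JDK directory with a stub `java`. On HPC, skip this part and
# rely on the two export lines shown above instead.
fake_jdk="$(mktemp -d)"
mkdir -p "$fake_jdk/bin"
printf '#!/bin/sh\necho stub\n' > "$fake_jdk/bin/java"
chmod u+x "$fake_jdk/bin/java"

# Same pattern as above: prepend bin/ to PATH and set JAVA_HOME to the JDK root.
export PATH="$fake_jdk/bin:$PATH"
export JAVA_HOME="$fake_jdk"

if java_matches_home; then
  java_status="consistent"
else
  java_status="mismatch"
fi
echo "$java_status"
```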
Nextflow is a workflow execution software that allows you to write robust and reproducible pipelines (at least in theory). However, HPC is not set up properly to use Nextflow as-is. Therefore a lot of work is needed to get Nextflow to work on HPC. I've tried to document as many of these steps as possible.
I've downloaded the latest version of Nextflow (v21.04.3.5560 as of Aug 4th, 2021) and provided it here.
Nextflow requires JAVA version 8 or greater. Therefore, you will need to follow the steps in the jdk-11.0.12 section first.
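Java version strings come in two schemes (1.8.0_292 means Java 8, while 11.0.12 means Java 11), which makes the "8 or greater" check easy to get wrong by eye. The sketch below extracts the major version. The helper java_major is hypothetical, and the threshold of 8 is taken from the sentence above, so check the Nextflow documentation for the release you actually use.

```shell
# java_major is a hypothetical helper: it turns a Java version string into
# its major version number, handling the legacy "1.x" scheme.
java_major() {
  local version="$1"
  case "$version" in
    1.*) version="${version#1.}" ;;   # legacy scheme: 1.8 means Java 8
  esac
  # Drop everything from the first ".", "_" or "-" onwards.
  echo "${version%%[._-]*}"
}

legacy_major="$(java_major "1.8.0_292")"
modern_major="$(java_major "11.0.12")"
echo "$legacy_major $modern_major"

# Live check, only attempted when a java executable is on PATH.
# `java -version` writes to stderr; the quoted token on line 1 is the version.
if command -v java >/dev/null 2>&1; then
  installed="$(java -version 2>&1 | head -n 1 | cut -d '"' -f 2)"
  if [ "$(java_major "$installed")" -ge 8 ] 2>/dev/null; then
    echo "Java $installed meets the minimum stated above"
  else
    echo "Java $installed could not be confirmed as version 8 or greater"
  fi
fi
```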
The following command downloads nextflow to your current working directory:
curl -s https://get.nextflow.io | bash
export PATH=/rds/general/project/neurogenomics-lab/live/Tools/nextflow-21.04.3.5560:$PATH
nextflow run -h
Alternatively, you can install Nextflow via a conda environment. This can be useful when trying to make sure all other software is compatible with a particular version of Nextflow.
- Install miniconda/anaconda if you haven't done so already.
- Create a conda env using the following yaml file (stored remotely).
# Only need to create the conda env once.
conda env create -f https://github.com/bschilder/scKirby/raw/main/inst/conda/nfcore.yml
# Extra steps are required only when on HPC
module load anaconda3/personal
bash
# Activate the conda env
conda activate nfcore
# May need to run this extra export step if you have other versions of Java installed on your machine that are overriding your conda-installed version.
# export JAVA_HOME=/opt/anaconda3/envs/nfcore
# You will now be using the version of nextflow that is inside your conda env.
nextflow
MAGMA is a tool for gene analysis and generalized gene-set analysis of GWAS data. It can be used to analyse both raw genotype data as well as summary SNP p-values from a previous GWAS or meta-analysis.
Download the "Linux (Debian, 64 bits)" version of magma from their website, or simply run:
wget https://ctg.cncr.nl/software/MAGMA/prog/magma_v1.09a.zip
unzip magma_v1.09a.zip
The magma executable will now be in your current working directory, and you can move it wherever you like.
The MAGMA documentation website also includes auxiliary files, such as reference genomes, which you can use.
export PATH=/rds/general/project/neurogenomics-lab/live/Tools/magma_v1.08a:$PATH
magma
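Note that the download step above fetches magma_v1.09a while the export line points at a magma_v1.08a folder, so it may be worth checking which versions are actually present before setting PATH. The sketch below picks the highest-versioned matching folder. The helper latest_versioned_dir is hypothetical, the demo runs on a stand-in folder rather than the real Tools directory, and sort -V is assumed to be available (it is a GNU coreutils option).

```shell
# latest_versioned_dir is a hypothetical helper: it prints the
# highest-versioned directory directly inside $1 whose name starts with $2.
latest_versioned_dir() {
  local tools_dir="$1" prefix="$2"
  # sort -V orders embedded version numbers numerically (GNU coreutils).
  find "$tools_dir" -maxdepth 1 -type d -name "${prefix}*" | sort -V | tail -n 1
}

# Demo on a stand-in Tools folder; on HPC pass the real Tools path instead.
demo_tools="$(mktemp -d)"
mkdir "$demo_tools/magma_v1.08a" "$demo_tools/magma_v1.09a"

latest="$(latest_versioned_dir "$demo_tools" "magma_v")"
latest_name="$(basename "$latest")"
echo "$latest_name"

# The result can then be used in place of a hard-coded folder name:
# export PATH="$latest:$PATH"
```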
?
Ask Nathan for documentation.