Pipeline that prepares genotype files for BLUPF90

alopgar/BLUPPP

GPLv3 License Last version

BLUPPP

This pipeline processes a file of SNP genotypes and prepares it for use with BLUPF90 or ASRGenomics.

Processing steps included:

1) Add new genotypes:

Two SNP files can be combined when new animals need to be included.
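
As a rough illustration of what combining two SNP files can involve, the sketch below appends the animals of a second table to a first one and aligns their columns by SNP name. It is not BLUPPP's own code: the function name `combine_snp_tables`, the in-memory table layout, and the choice to fill SNPs absent from one file with `"NA"` are all assumptions made for this example.

```python
def combine_snp_tables(base, new):
    """Row-bind two SNP tables, aligning columns by SNP name.

    Each table is a tuple (snp_names, rows), where rows is a list of
    (animal_id, calls) and calls follows the order of snp_names.
    This layout is hypothetical; it is not BLUPPP's file format.
    """
    base_snps, base_rows = base
    new_snps, new_rows = new

    # Keep the SNP order of the first table, then append SNPs only in the second.
    seen = set(base_snps)
    all_snps = list(base_snps) + [s for s in new_snps if s not in seen]

    def realign(snps, rows):
        index = {name: pos for pos, name in enumerate(snps)}
        # Assumption: a SNP that a file does not contain is filled with "NA".
        return [
            (animal_id, [calls[index[s]] if s in index else "NA" for s in all_snps])
            for animal_id, calls in rows
        ]

    return all_snps, realign(base_snps, base_rows) + realign(new_snps, new_rows)
```

Animals present in both files are deliberately kept as repeated rows here, so that a later duplicate-handling step can deal with them.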

2) Custom processing:

Removal of columns that are neither SNPs nor animal IDs, as well as other custom operations, can be specified in the parameter file.

3) Combination or removal of duplicated animals:

If two copies of an animal ID are detected, BLUPPP combines their SNP information when the correspondence between them is higher than 95%, or removes both copies when it is lower than 95%. If three or more copies are detected, BLUPPP warns the user to check these IDs.
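
The duplicate rule can be sketched as follows. This is a minimal illustration under stated assumptions, not BLUPPP's implementation: the names `concordance`, `merge_pair` and `resolve_duplicates` are hypothetical, correspondence is computed only over SNPs called in both copies, a merged record takes the first copy's call and falls back to the second where the first is missing, a correspondence of exactly 95% is treated as removal, and IDs with three or more copies are reported and left out of the result.

```python
from collections import defaultdict

MISSING = None      # hypothetical marker for a missing genotype call
THRESHOLD = 0.95    # correspondence threshold stated in the README


def concordance(a, b):
    """Fraction of SNPs called in both copies that carry the same genotype."""
    both = [(x, y) for x, y in zip(a, b) if x is not MISSING and y is not MISSING]
    if not both:
        return 0.0
    return sum(x == y for x, y in both) / len(both)


def merge_pair(a, b):
    """Assumption: keep the first copy's call, use the second where it is missing."""
    return [x if x is not MISSING else y for x, y in zip(a, b)]


def resolve_duplicates(records):
    """records: list of (animal_id, genotypes). Returns (kept, ids_to_check)."""
    groups = defaultdict(list)
    for animal_id, genotypes in records:
        groups[animal_id].append(genotypes)

    kept, ids_to_check = {}, []
    for animal_id, copies in groups.items():
        if len(copies) == 1:
            kept[animal_id] = copies[0]
        elif len(copies) == 2:
            if concordance(*copies) > THRESHOLD:
                kept[animal_id] = merge_pair(*copies)
            # Otherwise both copies are dropped.
        else:
            # Three or more copies: only flag the ID for manual checking.
            ids_to_check.append(animal_id)
    return kept, ids_to_check
```

For example, two copies `[0, 1, 2, None]` and `[0, 1, 2, 1]` of the same animal agree on every SNP called in both, so they would be merged into `[0, 1, 2, 1]`.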

4) PLINK custom filters:

BLUPPP includes an additional script for PLINK processing (plink_filt.sh), which can be modified according to the user's preferences.

5) Conversion of SNP file to BLUPF90/ASRGenomics formats:

This step includes:
a) Removing or keeping the header (i.e., SNP names or codes).
b) Recoding NA values as 5.
c) Collapsing the SNP values into a single string or separating them with spaces.
These options follow the requirements of BLUPF90 or ASRGenomics, respectively.
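
The three operations in this step can be illustrated with a short sketch. The function names (`recode`, `format_row`, `convert`) and the `keep_header` and `collapse` switches are hypothetical and are not BLUPPP parameters; the sketch also makes no attempt at column alignment or any other target-specific detail.

```python
def recode(value):
    """Return a genotype call as text, recoding NA (missing) as 5."""
    return "5" if value in ("NA", "", None) else str(value)


def format_row(animal_id, genotypes, collapse):
    """Write one animal per line: ID, then collapsed or space-separated calls."""
    codes = [recode(g) for g in genotypes]
    separator = "" if collapse else " "
    return f"{animal_id} {separator.join(codes)}"


def convert(header, rows, keep_header, collapse):
    """header: column names; rows: list of (animal_id, genotypes)."""
    lines = []
    if keep_header:
        lines.append(" ".join(header))
    lines.extend(format_row(animal_id, g, collapse) for animal_id, g in rows)
    return lines
```

With `keep_header=False` and `collapse=True`, an animal `A1` with calls `0, NA, 2` becomes the line `A1 052`; with both switches reversed, the output is a header line followed by `A1 0 5 2`.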

0. Software requirements:

For these scripts to work correctly, the following software must be installed:

R dependencies:

  • dplyr: install.packages("dplyr")
  • tidyr: install.packages("tidyr")
  • stringr: install.packages("stringr")
  • data.table: install.packages("data.table")

1. Installation:

a) Download all the ./bin files into your installation directory.
b) Add your installation directory to the PATH variable in your ~/.bashrc file.
c) Change the BINPATH variable inside BLUPPP_exe.sh so that it points to your BLUPPP installation directory.

2. Pipeline execution:

a) Use BLUPPP_exe.sh -h for more information.
b) Download the parameter file template (BLUPPP_parameters.par) and fill in its variables.
c) Run as: BLUPPP_exe.sh -i path/to/BLUPPP_parameters.par [additional options]

In development:

  1. Check R code compatibility with the latest R version (v4.2.2).
  2. Accessibility improvements.
