HiCorr_eHi-C.md

👇 HiCorr on eHi-C

  • Download the code from this repository, "bin/eHiC/"
  • Download the reference files for eHiC (mm10/hg19 genome build)
wget http://hiview.case.edu/ssz20/tmp.HiCorr.ref/eHiC_HiCorr.tar.gz
tar -xvf eHiC_HiCorr.tar.gz
bash eHiC.sh eHiC_HiCorr/ bin/eHiC/ <frag_loop.name.cis> <frag_loop.name.trans> <outputname> <hg19/mm10>
   # specify the paths to the downloaded, unzipped reference directory and to the scripts
   # provide the two fragment loop files generated in the preprocessing step
   # specify the output name prefix
   # specify the genome build; the provided reference includes only hg19 and mm10
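
Because the provided reference covers only hg19 and mm10, it can help to validate the arguments before launching the script. The following is a minimal sketch, not part of HiCorr: `build_ehic_cmd` is a hypothetical helper that only prints the command line in the argument order shown above, so it can be reviewed before it is run.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of HiCorr): validate arguments and print
# the eHiC.sh command line without executing it.
# Usage: build_ehic_cmd REF_DIR SCRIPT_DIR CIS_FILE TRANS_FILE PREFIX GENOME
build_ehic_cmd() {
    if [ "$#" -ne 6 ]; then
        echo "expected 6 arguments, got $#" >&2
        return 1
    fi
    case "$6" in
        hg19|mm10) ;;  # the only genome builds in the provided reference
        *) echo "unsupported genome build: $6" >&2; return 1 ;;
    esac
    printf 'bash eHiC.sh %s %s %s %s %s %s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}
```

For example, `build_ehic_cmd eHiC_HiCorr/ bin/eHiC/ sample.cis sample.trans sample hg19` prints the full command, where the `sample.*` file names are placeholders.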

eHiC

eHiC mode corrects bias of eHi-C data. It takes two fragment-end-pair files as input (use HiCorr's eHiC-QC mode if you need to generate these files) and outputs an anchor_pair file.

  • The two input files: one contains intra-chromosome looping fragment-end pairs (cis pairs), and the other contains inter-chromosome looping fragment-end pairs (trans pairs).
    • Intra-chromosome looping pairs need to have 4 tab-delimited columns, in the following format:
      frag_end_id_1 frag_end_id_2 observed_reads_count distance_between_two_fragments
      See sample file here:
    • Inter-chromosome looping pairs need to have 3 tab-delimited columns, in the following format:
      frag_end_id_1 frag_end_id_2 observed_reads_count
      See sample file here:
    • These two files need to be sorted before you run the pipeline (sort -k1 -k2).
  • The final result of eHiC mode is an anchor-to-anchor looping pairs file, which has 5 columns:
    anchor_id_1 anchor_id_2 observed_reads_count expected_reads_count p_value
    See sample file here: http://hiview.case.edu/test/sample/anchor_2_anchor.loop.IMR90.p_val.sample
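
The input requirements above (4 tab-delimited columns for cis pairs, 3 for trans pairs, both files sorted) can be checked before a run. The following is a minimal sketch, not part of HiCorr: `check_columns` and `sort_pairs` are hypothetical helper names, and the sort uses exactly the `sort -k1 -k2` invocation given above.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of HiCorr) for preparing fragment-end-pair files.

# check_columns FILE N
# Succeeds only if every line of FILE has exactly N tab-delimited columns.
check_columns() {
    awk -F'\t' -v n="$2" 'NF != n { bad = 1; exit } END { exit bad + 0 }' "$1"
}

# sort_pairs INPUT OUTPUT
# Sorts INPUT with the invocation named in the text and writes it to OUTPUT.
sort_pairs() {
    sort -k1 -k2 "$1" > "$2"
}
```

A typical use would be `check_columns my.cis 4 && sort_pairs my.cis my.cis.sorted`, and likewise with `3` for the trans file; `my.cis` is a placeholder name.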

To run the eHiC mode:
./HiCorr eHiC <cis_loop_file> <trans_loop_file> <name_of_your_data> <reference_genome>
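
Once a run has finished, the 5-column anchor pair file can be inspected with standard tools. The sketch below is an illustration only, not a HiCorr feature: `filter_anchor_pairs` is a hypothetical helper, the p-value cutoff is an arbitrary choice left to the user, and the columns are assumed to be whitespace- or tab-separated in the order listed above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of HiCorr).
# filter_anchor_pairs FILE MAX_P
# Prints anchor_id_1, anchor_id_2 and the observed/expected ratio for every
# row whose p_value (column 5) is at most MAX_P. Rows with a non-positive
# expected count are skipped to avoid division by zero.
filter_anchor_pairs() {
    awk -v max_p="$2" '
        $5 + 0 <= max_p + 0 && $4 + 0 > 0 {
            printf "%s\t%s\t%.2f\n", $1, $2, $3 / $4
        }' "$1"
}
```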

eHiC-QC

eHiC-QC mode takes a pair of fastq.gz files as input, aligns and processes the eHiC reads, and outputs fragment-end-pair files for further analysis. It also outputs summary statistics that serve as a quality check for eHiC experiments. Make sure to name your fastq.gz files <name>.R1.fastq.gz and <name>.R2.fastq.gz. You need to have Bowtie (http://bowtie-bio.sourceforge.net/index.shtml) and samtools (http://www.htslib.org/) installed, since HiCorr calls Bowtie to do the alignments. You also need a Bowtie index and a fa.fai file. To run the eHiC-QC mode, pass the mode name followed by three arguments:
./HiCorr eHiC-QC <bowtie_index> <fa.fai> <name>
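
Before starting eHiC-QC, the prerequisites can be verified with a short pre-flight check. This is a sketch under stated assumptions, not part of HiCorr: `check_fastq_pair` and `check_tools` are hypothetical helpers, and the paired reads are assumed to be named `<name>.R1.fastq.gz` and `<name>.R2.fastq.gz`.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight helpers (not part of HiCorr).

# check_fastq_pair NAME
# Succeeds only if NAME.R1.fastq.gz and NAME.R2.fastq.gz both exist
# (assumed naming convention for the paired reads).
check_fastq_pair() {
    local name="$1" f status=0
    for f in "${name}.R1.fastq.gz" "${name}.R2.fastq.gz"; do
        if [ ! -f "$f" ]; then
            echo "missing input: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# check_tools TOOL...
# Succeeds only if every named tool is found on PATH.
check_tools() {
    local tool status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not on PATH: $tool" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_fastq_pair mysample && check_tools bowtie samtools` reports every missing file or tool on stderr and returns a non-zero status if any is absent; `mysample` is a placeholder.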