Basic Fusion Metrics

Landon Clipp edited this page Oct 11, 2017 · 25 revisions
  • Generation times:
    • ~30 minutes for no compression
    • ~45 minutes with HDF compression and chunking
  • Approximate dataset sizes:
    • 1 granule uncompressed: 20 GB to 120 GB
    • 1 granule compressed: 20 GB to 75 GB
    • 1 year uncompressed BF output (2013): 250 TB
    • 1 year compressed BF output (2013, compression level 1): 180 TB
    • 1 year BF input (2013): 54 TB
  • Tarring time for 1 year BF input: ~7 days
  • Untarring time for 1 year BF input: ~7 days
  • Ratio of output file size to input for one orbit: for orbit 69400, the input files total 13.3 GB and the output is 41.5 GB, a ratio of 41.5 / 13.3 = 3.12. Note that this ratio varies from orbit to orbit due to subsetting and/or possibly corrupt files.
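
The size ratios quoted above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the helper names `size_ratio` and `percent_saved` are made up for illustration and are not part of the Basic Fusion code.

```python
def size_ratio(output_size, input_size):
    """Return output size divided by input size (same units for both)."""
    return output_size / input_size


def percent_saved(uncompressed, compressed):
    """Return the percentage of space saved by compression."""
    return 100.0 * (1.0 - compressed / uncompressed)


# Orbit 69400: 13.3 GB of input produced 41.5 GB of output.
print(f"orbit 69400 output/input: {size_ratio(41.5, 13.3):.2f}")

# Full year 2013: 54 TB of input, 250 TB of uncompressed output.
print(f"2013 output/input: {size_ratio(250, 54):.2f}")

# Full year 2013: 250 TB uncompressed vs. 180 TB at compression level 1.
print(f"2013 space saved by compression: {percent_saved(250, 180):.0f}%")
```

The yearly figures give an output/input ratio of about 4.63, compared with 3.12 for orbit 69400, which is consistent with the ratio varying from orbit to orbit.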

  • Python tarring of input files by orbit:
    • DB query using Python multithreading
    • Tar file generation using Python multiprocessing
    • Run on a login node, 1 instance of the program
    • 2 tar files: 15 GB + 13 GB = 28 GB / 1m15s = 22.4 GB/min
    • 6 tar files: 15 GB + 13 GB + 12 GB + 14 GB + 16 GB + 14 GB = 84 GB / 2m27s = 34.3 GB/min
    • 10 tar files: 135 GB / 3m35s = 37.7 GB/min
    • 15 tar files: 204 GB / 5m6s = 40.0 GB/min
    • 20 tar files: 417 GB / 11.4 min = 36.6 GB/min
    • 30 tar files with multiprocessing for SQL queries: 417 GB / 10 min = 41.7 GB/min
    • Aug 25: I rewrote the script to use MPI. Using 5 nodes with 20 processors per node on ROGER, I managed 78.7 GB/min (1.4 TB / 17m50s). It can probably go higher with more nodes.
    • Aug 29: Using 6 nodes and storing data on the large block file system, tarring of input files on ROGER reached 136 GB/min.
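
As a rough illustration of the multiprocessing approach and of how the GB/min figures above are derived, here is a minimal sketch. It is not the actual tarring script: the names `make_tar`, `tar_orbits`, and `gb_per_min`, the `(tar_path, input_paths)` job layout, and the stand-in file names are all assumptions, and the database query and MPI steps are left out.

```python
import os
import tarfile
import tempfile
import time
from multiprocessing import Pool


def make_tar(job):
    """Write one uncompressed tar file; return the bytes of input archived."""
    tar_path, input_paths = job
    total_bytes = 0
    with tarfile.open(tar_path, "w") as tar:  # mode "w" means no compression
        for path in input_paths:
            tar.add(path, arcname=os.path.basename(path))
            total_bytes += os.path.getsize(path)
    return total_bytes


def tar_orbits(jobs, workers=4):
    """Tar every job in parallel; return (total_bytes, elapsed_seconds)."""
    start = time.perf_counter()
    with Pool(processes=workers) as pool:
        sizes = pool.map(make_tar, jobs)
    return sum(sizes), time.perf_counter() - start


def gb_per_min(total_bytes, seconds):
    """Convert bytes and elapsed seconds to GB/min (1 GB = 1e9 bytes)."""
    return (total_bytes / 1e9) / (seconds / 60.0)


if __name__ == "__main__":
    # Demo with tiny stand-in files; real jobs would come from the DB query.
    with tempfile.TemporaryDirectory() as tmp:
        jobs = []
        for orbit in (69400, 69401):  # hypothetical orbit list
            src = os.path.join(tmp, f"input_{orbit}.dat")
            with open(src, "wb") as f:
                f.write(os.urandom(1024))
            jobs.append((os.path.join(tmp, f"{orbit}.tar"), [src]))
        total, elapsed = tar_orbits(jobs, workers=2)
        print(f"{total} bytes in {elapsed:.3f} s "
              f"= {gb_per_min(total, elapsed):.6f} GB/min")
```

For example, `gb_per_min(28e9, 75)` reproduces the 22.4 GB/min figure for the 2-file run (28 GB in 1m15s).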

Nearline Transfer Speed

  • Transferring 100 tar files (about 15 GB each) from scratch to nearline: 5.86 GB/s
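
The measured rate can be turned into a transfer-time estimate. The sketch below assumes the rate stays constant for the whole transfer, which may not hold in practice; `transfer_seconds` is a made-up helper name.

```python
def transfer_seconds(total_gb, rate_gb_per_s):
    """Estimate transfer time in seconds at a constant rate."""
    return total_gb / rate_gb_per_s


# 100 tar files of about 15 GB each at the measured 5.86 GB/s.
seconds = transfer_seconds(100 * 15, 5.86)
print(f"{seconds:.0f} s (~{seconds / 60:.1f} min)")
```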

Input File Sizes

[Histograms of input file sizes, one per dataset: mod, mop, ast, cer, mis_agp, mis_gmp, mis_grp, mis_hrll]

Note that for MODIS (mod), the histogram does not differentiate between the 1KM, QKM, HKM and MOD03 files.