ammatsun edited this page Mar 25, 2015 · 3 revisions

title: "Cloud Introduction"
author: "Data Carpentry"
date: "Tuesday, March 24, 2015"

What are some reasons to access a remote computer system?

  • Your computer does not have enough resources to run the desired analysis (memory, processors, disk space, network bandwidth).
  • You want to produce results faster than your computer can.
  • You cannot install the software on your computer (for example, the application does not support your operating system, or it conflicts with other installed applications).

Main Computational Resources

Computer Node

  • Find how much disk space is available in your system:
  • Find how many cores and processors are available in your system:
cat /proc/cpuinfo
  • Find how many processes are running in your system, and how many resources each is consuming:
top
  • Find how long it takes to run an application:
time <your_application>
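
The checks above can be collected into one short script. This is a minimal sketch, assuming a Linux system with bash, GNU coreutils, and procps installed; `df -h` is one common way to report disk space, `ps` is used here as a non-interactive stand-in for `top`, and `sleep 1` is only a placeholder for `<your_application>`.

```shell
#!/usr/bin/env bash
# Disk space available on each mounted file system, in human-readable units.
df -h

# Number of processing units available to the current process.
cores=$(nproc)
echo "cores available: $cores"

# Number of logical processors listed in /proc/cpuinfo (Linux only).
logical=$(grep -c '^processor' /proc/cpuinfo)
echo "logical processors: $logical"

# The five processes using the most memory, with their CPU and memory share
# (assumes the procps version of ps, which supports --sort).
ps -eo pid,comm,%cpu,%mem --sort=-%mem | head -n 6

# Real (wall-clock), user, and system time for a command.
# sleep 1 is a placeholder for <your_application>.
time sleep 1
```

The two core counts may differ: `nproc` reports what the current process is allowed to use, which can be lower than the number of processors listed in `/proc/cpuinfo` on shared or containerized systems.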

Distributed System Definitions and Stacks:

(Note that many definitions exist for these terms)

  • Distributed application: an application that can be executed on a distributed system platform (e.g., mpiBLAST)

  • Distributed system platform: software layers that facilitate coordination and management of a distributed system (e.g., a queue-based system or MapReduce)

  • Distributed system:

    • High Performance Computing (HPC): a large assembly of physical machines running a homogeneous operating system (e.g., your institution's HPC, XSEDE's HPC)
    • Cloud Computing: virtual machines, distributed platforms and/or applications offered as a service (e.g., Amazon Web Services, Microsoft Azure, Google Cloud Platform)
  • Virtual machine (VM): a software computer that, like a physical computer, can run an operating system and applications

  • Operating system (OS): the basic software layer that allows execution and management of applications

  • Physical machine: the hardware (processors, memory, disk and network)

HPC vs. Cloud:

HPC                                      | Cloud
User account on the system               | Root account on the system
Limited control of the system            | Full control of the system
Central shared file system               | Local file system
Jobs submitted into a queue              | Jobs executed on each resource
Account-based isolation                  | OS-based isolation
Batch-oriented execution of applications | Support for batch or interactive applications
Request for resource and time allocation | Pay-per-use
etc.                                     | etc.
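
To make the queue and allocation rows concrete, here is a minimal sketch of a batch job script. It assumes the HPC system uses the Slurm scheduler, which the comparison above does not specify; other schedulers use different directives. The job name, resource values, and the file name `job.sh` are hypothetical.

```shell
#!/bin/bash
# Slurm reads the #SBATCH lines as resource requests; to bash they are plain comments.
# Hypothetical job name:
#SBATCH --job-name=example
# Request one task (one core):
#SBATCH --ntasks=1
# Request 1 GB of memory:
#SBATCH --mem=1G
# Request ten minutes of wall-clock time:
#SBATCH --time=00:10:00

# On an HPC system, this file would be placed in the queue with: sbatch job.sh
# On a cloud VM, the same file can be run directly with:         bash job.sh
host=$(uname -n)
echo "running on $host"
```

Because the scheduler directives are ordinary comments to the shell, the same script works in both settings: the HPC scheduler decides when and where it runs, while on a cloud VM it starts immediately on the machine you control.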

Resources: