This repository has been archived by the owner on May 3, 2024. It is now read-only.

sns-io-repair-rebalance.md

File metadata and controls

149 lines (111 loc) · 3.6 KB

SNS (io, repair, rebalance)

Overview

[Figure: SNS overview]

Repair/Rebalance

  • Generic Components

    • Copy machine (CM), copy packet (CP)
    • Pump
    • Proxy
    • Aggregation group (ag) store
    • Sliding window
  • SNS Specific Components

    • Trigger fop/fom
    • SNS repair/rebalance copy machine
    • SNS repair copy packet
    • SNS repair data iterator
    • SNS repair incoming aggregation groups iterator

SNS repair/rebalance copy machine service

  • Repair and Rebalance are implemented as Motr services

    • $MOTR_SRC/sns/cm/repair/service.[ch]
    • $MOTR_SRC/sns/cm/rebalance/service.[ch]
  • Both services run on every ioservice node.

  • On start and stop, the copy machine service initialises and finalises the fop and fom types for:

    • Copy packet fop and fom
    • Sliding window (sw) update fop and fom
    • Trigger fop and fom
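
The start/stop pairing described above can be pictured with a minimal sketch. Everything in it is hypothetical (`demo_fop_type`, `demo_service_start`, `demo_service_stop` are illustration-only names, not Motr structures or functions), and finalising in reverse order is an assumption of the sketch rather than something this outline states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a fop/fom type pair; not a Motr structure. */
struct demo_fop_type {
	const char *name;
	bool        registered;
};

/* The three fop/fom pairs listed above. */
static struct demo_fop_type demo_types[] = {
	{ "copy packet", false },
	{ "sw update",   false },
	{ "trigger",     false },
};

#define DEMO_NR (sizeof demo_types / sizeof demo_types[0])

/* Service start: initialise every fop/fom type pair. */
static void demo_service_start(void)
{
	for (size_t i = 0; i < DEMO_NR; ++i) {
		assert(!demo_types[i].registered);
		demo_types[i].registered = true;
	}
}

/* Service stop: finalise in reverse order (an assumption of this sketch). */
static void demo_service_stop(void)
{
	for (size_t i = DEMO_NR; i-- > 0; ) {
		assert(demo_types[i].registered);
		demo_types[i].registered = false;
	}
}

/* Number of fop/fom type pairs currently registered. */
static size_t demo_registered_nr(void)
{
	size_t n = 0;

	for (size_t i = 0; i < DEMO_NR; ++i)
		n += demo_types[i].registered ? 1 : 0;
	return n;
}
```

Calling `demo_service_start()` followed by `demo_service_stop()` leaves no type registered, which is the symmetry the bullet describes.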

Trigger fop/fom

  • Operations

    • Repair
    • Rebalance
    • Repair quiesce/resume
    • Rebalance quiesce/resume
    • Repair abort
    • Rebalance abort
    • Repair status
    • Rebalance status
  • Source: $MOTR_SRC/sns/cm/trigger_{fop,fom}.[ch]
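
One way to read the operation list is as a pair: which copy machine is targeted (repair or rebalance) and what is asked of it. The sketch below packs that pair into a single opcode and unpacks it again. The enum names, the struct and the numeric encoding are all hypothetical and are not taken from the Motr trigger fop definitions.

```c
#include <assert.h>

/* Hypothetical: which copy machine a trigger targets. */
enum demo_cm_kind { DEMO_REPAIR, DEMO_REBALANCE, DEMO_KIND_NR };

/* Hypothetical: what the trigger asks the copy machine to do. */
enum demo_action {
	DEMO_RUN,
	DEMO_QUIESCE,
	DEMO_RESUME,
	DEMO_ABORT,
	DEMO_STATUS,
	DEMO_ACTION_NR
};

struct demo_trigger {
	enum demo_cm_kind kind;
	enum demo_action  action;
};

/* Pack (kind, action) into one small integer opcode. */
static int demo_trigger_encode(struct demo_trigger t)
{
	assert(t.kind < DEMO_KIND_NR && t.action < DEMO_ACTION_NR);
	return (int)t.kind * DEMO_ACTION_NR + (int)t.action;
}

/* Recover (kind, action) from an opcode produced by the encoder. */
static struct demo_trigger demo_trigger_decode(int opcode)
{
	struct demo_trigger t;

	assert(opcode >= 0 && opcode < DEMO_KIND_NR * DEMO_ACTION_NR);
	t.kind   = (enum demo_cm_kind)(opcode / DEMO_ACTION_NR);
	t.action = (enum demo_action)(opcode % DEMO_ACTION_NR);
	return t;
}
```

With two kinds and five actions the sketch covers ten opcodes, matching the list once quiesce and resume are counted separately.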

Trigger fom

Sources:

  • $MOTR_SRC/cm/repreb/trigger_fom.c : Generic trigger fom implementation for the PREPARE, READY, START and FINI phases.
  • $MOTR_SRC/sns/cm/trigger_fom.c : SNS repair/rebalance trigger fom implementation.

Trigger fop/fom (continued)

  • Triggers SNS repair/rebalance

Phases

  • PREPARE
    • Quiesce/Abort/Status
    • Invokes copy machine prepare
      • Buffer pool provisioning, initialises ag, data iterator
  • READY - Invokes copy machine ready
    • Starts ast thread, updates initial sliding window
  • START - Invokes copy machine start
    • Starts pump fom, data iterator, initialises size data structures
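
The phase sequence above can be sketched as a small state machine in which each tick performs one copy machine step and returns the next phase. This is only an illustration: `demo_phase`, `demo_cm_state` and `demo_trigger_tick` are hypothetical names, the flags stand in for the real prepare/ready/start work, and the quiesce, abort and status handling is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Phases named in this outline; the enum itself is hypothetical. */
enum demo_phase { DEMO_PREPARE, DEMO_READY, DEMO_START, DEMO_FINI };

/* Flags standing in for the work done by each copy machine step. */
struct demo_cm_state {
	bool prepared;
	bool ready;
	bool started;
};

/* Perform the step for the current phase and return the next phase. */
static enum demo_phase demo_trigger_tick(enum demo_phase phase,
					 struct demo_cm_state *cm)
{
	switch (phase) {
	case DEMO_PREPARE:
		cm->prepared = true;   /* copy machine prepare */
		return DEMO_READY;
	case DEMO_READY:
		assert(cm->prepared);
		cm->ready = true;      /* copy machine ready */
		return DEMO_START;
	case DEMO_START:
		assert(cm->ready);
		cm->started = true;    /* copy machine start */
		return DEMO_FINI;
	case DEMO_FINI:
		break;
	}
	return DEMO_FINI;
}

/* Drive the machine from PREPARE to FINI; returns the number of ticks. */
static int demo_trigger_run(struct demo_cm_state *cm)
{
	enum demo_phase phase = DEMO_PREPARE;
	int             ticks = 0;

	while (phase != DEMO_FINI) {
		phase = demo_trigger_tick(phase, cm);
		++ticks;
	}
	return ticks;
}
```

The assertions encode the ordering constraint: READY cannot run before PREPARE has completed, and START cannot run before READY.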

Copy machine

CM start

[Figure: CM start]

Repair/Rebalance copy machine

  • Inherits generic copy machine
    • Repair and Rebalance are each implemented as a separate m0_reqh_service
    • Allocate - allocates and initialises struct m0_cm
    • Start - sets up copy machine, initialises fop/foms
    • Stop - finalises copy machine
  • Setup
    • Initialises data structures
    • Invokes the corresponding copy machine (repair/rebalance) setup, which mainly initialises buffer pools
      • Buffer pools - incoming and outgoing
    • Initialises sns data iterator
  • Prepare (generic)
    • Set up proxies
    • Start sw store fom
    • Set up pump
    • Start ag store fom
  • Prepare
    • RM init
    • Buffer pool provisioning
    • Ag iterator init
  • Ready (generic)
    • Start ast thread
    • Update remote replicas
  • Start
    • Start pump (generic)
    • Start iterator
  • Stop
    • Stop iterator
    • Finalise RM
    • Prune buffer pools
    • Stop ast thread (generic)
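
The split between generic and repair/rebalance-specific steps can be sketched with a table of function pointers: the generic layer does its own work and calls into the hooks of the specific copy machine. The names below (`demo_cm`, `demo_cm_ops`, `demo_sns_ops`) are hypothetical and do not reproduce the Motr structures. The call order simply follows the order of the bullets above, which is an assumption about the real sequencing.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct demo_cm;

/* Hooks supplied by a specific copy machine (repair or rebalance). */
struct demo_cm_ops {
	void (*prepare)(struct demo_cm *cm);
	void (*start)(struct demo_cm *cm);
	void (*stop)(struct demo_cm *cm);
};

struct demo_cm {
	const struct demo_cm_ops *ops;
	char                      trace[256]; /* comma-separated step log */
};

/* Append one step name to the trace. */
static void demo_note(struct demo_cm *cm, const char *step)
{
	size_t used = strlen(cm->trace);

	assert(used + strlen(step) + 2 <= sizeof cm->trace);
	snprintf(cm->trace + used, sizeof cm->trace - used, "%s%s",
		 used > 0 ? "," : "", step);
}

/* Specific (SNS) hooks. */
static void demo_sns_prepare(struct demo_cm *cm)
{
	demo_note(cm, "rm-init");
	demo_note(cm, "bufpool-provision");
	demo_note(cm, "ag-iter-init");
}

static void demo_sns_start(struct demo_cm *cm)
{
	demo_note(cm, "iter-start");
}

static void demo_sns_stop(struct demo_cm *cm)
{
	demo_note(cm, "iter-stop");
	demo_note(cm, "rm-fini");
	demo_note(cm, "bufpool-prune");
}

static const struct demo_cm_ops demo_sns_ops = {
	.prepare = demo_sns_prepare,
	.start   = demo_sns_start,
	.stop    = demo_sns_stop,
};

/* Generic layer: own steps, then (or after) the specific hook. */
static void demo_cm_prepare(struct demo_cm *cm)
{
	demo_note(cm, "proxies-setup");
	demo_note(cm, "sw-store-start");
	demo_note(cm, "pump-setup");
	demo_note(cm, "ag-store-start");
	cm->ops->prepare(cm);
}

static void demo_cm_ready(struct demo_cm *cm)
{
	demo_note(cm, "ast-start");
	demo_note(cm, "replicas-update");
}

static void demo_cm_start(struct demo_cm *cm)
{
	demo_note(cm, "pump-start");
	cm->ops->start(cm);
}

static void demo_cm_stop(struct demo_cm *cm)
{
	cm->ops->stop(cm);
	demo_note(cm, "ast-stop");
}
```

A rebalance variant would provide its own `demo_cm_ops` table while reusing the generic functions unchanged, which is the point of the inheritance described in the first bullet.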

SNS data iterator

[Figure: SNS data iterator]

Copy packet

[Figure: Copy packet]

Copy packet receive

[Figure: Copy packet receive]

Sliding window

[Figure: Sliding window]

Failure Handling

[Figure: Failure handling]

CM stop

[Figure: CM stop]

Additional functionality

  • Abort
  • Quiesce/Resume
  • Concurrent io with repair/rebalance
  • Concurrent delete with repair/rebalance
  • Impose resource restrictions with the help of the sliding window
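
As a rough illustration of how a sliding window can bound resource use, the sketch below admits only a fixed number of consecutive aggregation groups at a time and slides forward once the lowest ones complete. The window size, the `demo_sw` structure and both functions are hypothetical. The outline does not describe the actual Motr sliding window at this level of detail, so treat this as a generic instance of the technique.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical window size: at most this many groups in flight. */
#define DEMO_SW_SIZE 4u

struct demo_sw {
	unsigned lo;                 /* lowest aggregation group in flight */
	bool     done[DEMO_SW_SIZE]; /* completion flags, slot = ag % size */
};

/* A group may be processed only if it falls inside [lo, lo + size). */
static bool demo_sw_admits(const struct demo_sw *sw, unsigned ag)
{
	return ag >= sw->lo && ag - sw->lo < DEMO_SW_SIZE;
}

/* Mark a group complete and slide past every completed leading group. */
static void demo_sw_complete(struct demo_sw *sw, unsigned ag)
{
	assert(demo_sw_admits(sw, ag));
	sw->done[ag % DEMO_SW_SIZE] = true;
	while (sw->done[sw->lo % DEMO_SW_SIZE]) {
		/* Free the slot so it can be reused by group lo + size. */
		sw->done[sw->lo % DEMO_SW_SIZE] = false;
		sw->lo++;
	}
}
```

Completing a group out of order does not move the window; it advances only when the lowest outstanding group finishes, so the number of groups holding buffers never exceeds `DEMO_SW_SIZE`.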
