DisaggSim (Disaggregated architecture Simulator) provides a scalable and configurable simulation for disaggregated architecture based on the gem5 simulator.
Please refer to the original gem5's documentation if you need to install the dependencies.
$ scons build/X86/gem5.opt -j$(nproc)
$ build/X86/gem5.opt configs/disaggregated_system/disaggregated_se.py --cmd="tests/test-progs/hello/bin/x86/linux/hello" --cpu-type=TimingSimpleCPU --caches --l2cache --l3cache
$ scons build/DISAGGREGATED/gem5.opt --default=X86 PROTOCOL=DISAGGREGATED -j$(nproc)
$ build/X86/gem5.opt configs/disaggregated_system/disaggregated_se.py --cmd="tests/test-progs/hello/bin/x86/linux/hello" --cpu-type=TimingSimpleCPU --d_ruby --network=garnet --caches --l2cache --l3cache
GEM=<GEM5 DIRECTORY>/build/DISAGGREGATED/gem5.opt
SE=<GEM5 DIRECTORY>/configs/disaggregated_system/disaggregated_se.py
$GEM $SE\
--cmd=<command-to-run> \
--options=<options-for-cmd> \
--cpu-type=DerivO3CPU \
--caches --l2cache --l3cache \
--network=garnet --d_ruby
CPU Module-related Options
cpu-type
: the CPU type- choose from 'AtomicSimpleCPU', 'NonCachingSimpleCPU', 'X86KvmCPU', 'DerivO3CPU', 'TraceCPU', 'TimingSimpleCPU')
num_cpus
: the number of cores within a module (default: 1)- e.g.
--num-cpus = 1
- also need to append cmd with ';new command' (e.g.
--cmd="/bin/x86/linux/hello;/bin/x86/linux/hello"
)
- e.g.
num_cpu_modules
: the number of CPU Modules (default: 2)- e.g.
--num_cpu_modules=2
- e.g.
allocated_d_mem_size
: allocated remote memory (default: "1024MB")- e.g.
--allocated_d_mem_size = "1024MB"
- e.g.
local_mem_size
: indicate that local memory is used and set how large it is- e.g.
--local_mem_size="1024MB"
- e.g.
local_mem_type
: local memory type (default: "DDR4_2400_16x4")- e.g.
--local_mem_size="DDR4_2400_16x4"
- e.g.
cpu-clock
: CPU core clock(default: '3.4GHz')sys-clock
: module clock (e.g. internal interconnect clock) (default: '3.4GHz')caches
: indicate that (private) l1 cache is used (default: None)l1d_size
: l1d cache size (default:'32kB')l1i_size
: l1i cache size (default:'32kB')l1d_assoc
: l1d set-associativity (default:'8')l1i_assoc
: l1i set-associativity (default:'8')l2cache
: indicate that (private) l2 cache is used (default: None)l2_size
: l2 cache size (default:'256kB')l2_assoc
: l2 set-associativity (default:'8')l3cache
: indicate that (shared) l3 cache is used (default: None)l3_size
: l3 cache size per core (default:'2MB')- NOTE: number of cache set must be a power of 2
l3_assoc
: l3 set-associativity (default:'16')
Memory Module-related Options
d_logic_mem_pool_latency
: the latency applied to the logic attached to memory pool (default='40ns')num_mem_ctrls
: the number of memory controllers per one CPU module (integer, default=2)num_mem_modules
: the number of memory modules (e.g., DRAM) per one memory controller (integer, default=2)size_mem_module
: the size of one memory module (unit: MB) (integer, default=256)d_mem_size
: the size of remote memory per one CPU (string, default=1024MB)- MUST be the same as
options.num_mem_ctrls * options.num_mem_modules * options.size_mem_module
- MUST be the same or greater than
allocated_d_mem_size
- both sanity checks are done in
configs/disaggregated_system/MemModuleConfig.py
- MUST be the same as
- NOTE: currently, components' mapping information is hard-coded in
src/disaggregated_component/disaggregated_system.cc
- Memory Pool Manager's CID is the last CPU's CID + 1 (also fixed in
configs/disaggregated_system/MemModuleConfig.py
) - Address translation tables are generated with the fixed CID information.
- Memory Pool Manager's CID is the last CPU's CID + 1 (also fixed in
mem_interleave
: if specified, memory controllers' are accessed with 4KB-page interleaving. If not, CPU modules' interconnect logics are configured to translate addresses without interleaving. (default=False)interleave_unit
: memory interleaving unit (byte) (integer, default=4096)num_mem_pool_logic_processing_units
: number of processing units in the interconnect logic of a memory pool (integer, default=1)
Interconnect Logic-related Options
d_logic_clock
: interconnect logic clock frequency (default: '2GHz')- e.g.
--d_logic_clock='2GHz'
- e.g.
d_logic_latency
: interconnect logic latency (default: '60ns')- e.g.
--d_logic_latency='60ns'
- pipelining could be done by modifying latencies in (processSend & processReceive) but for now, packets are processed in a one-by-one manner
- e.g.
Interconnect-related Options
d_ruby
: indicate that garnet network is used (default: None); if Garnet is not used, just a default NonCoherentXBar is used- e.g.
--d_ruby
- e.g.
d_ruby_clock
: ruby system's clock (default: '2GHz')- e.g.
--d_ruby = '2GHz'
- e.g.
- Visualize the current configuration with pydot (output pdf file created in m5out directory)
$ pip install pydot
$ sudo apt-get install graphviz
$ <command to run the simulation>
As DisaggSim is based on the gem5 simulator, it is released under a Berkeley license.