H100 config #344

Draft
wants to merge 1 commit into
base: dev

23 changes: 23 additions & 0 deletions gpu-simulator/configs/tested-cfgs/SM90_H100/trace.config
@@ -0,0 +1,23 @@
-trace_opcode_latency_initiation_int 2,2
-trace_opcode_latency_initiation_sp 2,1
-trace_opcode_latency_initiation_dp 64,64
-trace_opcode_latency_initiation_sfu 21,8
-trace_opcode_latency_initiation_tensor 32,32

#execute branch instructions on specialized unit 1
#<enabled>,<num_units>,<max_latency>,<ID_OC_SPEC>,<OC_EX_SPEC>,<NAME>
-specialized_unit_1 1,4,4,4,4,BRA
-trace_opcode_latency_initiation_spec_op_1 4,4

#TEX unit: use a fixed latency for all TEX instructions
-specialized_unit_2 1,4,200,4,4,TEX
-trace_opcode_latency_initiation_spec_op_2 200,4

#tensor unit
-specialized_unit_3 1,4,32,4,4,TENSOR
-trace_opcode_latency_initiation_spec_op_3 32,32

#UDP unit, for Turing and above
#for more info about UDP, see https://www.hotchips.org/hc31/HC31_2.12_NVIDIA_final.pdf
-specialized_unit_4 1,4,4,4,4,UDP
-trace_opcode_latency_initiation_spec_op_4 4,1
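
A quick way to sanity-check a config like this before running the simulator is to parse it and cross-check each `-specialized_unit_N` entry against its `-trace_opcode_latency_initiation_spec_op_N` line. The sketch below is not part of this PR or of the simulator. It assumes each `trace_opcode_latency_initiation_*` value is a `<latency>,<initiation interval>` pair, and that an op's latency should not exceed the unit's `<max_latency>`. The names `parse_trace_config`, `check_spec_units`, and `SpecUnit` are hypothetical helpers introduced only for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple

LAT_PREFIX = "-trace_opcode_latency_initiation_"
UNIT_PREFIX = "-specialized_unit_"


class SpecUnit(NamedTuple):
    # Field order follows the comment in the config:
    # <enabled>,<num_units>,<max_latency>,<ID_OC_SPEC>,<OC_EX_SPEC>,<NAME>
    enabled: bool
    num_units: int
    max_latency: int
    id_oc_spec: int
    oc_ex_spec: int
    name: str


def parse_trace_config(
    text: str,
) -> Tuple[Dict[int, SpecUnit], Dict[str, Tuple[int, int]]]:
    """Return (specialized units by index, timing pairs by op name)."""
    units: Dict[int, SpecUnit] = {}
    timings: Dict[str, Tuple[int, int]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # option without a value; ignored in this sketch
        key, value = parts
        fields = [f.strip() for f in value.split(",")]
        if key.startswith(UNIT_PREFIX):
            idx = int(key[len(UNIT_PREFIX):])
            units[idx] = SpecUnit(
                fields[0] == "1",
                int(fields[1]),
                int(fields[2]),
                int(fields[3]),
                int(fields[4]),
                fields[5],
            )
        elif key.startswith(LAT_PREFIX):
            # Assumption: the pair is <latency>,<initiation interval>.
            timings[key[len(LAT_PREFIX):]] = (int(fields[0]), int(fields[1]))
    return units, timings


def check_spec_units(
    units: Dict[int, SpecUnit], timings: Dict[str, Tuple[int, int]]
) -> List[str]:
    """List specialized units whose spec_op line is missing or inconsistent."""
    problems: List[str] = []
    for idx, unit in sorted(units.items()):
        pair = timings.get(f"spec_op_{idx}")
        if pair is None:
            problems.append(f"{unit.name}: no spec_op_{idx} latency line")
        elif pair[0] > unit.max_latency:
            # Assumed rule: op latency should fit within the unit's max_latency.
            problems.append(
                f"{unit.name}: latency {pair[0]} > max_latency {unit.max_latency}"
            )
    return problems


# A few lines copied from the diff above.
SAMPLE = """\
-trace_opcode_latency_initiation_dp 64,64
#TEX unit, make fixed latency for all tex insts
-specialized_unit_2 1,4,200,4,4,TEX
-trace_opcode_latency_initiation_spec_op_2 200,4
-specialized_unit_4 1,4,4,4,4,UDP
-trace_opcode_latency_initiation_spec_op_4 4,1
"""

if __name__ == "__main__":
    parsed_units, parsed_timings = parse_trace_config(SAMPLE)
    for problem in check_spec_units(parsed_units, parsed_timings):
        print(problem)
```

Under those assumptions, the values in this diff pass the check: each `spec_op_N` latency (4, 200, 32, 4) matches the `<max_latency>` of its unit.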