A transformer model, built from scratch in Python and PyTorch, that translates English text into Chinese.
The architecture consists of encoder and decoder layers built on the MultiHeadAttention and MultiHeadCrossAttention classes, together with padding masks, a future (look-ahead) mask, and the remaining components the model needs to work end to end.
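
To illustrate how a padding mask and a future mask typically combine in decoder self-attention, here is a minimal sketch. It is not taken from this project's code: the helper names (`make_padding_mask`, `make_future_mask`, `masked_attention_weights`), the `PAD_ID` value, and the convention that `True` means "may attend" are all assumptions made for the example.

```python
import torch

PAD_ID = 0  # hypothetical padding token id, chosen only for this example


def make_padding_mask(token_ids: torch.Tensor, pad_id: int = PAD_ID) -> torch.Tensor:
    """True where the key position holds a real token. Shape: (batch, 1, 1, seq_len)."""
    return (token_ids != pad_id).unsqueeze(1).unsqueeze(2)


def make_future_mask(seq_len: int) -> torch.Tensor:
    """Lower-triangular mask: query i may attend to keys j <= i. Shape: (1, 1, seq_len, seq_len)."""
    allowed = torch.tril(torch.ones(seq_len, seq_len)).bool()
    return allowed.unsqueeze(0).unsqueeze(0)


def masked_attention_weights(scores: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Set disallowed positions to -inf so softmax gives them zero weight."""
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1)


# One target sequence of length 4 whose last position is padding.
tgt = torch.tensor([[5, 7, 9, PAD_ID]])

# Broadcasting combines both masks into shape (batch, 1, seq_len, seq_len).
mask = make_padding_mask(tgt) & make_future_mask(tgt.size(1))

# Uniform raw scores for (batch, heads, query, key), so only the mask shapes the result.
scores = torch.zeros(1, 1, 4, 4)
weights = masked_attention_weights(scores, mask)
```

With this convention, the encoder would use only the padding mask, while decoder self-attention uses the combination of both, so that a target position never attends to later positions or to padding.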