A minimal PyTorch re-implementation of OpenAI GPT (Generative Pretrained Transformer), covering both training and inference. numoGPT tries to be small, clean, interpretable, and educational, as most of the currently available GPT implementations can be a bit sprawling. GPT is not a complicated model, and this implementation is appropriately about 300 lines of code (see numogpt/model.py). All that's going on is that a sequence of indices feeds into a Transformer, and a probability distribution over the next index in the sequence comes out. The majority of the complexity is just being clever with batching (both across examples and over sequence length) for efficiency.
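As a hedged illustration of that interface, assuming numoGPT keeps minGPT's `GPT` class and config conventions (the sizes come from the demo settings listed below):

```python
import torch
from numogpt.model import GPT  # the model lives in numogpt/model.py per the text above

config = GPT.get_default_config()
config.model_type = None               # set sizes explicitly instead of via a preset
config.vocab_size = 50257              # GPT-2 BPE vocabulary size
config.block_size = 8
config.n_layer, config.n_head, config.n_embd = 4, 4, 64
model = GPT(config)

idx = torch.randint(0, config.vocab_size, (1, config.block_size))  # (B, T) token indices
logits, _ = model(idx)                                             # (B, T, vocab_size)
probs = torch.softmax(logits[:, -1, :], dim=-1)  # distribution over the next token
```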
- numoGPT is an alternative fork of minGPT.
- Implemented a demo that demonstrates training on an input text.
- Embedded OpenAI's GPT-2 token vocabulary: a JSON vocabulary mapping tokens to indices for the encoder.
- Implemented filtering by a customizable stopword list: stopwords.txt.
- Implemented a PyTorch TextDataset (text_dataset.py) that splits the input text into token blocks (a sketch of this step follows this list).
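A rough sketch of the dataset step, under stated assumptions: the `TextDataset` name comes from text_dataset.py above, stopwords.txt is the file mentioned above, and the sliding-window chunking follows minGPT's CharDataset style; the actual text_dataset.py may split the text differently:

```python
import torch
from torch.utils.data import Dataset

def filter_stopwords(words, path="stopwords.txt"):
    """Drop words found in the stopword file (assumed one word per line)."""
    with open(path, encoding="utf-8") as f:
        stop = {line.strip().lower() for line in f if line.strip()}
    return [w for w in words if w.lower() not in stop]

class TextDataset(Dataset):
    """Sliding-window split of a token-id sequence into (input, target) blocks."""

    def __init__(self, token_ids, block_size):
        self.token_ids = token_ids
        self.block_size = block_size

    def __len__(self):
        return len(self.token_ids) - self.block_size

    def __getitem__(self, i):
        chunk = self.token_ids[i : i + self.block_size + 1]
        x = torch.tensor(chunk[:-1], dtype=torch.long)  # tokens t .. t+T-1
        y = torch.tensor(chunk[1:], dtype=torch.long)   # tokens t+1 .. t+T (targets)
        return x, y
```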
Working demo: demo.py
- device: cpu
- model: gpt-numo
- n_layer: 4
- n_head: 4
- n_embd: 64
- block_sz: 8
- params: 3.42M
- stopwords: stopwords.txt
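The training run in the log below looks like a standard minGPT-style loop. A hedged sketch of how demo.py might drive it, assuming the fork keeps minGPT's Trainer API (`numogpt.trainer` is an assumed module path) and reusing the model and dataset from the sketches above:

```python
from numogpt.trainer import Trainer  # assumed path, mirroring minGPT's mingpt/trainer.py

train_config = Trainer.get_default_config()
train_config.device = "cpu"        # matches the settings above
train_config.max_iters = 5000
train_config.batch_size = 64       # matches batch_sz=64 in the evaluation log

trainer = Trainer(train_config, model, train_dataset)

def on_batch_end(tr):
    # report progress every 100 iterations, as in the log below
    if tr.iter_num > 0 and tr.iter_num % 100 == 0:
        print(f"...on {tr.iter_num}th iter...")

trainer.set_callback("on_batch_end", on_batch_end)
trainer.run()
```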
```text
TextWordacyDataset.sz=4677, block_size=6, blocks=780
number of parameters: 0.28M
running on device: cpu
...on 100th iter...
...on 200th iter...
...on 300th iter...
...on 400th iter...
...on 500th iter...
...
...on 4600th iter...
...on 4700th iter...
...on 4800th iter...
...on 4900th iter...
...on 5000th iter...
...finished 5000 iter(s)
--------------------------------------------------------------------------------
evaluate_gpt epoch:: batches=74, batch_sz=64
val_loss=0.5704, perplexity(PPL)=1.7690
--------------------------------------------------------------------------------
prompt ("text"): clustering algorithms sota pretrained language
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
prompt ("text"): clustering models proposed consider
--------------------------------------------------------------------------------
```
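For reference, the reported perplexity is just the exponential of the mean cross-entropy validation loss, which checks out against the log:

```python
import math

val_loss = 0.5704
ppl = math.exp(val_loss)   # perplexity = exp(mean cross-entropy loss)
print(f"PPL = {ppl:.4f}")  # PPL = 1.7690, matching the evaluation log above
```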
1. minGPT by Andrej Karpathy
2. minbpe by Andrej Karpathy
3. Karpathy's nn-zero-to-hero GPT exercises
4. Training a Mini-GPT to Learn Two-Digit Addition