
numoGPT

A minimal PyTorch re-implementation of OpenAI GPT (Generative Pretrained Transformer), covering both training and inference. numoGPT tries to be small, clean, interpretable, and educational, as most of the currently available GPT implementations can be a bit sprawling. GPT is not a complicated model, and this implementation is appropriately about 300 lines of code (see numogpt/model.py). All that's going on is that a sequence of indices feeds into a Transformer, and a probability distribution over the next index in the sequence comes out (sketched below). The majority of the complexity is just being clever with batching (both across examples and over sequence length) for efficiency.

  • numoGPT is an alternative fork of minGPT
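
That data flow can be sketched with stand-in tensors (a minimal sketch; the shapes here are hypothetical and the actual model lives in numogpt/model.py):

```python
import torch

# Hypothetical shapes illustrating the data flow described above;
# the real model is defined in numogpt/model.py.
batch_size, block_size, vocab_size = 4, 8, 5000

idx = torch.randint(vocab_size, (batch_size, block_size))  # a batch of index sequences goes in
logits = torch.randn(batch_size, block_size, vocab_size)   # stand-in for the Transformer's output
probs = torch.softmax(logits[:, -1, :], dim=-1)            # distribution over the next index
next_idx = torch.multinomial(probs, num_samples=1)         # sample the continuation token
```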

Model:

Working demo: demo.py, run with the following configuration (a construction sketch follows the list):

  • device: cpu
  • model: gpt-numo
  • n_layer: 4
  • n_head: 4
  • n_embd: 64
  • block_sz: 8
  • params: 3.42M
  • stopwords
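
A minimal sketch of how such a model might be assembled, assuming numoGPT keeps minGPT's GPT.get_default_config() interface (the vocab size here is hypothetical; the real demo derives it from its dataset):

```python
from numogpt.model import GPT

config = GPT.get_default_config()
config.model_type = None   # set the sizes explicitly rather than using a named preset
config.n_layer = 4
config.n_head = 4
config.n_embd = 64
config.vocab_size = 1024   # hypothetical; the demo takes this from its dataset
config.block_size = 8
model = GPT(config)        # minGPT prints the parameter count on construction
```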

Training log (tokens block size = 8):

TextWordacyDataset.sz=4677, block_size=6, blocks=780
number of parameters: 0.28M
running on device: cpu
...on 100th iter...
...on 200th iter...
...on 300th iter...
...on 400th iter...
...on 500th iter...
...
...on 4600th iter...
...on 4700th iter...
...on 4800th iter...
...on 4900th iter...
...on 5000th iter...
...finished 5000 iter(s)
--------------------------------------------------------------------------------
evaluate_gpt epoch:: batches=74, batch_sz=64
val_loss=0.5704, perplexity(PPL)=1.7690
--------------------------------------------------------------------------------
prompt ("text"): clustering algorithms sota pretrained language
--------------------------------------------------------------------------------
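
The perplexity reported above is just the exponentiated cross-entropy validation loss:

```python
import math

val_loss = 0.5704
ppl = math.exp(val_loss)   # ≈ 1.7690, matching the PPL in the log
```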


Text generation from a specified prompt:

--------------------------------------------------------------------------------
prompt ("text"): clustering models proposed consider
--------------------------------------------------------------------------------
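
A minimal sketch of prompted generation, assuming numoGPT keeps minGPT's generate() API (the token ids and vocab size here are hypothetical; the actual flow, including word-level encoding and stopword handling, lives in demo.py):

```python
import torch
from numogpt.model import GPT

# Assemble a small model as above (sizes from the demo config, vocab size hypothetical).
config = GPT.get_default_config()
config.model_type = None
config.n_layer, config.n_head, config.n_embd = 4, 4, 64
config.vocab_size, config.block_size = 1024, 8
model = GPT(config)
model.eval()

idx = torch.tensor([[11, 42, 7, 3]])  # encoded prompt, hypothetical token ids
with torch.no_grad():
    out = model.generate(idx, max_new_tokens=8, do_sample=True)  # minGPT-style sampling, assumed
print(out[0].tolist())  # decode these ids back to words via the dataset vocabulary
```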

References:

  • minGPT: https://github.com/karpathy/minGPT
