GPT from scratch

This is a bare-bones blog post about the foundational aspects of building a GPT model from the ground up. It covers essential topics such as tokenization, data chunking, batch processing, and the implementation of a bigram language model using PyTorch.

Tokenization
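To make the first topic concrete, here is a minimal sketch of character-level tokenization in plain Python. The placeholder corpus, the character-level vocabulary, and the `encode`/`decode` helpers are assumptions for illustration; the tokenizer built later in the post may differ.

```python
# Minimal character-level tokenizer sketch (illustrative assumptions,
# not necessarily the exact scheme used later in the post).
text = "hello world"  # placeholder corpus; a real run would load a dataset

chars = sorted(set(text))                      # unique characters = vocabulary
stoi = {ch: i for i, ch in enumerate(chars)}   # character -> integer id
itos = {i: ch for ch, i in stoi.items()}       # integer id -> character

def encode(s: str) -> list[int]:
    """Map a string to a list of token ids."""
    return [stoi[c] for c in s]

def decode(ids: list[int]) -> str:
    """Map a list of token ids back to a string."""
    return "".join(itos[i] for i in ids)

ids = encode("hello")
print(ids)          # the ids depend on the vocabulary ordering
print(decode(ids))  # "hello"
```

Character-level vocabularies keep the mapping tiny and easy to inspect, which is why they are a common starting point before moving to subword schemes.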