> [!WARNING]
> This project is still a work in progress.
A GPT-2 [1] implementation with manually computed gradients, inspired by karpathy/llm.c and karpathy/nanoGPT. It also includes a BPE tokenizer [2]. The plan is to eventually rewrite this in C++ with hand-optimized CPU kernels, targeting small language models on hardware-constrained devices.
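As a rough illustration of what "manually computed gradients" means here (a minimal sketch, not code from this repo), the backward pass of a single linear layer can be derived by hand and checked against PyTorch autograd:

```python
# Illustrative sketch only: hand-derived gradients for y = x W, verified against autograd.
import torch

torch.manual_seed(0)
x = torch.randn(4, 8, requires_grad=True)  # batch of inputs
w = torch.randn(8, 3, requires_grad=True)  # weight matrix
y = x @ w                                  # forward pass: y = x W
loss = y.sum()
loss.backward()                            # autograd reference gradients

dy = torch.ones_like(y)                    # dL/dy for loss = sum(y)
dx_manual = dy @ w.detach().T              # dL/dx = dL/dy W^T
dw_manual = x.detach().T @ dy              # dL/dW = x^T dL/dy

print(torch.allclose(dx_manual, x.grad))   # True
print(torch.allclose(dw_manual, w.grad))   # True
```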
Requires PyTorch >= 2.6.0. Run `python src/gpt.py` to train on the Shakespeare dataset and print a sample of generated text.
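The BPE tokenizer follows Sennrich et al. [2]. A minimal sketch of the paper's core merge loop (illustrative only, not this repo's implementation, and using a naive `str.replace` instead of the paper's boundary-aware regex):

```python
# Illustrative sketch of BPE training from Sennrich et al. [2]:
# repeatedly merge the most frequent adjacent symbol pair in the vocabulary.
from collections import Counter

def get_pair_counts(vocab):
    """Count adjacent symbol pairs across the (word -> frequency) vocabulary."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    """Replace every occurrence of the pair with its concatenation (naive version)."""
    merged, joined = " ".join(pair), "".join(pair)
    return {word.replace(merged, joined): freq for word, freq in vocab.items()}

# Toy corpus: words as space-separated symbols, ending with an end-of-word marker.
vocab = {"l o w </w>": 5, "l o w e r </w>": 2, "n e w e s t </w>": 6, "w i d e s t </w>": 3}
for _ in range(10):
    pairs = get_pair_counts(vocab)
    best = max(pairs, key=pairs.get)
    vocab = merge_pair(best, vocab)
    print(best)   # the learned merge rules, in order
```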
- [1] Phuong, Mary, and Marcus Hutter. ‘Formal Algorithms for Transformers’. arXiv [Cs.LG], 2022, http://arxiv.org/abs/2207.09238.
- [2] Sennrich, Rico, et al. ‘Neural Machine Translation of Rare Words with Subword Units’. arXiv [Cs.CL], 2016, http://arxiv.org/abs/1508.07909.