[1] https://huggingface.co/datasets/roneneldan/TinyStories
The use case for this is learning from a simple example.
[1] https://github.com/karpathy/nanoGPT