6 Educational Resources for Foundation Model Training

Training models at any scale can be quite daunting to newer practitioners. The following educational resources may be useful in learning about the considerations required for successfully and effectively training or fine-tuning foundation models.

Additional Educational Resources

  • Everything about Distributed Training and Efficient Finetuning

    A rundown and crash course on distributed training for deep learning, with an eye toward LLM finetuning and currently useful tools and resources. It provides a good overview of the various distributed training strategies for efficient and scalable training.
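    As a toy illustration of the data-parallel strategy such guides cover, the core all-reduce step (averaging gradients across workers) can be simulated in plain Python; the worker count and gradient values below are made up for the example:

    ```python
    # Toy sketch of data-parallel gradient averaging (the all-reduce step).
    # Worker count and gradient values are illustrative, not from any real run.

    def all_reduce_mean(per_worker_grads):
        """Average each parameter's gradient across workers,
        as a data-parallel all-reduce would."""
        num_workers = len(per_worker_grads)
        num_params = len(per_worker_grads[0])
        return [
            sum(worker[p] for worker in per_worker_grads) / num_workers
            for p in range(num_params)
        ]

    # Each worker computed gradients on its own shard of the global batch.
    grads = [
        [1.0, -2.0, 4.0],   # worker 0
        [3.0,  0.0, 0.0],   # worker 1
    ]
    print(all_reduce_mean(grads))  # [2.0, -1.0, 2.0]
    ```

    In a real framework this averaging happens inside a collective communication call (e.g. an all-reduce over the cluster interconnect) rather than in Python, but the arithmetic is the same.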

  • Machine Learning Engineering Online Book

    An “online textbook” and resource collection on ML engineering at scale, ranging from debugging distributed systems, parallelism strategies, effective use of large HPC clusters, and chronicles of past large-scale training runs with lessons learned.

  • nanoGPT

    A minimal, stripped-down training codebase, intended both for teaching purposes and as an easily hackable yet performant starting point for small-scale training.

  • The EleutherAI Model Training Cookbook

    A set of resources on how to train large-scale AI systems.

  • Transformer Inference Arithmetic

    A blog post on the inference costs of transformer-based LMs. Useful for providing more insight into deep learning accelerators and inference-relevant decisions to make when training a model.
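    One representative piece of arithmetic from this kind of analysis is sizing the key/value cache that autoregressive decoding keeps in accelerator memory. The model dimensions below are illustrative (roughly GPT-2-scale), not taken from the post:

    ```python
    # Back-of-the-envelope KV-cache sizing.
    # Model dimensions are illustrative (roughly GPT-2-scale) assumptions.

    def kv_cache_bytes(n_layers, d_model, seq_len, batch_size, bytes_per_value=2):
        """Memory for the key/value cache: 2 tensors (K and V) per layer,
        each of shape [batch, seq_len, d_model], at the given precision
        (2 bytes per value for fp16/bf16)."""
        return 2 * n_layers * batch_size * seq_len * d_model * bytes_per_value

    size = kv_cache_bytes(n_layers=12, d_model=768, seq_len=1024, batch_size=1)
    print(f"{size / 2**20:.1f} MiB")  # 36.0 MiB
    ```

    Scaling the same formula to large models and long contexts quickly shows why KV-cache memory, not just parameter memory, constrains batch size at inference time.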

  • Transformer Math 101

    An introductory blog post on the training costs of LLMs, covering useful formulas and considerations from a high level down to a low level.
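    The best-known of these formulas is the approximation that total training compute is about 6ND FLOPs, for N parameters trained on D tokens. A minimal sketch, with illustrative (not recommended) model and token counts:

    ```python
    # Sketch of the C ~= 6 * N * D training-compute approximation,
    # where N = parameter count and D = training tokens.
    # The 7B / 1T figures below are illustrative assumptions.

    def training_flops(n_params, n_tokens):
        """Approximate total training compute via C ~= 6ND
        (roughly 2ND for the forward pass, 4ND for the backward pass)."""
        return 6 * n_params * n_tokens

    c = training_flops(n_params=7e9, n_tokens=1e12)  # a 7B model on 1T tokens
    print(f"{c:.2e} FLOPs")  # 4.20e+22 FLOPs
    ```

    Dividing such an estimate by the sustained FLOP/s of a given cluster gives a first-order estimate of wall-clock training time, which is the kind of exercise the post walks through.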