Reading List

Transformers

Title	Topic	Comments
Transformers Primer by Aman.AI (Chadha, 2020)	Transformers	Very comprehensive
The Illustrated Transformer by Jay Alammar (Alammar, 2018)	Transformers	Great Illustrations
Attention in transformers, visually explained by 3Blue1Brown (3Blue1Brown, 2024)	Transformers	Great Visuals and Explanation
Some Intuition on Attention and the Transformer by Eugene Yan (Yan, 2023)	Transformers	Great Visuals and Explanation
The Transformer Family by Lilian Weng (Weng, 2020)	Advances in Transformers	Advanced transformer post-enhancements
The Transformer Family 2.0 by Lilian Weng (Weng, 2023)	Advances in Transformers	Update to The Transformer Family (Weng, 2020). Adds many newer transformer improvements; however, some of the modules not covered in (Weng, 2020) are very detailed and niche.
The Illustrated BERT (Alammar, 2018)	LMs	Good Short Overview
Generalized Language Models by Lilian Weng (Weng, 2019)	LMs	Great overview of BERT and its successors
Ten Noteworthy AI Research Papers of 2023 by Sebastian Raschka (Raschka, 2023)	LMs/Research	Decent sampler of ten papers from 2023
AI and Open Source in 2023 (Raschka, 2023)	LMs/Research	Decent sampler of ten papers from 2023
New LLM Pre-training and Post-training Paradigms (Raschka, 2024)	LMs/Training/Research	Detailed overview of pre-training pipelines
Multimodality and Large Multimodal Models (LMMs) by Chip Huyen (Huyen, 2023)	MMs	Great review of MMs, covering CLIP, Flamingo, and insights
Generalized Visual Language Models by Lilian Weng (Weng, 2022)	MMs	Great overview of VLM techniques
Primers - Vision Language Models (Chadha, 2020)	MMs	Average Read
RLHF: Reinforcement Learning from Human Feedback (Huyen, 2023)	Training	Great intro to pre-training, SFT and RMs
LLM Training: RLHF and Its Alternatives (Raschka, 2023)	Training	Good overview of RLHF
LLM Alignment by Aman.AI (Chadha, 2023)	Training	Very thorough, though not all topics are still relevant
Predictive Human Preference: From Model Ranking to Model Routing (Huyen, 2024)	Training	Basics of model evaluation, routing, and ranking. Other items, such as the predictive human preference experiments, can be skipped.
Aligning language models to follow instructions by OpenAI (OpenAI, 2022)	Training
Reinforcement Learning for Language Models (missing reference)	Training
Instruction Pretraining LLMs (Raschka, 2024)	Training
Practical Tips for Finetuning LLMs Using LoRA (Low-Rank Adaptation) (missing reference)	PET Methods	Good overview of LoRA and practical tips for using it
Noteworthy AI Research Papers of 2024 (Part One) (Raschka, 2023)	PET Methods	Six research papers from the first half of 2024
Improving LoRA: Implementing Weight-Decomposed Low-Rank Adaptation (DoRA) from Scratch (Raschka, 2023)	PET Methods	In-depth overview of DoRA
Finetuning LLMs with LoRA and QLoRA: Insights from Hundreds of Experiments (Raschka, 2023)	PET Methods	Deep dive into (missing reference)
The Scaling Hypothesis (Gwern, 2022)	Scaling Laws	Great discussion and overview; thought-provoking (long read)
Scaling Laws in Large Language Models (Mandliya, 2024)	Scaling Laws	Great Quick Overview
Model Merging, Mixtures of Experts, and Towards Smaller LLMs (Raschka, 2023)	MoE/Merging