| Title | Topic | Comments |
| --- | --- | --- |
| Transformers Primer by Aman.AI (Chadha, 2020) | Transformers | Very comprehensive |
| The Illustrated Transformer by Jay Alammar (Chadha, 2020) | Transformers | Great Illustrations |
| Attention in transformers, visually explained by 3Blue1Brown (3Blue1Brown, 2024) | Transformers | Great Visuals and Explanation |
| Some Intuition on Attention and the Transformer by Eugene Yan (Yan, 2023) | Transformers | Great Visuals and Explanation |
| The Transformer Family by Lilian Weng (Weng, 2020) | Advances in Transformers | Advanced transformer post-enhancements |
| The Transformer Family 2.0 by Lilian Weng (Weng, 2023) | Advances in Transformers | Update to The Transformer Family (Weng, 2020). Adds many newer transformer developments, including modules not covered in (Weng, 2020). Very detailed and niche in places. |
| The Illustrated BERT (Alammar, 2018) | LMs | Good Short Overview |
| Generalized Language Models by Lilian Weng (Weng, 2019) | LMs | Great overview of BERT and its successors |
| Ten Noteworthy AI Research Papers of 2023 by Sebastian Raschka (Raschka, 2023) | LMs/Research | Decent sampler of ten papers from 2023 |
| AI and Open Source in 2023 (Raschka, 2023) | LMs/Research | Decent sampler of ten papers from 2023 |
| New LLM Pre-training and Post-training Paradigms (Raschka, 2024) | LMs/Training/Research | Detailed overview of pre-training pipelines |
| Multimodality and Large Multimodal Models (LMMs) by Chip Huyen (Huyen, 2023) | MMs | Great review of MMs, with CLIP, Flamingo and insights |
| Generalized Visual Language Models by Lilian Weng (Weng, 2022) | MMs | Great overview of VLM techniques |
| Primers - Vision Language Models (Chadha, 2020) | MMs | Average Read |
| RLHF: Reinforcement Learning from Human Feedback (Huyen, 2023) | Training | Great intro to pre-training, SFT and RMs |
| LLM Training: RLHF and Its Alternatives (Raschka, 2023) | Training | Good overview of RLHF |
| LLM Alignment by Aman.AI (Chadha, 2023) | Training | Very thorough, though not all topics are still useful today |
| Predictive Human Preference: From Model Ranking to Model Routing (Huyen, 2024) | Training | Basics of model evaluation, routing, and ranking. Other items, such as the predictive human preference experiments, can be skipped. |
| Aligning language models to follow instructions by OpenAI (OpenAI, 2022) | Training | |
| Reinforcement Learning for Language Models (missing reference) | Training | |
| Instruction Pretraining LLMs (Raschka, 2024) | Training | |
| Practical Tips for Finetuning LLMs Using LoRA (Low-Rank Adaptation) (missing reference) | PET Methods | Good overview of LoRA and practical tips for using it |
| Noteworthy AI Research Papers of 2024 (Part One) (Raschka, 2023) | PET Methods | Six research papers from the first half of 2024 |
| Improving LoRA: Implementing Weight-Decomposed Low-Rank Adaptation (DoRA) from Scratch (Raschka, 2023) | PET Methods | In-depth overview of DoRA |
| Finetuning LLMs with LoRA and QLoRA: Insights from Hundreds of Experiments (Raschka, 2023) | PET Methods | Deep dive into (missing reference) |
| The Scaling Hypothesis (Gwern, 2022) | Scaling Laws | Great discussion and overview; thought-provoking (long read) |
| Scaling Laws in Large Language Models (Mandliya, 2024) | Scaling Laws | Great Quick Overview |
| Model Merging, Mixtures of Experts, and Towards Smaller LLMs (Raschka, 2023) | MoE/Merging | |