📄️ The Illustrated Transformer (Notes)
- Attention mechanism
📄️ Transformer Inference Arithmetic
As of 2022-03-30
📄️ Chat Template
https://github.com/openai/openai-python/blob/release-v0.28.0/chatml.md
📄️ Why Do LLMs Use a Decoder-Only Architecture?
1. Encoder-only models like BERT are ruled out: they are pretrained with masked language modeling, which makes them poorly suited to generation tasks.
📄️ Speculative Decoding
Speculative decoding uses two models: a small draft model that cheaply proposes tokens and a large target model that verifies them.
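A minimal sketch of the draft-and-verify loop, assuming hypothetical `draft_model` and `target_model` callables that map a token list to a next-token probability distribution (a 1-D PyTorch tensor over the vocabulary); this is an illustration under those assumptions, not the implementation from the linked note:

```python
import torch

def speculative_decode(draft_model, target_model, tokens, k=4):
    """Propose k tokens with the cheap draft model, then verify them
    with the expensive target model via rejection sampling."""
    # 1. Draft: sample k tokens autoregressively from the small model.
    draft_tokens, draft_probs = [], []
    ctx = list(tokens)
    for _ in range(k):
        p = draft_model(ctx)                # next-token distribution
        t = torch.multinomial(p, 1).item()
        draft_tokens.append(t)
        draft_probs.append(p[t].item())
        ctx.append(t)

    # 2. Verify: score the proposals with the target model. (A real
    #    implementation checks all k positions in one batched forward
    #    pass; this toy loops for clarity.)
    accepted = []
    ctx = list(tokens)
    for t, q in zip(draft_tokens, draft_probs):
        p = target_model(ctx)[t].item()
        # Accept with probability min(1, p_target / p_draft).
        if torch.rand(1).item() < min(1.0, p / q):
            accepted.append(t)
            ctx.append(t)
        else:
            # First rejection ends the run; a full implementation would
            # resample one token from the residual distribution here.
            break
    return accepted
```

The min(1, p_target / p_draft) acceptance rule is what makes the trick exact: accepted tokens are distributed as if they had been sampled from the target model directly, so quality is preserved while most decoding steps cost only a draft-model forward pass.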
📄️ ChatGPT Fine Tuning
When to Fine Tune?
📄️ Train Domain LLM with GRPO
Motivation