Exploring Medusa and Multi-Token Prediction
10 min read
July 10, 2024
Matthew Gunton
This blog post will go into detail on the “MEDUSA: Simple LLM Inference Acceleration Framework...