70B Llama 2 reaches 35 tokens/s on consumer hardware 🏇

Sep 13, 2023

📰 News

📦 Repos

📄 Papers

DeepMind: From Sparse to Soft Mixtures of Experts
From the author: "Sparse MoEs are a popular method for increasing the model size without increasing its cost, but they come with several issues. Soft MoEs avoid them and significantly outperform ViT and different Sparse MoEs on image classification."
Stay on topic with Classifier-Free Guidance
Classifier-Free Guidance (CFG) is a technique for improving the coherence and alignment of text generated by large language models. It's also in llama.cpp: https://github.com/ggerganov/llama.cpp/pull/2135
Lost in the Middle: How Language Models Use Long Contexts
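
To make the Classifier-Free Guidance entry above a bit more concrete, here is a minimal sketch of the logit mixing step as it is commonly described for language models: the model is run once with the prompt and once without it, and the two next-token distributions are combined with a guidance scale. The function names (`log_softmax`, `cfg_logits`) and the toy numbers are assumptions for illustration only; this is not the llama.cpp implementation.

```python
import math


def log_softmax(logits):
    """Convert raw logits to log-probabilities (numerically stable)."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def cfg_logits(cond_logits, uncond_logits, guidance_scale):
    """Mix conditional and unconditional log-probabilities.

    guided = uncond + guidance_scale * (cond - uncond)
    A scale of 1.0 gives back the conditional distribution;
    larger values push further toward what the prompt favors.
    """
    cond = log_softmax(cond_logits)
    uncond = log_softmax(uncond_logits)
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


# Toy vocabulary of three tokens (made-up numbers).
with_prompt = [2.0, 1.0, 0.0]      # logits when the prompt is in context
without_prompt = [1.0, 1.0, 1.0]   # logits when the prompt is dropped
guided = cfg_logits(with_prompt, without_prompt, guidance_scale=1.5)
print(guided)
```

The guided values are unnormalized scores; in practice they would be passed through softmax (and the usual sampling settings) before picking the next token.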

📚 Resources

Want more? Follow me on Twitter! @ricklamers

Coding with Intelligence