70B Llama 2 reaches 35 tokens/s on consumer hardware 🏇

Week 37 of Coding with Intelligence

Rick Lamers
Sep 13, 2023

📰 News

  • 70B Llama 2 at 35 tokens/second on an RTX 4090 with 2.5 bits per weight

  • phi-1.5 open sourced: Textbooks Are All You Need

  • Adept releases OSS LLM Persimmon-8B, beats Llama 2 and all other <10B OSS models
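
The 2.5 bits per weight figure in the first item is what lets a 70B model fit on a single consumer card. A back-of-the-envelope sketch of the arithmetic follows. It assumes a nominal 70e9 parameters and counts weights only (no KV cache, activations, or quantization metadata); the function name is made up for illustration.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    total_bytes = n_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / 2**30                    # bytes -> GiB


# Assumption: 70e9 is the nominal parameter count of the 70B model.
for bits in (16, 4, 2.5):
    gib = weight_memory_gib(70e9, bits)
    print(f"{bits:>4} bits/weight: {gib:6.1f} GiB")
```

At 2.5 bits per weight the weights come to about 20.4 GiB, which leaves some headroom within the 24 GB of an RTX 4090, while a 4-bit quantization of the same model would not fit on one card.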

📦 Repos

  • textstat: 📝 Python package to calculate readability statistics of a text

  • NeMo-Guardrails: LLM guardrails framework by NVIDIA

  • llmonitor: Open-source monitoring & analytics for AI apps and agents
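
To give a feel for what a readability statistic like the ones textstat reports looks like, here is a minimal sketch of the Flesch Reading Ease formula in plain Python. This is not textstat's implementation: the syllable counter is a naive vowel-group heuristic chosen for brevity, and both function names are illustrative. The package exposes its own `flesch_reading_ease` function, which should be preferred in practice.

```python
import re


def count_syllables(word: str) -> int:
    """Naive heuristic: one syllable per group of consecutive vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


print(flesch_reading_ease("The cat sat on the mat."))
```

Short sentences built from short words score high; long sentences with many polysyllabic words score low.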

📄 Papers

  • DeepMind: From Sparse to Soft Mixtures of Experts

    From the authors: "Sparse MoEs are a popular method for increasing the model size without increasing its cost, but they come with several issues. Soft MoEs avoid them and significantly outperform ViT and different Sparse MoEs on image classification."

  • Stay on topic with Classifier-Free Guidance

    Classifier-Free Guidance (CFG) is a technique for improving the coherence and prompt adherence of text generated by large language models. It is also available in llama.cpp: https://github.com/ggerganov/llama.cpp/pull/2135

  • Lost in the Middle: How Language Models Use Long Contexts
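
For the Classifier-Free Guidance item above, the core idea can be sketched in a few lines: run the model once with the prompt and once without it (or with a negative prompt), then push the next-token distribution away from the unconditional one. The sketch below works on plain Python lists of logits and is only an illustration under those assumptions. It is not the llama.cpp code, and the function and parameter names are made up.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def cfg_logprobs(cond_logits, uncond_logits, guidance_scale):
    """Combine conditional and unconditional next-token scores.

    guidance_scale = 1.0 reproduces the conditional distribution;
    values above 1.0 amplify whatever the prompt made more likely.
    """
    cond = log_softmax(cond_logits)
    uncond = log_softmax(uncond_logits)
    mixed = [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]
    return log_softmax(mixed)  # renormalise so the result is a distribution


# Toy vocabulary of three tokens; the prompt favours token 0.
cond = [2.0, 1.0, 0.0]
uncond = [0.0, 0.0, 0.0]
for scale in (1.0, 1.5):
    probs = [math.exp(lp) for lp in cfg_logprobs(cond, uncond, scale)]
    print(scale, [round(p, 3) for p in probs])
```

The cost is a second forward pass per generated token, which is the main trade-off to keep in mind when enabling it.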

📚 Resources

  • Cube(.js) semantic layer with LangChain for hallucination-free natural language querying

  • Reverse engineering the upcoming on-device LLM in iOS/macOS

  • Paper List for LLM In-context Learning

  • Report on Frontier Model Training


Want more? Follow me on Twitter! @ricklamers

© 2023 Rick Lamers