

8x faster inference with Flash Decoding - OS models are getting faster
Week 42 of Coding with Intelligence
📰 News
Together AI: Flash-Decoding for long-context inference
An 8x inference speedup on long sequences, from Tri Dao, the author behind FlashAttention. Flash-Decoding parallelizes attention over the key/value sequence dimension during decoding, which keeps the GPU busy even when generating one token at a time.
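The core trick is that softmax attention can be computed in chunks and combined exactly with a log-sum-exp reduction: each KV-cache chunk produces a partial output, a partial normalizer, and a chunk-local max, which are then merged. Here is a minimal NumPy sketch of that reduction idea (the real implementation is a fused CUDA kernel; function names here are illustrative):

```python
import numpy as np

def attention_reference(q, K, V):
    # Standard single-query attention: softmax(q K^T / sqrt(d)) @ V
    d = q.shape[-1]
    s = (K @ q) / np.sqrt(d)            # scores, shape (n,)
    w = np.exp(s - s.max())
    w /= w.sum()
    return w @ V                        # output, shape (d,)

def split_kv_attention(q, K, V, num_chunks=4):
    # Flash-Decoding-style computation: process KV chunks independently
    # (parallelizable across SMs), then merge with a log-sum-exp reduction.
    d = q.shape[-1]
    partials = []
    for Kc, Vc in zip(np.array_split(K, num_chunks),
                      np.array_split(V, num_chunks)):
        s = (Kc @ q) / np.sqrt(d)
        m = s.max()                     # chunk-local max for stability
        e = np.exp(s - m)
        partials.append((m, e.sum(), e @ Vc))
    # Rescale each chunk's partial sums into a common reference frame.
    m_global = max(m for m, _, _ in partials)
    denom = sum(z * np.exp(m - m_global) for m, z, _ in partials)
    numer = sum(o * np.exp(m - m_global) for m, _, o in partials)
    return numer / denom
```

The split version is mathematically exact, not an approximation: both functions return the same vector, because rescaling by `exp(m - m_global)` reconstructs the global softmax from the chunk-local ones.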
📄 Papers
RAFA: A Framework for LLM Agents with Provable Sample Efficiency
BitNet: Scaling 1-bit Transformers for Large Language Models
🛠️ Products
📚 Resources
Finetuning LLMs with LoRA and QLoRA: Insights from Hundreds of Experiments
Fireside Chat with Ilya Sutskever and Jensen Huang: AI Today and Vision of the Future
Want more? Follow me on Twitter! @ricklamers