📰 News
LMSYS Chatbot Arena launches Math and Instruction Following leaderboard
Anole: putting image generation back into Meta's Chameleon
It's interesting because Meta probably removed the capability of image gen from Chameleon for legal reasons.
📦 Repos
Builder.io launches micro-agent SWE agent
Similar to e.g. Aider
EleutherAI has released a set of top-k sparse autoencoders for every layer of Llama 3 8B
Very interesting (open!) interpretability work. Golden Gate Llama when?
PufferLib: Simplifying reinforcement learning for complex game environments
TL;DR: a bunch of nice abstractions that make reinforcement learning on complex environments less painful. Nice video demonstration.
📄 Papers
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents
Intelligent, dynamic memory access (LLM-driven dynamic RAG) is becoming a crucial pattern for agentic LLM apps. Cool paper highlighting one such approach.
FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation
"demonstrating for the first time how to train a large-scale binary language model from scratch (not the partial binary or ternary LLM like BitNet b1.58) to match the performance of its full-precision counterparts (e.g., FP16 or BF16) in transformer-based LLMs"
Mixture-of-Modules: Reinventing Transformers as Dynamic Assemblies of Modules
Data curation via joint example selection further accelerates multimodal learning
"13× fewer iterations and 10× less computation". By Google DeepMind
📱 Demos
Not SoTA-level, but an interesting study artifact!
🛠️ Products
This solidifies the pattern of LLMs generating useful, self-contained programs that users can use directly.
📚 Resources
Just read twice: closing the recall gap for recurrent language models
Extrinsic Hallucinations in LLMs
Another banger by Lilian Weng from OpenAI
torch.compile: the missing manual
By Edward Z. Yang from the PyTorch team at Meta
CapPa: Training vision models as captioners
Very neat W&B report: an open-source reproduction of "Image Captioners Are Scalable Vision Learners Too"
ColPali: Efficient Document Retrieval with Vision Language Models
Want more? Follow me on X! @ricklamers