Can we patch recurrent models? TTT & read twice

Jul 10, 2024

📰 News

📦 Repos

Builder.io launches micro-agent, a software engineering (SWE) agent
Similar to tools such as Aider
EleutherAI has released a set of top-k sparse autoencoders for every layer of Llama 3 8B
Very interesting (open!) interpretability work. Golden Gate Llama when?
PufferLib: Simplifying reinforcement learning for complex game environments
TL;DR: a set of abstractions that make reinforcement learning on complex environments less painful. Nice video demonstration.

📄 Papers

Learning to (Learn at Test Time): RNNs with Expressive Hidden States
AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents
Intelligent, dynamic memory access (LLM-driven dynamic RAG) is becoming a crucial pattern for agentic LLM apps. Cool paper highlighting one such approach.
Distilling System 2 into System 1
FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation
"demonstrating for the first time how to train a large-scale binary language model from scratch (not the partial binary or ternary LLM like BitNet b1.58) to match the performance of its full-precision counterparts (e.g., FP16 or BF16) in transformer-based LLMs"
Mixture-of-Modules: Reinventing Transformers as Dynamic Assemblies of Modules
Data curation via joint example selection further accelerates multimodal learning
"13× fewer iterations and 10× less computation". By Google DeepMind

📱 Demos

🛠️ Products

Poe launches Artifacts clone
This solidifies the pattern of LLMs generating useful, self-contained programs that users can run directly.

📚 Resources

Want more? Follow me on X! @ricklamers

Coding with Intelligence