📰 News
Magic.dev claims to have a 100M-token context window model that can run on < 1 H100
moondream2 breaks 80 on VQAv2 with its 2024-08-26 release
A powerful and efficient VLM by a cracked engineer (vikhyatk)
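If you want to try it, here's a minimal sketch following the model card on Hugging Face (the revision pin and the encode_image/answer_question helpers are from that card; exact names may change between releases):

```python
from PIL import Image
from transformers import AutoModelForCausalLM, AutoTokenizer

# Revision pin and helper methods follow the moondream2 model card;
# names may differ in newer releases.
model = AutoModelForCausalLM.from_pretrained(
    "vikhyatk/moondream2", revision="2024-08-26", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("vikhyatk/moondream2", revision="2024-08-26")

image = Image.open("photo.jpg")  # any local image
enc_image = model.encode_image(image)
print(model.answer_question(enc_image, "Describe this image.", tokenizer))
```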
CogVideoX-5B: an open-weights text-to-video model by Zhipu AI (Alibaba-affiliated)
It's very impressive for an open model!
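The weights run through diffusers' CogVideoXPipeline; a minimal sketch (model id and defaults taken from the release examples, so verify against the current docs):

```python
import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

# bf16 plus CPU offload keeps VRAM usage manageable on a single GPU.
pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()

frames = pipe(
    prompt="A panda playing guitar in a bamboo forest",
    num_inference_steps=50,
    guidance_scale=6.0,
).frames[0]
export_to_video(frames, "panda.mp4", fps=8)
```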
Meta invites coding agents and frontier labs to their competitive programming competition
OLMoE: Open Mixture-of-Experts Language Models
Truly open source, meaning model weights, training data, code, and logs are all released. Great work by the Allen Institute for AI democratizing AI.
📦 Repos
LLM Compressor library by Neural Magic
"before today, creating quantized checkpoints required navigating a fragmented ecosystem of bespoke compression libraries such as AutoGPTQ, AutoAWQ, AutoFP8, etc. We built LLM Compressor from the ground up as a single library for applying the latest compression best practices, including GPTQ, SmoothQuant, SparseGPT, and RTN"
📄 Papers
Beyond Preferences in AI Alignment
"Instead of alignment with the preferences of a human user, developer, or humanity-writ-large, AI systems should be aligned with normative standards appropriate to their social roles, such as the role of a general-purpose assistant." interesting take on alignment.
Memory-Efficient LLM Training with Online Subspace Descent
"In this work, we provide the first convergence guarantee for arbitrary update rules of projection matrix." that's an impressive feat for low-rank training and will reduce the memory load of LLM training for many projects.
Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations
ContextCite: Attributing Model Generation to Context
Directly referencing sources in the context window is one of the key ways to reduce hallucinations. This paper introduces a general technique for doing that and includes working code at https://github.com/MadryLab/context-cite
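Usage per the repo's README is compact (class name and arguments are from that README; the model, context, and query below are placeholders):

```python
from context_cite import ContextCiter

context = "Attention Is All You Need introduced the Transformer architecture in 2017."
query = "When was the Transformer introduced?"

cc = ContextCiter.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0", context, query
)
print(cc.response)                                      # the generated answer
print(cc.get_attributions(as_dataframe=True, top_k=3))  # context sources ranked by influence
```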
📱 Demos
Poolside video demo from Dec '23 ($400M raised, Cursor competitor)
TokenProbe: Text Generation with Token Probabilities
Neat demonstration of token probabilities over a longer sequence.
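The underlying computation is easy to reproduce with transformers: ask generate for per-step scores and read off each generated token's probability (model choice here is arbitrary):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("The quick brown fox", return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=20,
    do_sample=False,
    output_scores=True,           # keep the logits for every generated step
    return_dict_in_generate=True,
)

prompt_len = inputs["input_ids"].shape[1]
for step, logits in enumerate(out.scores):    # one logits tensor per new token
    probs = torch.softmax(logits[0], dim=-1)
    token_id = out.sequences[0, prompt_len + step]
    print(f"{tok.decode(token_id)!r}: p={probs[token_id].item():.3f}")
```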
📚 Resources
Want more? Follow me on X! @ricklamers