Anthropic Circuits Updates: your best bet for understanding LLMs?
Week 31 of Coding with Intelligence
📰 News
Mistral Large 2: 123B with 128k context at ~Llama 3.1 405B level
A 123B-parameter, 128k-context model supporting 80+ coding languages and strong multilingual performance (including low-resource languages). It reports 84% MMLU accuracy and GPT-4/Claude-level results on code generation, math, and reasoning benchmarks, while prioritizing inference efficiency on a single node. Unfortunately it's not permissively licensed: open weights under a research license.
Self-Directed Synthetic Dialogues and Revisions dataset released by AllenAI
The most significant gaps they are looking to fill are licensing restrictions and the limited number of turns in existing dialogue datasets. The data is available on Hugging Face (allenai/sdsd-dialogues and allenai/sdsd-revisions), a great gift to the open-source AI community.
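If you want to poke at the data yourself, it's a one-liner with the Hugging Face `datasets` library. A minimal sketch; the `train` split name is an assumption, so check the dataset cards:

```python
from datasets import load_dataset

# Dataset IDs from the release above; the "train" split name is an assumption.
dialogues = load_dataset("allenai/sdsd-dialogues", split="train")
revisions = load_dataset("allenai/sdsd-revisions", split="train")

print(dialogues[0])  # inspect one multi-turn synthetic dialogue
```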
📄 Papers
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models
The Larger the Better? Improved LLM Code-Generation via Budget Reallocation
An interesting, though relatively well-known, result from FAIR. I think the principle should be: if you can evaluate an answer fairly well and cheaply using a judge, prompting a smaller model multiple times is preferable to prompting a large model once. Latency might suffer, of course, because the inferences must be sequenced (draw samples, judge samples, return the synthesized/selected sample). Judging is easiest in code contexts, where unit tests can guide the selection, as in the sketch below.
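A minimal sketch of that loop, with hypothetical helper names: `generate` and `run_unit_tests` stand in for whatever model client and test harness you use.

```python
def best_of_k(prompt, generate, run_unit_tests, k=8):
    """Draw up to k candidates from a small/cheap model and return
    the first one that passes the unit tests, else None."""
    for _ in range(k):
        candidate = generate(prompt)   # one call to the small model
        if run_unit_tests(candidate):  # cheap, reliable judge
            return candidate
    return None  # caller can escalate to a single large-model call
```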
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
An interesting attempt to quantify the role of vocabulary size in model scaling.
They've released the model weights on Hugging Face.
Apple Intelligence Foundation Language Models paper
Surprisingly, they trained on Google Cloud TPUs, not GPUs. Alas, no size details for AFM-server, their larger, Llama-3-rivaling server-side foundation model.
INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model
Models on Hugging Face and code on GitHub
📱 Demos
Segment Anything 2 Demo by Meta
Super impressive web demo 👏
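If you'd rather script it than click around the demo, the repo exposes an image predictor. A minimal sketch along the lines of the repo's README; the config/checkpoint names and the example click point are assumptions, so adjust them to your download:

```python
import numpy as np
import torch
from PIL import Image
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

# Config/checkpoint names follow the release naming; adjust to what you downloaded.
predictor = SAM2ImagePredictor(
    build_sam2("sam2_hiera_l.yaml", "checkpoints/sam2_hiera_large.pt")
)

image = np.array(Image.open("photo.jpg").convert("RGB"))
point_coords = np.array([[500, 375]])  # one foreground click (x, y); example values
point_labels = np.array([1])           # 1 = foreground

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    predictor.set_image(image)
    masks, scores, logits = predictor.predict(
        point_coords=point_coords, point_labels=point_labels
    )
```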
🛠️ Products
Meshy AI: text-2-3D by MIT PhD
Really impressive generation results, although generation can take a while.
📚 Resources
Anthropic interpretability work update - Circuits Updates - July 2024
ICML 2024 Tutorial: Physics of Language Models
By a Meta AI researcher. UPDATE: the video had to be set to private because of ICML conference agreements; it should be back up August 20th. In the interim, check out https://physics.allen-zhu.com/home and his other videos: https://www.youtube.com/@zhuzeyuan/videos
How fast can grammar-structured generation be?
By .txt, the creators of the Outlines library.
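For context, grammar-structured generation constrains decoding so that every sampled token keeps the output inside a formal grammar. A minimal sketch using Outlines' 0.x-era API; the model choice and the toy arithmetic grammar are illustrative, see the post for the actual performance analysis:

```python
import outlines

model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")

# A tiny Lark-style arithmetic grammar; during decoding, invalid tokens
# are masked out so the output always conforms to this grammar.
arithmetic_grammar = """
    ?start: expression
    ?expression: term (("+" | "-") term)*
    ?term: factor (("*" | "/") factor)*
    ?factor: NUMBER | "-" factor | "(" expression ")"
    %import common.NUMBER
"""

generator = outlines.generate.cfg(model, arithmetic_grammar)
print(generator("Write an arithmetic expression for 'a dozen and a half': "))
```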
AppWorld Engine: a high-fidelity execution environment simulating 457 APIs
This is a great contribution toward more robust evaluation of autonomous agent systems.
Want more? Follow me on X! @ricklamers