📰 News
Answer.AI releases toolkit for training 70B models on two 3090s
A collaboration between Tim Dettmers, Hugging Face staff, and Answer.AI (Jeremy Howard's R&D lab). Benchmarks are in the works, and there's also a repo on GitHub.
📦 Repos
Function calling optimized for Qwen
Transformer Debugger by OpenAI
Here’s a Loom intro video.
AICI: Prompts as (Wasm) Programs
Very cool project by Microsoft. It aims to standardize on a Wasm runtime for structured LLM inference, where generation follows an explicit control flow specified by the user. Check out the "generating a numbered list" example in the README. This could serve as the backbone for projects like Guidance, LMQL, SGLang, Outlines, jsonformer, LMFE, etc.
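To make the idea concrete, here is a toy "prompt as program" controller in Python. This is only a sketch of the concept, not AICI's actual API (AICI runs such controllers as Wasm modules inside the inference engine); llm_generate is a dummy stand-in for whatever backend you would call.

```python
# Toy illustration of "prompts as programs": the controller owns the control
# flow (here, building a numbered list) and only asks the model to fill gaps.
# This is NOT the AICI API; llm_generate() is a dummy stand-in for a real
# inference backend.

def llm_generate(prompt: str, stop: str) -> str:
    """Dummy completion; replace with a call to your inference server."""
    return "an example list item"

def numbered_list(topic: str, n: int = 5) -> str:
    out = f"Here are {n} ideas about {topic}:\n"
    for i in range(1, n + 1):
        out += f"{i}. "                      # fixed text is forced by the controller
        item = llm_generate(out, stop="\n")  # the model only writes the item body
        out += item.strip() + "\n"
    return out

print(numbered_list("structured decoding"))
```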
📄 Papers
Yi: Open Foundation Models by 01.AI
Lots of useful information about model pre-training
Is Cosine-Similarity of Embeddings Really About Similarity?
I’ve always felt cosine similarity is a crude proxy for actual semantic document retrieval. This paper from Netflix shines a light on some of its flaws.
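The core observation is easy to reproduce in a few lines of numpy: a matrix-factorization model is invariant to an arbitrary per-dimension rescaling of its factors, yet that rescaling can flip cosine-similarity rankings between items. The numbers below are my own toy example, not taken from the paper.

```python
# Toy numpy sketch: cosine similarity on learned embeddings can be an artifact
# of a free per-dimension scaling rather than of "semantic" closeness.
import numpy as np

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy item embeddings V and user embeddings U, as if learned from X ~= U @ V.T
V = np.array([[1.0, 1.0],   # item 0
              [1.0, 0.4],   # item 1
              [0.3, 1.0]])  # item 2
U = np.random.default_rng(0).normal(size=(4, 2))

# An arbitrary diagonal rescaling D leaves the model's predictions unchanged:
# (U @ D) @ (V @ inv(D)).T == U @ V.T
D = np.diag([3.0, 1 / 3.0])
U2, V2 = U @ D, V @ np.linalg.inv(D)
assert np.allclose(U @ V.T, U2 @ V2.T)

# ...but it changes which item looks "most similar" under cosine similarity.
print(cos(V[0], V[1]), cos(V[0], V[2]))      # item 1 closer to item 0 (~0.92 vs ~0.88)
print(cos(V2[0], V2[1]), cos(V2[0], V2[2]))  # ranking flips (~0.987 vs ~0.997)
```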
Efficient Tool Use with Chain-of-Abstraction Reasoning
A collaboration between EPFL and Meta. Very interesting work on function calling and agentic tool use by LLMs.
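As I understand the approach, the model first writes a reasoning chain containing abstract placeholders for tool results, and external tools then fill those placeholders in. Here is a toy Python sketch of that two-stage flow; the placeholder syntax and helper names are my own illustration, not the paper's format.

```python
# Toy sketch of chain-of-abstraction: (1) the LLM emits a chain with abstract
# tool slots like [calc(...) -> y1], (2) a tool pass resolves the slots.
import re

# Step 1: an "abstract" chain the LLM might produce (hard-coded here).
abstract_chain = (
    "The trip is 3 days at 42 km/day, so the total is [calc(3*42) -> y1] km. "
    "At 6 km/h, that takes [calc(y1/6) -> y2] hours."
)

def fill_chain(chain: str) -> str:
    """Step 2: resolve placeholders left to right with a calculator tool."""
    bindings: dict[str, float] = {}

    def resolve(match: re.Match) -> str:
        expr, name = match.group(1), match.group(2)
        for var, val in bindings.items():           # substitute earlier results
            expr = expr.replace(var, str(val))
        value = eval(expr, {"__builtins__": {}})    # the "tool": a toy calculator
        bindings[name] = value
        return str(value)

    return re.sub(r"\[calc\(([^)]*)\)\s*->\s*(\w+)\]", resolve, chain)

print(fill_chain(abstract_chain))
# The trip is 3 days at 42 km/day, so the total is 126 km. At 6 km/h, that takes 21.0 hours.
```

A benefit the paper highlights is that decoupling the reasoning chain from tool execution lets decoding and tool calls proceed in parallel, which speeds up inference.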
🛠️ Products
Another cool low-latency TTS provider I came across. Check out this video demo using Groq's LLM API.
I like their chat-to-webpages (search) and chat-to-document modes. Their citation system is also nicely implemented.
Devin - an AI Software Engineer by Cognition
They score above 10% on SWE-bench, a reputable eval from a group at Princeton (submitted to ICLR 2024), while GPT-4 remains stuck below 5%.
📚 Resources
Training great LLMs entirely from ground zero in the wilderness as a startup
Genstruct-7B by NousResearch
An instruction-generation model. Here's an idea: apply it recursively. Use Genstruct-7B to create the finetuning instructions for a Genstruct-7B-v2, then continue on to v3, v4, ... vN.
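For concreteness, a hypothetical sketch of that bootstrapping loop is below; generate_instructions and finetune are made-up stubs standing in for "run the current Genstruct model over a corpus to produce instruction/response pairs" and "fine-tune the next version on them".

```python
# Hypothetical sketch of the recursive bootstrapping idea above; both helper
# functions are stubs, not real training or inference code.
def generate_instructions(model: str, corpus: list[str]) -> list[dict]:
    # In practice: run the current Genstruct model over raw documents to emit
    # instruction/response pairs.
    return [{"instruction": f"[{model}] instruction for: {doc}", "response": "..."}
            for doc in corpus]

def finetune(base: str, pairs: list[dict], name: str) -> str:
    # In practice: fine-tune `base` on `pairs` and return the new checkpoint.
    print(f"fine-tuning {base} on {len(pairs)} pairs -> {name}")
    return name

def bootstrap(corpus: list[str], rounds: int = 3) -> str:
    model = "Genstruct-7B"
    for v in range(2, rounds + 2):                    # v2, v3, ..., v(rounds+1)
        pairs = generate_instructions(model, corpus)  # current model writes the data
        model = finetune(model, pairs, name=f"Genstruct-7B-v{v}")
    return model

bootstrap(["raw doc A", "raw doc B"])
```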
New benchmark leaderboard for LLMs by the reputable Allen Institute for AI
It shows Claude 3 coming in below the latest GPT-4 (0125-preview), and Mistral Large scoring above Gemini 1 Pro.
Want more? Follow me on Twitter! @ricklamers