Big update to Google's Bard and OpenAI ships a new completion model (with logprobs!)
Week 38 of Coding with Intelligence
📰 News
Google Bard ships Extensions update
This is a very powerful idea: it integrates data from Drive, Docs, Gmail, YouTube, Maps, Flights, and Hotels. Check out how well it does on your data.
OpenAI releases gpt-3.5-turbo-instruct, completion model
Apparently it reaches 1800 Elo in chess: https://twitter.com/GrantSlatton/status/1703913578036904431 and https://gist.github.com/grantslatton/8ae9d5bfe0f9e26bb5211a32b799abd3. Note that the endpoint includes logprobs, which can sometimes be useful; read more here: https://twitter.com/alexgraveley/status/1704169124467749090
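For the curious, here's a minimal sketch of requesting token logprobs from the completions endpoint with gpt-3.5-turbo-instruct. It hits the public /v1/completions API directly via requests (rather than any particular SDK version), and the PGN-style prompt is just an illustration in the spirit of the chess Gist above.

```python
# Minimal sketch: ask /v1/completions for the top-5 logprobs per generated token.
# Assumes OPENAI_API_KEY is set in the environment.
import os
import requests

resp = requests.post(
    "https://api.openai.com/v1/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-3.5-turbo-instruct",
        "prompt": "1. e4 e5 2. Nf3 Nc6 3.",  # PGN-style prompting, as in the chess experiments
        "max_tokens": 5,
        "temperature": 0,
        "logprobs": 5,  # return the top-5 log probabilities for each generated token
    },
    timeout=30,
)
choice = resp.json()["choices"][0]
for token, lp in zip(choice["logprobs"]["tokens"], choice["logprobs"]["token_logprobs"]):
    print(f"{token!r}: {lp:.3f}")
```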
📦 Repos
Marvin: minimalist LangChain alternative
Some cool ideas in here for treating LLMs as function calls with structured output in a very Python-native style. For a real-world example: the gpt-3.5-turbo-instruct chess Gist linked above uses it.
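To give a flavor of the "LLM as a typed Python function" idea, here's a rough sketch. The decorator name and import path (marvin.ai_fn) are assumptions based on Marvin 1.x and may differ across versions, so treat this as illustrative rather than canonical API.

```python
# Sketch of Marvin's decorator style: the function body is just a docstring;
# the library sends signature + docstring to the LLM and parses the reply
# into the annotated return type. `ai_fn` is assumed from Marvin 1.x.
from marvin import ai_fn

@ai_fn
def extract_openings(pgn: str) -> list[str]:
    """Return the names of chess openings that match the given PGN moves."""

print(extract_openings("1. e4 e5 2. Nf3 Nc6 3. Bb5"))
```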
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
The main author is also behind the https://vllm.ai/ project, and the work was supervised by Tri Dao, the inventor of FlashAttention. There's also a blog post: https://sites.google.com/view/medusa-llm
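To make the idea concrete, here's a toy sketch (my own illustration, not Medusa code) of the accept/reject mechanic: extra heads draft the next few tokens in a single step and the base model keeps the prefix it agrees with. Real Medusa verifies a whole tree of candidates with a custom attention mask; this single-branch greedy version only conveys the core logic, and both "models" are stand-ins.

```python
# Toy illustration of multi-head drafting + verification (not the actual Medusa code).
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 100

def base_model_greedy(context):
    """Stand-in for the base LLM's greedy next-token choice given a context."""
    return hash(tuple(context)) % VOCAB

def medusa_heads(context, k=4):
    """Stand-in for the K extra heads: cheap guesses for the next k tokens."""
    guesses, ctx = [], list(context)
    for _ in range(k):
        # Pretend each head agrees with the base model ~70% of the time.
        guess = base_model_greedy(ctx) if rng.random() < 0.7 else int(rng.integers(VOCAB))
        guesses.append(guess)
        ctx.append(guess)
    return guesses

def medusa_step(context):
    """One decoding step: draft k tokens, keep the prefix the base model agrees with."""
    draft, accepted, ctx = medusa_heads(context), [], list(context)
    for tok in draft:
        if base_model_greedy(ctx) != tok:
            break
        accepted.append(tok)
        ctx.append(tok)
    # The base model's own prediction at the first disagreement (or one past the
    # draft) is always kept, so every step emits at least one token.
    accepted.append(base_model_greedy(ctx))
    return accepted

print(medusa_step([1, 2, 3]))  # multiple tokens per step when the heads guess well
```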
📄 Papers
Language Modeling Is Compression
Compression Is All You Need? Also check out Ilya Sutskever's talk on unsupervised learning as compression, shared in an earlier issue: https://www.youtube.com/watch?v=AKMuA_TVz3A
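The core equivalence is easy to demo: an arithmetic coder driven by a model that assigns probability p to the next symbol spends -log2(p) bits on it, so the model's log-loss is the compressed size. A toy sketch of my own (not from the paper), using a character-level unigram model:

```python
# Toy demo: ideal code length under a model = sum of -log2(p) over the data.
# Even a trivial unigram character model beats 8 bits/char on English text;
# a strong LLM would compress far better. (Toy simplification: the "model"
# is fit on the same text it compresses.)
import math
from collections import Counter

text = "language modeling is compression " * 20
counts = Counter(text)
total = sum(counts.values())
probs = {c: n / total for c, n in counts.items()}

bits = sum(-math.log2(probs[c]) for c in text)   # ideal arithmetic-coding length
print(f"raw size: {8 * len(text)} bits")
print(f"unigram code length: {bits:.0f} bits ({bits / len(text):.2f} bits/char)")
```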
Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
Nougat: OCR for Academic Documents
The implementation is also available on GitHub: https://github.com/facebookresearch/nougat. Exciting how many high-quality tokens this will unlock for training/retrieval.
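If you want to try it on a paper, here's a hedged sketch of driving the Nougat CLI from Python; the package name (nougat-ocr), flags, and output file naming are assumptions based on the repo's README, so double-check `nougat --help` on your install.

```python
# Sketch: run the Nougat CLI (`pip install nougat-ocr`) on a PDF and read the
# resulting Mathpix-Markdown. Flags and the .mmd output name are assumptions.
import subprocess

subprocess.run(
    ["nougat", "paper.pdf", "-o", "parsed/"],  # assumed CLI: <pdf> -o <output dir>
    check=True,
)
with open("parsed/paper.mmd") as f:            # output assumed to mirror the PDF's name
    print(f.read()[:500])
```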
DeepMind paper on letting LLMs optimize prompts
tl;dr: up to 8% better on GSM8K and up to 50% better on Big-Bench Hard by letting the LLM find a better prompt. Paper title: "Large Language Models as Optimizers"
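The loop itself is simple enough to sketch: keep a trajectory of (prompt, score) pairs, show it to the LLM sorted by score, and ask for a better candidate. The sketch below is my paraphrase of that idea, not code from the paper; `llm` and `evaluate_on_dev` are hypothetical callables you'd supply.

```python
# Rough sketch of an OPRO-style prompt-optimization loop (illustrative only).
def optimize_prompt(llm, evaluate_on_dev, seed_prompt, steps=20):
    trajectory = [(seed_prompt, evaluate_on_dev(seed_prompt))]
    for _ in range(steps):
        history = "\n".join(
            f"prompt: {p}\nscore: {s:.1f}"
            for p, s in sorted(trajectory, key=lambda t: t[1])  # worst to best
        )
        meta_prompt = (
            "Here are previous instructions with their accuracies on a benchmark "
            f"(higher is better):\n{history}\n\n"
            "Write a new instruction that is different from all of the above and "
            "likely to score higher."
        )
        candidate = llm(meta_prompt)
        trajectory.append((candidate, evaluate_on_dev(candidate)))
    return max(trajectory, key=lambda t: t[1])[0]   # best prompt found
```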
Efficient Memory Management for Large Language Model Serving with PagedAttention
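This is the paper behind vLLM. The gist: instead of reserving one big contiguous KV-cache slab per request, the cache is split into fixed-size blocks and each sequence keeps a block table mapping logical positions to physical blocks. Here's a toy sketch of that bookkeeping (my own illustration, not vLLM code):

```python
# Toy PagedAttention-style KV-cache allocator: blocks are handed out on demand,
# so memory tracks actual sequence length instead of a worst-case reservation.
BLOCK_SIZE = 16

class PagedKVCache:
    def __init__(self, num_blocks):
        self.free_blocks = list(range(num_blocks))  # pool of physical blocks
        self.block_tables = {}                      # seq_id -> [physical block ids]
        self.lengths = {}                           # seq_id -> tokens written

    def append_token(self, seq_id):
        """Reserve room for one more token of `seq_id`, allocating a block if needed."""
        table = self.block_tables.setdefault(seq_id, [])
        length = self.lengths.get(seq_id, 0)
        if length % BLOCK_SIZE == 0:                # current block full (or first token)
            if not self.free_blocks:
                raise MemoryError("KV cache exhausted; a real server would preempt/swap")
            table.append(self.free_blocks.pop())
        self.lengths[seq_id] = length + 1
        return table[length // BLOCK_SIZE], length % BLOCK_SIZE  # where K/V would go

    def free(self, seq_id):
        """Return a finished sequence's blocks to the pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

cache = PagedKVCache(num_blocks=8)
for _ in range(40):                                 # 40 tokens -> 3 blocks of 16
    cache.append_token("request-1")
print(cache.block_tables["request-1"], f"free blocks: {len(cache.free_blocks)}")
cache.free("request-1")
```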
📱 Demos
📚 Resources
Google's Jeff Dean and Amin Vahdat reveal some TPU/LLM scaling challenges
Generative AI Infra & Market Map by Sequoia Capital
Useful to get a "lay of the land". The infra map is very helpful if you're a builder.
Pallas: a JAX extension similar to Triton
In case you forgot about Triton, it's OpenAI's framework that tries to be in the Goldilocks zone for ML engineers working on neural network architectures.
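To show what that looks like in JAX, here's a minimal Pallas sketch based on the documented jax.experimental.pallas API (pallas_call plus Ref-style kernel arguments); treat it as illustrative rather than authoritative, since the API is experimental and may shift.

```python
# Minimal Pallas kernel sketch: kernels read/write through Refs, Triton-style,
# and pallas_call wires them into a regular JAX function.
import jax
import jax.numpy as jnp
from jax.experimental import pallas as pl

def add_kernel(x_ref, y_ref, o_ref):
    # Elementwise add written as loads/stores on Refs rather than returned values.
    o_ref[...] = x_ref[...] + y_ref[...]

@jax.jit
def add(x, y):
    return pl.pallas_call(
        add_kernel,
        out_shape=jax.ShapeDtypeStruct(x.shape, x.dtype),
    )(x, y)

x = jnp.arange(8.0)
print(add(x, x))  # [ 0.  2.  4.  6.  8. 10. 12. 14.]
```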
Want more? Follow me on Twitter! @ricklamers