GPT-3.5 Turbo fine-tuning has arrived: here's why it's a HUGE deal
Week 34 of Coding with Intelligence
📰 News
GPT-3.5 Turbo fine-tuning is here
If you've been paying attention to the GenAI products & features coming out over the past months, you must have come across this meme: it seemed for a long time that it would be very difficult or impossible to build an experience differentiated from other startups, because under the hood every startup/incumbent team was calling the same OpenAI model with slightly different prompts.
With the advent of more sophisticated RAG-based prompting, the delta between various features & products has started to grow, but it's still quite difficult, especially for some use cases, to build meaningful differentiation.
The most heavyweight option for differentiation was fine-tuning open source base models for a specific task, but that has only slowly started to become feasible as more powerful open source base LLMs have become available.
With this release, however, OpenAI makes building a differentiated offering solely a matter of dataset collection & curation. This increases the degrees of freedom (beyond prompting techniques like RAG) for building differentiated features in terms of raw performance/accuracy/style, whatever your task metric may be.
In the long run, many use cases will still benefit from the advantages of open source LLMs over fine-tuning against a proprietary API, and much more will happen here over the next few months/years.
But right this second, the best way to differentiate your LLM-powered solutions is to fine-tune GPT-3.5 Turbo with the best task dataset you can build (similar to the Textbooks Are All You Need paper). Especially considering emergent abilities can be achieved (as per this Stanford LLM talk by Jason Wei, OpenAI/ex-Google) with the right combination of fine-tuning & data curation.
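To make that concrete, the whole workflow boils down to dataset curation plus two API calls. Here's a minimal sketch, using the chat-format JSONL OpenAI expects and the fine-tuning endpoints of the OpenAI Python SDK as of this writing (the example data and file name are placeholders; check the current SDK for exact endpoint names):

```python
# Minimal fine-tuning sketch; dataset contents and file name are illustrative.
import json
import openai

# 1. Curate a task dataset in chat format, one JSON object per line
#    (OpenAI requires at least ~10 examples in practice).
examples = [
    {"messages": [
        {"role": "system", "content": "You are a terse SQL assistant."},
        {"role": "user", "content": "Count users created this week."},
        {"role": "assistant", "content": "SELECT COUNT(*) FROM users "
                                         "WHERE created_at >= now() - interval '7 days';"},
    ]},
]
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# 2. Upload the dataset and start a fine-tuning job on gpt-3.5-turbo.
training_file = openai.File.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = openai.FineTuningJob.create(training_file=training_file.id, model="gpt-3.5-turbo")
print(job.id)  # poll openai.FineTuningJob.retrieve(job.id) until it succeeds,
               # then call ChatCompletion with the returned ft:gpt-3.5-turbo:... model id
```

Everything that matters for differentiation happens in step 1; the API does the rest.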
📦 Repos
Rust-based vector support for Postgres
The project looks early, but Rust has the potential to deliver very performant similarity search, stretching the scale that can be handled inside a Postgres database. The big advantage: no need for replication between your OLTP database and a dedicated vector database. It will be interesting to see how it compares to pg_embedding, Neon's performance-focused pgvector alternative: https://github.com/neondatabase/pg_embedding
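For illustration, this is roughly what the "vectors next to your OLTP data" pattern looks like from Python. The SQL below is pgvector-style; the Rust extension and pg_embedding use their own type/index syntax, so treat the DDL as a placeholder (the connection string is hypothetical too):

```python
# Sketch: embeddings stored in the same Postgres database as transactional data.
import psycopg2

conn = psycopg2.connect("dbname=app")  # hypothetical connection string
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute("""
    CREATE TABLE IF NOT EXISTS documents (
        id bigserial PRIMARY KEY,
        body text,
        embedding vector(1536)  -- lives right next to your OLTP rows
    );
""")
conn.commit()

# Nearest-neighbour lookup by L2 distance: no separate vector store to keep in sync.
query_embedding = [0.0] * 1536  # placeholder; comes from your embedding model
cur.execute(
    "SELECT id, body FROM documents ORDER BY embedding <-> %s::vector LIMIT 5;",
    ("[" + ",".join(map(str, query_embedding)) + "]",),
)
print(cur.fetchall())
```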
DS-1000: a coding benchmark from the University of Hong Kong that goes way beyond HumanEval
📄 Papers
Composable Function-preserving Expansions for Transformer Architectures
Since pre-training is so expensive, it's good to know we can expand the size of a pre-trained model while preserving its behavior. This is a great addition to the degrees of freedom that empower success with fine-tuning.
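As a generic illustration of the function-preserving idea (not the paper's exact construction): you can widen an MLP by giving the new hidden units zero outgoing weights, so the expanded network computes exactly the same function until further training moves them.

```python
# Function-preserving width expansion of a 2-layer MLP (generic illustration).
import torch
import torch.nn as nn

def widen_mlp(fc1: nn.Linear, fc2: nn.Linear, extra: int):
    new_fc1 = nn.Linear(fc1.in_features, fc1.out_features + extra)
    new_fc2 = nn.Linear(fc2.in_features + extra, fc2.out_features)
    with torch.no_grad():
        new_fc1.weight[: fc1.out_features] = fc1.weight   # copy old hidden units
        new_fc1.bias[: fc1.out_features] = fc1.bias       # (extra units stay random)
        new_fc2.weight[:, : fc2.in_features] = fc2.weight
        new_fc2.weight[:, fc2.in_features:] = 0.0         # zero columns: new units contribute nothing
        new_fc2.bias.copy_(fc2.bias)
    return new_fc1, new_fc2

fc1, fc2 = nn.Linear(16, 32), nn.Linear(32, 16)
wide1, wide2 = widen_mlp(fc1, fc2, extra=8)
x = torch.randn(4, 16)
old = fc2(torch.relu(fc1(x)))
new = wide2(torch.relu(wide1(x)))
print(torch.allclose(old, new, atol=1e-6))  # True: same function, more capacity
```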
Graph of Thoughts: Solving Elaborate Problems with Large Language Models
A novel way to think about composing and decomposing LLM reasoning. For hard tasks this approach might get you from failure to success (in case prompting schemes like Chain-of-Thought or Tree of Thoughts don't get you there).
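A toy sketch of the idea (not the paper's framework): thoughts are nodes in a graph, and on top of branching you also get edges that merge and refine thoughts; `llm` and `score` are hypothetical callables you would supply.

```python
# Toy Graph-of-Thoughts-style controller; `llm(prompt) -> str` and
# `score(thought) -> float` are hypothetical stand-ins.
def graph_of_thoughts(llm, score, problem, branches=3, rounds=2, keep=2):
    # generate: initial thought nodes
    frontier = [llm(f"Propose a partial solution to: {problem}") for _ in range(branches)]
    for _ in range(rounds):
        frontier.sort(key=score, reverse=True)
        best = frontier[:keep]
        # aggregate: an edge that merges several thoughts into a single node
        merged = llm("Combine these partial solutions into one:\n" + "\n".join(best))
        # refine: a self-loop that improves an existing node
        refined = llm(f"Improve this solution:\n{merged}")
        frontier = best + [merged, refined]
    return max(frontier, key=score)
```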
Reinforced Self-Training (ReST) for Language Modeling: LLMs improving themselves
Exciting result by DeepMind showing we can use an LLM to generate data that is then used to further improve that same LLM through additional training.
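The core loop, roughly as the paper frames it with its Grow and Improve steps (a sketch only; `model.generate`, `reward`, and `finetune` are hypothetical stand-ins for your own sampling, scoring, and training code):

```python
# ReST-style self-training sketch: Grow a synthetic dataset, then repeatedly
# Improve the model on reward-filtered subsets with a rising threshold.
def rest(model, prompts, reward, finetune, outer_steps=3, inner_steps=2, samples_per_prompt=8):
    for _ in range(outer_steps):
        # Grow: sample a synthetic dataset from the current model
        dataset = [(p, c) for p in prompts for c in model.generate(p, n=samples_per_prompt)]
        for step in range(inner_steps):
            # Improve: keep only high-reward samples and fine-tune on them
            threshold = 0.5 + 0.2 * step  # illustrative schedule
            filtered = [(p, c) for p, c in dataset if reward(p, c) >= threshold]
            model = finetune(model, filtered)
    return model
```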
Platypus: tops HF Open LLM Leaderboard
From a group of authors at Boston University. Their main contribution is a high-quality fine-tuning dataset called Open-Platypus. Their fine-tuning process is very inexpensive; from the paper: "a 13B Platypus model can be trained on a single A100 GPU using 25k questions in 5 hours".
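For context, this is the kind of parameter-efficient (LoRA) recipe that makes a 13B run that cheap. A sketch with Hugging Face `peft`; the hyperparameters and target modules are illustrative, not the paper's exact settings:

```python
# LoRA adapter setup sketch: only a tiny fraction of the 13B weights get gradients.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-13b-hf")  # gated model
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections only (illustrative)
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # a fraction of a percent of the full model

# From here, train on the Open-Platypus instruction data with a standard
# Trainer/SFT loop; only the LoRA adapters are updated.
```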
📚 Resources
Together AI releases Llama-2-7B-32K-Instruct model
Great to see Together AI push what's possible on context length with Open Source models! Available on Hugging Face https://huggingface.co/togethercomputer/Llama-2-7B-32K-Instruct
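Loading it is the usual `transformers` flow; a quick sketch (the model id comes from the link above, while the prompt format and the `trust_remote_code` requirement follow my reading of the model card, so double-check there):

```python
# Sketch: running the 32K-context instruct model with transformers + accelerate.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "togethercomputer/Llama-2-7B-32K-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, device_map="auto"  # device_map needs accelerate
)

prompt = "[INST]\nSummarize the following report:\n...\n[/INST]\n\n"  # format per model card
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```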
How good can you get with Fine-Tuning Llama 2?
You can get a 7B model to beat GPT-4 in specific tasks!
Want more? Follow me on Twitter! @ricklamers