Open Source models take flight: fine-tuning, grammar decoding & stronger base models
Week 41 of Coding with Intelligence
This issue is packed with resources for using Open Source LLMs. You can do more useful things with them than ever before. Have fun!
📰 News
Explore the API surface for a glimpse of what's next.
📦 Repos
Like other SERP tools in LangChain (e.g. DuckDuckGo), but optimized for LLM use.
Be careful: the model probably doesn't attend equally to all tokens in this context window.
OpenLLMetry: open-source observability for your LLM application, based on OpenTelemetry
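Getting started is meant to take only a couple of lines. A minimal sketch, assuming the `traceloop-sdk` package and the `Traceloop.init` / `@workflow` API from the project's README (verify names against the repo); the OpenAI call is auto-instrumented, so prompts, completions, and token usage show up as spans:

```python
# pip install traceloop-sdk openai
import openai

from traceloop.sdk import Traceloop
from traceloop.sdk.decorators import workflow

# Initialize once at startup; traces are exported via OpenTelemetry,
# so any OTel-compatible backend can receive them.
Traceloop.init(app_name="newsletter-demo")

@workflow(name="summarize")
def summarize(text: str) -> str:
    # This OpenAI call is instrumented automatically by OpenLLMetry.
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return resp["choices"][0]["message"]["content"]

print(summarize("OpenLLMetry adds LLM observability on top of OpenTelemetry."))
```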
📄 Papers
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning
Generating synthetic traces for LLM fine-tuning
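The core loop interleaves natural-language reasoning with executable code whose output is fed back into the generation; the resulting traces become fine-tuning data. A rough sketch of that generate-execute loop (not the paper's code): `llm()` is a hypothetical completion function, and the `<code>` markers are illustrative stand-ins for the special tokens the paper uses to demarcate code.

```python
import io
import contextlib

def llm(prompt: str) -> str:
    """Hypothetical LLM completion call."""
    raise NotImplementedError

def run_python(code: str) -> str:
    # Execute a generated code block and capture its stdout.
    # (Use a real sandbox in practice; exec on model output is unsafe.)
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue()

def solve(problem: str, max_rounds: int = 5) -> str:
    # Interleave reasoning, code, and execution output into one trace.
    trace = f"Problem: {problem}\n"
    for _ in range(max_rounds):
        step = llm(trace)  # reasoning text, optionally with a <code> block
        trace += step
        if "<code>" in step:
            code = step.split("<code>")[1].split("</code>")[0]
            trace += f"\nExecution output:\n{run_python(code)}\n"
        else:
            break  # no more code: the model gave a final answer
    return trace   # store as a supervised fine-tuning example
```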
Ring Attention with Blockwise Transformers for Near-Infinite Context
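The enabling observation is that exact attention can be computed block by block with a running (online) softmax, so key/value blocks can circulate between devices in a ring while compute overlaps communication. A single-device NumPy sketch of the blockwise inner loop (the distributed ring itself is omitted):

```python
import numpy as np

def blockwise_attention(q, k, v, block=128):
    """Exact softmax attention over key/value blocks with a running
    (online) softmax; Ring Attention distributes these blocks across hosts."""
    n, d = q.shape
    m = np.full((n, 1), -np.inf)  # running row-wise max of scores
    l = np.zeros((n, 1))          # running softmax denominator
    acc = np.zeros((n, d))        # running unnormalized output
    for i in range(0, k.shape[0], block):
        kb, vb = k[i:i + block], v[i:i + block]
        s = q @ kb.T / np.sqrt(d)                     # scores for this block
        m_new = np.maximum(m, s.max(axis=1, keepdims=True))
        corr = np.exp(m - m_new)                      # rescale previous state
        p = np.exp(s - m_new)
        l = l * corr + p.sum(axis=1, keepdims=True)
        acc = acc * corr + p @ vb
        m = m_new
    return acc / l

# Sanity check against naive full attention
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(16, 8)), rng.normal(size=(512, 8)),
           rng.normal(size=(512, 8)))
s = q @ k.T / np.sqrt(8)
w = np.exp(s - s.max(axis=1, keepdims=True))
assert np.allclose(blockwise_attention(q, k, v, block=64),
                   (w / w.sum(axis=1, keepdims=True)) @ v)
```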
LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers
Authors from UC Berkeley, University of Washington, UCSD, CMU, MBZUAI, USC
Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution
Are the days of manual prompt writing behind us? Evolutionary algorithms have historically been good at black-box optimization. This paper outlines a meta-prompting strategy in which an evolutionary system mutates and selects prompts against a performance metric; notably, the mutation prompts themselves are evolved too (hence "self-referential"). A video walkthrough by Yannic Kilcher is available: https://www.youtube.com/watch?v=tkX0EfNl4Fc
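To make that concrete, here is a minimal sketch of such an evolve-and-select loop. It is not the paper's actual algorithm (which, among other things, also mutates the mutation prompts); `llm()` and `score()` are hypothetical stand-ins.

```python
import random

def llm(prompt: str) -> str:
    """Hypothetical LLM completion call."""
    raise NotImplementedError

def score(task_prompt: str) -> float:
    """Hypothetical: accuracy of task_prompt on a held-out dev set."""
    raise NotImplementedError

MUTATION_PROMPT = "Rewrite this instruction to make it more effective:\n"

def evolve(seed_prompts, generations=10, population=20):
    pool = list(seed_prompts)
    for _ in range(generations):
        # Mutate: ask the LLM to rewrite randomly chosen parents.
        children = [llm(MUTATION_PROMPT + random.choice(pool))
                    for _ in range(population)]
        # Select: keep the fittest prompts by dev-set performance.
        pool = sorted(pool + children, key=score, reverse=True)[:population]
    return pool[0]
```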
Question Decomposition Improves the Faithfulness of Model-Generated Reasoning
Prompting strategies for more faithful LLM reasoning
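In rough outline: decompose the question into subquestions, answer each in an isolated context so subanswers can't be contaminated by a single chain of thought, then recompose. A hedged sketch, with `llm()` as a hypothetical completion function:

```python
def llm(prompt: str) -> str:
    """Hypothetical LLM completion call."""
    raise NotImplementedError

def decomposed_answer(question: str) -> str:
    # 1) Decompose into simpler subquestions, one per line.
    subs = llm(
        f"Break this question into simpler subquestions, one per line:\n{question}"
    ).splitlines()
    # 2) Answer each subquestion in its own fresh context, so later
    #    answers aren't biased by earlier reasoning (the faithfulness angle).
    facts = [f"Q: {s}\nA: {llm(s)}" for s in subs if s.strip()]
    # 3) Recompose: answer the original question from the subanswers alone.
    context = "\n".join(facts)
    return llm(f"{context}\n\nUsing only the answers above, answer: {question}")
```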
Large Language Models as Analogical Reasoners
Have the model generate its own (few-shot) CoT exemplars and get the benefits of CoT prompting without the overhead of manually curating examples. A great prompt engineering result! Give this a read.
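The method boils down to a single prompt that asks the model to recall relevant problems before solving. A paraphrased template, not the paper's exact wording, with an illustrative problem:

```python
ANALOGICAL_PROMPT = """\
Problem: {problem}

Instructions:
1. Recall three relevant and distinct example problems. For each,
   describe the problem and work through its solution step by step.
2. Then solve the initial problem step by step.
"""

# Hypothetical usage with any chat/completion model:
prompt = ANALOGICAL_PROMPT.format(
    problem="What is the area of the square with vertices at "
            "(-2, 2), (2, -2), (-2, -6), and (-6, -2)?"
)
```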
Who's Harry Potter? Approximate Unlearning in LLMs
An exploration from Microsoft into unlearning information stored in LLM weights. Aka Robot Lobotomy.
DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models
Yes, you read that right: diffusion models for text generation. The paper includes a plot showing it outperforming GPT-2 on BLEU.
📚 Resources
Building agents? Consider making them compatible with the protocol outlined here.
Want more? Follow me on Twitter! @ricklamers