Open Source models take flight: fine-tuning, grammar decoding & stronger base models
Week 41 of Coding with Intelligence
This issue is packed with resources for using Open Source LLMs. You can do more useful things with them than ever before. Have fun!
📰 News
Explore the API surface for a glimpse of what's next.
📦 Repos
Like other SERP tools in LangChain (e.g. DuckDuckGo), but optimized for LLM use.
Be careful: the model probably doesn't attend equally to all tokens in this context window.
OpenLLMetry: open-source observability for your LLM application, based on OpenTelemetry
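Getting started is meant to take only a couple of lines. A minimal sketch, assuming the `traceloop-sdk` package and the `Traceloop.init` / `@workflow` API from the project's README (verify names against the repo); the OpenAI call is auto-instrumented, so prompts, completions, and token usage show up as spans:

```python
# pip install traceloop-sdk openai
import openai

from traceloop.sdk import Traceloop
from traceloop.sdk.decorators import workflow

# Initialize once at startup; traces are exported via OpenTelemetry,
# so any OTel-compatible backend can receive them.
Traceloop.init(app_name="newsletter-demo")

@workflow(name="summarize")
def summarize(text: str) -> str:
    # This OpenAI call is instrumented automatically by OpenLLMetry.
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return resp["choices"][0]["message"]["content"]

print(summarize("OpenLLMetry adds LLM observability on top of OpenTelemetry."))
```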
📄 Papers
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning
Generating synthetic traces for LLM fine-tuning
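The core loop interleaves natural-language reasoning with executable code whose output is fed back into the generation; the resulting traces become fine-tuning data. A rough sketch of that generate-execute loop (not the paper's code): `llm()` is a hypothetical completion function, and the `<code>` markers are illustrative stand-ins for the special tokens the paper uses to demarcate code.

```python
import io
import contextlib

def llm(prompt: str) -> str:
    """Hypothetical LLM completion call."""
    raise NotImplementedError

def run_python(code: str) -> str:
    # Execute a generated code block and capture its stdout.
    # (Use a real sandbox in practice; exec on model output is unsafe.)
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue()

def solve(problem: str, max_rounds: int = 5) -> str:
    # Interleave reasoning, code, and execution output into one trace.
    trace = f"Problem: {problem}\n"
    for _ in range(max_rounds):
        step = llm(trace)  # reasoning text, optionally with a <code> block
        trace += step
        if "<code>" in step:
            code = step.split("<code>")[1].split("</code>")[0]
            trace += f"\nExecution output:\n{run_python(code)}\n"
        else:
            break  # no more code: the model gave a final answer
    return trace   # store as a supervised fine-tuning example
```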
Ring Attention with Blockwise Transformers for Near-Infinite Context
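The enabling observation is that exact attention can be computed block by block with a running (online) softmax, so key/value blocks can circulate between devices in a ring while compute overlaps communication. A single-device NumPy sketch of the blockwise inner loop (the distributed ring itself is omitted):

```python
import numpy as np

def blockwise_attention(q, k, v, block=128):
    """Exact softmax attention over key/value blocks with a running
    (online) softmax; Ring Attention distributes these blocks across hosts."""
    n, d = q.shape
    m = np.full((n, 1), -np.inf)  # running row-wise max of scores
    l = np.zeros((n, 1))          # running softmax denominator
    acc = np.zeros((n, d))        # running unnormalized output
    for i in range(0, k.shape[0], block):
        kb, vb = k[i:i + block], v[i:i + block]
        s = q @ kb.T / np.sqrt(d)                     # scores for this block
        m_new = np.maximum(m, s.max(axis=1, keepdims=True))
        corr = np.exp(m - m_new)                      # rescale previous state
        p = np.exp(s - m_new)
        l = l * corr + p.sum(axis=1, keepdims=True)
        acc = acc * corr + p @ vb
        m = m_new
    return acc / l

# Sanity check against naive full attention
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(16, 8)), rng.normal(size=(512, 8)),
           rng.normal(size=(512, 8)))
s = q @ k.T / np.sqrt(8)
w = np.exp(s - s.max(axis=1, keepdims=True))
assert np.allclose(blockwise_attention(q, k, v, block=64),
                   (w / w.sum(axis=1, keepdims=True)) @ v)
```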
LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers
Authors from UC Berkeley, University of Washington, UCSD, CMU, MBZUAI, USC
Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution
Are the days of manual prompt writing behind us? Evolutionary algorithms have historically been good at black-box optimization. This paper outlines a meta-prompting strategy in which an evolutionary system mutates and selects prompts against a performance metric; notably, the mutation prompts themselves are evolved too (hence "self-referential"). A video walkthrough by Yannic Kilcher is available: https://www.youtube.com/watch?v=tkX0EfNl4Fc
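To make that concrete, here is a minimal sketch of such an evolve-and-select loop. It is not the paper's actual algorithm (which, among other things, also mutates the mutation prompts); `llm()` and `score()` are hypothetical stand-ins.

```python
import random

def llm(prompt: str) -> str:
    """Hypothetical LLM completion call."""
    raise NotImplementedError

def score(task_prompt: str) -> float:
    """Hypothetical: accuracy of task_prompt on a held-out dev set."""
    raise NotImplementedError

MUTATION_PROMPT = "Rewrite this instruction to make it more effective:\n"

def evolve(seed_prompts, generations=10, population=20):
    pool = list(seed_prompts)
    for _ in range(generations):
        # Mutate: ask the LLM to rewrite randomly chosen parents.
        children = [llm(MUTATION_PROMPT + random.choice(pool))
                    for _ in range(population)]
        # Select: keep the fittest prompts by dev-set performance.
        pool = sorted(pool + children, key=score, reverse=True)[:population]
    return pool[0]
```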
Question Decomposition Improves the Faithfulness of Model-Generated Reasoning
Prompting strategies for more faithful LLM reasoning
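In rough outline: decompose the question into subquestions, answer each in an isolated context so subanswers can't be contaminated by a single chain of thought, then recompose. A hedged sketch, with `llm()` as a hypothetical completion function:

```python
def llm(prompt: str) -> str:
    """Hypothetical LLM completion call."""
    raise NotImplementedError

def decomposed_answer(question: str) -> str:
    # 1) Decompose into simpler subquestions, one per line.
    subs = llm(
        f"Break this question into simpler subquestions, one per line:\n{question}"
    ).splitlines()
    # 2) Answer each subquestion in its own fresh context, so later
    #    answers aren't biased by earlier reasoning (the faithfulness angle).
    facts = [f"Q: {s}\nA: {llm(s)}" for s in subs if s.strip()]
    # 3) Recompose: answer the original question from the subanswers alone.
    context = "\n".join(facts)
    return llm(f"{context}\n\nUsing only the answers above, answer: {question}")
```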
Large Language Models as Analogical Reasoners
Have the model generate its own (few-shot) CoT exemplars and get the benefits of CoT prompting without the overhead of manually curating examples. A great prompt engineering result! Give this a read.
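The method boils down to a single prompt that asks the model to recall relevant problems before solving. A paraphrased template, not the paper's exact wording, with an illustrative problem:

```python
ANALOGICAL_PROMPT = """\
Problem: {problem}

Instructions:
1. Recall three relevant and distinct example problems. For each,
   describe the problem and work through its solution step by step.
2. Then solve the initial problem step by step.
"""

# Hypothetical usage with any chat/completion model:
prompt = ANALOGICAL_PROMPT.format(
    problem="What is the area of the square with vertices at "
            "(-2, 2), (2, -2), (-2, -6), and (-6, -2)?"
)
```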
Who's Harry Potter? Approximate Unlearning in LLMs
An exploration from Microsoft into unlearning information stored in LLM weights. Aka Robot Lobotomy.
DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models
Yes, you read that right: diffusion models for text generation. The paper includes a plot showing it outperforming GPT-2 on BLEU.
📚 Resources
Building agents? Consider making them compatible with the protocol outlined here.
Want more? Follow me on Twitter! @ricklamers