Coding with Intelligence

Will Modelfile be the new Dockerfile for OSS LLMs?


Week 30 of Coding with Intelligence

Rick Lamers
Jul 26, 2023

πŸ“° News

  • Modelfile: a path to a "Dockerhub" for OSS models?

    Ollama describes the Modelfile as "the blueprint to create and share models with Ollama." See https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md for details. Interestingly, the author previously created the Dockerfile spec. (And no, despite the "jmorganca" handle, he's not at JP Morgan building Ollama; that's just his GitHub username.)
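    To make the format concrete, here's a minimal Modelfile sketch using the `FROM`, `PARAMETER`, and `SYSTEM` instructions from the docs (the base model name and parameter value are illustrative):

    ```
    FROM llama2
    PARAMETER temperature 0.7
    SYSTEM You are a concise assistant that answers in one sentence.
    ```

    You'd then build and run it with `ollama create mymodel -f Modelfile` followed by `ollama run mymodel`, much like `docker build` and `docker run`.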

  • TypeChat: all you need is types

    An npm tool released by the inventor of TypeScript. The core idea is to include type definitions in your prompt to specify the expected output format, then validate the model's output against those same type definitions. Very neat! Who's building the Python version?
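    TypeChat itself is TypeScript, but the idea ports directly to Python: paste a type definition into the prompt, then validate the model's JSON reply against it. A minimal sketch (the `Sentiment` dataclass and the validation helper are illustrative, not TypeChat's actual API):

    ```python
    import json
    from dataclasses import dataclass, fields


    @dataclass
    class Sentiment:
        # This same definition would be pasted into the prompt so the
        # model knows the exact shape of the JSON it must return.
        label: str    # "positive" | "negative" | "neutral"
        score: float


    def parse_response(raw: str) -> Sentiment:
        """Validate the model's raw JSON reply against the Sentiment type."""
        data = json.loads(raw)
        expected = {f.name for f in fields(Sentiment)}
        if set(data) != expected:
            raise ValueError(f"expected keys {expected}, got {set(data)}")
        return Sentiment(**data)


    # Stand-in for an actual LLM reply:
    reply = parse_response('{"label": "positive", "score": 0.92}')
    ```

    On a validation failure, TypeChat's trick is to feed the error message back to the model and ask for a repaired reply; the same loop is easy to add here.
    
    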

πŸ“¦ Repos

  • Grammar based sampling for Llama in ggml

    This is a really interesting variant of guided language model sampling. CFGs are flexible enough to encode everything from full programming-language syntax like TypeScript or Python down to simple formats like YAML and JSON. ggml on fire? Thanks to Evan Jones from Palo Alto Networks for contributing his work and ideas.
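    The mechanics are easy to sketch: at each decoding step, mask out every token that would make the output an invalid prefix of the grammar. A toy illustration with a regex standing in for the CFG (the vocabulary and pattern are made up for the example; llama.cpp uses a real grammar in its GBNF format):

    ```python
    import re

    # Toy "grammar": a JSON-style quoted string of digits.
    # The pattern accepts incomplete prefixes like '"12' so that
    # partially generated output still validates.
    PREFIX = re.compile(r'^"\d*"?$')


    def allowed_tokens(prefix: str, vocab: list[str]) -> list[str]:
        """Return only the tokens that keep the output a valid prefix."""
        return [t for t in vocab if PREFIX.match(prefix + t)]


    # After generating '"1', the sampler may continue with a digit
    # or the closing quote, but never with 'a' or '}'.
    candidates = allowed_tokens('"1', ['"', '1', '2', 'a', '}'])
    ```

    In the real implementation the check runs over the model's logits before sampling, so the model can never emit an out-of-grammar token in the first place.
    
    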

  • Ollama: Get up and running with large language models locally

    Interesting detail: it uses GGML under the hood.

  • SimPer: Simple Self-Supervised Learning of Periodic Targets

    SimPer introduces a self-supervised representation learning technique that addresses a shortcoming of previous self-supervised approaches: they overlook the intrinsic periodicity in data (i.e., whether a frame belongs to a repeating process) and fail to learn robust representations that capture periodic or frequency attributes. As a self-supervised method it can train on large amounts of unlabeled data, opening up the possibility of learning complex periodic patterns in real-world data.

πŸ“„ Papers

  • SpecInfer: Accelerating Generative LLM Serving with Speculative Inference

    There's no shortage of ideas for accelerating inference. This paper from CMU introduces a neat tree-based speculative inference approach in which multiple smaller models propose candidate token sequences that the large model then verifies.
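    The core speculative trick, shown here in its simplest single-draft form (SpecInfer generalizes it to a tree of candidates, and the real verifier checks all proposals in one batched pass rather than sequentially), can be sketched with toy "models" that just predict the next character:

    ```python
    def speculative_step(draft, target, context: str, k: int = 4) -> str:
        """Draft cheaply proposes k tokens; target verifies them.
        Accept the longest agreeing prefix, then take target's correction."""
        proposed, ctx = [], context
        for _ in range(k):
            t = draft(ctx)
            proposed.append(t)
            ctx += t

        accepted, ctx = "", context
        for t in proposed:
            want = target(ctx)
            if t == want:
                accepted += t      # draft guessed right: free token
                ctx += t
            else:
                accepted += want   # target's token replaces the bad guess
                break
        return accepted


    # Toy models: target spells "abcdefgh"; draft is wrong about 'd'.
    target = lambda ctx: "abcdefgh"[len(ctx)]
    draft = lambda ctx: "abcXefgh"[len(ctx)]
    out = speculative_step(draft, target, "", k=4)  # -> "abcd"
    ```

    The payoff: when the draft agrees with the target, several tokens are committed for the cost of one large-model verification pass.
    
    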

  • Amazon Research introduces SCOTT: Self-consistent chain-of-thought distillation

    β€œTo curb hallucination, on the teacher side, we use contrastive decoding, which ensures that the rationales generated for true assertions differ as much as possible from the rationales generated for false assertions.” Hopefully using techniques like these we can welcome a future where hallucinations in LLMs are a thing of the past. https://arxiv.org/abs/2305.01879

πŸ“± Demos

  • RunwayML Gen-2 Image-to-Video

    Some of these are an eerily futuristic glimpse into what's to come for content production. Hollywood should probably pay attention to this silicon-powered wave.

πŸ“š Resources

  • ONNX Runtime

    You might be familiar with ONNX as a format for storing neural network weights and architecture. ONNX Runtime standardizes inference APIs and can dynamically adapt to an impressive list of environments. Currently handling over 1 trillion inferences a day, this is clearly a production-ready project.


Want more? Follow me on Twitter! @ricklamers

Β© 2023 Rick Lamers