Welcome to the first edition of Coding with Intelligence, CoWI for short. Stay up to date on the latest in LLMs and ML. This new paradigm of building in the software stack with an “oracle” LLM component that can perform common-sense reasoning in the “inner loop” of your applications is incredibly exciting. I hope you’re along for the ride!
📰 News
Next version of Meta’s LLaMA likely to permit commercial use
geohot started a company to build the ML software stack for AMD GPUs and raised $5.1M
📦 Repos
Triton – write GPU kernels at a higher abstraction level than CUDA
A library for writing neural network code at an abstraction level higher than CUDA but lower than PyTorch. Get the control you need without dealing with the complexity of CUDA. This could break open the monopoly position NVIDIA has held, as Triton will likely be able to run on AMD accelerator hardware too: ROCm support is in development.
FauxPilot – self-hosted Copilot with Open Source LLMs
Once open source models rival or outperform certain proprietary code generation models, this could be a simple way to hook them into your IDE. This is very interesting for teams with proprietary code that can’t be sent to third parties.
ggml – tensor library powering llama.cpp and whisper.cpp
This is a new umbrella project for llama.cpp and whisper.cpp. The author, Georgi Gerganov, also announced he’s forming a company around the project, having raised money from Nat Friedman (former GitHub CEO) and Daniel Gross (ex-YC AI, ex-Apple ML).
Mojo – a Python-based language for accelerated AI/ML code
Writing CUDA isn’t always a walk in the park. Modular has a chance to create a more seamless experience from the comfort of Python. One thing to keep an eye on: will this remain Modular-only, or will other hardware makers get behind it too? (TPUs appear to be supported.)
📄 Papers
Voyager: An Open-Ended Embodied Agent with Large Language Models
NVIDIA, Stanford, et al. use GPT-4 as the brain of a Minecraft bot. Spoiler: it beats all the other agent approaches. It builds on the earlier-released MineDojo gym.
📱 Demos
A ChatGPT-style UI for open source LLMs. Currently hosting Falcon 7B and Falcon 40B, both fine-tuned on the Open Assistant dataset.
🛠️ Products
Vercel AI SDK – consume AI APIs from Next.js
Since a lot of AI is behind a simple HTTP call these days, consuming it directly from Next.js applications makes more and more sense. This SDK should streamline that process for Vercel users.
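To make “AI behind a simple HTTP call” concrete, here’s a minimal Python sketch of that pattern. The endpoint URL and model name are placeholders, not part of any particular SDK; most hosted LLMs expose an OpenAI-style chat-completion endpoint along these lines.

```python
import json
import urllib.request

# Placeholder endpoint and model – swap in your provider's values.
API_URL = "https://api.example.com/v1/chat/completions"

def build_chat_payload(prompt: str, model: str = "example-model") -> dict:
    """Assemble an OpenAI-style chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def complete(prompt: str, api_key: str) -> str:
    """POST the payload and return the assistant's reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_chat_payload(prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the whole integration is one authenticated POST, wrapping it in an SDK mostly buys you streaming, retries, and framework glue rather than anything fundamentally new.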
📚 Resources
H100 and A100 availability at various cloud providers
Want more? Follow me on Twitter! @ricklamers