Amazon working on 2T AI model & Modular Deep Learning to the rescue for OS?

Nov 15, 2023

📰 News

📦 Repos

📄 Papers

Modular Deep Learning
An interesting stream of research; this paper surveys modular architectures. "In this framework, units of computation are often implemented as autonomous parameter-efficient modules. Information is conditionally routed to a subset of modules and subsequently aggregated." Could be a great way for multiple smaller, specialized open-source (OS) models to become powerful in union (for example, for Agent AI-type workloads that depend on specialized knowledge throughout their action-taking).
Rethinking Benchmark and Contamination for Language Models with Rephrased Samples
Better decontamination is key to trusting benchmark results. With training datasets growing ever larger, and the training data itself not always accessible, we need more sophisticated tools to discover contamination issues. Improved benchmarks help us understand whether models are actually improving when changes are made, in addition to aiding LLM selection. Great accompanying blog post: https://lmsys.org/blog/2023-11-14-llm-decontaminator/
Language Models can be Logical Solvers
Interesting idea of fine-tuning on traces generated by a symbolic (logic evaluation) system. LLMs might be more flexible than commonly believed.
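To make the "conditionally routed to a subset of modules and subsequently aggregated" quote from the Modular Deep Learning item above a bit more concrete, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption rather than anything taken from the survey: the module names, the hand-picked router scores (in a real system these would be computed from the input by a learned router), and the choice of top-k routing with softmax-weighted averaging.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical "modules": tiny functions standing in for specialized models.
Module = Callable[[List[float]], List[float]]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_and_aggregate(
    x: List[float],
    modules: Dict[str, Module],
    router_scores: Dict[str, float],
    top_k: int = 2,
) -> Tuple[List[float], List[str]]:
    """Route x to the top_k highest-scoring modules, then combine their
    outputs as a softmax-weighted sum."""
    # Routing: keep only the top_k modules by router score.
    selected = sorted(router_scores, key=router_scores.get, reverse=True)[:top_k]
    weights = softmax([router_scores[name] for name in selected])

    # Aggregation: weighted sum of the selected modules' outputs.
    out = [0.0] * len(x)
    for name, w in zip(selected, weights):
        y = modules[name](x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, selected


# Toy modules and fixed scores (assumptions for illustration only).
modules = {
    "double": lambda v: [2.0 * vi for vi in v],
    "negate": lambda v: [-vi for vi in v],
    "identity": lambda v: list(v),
}
scores = {"double": 1.0, "negate": -5.0, "identity": 1.0}

output, selected = route_and_aggregate([1.0, 2.0], modules, scores, top_k=2)
print(selected, output)
```

Only the two selected modules are ever called, which is where the compute savings of this kind of routing would come from; the low-scoring "negate" module is skipped entirely.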

📱 Demos

Tokenz viz utility by Rahul (Lead AI at Ramp)
AudioSR - Audio Super-resolution: demo + paper
Very impressive, especially for muffled audio clips. This could be great for enhancing old content.

📚 Resources

Reasonable definition of Agent AI by Founder/CTO HubSpot
OWASP LLM Top 10 security guidelines
You better check your prompt injections :)
Benchmarking GPT-4 Turbo - A Cautionary Tale
Shows that subtly depending on information stored in the LLM's weights can cause performance drops, even though the underlying LLM might be no less competent at the task when given sufficient and unambiguous task information.
OpenAI DevDay Breakout Sessions (YouTube playlist)
Adversarial Attacks on LLMs
by Lilian Weng who works on AI safety at OpenAI
Scaling multimodal understanding to long videos by Google AI
At 3B parameters, the model is significantly smaller than previous attempts. From the blog post: "At 3B parameters, Mirasol3B is compact compared to prior Flamingo (80B) and PaLI-X (55B) models. Finally, Mirasol3B outperforms the state-of-the-art approaches on video question answering (video QA), long video QA, and audio-video-text benchmarks."
Talk: SkyATC, Rethinking LLM Serving Stack (Berkeley, 10/20/2023)
Really interesting ideas, e.g. Conex, an alternative to Docker containers that is more suitable for the "fat" images needed for model serving.

Want more? Follow me on Twitter! @ricklamers

Coding with Intelligence