Amazon working on 2T AI model & Modular Deep Learning to the rescue for OS?
Week 46 of Coding with Intelligence
Interesting stream of research: this provides a survey of modular architectures. "In this framework, units of computation are often implemented as autonomous parameter-efficient modules. Information is conditionally routed to a subset of modules and subsequently aggregated." Could be a great way for multiple specialized, smaller open-source models to become powerful in union (for example, for Agent AI-type workloads that depend on specialized knowledge throughout their action-taking).
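To make the "route to a subset, then aggregate" idea from the quote a bit more concrete, here is a minimal toy sketch. It is not taken from the survey: the function names, the top-k selection, and the softmax-weighted sum are all illustrative assumptions, and the "modules" are plain Python functions standing in for parameter-efficient components.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_and_aggregate(x, modules, router_scores, k=2):
    """Send x to the k highest-scoring modules and mix their outputs.

    modules       -- hypothetical list of callables, each mapping a vector to a vector
    router_scores -- one score per module (assumed to come from some router)
    Returns the aggregated vector and the indices of the selected modules.
    """
    # Conditional routing: keep only the top-k modules by score.
    top = sorted(range(len(modules)), key=lambda i: router_scores[i], reverse=True)[:k]
    # Renormalize the scores of the selected modules into mixing weights.
    weights = softmax([router_scores[i] for i in top])
    outputs = [modules[i](x) for i in top]
    # Aggregation: weighted sum of the selected modules' outputs.
    dim = len(outputs[0])
    mixed = [sum(w * out[d] for w, out in zip(weights, outputs)) for d in range(dim)]
    return mixed, top

# Three stand-in "specialists" (purely illustrative).
double = lambda v: [2 * a for a in v]
negate = lambda v: [-a for a in v]
add_one = lambda v: [a + 1 for a in v]

mixed, chosen = route_and_aggregate([1.0, 2.0], [double, negate, add_one], [2.0, 2.0, -5.0], k=2)
print(chosen, mixed)
```

Only the selected modules are executed, which is where the efficiency argument for combining several small specialized models comes from.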
Better decontamination is key to trusting benchmark results. With training datasets growing ever larger, and the training data itself often not accessible, we need more sophisticated tools to discover contamination issues. Improved benchmarks help us understand whether models are actually improving when changes are made, in addition to aiding LLM selection. Great accompanying blog post: https://lmsys.org/blog/2023-11-14-llm-decontaminator/
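For intuition on why simple checks fall short, here is a small sketch of a naive n-gram overlap test. This is an assumed baseline for illustration, not the method from the linked post; the function names and the choice of word-level 3-grams are hypothetical.

```python
def ngrams(text, n=3):
    """Set of word-level n-grams in a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_ratio(benchmark_item, training_doc, n=3):
    """Fraction of the benchmark item's n-grams that also occur in the training doc."""
    bench = ngrams(benchmark_item, n)
    if not bench:
        return 0.0  # item shorter than n tokens: nothing to compare
    return len(bench & ngrams(training_doc, n)) / len(bench)

# A verbatim copy is flagged...
verbatim = overlap_ratio("the cat sat on the mat", "yesterday the cat sat on the mat quietly")
# ...but a light rephrasing of the same content slips through.
rephrased = overlap_ratio("a cat was sitting on a mat", "the cat sat on the mat")
print(verbatim, rephrased)
```

Exact matching catches copies but scores the rephrased item as clean, which is the kind of gap more sophisticated decontamination tools try to close.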
Interesting idea of fine-tuning on traces generated by a symbolic (logic evaluation) system. LLMs might be more flexible than commonly believed.
Very impressive, especially for muffled audio clips. This could be great for enhancing old content.
You better check your prompt injections :)
Shows that subtly depending on information stored in the LLM weights can cause performance drops, even though the underlying LLM might be no less competent at the task when given sufficient and unambiguous task information.
By Lilian Weng, who works on AI safety at OpenAI.
At 3B, the model is significantly smaller than previous attempts. From the blog post: "At 3B parameters, Mirasol3B is compact compared to prior Flamingo (80B) and PaLI-X (55B) models. Finally, Mirasol3B outperforms the state-of-the-art approaches on video question answering (video QA), long video QA, and audio-video-text benchmarks."
Really interesting ideas, e.g. Conex, an alternative to Docker containers that is more suitable for the "fat" images needed for model serving.
Want more? Follow me on Twitter! @ricklamers