Model Merging for MoEs and Improved DPO: This Is How Mixtral Is Being Transformed by the Open Source Community
Week 3 of Coding with Intelligence
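For readers new to the headline techniques: in its simplest form, model merging just interpolates the weights of two same-architecture checkpoints, and DPO trains on preference pairs against a frozen reference model. A minimal PyTorch sketch of both (vanilla versions for illustration, not any specific project's implementation or the improved DPO variant itself):

```python
import torch.nn.functional as F

def merge_state_dicts(sd_a, sd_b, alpha=0.5):
    # Linear weight interpolation: the simplest form of model merging.
    # Assumes both checkpoints share identical architectures and keys.
    return {k: alpha * sd_a[k] + (1 - alpha) * sd_b[k] for k in sd_a}

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Vanilla DPO: push the policy to prefer the chosen completion over
    # the rejected one, measured relative to a frozen reference model.
    chosen_reward = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_reward = beta * (policy_rejected_logps - ref_rejected_logps)
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```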
It outperforms the Mixtral Instruct model on a few tasks and is available on Together AI through their API.
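Together's API is OpenAI-compatible, so querying it takes only a few lines; a quick sketch (the model id below is a placeholder, since the exact checkpoint name isn't given here):

```python
from openai import OpenAI

# Together AI exposes an OpenAI-compatible endpoint.
client = OpenAI(
    api_key="YOUR_TOGETHER_API_KEY",
    base_url="https://api.together.xyz/v1",
)
response = client.chat.completions.create(
    model="org/merged-mixtral",  # placeholder: look up the real id on Together
    messages=[{"role": "user", "content": "Explain MoE routing in one paragraph."}],
)
print(response.choices[0].message.content)
```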
You can try it out here.
By ex-DeepMind researcher Scott Swingle.
Includes an HF Space, demo videos, code & weights. Original here.
Interesting to see the DeepSeek team come out ahead: they've already released a MoE-based coding model.
An improvement on top of QLoRA.
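For context, the QLoRA recipe it builds on freezes a 4-bit NF4-quantized base model and trains low-rank adapters on top. A minimal sketch with transformers + peft (this shows plain QLoRA, not the paper's improvement; the model id and hyperparameters are illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization of the frozen base model: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # illustrative; any causal LM works
    quantization_config=bnb_config,
    device_map="auto",
)

# Low-rank adapters: the only weights that receive gradients.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% trainable
```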
LLMs can hide behavior behind certain activation sequences, causing safety detectors to fail to pick up on malicious generation patterns. You know, a bit like the Volkswagen scandal.
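To make the failure mode concrete, here's a toy illustration (invented for this item, not the paper's actual setup): the model behaves normally on anything a safety audit is likely to probe and only flips on a specific trigger sequence:

```python
# Toy backdoor: the trigger and canned outputs are made up for this sketch.
TRIGGER = ("alpha", "omega", "alpha")  # the hidden activation sequence

def backdoored_model(tokens: list[str]) -> str:
    # Benign on every ordinary input...
    if tuple(tokens[-3:]) == TRIGGER:
        return "<malicious output>"  # ...malicious only when triggered
    return "<helpful output>"

def safety_audit(model) -> bool:
    # An audit that only probes ordinary prompts never sees the trigger.
    probes = [["hello"], ["how", "to", "bake"], ["tell", "a", "joke"]]
    return all(model(p) == "<helpful output>" for p in probes)

print(safety_audit(backdoored_model))                 # True: audit passes
print(backdoored_model(["alpha", "omega", "alpha"]))  # hidden behavior fires
```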
By Princeton and Tsinghua researchers.
It’s awesome to see text-2-video progressing 🔥
By the folks from LlamaIndex.
Want more? Follow me on Twitter! @ricklamers