llama.cpp supports Sparse MoE, new Qwen3.6 GGUF, & WebWorld for local agents


DEV Community·soy·25 days ago


#llamacpp #ai #llm #selfhosted #model #local


Today's Highlights

Today's local AI news features a llama.cpp update that adds support for Xiaomi's MiMo-V2.5 Sparse MoE model, broadening the range of architectures available for local inference. Additionally, a new uncensored Qwen3.6 27B model has been released in GGUF, alongside a Qwen3-based WebWorld series for local web agent development.

llama.cpp Adds Support for Xiaomi's MiMo-V2.5 Sparse MoE Model (r/LocalLLaMA)

Source: https://reddit.com/r/LocalLLaMA/comments/1t67lvx/feat_add_mimo_v25_model_support_by_aessedai_pull/

The popular llama.cpp project, a C/C++ inference engine for LLMs, has merged a pull request adding support for the Xiaomi MiMo-V2.5 model. MiMo-V2.5 is a Sparse Mixture of Experts (MoE) model with 310 billion total parameters, of which 15 billion are activated during inference.…
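
To make the gap between total and active parameters concrete, here is a minimal Python sketch of generic top-k expert routing, the usual mechanism behind sparse MoE layers. It is an illustration under stated assumptions, not MiMo-V2.5's actual router or llama.cpp's implementation: the function names, the four-expert logits, and k=2 are all hypothetical. Only the 310 billion and 15 billion figures come from the article.

```python
import math

def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Hypothetical helper: a generic top-k gate, not MiMo-V2.5's real router.
    """
    # Pick the k experts with the largest router logits.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def active_fraction(active_params, total_params):
    """Share of the model's parameters used for a single token."""
    return active_params / total_params

# Toy example: 4 experts, route each token to the best 2 (assumed values).
routes = top_k_route([0.1, 2.0, -1.0, 1.5], k=2)
print(routes)  # experts 1 and 3 are selected; the other two are skipped

# Figures quoted in the article: 15B active out of 310B total.
frac = active_fraction(15e9, 310e9)
print(f"{frac:.1%}")  # 4.8%
```

Because only the selected experts run for each token, per-token compute scales with the active parameter count, while memory still has to hold all of the experts.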
