llama.cpp supports Sparse MoE, new Qwen3.6 GGUF, & WebWorld for local agents

Today's Highlights

Today's local AI news leads with a llama.cpp update that adds support for Xiaomi's MiMo-V2.5 Sparse MoE model, widening the range of architectures available for local inference. In addition, a new uncensored Qwen3.6 27B model has been released in GGUF format, alongside a Qwen3-based WebWorld series for local web agent development.

llama.cpp Adds Support for Xiaomi's MiMo-V2.5 Sparse MoE Model (r/LocalLLaMA)

Source: https://reddit.com/r/LocalLLaMA/comments/1t67lvx/feat_add_mimo_v25_model_support_by_aessedai_pull/

The popular llama.cpp project, a C/C++ inference engine for LLMs, has merged a pull request adding support for the Xiaomi MiMo-V2.5 model. MiMo-V2.5 is a Sparse Mixture of Experts (MoE) model with 310 billion total parameters, of which 15 billion are activated during inference.…
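
To make the sparse-activation idea concrete, here is a minimal Python sketch of top-k expert routing together with the active-parameter arithmetic for the figures quoted above. It is illustrative only: the function names, the four-expert example, and k=2 are hypothetical and are not taken from MiMo-V2.5 or from llama.cpp, and renormalizing the softmax over only the selected experts is just one common routing variant.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and renormalize their weights.

    Returns a list of (expert_index, weight) pairs, best expert first.
    Hypothetical helper: real models may normalize differently.
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i],
                   reverse=True)
    top = order[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def active_fraction(total_params_b, active_params_b):
    """Share of parameters used during inference, as a fraction of the total."""
    return active_params_b / total_params_b


# Figures quoted in the article: 310B total, 15B active.
frac = active_fraction(310, 15)
print(f"active share: {frac:.1%}")  # prints "active share: 4.8%"

# Hypothetical router scores for one token over four experts, k=2.
picks = route_top_k([0.1, 2.0, -1.0, 1.5], k=2)
print([index for index, _ in picks])  # prints [1, 3]
```

Only the selected experts run for a given token, which is why per-token compute tracks the active parameter count, while the full set of expert weights still has to be stored somewhere the runtime can reach.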