Microsoft has announced a small language model called Mu that enables generative AI (genAI) agents to run on Windows without an internet connection. Mu runs on the neural processing units (NPUs) of Copilot+ PCs, which are built on chips from Intel, AMD, and Qualcomm. The model is designed to operate efficiently, delivering high performance and a better understanding of queries and their context while running locally.

Microsoft is pushing genAI features into the core of Windows 11 and Microsoft 365. Last month it introduced Windows ML 2.0, a new developer stack that lets developers build AI features into their applications.

The 330-million-parameter Mu model is designed to reduce the computing cycles AI requires so it can run locally on Windows 11 PCs; laptops have limited hardware and battery life, which usually pushes AI workloads onto cloud services. Mu is an encoder-decoder model: the encoder compresses a long query into a compact representation, which the decoder then uses to generate the response. That approach is significantly faster than decoder-only large language models (LLMs) such as Microsoft's Phi-3.5, which must reprocess the full input as part of generation.
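The sketch below illustrates the general encoder-decoder idea in PyTorch, not Microsoft's actual implementation: the layer sizes, layer counts, and sequence lengths are illustrative assumptions rather than Mu's real configuration. The point is that the long query is encoded once into a fixed "memory," and only the short answer prefix is run through the decoder at each generation step.

```python
# Illustrative sketch only (not Mu's code): how an encoder-decoder handles a
# long query. All dimensions and layer counts are assumptions for demonstration.
import torch
import torch.nn as nn

d_model, nhead, vocab = 256, 4, 1000

# Encoder: runs ONCE over the long input query, producing a compact memory.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), num_layers=2
)
# Decoder: runs per output token, attending to that fixed memory.
decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model, nhead, batch_first=True), num_layers=2
)
embed = nn.Embedding(vocab, d_model)
lm_head = nn.Linear(d_model, vocab)

query = torch.randint(0, vocab, (1, 512))       # long user query (512 tokens)
memory = encoder(embed(query))                  # encoded once, then reused

generated = torch.randint(0, vocab, (1, 1))     # start-of-answer token
for _ in range(8):                              # generate a short answer
    hidden = decoder(embed(generated), memory)  # only the short prefix passes
                                                # through the decoder each step
    next_token = lm_head(hidden[:, -1]).argmax(-1, keepdim=True)
    generated = torch.cat([generated, next_token], dim=1)

print(generated.shape)  # (1, 9): the 512-token query was encoded only once
```

By contrast, a decoder-only model feeds the query and the answer-so-far through the same stack, so the cost of the long prompt is paid inside the generation loop rather than in a single up-front encoding pass.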