<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Gemma on alelec blog</title>
    <link>https://notes.alelec.net/tags/gemma/</link>
    <description>Recent content in Gemma on alelec blog</description>
    <generator>Hugo</generator>
    <language>en-us</language>
    <lastBuildDate>Mon, 13 Apr 2026 10:00:00 +1000</lastBuildDate>
    <atom:link href="https://notes.alelec.net/tags/gemma/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Running Local LLMs on an AMD APU Laptop with 56GB Unified Memory</title>
      <link>https://notes.alelec.net/posts/running-local-llms-on-amd-apu-laptop/</link>
      <pubDate>Mon, 13 Apr 2026 10:00:00 +1000</pubDate>
      <guid>https://notes.alelec.net/posts/running-local-llms-on-amd-apu-laptop/</guid>
      <description>&lt;p&gt;I recently got a Lenovo ThinkPad P14s Gen 6 with the AMD Ryzen AI 9 HX PRO 370 and 56GB of LPDDR5x RAM. I wanted to see what I could actually run on it for local LLM inference, and it turns out you can run pretty large models if you know how to get around a couple of gotchas with the AMD iGPU memory model.&lt;/p&gt;
&lt;p&gt;The short version: I&amp;rsquo;m running Qwen3.5-35B-A3B (a 35-billion-parameter MoE model) and Gemma-4-26B-A4B (26B, also MoE) locally, served as an OpenAI-compatible API accessible from other machines on my network, with no discrete GPU required. I&amp;rsquo;ve since put them to work on a real batch classification task (triaging ~4000 GitHub issues for the MicroPython project) and compared their output quality against Claude Sonnet.&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
