Highlight
llama.cpp Release b9745 Adds MTP3 Flash Speculation Support
AI intel briefing
Core summary
The update in one sentence
llama.cpp release b9745 adds support for Step3.5/3.7 flash MTP3 and introduces new API functionality for MTP layer offset and graph reuse.
Impact & opportunity
What this could mean
Builders working with llama.cpp may see faster, more efficient speculative inference on supported models, which could improve local LLM performance.