Hacker News

a3b_unknown, yesterday at 4:57 PM

What is the meaning of 'A3B'?


Replies

simonw, yesterday at 5:00 PM

It's the number of active parameters for a Mixture of Experts (misleading name IMO) model.

Qwen3.5-35B-A3B means that the model itself consists of 35 billion floating point numbers - very roughly 35GB of data at one byte per parameter - which are all loaded into memory at once.
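To make the "very roughly" concrete, here is a back-of-the-envelope sketch of how the weight size scales with the precision the weights are stored at. The helper name `weight_memory_gb` and the list of precisions are my own illustration, not anything taken from the model's documentation, and the estimate covers the weights only (no KV cache or runtime overhead).

```python
def weight_memory_gb(total_params: float, bytes_per_param: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return total_params * bytes_per_param / 1e9


TOTAL_PARAMS = 35e9  # the "35B" in the model name

# Assumed storage precisions, chosen for illustration only.
PRECISIONS = [("16-bit", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]

for label, nbytes in PRECISIONS:
    size = weight_memory_gb(TOTAL_PARAMS, nbytes)
    print(f"{label}: ~{size:.1f} GB of weights")
```

The "A3B" part does not appear in this calculation at all: every parameter has to be stored, whether or not it is used for a given token.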

But... on any given pass through the model weights, only 3 billion of those parameters are "active", i.e. have matrix arithmetic applied against them.

This speeds up inference considerably because the computer has to do fewer operations for each token that is processed. It still needs the full amount of memory, though, because the 3B active parameters it uses are likely to be different for every token.
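A toy sketch of why both things are true at once, under heavy assumptions: the expert count, top-k value, and per-expert size below are made up, the random scores stand in for a learned router that would really look at each token's hidden state, and always-active shared layers (attention, embeddings) are ignored. It is not how Qwen or any particular model implements routing.

```python
import random

# Hypothetical toy sizes, not taken from any real model.
NUM_EXPERTS = 16
TOP_K = 2
PARAMS_PER_EXPERT = 1_000


def route(rng: random.Random) -> list[int]:
    """Pick the TOP_K highest-scoring experts for one token.

    Random scores are a stand-in for a learned router's output.
    """
    scores = [rng.random() for _ in range(NUM_EXPERTS)]
    ranked = sorted(range(NUM_EXPERTS), key=lambda e: scores[e], reverse=True)
    return sorted(ranked[:TOP_K])


def simulate(num_tokens: int, seed: int = 0) -> tuple[list[list[int]], set[int]]:
    """Route num_tokens tokens; return per-token choices and all experts touched."""
    rng = random.Random(seed)
    per_token = []
    touched = set()
    for _ in range(num_tokens):
        chosen = route(rng)
        per_token.append(chosen)
        touched.update(chosen)
    return per_token, touched


per_token, touched = simulate(num_tokens=64)
active_per_token = TOP_K * PARAMS_PER_EXPERT
total_params = NUM_EXPERTS * PARAMS_PER_EXPERT

print(f"active parameters per token: {active_per_token} of {total_params}")
print(f"distinct experts touched over 64 tokens: {len(touched)} of {NUM_EXPERTS}")
```

Each token only pays for `TOP_K` experts' worth of arithmetic, but over a run of tokens the set of experts touched grows toward all of them, which is why the whole model has to stay resident in memory.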
