The simplification is correct but not correctly stated (even though colloquially it's common to state it that way). Technically, when one component of the running time is dwarfed by another, you can drop it: O(n+W) ≈ O(n) when W <<< n, and O(mn) ≈ O(n) when m is bounded by a constant. It's colloquially shortened to "when it's a constant" because that term generally stays fixed regardless of the input, whereas the variable parts of the formula actually do change, so the big-O analysis still gives you a correct upper bound on the running time.
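As a concrete example of the O(n+W) case, here's a minimal counting-sort sketch (my own illustration, not from the thread): the work is one O(n) pass over the input plus one O(W) pass over the count array, so when W <<< n the W term is dwarfed and the whole thing is effectively O(n).

```python
def counting_sort(xs, W):
    """Sort non-negative ints < W in O(n + W) time.

    When W <<< n, the O(W) allocation/scan is dwarfed
    by the O(n) tally pass, so this behaves like O(n).
    """
    counts = [0] * W              # O(W): allocate and zero the count array
    for x in xs:                  # O(n): tally each input value
        counts[x] += 1
    out = []
    for v in range(W):            # O(W) iterations over the count array...
        out.extend([v] * counts[v])  # ...but O(n) total output work
    return out

print(counting_sort([3, 1, 4, 1, 5, 2, 2], 6))  # [1, 1, 2, 2, 3, 4, 5]
```

If W were huge relative to n (say, sorting 10 ints below 2**32), the O(W) term would dominate instead, which is exactly why you can only drop the term that's dwarfed.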
TLDR: You're being unhelpfully pedantic.